Technology – 6 am pacific

Friends, Roman For Your Mother Tongue

The other day, I started reading Parineeta, by Sarat Chandra Chattopadhyay, in Hindi. I found the translation to be horrible. I could have downloaded an English translation, which might have been better, I suppose. But really, what I would have loved to do is read it in the original Bengali.

I am Bengali on my mother’s side (Odiya on my father’s side). I understand Bengali well enough but can’t read it. Wouldn’t it be nice, I thought, if Parineeta was available in the original Bengali, but written in the Roman script – the script used to write English, commonly and mistakenly referred to as “English” script?

For that matter, wouldn’t it be nice if the many of us, who understand an Indian language well but can’t read its script, could easily lay our hands on its literature in the Roman script – the script in which we read every day?

Shoaib Daniyal argues very convincingly in Scroll.in that Devanagari should give way to Roman as the de facto script for Hindi. While there is no official sanction for it – and it is unlikely to happen any time soon – on the ground the shift is happening already. While more Hindi is still being read in Devanagari (think Dainik Jagran), more Hindi is definitely being written in Roman (think social media and texting). The tyranny of technology, the QWERTY keyboard and the implacable advance of English medium education in India point to a future with a lot more Roman and much less Devanagari.

This is of course true not just of Devanagari, but of all Indian scripts. In fact, moving all Indian languages en masse to Roman has a great advantage. Not only will we make writing in individual Indian languages more accessible to their speakers, we will also make other Indian languages more accessible. I may not ever want to read Kannada literature, but it would be nice to be able to read the road signs in Jayanagar.

Back to Parineeta and making literary classics accessible. If someone has the inclination, here is an idea that would get someone a lot of punya as a non-profit. I hope someone does. I’ll be the first one to buy its product.

There’s a large body of work from Indian language writers like Tagore, Sarat Chandra Chattopadhyay and Premchand that is now in the public domain. In India copyright ceases to exist 60 years after the death of the author. All these books are now free to translate, transliterate or make derivative works out of.

Your readers will be all the lovers of great literature from a generation in India that placed such a great emphasis on English that it crowded out the possibility to be a fluent reader in one’s mother tongue. Many of them understand, speak, but do not read, or do not read fluently their mother tongue’s script. If they could buy Parineeta in Sarat Chandra’s original Bengali, but written in the Roman script, I think many of them would be willing to plunk down cash. Like me.

You would publish these books on paper as well as on e-readers like the Kindle. Even if I can understand some Bengali, I may still have trouble understanding a work of literature. But with the help of a custom dictionary which allows me to pull up the meaning of a word by just clicking it – much better, no?

It won’t be easy to get to a critical mass of books, but it is quite doable. The first hurdle is going to be to get electronic copies (not scans) of the original works. For writers like Tagore, they are already available online. But for others, it might be necessary to recreate the electronic versions. Project Gutenberg has had much success with solving the same problem with English language books.
Actually, a Project Gutenberg for Indian language classics would be just what the doctor ordered. Unfortunately, Project Gutenberg seems far too busy to take up non-European books at this time.

The next step would be to decide what Romanization standard to use. The Hunterian standard seems to have the blessing of the Government of India for Bibliographies but there are many other competing standards like ITRANS. It should be fairly easy to develop software that converts the Indian script to standardized Roman.
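To give a feel for what such conversion software involves, here is a minimal sketch in Python for Devanagari. The mapping tables are a tiny illustrative subset that I made up for this example; they are not the Hunterian or ITRANS tables, and a real converter would take its tables from whichever standard is chosen. The one non-obvious rule is that every consonant carries an inherent "a" unless a vowel sign or a virama follows it.

```python
# Illustrative subset only: these are NOT the Hunterian or ITRANS tables.
CONSONANTS = {
    "क": "k", "ग": "g", "त": "t", "द": "d", "न": "n", "प": "p",
    "ब": "b", "म": "m", "र": "r", "ल": "l", "स": "s", "ह": "h",
}
VOWELS = {"अ": "a", "आ": "aa", "इ": "i", "उ": "u", "ए": "e", "ओ": "o"}
MATRAS = {"ा": "aa", "ि": "i", "ु": "u", "े": "e", "ो": "o"}
VIRAMA = "्"  # suppresses the inherent vowel of the preceding consonant


def romanize(text):
    """Convert Devanagari text to Roman using the small tables above."""
    out = []
    pending_a = False  # True while the last consonant still owes its inherent 'a'
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MATRAS:
            out.append(MATRAS[ch])  # vowel sign replaces the inherent 'a'
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False  # consonant cluster: no vowel at all
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            # Independent vowels are mapped; anything unknown passes through.
            out.append(VOWELS.get(ch, ch))
    if pending_a:
        out.append("a")
    return "".join(out)


print(romanize("नमस्ते"))  # namaste
print(romanize("कमल"))    # kamala
```

The second example shows where the easy part ends: a Hindi speaker says "kamal", dropping the final inherent vowel, and this sketch makes no attempt at that. Rules of that kind differ from language to language, so they would have to be layered on top of the character tables.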

An e-reader like the Kindle would be the perfect device to read on. The book would need a custom dictionary, which the Kindle supports.

Charging a small price for the books, even though they are public domain, would be totally fair and legal. It’s done all the time by book publishers.

In a few years, English will be everywhere. We will all be lamenting how nobody cares about our own languages any more. The way to think about this is to not conflate the language with its script. Once we set the language free, it will get a new lease of life.

Dark Data is a Big Opportunity for Services Companies

Why is Big Data big? Obviously not because someone invented a new statistical method of data analysis that can explain everything. It’s because there is a lot more data spewing out of business processes where there was none before. The questions haven’t changed – as a business manager I still want to know how to forecast sales or understand which levers to apply to improve my business outcomes. But where there was once no way to answer these questions, today there actually is data that can be analyzed to provide some of these answers.

So far all the attention has been on business problems whose answers lie in the copiously flowing data from associated business processes. Take Marketing, for example: all Marketing is getting subsumed into Digital Marketing which, as anyone will tell you, is a data gusher.

But there are innumerable questions in business that need answering where the data just isn’t there to analyze. Or rather, the data is there but it is unusable, just beyond reach.

I am part of a non-profit theatre company in the Bay Area. Theatre companies live and die by their ticket sales. All costs are fixed. Once you decide to stage a play your costs are all locked in. Your revenue, however, is completely variable: it depends on the number of tickets you sell.

In such a business, it would be crucial to understand where one stands on ticket sales. In other words, you should be able to answer the following question at all times “Based upon the ticket sales today, X days before opening night, we are on track to fill Y% of seats”.

Almost all ticket sales are online. Which is a good beginning. The online ticket selling service that we use sends us a daily email with our cumulative ticket sales till that day. But, and here’s the nub, they don’t store a time series of daily ticket sales. So if I wanted to draw a graph with number of days to opening night on the X axis and cumulative tickets sold (by show, of course) on the Y axis, I’m out of luck.

This is not a unique situation. I’m sure you can think of many such examples in your business where you know the data is created but it isn’t kept, or it isn’t in the right form.

Then there’s a form of data that is not captured but can be, with just a little work. At Infosys, when I was trying to wrap my head around how to implement CRM in a company which hadn’t used one for years, I thought that it might be too much to expect the field force to make notes after every client meeting. But perhaps, if they could just log every client meeting, that by itself would be very useful. It would be a measure of business activity which we may be able to correlate to deal value and which, perhaps, could serve as a rough, early warning forecasting system. There are so many opportunities for squeezing a business process for meaningful data. Analyzing this data typically doesn’t need Hadoop clusters, but the business outcomes could be quite significant.

Think of data as fossils in sedimentary rock. The fossils in the upper layers are newer, better formed and easier to interpret. The ones in the lower layers are just the opposite. But they are just as important to understanding and improving your business.

IT Services companies will see a lot of opportunity in Big Data and Analytics. But software companies will take away most of the value in the top sedimentary layers. The lower layers will be messy. And straightening out messes is where IT Services companies thrive.

Meanwhile, I’ll be trying to sort out my ‘messy’ ticket sales data using Google Script. If anybody knows of a script that will help extract a number from an email that follows an identical string of text, please send me a note. Thanks.
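For what it is worth, the extraction itself comes down to one regular expression. Below is a rough sketch in Python rather than Google Script. The label text and the sample emails are made up, since I have not reproduced the real email wording here, so treat every string in it as a placeholder. The idea is to anchor on the fixed text, capture the number that follows, and file it under "days to opening night" so that the time series builds up one email at a time.

```python
import re
from datetime import date

# Hypothetical label: replace with the fixed text that precedes the number
# in the real daily email.
LABEL = "Total tickets sold:"


def extract_count(body, label=LABEL):
    """Return the first integer that follows `label` in `body`, or None."""
    match = re.search(re.escape(label) + r"\s*(\d[\d,]*)", body)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def build_series(emails, opening_night):
    """Turn (email_date, body) pairs into (days_to_opening, tickets) points."""
    series = []
    for email_date, body in emails:
        count = extract_count(body)
        if count is not None:
            series.append(((opening_night - email_date).days, count))
    return sorted(series, reverse=True)  # furthest from opening night first


# Made-up sample emails, for illustration only.
emails = [
    (date(2014, 3, 1), "Hello! Total tickets sold: 42 as of today."),
    (date(2014, 3, 2), "Hello! Total tickets sold: 1,204 as of today."),
]
print(build_series(emails, opening_night=date(2014, 3, 10)))
# [(9, 42), (8, 1204)]
```

Google Script is JavaScript-based, so the same pattern should carry over with little change; only the code that reads the mailbox would be specific to it.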

Government Mass Surveillance Will Create a Surge in Technology Spend

How the NSA hacked Google in one simple graphic. Photograph: Washington Post

The Guardian has a presentation called The NSA Files that is the most brilliant rich media presentation of a complex subject I have ever come across outside of a museum. So go read it, even if it is just to see the presentation.

As technology races ahead, from time to time public debate and the law of the land must catch up to it. Government surveillance is one of the most important technology issues of our times. In the US, the NSA files have already had a profound impact on the perceptions of Americans about surveillance and civil liberties. Outside the US, Germany is aghast that the US and UK, both NATO allies, would spy on Angela Merkel and Germany. On the other hand, there is some evidence that the NSA has aided in combating Mexican drug cartels and prevented terrorism.

The political issues surrounding mass electronic surveillance are complex and do not lend themselves to quick fixes. The issues at stake involve civil liberties, anti-terrorism efforts and international spying, among other things. It might take a decade of public education and wrangling in courts, legislative bodies, NATO and perhaps even the UN, before policy, practice and the law around surveillance settles down.

Leaving aside the political issues, the NSA files will feed a long surge in surveillance and anti-surveillance technology. Snowden’s leaked files are like the Trinity test – the first detonation of a nuclear bomb. Prior to that, governments knew that a nuclear bomb was feasible and some were feverishly working on making one, but the public did not really know much about it. The Trinity test first brought the power and potential horror of a nuclear bomb to the attention of people around the world. And just as Trinity set off a nuclear arms race that lasted for decades, the NSA files will set off a “Snoop Brawl” which will lead to a burst of technology spending around the world. But unlike the nuclear arms race, which was a race principally between nations, this Snoop Brawl is going to be multi-faceted: many players and many fist fights.

Country vs Country
Countries now realize, if they hadn’t actually known it all this while, that the NSA gives the US almost unfettered access to their secrets – those of their governments, politicians and companies. Dilma Rousseff, Petrobras, Ban Ki-moon and Angela Merkel were all targets of the NSA; none of which can be justified by national defense. Surveillance is not a single-purpose (e.g. protect the country against terrorism) tool. It is an arsenal of weaponry to project a country’s power and further its interests. And no country has a bigger arsenal than the US.

Not China for sure. A few months back, China was pilloried for not doing enough to stop Chinese hackers, allegedly sponsored by the People’s Liberation Army. That doesn’t seem so egregious anymore. If all major countries decide to bolster their electronic espionage and counter-espionage capabilities, that itself is going to mean a lot of hardware, software and thousands of tech jobs.

Country vs Citizens
The NSA collects metadata on all phone calls in the country – who called whom, when and from where. It is easy to see how this data could be very useful in tracking the networks involved in illegal activity. Or political opponents. If information is power, this is the hammer of Thor (Thor 2 was so-so by the way). Which country that isn’t shackled by its laws can pass up the opportunity to gather this data?

Outside of a few advanced democracies with active civil liberties protection groups, there is no countervailing force to stop a government from collecting and using this data. The only hurdle is the technology and skills to mine a massive data set like this. Which money can buy. Expect countries to spend a lot on this and other Big Brother technology.

While this gives the government a valuable instrument to catch the bad guys, the very same instrument, in the wrong hands could be used to suppress democracy itself. Or give more power to despots. Expect big orders from tin pot dictatorships as well as big nations where ruling classes are trying to quell democracy.

Companies vs Governments
The US is a hub for internet companies and cloud companies that carry the private data of people and businesses around the world. Much of this data travels through pipes in the US or sits in databases controlled by US companies. Courtesy of Snowden, the world now knows that US companies have been cooperating with the NSA with no disclosure to their customers.

European and other advanced nations are likely to enact laws that prevent companies like Google and Apple from becoming listening posts for the US. They may require these companies to “fragment their clouds” and keep the data of their, say, German customers in Germany, with restrictions on whom this data can be shared with.

On the other hand, companies like Google will have to work hard to regain the trust of their customers. Google is already taking measures to prevent the NSA from eavesdropping without its knowledge. Apple is making noises that it might challenge the legality of being barred from disclosing when it has shared information with the government at the government’s request.

Companies vs Companies
The other thing that all companies will realize – not just cloud companies with user data – is that their IP is not safe. In the past year, the Chinese hacking incidents being reported have already raised awareness of this issue. But what the NSA files make clear is that ethics and national interest seem to have no intersection at all. Every country may be spying on foreign companies whose secrets can bring value to leading companies in that country. Protection of IP and confidential information will become a key concern for companies. Much more than it is today.

Consumers vs Companies and Governments
Consumers are going to increasingly want to know how their communication and confidential information is protected by online services. Now that they know that “Enemy of the State” is for real, what steps should they take to protect themselves? Encryption and information security are complex technical issues which most people don’t understand or even care about. Perhaps there is a need for “information security rating agencies” that rate online services on how they protect users’ confidential data – from hackers and governments everywhere. Perhaps people will increasingly want to roll their own email rather than use Gmail or Yahoo mail.

Like the nuclear arms race, the Snoop Brawl will create a flood of spending from both governments and companies. And exactly like the nuclear arms race, it leaves no positive impact on the human condition.

Cloud to Enterprise: Resistance is Futile

At the end of its Journey to the Cloud, will an enterprise still be running its own data center?

Perhaps, but if it does, it will be much, much smaller. It will be the Private Cloud part of a Hybrid Cloud that will run only a small portion of enterprise workloads. On the other hand, many enterprises will live entirely in the Public Cloud.

Consequently, most of the enterprise data center capacity in the world will disappear, assimilated by the Borg, that is the Public Cloud.

Why is the gravitational pull to the Public Cloud so strong?

One reason is that the price-performance of the Public Cloud is much superior to that of a dedicated data center, even one with Private Cloud. And the gap is widening all the time.

At the heart of this price-performance advantage is scale and specialization. The kind that takes Public Cloud service providers to places like Prineville, Oregon to build massive chiller-less data centers with custom-built servers.

Facebook’s done exactly that. There is a fascinating, if somewhat dated, piece in Wired magazine, which goes into a lot of detail on Facebook’s data center in Prineville. Needless to say, the scale and level of specialization are such that every single element of cost and energy consumption is highly optimized. Not something an average enterprise can or should be spending its time on.

Facebook’s not an enterprise cloud service provider (yet?). But Amazon, Google, Salesforce.com and Microsoft will undoubtedly have the same laser focus on cost and performance.

Cost advantage also comes from “pooling” – the simple notion that the capacity required to run a pooled data center is less than the sum of the capacities of the individual data centers. Pooling is a powerful cost driver in many services from electric power to limo services. It has nothing to do with specialization, but the cost savings are quite real.
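The arithmetic behind pooling is easy to see with a toy example. The sketch below, in Python, uses made-up hourly demand figures for two hypothetical enterprises whose peaks fall at different times of day. Each one, running its own data center, has to provision for its own peak. A pooled data center only has to provision for the peak of the combined load, which can never exceed the sum of the individual peaks and is usually well below it.

```python
# Hypothetical demand, in servers needed, at four points in the day.
retailer = [20, 30, 90, 40]  # busiest in the afternoon
payroll = [80, 30, 20, 25]   # busiest overnight

# Separate data centers: each is sized for its own peak.
separate_capacity = max(retailer) + max(payroll)

# Pooled data center: sized for the peak of the combined load.
combined_load = [r + p for r, p in zip(retailer, payroll)]
pooled_capacity = max(combined_load)

print(separate_capacity)  # 170
print(pooled_capacity)    # 110
```

The saving comes entirely from the peaks not lining up. If both enterprises peaked at the same hour, the two numbers would be equal and pooling would save nothing on capacity.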

But it’s not just about price-performance. Over time, specialized cloud service providers will add product features that will simply make them better. Or allow customers to do things they couldn’t do before. Big Data is certainly shaping up to be one of those areas.

This has happened before. A close analogy would be how Salesforce.com started out by being the cheaper, easier to implement SaaS competitor to on-premise CRMs, but is now functionally a superior product.

Better price-performance and superior feature-function may be quite enough to explain the exodus to the cloud. But there is actually an even bigger force at play here. Something that doesn’t lend itself well to ROI calculations, but nevertheless, CEOs understand it very well.

Running a data center is not core to any large enterprise. Its business is selling widgets or serving customers. And for it to be the best in the world at what it does, it needs to focus on what makes it good. Running a data center is not one of those things.

Which is why, if I were a data center today, I would be saying my prayers. Resistance is futile. Prepare to be assimilated.

I Want My Customer Data

For the last few years, I have been using Mint (mint.com) to keep track of our household expenses. My needs are very simple. I want to be able to answer simple questions like “How much are we spending on regular monthly expenses?” and “How much is going towards discretionary expenses like eating out?”.

But I find it difficult to answer the simple questions above. It takes a lot of time and effort to get to a point where I can say that I am reasonably close to the real answers to these questions. As I describe the problems that make it so time-consuming, it actually throws light on a new battleground for consumer services – customer data in the hands of customers.

Except for a tiny fraction of cash expenses, all of our expenses are in the form of transactions – debit cards, credit cards, electronic transfers, bill pay transactions, and yes, a few, hand written checks. We try to keep things simple so we don’t have too many accounts. All these accounts are hooked into Mint. Mint pulls all these transactions so that I can see everything in one place. So far so good.

To get a handle on our household expenses, each transaction needs to be categorized correctly into categories like Entertainment or Restaurants.

That’s where the problems start. Every time I log into Mint, I have a whole bunch of transactions that are labeled “Uncategorized”. I then have to manually go through each transaction, try to figure out the merchant to whom I made that payment and then categorize it. Often, the name of the merchant is completely garbled. If my wife made the payment, I have to wait for her to be around so I can ask her. It takes time and is annoying as heck.

It shouldn’t be that difficult to get this right. A typical card transaction is passed from acquirer to network (like Visa) to issuer (my bank or credit card company) and then to Mint. On the way, nobody seems to care enough to categorize the transaction intelligently, and ensure that the merchant name is represented correctly. It’s all left to Mint to do whatever it can with the data it has.

Mint tries. It allows you to categorize recurring expenses automatically. But non-recurring expenses are far too high in our family to ignore. Mint uses very little intelligence to extract the metadata from transactions. A typical transaction would show up on Mint like so:

CHECK CRD PURCHASE 02/18 LYFE KITCHEN OF PA PALO ALTO CA 434256XXXXXXXXXX 08304975697XXXX ?MCC=5812 on Feb 19

In this case Mint extracted the merchant name as “Lyfe Kitchen Pa” (correct) and tagged it “Uncategorized” (missed opportunity).

It so happens that the string “MCC=5812” refers to the Visa merchant category code for “Eating places and restaurants”. A simple Google search will tell you that. Why Mint would choose to leave it Uncategorized is difficult to fathom.
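To show how little work this would take, here is a rough Python sketch that pulls the code out of a transaction description and looks it up. I have no idea how Mint actually processes transactions, so this is only an illustration of the idea. The lookup table has a single entry, the one code discussed above; a real table would come from the card network’s published list of merchant category codes.

```python
import re

# Only the one code discussed above; a real table would be loaded from the
# card network's published merchant category code list.
MCC_CATEGORIES = {"5812": "Restaurants"}


def categorize(description):
    """Return a category based on an 'MCC=nnnn' tag, else 'Uncategorized'."""
    match = re.search(r"MCC=(\d{4})", description)
    if match is None:
        return "Uncategorized"
    return MCC_CATEGORIES.get(match.group(1), "Uncategorized")


txn = (
    "CHECK CRD PURCHASE 02/18 LYFE KITCHEN OF PA PALO ALTO CA "
    "434256XXXXXXXXXX 08304975697XXXX ?MCC=5812"
)
print(categorize(txn))  # Restaurants
```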

In other cases, it appears to apply no intelligence at all while extracting the merchant name. It just pulls out the first two or three words that are not numbers. It typically fails for merchants like 23andMe or 37 Signals or 76, the gas station.

Ultimately, no one in the entire chain of merchant-acquirer-network-issuer/bank, all of whom are making money because I am spending, cares enough to do anything with my data, except the bare minimum to complete the transaction. They don’t recognize the value that I put in my data. Mint does. Which is why I spend a lot more time on Mint than on my bank account website. But even Mint doesn’t do enough.

Next, consider Simple (simple.com), a new banking service that I started using a few months ago. It is still by invitation only but if you can wangle an invitation, you won’t be disappointed.

This is a screenshot of what I might see on the Simple website. Simple extracts the merchant name quite well. And the automatic expense categorization works quite well. I don’t know how it does it, but I have never had to go in and change the category on a transaction. In fact if I had to, I wouldn’t know how to do it.

But that’s not all. It will put the address on a little embedded Google map. That can be a big help in identifying where you were. If you went to a restaurant, it will tell you how much you tipped.

It’s not perfect, but I feel like they are putting the information that they have to the best possible use. To do more, they would have to get more information from upstream sources over which they don’t have much influence.

I have been comparing Simple with Mint which may not be a fair comparison. Simple has to deal with its own transactions (bill pay or their own debit card). Mint aggregates across many different sources of transactions each presumably with its own idiosyncrasies.

But if you were to compare Simple with any other bank that I am aware of, the difference in the use of customer data (and user experience) is vast. It is light years ahead.

Consumer services today offer a lot of choice. One of the most important ways in which consumer services will compete with each other is what they let their customers do with their data. For a long time, the focus of these companies has been on the use of customer data to extract insights for themselves – how to cross sell more, how to identify loyal customers to serve them better and so on. But this is different.

Customer expectations are rapidly changing. They are being shaped by companies like Apple and Amazon.com that set the standards, not just in their industry, but across industries. Customer data is now part of the customer experience. This is the new battleground.

So what are these customer expectations? Here are mine:

1. That my service provider will obtain and share the data with me in a timely fashion.

I’ll illustrate this with an example. Today’s smart meter technology allows my utility to obtain the power consumption at my home at a resolution of 15 minutes. At this resolution, the data can tell me a lot more than what my monthly bill tells me, which is almost nothing, other than the fact that it is high or low. (Currently mine is running too high and I don’t know why!)

But deploying smart meters costs money. Lots, in fact. Will it be worth it? It might have been hard for utilities to justify the cost. After all, they are all monopolies. Luckily, regulators in most advanced nations have been nudging utilities in that direction.

In every industry, there will be similar challenges. How do you justify the cost of gathering more data that is useful to the customer? Expecting new revenue from additional services to justify giving customers more data may be too short-sighted.

2. That my service provider will understand that that data is mine.

I shouldn’t have to pay just to get that data. Although I will gladly pay for a service using that data that is of incremental value. I should be free to take that data out myself, or allow another service provider to pull it out on my behalf.

Today, my bank charges me a fee if I want to see a used check image older than 6 months. Tax filing time must be quite profitable for the bank. I don’t have a problem with the bank trying to turn a profit. But not on my data. If your storage costs are too high (really?) allow me to easily export it to my Evernote or Dropbox account. (New feature idea – managing check images!)

3. That my service provider will take the utmost care to secure my data

This is generally well understood. Because of laws and the damage that negative publicity around loss of customer data can do, most services try hard to protect it. Try harder! Lately, the hackers seem to be winning.

4. That my service provider will add value to the data

My online brokerage service has always given me a CSV download of my transactions. But till two years ago, they did not have a decent performance analysis of my investment portfolio. So I had to go and put my entire portfolio in a Google spreadsheet which would look up prices from Google Finance and calculate the rate of return. But with reinvested dividends and what not, it took a lot of work to keep the portfolio up-to-date. How you can be an online brokerage and not offer the most basic use of my data – portfolio performance – is beyond me.

Performance analysis, alerts, suggestions – they are all possible. And expected. If my credit card hits me with a foreign transaction fee, I want to know about that in an alert (thank you Mint!).

5. That my service provider will understand that the reward is mostly my loyalty

There was a time when online retailers did not give you a transaction history. I stopped shopping on those sites. I don’t want to spend the time to search for the same item all over again, if I want to buy another one.

For service providers, this is going to become the cost of doing business. So if you think that you will invest in giving me more value from my data only if I pay you more for this value, your competitors who think differently will get my business.

But, if you are clever about it, you will discover value points that I will pay for. You see, the work that you do to help me get more value from my customer data, in turn helps you understand me better. And when you understand me better, you will be able to design services that I will want to pay for.

Wanted: A Rules Engine for Excel

James Kwak writes about the role of Excel in the JPMorgan 2012 trading loss:

After the London Whale trade blew up, the Model Review Group discovered that the model had not been automated and found several other errors. Most spectacularly,

“After subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended. This error likely had the effect of muting volatility by a factor of two and of lowering the VaR . . .”

So @SUM instead of @AVG and boom – $2B in trading losses.

I am exaggerating of course. It didn’t quite happen that way, but it does appear that this Excel error made the trade appear much less risky than it actually was. [You’ll find a full analysis of the JPMorgan trade here. A lot more interesting is this history of Excel bloopers.]
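The “factor of two” in the quote is not a coincidence; it falls straight out of the arithmetic. A sum of two numbers is exactly twice their average, so dividing by the sum instead of the average halves every result. A quick check in Python, with two made-up rates:

```python
def change_intended(old_rate, new_rate):
    """Change relative to the average of the two rates (what was intended)."""
    return (new_rate - old_rate) / ((new_rate + old_rate) / 2)


def change_as_built(old_rate, new_rate):
    """Change relative to the sum of the two rates (the spreadsheet error)."""
    return (new_rate - old_rate) / (new_rate + old_rate)


# Made-up rates, for illustration only.
old_rate, new_rate = 1.00, 1.10
ratio = change_as_built(old_rate, new_rate) / change_intended(old_rate, new_rate)
print(round(ratio, 6))  # 0.5
```

Whatever rates you feed in, the erroneous version reports half the movement of the intended one, which is how the error would mute the measured volatility.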

Excel is the weapon of choice for financial analysts of all hues. You could be an analyst evaluating complex derivative instruments or an investor projecting a public company’s future earnings. If you are a financial analyst, you probably spend a big part of your life looking at a grid of tiny grey cells.

During my startup days, we built a product that would pull in data from our library directly into analysts’ spreadsheets and keep it current. Since we worked with our users to design the product, we got to see many, many spreadsheets.

As a rule, these spreadsheets are massive, multi-MB beasts. They start big. Over time they gather more and more data and complex analyses and become massive. In their full glory, they are inscrutable to anyone except their owners. And often even that is doubtful.

Not surprisingly, we regularly found errors in the spreadsheets. As expected, data errors were common. But errors in formulas, of the kind above, were not uncommon at all.

Now think about it. Companies are making decisions worth tens, sometimes hundreds of millions on the backs of financial analyses done in Excel spreadsheets, that are understood solely by their owners and are impervious to scrutiny by anybody else.

Excel was never designed to be anything more than a personal productivity tool. If the stakes of getting a risk model right are that high, then shouldn’t that risk model be treated like enterprise software – with development standards, commented code, versioning, unit and integration testing?

It’s not as if enterprise software doesn’t have its share of messes. The CIO of a print publication recently told me that they did not think they knew all the different offers on the publication that were available out on the internet. Apparently, there was a link on some forum that was still allowing a special offer and they didn’t know how to turn it off! We recommended re-engineering the whole application and putting a rules engine in front of it.

Perhaps that’s what financial modeling needs. A rules engine that drives all the analysis below. The problem with Excel is that the formulas and the data are all mushed together into a blob of grey cells. They need to be layered. Data in one place, rules in another. If I want to project a company’s revenues by applying the average year-on-year growth in the last four quarters, that’s written in the rules area. The historical revenues are in the data area. If the gross profit is a fixed percentage of revenue, that too is written in the rules area. When I change the rule, the computation changes and not otherwise. There are no computations that aren’t in the rules.
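Here is a toy version of that layering in Python, just to make the idea concrete. It is a sketch of the separation and not a design for the rules engine itself. The revenue figures and the 40% gross margin are invented, and the two rules are the two from the paragraph above. The point is that a reviewer only has to read the rules area; the evaluate step does nothing that a rule did not ask for.

```python
# --- Data area: hypothetical quarterly revenues, oldest first ---
DATA = {"revenue": [100, 100, 100, 100, 110, 110, 110, 110]}

# --- Rules area: every computation in the model lives here ---
GROSS_MARGIN = 0.40  # hypothetical fixed percentage of revenue


def avg_yoy_growth(revenue):
    """Average year-on-year growth over the last four quarters."""
    growths = [revenue[i] / revenue[i - 4] - 1 for i in range(-4, 0)]
    return sum(growths) / len(growths)


def projected_revenue(data):
    """Same quarter a year ago, grown at the average year-on-year rate."""
    revenue = data["revenue"]
    return revenue[-4] * (1 + avg_yoy_growth(revenue))


def projected_gross_profit(data):
    return GROSS_MARGIN * projected_revenue(data)


RULES = {
    "projected_revenue": projected_revenue,
    "projected_gross_profit": projected_gross_profit,
}


# --- Engine: applies the rules to the data and nothing else ---
def evaluate(rules, data):
    return {name: rule(data) for name, rule in rules.items()}


results = evaluate(RULES, DATA)
print({name: round(value, 2) for name, value in results.items()})
```

Changing the model means editing a rule or swapping a data set. There is no third place for a stray formula to hide.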

When I review my model with my team members, we are just looking at the rules, confident that the spreadsheet itself is simply a manifestation of the rules and data. Once in a while, we do look at the data sets as well, just to make sure we are using the right ones and that they are current. And we test the model from time to time to see if it gives the expected results.

But this sounds more and more like a database application. And database applications need programmers. In the real world that we live in, no financial analyst will let a programmer stand between him and his model. But perhaps that is the challenge here. Can we build a rules engine that is enterprise strength, does not require a programmer and sits on top of the WYSIWYG goodness of Excel?

Excel is one of the most powerful applications of our times. It is the killer app in MS Office. It is ubiquitous and everyone who will ever build a financial model already knows how to use Excel. Which is why we do too much with Excel. Which is why we end up betting hundreds of millions on the backs of black box Excel financial models. This needs to change.

The Journey to the Cloud

[Photo: oil-cooled servers submerged in coolant. Source: grcooling.com]

The beginning of the year is a good time to prognosticate. So here is my prediction – not just for the year, but for the whole decade. This decade in Enterprise IT is going to be mostly about the Journey to the Cloud.

Enterprise IT faces a raft of technology shifts in the coming years. The deep penetration of mobile devices into the enterprise, the impact of social media and the technology to extract intelligence from greater and greater quantities of data, are a few important ones. But what stands out for the disruptive nature of its impact is the Cloud. The Cloud is truly paradigm-shifting for Enterprise IT.

What is so different about the impact of the Cloud? It is not so much about the opportunity of value creation. While the size of the prize is massive, that is true about the other technology shifts as well.

The Cloud is disruptive. It upends the current order of things in Enterprise IT. It shifts control away from IT towards Business. It dramatically changes the role of the CIO and her organization. And it changes the way we manage the business of IT in profound ways – like managing by business outcomes rather than by costs and intermediate IT outcomes.

Why is this so? Since the beginning of Enterprise IT, IT infrastructure, particularly the data center, has always been firmly rooted in IT. Even when outsourced, the outsourcing contract has been managed by the CIO’s organization and they remained accountable for its results. But today, Cloud service providers run their own infrastructure that is common across all their customers and charge a fee for use. Sometimes they provide infrastructure services only. Often, it’s more. But always, using Cloud services means transplanting IT infrastructure outside the domain of the IT organization.

IT infrastructure may not directly create value, but business leaders understand the risks involved. Something that could bring your company’s order processing system to a halt is not to be trifled with. So why would an enterprise undertake the risk of such disruption?

Economics, for one. From what we have seen with our clients, the potential for savings ranges from attractive to downright stunning. But there are other reasons as well – flexibility, developer productivity, time-to-market – and as you get up to the application layer, soon, state-of-the-art functionality will exist only in SaaS applications.

But the biggest reason why the Cloud is unstoppable is that running a data center is not “core” to anybody’s business, except a handful of Cloud service providers. These providers run mega data centers at locations and at a scale where every element of performance and cost is carefully optimized. No company can match that. Enterprise IT will find it difficult to justify “build” over “buy” when it comes to data center infrastructure.

And the future will hold even greater specialization. For example, Intel is running trials with oil-cooled servers (photo above) which need only 2-3% of their power for cooling (against the usual 50-60%). Today’s data centers are all set up for air-cooling. Can Enterprise IT deal with such shifts in technology?

No, I don’t think so. The Cloud is written in every Enterprise IT organization’s future.

Like most paradigm-shifting technologies, the switch-over may never be complete (I still have a VCR or two at home). But it is inevitable and once it gathers momentum it will be unstoppable.

Cloud adoption will be gradual at first. CIOs will start with new development on IaaS or PaaS. Some will migrate small, non-critical applications to the Cloud. SaaS applications have some momentum. But the bulk of the core enterprise applications still run in the data center. There is a long way to go.

The Journey to the Cloud will be long. There will be risks, and many challenges on the way. There will be existential questions like what is the role of the CIO in this brave, new world. But “there is gold in them thar hills”. It will be worth the ride.

Cross-posted from infosys.com with minor alterations.

Why is That Ad Following Me Around on the Internet?

Forbes has a piece on Why Wall Street Likes LinkedIn More Than Facebook. The difference in how well the two companies monetize user interest is quite significant.

And when it comes to making money, LinkedIn packs a much harder punch than Facebook. LinkedIn manages to rake in $1.30 per user hour spent on the site, while Facebook scrapes by with a measly 6.2 cents per user hour.

The article’s explanation of this gap is that professional data is more important to enterprises than personal data is to consumer advertisers. But I don’t know if it is just about that. Somewhere in there lurks a question begging to be asked: does Facebook monetize user attention as effectively as it could?

Take me. I use Facebook regularly. Not a whole bunch, but I will check in a couple of times a week, mostly over the weekend. Here are the typical ads that I see.

I can link each ad back to something in my profile. I live in the Bay Area, am liberal, married and male. Is this kind of broad-brush targeting the best Facebook can do?

The ads are uniformly unmotivating. They are shown repeatedly. As if there was a very limited stock of ads to show and they had to show half a dozen on every page. So sorry, but we will subject you to the same insipid ads over and over again.

You might point out that I’m not a typical Facebook user. I share a little but not a whole lot of information. I never use my Facebook login on 3rd party sites. So FB doesn’t have much to work with.

But that wouldn’t be true. FB has a ton of personal data for me – my social graph, my status updates. If it still serves up ads that don’t result in any action, is it a wonder that their monetization is at 6.2 cents per user hour?

In contrast look at Google advertising. Comparing Google search advertising with FB advertising is perhaps not fair. Search clearly declares intent which makes targeting much easier. Display ads will find it tough to compete with that.

But Gmail ads are perhaps as close as it gets contextually, to FB. I find Gmail ads to be very relevant. Google’s algorithms are clearly analyzing my email content to determine which ads to serve up. FB isn’t doing that with status updates.

Users want to see ads that are better targeted. I am interested in seeing relevant ads. And advertisers want to reach interested users. The right economic model (a la AdWords) and some clever computer algorithms should get you there. It can’t be that difficult if you have Facebook’s resources. So I wonder why it’s taking them so long to get this right.

Twitter too is trying hard to get their advertising model right. Promoted tweets in my timeline show up from time to time. They are using the social graph cleverly and I find many of their promoted tweets relevant. I spend a lot more time on Twitter than I do on FB, so I do hope they find a non-intrusive but successful advertising model.

Meanwhile, an ad for a certain watch has been following me around wherever I go on the internet. It so happened that a couple of months ago I clicked on an ad for the watch. I then went and read a whole bunch of reviews and so on and then actually bought the watch. But now, the ad networks think that since I clicked on the watch ad, I am interested in it. So now I am bombarded by the same ad everywhere. All wasted advertising since I’ve already bought the watch. And very annoying.

So now I have done two things. Both of them are detrimental to the ad networks’ and the advertiser’s interests.

One, I opted out of all interest-based advertising. They don’t make it very easy for me to do that, but I patiently went to all the websites and opted out.

Two, I will never click on a display ad again. If it interests me, I will google it and get to its website through search.

I am positive that I am not the only person who is doing this. Some may want to but don’t know how or don’t have the patience to. But many will opt out. Soon there’ll be a video on YouTube on “How to Stop That Ad Following You Around on the Internet”. And then the game’s over.

Everything’s a Game

I recently crossed a 100 miles per gallon with my Chevy Volt. For those of you who live in “advanced” societies that follow the metric system, that would be 42.5 km per liter.

If you fell out of your chair at that number, that’s probably because you don’t know that the Chevy Volt is an electric car with a backup gasoline engine. A full charge takes me about 30 miles, after which it switches to the gasoline engine. So if I keep driving on my battery, the mileage keeps improving.
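For the curious, the arithmetic behind those numbers can be sketched in a few lines of Python. The unit constants are the standard definitions of the mile and the US gallon. Everything else is an assumption for illustration: the 40 mpg gasoline-mode figure is hypothetical, and I am assuming the mileage is simply total miles divided by gallons of gasoline burned, so electric miles count as free.

```python
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784


def mpg_to_km_per_liter(mpg):
    """Convert miles per US gallon to kilometers per liter."""
    return mpg * KM_PER_MILE / LITERS_PER_US_GALLON


def blended_mpg(gas_mode_mpg, electric_fraction):
    """Mileage when a share of the miles burns no gasoline at all.

    Assumes mileage = total miles / gallons used, so only the
    non-electric share of the miles consumes fuel.
    """
    return gas_mode_mpg / (1.0 - electric_fraction)


print(round(mpg_to_km_per_liter(100), 1))

# Hypothetical: 40 mpg on gasoline, 60% of miles driven on battery.
print(blended_mpg(40, 0.6))
```

Under these assumptions, every extra mile driven on the battery shrinks the denominator, which is why remembering to plug in at night matters so much.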

I crossed a 100mpg after some effort. I charge the car using a standard 110V outlet which takes 10 hours for a full charge. Which means that I have to remember to plug it in at night, otherwise the next day I’ll be driving on gasoline. I now regularly forget to charge my iPad, but almost never, my Volt.

So when I crossed 100 mpg, I was naturally quite thrilled about it. I posted this tweet

The big 100! @chevyvolt is there a club or something I will get invited to? http://t.co/3TUwzzxY

— Basab Pradhan (@basabp) February 12, 2012

To which I got some responses that were humbling. Like this one

https://twitter.com/GeorgeRobison/status/170950408829800449

After which I joined voltstats.net and kept at it. My mileage is now 104mpg.

Why would I spend any time pushing my mileage up? And then joining a website with a bunch of people who are similarly engaged? I get nothing out of the deal. Yes, there is some satisfaction in doing my bit to save the planet, but anything over 17mpg (my previous car) would have been an improvement. Why go for a 100?

This behavior, that would make no sense to economists, is driven by what is called gamification. Apparently we are all wired to play games.

There’s a lot of action around gamification of the enterprise. SAP is investing in this area. Salesforce.com bought a company, Rypple, that uses gamification to improve employee performance.

Outside of the enterprise, Stack Overflow uses badges and so on to reward certain activity. My daughter does Math exercises on Khan Academy, which awards badges after you win a certain number of points. It certainly keeps her going without much complaint. We offered her a reward for every 10,000 points. But she has never claimed it. Achievement is its own reward?!

But I wonder if the psychology at work here is the same thing that makes us play silly games, board games and basketball? Does that capture what is going on here?

I think there is something else at work here. If you can measure something and if you can influence it, that something automatically becomes a challenge, a contest. Is it the overachiever in you that compels you to better your best score (or someone else’s)? Or is it your playful, game loving side? Perhaps they are all at work here – play, achievement, competition – just in varying proportions for different people.

Whatever it is, we will see more and more of it in our companies. For a game to be successful, the measurement of outcomes should be largely driven by game play not by random or extraneous circumstances. As life gets more digitized, such opportunities will keep popping up.

Ten years ago, you could never have had a contest on how many friends you had. Now I can say that I have more Facebook friends than you.

Aadhaar Under Attack for Specious Reasons

A parliamentary committee is about to reject the National Identification Authority of India Bill of 2010. Here is an article from The Hindu about it. Here is my post from a few months ago on the UID project.

The success of Aadhaar is important for India. Very important. It is a foundational pillar for nation-building (as in Aadhaar). And it is really, really disheartening to see it being attacked and brought down.

The reasons for the opposition to the bill in the Parliamentary Committee per The Hindu article are,

Sources in the Committee say the Bill has been rejected in its current form on the grounds of the project’s high cost, as well as concerns regarding national security, privacy and duplication of the National Population Register’s (NPR) activities. One major sticking point was reportedly the Aadhaar project’s ambition to enrol every “resident” of the country, rather than every “citizen.”

A common misperception is that Aadhaar is linked to an entitlement program. It is easy to understand why there is this misperception. Today any entitlement program – PDS (ration card) or a passport – has the identification and entitlement program tied together. Sometimes, one entitlement program might use the identification from another entitlement program (a ration card can be used for many purposes other than getting rations at a Fair Price Shop), but there is no stand-alone identification program.

Aadhaar is a stand-alone identification program. It does not come with an entitlement program. It simply links a number/name/father’s name/address with biometric identifiers [Update: It is actually number/name/date of birth/address]. Every entitlement program comes with a set of qualifiers (PDS for BPL, passport for citizens…).

What qualifies someone for a government entitlement program can vary quite a bit. Aadhaar cannot and should not duplicate a verification system for all these qualifiers. But once someone is qualified, say by the PDS to receive a ration card, and the UID number is linked to the ration card, then every time the beneficiary wants to get subsidized rice from a PDS shop, biometric identification is fast and infallible.

With that background let’s examine each of the reasons for opposing the bill.

Inclusion of “residents” as opposed to “citizens”

The people who raise this as a problem must be under the impression that the UID number by itself confers some benefit. But it doesn’t. Let’s say the Secretary in charge of PDS thinks that only citizens should get the benefit of subsidized rice and an illegal immigrant from Bangladesh should not. Perhaps he thinks that by giving a UID number to the Bangladeshi immigrant we are enabling him to “take advantage” of the PDS.

We are not. Whatever verification procedures are used by the PDS today to distinguish between an illegal immigrant and a real citizen should stay in place. The UID could be an additional layer of verification (you do have to show some government ID to get the UID) but it cannot and should not replace what PDS has in place. However, once the beneficiary’s qualifications have been verified by PDS, his UID is linked to his eligibility for subsidized rice. He uses biometric identification to get his rice.

The same logic applies to getting a passport or anything that is a benefit for citizens but not for residents.

But then you might ask, why not just have Aadhaar cover citizens and not residents. Here are two good reasons why:

– Residents may not have entitlements. But remember this is not just about entitlements from the government. There are KYC requirements for opening a bank account where UID can help. And non-citizen residents can also open accounts.
– To distinguish between a citizen and a resident is not an easy process. It is best done by other departments, like the Home Ministry. It would greatly slow down Aadhaar if they had to do it.

Issues related to privacy of those who have been assigned a UID number

Aadhaar has been designed to give answers to questions like “Is this man whose thumb is on the scanner, Ram Mohan?” It replies with a yes or a no. It does not answer questions like “What is the name and address of the man whose UID number is 12345…?”

This is as good as it gets from a privacy standpoint. Now that doesn’t mean that it will be foolproof. Nothing is. After all there is a database somewhere where names and addresses and UID numbers are stored. But isn’t that true about any database anywhere in the world? If you want to live in the modern world and one day become a first world country you are going to have your biometric identification somewhere.
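The yes-or-no design described above can be sketched in a few lines of Python. This is purely an illustration of the interface idea and not how UIDAI actually implements anything: the class and method names are hypothetical, and real biometric matching is fuzzy, whereas this sketch stands in an exact hash comparison for it.

```python
import hashlib
import hmac


class IdentityAuthority:
    """Hypothetical registry that only ever answers yes or no."""

    def __init__(self):
        # Private store: uid -> (name, digest of the biometric template).
        self._records = {}

    def enroll(self, uid, name, biometric_template):
        digest = hashlib.sha256(biometric_template).digest()
        self._records[uid] = (name, digest)

    def verify(self, uid, name, biometric_template):
        """Is the person presenting this biometric the named holder of uid?"""
        record = self._records.get(uid)
        if record is None:
            return False
        stored_name, stored_digest = record
        presented = hashlib.sha256(biometric_template).digest()
        return stored_name == name and hmac.compare_digest(stored_digest, presented)

    # Deliberately no lookup(uid) method: callers cannot ask for a
    # name or address, they can only confirm a claim they already hold.


authority = IdentityAuthority()
authority.enroll("uid-0001", "Ram Mohan", b"thumbprint-of-ram-mohan")
print(authority.verify("uid-0001", "Ram Mohan", b"thumbprint-of-ram-mohan"))
print(authority.verify("uid-0001", "Ram Mohan", b"someone-elses-thumb"))
```

The privacy property comes from what the interface leaves out. The database still exists behind it, which is exactly the residual risk discussed above.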

Home Minister P. Chidambaram has also raised issues about security weaknesses in Aadhaar. “The possibility of creating fake identity profiles is real” he writes. I can’t see how that would happen given that the biometric data has to belong to a real person and it can’t be someone who is already in the database.

Perhaps he means that non-citizens can get a UID number and that shouldn’t be allowed. As I have argued above, it is not UIDAI’s responsibility to qualify people for citizenship. The Home Ministry should continue using the methods they use today like police verification for passports.

The problem in tackling objections related to privacy or security is that the person who is in charge of security or privacy has to just think of scenarios where your system will break. An honest discussion about the probability of the event and its downside risk is never really possible if the people objecting have an agenda. And you can be sure that most people who are opposing Aadhaar have an agenda.

Duplication of work being done for National Population Register

I haven’t paid it much attention, but my guess is that the National Population Register is a program for identification plus it also classifies people into citizens and non-citizens. Why can’t the National Population Register use Aadhaar as its ID infrastructure? Or if it provides better ID infrastructure let’s do a “dare to compare” and pick the better one.

Aadhaar is not just a superior technical solution. Its implementation is designed to be scalable at low cost. Which is why they have been making such rapid progress. It helps that Nandan Nilekani ran a multibillion-dollar tech company before he volunteered to do this. He knows how to do this. And he has just a single point agenda – he is in a position to do some good for the country and he is taking that chance. Try doing something like this with a politician at the helm.

The massive expenditure that the project entails

If you have a big country, it takes a lot of money. I have seen some estimates that the cost of enrolling the whole country is just over $3B. Compare that with the cost of subsidies on food, fertilizer and petroleum at over $29B per annum. Some say that the leakages in just the PDS are 85% of a total budget of $12B. You do the math. And that is just the savings in one entitlement program.
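Here is that math as a rough Python sketch. The $3B, $12B and 85% figures are the estimates quoted above, which I have not verified. The share of leakage that Aadhaar could actually recover is unknown, so the `recovered_fraction` values below are purely hypothetical.

```python
ENROLLMENT_COST_BN = 3.0   # "just over $3B", rounded down for simplicity
PDS_BUDGET_BN = 12.0       # annual PDS budget, per the estimate above
PDS_LEAK_RATE = 0.85       # share of the PDS budget said to leak

annual_leak_bn = PDS_BUDGET_BN * PDS_LEAK_RATE


def payback_years(recovered_fraction):
    """Years for recovered PDS leakage alone to repay the enrollment cost."""
    return ENROLLMENT_COST_BN / (annual_leak_bn * recovered_fraction)


print(f"Annual PDS leakage: ${annual_leak_bn:.1f}B")
for fraction in (1.0, 0.5, 0.25):  # hypothetical recovery rates
    print(f"Recover {fraction:.0%} of it: payback in {payback_years(fraction):.2f} years")
```

Even if only a quarter of the leakage were plugged, the enrollment cost would be repaid in a little over a year on these numbers, from one program alone.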

The truth is that these questions about Aadhaar are not being posed by people who want India to have an identification system that brings us into the 21st century. I don’t know what their agendas are. But I do know that if 85% of PDS subsidies are leaked through corruption, the numbers are large enough that there will be powerful forces ranged against anything like Aadhaar that threatens to destroy the gravy train. I also know that a program with a $7B budget is big enough that people will want a piece of the action. And if they can’t get it, the next best thing is to bring the whole thing down.

If I can’t get mine, nobody can. India be damned.