Big Data and Not-So-Big Data for the Entrepreneur

There’s been an explosion of interest in big data. It’s easy these days to find many large corporations in the developed world, though only a few in the developing world, seeding research projects centered on big data. I’m all in favor of this, especially when it’s used with a set of hypotheses in mind rather than as a fishing expedition for patterns. In this post, I touch upon different ways in which entrepreneurs use data.

In my academic work with the Harvard biostatistician JP Onnela, I’ve completed an interesting analysis of the Kumbh Mela, a phenomenon that spans centuries in India, leveraging the fact that 2013 was the first major “cellphone Kumbh”, as it were. Here’s a sample: http://arxiv.org/abs/1505.06360. We used 390 million records for this preliminary analysis. This is big data in action for scholarship and policy purposes. Harvard’s Kumbh effort in 2013 has spawned much work at several universities in India and the US.

But other forms of data are as important and take on particular resonance in the emerging markets I frequent.

Consider macro statistics. The numbers out of China are less contentious these days but there were decades when people had to corroborate them by looking at ‘real’ outcomes such as power consumption to see if and by how much the data were inflated.

Similarly, in Myanmar today, businesspeople have to keep track of the currency since the government might well print money and not let anyone know. This means inflation numbers are suspect, and I imagine the real cost of capital then becomes hard to gauge.

As yet another example, Nigeria last year aggressively revisited the way it computes its GDP. GDP numbers are statistical estimates, and its past method did not adequately account for two increasingly important sectors: telecom and large parts of the trading economy, mostly in the informal sector.

One common virtue stands out in all such big data efforts: select the data to look at carefully, and interpret it even more carefully.

There is also “biggish” data, that is, data sets that are rather large but not super-large. For example, at AspiringMinds (Delhi, Beijing, Middle East, US), we use machine learning as the backbone to help recruiters select the candidates most suited to them. Here better analytic techniques on biggish data have been used to launch and nurture a company from scratch. What have we achieved with this? The talent market is rapidly being democratized in these countries, and that’s a big deal for both recruiters and talent.

Here’s something many entrepreneurs find counter-intuitive, in my experience: data, and an orientation to sensible data analytics, are far more important than even these examples suggest, even if you are an entrepreneur in something you think of as not being data-centric. There is no longer any such thing! That’s the first realization all entrepreneurs need to have. In many cases the data may not be big or biggish; it may well be small.

For dealing with “small” data, there are two abilities you should nurture, especially while working in developing countries.

First, triangulate. Because of the dubious nature of some data, and the imperfect quality of all data, stitching together the big picture is even more important in fast-moving developing countries than in places where the sources of data are more robust and vetted. This should also include marrying insights from quantitative and qualitative data.
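To make triangulation a little more concrete, here is a minimal sketch in Python. It assumes you have several rough estimates of the same quantity from different sources, takes the median as a working consensus, and flags any source that strays too far from it. The function name, the 25% tolerance, and the source names and numbers are all invented for illustration; they are not drawn from any real data set.

```python
from statistics import median

def triangulate(estimates, tolerance=0.25):
    """Combine rough estimates of one quantity from several sources.

    estimates: dict mapping a source name to its estimate.
    tolerance: assumed cutoff; a source is flagged when it differs from
               the median by more than this fraction of the median.
    Returns (consensus, outliers), where outliers is a dict of flagged sources.
    """
    consensus = median(estimates.values())
    outliers = {
        source: value
        for source, value in estimates.items()
        if abs(value - consensus) > tolerance * abs(consensus)
    }
    return consensus, outliers

# Hypothetical estimates of a local market's monthly demand (made-up numbers).
market_size = {
    "official_statistics": 120,
    "own_street_survey": 80,
    "distributor_interviews": 85,
}

consensus, outliers = triangulate(market_size)
print("working estimate:", consensus)
print("sources to question:", outliers)
```

A flagged source is not necessarily wrong. It is a prompt to go and ask why it disagrees, which is where the qualitative side of triangulation comes in.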

Second, get used to generating your own data, rather than waiting for it. This in turn means you should develop a tolerance for imperfections in data. As the aphorism goes, don’t let the perfect be the enemy of the good. Usually one does this by running a series of “experiments”. Take an action, measure the outcome, and change how you do things in response. Iterate indefinitely. It sounds implausible in the context of a company or organization, but it isn’t. It’s more an issue of developing an orientation and ethos.
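The act, measure, adjust loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `iterate` helper, the toy revenue formula, and the rule of raising the price by 5 each round are hypothetical stand-ins for whatever action and measurement make sense in your own business.

```python
from statistics import mean

def iterate(current, propose, measure, rounds=10, trials=5):
    """Take an action, measure the outcome, and keep the change only if it helps.

    propose: function that suggests a new setting from the current one.
    measure: function that returns an observed outcome for a setting.
    trials:  how many measurements to average, since real data is noisy.
    Returns the final setting and the full history of (setting, score) pairs.
    """
    best_score = mean(measure(current) for _ in range(trials))
    history = [(current, best_score)]
    for _ in range(rounds):
        candidate = propose(current)
        score = mean(measure(candidate) for _ in range(trials))
        history.append((candidate, score))
        if score > best_score:  # keep the change only when it improves things
            current, best_score = candidate, score
    return current, history

# Hypothetical example: tune a price against a made-up revenue curve.
def toy_revenue(price):
    return price * (100 - price)  # assumed demand of (100 - price) units

def raise_price_by_5(price):
    return price + 5

best_price, history = iterate(30, raise_price_by_5, toy_revenue)
print("price settled at:", best_price)
```

In practice `measure` would be a real observation, such as a week of sales, and the history list is the small data set you have generated for yourself.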

Here are two leads (out of many I can think of) for getting inspiration. In India, Pratham is one of my favorites; it’s the organization that is spreading primary education “goodness” nationwide. Out of its data orientation came the famed ASER report, issued annually to measure the state of India’s schools: a lovely example. In China, my students and I are currently working with a mid-sized food company, Beingmate, in Hangzhou, using data to shed light on the contaminated milk problem in that country.

So get out there, try it, and bring back some data!

 
