I was sitting in a cafe last week in Woods Hole, Massachusetts with a senior director of oceanographic and atmospheric research. He was talking about big data, the process of gleaning useful information out of millions and billions of data points. The usual problem with statistics has been capturing enough samples to form credible predictions, such as “valid within 5 percentage points 19 times out of 20”. That is the kind of precision you get from a thousand responses, if you are lucky. With big data the issue is managing the analysis, because the number of data points is so large. Sampling errors hardly matter; indeed, with big data you have got rid of the sample altogether: you are analyzing the whole set of data points.
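For readers who want to see where a phrase like “19 times out of 20” comes from, here is a rough sketch of the textbook margin-of-error calculation for a survey proportion. It assumes a simple random sample, the worst-case split of 50/50, and the normal approximation with z = 1.96 for 95% confidence; the function name `margin_of_error` is my own invention for illustration, not anything the director mentioned.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n responses.

    Assumes simple random sampling and the normal approximation;
    p=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a small poll, a typical poll, and a "big data"-sized set.
for n in (400, 1_000, 1_000_000):
    points = 100 * margin_of_error(n)
    print(f"n = {n:>9,}: +/- {points:.2f} percentage points")
```

Under these assumptions the margin shrinks with the square root of the number of responses, which is why it all but vanishes at a million data points. The sketch only covers sampling error; it says nothing about bias in how the data were collected.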
His comments about academic social science were priceless:
Up until now we have had a few social scientists in the 19th century, like Dickens and Tolstoy. Then we have had a century of academic crap in universities, so dominated by jargon and cant as to be useless. Big data promises the possibility of having real social science in the 21st century.
McKinsey has a useful piece on big data, if you are interested.