Big Data and Field Experiments: The Case of NYU’s Center for Urban Science and Progress

The NY Times has published a neat article about NYU’s new urban “Big Data” center. The gist of the article is that quants can improve urban quality of life by crunching data on things such as the spatial and temporal distribution of 311 noise calls in NYC.

“The initiative at N.Y.U. is part of a broader trend: the global drive to apply modern sensor, computing and data-sifting technologies to urban environments, in what has become known as “smart city” technology. The goals are big gains in efficiency and quality of life by using digital technology to better manage traffic and curb the consumption of water and electricity, for example. By some estimates, water and electricity use can be cut by 30 to 50 percent over the course of a decade.”

This last sentence is false. “Big data” is a necessary but not a sufficient condition for conservation and better use of scarce urban resources. I am a 100% fan of having better outcome variables (i.e., Y from basic statistics) such as local air pollution, household energy consumption, and noise levels on a given street on a given day, but such information alone does not establish cause and effect. Big data needs to be supplemented with field experiments that randomly assign interventions, such as time-of-day electricity pricing that raises the price of a kWh at peak-use times to protect the grid from overload. Big data allows for better measurement, but social scientists can only establish cause and effect if specific interventions are known to have been implemented (e.g., a ban on traffic near the UN when Castro is in town).
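To make the point concrete, here is a toy simulation of why random assignment matters. It is only a sketch: every number in it (the assumed true effect of peak pricing, the usage distribution, the opt-in rates) is made up for illustration and does not come from any utility's data. When households choose for themselves whether to go on the peak-pricing tariff, a simple comparison of means mixes the pricing effect with pre-existing differences between households. When a coin flip decides, the same comparison recovers the effect.

```python
import random
import statistics

# Assumed (hypothetical) causal effect of peak pricing on daily peak use, in kWh.
TRUE_EFFECT = -2.0


def estimate_effect(n_households, randomized, seed=0):
    """Return mean(usage on tariff) - mean(usage off tariff) for simulated households."""
    rng = random.Random(seed)
    on_tariff_usage, off_tariff_usage = [], []
    for _ in range(n_households):
        # Peak use the household would have without the tariff (not seen by the researcher).
        baseline = rng.gauss(20.0, 5.0)
        if randomized:
            # Field experiment: a coin flip decides who faces peak pricing.
            on_tariff = rng.random() < 0.5
        else:
            # Self-selection: low-usage households are more likely to opt in.
            on_tariff = rng.random() < (0.8 if baseline < 20.0 else 0.2)
        usage = baseline + (TRUE_EFFECT if on_tariff else 0.0)
        (on_tariff_usage if on_tariff else off_tariff_usage).append(usage)
    return statistics.mean(on_tariff_usage) - statistics.mean(off_tariff_usage)


observational_estimate = estimate_effect(20000, randomized=False)
experimental_estimate = estimate_effect(20000, randomized=True)
print(f"assumed true effect:        {TRUE_EFFECT:.2f} kWh")
print(f"self-selected comparison:   {observational_estimate:.2f} kWh")
print(f"randomized experiment:      {experimental_estimate:.2f} kWh")
```

In this setup the self-selected comparison makes the tariff look far more effective than it really is, because the households that opted in were already low users. More records would not fix that; only knowing how the intervention was assigned does.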

I have been working on this issue in my own research using electric utility data by household and month. We can only make progress establishing cause and effect if we know something about the households. In the typical “Big Data” electric utility data set, the only things the researcher knows are the zip code where the household lives and the year and month when the electricity consumption took place. Is such “big data” (there are millions of these records) sufficient to establish how to increase conservation? No!!

Achieving 30 to 50% reductions in water and electricity use will require introducing serious economic incentives for conservation. The article never mentions the word “incentive”; instead it focuses on sociological nudges. That’s a good start, but such social “keeping up with the Joneses” incentives are not sufficient to achieve the aggressive 30% reduction. Engineers need to play nice with the economists and vice versa. There are gains to trade here!