Links on Data

CalculatedRisk rounds up some links on how data collection can come under political fire, which is, of course, terrifying. He also tells this story:

The Depression led to an effort to enhance and expand data collection on employment, and I was hoping the housing bubble and bust would lead to a similar effort to collect better housing related data. From the BLS history:

[T]he growing crisis [the Depression], spurred action on improving employment statistics. In July [1930], Congress enacted a bill sponsored by Senator Wagner directing the Bureau to “collect, collate, report, and publish at least once each month full and complete statistics of the volume of and changes in employment.” Additional appropriations were provided.

In the early stages of the Depression, policymakers were flying blind. But at least they recognized the need for better data, and took action. All business people know that when there is a problem, a key first step is to measure the problem. That is why I’ve been a strong supporter of trying to improve data collection on the number of households, vacant housing units, foreclosures and more.

New data is of little use until it has some history behind it, and if we had more data on what happened in the Great Depression we might not be scratching our heads as much today. Here’s an example of a chart that tells some kind of story but really doesn’t have enough history to teach us much of use:

(The chart annoys me in that these two datasets clearly have radically different statistical properties: they don’t belong on the same scale, and probably not even on the same chart.)
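If two series really must be compared, one common workaround is to standardize each one first, so that both are expressed in standard deviations from their own mean rather than forced onto a shared raw scale. The snippet below is only a minimal sketch of that idea: the function name `standardize` and both series are made up for illustration and have nothing to do with the chart above.

```python
from statistics import mean, pstdev


def standardize(series):
    """Rescale a series to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation
    if sigma == 0:
        raise ValueError("series is constant; cannot standardize")
    return [(x - mu) / sigma for x in series]


# Hypothetical numbers, invented purely for illustration.
series_a = [100.0, 102.0, 101.0, 105.0, 110.0]  # e.g. an index level
series_b = [0.2, 0.8, -0.5, 1.5, 3.0]           # e.g. a percent change

z_a = standardize(series_a)
z_b = standardize(series_b)

# Both series are now in units of "standard deviations from own mean".
print([round(z, 2) for z in z_a])
print([round(z, 2) for z in z_b])
```

Standardizing only fixes the scale problem. It does nothing about series that differ in other ways (a trending level versus a noisy rate of change, say), which may be a good reason to keep them on separate charts anyway.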

So government datasets are excellent because they’re (mostly) impartial and consistently measured: I’d rather have a consistently flawed dataset that I can correct than one whose basis changes unpredictably throughout.

But it’s painful to audit data collection and analysis policies, which is why it took so long for economists to figure out that the way the government measures productivity changes due to offshoring is garbage. Michael Mandel blew the lid off this recently and taught us all a lesson.

But governments aren’t the only game in town. There are countless surveys of this and that group (architects, real estate agents, industrial producers, etc.), which are OK, but big data is hopefully changing that, too. MIT’s Billion Prices Project is a ‘simple’ web scrape but is potentially a vastly better measure of inflation in the cost of goods. Check out their charts.
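To give a feel for how scraped prices could become an inflation number, here is a minimal sketch. It assumes you already have two snapshots of prices keyed by product, and it uses a Jevons-style index (the geometric mean of price ratios for products seen in both snapshots). That is one standard textbook choice, and I am not claiming it is what the Billion Prices Project actually does. The function name `jevons_index` and all the prices are hypothetical.

```python
import math


def jevons_index(prev_prices, curr_prices):
    """Geometric mean of price ratios for items present in both snapshots."""
    common = [item for item in prev_prices if item in curr_prices]
    if not common:
        raise ValueError("no items in common between the two snapshots")
    log_sum = sum(
        math.log(curr_prices[item] / prev_prices[item]) for item in common
    )
    return math.exp(log_sum / len(common))


# Hypothetical scraped prices, invented purely for illustration.
day1 = {"milk": 2.00, "bread": 3.00, "eggs": 4.00}
day2 = {"milk": 2.10, "bread": 3.00, "eggs": 4.20, "tea": 5.00}

# "tea" is ignored because it has no earlier price to compare against.
index = jevons_index(day1, day2)
print(f"price change: {100 * (index - 1):.2f}%")
```

Matching only the products that appear in both snapshots is the simplest way to handle items that come and go from a site. A real index would also have to deal with weighting, sales, and product substitutions.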

Hopefully data won’t be a bottleneck to knowledge some day.

Addendum: Michael Mandel reports a huge revision in the figures for domestically produced computers:

There are four important points here.

1) A big chunk of those computer shipments were supposedly going into domestic nonresidential investment. Post-revision, either the U.S. investment drought was deeper than we thought, or imports of computers were a lot bigger (see the recent PPI piece on Hidden Toll: Imports and Job Loss Since 2007).

2) The U.S. shift from the production of tangibles to the production of intangibles (think the App Economy) has been even sharper and more pronounced than we realized.

3) Budget cutbacks for economic statistics, such as the House Republicans are proposing, would increase the odds of big revisions like this one.

4) Bad data leads to bad policy mistakes, especially at times of turmoil. We need more funding for economics statistics, rather than less.

