We can now measure the world with growing impunity.
We can use those measurements to extract deep insights about ourselves and the world around us.
And we can act on those insights to improve our businesses, our society, our planet and the lives of each and every one of us.
But one has to ask -- what role could public policy play in all of this? Accelerate the good, avoid the bad?
And the discussion quickly focuses on the notion of "open data" (#opendata on Twitter) -- a set of progressive policies and investments to make these valuable data sets more freely available.
If you haven't yet been "down the rabbit hole" on the amazing power of predictive analytics powered by big data, you have a mind-blowing experience awaiting you.
We -- collectively -- have started to do better in creating analogies that help explain the amazing potential around what is now at our fingertips.
Rick Smolan (the prime motivator behind #HFOBD) uses a vision analogy: imagine you'd gone through your whole life with only one eye, and -- somehow -- scientists created the capability for a second one, and a third one, and so on. You'd have an entirely new perspective on things.
More powerful is Michael Palmer's analogy: "data is the new oil". The potential value is lying all around us: we need to learn to find it, refine it, put it to work -- powering the next wave of economic and social growth -- and hopefully avoid the potential negatives that inevitably lie ahead.
Enter the notion of open data.
Much as governments have a vested interest in exploring, developing and exploiting their energy resources, they have a similar interest in exploring, developing and exploiting their data resources.
At a high level, I believe the government has potentially three roles:
- making government-gathered data sets easy to discover and consume for the public good.
- creating policies for the private sector to do the same.
- enacting legislation that protects the rights of others when it comes to putting all that insight to use.
The UK Open Data Experiment
Perhaps the most visible -- and visionary -- example of the first role comes from the UK government in the form of data.gov.uk. Take a moment to check it out; it's pretty eye-opening.
A group of very forward-looking people came together -- with full legislative support -- to progressively open up the data sets owned by the UK government.
Some will say this is all about transparency in government -- which is not a bad thing in itself. Do a bit of poking around, and you'll find much, much more. For example, you'll see a vibrant collection of new startups that are now taking this new source of "free" digital crude oil, refining it, and putting it to work. You'll also find a thriving community of people who are interested in using government data sets to drive better policy decisions.
From where I sit, data.gov.uk is clearly an engine of both economic growth and social good.
There are other small-scale experiments I've been able to discover around the globe, but nothing quite in this category yet.
Sadly, here in the US, most of the legislative discussion has been around criminalizing the unauthorized sharing of information: online media, customer data sets, etc. While there is definitely a role for government in protecting the rights of others, it doesn't have to come at the expense of broader economic and social good.
And it seems like the UK is leading the way on this.
Update: I somehow missed the US-based "data.gov" site -- doesn't seem to be getting the same attention as the UK-based one, though. I wonder what the difference might be? Thanks to Mchmarny for the tip ...
Creating Policies That Incentivize Information Sharing
Most of our organizations are sitting on potential treasure troves. Unlike other forms of assets, these information bases are a unique beast. The intellectual challenge is that information isn't a physical thing.
To borrow from the language of economics, information can be a non-rivalrous good: a second party can extract value from a data set without impinging on the rights or value seen by the first party. But if you own some of these information assets, you're probably not motivated to share, are you?
Let's say that you're a senior exec in a company, sitting on a huge pile of potentially interesting information assets that might be useful to non-competitors.
You look around, and all you see is downside and very little upside. Perhaps you'd be interested in selling your information, but you'd be entering the great unknown: it's hard to value the asset, the accounting and legal rules are about as clear as mud, there aren't many established marketplaces for doing so, and so on. And what about the risks?
You move on to something else.
What if -- for example -- you could easily donate your information sets to research or a university? Get a nice tax credit, and clear legal protection so that no one could come after you years down the road? There'd be a heck of a lot more information sharing, and richer data sets available -- more "digital crude oil" for powering the next wave of economic growth and social good.
It's not such a wild idea -- plenty of individuals decide to "donate" their personal health care records for the greater good. Although they don't get a tax credit for doing so -- today!!
Protecting The Rights Of Others
Any time there's a good to be exploited -- whether physical or digital -- there are externalities to be considered. Fracking for energy in the US can impinge on drinking water quality, for example. Your right to smoke infringes on my right to breathe. Your right to own a gun can impinge on my perceptions of personal safety. And so on.
The familiar knee-jerk reaction is to go to the source -- make the good in question highly regulated and difficult to obtain. The result is inevitably increasingly strong laws against gathering data, using data, and -- gasp! -- sharing data.
But, once again, we're not dealing with a physical entity; we're dealing with a digital one. Unlimited amounts of data are becoming ridiculously easy to gather from anywhere and everywhere.
BTW, anonymization of data sets is mostly a thin fig leaf at best, based on what I've seen. Data science is largely about re-correlating what you tried to hide through anonymization.
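To make the "re-correlating" point concrete, here is a minimal Python sketch of what is often called a linkage attack. Everything in it is made up for illustration: the records, the names, and the choice of ZIP code, birth year and sex as the fields left behind after the names were stripped. It is not drawn from any real data set, and real re-identification work is considerably more involved.

```python
from collections import defaultdict

# Hypothetical "anonymized" records: names removed, other fields kept.
anonymized = [
    {"zip": "02138", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "migraine"},
]

# Hypothetical public list (think of a membership roll) that includes names.
public = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth_year": 1990, "sex": "F"},
]

# Assumed fields that survive anonymization and also appear in the public list.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build a join key from the shared fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymized, public):
    """Return (name, diagnosis) pairs where the shared fields match uniquely."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[key(person)].append(person["name"])

    anonymized_count = defaultdict(int)
    for record in anonymized:
        anonymized_count[key(record)] += 1

    matches = []
    for record in anonymized:
        k = key(record)
        names = names_by_key.get(k, [])
        # Only claim a match when exactly one named person and exactly one
        # anonymized record share this combination of fields.
        if len(names) == 1 and anonymized_count[k] == 1:
            matches.append((names[0], record["diagnosis"]))
    return matches


print(reidentify(anonymized, public))
```

In this toy data, one record is pinned to a name because its combination of fields is unique in both lists, while the two records that share the same combination stay ambiguous. The more fields a data set keeps, the fewer records get to hide in a crowd like that.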
Using legislation to make certain forms of data increasingly difficult to capture and share strikes me as societally counter-productive, sort of like making computers and the internet hard to access and use.
Yes, it's true that in some parts of the world you need governmental approval to use computers and the internet in an unrestricted manner, but that shouldn't be held up as the ideal model, should it?
Instead, we should be looking at the other end of the spectrum: increased regulation and policy that's aimed at protecting our collective rights in this new digital world: privacy, confidentiality, use for purpose, etc. Focus on the intent, not the ingredients.
Extending our existing, physical-world legal framework in this direction won't be quick, or easy -- but it will need to be done before long.
Open Data: From Public Good To Social Good?
Economics defines a public good as one that's non-excludable and non-rivalrous. Non-excludable is simple: if I'm using something, it's going to be pretty hard to exclude you from doing the same. Examples include transportation infrastructure, street lights, national defense, and the like. Non-rivalrous means that if you use the good, it doesn't really affect my use of it.
Whether wide access to available data sets should truly be seen as a public good is debatable.
But when the focus shifts to social goods -- things that make for a better society -- the view changes considerably. Consider the investments our governments make in education, research, transparent markets, clean air and water, safe streets, etc., and the compelling case for more progressive open data policies begins to be much clearer.
Every nation has physical assets: resources, geography, etc. And we're all familiar with policies to maximize the value of those assets, and avoid the externalities.
Every nation also has people assets: education, the free exchange of ideas, intellectual property and the like. We're all familiar with the policies that attempt to maximize the value of those assets, and avoid the negative aspects.
We now must add a third category: information assets.
What new policies will maximize the value of those assets, and avoid the negative aspects?