The controversies surrounding the revelations about the NSA are bringing a lot of background information about recent developments in technology onto the radar screen. Twenty years ago, when internet access was opened up to the general public and gave rise to the first dotcom boom, there was a general public consciousness of how dramatically computer technology was developing and how much of an impact it was having on the lives of ordinary people. Since then, the technology that is generally visible has settled down a bit. We are still surfing the internet with browsers and sending email. The changes have mostly been about more and more of the same and the race to make it all mobile.
However, behind the scenes where the serious geeks hang out, things are moving at exponential speed. Hardware and networks are able to crunch more and more data at faster and faster speeds. With fundamentally new approaches like quantum computing being developed, there doesn't seem to be any prospect of a plateau in this trend. Meanwhile, the cost of new hardware is moving in the opposite direction from its power.
One of the important ways in which this power is being used, and one that has a great deal to do with the activities of the NSA, is the software field that has become known as Big Data.
Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions." We are talking exabytes of data and it is happening in all sorts of different fields such as meteorology and genomics. I spent part of my working life dealing with traditional databases on a small scale and know the difficulties of constructing them in a way that you can retrieve the data that you are likely to be looking for. Those approaches are now about as antiquated as the buggy whip.
This has opened up the field of big data analytics, which has spawned an array of new technologies. Major players in private industry like IBM and Google are investing heavily in research. Last year President Obama announced a federal Big Data Initiative committing more than $200 million to data research projects. That, of course, was before the present upheavals.
Even having some technical background in this area, I find myself unable to really grasp the immensity of this. One thing is certain. It is here to stay. We can and should have a public conversation about how it should and should not be used, but it isn't going away.
One of the things that I think we can draw from this is that there is an active partnership between government and private industry, not only in the development of technology but also in its use and application. With programs like PRISM we have gotten a glimpse of it in national security. We don't really know much about how that actually works in practice. We have also seen the role of government contractors such as Booz Allen, which actually operate most of the government programs. They in turn have a web of connections to other corporations.
Everybody has pretty much known that companies like Facebook and Google use user data and activities to make money. That is the subject of ongoing disputes in the US and the EU over privacy regulations. Most people are more or less inclined to take that in their stride. But when big data is found to be in the hands of Big Brother, the level of nervousness begins to rise.