Wednesday, November 05, 2008

Tony's epiphany: Is Google evil?

Tony's had an epiphany on the dangers of traffic analysis and the power of Google to do evil through collecting masses of personal data in small bites, linking it all together and drawing conclusions from our digital shadows so constructed!

"Now call me paranoid, but I suddenly twigged why I thought the Google announcement about an extension to the Google Visualisation API that will enabl[e] developers to display data from any data source connected to the web (any database, Excel spreadsheet, etc.), not just from Google Spreadsheets could have some consequences.

At the moment, the API will let you pull datatable formatted data from your database into the Google namespace. But suppose the next step is for the API to make a call on your database using a query you have handcrafted; then add in some fear that Google has already sussed out how to Crawl through HTML forms by parsing a form and then automatically generating and posting queries using those forms to find more links from deep within a website, and you can see how giving the Google API a single query on your database would tell them some “useful info” (?!;-) about your database schema - info they could use to scrape and index a little more data out of your database…

Now of course the Viz API service may never extend that far, and I’m sure Google’s T&C’s would guarantee “good Internet citizenry practices”, but the potential for evil will be there…

And finally, it’s probably also worth mentioning that even if we don’t give the Goog the keys to our databases, plenty of us are in the habit of feeding public data stores anyway. For example, there are several sites built specifically around visualising user submitted data, (if you make it public…): Many Eyes and Swivel, for example. And then of course, there’s also Google Spreadsheets, Zoho sheet etc etc.

The race for data is on… what are the consequences?!;-)...

In the previous couple of posts, I’ve rambled about web apps that will find a book from its cover, a song just by playing it, and your online contacts across a myriad of services from your username on a single service.

But today I saw something that brought home to me the consequences of aggregating millions of tiny individual actions, in this case photo uploads to the flickr social photo site.

From my reading of the post, the purple overlays in the images above - not the blue bounding boxes - are generated automatically by clustering geotagged and placename-tagged images and extrapolating a well-contoured shape around them.

That is, from the photos tagged “London”, the algorithm creates the purple “London city” overlay in the above diagram.

For each and every photo upload, there is maybe a tiny personal consequence. For millions of photo uploads, there are consequences like this… (From millions of personal votes cast, there’s the possible consequence of change…)

And it struck me that even the relatively unsophisticated form of signals intelligence that is traffic analysis was capable of changing the face of war. So what are the consequences of traffic analysis at this scale?

What are the possible consequences? What are we walking into?

(Of course, following a brief moment of “I want to stop contributing to this; I’m gonna kill my computer and go and grow onions somewhere”, I then started wondering: “hmm, maybe if we also mine the info about what camera took each photo, and looked up the price of that camera, we might be able to generate socio-economic overlays over different neighbourhoods, and then… arrghh… stop, no, evil, evil…;-)"

I saw the following on Google Blogoscoped: Search Google Profiles, which describes a new Google search feature. (Didn’t know you had a Google Profile? If you have a Google account, you probably do - http://www.google.com/s2/profiles/me/? And if you want to really scare yourself with what your Google account can do to you, check http://www.google.com/history/… go on, I dare you…)

I had a quick look to see if I could find a link for the new profile search on my profile page, but didn’t spot one, although it’s easy enough to find the search form here: http://www.google.com/s2/profiles. (Maybe I don’t get a link because my profile isn’t public?)"

I've been nagging Tony and other colleagues about the downside of a Googleville Panopticon World for years, but it would be a shame if he killed all his computers and devoted his considerable talents to allium ecology. We do, however, need to be actively building the social, legal, economic, architectural and environmental infrastructure to go along with the technological development that the Tony Hirsts of this world are so enthused by, to at least nudge the Googles of this world to behave in better-than-amoral ways. Bottom line - if it costs them, economically, socially, legally etc., to be evil, they will be less likely to lean in such a direction. Our capacity to regulate in positive ways, however, is not good - the political elite have little understanding of what they are dealing with technologically - and neither is our record. So despite the hype/hope surrounding Obama's election today, I'm not optimistic about this complex systemic mess (of the Ackoff variety) being addressed remotely appropriately any time soon.

Update: Tony isn't the only one wondering if Google is data mining us to death
