LibrarianShipwreck

Libraries, Archives, Technology, Impending Doom

You Better Not Cry, but You Better Shout! (Big Data is coming to town)

Humans sneeze information, and while most of us then just throw away the tissue, there are others who greedily fish this rag from the trash to analyze the traces left behind (I know, kind of gross). Those informational strands we generate may seem insignificant to us, but they provide a wealth-generating wealth of information for those with the ability and desire to try to make sense of it all. Behold! “Big data”! A term that connotes taking the troves of information now made available through our daily actions and finding in them connections and marketable meaning.

For, as we go about our daily routines, most of us (perhaps unwittingly) generate hundreds of fragmentary facts about ourselves. How much we just spent on books, what books they were, which news stories we read, what stores we walked by, what things we like and “like,” all of these pieces of information may be ignored by us as we create them. Truthfully, many of these informational moments are simply a result of us going about normal routines.

“Big data” is not a term that expresses a quantity of data collected as much as it is a sort of vision of what can be done with this information. The claim is that, now that all of this information has been gathered, it can be harnessed for all manner of purposes: it can tell you what you want to buy before you know it, it can help predict antisocial behaviors that may evince terroristic tendencies, it can predict your sexual orientation and income level, and it can suggest what movies you might like based on your past viewing history (once all of these things are cross-referenced against thousands of other people with similar histories).

Claiming that there is tremendous use value in “big data” is inevitable, a sort of late game defense by those using it. For the idea that “big data will solve all of our problems” serves as a justification for why so many companies are gathering so much information about us in the first place. Why does Google need all of this information: because of “big data!” Why does Facebook need to map out our connections in so much detail: because of “big data!” Why must I rate (on a five star scale) my opinion of the movie I just watched: because…you get the point. It would be like you smashing me over the head with a skillet, me going to the hospital, getting an x-ray, being told by the doctor that I have a brain tumor, and you proceeding to claim that the discovery of the tumor retroactively justifies smashing me over the head.

Big data is a cause that has many supplicants and evangels, folk who are now going out into the digital and physical realm to preach the “good news.” Indeed, big data seems a confused sort of response to Neil Postman, who wrote (in his book Technopoly):

“Information is dangerous when it has no place to go, when there is no theory to which it applies, no pattern in which it fits, when there is no higher purpose that it serves.” (Postman, 63).

Postman worried that too much information was rendering all information meaningless and creating a vast and dangerous sort of glut. Yet “big data” replies that the “place…theory…pattern” and “higher purpose” are all the same and they are “big data,” which will take every insignificant speck of information and collate it into something more impressive, something that will make the world a better (or at least more efficient) place! Huzzah!

Or, if you prefer, not huzzah.

Some much needed skepticism towards the faith in “big data” can be found in Kate Crawford’s recent piece at Foreign Policy titled “Think Again: Big Data – Why the rise of machines isn’t all it’s cracked up to be.” Whereas many of the converts to “big data” see in it only the scientific sheen of pure “neutral” information being objectively sorted through by emotionless (and therefore non-judgmental) computer processes, Crawford is willing to issue the reminder that those systems gathering the information and those programs trying to make sense of it all are still designed by human beings.

Crawford uses her article to confront some of the main talking points of the “big data” preachers (“With enough data, the numbers speak for themselves;” “Big data will make our cities smarter and more efficient;” “Big data doesn’t discriminate between social groups;” “Big data is anonymous, so it doesn’t invade our privacy;” and “Big data is the future of science”). It is not so much that Crawford rejects these claims outright, but she does point out the flaws in the logic found therein, particularly by calling attention to the fact that all of these systems wind up being largely biased towards the more technologically connected members of a population who are (as a result of being more “plugged in”) creating much more data. Thus, Crawford is able to demonstrate some of the biases inherent in the logic of “big data,” whilst also calling attention to some of the more concerning features of a “big data” society (for example: all that information really isn’t all that anonymous). Summing up, Crawford writes:

“Given the immense amount of information collected about us every day — including Facebook clicks, GPS data, health-care prescriptions, and Netflix queues — we must decide sooner rather than later whom we can trust with that information, and for what purpose. We can’t escape the fact that data is never neutral and that it’s difficult to anonymize. But we can draw on expertise across different fields in order to better recognize biases, gaps, and assumptions, and to rise to the new challenges to privacy and fairness.”

These are worthy concerns that Crawford notes, and her article provides a needed (and nuanced) reading of some of the talking points in the big data debate. Yet Crawford seems to fall victim to the mistaken, “big data”-driven rejoinder to Postman, a sort of acceptance that the gathering of all of this information is worthwhile, that something must be done with it all. Missing from Crawford’s piece is the sense that perhaps a proper response to the concerns generated by “big data” would be (to be a bit extreme) to prevent groups like Google, Facebook and the government (at least not without a warrant) from gathering this information at all. What if the response to Crawford claiming, “we must decide sooner rather than later whom we can trust with that information,” is to say “nobody”?

Part of the unspoken premise buried within “big data” is the concept that all of this information should be collected in the first place, and as long as we accept that this data should be collected it makes “big data” the almost inevitable end point. The skillet blow is justified by the tumor’s discovery (i.e. the violation of privacy is justified by better movie suggestions).

“Big data” turns every aspect of a person’s life into just another data point to be analyzed. By looking ever more closely at every moment of a human’s life, “big data” seeks to reduce people to statistics, and aims to turn all of their actions into numbers that can be mathematically represented and fed into a program. Even before the invention of the Internet, computers, smart phones, tablets, (heck) before the invention of the printing press, people were generating a great deal of information about themselves. What has changed is that we now live in a world in which all of this information can be captured and then put into systems that will construct from it a profile far more detailed and worrisome than anything previously assembled by the secret police of a repressive regime. After all, Google and Facebook don’t need to engage in any manner of skullduggery or intimidation to read your personal messages; they’re just doing it. “Big data” is not really about humans creating information (we’ve always done that); it’s about the big systems that can capture it all, and that is new. Or, as Neil Postman wrote (quoting from Technopoly):

“Technology increases the available supply of information. As the supply is increased, control mechanisms are strained. Additional control mechanisms are needed to cope with new information. When additional control mechanisms are themselves technical, they in turn further increase the supply of information. When the supply of information is no longer controllable, a general breakdown in psychic tranquility and social purpose occurs.” (Postman, 72).

“Big data” is a perfect example of this cycle from information to technology to more information to more technology, but the key point in the above quotation is Postman’s comment about “psychic tranquility and social purpose.” It is quite difficult for people today to truly be in control of their information, for we can never be certain what information our actions are generating, who is tracking it, and what seemingly insignificant action on our part may (when fed into a larger algorithm) transform us into a person of interest.

Who is making use of the fact that you just downloaded that album (legally or illegally)? Who is making use of the location pings from your smart phone? Who is getting to know you by seeing that you like “Movie Star A,” “Television Program Q,” have “liked” several videos featuring sleeping kittens, and also play two hours a night of “violent video game D”? To actively think about this is to greatly disrupt our personal “psychic tranquility,” but to ignore that this information is being used is to let ourselves forget that (sadly) somebody (or something) really is out there watching. Today they might just want to sell you shoes, but what if tomorrow your interest profile is identical to that of a suspected terrorist?

What are the implications for our privacy and our personal lives if we act knowing that everything we do is being watched, noted, and cataloged by some unseen forces? Will it keep us from acting? Will it regiment our actions? And, at the same time, what happens if we pay no heed to the fact that everything we do is being watched, noted, and cataloged? Will it keep us from getting hired?

Part of what Crawford alludes to in her concluding paragraph (which was quoted above) is worry over who it will be that is gathering and making use of this information. Yet, without meaning offense, it is really a rather naïve worry. To gather and then analyze massive amounts of information requires a level of resources and computing power that rests with a rather limited range of groups. The list of those who might qualify as the “who” is short; not much wondering is required. Or, as Langdon Winner wrote (in The Whale and the Reactor):

“Current developments in the information age suggest an increase in power by those who already had a great deal of power, an enhanced centralization of control by those already prepared for control, an augmentation of wealth by the already wealthy…those best situated to take advantage of the power of a new technology are often those previously well situated by dint of wealth, social standing, and institutional position.” (Winner, 107)

In other words, those who have risen to social and economic prominence by gathering vast amounts of information will be able to use that information to help maintain their social and economic prominence. True, this amassed information may help some of these companies “provide better customer service” but is receiving pretty decent book recommendations from Amazon worth Amazon knowing every book (and other item) you’ve looked at on their website? Are marginally better search results from Google worth Google eternally remembering everything for which you have ever searched?

While one of the questions about “big data” is certainly whether this information will be used for “good or evil,” such a question is ignoring the real issues: why is this information being gathered in the first place? What kinds of people and groups would want to gather this information? And, do people not have a right to control their own information?

Before “big data” could become something that blog posts are written about, hundreds and thousands of decisions were made in boardrooms and garages, decisions made not by neutral “big data” but by biased human beings. Was it not recognized that creating such a vast information-gathering apparatus could be dangerous (did Eric Schmidt wonder this)? And did anybody wonder what “big data” represents in terms of our view of what it means to be human? After all, your iPhone may know that you met with a friend at a given café, Google may know the list of cafés you considered meeting at, and Facebook may know that your friend has just had a breakup, but even with all of this contextual information, can any of these measure your empathy or your ability to listen?

Or, as Lewis Mumford wrote in The Pentagon of Power (volume 2 of The Myth of the Machine):

“The computer still suffers from the same radical weaknesses that undermined the decisions of Kings and Emperors: the only information it heeds is that which is fed into it by its Grand Viziers and courtiers; and as usually happened with kingship, the courtiers—read mathematical model-makers and programmers—ask the king only for such answers as can be based on the inadequate information they supply. That information must ignore many significant aspects of human experience in order to conform to His Majesty’s peculiar limitations.” (Mumford, 273).

All of human existence is reduced to just that which can be easily turned into a data point and fed into a calculation. And thus (Mumford again):

“the final purpose of life in terms of the megamachine at last becomes clear: it is to furnish and process an endless quantity of data, in order to expand the role and ensure the domination of the power system.” (Mumford, 275).

Despite what the high priests (or “Grand Viziers and courtiers” if you prefer) might claim about “big data,” the fact is that at best (or at worst) “big data” can only capture a fraction of what the human experience truly is, and but a slice of who we truly are. It may be able to take all of our preferences and predict what we might also be interested in buying; but in so doing it reduces us from people into nothing more than consumer profiles. The only things that matter to “big data” are the informational droppings that can be fed into the system, and we are more than that.

“Big data” is an amoral Santa Claus, but unlike Santa, “big data” really exists: it knows when you’re awake, knows when you’re asleep, knows if you’ve been bad (illegally downloading) or good (rating every purchase), and exhorts you therefore to “Be good for goodness sake” (and now you will have that song stuck in your head all day). Yet “good” in this context is less a moral category than a type of consumer.

Perhaps the gravest danger, beyond this information being gathered in the first place, is for us to believe that what “big data” says about us is all that we are. For you are not just a bunch of numbers, even if that’s all that “big data” thinks of you.

Three books are quoted in this article:

Mumford, Lewis. The Myth of the Machine: II. The Pentagon of Power. Harvest/Harcourt Brace Jovanovich Publishers, 1970.

Postman, Neil. Technopoly: The Surrender of Culture to Technology. Vintage, 1993.

Winner, Langdon. The Whale and the Reactor: A Search for Limits in an Age of High Technology. University of Chicago Press, 1986.

[Note – the site stopthecyborgs.org also has an interesting treatment of Crawford’s article, comparing some aspects of “big data” to phrenology]

About TheLuddbrarian

"I won't explain myself because I hate common sense." librarianshipwreck.wordpress.com @libshipwreck

18 comments on “You Better Not Cry, but You Better Shout! (Big Data is coming to town)”

  1. T E Stazyk
    May 13, 2013

    This is an interesting complement to your post the other day about technology and freedom. The scary thing is that Big Data is not necessarily clean (in a reliability sense) data and correlation is not causality and in the hands of people with spreadsheets the data can prove just about anything anyone wants.
