Not so anonymous

Many companies, such as Google, use your personal information after it has been anonymized. That is, they remove identifying details such as your name and IP address from the stored information before they use it. Instead, they assign you a unique ID and include other information that may be known, such as gender, zip code, area code (for your phones), and birth date. Google consolidates this information from its other services, most importantly Google Plus, where your profile contains it.

Yet anonymous information is not as anonymous as it sounds. Paul Ohm of the University of Colorado reports that 87% of the people in a sample list containing only birth date, sex and zip code could be re-identified.[1] Further, some companies such as AOL store the IP address with the information. This increases the ability of third parties to identify people from the data.
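To see why those three fields are so revealing, here is a minimal sketch that counts how many records in a list are the only ones with their particular combination of birth date, sex and zip code. The records, field names and the unique_fraction helper are all made up for illustration; they are not taken from the study cited above.

```python
from collections import Counter

# Hypothetical "anonymized" records: no names, only a made-up ID
# plus the three fields discussed above.
records = [
    {"id": "u1", "birth_date": "1970-03-14", "sex": "F", "zip": "80302"},
    {"id": "u2", "birth_date": "1970-03-14", "sex": "M", "zip": "80302"},
    {"id": "u3", "birth_date": "1985-11-02", "sex": "F", "zip": "80301"},
    {"id": "u4", "birth_date": "1985-11-02", "sex": "F", "zip": "80301"},
    {"id": "u5", "birth_date": "1992-07-30", "sex": "M", "zip": "80027"},
]


def quasi_key(record):
    """Combine the three fields into one lookup key."""
    return (record["birth_date"], record["sex"], record["zip"])


def unique_fraction(rows):
    """Fraction of rows whose field combination appears exactly once."""
    counts = Counter(quasi_key(r) for r in rows)
    unique = sum(1 for r in rows if counts[quasi_key(r)] == 1)
    return unique / len(rows)


print(f"{unique_fraction(records):.0%} of these records are unique")
```

Any record counted as unique here could be tied to a name by anyone who holds a second list with the same three fields and names attached.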

Simply mashing up public records with anonymous medical records allowed one research group to identify the health record of the governor of Massachusetts.[2] They did this using medical records with no identifying information except a unique number, yet the governor was singled out of the entire dataset. When researchers at MIT and the Université Catholique de Louvain in Belgium looked at the rough geolocations of 1.5 million mobile users in a small European country, they were able to uniquely identify 95% of the people.[3] Today, your location is tracked by numerous apps, and with data available through DMs plus the information you provide, your behaviors and identity are available.
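A mash-up of this kind can be sketched in a few lines. The example below assumes, purely for illustration, that both lists share a birth date, sex and zip code; the names, record numbers and the link_records helper are hypothetical and are not the researchers' actual data or method.

```python
from collections import defaultdict

# Hypothetical "anonymous" medical records: a record number, no names.
medical = [
    {"record_id": 1001, "birth_date": "1950-01-05", "sex": "M",
     "zip": "02138", "diagnosis": "condition A"},
    {"record_id": 1002, "birth_date": "1982-06-21", "sex": "F",
     "zip": "02139", "diagnosis": "condition B"},
]

# Hypothetical public records (for example, a voter list) with names.
public = [
    {"name": "Alex Example", "birth_date": "1950-01-05", "sex": "M", "zip": "02138"},
    {"name": "Sam Sample", "birth_date": "1982-06-21", "sex": "F", "zip": "02139"},
    {"name": "Pat Placeholder", "birth_date": "1982-06-21", "sex": "F", "zip": "02139"},
]


def shared_key(row):
    """Fields assumed to appear in both lists."""
    return (row["birth_date"], row["sex"], row["zip"])


def link_records(medical_rows, public_rows):
    """Map record_id -> name when exactly one public record matches."""
    names_by_key = defaultdict(list)
    for person in public_rows:
        names_by_key[shared_key(person)].append(person["name"])

    linked = {}
    for row in medical_rows:
        candidates = names_by_key.get(shared_key(row), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            linked[row["record_id"]] = candidates[0]
    return linked


print(link_records(medical, public))
```

In this toy data, record 1001 links to a single name, while record 1002 stays ambiguous because two people share the same three fields.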

One of the most interesting examples of de-anonymizing information comes from the Netflix contest to find a better recommendation algorithm. Netflix released over 100 million ratings from 500,000 subscribers and offered a $1 million prize for the best algorithm. Yet two University of Texas (Austin) researchers were able to identify subscribers using public records. Their technique was straightforward: using movie reviews publicly available on IMDb, they matched the timing and content of a person's IMDb reviews to those of the anonymous subscribers. Once a match was uncovered, they were also able to associate political views and other sensitive personal information with the person. This was all done with public records, showing that large swaths of the subscribers in the data released by Netflix could be identified this way.[4]
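The matching idea can be illustrated with a much simplified sketch. It is not the researchers' actual algorithm: the scoring rule, the tolerances, the subscriber IDs and the movie titles below are all assumptions made up for this example, and both sites are pretended to use the same rating scale.

```python
from datetime import date

# Hypothetical anonymous subscriber ratings: (title, stars, date rated).
anonymous = {
    "sub_17": [("Movie A", 5, date(2005, 3, 1)),
               ("Movie B", 2, date(2005, 3, 9)),
               ("Movie C", 4, date(2005, 4, 2))],
    "sub_42": [("Movie A", 3, date(2005, 1, 15)),
               ("Movie D", 5, date(2005, 2, 20))],
}

# Hypothetical public reviews posted under a known profile.
public_reviews = [("Movie A", 5, date(2005, 3, 2)),
                  ("Movie C", 4, date(2005, 4, 1))]


def overlap_score(anon_ratings, known_ratings, max_days=3, max_star_gap=1):
    """Count public reviews that have a close counterpart in the anonymous data."""
    score = 0
    for title, stars, when in known_ratings:
        for a_title, a_stars, a_when in anon_ratings:
            same_movie = a_title == title
            close_stars = abs(a_stars - stars) <= max_star_gap
            close_date = abs((a_when - when).days) <= max_days
            if same_movie and close_stars and close_date:
                score += 1
                break
    return score


def best_match(known_ratings, anon_data, min_score=2):
    """Return the subscriber ID that clearly fits best, or None if unclear."""
    scored = sorted(
        ((overlap_score(ratings, known_ratings), sub_id)
         for sub_id, ratings in anon_data.items()),
        reverse=True,
    )
    top_score, top_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    if top_score >= min_score and top_score > runner_up:
        return top_id
    return None


print(best_match(public_reviews, anonymous))
```

Requiring the best candidate to beat the runner-up is a design choice in this sketch: it avoids naming a subscriber when two anonymous profiles fit the public reviews equally well.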

When the re-identified information is then mashed together with other information gathered from public sources and private companies, quite a lot can be known about a person.

For just where and how they are able to get this personal information, see Every Move You Make.

What can you do to protect yourself from this?  Stay tuned.

[1] Ohm, P. Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. UCLA Law Review, Vol. 57, p. 1701 (2010). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1450006

[2] Sweeney, L. k-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness and Knowledge-Based Systems 10, 557–570 (2002). http://www.worldscientific.com/doi/abs/10.1142/S0218488502001648

[3] “How hard is it to ‘de-anonymize’ cellphone data?” in MIT News, 3/26/13 http://web.mit.edu/newsoffice/2013/de-anonymize-cellphone-data-0327.html

[4] Narayanan, A. and Shmatikov, V. How To Break Anonymity of the Netflix Prize Dataset. arXiv:cs/0610105. http://arxiv.org/abs/cs/0610105