Did Netflix Out a Customer? Your Private Details May Be Fodder for a Contest
Suggesting movies is risky business. User picks reveal more than you may think.
Feb. 7, 2010 -- My university, which is in Philadelphia, recently sent an e-mail to faculty and staff reiterating its privacy policy.
Specifically, it said that the Pennsylvania Breach of Personal Information Act requires us to notify a person if we disclose personal information.
Personal Information is defined as "the first name or initial and last name in combination with one or more of the following nonpublic unencrypted pieces of information: a Social Security number, a driver's license number or state identification card, financial account number, credit card or debit card number accompanied by the applicable passwords or security codes."
This is a laudable policy, but almost immediately after reading this e-mail, I read about a contest that the DVD movie rental company Netflix conducted over the last few years.
The contest was intended to elicit from the general public algorithms that would enable the company to improve its suggestions for future selections. To do this, Netflix released a huge trove of data about users' picks of past movies and their ratings of these movies.
Since it wanted a better way of determining other movies these users might like or dislike, the company announced a $1 million prize. The prize would be awarded to that group of researchers whose predictions about a different trove of movie ratings data involving these same users were most accurate.
The users were anonymous, identified only by number. No names, Social Security numbers, driver's license numbers or financial account numbers were released, so the contest complied with Pennsylvania's policy on privacy, undoubtedly a common one across the country.
Netflix also took other measures to anonymize the information, but this did not prevent the company from being sued recently as part of a class action suit by a subscriber for violation of her privacy. An unnamed, in-the-closet lesbian mother has alleged that, by not adequately anonymizing the data set, Netflix outed her and thereby caused her economic and psychological harm.
The explanation: It turns out that people were able to identify specific users by matching their Netflix reviews and ratings with some signed ones the users had posted on the Internet Movie Database.
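To see how such matching can work in principle, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data are invented, the names `best_match`, `anonymous_release` and `signed_reviews` are hypothetical, and the rule of counting exact agreements on movie and rating is a simplification, not the method anyone actually used on the Netflix data.

```python
from typing import Dict, Optional, Tuple

Ratings = Dict[str, int]  # movie title -> star rating


def best_match(
    anon: Ratings,
    public: Dict[str, Ratings],
    min_overlap: int = 3,  # hypothetical threshold for calling it a match
) -> Optional[Tuple[str, int]]:
    """Return (name, overlap) for the signed profile that agrees with
    `anon` on the most (movie, rating) pairs, or None if no profile
    reaches `min_overlap` agreements."""
    best: Optional[Tuple[str, int]] = None
    for name, signed in public.items():
        # Count movies that both records rate with the same number of stars.
        overlap = sum(
            1 for movie, stars in signed.items() if anon.get(movie) == stars
        )
        if overlap >= min_overlap and (best is None or overlap > best[1]):
            best = (name, overlap)
    return best


# Invented "anonymous" release: users identified only by number.
anonymous_release = {
    1001: {"Movie A": 5, "Movie B": 2, "Movie C": 4, "Movie D": 1},
    1002: {"Movie A": 3, "Movie E": 5, "Movie F": 4},
}

# Invented signed reviews posted publicly under a name.
signed_reviews = {
    "alice": {"Movie A": 5, "Movie B": 2, "Movie C": 4},
    "bob": {"Movie E": 1, "Movie F": 2},
}

matches = {
    user_id: best_match(ratings, signed_reviews)
    for user_id, ratings in anonymous_release.items()
}
print(matches)
```

In this toy data, user 1001 agrees with "alice" on three movies and is therefore linked to her name, while user 1002 matches no one. The point is that once a number is tied to a name, every other rating filed under that number, including Movie D here, is tied to the name as well.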
Nevertheless, the lawsuit maintains that in releasing the large data set Netflix had violated the very strict Video Privacy Protection Act, which was passed when the movie choices of Supreme Court nominee Robert Bork were obtained from a video store.