Tuesday, November 9, 2010

Scraping By


The Wall Street Journal has a long article today about the web and privacy. It chooses to dissect a company called RapLeaf and how it collates data on individuals, which it then resells.

I feel slightly sorry for RapLeaf; they are getting more attention than I am sure they think they deserve. The truth is that the whole issue of “privacy” and the web is exploding, and I expect to see a lot more about this in the coming months. Pretty soon the government(s) will get involved, whether we like it or not. We have to hark back to the era when telephone research was king. It was great for a while, then the calls got out of control and the government took notice. We can point fingers at the direct marketing industry, but we had a hand in it too. Eventually we got the “do not call” list and exemptions for MR, but it was a close thing.

Over the last couple of weeks there has been a long debate in a LinkedIn discussion group (“Text Analytics Professionals”) about the ethics of “screen scraping”, which is downloading or scraping information from websites for use in market research. A company called Buzzback, owned by Nielsen, was caught scraping a website called “Patients Like Me” (PLM), which was against the site's terms of service (TOS). The debate has gone back and forth: there are those who feel anything on a website is “fair game”, and there are those who disagree. Do you obey the TOS or ignore it?

I think this debate is going to be irrelevant soon. Like it or not, as sure as night follows day, there will be more legislation about web privacy, and I would expect it to cover screen scraping. What the MR community has to do is make sure it does not get sideswiped by this. Social media is now hugely important to MR, and we can expect the new rules, whenever they come, to cover information posted on social media. Complaining about the government(s) is all well and good, but we have to be part of the process. And there will be a process.
