Web Analytics Ethics

Two years ago, sitting on the plane after attending Emetrics ’05 Santa Barbara (which I had to leave early), I wrote a letter to organizer Jim Sterne, asking him to raise some web data privacy issues at the first Web Analytics Association (WAA) general meeting. As it turned out, he didn’t get my email until after the meeting, but it resonated with him and he circulated it within the WAA.

Nothing came of that initial email, but Jim didn’t forget it. A year ago, he asked if I’d be interested in a speaking slot at Emetrics ’06 Santa Barbara to talk about web privacy issues, which I gladly accepted. Not only did Jim invite me to speak, he put me on first – presumably in order to help set the tone for the summit. I got a good reception, but again, nothing really came of it.

This year I’ll be at Emetrics ’07 San Francisco, and while I’m not speaking, I still think the issue deserves consideration. In fact, I think it’s more front and center than ever, given developments such as Google’s recent announcement that they’ll be anonymizing their search logs after 18 to 24 months.

Against this backdrop, and in the spirit of keeping this alive, here’s the original email I sent to Jim, verbatim:

June 3, 2005, 05:24 AM

Jim -

Greetings from Boston. Thank you for the wonderful Emetrics conference;
it exceeded my expectations and I hated to leave early.

I'm unable to attend the WAA meeting this morning, but I did want to
have you possibly bring up for discussion the role that WAA wants to
play with respect to privacy of data collected/used by the WAA members,
and, in a larger context, some of the ethics around using and
protecting access to the data.

My current mental state (it's 4am California time, and I haven't had
any sleep) prevents me from presenting a coherent case, but here are
some thoughts:

With the recent news on personal privacy leaks, and even Citibank
running ads highlighting identity theft, I suspect it is only a matter
of time before the government decides it's time to step in and
legislate on the issue.  If that happens, I'm convinced that the strong
arm of the legislature will come down with a set of guidelines and
regulations that will rival Sarbanes-Oxley. Just as SOX has spawned an
entire compliance industry (and fattened the wallets of lawyers,
accountants and auditors) and caused a massive re-engineering effort, I
think a parallel will emerge around data access and security - where
procedures need to be meticulously documented, controls need to be put
in place for every piece of data, and systems will need to be built to
audit compliance.

Web analytics as an industry has largely ignored issues of data access,
modification, sharing and integration, having (rightly) focused on
getting the most use of the data.

But there are practical questions to ask. Some examples:

 - if you are surfing books at Amazon and not logged in, and later in
the same visit, you log in and look at kitchen appliances, should
Amazon add to your interest profile the books you searched while logged
out? I think most consumers would say no, they are unrelated.

 - what if you were at Amazon putting books in your shopping cart, and
then went to check out and said "yes, I have an account"? I think most
consumers would say yes, the convenience is worthwhile.

 - if you are searching Yahoo Personals and not logged in, and later
log in to read your Mail, should Yahoo add to your interest profile the
personal ads you looked at while logged out?

I think most users expect that logged out behavior is treated
differently at Yahoo (in fact, Yahoo's privacy policy mandates it), but
where is the line?

 - should consumers have access to the information collected about
them?  Can they opt-out of such collection, or change the data? How
would one control access (and make sure we were showing information
only to the correct people)?

 - should data collection policies (e.g. downloadable toolbars, "web
accelerator" proxies, etc) default to "opt-out" for data collection,
and have consumers explicitly opt-in before data can be collected?

 - should there be an acceptable use policy for cookies?  e.g.
duration, standard naming convention describing use, when cookies
should not be used, when cookie data should be encrypted, etc?

 - how do these policies impact targeting, computation of unique users,
visit lengths, user value, etc?

 - how long should we keep data about users?

These issues impact all of the WAA: advocacy, technology, education,
standards, research, etc.  These kinds of questions guide what the WAA
does, and should "baked in" to the DNA of the organization. Thus I
think it's appropriate to have a discussion about it.

I've spoken with several people about this issue, and the immediate
reaction is that this is a job for lobbyists. I don't agree. The WAA
advocacy team will no doubt do a fine job lobbying lawmakers on best
practices, once the Association formulates its stance.

However, any data privacy laws that governments may pass will only be
the lowest bar. While as analysts and marketers, we'd like to see the
bar be up to us to set, I don't think that will last long-term. I think
we should assume that a bar *will* be set. However I don't think that's
what we should shoot for.  Consider - as practitioners, we want to
practice "safe data" and stay above the bar. One way to do that is to
layer policies on top of the laws.  Another is to layer values on top of
the policies.

I suggest that the WAA take up the discussion of what values we stand
for.  Should web analytics practitioners, especially ones that have the
good sense to join the WAA, take an oath similar to the hippocratic
oath that doctors take? Should practitioners be held to an ethical
standard for the privilege of having access to the data?

We are not dealing with life and death issues here, but we are dealing
with issues of trust.  We've seen that one of the reasons we have data
quality issues is that people delete cookies and they delete cookies
because they don't trust web sites to use the cookies responsibly. We
also know that if consumers have more trust, they will use the web
more, and transact more, so it's in our best interests to increase the
trust that consumers feel.

While a larger "data access oath" may be out of scope for the WAA -
indeed, I can see an argument that an umbrella data ethics group emerge
- I don't want to try to boil the ocean. But is it worth having a
discussion about what values the WAA holds, and in turn, expects from
its members?

I look forward to the thoughts that will come out of the meeting.

Bob

PS timely:
http://news.yahoo.com/news?tmpl=story&cid=582&e=1&u=/nm/20050603/wr_nm/tech_privacy_dc

I boiled the essence of this letter down into a PowerPoint presentation that I used at Emetrics last year. The presentation deliberately has no fancy design, so that the message stays front and center. You have my permission to do what you want with it. During the Q&A after the talk, I said I could imagine a cataclysmic event that would set in motion things like congressional hearings on data privacy. I referred to it as the Chernobyl of Data. Fortunately, it hasn’t happened, and of course I hope it doesn’t. But I continue to be concerned about a head-in-the-sand mentality within the web analytics community, and what it will ultimately mean once the hammer comes down, in whatever form.

I’m interested in your thoughts. Is it time to join together as an industry to tackle this?
