Web Analytics Ethics

Two years ago, sitting on the airplane after attending Emetrics ’05 Santa Barbara (and having to leave early), I penned a letter to organizer Jim Sterne, asking him if he’d bring up some issues around web data privacy at the first Web Analytics Association general meeting. Turns out he didn’t get my email until after the meeting, but it resonated with him and he circulated it within the WAA.

Nothing came of that initial email, but Jim didn’t forget it. A year ago, he asked if I’d be interested in a speaking slot at Emetrics ’06 Santa Barbara to talk about web privacy issues, which I gladly accepted. Not only did Jim invite me to speak, he put me on first – presumably in order to help set the tone for the summit. I got a good reception, but again, nothing really came of it.

This year I’ll be at Emetrics ’07 San Francisco, and while I’m not speaking, I still think the issue deserves consideration. In fact, I think it’s more front and center than ever, with news such as Google’s recent announcement that they’ll be anonymizing their search logs after 18-24 months.

Against this backdrop, and in the spirit of keeping this alive, here’s the original email I sent to Jim, verbatim:

June 3, 2005, 05:24 AM

Jim -

Greetings from Boston. Thank you for the wonderful Emetrics conference;
it exceeded my expectations and I hated to leave early.

I'm unable to attend the WAA meeting this morning, but I did want to
have you possibly bring up for discussion the role that WAA wants to
play with respect to privacy of data collected/used by the WAA members,
and, in a larger context, some of the ethics around using and
protecting access to the data.

My current mental state (it's 4am California time, and I haven't had
any sleep) prevents me from presenting a coherent case, but here are
some thoughts:

With the recent news on personal privacy leaks, and even Citibank
running ads highlighting identity theft, I suspect it is only a matter
of time before the government decides it's time to step in and
legislate on the issue.  If that happens, I'm convinced that the strong
arm of the legislature will come down with a set of guidelines and
regulations that will rival Sarbanes-Oxley. Just as SOX has spawned an
entire compliance industry (and fattened the wallets of lawyers,
accountants and auditors) and caused a massive re-engineering effort, I
think a parallel will emerge around data access and security - where
procedures need to be meticulously documented, controls need to be put
in place for every piece of data, and systems will need to be built to
audit compliance.

Web analytics as an industry has largely ignored issues of data access,
modification, sharing and integration, having (rightly) focused on
getting the most use of the data.

But there are practical questions to ask. Some examples:

 - if you are surfing books at Amazon and not logged in, and later in
the same visit, you log in and look at kitchen appliances, should
Amazon add to your interest profile the books you searched while logged
out? I think most consumers would say no, they are unrelated.

 - what if you were at Amazon putting books in your shopping cart, and
then went to check out and said "yes, I have an account"? I think most
consumers would say yes, the convenience is worthwhile.

 - if you are searching Yahoo Personals and not logged in, and later
log in to read your Mail, should Yahoo add to your interest profile the
personal ads you looked at while logged out?

I think most users expect that logged out behavior is treated
differently at Yahoo (in fact, Yahoo's privacy policy mandates it), but
where is the line?

 - should consumers have access to the information collected about
them?  Can they opt-out of such collection, or change the data? How
would one control access (and make sure we were showing information
only to the correct people)?

 - should data collection policies (e.g. downloadable toolbars, "web
accelerator" proxies, etc) default to "opt-out" for data collection,
and have consumers explicitly opt-in before data can be collected?

 - should there be an acceptable use policy for cookies?  e.g.
duration, standard naming convention describing use, when cookies
should not be used, when cookie data should be encrypted, etc?

 - how do these policies impact targeting, computation of unique users,
visit lengths, user value, etc?

 - how long should we keep data about users?

These issues impact all of the WAA: advocacy, technology, education,
standards, research, etc.  These kinds of questions guide what the WAA
does, and should be "baked in" to the DNA of the organization. Thus I
think it's appropriate to have a discussion about it.

I've spoken with several people about this issue, and the immediate
reaction is that this is a job for lobbyists. I don't agree. The WAA
advocacy team will no doubt do a fine job lobbying lawmakers on best
practices, once the Association formulates its stance.

However, any data privacy laws that governments may pass will only be
the lowest
bar. While as analysts and marketers, we'd like to see the bar be up to
us to set, I don't think that will last long-term. I think we should
assume that a bar *will* be set. However I don't think that's what we
should shoot for.  Consider - as practitioners, we want to practice
"safe data" and stay above the bar. One way to do that is to layer
policies on top of the laws.  Another is to layer values on top of the

I suggest that the WAA take up the discussion of what values we stand
for.  Should web analytics practitioners, especially ones that have the
good sense to join the WAA, take an oath similar to the Hippocratic
oath that doctors take? Should practitioners be held to an ethical
standard for the privilege of having access to the data?

We are not dealing with life and death issues here, but we are dealing
with issues of trust.  We've seen that one of the reasons we have data
quality issues is that people delete cookies and they delete cookies
because they don't trust web sites to use the cookies responsibly. We
also know that if consumers have more trust, they will use the web
more, and transact more, so it's in our best interests to increase the
trust that consumers feel.

While a larger "data access oath" may be out of scope for the WAA -
indeed, I can see an argument that an umbrella data ethics group emerge
- I don't want to try to boil the ocean. But is it worth having a
discussion about what values the WAA holds, and in turn, expects from
its members?

I look forward to the thoughts that will come out of the meeting.


PS timely:

I boiled the essence of this letter into a PowerPoint presentation that I used at Emetrics last year. The presentation is purposefully without any fancy design in order that the message be front and center. You have my permission to do what you want with it. During the Q & A after the talk, I said I could imagine a cataclysmic event that would set into motion things like congressional hearings on data privacy. I referred to it as the Chernobyl of Data. Fortunately, it hasn’t happened, and of course I hope it doesn’t. But I continue to be concerned about a head-in-the-sand mentality within the web analytics community, and what it will ultimately mean once the hammer comes down – in any form.

I’m interested in your thoughts. Is it time to join together as an industry to tackle this?


Cookies Misleading; News at 11



I’ll get excited about these “people delete cookies” stories when somebody comes up with a better method to track ANONYMOUS visitors. Heck, I’ll even get excited if WA vendors come up with “cookie deletion metrics calculators” that automatically measure cookie deletion and compensate for it in the reported numbers. (Don’t get me started on panels.)

True, from an advertising perspective, sure you can’t accurately determine reach and frequency. Unlike the precision you get offline … oh wait.


Mainstreaming Web Analytics

Once Upon A Time, I left the web analytics field for a brief respite. While I was away, a new competitor emerged, and everyone was talking about them, and I had to go figure out what made them so special.

Once Upon A Year Ago (or so), I stopped reading web analytics blogs. Now I return to find all these whippersnappers – people who actually Analyze Web Sites, not just write software that does analysis! Here, according to Technorati, is the number of blog postings in the last year that contain the phrase “web analytics”:

[Chart: Technorati blog postings containing "web analytics"]

Looks like I have some blogrolling to do.



Well, it’s hockey playoff season, so that means it’s time to resurface the blog.

[Image: Zamboni Model 700]

OK, that doesn’t make any sense, but I wanted to say something about hockey, so there you go.

Yes, I really am resurfacing the blog– upgraded the software and put in a fresh coat of paint. I intend to consolidate a few old blogs and assorted posts from the past; there’s a pile of stuff from Ye Olden Days that will eventually make its way here.

While I’m not a fan of revisionist blogging, I’ve cleaned up some of the old posts (broken links) and deleted a few posts that made no sense – e.g. they were too time-based to be of even token value now.

The New and Improved site is being watched by Google, because I’m sending web bugs (beacons) back to Google Analytics. I’m also publishing the feed through Feedburner, which provides its own set of (rather weak) stats.

For you RSS readers, no big changes, except that the whole feed got refreshed with the software changes. Oops.

So what’s the story? Simple. I got crazy busy, and blogging fell below the line. Not just writing – reading did too. Months ago, a colleague mentioned that he’s more interesting when he reads blogs. I’ve started reading again, but if there’s a correlation between amount of reading and interestingness, I’m still not very interesting. But since not being interesting has never stopped me from blogging, I say Game On!


Trumpet and Sugar

She: did you ever see the music man? it’s funny
Me: i think so
She: they are singing a song about the wells fargo wagon coming to town with the packages
She: basically the ups truck
She: the whole town is singing
Me: what a wonderful time that must have been!
She: the entire town is chasing it
She: all excited over a trumpet and a box of brown sugar


Web Analytics is so 2006


I’m hearing it all over. There’s a new day on the horizon, a day when we in the web world recognize that none of this is really an exact science anyway, so why pretend?

Enough with the weighted regressions and Taguchi Methods already. It’s time to take the anal out of analysis. Instead of Web Analytics, I propose Web Casualytics. Or Fuzzylytics maybe. Or Estimytics.

Now excuse me while I go register some domain names.
