The silent snoop on your PC

We have become digital citizens and playing into this space is inevitable, says the writer. File photo

Published Jul 12, 2011

London - You are not paranoid. You are being followed. Go have a look. Somewhere in your browser - under “privacy” in “preferences” in Firefox, for example - you can find a list of all the cookies that have been installed on your computer from the websites you once visited, cookies that in all likelihood are pinging themselves back to giant corporate data collectors that are building a profile of where you go, what you do and what you seem to like online.

Here are a few of the companies that have put cookies on my computer: aCerno, AdBrite, Adconion, Adify Media, Adgear, Adinterax, AdMeld, Aggregate Knowledge, Akamai, AudienceScience... and that's not even all the As.

I don't know what information these cookies contain, when they were installed and when they might be reactivated or who the data is going to end up with. I do know one thing they all have in common, though: I have never heard of any of these companies or visited their websites.

I don't doubt the cookies on your computer all come from law-abiding and reputable firms, with wonderful (and wonderfully long and heavily legalled) privacy policies. But wouldn't you like to know just a bit more about why they are there and what they are doing? And while we are at it, what the hell is a cookie, anyway?

This is the story of how a few humble lines of computer code changed the internet and changed the world. It is a story of possibilities unbounded, but it has a dark side, too. For it is also the tale of corporate forces benefiting from our ignorance and posing unprecedented threats to our privacy. It is about what a new generation of technologists are trying to do to protect us - and about what our governments might have to do if they fail. Above all, it is about what we have to learn to protect ourselves.

In the beginning, there were just words. Maybe a few pictures, but mainly just words. Surfing the internet was like hopping from page to page inside a book. The largest-ever book, sure, but all just a bit - well, static.

None of the things we now take for granted - all that rich interactivity, the back and forth of engaging with a website, having sites automatically log you in and remember your preferences, being able to buy things with a stored credit card or even get customised, local news and weather - was possible. It was only 17 years ago, but that is prehistory in technology terms. Or if not prehistoric, at least BC. Before the Cookie. The problem was that websites couldn't remember, from page to page, who you were or what you had told them last about where you were and what you wanted to see.

“Websites were more like a Word document, without anything moving,” says Lou Montulli, a software engineer who in 1994, just out of college, helped found Netscape, the first really popular browser and a company that ironed out many of those early questions about how websites and personal computers should talk to each other. “Thanks to Netscape the web became an interactive medium and cookies were a strong part of that. Cookies allowed websites to have some memory and that created the modern web as we know it.”

Montulli says he ranks the cookie quite far down the list of all his inventions at Netscape. In those heady days, when the full potential of the web was being suddenly grasped, the atmosphere was “euphoric”, he says. “We were a very serious engineering company. Maybe we weren't the best business people at times, but we did put forward a lot of the technology that we all take for granted today.”

Still, when he calls The Independent to reminisce, Mr Montulli introduces himself as “the cookie guy” and it is his best-known invention. A cookie is just a few lines of text sent from the website to your computer with a unique identification code and an instruction to send the cookie back the next time you hit one of the pages on the site. It is the equivalent of saying, “Hello, it's me”. It was Montulli's solution for Netscape business partners that wanted to build an e-commerce site. Without cookies, these proto-Amazons couldn't remember, from one page to the next, what users had put in their shopping cart.
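The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not Netscape's implementation: it uses the standard library's `http.cookies` module, and the cookie name `visitor_id`, the `carts` store and the helper functions are all hypothetical names chosen for the example.

```python
from http.cookies import SimpleCookie
from typing import Dict, List, Optional

# Hypothetical server-side memory: shopping carts keyed by visitor ID.
carts: Dict[str, List[str]] = {}


def make_set_cookie(visitor_id: str) -> str:
    """Build the Set-Cookie header value a site sends on a first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    cookie["visitor_id"]["path"] = "/"  # return it for every page on the site
    return cookie["visitor_id"].OutputString()


def browser_reply(set_cookie_value: str) -> str:
    """Mimic the browser: keep the cookie, then build the Cookie header
    that accompanies the next request to the same site."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


def read_visitor_id(cookie_header: str) -> Optional[str]:
    """Server side: recover the ID from the returned cookie, if there is one."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("visitor_id")
    return morsel.value if morsel is not None else None


def add_to_cart(cookie_header: str, item: str) -> List[str]:
    """Attach an item to whichever cart belongs to the returning visitor."""
    visitor_id = read_visitor_id(cookie_header)
    if visitor_id is None:
        raise ValueError("no cookie, so the site cannot tell who is asking")
    carts.setdefault(visitor_id, []).append(item)
    return carts[visitor_id]
```

Calling `browser_reply(make_set_cookie("abc123"))` and passing the result to `add_to_cart` twice leaves both items in the same cart. The cookie itself carries only the ID; the cart's contents stay on the site's side.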

Even at the moment of its invention it was clear that the cookie would be ubiquitous and controversial. Montulli knew its applications would go far beyond e-commerce (it was designed, he said, “to be a tool, like a Swiss Army knife”). There were always going to be privacy concerns, too, with anything that allowed tracking of users on the internet, even if it might initially have seemed to be only about tracking you from one page of a website to another.

Alternative ideas in 1994 for solving the problem of memory included creating a serial number for every Netscape browser, which would have allowed tracking of an individual wherever they went on the web and whatever they did. “Cookies were an attempt to very clearly define a smaller pool of information for sharing,” Montulli says. “I would very strongly argue that cookies did an incredible amount to protect us, versus other technologies.”

Cookie use exploded, and not just by websites that wanted to keep track of their users within the site, but also by advertisers. Unlike an ad in a newspaper, which gets sent in by the advertiser and placed on the page by the newspaper's staff, the ads that appear on a website are put there directly by a company called an advertising network. Watch next time you are on a slow internet connection; as you wait for the various pieces of the page to load, you will see “transferring data from...” and “connecting to...” on the bottom of the browser and you can see where they are all coming from. DoubleClick, now owned by Google, was an early pioneer, but scores and scores quickly sprang up. These networks, too, started sending cookies with the ads, partly as a technical solution (to make sure the ad is served as promised and to track whether a consumer acts on it, something that can be important for payments to the website), but also as a means of storing data as a user hopped from site to site within the network.

With that information, the networks were better able to target ads, adding things like an internet user's location or information that they had given the host site about their age or gender or interests, to the new data they had gained about their browsing habits.
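How one network's cookie can follow a user from site to site can be shown with a toy simulation. It is a sketch under stated assumptions, not a description of any real ad network: the `AdNetwork` and `Browser` classes, the `visitor-N` ID format and the site names are all invented for illustration.

```python
import itertools
from collections import defaultdict
from typing import Dict, List, Optional


class AdNetwork:
    """Hypothetical ad network whose ads are embedded on many host sites."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        # Browsing profile per cookie ID: the host sites where an ad was served.
        self.profiles: Dict[str, List[str]] = defaultdict(list)

    def serve_ad(self, cookie_id: Optional[str], host_site: str) -> str:
        """Serve an ad on host_site; issue a cookie ID if the browser has none."""
        if cookie_id is None:
            cookie_id = f"visitor-{next(self._next_id)}"
        self.profiles[cookie_id].append(host_site)
        return cookie_id


class Browser:
    """Hypothetical browser that keeps one cookie for the ad network."""

    def __init__(self) -> None:
        self.network_cookie: Optional[str] = None

    def visit(self, host_site: str, network: AdNetwork) -> None:
        # Loading the page also loads the ad, so the browser hands the
        # network its cookie along with the address of the page visited.
        self.network_cookie = network.serve_ad(self.network_cookie, host_site)
```

If one `Browser` visits three unrelated sites that all carry the same network's ads, `network.profiles` ends up holding all three under a single ID, even though the user never visited the network's own website.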

This difference between websites, whose cookies a user might implicitly accept, and ad networks, which are largely invisible and unknown to the user, created instant controversy - but an attempt to ban third-party cookies, or at least to disable them unless a user changed the default settings in the browser, was derailed by an alliance of browser makers (by then including Microsoft) and the ad firms themselves.

Dave Kristol, who led an effort to create technical specifications for cookies and the browsers that sent and received them, recalls how he found himself embroiled in a public-policy debate with powerful factions on all sides. “The process of developing standards and letting all voices be heard is also messy, like most democratic processes,” he wrote in 2001, after more than five years of work. “It gives me a new appreciation for how hard it must be to write legislation, even ignoring the distorting influences of lobbyists.”

The cookie controversies of the late Nineties and early Noughties included thunderous hearings in the United States Congress and threats of legislation. The headlines did bring cookies to the attention of some privacy-conscious individuals, for whom toughening up their privacy settings or routinely deleting their cookies has become a housecleaning habit. And it was a shot across the bows of the web publishers and the ad networks, who vowed self-regulation. Websites began publishing “privacy policies” that set out what they would and would not do with cookie-collected data, while ad networks were careful to insist that all their profiles on individual users would be anonymous and impossible to trace back to a named person. As self-regulation goes, this is what you might call light touch. Most browsers continued to allow third-party cookies by default, as long as the cookies came from “certified” companies only. Meanwhile, the ad industry and a new generation of digital market-research firms got on with the business of assembling massive amounts of data on web users.

Developments on the web these past few years have provided these firms with riches they could not have dreamed of a decade ago. We spend more time, and share more, online, providing clues for the harvesters of data. Websites contain content, such as videos, games, social-networking features and multiple ads, served up from a variety of different sources, all usually with cookies or other tracking devices embedded. In breadth and depth, the scale of the data-collection industry is breathtaking. A 2009 report by the University of California, Berkeley found an average of 12 trackers on each of the 100 most popular websites, with one having 100 different trackers in a month. When a user visits that website, potentially 100 different entities - nearly all unseen by the user - will learn about their visit. One advertising company was able to monitor activity on 91 of the top 100 sites, and 88 per cent of the 350,000 sites tracked by Berkeley.

Last month, the British advertising giant WPP boasted that it had just created the world's largest database of individuals' online behaviour: profiles of 500 million people, covering, it says, almost 100 per cent of the people online in the countries in which it operates, including the UK, the US, Australia and eight others. The firm said it was pooling data from many of the world's major websites and networks of online advertisers and adding it to information purchased from traditional market-research firms that assemble real-world data such as what people are buying in high street stores. That real-world information has been “anonymised”, WPP says, before it is added to its database of cookies.

Ashkan Soltani, the privacy researcher who co-authored the Berkeley study, told a Congressional hearing earlier this year that online tracking has become so sophisticated that advertising networks can serve up ads for depression medicine to people who have searched for “depression” on a health site, and that recent computer-science research suggests it is often quite easy to identify the people whose profiles include so much supposedly anonymised user information. Worse, ad networks always seem to be one step ahead of even the most privacy-minded internet users. Many now use “Flash cookies”, served by sites that use Adobe's Flash plug-in, which can restore other cookies after a user has deleted them.

Here's the upside. Websites are expected to reap $86bn (£54bn) in revenues from online advertising this year, as marketers have switched their budgets to the net, lured by the opportunity to target ads more closely to people based on their browsing history. This is money that helps pay for the vast amount of content on the web that is free to users, and without which the quality - if not the quantity - of that content would crash. This is why a new generation of Dave Kristols is trying to square the continued use of third-party cookies with greater control for the user who is concerned about the use to which these cookies are being put.

It is also why web publishers, advertisers and ad networks are working to come up with a new self-regulatory regime, as Congressional hearings and European Union directives threaten them again. And it is why the makers of web browsers - Firefox, Microsoft and Google among them - have started launching new features such as incognito browsing and do-not-track buttons. The trouble now is getting all the firms involved to figure out the technology so that users' preferences can be translated into restrictions on cookie uses that don't curtail all those rich interactive features that cookies can allow.

In the UK, the implementation of an EU directive for internet users to give their consent before cookies are used has been delayed by a year to next May, to provide time to find a workable technology. “I find it most interesting that browser vendors are competing at providing privacy controls now, given how contentious the issue was in 1996,” Kristol says. “One difference is that I think the various browser vendors - and there are more now - are working together in good faith. I didn't feel that was so in 1996.”

The ad industry, too, is trying to coalesce around user-friendly solutions and has set up two websites - Network Advertising Initiative and Digital Advertising Alliance - where users can go to discover whose cookies might be on their computer. The industry has a few tricks up its sleeve to protect itself, though. Some of the opt-out options it purports to be offering allow the users to opt out only of receiving targeted ads - not to have their data deleted from these giant databases.

And the technology used to ensure that these ad networks remember you have opted out is none other than the cookie. People who habitually delete their cookies for privacy's sake may find that by doing so, they are opting to have their privacy invaded again.
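The catch can be made concrete with a tiny sketch. It assumes, purely for illustration, that the opt-out is recorded as a cookie named `opt_out` with the value `"1"` and that a browser's cookie store behaves like a dictionary; real opt-out schemes differ in their details.

```python
from typing import Dict

OPT_OUT_COOKIE = "opt_out"  # hypothetical cookie name


def opt_out(jar: Dict[str, str]) -> None:
    """Record the user's opt-out the only way the network can: as a cookie."""
    jar[OPT_OUT_COOKIE] = "1"


def delete_all_cookies(jar: Dict[str, str]) -> None:
    """The privacy-minded habit: wipe every cookie, the opt-out included."""
    jar.clear()


def network_may_target(jar: Dict[str, str]) -> bool:
    """The network treats a browser as opted out only while the cookie exists."""
    return jar.get(OPT_OUT_COOKIE) != "1"
```

After `opt_out(jar)` the check returns `False`; after `delete_all_cookies(jar)` it returns `True` again, because nothing is left to tell the network that this browser ever opted out.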

The cookie guy looks at all these new options, whether on these industry websites or in the new generation of privacy-conscious browsers, and expresses some pride. “It is difficult for the layman - it is difficult even for me - to know how cookies are being used in all circumstances, and it is difficult to control their use,” Montulli says. “But you have a lot of power over cookies today, if you choose to use it. And that is because of the base design for the cookie.” - The Independent
