Google Analytics, Privacy, and Legalese

Google Analytics has become an almost ever-present part of the contemporary Internet. Small, medium-sized, and large sites alike track their visitors using Google’s free tools to identify where visitors are coming from, what they’re looking at (and for how long), where they subsequently navigate to, what keywords bring people to websites, and whether internal metrics are in line with advertising campaign goals. As of 2010, roughly 52% of all websites used Google’s analytics system, and it accounted for 81.4% of the traffic analysis tools market. As of this writing, Google’s system is used by roughly 58% of the top 10,000 websites, 57% of the top 100,000 websites, and 41.5% of the top million sites. In short, Google is providing analytics services to a considerable number of the world’s most commonly frequented websites.

In this short post I want to discuss the terms of using Google Analytics. Based on conversations I’ve had over the past several months, it seems that many small and medium-sized business owners are unaware of the conditions that Google places on using its tool. Further, independent bloggers are using analytics engines – either intentionally or by the default of their website host/creator – and are ignorant of what they must do to legitimately use them. After outlining the brief bits of legalese that are required by Google – and suggesting what Google should do to ensure terms of service compliance – I’ll suggest a business model/addition that could simultaneously assist in privacy compliance while netting an enterprising company/individual a few extra dollars in revenue.

Google Analytics, Privacy, and Opt-Outs

Google describes its tool as an “enterprise-class web analytics solution that gives you rich insights into your website traffic and marketing effectiveness….With Google Analytics, you’re more prepared to write better-targeted ads, strengthen your marketing initiatives and create higher converting websites.” While Google’s tools do provide considerable insight into websites’ visitors, the insight may come at the cost of visitors’ privacy and be accompanied by legal liability for organizations using the tools. Data protection experts in Germany warn that Google’s ‘insights’ violate German data protection laws, with recent analyses exploring whether German site owners should risk using Google’s system. As of the beginning of 2011 it remains unclear whether using Google Analytics could put German site owners, and owners of sites that Germans visit, at risk of sanction by authorities.

For consumers who are concerned about their browsing information being released to Google, the company released an opt-out system in 2010 that relies on users installing an add-on to their web browser. Unfortunately, Google’s claims about the add-on do not match how it actually behaves. Specifically, the company states that the “add-on communicates with the Google Analytics JavaScript (ga.js) to indicate that information about the website visit should not be sent to Google Analytics” but independent tests have revealed that this statement is inaccurate. For websites still using urchin.js (the predecessor to ga.js) the opt-out does not limit any of the information that would otherwise be sent. With urchin.js the following is still collected:

  • screen resolution
  • screen depth
  • language
  • Google Analytics Account
  • page title
  • domain
  • original referrer
  • referrer
  • user-agent
  • IP address
  • IP address derived information (including ISPs, approximate location, country, and the potential to tie in the IP information with internal databases)

In the case of ga.js, the following is still captured:

  • domain
  • referrer

Further, the following is captured via googleadservices.com:

  • screen resolution
  • bit depth
  • time zone
  • whether Java is supported

While the opt-out does limit the amount of information that is provided through ga.js, Google’s web crawlers can theoretically be used to match “the url in both the referer and/or the googleadservices url= variable.” Doing so would let the company combine the information gathered from its ad services and analytics system, even if a user had opted out of the analytics. In summary, the opt-out mechanisms that Google provides are somewhat disingenuous given that few users are likely to know the full range and magnitude of the systems that Google has deployed to collect information about web browsers’ actions. A more honest opt-out mechanism would opt users out of every Google product that captures browser traffic information.

Required Legalese

Of course, many of the people running Google Analytics will have few worries about these broader privacy issues or the problems associated with the opt-out mechanisms that the company provides. Instead, site owners use the tools to derive information about their visitors. Unfortunately, many of these owners are using Google’s systems in contravention of Google’s Terms of Service (ToS).

Users are required to establish a privacy policy in the process of setting up Google Analytics. This policy must include the following text (from Section 8.1):

This website uses Google Analytics, a web analytics service provided by Google, Inc. (“Google”). Google Analytics uses “cookies”, which are text files placed on your computer, to help the website analyze how users use the site. The information generated by the cookie about your use of the website (including your IP address) will be transmitted to and stored by Google on servers in the United States. Google will use this information for the purpose of evaluating your use of the website, compiling reports on website activity for website operators and providing other services relating to website activity and internet usage. Google may also transfer this information to third parties where required to do so by law, or where such third parties process the information on Google’s behalf. Google will not associate your IP address with any other data held by Google. You may refuse the use of cookies by selecting the appropriate settings on your browser, however please note that if you do this you may not be able to use the full functionality of this website. By using this website, you consent to the processing of data about you by Google in the manner and for the purposes set out above.

Moreover, the website’s privacy policy must be placed in a “prominent position” and the owner must use “reasonable endeavours” to bring the policy (and Google’s required text) to the attention of the website’s users. If you install Ghostery in Firefox and browse your favorite websites, you are likely to find that many small- and medium-sized businesses and bloggers use Google Analytics but do not display the above-mentioned legalese. As a result they are using Google’s product without complying with the terms of service.

Google could, and ought to, bundle a compliance mechanism with its Analytics product. It would be trivial for the company to create a spider that evaluated whether websites using the Analytics engine also contained privacy policies and the required legalese. In cases where websites used the Google product but appeared to be in contravention of the ToS, Google could direct an email to the website to remind the administrator of their duties under the Terms of Service. If compliance were not forthcoming (demonstrated by the continuing absence of the policy and legalese, discovered using Google’s crawlers) then the site owner would cease receiving information from Google Analytics. Indeed, privacy commissioners should demand that the company integrate such basic compliance tools into the product that it is offering. They should put some onus on Google to guarantee that its services are designed to comply both with the company’s own notice requirements and the notice and consent laws in privacy commissioners’ jurisdictions.
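To give a sense of how small such a check could be, here is a rough Python sketch. It is not Google’s tooling or anything the company has published: the function names, the status labels, and the matching rules are all assumptions made for illustration. It works on HTML that has already been fetched, looks for the ga.js or urchin.js script reference, and then looks for the opening sentence of the required text in the privacy policy page.

```python
import re
from html import unescape

# Heuristic (an assumption): a page "uses Analytics" if it references
# ga.js or urchin.js on the google-analytics.com domain.
ANALYTICS_SCRIPT = re.compile(
    r"google-analytics\.com/(?:ga|urchin)\.js", re.IGNORECASE
)

# Opening sentence of the notice quoted from Section 8.1, lowercased.
REQUIRED_PHRASE = (
    "this website uses google analytics, "
    "a web analytics service provided by google"
)


def uses_analytics(html: str) -> bool:
    """Return True if the page source references an Analytics script."""
    return ANALYTICS_SCRIPT.search(html) is not None


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles, and tags, then normalize whitespace."""
    without_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    without_tags = re.sub(r"(?s)<[^>]+>", " ", without_scripts)
    return re.sub(r"\s+", " ", unescape(without_tags)).strip().lower()


def has_required_notice(policy_html: str) -> bool:
    """Return True if the policy text contains the start of the notice."""
    return REQUIRED_PHRASE in visible_text(policy_html)


def compliance_status(page_html: str, policy_html: str) -> str:
    """Classify a site as 'not-applicable', 'compliant', or 'missing-notice'."""
    if not uses_analytics(page_html):
        return "not-applicable"
    if has_required_notice(policy_html):
        return "compliant"
    return "missing-notice"
```

A real crawler would also need to locate the privacy policy page, judge whether it is in a “prominent position”, and cope with markup that splits the sentence across inline tags, none of which this sketch attempts.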

Google Analytics and Business

So, how can an enterprising business cash in on contraventions of Google’s Terms of Service? Creating a spider that checks whether websites are running the company’s Analytics product and have the required legalese should be a relatively simple task, and could be supplemented by an automated email to the site owner. That email might explain that the crawled website was violating Google’s terms of service and that, for a relatively low fee, the enterprising business could prepare the text that Google requires. This text, of course, would be a simple copy/paste of what Google already offers for free. Should a company integrate this kind of search tool with already existing products – perhaps a privacy compliance service – then clients would receive an even better ‘bang for their buck’ with minimal extra effort being put forth by the consulting firm.
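The automated email could be drafted with nothing more than Python’s standard library. The sketch below is only an illustration: the function name, the addresses, and the wording of the message are hypothetical, and it builds the message without sending it.

```python
from email.message import EmailMessage


def draft_reminder(site_url: str, admin_address: str, sender: str) -> EmailMessage:
    """Build (but do not send) a notice for a site missing the required text."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = admin_address
    msg["Subject"] = f"Google Analytics notice missing on {site_url}"
    # The body wording is a placeholder, not text supplied by Google.
    msg.set_content(
        f"Hello,\n\n"
        f"An automated check of {site_url} found Google Analytics in use, "
        f"but did not find the privacy notice that Google's Terms of "
        f"Service (Section 8.1) require.\n\n"
        f"Please add the required text to a prominently placed privacy policy.\n"
    )
    return msg


# Example with made-up addresses.
reminder = draft_reminder(
    "http://example.org", "admin@example.org", "checker@example.com"
)
```

Sending the message, for instance through smtplib, is deliberately left out, as is any decision about how often a site should be contacted.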

While privacy policies are certainly not the best way to notify anyone of anything, they are a (very minimal!) baseline that has global traction. Though it is outside the scope of this post, the best approach would be a graduated privacy notice system that included first a set of principles (perhaps adhering to a privacy commons notification model), second a somewhat detailed description of what the website/business did to collect, use, and disseminate personal information, and third the present legalese contained in most privacy policies.

At the very least, website owners must comply with Google’s terms of service if they are using the Analytics product. Thus, any website running Analytics must possess a privacy policy and prominently display it. In the absence of such policies, Netizens should complain to Google and demand that non-compliant sites have their access to the Analytics engine revoked. Should such revocations happen to enough high-profile bloggers and businesses, I imagine that there would be a rapid ‘education’ on the legalese, and Terms of Service more generally, that are associated with Google Analytics.

Christopher Parsons

I’m a Postdoctoral Fellow at the Citizen Lab in the Munk School of Global Affairs at the University of Toronto and a Principal at Block G Privacy and Security Consulting. My research interests focus on how privacy (particularly informational privacy, expressive privacy and accessibility privacy) is affected by digitally mediated surveillance and the normative implications that such surveillance has in (and on) contemporary Western political systems. I’m currently attending to a particular set of technologies that facilitate digitally mediated surveillance, including Deep Packet Inspection (DPI), behavioral advertising, and mobile device security. I try to think through how these technologies influence citizens in their decisions to openly express themselves or to engage in self-censoring behavior on a regular basis.