...But first, some good news. Among the 123 members of the Association of Research Libraries, there are four libraries with almost secure search services that don't send clickstream data to Amazon, Google, or any advertising network. Let's now sing the praises of libraries at Southern Illinois University, University of Louisville, University of Maryland, and University of New Mexico for their commendable attention to the privacy of their users. And it's no fault of their own that they're not fully secure. SIU fails to earn a green lock badge because of mixed content issues in the CARLI service; while Louisville, Maryland and New Mexico miss out on green locks because of the weak cipher suite used by OCLC on their Worldcat Local installations. These are relatively minor issues that are likely to get addressed without much drama.
Over the weekend, I decided to try to quantify the extent of privacy leakage in public-facing library services by studying the search services of the 123 ARL libraries. These are the best funded and most prestigious libraries in North America, and we should expect them to positively represent libraries. I went to each library's on-line search facility and did a search for a book whose title might suggest to an advertiser that I might be pregnant. (I'm not!) I checked to see whether the default search linked to by the library's home page (as listed on the ARL website) was delivered over a secure connection (HTTPS). I checked for privacy leakage of referer headers from cover images by using Chrome developer tools (the sources tab). I used Ghostery to see if the library's online search used Google Analytics or not. I also noted whether advertising network "web beacons" were placed by the search session.
code of ethics, I'm surprised this is not more controversial. ALA even sponsors workshops on "Getting Started with Google Analytics". To paraphrase privacy advocate and educator Dorothea Salo, the code of ethics does not say:
80% of the ARL libraries provide their default discovery tools to users without the benefit of a secure connection. This means that any network provider in the path between the library and the user can read and alter the query, and the results returned to the user. It also means that when a user accesses the library over public wifi, such as in a coffee shop, the user's clicks are available for everyone else in the coffee shop to look at, and potentially to tamper with. (The Digital Library Privacy Pledge is not having the effect we had hoped for, at least not yet.)
28% of ARL libraries enrich their catalog displays with cover images sourced from Amazon.com. Because of privacy leakage in referer headers, this means that a user's searches for library books are available for use by Amazon when Amazon wants to sell that user something. It's not clear that libraries realize this is happening, or whether they just don't realize that their catalog enrichment service uses cover images sourced by Amazon.
13% of ARL libraries help advertisers (other than Google) target their ads by allowing web beacons to be placed on their catalog web pages. Whether the beacons are from Facebook, DoubleClick, AddThis or Sharethis, advertisers track individual users, often in a personally identifiable way. Searches on these library catalogs are available to the ad networks to maximize the value of advertising placed throughout their networks.
Much of the privacy leakage I found in my survey occurs beyond the control of librarians. There are IT departments, vender-provided services, and incumbent bureaucracies involved. Important library services appear to be unavailable in secure versions. But specific, serious privacy leakage problems that I've discussed with product managers and CTOs of library automation vendors have gone unfixed for more than a year. I'm getting tired of it.
The results of my quick survey for each of the 123 ARL libraries are available as a Google Sheet. There are bound to be a few errors, and I'd love to be able to make changes as privacy leaks get plugged and websites become secure, so feel free to leave a comment.