Photo: Alfons Morales/Unsplash
Data tracking has long been a lucrative business model for many corporations. It is less well known that it also takes place in science. Here too, dangers are lurking for data protection and the freedom of science and research. Libraries also have a role to play as stakeholders in the scientific ecosystem, particularly when they enter into contracts with profit-oriented companies such as publishing houses, in which researchers’ data can also function as a bargaining chip.
Julia Reda from the Society for Civil Rights (GFF) has long been dedicated to the assertion of fundamental rights in the area of conflict between copyright and data protection. In this interview, she explains the role that libraries and digital infrastructures play in this complex topic, and why it is so important for these institutions to build their own infrastructure and focus on green Open Access instead of financially supporting publishing houses as they build up a parallel, commercial infrastructure.
During the recent online conference #vBib21, you gave a presentation on “Tracking Science: Consequences for Data Protection and Scientific Freedom” (German). Why is this topic also relevant to libraries?
Libraries do far more than just make literature available. Ideally, they provide a comprehensive knowledge structure in which people can learn and do research. The exploding costs of licences for specialist scientific articles are not only making it more difficult for libraries to fulfil this task. Scientific publishers also use the enormous profits they obtain in this way, at the expense of the public purse, to buy up more and more software companies responsible for organising the science industry, from logging measurement results in the laboratory to assessing the quality of research.
In this way the science corporations create a commercial parallel structure to the services that libraries should actually provide but are often unable to, due to a lack of financial resources. Once public research takes place on commercial platforms, it is easy for these companies to collect highly sensitive data about the researchers. This represents a danger to their privacy and to the independence of science. Libraries must take a stand against this trend, because it is their very own duties that are being privatised here.
How is scientific freedom threatened when major publishing houses track the surfing and search habits of individual scientists?
Individual researchers could be hindered in carrying out their research. For example, the Chinese government has already induced certain scientific publishers to block access in China to specialist articles on topics that are a thorn in the side of the regime. China has also imposed sanctions on individual scientists and research institutes working in these research fields. If science companies sell personal data to governments, showing who is reading and downloading which specialist articles, further researchers can become targets for sanctions. Particularly dangerous is the resulting “scissors in the head” (self-censorship), which sets in before any actual restriction of scientific freedom occurs. Researchers start to avoid controversial topics because they feel that they are being watched and want to avoid trouble.
A further danger is that the use of data to make decisions will increase existing unfairness in science. It’s already well known that the so-called “impact factor”, which is supposed to provide information about the quality of specialist journals, is completely unsuited to this task. Nevertheless, it continues to be used in decisions about career advancement. Measuring the quality of scientific papers according to the number of times they have been accessed can also give a distorted picture. White male scientists who are native English speakers and who are particularly present in the media have an unfair advantage in such procedures.
The topic is, however, very abstract. Can you give us a concrete example in which tracking had negative consequences for scientific freedom or for an individual scientist?
One well-known example is that of the activist and researcher Aaron Swartz, who was accused of computer sabotage by the US government after he had automatically downloaded thousands of specialist articles from the commercial repository JSTOR in 2010 via his completely legal university access. JSTOR became aware of Swartz’s “suspicious” surfing behaviour and blocked his access. Although JSTOR reached an out-of-court agreement with Swartz, the US public prosecutor’s office pressed charges against him and pursued him as if he were a criminal. He had done nothing worse than borrow too many books from the library. Swartz committed suicide in 2013; it was only after this that the case against him was dropped.
In the meantime the EU copyright reform of 2019 has made it clear that scientific publishers in the EU are not allowed to prevent the mass download of specialist articles for the purposes of text and data mining as long as these activities do not compromise the safety and integrity of the computer systems.
In the scientific ecosystem, libraries often advocate Open Science, for example by operating their own Open Access repositories or by negotiating Open Access-friendly contracts with publishers. What pitfalls are lurking here in relation to tracking and scientific freedom?
It is important that libraries rely on green Open Access, i.e. provide their own infrastructure for the publication of specialist articles. Contracts with science companies not only entail high publication costs for Open Access publications, the so-called Article Processing Charges (APCs), which burden the public purse. They also carry the danger that libraries will give away control of the publication infrastructure. Commercial companies then host the specialist articles and can track the surfing habits of the people who access them. Universities and libraries should preferably avoid these contracts altogether and invest the money in their own infrastructure.
If contracts are signed with external service providers, however, these four aspects are particularly important:
- The contracts must be tendered, so that different companies can compete with their respective quotations.
- The contracts must avoid “lock-in effects” that lead the libraries to become permanently dependent on a specific provider. To ensure this, it is important that the software of the online platform used is Open Source, so that it is possible to change provider without the researcher needing to get used to a completely new platform.
- The specialist articles must also be subject to genuinely free licences that allow unlimited further use on any other platform and for any purpose; this ensures that it is still legally possible to move from one provider to another, or to infrastructure that you operate yourself.
- Finally, libraries must insist that tracking of the surfing habits of individual researchers is contractually forbidden and that the software runs on in-house university servers. Only in this way can the university or the library comply with its public mandate to protect the fundamental rights of the researchers. Just a few days ago, a university was prohibited by the courts from forwarding personal data to the USA via a commercial provider, because this violates EU data protection laws (see Administrative Court Wiesbaden: The cookie tool “Cookiebot” is a breach of the GDPR and is therefore forbidden [German]).
In your presentation you called upon libraries to get involved in the debate and campaign for data protection. What can libraries and digital infrastructure institutions do in practice?
Beyond following the specific suggestions on what to bear in mind when negotiating contracts, it is important that library associations and the managements of higher education institutions publicly declare their solidarity with the researchers whose fundamental rights are threatened by science tracking. This involves turning down pseudo-scientific quantitative metrics for evaluating research quality, of the kind that companies such as RELX are increasingly offering. It is also a good idea to disseminate campaigns such as the petition “Stop Tracking Science” or the statement from the German Psychological Society about science tracking, which addresses not only individual researchers but also scientific institutions with sensible suggestions.
Are there further issues that libraries should get involved in or become more aware of in order to protect scientific freedom?
Libraries are often keen to stand up for the basic rights of researchers and to use legal regulations to improve access to knowledge for the general public. For example, the new EU copyright reform has created new ways for libraries to make out-of-print works freely accessible on the internet. It is important that they make use of these new freedoms as soon as possible, even if they do not yet have any practical experience of them. At the Society for Civil Rights (GFF), I am working towards the legal assertion of fundamental rights in conflicts involving copyright. I would be delighted to hear from libraries that would like to take advantage of these new opportunities and are looking for support.
This text has been translated from German.
This article was first published by ZBW Mediatalk and has been republished here under a Creative Commons license.