It was timely that the Open Data Institute’s annual summit should happen on the same day that a whistleblower exposed Google’s secret transfer of the personal medical records of up to 50 million Americans from one of the US’s largest healthcare providers. Resolving the thorny issues of personal data rights, ownership and bias seems ever more pressing, not least as artificial intelligence (AI) becomes embedded in society and as facial recognition takes off.
The summit was also exactly one month ahead of the UK general election, with an early call during the introductory ‘State of the Nation’ debate from ODI co-founders Sir Tim Berners-Lee and Sir Nigel Shadbolt for a moratorium on political advertising on social media. As Shadbolt pointed out, even MPs themselves describe the laws in this area as unfit for purpose; as Berners-Lee noted, political parties are not allowed to advertise on television.
Here, immediately, was an early example of regulations lagging far behind digital reality. There is a need for much greater transparency in the ad-tech space, said Shadbolt (with a recommendation for the Who Targets Me browser extension or equivalent – https://whotargets.me/en/).
But let’s not get too UK-centric. Facebook introduced a number of policies and tools for monitoring election adverts in response to the Russian interference in the 2016 US presidential election. However, even these limited measures are not being applied in all countries and for all elections (not, for instance, for the Sri Lankan presidential election on 16th November).
There was also, the day after the summit, an FT investigation showing some of the UK’s most popular health websites to be sharing people’s sensitive data — including medical symptoms, diagnoses, drug names and menstrual and fertility information — with companies around the world, ranging from ad-targeting giants such as Google, Amazon, Facebook and Oracle, to lesser-known data brokers and ad-tech firms like Scorecard and OpenX – https://www.ft.com/content/0fbf4d8e-022b-11ea-be59-e49b2a136b8d.
The summit also followed on swiftly from the weekend news about a regulatory investigation of Goldman Sachs’ credit card practices after a prominent software developer called attention to differences in the credit lines of the Apple Card (which is underpinned by Goldman Sachs) for male and female customers.
Apply similar potential issues to, for instance, decisions about who receives universal credit or housing benefit – with those affected having no redress – and the danger is clear, said Carly Kind, director of the Ada Lovelace Institute, which describes itself as an independent research and deliberative body with a mission to ensure data and AI work for people and society. With facial recognition, she said, “there is a sense of inevitability”, with people feeling they have no control. In this sphere, the sharing of data is passive, she observed – people “leak the data”.
There was a similar observation from Michael Veale, digital rights academic at UCL, about lack of trust with regard to decline buttons on websites and apps, with people understandably taking little or no interest because they feel they have no real options. Many of these online and app tracking technologies were effectively made illegal ten years ago, he noted, which means they are operating outside the law. Apps typically have upwards of ten trackers, he said.
Often the website hosts themselves have no idea of the problems, with publishers, for instance, not knowing what’s going on with the advertising on their own sites. Veale pointed to last year’s malware infection that forced the computers of visitors to thousands of websites – including those belonging to NHS services, the UK government’s Information Commissioner’s Office (ICO) and several English councils – to mine cryptocurrency.
Then there’s the huge and complex area of bias, whether on gender, ethnic, religious or other grounds. AI technologist Kriti Sharma cited her experience of being questioned far less about her competence when submitting code to online tech forums if she hid her gender.
What would we think, she asked, about someone who thought things like this:
- A black person is less likely to pay off their loan on time;
- A person called John makes a better programmer than a person called Mary;
- A black man is more likely to be a repeat offender than a white man.
“A pretty sexist, racist person, right?” But these are real AI-based decisions, reflecting the biases the systems have learned from humans. There are dangers too in advertising: consider the AI that ensures gambling addicts are pushed adverts for online casinos, or that job adverts in the US for salaries above $200,000 are more likely to be pushed to men than women.
Fixing this requires awareness of our own biases, diverse tech teams (around twelve per cent of people working in AI and machine learning are women, said Sharma) and diverse datasets.
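The mechanism by which a system inherits human bias is easy to demonstrate. Below is a minimal, hypothetical sketch – the data, the “postcode” proxy attribute and the approval rates are all invented for illustration – showing how a naive model trained on skewed historical loan decisions simply learns the skew back, rather than anything about creditworthiness:

```python
# Illustration only: a naive model trained on biased historical
# decisions reproduces the bias. All data here is synthetic.
import random

random.seed(0)

# Hypothetical historical loan decisions. 'postcode' stands in for a
# proxy attribute correlated with a protected characteristic.
history = []
for _ in range(1000):
    postcode = random.choice(["A", "B"])
    income = random.gauss(50, 10)  # present in the data, but ignored below
    # Biased past decisions: same income distribution, different odds.
    approved = random.random() < (0.8 if postcode == "A" else 0.4)
    history.append((postcode, income, approved))

def train(data):
    """A naive 'model': the approval rate per postcode seen in history."""
    rates = {}
    for pc in ("A", "B"):
        rows = [d for d in data if d[0] == pc]
        rates[pc] = sum(d[2] for d in rows) / len(rows)
    return rates

model = train(history)
print(f"Learned approval rate, postcode A: {model['A']:.2f}")
print(f"Learned approval rate, postcode B: {model['B']:.2f}")
# The model keys entirely off the proxy attribute: it has learned the
# historical skew, not anything about ability to repay.
```

Real systems are far more sophisticated than this frequency count, but the failure mode is the same: if the training data encodes a discriminatory pattern, a model optimised to fit that data will encode it too.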
The theme was taken up with gusto by Caroline Criado Perez, author of “Invisible Women: Exposing Data Bias in a World Designed for Men”. She dissected the deep-seated bias against “atypical” female bodies, leading to women being ignored in areas ranging from medical research, medication and apps to transport research and design (such as the car crash test dummy).
Criado Perez felt the bias was not deliberate – “misogynists are just not that smart” – but ingrained, with the typical reference body being a 70kg Caucasian man. To fix this, proper sex-disaggregated data is needed and diversity is fundamental, she said.
Facial recognition has its own issues, with algorithms having a much harder time recognising people with darker skin. Even top-performing systems show a ten-fold difference in error rates.
So What’s to be Done?
There’s a need to “politicise fairness”, said UCL’s Veale, with “creative interventions” by governments and civil society. In the latter category is the Algorithmic Justice League – https://www.ajlunited.org/ – which seeks to highlight and counter algorithmic bias. Tech firms have so far been successful in lobbying and in offering supposed fixes to bias and unfairness rather than tackling the root causes, said Veale.
Fines don’t work, pointed out Catherine Miller, acting CEO at tech think-tank Doteveryone. When Facebook was hit with a $5 billion fine for violating consumer privacy, “the share price barely shivered”, and despite Cambridge Analytica, Facebook’s number of daily users in Europe has increased. Google, Facebook and Apple are simply too big; there is an imbalance of power between them and government.
There is already a legal framework, said Estelle Dehon, public law barrister at Cornerstone Barristers specialising in environmental and information law, in the shape of GDPR, the Data Protection Act and the Human Rights Act. There is a need to put humans at the heart of the solutions and also to work out answers to questions as fundamental as “what is privacy?”. When there’s gender or racial bias in data and AI, she added, the answer “I didn’t mean to do that” is not good enough: this absolutely chimes with GDPR and the fairness obligation in Recital 39 and Article 5(1). “Equity must be designed in, from the start.”
Doteveryone’s Miller added a fundamental question of her own – “what does consumer welfare look like in the digital age?” There is the issue of societal impact versus impact on the individual. With Cambridge Analytica, there might not have been an impact on me as an individual, she said, but there was harm to democracy.
Too many of the legal tools are based on bricks and mortar, said Ariel Ezrachi, professor of competition law at the University of Oxford. For instance, competition rules around mergers are centred on competing companies per se, so they don’t take into account the wider implications, including data-related ones, of buying complementary companies.
A prime topical example is Google buying Fitbit – as Wired pointed out, this was despite Google already being investigated by Congress, state attorneys general, and federal antitrust regulators, “a reflection of growing alarm over a conglomerate whose dominant market share is built on unrivaled access to personal data. Now it was announcing a $2.2 billion acquisition of a firm with troves of the most intimate details of its users’ physical health”.
Interoperability has to be made an intrinsic part of the internet again, said Derek McAuley, professor of digital economy at the University of Nottingham. At present, “everyone is playing by Game of Thrones rules”: the big companies either buy out the competition or change their APIs to take competitors out of the market. The last time he counted, Facebook had made 78 acquisitions, of which 17 were social media companies. A commitment that Google will not use Fitbit data should be embedded in law, not left to vague reassurances.
“Applying fines after they have taken out all of the competition doesn’t work,” said McAuley; the need is to get in front of the problem. That means regulation, added Ezrachi: regulation is the fence at the top of the cliff stopping you falling; competition law is the ambulance at the bottom.
Data Ownership vs Rights
Regularly touted is the idea of giving people ownership of their data, and this led to an interesting disagreement in the afternoon panel on the topic. In the rights-not-ownership camp are the ODI and the ICO’s executive director for technology, Simon McDougall, who gave three reasons why he does not support ownership.
First, what is owned can be sold, and to a degree there is already “privacy as a luxury product”, he argued. Second, there are too many grey areas and ownership could over-simplify things – he gave the example of taking a selfie of the panel: who would own it? Third, it could let data controllers off the hook in terms of responsibility, pushing the onus onto individuals.
Adding to the objections was Kitty von Bertele, associate at the global social justice philanthropic organisation Luminate. Ownership doesn’t solve the group conundrum, with data from one individual having implications for family, colleagues and friends. She too advocates a rights-based approach.
Streamr’s head of communications, Shiv Malik, was the voice for ownership on the panel, although he found support from some of the audience. Streamr is an open-source, decentralised network for publishing and sharing data. The conversation has probably been going on for ten years, he said, but with no one actually doing anything; it is now being solved by technologists, he argued. Streamr’s vision is for a single common interface to bring together data buyers and sellers, who transact via the Ethereum blockchain. “I do trust the people,” he said, rather than large corporations or the state, and people have to be incentivised.
Conclusion – A Pivotal Moment?
The ODI summit is an excellent platform for experts to air their concerns, chew over the issues and suggest solutions. However, it was easy to come away at the end with the feeling that, for all the talk, it is those solutions that remain elusive. When Shadbolt observed that “we always seem to be at a pivotal moment,” it would be easy to detect a degree of frustration.
Perhaps data ownership will become part of the solution. There is certainly an urgent need for regulators to catch up. Perhaps the ODI’s work with Arup, Co-op, Deutsche Bank, Refinitiv and Pinsent Masons around data trust pilots will bear fruit – is independent stewardship of data the way forward? Whether or not we are indeed at a pivotal moment, there is no doubt that the problem is only becoming more pressing.