The creation of a central NHS digital database from GP records in England - General Practice Data for Planning and Research (GPDPR) - has been delayed by two months. The system was due to launch on 1 July, but the date has now been pushed back to 1 September by the government.
The NHS had been calling for a delay to allow patients more time to learn about the system. The British Medical Association and the Royal College of GPs had also expressed concerns.
Parties that currently oppose the database claim a lack of transparency in the process and have demanded greater consultation on how the scheme would work, as well as better communication about how patients can opt out to prevent their data being shared.
In a recent, lively conversation, DAMA UK board members Mark Humphries and Akhtar Ali discussed the pros and cons of the proposed scheme.
Both of our experts’ positions were nuanced, with some agreement over the potential positive outcomes and shortcomings of the proposals.
Mark declared himself broadly in favour of the programme, citing the ability to boost future medical research and improve treatments for the UK population. Akhtar immediately countered by stating the NHS has already captured and shared data for many years, citing advances made in controlling and treating Covid-19 as an example.
Mark added he’d be happy for his data to be shared, believing it’s “a small price to pay” for groundbreaking, genetics-based research to come. He drew a line to medical research of the past, which involved a level of personal sacrifice for participants in early organ transplants - giving rise to many procedures we take for granted today.
While existing data collection already enables NHS and academic research, Mark elaborated: “There are plans for two specific blocks of research [with the new data strategy]. One is large-scale planning. The other way it will be used is for developing new treatments.”
However, Akhtar pointed out that successful lawsuits have resulted in courts preventing various commercial organisations from “patenting individuals’ genetics” as part of their plans.
Mark conceded that using data to develop treatments is controversial as it would likely involve commercial organisations, not least pharmaceutical companies.
He cited the case of Henrietta Lacks, an American woman who died of cervical cancer. Johns Hopkins University removed cells from her tumour that have been central to medical research ever since - without her family’s knowledge.
Mark explained: “They were horrified. They raised the issue of big profit, people making money out of their mother’s cells. The case has since been recorded in a book written by a journalist, and it’s a source of massive pride to her family that her cells have made such a massive contribution to medicine. But in the context of data sharing this is highly relevant - when there is no trust, there's anger and bitterness.”
Sharing data beyond the public sector would undoubtedly be a cause of concern for many people, Mark added.
Akhtar seized on this point. He pondered: “The bigger question is: how far would companies go to get access to that data, and who will they sell it to?” He pointed to the dilemma arising from dealing with the pandemic: “Most of the data and money to fight Covid came from the public sector. But the profits are ringfenced to corporations that aren’t willing to give the vaccination free of charge to poorer, developing nations to protect their critical care staff.”
He largely opposes the NHS digital database plans due to a perceived lack of transparency around whether consumers or commercial organisations will really benefit from the sharing of a data set worth around £5bn per year. (NB a figure of £10bn has been quoted - this includes the £5bn pa estimated value of the data and cost savings to the NHS.)
Akhtar also said there have been around 1,300 NHS data breaches in the past two years alone. He believes fines issued by the ICO are too small to act as a deterrent for poor data management and security - with the proposed changes potentially opening the floodgates to far greater problems.
He said: “Once we have given away £5bn-worth of data, no commercial organisation is going to relinquish ‘free money’. This has been demonstrated by the investment of vast public funds in Covid vaccines, yet when those same organisations are asked to provide them at cost to poorer nations they suddenly claim they aren’t charities - despite suggesting they would gladly get involved in such a scheme.”
Akhtar compared patient data sharing beyond what is possible today to “moving from keeping embarrassing photos in a private album at home to revealing them on Facebook”.
Mark countered that data will be pseudonymised when it leaves the GP practice - but recognised that, even with many personal details removed, records could in theory be used to identify a patient.
He explained: “One of the things you need in order to make sure you can still link data together is a unique key. It will look like random data, but allows people to trace from the beginning of the chain to the huge database. You can identify the individual if you've got access to every link in the chain.”
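To make Mark's point concrete, here is a minimal sketch of keyed pseudonymisation. It is an illustration only: the article does not say how GPDPR generates its keys, so the use of HMAC-SHA256, the secret key, the patient identifier and the record fields are all assumptions made for this example. It shows a token that looks random, still lets records from different sources be linked, and can be traced back only by someone who holds the other links in the chain.

```python
import hashlib
import hmac

# Hypothetical secret held only by the party doing the pseudonymisation.
# This is an assumption for illustration, not a real NHS key or scheme.
SECRET_KEY = b"example-secret-held-by-the-pseudonymiser"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, random-looking token for an identifier."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_and_tokenise(record: dict) -> dict:
    """Drop the direct identifier and attach the token in its place."""
    shared = {k: v for k, v in record.items() if k != "patient_id"}
    shared["token"] = pseudonymise(record["patient_id"])
    return shared


# Two made-up records for the same made-up patient, held by different sources.
gp_record = {"patient_id": "PATIENT-0001", "diagnosis": "asthma"}
hospital_record = {"patient_id": "PATIENT-0001", "admission": "2021-03-02"}

shared_gp = strip_and_tokenise(gp_record)
shared_hospital = strip_and_tokenise(hospital_record)

# The same patient always yields the same token, so records can be linked.
can_link = shared_gp["token"] == shared_hospital["token"]

# Whoever holds the key and the original identifiers can rebuild the mapping
# and trace a token back to a person: "every link in the chain".
lookup = {pseudonymise(pid): pid for pid in ["PATIENT-0001"]}
traced_back = lookup[shared_gp["token"]]

print(can_link)
print(traced_back)
```

Without the key and the list of original identifiers, the token on its own reveals nothing about who the patient is, which is the sense in which the data is pseudonymised rather than anonymised.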
Akhtar questioned the future extent of the data-sharing scheme and whether its scope would be widened once the programme’s initial aims had been established.
He believes the proposals suffer from a lack of trust amongst patients, and also suggested winning buy-in from the medical community would be difficult since their feedback on previous attempts to centralise and manage NHS data hadn’t always been heeded.
“If the government had nothing to hide it would follow its own laws which were set out in the revised Data Protection Act following the GDPR regulations,” he stated. “They need to clarify who the external parties will be and what they want the data for.”
While reaffirming his support for the scheme based on its uses for planning, research and treatment, Mark agreed with Akhtar about a current lack of transparency: “At the moment trust is missing, and that is vital in determining whether this is a success or an opportunity missed. In this delay period the government should engage, addressing these valid concerns. Who will have a say in the proposed legislation, assess applications to access the data, put controls in place - and how often will it all be reviewed?”
In conclusion, Akhtar pointed out that even data professionals remain in the dark about the nuts and bolts of the programme: “If most people in data didn’t know this was happening until just before the original opt-out deadline, how could individuals be aware of it and know how to opt out?”
It will be fascinating to follow the debate before the September opt-out deadline, and beyond.
(You can read the full transcript of the head-to-head discussion below.)
MARK: There are two different data sharing initiatives going on at the moment. One is pulling data from different health trusts and GPs so that all healthcare practitioners have access to your medical records wherever you turn up in the system. That is completely different to GPDPR, but the two are happening at the same time, which is a bit confusing.
The two issues do get conflated. There is also an opt-out mechanism built into that. So you can say, I don't want my data shared between all the different hospitals and trusts for whatever reason. But that is limited to keeping the data within the NHS so it's only used for healthcare purposes. That is one of the benefits of having a monolithic national healthcare system.
AKHTAR: The analogy there for me is your gran keeping a photo album. Photos are only for the album in her house, and only she's got access to it. We're now moving on to gran putting pictures on Facebook, but it’s locked to herself. So those embarrassing pictures of you are quite safe.
So we're on a journey like that. Your data has been captured, you’ve seen a doctor, talked about some potentially embarrassing stuff. But you share that on the basis it’s confidential to your doctor.
MARK: I would argue it's not the same as putting it on Facebook, in the public domain. The purpose of sharing is to enable research within the NHS and universities. At the moment there are plans for two specific blocks of research.
One is large-scale planning. It's about managing healthcare capacity and treatments on aggregate numbers. It doesn't really matter who the individual statistic is, it's about the large numbers: how many people are getting liver cancer, prostate cancer, breast cancer; what do childhood diseases look like?
The other way it will be used is for developing new treatments. This is where it starts to get controversial - sharing healthcare data with commercial companies like Pfizer and AstraZeneca, so that they can use the data to develop new treatments. That's when the data goes outside the public sector. [The notion that] companies are making profit out of our data alarms people.
And how do we know this data will be safe, that hackers aren't going to get their hands on it? An important point is that the data will be pseudonymised when it leaves the GP practice. In principle your name, NHS number, address, date of birth (but not year of birth) are removed. So in that sense it’s not like the Facebook analogy.
Just looking at the data, you wouldn't be able to identify it’s Akhtar. But one of the problems with pseudonymised data is if someone is determined, and they have the tools and ability, they can often use various remaining attributes to build a picture - like a fuzzy photo - which is good enough to identify it’s probably Akhtar.
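The "fuzzy photo" idea can be sketched in a few lines. This is a hedged illustration with entirely made-up data: year of birth is the one attribute Mark says is retained, while the postcode area and sex fields, the tokens and the people are assumptions added for the example. It shows how someone with a little outside knowledge could narrow pseudonymised records down to a single match.

```python
# Made-up pseudonymised records: names and NHS numbers are gone,
# but some attributes remain (postcode_area and sex are assumed fields).
pseudonymised = [
    {"token": "a1f3", "birth_year": 1975, "postcode_area": "LS6", "sex": "M",
     "condition": "kidney disease"},
    {"token": "9c2e", "birth_year": 1975, "postcode_area": "LS6", "sex": "F",
     "condition": "asthma"},
    {"token": "77b0", "birth_year": 1982, "postcode_area": "M14", "sex": "M",
     "condition": "diabetes"},
]

# Made-up outside knowledge an attacker might already have about real people.
known_people = [
    {"name": "Person A", "birth_year": 1975, "postcode_area": "LS6", "sex": "M"},
    {"name": "Person B", "birth_year": 1990, "postcode_area": "M14", "sex": "F"},
]

# Attributes that are not identifying alone but can be in combination.
QUASI_IDENTIFIERS = ("birth_year", "postcode_area", "sex")


def candidates(person: dict, records: list) -> list:
    """Return the records whose remaining attributes all match the person."""
    return [
        record for record in records
        if all(record[field] == person[field] for field in QUASI_IDENTIFIERS)
    ]


for person in known_people:
    matches = candidates(person, pseudonymised)
    if len(matches) == 1:
        # Exactly one record fits, so it is probably this person.
        print(person["name"], "probably matches token", matches[0]["token"])
    else:
        print(person["name"], "has", len(matches), "candidate records")
```

The fewer people who share a given combination of attributes, the sharper the picture becomes, which is why the remaining fields matter as much as the removed ones.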
AKHTAR: Let's step back a bit. Many of these things are already in flow. Your GP captures your data - they need to do that, research is ongoing. Historically, you could opt out. There was an interesting case where individuals had opted out, but the process fell over and 150,000 individuals had their data shared even though they requested it wasn’t. There have been something like 1,300 NHS data breaches in the past two years.
NHS research happens with data sharing, for example to come up with treatment for kidney failure. Things are already happening, but we have to step back and realise NHS data will be worth £5bn per annum. That's a big number.
When you start saying £5bn today, what will it be worth tomorrow? I’ve opted out, I’ll put my hands up. My family suffer from kidney disease. My father had a transplant. I understand the need for that. But NHS and other health organisations already have access to our data. So we need to make a big distinction that it’s nothing new. At this moment in time, we can share data and we can opt out. But who else can have access to that £5bn-worth of our data?
The Royal College of GPs, BMA - all of them have challenged this. Mark talked about pseudonymised data. But the government's had about 200 GDPR breaches since the law kicked in - they're the ones that we know about. Then there are hackers, I don’t think any institute is foolproof.
The bigger question is: how far would companies go to get access to that data, and who will they sell it to? This is the reason the Royal College of GPs and the BMA and others have challenged this, because there’s no clarity. Will it be sold to pharmaceuticals, AI companies, or even investors potentially looking to buy into UK hospitals and so on, cherrypicking what to buy based on the data?
The pandemic is a great example of how data was shared. But at the same time, the vast majority of investment in vaccines was funded with public money. Why should we be worried to share our data? Well, that’s a perfect example. We spend billions of pounds of taxpayers’ money to make our vaccine, using our data. But when it’s for the greater good, vaccinating people in poorer countries, it’s a case of these companies saying it’s not their purpose. They are commercial organisations, not philanthropists.
To me, it's about asking what are the implications once we open the floodgates?
MARK: A technical point about pseudonymisation. One of the things you need in order to make sure you can still link data together is a unique key. It will look like random data, but allows people to trace from the beginning of the chain to the huge database. So it's not completely anonymised. You can trace the data back to identify the individual if you've got access to every link in the chain.
If it was fully anonymised, it would be a one-way flow with no way to link it back to the original source records. There's an awful lot of work that goes into pseudonymisation, what you can and can't do.
I also want to make a quick point about the risks Akhtar has laid out. Things could go wrong - they’re all valid concerns. What he hasn't said is, this data should not be shared.
I'm keen to emphasise from my point of view that I'm still in favour of sharing healthcare data. But secondly, you need to have trust in order to share data. And at the moment trust is missing.
So the most important thing to do with this delay period is engage, have the debate, address all those concerns. Under what conditions will data be shared with commercial companies? Who will have a say in the proposed legislation, assess applications to access the data, put controls in place?
What I haven't seen is any details about who will be on that body and their terms of reference or decision-making process. We need to put controls in place relevant for today, but also reviewed on a cycle to assess whether they are still relevant and robust at that time.
If you look back at the history of medicine, a lot of what we take for granted today involved sacrifice and some dodgy ethical groundwork - anatomy, grave robbers and so on, so the first doctors could understand how the human body works. Even organ transplant as recently as the 1960s is relevant. Doctors didn't understand organ rejection, so just went ahead and implanted living kidneys. Not only did the patients die, but the deaths that they suffered were actually much worse than natural kidney failure.
And yet, if that experimentation had not happened, then organ transplantation and the anti-rejection drugs which have been developed off the back of it would not be in place. This is something we take for granted now. So there is sacrifice and so, from my point of view, that’s why I'm happy to share my data. I think it's a small price to pay for future benefits of medicine.
Future medical research will be based on genetics. A big pool of data would therefore be valuable from a research point of view, to identify certain genes and how they affect the whole population.
AKHTAR: I watched a programme which was a discussion about the majority of learnings for the basis of medicine coming from Islam. Oxford University has many historical books on medicine, but wasn’t willing to share so much knowledge so [the programme said] they hid the books. So my concern comes back to trust.
Then genetics and DNA. Who is the benefit for in the US when they want to patent my DNA? The big thing we don't talk about here is the ethics of data. It's going to become the blood supply of capitalist organisations trying to get into the NHS on the cheap.
This is a concern for all health organisations and charities - where is that data going to go? Why can't the government come out and say, these are the potential companies we want to give it to. If it's all about my care, why would you want to patent it, why don’t you give it to everyone so they can all come up with the best cure?
At the moment, we're still getting a good deal in comparison to the rest of the world on medication cost. So what are the controls, and who are the benefits for?
MARK: The government absolutely needs to build trust. Unless they do that, this will fail. I think that will be a huge opportunity missed. This has the potential to unlock future healthcare treatments that we will all benefit from. People's valid concerns need to be tackled.
When there is no trust, there's fear and anger. But if concerns are recognised, the conversation is had and the value is explained, it actually becomes a very positive story.
Another general point: a lot of people think GDPR was put in place to stop data sharing. But actually one of its main goals is to encourage data sharing by putting trust in place. So there are limits to what you can do and protections in place so people know what’s possible and how they can complain.
In the pandemic, vaccine scepticism is uneven in the population. If you saw the same attitudes in NHS data sharing and didn’t get representation across age and ethnicity that would be a problem. Getting a big enough sample is important but it needs to be representative.
AKHTAR: We can talk about trust and benefits, but the government tried to do something similar in 2014 with social care data. We have pooled data for the NHS and it’s shared in the UK. A hospital anywhere will have my records. These things already happen - so the big thing is who do we want to share the data with?
Be up front and tell us the purpose. It feels cloak and dagger. Most data professionals didn’t know this was happening till just before the [original opt-out] deadline, so how could individuals be aware of it, and know how to opt out? That illustrates the problem of trust. When billion-pound contracts are handed out, people will wonder who’s really benefitting.
If GPs and the BMA are not comfortable then it rings alarm bells for anyone who uses the NHS.
MARK: I am still pro data sharing, but of course all of these concerns are valid. We need to talk about what measures will be taken to secure the data, and be transparent about who it will be shared with. Good management of risk is critical here.