Trust in data is about more than ethics – Deborah Yates
There was a time when almost every one of us would react to a new set of terms and conditions – be they from a software supplier or a service provider – by scrolling through to the end and clicking ‘agree’. We wanted to get at the goodies and didn’t think much about where our data would end up or how our actions would be tracked.
But in the post-Cambridge Analytica period, we are a little more circumspect. Many still opt for the scroll and click approach for brands that we trust, but even then we are far more likely to go in afterwards and edit our data preferences. When it comes to apps, surveys and sign-ups from brands we don’t know, we may be willing to forgo any alleged benefits to retain our data.
In other words, trust in data has become a mainstream issue, even though many of those making such decisions may not realise that trust in data, and trust in the organisations collecting and using data, is an issue they are interacting with. They may cite ‘brand trust’ or ‘security’, but these are illustrations of trust in data and why it is now important for all businesses. Organisations should take it as read that those who interact with them have the right to ask how and why data is collected, how it is used and who has access to it. After all, these are the people who will deem your company trustworthy, or not.
We cannot take trust as a given. It is very much part of a relationship, something that customers or business partners give to a brand or business once it has earned it – and the definition of ‘earning it’ is nuanced and context dependent. That may be through experience or reputation, but either of those can be a gateway to a loss of trust as well.
Of course, demonstrating ethical values plays a large role in building trust. This can both paint a picture of how an organisation operates and speak to the values of those who interact with it. There is a reason that many organisations talk about their approach to staff welfare, the environment, animal testing or their position on fair wages for suppliers.
These issues may speak to the value base of customers, but they show something wider. They establish a brand as considered, thoughtful and trustworthy. They impart a moral compass and hopefully reflect values across the board.
Ethical considerations in the way an organisation collects, uses and shares data are increasingly on the agenda - from both a social and an economic perspective. The rise of data ethics - defined as a distinct and recognised form of ethics that considers the impact of data and data practices on people, society and the environment - as a recognised discipline is a testament to this.
However, demonstrating ethical collection and use of data is just one element of trustworthy data stewardship. Gaining trust requires organisations to go above and beyond good data governance practices. They will need to demonstrate trustworthiness in privacy and security, ethics and transparency, engagement and accountability, as well as equity and fairness. Addressing each of these areas can help to increase confidence in data, as well as trust in the businesses or organisations that handle it. In doing so, those addressing each area shift from the theoretical to the practical. After all, it is easy for organisations to make claims about any element of their ethics or data practices; it is quite another thing to visibly demonstrate that these ethics are integrated and embedded into everyday business. Claiming ethical practices will certainly win attention in the short term, but failing to deliver on those claims can actually be more damaging to an organisation than failing to set such guidelines.
The Open Data Institute has long been working in the field of trustworthy data and data practices, with a team of experts who can help organisations to assess, build and demonstrate how they create and then embed ethical data practices that can be acted upon and upheld.
Please do get in touch if you would like to learn more about what the ODI can do to help your organisation demonstrate trustworthiness with data, or improve confidence in collection, use and sharing of data. We can work with you to develop your approach, or work with you to build on existing practices you may already have in place. We can also provide training for you or your staff.
Deborah Yates, Programme Lead, Data Assurance, the Open Data Institute.
Dr Jenny Andrew, Head of Data, Chartered Society of Physiotherapy
Image courtesy of Joshua Sortino, Unsplash
The UK government’s Central Digital and Data Office (CDDO) has just launched an Algorithmic Transparency Standard for the public sector. The idea is to promote accountability in the algorithmic decision-making that is a growing feature of our public services and civic life. It has largely been received with enthusiasm, and when I agreed to write about it, I thought it would be in a similar vein. This is not the piece I planned…
I should say that I trained as a scientist, and I’m conditioned to view accountability as essential to continuous improvement, and to methodological rigour. And I’ve been an active trade unionist all my working life: I’ve done my time in industrial relations, at the sharp end of holding decision-makers to account. We need more accountability, and structures for citizen-engagement in all our institutions, all our services, and in all businesses. However, the material point, and therefore the object of accountability, has to be the decision and its real-world impacts, not an algorithm or any other tool that feeds it.
To see why this framing matters, look no further than the 2020 A-level results – one of the cases that precipitated the development of the standard. When students chanted “F**k the algorithm”, they gave the exams regulator and the Department for Education a scapegoat for a succession of bad decisions. As a result, the infamous algorithm was dropped, and rightly so, but there’s been relatively little scrutiny of the circumstances that surrounded it.
As it happens, I watched the A-level results fiasco unfold, like a slow-motion car-crash, with commentary from a retired senior examiner and experienced moderator: my mother. “They’ll use the predicted grades in the end,” she said, on the day in March that the exams were cancelled. “They won’t be able to adjust the distribution so far beyond its range.”
Looking back now, my mum’s prediction was a neat illustration of the competencies that weave together into good data design:
It is rare to find those capabilities tied up in a single person. When we design data-centric processes and tools, therefore, we assemble teams and focus groups, and structure projects with touchpoints that ensure coordination. Data management professionals understand data as a whole-lifecycle concern, so we build relevant expertise into every stage of it.
Bad things happen when we arbitrarily decouple data acquisition from storage and use, or when we pass responsibilities from data manager to software developer to data interpreter like a relay baton. Often the risks in algorithmic decision-making can be traced to those handovers: cliff-edges in domain knowledge and disconnects from the holistic view.
The Algorithmic Transparency Standard, as it stands, reinforces a rather narrow, tech-centric perspective. Here’s how I think it can be recast into a more joined-up approach:
Even the title is a problem. The current hype has rendered ‘algorithm’ a loaded term, often conflated with AI and machine learning. Although the guidance suggests a wider scope for the standard, I doubt, for example, that an exam moderator working on a spreadsheet would consider using it. (If you’ve seen the mischief that can be done with a spreadsheet, there’s no way you would exempt those decisions from due scrutiny!) The standard itself should be rebalanced to give more weight to the people and process elements that contextualise the analytical technology, and more detail to the data that feeds it.
Analytical technology should be viewed in its place within the data lifecycle. We know that its outputs are only as good as the weakest link in the data supply chain. Taking a whole data lifecycle perspective right from the design phase of the analysis helps to forecast and avert choices that may be embarrassing to recount with hindsight. Furthermore, as any research data manager can attest, designing for accountability makes capture of the essential metadata a lot less troublesome than trying to reconstruct it later.
Accountability in public sector decision-making cannot be the preserve of ‘tech-native’ people. We need meaningful participation from across domains and interest groups. Not every citizen will follow all the details of the data and technologies that are used in the public sector. We can, however, target networks and civil sector organisations whose advocates may play a more active role. In the trade union movement, for example, we are developing data literacy among our reps and officers, to complement their expertise in employment and industrial relations at the negotiating table, on behalf of fellow workers.
To establish any data protocol in a way that sticks takes a combination of authority, useability and motivation. As a profession, data management can enhance all three in the transparency standard. We are custodians of the organisational structures and processes that will need to support and integrate it. Our experience tells us where contributors will struggle with information gathering, and our workflows are the key to making it easier for them. And our data-centric perspective holds the link between technology and its real-world purpose, keeping it relevant to stakeholders and measuring it by its impacts.
Real-world impact is what matters here. Spare us all from yet more data gathering without material purpose! I wonder how the Algorithmic Transparency Standard will perform outside the ‘laboratory conditions’ of its creation. Will we look back in time and see that it made a real-world difference to the decisions affecting society and public services? Probably not with its current, limited viewpoint. Not without expert, structural support.
This isn’t the enthusiastic post I planned to write, not because I want the standard to fail, but because I really want it to succeed. I think it needs a critical friend more than it needs another cheerleader, and our profession is uniquely suited to that brief.
So, I’m thinking about how we can enhance what’s good in the Algorithmic Transparency Standard, how I materialise the principle of accountability in my own professional practice, and how I can support my colleagues and trade union community to adopt it into theirs. I would love to hear other DAMA UK members’ ideas on the subject. And I would love public sector bodies, the CDDO included, to talk to us about how they can build this standard into constructive and sustainable citizen-engagement about the services they provide.
The Cynefin Framework applied to Data Governance
Phil Jones, Enterprise Data Governance Manager, Marks & Spencer
When I first got involved in data governance I found the breadth, depth, diversity, and complexity of the subject matter somewhat overwhelming, particularly on where to start, how to get it done, and how to introduce change that would stick.
My background is in business process architecture with some project management experience, so I set about applying these trusted techniques: they had worked for me in the past, after all, so why not now? I figured that the underlying causes of the data quality issues that I needed to fix were related to failures of understanding and of process adherence. My plan was to do detailed analysis to find a “one best way”, and then implement change supported by KPI measurement. I saw some progress, but it became apparent that my preferred approaches to problem solving were not sufficient, particularly when I started to encounter the need for behavioural and cultural change.
It was about this time that I came across the Cynefin framework. Cynefin [ku-nev-in] was developed by Dave Snowden in 1999 and has been used to help decision-makers understand their challenges and to make decisions in context. It has been used in a huge variety of scenarios: strategy, police work, international development, public policy, military, counterterrorism, product and software development, and education. This blog is my attempt to apply Cynefin to Data Governance.
What is the Cynefin Framework?
Cynefin provides a framework by which decision makers can figure out how best to approach a problem, helping us to distinguish order from complexity from chaos. It helps to avoid the tendency for us to force a “one size fits all” approach to fix problems, best explained by Maslow: “if all you have is a hammer, everything looks like a nail”.
What does the Cynefin framework look like?
The framework is made up of five situational domains that are defined by the nature of the cause-and-effect relationships: clear, complicated, complex, chaotic, and disorder.
The “Clear” Domain: the domain of best practice
The relationship between cause and effect exists, is repeatable and is predictable. This is the realm of “known knowns” and decisions are not questioned as there is a clear approach to follow. The approach that you apply to a problem or decision within the Clear domain is:
Sense → Categorise → Respond
This is the domain of best practices: each time you encounter a problem of a certain category you follow a script that guides you through to the resolution of that problem with a high level of confidence of a positive outcome.
I am a keen cyclist. A problem that I encountered on a recent ride was a puncture. Sensing a puncture is not so difficult: the bike becomes slightly wobbly, and you can feel the metal rims of the wheel impacting on the road. Categorising the problem takes seconds: pinching the tyre confirms the issue. Based on this problem categorisation I responded by following a clear and well-established routine – a best practice – to fix the puncture so that I could continue my ride.
The Complicated Domain: the domain of good practice
There is still a predictable relationship between cause and effect, but there are multiple viable options available. This is the realm of “known unknowns”: problems here are not easy to solve and often require effort and expertise. There is not a single “best practice”; there is a range of good practices, and problems in this domain require the involvement of domain experts to select the most viable approach. The approach that you can take to a problem or decision within the Complicated domain is:
Sense → Analyse → Respond
To illustrate this with another cycling analogy: my bike recently developed an annoying squeak that I couldn’t fix. A friend who knows far more about bikes than I do came up with a couple of ideas but neither of them worked, so I went to a bike shop where the mechanic did a more thorough inspection and was able to isolate the problem: a worn bottom bracket. I sensed the problem (the squeak), then sought the advice of experts to analyse it. My response was to apply the most viable option using good practice to fix the squeak. I can now ride with a greater amount of serenity.
The Complex Domain: the domain of emergence
The Complex domain is where the relationship between cause and effect potentially exists, but you can’t necessarily predict the effects of your actions in advance: you can reflect in hindsight that your actions caused a specific effect, but you weren’t able to predict this in advance, and you can’t necessarily predict that the same action will always cause the same effect in the future. Because of this, instead of attempting to impose a course of action with a defined outcome, decision makers must patiently allow the path forward to reveal itself: to emerge over time.
The approach that you can take to a problem or decision within the Complex domain is:
Probe → Sense → Respond
This is the domain of emergent practices and “unknown unknowns”. You test out hypotheses with “safe to fail” experiments that are configured to help a solution to emerge. Many business situations fall into this category, particularly those problems that involve people and behavioural change.
When Ken Livingstone announced a scheme to improve transport and the health of people in London in 2007, he said that the programme would herald a “cycling and walking transformation in London”. Implementing this has largely followed the Complex approach: transport officials studied schemes in Paris and elsewhere and considered the context of the problem in London. They talked to (probed) commuters, locals, and visitors to better understand people’s attitudes and behaviours related to cycling. They assessed the constraints imposed by the existing road network, and a range of other challenges, from which they came up with a limited “safe to fail” trial.
The scheme launched in 2010 in a localised area and the operators of the scheme assessed (sensed) what worked and responded by applying adjustments. For example, the initial payment process required access keys; this was replaced by users of the scheme having to register on an app. The scheme has continuously evolved with further innovations: it has followed an emergent approach, with successful changes amplified and those less successful dampened. The scheme as of today is different from where it was when first launched, and it will continue to evolve.
The Chaotic Domain: the domain of rapid response
The Chaotic domain is where you cannot determine the relationship between cause and effect, in either foresight or hindsight: the realm of “unknowable unknowns”. The only wrong decision is indecision. We come into the Chaotic domain with the priority to establish order and stabilise the problem, to “staunch the bleeding” and to get a system back on its feet. We don’t have time to look up a script: the problems we are seeing have not been experienced before; we can’t call up experts and rely on best practices; we can’t devise a set of experiments to see if we can emerge a new practice to tackle the problem. The approach that you can take to a problem or decision within the Chaotic domain is:
Act → Sense → Respond
The practices followed in the Chaotic domain are novel practices in which you move quickly towards a crisis plan with an emphasis on clear communications and decisive actions.
An example of where cyclists experience chaos is the mass crashes that occur in events such as the Tour de France. An accidental touching of wheels or an errant spectator can bring down the whole peloton, resulting in cyclists and bikes strewn across the road. The immediate priority is to act: who is injured? Medics triage and act to look after those in pain. Are the bikes in one piece? Mechanics act by applying fixes or replacing broken bikes. All of this is done in a matter of seconds.
With the casualties being looked after, those in a position to continue sense what to do: if their team leader is down, what should the team do? They don’t put out an urgent request for white boards and assemble for a team meeting: the race is not going to wait for them. They get back on their bikes and work out their response and get on with it, and then adapt their approach as required. They may form temporary alliances with other teams to catch back up with the main peloton: they have come up with some novel practices and have been able to get some semblance of order out of chaos.
The Disorder Domain
Disorder is the state that you are in when it is not yet clear which of the other four domains your situation sits within, or you are at risk of applying your default approach to problem solving irrespective of the nature of the problem. The goal is to best categorise your problem into the most appropriate domain as quickly as possible.
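The five domains and their decision sequences can be summarised as a simple lookup. This is only an illustrative sketch in Python: the names `Domain`, `APPROACH` and `approach_for` are invented for this example and are not part of Cynefin itself.

```python
from enum import Enum


class Domain(Enum):
    CLEAR = "clear"              # best practice
    COMPLICATED = "complicated"  # good practice
    COMPLEX = "complex"          # emergent practice
    CHAOTIC = "chaotic"          # novel practice
    DISORDER = "disorder"        # not yet categorised


# Decision sequence for each of the four categorised domains,
# as described in the sections above.
APPROACH = {
    Domain.CLEAR: ("sense", "categorise", "respond"),
    Domain.COMPLICATED: ("sense", "analyse", "respond"),
    Domain.COMPLEX: ("probe", "sense", "respond"),
    Domain.CHAOTIC: ("act", "sense", "respond"),
}


def approach_for(domain):
    """Return the decision sequence for a domain.

    Disorder has no sequence of its own: the goal there is to
    categorise the problem into one of the other four domains,
    so None is returned.
    """
    return APPROACH.get(domain)


print(approach_for(Domain.COMPLEX))
print(approach_for(Domain.DISORDER))
```

Returning nothing for Disorder is a deliberate choice in this sketch: it mirrors the point that the first job in Disorder is categorisation, not action.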
Movement between domains
The framework is dynamic in nature in that problems can move between domains for many reasons. The guidance within Cynefin is that the most stable pattern of movement is iterating between Complex and Complicated, with transitions from Complicated to Clear only done when newly defined “good practice” has been sufficiently tested as “best practice”.
Beware of complacency
Clear solutions are vulnerable to rapid or accelerated change: the framework calls out the need to watch out for the movement of a problem from Clear to Chaotic. There is a danger that those employing best practice approaches to problem-solving become complacent: they assume that past success guarantees future success and become over-confident. To avoid this, Clear problem-solving approaches should be subjected to continuous improvement, and mechanisms should be made available to team members to report on situations where the approach is not working.
There was one time when I fixed a puncture but was then distracted, or complacent, when putting the wheel back on the bike. I found this out to my cost when I was going down a hill and lost control of the bike – a chaotic event, at least for me. Fortunately, I landed on some grass and only my pride was wounded. Having learnt my lesson I now double-check that everything is okay before setting off.
Where does Data Governance sit within the Cynefin framework?
Dealing with Chaos
Snowden states that “there has never been a better chance to innovate than in a major crisis … where we need to rethink government, rethink social interactions, and rethink work practices. These must take place in parallel with dealing with the crisis”. We have all had to innovate throughout the recent Covid-19 pandemic and the ensuing lockdowns: for example, how to collaborate on managing data in an effective way when we are all working remotely; how to best onboard new joiners and provide them with an understanding of the data they need to perform their roles. In my team we acted quickly to come up with some novel solutions and have adapted them over time.
According to Snowden, many business situations fall in the Complex and Complicated domains. Anything involving people change, such as changes to behaviour or to organisational culture, sits in Complex. It is widely agreed that behavioural and cultural change are the most challenging aspects of Data Governance. Nicola Askham, the Data Governance Coach and DAMA committee member, makes this point really well in a blog where she writes that “the biggest mistake that I have seen is organisations failing to address culture change as part of their governance initiatives … This mistake … can ultimately lead to the complete failure of a data governance initiative”.
To add to this view, in a recent open discussion in social media another DAMA committee member, Nigel Turner, positioned two key features for successful data governance programmes that apply to any organisation: firstly, that “it must be unique to each [organisation] as it must be embedded within the existing cultural and organisational context of that organisation … one size does not fit all”. Secondly, when considering the challenges of engagement and adoption: “how do I get business buy in? How do I appoint data owners and data stewards? How do I demonstrate the benefits of data governance? How do I prioritise potential data improvements?”
These opinions from highly respected data governance professionals place those components of data governance related to people change within the Complex domain where, as we have learnt from earlier, the approach is to probe, sense, and respond.
In the domain of good practice, my team in Marks & Spencer have developed a range of good practices which can be applied by subject matter experts (SMEs) to a problem. For example, we have developed “Data Governance by Design, by Default” and “Data Quality Remediation Approach” good practice guides that SMEs can refer to when tackling problems and opportunities of this nature. Both good practice guides are informed by our Data Principles and Policies, which also sit within the Complicated domain: the principles and policies are directive and require a certain amount of effort and expertise to apply. All these artefacts are subject to continuous improvement.
In the domain of best practice, the focus is on developing repeatable patterns so that the appropriate response can be applied whenever the situation is encountered. In M&S we have developed automated processes to perform data quality checks at scale against specific business rules, and to automatically tag datasets to support information classification. These rule sets were carefully constructed and tested, and only moved to the Clear domain when they were approved; however, this is not a “fire and forget” approach: we monitor the performance and the currency of the rules to ensure that they remain fit for purpose.
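To make the idea of automated, rule-based data quality checks concrete, here is a minimal sketch in Python. It is not the M&S implementation: the rule names, record fields and the `run_checks` helper are all hypothetical, chosen only to show how a repeatable rule set might be applied to a batch of records.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


# Hypothetical business rules, for illustration only.
RULES = [
    Rule("product_id_present", lambda r: bool(r.get("product_id"))),
    Rule(
        "price_non_negative",
        lambda r: isinstance(r.get("price"), (int, float)) and r["price"] >= 0,
    ),
]


def run_checks(records, rules=RULES):
    """Return, for each rule name, the indices of the records that fail it."""
    failures = {rule.name: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name].append(index)
    return failures


sample = [
    {"product_id": "A1", "price": 12.5},
    {"product_id": "", "price": 3.0},   # fails product_id_present
    {"product_id": "B2", "price": -1},  # fails price_non_negative
]
report = run_checks(sample)
print(report)
```

Keeping each rule as a named, separate object is one way to support the monitoring described above: failure counts per rule can be tracked over time, and a rule that stops being fit for purpose can be revised or retired on its own.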
When you’re next faced with a problem, question whether the approach that you are about to apply is appropriate: does the problem really sit in best or good practice, or do you need to do some probing, sensing, and responding? And when you are working on a problem in the Complex domain, how comfortable are you to create environments and experiments that allow patterns to emerge in the face of pressures for rapid resolution of the problem and a move to command and control?
You can also use the framework to challenge those who claim that complex problems have simple solutions and recognise where their biases are leading them to misdirected and constrained thinking. For example, people who prefer to operate in the Clear domain may try to impose KPI measurement on what is a Complex problem. This can result in undesirable behaviours such as gaming of the KPIs, which gives a false sense of progress rather than addressing the underlying problem.
Cynefin has really helped, and continues to help, the work that my team is doing in Data Governance where the problems we face manifest themselves across all five domains. By understanding the characteristics of a problem we can rapidly apply the best approach to properly understand the problem and then work out how best to set about fixing it.
Dave Snowden is highly altruistic in how he shares his ideas and his expertise: I encourage you to visit his website Cognitive Edge (cognitive-edge.com) for further resources, and there is a load of video content online. I can particularly recommend his commentary on hosting a children’s party: highly amusing.
If you have applied Cynefin to help fix a problem already, or after reading this blog, it would be great to hear from you.
 Content relating to the Cynefin framework in this blog has largely been sourced from Dave Snowden’s excellent book: “Cynefin – Weaving sense-making into the fabric of our World”, and his many articles and online videos. The examples provided relating to bikes and data governance are mine alone, as are any errors.
Mentors – we need you!
2021 marks the 10th anniversary of the DAMA UK mentoring scheme. Our award-winning programme has proven very popular and the number of applications for mentoring is growing year on year. But we are at risk of becoming victims of our own success and we need to recruit more mentors to support the increase in requests for mentoring.
The scheme’s main aims are to:
Why become a DAMA UK Mentor
Obviously a key part of the role of a mentor is to support the mentee, but it is also an opportunity for personal development. Professional development organisation, Art Of Mentoring, lists the following 11 reasons why you should say ‘Yes’ to becoming a mentor:
Nigel Turner was one of the original founders of the DAMA UK mentoring scheme:
“Having been a mentor myself since the start, what have I learned about mentoring? First, being a mentor is as much a learning experience as being a mentee, as I have been exposed to many different data management people and their problems working in a wide variety of organisational cultures, including small businesses, global multinationals and UK government departments. This has taught me that although good practice in data management is often generic, with many different organisations facing similar challenges with data quality, governance, reporting and so on, understanding specific cultural contexts is critical to providing viable support and advice. What works in a small business may not do so in a multinational and vice versa.
Moreover, what mentees usually want is not someone to tell them what to do, but a mentor who acts as a sounding board to listen to their ideas and thoughts, ask independent questions, provide feedback and generally act as a supportive friend who has their best interests at heart. In essence, mentoring should be all about helping others to develop themselves in the direction they want to go in. As the film director Steven Spielberg observed, “The delicate balance of mentoring someone is not creating them in your own image but giving them the opportunity to create themselves.”
“I think my favourite thing about being a mentor is the opportunity to talk to people outside of my day job about data management and not be considered weird! No matter what industry you work in there are so many commonalities when it comes to data. Being able to help people, especially those starting out in their careers, to prepare for typical hurdles and encourage them to develop a full understanding of some of the complexities in our field so that they can be effective in what they do is extremely rewarding.”
If you are a DAMA UK member and would like to volunteer as a mentor visit the mentoring pages on our website at https://www.dama-uk.org/Mentoring for more information on the scheme, and how to get involved. Here’s to the next 10 years!
Photo by Owen Beard on Unsplash
The creation of a central NHS digital database from GP records in England - General Practice Data for Planning and Research (GPDPR) - has been delayed by two months. The system was due to launch on 1 July, but the date has now been pushed back to 1 September by the government.
The NHS had been calling for a delay to allow patients more time to learn about the system. The British Medical Association and the Royal College of GPs had also expressed concerns.
Parties that currently oppose the database claim a lack of transparency in the process and have demanded greater consultation on how the scheme would work, as well as better communication about how patients can opt out to prevent their data being shared.
In a recent, lively conversation DAMA UK board members Mark Humphries and Akhtar Ali discussed the pros and cons of the proposed scheme.
Both of our experts’ positions were nuanced, with some agreement over the potential positive outcomes and shortcomings of the proposals.
Mark declared himself broadly in favour of the programme, citing the ability to boost future medical research and improve treatments for the UK population. Akhtar immediately countered by stating the NHS has already captured and shared data for many years, citing advances made in controlling and treating Covid-19 as an example.
Mark added he’d be happy for his data to be shared, believing it’s “a small price to pay” for groundbreaking, genetics-based research to come. He drew a line to medical research of the past, which involved a level of personal sacrifice for participants in early organ transplants - giving rise to many procedures we take for granted today.
While existing data collection already enables NHS and academic research, Mark elaborated: “There are plans for two specific blocks of research [with the new data strategy]. One is large-scale planning. The other way it will be used is for developing new treatments.”
However, Akhtar pointed out that successful lawsuits have resulted in courts preventing various commercial organisations from “patenting individuals’ genetics” as part of their plans.
Mark conceded that using data to develop treatments is controversial as it would likely involve commercial organisations, not least pharmaceutical companies.
He cited the case of Henrietta Lacks, an American woman who died of cervical cancer. Johns Hopkins University removed cells from her tumour that have been central to medical research ever since - without her family’s knowledge.
Mark explained: “They were horrified. They raised the issue of big profit, people making money out of their mother’s cells. The case has since been recorded in a book written by a journalist, and it’s a source of massive pride to her family that her cells have made such a massive contribution to medicine. But in the context of data sharing this is highly relevant - when there is no trust, there's anger and bitterness.”
Sharing data beyond the public sector would undoubtedly be a cause of concern for many people, Mark added.
Akhtar seized on this point. He pondered: “The bigger question is: how far would companies go to get access to that data, and who will they sell it to?” He pointed to the dilemma arising from dealing with the pandemic: “Most of the data and money to fight Covid came from the public sector. But the profits are ringfenced to corporations that aren’t willing to give the vaccination free of charge to poorer, developing nations to protect their critical care staff.”
He largely opposes the NHS digital database plans due to a perceived lack of transparency around whether consumers or commercial organisations will really benefit from the sharing of a data set worth around £5bn per year. (NB a figure of £10bn has been quoted - this includes the £5bn pa estimated value of the data and cost savings to the NHS.)
Akhtar also said there have been around 1,300 NHS data breaches in the past two years alone. He believes fines issued by the ICO are too small to act as a deterrent for poor data management and security - with the proposed changes potentially opening the floodgates to far greater problems.
He said: “Once we have given away £5bn-worth of data, no commercial organisation is going to relinquish ‘free money’. This has been demonstrated by the investment of vast public funds in Covid vaccines, yet when those same organisations are asked to provide them at cost to poorer nations they suddenly claim they aren’t charities - despite suggesting they would gladly get involved in such a scheme.”
Akhtar compared patient data sharing beyond what is possible today to “moving from keeping embarrassing photos in a private album at home to revealing them on Facebook”.
Mark countered that data will be pseudonymised when it leaves the GP practice - but recognised that, even with many personal details removed, records could in theory be used to identify a patient.
He explained: “One of the things you need in order to make sure you can still link data together is a unique key. It will look like random data, but allows people to trace from the beginning of the chain to the huge database. You can identify the individual if you've got access to every link in the chain.”
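The linking key Mark describes can be illustrated with a minimal Python sketch. It is not a description of how GPDPR itself pseudonymises records. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation step, and the secret key, field names and patient records are all hypothetical. It shows that the same patient receives the same random-looking token in two data sets, so the records can still be linked, and that anyone holding both the key and the original identifiers can trace a token back to a person.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held only by whoever performs the pseudonymisation.
SECRET_KEY = b"example-secret-held-by-the-data-controller"

# Direct identifiers to drop (illustrative field names, not an official schema).
DIRECT_IDENTIFIERS = {"name", "nhs_number", "address"}


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, random-looking token for an identifier using a keyed hash."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_record(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Remove direct identifiers and attach the pseudonym as the linking key."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = pseudonymise(record["nhs_number"], key)
    return cleaned


# Made-up records for one fictional patient, held by two different sources.
gp_record = {
    "name": "A. Patient",
    "nhs_number": "0000000000",
    "address": "1 Example Road",
    "year_of_birth": 1970,
    "diagnosis": "asthma",
}
hospital_record = {
    "name": "A. Patient",
    "nhs_number": "0000000000",
    "address": "1 Example Road",
    "year_of_birth": 1970,
    "admission": "2021-03-01",
}

gp_out = strip_record(gp_record)
hospital_out = strip_record(hospital_record)

# The token is identical in both outputs, so the two records can still be joined.
linked = gp_out["pseudonym"] == hospital_out["pseudonym"]

# Re-identification: whoever holds the key and the source identifiers can
# rebuild the token-to-person mapping, which is every link in the chain.
lookup = {pseudonymise(r["nhs_number"]): r["name"] for r in (gp_record,)}
reidentified = lookup[gp_out["pseudonym"]]

print(linked, reidentified)
```

Full anonymisation would mean discarding the token as well. The records could then no longer be linked to each other or traced back to their source.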
Akhtar questioned the future extent of the data-sharing scheme and whether its scope would be widened once the programme’s initial aims had been established.
He believes the proposals suffer from a lack of trust amongst patients, and also suggested winning buy-in from the medical community would be difficult since their feedback on previous attempts to centralise and manage NHS data hadn’t always been heeded.
“If the government had nothing to hide it would follow its own laws which were set out in the revised Data Protection Act following the GDPR regulations,” he stated. “They need to clarify who the external parties will be and what they want the data for.”
While reaffirming his support for the scheme based on its uses for planning, research and treatment, Mark agreed with Akhtar about a current lack of transparency: “At the moment trust is missing, and that is vital in determining whether this is a success or an opportunity missed. In this delay period the government should engage, addressing these valid concerns. Who will have a say in the proposed legislation, assess applications to access the data, put controls in place - and how often will it all be reviewed?”
In conclusion, Akhtar pointed out that even data professionals remain in the dark about the nuts and bolts of the programme: “If most people in data didn’t know this was happening until just before the original opt-out deadline, how could individuals be aware of it and know how to opt out?”
It will be fascinating to follow the debate before the September opt-out deadline, and beyond.
(You can read the full transcript of the head-to-head discussion below.)
MARK: There are two different data sharing initiatives going on at the moment. One is pulling data from different health trusts and GPs so that all healthcare practitioners have access to your medical records wherever you turn up in the system. That is completely different to GPDPR, but the two are happening at the same time, which is a bit confusing.
The two issues do get conflated. There is also an opt-out mechanism built into that. So you can say, I don't want my data shared between all the different hospitals and trusts for whatever reason. But that is limited to keeping the data within the NHS so it's only used for healthcare purposes. That is one of the benefits of having a monolithic national healthcare system.
AKHTAR: The analogy there for me is your gran keeping a photo album. Photos are only for the album in her house, and only she's got access to it. We're now moving on to gran putting pictures on Facebook, but it’s locked to herself. So those embarrassing pictures of you are quite safe.
So we're on a journey like that. Your data has been captured, you’ve seen a doctor, talked about some potentially embarrassing stuff. But you share that on the basis it’s confidential to your doctor.
MARK: I would argue it's not the same as putting it on Facebook, in the public domain. The purpose of sharing is to enable research within the NHS and universities. At the moment there are plans for two specific blocks of research.
One is large-scale planning. It's about managing healthcare capacity and treatments on aggregate numbers. It doesn't really matter who the individual statistic is, it's about the large numbers: how many people are getting liver cancer, prostate cancer, breast cancer; what do childhood diseases look like?
The other way it will be used is for developing new treatments. This is where it starts to get controversial - sharing healthcare data with commercial companies like Pfizer and AstraZeneca, so that they can use the data to develop new treatments. That's when the data goes outside the public sector. [The notion that] companies are making profit out of our data alarms people.
And how do we know this data will be safe, that hackers aren't going to get their hands on it? An important point is that the data will be pseudonymised when it leaves the GP practice. In principle your name, NHS number, address, date of birth (but not year of birth) are removed. So in that sense it’s not like the Facebook analogy.
Just looking at the data, you wouldn't be able to identify it’s Akhtar. But one of the problems with pseudonymised data is if someone is determined, and they have the tools and ability, they can often use various remaining attributes to build a picture - like a fuzzy photo - which is good enough to identify it’s probably Akhtar.
AKHTAR: Let's step back a bit. Many of these things are already in flow. Your GP captures your data - they need to do that, research is ongoing. Historically, you could opt out. There was an interesting case where individuals had opted out, but the process fell over and 150,000 individuals had their data shared even though they requested it wasn’t. There have been something like 1,300 NHS data breaches in the past two years.
NHS research happens with data sharing, for example to come up with treatment for kidney failure. Things are already happening, but we have to step back and realise NHS data will be worth £5bn per annum. That's a big number.
When you start saying £5bn today, what will it be worth tomorrow? I’ve opted out, I’ll put my hands up. My family suffer from kidney disease. My father had a transplant. I understand the need for that. But NHS and other health organisations already have access to our data. So we need to make a big distinction that it’s nothing new. At this moment in time, we can share data and we can opt out. But who else can have access to that £5bn-worth of our data?
The Royal College of GPs, BMA - all of them have challenged this. Mark talked about pseudonymised data. But the government's had about 200 GDPR breaches since the law kicked in - they're the ones that we know about. Then there are hackers, I don’t think any institute is foolproof.
The bigger question is: how far would companies go to get access to that data, and who will they sell it to? This is the reason the Royal College of GPs and the BMA and others have challenged this, because there’s no clarity. Will it be sold to pharmaceuticals, AI companies, or even investors potentially looking to buy into UK hospitals and so on, cherrypicking what to buy based on the data?
The pandemic is a great example of how data was shared. But at the same time, the vast majority of investment in vaccines was funded with public money. Why should we be worried to share our data? Well, that’s a perfect example. We spend billions of pounds of taxpayers’ money to make our vaccine, using our data. But when it’s for the greater good, vaccinating people in poorer countries, it’s a case of these companies saying it’s not their purpose. They are commercial organisations, not philanthropists.
To me, it's about asking what are the implications once we open the floodgates?
MARK: A technical point about pseudonymisation. One of the things you need in order to make sure you can still link data together is a unique key. It will look like random data, but allows people to trace from the beginning of the chain to the huge database. So it's not completely anonymised. You can trace the data back to identify the individual if you've got access to every link in the chain.
If it was fully anonymised, it would be a one-way flow with no way to link it back to the original source records. There's an awful lot of work goes into pseudonymisation, what you can and can't do.
I also want to make a quick point about the risks Akhtar has laid out. Things could go wrong - they’re all valid concerns. What he hasn't said is, this data should not be shared.
I'm keen to emphasise from my point of view that I'm still in favour of sharing healthcare data. But secondly, you need to have trust in order to share data. And at the moment trust is missing.
So the most important thing to do with this delay period is engage, have the debate, address all those concerns. Under what conditions will data be shared with commercial companies? Who will have a say in the proposed legislation, assess applications to access the data, put controls in place?
What I haven't seen is any details about who will be on that body and their terms of reference or decision-making process. We need to put controls in place relevant for today, but also reviewed on a cycle to assess whether they are still relevant and robust at that time.
If you look back at the history of medicine, a lot of what we take for granted today involved sacrifice and some dodgy ethical groundwork - anatomy, grave robbers and so on, so the first doctors could understand how the human body works. Even organ transplant as recently as the 1960s is relevant. Doctors didn't understand organ rejection, so just went ahead and implanted living kidneys. Not only did the patients die, but the deaths that they suffered were actually much worse than natural kidney failure.
And yet, if that experimentation had not happened, then organ transplantation and the anti-rejection drugs which have been developed off the back of it would not be in place. This is something we take for granted now. So there is sacrifice and so, from my point of view, that’s why I'm happy to share my data. I think it's a small price to pay for future benefits of medicine.
Future medical research will be based on genetics. A big pool of data would therefore be valuable from a research point of view, to identify certain genes and how they affect the whole population.
AKHTAR: I watched a programme which was a discussion about the majority of learnings for the basis of medicine coming from Islam. Oxford University has many historical books on medicine, but wasn’t willing to share so much knowledge so [the programme said] they hid the books. So my concern comes back to trust.
Then genetics and DNA. Who is the benefit for in the US when they want to patent my DNA? The big thing we don't talk about here is the ethics of data. It's going to become the blood supply of capitalist organisations trying to get into the NHS on the cheap.
This is a concern for all health organisations and charities - where is that data going to go? Why can't the government come out and say, these are the potential companies we want to give it to. If it's all about my care, why would you want to patent it, why don’t you give it to everyone so they can all come up with the best cure?
At the moment, we're still getting a good deal in comparison to the rest of the world on medication cost. So what are the controls, and who are the benefits for?
MARK: The government absolutely needs to build trust. Unless they do that, this will fail. I think that will be a huge opportunity missed. This has the potential to unlock future healthcare treatments that we will all benefit from. People's valid concerns need to be tackled.
When there is no trust, there's fear and anger. But if concerns are recognised, the conversation is had and the value is explained, it actually becomes a very positive story.
Another general point: a lot of people think GDPR was put in place to stop data sharing. But actually one of its main goals is to encourage data sharing by putting trust in place. So there are limits to what you can do and protections in place so people know what’s possible and how they can complain.
In the pandemic, vaccine scepticism is uneven in the population. If you saw the same attitudes in NHS data sharing and didn’t get representation across age and ethnicity, that would be a problem. Getting a big enough sample is important but it needs to be representative.
AKHTAR: We can talk about trust and benefits, but the government tried to do something similar in 2014 with social care data. We have pooled data for the NHS and it’s shared in the UK. A hospital anywhere will have my records. These things already happen - so the big thing is who do we want to share the data with?
Be up front and tell us the purpose. It feels cloak and dagger. Most data professionals didn’t know this was happening till just before the [original opt-out] deadline so how could individuals be aware of it, and know how to opt out? That illustrates the problem of trust. When billion pound contracts are handed out people will wonder who’s really benefitting.
If GPs and the BMA are not comfortable then it rings alarm bells for anyone who uses the NHS.
MARK: I am still pro data sharing, but of course all of these concerns are valid. We need to talk about what measures will be taken to secure the data, and be transparent about who it will be shared with. Good management of risk is critical here.
I’m delighted to announce that we have completed another round of elections to the DAMA board. This round of elections broke two records: one for the number of candidates that came forward (23) and one for the total number of votes cast (353). I am particularly delighted to welcome three brand new members to the committee (in alphabetical order):
We now have a strong committee of highly qualified and motivated data management professionals, and together we will work to continue to build a programme of activities, events and webinars to serve our growing membership.
The last year was a good one for DAMA UK. We have trebled our membership from 250 to over 750. In part this reflects a growing awareness of the value of data management. I think that it is also related to an important exercise that we undertook last year when we developed a marketing strategy for the first time. In the exercise we really looked critically at DAMA UK from our members’ point of view and considered how we added value. The themes that stood out were CDMP, access to interesting material, networking events and mentoring. We will continue to develop our offerings along these lines.
As we develop these themes, we will be reaching out to you, our members, to contribute. If you have an idea for a webinar or a blog, please share it. If we are holding an event in your area and you have a story that you would like to present, please volunteer. If you would like to be a mentor to other data management professionals, please get in touch.
Mark Humphries, DAMA UK Chair
DATA MANAGEMENT MENTORING – A DECADE OF DELICATE BALANCE
As a co-founder of DAMA UK’s mentoring scheme, it’s no surprise that I am a great advocate of the value of mentoring, both in our data management profession and in life. I’m delighted to say that this year marks the 10th anniversary of our scheme, founded back in 2011.
Since its inception we have used the University of California’s definition of mentoring as ‘a developmental partnership through which one person shares knowledge, skills, information and perspective to foster the personal and professional growth of someone else’. The scheme’s main aims have also remained unchanged over the intervening period and are to:
· Help improve the skills and expertise of all DAMA UK members by growing skills, expertise and best practice across the organisation.
· Support the professionalism of all DAMA UK members
· Raise the profile of data management specialists across wider UK industry
We launched the scheme after it became clear to the committee that some of our members want to talk with other data management professionals who are not their managers or work colleagues. This may be to help them with specific data management problems, e.g. how can I get senior managers in my organisation to buy in to data governance? How do I start a customer master data management project? How do I build a business case for improving data quality? In addition, some want to focus on their own personal and professional development, e.g. What should I do to prepare myself to take on a data governance role? I don’t feel I am getting the credit within my company for what I do in data management, so how can I raise my profile and be recognised more widely?
The scheme is open only to DAMA UK individual or corporate members. In the last 10 years more than 60 DAMA UK members have been connected to a variety of mentors who have provided support across the entire breadth of the data management disciplines including data architecture, business intelligence, data quality and data governance, in addition to career path advice. At the present time 10 mentors are mentoring more than 25 DAMA UK members. We are also currently revamping the scheme to make it easier for mentors and mentees to link up via the DAMA UK website.
Having been a mentor myself since the start, what have I learned about mentoring? First, being a mentor is as much a learning experience as being a mentee, as I have been exposed to many different data management people and their problems working in a wide variety of organisational cultures, including small businesses, global multinationals and UK government departments. This has taught me that although good practice in data management is often generic, with many different organisations facing similar challenges with data quality, governance, reporting and so on, understanding specific cultural contexts is critical to providing viable support and advice. What works in a small business may not do so in a multinational and vice versa.
If you are a DAMA UK member and have not been involved to date by either being a mentor or a mentee (or both) why not give it a try? In the first instance go to the mentoring pages on our website at https://www.dama-uk.org/Mentoring for more information on the scheme, and how to get involved. Here’s to the next 10 years!
Principal Information Management Consultant, Global Data Strategy
DAMA UK Committee Member
I want to start by telling you a bit about me. My name is Andy Lunt and I’m a Data Governance Manager at The Adecco Group. I’ve worked for The Adecco Group for the last 13 years, starting out as a recruitment consultant, then moving on to setting up and managing a Management Information team both here in the UK and in Poland. I’ve now taken on the role of leading the implementation of a data governance programme.
I have lots of hobbies outside of work that keep me busy, ranging from climbing to clay pigeon shooting, as well as DIY – I live in an old cottage so there are always lots of things to fix! I love to have a tipple at the weekends, my poison of choice being Belgian beer!
There are many different ways for an organisation to start a data governance programme. Some are traditional and conventional, others less so. What matters more, in any organisation that depends on data, is that data governance happens, not who starts it or how it’s started.
Increasingly, businesses are using buzzwords like ‘data driven’ or ‘predictive analytics’ and so on. Part of the job of a new data governance manager is to help the organisation realise these aspirations by helping them understand the ‘as is’ data landscape and what foundations need to be put in place to enable analytics, data science and data engineering teams to get to their promised land.
At Adecco in the UK I started our data governance journey 12 months ago. The driving force behind the move is the need for data to play a central role in helping us achieve our strategic goals. One of our big investments in this area was a data science function. We want this function to deliver insight into how our business operates but, more importantly, show us how to react to the changes in the world of recruitment around us.
The challenge, as any large organisation can sympathise with, is a large volume of data spread out in a federated fashion (which essentially means siloed data!) and housed in a mixture of legacy and new systems. If we couple this challenge with limited system and data documentation and no real data ownership, we have a recipe for failure when it comes to being a data driven organisation.
How I answered the Challenge!
All these steps Adecco are taking are a collective effort to help us mature our data management practices and achieve our ultimate goal of becoming a data driven organisation in a safe and controlled way.
Want to hear more? DAMA UK members can access my webinar by logging into the website. You’ll find my webinar in the resources area. Enjoy!
Hi I'm Nicola Askham, one of the Directors for DAMA UK. For the past two years I've been responsible for arranging and hosting the webinars for DAMA UK. I can honestly say it's been an inspiring, if busy role. I get to meet a range of really interesting people who share a wide range of insights and stories.
As the host I have to turn up to all of the webinars. I can honestly say that I have listened to presentations and heard people speak on topics that I wouldn't ordinarily have joined a webinar to learn about. So, if you do see one of our webinars being advertised on a topic that isn't perhaps your area of data speciality, please be open minded and register. Listen to the webinar and you may be surprised what you will learn. Often tips shared in respect of one data management discipline are translatable to other data management disciplines.
I have been amazed at the gems I have picked up from listening to webinars on topics that have nothing to do with my speciality.
Up until last year all of our webinars were publicly available on the Brighttalk platform, but late last year we started providing additional webinars which are only available to our members. They have been a great success.
These webinars have provided very focussed advice for our members and given them a chance to question experts in their field. Our latest members-only webinar was in fact a presentation practice session. It gave members a safe space to practise presenting via video and gain feedback from three of our experienced committee members. The feedback was so good that we are planning further presentation practice sessions later in the year. Please let us know if you would like to take part in one of them.
I am currently planning our webinars for the rest of the year. We are looking for people willing to share their stories, across all data management disciplines. If you believe that you've got a great story to share, some tips, or even better a case study of how adopting one of the data management disciplines has enabled you and your organisation to achieve better things, we'd love to hear from you.
We have a number of the DAMA DMBoK version 2 copies which we will be giving to our webinar presenters as a thank you for doing the webinar for us.
If you do have a story you'd like to share, please get in touch: info@DAMAUK.org
I’m not really a big fan of online or remote interaction – frequently exacerbated by my failure to properly harness new technology. But needs must and, for me at least, working from home is going to be my new normal for the foreseeable future, so I thought I’d better start embracing it! Back in November I attended my first ever virtual conference – the IRM MDM and Data Governance summit. I’ve been a regular attendee at this event for more than 10 years as a delegate, speaker and sponsor. This is not a paid advert by the way - based on my personal experience I would say to anyone that this is a really valuable peer to peer learning event for data management professionals. As such I wanted to get the most out of it, to try and recreate the in-person experience and catch up with people in my network. So I committed to ‘be in the room’.
I remember a keynote from a previous IRM event, delivered by Nigel Risner:
He challenged the audience to be in the room. “Do you live your life in the present or past tense? If you are in the room, be in the room. If your mind is elsewhere you might as well leave now.”
I can’t recall everything he said – I was somewhat distracted by the multitude of animal hats he was wearing and the fact that at the end of it I think I concluded I was a Dolphin... but I did leave my phone in my bag and try to pay proper attention to the speakers and presentations that followed.
So on November 3rd 2020 I booked 2 days off for personal development in the work calendar, switched on my Out of Office notification, logged out of my work email and sat down with the conference agenda, circling the sessions that most interested me. This included a virtual wine tasting guided by a sommelier (@diegosomm) from Argentina – yum! I treated anything marked on the agenda as a networking break as just that – not a catch up with email opportunity. I had some lovely video chats with people in the breakout rooms. And I popped along to the sponsors area to look at what solutions were being promoted, now that GDPR is ‘old news’.
It is much harder to ‘be in the room’ in a virtual environment. The platform format helped quite a lot with that. Sessions were auto-scheduled and if you were late for the start you missed the first few minutes. I think if everything was ‘on demand’ I’d have found it harder to commit the time. The presenters were on video – they couldn’t see you, but the fact that you can see who’s talking makes it easier to listen. You could type in Q & A in real time so there was some ‘live’ interaction. And, unlike the in-person event, this time I could download the whole presentation the next day – not just the slides. So often it’s the commentary that sparks the light bulb moment – not the words on the page. On the concluding panel session someone also pointed out that if you’d chosen a session unwisely (I’ve sat through some ‘big data’ ones in my time where the presenter could have been speaking in a foreign language for all that I understood) you could just switch over – no more needing to do the walk of shame and try and sneak out of the rear doors.
So did I prefer the virtual platform to the in-person event? No - but I did enjoy it and I think that was down to the fact that I committed to “be in the room”. Whilst I was having a lovely time catching up with past colleagues I should take this opportunity to apologise to my current ones as I really did put the day job aside for two days – but I have some great Argentinean wine recommendations if you need them!
Mary Drabble is the Principal Data Governance Analyst at Standard Life Aberdeen, leading a team embedding the organisation-wide Data Governance Implementation Framework. Mary has a proven track record in Master Data Management, Data Governance and Data Quality tools, methodologies, architectures and processes. Prior to taking on an end user role, as a consultant with more than 15 years’ experience in Information Management, she helped clients across all industries in a wide variety of engagements ranging from Analytical to Operational data and information management solutions.