November 2024, Nigel Turner, Principal Consultant, EMEA
Earlier this year I was asked to take part in a panel session at an event in Manchester, UK. The main focus of the session was Artificial Intelligence (AI), this year’s hot subject in data management, and probably a focus of headlines and scrutiny for some time to come. Much of the panel discussion focused on how AI should be governed, both by the organizations and individuals who develop and exploit it, and by governments who seek to regulate it so it benefits, rather than harms, society as a whole. The interrelationship between AI and data was a central discussion point, and this led to consideration of how the governance of AI and the governance of data should relate to each other.
One question centred on whether AI governance and data governance were totally different disciplines or had similarities and so overlapped in any way. At the outset there was a broad consensus that both data and AI needed stronger governance than is often the case. Lately AI has been hitting the headlines for the wrong reasons in the UK, with several reported cases of the technology being used to simulate fake statements from well known politicians and celebrities. These stories illustrate growing doubts about AI’s ability to distinguish fact from fiction, and its worrying readiness to distort reality or fabricate a false one. Moreover, combining these emerging capabilities with longstanding data quality and other problems can lead to undesirable conclusions and outcomes. It’s therefore evident that both AI and data need careful governance, with policies, processes and procedures put in place to minimise the chances of mistakes and errors.
So how do you ensure AI and data are governed effectively to protect individuals, organisations and society from harm? First, it’s clear that for both AI and data you need rigorously defined, implemented and enforced governance frameworks, set within a regulatory and operational context. As a more established discipline, data governance has relatively mature frameworks developed over several years. These lay out how organisations should ensure that their employees and agents are accountable for data and personally responsible for its use and curation within legal and ethical boundaries. A good example is the General Data Protection Regulation (GDPR) in Europe, a comprehensive legal data protection framework which is being emulated in many other parts of the world. Problems with data often arise when organisations do not implement effective data governance frameworks and controls, and as a result inadvertently allow their data to be used in uncontrolled and often unethical ways. Applying AI technologies on top of this is a recipe for disaster.
Although legal and operational AI frameworks are also beginning to be developed, they remain immature. This is not a new problem in data management, as new technology often outpaces the ability of society, lawmakers and organisations to control its use effectively. And again, with AI, effective governance is playing catch-up, trying to keep pace with the technology’s rapid and accelerating capabilities. The good news is that many governments and organisations have at least recognised the need to act. The European Union, UK, USA, Canada, Brazil and China are all drafting AI legislation and associated frameworks and guidelines. These emerging legislative frameworks cover issues such as determining when the use of AI is appropriate (or not), and what ‘guardrails’ need to be put in place to limit what the tool is and is not allowed to do, and they emphasise key safety, privacy and ethical policies, principles and practices that must be adhered to. They also stipulate how AI outputs should be validated to avoid bias, and require that AI be designed and implemented transparently so its results can be validated or critiqued by others.
But the approaches vary significantly. The UK’s current AI governance emphasis is to create a ‘principles based framework’ for AI, rather than enacting detailed legal regulations on how AI should be managed and controlled. The five principles focus on:
- AI systems must operate in a robust, secure and safe way
- They should be transparent and explainable
- They should be fair, and not undermine the legal rights of individuals or organizations
- AI governance should be put in place to ensure effective oversight of the AI lifecycle and that accountability is clearly defined
- Impacted individuals or organizations have the right to challenge an AI generated decision or outcome.
The monitoring of the above will be sector based, using existing industry and data protection regulators to scrutinise compliance with these principles. Whether this will work remains to be seen. It’s also noticeable that these principles apply as much to data as to AI, as either or both can breach them. So, it’s vitally important that AI governance and data governance align and reinforce each other. Both data inaccuracies and AI imperfections lead to undesirable outcomes.
The fundamental relationship between AI and data governance is that, as we all know, AI’s results rely heavily on the quality and veracity of the data it is trained on. Poor data will produce erroneous outcomes and conclusions; good data is likely to generate valid and useful insights. So, any organisation considering using AI operationally needs to ensure that not only does it have a robust AI framework in place, but also an equally well implemented data governance framework. Data governance must underpin and supplement AI and provide the sound and reliable data management foundation that AI ultimately depends upon.
Data governance and its drive to create and maintain high quality, trusted data is a necessary precondition of AI success. But AI can also enable successful data governance and is already beginning to do so. One of the big challenges data governance continues to face is how to establish overall control over what are often siloed, disparate, inconsistent and disconnected data sources. AI can help to break down these silos by helping to identify, categorise and consolidate related data found in these stovepipes, for example highlighting personal data held in various scattered source files. This enables data governance roles such as data owners and data stewards to get a better handle on this data and so improve its overall use and control. Data governance also critically relies on the effective generation and management of metadata, which enables data owners and data stewards to curate the data for which they are responsible. Many data catalogs are already embracing AI technologies to help automate the generation of the metadata required and to dynamically update it as source data changes. Furthermore, AI can also help to suggest, automate and enforce data governance policies and rules, including data quality rules.
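To make the idea of automated discovery of personal data across silos more concrete, here is a minimal illustrative sketch in Python. Real data catalogs use trained AI/ML classifiers for this; a simple heuristic classifier stands in here purely to show the workflow. All names below (classify_column, scan_sources, the sample sources) are hypothetical and not drawn from any particular product.

```python
import re

# Illustrative stand-in for AI-driven classification: pattern and
# name-based heuristics that tag columns likely to hold personal data.
PERSONAL_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?[\d\s\-()]{7,}"),
}
NAME_HINTS = ("name", "dob", "address")

def classify_column(name: str, samples: list[str]) -> set[str]:
    """Tag a column with likely personal-data categories."""
    tags = set()
    if any(hint in name.lower() for hint in NAME_HINTS):
        tags.add("personal")
    for tag, pattern in PERSONAL_PATTERNS.items():
        if samples and all(pattern.fullmatch(s.strip()) for s in samples):
            tags.update({tag, "personal"})
    return tags

def scan_sources(sources: dict[str, dict[str, list[str]]]) -> dict[str, set[str]]:
    """Scan every source and column, collecting personal-data findings
    that a data owner or steward could then review and act on."""
    findings = {}
    for source, columns in sources.items():
        for column, samples in columns.items():
            tags = classify_column(column, samples)
            if "personal" in tags:
                findings[f"{source}.{column}"] = tags
    return findings
```

In a real catalog the classifier would be a trained model and the scan would run continuously as source data changes, feeding the results back into the metadata that stewards curate; the structure of the workflow, however, is much as sketched above.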
To summarize, AI governance and data governance are both ‘must haves’ for any organization which wants to exploit its data assets within legal and ethical boundaries. To ensure that they reinforce rather than negate or contradict each other, AI and data governance specialists within organisations need to work together to develop and implement the required frameworks harmoniously. Only this will provide a seamless and integrated set of frameworks where the benefits of AI can be realised more effectively within the controlled and secure data environment that data governance can bring. This will require AI and data governance people to deepen their understanding of each other’s principles, policies, approaches and tools. If this is done, both AI and data governance are enhanced, and the kinds of failures highlighted above become less likely.
A win-win scenario is achievable but will take motivation and effort from the data management community. If this comes to pass, the opportunities for exploiting an organization’s data assets in an ethical and managed way are potentially boundless.