The Center for Data Innovation sat down with Luca Nannini, a PhD student in AI at the University of Santiago de Compostela and data analyst at Minsait, an IT company. Mr. Nannini discussed his work on explainable AI, including the various technical and social dimensions of explainability. He highlighted the importance of thinking about the user when designing explainable systems and offered his thoughts on how explainability principles should inform thinking on future AI development.
Benjamin Mueller: What are the goals of the NL4XAI project, and how is the group pursuing them?
Luca Nannini: Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI) is a research project funded by the EU’s Horizon 2020 programme. It is the first European excellence training network focusing on natural language and explainable artificial intelligence (XAI). The goal is to train 11 early-stage researchers from a range of European universities and businesses on the development and application of natural language technologies to foster explainability in interactive AI systems.
NL4XAI is divided into four complementary research packages, going from a “back-end” to a “front-end” approach to explainability: the first package aims to design and develop novel human-centred XAI techniques. The second stresses the importance of natural language processing and generation for XAI. The third elaborates on argumentation technologies for XAI using narratives or dialogue. The fourth, which I am part of, focuses on the usability of multi-modal XAI interfaces. The four packages’ goals align with principles of fairness, accountability, and transparency in AI ethics. Moreover, the goal of NL4XAI supports the right, enshrined in the GDPR, to an explanation of autonomous systems’ decisions.
Mueller: What is the historical background of “explainable AI”? And why is it so difficult to make AI systems more transparent?
Nannini: Explainable AI is not a brand-new discipline for understanding opaque IT systems. Explainability had its inception almost fifty years ago in the medical domain, specifically in expert systems, during the so-called “first AI winter.” In that period, AI research struggled to advance, primarily because of its reliance on symbolic, rule-based systems. Approaches focused on data crunching, like machine learning, were not widely considered. With the progress made in deep learning over the last decade—symbolized by ever more sophisticated, scalable, and reliable neural networks—AI has seen extensive success in many applications. But it took a little time to realize that the promise of big data also contained several perils: foremost, quantity does not entail quality. Every AI artefact reflects, seamlessly or not, the socio-technical context of the people collecting the data, training the model, and monitoring its performance.
For those reasons, explainable AI is a tool we need to be able to calibrate based on several technical, cognitive, and socio-technical constraints. There are several technical pitfalls: the first is assuming that the output of your interpretation method is statistically accurate. This is often not the case, since the data you use is often just an approximation of reality. Another is the assumption that a single interpretation method fits all your interpretability problems. Are we looking into feature dependence and interaction? How many features are we facing? Will our interpretation method scale, or is it better to first apply dimensionality reduction techniques?
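The feature-interaction pitfall can be made concrete with a minimal sketch (the toy model and sampling grid below are my own invented example, not from the interview): a global linear surrogate fitted to a black-box whose output is a pure interaction assigns both features zero weight, even though perturbing them jointly changes the output drastically. A method chosen without asking about interactions can therefore be statistically "accurate" on average yet entirely uninformative.

```python
import itertools

# Hypothetical black-box model with a pure feature interaction.
def model(x1, x2):
    return x1 * x2

# Sample the model on a small symmetric grid.
grid = list(itertools.product([-1.0, 0.0, 1.0], repeat=2))
ys = [model(x1, x2) for x1, x2 in grid]

# On this orthogonal, zero-mean grid the least-squares slope for each
# feature reduces to sum(x_i * y) / sum(x_i ** 2).
def slope(i):
    num = sum(x[i] * y for x, y in zip(grid, ys))
    den = sum(x[i] ** 2 for x in grid)
    return num / den

w1, w2 = slope(0), slope(1)
print(w1, w2)            # both 0.0: the linear surrogate sees no effect
print(model(1.0, 1.0))   # yet jointly perturbing both features yields 1.0
```

Any interpretation method that, like this surrogate, ignores interactions would report both features as irrelevant here.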
Transparency as an explainability strategy, aside from privacy and intellectual property issues, does not guarantee that relevant information is actually conveyed. There are cognitive pitfalls to be mindful of: users can easily be misled by ambiguous, irrelevant signals, or overloaded with information. When we face complex systems offering explanations, automation and authority bias can creep in. So the design of explanations should follow Paul Grice’s “cooperative principle”: understand what is relevant for your user, beware of confirmation bias, and be concise and insightful. And finally, don’t overestimate users’ cognitive efforts: it is hard to objectively assess a user’s intention and expertise.
Mueller: How do you think about the inherent trade-off between an AI model’s interpretability and its accuracy?
Nannini: I make sense of the “interpretability-accuracy” trade-off by applying the following question to any given AI project: what is it that makes the use of AI in this context desirable? Borrowing from Occam’s razor: can the AI method used be simplified to strengthen model interpretability without compromising performance? Complex models might have a “bells and whistles” appeal, when a good old-fashioned linear regression can work fine for many problems. And if complexity is necessary, why do we need an explanation, and which tool can answer my interpretability question? Who is the user receiving the explanation, and how will it help them? Some scholars in AI, following Cynthia Rudin’s account, advocate for “interpretable-by-design” solutions, while others simply state that rather than an approximated explanation we need validation protocols for AI systems, as is done in the medical domain with drug trials. I personally believe that we do not always need to peek inside the black box if other means of establishing trust are provable, e.g., ensuring data and model quality reporting and monitoring frameworks with distributed accountability oversight during MLOps and post-market monitoring.
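The Occam's razor point can be sketched with a small example (the synthetic data below is made up for illustration): when the underlying relationship is close to linear, a two-parameter transparent model already captures nearly all the variance, so a more complex model would add opacity without meaningfully adding accuracy.

```python
import random

random.seed(0)
# Hypothetical tabular task: the true relationship is linear plus small noise.
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.1) for x in xs]

# Closed-form simple linear regression: slope and intercept.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# R^2 of the transparent model: if this is already near 1, added model
# complexity buys little accuracy and costs interpretability.
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(round(slope, 2), round(intercept, 2), round(r2, 3))
```

Here the fitted slope and intercept are themselves the explanation; no post-hoc interpretation method is needed.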
Mueller: Most work on interpretability wants to give simple explanations of an entire neural network’s behavior. What do you make of the biology-inspired approach of “zooming into” individual neurons and weights and prying open the algorithmic black box in that way?
Nannini: I believe that what you might mean by “zooming into” relates to the ability of weighting, on a low-level approach, how given feature values contribute to an output. I personally think that it is quite infeasible, if not completely useless, to observe individual neuron activation—which is just as true for neuroscience as it is for neural networks. It is more feasible to observe a local subset of data features; there are techniques to do that, such as saliency maps or Shapley values. Even if in the future we can pry open individual neurons, will this ability to “zoom in” constitute an explanation? Gestalt theory suggests that in complex systems, the overall property of a system does not correspond to the sum of its components’ properties.
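As a rough sketch of how Shapley values attribute an output to a local subset of features (the toy model `f`, the input point, and the zero baseline below are my own assumptions for illustration), one can compute exact Shapley values for a handful of features by enumerating coalitions; the attributions always sum to the difference between the model's output at the point and at the baseline.

```python
import itertools
import math

# Hypothetical black-box model with an interaction term.
def f(x):
    return x[0] + 2 * x[1] + x[0] * x[1]

def shapley(f, x, baseline):
    """Exact Shapley values for a small feature set (cost grows exponentially)."""
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for subset in itertools.combinations(others, size):
                # Weight of this coalition in the Shapley average.
                w = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi += w * (f(with_i) - f(without_i))
        phis.append(phi)
    return phis

phi = shapley(f, [1.0, 1.0], [0.0, 0.0])
print(phi)  # [1.5, 2.5]; sums to f(x) - f(baseline) = 4.0
```

Note how the interaction term is split evenly between the two features: the attribution is locally faithful, but it does not by itself explain why the model interacts the features at all.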
What is useful is knowing the relevant target features that, once perturbed, could drastically change the system’s output. But what we need to assess is the system’s behavior, i.e., how those different features and their activations interrelate in an iterative way. What’s the use of zooming in if we can’t trace back why this local region behaved in the way we just observed? How might it behave in the next iteration of the model? This is essentially a heuristic limit connected to the fundamental problem of causal inference, as stated by Donald Rubin: we can’t always observe potential outcomes, but we can validate assumptions about missing counterfactuals. So causality is really important for explainability. Interpretability does not equal explainability: the former is an intrinsic property of a system, while the latter concerns the ability to enhance a user’s understanding in a given context with specific intentions and domain expertise.
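The point about perturbing relevant features connects directly to counterfactual explanations: the smallest change to an input that flips the outcome. A minimal sketch, using a made-up approval rule standing in for a trained model and a brute-force search over single-feature perturbations:

```python
# Hypothetical decision rule standing in for a trained model.
def approve(income, debt):
    return income - 0.5 * debt >= 10.0

applicant = (8.0, 2.0)  # score 8 - 1 = 7: rejected

def counterfactual(x, max_steps=200):
    """Smallest single-feature change (on a 0.1 grid) that flips the decision."""
    for step in range(1, max_steps + 1):
        delta = step / 10
        candidates = [
            (x[0] + delta, x[1]),            # raise income
            (x[0], max(x[1] - delta, 0.0)),  # pay down debt (floored at 0)
        ]
        for cand in candidates:
            if approve(*cand):
                return delta, cand
    return None

delta, cand = counterfactual(applicant)
print(delta, cand)  # 3.0 (11.0, 2.0): raising income by 3 flips the decision
```

A counterfactual like this tells the user what to change without claiming to describe the model's internals, which is exactly the distinction between explainability for a user and interpretability of the system.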
Mueller: What do you think is the most promising approach for building explainable artificial general intelligence?
Nannini: I am not generally lured by the idea of artificial general intelligence (AGI), especially an anthropomorphic one. Moreover, I believe that speculating on when and how we might reach that point in time is an unconscious exercise of technological determinism. It is conjured up to suggest we are doomed to see AI continuing its current innovation trajectories as unstoppable, regardless of the socio-political context. This does not take into account physical and cognitive resource constraints, nor the repercussions AI might provoke for human rights and the environment.
It is true that achieving explainability in an AGI system would be difficult if not impossible. Morality is a slippery and evolving concept across different cultures. We can’t hard-code morality, but we could teach a system to operate according to moral constraints, for instance by requiring human input into decision-making. There are a few questions we should debate before trying to design an AGI: for whom might such a system be helpful? For which functions? How would we face questions about its rights? How should the system be held accountable for its actions? How would we encode its moral values and intentions? All these questions are quite open-ended and pertain to the much-needed realm of AI ethics. Just think about the classic trolley dilemma, applied to autonomous vehicles. Many worst-case scenarios were examined at the beginning of 2022 by IEEE Spectrum magazine: for example, an AGI could be used to monitor communities and might be able to forecast and mitigate human rebellion even before its inception. But these dire worst-case scenarios need to be set against benevolent ones, where a well-designed AGI—based, for instance, on a hierarchical architecture harnessing different modules such as transformers, GLOM capsule neural networks, and Joshua Tenenbaum’s computational principles for reverse-engineering human learning—can reason through abductive inferences without being data-hungry, can learn and abstract from smaller datasets and examples, and can provide options and explanations.
A well-designed AGI system might not be completely rational, but it could be conscious of conflicting moral values and their uncertainties. In other words, how could an AGI’s motivations balance our preferences against those of others? In the end, are we sure that what we need are explanations of its internal architecture, rather than ways of holding accountable its maintainers or the malicious actors who poison it with wrong examples? There are no direct answers, but it is promising to see more and more scholars and industry practitioners working on AI governance issues.