The EU is trying to make AI systems more transparent. Article 13 of the EU’s proposed Artificial Intelligence Act (AIA) states that “High-risk AI systems shall be designed and developed in such a way to ensure that their operation is sufficiently transparent to enable users to interpret the system’s output and use it appropriately.” But the proposal neither specifies what it means for those using an AI system to “interpret” its output nor identifies the technical measures a provider must take to demonstrate compliance. To avoid confusion, the EU should clarify its terminology and ensure it does not mistakenly outlaw the most innovative systems.
The AIA demonstrates that the EU does not yet know whether it wants “explainability” or “interpretability.” While Recital 38 calls for “explainable” AI systems, Article 13 states that users should be able to “interpret” the system and requires human oversight measures that can facilitate the “interpretation” of its outputs. The distinction is nontrivial: requirements for explainable and interpretable AI systems differ significantly, and the consequences for the innovation ecosystem may be substantial.
To see the difference, consider two AI systems: one that uses a simple decision tree model and one that uses a random forest. A decision tree sequentially answers questions according to certain features and weights, branching out until it reaches a conclusion. Figure 1 maps out an algorithm for deciding whether to work from home.
Figure 1: Sample decision tree to decide whether to work from home
An AI system that only uses a decision tree is explainable insofar as the designer can communicate the system’s behavior and output in a way that is understandable to a human user. The system is also interpretable because the designer knows how the decision tree’s features (in Figure 1, “Friday?” and “Rain?”) and their weights determine the system’s output (“yes” or “no”). The cause and effect are clear.
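A tree like this is trivially expressible as plain code, which makes its interpretability concrete: every path from input to output can be read directly. (A minimal sketch; the exact branching order is an assumption, since only the features “Friday?” and “Rain?” and the yes/no output are given in the figure.)

```python
def work_from_home(is_friday: bool, is_raining: bool) -> bool:
    """Toy decision tree: each branch is a human-readable rule.

    Assumed structure: work from home on Fridays, or on rainy days.
    """
    if is_friday:      # feature "Friday?"
        return True
    return is_raining  # feature "Rain?"
```

Changing any feature’s answer changes the output in an obvious, traceable way, which is precisely what makes such a model interpretable.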
A random forest is a collection of decision trees (a type of ensemble learning method, which combines multiple algorithms). This orchestra of potentially thousands of decision trees outperforms any single tree. An AI system that uses a random forest model can be (imperfectly) explained but is uninterpretable, because no individual tree’s features and weights determine the model’s output.
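A toy majority-vote ensemble illustrates the point. In the sketch below (pure Python, with made-up trees), one tree votes “yes” yet the forest outputs “no”: the result emerges from the vote, not from any single tree’s features.

```python
# Three hypothetical "trees", each a function from features to a yes/no vote.
trees = [
    lambda x: x["friday"],                    # keys only on Friday
    lambda x: x["rain"],                      # keys only on rain
    lambda x: x["friday"] and not x["rain"],  # combines both features
]

def forest_predict(x: dict) -> bool:
    """Majority vote over all trees in the ensemble."""
    votes = sum(tree(x) for tree in trees)
    return votes > len(trees) / 2

features = {"friday": False, "rain": True}
# The second tree alone votes "yes", but the forest's majority says "no".
```

With thousands of trees instead of three, tracing an output back to any one tree’s logic becomes meaningless, which is why the ensemble is uninterpretable.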
If the EU is calling for interpretable AI, that would constitute a ban on so-called “black-box” algorithms. While some types of machine learning (ML) models, such as simple decision trees or linear and logistic regression, lend themselves to interpretation, complex systems like random forests and neural networks create parameters that are unknown to their designers. Indeed, this opacity is often integral to the value of the system: Black-box medical diagnostic algorithms, for example, can uncover complex biological relationships that elude or otherwise defy human interpretation. Many of the highest-performing ML models cannot be fully interpreted.
Figure 2: Interpretability versus performance in AI systems
Requiring all high-risk systems to be interpretable is the wrong approach, for two reasons. First, it fails to acknowledge instances where high performance is more important than interpretability. In scenarios where the assignment of responsibility and trust is crucial, such as bank loan applications, interpretability is the chief value. For several high-risk applications in the AIA, however, high performance is preferable. Consider, for instance, a traffic system (“critical infrastructure”) that uses a neural network to ease congestion, or an emergency first-response system (“public services”) that relies on advanced speech recognition. Prioritizing interpretability in some important domains will not provide the best outcomes.
Highly interpretable systems are also more prone to gaming. AI systems used for grading in education and vocational training (high-risk according to the AIA) would be especially vulnerable to abuse if students figure out what the system is testing for; the creators of the nonsensical essay generator BABEL have long embarrassed robo-graders. Other high-risk areas, such as biometric identification and law enforcement, can be similarly gamed.
Although complex models cannot be fully interpreted, a variety of model-agnostic methods (methods that treat a model as a black box and so can be applied to any ML model) can explain, often through visualization, the importance of individual features to a system’s output. Should the EU choose to prioritize explainability over interpretability, providers of AI systems will not have to try to pry open black boxes (which may be impossible), and the EU will not impede the progress of highly accurate systems, as well as those yet to be imagined.
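One such method, permutation importance, can be sketched in a few lines: shuffle a single feature’s column and measure how much the black-box model’s accuracy drops; features whose shuffling hurts accuracy most matter most. (An illustrative, library-free sketch; the model and data below are invented for the example.)

```python
import random

def permutation_importance(predict, X, y, feature_idx, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature's column is shuffled.

    `predict` is any black-box model: a function from a feature
    vector (a list) to a label. Larger drops mean the feature matters more.
    """
    rng = random.Random(seed)
    baseline = sum(predict(row) == label for row, label in zip(X, y)) / len(y)
    drops = []
    for _ in range(n_repeats):
        column = [row[feature_idx] for row in X]
        rng.shuffle(column)  # break the feature's link to the labels
        shuffled = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                    for row, v in zip(X, column)]
        acc = sum(predict(row) == label for row, label in zip(shuffled, y)) / len(y)
        drops.append(baseline - acc)
    return sum(drops) / n_repeats

# A black-box model that (unknown to the auditor) only uses feature 0.
model = lambda row: row[0] > 0.5
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.7], [0.1, 0.3]]
y = [True, False, True, False]
```

Here shuffling feature 1 never changes the model’s predictions, so its importance is zero, while feature 0 shows a positive score, revealing what the black box relies on without opening it.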
Algorithms used to formulate explanations are not yet perfect: different explanation algorithms can lead to different explanations, and, since the provider chooses the algorithm, explainability may be unsuited to adversarial contexts (where the interests of the provider of a system and its users diverge). Every high-risk system listed in the AIA could operate in an adversarial context.
A revised AIA should define explainability and specify that, for high-risk AI systems, providers should clarify how a system can be explained as part of the transparency obligations in Article 16. The AI Act should refrain, however, from obliging any particular explanation method.
The text should also define interpretability and only require high-risk AI systems to be interpretable in contexts where interpretability trumps safety or accuracy. It should further outline the technical measures by which a provider can demonstrate a system’s interpretability.
By clarifying these terms, the EU can refrain from implementing a de facto ban on the most innovative AI systems and pursue its goal of becoming a world-class hub for AI.
Image credit: x6e38