My PhD in explainable AI

In my PhD research I address the question of what an AI should explain and how it can do so.

Towards a responsible and effective human–AI collaboration with the help of explainable AI.

My PhD research on Explainable AI takes place at the Technical University of Delft, in the Interactive Intelligence group. My promotor is Mark Neerincx (TuD, TNO), my co-promotor is Catholijn Jonker (TuD) and my supervisor is Jurriaan van Diggelen (TNO). I began my PhD in September 2018 as a guest researcher at TuD alongside my TNO research. Within TNO projects I seek to collaborate with colleagues from both TuD and TNO, connect them, and pursue a multidisciplinary approach to Explainable AI. Furthermore, this format allows me to advance the fundamental research on Explainable AI in order to tackle the real-world problems that industry and the Dutch government face.

My case for Explainable AI

The potential of artificial intelligence (AI) is growing, and our society has begun to experience its benefits and dangers. With this experience comes the realization that we have only a limited understanding of how AI currently functions. We want to understand how, when and why an AI makes its decisions. With this knowledge, we believe that we will be better placed to decide when an AI should be used, trusted and relied upon, as well as how we can influence and contest its decisions. Since every AI is unique and, increasingly often, not even fully understood by its developers, we want an AI that can answer our questions; we desire explainable AI. Research is required on how to design and develop such an explainable AI, as it can help us collaborate with AI more responsibly and effectively.

Thesis abstract

Explainable AI for human–AI collaboration

As a society, we have come to notice the influence and impact that Artificially Intelligent (AI) agents have on the way we live our lives. For these AI agents to support us both effectively and responsibly, we require an understanding of how they make decisions and what the consequences of these decisions are. The research field of Explainable Artificial Intelligence (XAI) aims to develop AI agents that can explain their own functioning to provide this understanding. In this thesis we defined, developed and evaluated a core set of explanations an AI agent should provide to support humans in their collaboration with AI agents.

Contrastive explanations

First, we studied the effects of explanations that convey why one decision was made instead of another, i.e., the contrastive explanation class. Two forms of this class were evaluated, providing either rule-based or example-based content. The rule-based form improved a human's understanding the most. Both forms caused participants to feel they understood the AI agent, although this feeling did not correlate with their actual understanding. Furthermore, when the agent explained itself, participants were more easily persuaded to follow its advice, even when that advice was incorrect, particularly when the explanations were provided in an example-based form.

A method to generate rule-based contrastive explanations was developed for AI agents that offer decision support; evaluations showed it to be efficient, accurate and agnostic to the AI agent's functioning. Based on the findings of a pilot study, a second method was defined and presented for AI agents that plan behaviour over time, such as those used in autonomous systems. These findings indicated that humans want contrastive explanations from such planning AI agents to report the consequences the agent expects from performing its plan. The method takes this into account by allowing humans to question the plans of these AI agents and receive an answer that addresses the agent's expected consequences, translated into human-interpretable terms instead of numerical values (e.g., that by turning right the agent expects to fall off a cliff, instead of explaining that turning right significantly reduces the expected utility).
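
To make the idea of a rule-based contrastive explanation concrete, here is a minimal Python sketch. It is not the method developed in the thesis. It assumes a black-box agent that can only be queried for decisions, and it searches for the smallest single-feature change that turns the actual decision (the fact) into the alternative a human asks about (the foil). The names `contrastive_rule`, `toy_agent` and `candidates`, as well as the loan-style features, are hypothetical and chosen only for illustration.

```python
def contrastive_rule(predict, instance, foil, candidates):
    """Return (feature, value) for the smallest single-feature change that
    makes the black-box `predict` output `foil`, or None if no candidate does."""
    best = None  # (size of change, feature, value)
    for feature, values in candidates.items():
        for value in values:
            # Query the agent on a copy of the instance with one feature changed.
            changed = dict(instance, **{feature: value})
            if predict(changed) == foil:
                size = abs(value - instance[feature])
                if best is None or size < best[0]:
                    best = (size, feature, value)
    return None if best is None else (best[1], best[2])


def toy_agent(x):
    # Hypothetical decision-support agent, used only to exercise the sketch.
    return "approve" if x["income"] >= 30 and x["debt"] <= 10 else "reject"


instance = {"income": 25, "debt": 8}
candidates = {"income": [20, 30, 40], "debt": [0, 5, 10]}  # assumed search grid

fact = toy_agent(instance)
rule = contrastive_rule(toy_agent, instance, "approve", candidates)
if rule is not None:
    feature, value = rule
    print(f"The agent chose '{fact}' instead of 'approve' because {feature} is "
          f"{instance[feature]}; with {feature} at {value} it would choose 'approve'.")
```

Because the sketch only calls `predict`, it stays independent of how the agent works internally, which is the sense in which such a method can be agnostic to the agent's functioning. A real method would need a more careful search than this single-feature grid.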

Confidence and actionable explanations

Next, we defined two novel explanation classes: confidence explanations and actionable explanations. Confidence explanations convey the likelihood that an AI agent's decision will prove correct, compute this likelihood in an interpretable way and explain it using past examples of the agent's performance. We proposed an agnostic approach to generate such explanations using case-based reasoning. Evaluations showed this approach to be accurate and predictable, even under simulated updates of the AI agent and concept drift over time. Two studies showed that both laymen and domain experts preferred our case-based reasoning approach for confidence explanations over state-of-the-art alternatives. Actionable explanations aim to support a human's ability to contest and alter an AI agent's decision, in particular when that human is subject to the AI agent's decision. We formally defined six properties that make an explanation actionable, to enable univocal comparisons of, and arguments about, explanation theories that contribute to contesting AI agents' decisions. A literature review was performed to provide a research agenda for the development and testing of methods to generate actionable explanations.
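
The general idea behind a case-based confidence explanation can be sketched in a few lines of Python. This is an illustrative simplification, not the approach evaluated in the thesis. It assumes that past cases are stored as feature vectors together with whether the agent's decision turned out to be correct, and that Euclidean distance is a reasonable similarity measure. The function name `case_based_confidence` and the example data are hypothetical.

```python
from math import dist


def case_based_confidence(query, past_cases, k=3):
    """Estimate confidence as the share of the k most similar past cases
    in which the agent's decision turned out to be correct.

    past_cases: list of (features, was_correct) pairs (assumed format).
    Returns the confidence and the cases it is based on.
    """
    if not past_cases:
        raise ValueError("at least one past case is required")
    # Retrieve the k past cases closest to the query (Euclidean distance).
    nearest = sorted(past_cases, key=lambda case: dist(query, case[0]))[:k]
    correct = sum(1 for _, was_correct in nearest if was_correct)
    return correct / len(nearest), nearest


# Hypothetical history of the agent's past decisions.
past_cases = [
    ((1.0, 1.0), True),
    ((1.2, 0.9), True),
    ((0.8, 1.1), False),
    ((5.0, 5.0), False),
    ((5.2, 4.8), False),
]

confidence, evidence = case_based_confidence((1.0, 1.0), past_cases, k=3)
print(f"In {confidence:.0%} of the {len(evidence)} most similar past cases "
      f"the agent's decision proved correct.")
```

The retrieved cases double as the explanation: the confidence value can be traced back to concrete past examples, and nothing in the computation depends on the internals of the agent.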

Explanations for collaboration

Finally, we recognized that explanations serve the collaboration between humans and AI agents, and that their application needs to be designed within the context of that collaboration. We extended an existing design method for such collaborations with the notion of explanations and presented several designs. The designs varied in the degree of autonomy given to the AI agent in morally sensitive tasks, and for each we discussed the role of explanations within such tasks. Several of these designs were then evaluated in the healthcare domain with first responders. Results showed that the participants valued the explanations but also found them tedious when experiencing time pressure. Furthermore, they felt less responsible for the AI agent when it became more autonomous, which reduced their motivation to review the explanations. This illustrates the complexity of designing an explainable AI agent that integrates various explanations to support a human–AI collaboration.


The above findings show that explanations from an AI agent have the potential to improve the collaboration between human and AI agent, since explanations can bring about various beneficial effects. Not all of their effects are positive, however. Explanations can also induce effects that are detrimental to the collaboration, and whether an effect is beneficial or detrimental largely depends on context. For instance, advice made more persuasive by an explanation might be detrimental in a use case where it inhibits a desirable critical human stance, but beneficial when it remedies unwarranted undertrust. The performed studies showed that explanations can induce such effects, whose value is use-case dependent. Similar future studies that measure the variety of effects explanations bring about will provide a solid foundation for design choices on explainable AI agents. Such design choices could be structured and made accessible through design patterns that describe which explanations have which effects in which kinds of use cases, given a particular kind of human–AI collaboration. Aside from these insights, we illustrated the value of a more formalized approach towards explanations that are actionable instead of having only epistemic value. Through distinct properties and levels we could provide a research agenda towards explanations with the profound practical value of enabling human autonomy when interacting or dealing with AI agents. We also showed that developing explanation-generating methods that are independent of the AI agent's functioning can be effective. This is especially the case when explanations do not need to disclose every detail of an AI agent. Furthermore, such methods are more robust to future developments in AI research. Finally, we illustrated how identified effects of explanations and concrete methods to generate them can be combined into design patterns, which in turn can offer a responsible design approach for an explainable AI agent that suits the human–AI collaboration.

Advice for the future

We conclude with advice for future research in XAI. First, we advise giving more attention and effort to evaluating the explanations that developed methods generate, and to addressing the wide-ranging design space of explanations. In particular, there is a need for more rigorous evaluations grounded in realistic applications of AI agents and based on explicit theoretical models that describe both positive and negative effects of explanations. This would provide industry and governments with the knowledge needed to apply explanations responsibly and to formulate best practices and regulation. Second, we advise giving more attention to the role and embedding of explanations in the human–AI collaboration. This will open more directions for research, such as the role of explanations in long-term collaborations, more interactive explanations, explanations that aid knowledge discovery, explanations adapted to a human's knowledge and, most importantly, explanations attuned to the application context. In general, this advice comes down to a more profound focus on human-centered research into the explanations an AI agent should provide.


Supervising students

  • Dhivin Nelson


    Privacy preserving actionable explanations.

    Currently supervising with Bart Kamphorst (TNO) and Meike Nauta (UT).

  • Wouter Zirkzee


    An exploration of privacy preserving XAI.

    Supervised with Mark Neerincx (TuD) and Bart Kamphorst (TNO).

  • Chantal Leeuwestein


    Explainable Artificial Intelligence for decision support systems in financial services.

    In collaboration with a financial institute. Supervised with Mark Hoogendoorn (VU).

  • Manon de Jonge


    Simulating team work: Software support for research in human-agent teaming.

    Supervised with Frank Grootjens (RU).

  • Elisabeth Nieuwburg

    2019 - Cum Laude

    An objective user evaluation of explanations of machine learning based advice in a diabetes context.

    Supervised with Mark Neerincx (UU) and Anita Cremers (HU).

  • Marcel Robeer

    2018 - Cum Laude

    Contrastive explanation for machine learning.

    Supervised with Matthieu Brinkhuis (UU) and nominated for the 2019 Best Master’s Thesis at Utrecht University.


More of my research and projects

  • My TNO research

    My research at TNO focuses on human-AI interaction, where I participate in and lead several research projects.

    TNO research
  • My own projects

    I founded and maintain several open-source projects and organize several community-building events,