This is the third part in a series of blog posts about my thesis project at BrainCreators.
In the introductory part, I covered some basic concepts regarding AI explainability and the LIME algorithm, gave a short introduction to the Python LIME library, and discussed some modifications to it that I used during the project.
In the second part I covered the notion of Justifications and how one can use LIME to produce justifications.
In today’s post, I will show how this concept can be incorporated in practice for a better and more beneficial human-AI dialogue (as was shown in my thesis project conducted at BrainCreators).
So what is a justification, again? And how can we incorporate it in a human-AI interaction?
After discussing the concept of Justifications casually in the previous post, it is now time to define it formally. Building on the intuition we have so far, we can define a justification as a feature subset d ⊆ D, where D is a data point classified as belonging to class c by a classifier m (either a human or an artificial classifier), such that m also classifies d, on its own, as belonging to class c.
In other words, a justification in this sense is a subset of the features of the classified data point (e.g. an area of an image) which the classifier deems sufficient, on its own, to receive the same classification as the entire data point (e.g. the whole image).
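To make the definition concrete, here is a minimal sketch in Python. It is an illustration only, not the code used in the thesis: the names `is_justification` and `toy_classifier` are hypothetical, and representing a feature subset as a boolean mask (with all other pixels replaced by a fill value) is an assumption made for the example.

```python
import numpy as np

def is_justification(classify, data_point, mask, fill_value=0.0):
    """Return True if the features selected by `mask` are, on their own,
    classified the same way as the whole data point."""
    full_label = classify(data_point)
    # Keep only the selected features; blank out everything else (assumption).
    subset = np.where(mask, data_point, fill_value)
    return classify(subset) == full_label

# Hypothetical toy classifier: "bright" if any pixel exceeds 0.5, else "dark".
def toy_classifier(image):
    return "bright" if image.max() > 0.5 else "dark"

image = np.zeros((4, 4))
image[1, 1] = 0.9  # the only bright pixel

mask_with_pixel = np.zeros((4, 4), dtype=bool)
mask_with_pixel[0:2, 0:2] = True     # region containing the bright pixel
mask_without_pixel = np.zeros((4, 4), dtype=bool)
mask_without_pixel[2:4, 2:4] = True  # region that misses it

print(is_justification(toy_classifier, image, mask_with_pixel))     # True
print(is_justification(toy_classifier, image, mask_without_pixel))  # False
```

The same check applies whether `classify` wraps a trained network or a human annotator's answer; only the way the label is obtained changes.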
In order to incorporate a justification into an interaction between two agents, we assume one to be more knowledgeable (a Teacher) and one to be in training (a Learner). Usually a person would be the Teacher, trying to get a Learner model to improve its performance. Now, we can think of several ways to use “justifications” in the interaction between the two. Let’s consider again an image classification task for illustration:
Not using justifications
The usual way of training an artificial classifier for this purpose is to give it images and labels as input. Assuming we want our classifier to distinguish between different types of balls, for example, we can pass it the image above, with the label “volleyball”.
Given enough images (and a good quality learning model), this has proven to be a task AI is more than suitable for. But what happens if we don’t have enough images (or if our training images are not representative enough of the images we actually want to use our classifier on)?
Using the teacher to give justifications
In these cases, we can benefit from augmenting the image label (“volleyball”) by having the teacher also “focus the learner’s attention” on the parts of the image it deems sufficient for making this classification.
Using the teacher to give feedback on justifications produced by the learner
In the third variant, we can have the learner output the parts of the image it deems sufficient for making a certain classification, and use the teacher to give feedback on them.
In essence, these techniques can be seen as using methods borrowed from the field of Explainability (discussed in the first blog post) to pursue the goals of Active Learning, which assumes that “a machine learning algorithm could achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns” (Settles). However, while traditional Active Learning techniques rank data instances (e.g. images) and pick the most important ones for labeling by an annotator, the approach of incorporating Justifications ranks subsets of any given instance (e.g. parts of an image) and picks those that are most important for the classifier to consider.
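The contrast between the two kinds of ranking can be sketched in a few lines of Python. This is a hedged illustration: uncertainty sampling by entropy is just one common Active Learning strategy, picked here as an example, and the per-region scores are assumed to come from some scoring step that is not shown. Both function names are hypothetical.

```python
import numpy as np

def rank_instances_by_uncertainty(probabilities):
    """Active Learning view: order instances from most to least uncertain,
    using the entropy of each row of predicted class probabilities."""
    p = np.clip(probabilities, 1e-12, 1.0)  # avoid log(0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return np.argsort(-entropy)

def rank_regions_by_support(region_scores):
    """Justification view: order the regions of ONE instance from most to
    least supportive of its label, given a score per region."""
    return np.argsort(-np.asarray(region_scores))

# Three images, two classes: the second image is the most uncertain one.
probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
print(rank_instances_by_uncertainty(probs))  # [1 2 0]

# Three regions of a single image: the first region supports the label most.
scores = [0.95, 0.2, 0.6]
print(rank_regions_by_support(scores))  # [0 2 1]
```

The first function decides which images are worth an annotator's time; the second decides which parts of one already-chosen image the classifier should attend to.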
Does it work in practice?
During the thesis project, I tested several implementations of such interactions (some using human teachers, and some using artificial teachers). In each variant, the Learner only got a small sample of images (40 or 400) to learn from, and the effect of adding justifications to the images on the Learner’s performance was recorded.
One of the first steps needed to conduct these experiments (after getting the data and training the teachers) was to produce justifications by the teacher. Building on the LIME algorithm (discussed in the second blog post) the way to produce them was straightforward – each Teacher was passed random crops of each image, and decided whether or not each of them was sufficient to classify an image correctly. And so, for example, this is a heatmap of justifications identified by an artificial Teacher to classify the image as containing a basketball:
Looking at the picture, it is easy to develop an intuition as to why a learner focused only on these justifications would perform better than a learner getting other random crops of the image as input along with the label “basketball” (obviously not an optimal label for this image, but sometimes our data is messy and not optimal!).
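The crop-based procedure described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the thesis implementation: crops are fixed-size squares, the teacher sees the crop with the rest of the image blanked out, and the names `justification_heatmap` and `toy_teacher` as well as all parameter values are hypothetical.

```python
import numpy as np

def justification_heatmap(teacher_classify, image, label,
                          n_crops=200, crop_size=8, fill_value=0.0, seed=0):
    """Count, per pixel, how often it fell inside a random square crop that
    the teacher classified as `label` when shown that crop alone."""
    rng = np.random.default_rng(seed)
    height, width = image.shape[:2]
    heatmap = np.zeros((height, width))
    for _ in range(n_crops):
        top = rng.integers(0, height - crop_size + 1)
        left = rng.integers(0, width - crop_size + 1)
        rows = slice(top, top + crop_size)
        cols = slice(left, left + crop_size)
        # Show the teacher only the crop; everything else is blanked out.
        visible = np.full_like(image, fill_value)
        visible[rows, cols] = image[rows, cols]
        if teacher_classify(visible) == label:
            heatmap[rows, cols] += 1  # this crop counts as a justification
    return heatmap

# Hypothetical toy teacher: says "ball" if it sees any bright pixel.
def toy_teacher(image):
    return "ball" if image.max() > 0.5 else "background"

image = np.zeros((32, 32))
image[10:14, 20:24] = 1.0  # a small bright "ball"

heatmap = justification_heatmap(toy_teacher, image, "ball")
print(heatmap.shape)      # (32, 32)
print(heatmap[0, 0])      # 0.0 (no accepted crop can reach this corner)
```

Pixels with high counts are the ones that most often belonged to a crop the teacher found sufficient, which is the kind of information a heatmap of justifications visualizes.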
Using human annotations of the actual balls as justifications slightly (yet significantly) improved the Learner’s performance when there were few or biased images to learn from.
Using justifications produced by artificial networks (by two different teachers, “pop1” & “pop2”) only slightly improved the Learner’s performance on large biased training sets.
However, using the artificial networks to give feedback on the learner’s justifications considerably improved the learner’s performance, for both regular and biased training sets. The large biased training set resulted in a performance gain of 12.5%!
So what does it all mean?
First of all – incorporating justifications as part of an annotator-learner dialogue can really improve the classifier’s performance! Especially in cases where the training data is biased.
However, it is worth noting that, unfortunately, justifications do not remove biases completely. Models trained on a representative training set significantly outperform models trained on a biased training set, even when the latter incorporate justifications in their dialogue with the annotator. This leads to several conclusions.
First, from a performance perspective: given a choice between reducing biases in the training set and incorporating justifications, one should prioritize the former. Of course, when nothing forces a choice between the two, one should opt for combining them (incorporating justifications while using an unbiased training set), as they seem to be synergistic. Second, from a societal perspective: we should still take ethical concerns about possible biases in our data into serious consideration.
And lastly, this project also shows that we can use AI not just to learn but also to teach. While teaching other AI agents is helpful (as demonstrated by these results), one might even envision a human playing the part of the “classifier” in this context, thus using the machine not to learn by itself but as a tool for human learning. This could be seen as another task built on human-AI interaction (like Visual Dialog).
Some final remarks
I learned a lot from conducting this thesis project, and hope you found the blog posts helpful.
I would like to thank my thesis supervisors, Dr. Thomas Mensink of the FNWI at the University of Amsterdam & Maarten Stol of BrainCreators. They were tremendously helpful in giving great advice and answering any question, steering me in the right direction while allowing me to find my own path and make my own mistakes.
I would also like to thank the entire team at BrainCreators, who were extremely helpful with technical and methodological guidance, and who provided a great environment and atmosphere for experimentation, reflection, and also for having tons of fun!