Explainable AI

Machine learning has applications within just about every vertical, from demand forecasting in retail to diagnosing cancer for medical patients. Typically, these machine learning models output their results as a probability of a certain outcome. It might say, for instance, that a particular patient has a 72% chance of lung cancer from looking at their CT scan.

Imagine yourself as a patient in this situation, and you can probably see where the problem is. If you found out you had a 72% chance of having cancer, you would undoubtedly want to know why the machine learning system thought that.

In theory, that’s where Explainable AI (XAI) comes in. XAI would allow someone like a patient or doctor to read a report that explained exactly how the machine learning model came to its conclusion. In practice, however, most machine learning algorithms operate as a ‘black box’, meaning that it is extremely difficult for human beings to understand how they came to their conclusions.

There are two key reasons humans have a hard time understanding machine learning models. First, the ‘factors’ that are used by these models usually have no analogue in human thinking. (WARNING: massive simplification ahead) These models typically use a very large number of ‘neurons’ that each separately try to learn an element of the prediction problem. Each neuron starts out with randomly assigned numerical weights, and over time the training process adjusts those weights until the neurons start to do better at creating an output. Unfortunately, because each neuron is often just a simple calculation along with some numerical weights, it can’t really be described in human terms.

When predicting retail sales, for instance, the machine learning model might end up creating a neuron that tends to heavily weight recent sales values when predicting upcoming values. It wouldn’t actually have a name in the model corresponding to its role, like “recent sales factor neuron.” Rather, it would be some calculation that just happens, in general, to end up weighting recent sales heavily. Now, a data scientist might be able to look at an individual neuron and roughly figure out what it is doing, but remember there are often 1,000 or more neurons, and models often include several ‘layers’ of these neurons stacked on top of each other, which makes the role of any individual neuron hopelessly opaque.
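To make the ‘recent sales’ neuron concrete, here is a minimal sketch in Python. The sales figures and weights are made up for illustration, and a real model would learn its weights during training rather than have them typed in. The point is that nothing in the code says “recent sales”; you only discover that role by noticing where the biggest weight happens to sit.

```python
def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum passed through a simple cutoff (ReLU)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, total)

# Hypothetical weekly sales, oldest week first (illustrative numbers only).
weekly_sales = [100.0, 110.0, 90.0, 120.0]

# Hypothetical learned weights. The neuron has no name or label; it just
# happens to put most of its weight on the most recent week.
weights = [0.05, 0.10, 0.15, 0.70]

activation = neuron(weekly_sales, weights, bias=0.0)

# The only way to guess this neuron's 'role' is to inspect the weights.
most_influential_week = max(range(len(weights)), key=lambda i: abs(weights[i]))
```

With only four inputs, a glance at the weights tells the story. Multiply that by a thousand neurons across several layers and the same inspection quickly becomes impractical.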

One way data scientists have tried to get around this is by using heat maps. When looking at an automated cancer diagnosis, for example, an XAI system can highlight on the CT image where the model is placing the most weight. This is somewhat helpful, but ultimately insufficient because of how all the factors in the model come together. For instance, the model may be concerned with a particular group of pixels in the CT scan, but only because of the relationship between those pixels and another group in a totally separate part of the scan. In that case, it is really the combination of pixels that is important, but the heat map has no way of showing that.
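Here is a toy illustration of that limitation, sketched in Python. It assumes an occlusion-style heat map (blank out one pixel at a time and see how much the score drops), which is only one of several ways heat maps are built, and it uses a four-pixel ‘scan’ and two made-up models. One model treats the two bright pixels independently; the other only cares about them in combination.

```python
def occlusion_heat_map(model, image):
    """Score drop when each pixel is zeroed out on its own."""
    base = model(image)
    heat = []
    for i in range(len(image)):
        occluded = list(image)
        occluded[i] = 0.0
        heat.append(base - model(occluded))
    return heat

def additive_model(img):
    # Hypothetical model: each region contributes on its own.
    return img[0] + img[3] - 1.0

def interaction_model(img):
    # Hypothetical model: only the combination of the two regions matters.
    return img[0] * img[3]

scan = [1.0, 0.0, 0.0, 1.0]  # two bright pixels in separate parts of the 'scan'

heat_additive = occlusion_heat_map(additive_model, scan)
heat_interaction = occlusion_heat_map(interaction_model, scan)

# The two models disagree on other inputs, e.g. a completely blank scan.
blank = [0.0, 0.0, 0.0, 0.0]
scores_on_blank = (additive_model(blank), interaction_model(blank))
```

Both heat maps light up the same two pixels with the same intensity, even though the models reason in very different ways. The picture alone cannot tell you whether those pixels matter individually or only as a pair.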

This brings us to the second big problem, which is that even if you could correctly describe the actions of each individual neuron, how could you hope to actually synthesize that information into something that would be easy for a human being to understand? A model with 100+ factors is complicated enough without taking into account that all of those factors are interacting with one another.

This is where infoSentience’s Fractal Synthesis technology can make a huge difference. In order to understand how Fractal Synthesis works, we first need to take a step back and look at how infoSentience’s technology works in general (how meta, right?). infoSentience has created technology that can analyze any data set, figure out what is most important, and explain what it found using natural language. Critically, this system is flexible across four key dimensions:

Time – you can ask the system to center its analysis on any particular period in time and any time interval. For example, you could ask it to give a retail report for the week starting on December 8th, or a quarterly report for the 2nd quarter of 2022.
Subject(s) – you can ask the system to report on a single subject or a group of subjects, and it will not only deliver that report, but include relevant context such as what sub-components within the group were most important, and also how the selected subject fits into other groups within the dataset.
Length – the system can write more or less information depending on what you want to see. If given less room to write, the system will focus more on the main points. If given more room, it will look to add additional context.
Interest – the system will be set up with a ‘best guess’ of what a user is most interested in, whether that be particular metrics or types of stories (trends, outlier events, etc.). However, the system can also quickly change how it weights different types of content to tailor the output to a particular use case.

Having this level of flexibility gives the system the ability to report not just on a given data set overall, but on any individual component or sub-set within the data. That’s why we call it Fractal Synthesis: it’s able to apply its algorithms and generate in-depth reports at any level of specificity.

For example, for a given retail data set it could create a three-paragraph report on the top-level results, which might include mentioning that a particular department had done well. If the user was interested in learning more about that department they could create a brand new three-paragraph report just on that department. If an interesting metric was mentioned in that department report, let’s say sales returns for example, the user could create a brand-new report just on sales returns within that department, or zoom out and look at sales returns for the entire company.
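The ‘zoom in anywhere’ idea can be sketched in a few lines of Python. To be clear, this is not infoSentience’s code: the `Node` class, the `report` and `find` functions, and all the numbers are hypothetical, and a real system would write prose rather than a bulleted summary. What it shows is the fractal part, where the exact same report function works on the company, a department, or anything below it, and a `length` setting controls how much detail you get.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    change: float = 0.0  # hypothetical sales change, used only for leaves
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        """A leaf reports its own change; a group sums its children."""
        if not self.children:
            return self.change
        return sum(child.total() for child in self.children)

def report(node: Node, length: int = 2) -> str:
    """Headline for this node plus its `length` biggest movers."""
    lines = [f"{node.name}: net change {node.total():+.1f}"]
    movers = sorted(node.children, key=lambda c: abs(c.total()), reverse=True)
    for child in movers[:length]:
        lines.append(f"  {child.name}: {child.total():+.1f}")
    return "\n".join(lines)

def find(node: Node, name: str) -> Optional[Node]:
    """Locate a subject anywhere in the hierarchy so it can get its own report."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None

# Made-up retail data set.
company = Node("Company", children=[
    Node("Apparel", children=[Node("Shirts", 8.0), Node("Shoes", -2.0)]),
    Node("Grocery", children=[Node("Produce", 1.0), Node("Dairy", 0.5)]),
    Node("Electronics", children=[Node("TVs", -3.0), Node("Phones", 1.0)]),
])

top_level = report(company, length=2)
apparel = report(find(company, "Apparel"), length=2)
```

The top-level report mentions Apparel as the biggest mover, and asking for a report on Apparel runs the same logic one level down.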

You can probably see how this could help solve the critical problems within XAI. Machine learning models are based on hundreds of factors AND their interactions with each other. At the end of the day, these interactions sum up to a number, which might correspond to the topic sentence in a report. In order to contextualize this topic sentence, the Fractal Synthesis technology could dive into all of the factors and use its subject and length flexibility to summarize the most important factors. Since each of the factors summarized is itself made up of multiple components, a user could simply ask the system to ‘dive in’ to that factor to get a new report on its most important subcomponents.

In order to make any of this high-level synthesis possible, the Fractal Synthesis system does need to be able to categorize the ‘work’ that each neuron is doing on its own and in combination. This is a tricky process (currently the biggest limitation of the system), and tends to vary quite a bit depending on the model being used and its targeted output. Fundamentally, however, the system plays the role of the data scientist who is able to examine the output of a single neuron and determine what, approximately, that neuron is doing. The key difference is that once it has that ‘map’ of what every neuron is doing, it is capable of quickly synthesizing what the model is collectively doing and explaining that using natural language.
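As a rough idea of what ‘figuring out what a neuron is doing’ can look like, here is a simple sketch in Python. It is an assumption on my part, not a description of how Fractal Synthesis does it: it labels a neuron with whichever human-readable feature its activations track most closely, using a plain correlation. The feature names, values, and activations are all invented for the example.

```python
def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = (var_x * var_y) ** 0.5
    return cov / denom if denom else 0.0

def label_neuron(activations, candidate_features):
    """Pick the candidate feature whose values best track the neuron's activations."""
    scores = {
        name: abs(pearson(activations, values))
        for name, values in candidate_features.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical human-readable features, one value per example fed to the model.
candidate_features = {
    "recent sales": [10.0, 12.0, 9.0, 15.0, 11.0],
    "day of week": [1.0, 2.0, 3.0, 4.0, 5.0],
}

# Hypothetical activations of one neuron on those same five examples.
activations = [5.1, 6.0, 4.4, 7.6, 5.5]

label, strength = label_neuron(activations, candidate_features)
```

A one-feature-at-a-time check like this says nothing about neurons that only matter in combination, which is part of why this step is the tricky one.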

Solving XAI is critical to allow the power of machine learning models to be applied in real-world situations. It is needed because it allows us to: (1) trust AI systems by enabling us to understand and validate the decisions that they make, (2) debug AI systems more effectively by identifying the sources of errors or biases in the system, and (3) identify opportunities for improvement in AI systems by highlighting areas where the system is underperforming or inefficient. Fractal Synthesis technology could be the key to unlocking XAI in complex machine learning models, and I’m looking forward to keeping you informed on our progress in this space.

Steve Wasick

March 8, 2023
