Using TrustyAI’s explainability from Python

TrustyAI’s explainability library is primarily aimed at the Java Virtual Machine (JVM) and is designed to integrate seamlessly with the other TrustyAI services. It adds explainability capabilities (such as feature importance and counterfactual explanations) to business automation workflows that integrate predictive models.

Many of these capabilities are useful on their own. In data science, however, Java is rarely the first language of choice; Python is far more common.

In this blog post, I will show a quick example of how to use TrustyAI’s explainability library from Python, which makes it possible to build quick prototypes, work from Jupyter notebooks, or integrate with the wider Python ecosystem.

To do so, we will use the python-trustyai Python library, which creates low-level bindings to the JVM and provides wrappers to Java-specific types, allowing us to seamlessly communicate with the explainability library and call the available methods in a "Pythonic" way.

At the moment you can try all the examples in this post in two different ways:

  • Install the python-trustyai library locally, by cloning the repository and installing with python install
  • Build a container using the manifest provided in the repository (or using it as a base for your own)

Setting up dependencies

The library executes code directly from the JARs, so the first step after importing it is to initialize the bindings by providing a "classpath". You can specify the JAR locations relative to your script (for convenience, a shell script is provided that downloads the dependencies from Maven Central). For instance:

import trustyai
       # any other dependencies       

Counterfactual explanations

We will start by showing how to search for counterfactual explanations using a toy model. If you are not familiar with counterfactuals, the simplest definition is as follows:

Assume you have a predictive model and an original input, but that input does not produce the outcome you want. A counterfactual is an alternative input (as close as possible to the original) that does produce the desired outcome.

As an example, a counterfactual for a loan approval predictive model could be expressed as "my application was rejected, but if my application data were this, then it would have been approved".

(For a more detailed introduction to counterfactuals you can watch the “Introduction to Counterfactuals and how it helps understanding black-box prediction models” KIE group video.)

Using a toy model

TrustyAI’s explainability methods are aimed primarily at black-box models, that is, models whose internals are unknown to us and with which we can only interact by sending inputs and receiving predictions. We will define a toy model in order to introduce some basic concepts, and later on, we will look at how to integrate this library with a more complex, Python-based model.

This model takes an all-numerical input x and returns y = true if the sum of the components of x is within a threshold e of a point C, and y = false otherwise.

This model is provided in the TestUtils python-trustyai module. We simply import it from Python and initialise it with C=500 and e=1.0.

from trustyai.utils import TestUtils
center = 500.0
epsilon = 1.0
model = TestUtils.getSumThresholdModel(center, epsilon)
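To make the toy model’s behaviour concrete, here is a minimal pure-Python sketch of the same rule. It is only an illustration and does not call the JVM model: the function name `sum_threshold` is hypothetical, and treating the threshold as inclusive is an assumption.

```python
def sum_threshold(x, center, epsilon):
    """Return True when the sum of x lies within epsilon of center.

    Hypothetical stand-in for the JVM toy model; the inclusive
    comparison (<=) is an assumption.
    """
    return abs(sum(x) - center) <= epsilon


# The sum is 500.5, which is within 1.0 of 500.0
inside = sum_threshold([100.0, 200.0, 200.5], center=500.0, epsilon=1.0)
# The sum is 17.0, which is far from 500.0
outside = sum_threshold([5.0, 4.0, 4.0, 4.0], center=500.0, epsilon=1.0)
```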

Next, we need to define a goal, or desired outcome. We will define it as a feature with the value true; that is, we want the sum of the input features to be in the vicinity of the point C.

The goal is a list of Output objects, each of which takes the following values:

  • The name
  • The type
  • The value (wrapped in Value)
  • A confidence threshold, which we will leave at zero (no threshold)
from trustyai.model import Output, Type, Value
goal = [Output("inside", Type.BOOLEAN, Value(True), 0.0)]

We will now define our initial features, x. Each feature can be instantiated by using the utility class FeatureFactory and in this case, we want to use numerical features, so we’ll use FeatureFactory.newNumericalFeature and create four features (an arbitrary number, just for example purposes):

import random
from trustyai.model import FeatureFactory
features = [
   FeatureFactory.newNumericalFeature(f"x{i+1}", random.random() * 10.0)
   for i in range(4)
]

As we can see below, the sum of the features is not within e (1.0) of C (500.0), so the model prediction will be false:

feature_sum = 0.0
for f in features:
   value = f.getValue().asNumber()
   print(f"Feature {f.getName()} has value {value}")
   feature_sum += value
print(f"\nFeatures sum is {feature_sum}")
Feature x1 has value 5.011623817323953
Feature x2 has value 4.574121526021039
Feature x3 has value 4.240726704569074
Feature x4 has value 3.2819942125458788

Features sum is 17.108466260459945

An important concept in counterfactuals is that of constraints. A constrained feature is one that must not change its value relative to the original. This might be crucial for several reasons: a business rule that makes a change meaningless, a feature that would be impossible to change in the real world, or even legal requirements.

Since this is just a trivial model, we will allow all features to change so we specify a list of False values, each entry corresponding to the above features.

constraints = [False] * 4
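Conceptually, the constraints list acts as a mask over the features. The sketch below illustrates that idea in plain Python; `apply_constraints` is a hypothetical helper written for this post and not part of the library, and the explainer itself may enforce constraints differently.

```python
def apply_constraints(original, candidate, constraints):
    """Keep the original value wherever the matching constraint is True."""
    return [
        orig if fixed else cand
        for orig, cand, fixed in zip(original, candidate, constraints)
    ]


original = [5.0, 4.5, 4.2, 3.3]
candidate = [100.0, 200.0, 150.0, 50.0]

# No constraints: every proposed value is accepted
unconstrained = apply_constraints(original, candidate, [False] * 4)
# First and last features are frozen at their original values
partly_fixed = apply_constraints(original, candidate, [True, False, False, True])
```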

Another important concept is that of bounds, or search domains. These inform the counterfactual explainer of the limits of the search. In some cases, we can use domain-specific knowledge (for instance, we may know the sensible range of a numerical variable), or we could use the bounds to limit the counterfactual to a region of interest. They can also be taken from the original data, if available.

In this case, we simply specify an arbitrary (but sensible) value, e.g. all the features can vary between 0 and 1000. Since all the features are numerical we will use the utility method NumericalFeatureDomain.create(min, max).

from trustyai.model.domain import NumericalFeatureDomain
feature_boundaries = [NumericalFeatureDomain.create(0.0, 1000.0)] * 4
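As a rough illustration of what a numerical bound represents, the sketch below uses a hypothetical `Bounds` class as a stand-in for a numerical domain (it is not the library’s type). It also shows a plain Python detail worth knowing: multiplying a list repeats a reference to the same object, which is harmless here because the stand-in is immutable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Hypothetical stand-in for a numerical search domain."""

    lower: float
    upper: float

    def contains(self, value):
        """True when value lies inside the closed interval."""
        return self.lower <= value <= self.upper


boundaries = [Bounds(0.0, 1000.0)] * 4

# List multiplication repeats a reference to one object rather than copying it
shared = all(b is boundaries[0] for b in boundaries)
in_range = boundaries[3].contains(485.6)
out_of_range = boundaries[0].contains(1500.0)
```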

Before instantiating the explainer, we configure the termination criteria. For this example, we specify that the counterfactual search should execute at most 10,000 iterations before stopping and returning the best result found so far.

from org.optaplanner.core.config.solver.termination import TerminationConfig
from org.kie.kogito.explainability.local.counterfactual import (
from java.lang import Long
termination_config = TerminationConfig().withScoreCalculationCountLimit(
solver_config = (

We can now instantiate the explainer itself using CounterfactualExplainer and our solver_config configuration.

from trustyai.explainers import CounterfactualExplainer
explainer = CounterfactualExplainer(solver_config)

We will now express the counterfactual problem as defined above, using:

  • original, which represents our x
  • goals, which represents our desired prediction (True)
  • domain, which represents the boundaries for the counterfactual search

We wrap these quantities, together with the constraints, in a CounterfactualPrediction (the UUID simply labels the search instance uniquely):

import uuid
from trustyai.model import PredictionFeatureDomain, PredictionInput, PredictionOutput, CounterfactualPrediction

original = PredictionInput(features)
goals = PredictionOutput(goal)
domain = PredictionFeatureDomain(feature_boundaries)

prediction = CounterfactualPrediction(
   original, goals, domain, constraints, None, uuid.uuid4()
)

We now request the counterfactual x’ which is closest to x and which satisfies f(x’,e,C)=y’:

explanation = explainer.explain(prediction, model)
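To build some intuition for what a counterfactual search does, here is a deliberately naive random search in pure Python. It is not the algorithm the explainer uses (the explainer delegates the search to a solver); every name below is hypothetical, the model is a plain-Python stand-in for the toy model, and the L1 distance is just one possible notion of "closest".

```python
import random


def model(x, center=500.0, epsilon=1.0):
    """Black-box stand-in: True when sum(x) is within epsilon of center."""
    return abs(sum(x) - center) <= epsilon


def distance(a, b):
    """L1 distance between two inputs."""
    return sum(abs(p - q) for p, q in zip(a, b))


def random_search(original, goal, bounds, constraints, max_iterations, seed=0):
    """Naive random search; returns the closest valid input found, or None."""
    rng = random.Random(seed)
    # Only unconstrained features are allowed to change
    free = [i for i, fixed in enumerate(constraints) if not fixed]
    best = None
    for _ in range(max_iterations):
        # Start each proposal from the best valid input so far, else the original
        candidate = list(best if best is not None else original)
        i = rng.choice(free)
        low, high = bounds[i]
        candidate[i] = rng.uniform(low, high)
        if model(candidate) != goal:
            continue
        if best is None or distance(candidate, original) < distance(best, original):
            best = candidate
    return best


original = [5.0, 4.5, 4.2, 3.3]
bounds = [(0.0, 1000.0)] * 4
found = random_search(original, True, bounds, [True, False, False, True], 10_000)
```

The iteration limit plays the same role as the termination criteria: the search stops after a fixed budget and returns the best candidate it has seen.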

Let’s look at the resulting counterfactual x’:

feature_sum = 0.0
for entity in explanation.getEntities():
   print(entity)
   feature_sum += entity.getProposedValue()
print(f"\nFeature sum is {feature_sum}")
java.lang.DoubleFeature{value=5.011623817323953, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x1'}
java.lang.DoubleFeature{value=4.160541846083166, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x2'}
java.lang.DoubleFeature{value=4.240726704569074, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x3'}
java.lang.DoubleFeature{value=485.6288805328477, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x4'}

Feature sum is 499.0417729008239
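Using the values printed in this post (your own run will differ, since the inputs are random), a few lines of plain Python show which features the explainer moved and how far the counterfactual is from the original. The L1 distance used here is an illustrative choice, not necessarily the metric the explainer optimises.

```python
original = [5.011623817323953, 4.574121526021039,
            4.240726704569074, 3.2819942125458788]
counterfactual = [5.011623817323953, 4.160541846083166,
                  4.240726704569074, 485.6288805328477]

# Names of the features whose value differs from the original
changed = [
    f"x{i + 1}"
    for i, (o, c) in enumerate(zip(original, counterfactual))
    if o != c
]
# Total absolute change across all features
l1_distance = sum(abs(o - c) for o, c in zip(original, counterfactual))
# The counterfactual is valid if its sum is within 1.0 of 500.0
valid = abs(sum(counterfactual) - 500.0) <= 1.0
```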

We can see that the explainer found a valid counterfactual for this model and input since the sum of the features is within e=1.0 of C=500.0.

As we’ve discussed, it is possible to constrain a specific feature xi by setting the corresponding element of the constraints list to True.

In this example, we now want to fix x1 and x4 and see if the result is as expected. That is, these features should have the same value in the counterfactual x’ as in the original x.

constraints = [True, False, False, True]  # x1, x2, x3 and x4

We simply need to wrap the previous quantities with the new constraints and request a new counterfactual explanation:

prediction = CounterfactualPrediction(
   original, goals, domain, constraints, None, uuid.uuid4()
)
explanation = explainer.explain(prediction, model)

We can see that x1 and x4 have the same value as the original and the model satisfies the conditions.

print(f"Original x1: {features[0].getValue()}")
print(f"Original x4: {features[3].getValue()}\n")
for entity in explanation.getEntities():
   print(entity)
Original x1: 5.011623817323953
Original x4: 3.2819942125458788

java.lang.DoubleFeature{value=5.011623817323953, intRangeMinimum=5.011623817323953, intRangeMaximum=5.011623817323953, id='x1'}
java.lang.DoubleFeature{value=4.574121526021039, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x2'}
java.lang.DoubleFeature{value=486.268546111066, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x3'}
java.lang.DoubleFeature{value=3.2819942125458788, intRangeMinimum=3.2819942125458788, intRangeMaximum=3.2819942125458788, id='x4'}
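Both claims can be checked from the printed values with a small helper. `check_counterfactual` is a hypothetical function written for this post, and the inclusive threshold is an assumption.

```python
def check_counterfactual(original, proposed, constraints, center=500.0, epsilon=1.0):
    """Return (constrained features unchanged, sum within epsilon of center)."""
    unchanged = all(
        o == p for o, p, fixed in zip(original, proposed, constraints) if fixed
    )
    valid = abs(sum(proposed) - center) <= epsilon
    return unchanged, valid


original = [5.011623817323953, 4.574121526021039,
            4.240726704569074, 3.2819942125458788]
proposed = [5.011623817323953, 4.574121526021039,
            486.268546111066, 3.2819942125458788]
unchanged, valid = check_counterfactual(
    original, proposed, [True, False, False, True]
)
```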

Using Python models

We’ve covered some basic concepts with the previous toy model, so we will now look at a slightly different example: how to use a custom (and more complex) Python model with TrustyAI counterfactual explanations.

This model predicts how likely a loan is to be repaid given the applicant’s characteristics. It is an XGBoost model trained on a fairly large, anonymised, public dataset from real financial institutions. This is, undoubtedly, closer to a real-world scenario than our toy model.

For convenience and brevity, the model training steps are not included. The model was serialised using the joblib library, so for this example we simply need to deserialise it and it is ready to use.

import joblib
xg_model = joblib.load("models/credit-bias-xgboost.joblib")
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              learning_rate=0.07, max_delta_step=0, max_depth=8,
              min_child_weight=1, missing=nan, monotone_constraints='()',
              n_estimators=200, n_jobs=12, num_parallel_tree=1, random_state=27,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=0.9861206227457426,
              seed=27, subsample=1, tree_method='exact', validate_parameters=1,

This model has a single boolean output, PaidLoan, which, as mentioned, contains the prediction of whether a loan applicant will repay the loan in time. The model is slightly more complex than the previous example, with considerably more input features.

We will start by testing the model with an input that we are quite sure (from the original data) will be predicted as false:

x = [
       False, # NewCreditCustomer
       2125.0, # Amount
       20.97, # Interest 
       60.0, # LoanDuration
       4.0, # Education
       0.0, # NrOfDependants
       6.0, # EmploymentDurationCurrentEmployer
       0.0, # IncomeFromPrincipalEmployer
       301.0, # IncomeFromPension
       0.0, # IncomeFromFamilyAllowance
       53.0, # IncomeFromSocialWelfare
       0.0, # IncomeFromLeavePay
       0.0, # IncomeFromChildSupport
       0.0, # IncomeOther
       8.0, # ExistingLiabilities
       6.0, # RefinanceLiabilities
       26.29, # DebtToIncome
       10.92, # FreeCash
       1000.0, # CreditScoreEeMini
       1.0, # NoOfPreviousLoansBeforeLoan
       500.0, # AmountOfPreviousLoansBeforeLoan
       590.95, # PreviousRepaymentsBeforeLoan
       0.0, # PreviousEarlyRepaymentsBefoleLoan
       0.0, # PreviousEarlyRepaymentsCountBeforeLoan
       False, # Council_house
       False, # Homeless
       False, # Joint_ownership
       False, # Joint_tenant
       False, # Living_with_parents
       False, # Mortgage
       False, # Other
       False, # Owner
       True, # Owner_with_encumbrance
       True, # Tenant
       False, # Entrepreneur
       False, # Fully
       False, # Partially
       True, # Retiree
       False, # Self_employed
]

The model’s prediction is that this application will not be repaid with a probability of ~77%:

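import numpy as np
# The model expects a 2D array with one row per application
print(xg_model.predict_proba(np.array([x])))
print(f"Paid loan is predicted as: {xg_model.predict(np.array([x]))}")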
[[0.7770493  0.22295067]]
Paid loan is predicted as: [False]

Since Python models cannot be passed as-is to the counterfactual explainer, we need to prepare the XGBoost model so that the TrustyAI counterfactual engine can use it. Fortunately, the process is simple enough. We only need to create a prediction function that:

  • takes a java.util.List of PredictionInput as input
  • returns a java.util.List of PredictionOutput as output

As long as this interface is respected, the inner workings of the function can be anything (including calling an XGBoost Python model for prediction, as in our case). We will use a utility method, toJList, which converts a Python list into a Java List, leaving the contents unchanged.

from typing import List
from trustyai.utils import toJList
def predict(inputs):
   values = [feature.getValue().asNumber() for feature in inputs.get(0).getFeatures()]
   result = xg_model.predict_proba(np.array([values]))
   false_prob, true_prob = result[0]
   if false_prob > true_prob:
       prediction = (False, false_prob)
   else:
       prediction = (True, true_prob)
   output = Output("PaidLoan", Type.BOOLEAN, Value(prediction[0]), prediction[1])
   return toJList([PredictionOutput([output])])
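The core of this function is the step that turns a row of class probabilities into a label and a confidence. Isolated as a plain-Python helper (the name `label_and_confidence` is hypothetical), that step can be tested without the JVM or the model. The row is assumed to be ordered as [P(False), P(True)], matching the unpacking above.

```python
def label_and_confidence(row):
    """Map a [P(False), P(True)] row to (predicted label, its probability)."""
    false_prob, true_prob = row
    # Ties resolve to True, matching the comparison used in predict()
    if false_prob > true_prob:
        return False, false_prob
    return True, true_prob


# Probabilities reported earlier for the example application
label, confidence = label_and_confidence([0.7770493, 0.22295067])
tie_label, tie_confidence = label_and_confidence([0.5, 0.5])
```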

Once the prediction method is created, we wrap it in a PredictionProvider class. Since the TrustyAI explainability API is asynchronous (based on CompletableFutures), this class takes care of all the JVM’s asynchronous plumbing for us.

from trustyai.model import PredictionProvider
model = PredictionProvider(predict)
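As a loose Python analogy for what "wrapping a synchronous function in an asynchronous API" means, the sketch below returns an already-completed `concurrent.futures.Future`. This only illustrates the pattern; it is not how PredictionProvider is implemented, and `as_future` is a hypothetical helper.

```python
from concurrent.futures import Future


def as_future(predict_fn):
    """Wrap a synchronous function so it returns an already-completed Future."""

    def wrapper(inputs):
        future = Future()
        try:
            future.set_result(predict_fn(inputs))
        except Exception as error:
            # Surface failures through the future instead of raising here
            future.set_exception(error)
        return future

    return wrapper


# A trivial "model" that sums each input row
async_predict = as_future(lambda inputs: [sum(row) for row in inputs])
result = async_predict([[1.0, 2.0], [3.0, 4.0]]).result()
```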

We will now express the previous inputs (x) in terms of Features, so that we can use them for the counterfactual search:

def make_feature(name, value):
   if type(value) is bool:
       return FeatureFactory.newBooleanFeature(name, value)
   else:
       return FeatureFactory.newNumericalFeature(name, value)
features = [
   make_feature(p[0], p[1])
   for p in [
       ("NewCreditCustomer", False),
       ("Amount", 2125.0),
       ("Interest", 20.97),
       ("LoanDuration", 60.0),
       ("Education", 4.0),
       ("NrOfDependants", 0.0),
       ("EmploymentDurationCurrentEmployer", 6.0),
       ("IncomeFromPrincipalEmployer", 0.0),
       ("IncomeFromPension", 301.0),
       ("IncomeFromFamilyAllowance", 0.0),
       ("IncomeFromSocialWelfare", 53.0),
       ("IncomeFromLeavePay", 0.0),
       ("IncomeFromChildSupport", 0.0),
       ("IncomeOther", 0.0),
       ("ExistingLiabilities", 8.0),
       ("RefinanceLiabilities", 6.0),
       ("DebtToIncome", 26.29),
       ("FreeCash", 10.92),
       ("CreditScoreEeMini", 1000.0),
       ("NoOfPreviousLoansBeforeLoan", 1.0),
       ("AmountOfPreviousLoansBeforeLoan", 500.0),
       ("PreviousRepaymentsBeforeLoan", 590.95),
       ("PreviousEarlyRepaymentsBefoleLoan", 0.0),
       ("PreviousEarlyRepaymentsCountBeforeLoan", 0.0),
       ("Council_house", False),
       ("Homeless", False),
       ("Joint_ownership", False),
       ("Joint_tenant", False),
       ("Living_with_parents", False),
       ("Mortgage", False),
       ("Other", False),
       ("Owner", False),
       ("Owner_with_encumbrance", True),
       ("Tenant", True),
       ("Entrepreneur", False),
       ("Fully", False),
       ("Partially", False),
       ("Retiree", True),
       ("Self_employed", False),
   ]
]
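One detail of make_feature is worth spelling out: in Python, bool is a subclass of int, so the boolean check has to come before any numeric check. The sketch below shows the same dispatch with a hypothetical `feature_kind` helper that returns a label instead of building a Feature.

```python
def feature_kind(value):
    """Classify a raw value; bool must be tested before numbers."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numerical"
    raise TypeError(f"Unsupported feature value: {value!r}")


kinds = [feature_kind(v) for v in (False, 2125.0, 60, True)]
# True is also an int, which is why the order of the checks matters
bool_is_int = isinstance(True, int)
```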

We can confirm now, using the newly created PredictionProvider model directly, that this input will lead to a PaidLoan=false prediction:

from trustyai.utils import toJList
'Output{value=false, type=boolean, score=0.7835956811904907, name='PaidLoan'}'

Unconstrained basic search

To get started we will search for a counterfactual with no constraints at all. This is not a realistic use case, but we will use it as a baseline.

n_features = len(features)
constraints = [False] * n_features

We will also create a set of equal bounds for all the features. Again, this is not realistic, but, as mentioned, we do it to establish a baseline. Note that boolean features will ignore the bounds anyway (since they only have two possible values), so we can just create a set such as:

features_boundaries = [NumericalFeatureDomain.create(0.0, 10000.0)] * n_features

Next, we create a termination criterion for the search. We will use a 10-second time limit and instantiate a new counterfactual explainer:

termination_config = TerminationConfig().withSecondsSpentLimit(Long.valueOf(10))
solver_config = (
explainer = CounterfactualExplainer(solver_config)

We want our goal to be the model predicting that the loan will be paid (PaidLoan=true), so we specify it as:

goal = [Output("PaidLoan", Type.BOOLEAN, Value(True), 0.0)]

As before, we will wrap all this context in a CounterfactualPrediction object and search for a counterfactual. Then, we will confirm that our counterfactual changes the outcome, by predicting its outcome using the model:

'Output{value=true, type=boolean, score=0.6006738543510437, name='PaidLoan'}'

And indeed it changes. We will now verify which features were changed:

def show_changes(explanation, original):
   entities = explanation.getEntities()
   N = len(original)
   for i in range(N):
       name = original[i].getName()
       original_value = original[i].getValue()
       new_value = entities[i].asFeature().getValue()
       if original_value != new_value:
           print(f"Feature '{name}': {original_value} -> {new_value}")
show_changes(explanation, features)
Feature 'IncomeFromSocialWelfare': 53.0 -> 53.31125429433703
Feature 'RefinanceLiabilities': 6.0 -> 1.230474777192958
Feature 'PreviousEarlyRepaymentsCountBeforeLoan': 0.0 -> 6.0
Feature 'Owner': false -> true
Feature 'Owner_with_encumbrance': true -> false

Here we can see the problem with the unconstrained search.

Some of the features that were changed (e.g. IncomeFromSocialWelfare, RefinanceLiabilities, etc.) might be infeasible to change in practice. To address this, we should refine some of the initial counterfactual settings, namely the constraints and the search domain.

Constrained search

We will now try a more realistic search, which incorporates domain-specific knowledge (and common sense).

To do so, we will constrain the features that we feel shouldn’t (or mustn’t) change and specify sensible search bounds. We will start with the constraints:

constraints = [
   True,  # NewCreditCustomer
   False,  # Amount
   True,  # Interest
   False,  # LoanDuration
   True,  # Education
   True,  # NrOfDependants
   False,  # EmploymentDurationCurrentEmployer
   False,  # IncomeFromPrincipalEmployer
   False,  # IncomeFromPension
   False,  # IncomeFromFamilyAllowance
   False,  # IncomeFromSocialWelfare
   False,  # IncomeFromLeavePay
   False,  # IncomeFromChildSupport
   False,  # IncomeOther
   True,  # ExistingLiabilities
   True,  # RefinanceLiabilities
   False,  # DebtToIncome
   False,  # FreeCash
   False,  # CreditScoreEeMini
   True,  # NoOfPreviousLoansBeforeLoan
   True,  # AmountOfPreviousLoansBeforeLoan
   True,  # PreviousRepaymentsBeforeLoan
   True,  # PreviousEarlyRepaymentsBefoleLoan
   True,  # PreviousEarlyRepaymentsCountBeforeLoan
   False,  # Council_house
   False,  # Homeless
   False,  # Joint_ownership
   False,  # Joint_tenant
   False,  # Living_with_parents
   False,  # Mortgage
   False,  # Other
   False,  # Owner
   False,  # Owner_with_encumbrance
   False,  # Tenant
   False,  # Entrepreneur
   False,  # Fully
   False,  # Partially
   False,  # Retiree
   False,  # Self_employed
]

The constraints should be self-explanatory but, in essence, the features fall into three groups. They can be attributes that:

  • you cannot or should not change (protected), for instance age, education level, etc.
  • you can change, for instance loan duration, loan amount, etc.
  • you probably won’t be able to change, but that might be informative to vary. For instance, you might not be able to easily change your income, but you might be interested in knowing how much it would need to be in order to obtain a favourable prediction.

We then specify the search bounds, using None for the constrained and boolean features:

features_boundaries = [
   None,  # NewCreditCustomer
   NumericalFeatureDomain.create(0.0, 1000.0),  # Amount
   None,  # Interest
   NumericalFeatureDomain.create(0.0, 120.0),  # LoanDuration
   None,  # Education
   None,  # NrOfDependants
   NumericalFeatureDomain.create(0.0, 40.0),  # EmploymentDurationCurrentEmployer
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromPrincipalEmployer
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromPension
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromFamilyAllowance
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromSocialWelfare
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromLeavePay
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromChildSupport
   NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeOther
   None,  # ExistingLiabilities
   None,  # RefinanceLiabilities
   NumericalFeatureDomain.create(0.0, 100.0),  # DebtToIncome
   NumericalFeatureDomain.create(0.0, 100.0),  # FreeCash
   NumericalFeatureDomain.create(0.0, 10000.0),  # CreditScoreEeMini
   None,  # NoOfPreviousLoansBeforeLoan
   None,  # AmountOfPreviousLoansBeforeLoan
   None,  # PreviousRepaymentsBeforeLoan
   None,  # PreviousEarlyRepaymentsBefoleLoan
   None,  # PreviousEarlyRepaymentsCountBeforeLoan
   None,  # Council_house
   None,  # Homeless
   None,  # Joint_ownership
   None,  # Joint_tenant
   None,  # Living_with_parents
   None,  # Mortgage
   None,  # Other
   None,  # Owner
   None,  # Owner_with_encumbrance
   None,  # Tenant
   None,  # Entrepreneur
   None,  # Fully
   None,  # Partially
   None,  # Retiree
   None,  # Self_employed
]
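Long positional lists like these are easy to get out of step with the feature list. One possible safeguard, sketched below with hypothetical names and only four features, is to declare the constraints and ranges by feature name and derive the positional lists from them. Plain tuples stand in for the numerical domains here.

```python
feature_names = ["NewCreditCustomer", "Amount", "Interest", "LoanDuration"]

# Features that must keep their original value
frozen = {"NewCreditCustomer", "Interest"}
# Search ranges for the features that are free to change
ranges = {"Amount": (0.0, 1000.0), "LoanDuration": (0.0, 120.0)}

# Fail early on misspelt names instead of silently ignoring them
unknown = (frozen | set(ranges)) - set(feature_names)
if unknown:
    raise ValueError(f"Unknown feature names: {sorted(unknown)}")

# Positional lists in the same order as feature_names
constraints = [name in frozen for name in feature_names]
boundaries = [ranges.get(name) for name in feature_names]
```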

As before, we wrap this data in a CounterfactualPrediction, start a new search, and test that the counterfactual does change the outcome:

'Output{value=true, type=boolean, score=0.5038489103317261, name='PaidLoan'}'

And we confirm that only unconstrained features were changed:

show_changes(explanation, features)
Feature 'LoanDuration': 60.0 -> 56.947228037333545
Feature 'IncomeFromSocialWelfare': 53.0 -> 59.6876474017064
Feature 'FreeCash': 10.92 -> 10.914352713171315

Minimum counterfactual probabilities

We can see that the previous counterfactual, although it had the desired outcome, had an outcome probability close to 50%. We might want a higher "confidence" in a counterfactual’s outcome.

With TrustyAI we have the possibility to specify a minimum probability for the result (when the model supports prediction confidences).

Let’s say we want a result that is at least 75% confident that the loan will be repaid. We can just encode the minimum probability as the last argument of each Output (the desired "confidence"). A minimum probability of 0 (as we’ve used) simply means that any desired outcome will be accepted, regardless of its probability.

goal = [Output("PaidLoan", Type.BOOLEAN, Value(True), 0.75)]

We can then re-run the search with all the data as defined previously and check that the outcome is what we are looking for:

'Output{value=true, type=boolean, score=0.7572674751281738, name='PaidLoan'}'

And indeed, this time the counterfactual will lead to an outcome with confidence as we specified.
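The effect of the minimum probability can be expressed as a simple acceptance rule. The helper below is a hypothetical sketch of that rule, applied to the scores reported in this post; whether the library compares with >= or > is an assumption.

```python
def meets_goal(value, score, goal_value, min_probability):
    """True when the outcome matches the goal and is confident enough."""
    return value == goal_value and score >= min_probability


# Score from the earlier constrained search, checked against a 0.75 threshold
first = meets_goal(True, 0.5038489103317261, True, 0.75)
# Score from the search that requested at least 0.75
second = meets_goal(True, 0.7572674751281738, True, 0.75)
# A threshold of 0.0 accepts any matching outcome
any_probability = meets_goal(True, 0.5038489103317261, True, 0.0)
```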

And we show which features need to be changed for the desired outcome:

show_changes(explanation, features)
Feature 'LoanDuration': 60.0 -> 14.899149688096976
Feature 'EmploymentDurationCurrentEmployer': 6.0 -> 5.8223107382429395
Feature 'FreeCash': 10.92 -> 10.942602612323316
Feature 'Joint_ownership': false -> true

This concludes the introduction on using TrustyAI’s explainability library from Python as well as some counterfactual basics.

By using these Python bindings, we can easily integrate these features with the available Python tools and ecosystem, such as interactive notebooks and plotting libraries.

Happy coding!
