Concrete ML XGBRegressor results differ from XGBoost

First of all, some background information:

I want to build an anomaly detector. Since it is not yet possible to use IsolationForest in Concrete ML, I use the following pipeline:

  1. Generating anomaly scores for my training dataset (X) with a scikit-learn IsolationForest, using decision_function
  2. Training a Concrete ML XGBRegressor (in the clear, so no FHE) on the training data and the generated anomaly scores (y); see the sketch after this list
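
Roughly, this is what the pipeline looks like. This is a minimal sketch: the toy dataset and the hyperparameters are just illustrations, not my exact setup.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest
from xgboost import XGBRegressor
from concrete.ml.sklearn import XGBRegressor as ConcreteXGBRegressor

# Toy data standing in for my real training set X (illustrative only)
X, _ = make_blobs(n_samples=1000, centers=1, random_state=42)

# Step 1: anomaly scores from scikit-learn's IsolationForest
iso = IsolationForest(random_state=42).fit(X)
y = iso.decision_function(X)  # small values, roughly [-0.25, 0.3] in my case

# Step 2: fit both regressors with identical (illustrative) hyperparameters.
# Note that Concrete ML quantizes the model during fit (default n_bits).
params = dict(n_estimators=50, max_depth=4)
xgb_reg = XGBRegressor(**params).fit(X, y)
cml_reg = ConcreteXGBRegressor(**params).fit(X, y)

# Compare prediction deltas against the IsolationForest scores.
# Concrete ML predicts in the clear by default (fhe="disable"), no FHE involved.
print("XGBoost mean delta:    ", np.mean(xgb_reg.predict(X).ravel() - y))
print("Concrete ML mean delta:", np.mean(cml_reg.predict(X).ravel() - y))
```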

Now I want to compare the prediction delta (difference) between the IsolationForest scores and the Concrete ML XGBRegressor predictions. While this works more or less fine for the regressor from the original XGBoost library, I observe a constant right shift for the Concrete ML one.

[Figure "no_scaling": score distributions without scaling; the Concrete ML predictions are shifted right]

I made sure to use the same hyperparameters and checked that both regressors are based on the same XGBoost version (1.6.2). Since the anomaly scores from the IsolationForest are quite small (min = -0.25, max = 0.3), I already tried scaling them, which mostly solves the distribution shift; one way to do this is sketched below.
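
Roughly what I mean by scaling, continuing the sketch above (the target range here is just an illustration, not my exact choice):

```python
from sklearn.preprocessing import MinMaxScaler

# Stretch the narrow score range before fitting the regressor
scaler = MinMaxScaler(feature_range=(-1, 1))
y_scaled = scaler.fit_transform(y.reshape(-1, 1)).ravel()

cml_reg_scaled = ConcreteXGBRegressor(**params).fit(X, y_scaled)

# Undo the scaling after prediction to compare on the original score scale
preds = scaler.inverse_transform(
    cml_reg_scaled.predict(X).reshape(-1, 1)
).ravel()
print("Concrete ML (scaled) mean delta:", np.mean(preds - y))
```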

[Figure "with_scaling": score distributions after scaling the targets; the shift largely disappears]

Anyway, I would like to understand why this is needed and would be happy about any input :slight_smile:

Hello Islk,

Thanks for getting in touch with us!
We have run a few tests and observed the same behavior.
An issue has been opened. We will let you know as soon as it is resolved.

Thanks!


Hi Islk,

The issue has been fixed in Concrete ML 1.2.1!

Thanks!


Thanks for the information!