ValueError when running Concrete XGBoost: onnx.ModelProto exceeds maximum protobuf size of 2GB

I have this code; my Concrete-ML version is 1.1.0.


```python
enc_model = ConcreteXGBClassifier(
    max_depth=12, n_estimators=120, n_jobs=-1, random_state=15
).fit(X_train, y_train)
```

and encountered this error:

```
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2865550695
```


> 
> --------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> Cell In[73], line 2
>       1 t_start = time()
> ----> 2 enc_model = ConcreteXGBClassifier( max_depth = 12, n_estimators=120 , n_jobs=-1, random_state=15).fit(X_train, y_train)
>       3 t_end = time()
>       4 print(f"Model fitting to training data took {int(t_end - t_start)} seconds")
> 
> File ~/miniconda3/envs/jupyterlab/lib/python3.10/site-packages/concrete/ml/sklearn/base.py:702, in BaseClassifier.fit(self, X, y, **fit_parameters)
>     698 assert_true(self.n_classes_ > 1, "You must provide at least 2 classes in y.")
>     700 # Change to composition in order to avoid diamond inheritance and indirect super() calls
>     701 # FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/3249
> --> 702 return super().fit(X, y, **fit_parameters)
> 
> File ~/miniconda3/envs/jupyterlab/lib/python3.10/site-packages/concrete/ml/sklearn/base.py:1264, in BaseTreeEstimatorMixin.fit(self, X, y, **fit_parameters)
>    1261 assert self.sklearn_model is not None, self._sklearn_model_is_not_fitted_error_message()
>    1263 # Convert the tree inference with Numpy operators
> -> 1264 self._tree_inference, self.output_quantizers, self.onnx_model_ = tree_to_numpy(
>    1265     self.sklearn_model,
>    1266     q_X[:1],
>    1267     framework=self.framework,
>    1268     output_n_bits=self.n_bits,
>    1269 )
>    1271 self._is_fitted = True
>    1273 return self
> 
> File ~/miniconda3/envs/jupyterlab/lib/python3.10/site-packages/concrete/ml/sklearn/tree_to_numpy.py:294, in tree_to_numpy(model, x, framework, output_n_bits)
>     289 # Tree values pre-processing
>     290 # i.e., mainly predictions quantization
>     291 # but also rounding the threshold such that they are now integers
>     292 q_y = tree_values_preprocessing(onnx_model, framework, output_n_bits)
> --> 294 _tree_inference = get_equivalent_numpy_forward(onnx_model)
>     296 return (_tree_inference, [q_y.quantizer], onnx_model)
> 
> File ~/miniconda3/envs/jupyterlab/lib/python3.10/site-packages/concrete/ml/onnx/convert.py:88, in get_equivalent_numpy_forward(onnx_model, check_model)
>      71 """Get the numpy equivalent forward of the provided ONNX model.
>      72 
>      73 Args:
>    (...)
>      85         the equivalent numpy function.
>      86 """
>      87 if check_model:
> ---> 88     checker.check_model(onnx_model)
>      89 required_onnx_operators = set(get_op_type(node) for node in onnx_model.graph.node)
>      90 unsupported_operators = required_onnx_operators - IMPLEMENTED_ONNX_OPS
> 
> File ~/miniconda3/envs/jupyterlab/lib/python3.10/site-packages/onnx/checker.py:111, in check_model(model, full_check)
>     108     C.check_model_path(model, full_check)
>     109 else:
>     110     protobuf_string = (
> --> 111         model if isinstance(model, bytes) else model.SerializeToString()
>     112     )
>     113     # If the protobuf is larger than 2GB,
>     114     # remind users should use the model path to check
>     115     if sys.getsizeof(protobuf_string) > MAXIMUM_PROTOBUF:
> 
> ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2865550695

How can I fix this? Thank you.

Hello @bluetail14,
This looks like an error in a check that we run on the ONNX graph representation of the models.
Could you please share the shape of the data you are using as input?

x_train has shape: (53760, 60)
y_train has shape: (53760,)
x_test has shape: (13441, 60)
y_test has shape: (13441,)

Hello @bluetail14, thanks for your answer. I was able to reproduce your issue.
It seems to be caused by a limitation of protobuf, which is used to serialize ONNX models.
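For reference, protobuf messages are hard-capped at around 2 GB, and the size reported in your traceback is already well past that. A quick sanity check on the numbers (the exact cap constant below is an assumption; `onnx.checker` may use a slightly different value):

```python
# Quick check of the numbers from the traceback. The 2 GiB cap below is an
# assumption about protobuf's hard limit; onnx.checker may use a slightly
# different constant internally.
PROTOBUF_CAP_BYTES = 2 * 1024 ** 3   # assumed ~2 GiB protobuf hard limit
reported_size = 2_865_550_695        # bytes, from the ValueError message

print(f"{reported_size / 1024 ** 3:.2f} GiB")  # about 2.67 GiB
print(reported_size > PROTOBUF_CAP_BYTES)      # True, so the check raises
```

So the serialized model is roughly 2.67 GiB, about a third larger than what protobuf can represent.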

I'll investigate whether there is a way to circumvent this with a workaround.

In the meantime, I would advise reducing the number of estimators and the max depth you are using.
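To see why `max_depth` matters so much more than `n_estimators`, here is a rough back-of-envelope estimate. This is purely illustrative (it assumes fully grown binary trees, and the real ONNX/Concrete-ML representation differs), but the scaling is the point: node counts grow exponentially with depth and only linearly with the number of trees.

```python
# Illustrative estimate only: assumes every tree is a fully grown binary
# tree of the given depth. A tree of depth d has 2**d leaves and
# 2**d - 1 internal nodes.
def node_counts(max_depth, n_estimators):
    leaves_per_tree = 2 ** max_depth
    internal_per_tree = 2 ** max_depth - 1
    return n_estimators * leaves_per_tree, n_estimators * internal_per_tree

# Your settings: depth 12, 120 trees
print(node_counts(12, 120))   # (491520, 491400)
# Same ensemble size at depth 6: 2**6 = 64 times fewer leaves
print(node_counts(6, 120))    # (7680, 7560)
```

Dropping `max_depth` from 12 to 6 shrinks the leaf count by a factor of 64, which should bring the serialized model comfortably under the 2 GB limit even with all 120 estimators.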
