Load an ONNX model in an FHE circuit

Hi,

I have an ONNX model that I would like to load and then run inference on inside an FHE circuit with concrete-python.

from concrete import fhe
import numpy
import onnx
import onnxruntime
from concrete.ml.torch.compile import compile_onnx_model


num_inputs = 1
input_shape = (3, 32, 32)
input_set = numpy.random.uniform(-100, 100, size=(10, *input_shape))
x_test = tuple(numpy.random.uniform(-100, 100, size=(1, *input_shape)) for _ in range(num_inputs))


model = onnx.load("vgg_block.onnx")

onnx.checker.check_model(model)

quantized_module = compile_onnx_model(model, input_set, n_bits=2)
quantized_module.forward(*x_test, fhe="simulate")
print()
print("Simulation ok")
print()

from concrete import fhe

@fhe.compiler({"x":"encrypted"})
def f(x):
    y = quantized_module.forward(x)
    # doing some stuff 
    # ...

    return y


cfg = fhe.Configuration(show_graph=True, enable_unsafe_features=True)
circuit = f.compile(input_set, configuration=cfg)

simulation = circuit.simulate(input_set[0])

Is there a way to do it? I know there is a way for LinearRegression from this post

I am currently using concrete-ml==1.0.1 and concrete-python==1.0.0, but I am open to earlier versions if there is a workaround.

Thanks!

Hello @tricycl3 ,

Glad to see that you are exploring the capabilities of Concrete ML further!
Using the latest version of Concrete ML is recommended.

There is indeed a way to compile an ONNX graph and run it in FHE (link to the documentation), but you might encounter errors if the ONNX graph you are using includes unsupported operators.
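For instance, a quick way to spot potentially unsupported operators ahead of time is to list the distinct op types in your graph (a minimal sketch using the standard onnx API, assuming model is the loaded onnx.ModelProto):

import onnx

model = onnx.load("vgg_block.onnx")
# Print the distinct ONNX operator types used by the graph
print(sorted({node.op_type for node in model.graph.node}))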

What issues are you currently facing?

Be aware that your model will be quantized when compiled.

By the way, we have multiple examples of running a Quantization Aware Trained VGG for CIFAR-10 image classification in FHE in our use-case-examples!

Hi!

Currently this code gives the error AttributeError: 'Tracer' object has no attribute 'dtype'; I suppose this is because the module is already a compiled graph.
What I would like to do is access only the (quantized) model and then be able to use it in a function f of my liking with concrete-python or concrete-numpy, for example calling lookup tables first and then applying the DNN on the result.

I know it was possible with the concrete LinearRegression with this kind of functions :

@cnp.compiler({"x": "encrypted"})
def f(x):
    y = table[x]
    return lr.quantized_module_._forward(y)

So is there a similar setting for an onnx model?

Hello @tricycl3 ,

Interesting use case! I think what you want to do is indeed retrieve a QuantizedModule object without having to compile it (which compile_onnx_model obviously does). You should therefore be able to do so as follows; these are basically the steps done in compile_onnx_model before compilation:

from concrete.ml.torch import NumpyModule
from concrete.ml.quantization import PostTrainingAffineQuantization
import torch

dummy_input_for_tracing = torch.from_numpy(input_set[[0], ::]).float()

numpy_module = NumpyModule(model, dummy_input_for_tracing)
post_training_quant = PostTrainingAffineQuantization(n_bits=2, numpy_model=numpy_module)
quantized_module = post_training_quant.quantize_module(input_set)

Then you’ll be able to call quantized_module.quantized_forward in your f function. If this does not work, you can try quantized_module._clear_forward.

Besides, here I used PostTrainingAffineQuantization since I believe you are trying to run your model using Post-Training Quantization (PTQ). If your model has been trained using Quantization Aware Training (QAT), you should use PostTrainingQATImporter instead.
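For the QAT case, the setup would look like this (a sketch, assuming PostTrainingQATImporter is importable from concrete.ml.quantization like PostTrainingAffineQuantization and takes the same arguments; double-check against your Concrete ML version):

from concrete.ml.quantization import PostTrainingQATImporter

# Only the importer class changes; the quantization steps stay the same
post_training_quant = PostTrainingQATImporter(n_bits=2, numpy_model=numpy_module)
quantized_module = post_training_quant.quantize_module(input_set)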

Tell me if this works!

Thanks for the fast answer!

The _clear_forward method seems to work; I only have dimension issues.

My input set is defined like this:

num_inputs = 1
input_shape = (3, 32, 32)
input_set = numpy.random.uniform(-100, 100, size=(10, *input_shape))
x_test = tuple(numpy.random.uniform(-100, 100, size=(1, *input_shape)) for _ in range(num_inputs))

Then I am trying to access only the quantized module, but with the right dimensions:

dummy_input_for_tracing = torch.from_numpy(input_set[[0], ::]).float()
print(input_set[[0], ::].shape)
numpy_module = NumpyModule(quantized, dummy_input_for_tracing)
post_training_quant = PostTrainingAffineQuantization(n_bits=2, numpy_model=numpy_module)
quantized_module = post_training_quant.quantize_module(input_set)

#quantized_module = compile_onnx_model(quantized, input_set, n_bits=2)
quantized_module.forward(*x_test)#, fhe="simulate")
print(quantized_module)
print("quantized_module ok")
print()

import numpy as np
from concrete import fhe

@fhe.compiler({"x":"encrypted"})
def f(x):
    print(x.shape)
    x = np.expand_dims(x, axis=0)
    print(x.shape)
    return quantized_module._clear_forward(x)


cfg = fhe.Configuration(show_graph=True, enable_unsafe_features=True)
circuit = f.compile(input_set, configuration=cfg)

simulation = circuit.simulate(*x_test)

But this gives me the following error: RuntimeError: A subgraph within the function you are trying to compile cannot be fused because of a node, which is marked explicitly as non-fusable

With the following traceback:

 %0 = x                                                                                                                # EncryptedTensor<float64, shape=(3, 32, 32)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ within this subgraph
                                                                                                                                                                     VGG.py:176
 %1 = expand_dims(%0, axis=0)                                                                                          # EncryptedTensor<float64, shape=(1, 3, 32, 32)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ within this subgraph
                                                                                                                                                                        VGG.py:170
 %2 = ones()                                                                                                           # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
 %3 = 0                                                                                                                # ClearScalar<uint1>
 %4 = multiply(%2, %3)                                                                                                 # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ with this input node
                                                                                                                                                                      /home/usr/Documents/project/venv_fhe/lib/python3.8/site-packages/concrete/ml/onnx/onnx_impl_utils.py:51
 %5 = (%4[:, :, 1:33, 1:33] = %1)                                                                                      # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this node is not fusable
                                                                                                                                                                      /home/usr/Documents/project/venv_fhe/lib/python3.8/site-packages/concrete/ml/onnx/onnx_impl_utils.py:61
 %6 = [[[[-1  0  ...   0  0]]]]                                                                                        # ClearTensor<int2, shape=(64, 3, 3, 3)>                @ /0/Conv.conv
 %7 = conv2d(%5, %6, [0 0 0 0 0 ... 0 0 0 0 0], pads=[0, 0, 0, 0], strides=(1, 1), dilations=(1, 1), group=1)          # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>         @ /0/Conv.conv
 %8 = subgraph(%7)                                                                                                     # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>
 %9 = ones()                                                                                                           # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
%10 = 0                                                                                                                # ClearScalar<uint1>
%11 = multiply(%9, %10)                                                                                                # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
%12 = (%11[:, :, 1:33, 1:33] = %8)                                                                                     # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
%13 = [[[[0 0 0] ... [0 0 0]]]]                                                                                        # ClearTensor<int2, shape=(64, 64, 3, 3)>               @ /2/Conv.conv
%14 = conv2d(%12, %13, [0 0 0 0 0 ... 0 0 0 0 0], pads=[0, 0, 0, 0], strides=(1, 1), dilations=(1, 1), group=1)        # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>         @ /2/Conv.conv
%15 = subgraph(%14)                                                                                                    # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>
%16 = maxpool2d(%15, kernel_shape=(3, 3), strides=(3, 3), pads=(0, 0, 0, 0), dilations=(1, 1), ceil_mode=False)        # EncryptedTensor<uint1, shape=(1, 64, 10, 10)>
%17 = subgraph(%16)                                                                                                    # EncryptedTensor<uint1, shape=(1, 64, 10, 10)>
return %17

I added the np.expand_dims to make x of shape (1, 3, 32, 32) instead of (3, 32, 32), which previously raised the error AssertionError: Expected number of channels in weight to be 32.0 (C / group). Got 3.

Is there a method that I am not aware of to expand the dims of an encrypted array?

Could you try to give (dummy_input_for_tracing,) (a tuple of a single element) instead of dummy_input_for_tracing in the NumpyModule? I might have made a mistake here; usually such dimension issues come from the inputset having the wrong shape, so hopefully that saves you from calling the expand_dims function.

Actually, this should not help, as a NumpyModule only requires a dummy_input_for_tracing when considering a torch model, not an ONNX one (you can actually pass any array to it, it should not impact anything). I will get back to you when I know more about it!

You could try to use reshaped_inputset = (numpy.expand_dims(input_val, 0) for input_val in input_set) as an input_set, then circuit = f.compile(reshaped_inputset, configuration=cfg)! By the way, you don't need to set enable_unsafe_features=True to call simulate anymore :slightly_smiling_face:
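To spell it out (a short sketch of the suggestion above; note that f should then drop its own expand_dims call, since the inputs already carry the batch dimension):

import numpy

# Add the batch dimension outside the circuit, one sample at a time
reshaped_inputset = (numpy.expand_dims(input_val, 0) for input_val in input_set)
circuit = f.compile(reshaped_inputset, configuration=cfg)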

If this still does not work, would it be possible to obtain your ONNX graph ?

More importantly, you will have to quantize your input_set and test inputs before giving them to the compile method. There is no such need in Concrete-ML, since the quantization and reshaping steps are wrapped together in the quantized model's related methods. This is not the case for Concrete-Python: only integers should be used for both compilation and inference!

Let me know how it goes

I tried with the reshaped input set but no luck

This time the traceback is:

%0 = x                                                                                                                # EncryptedTensor<float64, shape=(1, 3, 32, 32)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ within this subgraph
                                                                                                                                                                        VGG.py:177
 %1 = ones()                                                                                                           # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
 %2 = 0                                                                                                                # ClearScalar<uint1>
 %3 = multiply(%1, %2)                                                                                                 # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ with this input node
                                                                                                                                                                      /home/usr/Documents/project/venv_fhe/lib/python3.8/site-packages/concrete/ml/onnx/onnx_impl_utils.py:51
 %4 = (%3[:, :, 1:33, 1:33] = %0)                                                                                      # EncryptedTensor<uint1, shape=(1, 3, 34, 34)>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this node is not fusable
                                                                                                                                                                      /home/usr/Documents/project/venv_fhe/lib/python3.8/site-packages/concrete/ml/onnx/onnx_impl_utils.py:61
 %5 = [[[[-1  0  ...   0  0]]]]                                                                                        # ClearTensor<int2, shape=(64, 3, 3, 3)>                @ /0/Conv.conv
 %6 = conv2d(%4, %5, [0 0 0 0 0 ... 0 0 0 0 0], pads=[0, 0, 0, 0], strides=(1, 1), dilations=(1, 1), group=1)          # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>         @ /0/Conv.conv
 %7 = subgraph(%6)                                                                                                     # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>
 %8 = ones()                                                                                                           # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
 %9 = 0                                                                                                                # ClearScalar<uint1>
%10 = multiply(%8, %9)                                                                                                 # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
%11 = (%10[:, :, 1:33, 1:33] = %7)                                                                                     # EncryptedTensor<uint1, shape=(1, 64, 34, 34)>
%12 = [[[[0 0 0] ... [0 0 0]]]]                                                                                        # ClearTensor<int2, shape=(64, 64, 3, 3)>               @ /2/Conv.conv
%13 = conv2d(%11, %12, [0 0 0 0 0 ... 0 0 0 0 0], pads=[0, 0, 0, 0], strides=(1, 1), dilations=(1, 1), group=1)        # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>         @ /2/Conv.conv
%14 = subgraph(%13)                                                                                                    # EncryptedTensor<uint1, shape=(1, 64, 32, 32)>
%15 = maxpool2d(%14, kernel_shape=(3, 3), strides=(3, 3), pads=(0, 0, 0, 0), dilations=(1, 1), ceil_mode=False)        # EncryptedTensor<uint1, shape=(1, 64, 10, 10)>
%16 = subgraph(%15)                                                                                                    # EncryptedTensor<uint1, shape=(1, 64, 10, 10)>
return %16

My model looks like this:

Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (3): ReLU()
  (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (5): MaxPool2d(kernel_size=3, stride=3, padding=0, dilation=1, ceil_mode=False)
  (tanh): CustomActivation()
)

I am planning to replace the MaxPool with an AvgPool. My tanh is just a custom activation to put everything in [0, 1]; I will use a threshold afterwards to only have binary values:

class CustomActivation(nn.Module):
    def forward(self, x):
        return (torch.tanh(x) + 1) / 2
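For reference, the thresholding step I mention would be something like this (the 0.5 cutoff is a hypothetical illustration):

def binarize(x, threshold=0.5):
    # Map activations in [0, 1] to binary values
    return (x > threshold).float()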

Maybe the error comes from there, but I would have expected a different error in that case?

I can send you the ONNX file, how would you like it?

Thanks for the additional info! We indeed do not support MaxPool yet (it should be available in the future), so using an AvgPool is the best alternative. Also, you are right, I don't think the issue lies in the custom activation, as the error would have been different.

What I want to investigate is whether the channel order is properly handled in the conv operator. ONNX (and PyTorch) considers a channel-first format (NCHW), and I wonder if your model follows that convention. If not, that could explain it. Otherwise, it might get trickier.
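To check this, you can inspect the input shape recorded in the ONNX graph itself (a small sketch using the standard onnx API; for an NCHW model on CIFAR-10 you would expect something like [1, 3, 32, 32]):

import onnx

model = onnx.load("vgg_block.onnx")
for inp in model.graph.input:
    # dim_value is 0 for symbolic/dynamic dimensions
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)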

Besides, if you could send me a simple git repo with the code as well as the ONNX graph, that would be great!

Sure! Here is a temp repo that I have done:

https://github.com/tguerand/temp_fhe

I can add you as a collaborator if you need to.

Thank you very much! I will take a look and get back to you when I have more information about your issue.

Hello again,
So the error now comes from the fact that your inputset is composed of floating-point values, while Concrete-Python only accepts integers!

As stated above, there is no such need in Concrete-ML since these quantization and reshaping steps are wrapped together in the quantized model’s related methods. This is not the case for Concrete-Python, and only integers should be used for both compilation and inference.

You should therefore be able to get your code running by using the following instead :

    q_input_set = quantized_module.quantize_input(input_set)

    q_reshaped_inputset = (numpy.expand_dims(input_val, 0) for input_val in q_input_set)
    cfg = fhe.Configuration(show_graph=True)
    circuit = f.compile(q_reshaped_inputset, configuration=cfg)

    q_x_test = quantized_module.quantize_input(*x_test)

    simulation = circuit.simulate(q_x_test)

If you add some steps before the QuantizedModule call in the function to compile, you might not want to use quantized_module.quantize_input but your own quantization function instead, as the scales might differ.
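Such a custom quantization function could be as simple as the following sketch (hypothetical affine quantization, not Concrete ML's API; adapt the scale computation to your data range):

import numpy

def quantize(x, n_bits=2):
    # Affine quantization of x to unsigned n_bits integers
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min) / (2**n_bits - 1)
    return numpy.rint((x - x_min) / scale).astype(numpy.int64)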

Besides, I agree that the error message you got is a bit misleading, and I will see what Concrete-Python can do about it!

Hope that helps :slight_smile:


:saluting_face:

You are totally right, I forgot about this step, thank you!

I'll summarize it for everyone here:


# imports
import time

import numpy as np
import torch
import torch.nn as nn
import onnx
from concrete.ml.torch import NumpyModule
from concrete.ml.quantization import PostTrainingAffineQuantization
from concrete import fhe

# load your model

model = []

model.append(nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=True))
model.append(nn.Mish())
model.append(nn.BatchNorm2d(64))
model.append(nn.AvgPool2d(3))
model.append(nn.Tanh())
model = nn.Sequential(*model)
  
model.load_state_dict(torch.load("model.h5")['model_state_dict'])

# save your onnx graph
num_inputs = 1
input_shape = (3, 32, 32)
input_set = np.random.uniform(-100, 100, size=(10, *input_shape))
x_test = tuple(np.random.uniform(-100, 100, size=(1, *input_shape)) for _ in range(num_inputs))
dummy_input = torch.randn(input_shape)

torch.onnx.export(model, dummy_input, 'model.onnx')
quantized = onnx.load("model.onnx")  

# quantize your model post training

dummy_input_for_tracing = torch.from_numpy(input_set[[0], ::]).float()
numpy_module = NumpyModule(quantized, dummy_input_for_tracing)
post_training_quant = PostTrainingAffineQuantization(n_bits=2, numpy_model=numpy_module)
quantized_module = post_training_quant.quantize_module(input_set)
q_input_set = quantized_module.quantize_input(input_set)

# quantize your inputs
q_x_test = quantized_module.quantize_input(*x_test)

# define your function
@fhe.compiler({"x":"encrypted"})
def f(x):
    y = quantized_module._clear_forward(x)
    # your code then here
    # ....
    return y


# compile everything
reshaped_inputset = (np.expand_dims(input_val, 0) for input_val in q_input_set)
cfg = fhe.Configuration(show_graph=True)
print("compiling")
t = time.time()
circuit = f.compile(reshaped_inputset, configuration=cfg)
print('compiled in ', time.time() - t)
t = time.time()
circuit.keygen()
print("Keygen done in ", time.time() - t)

print(f"Encryption keys: {circuit.size_of_secret_keys} bytes")
print(f"Evaluation keys: {circuit.size_of_bootstrap_keys + circuit.size_of_keyswitch_keys} bytes")
print(f"Inputs: {circuit.size_of_inputs} bytes")
print(f"Outputs: {circuit.size_of_outputs} bytes")

enc = circuit.encrypt(q_x_test)

out = circuit.run(enc)
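# Finally, decrypt to recover the clear result (circuit.decrypt is
# concrete-python's standard counterpart to encrypt/run)
result = circuit.decrypt(out)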

And it works! It could take some time depending on your model.
