ERROR in running hybrid model example

Hello, I’m trying the hybrid model example, but while testing inference with the GPT-2 model I get a deserialization error. Concrete ML is installed via Docker, and the output is as follows:

Using device: cpu
Number of tokens:

Prompt:

**********
**********
8 tokens in 'Computations on encrypted data can help'
**********
**********
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Computations on encrypted data can Traceback (most recent call last):
  File "infer_hybrid_llm_generate.py", line 78, in <module>
    output_ids = model.generate(
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation/utils.py", line 1479, in generate
    return self.greedy_search(
  File "/usr/local/lib/python3.8/dist-packages/transformers/generation/utils.py", line 2340, in greedy_search
    outputs = self(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 1074, in forward
    transformer_outputs = self.transformer(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 888, in forward
    outputs = block(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 390, in forward
    attn_outputs = self.attn(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 312, in forward
    query, key, value = self.c_attn(hidden_states).split(self.split_size, dim=2)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/concrete/ml/torch/hybrid_model.py", line 269, in forward
    y = self.remote_call(x)
  File "/usr/local/lib/python3.8/dist-packages/concrete/ml/torch/hybrid_model.py", line 332, in remote_call
    decrypted_prediction = client.deserialize_decrypt_dequantize(encrypted_result)[0]
  File "/usr/local/lib/python3.8/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 353, in deserialize_decrypt_dequantize
    deserialized_decrypted_quantized_result = self.deserialize_decrypt(
  File "/usr/local/lib/python3.8/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 329, in deserialize_decrypt
    deserialized_encrypted_quantized_result = fhe.Value.deserialize(
  File "/usr/local/lib/python3.8/dist-packages/concrete/fhe/compilation/value.py", line 47, in deserialize
    return Value(NativeValue.deserialize(serialized_data))
  File "/usr/local/lib/python3.8/dist-packages/concrete/compiler/value.py", line 68, in deserialize
    return Value.wrap(_Value.deserialize(serialized_value))
RuntimeError: Failed to deserialize Value

The server seems to work fine; its output is as follows:

INFO:     127.0.0.1:44578 - "GET /list_shapes HTTP/1.1" 200 OK
INFO:     127.0.0.1:44580 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44582 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44584 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44586 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44588 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44590 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44592 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44594 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44596 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44598 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44600 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44602 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44604 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44606 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44608 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44610 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44612 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44614 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44616 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44618 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44620 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44622 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44624 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44626 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44628 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44630 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44632 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44634 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44636 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44638 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44640 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44642 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44644 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44646 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44648 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44650 - "POST /add_key HTTP/1.1" 200 OK
INFO:     127.0.0.1:44652 - "GET /get_client HTTP/1.1" 200 OK
INFO:     127.0.0.1:44654 - "POST /add_key HTTP/1.1" 200 OK
2024-03-12 07:23:02.771 | INFO     | __main__:compute:167 - Reading uploaded data...
2024-03-12 07:23:02.817 | INFO     | __main__:compute:170 - Uploaded data read in 0.04540824890136719 seconds
2024-03-12 07:23:02.817 | INFO     | concrete.ml.torch.hybrid_model:compute:820 - It took 0.0001571178436279297 seconds to load the key
2024-03-12 07:23:02.882 | INFO     | concrete.ml.torch.hybrid_model:compute:826 - It took 0.06443285942077637 seconds to load the circuit
2024-03-12 07:23:09.478 | INFO     | concrete.ml.torch.hybrid_model:compute:836 - fhe inference of input of shape (1, 8, 768) took 6.595885515213013
2024-03-12 07:23:09.478 | INFO     | concrete.ml.torch.hybrid_model:compute:837 - Results size is 178.87525939941406 Mb
INFO:     127.0.0.1:34118 - "POST /compute HTTP/1.1" 200 OK

I have tried both the Docker environment zamafhe/concrete-ml:1.4.1 and a pip install under Python 3.10, but both produce the same error.

Hello @bubb1es00 , thanks for raising this issue to us!

We are investigating it and will come back to you shortly.

Hello @bubb1es00 , sorry for the delay but we are still investigating the issue.

Hello all,
I have been facing a similar issue since yesterday; it started after I had to rebuild my environment and dependencies.
I don't know yet whether it was caused by a new version of Concrete or something else.

    encrypted_prediction = FHEModelServer(self.server_dir).run(
  File "/home/x/.local/share/virtualenvs/project-fhe-sd-ulA48tET/lib/python3.10/site-packages/concrete/ml/deployment/fhe_client_server.py", line 123, in run
    deserialized_encrypted_quantized_data = fhe.Value.deserialize(
  File "/home/x/.local/share/virtualenvs/project-fhe-sd-ulA48tET/lib/python3.10/site-packages/concrete/fhe/compilation/value.py", line 47, in deserialize
    return Value(NativeValue.deserialize(serialized_data))
  File "/home/x/.local/share/virtualenvs/project-fhe-sd-ulA48tET/lib/python3.10/site-packages/concrete/compiler/value.py", line 68, in deserialize
    return Value.wrap(_Value.deserialize(serialized_value))
RuntimeError: Failed to deserialize Value

Hello all, quick update on this.

My colleagues working on Concrete found the issue and have a patch ready for it.
The fix will be available soon as a patched version of both Concrete and Concrete ML.

Thanks for your patience! :pray:

Hello @churdo and @bubb1es00 , could you folks please check the size of the ciphertext that’s being deserialized when the error occurs? :pray:

A simple `sys.getsizeof(serialized_ciphertext)` should do the trick.
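For instance, something along these lines (a minimal sketch; `serialized_ciphertext` is a placeholder for the bytes object that ends up in `fhe.Value.deserialize`, and where exactly you hook it in depends on your setup):

```python
import sys

def report_ciphertext_size(serialized_ciphertext: bytes) -> None:
    """Print the in-memory size of a serialized ciphertext in megabytes.

    Call this on the bytes object right before it is handed to
    fhe.Value.deserialize (for instance on the response body received
    by the client, or on the uploaded data on the server side).
    """
    size_mb = sys.getsizeof(serialized_ciphertext) / 1e6
    print(f"Serialized ciphertext: {size_mb:.1f} MB")
```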

Hello @luis
I got about 70 MB.
I am talking to somebody on Discord as well, and I was asked for the same; perhaps that is you?
As I mentioned there, it seems the issue started after I updated Concrete-ML from 1.3.0 to 1.5.0.
Every dependency in my project was updated, so I cannot be 100% sure yet.

@luis I managed to test with version 1.3.0 and it worked.

So the issue is with version 1.5.0.

Hello @churdo , we have a fix coming for this issue specifically. We are troubleshooting some things in our release process at the moment, but the fix should land any time now.

I’ll ping you here once the patch release is done

Thank you @luis ,
One point to mention, in case it makes any difference: I told the folks on Discord that it was only affecting 1.5.
But it also affects 1.4.

Actually, this depends on your version of concrete-python rather than concrete-ml :slightly_smiling_face:

In my case it works when I use concrete-ml 1.3.
I assume it changes the concrete-python version automatically as a dependency, or perhaps my issue is slightly different.

I am also getting the same error, but when I changed the concrete-ml version from 1.6.0 to 1.3.0 I got a version mismatch error instead. @churdo @luis

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 436, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/root/zama/serve_model.py", line 171, in compute
    encrypted_results = server.compute(
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/torch/hybrid_model.py", line 823, in compute
    fhe = self.get_circuit(model_name, module_name, input_shape)
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/torch/hybrid_model.py", line 691, in get_circuit
    return _get_circuit(path)
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/torch/hybrid_model.py", line 615, in _get_circuit
    return FHEModelServer(path)
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 96, in __init__
    self.load()
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 102, in load
    check_concrete_versions(server_zip_path)
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 66, in check_concrete_versions
    raise ValueError(
ValueError: Version mismatch for packages:
concrete-python: 2.7.0 != 2.5.0rc1
concrete-ml: 1.6.0 != 1.3.0

Hello @Varun_Joshi , just to confirm: are you saying that with Concrete ML 1.6.0 you encountered a `RuntimeError: Failed to deserialize Value` error?

Yes @luis , I was facing the deserialization error with concrete-ml 1.6.0, so I downgraded to 1.3.0, but now I am getting the version mismatch error.

Alright, noted. The deserialization error should have been fixed in Concrete ML 1.6.0, but it looks like it wasn’t. Could you please tell us which version of Concrete Python you have installed when you run into the deserialization error? And if possible, could you provide us with a script to replicate your issue?
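For reference, a quick way to print both versions (standard library only, using the PyPI distribution names):

```python
from importlib.metadata import version

# Print the installed distribution versions so environments can be compared.
for package in ("concrete-ml", "concrete-python"):
    print(package, version(package))
```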

The version error is due to a mismatch between the client/server files and the version of Concrete ML you are using. You can fix it by re-generating these files, but hopefully we can fix the first error you are facing :slightly_smiling_face:
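As a rough sketch of what re-generating the files means for the hybrid example (the module name, calibration input, and output path below are placeholders; this assumes the `HybridFHEModel` API used by the example, so adapt it to the actual compilation script you ran):

```python
import torch
from transformers import GPT2LMHeadModel
from concrete.ml.torch.hybrid_model import HybridFHEModel

# Placeholder: wrap the model and mark one attention projection as remote.
model = GPT2LMHeadModel.from_pretrained("gpt2")
hybrid_model = HybridFHEModel(model, module_names=["transformer.h.0.attn.c_attn"])

# Compile with a calibration input, then save the client/server artifacts.
# Both steps must run under the *same* concrete-ml / concrete-python
# versions that the client and server will use at inference time.
sample_input_ids = torch.randint(0, 50257, (1, 8))
hybrid_model.compile_model(sample_input_ids, n_bits=8)
hybrid_model.save_and_clear_private_info("compiled_models")
```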

@luis I am using concrete-python 2.7.0 with concrete-ml 1.6.0, and to reproduce the issue you can run the hybrid model example.

I have re-generated the client files with concrete-ml 1.3.0, but now I am getting a "Cannot read data" error:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 436, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/root/zama/serve_model.py", line 171, in compute
    encrypted_results = server.compute(
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/torch/hybrid_model.py", line 829, in compute
    encrypted_results = fhe.run(
  File "/usr/local/lib/python3.10/dist-packages/concrete/ml/deployment/fhe_client_server.py", line 123, in run
    deserialized_encrypted_quantized_data = fhe.Value.deserialize(
  File "/usr/local/lib/python3.10/dist-packages/concrete/fhe/compilation/value.py", line 47, in deserialize
    return Value(NativeValue.deserialize(serialized_data))
  File "/usr/local/lib/python3.10/dist-packages/concrete/compiler/value.py", line 68, in deserialize
    return Value.wrap(_Value.deserialize(serialized_value))
RuntimeError: Cannot read data

Hello @Varun_Joshi , this is really unexpected, and we weren’t able to reproduce the issue with just the information you gave us.

Are you running the hybrid model example from the documentation or a custom one?
Also, which OS / platform are you using (for example, M1 macOS or x86 Linux)?

You might want to delete all keys/compiled models between two debug runs just to be sure that nothing is interfering.
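For example (a hedged sketch; the directory names below are placeholders, so point them at wherever your keys and compiled modules actually live):

```python
import shutil
from pathlib import Path

# Placeholder paths: adjust to your own key-cache / compiled-model locations.
cache_dirs = [Path("compiled_models"), Path("keys")]

for cache_dir in cache_dirs:
    if cache_dir.exists():
        shutil.rmtree(cache_dir)
        print(f"Removed {cache_dir}")
```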

Hello @luis
1.6.0 fixed the issue for me. What exactly was the reason?