The code finally outputs `[ CPUBFloat16Type{32,1024,8,128} ]` but then the process kills itself

My code:

    import os

    hybrid_model.compile_model(
        tensor[:, 0:1],
        n_bits=8,
    )

    allow_save_FHE_model: bool = True
    if allow_save_FHE_model:
        via_mlir = bool(int(os.environ.get("VIA_MLIR", 1)))
        hybrid_model.save_and_clear_private_info("/fhe_circuit", via_mlir=via_mlir)

The process runs and finally outputs:

Columns 105 to 128  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  (seven more identical rows of zeros omitted)
[ CPUBFloat16Type{32,1024,8,128} ]

But the code cannot save the circuit; it looks like the process kills itself right after compile().

Hey
Is “/fhe_circuit” a valid path?

i.e., maybe you missed a “.” and wanted to use “./fhe_circuit”.
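
If you want to be sure where the circuit would land, you can resolve the path and print it before saving (just a sketch, reusing your hybrid_model from above):

    from pathlib import Path

    # Resolve the relative directory and print it, to confirm the process
    # would write where you expect before calling the save function.
    save_dir = Path("./fhe_circuit").resolve()
    print(f"Saving FHE circuit to: {save_dir}")
    hybrid_model.save_and_clear_private_info(str(save_dir), via_mlir=True)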

I tried many ways to set the save path, even a fully resolved path, but they all failed because the process dies right after compile(). It never reaches the saving part.

I am using it to compile an LLM, and I found it stopping in a transformer block. As soon as execution enters that block, it stops immediately, and there is no error report.

I guess something probably exceeds a limit, perhaps the noise. Is there a “noise monitor” function I can call to inspect what is happening at runtime?

So, are you sure the compilation finishes well? From what you say, I have the impression it doesn’t.

maybe do some

    print("AAA")
    hybrid_model.compile_model(
        tensor[:, 0:1],
        n_bits=8,
    )
    print("BBB")

and tell me if you see AAA and BBB.
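
If the process dies with no Python traceback at all, it may be crashing at the native level (or being killed by the OS, e.g. out of memory). Enabling the standard library’s faulthandler before compiling is a cheap way to tell the two apart (a generic sketch, nothing Concrete-specific):

    import faulthandler

    # Dump a Python traceback to stderr if the interpreter crashes natively
    # (segfault, abort); such crashes otherwise look like a silent kill.
    faulthandler.enable()

    print("AAA")
    hybrid_model.compile_model(tensor[:, 0:1], n_bits=8)
    print("BBB")

Note that faulthandler cannot catch a kernel OOM kill; on Linux, check dmesg for that.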

Also, could you copy-paste the full trace of the Python execution, please, as well as the full code?

But yes, it’s expected that you’ll have errors if you try to compile a full LLM. For LLMs, doing 100% in FHE is too much for today. You might want to have a look at concrete-ml/use_case_examples/llm at main · zama-ai/concrete-ml · GitHub and at concrete-ml/use_case_examples/hybrid_model at main · zama-ai/concrete-ml · GitHub
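
At a high level, the hybrid example boils down to something like this (a rough sketch; model and the module name are placeholders, and the exact signatures may differ between versions):

    from concrete.ml.torch.hybrid_model import HybridFHEModel

    # Sketch: run only one attention block in FHE, keep the rest of the
    # model in the clear. "transformer.h.0.attn" is a placeholder name
    # for a GPT-2-like model's first attention module.
    hybrid_model = HybridFHEModel(model, module_names=["transformer.h.0.attn"])
    hybrid_model.compile_model(tensor[:, 0:1], n_bits=8)

That way only the selected submodule has to fit within FHE constraints, not the whole LLM.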

I can see AAA but cannot see BBB. I don’t even know whether the compilation succeeded when the [ CPUBFloat16Type{32,1024,8,128} ] appears.

I am trying a very small LLM, much smaller than the GPT-2 in your example, so it is highly likely to compile.

I will submit the full code after I organize it.

As for your example, I have another problem.

In the file qgpt2_models.py, in the class QGPT2LMHeadModel(GPT2LMHeadModel), the method q_attention() is:

    def q_attention(self) -> GPT2Attention:
        """Get GPT-2's attention module found in the first layer.

        Returns:
            GPT2Attention: The attention module.
        """
        return self.transformer.h[0].attn

Does this example only compile the first attention layer?

If you don’t see the BBB, then for sure the compile_model function did not finish well.

Yes, as I told you, we can’t use FHE on the full LLM, so we’ve shown that we can compile a single attention layer and get accurate results. In the hybrid model, we show how to run some of the layers on the client side (in the clear) and some in FHE (on the server side). There are many more explanations in the .md files; I would recommend reading them.
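
If you want to experiment with a block other than the first, the same accessor pattern applies (a sketch, assuming the standard Hugging Face GPT-2 layout; model and layer_index are placeholders):

    # Hugging Face's GPT-2 keeps its transformer blocks in model.transformer.h,
    # so picking another layer's attention is just a matter of the index.
    layer_index = 5
    attention = model.transformer.h[layer_index].attn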

Do you have a function to monitor the noise growth?

No tool. Noise is managed by the Concrete optimizer; as a user, you shouldn’t worry about it.

It’s not a matter of noise; it’s a matter of “LLMs are too huge to be compiled today”.

I see. How do I compile several functions? For instance, if I have two functions, add() and project(), with the relation y = project(add(x)), how do I compile them?

Can I compile them together by just compiling the project() function?
Or should I compile the two functions separately?
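
To make it concrete, here is roughly what I mean (add and project here are placeholders standing in for my real functions):

    import torch

    # Placeholder module computing y = project(add(x)).
    class Composed(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.project = torch.nn.Linear(128, 64)

        def forward(self, x):
            y = x + 1.0             # the "add" step
            return self.project(y)  # the "project" step

    # Question: is compiling this single module enough to cover both steps,
    # or do the two functions need to be compiled separately?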

For the sake of clarity, could you open a new question/thread, please?