Let’s take QGPT2Evaluate.ipynb, and in particular its multi-head attention GPT-2 model, as an example.

I can think of two ways to analyze the computational cost of FHE: quantitatively, by measuring the actual execution time, and theoretically, by evaluating the computational complexity (this second one is where my doubt lies).

# 1. Quantitatively analyze the actual execution time

## 1.1 Execute it without FHE

```python
import time

proj_12_heads_qgpt2.set_fhe_mode(fhe="disable")

t1 = time.time()
output_logits_clear = proj_12_heads_qgpt2(input_ids).logits
print(f"Inference time without FHE (fhe=\"disable\"): {(time.time() - t1):.2f}s")
```
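As a side note, `time.time()` works, but `time.perf_counter()` is better suited to measuring durations, and taking the best of a few runs reduces noise. This is just a generic helper I use, not part of the notebook:

```python
import time


def time_inference(fn, *args, repeats=3):
    """Run fn(*args) several times and return the best wall-clock time in seconds.

    Generic measurement helper (my own, not from the notebook);
    perf_counter is monotonic and higher-resolution than time.time().
    """
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best
```

It could then be called as, e.g., `time_inference(lambda: proj_12_heads_qgpt2(input_ids).logits)` (with `repeats=1` for the slow FHE case).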

## 1.2 Execute it with FHE

```python
import time

# Compile the model to an FHE circuit before executing it
circuit_12_heads = proj_12_heads_qgpt2.compile(input_ids)
proj_12_heads_qgpt2.set_fhe_mode(fhe="execute")

t2 = time.time()
output_logits_executed = proj_12_heads_qgpt2(input_ids).logits
print(f"Inference time with FHE (fhe=\"execute\"): {(time.time() - t2):.2f}s")
```

Therefore, we can quantitatively assess the overhead of FHE by comparing the two times above.

However, running with `fhe="execute"` may take a long time.

So, I’m curious whether there is a theoretical way to analyze the computational complexity of this multi-head attention model in QGPT2Evaluate.ipynb.

# 2. How to theoretically evaluate the computational complexity of FHE?
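Here is the rough starting point I have so far (my own sketch, not from the notebook): count the scalar multiplications that one multi-head attention block performs in the clear, on the simplifying assumption that FHE runtime grows with the number of encrypted multiplications. For sequence length n, model width d, this comes out to roughly 4·n·d² + 2·n²·d:

```python
def mha_mult_count(seq_len: int, d_model: int, n_heads: int) -> int:
    """Rough count of scalar multiplications in one multi-head attention block.

    Simplification: ignores softmax, layer norm, and quantization details,
    which matter a lot for FHE cost but not for the asymptotic picture.
    """
    d_head = d_model // n_heads
    # Q, K, V projections: three (seq_len x d_model) @ (d_model x d_model) matmuls
    projections = 3 * seq_len * d_model * d_model
    # attention scores Q @ K^T, per head: (seq_len x d_head) @ (d_head x seq_len)
    scores = n_heads * seq_len * seq_len * d_head
    # weighted sum of values, per head: (seq_len x seq_len) @ (seq_len x d_head)
    weighted_values = n_heads * seq_len * seq_len * d_head
    # final output projection: (seq_len x d_model) @ (d_model x d_model)
    output_proj = seq_len * d_model * d_model
    return projections + scores + weighted_values + output_proj
```

With GPT-2’s d_model = 768 and 12 heads this gives `mha_mult_count(n, 768, 12)` = 4·n·768² + 2·n²·768, i.e. O(n·d² + n²·d). My doubt is how to map such an operation count onto actual FHE cost (for instance, the number of programmable bootstrappings in TFHE), which depends on the scheme and its parameters.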

Thank you very much. I would greatly appreciate any information you can provide.