FHE Simulation vs Pure FHE

Hello - I am new to Concrete ML, and I was reading the initial notes written in one of the Colab notebooks inside "use case examples":
" In notebooks [1], [2] and [3], we showed how to convert any custom NN into its FHE counterpart respecting the FHE constraints. Then, how to evaluate it in FHE simulation mode. Pure FHE computation will be available in the next releases.

During those tutorials, you may have experienced some slowness of FHE computations, which is intimately related to non-linear crypto operations on large accumulator size.

In this tutorial we present a new way to speed up computations by cancelling the least-significant bits of the accumulator, and thus having PBS of smaller precision. This technique reduces the FHE-circuit complexity while maintaining high accuracy."

What is "Pure FHE computation" in this context? I asked ChatGPT about it, and it came up with the answer below. How much of it is right or wrong?

" In the context of Fully Homomorphic Encryption (FHE) libraries like Zama’s Concrete-ML, FHE simulation is a way to approximate the behavior of homomorphic encryption on quantized models without incurring the heavy computational cost of actual FHE calculations. Here’s how FHE simulation typically works and what it entails:

1. Quantized Model Evaluation without FHE Encryption

  • The model is quantized to fit the requirements and limitations of FHE. Quantization refers to reducing the precision of model parameters and activations (e.g., from floating-point to lower-precision integers) so that the computations can be performed on limited bit-widths suitable for homomorphic encryption.
  • FHE simulation evaluates the quantized model as if it were operating under FHE bit-width constraints but without actually performing encryption and decryption on the data.

2. Bit-width Constraints and Noise Simulation

  • FHE imposes constraints on the bit-width of each operation due to noise accumulation and encryption scheme limits. During simulation, the library mimics these bit-width restrictions to ensure that the model will behave correctly when encrypted.
  • The simulation introduces approximate noise handling, simulating the precision and rounding behaviors that would occur in FHE. This includes truncating values to the allowed bit-width, which approximates how real FHE computations degrade precision over successive operations.

3. Polynomial Approximations for Activation Functions

  • Activation functions, particularly non-linear ones like ReLU or Sigmoid, are typically not directly supported by FHE since they’re not polynomial functions.
  • In FHE simulation, these activations are replaced with polynomial approximations (e.g., Taylor series expansions or simple polynomials like x^2). This lets you assess how the quantized model performs under approximations, which is critical because these polynomial activations are what would be used in the actual encrypted computation.

4. CKKS or BFV Scheme Simulation (for Approximate/Exact Arithmetic)

  • In actual FHE, different encryption schemes like CKKS (for approximate arithmetic) and BFV (for exact arithmetic) dictate how numbers are represented, the types of arithmetic supported, and the precision.
  • In simulation, these schemes are approximated by imposing similar arithmetic constraints (e.g., using fixed-point arithmetic or integer arithmetic with limited precision) without actually encrypting data. This allows for testing the model’s compatibility with FHE arithmetic and noise tolerance.

5. Bootstrapping and Noise Constraints

  • In real FHE, bootstrapping refreshes ciphertexts to reduce accumulated noise. Although bootstrapping itself isn’t performed in simulation, the noise level is often monitored to mimic how noise would accumulate in an actual FHE computation.
  • By simulating noise buildup, users can adjust their model parameters or reduce model depth to prevent noise from reaching a level that would make the results unusable when actually encrypted.

6. Validation of Inference Accuracy in a Simulated FHE Environment

  • The model is evaluated end-to-end in this simulation environment to validate that it performs inference with acceptable accuracy under FHE-like constraints.
  • This process helps users understand if the model will still produce accurate results when actually encrypted and highlights any adjustments needed before deploying the model with real FHE computations.

Example of FHE Simulation Workflow

Let’s say you have a neural network model, and you want to simulate how it would perform under FHE constraints:

  1. Quantize the Model: Reduce the model’s weights and activations to integer representations, with lower precision (e.g., 8-bit or 16-bit integers).
  2. Replace Activations: Swap out non-linear activations (like ReLU) with polynomial functions (like x^2).
  3. Simulate Limited Precision: Perform model inference by imposing bit-width constraints (e.g., rounding intermediate computations to a limited number of bits).
  4. Monitor Noise Levels: Simulate how noise would accumulate in an actual FHE computation by tracking rounding and precision loss through each layer of the model.
  5. Validate Output: Ensure that the model’s predictions are still accurate within the FHE-like environment.

By using these techniques, FHE simulation allows you to approximate the model’s behavior under homomorphic encryption constraints without the computational load of real FHE, making it faster and more practical for model testing and validation."

Would anyone like to clarify this matter?

Hi @mmr,

Welcome to FHE field, and thanks for your question.

This part " In the context of Fully Homomorphic Encryption (FHE) libraries like Zama’s Concrete-ML, FHE simulation is a way to approximate the behavior of homomorphic encryption on quantized models without incurring the heavy computational cost of actual FHE calculations." is true.

Now, let me shed some more light on this:

In the FHE universe, there are several encryption schemes, such as TFHE, CKKS, and BFV. At Zama, we use the TFHE scheme (which wasn’t mentioned in the ChatGPT response).

Each of these schemes adds noise to the ciphertext (encrypted value) to ensure security. As operations are performed on this ciphertext, the noise gradually increases until it reaches a level where it can corrupt the encrypted data.

Fortunately, FHE schemes provide a complex cryptographic mechanism called bootstrapping, which reduces noise after a series of operations. This mechanism enables us to perform as many homomorphic operations as we need, without the ciphertext becoming corrupted by noise.
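To make the noise-growth and bootstrapping idea more tangible, here is a toy, purely illustrative sketch in plain Python. Nothing here is real cryptography, and all the numeric constants are made up; it only shows the bookkeeping: each operation adds noise, and a "bootstrap" resets the noise to a fresh level before the budget is exceeded.

```python
# Toy illustration of noise growth and bootstrapping (NOT real cryptography).
# All constants are invented for illustration only.

NOISE_BUDGET = 100   # noise level above which decryption would fail
NOISE_PER_OP = 12    # noise added by each homomorphic operation
FRESH_NOISE = 5      # noise of a fresh (or freshly bootstrapped) ciphertext

def run_circuit(num_ops):
    """Track noise over a chain of operations, bootstrapping when needed."""
    noise = FRESH_NOISE
    bootstraps = 0
    for _ in range(num_ops):
        if noise + NOISE_PER_OP > NOISE_BUDGET:
            noise = FRESH_NOISE  # bootstrapping refreshes the ciphertext
            bootstraps += 1
        noise += NOISE_PER_OP
    return noise, bootstraps

final_noise, bootstraps = run_circuit(20)
print(f"final noise: {final_noise}, bootstraps: {bootstraps}")
```

Thanks to the periodic refresh, the noise never crosses the budget, no matter how many operations the circuit chains together.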

Current Constraints of the TFHE Scheme

  • TFHE operates on integer values only. That’s why quantization is a necessary first step before any homomorphic computations can be performed.
  • There is a precision limit of 16-bit width for non-linear operations.
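Since TFHE works on integers only, floating-point weights and inputs first have to be mapped to small integers. A minimal uniform-quantization sketch in plain Python (this is an illustration of the general idea, not Concrete ML's internal implementation):

```python
# Minimal uniform quantization sketch (illustrative, not Concrete ML internals).

def quantize(values, n_bits):
    """Map floats to signed integers on n_bits, returning (ints, scale)."""
    q_max = 2 ** (n_bits - 1) - 1                # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs else 1.0  # one float per int step
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    """Map the integers back to approximate floats."""
    return [q * scale for q in q_values]

weights = [0.4, -1.0, 0.25, 0.75]
q_weights, scale = quantize(weights, n_bits=8)
recovered = dequantize(q_weights, scale)

print(q_weights)                          # small integers, fit in 8 bits
print([round(r, 3) for r in recovered])   # close to the original floats
```

The integers fit the scheme's bit-width, and dequantizing them recovers the original values up to a small precision loss; that loss is exactly what the quantized mode below lets you measure.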

Types of Inference in Concrete-ML (CML) Library

For model inference in CML, we offer three modes:

  1. Quantized Mode - model.predict(X_test, fhe="disable"): In this mode, you perform inference with your quantized model and quantized inputs, without any homomorphic operations. This lets you see the impact of quantization on model performance.
  • It can reduce accuracy, since there is a loss of precision.
  • It can also improve accuracy, as quantization can act as a form of regularization and improve robustness.
  2. Pure FHE Computation - model.predict(X_test, fhe="execute"): In this mode, all operations are homomorphic, providing fully encrypted inference. However, it can be slow.

  3. Simulation Mode - model.predict(X_test, fhe="simulate"): In this mode, we mimic real homomorphic operations, but no actual FHE is performed. We highly recommend using this mode during development: it's much faster, and it lets you verify that your model is FHE-compliant and assess its performance.

Hope it’s clearer now 🙂 stay tuned for updates on CML! We’re actively working to relax some of these constraints =)

Thanks,
Célia


Some of the examples regarding training CNNs with CIFAR-10 have a lot of moving parts; could they be streamlined a bit?

Which example are you referring to? Could you share the link?

CifarQuantizationAwareTraining.ipynb from zama-ai/concrete-ml/tree/main/use_case_examples/cifar/cifar_brevitas_finetuning

This one in particular.

Yes, you can shorten it…

We wanted it to be as pedagogical as possible.

The idea is to first quantize the model, then check if it’s FHE-compatible through compilation and simulation, and finally apply FHE.