I’m reading the documentation about the hybrid model. It seems that FHE is only used for the module on the server. So I wonder: is the module on the client quantized as well? Or does it stay in FP32, so that we only quantize and encrypt the intermediate results before sending them to the server?
Looking forward to your reply.
Yes, you understand it correctly: the intermediate results are quantized, encrypted, and then sent to the server, which computes the server-side layers. The results of these layers are decrypted and dequantized, then processed further by the client. This pattern repeats as configured through the HybridFHEModel constructor.
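To make the data flow concrete, here is a minimal pure-Python sketch of one round trip. It is not Concrete ML's API: the helper names (`quantize`, `dequantize`, `encrypt`, `decrypt`, `server_layer`, `hybrid_round_trip`) are hypothetical, the "encryption" is an identity placeholder, and the server layer is assumed to be a single multiplication by an integer weight. It only illustrates that the client works in floating point while the server sees integers.

```python
from typing import List, Tuple


def quantize(values: List[float], n_bits: int = 8) -> Tuple[List[int], float, int]:
    """Uniform affine quantization of floats to unsigned n_bits integers."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [min(levels, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


def encrypt(q: List[int]) -> List[int]:
    """Placeholder only: a real client would encrypt with an FHE key here."""
    return list(q)


def decrypt(c: List[int]) -> List[int]:
    """Placeholder only: a real client would decrypt with its secret key."""
    return list(c)


def server_layer(c: List[int], weight: int) -> List[int]:
    """Hypothetical server-side layer: integer-only arithmetic on its input."""
    return [weight * x for x in c]


def hybrid_round_trip(values: List[float], weight: int, n_bits: int = 8) -> List[float]:
    """Client floats -> quantize -> encrypt -> server -> decrypt -> dequantize."""
    q, scale, zero_point = quantize(values, n_bits)
    server_out = server_layer(encrypt(q), weight)
    plain = decrypt(server_out)
    # The server multiplied every integer by `weight`, so the zero point scales too.
    return dequantize(plain, scale, weight * zero_point)


if __name__ == "__main__":
    # Float activations produced by a client-side layer (made-up values).
    print(hybrid_round_trip([-1.0, 0.0, 0.5, 2.0], weight=3))
```

The values returned to the client are floats again, close to `weight * value` up to quantization error, so the client-side layers can continue in floating point.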