Good Project Zama

Dear Concrete ML Team,

I am exploring the use of Concrete ML for encrypted fine-tuning of a PyTorch model using LoRA layers. My goal is to train only the LoRA layers under FHE while keeping the rest of the model frozen. I have two specific questions:

  1. Quantisation and LoRA Layers:

I plan to quantise the model (including the LoRA layers) post-training. However, this raises the question of how to obtain quantisation parameters for the LoRA layers. If quantisation is calibrated on pre-training data (task 1) while the LoRA layers are effectively inactive (rough sketch below), what would be the recommended approach to ensure the quantisation parameters for the LoRA layers remain accurate when they are actually used during fine-tuning (task 2)?
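For context, here is a minimal sketch of how I currently do the post-training quantisation. The toy model, its dimensions, and the bit-widths are only placeholders standing in for my real network; `compile_torch_model` is the Concrete ML call I am using:

```python
import torch
from torch import nn
from concrete.ml.torch.compile import compile_torch_model

class TinyLoraLinear(nn.Module):
    """Toy stand-in for my model: a frozen linear layer plus a small LoRA adapter."""

    def __init__(self, dim=16, rank=2):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        self.base.requires_grad_(False)  # base weights stay frozen
        self.lora_a = nn.Linear(dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, dim, bias=False)
        # Real LoRA initialises lora_b to zero; tiny values here just keep the
        # sketch numerically non-degenerate, the calibration concern is the same.
        nn.init.normal_(self.lora_b.weight, std=1e-3)

    def forward(self, x):
        return self.base(x) + self.lora_b(self.lora_a(x))

model = TinyLoraLinear()
task1_calibration = torch.randn(100, 16)  # pre-training (task 1) samples

# Post-training quantisation: the calibration set drives the quantisation
# parameters, but the LoRA branch contributes almost nothing at this point.
quantized_module = compile_torch_model(model, task1_calibration, n_bits=4)
```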

  2. Fine-Tuning Setup with FHE:

Instead of using the HybridFHEModel, I would like to set up a system with FHEModelDev, FHEModelClient, and FHEModelServer. The workflow I envision involves:

• The client encrypting the input data and sending it to the server.

• The server fine-tuning only the LoRA layers under FHE.

Is this workflow feasible with the LoRA layers from my PyTorch model as described? If so, are there any key considerations or limitations I should be aware of when implementing this setup?
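To make the second question concrete, here is a rough sketch of the flow I would like to adapt, based on my reading of the client/server deployment API (`FHEModelDev`, `FHEModelClient`, `FHEModelServer`). It only shows an encrypted forward pass on the compiled module from the sketch above, with placeholder paths; I am assuming the deployment API accepts a module compiled with `compile_torch_model`, and the part I am unsure about is whether the server-side step can be a LoRA fine-tuning step (forward + backward + weight update) rather than plain inference:

```python
import numpy as np
from concrete.ml.deployment import FHEModelDev, FHEModelClient, FHEModelServer

# Developer side: save the compiled circuit and client specs
# (reusing `quantized_module` from the quantisation sketch above).
FHEModelDev(path_dir="deployment", model=quantized_module).save()

# Client side: generate keys, then quantise, encrypt and serialise an input.
client = FHEModelClient(path_dir="deployment", key_dir="keys")
evaluation_keys = client.get_serialized_evaluation_keys()
clear_input = np.random.randn(1, 16).astype(np.float32)  # one task-2 sample
encrypted_input = client.quantize_encrypt_serialize(clear_input)

# Server side: run the FHE computation on the encrypted data. For my use case,
# this step would need to update the LoRA weights, not just return predictions.
server = FHEModelServer(path_dir="deployment")
server.load()
encrypted_output = server.run(encrypted_input, evaluation_keys)

# Client side: decrypt and dequantise the result.
output = client.deserialize_decrypt_dequantize(encrypted_output)
```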

Thank you for your time and kind guidance.

Best regards,
Vaandie