The inference time of the following example, available on the Zama site, is very high:
Neural network fine-tuning: Fine-tune a VGG network to classify the CIFAR image dataset and predict on encrypted data.
I am using a machine with 16 cores (2.1 GHz) and 64 GB RAM, yet inference on a single image took 28.6 minutes. How can I decrease this inference time?
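For context, a minimal sketch of how encrypted inference can be timed with Concrete ML is shown below; `model` and `calibration_data` are hypothetical placeholders rather than the actual objects from the CIFAR example, and the `n_bits` value is an assumption.

```python
# Minimal timing sketch, assuming Concrete ML's compile_torch_model API.
# `model` and `calibration_data` are hypothetical placeholders.
import time

from concrete.ml.torch.compile import compile_torch_model

# Compile the (already fine-tuned) torch model to its FHE equivalent.
quantized_module = compile_torch_model(
    model,             # hypothetical fine-tuned VGG-style network
    calibration_data,  # hypothetical representative input set
    n_bits=6,          # quantization bit-width; an assumed value
)

# Time the encrypted inference of a single image.
x = calibration_data[:1]
start = time.time()
prediction = quantized_module.forward(x, fhe="execute")
print(f"Encrypted inference took {time.time() - start:.1f} s")
```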
Hi,
One possible way to reduce latency is to use a bigger machine, since Concrete makes very good use of parallelism: we obtain an inference time of around 40 seconds on that model using hpc7a instances on AWS.
A second possible approach, depending on the use case, is to optimize your model by making it smaller or pruning it. You might want to look into structured pruning; see this section of the documentation about such techniques, some of which are already used in the CIFAR example that you link. A hedged sketch of structured pruning follows below.
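As an illustration only, here is a minimal sketch of structured pruning using PyTorch's built-in `torch.nn.utils.prune` utilities; the layer choices and the 30% pruning amount are hypothetical, and this is not the exact technique used in the CIFAR example (see the linked documentation for that).

```python
# Minimal structured-pruning sketch using PyTorch's built-in utilities.
# The 30% amount and the stand-in layers are hypothetical; tune both
# against your accuracy budget.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(  # stand-in for a VGG-style network
    nn.Conv2d(3, 64, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1),
)

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        # Zero the 30% of output channels with the smallest L2 norm.
        prune.ln_structured(module, name="weight", amount=0.3, n=2, dim=0)
        # Fold the pruning mask into the weights permanently.
        prune.remove(module, "weight")
```

Note that `ln_structured` only zeroes the selected channels; physically removing them so the compiled circuit actually shrinks is a separate step.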