I would like to run an EfficientNet model, quantized and converted to FHE, on the melanoma dataset available on Kaggle. Has anyone tried converting a model this large to its concrete-ml counterpart? I would like to know whether it’s even possible before I start working on it.
The dataset is very complicated; even for a human, it’s difficult to classify melanoma. I noticed a performance drop on the CIFAR-100 dataset with concrete-ml compared to its non-FHE counterpart. So I have the feeling that, for now, it’s impossible to achieve good results with “big” models and “big” datasets like these.
At first glance, all the operators in EfficientNet are supported in Concrete-ML (e.g. depthwise convolution, residual connections). It should thus be possible to write a Brevitas quantization-aware training (QAT) implementation of this network, and it should compile to FHE. If you try it and encounter specific issues, let us know and we can investigate in more detail. We highly recommend you use the approach described in the CIFAR QAT fine-tuning example.
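As background on why bit width matters for accuracy, here is a minimal pure-Python sketch of the "fake quantization" step that QAT inserts during training. It is not Brevitas code and not taken from the CIFAR example; the function names and the sample weights are made up for illustration, and it assumes a simple symmetric per-tensor scale.

```python
def fake_quantize(values, n_bits):
    """Round floats onto a symmetric n_bits integer grid, then map back to floats."""
    if n_bits < 2:
        raise ValueError("n_bits must be at least 2")
    q_max = 2 ** (n_bits - 1) - 1  # largest integer level, e.g. 7 for 4 bits
    q_min = -q_max                 # symmetric range around zero
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    quantized = []
    for v in values:
        q = round(v / scale)            # nearest integer level
        q = max(q_min, min(q_max, q))   # clamp to the representable range
        quantized.append(q * scale)     # back to float ("dequantize")
    return quantized


def max_abs_error(values, n_bits):
    """Largest rounding error introduced by quantizing to n_bits."""
    return max(abs(v - q) for v, q in zip(values, fake_quantize(values, n_bits)))


# Illustrative weights only, not from any real model.
weights = [0.82, -0.41, 0.05, -0.97, 0.33]
for bits in (8, 4, 2):
    print(bits, "bits -> max error", max_abs_error(weights, bits))
```

The error grows quickly as the bit width shrinks, which is why training with quantization in the loop tends to hold accuracy better than quantizing a finished float model.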
For CIFAR-100, using the approach described above, there is almost no accuracy decrease with FHE simulation. We have not yet tested the whole dataset in actual FHE, but we expect simulation to be representative of real FHE computation.
If you experience a significant accuracy decrease, could you show how you compile and how you compute the accuracy?
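When reporting an accuracy drop, it helps to give three numbers side by side. The sketch below is plain Python with hypothetical helper names and made-up predictions; it assumes you already have one list of predicted classes from the clear float model and one from the compiled model run in simulation, both on the same samples.

```python
def top1_accuracy(predictions, labels):
    """Fraction of positions where the predicted class equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


def compare_runs(clear_preds, simulated_preds, labels):
    """Accuracy of each run, plus how often the two runs agree with each other."""
    return {
        "clear_accuracy": top1_accuracy(clear_preds, labels),
        "simulated_accuracy": top1_accuracy(simulated_preds, labels),
        "agreement": top1_accuracy(simulated_preds, clear_preds),
    }


# Made-up data for illustration only.
labels = [0, 1, 1, 0, 1, 0, 0, 1]
clear_preds = [0, 1, 1, 0, 0, 0, 0, 1]
simulated_preds = [0, 1, 1, 0, 0, 0, 1, 1]

report = compare_runs(clear_preds, simulated_preds, labels)
for name, value in report.items():
    print(name, value)
```

The agreement figure is useful on its own: if the two runs disagree on many samples, the gap most likely comes from quantization or compilation settings rather than from the evaluation code.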
Hello,
Thank you for your reply. Apologies for my late response, I’ve been quite busy lately.
If I remember correctly, batch normalization isn’t supported by Concrete-ML, or at least I couldn’t find it on the list of supported operators (Using Torch - Concrete ML).
Training in the clear (without Concrete-ML) on a 3060 GPU took me close to 8-9 hours. Do you think training time will be a significant issue for its TFHE counterpart?
Regardless, I will give it a try. I’ll follow your QAT fine-tuning example. Could you please provide some information about batch normalization before I proceed?
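For general background, independent of what the Concrete-ML operator list says: at inference time, batch normalization is a fixed per-channel affine transform, so it can be folded into the weights and bias of the preceding layer. The sketch below shows this for a single linear output in plain Python. All names and values are illustrative, and whether and how Concrete-ML handles batch normalization is something to confirm in its documentation.

```python
import math


def batch_norm(z, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm on one value, using stored running statistics."""
    return gamma * (z - mean) / math.sqrt(var + eps) + beta


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Merge batch norm into the preceding layer's weights and bias."""
    factor = gamma / math.sqrt(var + eps)
    folded_weights = [w * factor for w in weights]
    folded_bias = (bias - mean) * factor + beta
    return folded_weights, folded_bias


def linear(x, weights, bias):
    """Single-output linear layer: dot product plus bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# Illustrative values only.
x = [0.5, -1.2, 2.0]
weights, bias = [0.3, -0.7, 0.1], 0.05
gamma, beta, mean, var = 1.5, -0.2, 0.4, 2.25

reference = batch_norm(linear(x, weights, bias), gamma, beta, mean, var)
folded_weights, folded_bias = fold_batch_norm(weights, bias, gamma, beta, mean, var)
folded = linear(x, folded_weights, folded_bias)

print("layer + batch norm:", reference)
print("folded layer:      ", folded)
```

Both paths give the same output up to floating-point rounding, so a trained batch norm layer adds no extra operation at inference time once it is folded.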
Ah, okay, great!
No, no, my classic model (EfficientNet) without Concrete-ML took 8-9 hours to train, so I was a bit apprehensive about its Concrete-ML equivalent.