With the current release:
The torch to homomorphic numpy compiler is still a work in progress and has several limitations in this version of Concrete Numpy:
- The only supported operators are currently nn.Linear, nn.Sigmoid and nn.ReLU6
- Torch nn.Modules that you create to be compiled must declare their submodules in the __init__ function, in the order they will be called in forward
- The forward function must only reference the submodules declared in __init__ (e.g. do not call operators from torch.functional, do not slice tensors); see the sketch after this list
- The values accumulated during inference must not overflow 7 bits.
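For illustration, here is a minimal sketch of a module that follows these rules. The class name and layer sizes are made up for the example; only the structure (supported operators, submodules declared in __init__ in call order, forward referencing only those submodules) reflects the constraints above.

```python
import torch
from torch import nn


class TinyCompilableNet(nn.Module):
    """Hypothetical module restricted to nn.Linear, nn.Sigmoid and nn.ReLU6."""

    def __init__(self):
        super().__init__()
        # Submodules are declared in the order forward will call them
        self.fc1 = nn.Linear(in_features=14, out_features=8)
        self.sigmoid = nn.Sigmoid()
        self.fc2 = nn.Linear(in_features=8, out_features=2)
        self.relu6 = nn.ReLU6()

    def forward(self, x):
        # forward only references the submodules declared in __init__:
        # no torch.functional calls, no tensor slicing
        x = self.fc1(x)
        x = self.sigmoid(x)
        x = self.fc2(x)
        x = self.relu6(x)
        return x
```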
For now, our framework performs post-training quantization to 2-bit weights and activations. We know this is a tough constraint: your network either needs to work well in this setting, or you will have to accept the performance loss due to quantization.
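As a rough back-of-the-envelope check (assuming unsigned quantized values and an unsigned 7-bit accumulator, which is a simplification of the real representation), you can estimate how many terms a dot product can sum before the accumulator constraint is hit:

```python
def max_dot_product_terms(weight_bits: int, act_bits: int, acc_bits: int) -> int:
    """Largest number of terms a dot product can accumulate without
    overflowing acc_bits, assuming unsigned quantized values."""
    max_weight = 2**weight_bits - 1  # 3 for 2-bit weights
    max_act = 2**act_bits - 1        # 3 for 2-bit activations
    max_acc = 2**acc_bits - 1        # 127 for a 7-bit accumulator
    return max_acc // (max_weight * max_act)


print(max_dot_product_terms(2, 2, 7))  # 127 // 9 = 14 terms in the worst case
```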
How we are improving our support for more generic neural networks:
We are working on creating an NN conv block that maximizes the number of leveled operations and minimizes the number of PBS, while being a drop-in replacement for nn.Conv2d and achieving the same performance. This conv block will use Quantization Aware Training and pruning to reduce the precision and the number of neuron connections, in order to respect the 7-bit accumulator constraint.
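This conv block is not available yet. As a rough illustration of the pruning side of the idea (not the block described above), one could limit the number of nonzero connections per output channel with torch's built-in pruning utilities, so that each neuron accumulates fewer terms:

```python
import torch
from torch import nn
from torch.nn.utils import prune

# Hypothetical convolution layer; the drop-in block described above would
# combine something like this with Quantization Aware Training.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

# Zero out 75% of the weights so each output neuron sums fewer terms,
# which helps the accumulated values stay within 7 bits.
prune.l1_unstructured(conv, name="weight", amount=0.75)

# Count the remaining nonzero connections per output channel.
nonzero_per_channel = (conv.weight != 0).sum(dim=(1, 2, 3))
print(nonzero_per_channel)
```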