Error while compiling quantized model from brevitas to concrete-ml

Hi,

Following the notebook tutorial on VGG, I tried to make it work on a simple dense classifier. Everything works fine until Concrete compiles the quantized model exported from Brevitas, where I get a Python error: UnboundLocalError: local variable 'node_integer_inputs' referenced before assignment. I tried to fix it myself, but no luck.

I made a notebook with a minimal example: concrete-debugs/quantization_debug.ipynb at master · BastienVialla/concrete-debugs · GitHub

Hi Bastien,

I think you found a bug! The node_integer_inputs variable in concrete/ml/quantization/post_training.py:50 is indeed not initialized properly: it is assigned only on the True branch of the if has_variable_inputs: check on line 401.

Here’s a quick fix:

  • copy the block:
                # Find the unique integer producers of the current's op output tensor
                node_integer_inputs = set.union(
                    *[tensor_int_producers.get(input_node, set()) for input_node in node.input]
                )

and paste it just above

                if get_op_type(node) == QuantizedBrevitasQuant.op_type():

To find the file concrete/ml/quantization/post_training.py, you can run find . -name "post_training.py" in the directory where your virtualenv lives.
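For clarity, here is a minimal, self-contained sketch of why this pattern raises UnboundLocalError and how hoisting the assignment fixes it. This is a hypothetical simplification, not the actual post_training.py code; the function and value names are made up for illustration:

```python
def buggy_pass(node, has_variable_inputs):
    # Simplified version of the buggy control flow: the variable is
    # bound only on one branch ...
    if has_variable_inputs:
        node_integer_inputs = set()
    # ... but read unconditionally further down, so Python raises
    # UnboundLocalError whenever the branch above was skipped.
    if node == "BrevitasQuant":
        return node_integer_inputs


def fixed_pass(node, has_variable_inputs):
    # The fix: bind the variable on every path before it is read.
    node_integer_inputs = set()
    if has_variable_inputs:
        node_integer_inputs |= {"int_producer"}  # placeholder value
    if node == "BrevitasQuant":
        return node_integer_inputs
```

Because node_integer_inputs is assigned somewhere in buggy_pass, Python treats it as a local variable for the whole function body, so reading it before any assignment raises UnboundLocalError rather than falling back to a global lookup.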

If it doesn’t work we can find some other way to help you work around this, let us know!


Thanks for the quick answer.
Now, in the same function, six lines further down, it is curr_calibration_data that is not defined. I get this error: UnboundLocalError: local variable 'curr_calibration_data' referenced before assignment.

Hello @BVialla. Just to keep you updated: we are working on reproducing the issue on our side so that we can give you a proper fix. We don't want to waste your time with things we haven't tested first. We'll get back to you as soon as possible. Cheers


Hi again,

I have some good news: although there is indeed a bug, you will not need to wait for a Concrete-ML update to fix the problem. You can work around it by changing the network to make it compatible with Concrete-ML.

The problem is actually in the network design: the network needs to quantize its inputs, and in Brevitas this is done with a QuantIdentity layer.

There were some other issues with your network that I could see:

  • bit_width needs to be specified for activations and the new QuantIdentity layer
  • do not use return_quant_tensor or set it to False

These guidelines are documented in the Step-by-step guide here: Step-by-step Guide - Concrete ML. If the documentation is not clear enough, let me know and we'll improve it.

I applied these modifications to your model, giving this new one:

# Imports assumed by the model below (not shown in the original post):
import torch.nn as nn
import brevitas
import brevitas.nn as qnn
from brevitas.quant import Int8ActPerTensorFloat, Int8WeightPerTensorFloat

class QDenseClassifier(nn.Module):
    def __init__(self,
                 hparams: dict,
                 bits: int,
                 act_quant: brevitas.quant = Int8ActPerTensorFloat,
                 weight_quant: brevitas.quant = Int8WeightPerTensorFloat):
        super().__init__()
        self.hparams = hparams
        # Quantize the network inputs, as required by Concrete-ML
        self.input_quant = qnn.QuantIdentity(act_quant=act_quant, bit_width=bits)
        self.dense1 = qnn.QuantLinear(hparams['n_feats'], hparams['hidden_dim'],
                                      weight_quant=weight_quant, weight_bit_width=bits, bias=True)
        self.dp1 = qnn.QuantDropout(0.1)
        # bit_width is specified explicitly for the activation
        self.act1 = qnn.QuantReLU(act_quant=act_quant, bit_width=bits)
        self.dense2 = qnn.QuantLinear(hparams['hidden_dim'], 1,
                                      weight_bit_width=bits, weight_quant=weight_quant, bias=True)

    def forward(self, src):
        x = self.dense1(self.input_quant(src))
        x = self.dp1(x)
        x = self.act1(x)
        x = self.dense2(x)
        return x

Hope this helps you make progress with your experiments! Since it seems you are doing fine-tuning, I suggest, if you are not already doing so, looking at the fine-tuning demo: concrete-ml/use_case_examples/cifar_brevitas_finetuning at release/0.6.x · zama-ai/concrete-ml · GitHub


Good news indeed.
Thanks for the complete answer, I'll update the code and keep experimenting. I did read the documentation on quantization, but apparently I rushed a bit and skipped the part on quantizing inputs.