Error while compiling quantized model from brevitas to concrete-ml


Following the VGG notebook tutorial, I tried to adapt it to a simple dense classifier. Everything works until Concrete-ML compiles the quantized Brevitas model, at which point I get the Python error UnboundLocalError: local variable 'node_integer_inputs' referenced before assignment. I tried to fix it, without success.

I made a notebook with a minimal example: concrete-debugs/quantization_debug.ipynb at master · BastienVialla/concrete-debugs · GitHub

Hi Bastien,

I think you found a bug! The node_integer_inputs variable in concrete/ml/quantization/ is not always initialized: it is assigned only on the True branch of the if has_variable_inputs: statement on line 401, so any later use on the False path fails.
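To illustrate the failure mode, here is a minimal sketch in plain Python. It is not the Concrete-ML source: only the names node_integer_inputs and has_variable_inputs come from this thread, while the functions buggy and fixed and the placeholder value "input_0" are made up for the example. The sketch binds a default value before the branch, which is one common remedy and differs from the quick fix described in this reply.

```python
def buggy(has_variable_inputs: bool) -> set:
    # The name is bound only when the condition is True.
    if has_variable_inputs:
        node_integer_inputs = {"input_0"}
    # On the False path the local name was never assigned,
    # so this line raises UnboundLocalError.
    return node_integer_inputs


def fixed(has_variable_inputs: bool) -> set:
    # Bind the name on every path before it is read.
    node_integer_inputs = set()
    if has_variable_inputs:
        node_integer_inputs = {"input_0"}
    return node_integer_inputs


try:
    buggy(False)
except UnboundLocalError as err:
    print(type(err).__name__, "-", err)

print(fixed(False))
```

The exact wording of the error message depends on the Python version, but the exception type is the same.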

Here’s a quick fix:

  • copy the block:
                # Find the unique integer producers of the current's op output tensor
                node_integer_inputs = set.union(
                    *[tensor_int_producers.get(input_node, set()) for input_node in node.input]
                )

and paste it just above

                if get_op_type(node) == QuantizedBrevitasQuant.op_type():

To find the file concrete/ml/quantization/ you can use find . -name "" in the directory where you have your virtualenv.
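If you prefer to locate the directory from Python rather than with find, the following sketch may help. It assumes the package is importable as concrete.ml.quantization in the active virtualenv (inferred from the path above); package_dir is a hypothetical helper name, not part of Concrete-ML.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the directory of an importable package, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


# Assumed module path; prints None if Concrete-ML is not installed here.
print(package_dir("concrete.ml.quantization"))
```

Run it with the same interpreter that you use for the notebook, so that it reports the copy of the package that actually gets imported.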

If it doesn’t work we can find some other way to help you work around this, let us know!


Thanks for the quick answer.
Now in the same function call, 6 lines below, it is curr_calibration_data that is not defined. I have this error: UnboundLocalError: local variable 'curr_calibration_data' referenced before assignment.

Hello @BVialla. Just to keep you updated: we are working on reproducing the issue on our side so that we can give you a proper fix. We don’t want to waste your time with things that we haven’t tested first. We will get back to you as soon as possible. Cheers

Hi again,

I have some good news: although there is indeed a bug, you will not need to wait for a Concrete-ML update. You can work around it by changing the network to make it compatible with Concrete-ML.

The problem is actually in the network design: the network needs to quantize its inputs and in Brevitas this is done using a QuantIdentity layer.
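As a rough picture of what quantizing the inputs means, here is a conceptual sketch of signed uniform quantization in plain Python. It is not how Brevitas implements QuantIdentity (which determines its scale during training); the function names, the fixed scale, and the sample values are assumptions chosen for illustration.

```python
def quantize(values, bit_width, scale):
    """Map floats to signed integers representable on bit_width bits."""
    qmin = -(2 ** (bit_width - 1))
    qmax = 2 ** (bit_width - 1) - 1
    # Divide by the scale, round to the nearest integer, then clamp to range.
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map the integers back to the floats they represent."""
    return [q * scale for q in q_values]


inputs = [0.0, 0.5, -1.0, 10.0]
q = quantize(inputs, bit_width=3, scale=0.25)
print(q)                    # [0, 2, -4, 3]
print(dequantize(q, 0.25))  # [0.0, 0.5, -1.0, 0.75]
```

Note how 10.0 saturates at the largest 3-bit value: without an input quantizer the first layer would receive raw floats, which cannot be represented as integers for the encrypted computation.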

There were some other issues with your network that I could see:

  • bit_width needs to be specified for activations and the new QuantIdentity layer
  • do not use return_quant_tensor or set it to False

These guidelines are documented in the Step-by-step Guide of the Concrete ML documentation. If the documentation is not very clear, let me know, and we’ll improve it.

I applied these modifications to your model, giving this new one:

import brevitas
import brevitas.nn as qnn
from brevitas.quant import Int8ActPerTensorFloat, Int8WeightPerTensorFloat
from torch import nn


class QDenseClassifier(nn.Module):
    def __init__(self,
                 hparams: dict,
                 bits: int,
                 act_quant: brevitas.quant = Int8ActPerTensorFloat,
                 weight_quant: brevitas.quant = Int8WeightPerTensorFloat):
        super(QDenseClassifier, self).__init__()
        self.hparams = hparams
        # Quantize the network inputs; bit_width must be set explicitly
        self.input_quant = qnn.QuantIdentity(act_quant=act_quant, bit_width=bits)
        self.dense1 = qnn.QuantLinear(hparams['n_feats'], hparams['hidden_dim'], weight_quant=weight_quant, weight_bit_width=bits, bias=True)
        self.dp1 = qnn.QuantDropout(0.1)
        # bit_width is also required for activations; return_quant_tensor is left unset
        self.act1 = qnn.QuantReLU(act_quant=act_quant, bit_width=bits)
        self.dense2 = qnn.QuantLinear(hparams['hidden_dim'], 1, weight_bit_width=bits, weight_quant=weight_quant, bias=True)

    def forward(self, src):
        # Inputs go through the QuantIdentity layer before the first dense layer
        x = self.dense1(self.input_quant(src))
        x = self.dp1(x)
        x = self.act1(x)
        x = self.dense2(x)
        return x

Hope this helps you make progress with your experiments! It seems you are doing fine-tuning. If you have not already done so, I suggest looking at the fine-tuning demo: concrete-ml/use_case_examples/cifar_brevitas_finetuning at release/0.6.x · zama-ai/concrete-ml · GitHub

Good news indeed.
Thanks for the complete answer, I’ll update the code and keep experimenting. I did read the doc on quantization, but apparently I rushed a bit and skipped the part on quantizing inputs.