Brevitas QAT - Compiling Problem

I’m currently trying to benchmark a simple Autoencoder model for anomaly detection:

import brevitas.nn
import torch

for n_bits in range(2, 7):
    brevitas_model = torch.nn.Sequential(
        # QuantIdentity layer as entry point
        brevitas.nn.QuantIdentity(bit_width=n_bits, return_quant_tensor=True),
        # Encoder
        brevitas.nn.QuantLinear(input_dim, encoding_dim, bias=True, weight_bit_width=n_bits),
        # Decoder
        brevitas.nn.QuantLinear(encoding_dim, input_dim, bias=True, weight_bit_width=n_bits),
    )

    # train and compile the model on train dataset

    # predict on test dataset

I attempted to adhere to the instructions outlined in this guide: Step-by-step Guide - Concrete ML

However, I seem to encounter an issue. When reaching n_bits=4 and compiling the model, it consistently exceeds the maximum_integer_bit_width limit of 16.

Is this a problem with my model architecture?


How large are your input_dim and encoding_dim? With 4-bit weights and activations you might be hitting the limit on the accumulator size if those values are high enough.

I would suggest that the best option is to use “rounded activations”, see here. You could probably round to N+2 bits (where N is the number of weight and activation bits).

You could try setting rounding_threshold_bits in the compile function directly to the number of weight and activation bits plus 2. Or you could evaluate the accuracy you get when varying the rounding threshold bits.
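A minimal sketch of that suggestion, assuming the model is compiled with Concrete ML's compile_brevitas_qat_model, which accepts a rounding_threshold_bits argument. The helper names rounding_bits_for and compile_with_rounding are made up for illustration, and the N+2 offset is only the starting point mentioned above, not a tuned value.

```python
def rounding_bits_for(n_bits, extra_bits=2):
    """Rounding threshold suggested above: weight/activation bits plus 2."""
    return n_bits + extra_bits


def compile_with_rounding(model, inputset, n_bits):
    """Compile a Brevitas QAT model with rounded accumulators.

    `model` and `inputset` are whatever the training loop already produced;
    the import is done lazily so this helper can be defined without
    Concrete ML installed.
    """
    from concrete.ml.torch.compile import compile_brevitas_qat_model

    return compile_brevitas_qat_model(
        model,
        inputset,
        rounding_threshold_bits=rounding_bits_for(n_bits),
    )
```

To see how accuracy depends on the threshold, the same helper could be called with a few different extra_bits values and the resulting modules compared on the test set.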

Hey Andrei,

The dataset consists of BERT-generated word embeddings.

For the autoencoder I’m currently using the following dimensions:

input_dim = 768
encoding_dim = input_dim//10 # equals 76

Thanks for your reply.
I will give rounding_threshold_bits a try.
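For reference, the accumulator concern raised earlier in the thread can be checked with a rough worst-case estimate for these dimensions. This is only a sketch: it assumes signed weights and signed activations, both at n_bits, and the function name worst_case_accumulator_bits is made up for illustration. Concrete ML derives the actual bit widths from the input set used at compile time, so the real values can differ from this bound.

```python
def worst_case_accumulator_bits(n_terms, weight_bits, act_bits):
    """Upper bound on the signed bit width of a dot-product accumulator.

    Assumes signed weights and activations, so each magnitude is at most
    2 ** (bits - 1), and that every one of the n_terms products hits that
    maximum with the same sign.
    """
    max_abs_sum = n_terms * 2 ** (weight_bits - 1) * 2 ** (act_bits - 1)
    # Bits for the magnitude plus one sign bit.
    return max_abs_sum.bit_length() + 1


input_dim = 768
encoding_dim = input_dim // 10  # 76

for n_bits in range(2, 7):
    encoder_bits = worst_case_accumulator_bits(input_dim, n_bits, n_bits)
    print(f"n_bits={n_bits}: encoder accumulator <= {encoder_bits} bits")
```

Under these assumptions the encoder layer, which sums over 768 inputs, stays at 15 bits for n_bits=3 but reaches 17 bits for n_bits=4, which lines up with the 16-bit limit being exceeded from n_bits=4 onward.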