Brevitas QAT - Compiling Problem

lstk · November 20, 2023, 2:22pm

I’m currently trying to benchmark a simple Autoencoder model for anomaly detection:

import torch
import brevitas.nn

# input_dim and encoding_dim are defined elsewhere
for n_bits in range(2, 7):

    brevitas_model = torch.nn.Sequential(
        # QuantIdentity layer as entry point
        brevitas.nn.QuantIdentity(bit_width=n_bits, return_quant_tensor=True),
        # Encoder
        brevitas.nn.QuantLinear(input_dim, encoding_dim, bias=True, weight_bit_width=n_bits),
        brevitas.nn.QuantReLU(),
        # Decoder
        brevitas.nn.QuantLinear(encoding_dim, input_dim, bias=True, weight_bit_width=n_bits),
        brevitas.nn.QuantSigmoid(),
    )

    # train and compile the model on train dataset
    ...

    # predict on test dataset
    ...

I attempted to adhere to the instructions outlined in this guide: Step-by-step Guide - Concrete ML

However, I have run into an issue: once n_bits reaches 4, compiling the model consistently exceeds the maximum_integer_bit_width limit of 16.

Is this a problem with my model architecture?

andrei-stoian-zama · November 20, 2023, 2:34pm

Hi,

How large are your input_dim and encoding_dim? With 4-bit weights and activations, you might be hitting the limit on the accumulator size if those values are high enough.
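To make the accumulator argument concrete, here is a rough sketch of a worst-case bound on the accumulator width of a single linear layer. It assumes signed weights, unsigned activations, and that every term of the dot product takes its extreme value; the function name is made up for illustration, and the fan-in of 768 is just an example value. The compiler derives bit widths from the ranges it actually observes on the calibration data, so real values are typically lower than this bound.

```python
def worst_case_accumulator_bits(fan_in: int, weight_bits: int, act_bits: int) -> int:
    """Upper bound on the signed bit width of one dot-product accumulator.

    Assumptions (illustrative, not taken from any library):
    - weights are signed integers in [-2**(w-1), 2**(w-1) - 1]
    - activations are unsigned integers in [0, 2**a - 1]
    - all fan_in products hit their largest magnitude at once
    """
    max_weight = 2 ** (weight_bits - 1)      # largest weight magnitude
    max_act = 2 ** act_bits - 1              # largest activation value
    max_abs_sum = fan_in * max_weight * max_act
    return max_abs_sum.bit_length() + 1      # +1 for the sign bit


if __name__ == "__main__":
    # Example fan-in of 768 inputs, same bit width for weights and activations.
    for n_bits in range(2, 7):
        print(n_bits, worst_case_accumulator_bits(768, n_bits, n_bits))
```

Under these assumptions, a fan-in of 768 already pushes the bound past 16 bits at 4-bit weights and activations, which is consistent with the limit being hit at that setting.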

I would suggest that the best option is to use “rounded activations”, see here. You could probably round to N+2 bits (where N is the number of weight and activation bits).

You could try setting rounding_threshold_bits to N+2 directly in the compile function. Alternatively, you could vary rounding_threshold_bits and evaluate the accuracy you get for each value.
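A minimal sketch of what that could look like is below. It assumes Concrete ML's compile_brevitas_qat_model function and its rounding_threshold_bits keyword; the two helper functions and the calibration_data argument are hypothetical names introduced here, so check the call against the Concrete ML version you have installed.

```python
def rounding_bits(n_bits: int, margin: int = 2) -> int:
    """Rounding threshold of N + margin bits, N being the weight/activation bit width."""
    return n_bits + margin


def compile_with_rounding(brevitas_model, calibration_data, n_bits: int):
    """Compile a Brevitas QAT model with rounded accumulators (hypothetical helper).

    calibration_data is a representative sample of the training inputs.
    """
    # Imported lazily so that rounding_bits can be used without Concrete ML installed.
    from concrete.ml.torch.compile import compile_brevitas_qat_model

    return compile_brevitas_qat_model(
        brevitas_model,
        calibration_data,
        rounding_threshold_bits=rounding_bits(n_bits),
    )
```

To evaluate the accuracy trade-off, the same helper can be called with different margin values and the resulting models compared on a held-out set.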

lstk · November 20, 2023, 2:54pm

Hey Andrei,

The dataset consists of BERT-generated word embeddings.

For the autoencoder, I’m currently using the following dimensions:

input_dim = 768
encoding_dim = input_dim // 10  # equals 76

Thanks for your reply.
I will give rounding_threshold_bits a try.