XGBoost inference's query size

Ronny_Ko · March 26, 2025, 5:58am

Hi,

I tried to measure the size of the client’s encrypted query for XGBClassifier. But even when I increase the features’ n_bits from 6 to 14, the ciphertext returned by .quantize_encrypt_serialize() has the same size. I don’t understand why the query size does not change.

Also, when I use 30 features, the ciphertext returned by .quantize_encrypt_serialize() is 984 bytes. That seems too small, doesn’t it?

andrei-stoian-zama · March 26, 2025, 8:08am

The bitwidth of the ciphertexts does not matter for Concrete ML models: Concrete ML does not use the TFHE-rs TfheInt[8-64] datatypes but so-called “native” ciphertexts. These can store 1 to 27 bits, but using them is more complex, so Concrete ML handles it under the hood. As a result, the ciphertext size is the same for any supported n_bits.
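One way to check this is to serialize the same input with clients built from models trained with different n_bits and compare the byte lengths. A minimal sketch, assuming each client object exposes the .quantize_encrypt_serialize() method mentioned above and that it returns a bytes object; the helper name `query_sizes` and the dictionary layout are hypothetical:

```python
def query_sizes(clients, X):
    """Return the serialized encrypted query size, in bytes, for each client.

    `clients` maps a label (for example the n_bits used at training time)
    to an object exposing quantize_encrypt_serialize(X) -> bytes.
    """
    return {
        label: len(client.quantize_encrypt_serialize(X))
        for label, client in clients.items()
    }
```

If the explanation above holds, every entry of the returned dictionary should be the same number regardless of n_bits.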

With respect to the total size, by default Concrete ML uses compressed ciphertexts: the message and the mask-generation seed are stored instead of the message plus the expanded mask. Thus, only a few bytes are needed for a single ciphertext, so 30 ciphertexts in 984 bytes (about 33 bytes each on average) seems reasonable.

This compression is not available for the output ciphertext vector; inspecting its size should show that it needs about 16-20 KB per individual ciphertext.
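The difference between the two cases can be illustrated with a back-of-the-envelope calculation. This is only a sketch under stated assumptions, not Concrete ML's actual parameters: it assumes 64-bit ciphertext words, a 128-bit seed shared by the whole input vector, and a hypothetical LWE dimension of 2048 picked so that the result falls in the range quoted above.

```python
WORD_BYTES = 8   # assumption: each ciphertext element is a 64-bit word
SEED_BYTES = 16  # assumption: one 128-bit mask-generation seed per vector


def expanded_size(lwe_dimension, count=1):
    """Bytes for `count` ciphertexts that store the full mask plus the body."""
    return count * (lwe_dimension + 1) * WORD_BYTES


def seeded_size(count=1):
    """Bytes for `count` ciphertexts that store one body each plus a shared seed."""
    return count * WORD_BYTES + SEED_BYTES


n = 2048  # hypothetical LWE dimension, for illustration only

print(expanded_size(n))  # 16392 bytes, i.e. about 16 KB per output ciphertext
print(seeded_size(30))   # 256 bytes for 30 compressed input ciphertexts
print(984 / 30)          # 32.8 bytes per input ciphertext in the measured query
```

The measured 984 bytes also includes whatever framing the serializer adds, which this sketch does not model, so the estimate is only a lower bound for the compressed case.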