Hello,
I have implemented a bitonic sort algorithm in Concrete and am testing it on a GPU. I get an error when I pass an array of length 128 or more, but it works fine when I am not using the GPU.
The result is not accurate either. I have tested the algorithm on a plaintext array, and it sorts that array correctly. The error is below, followed by the code:
loc("/home/ubuntu/concreteOct06/bitonic_tensorized.py":70:0): error: failed to legalize unresolved materialization from 'tensor<8x!FHE.eint<10>>' to 'tensor<8x3x!TFHE.glwe<sk?>>' that remained live after conversion
Traceback (most recent call last):
  File "/home/ubuntu/concreteOct06/bitonic_tensorized.py", line 83, in
    sort_compile = sort_array.compile(inputset, dataflow_parallelize=parallelize, use_gpu=gpu)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/decorators.py", line 156, in compile
    return self.compiler.compile(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/compiler.py", line 203, in compile
    fhe_module = self._module_compiler.compile(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/module_compiler.py", line 437, in compile
    output = FheModule(
             ^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/module.py", line 759, in init
    self.execution_runtime.init()
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/utils.py", line 58, in init
    self._val = self._init()
                ^^^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/module.py", line 738, in init_execution
    execution_server = Server.create(
                       ^^^^^^^^^^^^^^
  File "/home/ubuntu/conc/lib/python3.12/site-packages/concrete/fhe/compilation/server.py", line 213, in create
    library = compiler.compile(
              ^^^^^^^^^^^^^^^^^
RuntimeError: Lowering from FHE to TFHE failed

import numpy as np

def compare_and_swap_vectorized(arr, j, k, direction):
    n = arr.size
    idx = np.arange(n)
    # partner index of every element for this stage
    ixj = np.bitwise_xor(idx, j)
    # keep each pair once, taken from its lower index
    sel = ixj > idx
    i_sel = idx[sel]
    l_sel = ixj[sel]
    a = arr[i_sel]
    b = arr[l_sel]
    # local dir bit: convert to int (1 where the k bit of the index is clear)
    dir_bit = ((i_sel & k) == 0).astype(np.int64)
    gt = (a > b).astype(np.int64)
    lt = (a < b).astype(np.int64)
    # swap on a > b where dir_bit is 1, on a < b where it is 0
    swap_mask = dir_bit * gt + (1 - dir_bit) * lt
    if direction == 0:
        swap_mask = 1 - swap_mask  # invert mask for descending
    # use integer mask instead of boolean
    arr[i_sel] = swap_mask * b + (1 - swap_mask) * a
    arr[l_sel] = swap_mask * a + (1 - swap_mask) * b
    return arr
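
For anyone who wants to reproduce the plaintext check: the outer loop that calls this function is not shown above, so here is a minimal pure-Python sketch of how the stages are usually chained in a bitonic network. The names `compare_and_swap`, `bitonic_sort` and the `ascending` flag are hypothetical and only for illustration; the loop order (k doubling, j halving) is the standard bitonic schedule, which I am assuming is what the script uses. It does not use Concrete at all and is only meant as a reference to compare decrypted results against.

```python
def compare_and_swap(values, j, k, ascending=True):
    """Apply one bitonic stage to a plain list and return a new list."""
    out = list(values)
    for i in range(len(values)):
        partner = i ^ j
        if partner <= i:
            continue  # handle each pair once, from its lower index
        a, b = values[i], values[partner]
        # Direction of this block comes from the k bit of the lower index.
        up = (i & k) == 0
        swap = (a > b) if up else (a < b)
        if not ascending:
            swap = not swap  # inverted mask, as in the direction == 0 branch
        if swap:
            out[i], out[partner] = b, a
    return out


def bitonic_sort(values, ascending=True):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    k = 2
    while k <= n:          # size of the bitonic blocks being merged
        j = k // 2
        while j >= 1:      # distance between compared elements
            out = compare_and_swap(out, j, k, ascending)
            j //= 2
        k *= 2
    return out


if __name__ == "__main__":
    data = [(i * 37 + 11) % 128 for i in range(128)]
    print(bitonic_sort(data) == sorted(data))
```

Running the same input through the encrypted circuit and comparing against `bitonic_sort` should show at which (k, j) stage the outputs first diverge, if the stage results are checked one at a time.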