## Overview
TFHE-rs is a pure Rust implementation of TFHE for Boolean and int…eger arithmetics over encrypted data.
With this bounty, we are asking users to implement an SQL encrypted query on a clear database.
Here's a simplified overview of how FHE can be used to implement an SQL encrypted query on a clear database:
**1. Query Encryption**
The first step involves the user encrypting their SQL query using their FHE secret key before sending it to the server hosting the database. The encryption ensures that the server cannot see the actual query, only its encrypted form.
**2. Processing Encrypted Query on Clear Database**
The server then processes this encrypted query on the clear database. FHE computations mixing clear and encrypted data will return encrypted data, it means the server can perform the necessary SQL operations (like SELECT, JOIN, etc.) using the clear data as well as the encrypted query. The result of this operation is an encrypted version of the query result.
**3. Decrypting the Result**
Finally, the encrypted result is sent back to the user, who can decrypt it using their private key to obtain the clear text result of the SQL query.
We ask that you, the hunters, implement a database that can manipulate signed and unsigned integers (8, 16, 32, 64 bits), booleans and short strings (32 ASCII characters) as possible data types for a column.
As we are not looking at a classical database use case, we don’t expect the database to be resilient to crashes, be highly available etc. it is merely a datastore addressed in a similar way to a database via an encrypted SQL query.
## How to participate?
1️⃣ <a href="https://zama.ai/bounty-and-grant-program" target="_blank">Register here</a>.
2️⃣ When ready, submit your code [here](https://zama.ai/bounty-program).
🗓️ Submission deadline: May 12th, 2024.
## Description
### Expected Work
The implementation should live in its own crate (and not in tfhe-rs/tfhe/example) and depend on either tfhe-rs 0.5.x or the main branch of tfhe-rs.
We ask that your crate passes the clippy checks when treating warnings as errors.
```
cargo clippy -- --no-deps -D warnings
```
We ask you to provide both an executable and an API.
### Executable
For the clear database storage you can use existing databases or tools but we require that you manage to load a database from a directory with the following structure:
db_dir
- table_1.csv
- table_2.csv
example of a table_content.csv:
```
column_1_name:uint32,colum_2_name:bool,column_3_name:string
0,true,"first line"
123,false,"some other line"
```
The executable takes as input a database to load with the format discussed earlier and a file containing the SQL query in plaintext to avoid dealing with escaping quotes on the command line, we **require** the following format for the output:
```console
$ cargo run --release -- --input-db /path/to/db_dir --query-file query.txt
Runtime: 42.24 s
Clear DB query result: (some result)
Encrypted DB query result: (some result)
Results match: YES
```
erroneous results will automatically disqualify a submission.
### API
You are free to use the High Level API with CPU or GPU or the integer API with CPU or GPU.
We only expect you to implement ONE API, implementing more APIs won’t improve you submission’s classification, correctness is mandatory, incorrect implementations will be disqualified, performance will be the main differentiator for submissions.
For the encryption we ask that you use a [Select query from the sqlparser crate](https://docs.rs/sqlparser/latest/sqlparser/ast/struct.Select.html) as input. You may choose another type if you explain how it's a better choice. You must return your own EncryptedQuery representation that can be used on the device of your choice.
There are 3 possible APIs to implement
- CPU Integer API (uses tfhe::integer::ServerKey, an EncryptedQuery, a Table struct representing the clear database),
- GPU Integer API (uses tfhe::integer::CudaServerKey and a CudaEncryptedQuery, a Table struct representing the clear database)
- High-Level API CPU/GPU (uses tfhe::ServerKey, EncryptedQuery built on HL API types, a Table struct representing the clear database)
The APIs are specified in detail below, choose the one(s) you wish to implement (only implement methods of the API you chose)
#### CPU Integer API
```rust
fn default_cpu_parameters() -> PBSParameters;
fn encrypt_query(query: sqlparser::ast::Select) -> EncryptedQuery;
/// Loads a directory with a structure described above in a format
/// that works for your implementation of the encryted query
fn load_tables(path) -> Tables
/// # Inputs:
/// - sks: The server key to use
/// - input: your EncryptedQuery
/// - tables: the plain data you run the query on
///
/// # Output
/// - EncryptedResult
fn run_fhe_query(
sks: &tfhe::integer::ServerKey,
input: &EncryptedQuery,
data: &Tables,
) -> EncrypedResult;
/// The output of this function should be a string using the CSV format
/// You should provide a way to compare this string with the output of
/// the clear DB system you use for comparison
fn decrypt_result(clientk_key: &ClientKey, result: &EncryptedResult) -> String
```
#### GPU Integer API
```rust
fn default_gpu_parameters() -> PBSParameters;
fn encrypt_query_gpu(query: sqlparser::ast::Select) -> CudaEncryptedQuery;
/// Loads a directory with a structure described above in a format
/// that works for your implementation of the encryted query
fn load_tables(path) -> Tables
/// # Inputs:
/// - sks: The server key to use
/// - input: your CudaEncryptedQuery
/// - tables: the plain data you run the query on
///
/// # Output
/// - EncryptedResult
fn run_fhe_query_gpu(
sks: &tfhe::integer::ServerKey,
input: &CudaEncryptedQuery,
data: &Tables,
) -> EncrypedResult;
/// The output of this function should be a string using the CSV format
/// You should provide a way to compare this string with the output of
/// the clear DB system you use for comparison
fn decrypt_result_gpu(clientk_key: &ClientKey, result: &CudaEncryptedResult) -> String
```
#### High-Level API
```rust
/// Returns parameters and the device (CudaGpu/Cpu) on which the function /// should be ran
fn default_parameters() -> (PBSParameters, tfhe::Device);
fn load_tables(path) -> Tables
fn encrypt_query(query: String) -> EncryptedQuery;
// The server key is provided to allow setting up threads if needed
fn run_fhe_query(
sks: &tfhe::integer::ServerKey,
input: &EncryptedQuery,
data: &Tables,
) -> EncrypedResult;
fn decrypt_result(clientk_key: &ClientKey, result: &EncrypedResult) -> String
```
### SQL support
We ask to be able to run the following encrypted queries:
Requested:
SQL Select: https://www.w3schools.com/sql/sql_select.asp
SQL Select Distinct: https://www.w3schools.com/sql/sql_distinct.asp
SQL Where: https://www.w3schools.com/sql/sql_where.asp
SQL And: https://www.w3schools.com/sql/sql_and.asp
SQL Or: https://www.w3schools.com/sql/sql_or.asp
SQL Not: https://www.w3schools.com/sql/sql_not.asp
SQL In: https://www.w3schools.com/sql/sql_in.asp
SQL Between: https://www.w3schools.com/sql/sql_between.asp
For integer values we expect you to manage the <, <=, >, >=, = operators
For strings we expect you to manage the = operator only
Note that the != (not equal) operator is built via the NOT operator of SQL and an = operator.
Note that some operators may re-use other implementations, for example the IN operator is indicated to be equivalent to multiple OR operators.
Bonus API:
SQL Joins: https://www.w3schools.com/sql/sql_join.asp
The format of the encrypted query and result are free, the evaluation of this bounty will be on speed and correctness on the Requested APIs, bonus APIs are NOT required and would only be used as a last resort to differentiate two submissions.
### Other Details
#### Benchmarking
Implementations will be benchmarked on the following AWS hardware:
CPU: hpc7a.96xlarge (192 vCPU, 768 GB Memory)
GPU: p3.2xlarge (note that TFHE-rs only supports single GPU execution for now) (1 Tesla V100, 16 GB GPU Memory)
#### Parameters
The allowed parameters are:
PARAM_MESSAGE_2_CARRY_2_KS_PBS,
PARAM_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_2_KS_PBS,
PARAM_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS
Your solution must be able to run with the multi bit and non multi bit parameter sets, here it should be straightforward as the message and carry moduli are the same.
Note that the multi bit parameters should be faster on GPU, namely the group 3 should be the fastest. It could be the case on CPU as well but it can saturate even the largest CPU machines with threads and performance suffers as a consequence.
#### Turtle shell queries
As Mario Kart has its turtle shells, you will be able, if you want to, to provide one example of a small DB (with the directory and csv structure discussed earlier) and a “turtle shell” query you feel is very hard/has corner cases alongside your submission, we will run the DB and the query on the submissions we gather, corner cases beware! The performance and correctness on these hunter provided use cases will be used for the evaluation of the bounty, in addition to cases we’ll choose ourselves, we ask that the query runs in under 5 minutes on a laptop/commodity hardware, though our benchmarking machine are larger we want to keep runtimes reasonable when possible.
Good luck!
## Reward
### 🥇Best submission: up to €5,000
To be considered best submission, a solution must be efficient, effective and demonstrate a deep understanding of the core problem. Alongside the technical correctness, it should also be submitted with a clean code, clear explanations and a complete documentation.
### 🥈Second-best submission: up to €3,000
For a solution to be considered the second best submission, it should be both efficient and effective. The code should be neat and readable, while its documentation might not be as exhaustive as the best submission, it should cover the key aspects of the solution.
### 🥉Third-best submission: up to €2,000
The third best submission is one that presents a solution that effectively tackles the challenge at hand, even if it may have certain areas of improvement in terms of efficiency or depth of understanding. Documentation should be present, covering the essential components of the solution.
_Reward amounts are decided based on code quality and speed performance on a m6i.metal AWS server._
## Related links and references
- [SQL Select](https://www.w3schools.com/sql/sql_select.asp)
- [Rust sqlparser](https://docs.rs/sqlparser/latest/sqlparser/)
- [TFHE-rs documentation](https://docs.zama.ai/tfhe-rs)
- [TFHE-rs rust API docs](https://docs.rs/tfhe/latest/tfhe)
- [TFHE-rs repository](https://github.com/zama-ai/tfhe-rs)
## Questions?
Do you have a specific question about this bounty? Simply comment this issue and our team will get back to you!