31 Dec 2023

How fast is SQLite?

Plenty fast

I have recently been on a path towards radical simplification of the systems I build. Forgoing Spring Boot for a simple Axum Rust backend is a simple choice for me. Choosing a database is an entirely different story. I have been using PostgreSQL successfully, but operating it has been a bit of a black box. Sure, I just spin up a standard docker container and it works, but I wouldn’t want to start straying from the default configuration considering the absolute wealth of features and manual pages there are. So I started looking into SQLite.

Advantages

SQLite is an embedded database, a library if you will. Compared to dedicated DB servers that need to be operated separately, this simplifies the deployment significantly. Additionally, the latency of queries is extremly low, as they are more function call than networked request. For incremental backups, there is the excellent Litestream project that observes the write-ahead log and pushes changes up to an S3 compatible object storage. Right up my alley.

Performance

I conducted some microbenchmarks to gauge the performance of SQLite. There is one table that is declared as follows:

CREATE TABLE metrics(bucket TEXT NOT NULL, date TEXT NOT NULL, data TEXT NOT NULL);

You might notice that the motivating application to try SQLite is a home-grown metrics collection system. For one scenario I also add indices on bucket and date. Next, I ran a bunch of (hopefully) self explanatory scenarios and measured the execution time of the queries in isolation. Read queries are prepared statements of the form:

SELECT * FROM metrics WHERE bucket = ?1 AND rowid = ?2;

Notice that rowid is an implicit column added by SQLite. Write queries:

INSERT INTO metrics (bucket, date, data) VALUES (?1, ?2, ?3);

The benchmarks were run on my local macbook, as well as a linux ARM server and a linux x86 server on Hetzner.

Results

Scenario	Throughput	90 percentile duration
Vanilla (read)	183’150/s	6us
Vanilla (write)	4’363/s	248us
WAL Mode (read)	454’338/s	3us
WAL Mode (write)	14’401/s	37us
WAL + Synchronisation Normal (read)	483’558/s	3us
WAL + Synchronisation Normal (write)	113’684/s	8us
WAL + In-memory (read)	50’000’000/s	0us
WAL + In-memory (write)	981’836/s	1us
WAL + Index (read)	440’917/s	3us
WAL + Index (write)	47’359/s	25us
WAL + Index (mixed, 80% read)	197’012/s	15us
WAL + Index (mixed 80% write)	59’111/s	16us

Table 1: M1 Macbook Air results

Scenario	Throughput	90 percentile duration
Vanilla (read)	94’786/s	12us
Vanilla (write)	560/s	2’463us
WAL Mode (read)	238’322/s	6us
WAL Mode (write)	3’316/s	585us
WAL + Synchronisation Normal (read)	275’406/s	5us
WAL + Synchronisation Normal (write)	46’512/s	18us
WAL + In-memory (read)	943’396/s	1us
WAL + In-memory (write)	299’976/s	4us
WAL + Index (read)	186’358/s	7us
WAL + Index (write)	19’479/s	46us
WAL + Index (mixed, 80% read)	75’153/s	24us
WAL + Index (mixed 80% write)	22’801/s	42us

Table 2: Linux Ampere ARM results (Hetzner CAX31)

Scenario	Throughput	90 percentile duration
Vanilla (read)	73’800/s	14us
Vanilla (write)	925/s	1’484us
WAL Mode (read)	332’225/s	4us
WAL Mode (write)	5’542/s	211us
WAL + Synchronisation Normal (read)	315’756/s	4us
WAL + Synchronisation Normal (write)	80’145/s	12us
WAL + In-memory (read)	952’380/s	1us
WAL + In-memory (write)	322’695/s	3us
WAL + Index (read)	315’357/s	4us
WAL + Index (write)	26’469/s	34us
WAL + Index (mixed, 80% read)	121’703/s	15us
WAL + Index (mixed 80% write)	36’260/s	26us

Table 3: Linux AMD x86 results (Hetzner CPX31)

Observations

Notice the increase of queries per second across the board once write-ahead log (WAL) mode is turned on and Synchronization is set to Normal. The 90th percentile duration column is indicated in microseconds! Even the slow queries are still incredibly fast. The most “real-world” scenario here is probably WAL + Index (mixed, 80% read). We are looking at a QPS in the ballpark of 100'000 regardless of platform. Plenty fast for my usecases.

Limitations

I found it hard to get any speedup when trying to access a single SQLite DB from multiple threads. Writes are single threaded by nature, not much of a debate there, but reads (at least in WAL mode) should scale well. It’s probably down to the library I use for connection pooling. But considering single threaded performance is already so good, I didn’t spend to much time here. In Rust lingo, I just use an Arc<Mutex<Connection>> to protect a single connection from concurrent access.