florian marending

31 Dec 2023

How fast is SQLite?

Plenty fast

I have recently been on a path towards radical simplification of the systems I build. Forgoing Spring Boot for a simple Axum Rust backend is a simple choice for me. Choosing a database is an entirely different story. I have been using PostgreSQL successfully, but operating it has been a bit of a black box. Sure, I just spin up a standard docker container and it works, but I wouldn't want to start straying from the default configuration considering the absolute wealth of features and manual pages there are. So I started looking into SQLite.

Advantages

SQLite is an embedded database, a library if you will. Compared to dedicated DB servers that need to be operated separately, this simplifies the deployment significantly. Additionally, the latency of queries is extremly low, as they are more function call than networked request. For incremental backups, there is the excellent Litestream project that observes the write-ahead log and pushes changes up to an S3 compatible object storage. Right up my alley.

Performance

I conducted some microbenchmarks to gauge the performance of SQLite. There is one table that is declared as follows:

CREATE TABLE metrics(bucket TEXT NOT NULL, date TEXT NOT NULL, data TEXT NOT NULL);

You might notice that the motivating application to try SQLite is a home-grown metrics collection system. For one scenario I also add indices on bucket and date. Next, I ran a bunch of (hopefully) self explanatory scenarios and measured the execution time of the queries in isolation. Read queries are prepared statements of the form:

SELECT * FROM metrics WHERE bucket = ?1 AND rowid = ?2;

Notice that rowid is an implicit column added by SQLite. Write queries:

INSERT INTO metrics (bucket, date, data) VALUES (?1, ?2, ?3);

The benchmarks were run on my local macbook, as well as a linux ARM server and a linux x86 server on Hetzner.

Results

ScenarioThroughput90 percentile duration
Vanilla (read)183'150/s6us
Vanilla (write)4'363/s248us
WAL Mode (read)454'338/s3us
WAL Mode (write)14'401/s37us
WAL + Synchronisation Normal (read)483'558/s3us
WAL + Synchronisation Normal (write)113'684/s8us
WAL + In-memory (read)50'000'000/s0us
WAL + In-memory (write)981'836/s1us
WAL + Index (read)440'917/s3us
WAL + Index (write)47'359/s25us
WAL + Index (mixed, 80% read)197'012/s15us
WAL + Index (mixed 80% write)59'111/s16us
Table 1: M1 Macbook Air results.
ScenarioThroughput90 percentile duration
Vanilla (read)94'786/s12us
Vanilla (write)560/s2'463us
WAL Mode (read)238'322/s6us
WAL Mode (write)3'316/s585us
WAL + Synchronisation Normal (read)275'406/s5us
WAL + Synchronisation Normal (write)46'512/s18us
WAL + In-memory (read)943'396/s1us
WAL + In-memory (write)299'976/s4us
WAL + Index (read)186'358/s7us
WAL + Index (write)19'479/s46us
WAL + Index (mixed, 80% read)75'153/s24us
WAL + Index (mixed 80% write)22'801/s42us
Table 2: Linux Ampere ARM results (Hetzner CAX31).
ScenarioThroughput90 percentile duration
Vanilla (read)73'800/s14us
Vanilla (write)925/s1'484us
WAL Mode (read)332'225/s4us
WAL Mode (write)5'542/s211us
WAL + Synchronisation Normal (read)315'756/s4us
WAL + Synchronisation Normal (write)80'145/s12us
WAL + In-memory (read)952'380/s1us
WAL + In-memory (write)322'695/s3us
WAL + Index (read)315'357/s4us
WAL + Index (write)26'469/s34us
WAL + Index (mixed, 80% read)121'703/s15us
WAL + Index (mixed 80% write)36'260/s26us
Table 3: Linux AMD x86 results (Hetzner CPX31).

Observations

Notice the increase of queries per second across the board once write-ahead log (WAL) mode is turned on and Synchronization is set to Normal. The 90th percentile duration column is indicated in microseconds! Even the slow queries are still incredibly fast. The most "real-world" scenario here is probably WAL + Index (mixed, 80% read). We are looking at a QPS in the ballpark of 100'000 regardless of platform. Plenty fast for my usecases.

Limitations

I found it hard to get any speedup when trying to access a single SQLite DB from multiple threads. Writes are single threaded by nature, not much of a debate there, but reads (at least in WAL mode) should scale well. It's probably down to the library I use for connection pooling. But considering single threaded performance is already so good, I didn't spend to much time here. In Rust lingo, I just use an Arc<Mutex<Connection>> to protect a single connection from concurrent access.