build · oxidb v0.25.21 0 entries on disk
The /dev/oxide

A build log on shipping OxiDB — notes, post-mortems, and the occasional flame war about JSON parsing, pressed straight onto an embedded engine running inside this process.

posts/0008.md · 2026-05-05

indexes — single-field, composite, unique, TTL, and when each pays

Indexes are `BTreeMap<IndexValue, BTreeSet<DocumentId>>` behind a `RwLock`. That structure carries the whole feature set:
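
For orientation, here is a minimal sketch of that shape. The `FieldIndex` wrapper, its method names, and this cut-down `IndexValue` are hypothetical stand-ins for illustration, not OxiDB's actual types.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::RwLock;

// Hypothetical stand-ins: the real DocumentId and IndexValue are richer.
type DocumentId = u64;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum IndexValue {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
}

#[derive(Default)]
struct FieldIndex {
    map: RwLock<BTreeMap<IndexValue, BTreeSet<DocumentId>>>,
}

impl FieldIndex {
    /// Record that document `id` holds `value` in the indexed field.
    fn insert(&self, value: IndexValue, id: DocumentId) {
        let mut map = self.map.write().unwrap();
        map.entry(value).or_default().insert(id);
    }

    /// Drop one (value, id) pair and prune the key once its set is empty.
    fn remove(&self, value: &IndexValue, id: DocumentId) {
        let mut map = self.map.write().unwrap();
        let now_empty = match map.get_mut(value) {
            Some(ids) => {
                ids.remove(&id);
                ids.is_empty()
            }
            None => false,
        };
        if now_empty {
            map.remove(value);
        }
    }

    /// Point lookup: ids come back in ascending order because the set is ordered.
    fn lookup(&self, value: &IndexValue) -> Vec<DocumentId> {
        let map = self.map.read().unwrap();
        map.get(value)
            .map(|ids| ids.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

Readers share the lock and writers take it exclusively, so a lookup never observes a half-updated set.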

**Single-field.** Point lookup is an `O(log n)` map probe plus an `O(1)` read of the set head. The bigger win is index-backed sort: a `find` with `sort:{created_at:-1}` iterates the BTreeMap in reverse, which is `O(limit)` once the iterator is positioned, instead of the full scan plus `O(n log n)` sort you'd need without an index. That's how the archive page on this blog returns in ~100 µs regardless of how many posts you've written.
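
A sketch of the index-backed sort, assuming the `created_at` index is keyed by epoch milliseconds. The function name `newest_first` is made up for illustration, and the real query planner surely does more than this.

```rust
use std::collections::{BTreeMap, BTreeSet};

type DocumentId = u64; // hypothetical id type

/// Newest-first page of ids from an index keyed by `created_at` (epoch ms).
/// The iterator chain is lazy, so it stops after `limit` ids instead of
/// visiting the whole map.
fn newest_first(
    index: &BTreeMap<i64, BTreeSet<DocumentId>>,
    limit: usize,
) -> Vec<DocumentId> {
    index
        .iter()
        .rev() // walk keys from largest to smallest
        .flat_map(|(_, ids)| ids.iter().copied())
        .take(limit)
        .collect()
}
```

Documents that share a timestamp come out in ascending id order here, which is just a property of `BTreeSet`, not a claim about OxiDB's tie-breaking.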

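**Composite.** Multi-field B-tree where each key is a tuple `(field1, field2, ...)`. Prefix scans work: `{city: "X", age: {$gte: 18}}` on a `(city, age)` index hits a tight range. The order of fields in the index matters, because only a leading prefix of the tuple can bound the range, so a filter on `age` alone gets no tight range from `(city, age)`. The order of fields in the query doesn't matter.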
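
The prefix scan can be sketched with a tuple-keyed `BTreeMap` and a range query. The `CityAgeIndex` alias and `city_min_age` are hypothetical names, and using `i64::MAX` as the upper bound is one simple way to say "any age from here up" under the assumption that ages are stored as `i64`.

```rust
use std::collections::{BTreeMap, BTreeSet};

type DocumentId = u64; // hypothetical id type
type CityAgeIndex = BTreeMap<(String, i64), BTreeSet<DocumentId>>;

/// Equivalent of `{city: <city>, age: {$gte: <min_age>}}` on a (city, age) index.
/// Tuples compare field by field, so every key for one city is contiguous
/// and the scan never leaves that city's slice of the tree.
fn city_min_age(index: &CityAgeIndex, city: &str, min_age: i64) -> Vec<DocumentId> {
    let lo = (city.to_string(), min_age);
    let hi = (city.to_string(), i64::MAX);
    index
        .range(lo..=hi)
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect()
}
```

Flip the tuple to `(age, city)` and the same query has to walk every age from 18 up across all cities, which is the field-order point in practice.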

**Unique.** Same shape as single-field but inserts that would create a duplicate raise `Error::Unique` at write time — before the doc lands. The admins collection in this blog uses one on `username` so two workers can't seed the same default admin during a race.
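
A minimal sketch of why the race resolves cleanly, assuming the check and the insert happen under one write lock. `UniqueIndex` is a hypothetical type; only the `Error::Unique` variant name comes from the engine, and its real payload (if any) is not shown here.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

type DocumentId = u64; // hypothetical id type

#[derive(Debug, PartialEq)]
enum Error {
    Unique, // variant name from the post; any payload is omitted
}

#[derive(Default)]
struct UniqueIndex {
    map: RwLock<BTreeMap<String, DocumentId>>,
}

impl UniqueIndex {
    /// Check and insert under a single write lock, so two writers cannot
    /// both pass the duplicate check for the same key.
    fn insert(&self, key: &str, id: DocumentId) -> Result<(), Error> {
        let mut map = self.map.write().unwrap();
        if map.contains_key(key) {
            return Err(Error::Unique);
        }
        map.insert(key.to_string(), id);
        Ok(())
    }
}
```

Whichever worker takes the lock first wins; the second sees the key and gets the error before any document is written.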

**TTL.** A regular index plus an `expireAfterSeconds` knob. A background scanner walks the index in date order (TTL documents are always dated) and deletes anything past the expiry. Index-backed scanning means it never touches live documents; it only loads the ones it's about to delete. The SQL surface is `CREATE TTL INDEX ... EXPIRE AFTER N`.
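
One sweep of such a scanner might look like the sketch below. It assumes the index is keyed by epoch milliseconds and treats a document as expired once its timestamp is strictly older than `now - expireAfterSeconds`. The function name and the exact boundary rule are assumptions, not the engine's documented behavior.

```rust
use std::collections::{BTreeMap, BTreeSet};

type DocumentId = u64; // hypothetical id type

/// One sweep pass: remove and return the ids whose timestamp is older
/// than the cutoff. Only the expired end of the index is touched.
fn sweep_expired(
    index: &mut BTreeMap<i64, BTreeSet<DocumentId>>,
    now_ms: i64,
    expire_after_seconds: i64,
) -> Vec<DocumentId> {
    let cutoff = now_ms - expire_after_seconds * 1000;
    // split_off leaves keys < cutoff in `index` and returns keys >= cutoff.
    let live = index.split_off(&cutoff);
    // Swap the live half back in; what comes out is the expired half.
    let expired = std::mem::replace(index, live);
    expired.into_values().flatten().collect()
}
```

The returned ids are what the scanner would then delete from the collection itself.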

**Index-only count.** When a query is fully satisfiable by an index (no projection needed, no post-filter operators), `count` returns the set size without touching documents at all. A million-document collection with a covering index returns `count({active: true})` in microseconds.
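
For an equality filter, the index-only path reduces to reading a set's length. A sketch, assuming an index on a boolean `active` field and a hypothetical `count_eq` helper:

```rust
use std::collections::{BTreeMap, BTreeSet};

type DocumentId = u64; // hypothetical id type

/// Equivalent of `count({active: <value>})` answered from the index alone:
/// one map probe, then the set's stored length. No document is loaded.
fn count_eq(index: &BTreeMap<bool, BTreeSet<DocumentId>>, value: bool) -> usize {
    index.get(&value).map_or(0, |ids| ids.len())
}
```

`BTreeSet::len` is constant time, so the cost is the `O(log n)` probe no matter how many ids sit under the key.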

**Cross-type ordering.** `IndexValue` enforces a stable total order across types: Null < Bool < Number < DateTime < String. Date strings (ISO 8601, RFC 3339, YYYY-MM-DD) are auto-detected and stored as epoch milliseconds — so comparison on a date field is integer comparison, not lexicographic string comparison. That alone moved time-range queries from 'unusable' to 'fastest path in the system'.
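
A sketch of how that ordering can fall out of a derived `Ord`, where variant declaration order becomes the cross-type order. Everything here is an illustration under assumptions: `Number` is narrowed to `i64` so `Ord` can be derived, the hypothetical `index_string` helper detects only the plain `YYYY-MM-DD` form (not full ISO 8601 or RFC 3339), and the day-count arithmetic is Howard Hinnant's well-known civil-date algorithm rather than anything taken from OxiDB.

```rust
// Derived Ord compares by variant position first, then by payload,
// giving Null < Bool < Number < DateTime < String.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum IndexValue {
    Null,
    Bool(bool),
    Number(i64),   // assumption: integers only, so Ord can be derived
    DateTime(i64), // epoch milliseconds
    String(String),
}

/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = if y >= 0 { y } else { y - 399 } / 400;
    let yoe = y - era * 400; // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, March-based
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Store `YYYY-MM-DD` strings as epoch milliseconds; anything else stays a string.
fn index_string(s: &str) -> IndexValue {
    let parts: Vec<&str> = s.split('-').collect();
    if let [y, m, d] = parts.as_slice() {
        if y.len() == 4 && m.len() == 2 && d.len() == 2 {
            if let (Ok(y), Ok(m), Ok(d)) =
                (y.parse::<i64>(), m.parse::<i64>(), d.parse::<i64>())
            {
                if (1..=12).contains(&m) && (1..=31).contains(&d) {
                    return IndexValue::DateTime(days_from_civil(y, m, d) * 86_400_000);
                }
            }
        }
    }
    IndexValue::String(s.to_string())
}
```

Once a date lands in `DateTime(i64)`, a time-range query is a plain integer range over the B-tree keys.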