Final Project · Data Storages

HexRide.

A ride-matching backend that does Uber-scale geo-queries on vanilla Postgres. No PostGIS. Just hexagons and bigints.
Author Nikita
Course Data Storages — 6 ECTS
Stack PostgreSQL 16 · Node · H3
01 · The pitch

Riders request trips. Drivers move around Barcelona. The server matches them in real time using Uber's H3 hex grid — but stored as plain bigints in vanilla Postgres.

Payments run through a stubbed Stripe-style provider with idempotent webhooks enforced at the schema level.

No PostGIS   ·   no h3-pg extension   ·   no ORM   ·   just Postgres + smart indexing
02 · The grid

Two resolutions.

H3 has 16 resolutions (0–15). Each step down subdivides a hex into seven children (aperture-7). Pick two: one for matching, one for analytics.

Resolution 9
~175m edge
Pickup matching · single block
Resolution 7
~1.2km edge
Heatmaps · district-level analytics

Both computed in app code from the same (lat, lng). Stored as bigint columns on every spatial table.
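The dual-resolution layout might look like this in DDL. A sketch only: current_h3_9 comes from the matching query and h3_7 from the ping index described later; the other column names are assumptions, not the project's actual schema.

```sql
-- Sketch: spatial columns on drivers (non-spatial columns elided).
-- current_h3_9 appears in the matching query; current_h3_7 is assumed
-- by analogy with driver_pings(h3_7, ts).
CREATE TABLE drivers (
  id            uuid PRIMARY KEY,
  user_id       uuid NOT NULL REFERENCES users(id),
  status        text NOT NULL DEFAULT 'offline',
  current_lat   double precision,
  current_lng   double precision,
  current_h3_9  bigint,       -- res-9 cell, ~175m edge: pickup matching
  current_h3_7  bigint,       -- res-7 cell, ~1.2km edge: heatmaps
  last_seen_at  timestamptz
);
```

Both bigints are written in the same UPDATE whenever a ping arrives; reads never compute H3 on the fly.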

03 · The trick

An H3 cell ID is just a 64-bit integer. So “find drivers near me” becomes a btree index lookup — not a geometry operation.

latLng
41.3851, 2.1734
H3 cell @ res 9
617733122422964223
SQL
WHERE h3_9 = ANY($1)

No ST_DWithin. No GIST. No PostGIS. Just bigint = bigint, indexed.
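h3-js v4 hands back cell IDs as lowercase hex strings, while a Postgres bigint parameter wants a decimal string. A minimal conversion sketch — the sample cell ID below is hypothetical, not a computed value:

```typescript
// h3-js v4 cell IDs are hex strings; node-postgres bigint params are
// decimal strings. BigInt converts losslessly in both directions.
function cellToBigintParam(cell: string): string {
  return BigInt("0x" + cell).toString(10);
}

function bigintToCell(value: string): string {
  return BigInt(value).toString(16);
}

// Round-trip with a hypothetical res-9 cell ID:
const cell = "89394460003ffff";
const param = cellToBigintParam(cell);
console.log(bigintToCell(param) === cell); // true
```

Using `Number` instead of `BigInt` here would silently corrupt IDs: H3 indexes exceed 2^53, the limit of exact integer representation in a JS double.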

04 · Data model

Eight tables, three domains.

Identity

  • users base
  • drivers 1:1 user
  • riders 1:1 user

Trips

  • trips lifecycle
  • driver_pings partitioned
  • trip_events audit log

Payments

  • payment_methods soft-delete
  • payments idempotent

H3 columns at both res 9 and res 7 on every spatial table — drivers, trips, pings, events. Computed in app, written once, indexed twice.
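driver_pings is described as partitioned; a plausible range-partitioning sketch. The partition key, interval, and column names beyond h3_7/ts are assumptions:

```sql
-- Sketch: range-partition ping history by day, so old partitions can be
-- detached or dropped without bloating the live table.
CREATE TABLE driver_pings (
  driver_id uuid NOT NULL,
  lat       double precision NOT NULL,
  lng       double precision NOT NULL,
  h3_9      bigint NOT NULL,
  h3_7      bigint NOT NULL,
  ts        timestamptz NOT NULL
) PARTITION BY RANGE (ts);

CREATE TABLE driver_pings_2025_01_01 PARTITION OF driver_pings
  FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');
```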

05 · The matching query

The centerpiece.

WITH candidate_hexes AS (
  SELECT unnest($1::bigint[]) AS h3_9
)
SELECT d.id
FROM drivers d
JOIN candidate_hexes c
  ON d.current_h3_9 = c.h3_9
WHERE d.status = 'available'
ORDER BY d.last_seen_at DESC
LIMIT 1
FOR UPDATE SKIP LOCKED;
L1–3 · gridDisk
App computes gridDisk(pickup, k=2) in JS — array of nearby hexes, ~350m radius.
L7 · the index hit
Equality on current_h3_9 uses the partial btree. Plan: Index Scan, never Seq Scan.
L9 · last_seen_at DESC
Prefer recently-pinged drivers. Proxy for “actually online,” not just status = available.
L11 · the magic
SKIP LOCKED means concurrent matchers grab different drivers instead of blocking on the same row.
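The SELECT above only locks a candidate; actually assigning the driver is a second step inside the same transaction, while the row lock is still held. A sketch — the status value 'assigned', the state value 'matched', and the driver_id/matched_at columns are assumptions:

```sql
BEGIN;

-- 1. Run the matching query above; suppose it returns one driver id ($1).
--    The FOR UPDATE row lock is held until COMMIT.

-- 2. Claim the driver and the trip under that lock.
UPDATE drivers
SET status = 'assigned'            -- assumed status value
WHERE id = $1;

UPDATE trips
SET state = 'matched',             -- assumed state value
    driver_id = $1,
    matched_at = now()
WHERE id = $2 AND state = 'requested';

COMMIT;  -- lock released; concurrent matchers skipped this row anyway
```

Keeping both UPDATEs in the locking transaction is what turns SKIP LOCKED into a correctness guarantee rather than just a performance trick.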
06 · Why SKIP LOCKED matters

100 riders. Zero double-bookings.

FOR UPDATE (alone)

Tx A locks driver_42
Tx B waits…
Tx C waits…
Tx D waits…
↓ A commits
Tx B re-evaluates driver_42, finds it busy, its LIMIT 1 comes up empty
Tx C waits…
→ Throughput dies. Every matcher serializes on one hot row, and retry logic moves back into the app.

FOR UPDATE SKIP LOCKED

Tx A locks driver_42
Tx B skips it, locks driver_87
Tx C skips both, locks driver_12
Tx D skips three, locks driver_03
↓ all commit in parallel
4 riders, 4 drivers, 0 conflicts
→ Linear throughput. Correctness guaranteed by Postgres, not the app.
07 · Indexes that matter

Five indexes. No GIST.

PARTIAL BTREE
drivers(current_h3_9) WHERE status = 'available'
Most rows are offline. The index contains only the drivers that matching ever reads — small, hot, cache-friendly.
PARTIAL BTREE
trips(state, requested_at) WHERE state = 'requested'
Pending-trip retry worker scans only unmatched rows. Completed trips don't pollute the index.
COMPOSITE BTREE
driver_pings(h3_7, ts)
Time-bucketed analytics — “density per district last 5min” — without scanning ping history.
PARTIAL UNIQUE
payment_methods(rider_id) WHERE is_default AND deleted_at IS NULL
“Exactly one default per rider” enforced in SQL, not in app code. Zero race conditions possible.
UNIQUE BTREE
payments(provider_payment_id)
The linchpin of webhook idempotency. Makes the ON CONFLICT upsert a one-line correctness proof.
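Spelled out as DDL, the five indexes might read as follows. Table and column names follow the descriptions above; the index names themselves are made up:

```sql
-- 1. Partial btree: only available drivers, keyed by res-9 cell.
CREATE INDEX idx_drivers_available_h3
  ON drivers (current_h3_9)
  WHERE status = 'available';

-- 2. Partial btree: only trips still waiting for a match.
CREATE INDEX idx_trips_requested
  ON trips (state, requested_at)
  WHERE state = 'requested';

-- 3. Composite btree: district-level density over time windows.
CREATE INDEX idx_pings_h3_7_ts
  ON driver_pings (h3_7, ts);

-- 4. Partial unique: at most one live default payment method per rider.
CREATE UNIQUE INDEX idx_one_default_method
  ON payment_methods (rider_id)
  WHERE is_default AND deleted_at IS NULL;

-- 5. Unique btree: the arbiter for the webhook's ON CONFLICT.
CREATE UNIQUE INDEX idx_payments_provider_id
  ON payments (provider_payment_id);
```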
08 · The idempotent webhook

Stripe retries. We don't double-charge.

INSERT INTO payments (
  id, trip_id, payment_method_id, amount_cents,
  currency, status, provider, provider_payment_id, authorized_at
)
VALUES ($1, $2, $3, $4, $5,
        'authorized', 'stripe', $6, now())
ON CONFLICT (provider_payment_id) DO UPDATE SET
  status        = EXCLUDED.status,
  authorized_at = COALESCE(payments.authorized_at, EXCLUDED.authorized_at),
  updated_at    = now()
WHERE payments.status NOT IN
  ('captured', 'refunded', 'partially_refunded');
unique index
The UNIQUE on provider_payment_id is what makes ON CONFLICT work. Without it, Postgres rejects the statement outright: no arbiter constraint to match.
DO UPDATE
Replays of the same event update the row in place — same outcome, no duplicate.
WHERE clause · the guard
A late authorized webhook arriving after captured can't move state backward. State regression prevented in SQL.
100 retries → 1 row
Idempotency lives in the schema, not the handler. A junior engineer rewriting the handler in another language gets it right by construction.
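The WHERE guard can be mirrored as a pure function for unit tests — an illustration only, the schema stays the source of truth. Status names are taken from the SQL above:

```typescript
// Mirrors the ON CONFLICT guard: once a payment reaches a terminal state,
// a replayed or late webhook must not move it backward.
const TERMINAL_STATES = new Set([
  "captured",
  "refunded",
  "partially_refunded",
]);

function webhookMayUpdate(currentStatus: string): boolean {
  return !TERMINAL_STATES.has(currentStatus);
}

console.log(webhookMayUpdate("authorized")); // true: replay updates in place
console.log(webhookMayUpdate("captured"));   // false: late webhook is a no-op
```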
09 · Stack

Boring on purpose.

Every choice optimizes for “the report can defend this in one paragraph.”

Database PostgreSQL 16
Driver node-postgres (raw SQL)
Runtime Node.js + TypeScript
Web Fastify
H3 h3-js v4 (app-side)
Frontend Vanilla HTML + Leaflet
Load test k6 / autocannon
Deploy docker compose, single host

No ORM, no PostGIS, no h3-pg. Every “no” is a story for the report.

10 · What you'll see in the demo

Five things to grade.

  1. One command spin-up. docker compose up, simulator scripts seed 500 drivers walking around Barcelona.
  2. Live heatmap. Leaflet renders driver density per res-7 hex, refreshing every 2 seconds.
  3. Concurrency proof. “Request 100 trips” button → 100 unique driver assignments, never duplicates.
  4. Idempotency proof. “Replay webhook” button fires same event 100×, payments table grows by exactly one row.
  5. Query plans in the report. EXPLAIN (ANALYZE, BUFFERS) before/after each index, with measured deltas.