bikemap.nyc is a visualization of the entire history of Citi Bike, the largest bike-sharing system in the US.
[Demo video: visualization.mov]
Each moving arrow represents a real bike ride, based on anonymized historical system data published by Lyft.
- GPU-accelerated rendering of thousands of concurrent rides
- Natural language date parsing to jump to any moment in history
- Search for individual rides by date and station name
- Full keyboard controls for playback and navigation
- Coverage of more than 291.2 million trips from 2013 to 2025 (0.7% data loss)
There is no backend. The client uses DuckDB WASM to query parquet files using SQL directly from a CDN, downloading only the rows it needs via HTTP range requests.
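As a rough sketch of what that looks like in the browser: the DuckDB setup below follows the library's documented jsDelivr bundle pattern, but the CDN URL, file layout, and column names are placeholders rather than the real schema.

```ts
// Minimal sketch of querying a remote parquet file with DuckDB WASM.
// The CDN URL and column names are hypothetical.
import * as duckdb from "@duckdb/duckdb-wasm";

async function queryTrips() {
  // Load the WASM bundle and spin up the database inside a Web Worker.
  const bundles = duckdb.getJsDelivrBundles();
  const bundle = await duckdb.selectBundle(bundles);
  const workerUrl = URL.createObjectURL(
    new Blob([`importScripts("${bundle.mainWorker!}");`], { type: "text/javascript" })
  );
  const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl));
  await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
  URL.revokeObjectURL(workerUrl);

  // read_parquet over HTTPS fetches only the byte ranges needed for the
  // selected columns and row groups, so no server-side component is required.
  const conn = await db.connect();
  const result = await conn.query(`
    SELECT started_at, start_station, end_station, route
    FROM read_parquet('https://cdn.example.com/trips/2024-06-01.parquet')
    WHERE started_at BETWEEN TIMESTAMP '2024-06-01 08:00:00'
                         AND TIMESTAMP '2024-06-01 08:30:00'
  `);
  console.log(`loaded ${result.numRows} trips`);
  await conn.close();
}
```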
The raw system data spans 12 years and has significant inconsistencies, making it difficult to use directly. The processing pipeline cleans and normalizes the data into optimized parquet files.
- Station clustering: Creates a list of all unique station names and their coordinates.
- Route generation: Queries OSRM for bike routes between all station pairs. Geometries are cached per pair and stored as polyline6 in an intermediate SQLite database (see the sketch after this list).
- Parquet export: Generates a parquet file for each day by joining each trip with its corresponding route geometry.
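A rough sketch of the route-generation step, assuming a self-hosted OSRM instance with a bike profile on localhost:5000 and a simplified table layout; both are illustrative assumptions, not the project's actual setup.

```ts
// Sketch: fetch one station-pair route from OSRM and cache the polyline6
// geometry in SQLite. Endpoint, port, and schema are assumptions.
import { Database } from "bun:sqlite";

const db = new Database("routes.sqlite");
db.run(`CREATE TABLE IF NOT EXISTS routes (
  start_station TEXT,
  end_station TEXT,
  polyline6 TEXT,
  PRIMARY KEY (start_station, end_station)
)`);
const upsert = db.prepare(
  "INSERT OR REPLACE INTO routes (start_station, end_station, polyline6) VALUES (?, ?, ?)"
);

async function cacheRoute(
  startName: string, start: [number, number],  // [lon, lat]
  endName: string, end: [number, number]
) {
  // OSRM's route API can return the geometry already encoded as polyline6,
  // so it can be stored as-is and decoded later on the client.
  const url =
    `http://localhost:5000/route/v1/bike/${start.join(",")};${end.join(",")}` +
    `?overview=full&geometries=polyline6`;
  const res = await fetch(url);
  const body = await res.json();
  if (body.code !== "Ok") throw new Error(`OSRM error for ${startName} -> ${endName}`);
  upsert.run(startName, endName, body.routes[0].geometry);
}
```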
This is what you see when you visit bikemap.nyc.
- Data loading: DuckDB WASM queries parquet files from the CDN using HTTP range requests. Trips load in 30-minute batches with lookahead prefetching.
- Processing: A Web Worker decodes the polyline6 geometry and pre-computes timestamps with easing so that bikes slow down at station endpoints (sketched after this list).
- Rendering: Heavy lifting is done with deck.gl layers on top of Mapbox.
- Search: Natural language date parsing via chrono-node lets you jump to any point in time or find a specific ride by querying the parquets directly.
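For the easing step specifically, a minimal sketch might look like the following. The use of @mapbox/polyline for decoding and the sinusoidal speed profile are assumptions for illustration, not necessarily what the worker actually does.

```ts
// Sketch: decode a polyline6 route and assign a timestamp to every vertex so
// that speed tapers to zero at both stations.
import polyline from "@mapbox/polyline";

const R = 6371000; // Earth radius in meters
const toRad = (d: number) => (d * Math.PI) / 180;

// Great-circle distance between two [lat, lon] points, in meters.
function haversine([lat1, lon1]: number[], [lat2, lon2]: number[]): number {
  const a =
    Math.sin(toRad(lat2 - lat1) / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(toRad(lon2 - lon1) / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

function timestampsForTrip(encoded: string, startMs: number, endMs: number): number[] {
  const path = polyline.decode(encoded, 6); // [[lat, lon], ...] at 1e-6 precision

  // Cumulative distance along the route, later normalized to [0, 1].
  const cum = [0];
  for (let i = 1; i < path.length; i++) cum.push(cum[i - 1] + haversine(path[i - 1], path[i]));
  const total = cum[cum.length - 1] || 1;

  // If distance-over-time follows a sinusoidal ease-in-out, the time at which
  // distance fraction d is reached is acos(1 - 2d) / PI. Using that inverse
  // gives vertices near the endpoints more time, i.e. lower speed.
  return cum.map((d) => {
    const t = Math.acos(1 - 2 * (d / total)) / Math.PI;
    return startMs + t * (endMs - startMs);
  });
}
```

The decoded path and its per-vertex timestamps are exactly the kind of input deck.gl's TripsLayer consumes via its getPath and getTimestamps accessors.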
1. Set up environment variables
Create a `.env` file in `apps/client` and add your Mapbox token:

```
NEXT_PUBLIC_MAPBOX_TOKEN=pk.xxx # Get one at https://mapbox.com
```

2. Install dependencies and run

```bash
bun install
bun dev
```

Note: The client queries parquet files from the official hosted CDN by default. You don't need to run the processing pipeline unless you want to regenerate the data. See the processing README for how to run the pipeline.
I built this project because I think it is cool and beautiful :)
I hope to keep this project running indefinitely, but I'm paying for Mapbox and hosting costs out of pocket. If you'd like to support me, please consider buying me a coffee!