Skip to content

freeman-jiang/bikemap.nyc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

346 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

bikemap.nyc

bikemap.nyc is a visualization of the entire history of Citi Bike, the largest bike-sharing system in the US.

visualization.mov

Each moving arrow represents a real bike ride, based on anonymized historical system data published by Lyft.

Features

  • GPU-accelerated rendering of thousands of concurrent rides
  • Natural language date parsing to jump to any moment in history
  • Search for individual rides by date and station name
  • Full keyboard controls for playback and navigation
  • Coverage of more than 291.2 million trips from 2013 to 2025 (0.7% data loss)

How it works

There is no backend. The client uses DuckDB WASM to query parquet files using SQL directly from a CDN, downloading only the rows it needs via HTTP range requests.

1) Data processing pipeline

The raw system data spans 12 years and has significant inconsistencies, making it difficult to use directly. The processing pipeline cleans and normalizes the data into optimized parquet files.

  1. Station clustering: Creates a list of all unique station names and their coordinates.
  2. Route generation: Queries OSRM for bike routes between all station pairs. Geometries are cached per pair and stored as polyline6 in an intermediate SQLite database.
  3. Parquet export: Generates a parquet file for each day by joining each trip with its corresponding route geometry.

2) Client application

This is what you see when you visit bikemap.nyc.

  • Data loading: DuckDB WASM queries parquet files from the CDN using HTTP range requests. Trips load in 30-minute batches with lookahead prefetching.
  • Processing: A Web Worker decodes the polyline6 geometry and pre-computes timestamps with easing so that bikes slow down at station endpoints.
  • Rendering: Heavy lifting is done with deck.gl layers on top of Mapbox.
  • Search: Natural language date parsing via chrono-node lets you jump to any point in time or find a specific ride by querying the parquets directly.

Quickstart

1. Set up environment variables

Create a .env file in apps/client and add your Mapbox token:

NEXT_PUBLIC_MAPBOX_TOKEN=pk.xxx  # Get one at https://mapbox.com

2. Install dependencies and run

bun install
bun dev

Note: The client queries parquet files from the official hosted CDN by default. You don't need to run the processing pipeline unless you want to regenerate the data. See the processing README for how to run the pipeline.

Why

I built this project because I think it is cool and beautiful :)

I hope to keep this project running indefinitely, but I'm paying for Mapbox and hosting costs out of pocket. If you'd like to support me, please consider buying me a coffee!

About

🚲 Every Citi Bike ride ever taken visualized on a map

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published