This is a demo that showcases some of Typesense's features using a database of 28 million books from OpenLibrary (Internet Archive).
View it live here: books-search.typesense.org
This search experience is powered by Typesense, a blazing-fast, open source, typo-tolerant search engine. It is an open source alternative to Algolia and an easier-to-use alternative to Elasticsearch.
The book dataset is from openlibrary.org. If you're able to contribute book metadata, please do.
The app was built using the Typesense Adapter for InstantSearch.js and is hosted on S3, with CloudFront for a CDN.
The search backend is powered by a geo-distributed 3-node Typesense cluster running on Typesense Cloud, with nodes in Oregon, Frankfurt and Mumbai.
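A client configuration for a multi-node cluster like this one might look like the sketch below. It is not taken from the demo's source: the hostnames, the API key, the `title,author` field names and the `buildAdapterOptions` helper are all hypothetical placeholders. The option names (`server.nodes`, `nearestNode`, `additionalSearchParameters.query_by`) are meant to match the options the Typesense adapter accepts, but check them against the adapter's documentation before relying on them.

```javascript
// Sketch: build an options object for the Typesense InstantSearch adapter.
// Hostnames and the API key are placeholders, not the demo's real values.
function buildAdapterOptions({ apiKey, hosts, nearestHost, queryBy }) {
  const toNode = (host) => ({ host, port: 443, protocol: "https" });
  return {
    server: {
      apiKey, // should be a search-only key, since it ships to the browser
      nodes: hosts.map(toNode),
      // Optional entry point that routes to the closest node, if one exists.
      nearestNode: nearestHost ? toNode(nearestHost) : undefined,
      connectionTimeoutSeconds: 2,
    },
    additionalSearchParameters: { query_by: queryBy },
  };
}

const options = buildAdapterOptions({
  apiKey: "SEARCH_ONLY_API_KEY",
  hosts: ["oregon.example.com", "frankfurt.example.com", "mumbai.example.com"],
  nearestHost: "nearest.example.com",
  queryBy: "title,author", // assumed field names
});

// In the app, this object would be handed to the adapter, roughly:
//   new TypesenseInstantSearchAdapter(options).searchClient
console.log(options.server.nodes.length);
```

Listing all three nodes lets the client fall back to another region if one node is unreachable.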
The dataset has ~28M records and takes up 6.8GB on disk and 14.3GB in RAM when indexed in Typesense. Indexing these 28M records takes ~3 hours.
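For a rough sense of scale, the figures above can be turned into per-record numbers. The sketch below assumes 1 GB = 1e9 bytes and treats "~3 hours" as exactly 3 hours, so the results are estimates only.

```javascript
// Back-of-the-envelope figures derived from the numbers quoted above.
// Assumes 1 GB = 1e9 bytes and an indexing time of exactly 3 hours.
const records = 28e6;
const diskBytes = 6.8e9;
const ramBytes = 14.3e9;
const indexSeconds = 3 * 60 * 60;

const stats = {
  diskBytesPerRecord: Math.round(diskBytes / records),
  ramBytesPerRecord: Math.round(ramBytes / records),
  recordsPerSecond: Math.round(records / indexSeconds),
};

console.log(stats);
```

Under these assumptions, each record costs roughly 243 bytes on disk and 511 bytes in RAM, and indexing runs at about 2,600 records per second.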
src/ and index.html - contain the frontend UI components, built with the Typesense Adapter for InstantSearch.js.
scripts/indexer - contains the script to index the book data into Typesense.
scripts/data - contains a 1K sample subset of the books database. You can download the full dataset from the link above.
To run this project locally, install the dependencies and run the local server:
yarn
bundle # JSON parsing takes a while to run using JS when indexing, so we're using Ruby just for indexing
yarn run typesenseServer
ln -s .env.development .env
yarn run indexer:extractAuthors # This will output an authors.jsonl file
yarn run indexer:transformDataset # This will output a transformed_dataset.json file
BATCH_SIZE=100000 yarn run indexer:importToTypesense # This will import the JSONL file into Typesense
yarn start
Open http://localhost:3000 to see the app.
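The batched import step above can be sketched as follows. This is an illustration in JavaScript, not the repository's Ruby indexer: `batchLines` and `importBatch` are hypothetical helpers, and `baseUrl`, `collection` and `apiKey` are placeholders supplied by the caller. The sketch assumes Node 18+ (for the global `fetch`) and posts to Typesense's bulk import endpoint.

```javascript
// Group JSONL lines into batches of at most batchSize lines each.
function* batchLines(lines, batchSize) {
  let batch = [];
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    batch.push(line);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // final, possibly smaller batch
}

// POST one batch to the bulk import endpoint and count failed documents.
async function importBatch(baseUrl, collection, apiKey, batch) {
  const url = `${baseUrl}/collections/${collection}/documents/import?action=create`;
  const response = await fetch(url, {
    method: "POST",
    headers: { "X-TYPESENSE-API-KEY": apiKey, "Content-Type": "text/plain" },
    body: batch.join("\n"),
  });
  if (!response.ok) throw new Error(`Import failed with HTTP ${response.status}`);
  // The response is JSONL as well: one {"success": ...} object per document.
  const results = (await response.text())
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line));
  return results.filter((result) => !result.success).length;
}

// Batching demo on in-memory data; no network call is made here.
const sample = ['{"title":"A"}', '{"title":"B"}', "", '{"title":"C"}'];
const batches = [...batchLines(sample, 2)];
console.log(batches.length);
```

A larger batch size such as the `BATCH_SIZE=100000` used above means fewer HTTP round trips, at the cost of holding more lines in memory per request.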
The app is hosted on S3, with CloudFront for a CDN.
yarn build
yarn deploy