Manga BR Search Engine

A web app that searches the most popular manga, anime, novel and webtoon readers in PT-BR/EN-US to find the most up-to-date source for a specific series.

Made by Eduardo Henrique (backend) and Rosialdo Vidinho (frontend). Documentation and project management support by Guilherme Bernardo.

Goals and Limitations

✅ means done.

🚧 means in progress.

❌ means won't do.

[Project Related]

🚧 Migrate from JavaScript to TypeScript.
🚧 Adopt a micro-commit strategy.
- Commit each atomic change separately so that errors and bugs can be traced and resolved faster.
🚧 Integrate backend and frontend.
🚧 Split the app into 3 Docker containers: one for the DB, one for the API and one for the crawlers.
🚧 Create a RESTful API using Express so users can read from the DB without having direct access to it.
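
As a rough illustration of the read path the API is meant to expose, the sketch below picks the most up-to-date source for a series from entries already loaded from the DB. The `SeriesEntry` shape, its field names and the `findMostUpdated` function are assumptions made for this example, not the project's actual schema or code. Wiring it into an Express route is left out.

```typescript
// Hypothetical shape of one crawled entry; the real DB schema is not shown in this README.
interface SeriesEntry {
  title: string;         // series title as listed on the source site
  site: string;          // reader site the entry came from
  url: string;           // link the user would be redirected to
  latestChapter: number; // highest chapter number found on that site
}

// Normalize titles so that "One Piece" and "  one   piece " compare equal.
function normalizeTitle(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, " ");
}

// Return the entry with the highest latestChapter whose normalized title
// equals the normalized query, or undefined when nothing matches.
function findMostUpdated(
  entries: readonly SeriesEntry[],
  query: string,
): SeriesEntry | undefined {
  const wanted = normalizeTitle(query);
  if (wanted === "") return undefined;

  let best: SeriesEntry | undefined;
  for (const entry of entries) {
    if (normalizeTitle(entry.title) !== wanted) continue;
    if (best === undefined || entry.latestChapter > best.latestChapter) {
      best = entry;
    }
  }
  return best;
}
```

An API handler could call `findMostUpdated` with the user's query and answer with the `url` of the result, which matches the goal of redirecting users rather than hosting the series.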

[Proxy Related]

✅ Use proxies.
✅ Create a proxy pool with automatic renewal.
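
For readers unfamiliar with the idea, here is a minimal sketch of a proxy pool that rotates through its proxies and reloads the list once it gets old. It is not the project's implementation. `ProxyPool`, `ProxyListLoader` and the `maxAgeMs` parameter are hypothetical names, and the loader stands in for whatever call returns the current proxy list.

```typescript
// Hypothetical loader: returns the current list of proxy URLs from a provider.
type ProxyListLoader = () => Promise<string[]>;

class ProxyPool {
  private readonly loadProxies: ProxyListLoader;
  private readonly maxAgeMs: number;
  private readonly now: () => number;
  private proxies: string[] = [];
  private nextIndex = 0;
  private loadedAt = 0;
  private hasLoaded = false;

  // `now` is injectable so the renewal logic can be tested without waiting.
  constructor(loadProxies: ProxyListLoader, maxAgeMs: number, now: () => number = Date.now) {
    this.loadProxies = loadProxies;
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  // Reload the list when it was never loaded or is at least maxAgeMs old.
  private async renewIfStale(): Promise<void> {
    if (this.hasLoaded && this.now() - this.loadedAt < this.maxAgeMs) return;
    const fresh = await this.loadProxies();
    if (fresh.length === 0) throw new Error("proxy loader returned an empty list");
    this.proxies = fresh;
    this.nextIndex = 0;
    this.loadedAt = this.now();
    this.hasLoaded = true;
  }

  // Hand out proxies in round-robin order, renewing the list first if needed.
  async next(): Promise<string> {
    await this.renewIfStale();
    const proxy = this.proxies[this.nextIndex];
    this.nextIndex = (this.nextIndex + 1) % this.proxies.length;
    return proxy;
  }
}
```

Each crawler request would call `await pool.next()` to get its proxy. This sketch does not deduplicate concurrent reloads, so two simultaneous calls on a stale pool may both hit the loader.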

[Crawler Related]

🚧 Create crawlers for the following sites:
- ✅ BRMangas.
- ✅ Manganato.
✅ Send multiple requests at once.
✅ Use Puppeteer for JavaScript rendering.
✅ Create a list of relevant sites to crawl.
🚧 Restructure the crawlers to adopt Crawlee.
🚧 Create crawlers for more sites.
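
Sending multiple requests at once usually needs an upper bound so that target sites and proxies are not flooded. The helper below is one way to do that under stated assumptions: `mapWithConcurrency` is a hypothetical name, it is independent of Puppeteer and Crawlee, and the `task` callback stands in for whatever fetches and parses a page.

```typescript
// Run `task` over all items with at most `limit` tasks in flight at once.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A crawler could pass a list of page URLs and a page-fetching function here. If any task rejects, the returned promise rejects as well, so retry logic would have to live inside `task`.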

[DB Related]

✅ Create the DB (MongoDB).
🚧 Create a backup DB.
🚧 Update the DB on demand.

[Frontend Related]

🚧 Create a search bar to take user input.
🚧 Create the site frontend using React.
🚧 Create a responsive dropdown menu for selecting the series type.

[Limitations]

❌ Crawl websites with Cloudflare anti-bot features.
- Sites such as mangalivre won't be crawled for the time being.
❌ Read series through the site.
- The site will only redirect the user to a reader that hosts the series.
❌ Update DB daily.
- We are still deciding whether this is viable. As things stand it would slow our progress, so the feature is postponed.
- It may become possible now that we have found a way to request multiple pages at once.

Problems Encountered

Problems

Updating the DB daily.
Updating the DB without rebuilding it from scratch.

Possible Solutions:

Make the update process faster and less resource-heavy.
Only update series whose values have changed.
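
The second possible solution can be sketched as a comparison between what is stored and what was just crawled, so that only new or changed series are written. This is an illustration only: the `SeriesSnapshot` fields, the use of `url` as the unique key and the `planUpdates` function are assumptions, and the actual MongoDB writes are not shown.

```typescript
// Hypothetical snapshot of one series on one site; field names are assumptions.
interface SeriesSnapshot {
  url: string;           // assumed to uniquely identify a series on a site
  title: string;
  latestChapter: number;
}

interface UpdatePlan {
  toInsert: SeriesSnapshot[]; // series not yet in the DB
  toUpdate: SeriesSnapshot[]; // series whose stored values differ
  unchanged: number;          // series that need no write at all
}

// Compare stored and freshly crawled snapshots and decide what to write.
function planUpdates(
  stored: readonly SeriesSnapshot[],
  crawled: readonly SeriesSnapshot[],
): UpdatePlan {
  const storedByUrl = new Map<string, SeriesSnapshot>();
  for (const snapshot of stored) storedByUrl.set(snapshot.url, snapshot);

  const plan: UpdatePlan = { toInsert: [], toUpdate: [], unchanged: 0 };
  for (const fresh of crawled) {
    const old = storedByUrl.get(fresh.url);
    if (old === undefined) {
      plan.toInsert.push(fresh);
    } else if (old.title !== fresh.title || old.latestChapter !== fresh.latestChapter) {
      plan.toUpdate.push(fresh);
    } else {
      plan.unchanged++;
    }
  }
  return plan;
}
```

With a plan like this, the DB would receive writes only for `toInsert` and `toUpdate`, which avoids rebuilding it from scratch on every run.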

Solved Problems

The DB was only updated after going through every series in a site.
Creating a proxy pool with automatic renewal.

Solutions:

Update the DB after every visited page of the site.
Use Webshare's built-in proxy renewal tool.

Demo

WIP 😎

Translations

Portuguese translation
