This Chrome extension reads the metadata of an article and estimates how likely the article is to contain quality content.
Ambition Level of this project: 6/10
The idea: given a webpage that has an article on it, this app will read metadata such as
-The Author
-The sources
-The site it is hosted on
-... (more to be added)
and then show on a scale how likely the article is to be reliable.
For instance, if an article is written by an author who works for The Economist, its sources are also by well-established reporters, and the site it is published on is trusted, the article will be marked as likely to contain reliable information.
However, if an article is written by an author who writes for The Onion, it has no sources, and it is posted on The Onion, it will be marked as unreliable.
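The two examples above can be sketched as a weighted score. This is only a rough illustration, not the project's actual method: the weights, the reliability values, the default for unknown sites, and the names `SITE_RELIABILITY`, `WEIGHTS`, `site_score` and `reliability` are all assumptions made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical hard-coded site reliabilities in [0, 1]; placeholder values only.
SITE_RELIABILITY = {
    "economist.com": 0.9,
    "theonion.com": 0.1,
}

# Hypothetical weights for each signal; they sum to 1.
WEIGHTS = {"author": 0.4, "sources": 0.3, "site": 0.3}


def site_score(url, default=0.5):
    """Look up the hard-coded reliability of the site hosting the article."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Unknown sites fall back to a neutral default (an assumption).
    return SITE_RELIABILITY.get(host, default)


def reliability(author_score, source_scores, url):
    """Combine author, sources and site into one score between 0 and 1."""
    # An article with no sources gets 0 for the sources signal.
    sources = sum(source_scores) / len(source_scores) if source_scores else 0.0
    return (
        WEIGHTS["author"] * author_score
        + WEIGHTS["sources"] * sources
        + WEIGHTS["site"] * site_score(url)
    )
```

Here `author_score` and `source_scores` are assumed to come from whichever reputation API ends up being used; a well-sourced article on a trusted site scores close to 1, and an unsourced article on a satire site scores close to 0.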
Another way to check the reputation is to check for spelling mistakes. Look into how spam mail is recognized. Look for lots of capitalization or exclamation marks.
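A minimal sketch of the capitalization and exclamation-mark checks, assuming the article text is already available as a string. The function name `text_signals` and the choice of signals are hypothetical, and spell checking is left out because it would need a dictionary or an external library.

```python
import re


def text_signals(text):
    """Return simple spam-style signals for a piece of article text."""
    words = re.findall(r"[A-Za-z']+", text)
    # Words of two or more characters written entirely in capitals.
    shouted = [w for w in words if len(w) >= 2 and w.isupper()]
    return {
        "caps_ratio": len(shouted) / len(words) if words else 0.0,
        "exclamations_per_word": text.count("!") / len(words) if words else 0.0,
    }
```

Higher values of either signal would count against the article; what threshold to use is an open question.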
Issues to consider: having many dependencies means many things can break.
-Collecting the Article
-This will be incredibly helpful: https://github.com/codelucas/newspaper
-API search:
-Find an API that returns an author's reputation (alternatively, whether they are peer reviewed, or other criteria)
-Sites to look into:
Sites that out bad articles: Haux.com, Snopes, Retraction Watch (has a database)
Reputation of good articles: Google Scholar index (higher ranked, higher impact); look into 'impact factor'
-Find an API that returns the reputation of a site (for the minimum viable product, hard-code some site reliabilities)
-API implementation:
-Connect to API
-Chrome Extension:
-Send the URL to the Python app through sockets
-Collecting the Article
-Python
-Regular Expressions
-Unknowns
-A useful walkthrough of web scrapers (not required): https://first-web-scraper.readthedocs.io/en/latest/
-API search:
-Understand how APIs and API endpoints work
-API implementation:
-Connect to API (probably with JavaScript)
-Chrome Extension:
-JavaScript
-JSON (Simple)
-Useful link: https://developer.chrome.com/extensions/getstarted
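As a rough illustration of the "Collecting the Article" step, the sketch below pulls the author and title out of a page's HTML using only the Python standard library. It assumes the page declares its author in a `<meta name="author">` tag, which many sites do not; the names `MetaAuthorParser` and `extract_metadata` are hypothetical. The newspaper library linked above would likely handle this more thoroughly.

```python
from html.parser import HTMLParser


class MetaAuthorParser(HTMLParser):
    """Collect <meta name="author"> contents and the page <title>."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "author":
            content = attrs.get("content")
            if content:
                self.authors.append(content.strip())
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept.
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return the authors and title found in an HTML string."""
    parser = MetaAuthorParser()
    parser.feed(html)
    return {"authors": parser.authors, "title": parser.title.strip()}
```

The returned dictionary could then be passed on to whichever reputation lookups are chosen in the API search step.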
https://github.com/orgs/UofT-group-sideprojects/projects/2
Project Owner: Daniel Visca
email: [email protected]
phone number: +1 (514) 889-6686
If you would like any clarification, or would like to get together in person to work on this please reach out!