News article aggregator using paraphrase-distilroberta-base-v2

0xGar/aggregator

aggregator

News article aggregator

This Python script groups similar news stories together across various RSS feeds at a set interval, outputting the results into a file. It uses the sentence-transformers/paraphrase-distilroberta-base-v2 model from Hugging Face to extract summaries, which are then used to create article clusters via agglomerative clustering.
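The grouping step can be illustrated with a minimal sketch. It assumes each article has already been turned into a numeric vector (in the real script these would presumably come from the sentence-transformers model; the toy vectors below are made up), and it uses a small pure-Python average-linkage agglomerative clustering with a cosine-distance cutoff. The function names and the threshold value are illustrative assumptions, not taken from the repository, which may well rely on a library implementation such as scikit-learn's instead.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def agglomerative_clusters(vectors, distance_threshold):
    """Average-linkage agglomerative clustering (illustrative helper).

    Starts with one cluster per vector and repeatedly merges the two
    closest clusters until the closest pair is farther apart than
    distance_threshold. Returns a list of lists of vector indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > distance_threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy stand-ins for article embeddings (assumed values, not model output).
embeddings = [
    [1.0, 0.0, 0.1],  # story A, first source
    [0.9, 0.1, 0.0],  # story A, second source
    [0.0, 1.0, 0.0],  # story B, first source
    [0.1, 0.9, 0.1],  # story B, second source
    [0.0, 0.0, 1.0],  # unrelated story
]

groups = agglomerative_clusters(embeddings, distance_threshold=0.3)
print(groups)
```

A smaller threshold yields tighter, more numerous groups; a larger one merges loosely related stories together.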

"aggregator_esports.drama.py" and "aggregator_mainstream_news.py" are provided as examples. Each file lists RSS feeds along with the settings that control how those feeds are processed. For example, "aggregator_mainstream_news.py" lists mainstream news RSS feeds and sets both the cluster distance used for grouping and the minimum number of items (news articles from separate sources) that a group must contain to be included in the output.
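As a rough sketch of how such settings might fit together, the snippet below models an aggregator's configuration and applies the minimum-items rule to already-clustered articles. Everything here is hypothetical: the `AggregatorSettings` class, its field names, the `filter_groups` helper, the feed URLs, and the numeric values are invented for illustration, and the example files in the repository may be organized quite differently.

```python
from dataclasses import dataclass


@dataclass
class AggregatorSettings:
    """Hypothetical container for one topic's settings."""
    feeds: list               # RSS feed URLs to poll
    cluster_distance: float   # distance cutoff used when grouping
    min_group_size: int       # minimum number of separate sources per group


def filter_groups(groups, min_group_size):
    """Keep only groups covered by at least min_group_size distinct sources."""
    kept = []
    for group in groups:
        sources = {article["source"] for article in group}
        if len(sources) >= min_group_size:
            kept.append(group)
    return kept


# Placeholder feeds and values, chosen only for the example.
settings = AggregatorSettings(
    feeds=["https://example.com/a.rss", "https://example.com/b.rss"],
    cluster_distance=0.3,
    min_group_size=2,
)

# Pretend clustering output: each group is a list of article records.
clustered = [
    [{"source": "a", "title": "Story one"}, {"source": "b", "title": "Story one again"}],
    [{"source": "a", "title": "Story two"}, {"source": "a", "title": "Story two update"}],
    [{"source": "b", "title": "Story three"}],
]

output = filter_groups(clustered, settings.min_group_size)
print([[article["title"] for article in group] for group in output])
```

Counting distinct sources rather than raw articles reflects the README's note that the items in a group are articles from separate sources; two articles from the same feed would not satisfy the minimum on their own.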

Each general topic (e.g., politics, health, science, mainstream news, esports) should have its own aggregator file. To add one: 1) copy either "aggregator_esports.drama.py" or "aggregator_mainstream_news.py" and modify it as needed, then 2) import the new file into run.py and append it to the 'aggregators' array. See the comments in run.py for further instructions.
