This project aims to develop a news event detection framework that will detect newsworthy events from tweets and generate headlines from them.
Newsstand requires the NER package and a running Java gateway.
All code is in the DatasetA/EventDetection folder. DatasetA/Myoutputs has all generated output files:

- cluster.csv: tweets and their assigned cluster ids
- summary.csv: headlines along with cluster ids
- topics.csv: topics
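As a rough illustration of how the cluster output might be consumed, the sketch below groups tweets by cluster id using Python's standard `csv` module. The column names `cluster_id` and `text` are assumptions, since the actual headers of cluster.csv are not documented here; adjust them to match the generated file.

```python
import csv
import io
from collections import defaultdict


def group_tweets_by_cluster(csv_file, id_column="cluster_id", text_column="text"):
    """Return {cluster id: [tweet text, ...]} from an open CSV file.

    The default column names are assumed, not taken from the project.
    """
    clusters = defaultdict(list)
    for row in csv.DictReader(csv_file):
        clusters[row[id_column]].append(row[text_column])
    return dict(clusters)


# Tiny in-memory stand-in for cluster.csv (made-up rows, assumed headers).
sample = io.StringIO(
    "cluster_id,text\n"
    "0,Earthquake reported near the coast\n"
    "0,Strong tremor felt downtown\n"
    "1,New phone announced today\n"
)
grouped = group_tweets_by_cluster(sample)
```

To read the real file, pass a handle such as `open("DatasetA/Myoutputs/cluster.csv", newline="", encoding="utf-8")` instead of the in-memory sample.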
-
Clone the repository by running
git clone https://github.com/Shafaq19/Twitter-News-Event-Detection.git
-
You need to register for an API key for
-
If you are using a Python environment, install the dependencies with
pip install -r requirements.txt
Otherwise, conda users can do conda install
-
Run DatasetA/EventDetection/Mainfile.py and check the outputs in the DatasetA/Myoutputs folder.
Congrats, you are all set! :+1:
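A quick way to confirm that the run produced its outputs is to check the output folder for the three expected files. The helper below is a minimal sketch and not part of the repository; `missing_outputs` is a hypothetical name, and the demo uses a temporary folder rather than the real one.

```python
import tempfile
from pathlib import Path

# File names taken from the description of the Myoutputs folder.
EXPECTED_OUTPUTS = ("cluster.csv", "summary.csv", "topics.csv")


def missing_outputs(output_dir):
    """Return the expected output files that are not present in output_dir."""
    folder = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (folder / name).is_file()]


# Demo on a temporary folder that holds only two of the three files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cluster.csv").write_text("")
    (Path(tmp) / "summary.csv").write_text("")
    demo_missing = missing_outputs(tmp)
```

After a real run, `missing_outputs("DatasetA/Myoutputs")` should return an empty list if every file was written.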
- Uses NER as a base rather than the traditional TF-IDF vectorization, which doesn't allow flexibility in recognizing unknown words and takes too much space.
- Dynamic threshold-based clustering algorithm
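To make the idea of entity-based clustering with a changing threshold concrete, here is a minimal sketch. It is not the project's implementation: it assumes named entities have already been extracted into sets, uses plain Jaccard similarity as a stand-in similarity measure, and the rule in `dynamic_threshold` (larger clusters require a closer match) is a hypothetical example of a dynamic threshold.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dynamic_threshold(cluster_size, base=0.3, step=0.05, cap=0.6):
    """Hypothetical rule: the required similarity grows with cluster size."""
    return min(cap, base + step * (cluster_size - 1))


def cluster_tweets(entity_sets):
    """Single-pass clustering; returns one cluster id per input tweet."""
    clusters = []      # each cluster: {"entities": set, "size": int}
    assignments = []
    for entities in entity_sets:
        best_id, best_sim = None, 0.0
        for cid, cluster in enumerate(clusters):
            sim = jaccard(entities, cluster["entities"])
            # Accept only matches that clear this cluster's current threshold.
            if sim >= dynamic_threshold(cluster["size"]) and sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is None:
            # No cluster is close enough, so start a new one.
            clusters.append({"entities": set(entities), "size": 1})
            best_id = len(clusters) - 1
        else:
            clusters[best_id]["entities"] |= entities
            clusters[best_id]["size"] += 1
        assignments.append(best_id)
    return assignments


# Made-up entity sets standing in for NER output on four tweets.
tweets = [
    {"Lahore", "earthquake"},
    {"earthquake", "Lahore", "Pakistan"},
    {"Apple", "iPhone"},
    {"iPhone", "Apple", "launch"},
]
labels = cluster_tweets(tweets)
```

Because the entity sets are compared directly, a tweet containing a previously unseen name can still be clustered without rebuilding a vocabulary, which is the flexibility the first bullet refers to.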
You can also:
- Search for news related to a relevant topic of interest, such as technology
- News events at your beck in just 4 minutes!
- Responsive and self-refreshing
The overriding design goal for the Twitter-based newsstand is to make it as easy and fast as possible for journalists to cover stories. Since people are already on site and actively reporting in tweets, journalists don't have to go on site or worry that something might be missed.
No Plugins Needed |
---|
Dataset citation: Andrew J. McMinn, Yashar Moshfeghi, Joemon M. Jose. Building a Large-scale Corpus for Evaluating Event Detection on Twitter. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.
Similarity function taken from the paper: Liu, Xiaomo, Quanzhi Li, Armineh Nourbakhsh, Rui Fang, Merine Thomas, Kajsa Anderson, Russ Kociuba, et al. 2016. “Reuters Tracer: A Large Scale System of Detecting and Verifying Real-Time News Events from Twitter.” In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 207–216. Indianapolis, Indiana, October 24–28.
FASTNU © Shafaq Arshad