Presenters: Iris Shen, Charles Huang, Chieh-Han Wu, Anshul Kanakia
Contributors: Yuxiao Dong, Junjie Qian
Microsoft Research - Microsoft Academic Graph team
Time: Thu, August 08, 2019, 9:30 am - 12:00 pm and 1:00 pm - 3:30 pm
Many real-world datasets come in the form of graphs. These datasets include social networks, biological networks, knowledge graphs, the World Wide Web, and many more. A comprehensive understanding of these networks is essential to many important applications.
This hands-on tutorial introduces the fundamental concepts and tools used in modeling large-scale graphs and knowledge graphs. The audience will learn a spectrum of techniques used to build applications on graphs and knowledge graphs, ranging from traditional data analysis and mining methods to emerging deep learning and embedding approaches.
Five lab sessions give the audience hands-on experience working through real-life examples on the major topics covered in this tutorial. These include:
- understanding basic graph properties;
- using graph representation learning to explore network similarity;
- utilizing NLP and text mining techniques to build knowledge graphs;
- modeling knowledge graphs with embedding techniques and applying them to recommendation applications.
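As a rough illustration of the first two lab topics, the sketch below computes a few basic graph properties (node degree and density) and a simple neighborhood-overlap similarity (Jaccard) on a tiny undirected graph. The edge list, node names, and helper functions are hypothetical and are not taken from MAG or from the tutorial's lab code. Jaccard similarity is used here only as a simple stand-in for the representation learning methods covered in the labs.

```python
from collections import defaultdict

# Hypothetical toy edge list (e.g. co-authorship); not MAG data.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degrees(adj):
    """Degree of each node, i.e. its number of neighbors."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge counted twice
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def jaccard(adj, u, v):
    """Neighborhood overlap of u and v: |N(u) & N(v)| / |N(u) | N(v)|."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


adj = build_adjacency(edges)
print(degrees(adj))
print(density(adj))
print(jaccard(adj, "A", "B"))
```

On a graph at MAG scale, a dedicated graph library or a distributed engine would be a more realistic choice than in-memory dictionaries; the sketch only fixes the definitions.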
We use Microsoft Academic Graph (MAG), the largest publicly available academic domain knowledge graph, as the dataset to demonstrate the algorithms and applications presented here. MAG includes six types of entities, with 450 million nodes and over 3 billion edges, and covers more than 660K academic concepts. The MAG dataset (500 GB+) is updated bi-weekly. For this hands-on tutorial, we use a Top CS Conference Sub-Graph taken from one of the most recent data versions. The full graph with bi-weekly updates is available for free here.
Key takeaways for attendees will be:
- a solid understanding of graph and knowledge graph fundamentals and advanced representation learning techniques;
- hands-on experience with state-of-the-art algorithms and analytics for a large-scale dataset;
- an understanding of the possibilities for creating a semantic search experience over their own private content.
Time | Module | Slides | Code |
---|---|---|---|
9:30am - 10:30am | I: Welcome, Setup, Dataset | link | link1 link2 |
10:30am - 11:15am | II: Graph Basics | link | link |
11:15am - 12:00pm | III: Graph Representation Learning | link | link |
12:00pm - 1:00pm | LUNCH BREAK | | |
1:00pm - 2:05pm | IV: Knowledge Graph Fundamentals and Construction | link | link |
2:05pm - 3:10pm | V: Knowledge Graph Inference and Applications | link | link |
3:10pm - 3:30pm | VI: Summary and Looking Forward | link | n/a |
For a longer and more theoretical treatment of the graph and knowledge graph content, please see the DAT278x edX online course, From Graph to Knowledge Graph: Algorithms and Applications: (GitHub link) (slides and code only), (edX link here, with videos, quizzes, and a final exam; you can earn a certificate for this course at edX).