wdc-page

This repository contains the source files of the Web Data Commons website and is used to maintain the site. The Web Data Commons project extracts structured data from the Common Crawl, the largest web corpus available to the public, and provides the extracted data for public download in order to support researchers and companies in exploiting the wealth of information that is available on the Web.