    Repositories list

    • Code to accompany the Data and Policy and Ottawa group paper
      HTML
      MIT License
      Updated Jul 6, 2023
    • Drift detection and model retraining methods
      Jupyter Notebook
      Updated Nov 28, 2022
    • Imputation tool for categorical and continuous data using scikit-learn algorithms. Includes simulation study and model persistence.
      Python
      Updated Jun 22, 2022
    • The ONS Big Data Team GitHub Pages
      HTML
      MIT License
      Updated May 19, 2021
    • precon
      Functions for price index economics.
      Python
      MIT License
      Updated Mar 30, 2021
    • Collect data from Zoopla then use machine learning to identify caravans
      Python
      MIT License
      Updated Oct 1, 2020
    • Creating synthetic data for testing clerical linkage interface
      Python
      Updated May 21, 2020
    • Runs Tesseract OCR over PDF paper filings from Companies House.
      Python
      MIT License
      Updated Jan 6, 2020
    • Downloads paper filings from the Companies House API
      Python
      MIT License
      Updated Jan 6, 2020
    • Jupyter Notebook
      Updated Dec 2, 2019
    • Using Scala to create a Spark UDF designed to be callable from PySpark.
      Scala
      MIT License
      Updated Nov 13, 2019
    • Exploring and data-checking Companies House data read from PDFs
      HTML
      Updated Aug 8, 2019
    • Command line utility to convert zip archives into Hadoop Sequence Files
      Scala
      MIT License
      Updated Apr 21, 2019
    • Reading digital XBRL/iXBRL account documents - for sharing
      HTML
      Updated Feb 28, 2019
    • Files for a brief report investigating whether traffic flow (annual average daily flow) could be used as an early indicator for GDP
      HTML
      Updated Jan 7, 2019
    • Python web GUI for text analysis purposes.
      Python
      Updated Nov 21, 2018
    • Feedback from PyCon UK 2018
      Jupyter Notebook
      Updated Oct 15, 2018
    • Small self-contained RISE presentation example
      Jupyter Notebook
      Updated Oct 5, 2018
    • Timed runs of Scala and Python UDFs in Spark (on a Virtual Machine).
      Python
      MIT License
      Updated Sep 20, 2018
    • RSS-2018
      Traffic flow as an early indicator for GDP
      HTML
      GNU General Public License v3.0
      Updated Sep 12, 2018
    • reveal.js demo slides
      HTML
      Updated Jul 27, 2018
    • dsc-lunch
      Slides for dsc lunch
      HTML
      Updated Jul 16, 2018
    • Repo for work from the Bristol data dive: http://www.data4sdgs.org/news/bristol-data-dive
      Jupyter Notebook
      Updated Apr 24, 2018
    • Classification of restricted access properties and caravans within Zoopla Data
      Jupyter Notebook
      MIT License
      Updated Apr 18, 2018
    • Web scraping demo files
      Jupyter Notebook
      MIT License
      Updated Mar 13, 2018
    • Working paper and notebook for unsupervised document clustering
      Jupyter Notebook
      Updated Mar 6, 2018
    • Explore how we might produce statistics on social sentiment from news/social media towards events/topics and how those can be linked to existing official statistics which annually measure population well-being
      Python
      Updated Dec 4, 2017
    • ExtracTED
      Scripts to extract and parse TED (Tenders Electronic Daily: http://ted.europa.eu/TED/main/HomePage.do) documents.
      Python
      MIT License
      Updated Dec 1, 2017
    • Comparing densities of mobile cell towers with population estimates
      R
      MIT License
      Updated Oct 17, 2017
    • Repository for the Big Data Team work on the LCF Project
      Jupyter Notebook
      MIT License
      Updated Sep 21, 2017