@wbsg-uni-mannheim

Web-based Systems Group @ University of Mannheim

We explore technical and empirical questions concerning the development of global, decentralized information environments.

Pinned

  1. productbert-intermediate Public

    This repository contains code and data download scripts for the paper "Intermediate Training of BERT for Product Matching" by Ralph Peeters, Christian Bizer and Goran Glavaš.

    Python 35 11

  2. wdc-lspc-v2 Public

    This repository contains code and data download scripts for the paper "Using schema.org annotations for training and maintaining product matchers" by Ralph Peeters, Anna Primpeli, Benedikt Wichtlhu…

    Jupyter Notebook 15 3

  3. WDCFramework Public

    Java framework used by the Web Data Commons project to extract Microdata, Microformats, and RDFa data, Web graphs, and HTML tables from the web crawls provided by the Common Crawl Foundation.

    Java 8 1

  4. contrastive-product-matching Public

    This repository contains the code to reproduce the experiments of the poster "Supervised Contrastive Learning for Product Matching".

    Python 36 14

  5. TabAnnGPT Public

    This repository contains the code for the experiments run in the papers "Column Type Annotation using ChatGPT" and "Column Property Annotation using Large Language Models".

    Jupyter Notebook 9 2

  6. MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 49 11

Repositories

Showing 10 of 27 repositories
  • MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 49 11 0 0 Updated Oct 18, 2024
  • TailorMatch Public

    This repository contains code and comprehensive examples to replicate and build upon the experiments presented in our paper “Fine-tuning Large Language Models for Entity Matching”. It provides resources for fine-tuning large language models specifically for entity matching tasks.

    Jupyter Notebook 7 1 0 0 Updated Sep 13, 2024
  • wdc-pave Public

    Web Data Commons - Using LLMs for Product Attribute Value Extraction and Normalization

    Python 8 1 0 0 Updated Jul 3, 2024
  • wdc-page Public

    This repository contains the source files of the Web Data Commons website and is used to maintain the site. The Web Data Commons project extracts structured data from the Common Crawl.

    HTML 1 1 0 0 Updated Jul 3, 2024
  • SC-Block Public

    SC-Block is a supervised contrastive blocking method that combines supervised contrastive learning, which positions records in an embedding space, with nearest neighbour search for building the candidate set.

    Python 8 BSD-3-Clause 2 0 0 Updated Jun 10, 2024
  • wdc-sotab
    Jupyter Notebook 4 0 1 0 Updated Jun 3, 2024
  • TabAnnGPT Public

    This repository contains the code for the experiments run in the papers "Column Type Annotation using ChatGPT" and "Column Property Annotation using Large Language Models".

    Jupyter Notebook 9 2 0 0 Updated May 28, 2024
  • ExtractGPT Public

    Attribute Value Extraction using Large Language Models

    Python 22 Apache-2.0 7 0 0 Updated May 24, 2024
  • wdc-smb Public

    This repository contains the code and data download links to reproduce building the WDC SMB Benchmark.

    0 BSD-3-Clause 0 0 0 Updated Dec 11, 2023
  • pie_chatgpt Public

    Product Information Extraction using ChatGPT

    Jupyter Notebook 2 0 0 0 Updated Oct 4, 2023
