    Repositories list

    • ACP-RAG

      Public
      [NAACL 2025] Large-Scale Corpus Construction and Retrieval-Augmented Generation for Ancient Chinese Poetry: New Method and Data Insights (ACP-Corpus; ACP-QA; ACP-RAG)
      0100 Updated Feb 7, 2025
    • PAVENet

      Public
      [IEEE TPAMI 2025] Official repository of "Privacy-Preserving Biometric Verification With Handwritten Random Digit String".
      Python
      GNU General Public License v3.0
      0100 Updated Jan 22, 2025
    • DOLPHIN

      Public
      [IEEE TIFS 2024] Official repository of "Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Approach".
      Python
      GNU General Public License v3.0
      0400 Updated Jan 21, 2025
    • Algorithms, papers, datasets, and performance comparisons for Document AI. Continuously updated.
      517500 Updated Dec 9, 2024
    • RFUND

      Public
      [MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
      01800 Updated Dec 4, 2024
    • DCOH-120K

      Public
      GNU General Public License v3.0
      0000 Updated Nov 7, 2024
    • [EMNLP 2024] TongGu, a classical Chinese language model.
      02210 Updated Sep 28, 2024
    • WenMind

      Public
      WenMind benchmark.
      Python
      0500 Updated Sep 26, 2024
    • HisDoc1B

      Public
      0500 Updated Jul 17, 2024
    • .github

      Public
      0000 Updated Jun 4, 2024
    • C3bench

      Public
      C3 benchmark
      0210 Updated May 27, 2024
    • SCUT-EnsExam is a real-world handwritten text erasure dataset for examination paper scenarios, consisting of 545 examination paper images. The dataset is randomly divided into a training set of 430 images and a test set of 115 images.
      0900 Updated Dec 5, 2023
    • Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)
      Python
      412200 Updated Nov 13, 2023
    • A CNN model built with PyTorch that reaches 99.7% accuracy
      Python
      2400 Updated May 1, 2021