marcdotson/document-classification

Document Classification

Description

The objective of this project is to create and compare text-based classification models. Using a dataset of labeled documents, the models will be used to classify un-labeled documents into one of two categories.

Project Organization

  • /code Scripts with numbered prefixes that give the run order (e.g., 01_import-data.py, 02_clean-data.py) and functions in /code/src.
  • /data Simulated and real data, the latter not pushed.
  • /figures PNG images and plots.
  • /output Output from model runs, not pushed.
  • /presentations Presentation slides.
  • /private A catch-all folder for miscellaneous files, not pushed.
  • /venv Project library.
  • /writing Case studies and the paper.
  • requirements.txt Information on the reproducible environment.

Reproducible Environment

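Every package you install lives in your system library, accessible to all projects. However, packages change over time, and an update made for one project can break code in another. To keep this project reproducible, create a project library with venv and install the packages listed in requirements.txt into it.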
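As a minimal sketch of that setup, the project library can be created with Python's standard-library venv module. The function name create_project_library and its default path "venv" are assumptions chosen to match the /venv folder listed above; they are not part of this repository's code.

```python
import venv
from pathlib import Path


def create_project_library(path="venv", with_pip=True):
    """Create a virtual environment at `path` and return its location.

    Hypothetical helper for illustration; the default path mirrors the
    /venv folder in the project organization.
    """
    env_dir = Path(path)
    # EnvBuilder is the standard-library API behind `python -m venv`.
    # with_pip=True bootstraps pip so packages can be installed afterwards.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(env_dir)
    return env_dir
```

Calling create_project_library() from the repository root does roughly what running python -m venv venv in a terminal does. Once the environment is activated, pip install -r requirements.txt installs the recorded packages into it.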

For more details on using GitHub, Quarto, etc., see ASC Training.
