
Honest decision forests and trees implemented efficiently and scikit-learn compliant.

YuxinB/honest-forests

 
 


honest-forests package

Overview

Honest decision forests and trees implemented efficiently and scikit-learn compliant.

Honest trees and forests use sample splitting to unbias the estimates made in leaves. This leads to asymptotic convergence guarantees and empirically better calibration (e.g., more accurate posterior probabilities; see our paper here).

An example can be seen here, comparing an honest forest to the traditional random forest and two other ad-hoc calibration approaches.

overlapping_gaussians.png
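
The sample-splitting idea described above can be sketched with plain scikit-learn. This is a minimal illustration, not this package's API: the helper names `fit_honest_tree` and `predict_proba_honest` are hypothetical, and the 50/50 split and the uniform fallback for empty leaves are assumptions made for the example. One half of the data chooses the tree structure, and the held-out half alone estimates the class posteriors in each leaf.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_honest_tree(X, y, random_state=0, **tree_kwargs):
    """Hypothetical helper: learn splits on one half, leaf posteriors on the other."""
    classes = np.unique(y)
    # Assumption: an even, stratified split between structure and estimation sets.
    X_struct, X_est, y_struct, y_est = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=random_state
    )
    tree = DecisionTreeClassifier(random_state=random_state, **tree_kwargs)
    tree.fit(X_struct, y_struct)

    # Count held-out samples per (node, class); only leaf rows are ever filled.
    leaves = tree.apply(X_est)
    class_idx = np.searchsorted(classes, y_est)
    counts = np.zeros((tree.tree_.node_count, len(classes)))
    np.add.at(counts, (leaves, class_idx), 1)

    # Assumption: leaves that receive no held-out samples fall back to uniform.
    totals = counts.sum(axis=1, keepdims=True)
    posteriors = np.where(
        totals > 0, counts / np.maximum(totals, 1), 1.0 / len(classes)
    )
    return tree, posteriors


def predict_proba_honest(tree, posteriors, X):
    """Look up the honest posterior of the leaf each sample lands in."""
    return posteriors[tree.apply(X)]


X, y = make_classification(n_samples=400, n_features=5, random_state=0)
tree, posteriors = fit_honest_tree(X, y, max_depth=3)
proba = predict_proba_honest(tree, posteriors, X[:5])
```

Because the labels used to estimate each leaf were never seen while choosing splits, the leaf posteriors are not biased by the split search itself. The cost is that each stage sees only half of the data.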

Install from GitHub

git clone https://github.com/neurodata/honest-forests.git
cd honest-forests
pip install -e .

Contributing

Git workflow

The preferred workflow for contributing to honest-forests is to fork the main repository on GitHub, clone it, and develop on a branch. Steps:

  1. Fork the project repository by clicking on the ‘Fork’ button near the top right of the page. This creates a copy of the code under your GitHub user account. For more details on how to fork a repository see this guide.

  2. Clone your fork of the honest-forests repo from your GitHub account to your local disk:

    git clone git@github.com:YourGithubAccount/honest-forests.git
    cd honest-forests
  3. Create a feature branch to hold your development changes:

    git checkout -b my-feature

    Always use a feature branch. Pull requests directly to either dev or main will be rejected until you create a feature branch based on dev.

  4. Develop the feature on your feature branch. Stage changed files with git add, then commit them with git commit:

    git add modified_files
    git commit

    After making all local changes, you will want to push your changes to your fork:

    git push -u origin my-feature

Pull Request Checklist

We recommend that your contribution comply with the following rules before you submit a pull request:

  • Follow the coding-guidelines.

  • Give your pull request a helpful title that summarizes what your contribution does.

  • Link your pull request to the issue it addresses (see: closing keywords for an easy way of linking your issue).

  • All public methods should have informative docstrings with sample usage presented as doctests when appropriate.

  • At least one paragraph of narrative documentation with links to references in the literature (with PDF links when possible) and the example.

  • If your feature is complex enough that a doctest is insufficient to fully showcase its utility, consider creating a Jupyter notebook to illustrate its use instead.

  • All functions and classes must have unit tests. These should include, at the very least, type checking and ensuring correct computation/outputs.

  • All code should be automatically formatted by black. You can run this formatter by calling:

    pip install black
    black path/to/your_module.py
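
As an illustration of the unit-test rule in the checklist above, here is a minimal sketch covering both a type check and a correctness check. The function `normalize_counts` is a hypothetical helper invented for this example and is not part of the package. The tests use bare `assert` statements so that a runner such as pytest could collect them; which test runner the project uses is not stated here.

```python
import numpy as np


def normalize_counts(counts):
    """Hypothetical helper: turn per-class counts into probabilities."""
    if not isinstance(counts, np.ndarray):
        raise TypeError("counts must be a numpy.ndarray")
    total = counts.sum()
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return counts / total


def test_rejects_non_array_input():
    # Type checking: a plain list should be refused.
    try:
        normalize_counts([1, 2, 3])
    except TypeError:
        return
    raise AssertionError("expected TypeError for list input")


def test_output_is_correct():
    # Correct computation: 1 of 4 and 3 of 4 samples.
    result = normalize_counts(np.array([1.0, 3.0]))
    assert np.allclose(result, [0.25, 0.75])
```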

Coding Guidelines

Uniformly formatted code makes it easier to share code ownership. The honest-forests package closely follows the official Python guidelines in PEP8, which detail how code should be formatted and indented. Please read and follow them.

Docstring Guidelines

Properly formatted docstrings are required for documentation generation by Sphinx. The honest-forests package closely follows the numpydoc guidelines; please read and follow them. Refer to the example.py provided by numpydoc.
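
For orientation, a numpydoc-style docstring with a doctest might look like the sketch below. The function `class_frequencies` is hypothetical and exists only to show the section layout (Parameters, Returns, Examples); it is not taken from this package.

```python
import doctest


def class_frequencies(labels):
    """Compute the relative frequency of each label.

    Parameters
    ----------
    labels : list of int
        Class labels observed in a leaf. Must be non-empty.

    Returns
    -------
    freqs : dict
        Maps each distinct label to its fraction of ``labels``.

    Examples
    --------
    >>> class_frequencies([0, 0, 1, 1])
    {0: 0.5, 1: 0.5}
    """
    distinct = sorted(set(labels))
    return {label: labels.count(label) / len(labels) for label in distinct}


if __name__ == "__main__":
    # Runs the example in the docstring as a test.
    doctest.testmod()
```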
