TabRepo 2.0 Feature Tracker #63
Comments
Another thing that has been on my radar for some time is to have TabRepo on Hugging Face. It would speed up downloads by ~8x (downloading is very slow from outside) and would make the dataset more visible.
Having an example or an API that allows one to "join" two repositories would also be quite useful. One could do:
Sorry for the delay again :-) The list you made sounds great! One thing I want to mention that I think could be quite useful is adding a way to recover the original and transformed features from OpenML. Something like this:

```python
df, y = repo.openml_dataframe(dataset="airplane", fold=2)  # gets the raw columns from the dataset
X, y = repo.openml_transformed_features(dataset="airplane", fold=2)  # gets the features as provided to the model
```

This would allow using TabRepo to train TabPFN models (probably at larger scales than what they currently use). It would also make it easier to train new models and add them to TabRepo.
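To make the raw vs. transformed distinction concrete, here is a minimal, stdlib-only sketch. Everything in it is an assumption for illustration: the toy rows, the function names `raw_columns` and `transformed_features`, the fold rule, and the ordinal encoding are hypothetical and are not TabRepo's or OpenML's API. The preprocessing actually applied before data reaches a model may differ.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

# Toy stand-in for one dataset; a real implementation would load it from OpenML.
ROWS: List[Row] = [
    {"airline": "A", "distance": 120.0, "delayed": 0},
    {"airline": "B", "distance": 560.0, "delayed": 1},
    {"airline": "A", "distance": 340.0, "delayed": 0},
    {"airline": "C", "distance": 80.0, "delayed": 1},
]
TARGET = "delayed"
N_FOLDS = 2


def train_rows(fold: int) -> List[Row]:
    """Assumed split rule (hypothetical): row i is held out in fold i % N_FOLDS."""
    return [row for i, row in enumerate(ROWS) if i % N_FOLDS != fold]


def raw_columns(fold: int) -> Tuple[List[Row], List[object]]:
    """Return the untouched feature columns and the target for one fold."""
    rows = train_rows(fold)
    features = [{k: v for k, v in row.items() if k != TARGET} for row in rows]
    return features, [row[TARGET] for row in rows]


def transformed_features(fold: int) -> Tuple[List[List[float]], List[object]]:
    """Return model-ready features: strings ordinal-encoded, numbers kept as floats."""
    features, y = raw_columns(fold)
    columns = sorted(features[0])
    # Build one category -> integer code mapping per string-valued column.
    codes = {
        col: {v: i for i, v in enumerate(sorted({r[col] for r in features}))}
        for col in columns
        if isinstance(features[0][col], str)
    }
    X = [
        [float(codes[col][r[col]]) if col in codes else float(r[col]) for col in columns]
        for r in features
    ]
    return X, y
```

The first function hands back the columns as stored, while the second returns a purely numeric matrix, which is the form a model such as TabPFN would need.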
For TabRepo 2.0, several quality of life changes should be made for ease of use. This list will evolve over time.
P0 (Critical)
- `hyperparameters` dictionary. Add AutoGluon hyperparameters fetching #80
- `ensemble_size` -> `n_iterations`

P1
- `EvaluationRepository`
- `scripts` code, move relevant logic into `tabrepo/` for ease of use by others for their own experiments.
- `requirements_frozen` file to ensure reproducibility of experiments, or provide a Docker container.

P2 (Nice-to-have)
- `binary_as_multiclass` to `repo.predict` methods #64

P3
- `log_loss`.
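If the `ensemble_size` -> `n_iterations` item is a keyword rename (an assumption, since the tracker does not spell it out), one common way to avoid breaking existing user code is a small deprecation shim. The sketch below is illustrative only: the helper name `resolve_n_iterations` and the default of 100 are hypothetical and not taken from TabRepo.

```python
import warnings
from typing import Optional


def resolve_n_iterations(
    n_iterations: Optional[int] = None,
    ensemble_size: Optional[int] = None,
    default: int = 100,  # hypothetical default, not TabRepo's
) -> int:
    """Return the iteration count, accepting the old keyword with a warning."""
    if ensemble_size is not None:
        if n_iterations is not None:
            # Refuse ambiguous calls that pass both the old and the new name.
            raise TypeError("Pass only n_iterations; ensemble_size is deprecated.")
        warnings.warn(
            "ensemble_size is deprecated; use n_iterations instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return ensemble_size
    return default if n_iterations is None else n_iterations
```

A public method could call this helper at the top of its body, so old call sites keep working for a release or two while emitting a `DeprecationWarning`.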