Hello,
I have read and tried the suggestions in these previous issues:
Fit best model on new data in Optuna mode
Saving mljar automl model for future use
I want to manually replicate the results obtained with a normal fit (Compete, Optuna, or another mode).
For instance, I have already run fit in Optuna mode with custom cv_indices, and the results were stored.
cv_indices has 5 elements, one (train_index, val_index) pair per fold, so I want to do:
mean_metric = 0
for train_index, val_index in cv_indices:
    model = AutoML(...something with the path)
    model.fit(X[train_index, :], y[train_index])
    y_pred = model.predict_proba(X[val_index, :])
    mean_metric += some_metric(y[val_index], y_pred)
print(mean_metric / 5)
The printed value should be similar to the results reported during the original fit:
import pandas as pd
from sklearn.model_selection import train_test_split
from supervised.automl import AutoML

# Initialize AutoML with custom CV
automl = AutoML(
    mode="Optuna",  # Or "Explain" / "Perform"
    ml_task="binary_classification",
    results_path=f"{DATADIR}AutoML_Optuna_Results_2",
    validation_strategy={
        "validation_type": "custom",
        "custom_cv": cv_indices
    },
    eval_metric="auc",
    optuna_time_budget=60 * 5
)

# Fit the model
automl.fit(xdf, target, cv=cv_indices)
However, the result I get from the manual loop is higher, as if the model had already seen the validation data (that is, as if it were already part of the training set).
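To make the comparison concrete, here is a minimal, self-contained sketch of the manual per-fold evaluation I have in mind. It is not the mljar code path: a scikit-learn LogisticRegression stands in for the AutoML model, the data is synthetic, and cv_indices is built with StratifiedKFold purely for illustration (my real cv_indices come from elsewhere). The helper name manual_cv_auc is hypothetical. The point is that a fresh model is created for every fold, so no model can have seen its validation rows.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data (assumption: the real xdf/target are a DataFrame and a Series).
X_arr, y_arr = make_classification(n_samples=500, n_features=10, random_state=0)
xdf = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(10)])
target = pd.Series(y_arr, name="target")

# Hypothetical cv_indices: a list of 5 (train_index, val_index) pairs.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_indices = list(skf.split(xdf, target))


def manual_cv_auc(make_model, X, y, folds):
    """Train a fresh model per fold and return the per-fold validation AUC."""
    scores = []
    for train_index, val_index in folds:
        # Guard against leakage: a row must never be in both sets of one fold.
        assert np.intersect1d(train_index, val_index).size == 0
        model = make_model()  # new, unfitted model for every fold
        model.fit(X.iloc[train_index], y.iloc[train_index])
        # For binary AUC, score with the probability of the positive class.
        proba = model.predict_proba(X.iloc[val_index])[:, 1]
        scores.append(roc_auc_score(y.iloc[val_index], proba))
    return scores


fold_scores = manual_cv_auc(
    lambda: LogisticRegression(max_iter=1000), xdf, target, cv_indices
)
print(np.mean(fold_scores))
```

With positional indices on a DataFrame, .iloc is needed instead of X[train_index, :]. If the object created inside the loop is instead loaded from an existing results path, it may already be fitted on every fold, which would explain an optimistic score.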
I appreciate your help.