Support for grouped custom metrics in workflows #371

simdadim · 2023-02-23T22:57:31Z

I'm working on a tabular dataset where multiple rows are linked, forming a "session". Within each session, one of the rows is our target action, and the goal for the model is to find this row and bump it as far up within the session as possible (that is, predict this row with a higher probability than the other rows in the same session).

I've modeled this using a binary classifier. To evaluate the performance, I want to see how far the target row moves, and whether it moves in the correct direction, when I sort the rows within the session by each row's predicted probability of being the correct one. In other words, I want the correct row in each session to have the highest predicted probability within that session.

The formula is quite easy: group by session_id, identify the target row in each session, and evaluate the relative distance it has moved when the session is sorted on the new predicted probability score.
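The calculation described above can be sketched in a few lines. This is only an illustration in plain Python, not yardstick code: the function name `mean_relative_shift`, the row keys `session_id`, `is_target` and `prob`, the use of input order as the baseline position, and the normalization by session length minus one are all assumptions made for the example.

```python
from collections import defaultdict


def mean_relative_shift(rows):
    """Average relative movement of the target row across sessions.

    `rows` is an iterable of dicts with the (hypothetical) keys
    "session_id", "is_target" and "prob". The order in which rows arrive
    is taken as the baseline position within each session (an assumption).
    """
    # Group rows by session while preserving their original order.
    sessions = defaultdict(list)
    for row in rows:
        sessions[row["session_id"]].append(row)

    shifts = []
    for session_rows in sessions.values():
        n = len(session_rows)
        targets = [i for i, r in enumerate(session_rows) if r["is_target"]]
        # Skip sessions where movement is undefined: a single row,
        # or not exactly one target row.
        if n < 2 or len(targets) != 1:
            continue
        old_pos = targets[0]
        # Rank rows by descending predicted probability (stable for ties).
        order = sorted(range(n), key=lambda i: -session_rows[i]["prob"])
        new_pos = order.index(old_pos)
        # Positive values mean the target moved up; dividing by n - 1
        # scales the shift to [-1, 1] (an assumed normalization).
        shifts.append((old_pos - new_pos) / (n - 1))

    return sum(shifts) / len(shifts) if shifts else float("nan")


example_rows = [
    {"session_id": "a", "is_target": False, "prob": 0.2},
    {"session_id": "a", "is_target": False, "prob": 0.1},
    {"session_id": "a", "is_target": True, "prob": 0.9},
    {"session_id": "b", "is_target": False, "prob": 0.5},
    {"session_id": "b", "is_target": True, "prob": 0.4},
    {"session_id": "b", "is_target": False, "prob": 0.1},
]
score = mean_relative_shift(example_rows)
```

In session "a" the target moves from last to first place (shift 1.0), and in session "b" it stays where it was (shift 0.0), so `score` is 0.5. The point of the sketch is that the function needs the session identifier alongside the truth and prediction columns, which is exactly the information that gets lost when the grouping is dropped.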

However, I'm struggling to create a custom yardstick metric to calculate this, since the grouping of the data is not passed on to my custom evaluation function. I've tried different approaches, but from what I can see, the problem is that the predict_model() function drops the grouping of the data frame.

Is it possible to keep the grouping of a data frame in predict_model(), and include the grouping variable? To my understanding, this would make it possible to develop custom metrics that account for grouped data. I imagine this means that the .predictions column in the resample_results data frame would also keep the groups.

EmilHvitfeldt · 2023-03-30T18:58:49Z

Hello @simdadim 👋

This is a good idea, and we are currently thinking about how best to handle these types of metrics. We want to make sure our approach is sound and general enough to do everything we want.

I'm gonna keep this issue up to remind us of your request, but it will take a little while before we get to working on this problem.

EmilHvitfeldt added the "feature" label (a feature request or enhancement) on Mar 30, 2023

SHo-JANG mentioned this issue Apr 20, 2023

Group_by calculate metric #421

Open

simonpcouch mentioned this issue May 15, 2023

add fairness metrics #434

Merged
