Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weighted Pearson, Spearman, and R2 score #1759

Draft
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

matsumotosan
Copy link
Member

What does this PR do?

Fixes #1235

Before submitting
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@matsumotosan
Copy link
Member Author

I’m adding a weighted version of Pearson right now and updated the pearson_corrcoef function to take in an additional optional argument weights. When testing, I am getting a failure due to pearson_corrcoef() got an unexpected keyword argument 'weights' even though I’ve modified the function definition for pearson_corrcoef.

I tried debugging test_pearson_corrcoef_functional to check the signature of the imported pearson_corrcoef and got <Signature (preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor> (missing weights: torch.Tensor).

Comment on lines 35 to 41
if weights.ndim == preds.ndim:
raise ValueError(
f"Expected `n_samples` of preds to match the length of `weights`,"
f" but got {preds.shape[0]} and {len(weights)}."
)
else:
raise ValueError("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if weights.ndim == preds.ndim:
raise ValueError(
f"Expected `n_samples` of preds to match the length of `weights`,"
f" but got {preds.shape[0]} and {len(weights)}."
)
else:
raise ValueError("")
if weights.shape != preds.shape:
raise ValueError(
f"Expected `n_samples` of preds to match the length of `weights`,"
f" but got {preds.shape[0]} and {len(weights)}."
)

I think this check needs to be even more restrictive such that they can be multiplied together or else I am missing something.
Also I do not think there should be an else statement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split the check into single and multi-output cases:

  • if single output (preds.ndim=1), preds and weights must have same shape
  • if multi-output (preds.ndim=2)
    • if weights.ndim=2, preds and weights must have same shape
    • if weight.ndim=1, preds.shape[0] has to be equal to len(weights)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that makes sense to me, but the implementation then needs to reflect that. Currently, if weight.ndim=1 and preds.ndim=2 the code will just fail with an value error.

src/torchmetrics/regression/pearson.py Outdated Show resolved Hide resolved
def _scipy_pearson(preds, target):
if preds.ndim == 2:
return [pearsonr(t.numpy(), p.numpy())[0] for t, p in zip(target.T, preds.T)]
return pearsonr(target.numpy(), preds.numpy())[0]


def _weighted_pearson(preds, target, weights):
if preds.ndim == 2:
return [pearsonr(t.numpy(), p.numpy())[0] for t, p in zip(target.T, preds.T)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this not depend on weights?

tests/unittests/regression/test_pearson.py Show resolved Hide resolved
tests/unittests/regression/test_pearson.py Outdated Show resolved Hide resolved
tests/unittests/regression/test_pearson.py Outdated Show resolved Hide resolved
@SkafteNicki SkafteNicki added the enhancement New feature or request label May 6, 2023
@SkafteNicki SkafteNicki added this to the v1.0.0 milestone May 6, 2023
@matsumotosan matsumotosan requested a review from SkafteNicki May 8, 2023 01:38
@SkafteNicki
Copy link
Member

@matsumotosan I have taken care of the testing now, such that inputs gets correctly send to the respective input.
Now the tests are only failing due to some values error in relation to the shape checking.

@SkafteNicki SkafteNicki modified the milestones: v1.0.0, future Jun 2, 2023
@Borda
Copy link
Member

Borda commented Aug 8, 2023

@matsumotosan @SkafteNicki, how is it going here? Could we have it for the next 1.1 release?

@matsumotosan
Copy link
Member Author

The docstring for final_aggregation contains an incomplete reference to some sort of equation to compute the aggregated statistics. Would anyone happen to know where this is?

@Borda Borda force-pushed the 1235-weighted-reg-coeffs branch from 6a8603b to e2e800f Compare September 19, 2023 13:39
@Borda
Copy link
Member

Borda commented Sep 19, 2023

@matsumotosan run rebase from master so pls check if I resolved all conflict property...

Comment on lines 62 to +75
if cond:
mx_new = (n_prior * mean_x + preds.sum(0)) / (n_prior + n_obs)
my_new = (n_prior * mean_y + target.sum(0)) / (n_prior + n_obs)
if weights is None:
mx_new = (n_prior * mean_x + preds.sum(0)) / (n_prior + n_obs)
my_new = (n_prior * mean_y + target.sum(0)) / (n_prior + n_obs)
else:
mx_new = (n_prior * mean_x + torch.matmul(weights, preds)) / (n_prior + n_obs)
my_new = (n_prior * mean_y + torch.matmul(weights, target)) / (n_prior + n_obs)
else:
mx_new = preds.mean(0)
my_new = target.mean(0)
if weights is None:
mx_new = preds.mean(0)
my_new = target.mean(0)
else:
mx_new = torch.matmul(weights, preds) / weights.sum()
my_new = torch.matmul(weights, target) / weights.sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rather have have if weights is NOne to make it full of ones to have no effect but the code simpler

@kushagragpt99
Copy link

Hey! Wanted to see if there’s still some work left here since #1235 is still open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request topic: Regress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add weights for the pearson, spearman, and r2_score
4 participants