Faster Hjerson bootstrap resampling #48

Open
martinpopel opened this issue Oct 5, 2015 · 1 comment

Comments

@martinpopel
Collaborator

Currently, bootstrap resampling works schematically like this:

for( $i = 0; $i < 1000; $i++ ) {
    $metric->init();
    for( $j = 0; $j < $count; $j++ ) {
        $rand = mt_rand( 0, $count - 1 );
        $metric->addSentence( $sentences[ $rand ]...);
    }
    $samples[] = $metric->getScore();
}

That is, we call $metric->getScore() 1000 times, so for Hjerson we run the external Python script 1000 times. For BLEU, we precompute sentence-level (and n-gram-level) statistics in advance, so get_score() and add_sentence() are much faster. To do the same for Hjerson, we would need to get sentence-level scores from it.
We should also compute all 5 Hjerson metrics in one run (instead of five runs).
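
A minimal sketch of what the precomputed variant could look like, assuming the corpus-level score can be rebuilt from additive per-sentence counts. The function name `bootstrapErrorRates` and the `errors`/`words` keys are hypothetical, not Hjerson's actual output format. The external script would run once to fill `$sentenceStats`, and the resampling loop would then only sum numbers:

```php
<?php
// Sketch only: assumes each sentence contributes additive counts
// (hypothetical 'errors' and 'words' keys); Hjerson's real output may differ.

/**
 * Bootstrap-resample a corpus-level error rate from precomputed
 * per-sentence counts, without calling any external script in the loop.
 *
 * @param array $sentenceStats list of ['errors' => int, 'words' => int]
 * @param int   $iterations    number of bootstrap samples
 * @return float[] one corpus-level error rate per bootstrap sample
 */
function bootstrapErrorRates(array $sentenceStats, int $iterations = 1000): array
{
    $count = count($sentenceStats);
    $samples = [];
    for ($i = 0; $i < $iterations; $i++) {
        $errors = 0;
        $words = 0;
        // Draw $count sentences with replacement and sum their counts.
        for ($j = 0; $j < $count; $j++) {
            $stat = $sentenceStats[mt_rand(0, $count - 1)];
            $errors += $stat['errors'];
            $words += $stat['words'];
        }
        $samples[] = $words > 0 ? $errors / $words : 0.0;
    }
    return $samples;
}

// Hypothetical counts, computed once before resampling.
$sentenceStats = [
    ['errors' => 2, 'words' => 10],
    ['errors' => 0, 'words' => 8],
    ['errors' => 5, 'words' => 12],
];
mt_srand(42); // fixed seed so repeated runs draw the same samples
$samples = bootstrapErrorRates($sentenceStats, 1000);
sort($samples);
// Approximate 95% interval from the empirical 2.5th and 97.5th percentiles.
printf("95%% CI: [%.3f, %.3f]\n", $samples[24], $samples[974]);
```

If all five metrics can be expressed as counts like this, the per-sentence arrays could simply carry one key per metric, so a single external run would feed all of them.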

@martinpopel
Collaborator Author

Partially solved in #50
