Commit

new elo results
yuchenlin committed Jul 15, 2024
1 parent de573cf commit dd82749
Showing 12 changed files with 1,734 additions and 1,379 deletions.
166 changes: 108 additions & 58 deletions leaderboard/data_dir/all_stat.json

Large diffs are not rendered by default.

282 changes: 160 additions & 122 deletions leaderboard/data_dir/all_stat_wildbench.-1.json

240 changes: 139 additions & 101 deletions leaderboard/data_dir/all_stat_wildbench.100.json

282 changes: 160 additions & 122 deletions leaderboard/data_dir/all_stat_wildbench.1000.json

282 changes: 160 additions & 122 deletions leaderboard/data_dir/all_stat_wildbench.1500.json

240 changes: 139 additions & 101 deletions leaderboard/data_dir/all_stat_wildbench.2000.json

240 changes: 139 additions & 101 deletions leaderboard/data_dir/all_stat_wildbench.300.json

240 changes: 139 additions & 101 deletions leaderboard/data_dir/all_stat_wildbench.3000.json

282 changes: 160 additions & 122 deletions leaderboard/data_dir/all_stat_wildbench.500.json

184 changes: 92 additions & 92 deletions leaderboard/data_dir/score.json

672 changes: 336 additions & 336 deletions leaderboard/data_dir/wb_elo_results.json

3 changes: 2 additions & 1 deletion leaderboard/show_eval.sh
@@ -28,7 +28,8 @@ python leaderboard/data_dir/_create_tables.py score

python leaderboard/data_dir/_merge_results.py

-margin=3;tie_margin=2;K=4;dynamic=True;interval=16
+# margin=3;tie_margin=2;K=4;dynamic=True;interval=16
+margin=3;tie_margin=2;K=4;dynamic=True;interval=100
python -m leaderboard.wb_elo --K $K --margin $margin --tie_margin $tie_margin --num_rounds 100 --dynamic $dynamic --interval $interval --num_processes 4

python leaderboard/data_dir/_merge_results.py
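
For orientation, the `--K` value passed above is conventionally the step size of an Elo rating update. The following is a minimal sketch of the textbook Elo update only, not the code in `leaderboard/wb_elo.py`; the function names are hypothetical, and the commit does not show how `--margin`, `--tie_margin`, `--dynamic`, or `--interval` are applied, so they are deliberately left out.

```python
# Textbook Elo update, shown only to illustrate the role of K.
# ASSUMPTION: this is NOT the repository's wb_elo implementation; the
# function names are hypothetical and the margin / tie_margin / dynamic /
# interval options from show_eval.sh are not modeled here.


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A when compared against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 4.0):
    """Return the updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins, 0.5 for a tie, and 0.0 if A loses.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models, A wins, K=4 as in the script above.
    print(elo_update(1000.0, 1000.0, 1.0, k=4.0))  # (1002.0, 998.0)
```

With a small K such as 4, a single comparison moves each rating by at most 4 points, so rankings depend on many comparisons rather than on any one result.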