Labels: bug (Something isn't working)
Description
System Info
Operating System: Linux
Python version: 3.10
Hardware: A100 40GB
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the codebase (such as scripts/, ...)
- My own task or dataset (give details below)
Reproduction
Hi, thanks for your excellent work.
When I use "Qwen2.5-Math-7B-Instruct" as the base model, I get [{"majority_vote": 0.824, "total_completion_tokens": 2990.498}]. However, when I use another base model, "mistral-7b-sft", on MATH, I get [{"majority_vote": 0.29, "total_completion_tokens": 963.846}] with the vanilla MCTS process.
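For reference, here is how I understand the "majority_vote" number: for each problem, the final answer is the most frequent one among the sampled completions, and the reported score is the fraction of problems where that majority answer matches the gold answer. A minimal sketch (the function names and toy data below are my own, not from the codebase):

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent final answer among sampled completions.

    Ties are broken by first-seen order, which is what Counter.most_common does.
    """
    return Counter(answers).most_common(1)[0][0]

def majority_vote_accuracy(samples_per_problem, gold_answers):
    """Fraction of problems where the majority answer equals the gold answer."""
    correct = sum(
        majority_vote(samples) == gold
        for samples, gold in zip(samples_per_problem, gold_answers)
    )
    return correct / len(gold_answers)

# Toy example: 2 problems, 5 sampled answers each (hypothetical data)
samples = [["4", "4", "5", "4", "3"], ["7", "8", "8", "8", "7"]]
gold = ["4", "8"]
print(majority_vote_accuracy(samples, gold))  # 1.0
```

If the evaluation works this way, a weaker base model like mistral-7b-sft producing less consistent answers would directly lower this score.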
Expected behavior
I wonder whether this result is reasonable. Could you share some results from your own experiments? Many thanks!