Machine Learning Engineer | NLP & LLM
Economist | Empirical & Behavioral
PhD | Decision Science & Managerial Economics
Pinned
- Logic-RL-Lite (Public)
  Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
  Python 59
- DeepEnlighten (Public)
  Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
  Python 17
140 contributions in the last year
Contribution activity
March 2025
Created 4 commits in 3 repositories
Created 2 repositories
- DolbyUUU/verl (Python), Mar 17
- DolbyUUU/DeepEnlighten (Python), Mar 12
Opened their first pull request on GitHub in volcengine/verl
Opened 1 issue in 1 repository
- deepseek-ai/DeepSeek-Math (1 open): Model Size Choices in Evaluation, Mar 13