WeiXiongUST/README.md

Hi there 👋

I am Wei Xiong, currently a first-year Ph.D. student in computer science at UIUC. I work on RLHF for aligning language models.

Previously, I worked on the mathematical foundations of RL, where I was fortunate to collaborate with many great senior mentors and talented peers. I also spent time on deep RL at Microsoft Research Asia.

You can find more information about me at:

Pinned

  1. RLHFlow/RLHF-Reward-Modeling Public

    Recipes for training reward models for RLHF.

    Python 1.3k 90

  2. OptimalScale/LMFlow Public

    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

    Python 8.4k 833

  3. Decentralized-Proximal-Algorithm-with-Variance-Reduction Public

    This is the code used for the paper "PMGT-VR: A decentralized proximal-gradient algorithmic framework with variance reduction" (preprint).

    Python 15

  4. multi-armed-bandit-test-framework Public

    This is the multi-armed bandit code used for my undergraduate thesis.

    Python 5

  5. ShenGroup/MPMAB_BEACON Public

    This is the official implementation of the paper "Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization" (NeurIPS 2021).

    Python 3

  6. RLHFlow/Online-RLHF Public

    A recipe for online RLHF and online iterative DPO.

    Python 500 46

579 contributions in the last year


Contribution activity

March 2025

Created 4 repositories