Python 271 22 Updated Jan 6, 2025

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 244 16 Updated Jan 13, 2025

This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"

Python 50 2 Updated Mar 13, 2025

Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning

Python 64 2 Updated Feb 21, 2025
Python 103 6 Updated Jan 21, 2025

s1: Simple test-time scaling

Python 6,080 711 Updated Mar 6, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,444 89 Updated Mar 18, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 160 6 Updated Mar 20, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,394 1,442 Updated Mar 10, 2025

Fully open reproduction of DeepSeek-R1

Python 23,440 2,132 Updated Mar 27, 2025

A series of technical reports on Slow Thinking with LLM

Python 598 33 Updated Mar 28, 2025

🤖 Awesome list of AI Agents

618 66 Updated Mar 13, 2025

An Open Large Reasoning Model for Real-World Solutions

Python 1,477 76 Updated Mar 4, 2025
Python 30 3 Updated Oct 31, 2024

A reading list on LLM based Synthetic Data Generation 🔥

1,221 71 Updated Feb 20, 2025

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 260 27 Updated May 26, 2024
Jupyter Notebook 8 Updated Nov 13, 2024

[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"

Python 54 4 Updated Feb 23, 2024

LLM Agora, debating between open-source LLMs to refine the answers

Python 62 8 Updated Sep 28, 2023

A complete computer science study plan to become a software engineer.

313,768 78,272 Updated Dec 5, 2024
Python 913 104 Updated Jan 23, 2025

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Python 476 58 Updated Jan 17, 2025

We present the first systematic study on the scaling property of raw agents instantiated by LLMs. We find that performance scales with the increase in the number of agents, using the simple(st) way…

Python 113 13 Updated Oct 8, 2024
Python 61 3 Updated Nov 19, 2024

A compilation of the best multi-agent papers

TeX 468 30 Updated Mar 26, 2025

From scratch implementation of a vision language model in pure PyTorch

Jupyter Notebook 207 21 Updated May 6, 2024

An autoregressive character-level language model for making more things

Python 2,965 768 Updated Jun 4, 2024

ICML 2024: Improving Factuality and Reasoning in Language Models through Multiagent Debate

Python 417 57 Updated Oct 3, 2023