bobo0810

Mr.Li bobo0810

Always Be Curious

152 followers · 7 following

North University of China
Beijing

Achievements

x2 x3 x2

Lists (3)

Stars

Purshow / Awesome-Unified-Multimodal

108 6 Updated Mar 30, 2025

threegold116 / Awesome-Omni-MLLMs

A collection of omni-mllm

14 1 Updated Mar 28, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 1,729 119 Updated Mar 30, 2025

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 142,167 28,467 Updated Mar 30, 2025

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 12,520 1,254 Updated Mar 25, 2025

ModalMinds / MM-EUREKA

MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Python 461 16 Updated Mar 29, 2025

stepfun-ai / Step-Audio

Python 4,090 328 Updated Mar 12, 2025

baichuan-inc / Baichuan-Omni-1.5

Python 143 8 Updated Feb 8, 2025

THUNLP-MT / StreamingBench

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Python 114 3 Updated Mar 27, 2025

OpenBMB / UltraEval-Audio

An easy-to-use, fast, and easily integrable tool for evaluating audio LLMs

Python 70 1 Updated Mar 27, 2025

Ola-Omni / Ola

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 321 14 Updated Feb 28, 2025

ZhangAIPI / YOPO_MLLM_Pruning

Pruning the VLLMs

Python 90 4 Updated Dec 9, 2024

youngyangyang04 / leetcode-master

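《代码随想录》 LeetCode problem-solving guide: a recommended order for working through 200 classic problems, detailed illustrated explanations totaling 600,000 characters, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so studying algorithms is no longer bewildering! 🔥🔥 Take a look; you will wish you had found it sooner! 🚀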

Shell 55,149 11,917 Updated Mar 17, 2025

magpie-align / magpie

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 665 60 Updated Mar 17, 2025

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 10,399 1,010 Updated Nov 18, 2024

hhaAndroid / awesome-mm-chat

A multimodal (MM) + Chat collection

Python 250 19 Updated Feb 18, 2025

baaivision / DenseFusion

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Python 137 1 Updated Dec 6, 2024

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 7,372 566 Updated Mar 20, 2025

FreedomIntelligence / ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Python 259 9 Updated Jun 25, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 9,227 938 Updated Mar 28, 2025