zwcolin
🎃
Hello there

Highlights

  • Pro

Organizations

@ucsd-ets @princeton-nlp @dsc-courses

Shell 2 Updated Mar 21, 2025

Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1

46 2 Updated Mar 18, 2025

Verifiers for LLM Reinforcement Learning

Python 732 75 Updated Mar 23, 2025

Witness the aha moment of VLM with less than $3.

Python 3,432 271 Updated Mar 1, 2025

Random maze environments with different size and complexity for reinforcement learning research.

Python 2 Updated Apr 30, 2024

A customizable framework to create maze and gridworld environments

Python 263 59 Updated Apr 5, 2019

A framework for few-shot evaluation of language models.

Python 8,442 2,256 Updated Mar 30, 2025

A fork to add multimodal model training to open-r1

Python 1,143 58 Updated Feb 8, 2025

LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning

Python 121 8 Updated Mar 25, 2025
Python 48 2 Updated Nov 5, 2024

A collection of materials for CS application

2 Updated Dec 21, 2024
Python 1 Updated Dec 12, 2024
Python 113 15 Updated Jul 14, 2022
HTML 19 3 Updated Nov 26, 2024

qpdf: A content-preserving PDF document transformer

C++ 3,861 300 Updated Mar 30, 2025

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python 850 56 Updated Mar 25, 2025
Python 16 1 Updated Dec 11, 2024

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 2,099 305 Updated Mar 30, 2025

A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Mod…

194 19 Updated Feb 23, 2025

🔽 Display any CSV (comma separated values) file as a searchable, filterable, pretty HTML table

CSS 881 331 Updated Mar 8, 2024

Refine high-quality datasets and visual AI models

Python 9,321 608 Updated Mar 30, 2025

A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/

JavaScript 2,857 468 Updated Jan 24, 2025

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Python 100 11 Updated Dec 25, 2024

Label Studio is a multi-type data labeling and annotation tool with standardized output format

JavaScript 21,392 2,638 Updated Mar 29, 2025

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

Python 1,860 121 Updated Nov 18, 2024

Recent LLM-based CV and related works. Welcome to comment/contribute!

859 38 Updated Mar 8, 2025

[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Python 153 7 Updated Nov 18, 2024

A collection of resources on controllable generation with text-to-image diffusion models.

1,023 27 Updated Dec 31, 2024

LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer

Python 370 17 Updated Mar 30, 2025