Stars
State-of-the-Art Text Embeddings
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
A library for efficient similarity search and clustering of dense vectors.
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
Image augmentation for machine learning experiments.
Research code for pixel-based encoders of language (PIXEL)
Codebase for CROPE: Evaluating In-Context Adaptation of Vision and Language Models to Culture-Specific Concepts
Multilingual Image Captioning Evaluation
A Python framework to build, learn, and reason about probabilistic circuits and tensor networks
Official implementation of project Honeybee (CVPR 2024)
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Athens NLP Summer School 2024 - Lab material
🤖 Machine Learning Summer School Guide
Code for the MultipanelVQA benchmark "Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA"
A reading list of up-to-date papers on NLP for Social Good.
Code for Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation
An annotated implementation of the Transformer paper.
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Video+code lecture on building nanoGPT from scratch
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
Multimodal language model benchmark, featuring challenging examples
Website for hosting the Open Foundation Models Cheat Sheet.