Model Overview: TinyLlama is a compact 1.1 billion parameter language model pre-trained on around 1 trillion tokens, leveraging the architecture and tokenizer of Llama 2 and advancements from the open-source community.
Model Architecture: TinyLlama adopts a Transformer architecture similar to Llama 2, featuring a hidden size of 2048, an intermediate hidden size of 5632, a context length of 2048, 22 layers, and a vocabulary size of 32,000 (a configuration sketch follows this summary).
Speed Optimizations: Incorporates Fully Sharded Data Parallel (FSDP), FlashAttention for efficient training, and grouped-query attention to reduce memory overhead (a FlashAttention loading sketch follows this summary).
Training Performance: Achieves a training throughput of 24,000 tokens per second per A100-40G GPU, requiring significantly fewer GPU hours compared to models like Pythia-1.0B and MPT-1.3B.
Comparative Analysis: Outperforms similar-sized open-source language models like OPT-1.3B and Pythia-1.4B in various downstream tasks.
Evaluation on Commonsense Reasoning Tasks: Demonstrates superior performance on tasks like HellaSwag, OpenBookQA, WinoGrande, ARC-Easy, ARC-Challenge, BoolQ, and PIQA.
Problem-solving Capabilities: Evaluated using the InstructEval benchmark including tasks like MMLU, BIG-Bench Hard, DROP, and HumanEval, where TinyLlama shows better problem-solving skills compared to existing models.
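
As a concrete reference for the architecture point above, here is a minimal sketch expressing the reported hyperparameters as a Hugging Face LlamaConfig. The head counts (32 query heads, 4 key-value heads for grouped-query attention) follow the published TinyLlama configuration; treat this as illustrative rather than the authors' exact training setup.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# TinyLlama hyperparameters as reported in the paper; the head counts follow
# the published checkpoint configuration and are included for completeness.
config = LlamaConfig(
    vocab_size=32_000,             # Llama 2 tokenizer vocabulary
    hidden_size=2048,              # model (embedding) dimension
    intermediate_size=5632,        # feed-forward hidden dimension
    num_hidden_layers=22,          # transformer blocks
    num_attention_heads=32,        # query heads
    num_key_value_heads=4,         # grouped-query attention
    max_position_embeddings=2048,  # context length
)

model = LlamaForCausalLM(config)
print(f"Parameters: {model.num_parameters() / 1e9:.2f}B")  # ~1.1B
```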
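The training-side speedups (FSDP, FlashAttention, fused kernels) live in the authors' training codebase. As a small illustration of FlashAttention on the inference side, the sketch below loads a TinyLlama checkpoint through transformers with the flash_attention_2 backend; the checkpoint name and the availability of the flash-attn package on a recent GPU are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name on the Hugging Face Hub; the flash_attention_2
# backend requires the flash-attn package and an Ampere-or-newer GPU.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # half precision to cut memory use
    attn_implementation="flash_attention_2",  # FlashAttention kernels
    device_map="auto",                        # requires accelerate
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```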
Link: https://arxiv.org/abs/2401.02385