LLaVA-Video Collection Models focus on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21, 2025 • 64
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models +1 Jun 24, 2024 • 205
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models, most important ones are on top of the list. • 27 items • Updated Apr 30, 2024 • 40