Idefics3
Generate text based on an image and prompt
Media understanding
Identify objects in images using text queries
Generate text and segment images using PaliGemma
Annotate and describe images with text prompts
Segment and caption objects in images and videos
Analyze images to caption, detect objects, and extract text
Generate detailed image analyses and depth predictions
Generate detailed descriptions from images and questions
Chat about uploaded images with AI-generated answers
Chat with AI using text and images, get highlighted answers
Generate captions, detect objects, and segment images with AI
Ask questions about images and get answers
Chat with Pixtral 12B using Mistral Inference
Interact with a chatbot that understands text and images
State-of-the-art Zero-shot Object Detection
Chat with images using Llama Vision model
Generate text from images and queries
Generate text responses based on images and chat history
Paligemma2 Detection with Supervision
Generate text responses from images and text input
Visualize image depth, segmentation, and generation
A unified multimodal understanding and generation model