How to use Llama 4 Models
Llama 4 Notebook
If you’d like to see common use-cases in code, see our notebook here.
Llama 4 Model Details
Llama 4 Maverick
- Model String: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- Specs:
- 17B active parameters (400B total)
- 128-expert MoE architecture
- 524,288-token context length (will be increased to 1M)
- Support for 12 languages: Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese
- Multimodal capabilities (text + images)
- Supports function calling
- Best for: Enterprise applications, multilingual support, advanced document intelligence
- Knowledge Cutoff: August 2024
Llama 4 Scout
- Model String: meta-llama/Llama-4-Scout-17B-16E-Instruct
- Specs:
- 17B active parameters (109B total)
- 16-expert MoE architecture
- 327,680-token context length (will be increased to 10M)
- Support for 12 languages: Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese
- Multimodal capabilities (text + images)
- Supports function calling
- Best for: Multi-document analysis, codebase reasoning, and personalized tasks
- Knowledge Cutoff: August 2024
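As a minimal sketch of how these model strings are used, the snippet below assembles a chat-completion request for Maverick through an OpenAI-compatible endpoint. The endpoint URL and `LLAMA_API_KEY` variable are assumptions; substitute your provider's values.

```python
import json
import os
import urllib.request

# Hypothetical OpenAI-compatible endpoint -- replace with your provider's URL.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Assemble a chat-completion payload using the Maverick model string."""
    return {
        "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict) -> dict:
    """POST the payload; requires a valid key in LLAMA_API_KEY."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('LLAMA_API_KEY', '')}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_request("Summarize the Llama 4 lineup in one sentence.")
# reply = send(payload)["choices"][0]["message"]["content"]
```

Swap in the Scout model string (`meta-llama/Llama-4-Scout-17B-16E-Instruct`) to target the smaller model with the same request shape.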
Function Calling
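A hedged sketch of a function-calling request in the OpenAI-compatible tools format. The `get_weather` tool is a hypothetical example defined here for illustration, not part of any API.

```python
def build_tool_call_request(question: str) -> dict:
    """Build a chat request that exposes one callable tool to the model."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for this example
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_tool_call_request("What's the weather in Lisbon?")
# When the model chooses to call the tool, the response carries a
# tool_calls entry whose `function.arguments` field is a JSON string.
```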
Query models with multiple images
Currently this model supports up to 5 images as input.
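The sketch below builds a multi-image query using the OpenAI-compatible `image_url` content format, enforcing the 5-image limit noted above. The image URLs are placeholders.

```python
def build_multi_image_request(question: str, image_urls: list[str]) -> dict:
    """Combine one text part with up to 5 image parts in a single user turn."""
    if len(image_urls) > 5:
        raise ValueError("Llama 4 currently supports at most 5 input images.")
    content = [{"type": "text", "text": question}]
    content += [
        {"type": "image_url", "image_url": {"url": url}} for url in image_urls
    ]
    return {
        "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [{"role": "user", "content": content}],
    }

payload = build_multi_image_request(
    "What differs between these two charts?",
    ["https://example.com/chart1.png", "https://example.com/chart2.png"],
)
```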
Llama 4 Use-cases
Llama 4 Maverick:
- Instruction following and long-context ICL: Consistently follows precise instructions with in-context learning across very long contexts
- Multilingual customer support: Process support tickets with screenshots in 12 languages to quickly diagnose technical issues
- Multimodal capabilities: Particularly strong at OCR and chart/graph interpretation
- Agent/tool calling work: Designed for agentic workflows with consistent tool calling capabilities
Llama 4 Scout:
- Summarization: Excels at condensing information effectively
- Function calling: Performs well in executing predefined functions
- Long context ICL recall: Shows strong ability to recall information from long contexts using in-context learning
- Long Context RAG: Serves as a workhorse model for coding flows and RAG (Retrieval-Augmented Generation) applications
- Cost-efficient: Provides good performance as an affordable long-context model
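To illustrate the long-context RAG use-case, here is a minimal sketch that concatenates retrieved documents directly into Scout's prompt, leaning on the large context window rather than aggressive truncation. The retrieval step itself is out of scope and assumed to have happened upstream.

```python
def build_rag_request(question: str, documents: list[str]) -> dict:
    """Stuff retrieved documents into a single long-context prompt for Scout."""
    context = "\n\n".join(
        f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(documents)
    )
    prompt = (
        "Answer the question using only the documents below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return {
        "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_rag_request(
    "Who filed the report?",
    ["The quarterly report was filed by A. Doe on March 3."],
)
```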