Note: This feature extends our fine-tuning capabilities to support models from the Hugging Face ecosystem, enabling you to leverage community innovations and your own custom checkpoints.
Overview
The Together Fine-Tuning Platform now supports training custom models beyond our official model catalog. If you’ve found a promising model on Hugging Face Hub, whether it’s a community model, a specialized variant, or your own previous experiment, you can now fine-tune it using our service. Why Use This Feature?- Leverage specialized models: Use domain-specific or task-optimized models as your starting point
- Continue previous work: Resume training from your own checkpoints or experiments
- Access community innovations: Fine-tune cutting-edge models not yet in our official catalog
How It Works

This feature pairs two models:

- Base Model (`model` parameter): A model from Together's official catalog that provides the infrastructure configuration, training settings, and inference setup
- Custom Model (`from_hf_model` parameter): Your actual Hugging Face model that gets fine-tuned

During the fine-tuning job, the platform will:

- Load your model checkpoint
- Apply your fine-tuning data
- Make the trained model available through our inference endpoints
Prerequisites
Before you begin, ensure your model meets these requirements:

Model Architecture

- Supported type: CausalLM models only (models designed for text generation tasks)
- Size limit: A maximum of 100 billion parameters
- Framework version: Compatible with Transformers library v4.55 or earlier
Model Format & Access

- Model weights must be in the `.safetensors` format for security and efficiency
- The model configuration must not require custom code execution (no `trust_remote_code`)
- The repository must be publicly accessible, or you must provide an API token that has access to the private repository
What You'll Need

- The Hugging Face repository URL containing your model
- (Optional) The Hugging Face API token for accessing private repositories
- Your training data prepared according to one of our standard formats
- Your training hyperparameters for the fine-tuning job
Compatibility Check
Before starting your fine-tuning job, validate that your model meets our requirements:

- Architecture Check: Visit your model's Hugging Face page and verify it's a "text-generation" or "causal-lm" model
- Size Check: Look for the parameter count in the model card (should be ≤100B)
- Format Check: Verify the model files include the `.safetensors` format
- Code Check: Ensure the model doesn't require `trust_remote_code=True`
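If you'd rather script these checks, here is a minimal sketch using the `huggingface_hub` library. The heuristics it uses (the pipeline tag, and an `auto_map` entry in `config.json` as a proxy for `trust_remote_code`) are our assumptions, not the platform's exact validation logic.

```python
# Minimal compatibility-check sketch using huggingface_hub (pip install huggingface_hub).
# These are heuristics; the Fine-tuning API's own validation may differ.
import json

from huggingface_hub import HfApi, hf_hub_download

def check_model(repo_id: str, token: str | None = None) -> None:
    info = HfApi(token=token).model_info(repo_id)

    # Architecture check: expect a text-generation (CausalLM) model.
    print("pipeline_tag:", info.pipeline_tag)

    # Format check: the repo must ship .safetensors weights.
    print("safetensors:", any(s.rfilename.endswith(".safetensors") for s in info.siblings))

    # Size check: parameter count, when the Hub reports it (should be <= 100B).
    if info.safetensors is not None:
        print("parameters:", info.safetensors.total)

    # Code check: an "auto_map" entry usually means trust_remote_code is required.
    config = json.load(open(hf_hub_download(repo_id, "config.json", token=token)))
    print("custom code needed:", "auto_map" in config)
    print("max_position_embeddings:", config.get("max_position_embeddings"))

check_model("HuggingFaceTB/SmolLM2-1.7B-Instruct")
```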
Quick Start
Fine-tune a custom model from Hugging Face in three simple steps, sketched below. In this example, `HuggingFaceTB/SmolLM2-1.7B-Instruct` has the Llama architecture, so we pick the Llama base model with the closest model size and max sequence length.
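The snippet below sketches the three steps with the Together Python SDK. The `from_hf_model` parameter is the one described in this guide; the file name is a placeholder, and the base model shown is one plausible choice (see the FAQ on choosing a base model).

```python
# Quick Start sketch using the Together Python SDK (pip install together).
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# Step 1: Upload your training data (placeholder file name).
train_file = client.files.upload(file="my_training_data.jsonl")

# Step 2: Launch the fine-tuning job, pairing a catalog base model with
# the custom Hugging Face checkpoint. Meta-Llama-3-8B-Instruct is one
# plausible base here: Llama architecture with an 8k max sequence length,
# matching SmolLM2's limits.
job = client.fine_tuning.create(
    training_file=train_file.id,
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    from_hf_model="HuggingFaceTB/SmolLM2-1.7B-Instruct",
)

# Step 3: Check the job status.
print(client.fine_tuning.retrieve(job.id).status)
```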
Parameter Explanation
| Parameter | Purpose | Example |
|---|---|---|
| `model` | Specifies the base model family for optimal configuration and inference setup | `"togethercomputer/llama-2-7b-chat"`, `"meta-llama/Llama-3"` |
| `from_hf_model` | The Hugging Face repository containing your custom model weights | `"username/model-name"` |
| `hf_model_revision` | (Optional) A specific commit hash to use instead of the latest version | `"abc123def456"` |
| `hf_api_token` | (Optional) API token for accessing private repositories | `"hf_xxxxxxxxxxxx"` |
Detailed Implementation Guide
Step 1: Prepare Your Training Data

Ensure your training data is formatted correctly and uploaded to the Together platform. Refer to our data preparation guide for detailed instructions on supported formats.

Step 2: Start Fine-Tuning

Launch your fine-tuning job with your custom model:
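For example (a sketch; the hyperparameter values below are placeholders, not recommendations):

```python
# Launching the fine-tuning job with explicit hyperparameters (placeholder values).
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-abc123",               # ID of your uploaded training data
    model="togethercomputer/llama-2-7b-chat",  # base model from the catalog
    from_hf_model="username/model-name",       # your custom HF repository
    n_epochs=3,
    batch_size=8,
    learning_rate=1e-5,
    suffix="my-custom-finetune",               # becomes part of the output model name
)
print(job.id)
```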
Common Use Cases & Examples

Architecture-Specific Examples
Llama Family Models
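For instance, a Llama-architecture checkpoint such as CodeLlama can be paired with a Llama base model, per the troubleshooting advice below (a sketch; the pairing is illustrative):

```python
# Sketch: fine-tuning a Llama-architecture model (CodeLlama treated as Llama).
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-abc123",
    model="togethercomputer/llama-2-7b-chat",            # closest Llama-family base
    from_hf_model="codellama/CodeLlama-7b-Instruct-hf",  # Llama-architecture checkpoint
)
```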
End-to-End Workflow Examples

Complete Domain Adaptation Workflow
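A sketch of the full loop, from data upload to querying the trained model. The repository and file names are hypothetical, and the `output_name` field is our assumption for where the SDK reports the resulting model name.

```python
# End-to-end domain adaptation sketch (hypothetical names throughout).
import time

from together import Together

client = Together()

# 1. Upload domain-specific training data.
train_file = client.files.upload(file="medical_qa.jsonl")

# 2. Fine-tune a (hypothetical) domain-specialized checkpoint.
job = client.fine_tuning.create(
    training_file=train_file.id,
    model="togethercomputer/llama-2-7b-chat",
    from_hf_model="username/medical-llama-7b",
    n_epochs=3,
    suffix="medical-v2",
)

# 3. Poll until the job reaches a terminal state.
while True:
    job = client.fine_tuning.retrieve(job.id)
    print("status:", job.status)
    if "completed" in str(job.status).lower() or "error" in str(job.status).lower():
        break
    time.sleep(60)

# 4. Query the trained model through the inference endpoint
#    (output_name is assumed to hold the resulting model ID).
resp = client.chat.completions.create(
    model=job.output_name,
    messages=[{"role": "user", "content": "Summarize the contraindications."}],
)
print(resp.choices[0].message.content)
```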
Continuing Training from a Previous Fine-tune

Resume training from a checkpoint you previously created to add more data or continue the adaptation process:
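One way to do this, sketched below, is to push your earlier fine-tuned checkpoint to the Hugging Face Hub and point `from_hf_model` at it (repository names are hypothetical):

```python
# Sketch: continuing training from a previous fine-tune pushed to the HF Hub.
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-new-data-456",              # the additional training data
    model="togethercomputer/llama-2-7b-chat",       # same base family as before
    from_hf_model="username/my-previous-finetune",  # your earlier checkpoint
    hf_api_token="hf_xxxxxxxxxxxx",                 # needed if the repo is private
)
```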
Fine-tuning a Community Specialist Model

Leverage community models that have already been optimized for specific domains:
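For example (a sketch; the community repository shown is hypothetical, so substitute a real checkpoint that passes the compatibility checks above):

```python
# Sketch: fine-tuning a community model already specialized for a domain.
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-legal-data-789",
    model="togethercomputer/llama-2-7b-chat",
    from_hf_model="community/legal-llama-7b-instruct",  # hypothetical repo
)
```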
Troubleshooting

Understanding Training Stages

Your fine-tuning job progresses through several stages. Understanding these helps you identify where issues might occur:

- Data Download: The system downloads your model weights from Hugging Face and your training data from Together
- Initialization: Model is loaded onto GPUs and the data pipeline is prepared for training
- Training: The actual fine-tuning occurs based on your specified hyperparameters
- Saving: The trained model is saved to temporary storage
- Upload: The final model is moved to permanent storage for inference availability
Common Errors

- Internal Errors: Training failed due to an internal problem with the Fine-tuning API. Our team is automatically notified and usually starts investigating shortly after the issue occurs. If this persists for a long period of time, please contact support with your job ID.
- CUDA OOM (Out of Memory) Errors: Training failed because it exceeded available GPU memory. To resolve this, reduce the `batch_size` parameter or consider using a smaller model variant.
- Value Errors and Assertions: Training failed due to a checkpoint validation error. These typically occur when model hyperparameters are incompatible or when the model architecture doesn't match expectations. Check that your model is actually CausalLM and that all parameters are within valid ranges.
- Runtime Errors: Training failed due to computational exceptions raised by PyTorch. These often indicate issues with model weights or tensor operations. Verify that your model checkpoint is complete and uncorrupted.
Frequently Asked Questions
Question: How do I choose the base model? There are three variables to consider:

- Model Architecture
- Model Size
- Maximum Sequence Length

For example, suppose you want to fine-tune `HuggingFaceTB/SmolLM2-135M-Instruct`. It has the Llama architecture, the model size is 135M parameters, and the max sequence length is 8k. Looking at the Llama models, the Fine-tuning API supports the Llama 2, Llama 3, Llama 3.1, and Llama 3.2 families. The closest model by number of parameters is `meta-llama/Llama-3.2-1B-Instruct`, but its max sequence length is 131k, which is much higher than the model can support. It's better to use `togethercomputer/llama-2-7b-chat`, which is larger than the provided model but whose max sequence length does not exceed the model's limits.
Issue: “No exact architecture match available”
- Solution: Choose the closest architecture family (e.g., treat CodeLlama as Llama)
- Solution: Use the smallest available base model; the system will adjust automatically
- Solution: Check your model's `config.json` for `max_position_embeddings` (see the snippet below), or use our compatibility checker
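A quick way to read that value (a sketch using `huggingface_hub`):

```python
# Read max_position_embeddings straight from the model's config.json.
import json

from huggingface_hub import hf_hub_download

path = hf_hub_download("HuggingFaceTB/SmolLM2-135M-Instruct", "config.json")
print(json.load(open(path)).get("max_position_embeddings"))  # 8192 for SmolLM2
```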
Question: Which models are supported? Any CausalLM model under 100B parameters that has a corresponding base model in our official catalog. The base model determines the inference configuration. If your checkpoint significantly differs from the base model architecture, you’ll receive warnings, but training will proceed.
Question: Can I fine-tune an adapter/LoRA model? Yes, you can continue training from an existing adapter model. However, the Fine-tuning API will merge the adapter with the base model during training, resulting in a full checkpoint rather than a separate adapter.
Question: Will my model work with inference? Your model will work with inference if:
- The base model you specified is officially supported
- The architecture matches the base model configuration
- Training completed successfully without errors
Question: Can I load a custom model for a dedicated endpoint and train it? No, you cannot use uploaded models for training in the Fine-tuning API. Models uploaded for inference will not appear among the fine-tunable models. To learn more about what you can do with models uploaded for dedicated endpoints, see this page. However, you can upload your model to the Hugging Face Hub and use the repo ID to train it. The trained model will be available for inference after training.
Question: How do I handle private repositories? Include your Hugging Face API token with read permissions for those repositories when creating the fine-tuning job:
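For example (a sketch; the token and repository are placeholders):

```python
# Sketch: fine-tuning from a private HF repository with a read-scoped token.
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-abc123",
    model="togethercomputer/llama-2-7b-chat",
    from_hf_model="your-org/private-model",  # private repository
    hf_api_token="hf_xxxxxxxxxxxx",          # token with read access to the repo
)
```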
Question: What if my model requires custom code? Models requiring `trust_remote_code=True` are not currently supported for security reasons. Consider these alternatives:
- Use a similar model that doesn’t require custom code
- Contact our support team and request adding the model to our official catalog
- Wait for the architecture to be supported officially
Question: How do I specify a particular model version? If you need to use a specific commit hash instead of the latest version, use the `hf_model_revision` parameter:
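For example (a sketch with placeholder values):

```python
# Sketch: pinning the custom model to a specific commit hash.
from together import Together

client = Together()

job = client.fine_tuning.create(
    training_file="file-abc123",
    model="togethercomputer/llama-2-7b-chat",
    from_hf_model="username/model-name",
    hf_model_revision="abc123def456",  # commit hash from the HF repository
)
```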
Support
Need help with your custom model fine-tuning?

- Documentation: Check our error guide
- Community: Join our Discord Community for peer support and tips
- Direct Support: Contact our support team with your job ID for investigation of specific issues
When contacting support, please include:

- Your fine-tuning job ID
- The Hugging Face model repository you’re using
- Any error messages you’re encountering