Comprehensive guides and documentation to help you start and utilize our platform.
Fine-tuning allows you to customize pre-trained language models for your specific use cases and datasets, improving their performance on domain-specific tasks. The Dell Enterprise Hub provides you with guided steps, pre-configured settings and optimized training containers for fine-tuning models from the Dell Model Catalog and deploying them on your Dell infrastructure.
To start training one of the models available in the Dell Model Catalog, follow these steps:
Training containers use Hugging Face AutoTrain, a tool that simplifies model training. AutoTrain supports a variety of configurations to customize training jobs, including:

- `lr`: Initial learning rate for training.
- `epochs`: Number of training epochs.
- `batch_size`: Size of the batches used during training.

More details on these configurations can be found in the AutoTrain CLI documentation.
To fine-tune LLMs, your dataset should have a column that contains the formatted training samples. The column used for training is set with the `text-column` argument when you start training; in the example below, it is `text`.
Example format:

| text |
|---|
| human: hello \n bot: hi nice to meet you |
| human: how are you \n bot: I am fine |
| human: What is your name? \n bot: My name is Mary |
| human: Which is the best programming language? \n bot: Python |

You can use either CSV or JSONL files. For more details, refer to the AutoTrain documentation.
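
As a rough illustration of the dataset format described above, the following sketch writes the example samples to a CSV file and a JSONL file with a single `text` column, using only the Python standard library. The helper names (`write_csv`, `write_jsonl`), the output file names, and the reading of `\n` in the example as a real newline are assumptions made for this example, not part of the platform.

```python
import csv
import json

# Samples from the example format; "\n" is assumed to denote a real newline.
SAMPLES = [
    "human: hello \n bot: hi nice to meet you",
    "human: how are you \n bot: I am fine",
    "human: What is your name? \n bot: My name is Mary",
    "human: Which is the best programming language? \n bot: Python",
]


def write_csv(path, rows, column="text"):
    """Write one training sample per row under a single header column."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=[column])
        writer.writeheader()
        for row in rows:
            # The csv module quotes fields that contain newlines.
            writer.writerow({column: row})


def write_jsonl(path, rows, column="text"):
    """Write one JSON object per line, each with a single text field."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps({column: row}) + "\n")


if __name__ == "__main__":
    # Hypothetical output file names; pick whichever format you prefer.
    write_csv("train.csv", SAMPLES)
    write_jsonl("train.jsonl", SAMPLES)
```

Whichever file you use, the `column` value should match the name you pass as the text column when starting the training job.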
To deploy a fine-tuned model on your Dell platform, use the "Bring Your Own Model" (BYOM) Dell inference container available in the Dell Enterprise Hub. This container lets you serve fine-tuned models in your Dell environment.
Unlike direct deployment of models provided in the Dell Model Catalog, a fine-tuned model is mounted into the BYOM Dell inference container at deployment time. Make sure that the mounted directory contains the fine-tuned model and that the path you provide is correct.
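
Before mounting a directory, a quick local check can catch a wrong path or an empty folder. The sketch below is one possible way to do that. The expected file names (`config.json` plus at least one `.safetensors` or `.bin` weight file) are assumptions based on common Hugging Face model layouts, not requirements stated by the BYOM container, and `check_model_dir` is a hypothetical helper; adjust the lists to match what your training job actually produced.

```python
from pathlib import Path

# Assumption: files a Hugging Face-style model directory usually contains.
DEFAULT_REQUIRED = ("config.json",)
WEIGHT_SUFFIXES = (".safetensors", ".bin")


def check_model_dir(path, required=DEFAULT_REQUIRED, weight_suffixes=WEIGHT_SUFFIXES):
    """Return a list of problems; an empty list means the directory looks usable."""
    root = Path(path)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    # Only look at regular files directly inside the directory.
    names = {p.name for p in root.iterdir() if p.is_file()}

    problems = [f"missing {name}" for name in required if name not in names]
    if not any(name.endswith(tuple(weight_suffixes)) for name in names):
        problems.append("no weight file found")
    return problems


if __name__ == "__main__":
    # Hypothetical path; replace with the directory you plan to mount.
    for problem in check_model_dir("./my-fine-tuned-model"):
        print(problem)
```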
For models fine-tuned from the Gemma base model, the following hardware configurations are recommended for deployment:
| Dell Platforms | Number of Shards (GPUs) | Max Input Tokens | Max Total Tokens | Max Batch Prefill Tokens |
|---|---|---|---|---|
| xe9680-nvidia-h100 | 1 | 4000 | 4096 | 16182 |
| xe9680-amd-mi300x | 1 | 4000 | 4096 | 16182 |
| xe8640-nvidia-h100 | 1 | 4000 | 4096 | 16182 |
| r760xa-nvidia-h100 | 1 | 4000 | 4096 | 16182 |
| r760xa-nvidia-l40s | 2 | 4000 | 4096 | 8192 |
| r760xa-nvidia-l40s | 4 | 4000 | 4096 | 16182 |
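
To make the relationship between the token columns concrete, the sketch below models one table row and derives how many tokens remain for generation. It assumes that Max Total Tokens bounds the prompt plus the generated tokens, and that Max Batch Prefill Tokens should be at least Max Input Tokens; both are assumptions about how these limits are typically interpreted, not statements from this guide. `DeploymentLimits` is a hypothetical helper class.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DeploymentLimits:
    """One row of a recommended-configuration table."""

    platform: str
    num_shards: int
    max_input_tokens: int
    max_total_tokens: int
    max_batch_prefill_tokens: int

    def generation_budget(self) -> int:
        # Assumption: total tokens = prompt tokens + generated tokens.
        return self.max_total_tokens - self.max_input_tokens

    def problems(self) -> List[str]:
        """Return human-readable inconsistencies; empty if none are found."""
        issues = []
        if self.max_input_tokens >= self.max_total_tokens:
            issues.append("max input tokens must be below max total tokens")
        if self.max_batch_prefill_tokens < self.max_input_tokens:
            issues.append("max batch prefill tokens is below max input tokens")
        return issues


# First row of the Gemma table.
row = DeploymentLimits("xe9680-nvidia-h100", 1, 4000, 4096, 16182)
print(row.generation_budget())  # 96
print(row.problems())           # []
```

Under these assumptions, a 4000-token prompt on this configuration leaves 96 tokens for the model's response.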
For models fine-tuned from the Llama 3.1 8B base model, the following SKUs are suitable:
| Dell Platforms | Number of Shards (GPUs) | Max Input Tokens | Max Total Tokens | Max Batch Prefill Tokens |
|---|---|---|---|---|
| xe9680-nvidia-h100 | 1 | 8000 | 8192 | 32768 |
| xe9680-amd-mi300x | 1 | 8000 | 8192 | 32768 |
| xe8640-nvidia-h100 | 1 | 8000 | 8192 | 32768 |
| r760xa-nvidia-h100 | 1 | 4000 | 4096 | 16182 |
| r760xa-nvidia-l40s | 2 | 8000 | 8192 | 16182 |
| r760xa-nvidia-l40s | 4 | 8000 | 8192 | 32768 |
For models fine-tuned from the Llama 3.1 70B base model, use these configurations for deployment:
| Dell Platforms | Number of Shards (GPUs) | Max Input Tokens | Max Total Tokens | Max Batch Prefill Tokens |
|---|---|---|---|---|
| xe9680-nvidia-h100 | 4 | 8000 | 8192 | 16182 |
| xe9680-nvidia-h100 | 8 | 8000 | 8192 | 16182 |
| xe9680-amd-mi300x | 4 | 8000 | 8192 | 16182 |
| xe9680-amd-mi300x | 8 | 8000 | 8192 | 16182 |
| xe8640-nvidia-h100 | 4 | 8000 | 8192 | 8192 |
For models fine-tuned from the Mistral 7B base model, the recommended hardware configurations are:
| Dell Platforms | Number of Shards (GPUs) | Max Input Tokens | Max Total Tokens | Max Batch Prefill Tokens |
|---|---|---|---|---|
| xe9680-nvidia-h100 | 1 | 8000 | 8192 | 32768 |
| xe9680-amd-mi300x | 1 | 8000 | 8192 | 32768 |
| xe8640-nvidia-h100 | 1 | 8000 | 8192 | 32768 |
| r760xa-nvidia-h100 | 1 | 4000 | 4096 | 16182 |
| r760xa-nvidia-l40s | 2 | 8000 | 8192 | 16182 |
| r760xa-nvidia-l40s | 4 | 8000 | 8192 | 32768 |
For models fine-tuned from the Mixtral base model, the deployment configurations are:
| Dell Platforms | Number of Shards (GPUs) | Max Input Tokens | Max Total Tokens | Max Batch Prefill Tokens |
|---|---|---|---|---|
| xe9680-nvidia-h100 | 4 | 8000 | 8192 | 16182 |
| xe9680-nvidia-h100 | 8 | 8000 | 8192 | 16182 |
| xe9680-amd-mi300x | 4 | 8000 | 8192 | 16182 |
| xe9680-amd-mi300x | 8 | 8000 | 8192 | 16182 |
| xe8640-nvidia-h100 | 4 | 8000 | 8192 | 8192 |
| r760xa-nvidia-h100 | 4 | 8000 | 8192 | 16182 |