Guides and documentation to help you get started with and use our platform.
Quickstart
To deploy a model from the Dell Model Catalog, follow these steps:
Browse the Model Catalog: Navigate to the Model Catalog and select the model you want to deploy.
Open the Model Card: Click on the model to view its details, specifications, and deployment options.
Configure Deployment Settings:
Select your Dell Platform (e.g., xe9680-nvidia-h100, r760xa-nvidia-l40s)
Choose the number of GPUs/shards needed
Configure parameters like max tokens and batch size
Copy the Deployment Command: The platform generates a Docker command optimized for your selected hardware configuration.
Run the Container: Execute the command in your Dell environment to start the model inference server.
Test Your Model: Once the container is running, you can test the model using the provided sample code snippets or API endpoints.
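As a rough illustration of the final test step, the following Python sketch builds a JSON request for a running inference container. The base URL, the /generate path, and the payload field names are assumptions made for illustration only; the sample code shown on the model card is the authoritative reference for the endpoint and request format of your deployment.

```python
import json
import urllib.request


def build_generate_request(prompt, base_url="http://localhost:8080", max_new_tokens=64):
    """Build (but do not send) a POST request for a text-generation endpoint.

    The "/generate" path and the payload shape are assumptions; replace them
    with the values from the sample snippet on your model card.
    """
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_generate_request("What is deep learning?")
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # Uncomment once the container is running and the endpoint is confirmed:
    # print(send(req))
```

Building the request separately from sending it lets you inspect the URL and body before the container is reachable.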
For models without pre-downloaded weights, set both the HF_TOKEN and MODEL_ID environment variables when you run the Docker container so that it can authenticate with the Hugging Face Hub and download the model weights.
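The sketch below shows one way the two environment variables might be attached to a docker run command from Python. The image name and model ID are placeholders, and build_docker_command is a hypothetical helper, not part of any Dell tool; in practice, start from the command the platform generates and add the variables to it. The -e and --gpus options are standard docker run flags, and passing -e HF_TOKEN without a value forwards the variable from the host environment, which keeps the token out of the command text.

```python
import os
import shlex


def build_docker_command(image, model_id, env=None, num_gpus=1):
    """Return a docker run argument list with MODEL_ID and HF_TOKEN set.

    Hypothetical helper: the image and model_id values are supplied by the
    caller and should come from the command generated by the platform.
    """
    env = os.environ if env is None else env
    if not model_id:
        raise ValueError("MODEL_ID is required to download the model weights")
    if not env.get("HF_TOKEN"):
        raise ValueError("HF_TOKEN must be set in the environment")
    return [
        "docker", "run",
        "--gpus", str(num_gpus),        # number of GPUs exposed to the container
        "-e", f"MODEL_ID={model_id}",   # which model to download
        "-e", "HF_TOKEN",               # forwarded from the host environment
        image,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    command = build_docker_command(
        image="registry.example.com/placeholder-image:latest",
        model_id="org-name/model-name",
        env={"HF_TOKEN": "hf_placeholder"},
        num_gpus=2,
    )
    print(shlex.join(command))
```

The helper fails early when either value is missing, which is easier to diagnose than a download that fails inside the container.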
Managing deployments via CLI and Python SDK
Dell provides both a Command Line Interface (CLI) and a Python SDK to support various developer workflows. These tools enable you to:
Browse available models and applications.
Auto-generate deployment snippets for Dell hardware.
Run system checks for software/hardware compatibility.
Use APIs for greater control and automation.
For more information, see the Dell AI CLI and SDK documentation.