Documentation

Comprehensive guides and documentation to help you start and utilize our platform.

Quickstart

To deploy a model from the Dell Model Catalog, you can follow these simple steps:

  1. Browse the Model Catalog: Navigate to the Model Catalog and select the model you want to deploy.
  2. Open the Model Card: Click on the model to view its details, specifications, and deployment options.
  3. Configure Deployment Settings:
    • Select your Dell Platform (e.g., xe9680-nvidia-h100, r760xa-nvidia-l40s)
    • Choose the number of GPUs/shards needed
    • Configure parameters like max tokens and batch size
  4. Copy the Deployment Command: The platform will generate a Docker command optimized for your selected hardware configuration.
  5. Run the Container: Execute the command in your Dell environment to start the model inference server.
  6. Test Your Model: Once the container is running, you can test the model using the provided sample code snippets or API endpoints.

For models without pre-downloaded weights, make sure to include both HF_TOKEN and MODEL_ID environment variables when running your Docker container to authenticate with Hugging Face Hub and download the model weights.

Managing deployments via CLI and Python SDK

Dell provides both a Command Line Interface (CLI) and a Python SDK to support various developer workflows. These tools enable you to:

  • Browse available models and applications.
  • Auto-generate deployment snippets for Dell hardware.
  • Run system checks for software/hardware compatibility.
  • Use APIs for greater control and automation.

For more information you can visit the Dell AI CLI and SDK documentation: