Automatically spin up the entire infrastructure for production-grade AI pipelines, from data ingest to training and serving, with simple code. Done in hours, not months.
Complex AI pipelines, built by writing a simple config
Write a simple job configuration with task_name: axolotl, which is already integrated. Add your custom parameters, specify the training dataset, and define the output location.
When the job is submitted, the platform provisions an A100 instance on a VM or Kubernetes and begins training. The final model is stored in the output location you define.
name = "qlora_finetuning"
description = "Finetuning Llama-7B with Axolotl"
job = { task_name="axolotl", profile="node_a100", mode="train", params = { config_text="config/lora8b-instruct.yml" } }
input = 'gs://bucket/input'
output = 'gs://bucket/output'

Specify the location of the notebook that performs data augmentation using GPT-OSS-20B. Include the required models in the startup configuration and set the data sources and output destination.
Once the job is submitted, the platform provisions GPT-OSS-20B automatically and executes the notebook as soon as the environment is ready.
name = 'generate_qa_llm'
description = 'Augment data with GPT-OSS-20B'
cmd = 'python augmented.ipynb'
startup = [{ name="sglang", model = "openai/gpt-oss-20b" }]
input = 'gs://bucket/input'
output = 'gs://bucket/output'

Define the job using the pre-integrated vLLM runtime. Specify the models to be served and the required compute resources.
When the job is submitted, the platform provisions new GPU infrastructure automatically and launches the models as soon as the environment is ready.
name = "host_llm_inference"
description = "Serving custom Llama-70B"
job = { task_name="vllm", params = { model="llama-70b-finetuned", profile="medium_gpu" } }

Define a job that executes Python code to convert PDF files into Markdown inside the Docling container. Configure the required compute resources.
After the job is submitted, the platform deploys the Docling container and runs the Python workflow automatically on the specified infrastructure.
name = 'extract_pdf'
description = 'Extract PDF to Markdown with Docling'
image = 'docling:latest'
cmd = "chmod +x run.sh && ./run.sh"
input = 'gs://bucket/input'
output = 'gs://bucket/output'
compute = 'single_a100'

Get a better view of your pipelines. Parallel execution, chaining, and dependencies between operations are all supported.
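As an illustration of chaining and dependencies, a multi-job pipeline might be sketched as below. This is a hypothetical example: the pipeline table and the depends_on field are illustrative assumptions, not documented DAX syntax.

```toml
# Hypothetical sketch: the [pipeline] table and `depends_on` field are
# illustrative assumptions, not documented DAX syntax.
[pipeline]
name = "pdf_to_finetune"

[[pipeline.jobs]]
name = "extract_pdf"               # runs first

[[pipeline.jobs]]
name = "generate_qa_llm"
depends_on = ["extract_pdf"]       # chained after extraction

[[pipeline.jobs]]
name = "qlora_finetuning"
depends_on = ["generate_qa_llm"]   # runs once augmentation finishes
```

Jobs without a dependency on each other could run in parallel under such a scheme.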
Build infrastructure automatically. Scale from a single VM up to Kubernetes clusters.
Spin up a workstation for private AI model development inside your organization's private network. Access is secured via IAP, which protects internal data while you work from the public internet.
Features
- Real-time online editor collaboration
- Pipeline building with high-compute resources
- Test environment provisioning
dax project deploy
gcloud compute ssh deploy --tunnel-through-iap
Designed for repeatable deployment across teams and environments, DAX translates complex compute topologies into YAML-based components, simplifying infrastructure for LLM and data science pipelines with consistency and precision.
Features
- VM and cluster support
- Spot / preemptible options for cost savings
- Configuration overrides for more advanced usage
gcp_vm_g2_16:
  machineType: g2-standard-16
  gpu: 1
  osImage: projects/cos-cloud/global/images/family/cos-121-lts
  preemptible: "true"
  provisioningModel: SPOT
  imageSize: 50
  bootSize: 30
  alternativeZones:
    - us-east1-b
    - us-central1-b
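A component like this would presumably be referenced from a job config the same way the earlier examples reference profiles such as node_a100; the following one-liner is a hypothetical usage sketch, not documented DAX syntax.

```toml
# Hypothetical: selecting the YAML component above by its name
# via the `profile` field seen in the earlier job examples.
job = { task_name="axolotl", profile="gcp_vm_g2_16", mode="train" }
```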
Fully compatible with existing Kubernetes environments or deployable on demand through DAX. Support for Ray and native Kubernetes jobs provides flexibility for a wide range of workloads. Integrated gang scheduling ensures efficient GPU allocation for high-intensity AI tasks. Operational across GKE, on-premises deployments, and any standard Kubernetes cluster.
Features
- Cloud and on-premises cluster integration.
- Jobs via Ray, AppWrapper, and Kubernetes.
- Gang scheduling for GPU compute.
- More advanced features.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {} # match all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory", "ephemeral-storage"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 10000 # Infinite quota.
      - name: "memory"
        nominalQuota: 10000Gi # Infinite quota.
      - name: "ephemeral-storage"
        nominalQuota: 10000Gi # Infinite quota.
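In a standard Kueue setup, a ClusterQueue is paired with a namespaced LocalQueue that workloads submit to; a minimal sketch following the Kueue documentation (the namespace and queue name here are assumptions):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default          # assumed namespace
  name: user-queue            # assumed queue name
spec:
  clusterQueue: cluster-queue # references the ClusterQueue by name
```

Jobs are then routed to this queue by carrying the label kueue.x-k8s.io/queue-name: user-queue.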