
Billing & Metering

Business Value: Know exactly what each tenant, project, and user consumes — down to GPU-hours, storage bytes, and network bandwidth — with transparent, auditable billing records.

What Gets Metered

| Resource | What's Measured | How | Granularity |
|---|---|---|---|
| GPU Compute | GPU-hours, utilization %, memory, power draw, temperature | GPU telemetry exporter on every node | Per GPU, configurable intervals |
| CPU Compute | CPU-hours, utilization %, core count, memory | Node exporter on every node | Per node, configurable intervals |
| Bare Metal Allocation | Node-hours: allocation start → release | Orchestration DB timestamps | Per node, precise timestamps |
| K8s Cluster Runtime | Cluster-hours: creation → deletion | Orchestration DB lifecycle events | Per cluster |
| Slurm Jobs | Per-job GPU-hours, CPU-hours, elapsed time, memory, exit code | Slurm accounting daemon → relational database | Per job |
| Storage (Parallel FS) | Directory size, quota utilization, I/O throughput | Storage reporting + storage plugin | Per tenant directory |
| Storage (Object/Platform) | PVC usage, S3 bucket size, object count | Storage management + CSI metrics | Per PVC, per bucket |
| Network (InfiniBand) | Bandwidth per tenant, RDMA throughput, packet drops | Fabric manager telemetry + IB port counters | Per partition (per tenant) |
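
As an illustration of how the time-based rows above turn into billable quantities, the following minimal sketch derives node-hours from an allocation's start and release timestamps, and GPU-hours from a count of fixed-interval telemetry samples. The function names, the sample-counting approach, and the example values are assumptions made for illustration; they are not the platform's actual metering code.

```python
from datetime import datetime, timezone


def node_hours(allocated_at: datetime, released_at: datetime) -> float:
    """Billable node-hours between allocation start and release."""
    if released_at < allocated_at:
        raise ValueError("release timestamp precedes allocation timestamp")
    return (released_at - allocated_at).total_seconds() / 3600.0


def gpu_hours(sample_count: int, interval_seconds: float) -> float:
    """GPU-hours for one GPU, assuming each telemetry sample
    represents one full collection interval of allocated time."""
    if sample_count < 0 or interval_seconds <= 0:
        raise ValueError("sample_count must be >= 0 and interval_seconds > 0")
    return sample_count * interval_seconds / 3600.0


# Hypothetical allocation: 08:00 to 14:30 UTC on the same day.
start = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 14, 30, tzinfo=timezone.utc)
print(node_hours(start, end))   # 6.5

# Hypothetical telemetry: 240 samples collected every 30 seconds.
print(gpu_hours(240, 30))       # 2.0
```

Timestamp-based accounting suits the bare metal and cluster rows, where the orchestration database records exact lifecycle events. Sample-based accounting is only as precise as the configured collection interval.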

Metering Pipeline

Raw metrics flow through a multi-stage pipeline: hardware-level exporters collect data at configurable intervals → a time-series database scrapes the exporters and stores the data short-term → long-term HA storage retains it → the metering service aggregates it into billable records per tenant, project, and user → billing reports are generated and can be exported as CSV or PDF.

[Figure: Metering Pipeline & Billing Flow]
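
The aggregation and export stages of this pipeline can be sketched roughly as follows. This is a simplified illustration, not the metering service itself: the `UsageSample` record, its field names, and the CSV column layout are all hypothetical, and the sketch assumes raw metrics have already been converted into per-sample quantities such as GPU-hours.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSample:
    """One metered quantity attributed to a tenant/project/user (hypothetical schema)."""
    tenant: str
    project: str
    user: str
    resource: str    # e.g. "gpu_hours"
    quantity: float


def aggregate(samples):
    """Sum quantities per (tenant, project, user, resource) key."""
    totals = defaultdict(float)
    for s in samples:
        totals[(s.tenant, s.project, s.user, s.resource)] += s.quantity
    return dict(totals)


def to_csv(totals) -> str:
    """Render aggregated totals as CSV text, one billable record per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["tenant", "project", "user", "resource", "quantity"])
    for key in sorted(totals):
        writer.writerow([*key, f"{totals[key]:.2f}"])
    return buf.getvalue()


# Hypothetical samples for illustration only.
samples = [
    UsageSample("acme", "nlp", "alice", "gpu_hours", 1.5),
    UsageSample("acme", "nlp", "alice", "gpu_hours", 2.5),
    UsageSample("acme", "nlp", "bob", "gpu_hours", 1.0),
]
report = to_csv(aggregate(samples))
print(report)
```

Keying the totals on the full tenant/project/user/resource tuple keeps each row traceable to the samples behind it, which is what makes a billing record auditable.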

Quota Management

Pre-set resource limits prevent overspend and ensure fair capacity distribution. Quotas are enforced in real time: if a tenant exceeds its quota, new resource creation is blocked. The portal dashboard shows quota usage with color-coded warnings at configurable thresholds.
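
A minimal sketch of this enforcement logic is shown below. The `Quota` class, the 80% and 95% thresholds, and the green/yellow/red labels are assumptions chosen for illustration; the platform's actual thresholds are configurable and its implementation may differ. The sketch also assumes that a request is rejected when granting it would push usage past the limit.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when a request would push usage past the quota limit."""


@dataclass
class Quota:
    limit: float                 # e.g. GPU-hours allowed in the billing period
    used: float = 0.0
    warn_at: float = 0.80        # hypothetical warning threshold
    critical_at: float = 0.95    # hypothetical critical threshold

    def allocate(self, requested: float) -> None:
        """Record new usage, or block it if it would exceed the limit."""
        if self.used + requested > self.limit:
            raise QuotaExceeded(
                f"requested {requested}, only {self.limit - self.used} remaining"
            )
        self.used += requested

    def status(self) -> str:
        """Color code for a dashboard, based on the usage ratio."""
        if self.limit <= 0:
            return "red"
        ratio = self.used / self.limit
        if ratio >= self.critical_at:
            return "red"
        if ratio >= self.warn_at:
            return "yellow"
        return "green"


quota = Quota(limit=100.0)
quota.allocate(70.0)
print(quota.status())    # green
quota.allocate(15.0)
print(quota.status())    # yellow
try:
    quota.allocate(20.0)  # would reach 105 > 100, so it is blocked
except QuotaExceeded as exc:
    print("blocked:", exc)
```

Checking the limit before recording usage means a rejected request leaves the counter unchanged, so existing workloads are unaffected while new creation is blocked.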