Architecture Tiers

[Diagram: Dflare AI Architecture Tiers]

Platform Component Map

| Layer | Components | Technology |
| --- | --- | --- |
| Portal & API | Orchestration UI, REST APIs, WebSocket status | Modern web framework, microservices, relational & document databases, caching layer |
| Identity & Access | Authentication, authorization, tenant management | Enterprise IAM (OAuth2/OIDC), JWT, RBAC + ABAC |
| Compute Orchestration | Cluster lifecycle, workload management | Kubernetes, Slurm (operator-based) |
| Network Fabric | Tenant isolation, VLAN/VRF management | Ethernet fabric controller, VXLAN/EVPN, InfiniBand fabric manager |
| Storage | High-performance GPU + platform storage | Parallel filesystem (InfiniBand), object/file storage (Ethernet) |
| GPU Runtime | GPU access, telemetry, scheduling | Multi-vendor GPU operators |
| Monitoring | Metrics, alerting, dashboards | HA time-series database, metric collectors, dashboarding |
| Billing | Usage tracking, quota, reporting | Metering service, relational database |
| Container Runtime | Container lifecycle on bare metal | OCI-compliant container runtime |
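
To make the layer interactions concrete, the sketch below walks a hypothetical cluster-provisioning request from the Portal & API layer down through the stack. The endpoint, payload schema, and token handling are illustrative assumptions, not the platform's actual API.

```python
import json
import urllib.request

# Hypothetical portal API call: endpoint path, payload fields, and bearer-token
# auth are assumptions for illustration, not the real Dflare AI interface.
PORTAL_URL = "https://portal.example.internal/api/v1/clusters"

request_body = {
    "tenant_id": "acme-corp",       # resolved by Identity & Access (RBAC/ABAC)
    "cluster_type": "slurm",        # handled by Compute Orchestration
    "nodes": 8,                     # provisioned by the bare metal layer
    "network": {"isolated": True},  # Network Fabric carves a VRF/VLAN + IB partition
    "storage": {"quota_tb": 100},   # Storage layer creates a filesystem quota
}

req = urllib.request.Request(
    PORTAL_URL,
    data=json.dumps(request_body).encode(),
    headers={
        "Authorization": "Bearer <token>",  # JWT issued via OAuth2/OIDC
        "Content-Type": "application/json",
    },
    method="POST",
)

with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # e.g. {"cluster_id": "...", "status": "provisioning"}
```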

Key Microservices

  • Bare Metal Manager — Bare metal provisioning; translates portal requests into bare metal controller API calls
  • Workflow Orchestrator — Workflow engine; handles provisioning logic, scheduling, and step sequencing
  • Cluster Manager — Kubernetes and Slurm cluster lifecycle management
  • Network Manager — Interfaces with fabric controller for VRF/VLAN operations
  • Volume Service — Parallel filesystem provisioning; directories, access control maps, quotas
  • Auth Service — IAM integration for multi-tenant RBAC/ABAC enforcement
  • Metering Service — Usage aggregation, billing calculation, quota tracking
  • Metric Server — Kubernetes metrics processing for tenant clusters
  • Container Registry — Internal image registry for air-gapped deployments
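
The sketch below illustrates how a workflow engine such as the Workflow Orchestrator might sequence these services during provisioning. The step names, context fields, and decorator-based registration are hypothetical, not the orchestrator's real interface.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Workflow:
    """Toy sequential workflow engine; a real one would persist state and retry."""
    steps: list[tuple[str, Callable[[dict], None]]] = field(default_factory=list)

    def step(self, name: str):
        def register(fn: Callable[[dict], None]):
            self.steps.append((name, fn))
            return fn
        return register

    def run(self, ctx: dict) -> None:
        for name, fn in self.steps:
            print(f"[workflow] running step: {name}")
            fn(ctx)

wf = Workflow()

@wf.step("allocate-bare-metal")   # Bare Metal Manager
def allocate(ctx): ctx["nodes"] = [f"node-{i}" for i in range(ctx["node_count"])]

@wf.step("configure-network")     # Network Manager -> fabric controller
def network(ctx): ctx["vlan"] = 2001

@wf.step("provision-storage")     # Volume Service -> parallel filesystem
def storage(ctx): ctx["volume"] = f"/tenants/{ctx['tenant']}"

@wf.step("bootstrap-cluster")     # Cluster Manager -> Kubernetes/Slurm
def cluster(ctx): ctx["cluster_id"] = f"slurm-{ctx['tenant']}"

wf.run({"tenant": "acme", "node_count": 4})
```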

Infrastructure Specifications

Compute

| Specification | Detail |
| --- | --- |
| Supported Servers | Enterprise GPU server platforms from multiple vendors |
| Supported GPUs | NVIDIA, AMD, and Intel accelerators (various models) |
| GPUs per Node | Configurable; typically multiple GPUs per node |
| GPU Interconnect | Vendor-specific high-bandwidth intra-node GPU links |
| InfiniBand Links | Multiple high-bandwidth links per node (configurable) |
| Ethernet | Frontend management network |
| Maximum Nodes | Scalable to thousands of nodes per deployment |
| Maximum GPUs | Scalable to thousands of accelerators per deployment |
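
One way to confirm the per-node accelerator inventory is NVIDIA's NVML Python bindings (nvidia-ml-py), sketched below. This covers NVIDIA GPUs only; AMD and Intel accelerators need their own tooling (e.g. ROCm SMI, xpu-smi), and the platform's actual inventory mechanism may differ.

```python
import pynvml  # pip install nvidia-ml-py

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()
    print(f"GPUs on this node: {count}")
    for i in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older bindings return bytes
            name = name.decode()
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"  gpu{i}: {name}, {mem.total // 2**30} GiB")
finally:
    pynvml.nvmlShutdown()
```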

Network

| Fabric | Technology | Characteristics |
| --- | --- | --- |
| GPU-to-GPU (training) | InfiniBand via RDMA/collective communications | High bandwidth, multiple links per node |
| GPU-to-Storage (data) | InfiniBand via parallel filesystem/RDMA | High bandwidth, multiple links per node |
| Platform Services | Ethernet (dedicated VLAN) | Standard datacenter Ethernet |
| Tenant Isolation | VRF (VXLAN/EVPN) + InfiniBand partition key | Hardware-enforced |
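
The tenant isolation row combines two identifier spaces: a 24-bit VXLAN VNI for the tenant's VRF on the Ethernet fabric, and a 16-bit InfiniBand partition key (a 15-bit key plus a full-membership bit). The sketch below shows one plausible allocation scheme; the base values and mapping are assumptions, not the fabric controller's actual logic.

```python
# Illustrative per-tenant isolation bookkeeping; the allocation scheme is assumed.
VNI_BASE = 10_000          # hypothetical starting VNI for tenant VRFs (VNIs are 24-bit)
PKEY_FULL_MEMBER = 0x8000  # high bit of a PKey marks full partition membership

def tenant_isolation(tenant_index: int) -> dict:
    pkey = (tenant_index + 1) & 0x7FFF  # key portion is 15 bits; 0 is invalid
    return {
        "vxlan_vni": VNI_BASE + tenant_index,  # programmed via the Ethernet fabric controller
        "ib_pkey": PKEY_FULL_MEMBER | pkey,    # pushed to the IB fabric manager
    }

for i, tenant in enumerate(["acme", "globex"]):
    iso = tenant_isolation(i)
    print(f"{tenant}: VNI={iso['vxlan_vni']}, PKey=0x{iso['ib_pkey']:04X}")
```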

Performance Tuning (Applied via Golden OS Image)

BIOS Settings (see the verification sketch after this list):

  1. Performance profile, C-states disabled — Maximum clock speeds, zero power-save latency
  2. NUMA alignment enabled — GPU PCIe aligned to nearest CPU socket
  3. PCIe ASPM disabled, virtualization off — Zero overhead for bare metal GPU workloads
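
These firmware settings can be spot-checked from a booted node; the sketch below reads standard Linux sysfs interfaces, assuming a Linux golden image. The expected values in the comments mirror the list above and are illustrative, not an official validation procedure.

```python
from pathlib import Path

# 1. C-states: with C-states disabled in firmware, the kernel should expose
#    only shallow idle states (POLL/C1).
cpuidle = Path("/sys/devices/system/cpu/cpu0/cpuidle")
if cpuidle.exists():
    names = [(s / "name").read_text().strip() for s in sorted(cpuidle.glob("state*"))]
    print("exposed idle states:", names)

# 2. NUMA alignment: each GPU's PCIe function should report a valid NUMA node
#    (-1 means firmware reported no affinity).
for dev in sorted(Path("/sys/bus/pci/devices").iterdir()):
    pci_class = (dev / "class").read_text().strip()
    if pci_class.startswith(("0x0300", "0x0302")):  # VGA / 3D controller class codes
        numa = (dev / "numa_node").read_text().strip()
        print(f"{dev.name}: NUMA node {numa}")
```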

OS Settings (see the verification sketch after this list):

  4. CPU governor: performance — Locked maximum frequency
  5. Huge pages: always — Optimized GPU memory allocation
  6. IOMMU enabled with passthrough — Direct device passthrough for accelerators
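
A minimal verification sketch for these OS settings, assuming a standard Linux golden image; the paths are stock kernel interfaces and the expected values come from the list above.

```python
from pathlib import Path

# 4. CPU governor: expect "performance"
governor = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor").read_text().strip()
print("cpu governor:", governor)

# 5. Transparent huge pages: expect the "[always]" mode to be selected
thp = Path("/sys/kernel/mm/transparent_hugepage/enabled").read_text().strip()
print("transparent huge pages:", thp)

# 6. IOMMU passthrough: expect iommu=pt (or iommu.passthrough=1) on the kernel cmdline
cmdline = Path("/proc/cmdline").read_text()
print("iommu passthrough:", "iommu=pt" in cmdline or "iommu.passthrough=1" in cmdline)
```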