Platform

On-premises AI inference, operated as a service.

NVIDIA GPU nodes, a managed software stack, and remote operations. Deployed as a portable unit or a containerized cluster on your network. We run it. Your data and models stay on hardware you own.

01 · HARDWARE

Two form factors, one software stack.

Build on the portable unit and scale to a cluster without changing your application.

Portable

2 to 4 NVIDIA Blackwell-generation GPUs
Up to 384 GB VRAM; 70B models on a single GPU
~2.9 to 5.8 PFLOPS FP8
Standard 15 to 20A wall outlet, ~1 kW
Operational 60 seconds after power-on

Container · 8 ft enclosure

2 to 6 nodes, 16 to 48 GPUs
Up to ~4.6 TB VRAM; ~23 to 70 PFLOPS FP8
Power, cooling, fire suppression, physical security integrated
100A 208V 3-phase; commissioned in 12 to 14 weeks
Larger containerized builds beyond 100 PFLOPS available
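
The figures for both form factors are consistent with simple linear scaling. As a rough check, the sketch below derives per-GPU values from the portable unit's upper bound (384 GB and ~5.8 PFLOPS FP8 across 4 GPUs) and applies them to each configuration. The per-GPU numbers and the linear-scaling assumption are inferred from the ranges above, not stated specifications.

```python
# Per-GPU figures inferred from the portable unit's upper bound (assumption):
# 384 GB and ~5.8 PFLOPS FP8 across 4 GPUs.
GB_PER_GPU = 384 / 4        # 96 GB
PFLOPS_PER_GPU = 5.8 / 4    # ~1.45 PFLOPS FP8


def capacity(gpus: int) -> tuple[float, float]:
    """Return (VRAM in GB, FP8 PFLOPS) for a GPU count, assuming linear scaling."""
    return gpus * GB_PER_GPU, gpus * PFLOPS_PER_GPU


configs = [
    ("portable min", 2),
    ("portable max", 4),
    ("container min", 16),   # 2 nodes
    ("container max", 48),   # 6 nodes
]

for label, gpus in configs:
    vram_gb, pflops = capacity(gpus)
    print(f"{label:14s} {gpus:2d} GPUs  {vram_gb:6.0f} GB  {pflops:5.1f} PFLOPS FP8")
```

Under these assumptions, 48 GPUs give 4,608 GB (the "~4.6 TB" above) and roughly 70 PFLOPS FP8, and the stated node and GPU counts work out to 8 GPUs per node.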

02 · SOFTWARE

The same managed stack on every unit.

Built on standard, open components. No proprietary runtime to lock you in.

Inference serving: standard HTTP and gRPC endpoints; models compiled for the GPUs they run on.
Multi-node scheduling: add a node and its capacity joins the pool, no reconfiguration.
Distributed storage: fault-tolerant NVMe across nodes, erasure-coded.
Interconnect: low-latency RDMA fabric between nodes.
Workloads: bring your own GPU-aware containers.
Telemetry: GPU, throughput, and health metrics across the fleet.
Access: an out-of-band management link you control, with optional site-to-site VPN.
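
To illustrate what calling a standard HTTP inference endpoint might look like from an application, here is a minimal sketch using only the Python standard library. The host, path, payload fields, and model names are hypothetical placeholders; the actual endpoint schema is not specified here and may differ.

```python
import json
import urllib.request

# Hypothetical values: the host, path, and payload schema are placeholders,
# not the platform's documented API.
BASE_URL = "http://inference.example.internal:8000"
GENERATE_PATH = "/v1/generate"


def build_request(model: str, prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one generation call."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=BASE_URL + GENERATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Two requests that differ only in the (hypothetical) model name.
req_a = build_request("llama-70b", "Summarize this report.")
req_b = build_request("qwen-7b", "Summarize this report.")

# The URL is identical for both; only the request body changes.
print(req_a.full_url == req_b.full_url)

# To send a request from a network that can reach the unit:
#   with urllib.request.urlopen(req_a, timeout=30) as resp:
#       result = json.load(resp)
```

In this sketch the model is selected by a field in the request body, so the application code around the call stays the same whichever model is deployed.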

03 · MODELS

Open weights, on hardware you own.

Run open models you control: Llama, DeepSeek, Qwen, Mistral, and others.
Right-size per workload: 7B to 70B on a single GPU, up to 405B tensor-parallel across four GPUs.
Swap models without application changes; the inference endpoints stay the same.
Fixed compute cost: you own the hardware, so inference is not metered per token.
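
The sizing guidance above can be approximated with a weights-only memory estimate: parameter count times bytes per parameter, compared against available VRAM. The sketch below assumes 96 GB per GPU (384 GB across 4 GPUs) and ignores KV cache, activations, and runtime overhead, so real deployments need additional headroom. The precisions shown are illustrative assumptions, not a statement of how models are deployed on the platform.

```python
import math

GB_PER_GPU = 96.0  # assumption: 384 GB / 4 GPUs on the portable unit


def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes); ignores KV cache and activations."""
    return params_billion * bits_per_param / 8


def min_gpus(params_billion: float, bits_per_param: int, headroom: float = 0.0) -> int:
    """Smallest GPU count whose combined VRAM holds the weights plus fractional headroom."""
    needed = weights_gb(params_billion, bits_per_param) * (1 + headroom)
    return math.ceil(needed / GB_PER_GPU)


for params, bits in [(7, 16), (70, 8), (70, 16), (405, 8), (405, 4)]:
    print(f"{params:>4}B @ {bits:>2}-bit: "
          f"{weights_gb(params, bits):6.1f} GB weights, "
          f"at least {min_gpus(params, bits)} GPU(s)")
```

Under these assumptions, a 70B model at 8 bits per parameter fits on one GPU, and a 405B model fits within four GPUs at 4 bits per parameter but not at 8 bits.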

04 · OPERATIONS & DATA

We run it. Nothing leaves your building.

Managed remotely: monitoring, security patching, updates, and support.
Zero data egress: all inference runs on-premises, with no third-party API in the path.
Compliance: aligns with PIPEDA and Québec Law 25, supports a SOC 2 posture, maps to NIST AI RMF.
Hardware: NVIDIA-certified components with enterprise warranty.

On-prem inference, operated as a service.
