Compute Cost Advisory

Audit the stack.
Eliminate the waste.
Take control.

GPU compute is your biggest cost and your least managed one. Ferrous audits your entire AI stack, identifies what you're overpaying, and shows you exactly what to do about it — independent from every provider, platform, and tool in the market.

23×

price spread for the same H100 GPU — $1.38 to $7.50+/hr across providers

CloudZero · GetDeploying 43 providers · May 2026

5%

average GPU utilisation in enterprise clusters: the other 95% sits idle, billed by the hour

Cast AI · 23,000 clusters measured · Apr 2026

20–40%

of AI spend recoverable in a typical audit, before any contract changes are needed

CloudChipr · LeanOps FinOps Playbook 2026

H100: $6.88/hr on AWS · $1.99/hr on RunPod (same chip)
98% of FinOps teams now managing AI spend, up from 31% two years ago (FinOps Foundation 2026)
Training a 7B model: $362,000 on AWS · $58,000 on Lambda Labs
AWS raised H200 GPU prices 15% in January 2026
AI gross margins average 52% in 2026; investors expect 70–80% at Series B
GPU utilisation averages 5% in enterprise AI clusters (Cast AI, 23,000 clusters)
CME Group compute futures announced May 2026, pending regulatory approval
Neocloud revenue hits $20B in 2026; 90% of enterprises now adopting GPU-first clouds (Forrester 2026)
Silicon Data GPU Forward Curve: first standardised 12-month view of GPU rental costs (DRW-backed, CME-partnered)

The problem

GPU compute is your biggest cost.
Almost no one manages it like it is.

AI startups spend 40–60% of technical budgets on GPU compute. CFOs trained on SaaS economics are managing a cost that represents millions per year with no procurement benchmarks, no utilisation monitoring, and no systematic framework.

The same H100 chip costs $6.88/hr on AWS and $1.99/hr on RunPod. Most companies have never compared their provider rates against the verified market. Most have never audited what's actually running on their GPUs.

The correct order of operations. Audit the stack to identify the waste. Benchmark every contract against the verified market rate. Negotiate procurement to the best available price. Build the financial model that reflects compute as the variable cost it actually is. Most companies skip directly to spending — and pay accordingly.

The conflict of interest no one names. Every tool vendor, cloud provider, and managed FinOps firm has a structural conflict. They make money when you keep spending with them. Ferrous is paid a fixed retainer by you. We can tell you to cancel a commitment, switch providers, or cut a contract entirely. No one else in this market can say that.

6×

Same GPUs. Six times the cost.

Training a 7B model costs $362,000 on AWS and $58,000 on Lambda Labs. Most companies have never benchmarked their provider rates.

BuildMVPFast GPU cloud comparison · Apr 2026

5%

Average GPU utilisation.

Cast AI measured 23,000 production clusters. 95% of provisioned GPU capacity sits idle, billed by the hour. The waste problem precedes the price problem.

Cast AI 2026 State of Kubernetes Optimisation Report
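To make the point about waste preceding price concrete, here is a minimal sketch of the cost per hour of useful GPU work. It assumes idle time is billed at the same rate as busy time; the function name `effective_rate` and the 50% comparison figure are illustrative assumptions, not numbers from the Cast AI report.

```python
def effective_rate(billed_rate_per_hr: float, utilisation: float) -> float:
    """Cost per GPU-hour of useful work.

    billed_rate_per_hr: what the provider charges per GPU-hour.
    utilisation: fraction of billed time the GPU is actually busy (0 < u <= 1).
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return billed_rate_per_hr / utilisation


# AWS H100 rate quoted on this page, at the 5% average utilisation cited above.
aws_at_5pct = effective_rate(6.88, 0.05)

# Same rate at a hypothetical 50% utilisation (assumption for comparison only).
aws_at_50pct = effective_rate(6.88, 0.50)

print(f"Effective $/useful GPU-hr at 5%:  {aws_at_5pct:.2f}")
print(f"Effective $/useful GPU-hr at 50%: {aws_at_50pct:.2f}")
```

At 5% utilisation a $6.88 list rate works out to roughly $137.60 per useful GPU-hour, which is why the utilisation fix tends to matter before any change of provider.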

+15%

AWS raised H200 prices in January 2026.

GPU prices are not simply falling. AWS broke two decades of declining compute costs. The most capable hardware is getting more expensive.

The Register · Data Center Dynamics · Jan 2026

Provider	H100 price (USD per GPU-hr)	Type
AWS P5	$6.88–7.50	Hyperscaler
Azure NC H100 v5	$6.98	Hyperscaler
GCP A3 High	~$3.00	Hyperscaler
Lambda Labs	$2.99	Neocloud
Spheron	$2.64	Neocloud
RunPod community	$1.99–2.39	Neocloud
Thunder Compute	$1.38	Neocloud
AWS savings plan	$1.90–2.10	Reserved

All prices verified June 2026 · Spheron · IntuitionLabs · CloudZero · ThunderCompute

The 23× spread between the cheapest and most expensive provider for the same chip is not a market quirk — it is the procurement opportunity. The audit identifies which of your workloads can migrate and which cannot.
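As a rough illustration of benchmarking against the table above, the sketch below computes the spread across the listed on-demand rows and the monthly saving from moving a workload between two of them. The reserved row is left out, and the cluster size (8 GPUs) and the 730 hours per month are assumptions. The listed rows alone give a narrower ratio than the 23× headline, which is attributed to a wider 43-provider dataset not reproduced here.

```python
# H100 on-demand rates (USD per GPU-hour) copied from the table above.
# Ranges are stored as (low, high); single prices repeat the same value.
H100_RATES = {
    "AWS P5": (6.88, 7.50),
    "Azure NC H100 v5": (6.98, 6.98),
    "GCP A3 High": (3.00, 3.00),
    "Lambda Labs": (2.99, 2.99),
    "Spheron": (2.64, 2.64),
    "RunPod community": (1.99, 2.39),
    "Thunder Compute": (1.38, 1.38),
}


def spread(rates: dict[str, tuple[float, float]]) -> float:
    """Ratio of the highest listed rate to the lowest listed rate."""
    lowest = min(low for low, _ in rates.values())
    highest = max(high for _, high in rates.values())
    return highest / lowest


def monthly_saving(gpus: int, hours: float,
                   current_rate: float, target_rate: float) -> float:
    """Saving from running the same GPU-hours at a lower hourly rate."""
    return gpus * hours * (current_rate - target_rate)


table_spread = spread(H100_RATES)

# Hypothetical workload: 8 GPUs running all month (about 730 hours),
# moved from the AWS P5 low rate to the Lambda Labs rate.
saving = monthly_saving(gpus=8, hours=730, current_rate=6.88, target_rate=2.99)

print(f"Spread across listed rows: {table_spread:.1f}x")
print(f"Monthly saving for the hypothetical workload: ${saving:,.2f}")
```

Whether a given workload can actually move depends on data gravity, compliance, and reliability needs, which is the part the arithmetic cannot answer.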

What we do

Four steps.
One integrated engagement.

AI usage audit — the foundation

Full inventory of every AI-related cost: GPU infrastructure contracts, model API subscriptions (OpenAI, Anthropic, Bedrock, Gemini, Mistral), inference tooling, AI SaaS. Utilisation analysis. Workload classification. Waste and optimisation report with ranked, actionable findings. Consistently identifies 20–40% of AI spend as immediately recoverable. Fixed-fee standalone engagement — no retainer required to start.

Compute-native financial planning

Driver-based financial model rebuilt from audit data — not estimates. Compute modelled as the variable cost it actually is, by GPU class, provider, and workload. Monthly board pack with compute economics, gross margin trajectory, and cost-per-inference benchmarked against verified June 2026 market rates.
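A driver-based view of compute can be sketched as a handful of inputs per workload that roll up into cost per inference. Everything below is a hypothetical illustration: the `Workload` class, its field names, and the example throughput and utilisation are assumptions, and only the $2.99 rate comes from this page.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: average hours in a calendar month


@dataclass
class Workload:
    name: str
    gpu_rate_per_hr: float      # USD per GPU-hour paid to the provider
    gpus: int                   # GPUs provisioned for this workload
    utilisation: float          # fraction of billed time doing useful work
    requests_per_gpu_hr: float  # throughput of one fully busy GPU

    def monthly_cost(self) -> float:
        """Billed cost: every provisioned GPU-hour is paid for."""
        return self.gpu_rate_per_hr * self.gpus * HOURS_PER_MONTH

    def monthly_requests(self) -> float:
        """Requests served: only utilised GPU-hours produce output."""
        return (self.gpus * HOURS_PER_MONTH
                * self.utilisation * self.requests_per_gpu_hr)

    def cost_per_request(self) -> float:
        return self.monthly_cost() / self.monthly_requests()


# Hypothetical inference workload priced at the Lambda Labs H100 rate.
w = Workload(name="chat-inference", gpu_rate_per_hr=2.99, gpus=4,
             utilisation=0.40, requests_per_gpu_hr=10_000)

print(f"{w.name}: ${w.monthly_cost():,.2f}/month, "
      f"${w.cost_per_request():.6f} per request")
```

Because rate, GPU count, utilisation, and throughput are separate drivers, a change in any one of them flows through to cost per request without re-estimating the others.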

Procurement advisory and negotiation

Contract renegotiations from audit findings. Hyperscaler EA negotiations with multi-cloud optionality as active leverage. Provider migration to Lambda, RunPod, or Spheron at $1.99–2.99/GPU-hr versus AWS at $6.88–7.50/GPU-hr where workloads allow. Reserved instance strategy against validated base load.

Compute-aware capital strategy

Venture debt package with compute commitment schedule. Cloud prepayment structuring sized to audit-validated utilisation. Series A/B fundraising narrative built on verified unit economics. Preparation for the compute unit economics questions investors now ask directly at every funding round.

What the audit finds

Idle reserved capacity

30–50% of reserved GPU-hours unused, billed in full. Rightsizing alone recovers significant spend.

Hyperscaler rates at 3× market

AWS and Azure for workloads that could run on Lambda or Spheron at a fraction of the cost.

Duplicate API subscriptions

Multiple teams paying separately for OpenAI, Anthropic, Bedrock — often for overlapping capabilities.

GPU for CPU-eligible workloads

Embedding generation and preprocessing running on GPU when CPU costs one-tenth as much.

No utilisation monitoring

Instances left running overnight and on weekends. No alerting. Pure waste, immediately fixable.

Training costs misclassified

R&D training runs booked as COGS, which overstates gross margin compression and distorts unit economics.
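The effect of the last finding on reported margin can be shown with a small worked example. All figures below are hypothetical, and the sketch assumes the training runs genuinely qualify as R&D under the company's accounting policy, which is a question for the auditors rather than for this code.

```python
def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


# Hypothetical monthly figures (USD), not client data.
revenue = 1_000_000
inference_cost = 300_000   # serving cost, correctly part of COGS
training_cost = 180_000    # R&D training runs, wrongly booked as COGS

# Margin as reported, with training runs sitting in COGS.
reported = gross_margin(revenue, inference_cost + training_cost)

# Margin after moving the training runs to R&D (operating expense).
restated = gross_margin(revenue, inference_cost)

print(f"Reported gross margin: {reported:.0%}")
print(f"Restated gross margin: {restated:.0%}")
```

Total spend is unchanged by the reclassification; what changes is which line investors see it on, and therefore the gross margin they benchmark.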

What makes Ferrous different

Three differentiators.
Only one is permanent.

01 —

Structural independence

Permanent

We can tell you AWS is charging 3× the market rate. We can tell you to cancel a reserved commitment. We can tell you to switch providers entirely. No tool vendor, managed FinOps firm, or cloud provider can say those things — their revenue depends on you continuing to spend with them. Ours doesn't.

02 —

CFO fluency, not just FinOps

18–36 months to replicate

We connect your compute cost picture to your venture debt covenants, equity fundraising narrative, cloud prepayment schedule, and board reporting. FinOps tools show you the bill. We show you what it means for your business — and what to tell investors when they ask.

03 —

Proprietary benchmarking data

Compounds with every client

Every audit finding, contract term, and provider rate is logged anonymously from day one. By client ten we hold benchmarking data no one else has. No incumbent CFO firm collects this. No FinOps tool sees the procurement side. This is the moat that compounds.

The forward view

We are building the expertise now.
The market is forming around us.

The compute cost optimisation service we provide today delivers immediate, measurable value without requiring any financial market to exist. The audit, the procurement savings, the gross margin improvement — none of this depends on regulatory approval or market liquidity.

The longer-term direction is clear: compute is becoming a financial asset class. CME, ICE, and AX have all announced compute futures products. Silicon Data — DRW-backed and CME-partnered — has launched the GPU Forward Curve: the first standardised 12-month view of anticipated GPU rental costs, giving Ferrous and its clients the most credible daily benchmark currently available. When those instruments arrive, the clients who trusted Ferrous in the consulting phase will be first to benefit from the advisory phase.

The correct sequence. Build expertise in compute cost management. Build client trust through audits and procurement savings. Build the benchmarking dataset that makes future financial instrument recommendations precise. When the instruments arrive, Ferrous will be the only firm that has done all three.

CME Group · ICE / NYSE · AX Exchange · Ornn / OCPI · Silicon Data / DRW GPU Forward Curve

October 2025

Ornn OCPI goes live. First transaction-based GPU pricing benchmark — printed trades, not surveys. Listed on Bloomberg Terminal April 2026.

December 2025

First compute swap executes via Ornn Exchange. Available to buyers with $500K+ in annualised compute spend.

January 21, 2026

AX Exchange announces compute futures. Pending approval.

May 5, 2026

Larry Fink at Milken Institute: "A new asset class will be buying futures of compute." — Bloomberg.

May 12, 2026

CME Group & Silicon Data announce compute futures (DRW-backed, Carmen Li). Pending approval.

May 19, 2026

ICE (NYSE owner) & Ornn announce GPU futures: H100, H200, B200, RTX 5090. Pending approval.

Start here

Start with
the audit.

The first conversation is a working session — we bring your numbers into the model and show you what we find. No commitment required. Most audits identify savings that pay for the engagement within the first month.

We respond within one business day. All information is confidential.