Ferrous Pro
Compute Cost Advisory

Audit the stack.
Eliminate the waste.
Take control.

GPU compute is your biggest cost and your least managed one. Ferrous audits your entire AI stack, identifies what you're overpaying, and shows you exactly what to do about it — independent from every provider, platform, and tool in the market.

23×
price spread for the same H100 GPU — $1.38 to $7.50+/hr across providers
CloudZero · GetDeploying 43 providers · May 2026
5%
average GPU utilisation in enterprise clusters — 95% sits idle, billed by the hour
Cast AI · 23,000 clusters measured · Apr 2026
20–40%
of AI spend recoverable in a typical audit, before any contract changes are needed
CloudChipr · LeanOps FinOps Playbook 2026
H100: $6.88/hr on AWS · $1.99/hr on RunPod, same chip
98% of FinOps teams now managing AI spend, up from 31% two years ago (FinOps Foundation 2026)
Training a 7B model: $362,000 on AWS · $58,000 on Lambda Labs
AWS raised H200 GPU prices 15% in January 2026
AI gross margins average 52% in 2026; investors expect 70–80% at Series B
GPU utilisation averages 5% in enterprise AI clusters (Cast AI, 23,000 clusters)
CME Group compute futures announced May 2026, pending regulatory approval
Neocloud revenue hits $20B in 2026; 90% of enterprises now adopting GPU-first clouds (Forrester 2026)
Silicon Data GPU Forward Curve: first standardised 12-month view of GPU rental costs (DRW-backed, CME-partnered)
The problem

GPU compute is your biggest cost.
Almost no one manages it like it is.

AI startups spend 40–60% of technical budgets on GPU compute. CFOs trained on SaaS economics are managing a cost that represents millions per year with no procurement benchmarks, no utilisation monitoring, and no systematic framework.

The same H100 chip costs $6.88/hr on AWS and $1.99/hr on RunPod. Most companies have never compared their provider rates against the verified market. Most have never audited what's actually running on their GPUs.

The correct order of operations. Audit the stack to identify the waste. Benchmark every contract against the verified market rate. Negotiate procurement to the best available price. Build the financial model that reflects compute as the variable cost it actually is. Most companies skip directly to spending — and pay accordingly.
The conflict of interest no one names. Every tool vendor, cloud provider, and managed FinOps firm has a structural conflict. They make money when you keep spending with them. Ferrous is paid a fixed retainer by you. We can tell you to cancel a commitment, switch providers, or cut a contract entirely. No one else in this market can say that.
6×
Same GPUs. Six times the cost.
Training a 7B model costs $362,000 on AWS and $58,000 on Lambda Labs. Most companies have never benchmarked their provider rates.
BuildMVPFast GPU cloud comparison · Apr 2026
5%
Average GPU utilisation.
Cast AI measured 23,000 production clusters. 95% of provisioned GPU capacity sits idle, billed by the hour. The waste problem precedes the price problem.
Cast AI 2026 State of Kubernetes Optimisation Report
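To see why utilisation matters before price does, it helps to translate a billed hourly rate into the cost of an hour that actually does useful work. The sketch below is illustrative only: the function name is hypothetical, it treats the 5% figure as the fraction of billed GPU-hours doing useful work, and it borrows the $6.88/hr AWS H100 rate quoted on this page.

```python
def effective_cost_per_useful_hour(hourly_rate: float, utilisation: float) -> float:
    """Billed rate divided by the fraction of billed hours doing useful work."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return hourly_rate / utilisation


aws_h100_rate = 6.88  # $/GPU-hr, from the figures quoted on this page

# At 5% utilisation, each useful GPU-hour costs 20x the billed rate.
cost_at_5_percent = effective_cost_per_useful_hour(aws_h100_rate, 0.05)

# Hypothetical improvement: the same rate at 50% utilisation.
cost_at_50_percent = effective_cost_per_useful_hour(aws_h100_rate, 0.50)

print(f"5% utilised:  ${cost_at_5_percent:.2f} per useful GPU-hour")
print(f"50% utilised: ${cost_at_50_percent:.2f} per useful GPU-hour")
```

On these assumptions, raising utilisation from 5% to 50% cuts the effective cost tenfold without changing provider or contract.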
+15%
AWS raised H200 prices in January 2026.
GPU prices are not simply falling. AWS broke two decades of declining compute costs. The most capable hardware is getting more expensive.
The Register · Data Center Dynamics · Jan 2026
Provider | H100 price per GPU-hr | Type
AWS P5 | $6.88–7.50 | Hyperscaler
Azure NC H100 v5 | $6.98 | Hyperscaler
GCP A3 High | ~$3.00 | Hyperscaler
Lambda Labs | $2.99 | Neocloud
Spheron | $2.64 | Neocloud
RunPod community | $1.99–2.39 | Neocloud
Thunder Compute | $1.38 | Neocloud
AWS savings plan | $1.90–2.10 | Reserved

All prices verified June 2026 · Spheron · IntuitionLabs · CloudZero · ThunderCompute

The 23× spread between the cheapest and most expensive provider for the same chip is not a market quirk — it is the procurement opportunity. The audit identifies which of your workloads can migrate and which cannot.
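A rate benchmark of this kind reduces to simple arithmetic. The following is a minimal sketch, not Ferrous's method: the function name and the 8-GPU workload are hypothetical, the rates are the on-demand H100 prices listed above (low end where a range is given), and it ignores migration effort, data egress, and availability differences, which a real comparison would need to weigh.

```python
# On-demand H100 rates in $/GPU-hr, taken from the prices listed above.
# Where a range is quoted, the low end is used; the GCP figure is approximate.
MARKET_RATES = {
    "AWS P5": 6.88,
    "Azure NC H100 v5": 6.98,
    "GCP A3 High": 3.00,
    "Lambda Labs": 2.99,
    "Spheron": 2.64,
    "RunPod community": 1.99,
    "Thunder Compute": 1.38,
}


def annual_savings(current_rate, gpu_hours_per_year, rates=MARKET_RATES):
    """Return {provider: annual saving} for providers cheaper than current_rate,
    ordered from largest saving to smallest."""
    savings = {
        name: (current_rate - rate) * gpu_hours_per_year
        for name, rate in rates.items()
        if rate < current_rate
    }
    return dict(sorted(savings.items(), key=lambda item: item[1], reverse=True))


# Hypothetical workload: 8 GPUs running all year at the AWS P5 rate.
gpu_hours = 8 * 24 * 365
result = annual_savings(6.88, gpu_hours)

for provider, saving in result.items():
    print(f"{provider}: ${saving:,.0f} per year")
```

The output is only a starting point: it ranks where the price gap is largest, while the question of which workloads can actually move is answered separately.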
What we do

Four steps.
One integrated engagement.

0
AI usage audit — the foundation
Full inventory of every AI-related cost: GPU infrastructure contracts, model API subscriptions (OpenAI, Anthropic, Bedrock, Gemini, Mistral), inference tooling, AI SaaS. Utilisation analysis. Workload classification. Waste and optimisation report with ranked, actionable findings. Consistently identifies 20–40% of AI spend as immediately recoverable. Fixed-fee standalone engagement — no retainer required to start.
1
Compute-native financial planning
Driver-based financial model rebuilt from audit data — not estimates. Compute modelled as the variable cost it actually is, by GPU class, provider, and workload. Monthly board pack with compute economics, gross margin trajectory, and cost-per-inference benchmarked against verified June 2026 market rates.
2
Procurement advisory and negotiation
Contract renegotiations from audit findings. Hyperscaler EA negotiations with multi-cloud optionality as active leverage. Provider migration to Lambda, RunPod, or Spheron at $1.99–2.99/GPU-hr versus AWS at $6.88–7.50/GPU-hr where workloads allow. Reserved instance strategy against validated base load.
3
Compute-aware capital strategy
Venture debt package with compute commitment schedule. Cloud prepayment structuring sized to audit-validated utilisation. Series A/B fundraising narrative built on verified unit economics. Preparation for the compute unit economics questions investors now ask directly at every funding round.
What the audit finds
01
Idle reserved capacity
30–50% of reserved GPU-hours unused, billed in full. Rightsizing alone recovers significant spend.
02
Hyperscaler rates at 3× market
AWS and Azure for workloads that could run on Lambda or Spheron at a fraction of the cost.
03
Duplicate API subscriptions
Multiple teams paying separately for OpenAI, Anthropic, Bedrock — often for overlapping capabilities.
04
GPU for CPU-eligible workloads
Embedding generation and preprocessing running on GPU when CPU costs one-tenth as much.
05
No utilisation monitoring
Instances left running overnight and on weekends. No alerting. Pure waste, immediately fixable.
06
Training costs misclassified
R&D training runs booked as COGS, which understates gross margin and distorts unit economics.
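The last finding is easiest to see with numbers. The sketch below uses entirely hypothetical monthly figures and assumes the training runs in question are genuinely R&D rather than a cost of serving customers; how they should be classified is an accounting judgment for each company.

```python
def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cogs) / revenue


# Hypothetical monthly figures, for illustration only.
revenue = 1_000_000
inference_compute = 300_000  # cost of serving customers
training_compute = 180_000   # R&D training runs
other_cogs = 100_000

# Training booked as COGS: margin looks compressed.
margin_with_training_in_cogs = gross_margin(
    revenue, inference_compute + training_compute + other_cogs
)

# Training moved to R&D: COGS reflects only the cost of serving customers.
margin_with_training_in_rnd = gross_margin(revenue, inference_compute + other_cogs)

print(f"Training in COGS: {margin_with_training_in_cogs:.0%}")
print(f"Training in R&D:  {margin_with_training_in_rnd:.0%}")
```

In this example total spend is unchanged; only the reported margin moves, from 42% to 60%.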
What makes Ferrous different

Three differentiators.
Only one is permanent.

01 —
Structural independence
Permanent
We can tell you AWS is charging 3× the market rate. We can tell you to cancel a reserved commitment. We can tell you to switch providers entirely. No tool vendor, managed FinOps firm, or cloud provider can say those things — their revenue depends on you continuing to spend with them. Ours doesn't.
02 —
CFO fluency, not just FinOps
18–36 months to replicate
We connect your compute cost picture to your venture debt covenants, equity fundraising narrative, cloud prepayment schedule, and board reporting. FinOps tools show you the bill. We show you what it means for your business — and what to tell investors when they ask.
03 —
Proprietary benchmarking data
Compounds with every client
Every audit finding, contract term, and provider rate is logged anonymously from day one. By client ten we hold benchmarking data no one else has. No incumbent CFO firm collects this. No FinOps tool sees the procurement side. This is the moat that compounds.
The forward view

We are building the expertise now.
The market is forming around us.

The compute cost optimisation service we provide today delivers immediate, measurable value without requiring any financial market to exist. The audit, the procurement savings, the gross margin improvement — none of this depends on regulatory approval or market liquidity.

The longer-term direction is clear: compute is becoming a financial asset class. CME, ICE, and AX have all announced compute futures products. Silicon Data — DRW-backed and CME-partnered — has launched the GPU Forward Curve: the first standardised 12-month view of anticipated GPU rental costs, giving Ferrous and its clients the most credible daily benchmark currently available. When those instruments arrive, the clients who trusted Ferrous in the consulting phase will be first to benefit from the advisory phase.

The correct sequence. Build expertise in compute cost management. Build client trust through audits and procurement savings. Build the benchmarking dataset that makes future financial instrument recommendations precise. When the instruments arrive, Ferrous will be the only firm that has done all three.
CME Group · ICE / NYSE · AX Exchange · Ornn / OCPI · Silicon Data / DRW · GPU Forward Curve
October 2025
Ornn OCPI goes live. First transaction-based GPU pricing benchmark — printed trades, not surveys. Listed on Bloomberg Terminal April 2026.
December 2025
First compute swap executes via Ornn Exchange. Live for $500K+ annualised compute spend.
January 21, 2026
AX Exchange announces compute futures. Pending approval.
May 5, 2026
Larry Fink at Milken Institute: "A new asset class will be buying futures of compute." — Bloomberg.
May 12, 2026
CME Group & Silicon Data announce compute futures (DRW-backed, Carmen Li). Pending approval.
May 19, 2026
ICE (NYSE owner) & Ornn announce GPU futures: H100, H200, B200, RTX 5090. Pending approval.
Start here

Start with
the audit.

The first conversation is a working session — we bring your numbers into the model and show you what we find. No commitment required. Most audits identify savings that pay for the engagement within the first month.

We respond within one business day. All information is confidential.

Independence guaranteed
Fixed retainer only. No referral fees from providers. No commissions. We can tell you to cancel a commitment — and we do, when the numbers warrant it.
The audit stands alone
The 30-day AI usage audit is a fixed-fee standalone engagement. No retainer required to start. It typically identifies 20–40% of AI spend as recoverable.
Working session, not a pitch
We bring your numbers into the model live in the first conversation. You leave with a view of your compute cost structure — not a proposal deck.
Ferrous
ferrous.pro · All data verified June 2026