Live data · Updated daily · Prices dropping fast

You're overpaying
for AI infrastructure

Most teams waste 40-70% of their spend on LLM APIs and GPU cloud. We track real-time pricing across 23 models and 12 providers so you don't have to. Find out whether you are overpaying in 10 seconds.

Free forever · No signup required · Pro trial unlocks everything

$2.1M+ saved for teams

500+ teams using Perffeco

23 models tracked

12 GPU providers

Built for AI startups, ML engineers, FinOps teams, CTOs, and technical investors

Daily data refresh | 23 LLM models | 12 GPU providers | 6 benchmark suites | Editorially independent

💸

Am I Overpaying for AI?

Find out in 10 seconds. Pick your model and monthly spend.

Current LLM

Monthly AI Spend ($)

🎯

Your AI Cost Score Card

Generate your personalized AI efficiency report. Share it to flex your optimization game.

Your LLM

Monthly Spend ($)

Monthly API Calls

AI Efficiency Score

Monthly Spend

Annual Savings Possible

Your Cost / Task

Optimal Cost / Task

perffeco.com — Intelligence Economics Platform

GPT-4o Mini $0.15/1M in · Claude Sonnet 4.6 $3.00/1M in · H100 SXM from $1.49/hr · DeepSeek V3.2 $0.14/1M in · Gemini 2.5 Pro $1.25/1M in · B200 from $2.99/hr · Claude Opus 4.6 $5.00/1M in · o3 Pro $20.00/1M in · H200 SXM from $1.85/hr · A100 80GB from $1.19/hr

Cheapest AI Right Now

Real-time lowest prices across LLM APIs and GPU clouds

Cheapest LLM

Mistral Nemo

Mistral AI

$0.02

per 1M input tokens

96% cheaper than GPT-4o

Cheapest H100

H100 SXM 80GB

Vast.ai

$1.49

per hour on-demand

62% cheaper than AWS

Best Value

DeepSeek V3.2

DeepSeek

$0.14

per 1M input · Elo 1280

Best quality/price ratio for production

Updated daily from provider APIs · View full dashboard

Everything you need to cut AI costs

Free tier gives you the data. Pro gives you the edge.

Module 1

LLM Economics

API pricing for 23 models across 6 providers. Cost-per-task analysis, speed benchmarks, price deflation tracking, and token calculators.

Module 2

GPU Cloud Pricing

H100, A100, H200, B200, L40S across 12 providers. Spot vs on-demand analysis, hidden costs, and monthly TCO calculator.

Module 3

Quality Benchmarks

Arena Elo, GPQA, SWE-bench, MATH-500, HumanEval rankings. Radar comparisons and quality-per-dollar analysis.

Module 4

FinOps & Agents

AI agent economics (dev costs, inference, ROI), 6-step FinOps framework, optimisation strategies, and savings calculators.

Model Head-to-Head

Pick two models. See who wins on price, quality, and value.

GPT-4o

$6.25/1M blended

VS

Claude Sonnet 4.6

$9.00/1M blended

GPT-4o is 31% cheaper, but Claude Sonnet 4.6 scores higher on 5/6 benchmarks.
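To check the headline figure yourself, here is a minimal Python sketch of the percentage comparison. It uses only the two blended prices quoted above; the function name `percent_cheaper` is ours for illustration and is not part of any Perffeco API.

```python
def percent_cheaper(price_a: float, price_b: float) -> float:
    """Return how much cheaper price_a is than price_b, as a percentage of price_b."""
    if price_b <= 0:
        raise ValueError("price_b must be positive")
    return (price_b - price_a) / price_b * 100.0


# Blended prices from the comparison above, in USD per 1M tokens.
GPT_4O_BLENDED = 6.25
CLAUDE_SONNET_BLENDED = 9.00

if __name__ == "__main__":
    saving = percent_cheaper(GPT_4O_BLENDED, CLAUDE_SONNET_BLENDED)
    print(f"GPT-4o is {saving:.0f}% cheaper per 1M blended tokens")
```

The saving is measured against the more expensive model, which is why $6.25 vs $9.00 works out to roughly 31%.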

Read Full GPT-4o vs Claude Analysis ↗

Used by teams at every stage

"Switched our inference stack from GPT-4o to a routed DeepSeek + Claude mix after comparing on Perffeco. Monthly API bill dropped from $8.2K to $2.4K with no measurable quality loss on our eval suite."

VP of Engineering

Series B AI Startup, 40-person eng team

"We were paying 3x more than necessary for H100 compute. Perffeco's provider comparison showed us Vast.ai at $1.49/hr vs our Azure contract at $6.98/hr. Migrated our training workloads in a week."

ML Platform Lead

FinTech scale-up, $50K+/mo GPU spend

"I use Perffeco to benchmark model economics across our portfolio. The cost-per-quality analysis is something we couldn't find anywhere else. It's become part of our due diligence process."

General Partner

Deep Tech VC, $200M fund

Free Weekly

The AI Cost Index

Every Monday: which models got cheaper, GPU price drops, and one FinOps tip that saves real money. Trusted by 500+ AI engineers.

No spam. Unsubscribe in one click. Read by teams at YC, a16z portfolio, and F500 infra orgs.

Sources: OpenAI, Anthropic, Google, Mistral, xAI, DeepSeek, Meta (LLM pricing) · Vast.ai, RunPod, Lambda, CoreWeave, AWS, GCP, Azure (GPU pricing) · LMSYS Arena, GPQA, SWE-bench, MATH-500 (benchmarks) · Epoch AI (training cost research) · Updated daily · Editorially independent

Stop overpaying for AI. Start saving today.

Free dashboard. No signup needed. Pro trial unlocks all data for 14 days.

◆ Frontier Model Economics

INTELLIGENCE · TRAINING COST · ENERGY · REAL DATA · 2026

✓ disclosed = lab-verified cost · ~ estimated = Epoch AI cost model

[Interactive bubble chart. Selectable X-axis and Y-axis. Legend: OpenAI, Anthropic, Google DeepMind, DeepSeek.]

PRIMARY SOURCES

Epoch AI — Rising Costs (Cottier et al. 2024) · DeepSeek V3 Technical Report (2024) · DeepSeek R1 Paper — Nature 2025 · Meta Llama 3 Technical Report · Google Gemini Technical Reports · OpenAI System Cards · Mistral AI Technical Reports

📊

Get AI Cost Report

Personalised analysis from $29

⚡

Unlock All Data

Pro — 14-day free trial

🎯

Book Strategy Call

Free 30-min consultation

LLM Economics Dashboard

Compare API pricing across top models. Prices dropped ~80% YoY.

11

Models

-80%

Price Drop YoY

1000x

Gap

6

Providers

LLM API Pricing

Per 1M tokens. Blended cost (lowest first).

Model	Provider	Input $/1M	Output $/1M	Blended	Speed	TTFT	Context	Tier

Input vs Output Price by Model

Horizontal bars show input (left) and output (right) cost per 1M tokens. Budget models are clustered at the left. The price gap between the cheapest and the most expensive model is over 500x.

Price Tier Distribution

How the 23 tracked models split across pricing tiers. The budget tier now has the most models — a shift from 2024 when premium dominated.

Speed vs Cost — The Best Trade-offs

Top-right = fast and expensive. Bottom-left = slow and cheap. The sweet spot is top-left: fast AND cheap. Haiku 4.5 and Mistral Nemo stand out.

Output Speed (tokens/second)

Speed matters for real-time applications. Reasoning models (o3 Pro, DeepSeek R1) are deliberately slower — they “think” more. For chat/API, Mistral Nemo and Claude Haiku are fastest.

Fast (>100 tok/s) · Medium (50-100 tok/s) · Slow (<50 tok/s)

Cost Per Task

📧 Email Summary

~700 tokens

$0.0001

GPT-4o Mini

📝 Blog Post

~2,500 tokens

$0.001

GPT-4o Mini

💻 Code Review

~4,500 tokens

$0.01

Sonnet 4.6

📄 10-Page Analysis

~17K tokens

$0.075

Sonnet 4.6

🤖 RAG Query

~8.5K tokens

$0.005

DeepSeek V3.2

🔄 1M Calls/mo

~1K tokens avg

$20

Mistral Nemo
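The per-task figures above follow from simple token arithmetic. Below is a minimal Python sketch under one loud assumption: every token is billed at the model's input price, which is a simplification because output tokens usually cost more. The two prices are the input rates quoted on this page; `cost_per_task` is a name chosen for illustration.

```python
def cost_per_task(tokens: int, price_per_million: float) -> float:
    """Cost in USD for one task, billing every token at a single per-1M rate."""
    return tokens * price_per_million / 1_000_000


# Input prices quoted on this page, in USD per 1M tokens.
GPT_4O_MINI_INPUT = 0.15
MISTRAL_NEMO_INPUT = 0.02

if __name__ == "__main__":
    # Email summary: ~700 tokens on GPT-4o Mini.
    email = cost_per_task(700, GPT_4O_MINI_INPUT)
    print(f"Email summary (~700 tokens): ${email:.4f}")

    # 1M calls per month at ~1K tokens each on Mistral Nemo.
    monthly = 1_000_000 * cost_per_task(1_000, MISTRAL_NEMO_INPUT)
    print(f"1M calls/mo at ~1K tokens: ${monthly:.2f}")
```

Under that assumption the email summary lands near $0.0001 and the 1M-calls workload near $20, in line with the cards above. Tasks with long outputs will cost more than this estimate.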

Price Deflation

GPT-4 (Mar 2023): $45.00
Claude 3 Opus (2024): $37.50
GPT-4o (May 2024): $12.50
GPT-5 (2025): $5.63
DeepSeek V3.2 (2026): $0.21

Token Calculator

Model · Calls/Day · Input Tokens · Output Tokens

Monthly
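A calculator with these four inputs can be approximated in a few lines. The sketch below is not Perffeco's implementation; it assumes a 30-day month and separate input and output prices per 1M tokens. The output price in the example is a hypothetical placeholder, since this page only lists input prices.

```python
def monthly_api_cost(
    calls_per_day: int,
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
    days_per_month: int = 30,  # assumption: flat 30-day month
) -> float:
    """Estimated monthly spend in USD; prices are USD per 1M tokens."""
    per_call = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return calls_per_day * days_per_month * per_call


if __name__ == "__main__":
    cost = monthly_api_cost(
        calls_per_day=1_000,
        input_tokens=500,
        output_tokens=200,
        input_price=0.15,   # GPT-4o Mini input price from this page
        output_price=0.60,  # hypothetical output price, for illustration only
    )
    print(f"Estimated monthly cost: ${cost:.2f}")
```

Swap in your own provider's current input and output rates before relying on the number.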

Price/Performance

DeepSeek V3.2: 95 (Best)
Gemini Flash: 90 (Excellent)
GPT-4o Mini: 85 (Excellent)
Claude Haiku 4.5: 72 (Good)
GPT-5: 68 (Good)
Opus 4.6: 42 (Premium)

Self-Host Open Source Models — Save 80-95%

At high volume, self-hosting DeepSeek V3, Llama 4, or Qwen3 on rented GPUs is dramatically cheaper than API pricing. A single H100 at $1.49/hr can serve thousands of requests per minute.

Vast.ai: H100 $1.49/hr · RunPod: H100 $2.69/hr · DigitalOcean GPUs · Vultr: from $0.65/hr

Perffeco may earn a commission from provider links.

Want GPU economics, agent costs & FinOps tools?

Pro unlocks full GPU pricing, AI agent economics, FinOps framework, and all calculators.

Get a personalised AI Cost Report

Instant AI-generated report with savings analysis, model recommendations, and 90-day roadmap. From $29.

Read: GPT-4o vs Claude Pricing ↗ Read: Full LLM Cost Comparison ↗ Read: Real Cost of GPT-4o ↗

Get Notified When LLM Prices Drop

Get notified when prices drop. Free forever.

GPU Economics Dashboard

GPU cloud pricing across top providers.

12

Providers

32

Configs

-70%

H100 from Peak

4.7x

Gap

GPU Pricing Table

GPU	VRAM	Provider	On-Demand	Spot	Type

H100 80GB — Provider Compare

Vast.ai

Marketplace

$1.49

/hr on-demand

View GPUs ↗

Lambda

Specialized

$1.85

/hr

View GPUs ↗

RunPod

Marketplace

$2.69

/hr secure

View GPUs ↗

DigitalOcean

GPU Cloud

From $2.50

/hr Gradient

Try GPUs ↗

Vultr

Cloud GPU

From $0.65

/hr A100/H100

Try GPUs ↗

CoreWeave

Specialized

$4.76

/hr

GCP

Hyperscaler

$3.67

/hr

AWS

Hyperscaler

$3.93

/hr

Azure

Hyperscaler

$6.98

/hr

H100 Hourly Cost by Provider

Marketplace providers (Vast.ai, RunPod) consistently undercut hyperscalers (AWS, Azure) by 40-75%. The cheapest H100 ($1.49/hr) costs 4.7x less than the most expensive ($6.98/hr), so choosing the right provider saves thousands monthly.

Monthly Cost: 1x H100 (24/7)

What you'd actually pay per month running a single H100 around the clock. Multiply by your GPU count for cluster costs.
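The monthly figure is just the hourly rate times hours run. A minimal Python sketch follows, using the fixed H100 hourly rates listed in the provider comparison above ("From" prices are left out because they are not firm H100 rates). The 30-day month and the helper name `monthly_gpu_cost` are assumptions for illustration.

```python
# On-demand H100 hourly rates quoted on this page, in USD.
H100_HOURLY = {
    "Vast.ai": 1.49,
    "Lambda": 1.85,
    "RunPod": 2.69,
    "GCP": 3.67,
    "AWS": 3.93,
    "CoreWeave": 4.76,
    "Azure": 6.98,
}


def monthly_gpu_cost(
    hourly_rate: float,
    gpus: int = 1,
    hours_per_day: float = 24,
    days_per_month: int = 30,  # assumption: flat 30-day month
) -> float:
    """Compute-only monthly cost in USD (no egress, storage or support)."""
    return hourly_rate * gpus * hours_per_day * days_per_month


if __name__ == "__main__":
    for provider, rate in sorted(H100_HOURLY.items(), key=lambda kv: kv[1]):
        print(f"{provider:<10} ${monthly_gpu_cost(rate):>9,.2f}/mo")
    gap = max(H100_HOURLY.values()) / min(H100_HOURLY.values())
    print(f"Most expensive vs cheapest: {gap:.1f}x")
```

Pass `gpus=8` to estimate a single 8-GPU node. The result covers compute only, so the hidden costs listed further down still apply.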

H100 Price Trends

Mid 2023: $7.50-$11.00
Q1 2024: $6.00-$8.00
Q1 2025: $2.85-$3.50
Jun 2025 (AWS -44%): $2.00-$3.93
Q1 2026: $1.49-$3.93

Hidden Costs

🌐 Egress

$0.08-0.12

/GB out

💾 Storage

$0.02-0.08

/GB/mo

🔗 Network

15-30%

IB premium

⏱️ Commit

1-3yr

RI terms

🛡️ Support

$5-15K

/mo enterprise

⚡ Spot Risk

60-90%

save vs interrupt

GPU Calculator

GPU · GPUs · Hours/Day · Days/Month

Monthly

Get a Custom GPU Infrastructure Report

Full provider comparison for your specific GPU, on-demand vs spot analysis, hidden costs breakdown, and 12-month TCO projection. AI-generated PDF delivered instantly.

Get GPU price alerts

Weekly digest of GPU pricing changes, new provider launches, and spot price trends.

Read: H100 GPU Pricing Guide 2026 ↗ Read: RunPod vs Vast.ai ↗

Get Notified When GPU Prices Drop

Get notified when prices drop. Free forever.

AI Agents Economics

Dev costs, inference, pricing models, and ROI benchmarks.

~10x/yr

Inference Deflation

84%

Margin Erosion

$2.5T

AI Spend 2026

Dev Cost Tiers

Simple Chatbot

$5-25K

$3.2-5K/mo ops

FAQ / rule-based
Single LLM
Templates

Task Agent

$50-120K

$5-8K/mo ops

Tool-calling
Multi-step
API integrations

RAG Agent

$80-180K

$6-10K/mo ops

Knowledge retrieval
RAG memory
Self-correct

Multi-Agent

$150-500K+

$8-13K/mo ops

Orchestration
HITL
Governance

Ready to Build? Start Here

Top platforms for building AI agents at every complexity level

Voiceflow

No-Code Agents

Best for: Simple Chatbots

Try Free ↗

Relevance AI

Agent Builder

Best for: Task Agents

Try Free ↗

LangSmith

Dev Platform

Best for: RAG Agents

Try Free ↗

CrewAI

Multi-Agent Framework

Best for: Multi-Agent

Try Free ↗

Perffeco may earn a commission from partner links.

Real-World Results

FinTech Support Agent

62%

ticket deflection rate

Switched from GPT-4o to DeepSeek V3 for L1 support, routing complex queries to Claude Sonnet. Reduced monthly inference from $4,200 to $380 while maintaining 94% CSAT.

E-commerce Sales Agent

3.2x

pipeline ROI

Multi-agent system handling lead qualification + personalised outreach. Dev cost $85K, generating $272K pipeline/quarter. Payback period: 3.7 months.

Code Review Agent

38%

developer productivity gain

RAG agent trained on internal codebase reviews PRs automatically. Using Claude Sonnet 4.6 at $1,200/mo for a 40-person eng team. Saves ~120 eng hours/month.

Inference Costs

Agent	Tok/Task	Budget	Premium	Mo/10K
Support	2K	$0.001	$0.03	$10-300
RAG	15K	$0.008	$0.23	$80-2.3K
Code Gen	25K	$0.013	$0.38	$130-3.8K
Data	40K	$0.021	$0.60	$210-6K
Web Agent	80K	$0.042	$1.20	$420-12K
Multi-Agent	200K	$0.11	$3.00	$1.1K-30K

Agent Inference Cost: Budget vs Premium

Monthly cost for 10K tasks per agent type. The gap between budget and premium models widens dramatically for token-heavy agents — a Multi-Agent system costs 27x more on premium models. This is why model routing is critical for agents.
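The monthly column and the budget-to-premium gap can be reproduced from the per-task costs in the Inference Costs table. This Python sketch uses only those table values; the dictionary layout and function name are ours.

```python
# Per-task cost in USD from the Inference Costs table: (budget, premium).
AGENT_COST_PER_TASK = {
    "Support": (0.001, 0.03),
    "RAG": (0.008, 0.23),
    "Code Gen": (0.013, 0.38),
    "Data": (0.021, 0.60),
    "Web Agent": (0.042, 1.20),
    "Multi-Agent": (0.11, 3.00),
}


def monthly_inference_cost(cost_per_task: float, tasks_per_month: int = 10_000) -> float:
    """Monthly inference spend in USD for a given per-task cost."""
    return cost_per_task * tasks_per_month


if __name__ == "__main__":
    for agent, (budget, premium) in AGENT_COST_PER_TASK.items():
        low = monthly_inference_cost(budget)
        high = monthly_inference_cost(premium)
        print(f"{agent:<12} ${low:>8,.0f} to ${high:>8,.0f}  ({premium / budget:.0f}x)")
```

Change `tasks_per_month` to your own volume to see how quickly the absolute gap grows for token-heavy agents.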

Recommended LLM by Agent Type

Support Agent

DeepSeek V3

$10/mo

10K tasks @ 2K tokens

Host on Vast.ai ↗

RAG Agent

Claude Haiku 4.5

$500/mo

10K tasks @ 15K tokens

Compare LLM Prices ↗

Code Gen Agent

Claude Sonnet 4.6

$1,200/mo

10K tasks @ 25K tokens

Compare LLM Prices ↗

Multi-Agent

GPT-5 + DeepSeek

$2,800/mo

Route: 80% budget / 20% premium

Host on RunPod ↗

Pricing Models

🏷️ Per-Seat

$20-500

/user/mo

Traditional SaaS.

⚡ Per-Task

$0.01-5

/task

Aligns cost with value.

🎯 Outcome

10-30%

of value

Revenue share.

🔢 Token Pass

2-5x

markup

Transparent, deflation risk.

Not sure how to price your AI agent?

Our team has helped 50+ companies model agent economics. Get a free 30-minute strategy call to review your pricing model, TCO projections, and margin analysis.

Book Free Strategy Call

No commitment. 30 minutes. Actionable advice.

ROI Benchmarks

Support

40-60%

deflection

Sales SDR

2-3x

pipeline/cost

Code Gen

25-45%

productivity

Data Analysis

60-80%

time saved

Agent TCO Calculator

Complexity · LLM · Tasks/Month

Dev

Monthly Inference

Year 1 TCO
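This page does not spell out how Year 1 TCO is computed, so the sketch below is only one plausible reading: one-off development cost plus twelve months of operations and inference. The formula and the function name `year_one_tco` are assumptions. The example inputs are the low end of the Task Agent tier above and the budget Code Gen figure from the Inference Costs table.

```python
def year_one_tco(dev_cost: float, monthly_ops: float, monthly_inference: float) -> float:
    """Assumed formula: one-off dev cost plus 12 months of ops and inference (USD)."""
    return dev_cost + 12 * (monthly_ops + monthly_inference)


if __name__ == "__main__":
    # Task Agent, low end: $50K dev, $5K/mo ops; $130/mo budget inference at 10K tasks.
    total = year_one_tco(dev_cost=50_000, monthly_ops=5_000, monthly_inference=130)
    print(f"Year 1 TCO: ${total:,.0f}")
```

In this example operations dominate inference by a wide margin, which is worth checking against your own numbers before optimising model choice alone.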

LLM Benchmarks Leaderboard

Arena Elo, MMLU-Pro, GPQA, HumanEval, SWE-bench, MATH-500.

15

Models

6

Benchmarks

6M+

Arena Votes

Overall Rankings

#	Model	Arena	MMLU-Pro	GPQA	HumanEval	SWE	MATH	Type	$/1M

Top 5 Models — Radar Comparison

Multi-dimensional comparison of the top 5 frontier models across 5 benchmarks. Larger area = more capable. Claude Opus 4.6 leads on coding (SWE-bench), GPT-5 leads on math. No single model dominates every axis.

Quality vs Cost — Bubble Chart

Each bubble is a model. X-axis = price (log scale), Y-axis = Arena Elo quality. Bubble size = coding score. The best value models are top-left (high quality, low price). DeepSeek R1 stands out as a quality outlier at budget pricing.

Best Model by Use Case

Best Budget

DeepSeek V3

$0.14/1M tokens

CIS: 83.5 — 96% cheaper than GPT-4o

Self-Host on Vast.ai ↗

Best for Coding

Claude Opus 4.6

SWE-bench: 80.8%

#1 on coding benchmarks

View API Pricing ↗

Best Value (Quality/$)

Gemini 2.5 Pro

$1.25/1M — CIS: 88.7

Near-frontier quality at mid pricing

Compare All Models ↗

Best Overall

o3 Pro

CIS: 93.1 — $20/1M

Highest quality, premium price

View API Pricing ↗

Want price/performance rankings?

See which model gives the most intelligence per dollar with our Pro price/performance analysis.

Read: GPT-4o vs Claude Pricing ↗ Read: Full LLM Cost Comparison ↗

Chatbot Arena Elo

Claude Opus 4.6: 1503
GPT-5: 1480
Claude Sonnet 4.6: 1470
Gemini 2.5 Pro: 1450
Grok 3: 1440
DeepSeek R1: 1436

SWE-bench Verified

Claude Opus 4.6: 80.8%
Claude Sonnet 4.6: 79.6%
GPT-5: 76.3%
Claude Haiku 4.5: 73.3%
DeepSeek R1: 69.1%

Run your own coding benchmarks

Rent GPUs from $0.34/hr to benchmark models on your own codebase and eval suite.

Vast.ai from $0.34/hr · RunPod from $0.44/hr

MATH-500

GPT-5: 99.4%
Claude Sonnet 4.6: 97.8%
Claude Opus 4.6: 97.6%
DeepSeek R1: 97.3%
Mistral Large 2: 93.6%

Benchmark Guide

⚔️ Arena

6M+ human votes. Gold standard for preference.

📚 MMLU-Pro

14 subject areas, 10 answer options per question. Harder successor to MMLU.

🔬 GPQA

PhD-level science reasoning.

💻 HumanEval

164 Python coding tasks.

🛠️ SWE-bench

Real GitHub issues. Most realistic.

🧮 MATH-500

Competition math reasoning.

Get Benchmark Updates as Models Release

New models drop weekly. Our AI Cost Index newsletter includes benchmark scores, pricing changes, and cost-per-quality analysis for every new release.

Join 500+ AI engineers. No spam.

Run Your Own Eval Suite

Standard benchmarks don't test your specific use case. Rent GPUs to run custom evaluations on your own data and prompts.

Vast.ai: RTX 4090 $0.22/hr · RunPod: RTX 4090 $0.44/hr · DigitalOcean GPUs

Perffeco may earn a commission from provider links.

FinOps for AI

Frameworks, tools, and strategies to cut AI spend 40-70%.

84%

Margin Hit

15%

Forecast Right

40-70%

Savings Opp

AI FinOps Framework

1

Allocation

Tag AI costs to teams, projects, models.

2

Forecasting

Predict costs. Build 30-50% buffers.

3

Anomaly Detection

Real-time alerts. Per-task budgets.

4

Rate Optimization

Volume discounts, CUDs, spot, cheapest model.

5

Rightsizing

Match model to task complexity.

6

Governance

Unified dashboards, approval workflows.

Case Study

SaaS Company Cuts AI Spend 68% in 3 Weeks

A 200-person SaaS company spending $45K/mo on LLM APIs implemented model routing (Step 4) and prompt caching (Step 5). Monthly spend dropped to $14.4K — saving $367K annually. The entire implementation took 3 weeks with no quality degradation.
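The case-study numbers can be verified with a short calculation. This Python sketch uses only the before and after spend quoted above; `savings_summary` is an illustrative name.

```python
def savings_summary(before_monthly: float, after_monthly: float) -> tuple[float, float, float]:
    """Return (monthly saving, annual saving, percent reduction) in USD and percent."""
    if before_monthly <= 0:
        raise ValueError("before_monthly must be positive")
    monthly = before_monthly - after_monthly
    return monthly, monthly * 12, monthly / before_monthly * 100.0


if __name__ == "__main__":
    monthly, annual, pct = savings_summary(45_000, 14_400)
    print(f"Monthly saving: ${monthly:,.0f}")
    print(f"Annual saving:  ${annual:,.0f}")
    print(f"Reduction:      {pct:.0f}%")
```

With $45K before and $14.4K after, this gives a $30.6K monthly saving, about $367K a year, and a 68% reduction.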

Need help implementing this framework?

Get a free 30-minute FinOps audit. We'll review your current AI spend and identify your biggest savings opportunities.

Book Free FinOps Audit

Top Tools 2026

Cast AI

K8s + GPU

K8s cost optimization
Spot management
GPU monitoring

Try Free ↗

Finout

Enterprise FinOps

Multi-cloud billing
AI cost attribution
Custom dashboards

Try Free ↗

Cloudchipr

AI Optimization

AI recommendations
Real-time observability
Auto cleanup

Try Free ↗

Holori

Multi-Cloud

20+ providers
Topology mapping
Forecasting

Try Free ↗

Flexera One

IT + Cloud

Hybrid cloud
SaaS optimization
License compliance

Learn More ↗

Keebo

Data Cost

Auto-tune warehouses
Query optimization
Snowflake focus

Try Free ↗

Perffeco may earn a commission from partner links.

Optimization Strategies

🔀 Model Routing

40-70%

savings

💾 Prompt Caching

50-90%

on repeats

📐 Prompt Eng

30-50%

token cut

🔄 Batch API

50%

discount

🖥️ Spot GPUs

60-90%

vs on-demand

🏠 Self-Host

80-95%

at scale

Quick Win Stack (1 week)

Cache → Route → Batch → Prompt audit = 60-80% reduction.

Savings Waterfall — Cumulative Impact

Starting from a $50K/mo baseline, each strategy stacks. Model routing alone saves 40-70%. Combining all 6 strategies typically achieves 75-90% total reduction. The Quick Win Stack (first 4) delivers 60-80% in just one week.

Ready to Self-Host? Cheapest GPU Options

Self-hosting open-source models (DeepSeek V3, Llama 4, Qwen3) saves 80-95% at scale. Start with spot GPUs for testing.

Vast.ai: H100 $1.49/hr · RunPod: H100 $2.69/hr · DigitalOcean GPUs · Vultr: from $0.65/hr

Perffeco may earn a commission from provider links.

Cost by Org Size

Enterprise 10K+: $500K-2M+/mo
Mid-Market: $50-500K
Startup: $5-50K

How does your AI spend compare?

Get a custom cost benchmark report comparing your AI spend to companies at your stage. Free, confidential, and actionable.

Free. We'll also send you our weekly AI Cost Index.

Savings Calculator

Monthly AI Spend · Level

Monthly Save

Annual

New Monthly

Limited — 14-Day Free Trial, All Features Unlocked

Pay less than one saved GPU hour per month

Pro pays for itself in the first comparison. Most teams save 40-70% on AI costs after switching. Start free, upgrade when you're ready.

Monthly

Annual Save 20%

Free

$0

/month

Frontier Model Economics dashboard (interactive bubble chart with 6 axes)
Composite Intelligence Score (CIS) methodology & rankings
Intelligence / Dollar & Intelligence / Token rankings
Energy efficiency rankings (kWh/token)
LLM API pricing table (10+ models, 6 providers)
Cost per task estimates (6 task types)
Price deflation trends
Benchmark leaderboard (10 models, 6 benchmarks)
Arena Elo, SWE-bench Coding & MATH Reasoning views
Benchmark guide (Arena, MMLU-Pro, GPQA, HumanEval, SWE-bench, MATH-500)
Basic GPU pricing (top providers)
Curated data with periodic updates

Feature	Free	Pro	Team	Enterprise
LLM API pricing table	✓	✓	✓	✓
Frontier Model Economics dashboard	✓	✓	✓	✓
Benchmark leaderboard	✓	✓	✓	✓
CIS methodology & rankings	✓	✓	✓	✓
Cost per task estimates	✓	✓	✓	✓
Full GPU economics (7+ providers)	—	✓	✓	✓
AI Agent economics	—	✓	✓	✓
FinOps framework & tools	—	✓	✓	✓
All calculators (LLM, GPU, Agent, FinOps)	—	✓	✓	✓
Live data via OpenRouter API	—	✓	✓	✓
CSV export	—	✓	✓	✓
Custom price alerts	—	—	✓	✓
Team seats (up to 10)	—	—	✓	✓
API access	—	—	10K/mo	Unlimited
Shared dashboards	—	—	✓	✓
Unlimited seats	—	—	—	✓
Dedicated account manager	—	—	—	✓
SLA guarantee	—	—	—	✓
SSO / SAML	—	—	—	Roadmap

Pricing FAQ

What happens after the 14-day trial?

After your trial ends, you'll automatically move to the Free tier unless you add a payment method. No charges are made during the trial. You keep access to all Free features permanently.

Can I switch plans at any time?

Yes. Upgrade or downgrade anytime from your Stripe billing portal. Upgrades take effect immediately with prorated billing. Downgrades take effect at the end of your current billing cycle.

Do you offer refunds?

Yes, we have a 14-day money-back guarantee. If you're not satisfied within 14 days of purchase, email hello@perffeco.com for a full refund, no questions asked.

Still not sure?

Book a free 15-minute call and we'll help you pick the right plan for your team.

Book a Call

AI Cost Reports

Instant, data-driven reports powered by Perffeco intelligence. Get actionable recommendations for your specific workload.

🎯

LLM Selection Report

Find the optimal model for your use case. Compares 22+ models on quality, cost, and latency for your specific workload.

$29

one-time payment

Top 3 model recommendations
Cost-per-task analysis for your volume
Quality vs price trade-off matrix
Model routing strategy
Monthly cost projection

AI Cost Optimization Report

Comprehensive analysis of your AI spend with specific savings recommendations. Model routing, caching, and provider optimization.

$49

one-time payment

Current spend analysis
Model routing recommendations
Prompt caching opportunities
Provider comparison (GPU + API)
90-day savings roadmap
Projected annual savings

🖥️

GPU Infrastructure Report

Custom GPU provider comparison for your workload. On-demand vs spot vs reserved analysis with hidden costs exposed.

$49

one-time payment

Provider comparison for your GPU type
On-demand vs spot vs reserved analysis
Hidden cost breakdown (egress, storage)
Multi-GPU cluster pricing
12-month cost projection

🤖

Agent TCO Report

Full Total Cost of Ownership for your AI agent. Dev costs, inference projections, scaling analysis, and pricing strategy.

$49

one-time payment

Development cost estimate
Monthly inference projections
Year 1 & Year 2 TCO
Pricing model recommendation
ROI timeline & break-even

📊

Full FinOps Audit Report

Enterprise-grade comprehensive audit. Covers LLM spend, GPU infrastructure, agent economics, and a complete FinOps implementation roadmap. Everything above, combined.

$149

one-time payment · includes 30-min strategy call

Full LLM spend analysis
GPU infrastructure review
Agent economics breakdown
Model routing strategy
Provider optimization
Caching & batching plan
90-day implementation roadmap
Projected savings (monthly + annual)
30-minute strategy call included
Custom recommendations

Generate Your Report

Fill in your details to generate an instant preview. Pay to unlock the full report.

Your Email · Company Name (optional) · Current Monthly AI Spend ($) · Primary Use Case · Current LLM(s) Used · Monthly API Calls / Tasks

Your AI Cost Report — Preview

Finding #1 — Immediate Savings

-

Finding #2 — Recommended Model

-

Finding #3 — Annual Projection

-

Finding #4 — Provider Optimization

Switch to Vast.ai for 45% savings

Based on your workload, migrating GPU inference to marketplace providers could save significant costs.

Finding #5 — Caching Strategy

Implement prompt caching for 60% reduction

Analysis of your use case shows high prompt overlap potential.

Finding #6-10 — Full Roadmap

5 more findings + 90-day implementation plan

Complete savings roadmap with step-by-step instructions and tool recommendations.

Unlock Full Report

Get all 10 findings, provider comparisons, implementation roadmap, and projected savings timeline.

$49

One-time payment · Instant access · PDF download

Secure payment via Stripe · Instant PDF delivery · 14-day refund guarantee

Trusted by AI teams at

STARTUPS · SCALE-UPS · ENTERPRISE · VCs

Your Account

Manage your profile, plan, and integrations.

Profile

Email

—

Plan

FREE

Member Since

—

Connect Telegram

Link your Telegram account to receive mobile price alerts and chat with the Perffeco agent on the go.

Your Plan

Free

Dashboard access, weekly data updates, basic LLM pricing.

Account Actions

Manage your session and account settings.

Team

Create a team to collaborate with your organization.

Price Alerts

API Keys

Saved Dashboards

Team Analytics

Refer a Friend, Get $10 Credit

Share your link. When they subscribe to Pro, you both get $10 off your next bill.

Your referral link is generated automatically when you sign in. Credited within 24 hours of referral's first payment.

Build Your AI Stack

Answer 4 quick questions and get a personalised model + infrastructure recommendation.

What are you building?

Select your primary use case

💬

Chatbot

Customer support, FAQ, conversational

💻

Code Assistant

Code gen, review, debugging

📚

RAG / Search

Knowledge base, document Q&A

📊

Data Analysis

Extraction, summarisation, reports

🤖

AI Agent

Multi-step, tool-calling, autonomous

Expected volume?

How many API calls per day

🌱

Low

< 1,000 calls/day

📈

Medium

1K - 50K calls/day

🚀

High

50K+ calls/day

Quality priority?

Balance between cost and output quality

💰

Budget

Minimise cost, good enough quality

⚖️

Balanced

Best price/performance ratio

👑

Premium

Best quality, cost secondary

Open to self-hosting?

Self-hosting open-source models can save 80-95%

🖥️

Yes

Have GPU infra or willing to rent

☁️

No, API Only

Prefer managed API providers

Your Personalised AI Stack

Get the full recommendation as a detailed PDF with architecture diagram, cost projections, and migration guide.

Free summary. Paid detailed reports include 90-day roadmap.

Free Token Counter

Paste your text to estimate tokens and see what it would cost across different models.

0

Words

0

Characters

0

Est. Tokens

Model	Provider	Input Cost	Output Cost (same length)
GPT-4o Mini	OpenAI	$0.000	$0.000
GPT-4o	OpenAI	$0.000	$0.000
GPT-5	OpenAI	$0.000	$0.000
Claude Sonnet 4.6	Anthropic	$0.000	$0.000
Claude Opus 4.6	Anthropic	$0.000	$0.000
DeepSeek V3.2	DeepSeek	$0.000	$0.000
Gemini 2.5 Pro	Google	$0.000	$0.000
Grok 3	xAI	$0.000	$0.000

Token estimation uses chars/4 approximation. Actual tokenisation varies by model.
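The chars/4 rule is easy to reproduce. The sketch below is an approximation in the same spirit, not the page's actual code: rounding up with `math.ceil` is our assumption, and the prices are the input rates quoted in the ticker on this page. For exact counts, use each provider's own tokeniser.

```python
import math

# Input prices quoted on this page, in USD per 1M tokens.
INPUT_PRICE_PER_MILLION = {
    "GPT-4o Mini": 0.15,
    "DeepSeek V3.2": 0.14,
    "Gemini 2.5 Pro": 1.25,
    "Claude Sonnet 4.6": 3.00,
    "Claude Opus 4.6": 5.00,
}


def estimate_tokens(text: str) -> int:
    """Rough token count using the chars/4 rule (rounding up is an assumption)."""
    return math.ceil(len(text) / 4)


def estimate_input_cost(text: str, price_per_million: float) -> float:
    """Estimated input cost in USD for sending `text` once."""
    return estimate_tokens(text) * price_per_million / 1_000_000


if __name__ == "__main__":
    sample = "Paste your text to estimate tokens and see what it would cost."
    print(f"Words: {len(sample.split())}, characters: {len(sample)}")
    print(f"Estimated tokens: {estimate_tokens(sample)}")
    for model, price in INPUT_PRICE_PER_MILLION.items():
        print(f"{model:<18} ${estimate_input_cost(sample, price):.6f}")
```

The estimate tends to be reasonable for English prose and can be well off for code or non-Latin scripts.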

See full LLM pricing comparison →

Terms of Service

Last updated: March 2026

1. Acceptance of Terms

By accessing or using Perffeco ("the Service"), you agree to be bound by these Terms of Service. If you do not agree to these terms, you may not access or use the Service.

2. Description of Service

Perffeco is an intelligence economics platform providing AI model pricing comparisons, GPU cloud cost analysis, benchmark data, and FinOps tools. Data is aggregated from public sources and provider APIs.

3. User Accounts

You are responsible for maintaining the confidentiality of your account credentials. You agree to notify us immediately of any unauthorised access to your account. You must be at least 18 years old to create an account.

4. Subscription & Billing

Paid plans are billed monthly or annually via Stripe. You may cancel at any time; access continues until the end of your billing period. Refunds are handled on a case-by-case basis within 14 days of purchase.

5. Data Accuracy

While we strive for accuracy, pricing and benchmark data is provided "as is" without warranties. Perffeco is not responsible for decisions made based on the data provided. Always verify critical pricing directly with providers.

6. API Usage

API access is subject to rate limits based on your plan. Automated scraping, redistribution of data, or use of the API to build competing products is prohibited without prior written consent.

7. Contact

For questions about these terms, contact us at hello@perffeco.com.

Privacy Policy

Last updated: March 2026

1. Information We Collect

We collect your email address and password hash when you create an account. We also collect usage analytics (pages viewed, features used) to improve the Service. Payment information is processed directly by Stripe and never stored on our servers.

2. How We Use Your Information

Your information is used to provide and improve the Service, manage your subscription, send service notifications, and respond to support requests. We do not sell or share your personal data with third parties for marketing purposes.

3. Data Storage

Your data is stored securely on Supabase infrastructure (hosted on AWS). We implement industry-standard security measures including encryption in transit and at rest, and row-level security policies.

4. Cookies

We use essential cookies for authentication and session management. No third-party tracking cookies are used. You can disable cookies in your browser settings, but this may affect Service functionality.

5. Your Rights (GDPR)

You have the right to access, correct, delete, or export your personal data at any time. You can delete your account from the Account page or by emailing us. We will respond to data requests within 30 days.

6. Contact

For privacy inquiries, contact us at hello@perffeco.com.

Frequently Asked Questions

Everything you need to know about Perffeco and our data.

Where does Perffeco source its data?

We aggregate data from official provider APIs, published technical reports, Epoch AI research, LM Arena leaderboards, and direct price-page scraping across multiple GPU and LLM API providers. All disclosed costs are lab-verified; estimated costs use the Epoch AI cost model.

How often is data updated?

Free tier data updates weekly. Pro subscribers get daily updates. Enterprise customers receive real-time feeds via our API. Pricing data is checked against live provider pages every 24 hours.

How is the Composite Intelligence Score (CIS) calculated?

CIS = MMLU × 0.15 + MMLU-Pro × 0.30 + GPQA × 0.35 + HLE × 0.20, where HLE is Humanity's Last Exam and the four weights sum to 1.0. This weighting reflects the discriminative power of each benchmark in 2025-2026: GPQA (PhD-level) has the highest weight, while MMLU is down-weighted due to near-saturation at 91%+ ceilings.
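The CIS formula above is a plain weighted sum, so it can be sketched in a few lines of Python. The weights come straight from the formula. The benchmark scores in the example are hypothetical placeholders, not real model results, and the function name is ours.

```python
# Weights from the CIS formula above.
CIS_WEIGHTS = {
    "MMLU": 0.15,
    "MMLU-Pro": 0.30,
    "GPQA": 0.35,
    "HLE": 0.20,
}


def composite_intelligence_score(scores: dict[str, float]) -> float:
    """Weighted sum of benchmark scores (each on a 0-100 scale)."""
    missing = set(CIS_WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing benchmark scores: {sorted(missing)}")
    return sum(weight * scores[name] for name, weight in CIS_WEIGHTS.items())


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    example = {"MMLU": 90.0, "MMLU-Pro": 80.0, "GPQA": 70.0, "HLE": 20.0}
    print(f"CIS: {composite_intelligence_score(example):.1f}")
```

Because the weights sum to 1.0, a model scoring the same value on all four benchmarks gets exactly that value as its CIS.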

Can I export data or use an API?

Pro subscribers can export data as CSV. Team and Enterprise plans include API access with up to 10K calls/month (Team) or unlimited (Enterprise). Contact us for custom data feeds and white-label solutions.

What do the subscription plans include?

Free includes the Frontier Model Economics dashboard (interactive bubble chart, 6 axes), CIS methodology and all intelligence rankings (per-dollar, per-token, energy efficiency), LLM API pricing (10+ models, 6 providers), cost/task estimates, price deflation trends, the full benchmark leaderboard (10 models, 6 benchmarks), Arena Elo, SWE-bench Coding, MATH Reasoning views, benchmark guide, basic GPU pricing, and weekly updates.

Pro ($29/mo) adds full GPU economics (multiple providers, H100 compare, price trends, hidden costs), complete AI Agent economics (dev cost tiers, inference tables, pricing models, ROI benchmarks), the full FinOps suite (6-step framework, tools directory, optimization strategies, Quick Win Stack, cost-by-org benchmarks), all four calculators (LLM token, GPU cost, Agent TCO, FinOps savings), LLM price/performance rankings, live data via OpenRouter API, daily updates, and CSV export.

Team ($79/user/mo) adds up to 10 seats, shared dashboards, team cost analytics, priority support, API access (10K calls/mo), and custom alerts.

Enterprise gets unlimited seats, real-time API (unlimited), custom dashboards and reports, SSO/SAML, a dedicated account manager, SLA guarantee, and white-label options.

How do I cancel my subscription?

You can cancel anytime from your Stripe billing portal — no lock-in, no questions asked. Your access continues until the end of your billing period. Email hello@perffeco.com if you need help.
