Live data · Updated daily · Prices dropping fast

You're overpaying for AI infrastructure

Most teams waste 40-70% of their spend on LLM APIs and GPU cloud. We track real-time pricing across 23 models and 12 GPU providers so you don't have to. Find out whether you're overpaying in 10 seconds.

Free forever · No signup required · Pro trial unlocks everything
$2.1M+ Saved for teams
500+ Teams using Perffeco
23 Models tracked
12 GPU providers
Built for AI startups, ML engineers, FinOps teams, CTOs, and technical investors
Daily data refresh | 23 LLM models | 12 GPU providers | 6 benchmark suites | Editorially independent
💸

Am I Overpaying for AI?

Find out in 10 seconds. Pick your model and monthly spend.

🎯

Your AI Cost Score Card

Generate your personalized AI efficiency report. Share it to flex your optimization game.

AI Efficiency Score
Monthly Spend
Annual Savings Possible
Your Cost / Task
Optimal Cost / Task
perffeco.com — Intelligence Economics Platform
GPT-4o Mini $0.15/1M in · Claude Sonnet 4.6 $3.00/1M in · H100 SXM from $1.49/hr · DeepSeek V3.2 $0.14/1M in · Gemini 2.5 Pro $1.25/1M in · B200 from $2.99/hr · Claude Opus 4.6 $5.00/1M in · o3 Pro $20.00/1M in · H200 SXM from $1.85/hr · A100 80GB from $1.19/hr

Cheapest AI Right Now

Real-time lowest prices across LLM APIs and GPU clouds

Cheapest LLM
Mistral Nemo
Mistral AI
$0.02
per 1M input tokens
96% cheaper than GPT-4o
Cheapest H100
H100 SXM 80GB
Vast.ai
$1.49
per hour on-demand
62% cheaper than AWS
Best Value
DeepSeek V3.2
DeepSeek
$0.14
per 1M input · Elo 1280
Best quality/price ratio for production
Updated daily from provider APIs · View full dashboard

Everything you need to cut AI costs

Free tier gives you the data. Pro gives you the edge.

Module 1

LLM Economics

API pricing for 23 models across 6 providers. Cost-per-task analysis, speed benchmarks, price deflation tracking, and token calculators.

Module 2

GPU Cloud Pricing

H100, A100, H200, B200, L40S across 12 providers. Spot vs on-demand analysis, hidden costs, and monthly TCO calculator.

Module 3

Quality Benchmarks

Arena Elo, GPQA, SWE-bench, MATH-500, HumanEval rankings. Radar comparisons and quality-per-dollar analysis.

Module 4

FinOps & Agents

AI agent economics (dev costs, inference, ROI), 6-step FinOps framework, optimisation strategies, and savings calculators.

Model Head-to-Head

Pick two models. See who wins on price, quality, and value.

GPT-4o
$6.25/1M blended
VS
Claude Sonnet 4.6
$9.00/1M blended
GPT-4o is 31% cheaper, but Claude Sonnet 4.6 scores higher on 5/6 benchmarks.
Read Full GPT-4o vs Claude Analysis ↗

Used by teams at every stage

"Switched our inference stack from GPT-4o to a routed DeepSeek + Claude mix after comparing on Perffeco. Monthly API bill dropped from $8.2K to $2.4K with no measurable quality loss on our eval suite."
VP of Engineering
Series B AI Startup, 40-person eng team
"We were paying 3x more than necessary for H100 compute. Perffeco's provider comparison showed us Vast.ai at $1.49/hr vs our Azure contract at $6.98/hr. Migrated our training workloads in a week."
ML Platform Lead
FinTech scale-up, $50K+/mo GPU spend
"I use Perffeco to benchmark model economics across our portfolio. The cost-per-quality analysis is something we couldn't find anywhere else. It's become part of our due diligence process."
General Partner
Deep Tech VC, $200M fund
Free Weekly

The AI Cost Index

Every Monday: which models got cheaper, GPU price drops, and one FinOps tip that saves real money. Trusted by 500+ AI engineers.

No spam. Unsubscribe in one click. Read by teams at YC, a16z portfolio, and F500 infra orgs.

Sources: OpenAI, Anthropic, Google, Mistral, xAI, DeepSeek, Meta (LLM pricing) · Vast.ai, RunPod, Lambda, CoreWeave, AWS, GCP, Azure (GPU pricing) · LMSYS Arena, GPQA, SWE-bench, MATH-500 (benchmarks) · Epoch AI (training cost research) · Updated daily · Editorially independent

Stop overpaying for AI. Start saving today.

Free dashboard. No signup needed. Pro trial unlocks all data for 14 days.

Frontier Model Economics

INTELLIGENCE · TRAINING COST · ENERGY · REAL DATA · 2026
✓ disclosed = lab-verified cost · ~ estimated = Epoch AI cost model

OpenAI
Anthropic
Google DeepMind
DeepSeek
Meta
Mistral AI
Alibaba
● dot size = log₁₀(total params) · white ring = lab-disclosed training cost · hover any dot for details
BEST INTELLIGENCE / DOLLAR
  • 1. DeepSeek V3/R1: ~1.1e-5
  • 2. Qwen2.5 72B: 5.3e-6
  • 3. Llama 4 Scout: 4.8e-6
  • 4. Mistral Large 2: 3.2e-6
BEST INTELLIGENCE / TOKEN-$
  • 1. Llama 4 Scout: 1.9e+8
  • 2. DeepSeek V3/R1: ~1.6e+8
  • 3. Qwen2.5 72B: 9.6e+7
  • 4. Mistral Large 2: 2.3e+7
LOWEST ENERGY / TOKEN
  • 1. Llama 4 Scout: 5.04e-9 kWh
  • 2. Qwen2.5 72B: 1.45e-8 kWh
  • 3. DeepSeek V3/R1: ~7.48e-8 kWh
  • 4. Mistral Large 2: 1.44e-7 kWh
ALL MODELS · ranked by CIS
  • Gemini 2.5 Pro: 71.9
  • o1: 66.8
  • DeepSeek V3/R1: ~65.8
  • Claude 3.5 Sonnet: 60.4
  • Claude 3 Opus: 57.2
  • Llama 3.1 405B: 55.2
  • GPT-4o: 54.1
  • Qwen2.5 72B: 53.1
  • Gemini 1.5 Pro: 50.9
  • GPT-4: 47.3
◆ COMPOSITE INTELLIGENCE SCORE & DERIVED METRICS
CIS = MMLU×0.15 + MMLU-Pro×0.30 + GPQA×0.35 + HLE×0.20
cost_mid = (low_est + high_est) / 2
cost/token = cost_mid / train_tokens_est
int/$ = CIS / cost_mid · int/tok$ = CIS / cost_per_token
MMLU near-saturated (91% ceiling); GPQA = best discriminator 2024+
H100 SXM5: 1,979 TFLOPS bf16 · MFU 38% ≈ 752 TFLOPS eff.
Energy: GPU-hrs × 700W TDP × PUE 1.3 (hyperscale DC)
CO₂: 0.400 kg/kWh (IEA 2024 global electricity average)
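
As a rough illustration of the formulas above, the following Python sketch computes CIS, the derived cost metrics, and the energy and CO₂ estimates. The function names are hypothetical, and every input number in the example (benchmark scores, cost estimates, token count, GPU-hours) is a placeholder chosen for easy arithmetic, not a value from the dashboard.

```python
# Weights taken from the CIS definition above.
CIS_WEIGHTS = {"mmlu": 0.15, "mmlu_pro": 0.30, "gpqa": 0.35, "hle": 0.20}


def composite_intelligence_score(scores):
    """Weighted sum of benchmark scores (each on a 0-100 scale)."""
    return sum(weight * scores[name] for name, weight in CIS_WEIGHTS.items())


def derived_metrics(cis, low_est, high_est, train_tokens_est):
    """Midpoint training cost, cost per training token, and the two ratios."""
    cost_mid = (low_est + high_est) / 2
    cost_per_token = cost_mid / train_tokens_est
    return {
        "cost_mid": cost_mid,
        "cost_per_token": cost_per_token,
        "int_per_dollar": cis / cost_mid,
        "int_per_token_dollar": cis / cost_per_token,
    }


def training_energy_kwh(gpu_hours, tdp_watts=700.0, pue=1.3):
    """GPU-hours x TDP x PUE, converted from watt-hours to kWh."""
    return gpu_hours * tdp_watts * pue / 1000.0


def training_co2_kg(energy_kwh, kg_per_kwh=0.400):
    """Emissions at the stated grid-average carbon intensity."""
    return energy_kwh * kg_per_kwh


# Hypothetical model: all numbers below are placeholders.
example_scores = {"mmlu": 90.0, "mmlu_pro": 80.0, "gpqa": 70.0, "hle": 20.0}
cis = composite_intelligence_score(example_scores)
metrics = derived_metrics(cis, low_est=4e6, high_est=8e6, train_tokens_est=1e13)
energy = training_energy_kwh(gpu_hours=1_000_000)

print(f"CIS: {cis:.1f}")
print(f"int/$: {metrics['int_per_dollar']:.2e}")
print(f"int/tok$: {metrics['int_per_token_dollar']:.2e}")
print(f"energy: {energy:,.0f} kWh, CO2: {training_co2_kg(energy):,.0f} kg")
```

Because int/$ divides by the midpoint of a cost range, a wide gap between low_est and high_est makes the ranking correspondingly uncertain.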

PRIMARY SOURCES

Epoch AI — Rising Costs (Cottier et al. 2024) · DeepSeek V3 Technical Report (2024) · DeepSeek R1 Paper — Nature 2025 · Meta Llama 3 Technical Report · Google Gemini Technical Reports · OpenAI System Cards · Mistral AI Technical Reports

📊
Get AI Cost Report
Personalised analysis from $29
Unlock All Data
Pro — 14-day free trial
🎯
Book Strategy Call
Free 30-min consultation

LLM Economics Dashboard

Compare API pricing across top models. Prices dropped ~80% YoY.

11
Models
-80%
Price Drop YoY
1000x
Gap
6
Providers
LLM API Pricing
Per 1M tokens. Blended cost (lowest first).
Model | Provider | Input $/1M | Output $/1M | Blended | Speed | TTFT | Context | Tier
Input vs Output Price by Model
Horizontal bars show input (left) and output (right) cost per 1M tokens. Budget models clustered at the left — the price gap between cheapest and most expensive is over 500x.
Price Tier Distribution
How the 23 tracked models split across pricing tiers. The budget tier now has the most models — a shift from 2024 when premium dominated.
Speed vs Cost — The Best Trade-offs
Top-right = fast and expensive. Bottom-left = slow and cheap. The sweet spot is top-left: fast AND cheap. Haiku 4.5 and Mistral Nemo stand out.
Output Speed (tokens/second)
Speed matters for real-time applications. Reasoning models (o3 Pro, DeepSeek R1) are deliberately slower — they “think” more. For chat/API, Mistral Nemo and Claude Haiku are fastest.
Fast (>100 tok/s) · Medium (50-100 tok/s) · Slow (<50 tok/s)
Cost Per Task

📧 Email Summary

~700 tokens

$0.0001
GPT-4o Mini

📝 Blog Post

~2,500 tokens

$0.001
GPT-4o Mini

💻 Code Review

~4,500 tokens

$0.01
Sonnet 4.6

📄 10-Page Analysis

~17K tokens

$0.075
Sonnet 4.6

🤖 RAG Query

~8.5K tokens

$0.005
DeepSeek V3.2

🔄 1M Calls/mo

~1K tokens avg

$20
Mistral Nemo
Token Calculator
Monthly
Price/Performance
DeepSeek V3.2: 95 (Best)
Gemini Flash: 90 (Excellent)
GPT-4o Mini: 85 (Excellent)
Claude Haiku 4.5: 72 (Good)
GPT-5: 68 (Good)
Opus 4.6: 42 (Premium)
Self-Host Open Source Models — Save 80-95%

At high volume, self-hosting DeepSeek V3, Llama 4, or Qwen3 on rented GPUs is dramatically cheaper than API pricing. A single H100 at $1.49/hr can serve thousands of requests per minute.

Vast.ai: H100 $1.49/hr · RunPod: H100 $2.69/hr · DigitalOcean GPUs · Vultr: from $0.65/hr

Perffeco may earn a commission from provider links.

Want GPU economics, agent costs & FinOps tools?
Pro unlocks full GPU pricing, AI agent economics, FinOps framework, and all calculators.
Get a personalised AI Cost Report
Instant AI-generated report with savings analysis, model recommendations, and 90-day roadmap. From $29.
Read: GPT-4o vs Claude Pricing ↗ Read: Full LLM Cost Comparison ↗ Read: Real Cost of GPT-4o ↗

Get Notified When LLM Prices Drop

Get notified when prices drop. Free forever.

GPU Economics Dashboard

GPU cloud pricing across top providers.

12
Providers
32
Configs
-70%
H100 from Peak
4.7x
Gap
GPU Pricing Table
GPU | VRAM | Provider | On-Demand | Spot | Type
H100 80GB — Provider Compare
Provider | Type | H100 price
Vast.ai | Marketplace | $1.49/hr on-demand
Lambda | Specialized | $1.85/hr
RunPod | Marketplace | $2.69/hr secure
DigitalOcean | GPU Cloud | From $2.50/hr (Gradient)
Vultr | Cloud GPU | From $0.65/hr (A100/H100)
CoreWeave | Specialized | $4.76/hr
GCP | Hyperscaler | $3.67/hr
AWS | Hyperscaler | $3.93/hr
Azure | Hyperscaler | $6.98/hr
H100 Hourly Cost by Provider
Marketplace providers (Vast.ai, RunPod) consistently undercut hyperscalers (AWS, Azure) by 40-75%. The cheapest H100 costs 4.7x less than the most expensive, so choosing the right provider can save thousands monthly.
Monthly Cost: 1x H100 (24/7)
What you'd actually pay per month running a single H100 around the clock. Multiply by your GPU count for cluster costs.
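
The monthly figure can be approximated as hourly rate × hours per month × GPU count. The sketch below applies that to the fixed hourly prices listed above. The 730-hour month (365 × 24 / 12) and the function name are assumptions, and the "from" prices (DigitalOcean, Vultr) are left out because they are not fixed H100 rates.

```python
# Assumption: an average month has 365 * 24 / 12 = 730 hours.
HOURS_PER_MONTH = 730

# On-demand H100 hourly prices (USD) as listed in the provider comparison.
H100_HOURLY_USD = {
    "Vast.ai": 1.49,
    "Lambda": 1.85,
    "RunPod": 2.69,
    "GCP": 3.67,
    "AWS": 3.93,
    "CoreWeave": 4.76,
    "Azure": 6.98,
}


def monthly_cost(hourly_usd, gpu_count=1, hours=HOURS_PER_MONTH):
    """Cost of running gpu_count GPUs around the clock for one month."""
    return hourly_usd * hours * gpu_count


cheapest = min(H100_HOURLY_USD, key=H100_HOURLY_USD.get)
priciest = max(H100_HOURLY_USD, key=H100_HOURLY_USD.get)
price_gap = H100_HOURLY_USD[priciest] / H100_HOURLY_USD[cheapest]

for provider, hourly in sorted(H100_HOURLY_USD.items(), key=lambda kv: kv[1]):
    print(f"{provider:<10} ${monthly_cost(hourly):>9,.2f}/mo")
print(f"Gap between {cheapest} and {priciest}: {price_gap:.1f}x")
```

For a cluster, pass gpu_count (for example, monthly_cost(1.49, gpu_count=8)); hidden costs such as egress and storage are not included.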
Hidden Costs

🌐 Egress

$0.08-0.12
/GB out

💾 Storage

$0.02-0.08
/GB/mo

🔗 Network

15-30%
IB premium

⏱️ Commit

1-3yr
RI terms

🛡️ Support

$5-15K
/mo enterprise

⚡ Spot Risk

60-90%
save vs interrupt
GPU Calculator
Monthly

Get a Custom GPU Infrastructure Report

Full provider comparison for your specific GPU, on-demand vs spot analysis, hidden costs breakdown, and 12-month TCO projection. AI-generated PDF delivered instantly.

Get GPU price alerts
Weekly digest of GPU pricing changes, new provider launches, and spot price trends.
Read: H100 GPU Pricing Guide 2026 ↗ Read: RunPod vs Vast.ai ↗

Get Notified When GPU Prices Drop

Get notified when prices drop. Free forever.

AI Agents Economics

Dev costs, inference, pricing models, and ROI benchmarks.

~10x/yr
Inference Deflation
84%
Margin Erosion
$2.5T
AI Spend 2026
Dev Cost Tiers
Simple Chatbot
$5-25K
$3.2-5K/mo ops
  • FAQ / rule-based
  • Single LLM
  • Templates
Task Agent
$50-120K
$5-8K/mo ops
  • Tool-calling
  • Multi-step
  • API integrations
RAG Agent
$80-180K
$6-10K/mo ops
  • Knowledge retrieval
  • RAG memory
  • Self-correct
Multi-Agent
$150-500K+
$8-13K/mo ops
  • Orchestration
  • HITL
  • Governance

Ready to Build? Start Here

Top platforms for building AI agents at every complexity level

Voiceflow
No-Code Agents
Best for: Simple Chatbots
Try Free ↗
Relevance AI
Agent Builder
Best for: Task Agents
Try Free ↗
LangSmith
Dev Platform
Best for: RAG Agents
Try Free ↗
CrewAI
Multi-Agent Framework
Best for: Multi-Agent
Try Free ↗

Perffeco may earn a commission from partner links.

Real-World Results

FinTech Support Agent

62%
ticket deflection rate

Switched from GPT-4o to DeepSeek V3 for L1 support, routing complex queries to Claude Sonnet. Reduced monthly inference from $4,200 to $380 while maintaining 94% CSAT.

E-commerce Sales Agent

3.2x
pipeline ROI

Multi-agent system handling lead qualification + personalised outreach. Dev cost $85K, generating $272K pipeline/quarter. Payback period: 3.7 months.

Code Review Agent

38%
developer productivity gain

RAG agent trained on internal codebase reviews PRs automatically. Using Claude Sonnet 4.6 at $1,200/mo for a 40-person eng team. Saves ~120 eng hours/month.

Inference Costs
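Agent | Tokens/Task | Budget $/task | Premium $/task | Monthly cost (10K tasks)
Support | 2K | $0.001 | $0.03 | $10-300
RAG | 15K | $0.008 | $0.23 | $80-2.3K
Code Gen | 25K | $0.013 | $0.38 | $130-3.8K
Data | 40K | $0.021 | $0.60 | $210-6K
Web Agent | 80K | $0.042 | $1.20 | $420-12K
Multi-Agent | 200K | $0.11 | $3.00 | $1.1K-30K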
Agent Inference Cost: Budget vs Premium
Monthly cost for 10K tasks per agent type. The gap between budget and premium models widens dramatically for token-heavy agents — a Multi-Agent system costs 27x more on premium models. This is why model routing is critical for agents.
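
To make the arithmetic behind the monthly ranges explicit, here is a small Python sketch that multiplies the per-task costs from the Inference Costs table by the task volume and reports the premium-to-budget multiple. The function names are hypothetical; the per-task figures are copied from the table.

```python
# (budget $/task, premium $/task) per agent type, from the Inference Costs table.
PER_TASK_USD = {
    "Support": (0.001, 0.03),
    "RAG": (0.008, 0.23),
    "Code Gen": (0.013, 0.38),
    "Data": (0.021, 0.60),
    "Web Agent": (0.042, 1.20),
    "Multi-Agent": (0.11, 3.00),
}


def monthly_range(agent, tasks_per_month=10_000):
    """Return (budget, premium) monthly inference cost in USD."""
    budget, premium = PER_TASK_USD[agent]
    return budget * tasks_per_month, premium * tasks_per_month


def premium_multiple(agent):
    """How many times more a task costs on a premium model."""
    budget, premium = PER_TASK_USD[agent]
    return premium / budget


for agent in PER_TASK_USD:
    low, high = monthly_range(agent)
    print(f"{agent:<12} ${low:>8,.0f} - ${high:>8,.0f}  ({premium_multiple(agent):.0f}x)")
```

The multiple stays close to 30x for every agent type, so the widening gap is in absolute dollars: token-heavy agents multiply a similar ratio by a much larger base.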
Recommended LLM by Agent Type
Support Agent
DeepSeek V3
$10/mo
10K tasks @ 2K tokens
Host on Vast.ai ↗
RAG Agent
Claude Haiku 4.5
$500/mo
10K tasks @ 15K tokens
Compare LLM Prices ↗
Code Gen Agent
Claude Sonnet 4.6
$1,200/mo
10K tasks @ 25K tokens
Compare LLM Prices ↗
Multi-Agent
GPT-5 + DeepSeek
$2,800/mo
Route: 80% budget / 20% premium
Host on RunPod ↗
Pricing Models

🏷️ Per-Seat

$20-500
/user/mo

Traditional SaaS.

⚡ Per-Task

$0.01-5
/task

Aligns cost with value.

🎯 Outcome

10-30%
of value

Revenue share.

🔢 Token Pass

2-5x
markup

Transparent, deflation risk.

Not sure how to price your AI agent?

Our team has helped 50+ companies model agent economics. Get a free 30-minute strategy call to review your pricing model, TCO projections, and margin analysis.

Book Free Strategy Call
No commitment. 30 minutes. Actionable advice.
ROI Benchmarks

Support

40-60%
deflection

Sales SDR

2-3x
pipeline/cost

Code Gen

25-45%
productivity

Data Analysis

60-80%
time saved
Agent TCO Calculator
Dev
Monthly Inference
Year 1 TCO

LLM Benchmarks Leaderboard

Arena Elo, MMLU-Pro, GPQA, HumanEval, SWE-bench, MATH-500.

15
Models
6
Benchmarks
6M+
Arena Votes
Overall Rankings
# | Model | Arena | MMLU-Pro | GPQA | HumanEval | SWE | MATH | Type | $/1M
Top 5 Models — Radar Comparison
Multi-dimensional comparison of the top 5 frontier models across 5 benchmarks. Larger area = more capable. Claude Opus 4.6 leads on coding (SWE-bench), GPT-5 leads on math. No single model dominates every axis.
Quality vs Cost — Bubble Chart
Each bubble is a model. X-axis = price (log scale), Y-axis = Arena Elo quality. Bubble size = coding score. The best value models are top-left (high quality, low price). DeepSeek R1 stands out as a quality outlier at budget pricing.
Best Model by Use Case
Best Budget
DeepSeek V3
$0.14/1M tokens
CIS: 83.5 — 96% cheaper than GPT-4o
Self-Host on Vast.ai ↗
Best for Coding
Claude Opus 4.6
SWE-bench: 80.8%
#1 on coding benchmarks
View API Pricing ↗
Best Value (Quality/$)
Gemini 2.5 Pro
$1.25/1M — CIS: 88.7
Near-frontier quality at mid pricing
Compare All Models ↗
Best Overall
o3 Pro
CIS: 93.1 — $20/1M
Highest quality, premium price
View API Pricing ↗
Want price/performance rankings?
See which model gives the most intelligence per dollar with our Pro price/performance analysis.
Read: GPT-4o vs Claude Pricing ↗ Read: Full LLM Cost Comparison ↗
Chatbot Arena Elo
Claude Opus 4.6: 1503
GPT-5: 1480
Claude Sonnet 4.6: 1470
Gemini 2.5 Pro: 1450
Grok 3: 1440
DeepSeek R1: 1436
SWE-bench Verified
Claude Opus 4.6: 80.8%
Claude Sonnet 4.6: 79.6%
GPT-5: 76.3%
Claude Haiku 4.5: 73.3%
DeepSeek R1: 69.1%
Run your own coding benchmarks
Rent GPUs from $0.34/hr to benchmark models on your own codebase and eval suite.
Vast.ai from $0.34/hr · RunPod from $0.44/hr
MATH-500
GPT-5: 99.4%
Claude Sonnet 4.6: 97.8%
Claude Opus 4.6: 97.6%
DeepSeek R1: 97.3%
Mistral Large 2: 93.6%
Benchmark Guide

⚔️ Arena

6M+ human votes. Gold standard for preference.

📚 MMLU-Pro

14 subject areas, 10 answer options per question. Broad knowledge and reasoning test.

🔬 GPQA

PhD-level science reasoning.

💻 HumanEval

164 Python coding tasks.

🛠️ SWE-bench

Real GitHub issues. Most realistic.

🧮 MATH-500

Competition math reasoning.

Get Benchmark Updates as Models Release

New models drop weekly. Our AI Cost Index newsletter includes benchmark scores, pricing changes, and cost-per-quality analysis for every new release.

Join 500+ AI engineers. No spam.
Run Your Own Eval Suite

Standard benchmarks don't test your specific use case. Rent GPUs to run custom evaluations on your own data and prompts.

Vast.ai: RTX 4090 $0.22/hr · RunPod: RTX 4090 $0.44/hr · DigitalOcean GPUs

Perffeco may earn a commission from provider links.

FinOps for AI

Frameworks, tools, and strategies to cut AI spend 40-70%.

84%
Margin Hit
15%
Forecast Right
40-70%
Savings Opp
AI FinOps Framework
1

Allocation

Tag AI costs to teams, projects, models.

2

Forecasting

Predict costs. Build 30-50% buffers.

3

Anomaly Detection

Real-time alerts. Per-task budgets.

4

Rate Optimization

Volume discounts, CUDs, spot, cheapest model.

5

Rightsizing

Match model to task complexity.

6

Governance

Unified dashboards, approval workflows.

Case Study

SaaS Company Cuts AI Spend 68% in 3 Weeks

A 200-person SaaS company spending $45K/mo on LLM APIs implemented model routing (Step 4) and prompt caching (Step 5). Monthly spend dropped to $14.4K — saving $367K annually. The entire implementation took 3 weeks with no quality degradation.
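
The case-study figures can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is hypothetical and the inputs are the before and after monthly spend quoted above.

```python
def savings_summary(before_monthly, after_monthly):
    """Monthly and annual savings plus the percentage reduction."""
    saved = before_monthly - after_monthly
    return {
        "monthly_saved": saved,
        "annual_saved": saved * 12,
        "reduction_pct": saved / before_monthly * 100,
    }


case = savings_summary(before_monthly=45_000, after_monthly=14_400)
print(f"Saved per month: ${case['monthly_saved']:,}")
print(f"Saved per year:  ${case['annual_saved']:,}")
print(f"Reduction:       {case['reduction_pct']:.0f}%")
```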

Need help implementing this framework?

Get a free 30-minute FinOps audit. We'll review your current AI spend and identify your biggest savings opportunities.

Book Free FinOps Audit
Top Tools 2026
Cast AI
K8s + GPU
  • K8s cost optimization
  • Spot management
  • GPU monitoring
Try Free ↗
Finout
Enterprise FinOps
  • Multi-cloud billing
  • AI cost attribution
  • Custom dashboards
Try Free ↗
Cloudchipr
AI Optimization
  • AI recommendations
  • Real-time observability
  • Auto cleanup
Try Free ↗
Holori
Multi-Cloud
  • 20+ providers
  • Topology mapping
  • Forecasting
Try Free ↗
Flexera One
IT + Cloud
  • Hybrid cloud
  • SaaS optimization
  • License compliance
Learn More ↗
Keebo
Data Cost
  • Auto-tune warehouses
  • Query optimization
  • Snowflake focus
Try Free ↗

Perffeco may earn a commission from partner links.

Optimization Strategies

🔀 Model Routing

40-70%
savings

💾 Prompt Caching

50-90%
on repeats

📐 Prompt Eng

30-50%
token cut

🔄 Batch API

50%
discount

🖥️ Spot GPUs

60-90%
vs on-demand

🏠 Self-Host

80-95%
at scale

Quick Win Stack (1 week)

Cache → Route → Batch → Prompt audit = 60-80% reduction.

Savings Waterfall — Cumulative Impact
Starting from a $50K/mo baseline, each strategy stacks. Model routing alone saves 40-70%. Combining all 6 strategies typically achieves 75-90% total reduction. The Quick Win Stack (first 4) delivers 60-80% in just one week.
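
One way to see why stacked savings compound rather than add is to apply each strategy to whatever spend remains after the previous one. The Python sketch below does that for the Quick Win Stack. The savings rates are the lower bounds quoted above, but the coverage values (the share of spend each strategy can actually touch) are purely hypothetical assumptions, as are the function and variable names.

```python
def stack_savings(baseline, steps):
    """Apply each (name, rate, coverage) step to the remaining spend.

    rate is the saving on the affected spend; coverage is the assumed
    fraction of remaining spend the strategy applies to.
    """
    remaining = baseline
    trail = []
    for name, rate, coverage in steps:
        remaining *= 1 - rate * coverage
        trail.append((name, remaining))
    return trail


BASELINE_USD = 50_000

# Rates: lower bounds from the strategy list. Coverage: hypothetical.
QUICK_WIN_STACK = [
    ("Prompt caching", 0.50, 0.40),
    ("Model routing", 0.40, 0.60),
    ("Batch API", 0.50, 0.30),
    ("Prompt audit", 0.30, 1.00),
]

trail = stack_savings(BASELINE_USD, QUICK_WIN_STACK)
total_reduction = 1 - trail[-1][1] / BASELINE_USD

for name, remaining in trail:
    print(f"after {name:<15} ${remaining:>10,.2f}/mo")
print(f"Total reduction: {total_reduction:.1%}")
```

With these assumed coverage values the stack lands inside the 60-80% range; different coverage assumptions will move the result, which is why per-workload measurement matters.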
Ready to Self-Host? Cheapest GPU Options

Self-hosting open-source models (DeepSeek V3, Llama 4, Qwen3) saves 80-95% at scale. Start with spot GPUs for testing.

Perffeco may earn a commission from provider links.

Cost by Org Size
Enterprise (10K+): $500K-2M+/mo
Mid-Market: $50-500K/mo
Startup: $5-50K/mo

How does your AI spend compare?

Get a custom cost benchmark report comparing your AI spend to companies at your stage. Free, confidential, and actionable.

Free. We'll also send you our weekly AI Cost Index.
Savings Calculator
Monthly Save
Annual
New Monthly
Limited — 14-Day Free Trial, All Features Unlocked

Pay less than one saved GPU hour per month

Pro pays for itself in the first comparison. Most teams save 40-70% on AI costs after switching. Start free, upgrade when you're ready.

Monthly
Annual Save 20%
Free
$0
/month
  • Frontier Model Economics dashboard (interactive bubble chart with 6 axes)
  • Composite Intelligence Score (CIS) methodology & rankings
  • Intelligence / Dollar & Intelligence / Token rankings
  • Energy efficiency rankings (kWh/token)
  • LLM API pricing table (10+ models, 6 providers)
  • Cost per task estimates (6 task types)
  • Price deflation trends
  • Benchmark leaderboard (10 models, 6 benchmarks)
  • Arena Elo, SWE-bench Coding & MATH Reasoning views
  • Benchmark guide (Arena, MMLU-Pro, GPQA, HumanEval, SWE-bench, MATH-500)
  • Basic GPU pricing (top providers)
  • Curated data with periodic updates
Most Popular
Pro
$29
/month
  • Everything in Free
  • Full GPU economics (7+ providers, trends, hidden costs)
  • GPU provider compare (H100 across 8 providers)
  • GPU price trends (historical pricing)
  • GPU hidden cost analysis (egress, storage, network, spot risk)
  • AI Agent dev cost tiers (chatbot to multi-agent)
  • Agent inference cost tables (6 agent types)
  • Agent pricing models (per-seat, per-task, outcome, token pass)
  • Agent ROI benchmarks (support, sales, code, data)
  • FinOps 6-step framework
  • FinOps tools directory (Cast AI, Finout, Cloudchipr & more)
  • FinOps optimization strategies & Quick Win Stack
  • Cost by org size benchmarks (startup to enterprise)
  • All calculators (LLM token, GPU cost, Agent TCO, FinOps savings)
  • LLM price/performance rankings
  • Live LLM pricing via OpenRouter API
  • CSV export
No card required · Full access for 14 days
Average customer saves $2,800/mo
That's 96x ROI on the Pro plan
Team
$79
/user/month
  • Everything in Pro
  • Priority support
  • Team collaboration:
  • Up to 10 seats
  • Custom price alerts
  • API access (10K calls/mo)
  • Shared dashboards
  • Team analytics
Enterprise
Custom
contact us
  • Everything in Team
  • Unlimited seats
  • Dedicated account manager
  • SLA guarantee
  • Custom onboarding
  • SSO & SAML (roadmap)
  • Real-time API (roadmap)
  • White-label option (roadmap)
Book a Demo or email hello@perffeco.com
Secure payments via Stripe
Cancel anytime, no lock-in
14-day money-back guarantee
Feature | Free | Pro | Team | Enterprise
LLM API pricing table
Frontier Model Economics dashboard
Benchmark leaderboard
CIS methodology & rankings
Cost per task estimates
Full GPU economics (7+ providers)
AI Agent economics
FinOps framework & tools
All calculators (LLM, GPU, Agent, FinOps)
Live data via OpenRouter API
CSV export
Custom price alerts
Team seats (up to 10)
API access: 10K/mo (Team), Unlimited (Enterprise)
Shared dashboards
Unlimited seats
Dedicated account manager
SLA guarantee
SSO / SAML: Roadmap (Enterprise)

Pricing FAQ

What happens after the 14-day trial?

After your trial ends, you'll automatically move to the Free tier unless you add a payment method. No charges are made during the trial. You keep access to all Free features permanently.

Can I switch plans at any time?

Yes. Upgrade or downgrade anytime from your Stripe billing portal. Upgrades take effect immediately with prorated billing. Downgrades take effect at the end of your current billing cycle.

Do you offer refunds?

Yes, we have a 14-day money-back guarantee. If you're not satisfied within 14 days of purchase, email hello@perffeco.com for a full refund, no questions asked.

Still not sure?

Book a free 15-minute call and we'll help you pick the right plan for your team.

Book a Call

AI Cost Reports

Instant, data-driven reports powered by Perffeco intelligence. Get actionable recommendations for your specific workload.

🎯

LLM Selection Report

Find the optimal model for your use case. Compares 22+ models on quality, cost, and latency for your specific workload.
$29
one-time payment
  • Top 3 model recommendations
  • Cost-per-task analysis for your volume
  • Quality vs price trade-off matrix
  • Model routing strategy
  • Monthly cost projection
💰

AI Cost Optimization Report

Comprehensive analysis of your AI spend with specific savings recommendations. Model routing, caching, and provider optimization.
$49
one-time payment
  • Current spend analysis
  • Model routing recommendations
  • Prompt caching opportunities
  • Provider comparison (GPU + API)
  • 90-day savings roadmap
  • Projected annual savings
🖥️

GPU Infrastructure Report

Custom GPU provider comparison for your workload. On-demand vs spot vs reserved analysis with hidden costs exposed.
$49
one-time payment
  • Provider comparison for your GPU type
  • On-demand vs spot vs reserved analysis
  • Hidden cost breakdown (egress, storage)
  • Multi-GPU cluster pricing
  • 12-month cost projection
🤖

Agent TCO Report

Full Total Cost of Ownership for your AI agent. Dev costs, inference projections, scaling analysis, and pricing strategy.
$49
one-time payment
  • Development cost estimate
  • Monthly inference projections
  • Year 1 & Year 2 TCO
  • Pricing model recommendation
  • ROI timeline & break-even
📊

Full FinOps Audit Report

Enterprise-grade comprehensive audit. Covers LLM spend, GPU infrastructure, agent economics, and a complete FinOps implementation roadmap. Everything above, combined.
$149
one-time payment · includes 30-min strategy call
  • Full LLM spend analysis
  • GPU infrastructure review
  • Agent economics breakdown
  • Model routing strategy
  • Provider optimization
  • Caching & batching plan
  • 90-day implementation roadmap
  • Projected savings (monthly + annual)
  • 30-minute strategy call included
  • Custom recommendations
Trusted by AI teams at
STARTUPS SCALE-UPS ENTERPRISE VCs

Your Account

Manage your profile, plan, and integrations.

Refer a Friend, Get $10 Credit

Share your link. When they subscribe to Pro, you both get $10 off your next bill.

Your referral link is generated automatically when you sign in. Credited within 24 hours of referral's first payment.

Build Your AI Stack

Answer 4 quick questions and get a personalised model + infrastructure recommendation.

What are you building?

Select your primary use case

💬
Chatbot
Customer support, FAQ, conversational
💻
Code Assistant
Code gen, review, debugging
📚
RAG / Search
Knowledge base, document Q&A
📊
Data Analysis
Extraction, summarisation, reports
🤖
AI Agent
Multi-step, tool-calling, autonomous

Expected volume?

How many API calls per day

🌱
Low
< 1,000 calls/day
📈
Medium
1K - 50K calls/day
🚀
High
50K+ calls/day

Quality priority?

Balance between cost and output quality

💰
Budget
Minimise cost, good enough quality
⚖️
Balanced
Best price/performance ratio
👑
Premium
Best quality, cost secondary

Open to self-hosting?

Self-hosting open-source models can save 80-95%

🖥️
Yes
Have GPU infra or willing to rent
☁️
No, API Only
Prefer managed API providers

Your Personalised AI Stack

Get the full recommendation as a detailed PDF with architecture diagram, cost projections, and migration guide.

Free summary. Paid detailed reports include 90-day roadmap.

Free Token Counter

Paste your text to estimate tokens and see what it would cost across different models.

0
Words
0
Characters
0
Est. Tokens
Model | Provider | Input Cost | Output Cost (same length)
GPT-4o Mini | OpenAI | $0.000 | $0.000
GPT-4o | OpenAI | $0.000 | $0.000
GPT-5 | OpenAI | $0.000 | $0.000
Claude Sonnet 4.6 | Anthropic | $0.000 | $0.000
Claude Opus 4.6 | Anthropic | $0.000 | $0.000
DeepSeek V3.2 | DeepSeek | $0.000 | $0.000
Gemini 2.5 Pro | Google | $0.000 | $0.000
Grok 3 | xAI | $0.000 | $0.000

Token estimation uses chars/4 approximation. Actual tokenisation varies by model.
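
A minimal Python sketch of the chars/4 estimate and the resulting input cost is shown below. The input prices are the per-1M-token figures quoted elsewhere on this page; rounding up to a whole token and the function names are assumptions, and real tokenizers will give different counts.

```python
import math

# Input prices in USD per 1M tokens, as quoted on this page.
INPUT_USD_PER_1M = {
    "GPT-4o Mini": 0.15,
    "DeepSeek V3.2": 0.14,
    "Gemini 2.5 Pro": 1.25,
    "Claude Sonnet 4.6": 3.00,
    "Claude Opus 4.6": 5.00,
}


def estimate_tokens(text):
    """Approximate token count as characters / 4, rounded up (assumption)."""
    return math.ceil(len(text) / 4)


def input_cost_usd(text, model):
    """Estimated cost of sending text as input to the given model."""
    return estimate_tokens(text) * INPUT_USD_PER_1M[model] / 1_000_000


sample = "Paste your text here to estimate tokens and cost. " * 20
print(f"{len(sample.split())} words, {len(sample)} chars, "
      f"~{estimate_tokens(sample)} tokens")
for model in INPUT_USD_PER_1M:
    print(f"{model:<18} ${input_cost_usd(sample, model):.6f}")
```

For billing-grade numbers, the provider's own tokenizer should replace estimate_tokens, since the ratio of characters to tokens varies by model and by language.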

See full LLM pricing comparison →

Terms of Service

Last updated: March 2026

1. Acceptance of Terms

By accessing or using Perffeco ("the Service"), you agree to be bound by these Terms of Service. If you do not agree to these terms, you may not access or use the Service.

2. Description of Service

Perffeco is an intelligence economics platform providing AI model pricing comparisons, GPU cloud cost analysis, benchmark data, and FinOps tools. Data is aggregated from public sources and provider APIs.

3. User Accounts

You are responsible for maintaining the confidentiality of your account credentials. You agree to notify us immediately of any unauthorised access to your account. You must be at least 18 years old to create an account.

4. Subscription & Billing

Paid plans are billed monthly or annually via Stripe. You may cancel at any time; access continues until the end of your billing period. Refunds are handled on a case-by-case basis within 14 days of purchase.

5. Data Accuracy

While we strive for accuracy, pricing and benchmark data is provided "as is" without warranties. Perffeco is not responsible for decisions made based on the data provided. Always verify critical pricing directly with providers.

6. API Usage

API access is subject to rate limits based on your plan. Automated scraping, redistribution of data, or use of the API to build competing products is prohibited without prior written consent.

7. Contact

For questions about these terms, contact us at hello@perffeco.com.

Privacy Policy

Last updated: March 2026

1. Information We Collect

We collect your email address and password hash when you create an account. We also collect usage analytics (pages viewed, features used) to improve the Service. Payment information is processed directly by Stripe and never stored on our servers.

2. How We Use Your Information

Your information is used to provide and improve the Service, manage your subscription, send service notifications, and respond to support requests. We do not sell or share your personal data with third parties for marketing purposes.

3. Data Storage

Your data is stored securely on Supabase infrastructure (hosted on AWS). We implement industry-standard security measures including encryption in transit and at rest, and row-level security policies.

4. Cookies

We use essential cookies for authentication and session management. No third-party tracking cookies are used. You can disable cookies in your browser settings, but this may affect Service functionality.

5. Your Rights (GDPR)

You have the right to access, correct, delete, or export your personal data at any time. You can delete your account from the Account page or by emailing us. We will respond to data requests within 30 days.

6. Contact

For privacy inquiries, contact us at hello@perffeco.com.

Frequently Asked Questions

Everything you need to know about Perffeco and our data.

Where does Perffeco source its data?

We aggregate data from official provider APIs, published technical reports, Epoch AI research, LM Arena leaderboards, and direct price-page scraping across multiple GPU and LLM API providers. All disclosed costs are lab-verified; estimated costs use the Epoch AI cost model.

How often is data updated?

Free tier data updates weekly. Pro subscribers get daily updates. Enterprise customers receive real-time feeds via our API. Pricing data is checked against live provider pages every 24 hours.

How is the Composite Intelligence Score (CIS) calculated?

CIS = MMLU x 0.15 + MMLU-Pro x 0.30 + GPQA x 0.35 + HLE x 0.20. This weighting reflects the discriminative power of each benchmark in 2025-2026: GPQA (PhD-level) has the highest weight, while MMLU is down-weighted due to near-saturation at 91%+ ceilings.

Can I export data or use an API?

Pro subscribers can export data as CSV. Team and Enterprise plans include API access with up to 10K calls/month (Team) or unlimited (Enterprise). Contact us for custom data feeds and white-label solutions.

What do the subscription plans include?

Free includes the Frontier Model Economics dashboard (interactive bubble chart, 6 axes), CIS methodology and all intelligence rankings (per-dollar, per-token, energy efficiency), LLM API pricing (10+ models, 6 providers), cost/task estimates, price deflation trends, the full benchmark leaderboard (10 models, 6 benchmarks), Arena Elo, SWE-bench Coding, MATH Reasoning views, benchmark guide, basic GPU pricing, and weekly updates. Pro ($29/mo) adds full GPU economics (multiple providers, H100 compare, price trends, hidden costs), complete AI Agent economics (dev cost tiers, inference tables, pricing models, ROI benchmarks), the full FinOps suite (6-step framework, tools directory, optimization strategies, Quick Win Stack, cost-by-org benchmarks), all four calculators (LLM token, GPU cost, Agent TCO, FinOps savings), LLM price/performance rankings, live data via OpenRouter API, daily updates, and CSV export. Team ($79/user/mo) adds up to 10 seats, shared dashboards, team cost analytics, priority support, API access (10K calls/mo), and custom alerts. Enterprise gets unlimited seats, real-time API (unlimited), custom dashboards and reports, SSO/SAML, a dedicated account manager, SLA guarantee, and white-label options.

How do I cancel my subscription?

You can cancel anytime from your Stripe billing portal — no lock-in, no questions asked. Your access continues until the end of your billing period. Email hello@perffeco.com if you need help.


Before you go — GPT-4o dropped 60% last year

Get notified when AI prices drop. 500+ engineers get our free Monday email with the week's biggest pricing changes, cheapest GPU deals, and one tip that saves real money.

No spam. Unsubscribe anytime.