Model Comparison

Claude 4 Opus (Non-reasoning) vs GPT-4.1 mini

Anthropic vs OpenAI

OpenAI's GPT-4.1 mini beats Anthropic's Claude 4 Opus (Non-reasoning) on both price and benchmarks — here's the full breakdown.

Data last updated March 4, 2026

GPT-4.1 mini is the clear winner — cheaper and higher-scoring than Claude 4 Opus (Non-reasoning). Claude 4 Opus (Non-reasoning) costs $0.15 per request vs $0.0036 for GPT-4.1 mini (at 5K input / 1K output tokens). There's little reason to choose Claude 4 Opus (Non-reasoning) unless you need its specific API features or ecosystem.

Benchmarks & Performance

Metric Claude 4 Opus (Non-reasoning) GPT-4.1 mini
Intelligence Index 22.2 22.9
MMLU-Pro 0.9 0.8
GPQA 0.7 0.7
AIME 0.6 0.4
Output tokens/sec 36.2 70.6
Time to first token 1.34s 0.48s
Context window 1,000,000 1,047,576

Pricing per 1M Tokens

List prices as published by the provider. Not adjusted for token efficiency.

Metric Claude 4 Opus (Non-reasoning) GPT-4.1 mini
Input price / 1M tokens $15.00 $0.40
Output price / 1M tokens $75.00 $1.60
Cache hit price / 1M tokens $1.50 $0.10

Intelligence vs Price

15 20 25 30 35 40 $0.002 $0.005 $0.01 $0.02 $0.05 $0.1 $0.2 Typical request cost (5K input + 1K output) Intelligence Index Gemini 2.5 Pro DeepSeek R1 0528 GPT-4.1 Claude 4 Sonnet... Claude 4.5 Sonn... Gemini 2.5 Flas... Grok 3 mini Rea... Claude 4 Opus (Non-reasoning) GPT-4.1 mini
Claude 4 Opus (Non-reasoning) GPT-4.1 mini Other models

Value Analysis

Cost per IQ point based on a typical request of 5,000 input and 1,000 output tokens.

Cheaper (list price)

GPT-4.1 mini

Higher Benchmarks

GPT-4.1 mini

Better Value ($/IQ point)

GPT-4.1 mini

Claude 4 Opus (Non-reasoning)

$0.0068 / IQ point

GPT-4.1 mini

$0.0002 / IQ point

Frequently Asked Questions

How much cheaper is GPT-4.1 mini than Claude 4 Opus (Non-reasoning)?

GPT-4.1 mini is dramatically cheaper — 42x less per request than Claude 4 Opus (Non-reasoning). GPT-4.1 mini is cheaper on both input ($0.4/M vs $15.0/M) and output ($1.6/M vs $75.0/M). At a fraction of the cost, GPT-4.1 mini saves significantly in production workloads. This comparison assumes a typical request of 5,000 input and 1,000 output tokens (5:1 ratio). Actual ratios vary by workload — chat and completion tasks typically run 2:1, code review around 3:1, document analysis and summarization 10:1 to 50:1, and embedding workloads are pure input with no output tokens.

Do Claude 4 Opus (Non-reasoning) and GPT-4.1 mini score the same on benchmarks?

Claude 4 Opus (Non-reasoning) and GPT-4.1 mini score nearly the same overall (22.2 vs 22.9). Claude 4 Opus (Non-reasoning) leads on MMLU-Pro (0.86 vs 0.78), GPQA (0.7 vs 0.66), AIME (0.56 vs 0.43). If mathematical reasoning matters, Claude 4 Opus (Non-reasoning)'s AIME score of 0.56 gives it an edge.

Which generates output faster, Claude 4 Opus (Non-reasoning) or GPT-4.1 mini?

GPT-4.1 mini is 95% faster at 70.6 tokens per second compared to Claude 4 Opus (Non-reasoning) at 36.2 tokens per second. GPT-4.1 mini also starts generating sooner at 0.48s vs 1.34s time to first token. The speed difference matters for chatbots but is less relevant in batch processing.

Which has a larger context window, Claude 4 Opus (Non-reasoning) or GPT-4.1 mini?

GPT-4.1 mini has a 5% larger context window at 1,047,576 tokens vs Claude 4 Opus (Non-reasoning) at 1,000,000 tokens. That's roughly 1,396 vs 1,333 pages of text. The extra context capacity in GPT-4.1 mini matters for document analysis and long conversations.

Is GPT-4.1 mini worth choosing over Claude 4 Opus (Non-reasoning) on value alone?

GPT-4.1 mini offers dramatically better value — $0.0002 per intelligence point vs Claude 4 Opus (Non-reasoning) at $0.0068. GPT-4.1 mini is both cheaper and higher-scoring, making it the clear value pick. You don't sacrifice quality to save money with GPT-4.1 mini.

How does prompt caching affect Claude 4 Opus (Non-reasoning) and GPT-4.1 mini pricing?

With prompt caching, GPT-4.1 mini is dramatically cheaper — 39x less per request than Claude 4 Opus (Non-reasoning). Caching saves 45% on Claude 4 Opus (Non-reasoning) and 42% on GPT-4.1 mini compared to standard input prices. Both models benefit from caching at similar rates, so the uncached price comparison holds.

Pricing verified against official vendor documentation. Updated daily. See our methodology.

Related Comparisons

Stop guessing. Start measuring.

Create an account, install the SDK, and see your first margin data in minutes.

See My Margin Data

No credit card required