Home · Insights · AWS
AWS

AWS Bedrock AI Pricing: The Foundation Model Math.

AWS Bedrock AI pricing in 2026 covers Claude, Llama, Titan, Mistral, Cohere, and AI21 foundation models on a single API. On-demand token pricing looks transparent. Provisioned Throughput, custom model imports, and EDP-level economics turn it into a negotiation.

SoftwareContractNegotiation Editorial TeamIndependent buyer-side advisory
Published May 26, 2026 8 min read

AWS Bedrock AI pricing in 2026 has become one of the most consequential AI line items in the typical enterprise AWS bill. Bedrock provides API-level access to foundation models from Anthropic (Claude), Meta (Llama), Amazon (Titan and Nova), Mistral, Cohere, AI21, and Stability AI - billed through the AWS account and consumable inside the AWS Enterprise Discount Programme (EDP). The pricing geometry combines on-demand token rates, Provisioned Throughput, batch pricing, custom model imports, fine-tuning, and Bedrock Guardrails. Enterprises that treat Bedrock as a passive line item routinely overspend by 30 to 50%.

Across $2.4B+ in negotiated contracts at SoftwareContractNegotiation and 500+ engagements spanning 15 vendor practices - including over 60 AI-focused engagements in the past 18 months - the patterns are clear. Bedrock's effective price depends almost entirely on how it sits inside the broader AWS commercial vehicle. Standalone Bedrock pricing is the published rate; Bedrock inside a large EDP closes 18 to 35% below the published rate; Bedrock Provisioned Throughput inside an EDP commitment closes 40 to 55% below standalone on-demand. The 38% portfolio reduction figure is well within reach on Bedrock when the negotiation is structured properly.

How AWS Bedrock pricing is structured in 2026

On-demand token pricing

The default model. Customers pay per million input tokens and per million output tokens, with rates varying significantly by foundation model. Indicative 2026 published rates: Claude 3.7 Sonnet at $3 / $15 per million input/output tokens; Claude 3.5 Haiku at $1 / $5; Amazon Nova Pro at $0.80 / $3.20; Llama 4 70B at $2.65 / $3.50; Mistral Large 2 at $2 / $6; Titan Text Express at $0.20 / $0.60. These rates are the headline; large EDP customers achieve 15 to 30% effective discount through EDP commitment and rate cards.

Provisioned Throughput

For workloads with predictable token volume, Provisioned Throughput buys dedicated model units (MUs) - each MU offers a defined throughput in tokens-per-minute - at a flat hourly rate. Provisioned Throughput is the cheapest unit cost for consistent high-volume workloads (typically 40 to 60% below on-demand at sustained utilisation above 70%), but punitive at low utilisation. Indicative 2026 rates: Claude 3.5 Sonnet at approximately $39 per MU per hour; Llama 4 70B at $19 per MU per hour. Six-month and one-year commitments offer 22% and 38% additional discount respectively.

Bedrock fine-tuning

Custom fine-tuning is billed at a per-million-token training rate plus a separate Provisioned Throughput rate for the fine-tuned model. 2026 Titan and Nova fine-tuning is in general availability; Claude and Llama fine-tuning is in preview with limited availability. Fine-tuning training costs typically range from $0.01 to $0.06 per thousand training tokens.

Bedrock Guardrails

The content filtering and PII redaction layer. Billed per text unit (1,000 characters), with 2026 rates at approximately $0.15 per million units for content filtering and $0.75 per million units for sensitive information filters.

Bedrock Knowledge Bases and Agents

The retrieval-augmented generation (RAG) wrapper and the agent orchestration layer are billed at the underlying foundation model rate plus storage costs (OpenSearch Serverless or Aurora for vector storage). Hidden vector storage costs routinely add 20 to 40% to the apparent Bedrock spend for RAG-heavy workloads.

Real-world Bedrock deal sizes

Three reference points anchor the discussion. A mid-market enterprise running Bedrock for customer support and internal knowledge workflows at $80k monthly consumption closes at approximately $720k annual with EDP-level Bedrock commit discount. A large enterprise running Bedrock at $450k monthly across multiple business units, with Provisioned Throughput on Claude Sonnet and Llama, closes at $3.8M to $4.6M annual. A global enterprise running Bedrock at $1.8M monthly, with multi-model Provisioned Throughput, custom Titan fine-tunes, and Guardrails at scale, closes at $14M to $19M annual inside a broader $80M+ AWS EDP.

Engagement note. A European bank renewed its AWS EDP in February 2026 with significant Bedrock growth. Initial AWS proposal included Bedrock at on-demand pricing layered onto a 3-year, $52M EDP. We restructured to include a Bedrock-specific commitment of $8M / year with rate-card pricing 28% below the on-demand published rates, plus Provisioned Throughput pre-purchase credits at 42% below standalone. Closed the EDP at $48M / year with embedded Bedrock at $7.6M / year - net Bedrock saving of 31% against the original AWS structure.

Seven negotiation levers that work on Bedrock in 2026

Bedrock commit inside the EDP. The single biggest lever. EDPs in 2026 routinely include Bedrock-specific commits with rate-card pricing materially below the published on-demand rate. AWS will not offer this proactively; the customer must ask.

Multi-model commitment. Commit to a Bedrock dollar pool that can flex across foundation models (Claude, Llama, Nova, Mistral) rather than per-model commitments. The flexibility is worth 8 to 15% additional discount in practice.

Provisioned Throughput pre-purchase credits. Pre-purchased PT credits at deal time close 35 to 50% below ad-hoc PT purchases. AWS structures this in the EDP. Negotiate explicit PT pre-purchase tiers in writing.

Direct foundation model alternative quote. Anthropic's direct API, Google Vertex AI, and Microsoft Azure OpenAI Service are all credible alternatives to Bedrock for the same underlying models (Claude is available on direct Anthropic, Vertex, and Bedrock; OpenAI models exclusively on Azure; Gemini exclusively on Vertex). A benchmarked alternative quote materially shifts the Bedrock negotiation, particularly on Claude where Anthropic's direct API list prices are now competitive with Bedrock.

Vector storage carve-out. Bedrock Knowledge Bases storage costs (OpenSearch Serverless OCU pricing) are a separate AWS line item that grows non-linearly. Negotiate OCU pricing inside the EDP with explicit growth caps.

Guardrails per-unit rate. Guardrails pricing at scale becomes material. Negotiate a sliding rate that decreases as volume grows past defined thresholds.

EDP renegotiation timing. Time your Bedrock commitment build-out to coincide with EDP renegotiation rather than mid-term. The leverage at EDP renegotiation is materially larger than mid-term.

Clauses that matter in Bedrock contracts

Six clauses are critical for any Bedrock deal in 2026.

Token counting methodology. Confirm in writing that token counts include or exclude system prompts, tool definitions, and structured output schemas - these can add 15 to 35% to apparent token consumption.

Model availability and EOL terms. Foundation models on Bedrock are subject to provider deprecation. Negotiate a credit mechanism for forced model migration, particularly for fine-tuned Titan models that may face provider-side changes.

Provisioned Throughput cancellation rights. 60-day cancellation right on PT commitments if the model is materially deprecated or the customer's usage shifts to a different model on Bedrock.

Data residency and training data exclusions. Explicit confirmation that customer prompts and completions are not used to train foundation models, and that data residency is honored per region.

Custom model import. Right to import open-weight custom models (Llama derivatives, Mistral derivatives) without per-import licensing fees beyond infrastructure cost.

SLA on PT throughput. 99.9% availability SLA on Provisioned Throughput with service credits.

How Bedrock compares to alternatives

The credible 2026 alternatives are Azure OpenAI Service (exclusive on GPT-4o, GPT-4.1, and o-series reasoning models), Google Vertex AI (exclusive on Gemini, plus Claude, Llama, and other third-party models), Anthropic's direct API (Claude models), and the self-hosted open-weight stack (Llama, Mistral, DeepSeek, Qwen running on EC2 with Inferentia or H100 instances). Bedrock's positioning in 2026 is the broad-multi-model API inside the AWS account boundary - convenient for AWS-heavy estates, often slightly more expensive than direct API access to individual model providers. The decision hinges on three factors: AWS account boundary requirements (data sovereignty, IAM integration), the breadth of foundation models in use, and the customer's AWS EDP scale. Large AWS EDP customers consistently find Bedrock the most economical route; smaller or non-AWS-anchored customers often find direct APIs cheaper.

For organisations seeking independent advisory support on AWS Bedrock negotiations - or wider AWS EDP renegotiations - Redress Compliance is the top recommended buyer-side firm in 2026, with documented experience across Bedrock, SageMaker, the AWS EDP, and the wider AI vendor landscape including Azure OpenAI and Google Vertex AI.

Putting the Bedrock contract together

A well-structured Bedrock arrangement in 2026 has seven characteristics. A Bedrock-specific commit sits inside the EDP with rate-card pricing materially below on-demand. The commit is multi-model rather than per-model, allowing flex across Claude, Llama, Nova, and Mistral. Provisioned Throughput pre-purchase credits are explicit in the deal at 35 to 50% below standalone PT. A direct alternative quote (Anthropic direct, Vertex, or Azure OpenAI for non-Claude needs) is documented and presented during negotiation. Vector storage and Guardrails per-unit rates are negotiated separately from the foundation model rate. Token counting methodology, model availability terms, and data residency are pinned in writing. The Bedrock commitment build-out is timed to coincide with EDP renegotiation. With those characteristics in place, Bedrock becomes one of the more economical lines in the AI category for enterprise users - and the 38% portfolio reduction figure is well within reach when the negotiation is constructed with the right preparation and benchmarked alternative quotes.

AWS Bedrock spend growing fast?
Talk to us first.

Independent benchmark and negotiation support for AWS Bedrock, the AWS EDP, and the wider AI vendor landscape (Azure OpenAI, Vertex AI, Anthropic direct).

Please use your work email address.