AI Model Hosting Agreements

AI Infrastructure & Compute

Definition

AI model hosting agreements establish the legal and economic frameworks under which infrastructure providers deploy third-party AI models. They typically cover four areas:

  • Revenue allocation: the infrastructure provider receives 30-50% of API revenue (covering compute, bandwidth, and support costs) while the model developer retains 50-70% (compensating R&D and training expenses). Tiered structures provide volume discounts that raise the developer's share at scale (for example, 60/40 below $10K monthly, 70/30 from $10K to $100K, 80/20 above $100K).
  • Performance SLAs: uptime guarantees (99.9% is the industry standard, up to 99.99% on premium and enterprise tiers), latency commitments (p95 under 500ms for inference, p99 under 2 seconds), throughput minimums (100-10,000 requests/second depending on tier), and failure penalties (service credits of 10-25% of monthly fees, with termination rights for repeated breaches).
  • Liability and indemnification: the provider is responsible for infrastructure failures (data loss, security breaches, downtime), the developer is responsible for model outputs (IP infringement, harmful content, accuracy), mutual hold-harmless clauses cover third-party claims, and insurance requirements run $1M-$10M+ in coverage.
  • Usage restrictions: acceptable use policies prohibiting harmful content generation, competitive intelligence gathering, model extraction/distillation attempts, and rate limit violations.

Common structures include API marketplace licensing (Hugging Face hosts 100K+ open source models, charges users directly, and pays creators 70-90% after a 10-30% platform fee), enterprise dedicated deployments (Anthropic partners with AWS to host Claude models: AWS provides infrastructure and enterprise sales, Anthropic retains model IP and the majority revenue share), and managed inference platforms (Replicate and Together AI charge end users for compute and compensate model developers based on usage after deducting infrastructure costs, typically at 60-70% gross margins).

Why it matters

Model hosting agreements determine value capture in the $30B+ AI infrastructure market as model development and deployment are increasingly separated. Key dynamics:

  • Vertical integration versus specialization: OpenAI and Anthropic, hosting their own models, retain 100% of the economics but bear capital-intensive costs ($100M+ in annual infrastructure spend), while specialized model developers (Stability AI, Mistral) partnering with hosting platforms (AWS, Replicate) trade a 30-50% revenue share for zero infrastructure burden and instant global distribution.
  • Lock-in versus portability: providers offering the lowest hosting fees often embed restrictive export provisions (fine-tuned models cannot migrate to competitors), so enterprise customers pay a 20-40% premium for portability rights that preserve vendor negotiating leverage.
  • Data ownership ambiguity: consumer agreements often grant hosting platforms rights to train on user interactions (improving base models), while enterprise agreements prohibit training on sensitive data; enforcement is difficult because it requires audit rights and technical controls that model developers often lack.

Real-world friction points: Anthropic/AWS partnership negotiations were reportedly contentious over the revenue split (Anthropic wanting 80%+, AWS countering with 50/50); Stability AI faced hosting cost pressure leading to AWS dependency (90%+ of inference on AWS infrastructure, creating pricing vulnerability); Meta released Llama as open source, partially avoiding hosting negotiations (users self-host, eliminating revenue share to intermediaries). Understanding hosting agreements is critical for model developers optimizing go-to-market (self-host versus partner, exclusive versus multi-platform), infrastructure investors evaluating moats (revenue share agreements create sticky relationships but face competitive threats from hyperscaler bundling), and enterprises negotiating deployments (data rights, portability, SLA enforcement mechanisms).

Common misconceptions

  • Revenue shares aren't uniform—vary dramatically by: model performance (GPT-4-class commands 70-80% share, smaller models 40-60%), developer negotiating power (OpenAI/Anthropic dictate terms, startups accept unfavorable splits), and exclusivity (exclusive hosting partnerships justify higher revenue shares to model developer compensating lock-in).
  • SLA credits don't fully compensate downtime—typical penalty 10-25% of monthly fees for breaches, but business impact often 10-100x (lost revenue, customer churn, reputational damage). Critical applications require multi-provider redundancy not reliance on SLA enforcement.
  • Indemnification provisions aren't symmetric—model developers accept broad indemnification for outputs (promising to defend hosting platform against user lawsuits), hosting platforms narrowly limit indemnification to willful misconduct. Reflects risk allocation—model behavior unpredictable, infrastructure failures preventable.

Technical details

Revenue sharing models and economic structures

API marketplace commission structures: Platform fee ranges: 10-30% of gross revenue (Hugging Face 10%, Replicate 20-30%, Together AI 25%). Covers: Infrastructure costs (GPU compute, bandwidth), platform development, customer acquisition, payment processing. Model creator share: 70-90% of revenue. Volume tiers: <$1K monthly → 70/30 split. $1K-$10K → 75/25 split. $10K-$100K → 80/20 split. $100K+ → 85/15 split (incentivizing high-volume creators). Payment terms: Monthly net-30 (platform pays creator 30 days after month-end), minimum thresholds ($100-$500 before payout preventing micro-transactions), cryptocurrency options (USDC/USDT for international creators avoiding banking friction).
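
As a minimal illustration of the volume-tier mechanics above, the sketch below computes a monthly creator payout using the tiers and a $100 minimum payout threshold from the ranges above; the function name and the whole-month (non-marginal) tier treatment are illustrative assumptions, not terms of any specific platform.

```python
# Hypothetical marketplace payout under the illustrative volume tiers above.
MARKETPLACE_TIERS = [            # (monthly revenue ceiling in USD, creator share)
    (1_000, 0.70),
    (10_000, 0.75),
    (100_000, 0.80),
    (float("inf"), 0.85),
]

MIN_PAYOUT = 100                 # assumed minimum payout threshold in USD


def creator_payout(monthly_revenue: float) -> float:
    """Creator's payout for one month of gross API revenue.

    Assumes the share for the tier the month's revenue falls into applies to
    the whole month (not marginal brackets), which is one common but not
    universal reading of such agreements.
    """
    share = next(s for ceiling, s in MARKETPLACE_TIERS if monthly_revenue < ceiling)
    payout = monthly_revenue * share
    return payout if payout >= MIN_PAYOUT else 0.0   # held until threshold is met


for revenue in (500, 5_000, 50_000, 250_000):
    print(f"${revenue:>9,} gross -> ${creator_payout(revenue):>10,.2f} to creator")
```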

Enterprise licensing and revenue allocation: Dedicated deployment model: Enterprise pays $50K-$500K+ monthly for private model deployment. Split: 60-70% to model developer (compensating training costs, ongoing development), 30-40% to infrastructure provider (covering hosting, support, SLAs). Minimum commitments: $100K-$1M annual contract value, 1-3 year terms. Overages: Usage exceeding committed capacity billed at premium rates (120-150% of base rate). Volume discounts: $1M+ annual → 10-20% discount. Multi-year commitment → 15-25% discount. Upfront payment → 10-15% discount.
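
A rough sketch of how such an enterprise invoice might be computed and split, assuming a 130% overage multiplier and a 65% developer share from the ranges above; the function and parameter names are hypothetical.

```python
# Hypothetical enterprise invoice: committed capacity plus premium-rate overages,
# split between model developer and infrastructure provider as described above.
def enterprise_invoice(committed_monthly_fee: float,
                       committed_units: float,
                       used_units: float,
                       overage_multiplier: float = 1.3,   # within the 120-150% range
                       developer_share: float = 0.65) -> dict:
    base_rate = committed_monthly_fee / committed_units
    overage_units = max(0.0, used_units - committed_units)
    overage_fee = overage_units * base_rate * overage_multiplier
    total = committed_monthly_fee + overage_fee
    return {
        "total": total,
        "model_developer": total * developer_share,
        "infrastructure_provider": total * (1 - developer_share),
    }


# $200K/month commitment covering 100M tokens, 130M tokens actually consumed.
print(enterprise_invoice(200_000, committed_units=100e6, used_units=130e6))
```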

Hybrid consumption and subscription: Base subscription: $1K-$10K monthly provides included usage (1M-10M tokens, 10K-100K requests). Covers platform access, basic support, SLA guarantees. Overage pricing: $1-$10 per additional 1M tokens consumed. Tiered rates: Initial overage expensive (discouraging usage spikes), higher tiers cheaper (rewarding growth). Revenue split: Subscription revenue 50/50 (covers fixed costs), overage revenue 70/30 model developer (marginal costs low, incentivize adoption). Annual reconciliation: True-up based on actual consumption versus estimate, refund or additional billing.
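
The sketch below works one hybrid bill under assumed overage tiers (an expensive first bucket, cheaper buckets thereafter) and the 50/50 subscription and 70/30 overage splits described above; the tier sizes and per-token prices are illustrative.

```python
# Hypothetical hybrid subscription + tiered-overage bill and revenue split.
OVERAGE_TIERS = [            # (tokens in bucket, price per 1M tokens in USD)
    (5_000_000, 10.0),       # first overage bucket priced highest
    (45_000_000, 5.0),       # next bucket cheaper
    (float("inf"), 2.0),     # everything beyond is cheapest
]


def hybrid_bill(subscription: float, included_tokens: int, used_tokens: int) -> dict:
    overage_fee, remaining = 0.0, max(0, used_tokens - included_tokens)
    for bucket, price_per_million in OVERAGE_TIERS:
        in_bucket = min(remaining, bucket)
        overage_fee += in_bucket / 1_000_000 * price_per_million
        remaining -= in_bucket
        if remaining <= 0:
            break
    return {
        "total": subscription + overage_fee,
        "model_developer": subscription * 0.50 + overage_fee * 0.70,
        "platform": subscription * 0.50 + overage_fee * 0.30,
    }


# $5K subscription with 10M included tokens, 25M tokens actually consumed.
print(hybrid_bill(subscription=5_000, included_tokens=10_000_000, used_tokens=25_000_000))
```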

Performance-based compensation: Base plus success fees: Model developer receives 50% base share + additional 10-20% if performance targets met (accuracy thresholds, latency SLAs, user satisfaction scores). Risk-reward alignment: Developer incentivized to optimize model quality not just adoption. Clawback provisions: If performance degrades post-deployment, reduce developer share until remediated or terminate agreement. Revenue milestones: Escalating splits as revenue grows (50/50 <$1M annual, 60/40 $1M-$5M, 70/30 >$5M) rewarding scale achievement.
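
A minimal sketch combining the milestone splits and success fee described above; the 10-point bonus and the revenue thresholds are illustrative assumptions.

```python
# Hypothetical developer share: milestone-based base split plus a success fee.
def developer_share(annual_revenue: float, targets_met: bool) -> float:
    """Fraction of revenue owed to the model developer for the year."""
    if annual_revenue < 1_000_000:
        base = 0.50          # 50/50 below $1M annual revenue
    elif annual_revenue < 5_000_000:
        base = 0.60          # 60/40 from $1M to $5M
    else:
        base = 0.70          # 70/30 above $5M
    bonus = 0.10 if targets_met else 0.0    # assumed 10-point success fee
    return base + bonus


print(developer_share(3_000_000, targets_met=True))   # -> 0.7
```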

Service level agreements and performance guarantees

Uptime and availability commitments: Standard tier: 99.9% monthly uptime (43 minutes downtime allowed), no penalty for breaches, best-effort restoration. Premium tier: 99.95% uptime (22 minutes downtime), service credits 10% of monthly fees per 0.1% breach. Enterprise tier: 99.99% uptime (4 minutes downtime), 25% credits per breach, termination rights after 3 consecutive months of failures. Exclusions: Scheduled maintenance (announced 7+ days advance, during low-traffic windows), third-party failures (AWS region outage), force majeure events.
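
A worked sketch of the premium-tier credit rule above (10% of monthly fees per 0.1% of uptime shortfall against the 99.95% target); capping the credit at the monthly fee is an assumption, not a stated term.

```python
# Hypothetical premium-tier service-credit calculation for a 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60          # 43,200 minutes


def premium_service_credit(downtime_minutes: float, monthly_fee: float,
                           target: float = 0.9995,
                           credit_per_tenth_pct: float = 0.10) -> float:
    uptime = 1 - downtime_minutes / MINUTES_PER_MONTH
    shortfall = max(0.0, target - uptime)             # e.g. 0.0005 means 0.05%
    breach_units = shortfall / 0.001                  # one unit per 0.1% of shortfall
    credit = monthly_fee * credit_per_tenth_pct * breach_units
    return min(credit, monthly_fee)                   # assume credits capped at the fee


# 90 minutes of downtime (~99.79% uptime) against a $50K monthly fee.
print(round(premium_service_credit(90, 50_000), 2))   # -> 7916.67
```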

Latency and throughput SLAs: Inference latency: p50 <200ms, p95 <500ms, p99 <2 seconds. Measurement: Averaged over 1-hour windows, 99% of windows must meet targets. Throughput: Guaranteed requests per second based on tier (100 RPS standard, 1,000 RPS premium, 10,000+ RPS enterprise). Burst handling: 2-3x burst capacity for short periods (<5 minutes), exceeding burst triggers rate limiting not failures. Penalties: 5% credits if p95 latency exceeds 500ms for >5% of monthly hours. 10% credits if throughput falls below 80% of guaranteed minimum.
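
The sketch below evaluates the p95 latency rule above over hourly measurement windows; the percentile estimator and the simple breach-fraction test are illustrative choices rather than contract language.

```python
# Hypothetical latency-SLA check: p95 per hourly window must stay under 500ms;
# breaching more than 5% of the month's windows triggers a 5% service credit.
import statistics


def p95(latencies_ms: list[float]) -> float:
    """Approximate 95th percentile of one hour of request latencies."""
    return statistics.quantiles(latencies_ms, n=100)[94]


def latency_credit(hourly_latencies: list[list[float]], monthly_fee: float) -> float:
    breached = sum(1 for hour in hourly_latencies if p95(hour) > 500)
    breach_fraction = breached / len(hourly_latencies)
    return monthly_fee * 0.05 if breach_fraction > 0.05 else 0.0
```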

Data protection and security: Encryption: TLS 1.3 for data in transit, AES-256 for data at rest. Key management: Customer-managed keys optional (enterprise tier), platform-managed keys standard. Access controls: Role-based access, multi-factor authentication required, IP whitelisting available. Audit logging: 90-day retention, exportable logs, real-time alerting on suspicious activity. Compliance certifications: SOC2 Type II (standard), HIPAA available (premium tier with BAA), GDPR/CCPA compliant data handling. Breach notification: 24-72 hour disclosure to affected parties, forensics and remediation at platform expense.
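
One way to make that security baseline operational is to express it as a machine-checkable configuration; the field names and validation logic below are illustrative, not any platform's actual schema.

```python
# Hypothetical security baseline and a check of a deployment config against it.
SECURITY_BASELINE = {
    "tls_version_min": "1.3",             # data in transit
    "at_rest_encryption": "AES-256",      # data at rest
    "mfa_required": True,
    "audit_log_retention_days": 90,
    "breach_notification_hours_max": 72,
    "certifications": ["SOC2 Type II"],   # HIPAA/BAA assumed premium-tier only
}


def baseline_failures(deployment: dict) -> list[str]:
    """Return the baseline keys that the deployment config fails to satisfy."""
    failures = []
    for key, required in SECURITY_BASELINE.items():
        value = deployment.get(key)
        if key == "audit_log_retention_days":
            ok = value is not None and value >= required
        elif key == "breach_notification_hours_max":
            ok = value is not None and value <= required
        elif key == "certifications":
            ok = set(required) <= set(value or [])
        else:
            ok = value == required
        if not ok:
            failures.append(key)
    return failures
```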

Model performance and accuracy: Baseline performance: Model must maintain training-time accuracy benchmarks (specified in agreement), degradation >10% triggers developer remediation obligations. Monitoring and testing: Platform runs evaluation datasets weekly/monthly, shares results with developer. Drift detection: Automated monitoring for prediction drift, data drift, concept drift. Update procedures: Developer provides model updates addressing degradation, platform deploys within 7-14 days testing cycles. Version rollback: If update degrades performance, automatic rollback to previous version.
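
A minimal sketch of the >10% degradation trigger described above, assuming a relative (not absolute) accuracy drop against the contractual baseline; names and thresholds are illustrative.

```python
# Hypothetical check of periodic evaluation results against the agreed baseline.
def degradation(baseline_accuracy: float, current_accuracy: float) -> float:
    """Relative accuracy drop versus the contractual baseline (0.0 = no drop)."""
    return max(0.0, (baseline_accuracy - current_accuracy) / baseline_accuracy)


def evaluate_deployment(baseline_accuracy: float, current_accuracy: float,
                        threshold: float = 0.10) -> str:
    if degradation(baseline_accuracy, current_accuracy) > threshold:
        return "remediation_required"     # developer must ship a fix or face rollback
    return "within_baseline"


print(evaluate_deployment(0.92, 0.80))    # ~13% relative drop -> remediation_required
```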

Liability allocation and risk management

Infrastructure failure liability: Provider responsible for: Data loss (corruption, deletion, unauthorized access), security breaches (hacking, credential theft, DDoS), downtime exceeding SLAs, performance degradation from infrastructure issues. Remedies: Service credits (10-25% monthly fees), termination rights (persistent failures), actual damages (if gross negligence or willful misconduct proven, otherwise limited to fees paid). Exclusions: User error, third-party services, acts of God.

Model output liability and indemnification: Developer responsible for: IP infringement (model outputs copying copyrighted content), harmful content generation (despite usage policies), inaccuracy (model providing false information causing damages), privacy violations (model trained on personal data without consent). Indemnification obligations: Developer defends and indemnifies platform against third-party claims arising from model outputs. Insurance requirements: Developer maintains errors & omissions coverage ($1M-$10M+ limits), names platform as additional insured. Limitations: Indemnification void if platform modified model, ignored usage policies, or had actual knowledge of violations.

User-generated content and Section 230: Platform protections: Section 230 (US) and equivalents shield platforms from user-generated content liability—platform not liable for how users employ models. Moderation obligations: Platform must: Implement acceptable use policies, respond to abuse reports within 24-72 hours, suspend violating users. Safe harbor: Following prescribed procedures immunizes platform from liability. Knowledge exception: If platform has actual knowledge of illegal activity and fails to act, loses protection—creates tension between monitoring (knowledge risk) and willful blindness (inadequate enforcement).

Data ownership and usage rights: Customer data: Inputs and outputs from enterprise deployments owned by customer, platform prohibited from training on this data, strict confidentiality obligations. Consumer data: Free/public tier users typically grant platform rights to use inputs/outputs for model improvement (subsidizes free access). Model weights: Remain property of developer, platform receives limited license to host/serve, cannot share with competitors or use for own model development. Fine-tuned models: Ownership depends on the agreement: customer owns with export rights, customer owns without export rights, or shared ownership (customer can use the model, platform can incorporate learnings).

Competitive dynamics and market positioning

Vertical integration strategies: Self-hosting advantages: OpenAI, Anthropic, Google retain 100% economics, full control over user experience, capture data for improvement, maintain IP security. Challenges: Capital intensive ($100M-$500M+ annual infrastructure), operational complexity (24/7 SRE teams), slower global expansion (must build data centers). Partnership advantages: Specialized model developers (Stability AI, Cohere) leverage hosting platforms for instant distribution, zero infrastructure burden, focus resources on model development. Trade-offs: 30-50% revenue share, platform lock-in risk, reduced control over user experience.

Platform competition and differentiation: Hyperscaler bundling: AWS/GCP/Azure bundling model hosting with existing enterprise relationships (discount 10-30% for combined usage), integrated billing/management, regulatory compliance (HIPAA, FedRAMP). Threats: Commoditization, pricing pressure, hyperscaler preferential self-dealing (promoting own models). Specialized platforms: Hugging Face, Replicate, Together AI focusing exclusively on AI models, superior developer experience, cutting-edge model availability, community network effects. Advantages: Innovation velocity, flexibility, model diversity. Disadvantages: Smaller scale, higher prices, less enterprise credibility.

Open source versus proprietary models: Open source hosting economics: Platforms charge only infrastructure/convenience premium (2-5x base compute cost) not model licensing, rely on volume and value-added services (fine-tuning, deployment tools, monitoring). Competition: Users can self-host open models avoiding platform fees entirely—platforms must provide 3-5x value (ease of use, performance optimization, support) justifying premium. Proprietary model licensing: Closed models (GPT-4, Claude) command 50-100x infrastructure cost in API pricing reflecting model value, more defensible margins but concentration risk (few proprietary models, losing partnership catastrophic).
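
A rough break-even sketch of the self-host versus managed-platform choice implied above, assuming a 3x platform markup on raw compute and a fixed monthly operations overhead for self-hosting; all cost figures are illustrative assumptions.

```python
# Hypothetical monthly cost comparison: managed platform markup vs. self-host ops burden.
def monthly_costs(gpu_hours: float,
                  gpu_hour_cost: float = 2.0,               # assumed raw compute $/GPU-hour
                  platform_markup: float = 3.0,             # within the 2-5x range above
                  selfhost_ops_overhead: float = 25_000.0): # assumed SRE/tooling cost
    managed = gpu_hours * gpu_hour_cost * platform_markup
    self_hosted = gpu_hours * gpu_hour_cost + selfhost_ops_overhead
    return {"managed_platform": managed, "self_hosted": self_hosted}


# At low volume the platform premium undercuts the standing ops overhead;
# the balance flips as inference volume grows.
for hours in (1_000, 10_000, 50_000):
    print(hours, monthly_costs(hours))
```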

Enterprise versus consumer positioning: Enterprise focus: Higher margins (50-70% gross), long-term contracts (1-3 years), customer success teams, dedicated infrastructure, compliance certifications. Challenges: Longer sales cycles (6-12 months), customization demands, pricing pressure from procurement. Consumer focus: Lower margins (20-40% gross), viral growth, self-serve, shared infrastructure. Benefits: Rapid scaling, low customer acquisition cost, network effects. Risks: Churn, support burden (millions of users), payment fraud.
