AI Model Registry - Compare LLM Costs & Providers

Providers: Anthropic, AWS Bedrock, Azure OpenAI, Baseten, Canopy Wave, Cerebras, Chutes, DeepInfra, DeepSeek, Fireworks, Google AI Studio, Groq, Helicone, Mistral AI, Nebius Token Factory, Novita, OpenAI, OpenRouter, Perplexity, Vertex AI, xAI

Showing 111 of 111 models

OpenAI GPT-5.4 (Credits)

GPT-5.4 is our frontier model for complex professional work. Reasoning.effort supports: none (default), low, medium, high and xhigh. Features a 1.05M ...

by openai • 1.1M context • $2.5/M in, $15.0/M out

OpenAI GPT-5.4 (Pinned Version, Credits)

GPT-5.4 is our frontier model for complex professional work. Reasoning.effort supports: none (default), low, medium, high and xhigh. Features a 1.05M ...

by openai • 1.1M context • $2.5/M in, $15.0/M out

Google Gemini 3.1 Flash-Lite Preview (Credits)

Gemini 3.1 Flash-Lite Preview is Google's most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing....

by google • 1.0M context • $0.25/M in, $1.5/M out

Claude Sonnet 4.6 (Credits)

Claude Sonnet 4.6 is Anthropic's most capable Sonnet model, released February 2026. Features near-Opus-level intelligence at Sonnet pricing, with a 1M...

by anthropic • 1.0M context • $3.0/M in, $15.0/M out

Google Gemini 3.1 Pro Preview (Credits)

Gemini 3.1 Pro Preview is Google's most advanced reasoning model, released February 2026. It uses extended thinking/chain-of-thought reasoning to work...

by google • 1.0M context • $2.0/M in, $12.0/M out

Claude Opus 4.6 (Credits)

Claude Opus 4.6 is Anthropic's most capable model to date, released February 2026. Building on the intelligence of Opus 4.5, it brings new levels of r...

by anthropic • 1.0M context • $5.0/M in, $25.0/M out

Google Gemini 3 Flash Preview (Credits)

Gemini 3 Flash Preview is Google's latest fast and efficient AI model optimized for quick response times while maintaining high quality. This preview ...

by google • 1.0M context • $0.50/M in, $3.0/M out

OpenAI GPT-5.2 (Credits)

GPT-5.2 is our best general-purpose model, part of the GPT-5 flagship model family. Our most intelligent model yet for both general and agentic tasks,...

by openai • 400K context • $1.8/M in, $14.0/M out

GPT-5.2 Pro (Credits)

Tough problems that may take longer to solve but require harder thinking

by openai • 400K context • $21.0/M in, $168.0/M out

OpenAI GPT-5.2 Chat (Credits)

GPT-5.2 Chat is a continuously updated version of GPT-5.2 optimized for conversational interactions. It receives regular updates with the latest impro...

by openai • 128K context • $1.8/M in, $14.0/M out

OpenAI GPT Image 1.5

GPT Image 1.5 is OpenAI's state-of-the-art image generation model with better instruction following, 4× faster generation, and cheaper image tokens th...

by openai • 8K context • $5.0/M in, $10.0/M out

Claude Opus 4.5 (Credits)

Claude Opus 4.5 is Anthropic's flagship model released November 2025, representing the highest level of intelligence and capability. Features extended...

by anthropic • 200K context • $5.0/M in, $25.0/M out

Google Gemini 3 Pro Image Preview (Credits)

Gemini 3 Pro Image is Google's native image generation model with state-of-the-art reasoning capabilities. It is the best model for complex and multi-...

by google • 66K context • $2.0/M in, $12.0/M out

Google Gemini 3 Pro Preview (Credits)

Gemini 3 Pro Preview is Google's latest experimental AI model with advanced reasoning, coding, and multimodal capabilities. This preview version offer...

by google • 1.0M context • $2.0/M in, $12.0/M out

xAI Grok 4.1 Fast Non-Reasoning (Credits)

A frontier multimodal model optimized specifically for high-performance agentic tool calling.

by xai • 2.0M context • $0.20/M in, $0.50/M out

xAI Grok 4.1 Fast Reasoning (Credits)

A frontier multimodal model optimized for high-performance agentic tool calling with reasoning capabilities.

by xai • 2.0M context • $0.20/M in, $0.50/M out

OpenAI GPT-5.1 (Pinned Version, Credits)

GPT-5.1 is an enhanced version of GPT-5 with improved performance and capabilities. It features the same 400K context window and advanced tool calling...

by openai • 400K context • $1.3/M in, $10.0/M out

Kimi K2 Thinking (Credits)

Kimi K2 Thinking is a powerful open-source AI model from Moonshot AI designed for complex, step-by-step reasoning and long-horizon agentic tasks. It e...

by moonshotai • 256K context • $0.48/M in, $2.0/M out

Claude 4.5 Haiku (Credits)

Our fastest model. Intelligence at blazing speeds. Multilingual and vision capabilities. 8,192 max output tokens. Training data cut-off: October 2024....

by anthropic • 200K context • $1.0/M in, $5.0/M out

Claude 4.5 Haiku (20251001) (Credits)

Our fastest model. Intelligence at blazing speeds. Multilingual and vision capabilities. 8,192 max output tokens. Training data cut-off: October 2024....

by anthropic • 200K context • $1.0/M in, $5.0/M out

GPT-5 Pro (Pinned Version, Credits)

Most capable GPT-5 model with extended thinking capabilities

by openai • 128K context • $15.0/M in, $120.0/M out

Claude Sonnet 4.5 (Credits)

Best-in-class coding and agentic model with hours-long autonomous operation capabilities. Supports extended thinking, context awareness, parallel tool...

by anthropic • 200K context • $3.0/M in, $15.0/M out

Claude Sonnet 4.5 (20250929) (Credits)

Best-in-class coding and agentic model with hours-long autonomous operation capabilities. Supports extended thinking, context awareness, parallel tool...

by anthropic • 200K context • $3.0/M in, $15.0/M out

Qwen3 VL 235B A22B Instruct (Credits)

Qwen3 VL 235B A22B Instruct is a powerful, open-weight multimodal model from Alibaba Cloud that excels at both language and vision tasks. It integrate...

by alibaba • 256K context • $0.30/M in, $1.5/M out

DeepSeek V3.1 Terminus (Credits)

DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that maintains the model's original capabilities while addressing issues reported by users, inclu...

by deepseek • 128K context • $0.27/M in, $1.0/M out

DeepSeek V3.2 (Credits)

DeepSeek-V3.2-Exp is an experimental model introducing the groundbreaking DeepSeek Sparse Attention (DSA) mechanism for enhanced long-context processi...

by deepseek • 164K context • $0.26/M in, $0.40/M out

xAI Grok 4 Fast Non-Reasoning (Credits)

Grok 4 Fast is xAI's latest advancement in cost-efficient reasoning models. Built on xAI’s learnings from Grok 4, Grok 4 Fast delivers frontier-level ...

by xai • 2.0M context • $0.20/M in, $0.50/M out

Kimi K2 (09/05) (Credits)

Enhanced version of Kimi K2 with doubled context window (256k tokens) and significantly improved coding capabilities, especially for frontend developm...

by moonshotai • 262K context • $0.50/M in, $2.0/M out

Grok 4 Fast Reasoning (Credits)

Grok 4 Fast is xAI's latest advancement in cost-efficient reasoning models. Built on xAI’s learnings from Grok 4, Grok 4 Fast delivers frontier-level ...

by xai • 2.0M context • $0.20/M in, $0.50/M out

OpenAI GPT-5 (Pinned Version, Credits)

GPT-5 is OpenAI's most advanced language model, featuring enhanced reasoning capabilities with 80% fewer factual errors than o3. It supports a 400K to...

by openai • 400K context • $1.3/M in, $10.0/M out

OpenAI GPT-5 Mini (Pinned Version, Credits)

GPT-5 Mini delivers GPT-5-level performance at a fraction of the cost and latency. With the same 400K context window and advanced capabilities includi...

by openai • 400K context • $0.25/M in, $2.0/M out

OpenAI GPT-5 Nano (Pinned Version, Credits)

GPT-5 Nano is the smallest and fastest model in the GPT-5 family, designed for ultra-low latency applications. Despite its compact size, it maintains ...

by openai • 400K context • $0.05/M in, $0.40/M out

Claude Opus 4.1 (Credits)

Our most capable model with the highest level of intelligence and capability. Supports extended thinking, multilingual capabilities, and vision proces...

by anthropic • 200K context • $15.0/M in, $75.0/M out

Claude Opus 4.1 (20250805) (Credits)

Our most capable model with the highest level of intelligence and capability. Supports extended thinking, multilingual capabilities, and vision proces...

by anthropic • 200K context • $15.0/M in, $75.0/M out

Qwen3 Coder 30B A3B Instruct (Credits)

This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements: (a) Significant Performance among op...

by alibaba • 262K context • $0.10/M in, $0.30/M out

Qwen3 235B A22B Thinking (Credits)

Qwen3-235B-A22B-Thinking-2507 is the Qwen3's new model with scaling the thinking capability of Qwen3-235B-A22B, improving both the quality and depth o...

by qwen • 262K context • $0.30/M in, $2.9/M out

Qwen3 Coder 480B A35B Instruct Turbo (Credits)

Qwen3-Coder-480B-A35B-Instruct is the Qwen3's most agentic code model, featuring significant performance on agentic coding, agentic browser-use and ot...

by qwen • 262K context • $0.22/M in, $0.95/M out

Google Gemini 2.5 Flash Lite (Credits)

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improv...

by google • 1.0M context • $0.10/M in, $0.40/M out

DeepSeek TNG R1T2 Chimera (Credits)

DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 671 B-parameter mixture-of-experts text-generation model assem...

by deepseek • 130K context • $0.30/M in, $1.2/M out

Google Gemini 2.5 Flash (Credits)

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks...

by google • 1.0M context • $0.30/M in, $2.5/M out

Google Gemini 2.5 Pro (Credits)

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking”...

by google • 1.0M context • $1.3/M in, $10.0/M out

Qwen3 30B A3B (Credits)

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. B...

by qwen • 41K context • $0.08/M in, $0.29/M out

Claude Opus 4 (Credits)

Our previous flagship model with very high intelligence and capability. Supports extended thinking, multilingual capabilities, and vision processing. ...

by anthropic • 200K context • $15.0/M in, $75.0/M out

Claude Sonnet 4 (Credits)

High-performance model with high intelligence and balanced performance. Supports extended thinking, multilingual capabilities, and vision processing. ...

by anthropic • 200K context • $3.0/M in, $15.0/M out

Qwen3 32B (Credits)

Qwen3-32B is a 32.8 billion parameter language model that uniquely supports seamless switching between thinking mode for complex reasoning tasks and n...

by alibaba • 131K context • $0.29/M in, $0.59/M out

OpenAI GPT-4.1 (Credits)

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. ...

by openai • 1.0M context • $2.0/M in, $8.0/M out

OpenAI GPT-4.1 Mini (Credits)

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token...

by openai • 1.0M context • $0.40/M in, $1.6/M out

OpenAI GPT-4.1 Nano (Credits)

For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a smal...

by openai • 1.0M context • $0.10/M in, $0.40/M out

OpenAI GPT Image 1

GPT Image 1 is OpenAI's image generation model that turns text and image inputs into high-fidelity images. It offers strong instruction following and ...

by openai • 8K context • $6.3/M in, $12.5/M out

Baidu Ernie 4.5 21B A3B Thinking (Credits)

ERNIE-4.5-21B-A3B-Thinking is a text-based Mixture of Experts (MoE) post-training model featuring 21B total parameters with 3B active parameters per t...

by baidu • 128K context • $0.07/M in, $0.28/M out

Claude 3.7 Sonnet (Credits)

High-performance model with toggleable extended thinking for complex reasoning tasks. Combines high intelligence with the ability to think through pro...

by anthropic • 200K context • $3.0/M in, $15.0/M out

Kimi K2.5 (Credits)

Kimi K2.5 is Moonshot AI's flagship agentic model and a new SOTA open model. Built on Kimi K2 with continued pretraining over approximately 15T mixed ...

by moonshotai • 262K context • $0.60/M in, $1.2/M out

Perplexity Sonar (Credits)

Fast and accurate web-grounded chat model with real-time search capabilities. Ideal for general queries requiring up-to-date information from the web.

by perplexity • 127K context • $1.0/M in, $1.0/M out

Perplexity Sonar Pro (Credits)

Advanced web-grounded chat model with enhanced search quality and 200K context window. Best for complex queries requiring comprehensive web research.

by perplexity • 200K context • $3.0/M in, $15.0/M out

Perplexity Sonar Reasoning (Credits)

Web-grounded reasoning model that thinks step-by-step before responding. Combines search capabilities with logical reasoning for accurate, well-reason...

by perplexity • 127K context • $1.0/M in, $5.0/M out

Perplexity Sonar Reasoning Pro (Credits)

Advanced reasoning model with 128K context window designed for complex, multi-step queries. Provides in-depth analysis with web-grounded research and ...

by perplexity • 127K context • $2.0/M in, $8.0/M out

Perplexity Sonar Deep Research (Credits)

Specialized research model that conducts comprehensive multi-query searches with citation tracking and reasoning tokens. Automatically determines sear...

by perplexity • 127K context • $2.0/M in, $8.0/M out

DeepSeek R1 Distill Llama 70B (Credits)

DeepSeek-R1-Distill-Llama-70B is a 70-billion parameter model created by distilling the reasoning capabilities of DeepSeek's flagship R1 model (671B p...

by deepseek • 128K context • $0.03/M in, $0.13/M out

DeepSeek Reasoner (Credits)

DeepSeek-Reasoner (DeepSeek-V3.1 Thinking Mode) is designed for advanced reasoning, mathematical problem-solving, and complex coding tasks. It uses ch...

by deepseek • 128K context • $0.50/M in, $1.7/M out

o1 (Credits)

Reasoning model with extended thinking capabilities

by openai • 200K context • $15.0/M in, $60.0/M out

o1-mini (Credits)

Efficient reasoning model

by openai • 128K context • $1.1/M in, $4.4/M out

OpenAI GPT-5 (Credits)

GPT-5 is OpenAI's most advanced language model, featuring enhanced reasoning capabilities with 80% fewer factual errors than o3. It supports a 400K to...

by openai • 400K context • $1.3/M in, $10.0/M out

OpenAI GPT-5 Mini (Credits)

GPT-5 Mini delivers GPT-5-level performance at a fraction of the cost and latency. With the same 400K context window and advanced capabilities includi...

by openai • 400K context • $0.25/M in, $2.0/M out

OpenAI GPT-5 Nano (Credits)

GPT-5 Nano is the smallest and fastest model in the GPT-5 family, designed for ultra-low latency applications. Despite its compact size, it maintains ...

by openai • 400K context • $0.05/M in, $0.40/M out

GPT-5 Pro (Credits)

Most capable GPT-5 model with extended thinking capabilities

by openai • 128K context • $15.0/M in, $120.0/M out

GPT-5 Codex (Credits)

Specialized model for code generation and analysis

by openai • 400K context • $1.3/M in, $10.0/M out

OpenAI GPT-5.1 (Credits)

GPT-5.1 is an enhanced version of GPT-5 with improved performance and capabilities. It features the same 400K context window and advanced tool calling...

by openai • 400K context • $1.3/M in, $10.0/M out

GPT-5.1 Codex (Credits)

Specialized model for code generation and analysis, based on GPT-5.1

by openai • 400K context • $1.3/M in, $10.0/M out

GPT-5.1 Codex Mini (Credits)

Compact specialized model for code generation and analysis, based on GPT-5.1

by openai • 400K context • $0.25/M in, $2.0/M out

OpenAI GPT-5.1 Chat (Credits)

GPT-5.1 Chat is a continuously updated version of GPT-5.1 optimized for conversational interactions. It receives regular updates with the latest impro...

by openai • 128K context • $1.3/M in, $10.0/M out

OpenAI Codex Mini Latest (Credits)

Latest version of Codex Mini, a compact specialized model for code generation and analysis

by openai • 200K context • $1.5/M in, $6.0/M out

Meta Llama 4 Scout 17B 16E (Credits)

Llama 4 instruction-tuned MoE (17B, 16 experts) for fast, high-quality chat, tool use, and multilingual reasoning with balanced latency and cost.

by meta-llama • 131K context • $0.08/M in, $0.30/M out

Meta Llama 4 Maverick 17B 128E (Credits)

Llama 4 instruction-tuned MoE (17B, 128 experts) targeting tougher reasoning and long-form tasks, trading more compute for higher response diversity a...

by meta-llama • 131K context • $0.15/M in, $0.60/M out

Meta Llama Guard 4 12B (Credits)

Meta’s latest safety/guardrail model for prompt and output moderation, aligning conversations to policy via classification and constrained generation.

by meta-llama • 131K context • $0.20/M in, $0.21/M out

Kimi K2 (07/11) (Credits)

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 bill...

by moonshotai • 131K context • $0.57/M in, $2.3/M out

Qwen3 Next 80B A3B Instruct (Credits)

Qwen3-Next-80B-A3B-Instruct is a causal language model that is instruction-optimized for chat and agent applications. It features a Mixture-of-Experts...

by qwen • 262K context • $0.14/M in, $1.4/M out

DeepSeek V3 (Credits)

DeepSeek-V3.1 (deepseek-chat) is a powerful generalist model with 671B parameters, offering exceptional performance at an economical price. It achieve...

by deepseek • 128K context • $0.27/M in, $1.0/M out

Zai GLM-4.7 (Credits)

GLM-4.7 is Zhipu AI's flagship coding model with major upgrades in advanced coding capabilities, multi-step reasoning, and agentic orchestration. Feat...

by zai • 205K context • $0.43/M in, $1.8/M out

Google Gemini 2.0 Flash Experimental (Free)

Experimental version of Gemini 2.0 Flash with native image generation capabilities. Features multimodal input and output support including text and im...

by google • 1.0M context • $0.00/M in, $0.00/M out

Meta Llama 3.3 70B Versatile (Credits)

Llama-3.3-70B-Versatile is Meta's advanced multilingual large language model, optimized for a wide range of natural language processing tasks. With 70...

by meta-llama • 131K context • $0.59/M in, $0.79/M out

Meta Llama 3.3 70B Instruct (Credits)

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama...

by meta-llama • 128K context • $0.13/M in, $0.39/M out

Google Gemma 3 12B (Credits)

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini mode...

by google • 131K context • $0.05/M in, $0.10/M out

Claude 3.5 Sonnet v2 (Credits)

Our previous intelligent model with high level of intelligence and capability. Fast latency with multilingual and vision capabilities, but no extended...

by anthropic • 200K context • $3.0/M in, $15.0/M out

Claude 3.5 Haiku (Credits)

Our fastest model. Intelligence at blazing speeds. Multilingual and vision capabilities. 8,192 max output tokens. Training data cut-off: July 2024. AP...

by anthropic • 200K context • $0.80/M in, $4.0/M out

Meta Llama Prompt Guard 2 86M (Credits)

86M parameter multilingual prompt safety classifier based on mDeBERTa-base, detecting prompt injections and jailbreaks across 8+ languages with advers...

by meta-llama • 512 context • $0.01/M in, $0.01/M out

Meta Llama Prompt Guard 2 22M (Credits)

22M parameter lightweight prompt safety classifier based on DeBERTa-xsmall, offering 75% reduced latency for detecting prompt injections and jailbreak...

by meta-llama • 512 context • $0.01/M in, $0.01/M out

OpenAI GPT-5 Chat Latest (Credits)

GPT-5 Chat Latest is a continuously updated version of GPT-5 optimized for conversational interactions. It receives regular updates with the latest im...

by openai • 128K context • $1.3/M in, $10.0/M out

Qwen2.5 Coder 7B fast (Credits)

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language mo...

by alibaba • 32K context • $0.03/M in, $0.09/M out

xAI Grok Code Fast 1 (Credits)

Speedy and economical reasoning model that excels at agentic coding. Features function calling, structured outputs, and reasoning capabilities.

by xai • 256K context • $0.20/M in, $1.5/M out

OpenAI ChatGPT-4o (Credits)

OpenAI ChatGPT 4o is continually updated by OpenAI to point to the current version of GPT-4o used by ChatGPT. It therefore differs slightly from the A...

by openai • 128K context • $5.0/M in, $15.8/M out

Mistral-Large (Credits)

Mistral Large 2.1

by mistral • 128K context • $2.0/M in, $6.0/M out

Meta Llama 3.1 8B Instruct (Credits)

Meta's latest class of models, Llama 3.1, launched with a variety of sizes and configurations. The 8B instruct-tuned version is particularly fast and ...

by meta-llama • 16K context • $0.02/M in, $0.05/M out

Meta Llama 3.1 8B Instruct Turbo (Credits)

Optimized version of Llama 3.1 8B Instruct with 128K context window, designed for high-speed inference in multilingual chat and dialogue use cases wit...

by meta-llama • 128K context • $0.02/M in, $0.03/M out

OpenAI GPT-4o-mini (Credits)

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs. As their most advanced small model, i...

by openai • 128K context • $0.15/M in, $0.60/M out

Mistral Nemo (Credits)

The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407. Trained jointly by Mistral ...

by mistral • 128K context • $20.0/M in, $40.0/M out

Zai GLM-4.6 (Credits)

As the latest iteration in the GLM series, GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-cont...

by zai • 205K context • $0.45/M in, $1.5/M out

xAI Grok 4 (Credits)

Latest and greatest flagship model, offering unparalleled performance in natural language, math and reasoning - the perfect jack of all trades. Featur...

by xai • 256K context • $3.0/M in, $15.0/M out

Meta Llama 3.1 8B Instant (Credits)

Compact 8B general-purpose model offering efficient inference for chat, coding, and RAG workflows on limited compute.

by meta-llama • 131K context • $0.05/M in, $0.08/M out

Google Gemma 2 (Credits)

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini mode...

by google • 8K context • $0.01/M in, $0.03/M out

OpenAI o3 (Credits)

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels a...

by openai • 200K context • $2.0/M in, $8.0/M out

OpenAI o3 Pro (Credits)

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more c...

by openai • 200K context • $20.0/M in, $80.0/M out

OpenAI o4 Mini (Credits)

o4-mini is our latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual...

by openai • 200K context • $1.1/M in, $4.4/M out

OpenAI GPT-OSS 120b (Credits)

gpt-oss-120b is our most powerful open-weight model, which fits into a single H100 GPU (117B parameters with 5.1B active parameters). Features permiss...

by openai • 131K context • $0.04/M in, $0.16/M out

OpenAI GPT-OSS 20b (Credits)

gpt-oss-20b is our medium-sized open-weight model for low latency, local, or specialized use-cases (21B parameters with 3.6B active parameters). Featu...

by openai • 131K context • $0.05/M in, $0.20/M out

xAI Grok 3 (Credits)

Excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, healthcare, law, and ...

by xai • 131K context • $3.0/M in, $15.0/M out

xAI Grok 3 Mini (Credits)

Lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. Features func...

by xai • 131K context • $0.30/M in, $0.50/M out

Hermes 2 Pro Llama 3 8B (Credits)

Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well a...

by meta-llama • 131K context • $0.14/M in, $0.14/M out

OpenAI GPT-4o (Credits)

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of G...

by openai • 128K context • $2.5/M in, $10.0/M out

Claude 3 Haiku (Credits)

Claude 3 Haiku is Anthropic's fastest and most compact model. Designed for near-instant responsiveness and seamless AI experiences that mimic human in...

by anthropic • 200K context • $0.25/M in, $1.3/M out

Mistral Small (Credits)

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and impr...

by mistral • 128K context • $75.0/M in, $200.0/M out

OpenAI o3 Mini (Credits)

o3-mini is our newest small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini supports key develop...

by openai • 200K context • $1.1/M in, $4.4/M out
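
The prices in this listing are quoted in USD per million tokens, separately for input and output. As a rough sketch of how those figures translate into the cost of a single request, the snippet below multiplies token counts by the listed rates. The function name `request_cost`, the `PRICES` table, and the token counts are illustrative assumptions, not part of the registry; the two price pairs are copied from the entries above. It ignores anything the listing does not show, such as cache discounts, tiered pricing, or per-provider differences.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Return the USD cost of one request, given per-million-token prices.

    Prices are passed as strings so Decimal keeps them exact.
    """
    in_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(in_price_per_m)
    out_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(out_price_per_m)
    return in_cost + out_cost


# Hypothetical lookup table; prices copied from the listing (USD per million tokens).
PRICES = {
    "OpenAI GPT-5.4": ("2.5", "15.0"),
    "OpenAI GPT-5 Nano": ("0.05", "0.40"),
}

# Assumed workload: 10,000 input tokens and 2,000 output tokens per request.
for name, (price_in, price_out) in PRICES.items():
    cost = request_cost(10_000, 2_000, price_in, price_out)
    print(f"{name}: ${cost:.4f} per request")
```

Because output tokens are priced several times higher than input tokens for most entries, the output share often dominates the total even when the response is much shorter than the prompt.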