AI Models

Available Models

Last updated: 2026-02-26

Model Licensing Terms

Enterprise = Model usage is licensed under UCOP and/or DOE negotiated enterprise agreements. Model provider will not use your data for training. Acceptable for use with Prudent-to-Protect (P2P) Information (e.g. pre-publication data). May be approved for protected R&D information including ECI or S&T Matrix as part of an approved Access Plan. Please contact IT Policy for more information.

Commercial = Model usage is licensed under non-negotiated standard commercial agreements. Model providers will not use your data for training. Acceptable for use with non-sensitive open scientific data and pre-publication research. Not permitted: S&T or ECI information, or anything with FN or sponsor-specified protections. For details, consult IT Policy AI Tool Security Levels.

Chat and Vision Models

Provider | Creator | Model ID | Input/Output | Vision | Tools | Cost (I/O) | License
Amazon Bedrock | Anthropic | amazon/claude-haiku-3-5 | 200K / 8.192K | N | Y | $0.80 / $4.00 | E
Amazon Bedrock | Anthropic | amazon/claude-haiku-4-5 | 200K / 64K | Y | Y | $1.00 / $5.00 | E
Amazon Bedrock | Anthropic | amazon/claude-haiku-4-5-high | 200K / 64K | Y | Y | $1.00 / $5.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-1 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-1-high | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-5 | 200K / 8.192K | Y | Y | $5.00 / $25.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-5-high | 200K / 8.192K | Y | Y | $5.00 / $25.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-6 | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-6-high | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Amazon Bedrock | Anthropic | amazon/claude-opus-4-high | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4 | 1M / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4-5 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4-5-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4-6 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4-6-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Anthropic | amazon/claude-sonnet-4-high | 1M / 16.384K | Y | Y | $3.00 / $15.00 | E
Amazon Bedrock | Meta | amazon/llama-4-maverick | 128K / 4.096K | N | Y | $0.24 / $0.97 | E
Amazon Bedrock | Meta | amazon/llama-4-scout | 128K / 4.096K | N | Y | $0.17 / $0.66 | E
Amazon Bedrock | Mistral AI | mistral-large | 128K / 8.192K | N | Y | $0.50 / $1.50 | E
Amazon Bedrock | Mistral AI | mistral-large-3 | 128K / 8.192K | N | Y | $0.50 / $1.50 | E
Amazon Bedrock | OpenAI | amazon/gpt-oss-120b | 128K / 128K | N | Y | $0.15 / $0.60 | E
Amazon Bedrock | OpenAI | amazon/gpt-oss-20b | 128K / 128K | N | Y | $0.07 / $0.30 | E
Amazon Bedrock | Unknown | devstral | 256K / 256K | N | Y | $0.40 / $2.00 | E
Amazon Bedrock | Unknown | devstral-2 | 256K / 256K | N | Y | $0.40 / $2.00 | E
Amazon Bedrock | Unknown | nemotron-nano-3 | 262.144K / 8.192K | N | Y | $0.06 / $0.24 | E
Amazon Bedrock | Unknown | nemotron-nano-vl | 128K / 8.192K | Y | N | $0.20 / $0.60 | E
Amazon Bedrock | Unknown | nova-micro | N/A / N/A | N | N | $0 / $0 | E
Amazon Bedrock | Unknown | nova-micro-1 | 128K / 10K | N | Y | $0.04 / $0.14 | E
Amazon Bedrock | Unknown | nova-premier | N/A / N/A | N | N | $0 / $0 | E
Amazon Bedrock | Unknown | nova-premier-1 | N/A / N/A | N | N | $0 / $0 | E
Amazon Bedrock | Unknown | nova-pro | N/A / N/A | N | N | $0 / $0 | E
Amazon Bedrock | Unknown | nova-pro-1 | 300K / 10K | Y | Y | $0.80 / $3.20 | E
Google Vertex AI | Anthropic | anthropic/claude-haiku | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | anthropic/claude-haiku-high | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | anthropic/claude-opus | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | anthropic/claude-opus-high | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | anthropic/claude-sonnet | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | anthropic/claude-sonnet-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | claude-3-5-haiku | 200K / 8.192K | N | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | claude-haiku | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | claude-haiku-4-5 | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | claude-haiku-high | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | claude-opus | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | claude-opus-4-0 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | claude-opus-4-1 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | claude-opus-4-5 | 200K / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | claude-opus-4-6 | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | claude-opus-high | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | claude-sonnet | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | claude-sonnet-4-5 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | claude-sonnet-4-6 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | claude-sonnet-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-haiku-3-5 | 200K / 8.192K | N | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | google/claude-haiku-4-5 | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | google/claude-haiku-4-5-high | 200K / 8.192K | Y | Y | $1.00 / $5.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-1 | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-1-high | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-5 | 200K / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-5-high | 200K / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-6 | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-6-high | 1M / 8.192K | Y | Y | $5.00 / $25.00 | E
Google Vertex AI | Anthropic | google/claude-opus-4-high | 200K / 8.192K | Y | Y | $15.00 / $75.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4 | 1M / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4-5 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4-5-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4-6 | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4-6-high | 200K / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | Anthropic | google/claude-sonnet-4-high | 1M / 16.384K | Y | Y | $3.00 / $15.00 | E
Google Vertex AI | DeepSeek | google/deepseek-3.2 | 163.84K / 32.768K | N | Y | $0.56 / $1.68 | E
Google Vertex AI | DeepSeek | google/deepseek-r1 | 65.336K / 8.192K | N | Y | $1.35 / $5.40 | E
Google Vertex AI | Google | gemini-1.5-flash | 1M / 8.192K | Y | Y | $0.07 / $0.30 | E
Google Vertex AI | Google | gemini-1.5-pro | 2.09715M / 8.192K | Y | Y | $1.25 / $5.00 | E
Google Vertex AI | Google | gemini-2.0-flash | 1.04858M / 8.192K | Y | Y | $0.10 / $0.40 | E
Google Vertex AI | Google | gemini-2.0-flash-lite | 1.04858M / 8.192K | Y | Y | $0.07 / $0.30 | E
Google Vertex AI | Google | gemini-2.5-flash | 1.04858M / 65.535K | Y | Y | $0.30 / $2.50 | E
Google Vertex AI | Google | gemini-2.5-flash-high | 1.04858M / 65.535K | Y | Y | $0.30 / $2.50 | E
Google Vertex AI | Google | gemini-2.5-flash-image | N/A / N/A | N | N | $0 / $0 | E
Google Vertex AI | Google | gemini-2.5-flash-lite | N/A / N/A | N | N | $0 / $0 | E
Google Vertex AI | Google | gemini-2.5-pro | 1.04858M / 65.535K | Y | Y | $1.25 / $10.00 | E
Google Vertex AI | Google | gemini-2.5-pro-high | 1.04858M / 65.535K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-3-flash | 1.04858M / 65.535K | Y | Y | $0.50 / $3.00 | E
Google Vertex AI | Google | gemini-3-flash-high | 1.04858M / 65.535K | Y | Y | $0.50 / $3.00 | E
Google Vertex AI | Google | gemini-3-pro | 1.04858M / 65.535K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-3-pro-high | 1.04858M / 65.535K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-3.1-pro | 1.04858M / 65.536K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-3.1-pro-high | 1.04858M / 65.536K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-embedding-001 | N/A / N/A | N | N | $0 / $0 | E
Google Vertex AI | Google | gemini-flash | 1.04858M / 65.535K | Y | Y | $0.50 / $3.00 | E
Google Vertex AI | Google | gemini-flash-high | 1.04858M / 65.535K | Y | Y | $0.50 / $3.00 | E
Google Vertex AI | Google | gemini-flash-image | N/A / N/A | N | N | $0 / $0 | E
Google Vertex AI | Google | gemini-pro | 1.04858M / 65.536K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-pro-high | 1.04858M / 65.536K | Y | Y | $2.00 / $12.00 | E
Google Vertex AI | Meta | google/llama-4-maverick | 1M / 1M | N | Y | $0.35 / $1.15 | E
Google Vertex AI | Mistral AI | google/codestral | 128K / 128K | N | Y | $0.20 / $0.60 | E
Google Vertex AI | OpenAI | google/gpt-oss-120b | 131.072K / 32.768K | N | N | $0.15 / $0.60 | E
Google Vertex AI | OpenAI | google/gpt-oss-120b-high | 131.072K / 32.768K | N | N | $0 / $0 | E
Google Vertex AI | OpenAI | google/gpt-oss-20b | 131.072K / 32.768K | N | N | $0.07 / $0.30 | E
Google Vertex AI | OpenAI | google/gpt-oss-20b-high | 131.072K / 32.768K | N | N | $0.07 / $0.30 | E
Google Vertex AI | Qwen | google/qwen-3 | 262.144K / 16.384K | N | Y | $0.25 / $1.00 | E
Google Vertex AI | Qwen | google/qwen-3-coder | 262.144K / 32.768K | N | Y | $1.00 / $4.00 | E
Google Vertex AI | Unknown | google/glm-4.7 | 200K / 128K | N | Y | $0.60 / $2.20 | E
Google Vertex AI | Unknown | google/glm-5 | 200K / 128K | N | Y | $1.00 / $3.20 | E
Google Vertex AI | Unknown | google/kimi-k2-thinking | 256K / 256K | N | Y | $0.60 / $2.50 | E
Google Vertex AI | Unknown | google/minimax-m2 | 196.608K / 196.608K | N | Y | $0.30 / $1.20 | E
Google Vertex AI | Unknown | text-embedding-004 | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | IBM | lbl/granite-docling | 8.192K / N/A | N | N | $0 / $0 | E
LBL IT Division | Meta | Llama-4-Scout-17B-16E-Instruct | 131.072K / 8.192K | N | N | $0 / $0 | E
LBL IT Division | Meta | lbl/Llama-4-Scout-17B-16E-Instruct | 131.072K / 8.192K | N | N | $0 / $0 | E
LBL IT Division | Meta | lbl/llama | 131.072K / 8.192K | N | N | $0 / $0 | E
LBL IT Division | Meta | lbl/llama-4-scout | 131.072K / 8.192K | N | N | $0 / $0 | E
LBL IT Division | Meta | meta/llama-4-scout | 131.072K / 8.192K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | gpt | 272K / 128K | Y | Y | $1.75 / $14.00 | E
LBL IT Division | OpenAI | gpt-oss-120b | 131.072K / 131.072K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | gpt-oss-120b-high | 131.072K / 131.072K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | gpt-oss-20b | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | OpenAI | gpt-oss-20b-high | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | OpenAI | lbl/gpt-oss-120b | 131.072K / 131.072K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | lbl/gpt-oss-120b-high | 131.072K / 131.072K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | lbl/gpt-oss-120b-medium | 131.072K / 131.072K | N | N | $0 / $0 | E
LBL IT Division | OpenAI | lbl/gpt-oss-20b | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | OpenAI | lbl/gpt-oss-20b-high | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | Unknown | Nanonets-OCR | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | Unknown | lbl/Nanonets-OCR | N/A / N/A | N | N | $0 / $0 | E
LBL IT Division | Unknown | lbl/jbei-publications-chat | N/A / N/A | N | N | $0 / $0 | E
Microsoft Azure | OpenAI | azure/gpt-oss-120b | N/A / 131.072K | N | N | $0.15 / $0.60 | E
Microsoft Azure | xAI | azure/grok-3 | N/A / N/A | N | N | $3.00 / $15.00 | E
Microsoft Azure | xAI | azure/grok-3-mini | N/A / N/A | N | N | $0.25 / $1.27 | E
OpenAI | OpenAI | gpt-4.1 | 1.04758M / 32.768K | Y | Y | $2.00 / $8.00 | E
OpenAI | OpenAI | gpt-4.1-mini | 1.04758M / 32.768K | Y | Y | $0.40 / $1.60 | E
OpenAI | OpenAI | gpt-4.1-nano | 1.04758M / 32.768K | Y | Y | $0.10 / $0.40 | E
OpenAI | OpenAI | gpt-4o | 128K / 16.384K | Y | Y | $2.50 / $10.00 | E
OpenAI | OpenAI | gpt-4o-mini | 128K / 16.384K | Y | Y | $0.15 / $0.60 | E
OpenAI | OpenAI | gpt-5 | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5-chat | 128K / 16.384K | Y | N | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5-codex | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5-high | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5-mini | 272K / 128K | Y | Y | $0.25 / $2.00 | E
OpenAI | OpenAI | gpt-5-mini-high | 272K / 128K | Y | Y | $0.25 / $2.00 | E
OpenAI | OpenAI | gpt-5-nano | 272K / 128K | Y | Y | $0.05 / $0.40 | E
OpenAI | OpenAI | gpt-5-nano-high | 272K / 128K | Y | Y | $0.05 / $0.40 | E
OpenAI | OpenAI | gpt-5.1 | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.1-chat | 128K / 16.384K | Y | N | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.1-codex | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.1-codex-max | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.1-codex-mini | 272K / 128K | Y | Y | $0.25 / $2.00 | E
OpenAI | OpenAI | gpt-5.1-high | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.2 | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-5.2-chat | 128K / 16.384K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-5.2-codex | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-5.2-high | 272K / 128K | Y | Y | $1.25 / $10.00 | E
OpenAI | OpenAI | gpt-5.2-pro | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-5.2-xhigh | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-5.3-codex | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-chat | 128K / 16.384K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-codex | 272K / 128K | Y | Y | $1.75 / $14.00 | E
OpenAI | OpenAI | gpt-mini | 272K / 128K | Y | Y | $0.25 / $2.00 | E
OpenAI | OpenAI | gpt-nano | 272K / 128K | Y | Y | $0.05 / $0.40 | E
OpenAI | OpenAI | o1 | 200K / 100K | Y | Y | $15.00 / $60.00 | E
OpenAI | OpenAI | o3 | 200K / 100K | Y | Y | $2.00 / $8.00 | E
OpenAI | OpenAI | o3-high | 200K / 100K | Y | Y | $2.00 / $8.00 | E
OpenAI | OpenAI | o3-mini | 200K / 100K | N | Y | $1.10 / $4.40 | E
OpenAI | OpenAI | o3-mini-high | 200K / 100K | N | Y | $1.10 / $4.40 | E
OpenAI | OpenAI | o4-mini | 200K / 100K | Y | Y | $1.10 / $4.40 | E
OpenAI | OpenAI | o4-mini-high | 200K / 100K | Y | Y | $1.10 / $4.40 | E
OpenAI | Unknown | o1-high | 200K / 100K | Y | Y | $15.00 / $60.00 | E
OpenAI | Unknown | o1-mini | 128K / 65.536K | Y | N | $1.10 / $4.40 | E
OpenAI | Unknown | o1-mini-high | 128K / 65.536K | Y | N | $1.10 / $4.40 | E
xAI | xAI | xai/grok-3 | 131.072K / 131.072K | N | Y | $3.00 / $15.00 | C
xAI | xAI | xai/grok-3-mini | 131.072K / 131.072K | N | Y | $0.30 / $0.50 | C
xAI | xAI | xai/grok-4-0709 | 256K / 256K | N | Y | $3.00 / $15.00 | C
xAI | xAI | xai/grok-4-1-fast | 2M / 2M | Y | Y | $0.20 / $0.50 | C
xAI | xAI | xai/grok-4-1-fast-reasoning | 2M / 2M | Y | Y | $0.20 / $0.50 | C
xAI | xAI | xai/grok-code-fast-1 | 256K / 256K | N | Y | $0.20 / $1.50 | C

Image Generation Models

Provider | Creator | Model ID | Input/Output | Cost (I/O) | License
Google Vertex AI | Google | gemini-3-pro-image | 65.536K / 32.768K | $2.00 / $12.00 | E
Google Vertex AI | Google | gemini-pro-image | 65.536K / 32.768K | $2.00 / $12.00 | E

Vector Embedding Models

Provider | Creator | Model ID | Max Tokens | Cost
LBL IT Division | Unknown | bge-m3 | 8.192K | Free
Amazon Bedrock | Cohere | cohere-embed-english-v3 | 512 | $0.10
Amazon Bedrock | Cohere | cohere-embed-multilingual-v3 | 512 | $0.10
Amazon Bedrock | Cohere | cohere-embed-v4 | 128K | $0.12
LBL IT Division | Unknown | lbl/bge-m3 | 8.192K | Free
LBL IT Division | Nomic.AI | lbl/nomic-embed-text | 8.192K | Free
LBL IT Division | Nomic.AI | lbl/nomic-embed-vision | 8.192K | Free
LBL IT Division | Nomic.AI | nomic-embed-text | 8.192K | Free
LBL IT Division | Nomic.AI | nomic-embed-vision | 8.192K | Free
Amazon Bedrock | Unknown | nova-2-embed-multimodal | 8.172K | $0.14
LBL IT Division | Unknown | text-embedding-ada-002 | 8.191K | $0.10
Amazon Bedrock | Unknown | titan-embed-image-v1 | 128 | $0.80
Amazon Bedrock | Unknown | titan-embed-text-v1 | 8.192K | $0.10
Amazon Bedrock | Unknown | titan-embed-text-v2 | 8.192K | $0.20

Code Completion Models

Provider | Creator | Model ID | Max Context | Cost
LBL IT Division | Google | codegemma | 8.192K | N/C
LBL IT Division | Google | lbl/codegemma:2b | 8.192K | N/C

LBL-Hosted Customized Models

LBL-Hosted Customized Models apply a customized system prompt on top of a base model to provide improved behavior for LBL users in chat modes.

Note: API users can bypass the system prompt by accessing underlying models directly, if desired.

Provider | Model ID | Underlying Model | Input/Output | Vision | Cost
LBL IT Division | lbl/cborg-chat | hosted_vllm/Llama-4-Scout-17B-16E-Instruct | 131.072K / 8.192K | N | N/C
LBL IT Division | lbl/cborg-coder | hosted_vllm/gpt-oss-120b | 131.072K / 131.072K | N | N/C
LBL IT Division | lbl/cborg-coder-base | ollama/codegemma:2b | 8.192K / N/A | N | N/C
LBL IT Division | lbl/cborg-deepthought | hosted_vllm/gpt-oss-120b | 131.072K / 131.072K | N | N/C
LBL IT Division | lbl/cborg-mini | hosted_vllm/gpt-oss-20b | N/A / N/A | N | N/C
LBL IT Division | lbl/cborg-ocr | openai/Nanonets-OCR | N/A / N/A | N | N/C
LBL IT Division | lbl/cborg-vision | hosted_vllm/Llama-4-Scout-17B-16E-Instruct | 131.072K / 8.192K | N | N/C

Note

Cost Explanation: The Cost column provides a rough order-of-magnitude estimate of the costs associated with each model. Detailed cost data is provided in the model details below. Costs for using commercial models are paid by the IT Division. There is no cost to individual users at this time, and no PID is required.
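As a rough illustration of how the per-1M-token prices in the model details translate into the cost of a single request, the sketch below multiplies token counts by those prices. The function name request_cost and the example token counts are hypothetical; the $3.00 / $15.00 prices are the ones listed for amazon/claude-sonnet-4-5.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the dollar cost of one request.

    Prices are expressed in dollars per 1M tokens, matching the
    "Cost per 1M Tokens" fields in the model details.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical request: 10,000 input tokens and 1,000 output tokens
# at $3.00 (input) and $15.00 (output) per 1M tokens.
example_cost = request_cost(10_000, 1_000, 3.00, 15.00)
print(f"Estimated cost: ${example_cost:.4f}")
```

This is only an estimate: it ignores prompt caching and any other pricing details not listed on this page.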

Understanding the Context Window Length

The context length is the approximate amount of text, measured in tokens, that a model can process as input. Some models support extremely long context lengths. For a typical book with 300 words per page, the correspondence between context length and pages is approximately as follows:

Context Length | Pages of Text
1.0M | 2000
200K | 400
128K | 250
64K | 128
32K | 64
16K | 32
8K | 16
4K | 8
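The table above works out to roughly 500 tokens per page (for example, 1.0M tokens for 2000 pages). A minimal sketch of that conversion follows; the constant TOKENS_PER_PAGE and the function name pages_for_context are assumptions inferred from the table, not values published by the service.

```python
# Assumption inferred from the table: 1.0M tokens ~ 2000 pages,
# i.e. about 500 tokens per page of a 300-words-per-page book.
TOKENS_PER_PAGE = 500


def pages_for_context(context_tokens: int) -> int:
    """Return the approximate number of book pages that fit in a context window."""
    return context_tokens // TOKENS_PER_PAGE


for tokens in (1_000_000, 200_000, 64_000, 8_000):
    print(f"{tokens:>9,} tokens ~ {pages_for_context(tokens):,} pages")
```

The table rounds some rows (128K is listed as 250 pages, while this formula gives 256), so treat the result as an order-of-magnitude guide.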

When chatting with a model, the entire chat history of the session is fed into the context window with every message sent. Therefore, as you send more messages, the context length increases. Over time, this can cause the cost of each message exchange to increase until the model's maximum token limit is reached.
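To make the growth concrete, the sketch below simulates a chat session in which every turn resends the full history. The function name simulate_chat and the fixed per-message token counts are hypothetical simplifications; real messages vary in length, and the service may count tokens differently.

```python
def simulate_chat(turns: int, user_tokens: int, reply_tokens: int,
                  max_input_tokens: int) -> list[int]:
    """Return the input tokens sent on each turn of a chat session.

    Each turn sends the accumulated history plus the new user message.
    The simulation stops before a turn whose input would exceed
    max_input_tokens.
    """
    history = 0
    per_turn_input = []
    for _ in range(turns):
        prompt = history + user_tokens
        if prompt > max_input_tokens:
            break  # the model's maximum input limit has been reached
        per_turn_input.append(prompt)
        # The reply becomes part of the history for the next turn.
        history = prompt + reply_tokens
    return per_turn_input


# Hypothetical session: 100-token questions, 400-token answers,
# and a 200,000-token input limit.
print(simulate_chat(4, 100, 400, 200_000))
```

Because each turn's input includes all earlier messages, the input size (and therefore the input cost) of a single exchange grows roughly linearly with the number of turns.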

Model Information Details

Amazon Bedrock

Models hosted on Amazon Bedrock are provided under enterprise agreements. Your data will not be used for training.

amazon/claude-haiku-3-5

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-haiku-3-5
  • Underlying Model: bedrock/us.anthropic.claude-3-5-haiku-20241022-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.80
  • Cost per 1M Tokens (Output): $4.00

amazon/claude-haiku-4-5

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-haiku-4-5
  • Underlying Model: bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 64,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

amazon/claude-haiku-4-5-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-haiku-4-5-high
  • Underlying Model: bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 64,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

amazon/claude-opus-4

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-20250514-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

amazon/claude-opus-4-1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-1
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

amazon/claude-opus-4-1-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-1-high
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

amazon/claude-opus-4-5

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-5
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

amazon/claude-opus-4-5-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-5-high
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

amazon/claude-opus-4-6

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-6
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-6-v1
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

amazon/claude-opus-4-6-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-6-high
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-6-v1
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

amazon/claude-opus-4-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-opus-4-high
  • Underlying Model: bedrock/us.anthropic.claude-opus-4-20250514-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

amazon/claude-sonnet-4

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/claude-sonnet-4-5

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4-5
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/claude-sonnet-4-5-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4-5-high
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/claude-sonnet-4-6

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4-6
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-6
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/claude-sonnet-4-6-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4-6-high
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-6
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/claude-sonnet-4-high

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/claude-sonnet-4-high
  • Underlying Model: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

amazon/llama-4-maverick

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/llama-4-maverick
  • Underlying Model: bedrock/us.meta.llama4-maverick-17b-instruct-v1:0
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 4,096
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.24
  • Cost per 1M Tokens (Output): $0.97

amazon/llama-4-scout

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/llama-4-scout
  • Underlying Model: bedrock/us.meta.llama4-scout-17b-instruct-v1:0
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 4,096
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.17
  • Cost per 1M Tokens (Output): $0.66

mistral-large

  • Endpoint Location: Amazon Bedrock
  • API Model Name: mistral-large
  • Underlying Model: bedrock/mistral.mistral-large-3-675b-instruct
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $1.50

mistral-large-3

  • Endpoint Location: Amazon Bedrock
  • API Model Name: mistral-large-3
  • Underlying Model: bedrock/mistral.mistral-large-3-675b-instruct
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $1.50

amazon/gpt-oss-120b

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/gpt-oss-120b
  • Underlying Model: bedrock/openai.gpt-oss-120b-1:0
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 128,000
  • Capabilities: Tool Use, Reasoning, Structured Output
  • Cost per 1M Tokens (Input): $0.15
  • Cost per 1M Tokens (Output): $0.60

amazon/gpt-oss-20b

  • Endpoint Location: Amazon Bedrock
  • API Model Name: amazon/gpt-oss-20b
  • Underlying Model: bedrock/openai.gpt-oss-20b-1:0
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 128,000
  • Capabilities: Tool Use, Reasoning, Structured Output
  • Cost per 1M Tokens (Input): $0.07
  • Cost per 1M Tokens (Output): $0.30

devstral

  • Endpoint Location: Amazon Bedrock
  • API Model Name: devstral
  • Underlying Model: bedrock/mistral.devstral-2-123b
  • Mode: Chat
  • Max Input Tokens: 256,000
  • Max Output Tokens: 256,000
  • Capabilities: Tool Use, Structured Output
  • Cost per 1M Tokens (Input): $0.40
  • Cost per 1M Tokens (Output): $2.00

devstral-2

  • Endpoint Location: Amazon Bedrock
  • API Model Name: devstral-2
  • Underlying Model: bedrock/mistral.devstral-2-123b
  • Mode: Chat
  • Max Input Tokens: 256,000
  • Max Output Tokens: 256,000
  • Capabilities: Tool Use, Structured Output
  • Cost per 1M Tokens (Input): $0.40
  • Cost per 1M Tokens (Output): $2.00

nemotron-nano-3

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nemotron-nano-3
  • Underlying Model: bedrock/nvidia.nemotron-nano-3-30b
  • Mode: Chat
  • Max Input Tokens: 262,144
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.06
  • Cost per 1M Tokens (Output): $0.24

nemotron-nano-vl

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nemotron-nano-vl
  • Underlying Model: bedrock/nvidia.nemotron-nano-12b-v2
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision
  • Cost per 1M Tokens (Input): $0.20
  • Cost per 1M Tokens (Output): $0.60

nova-micro

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-micro
  • Underlying Model: bedrock/amazon.nova-premier-v1:0
  • Cost: No cost

nova-micro-1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-micro-1
  • Underlying Model: bedrock/amazon.nova-micro-v1:0
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 10,000
  • Capabilities: Tool Use, Prompt Caching, Structured Output
  • Cost per 1M Tokens (Input): $0.04
  • Cost per 1M Tokens (Output): $0.14

nova-premier

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-premier
  • Underlying Model: bedrock/amazon.nova-premier-v1:0
  • Cost: No cost

nova-premier-1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-premier-1
  • Underlying Model: bedrock/amazon.nova-premier-v1:0
  • Cost: No cost

nova-pro

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-pro
  • Underlying Model: bedrock/amazon.nova-premier-v1:0
  • Cost: No cost

nova-pro-1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-pro-1
  • Underlying Model: bedrock/amazon.nova-pro-v1:0
  • Mode: Chat
  • Max Input Tokens: 300,000
  • Max Output Tokens: 10,000
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.80
  • Cost per 1M Tokens (Output): $3.20

cohere-embed-english-v3

  • Endpoint Location: Amazon Bedrock
  • API Model Name: cohere-embed-english-v3
  • Underlying Model: bedrock/cohere.embed-english-v3
  • Mode: Embedding
  • Max Input Tokens: 512
  • Cost per 1M Tokens (Input): $0.10

cohere-embed-multilingual-v3

  • Endpoint Location: Amazon Bedrock
  • API Model Name: cohere-embed-multilingual-v3
  • Underlying Model: bedrock/cohere.embed-multilingual-v3
  • Mode: Embedding
  • Max Input Tokens: 512
  • Cost per 1M Tokens (Input): $0.10

cohere-embed-v4

  • Endpoint Location: Amazon Bedrock
  • API Model Name: cohere-embed-v4
  • Underlying Model: bedrock/cohere.embed-v4:0
  • Mode: Embedding
  • Max Input Tokens: 128,000
  • Cost per 1M Tokens (Input): $0.12

nova-2-embed-multimodal

  • Endpoint Location: Amazon Bedrock
  • API Model Name: nova-2-embed-multimodal
  • Underlying Model: bedrock/amazon.nova-2-multimodal-embeddings-v1:0
  • Mode: Embedding
  • Max Input Tokens: 8,172
  • Cost per 1M Tokens (Input): $0.14

titan-embed-image-v1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: titan-embed-image-v1
  • Underlying Model: bedrock/amazon.titan-embed-image-v1
  • Mode: Embedding
  • Max Input Tokens: 128
  • Cost per 1M Tokens (Input): $0.80

titan-embed-text-v1

  • Endpoint Location: Amazon Bedrock
  • API Model Name: titan-embed-text-v1
  • Underlying Model: bedrock/amazon.titan-embed-text-v1
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost per 1M Tokens (Input): $0.10

titan-embed-text-v2

  • Endpoint Location: Amazon Bedrock
  • API Model Name: titan-embed-text-v2
  • Underlying Model: bedrock/amazon.titan-embed-text-v2:0
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost per 1M Tokens (Input): $0.20

Google Vertex AI

Models hosted on Google Vertex AI are provided under enterprise agreements. Your data will not be used for training.

anthropic/claude-haiku

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-haiku
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

anthropic/claude-haiku-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-haiku-high
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

anthropic/claude-opus

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-opus
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

anthropic/claude-opus-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-opus-high
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

anthropic/claude-sonnet

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-sonnet
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

anthropic/claude-sonnet-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: anthropic/claude-sonnet-high
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

claude-3-5-haiku

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-3-5-haiku
  • Underlying Model: vertex_ai/claude-3-5-haiku@20241022
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use, PDF Input
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

claude-haiku

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-haiku
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

claude-haiku-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-haiku-4-5
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

claude-haiku-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-haiku-high
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

claude-opus

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

claude-opus-4-0

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus-4-0
  • Underlying Model: vertex_ai/claude-opus-4@20250514
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

claude-opus-4-1

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus-4-1
  • Underlying Model: vertex_ai/claude-opus-4-1@20250805
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

claude-opus-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus-4-5
  • Underlying Model: vertex_ai/claude-opus-4-5@20251101
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

claude-opus-4-6

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus-4-6
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

claude-opus-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-opus-high
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

claude-sonnet

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-sonnet
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

claude-sonnet-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-sonnet-4-5
  • Underlying Model: vertex_ai/claude-sonnet-4-5@20250929
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

claude-sonnet-4-6

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-sonnet-4-6
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

claude-sonnet-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: claude-sonnet-high
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-haiku-3-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-haiku-3-5
  • Underlying Model: vertex_ai/claude-3-5-haiku@20241022
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use, PDF Input
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

google/claude-haiku-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-haiku-4-5
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

google/claude-haiku-4-5-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-haiku-4-5-high
  • Underlying Model: vertex_ai/claude-haiku-4-5@20251001
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $5.00

google/claude-opus-4

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4
  • Underlying Model: vertex_ai/claude-opus-4@20250514
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

google/claude-opus-4-1

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-1
  • Underlying Model: vertex_ai/claude-opus-4-1@20250805
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

google/claude-opus-4-1-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-1-high
  • Underlying Model: vertex_ai/claude-opus-4-1@20250805
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

google/claude-opus-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-5
  • Underlying Model: vertex_ai/claude-opus-4-5@20251101
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

google/claude-opus-4-5-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-5-high
  • Underlying Model: vertex_ai/claude-opus-4-5@20251101
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

google/claude-opus-4-6

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-6
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

google/claude-opus-4-6-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-6-high
  • Underlying Model: vertex_ai/claude-opus-4-6@default
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $5.00
  • Cost per 1M Tokens (Output): $25.00

google/claude-opus-4-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-opus-4-high
  • Underlying Model: vertex_ai/claude-opus-4@20250514
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $75.00

google/claude-sonnet-4

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4
  • Underlying Model: vertex_ai/claude-sonnet-4@20250514
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-sonnet-4-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4-5
  • Underlying Model: vertex_ai/claude-sonnet-4-5@20250929
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-sonnet-4-5-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4-5-high
  • Underlying Model: vertex_ai/claude-sonnet-4-5@20250929
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-sonnet-4-6

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4-6
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-sonnet-4-6-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4-6-high
  • Underlying Model: vertex_ai/claude-sonnet-4-6@default
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/claude-sonnet-4-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/claude-sonnet-4-high
  • Underlying Model: vertex_ai/claude-sonnet-4@20250514
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output, Computer Use
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

google/deepseek-3.2

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/deepseek-3.2
  • Underlying Model: vertex_ai/deepseek-ai/deepseek-v3.2-maas
  • Mode: Chat
  • Max Input Tokens: 163,840
  • Max Output Tokens: 32,768
  • Capabilities: Tool Use, Reasoning, Prompt Caching
  • Cost per 1M Tokens (Input): $0.56
  • Cost per 1M Tokens (Output): $1.68

google/deepseek-r1

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/deepseek-r1
  • Underlying Model: vertex_ai/deepseek-ai/deepseek-r1-0528-maas
  • Mode: Chat
  • Max Input Tokens: 65,336
  • Max Output Tokens: 8,192
  • Capabilities: Tool Use, Reasoning, Prompt Caching
  • Cost per 1M Tokens (Input): $1.35
  • Cost per 1M Tokens (Output): $5.40

gemini-1.5-flash

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-1.5-flash
  • Underlying Model: vertex_ai/gemini-1.5-flash
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Structured Output
  • Cost per 1M Tokens (Input): $0.07
  • Cost per 1M Tokens (Output): $0.30

gemini-1.5-pro

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-1.5-pro
  • Underlying Model: vertex_ai/gemini-1.5-pro
  • Mode: Chat
  • Max Input Tokens: 2,097,152
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $5.00

gemini-2.0-flash

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.0-flash
  • Underlying Model: vertex_ai/gemini-2.0-flash
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Prompt Caching, Structured Output
  • Cost per 1M Tokens (Input): $0.10
  • Cost per 1M Tokens (Output): $0.40

gemini-2.0-flash-lite

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.0-flash-lite
  • Underlying Model: vertex_ai/gemini-2.0-flash-lite
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 8,192
  • Capabilities: Vision, Tool Use, Prompt Caching, Structured Output
  • Cost per 1M Tokens (Input): $0.07
  • Cost per 1M Tokens (Output): $0.30

gemini-2.5-flash

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-flash
  • Underlying Model: vertex_ai/gemini-2.5-flash
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.30
  • Cost per 1M Tokens (Output): $2.50

gemini-2.5-flash-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-flash-high
  • Underlying Model: vertex_ai/gemini-2.5-flash
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.30
  • Cost per 1M Tokens (Output): $2.50

gemini-2.5-flash-image

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-flash-image
  • Underlying Model: vertex_ai/google/gemini-2.5-flash-image
  • Cost: No cost

gemini-2.5-flash-lite

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-flash-lite
  • Underlying Model: vertex_ai/gemini-2.5-flash-lite-preview
  • Cost: No cost

gemini-2.5-pro

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-pro
  • Underlying Model: vertex_ai/gemini-2.5-pro
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gemini-2.5-pro-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-2.5-pro-high
  • Underlying Model: vertex_ai/gemini-3-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-3-flash

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3-flash
  • Underlying Model: vertex_ai/gemini-3-flash-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $3.00

gemini-3-flash-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3-flash-high
  • Underlying Model: vertex_ai/gemini-3-flash-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $3.00

gemini-3-pro

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3-pro
  • Underlying Model: vertex_ai/gemini-3-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-3-pro-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3-pro-high
  • Underlying Model: vertex_ai/gemini-3-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-3.1-pro

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3.1-pro
  • Underlying Model: vertex_ai/gemini-3.1-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-3.1-pro-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3.1-pro-high
  • Underlying Model: vertex_ai/gemini-3.1-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-embedding-001

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-embedding-001
  • Underlying Model: openai/gemini-embedding-001
  • Cost: No cost

gemini-flash

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-flash
  • Underlying Model: vertex_ai/gemini-3-flash-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $3.00

gemini-flash-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-flash-high
  • Underlying Model: vertex_ai/gemini-3-flash-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,535
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.50
  • Cost per 1M Tokens (Output): $3.00

gemini-flash-image

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-flash-image
  • Underlying Model: vertex_ai/google/gemini-2.5-flash-image
  • Cost: No cost

gemini-pro

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-pro
  • Underlying Model: vertex_ai/gemini-3.1-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-pro-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-pro-high
  • Underlying Model: vertex_ai/gemini-3.1-pro-preview
  • Mode: Chat
  • Max Input Tokens: 1,048,576
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

google/llama-4-maverick

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/llama-4-maverick
  • Underlying Model: vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas
  • Mode: Chat
  • Max Input Tokens: 1,000,000
  • Max Output Tokens: 1,000,000
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.35
  • Cost per 1M Tokens (Output): $1.15

google/codestral

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/codestral
  • Underlying Model: vertex_ai/codestral-2501
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 128,000
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.20
  • Cost per 1M Tokens (Output): $0.60

google/gpt-oss-120b

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/gpt-oss-120b
  • Underlying Model: vertex_ai/openai/gpt-oss-120b-maas
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 32,768
  • Capabilities: Reasoning
  • Cost per 1M Tokens (Input): $0.15
  • Cost per 1M Tokens (Output): $0.60

google/gpt-oss-120b-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/gpt-oss-120b-high
  • Underlying Model: vertex_ai/openai/gpt-oss-120b-maas
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 32,768
  • Capabilities: Reasoning
  • Reasoning Effort: high
  • Cost: No cost

google/gpt-oss-20b

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/gpt-oss-20b
  • Underlying Model: vertex_ai/openai/gpt-oss-20b-maas
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 32,768
  • Capabilities: Reasoning
  • Cost per 1M Tokens (Input): $0.07
  • Cost per 1M Tokens (Output): $0.30

google/gpt-oss-20b-high

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/gpt-oss-20b-high
  • Underlying Model: vertex_ai/openai/gpt-oss-20b-maas
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 32,768
  • Capabilities: Reasoning
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.07
  • Cost per 1M Tokens (Output): $0.30

google/qwen-3

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/qwen-3
  • Underlying Model: vertex_ai/qwen/qwen3-235b-a22b-instruct-2507-maas
  • Mode: Chat
  • Max Input Tokens: 262,144
  • Max Output Tokens: 16,384
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $1.00

google/qwen-3-coder

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/qwen-3-coder
  • Underlying Model: vertex_ai/qwen/qwen3-coder-480b-a35b-instruct-maas
  • Mode: Chat
  • Max Input Tokens: 262,144
  • Max Output Tokens: 32,768
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $4.00

google/glm-4.7

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/glm-4.7
  • Underlying Model: vertex_ai/zai-org/glm-4.7-maas
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 128,000
  • Capabilities: Tool Use, Reasoning
  • Cost per 1M Tokens (Input): $0.60
  • Cost per 1M Tokens (Output): $2.20

google/glm-5

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/glm-5
  • Underlying Model: vertex_ai/zai-org/glm-5-maas
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 128,000
  • Capabilities: Tool Use, Reasoning, Prompt Caching
  • Cost per 1M Tokens (Input): $1.00
  • Cost per 1M Tokens (Output): $3.20

google/kimi-k2-thinking

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/kimi-k2-thinking
  • Underlying Model: vertex_ai/moonshotai/kimi-k2-thinking-maas
  • Mode: Chat
  • Max Input Tokens: 256,000
  • Max Output Tokens: 256,000
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.60
  • Cost per 1M Tokens (Output): $2.50

google/minimax-m2

  • Endpoint Location: Google Vertex AI
  • API Model Name: google/minimax-m2
  • Underlying Model: vertex_ai/minimaxai/minimax-m2-maas
  • Mode: Chat
  • Max Input Tokens: 196,608
  • Max Output Tokens: 196,608
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $0.30
  • Cost per 1M Tokens (Output): $1.20
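
The Capabilities and cost fields can be combined to shortlist models for a task. The sketch below is one possible way to do that, using a hand-copied subset of the entries above. The `CATALOG` structure and the `models_with` helper are hypothetical and not part of any API described on this page.

```python
# Hand-copied subset of the Vertex AI entries above:
# name -> (capabilities, input USD per 1M tokens, output USD per 1M tokens).
CATALOG = {
    "google/deepseek-3.2": ({"Tool Use", "Reasoning", "Prompt Caching"}, 0.56, 1.68),
    "google/glm-4.7": ({"Tool Use", "Reasoning"}, 0.60, 2.20),
    "google/glm-5": ({"Tool Use", "Reasoning", "Prompt Caching"}, 1.00, 3.20),
    "google/kimi-k2-thinking": ({"Tool Use"}, 0.60, 2.50),
    "google/minimax-m2": ({"Tool Use"}, 0.30, 1.20),
}


def models_with(required, catalog=CATALOG):
    """Return names of models that list every capability in `required`,
    ordered by input price (cheapest first)."""
    matches = [
        (price_in, name)
        for name, (caps, price_in, _price_out) in catalog.items()
        if required <= caps  # set subset test
    ]
    return [name for _price, name in sorted(matches)]


print(models_with({"Tool Use", "Reasoning"}))
# prints ['google/deepseek-3.2', 'google/glm-4.7', 'google/glm-5']
```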

text-embedding-004

  • Endpoint Location: Google Vertex AI
  • API Model Name: text-embedding-004
  • Underlying Model: openai/text-embedding-004
  • Cost: No cost

gemini-3-pro-image

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-3-pro-image
  • Underlying Model: vertex_ai/gemini-3-pro-image-preview
  • Mode: Image Generation
  • Max Input Tokens: 65,536
  • Max Output Tokens: 32,768
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

gemini-pro-image

  • Endpoint Location: Google Vertex AI
  • API Model Name: gemini-pro-image
  • Underlying Model: vertex_ai/gemini-3-pro-image-preview
  • Mode: Image Generation
  • Max Input Tokens: 65,536
  • Max Output Tokens: 32,768
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $12.00

LBL IT Division

The IT Division’s Science IT group provides access to open-weight models running on Berkeley Lab-owned networks and hardware in the Building 50 data center. LBL-hosted models are free to use.

lbl/granite-docling

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/granite-docling
  • Underlying Model: hosted_vllm/granite-docling
  • Mode: Chat
  • Max Input Tokens: 8,192
  • Cost: No cost

Llama-4-Scout-17B-16E-Instruct

  • Endpoint Location: LBL IT Division
  • API Model Name: Llama-4-Scout-17B-16E-Instruct
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost

lbl/Llama-4-Scout-17B-16E-Instruct

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/Llama-4-Scout-17B-16E-Instruct
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost

lbl/llama

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/llama
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost

lbl/llama-4-scout

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/llama-4-scout
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost

meta/llama-4-scout

  • Endpoint Location: LBL IT Division
  • API Model Name: meta/llama-4-scout
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost
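
Several API model names in this section resolve to the same underlying model. The sketch below groups names by underlying model so that aliases are easy to spot. The `ALIASES` mapping is copied by hand from the entries above (it is not fetched from any API), and `group_by_underlying` is a hypothetical helper.

```python
from collections import defaultdict

# API model name -> underlying model, copied by hand from the entries above.
ALIASES = {
    "lbl/granite-docling": "hosted_vllm/granite-docling",
    "Llama-4-Scout-17B-16E-Instruct": "hosted_vllm/Llama-4-Scout-17B-16E-Instruct",
    "lbl/Llama-4-Scout-17B-16E-Instruct": "hosted_vllm/Llama-4-Scout-17B-16E-Instruct",
    "lbl/llama": "hosted_vllm/Llama-4-Scout-17B-16E-Instruct",
    "lbl/llama-4-scout": "hosted_vllm/Llama-4-Scout-17B-16E-Instruct",
    "meta/llama-4-scout": "hosted_vllm/Llama-4-Scout-17B-16E-Instruct",
}


def group_by_underlying(aliases):
    """Invert the mapping: underlying model -> sorted list of API model names."""
    groups = defaultdict(list)
    for api_name, underlying in aliases.items():
        groups[underlying].append(api_name)
    return {underlying: sorted(names) for underlying, names in groups.items()}


for underlying, names in group_by_underlying(ALIASES).items():
    print(underlying, "<-", ", ".join(names))
```

Going by the fields listed above, the five Llama 4 Scout names share the same limits and cost, so they appear interchangeable.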

gpt

  • Endpoint Location: LBL IT Division
  • API Model Name: gpt
  • Underlying Model: openai/gpt-5.2
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-oss-120b

  • Endpoint Location: LBL IT Division
  • API Model Name: gpt-oss-120b
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Cost: No cost

gpt-oss-120b-high

  • Endpoint Location: LBL IT Division
  • API Model Name: gpt-oss-120b-high
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Reasoning Effort: high
  • Cost: No cost

gpt-oss-20b

  • Endpoint Location: LBL IT Division
  • API Model Name: gpt-oss-20b
  • Underlying Model: hosted_vllm/gpt-oss-20b
  • Mode: Chat
  • Reasoning Effort: high
  • Cost: No cost

gpt-oss-20b-high

  • Endpoint Location: LBL IT Division
  • API Model Name: gpt-oss-20b-high
  • Underlying Model: hosted_vllm/gpt-oss-20b
  • Mode: Chat
  • Reasoning Effort: high
  • Cost: No cost

lbl/gpt-oss-120b

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/gpt-oss-120b
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Cost: No cost

lbl/gpt-oss-120b-high

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/gpt-oss-120b-high
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Reasoning Effort: high
  • Cost: No cost

lbl/gpt-oss-120b-medium

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/gpt-oss-120b-medium
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Reasoning Effort: medium
  • Cost: No cost

lbl/gpt-oss-20b

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/gpt-oss-20b
  • Underlying Model: hosted_vllm/gpt-oss-20b
  • Mode: Chat
  • Reasoning Effort: high
  • Cost: No cost

lbl/gpt-oss-20b-high

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/gpt-oss-20b-high
  • Underlying Model: hosted_vllm/gpt-oss-20b
  • Mode: Chat
  • Reasoning Effort: high
  • Cost: No cost

Nanonets-OCR

  • Endpoint Location: LBL IT Division
  • API Model Name: Nanonets-OCR
  • Underlying Model: openai/Nanonets-OCR
  • Cost: No cost

lbl/Nanonets-OCR

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/Nanonets-OCR
  • Underlying Model: openai/Nanonets-OCR
  • Cost: No cost

lbl/jbei-publications-chat

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/jbei-publications-chat
  • Underlying Model: openai/lbl/jbei-publications-chat
  • Mode: Chat
  • Cost: No cost

bge-m3

  • Endpoint Location: LBL IT Division
  • API Model Name: bge-m3
  • Underlying Model: openai/nomic-embed-text
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/bge-m3

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/bge-m3
  • Underlying Model: openai/nomic-embed-text
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/nomic-embed-text

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/nomic-embed-text
  • Underlying Model: openai/nomic-embed-text
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/nomic-embed-vision

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/nomic-embed-vision
  • Underlying Model: openai/nomic-embed-vision
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

nomic-embed-text

  • Endpoint Location: LBL IT Division
  • API Model Name: nomic-embed-text
  • Underlying Model: openai/nomic-embed-text
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

nomic-embed-vision

  • Endpoint Location: LBL IT Division
  • API Model Name: nomic-embed-vision
  • Underlying Model: openai/nomic-embed-vision
  • Mode: Embedding
  • Max Input Tokens: 8,192
  • Cost: No cost

text-embedding-ada-002

  • Endpoint Location: LBL IT Division
  • API Model Name: text-embedding-ada-002
  • Underlying Model: openai/text-embedding-ada-002
  • Mode: Embedding
  • Max Input Tokens: 8,191
  • Cost per 1M Tokens (Input): $0.10

codegemma

  • Endpoint Location: LBL IT Division
  • API Model Name: codegemma
  • Underlying Model: ollama/codegemma:2b
  • Mode: Code Completion
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/codegemma:2b

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/codegemma:2b
  • Underlying Model: ollama/codegemma:2b
  • Mode: Code Completion
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/cborg-chat

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-chat
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost

lbl/cborg-coder

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-coder
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Reasoning Effort: high
  • Cost: No cost

lbl/cborg-coder-base

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-coder-base
  • Underlying Model: ollama/codegemma:2b
  • Mode: Code Completion
  • Max Input Tokens: 8,192
  • Cost: No cost

lbl/cborg-deepthought

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-deepthought
  • Underlying Model: hosted_vllm/gpt-oss-120b
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Reasoning Effort: high
  • Cost: No cost

lbl/cborg-mini

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-mini
  • Underlying Model: hosted_vllm/gpt-oss-20b
  • Mode: Chat
  • Reasoning Effort: high
  • Cost: No cost

lbl/cborg-ocr

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-ocr
  • Underlying Model: openai/Nanonets-OCR
  • Cost: No cost

lbl/cborg-vision

  • Endpoint Location: LBL IT Division
  • API Model Name: lbl/cborg-vision
  • Underlying Model: hosted_vllm/Llama-4-Scout-17B-16E-Instruct
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 8,192
  • Cost: No cost
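
Several of the LBL-hosted names above resolve to the same underlying model. For example, lbl/cborg-coder, lbl/cborg-deepthought, and lbl/gpt-oss-120b-high all list hosted_vllm/gpt-oss-120b with reasoning effort high. The following is a minimal sketch of grouping API names by underlying model. It uses a hand-copied subset of the entries above; the `ALIASES` table and the `group_by_underlying` helper are illustrative names, not part of any service API.

```python
from collections import defaultdict

# Hand-copied subset of the LBL IT Division entries above:
# API model name -> (underlying model, reasoning effort or None if not listed).
ALIASES = {
    "lbl/gpt-oss-120b": ("hosted_vllm/gpt-oss-120b", None),
    "lbl/gpt-oss-120b-high": ("hosted_vllm/gpt-oss-120b", "high"),
    "lbl/gpt-oss-120b-medium": ("hosted_vllm/gpt-oss-120b", "medium"),
    "lbl/cborg-coder": ("hosted_vllm/gpt-oss-120b", "high"),
    "lbl/cborg-deepthought": ("hosted_vllm/gpt-oss-120b", "high"),
    "lbl/cborg-chat": ("hosted_vllm/Llama-4-Scout-17B-16E-Instruct", None),
    "lbl/cborg-vision": ("hosted_vllm/Llama-4-Scout-17B-16E-Instruct", None),
    "lbl/cborg-mini": ("hosted_vllm/gpt-oss-20b", "high"),
}


def group_by_underlying(aliases):
    """Return {underlying model: sorted list of API names that map to it}."""
    groups = defaultdict(list)
    for name, (underlying, _effort) in aliases.items():
        groups[underlying].append(name)
    return {underlying: sorted(names) for underlying, names in groups.items()}


groups = group_by_underlying(ALIASES)
for underlying, names in groups.items():
    print(underlying, "<-", ", ".join(names))
```

Grouping this way makes it easier to see when two names differ only in reasoning effort, and when they point at different models entirely.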

Microsoft Azure

Models hosted on Microsoft Azure are provided under enterprise agreements. Your data will not be used for training.

azure/gpt-oss-120b

  • Endpoint Location: Microsoft Azure
  • API Model Name: azure/gpt-oss-120b
  • Underlying Model: azure/gpt-oss-120b
  • Max Output Tokens: 131,072
  • Cost per 1M Tokens (Input): $0.15
  • Cost per 1M Tokens (Output): $0.60

azure/grok-3

  • Endpoint Location: Microsoft Azure
  • API Model Name: azure/grok-3
  • Underlying Model: azure/grok-3
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

azure/grok-3-mini

  • Endpoint Location: Microsoft Azure
  • API Model Name: azure/grok-3-mini
  • Underlying Model: azure/grok-3-mini
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $1.27

OpenAI

Models hosted on OpenAI are provided under enterprise agreements. Your data will not be used for training.

gpt-4.1

  • Endpoint Location: OpenAI
  • API Model Name: gpt-4.1
  • Underlying Model: openai/gpt-4.1
  • Mode: Chat
  • Max Input Tokens: 1,047,576
  • Max Output Tokens: 32,768
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $8.00

gpt-4.1-mini

  • Endpoint Location: OpenAI
  • API Model Name: gpt-4.1-mini
  • Underlying Model: openai/gpt-4.1-mini
  • Mode: Chat
  • Max Input Tokens: 1,047,576
  • Max Output Tokens: 32,768
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.40
  • Cost per 1M Tokens (Output): $1.60

gpt-4.1-nano

  • Endpoint Location: OpenAI
  • API Model Name: gpt-4.1-nano
  • Underlying Model: openai/gpt-4.1-nano
  • Mode: Chat
  • Max Input Tokens: 1,047,576
  • Max Output Tokens: 32,768
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.10
  • Cost per 1M Tokens (Output): $0.40

gpt-4o

  • Endpoint Location: OpenAI
  • API Model Name: gpt-4o
  • Underlying Model: openai/gpt-4o
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.50
  • Cost per 1M Tokens (Output): $10.00

gpt-4o-mini

  • Endpoint Location: OpenAI
  • API Model Name: gpt-4o-mini
  • Underlying Model: openai/gpt-4o-mini
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.15
  • Cost per 1M Tokens (Output): $0.60

gpt-5

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5
  • Underlying Model: openai/gpt-5
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5-chat

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-chat
  • Underlying Model: openai/gpt-5-chat-latest
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5-codex

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-codex
  • Underlying Model: openai/gpt-5-codex
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5-high

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-high
  • Underlying Model: openai/gpt-5
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5-mini

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-mini
  • Underlying Model: openai/gpt-5-mini
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $2.00

gpt-5-mini-high

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-mini-high
  • Underlying Model: openai/gpt-5-mini
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $2.00

gpt-5-nano

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-nano
  • Underlying Model: openai/gpt-5-nano
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.05
  • Cost per 1M Tokens (Output): $0.40

gpt-5-nano-high

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5-nano-high
  • Underlying Model: openai/gpt-5-nano
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $0.05
  • Cost per 1M Tokens (Output): $0.40

gpt-5.1

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1
  • Underlying Model: openai/gpt-5.1
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.1-chat

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1-chat
  • Underlying Model: openai/gpt-5.1-chat-latest
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.1-codex

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1-codex
  • Underlying Model: openai/gpt-5.1-codex
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.1-codex-max

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1-codex-max
  • Underlying Model: openai/gpt-5.1-codex-max
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.1-codex-mini

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1-codex-mini
  • Underlying Model: openai/gpt-5.1-codex-mini
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $2.00

gpt-5.1-high

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.1-high
  • Underlying Model: openai/gpt-5.1
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.2

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2
  • Underlying Model: openai/gpt-5.2
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-5.2-chat

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2-chat
  • Underlying Model: openai/gpt-5.2-chat-latest
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-5.2-codex

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2-codex
  • Underlying Model: openai/gpt-5.2-codex
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-5.2-high

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2-high
  • Underlying Model: openai/gpt-5.1
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.25
  • Cost per 1M Tokens (Output): $10.00

gpt-5.2-pro

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2-pro
  • Underlying Model: openai/gpt-5.2
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-5.2-xhigh

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.2-xhigh
  • Underlying Model: openai/gpt-5.2
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: xhigh
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-5.3-codex

  • Endpoint Location: OpenAI
  • API Model Name: gpt-5.3-codex
  • Underlying Model: openai/gpt-5.3-codex
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-chat

  • Endpoint Location: OpenAI
  • API Model Name: gpt-chat
  • Underlying Model: openai/gpt-5.2-chat-latest
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 16,384
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-codex

  • Endpoint Location: OpenAI
  • API Model Name: gpt-codex
  • Underlying Model: openai/gpt-5.3-codex
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.75
  • Cost per 1M Tokens (Output): $14.00

gpt-mini

  • Endpoint Location: OpenAI
  • API Model Name: gpt-mini
  • Underlying Model: openai/gpt-5-mini
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.25
  • Cost per 1M Tokens (Output): $2.00

gpt-nano

  • Endpoint Location: OpenAI
  • API Model Name: gpt-nano
  • Underlying Model: openai/gpt-5-nano
  • Mode: Chat
  • Max Input Tokens: 272,000
  • Max Output Tokens: 128,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $0.05
  • Cost per 1M Tokens (Output): $0.40

o1

  • Endpoint Location: OpenAI
  • API Model Name: o1
  • Underlying Model: openai/o1
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $60.00

o3

  • Endpoint Location: OpenAI
  • API Model Name: o3
  • Underlying Model: openai/o3
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $8.00

o3-high

  • Endpoint Location: OpenAI
  • API Model Name: o3-high
  • Underlying Model: openai/o3
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $2.00
  • Cost per 1M Tokens (Output): $8.00

o3-mini

  • Endpoint Location: OpenAI
  • API Model Name: o3-mini
  • Underlying Model: openai/o3-mini
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Tool Use, Reasoning, Prompt Caching, Structured Output
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40

o3-mini-high

  • Endpoint Location: OpenAI
  • API Model Name: o3-mini-high
  • Underlying Model: openai/o3-mini
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Tool Use, Reasoning, Prompt Caching, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40

o4-mini

  • Endpoint Location: OpenAI
  • API Model Name: o4-mini
  • Underlying Model: openai/o4-mini
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40

o4-mini-high

  • Endpoint Location: OpenAI
  • API Model Name: o4-mini-high
  • Underlying Model: openai/o4-mini
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40

o1-high

  • Endpoint Location: OpenAI
  • API Model Name: o1-high
  • Underlying Model: openai/o1
  • Mode: Chat
  • Max Input Tokens: 200,000
  • Max Output Tokens: 100,000
  • Capabilities: Vision, Tool Use, Reasoning, Prompt Caching, PDF Input, Structured Output
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $15.00
  • Cost per 1M Tokens (Output): $60.00

o1-mini

  • Endpoint Location: OpenAI
  • API Model Name: o1-mini
  • Underlying Model: openai/o1-mini
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Prompt Caching, PDF Input
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40

o1-mini-high

  • Endpoint Location: OpenAI
  • API Model Name: o1-mini-high
  • Underlying Model: openai/o1-mini
  • Mode: Chat
  • Max Input Tokens: 128,000
  • Max Output Tokens: 65,536
  • Capabilities: Vision, Prompt Caching, PDF Input
  • Reasoning Effort: high
  • Cost per 1M Tokens (Input): $1.10
  • Cost per 1M Tokens (Output): $4.40
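
Because prices are quoted per 1M tokens, the cost of a single request is each token count divided by 1,000,000 and multiplied by the matching rate. The following is a rough sketch of that arithmetic using prices copied from the entries above. The `PRICES` table and `estimate_cost` function are illustrative names, and the sketch ignores prompt-caching discounts and any other billing adjustments, which this listing does not specify.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the entries above: (input, output).
PRICES = {
    "gpt-4.1": (Decimal("2.00"), Decimal("8.00")),
    "gpt-5-nano": (Decimal("0.05"), Decimal("0.40")),
    "o1": (Decimal("15.00"), Decimal("60.00")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    return (
        input_price * input_tokens / TOKENS_PER_UNIT
        + output_price * output_tokens / TOKENS_PER_UNIT
    )


# Example: 10,000 input tokens and 2,000 output tokens on each model.
for model in PRICES:
    print(model, estimate_cost(model, 10_000, 2_000))
```

For the same request, the estimate is $0.036 on gpt-4.1, $0.0013 on gpt-5-nano, and $0.27 on o1.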

xAI

xai/grok-3

  • Endpoint Location: xAI
  • API Model Name: xai/grok-3
  • Underlying Model: xai/grok-3
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

xai/grok-3-mini

  • Endpoint Location: xAI
  • API Model Name: xai/grok-3-mini
  • Underlying Model: xai/grok-3-mini
  • Mode: Chat
  • Max Input Tokens: 131,072
  • Max Output Tokens: 131,072
  • Capabilities: Tool Use, Reasoning
  • Cost per 1M Tokens (Input): $0.30
  • Cost per 1M Tokens (Output): $0.50

xai/grok-4-0709

  • Endpoint Location: xAI
  • API Model Name: xai/grok-4-0709
  • Underlying Model: xai/grok-4-0709
  • Mode: Chat
  • Max Input Tokens: 256,000
  • Max Output Tokens: 256,000
  • Capabilities: Tool Use
  • Cost per 1M Tokens (Input): $3.00
  • Cost per 1M Tokens (Output): $15.00

xai/grok-4-1-fast

  • Endpoint Location: xAI
  • API Model Name: xai/grok-4-1-fast
  • Underlying Model: xai/grok-4-1-fast-non-reasoning
  • Mode: Chat
  • Max Input Tokens: 2,000,000
  • Max Output Tokens: 2,000,000
  • Capabilities: Vision, Tool Use, Structured Output
  • Cost per 1M Tokens (Input): $0.20
  • Cost per 1M Tokens (Output): $0.50

xai/grok-4-1-fast-reasoning

  • Endpoint Location: xAI
  • API Model Name: xai/grok-4-1-fast-reasoning
  • Underlying Model: xai/grok-4-1-fast-reasoning
  • Mode: Chat
  • Max Input Tokens: 2,000,000
  • Max Output Tokens: 2,000,000
  • Capabilities: Vision, Tool Use, Reasoning, Structured Output
  • Cost per 1M Tokens (Input): $0.20
  • Cost per 1M Tokens (Output): $0.50

xai/grok-code-fast-1

  • Endpoint Location: xAI
  • API Model Name: xai/grok-code-fast-1
  • Underlying Model: xai/grok-code-fast-1
  • Mode: Chat
  • Max Input Tokens: 256,000
  • Max Output Tokens: 256,000
  • Capabilities: Tool Use, Reasoning
  • Cost per 1M Tokens (Input): $0.20
  • Cost per 1M Tokens (Output): $1.50
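
The Capabilities and Max Input Tokens fields can be used to shortlist models for a task. The following is a small sketch of that filtering over the xAI entries above. The `XAI_MODELS` table is hand-copied from this listing, and `select_models` is an illustrative helper, not a function of any API.

```python
# Hand-copied from the xAI entries above:
# API model name -> (max input tokens, set of listed capabilities).
XAI_MODELS = {
    "xai/grok-3": (131_072, {"Tool Use"}),
    "xai/grok-3-mini": (131_072, {"Tool Use", "Reasoning"}),
    "xai/grok-4-0709": (256_000, {"Tool Use"}),
    "xai/grok-4-1-fast": (2_000_000, {"Vision", "Tool Use", "Structured Output"}),
    "xai/grok-4-1-fast-reasoning": (
        2_000_000,
        {"Vision", "Tool Use", "Reasoning", "Structured Output"},
    ),
    "xai/grok-code-fast-1": (256_000, {"Tool Use", "Reasoning"}),
}


def select_models(models, required, min_input_tokens=0):
    """Return sorted names of models that list every required capability
    and accept at least min_input_tokens of input."""
    required = set(required)
    return sorted(
        name
        for name, (max_input, capabilities) in models.items()
        if required <= capabilities and max_input >= min_input_tokens
    )


# Models listing Reasoning with an input limit of at least 200,000 tokens.
reasoning_long = select_models(XAI_MODELS, {"Reasoning"}, min_input_tokens=200_000)
# Models listing Vision, with no minimum on the input limit.
vision = select_models(XAI_MODELS, {"Vision"})
print(reasoning_long)
print(vision)
```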