At a glance

MiniMaxpricing, performance & catalog

The citable facts about MiniMax's 9 models — sourced from provider APIs and refreshed continuously.

Lowest price
MiniMax M2.7 at $0.300 per 1M input tokens
Highest throughput
MiniMax M2.5 at 100 tokens/s
Lowest latency
MiniMax M2.5 at 3.00s
Largest context
MiniMax M3 at 1.0M tokens
Catalog
9 active models from 1 organization

FAQ

Common questions about MiniMax.

What is MiniMax?

MiniMax is an API provider that hosts large language models. Active models: 9; From (input): $0.30 / 1M tok; Avg throughput: 93 tok/s; Avg latency: 3.25 s; Max context: 1.0M.

How many models does MiniMax offer?

MiniMax currently serves 9 active models out of 9 historical offerings on LLM Stats.

What is MiniMax's API pricing?

MiniMax input pricing starts from $0.30 per 1M tokens, with the most expensive offering at $0.6 per 1M tokens. See the Pricing tab above for the full per-model breakdown.

How fast is MiniMax?

MiniMax averages 93 output tokens per second across its catalog, with average latency of 3.25s. Per-model performance is shown in the Performance tab.

Is MiniMax OpenAI compatible?

Most providers expose an OpenAI-compatible /v1/chat/completions endpoint so you can switch from OpenAI to MiniMax by changing only the base URL and API key. Check https://platform.minimax.io for the exact endpoint format and any provider-specific parameters.

Does MiniMax support multimodal models?

Yes. MiniMax's catalog includes 2 vision-capable and 4 audio models. See the Models and Capabilities tabs for the full per-model breakdown.

Whose models does MiniMax host?

MiniMax hosts models from MiniMax. See the Models tab for the full catalog grouped by creator.

How do I start using MiniMax?

Sign up at https://platform.minimax.io to get an API key, then call MiniMax's API directly from your application. Most clients work out of the box by pointing the OpenAI SDK at MiniMax's base URL with your key. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.