MiMo V2 Flash

Xiaomi's high-efficiency inference model with a hybrid architecture, three multi-token prediction (MTP) layers for 2.5-3.7x faster inference, and a 256K context window.

mimo-v2-flash
Status: Stable
Context: 256,000 tokens
Starting at $0.10/M input tokens
Starting at $0.30/M output tokens
Features: Streaming, Tools, Reasoning, JSON Output
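The features above map onto request parameters. A minimal sketch of a request body, assuming the gateway exposes the common OpenAI-compatible chat-completions schema (the field names and endpoint shape are assumptions, not confirmed by this page):

```python
import json

# Hypothetical request body for mimo-v2-flash, assuming an OpenAI-compatible
# chat-completions schema. Only the model ID and feature list come from this
# page; the field names are the conventional ones.
payload = {
    "model": "mimo-v2-flash",  # model ID from this page
    "messages": [
        {"role": "user", "content": "List three prime numbers as JSON."}
    ],
    "stream": True,  # Streaming is a listed feature
    "response_format": {"type": "json_object"},  # JSON Output feature
}

body = json.dumps(payload)
print(body)
```

Reasoning and tool use would be enabled the same way, via the schema's `tools` and reasoning-related parameters, if the gateway follows that convention.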


All Providers for MiMo V2 Flash

LLM Gateway routes each request to the provider best able to handle your prompt size and parameters.

Xiaomi
Context: 256K
Input: $0.10/M tokens
Cached input: $0.01/M tokens
Output: $0.30/M tokens