mimo-v2-flash
Xiaomi's high-efficiency inference model with a hybrid architecture, three multi-token prediction (MTP) layers for 2.5-3.7x faster inference, and a 256K context window.
LLM Gateway routes each request to the best provider that can handle your prompt size and parameters.
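To show what calling this model through the gateway might look like, here is a minimal Python sketch of a chat completion against mimo-v2-flash, assuming the gateway exposes an OpenAI-compatible /chat/completions endpoint. The base URL, the LLM_GATEWAY_API_KEY environment variable, and the response shape are assumptions for illustration, not the gateway's documented values.

```python
# Minimal sketch: one chat completion routed through the gateway.
# Assumes an OpenAI-compatible API; base URL and env var are placeholders.
import os
import requests

BASE_URL = "https://api.llm-gateway.example/v1"  # placeholder gateway URL

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LLM_GATEWAY_API_KEY']}"},
    json={
        "model": "mimo-v2-flash",  # model slug from this page
        "messages": [
            {"role": "user", "content": "Please introduce yourself before we start."}
        ],
        "max_tokens": 256,
    },
    timeout=60,
)
response.raise_for_status()
# Assumes the standard OpenAI-compatible response layout.
print(response.json()["choices"][0]["message"]["content"])
```

The gateway picks the serving provider behind this call; the client only names the model slug and its own parameters.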