Browse all available models. Pricing, capabilities, and provider info.
AionLabs · Chat
Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It …
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong a…
Replicate · Chat
Uncensored Flux Dev fine-tuned for NSFW content.
A relaxed, informal male voice for chatting with friends.
A noble, chivalrous young male voice for heroic tales.
wavespeed · Chat
Creates natural‑looking digital twin videos from text or audio with lip‑sync, captions, and background removal.
Replicate · Image
Remove image backgrounds with AI precision.
A cheerful, bubbly young female voice that radiates positivity.
wavespeed · Image
Creative upscaler. Price based on target megapixels (1–120 MP).
Premium clarity & sharpness. Price based on target megapixels (1–120 MP).
Flux‑tuned upscaler. Price based on target megapixels (1–120 MP).
Anthropic · Chat
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targete…
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in r…
This model always redirects to the latest model in the Anthropic Claude Haiku family.
Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of…
Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on com…
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning,…
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, a…
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that oper…
Fast-mode variant of Opus 4.6 - identical capabilities with higher output speed at premium 6x pricing.
Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the…
Fast-mode variant of Opus 4.7 - identical capabilities with higher output speed at premium 6x pricing.
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and f…
Fast-mode variant of Opus 4.8 - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4…
This model always redirects to the latest model in the Anthropic Claude Sonnet family.
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflow…
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and prof…
Cohere · Chat
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, too…
wavespeed · Video
65B parameter model for high‑quality video generation from a single image. Supports motion control and audio.
65B parameter text‑to‑image generator that produces highly coherent images. Supports prompt expansion.
wavespeed · Tools
Professional AI video upscaling. Cost per second of input video.
TheDrummer · Chat
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligenc…
Original DALL-E 2 from OpenAI for classic text-to-image generation.
High-fidelity image generation with detailed prompt adherence.
Replicate · Tools
A deep, authoritative male voice ideal for professional presentations.
· Chat
Specialized chat model from DeepSeek, optimized for conversation.
DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tok…
DeepSeek · Chat
May 28th update to the original DeepSeek R1. Performance on par with OpenAI o1, but open-sourced and with fully open reas…
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.…
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwe…
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of …
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from …
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinki…
Nex AGI · Chat
DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent aut…
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original…
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and ag…
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and …
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B a…
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated par…
Ariel Replicate · Image
Colorize black and white images.
Mistral AI · Chat
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter …
A serene, calm female voice in Indonesian.
An energetic, uplifting young female voice full of motivation.
Baidu · Chat
ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE …
Baidu's text-to-image model. Supports English, Chinese, and Japanese.
Fast 8-step distilled ERNIE image generation.
A polite, well-mannered young male voice for formal settings.
High-quality uncensored Flux model with JibMix style.
Professional Flux model with excellent prompt adherence.
Flexible model for various styles.
Lightweight and fast Flux edit model for quick modifications.
Fast and efficient text-to-image generation using Flux.2 Klein Base.
Advanced image editing with improved prompt understanding.
Professional image editing with aspect ratio control.
12B parameter model, supports img2img with prompt strength.
Fastest Flux endpoint, optimized by PrunaAI.
Black Forest Labs · Image
Professional inpainting to remove objects.
Premium text-based image editing with max performance.
Text-based image editing with natural language.
Fast Flux model for quick iterations.
Google · Chat
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mat…
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cos…
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientifi…
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and c…
Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. …
Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemi…
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, impro…
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing o…
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tie…
This model always redirects to the latest model in the Google Gemini Flash family.
This model always redirects to the latest model in the Google Gemini Pro family.
Restore old, damaged, or low-quality faces with AI.
Z.AI · Chat
GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-E…
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applica…
GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) ar…
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has be…
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across i…
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more…
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further opt…
Z.AI · Chat
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. …
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workf…
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments suc…
GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven ta…
OpenAI · Chat
This model always redirects to the latest model in the OpenAI GPT family.
GPT Chat Latest points to OpenAI's stable API alias chat-latest that always resolves to the latest Instant chat model us…
GPT Image 1 text-to-image generation. Price varies by quality and resolution.
Native GPT-Image editing with text instructions.
Maximum detail & photorealism variant of GPT Image 1.
Lightweight GPT Image Mini for fast drafts.
Fast GPT Image Mini editing for quick modifications.
GPT Image 1.5 with better typography and quality.
Enhanced GPT Image 1.5 editing.
OpenAI GPT Image 2 text-to-image. Quality & resolution based pricing.
OpenAI GPT Image 2 editing. Modify images with text prompts.
This model always redirects to the latest model in the OpenAI GPT Mini family.
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for c…
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintai…
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs wi…
GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and ex…
GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instru…
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, a…
GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development …
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance co…
GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retai…
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is des…
GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more…
GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of…
GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ toke…
GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput wor…
GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and …
GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabili…
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reason…
GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workload…
A gentle, soothing female voice ideal for relaxation and meditation.
IBM · Chat
Granite-4.0-H-Micro is a 3B parameter model from the Granite 4 family of models. These models are the latest in a series of mo…
Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It …
xAI's first image model based on Aurora architecture. Exceptional photorealism.
xAI · Chat
Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines t…
Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents…
Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic wor…
Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports t…
High-realism image generation from xAI.
xAI's flagship text-to-image model with exceptional aesthetic quality.
xAI's high-quality image editing. Modify existing images with text instructions.
xAI's image editing model. Precise control over edits with text instructions.
Replicate · Video
xAI's flagship text-to-video model.
Turns a static image into a video with realistic motion and synchronized audio. Supports motion prompts, multiple resolu…
High quality image-to-video with fine details.
Faster version of Hailuo 2.3.
Alibaba's text-to-video model.
A sophisticated, refined male voice for high‑end presentations.
Nous Research · Chat
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m…
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistra…
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hy…
Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid …
Performance-optimized text-to-image with fast generation.
17B parameter open-source model with state-of-the-art quality and speed.
Tencent · Chat
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parame…
Tencent · 3D
Generate 3D models from text or images.
Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use…
Best‑in‑class for typography and structured design. Supports up to 2K resolution.
Second generation model for graphic design.
Fast version of V2.
Balance between speed and text quality.
Highest quality Ideogram with excellent typography.
Fast text generation under 5 seconds.
Google's latest text-to-image model with high detail.
Optimized for low latency with Imagen 3 quality.
Google's flagship text-to-image model, high quality.
Fast version of Imagen 4, slightly lower quality.
Ultra high-quality image generation from Google.
Prime Intellect · Chat
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervise…
AI21 · Chat
Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following,…
A warm, approachable male voice perfect for everyday conversation.
Specialized NSFW model for anime and semi-realistic styles.
Moonshot AI · Chat
This model always redirects to the latest model in the MoonshotAI Kimi family.
Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion…
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long…
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE)…
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-dire…
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX gener…
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end progr…
Ultimate quality, best for final renders.
Balanced quality and speed, 720p up to 10s.
Blazing fast, high quality 1080p.
Next-gen model with optional native audio.
Multimodal video model with advanced audio understanding.
Kling V3.0 4K text-to-video.
Higher‑capacity variant; excellent for photorealistic output and natural lighting.
Standard Krea medium model with good quality and creativity control.
Fast text‑to‑image with basic style and aspect‑ratio control.
RainSpeed · Chat
This is a fake model that always fails.
A firm, resolute male voice that conveys confidence and purpose.
Add motion to static images.
Sao10K · Chat
Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, …
Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the success…
Meta · Chat
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and tex…
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as s…
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process…
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B…
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-exper…
Nvidia · Chat
Free embedding model with vision-language understanding.
Leonardo.ai's all-purpose text-to-image model.
Google · Audio
Generates short music clips up to 30 seconds. Ideal for quick prototyping.
Generates full songs up to 3 minutes with detailed structure and high‑quality vocals.
Anthracite · Chat
This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet(https://o…
High‑resolution text‑to‑image with support for multiple aspect ratios.
Text‑based image editing with optional image input. Supports localized edits and text rendering.
An assertive, confident female voice in Indonesian.
Inception · Chat
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens…
A light, airy female voice, great for children's content.
Xiaomi · Chat
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309…
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference…
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex softwar…
Focus on Asian aesthetics and speed.
MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and express…
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex …
MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous i…
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a …
Replicate · Audio
Creates songs up to 4 minutes. 2 free trials!
Full song generation with rich instrumentation and natural vocals.
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to …
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision ca…
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capa…
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-availabl…
Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with t…
Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active p…
Morph · Chat
Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The mod…
Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transform…
Image editing with img2img support.
Advanced image editing with multi-image input support.
Edit images with Nano Banana Pro.
Ultra high-resolution image editing with Nano Banana Pro.
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers…
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute ef…
A serene, spiritual female voice with a calm and measured tone.
Amazon · Chat
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos t…
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon focused on fast processing of image, video, an…
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of model…
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the bes…
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and…
Creative-focused model with minimal restrictions. Explore diverse artistic styles.
A calm, deliberate male speaker, excellent for educational content.
The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 …
The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasonin…
High-quality image generation with strong prompt adherence.
In-painting and out-painting editing for precise control.
High-quality image upscaling up to 8 MP.
Generate realistic avatar videos from a single reference image and audio/script.
wavespeed · Video
Ultra-efficient video generation powered by PrunaAI optimization.
Swaps the character in a source video while preserving background, lighting, camera motion, and audio.
Fast and efficient text-to-video generation. Draft mode available at 4x cheaper.
A calm, authoritative male leader voice in Indonesian.
Perceptron · Chat
Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It a…
Microsoft · Chat
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficientl…
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - wit…
Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-la…
Fast generation with good quality.
Enhanced with motion control and multi-image fusion.
Realistic NSFW model based on Pony Diffusion.
Community model fine-tuned for photorealism and anime.
Model from Alibaba, very economical for daily use.
Alibaba · Chat
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear …
Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and v…
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention me…
Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video inp…
Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It fea…
Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion act…
Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video inp…
Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture…
Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts ro…
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for age…
Qwen3.7-Plus is a cost‑effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, b…
Multimodal model with excellent text rendering.
Alibaba's 7B text-to-image foundation model. Excellent photorealism and typography.
Alibaba's 7B unified image editing model. Edit existing images with text instructions.
Pro version of Qwen Image 2.0 Edit with higher quality and better detail preservation.
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced perfo…
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost c…
Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - …
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - S…
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Cod…
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capa…
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning an…
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forw…
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-…
Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for c…
Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring …
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning a…
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks …
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt…
Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful c…
Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It…
Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agen…
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that re…
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo…
Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” tr…
Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understan…
Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across i…
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images…
Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images…
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and re…
Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visua…
Generate stunning posters in RPS style.
Upscale images up to 4x with AI enhancement.
Legacy version, more affordable.
Standard version for fast design generation.
Professional design-focused image generation.
Generate detailed editable SVG vector graphics.
Utility for image editing, inpainting, and object modification using Recraft V4.1.
Standard Recraft V4.1 text-to-image model for fast design generation.
Generate native SVG vector graphics from text prompts.
Pro version of Recraft V4.1 edit utility for professional inpainting and editing.
Professional grade raster image generation with Recraft V4.1 Pro.
Professional vectorization model for high-quality scalable vector art.
Relace · Chat
Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can…
The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant fil…
A soft‑spoken, gentle female voice in Indonesian.
Inclusion AI · Chat
Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that…
openrouter · Chat
Quick generation with decent quality.
Preview version, fast results.
Maximum quality preview.
Professional grade image generation.
Balanced speed and quality.
Fast generation with good quality – ideal for drafts and rapid iteration.
Professional grade image generation with improved detail and consistency.
Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary w…
A compassionate, caring male voice in Indonesian.
A wise, mature female voice, perfect for storytelling and narration.
A cute, sweet young female voice in Indonesian.
8B parameter multimodal diffusion transformer.
Stable Diffusion XL, versatile and powerful.
Ultra-fast 4-step generation, high quality.
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and ada…
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual unders…
Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities…
Lightweight, fast, and affordable.
Professional grade with higher detail.
Faster Pro version.
Multimodal video model with support for up to 9 reference images.
Fast I2V using last image. Cost varies by resolution and duration.
Turbo I2V using last reference image. Cost varies by resolution and duration.
Aggressively styled variant of the fast I2V model with enhanced motion dynamics.
Fast T2V generation, balanced quality & speed. Cost varies by resolution and duration.
Ultra-fast T2V with Turbo acceleration. Cost varies by resolution and duration.
Standard Seedance 2.0 I2V with Turbo speed. Cost varies by resolution and duration.
High‑energy, stylized variant of the standard model with enhanced motion dynamics.
Standard Seedance 2.0 T2V. Cost varies by resolution and duration.
Standard Seedance 2.0 T2V with Turbo speed. Cost varies by resolution and duration.
Stable text-to-image with good quality.
ByteDance's advanced image model with aspect ratio control.
Older Seedream 4 image editing model.
ByteDance's latest text-to-image model.
Edit images using the latest Seedream 4.5 model.
Generate a sequence of edited images with Seedream 4.5.
Latest ByteDance model with multi-step reasoning.
wavespeed · Audio
Generates sound effects synced to video, up to 60 seconds. Returns the video with a new audio track.
Fast I2V generation with support for motion prompts, sound effects, and multi‑frame guidance.
Uses images or short video clips as style/motion references; strong character and scene consistency.
Fast T2V with native audio output, flexible resolutions, and optional sound effects.
Perplexity · Chat
Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources…
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across compl…
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing…
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic searc…
A sweet, pleasant young female voice perfect for audiobooks.
Alias to Sora 2 Pro I2V. Price varies by resolution and duration.
Standard Sora 2 image-to-video. $0.10 per second.
OpenAI's highest quality image-to-video. Price varies by resolution and duration.
OpenAI's highest quality text-to-video. Price varies by resolution and duration.
Alias to Sora 2 Pro T2V. Price varies by resolution and duration.
Standard Sora 2 text-to-video. $0.10 per second.
Original Stable Diffusion model.
StepFun · Chat
Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) archit…
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter langua…
Generate stickers with transparent backgrounds.
Switchpoint · Chat
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. …
Arcee AI · Chat
Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance …
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, we…
Lip-sync animation from image and audio.
Automatically adds styled, burned‑in subtitles to any video. No styling or timing work needed.
Google's premium text-to-video model.
Latest Veo with optional audio generation.
Faster Veo 3, slightly lower quality but great speed.
Refined Veo 3.1 with improved prompt adherence and JSON support.
Fast version of Veo 3.1 with JSON prompt support.
Generate video from reference images using the fastest Veo 3.1 model.
A commanding, regal female voice that exudes authority.
Vidu Studio's advanced Q3 Pro model for consistent character video generation.
Lightweight video model, 5 seconds 480p. Fast and economical.
Optimized Wan 2.1 for image-to-video at 480p.
Optimized fast image-to-video with interpolated frames.
Image-to-video with precise motion control.
Advanced text-to-video with high quality output.
Alibaba's Wan 2.6 image editing model.
Professional‑grade image‑to‑video with multi‑modal input (images, text, audio).
High‑energy variant tuned for bold motion and lively output.
Next-gen text-to-image with high quality output.
Professional grade image generation with advanced controls.
8.9B parameter model built on FLUX.1-schnell. Ultra-fast with unique visual style.
Generate consistent characters from text prompts. Ideal for storytelling and branding.
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared t…
Super fast 6B parameter model, sub-second generation.
Ultra-fast 6B parameter image generation.
An outgoing, lively female voice full of energy and enthusiasm.