OpenAI's high-intelligence flagship model for complex, multi-step tasks. GPT-4o is cheaper and faster than GPT-4 Turbo.
OpenAI's affordable and intelligent small model for fast, lightweight tasks; it is cheaper and more capable than GPT-3.5 Turbo.
OpenAI's previous high-intelligence model, optimized for chat but also well suited to traditional completions tasks.
OpenAI's cost-efficient reasoning model excels at STEM, especially math and coding, nearly matching the performance of OpenAI o1 on evaluation benchmarks such as AIME and Codeforces.
The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.
Claude 3.5 Sonnet is a high-speed, cost-effective model offering industry-leading performance in reasoning, knowledge, and coding. It operates twice as fast as its predecessor. Key features include enhanced humor and nuance understanding, advanced coding capabilities, and strong visual reasoning.
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions.
Anthropic's Claude 3 Opus can handle complex analysis, longer tasks with multiple steps, and higher-order math and coding tasks. It provides top-level performance, intelligence, fluency, and understanding.
Anthropic's Claude 3 Haiku outperforms models in its intelligence category on performance, speed, and cost without the need for specialized fine-tuning. It is Anthropic's fastest and most compact model, built for near-instant responsiveness.
The Gemini 1.5 Pro is a cutting-edge multimodal AI model developed by Google DeepMind. It excels in processing and understanding text, images, audio, and video, featuring a breakthrough long context window of up to 1 million tokens. This model powers generative AI services across Google's platforms and supports third-party developers.
Gemini 1.5 Flash is a cutting-edge multimodal AI model known for its speed and efficiency. It excels in tasks like visual understanding and classification, featuring a long context window of up to one million tokens. This model is optimized for high-volume, high-speed applications, making it a significant advancement in AI technology.
Grok-2 is xAI's frontier language model with state-of-the-art reasoning capabilities.
Meta's Llama 3.1 is a collection of multilingual large language models (LLMs): pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks. The 405B model is the most capable in the Llama 3.1 family.
Meta's Llama 3.1 is a collection of multilingual large language models (LLMs): pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
Instruction-tuned image reasoning model with 90B parameters from Meta. Optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The model can understand visual data such as charts and graphs, and it bridges the gap between vision and language by generating text that describes image details.
Mixtral 8x7B is a Sparse Mixture of Experts (SMoE) model developed by Mistral AI. It features a decoder-only architecture with 8 expert networks per MLP layer, enabling efficient processing of natural language tasks like text classification and generation. This innovative model excels in multilingual and domain-specific applications, offering cutting-edge performance in AI language modeling.
Mistral Large is a cutting-edge language model developed by Mistral AI, renowned for its advanced reasoning capabilities. It excels in multilingual tasks, code generation, and complex problem-solving, making it ideal for diverse text-based applications.
Pixtral Large is a 124B open-weights multimodal model built on top of Mistral Large 2. The model is able to understand documents, charts and natural images.
Gemma 2 is a state-of-the-art, lightweight open model developed by Google, available in 9 billion and 27 billion parameter sizes. It offers enhanced performance and efficiency, building on the technology used in the Gemini models. Designed for a wide range of applications, Gemma 2 excels in text-to-text tasks, making it a versatile tool for developers.
The Perplexity Sonar Online model is a state-of-the-art large language model developed by Perplexity AI. It offers real-time internet access, ensuring up-to-date information retrieval. Known for its cost-efficiency, speed, and enhanced performance, it surpasses previous models in the Sonar family, making it ideal for dynamic and accurate data processing.
The new Command R+ model delivers roughly 50% higher throughput and 25% lower latencies compared to the previous Command R+ version, while keeping the hardware footprint the same.
DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.
Qwen2.5 is a model pretrained on a large-scale dataset of up to 18 trillion tokens, offering significant improvements in knowledge, coding, mathematics, and instruction following compared to its predecessor Qwen2. The model also features enhanced capabilities in generating long texts, understanding structured data, and generating structured outputs, while supporting multilingual capabilities for over 29 languages.
QwQ is an experimental research model developed by the Qwen Team, designed to advance AI reasoning capabilities. This model embodies the spirit of philosophical inquiry, approaching problems with genuine wonder and doubt. QwQ demonstrates impressive analytical abilities, achieving scores of 65.2% on GPQA, 50.0% on AIME, 90.6% on MATH-500, and 50.0% on LiveCodeBench, combining its contemplative approach with exceptional performance on complex problems.
The Yi Large model was designed by 01.AI with the following use cases in mind: knowledge search, data classification, human-like chatbots, and customer service. It stands out for its multilingual proficiency, particularly in Spanish, Chinese, Japanese, German, and French.
DBRX is a state-of-the-art, transformer-based, decoder-only large language model developed by Databricks. It features a Mixture-of-Experts (MoE) architecture with 132 billion parameters, designed for efficient next-token prediction. Released in 2024, it outperforms many open-source models on standard benchmarks.
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art open-source models.