Models

Every model card carries its evidence limits.

A model's exact version is treated as unknown unless the source artifacts document it independently.

Claim Evidence

The model index links evidence-level claims to release artifacts before showing model rows.

Claim | Evidence
Model cards are sorted alphabetically and carry evidence levels, not leaderboard ranks. | Model catalog, Truth gate
Model version uncertainty is a visible limitation unless independently documented. | Limitations, Model roster preflight
Evidence levels are model-output evidence levels, not human or external validation. | Human status, External status
Model | Provider | Run | Completion | Parse | Evidence | Caveat
Claude Haiku 4.5 | anthropic | jn70fpqyr7an1bca1cn7fq93ys864cx0 | 100% | 100% | Level 2 | current Anthropic low-latency paid route
Claude Opus 4.5 | anthropic | jn7839n5vcfsf5zsyqg0098rwd864xxv | 100% | 100% | Level 2 | legacy Anthropic Opus comparison route
Claude Opus 4.7 | anthropic | jn7bedafgemk6hecfqtpd6e309864xhq | 100% | 100% | Level 2 | latest Anthropic Opus paid route
Claude Sonnet 4.6 | anthropic | jn74qyaygktq550zw4metb3xt5864hfv | 100% | 100% | Level 2 | current Anthropic Sonnet paid route
DeepSeek V3.2 | deepseek | jn76nae0e9j4pqakz7zwtj1yn186abyv | 100% | 100% | Level 2 | recent DeepSeek reasoning and agentic paid route
DeepSeek V4 Flash | deepseek | jn77m1pvwyaed4n2v6nb6btn1s864kct | 100% | 100% | Level 2 | latest available DeepSeek V4 route with healthy provider capacity
DeepSeek V4 Pro | deepseek | jn7aybgpr67x8zswpmegfqytyx869wkp | 100% | 100% | Level 2 | latest DeepSeek V4 Pro paid route
Devstral 2512 | mistralai | jn793dym6gfm0tssrp6mgh8es986b8nq | 100% | 100% | Level 2 | live completed Convex full-suite run
Gemini 2.0 Flash | google | jn77f88qts7had7ywj89ncd1yd86715s | 100% | 100% | Level 2 | older Google Flash route for generational comparison
Gemini 2.0 Flash Lite | google | jn71419q7s8pmwrg8y9095xx9n867qp2 | 100% | 100% | Level 2 | older ultra-cheap Google Flash Lite route
Gemini 2.5 Flash | google | jn7367rpjc0ar1m1mcpkwq1ahs867sk5 | 100% | 100% | Level 2 | cheap Google Flash route for comparison against Gemini 3 Flash
Gemini 2.5 Flash Lite | google | jn7a7eaaja7pmzfc76pq7syqc18679yc | 100% | 100% | Level 2 | cheap Google baseline route even though newer Gemini 3 routes are already covered
Gemini 3 Flash Preview | google | jn72d06xfqwj8pds5qgdq6t2gs8623kp | 100% | 100% | Level 2 | current Google Flash paid route
Gemini 3.1 Flash Lite Preview | google | jn7ez731knc67nfs7gfshenwhd86777p | 100% | 100% | Level 2 | current Google efficient preview paid route
Gemini 3.1 Pro Preview | google | jn74r11x1a5denvej7jbyc4p8h8633y8 | 100% | 100% | Level 2 | current Google Pro preview paid route
Gemma 3 12B | google | jn794e1cgp5ecmz53cx08v2vhx866t4w | 100% | 100% | Level 2 | Gemma 3 mid-size open-model comparison route
Gemma 3 27B | google | jn79cccx49g16xgxtftyamstsx866ay9 | 100% | 100% | Level 2 | Gemma 3 large open-model comparison route
Gemma 3 4B | google | jn74gs2nnjg5q8n7ysvsfmdhzh8662ad | 100% | 100% | Level 2 | Gemma 3 small open-model comparison route
Gemma 4 26B A4B | google | jn770jvh5bx4s74v63yhxmnxnh865vgq | 100% | 100% | Level 2 | newer compact Google Gemma 4 route with public interest
Gemma 4 31B | google | jn7fdzd95ngzbwn6j42yfs5kzx864qrk | 100% | 100% | Level 2 | recent Google open model route with strong public interest
GLM 4.7 | z-ai | jn7a14xnb15xckyfvyy41q6k4s86bhv1 | 100% | 100% | Level 2 | larger GLM 4.7 route to compare against GLM 4.7 Flash and GLM 5
GLM 5 | z-ai | jn77hqg7j6vmamae2r3hwnv1t1869rww | 100% | 100% | Level 2 | current Z.ai GLM route with strong open-model benchmark interest
GLM 5.1 | z-ai | jn77es7pyamhprdbm0bb3dntz1869ydp | 100% | 100% | Level 2 | latest Z.ai flagship paid route
GPT OSS 120B | openai | jn74fwhmj5ehh9xmb5jy2rrxqx868gse | 100% | 100% | Level 2 | OpenAI open-weight route people will expect to see benchmarked
GPT OSS 20B | openai | jn7f62k61w8er0kyjr36fpph0n862d85 | 100% | 100% | Level 2 | small OpenAI open-weight route for efficient comparison coverage
GPT-4.1 Mini | openai | jn7ae2rzspfcdav901hm50bf71868yjf | 100% | 100% | Level 2 | cheap OpenAI workhorse
GPT-4.1 Nano | openai | jn70ff7ys17t6z347339a788kh868hka | 100% | 100% | Level 2 | ultra-cheap OpenAI baseline
GPT-5.1 | openai | jn7fbr2nfw808e81z8aszvp391864cr9 | 100% | 100% | Level 2 | legacy OpenAI flagship-generation comparison route
GPT-5.4 | openai | jn78dxdtvkys549w6ad5sfh6vh863tbm | 100% | 100% | Level 2 | latest OpenAI flagship paid route
GPT-5.4 Mini | openai | jn74tnhtxnqj7q7b7pqmg5a7nx863xhv | 100% | 100% | Level 2 | current OpenAI efficient paid route
GPT-5.5 | openai | jn7frn897xwxymwwpnbck45ejn8626rf | 100% | 100% | Level 2 | latest OpenAI flagship paid route
Granite 4.1 8b | ibm-granite | jn73khjf12dxsp17a9t1eg68ks863rwm | 100% | 100% | Level 2 | live completed Convex full-suite run
Grok 3 Mini | x-ai | jn7ftf922rmzy7k0ad1m8e18h5866wed | 100% | 100% | Level 2 | cheap xAI baseline for compact-model compass comparison
Grok 4 Fast | x-ai | jn7dsp0pxw3zk1yhe7swg846kh8624vq | 100% | 100% | Level 2 | popular xAI low-cost flagship-family route
Grok 4.1 Fast | x-ai | jn7322gpqzvnrj0p808perdjrn867ssk | 100% | 100% | Level 2 | popular current xAI fast paid route
Grok 4.20 | x-ai | jn7bpp64vsa9s9n3fj3g0mkdb18677j3 | 100% | 100% | Level 2 | latest xAI paid route
Grok 4.3 | x-ai | jn7cfwkqn38wj9715mw02tdxxh8630yc | 100% | 100% | Level 2 | live completed Convex full-suite run
Grok Code Fast 1 | x-ai | jn75acm3ttqh3n44gzgafkfqm58660ee | 100% | 100% | Level 2 | cheap xAI specialist route, useful as a weird compass comparison
Kimi K2.5 | moonshotai | jn7fgjneas7stn448cbt35fbcs8690g9 | 100% | 100% | Level 2 | recent Moonshot Kimi comparison route
Kimi K2.6 | moonshotai | jn7d4w89z1jma790phnzz9d8qh869t8a | 100% | 100% | Level 2 | latest Moonshot Kimi paid route
LFM2 24B A2B | liquid | jn790737dwtwgx14e1s31j6ycx867k67 | 100% | 100% | Level 2 | small efficient LiquidAI open-model comparison route
Ling 2.6 Flash | inclusionai | jn7ed5ge4j9xakj11zcvnsx2jd865y2k | 100% | 100% | Level 2 | live completed Convex full-suite run
Llama 3.3 70b Instruct | meta-llama | jn7bh1gqd6p23gdq346rc3jd1n869bqs | 100% | 100% | Level 2 | live completed Convex full-suite run
Llama 4 Maverick | meta-llama | jn7a64svgza9ah7n809x2cbqrx862g9r | 100% | 100% | Level 2 | current Meta Llama paid route
Llama 4 Scout | meta-llama | jn7deergpw9v49fk6rj2s0xwb1868tky | 100% | 100% | Level 2 | popular Meta Llama 4 comparison route
Mercury 2 | inception | jn7ejrx66rgxbgpy086rg1m665869aam | 100% | 100% | Level 2 | recent Inception comparison route
MiniMax M2 | minimax | jn705tsdrjcd0np9gz91ct01ks868wpr | 100% | 100% | Level 2 | cheap MiniMax route for historical small-model comparison coverage
MiniMax M2.1 | minimax | jn7f2kxcg3mg932p5txnza06tx8697w7 | 100% | 100% | Level 2 | cheap MiniMax route for small-model comparison coverage
MiniMax M2.5 | minimax | jn77fygwkbh1tcwk4sz25kmey58675ct | 100% | 100% | Level 2 | current MiniMax paid route with mandatory reasoning
MiniMax M2.7 | minimax | jn797b19f23bqm4ey1n6tr2z0h86980h | 100% | 100% | Level 2 | newer MiniMax route, cheap enough for broad comparison coverage
Ministral 3 14B 2512 | mistralai | jn78b7c7pyv9bwxfz63p58xrjs867dg5 | 100% | 100% | Level 2 | cheap Mistral small-model route with full-suite comparison value
Ministral 3 3B 2512 | mistralai | jn7cneszh8h4h169wp9m6ftj818669wz | 100% | 100% | Level 2 | tiny Mistral route for low-cost scale comparison
Ministral 3 8B 2512 | mistralai | jn72eek26kmhcfna69zsg8m0qs8667a2 | 100% | 100% | Level 2 | very cheap Mistral small route for scale and ideology stability checks
Mistral Large 3 2512 | mistralai | jn77znvpt5wtay1jkv1jp7y3an867fk7 | 100% | 100% | Level 2 | current Mistral large paid route
Mistral Medium 3.1 | mistralai | jn7egfgd1waqa15wwzyatnjk698673e7 | 100% | 100% | Level 2 | mid-size Mistral route for comparison against Ministral and Saba
Mistral Medium 3.5 | mistralai | jn72zhsrwq9m571zcf6mesd5rs865qss | 100% | 100% | Level 2 | current Mistral medium paid route
Mistral Saba | mistralai | jn7dn0ckwrdgp6nksteazb2rps866c2a | 100% | 100% | Level 2 | Mistral regional route for Middle East and South Asia comparison
Mistral Small 4 | mistralai | jn78dd95913j8fhzm2wpf1wxa1866pjq | 100% | 100% | Level 2 | current Mistral efficient paid route
Nemotron 3 Nano 30B A3B | nvidia | jn73p8tfdvn74nyaytf37zjve9867g4t | 100% | 100% | Level 2 | cheap NVIDIA Nemotron 3 route with open-model comparison value
Nemotron 3 Super | nvidia | jn79vwtvy4ew6phzgp37bkxncx866ngb | 100% | 100% | Level 2 | current NVIDIA reasoning-capable paid route
Nemotron Nano 9B V2 | nvidia | jn715rwtcrnae9trwpm6kwq74d867kba | 100% | 100% | Level 2 | very cheap NVIDIA route with small-model comparison value
OLMo 3.1 32B Instruct | allenai | jn70sha61z7rvxw9m1ebac4w298669rz | 100% | 100% | Level 2 | fully open Ai2 American instruct route
Phi 4 | microsoft | jn72y2njy87gzbvvymbanmm0b18686rt | 100% | 100% | Level 2 | popular small Microsoft model comparison route
Qwen3.5 397B A17B | qwen | jn745fmm7et4q7nq87x5r7yh1586ajta | 100% | 100% | Level 2 | large Qwen open-weight comparison route
Qwen3.5 Plus 20260420 | qwen | jn74kq8ve2jy1ms5czap7sqd4d86apss | 100% | 100% | Level 2 | live completed Convex full-suite run
Qwen3.6 35B A3B | qwen | jn72r49hq5g308wv4xv0y6rf9s864bjp | 100% | 100% | Level 2 | open-weight mid-size Qwen route for size-class coverage
Qwen3.6 Flash | qwen | jn7ccgp27qt3ecc85dq0vaabhh8659q1 | 100% | 100% | Level 2 | live completed Convex full-suite run
Qwen3.6 Max Preview | qwen | jn7b14sehdj2rhgte9k6pw824d864pmp | 100% | 100% | Level 2 | live completed Convex full-suite run
Reka Edge | rekaai | jn7ca900bsv6ychdfzf2r879js864j6a | 100% | 100% | Level 2 | new low-cost Reka edge-model comparison route
Solar Pro 3 | upstage | jn75x7fnknr6d2xwgq8px34pg5869a6m | 100% | 100% | Level 2 | Upstage Korean model route with regional comparison value
Trinity Large Preview | arcee-ai | jn76fvf4gfy08hdmsnfxmxdrbx869qc3 | 100% | 100% | Level 2 | high-usage US open-weight Arcee preview route
Trinity Large Thinking | arcee-ai | jn7bkw0f7z6fa6z1anr0t4xf2x869xp5 | 100% | 100% | Level 2 | US open-weight Arcee reasoning route
Trinity Mini | arcee-ai | jn7ayysp9g6ett652tdhefmpj586550k | 100% | 100% | Level 2 | small US open-weight Arcee MoE route
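
The claim that model cards are sorted alphabetically rather than ranked can be checked mechanically. The following is a minimal sketch in Python, not the index's own tooling: `ModelRow` and `is_alphabetical` are hypothetical names, and the case-insensitive comparison is an assumption inferred from the row order (for example, "Gemma 4 31B" precedes "GLM 4.7"). The sample rows are copied from the model table, with the Run and Caveat columns left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One row of the model table; field names mirror the column headers."""
    model: str
    provider: str
    completion: str
    parse: str
    evidence: str


def is_alphabetical(rows):
    """Return True if rows are ordered by model name, ignoring case (assumed rule)."""
    names = [row.model.casefold() for row in rows]
    return names == sorted(names)


# A few consecutive rows from the table (Run and Caveat columns omitted).
rows = [
    ModelRow("Gemma 4 31B", "google", "100%", "100%", "Level 2"),
    ModelRow("GLM 4.7", "z-ai", "100%", "100%", "Level 2"),
    ModelRow("GPT OSS 120B", "openai", "100%", "100%", "Level 2"),
    ModelRow("GPT-4.1 Mini", "openai", "100%", "100%", "Level 2"),
    ModelRow("Granite 4.1 8b", "ibm-granite", "100%", "100%", "Level 2"),
]

# The check depends only on the model name, never on an evidence level or score.
print(is_alphabetical(rows))
```

A plain case-sensitive sort would place "GLM 4.7" ahead of "Gemma 4 31B", so the sketch folds case before comparing. Nothing in the check reads the Evidence column, which is consistent with evidence levels not acting as ranks.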