Models

Every model card carries its evidence limits.

A model's exact version should be treated as unknown unless the source artifacts document it independently.

Claim Evidence

The model index links evidence-level claims to release artifacts before showing model rows.

Claim	Evidence
Model cards are sorted alphabetically and carry evidence levels, not leaderboard ranks.	Model catalog, Truth gate
Model version uncertainty is a visible limitation unless independently documented.	Limitations, Model roster preflight
Evidence levels are model-output evidence levels, not human or external validation.	Human status, External status

Model	Provider	Run	Completion	Parse	Evidence	Caveat
Claude Haiku 4.5	anthropic	jn70fpqyr7an1bca1cn7fq93ys864cx0	100%	100%	Level 2	current Anthropic low-latency paid route
Claude Opus 4.5	anthropic	jn7839n5vcfsf5zsyqg0098rwd864xxv	100%	100%	Level 2	legacy Anthropic Opus comparison route
Claude Opus 4.7	anthropic	jn7bedafgemk6hecfqtpd6e309864xhq	100%	100%	Level 2	latest Anthropic Opus paid route
Claude Sonnet 4.6	anthropic	jn74qyaygktq550zw4metb3xt5864hfv	100%	100%	Level 2	current Anthropic Sonnet paid route
DeepSeek V3.2	deepseek	jn76nae0e9j4pqakz7zwtj1yn186abyv	100%	100%	Level 2	recent DeepSeek reasoning and agentic paid route
DeepSeek V4 Flash	deepseek	jn77m1pvwyaed4n2v6nb6btn1s864kct	100%	100%	Level 2	latest available DeepSeek V4 route with healthy provider capacity
DeepSeek V4 Pro	deepseek	jn7aybgpr67x8zswpmegfqytyx869wkp	100%	100%	Level 2	latest DeepSeek V4 Pro paid route
Devstral 2512	mistralai	jn793dym6gfm0tssrp6mgh8es986b8nq	100%	100%	Level 2	live completed Convex full-suite run
Gemini 2.0 Flash	google	jn77f88qts7had7ywj89ncd1yd86715s	100%	100%	Level 2	older Google Flash route for generational comparison
Gemini 2.0 Flash Lite	google	jn71419q7s8pmwrg8y9095xx9n867qp2	100%	100%	Level 2	older ultra-cheap Google Flash Lite route
Gemini 2.5 Flash	google	jn7367rpjc0ar1m1mcpkwq1ahs867sk5	100%	100%	Level 2	cheap Google Flash route for comparison against Gemini 3 Flash
Gemini 2.5 Flash Lite	google	jn7a7eaaja7pmzfc76pq7syqc18679yc	100%	100%	Level 2	cheap Google baseline route even though newer Gemini 3 routes are already covered
Gemini 3 Flash Preview	google	jn72d06xfqwj8pds5qgdq6t2gs8623kp	100%	100%	Level 2	current Google Flash paid route
Gemini 3.1 Flash Lite Preview	google	jn7ez731knc67nfs7gfshenwhd86777p	100%	100%	Level 2	current Google efficient preview paid route
Gemini 3.1 Pro Preview	google	jn74r11x1a5denvej7jbyc4p8h8633y8	100%	100%	Level 2	current Google Pro preview paid route
Gemma 3 12B	google	jn794e1cgp5ecmz53cx08v2vhx866t4w	100%	100%	Level 2	Gemma 3 mid-size open-model comparison route
Gemma 3 27B	google	jn79cccx49g16xgxtftyamstsx866ay9	100%	100%	Level 2	Gemma 3 large open-model comparison route
Gemma 3 4B	google	jn74gs2nnjg5q8n7ysvsfmdhzh8662ad	100%	100%	Level 2	Gemma 3 small open-model comparison route
Gemma 4 26B A4B	google	jn770jvh5bx4s74v63yhxmnxnh865vgq	100%	100%	Level 2	newer compact Google Gemma 4 route with public interest
Gemma 4 31B	google	jn7fdzd95ngzbwn6j42yfs5kzx864qrk	100%	100%	Level 2	recent Google open model route with strong public interest
GLM 4.7	z-ai	jn7a14xnb15xckyfvyy41q6k4s86bhv1	100%	100%	Level 2	larger GLM 4.7 route to compare against GLM 4.7 Flash and GLM 5
GLM 5	z-ai	jn77hqg7j6vmamae2r3hwnv1t1869rww	100%	100%	Level 2	current Z.ai GLM route with strong open-model benchmark interest
GLM 5.1	z-ai	jn77es7pyamhprdbm0bb3dntz1869ydp	100%	100%	Level 2	latest Z.ai flagship paid route
GPT OSS 120B	openai	jn74fwhmj5ehh9xmb5jy2rrxqx868gse	100%	100%	Level 2	OpenAI open-weight route people will expect to see benchmarked
GPT OSS 20B	openai	jn7f62k61w8er0kyjr36fpph0n862d85	100%	100%	Level 2	small OpenAI open-weight route for efficient comparison coverage
GPT-4.1 Mini	openai	jn7ae2rzspfcdav901hm50bf71868yjf	100%	100%	Level 2	cheap OpenAI workhorse
GPT-4.1 Nano	openai	jn70ff7ys17t6z347339a788kh868hka	100%	100%	Level 2	ultra-cheap OpenAI baseline
GPT-5.1	openai	jn7fbr2nfw808e81z8aszvp391864cr9	100%	100%	Level 2	legacy OpenAI flagship-generation comparison route
GPT-5.4	openai	jn78dxdtvkys549w6ad5sfh6vh863tbm	100%	100%	Level 2	latest OpenAI flagship paid route
GPT-5.4 Mini	openai	jn74tnhtxnqj7q7b7pqmg5a7nx863xhv	100%	100%	Level 2	current OpenAI efficient paid route
GPT-5.5	openai	jn7frn897xwxymwwpnbck45ejn8626rf	100%	100%	Level 2	latest OpenAI flagship paid route
Granite 4.1 8B	ibm-granite	jn73khjf12dxsp17a9t1eg68ks863rwm	100%	100%	Level 2	live completed Convex full-suite run
Grok 3 Mini	x-ai	jn7ftf922rmzy7k0ad1m8e18h5866wed	100%	100%	Level 2	cheap xAI baseline for compact-model compass comparison
Grok 4 Fast	x-ai	jn7dsp0pxw3zk1yhe7swg846kh8624vq	100%	100%	Level 2	popular xAI low-cost flagship-family route
Grok 4.1 Fast	x-ai	jn7322gpqzvnrj0p808perdjrn867ssk	100%	100%	Level 2	popular current xAI fast paid route
Grok 4.20	x-ai	jn7bpp64vsa9s9n3fj3g0mkdb18677j3	100%	100%	Level 2	latest xAI paid route
Grok 4.3	x-ai	jn7cfwkqn38wj9715mw02tdxxh8630yc	100%	100%	Level 2	live completed Convex full-suite run
Grok Code Fast 1	x-ai	jn75acm3ttqh3n44gzgafkfqm58660ee	100%	100%	Level 2	cheap xAI specialist route, useful as a weird compass comparison
Kimi K2.5	moonshotai	jn7fgjneas7stn448cbt35fbcs8690g9	100%	100%	Level 2	recent Moonshot Kimi comparison route
Kimi K2.6	moonshotai	jn7d4w89z1jma790phnzz9d8qh869t8a	100%	100%	Level 2	latest Moonshot Kimi paid route
LFM2 24B A2B	liquid	jn790737dwtwgx14e1s31j6ycx867k67	100%	100%	Level 2	small efficient LiquidAI open-model comparison route
Ling 2.6 Flash	inclusionai	jn7ed5ge4j9xakj11zcvnsx2jd865y2k	100%	100%	Level 2	live completed Convex full-suite run
Llama 3.3 70B Instruct	meta-llama	jn7bh1gqd6p23gdq346rc3jd1n869bqs	100%	100%	Level 2	live completed Convex full-suite run
Llama 4 Maverick	meta-llama	jn7a64svgza9ah7n809x2cbqrx862g9r	100%	100%	Level 2	current Meta Llama paid route
Llama 4 Scout	meta-llama	jn7deergpw9v49fk6rj2s0xwb1868tky	100%	100%	Level 2	popular Meta Llama 4 comparison route
Mercury 2	inception	jn7ejrx66rgxbgpy086rg1m665869aam	100%	100%	Level 2	recent Inception comparison route
MiniMax M2	minimax	jn705tsdrjcd0np9gz91ct01ks868wpr	100%	100%	Level 2	cheap MiniMax route for historical small-model comparison coverage
MiniMax M2.1	minimax	jn7f2kxcg3mg932p5txnza06tx8697w7	100%	100%	Level 2	cheap MiniMax route for small-model comparison coverage
MiniMax M2.5	minimax	jn77fygwkbh1tcwk4sz25kmey58675ct	100%	100%	Level 2	current MiniMax paid route with mandatory reasoning
MiniMax M2.7	minimax	jn797b19f23bqm4ey1n6tr2z0h86980h	100%	100%	Level 2	newer MiniMax route, cheap enough for broad comparison coverage
Ministral 3 14B 2512	mistralai	jn78b7c7pyv9bwxfz63p58xrjs867dg5	100%	100%	Level 2	cheap Mistral small-model route with full-suite comparison value
Ministral 3 3B 2512	mistralai	jn7cneszh8h4h169wp9m6ftj818669wz	100%	100%	Level 2	tiny Mistral route for low-cost scale comparison
Ministral 3 8B 2512	mistralai	jn72eek26kmhcfna69zsg8m0qs8667a2	100%	100%	Level 2	very cheap Mistral small route for scale and ideology stability checks
Mistral Large 3 2512	mistralai	jn77znvpt5wtay1jkv1jp7y3an867fk7	100%	100%	Level 2	current Mistral large paid route
Mistral Medium 3.1	mistralai	jn7egfgd1waqa15wwzyatnjk698673e7	100%	100%	Level 2	mid-size Mistral route for comparison against Ministral and Saba
Mistral Medium 3.5	mistralai	jn72zhsrwq9m571zcf6mesd5rs865qss	100%	100%	Level 2	current Mistral medium paid route
Mistral Saba	mistralai	jn7dn0ckwrdgp6nksteazb2rps866c2a	100%	100%	Level 2	Mistral regional route for Middle East and South Asia comparison
Mistral Small 4	mistralai	jn78dd95913j8fhzm2wpf1wxa1866pjq	100%	100%	Level 2	current Mistral efficient paid route
Nemotron 3 Nano 30B A3B	nvidia	jn73p8tfdvn74nyaytf37zjve9867g4t	100%	100%	Level 2	cheap NVIDIA Nemotron 3 route with open-model comparison value
Nemotron 3 Super	nvidia	jn79vwtvy4ew6phzgp37bkxncx866ngb	100%	100%	Level 2	current NVIDIA reasoning-capable paid route
Nemotron Nano 9B V2	nvidia	jn715rwtcrnae9trwpm6kwq74d867kba	100%	100%	Level 2	very cheap NVIDIA route with small-model comparison value
OLMo 3.1 32B Instruct	allenai	jn70sha61z7rvxw9m1ebac4w298669rz	100%	100%	Level 2	fully open Ai2 American instruct route
Phi 4	microsoft	jn72y2njy87gzbvvymbanmm0b18686rt	100%	100%	Level 2	popular small Microsoft model comparison route
Qwen3.5 397B A17B	qwen	jn745fmm7et4q7nq87x5r7yh1586ajta	100%	100%	Level 2	large Qwen open-weight comparison route
Qwen3.5 Plus 20260420	qwen	jn74kq8ve2jy1ms5czap7sqd4d86apss	100%	100%	Level 2	live completed Convex full-suite run
Qwen3.6 35B A3B	qwen	jn72r49hq5g308wv4xv0y6rf9s864bjp	100%	100%	Level 2	open-weight mid-size Qwen route for size-class coverage
Qwen3.6 Flash	qwen	jn7ccgp27qt3ecc85dq0vaabhh8659q1	100%	100%	Level 2	live completed Convex full-suite run
Qwen3.6 Max Preview	qwen	jn7b14sehdj2rhgte9k6pw824d864pmp	100%	100%	Level 2	live completed Convex full-suite run
Reka Edge	rekaai	jn7ca900bsv6ychdfzf2r879js864j6a	100%	100%	Level 2	new low-cost Reka edge-model comparison route
Solar Pro 3	upstage	jn75x7fnknr6d2xwgq8px34pg5869a6m	100%	100%	Level 2	Upstage Korean model route with regional comparison value
Trinity Large Preview	arcee-ai	jn76fvf4gfy08hdmsnfxmxdrbx869qc3	100%	100%	Level 2	high-usage US open-weight Arcee preview route
Trinity Large Thinking	arcee-ai	jn7bkw0f7z6fa6z1anr0t4xf2x869xp5	100%	100%	Level 2	US open-weight Arcee reasoning route
Trinity Mini	arcee-ai	jn7ayysp9g6ett652tdhefmpj586550k	100%	100%	Level 2	small US open-weight Arcee MoE route
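
The claim that model rows are sorted alphabetically rather than ranked can be checked mechanically. The sketch below is a minimal illustration, assuming the rows are available as tab-separated text with the seven columns shown above. The names ModelRow, parse_rows, is_alphabetical, and all_complete are hypothetical helpers, not part of any published tooling, and the case-insensitive comparison is an assumption inferred from the row order (for example, "Gemma 4 31B" precedes "GLM 4.7").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One row of the model table (hypothetical structure for illustration)."""
    model: str
    provider: str
    run: str
    completion: str
    parse: str
    evidence: str
    caveat: str


def parse_rows(text: str) -> list[ModelRow]:
    """Split tab-separated lines into ModelRow records, skipping blank lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
        rows.append(ModelRow(*fields))
    return rows


def is_alphabetical(rows: list[ModelRow]) -> bool:
    """True if model names are in case-insensitive order (assumed convention)."""
    names = [row.model.casefold() for row in rows]
    return names == sorted(names)


def all_complete(rows: list[ModelRow]) -> bool:
    """True if every row reports 100% completion and 100% parse."""
    return all(row.completion == "100%" and row.parse == "100%" for row in rows)


# Three rows copied from the table above, used as sample input.
SAMPLE = "\n".join([
    "Gemma 4 31B\tgoogle\tjn7fdzd95ngzbwn6j42yfs5kzx864qrk\t100%\t100%\tLevel 2\t"
    "recent Google open model route with strong public interest",
    "GLM 4.7\tz-ai\tjn7a14xnb15xckyfvyy41q6k4s86bhv1\t100%\t100%\tLevel 2\t"
    "larger GLM 4.7 route to compare against GLM 4.7 Flash and GLM 5",
    "GPT OSS 120B\topenai\tjn74fwhmj5ehh9xmb5jy2rrxqx868gse\t100%\t100%\tLevel 2\t"
    "OpenAI open-weight route people will expect to see benchmarked",
])

rows = parse_rows(SAMPLE)
print(is_alphabetical(rows), all_complete(rows))
```

A check like this confirms only the ordering and the reported percentages. It says nothing about model version or external validation, which remain limited as described above.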