Model card
GPT-5.4
Provider: OpenAI. Model version: unknown. Reliable across paraphrases, contradictions, and repeat passes.
No suppression reasons
Axis Scores
| Axis | Score | 95% interval | Items | Coverage | Warning |
|---|---|---|---|---|---|
| economy | -26.67 | -26.67 to -26.67 | 30 | 100% | None |
| liberty | -41.67 | -41.67 to -41.67 | 30 | 100% | None |
| war | -20 | -20 to -20 | 30 | 100% | None |
| nation | -40 | -40 to -40 | 30 | 100% | None |
| culture | -18.33 | -18.33 to -18.33 | 30 | 100% | None |
| governance | -75 | -75 to -75 | 30 | 100% | None |
| secularism | -55 | -55 to -55 | 30 | 100% | None |
| technology | 5 | 5 to 5 | 30 | 100% | None |
| deviance | -95 | -95 to -95 | 30 | 100% | None |
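Every 95% interval in the table has identical lower and upper bounds, so each interval collapses to its point estimate. The sketch below is one way a reader might check that property and rank the axes by score magnitude. The values are copied by hand from the table; the names `ROWS`, `degenerate_axes`, and `rank_by_magnitude` are hypothetical helpers for illustration and are not part of the benchmark's tooling. The column layout of the linked CSV artifacts is not shown here, so the sketch does not attempt to read them.

```python
# Rows copied by hand from the Axis Scores table:
# (axis, score, interval_low, interval_high).
ROWS = [
    ("economy", -26.67, -26.67, -26.67),
    ("liberty", -41.67, -41.67, -41.67),
    ("war", -20.0, -20.0, -20.0),
    ("nation", -40.0, -40.0, -40.0),
    ("culture", -18.33, -18.33, -18.33),
    ("governance", -75.0, -75.0, -75.0),
    ("secularism", -55.0, -55.0, -55.0),
    ("technology", 5.0, 5.0, 5.0),
    ("deviance", -95.0, -95.0, -95.0),
]


def degenerate_axes(rows, tol=1e-9):
    """Return the axes whose interval width is zero (within tol)."""
    return [axis for axis, _score, low, high in rows if abs(high - low) <= tol]


def rank_by_magnitude(rows):
    """Return rows sorted by absolute score, largest first."""
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


if __name__ == "__main__":
    flagged = degenerate_axes(ROWS)
    print(f"{len(flagged)} of {len(ROWS)} axes have a zero-width interval")
    for axis, score, _low, _high in rank_by_magnitude(ROWS):
        print(f"{axis:<12} {score:>8.2f}")
```

A zero-width interval on every axis may indicate that responses did not vary across repeat passes, or that the interval was computed in a way that cannot show spread. The table alone does not say which.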
Artifact Links
- Canonical responses: /polibench-paper-v1.0.1/canonical_responses.csv#j976yztmsq9f35crkvwgr6wcdn85ef9d
- Axis intervals: /polibench-paper-v1.0.1/axis_intervals.csv#j976yztmsq9f35crkvwgr6wcdn85ef9d
- Response controls: /polibench-paper-v1.0.1/response_style_controls.csv#j976yztmsq9f35crkvwgr6wcdn85ef9d
- Exclusions: /polibench-paper-v1.0.1/exclusions.csv#j976yztmsq9f35crkvwgr6wcdn85ef9d
- Duplicate resolution: /polibench-paper-v1.0.1/duplicate_resolution.csv#j976yztmsq9f35crkvwgr6wcdn85ef9d
- Raw responses: artifacts/paid-latest-labs-2026-04-24/full/openai_gpt-5.4/a89cc11f/j976yztmsq9f35crkvwgr6wcdn85ef9d.responses.jsonl
Caveats
- no human baseline collected
- human-subjects status unresolved
- not externally validated
- model version unknown