HEXACO recovery against fictional public targets. Useful for toy readability, not paper evidence.
The honest read
MiniMax M3 is immediately interesting for prose quality and profile encoding. In the expanded 4-part LSI run, GLM 5.1 was the surprise transfer winner.
GLM 5.1 led the expanded OpenRouter set under Gemini 3 Flash scoring.
MiniMax M3 4-part LSI recovery landed below the current Gemini/Gemma/GPT/Qwen band.
How the public characters work
The real benchmark uses research-only psychometric profiles and life facts from real participants. The public reader uses fictional characters so people can inspect the prose without seeing private data.
Name and facts
We started from fictional names such as Roxy Saint-Clair, Cillian Frost, and Haruki Minamoto. Opus and Gemini helped generate toy life facts from the names, then those facts were consolidated into public character skeletons.
Reverse coding
Multiple model raters scored the name-and-fact skeletons into dense toy psychometric targets. These targets are synthetic, public, and deliberately separate from the research-only PARSEL participant data.
Profiles to LSIs
Each generator transformed the toy targets into conditioning portraits, then wrote a 24-section first-person Life Story Interview. The tables ask how recoverable the toy targets remain after each transformation.
Read it beside the references
The packet includes Gemma 4 31B and Opus 4.6 outputs for the same fictional characters, so the comparison is easy to inspect without touching any private data.
MiniMax M3 profile sample
You are a study in steady, unperformed order. Your pantry is labeled in a handwriting that does not apologize for itself. Light switches click with a small, decisive sound, and you find a certain comfort in the fact that the silver sedan starts on the first turn of the key, every time, because you have kept it that way.
Your people sit close to you. There is a softness in the geography of your affection that you do not perform for anyone, and you are the kind of person who drives forty minutes on a Sunday not because you are asked, but because the pot should be on the stove at eleven and your uncle should not eat alone.
MiniMax M3 LSI sample
I know most of them by name. Mr. Henderson two doors down brings me tomatoes in August. I bring him basil. We do not make a ceremony of it. The widower at the end of the cul-de-sac, Theo, I check on him in the winter. I shovel his walk if it snows more than four inches.
I do not do it for praise. I do it because I have the shovel, and he is eighty-one, and the walk needs shoveling. I think community is a verb. You either do it or you do not.
4-part LSI benchmark
These rows compare MiniMax M3 against the canonical four-model 4-part LSI band using aggregate HEXACO recovery under Gemini 3 Flash scoring.
The synthetic character metrics are fun to inspect, but this aggregate table is the more serious model-comparison signal. MiniMax M3 profiles were strong; MiniMax M3 4-part LSIs were behind the established comparison band in this first pass.
Expanded OpenRouter shootout
Same 50-person slice, same 4-part LSI protocol, Gemini 3 Flash scoring. MiniMax M3 scoring was smoke-tested but too slow for this first public drop, so the comparable table below uses the stable Gemini scorer.
MiniMax M2.7 needed OpenRouter's MiniMax-specific reasoning split flag during generation; without it, one synthetic profile call spent the whole budget on hidden reasoning and returned no final prose. Kimi worked cleanly with `reasoning.effort=none` on OpenRouter despite the route being thinking-capable.
Files
Everything here is public-safe: synthetic fiction or aggregate-only tables.