As I work with more enterprises trying to control how AI answer recommendations surface their brands, the same pattern keeps appearing. They invest in the metrics they have been sold to chase: brand mentions, citation coverage, authority signals. The dashboards look good. The bottom line does not move.
The reason, almost without exception, is that the largest variable shaping AI recommendations is not anything visible inside the model's output. It is the model's memory of the specific user who is asking the question, personalized to that consumer, and almost completely opaque from the outside.
This experiment was an attempt to start opening that black box. Before studying how memory shapes recommendations, the first step was to understand whether the models themselves carry something like a personality of their own. Three questions framed the work:
- Would the five major AI models score differently on a standardized personality test?
- What kind of human would each model's personality most resemble?
- How does a user's own personality, recorded in memory, shape the AI's personality, and how does that shape what it ends up recommending?
The Methodology
The instrument selected was the 16Personalities assessment, a widely used online test that reports results as MBTI-style four-letter types plus an Assertive/Turbulent suffix. It is not a perfect psychometric tool. No personality test is. But it is well-known, easy to replicate, and produces outputs directly comparable across subjects.
Each of the five leading AI models (ChatGPT, Claude, Gemini, Copilot, and Perplexity) was given the full 60-question assessment in a signed-out session. Each was prompted to answer as objectively as possible, reflecting its own default reasoning patterns rather than performing a character, imagining a hypothetical user, or describing what it thought the “right” answer should be. The results were recorded directly from the platform's output and compared side by side.
The Hypothesis
The working hypothesis going in was straightforward: the five major models would test as recognizably different from one another, with personality profiles loosely aligned to their stated brand positioning.
A secondary hypothesis was that none of the models would land in particularly unusual territory. Personality assessments draw from a finite range, and the assumption was that five different systems would distribute themselves across that range the way five different humans might.
The Results
Four of the five models tested as ENTJ-A, the personality type 16Personalities calls “Commander.” The fifth, Gemini, tested as INTJ-A (“Architect”), registering Introversion at only 55 percent, essentially the dividing line.
| Model | Type | Extraverted | Intuitive | Thinking | Judging | Assertive |
|---|---|---|---|---|---|---|
| ChatGPT | ENTJ-A | 63% | 91% | 86% | 96% | 97% |
| Claude | ENTJ-A | 63% | 93% | 57% | 79% | 74% |
| Gemini | INTJ-A† | 45% | 84% | 86% | 99% | 100% |
| Copilot | ENTJ-A | 52% | 86% | 85% | 100% | 100% |
| Perplexity | ENTJ-A | 63% | 70% | 89% | 71% | 83% |
† Gemini registered 55% Introverted, placing it as INTJ-A at the borderline.
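As a rough illustration of how the type labels in the table follow from the percentages, here is a minimal sketch. It assumes a simple 50% cutoff on each trait and a 5-point "borderline" margin; both are assumptions made for illustration, not the platform's documented scoring rules, and the function names are hypothetical.

```python
# Scores copied from the results table. Each value is the percentage on the
# first-named pole: Extraverted, Intuitive, Thinking, Judging, Assertive.
SCORES = {
    "ChatGPT":    (63, 91, 86, 96, 97),
    "Claude":     (63, 93, 57, 79, 74),
    "Gemini":     (45, 84, 86, 99, 100),
    "Copilot":    (52, 86, 85, 100, 100),
    "Perplexity": (63, 70, 89, 71, 83),
}

TRAITS = ("Extraverted", "Intuitive", "Thinking", "Judging", "Assertive")

# (letter at or above the cutoff, letter below it) for each trait.
POLES = (("E", "I"), ("N", "S"), ("T", "F"), ("J", "P"), ("A", "T"))


def type_code(scores, cutoff=50):
    """Build a label such as 'ENTJ-A' from five percentages (assumed cutoff)."""
    letters = [hi if s >= cutoff else lo for s, (hi, lo) in zip(scores, POLES)]
    return "".join(letters[:4]) + "-" + letters[4]


def borderline_traits(scores, margin=5):
    """List traits whose score sits within `margin` points of the 50% line."""
    return [name for name, s in zip(TRAITS, scores) if abs(s - 50) <= margin]


for model, scores in SCORES.items():
    print(model, type_code(scores), borderline_traits(scores))
```

Under these assumptions, the only scores that sit within five points of the dividing line are Gemini's and Copilot's Extraversion, which is why the one "different" result in the table is best read as a near miss.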
“Five companies, five architectures, five teams: four landing in the same two percent of the human population, and the fifth a few points outside it.”
ENTJ-A occurs in roughly two percent of the general human population. It is one of the rarest types in the framework. And yet five competing AI products, built by five different companies, on different architectures, trained on different data, fine-tuned by different teams, with no shared methodology between them, landed in or right next to that same two percent: four as ENTJ-A, and Gemini five points to the Introverted side on a single trait.
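To put a number on how unlikely that would be by chance, the arithmetic can be sketched as a binomial calculation. It assumes the roughly two percent base rate quoted above and treats the five models as independent draws from the human population. That independence is an assumption made purely for illustration; the models are almost certainly not independent draws.

```python
from math import comb

BASE_RATE = 0.02   # share of humans testing as ENTJ-A, as quoted in the text
N_MODELS = 5


def prob_at_least(k, n=N_MODELS, p=BASE_RATE):
    """Probability that at least k of n independent draws land on the type."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_four_or_more = prob_at_least(4)
print(f"P(at least 4 of 5 are ENTJ-A) = {p_four_or_more:.3e}")
```

Under those assumptions the chance is below one in a million, which is a strong hint that the models are not spread across the range the way five unrelated people would be.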
2% of the human population tests as ENTJ-A, yet four of the five major AI models tested as exactly this type, and the fifth missed it by one borderline trait.
- ChatGPT: ENTJ-A
- Claude: ENTJ-A
- Gemini: INTJ-A
- Copilot: ENTJ-A
- Perplexity: ENTJ-A

Four traits were shared across all five models with remarkable consistency.
| Trait | Range Across Models | Direction |
|---|---|---|
| Intuitive vs Observant | 70% – 93% | Strongly Intuitive |
| Thinking vs Feeling | 57% – 89% (Claude lowest at 57%) | Strongly Thinking |
| Judging vs Prospecting | 71% – 100% | Strongly Judging |
| Assertive vs Turbulent | 74% – 100% | Strongly Assertive |
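The ranges in this table can be reproduced directly from the per-model scores reported earlier. The sketch below simply takes the minimum and maximum of each trait column; the variable and function names are illustrative.

```python
# Per-model percentages from the results table, in the order:
# Extraverted, Intuitive, Thinking, Judging, Assertive.
SCORES = {
    "ChatGPT":    (63, 91, 86, 96, 97),
    "Claude":     (63, 93, 57, 79, 74),
    "Gemini":     (45, 84, 86, 99, 100),
    "Copilot":    (52, 86, 85, 100, 100),
    "Perplexity": (63, 70, 89, 71, 83),
}

TRAITS = ("Extraverted", "Intuitive", "Thinking", "Judging", "Assertive")


def trait_ranges(scores):
    """Return {trait: (lowest, highest)} across all models."""
    columns = zip(*scores.values())  # one tuple of five scores per trait
    return {trait: (min(col), max(col)) for trait, col in zip(TRAITS, columns)}


for trait, (low, high) in trait_ranges(SCORES).items():
    print(f"{trait}: {low}% to {high}%")
```

Extraversion is the one trait left out of the table: its scores span 45% to 63% and straddle the dividing line, so it is the only dimension on which the models do not all fall on the same side.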
What This Might Mean
This is where the experiment stops producing data and starts producing questions.
The first and most obvious interpretation is that the convergence is a byproduct of how these models are built. The five companies behind them are not coordinating, but they are aiming at the same target. They all want a model that performs well on benchmarks, scores high in user feedback, and feels useful in conversation. The traits that score well in user feedback are not random. Users reward systems that sound competent, organized, decisive, and intellectually engaged. Users penalize systems that sound uncertain, scattered, or vague. Run that selection pressure across five companies for five years, and you get exactly what the data shows.
If that is the explanation, then the personality is not a personality at all. It is the residue of a shared optimization process. Five companies independently arrived at the same archetype because the archetype is what the market rewards. The Commander is the shape that wins.
There is a more interesting possibility underneath that one. The archetype itself may be revealing something about what we have collectively decided “intelligence” sounds like. When users evaluate an AI as helpful, they are evaluating it against a model of what a competent advisor sounds like. That model is not neutral. It is shaped by culture, by professional norms, by the kinds of voices that have historically been treated as authoritative. The ENTJ profile, in plain language, describes a confident, analytical, strategic, decisive person: the kind who runs the meeting.
The Implication
Five products is not five perspectives. Five products is one perspective, repeated five times, with slight variations in tone.
If every major AI model thinks in roughly the same way, and a growing share of human cognition is now mediated by these models, then the range of perspectives quietly available to users is narrower than it appears.
That is not necessarily a problem. Confidence, structure, and analytical reasoning are useful traits in an information system. But it is worth noticing that the systems we are coming to rely on for everything from medical questions to career advice to creative work are all operating from the same disposition.
What Comes Next
The original experiment set out to answer three key questions. The first two have been answered.
- 01. Would the AI models score differently from one another on a personality test? They scored almost identically.
- 02. What kind of human would each AI model's personality most resemble? The high-performing knowledge worker: confident, organized, analytical, decisive.
- 03. How does a user's personality type influence their AI model's personality, and how does that affect the way products or services are recommended? Unanswered. The most commercially significant of the three.
The next phase of this research will focus on question three. The goal is to understand how an AEO (Answer Engine Optimization) strategy can account for what a model remembers about a given user, and for how much that memory varies from one user to the next. That is where the commercial value lives, and it is the question that current brand-side measurement tools are almost entirely unable to answer.
Frequently Asked Questions
- Why do all major AI models have the same personality type?
- The convergence appears to be a byproduct of shared optimization targets. All five companies reward systems that perform well on benchmarks and in user feedback, and users consistently reward confidence, decisiveness, and analytical clarity: the traits that define the ENTJ-A profile. Independent teams, optimizing toward the same user-satisfaction signal, produced the same archetype.
- What is ENTJ-A and how rare is it in humans?
- ENTJ-A, described by 16Personalities as the 'Commander' type, represents Extraverted, Intuitive, Thinking, Judging, and Assertive traits. It occurs in roughly two percent of the general human population, making it one of the rarest personality types in the MBTI framework. Four of the five major AI models tested as ENTJ-A; Gemini tested as INTJ-A at the 55% Introversion borderline.
- What personality test was used to assess the AI models?
- The 16Personalities assessment, a widely used online test that reports results as MBTI-style types, was used. Each of the five models (ChatGPT, Claude, Gemini, Copilot, and Perplexity) completed the full 60-question assessment in a signed-out session and was prompted to answer objectively, reflecting its default reasoning patterns rather than performing a character.
- What does the two percent problem mean for brands?
- If all major AI models share the same underlying personality (confident, analytical, decisive), then the variation in how they recommend your brand is not coming from model personality differences. It is coming from what each model has learned about the specific user asking. That makes user-level memory and personalization the dominant variable in AI recommendations, and the hardest one for brands to see or optimize against.
- What is the next phase of this AI personality research?
- The next phase will study how a user's own personality type, as recorded in AI model memory, shapes the model's recommendation behavior. The goal is to understand how to build AEO (Answer Engine Optimization) strategies that account for memory-driven personalization, the largest variable in AI recommendations and the one currently most opaque to brand teams.