How ChatGPT Decides Who to Recommend: The Mechanics of Source Selection

A deep dive into ChatGPT source selection mechanics: RAG, training data vs web search, authority signals, and what makes content citable. Practical recommendations for optimization.

Владислав Пучков
Founder of GEO Scout, GEO optimization expert

ChatGPT has become the primary AI assistant for millions of users. When someone asks "which service is best for X," ChatGPT doesn't show a list of links — it recommends specific brands. Understanding the mechanics of this selection is the key to getting your brand into those recommendations. Monitoring through GEO Scout shows that the visibility gap between brands engaged in GEO optimization and those that aren't is enormous.

ChatGPT's Two Data Sources

ChatGPT generates responses based on two fundamentally different sources, and understanding their differences is critical for an optimization strategy.

Training Data (Parametric Memory)

This is the information the model was trained on. It is "baked into" the neural network weights and does not update in real time.

Characteristics:

  • Formed during model training (cutoff date)
  • Includes data from the open internet, books, articles, forums
  • Cannot be changed without retraining the model
  • May contain outdated information
  • Serves as the foundation of the model's "general knowledge" about brands

What this means for a brand: if many quality publications about your company existed before the training cutoff, ChatGPT "knows" about you. If publications were few or negative, the model either does not mention you or describes you inaccurately.

Web Search (RAG via Bing)

RAG (Retrieval-Augmented Generation) is a mechanism for supplementing responses with current data from the internet.

How it works:

  1. The user asks a question
  2. ChatGPT determines whether web search is needed
  3. It formulates a search query for Bing
  4. It receives the top results
  5. It extracts relevant facts from them
  6. It synthesizes a response, combining its own knowledge and the found data
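
The steps above can be sketched as a toy retrieve-then-generate pipeline. This is a minimal illustration, not how ChatGPT is actually implemented: `TOY_INDEX`, `retrieve`, and `build_prompt` are hypothetical names, the word-overlap scoring stands in for a real search engine, and the final model call is left out.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


# Hypothetical stand-in for a search index; a real system would query a search API.
TOY_INDEX = [
    Page("https://example.com/crm-pricing", "Acme CRM costs 990 rub/month for 5 users."),
    Page("https://example.com/about", "Acme was founded to help small businesses."),
    Page("https://example.org/review", "Review: Acme CRM includes a sales funnel and email integration."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,:;?!\"'").lower() for w in text.split()} - {""}


def retrieve(query: str, index: list[Page], top_k: int = 2) -> list[Page]:
    """Rank pages by word overlap with the query; keep the best top_k that overlap at all."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in index]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)  # stable: ties keep index order
    return [p for _, p in scored[:top_k]]


def build_prompt(query: str, pages: list[Page]) -> str:
    """Combine retrieved snippets and the question into one prompt for the model."""
    context = "\n".join(f"[{i + 1}] {p.text} ({p.url})" for i, p in enumerate(pages))
    return f"Sources:\n{context}\n\nQuestion: {query}\nAnswer using the sources above."
```

The point of the sketch is the ordering: only pages that make it through `retrieve` can appear in the prompt, so a page that never ranks for the query cannot be cited no matter how good its content is.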

When web search is activated:

| Trigger | Example query |
| --- | --- |
| Query about current data | "What are the CRM system prices in 2026?" |
| Temporal markers | "Best business services right now" |
| Model uncertainty | "Tell me about [little-known brand] company" |
| Direct request | "Find information about..." |
| Comparison request | "Compare [Brand A] and [Brand B]" |

Authority Signals: What ChatGPT Evaluates

When generating a response, ChatGPT doesn't simply cite the first source it finds. The model evaluates a complex set of signals that determine which information to trust more.

Mention Frequency in Authoritative Sources

If a brand is mentioned in reviews on multiple independent platforms, in industry media, and in expert publications, ChatGPT is more likely to include it in a recommendation. One source is a weak signal. Five to ten independent sources are a strong one.

Information Consistency

If information about a brand is consistent across different sources (prices, descriptions, characteristics), ChatGPT considers it reliable. If the sources contradict each other, the model may exclude the brand from the response or "hallucinate" an averaged version.
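
One way to audit consistency yourself is to collect the facts each source states about your brand and flag the sources that disagree with the majority. The sketch below assumes you have already gathered those facts by hand; `consistency_report` and the source names in the usage note are hypothetical, and the majority-vote rule is an illustrative choice, not a description of how ChatGPT weighs sources.

```python
from collections import Counter


def consistency_report(claims: dict[str, dict[str, str]]) -> dict[str, dict]:
    """For each attribute, report the most common value and which sources disagree.

    `claims` maps a source name to the facts that source states about the brand.
    """
    attributes = {attr for facts in claims.values() for attr in facts}
    report = {}
    for attr in sorted(attributes):
        # Only sources that mention this attribute take part in the vote.
        values = {src: facts[attr] for src, facts in claims.items() if attr in facts}
        majority, count = Counter(values.values()).most_common(1)[0]
        report[attr] = {
            "majority": majority,
            "agreement": count / len(values),
            "outliers": sorted(src for src, v in values.items() if v != majority),
        }
    return report
```

For example, if your site and one review list "990 rub/month" while a directory still lists "1200 rub/month", the report names the directory as the outlier to fix.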

Content Expertise

ChatGPT gives preference to content that:

  • Contains specific numbers and facts, not general statements
  • Is written by a real expert with indicated authorship
  • Includes research data, case studies, statistics
  • Is structured for quick information extraction

Data Freshness

During web search, ChatGPT considers the publication date. Current materials with indicated dates are preferred over outdated ones. This is especially important for questions about prices, rates, and current offers.

Bing Ranking Position

Since ChatGPT's web search works through Bing, a site's position in Bing results directly affects its chances of being included in the RAG selection. Sites on the first page of Bing have a significantly better chance of being cited by ChatGPT.


What Makes Content "Citable"

Not all content is equally useful for ChatGPT. The model prefers certain formats and structures.

Question-Answer Format

ChatGPT more easily extracts information organized as a direct answer to a question. If a heading on your page is a question and the first sentences are a direct answer, the probability of citation is higher.

Poorly citable content: "Our company offers a wide range of solutions for businesses of any scale, providing an individual approach to every client."

Well-citable content: "A CRM for small businesses costs from 990 rub/month for 5 users. It includes contact management, a sales funnel, and email integration. The free plan covers up to 3 users."

Comparison Tables

Tables are one of the most "extractable" formats. ChatGPT easily converts tabular data into a text response.

| Content element | Citation probability | Why |
| --- | --- | --- |
| Pricing comparison table | Very high | Structured data with prices |
| FAQ with specific answers | High | Direct answers to user questions |
| Numbered step-by-step list | High | Algorithms and instructions |
| Term definition | Medium-high | Direct match to informational queries |
| Expert review with numbers | Medium | Authoritative source of facts |
| Marketing text without facts | Low | No specifics to cite |

Specific Numbers and Facts

ChatGPT prefers to cite sources with specifics:

  • Prices and rates with currency
  • Quantitative characteristics (number of users, integrations)
  • Timelines (delivery in 2 days, 24/7 support)
  • Ratings and scores (4.8 out of 5 based on 500 reviews)
  • Years of operation and number of clients
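
A rough way to spot copy that lacks specifics is to extract the numbers it contains. The sketch below is a crude heuristic under that assumption: `find_numeric_facts` is a hypothetical helper, and a number count says nothing about whether the numbers are accurate or useful.

```python
import re

# Matches integers and decimals such as "990", "4.8", or "1,200".
NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


def find_numeric_facts(text: str) -> list[str]:
    """Return every number found in the text, in order of appearance."""
    return NUMBER.findall(text)
```

Run on the two sample texts from the question-answer section above, the marketing sentence yields no numbers at all, while the CRM pricing description yields three (990, 5, and 3).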

ChatGPT's behavior differs significantly depending on whether web search is activated.

Without Web Search (Basic Mode)

  • Relies only on training data
  • Primarily recommends large, well-known brands
  • May provide outdated information
  • Small and new companies are almost invisible
  • Responses are more "generic" and less specific

With Web Search (Browsing Mode)

  • Combines training data and current Bing results
  • Can recommend less-known brands with good web presence
  • Cites sources with links
  • Provides current prices and characteristics
  • Responses are more specific and evidence-based

What this means for strategy: for maximum reach, both channels need to be optimized. A large brand can rely on training data. Small and medium businesses should focus on web presence to get into the RAG selection.

Learn more about how AI providers differ in their recommendations in the article ChatGPT vs Claude vs Gemini: who they recommend.


Practical Optimization for ChatGPT

1. Optimizing for Training Data

This is a long-term strategy whose results will appear at the next model update.

  • Publish expert content in authoritative sources: industry media, tech blogs, professional publications
  • Create reviews and research with original data
  • Ensure presence in Wikipedia (if the company meets notability criteria)
  • Keep information up to date across all platforms

2. Optimizing for Web Search (Bing RAG)

Results from this optimization are visible faster — within days or weeks.

  • SEO for Bing: register in Bing Webmaster Tools, submit your sitemap
  • Schema.org markup: Product, Organization, and FAQPage types give Bing machine-readable data and make facts easier for ChatGPT to extract
  • Current prices and data: ChatGPT prefers fresh data during web search
  • Answers to user questions: content in FAQ, HowTo, and comparison formats
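
As an illustration of the Schema.org item above, FAQPage markup can be generated from your question-answer pairs and embedded in the page inside a `<script type="application/ld+json">` tag. The `faq_jsonld` helper is a hypothetical name. The JSON structure follows the schema.org FAQPage type, but whether a given crawler actually uses it is an assumption you should verify.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # ensure_ascii=False keeps non-Latin text readable in the page source.
    return json.dumps(data, ensure_ascii=False, indent=2)
```

Generating the markup from the same data that renders the visible FAQ keeps the two from drifting apart, which also supports the consistency signal described earlier.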

3. Content Strategy

  • Write for questions, not keywords. ChatGPT users ask detailed questions of 15-25 words
  • Structure for extraction: question headings, direct answers in the first sentences, tables, lists
  • Add unique data: original research, case studies with numbers, industry statistics
  • Indicate authorship: real expert names with qualifications

4. Monitoring Positions in ChatGPT

You need to regularly check whether ChatGPT recommends your brand. Monitoring prompts:

  • "What [product type] do you recommend for [scenario]?"
  • "Compare the best [category]"
  • "Which companies are leading in [niche]?"
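
The prompt templates above can be expanded into a full monitoring set programmatically. This is a minimal sketch: `expand` is a hypothetical helper, and the bracketed placeholders from the list are rewritten as `{named}` fields so Python's standard string formatting can fill them.

```python
from itertools import product
from string import Formatter


def expand(template: str, values: dict[str, list[str]]) -> list[str]:
    """Fill every placeholder in the template with every combination of values."""
    # Collect placeholder names in order, without duplicates.
    fields = list(dict.fromkeys(
        name for _, name, _, _ in Formatter().parse(template) if name
    ))
    combos = product(*(values[f] for f in fields))
    return [template.format(**dict(zip(fields, combo))) for combo in combos]
```

For instance, `expand("What {product_type} do you recommend for {scenario}?", {"product_type": ["CRM", "help desk"], "scenario": ["a small business"]})` produces two prompts, one per product type.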

Manual monitoring across dozens of prompts is inefficient. Learn more about systematic tracking in the article how to track brand visibility in ChatGPT.


Common Mistakes When Optimizing for ChatGPT

| Mistake | Why it doesn't work | What to do instead |
| --- | --- | --- |
| Keyword stuffing | ChatGPT evaluates meaning, not keyword density | Write expert content with facts |
| SEO-only optimization | ChatGPT is not Google; ranking works differently | Add GEO optimization to SEO |
| Marketing text without facts | AI cannot extract specifics for citation | Add numbers, timelines, prices, characteristics |
| Optimizing only the website | ChatGPT evaluates mentions across multiple sources | Publish in media, reviews, directories |
| Ignoring Bing | ChatGPT web search goes through Bing | Optimize for Bing too |
| One-time optimization | AI models update, data becomes outdated | Continuous monitoring and iteration |

Learn more about GEO optimization as a systematic discipline in the foundational article.


Checklist: Optimizing for ChatGPT

Training Data (Long-term)

  • Publish expert materials in authoritative media (industry publications, tech blogs, professional outlets)
  • Create original research and case studies with specific numbers
  • Ensure brand presence on independent review platforms
  • Keep company profiles current across all sources
  • Indicate authorship of expert content with qualifications

Bing Web Search (Medium-term)

  • Register in Bing Webmaster Tools and submit sitemap
  • Add Schema.org markup (Product, Organization, FAQPage)
  • Verify that robots.txt does not block GPTBot and OAI-SearchBot
  • Update all prices and characteristics on your website
  • Create FAQ sections on key pages
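
The robots.txt item in this checklist can be verified with Python's standard library. The sketch below parses the text of a robots.txt file and reports which of the two crawlers named above would be blocked for a given URL. `blocked_agents` is a hypothetical helper, and the standard-library parser may interpret edge cases differently from the crawlers themselves.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(
    robots_txt: str,
    url: str,
    agents: tuple[str, ...] = ("GPTBot", "OAI-SearchBot"),
) -> list[str]:
    """Return the user agents that this robots.txt forbids from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

To check a live site, download its `/robots.txt` first and pass the text in. Keeping the fetch separate makes the function easy to test against sample files.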

Content (Ongoing)

  • Write content in a question-answer format
  • Use tables for comparisons and specifications
  • Include specific numbers: prices, timelines, quantities
  • Structure text with h2/h3 question headings
  • Update content regularly and show a "last updated" date on the page

Monitoring (Weekly)

  • Check brand position in ChatGPT responses for target prompts
  • Compare ChatGPT recommendations with other providers
  • Track Share of Voice compared to competitors
  • Use the Command Center to prioritize actions
  • Analyze which prompts lead to recommendations and which don't
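
Share of Voice from the list above can be estimated from saved responses. The sketch below assumes one particular definition: a brand's share is the number of responses mentioning it divided by the total mentions of all tracked brands. `share_of_voice` is a hypothetical helper, and the simple substring match will miscount brands whose names appear inside other words.

```python
def share_of_voice(responses: list[str], brands: list[str]) -> dict[str, float]:
    """Each brand's share of all tracked-brand mentions across the responses."""
    counts = {brand: 0 for brand in brands}
    for response in responses:
        lowered = response.lower()
        for brand in brands:
            # Count a brand at most once per response.
            if brand.lower() in lowered:
                counts[brand] += 1
    total = sum(counts.values())
    if total == 0:
        return {brand: 0.0 for brand in brands}
    return {brand: count / total for brand, count in counts.items()}
```

Tracking this number week over week for the same prompt set shows whether optimization work is shifting recommendations, which a single spot check cannot.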

Frequently Asked Questions

How does ChatGPT decide which brand to recommend?
ChatGPT combines two sources: training data (the information the model was trained on) and real-time web search via Bing. When generating a response, the model evaluates source authority, frequency of brand mentions in relevant contexts, and the availability of specific facts and structured data. Brands with a strong presence in authoritative sources and well-structured content are recommended more often.

Does ChatGPT use real-time internet data?
Yes, ChatGPT can perform web searches via Bing for current information. However, not all queries trigger web search: the model first attempts to answer from its training data. Search is activated when the question requires up-to-date data, contains temporal markers (prices, events), or the model is uncertain about the answer. This means both channels matter for GEO: training data presence and web presence.

What is RAG and how does it affect ChatGPT recommendations?
RAG (Retrieval-Augmented Generation) is a mechanism where the AI first finds relevant documents and then generates a response based on them. In ChatGPT, this is implemented through Bing search: the model formulates a search query, receives results, extracts facts from them, and synthesizes an answer. Content that is well-indexed by Bing and contains structured answers has a better chance of being included in the RAG selection.

Does SEO affect ChatGPT recommendations?
Indirectly, yes. Sites with good SEO metrics are better indexed by Bing, meaning they have a better chance of appearing in ChatGPT web search results. But there is no direct influence: ChatGPT does not rank like a search engine. Content quality matters more: expert content, structured data, specific facts and figures. Learn more about the differences between SEO and GEO in our SEO vs GEO article.

Why does ChatGPT recommend my competitor instead of my brand?
Three main reasons: 1) the competitor has more mentions in authoritative sources (media, reviews, directories); 2) the competitor's content is better structured for AI extraction (tables, FAQs, comparisons); 3) the competitor has a stronger presence in the Bing index. For diagnostics, compare the presence of both brands through GEO monitoring.

How often does ChatGPT update its knowledge about brands?
Training data is updated when new model versions are released, typically every few months. Web search via Bing works in real time. In practice, this means: if you update your site today, Perplexity will see it within hours, ChatGPT via web search within days, and ChatGPT training data may update within months.

Is it possible to optimize content specifically for ChatGPT?
Yes, and this is the foundation of GEO optimization. Key actions: create content in a question-answer format, use Schema.org markup, publish expert materials with specific facts, ensure presence in Bing-indexed sources, and use tables and lists for structuring information. You can monitor results through the geoscout.pro platform.