

robots.txt for Shopify and AI Bots: Keep Your Catalog Visible in ChatGPT, Perplexity, and Google AI

How to configure Shopify robots.txt for AI crawlers without hiding the products, collections, shipping pages, and commercial signals needed for GEO.

robots.txt · Shopify · AI bots · ecommerce
Vladislav Puchkov
Founder of GEO Scout, GEO optimization expert

For ecommerce, AI search is becoming a decision channel. Users ask "what should I buy", "which store has better delivery", "where can I return the product", "which brand is safer for kids", "what is the best gift under $100", and "which shop is reliable". If a Shopify store hides products, collections, and policy pages from AI crawlers, the answer will often be built from marketplaces, review sites, affiliate content, or competitors.

robots.txt in Shopify is therefore not just a technical file. It is a commercial visibility policy. It defines which parts of the catalog can be read by machines and which operational areas remain private.

Shopify Uses robots.txt.liquid

Shopify manages robots.txt through the robots.txt.liquid template. The platform already includes default rules for technical areas. That is useful, but manual changes can create problems. Do not copy a robots.txt file from WordPress, Magento, or a custom stack. Shopify has its own paths, parameters, theme behavior, and app-generated URLs.
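Shopify's documented customization pattern keeps the platform's default groups and appends rules to them instead of replacing the file wholesale. A minimal sketch of robots.txt.liquid following that pattern; the appended Disallow targets a hypothetical app parameter and is illustrative only:

{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- comment -%} Append a custom rule to the wildcard group only {%- endcomment -%}
  {%- if group.user_agent.value == '*' -%}
    {{ 'Disallow: /*?view=quickbuy' }}
  {%- endif -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}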

Before changing the policy, map the store:

  • product pages;
  • collections;
  • vendor or brand pages;
  • shipping, payment, returns, and warranty pages;
  • blog and buying guides;
  • search and filter URLs;
  • cart, checkout, and account;
  • pages generated by apps.

Only after that should you decide what AI bots can access.
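One way to build that map is to read Shopify's sitemap index, which lists child sitemaps for products, collections, pages, and blogs. A short Python sketch, with example.com standing in for the real store domain:

import urllib.request
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
BASE = "https://example.com"  # stand-in for the real store domain

def fetch_xml(url):
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())

# Shopify serves /sitemap.xml as an index of child sitemaps
# (products, collections, pages, blogs).
index = fetch_xml(f"{BASE}/sitemap.xml")
counts = Counter()
for loc in index.findall(".//sm:sitemap/sm:loc", NS):
    child = fetch_xml(loc.text.strip())
    for url in child.findall(".//sm:url/sm:loc", NS):
        # Group by first path segment: products, collections, pages, blogs, ...
        segment = urlparse(url.text.strip()).path.strip("/").split("/")[0] or "home"
        counts[segment] += 1

for segment, n in counts.most_common():
    print(f"/{segment}: {n} URLs")

Running this against a live store shows at a glance how many product, collection, page, and blog URLs a crawler can discover before any policy changes.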

What AI Should See in a Shopify Store

AI answers need more than product cards. The system tries to understand assortment, buying conditions, availability, trust signals, and buyer fit.

Keep these pages accessible:

  • /products/... with unique descriptions, specs, and media;
  • /collections/... when collections have meaningful text and logic;
  • shipping, payment, return, and warranty pages;
  • brand pages that explain assortment and positioning;
  • FAQ, size guides, and buying guides;
  • blog posts with selections, comparisons, and tutorials;
  • review pages if they are public and compliant.

These URLs can become sources when a user asks an AI assistant where to buy something or which product fits a specific need.

What to Block

Operational Shopify areas should not become part of AI answers:

User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /search
Disallow: /*?sort_by=
Disallow: /*?filter.
Disallow: /*preview_theme_id=
 
Sitemap: https://example.com/sitemap.xml

This is an example of policy logic, not a universal file. Some stores use search pages as intentional SEO landing pages. Some apps create useful public URLs. Each rule should be tested against the actual store.

Which AI Bots to Allow or Restrict

Shopify stores should not confuse AI search with model training. If the store wants visibility in ChatGPT Search, Perplexity, Google AI, and similar surfaces, public commercial pages need to be accessible to retrieval bots. If the legal policy restricts training use, handle training crawlers separately.

Example pattern:

User-agent: OAI-SearchBot
Allow: /
Disallow: /cart
Disallow: /checkout
Disallow: /account
 
User-agent: GPTBot
Disallow: /
 
User-agent: PerplexityBot
Allow: /
Disallow: /cart
Disallow: /checkout
Disallow: /account

The tradeoff is real. Blocking all AI bots may protect content, but it can reduce the chance that products appear in answers and recommendations. Opening everything can create crawl waste and expose low-value duplicates.

Collections and Filters

Collections are often the strongest Shopify pages for AI visibility. A collection can explain the assortment better than a single product card. For example, "gifts for home under $100" may match a user prompt more directly than one SKU.

Filters and sorting parameters are different. They can create thousands of duplicates. If a parameter only changes order or a narrow combination without unique content, block it or canonicalize it. If a filtered view is intentionally built as a landing page with copy, FAQ, schema, and internal links, it can remain accessible.
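Canonicalization on Shopify usually flows through the global canonical_url Liquid object, rendered in layout/theme.liquid. It is worth checking in your theme that filtered and sorted views actually resolve to the base collection URL:

<link rel="canonical" href="{{ canonical_url }}">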

Pair robots.txt With Product Data

robots.txt is only one layer. A Shopify store needs a complete AI-readable stack:

  • accessible priority URLs;
  • correct canonical tags;
  • Product schema on product pages;
  • Offer, priceCurrency, availability, and shippingDetails;
  • Merchant Center or another product feed;
  • FAQ on policy and category pages;
  • clear collection descriptions.

If a bot can open the page but cannot find price, stock, delivery, specs, or return terms, AI systems may rely on third-party sources anyway.
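As a reference point, here is a minimal sketch of product-page JSON-LD covering those fields, using standard schema.org vocabulary. The product, price, and URLs are invented for illustration; many Shopify themes and apps emit similar markup:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Ceramic Travel Mug",
  "sku": "MUG-350",
  "description": "Hand-glazed 350 ml ceramic mug with a sealed lid.",
  "offers": {
    "@type": "Offer",
    "url": "https://example.com/products/ceramic-travel-mug",
    "price": "24.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "shippingDetails": {
      "@type": "OfferShippingDetails",
      "shippingRate": { "@type": "MonetaryAmount", "value": "4.99", "currency": "USD" },
      "shippingDestination": { "@type": "DefinedRegion", "addressCountry": "US" }
    }
  }
}
</script>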

How to Validate the Change

A practical validation workflow:

  1. Open /robots.txt and confirm the live Shopify output.
  2. Check the sitemap and whether products and collections are listed.
  3. Confirm that priority pages do not carry a noindex directive.
  4. Review CDN, app, or server logs when available.
  5. Test prompts around "where to buy", "best store", "delivery", "returns", "price", and "product comparison".
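For step 1, Python's standard-library robot parser gives a quick pass/fail view per bot. A sketch with example.com and hypothetical catalog paths; note that urllib.robotparser does not implement Google-style * wildcards, so wildcard rules such as the sort_by pattern above still need manual review:

from urllib.robotparser import RobotFileParser

ROBOTS_URL = "https://example.com/robots.txt"  # replace with the live store

# Bots to check, and representative paths that should stay open or closed.
AGENTS = ["OAI-SearchBot", "GPTBot", "PerplexityBot"]
PATHS = [
    "/products/sample-product",  # hypothetical product URL
    "/collections/gifts",        # hypothetical collection URL
    "/cart",
    "/checkout",
    "/account",
]

parser = RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # fetches and parses the live robots.txt

for agent in AGENTS:
    for path in PATHS:
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(f"{agent:15} {path:26} {verdict}")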

In GEO Scout, ecommerce teams can create prompt clusters and track whether the store is mentioned more often, whether catalog URLs appear as cited sources, and whether marketplaces are intercepting category answers. That is the business-level validation missing from most robots.txt checks.

Mini Checklist

  • Products and collections are open when useful and indexable.
  • Cart, checkout, account, preview, and private areas are blocked.
  • Filter and sort combinations do not create crawl traps.
  • OAI-SearchBot is not blocked together with GPTBot.
  • Product schema and shipping data are complete.
  • Sitemap contains current products and collections.
  • Impact is measured in AI answers, not only in the robots.txt file.

A Shopify store should be legible to AI systems as a commercial source: what it sells, who it fits, what it costs, how it ships, and why it can be trusted. robots.txt controls the doorway to that knowledge.

Frequently Asked Questions

Can robots.txt be edited in Shopify?
Yes. Shopify lets merchants customize the robots.txt.liquid template, but changes should be made carefully because Shopify already blocks some technical paths by default.
What should stay open for AI bots on Shopify?
Products, collections, brand pages, shipping, returns, payment, FAQ, reviews, buying guides, and editorial content should usually stay accessible because they support commercial AI answers.
Which Shopify URLs are usually blocked?
Cart, checkout, account pages, internal search, sort parameters, preview URLs, and technical duplicates are common candidates. Exact rules depend on the theme, apps, and store architecture.
Should filtered collection pages be open?
Main collections should be open if they contain unique content. Filter and sort combinations should usually be blocked or canonicalized unless they are built as intentional landing pages.
How can I measure the effect of Shopify robots.txt changes?
Compare crawl logs, product discoverability, and AI answers for commercial prompts. GEO Scout can show whether store pages appear more often as cited sources and recommendations.