llms.txt for Astro: Static GEO Implementation for AI Crawlers
How to make Astro sites AI-crawler ready with llms.txt, robots.txt, sitemap, canonical URLs, structured data, static HTML, and crawler log checks.
Astro is one of the easiest frameworks to make GEO-friendly because it ships complete static HTML by default, so crawlers receive full content without executing JavaScript. That advantage disappears if the site blocks bots, omits sitemaps, duplicates URLs, or moves important content into client-only components.
The implementation goal is simple: AI crawlers should discover canonical pages, retrieve complete content, understand entities through structured data, and cite the right URLs.
Root Files
For a static setup:
public/llms.txt
public/robots.txt
public/favicon.svg
For generated routes:
src/pages/llms.txt.ts
src/pages/sitemap.xml.ts
The final URLs must be direct:
/llms.txt
/robots.txt
/sitemap.xml
llms.txt Template
# Example Astro Site
> Official product, documentation, pricing, and customer proof for Example.
## Product
- https://example.com/
- https://example.com/features/
- https://example.com/pricing/
## Learn
- https://example.com/blog/
- https://example.com/docs/
## Trust
- https://example.com/customers/
- https://example.com/security/
Use URLs that you actually want AI systems to understand and cite.
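If you generate the file from a route (src/pages/llms.txt.ts) instead of shipping a static file, a minimal sketch could look like this. The section names and URLs are placeholders, and the `renderLlmsTxt` helper is mine, not an Astro API:

```typescript
// src/pages/llms.txt.ts — sketch of a generated llms.txt route.
// Section names and URLs below are illustrative placeholders.
const sections: Record<string, string[]> = {
  Product: ['https://example.com/', 'https://example.com/pricing/'],
  Learn: ['https://example.com/blog/', 'https://example.com/docs/'],
};

// Renders the llms.txt body: H1 title, blockquote summary, H2 sections.
function renderLlmsTxt(title: string, summary: string): string {
  const body = Object.entries(sections)
    .map(([name, urls]) => `## ${name}\n${urls.map((u) => `- ${u}`).join('\n')}`)
    .join('\n');
  return `# ${title}\n> ${summary}\n${body}\n`;
}

// Astro invokes an exported GET handler for the route (at request time,
// or at build time for static output).
export const GET = () =>
  new Response(
    renderLlmsTxt('Example Astro Site', 'Official product and docs for Example.'),
    { headers: { 'Content-Type': 'text/plain; charset=utf-8' } },
  );
```

Generating the route keeps llms.txt in sync with content collections instead of a hand-edited file.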
robots.txt
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: *
Disallow: /admin/
Disallow: /preview/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml
Astro preview routes, CMS previews, and admin screens should stay blocked. Public docs, blogs, customer pages, and product pages should stay accessible.
Static HTML Checklist
Before release, run:
curl -A "GPTBot/1.0" -s https://example.com/features/ | sed -n '1,120p'
Check that the raw HTML includes:
- a single descriptive H1;
- product or page-specific body text;
- internal links;
- canonical link;
- metadata;
- JSON-LD;
- FAQ if the page has one;
- no empty shell dependency for critical content.
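The checklist above can be partly automated. The sketch below takes the raw HTML string (for example, the curl output saved to a file) and runs rough regex checks for the key items; `auditRawHtml` is a hypothetical helper, and regexes are an approximation rather than a real HTML parser:

```typescript
// Rough static-HTML audit for the pre-hydration markup.
// Regex checks are approximate; a real parser would be stricter.
interface AuditResult {
  h1Count: number;          // should be exactly 1
  hasCanonical: boolean;    // <link rel="canonical">
  hasJsonLd: boolean;       // <script type="application/ld+json">
  hasInternalLinks: boolean; // at least one root-relative <a href="/...">
}

function auditRawHtml(html: string): AuditResult {
  return {
    h1Count: (html.match(/<h1[\s>]/gi) ?? []).length,
    hasCanonical: /<link[^>]+rel=["']canonical["']/i.test(html),
    hasJsonLd: /<script[^>]+application\/ld\+json/i.test(html),
    hasInternalLinks: /<a\s[^>]*href=["']\//i.test(html),
  };
}
```

Run it against the raw curl output, never against the hydrated DOM, since the point is to verify what crawlers see before JavaScript runs.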
Astro Islands
Use islands for calculators, demos, forms, carousels, and filters. Keep crawler-critical content outside client-only islands:
---
const faq = [
{ question: 'Who is this for?', answer: 'B2B teams comparing vendors.' },
];
---
<h1>AI-ready product analytics</h1>
<p>Server-visible product explanation for buyers and AI crawlers.</p>
<InteractiveDemo client:load />
The demo can hydrate later. The explanation must be present immediately.
Canonicals and Collections
Astro content collections can produce many URLs. Avoid duplicate trailing-slash patterns, tag archives without canonical strategy, and translated pages without language links.
Each article, doc, and case study should have:
- canonical URL;
- published and updated dates;
- author;
- breadcrumb;
- stable slug;
- links to related pages.
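The trailing-slash problem above comes down to picking one canonical form and applying it everywhere. A minimal normalization sketch (the `canonicalUrl` helper name and the trailing-slash choice are mine, not Astro defaults):

```typescript
// Builds one canonical form per page: no query, no fragment, one
// trailing-slash convention applied site-wide.
function canonicalUrl(site: string, slug: string): string {
  const url = new URL(slug, site);
  url.search = ''; // tracking params must not create duplicate URLs
  url.hash = '';
  // Pick ONE convention (with or without trailing slash) and keep it
  // consistent across pages, sitemap, and llms.txt. Here: with slash.
  if (!url.pathname.endsWith('/')) url.pathname += '/';
  return url.toString();
}
```

Use the same helper when emitting the canonical `<link>`, sitemap entries, and llms.txt URLs, so all three always agree.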
Structured Data
Use Article for blog posts, FAQPage for FAQ sections, BreadcrumbList for navigation, Organization for the brand, and SoftwareApplication for SaaS products.
Example:
<script type="application/ld+json" set:html={JSON.stringify({
'@context': 'https://schema.org',
'@type': 'Article',
headline: Astro.props.title,
datePublished: Astro.props.publishedAt,
dateModified: Astro.props.updatedAt,
})} />
Log Checks
Static hosting still has logs through platforms such as Cloudflare, Vercel, Netlify, or your CDN. Review whether AI crawlers receive 200 responses for public URLs and whether they waste crawl budget on redirects, old slugs, or blocked preview pages.
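A log review like the one above can be sketched as a small counter: given combined-log-format access lines exported from your CDN, count status codes per AI crawler and look for anything other than 200 on public URLs. The `botStatusCounts` helper and the log format are assumptions about your hosting setup:

```typescript
// Counts HTTP status codes per AI crawler from access-log lines.
// Assumes combined log format, where the status code follows the quoted
// request: ... "GET /docs/ HTTP/1.1" 200 ...
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'PerplexityBot'];

function botStatusCounts(logLines: string[]): Map<string, Map<number, number>> {
  const counts = new Map<string, Map<number, number>>();
  for (const line of logLines) {
    const bot = AI_BOTS.find((b) => line.includes(b));
    if (!bot) continue; // not an AI crawler we track
    const m = line.match(/" (\d{3}) /); // status after the quoted request
    if (!m) continue;
    const status = Number(m[1]);
    const perBot = counts.get(bot) ?? new Map<number, number>();
    perBot.set(status, (perBot.get(status) ?? 0) + 1);
    counts.set(bot, perBot);
  }
  return counts;
}
```

A spike of 301s or 404s for a bot usually points at old slugs or a trailing-slash mismatch worth fixing in the canonical strategy.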
Astro GEO Checklist
- Publish /llms.txt.
- Confirm robots.txt does not block useful pages.
- Generate a clean sitemap from collections.
- Keep critical content in static HTML.
- Add canonical URLs and schema.
- Validate raw HTML with bot user agents.
- Monitor source citations with GEO Scout on geoscout.pro.
Astro gives you the right rendering model by default. GEO work is about keeping that output structured, discoverable, and measurable.
Frequently Asked Questions
Is Astro good for AI crawler readiness?
Yes. Astro ships complete static HTML by default, so crawlers that do not execute JavaScript still receive full page content, as long as critical content stays out of client-only islands.
Where does llms.txt go in Astro?
Either as a static file at public/llms.txt or as a generated route at src/pages/llms.txt.ts. Both must resolve to /llms.txt at the site root.
Do Astro islands hurt GEO?
Not if they are limited to interactive widgets. Keep the H1, body copy, FAQ text, and structured data in server-rendered HTML, and hydrate demos, calculators, and filters separately.
How do I measure Astro GEO changes?
Fetch raw HTML with bot user agents, review hosting or CDN logs for crawler status codes, and monitor whether AI answers cite your canonical URLs.
Related Articles
AI Crawler Readiness Checklist: Is Your Site Ready for GPTBot, OAI-SearchBot, and Others?
A technical checklist for AI crawler readiness covering robots.txt, sitemaps, SSR, status codes, logs, CDN rules, rate limits, structured data, and unblocked content.
GEO for Headless CMS: Technical Checklist for AI-Ready Content Models
How to configure a headless CMS for AI search with structured fields, canonical URLs, sitemaps, schema, SSR or static rendering, and crawler-safe publishing workflows.
Schema for Case Studies: Make Customer Proof Easier for AI to Cite
How to structure SaaS and B2B case studies with Article, Organization, FAQPage, BreadcrumbList, metrics, canonical URLs, and AI-crawler friendly rendering.