Crawler analytics

Crawler analytics shows you which AI bots visit your site, how often, and what they fetch. The data comes from your Cloudflare account and refreshes hourly.

This is a different signal from prompts and citations. Citations tell you which sources AI models cite in answers. Crawler analytics tells you which pages AI models fetch. The gap between the two is often the most useful diagnostic.

Requirements

You need a Cloudflare account with at least one zone (your domain) on a paid plan. Cloudflare’s AI Crawl Control surfaces the underlying data on all paid plans (Pro, Business, Enterprise); retention on the Free plan is too short for this integration.

Your Cloudflare API token needs these permissions:

Zone → Bot Management → Read
Zone → Zone → Read
Zone → Analytics → Read

Under Zone Resources, set Include → Specific zone → your domain.
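Before pasting the token into Bourd, you can check that Cloudflare considers it active. The sketch below is a minimal example, assuming a user-level API token and Cloudflare's token verify endpoint; the function names are illustrative and not part of Bourd. It only confirms that the token is valid, not that it carries the three read permissions listed above.

```python
import json
import urllib.request

# Cloudflare's verify endpoint for user-level API tokens.
VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"


def build_verify_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the verification request for a token."""
    return urllib.request.Request(
        VERIFY_URL,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_is_active(token: str) -> bool:
    """Send the request and report whether the token is active.

    urlopen raises urllib.error.HTTPError if Cloudflare rejects the token.
    """
    with urllib.request.urlopen(build_verify_request(token), timeout=10) as resp:
        body = json.load(resp)
    return bool(body.get("success")) and body.get("result", {}).get("status") == "active"
```

Call token_is_active("your-token") from a machine with network access. A rejected token raises an HTTP error rather than returning False.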

Setup

In Bourd, open the workspace you want to attach the integration to.
Navigate to Settings → Integrations.
Click Connect Cloudflare.
Paste your Cloudflare API token. Confirm the hostname (Bourd pre-fills it from your tracked domain).
Click Connect.

Bourd verifies the token against your hostname. If verification succeeds, your first sync starts within the hour.

What gets synced

On the first sync: all AI bot requests from the last 7 days.
After that: new AI bot requests, synced hourly.
Only requests from known AI crawlers are kept. The full list is in the next section.

What does not get synced

Traffic older than 7 days.
Generic search bots (Googlebot, Bingbot) and non-AI scrapers.
Real human traffic.

Interpreting the data

The dashboard groups crawler hits by bot. The bots fall into three categories, and the same number of requests can mean very different things depending on which category produced them.

Training crawlers

These crawl your site to gather data for the next model generation. They run in the background and generally respect robots.txt. Anything they collect feeds the next training cycle.

Bot	Provider
`gptbot`	OpenAI
`claudebot`	Anthropic
`perplexitybot`	Perplexity
`bytespider`	ByteDance
`ccbot`	Common Crawl
`meta-externalagent`	Meta
`applebot`	Apple
`amazonbot`	Amazon

High traffic here: your content is being absorbed for training. The visibility payoff lands months later, when the next model ships.

Zero traffic here: either you have blocked the crawler in robots.txt, or your content has not been prioritized for the next training round.
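To rule out the robots.txt explanation, you can test your rules against the training crawler names in the table above. This is a minimal sketch using Python's standard-library robots.txt parser; the sample rules and the example.com URL are placeholders, and the parser's matching is a simplification of how each crawler actually interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Training crawler names from the table above.
TRAINING_BOTS = [
    "gptbot", "claudebot", "perplexitybot", "bytespider",
    "ccbot", "meta-externalagent", "applebot", "amazonbot",
]


def blocked_training_bots(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the training crawlers that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in TRAINING_BOTS if not parser.can_fetch(bot, url)]


# Placeholder rules: block GPTBot, allow everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_training_bots(SAMPLE))  # ['gptbot']
```

If the list comes back empty for your real robots.txt, the rules are not what is keeping these crawlers away.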

User-triggered fetches

These fire when someone using an AI assistant asks it to look at a specific URL, or clicks a link in an answer. Real human, real conversation, in real time.

Bot	Provider
`chatgpt-user`	OpenAI
`claude-user`	Anthropic
`perplexity-user`	Perplexity
`meta-externalfetcher`	Meta
`duckassistbot`	DuckDuckGo
`mistralai-user`	Mistral

High traffic here: you are appearing in AI answers right now and users are clicking through. This is the closest thing to direct attribution that AI visibility produces.

Zero traffic here: AI assistants are not surfacing your URLs to users, even if your brand is being mentioned.

Search index builders

These crawl to populate the search index AI assistants query at runtime. They sit between the training corpus and the user-facing answer.

Bot	Provider
`oai-searchbot`	OpenAI
`claude-searchbot`	Anthropic
`google-cloudvertexbot`	Google (Vertex AI)
`facebookbot`	Meta

High traffic here: you are in the index AI assistants reach for when they need current information. This is your retrieval-layer visibility.

Zero traffic here: AI assistants will not find your content when they search. Anything time-sensitive (recent posts, product changes, news) is invisible at runtime.
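The three categories above can be sketched as a lookup from bot name to category. This is an illustrative example, not Bourd's actual matching logic: it assumes a case-insensitive substring match on the user-agent string, and the category keys and function name are made up for the sketch.

```python
from typing import Optional

# Bot names taken from the three tables above.
BOT_CATEGORIES = {
    "training": [
        "gptbot", "claudebot", "perplexitybot", "bytespider",
        "ccbot", "meta-externalagent", "applebot", "amazonbot",
    ],
    "user_triggered": [
        "chatgpt-user", "claude-user", "perplexity-user",
        "meta-externalfetcher", "duckassistbot", "mistralai-user",
    ],
    "search_index": [
        "oai-searchbot", "claude-searchbot",
        "google-cloudvertexbot", "facebookbot",
    ],
}


def classify_user_agent(user_agent: str) -> Optional[tuple[str, str]]:
    """Return (category, bot) for a known AI crawler, or None for anything else."""
    ua = user_agent.lower()
    for category, bots in BOT_CATEGORIES.items():
        for bot in bots:
            if bot in ua:
                return category, bot
    return None


# Simplified user-agent strings, for illustration only.
print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.2)"))      # ('training', 'gptbot')
print(classify_user_agent("Mozilla/5.0 (compatible; ChatGPT-User/1.0)"))  # ('user_triggered', 'chatgpt-user')
print(classify_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1)"))    # None
```

Anything that returns None, such as a generic search bot or human traffic, falls outside what gets synced.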

Reading the gap between categories

The most useful diagnostic is comparing categories against each other.

A site with high training-crawler traffic but no *-user and no *-searchbot activity is in training data but not surfacing in answers. The model knows you exist. The content fit or messaging is what needs work, not the SEO stack.

A site with *-user traffic but very little training-crawler traffic is being surfaced through the search and retrieval layers, ahead of the next training cycle. Common for fresh content. Watch the training-crawler signal pick up over the following weeks.

A site with strong activity across all three categories is what you want. Memory, retrieval, and live click-through are all working.

Compare these patterns against your citation data in Bourd: which sources are cited, which are crawled, and where the gap lives.
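The patterns described in this section can be expressed as a small decision function. This is a rough sketch under stated assumptions: the function name, the returned labels, and the quiet threshold (the request count at or below which a category is treated as inactive) are all hypothetical, and the right threshold depends on your site's traffic.

```python
def diagnose(training: int, user_triggered: int, search_index: int, quiet: int = 0) -> str:
    """Map per-category request counts to one of the patterns described above.

    A category counts as active only if its count is above `quiet`.
    """
    has_training = training > quiet
    has_user = user_triggered > quiet
    has_search = search_index > quiet

    if has_training and has_user and has_search:
        return "memory, retrieval, and click-through all active"
    if has_training and not has_user and not has_search:
        return "in training data, not surfacing in answers"
    if has_user and not has_training:
        return "surfaced via retrieval ahead of the next training cycle"
    return "no documented pattern"


print(diagnose(training=500, user_triggered=0, search_index=0))
# in training data, not surfacing in answers
print(diagnose(training=2, user_triggered=40, search_index=10, quiet=5))
# surfaced via retrieval ahead of the next training cycle
```

Combinations the section does not describe, such as training and search-index traffic with no user-triggered fetches, fall through to the last label on purpose rather than being forced into one of the three patterns.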

Permissions

Required role: Admin. Members can view crawler analytics but cannot connect, edit, or disconnect the integration.
The Cloudflare API token is encrypted at rest and is never returned after you save it. Disconnecting the integration deletes it.

Troubleshooting

No data after 24 hours. Open the integration row and look for an error banner. The two common causes are a token missing one of the required read scopes, or a zone with no AI bot traffic inside the 7-day window.

Banner reads “Cloudflare rejected the request: …” Common variants:

“…token does not have permission…” The token is missing one of the required scopes. Re-issue the token in Cloudflare with the scopes listed in Requirements above, then reconnect.
“…does not have access to the field…” Your zone is on a tier without that data feature. This rarely affects core functionality; if it does, reach out to support.
“…cannot request data older than…” Your plan’s retention window is shorter than Bourd asked for. The next sync will succeed for traffic inside the available window.

A specific bot is missing. Bourd recognizes a fixed list of known AI bots. If you see a crawler in your Cloudflare dashboard that doesn’t appear in Bourd, email support@bourd.dev with the user-agent string and we will add it.

Next steps

Analyze citations to see which sources AI models recommend, and compare against which sources they crawl.
Set up your brand and competitors if you have not already, so you can correlate crawler activity with mentions in answers.