Crawler analytics

Crawler analytics shows you which AI bots visit your site, how often, and what they fetched. The data comes from your Cloudflare account and refreshes hourly.

This is a different signal from prompts and citations. Citations tell you which sources AI models cite in answers; crawler analytics tells you which pages AI crawlers actually fetch. The gap between the two is often the most useful diagnostic.

You need a Cloudflare account with at least one zone (your domain). Cloudflare’s AI Crawl Control surfaces the underlying data on all paid plans (Pro, Business, Enterprise). Free plan retention is too short for this integration.

Your Cloudflare API token needs these permissions:

  • Zone → Bot Management → Read
  • Zone → Zone → Read
  • Zone → Analytics → Read

Under Zone Resources, set Include → Specific zone → your domain.
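Before pasting the token into Bourd, you can sanity-check it yourself against Cloudflare's token verify endpoint (`GET /user/tokens/verify`). A minimal Python sketch: `verify_token` needs network access, and only the `success` and `result.status` fields of the response are relied on here.

```python
import json
from urllib import request

VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"

def verify_token(token: str) -> dict:
    """Ask Cloudflare whether the token is valid (requires network access)."""
    req = request.Request(VERIFY_URL, headers={"Authorization": f"Bearer {token}"})
    with request.urlopen(req) as resp:
        return json.load(resp)

def token_is_active(payload: dict) -> bool:
    """True when the verify payload reports a valid, active token."""
    result = payload.get("result") or {}
    return bool(payload.get("success")) and result.get("status") == "active"
```

Note that verification only proves the token is live, not that it carries the three read scopes above; a scope mismatch still surfaces as an error banner after connecting.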

  1. In Bourd, open the workspace you want to attach the integration to.
  2. Navigate to Settings → Integrations.
  3. Click Connect Cloudflare.
  4. Paste your Cloudflare API token. Confirm the hostname (Bourd pre-fills it from your tracked domain).
  5. Click Connect.

Bourd verifies the token against your hostname. If it matches, your first sync starts within the hour.

Synced:

  • All AI bot requests from the last 7 days, on the first sync.
  • Hourly updates thereafter.
  • Only requests from known AI crawlers are kept. The full list is in the next section.

Excluded:

  • Traffic older than 7 days.
  • Generic search bots (Googlebot, Bingbot) and non-AI scrapers.
  • Real human traffic.
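The sync rules above reduce to a single predicate per request: known AI crawler, inside the 7-day window. A sketch, where `KNOWN_AI_BOTS` is a hypothetical, abbreviated stand-in for Bourd's actual allowlist (the full list is in the next section):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical, abbreviated allowlist; the real one is the full bot list below.
KNOWN_AI_BOTS = {"gptbot", "claudebot", "chatgpt-user", "oai-searchbot"}

SYNC_WINDOW = timedelta(days=7)

def keep_request(bot: str, seen_at: datetime, now: datetime) -> bool:
    """Keep a request only if it came from a known AI crawler within the window."""
    if bot.lower() not in KNOWN_AI_BOTS:
        return False  # drops Googlebot, Bingbot, non-AI scrapers, human traffic
    return now - seen_at <= SYNC_WINDOW
```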

The dashboard groups crawler hits by bot. The bots fall into three categories with very different meanings, and the same number of requests can imply very different things depending on which category produced them.

Training crawlers

These crawl your site to gather data for the next model generation. They run in the background, respect robots.txt, and anything they collect feeds the next training cycle.

  • gptbot (OpenAI)
  • claudebot (Anthropic)
  • perplexitybot (Perplexity)
  • bytespider (ByteDance)
  • ccbot (Common Crawl)
  • meta-externalagent (Meta)
  • applebot (Apple)
  • amazonbot (Amazon)

High traffic here: your content is being absorbed for training. The visibility payoff lands months later, when the next model ships.

Zero traffic here: either you have blocked the crawler in robots.txt, or your content has not been prioritized for the next training round.
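If the cause is robots.txt, the block usually looks like the following. GPTBot and ClaudeBot are the user-agent tokens OpenAI and Anthropic publish for their training crawlers; check each provider's documentation for the current token before relying on it.

```txt
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

Removing such entries is the first step if you want training-crawler traffic back; the crawlers typically need days to weeks to return.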

User-triggered fetchers

These fire when someone using an AI assistant asks it to look at a specific URL, or clicks a link in an answer. Real human, real conversation, in real time.

  • chatgpt-user (OpenAI)
  • claude-user (Anthropic)
  • perplexity-user (Perplexity)
  • meta-externalfetcher (Meta)
  • duckassistbot (DuckDuckGo)
  • mistralai-user (Mistral)

High traffic here: you are appearing in AI answers right now and users are clicking through. This is the closest thing to direct attribution that AI visibility produces.

Zero traffic here: AI assistants are not surfacing your URLs to users, even if your brand is being mentioned.

Search index crawlers

These crawl to populate the search index that AI assistants query at runtime. They sit between the training corpus and the user-facing answer.

  • oai-searchbot (OpenAI)
  • claude-searchbot (Anthropic)
  • google-cloudvertexbot (Google, Vertex AI)
  • facebookbot (Meta)
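One way to see how the three lists above partition the bots is a small category map, assembled directly from them; any crawler outside the lists is unknown to Bourd:

```python
# Category map assembled from the three bot lists above.
BOT_CATEGORY = {
    # training crawlers
    "gptbot": "training", "claudebot": "training", "perplexitybot": "training",
    "bytespider": "training", "ccbot": "training",
    "meta-externalagent": "training", "applebot": "training", "amazonbot": "training",
    # user-triggered fetchers
    "chatgpt-user": "user", "claude-user": "user", "perplexity-user": "user",
    "meta-externalfetcher": "user", "duckassistbot": "user", "mistralai-user": "user",
    # search index crawlers
    "oai-searchbot": "search", "claude-searchbot": "search",
    "google-cloudvertexbot": "search", "facebookbot": "search",
}

def categorize(bot: str) -> str:
    """Map a bot name to its category; unrecognized crawlers are 'unknown'."""
    return BOT_CATEGORY.get(bot.lower(), "unknown")
```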

High traffic here: you are in the index AI assistants reach for when they need current information. This is your retrieval-layer visibility.

Zero traffic here: AI assistants will not find your content when they search. Anything time-sensitive (recent posts, product changes, news) is invisible at runtime.

The most useful diagnostic is comparing categories against each other.

A site with high training-crawler traffic but no *-user and no *-searchbot activity is in training data but not surfacing in answers. The model knows you exist. The content fit or messaging is what needs work, not the SEO stack.

A site with *-user traffic but very little training-crawler traffic is being surfaced through the search and retrieval layers, ahead of the next training cycle. Common for fresh content. Watch the training-crawler signal pick up over the following weeks.

A site with strong activity across all three categories is what you want. Memory, retrieval, and live click-through are all working.
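The three patterns above can be read straight off the per-category request counts. A simplified sketch: it treats "zero vs. any" the way the prose does, while a real diagnostic would want thresholds relative to your overall traffic.

```python
def diagnose(training: int, user: int, search: int) -> str:
    """Rough reading of per-category request counts, per the patterns above."""
    if training and not user and not search:
        return "in training data, not surfacing in answers"
    if user and not training:
        return "surfacing via retrieval ahead of the next training cycle"
    if training and user and search:
        return "healthy: memory, retrieval, and click-through all active"
    return "mixed signal"
```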

Compare these patterns against your citation data in Bourd: which sources are cited, which are crawled, and where the gap lives.

  • Required role: Admin. Members can view crawler analytics but cannot connect, edit, or disconnect the integration.
  • The Cloudflare API token is encrypted at rest and is never returned after you save it. Disconnecting the integration deletes it.

No data after 24 hours. Open the integration row and look for an error banner. The two common causes are a token missing one of the required read scopes, or a zone with no AI bot traffic inside the 7-day window.

Banner reads “Cloudflare rejected the request: …” Common variants:

  • “…token does not have permission…” The token is missing one of the required scopes. Re-issue the token in Cloudflare with the scopes listed in Requirements above, then reconnect.
  • “…does not have access to the field…” Your zone is on a tier without that data feature. This rarely affects core functionality; if it does, reach out to support.
  • “…cannot request data older than…” Your plan’s retention window is shorter than Bourd asked for. The next sync will succeed for traffic inside the available window.

A specific bot is missing. Bourd recognizes a fixed list of known AI bots. If you see a crawler in your Cloudflare dashboard that doesn’t appear in Bourd, email support@bourd.dev with the user-agent string and we will add it.
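For reference, recognition works on the bot token inside the user-agent string. A minimal sketch with a hypothetical, abbreviated token list (the real list is the one in the bot tables above):

```python
from typing import Optional

# Hypothetical, abbreviated token list; Bourd matches against its full bot list.
BOT_TOKENS = ("gptbot", "claudebot", "chatgpt-user", "oai-searchbot")

def detect_bot(user_agent: str) -> Optional[str]:
    """Return the first known bot token found in the user-agent string, else None."""
    ua = user_agent.lower()
    for token in BOT_TOKENS:
        if token in ua:
            return token
    return None
```

When you email support, the full user-agent string is what matters: it carries the token that needs to be added to this list.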