Bot Crawl Checker

Check if your URL is accessible to 24 well-known search engine crawlers, AI bots, and social media bots. Analyze robots.txt rules, HTTP status, and response times.


How It Works

1. Enter URL & Email: Provide the website URL you want to check and your email address.
2. Verify Email: Click the verification link in your inbox to start the crawl check.
3. Get Report: Receive a detailed report showing which of the 24 bots can access your site.

24 Bots We Check

We simulate requests using each bot's real User-Agent header and check robots.txt, HTTP status, meta tags, and WAF challenges.

Search (12): Googlebot (Google), BingBot (Microsoft), YandexBot (Yandex), Baiduspider (Baidu), DuckDuckBot (DuckDuckGo), Yahoo Slurp (Yahoo), Applebot (Apple), SeznamBot (Seznam), Qwantbot (Qwant), Yeti (Naver), MojeekBot (Mojeek), Kagi Bot (Kagi)

AI (7): GPTBot (OpenAI), ChatGPT-User (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), Google-Extended (Google), CCBot (Common Crawl), cohere-ai (Cohere)

SEO (2): AhrefsBot (Ahrefs), SemrushBot (Semrush)

Social (3): Twitterbot (X/Twitter), Facebookbot (Meta), LinkedInBot (LinkedIn)

What We Check

robots.txt Rules

Parses robots.txt and checks each bot's specific User-Agent directives.

HTTP Response

Sends actual GET requests with each bot's real User-Agent header.

Meta Robots Tags

Detects noindex/nofollow directives in HTML meta tags.

X-Robots-Tag Headers

Checks HTTP response headers for indexing restrictions.

WAF/Firewall Detection

Detects Cloudflare JS challenges, CAPTCHAs, and 403 blocks.

Rate Limiting

Identifies HTTP 429 responses that throttle bot access.

How to Check Your Website's Bot Accessibility

  1. Enter your website URL

     Type or paste the URL you want to check into the Website URL field. You can enter a full URL like https://example.com/page or just example.com — we'll add HTTPS automatically.

  2. Provide your email address

     Enter a valid email address where you want to receive the crawl report. We use email verification to prevent abuse and ensure you get your results.

  3. Click Check Crawlability

     Hit the Check Crawlability button. We'll send a verification email to your inbox with a one-click confirmation link.

  4. Verify your email

     Open the verification email and click Verify & Start Check. This triggers the full crawl check — all 24 bots are tested against your URL simultaneously.

  5. Receive your report

     Within a few minutes, you'll get a detailed email report showing which bots can access your site, which are blocked, and exactly why — including robots.txt rules, HTTP errors, meta tags, and WAF challenge detection.

Why Check Bot & Crawler Accessibility?

Websites can accidentally block search engine crawlers, AI bots, and social media bots through misconfigured firewalls, robots.txt rules, or security settings. If Googlebot or BingBot can't access your pages, your content won't appear in search results. Blocking AI crawlers like GPTBot or ClaudeBot means your content won't be cited in AI-powered search experiences. Social media bots need access to generate link previews when your URLs are shared.

Common causes of accidental blocking include Cloudflare Bot Fight Mode, overly aggressive WAF rules, misconfigured robots.txt files, and meta robots tags set to noindex. These issues can go undetected for months, silently hurting your search rankings and traffic. This tool checks your URL against 24 well-known bots to identify these issues.

What This Tool Checks

For each of the 24 bots, we perform five independent checks:

  1. robots.txt Analysis — We fetch your site's robots.txt file and parse the User-agent rules specific to each bot. A bot might be allowed by the wildcard (User-agent: *) but specifically blocked by its own directive (e.g., User-agent: GPTBot / Disallow: /).
  2. HTTP Response — We send an actual GET request using each bot's real User-Agent header string. This reveals whether your server, CDN, or WAF returns different responses to different bots (403 Forbidden, 429 Rate Limited, etc.).
  3. X-Robots-Tag Header — Some servers send indexing directives via HTTP headers instead of HTML meta tags. We check for noindex and nofollow in the X-Robots-Tag response header.
  4. Meta Robots Tag — We parse the first 8KB of HTML to find <meta name="robots" content="..."> tags that might block indexing.
  5. WAF Challenge Detection — We detect Cloudflare JavaScript challenges, CAPTCHA pages, and other WAF challenge responses that bots cannot solve.
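Checks 1 and 4 above can be sketched with Python's standard library. This is an illustrative sketch, not the tool's actual implementation; `robots_allows` and `meta_noindex` are made-up helper names.

```python
import re
from urllib import robotparser

def robots_allows(robots_txt: str, bot_token: str, path: str) -> bool:
    """Check whether a robots.txt body lets bot_token fetch path.
    A bot-specific group (e.g. User-agent: GPTBot) overrides the
    wildcard (User-agent: *) group, as described above."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(bot_token, path)

def meta_noindex(html_head: str) -> bool:
    """Detect a <meta name="robots" content="...noindex..."> tag in
    the first chunk of HTML (the tool scans only the first 8 KB)."""
    match = re.search(
        r'<meta[^>]*name=["\']robots["\'][^>]*content=["\']([^"\']*)["\']',
        html_head, re.IGNORECASE)
    return bool(match) and "noindex" in match.group(1).lower()
```

With the robots.txt from the example above (`User-agent: *` allows everything, `User-agent: GPTBot` disallows `/`), `robots_allows` returns True for Googlebot and False for GPTBot on the same path.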

Bots We Test

Search Engine Crawlers (12)

Googlebot, BingBot, YandexBot, Baiduspider, DuckDuckBot, Yahoo Slurp, Applebot, SeznamBot, Qwantbot, Yeti (Naver), MojeekBot, and Kagi Bot. These crawlers index your pages for their respective search engines.

AI Crawlers (7)

GPTBot and ChatGPT-User (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), Google-Extended, CCBot (Common Crawl), and cohere-ai (Cohere). These bots access content for AI training and AI-powered search experiences.

SEO & Social Bots (5)

AhrefsBot, SemrushBot (SEO analysis), Twitterbot, Facebookbot, and LinkedInBot (social media link previews).


How to Fix Common Issues

Blocked by robots.txt: Edit your robots.txt file to add Allow: / for the blocked bot's User-agent.
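For example, if GPTBot was blocked, a corrected robots.txt might look like this (the `/admin/` path is illustrative):

```
# Bot-specific group: explicitly allow GPTBot
User-agent: GPTBot
Allow: /

# Wildcard group still applies to every other bot
User-agent: *
Disallow: /admin/
```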

HTTP 403 (WAF/Firewall): Check your CDN's security settings. In Cloudflare, go to Security > Bots and ensure Bot Fight Mode isn't blocking verified crawlers. Create a WAF rule with expression (cf.client.bot) and action Skip.

Cloudflare JS Challenge: Lower your Security Level from "High" to "Medium" or create a firewall rule that skips challenges for known bot user agents.

Meta noindex: Remove or update the <meta name="robots" content="noindex"> tag from your HTML. This is sometimes added during development and forgotten in production.
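For instance, a leftover development tag and a production-safe replacement (the tag can also simply be removed, since indexing is the default):

```html
<!-- Blocks indexing; often left over from a staging build -->
<meta name="robots" content="noindex, nofollow">

<!-- Production: allow indexing (or omit the tag entirely) -->
<meta name="robots" content="index, follow">
```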

Frequently Asked Questions

1. What is a bot crawl checker?

A bot crawl checker tests whether well-known web crawlers (like Googlebot, BingBot, GPTBot) can access your website. It simulates requests using each bot's real User-Agent header and checks multiple blocking layers including robots.txt, HTTP status codes, meta tags, response headers, and firewall challenges.
2. Why is Googlebot or BingBot blocked from my site?

Common causes include: a restrictive robots.txt file with Disallow rules, Cloudflare Bot Fight Mode or aggressive WAF rules challenging crawler traffic, a meta robots tag set to 'noindex', X-Robots-Tag HTTP headers blocking indexing, or the server returning 403/429/503 errors to bot user agents. Our tool checks all of these layers.
3. What's the difference between robots.txt and meta robots?

robots.txt controls whether a bot can crawl (access) a URL. It's checked before the bot even downloads the page. Meta robots tags (in HTML) and X-Robots-Tag headers control whether a page is indexed after the bot accesses it. A page can be crawlable but not indexable, or vice versa. Both matter for search visibility.
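One consequence worth spelling out: if robots.txt blocks a page, the crawler never fetches it, so any noindex meta tag on that page goes unseen. The combinations look like this:

```
robots.txt   meta robots   outcome
Allow        (none)        crawled and indexable
Allow        noindex       crawled, but excluded from the index
Disallow     (any)         not crawled; the noindex is never seen, so the
                           URL can still appear in results via external links
```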
4. Should I block AI crawlers like GPTBot and ClaudeBot?

It depends on your goals. Allowing AI crawlers means your content can be used in AI-powered search experiences (ChatGPT, Claude, Perplexity) which can drive traffic and citations. Blocking them prevents your content from being used for AI training. Many sites allow AI search bots (ChatGPT-User, PerplexityBot) while blocking training bots (GPTBot, CCBot).
5. What is Cloudflare Bot Fight Mode and how does it affect crawlers?

Cloudflare Bot Fight Mode automatically challenges traffic that appears to come from bots. While it's designed to block malicious bots, it can also interfere with legitimate search engine crawlers, causing them to receive JavaScript challenges they can't solve. This results in your pages not being indexed. You can disable it or create WAF rules to allow verified bots.
6. How many bots does this tool check?

We check 24 bots across 4 categories: 12 search engine crawlers (Google, Bing, Yandex, Baidu, DuckDuckGo, Yahoo, Apple, Seznam, Qwant, Naver, Mojeek, Kagi), 7 AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, CCBot, cohere-ai), 2 SEO tool crawlers (Ahrefs, Semrush), and 3 social media bots (Twitter, Facebook, LinkedIn).
7. Why do social media bots need access to my site?

When someone shares your URL on Twitter, Facebook, or LinkedIn, these platforms send their bots to fetch your page's Open Graph meta tags (title, description, image) to generate a link preview card. If your site blocks these bots, shared links will appear as plain text without rich previews, reducing click-through rates.
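For reference, a minimal set of the preview tags these bots look for (all values are placeholders):

```html
<meta property="og:title" content="Page Title">
<meta property="og:description" content="One-line summary shown in the preview card.">
<meta property="og:image" content="https://example.com/preview.jpg">
<!-- Twitter/X additionally reads its own card tag -->
<meta name="twitter:card" content="summary_large_image">
```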
8. What does HTTP 405 'Method Not Allowed' mean for bots?

HTTP 405 means the server rejects the HTTP method (GET or HEAD) used by the bot. This often happens with Cloudflare Workers or custom server configurations that only handle GET requests and reject HEAD requests. Since many crawlers use HEAD requests to check URLs before crawling, this can prevent indexing.
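You can reproduce this failure mode yourself by probing a URL with both methods. A minimal sketch using Python's urllib; the `ExampleBot/1.0` User-Agent string is made up for illustration:

```python
from urllib import request, error

def method_support(url, user_agent="Mozilla/5.0 (compatible; ExampleBot/1.0)"):
    """Probe a URL with HEAD and GET and report the status code of each.
    A 405 on HEAD alongside a 200 on GET matches the failure mode
    described above: a server that only implements GET."""
    statuses = {}
    for method in ("HEAD", "GET"):
        req = request.Request(url, method=method,
                              headers={"User-Agent": user_agent})
        try:
            with request.urlopen(req, timeout=10) as resp:
                statuses[method] = resp.status
        except error.HTTPError as exc:
            statuses[method] = exc.code  # e.g. 405 Method Not Allowed
    return statuses
```

A result like `{"HEAD": 405, "GET": 200}` indicates the HEAD-rejecting configuration described in the answer above.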
9. How does the email verification work?

When you submit a URL and email, we send a verification link to your inbox. Clicking the link confirms your email is valid and starts the crawl check. Once all 24 bots are tested, we email you a detailed report with the results. This prevents abuse and ensures reports reach real recipients.
10. Is this tool free?

Yes, the Bot Crawl Checker is completely free. You can run up to 3 checks per email per day. The full report is sent to your email with detailed results for all 24 bots, including specific issues found and recommendations for fixing them.
