http - Is there any advantage of using X-Robot-Tag instead of robots.txt?

Question

asked Oct 24, 2021 in Technique by 深蓝

There seem to be two mainstream ways to tell crawlers what to index and what not to index: sending an X-Robot-Tag HTTP header, or providing a robots.txt file.

Is there any advantage to using the former?

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:24:43+0000

With robots.txt you cannot disallow indexing of your documents.

They have different purposes:

robots.txt can disallow crawling (with Disallow)
X-Robots-Tag [1] can disallow indexing (with noindex)

(And both offer additional different features, e.g., linking to your Sitemap in robots.txt, disallowing following links in X-Robots-Tag, and many more.)
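To make the two mechanisms concrete, here is a minimal Python sketch of how a bot might evaluate each one. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the helper names `may_crawl` and `has_noindex` are all assumptions made for illustration. The header check is deliberately simplified: it only looks for a plain `noindex` directive and ignores bot-specific prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows user_agent to fetch url (crawling)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noindex(x_robots_tag: str) -> bool:
    """Return True if an X-Robots-Tag header value contains noindex (indexing).

    Simplification: directives are assumed to be comma-separated and not
    prefixed with a bot name.
    """
    directives = [d.strip().lower() for d in x_robots_tag.split(",")]
    return "noindex" in directives


if __name__ == "__main__":
    print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/private/report.html"))
    print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/public/page.html"))
    print(has_noindex("noindex, nofollow"))
```

Note that the two checks happen at different times: `may_crawl` is decided before any request for the document is made, while `has_noindex` can only be evaluated on a response the bot has already fetched.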

Crawling means accessing the document. Indexing means providing a link to (and possibly metadata from or about) the document in an index. In the typical case, a bot indexes a document after having crawled it, but crawling is not a prerequisite for indexing.

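A bot that isn’t allowed to crawl a document may still index it (without ever accessing it). A bot that isn’t allowed to index a document may still crawl it. You can’t disallow both for the same document: a bot that is blocked from crawling never fetches the response, so it never sees the noindex directive.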
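The interplay between the two rules can be sketched as a toy model of a compliant bot. This is not how any particular search engine is implemented; the function name `bot_outcome` and the returned keys are hypothetical, and the header parsing assumes plain comma-separated directives.

```python
from typing import Optional


def bot_outcome(crawl_allowed: bool, x_robots_tag: Optional[str]) -> dict:
    """Toy model: what a compliant bot does with one document.

    crawl_allowed: result of the robots.txt check for this URL.
    x_robots_tag:  header value the server would send, or None if absent.
    """
    if not crawl_allowed:
        # The document is never fetched, so the header is never seen.
        # The URL may still be indexed, e.g. based on links from other pages.
        return {"crawled": False, "header_seen": False, "may_index": True}

    directives = []
    if x_robots_tag is not None:
        directives = [d.strip().lower() for d in x_robots_tag.split(",")]

    # The document is fetched; indexing depends on the noindex directive.
    return {
        "crawled": True,
        "header_seen": True,
        "may_index": "noindex" not in directives,
    }


if __name__ == "__main__":
    # Blocked in robots.txt: the noindex header has no effect.
    print(bot_outcome(crawl_allowed=False, x_robots_tag="noindex"))
    # Crawlable with noindex: fetched, but kept out of the index.
    print(bot_outcome(crawl_allowed=True, x_robots_tag="noindex"))
```

In this model, the only way for `noindex` to take effect is to leave the document crawlable, which is why combining a Disallow rule with a noindex header does not work.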

[1] Note that the header is called X-Robots-Tag, not X-Robot-Tag. By the way, the metadata name robots (for the HTML meta element) is an alternative to the HTTP header.
