I sometimes check the robots.txt of sites to see what they might not want indexed.
It’s interesting because some sites use it as access control, which is, of course, silly. Just because robots won’t index a path doesn’t mean people won’t find it. Worse, by listing it in robots.txt, you’re handing humans a map of exactly where to look.
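For example, a robots.txt like this (paths are hypothetical) advertises the very things it’s trying to hide:

User-agent: *
Disallow: /admin/
Disallow: /internal-reports/

Anyone can fetch this file at /robots.txt and go straight to the disallowed paths.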
Pr0-tip: instead of listing content in robots.txt to keep it out of the index, use a noindex meta tag on the page itself.
<meta name="robots" content="noindex" />
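For content that isn’t HTML (PDFs, images, and so on), where a meta tag isn’t an option, the same hint can be sent as an HTTP response header, which major search engines also honor:

X-Robots-Tag: noindex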