jwm.robotstxt.googlebot
This file implements the standard defined by the Robots Exclusion Protocol (REP) internet draft (I-D).
Google doesn’t follow the standard strictly, because there are a lot of non-conforming robots.txt files out there, and we err on the side of disallowing when this seems intended.
A more user-friendly description of how Google handles robots.txt can be found at:
This library provides a low-level parser for robots.txt (ParseRobotsTxt()), and a matcher for URLs against a robots.txt (class RobotsMatcher).
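The callback-style design described above can be illustrated with a minimal, self-contained sketch. It does not use this library's API: `CollectingHandler`, `handle_directive`, and `parse_robots_sketch` are hypothetical names invented for illustration, and the parsing rules are deliberately simpler than Google's lenient handling of non-conforming files.

```python
class CollectingHandler:
    """Hypothetical handler that records every directive it is told about."""

    def __init__(self):
        self.events = []

    def handle_directive(self, line_num, key, value):
        # Store the directive; a matcher could instead build rule groups here.
        self.events.append((line_num, key, value))


def parse_robots_sketch(body, handler):
    """Simplified parser (an assumption, not the library's algorithm).

    Emits one callback per "key: value" line and silently skips
    blank lines, comments, and lines without a colon.
    """
    for line_num, raw in enumerate(body.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        handler.handle_directive(line_num, key.strip().lower(), value.strip())


body = (
    "User-agent: *\n"
    "Disallow: /private/  # keep crawlers out\n"
    "\n"
    "bogus line\n"
    "Allow: /private/public\n"
)
handler = CollectingHandler()
parse_robots_sketch(body, handler)
for event in handler.events:
    print(event)
```

Separating the parser from the handler keeps the line-level tokenizing independent of what is done with each directive, which is why a matcher can be layered on top as just another handler.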
Functions
Parses body of a robots.txt and emits parse callbacks.
Classes
RobotsMatcher - matches robots.txt against URLs.
Handler for directives found in robots.txt.