[analog-help] robots and requests
Aengus
analog07 at eircom.net
Thu Sep 27 12:24:48 PDT 2007
Aimee Mandeville <aimee at edc.uri.edu> wrote:
> I have run Analog on my log files and in looking at the data I have
> noticed that many of my requests are coming from Robots, spiders,
> crawlers etc. I have figured out how to exclude these from the
> various reports using the HOSTREPEXCLUDE, DOMEXCLUDE and ORGEXCLUDE.
> I would like to know if these are still getting counted and reported
> as REQUESTS in my request report.
HOSTREPEXCLUDE only removes the matching hosts from the Host Report;
their requests are still counted in all the other reports, including
the Request Report. Use HOSTEXCLUDE if you want to exclude a host
entirely.
DOMEXCLUDE and ORGEXCLUDE will exclude any matching requests completely.
If you are logging Browser strings, then you should be able to use the
ROBOTINCLUDE command to define any browser strings as Robots, so that
you can get a count of them in the OS Report. Then you can tell whether
you're excluding them all or not by checking the number of "Unwanted
logfile entries" listed in the General Summary.
Better yet, if you are logging browser strings anyway, use the list at
http://www.wadsack.com/robot-list.html to get a list of known robots.
In that list, search and replace ROBOTINCLUDE with BROWEXCLUDE to
exclude all of those requests completely from your logfile analysis.
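
If you would rather script the search and replace than do it in an
editor, something like the following Python sketch should work. It
assumes the list is plain text with one upper-case ROBOTINCLUDE command
per line. The function name and the two sample patterns are made up for
illustration and are not taken from the list itself.

```python
import re


def robots_to_browexclude(config_text: str) -> str:
    """Rewrite each ROBOTINCLUDE command as a BROWEXCLUDE command."""
    # Only match the command name at the start of a line (after optional
    # indentation), so comments that merely mention it are left alone.
    pattern = re.compile(r"^([ \t]*)ROBOTINCLUDE\b", flags=re.MULTILINE)
    return pattern.sub(r"\1BROWEXCLUDE", config_text)


# Hypothetical sample lines, standing in for the downloaded robot list.
sample = "ROBOTINCLUDE Googlebot*\nROBOTINCLUDE REGEXPI:spider\n"
converted = robots_to_browexclude(sample)
print(converted)
```

Save the converted text in your configuration file, and keep the
original list if you still want the robots counted in the OS Report on
a separate run.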
Aengus