[analog-help] Re: How to exclude Crawlers and Robots
Andreas Kuhn
akuhn at gmx.de
Tue Jan 16 13:21:09 PST 2007
Aengus <analog07 at ...> writes:
>
> On Tuesday, January 16, 2007 5:55 AM [EDT],
> Andreas Kuhn <AKuhn at ...> wrote:
>
> >> Do crawlers and robots have any influence on the request report? If
> >> so, how can I exclude the PIs that crawlers and robots produce?
>
> You can use HOSTEXCLUDE or BROWEXCLUDE (http://analog.cx/docs/include.html)
> to exclude any robots/spiders that you identify. (There's an up-to-date list
> of browser strings used by known robots at
> http://www.wadsack.com/robot-list.html)
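> As a minimal sketch (the host and browser patterns below are illustrative
> assumptions, not entries taken from the list above), such lines in the
> Analog configuration file might look like this:
> --snip--
> # Exclude requests from hosts matching a crawler domain (assumed pattern)
> HOSTEXCLUDE *.googlebot.com
> # Exclude requests whose browser string matches a robot (assumed patterns)
> BROWEXCLUDE *Googlebot*
> BROWEXCLUDE *Slurp*
> --snap--
> BROWEXCLUDE can only match if the log file records the browser string, and
> a host name pattern can only match if the log contains resolved host names
> rather than bare IP addresses.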
START REPLY ANDREAS KUHN-----------------
I will try this. Is there any way to exclude all robots identified
by "includerobot"?
END REPLY ANDREAS KUHN-----------------
>
> >> My problem is that I am using several reporting tools. Comparing the
> >> figures, the Analog figures are about 50% higher than the others. Now
> >> my question is whether the crawlers are producing this difference.
>
> They might be, but there are many reasons why different reporting methods
> return different answers. Analog reports on the data in your web server's log
> files, and you can be quite sure that it is extremely accurate. But its
> reports depend on the parameters that it is told to use (include/exclude
> certain hosts, ignore image requests, what counts as a page, etc). If you
> use a different method that uses different parameters, you'll get a
> different result.
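> For illustration only, parameters of this kind are set with configuration
> lines such as the following. The patterns are assumptions chosen for the
> example, not a recommendation for any particular site:
> --snip--
> # Ignore image requests (assumed file extensions)
> FILEEXCLUDE *.gif
> FILEEXCLUDE *.jpg
> # Count PDF files as pages in addition to the default page types
> PAGEINCLUDE *.pdf
> --snap--
> Two tools that differ in even one such setting will count the same log
> differently.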
START REPLY ANDREAS KUHN-----------------
Well, you are right. I am trying to compare apples and oranges (server logs
and 0-pixel tracking). It is clear to me that I won't get the same figures. I
expect a difference of about 10%, more or less.
My past experiences with Analog were quite good, so I am inclined to believe
Analog. ;-)) But if Analog gives me the correct figures, the other system might
have a fault; probably I am making a mistake ....
For filtering I am starting with
--snip--
#FILTER
Fileexclude *
Fileinclude *de2*html
Fileinclude *en2*html
Fileinclude *.pdf
etc.
--snap--
So I should only get HTML files and PDF files.
END REPLY ANDREAS KUHN-----------------
>
> If you don't understand the parameters that your different reporting
> methods are using, it's a waste of time comparing them. You can compare this
> month's Analog results to last month's and learn something useful from the
> comparison, but you won't learn anything useful by comparing an Analog report
> to some other report unless you understand the assumptions that both reports
> are based on.
START REPLY ANDREAS KUHN-----------------
I am just doing this comparison to know whether the new system works and gives
me similar figures. I am not expecting exactly the same figures.
END REPLY ANDREAS KUHN-----------------
>
> Aengus