[analog-help] cache files narfs up user reports
Aengus
analog07 at eircom.net
Tue Sep 11 11:32:52 PDT 2007
kevin creason <ckevinj at gmail.com> wrote:
>> I think I understand the hole I've dug myself into-- but let me run
>> this by the experts.
>>
>> I have large log files (~1.5 gb each month, roll over monthly).
>> I have lots of different organizations and applications that would
>> like to have stats for just their little demesne (16 now, including
>> a generic all inclusive one).
>> Each report is slightly different, but not hugely.
As the documentation says, "If you want different sets of options, you
must create several cache files from the same logfile". You can't
reliably generate different reports from a single cache file.
>> I was running all the reports one after the other, each with one
>> main cfg file and then their subfiles. It runs in under 20 minutes
>> on the 11th of a month... I don't recall how long it took on the
>> 31st, but it was a bit longer. :)
If you're running 15 different reports, then you might want to consider
running a script that creates 15 separate log files. That way, you only
need to read the full log a maximum of 3 times (once to split it, once
to create your complete report, and, in effect, once more across the 15
partial logs to create your sub-reports). You can even use the
UNCOMPRESS command to combine the 2 full reads into a single operation.
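As a rough illustration of the splitting step, here is a minimal Python
sketch that makes one pass over the log and copies each line into the
file of every organization it matches. The rule names ("sales",
"support") and the URL substrings are purely hypothetical, and matching
on a plain substring is my assumption about how your sites are laid out;
adjust both to your own setup.

```python
import os

# Hypothetical mapping: output name -> substring that marks a request as
# belonging to that organization. These values are illustrative only.
RULES = {
    "sales": " /sales/",
    "support": " /support/",
}


def split_log(lines, rules, out_dir):
    """Copy each line to the file of every rule whose substring it contains.

    `lines` is any iterable of text lines (for example an open logfile).
    Returns a dict giving the number of lines written per rule name.
    """
    os.makedirs(out_dir, exist_ok=True)
    outputs = {
        name: open(os.path.join(out_dir, name + ".log"), "w")
        for name in rules
    }
    counts = dict.fromkeys(rules, 0)
    try:
        for line in lines:
            for name, needle in rules.items():
                if needle in line:
                    outputs[name].write(line)
                    counts[name] += 1
    finally:
        # Close every partial log even if reading fails part-way through.
        for handle in outputs.values():
            handle.close()
    return counts
```

You would call it along the lines of
`split_log(open("access.log", errors="replace"), RULES, "split")` and
then point each sub-report's configuration at its own partial log. A
line that matches more than one rule is written to each matching file.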
>> I'm assuming that that is because the cache file strips out some of
>> that data and the users data and request data tables aren't linked
>> together anymore. Is my interpretation of events/problem correct?
Yes.
>> Is there a better way to speed this up and provide such
>> application/site detail with Analog?
Split your logfiles. I can process about 100,000 lines per second on my
desktop, so a 2GB log file takes a bit over a minute to analyze, whether
I'm excluding most of it or not. Rather than reading the same logfile 15
times to make 15 different reports, you're at a point where the
investment in splitting the logs is probably worth the effort.
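To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. The 100,000 lines per second figure is the one quoted above; the
average line length of 250 bytes is only my assumption (it is not
measured from your logs), and the function name is made up for
illustration.

```python
def read_seconds(size_bytes, avg_line_bytes, passes=1, lines_per_second=100_000):
    """Estimate the time to read a logfile `passes` times.

    avg_line_bytes is an assumed average line length, not a measured one.
    """
    total_lines = size_bytes / avg_line_bytes
    return passes * total_lines / lines_per_second


# A 2GB log read once, assuming 250-byte lines: a bit over a minute.
print(read_seconds(2_000_000_000, 250))             # 80.0

# A 1.5GB monthly log read 15 times, once per report.
print(read_seconds(1_500_000_000, 250, passes=15))  # 900.0

# The same log read about 3 times in total after splitting.
print(read_seconds(1_500_000_000, 250, passes=3))   # 180.0
```

Under those assumptions, 15 full passes cost about 15 minutes against
about 3 minutes for the split approach. The real figures depend on your
line lengths and hardware.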
There's a post somewhere in the archives with a grep command line for
doing this, I think.
You could also just run the reports overnight - who cares if they take
45 minutes to run at 2AM?
Aengus