[analog-help] Escaping Quotes in Logs
Jeremy Wadsack
jeremy at 7simplemachines.com
Tue Feb 26 17:10:46 PST 2008
The format of these parameters is not typical for a web log. Usually
the query string is URL-escaped, in which case quote characters are
converted to their hex equivalents (%22 for a double quote, %27 for an
apostrophe). For your example I would expect something more like this:
F%2E+Scott+Fitzgerald%27s+evolving+American+dream:+the+
%22pursuit+of+happiness%22+in+Gatsby%2C+Tender+is+the+night
In this case Analog can parse the file just fine. For your files you
will probably need to pre-process the lines to convert them to something
Analog can support.
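
As a rough illustration of that pre-processing step, here is a minimal
Python sketch. It assumes the logger writes an embedded double quote as \"
and (this part is a guess about your server) a literal backslash as \\. The
function name escape_quotes is made up for this example and is not part
of Analog.

```python
import re

# Backslash followed by a double quote or another backslash.
# Assumption: these are the only escapes the logger produces.
_ESCAPE = re.compile(r'\\(["\\])')
_URL_ESCAPED = {'"': "%22", "\\": "%5C"}


def escape_quotes(line: str) -> str:
    """Rewrite backslash escapes in one log line as URL (percent) escapes.

    Unescaped quotes, i.e. the ones that delimit the field, are left alone.
    """
    return _ESCAPE.sub(lambda m: _URL_ESCAPED[m.group(1)], line)


# Fragment of the query string from the original question.
sample = r'"...dream:+the+\"pursuit+of+happiness\"+in+Gatsby..."'
print(escape_quotes(sample))
```

Run over every line of the log before handing the file to Analog, this
leaves only the delimiting quotes in place, so the quoted field is no
longer cut short at the first embedded quote.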
Thanks,
Jeremy Wadsack
Seven Simple Machines
-----Original Message-----
From: analog-help-bounces at lists.meer.net
[mailto:analog-help-bounces at lists.meer.net] On Behalf Of Roberto Hoyle
Sent: Tuesday, February 26, 2008 11:32 AM
To: Support for analog web log analyzer
Subject: Re: [analog-help] Escaping Quotes in Logs
On Feb 26, 2008, at 12:19 PM, Aengus wrote:
> Roberto Hoyle <roberto.j.hoyle at dartmouth.edu> wrote:
>> I have a log entry of the form:
>>
>> "Entry"
>>
>> however, the Entry above may have escaped quotes (\") in it.
>>
>> Is there a way to have Analog differentiate between escaped quotes
>> and
>> regular quotes in its parsing?
>
> ?? Can you expand on your description of the problem? I don't
> understand what you're trying to do, and what you expect to get,
> versus what you're actually getting.
A query string can have quotes in it, and the log file will contain
the quotes with a preceding '\':
"http://ry2ue4ek7d.search.serialssolutions.com/?sid=HWW:HUMAB&genre=arti
cle&pid=
<an>199728800301005</an>&aulast=Callahan&aufirst=John
+F.&issn=0041-462X&title=Twentieth+Century+Literature&stitle=Twentieth
+Century+Lit&atitle=F.+Scott+Fitzgerald's+evolving+American+dream:+the+
\"pursuit+of+happiness\"+in+Gatsby,+Tender+is+the+night,+and+The+last
+tycoon&volume=42&spage=374&epage=395&date=1996&ssn=fall"
Note that "pursuit of happiness" was entered as a search term, and
therefore gets escaped (\") and has the spaces replaced by '+'.
Is there a way for Analog to deal with the entire line, instead of
stopping at the first '"' character, if I try to analyze the query
parameters?
r.