analysis of java application logs

 
 
Ulrich Scholz
 
      05-23-2011
Hi,

I'm looking for an approach to the problem of analyzing application
log files.

I need to analyse Java log files from applications (i.e., not logs of
web servers). These logs contain Java exceptions, thread dumps, and
free-form log4j messages issued by log statements inserted by
programmers during development. Right now, these man-made log entries
do not have any specific format.

What I'm looking for is a tool and/or strategy that supports lexing/
parsing, tagging, and analysing the log entries. Because there is only
little defined syntax and grammar - and because you might not know
what you are looking for - the task requires the quick issuing of
queries against the log database. Some sort of visualization would be
nice, too.

Pointers to existing tools and approaches as well as appropriate tools/
algorithms to develop the required system would be welcome.

Ulrich
 
 
 
 
 
Robert Klemme
 
      05-23-2011
On 23 May, 09:50, Ulrich Scholz wrote:
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.


I once did a project for our Ruby Best Practices blog. The code is
over there at github:
https://github.com/rklemme/muppet-laboratories

Explanations can be found in the blog. This is the first posting of
the series:
http://blog.rubybestpractices.com/po...oratories.html

This works differently from what you want: log files are read and
written out to smaller log files according to particular criteria. But
you could reuse the parsing part (including detection of multi-line
log statements) and write what you find into a relational database.
Once it is in the DB you can query by at least timestamp, log
level and message content, and probably also by thread ID and class. If you
want to do custom tagging you could do that once the data is in the
database.
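
To illustrate the parsing step in Java rather than Ruby, here is a minimal sketch of multi-line entry detection. It is not the code from the repository above. It assumes a log4j layout such as "2011-05-23 09:50:01,123 ERROR [main] com.example.Foo - text"; the class name LogEntryParser and the regular expression are made up for the example and would have to be adapted to the real layout. Every line that does not start a new entry (a stack trace line, for instance) is attached to the preceding entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups raw log lines into entries, folding continuation lines. */
final class LogEntryParser {

    /** One log entry; the fields map naturally onto columns of a database table. */
    static final class Entry {
        final String timestamp;
        final String level;
        final String thread;
        final String logger;
        private final StringBuilder message = new StringBuilder();

        Entry(String timestamp, String level, String thread, String logger, String firstLine) {
            this.timestamp = timestamp;
            this.level = level;
            this.thread = thread;
            this.logger = logger;
            this.message.append(firstLine);
        }

        /** Full message text, including any continuation lines. */
        String message() {
            return message.toString();
        }
    }

    // Assumed header layout: timestamp, level, [thread], logger name, " - ", message.
    private static final Pattern HEADER = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) +(\\w+) +\\[([^\\]]*)\\] +(\\S+) +- (.*)$");

    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        Entry current = null;
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                // A header line starts a new entry.
                current = new Entry(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5));
                entries.add(current);
            } else if (current != null) {
                // Anything else (e.g. a stack trace line) belongs to the previous entry.
                current.message.append('\n').append(line);
            }
            // Lines before the first header are dropped in this sketch.
        }
        return entries;
    }
}
```

Each Entry could then be written to a table with one column per field, for example through a JDBC PreparedStatement.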

Since we do not know what the goal of your analysis is or how many
different questions you want to ask of the data, it's not entirely clear
whether that would be the optimal approach for your problem. One
variant of the above would be to give the parsing process a number
of regular expressions, each with a label attached, and label all log entries
during insertion into the database. But since modern relational
databases usually also support full-text indexing and regular
expression matching, that might also be solved with a view. If your
data volume is large you additionally need to make sure this remains
efficient.
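
The labelling variant could look roughly like the following Java sketch. The class LogLabeler and the example rules are hypothetical and only show the idea: every rule is a label plus a regular expression, and an entry receives the labels of all rules whose expression is found in its message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper: attaches labels to log messages based on regular expressions. */
final class LogLabeler {

    // LinkedHashMap keeps the rules in the order in which they were registered.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    /** Registers a rule; returns this so that calls can be chained. */
    LogLabeler rule(String label, String regex) {
        rules.put(label, Pattern.compile(regex));
        return this;
    }

    /** Returns the labels of all rules that match somewhere in the message. */
    List<String> labelsFor(String message) {
        List<String> labels = new ArrayList<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(message).find()) {
                labels.add(rule.getKey());
            }
        }
        return labels;
    }
}
```

A possible use, with made-up rules:

```java
LogLabeler labeler = new LogLabeler()
        .rule("exception", "\\b[\\w.]+Exception\\b")
        .rule("timeout", "(?i)timed? ?out");
List<String> labels = labeler.labelsFor("java.net.SocketTimeoutException: Read timed out");
```

The resulting labels would be stored next to the entry during insertion, for instance in a separate entry-to-label table.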

Kind regards

robert
 
 
 
 
 
jlp
 
      05-23-2011
On 23/05/2011 09:50, Ulrich Scholz wrote:
> Hi,
>
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.
>
> Ulrich

A colleague and I have developed such a tool at work, so it is not free.

My colleague developed the viewer for the CSV files with the JFreeChart
library. The CSV files are time series (dates are, for example, in the
format YYYY/MM/DD:HH:mm:ss).
I developed my own parser that translates native logs into CSV files,
using Java regexp patterns.
In a file, we have to find the beginning and the end of each
record (a record can span multiple lines). I can
exclude or include records with Java regexp patterns.

We have to match the pattern of the date (a regexp plus a Java DateFormat
pattern).
For every record, we can extract useful values by pattern
matching (I use two-pass matching to simplify the patterns). The
values can be bound to a filter (an HTTP URL, for example).
All this is embedded in Swing components.
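
As an illustration only (this is not the tool described above, which is not public), the date matching and the two-pass value extraction could be sketched in Java like this. The class name LogToCsv, the "elapsed=...ms" fragment and the input timestamp layout are assumptions made for the example. The first pass isolates a small fragment of the record, the second pass pulls the number out of that fragment, so neither pattern has to describe the whole line.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical converter: turns one dated log record into one CSV time-series line. */
final class LogToCsv {

    // Date at the start of the record (assumed layout: "2011-05-23 09:50:01").
    private static final Pattern DATE =
            Pattern.compile("^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})");
    // Pass 1: isolate the interesting fragment, e.g. "elapsed=123ms".
    private static final Pattern FRAGMENT = Pattern.compile("elapsed=\\S+");
    // Pass 2: extract the numeric value from that fragment.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    /** Returns "yyyy/MM/dd:HH:mm:ss,value", or null if the record has no date or no value. */
    static String toCsv(String record) {
        Matcher d = DATE.matcher(record);
        if (!d.find()) {
            return null;
        }
        Matcher f = FRAGMENT.matcher(record);
        if (!f.find()) {
            return null;
        }
        Matcher n = NUMBER.matcher(f.group());
        if (!n.find()) {
            return null;
        }
        // SimpleDateFormat is not thread-safe, so new instances are created per call.
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        in.setLenient(false); // reject impossible dates such as month 13
        SimpleDateFormat out = new SimpleDateFormat("yyyy/MM/dd:HH:mm:ss");
        try {
            Date when = in.parse(d.group(1));
            return out.format(when) + "," + n.group();
        } catch (ParseException e) {
            return null;
        }
    }
}
```

With these assumptions, a record ending in "elapsed=123ms" yields one CSV line made of the reformatted date and the value 123, which a charting library can plot as a time series.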

I can parse acces logs ( Apache, tomcat, weblogic), log4J logs, Verbse
GC of JVM ( IBM JVM, Open JDK 7 ..), java Threads dumps, hibernate sql
logs, Tuxedo logs and more generally all implicit or explicit dated
enregistrements.
That are the main ways ...
I take me a long time, an still in developpement ... but we have not
found any other tool.
 
 
Lew
 
      05-23-2011
Ulrich Scholz wrote:
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.


It helps if you have a logging strategy that mandates a consistent logging
format, specific information in particular positions or marked by particular
markup, logging levels and other such so that your analysis tool isn't faced
with a completely open-ended input. What you describe requires a general
text-analysis approach, as you indicate that you can make no guarantees about
the format. Based on that, your best tool is "less" or equivalent text-file
reader.

What is a tool supposed to do, read your mind?

It's really hard to extract information from a garbage can where people just
randomly dumped whatever they individually felt like dumping without regard
for operational needs. You can't build a skyscraper on a bad foundation, and
you can't build a good log analysis off a crappy log.

Fix the logging system, then the analysis problem will be tractable.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg
 
 
Lew
 
      05-23-2011
CncShipper wrote:
> I wrote one of these and thought about Open Sourcing it, but lost


"open sourcing"

> interest. I parsed the logs into a db, and assigned id's to the


"DB, "IDs" (no greengrocer's apostrophe, and "id" is a different word from
"ID", although the meaning you imputed by the substitution is poetic and
interesting)

> various fields.
>
> You could then search by Type, ( WARNING, SEVERE, etc... )


"type"

> You could search a range of times
> It could handle multiple log files into one run
> could Sync on an event and stop analyzing on another trigger


"synch"

> Graphs to count trends, events, exceptions


"graphs"

> Used Reg-Ex a heck of a lot of work.. Sorted all the transactions in


"used" "regex"

> the logs, so you could also display by package name, really helped
> me solve a lot of problems when I was working .. took me nearly two
> years to complete everything to where it is today..


Double-dot, or two consecutive periods, is not legitimate punctuation in lieu
of a comma or full stop.

> I never found a package that even came close to it.. which is why I
> wrote it


You have made an important and useful point. Covering for a bad log format is
a freeform-text parsing problem, inherently difficult and heuristic and
probably never perfect. I wonder if your effort would have been better spent
converting to a log format that is parser-friendly, as the OP should do.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg
 
 
jlp
 
      05-23-2011
On 23/05/2011 17:43, Lew wrote:
[SNIP]
> You have made an important and useful point. Covering for a bad log
> format is a freeform-text parsing problem, inherently difficult and
> heuristic and probably never perfect. I wonder if your effort would have
> been better spent converting to a log format that is parser-friendly, as
> the OP should do.
>

I agree with you, Lee, it is what i did with my own tool. Native logs
are converted to CSV files. But some logs are not simple to convert:
- Java exceptions
- Java thread dumps (different for every JVM: Sun/Oracle, JRockit,
IBM ...)
- Java heap dump summaries (same remark)
- verbose GC logs (same remark)
- multi-line log records (XML logs ...)

Others are simpler:
- access logs that are Common Log Format (CLF) or extended CLF compliant
(Apache, Tomcat, IIS, WebLogic, WebSphere ...)
- log4j

 
 
Lew
 
      05-23-2011
jlp wrote:
> I agree with you, Lee, it is what i [sic] did with my own tool. Native logs are


Who's Lee?

--
Lew
The first-person singular pronoun in English is spelled "I", not "i". It's
only one letter long, so it should be possible to spell it correctly. This is
one of the first lessons in an EFL course, so it should come as no surprise.
 
 
Daniele Futtorovic
 
      05-23-2011
On 23/05/2011 18:20, Lew allegedly wrote:
> jlp wrote:
>> I agree with you, Lee, it is what i [sic] did with my own tool. Native
>> logs are

>
> Who's Lee?
>


You're Lee now, Lee.
 
 
Daniele Futtorovic
 
      05-23-2011
On 23/05/2011 15:11, Lew allegedly wrote:
> Ulrich Scholz wrote:
>> I'm looking for an approach to the problem of analyzing application
>> log files.
>>
>> I need to analyse Java log files from applications (i.e., not logs of
>> web servers). These logs contain Java exceptions, thread dumps, and
>> free-form log4j messages issued by log statements inserted by
>> programmers during development. Right now, these man-made log entries
>> do not have any specific format.
>>
>> What I'm looking for is a tool and/or strategy that supports in lexing/
>> parsing, tagging, and analysing the log entries. Because there is only
>> little defined syntax and grammar - and because you might not know
>> what you are looking for - the task requires the quick issuing of
>> queries against the log data base. Some sort of visualization would be
>> nice, too.
>>
>> Pointers to existing tools and approaches as well as appropriate tools/
>> algorithms to develop the required system would be welcome.

>
> It helps if you have a logging strategy that mandates a consistent
> logging format, specific information in particular positions or marked
> by particular markup, logging levels and other such so that your
> analysis tool isn't faced with a completely open-ended input. What you
> describe requires a general text-analysis approach, as you indicate that
> you can make no guarantees about the format. Based on that, your best
> tool is "less" or equivalent text-file reader.
>
> What is a tool supposed to do, read your mind?
>
> It's really hard to extract information from a garbage can where people
> just randomly dumped whatever they individually felt like dumping
> without regard for operational needs. You can't build a skyscraper on a
> bad foundation, and you can't build a good log analysis off a crappy log.
>
> Fix the logging system, then the analysis problem will be tractable.
>


I would argue along the same lines.

A while ago I was faced with a situation where some orthogonal
organisational unit wanted to exploit my logs. I told them to GTFO.

My logs are my logs. I put in them what I consider necessary. I often
improve them as I step through the code. I might change the message, fix
the level, &c. I don't want to have them set in stone. Neither do I
generally have enough confidence in them to allow them to be used for
analysis.

"The solution, then, is simple", I told them, "spec out the exact
messages and arguments you want, and the exact situations you want them
logged in, and I'll add them for you. But leave me my precious debugging
logs."

Let me emphasize: IMHO debugging logs and logs for analysis are two
different things and should be kept strictly separated -- possibly
logged to a different target respectively.

--
DF.
An escaped convict once said to me:
"Alcatraz is the place to be"
 
 
Robert Klemme
 
      05-23-2011
On 23.05.2011 19:06, Daniele Futtorovic wrote:
> On 23/05/2011 18:20, Lew allegedly wrote:
>> jlp wrote:
>>> I agree with you, Lee, it is what i [sic] did with my own tool. Native
>>> logs are

>>
>> Who's Lee?
>>

>
> You're Lee now, Lee.


Did you mean to say "Bruce"?


robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
 
 
 