A little Direction Please

 
 
Andy
 
      05-13-2008
Greets


Q: I am trying to learn how to define some variables.

The basis of this script is to scrub log files for FTP logins and
separate out the successful logins,

then create an array (I hope that is the right terminology) to separate the fields.

I hardcoded the log file for now; eventually I am looking for a way for it to
scrub *.log files on a server,

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

A line ending in "226 0 -" is a successful login.

My plan is to scrub the logs, export the matches to a file, and
sort the fields into variables.

I hope in the end to get:

1. A log of successful logins.
2. A log of the last successful login (I think I am going to try a date
comparison, from most recent to oldest).
3. The ability to parse the fields and get data.


I know that some of you are advanced; I would
appreciate any direction or help.

Again, I am trying to put this together. This is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");
my $extractedLine;
while (<INPUT>) {
my $line = $_;
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
print OUTPUT "$1\n";
}
}
close(INPUT);
close(OUTPUT);
exit;

 
Ben Morrow
 
      05-13-2008

Quoth Andy:
> Greets
>
> Q; I am trying to learn how to define some variables
>
> The basis of this script is to Scrub log files for ftp logins,
> separate the successful logins
>
> Then create an array ( I hope the right terminology) to separate it
>
> I hardcoded the log file, because I am looking for a way for it to
> scrub *.logs on a server
>
> but ...hey step by step right.
>
> Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
> 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -


What are these fields separated by? A single space? Can the fields ever
contain spaces? How are they quoted in that case? What about newlines?

> This field 226 0 - is a successful login
>
> My plan is to scrub the logs, export to file.
>
> sort fields into variable.
>
> I hope in the end to get
>
> 1..log of successful logins
> 2.log of last successful login ( I think I am going to try date
> comparison from most recent to last.)
> 3 be able to parse the fields and get data.
>
>
> I know that there are those of you who are advanced, I would
> appreciate any directions or help.
>
> Again I am trying to put this together this is what I have so far.
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
> open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


3-arg open: good.
Checking the return value: good.
It's better to keep filehandles in variables than use the old-fashioned
global handles, though; and if the open fails you should say what
failed, and why:

open(my $INPUT, '<', "ex080120.log")
or die("can't read ex080120.log: $!");
open(my $OUTPUT, '>', "ftpacct.log")
or die("can't write ftpacct.log: $!");

> my $extractedLine;
> while (<INPUT>) {
> my $line = $_;


This is silly. If you want the line in $line, put it there in the first
place:

while (my $line = <$INPUT>) {

> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
> print OUTPUT "$1\n";
> }


I would recommend splitting the line into a hash first, and then
selecting lines based on that. Something like

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

while (my $line = <$INPUT>) {

    # Here I assume fields are delimited by a single space, and
    # spaces and newlines *never* appear in a field (not even inside
    # quotes). If this isn't true, you probably want to use the
    # Text::CSV_XS module, which can parse all sorts of
    # <foo>-delimited files.

    my %record;
    @record{@fields} = split / /, $line;

    $record{sc_status} == 226
        and $record{sc_bytes} == 0
        and $record{cs_host} eq '-'
        or next;

    print $OUTPUT $line;
}

Once you've understood that bit of code it should be straightforward to
change it to do something more sophisticated. To keep track of the last
login for any given user, you need a hash %lastlogin, keyed by username,
that lives outside the loop.
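
A minimal sketch of that %lastlogin idea, under some stated assumptions: the log is in date order (so a later line simply overwrites an earlier one), fields are separated by single spaces, and the sample data is made up for illustration (0601_Bob is a hypothetical user). The chomp is an addition that keeps the newline out of the last field.

```perl
use strict;
use warnings;

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# Made-up sample data in the format shown above; a real script would
# open the log file instead.
my $log = <<'END';
2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -
2008-01-20 00:05:10 x.x.x.x 0601_Bob [2]sent /0601_Bob/b.zip 226 0 -
2008-01-20 00:07:30 x.x.x.x 0601_Bob [3]sent /0601_Bob/c.zip 550 0 -
2008-01-20 00:09:45 x.x.x.x 0598_Andy [4]sent /0598_Andy/d.zip 226 0 -
END

# Read from the string as if it were a file.
open(my $INPUT, '<', \$log) or die("can't read sample data: $!");

my %lastlogin;    # lives outside the loop, keyed by username

while (my $line = <$INPUT>) {
    chomp $line;  # remove the newline so the last field is clean

    my %record;
    @record{@fields} = split / /, $line;

    $record{sc_status} == 226
        and $record{sc_bytes} == 0
        and $record{cs_host} eq '-'
        or next;

    # Assumes the log is in date order, so a later line overwrites
    # an earlier one for the same user.
    $lastlogin{ $record{cs_username} } = "$record{date} $record{time}";
}
close($INPUT);

for my $user (sort keys %lastlogin) {
    print "$user $lastlogin{$user}\n";
}
```

If the logs are not guaranteed to be in date order, the assignment would need a comparison against the stored value first.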

> }
> close(INPUT);
> close(OUTPUT);


An advantage of keeping filehandles in variables is that they are closed
for you when the variable goes out of scope. An advantage of real
operating systems (Win32 counts, here) is that they close filehandles
for you when the process exits, in any case.

That said, there is value in explicitly closing a filehandle opened for
writing, *and checking the return value*. If any of the writes to that
filehandle failed (disk full, for instance) the error will be returned
by close. (Of course, if you want to catch errors sooner than that, you
can check the return value of print instead.)
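
As a rough sketch of both checks: the example below writes one line to a file in a temporary directory (an assumption made only so that it does not touch real files) and checks the return value of print and of close.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write into a temporary directory so the sketch does not touch real files.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/ftpacct.log";

open(my $OUTPUT, '>', $filename)
    or die("can't write $filename: $!");

# print returns false if the write fails.
print {$OUTPUT} "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n"
    or die("can't print to $filename: $!");

# Buffered data is flushed on close, so an error such as a full disk
# may only show up here.
close($OUTPUT)
    or die("can't close $filename: $!");

my $size = -s $filename;
print "wrote $size bytes\n";
```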

> exit;


There's no need to explicitly exit from a Perl program. Falling off the
end is the usual way to finish.

Ben

--
I've seen things you people wouldn't believe: attack ships on fire off
the shoulder of Orion; I watched C-beams glitter in the dark near the
Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
Time to die.
 
Jürgen Exner
 
      05-13-2008
Andy wrote:
>Q; I am trying to learn how to define some variables


To define a variable in Perl typically you use the assignment operator
'='.
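
A minimal sketch of what that looks like under "use strict": each variable is declared with "my" and given a value with "=". The values come from the sample log line; the variable names are only illustrative.

```perl
use strict;
use warnings;

# A scalar holds a single value.
my $username = '0598_Andy';

# An array holds an ordered list of values.
my @fields = ('2008-01-20', '00:00:02', 'x.x.x.x', $username);

# A hash maps keys to values.
my %record = (
    date     => '2008-01-20',
    username => $username,
);

print "user: $username\n";
print "number of fields: ", scalar(@fields), "\n";
print "date from hash: $record{date}\n";
```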

>The basis of this script is to Scrub log files for ftp logins,
>separate the successful logins
>
>Then create an array ( I hope the right terminology) to separate it
>
>I hardcoded the log file, because I am looking for a way for it to
>scrub *.logs on a server
>
>but ...hey step by step right.
>
>Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
>2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -
>
>This field 226 0 - is a successful login
>
>My plan is to scrub the logs, export to file.
>
>sort fields into variable.
>
>I hope in the end to get
>
>1..log of successful logins
>2.log of last successful login ( I think I am going to try date
>comparison from most recent to last.)
>3 be able to parse the fields and get data.
>
>
>I know that there are those of you who are advanced, I would
>appreciate any directions or help.
>
>Again I am trying to put this together this is what I have so far.
>
>#!/usr/bin/perl
>use strict;
>use warnings;
>
>open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
>open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


You might want to add the reason why the open() call failed and the file
name for which it failed.

>my $extractedLine;


Why declare a variable that you never use again?

>while (<INPUT>) {
> my $line = $_;
> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {


I know for some people it is difficult to just trust the default
argument. But I would write this as
while (<INPUT>) {
if (m/^(.+226\s+0\s+-\s+.*)$/) {
or
while (my $line=<INPUT>) {
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

> print OUTPUT "$1\n";
> }
>}
>close(INPUT);
>close(OUTPUT);


You may want to check the success of the close() call, too, in
particular for a file handle you wrote to.

jue
 
RedGrittyBrick
 
      05-13-2008
Andy wrote:
> Greets
>
>
> Q; I am trying to learn how to define some variables
>
> The basis of this script is to Scrub log files for ftp logins,
> separate the successful logins
>
> Then create an array ( I hope the right terminology) to separate it
>
> I hardcoded the log file, because I am looking for a way for it to
> scrub *.logs on a server
>
> but ...hey step by step right.
>
> Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
> 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -
>
> This field 226 0 - is a successful login
>
> My plan is to scrub the logs, export to file.
>
> sort fields into variable.


perldoc -f split


>
> I hope in the end to get
>
> 1..log of successful logins


grep "226 0 - *$" ex*.log > ftpacct.log

perl -n -e 'print if /226 0 - *$/' ex*.log > ftpacct.log
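
For a script rather than a one-liner, the same wildcard can be expanded with glob. This is only a sketch: it builds two made-up sample logs in a temporary directory so that it is self-contained, and it assumes the directory path contains no spaces (glob splits its pattern on whitespace).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build two small sample logs in a temporary directory so the sketch is
# self-contained; on a real server, point $dir at the log directory.
my $dir     = tempdir(CLEANUP => 1);
my %samples = (
    'ex080120.log' =>
          "2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /a.zip 226 0 -\n"
        . "2008-01-20 00:01:00 x.x.x.x 0598_Andy [2]sent /b.zip 550 0 -\n",
    'ex080121.log' =>
          "2008-01-21 09:30:00 x.x.x.x 0598_Andy [3]sent /c.zip 226 0 -\n",
);
for my $name (keys %samples) {
    open(my $fh, '>', "$dir/$name") or die("can't write $dir/$name: $!");
    print {$fh} $samples{$name};
    close($fh) or die("can't close $dir/$name: $!");
}

my $outfile = "$dir/ftpacct.log";
open(my $output, '>', $outfile) or die("can't write $outfile: $!");

my $matched = 0;

# glob expands the wildcard; sort keeps the files in name (date) order.
for my $logfile (sort glob("$dir/ex*.log")) {
    open(my $input, '<', $logfile) or die("can't read $logfile: $!");
    while (my $line = <$input>) {
        next unless $line =~ /226 0 - *$/;
        print {$output} $line;
        $matched++;
    }
    close($input);
}
close($output) or die("can't close $outfile: $!");

print "$matched matching lines written\n";
```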

> 2.log of last successful login ( I think I am going to try date
> comparison from most recent to last.)


Logfiles are generally in date order, you just need the last record.

tail -n 1 successful-logins.log > last-successful-login.log

> 3 be able to parse the fields and get data.
>
>
> I know that there are those of you who are advanced, I would
> appreciate any directions or help.
>
> Again I am trying to put this together this is what I have so far.
>
> #!/usr/bin/perl
> use strict;
> use warnings;


Good!

>
> open(INPUT, '<', "ex080120.log")or die("Could not open log file.");


Best practise is to ...
- Use lexical filehandles
- Include filename in message
- Include the failure reason in the message

my $filename = 'ex080120.log';
open(my $input, '<', $filename)
or die("Could not open '$filename' because $!");

> open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


see above

> my $extractedLine;


Not used? Remove it.

> while (<INPUT>) {
> my $line = $_;


It's sometimes easier to work with $_ than assign it to another
variable. It would simplify your later code.

> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {


Matching ^.+ is wasteful.
You don't need to capture the whole line using ().

> print OUTPUT "$1\n";


Unless you chomp your input you'll output an extra blank line.

Putting all the above together

if (/226\s+0\s+-\s*$/) {
print OUTPUT;
}

OR

print OUTPUT if /226\s+0\s+-\s*$/;

Though I'd use lexical filehandles, as I wrote earlier.

print $output if /226\s+0\s+-\s*$/;

However to achieve your other aim, use your original construction and add
$last_login = $line;
my ($date, $time, ... $hyphen) = split;
...


> }
> }
> close(INPUT);
> close(OUTPUT);


print "last successful login is $last_login";

> exit;
>



Untested, caveat emptor.
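
A fuller sketch of the pieces above put together, run against made-up sample data. The variable names in the split are illustrative and follow the "Fields:" header; only $date, $time and the last-login idea come from the outline above.

```perl
use strict;
use warnings;

# Made-up sample data in the format shown above; a real script would
# open the log file instead.
my $log = <<'END';
2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -
2008-01-20 00:07:30 x.x.x.x 0598_Andy [2]sent /0598_Andy/b.zip 550 0 -
2008-01-20 00:09:45 x.x.x.x 0598_Andy [3]sent /0598_Andy/c.zip 226 0 -
END

# Read from the string as if it were a file.
open(my $input, '<', \$log) or die("can't read sample data: $!");

my $last_login;
my ($last_date, $last_time, $last_user);
my $count = 0;

while (my $line = <$input>) {
    next unless $line =~ /226\s+0\s+-\s*$/;
    $count++;

    # Overwritten on every match, so the final value is the last match.
    $last_login = $line;

    # split ' ' splits on runs of whitespace, so the newline does not
    # end up in the last field.
    my ($date, $time, $c_ip, $username, $method, $uri_stem,
        $status, $bytes, $host) = split ' ', $line;
    ($last_date, $last_time, $last_user) = ($date, $time, $username);
}
close($input);

print "$count successful logins\n";
print "last successful login is $last_login" if defined $last_login;
print "user $last_user at $last_date $last_time\n" if defined $last_user;
```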

--
RGB
 
Andy
 
      05-13-2008
On May 13, 12:35 pm, RedGrittyBrick <RedGrittyBr...@SpamWeary.foo>
wrote:
<snip>

WOW!

Guys, you opened my eyes... I knew there were many ways to do this;
it is just confusing figuring out which one to use.
I have of course googled for file manipulation and sorting. I guess
it just takes experience to figure out which is best.

Thanks for the responses. All I have to do now is figure out how to take
what you have advised and get it to work.

I think I can safely say "progress in motion"... umm, slowly.

I will try your suggestions and see what happens.

-Thank you again

GREATLY APPRECIATED




 
Jürgen Exner
 
      05-13-2008
RedGrittyBrick wrote:
>Andy wrote:
>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

>
>Matching ^.+ is wasteful.
>You don't need to capture the whole line using ().
>
>> print OUTPUT "$1\n";

>
>Unless you chomp your input you'll output an extra blank line.


My first thought, too. However, because of the rather 'interesting' way
he is printing the captured group instead of just the plain line, he is
losing the newline in the pattern match. Therefore he has to add it
back explicitly.

> print OUTPUT if /\s+0\s+-\s*$/;


Much nicer, of course.

jue
 
John W. Krahn
 
      05-13-2008
Ben Morrow wrote:
> Quoth Andy:
>>
>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
>> print OUTPUT "$1\n";
>> }

>
> I would recommend splitting the line into a hash first, and then
> selecting lines based on that. Something like
>
> my @fields = qw/
> date time c_ip
> cs_username cs_method cs_uri_stem
> sc_status sc_bytes cs_host
> /;
>
> while (my $line = <$INPUT>) {
>
> # Here I assume fields are delimited by a single space, and
> # spaces and newlines *never* appear in a field (not even inside
> # quotes). If this isn't true, you probably want to use the
> # Text::CSV_XS module, which can parse all sorts of
> # <foo>-delimited files.
>
> my %record;
> @record{@fields} = split / /, $line;
>
> $record{sc_status} == 226
> and $record{sc_bytes} == 0
> and $record{cs_host} eq '-'


Because you are using "split / /, $line" $record{cs_host} will probably
contain "-\n" instead of '-'.


> or next;
>
> print $OUTPUT $line;
> }



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
 
John W. Krahn
 
      05-13-2008
Jürgen Exner wrote:
> RedGrittyBrick wrote:
>> Andy wrote:
>>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

>> Matching ^.+ is wasteful.
>> You don't need to capture the whole line using ().
>>
>>> print OUTPUT "$1\n";

>> Unless you chomp your input you'll output an extra blank line.

>
> My first thought, too. However, because of the rather 'interesting' way
> he is printing the captured group instead of just the plain line, he is
> losing the newline in the pattern match. Therefore he has to add it
> back explicitly.


The \s+ at the end is greedy and will match everything at the end
including the newline unless there is a non-whitespace character after
it that .* will match.
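
A small sketch that shows this behaviour on three shortened, made-up lines. The capture() helper is hypothetical and exists only for the demonstration; the pattern is the one from the original script.

```perl
use strict;
use warnings;

# The pattern from the original script.
my $re = qr/^(.+226\s+0\s+-\s+.*)$/;

# Returns the captured text, or undef when the line does not match.
sub capture {
    my ($line) = @_;
    return $line =~ $re ? $1 : undef;
}

my $plain    = capture("a b 226 0 -\n");          # newline right after "-"
my $trailing = capture("a b 226 0 - extra\n");    # text after "-"
my $chomped  = capture("a b 226 0 -");            # nothing after "-"

# \s+ swallows the newline, so $1 still ends in "\n".
print "plain: capture ends with a newline\n"
    if defined $plain && $plain =~ /\n\z/;

# \s+ matches the space, .* matches "extra", and $ matches before "\n".
print "trailing: capture has no newline\n"
    if defined $trailing && $trailing !~ /\n/;

# With no whitespace after "-" the \s+ cannot match at all.
print "chomped: line does not match\n"
    if !defined $chomped;
```

So with the original script, a line that ends directly after the "-" is printed with its own newline plus the added one, and a line that has already been chomped is not matched.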


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
 
Jürgen Exner
 
      05-13-2008
"John W. Krahn" <> wrote:
>Jürgen Exner wrote:
>> RedGrittyBrick <> wrote:
>>> Andy wrote:
>>>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
>>> Matching ^.+ is wasteful.
>>> You don't need to capture the whole line using ().
>>>
>>>> print OUTPUT "$1\n";
>>> Unless you chomp your input you'll output an extra blank line.

>>
>> My first thought, too. However, because of the rather 'interesting' way
>> he is printing the captured group instead of just the plain line, he is
>> losing the newline in the pattern match. Therefore he has to add it
>> back explicitly.

>
>The \s+ at the end is greedy and will match everything at the end
>including the newline unless there is a non-whitespace character after
>it that .* will match.


You are right. I was looking at the trailing .* only and didn't dissect
the RE beyond that.
This RE certainly has some interesting side effects.

jue
 
Ben Morrow
 
      05-13-2008

Quoth "John W. Krahn":
> Ben Morrow wrote:
> >
> > while (my $line = <$INPUT>) {

<snip>
> > my %record;
> > @record{@fields} = split / /, $line;
> >
> > $record{sc_status} == 226
> > and $record{sc_bytes} == 0
> > and $record{cs_host} eq '-'

>
> Because you are using "split / /, $line" $record{cs_host} will probably
> contain "-\n" instead of '-'.


Good point. I'm too used to -l
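
(On a one-liner, -l together with -n or -p chomps each input line automatically, which is why the problem does not show up there.) In a script, a sketch of two possible fixes, using the sample line from the original post: chomp before splitting, or split on ' ', which treats any run of whitespace, including the newline, as a separator.

```perl
use strict;
use warnings;

my $line = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";

# Without chomp, split / / leaves the newline on the last field.
my @raw = split / /, $line;

# Option 1: chomp a copy first, then split on single spaces.
my $copy = $line;
chomp $copy;
my @chomped = split / /, $copy;

# Option 2: split ' ' splits on runs of whitespace and discards the
# empty field left after the trailing newline.
my @by_whitespace = split ' ', $line;

print "raw last field still has the newline\n" if $raw[-1] eq "-\n";
print "chomped last field is '-'\n"            if $chomped[-1] eq '-';
print "split ' ' last field is '-'\n"          if $by_whitespace[-1] eq '-';
```

Option 2 also tolerates fields separated by more than one space, which may or may not be what the log format needs.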

Ben

--
"Faith has you at a disadvantage, Buffy."
"'Cause I'm not crazy, or 'cause I don't kill people?"
"Both, actually."