Expression problem

 
 
K.J. 44
      11-27-2006
I have the two following regular expressions. I am not very good at
writing these yet. I am parsing some logs looking for some key words,
then taking the text after them.

if ($details[$i] =~ /\bworkstation\b\bname:\b\s\b[0-9A-Za-z_\-]+\b/i) {
($nothing, $hostName[$i]) = split(/:/, $&);
}
if ($details[$i] =~ /\buser\b\bname:\b\w+/i) {
print("The username is: $&");
($nothing, $username[$i]) = split(/:/, $&);
}

The @details array is read in from a text file, and that part works fine.
What I want to do is search each $details[$i] entry for certain key words,
then take the text right after them. The first if statement should:

Find the word Workstation followed by a space followed by name:
followed by a space followed by a string of characters including word
characters and hyphens. If the match is found, take only the text
after the : as the workstation name.

The second part is along the same lines for username.

Find the word user followed by a space followed by the word name: followed
by a space followed by a string of word characters. Split at the : and
take what follows as the username.

These do not seem to be finding matches, even though I can see them in the
log file. Where am I messing up?

Thanks.

 
Ric
      11-27-2006
You need to post at least one of the lines you read from your text file,
and name the values you would like to have extracted from it. For example:

example line: yesterday 12:30 hans:went:home

var1 should contain: 12:30
var2 should contain: went

This is much faster and more precise than explaining the problem in a
large block of text.


 
xhoster@gmail.com
      11-27-2006
"K.J. 44" <> wrote:
> I have the two following regular expressions. I am not very good at
> writing these yet. I am parsing some logs looking for some key words,
> then taking the text after them.
>
> if ($details[$i] =~ /\bworkstation\b\bname:\b\s\b[0-9A-Za-z_\-]+\b/i) {


About the only places that \b should be used are the beginning or end
of the regex, or before or after something like ".*". Also, I don't see how
you can ever productively have more than one in a row; since \b consumes
nothing, doing so is exactly the same as having just one.

\b is a zero-width condition: it matches only at a position that has a word
character on one side and a non-word character (or the edge of the string)
on the other. So "tion\b\bname" can never match, because you are demanding
an "n" immediately before the \b and an "n" immediately after it. Both are
word characters, so the conditions which define \b are not met. Similarly,
the \b in ":\b\s" can never match, because it would have a non-word
character on each side. The \b in "\s\b[0-9A-Za-z_\-]" is almost redundant:
all it does is forbid the "-" in the character class from matching as the
first character, which I doubt is what you want.
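
The points above about \b can be checked with a small script. This is only a sketch: the sample strings and the matches() helper are made up for illustration, since no real log line appears in the thread.

```perl
use strict;
use warnings;

# Returns 1 if $text matches $re, 0 otherwise.
sub matches {
    my ($text, $re) = @_;
    return $text =~ $re ? 1 : 0;
}

# Made-up sample text; the thread never shows a real log line.
my $line = 'Workstation Name: PC-01';

my @checks = (
    # "n" then "n": word characters on both sides, so \b cannot succeed.
    [ 'n\b\bn, no space in text', matches('workstationname', qr/workstation\b\bname/i) ],
    # The original pattern also has nothing that matches the space itself.
    [ 'n\b\bn, space in text',    matches($line, qr/workstation\b\bname/i) ],
    # ":" and " " are both non-word characters, so no boundary lies between them.
    [ ':\b\s',                    matches($line, qr/name:\b\s/i) ],
    # \s\b insists that the next character is a word character, so "-" cannot lead.
    [ '\s\b with leading "-"',    matches('name: -pc01', qr/name:\s\b[-\w]+/i) ],
    # One \b at the start plus literal spaces matches as intended.
    [ 'single leading \b',        matches($line, qr/\bworkstation name: [-\w]+/i) ],
);

printf "%-28s %s\n", $_->[0], $_->[1] ? 'match' : 'no match' for @checks;
```

Only the last check should report a match; the first four fail for the reasons given in the comments.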

....

>
> Find the word Workstation followed by a space followed by name:
> followed by a space followed by a string of characters including word
> characters and hyphens. if the match is found, take only the text
> after the : as the workstation name.


.... /\bworkstation name: ([-\w]+)/i ...

If by "space" you meant "white space", then turn my spaces into "\s"
(not "\b").
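
One way the capturing form might be wired into the original code is sketched below. The sub names and the sample log line are hypothetical, and \s+ is an assumption that more than one whitespace character may separate the words. Using $1 avoids split(/:/, $&), which would leave the space after the colon at the front of the extracted value.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns the workstation name
# found in one line of log text, or undef if the line has none.
sub workstation_name {
    my ($line) = @_;
    # \s+ tolerates any run of whitespace; the parentheses capture the name.
    return $line =~ /\bworkstation\s+name:\s+([-\w]+)/i ? $1 : undef;
}

# Same idea for the second expression in the question.
sub user_name {
    my ($line) = @_;
    return $line =~ /\buser\s+name:\s+(\w+)/i ? $1 : undef;
}

# Made-up log line, for illustration only.
my $line = 'Logon event. User Name: jsmith Workstation Name: PC-01';
print 'host: ', workstation_name($line), "\n";
print 'user: ', user_name($line), "\n";
```

Returning undef on failure lets the caller tell "no match" apart from an empty value before storing anything in @hostName or @username.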


Xho

 
K.J. 44
      11-27-2006
Thank you very much for your help and suggestions. I will try these
out.

Thanks!
 
Tad McClellan
      11-28-2006
Christian Winter wrote:

> So the first piece of code could be cut down to
> if( $details[$i] =~ /\bworkstation\sname:\s([0-9A-Za-z_\-]+)/i )
> {
> $hostName[$i] = $1;
> }
>
> There's still room for improvement, like



adding an //x modifier, particularly if you are "not very good"
at grokking regexes:

if ( $details[$i] =~ /\bworkstation
                      \s
                      name:
                      \s
                      (                   # capture the host name in $1
                        [0-9A-Za-z_-]+
                      )
                     /ix
   )


> e.g. removing A-Z from
> the character class, as you are giving the /i modifier anyway,
> so both upper and lowercase characters will be matched.



and removing the backslash. Hyphen is not meta in a character
class if it is first or last in the class.
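
Both simplifications can be checked side by side. The snippet below is a sketch with made-up sample strings; it compares the class as written in the question against the shortened one.

```perl
use strict;
use warnings;

my $verbose = qr/^[0-9A-Za-z_\-]+$/i;   # class as written in the question
my $compact = qr/^[0-9a-z_-]+$/i;       # A-Z and the backslash dropped

# Made-up test strings, some valid names and some not.
my @samples = ('PC-01', 'pc_02', 'Host-Name', 'bad name', 'semi;colon', '-lead');

# Collect any sample on which the two patterns give different answers.
my @disagree = grep {
    ($_ =~ $verbose ? 1 : 0) != ($_ =~ $compact ? 1 : 0)
} @samples;

print @disagree
    ? "patterns differ on: @disagree\n"
    : "both classes agree on every sample\n";
```

Because the hyphen sits last in the class, it is taken literally, and /i already covers the upper-case letters.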


--
Tad McClellan SGML consulting
Perl programming
Fort Worth, Texas
 
 
Mr P
      11-28-2006

>
> The $details array is read in from a text file and this works fine.
> What I want to do is search the $details text for certain key words,
> then take the text right after. The first if statement

 
 
Dr.Ruud
      11-28-2006
Mr P wrote:

> PS: [0-9A-Za-z_\-] looks a LOT like \w


But \w matches 91801 characters (codepoints).
http://www.xs4all.nl/~rvtol/perl/unicount.pl
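
The difference is easy to see with a single non-ASCII letter. This is a minimal sketch; the Greek alpha is just an arbitrary example of a word character outside ASCII, and the explicit class is shown without the hyphen so that it lines up with \w.

```perl
use strict;
use warnings;

my $alpha = "\x{3B1}";   # GREEK SMALL LETTER ALPHA, a non-ASCII letter

my $w_matches     = $alpha =~ /^\w$/           ? 1 : 0;  # Unicode-aware \w
my $class_matches = $alpha =~ /^[0-9A-Za-z_]$/ ? 1 : 0;  # explicit ASCII class
my $w_hyphen      = '-'    =~ /^\w$/           ? 1 : 0;  # hyphen is not in \w

print "\\w matches alpha:            $w_matches\n";
print "explicit class matches alpha: $class_matches\n";
print "\\w matches hyphen:           $w_hyphen\n";
```

So \w is broader than the explicit class for letters and digits, yet it does not cover the hyphen that the original class allows. On Perl 5.14 and later, the /a modifier restricts \w to ASCII if that is what is wanted.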

--
Affijn, Ruud

"Gewoon is een tijger."
 