noob question: Trying to extract part of a string in a variable to another variable

 
 
Robin
Guest
Posts: n/a
 
      04-26-2004

"cayenne" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) m...
> Hello all,
> I'm a perl noob...and just can't quite figure out how to do something
> that should be pretty simple.
>
> Here's an example.
>
> I have $mail_address = 'fred jones <(E-Mail Removed)>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
>
> But, doesn't seem to work. I'm a little hazy on exactly how the =~
> works...through examples I've successfully used it for substitutions
> like $x =~ s/tom/joe/g; but, I'm just wanting to match a regular
> expression and extract it to the variable...or even to another
> variable leaving $mail_address unchanged.
>
> I've looked in books at the substr() function, but, I don't know how
> to use regular expressions to find the offset point, etc.
>
> Can someone give me an example...or pointers to a good reference on
> this type of thing?
>
> Thanks in advance,
>
> chilecayenne


Regular expressions are not the right way to find the offset. You could
capture with $1, $2, $3, etc. and then use index, but that still isn't an
optimal way to find the offset point. Just change your regular expression
to look like the code in the other replies. Man, I'm so tired.
-Robin


 
 
 
 
 
Joe Smith
Guest
Posts: n/a
 
      04-26-2004
Sherm Pendley wrote:

> If your pattern has subexpressions, then on a match the
> offset of each subexpression appears in the @- array. That is, the offset
> of $1 is in $-[0], $2 is in $-[1], and so forth.


Incorrect. The offset of $& is in $-[0], the offset of $1 is in $-[1], etc.
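
A quick way to check this is to print the offsets after a match. This is only a minimal sketch: the sample address is made up, since the one in the original post was removed.

```perl
use strict;
use warnings;

# Hypothetical sample data, not the address from the original post.
my $mail_address = 'fred jones <fred_jones@example.com>';

my @offsets;
if ($mail_address =~ /<(\w+)@/) {
    # $-[0] and $+[0] are the start and end of the whole match ($&).
    # $-[1] and $+[1] are the start and end of the first capture ($1).
    @offsets = ($-[0], $+[0], $-[1], $+[1]);
    printf "whole match: %d..%d\n", $-[0], $+[0];
    printf "capture 1:   %d..%d\n", $-[1], $+[1];
}
```

The whole match starts at the '<', one character before the capture, so $-[0] and $-[1] differ by one here.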
-Joe
 
 
 
 
 
Anno Siegel
Guest
Posts: n/a
 
      04-26-2004
Jürgen Exner <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> cayenne wrote:


[...]

> > I've looked in books at the substr() function, but, I don't know how
> > to use regular expressions to find the offset point, etc.

>
> You don't.


Ah, but you do, though not in this case. The @- and @+ arrays are
there to support it.
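
For illustration, here is roughly how @- and @+ could feed substr(). The sample address is an assumption, and for this particular task the capture in $1 gives the same string with less work.

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@example.com>';  # hypothetical sample

my $user_id;
if ($mail_address =~ /<(\w+)@/) {
    # The capture starts at $-[1]; its length is end offset minus start offset.
    $user_id = substr($mail_address, $-[1], $+[1] - $-[1]);
}
print defined $user_id ? "$user_id\n" : "no match\n";
```

The offsets tend to be more useful when the original string has to be modified in place, for example with the four-argument form of substr().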

Anno
 
 
Richard Morse
Guest
Posts: n/a
 
      04-26-2004
In article <(E-Mail Removed)>,
(E-Mail Removed) (cayenne) wrote:

> I have $mail_address = 'fred jones <(E-Mail Removed)>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;


What you seem to be asking for is this:

my ($user_id) = ($mail_address =~ m/(\w+)@/);
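
As a runnable sketch of that line (the sample address is made up, because the real one was stripped from the post):

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@example.com>';  # hypothetical sample

# List assignment puts the first capture group into $user_id
# and leaves $mail_address untouched.
my ($user_id) = $mail_address =~ m/(\w+)@/;

print "user id:  $user_id\n";
print "original: $mail_address\n";
```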

However, please note that \w doesn't really have the complete set of
valid characters to prefix the '@' sign in an email address.

Just off the top of my head, I know that '.', '-', '?', '=', and more
are valid. Possibly any unicode character other than whitespace and '@'
are valid. It might even be valid to have '<' in an email address.

At the very least, you probably want

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

HTH,
Ricky

--
Pukku
 
 
Glenn Jackman
Guest
Posts: n/a
 
      04-26-2004
Richard Morse <(E-Mail Removed)> wrote:
> At the very least, you probably want
>
> my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);



Be careful where you put '-' inside a character class, or it is read as a range:
Invalid [] range ".-+" before HERE mark in regex m/([\w.-+ << HERE =]+)@/

Put the hyphen last: [\w.+=-]

--
Glenn Jackman
NCF Sysadmin
(E-Mail Removed)
 
 
Tad McClellan
Guest
Posts: n/a
 
      04-26-2004
Glenn Jackman <(E-Mail Removed)> wrote:

> Put the hyphen last: [\w.+=-]



Or first.
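
A small check of both placements, plus a backslash-escaped hyphen in the middle (the escaped form is not mentioned in the thread; it is added here as a third option). The sample string is hypothetical.

```perl
use strict;
use warnings;

my $local_part = 'fred-jones+lists';   # hypothetical sample

# Hyphen last, hyphen first, and backslash-escaped in the middle:
# all three treat '-' as a literal character rather than a range.
my @classes = (
    qr/^[\w.+=-]+$/,
    qr/^[-\w.+=]+$/,
    qr/^[\w.\-+=]+$/,
);

my @results = map { $local_part =~ $_ ? 1 : 0 } @classes;
print "@results\n";
```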


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
cayenne
Guest
Posts: n/a
 
      05-19-2004
Richard Morse <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> In article <(E-Mail Removed)> ,
> (E-Mail Removed) (cayenne) wrote:
>
> > I have $mail_address = 'fred jones <(E-Mail Removed)>'
> >
> > I want to use regular expressions to just parse out the userid here of
> > fred_jones
> >
> > I'm trying things like this:
> >
> > $mail_address =~ /\w+@/;

>
> What you seem to be asking for is this:
>
> my ($user_id) = ($mail_address =~ m/(\w+)@/);
>
> However, please note that \w doesn't really have the complete set of
> valid characters to prefix the '@' sign in an email address.
>
> Just off the top of my head, I know that '.', '-', '?', '=', and more
> are valid. Possibly any unicode character other than whitespace and '@'
> are valid. It might even be valid to have '<' in an email address.
>
> At the very least, you probably want
>
> my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
>
> HTH,
> Ricky


Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts of the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?

Thanks in advance,

CC
 
 
Richard Morse
Guest
Posts: n/a
 
      05-19-2004
In article <(E-Mail Removed)>,
(E-Mail Removed) (cayenne) wrote:

> Richard Morse <(E-Mail Removed)> wrote in message
> news:<(E-Mail Removed)>...
>
> > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

>
> Just quickly, can you explain the extensive use of parens here? I
> understand the () in the regular expression, to keep those parts the
> match...but, what is the function of the () around $user_id and the
> entire part after the = sign?


Parens around $user_id force the match to happen in list context. A
match in scalar context returns true (1) on success and false (the empty
string) on failure, while in list context it returns the captured
substrings.

my $user_id = ($mail_address =~ m/.../)

would set $user_id to 1 on a successful match, not to the captured
text.

The parens around the match are there because it makes it easier for me
to read it. I've never not put them there, although a quick test I just
did seems to indicate that they aren't necessary.
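
The difference between the two contexts can be checked directly. A minimal sketch with a made-up address and illustrative variable names:

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@example.com>';  # hypothetical sample

my $scalar_result = $mail_address =~ m/(\w+)@/;   # scalar context: success flag
my ($list_result) = $mail_address =~ m/(\w+)@/;   # list context: first capture
my $no_match      = $mail_address =~ m/(\d+)@/;   # failed match, scalar context

print "scalar: $scalar_result\n";
print "list:   $list_result\n";
print "failed: '$no_match'\n";
```

A failed match in scalar context gives an empty but defined string, so the last line prints without an "uninitialized" warning.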

HTH,
Ricky

--
Pukku
 
 
Paul Lalli
Guest
Posts: n/a
 
      05-19-2004
On Wed, 19 May 2004, cayenne wrote:

> Richard Morse <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> >
> > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
> >

> Just quickly, can you explain the extensive use of parens here? I
> understand the () in the regular expression, to keep those parts the
> match...but, what is the function of the () around $user_id and the
> entire part after the = sign?
>


The parens around $user_id force the binding operation of =~ to be
evaluated in list context. This is done because a pattern match in list
context returns a list of all of the captured matches (ie, the things that
go into $1, $2, etc). This is a shorthand way of writing the two
statements:

$mail_address =~ m/([\w.+=-]+)@/;
my $user_id = $1;

The parens around the whole pattern match here are actually unnecessary,
because the =~ operator has a higher precedence than the = operator. They
are likely used here just for clarity, so that readers of the code see that
the return value of the pattern match is being assigned to ($user_id). The
alternate interpretation, in which $mail_address is first assigned to
$user_id and that copy is then matched against the pattern, would be
written like so:

(my $user_id = $mail_address) =~ m/([\w.+=-]+)@/;
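
To make the two groupings visible, here is a sketch that uses s/// for the second form, since a substitution shows which variable ended up bound to the operator. The sample address and the names $captured and $copy are assumptions for illustration.

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@example.com>';  # hypothetical sample

# =~ binds tighter than =, so the match runs first and its captures
# are what get assigned.
my ($captured) = $mail_address =~ m/(\w+)@/;

# With explicit parens the assignment runs first, and the new copy is
# what the substitution modifies; $mail_address is left alone.
(my $copy = $mail_address) =~ s/.*<(\w+)\@.*/$1/;

print "captured: $captured\n";
print "copy:     $copy\n";
print "original: $mail_address\n";
```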

Please let me know if this is not clear enough.

Paul Lalli
 
 
John W. Krahn
Guest
Posts: n/a
 
      05-19-2004
Paul Lalli wrote:
>
> On Wed, 19 May 2004, cayenne wrote:
>
> > Richard Morse <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> > >
> > > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

> >
> > Just quickly, can you explain the extensive use of parens here? I
> > understand the () in the regular expression, to keep those parts the
> > match...but, what is the function of the () around $user_id and the
> > entire part after the = sign?

>
> The parens around $user_id force the binding operation of =~ to be
> evaluated in list context. This is done because a pattern match in list
> context returns a list of all of the captured matches (ie, the things that
> go into $1, $2, etc). This is a shorthand way of writing the two
> statements:
>
> $mail_address =~ m/([\w.-+=]+)@/
> my $user_id = $1;


They are not the same at all. If the match fails, the first will set
$user_id to undef, but your version will set $user_id to the contents of
a previously successful match's capturing parentheses (or to undef if no
earlier match has succeeded).
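
A sketch of the difference, using made-up strings. The last statement shows one common way to guard the $1 form; that guard is an addition here, not something proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $earlier = '<someone_else@example.com>';
my $bad     = 'no address here';

$earlier =~ m/(\w+)@/;                 # an earlier successful match sets $1

# List assignment: a failed match leaves $from_list undefined.
my ($from_list) = $bad =~ m/(\w+)@/;

# Separate statements: the failed match does not reset $1,
# so the stale value from the earlier match is copied.
$bad =~ m/(\w+)@/;
my $from_dollar1 = $1;

# Guarded form: only read $1 when this match succeeded.
my $guarded = ($bad =~ m/(\w+)@/) ? $1 : undef;

for my $value ($from_list, $from_dollar1, $guarded) {
    print defined $value ? "$value\n" : "undef\n";
}
```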




John
--
use Perl;
program
fulfillment
 