Pattern matching help! grep emails from file!

 
 
danpres2k
      08-22-2003
Hello, I have a file with email addresses mixed in with a lot of junk data. I want
to get the email addresses out of that file so that each email address
is on its own line. I am trying to do (E-Mail Removed)
substitution:
$filestring=<FILE>;
$filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;

The file is like:
"testin" (E-Mail Removed), <testing>(E-Mail Removed)
(E-Mail Removed)
"(E-Mail Removed)"

Expected output:
(E-Mail Removed)
(E-Mail Removed)
(E-Mail Removed)
(E-Mail Removed)

Thanks guyz.
 
 
Shawn Milochik
      08-22-2003
On Fri, 22 Aug 2003 10:46:22 -0400, danpres2k wrote:

> Hello, I have a file with email addresses mixed in with a lot of junk data. I want to
> get the email addresses out of that file so that each email address is
> on its own line. I am trying to do (E-Mail Removed) substitution:
> $filestring=<FILE>;
> $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;
>
> The file is like:
> "testin" (E-Mail Removed), <testing>(E-Mail Removed) (E-Mail Removed)
> "(E-Mail Removed)"
>
> Expected output:
> (E-Mail Removed)
> (E-Mail Removed)
> (E-Mail Removed)
> (E-Mail Removed)
>
> Thanks guyz.


Two things:

1. Do you really want 2 newlines for each output?

2. Since the first regex is matching the e-mail address and
ONLY the e-mail address, you're actually telling the s/// to
search the entire string for an e-mail address and substitute
the e-mail address with itself, not, as you intend, to substitute
the entire string with itself. You want to add a .* after the closing
parenthesis, maybe.

Instead of:
> $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;


Try:
> $filestring = s/(\w+\@\w+\.\w+).*/\n$1\n/;


Or Possibly:
> $filestring = s/.*(\w+\@\w+\.\w+).*/\n$1\n/;


Untested, but I had a similar problem recently, and the
principle is the same.

Shawn
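
A minimal sketch of what the two suggested patterns do once they are bound to the string with `=~`, next to a global match that collects every address. The sample line and its addresses are hypothetical stand-ins, since the real ones were removed from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample line; the addresses are made up for illustration.
my $line = q{"testin" alice@example.com, <testing>bob@example.org carol@example.net};

# Pattern with a trailing .* : keeps the first address and whatever
# precedes it, and discards the rest of the line.
(my $trailing = $line) =~ s/(\w+\@\w+\.\w+).*/$1/;

# Pattern with .* on both sides: the leading .* is greedy, so the
# capture group is left with only the tail of the last address.
(my $both = $line) =~ s/.*(\w+\@\w+\.\w+).*/$1/;

# A global match in list context returns every captured address.
my @all = $line =~ /(\w+\@\w+\.\w+)/g;

print "$trailing\n";   # "testin" alice@example.com
print "$both\n";       # l@example.net
print "$_\n" for @all; # one address per line
```

Because `.*` takes as much as it can, neither substitution yields all the addresses on a line; the `/g` match does, which is closer to the expected output in the original question.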
 
 
danpres2k
      08-22-2003
Shawn,

Thanks for your help. But that didn't work for me either. I get an
empty value for $filestring when I print it:

$filestring = <FILE>;
$filestring = s/.*(\w+\@\w+\.\w+).*/$1/;
print $filestring;

Any suggestions?
Thanks.
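
A small sketch of why the variable comes out empty, assuming the cause is the plain `=` in front of `s///`. With `=`, the substitution runs against `$_` rather than `$filestring`, and its return value (the number of substitutions, or an empty string when none were made) is what gets assigned. The sample text and the variable names `$assigned` and `$bound` are hypothetical.

```perl
use strict;
use warnings;

my $text = 'junk alice@example.com junk';   # hypothetical sample data

# With "=": s/// is applied to $_, not to $assigned. Nothing in $_
# matches here, so s/// returns an empty string and that return value
# overwrites the variable.
local $_ = '';
my $assigned = $text;
$assigned = s/.*(\w+\@\w+\.\w+).*/$1/;

# With "=~": s/// is applied to $bound itself and modifies it in place.
my $bound = $text;
$bound =~ s/.*(\w+\@\w+\.\w+).*/$1/;

print "assigned: [$assigned]\n";   # prints an empty pair of brackets
print "bound:    [$bound]\n";      # prints the text left after substitution
```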

 
 
danpres2k
      08-25-2003
Thanks again, Shawn. It did work, but it only printed part of the last
email address on the first line. How do I handle the newline characters
in $filestring? I am storing the string read from the filehandle in
$filestring. Is this correct?

thanks.
d

Shawn Milochik <(E-Mail Removed)> wrote in message news:<pan.2003.08.22.17.09.55.836451.3093@Linurati.net>...
> On Fri, 22 Aug 2003 16:57:00 -0400, danpres2k wrote:
>
> > Shawn,
> >
> > Thanks for your help. But I couldn't use that as well. I am getting null
> > value for $filestring when I am printing it:
> >
> > $filestring = <FILE>;
> > $filestring = s/.*(\w+\@\w+\.\w+).*/$1/; print $filestring;
> >
> > Got any suggestion?
> > Thanks.
>
> Yeah, just a typo. Replace
> =
> with:
> =~
>
> I didn't catch that in the OP.
>
> Shawn
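
On the follow-up question about newlines and the partial last address: `$filestring = <FILE>` reads only one line, because the read happens in scalar context, and the greedy `.*` explains the truncated address. One possible approach, sketched under assumptions, is to loop over the lines and collect matches with `/g` instead of substituting. The sub name `extract_emails`, the lexical handle `$fh`, and the in-memory sample data are hypothetical; a real script would open the actual file.

```perl
use strict;
use warnings;

# Return every substring of $text that looks like an address,
# using the same simple pattern as in the thread.
sub extract_emails {
    my ($text) = @_;
    return $text =~ /(\w+\@\w+\.\w+)/g;
}

# Stand-in for the poster's file: an in-memory filehandle over
# made-up data shaped like the sample in the question.
my $data = <<'END';
"testin" alice@example.com, <testing>bob@example.org
carol@example.net
"dave@example.com"
END

open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @found;
while (my $line = <$fh>) {      # each read returns the next line
    push @found, extract_emails($line);
}
close $fh;

print "$_\n" for @found;        # one address per line
```

The pattern is deliberately the thread's own; since `\w` matches neither `.` nor `-`, addresses with dots or hyphens in the local part or with multi-level domains would be cut short.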

 