Regular expression for matching words containing the underscore (_) character

 
 
Raj
      12-12-2007
I have large text passages containing names of database tables,
procedures, packages, variables, etc. that have the underscore character as
part of the name, e.g. rsp_names_friends_master. I tried
"\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.
How can I match only the names that contain an underscore?

Thanks in advance for the help.

Raj

 
RedGrittyBrick
      12-12-2007
Raj wrote:
> I have large text passages containing names of database tables,
> procedures, packages, variables etc having the underscore character as
> a part of the name. eg. rsp_names_friends_master. I tried
> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.


That character class only allows an underscore; it does not require one.
Similarly, "[ab]+" matches "aaa" and "aa" though neither contains "b".

Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"

Or "\b\w+_\w+\b"
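
To see the difference side by side, here is a minimal Perl sketch. The sample passage and the name pkg_util are made up for illustration; only rsp_names_friends_master comes from the question.

```perl
use strict;
use warnings;

# Hypothetical passage; only rsp_names_friends_master is from the question.
my $text = 'Load rsp_names_friends_master and then call pkg_util once.';

# Original attempt: the underscore is merely allowed, so every word matches.
my @all = $text =~ /\b[a-zA-Z0-9_]+\b/g;

# Suggested fix: a literal underscore is required between word characters.
my @names = $text =~ /\b\w+_\w+\b/g;

print scalar(@all), " words matched by the original pattern\n";
print "$_\n" for @names;
```

With this input the original pattern should return all seven words, while the second one should return only the two names that contain an underscore.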
 
Tad J McClellan
      12-13-2007
RedGrittyBrick wrote:
> Raj wrote:
>> I have large text passages containing names of database tables,
>> procedures, packages, variables etc having the underscore character as
>> a part of the name. eg. rsp_names_friends_master. I tried
>> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.

>
> Similarly "[ab]+" matches "aaa" and "aa" though neither contain "b".
>
> Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"
>
> Or "\b\w+_\w+\b"



Three (six?) useless uses of word boundary in the quotes above...

Every pattern there will behave identically without any \b's.


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
 
Raj
      12-13-2007
On Dec 12, 8:47 pm, RedGrittyBrick wrote:
> Raj wrote:
> > I have large text passages containing names of database tables,
> > procedures, packages, variables etc having the underscore character as
> > a part of the name. eg. rsp_names_friends_master. I tried
> > "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.

>
> Similarly "[ab]+" matches "aaa" and "aa" though neither contain "b".
>
> Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"
>
> Or "\b\w+_\w+\b"


Thanks. It worked.
Raj
 
Florian Kaufmann
      12-13-2007
On Dec 12, 4:27 pm, Raj wrote:
> I have large text passages containing names of database tables,
> procedures, packages, variables etc having the underscore character as
> a part of the name. eg. rsp_names_friends_master. I tried
> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.
>
> Thanks in advance for the help.
>
> Raj


I would use this to merely find lines which contain what you are searching for:
/(?:\p{IsAlnum}+_)+\p{IsAlnum}+/

I would use this to collect the matching words into an array:
/((?:\p{IsAlnum}+_)+\p{IsAlnum}+)/g

Example:
perl -ne 'print "@a\n" if (@a = /((?:\p{IsAlnum}+_)+\p{IsAlnum}+)/g)' \
    <<< 'yyy_yyy saf_;fasl asfd ; xxx_xxx'
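
The \p{IsAlnum} pattern is stricter than \w+_\w+, because \w itself includes the underscore. A small sketch of where the two disagree, using a made-up input string (none of these names appear in the question):

```perl
use strict;
use warnings;

# Hypothetical input chosen to show where the two patterns disagree.
my $text = 'a__b ok_name __x';

# Strict form from this post: alphanumeric runs joined by single underscores.
my @strict = $text =~ /((?:\p{IsAlnum}+_)+\p{IsAlnum}+)/g;

# Looser form from earlier in the thread: \w itself includes the underscore.
my @loose = $text =~ /(\w+_\w+)/g;

print "strict: @strict\n";
print "loose:  @loose\n";
```

Which one is appropriate depends on whether names with doubled or leading underscores can occur in the passages; the question does not say, so treat this as an illustration and not a recommendation.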

Greetings

Flo
 
RedGrittyBrick
      12-13-2007
Tad J McClellan wrote:
> RedGrittyBrick wrote:
>> Raj wrote:
>>> I have large text passages containing names of database tables,
>>> procedures, packages, variables etc having the underscore character as
>>> a part of the name. eg. rsp_names_friends_master. I tried
>>> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.

>> Similarly "[ab]+" matches "aaa" and "aa" though neither contain "b".
>>
>> Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"
>>
>> Or "\b\w+_\w+\b"

>
>
> Three (six?) useless uses of word boundary in the quotes above...
>
> Every pattern there will behave identically without any \b's.
>
>


TFTC

$ perl -e 'print "$_\n" for "_aa-bbb.cc_[d_d]" =~ /\w+/g'
_aa
bbb
cc_
d_d

$ perl -e 'print "$_\n" for "_aa-bbb.cc_[d_d]" =~ /\w+_\w+/g'
d_d

In Perl programs I've written, I don't think I've ever used \b. Perhaps
I should have analyzed the OP's RE completely rather than only
commenting on the primary reason for the problem.
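
One way to check the claim that the \b's are redundant is to run each quoted pattern with and without them. The sketch below does that on the test string above plus a made-up input, _aa_bb, that starts with an underscore. As far as I can tell, the only case that disagrees is the middle pattern on _aa_bb: its first class excludes the underscore, so without the leading \b it can start matching one character into the name.

```perl
use strict;
use warnings;

# The three patterns quoted in the thread, written without \b.
my @bare = ('[a-zA-Z0-9_]+', '[a-zA-Z0-9]+_[a-zA-Z0-9_]+', '\w+_\w+');

# The first string is the test input above; '_aa_bb' is a hypothetical edge case.
my %differs;
for my $text ('_aa-bbb.cc_[d_d]', '_aa_bb') {
    for my $p (@bare) {
        my @with    = $text =~ /\b(?:$p)\b/g;   # pattern wrapped in \b ... \b
        my @without = $text =~ /$p/g;           # same pattern, no \b
        my $same    = "@with" eq "@without";
        $differs{"$text $p"} = 1 unless $same;
        printf "%-17s %-27s %s\n", $text, $p,
            $same ? 'same' : "with \\b: [@with]  without: [@without]";
    }
}
```

This assumes plain ASCII input; with non-ASCII letters, \b and the explicit a-zA-Z0-9_ class may no longer agree on where a word starts.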
 