accentuated character - RE

 
 
nicolas_laurent545@hotmail.com
      04-18-2006
Hi

(\w+) does not see accentuated character such as ().
[a-z] sees accentuated character but the problem is that I have to
enumerate etc.

Is there any other method in regular expression to include accentuated
character so I do not need to specify them in advance ?

Thanks

 
John W. Krahn
      04-18-2006
(E-Mail Removed) wrote:
>
> (\w+) does not see accentuated character such as ().
> [a-z] sees accentuated character but the problem is that I have to
> enumerate etc.
>
> Is there any other method in regular expression to include accentuated
> character so I do not need to specify them in advance ?


Put this line near the top of your program:

use locale;


perldoc locale
perldoc perllocale
etc.
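
A minimal sketch of what this looks like in practice, assuming the input is a single-byte (Latin-1) string and that a matching locale is installed. The locale names tried below and the helper names pick_latin1_locale and first_word are assumptions for illustration only; which locales exist depends on the system.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Try a few Latin-1 locale names. These names are assumptions; none of
# them is guaranteed to be installed on a given machine.
sub pick_latin1_locale {
    for my $name (qw(fr_FR.ISO8859-1 fr_FR.ISO-8859-1 fr_FR.iso88591
                     en_US.ISO-8859-1 en_US.iso88591)) {
        my $got = setlocale(LC_CTYPE, $name);
        return $got if defined $got;
    }
    return;
}

# Return the first run of word characters, with \w following LC_CTYPE.
sub first_word {
    my ($bytes) = @_;
    use locale;    # \w now consults the current locale
    return $bytes =~ /(\w+)/ ? $1 : undef;
}

# "caf\xE9" is "café" encoded as Latin-1 bytes.
if (my $loc = pick_latin1_locale()) {
    printf "locale %s: matched %d bytes\n", $loc, length first_word("caf\xE9");
}
else {
    print "no Latin-1 locale found, so this demo cannot show the effect\n";
}
```

If none of the candidate locales is available, the script only reports that fact; the result of \w under use locale always depends on the locale actually in effect.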


John
--
use Perl;
program
fulfillment
 
Gunnar Hjalmarsson
      04-19-2006
John W. Krahn wrote:
> (E-Mail Removed) wrote:
>> (\w+) does not see accentuated character such as ().
>> [a-z] sees accentuated character but the problem is that I have to
>> enumerate etc.
>>
>> Is there any other method in regular expression to include accentuated
>> character so I do not need to specify them in advance ?
>
> Put this line near the top of your program:
>
> use locale;


Or, possibly better, in the smaller block where that behaviour is desired.
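
A short sketch of that idea: use locale is lexically scoped, so it can be confined to a bare block or a subroutine. The function names ascii_word and locale_word are made up for this illustration.

```perl
use strict;
use warnings;

# No locale pragma in scope: for a plain byte string, \w is ASCII-only.
sub ascii_word {
    my ($s) = @_;
    return $s =~ /(\w+)/ ? $1 : undef;
}

# The pragma is active only inside the inner block.
sub locale_word {
    my ($s) = @_;
    my $word;
    {
        use locale;                 # in effect until the closing brace
        ($word) = $s =~ /(\w+)/;
    }
    return $word;
}

print ascii_word("caf\xE9"), "\n";       # prints "caf"
print locale_word("hello world"), "\n";  # prints "hello"
```

What locale_word returns for a string containing accented bytes depends on the locale in effect at run time, so no output is claimed for that case.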

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
Dave
      04-19-2006
<(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Hi
>
> (\w+) does not see accentuated character such as ().
> [a-z] sees accentuated character but the problem is that I have to
> enumerate etc.
>
> Is there any other method in regular expression to include accentuated
> character so I do not need to specify them in advance ?
>
> Thanks


You would be better off using (\p{IsAlpha}+). This will match all alphabetic
characters, accented or not.
See the docs on Unicode (perldoc perlunicode).
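
A small sketch of the Unicode-property approach, assuming the text is held as decoded characters (here via use utf8 for a literal; input read from a file would typically need an :encoding layer). The sample sentence is made up.

```perl
use strict;
use warnings;
use utf8;                              # the string literal below is UTF-8
binmode STDOUT, ':encoding(UTF-8)';    # encode characters on output

my $text = "Le café de l'été, numéro 42";

# \p{IsAlpha} is an older spelling of \p{Alpha} (Unicode alphabetic).
my @words = $text =~ /(\p{IsAlpha}+)/g;

print join('|', @words), "\n";         # prints: Le|café|de|l|été|numéro
```

Note that the apostrophe splits "l'été" into two matches and that "42" is not matched at all, since digits are not alphabetic.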


 
Gunnar Hjalmarsson
      04-19-2006
Dave wrote:
> <(E-Mail Removed)> wrote in message
>> (\w+) does not see accentuated character such as ().
>> [a-z] sees accentuated character but the problem is that I have to
>> enumerate etc.
>>
>> Is there any other method in regular expression to include accentuated
>> character so I do not need to specify them in advance ?
>
> You would be better off using (\p{IsAlpha}+).


How can you tell?

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
Dave
      04-20-2006

"Gunnar Hjalmarsson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Dave wrote:
>> <(E-Mail Removed)> wrote in message
>>> (\w+) does not see accentuated character such as (). [a-z] sees
>>> accentuated character but the problem is that I have to enumerate
>>> etc.
>>>
>>> Is there any other method in regular expression to include accentuated
>>> character so I do not need to specify them in advance ?
>>
>> You would be better off using (\p{IsAlpha}+).
>
> How can you tell?
>
> --
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl


Fair point. I should have had the word 'probably' in that sentence. The
original post, as you correctly imply, does not give the OP's actual goal;
I am assuming he is trying to use (\w+) to capture whole words (in a
natural language) but is finding that it does not work well for this.
I should have made my assumption explicit. Thanks for pointing this out.
(Your suggestion of adding use locale; makes similar assumptions, it has to
be said.)



 
Gunnar Hjalmarsson
      04-20-2006
Dave wrote:
> Gunnar Hjalmarsson wrote:
>> Dave wrote:
>>> <(E-Mail Removed)> wrote in message
>>>> (\w+) does not see accentuated character such as (). [a-z] sees
>>>> accentuated character but the problem is that I have to enumerate
>>>> etc.
>>>>
>>>> Is there any other method in regular expression to include accentuated
>>>> character so I do not need to specify them in advance ?
>>>
>>> You would be better off using (\p{IsAlpha}+).
>>
>> How can you tell?
>
> Fair point I should have had the word 'probably' in that sentence as from
> the original post (which, as you correctly imply, does not give the OP's
> actual goal) I am assuming he is trying to use (\w+) to capture whole words
> (in a natural language) but is finding that it does not work well for this.
> I should have made my assumption explicit. Thanks for pointing this out.
> (Your suggesting of adding use locale; makes similar assumptions it has to
> be said.)


Not really. I just meant that we don't really know whether he is
interested in also matching digits.
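
To make the difference concrete, here is a small sketch comparing the two patterns on a made-up ASCII sample: \w also accepts digits and the underscore, while \p{IsAlpha} accepts letters only.

```perl
use strict;
use warnings;

my $text = "route_66 and 3rd";

my @w     = $text =~ /(\w+)/g;            # letters, digits and underscore
my @alpha = $text =~ /(\p{IsAlpha}+)/g;   # letters only

print "w:     @w\n";        # prints: w:     route_66 and 3rd
print "alpha: @alpha\n";    # prints: alpha: route and rd
```

Which of the two is wanted depends on whether tokens such as "route_66" should count as a single word.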

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 