regexp problem with UTF8

Risto Vaarandi

I have a Perl program that has worked for 2 years on Red Hat and Solaris
nodes without problems. Recently I moved it to a Red Hat 9 node (which has
UTF-8 as the default system character set), and discovered that the following
regular expression inside the program no longer matches:

if ($line =~ /^\s*([^=\s]+)\s*=\s*(.*\S)/) {
    $keyword = $1;
    $value = $2;
}

When the regexp is written as /^\s*(\w+)\s*=\s*(.*\S)/, or as
/^\s*([^=]+)\s*=\s*(.*\S)/, everything works fine. What could be the
problem here? (When I change the system charset from UTF-8 to ISO-8859-1,
the original regexp works again.)
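
To help narrow this down, here is a minimal test script that runs all three patterns against one line and reports which of them match. It is only a sketch: the sample line, the labels and the try_patterns helper are made up for illustration, since the real input is not shown above. Running it once under the UTF-8 locale and once under ISO-8859-1 should show whether the failure depends on the locale or on the input data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line; substitute a line from the real input file.
my $sample = "  keyword = some value  ";

# The three patterns from the post. The labels are invented for this sketch.
my %patterns = (
    original => qr/^\s*([^=\s]+)\s*=\s*(.*\S)/,
    word     => qr/^\s*(\w+)\s*=\s*(.*\S)/,
    negated  => qr/^\s*([^=]+)\s*=\s*(.*\S)/,
);

# Return a hash ref mapping each label to [keyword, value],
# or to undef when the pattern does not match the line.
sub try_patterns {
    my ($line) = @_;
    my %result;
    for my $label (sort keys %patterns) {
        if ($line =~ $patterns{$label}) {
            $result{$label} = [$1, $2];
        } else {
            $result{$label} = undef;
        }
    }
    return \%result;
}

my $result = try_patterns($sample);
for my $label (sort keys %$result) {
    my $m = $result->{$label};
    print "$label: ",
        ($m ? "keyword='$m->[0]' value='$m->[1]'" : "no match"), "\n";
}

# Report the interpreter version and locale so runs can be compared.
printf "perl %vd, LANG=%s\n", $^V, defined $ENV{LANG} ? $ENV{LANG} : '(unset)';
```

For example, LANG=en_US.UTF-8 perl test.pl followed by LANG=en_US.ISO-8859-1 perl test.pl (the script name and locale names are assumptions; use whatever locales the node actually has installed).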

