Remove Unicode control character from string

 
 
Ryan Chan
Guest
Posts: n/a
 
      10-04-2009
Hello,

Consider the sample code:
##############
use strict;

my $in = "\U0004";
my $out = chr($in) . "apple";

$out =~ s/[:cntrl:]//gi;
print $out;
##############

I want to remove the first Unicode character (e.g. "\U0004") from the
string. I found that the above code does not work as expected. Any idea?

Thanks.


 
 
 
 
 
Ben Bullock
Guest
Posts: n/a
 
      10-04-2009
On Oct 4, 11:30 pm, Ryan Chan <(E-Mail Removed)> wrote:
> Hello,
>
> Consider the sample code:
> ##############
> use strict;
>
> my $in = "\U0004";
> my $out = chr($in) . "apple";
>
> $out =~ s/[:cntrl:]//gi;
> print $out;
> ##############
>
> I want to remove the first Unicode character (e.g. "\U0004") from the
> string. I found that the above code does not work as expected. Any idea?


I got a message like this:

POSIX syntax [: :] belongs inside character classes in regex; marked by <-- HERE in m/[:cntrl:] <-- HERE / at ./moo.pl line 6.

It seems you need more []s.

$out =~ s/[[:cntrl:]]//gi;

seems to do the trick.
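
For reference, here is a minimal sketch of the whole script with that fix applied. Two details are assumptions of this sketch rather than part of the posts above: the control character is written as "\x{0004}" (in a double-quoted Perl string, \U is the "uppercase until \E" escape, so "\U0004" is just the string "0004", which chr() then numifies to 4), and the /i flag is dropped because case does not matter for control characters.

```perl
use strict;
use warnings;

# "\x{0004}" is the code point escape for U+0004 (END OF TRANSMISSION).
my $out = "\x{0004}" . "apple";

# The POSIX class [:cntrl:] sits inside an enclosing bracketed class.
# The copy-then-substitute idiom leaves $out itself untouched.
(my $clean = $out) =~ s/[[:cntrl:]]//g;

print "$clean\n";    # prints "apple"
```

This strips every control character in the string, not only the first one; dropping /g would limit it to the first match.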
 
 
 
 
 
Jürgen Exner
Guest
Posts: n/a
 
      10-04-2009
Ryan Chan <(E-Mail Removed)> wrote:
>$out =~ s/[:cntrl:]//gi;
>
>I want to remove the first Unicode character (e.g. "\U0004") from the
>string. I found that the above code does not work as expected. Any idea?


You can use the notation [:cntrl:] only inside of a character class.
From "perldoc perlre":

The POSIX character class syntax

[:class:]

is also available. Note that the "[" and "]" brackets are *literal*;
they must always be used within a character class expression.

# this is correct:
$string =~ /[[:alpha:]]/;

# this is not, and will generate a warning:
$string =~ /[:alpha:]/;
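
To see why the unbracketed form misbehaves rather than simply failing, the following sketch may help. Without the outer brackets, Perl reads [:cntrl:] as an ordinary character class containing ':', 'c', 'n', 't', 'r' and 'l'. The variable name $wrong and the "no warnings" line are assumptions added for this illustration; the warning is silenced only because the mistake is deliberate here.

```perl
use strict;
use warnings;
# Deliberately silence "POSIX syntax [: :] belongs inside character classes".
no warnings 'regexp';

my $out = chr(4) . "apple";

# Plain class of the characters : c n t r l, matched case-insensitively.
# It removes the "l" of "apple" and leaves the control character alone.
(my $wrong = $out) =~ s/[:cntrl:]//gi;

# %vd prints each character's ordinal, which makes the invisible
# U+0004 visible in the output.
printf "%vd\n", $wrong;    # prints 4.97.112.112.101, i.e. chr(4) . "appe"
```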

jue
 
 
 
 