Data::Dumper vs. UTF-8, as usual

 
 
jidanni
 
      03-21-2011
Gentlemen, I need to use
use utf8;
use open qw/:std :encoding(utf8)/;
in my program, but it has the side effect of causing
print Dumper "龔";
to print
$VAR1 = "\x{9f94}";
instead of
$VAR1 = "龔";
like it would otherwise. I dare not touch the 'use' stuff, so how can I
tweak this?:
use strict;
use warnings FATAL => 'all';
use open qw/:std :encoding(utf8)/;
use utf8;
use Data::Dumper;
print Dumper "龔";
 
Ilya Zakharevich
 
      03-21-2011
On 2011-03-21, jidanni <(E-Mail Removed)> wrote:
> Gentlemen, I need to use
> use utf8;
> use open qw/:std :encoding(utf8)/;
> in my program, but it has the side effect of causing
> print Dumper "?";
> to print
> $VAR1 = "\x{9f94}";
> instead of
> $VAR1 = "?";
> like it would otherwise. I dare not touch the 'use' stuff, so how can I
> tweak this?:
> use strict;
> use warnings FATAL => 'all';
> use open qw/:std :encoding(utf8)/;
> use utf8;
> use Data::Dumper;
> print Dumper "?";


What are you using for editing your files? Are you sure you use a
real question mark? Check with
od -tx1a -Ax your_script.pl

I see no problem here with 5.8.8,
Ilya
 
Peter J. Holzer
 
      03-21-2011
On 2011-03-21 22:44, Ilya Zakharevich <(E-Mail Removed)> wrote:
> On 2011-03-21, jidanni <(E-Mail Removed)> wrote:
>> Gentlemen, I need to use
>> use utf8;
>> use open qw/:std :encoding(utf8)/;
>> in my program, but it has the side effect of causing
>> print Dumper "?";
>> to print
>> $VAR1 = "\x{9f94}";
>> instead of
>> $VAR1 = "?";
>> like it would otherwise. I dare not touch the 'use' stuff, so how can I
>> tweak this?:
>> use strict;
>> use warnings FATAL => 'all';
>> use open qw/:std :encoding(utf8)/;
>> use utf8;
>> use Data::Dumper;
>> print Dumper "?";

>
> What are you using for editing your files? Are you sure you use a
> real question mark?


The only question mark in jidanni's posting was at the end of "how can I
tweak this?". The character jidanni wants to be displayed is a CJK
character: http://unicode.org/cgi-bin/GetUnihan...codepoint=9F94

> I see no problem here with 5.8.8,


I see a problem with your newsreader.


Unfortunately I don't know a solution for the OP's problem. This may be
a case where writing a custom dumping routine (and uploading it to CPAN)
may be worthwhile.

hp

 
Xho Jingleheimerschmidt
 
      03-22-2011
jidanni wrote:
> Gentlemen, I need to use
> use utf8;
> use open qw/:std :encoding(utf8)/;
> in my program, but it has the side effect of causing
> print Dumper "龔";


Without "use utf8", Perl interprets that character as just 3 bytes and
prints those three bytes, and it is your terminal that converts them
back into the character that you see. If you were to print the length,
rather than the output of Dumper, you would see the difference that
"use utf8" makes.

As far as I can tell, the "use open" part makes no difference, other
than to silence a warning about wide characters.
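
A minimal sketch of that length difference, assuming the script file is saved as UTF-8. Encode is used here only to reproduce, inside one script, the byte string Perl would see if "use utf8" were absent:

```perl
use strict;
use warnings;
use utf8;                 # decode source literals into characters
use Encode qw(encode);

my $chars = "龔";                      # one character, U+9F94
my $bytes = encode('UTF-8', $chars);   # the raw bytes, as seen without "use utf8"

# Only ASCII is printed, so no I/O layer is needed for this check.
printf "as characters: length %d\n", length $chars;
printf "as bytes:      length %d\n", length $bytes;
```

The character form has length 1 and the byte form has length 3, which is why Dumper treats the two strings differently.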

> to print
> $VAR1 = "\x{9f94}";
> instead of
> $VAR1 = "龔";
> like it would otherwise. I dare not touch the 'use' stuff, so how can I
> tweak this?:


Without writing your own version of Data::Dumper (or extending/fixing
the current one), or doing something basically equivalent, I don't see
how you can. However, my version of Data::Dumper is rather old; maybe
it has already been tweaked in the meantime. It could use something
like $Data::Dumper::Useutf8.
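
One way to do "something basically equivalent" is to post-process Dumper's output. This is only a sketch: dump_readable is a hypothetical helper, not part of Data::Dumper, and it assumes the output is meant for human eyes rather than for eval.

```perl
use strict;
use warnings;
use utf8;
use open qw/:std :encoding(UTF-8)/;
use Data::Dumper;

# Hypothetical helper: dump as usual, then turn every \x{...} escape
# in the dumped text back into the character it stands for.
sub dump_readable {
    my $text = Dumper(@_);
    $text =~ s/\\x\{([0-9a-fA-F]+)\}/chr(hex($1))/ge;
    return $text;
}

print dump_readable("龔");
```

The substitution is deliberately naive. A string that already contains a literal backslash followed by x{...} is dumped with a doubled backslash and would be rewritten incorrectly, so treat the result as debugging output only.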

Xho
 
Peter J. Holzer
 
      03-22-2011
On 2011-03-21 23:56, Peter J. Holzer <(E-Mail Removed)> wrote:
> On 2011-03-21 22:44, Ilya Zakharevich <(E-Mail Removed)> wrote:
>> On 2011-03-21, jidanni <(E-Mail Removed)> wrote:
>>> Gentlemen, I need to use
>>> use utf8;
>>> use open qw/:std :encoding(utf8)/;
>>> in my program, but it has the side effect of causing
>>> print Dumper "?";

[? was a Chinese character in the OP]
>>> to print
>>> $VAR1 = "\x{9f94}";
>>> instead of
>>> $VAR1 = "?";
>>> like it would otherwise. I dare not touch the 'use' stuff, so how can I
>>> tweak this?:

[...]
> Unfortunately I don't know a solution for the OP's problem. This may be
> a case where writing a custom dumping routine (and uploading it to CPAN)
> may be worthwhile.


Forgot to add: It also depends very much on what Data::Dumper is used
for in the OP's script: Is the output supposed to be readable by humans
or by other programs? Is the output only used for debugging purposes or
is it part of the "real" output of the program?
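
If the output only has to be readable by humans, a different serializer may do instead of a custom dumper. A sketch, assuming JSON::PP is available (it ships with Perl from 5.14 on) and that JSON-style output is acceptable; the hash and its key are made up for illustration:

```perl
use strict;
use warnings;
use utf8;
use open qw/:std :encoding(UTF-8)/;
use JSON::PP;

# Without ->ascii, non-ASCII characters are left unescaped; without ->utf8,
# encode() returns a character string and the "use open" layer does the
# byte encoding when it is printed.
my $json = JSON::PP->new->pretty->canonical;

my $data = { name => "龔" };   # illustrative data only
print $json->encode($data);
```

This is not a drop-in replacement: the output is JSON rather than Perl syntax, and blessed objects or code references need extra handling.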

hp
 
Ilya Zakharevich
 
      03-23-2011
On 2011-03-21, Peter J. Holzer <(E-Mail Removed)> wrote:
>>> use Data::Dumper;
>>> print Dumper "?";

>>
>> What are you using for editing your files? Are you sure you use a
>> real question mark?

>
> The only question mark in jidanni's posting was at the end of "how can I
> tweak this?". The character jidanni wants to be displayed is a CJK
> character: http://unicode.org/cgi-bin/GetUnihan...codepoint=9F94
>
>> I see no problem here with 5.8.8,

>
> I see a problem with your newsreader.


I do not see any problem with it. It is told that the TTY understands
latin-1, and performs accordingly. The real problem is with wetware -
I could have guessed that this question mark is not \x3f...

Thanks,
Ilya
 