Wide character in print

 
 
Yuri Shtil
Guest
Posts: n/a
 
      07-31-2003
Hi all

I am getting this warning ("Wide character in print") when I try to print certain strings. Is it harmless?

If not, how do I get rid of it?

Yuri.


 
 
 
 
 
Gregory Toomey
Guest
Posts: n/a
 
      08-01-2003
"Yuri Shtil" <(E-Mail Removed)> wrote in message
news:a3eWa.21387$cF.8823@rwcrnsc53...
> Hi all
>
> I am getting this when I try to print certain strings. Is it harmless?
>
> If not, how do I get rid of it?
>
> Yuri.


Upgrade! Fixed on Eniac 2.

gtoomey


 
 
 
 
 
Eric Amick
Guest
Posts: n/a
 
      08-03-2003
On Sat, 02 Aug 2003 19:01:28 GMT, "Yuri Shtil" wrote:

>What is Eniac 2?
>
>Sorry for my ignorance!


It's a stupid joke. Ignore it. I suspect you're trying to print Unicode
characters to a filehandle that isn't expecting them. You should be able
to fix the problem by adding

binmode(FILEHANDLE, ":utf8");

after the opening of the filehandle. If that doesn't work, you should be
able to turn off the warning.
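
A minimal sketch of the binmode suggestion, assuming the output really is meant to be UTF-8. The in-memory filehandle and the sample string are illustrative stand-ins, not the original poster's code; with a real file the binmode call would sit in the same place, directly after open.

```perl
use strict;
use warnings;

# Open an in-memory filehandle so the written bytes can be inspected.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory file: $!";

# Mark the handle as UTF-8 right after opening it.
binmode($fh, ':utf8');

# U+263A (WHITE SMILING FACE) is a "wide" character (code point > 255).
print {$fh} "smile: \x{263A}\n";
close($fh);

# The single wide character was written as three UTF-8 bytes.
printf "%d bytes written\n", length($buffer);
```

This only helps if whatever reads the output expects UTF-8; otherwise the warning disappears but the consumer sees the wrong bytes.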

perldoc perldiag

--
Eric Amick
Columbia, MD
 
 
Alan J. Flavell
Guest
Posts: n/a
 
      08-03-2003
On Sun, Aug 3, Eric Amick inscribed on the eternal scroll:

> On Sat, 02 Aug 2003 19:01:28 GMT, "Yuri Shtil"

had, it seems, blurted out atop a fullquote:

> >What is Eniac 2?
> >
> >Sorry for my ignorance!

>
> It's a stupid joke.


Well, I thought it was rather amusing; but then, the hon. Usenaut
could perhaps be advised to pay more attention to Usenet posting
conventions, and to entrust unknown terminology to a search engine of
their choice before revealing ignorance of the history of computers in
public... [An aside on the topic of character coding and old
computers: http://www.mailcom.com/besm6/ shows what can happen when
people try to put two different character codings into the same web
page - Mozilla decided it must be Chinese, with unfortunate
results...] [OK, so BESM-6 was a youngster compared to ENIAC]

> I suspect you're trying to print Unicode
> characters to a filehandle that isn't expecting them.


OK, let's get serious.

There is a Perl document (perldiag) which lists the error messages
issued by perl itself. For 5.8.0 this document could be perused at
http://www.perldoc.com/perl5.8.0/pod/perldiag.html ,
although it's also part of any complete Perl installation.

This should be the _first_ recourse for any unrecognised message.

And indeed, here is the offending item:

Wide character in %s
(W utf8) Perl met a wide character (>255) when it wasn't expecting
one. This warning is by default on for I/O (like print) but can be
turned off by no warnings 'utf8';. You are supposed to explicitly
mark the filehandle with an encoding, see open and perlfunc/binmode.

Seems to me that the key phrase here is "You are supposed to...".
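
For readers who want to see the diagnostic in isolation, here is a small sketch that provokes it on purpose. It assumes a handle with no encoding layer (an in-memory handle is used purely for illustration) and collects the warning text instead of letting it go to STDERR.

```perl
use strict;
use warnings;

# Collect warnings so they can be examined afterwards.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# A handle with no encoding layer: Perl has not been told what to expect.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory file: $!";

# Printing a character above 255 to such a handle triggers the warning.
print {$fh} "\x{263A}";
close($fh);

print "Perl said: $_" for @warnings;
```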

> You should be able to fix the problem by adding
>
> binmode(FILEHANDLE, ":utf8");


Do you think so? That tells Perl that the filehandle *is* expecting
utf-8 encoding, but if it isn't in fact expecting it, then it's
likely to cause an even worse problem.

If the hon. Usenaut is expecting a particular character coding on
their output, I would recommend (in 5.8.0) defining that coding in
an encoding layer, to give Perl the chance to convert between "Wide
characters" internally, and the expected encoding externally.

Without some context, I've no idea whether the material in question
might want to be koi8-r (the traditional encoding for Russian
Cyrillic), or nothing more exciting than Windows-1252; but either way,
an :encoding layer is what I'd recommend.
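
As a rough sketch of what such a layer looks like in 5.8-style code: the choice of koi8-r, the in-memory handle, and the sample text are all assumptions made for illustration, since the thread does not say which encoding the consumer actually wants.

```perl
use strict;
use warnings;

my $buffer = '';

# The :encoding layer converts Perl's internal characters into
# KOI8-R bytes as they are written.
open(my $fh, '>:encoding(koi8-r)', \$buffer)
    or die "Cannot open in-memory file: $!";

# Two Cyrillic letters, U+0414 and U+0430, both "wide" characters.
print {$fh} "\x{0414}\x{0430}";
close($fh);

# KOI8-R is a single-byte encoding, so two characters become two bytes.
printf "%d bytes: %s\n", length($buffer), unpack('H*', $buffer);
```

The same layer name can be given to binmode() on a handle that is already open, for example binmode(STDOUT, ':encoding(koi8-r)').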

The relevant documentation page that's called out from the binmode()
page is: http://www.perldoc.com/perl5.8.0/lib/open.html

(In earlier Perl versions, one needs to call the encoding explicitly,
instead of including it in the open/binmode calls).

> If that doesn't work, you should be able to turn off the warning.


But again: the warning is there for a reason. Just hiding the warning
doesn't make that reason go away. I would recommend identifying and
then solving the problem, not just hiding it.

You then added, almost it seems as an afterthought:

> perldoc perldiag


Oh, right: but I'd suggest putting that up-front, IMNSHO it's the
single most important part of this reply.

cheers
 
 
Yuri Shtil
Guest
Posts: n/a
 
      08-04-2003
I am amazed at how a simple question can start something close to a flame war!
Are only those superbly educated in computer history allowed to participate in
this group?

On a serious note, my problem showed up when I tried to parse and write XML
that came from a third-party application.
I have no idea what to expect, since the application does not specify the
encoding (or at least I don't know how to extract it).

These wide characters just showed up in some records.

There is another problem.

My code passes the extracted XML strings to another application as counted
strings. It seems that the Perl length function returns an incorrect result
when these "wide" characters are present.

Again, please pardon my ignorance and try to avoid flaming each other.
 
Alan J. Flavell
Guest
Posts: n/a
 
      08-04-2003
On Mon, Aug 4, Yuri Shtil continued in TOFU style:

> Are only those superbly educated in computer history allowed to participate in
> this group?


You're no fun in a usenet discussion...

> On a serious note, my problem showed up when I tried to parse and write XML
> that came from a third-party application.
> I have no idea what to expect, since the application does not specify the
> encoding


But a text file is, in general, useless without a specification of its
character encoding.

> (or at least I don't know how to extract it).


It's not normally something that one can "extract" in any formal way
from the datastream itself; it's a piece of meta-data that goes along
with the data. However, with some samples and some knowledge of
context, someone could well offer a hypothesis.

Perhaps if you'd show the data in context (accompanied for example by
a hexadecimal dump of the bytes), someone could offer a suggestion
about what it is.
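
One possible way to produce such a dump from within Perl, sketched with a made-up sample string standing in for the bytes read from the third-party application:

```perl
use strict;
use warnings;

# Hypothetical sample: raw bytes as they might arrive from the application.
my $bytes = "caf\xC3\xA9";

# Show each byte as two hex digits, separated by spaces.
my $dump = join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;

print "$dump\n";   # 63 61 66 c3 a9
```

A pair such as c3 a9 where a single accented letter is expected would hint at UTF-8, whereas a lone e9 would point towards a single-byte encoding such as ISO-8859-1 or Windows-1252.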

> These wide characters just showed up in some records.


That's not a very definite description of symptoms, you know. I think
we could have guessed that for ourselves based on your previous
presentation. I for one was hoping to see something more definite in
the way of an exhibit.

> There is another problem.
>
> My code passes the extracted XML strings to another application as counted
> strings. It seems that the Perl length function returns an incorrect result
> when these "wide" characters are present.


I'd have to guess that the Perl length function returns what it's
documented to return, but that you're expecting something different.
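
To illustrate the likely mismatch: length() counts characters, while a "counted string" handed to another application probably needs a byte count. The sketch below assumes the receiving application wants UTF-8, which the thread does not confirm; the sample string is invented.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Six characters: c, a, f, e-acute (U+00E9), space, U+263A.
my $string = "caf\x{E9} \x{263A}";

# length() on a character string counts characters.
my $chars = length($string);

# Encoding first and then taking the length counts bytes on the wire.
my $bytes = length(encode('UTF-8', $string));

print "characters: $chars, UTF-8 bytes: $bytes\n";   # characters: 6, UTF-8 bytes: 9
```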

> Again, please pardon my ignorance


Lack of knowledge (ignorance) is NOT the issue here, and is a
perfectly normal and acceptable state of being, and (I think I can
speak for many another here) is one of the reasons why we come to
Usenet to share what we know. The *problem* is that you aren't
showing us any working, so we don't know exactly what you're trying,
we don't know exactly what results you are getting, we don't know what
you expected the answer to be, and so we can't really offer any
definite help.

If you haven't tried it yet I'd suggest
http://www.perldoc.com/perl5.8.0/pod/perluniintro.html
and then
http://www.perldoc.com/perl5.8.0/pod/perlunicode.html
with particular reference to #Byte-and-Character-Semantics

But most of all to
http://mail.augustmail.com/~tadmc/cl...uidelines.html

have fun
 
 
Jürgen Exner
Guest
Posts: n/a
 
      08-05-2003
Alan J. Flavell wrote:
> On Mon, Aug 4, Yuri Shtil continued in TOFU style:

[...]
> The *problem* is that you aren't
> showing us any working, so we don't know exactly what you're trying,
> we don't know exactly what results you are getting, we don't know what
> you expected the answer to be, and so we can't really offer any
> definite help.
>
> If you haven't tried it yet I'd suggest
> http://www.perldoc.com/perl5.8.0/pod/perluniintro.html
> and then
> http://www.perldoc.com/perl5.8.0/pod/perlunicode.html
> with particular reference to #Byte-and-Character-Semantics
>
> But most of all to
> http://mail.augustmail.com/~tadmc/cl...uidelines.html


I'd like to add http://www.catb.org/~esr/faqs/smart-questions.html to that
list.

jue


 
 
 
 