Velocity Reviews

danielk 11-09-2012 05:17 PM

Printing characters outside of the ASCII range
 
I'm converting an application to Python 3. The app works fine on Python 2.

Simply put, this simple one-liner:

print(chr(254))

errors out with:

Traceback (most recent call last):
  File "D:\home\python\tst.py", line 1, in <module>
    print(chr(254))
  File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>

I'm using this character as a delimiter in my application.

What do I have to do to convert this string so that it does not error out?

Ian Kelly 11-09-2012 05:34 PM

Re: Printing characters outside of the ASCII range
 
On Fri, Nov 9, 2012 at 10:17 AM, danielk <danielkleinad@gmail.com> wrote:
> I'm converting an application to Python 3. The app works fine on Python 2.
>
> Simply put, this simple one-liner:
>
> print(chr(254))
>
> errors out with:
>
> Traceback (most recent call last):
>   File "D:\home\python\tst.py", line 1, in <module>
>     print(chr(254))
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
>
> I'm using this character as a delimiter in my application.
>
> What do I have to do to convert this string so that it does not error out?


In Python 2, chr(254) means the byte 254.

In Python 3, chr(254) means the Unicode character with code point 254,
which is "þ" (U+00FE). This character does not exist in CP 437, so Python
fails to encode it for output.

If what you really want is the byte, then use b'\xfe' or bytes([254]) instead.
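
To make the distinction concrete, here is a minimal Python 3 sketch. The variable names are illustrative only, and the one fact it relies on beyond the explanation above is that byte 0xFE in CP 437 is the black square U+25A0.

```python
# chr() in Python 3 returns a one-character str (a Unicode code point),
# while bytes([254]) is the single raw byte 0xFE.
text = chr(254)
raw = bytes([254])

# ascii() gives an escaped representation that any console can print
print(ascii(text), raw)

# CP 437 has no mapping for U+00FE, so encoding the str fails ...
try:
    text.encode('cp437')
    encodable = True
except UnicodeEncodeError:
    encodable = False

# ... but the byte 0xFE is valid CP 437 and decodes to U+25A0.
square = raw.decode('cp437')
print(encodable, ascii(square))
```

In other words, the Python 2 program was sending the byte 0xFE to the console, which CP 437 happens to draw as a square, whereas the Python 3 program asks for a different character that the code page cannot represent.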

Andrew Berg 11-09-2012 05:39 PM

Re: Printing characters outside of the ASCII range
 
On 2012.11.09 11:17, danielk wrote:
> I'm converting an application to Python 3. The app works fine on Python 2.
>
> Simply put, this simple one-liner:
>
> print(chr(254))
>
> errors out with:
>
> Traceback (most recent call last):
>   File "D:\home\python\tst.py", line 1, in <module>
>     print(chr(254))
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
>
> I'm using this character as a delimiter in my application.
>
> What do I have to do to convert this string so that it does not error out?
>

That character is outside of cp437 - the default terminal encoding on
many Windows systems. You will either need to change the code page to
something that supports the character (if you're going to change it, you
might as well change it to cp65001 since you are using 3.3), catch the
error and replace the character with something that is in the current
codepage (don't assume cp437; it is not the default everywhere), or use
a different character completely. If it works on Python 2, it's probably
changing the character automatically to a replacement character or you
were using IDLE, which is graphical and is not subject to the weird
encoding system of terminals.
--
CPython 3.3.0 | Windows NT 6.1.7601.17835
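
The "catch the error and replace the character" option could look roughly like the sketch below. The helper names replace_unencodable and safe_print are hypothetical, not part of any library; the sketch relies only on the standard errors='replace' handler of str.encode, which substitutes '?' for characters the target encoding lacks.

```python
import io
import sys

def replace_unencodable(text, encoding):
    """Return text with characters missing from `encoding` replaced by '?'."""
    return text.encode(encoding, errors='replace').decode(encoding)

def safe_print(text, stream=None):
    """Print text after adapting it to the stream's own encoding."""
    if stream is None:
        stream = sys.stdout
    # Ask the stream for its encoding instead of assuming cp437
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    print(replace_unencodable(text, encoding), file=stream)

# Demonstration on an in-memory stream, which reports no encoding,
# so the conservative 'ascii' fallback applies.
demo = io.StringIO()
safe_print('name' + chr(254) + 'address', stream=demo)
```

Calling safe_print(chr(254)) on a CP 437 console should then print '?' instead of raising UnicodeEncodeError, at the cost of losing the original character.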

Dave Angel 11-09-2012 05:47 PM

Re: Printing characters outside of the ASCII range
 
On 11/09/2012 12:17 PM, danielk wrote:
> I'm converting an application to Python 3. The app works fine on Python 2.
>
> Simply put, this simple one-liner:
>
> print(chr(254))
>
> errors out with:
>
> Traceback (most recent call last):
>   File "D:\home\python\tst.py", line 1, in <module>
>     print(chr(254))
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
>
> I'm using this character as a delimiter in my application.
>
> What do I have to do to convert this string so that it does not error out?


What character do you want? What characters does your console handle
directly? What does a "delimiter" mean for your particular console?

Or are you just printing it for the fun of it, and the real purpose is
for further processing, which will not go to the console?

What kind of things will it be separating? (strings, bytes ?) Clearly
you originally picked it as something unlikely to occur in those elements.

When those things are combined with a separator between, how are the
results going to be used? Saved to a file? Printed to console? What?

--

DaveA


danielk 11-09-2012 09:17 PM

Re: Printing characters outside of the ASCII range
 
On Friday, November 9, 2012 12:48:05 PM UTC-5, Dave Angel wrote:
> On 11/09/2012 12:17 PM, danielk wrote:
> > I'm converting an application to Python 3. The app works fine on Python 2.
> >
> > Simply put, this simple one-liner:
> >
> > print(chr(254))
> >
> > errors out with:
> >
> > Traceback (most recent call last):
> >   File "D:\home\python\tst.py", line 1, in <module>
> >     print(chr(254))
> >   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
> >     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> > UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
> >
> > I'm using this character as a delimiter in my application.
> >
> > What do I have to do to convert this string so that it does not error out?
>
> What character do you want? What characters does your console handle
> directly? What does a "delimiter" mean for your particular console?
>
> Or are you just printing it for the fun of it, and the real purpose is
> for further processing, which will not go to the console?
>
> What kind of things will it be separating? (strings, bytes ?) Clearly
> you originally picked it as something unlikely to occur in those elements.
>
> When those things are combined with a separator between, how are the
> results going to be used? Saved to a file? Printed to console? What?
>
> --
> DaveA


The database I'm using stores information as a 3-dimensional array. The delimiters between elements are chr(252), chr(253) and chr(254). So a record can look like this (example only uses one of the delimiters for simplicity):

name + chr(254) + address + chr(254) + city + chr(254) + st + chr(254) + zip

The other delimiters can be embedded within each field. For example, if there were multiple addresses for 'name' then the 'address' field would look like this:

addr1 + chr(253) + addr2 + chr(253) + addr3 + etc ...
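
A record laid out this way can be handled as bytes in Python 3 and only decoded to text once it has been split. The following is a sketch under stated assumptions: the names parse_record and the *_SEP constants are hypothetical, chr(252) is assumed to be the innermost delimiter, and the field contents are assumed to be Latin-1 text (the thread does not say which encoding the database uses).

```python
FIELD_SEP = b'\xfe'      # chr(254) in the Python 2 code
VALUE_SEP = b'\xfd'      # chr(253)
SUBVALUE_SEP = b'\xfc'   # chr(252); assumed to be the innermost level

def parse_record(raw, encoding='latin-1'):
    """Split a raw record into fields -> values -> subvalues (all str)."""
    return [
        [
            [sub.decode(encoding) for sub in value.split(SUBVALUE_SEP)]
            for value in field.split(VALUE_SEP)
        ]
        for field in raw.split(FIELD_SEP)
    ]

# Made-up sample data: a name, two addresses, and a city
raw = b'Ann\xfe1 Main St\xfd2 Oak Ave\xfeSpringfield'
record = parse_record(raw)
print(record)
```

Because the delimiters never reach print() as characters, the console encoding no longer matters for them; only the decoded field contents are printed.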

I use Python to connect to the database using subprocess.Popen to run a server process. Python requests 'actions' like 'read' and 'write' to the server process, whereby the server process performs the actions. Some actions require that the server send back information in the form of records that contain those delimiters.

I have __str__ and __repr__ methods in the classes but Python is choking on those characters. Surely, I could convert those characters on the server before sending them to Python and that is what I'm probably going to do, so guess I've answered my own question. On Python 2, it just printed the 'extended' ASCII representation.

I guess the question I have is: How do you tell Python to use a specific encoding for 'print' statements when I know there will be characters outside of the ASCII range of 0-127?
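
One way to get an explicit encoding for print() in Python 3 is to wrap a binary stream in io.TextIOWrapper and pass the wrapper as the file argument. This is a sketch, not necessarily what the application should do: make_writer is a hypothetical helper, and 'latin-1' is chosen only because it maps code point 254 to byte 0xFE.

```python
import io

def make_writer(binary_stream, encoding, errors='strict'):
    """Wrap a binary stream in a text layer that uses an explicit encoding."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors=errors, line_buffering=True)

# Demonstrated on an in-memory buffer. For the real console the binary
# stream would be sys.stdout.buffer (an assumption about the setup: it is
# missing when stdout has been replaced, for example inside some IDEs).
buffer = io.BytesIO()
out = make_writer(buffer, 'latin-1')

# print() encodes through the wrapper, so chr(254) becomes the byte 0xFE
print('name' + chr(254) + 'address', file=out)
out.flush()
written = buffer.getvalue()
```

Setting the PYTHONIOENCODING environment variable before starting Python is another way to choose the encoding of the standard streams. Either way, what the console actually draws for byte 0xFE still depends on its active code page.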


Prasad, Ramit 11-09-2012 09:34 PM

RE: Printing characters outside of the ASCII range
 
danielk wrote:

>
> The database I'm using stores information as a 3-dimensional array. The delimiters between elements are
> chr(252), chr(253) and chr(254). So a record can look like this (example only uses one of the delimiters for
> simplicity):
>
> name + chr(254) + address + chr(254) + city + chr(254) + st + chr(254) + zip
>
> The other delimiters can be embedded within each field. For example, if there were multiple addresses for 'name'
> then the 'address' field would look like this:
>
> addr1 + chr(253) + addr2 + chr(253) + addr3 + etc ...
>
> I use Python to connect to the database using subprocess.Popen to run a server process. Python requests
> 'actions' like 'read' and 'write' to the server process, whereby the server process performs the actions. Some
> actions require that the server send back information in the form of records that contain those delimiters.
>
> I have __str__ and __repr__ methods in the classes but Python is choking on those characters. Surely, I could
> convert those characters on the server before sending them to Python and that is what I'm probably going to do,
> so guess I've answered my own question. On Python 2, it just printed the 'extended' ASCII representation.
>
> I guess the question I have is: How do you tell Python to use a specific encoding for 'print' statements when I
> know there will be characters outside of the ASCII range of 0-127?


You just need to change the string to one that is not
trying to use the ASCII codec when printing.

print(chr(253).decode('latin1')) # change latin1 to your
# chosen encoding.



~Ramit


Andrew Berg 11-09-2012 09:39 PM

Re: Printing characters outside of the ASCII range
 
On 2012.11.09 15:17, danielk wrote:
> I guess the question I have is: How do you tell Python to use a specific encoding for 'print' statements when I know there will be characters outside of the ASCII range of 0-127?

You don't. It's raising that exception because the terminal cannot
display that character, not because it's using the wrong encoding. As
Ian mentioned, chr() on Python 2 and chr() on Python 3 return two
different things. I'm not very familiar with the oddities of Python 2,
but I suspect sending bytes to the terminal could work since that is
what chr() on Python 2 returns.
--
CPython 3.3.0 | Windows NT 6.1.7601.17835
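
Sending bytes past the text layer, as suggested above, might look like this in Python 3. It is a sketch only: write_raw is a hypothetical helper, and it assumes the stream exposes a .buffer attribute, which the normal console sys.stdout does but replaced streams may not.

```python
import io
import sys

def write_raw(data, stream=None):
    """Write raw bytes to the binary layer beneath a text stream."""
    if stream is None:
        stream = sys.stdout
    stream.flush()            # keep ordering with earlier print() calls
    stream.buffer.write(data)
    stream.buffer.flush()

# Demonstration on an in-memory stand-in for the console
demo_buffer = io.BytesIO()
demo_stream = io.TextIOWrapper(demo_buffer, encoding='ascii')
write_raw(b'name' + bytes([254]) + b'address\n', demo_stream)
```

On a CP 437 console, write_raw(bytes([254])) should reproduce what Python 2 displayed, because the byte is handed to the terminal without any encoding step.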

danielk 11-09-2012 09:46 PM

Re: Printing characters outside of the ASCII range
 
On Friday, November 9, 2012 4:34:19 PM UTC-5, Prasad, Ramit wrote:
> danielk wrote:
> > The database I'm using stores information as a 3-dimensional array. The delimiters between elements are
> > chr(252), chr(253) and chr(254). So a record can look like this (example only uses one of the delimiters for
> > simplicity):
> >
> > name + chr(254) + address + chr(254) + city + chr(254) + st + chr(254) + zip
> >
> > The other delimiters can be embedded within each field. For example, if there were multiple addresses for 'name'
> > then the 'address' field would look like this:
> >
> > addr1 + chr(253) + addr2 + chr(253) + addr3 + etc ...
> >
> > I use Python to connect to the database using subprocess.Popen to run a server process. Python requests
> > 'actions' like 'read' and 'write' to the server process, whereby the server process performs the actions. Some
> > actions require that the server send back information in the form of records that contain those delimiters.
> >
> > I have __str__ and __repr__ methods in the classes but Python is choking on those characters. Surely, I could
> > convert those characters on the server before sending them to Python and that is what I'm probably going to do,
> > so guess I've answered my own question. On Python 2, it just printed the 'extended' ASCII representation.
> >
> > I guess the question I have is: How do you tell Python to use a specific encoding for 'print' statements when I
> > know there will be characters outside of the ASCII range of 0-127?
>
> You just need to change the string to one that is not
> trying to use the ASCII codec when printing.
>
> print(chr(253).decode('latin1')) # change latin1 to your
> # chosen encoding.
>
> ~Ramit


D:\home\python>pytest.py
Traceback (most recent call last):
  File "D:\home\python\pytest.py", line 1, in <module>
    print(chr(253).decode('latin1'))
AttributeError: 'str' object has no attribute 'decode'

Do I need to import something?
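
For context on that AttributeError: no import is missing. In Python 3, str objects only have .encode() and bytes objects only have .decode(), so decoding has to start from bytes. A minimal sketch, with illustrative variable names:

```python
# The byte that chr(253) produced in Python 2
raw = bytes([253])

# bytes -> str; latin1 maps every byte N to the code point N
text = raw.decode('latin1')

# str -> bytes goes the other way
round_trip = text.encode('latin1')

# Python 3 str has no decode method at all
has_decode = hasattr('', 'decode')

print(ascii(text), round_trip, has_decode)
```

Note that printing text itself can still raise UnicodeEncodeError on a console whose code page lacks that character, which is why the sketch prints ascii(text) instead.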
