Unicode blues in Python3
I know that Unicode is the way to go in Python 3.1, but it is getting in my way right now in my Unix scripts. How do I write a chr(253) to a file?

    #nntst2.py
    import sys,codecs
    mychar=chr(253)
    print(sys.stdout.encoding)
    print(mychar)

    > ./nntst2.py
    ISO8859-1
    ý

    > ./nntst2.py >nnout2
    Traceback (most recent call last):
      File "./nntst2.py", line 5, in <module>
        print(mychar)
    UnicodeEncodeError: 'ascii' codec can't encode character '\xfd' in position 0: ordinal not in range(128)

    > cat nnout2
    ascii

...Oh great! OK, let's try this:

    #nntst3.py
    import sys,codecs
    mychar=chr(253)
    print(sys.stdout.encoding)
    print(mychar.encode('latin1'))

    > ./nntst3.py
    ISO8859-1
    b'\xfd'

    > ./nntst3.py >nnout3

    > cat nnout3
    ascii
    b'\xfd'

...Eh... not what I want really.

    #nntst4.py
    import sys,codecs
    mychar=chr(253)
    print(sys.stdout.encoding)
    sys.stdout=codecs.getwriter("latin1")(sys.stdout)
    print(mychar)

    > ./nntst4.py
    ISO8859-1
    Traceback (most recent call last):
      File "./nntst4.py", line 6, in <module>
        print(mychar)
      File "Python-3.1.2/Lib/codecs.py", line 356, in write
        self.stream.write(data)
    TypeError: must be str, not bytes

...OK, this is not working either.

Is there any way to write a value 253 to standard output?
Re: Unicode blues in Python3
On Tuesday 23 March 2010 10:33:33 nn wrote:
> I know that unicode is the way to go in Python 3.1, but it is getting
> in my way right now in my Unix scripts. How do I write a chr(253) to a
> file?
>
> #nntst2.py
> import sys,codecs
> mychar=chr(253)
> print(sys.stdout.encoding)
> print(mychar)

The following code works for me:

    $ cat nnout5.py
    #!/usr/bin/python3.1

    import sys
    mychar = chr(253)
    sys.stdout.write(mychar)
    $ echo $(cat nnout)
    ý

Can I ask why you're using print() in the first place, rather than writing directly to a file? Python 3.x, AFAIK, distinguishes between text and binary files and will let you specify the encoding you want for strings you write.

Hope that helps,
Rami
Re: Unicode blues in Python3
Rami Chowdhury wrote:
> The following code works for me:
>
> $ cat nnout5.py
> #!/usr/bin/python3.1
>
> import sys
> mychar = chr(253)
> sys.stdout.write(mychar)
> $ echo $(cat nnout)
> ý
>
> Can I ask why you're using print() in the first place, rather than writing
> directly to a file? Python 3.x, AFAIK, distinguishes between text and binary
> files and will let you specify the encoding you want for strings you write.

    #nntst5.py
    import sys
    mychar=chr(253)
    sys.stdout.write(mychar)

    > ./nntst5.py >nnout5
    Traceback (most recent call last):
      File "./nntst5.py", line 4, in <module>
        sys.stdout.write(mychar)
    UnicodeEncodeError: 'ascii' codec can't encode character '\xfd' in position 0: ordinal not in range(128)

So sys.stdout.write() is equivalent to print here. I use print so I can do tests and debug runs to the screen or pipe the output to some other tool, and then configure the production bash script to write the final output to a file of my choosing.
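The traceback above does not depend on whether print() or sys.stdout.write() is used: both hand a str to a text stream, and the stream's encoding decides whether the character can be written. A minimal sketch that reproduces this without a shell redirect, using an in-memory stream in place of standard output (the helper name `can_write` is made up for illustration):

```python
import io


def can_write(text, encoding):
    """Return True if `text` can be written to a text stream using `encoding`."""
    # An in-memory binary stream stands in for the redirected file.
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for name in ("ascii", "latin-1"):
        print(name, can_write(chr(253), name))
```

With "ascii" the write fails just as in the redirected runs, while "latin-1" accepts the same character.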
Re: Unicode blues in Python3
nn wrote:
> I know that unicode is the way to go in Python 3.1, but it is getting
> in my way right now in my Unix scripts. How do I write a chr(253) to a
> file?

Python 3 makes a distinction between bytes and string (i.e., Unicode) types, and you are still thinking in the Python 2 mode that does *NOT* make such a distinction. What you appear to want is to write a particular byte to a file -- so use the bytes type and a file opened in binary mode:

    >>> b=bytes([253])
    >>> f = open("abc", 'wb')
    >>> f.write(b)
    1
    >>> f.close()

On Unix (at least), the "od" program can verify that the contents are correct:

    > od abc -d
    0000000 253
    0000001

Hope that helps.

Gary Herron
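The same idea can be applied to standard output instead of a named file. This is a sketch, not something proposed in the thread: it assumes the binary stream underneath the text stream, `sys.stdout.buffer`, is the right place to send the byte, and the helper name `write_raw` is made up for illustration.

```python
import io
import sys


def write_raw(data, stream=None):
    """Write a bytes object to a binary stream, with no text encoding step."""
    if stream is None:
        # sys.stdout is a text stream; sys.stdout.buffer is the binary
        # stream underneath it and accepts bytes directly.
        stream = sys.stdout.buffer
    stream.write(data)
    stream.flush()


if __name__ == "__main__":
    # Use an in-memory binary stream so the result can be inspected.
    sink = io.BytesIO()
    write_raw(bytes([253]), sink)
    print(sink.getvalue())  # b'\xfd'
```

Calling `write_raw(bytes([253]))` with no stream argument sends the byte to standard output whether or not it is redirected. If print() and raw writes are mixed in one script, flushing sys.stdout before each raw write keeps the output in order.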
Re: Unicode blues in Python3
Gary Herron wrote:
> Python3 make a distinction between bytes and string(i.e., unicode)
> types, and you are still thinking in the Python2 mode that does *NOT*
> make such a distinction. What you appear to want is to write a
> particular byte to a file -- so use the bytes type and a file open in
> binary mode:
>
> >>> b=bytes([253])
> >>> f = open("abc", 'wb')
> >>> f.write(b)
> 1
> >>> f.close()
>
> On unix (at least), the "od" program can verify the contents is correct:
> > od abc -d
> 0000000 253
> 0000001

Actually what I want is to write a particular byte to standard output, and I want this to work regardless of where that output gets sent to. I am aware that I could do open('nnout','w',encoding='latin1').write(mychar), but I am porting a Python 2 program and don't want to rewrite everything that uses that script.
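One way to get that behaviour without touching every print() call is to fix the encoding of standard output once, at startup. The sketch below is an assumption about what would suit this script, not a tested port: it wraps a binary stream in `io.TextIOWrapper` with a fixed Latin-1 encoding, and the helper name `latin1_writer` is made up for illustration. The earlier nntst4.py attempt failed because codecs.getwriter() produces bytes and was wrapped around the text stream sys.stdout; here the wrapper sits on a binary stream instead.

```python
import io


def latin1_writer(binary_stream):
    """Wrap a binary stream in a text stream that always encodes as Latin-1."""
    # newline="\n" turns off newline translation, so "\n" stays a single byte.
    return io.TextIOWrapper(binary_stream, encoding="latin-1", newline="\n")


if __name__ == "__main__":
    sink = io.BytesIO()
    out = latin1_writer(sink)
    print(chr(253), file=out)
    out.flush()
    print(sink.getvalue())  # b'\xfd\n'
    # In a script, the same wrapper could replace standard output once:
    #   import sys
    #   sys.stdout = latin1_writer(sys.stdout.buffer)
```

After the replacement, print(mychar) no longer depends on the terminal or locale, so the result is the same on screen, through a pipe, or redirected to a file.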
Re: Unicode blues in Python3
nn, 23.03.2010 19:46:
> Actually what I want is to write a particular byte to standard output,
> and I want this to work regardless of where that output gets sent to.
> I am aware that I could do
> open('nnout','w',encoding='latin1').write(mychar) but I am porting a
> python2 program and don't want to rewrite everything that uses that
> script.

Are you writing text or binary data to stdout?

Stefan
Re: Unicode blues in Python3
Stefan Behnel wrote:
> Are you writing text or binary data to stdout?

latin1 charset text.
Re: Unicode blues in Python3
nn wrote:
> Stefan Behnel wrote:
>> Are you writing text or binary data to stdout?
>
> latin1 charset text.

Are you sure about that? If you carefully reconsider, could you come to the conclusion that you are not writing text at all, but binary data?

If it really were text that you are writing, why would you need to use U+00FD (LATIN SMALL LETTER Y WITH ACUTE)? To my knowledge, that character is used very infrequently in practice. The fact that you are trying to write it strongly suggests that what you are writing is not actually text.

Also, your formulation suggests the same:

"Is there any way to write a value 253 to standard output?"

If you were really writing text, you'd ask

"Is there any way to write 'ý' to standard output?"

Regards,
Martin
Re: Unicode blues in Python3
On Tue, 23 Mar 2010 11:46:33 -0700, nn wrote:
> Actually what I want is to write a particular byte to standard output,
> and I want this to work regardless of where that output gets sent to.

What do you mean "work"? Do you mean "display a particular glyph" or something else?

In bash:

    $ echo -e "\0101"  # octal 101 = decimal 65
    A
    $ echo -e "\0375"  # decimal 253
    �

but if I change the terminal encoding, I get this:

    $ echo -e "\0375"
    ý

Or this:

    $ echo -e "\0375"
    ²

depending on which encoding I use.

I think your question is malformed. You need to work out what behaviour you actually want, before you can ask for help on how to get it.

--
Steven
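The same effect can be reproduced in Python by decoding the single byte 253 under different encodings. The post does not say which terminal encodings were used; UTF-8, Latin-1 and CP437 are assumptions chosen because they yield the three glyphs shown, and the helper name `glyphs_for_byte` is made up for illustration.

```python
def glyphs_for_byte(value, encodings=("utf-8", "latin-1", "cp437")):
    """Decode one byte under several encodings and return the resulting text."""
    raw = bytes([value])
    # errors="replace" substitutes U+FFFD where the byte is not valid,
    # which is what a terminal typically shows as a replacement glyph.
    return {name: raw.decode(name, errors="replace") for name in encodings}


if __name__ == "__main__":
    for name, glyph in glyphs_for_byte(253).items():
        print(name, ascii(glyph))
```

The byte is identical in all three cases; only the interpretation changes, which is why "write a value 253" and "write ý" are different requests.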
Re: Unicode blues in Python3
Martin v. Loewis wrote:
> Are you sure about that? If you carefully reconsider, could you come to
> the conclusion that you are not writing text at all, but binary data?
>
> If it really was text that you write, why do you need to use
> U+00FD (LATIN SMALL LETTER Y WITH ACUTE). To my knowledge, that
> character is really infrequently used in practice. So that you try to
> write it strongly suggests that it is not actually text what you are
> writing.

To be more informative: I am writing both text and binary data together. That is, I am embedding text from another source into a stream that uses non-ASCII characters as "control" characters. In Python 2 I was processing it mostly as text containing a few "funny" characters.
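For a mixed stream like this, one option in Python 3 is to keep the control characters as bytes and encode only the embedded text. The sketch below is hypothetical: the actual stream format is not described in the thread, so the use of byte 253 as a field separator, the Latin-1 text encoding, and the names `FIELD_SEPARATOR` and `build_record` are all assumptions for illustration.

```python
# Hypothetical control byte; the real protocol is not given in the thread.
FIELD_SEPARATOR = bytes([253])


def build_record(fields, text_encoding="latin-1"):
    """Encode each text field and join the results with the control byte."""
    encoded = [field.encode(text_encoding) for field in fields]
    return FIELD_SEPARATOR.join(encoded)


if __name__ == "__main__":
    record = build_record(["abc", "caf\u00e9"])
    print(record)  # b'abc\xfdcaf\xe9'
```

The resulting bytes object can then be written to a binary stream such as sys.stdout.buffer, so no terminal or locale encoding is involved.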