Can't print Chinese to HTTP

 
 
Gnarlodious
11-30-2009
Hello.
The "upgrade" to Python 3.1 has been a disaster so far. I can't figure out how to print Chinese to a browser. If my script is:

#!/usr/bin/python
print("Content-type:text/html\n\n")
print('晉')

the Chinese string simply does not print. It works in the interactive Terminal with no problem, and also works in Python 2.6 (which my server is still running) in 4 different browsers. What am I doing wrong? BTW, I searched Google for 2 days and found no solution; if this doesn't get solved soon I will have to roll back to 2.6.

Thanks for any clue.

-- Gnarlie
http://Gnarlodious.com




 
Martin v. Löwis
11-30-2009
Gnarlodious wrote:
> Hello. The "upgrade" to Python 3.1 has been a disaster so far. I can't
> figure out how to print Chinese to a browser. If my script is:
>
> #!/usr/bin/python
> print("Content-type:text/html\n\n")
> print('晉')
>
> the Chinese string simply does not print. It works in the interactive
> Terminal with no problem, and also works in Python 2.6 (which my server
> is still running) in 4 different browsers. What am I doing wrong? BTW, I
> searched Google for 2 days and found no solution; if this doesn't get
> solved soon I will have to roll back to 2.6.
>
> Thanks for any clue.


In the CGI case, Python cannot figure out what encoding to use for
output, so it raises an exception. This exception should show up in
the error log of your web server; please check.

One way of working around this problem is to encode the output
explicitly:

#!/usr/bin/python
print("Content-type:text/plain;charset=utf-8\n\n")
sys.stdout.buffer.write('晉\n'.encode("utf-8"))

FWIW, the Content-type in your example is wrong in two ways:
what you produce is not HTML, and the charset parameter is
missing.

Regards,
Martin
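
Martin's workaround can be rounded out into a complete script. What follows is a minimal sketch, not the code from his post: the helper names `build_response` and `send` are made up for illustration. It encodes the headers and the body together and writes them through the binary layer in a single call, so nothing passes through the text layer's codec.

```python
import sys


def build_response(body):
    """Return the CGI headers plus the body as UTF-8 bytes (hypothetical helper)."""
    headers = "Content-Type: text/plain; charset=utf-8\r\n\r\n"
    return (headers + body).encode("utf-8")


def send(body, out=None):
    """Write the whole response through a binary stream in one call."""
    if out is None:
        out = sys.stdout.buffer  # bypasses the text layer and its codec
    out.write(build_response(body))
    out.flush()
```

In a CGI script the last line would be `send('晉\n')`. Because headers and body leave in one write, they cannot arrive at the server in the wrong order.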
 
Gnarlodious
11-30-2009
Thanks for the help, but it doesn't work. All I get is an error like:

UnicodeEncodeError: 'ascii' codec can't encode character '\\u0107' in
position 0: ordinal not in range(128)

It does work in Terminal interactively, after I import the sys module.
But my script doesn't act the same. Here is my entire script:

#!/usr/bin/python
print("Content-type:text/plain;charset=utf-8\n\n")
import sys
sys.stdout.buffer.write('晉\n'.encode("utf-8"))

All I get is the despised "Internal Server Error" with Console
reporting:

malformed header from script. Bad header=\xe6\x99\x89

Strangely, if I run the script in Terminal it acts as expected.

This is OS X 10.6.2, Python 3.1.1.
And it is frustrating because my entire website is hung up on this one
line I have been working on for 5 days.

-- Gnarlie
http://Gnarlodious.com
 
Aahz
11-30-2009
Gnarlodious wrote:
>
>Thanks for the help, but it doesn't work. All I get is an error like:
>
>UnicodeEncodeError: 'ascii' codec can't encode character '\\u0107' in
>position 0: ordinal not in range(128)


No time to give you more info, but you probably need to change the
encoding of sys.stdout.
--
Aahz <*> http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.
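
Changing the encoding of sys.stdout, as Aahz suggests, could look roughly like this. It is a sketch under assumptions: the helper name `utf8_text_stream` is invented here, and an in-memory `io.BytesIO` stands in for `sys.stdout.buffer` so the result can be inspected.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so that str written to it is encoded as UTF-8."""
    # newline="\n" stops the wrapper from translating line endings.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


# Demonstration against an in-memory buffer standing in for sys.stdout.buffer.
raw = io.BytesIO()
out = utf8_text_stream(raw)
print("Content-Type: text/plain; charset=utf-8", end="\r\n\r\n", file=out)
print("晉", file=out)
out.flush()
response = raw.getvalue()
```

In a real CGI script the equivalent would be `sys.stdout = utf8_text_stream(sys.stdout.buffer)` before the first print(), after which plain print() calls are encoded as UTF-8 whatever the server's locale says. On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` does the same job without replacing the object.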
 
Lie Ryan
11-30-2009
On 12/1/2009 4:05 AM, Gnarlodious wrote:
> Thanks for the help, but it doesn't work. All I get is an error like:
>
> UnicodeEncodeError: 'ascii' codec can't encode character '\\u0107' in
> position 0: ordinal not in range(128)


The error says it all; you're trying to encode the Chinese character
using the 'ascii' codec.

> malformed header from script. Bad header=\xe6\x99\x89


Hmmm... strange. The \xe6\x99\x89 happens to coincide with the UTF-8
representation of 晉. Why is your content becoming a header?

> #!/usr/bin/python

Do you know what Python version, exactly, gets called by this
hashbang? You mentioned that you're using Python 3, but I'm not sure
that this hashbang will invoke python3 (unless Mac OS X has moved
ahead of the Linux distros and made Python 3 the default python).

> Strangely, if I run the script in Terminal it acts as expected.


I think I see it now. You're invoking python3 in the terminal, but your
server invokes Python 2. Python 2 uses byte-based string literals, while
Python 3 uses unicode-based string literals. When you try
'晉\n'.encode("utf-8"), Python 2 first tries to decode the byte string
using the 'ascii' decoder, causing the exception.
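
The difference between the two string models can be reproduced in Python 3 alone. This is a small illustrative sketch: the explicit `decode("ascii")` call imitates the implicit step Python 2 performed when `.encode()` was called on a byte string, and the variable names are arbitrary.

```python
text = "晉"                   # a str of one code point in Python 3
utf8 = text.encode("utf-8")  # the three bytes a browser should receive

# Encoding with the ASCII codec fails, as in the traceback quoted above.
try:
    text.encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = str(exc)

# Python 2 treated '晉' as these three bytes and decoded them as ASCII
# before encoding; that implicit step is imitated explicitly here.
try:
    utf8.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = str(exc)
```

Both failures come from the ASCII codec; which one appears depends on whether the interpreter started with text or with bytes.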
 
Ned Deily
11-30-2009
Gnarlodious wrote:

> It does work in Terminal interactively, after I import the sys module.
> But my script doesn't act the same. Here is my entire script:
>
> #!/usr/bin/python
> print("Content-type:text/plain;charset=utf-8\n\n")
> import sys
> sys.stdout.buffer.write('晉\n'.encode("utf-8"))
>
> All I get is the despised "Internal Server Error" with Console
> reporting:
>
> malformed header from script. Bad header=\xe6\x99\x89
>
> Strangely, if I run the script in Terminal it acts as expected.
>
> This is OS X 10.6.2, Python 3.1.1.


Are you sure you are actually using Python 3? /usr/bin/python is the
path to the Apple-supplied python 2.6.1. If you installed Python 3.1.1
using the python.org OS X installer, the path should be
/usr/local/bin/python3
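
One way to settle which interpreter the web server really starts is to have the script report on itself. This is a sketch only; the function name `interpreter_report` is made up, and the output is kept to plain ASCII fields so that it should survive even a misconfigured stdout encoding.

```python
import sys


def interpreter_report():
    """Describe the interpreter that is actually running this script."""
    lines = [
        "executable: " + str(sys.executable),
        "version: " + sys.version.split()[0],
        "stdout encoding: " + str(getattr(sys.stdout, "encoding", None)),
    ]
    return "\n".join(lines) + "\n"


report = interpreter_report()
```

Served through CGI with `sys.stdout.write("Content-Type: text/plain\r\n\r\n" + report)`, the page shows the executable path, the version, and the encoding the server-side process chose for stdout, which may differ from what the same script reports in Terminal.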

--
Ned Deily

 
exarkun@twistedmatrix.com
11-30-2009
On 05:05 pm, Gnarlodious wrote:
>Thanks for the help, but it doesn't work. All I get is an error like:
>
>UnicodeEncodeError: 'ascii' codec can't encode character '\\u0107' in
>position 0: ordinal not in range(128)
>
>It does work in Terminal interactively, after I import the sys module.
>But my script doesn't act the same. Here is my entire script:
>
>#!/usr/bin/python
>print("Content-type:text/plain;charset=utf-8\n\n")
>import sys
>sys.stdout.buffer.write('晉\n'.encode("utf-8"))
>
>All I get is the despised "Internal Server Error" with Console
>reporting:
>
>malformed header from script. Bad header=\xe6\x99\x89


As the error suggests, you're writing 晉 to the headers section of the
response. This is because you're not ending the headers section with a
blank line. Lines in HTTP end with \r\n, not with just \n.

Have you considered using something with fewer sharp corners than CGI?
You might find it more productive.

Jean-Paul
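
Separately from the line-ending question, there is another possible way for body bytes to land in the header section. It is offered here as an assumption, not as a diagnosis of the server in this thread: print() goes through the text layer, which may hold its output in a buffer, while sys.stdout.buffer.write() goes straight to the binary layer. The sketch below imitates that with in-memory streams on CPython; `emit` is a made-up name.

```python
import io

HEADERS = "Content-Type: text/plain; charset=utf-8\r\n\r\n"
BODY = "晉\n".encode("utf-8")


def emit(flush_headers_first):
    """Imitate print() followed by sys.stdout.buffer.write() on one stream."""
    raw = io.BytesIO()  # stands in for sys.stdout.buffer
    text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")  # sys.stdout
    text.write(HEADERS)  # like print(): the bytes may wait inside the wrapper
    if flush_headers_first:
        text.flush()     # push the headers out before touching the binary layer
    raw.write(BODY)      # like sys.stdout.buffer.write()
    text.flush()
    return raw.getvalue()


unflushed = emit(False)
flushed = emit(True)
```

On CPython, `unflushed` begins with the body bytes \xe6\x99\x89 and `flushed` begins with the Content-Type line. If a CGI script mixes the two layers, calling sys.stdout.flush() after the last header is a cheap precaution.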
 
Gnarlodious
12-01-2009
> you probably need to change the encoding of sys.stdout

>>> sys.stdout.encoding
'UTF-8'

>> #!/usr/bin/python
> Do you know what Python version, exactly, gets called by this
> hashbang?

Verified in HTTP:
>>> print(sys.version)
3.1.1

Is it possible modules are getting loaded from my old Python?

I symlinked to the new Python, and no, I do not want to roll it back
because it is work (meaning I would have to type "sudo").

ls /usr/bin/python
lrwxr-xr-x 1 root wheel 63 Nov 20 21:24 /usr/bin/python -> /Library/Frameworks/Python.framework/Versions/3.1/bin/python3.1

Ugh, I have not been able to program in 11 days.

Now I remember doing it that way because I could not figure out how to
get Apache to find the new Python.

ls /usr/local/bin/python3.1
lrwxr-xr-x 1 root wheel 71 Nov 20 08:19 /usr/local/bin/python3.1 -> ../../../Library/Frameworks/Python.framework/Versions/3.1/bin/python3.1

So they are both pointing to the same Python.


And yes, I would prefer easier http scripting, but don't know one.

-- Gnarlie
 
Gnarlodious
12-01-2009
On Nov 30, 5:53 am, "Martin v. Löwis" wrote:

> #!/usr/bin/python
> print("Content-type:text/plain;charset=utf-8\n\n")
> sys.stdout.buffer.write('晉\n'.encode("utf-8"))


Does this work for anyone? Because all I get is a blank page. Nothing.
If I can establish what SHOULD work, maybe I can diagnose this
problem.

-- Gnarlie
 
Lie Ryan
12-01-2009
On 12/2/2009 12:27 AM, Gnarlodious wrote:
> On Nov 30, 5:53 am, "Martin v. Löwis" wrote:
>
>> #!/usr/bin/python
>> print("Content-type:text/plain;charset=utf-8\n\n")
>> sys.stdout.buffer.write('晉\n'.encode("utf-8"))

>
> Does this work for anyone? Because all I get is a blank page. Nothing.
> If I can establish what SHOULD work, maybe I can diagnose this
> problem.
>


With a minor fix (import sys), that runs without errors in Python 3.1
(Vista), but the result is a bit disturbing...

--------------------------

Content-type:text/plain;charset=utf-8
<BLANKLINE>
<BLANKLINE>
--------------------------

(is this a bug? or just undefined behavior?)



The following works correctly in Python 3.1:

---------------------------
#!/usr/bin/python
import sys
print = lambda s: sys.stdout.buffer.write(s.encode('utf-8'))
print("Content-type:text/plain;charset=utf-8\n\n")
print('晉\n')
----------------------------

(And that code will definitely fail with Python 2 because of the print
assignment, which is insurance in case your server happens to be
misconfigured to run Python 2.)
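
For the original goal of sending Chinese text to a browser as HTML, the same byte-level approach extends to a full page. The following is a sketch under assumptions: `html_response` is a made-up helper, the page markup is only an example, and `html.escape` needs Python 3.2 or later, so it is not available on the 3.1 install discussed in this thread.

```python
import html


def html_response(text):
    """Build a complete CGI response: headers plus a small UTF-8 HTML page."""
    page = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Test</title></head>\n'
        "<body><p>" + html.escape(text) + "</p></body></html>\n"
    )
    headers = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    return (headers + page).encode("utf-8")


response = html_response("晉")
```

The script would then finish with `sys.stdout.buffer.write(response)`. Declaring the charset in both the header and the page addresses Martin's point that the Content-type has to match what is actually sent.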
 