
"env" parameter to "popen" won't accept Unicode on Windows - minorUnicode bug

 
 
John Nagle
 
      01-15-2008
I passed a dict for the "env" variable to Popen with Unicode strings
for the dictionary values.

Got:

File "D:\Python24\lib\subprocess.py", line 706, in _execute_child
TypeError: environment can only contain strings

It turns out that the strings in the "env" parameter have to be ASCII,
not Unicode, even though Windows fully supports Unicode in CreateProcess.

John Nagle
 
Bjoern Schliessmann
 
      01-15-2008
John Nagle wrote:

> It turns out that the strings in the "env" parameter have to be
> ASCII, not Unicode, even though Windows fully supports Unicode in
> CreateProcess.


Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
something like u"thestring".encode("utf16") will help.

Regards,


Björn

--
BOFH excuse #31:

cellular telephone interference

 
Benjamin
 
      01-15-2008
On Jan 14, 6:26 pm, Bjoern Schliessmann wrote:
> John Nagle wrote:
> > It turns out that the strings in the "env" parameter have to be
> > ASCII, not Unicode, even though Windows fully supports Unicode in
> > CreateProcess.
>
> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
> something like u"thestring".encode("utf16") will help.

Otherwise: bugs.python.org

 
 
John Nagle
 
      01-15-2008
Benjamin wrote:
> On Jan 14, 6:26 pm, Bjoern Schliessmann wrote:
>> John Nagle wrote:
>>> It turns out that the strings in the "env" parameter have to be
>>> ASCII, not Unicode, even though Windows fully supports Unicode in
>>> CreateProcess.

>> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
>> something like u"thestring".encode("utf16") will help.

> Otherwise: bugs.python.org


Whatever translation is necessary should be done in "popen", which
has cases for Windows and POSIX. "popen" is supposed to be cross-platform
to the extent possible. I think it's just something that didn't get fixed
when Unicode support went in.

John Nagle
 
 
Diez B. Roggisch
 
      01-15-2008
John Nagle wrote:

> Benjamin wrote:
>> On Jan 14, 6:26 pm, Bjoern Schliessmann wrote:
>>> John Nagle wrote:
>>>> It turns out that the strings in the "env" parameter have to be
>>>> ASCII, not Unicode, even though Windows fully supports Unicode in
>>>> CreateProcess.


That's of course nonsense: they don't need to be ASCII, they need to be
byte strings in whatever encoding you like.
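
A minimal sketch of the workaround implied here: encode every Unicode key and value to a byte string before handing the dict to Popen. The helper name encode_env and the UTF-8 default are assumptions made for illustration, not part of subprocess; the right encoding is whatever the child program expects.

```python
# Hypothetical helper (not part of subprocess): turn text keys and values
# into byte strings so the dict can be used as Popen's "env" argument.
TEXT_TYPE = type(u"")  # 'unicode' on Python 2, 'str' on Python 3


def encode_env(env, encoding="utf-8"):
    """Return a copy of env with every text key/value encoded to bytes."""
    encoded = {}
    for key, value in env.items():
        if isinstance(key, TEXT_TYPE):
            key = key.encode(encoding)
        if isinstance(value, TEXT_TYPE):
            value = value.encode(encoding)
        encoded[key] = value
    return encoded


# Example: the caller, not Python, decides that the child wants Latin-1.
byte_env = encode_env({u"MY_VARIABLE": u"f\u00fc\u00fc"}, "latin1")
```

On the Python 2.4 setup described in this thread, a dict like byte_env is the kind of value that Popen's "env" parameter accepts; the encoding passed to the helper remains the caller's decision.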

>>> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
>>> something like u"thestring".encode("utf16") will help.

>> Otherwise: bugs.python.org


John's understanding of the differences between unicode and its encodings
is a bit blurry, to say the least.

> Whatever translation is necessary should be done in "popen", which
> has cases for Windows and POSIX. "popen" is supposed to be cross-platform
> to the extent possible. I think it's just something that didn't get fixed
> when Unicode support went in.


Sure thing, python will just magically convert unicode to the encoding the
program YOU invoke will expect. Right after we introduced the

solve_my_problem()

built-in-function. Any other wishes?

If I write this simple program

------ test.py -------
import os
import sys

ENCODINGS = ['utf-8', 'latin1']

# Decode the variable with the encoding selected by the first argument.
os.environ["MY_VARIABLE"].decode(ENCODINGS[int(sys.argv[1])])
------ test.py -------


how's python supposed to know that

subprocess.call("python test.py 0", env=dict(MY_VARIABLE=u'foo'))

needs to be UTF-8?
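
To make the ambiguity concrete, here is a small sketch showing that the same text becomes different bytes under the two encodings from test.py, and that decoding with the wrong one either garbles the value or fails. The sample value is an assumption, chosen only because it contains a non-ASCII character.

```python
# Illustrative value (an assumption; any non-ASCII text shows the effect).
text = u"f\u00fc"  # "f" followed by u-umlaut

as_utf8 = text.encode("utf-8")     # the umlaut becomes two bytes
as_latin1 = text.encode("latin1")  # the umlaut becomes one byte

# The byte strings differ, so the parent must know what the child expects.
same_bytes = as_utf8 == as_latin1

# Decoding UTF-8 bytes as Latin-1 succeeds silently but yields mojibake.
garbled = as_utf8.decode("latin1")

# Decoding Latin-1 bytes as UTF-8 fails outright.
try:
    as_latin1.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```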

Diez
 
 
Brian Smith
 
      01-15-2008
Diez B. Roggisch wrote:
> Sure thing, python will just magically convert unicode to the
> encoding the program YOU invoke will expect. Right after we
> introduced the
>
> solve_my_problem()
>
> built-in-function. Any other wishes?


There's no reason to be rude.

Anyway, at least on Windows it makes perfect sense for people to expect
Unicode to be handled automatically. popen() knows that it is running on
Windows, and it knows what encoding Windows needs for its environment
(it's either UCS2 or UTF-16 for most Windows APIs). At least when it
receives a unicode string, it has enough information to apply the
conversion automatically, and doing so saves the caller from having to
figure out what exact encoding is to be used.
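
As a sketch of what "applying the conversion automatically" could look like: when CreateProcess is called with the CREATE_UNICODE_ENVIRONMENT flag, it takes the environment as a block of NUL-terminated NAME=value strings, ended by one extra NUL, in little-endian UTF-16. The function below builds such a block from a dict of text strings. The name build_unicode_env_block is hypothetical, and this is not claimed to be what subprocess does internally.

```python
CREATE_UNICODE_ENVIRONMENT = 0x00000400  # Win32 process creation flag


def build_unicode_env_block(env):
    """Encode a dict of text strings as a Windows Unicode environment block."""
    # Windows expects the entries sorted by name, ignoring case.
    names = sorted(env, key=lambda name: name.upper())
    entries = [u"%s=%s\0" % (name, env[name]) for name in names]
    # One extra NUL after the last entry terminates the whole block.
    block = u"".join(entries) + u"\0"
    # "utf-16-le" avoids the byte-order mark that plain "utf-16" prepends.
    return block.encode("utf-16-le")


block = build_unicode_env_block({u"B": u"2", u"a": u"\u00fc"})
```

Because the target encoding is fixed by the Windows API rather than by the child program, no guessing is involved in this step; how the child later interprets the values is a separate question.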

- Brian

 
 
Diez B. Roggisch
 
      01-15-2008
Brian Smith wrote:

> Diez B. Roggisch wrote:
>> Sure thing, python will just magically convert unicode to the
>> encoding the program YOU invoke will expect. Right after we
>> introduced the
>>
>> solve_my_problem()
>>
>> built-in-function. Any other wishes?

>
> There's no reason to be rude.


If you knew John, you'd know there is.

> Anyway, at least on Windows it makes perfect sense for people to expect
> Unicode to be handled automatically. popen() knows that it is running on
> Windows, and it knows what encoding Windows needs for its environment
> (it's either UCS2 or UTF-16 for most Windows APIs). At least when it
> receives a unicode string, it has enough information to apply the
> conversion automatically, and doing so saves the caller from having to
> figure out what exact encoding is to be used.



For one thing, the distinction between Windows and other platforms is
debatable. I admit that subprocess already contains quite a few
platform-specific aspects, but its purpose is to abstract these away as
much as possible.

However, I'm not sure that the mere availability of wide-char Windows APIs
means that using UCS2/UTF-16 would succeed. A look into the Python sources
(PC/_subprocess.c) reveals that someone already thought about this, and
just setting CREATE_UNICODE_ENVIRONMENT in the CreateProcess call would
have been easy enough to do if there weren't any trouble to expect.

Additionally, passing unicode to env would also imply that os.environ should
yield unicode as well. Not sure how much code _that_ breaks.

Diez
 
 
Bjoern Schliessmann
 
      01-15-2008
Brian Smith wrote:
> popen() knows that it is running on Windows, and it knows what
> encoding Windows needs for its environment (it's either UCS2 or
> UTF-16 for most Windows APIs). At least when it receives a unicode
> string, it has enough information to apply the conversion
> automatically, and doing so saves the caller from having to figure
> out what exact encoding is to be used.


So you propose that Python should employ a hidden mechanism that
automagically guesses the right encoding? Why not keep it explicit
and consistent and let the user decide? What will happen
if a future Windows changes its encoding? Will we need another
magic routine to tell them apart?

Regards,


Björn

--
BOFH excuse #353:

Second-system effect.

 
 
John Nagle
 
      01-15-2008
Diez B. Roggisch wrote:
> John Nagle wrote:
>
>> Benjamin wrote:
>>> On Jan 14, 6:26 pm, Bjoern Schliessmann wrote:
>>>> John Nagle wrote:
>>>>> It turns out that the strings in the "env" parameter have to be
>>>>> ASCII, not Unicode, even though Windows fully supports Unicode in
>>>>> CreateProcess.

>
> That's of course nonsense, they don't need to be ascii, they need to be
> byte-strings in whatever encoding you like.
>
>>>> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
>>>> something like u"thestring".encode("utf16") will help.
>>> Otherwise: bugs.python.org

>
> John's understanding of the differences between unicode and its encodings
> is a bit blurry, to say the least.


Who's this guy?
>
>> Whatever translation is necessary should be done in "popen", which
>> has cases for Windows and POSIX. "popen" is supposed to be cross-platform
>> to the extent possible. I think it's just something that didn't get fixed
>> when Unicode support went in.


I've been looking at the source code. There's "_PyPopenCreateProcess"
in "posixmodule.c". That one doesn't support passing an environment at
all; see the call to Windows CreateProcess. Is that the one that Popen uses?

Where is "win32process" in the source? It ought to be in Modules, but
it's not.

John Nagle
 