Velocity Reviews


ruck 09-10-2012 05:25 PM

how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
In Python 2.7.2 on Windows 7,

os.walk() uses isdir(),
which comes from os.path,
which really comes from ntpath.py,
which really comes from genericpath.py

I want os.walk() to use a modified isdir() on my Windows 7.
Not knowing any better, it seems to me like ntpath.py would be a good place to intercept.

When os.py does "import ntpath as path",
how can I get python to process my customized ntpath.py
instead of Lib/ntpath.py ?

Thanks for any comments.
John

BTW, here's my mod to ntpath.py:
$ diff ntpath.py.standard ntpath.py
14c14,19
< from genericpath import *
---
> from genericpath import *
>
> def isdir(s):
>     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
> def isfile(s):
>     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))


Why? Because the genericpath implementation relies on os.stat(), which
uses a Windows API function that presumes or enforces some naming
conventions, such as "doesn't end with a space or a period".
But NTFS actually supports such file and directory names, and some
software (like Cygwin) lets users create them without restriction.
So Cygwin users like me may have a file 'voo...\\doo' which os.walk()
cannot ordinarily walk. That is, isdir('voo...') returns False
because the underlying os.stat() is assessing 'voo' instead of 'voo...'.
The workaround is to pass os.stat() a full pathname that is prefixed
with r'\\?\' so the Windows API recognizes that you do NOT want the
name filtered.

Better said by Microsoft:
"For file I/O, the "\\?\" prefix to a path string tells
the Windows APIs to disable all string parsing and to
send the string that follows it straight to the file
system. For example, if the file system supports large
paths and file names, you can exceed the MAX_PATH limits
that are otherwise enforced by the Windows APIs."
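
A minimal sketch of the path construction described above, pulled out into a helper. The names EXTENDED_PREFIX and extended_path are hypothetical; the expression is the one from the diff, including the backslash appended before abspath(). It uses ntpath explicitly, which is what os.path resolves to on Windows. How abspath() treats a name such as 'voo...' is an assumption carried over from the post (Python 2.7.2) and has not been checked on other versions:

```python
import ntpath

# The four characters \\?\ written as an ordinary (non-raw) string literal.
EXTENDED_PREFIX = '\\\\?\\'


def extended_path(path):
    # Hypothetical helper: return `path` in the \\?\ form.
    # Already-prefixed paths are returned unchanged, so the helper can be
    # applied more than once without stacking prefixes.
    if path.startswith(EXTENDED_PREFIX):
        return path
    # Same expression as in the diff: append a backslash, make the path
    # absolute, then add the prefix.
    return EXTENDED_PREFIX + ntpath.abspath(path + '\\')
```

A modified isdir() would then pass extended_path(s) to genericpath.isdir() rather than building the string inline.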

Steven D'Aprano 09-10-2012 08:16 PM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:

> In Python 2.7.2 on Windows 7,
>
> os.walk() uses isdir(),
> which comes from os.path,
> which really comes from ntpath.py,
> which really comes from genericpath.py
>
> I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
> any better, it seems to me like ntpath.py would be a good place to
> intercept.
>
> When os.py does "import ntpath as path", how can I get python to process
> my customized ntpath.py instead of Lib/ntpath.py ?


import os
os.path.isdir = my_isdir

ought to do it.

This general technique is called "monkey-patching". The Ruby community is
addicted to it. Everybody else -- and a goodly number of the more
sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out
of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix
bugs.

They are right to be suspicious of it. As a general rule, monkey-patching
is not for production code. You have been warned.

http://www.codinghorror.com/blog/200...or-humans.html
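
One concrete way such a patch can surprise you: it only affects code that looks the function up through the module at call time. A minimal sketch, in which my_isdir is a hypothetical stand-in that always returns False, and early_isdir stands for any name that some other module bound before the patch was applied:

```python
import os
from os.path import isdir as early_isdir  # bound before the patch is applied


def my_isdir(path):
    # Hypothetical stand-in replacement: claims nothing is a directory.
    return False


original_isdir = os.path.isdir
os.path.isdir = my_isdir  # the monkey-patch
try:
    # Looked up through the module at call time: sees the patch.
    via_module = os.path.isdir(os.curdir)
    # Bound earlier with "from ... import": still the original function.
    via_early_name = early_isdir(os.curdir)
finally:
    os.path.isdir = original_isdir  # always undo the patch
```

Here via_module comes from the replacement while via_early_name still comes from the original, which is the kind of inconsistency that makes patched programs hard to debug.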


[...]
> Why? Because the genericpath implementation relies on os.stat() which
> uses Windows API function that presumes or enforces some naming
> conventions like "doesn't end with a space or a period". But the NTFS
> actually supports such filenames and dirnames, and some sw (like cygwin)
> lets users make files & dirs without restricting. So, cygwin users like
> me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk.
> That is, the isdir('voo...') returns false because the underlying
> os.stat is assessing 'voo' instead of 'voo...' .


Please consider submitting a patch that adds support for cygwin paths to
the standard library. You'll need to target 3.4 though, 2.7 is now a
maintenance release with no new features allowed.


> The workaround is to
> pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
> API recognizes that you do NOT want the name filtered.
>
> Better said by Microsoft:
> "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
> to disable all string parsing and to send the string that follows it
> straight to the file system.


That's not so much a workaround as the officially supported API for
dealing with the situation you are in. Why don't you just prepend a '?'
to paths like they tell you to?


--
Steven

ruck 09-10-2012 10:22 PM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 

Steven says:
That's not so much a workaround as the officially supported API for
dealing with the situation you are in. Why don't you just prepend a '?'
to paths like they tell you to?

Good idea, but the first thing os.walk() does is a listdir(), and os.listdir() does not like the r'\\?\' prefix. In other words,
os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo')
does not work.

Also, your recipe worked for me --
I'm walking 'goo' which contains 'voo.../doo'

import os

import genericpath
def my_isdir(s):
    return genericpath.isdir('\\\\?\\' + os.path.abspath(s + '\\'))

print 'os.walk(\'goo\') with standard isdir()'
for root, dirs, files in os.walk('goo'):
    print root, dirs, files

print 'os.walk(\'goo\') with modified isdir()'
os.path.isdir = my_isdir
for root, dirs, files in os.walk('goo'):
    print root, dirs, files

yields

os.walk('goo') with standard isdir()
goo [] ['voo...']
os.walk('goo') with modified isdir()
goo ['voo...'] []
goo\voo... [] ['doo']
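
If patching os.path is undesirable, a similar walk can be written with the directory test passed in explicitly. This is a rough sketch, not a replacement for os.walk(): walk_with is a hypothetical name, there is no error handling, no symlink handling and no topdown option, and the isdir argument is whatever predicate you want (for example the my_isdir above):

```python
import os


def walk_with(top, isdir=os.path.isdir):
    # Hypothetical, simplified walk: yields (root, dirs, files) top-down,
    # using the supplied `isdir` predicate instead of a patched os.path.
    dirs, files = [], []
    for name in sorted(os.listdir(top)):
        if isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    yield top, dirs, files
    # Recurse into each entry the predicate accepted as a directory.
    for name in dirs:
        for entry in walk_with(os.path.join(top, name), isdir):
            yield entry
```

Because the predicate is an argument, nothing global is changed and other code calling os.path.isdir() is unaffected.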

About monkeypatching, generally -- thanks for the pointer to that discussion. That sounded like a lot of wisdom and lessons learned being shared.
About me suggesting a patch -- I'll sleep on that :)

Thanks Steven!
John

Steven D'Aprano 09-11-2012 03:46 AM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:

> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:

[...]
> > That's not so much a workaround as the officially supported API for
> > dealing with the situation you are in. Why don't you just prepend a
> > '?' to paths like they tell you to?

>
> Good idea, but the first thing os.walk() does is a listdir(), and
> os.listdir() does not like the r'\\?\' prefix. In other words,
> os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.


Now that sounds like a bug to me. If Microsoft officially support
leading ? in file names, then so should Python on Windows.


> Also, your recipe worked for me --
> I'm walking 'goo' which contains 'voo.../doo'


Good for you. (Sorry, that comes across as more condescending than it is
intended as.) Monkey-patching often gets used for quick scripts and tiny
pieces of code because it works.

Just beware that if you extend that technique to larger bodies of code,
say when using a large framework, or multiple libraries, your experience
may not be quite so good. Especially if *they* are monkey-patching too,
as some very large frameworks sometimes do. (Or so I am led to believe.)

The point is not that monkey-patching is dangerous and should never be
used, but that it is risky and should be used with caution.



--
Steven

Tim Golden 09-11-2012 07:20 AM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On 11/09/2012 04:46, Steven D'Aprano wrote:
> On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:
>
>> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:

> [...]
>>> That's not so much a workaround as the officially supported API for
>>> dealing with the situation you are in. Why don't you just prepend a
>>> '?' to paths like they tell you to?

>>
>> Good idea, but the first thing os.walk() does is a listdir(), and
>> os.listdir() does not like the r'\\?\' prefix. In other words,
>> os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.

>
> Now that sounds like a bug to me. If Microsoft officially support
> leading ? in file names, then so should Python on Windows.


And so it does, but you'll notice from the MSDN docs that the \\?
syntax must be supplied as a Unicode string, which os.listdir
will do if you pass it a Python unicode object and not otherwise:

import os
os.listdir(u"\\\\?\\c:\\users")

# and consequently

for p, ds, fs in os.walk(u"\\\\?\\c:\\users"):
    print p
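
A small helper along these lines can make sure a path is text before the prefix goes on. This is a sketch only: as_unicode_extended is a hypothetical name, the caller is assumed to pass an absolute path, and byte strings are assumed to be in the filesystem encoding:

```python
import sys

EXTENDED_PREFIX = u'\\\\?\\'  # the four characters \\?\


def as_unicode_extended(path):
    # Byte strings (plain str on Python 2) are decoded to unicode first.
    if isinstance(path, bytes):
        path = path.decode(sys.getfilesystemencoding())
    # Add the prefix only if it is not already there.
    if not path.startswith(EXTENDED_PREFIX):
        path = EXTENDED_PREFIX + path
    return path
```

The result can then be handed to os.listdir() or os.walk() as in the example above.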


TJG

ruck 09-11-2012 07:13 PM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On Tuesday, September 11, 2012 12:21:24 AM UTC-7, Tim Golden wrote:
> And so it does, but you'll notice from the MSDN docs that the \\?
> syntax must be supplied as a Unicode string, which os.listdir
> will do if you pass it a Python unicode object and not otherwise:


I was saying os.listdir doesn't like the r'\\?\' prefix.
But Tim corrects me -- so yes, Steven's earlier suggestion "Why don't you just prepend a '?' to paths like they tell you to?" does work, when I supply it in unicode.
Good:
>>> os.listdir(u'\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
[u'voo...']
Bad:
>>> os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo/*.*'

Thanks to both of you for taking the time to teach.

BTW, when I posted the original, I was trying to supply my own customized ntpath module, and I was really puzzled as to why it wasn't getting picked up! According to sys.path I expected my custom ntpath.py to be chosen, instead of the standard Lib/ntpath.py.

Now I guess I understand why. I moved Lib/ntpath.* out of the way, and learned that during initialization, Python is importing "site" module, which is importing "os" which is importing "ntpath" -- before my dir is added to sys.path. So later when I import os, it and ntpath have already been imported, so Python doesn't attempt a fresh import.

To get my custom ntpath.py honored, I need to RELOAD, like:
import os
import ntpath
reload(ntpath)
print 'os.walk(\'goo\') with isdir override in custom ntpath'
for root, dirs, files in os.walk('goo'):
    print root, dirs, files

where the diff between the standard ntpath.py and my ntpath.py is:
14c14,19
< from genericpath import *
---
> from genericpath import *
>
> def isdir(s):
>     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
> def isfile(s):
>     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))
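
The reload step matters because reloading re-executes the module's code inside the existing module object, so every existing reference to it (such as os.path) sees the new definitions. A self-contained sketch of that behaviour, using a throwaway module named demo_pathmod (a made-up name) written to a temporary directory, and importlib.reload(), which is the Python 3 spelling of the reload() builtin used above:

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    module_file = os.path.join(tmp, 'demo_pathmod.py')
    with open(module_file, 'w') as f:
        f.write("def isdir(s):\n    return 'standard'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_pathmod
        alias = demo_pathmod           # plays the role of os.path
        before = alias.isdir('x')
        # Replace the source on disk, then reload the module.
        with open(module_file, 'w') as f:
            f.write("def isdir(s):\n    return 'custom'\n")
        same_object = importlib.reload(demo_pathmod) is alias
        after = alias.isdir('x')       # the alias sees the new function
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(tmp)
        sys.modules.pop('demo_pathmod', None)
```

The alias was never rebound, yet it picks up the new isdir() after the reload, which mirrors how os.walk() picked up the customized ntpath.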


I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.

Thanks again for the help.
John

Chris Angelico 09-11-2012 10:50 PM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On Wed, Sep 12, 2012 at 5:13 AM, ruck <john.ruckstuhl@gmail.com> wrote:
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.


One way to find out is to peek at the cache.

>>> import sys
>>> sys.modules


There are quite a few of them in the 3.2 interactive interpreter that I just tried this in.
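
To check one particular module rather than scanning the whole dictionary, something like the following sketch works. where_loaded is a hypothetical helper name; modules compiled into the interpreter have no __file__, hence the placeholder:

```python
import sys


def where_loaded(name):
    # Return None if `name` has not been imported yet, otherwise the file
    # it was loaded from (or a placeholder for modules without __file__).
    module = sys.modules.get(name)
    if module is None:
        return None
    return getattr(module, '__file__', None) or '<no file>'
```

Calling where_loaded('ntpath') at the very top of a script shows whether interpreter startup already imported it, and from which path.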

ChrisA

Dave Angel 09-11-2012 10:57 PM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
On 09/11/2012 03:13 PM, ruck wrote:
> <snip>
>
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
>


import sys
print sys.modules



--

DaveA


Thomas Rachel 09-12-2012 06:17 AM

Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
 
Am 11.09.2012 05:46 schrieb Steven D'Aprano:

> Good for you. (Sorry, that comes across as more condescending than it is
> intended as.) Monkey-patching often gets used for quick scripts and tiny
> pieces of code because it works.
>
> Just beware that if you extend that technique to larger bodies of code,
> say when using a large framework, or multiple libraries, your experience
> may not be quite so good. Especially if *they* are monkey-patching too,
> as some very large frameworks sometimes do. (Or so I am led to believe.)


This sounds like a good use case for a context manager, like the one
returned by decimal.localcontext().

First shot:

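import contextlib
import os

@contextlib.contextmanager
def changed_os_path(**k):
    old = {}
    try:
        for name, value in k.items():
            old[name] = getattr(os.path, name)
            setattr(os.path, name, value)
        yield None
    finally:
        # restore only the attributes that were actually replaced
        for name, value in old.items():
            setattr(os.path, name, value)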

and so for your code you can use

print 'os.walk(\'goo\') with modified isdir()'
with changed_os_path(isdir=my_isdir):
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

so the change is only effective as long as you are in the relevant code
part and is reverted as soon as you leave it.
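
On Python 3.3 and later (so not the 2.7 used in this thread), the standard library's unittest.mock.patch.object() provides the same set-and-restore behaviour ready-made. This is a different tool from the hand-written manager above, shown only for comparison; my_isdir is again a hypothetical stand-in:

```python
import os
from unittest import mock


def my_isdir(path):
    # Hypothetical stand-in for the modified isdir() from earlier posts.
    return True


original_isdir = os.path.isdir
try:
    with mock.patch.object(os.path, 'isdir', my_isdir):
        patched_inside = os.path.isdir is my_isdir
        raise RuntimeError('simulated failure inside the block')
except RuntimeError:
    pass
# The original function is back even though the block raised.
restored_after = os.path.isdir is original_isdir
```

Either way, the point is the same: the replacement is visible only inside the with block.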


Thomas


All times are GMT.