Velocity Reviews


andrea crotti 11-05-2012 10:54 AM

accepting file path or file object?
 
Quite often I find it convenient to accept either a filename or a file
object as a function argument, and do something like this:

import re

def grep_file(regexp, filepath_obj):
    """Check if the given text is found in any of the file lines, take
    a path to a file or an opened file object
    """
    # basestring exists only on Python 2; on Python 3 use str instead
    if isinstance(filepath_obj, basestring):
        fobj = open(filepath_obj)
    else:
        fobj = filepath_obj

    for line in fobj:
        if re.search(regexp, line):
            return True

    return False


This also makes unit testing more convenient, since I can just pass
a StringIO. But then there are other problems: for example, if I pass
a file object, it is the caller that has to make sure the file handle
gets closed.

So I'm wondering whether it would be better to drop support for file
objects and accept only filenames, which seems a more robust and
consistent choice.

Any comments or suggestions about this?
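
The ownership concern above (who closes the file?) can be sketched as
follows. This is only an illustration, assuming Python 3 (so str takes
the place of basestring); the helper name opened() is hypothetical and
not from the original post. The idea is that the function closes only
what it opened itself and leaves a caller-supplied file object alone.

```python
import contextlib
import io
import re


@contextlib.contextmanager
def opened(path_or_file):
    """Yield a file object; close it only if this helper opened it."""
    if isinstance(path_or_file, str):
        fobj = open(path_or_file)
        try:
            yield fobj
        finally:
            fobj.close()
    else:
        # Caller-supplied file object: the caller keeps ownership.
        yield path_or_file


def grep_file(regexp, filepath_obj):
    """Return True if regexp matches any line of the path or file object."""
    with opened(filepath_obj) as fobj:
        return any(re.search(regexp, line) for line in fobj)


# Unit-test style usage with an in-memory file.
buf = io.StringIO("bar\nfoo baz\n")
print(grep_file("foo", buf), buf.closed)
```

With this arrangement a StringIO passed in a test is still open
afterwards, while a path is opened and closed inside the function.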

Ulrich Eckhardt 11-05-2012 12:35 PM

Re: accepting file path or file object?
 
On 05.11.2012 11:54, andrea crotti wrote:
> Quite often I find it convenient to accept either a filename or a file
> object as a function argument, and do something like this:
>
> def grep_file(regexp, filepath_obj):
>     """Check if the given text is found in any of the file lines, take
>     a path to a file or an opened file object
>     """
>     if isinstance(filepath_obj, basestring):
>         fobj = open(filepath_obj)
>     else:
>         fobj = filepath_obj
>
>     for line in fobj:
>         if re.search(regexp, line):
>             return True
>
>     return False
>
> This also makes unit testing more convenient, since I can just pass
> a StringIO.


I do the same for the same reason, but I either pass in a file object or
the actual data contained in the file, but not a path.


> But then there are other problems: for example, if I pass a file
> object, it is the caller that has to make sure the file handle
> gets closed.


I don't consider that a problem. If you open a file, you should do that
in a with statement:

with open(..) as f:
    found = grep_file(regex, f)

That is also the biggest criticism I have of your code, because you
don't close the file after use. Another thing is the readability of
your code:

grep_file("foo", "bar")

The biggest problem there is that I don't know which of the two
arguments is which. I personally would expect the file to come first,
although POSIX grep has it the other way round on the command line.
Consider as an alternative:

grep("foo", path="bar")
with open(..) as f:
    grep("foo", file=f)
with open(..) as f:
    grep("foo", data=f.read())

Using **kwargs, you could switch inside the function depending on the
mode that was used, extract lines accordingly and match these against
the regex.
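
A rough sketch of that dispatch, under some assumptions: it targets
Python 3 and uses keyword-only parameters rather than a literal
**kwargs dict, which gives the same call syntax as above. The parameter
names path, file and data are taken from the calls shown; the
exactly-one-source check is an added assumption.

```python
import io
import re


def grep(regexp, *, path=None, file=None, data=None):
    """Return True if regexp matches any line of the single given source."""
    sources = (("path", path), ("file", file), ("data", data))
    given = [name for name, value in sources if value is not None]
    if len(given) != 1:
        raise TypeError("pass exactly one of path=, file= or data=")

    if path is not None:
        # We opened it, so we close it.
        with open(path) as fobj:
            return any(re.search(regexp, line) for line in fobj)
    if file is not None:
        # Caller-owned file object: iterate, but do not close.
        return any(re.search(regexp, line) for line in file)
    # Raw text: split into lines ourselves.
    return any(re.search(regexp, line) for line in data.splitlines())


print(grep("foo", data="bar\nfoo\n"))
print(grep("foo", file=io.StringIO("bar\nbaz\n")))
```

Because the source must be named at the call site, grep("foo", "bar")
is rejected with a TypeError instead of being silently ambiguous.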


Greetings!

Uli


Grant Edwards 11-05-2012 03:05 PM

Re: accepting file path or file object?
 
On 2012-11-05, andrea crotti <andrea.crotti.0@gmail.com> wrote:

> Quite often I find it convenient to accept either a filename or a file
> object as a function argument, and do something like this:
>
> def grep_file(regexp, filepath_obj):

[...]
>     if isinstance(filepath_obj, basestring):
>         fobj = open(filepath_obj)
>     else:
>         fobj = filepath_obj

[...]
> This also makes unit testing more convenient, since I can just pass
> a StringIO. But then there are other problems: for example, if I pass
> a file object, it is the caller that has to make sure the file handle
> gets closed.
>
> So I'm wondering whether it would be better to drop support for file
> objects and accept only filenames, which seems a more robust and
> consistent choice.
>
> Any comments or suggestions about this?


I have found that accepting either a "file-like-object" or a filename
is sometimes worth the effort for a module that's going to be re-used
in a variety of contexts. However, when I do it, I don't usually
check the type of the object -- I check for whatever "feature" I want
to use. If I'm going to want to be able to call a read() method, I
check for presence of a read() method. If that fails, then I assume
it's a filename and pass it to open(). If that fails, then it fails.
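
The feature check described above might look roughly like this. It is
a sketch for Python 3, not anyone's actual module; the parameter name
source and the opened_here flag are made up for illustration.

```python
import io
import re


def grep_file(regexp, source):
    """Return True if regexp matches any line of a file-like object or path."""
    if hasattr(source, "read"):
        # Looks like a file: use it as is and leave closing to the caller.
        fobj, opened_here = source, False
    else:
        # Otherwise assume a filename; if open() fails, then it fails.
        fobj, opened_here = open(source), True
    try:
        return any(re.search(regexp, line) for line in fobj)
    finally:
        if opened_here:
            fobj.close()


print(grep_file("foo", io.StringIO("bar\nfoo\n")))
```

Anything with a read() method is accepted, not only real file objects,
and no string type check is needed at all.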

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! Oh my GOD -- the SUN just fell into YANKEE STADIUM!!

