finding file size

 
 
John Roth
 
      01-03-2004

"Martin v. Loewis" wrote:
> Sean Ross wrote:
>
> > My question is this: Is there a reason why file objects could not have a
> > size method or property?

>
> Yes. In Python, file objects belong to the larger category of "file-like
> objects", and not all file-like objects have the inherent notion of a
> size. E.g. what would you think sys.stdin.size should return (which
> actually is a proper file object - not just file-like)?
>
> Other examples include the things returned from os.popen or socket.socket.


I think the issue here is that the abstract concept behind a "file-like
object" is that of something external that can be opened, read, written
to and closed. As you say, this does not include the notion of basic
implementation: a file on a file system is a different animal than a
network socket, which is different from a pipe, etc.
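
To make the distinction concrete, here is a minimal sketch that reports a size only when the object is backed by a regular file on disk. The helper name size_if_regular_file is made up for illustration; it is not part of the standard library.

```python
import os
import stat


def size_if_regular_file(fileobj):
    """Return the size in bytes if fileobj wraps a regular file, else None.

    Hypothetical helper, not part of the standard library.
    """
    try:
        fd = fileobj.fileno()
    except (AttributeError, OSError, ValueError):
        # No usable descriptor, e.g. an in-memory buffer or a closed file.
        return None
    info = os.fstat(fd)
    if stat.S_ISREG(info.st_mode):
        return info.st_size
    # Pipes, sockets and terminals have no meaningful size.
    return None
```

A pipe or an in-memory buffer yields None here, which is one way of saying "this file-like object has no inherent size".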

I think we need an object that encapsulates the notion of a file (or
directory) as a file system object. That object shouldn't support
"file-like" activities: it should have a method that returns a standard
file object to do that.
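
A rough sketch of such an object, under the assumption that it is identified by name only. The class name FileSystemEntry and its methods are hypothetical, chosen just for this illustration.

```python
import os


class FileSystemEntry:
    """Hypothetical object describing a file or directory by name."""

    def __init__(self, name):
        self.name = name

    def size(self):
        # The size comes from the file system, not from an open stream.
        return os.stat(self.name).st_size

    def is_directory(self):
        return os.path.isdir(self.name)

    def open(self, mode="r"):
        # Hand back an ordinary file object for the "file-like" activities.
        return open(self.name, mode)
```

The entry itself has no read or write methods; those only exist on the file object returned by open().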

I like Gerrit Holl's filename suggestion as well, but it's not the same
as this suggestion.

John Roth
>
> Regards,
> Martin
>



 
Gerrit Holl
 
      01-03-2004
Martin v. Loewis wrote:
> Gerrit Holl wrote:
> >Any comments?

>
> It should be possible to implement that type without modifying
> Python proper.


It should indeed. But it isn't what I had in mind, and it's not exactly
the same as a filename type in the language: for example, the name
attribute of a file will still be a string, just as the contents of
os.listdir, glob.glob, etc. (it seems glob follows listdir).

> It might make a good recipe for the cookbook.


If the type were created without changing Python proper, it would
probably just call os.path.foo for the filename.foo method. It would be
the other way around if the type became part of the language: os.path
would only be there for backward compatibility, like the string module.
But in order for os.listdir (and probably more functions) to return Path
objects rather than strings, a C implementation would be preferable
(necessary?). On the other hand, if this type were ever added, a Python
implementation would of course be a must.

> Any volunteers?


I may have a look at it.
When thinking about it, a lot more issues than raised in my first post
need to be resolved, like what to do when the initializer is empty...
curdir? root?

I guess there would be a base class with all the OS-independent stuff, or
stuff that can be coded independently, e.g.:

import os

class Path(str):

    def split(self):
        # Split at the last separator into [dirname, basename];
        # a path without a separator gets an empty dirname.
        parts = self.rsplit(self.sep, 1)
        return parts if len(parts) == 2 else ["", parts[0]]

    def splitext(self):
        return self.rsplit(self.extsep, 1)

    def basename(self):
        return self.split()[1]

    def dirname(self):
        return self.split()[0]

    def getsize(self):
        return os.stat(self).st_size

    def getmtime(self):
        return os.stat(self).st_mtime

    def getatime(self):
        return os.stat(self).st_atime

    def getctime(self):
        return os.stat(self).st_ctime

where the subclasses define sep, extsep, etc.

yours,
Gerrit.

--
168. If a man wish to put his son out of his house, and declare before
the judge: "I want to put my son out," then the judge shall examine into
his reasons. If the son be guilty of no great fault, for which he can be
rightfully put out, the father shall not put him out.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/

 
Gerrit Holl
 
      01-03-2004
Peter Otten wrote:
> http://mail.python.org/pipermail/pyt...ne/108425.html
>
> http://members.rogers.com/mcfletch/p...ng/filepath.py
>
> has an implementation of your proposal by Mike C. Fletcher. I think both
> filename class and os.path functions can peacefully coexist.


Thanks for the links.
(I think they don't, by the way)

yours,
Gerrit.

--
19. If he hold the slaves in his house, and they are caught there, he
shall be put to death.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/

 
 
Mike C. Fletcher
 
      01-03-2004
Gerrit Holl wrote:

>Peter Otten wrote:
>
>
>>http://mail.python.org/pipermail/pyt...ne/108425.html
>>
>>http://members.rogers.com/mcfletch/p...ng/filepath.py
>>
>>has an implementation of your proposal by Mike C. Fletcher. I think both
>>filename class and os.path functions can peacefully coexist.
>>
>>

>
>Thanks for the links.
>(I think they don't, by the way)
>
>

You hawks, always seeing war where we see peace.

Seriously, though, a path type would eventually have roughly the same
relation to os.path as the str type now has to the string module. Initial
implementations of a path type are going to use the os.path stuff, but to
avoid code duplication, the os.path module would eventually become a set
of trivial wrappers that dispatch on their first argument's method(s)
(after coercion to path type).
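
A minimal sketch of that end state, assuming a made-up path type that only understands "/"-separated paths. The module-level functions stand in for what os.path could become; none of this is the real os.path implementation.

```python
class path(str):
    """Hypothetical path type; only '/'-separated paths are handled."""

    sep = "/"

    def basename(self):
        # Everything after the last separator.
        return self.rsplit(self.sep, 1)[-1]

    def dirname(self):
        # Everything before the last separator, or "" if there is none.
        head = self.rsplit(self.sep, 1)[0] if self.sep in self else ""
        return path(head)


def basename(p):
    # Trivial wrapper: coerce to the path type, then dispatch to its method.
    return path(p).basename()


def dirname(p):
    return path(p).dirname()
```

Existing code that calls basename(somestring) keeps working, while the logic lives in one place on the type.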

Is that peaceful? I don't know. If there's a war, let's be honest,
os.path is going to take a good long while to defeat because it's there
and embedded directly into thousands upon thousands of scripts and
applications. We can fight a decent campaign, making a common module,
then getting it blessed into a standard module, encouraging newbies to
shun the dark old os.path way, encouraging maintainers to use the new
module throughout their code-base, et cetera, but os.path is going to
survive a good long while, and I'd imagine that being friendly toward it
would keep a few of our comrades off the floor.

Just as a note, however, we haven't had a *huge* outpouring of glee for
the current spike-tests/implementations. So it may be that we need to
get our own little army in shape before attacking the citadel.

Have fun,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/




 
 
Gerrit Holl
 
      01-04-2004
[Peter Otten]
> >>I think both filename class and os.path functions can peacefully coexist.


[Gerrit Holl (me)]
> >Thanks for the links.
> >(I think they don't, by the way)


[Mike C. Fletcher]
> Is that peaceful? I don't know. If there's a war, let's be honest,
> os.path is going to take a good long while to defeat because it's there
> and embedded directly into thousands upon thousands of scripts and
> applications. We can fight a decent campaign, making a common module,
> then getting it blessed into a standard module, encouraging newbies to
> shun the dark old os.path way, encouraging maintainers to use the new
> module throughout their code-base, et cetera, but os.path is going to
> survive a good long while, and I'd imagine that being friendly toward it
> would keep a few of our comrades off the floor.


Sure, I don't think os.path will die soon; it will surely take longer
than the string module to die. But I think there are a number of places
where Python could be more object-oriented than it is, and this is one
of them. The first step in making those modules more object-oriented is
providing an OO alternative; the second step is deprecating the old way;
and the third step is providing only the OO way. The third step will
surely not be made until Python 3.0.
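
As an illustration of the first two steps, here is a small sketch with hypothetical names: a Path type offering the OO alternative, and an old-style function that still works but emits a DeprecationWarning. It is not how os.path actually behaves.

```python
import os
import warnings


class Path(str):
    """Step one: a hypothetical OO alternative."""

    def getsize(self):
        return os.stat(self).st_size


def getsize(filename):
    """Step two: the old function still works, but warns."""
    warnings.warn(
        "getsize(name) is deprecated; use Path(name).getsize()",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller
    )
    return Path(filename).getsize()
```

Step three would simply be deleting the module-level function once callers have moved over.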

The string module has made the first two steps. In my view, the time
module has made the first step, although I'm not sure whether that's
true. I would like to see a datetime module that makes the time module
totally redundant, because I never liked the time module: it doesn't fit
into my brain properly, because it's not object-oriented. Now, I try to
use the datetime module whenever I can, but something like strptime
isn't there. PEP 321 solves this, so I'd like the time module to be
deprecated eventually, once something DateUtil-like is included as well,
but it probably won't be.
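
For readers on a current interpreter: the datetime class in today's standard library does provide a strptime class method, so parsing no longer needs the time module directly. A small illustration with a made-up timestamp string:

```python
from datetime import datetime

# Parse a made-up timestamp using an explicit format string.
stamp = datetime.strptime("2004-01-04 13:45", "%Y-%m-%d %H:%M")

print(stamp.year, stamp.month, stamp.day)
print(stamp.isoformat())
```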

Hmm, the Zen of Python is not very clear about this:

Now is better than never.
Although never is often better than *right* now.

...so there must be a difference between 'now' and 'right now'.

> Just as a note, however, we haven't had a *huge* outpouring of glee for
> the current spike-tests/implementations. So it may be that we need to
> get our own little army in shape before attacking the citadel.


Sure

yours,
Gerrit.

--
147. If she have not borne him children, then her mistress may sell her
for money.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/

 
 
Peter Otten
 
      01-04-2004
> [Peter Otten]
>> >>I think both filename class and os.path functions can peacefully
>> >>coexist.

>
> [Gerrit Holl (me)]
>> >Thanks for the links.
>> >(I think they don't, by the way)

>
> [Mike C. Fletcher]
>> Is that peaceful? I don't know. If there's a war, let's be honest,
>> os.path is going to take a good long while to defeat because it's there
>> and embedded directly into thousands upon thousands of scripts and


[Gerrit Holl]
> Sure, I don't think os.path will die soon; it will surely take longer
> than the string module to die. But I think there are a number of places
> where Python could be more object-oriented than it is, and this is one
> of them. The first step in making those modules more object-oriented is
> providing an OO alternative; the second step is deprecating the old way;
> and the third step is providing only the OO way. The third step will
> surely not be made until Python 3.0.


I don't think OO is a goal in itself. In addition to the os.path functions'
ubiquity there are practical differences between a path and the general str
class.

While a string is the default that you read from files and GUI widgets, a
filename will never be. So expect to replace e.g.

os.path.exists(somestring)

with

os.filename(somestring).exists()

which is slightly less compelling than somefile.exists().

Are unicode filenames something we should care about?
Should filename really be a subclass of str? I think somepath[-1] could
return the name as well.
Should files and directories really be of the same class?

These to me all seem real questions and at that point I'm not sure whether a
filename class that looks like a light wrapper around os.path (even if you
expect os.path to be implemented in terms of filename later) is the best
possible answer.

Peter


 
 
Gerrit Holl
 
      01-04-2004
Peter Otten wrote:
> While a string is the default that you read from files and GUI widgets, a
> filename will never be.


I'm not so sure about that. A GUI where a file is selected from a list
could very well return a Path object - it won't for a while, of course,
but that's a different issue. But I agree that it often isn't. Just as
an integer is not something you read from a file, etc.

> So expect to replace e. g.
>
> os.path.exists(somestring)
>
> with
>
> os.filename(somestring).exists()
>
> which is slightly less compelling than somefile.exists().


I would rather read:
path(somestring).exists()

which is better than os.filename(somestring).exists() and, IMO, better
than os.path.exists(somestring). I think path should be a builtin.
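
To compare the two spellings side by side, here is a sketch with a hypothetical path class written as a thin wrapper over os.path. It is not a builtin and the class is only an assumption about what such a type might offer.

```python
import os


class path(str):
    """Hypothetical path type, written as a thin wrapper over os.path."""

    def exists(self):
        return os.path.exists(self)

    def getsize(self):
        return os.path.getsize(self)


somestring = os.curdir
print(os.path.exists(somestring))   # function style
print(path(somestring).exists())    # method style
```

Both calls ask the same question; the difference is only in where the coercion from string to path happens.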

> Are unicode filenames something we should care about?


That's a difficult issue. I don't know how to solve that.

> Should filename really be a subclass of str? I think somepath[-1] could
> return the name as well.


It could. But I don't think it should. This would mean that the index of
a path returns the respective directories. Explicit is better than
implicit: somepath[-1] is not very explicit as being a basename.
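
For comparison, this is roughly what the indexing idea would look like if a path were a sequence of components instead of a str subclass. ComponentPath is a made-up name, and the sketch only handles "/"-separated paths.

```python
class ComponentPath:
    """Hypothetical path that is a sequence of components, not a str."""

    sep = "/"

    def __init__(self, text):
        self.text = text
        # Drop empty pieces produced by leading or doubled separators.
        self.parts = [piece for piece in text.split(self.sep) if piece]

    def __getitem__(self, index):
        return self.parts[index]

    def __len__(self):
        return len(self.parts)

    def basename(self):
        # The explicit spelling of the same lookup as self[-1].
        return self.parts[-1] if self.parts else ""
```

With this design somepath[-1] and somepath.basename() give the same answer, so the choice between them is mostly about readability.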

> Should files and directories really be of the same class?


Directories could be a subclass, with some more features. But...

> These to me all seem real questions and at that point I'm not sure whether a
> filename class that looks like a light wrapper around os.path (even if you
> expect os.path to be implemented in terms of filename later) is the best
> possible answer.


...questions exist to be answered. I don't claim to know all the answers,
but I think OO-ifying os.path is a good thing. How - that's another
issue, which is PEP-worthy.

From earlier discussions, I get the impression that most people are
sympathetic to OO-ifying os.path but don't agree on how to do it. If we
can agree on that, the only thing left to do is to upgrade the BDFL's
judgement from lukewarm to liking.

I've written a Pre-PEP at: http://tinyurl.com/2578q
It is very unfinished but it is a rough draft. Comments welcome.

yours,
Gerrit.

--
132. If the "finger is pointed" at a man's wife about another man, but
she is not caught sleeping with the other man, she shall jump into the
river for her husband.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/

 
 
Martin v. Loewis
 
      01-04-2004
Gerrit Holl wrote:

>>Are unicode filenames something we should care about?

>
>
> That's a difficult issue. I don't know how to solve that.


It depends on the platform. There are:

1. Platforms on which Unicode is the natural string type for file
   names, with byte strings obtained by conversion only. On these
   platforms, all filenames can be represented by a Unicode string,
   but some file names cannot be represented by a byte string.
   Windows NT+ is the class of such systems.
2. Platforms on which Unicode and byte string filenames work equally
   well; they can be converted back and forth without any loss of
   accuracy or expressiveness. OS X is one such example; probably
   Plan 9 as well.
3. Platforms on which byte strings are the natural string type for
   filenames. They often have only a weak notion of file name
   encoding, with the result that
   a) not all Unicode strings are available as filenames,
   b) not all byte string filenames are convertible to Unicode,
   c) the conversion may depend on user settings, so for the same
      file, Unicode conversion may give different results for
      different users.
   POSIX systems fall in this category.

So if filenames were a datatype, I think they should be
able to use both Unicode strings and byte strings as their
own internal representation, and declare one of the two
as "accurate". Conversion of filenames to both Unicode
strings and byte strings should be supported, but may
fail at runtime (unless conversion into the "accurate"
type is attempted).
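
A minimal sketch of such a datatype, written for Python 3 (where str is the Unicode type). The class name FileName, its method names, and the single fixed encoding are all assumptions made for this illustration.

```python
class FileName:
    """Hypothetical filename type keeping its original representation.

    The type of the value passed in (str or bytes) is treated as the
    "accurate" one; converting to the other type may fail at runtime.
    """

    def __init__(self, value, encoding="utf-8"):
        if not isinstance(value, (str, bytes)):
            raise TypeError("filename must be str or bytes")
        self._value = value
        # Assumption: one fixed encoding; real systems may vary per user.
        self._encoding = encoding

    def accurate_type(self):
        return type(self._value)

    def as_text(self):
        if isinstance(self._value, str):
            return self._value  # accurate representation, cannot fail
        # May raise UnicodeDecodeError for undecodable byte names.
        return self._value.decode(self._encoding)

    def as_bytes(self):
        if isinstance(self._value, bytes):
            return self._value  # accurate representation, cannot fail
        # May raise UnicodeEncodeError for unencodable characters.
        return self._value.encode(self._encoding)
```

Asking for the accurate type always succeeds; asking for the other one is where the runtime failure can occur.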

Regards,
Martin

 