
Getting file size of binary file

 
 
Arnold
 
      01-08-2004
Is using fseek and ftell a reliable method of getting the file size on a
binary file? I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it? Thanks.


 
 
 
 
 
Richard Bos
 
      01-08-2004
"Arnold" <(E-Mail Removed)> wrote:

> Is using fseek and ftell a reliable method of getting the file size on a
> binary file?


No. From 7.19.9.2#3: "A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END".
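For reference, the idiom being asked about usually looks like the sketch below. It is only an illustration: the helper name size_by_seek is hypothetical, and ISO C does not promise that the returned number is a byte count. Besides the passage quoted above, the standard also lets an implementation append padding null characters to the end of a binary stream, so even a successful call may over-report. It does work on many hosted systems with Unix-like files.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the standard library.
 * Seeks to the end of a binary stream, asks ftell() where that is,
 * then restores the previous position.
 * Returns the offset reported at the end, or -1L if any call fails.
 * ISO C does not guarantee the result is the file size in bytes. */
long size_by_seek(FILE *fp)
{
    long saved, end;

    saved = ftell(fp);                   /* remember where we were */
    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* may not be supported */
        return -1L;
    end = ftell(fp);                     /* -1L on failure */
    if (fseek(fp, saved, SEEK_SET) != 0) /* put the position back */
        return -1L;
    return end;
}
```

A caller should treat -1L as "size not available" and fall back to reading the stream.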

To say that this irks me would be a bit of an understatement.

> I thought I remember reading somewhere it wasn't... If not what
> would be the "right" and portable method to obtain it?


There is none, in ISO C.

To say that _this_ irks me would be a bit of an understatement, as well.
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.

Richard
 
 
 
 
 
Richard Head
 
      01-08-2004
On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:

> Is using fseek and ftell a reliable method of getting the file size on a
> binary file? I thought I remember reading somewhere it wasn't... If not what
> would be the "right" and portable method to obtain it? Thanks.


try fstat()
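For readers on a POSIX system, a minimal sketch of that suggestion follows. fstat() and fileno() come from POSIX, not ISO C, so this will not compile everywhere. The helper name size_by_fstat is hypothetical. Data still sitting in the stdio buffer of an output stream is not counted until it has been flushed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: POSIX only, not ISO C.
 * Returns the size the OS records for the regular file behind fp,
 * or -1 if the descriptor cannot be obtained, fstat() fails,
 * or the stream is not a regular file (pipe, terminal, ...). */
long long size_by_fstat(FILE *fp)
{
    struct stat st;
    int fd = fileno(fp);

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))    /* st_size is only meaningful here */
        return -1;
    return (long long)st.st_size;
}
```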
 
 
Joona I Palaste
 
      01-08-2004
Richard Head <(E-Mail Removed)> scribbled the following:
> On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:
>> Is using fseek and ftell a reliable method of getting the file size on a
>> binary file? I thought I remember reading somewhere it wasn't... If not what
>> would be the "right" and portable method to obtain it? Thanks.


> try fstat()


Which part of the ISO C standard defines fstat()?

--
/-- Joona Palaste ((E-Mail Removed)) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"My absolute aspect is probably..."
- Mato Valtonen
 
 
CBFalconer
 
      01-08-2004
Richard Head wrote:
> On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:
>
> > Is using fseek and ftell a reliable method of getting the file
> > size on a binary file? I thought I remember reading somewhere it
> > wasn't... If not what would be the "right" and portable method
> > to obtain it? Thanks.
>
> try fstat()


No, don't. There is no fstat() in standard C. Please do not give
off-topic answers in this newsgroup, where there may be nobody to
make corrections.

--
Chuck F ((E-Mail Removed)) ((E-Mail Removed))
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

 
 
Kevin Goodsell
 
      01-08-2004
Richard Bos wrote:

> "Arnold" <(E-Mail Removed)> wrote:
>
>
>>Is using fseek and ftell a reliable method of getting the file size on a
>>binary file?

>
>
> No. From 7.19.9.2#3: "A binary stream need not meaningfully support
> fseek calls with a whence value of SEEK_END".


From the FAQ for this group:

http://www.eskimo.com/~scs/C-faq/q19.12.html

---
How can I find out the size of a file, prior to reading it in?

If the ``size of a file'' is the number of characters you'll be able to
read from it in C, it is difficult or impossible to determine this
number exactly).

Under Unix, the stat call will give you an exact answer. Several other
systems supply a Unix-like stat which will give an approximate answer.
You can fseek to the end and then use ftell, or maybe try fstat, but these tend to have the
same problems: fstat is not portable, and generally tells you the same
thing stat tells you; ftell is not guaranteed to return a byte count
except for binary files. Some systems provide routines called filesize
or filelength, but these are not portable, either.

Are you sure you have to determine the file's size in advance? Since the
most accurate way of determining the size of a file as a C program will
see it is to open the file and read it, perhaps you can rearrange the
code to learn the size as it reads.
---
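The FAQ's last suggestion, learning the size by reading, needs nothing beyond ISO C. A minimal sketch, assuming the stream was opened in binary mode and is positioned at its start (count_bytes is a hypothetical name, and the 4096-byte buffer is an arbitrary choice):

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes that can be read from the
 * current position of fp up to end-of-file.
 * Stores the count in *count and returns 0 on success;
 * returns -1 and leaves *count untouched on a read error.
 * The stream is left at end-of-file. */
int count_bytes(FILE *fp, unsigned long *count)
{
    unsigned char buf[4096];
    unsigned long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (unsigned long)n;
    if (ferror(fp))        /* fread stopped because of an error */
        return -1;
    *count = total;
    return 0;
}
```

This reports exactly what a C program will see, at the cost of reading the whole file once. In practice the counting can be folded into the loop that processes the data anyway.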

Does this look strange to anyone else? There's that lone closing paren
in the first paragraph, but the part that really bothers me is "ftell is
not guaranteed to return a byte count except for binary files." It seems
to be suggesting that the fseek/ftell method would be OK for a binary
file, but the line from the standard that Richard quoted suggests the opposite.

>
> To say that this irks me would be a bit of an understatement.
>
>
>>I thought I remember reading somewhere it wasn't... If not what
>>would be the "right" and portable method to obtain it?

>
>
> There is none, in ISO C.
>
> To say that _this_ irks me would be a bit of an understatement, as well.
> It should at least be possible to get the value of "what the OS thinks
> the file size is", but apparently there are reasons why it isn't; I've
> never heard one that is convincing, though.


I suppose that it's partly because C deals with streams, not files
directly (for the most part). Many things may not make sense for a
stream, size included. How could the size of stdin be meaningful, for
example? At the same time, there are at least a few standard functions
that only make sense for certain types of streams. Seems like it
wouldn't be such a bad idea to have a few more.

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.
 
 
glen herrmannsfeldt
 
      01-08-2004
Richard Bos wrote:

(snip)

> No. From 7.19.9.2#3: "A binary stream need not meaningfully support
> fseek calls with a whence value of SEEK_END".
>
> To say that this irks me would be a bit of an understatement.


(snip)

> To say that _this_ irks me would be a bit of an understatement, as well.
> It should at least be possible to get the value of "what the OS thinks
> the file size is", but apparently there are reasons why it isn't; I've
> never heard one that is convincing, though.


I was reading not so long ago what one of IBM's C compilers for
VM/CMS or MVS does for fseek/ftell. For files with variable length
records, text or binary, ftell returns the block number in the
upper 17 bits, and position in the block in the lower 15 bits.
(OS restrictions tend to keep blocks less than 32K.) I think
it wraps at 128K blocks.
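To make that layout concrete, here is a sketch of how such a value could be packed and unpacked. It is built only from the description above (17 bits of block number over 15 bits of offset in a 32-bit value); the names encode_pos, decode_pos and struct block_pos are hypothetical, and this is not taken from IBM's library.

```c
/* Hypothetical illustration of a packed file position:
 * upper 17 bits = block number, lower 15 bits = offset in block. */
#define OFFSET_BITS 15
#define BLOCK_BITS  17
#define OFFSET_MASK ((1UL << OFFSET_BITS) - 1UL)  /* 0x7FFF  */
#define BLOCK_MASK  ((1UL << BLOCK_BITS) - 1UL)   /* 0x1FFFF */

struct block_pos {
    unsigned long block;   /* block number, 0 .. 131071 */
    unsigned long offset;  /* byte offset in block, 0 .. 32767 */
};

/* Packs a block number and offset; bits that do not fit are
 * dropped, so block numbers wrap at 128K (131072). */
unsigned long encode_pos(unsigned long block, unsigned long offset)
{
    return ((block & BLOCK_MASK) << OFFSET_BITS) | (offset & OFFSET_MASK);
}

/* Splits a packed position back into its two fields. */
struct block_pos decode_pos(unsigned long raw)
{
    struct block_pos p;

    p.block = (raw >> OFFSET_BITS) & BLOCK_MASK;
    p.offset = raw & OFFSET_MASK;
    return p;
}
```

With variable-length blocks, the difference between two such values says nothing about how many bytes lie between them, which is why a value like this is usable for fseek but not as a file size.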

MVS keeps track of files in tracks, which can't reliably be
converted to bytes. CMS maps variable length blocks onto
a fixed block file system, but also doesn't accurately
keep track of bytes of file data.

On traditional IBM mainframe OS's, tracks are formatted when
written. The block size is determined by the program, and can
be either fixed or variable length. As an added complication,
files with fixed length blocks will usually have a short block
at the end. If opened for append, this short block stays in
place, so even for fixed length blocks a block count can't
reliably indicate file size.

-- glen

 
 
Richard Bos
 
      01-09-2004
Kevin Goodsell <(E-Mail Removed)> wrote:

> Richard Bos wrote:
>
> > It should at least be possible to get the value of "what the OS thinks
> > the file size is", but apparently there are reasons why it isn't; I've
> > never heard one that is convincing, though.

>
> I suppose that it's partly because C deals with streams, not files
> directly (for the most part). Many things may not make sense for a
> stream, size included. How could the size of stdin be meaningful, for
> example? At the same time, there are at least a few standard functions
> that only make sense for certain types of streams. Seems like it
> wouldn't be such a bad idea to have a few more.


Exactly; the function could always return -1 for "not available".

Richard
 
 
Richard Bos
 
      01-09-2004
glen herrmannsfeldt <(E-Mail Removed)> wrote:

> Richard Bos wrote:
>
> > To say that _this_ irks me would be a bit of an understatement, as well.
> > It should at least be possible to get the value of "what the OS thinks
> > the file size is", but apparently there are reasons why it isn't; I've
> > never heard one that is convincing, though.

>
> I was reading not so long ago what one of IBM's C compilers for
> VM/CMS or MVS does for fseek/ftell. For files with variable length
> records, text or binary, ftell returns the block number in the
> upper 17 bits, and position in the block in the lower 15 bits.
> (OS restrictions tend to keep blocks less than 32K.) I think
> it wraps at 128K blocks.
>
> MVS keeps track of files in tracks, which can't reliably be
> converted to bytes. CMS maps variable length blocks onto
> a fixed block file system, but also doesn't accurately
> keep track of bytes of file data.
>
> On traditional IBM mainframe OS's, tracks are formatted when
> written. The block size is determined by the program, and can
> be either fixed or variable length. As an added complication,
> files with fixed length blocks will usually have a short block
> at the end. If opened for append, this short block stays in
> place, so even for fixed length blocks a block count can't
> reliably indicate file size.


That doesn't convince me, either.

The OS has _some_ idea of how large the file is, if only to prevent the
user from writing past the end of it. It should be possible to pass this
knowledge on to the C implementation. If the result is approximate, that
is inherent in the OS, and the user will be expecting it.

Richard
 
 
glen herrmannsfeldt
 
      01-09-2004
Richard Bos wrote:

> glen herrmannsfeldt <(E-Mail Removed)> wrote:


(snip)

>>I was reading not so long ago what one of IBM's C compilers for
>>VM/CMS or MVS does for fseek/ftell. For files with variable length
>>records, text or binary, ftell returns the block number in the
>>upper 17 bits, and position in the block in the lower 15 bits.
>>(OS restrictions tend to keep blocks less than 32K.) I think
>>it wraps at 128K blocks.


>>MVS keeps track of files in tracks, which can't reliably be
>>converted to bytes.


(snip)

> That doesn't convince me, either.


> The OS has _some_ idea of how large the file is, if only to prevent the
> user from writing past the end of it. It should be possible to pass this
> knowledge on to the C implementation. If the result is approximate, that
> is inherent in the OS, and the user will be expecting it.


The OS keeps track of how many tracks are allocated, but not how many
bytes are written to each one. The number of bytes you can fit on a
track with a BLKSIZE of 1 is about 1% of the maximum. There also
could be empty tracks allocated but not yet used, after the data.

There is no standard (or non-standard) way to say approximately how
much space a data set takes.

Assuming that every file system is like unix is not a good idea.

-- glen

 