EOF for binary files

 
 
Registered User
 
      11-11-2006
I've read in a book:

<quote>
With a binary-mode stream, you can't detect the end-of-file by looking
for EOF, because a byte of data from a binary stream could have that
value, which would result in premature end of input. Instead, you can
use the library function feof(), which can be used for both binary- and
text-mode files:

int feof(FILE *fp);
</quote>

Isn't it true that testing for EOF is valid for both text- and
binary-mode files?

Also, the FAQ recommends not to use feof():
<quote>In virtually all cases, there's no need to use feof at all.
</quote>

 
Richard Heathfield
 
      11-11-2006
Registered User said:

> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input.


Ditch the book. It doesn't understand EOF.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
 
Registered User
 
      11-11-2006
Richard Heathfield wrote:
> Registered User said:
>
> > I've read in a book:
> >
> > <quote>
> > With a binary-mode stream, you can't detect the end-of-file by looking
> > for EOF, because a byte of data from a binary stream could have that
> > value, which would result in premature end of input.

>
> Ditch the book. It doesn't understand EOF.
>

Oh, thanks Richard!! That part of the book really got me confused.

 
 
CBFalconer
 
      11-11-2006
Registered User wrote:
>
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by
> looking for EOF, because a byte of data from a binary stream could
> have that value, which would result in premature end of input.
> Instead, you can use the library function feof(), which can be
> used for both binary- and text-mode files:
>
> int feof(FILE *fp);
> </quote>
>
> Isn't it true that testing for EOF is valid for both text- and
> binary-mode files?


Yes. The only possible exception occurs when (sizeof(int) == 1).
A stream is a stream of bytes, and the routines to read them return
ints formed from the unsigned char value involved. Since EOF is
negative, its value is always distinct from any byte read.

>
> Also, the FAQ recommends not to use feof():
> <quote>In virtually all cases, there's no need to use feof at all.
> </quote>


feof is primarily useful to distinguish between i/o errors and
actual eof, either of which conditions will usually return EOF.

if (EOF == (ch = getc(f))) {
    if (feof(f)) {
        /* actual end of file encountered */
    }
    else {
        /* use ferror etc. to determine the cause */
    }
}
else {
    /* use the value of ch, which is a valid unsigned char */
}

Note that ch must have been declared as an int.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

 
 
Richard Heathfield
 
      11-11-2006
Registered User said:

> Richard Heathfield wrote:
>> Registered User said:
>>
>> > I've read in a book:
>> >
>> > <quote>
>> > With a binary-mode stream, you can't detect the end-of-file by looking
>> > for EOF, because a byte of data from a binary stream could have that
>> > value, which would result in premature end of input.

>>
>> Ditch the book. It doesn't understand EOF.
>>

> Oh, thanks Richard!! That part of the book really got me confused.


The mistake the author makes is that he appears to believe EOF is a
character. It isn't. It's a message from your I/O library which, freely
translated, means "you asked me for more data, squire, but there ain't
none. The pot's empty. Sorry, I'd love to help and all that...".

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
 
 
Richard Tobin
 
      11-11-2006
In article <(E-Mail Removed) .com>,
Registered User <(E-Mail Removed)> wrote:

>With a binary-mode stream, you can't detect the end-of-file by looking
>for EOF, because a byte of data from a binary stream could have that
>value, which would result in premature end of input.


It would certainly be a mistake to compare a byte against EOF if the
byte is a char, because EOF is an int value and a char converted to
an int might have the same value as EOF. But getc() doesn't return
a char; it returns an unsigned char converted to an int, so there
is no possibility of a real byte appearing to be equal to EOF, because
EOF is guaranteed to be negative.

So you can perfectly well compare against EOF provided you don't
convert the value to a char first.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
 
 
Eric Sosman
 
      11-11-2006
Registered User wrote:
> I've read in a book:
>
> <quote>
> With a binary-mode stream, you can't detect the end-of-file by looking
> for EOF, because a byte of data from a binary stream could have that
> value, which would result in premature end of input. Instead, you can
> use the library function feof(), which can be used for both binary- and
> text-mode files:
>
> int feof(FILE *fp);
> </quote>
>
> Isn't it true that testing for EOF is valid for both text- and
> binary-mode files?


The book is right in the sense that it is possible for a
byte read from a stream (text or binary) to have the value
EOF, but only on "exotic" machines where bytes and ints have
the same size. That is, the book is right if it's trying to
be "fully general" -- but if it's writing about "mainstream"
C implementations it's wrong.

The Standard defines all input operations as if they used
the fgetc() function as many times as necessary (the actual
implementation might do something more intricate, but the end
result must be the same). The fgetc() function returns an int
value: either EOF to indicate failure, or an actual input byte
represented as unsigned char converted to int. If int is
wider than char, converting an unsigned char to an int yields
a non-negative value, and since the EOF macro expands to a
negative number there can be no confusion.

On those exotic architectures, though, things get sticky.
If sizeof(int) == 1, there must be unsigned char values that
are too large for int: for example, on a system with sixteen-bit
chars and sixteen-bit ints, INT_MAX will be 32767 but UCHAR_MAX
will be 65535. Since fgetc() must be able to read back any
character values fputc() might have written (subject to some
restrictions that don't matter here), on this system it must
be able to return 65536 distinguishable int values. Half of
those will necessarily be negative, and one of them will have
the same value as EOF. So on exotic architectures, it is
possible for fgetc() to return EOF when reading "real" data,
and the only way to tell whether the EOF is actual data or an
indication of input failure is to call both feof() and ferror().

> Also, the FAQ recommends not to use feof():
> <quote>In virtually all cases, there's no need to use feof at all.
> </quote>


I'm not the FAQ author, but I'd read "in virtually all cases"
to mean "whenever int is wider than char," or "on virtually all
`mainstream' machines." It would be nice, IMHO, if the FAQ were
more explicit about this, but it's not a big failing.

The FAQ is right in implying that feof() is seldom used,
because after receiving an EOF return value (on a "mainstream"
system) your immediate concern should be "End-of-input, or error?"
and it seems more natural to use ferror() for that question:

int ch;
while ( (ch = fgetc(stream)) != EOF ) {
    /* process the character just read */
}
/* "Why did we get EOF?" */
if (ferror(stream)) {
    /* do something about the I/O error */
}
else {
    /* normal end-of-input */
}

This code assumes that EOF can only appear as the result of
end-of-input or I/O error, so if there's no I/O error the stream
must have reached its end. Of course, the same reasoning would
hold for using feof(stream) and swapping the bodies of the two
if statements, but "ferror?" seems a more direct inquiry.

On "exotic" architectures the either/or reasoning breaks down
because there's a third possibility: an EOF return might be actual
input data. If you're writing with such a system in mind you need
to use both feof() and ferror() to distinguish the three outcomes,
and the loop might look something like

int ch;
while ( (ch = fgetc(stream)) , /* comma operator */
        (!feof(stream) && !ferror(stream)) ) {
    /* process the character just read */
}
/* "Was it error or end-of-input?" */
if (ferror(stream)) {
    /* do something about the I/O error */
}
else {
    /* normal end-of-input */
}

Of course, this can be written in many other rearrangements. One
likely change would be to call feof() and ferror() only when an EOF
shows up instead of every single time, by changing the while clause
to something like

while ( (ch = fgetc(stream)) != EOF
|| (!feof(stream) && !ferror(stream)) )

Since most I/O devices are pathetically slow compared to most CPUs,
this "optimization" probably doesn't save noticeable time -- but
it is in the tradition of C to worry about tiny efficiencies while
ignoring gross waste. (That same tradition, by the way, calls
for using getc() instead of fgetc() wherever possible.)

--
Eric Sosman
 
 
Coos Haak
 
      11-11-2006
On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:

> In article <(E-Mail Removed) .com>,
> Registered User <(E-Mail Removed)> wrote:
>
>>With a binary-mode stream, you can't detect the end-of-file by looking
>>for EOF, because a byte of data from a binary stream could have that
>>value, which would result in premature end of input.

>
> It would certainly be a mistake to compare a byte against EOF if the
> byte is a char, because EOF is an int value and a char converted to
> an int might have the same value as EOF. But getc() doesn't return
> a char; it returns an unsigned char converted to an int, so there
> is no possibility of a real byte appearing to be equal to EOF, because
> EOF is guaranteed to be negative.


getc returns an int, not a char, be it signed or unsigned.
#include <stdio.h>
int getc(FILE *FP);
And yes, if no EOF condition is reached, the int may be regarded as char.
EOF does not fit in a char so it well may be some negative number.

> So you can perfectly well compare against EOF provided you don't
> convert the value to a char first.


Yes.
--
Coos
 
 
Flash Gordon
 
      11-11-2006
Coos Haak wrote:
> On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:
>
>> In article <(E-Mail Removed) .com>,
>> Registered User <(E-Mail Removed)> wrote:
>>
>>> With a binary-mode stream, you can't detect the end-of-file by looking
>>> for EOF, because a byte of data from a binary stream could have that
>>> value, which would result in premature end of input.

>> It would certainly be a mistake to compare a byte against EOF if the
>> byte is a char, because EOF is an int value and a char converted to
>> an int might have the same value as EOF. But getc() doesn't return
>> a char; it returns an unsigned char converted to an int, so there
>> is no possibility of a real byte appearing to be equal to EOF, because
>> EOF is guaranteed to be negative.

>
> getc returns an int, not a char, be it signed or unsigned.


Richard said that.

> #include <stdio.h>
> int getc(FILE *FP);
> And yes, if no EOF condition is reached, the int may be regarded as char.


By *definition*, if EOF is not returned the value is that of an
*unsigned* char as, again, Richard said.

> EOF does not fit in a char so it well may be some negative number.


EOF is *defined* as being a negative number, so there is no "may well
be" about it.

>> So you can perfectly well compare against EOF provided you don't
>> convert the value to a char first.

>
> Yes.


Everything Richard said in that post is correct, not just that last
sentence.
--
Flash Gordon
 
 
Coos Haak
 
      11-11-2006
On Sat, 11 Nov 2006 17:21:14 +0000, Flash Gordon wrote:

> Coos Haak wrote:
>> On 11 Nov 2006 14:34:44 GMT, Richard Tobin wrote:
>>
>>> In article <(E-Mail Removed) .com>,
>>> Registered User <(E-Mail Removed)> wrote:
>>>
>>>> With a binary-mode stream, you can't detect the end-of-file by looking
>>>> for EOF, because a byte of data from a binary stream could have that
>>>> value, which would result in premature end of input.
>>> It would certainly be a mistake to compare a byte against EOF if the
>>> byte is a char, because EOF is an int value and a char converted to
>>> an int might have the same value as EOF. But getc() doesn't return

My mistake, I overlooked this -------
Sorry for reading and replying too fast and hasty ;-(
--
Coos
 