Questions about K&R (Kernighan and Ritchie)

 
 
sandeep
 
      04-22-2010
Hello friends ~

I am learning C from the K&R book. I have questions about Section 8.5
("an implementation of Fopen and Getc"). Although this section is UNIX(r)
specific I think all my questions are really about standard C... so the
ISO taliban can relax...

1> Look at this Macro
#define feof(p) (((p)->flag & _EOF) != 0)

My question is: feof is only specified to return 0 or not 0. There is no
requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
to force it to be 0 or 1? This seems very inefficient, after all feof is
likely to be called many times.

2> Here is another macro
#define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
function call and one bracket to stop expansion of sideeffects in p?

3> In a comment on that getc Macro, K&R say: "The characters are returned
unsigned, which ensures that all characters will be positive". I don't
really understand the point of this, I usually use char not unsigned char
for characters. And in K&R, all strings are of type char* not unsigned
char*.

Also if sizeof(char) == sizeof(int) then the character (unsigned char)
UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

Regards ~
 
Keith Thompson
 
      04-22-2010
sandeep <> writes:
> I am learning C from the K&R book. I have questions about Section 8.5
> ("an implementation of Fopen and Getc"). Although this section is UNIX(r)
> specific I think all my questions are really about standard C... so the
> ISO taliban can relax...


I see the smiley, but referring to those of us who prefer to
discuss ISO C as "taliban" is a bit insulting, don't you think?
(And yes, I know the word literally means "students", but I doubt
that that's what you meant.)

> 1> Look at this Macro
> #define feof(p) (((p)->flag & _EOF) != 0)
>
> My question is: feof is only specified to return 0 or not 0. There is no
> requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
> to force it to be 0 or 1? This seems very inefficient, after all feof is
> likely to be called many times.


Yes, the "!= 0" could be omitted, but it's not likely to be a big deal.
Since it's a macro, a compiler is likely to omit the extra calculation
anyway.

And no, feof() isn't likely to be called many times in well written
code. The way to determine whether you've reached the end of an input
stream is by checking the result of the reading function (for example,
getc() returns the value EOF). *After* that's happened, you can call
feof() to determine whether you reached end-of-file or encountered an
error.
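
A minimal sketch of that pattern, assuming an ordinary hosted system: the loop is driven by the value getc() returns, and the stream is asked why reading stopped only afterwards. The function name count_bytes and its had_error parameter are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes remaining in stream. Sets *had_error to 1 if reading
   stopped because of an error, or to 0 if it stopped at end-of-file. */
long count_bytes(FILE *stream, int *had_error)
{
    assert(stream != NULL && had_error != NULL);

    long n = 0;
    int ch;                              /* int, so EOF stays distinguishable */

    while ((ch = getc(stream)) != EOF)   /* the read result ends the loop */
        n++;

    /* Only now ask the stream why getc() returned EOF. */
    if (ferror(stream))
        *had_error = 1;
    else
        *had_error = 0;                  /* feof(stream) is nonzero here */

    return n;
}
```

Note that feof() is never used as the loop condition; it is consulted at most once, after the read has already failed.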

> 2> Here is another macro
> #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
> Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
> function call and one bracket to stop expansion of sideeffects in p?


No, extra parentheses aren't needed. As long as the name of the macro
parameter is immediately surrounded by parentheses (or brackets),
there's no problem with operator precedence.

And it's not about "expansion of side effects", it's about operator
precedence, i.e., which operators are associated with which operands.
Any side effects will occur anyway.

> 3> In a comment on that getc Macro, K&R say: "The characters are returned
> unsigned, which ensures that all characters will be positive". I don't
> really understand the point of this, I usually use char not unsigned char
> for characters. And in K&R, all strings are of type char* not unsigned
> char*.
>
> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.


getc() returns a result of type int, not char. For example, if
UCHAR_MAX is 255, then getc() will return the value 255 if you read a
'\xff' character, and the value -1 (assuming EOF==-1) if you encounter
the end of the stream or an error. They clash only if you store the
result in something smaller than an int. So don't do that.
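
As a rough sketch of why the int return type matters, assuming a typical system where int is wider than char: the first function keeps getc()'s result in an int, and the second shows what happens to EOF once it is narrowed. Both function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the next byte as getc() reports it: a value in 0..UCHAR_MAX,
   or EOF if nothing could be read. */
int first_byte(FILE *stream)
{
    assert(stream != NULL);
    int ch = getc(stream);       /* int keeps a 0xFF byte and EOF apart */
    return ch;
}

/* Returns nonzero if EOF is still recognizable after being stored in an
   unsigned char. */
int eof_survives_unsigned_char(void)
{
    unsigned char narrowed = (unsigned char)EOF;  /* negative EOF wraps to a positive value */
    return narrowed == EOF;      /* 0 whenever int is wider than char */
}
```

Storing the result in a plain char that happens to be signed fails the other way round: a genuine 0xFF byte may then compare equal to EOF.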

See section 12 of the comp.lang.c FAQ,
<http://www.c-faq.com/stdio/index.html>, especially the first few
questions.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Seebs
 
      04-22-2010
On 2010-04-22, sandeep <> wrote:
> 1> Look at this Macro
> #define feof(p) (((p)->flag & _EOF) != 0)
>
> My question is: feof is only specified to return 0 or not 0. There is no
> requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
> to force it to be 0 or 1? This seems very inefficient, after all feof is
> likely to be called many times.


I've seen code like this written for the same reason that some people
write
if (p != NULL)
instead of
if (p)

It's clearer to the user. The compiler may well notice that no one uses
the specific value and just run past it.

> 2> Here is another macro
> #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
> Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
> function call and one bracket to stop expansion of sideeffects in p?


No, because there's no such thing as "expansion of side effects". Parentheses
are used *only* to control grouping -- they have no effect on side
effects. As such, the () around p are sufficient whether or not they're
also part of the function call.

> 3> In a comment on that getc Macro, K&R say: "The characters are returned
> unsigned, which ensures that all characters will be positive". I don't
> really understand the point of this, I usually use char not unsigned char
> for characters. And in K&R, all strings are of type char* not unsigned
> char*.


char may well be unsigned.

The point of this is that converting everything to unsigned char means
that every char value is necessarily non-negative, guaranteeing that no
value returned which represents a character can compare equal to EOF,
which is negative.
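
A sketch of that conversion, assuming the usual case where int is wider than char (the function names are invented for this example): whatever the signedness of plain char, going through unsigned char produces a non-negative int, which therefore cannot equal EOF.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Converts a stored char to the value getc() would report for it. */
int char_as_getc_value(char c)
{
    return (unsigned char)c;   /* in 0..UCHAR_MAX whether char is signed or not */
}

/* Returns nonzero if the character's getc() value differs from EOF. */
int distinct_from_eof(char c)
{
    assert(UCHAR_MAX <= INT_MAX);   /* the ordinary case; the exotic one is discussed below */
    return char_as_getc_value(c) != EOF;
}
```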

> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.


(Not necessarily, but I see your point.)

I am not aware of an implementation where this can actually happen;
specifically, I'm under the impression that such implementations are likely
to simply only ever yield values in some smaller range from getchar(),
so that EOF can never occur. A typical choice might be to have a 32-bit
char object, but to only store 8 bits at a time in files or retrieve
8 bits at a time when reading files.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet-
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
 
Alan Curry
 
      04-22-2010
In article <hqqco9$2t1$>,
sandeep <> wrote:
|Hello friends ~
|
|I am learning C from the K&R book. I have questions about Section 8.5
|("an implementation of Fopen and Getc"). Although this section is UNIX(r)
|specific I think all my questions are really about standard C... so the
|ISO taliban can relax...

Was this sample implementation written from scratch for the 2nd edition, or
is it just an updated version of some code that predates the C standard?
That would explain some of the things you're seeing...


|
|1> Look at this Macro
|#define feof(p) (((p)->flag & _EOF) != 0)
|
|My question is: feof is only specified to return 0 or not 0. There is no
|requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
|to force it to be 0 or 1? This seems very inefficient, after all feof is
|likely to be called many times.

If this implementation predates the standard, then what feof was "specified"
to return might have been less clear, so making it return 0 or 1 would have
been the safe thing to do.

|
|3> In a comment on that getc Macro, K&R say: "The characters are returned
|unsigned, which ensures that all characters will be positive". I don't
|really understand the point of this, I usually use char not unsigned char
|for characters. And in K&R, all strings are of type char* not unsigned
|char*.
|
|Also if sizeof(char) == sizeof(int) then the character (unsigned char)
|UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

Regardless of whether this implementation predates the standard, I think it's
safe to say that sizeof(char) == sizeof(int) was not even considered a remote
possibility when getc was designed.

--
Alan Curry
 
Eric Sosman
 
      04-22-2010
On 4/22/2010 5:26 PM, Keith Thompson wrote:
> sandeep<> writes:
>> [...]
>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
>> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

>
> getc() returns a result of type int, not char. For example, if
> UCHAR_MAX is 255, then getc() will return the value 255 if you read a
> '\xff' character, and the value -1 (assuming EOF==-1) if you encounter
> the end of the stream or an error. They clash only if you store the
> result in something smaller than an int. So don't do that.


I think you've misunderstood the question. On a system
where UCHAR_MAX > INT_MAX, getc() et al. have a problem: It
is possible to read unsigned char values that won't fit in an
int and hence can't be returned properly. What happens later
is of little importance, since the damage has been done within
getc() itself.

On such a system, I think we can deduce (for hosted
implementations)

- Conversion of values in (INT_MAX, UCHAR_MAX] doesn't raise
a signal or do anything untoward, but instead yields some
implementation-defined value. (At least, it does so inside
getc() et al, which need not be written in C.)

- Each unsigned char value converts to a distinct int value;
even the out-of-range conversions preserve information.

- Since there must be as many values in [INT_MIN, -1] as in
the span of out-of-range values, INT_MIN + INT_MAX == -1.
That is, two's complement is mandatory.

To cater to such systems (should one feel it necessary), the
familiar

int ch;
while ((ch = getc(stream)) != EOF) ...

needs to be rewritten as

int ch;
while ((ch = getc(stream)) != EOF
|| !(feof(stream) || ferror(stream))) ...

because getc() must map one valid input character value to
the int value EOF.
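
For concreteness, the rewritten loop could be wrapped up as below. This is only a sketch: the name count_bytes_strict is invented, and on ordinary systems, where no valid byte converts to EOF, it behaves exactly like the familiar loop.

```c
#include <assert.h>
#include <stdio.h>

/* Counts bytes until a genuine end-of-file or read error, even if some
   valid byte happens to convert to the same int value as EOF. */
long count_bytes_strict(FILE *stream)
{
    assert(stream != NULL);

    long n = 0;
    int ch;

    /* Keep going if a byte was read, or if the "EOF" value came back
       without either stream indicator being set (so it was real data). */
    while ((ch = getc(stream)) != EOF
           || !(feof(stream) || ferror(stream)))
        n++;

    return n;
}
```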

Let us now ponder the perils of in-band signalling.

--
Eric Sosman
 
Keith Thompson
 
      04-22-2010
Eric Sosman <> writes:
> On 4/22/2010 5:26 PM, Keith Thompson wrote:
>> sandeep<> writes:
>>> [...]
>>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
>>> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

>>
>> getc() returns a result of type int, not char. For example, if
>> UCHAR_MAX is 255, then getc() will return the value 255 if you read a
>> '\xff' character, and the value -1 (assuming EOF==-1) if you encounter
>> the end of the stream or an error. They clash only if you store the
>> result in something smaller than an int. So don't do that.

>
> I think you've misunderstood the question.


I think you're right. I managed to miss the "sizeof(char) ==
sizeof(int)" part of the question.

Well, I answered *some* question, just not the one the OP asked.

[snip]

> Let us now ponder the perils of in-band signalling.


And of a language design that encourages it (by, for example, not
providing a decent way for functions to return multiple values).

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Peter Nilsson
 
      04-22-2010
sandeep <nos...@nospam.com> wrote:
> I am learning C from the K&R book. I have questions about
> Section 8.5 ("an implementation of Fopen and Getc").
> Although this section is UNIX(r) specific I think all my
> questions are really about standard C... so the
> ISO taliban can relax...


Ahh, the Jacob Navia school of beginning by insulting the
very people you're seeking comments from. Sure has worked
well for him, hasn't it...

> 1> Look at this Macro
> #define feof(p) (((p)->flag & _EOF) != 0)
>
> My question is: feof is only specified to return 0 or not 0.
> There is no requirement for it to only return 0 or 1. So why
> the unnecessary "!= 0" to force it to be 0 or 1? This seems
> very inefficient, after all feof is likely to be called many
> times.


True, but it's most likely to be called in a conditional. Most
compilers are quite capable of implementing expr != 0 without
actually evaluating the != operator.

> 2> Here is another macro
> #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
> Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one
> bracket for the function call and one bracket to stop
> expansion of sideeffects in p?


What do you mean by expansion of sideeffects?

Note that function call parentheses and commas separating
parameters are syntactical, so there's (generally) no need
to 'protect' function parameters that represent expressions.

If someone wants to pass an argument with a comma operator
they'll have to supply parentheses to avoid a constraint
violation on calling a function macro with too many
arguments. [Although C99 now supports variadic macros.]

> 3> In a comment on that getc Macro, K&R say: "The
> characters are returned unsigned, which ensures that all
> characters will be positive". I don't really understand
> the point of this, I usually use char not unsigned char
> for characters.


Character codes are non-negative, hence getc's return.
Plain char was invented for hysterical reasons.

> And in K&R, all strings are of type char* not unsigned
> char*.


Plain char is a bane of C. It should have had two 'byte'
types and char should have been a typedef char_t. But it
isn't...

> Also if sizeof(char) == sizeof(int)


Then there are all sorts of problems for hosted
implementations. Despite what some members of the
Committee may say, many aspects of the standard
library were not designed with that implementation
in mind.

> then the character (unsigned char) UCHAR_MAX will clash with EOF
> == -1 when it gets promoted to int.


The mapping is implementation defined, but yes, there will be
overlap with EOF (which needn't be -1 BTW.) General practice
though is to ignore such systems as hosted environments.

--
Peter
 
Ben Bacarisse
 
      04-22-2010
Seebs <usenet-> writes:

> On 2010-04-22, sandeep <> wrote:

<snip>
>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
>> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

>
> (Not necessarily, but I see your point.)
>
> I am not aware of an implementation where this can actually happen;
> specifically, I'm under the impression that such implementations are likely
> to simply only ever yield values in some smaller range from getchar(),
> so that EOF can never occur. A typical choice might be to have a 32-bit
> char object, but to only store 8 bits at a time in files or retrieve
> 8 bits at a time when reading files.


That may be reasonable from a practical point of view, but I don't think
it is conforming. In

int i;
fread(&i, sizeof i, 1, fp);

fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
i times. getchar is also (indirectly) defined in terms of fgetc so I
don't think there can be any special dispensation for it.

--
Ben.
 
Keith Thompson
 
      04-23-2010
Ben Bacarisse <> writes:
> Seebs <usenet-> writes:
>> On 2010-04-22, sandeep <> wrote:

> <snip>
>>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
>>> UCHAR_MAX will clash with EOF == -1 when it gets promoted to int.

>>
>> (Not necessarily, but I see your point.)
>>
>> I am not aware of an implementation where this can actually happen;
>> specifically, I'm under the impression that such implementations are likely
>> to simply only ever yield values in some smaller range from getchar(),
>> so that EOF can never occur. A typical choice might be to have a 32-bit
>> char object, but to only store 8 bits at a time in files or retrieve
>> 8 bits at a time when reading files.

>
> That may be reasonable from a practical point of view, but I don't think
> it is conforming. In
>
> int i;
> fread(&i, sizeof i, 1, fp);
>
> fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
> i times. getchar is also (indirectly) defined in terms of fgetc so I
> don't think there can be any special dispensation for it.


I don't think that by itself makes Seebs's hypothetical implementation
non-conforming.

What does make it non-conforming is that you wouldn't be able to
write a byte with any value in the range 256..UCHAR_MAX to a file
(in binary mode) and then read it back (also in binary mode) and
get the same value.
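
The round-trip property can be checked with something like the following sketch. It assumes an ordinary system where UCHAR_MAX is smaller than INT_MAX (the loops would need rethinking otherwise) and a stream opened for binary update, such as one from tmpfile(); the function name bytes_round_trip is invented here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Writes every unsigned char value to a binary update stream, reads the
   bytes back, and returns 1 if each one comes back unchanged, else 0.
   Assumes UCHAR_MAX < INT_MAX, i.e. an ordinary system. */
int bytes_round_trip(FILE *stream)
{
    assert(stream != NULL);

    for (unsigned v = 0; v <= UCHAR_MAX; v++)
        if (putc((int)v, stream) == EOF)
            return 0;

    rewind(stream);                      /* required between writing and reading */

    for (unsigned v = 0; v <= UCHAR_MAX; v++)
        if (getc(stream) != (int)v)
            return 0;

    /* Nothing should be left, and the stop should be a true end-of-file. */
    return getc(stream) == EOF && feof(stream);
}
```

An implementation that silently kept only 8 bits of a wider char would fail the second loop for every value above 255.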

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Seebs
 
      04-23-2010
On 2010-04-22, Ben Bacarisse <> wrote:
> That may be reasonable from a practical point of view, but I don't think
> it is conforming. In
>
> int i;
> fread(&i, sizeof i, 1, fp);
>
> fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
> i times. getchar is also (indirectly) defined in terms of fgetc so I
> don't think there can be any special dispensation for it.


Interesting point. Hadn't thought of that.

That brings us to the other answer, which is the frequent assertion that
the requirement for EOF to be a distinct value means that you can't really
have a fully conforming hosted implementation where sizeof(int) == 1.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet-
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
 