Velocity Reviews - Computer Hardware Reviews


getc() vs. fgetc()

 
 
William L. Bahn
 
      07-14-2004
I'm sure this has been asked before, and I have looked in the FAQ, but I'm
looking for an explanation for the following:

The function pairs:

gets()/fgets()
puts()/fputs()
printf()/fprintf()
scanf()/fscanf()

differ primarily in that the first one assumes stdin/stdout while the second
one works with a stream passed by the programmer. This makes sense and makes
the functions easy to remember.

But then we have:

getc()/fgetc()
putc()/fputc()
getchar()/fgetchar()
putchar()/fputchar()

In each case the pairs of functions perform the same task. This makes it
hard for people that don't use these functions all the time because
every time they use one they have to look up whether it assumes one of the
standard streams or not. Is there a reason that the standard did not adopt a
consistent (and quite useful) naming convention for these functions?




 
 
 
 
 
Andrew Palmer
 
      07-14-2004

"William L. Bahn" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> I'm sure this has been asked before, and I have looked in the FAQ, but I'm
> looking for an explanation for the following:
>
> The functions pairs:
>
> gets()/fgets()
> puts()/fputs()
> printf()/fprintf()
> scanf()/fscanf()
>
> differ primarily in that the first one assumes stdin/stdout while the second
> one works with a stream passed by the programmer. This makes sense and makes
> the functions easy to remember.
>
> But then we have:
>
> getc()/fgetc()
> putc()/fputc()
> getchar()/fgetchar()
> putchar()/fputchar()
>
> In each case the pairs of functions perform the same task. This makes it
> hard for people that don't use these functions all the time because
> everytime they use one they have to look up whether it assumes one of the
> standard streams or not. Is there a reason that the standard did not adopt a
> consistent (and quite useful) naming convention for these functions?


Yeah, fgetc() and fputc() make sense, but getc()/putc() should do what
getchar()/putchar() do. It may actually be even less consistent than you're
saying, though. The arguments for fputs() are backwards from fprintf()
(fprintf() must be that way), the arguments for fputs() and fgets() don't
match, and the "f" in fgetchar() and fputchar() doesn't seem to refer to
anything. To (partly) answer your question, the latter two functions are
actually not part of "the standard."


 
 
 
 
 
Richard Bos
 
      07-14-2004
"William L. Bahn" <(E-Mail Removed)> wrote:

> The functions pairs:
>
> gets()/fgets()


Don't use gets(). Ever. It is an irreparable security hole, because
there is no way to tell it where its buffer stops, and it will start to
munge other variables, or worse.

> puts()/fputs()
> printf()/fprintf()
> scanf()/fscanf()
>
> differ primarily in that the first one assumes stdin/stdout while the second
> one works with a stream passed by the programmer. This makes sense and makes
> the functions easy to remember.


That's deceptive, though. It's true in the case of (f)printf() and
(f)scanf(), but puts(s) actually does something subtly different from
fputs(s, stdin);

> But then we have:
>
> getc()/fgetc()
> putc()/fputc()
> getchar()/fgetchar()
> putchar()/fputchar()


No, we don't. There is no such thing as fgetchar() and fputchar() in C.

> In each case the pairs of functions perform the same task.


No, they don't. fgetc(instream) gets a character from instream.
getc(instream) does the same thing superficially, but it is allowed to
evaluate its parameter more than once. This means that
fgetc(instream[i++]) is safe, but getc(instream[i++]) is not safe; it
might evaluate i++ more than once, even more than once between two
sequence points, and thus cause undefined behaviour. The other side of
the coin is that getc() could be slightly faster than fgetc().
getchar() is equivalent to getc(stdin). Since stdin does not contain
side effects, this is both safe and efficient.
The same thing is true for fputc()/putc()/putchar()/stdout, with the
proviso that putc() is only allowed to evaluate its second argument more
than once; putc(line[i++], outstream) is safe, but putc(i,
outstream[j++]) is not.
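
A minimal sketch of the distinction above, assuming an array of already-open streams (the function names and the array are made up for illustration). The first function relies on fgetc() evaluating its argument exactly once; the second shows the usual way to keep getc() safe, by moving the side effect out of the argument:

```c
#include <stdio.h>

/*
 * Read one character from each of the first n streams.
 * fgetc() is used because the stream expression has a side effect (i++);
 * fgetc() has no permission to evaluate its argument more than once.
 */
static int read_one_from_each(FILE *streams[], int n, int out[])
{
    int i = 0;
    int count = 0;

    while (i < n) {
        out[count++] = fgetc(streams[i++]);
    }
    return count;
}

/*
 * Same job with getc(): the side effect is hoisted into its own
 * statement, so getc() only ever sees a plain variable.
 */
static int read_one_from_each_getc(FILE *streams[], int n, int out[])
{
    int i = 0;
    int count = 0;

    while (i < n) {
        FILE *fp = streams[i++];  /* i advances here, exactly once */
        out[count++] = getc(fp);  /* safe even if getc is a macro */
    }
    return count;
}
```

Either version leaves one character (or EOF) per stream in out[] and returns how many streams were read.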

> This makes it
> hard for people that don't use these functions all the time because
> everytime they use one they have to look up whether it assumes one of the
> standard streams or not. Is there a reason that the standard did not adopt a
> consistent (and quite useful) naming convention for these functions?


I presume it was for historical reasons; that is, because it was the way
pre-Standard C implementations usually did it, and changing it would
have broken too much existing code.

Richard
 
 
Dan Pop
 
      07-14-2004
In <(E-Mail Removed)> "William L. Bahn" <(E-Mail Removed)> writes:

>But then we have:
>
>getc()/fgetc()
>putc()/fputc()
>getchar()/fgetchar()
>putchar()/fputchar()
>
>In each case the pairs of functions perform the same task. This makes it
>hard for people that don't use these functions all the time because
>everytime they use one they have to look up whether it assumes one of the
>standard streams or not. Is there a reason that the standard did not adopt a
>consistent (and quite useful) naming convention for these functions?


The naming convention predates the standard, and it is consistent, even if
it is not obvious to those unfamiliar with the language history.

First, there is no such thing as fgetchar and fputchar, so we're
left only with the getc/fgetc and putc/fputc pairs. The 'f' stands, in
both cases, for "function", which makes perfect sense once you understand
the history of <stdio.h>.

In the pre-ANSI days, getc and putc were typically implemented as macros,
only. This was good enough for most purposes, unless you needed to pass
their address to another function or, for some other reason, needed a
function with the same semantics as these macros. So, fgetc and fputc
were introduced as the function versions of getc and putc.

Things are different in standard C, because each function in the standard
C library must be implemented as a function, even if it is also provided
as a macro. So, you can take the address of getc, or even call the
function version of getc, if you're careful enough to bypass the macro.
Likewise, fgetc and fputc can be provided as macros, too, although I can't
imagine why any implementor might want to do so.
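
A small sketch of what "taking the address" and "bypassing the macro" can look like in standard C. The helper names (read_with and the two first_char functions) are invented for this example; only getc and fgetc come from <stdio.h>:

```c
#include <stdio.h>

/* Calls whatever reader it is handed; this needs a real function address. */
static int read_with(int (*reader)(FILE *), FILE *fp)
{
    return reader(fp);
}

static int first_char_via_pointer(FILE *fp)
{
    /* The name getc is not followed by '(', so a function-like macro
       of that name is not expanded and the function itself is named. */
    return read_with(getc, fp);
}

static int first_char_bypassing_macro(FILE *fp)
{
    /* Parenthesizing the name also suppresses any macro definition. */
    return (getc)(fp);
}
```

Writing #undef getc after including <stdio.h> is another way to get at the underlying function, at the cost of losing the macro for the rest of the file.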

But even today there is a subtle difference between the plain versions and
the f-versions: if implemented as macros, all the functions from the
standard C library are restricted to single evaluation of each of their
parameters. This makes something like putchar(i++) safe: i is guaranteed
to be incremented once, even if putchar is implemented as a macro.
However, there are two exceptions from this rule: getc and putc. If
implemented as macros, they are allowed to evaluate their FILE pointer
parameter (and *only* this parameter) more than once. So, in the
unlikely event that you ever need to call getc/putc with an expression
containing side effects as the FILE pointer argument, use the f-version
instead.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: (E-Mail Removed)
 
 
William L. Bahn
 
      07-14-2004

"Richard Bos" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "William L. Bahn" <(E-Mail Removed)> wrote:
>
> > The functions pairs:
> >
> > gets()/fgets()

>
> Don't use gets(). Ever. It is an irrepairable security hole, because
> there is no way to tell it where its buffer stops and it will start to
> munge other variable or worse..


I understand that. That wasn't the question. I also understand that they
handle the newline character differently and that when you use fgets() to
bring in a string you also need to check whether the newline character is at
the end of the string in order to determine whether the entire line was read
in versus hitting the count limit. My question wasn't about any of that. All
of those are answered in the FAQ quite extensively. None of this would be any
different if the functions were named Bob and Sue. My question was limited
to a query about the naming convention.
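
For readers who have not seen the fgets() newline check mentioned above, here is one possible sketch. The function name read_line and its return codes are assumptions made for this example, not anything defined by the standard library:

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one line into buf (capacity size, including the terminator).
 * Returns  1 if a complete line was read (the newline is stripped),
 *          0 if no newline was seen (the buffer filled up, or the
 *            input ended without a final newline),
 *         -1 on end of file or read error with nothing stored.
 */
static int read_line(char *buf, int size, FILE *fp)
{
    size_t len;

    if (fgets(buf, size, fp) == NULL) {
        return -1;
    }
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

A caller that gets 0 back can keep calling read_line() to collect the rest of the long line.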
>
> > puts()/fputs()
> > printf()/fprintf()
> > scanf()/fscanf()
> >
> > differ primarily in that the first one assumes stdin/stdout while the second
> > one works with a stream passed by the programmer. This makes sense and makes
> > the functions easy to remember.

>
> That's deceptive, though. It's true in the case of (f)printf() and
> (f)scanf(), but puts(s) actually does something subtly different from
> fputs(s, stdin);


I understand that. Again, that is addressed more than adequately by the FAQ.
That wasn't the question. The question only has to do with the naming
convention chosen.

>
> > But then we have:
> >
> > getc()/fgetc()
> > putc()/fputc()
> > getchar()/fgetchar()
> > putchar()/fputchar()

>
> No, we don't. There is no such thing as fgetchar() and fputchar() in C.


I'll take your word for that. They are in every stdio.h I have ever worked
with, but that's not a very long list. In fact, I just looked at the
portability list for those functions and see that they are not listed as
being ANSI compliant. So thanks, even though it still doesn't answer the
question that was asked.

>
> > In each case the pairs of functions perform the same task.

>
> No, they don't. fgetc(instream) gets a character from instream.
> getc(instream) does the same thing superficially, but it is allowed to
> evaluate its parameter more than once. This means that
> fgetc(instream[i++]) is safe, but getc(instream[i++]) is not safe; it
> might evaluate i++ more than once, even more than once between two
> sequence points, and thus cause undefined behaviour. The other side of
> the coin is that getc() could be slightly faster than fgetc().
> getchar() is equivalent to getc(stdin). Since stdin does not contain
> side effects, this is both safe and efficient.
> The same thing is true for fputc()/putc()/putchar()/stdout, with the
> proviso that putc() is only allowed to evaluate its second argument more
> than once; putc(line[i++], outstream) is safe, but putc(i,
> outstream[j++]) is not.


Although it still doesn't answer the question that was asked, this is
definitely completely new information to me and so I appreciate it. I'm
trying to picture the process flow that allows a function to evaluate its
parameters more than once and can't. While I know that this is
implementation specific, my mental picture of the process is that the
parameters are evaluated and the resulting values are placed on a stack.
Necessary context is then saved and program control is turned over to the
function's code that then accesses the evaluated values of the parameters
from the stack based on the value of the stack pointer. After doing whatever
it wants to with those values, it calculates a return value (if any), pops
all of the arguments off the stack, and places that value on the stack and
returns control to the calling function that then pops the return value from
the stack placing the stack back to its original condition prior to the
function call.

I would be grateful for an alternate picture that allows multiple
evaluations of a function's arguments for a single call to the function.

>
> > This makes it
> > hard for people that don't use these functions all the time because
> > everytime they use one they have to look up whether it assumes one of the
> > standard streams or not. Is there a reason that the standard did not adopt a
> > consistent (and quite useful) naming convention for these functions?

>
> I presume it was for historical reasons; that is, because it was the way
> pre-Standard C implementations usually did it, and changing it would
> have broken too much existing code.


This is my general assumption, but I'm hoping that, as with the implicit
promotion of chars and shorts to ints in certain function calls for
compatibility with legacy code, I can get a more definite answer. In
particular, I'd like to know whether there was a reason that 'f' was used to
distinguish getc() from fgetc(). If the performance difference you mentioned
were the other way around, I could almost see the two being getc() and
fast_getc(), but of course that would only be a guess on my part and not
what I am looking for.

Thanks.

>
> Richard



 
 
William L. Bahn
 
      07-14-2004

"Dan Pop" <(E-Mail Removed)> wrote in message
news:cd3e4j$9g6$(E-Mail Removed)...
> In <(E-Mail Removed)> "William L. Bahn" <(E-Mail Removed)> writes:
>
> >But then we have:
> >
> >getc()/fgetc()
> >putc()/fputc()
> >getchar()/fgetchar()
> >putchar()/fputchar()
> >
> >In each case the pairs of functions perform the same task. This makes it
> >hard for people that don't use these functions all the time because
> >everytime they use one they have to look up whether it assumes one of the
> >standard streams or not. Is there a reason that the standard did not adopt a
> >consistent (and quite useful) naming convention for these functions?

>
> The naming convention predates the standard, and it is consistent, even if
> it is not obvious to those unfamiliar with the language history.
>
> First, there is no such thing as fgetchar and fputchar, so we're
> left only with the getc/fgetc and putc/fputc pairs. The 'f' stands, in
> both cases, for "function", which makes perfect sense once you understand
> the history of <stdio.h>.


THANK YOU!!!

It still seems sloppy to use the 'f' prefix for more than one thing in
functions that are in the same library and have so much surface similarity
to each other. But I can see that the original developers could easily have
been so close to the material that they didn't see it that way - at least
not when it mattered. The same with the location of the FILE * in several of
the functions. It would have been nice had they seen, in time, the utility
of adopting a convention that said that the FILE * will always go first in
those functions that use it (since going last would create unnecessary
overhead in variable length functions such as fprintf()).

>
> In the pre-ANSI days, getc and putc were typically implemented as macros,
> only. This was good enough for most purposes, unless you needed to pass
> their address to another function or, for some other reason, needed a
> function with the semantics as these macros. So, fgetc and fputc have
> been introduced as the function versions of getc and putc.
>
> Things are different in standard C, because each function in the standard
> C library must be implemented as a function, even if it is also provided
> as a macro. So, you can take the address of getc, or even call the
> function version of getc, if you're careful enough to bypass the macro.
> Likewise, fgetc and fputc can be provided as macros, too, although I can't
> imagine why any implementor might want to do so.
>
> But even today there is a subtle difference between the plain versions and
> the f-versions: if implemented as macros, all the functions from the
> standard C library are restricted to single evaluation of each of their
> parameters. This makes something like putchar(i++) safe: i is guaranteed
> to be incremented once, even if putchar is implemented as a macro.
> However, there are two exceptions from this rule: getc and putc. If
> implemented as macros, they are allowed to evaluate their FILE pointer
> parameter (and *only* this parameter) more than once. So, in the
> unlikely event that you ever need to call getc/putc with an expression
> containing side effects as the FILE pointer argument, use the f-version
> instead.
>


Thank you very much. This makes sense. As I said in another post, I'm having
a hard time picturing the process flow that makes multiple evaluation of a
function's parameter possible. Could you describe that in more detail?

Thanks.

> Dan
> --
> Dan Pop
> DESY Zeuthen, RZ group
> Email: (E-Mail Removed)



 
 
Keith Thompson
 
      07-14-2004
(E-Mail Removed) (Richard Bos) writes:
> "William L. Bahn" <(E-Mail Removed)> wrote:

[...]
> > puts()/fputs()
> > printf()/fprintf()
> > scanf()/fscanf()
> >
> > differ primarily in that the first one assumes stdin/stdout while
> > the second one works with a stream passed by the programmer. This
> > makes sense and makes the functions easy to remember.

>
> That's deceptive, though. It's true in the case of (f)printf() and
> (f)scanf(), but puts(s) actually does something subtly different from
> fputs(s, stdin);


I think you mean fputs(s, stdout).

The difference, of course, is that puts(s) appends a newline; I
wouldn't call that a subtle difference.
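
To make the newline difference concrete, here is a rough sketch of a helper that behaves like puts() but on a stream of the caller's choosing. The name put_line is hypothetical; the point is only that puts(s) acts like fputs(s, stdout) followed by a newline:

```c
#include <stdio.h>

/*
 * Emulates puts() on an arbitrary stream: writes the string, then a
 * newline. Returns EOF on failure and 0 on success (puts() itself
 * only promises a nonnegative value on success).
 */
static int put_line(const char *s, FILE *fp)
{
    if (fputs(s, fp) == EOF) {
        return EOF;
    }
    return fputc('\n', fp) == EOF ? EOF : 0;
}
```

So put_line("abc", fp) writes four characters where fputs("abc", fp) writes three.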

[...]
> > In each case the pairs of functions perform the same task.

>
> No, they don't. fgetc(instream) gets a character from instream.
> getc(instream) does the same thing superficially, but it is allowed to
> evaluate its parameter more than once. This means that
> fgetc(instream[i++]) is safe, but getc(instream[i++]) is not safe; it
> might evaluate i++ more than once, even more than once between two
> sequence points, and thus cause undefined behaviour. The other side of
> the coin is that getc() could be slightly faster than fgetc().
> getchar() is equivalent to getc(stdin). Since stdin does not contain
> side effects, this is both safe and efficient.
> The same thing is true for fputc()/putc()/putchar()/stdout, with the
> proviso that putc() is only allowed to evaluate its second argument more
> than once; putc(line[i++], outstream) is safe, but putc(i,
> outstream[j++]) is not.


I'd say that they do perform the same task, but with slightly
different semantics. It depends, I suppose, on how loosely you want
to define the phrase "perform the same task", but William's statement
seems perfectly reasonable to me.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
Keith Thompson
 
      07-14-2004
"William L. Bahn" <(E-Mail Removed)> writes:
[...]
> Although it still doesn't answer the question that was asked, this is
> definitely completely new information to me and so I appreciate it. I'm
> trying to picture the process flow that allows a function to evaluate its
> parameters more than once and can't. While I know that this is
> implementation specific, my mental picture of the process is that the
> parameters are evaluated and the resulting values are placed on a stack.
> Necessary context is then saved and program control is turned over to the
> function's code that then accesses the evaluated values of the parameters
> from the stack based on the value of the stack pointer. After doing whatever
> it wants to with those values, at calculates a return value (if any), pops
> all of the arguments off the stack, and places that value on the stack and
> returns control to the calling function that then pops the return value from
> the stack placing the stack back to its original condition prior to the
> function call.
>
> I would be greatful for an alternate picture that allows multiple
> evaluations of a function's arguments for a single call to the function.


If they evaluate their arguments more than once, it's because they're
implemented as macros.

Any library function can be implemented as a macro in addition to its
declaration as a function, but with the restriction that the macro
cannot evaluate any of its arguments more than once. For putc() and a
few other functions, the implementation is given special permission to
use a macro that does evaluate its stream argument more than once;
this allows for a more efficient implementation.

Here's a definition of the putc() macro on one system (obviously
this is non-portable):

#define putc(x, p) (--(p)->_cnt < 0 ? __flsbuf((x), (p)) \
: (int)(*(p)->_ptr++ = (unsigned char) (x)))

As long as the output buffer isn't full, an invocation of putc() can
store a character directly in the output buffer without the overhead
of a function call. The stream argument is rarely going to be an
expression with side effects anyway, so evaluating it more than once
will rarely matter, but if the standard didn't explicitly permit it an
implementation would have to correctly support calls where the second
argument does have side effects.
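
The effect can be observed without touching any implementation internals. The sketch below uses a toy macro, TOY_PUTC, which is purely hypothetical and is not any real library's putc; it simply mentions its stream argument twice, and a counter records how often the argument expression is evaluated:

```c
#include <stdio.h>

/* Hypothetical macro for illustration only: fp appears twice. */
#define TOY_PUTC(c, fp) (ferror(fp) ? EOF : fputc((c), (fp)))

static FILE *current_stream;
static int stream_lookups;

/* An argument expression with a visible side effect. */
static FILE *lookup_stream(void)
{
    ++stream_lookups;
    return current_stream;
}

/* How many times is the stream expression evaluated by the macro? */
static int lookups_with_macro(FILE *fp)
{
    current_stream = fp;
    stream_lookups = 0;
    (void)TOY_PUTC('x', lookup_stream());
    return stream_lookups;
}

/* The same call through fputc(), which evaluates it once. */
static int lookups_with_function(FILE *fp)
{
    current_stream = fp;
    stream_lookups = 0;
    (void)fputc('x', lookup_stream());
    return stream_lookups;
}
```

On a stream with no error set, the macro version performs two lookups and the function version one, which is exactly the kind of difference the standard allows for getc and putc.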

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
lawrence.jones@ugsplm.com
 
      07-15-2004
William L. Bahn <(E-Mail Removed)> wrote:
>
> The same with the location of the FILE * in several of
> the functions. It would have been nice had they seen, in time, the utility
> of adopting a convention that said that the FILE * will always go first in
> those functions that use it (since going last would create unnecessary
> overhead in variable length functions such as fprintf()).


Ah, but there was method in that madness, too. By putting the stream
argument at the end, it could be made optional with the default being
the appropriate standard stream (stdin or stdout). Fortunately, that
idea didn't hang around very long, but the order of the arguments did.

-Larry Jones

They say winning isn't everything, and I've decided
to take their word for it. -- Calvin
 
 
William L. Bahn
 
      07-15-2004
THANKS!

I was trying to think of a way a macro could evaluate an argument more than
once and the obvious answer just didn't make itself obvious to me:

#define sq(x) ( (x)*(x) )

evaluates it more than once and hence would have problems with:

d = sq( x*=2 );

What's the general way of handling something like this where you need to use
the value more than once? Using pow() is not the answer because that would
only work in a case similar to this one and I'm looking for a general way.

Can you do something like:

#define sq(x) {double u; u=(x); u*u}

That won't work because you can't do:

y = {3}; // Curly braces, not parens

But is there some trick that would let you do the same idea?
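
One common answer, offered here as a sketch rather than the only option, is to stop using a macro and write a real function: a function's arguments are evaluated exactly once before the call. The name square and the choice of double are assumptions for this example:

```c
/*
 * A real function evaluates each argument exactly once, so an
 * argument with side effects behaves as the caller expects.
 * In C99 it could be declared "static inline" to invite the
 * compiler to remove the call overhead.
 */
static double square(double v)
{
    return v * v;
}
```

With this, d = square(x *= 2) doubles x once and squares the result. The trade-off is that the function is tied to one argument type, where the macro worked for any arithmetic type.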

"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "William L. Bahn" <(E-Mail Removed)> writes:
> [...]
> > Although it still doesn't answer the question that was asked, this is
> > definitely completely new information to me and so I appreciate it. I'm
> > trying to picture the process flow that allows a function to evaluate its
> > parameters more than once and can't. While I know that this is
> > implementation specific, my mental picture of the process is that the
> > parameters are evaluated and the resulting values are placed on a stack.
> > Necessary context is then saved and program control is turned over to the
> > function's code that then accesses the evaluated values of the parameters
> > from the stack based on the value of the stack pointer. After doing whatever
> > it wants to with those values, at calculates a return value (if any), pops
> > all of the arguments off the stack, and places that value on the stack and
> > returns control to the calling function that then pops the return value from
> > the stack placing the stack back to its original condition prior to the
> > function call.
> >
> > I would be greatful for an alternate picture that allows multiple
> > evaluations of a function's arguments for a single call to the function.

>
> If they evaluate their arguments more than once, it's because they're
> implemented as macros.
>
> Any library function can be implemented as a macro in addition to its
> declaration as a function, but with the restriction that the macro
> cannot evaluate any of its arguments more than once. For putc() and a
> few other functions, the implementation is given special permission to
> use a macro that does evaluate its stream argument more than once;
> this allows for a more efficient implementation.
>
> Here's a definition of the putc() macro on one system (obviously
> this is non-portable):
>
> #define putc(x, p) (--(p)->_cnt < 0 ? __flsbuf((x), (p)) \
> : (int)(*(p)->_ptr++ = (unsigned char) (x)))
>
> As long as the output buffer isn't full, an invocation of putc() can
> store a character directly in the output buffer without the overhead
> of a function call. The stream argument is rarely going to be an
> expression with side effects anyway, so evaluating it more than once
> will rarely matter, but if the standard didn't explicitly permit it an
> implementation would have to correctly support calls where the second
> argument does have side effects.
>
> --
> Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
> San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
> We must do something. This is something. Therefore, we must do this.



 