Correct behaviour of scanf and sscanf

 
 
Rob Thorpe
Guest
Posts: n/a
 
      03-14-2005
Given the code:-

r = sscanf (s, "%lf", x);

What is the correct output if the string s is simply "-" ?

If "-" is considered the beginning of a number that has been
cut short, then the correct output is r = EOF. If it is taken to
be a letter in the stream, then the output should be r = 0, as far as
I can see. My compiler gives EOF.

Does the standard specify which is correct?
 
Mac
Guest
Posts: n/a
 
      03-15-2005
On Mon, 14 Mar 2005 10:44:13 -0800, Rob Thorpe wrote:

> Given the code:-
>
> r = sscanf (s, "%lf", x);
>
> What is the correct output if the string s is simply "-" ?
>
> If "-" is considered the beginning of a number, that has been
> cut-short then the correct output is that r = EOF. If it is taken to
> be a letter in the stream, then the output should be r = 0, as far as
> I can see. My compiler gives EOF.
>
> Does the standard specify which is correct?


From C-89 (or a reasonable facsimile thereof)

*************beginning of excerpt ***************

4.9.6.6 The sscanf function

[snip]

Returns

The sscanf function returns the value of the macro EOF if an input
failure occurs before any conversion. Otherwise, the sscanf function
returns the number of input items assigned, which can be fewer than
provided for, or even zero, in the event of an early matching failure.

*********** end of excerpt **************


The behavior you describe sounds correct to me. sscanf() is supposed to
return EOF if an input failure occurs. Elsewhere, it says that
encountering the end of the string is identical to encountering the end
of file in a fscanf() call. So this is just like calling fscanf() on a text
file which has the single character '-' in it.

HTH

--Mac

 
Eric Sosman
Guest
Posts: n/a
 
      03-15-2005
Rob Thorpe wrote:

> Given the code:-
>
> r = sscanf (s, "%lf", x);
>
> What is the correct output if the string s is simply "-" ?
>
> If "-" is considered the beginning of a number, that has been
> cut-short then the correct output is that r = EOF. If it is taken to
> be a letter in the stream, then the output should be r = 0, as far as
> I can see. My compiler gives EOF.
>
> Does the standard specify which is correct?


Haven't seen a reply in the several hours since I first
saw the message, so (fools rush in ...) I'll hazard a guess.

The Standard speaks of two sorts of failure for *scanf()
directives: "matching failure," which amounts to an input
sequence that doesn't satisfy the syntax required by the
directive, and "input failure," meaning that the source of
input characters dried up -- for the stream-input versions
this means EOF was sensed, and for sscanf() it means the
scan reached the end of the string. On a matching failure,
*scanf() stops operating and returns the number of items
already matched and converted (0, in your example), while
for an input failure *scanf() returns EOF.
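
A minimal sketch of how a caller can tell the two outcomes apart. The enum and the helper classify_double() are hypothetical names made up for illustration; only sscanf() and EOF come from the Standard.

```c
#include <stdio.h>

/* Hypothetical names for illustration only (not from the Standard). */
enum scan_result { SCAN_OK, SCAN_MATCH_FAILURE, SCAN_INPUT_FAILURE };

/* Classify what sscanf() reports for a single "%lf" conversion. */
enum scan_result classify_double(const char *s, double *out)
{
    int r = sscanf(s, "%lf", out);

    if (r == 1)
        return SCAN_OK;             /* one item converted and assigned */
    if (r == EOF)
        return SCAN_INPUT_FAILURE;  /* input ran out before any conversion */
    return SCAN_MATCH_FAILURE;      /* r == 0: input did not fit the directive */
}
```

An empty string gives the input-failure case and a string such as "x" gives the matching-failure case. Which of the two a lone "-" produces is exactly the open question, so the sketch makes no claim about it.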

So the question boils down to this: When "%lf" processes
"-", is the failure a matching failure or an input failure?

One point of view considers it a matching failure. The
characters for "%f" are supposed to be something strtod() would
swallow: an optional all-whitespace prefix, an optional sign,
and then a character string resembling a floating-point constant
as written in C source code. The string "-" doesn't match this
description (the floating-point constant is missing), so it could
be called a matching failure.
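
For comparison, a small sketch of what strtod() itself does with such input. The helper name strtod_consumed() is an assumption for illustration; it only reports how many characters strtod() accepted.

```c
#include <stdlib.h>

/* Hypothetical helper: store the converted value in *out and return the
   number of characters strtod() accepted from the front of s. */
size_t strtod_consumed(const char *s, double *out)
{
    char *end;

    *out = strtod(s, &end);
    /* When no conversion is performed, strtod() sets end equal to s,
       so the result is 0. */
    return (size_t)(end - s);
}
```

For "-" this returns 0 with *out set to zero: strtod() performs no conversion on a bare sign. For "-1.5" it returns 4.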

The other viewpoint holds that no "mismatch" was detected
before end-of-string, so it's an input failure. The sequence
of characters is perfectly good as the prefix of a valid match,
and the only thing preventing a complete match is the fact that
no more input was available. Hence (says this argument), the
operation ends with an input failure rather than a matching
failure, and EOF is the correct return value.

IMHO the Standard is not entirely clear about which argument
is correct: is an incomplete prefix a failure to match, or a
failure of the input source? To me, the language of the Standard
doesn't shine enough light into this dark corner -- but if anyone
happens to have a torch to hand, I'd welcome illumination ...

Trying to put myself in the place of an implementor, I'd
imagine the input failure (EOF) outcome would be "more natural,"
but I don't think the Standard's language actually says so in
so many words.

The fool has rushed in; tread, o ye angels!

--
Eric Sosman
(E-Mail Removed)
 
CBFalconer
Guest
Posts: n/a
 
      03-15-2005
Rob Thorpe wrote:
>
> Given the code:-
>
> r = sscanf (s, "%lf", x);
>
> What is the correct output if the string s is simply "-" ?
>
> If "-" is considered the beginning of a number, that has been
> cut-short then the correct output is that r = EOF. If it is
> taken to be a letter in the stream, then the output should be
> r = 0, as far as I can see. My compiler gives EOF.
>
> Does the standard specify which is correct?


The problem is that detection of such a lone '-' requires reading
two characters, the second of which is not a digit. C only
guarantees one level of pushback via ungetc, so whatever routine is
doing the parsing (such as scanf) cannot leave the input stream
unaltered and report 'No number available'. With string sources
this obviously does not apply. So the question is "should the
string and stream operations function in the same manner". A
similar (but worse) problem arises after reading the e in floating
point formats: scanning "3.0e-x" should return 3.0, which means
pushing back three chars.

I think the proper thing would be to guarantee three level
pushback, maybe in C05. This requires defining what is to be done
when an application attempts excess pushback.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson


 
tigervamp
Guest
Posts: n/a
 
      03-15-2005

CBFalconer wrote:
> Rob Thorpe wrote:
> >
> > Given the code:-
> >
> > r = sscanf (s, "%lf", x);
> >
> > What is the correct output if the string s is simply "-" ?
> >
> > If "-" is considered the beginning of a number, that has been
> > cut-short then the correct output is that r = EOF. If it is
> > taken to be a letter in the stream, then the output should be
> > r = 0, as far as I can see. My compiler gives EOF.
> >
> > Does the standard specify which is correct?

>
> The problem is that detection of such a lone '-' requires reading
> two characters, the second of which is not a digit. C only
> guarantees one level of pushback via ungetc,


Footnote 242 in section 7.19.6.2 (fscanf) indicates that a _maximum_ of
one character can be pushed back, and the standard does not say that
sscanf behaves differently.

> so whatever routine is
> doing the parsing (such as scanf) cannot leave the input stream
> unaltered and report 'No number available'. With string sources
> this obviously does not apply. So the question is "should the
> string and stream operations function in the same manner".


According to the standard they should.

> A similar (but worse) problem arises after reading the e in floating
> point formats. "3.0e-x" should return 3.0 and have to push back
> three chars.


fscanf should consume "3.0e-", read the "x", recognize a matching
failure, push the "x" back onto the stream, and return 0. This is the
behavior defined in example 3 of section 7.19.6.2p20 (fscanf), and again
the standard specifies that sscanf should behave the same.
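
A hedged sketch for probing this on a given implementation: it runs strtod() and sscanf() over the same string and records both results. The struct parse_report and the function compare_parsers() are hypothetical names. Only the strtod() fields are predictable from the standard; the sscanf() result is recorded as-is because implementations differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical report type for illustration. */
struct parse_report {
    double strtod_value;  /* value strtod() produced */
    size_t strtod_used;   /* characters strtod() accepted */
    int    sscanf_ret;    /* whatever the local sscanf() returned */
};

struct parse_report compare_parsers(const char *s)
{
    struct parse_report rep;
    char *end;
    double ignored = 0.0;

    /* strtod() takes the longest prefix that is a complete number. */
    rep.strtod_value = strtod(s, &end);
    rep.strtod_used = (size_t)(end - s);

    /* sscanf() may fail here instead, since it can commit to "3.0e-". */
    rep.sscanf_ret = sscanf(s, "%lf", &ignored);
    return rep;
}
```

For "3.0e-x", strtod() yields 3.0 after accepting three characters ("3.0"). Comparing that with sscanf_ret shows whether the local library follows the one-character pushback reading or backs up further.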

I think that in the OP's case the behavior should be similar and the
return value should be 0; glibc does this and I think they are right
here. From what I can tell, EOF is never returned if a character was
read (regardless of whether it matched or was pushed back), but I may
well be wrong.

> I think the proper thing would be to guarantee three level
> pushback, maybe in C05. This requires defining what is to be done
> when an application attempts excess pushback


I think the current behavior is pretty clear and well-defined, but
notable implementations do not follow it. Solaris and glibc both push
back multiple characters to achieve the output you described above.
Details about the Solaris behavior can be found at
http://iforce.sun.com/protected/sola...eral/scanf.txt;
apparently there are instances that require at least 5 characters to be
pushed back to follow the behavior you outlined.

> --
> "If you want to post a followup via groups.google.com, don't use
> the broken "Reply" link at the bottom of the article. Click on
> "show options" at the top of the article, then click on the
> "Reply" at the bottom of the article headers." - Keith Thompson


Rob Gamble

 
Rob Thorpe
Guest
Posts: n/a
 
      03-15-2005
CBFalconer <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> Rob Thorpe wrote:
> >
> > Given the code:-
> >
> > r = sscanf (s, "%lf", x);
> >
> > What is the correct output if the string s is simply "-" ?
> >
> > If "-" is considered the beginning of a number, that has been
> > cut-short then the correct output is that r = EOF. If it is
> > taken to be a letter in the stream, then the output should be
> > r = 0, as far as I can see. My compiler gives EOF.
> >
> > Does the standard specify which is correct?

>
> The problem is that detection of such a lone '-' requires reading
> two characters, the second of which is not a digit. C only
> guarantees one level of pushback via ungetc, so whatever routine is
> doing the parsing (such as scanf) cannot leave the input stream
> unaltered and report 'No number available'. With string sources
> this obviously does not apply. So the question is "should the
> string and stream operations function in the same manner". A
> similar (but worse) problem arises after reading the e in floating
> point formats. "3.0e-x" should return 3.0 and have to push back
> three chars.
>
> I think the proper thing would be to guarantee three level
> pushback, maybe in C05. This requires defining what is to be done
> when an application attempts excess pushback


Thanks, that explains it.

I wondered if testing for both 0 and EOF is OTT, but since it works as
you describe it's necessary in very many situations.
 
Dan Pop
Guest
Posts: n/a
 
      03-15-2005
In <(E-Mail Removed)> Eric Sosman <(E-Mail Removed)> writes:

>Rob Thorpe wrote:
>
>> Given the code:-
>>
>> r = sscanf (s, "%lf", x);
>>
>> What is the correct output if the string s is simply "-" ?
>>
>> If "-" is considered the beginning of a number, that has been
>> cut-short then the correct output is that r = EOF. If it is taken to
>> be a letter in the stream, then the output should be r = 0, as far as
>> I can see. My compiler gives EOF.
>>
>> Does the standard specify which is correct?

>
> Haven't seen a reply in the several hours since I first
>saw the message, so (fools rush in ...) I'll hazard a guess.
>
> The Standard speaks of two sorts of failure for *scanf()
>directives: "matching failure," which amounts to an input
>sequence that doesn't satisfy the syntax required by the
>directive, and "input failure," meaning that the source of
>input characters dried up -- for the stream-input versions
>this means EOF was sensed, and for sscanf() it means the
>scan reached the end of the string. On a matching failure,
>*scanf() stops operating and returns the number of items
>already matched and converted (0, in your example), while
>for an input failure *scanf() returns EOF.
>
> So the question boils down to this: When "%lf" processes
>"-", is the failure a matching failure or an input failure?
>
> One point of view considers it a matching failure. The
>characters for "%f" are supposed to be something strtod() would
>swallow: an optional all-whitespace prefix, an optional sign,
>and then a character string resembling a floating-point constant
>as written in C source code. The string "-" doesn't match this
>description (the floating-point constant is missing), so it could
>be called a matching failure.
>
> The other viewpoint holds that no "mismatch" was detected
>before end-of-string, so it's an input failure. The sequence
>of characters is perfectly good as the prefix of a valid match,
>and the only thing preventing a complete match is the fact that
>no more input was available. Hence (says this argument), the
>operation ends with an input failure rather than a matching
>failure, and EOF is the correct return value.
>
> IMHO the Standard is not entirely clear about which argument
>is correct: is an incomplete prefix a failure to match, or a
>failure of the input source? To me, the language of the Standard
>doesn't shine enough light into this dark corner -- but if anyone
>happens to have a torch to hand, I'd welcome illumination ...


An incomplete prefix followed by an end of file condition cannot be a
matching failure the way "- " or "-foo" would be. We're clearly in the
case where an input failure occurred before any conversion, just as if
the input were an empty string.

I agree that the text of the standard is less than crystal clear and I
wouldn't be surprised to see different behaviours on different
implementations. OTOH, as an implementation user, especially in the
case of sscanf, I see no problem: if the function doesn't return 1, it is
obvious that the input string doesn't contain a valid number.
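
A minimal sketch of that user-side check, assuming a hypothetical wrapper named parse_double(). It treats every return value other than 1 as "no number", so the 0-versus-EOF question never reaches the caller.

```c
#include <stdio.h>

/* Hypothetical wrapper: returns 1 and stores the value on success,
   0 otherwise. Both 0 (matching failure) and EOF (input failure)
   are folded into the single "no number" result. */
int parse_double(const char *s, double *out)
{
    return sscanf(s, "%lf", out) == 1;
}
```

With this, "-", "-foo" and the empty string are all rejected regardless of which failure the library reports. Note that the check says nothing about trailing characters: "2.5abc" is still accepted as 2.5.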

Dan
--
Dan Pop <(E-Mail Removed)>
 