getline - sort of

 
 
Bill Waddington
      01-16-2008
This must be a FAQ - or several FAQs - but I can't quite seem
to pin it down.

I need to read a string from stdin which I will then process as digits
with sscanf. I need to limit the # of chars read, and discard the
rest of the line. I also need to detect when more or fewer chars
are input. Something like GNU getline I guess.

Is there a standard portable way to do this w/a library function,
or do I just write it myself and process input a single char at
a time?

This must come up all the time. Sorry to be such a pinhead. Drivers
I can handle. User input, that's another thing entirely...

Thanks,
Bill
--
William D Waddington
(E-Mail Removed)
"Even bugs...are unexpected signposts on
the long road of creativity..." - Ken Burtch
 
Malcolm McLean
      01-16-2008

"Bill Waddington" <(E-Mail Removed)> wrote in message
> This must be a FAQ - or several FAQs - but I can't quite seem
> to pin it down.
>
> I need to read a string from stdin which I will then process as digits
> with sscanf. I need to limit the # of chars read, and discard the
> rest of the line. I also need to detect when more or fewer chars
> are input. Something like GNU getline I guess.
>
> Is there a standard portable way to do this w/a library function,
> or do I just write it myself and process input a single char at
> a time?
>
> This must come up all the time. Sorry to be such a pinhead. Drivers
> I can handle. User input, that's another thing entirely...
>

There's no really good way of getting an unbounded input line.

If the input is line-delimited numbers, your best bet is probably to call
fgets() with a suitably over-large buffer.
Reject any input for which there is no trailing newline.
Then call strtod() to convert the line to a number.

char buff[256];
double x;
char *end;

fgets(buff, 256, stdin);
if(!strchr(buff, "\n"))
{
    /* evil person is passing you a bad line that is over long */
}
x = strtod(buff, &end);
if(end == buff)
{
    /* evil person passed a line that was not a number */
}

At this point x holds the number, and end the address of the first character
after it. You might discard, or process further.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

 
Flash Gordon
      01-16-2008
Malcolm McLean wrote, On 16/01/08 17:39:
>
> "Bill Waddington" <(E-Mail Removed)> wrote in message
>> This must be a FAQ - or several FAQs - but I can't quite seem
>> to pin it down.
>>
>> I need to read a string from stdin which I will then process as digits
>> with sscanf. I need to limit the # of chars read, and discard the
>> rest of the line. I also need to detect when more or fewer chars
>> are input. Something like GNU getline I guess.
>>
>> Is there a standard portable way to do this w/a library function,
>> or do I just write it myself and process input a single char at
>> a time?
>>
>> This must come up all the time. Sorry to be such a pinhead. Drivers
>> I can handle. User input, that's another thing entirely...
>>

> There's no really good way of getting an unbounded input line.
>
> Probably if the input is line-delimited numbers, your best bet is to
> call fgets() with a suitably over-large buffer.


Since the OP has a requirement for an upper limit, an overly large buffer
is *not* required. The OP needs a buffer of the correct size, allowing
for the newline and the '\0' terminator.

> Reject any input for which there is no trailing newline.


Where did the OP say reject? The OP should use lack of a newline to
indicate too long a line and loop to discard the rest, and an early
newline to indicate too short a line.

> Then call strtod() to convert the line to a number.


Where did the OP specify a floating point number? I agree that the strto
functions are generally easier to use than sscanf but as the OP does not
mention decimal points or signs strtoul is more likely to be of use.

> char buff[256];


Manifest constants are bad.

> double x;
> char *end;
>
> fgets(buff, 256, stdin);


Here you repeat your manifest constant leading to a higher chance of
introducing bugs.

Either #define (or enum) the buffer size or use sizeof on the buffer (if
it will be the buffer itself, not a pointer passed in, which seems likely).

Also fgets returns a value. Failure to check it means you will not know
if an end-of-file or error occurs, both of which *can* occur on stdin.

> if(!strchr(buff, "\n"))


strchr takes the character as an int value, NOT a (pointer to) string.
Also the OP probably wants to use the value returned to find out how
long the line was. So that should be more like:
eolptr = strchr(buff,'\n');
if (!eolptr)

> {
> evil person is passing you a bad line that is over long


Do a loop reading until you get either EOF or '\n'. I would use getc
rather than fgetc for this since it may be implemented as a macro and
therefore makes the optimiser's job easier.

> }


else
    linelen = eolptr - buff;

> x = strtod(buff, &end);
> if(end == buff)
> {
> evil person passed a line that was not a number


Or something that started with something other than white space or a
number. We do not know that this is invalid and it might be why the OP
wants to use sscanf.

> }
>
> at this point x holds the number, end the address of the first character
> after it. You might discard, or process further.


Or it could contain an INF or NAN. Also errno should be checked (having
been zeroed prior to the call) so that overflow can easily be detected.
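
The conversion checks raised in this post can be sketched roughly as follows. This is only an illustration, not code from either poster: the helper name parse_double is hypothetical, isfinite() assumes a C99 <math.h>, and treating every ERANGE result (underflow included) as a failure is an assumption made here for brevity.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the start of line to a double.
 * Returns 1 and stores the value in *out on success, 0 otherwise.
 * *end is left where strtod puts it: at the first unconsumed character.
 */
int parse_double(const char *line, double *out, char **end)
{
    double x;

    errno = 0;                  /* must be cleared before the call */
    x = strtod(line, end);
    if (*end == line)
        return 0;               /* nothing was converted */
    if (errno == ERANGE)
        return 0;               /* overflow or underflow was reported */
    if (!isfinite(x))
        return 0;               /* the user typed "inf" or "nan" */
    *out = x;
    return 1;
}
```

Whether "inf" and "nan" should really be rejected depends on the application; the sketch simply shows where that decision has to be made.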
--
Flash Gordon
 
Malcolm McLean
      01-16-2008
"Flash Gordon" <(E-Mail Removed)> wrote in message
>
>> at this point x holds the number, end the address of the first character
>> after it. You might discard, or process further.

>
> Or it could contain an INF or NAN. Also errno should be checked (having
> been zeroed prior to the call) so that overflow can easily be detected.
>

That's why strtod is the better choice, unless you really need massive and
accurate integers.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

 
Flash Gordon
      01-16-2008
Malcolm McLean wrote, On 16/01/08 18:42:
> "Flash Gordon" <(E-Mail Removed)> wrote in message
>>
>>> at this point x holds the number, end the address of the first
>>> character after it. You might discard, or process further.

>>
>> Or it could contain an INF or NAN. Also errno should be checked
>> (having been zeroed prior to the call) so that overflow can easily be
>> detected.
>>

> That's why strtod is the better choice, unless you really need massive
> and accurate integers.


You not using strtod properly is a reason why it should be used? Strange
argument. Anyway...

With strtod you need to check the end pointer, errno and *also* check
for INF and NAN (they can be explicitly supplied by the user and that
might not be flagged). If you only expect integers you *also* have to
check that the user entered an integer (entering 0.5 should in my
opinion not be considered correct). You also have to be concerned about
whether the range of exactly represented integers is large enough.

With strtoul you have to check the end pointer and errno. If the valid
range is smaller than unsigned long you also have to check the range.

Hmm, a description under half the length suggests to me that it is a LOT
simpler to use strtoul, strtoull, strtol or strtoll depending on valid
range.

Also you should note that strtoul etc are actually *designed* for
integers unlike strtod. It is generally best to use a tool designed for
the job when one is available rather than some other tool.
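
A sketch of the strtoul route described above, assuming decimal input and a caller-supplied upper bound. The name parse_ulong and its max parameter are hypothetical. One check is included that the post does not mention: strtoul accepts a leading minus sign and returns the negated value converted to unsigned long, so the sketch insists that the first non-space character is a digit.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned decimal integer no larger
 * than max. Trailing white space (such as the '\n' left by fgets) is
 * allowed. Returns 1 and stores the value in *out on success, else 0.
 */
int parse_ulong(const char *s, unsigned long max, unsigned long *out)
{
    char *end;
    unsigned long v;

    while (isspace((unsigned char)*s))
        s++;
    if (!isdigit((unsigned char)*s))
        return 0;               /* rejects "", "-1", "+1", "x" */

    errno = 0;                  /* cleared so ERANGE can be detected */
    v = strtoul(s, &end, 10);
    if (errno == ERANGE || v > max)
        return 0;               /* does not fit, or above the valid range */

    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;               /* trailing junk such as "10.5" or "12abc" */

    *out = v;
    return 1;
}
```

Because the first character is known to be a digit, the usual end == s test is not needed here; the end pointer is used only to detect trailing junk.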
--
Flash Gordon
 
Randy Howard
      01-16-2008
On Wed, 16 Jan 2008 11:24:21 -0600, Bill Waddington wrote
(in article <(E-Mail Removed)>):

> This must be a FAQ - or several FAQs - but I can't quite seem
> to pin it down.
>
> I need to read a string from stdin which I will then process as digits
> with sscanf. I need to limit the # of chars read, and discard the
> rest of the line. I also need to detect when more or fewer chars
> are input. Something like GNU getline I guess.


Richard has this on his website, which seems to cover the general idea
pretty well, I suspect it might be helpful to you.

http://www.cpax.org.uk/prg/writings/fgetdata.php



--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw





 
Malcolm McLean
      01-16-2008

"Flash Gordon" <(E-Mail Removed)> wrote in message
> Malcolm McLean wrote, On 16/01/08 18:42:
>> "Flash Gordon" <(E-Mail Removed)> wrote in message
>>>
>>>> at this point x holds the number, end the address of the first
>>>> character after it. You might discard, or process further.
>>>
>>> Or it could contain an INF or NAN. Also errno should be checked (having
>>> been zeroed prior to the call) so that overflow can easily be detected.
>>>

>> That's why strtod is the better choice, unless you really need massive
>> and accurate integers.

>
> You not using strtod properly is a reason why it should be used? Strange
> argument. Anyway...
>
> With strtoul you have to check the end pointer and errno. If the valid
> range is smaller than unsigned long you also have to check the range.
>
> Hmm, a description under half the length suggests to me that it is a LOT
> simpler to use strtoul, strtoull, strtol or strtoll depending on valid
> range.
>

If the user passes NAN or INF, it is an open question what the correct
behaviour should be. Having the double set to that value is a good start.
If he passes something non-numeric, endptr will pick it up.
>
> Also you should note that strtoul etc are actually *designed* for integers
> unlike strtod. It is generally best to use a tool designed for the job
> when one is available rather than some other tool.
>

Generally, yes, you should use functions as designed.

In fact strtol and related functions are driven more by the idea that data
is integers, because that's what the machine can crunch efficiently.
Generally data is numerical when it is not strings. It is easier to let
corrupt data be represented as massive floating point numbers which can then
be filtered out during sanity checks further down the line, rather than try
to build a parser for integers.

The exception might be super-safe applications where you don't want even the
remote chance that something like 10.000000000000001 would be changed into
an integral 10 and accepted, when in fact you want to reject such input.
However needing that level of safety is rare.


--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


 
Eric Sosman
      01-16-2008
Bill Waddington wrote:
> This must be a FAQ - or several FAQs - but I can't quite seem
> to pin it down.
>
> I need to read a string from stdin which I will then process as digits
> with sscanf. I need to limit the # of chars read, and discard the
> rest of the line. I also need to detect when more or fewer chars
> are input. Something like GNU getline I guess.
>
> Is there a standard portable way to do this w/a library function,
> or do I just write it myself and process input a single char at
> a time?
>
> This must come up all the time. Sorry to be such a pinhead. Drivers
> I can handle. User input, that's another thing entirely...



Other responders seem to be pointing to solutions of a
related but different problem: Reading an entire line without
knowing how long it might be. Your task seems simpler, and
there's no need to commit canaricide with cannons.

Since you know an upper limit on the line length, you can
use a char array of the appropriate size (including room for
the '\n' and the '\0'), and read a line into it with fgets().
Then use strchr() to find the '\n' at the end of the line. If
a newline is found, fgets() got an entire line and the position
of the newline gives you its length. If not, the line was too
long and part of it remains unread, or else you've reached end-
of-input in a malformed file that lacks a '\n' at the end of its
final line. Either way, you can skip the rest of the line (or
detect the newline-less final line), by calling getc() or getchar()
in a loop until it returns '\n' or EOF.
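
The approach described above might look something like this. It is a sketch under assumptions: the function name, the status enum, and the FILE * parameter (so it works on streams other than stdin) are all hypothetical, and the buffer is assumed to hold the maximum useful length plus two bytes for the '\n' and the '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/*
 * Read one line from fp into buf (size bytes, size >= 2).
 * LINE_OK:       the '\n' is stripped and *len is the line length.
 * LINE_TOO_LONG: buf holds the first size-1 characters; the rest of
 *                the line has been read and discarded.
 * LINE_EOF:      nothing could be read (end of input or read error).
 */
enum line_status read_bounded_line(FILE *fp, char *buf, size_t size,
                                   size_t *len)
{
    char *eol;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return LINE_EOF;

    eol = strchr(buf, '\n');
    if (eol != NULL) {
        *eol = '\0';
        *len = (size_t)(eol - buf);
        return LINE_OK;
    }

    /* No newline stored: a short final line, or an over-long line. */
    *len = strlen(buf);
    if (*len < size - 1)
        return LINE_OK;         /* final line that lacks a '\n' */

    while ((c = getc(fp)) != '\n' && c != EOF)
        continue;               /* skip the unread remainder */
    return LINE_TOO_LONG;
}
```

Called as read_bounded_line(stdin, buf, sizeof buf, &len), a LINE_OK result whose len is below the expected count signals too few characters, and LINE_TOO_LONG signals too many.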

--
(E-Mail Removed)
 
Bill Waddington
      01-16-2008
On Wed, 16 Jan 2008 14:47:50 -0500, Eric Sosman <(E-Mail Removed)>
wrote:

>Bill Waddington wrote:
>> This must be a FAQ - or several FAQs - but I can't quite seem
>> to pin it down.
>>
>> I need to read a string from stdin which I will then process as digits
>> with sscanf. I need to limit the # of chars read, and discard the
>> rest of the line. I also need to detect when more or fewer chars
>> are input. Something like GNU getline I guess.
>>
>> Is there a standard portable way to do this w/a library function,
>> or do I just write it myself and process input a single char at
>> a time?
>>
>> This must come up all the time. Sorry to be such a pinhead. Drivers
>> I can handle. User input, that's another thing entirely...

>
>
>Other responders seem to be pointing to solutions of a
>related but different problem: Reading an entire line without
>knowing how long it might be.


True. I know how long the useful input might be, but need to
allow for less or more and discard any extra input safely.

>Your task seems simpler, and there's no need to commit
>canaricide with cannons.


>Since you know an upper limit on the line length, you can
>use a char array of the appropriate size (including room for
>the '\n' and the '\0'), and read a line into it with fgets().
>Then use strchr() to find the '\n' at the end of the line. If
>a newline is found, fgets() got an entire line and the position
>of the newline gives you its length. If not, the line was too
>long and part of it remains unread, or else you've reached end-
>of-input in a malformed file that lacks a '\n' at the end of its
>final line. Either way, you can skip the rest of the line (or
>detect the newline-less final line), by calling getc() or getchar()
>in a loop until it returns '\n' or EOF.


That, or just grind along w/getc() or getchar() from the start.

If I read all the suggestions correctly, the bottom line is code
it up or borrow it somewhere, but there isn't a single standard
lib way to do it.
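
For comparison, the character-at-a-time alternative could be sketched like this. The function name read_line_getc and its return convention are hypothetical choices for illustration, not anything from the standard library.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one line from fp a character at a time.
 * At most max characters are stored in buf (which needs max + 1 bytes);
 * further characters on the line are counted but discarded.
 * Returns the number of characters on the line, not counting the '\n',
 * or -1 if end of input is reached before any character is read.
 */
long read_line_getc(FILE *fp, char *buf, size_t max)
{
    size_t stored = 0;
    long total = 0;
    int c;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (stored < max)
            buf[stored++] = (char)c;
        total++;
    }
    buf[stored] = '\0';

    if (c == EOF && total == 0)
        return -1;              /* no more input */
    return total;
}
```

A return value greater than max means extra input was discarded, and a value below the expected count means too few characters were entered.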

Thanks to all,
Bill
--
William D Waddington
(E-Mail Removed)
"Even bugs...are unexpected signposts on
the long road of creativity..." - Ken Burtch
 
Flash Gordon
      01-16-2008
Malcolm McLean wrote, On 16/01/08 19:16:
>
> "Flash Gordon" <(E-Mail Removed)> wrote in message
>> Malcolm McLean wrote, On 16/01/08 18:42:
>>> "Flash Gordon" <(E-Mail Removed)> wrote in message
>>>>
>>>>> at this point x holds the number, end the address of the first
>>>>> character after it. You might discard, or process further.
>>>>
>>>> Or it could contain an INF or NAN. Also errno should be checked
>>>> (having been zeroed prior to the call) so that overflow can easily
>>>> be detected.
>>>>
>>> That's why strtod is the better choice, unless you really need
>>> massive and accurate integers.

>>
>> You not using strtod properly is a reason why it should be used?
>> Strange argument. Anyway...
>>
>> With strtoul you have to check the end pointer and errno. If the valid
>> range is smaller than unsigned long you also have to check the range.
>>
>> Hmm, a description under half the length suggests to me that it is a
>> LOT simpler to use strtoul, strtoull, strtol or strtoll depending on
>> valid range.
>>

> If the user passes NAN or INF, it is an open question what the correct
> behaviour should be. Having the double set to that value is a good start.
> If he passes something non-numeric, endptr will pick it up.
>>
>> Also you should note that strtoul etc are actually *designed* for
>> integers unlike strtod. It is generally best to use a tool designed
>> for the job when one is available rather than some other tool.
>>

> Generally, yes, you should use functions as designed.
>
> In fact strtol and related functions are driven more by the idea that
> data is integers, because that's what the machine can crunch efficiently.


No, they are driven by the idea that there are a lot of situations where
you are dealing with integers.

> Generally data is numerical when it is not strings.


I do a lot of interfacing with other systems. A lot of the time the
numbers are provided as integers (numbers of pennies etc.)

> It is easier to let
> corrupt data be represented as massive floating point numbers which can
> then be filtered out during sanity checks further down the line, rather
> than try to build a parser for integers.


You do not have to build a parser for integers; those nice library
writers have done the job for you.

> The exception might be super-safe applications where you don't want even
> the remote chance that something like 10.000000000000001 would be
> changed into an integral 10 and accepted, when in fact you want to
> reject such input. However needing that level of safety is rare.


No, the exception is where the requirement is to accept an integer, and
this is a very common situation.
--
Flash Gordon
 