string validation for int and long (strisint, strislong)

 
 
jake1138
      02-24-2005
Maybe this is a newbie thing and everyone already knows how to do this,
but I figured I'd post these functions anyway in case someone finds
them useful. I used Jack Klein's example (see link below) and made
functions out of it.

http://home.att.net/~jackklein/c/code/strtol.html

Here are functions that will validate a string to see if it represents
an integer or long, respectively:

#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * checks to see if string is an integer
 * (the size parameter is not used)
 *
 * returns 1 if true, 0 if false
 */
int strisint(const char *str, size_t size)
{
    char *endptr;
    long longint;

    errno = 0;
    longint = strtol(str, &endptr, 10);
    if (errno == ERANGE || longint < INT_MIN || longint > INT_MAX
        || endptr == str || *endptr != '\0') {
        return 0;
    } else {
        return 1;
    }
}

/*
 * checks to see if string is a long integer
 *
 * returns 1 if true, 0 if false
 */
int strislong(const char *str, size_t size)
{
    char *endptr;
    long longint;

    errno = 0;
    longint = strtol(str, &endptr, 10);
    if (errno == ERANGE || endptr == str || *endptr != '\0') {
        return 0;
    } else {
        return 1;
    }
}

 
Peter Nilsson
      02-25-2005
jake1138 wrote:
> Maybe this is a newbie thing and everyone already knows how to
> do this, but I figured I'd post these functions anyway in case
> someone finds them useful. ...
>
> int strisint(const char *str, size_t size)
> int strislong(const char *str, size_t size)


Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

> {
> char *endptr;
> long longint;
>
> errno = 0;


You should allow the caller to preserve the prior errno if
strtol succeeds...

int errno_save = errno;
errno = 0;
...
if (errno) return 0;
errno_save = errno;
return endptr == str || *(endptr + strspn(endptr, " \t"));

> longint = strtol(str, &endptr, 10);
> if (errno == ERANGE || endptr == str || *endptr != '\0') {
> return 0;
> } else {
> return 1;
> }
> }


One problem with these wrappers is that they don't return the
converted value, so the caller is likely going to have to call a
conversion routine like strtol _anyway_!

Testing for a decimal number, without conversion, can be done
with something like...

#include <ctype.h>

int is_number(const char *p)
{
int d = 0;
const unsigned char *up = (const unsigned char *) p;
while (isspace(*up)) up++;
if (*up == '+' || *up == '-') up++;
while (isdigit(*up++)) d = 1;
return d && *up == 0;
}

--
Peter

 
jake1138
      02-28-2005
Thanks for your comments. Read on...

Peter Nilsson wrote:
> jake1138 wrote:
> > Maybe this is a newbie thing and everyone already knows how to
> > do this, but I figured I'd post these functions anyway in case
> > someone finds them useful. ...
> >
> > int strisint(const char *str, size_t size)
> > int strislong(const char *str, size_t size)

>
> Both function identifiers are reserved for use as external
> identifiers, since they begin with str and are followed
> by a lowercase letter. [Capitalising one or more letters
> won't help under C89's potential case insensitive linking.]


I've never heard that before (but I'm fairly new at C). In fact, I've
never read much at all about the rules of usage with the library
functions. Perhaps you could point me to documentation?

>
> > {
> > char *endptr;
> > long longint;
> >
> > errno = 0;

>
> You should allow the caller to preserve the prior errno if
> strtol succeeds...
>
> int errno_save = errno;
> errno = 0;
> ...
> if (errno) return 0;
> errno_save = errno;
> return endptr == str || *(endptr + strspn(endptr, " \t"));


Did you mean "errno = errno_save;" on line 5? I'm not sure what that
buys you since I believe any function can potentially change the value
of errno and thus you should not rely on it being preserved between
function calls. It seems to me if the caller wants the value of errno,
they would be expected to store it before calling any given function.
Am I missing something?

>
> > longint = strtol(str, &endptr, 10);
> > if (errno == ERANGE || endptr == str || *endptr != '\0') {
> > return 0;
> > } else {
> > return 1;
> > }
> > }

>
> One problem with these wrappers is that they don't return the
> converted value, so the caller is likely going to have to call a
> conversion routine like strtol _anyway_!


That is by design. I have these validation functions for several
reasons:
1) I can log an error and exit when I detect invalid input data.
2) I can use atoi (which doesn't detect errors).
3) I can write simple code: read -> validate -> convert -> store

> Testing for a decimal number, without conversion, can be done
> with something like...
>
> #include <ctype.h>
>
> int is_number(const char *p)
> {
> int d = 0;
> const unsigned char *up = (const unsigned char *) p;
> while (isspace(*up)) up++;
> if (*up == '+' || *up == '-') up++;
> while (isdigit(*up++)) d = 1;
> return d && *up == 0;
> }


I don't see how this handles the decimal in a decimal number. If you
pass in "3.14", it will fail. If you meant an integer number, then
this works except it doesn't handle invalid sizes. I guess the caller
could check against INT_MIN and INT_MAX, but I'd rather that be in the
validation routine.

 
Peter Nilsson
      03-01-2005
jake1138 wrote:
> Peter Nilsson wrote:
> > jake1138 wrote:
> > > Maybe this is a newbie thing and everyone already knows how to
> > > do this, but I figured I'd post these functions anyway in case
> > > someone finds them useful. ...
> > >
> > > int strisint(const char *str, size_t size)
> > > int strislong(const char *str, size_t size)

> >
> > Both function identifiers are reserved for use as external
> > identifiers, since they begin with str and are followed
> > by a lowercase letter. [Capitalising one or more letters
> > won't help under C89's potential case insensitive linking.]

>
> I've never heard that before (but I'm fairly new at C). In fact,
> I've never read much at all about the rules of usage with the
> library functions. Perhaps you could point me to documentation?


The standards, or even just the public drafts. N869 is available
for public reading. The relevant section is 7.26 Future library
directions.

> > > {
> > > char *endptr;
> > > long longint;
> > >
> > > errno = 0;

> >
> > You should allow the caller to preserve the prior errno if
> > strtol succeeds...
> >
> > int errno_save = errno;
> > errno = 0;
> > ...
> > if (errno) return 0;
> > errno_save = errno;
> > return endptr == str || *(endptr + strspn(endptr, " \t"));

>
> Did you mean "errno = errno_save;" on line 5?


Yes. Thanks.
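
With that correction applied, the save-and-restore idea might look like the sketch below. The name text_is_long is hypothetical (picked to stay clear of the reserved str prefix), and restoring errno only when strtol left it at zero is one assumed reading of "preserve the prior errno if strtol succeeds".

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 if s is a decimal long, 0 otherwise.
 * If strtol reports nothing through errno, the caller's previous
 * errno value is put back; a range error is left visible as ERANGE.
 */
int text_is_long(const char *s)
{
    char *endptr;
    int saved_errno = errno;   /* remember what the caller had */
    int ok;

    errno = 0;
    (void)strtol(s, &endptr, 10);
    ok = errno == 0 && endptr != s && *endptr == '\0';

    if (errno == 0) {
        errno = saved_errno;   /* strtol set nothing: restore */
    }
    return ok;
}
```

A caller that had EDOM in errno before a successful call would still see EDOM afterwards.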

> I'm not sure what that buys you since I believe any function can
> potentially change the value of errno and thus you should not rely
> on it being preserved between function calls. It seems to me if
> the caller wants the value of errno, they would be expected to
> store it before calling any given function. Am I missing
> something?


Note that I said "if strtol succeeds..."

No standard library function is allowed to set errno to 0. If you
think about this, you'll realise that this allows the caller to
delay error detection. If the caller sets errno to zero, then rather
than having to check errno after every function, it can delay the
test until later, possibly culling multiple tests in the process.

Apart from general efficiency, it helps to make programs more
readable, since they are not cluttered with repeated tests.

> > Testing for a decimal number, without conversion, can be done
> > with something like...
> >
> > #include <ctype.h>
> >
> > int is_number(const char *p)
> > {
> > int d = 0;
> > const unsigned char *up = (const unsigned char *) p;
> > while (isspace(*up)) up++;
> > if (*up == '+' || *up == '-') up++;
> > while (isdigit(*up++)) d = 1;
> > return d && *up == 0;
> > }

>
> I don't see how this handles the decimal in a decimal number. If
> you pass in "3.14", it will fail.


As will strtol. Decimal is a number base name, not necessarily a
distinction between integer and floating point (which has decimal
_points_.) But then I could be wrong, according to whichever
literature you read. No matter...

> If you meant an integer number, then
> this works except it doesn't handle invalid sizes.


True.

> I guess the caller could check against INT_MIN and INT_MAX, but
> I'd rather that be in the validation routine.


I understand what you're doing, but realise that robustness would see
your processing routines checking for the same errors that your
validation suite is supposed to detect.

Reviewing the design of your (hypothetical) program, I would ask:
Why aren't you processing, or at least storing, the converted data
as you validate?

--
Peter

 
Chris Torek
      03-01-2005
Peter Nilsson wrote:
>No standard library function is allowed to set errno to 0. If you
>think about this, you'll realise that this allows the caller to
>delay error detection. If the caller sets errno to zero, then rather
>than having to check errno after every function, it can delay the
>test until later, possibly culling multiple tests in the process.


The premise here is correct (no standard library function can clear
errno), but the conclusion is not. Successful operations are
allowed to set errno to some nonzero value. Hence:

errno = 0;
do_some_work();
if (errno) ...

may misfire, thinking something went wrong when all went well. I
think this was a bad design decision (not that errno itself is
exactly wonderful), but it is in the C standards, so we must
live with it.

As a practical matter, many (far too many) Unix-derived systems
actually do set errno to ENOTTY on the first successful I/O from
or to a (non-device) file. If you have ever had email returned
with something like:

Subject: cannot send mail to joe.typo@host: Not a typewriter

this is the reason. Someone did an "errno = 0; do_some_work();
if (errno)". In this case, the work involved looking up the
user (whose name has a typo and hence does not exist); along
the way, something did some I/O; this set errno to ENOTTY, and
strerror(ENOTTY) is "Not a typewriter". Well, of course Joe
is not a typewriter, but what has that to do with anything?
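
The misfire can be sketched in a few lines. Everything here is hypothetical: do_some_work is a stand-in that succeeds (returns 0) yet leaves errno nonzero, as a successful library call is permitted to do, and EDOM is used only because it is one of the few errno values standard C guarantees.

```c
#include <errno.h>

/* Hypothetical stand-in: succeeds, but disturbs errno on the way. */
int do_some_work(void)
{
    errno = EDOM;   /* side effect of some inner call that "succeeded" */
    return 0;       /* 0 means success */
}

/* Fragile: treats any nonzero errno as a failure. */
int work_failed_fragile(void)
{
    errno = 0;
    do_some_work();
    return errno != 0;
}

/* Sturdier: trusts the routine's own status and ignores errno
 * unless a failure has actually been reported. */
int work_failed_robust(void)
{
    return do_some_work() != 0;
}
```

The fragile check reports a failure here even though nothing went wrong; the robust one does not.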
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
Eric Sosman
      03-01-2005
jake1138 wrote:
> Peter Nilsson wrote:
>>jake1138 wrote:
>>>[...]
>>>int strisint(const char *str, size_t size)
>>>int strislong(const char *str, size_t size)

>>
>>Both function identifiers are reserved for use as external
>>identifiers, since they begin with str and are followed
>>by a lowercase letter. [Capitalising one or more letters
>>won't help under C89's potential case insensitive linking.]

>
> I've never heard that before (but I'm fairly new at C). In fact, I've
> never read much at all about the rules of usage with the library
> functions. Perhaps you could point me to documentation?


A useful compilation of names to avoid can be found at

http://www.oakroadsystems.com/tech/c-predef.htm

--
Eric Sosman
 
Peter Nilsson
      03-02-2005
Chris Torek wrote:
> Peter Nilsson wrote:
> >
> > No standard library function is allowed to set errno to 0. If
> > you think about this, you'll realise that this allows the caller
> > to delay error detection. If the caller sets errno to zero, then
> > rather than having to check errno after every function, it can
> > delay the test until later, possibly culling multiple tests in
> > the process.

>
> The premise here is correct (no standard library function can clear
> errno), but the conclusion is not. Successful operations are
> allowed to set errno to some nonzero value. Hence:
>
> errno = 0;
> do_some_work();
> if (errno) ...
>
> may misfire, thinking something went wrong when all went well. I
> think this was a bad design decision (not that errno itself is
> exactly wonderful), but it is in the C standards, so we must
> live with it.


Quite right, but where standard functions _are_ required to set
errno on certain conditions, the standards preclude such functions
from setting errno to other values, outside of those precise
conditions.

So it is certainly possible to perform bulk strtol calculations,
deferring error detection till later.
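
As a rough sketch of such a bulk conversion: convert_all is a hypothetical helper, and it defers only the range check. Malformed input (no digits, trailing junk) would still need a per-call endptr test, which is left out here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: converts n strings to long and tests for
 * overflow once, after the loop. Returns 1 if every value was in
 * range, 0 if any call set errno to ERANGE.
 */
int convert_all(const char *const text[], long out[], size_t n)
{
    size_t i;
    int saved_errno = errno;
    int in_range;

    errno = 0;
    for (i = 0; i < n; i++) {
        out[i] = strtol(text[i], NULL, 10);   /* no check per call */
    }
    in_range = (errno != ERANGE);             /* single deferred check */

    if (in_range) {
        errno = saved_errno;                  /* nothing to report */
    }
    return in_range;
}
```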

--
Peter

 
jake1138
      03-03-2005
Peter Nilsson wrote:
> > I'm not sure what that buys you since I believe any function can
> > potentially change the value of errno and thus you should not rely
> > on it being preserved between function calls. It seems to me if
> > the caller wants the value of errno, they would be expected to
> > store it before calling any given function. Am I missing
> > something?

>
> Note that I said "if strtol succeeds..."
>
> No standard library function is allowed to set errno to 0. If you
> think about this, you'll realise that this allows the caller to
> delay error detection. If the caller sets errno to zero, then rather
> than having to check errno after every function, it can delay the
> test until later, possibly culling multiple tests in the process.
>
> Apart from general efficiency, it helps to make programs more
> readable, since they are not cluttered with repeated tests.


I see.

> > I guess the caller could check against INT_MIN and INT_MAX, but
> > I'd rather that be in the validation routine.

>
> I understand what you're doing, but realise that robustness would see
> your processing routines checking for the same errors that your
> validation suite are supposed to detect.
>
> Reviewing the design of your (hypothetical) program, I would ask:
> Why aren't you processing, or at least storing, the converted data
> as you validate?


Because I'm stupid. No, I just didn't think about it that way at
first. I realize now it makes more sense to do both with one routine.
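
Something like the following is what a combined routine could look like. It is only a sketch: parse_int is a hypothetical name, and it accepts exactly what strtol accepts in base 10 (optional leading whitespace and sign) with nothing after the digits.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: validates and converts in one step.
 * Returns 1 and stores the value in *out on success;
 * returns 0 and leaves *out untouched on failure.
 */
int parse_int(const char *s, int *out)
{
    char *endptr;
    long value;

    errno = 0;
    value = strtol(s, &endptr, 10);

    if (errno == ERANGE || endptr == s || *endptr != '\0') {
        return 0;   /* overflowed long, no digits, or trailing junk */
    }
    if (value < INT_MIN || value > INT_MAX) {
        return 0;   /* fits in long but not in int */
    }
    *out = (int)value;
    return 1;
}
```

The caller then does read -> parse_int -> store, and logs an error when the return value is 0, without a separate atoi call.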

 
jake1138
      03-03-2005
Eric Sosman wrote:
>
> A useful compilation of names to avoid can be found at
>
> http://www.oakroadsystems.com/tech/c-predef.htm


Thanks.

 
Dave Thompson
      03-07-2005
On 24 Feb 2005 19:05:38 -0800, "Peter Nilsson" wrote:

> jake1138 wrote:
> > Maybe this is a newbie thing and everyone already knows how to
> > do this, but I figured I'd post these functions anyway in case
> > someone finds them useful. ...
> >
> > int strisint(const char *str, size_t size)
> > int strislong(const char *str, size_t size)

>
> Both function identifiers are reserved for use as external
> identifiers, since they begin with str and are followed
> by a lowercase letter. [Capitalising one or more letters
> won't help under C89's potential case insensitive linking.]
>

And so is, or would be, any name matching is[a-z]*. Unfortunately.

That said, C89 is officially obsolete, and I haven't heard of any
linkers with the case-insensitive or 6-char problems for a long time now.

<snip>
> Testing for a decimal number, without conversion, can be done
> with something like...
>
> #include <ctype.h>
>
> int is_number(const char *p)
> {
> int d = 0;
> const unsigned char *up = (const unsigned char *) p;
> while (isspace(*up)) up++;
> if (*up == '+' || *up == '-') up++;
> while (isdigit(*up++)) d = 1;
> return d && *up == 0;


Needs to be *--up or up[-1]: the final up++ in the loop has already
stepped past the character that ended the digits.

> }


But this doesn't check that _the value is in range_ for int, or long, or
whatever, as the OP's versions did, and as you often need or want. And it
can't be fixed to do so without doing at least most of the conversion.
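
For reference, a sketch of is_number with the pointer fix folded in: here the pointer advances only while it is on a digit, which is an alternative spelling of the up[-1] correction. It still makes no range check, as noted.

```c
#include <ctype.h>

/*
 * Returns 1 if p holds optional whitespace, an optional sign and
 * at least one decimal digit, with nothing after the digits.
 * Performs no range check against int or long.
 */
int is_number(const char *p)
{
    int seen_digit = 0;
    const unsigned char *up = (const unsigned char *)p;

    while (isspace(*up)) {
        up++;
    }
    if (*up == '+' || *up == '-') {
        up++;
    }
    while (isdigit(*up)) {   /* advance only past actual digits */
        seen_digit = 1;
        up++;
    }
    return seen_digit && *up == '\0';
}
```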


- David.Thompson1 at worldnet.att.net
 