Does strtok require a non-null token?

 
 
ryampolsky@gmail.com
 
      10-12-2006
I'm using strtok to break apart a colon-delimited string. It basically
works, but it looks like strtok skips over empty sections. In other
words, if the string has 2 colons in a row, it doesn't treat that as a
null token, it just treats the 2 colons as a single delimiter.

Is that the intended behavior?

 
Al Balmer
 
      10-12-2006
On 12 Oct 2006 11:38:36 -0700, (E-Mail Removed) wrote:

>I'm using strtok to break apart a colon-delimited string. It basically
>works, but it looks like strtok skips over empty sections. In other
>words, if the string has 2 colons in a row, it doesn't treat that as a
>null token, it just treats the 2 colons as a single delimiter.
>
>Is that the intended behavior?


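Yes. This is one of the drawbacks of strtok. From the current
position, it searches for a character *not* in the delimiter set, sets
this position as the return pointer, then searches for the first
character that *is* in the delimiter set and overwrites it with a null
character. Because the first search skips every leading delimiter, a
run of adjacent delimiters can never produce an empty token.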

(Individual implementations may be different, but that's the way it's
required to behave.)
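
The two-step search described above can be sketched roughly as follows. This is an illustration only, not how any particular C library implements strtok. The name sketch_tok and the explicit save parameter are made up for the example; the real strtok keeps that position in hidden internal state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper that mirrors the two-step search described above.
 * *save plays the role of strtok()'s hidden "current position".
 * Pass the string on the first call and NULL on later calls. */
char *sketch_tok(char *s, const char *delims, char **save)
{
    char *tok;

    assert(delims != NULL && save != NULL);
    if (s == NULL)
        s = *save;                  /* continue where the last call stopped */

    s += strspn(s, delims);         /* step 1: skip every leading delimiter */
    if (*s == '\0') {
        *save = s;                  /* only delimiters (or nothing) left */
        return NULL;
    }

    tok = s;                        /* this is the return pointer */
    s += strcspn(s, delims);        /* step 2: find the next delimiter */
    if (*s != '\0') {
        *s = '\0';                  /* overwrite it with a null character */
        s++;                        /* resume after it next time */
    }
    *save = s;
    return tok;
}
```

With the input "a::b" and the delimiter set ":", successive calls return "a", then "b", then NULL. Step 1 swallows the second colon, so no empty token ever appears.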

For your application, it's probably easier to scan the string
yourself.
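
One possible way to scan the string yourself is sketched below. It assumes a single delimiter character and a caller-supplied array of pointers; split_keep_empty is a hypothetical name for this example, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split s in place on the single character delim, keeping empty fields.
 * Stores at most max field pointers in fields[] and returns the total
 * number of fields found (which may be larger than max).
 * Like strtok(), it overwrites each delimiter in s with '\0'. */
size_t split_keep_empty(char *s, char delim, char **fields, size_t max)
{
    size_t n = 0;

    assert(s != NULL && fields != NULL && delim != '\0');
    for (;;) {
        char *end = strchr(s, delim);   /* next delimiter, or NULL if none */

        if (n < max)
            fields[n] = s;              /* a field starts here, even if empty */
        n++;
        if (end == NULL)
            break;                      /* that was the last field */
        *end = '\0';                    /* terminate the field */
        s = end + 1;                    /* next field begins after the delimiter */
    }
    return n;
}
```

Given a writable buffer holding "a::b:", this reports four fields: "a", "", "b" and "". Every colon ends exactly one field, so two colons in a row yield an empty field between them.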

--
Al Balmer
Sun City, AZ
 
William Hughes
 
      10-12-2006

(E-Mail Removed) wrote:
> I'm using strtok to break apart a colon-delimited string. It basically
> works, but it looks like strtok skips over empty sections. In other
> words, if the string has 2 colons in a row, it doesn't treat that as a
> null token, it just treats the 2 colons as a single delimiter.
>
> Is that the intended behavior?


Yes. Just one more reason to avoid strtok().

- William Hughes

 
Default User
 
      10-12-2006
William Hughes wrote:

>
> (E-Mail Removed) wrote:
> > I'm using strtok to break apart a colon-delimited string. It
> > basically works, but it looks like strtok skips over empty
> > sections. In other words, if the string has 2 colons in a row, it
> > doesn't treat that as a null token, it just treats the 2 colons as
> > a single delimiter.
> >
> > Is that the intended behavior?

>
> Yes. Just one more reason to avoid strtok().


Unless that's the behavior you want. Example, breaking lines into words
with white space. You don't want a bunch of "null" words.
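
A small sketch of that case, where merging adjacent delimiters is what you want. The helper name split_words and the choice of whitespace characters are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Break line into whitespace-separated words using strtok().
 * Stores at most max word pointers in words[] and returns how many
 * were stored. The line is modified in place. */
size_t split_words(char *line, char **words, size_t max)
{
    static const char ws[] = " \t\r\n";
    size_t n = 0;
    char *w;

    assert(line != NULL && words != NULL);
    for (w = strtok(line, ws); w != NULL && n < max; w = strtok(NULL, ws))
        words[n++] = w;             /* runs of whitespace never yield "" */
    return n;
}
```

For a writable buffer containing "  the quick   brown\tfox \n", this stores four words and no empty ones, despite the leading, trailing and repeated white space.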





Brian
 
CBFalconer
 
      10-12-2006
(E-Mail Removed) wrote:
>
> I'm using strtok to break apart a colon-delimited string. It
> basically works, but it looks like strtok skips over empty
> sections. In other words, if the string has 2 colons in a row, it
> doesn't treat that as a null token, it just treats the 2 colons as
> a single delimiter.
>
> Is that the intended behavior?


Yes. If that is a problem, consider using my toksplit routine, the
code for which has been published here before. I think googling
for "toksplit" will bring it up, so I won't burden the newsgroup
with YAC (yet another copy).

--
Some informative links:
<news:news.announce.newusers>
<http://www.geocities.com/nnqweb/>
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/>


 
Ben Pfaff
 
      10-12-2006
(E-Mail Removed) writes:

> I'm using strtok to break apart a colon-delimited string. It basically
> works, but it looks like strtok skips over empty sections. In other
> words, if the string has 2 colons in a row, it doesn't treat that as a
> null token, it just treats the 2 colons as a single delimiter.


strtok() has at least these problems:

* It merges adjacent delimiters. If you use a comma as your
delimiter, then "a,,b,c" will be divided into three tokens,
not four. This is often the wrong thing to do. In fact, it
is only the right thing to do, in my experience, when the
delimiter set contains white space (for dividing a string
into "words") or it is known in advance that there will be
no adjacent delimiters.

* The identity of the delimiter is lost, because it is
changed to a null terminator.

* It modifies the string that it tokenizes. This is bad
because it forces you to make a copy of the string if
you want to use it later. It also means that you can't
tokenize a string literal with it; this is not
necessarily something you'd want to do all the time but
it is surprising.

* It can only be used once at a time. If a sequence of
strtok() calls is ongoing and another one is started,
the state of the first one is lost. This isn't a
problem for small programs but it is easy to lose track
of such things in hierarchies of nested functions in
large programs. In other words, strtok() breaks
encapsulation.
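
The first three points can be checked with a short sketch like the one below. The helper strtok_count is a hypothetical name used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the tokens strtok() finds in s. The buffer must be writable
 * (not a string literal), because strtok() overwrites delimiters. */
size_t strtok_count(char *s, const char *delims)
{
    size_t n = 0;
    char *tok;

    assert(s != NULL && delims != NULL);
    for (tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

Called on a writable copy of "a,,b;c" with the delimiter set ",;", this counts three tokens, not four. Afterwards the first comma and the semicolon have both become '\0', so the buffer no longer records which delimiter ended which token, while the second comma was skipped and is left untouched.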

--
"A lesson for us all: Even in trivia there are traps."
--Eric Sosman
 
William Hughes
 
      10-12-2006

Default User wrote:
> William Hughes wrote:
>
> >
> > (E-Mail Removed) wrote:
> > > I'm using strtok to break apart a colon-delimited string. It
> > > basically works, but it looks like strtok skips over empty
> > > sections. In other words, if the string has 2 colons in a row, it
> > > doesn't treat that as a null token, it just treats the 2 colons as
> > > a single delimiter.
> > >
> > > Is that the intended behavior?

> >
> > Yes. Just one more reason to avoid strtok().

>
> Unless that's the behavior you want. Example, breaking lines into words
> with white space. You don't want a bunch of "null" words.
>
>


The point is not that the function's behaviour is not sometimes
what you want. The point is

-the default behaviour is surprising

-the default behaviour is not even
usually what you want

-the default behaviour throws information away

-if you don't like the default behaviour, see
figure 1.

Personally I'm with the Linux man pages on this one. Under Bugs
is the advice "Never use this function".

-William Hughes

 
Default User
 
      10-12-2006
William Hughes wrote:

>
> Default User wrote:


> > Unless that's the behavior you want. Example, breaking lines into
> > words with white space. You don't want a bunch of "null" words.
> >
> >

>
> The point is not that the function's behaviour is not sometimes
> what you want. The point is
>
> -the default behaviour is surprising


Only if one fails to read the documentation. A number of functions are
funny that way.

> -the default behaviour is not even
> usually what you want


How do you know? Even if true, so what?

> -the default behaviour throws information away


Again, if you know that and it fits the problem, so what?

> -if you don't like the default behaviour, see
> figure 1.


I don't understand this statement. I have no idea what "figure 1" is.

> Personally I'm with the Linux man pages on this one. Under Bugs
> is the advice "Never use this function".


Well, that's stupid advice. The function may be tricky, but sometimes
it's just the right thing. In those cases, it should be used. If not,
it shouldn't.



Brian



 
Al Balmer
 
      10-12-2006
On 12 Oct 2006 14:32:27 -0700, "William Hughes"
<(E-Mail Removed)> wrote:

>
>Default User wrote:
>> William Hughes wrote:
>>
>> >
>> > (E-Mail Removed) wrote:
>> > > I'm using strtok to break apart a colon-delimited string. It
>> > > basically works, but it looks like strtok skips over empty
>> > > sections. In other words, if the string has 2 colons in a row, it
>> > > doesn't treat that as a null token, it just treats the 2 colons as
>> > > a single delimiter.
>> > >
>> > > Is that the intended behavior?
>> >
>> > Yes. Just one more reason to avoid strtok().

>>
>> Unless that's the behavior you want. Example, breaking lines into words
>> with white space. You don't want a bunch of "null" words.
>>
>>

>
>The point is not that the function's behaviour is not sometimes
>what you want. The point is
>
> -the default behaviour is surprising


The behavior of many functions might be surprising if you don't read
the documentation.
>
> -the default behaviour is not even
> usually what you want

Like any other function in the library, it's used where appropriate.
Sometimes it *is* what I want.
>
> -the default behaviour throws information away


I don't really know what information you're referring to. You could
just as easily say it adds information. If there's information that
you need to protect, it's trivial.
>
> -if you don't like the default behaviour, see
> figure 1.


? Did you copy this from a book with pictures? That would explain the
odd indentation, I suppose.
>
>Personally I'm with the Linux man pages on this one. Under Bugs
>is the advice "Never use this function".


That's silly. Like any other function, it should be used when
appropriate, and not used when not appropriate.

--
Al Balmer
Sun City, AZ
 
William Hughes
 
      10-12-2006

Al Balmer wrote:
> On 12 Oct 2006 14:32:27 -0700, "William Hughes"
> <(E-Mail Removed)> wrote:
>
> >
> >Default User wrote:
> >> William Hughes wrote:
> >>
> >> >
> >> > (E-Mail Removed) wrote:
> >> > > I'm using strtok to break apart a colon-delimited string. It
> >> > > basically works, but it looks like strtok skips over empty
> >> > > sections. In other words, if the string has 2 colons in a row, it
> >> > > doesn't treat that as a null token, it just treats the 2 colons as
> >> > > a single delimiter.
> >> > >
> >> > > Is that the intended behavior?
> >> >
> >> > Yes. Just one more reason to avoid strtok().
> >>
> >> Unless that's the behavior you want. Example, breaking lines into words
> >> with white space. You don't want a bunch of "null" words.
> >>
> >>

> >
> >The point is not that the function's behaviour is not sometimes
> >what you want. The point is
> >
> > -the default behaviour is surprising

>
> The behavior of many functions might be surprising if you don't read
> the documentation.
> >
> > -the default behaviour is not even
> > usually what you want

> Like any other function in the library, it's used where appropriate.
> Sometimes it *is* what I want.
> >
> > -the default behaviour throws information away

>
> I don't really know what information you're referring to.


The number of delimiters. (strtok() also discards the identity
of these delimiters but that has not been previously mentioned in
this subthread).

> You could
> just as easily say it adds information. If there's information that
> you need to protect, it's trivial.
> >
> > -if you don't like the default behaviour, see
> > figure 1.

>
> ? Did you copy this from a book with pictures? That would explain the
> odd indentation, I suppose.
> >


figure 1. is a picture of a hand with a single digit extended (guess
which one). It comes from an old piece of xerox-lore, a parody of DEC (?)
documentation in which an oft-repeated phrase is "see figure 1."
I guess the reference was a little too obscure.

> >Personally I'm with the Linux man pages on this one. Under Bugs
> >is the advice "Never use this function".

>
> That's silly. Like any other function, it should be used when
> appropriate, and not used when not appropriate.
>


Well, never is probably too strong. However, strtok() is dominated by
a good general purpose parsing method. Since you need a good
general purpose parsing method, why not use that instead of
strtok()?
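
As one illustration of what such a method might look like (a sketch under my own assumptions, not the toksplit routine mentioned elsewhere in this thread), here is a field scanner that leaves the string unmodified, keeps empty fields, reports which delimiter ended each field, and holds its state in the caller. The names struct field and next_field are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One field located inside an unmodified string. */
struct field {
    const char *start;  /* first character of the field (not terminated) */
    size_t len;         /* number of characters; 0 for an empty field */
    char delim;         /* delimiter that ended it, '\0' at end of string */
};

/* Describe the field beginning at *pos and advance *pos past its
 * delimiter. Returns 1 if a field was produced, or 0 once the input is
 * exhausted (signalled by *pos == NULL). The string is never written to,
 * and all state lives in *pos, so several scans can be interleaved. */
int next_field(const char **pos, const char *delims, struct field *out)
{
    const char *s;

    assert(pos != NULL && delims != NULL && out != NULL);
    s = *pos;
    if (s == NULL)
        return 0;                       /* previous call saw the last field */

    out->start = s;
    out->len = strcspn(s, delims);      /* characters before next delimiter */
    out->delim = s[out->len];           /* remember which delimiter it was */
    *pos = (out->delim != '\0') ? s + out->len + 1 : NULL;
    return 1;
}
```

Starting with pos pointing at the string literal "a::b;c" and the delimiter set ":;", repeated calls yield "a" (ended by ':'), an empty field (ended by ':'), "b" (ended by ';') and "c" (ended by the end of the string), after which the function returns 0. Because fields are not null-terminated, print them with a length, for example printf("%.*s", (int)f.len, f.start).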

- William Hughes

 