strtok ( ) help

 
 
ern
      01-20-2006
I'm using strtok() to capture lines of input. After I call
splitCommand(), I call strtok() again to get the next line, and it
returns NULL (but there is more in the file...). That didn't happen
before splitCommand() entered the picture. The problem seems to be
splitCommand() somehow modifying the pointer, but I HAVE to call that
function. Is there a way to make a copy of it or something?

/* HERE IS MY CODE */

char * lineOfScript;
const char * delim = "\n";

lineOfScript = strstr(scriptFileBuffer,"reprocess:"); //find starting place in script
lineOfScript = strtok(lineOfScript,delim); //skip a line
lineOfScript = strtok(NULL,delim); //get next line... now I have a command
splitCommand(lineOfScript); //this is probably where my pointer gets messed up...
lineOfScript = strtok(NULL,delim); //get next line, but strtok returns NULL

//Split up command into separate words.
//Store words in global array
int splitCommand(char * command){
    const char * delimeters = " ";
    int i = 0;
    g_UserCommands[0] = strtok(command, " ");
    while(g_UserCommands[i] != NULL && i < 5){
        //printf("%s", g_UserCommands[i]);//for debugging...
        i+=1;
        g_UserCommands[i] = strtok(NULL, " ");
    }
    i=0;
    return 1;
}

 
Vladimir S. Oka
      01-20-2006

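In splitCommand() you start a new strtok() "session", so when
you return and invoke it next it actually continues from where it
left off in 'command', not in your script buffer. By then the first
strtok() had already replaced the newline at the end of that line
with a null character, so the string 'command' points to stops at
the end of the line. Once splitCommand() has consumed its words
there is nothing left to find, and strtok() returns NULL.

C passes /all/ parameters by value (even pointers), i.e. a copy
of the pointer is made for the called function. That copy ceases
to exist when the function returns, but the characters it pointed
to are still in your buffer, and strtok() keeps its own saved
position into them. So this is not Undefined Behaviour; strtok()
has simply lost its place in the outer string.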
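
A minimal sketch of what happens, not the original poster's code: the helper names count_words and lines_seen_with_nested_strtok are hypothetical. The outer loop is meant to walk every line, but the inner strtok() call takes over the single saved position.

```c
#include <stddef.h>
#include <string.h>

/* Inner tokenizer: splits one line into words with strtok().
   This overwrites strtok()'s single saved position. */
static int count_words(char *line)
{
    int n = 0;
    for (char *w = strtok(line, " "); w != NULL; w = strtok(NULL, " "))
        n++;
    return n;
}

/* Outer tokenizer: intended to visit every line of buf (which must be
   writable), but count_words() resets strtok()'s state, so the walk
   stops early. Returns the number of lines actually visited. */
int lines_seen_with_nested_strtok(char *buf)
{
    int lines = 0;
    for (char *line = strtok(buf, "\n"); line != NULL; line = strtok(NULL, "\n")) {
        count_words(line);
        lines++;
    }
    return lines;
}
```

Given a writable buffer holding "one two\nthree four\nfive six", the outer loop should visit only the first line: the inner loop runs to the end of that line's string, and the C Standard says searches after that point return a null pointer.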

Cheers

Vladimir


--
My e-mail address is real, and I read it.
 
Vladimir S. Oka
      01-20-2006

Consult your system help/manuals for a detailed description of
how strtok() works. The C Standard (C99) describes it in 7.21.5.8.
It's been too long since I last used it for me to dare paraphrase either.

Cheers

Vladimir

pemo
      01-20-2006

strtok(lineOfScript,delim);

After the initial call, strtok() has to 'remember' the data you've asked it
to parse. To do that here, it makes a copy of whatever lineOfScript pointed
to, and stores it in some internal buffer [that it maintains, and you can't
directly access].


strtok(NULL,delim);

When you call it again, passing NULL as the first param, it simply continues
parsing from wherever it previously left off - i.e., it continues to parse
its internal buffer as set by whatever lineOfScript originally pointed to.


strtok("BOO",delim);

Now, if you call it again with a non-NULL initial param, it forgets whatever
data it was previously storing/working on and resets its internal buffer to
whatever data you've just passed in - a copy of "BOO" in this case. So,
whatever you didn't yet parse - that was originally ref'ed by lineOfScript -
is now lost and forgotten.

Bottom line, you can't do what you're trying to do with ...

p = strtok(p1, p2);

while(strtok(NULL, p2))
{
p3 = strtok(p4, ...);

...
}
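
One way around the restriction, sketched under assumptions: keep strtok() for the inner (word) level only and find line boundaries by hand with strchr(), so the two levels never share state. The names split_words, cut_line, count_all_words and MAX_WORDS are hypothetical. On POSIX systems strtok_r(), which takes an explicit save pointer, is another option, but it is not part of standard C.

```c
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 5   /* assumption: mirrors the i < 5 limit in the original post */

/* Inner level: strtok() is fine here because nothing else is using it. */
int split_words(char *line, char *words[], int max_words)
{
    int n = 0;
    char *w = strtok(line, " ");
    while (w != NULL && n < max_words) {
        words[n++] = w;
        w = strtok(NULL, " ");
    }
    return n;
}

/* Outer level: terminate the current line in place and return the start of
   the next one, or NULL if this was the last line. Keeps no hidden state. */
char *cut_line(char *line)
{
    char *nl = strchr(line, '\n');
    if (nl == NULL)
        return NULL;
    *nl = '\0';
    return nl + 1;
}

/* Visit every line of buf and split each into words; returns the total
   number of words stored across all lines. */
int count_all_words(char *buf)
{
    char *words[MAX_WORDS];
    int total = 0;
    char *line = buf;
    while (line != NULL) {
        char *next = cut_line(line);   /* find the next line before splitting */
        total += split_words(line, words, MAX_WORDS);
        line = next;
    }
    return total;
}
```

The outer position lives in an ordinary local variable, so it survives whatever the inner strtok() calls do to the hidden state.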


--
===============================================================
In an attempt to reduce 'unwanted noise' on the 'signal' ...

Disclaimer:

Any comment/code I contribute might =NOT= be 100% portable, nor
semantically correct [read - 'not 100% pedantically correct'].
I don't care too much about that though, and I reckon it's the
same with most 'visitors' here. However, rest assured that any
'essential' (?) corrections WILL almost certainly appear v.soon
[read - 'to add noise as they see fit, a pedant will be along
shortly'].

WARNINGS: Always read the label. No beside-the-point minutiae
filter supplied. Keep away from children. Do not ignite.
===============================================================


 
Default User
      01-20-2006
pemo wrote:


> strtok(lineOfScript,delim);
>
> After the initial call, strtok() has to 'remember' the data you've
> asked it to parse. To do that here, it makes a copy of whatever
> lineOfScript pointed to, and stores it in some internal buffer [that
> it maintains, and you can't directly access].


That's not likely. What it will "remember" is a pointer to where it
stopped, which is an offset into the string (probably just a
static char*). If it made a copy of the string, not only would that be
inefficient, but if an operation changed the original string between
calls to strtok() its copy would no longer match.
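
The behaviour described above can be sketched with a single static pointer. This is an illustration only, under the assumption that the implementation keeps nothing but that pointer; my_strtok is a hypothetical name and not the source of any real library.

```c
#include <stddef.h>
#include <string.h>

/* my_strtok: hypothetical name; a sketch of the behaviour, not library source. */
char *my_strtok(char *s, const char *delim)
{
    static char *saved;              /* the only state kept between calls */

    if (s == NULL)
        s = saved;                   /* continue the previous session */
    if (s == NULL)
        return NULL;                 /* no session in progress */

    s += strspn(s, delim);           /* skip leading delimiters */
    if (*s == '\0') {
        saved = NULL;                /* nothing left: later calls return NULL */
        return NULL;
    }

    char *end = s + strcspn(s, delim);   /* first delimiter after the token */
    if (*end != '\0') {
        *end = '\0';                 /* overwrite the delimiter in place */
        saved = end + 1;             /* resume just past it next time */
    } else {
        saved = end;                 /* at the terminator: next search finds nothing */
    }
    return s;                        /* points into the caller's own buffer */
}
```

Because there is only one saved pointer, starting a second tokenisation with a non-NULL first argument simply replaces it, which is why two interleaved sessions cannot coexist.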



Brian
 
pemo
      01-20-2006

Yes, you're probably right, thanks for the correction ... hold on, brb <time
passes> ...
....
yup, it certainly looks like it *is* as you say - for the gcc version at least.
Still, it makes one wonder whether the effect in your comment ["but if an
operation changed the original string between calls to strtok() its copy
would no longer match"] might either be useful, or else go against how the
docs say strtok() works.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char ar[] = "now is the time for all good men to come to the aid of the party";
    char * p = NULL;
    int n = 0;

    p = strtok(ar, " ");

    while(p != NULL)
    {
        puts(p);
        ++n;

        /* after every fifth token, overwrite the buffer being tokenised */
        if(n % 5 == 0)
        {
            strcpy(ar, "the quick brown fox jumps over the lazy dog");
        }

        p = strtok(NULL, " ");
    }

    return 0;
}


now
is
the
time
for
jumps
over
the
lazy
dog


Default User
      01-20-2006

Remember also that strtok() has to modify the original string, so it
has to have a pointer into that string in all cases. That is, it
doesn't help to work on a copy of the string because it has to punch null
characters in place of the delimiters. Also, the return value is a
pointer into the original string (or NULL of course).
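
Since strtok() overwrites delimiters in the buffer it is given, a caller who needs the original text intact has to hand it a private copy. A rough sketch, assuming it is acceptable to treat allocation failure as zero tokens; count_tokens_preserving is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Count the tokens in text without modifying it: strtok() is pointed at a
   heap copy, so the null characters it writes never touch the original. */
size_t count_tokens_preserving(const char *text, const char *delim)
{
    size_t len = strlen(text) + 1;       /* include the terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return 0;                        /* assumption: failure reported as 0 tokens */
    memcpy(copy, text, len);

    size_t n = 0;
    for (char *t = strtok(copy, delim); t != NULL; t = strtok(NULL, delim))
        n++;

    free(copy);                          /* token pointers are invalid after this */
    return n;
}
```

The returned pointers refer to the copy, so any token that must outlive the function would need to be duplicated before free() is called. This also makes it safe to pass a string literal, which strtok() itself must never be given.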

The strtok() syntax and semantics are well into the "cheap, fast, and
dirty" style.



Brian
 
pemo
      01-21-2006

Yup, ok, and I guess, from 7.21.5.8.2 "A sequence of calls to the strtok
function breaks *the string pointed to by s1* into a ..." [pretty much]
implies that a copy should *not* be made [hmmmm ????]. Whether it really
*is certain though* ... if this [the std] were a law!

Ok, I reckon the meaning *is clear* [gulp] in this case.

> Also, the return value is a pointer into the original string (or NULL of
> course).


Jeez, I really want to be the last one to want to play *the pedantic card on
c.l.c* (surely, c.std.c is where *certain types* should 'go play'), but the
std says ...

>>The strtok function returns a pointer to the first character of a token,
>>or a null pointer if there is no token.


Now, it's late [here] and I've not bothered to parse *all* the previous
paras in the std to see if there's a case for categorically stating that
'token', in this context, *is* necessarily a member of the set of things in
the set of inputs to strtok(). But, if there's not a case, then "returns a
pointer to the first character of a token" doesn't, I think, preclude strtok
returning a pointer into some local [or any other] buffer, rather than the
one encoding the original string [the input] ... just that [perhaps], at the
time, the semantics of what token it is pointing to tallies with what its
input is?

As to the answer to this ... I actually don't give much of a damn [*a *****
actually], I'm more of a computational linguist these days, and it's *the
language* that interests me mostly now [my X3J11 days are a distant
memory] - and how, something that often appears at first sight reasonably
clear, can, in actual fact, be anything but! However, I'd rather ... than
be a pedant about it all now.


Default User
      01-21-2006
pemo wrote:


> Now, it's late [here] and I've not bothered to parse all the previous
> paras in the std to see if there's a case for categorically stating
> that 'token', in this context, is necessarily a member of the set of
> things in the set of inputs to strtok(). But, if there's not a case,
> then "returns a pointer to the first character of a token" doesn't, I
> think, preclude strtok returning a pointer into some local [or any
> other] buffer, rather than the one encoding the original string [the
> input] ... just that [perhaps], at the time, the semantics of what
> token it is pointing to tallies with what its input is?


What's unclear about this?

A sequence of calls to the strtok function breaks the
string pointed to by s1 into a sequence of tokens, each of
which is delimited by a character from the string pointed to
by s2.

"Breaks the string". Not forms some copies. Read how tokens are found
and formed. It pretty well lays out the state machine for you.




Brian
 
pemo
      01-21-2006

Ho && Hum
