Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Horror! Using goto statement in finite state machines?

Reply
Thread Tools

Horror! Using goto statement in finite state machines?

 
 
Mark McIntyre
Guest
Posts: n/a
 
      02-11-2007
On Sun, 11 Feb 2007 13:59:28 -0800, in comp.lang.c , Keith Thompson
<kst-> wrote:

>Mark McIntyre <> writes:
>> On Sat, 10 Feb 2007 15:54:53 -0800, in comp.lang.c , Keith Thompson
>> <kst-> wrote:
>>
>>>"Serve Laurijssen" <> writes:
>>>> "Christopher Layne" <> wrote in message
>>>>
>>>>> And yet the one with unconditional jumps will end up being faster.
>>>>
>>>> such general statements are, by definition, false
>>>
>>>What definition would that be?

>>
>> The definition contained within the statement. Thats what "by
>> definition" means.

>
>I see no definition in the statement. If there is one, what term is
>being defined?


Ok, I'll play: the statement "the one with unconditional jumps will
end up being faster" is a defintion of the behaviour of unconditional
jumps.

Why are you persisting at this point? I'm sure you knew precisely what
Serve meant, and it seems bizarre.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Reply With Quote
 
 
 
 
Keith Thompson
Guest
Posts: n/a
 
      02-12-2007
Mark McIntyre <> writes:
> On Sun, 11 Feb 2007 13:59:28 -0800, in comp.lang.c , Keith Thompson
> <kst-> wrote:
>>Mark McIntyre <> writes:
>>> On Sat, 10 Feb 2007 15:54:53 -0800, in comp.lang.c , Keith Thompson
>>> <kst-> wrote:
>>>>"Serve Laurijssen" <> writes:

[...]
>>>>> such general statements are, by definition, false
>>>>
>>>>What definition would that be?
>>>
>>> The definition contained within the statement. Thats what "by
>>> definition" means.

>>
>>I see no definition in the statement. If there is one, what term is
>>being defined?

>
> Ok, I'll play: the statement "the one with unconditional jumps will
> end up being faster" is a defintion of the behaviour of unconditional
> jumps.
>
> Why are you persisting at this point? I'm sure you knew precisely what
> Serve meant, and it seems bizarre.


I'll just drop this; there's no point in discussing it further.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
Reply With Quote
 
 
 
 
Thad Smith
Guest
Posts: n/a
 
      02-12-2007
Eric Sosman wrote:
> Rui Maciel wrote On 02/09/07 14:47,:
>>Ben Pfaff wrote:
>>>How about a switch statement or an array of function pointers?

>>
>>As far as I can tell, to use the switch instead of the goto statement would
>>end up generating a nested mess inside continuous loops. [...]

>
> If there's nothing going on during a state transition except
> the transition itself, then switch-in-a-loop has no benefit that
> I can see, aside from evading the wrath of the goto police. If
> you just change `goto state42' to `state = 42', all you've done
> is change the spelling.
>
> However, lots of applications of state machines perform some
> additional action at each transition: They consume the next input
> character or emit the next output message or something of the
> sort. In that setting, switch-in-a-loop is very attractive
> because it provides convenient places for the "transitional"
> actions, to wit, before and after the big switch block:
>
> for (; {
> /* "prefix" actions ... */
> switch (state) {
> ...
> }
> /* "suffix" actions ... */
> }
>
> If both "prefix" and "suffix" are empty, though, there
> seems little point to this structure.


Very well put, Eric.

Often, for me, the state machine is a subprocess used within a program.
There may be a large processing loop for the program, one item of
which is to call a function to run a particular state machine. In that
case, the state is a static variable. The function does a switch which
processes the state and possibly updates the state, then exits.

--
Thad
 
Reply With Quote
 
William Ahern
Guest
Posts: n/a
 
      02-13-2007
On Sat, 10 Feb 2007 17:18:46 -0800, websnarf wrote:
> On Feb 9, 8:14 am, Rui Maciel <rui.mac...@gmail.com> wrote:
>> I've been delving into finite state machines and at this time it seems
>> that the best way to implement them is through an intense use of the goto
>> statement. Yet, everyone plus their granmother is constantly cursing at
>> the goto statement, accusing it of being some sort of spawn of satan. So
>> it seems we have a pickle here.

>
> Correct. The C language standard itself creates this problem, for
> really, no good reason. The simplest way to improve the C standard
> for finite state machines is to extend either the continue or goto
> statements to be able to go to a specific case label. In that way a
> FSM can be implemented inside of a switch statement without paying the
> exorbitant cost of rerunning a literal switch operation on every state
> transition. This kind of thinking is basically beyond what the C
> standards committee is capable of conceiving.
>


For awhile I got into the practice of pairing switch case's with
equivalently named labels:

struct foo {
struct { enum foo_states this, next; } state;
};

switch (foo->state.this) {
one:
foo->state.this = one;
case one:
...
foo->state.next = two;

do_something_asynchronously(&fsm_callback);

break;
two:
foo->state.this = two;
case two:
...
goto one;

break;
}


Ultimately I stopped and fell back to just using switch. With network
server development most of the switch cases end up initiating some
asynchronous operation which will need to call back into the function, so
the optimization just clutters things up. Though, being able to do `goto
(state.this = two)' would be nice.

Tail call optimization would be even nicer. One my of I/O libraries
implements a type of trampoline, otherwise the stack grows too big. Though
in that case the C compiler would really need to be smart since it's
mutually recursive among many functions.
 
Reply With Quote
 
William Ahern
Guest
Posts: n/a
 
      02-13-2007
On Sat, 10 Feb 2007 19:06:14 -0800, websnarf wrote:
<snip>
> That's actually a follow up involving Michael B. Allen. (It all
> started when he made the ridiculous claim that the C std library could
> not be beat in terms of performance on strings.)
>
> This is the original thread that I was thinking of:
>
> http://groups.google.com/group/comp....3b05cfb3818d7/
>


Ridiculous? Michael's comments echo exactly my experience. I write network
servers day-in and day-out. The standard library functions work great
for strings, memcpy() in particular. snprintf() is also very useful, but
like Michael pointed out I often deal with string parsing in situ, using
FSMs. I've written stateful base64, base16, base32, URL
percent decoders/encoders. Instead of allocating memory willy-nilly, as
much as possible I write all of my code to deal with "strings" terms of
streams, not discrete objects. I've even written a streaming MIME parser.

Actually, if a committee wanted to standardize something they'd be better
off standardizing a vector structure (not unlike the bstring package, just
less distasteful). Like was discussed in the above thread, the problem
with C strings is that everybody and their grandmother has invented some
kind of wrapper which does the same thing using different names.

Personally, of late I often make use of the iovec structure from POSIX's
sys/uio.h. If I were to revamp string handling in C I'd make two modifications.

1) Add an svec structure. struct svec { unsigned char *sv_base;
size_t sv_len; }.

2a) Mandate that (char) was unassigned (very radical), or
2b) Mandate that (char) and (unsigned char) were interchangeable without
maddening signedness compiler warnings in GCC-4.1, and add a new type
uchar (typedef unsigned char uchar), 'cause I'm lazy.

Then, if I wanted to push my luck, I'd deprecate all the wide-character
interfaces, and include a small suite of functions for manipulating UTF-8
encoded Unicode strings. I might start by adding a struct uvec, something
like: struct uvec { uchar *uv_base, [a bunch of opaque members,
noticeably and intentionally missing uv_len] }.
 
Reply With Quote
 
Christopher Layne
Guest
Posts: n/a
 
      02-13-2007
William Ahern wrote:

> Ridiculous? Michael's comments echo exactly my experience. I write network
> servers day-in and day-out. The standard library functions work great
> for strings, memcpy() in particular. snprintf() is also very useful, but


I'd shorten that to say that the mem* stdlib functions work great. The str*
functions are just about useless for network code.

> Personally, of late I often make use of the iovec structure from POSIX's
> sys/uio.h. If I were to revamp string handling in C I'd make two
> modifications.
>
> 1) Add an svec structure. struct svec { unsigned char *sv_base;
> size_t sv_len; }.
>
> 2a) Mandate that (char) was unassigned (very radical), or
> 2b) Mandate that (char) and (unsigned char) were interchangeable without
> maddening signedness compiler warnings in GCC-4.1, and add a new type
> uchar (typedef unsigned char uchar), 'cause I'm lazy.


BTW: I have an iovec based buffer tokenizer that utilizes a source,
source_len, delimiter array, and optional character class flags (the standard
ctypes) if you're interested. Approx 3M iovec/sec tokenization on a P
(celeron) without modifying any of the source buffer. Basically, it's like
this:

const char *delim = " \t\n\r";
struct iovec v[16]; /* or whatever you feel is appropriate */
struct iovec *vp;
c = vtok(v, NE(v), buf, buf_len, delim, VT_SPACE | VT_CNTRL);
/* or */
c = vtoka(&vp, 0, buf, buf_len, delim, VT_SPACE | VT_CNTRL);

v[0..c - 1].iov_base = token
v[0..c - 1].iov_len = token_len

Returns c (count of vectors) when either len has been consumed, or max iovecs
have been used. Last iovec is the tail if len has been consumed.

> Then, if I wanted to push my luck, I'd deprecate all the wide-character
> interfaces, and include a small suite of functions for manipulating UTF-8
> encoded Unicode strings. I might start by adding a struct uvec, something
> like: struct uvec { uchar *uv_base, [a bunch of opaque members,
> noticeably and intentionally missing uv_len] }.


Absolutely do NOT remove the length.
 
Reply With Quote
 
William Ahern
Guest
Posts: n/a
 
      02-13-2007
On Mon, 12 Feb 2007 22:16:26 -0800, Christopher Layne wrote:
> William Ahern wrote:

<snip>
>> Personally, of late I often make use of the iovec structure from
>> POSIX's sys/uio.h. If I were to revamp string handling in C I'd make
>> two modifications.
>>
>> 1) Add an svec structure. struct svec { unsigned char *sv_base; size_t
>> sv_len; }.
>>
>> 2a) Mandate that (char) was unassigned (very radical), or 2b) Mandate
>> that (char) and (unsigned char) were interchangeable without maddening
>> signedness compiler warnings in GCC-4.1, and add a new type uchar
>> (typedef unsigned char uchar), 'cause I'm lazy.

>
> BTW: I have an iovec based buffer tokenizer that utilizes a source,
> source_len, delimiter array, and optional character class flags (the
> standard ctypes) if you're interested. Approx 3M iovec/sec tokenization
> on a P (celeron) without modifying any of the source buffer. Basically,
> it's like this:
>


I have something similar (almost exact, actually), which i call splitv()
(from the example split() function included in the strtok man page on many
Unices). But, I never put in character class support; it just takes an
optional (1 << CHAR_BIT)-character map for selecting delimiters.

I'd still be interested to see yours. I had a flag in mine to return empty
fields, but it's broken, meaning something in my loop should be smarter.
I've never gone back to it because so far I've never wanted to have the
empty fields.

>> Then, if I wanted to push my luck, I'd deprecate all the wide-character
>> interfaces, and include a small suite of functions for manipulating UTF-8
>> encoded Unicode strings. I might start by adding a struct uvec, something
>> like: struct uvec { uchar *uv_base, [a bunch of opaque members,
>> noticeably and intentionally missing uv_len] }.

>
> Absolutely do NOT remove the length.


Well, the notion is that given the history of C strings, the terminology
is overloaded. Is the "length" the "sizeof" of the object, where the units
are C's notion of bytes, or are we talking about the number of graphemes,
etc. I would remove something like uv_len and replace it with
two or more members or functions, each with more precise, less
ambiguous names. The problem with Unicode string processing is that it
violates many of the supposed intrinsic properties of strings that have
variously benefited and plagued programmers for years. Take the tokenizer
above, for instance. In written Thai there generally isn't any word or
symbol delimiters at all. Just one long string of syllables. To
tokenize you actually need a language dictionary, and you parse from right
to left trying to form words; when the next syllable couldn't
possible result in a legitimate word, you break. So with Thai it's an all
or nothing proposition; you either manipulate the text using a
sophisticated and capable interface, or you treat it as an opaque chunk of
memory. All this "wide-char" non-sense is completely and utterly useless.
We must permanently put to rest this practice in text processing
of conflating "characters" and "bytes" (indeed, at many levels remove the
notion of characters altogether; in one sense the usefulness rests solely
with parsing so-called human readable network protocols which employ
ASCII, but then that's not really "text").

Thus, this is also why I'd deprecate the wide-character support. It
fails to provide any sort of usefulness at the memory management level,
nor does it comprehensively solve the internationalization issue (heck,
technically if doesn't even allow you the privilege of conflating
characters and bytes if you so choosed, which will always be useful
for historical and technical reasons, as mentioned above). Also, the
whole notion of environment locale's is broken. But I digress. In any
event, I think the answer to both of those is to standardize on Unicode
strings, and a UTF-8 encoding specifically, so when you need to you can
fallback to more traditional string and memory management, while also
opening the door to sophisticated and comprehensive internationalized text
processing. This, I think, proceeds from and enhances the best qualities
of C. And if the idea of including such a large string interface with C was
too repugnant, I'd still implement everything else: rip out
wide-characters, fixed char signedness headaches, etc. They stand on
their own merits.
 
Reply With Quote
 
Christopher Layne
Guest
Posts: n/a
 
      02-16-2007
William Ahern wrote:

> I have something similar (almost exact, actually), which i call splitv()
> (from the example split() function included in the strtok man page on many
> Unices). But, I never put in character class support; it just takes an
> optional (1 << CHAR_BIT)-character map for selecting delimiters.
>
> I'd still be interested to see yours. I had a flag in mine to return empty
> fields, but it's broken, meaning something in my loop should be smarter.
> I've never gone back to it because so far I've never wanted to have the
> empty fields.


Give me a chance to clean it up some and finish up some remaining
functionality. And I hear you on the empty field, delimiters only, etc.
business.

>> Absolutely do NOT remove the length.

>
> Well, the notion is that given the history of C strings, the terminology
> is overloaded. Is the "length" the "sizeof" of the object, where the units
> are C's notion of bytes, or are we talking about the number of graphemes,


Length being purely bytes.
It's a structure member that should always be accessible to facilitate any
additional external handling of data.

> Thus, this is also why I'd deprecate the wide-character support. It
> fails to provide any sort of usefulness at the memory management level,
> nor does it comprehensively solve the internationalization issue (heck,
> technically if doesn't even allow you the privilege of conflating
> characters and bytes if you so choosed, which will always be useful
> for historical and technical reasons, as mentioned above). Also, the


Unicode, UTF-8, and UTF-16 and how they relate to C. I wonder if these will
ever be solved honestly.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Safe finite state machine design SomeDude VHDL 3 08-14-2006 12:47 PM
Gate Level model of a Finite state machine Inderkal VHDL 8 12-09-2004 11:17 PM
How to sequencialize two finite state machines ? Martin Maurer VHDL 3 06-14-2004 07:40 AM
How to implement linked Finite State Machines Sidney Cadot VHDL 0 04-18-2004 10:45 AM
HELP PLEASE!! - Finite State Machine - Automaton - Microprogrammed System deejayfred VHDL 0 10-02-2003 01:23 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57