Velocity Reviews > Reversing a linked list

jacob navia
Guest
Posts: n/a

 09-16-2011
This group has lost many people. One of the reasons is that the
newcomers just ask a question or two, and do not see discussions
that try to explain code, teaching newcomers to the language how
a program works.

This is a second example (after hexdump.c) of commented code,
i.e. explaining step by step how a relatively simple operation
is done.

-----------------------
This is an evergreen of data structures programming. In most classes
you will get an exercise like this:

Exercise 7: Given a list "L" reverse it without moving its contents.

The solution for this exercise is to go through the list, relinking the
"Next" pointers in the inverse order, without touching any data that
the list may hold. We will use the code of the C containers library and
study how it is done.

The library uses a simple structure "ListElement" whose definition runs
like this:

typedef struct _ListElement {
struct _ListElement *Next;
char Data[];
} ListElement;

We have a pointer to the next element of the list, and a chunk of data
of unspecified length. This is the basic structure that will be used
to hold the list data. Besides the list element we have a header of the
list, containing several fields not relevant to the task of our reverse
function. We will use only the following fields from that header:

o RaiseError: Pointer to the error function.
o First: The first element of the list.
o count: The number of items in the list.
o Last: A pointer to the last element of the list.
o timestamp: A counter of the modifications done to the list.

The interface of the reverse function is simple: it receives a list to
reverse as input and returns an integer result code. A negative result
means failure (with different negative numbers as error codes) and a
positive number means success.

1 static int Reverse(List *l)
2 {
3 ListElement *Previous,*Current,*Next;
4

The first thing to do in a serious program is to check the validity of
the received arguments, i.e. test the preconditions as Bertrand Meyer
would say it. We test that the list handed to us is not Null (lines
5- and that the list is not read-only (lines 9-12) since we are going
to modify it.

If anything is wrong we return an error code after calling the error
function. Note that the error function is a field either in a global
structure
called iError (for interface Error) or is a field of the list itself. We
use the global interface iError in the case that the list is\Null, or the
list specific error function for the read only error. This is a powerful
feature of C called function pointers that we will discuss in more detail
later on.

5 if (l == NULL) {
8 }
9 if (l->Flags & CONTAINER_READONLY) {
12 }

Then, we test for special conditions, i.e. for degenerate cases.
Obviously, reversing an empty list or a list with only one element is
very easy: we have nothing to do. We test for that (line 13) and return
success immediately if that is the case.

13 if (l->count < 2)
14 return 1;

Now, we setup our loop. We start with the first element (that must
exist since the list has more than one element, line 15). Since we are
going to reverse the list, the first element will be the last and the
last will be the first. We setup the last one immediately since we know
it in line 16. And before we start there is no previous element so we
set it to NULL.

15 Next = l->First;
16 l->Last = l->First;
17 Previous = NULL;

Now we go through the whole list. We save the "Next" value, and advance
it to the next element. Then, we set the value of the current element's
pointer to the previous, i.e. our current will point to previous
reversing the direction of the list.

18 while (Next) {
19 Current = Next;
20 Next = Next->Next;
21 Current->Next = Previous;
22 Previous = Current;
23 }

OK, we are done. We have gone through the whole list reversing
pointers, and we arrive at the cleanup. We should first establish our
"First" pointer, then we should ensure that our last element has
the Null marker, and that our list is marked as modified. We return a
positive number meaning success.

24 l->First = Previous;
25 l->Last->Next = NULL;
26 l->timestamp++;
27 return 1;
28 }

jacob navia
Guest
Posts: n/a

 09-16-2011
Le 16/09/11 08:30, jacob navia a écrit :

> typedef struct _ListElement {
> struct _ListElement *Next;
> char Data[];
> } ListElement;

Should be:

typedef struct tagListElement { // <<<<<
struct tagListElement *Next; // <<<<
char Data[];
} ListElement;

Underscores should not start an identifier since those are reserved for
the implementation. Change _ListElement to tagListElement.

Ben Bacarisse
Guest
Posts: n/a

 09-16-2011
jacob navia <(E-Mail Removed)> writes:
<snip>
> -----------------------
> This is an evergreen of data structures programming. In most classes
> you will get an exercise like this:
>
> Exercise 7: Given a list "L" reverse it without moving its contents.
>
> The solution for this exercise is to go through the list, relinking the
> "Next" pointers in the inverse order, without touching any data that
> the list may hold. We will use the code of the C containers library and
> study how it is done.
>
> The library uses a simple structure "ListElement" whose definition runs
> like this:
>
> typedef struct _ListElement {
> struct _ListElement *Next;
> char Data[];
> } ListElement;
>
> We have a pointer to the next element of the list, and a chunk of data
> of unspecified length. This is the basic structure that will be used
> to hold the list data. Besides the list element we have a header of the
> list, containing several fields not relevant to the task of our reverse
> function. We will use only the following fields from that header:
>
> o RaiseError: Pointer to the error function.
> o First: The first element of the list.
> o count: The number of items in the list.
> o Last: A pointer to the last element of the list.
> o timestamp: A counter of the modifications done to the list.

I'd show this other structure. This is a teaching example after all.
timestamp is an odd name for a counter. Obviously I can see the
connection, but an update count and a time are only loosely connected.

> The interface of the reverse function is simple: it receives a list to
> reverse as input and returns an integer result code. A negative result
> means failure (with different negative numbers as error codes) and a
> positive number means success.
>
> 1 static int Reverse(List *l)
> 2 {
> 3 ListElement *Previous,*Current,*Next;
> 4
>
> The first thing to do in a serious program is to check the validity of
> the received arguments, i.e. test the preconditions as Bertrand Meyer
> would say it. We test that the list handed to us is not Null (lines
> 5- and that the list is not read-only (lines 9-12) since we are
> going to modify it.

This is not what I understand by the phrase "function precondition".
Maybe Meyer uses it that way but I would be surprised. Almost by
definition, a precondition is something that you *don't* check. It is
stated in the documentation (and probably as a comment) but as soon as
you test for it it becomes part of the defined behaviour of the function
and stops being a precondition. You can (often during development) use
something like assert to make sure that your callers are keeping to the
implied contract of the preconditions, but as soon as this check turns
into "when l == NULL return error number XYZ" it stops being (at least
by my understanding of the term) a precondition.

Since this is a teaching exercise, you could ask the reader to state the
function's preconditions because it does have quite a few interesting
ones.

<snip>
> Then, we test for special conditions, i.e. for degenerate cases.
> Obviously, reversing an empty list or a list with only one element is
> very easy: we have nothing to do. We test for that (line 13) and return
> success immediately if that is the case.
>
> 13 if (l->count < 2)
> 14 return 1;

It's not clear why you need to do this. There may be some tiny
performance benefit, but it suggests to the learner that these lists are
special when, in most cases, they are not. I think the following code
can be written to handle all lists, and the change needed to do that is
simplification of the code. In other words, I think you've turned the
empty list into a special case where it need not be one.

This test also induces another pre-condition for the function by which I
mean a pre-condition in my sense of the word (something that must be
true that the function does not test and handle as part of its defined
behaviour). However, I can see that there is teaching value here as you
can ask about the consequences of leaving the test there and the merits
of removing it.

> Now, we setup our loop. We start with the first element (that must
> exist since the list has more than one element, line 15). Since we are
> going to reverse the list, the first element will be the last and the
> last will be the first. We setup the last one immediately since we know
> it in line 16. And before we start there is no previous element so we
> set it to NULL.
>
> 15 Next = l->First;
> 16 l->Last = l->First;
> 17 Previous = NULL;
>
> Now we go through the whole list. We save the "Next" value, and advance
> it to the next element. Then, we set the value of the current element's
> pointer to the previous, i.e. our current will point to previous
> reversing the direction of the list.
>
> 18 while (Next) {
> 19 Current = Next;
> 20 Next = Next->Next;
> 21 Current->Next = Previous;
> 22 Previous = Current;
> 23 }
>
> OK, we are done. We have gone through the whole list reversing
> pointers, and we arrive at the cleanup. We should first establish our
> "First" pointer, then we should ensure that our last element has
> the Null marker, and that our list is marked as modified. We return a
> positive number meaning success.
>
> 24 l->First = Previous;
> 25 l->Last->Next = NULL;

This is, for me, the problem line. I don't think it's needed yet it
turns the empty list into a special case.

> 26 l->timestamp++;
> 27 return 1;
> 28 }

--
Ben.

jacob navia
Guest
Posts: n/a

 09-16-2011
Le 16/09/11 13:51, Ben Bacarisse a écrit :
>>
>> 24 l->First = Previous;
>> 25 l->Last->Next = NULL;

>
> This is, for me, the problem line. I don't think it's needed yet it
> turns the empty list into a special case.

Bright mind Ben, you have a bright mind.

Exactly.

l->Last was modified as the first object with Previous that has
the NULL value at the start.

That line is not necessary. If we eliminate it, the empty list will
pass.

Thanks.

In the other stuff you are mostly right: timestamp is a bad name,
and I will show the whole structure header for the list.

Concerning the "preconditions" term, Betrand Meyer uses it as one of the
two parts of a function contract: The preconditions and the postconditions.

The first ones are the ones that the function requires from its caller,
and the second ones are the ones that the function
ensures to its users. They are part of the Eiffel language that Meyer
designed.

It is true that Reverse has other preconditions not mentioned
explicitely or shown in the code:

o We assume a list that is terminated by a NULL pointer
o We assume a list without cycles

Testing for the first means

if (l->count > 0) assert(l->Last->Next == NULL);

Testing for the second needs

rvp = l->First;
rvpFast = rvp->Next;
while (rvpFast) {
if (rvp == rvpFast) {
iError.RaiseError("Reverse", CIRCULAR_LIST);
return ERROR_CIRCULAR_LIST;
}
if (rvpFast)
rvpFast = rvpFast->Next;
if (rvpFast)
rvpFast = rvpFast = rvpFast->Next;
rvp = rvp->Next;
}
return 0;

We establish two pointers to traverse the list, one that will move
twice as fast as the first. If they are ever equal there is a circle
in the list.

I will add that to the solved exercises.

Thanks again

jacob navia
Guest
Posts: n/a

 09-16-2011
Le 16/09/11 14:18, jacob navia a écrit :
> if (rvpFast)
> rvpFast = rvpFast = rvpFast->Next;

OOOPS, that should be:

> rvpFast = rvpFast->Next;

Sorry

Malcolm McLean
Guest
Posts: n/a

 09-16-2011
On Sep 16, 3:26*pm, jacob navia <(E-Mail Removed)> wrote:
>

Ypu can cut that code down to just this

{
NODE *prev = NULL;
NODE *temp;

{
}
return prev;
}

This expresses the good things and the bad things about C. The code's
only a few lines, so it's hardly worthwhile writing a function for it.
Normally you'll have a linked list consisting of a "next" member
embedded in some structure or other, and you can just modify the code
boilerplate. But it would be nicer to be able to make it generic.
Also, the code will crash if passed an invalid list, but it needs no
special conditions for the empty list or the one-member list case.
Also, you've got the annoying little dance with a temporary. It's not
possible to glance at the code and see that it is right, as you can
with the performance-insensitive solution.

{
}

--
Visit my website
http://www.malcolmmclean.site11.com/www

ImpalerCore
Guest
Posts: n/a

 09-16-2011
On Sep 16, 7:51*am, Ben Bacarisse <(E-Mail Removed)> wrote:
> jacob navia <(E-Mail Removed)> writes:

[snip]

> > The first thing to do in a serious program is to check the validity of
> > the received arguments, i.e. test the preconditions as Bertrand Meyer
> > would say it. We test that the list handed to us is not Null (lines
> > 5- and that the list is not read-only (lines 9-12) since we are
> > going to modify it.

>
> This is not what I understand by the phrase "function precondition".
> Maybe Meyer uses it that way but I would be surprised. *Almost by
> definition, a precondition is something that you *don't* check. *It is
> stated in the documentation (and probably as a comment) but as soon as
> you test for it it becomes part of the defined behaviour of the function
> and stops being a precondition. *You can (often during development) use
> something like assert to make sure that your callers are keeping to the
> implied contract of the preconditions, but as soon as this check turns
> into "when l == NULL return error number XYZ" it stops being (at least
> by my understanding of the term) a precondition.

From my perspective, a precondition for a function is independent of
the method used to handle its violations. This is what contrasts the
defensive programming from the design by contract paradigm. They both
see the same function preconditions; they respond to them differently.

Take 'size_t strlen( const char* str )' for example.

\code
size_t strlen( const char* str )
{
const char* s;

for ( s = str; *s != '\0'; ++s )
;

return s - str;
}
\endcode

If I were to list the preconditions, I would say that 'str' must
reference a character buffer and that buffer contain a terminating
'\0' character. Some of these preconditions are testable, others are
not (when a memory debugger is not present). It is difficult to
impossible to verify whether a fixed-width buffer passed to strlen
contains a '\0' without potentially generating a memory access
violation.

The de-facto error-handling method of much of the standard library is
to do nothing, which in my opinion blends the precondition with its
implicit response (nothing) to a precondition violation. In other
languages like C++ or Eiffel, the response to precondition violations
may result an exception, or violate some explicit contract.

The point is that I believe it's a faulty view to combine a function's
preconditions with how the module responds to them. For example, I
see GLib's abundant use of g_return_if_fail macros as a defensive
programming response to function preconditions.

\code
size_t strlen( const char* str )
{
const char* s;

g_return_val_if_fail( str != NULL, 0 );

for ( s = str; *s != '\0'; ++s )
;

return s - str;
}
\endcode

The 'assert' function is more like the design by contract approach by
responding to a precondition violation by aborting the program.

size_t strlen( const char* str )
{
const char* s;

assert( str != NULL );

for ( s = str; *s != '\0'; ++s )
;

return s - str;
}
\endcode

Best regards,
John D.

[snip]

jacob navia
Guest
Posts: n/a

 09-16-2011
Le 16/09/11 16:21, ImpalerCore a écrit :
> From my perspective, a precondition for a function is independent of
> the method used to handle its violations. This is what contrasts the
> defensive programming from the design by contract paradigm. They both
> see the same function preconditions; they respond to them differently.
>

I would agree with that. For instance, "Reverse" needs a list with no
cycles.

It would be hugely expensive to test this precondition (I gave the code
for the test in an answer to Ben) since implies going through all the
list. But you are right that all the preconditions should be explicitely
documented, even if untested in the code.

The other precondition is that the list must be terminated by a NULL
pointer and all the Next pointer must be valid.

The first part can be easily tested with:

if (l->count > 0) assert(l->Last->Next == NULL);

This is cheap to do because the header structure stores a pointer
to the last element. But if there is no pointer to the last element
we have to go through the whole list, what makes testing for that
precondition extremely expensive.

I will add a commentary with this content to the preconditions
part.

Malcolm McLean
Guest
Posts: n/a

 09-16-2011
On Sep 16, 5:47*pm, jacob navia <(E-Mail Removed)> wrote:
> But you are right that all the preconditions should be explicitely
> documented, even if untested in the code.
>

Depends on the audience. For a proposed standard library, I agree.
However in client code you can expect users to understand that linked
lists will be NULL-terminated and cycle free. There is a problem with
null pointers, which may be used to represent either the empty list or
an invalid list.
--
Read my review of Atlas Shrugged
http://www.malcolmmclean.site11.com/www

ImpalerCore
Guest
Posts: n/a

 09-16-2011
On Sep 16, 10:09*am, Malcolm McLean <(E-Mail Removed)>
wrote:
> On Sep 16, 3:26*pm, jacob navia <(E-Mail Removed)> wrote:
>
>
>
> Ypu can cut that code down to just this
>
> {
> * NODE *prev = NULL;
> * NODE *temp;
>
> * {
> * * temp = head->next;
> * * head->next = prev;
> * * prev = head;
> * * head = temp;
> * }
> * return prev;
>
> }
>
> This expresses the good things and the bad things about C. The code's
> only a few lines, so it's hardly worthwhile writing a function for it.

Code compression is not the best reason for defining functions. Code
abstraction is. As such, I fail to see how this is a 'bad thing'
about C. I see it as a very good thing, of any programming language
worth using.

> Normally you'll have a linked list consisting of a "next" member
> embedded in some structure or other, and you can just modify the code
> boilerplate. But it would be nicer to be able to make it generic.

It is possible to make it generic if you're willing to place the onus
on the user to be responsible for the keeping track of node's
referenced object type. GLib has a linked list structure with a
void*, and a function like reverse can be implemented independently
from the knowledge of the list's object type.

In my opinion, having one general abstraction for a linked list using
void* is more valuable than packaging the type information in the list
node (so you don't have to cast the node's object pointer). If the
number of types that require linked lists is one or two, or efficiency
is a high priority, then packaging a specific object type in the list
node with a struct can be better.

> Also, the code will crash if passed an invalid list,

You trade safety for efficiency. It's a judgement call; it only looks
compiler's standard string library in <string.h> usually make a choice
for efficiency. Again this goes back to whether you want to test for
a function's preconditions, and if tested how you respond to them:
early-exit with an error value, argument substitution, use a benign
value, set a global errno, log a message to the screen or to a file,
assert, controlled-crash (like saving user's work), let the OS deal
with it ...

> but it needs no
> special conditions for the empty list or the one-member list case.
> Also, you've got the annoying little dance with a temporary. It's not
> possible to glance at the code and see that it is right,

That's a matter of experience. Sure, maybe the lisp people will get
it right away. I wouldn't necessarily bet that the average C
programmer will "get it" better. I doubt most students learning
linked lists for the first time will be able to visually inspect
either and know that it's right without sitting at a computer (I know
I sure didn't).

This is why you write functions to begin with, to reduce the
complexity into containable pieces (what I refer to as code
abstraction). Users of the linked list reverse function shouldn't
want or need to see that it's done right. Granted if the function is
implemented poorly, it becomes a concern, but that should be visible
with external behavior or tests, not through inspecting the source
code. For example, if one is a Windows applications programmer, you
really don't want to dig down into the source of a Windows API call to
see if it's doing something right or wrong, do ya?

> as you can
> with the performance-insensitive solution.
>
> {
>
> }

The real advantage of functions is that you've hidden the
implementation details, so if the 'inefficientreverse' function is not
good enough from some limitation, you can modify the function rather
When a regular posts that they'd prefer to do linked lists by hand, I
scratch my head and wonder why. Why is the linked list seemingly
impervious in the mindset of some to improvement by abstraction?

(I know you likely already know most of this stuff, just putting my
thoughts out there)

Best regards,
John D.