Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Dynamic lists of strings in C

Reply
Thread Tools

Dynamic lists of strings in C

 
 
hlubenow
Guest
Posts: n/a
 
      03-31-2007
Hello,

I really like Perl and Python for their flexible lists like @a (Perl) and
a[] (Python), where you can easily store lots of strings or even a whole
text-file.

Now I'm not a C-professional, just a hobby-programmer trying to teach it
myself. I found C rather difficult without those lists (and corresponding
string-functions).
Slowly getting into C-strings, I thought, what if I take a C-string and
treat it as a list, with the elements separated by '\n' ? After all it's
just data stored byte by byte in memory with a closing '\0'. So these
strings could become very long.
Then I wrote some functions based on this idea. They worked, but I had to
realloc() memory for the whole list every list-operation. So this could be
done faster. Someone told me, I could try it with a "linked list".
I read about that and rewrote the functions. To my surprise with structs
this was easier to do than the string-version.
Anyway, below I post an example of what I did (I promise, if code get's any
longer than this, I'll put it up as a file for download). It just defines
one or two list and prints some results, demonstrating the list-functions.

Please take a look at main(). With the comments there, it should give you an
idea of what I'm after.

Well, what do you think of it (besides me casting malloc() ), that is
what do you think of my code, but also of the idea in general ?

I know, other people have tried something similar before:

http://jengelh.hopto.org/f/libHX/
http://mij.oltrelinux.com/devel/simclist/
https://sourceforge.net/project/show...group_id=97342
http://www.pronix.de/comment/site-10...06/site-1.html

But why for example has this never made its way to the C standard library
like std::vector has to STL in C++ ?

See you

H.

------------------------------------------------------------------------

/*
list.c: Provides linked lists of strings.

Compile with:
gcc -W -Wall -pedantic -ansi -o listexample list.c

Written by hlubenow (Email: http://www.velocityreviews.com/forums/(E-Mail Removed)), (C) 2007. License: LGPL.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Library General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU Library General Public
License along with this program; if not, write to the
Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct list
{
char *content;
struct list *next;
};

char *strMalloc(unsigned int s);
char *strRealloc(char *a, unsigned int s);
struct list *listMalloc(unsigned int s);
struct list *newList(void);
struct list *listUnshift(struct list *head, char *b);
struct list *listAssign(struct list *head, int pos, char *b);
struct list *listAppend(struct list *head, char *b);
struct list *listInsert(struct list *head, int pos, char *b);
int countList(struct list *head);
char *getElement(struct list *head, int pos);
struct list *deleteElement(struct list *head, int pos);
struct list *deleteElements(struct list *head, int start, int end);
struct list *strToList(char *a, char *b);
char *listToStr(struct list *head, char *joinstr);
int printList(struct list *head);
struct list *clearList(struct list *head);


int main(void)
{
struct list *a;
struct list *s;
int count;
char *elem;
char *j;
char *x;
int i;

/* Ok, let's go: */

/* Create a list */
a = newList();

/* Fill it */
a = listAppend(a, "apple");
a = listAppend(a, "banana");
a = listAppend(a, "cherry");
a = listAppend(a, "strawberry");

/* Count it */
count = countList(a);

printf("\nThe list has %d elements now:\n", count);

/* Get single elements from it */
for (i = 0; i < count; i++)
{
elem = getElement(a, i);
printf("%d: %s\n", i, elem);
free(elem);
}

/* Directly define elements */
a = listAssign(a, 1, "pear");

puts("\nList changed:");
printList(a);

/* Even beyond the borders of the list */
a = listAssign(a, 5, "lemon");

puts("\nList changed again:");
printList(a);

count = countList(a);
printf("The list has %d elements now.\n\n", count);

/* You also can insert (listInsert()) and delete elements
(deleteElements()) from the list */

/* Furthermore strings can be split and returned as a list: */

s = strToList("Split me into parts", " ");

count = countList(s);
for (i = 0; i < count; i++)
{
elem = getElement(s, i);
printf("%d: %s\n", i, elem);
free(elem);
}

s = clearList(s);

/* And lists can be joined to a string: */
a = listAssign(a, 4, "mango");
j = listToStr(a, " ");

printf("\nA string: %s\n", j);

free(x);
free(j);

/* Please call this, if you don't need the list anymore: */
a = clearList(a);

return 0;
}


char *strMalloc(unsigned int s)
{
/* Allocates memory for a string, testing if allocation has
been successfull. */

char *a;
if ((a = (char *) malloc(s * sizeof(char))) == NULL)
{
puts("Error allocating memory.");
exit(1);
}

return a;
}


char *strRealloc(char *a, unsigned int s)
{
/* Reallocates memory of a string, testing if reallocation has
been successfull. */

if ((a = (char *) realloc(a, s * sizeof(char))) == NULL)
{
puts("Error reallocating memory.");
exit(1);
}

return a;
}


struct list *listMalloc(unsigned int s)
{
/* Allocates memory for a list, testing if allocation has
been successfull. */

struct list *a;
if ((a = (struct list *) malloc(s * sizeof(struct list))) == NULL)
{
puts("Error allocating list-memory.");
exit(1);
}

return a;
}


/* List functions. */

struct list *newList(void)
{
return NULL;
}

struct list *listUnshift(struct list *head, char *b)
{
/* Inserts an element at the beginning of the list, like
Perl's "unshift()". The first element has to be put into
the list this way. listAppend() and listAssign() take care
of this automatically. */

struct list *c = listMalloc(1);
c->content = strMalloc(strlen(b) + 1);
strcpy(c->content, b);
c->next = head;
return c;
}


struct list *listAssign(struct list *head, int pos, char *b)
{
/* Lets you define or redefine list-elements.
If pos is greater than the list, the list is extended and the new
element is appended. */

int listlen;
int i;
struct list *a;

if (pos < 0)
{
puts ("List index out of range.");
exit(1);
}

if (head == NULL)
{
head = listUnshift(head, b);
return head;
}

listlen = countList(head);

if (pos >= listlen)
{
for (i = 0; i < pos - listlen; i++)
{
head = listAppend(head, "");
}

head = listAppend(head, b);
return head;
}

a = head;

for (i=0; i < pos; i++)
{
a = a->next;
}

a->content = strRealloc(a->content, strlen(b) + 1);
strcpy(a->content, b);
return head;
}


struct list *listAppend(struct list *head, char *b)
{
struct list *a;
struct list *c;

if (head == NULL)
{
head = listUnshift(head, b);
return head;
}

c = listMalloc(1);
c->content = strMalloc(strlen(b) + 1);
strcpy(c->content, b);

a = head;

while(a->next)
{
a = a->next;
}
a->next = c;
c->next = NULL;
return head;
}


struct list *listInsert(struct list *head, int pos, char *b)
{
/* Inserts a new element into the list extending the list. */

int listlen;
int i;
struct list *a;
struct list *c;

if (head == NULL || pos == 0)
{
head = listUnshift(head, b);
return head;
}

listlen = countList(head);

if (pos >= listlen)
{
puts ("List index out of range.");
exit(1);
}

c = listMalloc(1);
c->content = strMalloc(strlen(b) + 1);
strcpy(c->content, b);

a = head;

for (i=0; i < pos - 1; i++)
{
a = a->next;
}

c->next = a->next;
a->next = c;
return head;
}


int countList(struct list *head)
{
/* Returns the number of elements of the list.
Also used internally. */

int x = 0;

if (head == NULL)
{
return x;
}

while(head->next)
{
x ++;
head = head->next;
}
x ++;
return x;
}


char *getElement(struct list *head, int pos)
{
/* Returns the element at position pos of the list. */

int i;
char *a;
int listlen = countList(head);

if (pos >= listlen)
{
puts ("List index out of range.");
exit(1);
}

for (i=0; i < pos; i++)
{
head = head->next;
}

a = strMalloc(strlen(head->content) + 1);
strcpy(a, head->content);
return a;
}


struct list *deleteElement(struct list *head, int pos)
{
struct list *a;
struct list *b;
struct list *c;
int i;
int length = countList(head);

if (pos >= length || pos < 0)
{
puts ("List index out of range.");
exit(1);
}

if (length <= 1)
{
if (length == 1)
{
free(head);
}

return NULL;
}

a = head;
b = a->next;

if (length == 2)
{
a->next = NULL;
free(b);
return head;
}

for (i=0; i < pos - 1; i++)
{
a = a->next;
}

b = a->next;
c = b->next;

a->next = c;
free(b);

return head;
}


struct list *deleteElements(struct list *head, int start, int end)
{
/* Deletes elements starting from position "start" to position
"end" from the list.

If -1 is passed as "end", the list is deleted
from position "start" to its end. */

int i;
int length = countList(head);

if (start >= length || end >= length
|| start < 0 || end < -1)
{
puts ("List index out of range.");
exit(1);
}

if (start > end)
{
puts ("Invalid values.");
exit(1);
}

if (end == -1)
{
end = length - 1;
}

for (i=start; i <= end; i++)
{
head = deleteElement(head, start);
}

return head;
}


struct list *strToList(char *a, char *b)
{
/* Splits a string at "b" and converts the result
into a list. "listUnshift()" instead of "listAppend()" is
used for speed reasons (tricky). */

struct list *head;
char *c;
int lenc;
int lenb;
int i;
int u;
int x;

if (strstr(a, b) == NULL)
{
puts("Splitpart not in string to split !");
return NULL;
}

head = NULL;

c = strMalloc(strlen(a) + 1);
strcpy(c, a);

lenc = strlen(c);
lenb = strlen(b);

for(i = lenc - 1; i >= 0; i--)
{
x = 0;

if(i >= lenb - 1 && *(c + i) == *(b + lenb - 1))
{
for(u = 0; u < lenb; u++)
{
if(*(c + i - lenb + 1 + u) == *(b + u))
{
x++;
}
else
{
break;
}
}

if(x == lenb)
{
*(c + i - lenb + 1) = '\0';
if(i != lenc - 1)
{
head = listUnshift(head, c + i + 1);
}
}
}
}

head = listUnshift(head, c);

free(c);

return head;
}


char *listToStr(struct list *head, char *joinstr)
{
/* Join a list to a single string, connecting the
list-elements with "joinstr". */

int size;
char *b;
struct list *a = head;

while(a->next != NULL)
{
size += strlen(a->content);
size += strlen(joinstr);
a = a->next;
}

size += strlen(a->content);
size ++;

b = strMalloc(size);
strcpy(b, "");

a = head;

while(a->next != NULL)
{
strcat(b, a->content);
strcat(b, joinstr);
a = a->next;
}

strcat(b, a->content);
return b;
}


int printList(struct list *head)
{
while(head->next != NULL)
{
if(*(head->content + strlen(head->content) - 1) != '\n')
{
puts(head->content);
}
else
{
printf("%s", head->content);
}

head = head->next;
}
if(*(head->content + strlen(head->content) - 1) != '\n')
{
puts(head->content);
}
else
{
printf("%s", head->content);
}

return 0;
}


struct list *clearList(struct list *head)
{
/* If you don't need your list any more, you're supposed to call
this to free the memory of each element's struct instance. */

struct list *a;
struct list **b;
int listlen;
int i;

listlen = countList(head);

/* Create "listlen" pointers to each element (structure) of the list. */

if ((b = (struct list **) malloc(listlen * sizeof(struct list *))) ==
NULL)
{
puts("Error allocating memory.");
exit(1);
}

a = head;

for (i=0; i < listlen; i++)
{
b[i] = a;
a = a->next;
}

for (i=listlen - 1; i == 0; i--)
{
if (i > 0)
{
b[i - 1]->next = NULL;
}
free(b[i]);
}

free(b);

return NULL;
}
 
Reply With Quote
 
 
 
 
Jens Thoms Toerring
Guest
Posts: n/a
 
      03-31-2007
hlubenow <(E-Mail Removed)> wrote:
> Well, what do you think of it (besides me casting malloc() ), that is
> what do you think of my code, but also of the idea in general ?


> I know, other people have tried something similar before:


> http://jengelh.hopto.org/f/libHX/
> http://mij.oltrelinux.com/devel/simclist/
> https://sourceforge.net/project/show...group_id=97342
> http://www.pronix.de/comment/site-10...06/site-1.html


I guess if you have a closer look you will find a lot more of
such examples, I am using something similar (but not restricted
to strings) in one of my programs since quite some time.

> But why for example has this never made its way to the C standard library
> like std::vector has to STL in C++ ?


Probably because it's rarely needed and there's too much memory
allocation behind the scenes - it looks like for that kind of
reasons not even strdup() was included (a function you could
have used here very well).

> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>


> struct list
> {
> char *content;
> struct list *next;
> };


That's not a list, that's a list element. And why not store the
length of what content points to, that would safe a lot of un-
necessary strlen() calls. And you probably could avoid of lot
of repeated calculation if you would store the number of elements
somewhere instead of calling countList() over and over again.

And, while you're still at it, you probably should also store
a pointer to the end if the list, so you don't have to loop
over the whole list when you want to append to the end.

> char *strMalloc(unsigned int s)


It's much better to always use size_t instead of unsigned int,
since a size_t is always large enough while you can't be sure
with unsigned int and also malloc() etc. expects a variable of
this type as the size argument.

> {
> /* Allocates memory for a string, testing if allocation has
> been successfull. */


> char *a;
> if ((a = (char *) malloc(s * sizeof(char))) == NULL)


Why, oh why, not

a = malloc( s * sizeof *a )

What do you get from the cast and the sizeof(char), which is
always 1?

> {
> puts("Error allocating memory.");
> exit(1);


With this your code already has become useless for inclusion into
any larger project. If I run out of memory I need to be notified
about the problem and the program can't suddenly die on me. But
if you insist on exit() return EXIT_FAILURE instead of 1 (and
print error messages to stderr).

> }


> return a;
> }


> struct list *listAssign(struct list *head, int pos, char *b)
> {
> /* Lets you define or redefine list-elements.
> If pos is greater than the list, the list is extended and the new
> element is appended. */


> int listlen;
> int i;
> struct list *a;


> if (pos < 0)
> {
> puts ("List index out of range.");
> exit(1);
> }


If you test the arguments (laudable) you also should check 'b'
isn't a NULL pointer.

> if (head == NULL)
> {
> head = listUnshift(head, b);
> return head;
> }


> listlen = countList(head);


> if (pos >= listlen)
> {
> for (i = 0; i < pos - listlen; i++)
> {
> head = listAppend(head, "");


Why assign an empty string to the "undefined" elements in between
instead of having a NULL pointer for those?

> }


> head = listAppend(head, b);
> return head;
> }


> a = head;


> for (i=0; i < pos; i++)
> {
> a = a->next;
> }


> a->content = strRealloc(a->content, strlen(b) + 1);
> strcpy(a->content, b);


Why not doing the copy also in strMalloc() (making that a
strdup() function) or strRealloc()?

> return head;
> }


> char *getElement(struct list *head, int pos)
> {
> /* Returns the element at position pos of the list. */


> int i;
> char *a;
> int listlen = countList(head);


> if (pos >= listlen)
> {
> puts ("List index out of range.");
> exit(1);
> }


So it's ok if 'pos' is negative or 'head' a NULL pointer?

> for (i=0; i < pos; i++)
> {
> head = head->next;
> }


Is it a good idea to return an empty string for elements that are
"undefined", i.e. that where created when the user created a
new element with an index larger than the length of the array?
The empty string can very well be also a "legal" entry in the
list.

And you definitely have to document clearly that the user has
to deallocate the memory the return value of this function
points to when (s)he's done with that string.


> a = strMalloc(strlen(head->content) + 1);
> strcpy(a, head->content);
> return a;
> }


> struct list *deleteElement(struct list *head, int pos)


Since you seem to allow "undefined" elements in between wouldn't
it make a lot more sense to "undefine" the element instead of
removing the element of the list, especially since then all the
indices for strings later in the list get garbled? You at least
should give the user the choice of either a Perl-like delete
(that doesn't remove the element but marks it as undefined) and
a Perl-like splice (that actually removes parts of the array).

> {
> struct list *a;
> struct list *b;
> struct list *c;
> int i;
> int length = countList(head);


> if (pos >= length || pos < 0)
> {
> puts ("List index out of range.");
> exit(1);
> }


> if (length <= 1)
> {
> if (length == 1)
> {
> free(head);


Wouldn't you also need to free the string pointed to by that element?

> }


> return NULL;
> }


> a = head;
> b = a->next;


> if (length == 2)
> {
> a->next = NULL;
> free(b);


Same problem here.

> return head;
> }


> for (i=0; i < pos - 1; i++)
> {
> a = a->next;
> }


> b = a->next;
> c = b->next;


> a->next = c;
> free(b);


And another time.

> return head;
> }


> struct list *strToList(char *a, char *b)
> {
> /* Splits a string at "b" and converts the result
> into a list. "listUnshift()" instead of "listAppend()" is
> used for speed reasons (tricky). */


My guess is that this is just an example of premature optimization...

> struct list *head;
> char *c;
> int lenc;
> int lenb;
> int i;
> int u;
> int x;


> if (strstr(a, b) == NULL)
> {
> puts("Splitpart not in string to split !");
> return NULL;
> }


> head = NULL;


> c = strMalloc(strlen(a) + 1);
> strcpy(c, a);


The following is one of the instances where strtok() could be
used instead - would probably make the code a bit easier to read.

> lenc = strlen(c);
> lenb = strlen(b);


> for(i = lenc - 1; i >= 0; i--)
> {
> x = 0;


> if(i >= lenb - 1 && *(c + i) == *(b + lenb - 1))
> {
> for(u = 0; u < lenb; u++)
> {
> if(*(c + i - lenb + 1 + u) == *(b + u))
> {
> x++;
> }
> else
> {
> break;
> }
> }


> if(x == lenb)
> {
> *(c + i - lenb + 1) = '\0';
> if(i != lenc - 1)
> {
> head = listUnshift(head, c + i + 1);
> }
> }
> }
> }


> head = listUnshift(head, c);


> free(c);


> return head;
> }



> char *listToStr(struct list *head, char *joinstr)
> {
> /* Join a list to a single string, connecting the
> list-elements with "joinstr". */


> int size;
> char *b;
> struct list *a = head;


> while(a->next != NULL)
> {
> size += strlen(a->content);
> size += strlen(joinstr);
> a = a->next;
> }


> size += strlen(a->content);


Why not make that

size_t join_len = strlen( joinstr );

for ( a = head; a != NULL; a = a->next )
size += strlen( a->context ) + join_len;

Looks a bit clearer to me. And calling strlen() on 'joinstr' for
each element is a waste of CPU cycles. BTW, you should test if
'joinstr' isn't a NULL pointer.

> size ++;


> b = strMalloc(size);
> strcpy(b, "");


> a = head;


> while(a->next != NULL)
> {
> strcat(b, a->content);
> strcat(b, joinstr);
> a = a->next;
> }


> strcat(b, a->content);


You better use 'b' as a pointer to the current position in the
string here - strcat() has to over the whole string each time.
What about e.g.

char *p;

sptr = b = strMalloc(size);

for ( a = head; a->next != NULL; a = a->next ) {
strcpy( sptr, a->content;
sptr += strlen( a->content );
strcpy( sptr, joinstr );
sptr += join_len;
}
strcpy( sptr, a->content );

Of course, having the lengths of the strings stored together with
the pointer to the string would avoid lots of calls of strlen().

> return b;
> }


> int printList(struct list *head)
> {
> while(head->next != NULL)
> {
> if(*(head->content + strlen(head->content) - 1) != '\n')
> {
> puts(head->content);


Why would you like to output a '\n' if there's none at the end of
the string?

> }
> else
> {
> printf("%s", head->content);
> }


> head = head->next;
> }
> if(*(head->content + strlen(head->content) - 1) != '\n')
> {
> puts(head->content);
> }
> else
> {
> printf("%s", head->content);
> }
> return 0;
> }


Why not make that just

int printList( struct list *head )
for ( ; head != NULL; head = head->next )
if ( head->content[ strlen( head->content ) - 1 ] != '\n' )
puts(head->content);
else
printf( "%s", head->content );

return 0;
}


> struct list *clearList(struct list *head)
> {
> /* If you don't need your list any more, you're supposed to call
> this to free the memory of each element's struct instance. */


> struct list *a;
> struct list **b;
> int listlen;
> int i;


> listlen = countList(head);


> /* Create "listlen" pointers to each element (structure) of the list. */


> if ((b = (struct list **) malloc(listlen * sizeof(struct list *))) ==
> NULL)
> {
> puts("Error allocating memory.");
> exit(1);
> }


What is that necessary for? And I would be annoyed a lot if my
program would die due to memory exhaustion in a function that
is supposed to free memory.

> a = head;


> for (i=0; i < listlen; i++)
> {
> b[i] = a;
> a = a->next;
> }


> for (i=listlen - 1; i == 0; i--)


Is there any reason why you would free the list starting at the end?
And the loop will normally do nothing since the condition for the
end of the loop probably should be 'i >= 0' instead of 'i == 0'.

> {
> if (i > 0)
> {
> b[i - 1]->next = NULL;


> }


That is absolutely useless.

> free(b[i]);


You never free 'b[i]->content'.

> }


> free(b);


> return NULL;
> }

Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Dynamic lists of strings in C hlubenow C Programming 28 04-11-2007 08:42 PM
Dynamic lists of strings in C hlubenow C Programming 0 03-31-2007 03:01 PM
Strings, Strings and Damned Strings Ben C Programming 14 06-24-2006 05:09 AM
List of lists of lists of lists... =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==?= Python 5 05-15-2006 11:47 AM
converting lists to strings to lists robin Python 10 04-12-2006 04:58 PM



Advertisments