Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C Programming (http://www.velocityreviews.com/forums/f42-c-programming.html)
-   -   List as a dynamic array of increasing-sized arrays (http://www.velocityreviews.com/forums/t736042-list-as-a-dynamic-array-of-increasing-sized-arrays.html)

MartinBroadhurst 10-20-2010 11:02 PM

List as a dynamic array of increasing-sized arrays
 
In my experience, the most common data structures used for lists
supporting addition and removal at the tail, and random access, are
linked lists and dynamic arrays.

* Linked lists require following pointers for traversal (3 lookups per
element, poor cache performance), and have O(n) indexed access.
Adding and removing elements cause many small allocations and
deallocations of memory.
* Dynamic arrays have efficient traversal (1 lookup per element, good
cache performance) and indexed access is O(1).
Adding elements requires reallocation when full, however, which
involves copying all elements.

Instead, I should like to propose a structure consisting of a dynamic
array of increasing-sized arrays:

0 -> 0 1 2 3
1 -> 4 5 6 7
2 -> 8 9 10 11 12 13 14 15
3 -> 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

* Each array after the first two is twice the size of its predecessor.
This means that the structure has the same growth strategy as most
dynamic arrays (double when full), reducing the number of
allocations.
* As new arrays are appended to the end, no copying is required,
except for the pointers when the containing array is resized.
* The regular increase in the size of the arrays makes indexed access
efficient.

I have placed the source code for an implemenation here:
http://www.martinbroadhurst.com/c/list.html

To do:
* Allow specification of a starting capacity.
* The algorithm for finding the block for an index should use log2,
not repeated multiplication and addition.
* The block at the tail should not be freed as soon as it becomes
empty, as this means that a list whose population oscillates around a
power of 2 will repeatedly allocate and deallocate that block.
Instead, the block should be freed when some number of elements from
the next block have been removed, introducing a bit of "hysteresis".

Martin

MartinBroadhurst 10-21-2010 07:22 PM

Re: List as a dynamic array of increasing-sized arrays
 
On 21 Oct, 18:01, Armand <u...@compgroups.net/> wrote:
> Nice iterators.


Thanks, Armand. I think iterators are more useful, though, when
they're "polymorphic", and can decouple collection types from
consumers.
I wrote a polymorphic iterator with just a simple "get" function that
wraps data structure-specific implementations:
http://www.martinbroadhurst.com/source/iterator.h.html
http://www.martinbroadhurst.com/source/iterator.c.html
If I find the time I'm going to unify this with the list one,
providing reverse traversal, random access and eof/bof. Reverse
traversal and random-access can be optional for data structures that
don't support them.

MartinBroadhurst 10-21-2010 07:24 PM

Re: List as a dynamic array of increasing-sized arrays
 
On 21 Oct, 18:51, c...@tiac.net (Richard Harter) wrote:
>
> You've reinvented obstacks. *Good show.
>
>


Thanks, Richard. I've heard of obstacks, but was unsure from the GNU
documentation how they are implemented.

Martin


Max Priority 10-21-2010 08:31 PM

typedef struct {
Dynarray *blocks;
unsigned int lastsize; /* Size of last block */
unsigned int lastcount; /* Number of elements in the last block */
unsigned int count;
} List;

You don't need to keep track of the count of elements.
Knowing lastsize and lastcount allows you to calculate it.

MP

Gene 10-23-2010 08:01 PM

Re: List as a dynamic array of increasing-sized arrays
 
On Oct 20, 7:02*pm, MartinBroadhurst <martin.broadhu...@gmail.com>
wrote:
> In my experience, the most common data structures used for lists
> supporting addition and removal at the tail, and random access, are
> linked lists and dynamic arrays.
>
> * Linked lists require following pointers for traversal (3 lookups per
> element, poor cache performance), and have O(n) indexed access.
> * Adding and removing elements cause many small allocations and
> deallocations of memory.
> * Dynamic arrays have efficient traversal (1 lookup per element, good
> cache performance) and indexed access is O(1).
> * Adding elements requires reallocation when full, however, which
> involves copying all elements.
>
> Instead, I should like to propose a structure consisting of a dynamic
> array of increasing-sized arrays:
>
> 0 -> *0 *1 *2 *3
> 1 -> *4 *5 *6 *7
> 2 -> *8 *9 10 11 12 13 14 15
> 3 -> 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
>
> * Each array after the first two is twice the size of its predecessor.
> * This means that the structure has the same growth strategy as most
> dynamic arrays (double when full), reducing the number of
> allocations.
> * As new arrays are appended to the end, no copying is required,
> except for the pointers when the containing array is resized.
> * The regular increase in the size of the arrays makes indexed access
> efficient.
>
> I have placed the source code for an implemenation here:http://www.martinbroadhurst.com/c/list.html
>


This is pretty close to a VList: http://en.wikipedia.org/wiki/VList
Though they were originally defined as immutable, this was primarily
for thread safety reasons. Mutable VLists work fine.

MartinBroadhurst 10-24-2010 11:23 AM

Re: List as a dynamic array of increasing-sized arrays
 
On 23 Oct, 18:40, c...@tiac.net (Richard Harter) wrote:

Thanks very much for reading it, Richard.

>
> The link in your post isn't right; it should have been
> <http://www.martinbroadhurst.com/c/list.c.html>.
>


My link was to a page containing the source for the dynarray that
contains the list, as well as the list itself. Apologies if this was
confusing.

> Your design is good for
> arrays, not so good for linked lists.


Yes, absolutely, it's an "iterable stack". Inserting in the middle
would be O(n) theoretically, and in practice worse than for a dynamic
array since there would be several block shifts rather than just one.

Perhaps I misnamed it. I'm not sure if the word "list" has as precise
a meaning in computer science as "stack" or "queue" do. I tend to
think of a list as an ordered collection that supports, at a minimum,
adding at the tail.

One could, however, *embed* a linked list into the structure, by
making it a collection of linked list nodes rather than void
pointers.

I wrote an implementation of a linked list embedded in an ordinary
dynamic array a while ago, my rationale being that it might provide
better locality of reference and more efficient allocation than one
based on individually allocated nodes.

Embedding a linked list into this structure, however, would remove the
copying problem associated with growing dynamic arrays.
Further, when a linked list is embedded in a dynamic array, the next
and previous pointers need to be integer offsets into the array buffer
rather than pointers, since otherwise they would be invalidated by
reallocation. As the "array-of-arrays" structure doesn't reallocate,
previous and next can be pointers.

Martin

MartinBroadhurst 10-24-2010 11:25 AM

Re: List as a dynamic array of increasing-sized arrays
 
On 23 Oct, 21:01, Gene <gene.ress...@gmail.com> wrote:
>
> This is pretty close to a VList: *http://en.wikipedia.org/wiki/VList
> Though they were originally defined as immutable, this was primarily
> for thread safety reasons. *Mutable VLists work fine.- Hide quoted text -
>


Thanks for the pointer, Gene. I'll read Bagwell's paper.

Martin

MartinBroadhurst 10-26-2010 01:09 PM

Re: List as a dynamic array of increasing-sized arrays
 
On 21 Oct, 00:02, MartinBroadhurst <martin.broadhu...@gmail.com>
wrote:
>


Below are the timings in seconds for inserting 50,000 elements into
various sequential data structures. These are on an Intel Celeron
667MHz with 256MB of RAM, running Ubuntu Server 10.10 with glibc
version 2.10.1.

Doubly-linked list 0.022233
Singly-linked list 0.021964
Doubly-linked list in dynamic array 0.014827
Deque 0.012803
Dynamic array 0.006111
Dynamic array of increasing-sized arrays 0.005775

The dynamic array of increasing-sized arrays was a little over 4%
faster across the range 50,000 to 250,000 elements. Dynamic array was
faster for fewer or more elements than this.

Full result is here:

http://www.martinbroadhurst.com/c/list.html#performance

Martin


BartC 10-26-2010 03:40 PM

Re: List as a dynamic array of increasing-sized arrays
 
"MartinBroadhurst" <martin.broadhurst@gmail.com> wrote in message
news:615aa5e3-0707-435e-816f-52e4be28740b@v20g2000yqb.googlegroups.com...
> On 21 Oct, 00:02, MartinBroadhurst <martin.broadhu...@gmail.com>
> wrote:
>>

>
> Below are the timings in seconds for inserting 50,000 elements into
> various sequential data structures. These are on an Intel Celeron
> 667MHz with 256MB of RAM, running Ubuntu Server 10.10 with glibc
> version 2.10.1.
>
> Doubly-linked list 0.022233
> Singly-linked list 0.021964
> Doubly-linked list in dynamic array 0.014827
> Deque 0.012803
> Dynamic array 0.006111
> Dynamic array of increasing-sized arrays 0.005775
>
> The dynamic array of increasing-sized arrays was a little over 4%
> faster across the range 50,000 to 250,000 elements. Dynamic array was
> faster for fewer or more elements than this.
>
> Full result is here:
>
> http://www.martinbroadhurst.com/c/list.html#performance


You say "inserting" 50000 elements above. Yet the link says the elements are
added to the tail of each list.

I thought the dynamic array timing seemed pretty good.

Try inserting elements at the head of the list, or in the middle.

--
Bartc


MartinBroadhurst 10-26-2010 04:36 PM

Re: List as a dynamic array of increasing-sized arrays
 
On 26 Oct, 16:40, "BartC" <b...@freeuk.com> wrote:
>
> You say "inserting" 50000 elements above. Yet the link says the elements are
> added to the tail of each list.
>


My mistake, I meant inserting in the sense of inserting into a stack.

> I thought the dynamic array timing seemed pretty good.
>


I also. I had been led to believe that dynamic arrays would suffer
from excessive copying on reallocation by things like this -

http://www.drdobbs.com/architecture-...MY32JVN?pgno=5

- but, at least with glibc, and in that size range, that's not the
case.

> Try inserting elements at the head of the list, or in the middle.
>


As I said in my original post, it's designed for adding at the tail. I
wouldn't expect it to compare favourably with a dynamic array or
linked list for insertion in the middle. My thinking was that it would
be a good data structure when you have an unknown amount of sequential
data and simply want to store it and read it back.

I *would* expect the linked list embedded in an array to be good for
insertion, though.

Martin


All times are GMT. The time now is 05:30 AM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.