![]() |
Fast Container
There are written so many articles but no article is really providing
the answer I'm looking for. Given a scenario with 100000 data elements the day (500000 the week). Potential we might have 3000-5000 updates for individual elements each second. Imagine a filter that decides on each update the a data element will be added, updated or removed. The final container which is sorted by a user defined criteria must be able to handle this amount of data. Info: basically each value of that sorted container is a pointer to the real data. Final scope: we display the data in a table which provides index based positions (row) NOW, what is the right way to implement a fast container for this? Some of the problems I was confronted with: - Using a std::vector deletion/insertion is expensive, is it? - Using a std::set I have no index access. Or is there a way to handle this properly? - Somebody told me something about a weighted balanced tree as possible solution. - Somebody else told me to use standard containers only. |
Re: Fast Container
On 11/2/2011 11:21 AM, Thomas Lehmann wrote:
> There are written so many articles but no article is really providing > the answer I'm looking for. > > Given a scenario with 100000 data elements the day (500000 the week). > Potential we might have 3000-5000 updates for individual elements > each second. > > Imagine a filter that decides on each update the a data element will > be added, updated or removed. > > The final container which is sorted by a user defined criteria must be > able to handle this amount of data. > > Info: basically each value of that sorted container is a pointer to the > real data. > > Final scope: we display the data in a table which provides index based > positions (row) > > NOW, what is the right way to implement a fast container for this? > > Some of the problems I was confronted with: > - Using a std::vector deletion/insertion is expensive, is it? > - Using a std::set I have no index access. Or is there a way to handle this properly? > - Somebody told me something about a weighted balanced tree as possible solution. > - Somebody else told me to use standard containers only. You didn't say whether the order of the objects in a container is important. If you can define deletion from your custom vector as swapping with the last element and resizing by -1, it's fast. Insertion at the end is not too bad, and causes reallocation only once in a while (when the vector grows past its capacity), and you can kind of control that by managing the capacity at "calm" times (shooter's rule: reload when you have a chance, don't wait 'til you're empty). Also, consider std::deque, it's appends are even faster than vector's. V -- I do not respond to top-posted replies, please don't ask |
Re: Fast Container
Le 02/11/11 16:43, Leigh Johnston a écrit :
[snip] if you want fast inserts/deletes > anywhere but also want fast indexing then consider using my > "segmented_array" container which can be found at: > > http://i42.co.uk/stuff/segmented_array.htm > > /Leigh Actually this is a list of vectors. Why should be faster than using list and vector? As far as I can read from your code you do not use those containers. Why? Thanks for publishing that code, it is full of ideas. jacob |
Re: Fast Container
On Nov 2, 5:21*pm, Thomas Lehmann
<thomas.lehmann.priv...@googlemail.com> wrote: > There are written so many articles but no article is really providing > the answer I'm looking for. > > Given a scenario with 100000 data elements the day (500000 the week). > Potential we might have 3000-5000 updates for individual elements > each second. > > Imagine a filter that decides on each update the a data element will > be added, updated or removed. > > The final container which is sorted by a user defined criteria must be > able to handle this amount of data. > > Info: basically each value of that sorted container is a pointer to the > real data. > > Final scope: we display the data in a table which provides index based > positions (row) > > NOW, what is the right way to implement a fast container for this? > > Some of the problems I was confronted with: > - Using a std::vector deletion/insertion is expensive, is it? > - Using a std::set I have no index access. Or is there a way to handle this properly? > - Somebody told me something about a weighted balanced tree as possible solution. > - Somebody else told me to use standard containers only. I think you can use a combination of list and vector. The list is used for fast insertions, deletions etc of the data elements, and the vector contains iterators that are sorted iaw. criteria as defined by the actual data elements, and not the iterator (list unsorted). The vector always remains sorted (insertion of iterator via lower_bound). You have the benefit of random access iterators for fast access on sorted sequence. You have the benefit of fast insertion, removal. Adding data elements happen by pushing back onto the list, and adding the iterator to the sorted sequence (vector). Removing data elements happen by finding the item in the (sorted) vector and using the iterator to erase the data element from the list. The "referring" iterators in the vector remain valid. Kind regards, Werner |
Re: Fast Container
On Nov 2, 11:06*pm, Werner <wer...@gmail.com> wrote:
> I think you can use a combination of list and vector. > The list is used for fast insertions, deletions etc of the data > elements, > and the vector contains iterators that are sorted iaw. criteria as > defined by the actual data elements, and not the iterator (list > unsorted). typedef std::list<DataElement> ElementList; typedef std::vector<ElementList::iterator> ElementRefSequence; ElementList elementList; ElementRefSequence elementRefSequence; |
Re: Fast Container
Well, I did say that it is a sorted container.
|
Re: Fast Container
Maybe I'm doing something wrong but comparing to vector
and deque your segmented_array is not faster. - I'm compiling on Windows Ultimate 64 Bit - I've tried with 64 and 1024 segments. - 200000 elements (with push_back) - took 5ms (vector took 2ms, deque took 5ms) - The code looks same for deque and vector: <code> BOOST_AUTO_TEST_CASE(SegmentedArrayAddValues) { const unsigned int limit = 200000; SegmentedArrayTest<int>* container = new SegmentedArrayTest<int>(); container->createContainer(); // creating content for (unsigned int ix = 0; ix < limit; ++ix) container->addValue(ix); BOOST_CHECK_EQUAL(container->size(), (size_t)limit); // cleanup delete container; } </code> |
Re: Fast Container
On 11/3/2011 10:01 AM, Thomas Lehmann wrote:
> Well, I did say that it is a sorted container. You did, I wasn't paying attention, sorry. Still, take a look at 'std::deque'. V -- I do not respond to top-posted replies, please don't ask |
Re: Fast Container
On Nov 2, 3:21*pm, Thomas Lehmann
<thomas.lehmann.priv...@googlemail.com> wrote: I'm feeling I've missed something... > There are written so many articles but no article is really providing > the answer I'm looking for. > > Given a scenario with 100000 data elements the day (500000 the week). > Potential we might have 3000-5000 updates for individual elements > each second. how are "elements" identified? > Imagine a filter that decides on each update [that] a data element will > be added, updated or removed. > > The final container which is sorted by a user defined criteria must be > able to handle this amount of data. > > Info: basically each value of that sorted container is a pointer to the > real data. > > Final scope: we display the data in a table which provides index based > positions (row) "final scope"? So when it's sorted it goes in a "table" that's indexed? > NOW, what is the right way to implement a fast container for this? use a relational database. I was thinking std::map but I don't know how your elements are identified. std::map has reasonable performance for insertion and deletion- anywhere in the container. And you can choose what to index it with. at the end you can copy it into whatever your user finds conventient. > Some of the problems I was confronted with: > - Using a std::vector deletion/insertion is expensive, is it? yes. Elements are always contiuous so if you delete say the 50th element of a 100 element vector its going to have to move elements 51-99 down one. At 500000 elements that gets expensive. > - Using a std::set I have no index access. Or is there a way to handle this properly? so you want to be able to index it? How are these indexes assigned? Time? > - Somebody told me something about a weighted balanced tree as possible solution. some sort of tree. (which is what a std::map usually is under the hood). B-tree? (that's not a binary tree). > - Somebody else told me to use standard containers only. why? I mean its a good place to start but is it an absolute constraint- and why? Do you have any performance criteria? |
| All times are GMT. The time now is 01:35 PM. |
Powered by vBulletin®. Copyright ©2000 - 2013, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.