Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > STL Container Choice

Reply
Thread Tools

STL Container Choice

 
 
Mike Copeland
Guest
Posts: n/a
 
      03-28-2012
I have a data structure (similar to the one below) that contains a
variety of data types, and I wish to store many objects of this type in
a container that (1) allows fast retrieval access and (2) is reasonably
easy to store and update. Its "access key" is an integer value (e.g.
bibNum).
It seems that a vector is my best choice here (fast direct access via
the key and ease of storage), but map could be better - true? Are there
guidelines for container selection that could help me decide, or is some
other container type a better choice? Pleas advise. TIA

struct FD_TYPE // .FIN/.CFF file data
{
int bibNum; // Bib #
int finTime; // Finish Time
short entDivNum; // Division
short divFinPos; // division Finish Position
short evtFinPos; // Event Finish Position
char entGender; // Sex
char phaseNum; // Phase #
short entAge; // Age
char rCode; // RCode
char entType; // Entrant Type
short numLaps; // # of Laps
short waveNum; // Wave #
char teamTypeCode; // Team Type Code
short evtYear; // Event Year
short penalty; // Penalty
bool bOK; // usage flag
string teamName; // Team name (or "NONE")
string entName; // Entrant Name
} extern FinData;

 
Reply With Quote
 
 
 
 
Jorgen Grahn
Guest
Posts: n/a
 
      03-28-2012
On Wed, 2012-03-28, Mike Copeland wrote:
> I have a data structure (similar to the one below) that contains a
> variety of data types, and I wish to store many objects of this type in
> a container that (1) allows fast retrieval access and (2) is reasonably
> easy to store and update. Its "access key" is an integer value (e.g.
> bibNum).
> It seems that a vector is my best choice here (fast direct access via
> the key and ease of storage), but map could be better - true?


Not enough information.
- if bibNum==4711 exists, do 0--4710 exist, too?
- would it be useful for you to have them sorted?
- how do you plan to modify the container?
- what does "many objects" mean to you?

The way you speak of the data in terms of <key, value> makes me think
you really want std::map or std::tr1::unordered_map. Use the former if
it's useful for you to access the entries sorted on bibNum.

> Are there
> guidelines for container selection that could help me decide, or is some
> other container type a better choice? Pleas advise. TIA
>
> struct FD_TYPE // .FIN/.CFF file data
> {
> int bibNum; // Bib #
> int finTime; // Finish Time
> short entDivNum; // Division

....
> string entName; // Entrant Name
> } extern FinData;


/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
 
 
 
Jorgen Grahn
Guest
Posts: n/a
 
      03-28-2012
On Wed, 2012-03-28, Sam wrote:
....
> Mike Copeland writes:
>
>> I have a data structure (similar to the one below) that contains a
>> variety of data types, and I wish to store many objects of this type in
>> a container that (1) allows fast retrieval access and (2) is reasonably
>> easy to store and update. Its "access key" is an integer value (e.g.
>> bibNum).
>> It seems that a vector is my best choice here (fast direct access via
>> the key and ease of storage), but map could be better - true? Are there
>> guidelines for container selection that could help me decide, or is some
>> other container type a better choice? Pleas advise. TIA

>
> If your keys are generally sequential, without any large gaps, a vector
> would be fine.


It depends. Having to deal with elements which exist but yet don't
exist can lead to messy code and bugs.

> If your keys are sparse, a map is a better choice.
>
> You should also used a typedef to indirectly refer to your container, i.e.
> typedef std::vector<FD_TYPE> FD_TYPE_cont_t, then always use FD_TYPE_cont_t
> to declare the container type, and always use FD_TYPE_cont_t::iterator and
> FD_TYPE_cont_t::const_iterator, to declare any iterators.
>
> In the event that you change your mind regarding the container later, having
> an intermediate typedef saves a tremendous amount of pain.


A matter of personal taste, of course, but I disagree. The containers
are so different that if you switch, you would have to review all the
code anyway.

(I also don't see how a simple search-and-replace is a tremendous
amount of pain. There are tools for such things, and the compiler will
catch any mistakes.)

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
Mike Copeland
Guest
Posts: n/a
 
      03-28-2012
In article <(E-Mail Removed)>,
http://www.velocityreviews.com/forums/(E-Mail Removed) says...
> > I have a data structure (similar to the one below) that contains a
> > variety of data types, and I wish to store many objects of this type in
> > a container that (1) allows fast retrieval access and (2) is reasonably
> > easy to store and update. Its "access key" is an integer value (e.g.
> > bibNum).
> > It seems that a vector is my best choice here (fast direct access via
> > the key and ease of storage), but map could be better - true?

>
> Not enough information.
> - if bibNum==4711 exists, do 0--4710 exist, too?

No. Objects would randomly exist throughout, with non-existant
objects unknown via any access process. The range of possible index
values is 1..1,000,000,000.
> - would it be useful for you to have them sorted?

No. I simply want to store the object and easily/quickly retrieve
them as needed. (A binary search is meaningless here, given the random
distribution of object.)
> - how do you plan to modify the container?

I (may need to) modify individual data elements of the object.
> - what does "many objects" mean to you?

100-100,000 objects.
>
> The way you speak of the data in terms of <key, value> makes me think
> you really want std::map or std::tr1::unordered_map. Use the former if
> it's useful for you to access the entries sorted on bibNum.
>
> > Are there
> > guidelines for container selection that could help me decide, or is some
> > other container type a better choice? Pleas advise. TIA
> >
> > struct FD_TYPE // .FIN/.CFF file data
> > {
> > int bibNum; // Bib #
> > int finTime; // Finish Time
> > short entDivNum; // Division

> ...
> > string entName; // Entrant Name
> > } extern FinData;

 
Reply With Quote
 
Jorgen Grahn
Guest
Posts: n/a
 
      03-29-2012
On Wed, 2012-03-28, Mike Copeland wrote:
> In article <(E-Mail Removed)>,
> (E-Mail Removed) says...
>> > I have a data structure (similar to the one below) that contains a
>> > variety of data types, and I wish to store many objects of this type in
>> > a container that (1) allows fast retrieval access and (2) is reasonably
>> > easy to store and update. Its "access key" is an integer value (e.g.
>> > bibNum).
>> > It seems that a vector is my best choice here (fast direct access via
>> > the key and ease of storage), but map could be better - true?

>>
>> Not enough information.
>> - if bibNum==4711 exists, do 0--4710 exist, too?

> No. Objects would randomly exist throughout, with non-existant
> objects unknown via any access process. The range of possible index
> values is 1..1,000,000,000.
>> - would it be useful for you to have them sorted?

> No. I simply want to store the object and easily/quickly retrieve
> them as needed. (A binary search is meaningless here, given the random
> distribution of object.)


I don't understand that last part. Binary search in a std::vector
sorted by key can be very useful (but insertion/removal would be very
expensive so it's only worth considering if your container rarely
changes.)

Or are you saying the relative order of elements in your container
carries information, which would be destroyed by sorting it?

>> - how do you plan to modify the container?

> I (may need to) modify individual data elements of the object.


But never the container itself, apart from inserting into it?

>> - what does "many objects" mean to you?

> 100-100,000 objects.


OK. Then

>> The way you speak of the data in terms of <key, value> makes me think
>> you really want std::map or std::tr1::unordered_map. Use the former if
>> it's useful for you to access the entries sorted on bibNum.


I think unordered_map is the most natural choice, closely followed by
plain old map.

If you know them, unordered map is pretty much the same thing as
Perl's hash, Python's dict, and the non-standard STL hash_map.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
Mike Copeland
Guest
Posts: n/a
 
      03-29-2012
In article <(E-Mail Removed)>,
(E-Mail Removed) says...
> > No. I simply want to store the object and easily/quickly retrieve
> > them as needed. (A binary search is meaningless here, given the random
> > distribution of object.)

>
> I don't understand that last part. Binary search in a std::vector
> sorted by key can be very useful (but insertion/removal would be very
> expensive so it's only worth considering if your container rarely
> changes.)


There container is constructed at the start of the application, not
changed during, but the data in the objects may be adjusted (counts
updated, etc.). The container is static otherwise.

> Or are you saying the relative order of elements in your container
> carries information, which would be destroyed by sorting it?


No, sorting wouldn't affect its information. However, the data
objects would be (very) sparsely scattered across the container's range.
Perhaps I don't understand how a vector is allocated, but it seems to
me that, if sorted, such a container loses its viability for a binary
search - because the sorting doesn't "compact" the data. Am I wrong
here?

> >> - how do you plan to modify the container?

> > I (may need to) modify individual data elements of the object.

>
> But never the container itself, apart from inserting into it?


True.
>
> >> - what does "many objects" mean to you?

> > 100-100,000 objects.

>
> OK. Then
>
> >> The way you speak of the data in terms of <key, value> makes me think
> >> you really want std::map or std::tr1::unordered_map. Use the former if
> >> it's useful for you to access the entries sorted on bibNum.

>
> I think unordered_map is the most natural choice, closely followed by
> plain old map.


(Not being very familiar with STL containers,) how does one declare
an "unordered_map"? I thought the options were on;y map and multumap...
8<{{
 
Reply With Quote
 
Jorgen Grahn
Guest
Posts: n/a
 
      03-29-2012
On Thu, 2012-03-29, Mike Copeland wrote:
> In article <(E-Mail Removed)>,
> (E-Mail Removed) says...
>> > No. I simply want to store the object and easily/quickly retrieve
>> > them as needed. (A binary search is meaningless here, given the random
>> > distribution of object.)

>>
>> I don't understand that last part. Binary search in a std::vector
>> sorted by key can be very useful (but insertion/removal would be very
>> expensive so it's only worth considering if your container rarely
>> changes.)

>
> There container is constructed at the start of the application, not
> changed during, but the data in the objects may be adjusted (counts
> updated, etc.). The container is static otherwise.
>
>> Or are you saying the relative order of elements in your container
>> carries information, which would be destroyed by sorting it?

>
> No, sorting wouldn't affect its information. However, the data
> objects would be (very) sparsely scattered across the container's range.
> Perhaps I don't understand how a vector is allocated, but it seems to
> me that, if sorted, such a container loses its viability for a binary
> search - because the sorting doesn't "compact" the data. Am I wrong
> here?


Kind of. I was thinking of a vector<T> where the indices are irrelevant
and the key is still a member of T (like in your example)

[4, 32, 33, 44, 1567, 1600, 16218]

You can do a binary search in such a vector (there's even
std::lower_bound() etc so you don't have to implement it yourself).

But I suspect this is superior to unordered_map only in special cases,
when sizeof(T) is very small and so on ...

>> >> - how do you plan to modify the container?
>> > I (may need to) modify individual data elements of the object.

>>
>> But never the container itself, apart from inserting into it?

>
> True.
>>
>> >> - what does "many objects" mean to you?
>> > 100-100,000 objects.

>>
>> OK. Then
>>
>> >> The way you speak of the data in terms of <key, value> makes me think
>> >> you really want std::map or std::tr1::unordered_map. Use the former if
>> >> it's useful for you to access the entries sorted on bibNum.

>>
>> I think unordered_map is the most natural choice, closely followed by
>> plain old map.

>
> (Not being very familiar with STL containers,) how does one declare
> an "unordered_map"? I thought the options were on;y map and multumap...
> 8<{{


Check e.g. the Wikipedia. unordered_map is not available in C++98, but
came with TR1 and is part of C++11. Even my fairly old g++ installation
has it. It's also similar to the good old STL hash_map, which never
made it into the standard:

http://www.sgi.com/tech/stl/hash_map.html

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      03-30-2012
Jorgen Grahn <(E-Mail Removed)> wrote:
> But I suspect this is superior to unordered_map only in special cases,
> when sizeof(T) is very small and so on ...


Why would std::unordered_map be better than std::vector if sizeof(T)
is large?
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      03-30-2012
Mike Copeland <(E-Mail Removed)> wrote:
> There container is constructed at the start of the application, not
> changed during, but the data in the objects may be adjusted (counts
> updated, etc.). The container is static otherwise.


Then use a sorted std::vector.
 
Reply With Quote
 
Jorgen Grahn
Guest
Posts: n/a
 
      03-30-2012
On Fri, 2012-03-30, Juha Nieminen wrote:
> Jorgen Grahn <(E-Mail Removed)> wrote:
>> But I suspect this is superior to unordered_map only in special cases,
>> when sizeof(T) is very small and so on ...

>
> Why would std::unordered_map be better than std::vector if sizeof(T)
> is large?


Why not? If you believe binary search in a vector is faster than
unordered_map lookup in general, then that's what we should discuss, I
think.

In my unscientific benchmarks, binary search only wins if the elements
are few (less than 100, or something like that).

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
container inside container in stl wolverine C++ 2 07-24-2006 03:08 PM
Copy elements from one STL container to another STL container Marko.Cain.23@gmail.com C++ 4 02-16-2006 05:03 PM
STL: container's values setup by another container Maitre Bart C++ 2 02-11-2004 12:11 AM
Can Choice components respond to keyboard input like HTML Choice components? Mickey Segal Java 0 02-02-2004 10:59 PM
Choice of DHCP-server? Is the "IOS-one" a good choice? Fred Cisco 1 12-11-2003 06:25 AM



Advertisments