Memory efficient way to store strings in hash maps using boost::array

 
 
朝の木
 
      04-17-2012
Hi. I have recently written an article about the problem of storing
millions of short strings in hash maps, as well as in other
containers. The problem is the memory overhead of std::string.
In the article I therefore discuss using boost::array (std::array)
as the keys and values of hash maps, and I try this with several
different hash map implementations. This is nothing advanced, but
the benchmark results may be interesting.
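
For readers who have not seen the technique, here is a minimal sketch of what a fixed-size array key can look like. It is not taken from the article: the names FixedKey, make_key, FixedKeyHash and FixedMap are hypothetical, the FNV-1a hash is just one convenient choice, and the sketch assumes every string fits in N characters and contains no NUL bytes (zero padding would otherwise make "a" and "a\0" collide).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_map>

// Hypothetical fixed-capacity key: up to N characters stored inline and
// zero-padded, so the characters live in the map node itself and no
// separate heap block is allocated per string.
template <std::size_t N>
using FixedKey = std::array<char, N>;

// Copies a short string into a zero-padded fixed-size key.
template <std::size_t N>
FixedKey<N> make_key(const std::string& s) {
    assert(s.size() <= N && "string does not fit in the fixed key");
    FixedKey<N> key{};  // value-initialized: all bytes are zero
    std::memcpy(key.data(), s.data(), s.size());
    return key;
}

// 64-bit FNV-1a over all N bytes (padding included). std::array has no
// std::hash specialization, so a hasher must be supplied by the user.
template <std::size_t N>
struct FixedKeyHash {
    std::size_t operator()(const FixedKey<N>& k) const noexcept {
        std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
        for (char c : k) {
            h ^= static_cast<unsigned char>(c);
            h *= 1099511628211ull;  // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Keys and values are both fixed-size arrays; equality comes from
// std::array's own operator==.
template <std::size_t N>
using FixedMap = std::unordered_map<FixedKey<N>, FixedKey<N>, FixedKeyHash<N>>;
```

Usage would be along the lines of declaring FixedMap<8> m; and then m[make_key<8>("abc")] = make_key<8>("xyz");. The trade-off is that every entry pays for N bytes regardless of the actual string length, so this only helps when the strings are uniformly short.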

I have just started blogging, so please take a look at my articles.
Most of them are related to efficiency and memory consumption when
storing string data in containers. Any feedback (here or there) will
motivate me to continue blogging.

Articles so far:
* Huge unordered hash maps (and threading)
* Debugging in C++
* Std::string on several unordered hash map implementations.
Benchmark.
* Memory overhead of an std::string
* Memory efficient way to store strings in hash maps using
boost::array

The blog url: http://jovislab.com/blog/

There are no ads on my blog.

Thanks in advance for your comments!
 
 
 
 
 
Marc
 
      04-17-2012
朝の木 wrote:

> Hi. I have recently written an article about the problem of storing
> millions of short strings in hash maps, as well as in other
> containers. The problem is the memory overhead of std::string.


Assuming you are using libstdc++, did you try other implementations of
strings, in particular ones using the short-string optimization
technique? There should be one in ext/vstring.h called __vstring.
Other libraries (libcxx for instance) have something like that by
default.
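
Whether a given std::string implementation uses the short-string optimization, and up to what length, can be probed directly. The following is a rough sketch, not something from this thread: the helper names stored_inline and first_heap_length are made up for illustration, and the result depends entirely on the standard library and ABI in use.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// True when the character buffer lies inside the string object itself
// (short-string optimization), false when it points somewhere else,
// such as a separate heap block. std::less is used because it gives a
// well-defined order even for pointers into unrelated objects.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* buf = s.data();
    return std::less_equal<const char*>()(obj_begin, buf) &&
           std::less<const char*>()(buf, obj_end);
}

// Smallest length whose buffer is no longer inside the object. A string
// of sizeof(std::string) characters plus its terminator cannot fit
// inline, so the loop always returns before running out.
inline std::size_t first_heap_length() {
    for (std::size_t n = 0; n <= sizeof(std::string); ++n) {
        std::string s(n, 'x');
        if (!stored_inline(s)) return n;
    }
    return sizeof(std::string) + 1;  // not reached in practice
}
```

On an implementation without the optimization, first_heap_length() is expected to return 0, since even an empty string keeps its buffer outside the object; with the optimization it returns the first length that no longer fits inline.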
 
 
 
 
 
asanoki@gmail.com
 
      04-18-2012
On Wednesday, 18 April 2012 06:07:28 UTC+9, Marc wrote:
> Assuming you are using libstdc++, did you try other implementations of
> strings, in particular ones using the short-string optimization
> technique? There should be one in ext/vstring.h called __vstring.
> Other libraries (libcxx for instance) have something like that by
> default.


Thanks for the hint. I have just checked it, and the overhead is still large.
G++ -O2 32bit, sys 3bit, 1000000 strings on the heap. Virtual memory in KB:
length  __vstring  std::string
     1      34552        42340
     4      34552        42340
     8      34552        50260
    16      58048        58048

Regards.
 
 
Juha Nieminen
 
      04-18-2012
朝の木 wrote:
> Hi. I have recently written an article about the problem of storing
> millions of short strings


Storing large numbers of strings (or other similar data) efficiently
(in terms of both memory usage and access times) is a highly non-trivial
problem, and there are dozens of data containers and algorithms
dedicated to precisely that problem.

If one needs that kind of efficiency (and one understands even a tiny bit
about data containers, algorithms, efficiency, and how the standard data
containers work), one wouldn't be using std::string for this in the first
place. (If it uses the short-string optimization, that may help a bit, but
the benefit is lost as soon as the strings are even one character longer
than the optimization threshold. Even when the optimization applies, the
strings would still consume more memory and be slower than an advanced,
very specialized data container designed for this exact purpose.)
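
To make the idea of a specialized container concrete, here is one very simple possibility, sketched under my own assumptions rather than taken from any post above: an append-only pool that stores all characters in one contiguous buffer and keeps a 4-byte end offset per string. The class name StringPool and its interface are hypothetical, and real designs (tries, compressed pools, perfect hashing and so on) go much further.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical append-only string pool. All characters sit in one
// buffer; offsets_[i] and offsets_[i + 1] delimit string i, so the
// bookkeeping cost is one 32-bit offset per string instead of a
// pointer, size, capacity and allocator block header.
class StringPool {
public:
    StringPool() : offsets_(1, 0) {}

    // Appends a string and returns its index in the pool.
    std::size_t add(const std::string& s) {
        assert(chars_.size() + s.size() <= UINT32_MAX &&
               "32-bit offsets limit the pool to 4 GiB of characters");
        chars_.insert(chars_.end(), s.begin(), s.end());
        offsets_.push_back(static_cast<std::uint32_t>(chars_.size()));
        return offsets_.size() - 2;
    }

    // Number of strings stored.
    std::size_t size() const { return offsets_.size() - 1; }

    // Returns a copy of string i.
    std::string get(std::size_t i) const {
        assert(i < size());
        return std::string(chars_.begin() + offsets_[i],
                           chars_.begin() + offsets_[i + 1]);
    }

    // Payload plus offsets, ignoring any spare vector capacity.
    std::size_t bytes_used() const {
        return chars_.size() + offsets_.size() * sizeof(std::uint32_t);
    }

private:
    std::vector<char> chars_;
    std::vector<std::uint32_t> offsets_;
};
```

A hash map could then store the small integer indices returned by add() instead of string objects. The price is that strings cannot be erased or resized individually, which is the kind of restriction a specialized container typically accepts in exchange for lower overhead.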
 
 
 
 