Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > Perfomance Considerations

Reply
Thread Tools

Perfomance Considerations

 
 
Alexander Adam
Guest
Posts: n/a
 
      11-20-2007
Hi,

What I am having is this -- I got a string start (type is wchar_t) and
the end of a string to compare. Now I've got around 500 words to
compare to and if matched, I want to get an unique integer for the
word. Currently this is done by filling up a std::map<wstring, int>
with all words and doing an: iterator it = my_map.find(wchar_t* value)
and return it->second if found, otherwise return 0. Now this routine
takes pretty much time and I was wondering whether doing a manual list
of

if (strncmp(...) == 0)
...
else if (strncmp(...) == 0)
...

would be faster / more efficient? I was also thinking about doing
something like

switch (*str)
{
case 'a':
if (strncmp(...))
}

i.e. first compaing the first char and then do some deeper evaluation
to avoid calling the strncmp subroutine but I doubt that this even
makes sense.

Any idea so far or any closer description required? Basically as said,
I am having a wchar_t* str_start and an wchar_t* str_end which I want
to compare to predefined words and match an unique integer code for
it.

Regards
Alexander
 
Reply With Quote
 
 
 
 
red floyd
Guest
Posts: n/a
 
      11-20-2007
Alexander Adam wrote:
> Hi,
>
> What I am having is this -- I got a string start (type is wchar_t) and
> the end of a string to compare. Now I've got around 500 words to
> compare to and if matched, I want to get an unique integer for the
> word. Currently this is done by filling up a std::map<wstring, int>
> with all words and doing an: iterator it = my_map.find(wchar_t* value)
> and return it->second if found, otherwise return 0. Now this routine
> takes pretty much time and I was wondering whether doing a manual list
> of
>
> if (strncmp(...) == 0)
> ..
> else if (strncmp(...) == 0)
> ..
>
> would be faster / more efficient? I was also thinking about doing
> something like
>
> switch (*str)
> {
> case 'a':
> if (strncmp(...))
> }
>
> i.e. first compaing the first char and then do some deeper evaluation
> to avoid calling the strncmp subroutine but I doubt that this even
> makes sense.
>
> Any idea so far or any closer description required? Basically as said,
> I am having a wchar_t* str_start and an wchar_t* str_end which I want
> to compare to predefined words and match an unique integer code for
> it.
>


This looks like a clear case of Hoare's Dictum. Have you even
determined if this is your bottleneck? Have you looked into alternate
datastructures and algorithms? Is there a reason your're using strncmp
instead of wcsncmp (since you're dealing with wchar_t)? Why don't you
convert your wchar_t array to a wstring?

There's lots of stuff to be done up front before you worry about this
sort of micro-optimization.

 
Reply With Quote
 
 
 
 
Andre Kostur
Guest
Posts: n/a
 
      11-20-2007
Alexander Adam <(E-Mail Removed)> wrote in news:e552122e-6fbc-439e-ba09-
http://www.velocityreviews.com/forums/(E-Mail Removed):

> Hi,
>
> What I am having is this -- I got a string start (type is wchar_t) and
> the end of a string to compare. Now I've got around 500 words to
> compare to and if matched, I want to get an unique integer for the
> word. Currently this is done by filling up a std::map<wstring, int>
> with all words and doing an: iterator it = my_map.find(wchar_t* value)
> and return it->second if found, otherwise return 0. Now this routine
> takes pretty much time and I was wondering whether doing a manual list
> of
>
> if (strncmp(...) == 0)
> ..
> else if (strncmp(...) == 0)
> ..
>
> would be faster / more efficient? I was also thinking about doing


As Red Floyd has mentioned... profile first. However, how would you expect
a linear search to be able to beat the tree search (except for certain
degenerate cases)? Unless you have some statistical knowledge of how often
the words occur, the map will use on average 9 comparisons where your
linear search will use 250 (If you happen to know that 1 particular word
occurs 99.9% of the time, the linear search would kick the tree's butt if
you look for that word first. Then again, you could look for that word
first, then search the map if it doesn't match.). I would think that this
change would be going backwards in terms of performance. (But of course
you'd have to measure it to be sure). Perhaps a hashing container would
net you some benefits (see std::tr1::unordered_map<>). Depends on how
expensive the hashing function is I suppose.

 
Reply With Quote
 
Marco Manfredini
Guest
Posts: n/a
 
      11-20-2007
Alexander Adam wrote:

> Hi,
>
> What I am having is this -- I got a string start (type is wchar_t) and
> the end of a string to compare. Now I've got around 500 words to
> compare to and if matched, I want to get an unique integer for the
> word. Currently this is done by filling up a std::map<wstring, int>
> with all words and doing an: iterator it = my_map.find(wchar_t* value)
> and return it->second if found, otherwise return 0. Now this routine
> takes pretty much time and I was wondering whether doing a manual list
> of
>
> if (strncmp(...) == 0)
> ..
> else if (strncmp(...) == 0)
> ..


No, because on average you have to make 250 compares per word, while
std::map does 8 or 9. If some of your search words show up more often
than others then you might sort the compares, but that's not really
helping you.

>
> would be faster / more efficient? I was also thinking about doing
> something like
>
> switch (*str)
> {
> case 'a':
> if (strncmp(...))
> }


Yes that's better. I don't want to bother you with further hints on how
to improve this further, or if you should use hash_maps instead. If
(what I guess from your suggestion to strcmp everything) you have a
*fixed* number of words that you would like to recognize, then a very
good bet is a code generator which turns your wordlist into a piece of
code to recognize these words as fast a possible.
I'm quite happy with re2c (http://re2c.org/), which, with its default
settings, basically does the switch-thing, alas to the bitter end. And
it supports wide-characters.

--
IYesNo yes=YesNoFactory.getFactoryInstance().YES;
yes.getDescription().equals(array[0].toUpperCase());
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
data perfomance? Scott Reynolds ASP .Net 3 03-01-2005 01:54 PM
CreateInstranceAndUnwrap slow perfomance Gnanaprakash Rathinam ASP .Net 5 12-30-2004 09:18 PM
Server perfomance Fredrik Melin ASP .Net 1 10-27-2004 11:45 AM
perfomance an ip-route-cache on a router Ralf Huelsmann Cisco 1 08-15-2004 03:09 PM
Optimizing perfomance on T3 line Christoph Schad Cisco 12 01-01-2004 08:40 PM



Advertisments