Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > Standard C++ Library

Reply
Thread Tools

Standard C++ Library

 
 
Razii
Guest
Posts: n/a
 
      04-03-2008
On Wed, 2 Apr 2008 21:10:22 -0500, stan <(E-Mail Removed)> wrote:

>> I doubt it's that easy to trigger worse case scenario. You will need
>> to read hashing algorithm and then write a bible size book with
>> specifically chosen data.


>I'll go out on a limb here and guess that you're self taught. You might
>want to consider a course in algorithms if you're really interested in
>programming.


No thanks.

>You clearly seem concerned with appearances as evidenced by
>your taunts. You can't even imagine how much it would improve your
>standing with programmers if you could replace "doubt" and "guess"
>with a clear, correct, and concise analysis of an algorithm.


I already posted the reason why Hasmap is faster than sorted map in
this situation. I have tested both version.

>>>Interestingly, you deleted the rest of my hash vs balanced tree
>>>comparaison and links that explain that.

>>
>> There was nothing new in the rest of your post. I have seen
>> Wikipedia's articles before.

>
>And your reading skills have already been clearly demonstrated.


By the way, I checked your posting history for this newsgroup. Other
than one post in "Best book" thread, all your posts are in the threads
started by me How come? Did I inspire you so much that you
started posting to this newsgroup regularly? Do you have a crush on
me?


 
Reply With Quote
 
 
 
 
Yannick Tremblay
Guest
Posts: n/a
 
      04-03-2008
In article <(E-Mail Removed)>,
Razii <(E-Mail Removed)> wrote:
>On 02 Apr 2008 14:44:59 GMT, http://www.velocityreviews.com/forums/(E-Mail Removed) (Yannick Tremblay)
>wrote:
>
>>Might be in the general case. Howver, if you write a word sorter
>>using a hash table, it is possible that you would encounter data that
>>would trigger a worse case scenario performance. So you could write
>>your program, claim it is very fast and I could make it incredibly
>>slow simply by feeding it specifically chosen data. Not really
>>something you'd want to expose to the world.

>
>I doubt it's that easy to trigger worse case scenario. You will need
>to read hashing algorithm and then write a bible size book with
>specifically chosen data.


Regardless if it is easy or hard, if you are going to release software
in the wild, then you must make sure that a nasty character can't
destroy your application simply by looking up the Java Hash algorithm
and feeding it hand prepared data. Read the history of buffer
overflow exploit to learn to what extend peoples will go to break
programs.

A lot of search/sort/hash algorithm have worse case scenario that are
maybe unlikely to be found naturally but rather easy to generate.
Some examples would be: already sorted data, almost already sorted
data, data sorted in reverse order. Although random data is unlikely
to trigger any of the above, it is not that uncommonly found in the
wild and a good programmer must be aware of these potential problems
and protect his software against it.

So if you any real interest in algorithms, try pre-sorting your text
before attempting your test.

By the way, IMO, the best solution to your problem is:

$ sort bible.txt > sortedbible.txt

Using such an utility, generating somewhat sorted data is trivial. e,g:

$ sort -r bible.txt > reversesortedbible.txt
$ sort bible.txt > doublesortedbible.txt && sort bible.txt >>
doublesortedbible.txt

Very very simple.









 
Reply With Quote
 
 
 
 
Mirek Fidler
Guest
Posts: n/a
 
      04-03-2008
> Might be in the general case. Howver, if you write a word sorter
> using a hash table, it is possible that you would encounter data that
> would trigger a worse case scenario performance. So you could write
> your program, claim it is very fast and I could make it incredibly
> slow simply by feeding it specifically chosen data. Not really
> something you'd want to expose to the world.


Well, but exactly the same holds for the better part of all current
computational power.

Admit it, everything in modern CPU is basically based on designs
basically equal to hashing (cache, paging, branch prediction). Chances
that your hashmap will fail due to worst case scenario is as likely as
regular map failing because of cache/paging issues.

BTW, speaking about it, one reason why good hashmap always outperforms
binary tree on modern CPUs is branch misprediction - that central
compare and jump is totally random in the loop, mispredicted in 50% of
cases.

Choosing binary tree map before hash map is 1990 thinking. A lot has
changed since then.

Mirek
 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      04-03-2008
Razii wrote:
> On 02 Apr 2008 14:44:59 GMT, (E-Mail Removed) (Yannick Tremblay)
> wrote:
>
>> In article <(E-Mail Removed)>,
>> Razii <(E-Mail Removed)> wrote:
>>> On 02 Apr 2008 11:11:51 GMT, (E-Mail Removed) (Yannick
>>> Tremblay) wrote:
>>>
>>>>> Where is the HashMap container in c++ standard library?
>>>>
>>>> std::tr1::unordered_map
>>>
>>> WC2.cpp(12) : fatal error C1083: Cannot open include file:
>>> 'tr1/unordered_map': No such file or directory

>>
>> Read your platform documentation and figure out if it supports TR1.
>> If not, obtain a TR1 library (dinkumware comes to mind)

>
> How is it standard when VC++ doesn't have it?
>


It is a standard because ISO has issued one. How else could it be?

http://www.iso.org/iso/iso_catalogue...csnumber=43289


The hint was that if you just cannot wait for the next compiler
release from MS, you can buy your own copy of the library from the
same place they are getting it from.

http://dinkumware.com/



Bo Persson


 
Reply With Quote
 
red floyd
Guest
Posts: n/a
 
      04-03-2008
Razii wrote:
> On 02 Apr 2008 14:44:59 GMT, (E-Mail Removed) (Yannick Tremblay)
> wrote:
>
>> In article <(E-Mail Removed)>,
>> Razii <(E-Mail Removed)> wrote:
>>> On 02 Apr 2008 11:11:51 GMT, (E-Mail Removed) (Yannick Tremblay)
>>> wrote:
>>>
>>>>> Where is the HashMap container in c++ standard library?
>>>> std::tr1::unordered_map
>>> WC2.cpp(12) : fatal error C1083: Cannot open include file:
>>> 'tr1/unordered_map': No such file or directory

>> Read your platform documentation and figure out if it supports TR1.
>> If not, obtain a TR1 library (dinkumware comes to mind)

>
> How is it standard when VC++ doesn't have it?
>


Simple. Microsoft doesn't define Standard C++, ISO does (though given
what happened with DIS29500...)
 
Reply With Quote
 
Razii
Guest
Posts: n/a
 
      04-03-2008
On 03 Apr 2008 09:57:50 GMT, (E-Mail Removed) (Yannick Tremblay)
wrote:

>$ sort bible.txt > sortedbible.txt


>Using such an utility, generating somewhat sorted data is trivial. e,g:


I think you are a little confused. This is not the sorting bible
problem. This one counts up the frequency of each word, and it makes
no difference if the data is already sorted. The sort is done to print
output (.e words) in nice format. Here, I wilt post it to a blogpage
that explains it..

Why is C++ slower than Java?

http://razi2.blogspot.com/2008/04/wh...than-java.html




 
Reply With Quote
 
Razii
Guest
Posts: n/a
 
      04-03-2008
On Thu, 3 Apr 2008 19:14:08 +0200, "Bo Persson" <(E-Mail Removed)> wrote:

>The hint was that if you just cannot wait for the next compiler
>release from MS, you can buy your own copy of the library from the
>same place they are getting it from.


I downloaded MinGW that uses GCC compiler. Here is C++ version that
uses map

http://www.pastebin.ca/965243

and I changed it to

std::tr1::unordered_map< std::string, int > dictionary;

The time improved from 5600 ms to 3500 ms (which is still slower than
java version 2300 ms). However, the result is not valid since java
version sorts it by transferring the data back to TreeMap in the end.

Can anyone post the most efficient way how would I transfer data that
is in std::tr1::unordered_map< std::string, int > dictionary to a
sorted map

Or is there any other way to sort unordered_map?

 
Reply With Quote
 
Paul Brettschneider
Guest
Posts: n/a
 
      04-03-2008
Razii wrote:

> On 03 Apr 2008 09:57:50 GMT, (E-Mail Removed) (Yannick Tremblay)
> wrote:
>
>>$ sort bible.txt > sortedbible.txt

>
>>Using such an utility, generating somewhat sorted data is trivial. e,g:

>
> I think you are a little confused. This is not the sorting bible
> problem. This one counts up the frequency of each word, and it makes
> no difference if the data is already sorted. The sort is done to print
> output (.e words) in nice format. Here, I wilt post it to a blogpage
> that explains it..
>
> Why is C++ slower than Java?
>
> http://razi2.blogspot.com/2008/04/wh...than-java.html


Yawn. What a lame attempt at trolling.

Anyway, a few changes and down to 750 ms from 6500 ms:

#include <iostream>
#include <fstream>
#include <sstream>
#include <ctime>
#include <cstring>

static const int num_morons = ('z' - 'a' + 1) * 2;

template <typename T> class MoronicArray {
T arr[num_morons];
public:
T &operator[](unsigned char c) {
int n;
if(c >= 'a' && c <= 'z')
n = c - 'a';
else if(c >= 'A' && c <= 'Z')
n = c- 'A' + num_morons/2;
else
n = 0;
return arr[n];
}
MoronicArray() { memset(this, '\0', sizeof(*this)); }
};

class Moron {
MoronicArray<int> counts;
MoronicArray<Moron *> morons;
public:
Moron *operator[](unsigned char c) {
Moron *&m = morons[c];
if(m != NULL)
return m;
return m = new Moron();
};
void inc(unsigned char c) {
++counts[c];
};
void moronificator(const std::string &s) {
for(unsigned char c = 'a'; c <= 'z'; ++c) {
if(counts[c] > 0)
std::cout << s << c << " " << counts[c] <<
std::endl;
if(morons[c])
morons[c]->moronificator(s + std::string(1,
c));
}
for(unsigned char c = 'A'; c <= 'Z'; ++c) {
if(counts[c] > 0)
std::cout << s << c << " " << counts[c] <<
std::endl;
if(morons[c])
morons[c]->moronificator(s + std::string(1,
c));
}
}
};

int main(int argc, char **argv)
{
int w_total = 0, l_total = 0, c_total = 0;
Moron m;

printf(" lines words bytes file\n" );
clock_t start=clock();

for(int i = 1; i <argc; ++i) {
std::ifstream input_file(argv[i]);
std:stringstream buffer;
buffer << input_file.rdbuf();
const std::string &s = buffer.str();
int l = s.size();

int w_cnt = 0, l_cnt = 0, c_cnt = 0;
Moron *act = &m;
unsigned char last = '\0';
for(int j = 0; j < l; ++j) {
unsigned char c = s[j];
if(c == '\n') {
++l_cnt;
}
if(c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z') {
if(last)
act = (*act)[last];
last = c;
} else if(last) {
act->inc(last);
act = &m;
last = '\0';
} else {
last = '\0';
}
++c_cnt;
}

printf("%d\t%d\t%d\t %s\n", l_cnt, w_cnt, c_cnt, argv[i]);
l_total += l_cnt;
w_total += w_cnt;
c_total += c_cnt;
}

clock_t end=clock();

if(argc > 2) {
printf("--------------------------------------\n%d\t%d\t%d\t
total",
l_total, w_total, c_total);
}

printf("--------------------------------------\n");

m.moronificator(std::string());

std::cout <<"Time: " << double(end-start)/CLOCKS_PER_SEC * 1000 << "
ms\n";
}

Lrf, guvf boivbhfyl jnf n wbxr :C.
 
Reply With Quote
 
Paul Brettschneider
Guest
Posts: n/a
 
      04-04-2008
Razii wrote:

> On Fri, 04 Apr 2008 00:39:22 +0200, Paul Brettschneider
> <(E-Mail Removed)> wrote:
>
>>Anyway, a few changes and down to 750 ms from 6500 ms:

>
> By the way, your output doesn't print total number of words at top.


Then fix it. I'm not your nanny.
 
Reply With Quote
 
Paul Brettschneider
Guest
Posts: n/a
 
      04-04-2008
Razii wrote:

> On Fri, 04 Apr 2008 00:39:22 +0200, Paul Brettschneider
> <(E-Mail Removed)> wrote:
>
>>Anyway, a few changes and down to 750 ms from 6500 ms:

>
> lol at "few" changes. I haven't tested it yet but I don't see <map> in
> your version.


Sheesh... I'd thought it would be obvious that the code was a joke.

And of course you can write the same silly code with std::map. It's slower
then, but still 4-5 times faster than the original.

#include <iostream>
#include <fstream>
#include <ctime>
#include <map>

class Moron {
int count;
std::map<char, Moron *> morons;
public:
Moron() : count(0) {};
Moron *operator[](char c) {
std::map<char, Moron *>::iterator it = morons.find(c);
if(it != morons.end())
return it->second;
Moron *res = new Moron();
morons[c] = res;
return res;
};
void inc() {
++count;
};
void moronificator(const std::string &s) {
if(count)
std::cout << s << " " << count << std::endl;
for(std::map<char, Moron *>::iterator it = morons.begin();
it != morons.end(); ++it)
it->second->moronificator(s + std::string(1,
it->first));
};
};

int main(int argc, char **argv)
{
int w_total = 0, l_total = 0, c_total = 0;
const size_t bufsiz = 1024 * 1024;
char buf[bufsiz];
Moron m;

printf(" lines words bytes file\n" );
clock_t start=clock();

for(int i = 1; i <argc; ++i) {
std::ifstream input_file(argv[i]);
int w_cnt = 0, l_cnt = 0, c_cnt = 0;
Moron *act = &m;

for(; {
if(input_file.read(buf, bufsiz) &&
input_file.fail())
break;
int l = input_file.gcount();
if(!l)
break;
for(int j = 0; j < l; ++j) {
char c = buf[j];
if(c == '\n') {
++l_cnt;
}
if(c >= 'a' && c <= 'z' || c >= 'A' && c
<= 'Z') {
act = (*act)[c];
} else if(act != &m) {
++w_cnt;
act->inc();
act = &m;
}
++c_cnt;
}
}
if(act != &m) {
++w_cnt;
act->inc();
}

printf("%d\t%d\t%d\t %s\n", l_cnt, w_cnt, c_cnt, argv[i]);
l_total += l_cnt;
w_total += w_cnt;
c_total += c_cnt;
}
clock_t end=clock();

if(argc > 2) {
printf("--------------------------------------\n%d\t%d\t%d\t
total",
l_total, w_total, c_total);
}

printf("--------------------------------------\n");

m.moronificator(std::string());

std::cout <<"Time: " << double(end-start)/CLOCKS_PER_SEC * 1000 << "
ms\n";
}

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Rum-Time Library vs Standard Library Carmen Sei C++ 4 03-05-2008 10:09 AM
add pexpect to the standard library, standard "install" mechanism. funkyj Python 5 01-20-2006 08:35 PM
How standard is the standard library? steve.leach Python 1 04-18-2005 04:07 PM
Re: Possible additions to the standard library? (WAS: Aboutstandard library improvement) Daniel Bickett Python 0 02-04-2005 11:22 AM
Re: Possible additions to the standard library? (WAS: Aboutstandard library improvement) Fredrik Lundh Python 0 02-04-2005 07:34 AM



Advertisments