C++ solution for K & R(2nd Ed) Ex.6-4 - better solution needed

subramanian100in@yahoo.com, India

 09-29-2007
As a beginner in C++, I have attempted the C++ solution for the
following:

Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :

Write a program that prints the distinct words in its input sorted
into descending order of frequency of occurrence. Precede each word by
its count.

(Here I assume that we should NOT sort the words first. Instead sort
in decreasing order as per frequency.)

Following is my C++ solution which works fine. Kindly suggest better
way of doing it.

#include <iostream>
#include <vector>
#include <string>
#include <utility>
#include <algorithm>

using namespace std;

typedef pair<int, string> is;

bool cmp_fn(is arg1, is arg2)
{
    return (arg2.first < arg1.first) ? true : false;
}

void print(const is & arg)
{
    cout << arg.second << ": " << arg.first << endl;
}

int main()
{
    vector<string> unique_words;
    vector<is> v;

    string word;

    while (cin >> word)
    {
        if (find(unique_words.begin(), unique_words.end(),
                 word) == unique_words.end())
        {
            unique_words.push_back(word);
            v.push_back(make_pair(1, word));
        }
        else
        {
            for (vector<is>::iterator i = v.begin(); i != v.end(); ++i)
                if (i->second == word)
                {
                    ++i->first;
                    break;
                }
        }
    }

    sort(v.begin(), v.end(), cmp_fn);

    for_each(v.begin(), v.end(), print);

    return 0;
}

Kindly help.

Thanks
V.Subramanian

Barry

 09-29-2007
(E-Mail Removed), India wrote:
> As a beginner in C++, I have attempted the C++ solution for the
> following:
>
> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
>
> Write a program that prints the distinct words in its input sorted
> into descending order of frequency of occurrence. Precede each word by
> its count.
>
> (Here I assume that we should NOT sort the words first. Instead sort
> in decreasing order as per frequency.)
>
> Following is my C++ solution which works fine. Kindly suggest better
> way of doing it.
>
> #include <iostream>
> #include <vector>
> #include <string>
> #include <utility>
> #include <algorithm>
>
> using namespace std;
>
> typedef pair<int, string> is;
>
> bool cmp_fn(is arg1, is arg2)
> {
> return (arg2.first < arg1.first) ? true : false;
> }
>
> void print(const is & arg)
> {
> cout << arg.second << ": " << arg.first << endl;
> }
>
> int main()
> {
> vector<string> unique_words;
> vector<is> v;
>
> string word;
>
> while (cin >> word)
> {
> if (find(unique_words.begin(), unique_words.end(),
> word) == unique_words.end())
> {
> unique_words.push_back(word);
> v.push_back(make_pair(1, word));
> }
> else
> {
> for (vector<is>::iterator i = v.begin(); i !=
> v.end(); ++i)
> if (i->second == word)
> {
> ++i->first;
> break;
> }
>
> }
> }
>
> sort(v.begin(), v.end(), cmp_fn);
>
> for_each(v.begin(), v.end(), print);
>
> return 0;
> }
>
> Kindly help.
>

It seems the problem calls for something like multi_index: the string is
the key, but the entries are sorted by a second key, the int count.

I think Boost has something for you.

Anyway, if you don't want to use a big tool for a small job, I suggest
you use std::vector<pair<int, string> > as your associative container.
Wrap it as a member of a class, and when you add a word, find the right
place to insert it.

In this case insertion and searching are both O(N), not considering the
cost caused by the insertion itself.

--
Thanks
Barry

Erik Wikström

 09-29-2007
On 2007-09-29 13:41, (E-Mail Removed), India wrote:
> As a beginner in C++, I have attempted the C++ solution for the
> following:
>
> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
>
> Write a program that prints the distinct words in its input sorted
> into descending order of frequency of occurrence. Precede each word by
> its count.
>
> (Here I assume that we should NOT sort the words first. Instead sort
> in decreasing order as per frequency.)
>
> Following is my C++ solution which works fine. Kindly suggest better
> way of doing it.
>
> #include <iostream>
> #include <vector>
> #include <string>
> #include <utility>
> #include <algorithm>

I have not looked at your code, but for this you should only need to
include <iostream>, <string> and <map>.

Some hints:

Use std::map to store each word as the key and its number of occurrences
as the value.

The [] operator of std::map can be used like this:

map[key] = value;

One nice feature of the [] operator is that if the key is not already in
the map, a key/value pair is created for it and the value is
value-initialised (meaning that if the value is an int it will be set
to 0).

--
Erik Wikström

Kai-Uwe Bux

 09-29-2007
Erik Wikström wrote:

> On 2007-09-29 13:41, (E-Mail Removed), India wrote:
>> As a beginner in C++, I have attempted the C++ solution for the
>> following:
>>
>> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
>>
>> Write a program that prints the distinct words in its input sorted
>> into descending order of frequency of occurrence. Precede each word by
>> its count.
>>
>> (Here I assume that we should NOT sort the words first. Instead sort
>> in decreasing order as per frequency.)
>>
>> Following is my C++ solution which works fine. Kindly suggest better
>> way of doing it.
>>
>> #include <iostream>
>> #include <vector>
>> #include <string>
>> #include <utility>
>> #include <algorithm>

>
> I have not looked at your code, but for this you should only need to
> include <iostream>, <string> and <map>.

Really? You are probably thinking of using

std::map< std::string, unsigned >

But that will not magically sort the strings by frequency.

I think I would still have <vector> and <algorithm>. (Of course, you could
first build the map and then use an inverse multimap, but I think that
would be more obscure than just sorting a vector.)

> Some hints:
>
> Use std::map to store the word as a key and the number of occurrences.
>
> The [] operator of std::map can be used like this:
>
> map[key] = value;
>
> One nice feature of the [] operator is that if there is no key/value
> pair already in the map it will be created, and the value will be
> default initialised (meaning that if the value is an int it will be set
> to 0).

Best

Kai-Uwe Bux

red floyd

 09-30-2007
Kai-Uwe Bux wrote:
> Erik Wikström wrote:
>
>> On 2007-09-29 13:41, (E-Mail Removed), India wrote:
>>> As a beginner in C++, I have attempted the C++ solution for the
>>> following:
>>>
>>> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
>>>
>>> Write a program that prints the distinct words in its input sorted
>>> into descending order of frequency of occurrence. Precede each word by
>>> its count.
>>>
>>> (Here I assume that we should NOT sort the words first. Instead sort
>>> in decreasing order as per frequency.)
>>>
>>> Following is my C++ solution which works fine. Kindly suggest better
>>> way of doing it.
>>>
>>> #include <iostream>
>>> #include <vector>
>>> #include <string>
>>> #include <utility>
>>> #include <algorithm>

>> I have not looked at your code, but for this you should only need to
>> include <iostream>, <string> and <map>.

>
> Really? You are probably thinking of using
>
> std::map< std::string, unsigned >
>

// comparator for *DECREASING* order
struct compare {
    bool operator()(const std::pair<std::string, unsigned>& lhs,
                    const std::pair<std::string, unsigned>& rhs) const
    {
        return lhs.second > rhs.second;
    }
};

std::map<std::string, unsigned> words_and_freqs;

// fill map, then do this. Creates a set sorted by frequency

std::set<std::pair<std::string, unsigned>, compare>
    freqs_first(words_and_freqs.begin(), words_and_freqs.end(),
                compare());

The set is sorted by the frequency in descending order.

Kai-Uwe Bux

 09-30-2007
red floyd wrote:

> Kai-Uwe Bux wrote:
>> Erik Wikström wrote:
>>
>>> On 2007-09-29 13:41, (E-Mail Removed), India wrote:
>>>> As a beginner in C++, I have attempted the C++ solution for the
>>>> following:
>>>>
>>>> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
>>>>
>>>> Write a program that prints the distinct words in its input sorted
>>>> into descending order of frequency of occurrence. Precede each word by
>>>> its count.
>>>>
>>>> (Here I assume that we should NOT sort the words first. Instead sort
>>>> in decreasing order as per frequency.)
>>>>
>>>> Following is my C++ solution which works fine. Kindly suggest better
>>>> way of doing it.
>>>>
>>>> #include <iostream>
>>>> #include <vector>
>>>> #include <string>
>>>> #include <utility>
>>>> #include <algorithm>
>>> I have not looked at your code, but for this you should only need to
>>> include <iostream>, <string> and <map>.

>>
>> Really? You are probably thinking of using
>>
>> std::map< std::string, unsigned >
>>

>
> // comparator for *DECREASING* order
> struct compare {
> bool operator()(const std::pair<std::string, unsigned>& lhs,
> const std::pair<std::string, unsigned>& rhs) const
> {
> return lhs.second > rhs.second;
> }
> };
>
> std::map<std::string, unsigned> words_and_freqs;
>
> // fill map, then do this. Creates a set sorted by frequency
>
> std::set<std::pair<std::string, unsigned>, compare>
> freqs_first(words_and_freqs.begin(), words_and_freqs.end(),
> compare());
>
> the set is sorted by the frequency in descending order.

And, if I see this correctly, it will contain exactly one word for each
frequency, because two pairs with equal frequency counts will be treated as
comparing equal by std::set<>.

However, that can be fixed, in which case you would throw in the header
<set> instead of <vector> and <algorithm>. Well, that's fine, too.

Best

Kai-Uwe Bux

subramanian100in@yahoo.com, India

 09-30-2007

First I express my thanks to all of you for replying.

It looks like some of you have suggested the map data structure. This cannot
be used because it sorts the input based on string which is the key.
But the input words should not be sorted. Their order should be
retained as they appear in the input. Later, they should be sorted
based on frequency of their occurrence.

Let me put the logic that I have used, in words.
Instead of going through the code, kindly go through this and give me

I create "vector<string> unique_words;" to store each input word as it
also create "vector< pair<int, string> > v;" along with the above
vector<string>. Whenever a word arrives, first it is stored in
vector<string> if it is a new word and in this case, make_pair(1,
word) is stored in vector<pair<int,string>>. If the word has been
previously stored, then its count is incremented in
vector<pair<int,string>>. After reading all words, I do

sort(v.begin(), v.end(), cmp_fn);

for_each(v.begin(), v.end(), print);

The disadvantage is the addition of two global functions cmp_fn and
print. There can be other disadvantages also.

Thanks
V.Subramanian

Kai-Uwe Bux

 09-30-2007
(E-Mail Removed), India wrote:

>
> First I express my thanks to all of you for replying.
>
> Looks like, some of you have suggested map data structure. This cannot
> be used because it sorts the input based on string which is the key.

It still can (and should!) be used to compile the frequency data. You're just
not done yet.

> But the input words should not be sorted. Their order should be
> retained as they appear in the input. Later, they should be sorted
> based on frequency of their occurrence.

So, there needs to be a second step in the process. Assume you had a map

std::map< std::string, unsigned int >

with frequency data. How would you go about sorting the entries by frequency?
There are several options:

a) convert the map into a vector of pairs. (Easy since vector has a
constructor that takes a pair of iterators.) Then sort by frequency. Then
write to screen.

b) convert the map into a std::multimap< unsigned, std::string > keyed by
the count.

> Let me put the logic that I have used, in words.
> Instead of going through the code, kindly go through this and give me
>
> I create "vector<string> unique_words;" to store each input word as it
> also create "vector< pair<int, string> > v;" along with the above
> vector<string>. Whenever a word arrives, first it is stored in
> vector<string> if it is a new word and in this case, make_pair(1,
> word) is stored in vector<pair<int,string>>. If the word has been
> previously stored, then its count is incremented in
> vector<pair<int,string>>. After reading all words, I do
>
> sort(v.begin(), v.end(), cmp_fn);

good thinking.

> for_each(v.begin(), v.end(), print);

You might want to have a look into ostream_iterator. Then you can use

std::copy( v.begin(),v.end(), ... );

> The disadvantage is the addition of two global functions cmp_fn and
> print. There can be other disadvantages also.
>

You're on the right track.

Best

Kai-Uwe Bux

Barry

 09-30-2007
(E-Mail Removed), India wrote:
> First I express my thanks to all of you for replying.
>
> Looks like, some of you have suggested map data structure. This cannot
> be used because it sorts the input based on string which is the key.
> But the input words should not be sorted. Their order should be
> retained as they appear in the input. Later, they should be sorted
> based on frequency of their occurrence.
>
> Let me put the logic that I have used, in words.
> Instead of going through the code, kindly go through this and give me
>
> I create "vector<string> unique_words;" to store each input word as it
> also create "vector< pair<int, string> > v;" along with the above
> vector<string>. Whenever a word arrives, first it is stored in
> vector<string> if it is a new word and in this case, make_pair(1,
> word) is stored in vector<pair<int,string>>. If the word has been
> previously stored, then its count is incremented in
> vector<pair<int,string>>. After reading all words, I do
>
> sort(v.begin(), v.end(), cmp_fn);
>
> for_each(v.begin(), v.end(), print);
>
> The disadvantage is the addition of two global functions cmp_fn and
> print. There can be other disadvantages also.

I think you can wrap them up,

You can check this out:

#include <string>
#include <iostream>
#include <vector>
#include <algorithm>
#include <utility>

class AssocVector
{
public:
    typedef std::vector<std::pair<int, std::string> > ContainerType;
    typedef ContainerType::iterator Iterator;
    typedef ContainerType::const_iterator ConstIterator;
public:
    void Insert(std::string const& word)
    {
        bool found = false;
        Iterator iter = c.begin();
        for (; iter != c.end(); ++iter)
        {
            if (iter->second == word)
            {
                ++(iter->first);
                found = true;
                break;
            }
        }

        // Move the updated entry towards the front until the vector is
        // again in descending order of count.
        if (found && iter != c.begin())
        {
            for (; iter != c.begin(); --iter)
            {
                Iterator prev = iter;
                --prev;
                if (prev->first < iter->first)
                    std::iter_swap(prev, iter);
                else
                    break;
            }
        }

        if (!found)
            c.push_back(std::make_pair(1, word));
    }

    Iterator begin()
    {
        return c.begin();
    }

    Iterator end()
    {
        return c.end();
    }

    ConstIterator begin() const
    {
        return c.begin();
    }

    ConstIterator end() const
    {
        return c.end();
    }
protected:
    ContainerType c;
};

struct Printer
{
    void operator() (std::pair<int, std::string> const& p) const
    {
        std::cout << p.first << ' ' << p.second << std::endl;
    }
};

int main()
{
    AssocVector assocVec;

    std::string word;
    while (std::cin >> word)
        assocVec.Insert(word);

    std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
}

--
Thanks
Barry

Kai-Uwe Bux

 09-30-2007
Barry wrote:

> (E-Mail Removed), India wrote:
>> First I express my thanks to all of you for replying.
>>
>> Looks like, some of you have suggested map data structure. This cannot
>> be used because it sorts the input based on string which is the key.
>> But the input words should not be sorted. Their order should be
>> retained as they appear in the input. Later, they should be sorted
>> based on frequency of their occurrence.
>>
>> Let me put the logic that I have used, in words.
>> Instead of going through the code, kindly go through this and give me
>>
>> I create "vector<string> unique_words;" to store each input word as it
>> also create "vector< pair<int, string> > v;" along with the above
>> vector<string>. Whenever a word arrives, first it is stored in
>> vector<string> if it is a new word and in this case, make_pair(1,
>> word) is stored in vector<pair<int,string>>. If the word has been
>> previously stored, then its count is incremented in
>> vector<pair<int,string>>. After reading all words, I do
>>
>> sort(v.begin(), v.end(), cmp_fn);
>>
>> for_each(v.begin(), v.end(), print);
>>
>> The disadvantage is the addition of two global functions cmp_fn and
>> print. There can be other disadvantages also.

>
> I think you can wrap them up,
>
> You can check this out:
>
> #include <string>
> #include <iostream>
> #include <vector>
> #include <algorithm>
>
> class AssocVector
> {
> public:
> typedef std::vector<std::pair<int, std::string> > ContainerType;
> typedef ContainerType::iterator Iterator;
> typedef ContainerType::const_iterator ConstIterator;
> public:
> void Insert(std::string const& word)
> {
> bool found = false;
> Iterator iter = c.begin();
> for (; iter != c.end(); ++iter)
> {
> if (iter->second == word)
> {
> ++(iter->first);
> found = true;
> break;
> }
> }
>
> if (found && iter != c.begin())
> {
> for (; iter != c.begin(); --iter)
> {
> Iterator prev = iter;
> --prev;
> if (prev->first < iter->first)
> std::iter_swap(prev, iter);
> else
> break;
> }
> }
>
> if (!found)
> c.push_back(std::make_pair(1, word));
> }

Wow: is that bubble sort on the fly?

Anyway, I don't like the mix of responsibilities that issues from putting
the increment of the frequency count within the search loop. Consider:

void Insert(std::string const& word)
{
    Iterator iter = c.begin();
    while ( iter != c.end() && iter->second != word ) {
        ++ iter;
    }
    if ( iter == c.end() ) {
        c.push_back(std::make_pair(1, word));
    } else {
        ++ iter->first;
        while ( iter != c.begin() ) {
            Iterator next = iter;
            --iter;
            if (iter->first < next->first) {
                std::iter_swap(next, iter);
            } else {
                return;
            }
        }
    }
}

>
> Iterator begin()
> {
> return c.begin();
> }
>
> Iterator end()
> {
> return c.end();
> }
>
> ConstIterator begin() const
> {
> return c.begin();
> }
>
> ConstIterator end() const
> {
> return c.end();
> }
> protected:
> ContainerType c;
> };
>
> struct Printer
> {
> void operator() (std::pair<int, std::string> const& p) const
> {
> std::cout << p.first << ' ' << p.second << std::endl;
> }
>
> };
>
> int main()
> {
> AssocVector assocVec;
>
> std::string word;
> while (std::cin >> word)
> assocVec.Insert(word);
>
> std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
> }

I think this is overkill (and inefficient due to the use of bubble sort).

Instead, let me suggest some spaghetti code (all goes into main):

#include <iostream>
#include <string>
#include <map>

int main ( void ) {
    std::string word;
    typedef std::map< std::string, unsigned long > frequency_table;
    frequency_table frequency;

    while ( std::cin >> word ) {
        ++ frequency[ word ];
    }

    // reverting and resorting:
    typedef std::multimap< unsigned long, std::string > inverse_table;
    inverse_table inverse;
    for ( frequency_table::const_iterator iter = frequency.begin();
          iter != frequency.end(); ++ iter ) {
        inverse.insert
            ( inverse_table::value_type( iter->second, iter->first ) );
    }

    // output:
    for ( inverse_table::const_reverse_iterator iter = inverse.rbegin();
          iter != inverse.rend(); ++ iter ) {
        std::cout << iter->second << " " << iter->first << '\n';
    }
}

Best

Kai-Uwe Bux