James Kanze wrote:
> On Jun 1, 11:01 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
>> James Kanze wrote:
>> > On Jun 1, 8:11 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
>> >> Fred Yu wrote:
>> >> > I want to encode input text into html format such as
>> >> > replace "<" with "&lt;", replace "&" with "&amp;". Could
>> >> > you give me some ideas? Thanks.
>
>> >> Containers: std::map< char, std::string >
>> >> Iterators: std::istream_iterator, std::ostream_iterator
>> >> Algorithms: std::transform
>
>> > Agreed for the first (although it may be overkill---in this
>> > particular case, I think I'd go with a simple switch).
>
>> > No real need for the second; just use istream::get() and
>> > ostream::put() (or operator<< in some cases).
>
>> > As to the third: how? You're replacing a single character
>> > with a sequence of characters, and transform does a one to
>> > one (which in practice makes it of fairly limited
>> > utility---although I've used it with a vector<string>,
>> > ostream_iterator, and a string transformer class that I've
>> > written, which works something like $(patsubst...) in GNU
>> > make).
>
>> I was thinking of something like this:
>
>> #include <iostream>
>> #include <iterator>
>> #include <map>
>> #include <algorithm>
>> #include <cassert>
>> #include <string>
>
>> struct encoder {
>
>>   std::map< char, std::string > the_map;
>
>>   encoder ( void ) {
>>     the_map[ 'a' ] = "a";
>>     // ...
>>     the_map[ '&' ] = "&amp;";
>>     // ...
>>   }
>
>>   std::string const & operator() ( char ch ) const {
>>     std::map< char, std::string >::const_iterator iter =
>>       the_map.find( ch );
>>     assert( iter != the_map.end() );
>>     return ( iter->second );
>>   }
>> };
>
>> int main ( void ) {
>>   encoder the_encoder;
>>   std::transform( std::istreambuf_iterator<char>( std::cin ),
>>                   std::istreambuf_iterator<char>(),
>>                   std::ostream_iterator<std::string>( std::cout, "" ),
>>                   the_encoder );
>> }
>
> Which looks like a lot of overhead (including in terms of
> programming) for very little gain. It might be worth it if you
> create some sort of generic encoder, in order to reuse the idiom
> in many different contexts, but for such a simple problem, it
> just seems overkill for a onetime solution.
It's just what came to mind first. I tend to think of std::map whenever
there is an obvious table lookup. I like that because (a) it tends to have
exactly one line for each table entry, which can be formatted in such a way
that it is easy to read, and (b) the logic of table lookup is completely
decoupled from the rest of the program. Of course, a simple function
  char const * encode ( char ch ) {
    switch ( ch ) {
      ...
    }
  }
could do the same.
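
A minimal sketch of such a function, under a few assumptions that are not in the posts above: it returns std::string by value rather than char const * (so the pass-through case needs no static storage), it handles only the two characters mentioned in the thread, and the names encode_stream and encode_text are hypothetical helpers added for illustration.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Table lookup only: maps one input character to its HTML-encoded form.
// Assumption: returning std::string instead of char const * keeps the
// default (pass-through) case simple.
std::string encode(char ch) {
    switch (ch) {
        case '&': return "&amp;";
        case '<': return "&lt;";
        // further special characters would go here
        default:  return std::string(1, ch);
    }
}

// Flow control only: every character goes through encode(), so a single
// type (std::string) is written to the output stream.
void encode_stream(std::istream& in, std::ostream& out) {
    char ch;
    while (in.get(ch)) {
        out << encode(ch);
    }
}

// Hypothetical convenience wrapper for in-memory text.
std::string encode_text(std::string const& input) {
    std::istringstream in(input);
    std::ostringstream out;
    encode_stream(in, out);
    return out.str();
}
```

Calling encode_stream( std::cin, std::cout ) would give the filter behaviour discussed in this thread; the lookup stays in one function and the loop in another.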
> As I said, I'd
> probably go with the switch. If I were going to go to the
> effort of initializing the map completely, I'd probably go with
> a char const*[UCHAR_MAX], rather than std::map. Or a map with
> just the elements which don't use an identity transformation.
Initializing the map completely is not a big deal at all. Just change the
constructor slightly:
  // requires #include <limits>
  for ( char ch = std::numeric_limits<char>::min();
        ch < std::numeric_limits<char>::max();
        ++ ch ) {
    the_map[ ch ] = ch;
  }
  the_map[ std::numeric_limits<char>::max() ] =
    std::numeric_limits<char>::max();
  // now for the special characters:
  the_map[ '&' ] = "&amp;";
  ...
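
Put together with the encoder from earlier in the thread, the fully initialized version might look roughly like the sketch below. The '<' entry and the encode_text helper (which writes into a std::ostringstream instead of std::cout so the result can be inspected) are assumptions added for illustration; the remaining special characters are still left out.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <limits>
#include <map>
#include <sstream>
#include <string>

struct encoder {
    std::map<char, std::string> the_map;

    encoder() {
        // Identity entries for every char value. The loop stops one short
        // of max() so that ++ch never overflows; max() is added afterwards.
        for (char ch = std::numeric_limits<char>::min();
             ch < std::numeric_limits<char>::max(); ++ch) {
            the_map[ch] = std::string(1, ch);
        }
        the_map[std::numeric_limits<char>::max()] =
            std::string(1, std::numeric_limits<char>::max());
        // Special characters overwrite their identity entries.
        the_map['&'] = "&amp;";
        the_map['<'] = "&lt;";
        // further entries omitted
    }

    std::string const& operator()(char ch) const {
        std::map<char, std::string>::const_iterator iter = the_map.find(ch);
        assert(iter != the_map.end());
        return iter->second;
    }
};

// Hypothetical helper: runs std::transform over a string and collects the
// encoded output in a string stream.
std::string encode_text(std::string const& input) {
    encoder the_encoder;
    std::ostringstream out;
    std::transform(input.begin(), input.end(),
                   std::ostream_iterator<std::string>(out),
                   the_encoder);
    return out.str();
}
```

Note that std::transform takes the function object by value, so the whole map is copied once per call in this sketch.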
> And I'd probably still write out the loop; somehow, the idea of
> transforming each individual character into a string just to
> output it bothers me.
a) Note that the operator() of the encoder returns a string const &. So,
this does not really create a string each time just for output. It only
involves a few levels of indirection (something like char*** instead of
char*).
b) You can use
  map< char, char const * >
instead of map< char, string >. Transform will just look up the char const *
and write it, which is very much the same as a hand coded loop. The price
to pay is that the trick from above for initializing all the characters
that are just passed through becomes trickier, because each pass-through
entry now needs a null-terminated string that lives somewhere.
c) Maybe you are thinking of a _real_ alternative:
  #include <iostream>
  #include <istream>
  #include <ostream>

  int main ( void ) {
    char ch;
    while ( std::cin.get( ch ) ) {
      switch ( ch ) {
        case '&' : { std::cout << "&amp;"; break; }
        case '<' : { std::cout << "&lt;"; break; }
        // ...
        default : { std::cout << ch; break; }
      }
    }
  }
I have to admit that I don't like that. It mixes flow control and table
lookup to the effect that different types are piped to std::cout (char for
default and const char * for the other characters).
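
For point (b), one possible way to handle the pass-through characters with char const * entries is sketched below. It swaps the map for a plain array indexed by unsigned char (close to the char const*[UCHAR_MAX] idea quoted above, but with UCHAR_MAX + 1 slots) and keeps a small buffer of one-character strings for the identity entries. The names encoding_table, encode and encode_text are hypothetical, and this is only one of several ways to do it.

```cpp
#include <climits>
#include <cstddef>
#include <string>

namespace {

// One slot per unsigned char value. identity holds a one-character,
// null-terminated string for each value; entries points either into
// identity or at a string literal for the special characters.
struct encoding_table {
    char identity[UCHAR_MAX + 1][2];
    char const* entries[UCHAR_MAX + 1];

    encoding_table() {
        for (std::size_t i = 0; i <= UCHAR_MAX; ++i) {
            identity[i][0] = static_cast<char>(static_cast<unsigned char>(i));
            identity[i][1] = '\0';
            entries[i] = identity[i];
        }
        entries[static_cast<unsigned char>('&')] = "&amp;";
        entries[static_cast<unsigned char>('<')] = "&lt;";
        // further entries omitted
    }
};

}  // namespace

// Looks up the encoded form of ch. The table is built once, on first use,
// and is never copied, so the pointers into identity stay valid.
// Limitation: the entry for '\0' is an empty C string.
char const* encode(char ch) {
    static encoding_table const table;
    return table.entries[static_cast<unsigned char>(ch)];
}

// Hypothetical helper: the loop always appends the same type (char const *).
std::string encode_text(std::string const& input) {
    std::string result;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        result += encode(input[i]);
    }
    return result;
}
```

With this layout the loop writes a char const * for every character, which avoids the mixed char / const char * output of the switch version above.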
Best
Kai-Uwe Bux