How to encode text into html format

 
 
Fred Yu
 
      06-01-2008
Hi,

I want to encode input text as HTML, for example replacing "<" with "&lt;"
and "&" with "&amp;".
Could you give me some ideas? Thanks.

Fred




 
Kai-Uwe Bux
 
      06-01-2008
Fred Yu wrote:

> Hi,
>
> I want to encode input text into html format such as replace "<" with
> "&lt", replace "&" with "&amp".
> Could you give me some ideas? Thanks.



Containers: std::map< char, std::string >
Iterators: std::istream_iterator, std::ostream_iterator
Algorithms: std::transform


Best

Kai-Uwe Bux
 
AnonMail2005@gmail.com
 
      06-01-2008
On Jun 1, 12:37 pm, "Fred Yu" <(E-Mail Removed)> wrote:
> Hi,
>
> I want to encode input text into html format such as replace "<" with "&lt",
> replace "&" with "&amp".
> Could you give me some ideas? Thanks.
>
> Fred


Google iconv. It converts between many character encodings. I've used
it to "format" text in various XML wrapper classes.
 
Elmar Baumann
 
      06-01-2008

"Fred Yu" <(E-Mail Removed)> wrote in message
news:g1uka7$o1g$(E-Mail Removed)99.com...
> Hi,
>
> I want to encode input text into html format such as replace "<" with
> "&lt",
> replace "&" with "&amp".


Example using the AnsiString class (C++Builder)

AnsiString Input; // contains the text to encode
int pos;

// Replace every "<" with "&lt;".
// Pos() returns a 1-based index, or 0 if the substring is not found.
while ((pos = Input.Pos("<")) > 0)
{
    Input.Delete(pos, 1);
    Input.Insert("&lt;", pos);
}
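For readers without AnsiString, the same find-and-replace idea can be sketched with std::string. This is only an illustration under assumptions: the helper names `replace_all` and `html_escape` are made up here, and the ">" mapping is an addition that the original question did not ask for.

```cpp
#include <string>

// Replace every occurrence of `from` in `text` with `to`, scanning left to
// right and resuming the search after each inserted replacement.
void replace_all(std::string& text, const std::string& from,
                 const std::string& to) {
    if (from.empty()) return;  // nothing sensible to search for
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // skip the text that was just inserted
    }
}

// Escape the characters discussed in this thread. '&' has to be handled
// first; otherwise the '&' introduced by "&lt;" would be escaped again.
std::string html_escape(std::string text) {
    replace_all(text, "&", "&amp;");
    replace_all(text, "<", "&lt;");
    replace_all(text, ">", "&gt;");  // not in the original question
    return text;
}
```

Resuming the search after the inserted text is what keeps the "&" replacement from looping forever, since "&amp;" itself contains "&".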



 
James Kanze
 
      06-01-2008
On Jun 1, 8:11 pm, Kai-Uwe Bux <(E-Mail Removed)> wrote:
> Fred Yu wrote:
> > I want to encode input text into html format such as replace "<" with
> > "&lt", replace "&" with "&amp".
> > Could you give me some ideas? Thanks.


> Containers: std::map< char, std::string >
> Iterators: std::istream_iterator, std::ostream_iterator
> Algorithms: std::transform


Agreed for the first (although it may be overkill---in this
particular case, I think I'd go with a simple switch).

No real need for the second; just use istream::get() and
ostream::put() (or operator<< in some cases).

As to the third: how? You're replacing a single character with
a sequence of characters, and transform does a one to one (which
in practice makes it of fairly limited utility---although I've
used it with a vector<string>, ostream_iterator, and a string
transformer class that I've written, which works something like
$(patsubst...) in GNU make).

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
Kai-Uwe Bux
 
      06-01-2008
James Kanze wrote:

> Agreed for the first (although it may be overkill---in this
> particular case, I think I'd go with a simple switch).
>
> No real need for the second; just use istream::get() and
> ostream::put() (or operator<< in some cases).
>
> As to the third: how? You're replacing a single character with
> a sequence of characters, and transform does a one to one (which
> in practice makes it of fairly limited utility---although I've
> used it with a vector<string>, ostream_iterator, and a string
> transformer class that I've written, which works something like
> $(patsubst...) in GNU make).


I was thinking of something like this:

#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <map>
#include <string>

// Function object mapping each input character to its encoded string.
struct encoder {

    std::map< char, std::string > the_map;

    encoder ( void ) {
        the_map[ 'a' ] = "a";
        // ...
        the_map[ '&' ] = "&amp;";
        // ...
    }

    // Every character must have an entry; the result refers into the map.
    std::string const & operator() ( char ch ) const {
        std::map< char, std::string >::const_iterator iter =
            the_map.find( ch );
        assert( iter != the_map.end() );
        return ( iter->second );
    }
};

int main ( void ) {
    encoder the_encoder;
    // Read raw characters from stdin, write each encoded string to stdout.
    std::transform( std::istreambuf_iterator<char>( std::cin ),
                    std::istreambuf_iterator<char>(),
                    std::ostream_iterator<std::string>( std::cout, "" ),
                    the_encoder );
}


Best

Kai-Uwe Bux

 
Frank Birbacher
 
      06-01-2008
Hi!

James Kanze wrote:
> As to the third: how? You're replacing a single character with
> a sequence of characters, and transform does a one to one (which
> in practice makes it of fairly limited utility---although I've
> used it with a vector<string>, ostream_iterator, and a string
> transformer class that I've written, which works something like
> $(patsubst...) in GNU make).


The source range of transform may have another value type than the
destination range.

char const* replace(char);

transform(str.begin(), str.end(),
ostream_iterator<const char*>(cout),
&replace);

Frank
 
James Kanze
 
      06-02-2008
On Jun 1, 11:01 pm, Kai-Uwe Bux <(E-Mail Removed)> wrote:

> I was thinking of something like this:

> #include <iostream>
> #include <iterator>
> #include <map>
> #include <algorithm>
> #include <cassert>

> struct encoder {
> std::map< char, std::string > the_map;
> encoder ( void ) {
> the_map[ 'a' ] = "a";
> // ...
> the_map[ '&' ] = "&amp;";
> // ...
> }
> std::string const & operator() ( char ch ) const {
> std::map< char, std::string >::const_iterator iter =
> the_map.find( ch );
> assert( iter != the_map.end() );
> return ( iter->second );
> }
> };

> int main ( void ) {
> encoder the_encoder;
> std::transform( std::istreambuf_iterator<char>( std::cin ),
> std::istreambuf_iterator<char>(),
> std::ostream_iterator<std::string>( std::cout, "" ),
> the_encoder );
> }


Which looks like a lot of overhead (including in terms of
programming) for very little gain. It might be worth it if you
create some sort of generic encoder, in order to reuse the idiom
in many different contexts, but for such a simple problem, it
just seems like overkill for a one-time solution. As I said, I'd
probably go with the switch. If I were going to go to the
effort of initializing the map completely, I'd probably go with
a char const*[UCHAR_MAX + 1], rather than std::map. Or a map with
just the elements which don't use an identity transformation.
And I'd probably still write out the loop; somehow, the idea of
transforming each individual character into a string just to
output it bothers me.

 
James Kanze
 
      06-02-2008
On Jun 1, 11:25 pm, Frank Birbacher <(E-Mail Removed)> wrote:
> James Kanze wrote:

> > As to the third: how? You're replacing a single character with
> > a sequence of characters, and transform does a one to one (which
> > in practice makes it of fairly limited utility---although I've
> > used it with a vector<string>, ostream_iterator, and a string
> > transformer class that I've written, which works something like
> > $(patsubst...) in GNU make).


> The source range of transform may have another value type than the
> destination range.


I'm aware of that, however...

> char const* replace(char);


> transform(str.begin(), str.end(),
> ostream_iterator<const char*>(cout),
> &replace);


For some reason, I was thinking in terms of std::string, and not
char const*. And converting each character to a std::string seemed
a bit heavy for the task at hand. But a statically generated
char const*[]; why not?

 
Kai-Uwe Bux
 
      06-02-2008
James Kanze wrote:

> Which looks like a lot of overhead (including in terms of
> programming) for very little gain. It might be worth it if you
> create some sort of generic encoder, in order to reuse the idiom
> in many different contexts, but for such a simple problem, it
> just seems like overkill for a one-time solution.


It's just what came to mind first. I tend to think of std::map whenever
there is an obvious table lookup. I like that because (a) it tends to have
exactly one line for each table entry, which can be formatted in such a way
that it is easy to read, and (b) the logic of table lookup is completely
decoupled from the rest of the program. Of course, a simple function

char const * encode ( char ch ) {
switch ( ch ) {
...
}
}

could do the same.


> As I said, I'd
> probably go with the switch. If I were going to go to the
> effort of initializing the map completely, I'd probably go with
> a char const*[UCHAR_MAX + 1], rather than std::map. Or a map with
> just the elements which don't use an identity transformation.


Initializing the map completely is not a big deal at all. Just change the
constructor slightly:

// requires #include <limits>
for ( char ch = std::numeric_limits<char>::min();
      ch < std::numeric_limits<char>::max();
      ++ ch ) {
    the_map[ ch ] = ch;
}
// the largest value is set outside the loop so that ++ch never overflows
the_map[ std::numeric_limits<char>::max() ] =
    std::numeric_limits<char>::max();
// now for the special characters:
the_map[ '&' ] = "&amp;";
...


> And I'd probably still write out the loop; somehow, the idea of
> transforming each individual character into a string just to
> output it bothers me.



a) Note that the operator() of the encoder returns a string const &. So,
this does not really create a string each time just for output. It only
involves a few levels of indirection (something like char*** instead of
char*).

b) You can use

map< char, char const * >

instead of map< char, string >. Transform will just look up the char const *
and write it, which is very much the same as a hand coded loop. The price
to pay is that the trick from above for initializing all the characters
that are just passed through becomes more tricky.

c) Maybe you are thinking of a _real_ alternative:


#include <iostream>
#include <istream>
#include <ostream>

int main ( void ) {
    char ch;
    // copy stdin to stdout, replacing the special characters
    while ( std::cin.get( ch ) ) {
        switch ( ch ) {
            case '&' : { std::cout << "&amp;"; break; }
            case '<' : { std::cout << "&lt;"; break; }
            // ...
            default : { std::cout << ch; break; }
        }
    }
}


I have to admit that I don't like that. It mixes flow control and table
lookup to the effect that different types are piped to std::cout (char for
default and const char * for the other characters).



Best

Kai-Uwe Bux
 