On 9 Jun 2004 06:03:20 -0700, (MCP) wrote or quoted:
>What are the Java APIs out there that can simply correct malformed
>HTML code, like take a input stream of badly formed HTML and produce
>an output stream of clean HTML code (parsable by the Swing HTML
>parser) ?
I have been bugging the HTMLValidator people to write such a beast. I
figured it could save me a ton of work if it did simple, unambiguous
corrections like inserting a missing </li> or converting a stray & to &amp;.
The author's fear is making a change that the user did not want. He did
not want to be morally liable for messing up the source.
I have done a number of one-shot programs to clean up various problems
in my website. They do it all with indexOf and substring. If you are
just trying to correct a single problem at a time, it can be pretty
simple.
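
To give a feel for what such a one-shot fixer might look like, here is a
minimal sketch that handles only the stray & case, using nothing but
indexOf and substring. The class and method names (StrayAmpersandFixer,
isEntityAt, fix) are made up for illustration, and the rule for what
counts as an entity is an assumption: an & followed by a short run of
letters/digits, or # plus digits, ending in a semicolon.

```java
// Hypothetical one-shot cleaner: escapes ampersands that do not start an entity.
class StrayAmpersandFixer {

    // True if the '&' at index amp appears to begin an entity reference
    // such as &amp; &#160; or &#xA0; (assumed rule, not a full HTML parser).
    static boolean isEntityAt(String s, int amp) {
        int semi = s.indexOf(';', amp + 1);
        // No semicolon, empty body, or body too long to be a plausible entity.
        if (semi < 0 || semi == amp + 1 || semi - amp > 10) {
            return false;
        }
        String body = s.substring(amp + 1, semi);
        if (body.startsWith("#")) {
            // Numeric reference: decimal digits, or hex digits after x/X.
            String digits = body.substring(1);
            boolean hex = digits.startsWith("x") || digits.startsWith("X");
            if (hex) {
                digits = digits.substring(1);
            }
            if (digits.length() == 0) {
                return false;
            }
            for (char c : digits.toCharArray()) {
                if (Character.digit(c, hex ? 16 : 10) < 0) {
                    return false;
                }
            }
            return true;
        }
        // Named reference: letters and digits only.
        for (char c : body.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                return false;
            }
        }
        return true;
    }

    // Walks the text from one '&' to the next, copying the text in between
    // and replacing each stray '&' with "&amp;".
    static String fix(String html) {
        StringBuilder out = new StringBuilder(html.length() + 16);
        int from = 0;
        int amp;
        while ((amp = html.indexOf('&', from)) >= 0) {
            out.append(html.substring(from, amp));
            out.append(isEntityAt(html, amp) ? "&" : "&amp;");
            from = amp + 1;
        }
        out.append(html.substring(from));
        return out.toString();
    }
}
```

Calling StrayAmpersandFixer.fix on the text of a page leaves existing
entities alone and escapes the rest. It also shows why an automatic tool
makes people nervous: text like "AT&T; then" looks like an entity to this
rule and is left untouched, so the result still wants a human eye.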
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.