neologist 04-27-2007 12:05 PM

translating MS Word codes using regexps
We have a Web form where the input on certain textarea boxes
is filled in by some of our users by cutting and pasting
text from MS Word documents. They think it's plain text, but
it's not. We would like to parse that input before saving it
to the database, e.g., to translate the funny Word quotes to
plain double and single quotes using regexps.
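
To make the goal concrete, here is a rough sketch of the kind of substitution we have in mind. It is not a complete or authoritative table. It assumes the form input has already been decoded to a Perl Unicode string (if the browser sends Windows-1252 bytes instead, the curly quotes arrive as 0x91 to 0x94 and would need decoding first, e.g. with the Encode module). The names %WORD_CHARS and plain_text_from_word are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical mapping of a few common Word "smart" characters to
# plain ASCII. Assumption: the text is a decoded Unicode string.
my %WORD_CHARS = (
    "\x{2018}" => "'",     # left single quote
    "\x{2019}" => "'",     # right single quote / apostrophe
    "\x{201C}" => '"',     # left double quote
    "\x{201D}" => '"',     # right double quote
    "\x{2013}" => '-',     # en dash
    "\x{2014}" => '--',    # em dash
    "\x{2026}" => '...',   # horizontal ellipsis
);

# Build one character class from the keys so the text is scanned once.
my $WORD_CLASS = join '', keys %WORD_CHARS;

# Hypothetical helper: returns a copy of $text with the mapped
# characters replaced; everything else is left untouched.
sub plain_text_from_word {
    my ($text) = @_;
    $text =~ s/([$WORD_CLASS])/$WORD_CHARS{$1}/g;
    return $text;
}
```

Calling plain_text_from_word on each textarea value before the database insert would be the idea; characters not listed in the hash pass through unchanged, so the table can be grown as new cases turn up.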


I'm sure there's a table of standard codes and some examples
of how to make typical substitutions somewhere, possibly
even a CPAN package to assist with this, but I'm not able to
find it because I don't know exactly what I'm looking for.

Can someone point me in the right direction?

