XML - Docs to XML conversion & read the XML files
#1
I am creating an application that needs to convert document files into XML and then search the XML files for specific words in a specific format. I am using Microsoft.Office.Interop to convert the document files to XML. The files are generated, but they contain a lot of formatting information, which makes them very large. I need help writing code that reduces the size of the XML files by removing the unwanted document formatting, or preserves it if required.

Thanks in advance.
msinghindia@gmail.com
#2
wrote:
> I need help writing code that reduces the size of the XML files by
> removing the unwanted document formatting, or preserves it if
> required.

That sounds like a straight programming problem.

First, you need to analyse the files to create rules for recognizing the "unwanted" markup. Then you need to write code that either filters that markup out during the conversion process, or postprocesses the XML file by reading it in, applying those rules to alter it, and writing it back out.

Pick your programming language and have fun. If you take the postprocessing approach, you could probably do this in XSLT... but whether that's the best approach depends in part on the nature of the rules you're trying to apply.

--
() ASCII Ribbon Campaign  | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
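
For illustration, here is a minimal sketch of the postprocessing approach in Python, using only the standard library. It is an assumption-laden example, not a finished tool. The rule set `UNWANTED` is hypothetical and assumes the formatting lives in Word-style property elements such as `rPr`, `pPr` and `sectPr`. The namespace URI in `SAMPLE` is a placeholder, and the helper names `local_name` and `strip_formatting` are made up for this example. The real rules have to come from analysing your own generated files.

```python
import xml.etree.ElementTree as ET

# Assumed rule set: local names of elements treated as "unwanted" formatting.
# Adjust this after inspecting the XML your converter actually produces.
UNWANTED = {"rPr", "pPr", "sectPr"}


def local_name(tag):
    """Return the element name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_formatting(root, unwanted=UNWANTED):
    """Remove, in place, every descendant whose local name is in `unwanted`.

    Note: ElementTree stores text that follows an element in its `tail`,
    so removing an element also drops that trailing text.
    """
    # Take a snapshot of all elements first so removal is safe while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in unwanted:
                parent.remove(child)
    return root


# Tiny stand-in for a converted document (placeholder namespace URI).
SAMPLE = (
    '<w:document xmlns:w="urn:example:word">'
    '<w:p><w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t></w:r></w:p>'
    '</w:document>'
)

root = strip_formatting(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

For real files you would load the document with `ET.parse(path)`, call `strip_formatting` on `tree.getroot()`, and save the result with `tree.write(...)`. Searching for specific words afterwards can be as simple as checking `"".join(root.itertext())`. Keeping the rules in one set makes it easy to preserve some formatting when required: just leave those names out.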