Re: Method to compare two XML documents

 
 
GrindKore
      08-09-2004
Record an MD5 hash of each file and, on subsequent scans, compare the hash
values. If they match, the document has not changed; otherwise redo the
imaging and update the XML manifest with the new hash value.
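
A minimal sketch of that hash-and-compare step in Python, for illustration only (the original application is an ActiveX DLL). The function names and the path-to-digest dictionary standing in for the XML manifest are assumptions, not part of the original program.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, previous_hashes):
    """Compare current digests with those recorded by the previous scan.

    previous_hashes maps path -> hex digest (a hypothetical stand-in for the
    hash values stored in the XML manifest).
    Returns (changed, current_hashes); files not seen before count as changed.
    """
    changed = []
    current_hashes = {}
    for path in paths:
        key = str(path)
        current_hashes[key] = file_md5(path)
        if previous_hashes.get(key) != current_hashes[key]:
            changed.append(key)
    return changed, current_hashes
```

Because the previous hashes sit in a dictionary, each lookup is constant time, so a scan is a single pass over the file list instead of a nested loop over two documents.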

"GrinKore" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) trafeed.com...
> Hello, I'm working on the intranet document imaging application where every
> 24 hours my program scans all network servers for various documents and
> creates raster images of them to be placed on company's intranet server.
>
> I have created ActiveX DLL that scans FSO and returns XML document as a
> manifest of all compatible document files stored on those servers. See
> attached sample XML output for more details.
>
> What I want to do is to compare two xml documents so that I can
> determine what files have changed since last scan. Since production system
> has to be able to handle 100,000 + nodes looping through both XML documents
> takes considerable amount of time. Is there any other ways to do this?
>
> Thanks in advance...



 
Keith M. Corbett
      08-10-2004
"GrindKore" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) trafeed.com...
> Record MD5 hash of your file and later on subsequent scans compare hash
> value, if same then document has not changed else do your imaging and update
> xml manifest with new hash value.


As a refinement to this idea, you could parse the XML, generate ESIS
("element structure information set") and hash the output. Comparing ESIS
rather than "raw" XML will avoid false positives where input differs in ways
that would not affect the information that a processing application would
receive. (Comments, whitespace, DTD changes...)

In many cases this approach might be undesirable in terms of performance.
OTOH this capability might be added at reasonable cost to applications that
already have access to the element structure information.
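
As a rough illustration of hashing a normalized form rather than the raw text, the Python sketch below swaps in XML canonicalization (C14N 2.0, via the standard library's xml.etree.ElementTree.canonicalize, Python 3.8+) in place of ESIS. It is not ESIS output from SP, but it similarly drops comments and orders attributes. Passing strip_text=True also trims surrounding whitespace, which is an assumption that suits a data-style manifest and not mixed-content documents. The function name canonical_hash is hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # available from Python 3.8


def canonical_hash(xml_text):
    """Hash a canonical serialization of the document instead of its raw text.

    Comments are omitted (the canonicalize default), attributes are written in
    a fixed order, and strip_text=True removes leading/trailing whitespace
    from text content (an assumption: whitespace is not significant here).
    """
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Two manifests that differ only in indentation, comments, or attribute order then produce the same digest, while a changed attribute value produces a different one.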

For an example of an SGML/XML parser that generates ESIS, see James Clark's
SP: http://www.jclark.com/sp

/kmc



 
Nick Kew
      08-10-2004
In article <(E-Mail Removed)>,
"Keith M. Corbett" <(E-Mail Removed)> writes:

> As a refinement to this idea, you could parse the XML, generate ESIS
> ("element structure information set") and hash the output. Comparing ESIS
> rather than "raw" XML will avoid false positives where input differs in ways
> that would not affect the information that a processing application would
> receive. (Comments, whitespace, DTD changes...)


This can be further refined to detect or ignore selected types of
difference. For example, you can detect whether two versions of a
document differ only in their attributes or attribute values while
the element tree stays the same. Check the archives of the WAI-ER
working group (at lists.w3.org) for further discussion, including
a prototype implementation of change detection that successfully
distinguishes 'significant' changes on, for example, a news site
where stories change frequently, and adverts (which we ignore)
change with every hit.
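
One way to sketch the attribute-versus-structure distinction in Python is to compute two fingerprints per document: one over element names and nesting only, and one that also covers attributes and text. This is not the WAI-ER prototype; the function names and the three category labels are assumptions made for the example, and tail text and comments are ignored.

```python
import hashlib
import xml.etree.ElementTree as ET


def _digest(parts):
    digest = hashlib.md5()
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent parts cannot run together
    return digest.hexdigest()


def fingerprints(xml_text):
    """Return (tree_hash, full_hash) for a document.

    tree_hash covers element names and nesting only; full_hash also covers
    attribute names, attribute values and whitespace-stripped element text.
    """
    root = ET.fromstring(xml_text)
    tree_parts, full_parts = [], []

    def walk(element, depth):
        shape = f"{depth}:{element.tag}"
        tree_parts.append(shape)
        full_parts.append(shape)
        for name, value in sorted(element.attrib.items()):
            full_parts.append(f"@{name}={value}")
        full_parts.append("#" + (element.text or "").strip())
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return _digest(tree_parts), _digest(full_parts)


def classify_change(old_xml, new_xml):
    """Label the difference between two documents (labels are hypothetical)."""
    old_tree, old_full = fingerprints(old_xml)
    new_tree, new_full = fingerprints(new_xml)
    if old_full == new_full:
        return "unchanged"
    if old_tree == new_tree:
        return "content-only"  # same element tree; attributes or text differ
    return "structural"
```

A caller could then choose to ignore "content-only" results for parts of a page that churn constantly and act only on "structural" ones, or the reverse, depending on what counts as significant.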

> For an example of an SGML/XML parser that generates ESIS, see James Clark's
> SP: http://www.jclark.com/sp


Or the more up-to-date OpenSP (at openjade.sourceforge.net).
Don't expose the old SP on the Web (e.g. via CGI): it's not designed
for that and has security issues.

--
Nick Kew
 