Velocity Reviews - Computer Hardware Reviews


What is the better/more performant way to compare org.w3c.dom.DocumentFragment

 
 
Mausam
 
      01-17-2012
I have a Java class that contains a DocumentFragment.

In the equals method of my class, I am converting the DocumentFragment to a String and comparing the resulting Strings with equals.

I know this is not the best way, because, for example, attributes can appear in a different order within an Element of the DocumentFragment, or the documents can differ only in the sequence of unordered elements.

In such cases this equality check will fail.

Please suggest a better approach.
 
 
 
 
 
Jeff Higgins
 
      01-17-2012
On 01/17/2012 10:03 AM, Mausam wrote:
> I have a java class, whose contains a DocumentFragment.
>
> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>
> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>
> So in such cases this equality will fail.
>
> Please suggest a better approach.

A my class is equal to another my class if and only if ...


 
 
 
 
 
Arne Vajhøj
 
      01-17-2012
On 1/17/2012 10:03 AM, Mausam wrote:
> I have a java class, whose contains a DocumentFragment.
>
> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>
> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>
> So in such cases this equality will fail.


I think XML Canonicalization will solve the problem.

It comes at a cost though.

Arne

 
 
Mausam
 
      01-18-2012
On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
> On 01/17/2012 10:03 AM, Mausam wrote:
> > I have a java class, whose contains a DocumentFragment.
> >
> > In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
> >
> > I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
> >
> > So in such cases this equality will fail.
> >
> > Please suggest a better approach.

> A my class is equal to another my class if and only if ...


Thanks Jeff, I understand what you mean.

BTW, I was checking the API http://docs.oracle.com/javase/1.5.0/...3c.dom.Node%29

The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.


The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.

For attributes, the API allows for "not necessarily at the same index", but for childNodes it does not. So a sequence of unordered elements (<emp/><dept/> versus <dept/><emp/>) will be treated as NOT equal.

So I would have to iterate through each node and attribute and compare them myself. That's the fallback, but before doing that I wanted to ask the experts whether there are better options.
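
The behaviour quoted from the API documentation can be checked with a small JDK-only sketch. This is not code from the thread: the class name IsEqualNodeDemo and its helpers fragment() and splitText() are made up for illustration, and it assumes the DOM implementation bundled with the JDK. It compares fragments with Node.isEqualNode(Node), the method the quoted documentation describes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class IsEqualNodeDemo {

    /** Parses an XML snippet and moves its root element into a new DocumentFragment. */
    static DocumentFragment fragment(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            DocumentFragment frag = doc.createDocumentFragment();
            frag.appendChild(doc.getDocumentElement()); // moves the root into the fragment
            return frag;
        } catch (Exception e) {
            throw new IllegalStateException(e); // keep the demo free of checked exceptions
        }
    }

    /** Builds <t> with two adjacent text nodes "ab" and "cd" (not yet normalized). */
    static DocumentFragment splitText() {
        DocumentFragment frag = fragment("<t/>");
        Node t = frag.getFirstChild();
        t.appendChild(t.getOwnerDocument().createTextNode("ab"));
        t.appendChild(t.getOwnerDocument().createTextNode("cd"));
        return frag;
    }

    public static void main(String[] args) {
        // Attribute order is not significant for isEqualNode.
        System.out.println(fragment("<a><b c='1' d='2'/></a>")
                .isEqualNode(fragment("<a><b d='2' c='1'/></a>")));

        // Child order is significant: equal nodes must sit at the same index.
        System.out.println(fragment("<r><emp/><dept/></r>")
                .isEqualNode(fragment("<r><dept/><emp/></r>")));

        // Two adjacent text nodes are not equal to one merged text node ...
        DocumentFragment split = splitText();
        DocumentFragment merged = fragment("<t>abcd</t>");
        System.out.println(split.isEqualNode(merged));

        // ... until the tree has been normalized.
        split.normalize();
        System.out.println(split.isEqualNode(merged));
    }
}
```

Whitespace between elements also ends up as text nodes, so two fragments that differ only in indentation would not compare equal this way either.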
 
 
Jeff Higgins
 
      01-18-2012
On 01/17/2012 08:56 PM, Mausam wrote:
> On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
>> On 01/17/2012 10:03 AM, Mausam wrote:
>>> I have a java class, whose contains a DocumentFragment.
>>>
>>> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>>>
>>> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>>>
>>> So in such cases this equality will fail.
>>>
>>> Please suggest a better approach.

>> A my class is equal to another my class if and only if ...

>
> Thanks Jeff, I understand what you mean.
>
> BTW, I was checking the API http://docs.oracle.com/javase/1.5.0/...3c.dom.Node%29
>
> The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.
>
>
> The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.
>
> Here for attributes, they take care of "NOT necessarily at the same index" but in case of childNodes its not being taken care of. So if there is a sequence of unordered elements (<emp/><dept/> and<dept/><emp/> ) they will be treated as NOT equal.
>
> So either I iterate through each node and attribute and do a comparison. That's the fall back. But before that, I wanted to check the experts if there are better options.


Yep. I based my hair-trigger response upon the .equals(Object) of the
"known implementing classes" of Node. Sorry. I'll be interested in
finding out the "cost" associated with Arne Vajhøj's response.


 
 
Arne Vajhøj
 
      01-18-2012
On 1/17/2012 6:38 PM, Arne Vajhøj wrote:
> On 1/17/2012 10:03 AM, Mausam wrote:
>> I have a java class, whose contains a DocumentFragment.
>>
>> In the equals method of my class, I am converting the DocumentFragment
>> to a String and comparing an equals on the String.
>>
>> I know this is not the best way, because "attributes" e.g can change
>> order in Element of DocumentFragment, or e.g documents differ only in
>> the sequence of unordered elements.
>>
>> So in such cases this equality will fail.

>
> I think XML Canonicalization will solve the problem.
>
> It comes as a cost though.


Example:

import java.io.IOException;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.ParserConfigurationException;

import org.apache.xml.security.Init;
import org.apache.xml.security.c14n.CanonicalizationException;
import org.apache.xml.security.c14n.Canonicalizer;
import org.apache.xml.security.c14n.InvalidCanonicalizerException;
import org.xml.sax.SAXException;

public class XmlComp {
    static {
        // The Apache XML Security library has to be initialized once before use.
        Init.init();
    }

    // Returns the canonical form (comments omitted) of the given XML string.
    private static String canonicalize(String s) throws InvalidCanonicalizerException,
            UnsupportedEncodingException, CanonicalizationException,
            ParserConfigurationException, IOException, SAXException {
        Canonicalizer c14n =
                Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
        String res = new String(
                c14n.canonicalize(s.getBytes(Canonicalizer.ENCODING)),
                Canonicalizer.ENCODING);
        return res;
    }

    public static void main(String[] args) throws Exception {
        // Same content; only the attribute order differs.
        String s1 = "<a><b c='1' d='2'/></a>";
        String s2 = "<a><b d='2' c='1'/></a>";
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(canonicalize(s1));
        System.out.println(canonicalize(s2));
    }
}

outputs:

<a><b c='1' d='2'/></a>
<a><b d='2' c='1'/></a>
<a><b c="1" d="2"></b></a>
<a><b c="1" d="2"></b></a>

Arne

 
 
Arne Vajhøj
 
      01-18-2012
On 1/17/2012 9:33 PM, Jeff Higgins wrote:
> On 01/17/2012 08:56 PM, Mausam wrote:
>> On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
>>> On 01/17/2012 10:03 AM, Mausam wrote:
>>>> I have a java class, whose contains a DocumentFragment.
>>>>
>>>> In the equals method of my class, I am converting the
>>>> DocumentFragment to a String and comparing an equals on the String.
>>>>
>>>> I know this is not the best way, because "attributes" e.g can change
>>>> order in Element of DocumentFragment, or e.g documents differ only
>>>> in the sequence of unordered elements.
>>>>
>>>> So in such cases this equality will fail.
>>>>
>>>> Please suggest a better approach.
>>> A my class is equal to another my class if and only if ...

>>
>> Thanks Jeff, I understand what you mean.
>>
>> BTW, I was checking the API
>> http://docs.oracle.com/javase/1.5.0/...3c.dom.Node%29
>>
>>
>> The attributes NamedNodeMaps are equal. This is: they are both null,
>> or they have the same length and for each node that exists in one map
>> there is a node that exists in the other map and is equal, although
>> not necessarily at the same index.
>>
>>
>> The childNodes NodeLists are equal. This is: they are both null, or
>> they have the same length and contain equal nodes at the same index.
>> Note that normalization can affect equality; to avoid this, nodes
>> should be normalized before being compared.
>>
>> Here for attributes, they take care of "NOT necessarily at the same
>> index" but in case of childNodes its not being taken care of. So if
>> there is a sequence of unordered elements (<emp/><dept/>
>> and<dept/><emp/> ) they will be treated as NOT equal.
>>
>> So either I iterate through each node and attribute and do a
>> comparison. That's the fall back. But before that, I wanted to check
>> the experts if there are better options.

>
> Yep. I based my hair trigger response upon the .equals(Object) of the
> "known implementing classes" of Node. Sorry. I'll be interested in
> finding out the "cost" associated with Arne Vajhøj's response.


The cost is CPU time. It costs a bit of CPU time to parse,
reorganize and serialize again.

Arne


 
 
Mausam
 
      01-18-2012
Thanks Arne,

I can achieve that using the Node.isEqualNode(Node) API, available since JDK 1.5.

I am worried about the following use cases (I am not sure whether they are even valid use cases):

1)
Are these two Nodes equal? Note that one has an empty street element and the other has no street element. That implies the value of street is empty in both cases, so if each is viewed as an Employee object in Java, both would be considered equal.
<Employee company="example" xmlns="http://example.com" debug="true">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
<street/>
</Employee>

<Employee debug="true" company="example" xmlns="http://example.com">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
</Employee>

2)
Check the position of the street element: in node 1 it comes after email, and in node 2 it comes before.
<Employee company="example" xmlns="http://example.com" debug="true">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
<street>Marienplatz</street>
</Employee>

<Employee debug="true" company="example" xmlns="http://example.com">
<Employeename>mausam</Employeename>
<street>Marienplatz</street>
<email>a @example.com</email>
</Employee>
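
Both use cases above could be handled by comparing an order-independent "signature" of each subtree instead of calling isEqualNode. The following is only a sketch under stated assumptions, not something proposed in the thread: the class LenientXmlCompare and its methods parse(), signature() and lenientEquals() are hypothetical names, parsing is assumed to be namespace-aware, and whether an empty element really means the same as a missing one (use case 1) or whether sibling order is irrelevant (use case 2) is a decision only the application can make.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LenientXmlCompare {

    /** Parses XML (namespace-aware) and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e); // keep the sketch free of checked exceptions
        }
    }

    /**
     * Builds an order-independent signature of a subtree. Attributes and children are
     * sorted, whitespace-only text is skipped, and an element without attributes and
     * without remaining content yields an empty signature, so it counts as absent.
     */
    static String signature(Node node) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            return text.isEmpty() ? "" : "text(" + text + ")";
        }
        if (type != Node.ELEMENT_NODE && type != Node.DOCUMENT_FRAGMENT_NODE) {
            return ""; // comments, processing instructions etc. are ignored in this sketch
        }

        List<String> children = new ArrayList<>();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            String s = signature(c);
            if (!s.isEmpty()) {
                children.add(s);
            }
        }
        Collections.sort(children); // makes sibling order irrelevant (use case 2)
        if (type == Node.DOCUMENT_FRAGMENT_NODE) {
            return children.toString();
        }

        List<String> attrs = new ArrayList<>();
        NamedNodeMap map = node.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node a = map.item(i);
            attrs.add("{" + a.getNamespaceURI() + "}" + a.getLocalName() + "=" + a.getNodeValue());
        }
        Collections.sort(attrs);
        if (attrs.isEmpty() && children.isEmpty()) {
            return ""; // an empty element is treated like a missing one (use case 1)
        }
        return "{" + node.getNamespaceURI() + "}" + node.getLocalName() + attrs + children;
    }

    /** Two nodes are "leniently equal" when their signatures match. */
    static boolean lenientEquals(Node a, Node b) {
        return signature(a).equals(signature(b));
    }
}
```

Called as LenientXmlCompare.lenientEquals(node1, node2), this treats both pairs above as equal. The price is that it also reorders mixed text content, trims text, and would consider two different empty root elements equal, so it is a starting point rather than a general-purpose equality. If equals is overridden this way, hashCode should be derived from the same signature.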

--

Please note that I cannot create Java objects from the XML, as these are free-form XML fragments that do not comply with a schema. But thanks a lot for your effort and the code example.
 