Looking for freely available, huge DTD

 
 
Joe Kesselman
 
      05-31-2007
Ixa wrote:
>> [a tool for anonymizing XSLT testcases]
>>> Nontrivial? Perl and few regular expressions will do the trick.

>> Feel free to present your solution here

>
> OK, here goes.


Good solution for the cases it covers, but...

Remember, my comment was that it was nontrivial to properly anonymize
XSLT testcases. That means rewriting the stylesheet logic in sync with
the changes to the input document, which in turn means being aware of
the XPaths and making sure the greeked input document still matches them
when (and only when) it should. That may require being more careful
about how you manipulate the document's content.
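
To make the "in sync" requirement concrete, here is a minimal Python sketch (not a worked-out tool, and not anything posted earlier in this thread). It builds one rename map from the input document and applies it both to the document's element names and to the name tests in the stylesheet's `match`, `select` and `test` attributes. The function names (`build_name_map`, `rename_xpath`, `anonymize_pair`) are made up for illustration. It assumes an input document without namespaces, does not touch text content or attribute names, and uses a naive regex tokenizer rather than a real XPath parser, so an element that happens to be called `and`, `or`, `div` or `mod` would trip it up.

```python
import re
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Either a quoted string literal (group 1) or a name-like token (group 2).
TOKEN = re.compile(r"""("[^"]*"|'[^']*')|([A-Za-z_][\w.-]*)""")


def build_name_map(root):
    """Assign an opaque name (e0, e1, ...) to each distinct element name."""
    mapping = {}
    for elem in root.iter():
        if elem.tag not in mapping:
            mapping[elem.tag] = "e%d" % len(mapping)
    return mapping


def rename_xpath(expr, mapping):
    """Rename element name tests in an XPath string (naive, regex-based)."""
    def swap(m):
        if m.group(1):
            return m.group(1)            # string literal: leave alone
        name = m.group(2)
        before = expr[:m.start()]
        rest = expr[m.end():].lstrip()
        if rest.startswith("(") or rest.startswith("::"):
            return name                  # function call or axis name
        if before.endswith("@") or before.endswith("$"):
            return name                  # attribute or variable reference
        return mapping.get(name, name)   # rename only names seen in the input
    return TOKEN.sub(swap, expr)


def anonymize_pair(doc_root, xsl_root):
    """Rename elements in the document and patch the stylesheet to match."""
    mapping = build_name_map(doc_root)
    for elem in doc_root.iter():
        elem.tag = mapping[elem.tag]
    for elem in xsl_root.iter():
        if not elem.tag.startswith("{%s}" % XSL_NS):
            continue                     # skip literal result elements
        for attr in ("match", "select", "test"):
            if attr in elem.attrib:
                elem.set(attr, rename_xpath(elem.get(attr), mapping))
    return mapping
```

Even this toy version shows where the bookkeeping comes from: attribute value templates, `xsl:key`, `name()` comparisons and string literals that happen to hold element names are all left alone here, and each of those can make a stylesheet stop matching after the rename.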

Trivial cases have trivial solutions. One that's robust enough to give
out to customers who have written nontrivial stylesheets, with some
confidence of getting back valid testcases, is not that simple, alas.


--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
 
 
 
 
Ixa
 
      05-31-2007
> rewriting the stylesheet logic in synch with the changes to the input document

So, are you actually looking for a method that also scrambles the
logic in the XSLT templates and the structure of the XML, in addition
to messing up the element, attribute and variable names?

> Trivial cases have trivial solutions.


Absolutely. There is a lot of room for improvement in that method, the
most important (from the XSLT point of view) being that the anonymizer
should be structure-aware. Right now it just blindly messes up the lines
in text files, and the result has to be checked by hand.

I guess the best approach could be to use XSLT to modify the XSLT and
XML at the same time. Then there would be total control over which parts
of the trees get changed, and it would be possible to do more than just
alphabet scrambling.
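
As a rough illustration of what "structure-aware" could buy, here is a small sketch. It is in Python rather than XSLT, purely to keep it short, and the names `Greeker` and `greek_tree` are hypothetical. It walks the parsed tree instead of the text lines and replaces every distinct text or attribute value with a stable token, so values that were equal before stay equal and values that differed stay different. It assumes the stylesheet only compares values for equality; anything using `substring()`, `contains()`, numeric comparison or sorting would still need special handling.

```python
import xml.etree.ElementTree as ET


class Greeker:
    """Replace each distinct string with a stable opaque token.

    Equal inputs always get the same token and distinct inputs get
    distinct tokens, so equality tests behave as they did before.
    """

    def __init__(self):
        self.seen = {}

    def greek(self, text):
        if text is None or not text.strip():
            return text                  # keep indentation-only text as is
        key = text.strip()
        if key not in self.seen:
            self.seen[key] = "t%d" % len(self.seen)
        # Preserve the surrounding whitespace, swap only the content.
        return text.replace(key, self.seen[key], 1)


def greek_tree(root, greeker):
    """Greek all text, tail text and attribute values in place."""
    for elem in root.iter():
        elem.text = greeker.greek(elem.text)
        elem.tail = greeker.greek(elem.tail)
        for name, value in list(elem.attrib.items()):
            elem.set(name, greeker.greek(value))


if __name__ == "__main__":
    root = ET.fromstring(
        '<a id="x1"><b>Alice</b><b>Bob</b><c ref="x1">Alice</c></a>'
    )
    greek_tree(root, Greeker())
    print(ET.tostring(root, encoding="unicode"))
```

Using one `Greeker` instance for both the input document and any literal values in the stylesheet would keep the two files consistent with each other.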

--
Ixa

 
 
 
 
 
Joe Kesselman
 
      05-31-2007
Ixa wrote:
> I guess the best approach could be to use XSLT to modify the XSLT and
> XML at the same time.


That's a bit hard to do with XSLT 1.0, unless you use the redirect
extensions... but, yes, I was pondering that approach. (As folks may
remember, I've already written an article on using stylesheets to
manipulate other stylesheets.)

But the bookkeeping for this particular N-way anonymizer may be ugly
enough to be better handled in a more traditional programming language.

Hey, if I had a full solution worked out, I'd have published it
already... <smile/>

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
 
 
 