XML compression

 
 
Ed Beroset
Guest
Posts: n/a
 
      12-11-2004
I have an XML file that I want to squeeze down as small as possible for
storage in an embedded device. I want it to still be a valid XML file
(and not something like a binary ASN.1 encoding of an XML file) but it
does not need to carry the long tags it currently has as long as I
create an XSLT which will put it back into the right form. What I had
in mind was something like this:

<original-xml-fragment>
<very-long-and-verbose-tag name="Long tag 1">
<more-information-is-stored-here name="stuff 1"/>
</very-long-and-verbose-tag>
<very-long-and-verbose-tag name="Long tag 2">
<more-information-is-stored-here name="stuff 2"/>
<valuable-additional-information name="foo"/>
</very-long-and-verbose-tag>
</original-xml-fragment>

I'm thinking of transforming it to this:

<o><v n="Long tag 1"><m n="stuff 1"/></v><v n="Long tag 2"><m n="stuff 2"/><v2 n="foo"/></v></o>

My question is: has someone already written an XSLT that would
abbreviate tags in this way AND generate the corresponding
"decoder" XSLT that would reconstitute the original? I have ideas
about how to do it using a procedural language, but I would like to do
it entirely with XSL transforms if I can.

The only part that I don't really know how to do is to automatically
generate short, unique abbreviations for each of the tags. I *could*
specify them all manually once, but I'd prefer an automatic solution to
simplify maintenance.

Ed

 
 
 
 
 
Joris Gillis
Guest
Posts: n/a
 
      12-11-2004
> I have an XML file that I want to squeeze down as small as possible for
> storage in an embedded device. I want it to still be a valid XML file
> (and not something like a binary ASN.1 encoding of an XML file) but it
> does not need to carry the long tags it currently has as long as I
> create an XSLT which will put it back into the right form. What I had
> in mind was something like this:
>
> <original-xml-fragment>
> <very-long-and-verbose-tag name="Long tag 1">
> <more-information-is-stored-here name="stuff 1"/>
> </very-long-and-verbose-tag>
> <very-long-and-verbose-tag name="Long tag 2">
> <more-information-is-stored-here name="stuff 2"/>
> <valuable-additional-information name="foo"/>
> </very-long-and-verbose-tag>
> </original-xml-fragment>
>
> I'm thinking of transforming it to this:
>
> <o><v n="Long tag 1"><m n="stuff 1"/></v><v n="Long tag 2"><m n="stuff
> 2"/><v2 n="foo"/></v></o>
>
> My question is, has someone already generated an XSLT that would
> abbreviate tags in this kind of way AND generate the corresponding
> "decoder" XSLT which would reconstitute the original. I have ideas
> about how to do it using a procedural language, but I would like to do
> it entirely with XSL transforms if I can.
>
> The only part that I don't really know how to do is to automatically
> generate short, unique abbreviations for each of the tags. I *could*
> specify them all manually once, but I'd prefer an automatic solution to
> simplify maintenance.


Hi,

I've created this little stylesheet that maps all unique node names to abbreviations. It might be handy as an intermediate step towards a solution for your (by the way, very interesting) question.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every element and attribute by its local name. -->
  <xsl:key name="name" match="*|@*" use="local-name()"/>

  <xsl:template match="/">
    <name-mapping>
      <!-- Muenchian grouping: keep only the first node carrying each name. -->
      <xsl:for-each select="//*[generate-id()=generate-id(key('name',local-name()))]|//@*[generate-id()=generate-id(key('name',local-name()))]">
        <name>
          <!-- Short names run a, b, ..., z, aa, ab, ... in document order. -->
          <xsl:attribute name="s"><xsl:number value="position()" format="a"/></xsl:attribute>
          <xsl:value-of select="local-name()"/>
        </name>
      </xsl:for-each>
    </name-mapping>
  </xsl:template>

</xsl:stylesheet>



This will generate the following output:

<name-mapping>
<name s="a">original-xml-fragment</name>
<name s="b">very-long-and-verbose-tag</name>
<name s="c">name</name>
<name s="d">more-information-is-stored-here</name>
<name s="e">valuable-additional-information</name>
</name-mapping>
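
For comparison with the procedural route mentioned in the question, the same mapping can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration and not part of the stylesheet: short_name and build_name_map are hypothetical helper names, and the sketch assumes the document uses no namespaces.

```python
import xml.etree.ElementTree as ET

# The verbose fragment from the original question.
VERBOSE = """<original-xml-fragment>
<very-long-and-verbose-tag name="Long tag 1">
<more-information-is-stored-here name="stuff 1"/>
</very-long-and-verbose-tag>
<very-long-and-verbose-tag name="Long tag 2">
<more-information-is-stored-here name="stuff 2"/>
<valuable-additional-information name="foo"/>
</very-long-and-verbose-tag>
</original-xml-fragment>"""


def short_name(n):
    """Map 1 -> 'a', 26 -> 'z', 27 -> 'aa', like xsl:number with format="a"."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def build_name_map(root):
    """Assign a short name to each distinct element/attribute name, in document order."""
    mapping = {}
    for elem in root.iter():  # iter() walks the tree in document order
        for name in [elem.tag, *elem.attrib]:
            if name not in mapping:
                mapping[name] = short_name(len(mapping) + 1)
    return mapping


name_map = build_name_map(ET.fromstring(VERBOSE))
for long_name, short in name_map.items():
    print(short, long_name)
```

As in the stylesheet, an element and an attribute that share a name receive the same abbreviation.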

regards,
--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
 
 
 
 
 
Joris Gillis
Guest
Posts: n/a
 
      12-11-2004
>> My question is, has someone already generated an XSLT that would
>> abbreviate tags in this kind of way AND generate the corresponding
>> "decoder" XSLT which would reconstitute the original. I have ideas
>> about how to do it using a procedural language, but I would like to do
>> it entirely with XSL transforms if I can.
>>
>> The only part that I don't really know how to do is to automatically
>> generate short, unique abbreviations for each of the tags. I *could*
>> specify them all manually once, but I'd prefer an automatic solution to
>> simplify maintenance.

>
> this will generate the following output:
>
> <name-mapping>
> <name s="a">original-xml-fragment</name>
> <name s="b">very-long-and-verbose-tag</name>
> <name s="c">name</name>
> <name s="d">more-information-is-stored-here</name>
> <name s="e">valuable-additional-information</name>
> </name-mapping>
>

Hi again,

Given that two transformation steps are allowed, you can do this:
run the above stylesheet on the verbose XML and save its output to a file named 'name-map.xml'.

When you then apply the following stylesheet, the verbose XML will be reduced.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="yes" is for readability; indent="no" gives the smallest output. -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Rebuild each element under the short name listed in name-map.xml. -->
  <xsl:template match="*">
    <xsl:variable name="name" select="local-name()"/>
    <xsl:element name="{document('name-map.xml')//name[.=$name]/@s}">
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Rename attributes the same way, keeping their values. -->
  <xsl:template match="@*">
    <xsl:variable name="name" select="local-name()"/>
    <xsl:attribute name="{document('name-map.xml')//name[.=$name]/@s}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>


The reduced form will look like this:
<a>
<b c="Long tag 1">
<d c="stuff 1"/>
</b>
<b c="Long tag 2">
<d c="stuff 2"/>
<e c="foo"/>
</b>
</a>


And this stylesheet will expand it again to the original verbose form:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Reverse lookup: find the entry whose @s is the short name and emit its long name. -->
  <xsl:template match="*">
    <xsl:variable name="name" select="local-name()"/>
    <xsl:element name="{document('name-map.xml')//name[@s=$name]}">
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Restore attribute names the same way, keeping their values. -->
  <xsl:template match="@*">
    <xsl:variable name="name" select="local-name()"/>
    <xsl:attribute name="{document('name-map.xml')//name[@s=$name]}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
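
The round trip performed by the two stylesheets (rename with the map, then rename with the inverted map) can also be checked with a short Python sketch using the standard library's xml.etree.ElementTree. It is an illustration under stated assumptions rather than a replacement for the XSLT: rename, compress and expand are hypothetical helper names, the mapping is copied by hand from the name-mapping output shown earlier, and namespaces, comments and processing instructions are not handled.

```python
import xml.etree.ElementTree as ET

# Mapping copied from the name-mapping output earlier in the thread.
NAME_MAP = {
    "original-xml-fragment": "a",
    "very-long-and-verbose-tag": "b",
    "name": "c",
    "more-information-is-stored-here": "d",
    "valuable-additional-information": "e",
}

# The verbose fragment from the original question, without line breaks.
VERBOSE = (
    '<original-xml-fragment>'
    '<very-long-and-verbose-tag name="Long tag 1">'
    '<more-information-is-stored-here name="stuff 1"/>'
    '</very-long-and-verbose-tag>'
    '<very-long-and-verbose-tag name="Long tag 2">'
    '<more-information-is-stored-here name="stuff 2"/>'
    '<valuable-additional-information name="foo"/>'
    '</very-long-and-verbose-tag>'
    '</original-xml-fragment>'
)


def rename(elem, mapping):
    """Return a copy of elem with every tag and attribute name looked up in mapping."""
    new = ET.Element(mapping[elem.tag],
                     {mapping[key]: value for key, value in elem.attrib.items()})
    new.text, new.tail = elem.text, elem.tail
    new.extend([rename(child, mapping) for child in elem])
    return new


def compress(root):
    """Long names -> short names (the job of the reducing stylesheet)."""
    return rename(root, NAME_MAP)


def expand(root):
    """Short names -> long names (the job of the expanding stylesheet)."""
    inverse = {short: long_name for long_name, short in NAME_MAP.items()}
    return rename(root, inverse)


original = ET.fromstring(VERBOSE)
compact = compress(original)
restored = expand(compact)
compact_text = ET.tostring(compact, encoding="unicode")

print(len(VERBOSE), "characters before,", len(compact_text), "after")
```

Because element and attribute names share one map, inverting it is safe only as long as every short name is unique, which the generated name-mapping guarantees.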


I hope this is useful.

regards,

--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
 
 
Ed Beroset
Guest
Posts: n/a
 
      12-11-2004
Joris Gillis wrote:
[big snip of useful, working XSLT]
>
> I hope this is useful.


It's more than useful -- it's superb! Thanks very much. When I figure
out how to combine them into a single step, I'll post the result.

Ed

 