trouble reading a gzipped xml-file

 
 
Guido de Melo
 
      11-14-2005
Hi,

I'm trying to read a gzipped XML file into REXML, but I haven't quite
succeeded. Perhaps someone can help me. So far I have tried this:

#!/usr/bin/ruby -w

require 'zlib'
require 'rexml/document'

Zlib::GzipReader.open('file.dia') {|gz|
print gz.read
}
# this prints everything nicely and it works

f = Zlib::GzipReader.open("file.dia")
s = f.read
p s

# now the ungzipped contents are in s, they look like this however:
# "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<dia:diagram
# xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\">\n
# <dia:diagramdata>\n <dia:attribute name=\"background\">\n
# all in one line and encapsulated in ""
# so of course the next command fails

xmldoc = REXML::Document.new s
p xmldoc
# gives: <UNDEFINED> ... </>

Any ideas on this? This can't be too difficult, I think...
Guido
 
 
 
 
 
Robert Klemme
 
      11-14-2005
Guido de Melo wrote:
> Hi,
>
> I'm trying to read a gzipped XML file into REXML, but I haven't quite
> succeeded. Perhaps someone can help me. So far I have tried this:
>
> #!/usr/bin/ruby -w
>
> require 'zlib'
> require 'rexml/document'
>
> Zlib::GzipReader.open('file.dia') {|gz|
> print gz.read
> }
> # this prints everything nicely and it works
>
> f = Zlib::GzipReader.open("file.dia")
> s = f.read
> p s


You're not closing f here.

> # now the ungzipped contents are in s, they look like this however:
> # "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<dia:diagram
> # xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\">\n
> # <dia:diagramdata>\n <dia:attribute name=\"background\">\n
> # all in one line and encapsulated in ""
> # so of course the next command fails


I'm not so sure that it fails just because of this.

> xmldoc = REXML::Document.new s
> p xmldoc
> # gives: <UNDEFINED> ... </>
>
> Any ideas on this? This can't be too difficult, I think...
> Guido


Did you try this?

xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new gz}

Kind regards

robert

 
 
 
 
 
Guido de Melo
 
      11-14-2005
Robert Klemme wrote:
> You're not closing f here.
>

[...]
>
> I'm not so sure that it fails just because of this.


It shouldn't, because ruby will determine the end of the file by itself
and close the handle on exiting.

> Did you try this?
>
> xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new gz}


irb(main):003:0> xmldoc = Zlib::GzipReader.open('file.dia') {|gz|
REXML::Document.new gz}
RuntimeError: Zlib::GzipReader is not a valid input stream. It must be
either a String, IO, StringIO or Source.

This doesn't work either, I'm afraid...
Guido
 
 
Robert Klemme
 
      11-14-2005
Guido de Melo wrote:
> Robert Klemme wrote:
>> You're not closing f here.
>>

> [...]
>>
>> I'm not so sure that it fails just because of this.

>
> It shouldn't, because ruby will determine the end of the file by
> itself and close the handle on exiting.


I didn't mean to say that it fails because of the open file. My point
with the first remark was that it's a good habit to keep files open only
for as long as they are actually used. The block form is the idiom of
choice here: it's not much longer than a simple File.open() or File.new()
and it ensures the file is always properly closed.
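
The block-form idiom described above can be sketched as follows. This is a minimal illustration, not code from the thread: the sample file, its contents, and the variable names (path, leaked, content) are made up so the sketch runs on its own.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical sample file, created here only so the sketch is self-contained.
path = File.join(Dir.mktmpdir, 'sample.gz')
Zlib::GzipWriter.open(path) { |gz| gz.write("<root/>\n") }

# Non-block form: the caller is responsible for closing the reader.
reader = Zlib::GzipReader.open(path)
manual = reader.read
reader.close

# Block form: the reader is closed when the block exits, and the
# block's value becomes the return value of the open call.
leaked = nil
content = Zlib::GzipReader.open(path) do |gz|
  leaked = gz          # kept only to inspect the reader afterwards
  gz.read
end

puts leaked.closed?      # the block form has already closed the reader
puts content == manual   # both forms read the same data
```

The block form also closes the reader if the block raises, which the manual version above does not guarantee without an explicit ensure clause.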

>> Did you try this?
>>
>> xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new
>> gz}

>
> irb(main):003:0> xmldoc = Zlib::GzipReader.open('file.dia') {|gz|
> REXML::Document.new gz}
> RuntimeError: Zlib::GzipReader is not a valid input stream. It must
> be either a String, IO, StringIO or Source.
>
> This doesn't work either, I'm afraid...


I guess this is because GzipReader doesn't inherit from IO:

>> Zlib::GzipReader.ancestors

=> [Zlib::GzipReader, Enumerable, Zlib::GzipFile, Object, Kernel]

But you can do this

xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new(
gz.read )}
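
A self-contained sketch of the same approach, for trying it without a real .dia file. The XML payload and the element names in it are invented for illustration, and Zlib.gzip is used only to build sample data in memory. The error quoted above comes from the REXML of 2005; later REXML releases may treat a GzipReader differently, so reading into a String first is shown here simply as the variant the thread confirms.

```ruby
require 'zlib'
require 'stringio'
require 'rexml/document'

# Sample data (an assumption, not the poster's file): a tiny gzipped XML document.
xml = %(<?xml version="1.0" encoding="UTF-8"?>\n) +
      %(<diagram><layer name="Background"/></diagram>\n)
gzipped = Zlib.gzip(xml)

# GzipReader wraps any IO-like object; here a StringIO stands in for the file.
reader = Zlib::GzipReader.new(StringIO.new(gzipped))
puts reader.is_a?(IO)   # false: GzipReader is not a subclass of IO

# Decompress into a String and hand that String to REXML.
doc = REXML::Document.new(reader.read)
reader.close

puts doc.root.name
puts doc.root.elements['layer'].attributes['name']
```

Reading the whole file into memory is fine for small diagrams; for very large documents a streaming parser would be the thing to look at instead.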

Btw, you see differing output because you use different printing methods:

Zlib::GzipReader.open('file.dia') {|gz|
print gz.read
}

vs.

p s
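
To make the difference concrete, here is a small sketch with a made-up XML string: print writes the string itself, while p writes the result of inspect, which is a single quoted line with escapes. The string's contents are the same in both cases.

```ruby
# Sample string (an assumption): four lines of XML with embedded quotes.
s = "<?xml version=\"1.0\"?>\n<root>\n  <child/>\n</root>\n"

print s   # writes the string as is: four lines of XML
p s       # writes s.inspect: one line, wrapped in "" with \n and \" escapes

# Only the display differs; the string still contains real newlines.
puts s.lines.length           # number of lines in the string itself
puts s.inspect.lines.length   # number of lines in its inspect form
```

So the quoted, one-line output in the original post is just how p displays a String, not a sign that the decompressed data is damaged.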

If there is no exception during GZIP reading I guess there might be a bug
somewhere. As a test I'd write the gunzipped content to another file and
do a diff on the plain xml and this output to see whether GzipReader
actually yields the same content.
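
The check suggested above could look roughly like this. It is a sketch under assumptions: the file names (plain.xml, gunzipped.xml) and the sample XML are hypothetical, and the files are created in a temporary directory so the code runs on its own. With a real diagram you would compare against an uncompressed copy saved from dia.

```ruby
require 'zlib'
require 'tmpdir'
require 'fileutils'

dir        = Dir.mktmpdir
plain_path = File.join(dir, 'plain.xml')      # hypothetical uncompressed reference
gz_path    = File.join(dir, 'file.dia')       # gzipped version of the same data
out_path   = File.join(dir, 'gunzipped.xml')  # what GzipReader gives back

# Create the sample pair of files.
xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>\n  <child/>\n</root>\n"
File.binwrite(plain_path, xml)
Zlib::GzipWriter.open(gz_path) { |gz| gz.write(xml) }

# Write the decompressed content to another file ...
Zlib::GzipReader.open(gz_path) { |gz| File.binwrite(out_path, gz.read) }

# ... and compare it byte for byte with the plain XML.
same = FileUtils.compare_file(plain_path, out_path)
puts same ? 'identical' : 'different'
```

If the comparison reports a difference, running diff on the two files from the shell shows where they diverge.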

Btw, did you actually try to make REXML read the first variant? Maybe you
have a problem in your XML file.

Kind regards

robert

 
 
Guido de Melo
 
      11-14-2005
Thank you very much! You solved it!

Robert Klemme wrote:
>>>You're not closing f here.

>>[...]
>>>I'm not so sure that it fails just because of this.

>>
>>It shouldn't, because ruby will determine the end of the file by
>>itself and close the handle on exiting.

>
> I didn't mean to say that it fails because of the open file. My point
> with the first remark was that it's a good habit to keep files open only
> for as long as they are actually used. The block form is the idiom of
> choice here: it's not much longer than a simple File.open() or File.new()
> and it ensures the file is always properly closed.


You are right of course, I will try to do this in the future.

> But you can do this
>
> xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new(
> gz.read )}


And this worked perfectly for me.

> If there is no exception during GZIP reading I guess there might be a bug
> somewhere. As a test I'd write the gunzipped content to another file and
> do a diff on the plain xml and this output to see whether GzipReader
> actually yields the same content.


They do.

> Btw, did you actually try to make REXML read the first variant? Maybe you
> have a problem in your XML file.


Thank goodness the XML produced by dia is sound.

Kind regards,
Guido
 