Java - XML Not good for Big Files (vs Flat Files)

 
Old 04-04-2006, 03:27 PM   #1
Homer
 

I am a little tired of the obsession people have with XML and XML
technology. Please share your thoughts and let me know if I am thinking
about this the wrong way. I believe some people are overusing XML all over
the place. The Canadian Government is now pushing XML on its organizations
as the standard for data/file transfer. Huge files moving between companies
now include tons of XML tags repeated throughout the file, slowing
down networks and crashing applications because of their size.
I am not objecting to the whole technology. I know the advantages of XML
and use it all the time for config files and our web-oriented
applications, but using it as the standard for moving big files is going too
far. Here is an example:

John,Smith,5555555,37 Finch Ave.

Is now:

<FirstName>John</FirstName>
<LastName>Smith</LastName>
<PhoneNum>5555555</PhoneNum>
<Address>37 Finch Ave.</Address>

And the tags repeat for every record:

<FirstName>....</FirstName>
<LastName>....</LastName>
<PhoneNum>....</PhoneNum>
<Address>....</Address>

<FirstName>....</FirstName>
<LastName>....</LastName>
<PhoneNum>....</PhoneNum>
<Address>....</Address>


Please let me know what you think.


Regards,

Homer

Old 04-04-2006, 03:50 PM   #2
James McGill
 

On Tue, 2006-04-04 at 08:27 -0700, Homer wrote:
>
> And Tags are repeating and repeating:


XML markup does tend to bloat the data.

I personally believe you should use serializable objects that can be
represented according to an XML schema when that's appropriate, but that
also can be serialized into a tightly packed format when that is
appropriate as well. So I should be able to marshal/unmarshal the
serialized object to and from XML, but I should also be able to stream
that object without marshalling it -- and the other end should be able
to unmarshal to xml, validate according to the schema, etc.

Likewise, database bindings should be informed by the xml schema, but
the XML markup shouldn't be what you store in the db.
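
As a rough illustration of the idea above, here is a minimal Java sketch of one object that can be written either as XML or as a tightly packed byte stream. The `Contact` class and its field names are hypothetical, borrowed from the example in the first post. It uses plain `DataOutputStream` rather than `java.io.Serializable` or an XML binding framework, so treat it as a sketch of the principle, not as the approach the poster had in mind.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record type; field names follow the thread's example.
final class Contact {
    final String firstName, lastName, phone, address;

    Contact(String firstName, String lastName, String phone, String address) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phone = phone;
        this.address = address;
    }

    // Verbose, self-describing form for partners who need XML.
    String toXml() {
        return "<Contact>"
             + "<FirstName>" + escape(firstName) + "</FirstName>"
             + "<LastName>" + escape(lastName) + "</LastName>"
             + "<PhoneNum>" + escape(phone) + "</PhoneNum>"
             + "<Address>" + escape(address) + "</Address>"
             + "</Contact>";
    }

    // Compact form: four length-prefixed strings, no markup at all.
    byte[] toPacked() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(firstName);
            out.writeUTF(lastName);
            out.writeUTF(phone);
            out.writeUTF(address);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static Contact fromPacked(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new Contact(in.readUTF(), in.readUTF(), in.readUTF(), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Escapes the characters that would break element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Contact c = new Contact("John", "Smith", "5555555", "37 Finch Ave.");
        System.out.println("XML bytes:    " + c.toXml().getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Packed bytes: " + c.toPacked().length);
        System.out.println(Contact.fromPacked(c.toPacked()).toXml());
    }
}
```

The receiving end can turn the packed form back into XML with `fromPacked(...).toXml()` whenever it needs to validate or archive the record.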


Old 04-04-2006, 04:01 PM   #3
mtp
 

Homer wrote:
> I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.


you can use indexing, binary XML, or compression

> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
>
> Please let me know what you think.


Maybe one of the IT service providers wanted to charge more for its
services on this big project?

Maybe everybody thinks "newer is better"?
Old 04-04-2006, 04:06 PM   #4
cherukan@gmail.com
 

Homer wrote:
> I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.
> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
>
> Please let me know what you think.
>
>
> Regards,
>
> Homer


Yes, that does seem like a network killer. It depends on the
intended use of the file on the other end and on the client receiving
it. If they *have to* use XML, certain optimizations can be done for
just the transfer part...

<header>
<firstName>A15</firstName>
<lastName>A15</lastName>
<phone>A10</phone>
<address>A10</address>
</header>
<data>
<![CDATA[
... fixed width data goes here ...
]]>
</data>

OR

<header>
<fieldSeparator>;</fieldSeparator>
<field>firstName</field>
<field>lastName</field>
<field>phone</field>
<field>address</field>
</header>
<data>
<![CDATA[
... delimited data goes here ...
]]>
</data>

OR a combination of the above.

In short, XML should be preferred only if documentation and
discoverability are more important than performance.
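
For what it's worth, the delimited variant (a header naming the separator and fields, followed by rows inside a CDATA section) could be read roughly as follows. This is only a sketch: it assumes the CDATA section is written in well-formed syntax (`<![CDATA[ ... ]]>`), that the fragment is wrapped in a hypothetical `<file>` root element so it parses as a document, and that rows are separated by newlines. The `HybridReader` class and the sample data are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class HybridReader {
    // Sample document; the <file> root and the two rows are assumptions.
    static final String SAMPLE =
          "<file>"
        + "<header><fieldSeparator>;</fieldSeparator>"
        + "<field>firstName</field><field>lastName</field>"
        + "<field>phone</field><field>address</field></header>"
        + "<data><![CDATA[\n"
        + "John;Smith;5555555;37 Finch Ave.\n"
        + "Jane;Doe;5551234;12 Main St.\n"
        + "]]></data></file>";

    // Returns one map per data row, keyed by the field names from the header.
    static List<Map<String, String>> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String sep = doc.getElementsByTagName("fieldSeparator").item(0).getTextContent();
            NodeList fields = doc.getElementsByTagName("field");
            String data = doc.getElementsByTagName("data").item(0).getTextContent();

            List<Map<String, String>> rows = new ArrayList<>();
            for (String line : data.split("\\R")) {
                if (line.trim().isEmpty()) {
                    continue; // skip the blank lines around the CDATA payload
                }
                // Pattern.quote keeps separators such as "|" from acting as regex syntax.
                String[] values = line.trim().split(Pattern.quote(sep), -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < fields.getLength() && i < values.length; i++) {
                    row.put(fields.item(i).getTextContent(), values[i]);
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("could not read hybrid document", e);
        }
    }

    public static void main(String[] args) {
        read(SAMPLE).forEach(System.out::println);
    }
}
```

Only the small header goes through the XML machinery here; the bulk of the data is handled as ordinary delimited text, which is where the size saving comes from.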

Old 04-04-2006, 04:11 PM   #5
RC
 

Homer wrote:


> Please let me know what you think.


XML was never designed to replace a database server.

You can use an XML file to transfer a portion of the data
from a database,
e.g.

SELECT lastname, firstname, phonenumber, address
FROM phonebook
WHERE state = 'NY' AND city = 'somewhere';

With a flat file like this

William|John|12345678|84 5th Ave

I can't tell which column is the last name and which is the first name.
Is the 3rd column a person ID or a phone number?

You need to tell the programmers which column is which.

If someone later changes the flat file format to

85 5th Ave|John|William|12345678

then your database will be incorrect after the update.


True, XML creates large files.
But it makes our lives easier.

You can make up your own tags,
<lastName> or <Last_Name>, etc.
The tags can be in English, Spanish, French, Russian, Japanese, etc.
Old 04-04-2006, 04:19 PM   #6
James McGill
 

On Tue, 2006-04-04 at 09:06 -0700, wrote:
>
> OR a combination of the above.


You're almost touching on the big problem: Misconception of what it
means to be "standard".

XML has (several) standardized markup frameworks, but it is silent as to
content or utilization. It is ridiculous for a government entity to
demand that "XML" be "the standard" for data interchange. They need to
bless certain schemas if that's their goal, but it also needs to be
abstract enough that systems can be designed efficiently.

In your examples, the designers can claim that they are using "XML", and
therefore "are standardized" on it, but the three examples we've seen so
far are not at all interchangeable...
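
To make the point about schemas concrete, here is a small Java sketch using the standard `javax.xml.validation` API. The XSD below is invented for illustration (it is not any government's actual schema), and the `SchemaCheck` class is hypothetical. It shows that two documents can both be "XML" while only one of them matches the agreed schema.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaCheck {
    // Made-up schema: one Person with four string fields in a fixed order.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Person'><xs:complexType><xs:sequence>"
        + "<xs:element name='FirstName' type='xs:string'/>"
        + "<xs:element name='LastName' type='xs:string'/>"
        + "<xs:element name='PhoneNum' type='xs:string'/>"
        + "<xs:element name='Address' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns true only if the document is well-formed AND matches the schema.
    static boolean conforms(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed, or well-formed but not what the schema describes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String agreed = "<Person><FirstName>John</FirstName><LastName>Smith</LastName>"
                      + "<PhoneNum>5555555</PhoneNum><Address>37 Finch Ave.</Address></Person>";
        String other  = "<person><firstName>John</firstName><lastName>Smith</lastName>"
                      + "<phone>5555555</phone><address>37 Finch Ave.</address></person>";
        System.out.println("agreed layout conforms: " + conforms(agreed));
        System.out.println("other layout conforms:  " + conforms(other));
    }
}
```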

Old 04-04-2006, 04:39 PM   #7
Timbo
 

Homer wrote:
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>

It's true that the XML data in your example is bulky, but what it
has that the unstructured version doesn't is meta-level information,
such as that "John" is someone's first name. If the parties involved
(i.e. the sender and receiver of this information) have an
agreement as to the meaning of "FirstName", then they are sharing
more than just text... it has some implicit meaning. If you send
it unstructured, then the receiver has to know how to parse the
data into this agreed meaning, which means it needs to know the
format of the data.

Then, on the other hand, if the data is just stored in a database
or something with no definition of what the tags mean, then I
agree with you... using XML is of little use.
Old 04-04-2006, 04:44 PM   #8
Oliver Wong
 


Homer wrote:
>I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.
> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
>
> Please let me know what you think.


If your complaint is file size during network transfer, compress the
file before sending it.

If your complaint is file size during parsing, use SAX instead of DOM,
and don't keep the whole file in memory at once.
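
A minimal sketch of the SAX approach, for illustration: the handler below sees one element at a time and keeps only a counter, so memory use does not grow with the size of the file. The `SaxCount` class, the `<People>`/`<Person>` wrapper elements, and the sample data are assumptions, not something from the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxCount {
    // Tiny stand-in for a large file; wrapper element names are assumed.
    static final String SAMPLE =
          "<People>"
        + "<Person><FirstName>John</FirstName><LastName>Smith</LastName></Person>"
        + "<Person><FirstName>Jane</FirstName><LastName>Doe</LastName></Person>"
        + "<Person><FirstName>Anna</FirstName><LastName>Smith</LastName></Person>"
        + "</People>";

    // Counts <LastName> elements whose text equals `wanted`, without building a tree.
    static int countLastName(InputStream in, String wanted) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inLastName;

            @Override
            public void startElement(String uri, String local, String qName, Attributes attrs) {
                if (qName.equals("LastName")) {
                    inLastName = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so accumulate.
                if (inLastName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals("LastName")) {
                    inLastName = false;
                    if (text.toString().equals(wanted)) {
                        count[0]++;
                    }
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        System.out.println("Smiths found: " + countLastName(in, "Smith"));
    }
}
```

With a real file, the `ByteArrayInputStream` would be replaced by a `FileInputStream` (or a `GZIPInputStream` wrapped around one), and the handler stays the same.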

Use the right tool for the job. If for whatever problem you're trying to
solve, you've got a better tool than XML, then use it. But if the problem is
"The government requires me to use XML", then I can't think of a better tool
than XML to solve that particular problem (except maybe emigration).

- Oliver

Old 04-04-2006, 04:56 PM   #9
James McGill
 

On Tue, 2006-04-04 at 16:44 +0000, Oliver Wong wrote:

> except maybe emmigration


You say that as though anyone would ever leave the utopian paradise that
is Canada...

Old 04-04-2006, 04:58 PM   #10
Lasse Reichstein Nielsen
 

"Homer" writes:

> I am a little bit tired of this obsession people have with XML and XML
> technology.


Hear hear!
It seems some people think XML is the solution to all problems.
I'd rather classify it as the lowest common denominator for exchanging
tree-structured data - and definitely not something fit for humans to
read or write directly.

> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:


> Please let me know what you think.


Apart from what everybody else has said, zipping such a file
should yield a *very* high compression factor.
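
A quick way to check this on your own data is to gzip both representations and compare sizes. The sketch below uses `java.util.zip.GZIPOutputStream` on generated records; the `GzipCompare` class and the generated names, phone numbers, and addresses are made up, and synthetic data like this compresses better than real data would, so the numbers it prints are only indicative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class GzipCompare {
    // Compresses a byte array in memory and returns the gzip output.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Generates `records` synthetic entries with the tags from the thread's example.
    static byte[] sampleXml(int records) {
        StringBuilder sb = new StringBuilder("<People>\n");
        for (int i = 0; i < records; i++) {
            sb.append("<Person><FirstName>First").append(i).append("</FirstName>")
              .append("<LastName>Last").append(i).append("</LastName>")
              .append("<PhoneNum>").append(5550000 + i).append("</PhoneNum>")
              .append("<Address>").append(i).append(" Finch Ave.</Address></Person>\n");
        }
        return sb.append("</People>\n").toString().getBytes(StandardCharsets.UTF_8);
    }

    // The same synthetic entries as comma-separated lines.
    static byte[] sampleCsv(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("First").append(i).append(',')
              .append("Last").append(i).append(',')
              .append(5550000 + i).append(',')
              .append(i).append(" Finch Ave.\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(10_000);
        byte[] csv = sampleCsv(10_000);
        System.out.println("XML raw:     " + xml.length);
        System.out.println("XML gzipped: " + gzip(xml).length);
        System.out.println("CSV raw:     " + csv.length);
        System.out.println("CSV gzipped: " + gzip(csv).length);
    }
}
```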

/L
--
Lasse Reichstein Nielsen -
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'