Java - XML Not good for Big Files (vs Flat Files) |
#1 |
|
Posts: n/a
|
I am a little bit tired of this obsession people have with XML and XML technology. Please share your thoughts and let me know if I am thinking about this the wrong way. I believe some people are overusing XML all over the place. Nowadays the Canadian Government is pushing XML to its organizations as the standard for data/file transfer. Huge files moving between companies now include tons of XML tags repeating all over the file, slowing down networks and crashing applications because of their size.

I am not objecting to the whole technology. I know the advantages of XML and use it all the time for config files and our web-oriented applications, but using it as the standard for moving big files is going too far. Here is an example:

    John,Smith,5555555,37 Finch Ave.

Is now:

    <FirstName>John</FirstName>
    <LastName>Smith</LastName>
    <PhoneNum>5555555</PhoneNum>
    <Address>37 Finch Ave.</Address>

And the tags repeat and repeat:

    <FirstName>....</FirstName>
    <LastName>....</LastName>
    <PhoneNum>....</PhoneNum>
    <Address>....</Address>

    <FirstName>....</FirstName>
    <LastName>....</LastName>
    <PhoneNum>....</PhoneNum>
    <Address>....</Address>

Please let me know what you think.

Regards,
Homer
|
|
|
#2 |
|
Posts: n/a
|
On Tue, 2006-04-04 at 08:27 -0700, Homer wrote:
> And Tags are repeating and repeating:

XML markup does tend to bloat the data. I personally believe you should use serializable objects that can be represented according to an XML schema when that's appropriate, but that also can be serialized into a tightly packed format when that is appropriate as well.

So I should be able to marshal/unmarshal the serialized object to and from XML, but I should also be able to stream that object without marshalling it -- and the other end should be able to unmarshal to XML, validate according to the schema, etc. Likewise, database bindings should be informed by the XML schema, but the XML markup shouldn't be what you store in the db.
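A minimal sketch of that idea, not anyone's actual implementation: one object that can be written either as verbose XML or as a tightly packed byte stream. The `Person` class, the `<Person>` wrapper element and the packed layout (four length-prefixed strings written with `DataOutputStream.writeUTF`) are all assumptions made for illustration, and schema validation is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical record type; field names follow the example in the thread. */
final class Person {
    final String firstName;
    final String lastName;
    final String phoneNum;
    final String address;

    Person(String firstName, String lastName, String phoneNum, String address) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNum = phoneNum;
        this.address = address;
    }

    /** Verbose form: every value is wrapped in a named tag. */
    String toXml() {
        return "<Person>"
                + "<FirstName>" + escape(firstName) + "</FirstName>"
                + "<LastName>" + escape(lastName) + "</LastName>"
                + "<PhoneNum>" + escape(phoneNum) + "</PhoneNum>"
                + "<Address>" + escape(address) + "</Address>"
                + "</Person>";
    }

    /** Packed form: four length-prefixed strings, no field names on the wire. */
    byte[] toPacked() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(firstName);
            out.writeUTF(lastName);
            out.writeUTF(phoneNum);
            out.writeUTF(address);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the packed form back; the receiver can then call toXml() if needed. */
    static Person fromPacked(byte[] packed) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed))) {
            return new Person(in.readUTF(), in.readUTF(), in.readUTF(), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Escapes the characters that would break element content. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Person p = new Person("John", "Smith", "5555555", "37 Finch Ave.");
        System.out.println("XML bytes:    " + p.toXml().getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Packed bytes: " + p.toPacked().length);
    }
}
```

The packed form only works if both ends agree on the field order, which is exactly the information the XML tags carry explicitly.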
|
|
|
#3 |
|
Posts: n/a
|
Homer wrote:
> I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.

You can use indexing, binary XML, or compression.

> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> Please let me know what you think.

Maybe one of the computing service providers wanted more money for its services on this big project? Maybe everybody thinks "newer is better"?
|
|
|
#4 |
|
Posts: n/a
|
Homer wrote:
> I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.
> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> Please let me know what you think.
>
> Regards,
>
> Homer

Yes, that does seem like a network killer. It depends on the intended use of the file on the other end and on the client receiving it. If they *have to* use XML, certain optimizations can be done for just the transfer part...

    <header>
      <firstName>A15</firstName>
      <lastName>A15</lastName>
      <phone>A10</phone>
      <address>A10</address>
    </header>
    <data>
      <![CDATA[
      fixed width data goes here
      ]]>
    </data>

OR

    <header>
      <fieldSeparator>;</fieldSeparator>
      <field>firstName</field>
      <field>lastName</field>
      <field>phone</field>
      <field>address</field>
    </header>
    <data>
      <![CDATA[
      delimited data goes here
      ]]>
    </data>

OR a combination of the above.

In short, XML should be preferred only if documentation and discoverability are more important than performance.
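A rough sketch of how a receiver might read the delimited variant above. The `<transfer>` root element is an assumption (the header and data elements need some single root to be well-formed XML), and `HybridReader` is a hypothetical name. The sketch also assumes values never contain the separator or the sequence `]]>`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical reader for the "XML header + delimited CDATA body" layout. */
final class HybridReader {
    /** Sample document; the <transfer> root is assumed, not from the thread. */
    static final String SAMPLE =
            "<transfer><header><fieldSeparator>;</fieldSeparator>"
            + "<field>firstName</field><field>lastName</field>"
            + "<field>phone</field><field>address</field></header>"
            + "<data><![CDATA[\nJohn;Smith;5555555;37 Finch Ave.\n]]></data></transfer>";

    /** Returns one map per data line, keyed by the field names in the header. */
    static List<Map<String, String>> read(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a well-formed document", e);
        }

        // The header is small, so reading it through DOM is cheap.
        String separator = doc.getElementsByTagName("fieldSeparator").item(0).getTextContent();
        NodeList fieldNodes = doc.getElementsByTagName("field");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < fieldNodes.getLength(); i++) {
            names.add(fieldNodes.item(i).getTextContent());
        }

        // The body is plain delimited text carried inside the CDATA section.
        String body = doc.getElementsByTagName("data").item(0).getTextContent();
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : body.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] values = trimmed.split(Pattern.quote(separator), -1);
            if (values.length != names.size()) {
                throw new IllegalArgumentException("wrong number of fields: " + trimmed);
            }
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < values.length; i++) {
                row.put(names.get(i), values[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(read(SAMPLE));
    }
}
```

For a genuinely large transfer the body would be streamed rather than loaded through DOM; this version only shows how the header describes the columns once instead of once per record.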
|
|
|
#5 |
|
Posts: n/a
|
Homer wrote:
> Please let me know what you think.

XML was never designed to replace a database server. You can use an XML file to transfer a portion of the data from a database, e.g.

    SELECT lastname, firstname, phonenumber, address
    FROM phonebook
    WHERE state = 'NY' AND city = 'somewhere';

Take a flat file like this:

    William|John|12345678|84 5th Ave

I don't know which column is the last name and which is the first name. Is the 3rd column a person ID or a phone number? You need to let the programmers know which column is which. If someone later changes the flat file format to

    85 5th Ave|John|William|12345678

then your database will be incorrect after the update.

True, XML creates large files. But it makes our lives easier. You can make up your own tags, <lastName> or <Last_Name>, etc. The tags can be in English, Spanish, French, Russian, Japanese, etc.
|
|
|
#6 |
|
Posts: n/a
|
On Tue, 2006-04-04 at 09:06 -0700, wrote:
> OR a combination of the above.

You're almost touching on the big problem: Misconception of what it means to be "standard". XML has (several) standardized markup frameworks, but it is silent as to content or utilization.

It is ridiculous for a government entity to demand that "XML" be "the standard" for data interchange. They need to bless certain schemas if that's their goal, but it also needs to be abstract enough that systems can be designed efficiently.

In your examples, the designers can claim that they are using "XML", and therefore "are standardized" on it, but the three examples we've seen so far are not at all interchangeable...
|
|
|
#7 |
|
Posts: n/a
|
Homer wrote:
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>

It's true that the XML data in your example is bulky, but what it has that the unstructured version doesn't is meta-level information, such as the fact that "John" is someone's first name. If the parties involved (i.e. the sender and receiver of this information) have an agreement as to the meaning of "FirstName", then they are sharing more than just text... it has some implicit meaning. If you send it unstructured, then the receiver has to know how to parse the data into this agreed meaning, which means it needs to know the format of the data.

On the other hand, if the data is just stored in a database or something with no definition of what the tags mean, then I agree with you... using XML is of little use.
|
|
|
#8 |
|
Posts: n/a
|
"Homer" <> wrote in message news: oups.com...
> I am a little bit tired of this obsession people have with XML and XML
> technology. Please share your thoughts and let me know if I am thinking
> in a wrong way. I believe some people are over using XML all over the
> place. Nowadays Canadian Government is pushing XML to its organization
> as standard for data/file transfer. Huge files moving between companies
> now include tones of XML Tags repeating all over the file and slowing
> down networks and crashing applications because of size.
> I am not objecting to the whole technology. I know advantages of XML
> and using it all the times for Config files or our web oriented
> applications but using it as standard for moving big files is going too
> far. Here is the example:
>
> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> <FirstName>....</FirstName>
> <LastName>....</LastName>
> <PhoneNum>....</PhoneNum>
> <Address>....</Address>
>
> Please let me know what you think.

If your complaint is file size during network transfer, compress the file before sending it. If your complaint is file size during parsing, use SAX instead of DOM, and don't keep the whole file in memory at once.

Use the right tool for the job. If, for whatever problem you're trying to solve, you've got a better tool than XML, then use it. But if the problem is "The government requires me to use XML", then I can't think of a better tool than XML to solve that particular problem (except maybe emigration).

- Oliver
|
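A minimal sketch of the SAX suggestion in post #8: the handler below sees one element at a time and keeps only a counter and the text of the current `<LastName>`, so memory use does not grow with file size. The class name `LastNameCounter`, the `<People>`/`<Person>` wrapper elements and the counting task itself are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical streaming handler: counts records with a given last name. */
final class LastNameCounter extends DefaultHandler {
    private final String wanted;
    private final StringBuilder text = new StringBuilder();
    private boolean inLastName;
    private long matches;

    private LastNameCounter(String wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("LastName".equals(qName)) {
            inLastName = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so accumulate them.
        if (inLastName) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("LastName".equals(qName)) {
            if (wanted.equals(text.toString().trim())) {
                matches++;
            }
            inLastName = false;
        }
    }

    /** Streams the document once; nothing but the counter survives each record. */
    static long count(InputStream xml, String lastName) {
        LastNameCounter handler = new LastNameCounter(lastName);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML stream", e);
        }
        return handler.matches;
    }

    public static void main(String[] args) {
        // In practice the stream would come from a file or socket, not a string.
        String xml = "<People>"
                + "<Person><FirstName>John</FirstName><LastName>Smith</LastName></Person>"
                + "<Person><FirstName>Jane</FirstName><LastName>Doe</LastName></Person>"
                + "</People>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(count(in, "Smith"));
    }
}
```

A DOM parser would build a node for every tag in the file before the application saw any of it, which is where the crashes on huge files tend to come from.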
|
|
#9 |
|
Posts: n/a
|
On Tue, 2006-04-04 at 16:44 +0000, Oliver Wong wrote:
> except maybe emmigration

You say that as though anyone would ever leave the utopian paradise that is Canada...
|
|
|
#10 |
|
Posts: n/a
|
"Homer" <> writes:
> I am a little bit tired of this obsession people have with XML and XML
> technology.

Hear hear! Seems some people think XML is the solution to all problems. I'd rather classify it as the lowest common denominator for exchanging tree-structured data - and definitely not something fit for humans to read or write directly.

> John,Smith,5555555,37 Finch Ave.
>
> Is now:
>
> <FirstName>John</FirstName>
> <LastName>Smith</LastName>
> <PhoneNum>5555555</PhoneNum>
> <Address>37 Finch Ave.</Address>
>
> And Tags are repeating and repeating:

> Please let me know what you think.

Apart from what everybody else has said, zipping such a file should yield a *very* high compression factor.

/L
--
Lasse Reichstein Nielsen -
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
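A quick way to check the compression claim on your own data, sketched with the JDK's `GZIPOutputStream`. The generated records, the `<People>`/`<Person>` wrapper elements and the class name `XmlCompressionDemo` are made up for illustration; real files with less repetitive values will compress less well than this synthetic one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo: compares raw and gzip-compressed sizes of tag-heavy XML. */
final class XmlCompressionDemo {

    /** Builds synthetic XML in the shape of the thread's example. */
    static String buildXml(int records) {
        StringBuilder sb = new StringBuilder("<People>\n");
        for (int i = 0; i < records; i++) {
            sb.append("<Person>")
              .append("<FirstName>First").append(i).append("</FirstName>")
              .append("<LastName>Last").append(i).append("</LastName>")
              .append("<PhoneNum>").append(5550000 + i).append("</PhoneNum>")
              .append("<Address>").append(i).append(" Finch Ave.</Address>")
              .append("</Person>\n");
        }
        return sb.append("</People>\n").toString();
    }

    /** Compresses a byte array in memory with gzip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = buildXml(10_000).getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(raw);
        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + compressed.length);
    }
}
```

Compression addresses the network cost only; the receiver still has to parse the full, uncompressed markup.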
|