large data file manipulation

Roland Hall
I'm looking for information on working with large data files using FSO or XML.

I have a program which creates a large CSV file, over 7 MB. It's a rate
table of freight shipping costs.
There are certain fields I do not need, and some are blank. A typical line
would be:

Raw data:

" ", "30142", "GA", "01001"," ", "MA"," ","100",018609,000000,000000,000000,014435,013181,010622,009022,007125,006569,006569,006569,006569,000000,000000,000000,000000


blank, fromzip, fromstate, tozip, blank, tostate, blank, class, mc, blank, blank,
blank, l5c, m5c, m1m, m2m, m5m, mxm, mxxm, mxxxm, mxlm, blank, blank,

I don't need the double quotes, the spaces, or any field marked blank
in the structure. It is my understanding I can read this file in 3 ways:
all at once (readAll), a line at a time (readLine), or a set number of
characters (read).


I chose readLine because I didn't want all 7 MB in memory at once, and
reading a set number of bytes won't work because the lines are not a fixed
length. I manipulate each line and append the results to a new file every
1000 lines, finishing up with however many lines are left upon reaching the
end.
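
For anyone following along, the read-a-line, clean, append-every-1000 approach can be sketched roughly as below. This is Python rather than FSO/VBScript, and it is only an illustration: the KEEP positions are my reading of the field layout above, and the names clean_line, cleaned_batches and convert_file are made up for the example.

```python
import csv

# Positions (counted from 0) of the fields to keep, following the layout in
# the post. Assumption: every field marked "blank" is dropped.
KEEP = [1, 2, 3, 5, 7, 8, 12, 13, 14, 15, 16, 17, 18, 19, 20]
BATCH_SIZE = 1000  # lines buffered before each append


def clean_line(line):
    """Remove quotes and spaces from one raw line and keep only needed fields."""
    # skipinitialspace lets the csv module treat `, "GA"` as a quoted field.
    fields = next(csv.reader([line.strip()], skipinitialspace=True))
    return ",".join(fields[i].strip() for i in KEEP)


def cleaned_batches(lines, batch_size=BATCH_SIZE):
    """Yield lists of cleaned lines, batch_size at a time, plus the remainder."""
    batch = []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        batch.append(clean_line(line))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # whatever is left at end of file
        yield batch


def convert_file(src_path, dst_path):
    """Stream src_path line by line and write cleaned batches to dst_path."""
    with open(src_path, newline="") as src, open(dst_path, "w") as dst:
        for batch in cleaned_batches(src):
            dst.write("\n".join(batch) + "\n")
```

Only one batch of cleaned lines is held in memory at a time, so the size of the input file does not matter much. A line with fewer fields than expected would raise an IndexError here; a real version would need to decide how to handle that.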

My result file is a little over 3 MB [41380 lines of raw data]. It takes
seconds to process and will only be used if shipping rates change. The 3 MB
file is still too large to work with, so I have decided to split it up in
one of two ways: by state or by zip code range. "By state" gives me 50 files
and zip range gives me 10. I'm not sure what the difference in size will be or
whether it will be noticeable. The rate table, or part of it, will
only be in memory long enough to get the rate and will then be released.
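
The two ways of splitting can be compared with a small sketch like the one below (Python again, not FSO). It assumes the cleaned layout fromzip,fromstate,tozip,tostate,class,..., that the split is on the destination side, and that the 10 zip ranges are simply the first digit of the zip code. Those are guesses for the sake of the example, as are the function names and the "rates_" file prefix.

```python
from collections import defaultdict

# Assumed positions in the cleaned line: fromzip,fromstate,tozip,tostate,...
TOZIP, TOSTATE = 2, 3


def key_by_state(fields):
    """Group key for the "by state" split."""
    return fields[TOSTATE]


def key_by_zip_range(fields):
    """Group key for the zip range split: first digit of the zip (0-9)."""
    return fields[TOZIP][0]


def split_lines(lines, key_func):
    """Return a dict mapping each group key to its list of cleaned lines."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if line:
            groups[key_func(line.split(","))].append(line)
    return dict(groups)


def write_groups(groups, prefix="rates_"):
    """Write one file per group, e.g. rates_MA.csv or rates_0.csv."""
    for key, rows in groups.items():
        with open(f"{prefix}{key}.csv", "w") as out:
            out.write("\n".join(rows) + "\n")
```

Calling len() on each list in the returned dict shows how evenly either scheme divides the lines before anything is written to disk.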

I have printing to the screen turned on during the debug process. You can
see it here:

My questions are:

Since I have to use data files, would XML be drastically different from CSV
to use as a lookup for my new file?

How much more efficient is it to retrieve information from XML than from a
CSV that has been read in? To make a true comparison, the result will
eventually be multiple files, read in with readAll [if used as CSV], and then
I would search an array for the rate I needed.
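
The CSV side of that comparison might look like the sketch below: the whole text of one split file is read in, turned into an array, and searched line by line. It is Python, not FSO, and the COLUMNS order is only my assumption about the cleaned file; load_rows and find_rate are names invented for the example.

```python
# Assumed column order of the cleaned file (blank fields already removed).
COLUMNS = ["fromzip", "fromstate", "tozip", "tostate", "class", "mc",
           "l5c", "m5c", "m1m", "m2m", "m5m", "mxm", "mxxm", "mxxxm", "mxlm"]


def load_rows(text):
    """Turn the full text of one split file into a list of field lists."""
    return [line.split(",") for line in text.splitlines() if line.strip()]


def find_rate(rows, tozip, freight_class, rate_name):
    """Linear search by destination zip and class.

    Returns the matching rate field as the string stored in the file,
    or None when no line matches.
    """
    col = COLUMNS.index(rate_name)
    zip_col = COLUMNS.index("tozip")
    class_col = COLUMNS.index("class")
    for fields in rows:
        if fields[zip_col] == tozip and fields[class_col] == freight_class:
            return fields[col]
    return None


def lookup_in_file(path, tozip, freight_class, rate_name):
    """Read the whole file at once, search it, and let the rows go."""
    with open(path) as f:
        rows = load_rows(f.read())
    return find_rate(rows, tozip, freight_class, rate_name)
```

The search is a plain scan from the top, so its cost grows with the number of lines in the file, which is the reason for splitting the file in the first place.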

If I used XML, would it be necessary to split the file up, as I would with
the CSV [by ship-to state], or could I use the single file?

Yes, I know SQL is better, but I also have to have a version that does not
use a database.


Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */