better way to parse Tabs

 
 
MileHighCelt
      12-06-2005
I have been looking through newsgroups and APIs, and there seems to be
some issue with parsing a tab-delimited file. If a user uploads a file
that is tab delimited and could contain nulls, is this necessarily the
best approach:

InputStream in = theFile.getInputStream();
BufferedReader r = new BufferedReader(new InputStreamReader(in));
String line;
while ((line = r.readLine()) != null) {
    StringTokenizer st = new StringTokenizer(line, "\t", true);
    MyBean region = new MyBean();

    region.setLab_id(Integer.parseInt(st.nextToken()));
    region.setControl_nmbr(st.nextToken());
    ...
}

As I understand it, the "true" argument in the tokenizer will handle
the nulls/no value as empty strings?

Is there a better way to solve this? There are probably 50 columns
per row, with a variety of date, int, and String values.

Thank you for your opinion and advice.

 
Jean-Francois Briere
      12-07-2005
> As I understand it, the "true" argument in the tokenizer will handle
> the nulls/no value as empty strings?

No. If you read the documentation carefully, you will see that "true"
means that it returns the delimiters (in your case "\t") as tokens.
So if you want to use a StringTokenizer, you'll always have to check for
the "\t" token and also take care of consecutive "\t" tokens.
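
To make that concrete, here is a rough sketch of the bookkeeping a StringTokenizer with returnDelims set to true would need. The class name TabTokens and the helper fields() are made up for illustration; they are not part of any library, and this is only one way to do it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TabTokens {
    // Hypothetical helper: rebuilds the column values of one line,
    // inserting "" wherever two delimiters are adjacent.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t", true);
        // The start of the line is treated like "just after a delimiter".
        boolean lastWasDelim = true;
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (tok.equals("\t")) {
                if (lastWasDelim) {
                    out.add(""); // delimiter with nothing before it: empty column
                }
                lastWasDelim = true;
            } else {
                out.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            out.add(""); // line ended with a delimiter, or was empty
        }
        return out;
    }

    public static void main(String[] args) {
        // "a", then an empty column, then "c"
        System.out.println(fields("a\t\tc"));
    }
}
```

With returnDelims set to true, the raw tokens for "a\t\tc" are "a", "\t", "\t", "c", so the empty column only shows up as two delimiter tokens in a row. That is the case the loop above has to detect by hand.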

> Is there a better way to solve this?


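Yes. You could do this (Java 1.4+):

String[] elements = line.split("\t", -1);

In the returned string array, the empty elements (\t\t) will be empty
strings. The negative limit matters: a plain line.split("\t") silently
drops trailing empty strings, so a row whose last columns are empty
would yield a shorter array.
All you have to do then is convert the relevant elements to Date or int,
always checking for an empty string first, of course.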

region.setLab_id(convertToInt(elements[0]));
region.setControl_nmbr(elements[1]); // no conversion needed
....
region.setSomeDateField(convertToDate(elements[n]));
....

// An empty column becomes 0.
int convertToInt(String str)
{
    return (str.length() == 0) ? 0 : Integer.parseInt(str);
}

// An empty column becomes null. parse() throws the checked ParseException.
Date convertToDate(String str) throws ParseException
{
    return (str.length() == 0) ? null
        : someSimpleDateFormatInstance.parse(str);
}
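
For anyone who wants to try this out, below is a small self-contained sketch of the split approach. The class name TabSplitDemo, the sample line, and the "yyyy-MM-dd" date pattern are all assumptions made for the example; the real file may use a different date format. It passes a limit of -1 to split so that trailing empty columns are kept.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class TabSplitDemo {
    // Assumed date format for the example; adjust to the actual file.
    static final String DATE_PATTERN = "yyyy-MM-dd";

    // Splits one line into columns, keeping empty columns at the end.
    static String[] fields(String line) {
        return line.split("\t", -1);
    }

    // Empty column -> 0, otherwise parse as int.
    static int toInt(String s) {
        return s.isEmpty() ? 0 : Integer.parseInt(s);
    }

    // Empty column -> null, otherwise parse with the assumed pattern.
    static Date toDate(String s) {
        if (s.isEmpty()) {
            return null;
        }
        // SimpleDateFormat is not thread-safe, so create one per call here.
        SimpleDateFormat fmt = new SimpleDateFormat(DATE_PATTERN);
        fmt.setLenient(false);
        try {
            return fmt.parse(s);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad date: " + s, e);
        }
    }

    public static void main(String[] args) {
        // Six columns; the third and the last two are empty.
        String line = "42\tABC-1\t\t2005-12-06\t\t";
        String[] elements = fields(line);
        System.out.println(elements.length);          // 6
        System.out.println(line.split("\t").length);  // 4, trailing empties dropped
        System.out.println(toInt(elements[0]));
        System.out.println(toDate(elements[2]));      // null
    }
}
```

Wrapping ParseException in an unchecked exception is just a convenience for the demo; in real code you may prefer to report the bad row to the user instead.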

Regards

 
MileHighCelt
      12-07-2005
Thank you - that is exactly what I was looking for! I will swap out
the StringTokenizer I implemented as it sure seems slow.

 