Breaking into tokens based on white space

 
 
j2ee@att.net
      07-15-2004
I have a file which has these 3 columns (for example)

Name Size1 Size2
+ abc_p.h 12345 432
*unknown
+ dfe_e_io.h 210989 123
+ dfx_e_io.c 210912 1290
and so on, up to 500 entries.

I have to retrieve the Name (file name) and Size1 and store them in an array.

Then I have to retrieve the Name and Size2 and store them in another array.

My solution:
I checked whether each line in the file matched a file name using a regular
expression. If there is a match, I store the file name and Size1 in array1
using a substr operation.
But the problem is that I hardcoded the starting position and length of the
string in the substr operation, so my code works only for names up to a given
length, e.g. 20. If a name is longer than 20 characters, my code won't work.
Can you tell me if there is a generic way of writing a regular expression that
matches the name in my file and then Size1, and stores them in an array? Special
case: the Name column may contain an unwanted string like *unknown, which
should be ignored.

Let me know if you need clarifications. Thanks..
 
 
 
 
 
A. Sinan Unur
      07-15-2004
j2ee@att.net wrote:

> I have a file which has these 3 columns (for example)
>
> Name Size1 Size2
> + abc_p.h 12345 432
> *unknown
> + dfe_e_io.h 210989 123
> + dfx_e_io.c 210912 1290
> and so on, up to 500 entries.


Why do you repeatedly post the same message? If you need a clarification or
you have further questions about replies to your earlier posts on this
topic, you should post those comments in the same thread.

--
A. Sinan Unur

 
 
 
 
 
Paul Lalli
      07-15-2004
On Thu, 15 Jul 2004, j2ee@att.net wrote:

> I have a file which has these 3 columns (for example)
>
> Name Size1 Size2
> + abc_p.h 12345 432
> *unknown
> + dfe_e_io.h 210989 123
> + dfx_e_io.c 210912 1290
> and so on, up to 500 entries.
>
> I have to retrieve the Name (file name) and Size1 and store them in an array.
>
> Then I have to retrieve the Name and Size2 and store them in another array.


What do you mean by 'array' here? How are you storing both the size and
the name in the array? Are you sure you don't want hashes? More to the
point, are you sure you don't want a multi-dimensional hash for the two
sizes?
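To make the hash suggestion concrete, here is a minimal sketch of a hash of hashes keyed by file name. The helper name parse_sizes and the keys size1 and size2 are illustrative assumptions, not something from the original post, and the sample lines are taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption for illustration).
# Takes a list of lines and returns a hash reference of the form
#   { file_name => { size1 => N, size2 => M } }
sub parse_sizes {
    my @lines = @_;
    my %sizes;
    for my $line (@lines) {
        # Only lines that start with "+" followed by a name and two numbers
        # match; the header line and "*unknown" are skipped.
        if ($line =~ /^\+\s+(\S+)\s+(\d+)\s+(\d+)/) {
            $sizes{$1} = { size1 => $2, size2 => $3 };
        }
    }
    return \%sizes;
}

my $sizes = parse_sizes(
    'Name Size1 Size2',
    '+ abc_p.h 12345 432',
    '*unknown',
    '+ dfe_e_io.h 210989 123',
);

# Both sizes are reachable through the file name.
for my $name (sort keys %$sizes) {
    print "$name: size1=$sizes->{$name}{size1} size2=$sizes->{$name}{size2}\n";
}
```

With this layout a single structure replaces the two parallel arrays, and either size can be looked up directly by name.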

> My solution:
> I checked whether each line in the file matched a file name using a regular
> expression. If there is a match, I store the file name and Size1 in array1
> using a substr operation.


Why? Why are you parsing the line once to see if it matches, and a second
time to pull the values out?

> But the problem is that I hardcoded the starting position and length of the
> string in the substr operation, so my code works only for names up to a given
> length, e.g. 20. If a name is longer than 20 characters, my code won't work.
> Can you tell me if there is a generic way of writing a regular expression that
> matches the name in my file and then Size1, and stores them in an array? Special
> case: the Name column may contain an unwanted string like *unknown, which
> should be ignored.


You should perhaps read up on regular expressions (perldoc perlre) and
search for the section on capturing parentheses.
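As a small illustration of capturing parentheses, the sketch below matches a line and extracts the fields in one step, so no substr offsets are needed and the name can be any length. The helper name extract_fields and the sample line are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for illustration only).
# In list context a successful match returns the text captured by each
# pair of parentheses; a failed match returns an empty list.
sub extract_fields {
    my ($line) = @_;
    return $line =~ /^\+\s+(\S+)\s+(\d+)\s+(\d+)/;
}

my $line = '+ a_file_name_well_over_twenty_characters.h 12345 432';

# The list assignment is true only when the match succeeded.
if (my ($name, $size1, $size2) = extract_fields($line)) {
    print "name=$name size1=$size1 size2=$size2\n";
}
```

Because \S+ matches any run of non-whitespace characters, the pattern does not depend on the width of the Name column.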

#!/usr/bin/perl
use strict;
use warnings;

my %files;    # file name => [size1, size2]

#UNTESTED
while (<DATA>) {
    # Match "+ name size1 size2"; the header and "*unknown" lines do not match.
    if (/^\+ (\S+)\s+(\d+)\s+(\d+)\s*$/) {
        push @{ $files{$1} }, $2, $3;    # add size1 and size2 to the file's array
    }
}

# You never said what you wanted to do with these arrays...
print "Size 1:\n\n";
print "$_ => $files{$_}[0]\n" for sort keys %files;
print "\nSize 2:\n\n";
print "$_ => $files{$_}[1]\n" for sort keys %files;

__DATA__
Name Size1 Size2
+ abc_p.h 12345 432
*unknown
+ dfe_e_io.h 210989 123
+ dfx_e_io.c 210912 1290




Paul Lalli
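
Since the thread title asks about breaking lines into tokens on white space, an alternative to the regex above is split. This is only a sketch under assumptions: the helper name tokenize_entry and the array names are made up here, and it assumes every wanted line has exactly the form "+ name size1 size2" with no spaces inside the name.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative). Splits one line into
# whitespace-separated tokens and returns (name, size1, size2),
# or an empty list for lines that are not file entries.
sub tokenize_entry {
    my ($line) = @_;
    # split ' ' is the special awk-like mode: leading whitespace is
    # skipped and the line is split on any run of whitespace.
    my ($flag, $name, $size1, $size2) = split ' ', $line;
    return unless defined($size2) && $flag eq '+';
    return unless $size1 =~ /^\d+$/ && $size2 =~ /^\d+$/;
    return ($name, $size1, $size2);
}

my @lines = (
    'Name Size1 Size2',
    '+ abc_p.h 12345 432',
    '*unknown',
    '+ dfx_e_io.c 210912 1290',
);

# Two arrays of [name, size] pairs, as the original question asked for.
my (@name_size1, @name_size2);
for my $line (@lines) {
    my ($name, $size1, $size2) = tokenize_entry($line) or next;
    push @name_size1, [ $name, $size1 ];
    push @name_size2, [ $name, $size2 ];
}

print "$_->[0] $_->[1]\n" for @name_size1;
```

The header line and *unknown are dropped because they have fewer than four tokens or do not start with "+".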
 