Troll wrote:
> Now time for some stupid Qs:
>
> Let's say that the data I have is in a file called employees.
> How can I call this file so that I can parse it?
>
> 1) Can I do:
> @HRdata = `cat employees`;
> while (<@HRdata>) {
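The above is considered bad practice, especially if the file is large.
Why read the entire file into memory when you can read, process, and
discard one line at a time? Also note that <@HRdata> does not loop over
the array; angle brackets around anything other than a filehandle are
treated as a glob pattern. To open and read a file:
open(FIN, '<', 'employees') || die "Can't open employees: $!";
while (<FIN>) {
    # the current line is in $_ here
}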
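
To make the read-a-line, process, discard idea concrete, here is a minimal sketch that counts the non-blank lines of a file. The helper name count_lines is made up for illustration, and it uses a lexical filehandle instead of FIN, which is a style choice on my part rather than something you have to do.

```perl
use strict;
use warnings;

# Count the non-blank lines of a file, holding only one line in memory
# at a time. count_lines() is a hypothetical helper for this sketch.
sub count_lines {
    my ($path) = @_;
    open(my $fin, '<', $path) or die "Can't open $path: $!\n";
    my $count = 0;
    while (my $line = <$fin>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        $count++;
    }
    close($fin);
    return $count;
}
```

Calling count_lines('employees') would then return the number of non-blank lines without ever loading the whole file.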
>
>
> 2) With regard to the HEADING sections, the script has to be able to
> recognise the different sections by the following rules:
> # there's a blank line
> before each heading
> HEADING 1 # this is the name of the heading -
> this is a string with a special character and a blank space as part of it
> ColumnA ColumnB ColumnC # these are the column names - these are
> strings which also can include a blank space if they have 2 or more words
> ******* # a sort of an underlining
> pattern
>
while (<FIN>) {
    if ( /^$/ ) {
        # this is a blank line, don't do anything
    } elsif ( /HEADING (.+)/ ) {
        # this is a heading, with the heading name in $1
    } elsif ( (($name, $sex, $status, $age) = /(\S+) (\S+) (\S+) (\d+)/) == 4 ) {
        # this line contains three words and a number, do whatever
        # (I'm not really sure if this will work. My Linux box is
        # down and I have no way of testing.)
    }
} # end of while(<FIN>)
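
If it helps to see the line types kept apart, here is a small sketch that only classifies each line and does nothing else. The category names and the sample lines are my own assumptions about your layout (blank line, HEADING line, column names, a row of asterisks, then data ending in a number), so adjust the patterns to the real file.

```perl
use strict;
use warnings;

# Classify one line of the report. The category names and the assumed
# layout are illustrative only.
sub classify_line {
    my ($line) = @_;
    chomp $line;
    return 'blank'     if $line =~ /^\s*$/;
    return 'underline' if $line =~ /^\*+\s*$/;
    return 'heading'   if $line =~ /^HEADING\s+\S/;
    return 'data'      if $line =~ /^\S+\s+\S+\s+\S+\s+\d+\s*$/;
    return 'other';    # e.g. the column-name line
}

# Invented sample lines, one of each kind.
my @sample = (
    "",
    "HEADING 1",
    "Name Sex Status Age",
    "*******",
    "Pete M Single 30",
);
print classify_line($_), "\n" for @sample;
```

Only lines that come back as 'data' would go on to be split into fields, which keeps heading and underline lines out of your arrays.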
> I guess this is to make sure that one does not include any silly heading
> data as part of the arrays created and the parsing only takes place on
> 'real' data. Can you pls advise? Or do you need more info? I'm more in
> favour of creating separate 'if' loops due to my 'newbie' status. I'll get
> lost otherwise...
>
"if loops"...? How does one make an if loop?
> Thanks.
>
>
>
> "Troll" <> wrote in message
> news:uRK4b.77094$...
>
>>Wow. I don't know how you get the time to respond to my queries in such
>>detail. It is greatly appreciated.
>>I just came back from work and it's like 2:30 am so I'll crash out soon and
>>have a closer read tomorrow [especially of the HEADINGS part].
>>
>>With the push @array stuff I actually got to this today in my readings. I
>>saw an example of appending an array onto another array with a push and I
>>was wondering if we could just substitute a $variable for one of the arrays.
>>I'm glad you confirmed this.
>>
>>I was also wondering if doing this at the beginning of the script:
>>
>>my (%names, %sexes, %depts, %m_statuses, %ages) # declaring things locally
>>
>>would be considered bad practice. I thought that one should declare things
>>as my ( ) if one is using things within a loop so as not to impact anything
>>external to the loop. But if one uses variables/arrays both within and
>>outside the loops, should we then still declare stuff as my ( )?
>>Maybe I'm just confused about my ( )...
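
On the my ( ) question: a variable declared with my is visible from the point of declaration to the end of the enclosing block, or to the end of the file if it is not inside any block. So declaring the hashes once at the top is fine when you need them both inside and after the loop. A small sketch, with made-up records:

```perl
use strict;
use warnings;

my %ages;    # declared at file level: visible inside and after the loop

# The three records below are invented sample data.
for my $record ("Pete 30", "John 30", "Mary 41") {
    my ($name, $age) = split ' ', $record;    # exist only inside this loop body
    $ages{$age}++;
}

# %ages is still available here; $name and $age are not.
print "$_: $ages{$_}\n" for sort keys %ages;
```

Anything you only need while handling a single line can be declared with my inside the loop; anything you want to report on at the end has to be declared before it.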
>>
>>Greg, if you could possibly keep an eye on this thread for the next few days
>>I would be very much in your debt. Your help has been invaluable so far in
>>allowing me to visualise quite a few things.
>>
>>Thanks very much.
>>
>>
>>"Ga Mu" <> wrote in message
>>news:uRJ4b.147542$ .net...
>>
>>>Troll wrote:
>>>
>>>>Thanks again !
>>>>
>>>>1)
>>>>Sorry for being too vague. With regard to the HEADINGS they separate blocks
>>>>of data. But because the column names will be different [data is different]
>>>>then I'm not quite sure I could use:
>>>>$names{$heading}{$name}++;
>>>>
>>>>So I'm looking at creating separate my () definitions for each HEADING and
>>>>just wanted to confirm how to jump out of one HEADING loop and start with
>>>>the next.
>>>>
>>>>For example, under HEADING 1 we have these columns:
>>>>Name, Sex, Dept, M_Status, Age
>>>>
>>>>and under HEADING 2 we have:
>>>>Address, Phone#, Mobile#, Salary
>>>>
>>>>So at the beginning of the script I would have
>>>>my (%names, %sexes, %depts, %m_statuses, %ages)
>>>>my (%addresses, %phones, %mobiles, %salaries)
>>>>#then I have my while (<>) and parsing here
>>>>#I have my output at the end
>>>>
>>>>Is that a little clearer?
>>>
>>>Yes. Much clearer. There are a couple of different ways you could do
>>>this. One is to use a single loop that reads through the file and uses
>>>a state variable (e.g., $heading) to keep track of where you are in the
>>>parsing process. The other is to have a separate loop for each heading.
>>> Again, six of one, half a dozen of another. It's more a matter of
>>>preference than anything else.
>>>
>>>An example of the first approach:
>>>
>>>my $heading = 'initial';
>>>my $fin_name = '/usr/local/blah/blah/blah';
>>>open(FIN, '<', $fin_name) or die "Can't open $fin_name: $!\n";
>>>
>>>while (<FIN>) {
>>>
>>>    # check for a new heading
>>>    # I am assuming single word heading names
>>>    if ( /HEADING (\S+)/ ) {
>>>
>>>        $heading = $1; # set $heading equal to word extracted above
>>>
>>>    # take appropriate action based on the heading we are under
>>>
>>>    } elsif ( $heading eq 'NAMES' ) {
>>>
>>>        ( $name, $sex, $dept, $m_status, $age ) =
>>>            /(\w+) (\w+) (\w+) (\w+) (\d+)/;
>>>
>>>        # update counts, append to lists, etc...
>>>
>>>    } elsif ( $heading eq 'ADDRESSES' ) {
>>>
>>>        # I am assuming the address field is limited to 30 characters
>>>        # here:
>>>        ( $address, $phone, $mobile, $salary ) =
>>>            /(.{30}) (\S+) (\S+) (\d+)/;
>>>
>>>        # update counts, append to lists, etc...
>>>
>>>    }
>>>
>>>}
>>>
>>>
>>>And the second approach:
>>>
>>>my $heading = 'initial';
>>>my $fin_name = '/usr/local/blah/blah/blah';
>>>open(FIN, '<', $fin_name) or die "Can't open $fin_name: $!\n";
>>>
>>># scan for first heading
>>># (a bare <FIN> only sets $_ when it is the whole while condition,
>>># so assign it explicitly here)
>>>while ( defined($_ = <FIN>) && ! /HEADING NAMES/ ) { }
>>>
>>># parse the names, etc...
>>>while ( defined($_ = <FIN>) && ! /HEADING ADDRESSES/ ) {
>>>
>>>    ( $name, $sex, $dept, $m_status, $age ) =
>>>        /(\w+) (\w+) (\w+) (\w+) (\d+)/;
>>>
>>>    # update counts, append to lists, etc...
>>>
>>>}
>>>
>>># parse the addresses, etc...
>>># for brevity, I am assuming only two headings
>>>while ( <FIN> ) {
>>>
>>>    ( $address, $phone, $mobile, $salary ) =
>>>        /(.{30}) (\S+) (\S+) (\d+)/;
>>>
>>>    # update counts, append to lists, etc...
>>>
>>>}
>>>
>>>
>>>>
>>>>2)
>>>>With my last question regarding the printing of the names of single people,
>>>>if we include a print statement in the parsing loop would that give us
>>>>something like:
>>>>Pete is single.
>>>>John is single.
>>>>while the parsing is still running?
>>>
>>>Yes.
>>>
>>>
>>>>What I'm after is hopefully feeding that output into something else
>>>>[@array?] which can then print a list of the names [line by line] at the end
>>>>of the script, something like:
>>>>#this is the output structure
>>>>Number of Petes =
>>>>Number of Males =
>>>>Singles are:
>>>>Pete
>>>>John
>>>>Number of Salespeople =
>>>>
>>>>
>>>>Does this make sense?
>>>>
>>>
>>>Yes. It would be easy to create a list/array of, e.g., single people.
>>>Prior to the loop, declare the array. Within the loop, test each person
>>>for being single. If they are, push them onto the list:
>>>
>>># prior to your parsing loop, declare array @singles:
>>>
>>>my @singles;
>>>
>>># within your parsing loop, after parsing out name, status, etc.:
>>>
>>>push @singles, $name if $m_status eq 'Single';
>>>
>>># after loop, to print the list of singles:
>>>
>>>print "Single persons:\n";
>>>foreach my $single_person ( @singles ) { print " $single_person\n"; }
>>>
>>>
>>>Greg
>>>
>>
>>
>
>
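
For what it's worth, here is a rough sketch that pulls together the single-loop approach and the @singles list from the post quoted above into something that should run as-is. The sample data, the 'Sales' department value, and the exact field layout are all invented for the example, and it reads from an in-memory string instead of your employees file so that it is self-contained.

```perl
use strict;
use warnings;

# Sample input; the layout and values are invented for this sketch.
my $data = <<'END';

HEADING NAMES
Name Sex Dept M_Status Age
*******
Pete M Sales Single 30
John M Sales Single 25
Mary F IT Married 41

HEADING ADDRESSES
Address Phone Mobile Salary
*******
END

open(my $fin, '<', \$data) or die "Can't open in-memory data: $!\n";

my $heading = 'initial';    # state variable: which section we are in
my (%names, %sexes, %depts, @singles);

while (my $line = <$fin>) {
    chomp $line;
    if ($line =~ /^HEADING (\S+)/) {
        $heading = $1;
    } elsif ($heading eq 'NAMES') {
        # Lines that are not data (column names, underline, blanks)
        # fail the match and are skipped.
        my ($name, $sex, $dept, $m_status, $age) =
            $line =~ /^(\w+) (\w+) (\w+) (\w+) (\d+)$/
            or next;
        $names{$name}++;
        $sexes{$sex}++;
        $depts{$dept}++;
        push @singles, $name if $m_status eq 'Single';
    }
    # an ADDRESSES branch would go here in the same style
}
close($fin);

print "Number of Petes = ", $names{Pete} || 0, "\n";
print "Number of Males = ", $sexes{M} || 0, "\n";
print "Singles are:\n";
print "$_\n" for @singles;
print "Number of Salespeople = ", $depts{Sales} || 0, "\n";
```

The counts and the list are only printed after the loop has finished, which gives the report layout asked for above instead of printing names while parsing is still going on.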