editing delimited record files

 
 
garth_rockett@yahoo.com
      10-05-2005
I am a Perl newbie ... absolute newbie. I need to parse and validate the
data in a text file containing delimited records. I might need to
sanitize some of the data and edit and save the file as I read through
it. For example: removing negative signs from integer fields,
truncating the decimal point and any digits following it in fields that
are meant to be integers, and converting all the allowable date formats
into one common format.

I need pointers on an efficient strategy to do this. What tools can I
look into? I will read up on whatever you suggest ... so puh-lease ...
help!!


Cheers,
Andy

 
John W. Krahn
      10-05-2005
(E-Mail Removed) wrote:
> I am a Perl newbie ... absolute newbie. I need to parse and validate the
> data in a text file containing delimited records. I might need to
> sanitize some of the data and edit and save the file as I read through
> it. For example, removing negative signs from integer fields,
> truncating decimal points and digits following it in fields which are
> meant to be integers, converting all different allowable date formats
> into one common format.
>
> I need pointers on an efficient strategy to do this. What tools can I
> look into. I would learn up what you'd suggest ... so puh-lease ...
> help!!



http://www.manning.com/books/cross


John
--
use Perl;
program
fulfillment
 
Paul Lalli
      10-05-2005
(E-Mail Removed) wrote:
> I am a Perl newbie ... absolute newbie. I need to parse and validate the
> data in a text file containing delimited records. I might need to
> sanitize some of the data and edit and save the file as I read through
> it. For example, removing negative signs from integer fields,
> truncating decimal points and digits following it in fields which are
> meant to be integers, converting all different allowable date formats
> into one common format.
>
> I need pointers on an efficient strategy to do this. What tools can I
> look into. I would learn up what you'd suggest ... so puh-lease ...
> help!!


http://learn.perl.org

Once you've gotten Perl installed:
perldoc -f open
perldoc -f readline
perldoc perlretut
perldoc -f print
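
For illustration, the four documents listed above map onto a basic read, modify, write loop. What follows is only a minimal sketch: it assumes a pipe-delimited file in which `|` never appears inside a field, and the sub name `copy_records` and the whitespace-trimming step are made up for this example.

```perl
use strict;
use warnings;

# Copy $in_path to $out_path one record at a time, trimming whitespace
# around every field. Assumes '|' never appears inside a field.
sub copy_records {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";

    while (defined(my $line = readline($in))) {
        chomp $line;
        my @fields = split /\|/, $line, -1;   # -1 keeps trailing empty fields
        s/^\s+|\s+$//g for @fields;           # regex clean-up of each field
        print {$out} join('|', @fields), "\n";
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

A call such as `copy_records('records.txt', 'records.clean.txt')` (both file names hypothetical) writes the cleaned copy to a second file and leaves the input untouched, which makes it easy to compare the two while the clean-up rules are still changing.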

Paul Lalli

 
Matt Garrish
      10-05-2005

"Paul Lalli" <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
>> I am a Perl newbie ... absolute newbie. I need to parse and validate the
>> data in a text file containing delimited records. I might need to
>> sanitize some of the data and edit and save the file as I read through
>> it. For example, removing negative signs from integer fields,
>> truncating decimal points and digits following it in fields which are
>> meant to be integers, converting all different allowable date formats
>> into one common format.
>>
>> I need pointers on an efficient strategy to do this. What tools can I
>> look into. I would learn up what you'd suggest ... so puh-lease ...
>> help!!

>
> http://learn.perl.org
>
> Once you've gotten Perl installed:
> perldoc -f open
> perldoc -f readline
> perldoc perlretut
> perldoc -f print
>


A module like Text-CSV might be a better option for a newbie than pointing
them to regular expression parsing.
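
To make the suggestion concrete, here is a hedged sketch of splitting one record with Text::CSV. It assumes the module has been installed from CPAN (it is not part of core Perl) and that the data is comma-delimited with double-quoted fields; the sample record is invented for illustration.

```perl
use strict;
use warnings;
use Text::CSV;   # CPAN module, not part of core Perl

my $csv = Text::CSV->new({ binary => 1, sep_char => ',' })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# A quoted field may contain the delimiter; a plain split /,/ would
# cut "Smith, John" into two fields.
my $line = '42,"Smith, John",2005-10-05';

$csv->parse($line) or die "Parse failed: " . $csv->error_diag;
my @fields = $csv->fields;

# After editing @fields, combine() builds a delimited record again.
$csv->combine(@fields) or die "Combine failed";
my $rebuilt = $csv->string;
```

The point of the module is the quoting and escaping rules, which are easy to get wrong by hand; what happens to each field once it has been extracted is a separate step.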

Matt


 
Paul Lalli
      10-05-2005
Matt Garrish wrote:
> > (E-Mail Removed) wrote:
> >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
> >> data in a text file containing delimited records. I might need to
> >> sanitize some of the data and edit and save the file as I read through
> >> it. For example, removing negative signs from integer fields,
> >> truncating decimal points and digits following it in fields which are
> >> meant to be integers, converting all different allowable date formats
> >> into one common format.

> A module like Text-CSV might be a better option for a newbie than pointing
> them to regular expression parsing.


Really? Text::CSV would help with removing negative signs, truncating
decimals, or converting date formats?

Paul Lalli

 
Tad McClellan
      10-05-2005
Matt Garrish <(E-Mail Removed)> wrote:
>
> "Paul Lalli" <(E-Mail Removed)> wrote:
>> (E-Mail Removed) wrote:
>>> I am a Perl newbie ... absolute newbie. I need to parse and validate the
>>> data in a text file containing delimited records. I might need to
>>> sanitize some of the data and edit and save the file as I read through
>>> it. For example, removing negative signs from integer fields,
>>> truncating decimal points and digits following it in fields which are
>>> meant to be integers, converting all different allowable date formats
>>> into one common format.
>>>
>>> I need pointers on an efficient strategy to do this. What tools can I
>>> look into. I would learn up what you'd suggest ... so puh-lease ...
>>> help!!

>>
>> http://learn.perl.org
>>
>> Once you've gotten Perl installed:
>> perldoc -f open
>> perldoc -f readline
>> perldoc perlretut
>> perldoc -f print
>>

>
> A module like Text-CSV might be a better option for a newbie than pointing
> them to regular expression parsing.



Since Text::CSV won't help with the validate/sanitize requirements,
it is entirely possible/likely that regexes will be needed too.
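
As an illustration of the kind of regex work meant here, the sketch below cleans up a single field that is supposed to hold an integer, following the two examples in the original question (drop the negative sign, truncate the decimal part). The sub name `sanitize_int` and the decision to return undef for anything non-numeric are assumptions made for this example.

```perl
use strict;
use warnings;

# Return a cleaned-up non-negative integer string, or undef if the
# field does not look like a number at all.
sub sanitize_int {
    my ($field) = @_;
    $field =~ s/^\s+|\s+$//g;   # trim surrounding whitespace
    $field =~ s/^-//;           # drop a leading negative sign
    $field =~ s/\.\d*$//;       # truncate the decimal point and what follows
    return $field =~ /^\d+$/ ? $field : undef;
}
```

With these rules `-12` becomes `12` and `3.99` becomes `3` (truncated, not rounded), while a field such as `abc` yields undef so the caller can decide whether to reject or log the record.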


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
Matt Garrish
      10-06-2005

"Paul Lalli" <(E-Mail Removed)> wrote:
> Matt Garrish wrote:
>> > (E-Mail Removed) wrote:
>> >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
>> >> data in a text file containing delimited records. I might need to
>> >> sanitize some of the data and edit and save the file as I read through
>> >> it. For example, removing negative signs from integer fields,
>> >> truncating decimal points and digits following it in fields which are
>> >> meant to be integers, converting all different allowable date formats
>> >> into one common format.

>> A module like Text-CSV might be a better option for a newbie than pointing
>> them to regular expression parsing.

>
> Really? Text::CSV would help with removing negative signs, truncating
> decimals, or converting date formats?
>


How did you get from parsing the file into chunks to performing data
validation? I could equally well ask you how removing negative signs,
truncating decimals and converting dates would help break up the file, but
that sort of illogic won't get us too far, will it?

Matt


 
Paul Lalli
      10-06-2005

Matt Garrish wrote:
> "Paul Lalli" <(E-Mail Removed)> wrote:
> > Matt Garrish wrote:
> >> > (E-Mail Removed) wrote:
> >> >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
> >> >> data in a text file containing delimited records. I might need to
> >> >> sanitize some of the data and edit and save the file as I read through
> >> >> it. For example, removing negative signs from integer fields,
> >> >> truncating decimal points and digits following it in fields which are
> >> >> meant to be integers, converting all different allowable date formats
> >> >> into one common format.
> >> A module like Text-CSV might be a better option for a newbie than pointing
> >> them to regular expression parsing.

> >
> > Really? Text::CSV would help with removing negative signs, truncating
> > decimals, or converting date formats?
> >

>
> How did you get from parsing the file into chunks to performing data
> validation? I could equally well ask you how removing negative signs,
> truncating decimals and converting dates would help break up the file, but
> that sort of illogic won't get us too far, will it?


... at least one of us is either confused or stubborn to the point of
absurdity[1]. The OP asked for help doing two things: parsing the
file, and validating/"sanitizing" the resulting fields, listing a few
examples of the type of validation/sanitization he wants to accomplish.
My response included perldocs about regular expressions, intended to
help with the second part of the goal. You responded that I should
have recommended Text::CSV *instead of* regular expressions. I
questioned how Text::CSV would be able to fulfill this second goal.

End result - the OP should look at both. Text::CSV could be used for
parsing the data into fields, and regular expressions for validating
the resulting fields.
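
As a sketch of the validation half applied to one already-extracted field, the code below converts a date into a single common format. The thread never says which date formats are allowed, so the three accepted here (YYYY-MM-DD, MM/DD/YYYY and DD-Mon-YYYY), the ISO-style output, and the sub name `normalize_date` are all assumptions for illustration.

```perl
use strict;
use warnings;

my %MONTH = (
    jan => 1, feb => 2,  mar => 3,  apr => 4,
    may => 5, jun => 6,  jul => 7,  aug => 8,
    sep => 9, oct => 10, nov => 11, dec => 12,
);

# Accept the three assumed formats; return YYYY-MM-DD, or nothing
# (undef in scalar context) when the text matches none of them.
sub normalize_date {
    my ($text) = @_;
    my ($y, $m, $d);

    if ($text =~ m{^(\d{4})-(\d{1,2})-(\d{1,2})$}) {
        ($y, $m, $d) = ($1, $2, $3);
    }
    elsif ($text =~ m{^(\d{1,2})/(\d{1,2})/(\d{4})$}) {
        ($m, $d, $y) = ($1, $2, $3);
    }
    elsif ($text =~ m{^(\d{1,2})-([A-Za-z]{3})-(\d{4})$}) {
        ($d, $m, $y) = ($1, $MONTH{ lc $2 }, $3);
        return unless defined $m;       # unknown month abbreviation
    }
    else {
        return;
    }

    # Loose range check only; it does not know that February is short.
    return if $m < 1 || $m > 12 || $d < 1 || $d > 31;
    return sprintf '%04d-%02d-%02d', $y, $m, $d;
}
```

The regexes only check the shape of the text. For real calendar validation (leap years, month lengths) a date module would be the safer choice.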

Paul Lalli

[1] And I make no assumption that that one isn't me.

 
garth_rockett@yahoo.com
      10-06-2005

> My response included perldocs about regular expressions, intended to
> help with the second part of the goal. You responded that I should
> have recommended Text::CSV *instead of* regular expressions. I
> questioned how Text::CSV would be able to fulfill this second goal.


I am comfortable with sed and grep style regular expressions, so I
believe Perl regexps will not be such a giant leap-of-faith. I might be
wrong but I am looking forward to using the regexps.

>
> End result - the OP should look at both. Text::CSV could be used for
> parsing the data into fields, and regular expressions for validating
> the resulting fields.


I also saw the $^I variable, which, when set to a string, enables
in-place editing of text files, something that is actually at
the heart of the problem. I understand that this might mean more I/O, but
is there a better way to do it? My first impression was that a fair
amount of economy of code can be achieved using $^I for in-place
editing. Being a Perl newbie, I would like to write as much Perl as I
can, but as little of it for production code as I can.
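
For reference, a minimal sketch of the $^I idiom being described: localizing `$^I` and `@ARGV` makes the `<>` operator rewrite each named file, with `print` going to the new copy and the original kept under a backup suffix. The sub name, the `.bak` suffix, and the pipe delimiter are assumptions for this example.

```perl
use strict;
use warnings;

# Edit every file in @files in place, keeping a backup with a .bak suffix.
sub strip_minus_in_place {
    my (@files) = @_;
    return unless @files;        # with an empty @ARGV, <> would read STDIN

    local ($^I, @ARGV) = ('.bak', @files);
    while (my $line = <>) {
        # Drop a minus sign that starts a pipe-delimited numeric field.
        $line =~ s/(^|\|)-(?=\d)/$1/g;
        print $line;             # goes to the rewritten file, not the terminal
    }
    return;
}
```

The code is short, but each file is still read in full and written out in full, so this saves typing rather than I/O. Keeping the backup suffix is a cheap safeguard while the substitution rules are still being debugged.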

> Paul Lalli
>
> [1] And I make no assumption that that one isn't me.


Thank you.

Cheers,
Andy

 
Matt Garrish
      10-06-2005

"Paul Lalli" <(E-Mail Removed)> wrote:
>
> Matt Garrish wrote:
>> "Paul Lalli" <(E-Mail Removed)> wrote:
>> > Matt Garrish wrote:
>> > Really? Text::CSV would help with removing negative signs, truncating
>> > decimals, or converting date formats?
>> >

>>
>> How did you get from parsing the file into chunks to performing data
>> validation? I could equally well ask you how removing negative signs,
>> truncating decimals and converting dates would help break up the file, but
>> that sort of illogic won't get us too far, will it?

>
> ... at least one of us is either confused or stubborn to the point of
> absurdity[1]. The OP asked for help doing two things: parsing the
> file, and validating/"sanitizing" the resulting fields, listing a few
> examples of the type of validation/sanitization he wants to accomplish.
> My response included perldocs about regular expressions, intended to
> help with the second part of the goal. You responded that I should
> have recommended Text::CSV *instead of* regular expressions. I
> questioned how Text::CSV would be able to fulfill this second goal.
>


I read your first post and the implication - which I didn't think was
intended but was there nonetheless - was that you were suggesting regular
expressions as the means of parsing and validating (i.e., open, read line,
regular expressions, print, with no mention of parsing). I only meant to
point out that there exist better means of splitting a delimited file; I did
not mean to imply that you were wrong about regular expressions as a means
of validation if that's how you took it. Let's just assume our wires got
crossed somewhere and leave it at that...

Matt


 