Date in CSV/TSV question

 
 
Dr Eberhard Lisse
Guest
Posts: n/a
 
      01-01-2013
I have a Tab Separated File of roughly 1000 lines with the first fields like

"07 Jan 2011" "TFR"
"05 Jan 2011" "DR"

I need to change the first field to look like

2011-01-07 "TFR"
2011-01-05 "DR"

for all lines, of course -O

Can someone point me to where I can read this up? Or send me a code
fragment?

Thanks, el
--
if you want to reply, replace nospam with my initials
 
 
 
 
 
Dave Saville
Guest
Posts: n/a
 
      01-02-2013
On Tue, 1 Jan 2013 23:56:14 UTC, Dr Eberhard Lisse <(E-Mail Removed)>
wrote:

> I have a Tab Separated File of roughly 1000 lines with the first fields like
>
> "07 Jan 2011" "TFR"
> "05 Jan 2011" "DR"
>
> I need to change the first field to look like
>
> 2011-01-07 "TFR"
> 2011-01-05 "DR"
>
> for all lines, of course -O
>
> Can someone point me to where I can read this up? Or send me a code
> fragment?


It's not clear whether the file itself has the quotes or you are using them
to show the fields. Assuming you have extracted the first field, split it on
space into day, month and year. Set up an array of month names and find the
index of the given month. Then regenerate the field with sprintf: $new =
sprintf("$year-%02d-$day", $index); For simplicity put a dummy month on
the front of the list, since Perl arrays index from 0, so @months = qw(crap
Jan Feb ..........
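
For what it's worth, the steps described above could be sketched roughly as follows. This is only an illustration under some assumptions: the sub name reformat_date and the placeholder word "dummy" are made up here, the quotes are assumed to be optional, and the field is assumed to contain exactly day, month name and year separated by whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder entry at index 0 so that Jan is 1, Feb is 2, and so on.
my @months = qw(dummy Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# reformat_date is a hypothetical helper name, not from the thread.
# Converts "07 Jan 2011" (surrounding quotes optional) to "2011-01-07".
# Returns undef when the field is incomplete or the month is unknown.
sub reformat_date {
    my ($field) = @_;
    $field =~ s/^"|"$//g;                          # strip surrounding quotes
    my ($day, $month, $year) = split ' ', $field;  # split on whitespace
    return unless defined $year;

    # Find the index of the month name, skipping the placeholder at 0.
    my ($index) = grep { $months[$_] eq $month } 1 .. $#months;
    return unless defined $index;

    return sprintf '%04d-%02d-%02d', $year, $index, $day;
}

print reformat_date('"07 Jan 2011"'), "\n";
print reformat_date('05 Jan 2011'),   "\n";
```

Returning undef for an unknown month leaves it to the caller to decide whether to skip the line or stop.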

HTH
--
Regards
Dave Saville
 
 
 
 
 
Dr Eberhard W Lisse
Guest
Posts: n/a
 
      01-02-2013
Thanks.

el

On 2013-01-02 15:01 , Henry Law wrote:
> On 01/01/13 23:56, Dr Eberhard Lisse wrote:
>> I have a Tab Separated File of roughly 1000 lines with the first
>> fields like
>>
>> "07 Jan 2011" "TFR"
>> "05 Jan 2011" "DR"
>>
>> I need to change the first field to look like
>>
>> 2011-01-07 "TFR"
>> 2011-01-05 "DR"

>
> OK, couldn't resist having a bash at this. Didn't spend a lot of time
> on it but this does what you want.
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use 5.010;
>
> use Date::Calc qw( Decode_Date_EU );
> use Text::CSV;
>
> my $csv = Text::CSV->new( { sep_char=>"\t", quote_char=>'"' } )
>     or die "Failed to create CSV object: $!\n";
> while ( 1 ) {
>     my $row = $csv->getline( \*DATA );
>     last unless $row->[0]; # getline returns zero-length arrayref; irritating
>     my ( $year, $month, $day ) = Decode_Date_EU( $row->[0] );
>     die "Bad date" unless $year;
>     printf "%04d-%02d-%02d\t%s\n", $year, $month, $day, $row->[1];
> }
>
> __DATA__
> "07 Jan 2011" "TFR"
> "05 Jan 2011" "DR"
>
>> henry@eris:~/Perl/tryout$ ./tryout
>> 2011-01-07 TFR
>> 2011-01-05 DR

>
> It could be improved, and made more Perlish (I write code in isolation,
> rather, which isn't a good idea). In particular I was maddened by the
> need to check the EOF condition explicitly. "while my $row =
> getline..." returns a one-element array containing a null value when it
> hits EOF; you'd think it would return undef. (And yes I did try
> "defined" as suggested in perldoc IO::Handle but the arrayref is
> actually defined, despite not containing anything useful).
>



--
If you want to email me, replace nospam with el
 
 
Rainer Weikusat
Guest
Posts: n/a
 
      01-02-2013
Dr Eberhard Lisse <(E-Mail Removed)> writes:
> I have a Tab Separated File of roughly 1000 lines with the first fields like
>
> "07 Jan 2011" "TFR"
> "05 Jan 2011" "DR"
>
> I need to change the first field to look like
>
> 2011-01-07 "TFR"
> 2011-01-05 "DR"
>
> for all lines, of course -O
>
> Can someone point me to where I can read this up? Or send me a code
> fragment?


-----------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
    s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
    print;
}
-----------
 
 
Keith Thompson
Guest
Posts: n/a
 
      01-04-2013
Henry Law <(E-Mail Removed)> writes:
[...]
> You could use Date::Calc, particularly the Decode_Date_EU function; it's
> overkill if what you've described is really all there is, but it saves
> programming. A truly lazy^H^H^H^Hcreative programmer would look for
> something to decode the tab-separated file too; maybe Text::CSV would do
> that? I've only ever used it for comma separated data, (which, er, is
> what it's for).


Yes, quoting "perldoc Text::CSV":

The module accepts either strings or files as input and
can utilize any user-specified characters as delimiters,
separators, and escapes so it is perhaps better called ASV
(anything separated values) rather than just CSV.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
Rainer Weikusat
Guest
Posts: n/a
 
      01-04-2013
Henry Law <(E-Mail Removed)> writes:

> On 02/01/13 10:22, Dave Saville wrote:
>> On Tue, 1 Jan 2013 23:56:14 UTC, Dr Eberhard Lisse <(E-Mail Removed)>
>> wrote:
>>
>>> I have a Tab Separated File of roughly 1000 lines with the first fields like
>>>
>>> "07 Jan 2011" "TFR"
>>> "05 Jan 2011" "DR"

>>
>> Not clear if the file has the quotes or you are using them to show the
>> fields. Assuming you have extracted the first field then split on
>> space to day month year. Set up an array of month names. Find the
>> index of the given month. Regenerate the field with sprintf. $new =
>> sprintf("$year-%02d-$day", $index); For simplicity put a dummy month on
>> the front of the list, perl arrays index from 0, so @months = qw(crap
>> Jan Feb ..........

>
> You could use Date::Calc, particularly the Decode_Date_EU function;
> it's overkill if what you've described is really all there is, but it
> saves programming. A truly lazy^H^H^H^Hcreative programmer would look
> for something to decode the tab-separated file too; maybe Text::CSV
> would do that?


Nice example how it 'saves programming':

,----
| #!/usr/bin/perl
| use strict;
| use warnings;
| use 5.010;
|
| use Date::Calc qw( Decode_Date_EU );
| use Text::CSV;
|
| my $csv = Text::CSV->new( { sep_char=>"\t", quote_char=>'"' } )
|     or die "Failed to create CSV object: $!\n";
| while ( 1 ) {
|     my $row = $csv->getline( \*DATA );
|     last unless $row->[0]; # getline returns zero-length arrayref; irritating
|     my ( $year, $month, $day ) = Decode_Date_EU( $row->[0] );
|     die "Bad date" unless $year;
|     printf "%04d-%02d-%02d\t%s\n", $year, $month, $day, $row->[1];
| }
`----

That's 14 lines of code. Alternate version without Date::Calc and
Text::CSV

,----
| %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
|
| while (<>) {
| s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
| print;
| }
`----

That's good enough for the problem which was described and it's four
lines of code. "Truly creative", -10 lines of code were saved here
and a comment explaining an 'ugly' workaround for deficiency in the
downloaded code had to be added as well[*],

while (1) {
 
 
C.DeRykus
Guest
Posts: n/a
 
      01-05-2013
On Wednesday, January 2, 2013 7:37:02 AM UTC-8, Rainer Weikusat wrote:
> Dr Eberhard Lisse <(E-Mail Removed)> writes:
> > I have a Tab Separated File of roughly 1000 lines with the first fields like
> >
> > "07 Jan 2011" "TFR"
> > "05 Jan 2011" "DR"
> >
> > I need to change the first field to look like
> >
> > 2011-01-07 "TFR"
> > 2011-01-05 "DR"
> >
> > for all lines, of course -O
> >
> > Can someone point me to where I can read this up? Or send me a code
> > fragment?
>
> -----------
> %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
>
> while (<>) {
> s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
> print;
> }
> -----------


Maybe even shrink it to a long one-liner:

perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
{"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile
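
As a rough sketch of what the one-liner does per line, here is a script version of the same substitution. Note the swap: it uses Time::Piece (a core module) in place of Date::Manip, purely so the example runs without installing anything. The sub name convert_line is made up for illustration, and the quotes around the date are dropped, as in the original request.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Piece;    # core module since Perl 5.10

# convert_line is a hypothetical helper name, not from the thread.
# Rewrites a leading "DD Mon YYYY" field as YYYY-MM-DD and leaves the
# rest of the line alone. Lines that do not start with a quoted
# three-part field pass through unchanged; strptime dies if the text
# matches the pattern but is not a valid date.
sub convert_line {
    my ($line) = @_;
    $line =~ s{^"(\d+)\s+(\S+)\s+(\d+)"}
              {Time::Piece->strptime("$1 $2 $3", '%d %b %Y')->ymd}e;
    return $line;
}

print convert_line(qq{"07 Jan 2011"\t"TFR"\n});
print convert_line(qq{"05 Jan 2011"\t"DR"\n});
```

The /e modifier is what lets the replacement side be an expression rather than a plain string, which is the same trick the one-liner relies on.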

--
Charles DeRykus


 
 
Rainer Weikusat
Guest
Posts: n/a
 
      01-05-2013
"C.DeRykus" <(E-Mail Removed)> writes:
> On Wednesday, January 2, 2013 7:37:02 AM UTC-8, Rainer Weikusat wrote:
>> Dr Eberhard Lisse <(E-Mail Removed)> writes:
>> > I have a Tab Separated File of roughly 1000 lines with the first
>> > fields like
>> >
>> > "07 Jan 2011" "TFR"
>> > "05 Jan 2011" "DR"
>> >
>> > 2011-01-07 "TFR"
>> > 2011-01-05 "DR"


[...]

>> -----------
>> %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
>>
>> while (<>) {
>> s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
>> print;
>> }
>> -----------

>
> Maybe even shrink it to a long one-liner:
>
> perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
> {"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile


Considering the situation of the OP, he has a 'zero line' solution
because all code was written by someone else. I don't know how this is
for other people, however, I can type

qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)

much faster than I can download anything from the net, especially
considering that I'd have to read the documentation for this anything,
too, making this a very bad tradeoff. And if I had to rely on someone
else's code for totally trivial stuff such as splitting a text file
with n 'somehow separated' data columns into an array, I would have a
very hard time solving the much more complicated problems I usually
need to deal with. Actually, I regularly search CPAN whenever I have a
reasonably complex and self-contained subtask for which 'using
a module', if one existed, would be a good idea. The most common result
of these searches, however, is 'nada', the second most common is some
totally bizarre implementation of 25% of the features I actually need
and the third 'implementation is total crap' aka 'IO::Poll' (and the
original author abandoned the code in question in 1975 in order to
become a missionary in Gabon or something like that).

CPAN is mostly a load of tripe resulting from fifteen years of bored
'hobbyists' (here supposed to mean people whose actual job isn't
programming) trying whatever weirdo-approach for solving fifty
different but vaguely related _trivial_ problems with the help of a
steam-engine powered motor umbrella constructed out of yellow,
magenta and purple lego bricks happened to come to their mind. And
downloading all these 'incredible machines' is - except in case of
500 SLOC throw-away 'oneliners' - not the end of the story: I have to
maintain the code because the people who use the software I'm
responsible for come to me with any problems resulting from that.

The rule of thumb I usually follow is that 'using a library' (or -
something I very much prefer - an already written program somebody
actually used to solve a real problem) is only worth the effort if it
saves a significant amount of work, at least something like 500 lines
of code and preferably, a few thousands. And even then, I end up
'maintaining' seriously byzantine workarounds for all the problems in
the 'free' code until I grow tired of that and replace it with
something which actually works (in the sense that it reliably does
what is needed to solve the problem I have to solve and nothing else)
more often than not.
 
 
Dr Eberhard Lisse
Guest
Posts: n/a
 
      01-05-2013
The OP is an elderly Obstetrician & Gynecologist, who occasionally needs
to Practically Extract and Report stuff.

el

On 2013-01-05 21:56 , Rainer Weikusat wrote:
> "C.DeRykus" <(E-Mail Removed)> writes:
>> On Wednesday, January 2, 2013 7:37:02 AM UTC-8, Rainer Weikusat wrote:
>>> Dr Eberhard Lisse <(E-Mail Removed)> writes:
>>>> I have a Tab Separated File of roughly 1000 lines with the first
>>> fields like
>>>
>>>> "07 Jan 2011" "TFR"
>>>> "05 Jan 2011" "DR"
>>>
>>>> 2011-01-07 "TFR"
>>>> 2011-01-05 "DR"

>
> [...]
>
>>> -----------
>>> %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
>>>
>>> while (<>) {
>>> s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
>>> print;
>>> }
>>> -----------

>>
>> Maybe even shrink it to a long one-liner:
>>
>> perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
>> {"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile

>
> Considering the situation of the OP, he has a 'zero line' solution
> because all code was written by someone else. I don't know how this is
> for other people, however, I can type
>
> qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
>
> much faster than I can download anything from the net, especially
> considering that I'd have to read the documentation for this anything,
> too, making this a very bad tradeoff. And if I had to rely on someone
> else's code for totally trivial stuff such as splitting a text file
> with n 'somehow separated' data columns into an array, I would have a
> very hard time solving the much more complicated problems I usually
> need to deal with. Actually, I regularly search CPAN whenever I have a
> reasonably complex and self-contained subtask for which 'using
> a module', if one existed, would be a good idea. The most common result
> of these searches, however, is 'nada', the second most common is some
> totally bizarre implementation of 25% of the features I actually need
> and the third 'implementation is total crap' aka 'IO::Poll' (and the
> original author abandoned the code in question in 1975 in order to
> become a missionary in Gabon or something like that).
>
> CPAN is mostly a load of tripe resulting from fifteen years of bored
> 'hobbyists' (here supposed to mean people whose actual job isn't
> programming) trying whatever weirdo-approach for solving fifty
> different but vaguely related _trivial_ problems with the help of a
> steam-engine powered motor umbrella constructed out of yellow,
> magenta and purple lego bricks happened to come to their mind. And
> downloading all these 'incredible machines' is - except in case of
> 500 SLOC throw-away 'oneliners' - not the end of the story: I have to
> maintain the code because the people who use the software I'm
> responsible for come to me with any problems resulting from that.
>
> The rule of thumb I usually follow is that 'using a library' (or -
> something I very much prefer - an already written program somebody
> actually used to solve a real problem) is only worth the effort if it
> saves a significant amount of work, at least something like 500 lines
> of code and preferably, a few thousands. And even then, I end up
> 'maintaining' seriously byzantine workarounds for all the problems in
> the 'free' code until I grow tired of that and replace it with
> something which actually works (in the sense that it reliably does
> what is needed to solve the problem I have to solve and nothing else)
> more often than not.
>



--
if you want to reply, replace nospam with my initials
 
 
C.DeRykus
Guest
Posts: n/a
 
      01-05-2013
On Saturday, January 5, 2013 11:56:18 AM UTC-8, Rainer Weikusat wrote:
> "C.DeRykus" <(E-Mail Removed)> writes:
> > On Wednesday, January 2, 2013 7:37:02 AM UTC-8, Rainer Weikusat wrote:
> >> Dr Eberhard Lisse <(E-Mail Removed)> writes:
> >> > I have a Tab Separated File of roughly 1000 lines with the first
> >> > fields like
> >> >
> >> > "07 Jan 2011" "TFR"
> >> > "05 Jan 2011" "DR"
> >> >
> >> > 2011-01-07 "TFR"
> >> > 2011-01-05 "DR"
>
> [...]
>
> >> -----------
> >> %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
> >>
> >> while (<>) {
> >> s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
> >> print;
> >> }
> >> -----------
> >
> > Maybe even shrink it to a long one-liner:
> >
> > perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
> > {"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile
>
> Considering the situation of the OP, he has a
> 'zero line' solution because all code was written
> by someone else.


Hm, it sounded like he just had a separate tab-delimited
file he needed in a different format (ideal for a
one-liner). The -i switch is especially useful for just
this if the scenario allows it.
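
For anyone wondering what -i.bak amounts to inside a script, here is a minimal sketch. It relies on the special variable $^I, which is the in-script counterpart of the -i switch, and uses the month-table approach from earlier in the thread. The sub name rewrite_in_place and the hash name %month_number are made up for illustration, and the quotes around the date are dropped as in the original request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Month name => month number (hypothetical variable name).
my %month_number;
@month_number{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# rewrite_in_place is a hypothetical helper name, not from the thread.
# Edits each named file in place and keeps the original as FILE.bak,
# much like "perl -pi.bak" does on the command line.
sub rewrite_in_place {
    my (@files) = @_;
    local ($^I, @ARGV) = ('.bak', @files);    # switch on in-place editing
    while (my $line = <>) {
        # Leave the line untouched ($&) when the month name is unknown.
        $line =~ s{^"(\d+)\s+(\S+)\s+(\d+)"}
                  {exists $month_number{$2}
                       ? sprintf('%04d-%02d-%02d', $3, $month_number{$2}, $1)
                       : $&}e;
        print $line;    # goes to the file being rewritten, not the terminal
    }
}

rewrite_in_place(@ARGV) if @ARGV;
```

Run it as, say, "perl fixdates.pl infile" (the script name is just an example); the untouched original ends up in infile.bak.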

> I don't know how this is
> for other people, however, I can type
>
> qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
>
> much faster than I can download anything from the net, especially
> considering that I'd have to read the documentation for this anything,
> too, making this a very bad tradeoff. And if I had to rely on someone
> else's code for totally trivial stuff such as splitting a text file
> with n 'somehow separated' data columns into an array, I would have a
> very hard time solving the much more complicated problems I usually
> need to deal with. Actually, I regularly search CPAN whenever I have a
> reasonably complex and self-contained subtask for which 'using
> a module', if one existed, would be a good idea. The most common result
> of these searches, however, is 'nada', the second most common is some
> totally bizarre implementation of 25% of the features I actually need
> and the third 'implementation is total crap' aka 'IO::Poll' (and the
> original author abandoned the code in question in 1975 in order to
> become a missionary in Gabon or something like that).
>
> CPAN is mostly a load of tripe resulting from fifteen years of bored
> 'hobbyists' (here supposed to mean people whose actual job isn't
> programming) trying whatever weirdo-approach for solving fifty
> different but vaguely related _trivial_ problems with the help of a
> steam-engine powered motor umbrella constructed out of yellow,
> magenta and purple lego bricks happened to come to their mind. And
> downloading all these 'incredible machines' is - except in case of
> 500 SLOC throw-away 'oneliners' - not the end of the story: I have to
> maintain the code because the people who use the software I'm
> responsible for come to me with any problems resulting from that.
>
> The rule of thumb I usually follow is that 'using a library' (or -
> something I very much prefer - an already written program somebody
> actually used to solve a real problem) is only worth the effort if it
> saves a significant amount of work, at least something like 500 lines
> of code and preferably, a few thousands. And even then, I end up
> 'maintaining' seriously byzantine workarounds for all the problems in
> the 'free' code until I grow tired of that and replace it with
> something which actually works (in the sense that it reliably does
> what is needed to solve the problem I have to solve and nothing else)
> more often than not.


I can appreciate your viewpoint. Date::Manip though
is well-maintained and extraordinarily useful. There
are several other very good Date modules as well.

Leveraging a small bit of module code for a tedious,
surprisingly frequent little chore appeals to the
very lazy. So, it's worth it IMO

--
Charles DeRykus


 
 
 
 