Text file splitter, date/time field

 
 
originals@gmail.com
01-31-2006
Sorry to be such a leech!

I need to split an archive of a discussion forum saved as one huge txt
file into individual txt files--one per message.

Posts are stamped with a date and time, messages can be of any length.
Posters are sometimes addressed by their time (as it was an anon forum)
but the full time/date stamp is always unique to the start of a
message.

New to perl but have installed activeperl and can run a .pl script from
the command line.

If anyone could provide a script for this job, I'd really appreciate
it.

05.11.01 10:01 AM

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

05.11.01 10:41 AM

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

05.12.01 10:50 PM

10:01, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

you get the idea.

Thanks.

ps I won't just use it and dump it, I will learn from it!!!! cheers.

 
John W. Krahn
01-31-2006
(E-Mail Removed) wrote:
>
> I need to split an archive of a discussion forum saved as one huge txt
> file into individual txt files--one per message.
>
> Posts are stamped with a date and time, messages can be of any length.
> Posters are sometimes addressed by their time (as it was an anon forum)
> but the full time/date stamp is always unique to the start of a
> message.
>
> New to perl but have installed activeperl and can run a .pl script from
> the command line.
>
> If anyone could provide a script for this job, I'd really appreciate
> it.


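#!/usr/bin/perl
use warnings;
use strict;

while ( <> ) {

    # A line that is exactly a date/time stamp starts a new message.
    if ( /^\d{2}\.\d{2}\.\d{2} \d{2}:\d{2} [AP]M$/ ) {
        chomp;
        tr/ /_/;    # the stamp, with spaces replaced, becomes the file name
        open OUT, '>', $_ or die "Cannot open $_: $!";
        next;
    }

    # Copy message lines to the current file, once one has been opened.
    print OUT if fileno OUT;
}

__END__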



John
--
use Perl;
program
fulfillment
 
originals@gmail.com
02-01-2006
John, thanks for this. It's only outputting empty files (0k, no
extension and no content when opened in Notepad). I've tried it with
the sample I posted (saved as plain text) just to make sure, and got the
same result. Maybe you can tweak it. In the meantime I'll see if I can get
anywhere using the "if ( /^\d{2}\.\d{2}\.\d{2} \d{2}:\d{2} [AP]M$/ )"
in a similar script I've found that splits after a keyword.

many thanks

 
usenet@DavidFilmer.com
02-01-2006
(E-Mail Removed) wrote:
> John, thanks for this. It's only outputting empty files


That's odd - John's script should not have produced any type of file
for you - not because there's anything wrong with his script, but
because you're on a Windows machine, and you want to create files named
after the timestamp, which includes a colon (":"), and that is an
illegal character in Windows file names. It should have failed on the
attempt to create the file for writing.

John's script works perfectly for me on UNIX, and perfectly on Windows
if I create a slightly modified version of the filename, such as:

(my $file = $_) =~ s/\:/_/g;
open OUT, '>',$file or die "Cannot open $_: $!";

--
http://DavidFilmer.com
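
The file name fix discussed above can be sketched as a small helper. This is a minimal sketch, not code from the thread: the sub name stamp_to_filename and the ".txt" extension are assumptions added for illustration. On NTFS a colon in a file name can also be read as the separator for an alternate data stream, which may be why the original poster saw empty files with no extension rather than an error; that explanation is an assumption and is not confirmed anywhere in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a stamp line such as "05.11.01 10:01 AM"
# into a name that is legal on both Windows and Unix file systems.
sub stamp_to_filename {
    my ($stamp) = @_;
    ( my $name = $stamp ) =~ tr/ :/_/;    # space and colon both become "_"
    return "$name.txt";                   # the ".txt" extension is an assumption
}

print stamp_to_filename('05.11.01 10:01 AM'), "\n";    # prints 05.11.01_10_01_AM.txt
```

Copying the stamp into a separate variable before translating it leaves the original line untouched, so it is still available for error messages or for writing into the output file.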

 
John W. Krahn
02-02-2006
(E-Mail Removed) wrote:
> (E-Mail Removed) wrote:
>>John, thanks for this. It's only outputting empty files

>
> That's odd - John's script should not have produced any type of file
> for you - not because there's anything wrong with his script, but
> because you're on a Windows machine, and you want to create files named
> after the timestamp, which includes a colon (":"), and that is an
> illegal character in Windows file names. It should have failed on the
> attempt to create the file for writing.
>
> John's script works perfectly for me on UNIX, and perfectly on Windows
> if I create a slightly modified version of the filename, such as:
>
> (my $file = $_) =~ s/\:/_/g;
> open OUT, '>',$file or die "Cannot open $_: $!";


Thanks, I don't have Windows to test on. Actually if you just changed the line:

tr/ /_/;

to:

tr/ :/_/;

it would have done the same.


BTW:

> open OUT, '>',$file or die "Cannot open $_: $!";
                ^^^^^                     ^^

If you are going to change the variable in the open() you should change it in
the die() as well.



John
--
use Perl;
program
fulfillment
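
Putting the corrections from this exchange together gives something like the following. It is a sketch under stated assumptions, not the script anyone in the thread actually ran: the sub name split_archive, the optional $out_dir parameter, the ".txt" extension and the CRLF handling are all additions for illustration. It keeps John's regex and his tr/ :/_/ change, uses lexical file handles, and reports the same file name in die() that it passes to open().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one archive into one file per message. Each message starts with a
# stamp line such as "05.11.01 10:01 AM"; text before the first stamp is skipped.
sub split_archive {
    my ( $archive, $out_dir ) = @_;    # $out_dir is a hypothetical parameter
    $out_dir = '.' unless defined $out_dir;

    open my $in, '<', $archive or die "Cannot open $archive: $!";
    my $out;                           # handle for the message being written
    my @written;

    while ( my $line = <$in> ) {
        $line =~ s/\r?\n\z//;          # tolerate both CRLF and LF line endings
        if ( $line =~ /^\d{2}\.\d{2}\.\d{2} \d{2}:\d{2} [AP]M$/ ) {
            ( my $name = $line ) =~ tr/ :/_/;    # no spaces or colons in names
            my $file = "$out_dir/$name.txt";
            close $out if $out;
            open $out, '>', $file or die "Cannot open $file: $!";
            push @written, $file;
            next;                      # the stamp itself is not copied
        }
        print {$out} "$line\n" if $out;
    }
    close $out if $out;
    close $in;
    return @written;
}

# Process every archive named on the command line.
split_archive($_) for @ARGV;
```

Saved under a hypothetical name such as split_forum.pl, it would be run as "perl split_forum.pl archive.txt" and would write the message files into the current directory.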
 
Throw
02-17-2006

(E-Mail Removed) wrote:

> I need to split an archive of a discussion forum saved as one huge txt
> file into individual txt files--one per message.
>
> Posts are stamped with a date and time, messages can be of any length.
> Posters are sometimes addressed by their time (as it was an anon forum)
> but the full time/date stamp is always unique to the start of a
> message.


G'day everyone

The solution given to this question is exactly what I'm looking for,
except I need to split a concatenated PHP file. Basically, I have one
large text file into which I have copied PHP file after PHP file, and
now I want to split them up again. The PHP file always begins with

<?php

and always ends with

?>

so it should be fairly easy to adjust the above script, shouldn't it?
However, I have tried and failed. Also, what would the command line be
for it? Can anyone help me with the adaptation?

Thanks a lot
Samuel (aka throw aka leuce aka voetleuce)

 
A. Sinan Unur
02-17-2006
"Throw" <(E-Mail Removed)> wrote in news:1140187205.361247.173780@g43g2000cwa.googlegroups.com:

> except I need to split a concatenated PHP file. Basically, I have one
> large text file into which I have copied PHP file after PHP file, and
> now I want to split them up again. The PHP file always begins with
>
> <?php
>
> and always ends with
>
> ?>
>
> so it should be fairly easy to adjust the above script, shouldn't it?
> However, I have tried and failed.


What have you tried and what has failed?

Please read the posting guidelines for this group. They provide you with
invaluable information you can use to help yourself as well as helping us
help you.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 
Tad McClellan
02-18-2006
Throw <(E-Mail Removed)> wrote:

> so it should be fairly easy to adjust the above script, shouldn't it?
> However, I have tried and failed.



What have you tried?

If you show us your broken code we could help you fix it.


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
it_says_BALLS_on_your_forehead
02-18-2006

Throw wrote:
> (E-Mail Removed) wrote:
>
> > I need to split an archive of a discussion forum saved as one huge txt
> > file into individual txt files--one per message.
> >
> > Posts are stamped with a date and time, messages can be of any length.
> > Posters are sometimes addressed by their time (as it was an anon forum)
> > but the full time/date stamp is always unique to the start of a
> > message.

>
> G'day everyone
>
> The solution given to this question is exactly what I'm looking for,
> except I need to split a concatenated PHP file. Basically, I have one
> large text file into which I have copied PHP file after PHP file, and
> now I want to split them up again. The PHP file always begins with
>
> <?php
>
> and always ends with
>
> ?>
>
> so it should be fairly easy to adjust the above script, shouldn't it?
> However, I have tried and failed. Also, what would the command line be
> for it? Can anyone help me with the adaptation?


check out this FAQ:
http://groups.google.com/group/comp....86c9f8a384c887

 
it_says_BALLS_on_your_forehead
02-18-2006

it_says_BALLS_on_your_forehead wrote:
> Throw wrote:
> > (E-Mail Removed) wrote:
> >
> > > I need to split an archive of a discussion forum saved as one huge txt
> > > file into individual txt files--one per message.
> > >
> > > Posts are stamped with a date and time, messages can be of any length.
> > > Posters are sometimes addressed by their time (as it was an anon forum)
> > > but the full time/date stamp is always unique to the start of a
> > > message.

> >
> > G'day everyone
> >
> > The solution given to this question is exactly what I'm looking for,
> > except I need to split a concatenated PHP file. Basically, I have one
> > large text file into which I have copied PHP file after PHP file, and
> > now I want to split them up again. The PHP file always begins with
> >
> > <?php
> >
> > and always ends with
> >
> > ?>
> >
> > so it should be fairly easy to adjust the above script, shouldn't it?
> > However, I have tried and failed. Also, what would the command line be
> > for it? Can anyone help me with the adaptation?

>
> check out this FAQ:
> http://groups.google.com/group/comp....86c9f8a384c887


Remember when crafting your solution: if you want to use John's
example, you must have some sort of unique identifier for each file you
want to write. Since there's no unique timestamp, I would suggest a
counter incremented in the while loop. If you couple John's example with
the information in the FAQ linked to above, the answer should be obvious.
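
The counter idea might look like the sketch below. It is not taken from the FAQ (whose link is truncated here) and nobody in the thread posted it: the sub name split_php, the $out_dir parameter and the part_NNN.php naming scheme are hypothetical. It also assumes that "<?php" appears at the start of a line only where a new file begins, so a source file containing several PHP blocks would be split in the wrong places.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a file of concatenated PHP sources. A line starting with "<?php"
# opens a new output file; with no timestamp to name it after, a counter
# supplies the unique part of the name.
sub split_php {
    my ( $bundle, $out_dir ) = @_;     # both parameter names are hypothetical
    $out_dir = '.' unless defined $out_dir;

    open my $in, '<', $bundle or die "Cannot open $bundle: $!";
    my $out;                           # handle for the file being written
    my $count = 0;                     # number of output files so far
    my @written;

    while ( my $line = <$in> ) {
        if ( $line =~ /^<\?php/ ) {
            my $file = sprintf '%s/part_%03d.php', $out_dir, ++$count;
            close $out if $out;
            open $out, '>', $file or die "Cannot open $file: $!";
            push @written, $file;
        }
        print {$out} $line if $out;    # the "<?php" and "?>" lines are kept
    }
    close $out if $out;
    close $in;
    return @written;
}

# Process every bundle named on the command line.
split_php($_) for @ARGV;
```

As for the command line: saved under a hypothetical name such as split_php.pl, it would be run as "perl split_php.pl all.php", where all.php stands for the concatenated file.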

 