Break large file down into smaller parts
Greets,
I have a 2 million+ line file that gets generated twice a day, and was wondering if there is a way to count its lines and split the file into several (say 5) parts with different file names.

So instead of having list.txt with 2 million lines, I'd end up with file1.txt, file2.txt, file3.txt, ... each with an equal (or nearly equal) amount of data from the original.

Brian F.
Re: Break large file down into smaller parts
Brian F. wrote:
> Greets,
>
> I have a 2million+ line file that gets generated twice a day, and was
> wondering if there would be a way to read in the amount of lines and
> split the file into several (say 5) parts with different file names?
>
> so instead of having list.txt with 2 million lines, i'd end up with
> file1.txt, file2.txt, file3.txt..... each with an equal (or nearly
> equal) amount of data from the original.

man split

    split --lines=NUMBER

Toni
Re: Break large file down into smaller parts
If you are using Unix or the like, there is a command called split that will do it for you.
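For what it's worth, here is a minimal sketch of driving split from a line count, assuming a POSIX-style split and wc are available. The 23-line list.txt generated here is only a stand-in for the real file, and the output prefix "file" and the temporary directory are illustrative choices, not something taken from the posts above.

```shell
#!/bin/sh
# Sketch: split a file into (at most) $parts roughly equal pieces with split(1).
set -eu

# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Build a 23-line sample input (hypothetical stand-in for the real list.txt).
i=1
while [ "$i" -le 23 ]; do
    echo "line $i"
    i=$((i + 1))
done > list.txt

parts=5
total=$(wc -l < list.txt)

# Ceiling division, so that no more than $parts files are produced.
per_part=$(( (total + parts - 1) / parts ))

# -l sets the number of lines per output file; "file" is the output prefix,
# so the pieces are named fileaa, fileab, ...
split -l "$per_part" list.txt file

ls file??
```

With 23 lines and 5 parts this gives four files of 5 lines and one of 3. The pieces come out as fileaa, fileab, ... rather than file1.txt, file2.txt; GNU split has an option for numeric suffixes (-d), so check man split on your system.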
Re: Break large file down into smaller parts
On Tue, 16 Nov 2004 08:10:45 -0800, Brian F. wrote:
> I have a 2million+ line file that gets generated twice a day, and was
> wondering if there would be a way to read in the amount of lines and
> split the file into several (say 5) parts with different file names?
>
> so instead of having list.txt with 2 million lines, i'd end up with
> file1.txt, file2.txt, file3.txt..... each with an equal (or nearly
> equal) amount of data from the original.

1. Count the number of lines in the file; 'perldoc -q lines'
2. Decide on how many parts you want.
3. Iterate through the file, opening, writing to and closing each file as appropriate.

--
Tore Aursand <toreau@gmail.com>
"A car is not the only thing that can be recalled by its maker." (Unknown)
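The three steps above can be sketched in shell roughly as follows, with awk doing the per-line iteration. This is only an illustration under assumptions: the 17-line list.txt is generated sample data, the 4-way split and the file1.txt, file2.txt, ... naming are picked to match the original question, and the scratch directory is hypothetical.

```shell
#!/bin/sh
# Sketch of: 1. count lines, 2. decide on the parts, 3. iterate and write.
set -eu

# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 17-line sample standing in for the real list.txt.
i=1
while [ "$i" -le 17 ]; do
    echo "line $i"
    i=$((i + 1))
done > list.txt

# Step 1: count the lines.
total=$(wc -l < list.txt)

# Step 2: decide how many parts, and how many lines each may hold
# (ceiling division, so the last part may be shorter).
parts=4
per_part=$(( (total + parts - 1) / parts ))

# Step 3: walk the file once; work out which output file the current line
# belongs to, and close the previous file whenever the target changes.
awk -v per="$per_part" '
{
    out = sprintf("file%d.txt", int((NR - 1) / per) + 1)
    if (out != prev) {
        if (prev != "") close(prev)
        prev = out
    }
    print > out
}' list.txt

ls file*.txt
```

Here 17 lines in 4 parts come out as three files of 5 lines and one of 2. Because the per-part size is rounded up, some inputs yield fewer files than requested (12 lines asked for in 5 parts give 4 files of 3 lines).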
Re: Break large file down into smaller parts
On Tue, 16 Nov 2004 08:10:45 -0800, Brian F. wrote:
> Greets,
>
> I have a 2million+ line file that gets generated twice a day, and was
> wondering if there would be a way to read in the amount of lines and
> split the file into several (say 5) parts with different file names?
>
> so instead of having list.txt with 2 million lines, i'd end up with
> file1.txt, file2.txt, file3.txt..... each with an equal (or nearly
> equal) amount of data from the original.

If the file you're reading isn't being written to as this script runs, then the example below should do what you want. If the file you want to read *is* being written to while you're reading it, that opens up a whole host of other issues (like losing information while reading).

(example - may need work)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $prefix_for_chunks = '/tmp/testing';
    my $chunk_count       = 1;
    my $chunk_size        = 100000;
    my $file_to_read      = '/var/log/messages';

    open my $in, '<', $file_to_read
        or die "Can't open $file_to_read: $!\n";

    # Open the output file for a given chunk number, e.g. /tmp/testing0001.txt
    sub open_chunk {
        my ($number) = @_;
        my $name = sprintf "%s%04d.txt", $prefix_for_chunks, $number;
        open my $fh, '>', $name
            or die "Can't open $name for writing: $!\n";
        return $fh;
    }

    my $out = open_chunk($chunk_count);

    while (my $line = <$in>) {
        # Start a new chunk once the current one holds $chunk_size lines.
        if ($. > 1 && ($. - 1) % $chunk_size == 0) {
            close $out or die "Can't close chunk $chunk_count: $!\n";
            $out = open_chunk(++$chunk_count);
        }
        print {$out} $line;
    }

    close $in;
    close $out or die "Can't close chunk $chunk_count: $!\n";

HTH

Jim