A script to separate out file names from the path?

 
 
Rich Grise
      12-11-2006
I have a collection of about 6000 files that need to be reorganized.
These have been strewn all over the place, from CDs to various partitions
and subdirectories on different workstations, to a pile of various
subdirectories from our Samba server, and what-not.

They're all on different depths of subdir, and I'm almost certain that
there's a lot of redundancy - I've got a list that looks something like
this example:

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

and so on; as you can see, they're at different subdir depths;
what I want to do, if possible, is to take this array, split out
only the last component (after some unknown number of '/', but
the last one in the string), put it in the front of a new
string, then concatenate the original line;

The ultimate goal is to sort these by filename - I could kill
a lot of redundancy pretty easy that way.

But it turns out, what I've been trying to do is use
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);
}

doesn't seem to accomplish what I think it should. Here's the
script I've got so far:

#!/usr/bin/perl

while (<>) {
$input = chop($_);
@line = split(/\//,$input);
$count = @line;
print ("count = ", $count, "\n");

# foreach $item(@line) {
# print (" item = ", $item);
# }
# print ("count = ", $count, " ");

# for ($i = 0; $i < $count; $i++) {
# print (" item ", $i, " = ", @line[$i], " ");
# }

# $myitem = @line[$count-1];

# print (@line[$count-1]);

# print ": ";
# print $input;
# print "\n";
}


As you can see, I've tried variations on this, and nothing I've
tried yet has done what I want.

Here's the input (example):

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

And here's what I want the output to look like:

file1 : /Collection/a/b/c/d/file1
file2 : /Collection/a/b/c/d/file2
file3 : /Collection/a/b/c/d/file3
file4 : /Collection/a/b/c/d/file4
file4 : /Collection/a/b/c/e/file4
file5 : /Collection/a/b/c/e/file5
file4 : /Collection/e/f/g/file4
file5 : /Collection/e/f/g/file5
file6 : /Collection/e/f/g/file6
file7 : /Collection/e/f/g/file7

Which I could sort, and track down the duplicates.

But I'm stuck on rearranging the strings. )-;

Would anyone wish to be so kind as to volunteer to do my homework for me?

Thanks,
Rich

 
usenet@DavidFilmer.com
      12-11-2006
Rich Grise wrote:
> A script to separate out file names from the path?


The module File::Basename is part of your standard Perl distribution.
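
A minimal sketch of how that module might be applied to the list in the question. The helper name label_path and the hard-coded sample paths are assumptions made for illustration; basename() is the function File::Basename exports by default.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;   # core module; exports basename() and dirname()

# label_path is a hypothetical helper, not something from the original post.
# It returns "name : full/path" for one path string without a trailing newline.
sub label_path {
    my ($path) = @_;
    return basename($path) . " : $path";
}

# A few sample paths copied from the question.
my @paths = qw(
    /Collection/a/b/c/d/file4
    /Collection/a/b/c/e/file4
    /Collection/e/f/g/file5
);

print label_path($_), "\n" for @paths;
```

Piping output like this through sort(1) should put lines with the same file name next to each other.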

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)

 
J. Gleixner
      12-11-2006
Rich Grise wrote:
[...]
> The ultimate goal is to sort these by filename - I could kill
> a lot of redundancy pretty easy that way.
>
> But it turns out, what I've been trying to do is use
> for (<>) {
> my @line = split(/\//,$_);
> my $count = @line;
> print (@line[$count-1], " : ", $_);
> }


You can use a negative index.

my @arr = qw(a b c d e);
print $arr[-1];

Will print: e

Note: It's $line[] not @line[].

And since split returns a list, you could get the last item:

my $last_item = ( split /\// ) [-1];
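
Putting those pieces together, a small sketch; the sub name last_component is made up for this illustration, and the paths are taken from the question:

```perl
use strict;
use warnings;

# last_component is a hypothetical name used only for this illustration.
sub last_component {
    my ($path) = @_;
    # List slice on split's return value; [-1] picks the final field.
    return ( split /\//, $path )[-1];
}

my @line = split /\//, '/Collection/e/f/g/file7';
print "$line[-1]\n";    # scalar element: $line[...], not @line[...]
print last_component('/Collection/a/b/c/d/file1'), "\n";
```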


> Would anyone wish to be so kind as to volunteer to do my homework for me?

No, however most people will help you learn the language so you can do
it yourself.
 
Lew Pitcher
      12-11-2006

Rich Grise wrote:
> I have a collection of about 6000 files that need to be reorganized.
> These have been strewn all over the place, from CDs to various partitions
> and subdirectories on different workstations, to a pile of various
> subdirectories from our Samba server, and what-not.
>
> They're all on different depths of subdir, and I'm almost certain that
> there's a lot of redundancy - I've got a list that looks something like
> this example:
>
> /Collection/a/b/c/d/file1
> /Collection/a/b/c/d/file2
> /Collection/a/b/c/d/file3
> /Collection/a/b/c/d/file4
> /Collection/a/b/c/e/file4
> /Collection/a/b/c/e/file5
> /Collection/e/f/g/file4
> /Collection/e/f/g/file5
> /Collection/e/f/g/file6
> /Collection/e/f/g/file7
>
> and so on; as you can see, they're at different subdir depths;
> what I want to do, if possible, is to take this array, split out
> only the last component (after some unknown number of '/', but
> the last one in the string), put it in the front of a new
> string, then concatenate the original line;
>
> The ultimate goal is to sort these by filename - I could kill
> a lot of redundancy pretty easy that way.
>
> But it turns out, what I've been trying to do is use
> for (<>) {
> my @line = split(/\//,$_);
> my $count = @line;
> print (@line[$count-1], " : ", $_);
> }
>
> doesn't seem to accomplish what I think it should. Here's the
> script I've got so far:

[snip]

I say why use complex tools when simple tools will suffice

Have you looked at the basename(1) and dirname(1) utilities?

lpitcher@merlin:~$ basename /Collection/a/b/c/d/file1.a
file1.a
lpitcher@merlin:~$ basename /Collection/a/b/c/d/file1
file1

lpitcher@merlin:~$ dirname /Collection/a/b/c/d/file1.a
/Collection/a/b/c/d
lpitcher@merlin:~$ dirname /Collection/a/b/c/d/file1
/Collection/a/b/c/d

Something as simple as

#!/bin/bash
# Quote "$1" so paths containing spaces are not word-split.
echo "$(basename "$1") : $1"

might do the trick

HTH
--
Lew

 
John W. Krahn
      12-11-2006
Rich Grise wrote:
> I have a collection of about 6000 files that need to be reorganized.
> These have been strewn all over the place, from CDs to various partitions
> and subdirectories on different workstations, to a pile of various
> subdirectories from our Samba server, and what-not.
>
> They're all on different depths of subdir, and I'm almost certain that
> there's a lot of redundancy - I've got a list that looks something like
> this example:
>
> /Collection/a/b/c/d/file1
> /Collection/a/b/c/d/file2
> /Collection/a/b/c/d/file3
> /Collection/a/b/c/d/file4
> /Collection/a/b/c/e/file4
> /Collection/a/b/c/e/file5
> /Collection/e/f/g/file4
> /Collection/e/f/g/file5
> /Collection/e/f/g/file6
> /Collection/e/f/g/file7
>
> and so on; as you can see, they're at different subdir depths;
> what I want to do, if possible, is to take this array, split out
> only the last component (after some unknown number of '/', but
> the last one in the string), put it in the front of a new
> string, then concatenate the original line;
>
> The ultimate goal is to sort these by filename - I could kill
> a lot of redundancy pretty easy that way.
>
> But it turns out, what I've been trying to do is use
> for (<>) {
> my @line = split(/\//,$_);
> my $count = @line;
> print (@line[$count-1], " : ", $_);


You are using an array slice when you should be using a scalar:

Found in /usr/lib/perl5/5.8.6/pod/perlfaq4.pod
What is the difference between $array[1] and @array[1]?

And you can use negative numbers to index from the end of the array:

print "$line[-1] : $_";


> }
>
> doesn't seem to accomplish what I think it should. Here's the
> script I've got so far:
>
> #!/usr/bin/perl


use warnings;
use strict;

> while (<>) {
> $input = chop($_);


You should use chomp instead of chop.

> @line = split(/\//,$input);
> $count = @line;
> print ("count = ", $count, "\n");
>
> # foreach $item(@line) {
> # print (" item = ", $item);
> # }
> # print ("count = ", $count, " ");
>
> # for ($i = 0; $i < $count; $i++) {
> # print (" item ", $i, " = ", @line[$i], " ");
> # }
>
> # $myitem = @line[$count-1];
>
> # print (@line[$count-1]);
>
> # print ": ";
> # print $input;
> # print "\n";
> }



#!/usr/bin/perl
use warnings;
use strict;

use File::Basename;

print map /\0(.+)/s,
sort
map basename( $_ ) . "\0$_",
<>;

__END__




John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
 
Paul Lalli
      12-11-2006
Rich Grise wrote:
> for (<>) {
> my @line = split(/\//,$_);
> my $count = @line;
> print (@line[$count-1], " : ", $_);
> }
>
> doesn't seem to accomplish what I think it should.


No, that would have worked perfectly well. It's just not at all what
you did.

> Here's the
> script I've got so far:
>
> #!/usr/bin/perl
>
> while (<>) {
> $input = chop($_);


perldoc -f chop
chop VARIABLE
chop( LIST )
chop Chops off the last character of a string and returns
the character chopped.

Did you bother printing $input to see what it was? It's not the line
minus the trailing newline. It's the trailing newline.

You should be using chomp anyway.

while (my $input = <>) {
chomp $input;
#etc
}
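
To make the difference concrete, here is a small sketch comparing what chop and chomp hand back for one line of the sample input. The variable names are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

my $line = "/Collection/a/b/c/d/file1\n";

# chop returns the character it removed, so $chopped holds "\n", not the path.
my $copy    = $line;
my $chopped = chop($copy);

# chomp strips a trailing newline in place and returns the number of
# characters it removed.
my $input   = $line;
my $removed = chomp($input);

printf "chop returned %s\n", $chopped eq "\n" ? 'a newline' : "'$chopped'";
print  "chomp returned $removed and left '$input'\n";
```

If chop's return value is what gets passed to split, split only ever sees "\n", which would give a count of 1 for every line.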

Regardless, use File::Basename, as another responder suggested. This
wheel has already been written.

Paul Lalli

 
Uri Guttman
      12-11-2006
>>>>> "LP" == Lew Pitcher writes:

LP> I say why use complex tools when simple tools will suffice

LP> Have you looked at the basename(1) and dirname(1) utilities?

i say why use external shell commands when File::Basename is a core
module?

uri

--
Uri Guttman ------ -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
Rich Grise
      12-11-2006
On Mon, 11 Dec 2006 12:19:55 -0800, usenet wrote:

> Rich Grise wrote:
>> A script to separate out file names from the path?

>
> The module File::Basename is part of your standard Perl distribution.


Sorry for the bother - I just did it the old way in C, which I know is
heresy for the perl group. =:-O

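/* relist.c */
/* reformats strings. */

#include <stdio.h>
#include <string.h>

char buffer[512];
char * bufp;

int main() {
    /* fgets instead of gets: the read is bounded by the buffer size */
    while (fgets(buffer, sizeof buffer, stdin)) {
        buffer[strcspn(buffer, "\n")] = '\0'; /* strip the trailing newline */
        bufp = strrchr(buffer, '/');
        /* a line with no '/' is used whole as its own ID */
        printf ("item ID = %s, data = %s\n", bufp ? bufp + 1 : buffer, buffer);
    }
    return 0;
}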

Thanks!
Rich


 
Dr.Ruud
      12-11-2006
Rich Grise schreef:

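> #include <stdio.h>
> #include <string.h>
>
> char buffer[512];
> char * bufp;
>
> int main() {
>     while (fgets(buffer, sizeof buffer, stdin)) {
>         buffer[strcspn(buffer, "\n")] = '\0'; /* strip the trailing newline */
>         bufp = strrchr(buffer, '/');
>         printf ("item ID = %s, data = %s\n", bufp ? bufp + 1 : buffer, buffer);
>     }
>     return 0;
> }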


Perl version:

while ( <> =~ m~(.+/(.+))~ ) {
printf "item ID = %s, data = %s\n", $2, $1 ;
}

--
Affijn, Ruud

"Gewoon is een tijger."
 
Tad McClellan
      12-12-2006
["Followup-To:" header set to comp.lang.perl.misc.]

Rich Grise wrote:


> Here's the input (example):
>
> /Collection/a/b/c/d/file1
> /Collection/a/b/c/d/file2
> /Collection/a/b/c/d/file3
> /Collection/a/b/c/d/file4
> /Collection/a/b/c/e/file4
> /Collection/a/b/c/e/file5
> /Collection/e/f/g/file4
> /Collection/e/f/g/file5
> /Collection/e/f/g/file6
> /Collection/e/f/g/file7
>
> And here's what I want the output to look like:
>
> file1 : /Collection/a/b/c/d/file1
> file2 : /Collection/a/b/c/d/file2
> file3 : /Collection/a/b/c/d/file3
> file4 : /Collection/a/b/c/d/file4
> file4 : /Collection/a/b/c/e/file4
> file5 : /Collection/a/b/c/e/file5
> file4 : /Collection/e/f/g/file4
> file5 : /Collection/e/f/g/file5
> file6 : /Collection/e/f/g/file6
> file7 : /Collection/e/f/g/file7



perl -pe 's/(.*\/(.*))/$2 : $1/' input.file


--
Tad McClellan SGML consulting
Perl programming
Fort Worth, Texas
 