Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > check file exists with case sensitive on a case insensitive filesystem

Reply
Thread Tools

check file exists with case sensitive on a case insensitive filesystem

 
 
Xah Lee
Guest
Posts: n/a
 
      04-05-2009
been a while i coded in perl...

how to check if a file exists with case sensitiveness on os x?

a quick google search says call system ls & grep. That seems too slow.
I have some over 10 thousand files to check.

Xah
http://xahlee.org/


 
Reply With Quote
 
 
 
 
Uri Guttman
Guest
Posts: n/a
 
      04-05-2009
>>>>> "XL" == Xah Lee <(E-Mail Removed)> writes:

XL> been a while i coded in perl...
XL> how to check if a file exists with case sensitiveness on os x?

XL> a quick google search says call system ls & grep. That seems too slow.
XL> I have some over 10 thousand files to check.

sorry we don't help self declared delusional geniuses who denigrate perl
in every insane cross post they make. my suggestion is to rtfm. or do it
in assembler. or write your own filesystem and plug it in.

uri

--
Uri Guttman ------ http://www.velocityreviews.com/forums/(E-Mail Removed) -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
 
Reply With Quote
 
 
 
 
Tad J McClellan
Guest
Posts: n/a
 
      04-05-2009
Xah Lee <(E-Mail Removed)> wrote:


> been a while i coded in perl...



Thank you.


> how to check if a file exists with case sensitiveness on os x?
>
> a quick google search says call system ls & grep. That seems too slow.
> I have some over 10 thousand files to check.



Bummer Dude.


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
 
Reply With Quote
 
A. Sinan Unur
Guest
Posts: n/a
 
      04-05-2009
Xah Lee <(E-Mail Removed)> wrote in news:a92be1ae-a933-49e3-9ea6-
(E-Mail Removed):

> been a while i coded in perl...
>
> how to check if a file exists with case sensitiveness on os x?
>
> a quick google search says call system ls & grep. That seems too slow.
> I have some over 10 thousand files to check.


How is that too slow?

On Windows XP:

C:\Temp\test> dir
....
38077 File(s) 466 bytes

C:\Temp\test> cat check.bat
ls | grep %1

C:\Temp\test> timethis check ZZCTWD8DIS

C:\Temp\test> ls | grep ZZCTWD8DIS

TimeThis : Elapsed Time : 00:00:00.375

C:\Temp\test> dir ZZCTWD8DIS

2009/04/05 01:21 PM 0 zzctwd8dis

Less than half a second for 38,077 files. Of course, if you wanted to
use this in a loop, it would get very slow very fast.

Using Perl functions to replace ls and grep increases performance to 6
or 7 checks per second (against the same set of 38,077 randomly
generated filenames). I am not providing details because someone of your
stature should be able to write the appropriate benchmarks with no
effort.

Finally, if the contents of the directory are fixed, and you just want
to check the existence of many files case sensitively, just do it the
Perl way: Use a hash.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
 
Reply With Quote
 
sln@netherlands.com
Guest
Posts: n/a
 
      04-05-2009
On Sun, 5 Apr 2009 08:18:49 -0700 (PDT), Xah Lee <(E-Mail Removed)> wrote:

>been a while i coded in perl...
>
>how to check if a file exists with case sensitiveness on os x?
>
>a quick google search says call system ls & grep. That seems too slow.
>I have some over 10 thousand files to check.
>
> Xah
>? http://xahlee.org/
>
>?


This will do it for you Zah Zah.
Somewhere in here is the answer to your problem.

-sln

-----------------------------------------
#!/usr/bin/perl

package SMG;
use strict;

use File::stat;
use File:ath;
use File::Copy;
use File::Find;
use File::Spec;
use sort 'stable';


require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(SafeMerge GetCommonElementsNxM GetCommonElements);
my $VERSION = 1.00;


#my $current = sort::current();



################################################## ##################################
# SAFE MERGE
# The two major options -
# 1. Merge 2 directories into 1, preserving/renaming duplicate named files.
# 2. Compare 2 directories and just rename the duplicates of 1 directory.
# Also -
# - The prefix level will be on a per-file monitored basis.
# - File/Dir names are cached and renamed in the "From" array,
# - Move is done via os-rename function into the "ToDir".
# - Remove "From" directory option (after rename).
# - Duplicate's-renaming-only is done in-place in the "FromDir".
# - Report output is created in the current directory (for now).
# - Itteration scheme are within-directory, then cross-directory until no more dups.
#-----------------------------------------------------------------------------------
sub SafeMerge ($$$$$$$)
{
return 0 if (@_ < 2 || @_ > 7);
my ($ToDir,$FromDir,$PrefixName,$PrefixLevel,$Exclude ,$Duponly,$RmvFrom) = @_;

$PrefixName = '' unless defined $PrefixName; # Renaming Prefix
$PrefixLevel = 0 unless defined $PrefixLevel; # Number of times to += prefix

$Exclude = [] unless defined $Exclude; # Exclude list (must be re)
$Duponly = 0 unless defined $Duponly; # Flag to just rename dups in "from"
$RmvFrom = 0 unless defined $RmvFrom; # Flag to remove "from" after operation

my @To = ();
my @From = ();
my @Xfound = ();
my $dupsize = 0;
my $passes = 0;

print "\n\n *******************************\n";
print "********[ Safe Merge ]*********\n";
print " *******************************\n";
print "\nPrefix\t$PrefixName\nLevel\t$PrefixLevel\nCheck s ... ";

$ToDir = File::Spec->canonpath( "$ToDir" );
$FromDir = File::Spec->canonpath( "$FromDir" );
#print "$ToDir\n$FromDir";
if ($ToDir eq $FromDir) { print "\rFrom/To are identical: $ToDir\n"; return 0; }
if (!-e $ToDir) { print "\rTo directory does not exist: $ToDir\n"; return 0; }
if (!-e $FromDir) { print "\rFrom directory does not exist: $FromDir\n"; return 0;}

# Glob seems to be bothered by spaces in dir path names
my $Sep = "\\".File::Spec->canonpath( "/" );
{
my $td = File::Spec->canonpath( "$ToDir/*" );
my $fd = File::Spec->canonpath( "$FromDir/*" );
if ($td =~ / /) {
$td = "'$td'";
}
if ($fd =~ / /) {
$fd = "'$fd'";
}
foreach (glob ("$td")) {
/.+$Sep(.+)$/;
# /.+\/(.+)$/;
push (@To, [$1, 0]);
}
foreach (glob ("$fd")) {
/.+$Sep(.+)$/;
# /.+\/(.+)$/;
my $ftmp = $1;
for (@{$Exclude}) {
if ($ftmp =~ /$_/) {
# save in @Xfound if $RmvFrom is set (see below)
print "\rExclude - $ftmp\n";
push (@Xfound, $ftmp) if ($RmvFrom);
$ftmp = undef;
last;
}
}
push (@From, [$ftmp, 0, '']) if defined ($ftmp);
}
}

# Will return if "From" empty. Technically not an error.
# If it is empty through exclusions (regexp filter), "From" could be rmdir'd below.
if (@From == 0) {
print "\rEmpty dir: $FromDir\n";
return 0;
}
print "OK\n";


## ================================================== ==========================
## Iterate until there are no more dups across directories after (D2) rename
##
do {
my $Dupref = GetCommonElementsNxM_HashMethod(\@To,\@From, 1, 0);
$dupsize = @{$Dupref};
$passes++;

print "\nP A S S \# $passes\n---------------------\n";
print "Found $dupsize dups:\n";

#**********************************
# Check for cross-directory dups
# -------------
if ($dupsize)
{
my $errflg = 0;
print "\n<DIR> $FromDir\n\n";
for my $ndx (@{$Dupref})
{
my $Fromref = $From[$ndx];
my $pname = $PrefixName;

print "\t$Fromref->[0] ... $PrefixName";

#*****************************************
# Rename and check for in-directory dups
# ----------
while (1)
{
# Level check
if ($Fromref->[1] >= $PrefixLevel)
{
print "$Fromref->[0] <<level $Fromref->[1] exceeded>>\n";
# Clear the name in From so this file doesn't get moved
$Fromref->[0] = "";
$errflg = 1;
last;
}
my $stmp = uc($pname.$Fromref->[0]);
my $found = 0;
for (@From) { if (uc($_->[0]) eq $stmp) { $found = 1; last; } }

if (!$found) {
# OK to rename! Dup not found in "From"
# Change the name in "From" to this new one
# so it gets moved later. Increment its level of prefix
# ---------------------
print "$Fromref->[0]";
print "\n";
if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
$Fromref->[0] = $pname.$Fromref->[0];
$Fromref->[1] += 1;
last;
}
else {
# File exist with that name, add another level of prefix.
# ---------------------
print "$PrefixName";
$pname = $pname.$PrefixName;
if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
$Fromref->[1] += 1;
}
}
}
if ($errflg) {
print "!!! Can't rename some files, raise the level or change the prefix.\n";
#return 0;
}
}
} while ($dupsize > 0);
##
## End cross-dir iteration
## =============================


#************************************************* ************
# Check for rename of duplicates in D2 (only) without moving
# ------------
if ($Duponly)
{
my $rentried = 0;
my $renok = 0;
print "\nRename only ... ";
for (@From) {
my $res = 0;
if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
$rentried++;
my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
my $fr2 = File::Spec->canonpath( "$FromDir/$_->[0]" );
if (!($res = rename ("$fr", "$fr2"))) {
print "Rename error:\t$_->[2] ... $_->[0]\n";
} else {
$renok++; }
}
}
if ($rentried) { print "$renok out of $rentried files OK\n"; }
else { print "No duplicates found\n"; }
return 1;
}

#**********************
# Move "From" into "To"
# -------------------
#print @From."@From\n";
my $sep = File::Spec->canonpath( "/" );
print "\nMoving:\t$FromDir$sep* to $ToDir$sep\n";
my $movcnt = 0;

for (@From) {
my $res = 0;
if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
$res = File::Copy::move ("$fr", "$to");
if ($res == 0) { print "Move error:\t$_->[2] ... $to\n"; }
}
if ($_->[1] == 0 && $_->[0] gt '') {
my $fr = File::Spec->canonpath( "$FromDir/$_->[0]" );
my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
$res = File::Copy::move ("$fr", "$to");
if ($res == 0) { print "Move error:\t$fr\n" }
}
$movcnt++ if ($res);
}
my $notmoved = @From - $movcnt;
print "Moved:\t$movcnt out of ".@From." files.\n";

#***************************************
# Check if "FromDir" is to be deleted
# --------------
if ($RmvFrom) {
print "Remove:\t$FromDir ... ";
if ($notmoved > 0) {
print "not deleted, contains $notmoved file(s) that couldn't be moved!\n";
}
else {
# check if @Xfound has values (and their not directories)
# if so, delete these first before trying to remove the From directory
# for now, don't want to "unlink" a directory, and we're not doing a tree here
# ----------------
for (@Xfound) {
my $fr = File::Spec->canonpath( "$FromDir/$_" );
unlink ("$fr") if (!-d $fr) ; }
my $fr = File::Spec->canonpath( "$FromDir" );
if (!rmdir "$fr") {print "$!\n" } else {print "OK\n"; }
}
}
return 1;
}


################################################## #####
# Get Common Elements (from two N-dimensioned Array's)
# IN - Refs to the NxN arrays to compare,
# sort flag and the compare field.
# OUT - Ndx's into Right_Array of matching elements
# ---------------------------------------------------
# Notes -
# 1. Elements are assumed textual and case insensitive
# 2. Ignores in-array duplicates
# 3. Sort will be done if sort flag > 0
#
sub GetCommonElementsNxM($$$$)
{
my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
$Srtflg = 0 unless defined $Srtflg;
$Fld = 0 unless defined $Fld;
# my @Dup = ();
my @Ndx = ();

if ($Srtflg > 0) {
@{$A_Left} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Left};
@{$A_Right} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Right};
} else {print "==> Common Elements : Not sorting arrays\n";}

my $rpos = 0;
my $rend = @{$A_Right};
my $cnt = 0;
my $llast = undef;
my $rlast = undef;
foreach my $left_element (@{$A_Left})
{
next if (uc($left_element->[$Fld]) eq uc($llast->[$Fld]));

$rpos += $cnt;
$cnt = 0;
foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
{
last if (uc($left_element->[$Fld]) lt uc($right_element->[$Fld]));
$cnt++;
next if (uc($right_element->[$Fld]) eq uc($rlast->[$Fld]));
if (uc($left_element->[$Fld]) eq uc($right_element->[$Fld]))
{
# push (@Dup, $right_element->[$Fld]); # the string
push (@Ndx, $rpos+$cnt-1); # the index into R_Array
last;
}
$rlast = $right_element;
}
$llast = $left_element;
last if ($rpos >= $rend);
}
# return (\@Dup);
return (\@Ndx);
}

sub GetCommonElementsNxM_HashMethod($$$$)
{
my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
$Srtflg = 0 unless defined $Srtflg;
$Fld = 0 unless defined $Fld;

my ($element,%count);
#my @Dup = ();
my @Ndx = ();
my $idx = 0;

foreach $element (@{$A_Left}) {
$count{uc $element->[$Fld]} = 1;
}
foreach $element (@{$A_Right}) {
if ($count{uc $element->[$Fld]} == 1) {
#push @Dup, $element->[$Fld];
push @Ndx, $idx;
$count{$element->[$Fld]}++;
}
$idx++;
}
# return (\@Dup);
return (\@Ndx);
}


################################################## #####
# Get Common Elements from single Array's
# IN - Refs to the Nx1 arrays to compare, sort flag
# OUT - Ndx's into Right_Array of matching elements
# ---------------------------------------------------
# Notes -
# 1. Elements are assumed textual and case insensitive
# 2. Ignores in-array duplicates
# 3. Sort will be done if sort flag > 0
################################################## #####
sub GetCommonElements($$$)
{
my ($A_Left,$A_Right,$Srtflg) = @_;
$Srtflg = 0 unless defined $Srtflg;
# my @Dup = ();
my @Ndx = ();

if ($Srtflg > 0) {
@{$A_Left} = sort {uc($a) cmp uc($b)} @{$A_Left};
@{$A_Right} = sort {uc($a) cmp uc($b)} @{$A_Right};
} else {print "==> Common Elements : Not sorting arrays\n";}

my $rpos = 0;
my $rend = @{$A_Right};
my $cnt = 0;
my $llast = '';
my $rlast = '';
foreach my $left_element (@{$A_Left})
{
next if (uc($left_element) eq uc($llast));

$rpos += $cnt;
$cnt = 0;
foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
{
last if (uc($left_element) lt uc($right_element));
$cnt++;
next if (uc($right_element) eq uc($rlast));
if (uc($left_element) eq uc($right_element))
{
# push (@Dup, $right_element); # the string
push (@Ndx, $rpos+$cnt-1); # the index into R_Array
last;
}
$rlast = $right_element;
}
$llast = $left_element;
last if ($rpos >= $rend);
}
# return (\@Dup);
return (\@Ndx);
}

sub GetCommonElements_HashMethod($$$$)
{
my ($A_Left,$A_Right,$Srtflg) = @_;
$Srtflg = 0 unless defined $Srtflg;

my ($element,%count);
#my @Dup = ();
my @Ndx = ();
my $idx = 0;

foreach $element (@{$A_Left}) {
$count{uc $element} = 1;
}
foreach $element (@{$A_Right}) {
if ($count{uc $element} == 1) {
#push @Dup, $element;
push @Ndx, $idx;
$count{$element}++;
}
$idx++;
}
# return (\@Dup);
return (\@Ndx);
}

1;


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
How to check if a directory exists? folder.exists() does not work! Ulf Meinhardt Java 8 08-28-2009 12:26 PM
case insensitive find on case sensitive stl map benhoefer@gmail.com C++ 1 04-06-2007 08:42 PM
case sensitive / insensitive string equality Stephanie ASP General 2 10-03-2005 01:23 PM
File.exists() -- case-insensitive N.V.Dev Java 10 08-10-2004 10:33 PM
how to case select with case-insensitive string ? Tee ASP .Net 3 06-23-2004 07:40 PM



Advertisments