Thanks comp.lang.python!!!

 
 
hokiegal99
 
      07-20-2003
While migrating a lot of Mac OS9 machines to PCs, we encountered several
problems with Mac file and directory names. Below is a list of the problems:

1. Macs allow characters in file and dir names that are not acceptable
on PCs. Specifically this set of characters [*<>?\|/]

2. Mac files and dirs that contained a "/" in their names would ftp to
the server OK, but the "/" would be translated to "%2f". So, a Mac file
named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
especially when there are hundreds of them.

3. The last problem was spaces at the beginning and ending of file and
dir names. We encountered hundreds of files and dirs like this on the
Macs. They would ftp up to the Linux server OK but when the Windows PC
attempted to d/l them, the ftp transaction would stop and complain about
not finding files whenever it tried to transfer a file. Dirs with spaces
at the beginning or ending would literally crash the ftp client.

These were problems that we did not expect. So, we wrote a script to
clean up these names. Since all of the files were being uploaded to a
Linux ftp server, we decided to do the cleaning there. Python is a
simple, easily readable programming language, so we chose to use it.
Long story short, attached to this email is the script. If anyone can
use it to address Mac to PC migrations, feel free to.

The only caveat is that the script uses os.walk. I don't think Python
2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
extensively with this script w/o any problems. And, someone here on
comp.lang.python told me that os.walk could be incorporated into 2.2.x
too, but I never tried to do that as 2.3b2 worked just fine.
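
For anyone stuck on an interpreter without os.walk, the traversal it performs can be approximated by hand. The following is only a sketch, written for Python 3 rather than 2.2.x, and it is not the standard library's implementation. The name simple_walk is hypothetical. It yields the same (root, dirs, files) triples, top-down, using nothing but os.listdir and os.path:

```python
import os

def simple_walk(top):
    """Yield (root, dirs, files) for top and every directory below it."""
    dirs, files = [], []
    for name in sorted(os.listdir(top)):
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    # Report this directory first (top-down order), then descend.
    yield top, dirs, files
    for name in dirs:
        path = os.path.join(top, name)
        # Like os.walk by default, do not descend into symlinked directories.
        if not os.path.islink(path):
            yield from simple_walk(path)
```

It is used the same way as os.walk: for root, dirs, files in simple_walk(setpath).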

Thanks to everyone who contributed to this script. Much of it is
straight from advice that I received here. Also, if anyone sees how it
can be improved, let me know. For now, I'm satisfied with it as it works
"well enough" for what I need it to do. However, I'm trying to become a
better programmer, so I would appreciate feedback from those who are much
more experienced than I am.

Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
as they wrote much of the code initially and gave a lot of great tips!!!

print " "
import os, re, string
setpath = raw_input("Path to the Directory Where Mac Directory & Filenames Need to be Made Sane: ")
print " "
print "--- Replace Bad Characters in Directory Names & Filenames ---"
print " "
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|[*?<>/\|\\]') #search for these bad chars.
    for root, dirs, files in os.walk(setpath):
        for dir in dirs:
            badchars = bad.findall(dir) # find all bad chars.
            newdir = dir
            for badchar in badchars: # loop through each character in badchars
                print "replaced: ",badchar," in dir ",newdir," ",
                newdir = newdir.replace(badchar,'-') #replace bad chars.
                print newdir
            if newdir != dir: # If there were any bad characters in the name, do this:
                newpath = os.path.join(root,newdir)
                oldpath = os.path.join(root,dir)
                os.rename(oldpath,newpath)
    for root, dirs, files in os.walk(setpath):
        for file in files:
            badchars = bad.findall(file) # find all bad chars.
            newfile = file
            for badchar in badchars: # loop through each character in badchars
                print "replaced: ",badchar," in file ",newfile," ",
                newfile = newfile.replace(badchar,'-') #replace bad chars.
                print newfile
            if newfile != file: # If there were any bad characters in the name, do this:
                newpath = os.path.join(root,newfile)
                oldpath = os.path.join(root,file)
                os.rename(oldpath,newpath)
clean_names(setpath) #1
clean_names(setpath) #2
clean_names(setpath) #3
clean_names(setpath) #4
clean_names(setpath) #5 Be recursive so program gets sub dirs too.
clean_names(setpath) #6 You may add or remove as many of these as you need.
clean_names(setpath) #7
clean_names(setpath) #8
clean_names(setpath) #9
clean_names(setpath) #10
print " "
print "--- Done ---"
print " "
print "--- Remove Spaces from Beginning and Ending of Directory Names & Filenames ---"
print " "
def clean_spaces(setpath):
    for root, dirs, files in os.walk(setpath):
        for dir in dirs:
            old_dname = dir #original name of dir as it exists in the filesystem.
            new_dname = old_dname.strip() #new name of dir with spaces stripped from beginning and ending.
            if new_dname != old_dname: #if space(s) are found...
                print "removed spaces from dir:",old_dname #show user what's going on.
                newpath = os.path.join(root,new_dname) #declare new path.
                oldpath = os.path.join(root,old_dname) #declare old path.
                os.rename(oldpath,newpath) #rename dir without spaces.
    for root, dirs, files in os.walk(setpath):
        for file in files:
            old_fname = file #original name of file as it exists in the filesystem.
            new_fname = old_fname.strip() #new name of file with spaces stripped from beginning and ending.
            if new_fname != old_fname: #if space(s) are found...
                print "removed spaces from file:",old_fname #show user what's going on.
                newpath = os.path.join(root,new_fname) #declare new path.
                oldpath = os.path.join(root,old_fname) #declare old path.
                os.rename(oldpath,newpath) #rename file without spaces
clean_spaces(setpath) #1
clean_spaces(setpath) #2
clean_spaces(setpath) #3
clean_spaces(setpath) #4
clean_spaces(setpath) #5 Be recursive as it doesn't hurt anything, although it's probably not needed for files.
clean_spaces(setpath) #6
clean_spaces(setpath) #7
clean_spaces(setpath) #8
clean_spaces(setpath) #9
clean_spaces(setpath) #10
print " "
print "--- Done --- "
print " "
print "--- This program was written by Brad Tilley ---"
print " "
 
Andy Jewell
 
      07-20-2003
On Sunday 20 Jul 2003 4:40 am, hokiegal99 wrote:
> While migrating a lot of Mac OS9 machines to PCs, we encountered several
> problems with Mac file and directory names. Below is a list of the
> problems:
>
> 1. Macs allow characters in file and dir names that are not acceptable
> on PCs. Specifically this set of characters [*<>?\|/]
>
> 2. Mac files and dirs that contained a "/" in their names would ftp to
> the server OK, but the "/" would be translated to "%2f". So, a Mac file
> named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
> especially when there are hundreds of them.
>
> 3. The last problem was spaces at the beginning and ending of file and
> dir names. We encountered hundreds of files and dirs like this on the
> Macs. They would ftp up to the Linux server OK but when the Windows PC
> attempted to d/l them, the ftp transaction would stop and complain about
> not finding files whenever it tried to transfer a file. Dirs with spaces
> at the beginning or ending would literally crash the ftp client.
>


Shame on Apple for allowing subversive filenames!

> These were problems that we did not expect. So, we wrote a script to
> clean up these names. Since all of the files were bring uploaded to a
> Linux ftp server, we decided to do the cleaning there. Python is a
> simple, easily readable programming language, so we chose to use it.
> Long story short, attached to this email is the script. If anyone can
> use it to address Mac to PC migrations, feel free to.
>


You never do, until they bite you!

> The only caveat is that the script uses os.walk. I don't think Python
> 2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
> extensively with this script w/o any problems. And, someone here on
> comp.lang.python told me that os.walk could be incorporated into 2.2.x
> too, but I never tried to do that as 2.3b2 worked just fine.
>


There is os.path.walk, instead.

> Thanks to everyone who contributed to this script. Much of it is
> straight from advice that I received here. Also, if anyone sees how it
> can be improved, let me know. For now, I'm satisfied with it as it works
> "well enough" for what I need it to do, however, I'm trying to become a
> better programmer so I appreciated feedback from those who are much more
> experienced than I am.
>


You're welcome. We all come here to learn... :-)

> Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
> as they wrote much of the code initially and gave a lot of great tips!!!


:-)

Some additional comments on your source-code, if I may. The following points
will help you make your program much more efficient:

1) You'd normally place your functions in a separate section, usually at the
top of your program, rather than 'in the middle'. It will work fine this
way, but it's a bit less readable.

2) There seem to be some indentation anomalies, probably because of using a
combination of tabs and spaces. This WILL bite you sometime in the future:
best to stick to one or t'other, preferably just spaces: the convention in
Python is to indent by 4 spaces for each 'suite', or logical 'block' of
code.

3) I'm not sure you quite get the recursive bit yet! Simply calling your
function lots of times in succession doesn't cut it... all that happens is
that each time you call it, it does the same thing, effectively doing the job
10 times... What you'd have to do is call the function from *WITHIN* itself,
i.e. in the body, like:

def recurse(dir,depth=0):
    """ walk dir's subdirectories recursively, printing their name """
    # process list of files in dir...
    for entry in os.listdir(dir):
        # if the current one is a directory...
        if os.path.isdir(os.path.join(dir,entry)):
            print " "*depth+"+"+entry
            # recurse (call ourselves)
            recurse(os.path.join(dir,entry),depth+1)

** NOTE: Looking at the docs, if you use os.walk, you don't need to do the
recursion yourself, as os.walk does it for you!
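
To illustrate that note, here is a rough sketch (Python 3) that builds the same kind of indented listing as recurse() with a single os.walk loop and no explicit recursion. The helper name tree_lines is hypothetical, and the depth is simply inferred from the path of each directory relative to the starting point:

```python
import os

def tree_lines(top):
    """Return one '+name' line per directory below top, indented by depth."""
    lines = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place makes the traversal order predictable
        if root == top:
            continue  # the starting directory itself is not listed
        rel = os.path.relpath(root, top)
        depth = rel.count(os.sep)  # 0 for direct children of top
        lines.append(" " * depth + "+" + os.path.basename(root))
    return lines
```

Calling print("\n".join(tree_lines(setpath))) would then show the tree.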

4) You're still repeating yourself several times, too. You can get away with
JUST ONE os.walk() loop:

for root, dirs, files in os.walk(setpath):
    for thisfile in dirs+files:
        badchars=bad.findall(thisfile)
        newname=thisfile.strip() # strip off leading and trailing whitespace
        # replace any bad characters...
        for badchar in badchars:
            newname=newname.replace(badchar,"-")
        # rename thisfile ONLY if newname is different...
        if newname != thisfile: # check if it's changed:
            print "renaming",thisfile,newname,"in",root
            os.rename(os.path.join(root,thisfile),os.path.join(root,newname))

!! that replaces what your four for loops do... 8-0
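
One caveat with renaming during a top-down walk: once a directory has been renamed, os.walk still tries to descend into the old name, fails to list it, and by default silently skips that subtree. That is probably why repeated passes appeared to be necessary in the original script. A possible way around it, sketched here for Python 3 under stated assumptions, is to walk bottom-up with topdown=False so that children are renamed before their parents. The names sane_name and clean_tree are hypothetical, and the collision check is an addition that is not in the original script:

```python
import os
import re

# Equivalent to the pattern in the original script: FTP escapes plus
# characters that Windows rejects in file names.
BAD = re.compile(r'%2f|%25|[*?<>/|\\]')

def sane_name(name):
    """Strip surrounding whitespace and replace each bad sequence with '-'."""
    return BAD.sub('-', name.strip())

def clean_tree(top):
    """Rename entries under top in one bottom-up pass; return the renames."""
    renamed = []
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs + files:
            new = sane_name(name)
            # Skip unchanged names and names that would become empty.
            if not new or new == name:
                continue
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new)
            if os.path.exists(new_path):
                continue  # assumption: never overwrite an existing entry
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

A single call such as clean_tree(setpath) should then cover every level of the tree, and the returned list can be printed to show what was changed.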

hope you find this useful

-andyj


 