question
Still trying to learn Python and was hoping some of you might be able to give me some advice on my code below. It works but I'm wondering if there are more efficient ways of doing this than the way I've done it.

The first step is I have to loop thru nested folders and find all files that start with "LOGO". So I use Unix's 'find' and write all the paths out to a text file like so:

    find ~/Desktop/logoFiles/ -type f -iname "LOGO*" > ~/Desktop/logo_stuff/paths.txt

Next, I use Python (code below) to look thru each line of that file (paths.txt). While looping thru each line, I want to first make sure that the folder name that the file is in is a number. For example, if it found file 'LOGO9012', then I'd want to make sure the folder name that this file was in was a number (~/Desktop/logoFiles/12/LOGO9012). If that's ok then I grab the number off the file name. If there's not a number in the file name then I ignore it.

I then create a dictionary that has a key which equals the number I grabbed from the file name, and the value contains the full file path (that I grabbed out of the text file) followed by a delimiter (<::>) followed by 'LOGO' and the number I got from the file name (making sure to keep any leading zeroes if they were there originally). Finally, I sort the keys in numerical order and save out a file with all the appropriate info from my dictionary (which will be a list of paths followed by a delimiter and file name). Here's an example of what this new file might contain:

    /Users/jyoung1/Desktop/logoFiles/02/Logo002<::>LOGO002
    /Users/jyoung1/Desktop/logoFiles/02/LOGO102<::>LOGO102
    /Users/jyoung1/Desktop/logoFiles/02/LOGO302<::>LOGO302
    /Users/jyoung1/Desktop/logoFiles/9/LOGO462.2PMS<::>LOGO462

Anyway, if anyone has time to look at this I'd appreciate your thoughts. Thanks!

Jay

    #!/usr/bin/python

    import re

    dList = {}
    sortedList = ""

    pathFile = open("~/Desktop/logo_stuff/paths.txt", "r")

    for x in pathFile:
        if len(re.findall(r"^\d+$", x.split("/")[-2])) > 0: #make sure folder name is a number
            n = re.findall(r"\d+", x.split("/")[-1]) #Grab number off file name
            if len(n) > 0: dList[int(n[0])] = x[:-1] + "<::>LOGO" + n[0] + "\n"

    pathFile.close()

    keyList = dList.keys()
    keyList.sort()

    for x in keyList:
        sortedList += dList[x]

    newFile = open("~/Desktop/logo_stuff/sortedPaths.txt", "w")
    newFile.write(sortedList)
    newFile.close()

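The per-line rule described above (numeric parent folder, first run of digits in the file name, leading zeroes kept in the label) can be pulled out into a small function, which makes it easy to try against the sample paths. This is only a sketch: the name `logo_entry` is made up for illustration, and it assumes POSIX-style paths, one per line, as 'find' writes them.

```python
import os
import re


def logo_entry(line):
    """Return (number, output_line) for one path from paths.txt, or None to skip it."""
    path = line.rstrip("\n")
    folder = os.path.basename(os.path.dirname(path))  # name of the containing folder
    name = os.path.basename(path)                     # file name only
    if not folder.isdigit():                          # folder name must be a number
        return None
    match = re.search(r"\d+", name)                   # first run of digits in the file name
    if match is None:                                 # no number in the name: ignore the file
        return None
    digits = match.group()                            # kept as text so leading zeroes survive
    return int(digits), "%s<::>LOGO%s\n" % (path, digits)
```

The integer in the returned pair is what gets sorted on; the string is the line that goes into the output file. A path whose folder is not a number, or whose file name has no digits, comes back as None and can simply be skipped.
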
Re: question
On Aug 24, 7:20 pm, JYOUN...@kc.rr.com wrote:
> Still trying to learn Python and was hoping some of you might be able
> to give me some advice on my code below. It works but I'm wondering if
> there are more efficient ways of doing this than the way I've done it.
>
> The first step is I have to loop thru nested folders and find all
> files that start with "LOGO". So I use Unix's 'find' and write all the
> paths out to a text file like so:
>
> find ~/Desktop/logoFiles/ -type f -iname "LOGO*" > ~/Desktop/logo_stuff/paths.txt
>
> Next, I use Python (code below) to look thru each line of that file
> (paths.txt). While looping thru each line, I want to first make sure
> that the folder name that the file is in is a number. For example, if
> it found file 'LOGO9012', then I'd want to make sure the folder name
> that this file was in was a number (~/Desktop/logoFiles/12/LOGO9012).
> If that's ok then I grab the number off the file name. If there's not
> a number in the file name then I ignore it. I then create a dictionary
> that has a key which equals the number I grabbed from the file name,
> and the value contains the full file path (that I grabbed out of the
> text file) followed by a delimiter (<::>) followed by 'LOGO' and the
> number I got from the file name (making sure to keep any leading
> zeroes if they were there originally). Finally, I sort the keys in
> numerical order and save out a file with all the appropriate info from
> my dictionary (which will be a list of paths followed by a delimiter
> and file name). Here's an example of what this new file might contain:
>
> /Users/jyoung1/Desktop/logoFiles/02/Logo002<::>LOGO002
> /Users/jyoung1/Desktop/logoFiles/02/LOGO102<::>LOGO102
> /Users/jyoung1/Desktop/logoFiles/02/LOGO302<::>LOGO302
> /Users/jyoung1/Desktop/logoFiles/9/LOGO462.2PMS<::>LOGO462
>
> Anyway, if anyone has time to look at this I'd appreciate your
> thoughts. Thanks!
>
> Jay
>
> #!/usr/bin/python
>
> import re
>
> dList = {}
> sortedList = ""
>
> pathFile = open("~/Desktop/logo_stuff/paths.txt", "r")
>
> for x in pathFile:
>     if len(re.findall(r"^\d+$", x.split("/")[-2])) > 0: #make sure folder name is a number
>         n = re.findall(r"\d+", x.split("/")[-1]) #Grab number off file name
>         if len(n) > 0: dList[int(n[0])] = x[:-1] + "<::>LOGO" + n[0] + "\n"
>
> pathFile.close()
>
> keyList = dList.keys()
> keyList.sort()
>
> for x in keyList:
>     sortedList += dList[x]
>
> newFile = open("~/Desktop/logo_stuff/sortedPaths.txt", "w")
> newFile.write(sortedList)
> newFile.close()

Here's my attempt:

    #!/usr/bin/python

    import os
    import re

    # Search for the files. Python does not expand "~" in paths by itself,
    # so expand it explicitly.
    folder = os.path.expanduser("~/Desktop/logoFiles/")
    pathList = []
    # os.walk() descends into the nested folders, like 'find' does
    for dirpath, dirnames, filenames in os.walk(folder):
        for name in filenames:
            # If you want it to be case-insensitive then use name.upper().startswith("LOGO")
            if name.startswith("LOGO"):
                pathList.append(os.path.join(dirpath, name))

    dList = {}
    for x in pathList:
        parts = x.split("/")
        if parts[-2].isdigit(): # Make sure folder name is a number
            n = re.findall(r"\d+", parts[-1]) # Grab first number off file name
            if n:
                dList[int(n[0])] = "%s<::>LOGO%s\n" % (x, n[0])

    keyList = sorted(dList) # Keys in numerical order

    sortedList = "".join(dList[x] for x in keyList)

    newFile = open(os.path.expanduser("~/Desktop/logo_stuff/sortedPaths.txt"), "w")
    newFile.write(sortedList)
    newFile.close()

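One thing to watch in both versions: the dictionary is keyed by the number alone, so two files that carry the same number (say a LOGO102 in folder 02 and another LOGO102 in folder 9) overwrite each other and only one of them reaches the output. If that matters, a list of (number, line) pairs sorted on the number keeps every file. A minimal sketch, assuming the paths have already been collected into a list; `sorted_lines` is a hypothetical name, not something from the code above:

```python
import re


def sorted_lines(paths):
    """Build the '<path><::>LOGO<number>' lines in numerical order, keeping duplicates."""
    entries = []
    for path in paths:
        parts = path.split("/")
        if len(parts) < 2 or not parts[-2].isdigit():  # folder name must be a number
            continue
        found = re.search(r"\d+", parts[-1])           # first number in the file name
        if found:
            digits = found.group()                     # text form keeps leading zeroes
            entries.append((int(digits), "%s<::>LOGO%s\n" % (path, digits)))
    # Sort on the number only; the sort is stable, so files that share a
    # number stay in the order they were found.
    entries.sort(key=lambda entry: entry[0])
    return [line for _, line in entries]
```

Writing the result out is then a single `newFile.writelines(sorted_lines(pathList))` in place of building one big string.
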