How to structure a perl program to include and exclude files?
I'm implementing this in Perl but I recognise that there's a strong element of language independent program design in the question. Hope there's enough perlishness to keep me afloat.

I am writing a Perl program which will process a file tree and allow the user to specify which directories and subdirectories are to be included or excluded. (Anyone who uses xxcopy in Win will know immediately what I mean). I plan to have the users describe the files to include and exclude by means of strict Perl regex's. So a control file might look something like this

    include /a              # Do files in a and all subdirs
    exclude a/b/~?temp\d*   # Except for temp files in a/b

... and so on. I haven't worked out the full grammar yet (do I allow indefinite series of include..exclude..include? I don't know). But I'm having more trouble with conceptualising how to write the program in Perl. Current idea is to write a recursive function to process all the files in a single directory, calling itself for sub-directories. It would slurp in the control file regex's and sort them alphabetically into two arrays, one for "include" and one for "exclude", and then implement logic like

    $do_this_one = 0;
    foreach $regex (@includes) {
        if ($current_file =~ $regex) { $do_this_one = 1; }
    }
    foreach $regex (@excludes) {
        if ($current_file =~ $regex) { $do_this_one = 0; }
    }
    do_the_stuff() if $do_this_one;

But doing that lot for every file looks very laborious; for example if the control file is a simple "include /a and all subdirectories" then I don't want to look at the regex more than once. And it's not very Perl-ish either, come to that.

Questions:

(1) Is there a module that will help me? Or some code that I could copy?

(2) If not, is there a better way of structuring the do-we-do-this-one logic to make it more elegant and efficient?

Henry Law <>< Manchester, England
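The include/exclude test described in the question can be sketched as follows. This is only a minimal illustration, not a finished design: the sub names `parse_rules` and `is_selected` are hypothetical, and since the control-file grammar is explicitly unfinished, the sketch assumes one `include REGEX` or `exclude REGEX` rule per line, with `#` starting a comment. Compiling each pattern once with `qr//` avoids recompiling it for every file.

```perl
use strict;
use warnings;

# Parse control-file lines of the assumed form
#     include REGEX   # optional comment
#     exclude REGEX   # optional comment
# and return two array refs of precompiled patterns.
sub parse_rules {
    my @lines = @_;
    my (@includes, @excludes);
    for my $line (@lines) {
        # Assumption: '#' at line start or after whitespace begins a comment.
        $line =~ s/(?:^|\s+)#.*$//;
        next unless $line =~ /\S/;     # skip blank and comment-only lines
        my ($verb, $pattern) = $line =~ /^\s*(include|exclude)\s+(\S.*?)\s*$/
            or die "Cannot parse rule: $line\n";
        my $re = qr/$pattern/;         # compile once, reuse for every file
        if ($verb eq 'include') { push @includes, $re }
        else                    { push @excludes, $re }
    }
    return (\@includes, \@excludes);
}

# A file is selected when it matches at least one include pattern and
# no exclude pattern, so excludes win, as in the loop in the question.
sub is_selected {
    my ($path, $includes, $excludes) = @_;
    return 0 unless grep { $path =~ $_ } @$includes;
    return 0 if     grep { $path =~ $_ } @$excludes;
    return 1;
}

# Usage with the two example rules from the question.
my ($inc, $exc) = parse_rules(
    'include /a              # Do files in a and all subdirs',
    'exclude a/b/~?temp\d*   # Except for temp files in a/b',
);
for my $path ('/a/x.txt', '/a/b/temp1', '/c/y.txt') {
    printf "%-12s %s\n", $path,
        is_selected($path, $inc, $exc) ? 'process' : 'skip';
}
```

With these rules only `/a/x.txt` is reported as `process`: `/a/b/temp1` matches the exclude pattern and `/c/y.txt` matches no include pattern.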
Re: How to structure a perl program to include and exclude files?
Henry Law wrote:
> I am writing a Perl program which will process a file tree and
> allow the user to specify which directories and subdirectories are
> to be included or excluded. (Anyone who uses xxcopy in Win will
> know immediately what I mean). I plan to have the users describe
> the files to include and exclude by means of strict Perl regex's.
> So a control file might look something like this
>
> include /a # Do files in a and all subdirs
> exclude a/b/~?temp\d* # Except for temp files in a/b
>
> ... and so on. I haven't worked out the full grammar yet (do I
> allow indefinite series of include..exclude..include? I don't
> know). But I'm having more trouble with conceptualising how to
> write the program in Perl. Current idea is to write a recursive
> function to process all the files in a single directory, calling
> itself for sub-directories.

Why not use File::Find?

    use File::Find 'find';
    find( sub {
        local $_ = $File::Find::name;
        push @found, $_ if /$include/ and !/$exclude/
    }, $path );

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
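The File::Find suggestion above can be fleshed out into a self-contained sketch. The helper names `combine` and `select_files` are hypothetical. Two things go beyond the reply and are assumptions: the pattern lists are joined into one alternation each, so every path needs one match per list, and `$File::Find::prune` is set for excluded directories on the assumption that excluding a directory is meant to exclude its whole subtree (which would not hold for an exclude pattern anchored with `$`).

```perl
use strict;
use warnings;
use File::Find 'find';

# Join several pattern strings into one precompiled alternation, so each
# path needs one match per list instead of one match per pattern.
sub combine {
    my @patterns = @_;
    return qr/(?!)/ unless @patterns;    # empty list: a pattern that never matches
    my $alternation = join '|', map { "(?:$_)" } @patterns;
    return qr/$alternation/;
}

# Walk $top and return the sorted paths of non-directory entries that
# match $include and do not match $exclude.
sub select_files {
    my ($top, $include, $exclude) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # stay in the current directory while walking
            wanted   => sub {
                my $path = $File::Find::name;
                if (-d $path) {
                    # Assumption: an excluded directory excludes its whole
                    # subtree, so there is no need to descend into it.
                    $File::Find::prune = 1 if $path =~ $exclude;
                    return;
                }
                push @found, $path
                    if $path =~ $include and $path !~ $exclude;
            },
        },
        $top,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Usage: pass the top directory on the command line. The patterns are
# hypothetical examples in the style of the control file in the question.
if (@ARGV) {
    my $include = combine('/a');
    my $exclude = combine('a/b/~?temp\d*', '\.bak$');
    print "$_\n" for select_files($ARGV[0], $include, $exclude);
}
```

Pruning is what addresses the efficiency worry: once a directory is excluded, none of the files below it are tested at all.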
Re: How to structure a perl program to include and exclude files?
Henry Law <lawshouse.public@btconnect.com> wrote in message news:<r6flf0p3ujkeb2dck5rbnjoa64pvp1gug6@4ax.com>...
> I'm implementing this in Perl but I recognise that there's a strong
> element of language independent program design in the question. Hope
> there's enough perlishness to keep me afloat.
>
> I am writing a Perl program which will process a file tree and allow
> the user to specify which directories and subdirectories are to be
> included or excluded. (Anyone who uses xxcopy in Win will know
> immediately what I mean). I plan to have the users describe the files
> to include and exclude by means of strict Perl regex's. So a control
> file might look something like this
>
> include /a # Do files in a and all subdirs
> exclude a/b/~?temp\d* # Except for temp files in a/b
>
> ... and so on. I haven't worked out the full grammar yet (do I allow
> indefinite series of include..exclude..include? I don't know). But
> I'm having more trouble with conceptualising how to write the program
> in Perl. Current idea is to write a recursive function to process all
> the files in a single directory, calling itself for sub-directories.

Henry,

I think I can give you some advice on this issue since I've been thinking about it for many years. Even though I'm extremely knowledgeable about XXCOPY, I'm not sure exactly what you are trying to do. Are you trying to create a Perl script so that something similar to XXCOPY can be made available in Linux (or other) environments?

Currently, XXCOPY's support for inclusion is very limited: it accepts only variations in the "last name" (e.g., /IN:*.mp3 /IN:*.doc /IN:abc*). Other than this exception, XXCOPY's file-selection mechanisms are all exclusive in nature.

There is good reason for this design. Exclusion specifiers (in the form of date-range specifications and filesize specifications, in addition to file/directory pattern specifications) can all be treated in an additive manner. As long as the file-selection parameters (switches in the XXCOPY command line) are exclusive in nature, both the implementation and user understanding are very easy. Similar or dissimilar file-selection switches won't contradict each other. They can overlap (some files can be excluded for two or more reasons).

On the other hand, if you design command rules that allow both exclusion and inclusion, you really have to decide which one will have precedence over the other, since they are contradictory in nature (not only in the definition of the command rule, but also for user understanding).

I think it is helpful to verbalize what you are trying to do in plain English. If you can express what you (the user) want to do and how you (the programmer) will implement and document the program actions in plain English with clarity, you may proceed. But if you are confused about what you are trying to achieve, you can't program it, regardless of the language you choose.

Let me go back to how XXCOPY presents its capability with regard to inclusion and exclusion. The truth is that the inclusion feature in XXCOPY is really an exclusion operation in disguise.

1. If there is no inclusion switch (/IN:...), XXCOPY will not exclude anything.

       xxcopy \src_dir\ ...

   This is equivalent to

       xxcopy \src_dir\*

   which is really

       xxcopy \src_dir\ /IN:*

2. If the source specifier contains the lastname pattern,

       xxcopy \src_dir\*.mp3

   this is equivalent to

       xxcopy \src_dir\ /X:(everything except *.mp3)

3. If the command contains two or more inclusion specifiers,

       xxcopy \src_dir\ /IN:*.mp3 /IN:*.jpg

   this is equivalent to

       xxcopy \src_dir\ /X:(everything except *.mp3 and *.jpg)

-------------

The above examples illustrate how XXCOPY transforms the inclusion specifiers into exclusion actions inside. As a matter of fact, the date specifier, the size specifier and all other forms of file-selection mechanisms are treated as exclusionary actions, which can easily be implemented as "filters" here and there inside the program. Since exclusion actions can be applied repeatedly without concern for precedence, the implementation is quite simple and the documentation is also straightforward.

The reason XXCOPY does not support something as simple as a "list of filenames to process" in a text file is that it is really an unrestricted form of inclusion. This may not go well with XXCOPY's one-source, one-destination view of file management operations.

In the future, we plan to implement a full inclusion feature (even an "inclusion list" supplied as a text file) in XXCOPY. When we do support such a feature, we plan to resolve the inclusion-exclusion precedence as follows:

1. Gather all inclusion specifiers (list of files and directories) first and define what will be included (this can even be thought of as an exclusion list in reverse).

2. Apply all other (exclusionary) specifiers next.

This will give the exclusion specifiers the precedence. Note that precedence in this context does not mean which one will be evaluated first. Rather, the last one to be evaluated will prevail (have the lasting effect). Therefore, in this case, the exclusion specifiers will override the inclusion specifiers.

Here, I think the rules are clear. When exclusion and inclusion are mixed, unless you simplify the way they are treated, the user will be totally confused, you the designer will be confused, and you will not have a working program whose behavior makes sense to anyone.

I'm not necessarily offering this as advice for making a product for sale, which requires formal documentation. Even if this project is for your own personal use, you as the programmer and you as the user have to come to a clear understanding.

When you start talking about "recursion" in the design of inclusion and exclusion, I think you are clouding your thoughts. Give one of the two unconditional precedence over the other. Otherwise, you may never make something concrete out of your nebulous idea.

Kan Yabumoto,
The author of XXCopy
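The "inclusion is exclusion in disguise" idea described above can be sketched in Perl. This is an illustration of the principle only, not XXCOPY's actual implementation. All names (`glob_to_regex`, `include_as_exclusion`, `apply_filters`), the file records with `name` and `size` keys, and the case-insensitive wildcard matching are assumptions made for the example. Every filter is a code ref that returns true when a file must be excluded, so the filters are additive and their order does not matter.

```perl
use strict;
use warnings;

# Minimal wildcard translation: '*' matches any run of characters,
# '?' matches one character, everything else is literal.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = join '', map {
        $_ eq '*' ? '.*' : $_ eq '?' ? '.' : quotemeta($_)
    } split //, $glob;
    return qr/\A$re\z/i;    # assumption: case-insensitive, as on Windows
}

# Turn inclusion patterns such as '*.mp3' into ONE exclusion filter:
# "exclude everything that matches none of them".
sub include_as_exclusion {
    my @globs = @_;
    return sub { 0 } unless @globs;    # no inclusion switch: exclude nothing
    my @res = map { glob_to_regex($_) } @globs;
    return sub {
        my ($file) = @_;
        my $hits = grep { $file->{name} =~ $_ } @res;
        return $hits == 0;
    };
}

# Keep the files that no filter excludes. Each filter can only remove
# files, so applying them in any order gives the same result.
sub apply_filters {
    my ($files, @filters) = @_;
    return grep {
        my $file     = $_;
        my $excluded = grep { $_->($file) } @filters;
        !$excluded;
    } @$files;
}

# Usage with hypothetical file records.
my @files = (
    { name => 'song.mp3',  size =>  4_000_000 },
    { name => 'photo.jpg', size =>    900_000 },
    { name => 'notes.txt', size =>      2_000 },
    { name => 'huge.mp3',  size => 90_000_000 },
);
my @filters = (
    include_as_exclusion('*.mp3', '*.jpg'),    # in the spirit of /IN:*.mp3 /IN:*.jpg
    sub { $_[0]{size} > 50_000_000 },          # hypothetical size limit
);
print "$_->{name}\n" for apply_filters(\@files, @filters);
```

Here `notes.txt` is removed by the converted inclusion filter and `huge.mp3` by the size filter, leaving `song.mp3` and `photo.jpg`. Because a file survives only if no filter rejects it, exclusion always prevails over inclusion, which matches the precedence rule proposed in the post.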
Re: How to structure a perl program to include and exclude files?
On 21 Jul 2004 23:23:10 -0700, tech@xxcopy.com (Kan Yabumoto) wrote:
>Henry Law <lawshouse.public@btconnect.com> wrote in message news:<r6flf0p3ujkeb2dck5rbnjoa64pvp1gug6@4ax.com>...
>> included or excluded. (Anyone who uses xxcopy in Win will know
>> immediately what I mean). I plan to have the users describe the files

>Even though I'm extremely knowledgeable about XXCOPY, I'm not

>Kan Yabumoto,
>The author of XXCopy

Isn't usenet wonderful; I cite one of my favourite programs as an example, and the author reads my post and gives me advice!

Kan, you're absolutely right that I haven't got the basic functions clear in my mind: I need to think more about that. Your description of how you do your includes and excludes is very helpful; I had sort of got to the point where I recognised that includes and excludes can't go on indefinitely.

But this has now become positively off-topic so I'll leave it at that. To write more would be to risk Anno's or Tad's hand appearing out of the monitor in 3D and hitting me on the nose. (With justification ...)

Henry Law <>< Manchester, England