Velocity Reviews - Computer Hardware Reviews

Comparing directory contents

 
 
dave davidson
      08-03-2005
Hi all,

I work in the SCM dept of a Windows software shop. A typical software build
involves us getting the code from an engineer, compiling the binaries, gathering
any support files, and then wrapping it in an installer (Installshield). We run
the installer to make sure everything looks ok. As a quick-and-dirty sanity check
to make sure we got everything, we go into the install folder, do a 'dir /s',
and pipe the output to a text file. If the file list in the current build
matches the file list of the previous, we give it the ok. These lists are saved
on disk, printed and filed with the build paperwork so we can refer to them
again if necessary.

This method works surprisingly well for catching files that were mistakenly
excluded, but as you can imagine it gets very tedious and error-prone since we
have to hand-check the output. Additionally, many times we are asked by the
engineer to include additional support files, or remove existing ones. I'm
thinking there must be a better way, or better yet, a Ruby Way.
I am relatively new to the language, so I don't really know which angle to
attack it from. The basic gist would be to read in the previous file list
output, strip any junk (extra spaces, line breaks, etc), and do the same for the
current, so what's left is two lists of just pure filenames (don't care about
timestamps or attributes right now). The script would process the lists and the
result would be something like "Identical" or "Extra files: [filenames]" or
"Removed files: [filenames]".

I'm wondering if something like this already exists. A search of rubyforge and
RAA, however, did not turn up anything this specific, although I really wasn't
sure what I should be looking for. If I could be pointed to a base library that
would get me going, that would be great. Any insights on implementation would
also be greatly appreciated. Thanks!
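
The gist described in this post (reduce each saved listing to bare file names, then report the difference) might be sketched roughly as below. This is only a sketch: it assumes US-English `dir /s` output, where each folder starts with a " Directory of ..." line and each file line has the shape "MM/DD/YYYY  HH:MM AM  size  name". The names `filenames_from_listing` and `summarize` are made up for illustration.

```ruby
# Assumed line shapes of `dir /s` output (US-English locale only).
DIR_HEADER = /\A\s*Directory of (.+?)\s*\z/
FILE_LINE  = /\A\d{2}\/\d{2}\/\d{4}\s+\d{2}:\d{2}\s+[AP]M\s+[\d,]+\s+(.+?)\s*\z/

# Returns the file paths in a saved listing, relative to `root`.
# "<DIR>" entries and summary lines do not match FILE_LINE and are skipped.
def filenames_from_listing(text, root)
  current_dir = nil
  names = []
  text.each_line do |line|
    if (m = DIR_HEADER.match(line))
      current_dir = m[1]
    elsif current_dir && (m = FILE_LINE.match(line))
      full = "#{current_dir}\\#{m[1]}"
      names << full.sub(/\A#{Regexp.escape(root)}\\?/i, '')
    end
  end
  names.sort
end

# Builds the "Identical" / "Extra files" / "Removed files" message.
def summarize(previous, current)
  extra   = current - previous
  removed = previous - current
  return 'Identical' if extra.empty? && removed.empty?
  lines = []
  lines << "Extra files: #{extra.join(', ')}" unless extra.empty?
  lines << "Removed files: #{removed.join(', ')}" unless removed.empty?
  lines.join("\n")
end

previous_listing = <<~'LISTING'
   Directory of C:\install

  08/03/2005  10:15 AM    <DIR>          .
  08/03/2005  10:15 AM            12,345 app.exe
  08/03/2005  10:15 AM             2,048 readme.txt
LISTING

current_listing = <<~'LISTING'
   Directory of C:\install

  08/04/2005  09:00 AM            12,345 app.exe

   Directory of C:\install\bin

  08/04/2005  09:00 AM             4,096 tool.dll
LISTING

previous = filenames_from_listing(previous_listing, 'C:\install')
current  = filenames_from_listing(current_listing, 'C:\install')
puts summarize(previous, current)
```

Other locales or `dir` switches change the line format, so the two patterns would need adjusting to match the listings actually on file.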



 
Brian Schröder
      08-03-2005
Does this help?

bschroed@black:~/svn/projekte/ruby-things$ ls -1 > before.list
bschroed@black:~/svn/projekte/ruby-things$ touch another.one
bschroed@black:~/svn/projekte/ruby-things$ ls -1 > after.list
bschroed@black:~/svn/projekte/ruby-things$ irb
irb(main):001:0> before = File.read('before.list').to_a
=> ["before.list\n", ...]
irb(main):002:0> after = File.read('after.list').to_a
=> ["before.list\n", "after.list\n", "another.one\n", ...]
irb(main):003:0> before - after
=> []
irb(main):004:0> after - before
=> ["after.list\n", "another.one\n"]
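
On current Ruby versions `String#to_a` no longer exists, so the session above needs a small change to run. A minimal sketch of the same idea using `File.readlines`, which also strips the trailing newlines; the helper name `read_list` and the sample file names are made up for illustration:

```ruby
require 'tmpdir'

# Reads a saved listing into an array of names, one per line, dropping
# trailing newlines, surrounding whitespace and blank lines.
def read_list(path)
  File.readlines(path, chomp: true).map(&:strip).reject(&:empty?)
end

Dir.mktmpdir do |dir|
  before_path = File.join(dir, 'before.list')
  after_path  = File.join(dir, 'after.list')
  File.write(before_path, "a.rb\nb.rb\n")
  File.write(after_path,  "a.rb\nb.rb\nanother.one\n")

  before = read_list(before_path)
  after  = read_list(after_path)

  p before - after   # names that only appear in the old list
  p after - before   # names that only appear in the new list
end
```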

regards,

Brian

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/


 
Jacob Fugal
      08-03-2005
Though Brian Schröder gave an interesting irb implementation, what you
really need is diff[1]. And don't despair, there is diff for
Windows[2] (via the command line). The GNU developers have put a *lot*
of work and refinement into this heavily used tool -- don't reinvent
the wheel.

[1] http://www.gnu.org/software/diffutil...ode/index.html
[2] http://gnuwin32.sourceforge.net/packages/diffutils.htm

Jacob Fugal


 
Lothar Scholz
      08-03-2005
Hello Jacob,

JF> Though Brian Schröder gave an interesting irb implementation, what you
JF> really need is diff[1]. And don't despair, there is diff for
JF> Windows[2] (via the command line).

JF> The GNU developers have put a *lot*
JF> of work and refinement into this heavily used tool -- don't reinvent
JF> the wheel.

<flame>
And they still got nothing what even comes close to "AraxisMerge" on
Windows, neither from the GUI nor from the quality of the diff algorithm.
</flame>

But back to the original poster's question: I think diff is completely the
wrong idea, since he said he only needs the file names. The content does not
matter for an installer, because he puts the complete file into the
setup.exe.

I don't see a particularly Ruby way to solve it, as processing strings is
not a very complicated task. Build two hashes over the file lists and
compare them item by item. Parsing the previous file list would only be a
little bit complicated if the InstallShield file format had to be parsed
rather than a plain list of strings, but it should still be possible to
write the script in 100 lines. Or maybe I did not understand Dave's real
problem.
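
The two-hash comparison suggested here could look something like the following. It is a rough sketch that assumes both file lists have already been reduced to plain arrays of names; `index_names` and `compare_lists` are hypothetical helper names.

```ruby
# Index a list of names so that membership checks are hash lookups.
def index_names(names)
  names.each_with_object({}) { |name, seen| seen[name] = true }
end

# Walk each list item by item and keep the names missing from the other side.
def compare_lists(previous, current)
  prev_index = index_names(previous)
  curr_index = index_names(current)
  {
    'extra'   => current.reject { |name| prev_index.key?(name) },
    'removed' => previous.reject { |name| curr_index.key?(name) }
  }
end

result = compare_lists(%w[a.dll b.dll c.dll], %w[b.dll c.dll d.dll])
puts "Extra files: #{result['extra'].join(', ')}"
puts "Removed files: #{result['removed'].join(', ')}"
```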


--
Best regards, emailto: scholz at scriptolutions dot com
Lothar Scholz http://www.ruby-ide.com
CTO Scriptolutions Ruby, PHP, Python IDE 's



 
Jacob Fugal
      08-04-2005
On 8/3/05, Lothar Scholz <(E-Mail Removed)> wrote:
> JF> Though Brian Schröder gave an interesting irb implementation, what you
> JF> really need is diff[1]. And don't despair, there is diff for
> JF> Windows[2] (via the command line).
>
> JF> The GNU developers have put a *lot*
> JF> of work and refinement into this heavily used tool -- don't reinvent
> JF> the wheel.
>
> <flame>
> And they still got nothing what even comes close to "AraxisMerge" on
> Windows, neither from the GUI nor from the quality of the diff algorithm.
> </flame>


Ok, to qualify my statement: Don't reinvent this particular wheel for
a once-off solution. I won't say that someone else can make a better
wheel when that's their primary goal. I don't think Dave Davidson's
goal is to develop a new diff utility. Regarding AraxisMerge, I've
never heard of it. It may be better than GNU DiffUtils. I can't make
any judgement there.

> But back to the question from the original poster, i think diff is a
> complete wrong idea as he said he only needs the file names and the content
> does not matter for an installer as he puts the complete file into the
> setup.exe.


diff -qr | grep '^Only'

Know the tool before dismissing it.

Jacob Fugal


 
Jacob Fugal
      08-04-2005
On 8/4/05, Jacob Fugal <(E-Mail Removed)> wrote:
> diff -qr | grep '^Only'


Rather,

diff -qr DIR1 DIR2 | grep '^Only'

Sorry for the shabby proofreading...

Jacob Fugal
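
For comparison, roughly the same "Only in ..." report can be produced from Ruby when both directories exist on disk. This is a sketch rather than a replacement for diff: it compares names only, skips dotfiles, and lists every path on one side only, including those inside a directory that exists on just one side. The helper names `relative_entries` and `only_in` are made up for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Every file and subdirectory under `dir`, as sorted paths relative to it.
# Dotfiles are not matched by this glob pattern.
def relative_entries(dir)
  Dir.glob('**/*', base: dir).sort
end

# Paths present under only one of the two directories, diff-style.
def only_in(dir1, dir2)
  a = relative_entries(dir1)
  b = relative_entries(dir2)
  (a - b).map { |path| "Only in #{dir1}: #{path}" } +
    (b - a).map { |path| "Only in #{dir2}: #{path}" }
end

Dir.mktmpdir do |root|
  dir1 = File.join(root, 'DIR1')
  dir2 = File.join(root, 'DIR2')
  FileUtils.mkdir_p([dir1, dir2])
  FileUtils.touch([File.join(dir1, 'a'), File.join(dir1, 'b'), File.join(dir2, 'a')])
  puts only_in(dir1, dir2)
end
```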


 
Ara.T.Howard@noaa.gov
      08-04-2005
On Fri, 5 Aug 2005, Jacob Fugal wrote:

> diff -qr | grep '^Only'
>
> Know the tool before dismissing it.


the way i read the OP's post the original contents should be stored and
alterable. the diff approach would require both directories to exist and be
stored and i think the OP wanted to store only the __inventory__ of the dir -
not the actual dir. so not only would the storage/database requirements
skyrocket, but you'd be using a sledgehammer to pound in a mini-tack. this
problem is quite easily solved in only a few lines of ruby - including
database code, command line parsing, etc:


here's the code:


harp:~ > cat ./dirlist

#! /usr/bin/env ruby
require 'pstore'
require 'yaml'
require 'getoptlong'

# persistent map: expanded directory path => saved FileList
class DirDb < ::PStore
  def [] dir
    transaction{ super(exp(dir)) rescue nil }
  end
  def []= dir, filelist
    transaction{ super(exp(dir), filelist) }
  end
  def exp dir
    File::expand_path dir
  end
end

# every path found under a directory, stored as absolute paths
class FileList < ::Array
  def initialize dir
    @dir = File::expand_path dir
    @glob = File::join @dir, '**', '*'
    replace Dir[@glob].map{|f| File::expand_path f}
  end
  # paths relative to the scanned directory, so two trees can be compared
  def basenames
    map{|f| f.gsub(%r|^#{ Regexp::escape @dir }/*|,'')}
  end
  def add filename
    self << File::expand_path(File::join(@dir, filename))
  end
  def delete filename
    super(File::expand_path(File::join(@dir, filename)))
  end
  def to_yaml
    to_a.to_yaml
  end
end

class Main
  def self::main(*a, &b)
    new(*a, &b).run
  end
  def initialize
    gl = GetoptLong::new ['--db', '-d', GetoptLong::REQUIRED_ARGUMENT]
    gl.each do |opt, arg|
      case opt
      when /db/
        @db_path = arg
      end
    end
    @db_path ||= File::expand_path(File::join('~', '.dirdb'))
    @mode, @mode_args = ARGV.shift, ARGV
    @mode ||= 'help'
    @db = DirDb::new @db_path
  end
  def run
    send(@mode, *@mode_args)
  end
  def scan dir
    @db[dir] = FileList::new dir
    show dir
  end
  def show dir
    # `y` is not defined by a plain require 'yaml' on current rubies
    puts @db[dir].to_yaml
  end
  # old_dir is looked up in the db, new_dir is scanned from disk
  def report old_dir, new_dir
    previous = @db[old_dir]
    current = FileList::new new_dir
    report = {}
    report['identical'] = previous.basenames & current.basenames
    report['extra'] = current.basenames - previous.basenames
    report['removed'] = previous.basenames - current.basenames
    puts report.to_yaml
  end
  def add dir, filename
    filelist = @db[dir]
    filelist.add filename
    @db[dir] = filelist
  end
  def delete dir, filename
    filelist = @db[dir]
    filelist.delete filename
    @db[dir] = filelist
  end
  def help
    puts "#{ $0 } scan dir | show dir | report old_dir new_dir | add dir filename | delete dir filename"
  end
end

$0 == __FILE__ and Main::main



and here's how you use it:


harp:~ > mkdir version-1.0.0 && touch version-1.0.0/a version-1.0.0/b version-1.0.0/c


harp:~ > ./dirlist
./dirlist scan dir | show dir | report old_dir new_dir | add dir filename | delete dir filename


harp:~ > ./dirlist scan version-1.0.0/
---
- /home/ahoward/version-1.0.0/a
- /home/ahoward/version-1.0.0/b
- /home/ahoward/version-1.0.0/c


harp:~ > rm -rf version-1.0.0/


harp:~ > mkdir version-2.0.0 && touch version-2.0.0/a version-2.0.0/b


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed:
- c
extra: []
identical:
- a
- b


harp:~ > touch version-2.0.0/c


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed: []
extra: []
identical:
- a
- b
- c


harp:~ > touch version-2.0.0/d


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed: []
extra:
- d
identical:
- a
- b
- c


harp:~ > ./dirlist add version-1.0.0 d


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed: []
extra: []
identical:
- a
- b
- c
- d


harp:~ > rm version-2.0.0/a


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed:
- a
extra: []
identical:
- b
- c
- d


harp:~ > ./dirlist delete version-1.0.0 a


harp:~ > ./dirlist report version-1.0.0 version-2.0.0
---
removed: []
extra: []
identical:
- b
- c
- d


in any case, i'm all for using built-in tools to accomplish tasks - but this
task is so basic it seems silly not to just write it in pure ruby...

kind regards.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

 
dave davidson
      08-04-2005




All,

Thanks so much for the hints and pointers regarding this issue... I've not had a
chance to try all the suggestions (too busy counting files by hand), but I just
wanted to let you know I appreciate the help!

Dave



 
Jacob Fugal
      08-04-2005
On 8/4/05, Ara.T.Howard <(E-Mail Removed)> wrote:
> > diff -qr | grep '^Only'
>
> <snip>
>
> the way i read the OP's post the original contents should be stored and
> alterable. the diff approach would require both directories to exist and be
> stored and i think the OP wanted to store only the __inventory__ of the dir -
> not the actual dir. so not only would the storage/database requirements
> skyrocket, but you'd be using a sledgehammer to pound in a mini-tack.


Ok, I forgot about that constraint. I still think diff would be the
exact tool I would use when on a *nix system:

# Done once to build the list compared against
# (paths are made relative and sorted so the two lists line up)
$ (cd master_dir && find . | sort) > master.list

# Done each time to verify all files are there in the working copy
$ (cd working_dir && find . | sort) | diff master.list -

I'll admit that once you start getting into pipes and such this
solution probably won't work, or at least not as easily, on Windows.

Jacob Fugal


 