ndiff

 
 
Bryan
 
      07-23-2003
i tried using ndiff and Differ.compare today from the difflib module. i
have three questions.

1. both ndiff and Differ.compare return all the lines including lines that
are the same in both files, not just the diffs. is the convention to take
the output and then filter out lines that contain a space as the first
character to just get the diffs? it seems strange to me that the output is
not just the deltas and a lot of wasted filtering (especially if the file is
very large) to get the diff you wanted in the first place. isn't there a
better way?

2. i also tried passing IS_LINE_JUNK and IS_CHARACTER_JUNK, but there was
no difference in the output even though i changed some whitespace in the
file. i then wrote my own junk functions and again, there was no
difference in the output even though i returned 1 to filter out some lines.
can someone show an example of using IS_LINE_JUNK and IS_CHARACTER_JUNK
showing different output than when not using it.

3. is there a simple method that just returns true or false whether two
files are different or not? i was hoping that ndiff/compare would return an
empty list if there was no difference, but that's not the case. i ended up
using a simple: if file1.read() == file2.read(): but there must be a smarter
faster way.

thanks,

bryan
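
On question 2, it may help to see what a junk function actually changes. Junk functions never remove lines from the ndiff/Differ.compare output; they only tell the underlying SequenceMatcher which elements should not be used to line up the two sequences. IS_LINE_JUNK only matches lines that are blank or hold a single '#', and IS_CHARACTER_JUNK only matches a space or a tab, so a whitespace edit inside an ordinary line still shows up as a change. A minimal sketch using SequenceMatcher directly (the is_space helper is a made-up name for illustration, and this assumes a current Python 3):

```python
from difflib import SequenceMatcher

def is_space(ch):
    # Hypothetical junk predicate: treat a single blank as junk.
    return ch == " "

a, b = " abcd", "abcd abcd"

# No junk function: the leading blank is allowed to take part in the match.
plain = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))

# With the junk function: the blank cannot anchor a match, so the
# longest match is found elsewhere.
junked = SequenceMatcher(is_space, a, b).find_longest_match(0, len(a), 0, len(b))

print(plain)   # a=0, b=4, size=5
print(junked)  # a=1, b=0, size=4
```

The two results differ only in where the sequences are aligned, which is why passing junk functions to ndiff often produces output that looks the same as before.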


 
Ian Bicking
 
      07-24-2003
On Wed, 2003-07-23 at 18:55, Bryan wrote:
> 3. is there a simple method that just returns true or false whether two
> files are different or not? i was hoping that ndiff/compare would return an
> empty list if there was no difference, but that's not the case. i ended up
> using a simple: if file1.read() == file2.read(): but there must be a smarter
> faster way.


Maybe something like:

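def areDifferent(file1, file2):
    # Compare two open files in 1000-byte chunks.
    while 1:
        data1, data2 = file1.read(1000), file2.read(1000)
        if not data1 and not data2:
            return False  # both files ended without a mismatch
        if data1 != data2:
            return True   # chunks differ, or one file ended early


You still have to go through the entire file if you really want to be
sure the files are identical. If you use filenames, of course, you can
take some shortcuts:

import os

def filesDiffer(filename1, filename2):
    if os.stat(filename1).st_size != os.stat(filename2).st_size:
        return True  # different sizes, so the contents must differ
    else:
        # binary mode avoids newline translation changing the comparison
        return areDifferent(open(filename1, 'rb'), open(filename2, 'rb'))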

You could also try a quick comparison from somewhere not at the
beginning (using .seek(pos)), if you think it is likely that files will
have common headers. But you'd still have to scan the entire file to be
sure.
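
The standard library's filecmp module packages up the same idea, so a hand-written loop may not be needed. A rough sketch, assuming a current Python 3; files_differ and _write_temp are hypothetical helper names used only for this demo:

```python
import filecmp
import os
import tempfile

def files_differ(path1, path2):
    # shallow=False forces a byte-by-byte comparison instead of
    # trusting the os.stat() signatures (type, size, mtime).
    return not filecmp.cmp(path1, path2, shallow=False)

def _write_temp(data):
    # Demo helper: write bytes to a fresh temporary file, return its path.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path

same_a = _write_temp(b"hello\n")
same_b = _write_temp(b"hello\n")
other = _write_temp(b"hullo\n")  # same size, different content

r_same = files_differ(same_a, same_b)
r_other = files_differ(same_a, other)
print(r_same, r_other)

for path in (same_a, same_b, other):
    os.remove(path)
```

With shallow=True (the default) filecmp.cmp may report two files as equal from their stat signatures alone, so shallow=False is the safer choice when the contents are what matters.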

Ian



 
Raymond Hettinger
 
      07-25-2003
> 1. both ndiff and Differ.compare return all the lines including lines that
> are the same in both files, not just the diffs. is the convention to take
> the output and then filter out lines that contain a space as the first
> character to just get the diffs? it seems strange to me that the output is
> not just the deltas and a lot of wasted filtering (especially if the file is
> very large) to get the diff you wanted in the first place. isn't there a
> better way?


The new difflib.py in Py2.3 has two new functions, context_diff()
and unified_diff(). These functions, and a newly exposed underlying
method (SequenceMatcher.get_grouped_opcodes()), strip away the
commonalities, leaving only the changes and, if desired, some context.
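
A short sketch of what that looks like with unified_diff(), assuming a current Python 3. The sample lines and the old.txt/new.txt labels are invented for illustration:

```python
from difflib import unified_diff

old = ["one\n", "two\n", "three\n", "four\n"]
new = ["one\n", "two\n", "3\n", "four\n"]

# n=0 asks for no context lines, so only the changed lines remain.
delta = list(unified_diff(old, new, fromfile="old.txt", tofile="new.txt", n=0))
print("".join(delta), end="")

# Skip the two file-header lines and the "@@ ... @@" hunk markers
# to keep just the removed and added lines.
changed = [line for line in delta[2:] if not line.startswith("@@")]

# Identical inputs produce no output at all.
no_change = list(unified_diff(old, old))
```

Because identical inputs give an empty result, testing whether the generator produced anything is one way to get the "empty if nothing changed" behaviour, although a plain byte comparison is cheaper when only a yes/no answer is needed.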


Raymond Hettinger


 