Hi,
We rewrote TextDiffBuilder using a modified version of the greedy longest common subsequence algorithm from 'An O(ND) Difference Algorithm and Its Variations (1986)' by Eugene W. Myers [1]. It's in The Inbox as: System-klub.209. The implementation is smaller, faster, gives a minimal patch and passes the tests. Old New Ratio Number of classes 5 4 0.8 Number of methods 50 29 0.58 Lines of code 387 197 0.51 We would appreciate if someone can review this before moving into trunk. Cheers, Levente & Balazs [1] http://xmailserver.org/diff2.pdf |
Levente Uzonyi wrote:
> Hi, > > We rewrote TextDiffBuilder using a modified version of the greedy > longest common subsequence algorithm from 'An O(ND) Difference Algorithm > and Its Variations (1986)' by Eugene W. Myers [1]. It's in The Inbox as: > > System-klub.209. The implementation is smaller, faster, gives a minimal > patch and passes the tests. > > Old New Ratio > Number of classes 5 4 0.8 > Number of methods 50 29 0.58 > Lines of code 387 197 0.51 > > We would appreciate if someone can review this before moving into trunk. Looks good. There seem to be a few matches that are shown differently (tried with various MC diffs) but that's to be expected since it's a different algorithm. Ship it :-) Cheers, - Andreas |
On Tue, 29 Dec 2009, Andreas Raab wrote:
> Levente Uzonyi wrote: >> Hi, >> >> We rewrote TextDiffBuilder using a modified version of the greedy longest >> common subsequence algorithm from 'An O(ND) Difference Algorithm and Its >> Variations (1986)' by Eugene W. Myers [1]. It's in The Inbox as: >> >> System-klub.209. The implementation is smaller, faster, gives a minimal >> patch and passes the tests. >> >> Old New Ratio >> Number of classes 5 4 0.8 >> Number of methods 50 29 0.58 >> Lines of code 387 197 0.51 >> >> We would appreciate if someone can review this before moving into trunk. > > Looks good. There seem to be a few matches that are shown differently (tried > with various MC diffs) but that's to be expected since it's a different > algorithm. Ship it :-) Done. The main reason for the rewrite was that the old version didn't create minimal patches, even for really simple cases (see the failing tests). First I tried to fix it, but when I saw how bloated the implementation was for such a simple thing, I thought a rewrite would be better. Actually I couldn't find out how the old version worked. Another reason for the differences in the diffs is that we accidentally swapped the order of insertions and removals when they occured after each other. This made only a "visual" difference. With the latest version, this is fixed too. Levente > > Cheers, > - Andreas > > > |
In reply to this post by Levente Uzonyi-2
Hi,
You might also be interested in http://www.squeaksource.com/DiffMerge.html. Regards, Tony Levente Uzonyi wrote: > Hi, > > We rewrote TextDiffBuilder using a modified version of the greedy > longest common subsequence algorithm from 'An O(ND) Difference Algorithm > and Its Variations (1986)' by Eugene W. Myers [1]. It's in The Inbox as: > > System-klub.209. The implementation is smaller, faster, gives a minimal > patch and passes the tests. > > Old New Ratio > Number of classes 5 4 0.8 > Number of methods 50 29 0.58 > Lines of code 387 197 0.51 > > We would appreciate if someone can review this before moving into trunk. > > Cheers, > Levente & Balazs > > [1] http://xmailserver.org/diff2.pdf > |
Free forum by Nabble | Edit this page |