bug-gnu-utils


Java Diff Function


From: Burr, Rod L
Subject: Java Diff Function
Date: Wed, 2 Oct 2002 20:25:45 -0400

Hi,
        In short, I really need someone who can answer a few technical
questions about some of the internal mechanics of the GNU Diff.java program,
written by Mike Haertel, David Hayes, Richard Stallman, Len Tower, and Paul 
Eggert.
        I have scoured the world, read tons of documentation, been to
Java user forums, and hit up universities, and I still can't get my answers.
I think they have to come from someone actually intimate with the code.
----------------------------------------------------------------------------------------------------------------------------------------------------------
        I have a particular application that requires the use of something at
least similar to the Diff function.  In order to assess the usability of that
piece of software, I enlisted the aid of an IT professional familiar with
Java, and we began dissecting and examining the code.
        We did fine until we got to the core "compare" section.  At that point 
my IT guy got lost in the theory of what was happening.  In order to get back 
on track I obtained a copy of Eugene Myers' original publication, "An O(ND) 
Difference Algorithm and its Variations".  I scanned through it to get a sense 
of what it was all about.
        I consider myself to be 'mathematics capable', and I believe that I
could eventually arrive at a reasonably complete understanding.
However, since I am not fully up to speed on the particular theories
addressed, I'm afraid it would require an extensive investment of time that
might not be worthwhile, especially given my initial perceptions.
        It appears to me, upon review of editorials about the algorithm, that
the object of the algorithm is to address the "efficiency" question with
respect to the identification of records that are "common" from one generation
of a file to the next.  However, as I examine the code, I am confused.  I am
not able to clearly perceive the actual purpose of the algorithm itself.
It seems clear to me that "common record identification" has already occurred
before the algorithm is invoked.  The "equivs" and "counts" arrays already
contain the information that identifies each record (within both source files,
'before' and 'after') as being either unique or common.
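
For illustration only, here is a minimal sketch of what arrays like these
might hold.  It is an assumption about the general technique, not code taken
from Diff.java; the names EquivSketch, assignCodes and countCodes are
hypothetical.  Each distinct record text gets one integer code shared by both
files, and a counts array records how often each code occurs in one file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual Diff.java data structures.
class EquivSketch {
    /** Gives every distinct record text one integer code, shared by both files. */
    static int[][] assignCodes(String[] before, String[] after) {
        Map<String, Integer> codes = new HashMap<>();
        String[][] files = { before, after };
        int[][] equivs = { new int[before.length], new int[after.length] };
        for (int f = 0; f < 2; f++) {
            for (int i = 0; i < files[f].length; i++) {
                Integer code = codes.get(files[f][i]);
                if (code == null) {            // first time this text is seen
                    code = codes.size();
                    codes.put(files[f][i], code);
                }
                equivs[f][i] = code;
            }
        }
        return equivs;
    }

    /** counts[c] = number of records in one file that carry code c. */
    static int[] countCodes(int[] equivs, int codeCount) {
        int[] counts = new int[codeCount];
        for (int c : equivs) {
            counts[c]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] before = { "a", "b", "c" };
        String[] after = { "c", "a", "b" };
        int[][] equivs = assignCodes(before, after);
        System.out.println(Arrays.toString(equivs[0])); // [0, 1, 2]
        System.out.println(Arrays.toString(equivs[1])); // [2, 0, 1]
    }
}
```

In this small example every record is "common" by content, yet the codes
appear in a different order in the two files, so knowing which records are
common does not by itself say which occurrences line up with which.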
        From my standpoint, the only thing left to do is to decide how to
handle those records which are common.  However, as far as I can tell, the
algorithm instead sets off in a 'search pattern' to locate common records
that have already been located, and I have not yet identified how the 'common
records' are actually processed once they've been identified.
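
As a hedged sketch of the kind of search the Myers paper describes (a plain
textbook form of the greedy O(ND) idea, not the Diff.java implementation; the
names EditDistanceSketch and shortestEditLength are hypothetical), the code
below takes two arrays of integer record codes and finds the smallest number
of record insertions and deletions that turns one sequence into the other.
It only illustrates that such a search works on the order of the records,
not just on which records occur in both files.

```java
// Hypothetical sketch of the basic greedy search from Myers' paper.
class EditDistanceSketch {
    /** Smallest number of single-record insertions and deletions turning a into b. */
    static int shortestEditLength(int[] a, int[] b) {
        int n = a.length;
        int m = b.length;
        int max = n + m;
        int offset = max;                  // shifts diagonal k into a valid index
        int[] v = new int[2 * max + 2];    // v[offset + k] = furthest x reached on diagonal k
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];       // step down: insert a record from b
                } else {
                    x = v[offset + k - 1] + 1;   // step right: delete a record from a
                }
                int y = x - k;
                // Follow the run of matching records along this diagonal for free.
                while (x < n && y < m && a[x] == b[y]) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;                    // both sequences fully consumed
                }
            }
        }
        return max;                              // not reached: d = n + m always suffices
    }

    public static void main(String[] args) {
        int[] before = { 0, 1, 2 };  // codes for records a, b, c
        int[] after = { 2, 0, 1 };   // codes for records c, a, b
        System.out.println(shortestEditLength(before, after)); // prints 2
    }
}
```

Here all three records occur in both files, but only two of them can be kept
in place: the result 2 corresponds to deleting "c" at the end and inserting
it at the front.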

        It's possible that I do not fully understand the problem, and it's
obvious that I do not fully understand the program itself.
        I've read through all of the documentation on the program that I've
been able to find, but nothing covers the internal mechanics at a level of
detail that would resolve my confusion.

        Can you help me get in contact with someone who can try to answer
my questions and get me up to speed on what is happening with this process?

Thanks,

Rod Burr




 
