[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Octave-patch-tracker] [patch #8426] strings package: speeding up the ed
From: |
Carnë Draug |
Subject: |
[Octave-patch-tracker] [patch #8426] strings package: speeding up the edit distance |
Date: |
Fri, 08 Aug 2014 13:19:18 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0 Iceweasel/30.0 |
Follow-up Comment #6, patch #8426 (project octave):
Seems I mangled the verbatim syntax so I'll paste the part that got cut out
again:
* "(nargin >= 3 && length (weights) < 3)" deserves an error message on its
own. Also, do not use length it will return true for a matrix size [2 3].
Finally, shouldn't an empty weigths be allowed? The documentation says "For
the special case that @var{weights} is empty"
function [dist, L] = editdistance (str1, str2, weights = [1 0 1], modus = 0)
if (nargin < 2 || nargin > 4)
print_usage ();
endif
if (nargin > 2 && (numel (weights) != 3 && ! isempty (weigths))
error ("editdistance: WEIGTHS must be a 3 element vector.")
endif
* I can't figure out "(nargin >= 2 || weights(1) == weights(3) && weights(1)
== 1 && weights(2) == 0)" ? This will always return true because this function
is always called with at least 2 arguments. But I remove it, then the rest is
the same as `weigths = [1 0 1]`. What should this be doing? From the
documentation, it would seem it should be:
saveMemory = nargout < 2;
if (modus == 0 && saveMemory && isempty (weights))
dist = berghel_roach (str1,str2);
return
endif
* while the algorithm seems like it's not really requires a char, it does
require the input to be a vector. You should check for that. For the same
reason mentioned above, do not use length in:
L1 = length (str1) + 1;
L2 = length (str2) + 1;
* use more whitespace please. For example, add a space between operators
(including "="), and after commas in a comma separated list
* can you avoid the nested sub-function? I can see that it can modify the
sparse matrix when it's called so this will avoid copying it for write. But
nested functions in Octave and Matlab have weird scoping rules and can lead to
weird bugs later on (and I'm not sure if we're completely Matlab compatible on
that respect). I guess that since is less prone to problems so it should be
fine if you can't find a way that would not impact performance.
* finally, if you clone the repository and submit a changeset, it's easier to
review and you'll keep authorship of the changes:
hg clone http://hg.code.sf.net/p/octave/strings octave-forge-strings
cd octave-forge-strings
## make your changes
hg commit -m
## enter a commit message following commit message guidelines
http://wiki.octave.org/Commit_message_guidelines
## probably something like:
##
## editdistance: use Berghel and Roach algorithm for performance (patch
#8426)
##
## * editdistance.m: blah blah blah multi-line comment with deeper details,
with
## each line not taking more than 80 characters.
hg export -o ../speedup-editdistance.diff tip
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/patch/?8426>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/