[bug-diffutils] bug#32993: Pathologically slow operation
From: Stefan Monnier
Subject: [bug-diffutils] bug#32993: Pathologically slow operation
Date: Mon, 08 Oct 2018 17:34:18 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

I recently bumped into a `diff` operation that I killed after several
minutes while diffing two files (on a 3.7GHz Core i3, which is the fastest
machine I have).
These files were generated as part of Emacs's "refine-hunk" processing,
which tries to do word-level diffs by basically turning every word
into N copies of that word, each one on its own line. N is the
number of chars in the word, which indicates to `diff` that long words
are "more costly" than short ones.
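
For readers unfamiliar with this preprocessing, here is a minimal Python sketch of the idea as described above. It is not the Emacs code (which is written in Emacs Lisp); the function name `expand_words` and the choice to split words on whitespace are assumptions made purely for illustration.

```python
import re


def expand_words(text):
    """Expand text so that each word of length N becomes N identical lines.

    Hypothetical helper: words are taken to be runs of non-whitespace
    characters, which is an assumption and may differ from what Emacs does.
    """
    lines = []
    for word in re.findall(r"\S+", text):
        # A longer word contributes more lines, so a line-based diff
        # treats changing it as proportionally more expensive.
        lines.extend([word] * len(word))
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # "to" yields 2 lines, "diff" yields 4 lines.
    print(expand_words("to diff"), end="")
```

Feeding two texts expanded this way to a line-based `diff` gives a word-level comparison, at the price of inputs with many more lines than the original text has words.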
So the files' sizes were:
% wc tmp/diff-bug-*
1038026 851160 4963190 tmp/diff-bug-1
65041 54877 314788 tmp/diff-bug-2
1103067 906037 5277978 total
%
With --speed-large-files, diff still took almost a minute to return an
answer (which is 973026 lines long).
Those files aren't exactly security sensitive, but they contain personal
info that I'd rather not make public (I can send them in private
upon request, tho). Is there a chance this performance behavior is the
result of a performance bug, or is the algorithm really that costly?
Stefan