[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[elpa] externals/trie 71f8273 098/111: Significantly improve efficiency
From: |
Stefan Monnier |
Subject: |
[elpa] externals/trie 71f8273 098/111: Significantly improve efficiency of trie-fuzzy-complete. |
Date: |
Mon, 14 Dec 2020 11:35:28 -0500 (EST) |
branch: externals/trie
commit 71f827391a2e6a8c17d6f671f67324b72188f20e
Author: Toby S. Cubitt <toby-predictive@dr-qubit.org>
Commit: Toby S. Cubitt <toby-predictive@dr-qubit.org>
Significantly improve efficiency of trie-fuzzy-complete.
---
trie.el | 136 ++++++++++++++++++++++++++++++++++++++--------------------------
1 file changed, 81 insertions(+), 55 deletions(-)
diff --git a/trie.el b/trie.el
index 04b0760..22a5d0c 100644
--- a/trie.el
+++ b/trie.el
@@ -2,13 +2,11 @@
;; Copyright (C) 2008-2010, 2012, 2014, 2017-2018 Free Software Foundation,
Inc
-;; Author: Toby Cubitt <toby-predictive@dr-qubit.org>
-;; Version: 0.5
-;; Keywords: extensions, matching, data structures
-;; trie, ternary search tree, tree, completion, regexp
-;; Package-Requires: ((tNFA "0.1.1") (heap "0.3"))
-;; URL: http://www.dr-qubit.org/emacs.php
-;; Repository: http://www.dr-qubit.org/git/predictive.git
+;; Author: Toby Cubitt <toby-predictive@dr-qubit.org> Version: 0.5 Keywords:
+;; extensions, matching, data structures trie, ternary search tree, tree,
+;; completion, regexp Package-Requires: ((tNFA "0.1.1") (heap "0.3")) URL:
+;; http://www.dr-qubit.org/emacs.php Repository:
+;; http://www.dr-qubit.org/git/predictive.git
;; This file is part of Emacs.
;;
@@ -2436,7 +2434,8 @@ results\)."
;; Fuzzy completing
(defun trie-fuzzy-complete
- (trie prefix distance &optional rankfun maxnum reverse filter resultfun)
+ (trie prefix distance &optional rankfun maxnum reverse filter resultfun
+ ranked-by-dist)
"Return completions of prefixes within Lewenstein DISTANCE of PREFIX
along with their associated data, in the order defined by
RANKFUN, defaulting to \"lexicographic\" order \(i.e. the order
@@ -2498,36 +2497,46 @@ of the default key-dist-data list."
(trie-transform-from-read-warn trie)
;; construct rankfun to sort by Lewenstein distance if requested
- (when (eq rankfun t)
+ (cond
+ ((eq rankfun t)
(setq rankfun (trie--construct-fuzzy-complete-rankfun
- (trie--comparison-function trie))))
-
- ;; accumulate results
- (trie--accumulate-results
- rankfun maxnum reverse filter resultfun accumulator nil
- (funcall (trie--mapfun trie)
- (lambda (node)
- (trie--do-fuzzy-complete
- node
- (apply #'vector (number-sequence 0 (length prefix)))
- (cond ((stringp prefix) "") ((listp prefix) ()) (t []))
- (length prefix) 0
- ;; FIXME: Would it pay to replace these arguments with
- ;; dynamically-scoped variables, to save stack space?
- prefix distance (if maxnum reverse (not reverse))
- (trie--comparison-function trie)
- (trie--construct-equality-function
- (trie--comparison-function trie))
- (trie--lookupfun trie)
- (trie--mapfun trie)
- accumulator))
- (trie--node-subtree (trie--root trie))
- (if maxnum reverse (not reverse)))))
+ (trie--comparison-function trie))
+ ranked-by-dist 'dist-only))
+ ((null rankfun) (setq ranked-by-dist nil))
+ (ranked-by-dist (setq ranked-by-dist t)))
+
+ (let ((equalfun (trie--construct-equality-function
+ (trie--comparison-function trie)))
+ (stats (make-list (1+ distance) 0)))
+ ;; accumulate results
+ (trie--accumulate-results
+ rankfun maxnum reverse filter resultfun accumulator nil
+ (funcall (trie--mapfun trie)
+ (lambda (node)
+ (trie--do-fuzzy-complete
+ node
+ (apply #'vector (number-sequence 0 (length prefix)))
+ (cond ((stringp prefix) "") ((listp prefix) ()) (t []))
+ (length prefix) 0
+ ;; FIXME: Would it pay to replace these arguments with
+ ;; dynamically-scoped variables, to save stack space?
+ prefix distance (if maxnum reverse (not reverse))
+ (trie--comparison-function trie)
+ equalfun
+ (trie--lookupfun trie)
+ (trie--mapfun trie)
+ accumulator
+ ranked-by-dist
+ (and ranked-by-dist maxnum)
+ (and ranked-by-dist maxnum stats)))
+ (trie--node-subtree (trie--root trie))
+ (if maxnum reverse (not reverse))))))
(defun trie--do-fuzzy-complete (node row seq pfxcost pfxlen
prefix distance reverse
- cmpfun equalfun lookupfun mapfun accumulator)
+ cmpfun equalfun lookupfun mapfun
+ accumulator ranked-by-dist maxnum stats)
;; Search everything below NODE for completions of prefixes within
;; Lewenstein distance DISTANCE of PREFIX. ROW is the previous row of the
;; Lewenstein table. SEQ is the sequence corresponding to NODE. PFXCOST is
@@ -2538,7 +2547,12 @@ of the default key-dist-data list."
;; entry of row is <= DISTANCE), accumulate result
(if (trie--node-data-p node)
(when (<= pfxcost distance)
- (funcall accumulator (list seq pfxcost pfxlen) (trie--node-data node)))
+ (funcall accumulator (list seq pfxcost pfxlen) (trie--node-data node))
+ (and stats
+ (incf (nth pfxcost stats))
+ (eq ranked-by-dist 'dist-only)
+ (>= (nth 0 stats) maxnum)
+ (throw 'trie--accumulate-done nil)))
;; build next row of Lewenstein table
(setq row (Lewenstein--next-row
@@ -2548,27 +2562,39 @@ of the default key-dist-data list."
(setq pfxcost (aref row (1- (length row)))
pfxlen (length seq)))
- (let ((min (apply #'min (append row nil))))
- (cond
- ;; if there's a prefix of current SEQ within DISTANCE of PREFIX and no
- ;; ROW entry is less than this, then we're not going to find a better
- ;; prefix, so accumulate all completions below NODE
- ((and (<= pfxcost distance) (> min pfxcost))
- (trie--mapc
- (lambda (n s)
- (funcall accumulator (list s pfxcost pfxlen) (trie--node-data n)))
- mapfun node seq reverse))
-
- ;; as long as some ROW entry is <= DISTANCE, recursively search below
NODE
- ((<= min distance)
- (funcall mapfun
- (lambda (n)
- (trie--do-fuzzy-complete
- n row seq pfxcost pfxlen prefix distance reverse
- cmpfun equalfun lookupfun mapfun accumulator))
- (trie--node-subtree node)
- reverse))
- ))))
+ ;; min = minimum possible prefix cost for any continnuation of seq
+ ;; num = number of guaranteed-better completions already accumulated
+ (let* ((min (apply #'min (append row nil)))
+ (num (and ranked-by-dist
+ (apply #'+ (cl-subseq stats 0 (min pfxcost min))))))
+ ;; skip subtree if we already have enough guaranteed-better completions
+ (when (or (null ranked-by-dist) (< num maxnum))
+ (cond
+ ;; if there's a prefix of current SEQ within DISTANCE of PREFIX and
+ ;; no ROW entry is less than this, then we're not going to find a
+ ;; better prefix, so accumulate all completions below NODE
+ ((and (<= pfxcost distance) (> min pfxcost))
+ (trie--mapc
+ (lambda (n s)
+ (funcall accumulator (list s pfxcost pfxlen) (trie--node-data n))
+ (and stats
+ (incf (nth pfxcost stats))
+ (eq ranked-by-dist 'dist-only)
+ (>= (nth 0 stats) maxnum)
+ (throw 'trie--accumulate-done nil)))
+ mapfun node seq reverse))
+
+ ;; as long as some ROW entry is <= DISTANCE, recursively search below
NODE
+ ((<= min distance)
+ (funcall mapfun
+ (lambda (n)
+ (trie--do-fuzzy-complete
+ n row seq pfxcost pfxlen prefix distance reverse
+ cmpfun equalfun lookupfun mapfun accumulator
+ ranked-by-dist maxnum stats))
+ (trie--node-subtree node)
+ reverse))
+ )))))
- [elpa] externals/trie a1f9faa 044/111: Re-filled to 72 chars/line, for mailing to gnu-emacs-sources list, (continued)
- [elpa] externals/trie a1f9faa 044/111: Re-filled to 72 chars/line, for mailing to gnu-emacs-sources list, Stefan Monnier, 2020/12/14
- [elpa] externals/trie 13bb42f 042/111: Updated docstrings for regexp-related functions and others., Stefan Monnier, 2020/12/14
- [elpa] externals/trie c7c9994 015/111: trie--createfun now passed corresponding sequence as an argument, Stefan Monnier, 2020/12/14
- [elpa] externals/trie da9ace9 051/111: More efficient implementations of replacements for CL 'position' function., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 0d17008 037/111: Added nilflag argument to trie-stack functions, Stefan Monnier, 2020/12/14
- [elpa] externals/trie f930fe9 027/111: Documentation updates related to wildcard searches and predictive features that make use of them, Stefan Monnier, 2020/12/14
- [elpa] externals/trie 46369a7 026/111: Added trie-wildcard-match function, Stefan Monnier, 2020/12/14
- [elpa] externals/trie 7823234 095/111: Fix bug in trie-fuzzy-complete that meant it didn't return minimum prefix distance in some cases., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 5fa968c 093/111: Fix byte-compiler warning., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 87d5786 102/111: Allow trie-fuzzy-match/complete to take lists of multiple prefixes/strings., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 71f8273 098/111: Significantly improve efficiency of trie-fuzzy-complete.,
Stefan Monnier <=
- [elpa] externals/trie c2b5e26 105/111: Myriad bug fixes and code refactoring in new fuzzy and ngram completion., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 63da3b1 111/111: * trie.el: Fix header which likely suffered a `M-q` accident, Stefan Monnier, 2020/12/14
- [elpa] externals/trie ff5e05f 040/111: Bumped copyright year, Stefan Monnier, 2020/12/14
- [elpa] externals/trie 2281926 020/111: Minor code reformatting and rearrangement, Stefan Monnier, 2020/12/14
- [elpa] externals/trie d99fb00 055/111: Simplified advice-based edebug pretty-printing of tries and dictionaries., Stefan Monnier, 2020/12/14
- [elpa] externals/trie b4d81bf 064/111: Trivial whitespace tidying., Stefan Monnier, 2020/12/14
- [elpa] externals/trie d45e9d5 062/111: Added autoload cookies., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 1c2790d 038/111: Replaced wildcard searches with more powerful and efficient regexp searches., Stefan Monnier, 2020/12/14
- [elpa] externals/trie bbfecae 085/111: Do lexbind test at compile-time instead of load-time., Stefan Monnier, 2020/12/14
- [elpa] externals/trie 5e8e73f 081/111: Fix data wrapping handling in fuzzy query functions., Stefan Monnier, 2020/12/14