bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow

bug-gnu-emacs

From:	Dmitry Gutov
Subject:	bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date:	Wed, 17 Aug 2022 15:14:07 +0300
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 17.08.2022 14:24, Eli Zaretskii wrote:

Date: Tue, 16 Aug 2022 22:32:23 +0300
Cc: 57245@debbugs.gnu.org
From: Dmitry Gutov <dgutov@yandex.ru>

On 16.08.2022 19:54, Eli Zaretskii wrote:

Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
here?


nxml-syntax-propertize might well be heavier than average, but the delay
scales linearly with the size of the file.


Which is generally not a good scaling factor, especially if the
coefficient is quite large (as it seems to be in this case).

Someone can work on the coefficient, but any accurate parser has to scan the buffer from the beginning. At least once.

Migration to tree-sitter might give us a better coefficient later, but the principle will remain.
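
To illustrate the "at least once" point, here is a rough Python sketch (not Emacs code, and not how syntax-propertize or nxml-mode is implemented). It assumes a toy notion of syntax state, namely whether a position falls inside an XML comment, and all names in it (CommentStateCache, state_at, step, chars_scanned) are hypothetical. The first query for the end of the text has to scan everything before it; later queries resume from a saved checkpoint.

```python
from bisect import bisect_right


class CommentStateCache:
    """Toy syntax state: is a position inside an XML comment (<!-- ... -->)?"""

    def __init__(self, text, step=1024):
        self.text = text
        self.step = step          # minimum distance between checkpoints
        self.positions = [0]      # checkpoint positions, strictly increasing
        self.states = [False]     # in-comment state at each checkpoint
        self.chars_scanned = 0    # instrumentation only

    def state_at(self, target):
        # Resume from the nearest checkpoint at or before `target`.
        i = bisect_right(self.positions, target) - 1
        pos, in_comment = self.positions[i], self.states[i]
        while pos < target:
            if not in_comment and self.text.startswith("<!--", pos):
                in_comment, width = True, 4
            elif in_comment and self.text.startswith("-->", pos):
                in_comment, width = False, 3
            else:
                width = 1
            pos += width
            self.chars_scanned += width
            # Only record a checkpoint when scanning past the last one.
            if pos - self.positions[-1] >= self.step:
                self.positions.append(pos)
                self.states.append(in_comment)
        return in_comment


text = "<a/>\n" * 1000 + "<!-- note -->\n" + "<b/>\n" * 1000
cache = CommentStateCache(text)

cache.state_at(len(text))           # first jump to the end: full scan
first_cost = cache.chars_scanned

cache.state_at(len(text))           # second jump: resumes near the end
second_cost = cache.chars_scanned - first_cost
```

In this sketch the first call costs one pass over the whole text, and the repeat call costs less than one checkpoint interval. The linear first pass is the part that a smaller coefficient can shrink but not remove.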

Which seems to be exactly the behavior the "font-lock narrowing" was
supposed to guard from?


No.  It wasn't supposed to fix modes that foolishly scan the buffer
from BOB to point.


You might want to choose words better.

It was supposed to fix modes which scan from the
beginning of line, and that is (a) only a problem when lines are very
long, and (b) much harder to solve in the mode itself, because
font-lock very frequently uses anchored regexps and otherwise likes to
start from BOL, and syntax processing also likes starting from BOL.


syntax-wholelines-max handles that problem.

Though it might depend on what you mean by "anchored regexps".

Btw, does nXML and/or sgml-mode use libxml for their analysis?  If
not, why not? wouldn't that be faster (and possibly more accurate)?


Might be "a simple matter of coding".

But we do need syntax-propertize to run, so that the user commands can rely on proper syntax information in the buffer. It remains to be seen whether xml-parse-region is a good base for nxml-syntax-propertize, and how much of a performance improvement it can bring (with all the string marshaling around).

nxml also probably handles invalid documents better, which might or might not be important.
