emacs-devel



From: Yuan Fu
Subject: Re: treesitter local parser: huge slowdown and memory usage in a long file
Date: Tue, 13 Feb 2024 00:08:33 -0800


> On Feb 12, 2024, at 4:50 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 12/02/2024 06:16, Yuan Fu wrote:
>> Thanks, the culprit is the call to treesit-update-ranges in
>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
>> updates the range for the whole buffer. Eli, is there any way to get a
>> rough estimate of the range that redisplay is refreshing? Do you think
>> something like this would work?
> 
> If we don't update the ranges outside of some interval surrounding the 
> window, what does that mean for correctness?

If the place of update and the embedded code currently in view belong to the 
same node in the host language, then when we update ranges for the current 
window-visible range, the whole node’s range is updated. So at least for this 
node, the range is correct.

If the place of update and the embedded code currently in view belong to 
different nodes in the host language, then when we update ranges for the 
current window-visible range, only the visible node’s range is updated. The 
range of the node where the edit took place stays stale until something 
updates it.
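
To make the two cases concrete, here is a toy model in Python. It is only a 
sketch of the behavior described above, not the actual treesit code: 
HostNode, update_ranges and the positions are all made up for illustration, 
and I assume an "update" simply copies the host node's current extent into a 
cache of embedded ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostNode:
    """Hypothetical host-language node that contains embedded code."""
    name: str
    beg: int
    end: int


def update_ranges(cached, host_nodes, beg, end):
    """Refresh cached embedded ranges, but only for host nodes overlapping [beg, end)."""
    updated = dict(cached)
    for node in host_nodes:
        if node.beg < end and node.end > beg:
            # The whole node's extent is recorded, not just the overlapping part.
            updated[node.name] = (node.beg, node.end)
    return updated


# Ranges recorded before the edit.
cached = {"block-a": (100, 200), "block-b": (5000, 5200)}

# Host nodes after inserting 10 characters at position 120, inside block-a:
# block-a grows and block-b shifts.
host_nodes = [HostNode("block-a", 100, 210), HostNode("block-b", 5010, 5210)]

# Case 1: the window shows 150-400, so the edit and the visible code share block-a.
case1 = update_ranges(cached, host_nodes, 150, 400)
print(case1["block-a"])  # (100, 210): the whole node is refreshed

# Case 2: the window shows 5000-5300, so only block-b is visible.
case2 = update_ranges(cached, host_nodes, 5000, 5300)
print(case2["block-b"])  # (5010, 5210): the visible node is refreshed
print(case2["block-a"])  # (100, 200): the edited node is left stale
```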

> 
> Perhaps the mode has a syntax-propertize-function which behaves differently 
> (as it should) depending on the language at point. Or different ranges have 
> different syntax tables, something like that.
> 
> If the ranges, after some edit (perhaps a programmatic one, performed far 
> from the visible area), are not kept up to date somewhere around the 
> beginning of the buffer, do we not risk confusing the syntax-ppss parser, 
> for example?

That can happen, yes. 

> 
> Come to think of it, take treesit-indent: it only updates the ranges for the 
> current line. But the line's indentation usually depends on the previous 
> buffer positions, doesn't it?

The range passed to treesit-update-ranges acts as an intersecting range: we 
capture the nodes that intersect with it and use them to update ranges. If 
the line to be indented is in an embedded language block, the whole block will 
be captured and its range will be given to the embedded language parser.
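
As a rough illustration of the intersection rule, here is a small Python 
sketch. It is not how treesit is implemented; capture_embedded_blocks and the 
positions are hypothetical, and embedded blocks are modeled as plain 
(beg, end) pairs.

```python
def capture_embedded_blocks(blocks, beg, end):
    """Return the full (beg, end) range of every block overlapping [beg, end)."""
    return [(b_beg, b_end) for (b_beg, b_end) in blocks
            if b_beg < end and b_end > beg]


# Two embedded-language blocks in a host buffer (made-up positions).
embedded_blocks = [(100, 210), (5010, 5210)]

# Indenting a line that spans 130-145, inside the first block: the query range
# is one line, but the captured range is the whole block.
in_block = capture_embedded_blocks(embedded_blocks, 130, 145)
print(in_block)  # [(100, 210)]

# Indenting a line that spans 300-320, in host code: nothing is captured.
in_host = capture_embedded_blocks(embedded_blocks, 300, 320)
print(in_host)  # []
```

So the embedded parser sees the entire block even though only one line was 
passed in, which is why indentation that depends on earlier positions within 
the same block still works.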


We haven’t had any problems so far, mainly because most embedded code blocks 
are local, and it’s rare for an edit far from the visible portion to affect 
ranges in a way that the user expects to see reflected in the currently 
visible range.

I don’t have any great idea for a better way to update ranges right now. Let me 
think about that. In the meantime, I’ll push a temporary fix so V’s original 
problem can be solved.

Yuan

