Re: access to raw buffer text from module
From: Stephen Leake
Subject: Re: access to raw buffer text from module
Date: Thu, 05 Dec 2019 17:01:41 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (windows-nt)
Stefan Monnier <address@hidden> writes:
>> A related but different question. Would it be possible to get access to
>> the raw buffer data from dynamic modules? (That is, pointer to the start,
>> length and gap information.)
>
> You might like to talk with Stephen Leake
> <address@hidden>.
> IIUC he wrote a dynamic module which parses the buffer. AFAICT he
> didn't use such a "raw" access, so it'd be interesting to hear about
> his experience.
No, I sent the buffer content as a string.
I was hoping to avoid that copy, but other things turned out to be way
slower (creating _lots_ of text properties), so I went back to a
separate process and made that faster by doing more of the work in the
process, so that fewer text properties are needed.
>> I'm only interested in read-only access, and I'd be happy to patch it
>> in myself if it's deemed generally acceptable.
>
> It would tend to expose internal data subject to change (and offer the
> ability to change this data in a way that can break some invariants), so
> it's definitely not in the style of the current module interface.
>
> But we may be able to provide a slightly less "raw" access that doesn't
> suffer in the same way. So details about your particular needs would be
> helpful to try and figure out what we can do (i.e. tell us the problems
> you face when using `char-after` or `buffer-substring`, which would be
> the main ways I can think of to access the buffer's content with the
> current module API).
In my case, I wanted raw speed when lexing the source text. The lexer
I'm using can handle utf-8, when given a start address and byte length.
Allowing for a gap would mean checking for that at each byte, which
might slow things down as much as copying.
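
To illustrate the trade-off, here is a minimal C sketch of reading
through a gap. The `gap_view` struct and both functions are
hypothetical names made up for this example; they are not part of the
Emacs module API, which exposes no such view. The first function pays
one extra comparison per byte; the second does the copy as two
memcpy calls, after which the lexer can run on contiguous text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only view of a gap buffer (an assumption for
   illustration, not an Emacs type). */
struct gap_view {
    const unsigned char *base; /* start of the underlying storage */
    size_t gap_start;          /* byte offset where the gap begins */
    size_t gap_len;            /* size of the gap in bytes */
    size_t text_len;           /* bytes of text, excluding the gap */
};

/* Fetch logical byte i, skipping the gap: one compare per byte read. */
static unsigned char gap_byte_at(const struct gap_view *v, size_t i)
{
    assert(i < v->text_len);
    return v->base[i < v->gap_start ? i : i + v->gap_len];
}

/* Copy the text into dst (at least text_len bytes) as one contiguous
   run: the part before the gap, then the part after it. */
static void gap_copy_out(const struct gap_view *v, unsigned char *dst)
{
    size_t tail = v->text_len - v->gap_start;
    memcpy(dst, v->base, v->gap_start);
    memcpy(dst + v->gap_start, v->base + v->gap_start + v->gap_len, tail);
}
```

Whether the per-byte check or the copy is cheaper would have to be
measured; the sketch only shows where each cost comes from.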
But lexing is a _very_ small portion of the total parse time, so it's
really not worth worrying about the copy either; even sending the text
to a separate process does not take a noticeable amount of time.
If I convert to LSP style (https://langserver.org/), then the full text
is sent once, and only edits are sent after that, making the copy issue
irrelevant.
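
As a rough sketch of what "only edits are sent" means on the receiving
side, the external process keeps its own copy of the text and patches
it for each change. The function below is a hypothetical helper, not
taken from any LSP library, and it simplifies by using byte offsets;
real LSP change events give ranges as line/character positions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace bytes [start, end) of text with repl.
   Returns a newly malloc'd NUL-terminated string, or NULL if the range
   is invalid or allocation fails.  The caller frees the result. */
static char *apply_edit(const char *text, size_t start, size_t end,
                        const char *repl)
{
    assert(text != NULL && repl != NULL);
    size_t len = strlen(text);
    size_t repl_len = strlen(repl);
    if (start > end || end > len)
        return NULL;

    size_t new_len = len - (end - start) + repl_len;
    char *out = malloc(new_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, text, start);                              /* unchanged prefix */
    memcpy(out + start, repl, repl_len);                   /* inserted text */
    memcpy(out + start + repl_len, text + end, len - end); /* unchanged suffix */
    out[new_len] = '\0';
    return out;
}
```

With this scheme the full text crosses the process boundary once; each
later message carries only the replaced range and its new contents.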
--
-- Stephe