bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes

bug-gnu-emacs

From:	Dmitry Gutov
Subject:	bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date:	Fri, 19 Jan 2024 03:28:16 +0200
User-agent:	Mozilla Thunderbird

On 18/01/2024 23:24, Stefan Monnier wrote:

Still not sure how that would work.  I mean, we could have content-type
like text/html or application/json, but neither splits into two
languages, really.


Not sure what you mean by "split", but just as with major modes and
"languages", MIME content-types have inclusion properties, such as

     application/atom+xml ⊂ application/xml ⊂ text/plain

I meant to try to clarify your meaning when you said that content-types are to languages the same as what languages are to major modes.

That might mean that a content-type corresponds to a number of languages, just like a language corresponds to a number (open set) of major modes. But I don't see how. Please enlighten.

Speaking of the relation of inclusion, as I said before we might not want to make languages hierarchical (even though it might help for certain cases), because the relation might not be universal across uses.

And if you had the idea of expressing it through major mode inheritance as well, it's likely to have adverse effects, something like:


  (derived-mode-add-parents 'c++-ts-mode '(c++-lang))
  +
  (derived-mode-add-parents 'c++-lang '(c-lang))
  =
  (provided-mode-derived-p 'c++-ts-mode 'c-lang)

Which can lead some callers to decide that c++-ts-mode's language is C.
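
The transitivity concern can be tried out with a small sketch. It assumes an Emacs where `derived-mode-add-parents` exists (30.x) and that the function takes a list of extra parents; the symbols `my-c++-ts-mode`, `my-c++-lang` and `my-c-lang` are made up for illustration and stand in for the names above.

```elisp
;; Hypothetical symbols only: `my-c++-ts-mode', `my-c++-lang', `my-c-lang'.
;; Assumption: `derived-mode-add-parents' (Emacs 30) takes a LIST of
;; extra parents as its second argument.
(defvar my-lang-demo-result 'unavailable
  "Result of the inclusion check, or `unavailable' on older Emacs.")

(when (fboundp 'derived-mode-add-parents)
  ;; The TS mode declares its language as an extra parent ...
  (derived-mode-add-parents 'my-c++-ts-mode '(my-c++-lang))
  ;; ... and the language declares a "super-language".
  (derived-mode-add-parents 'my-c++-lang '(my-c-lang))
  ;; The two declarations compose, so the TS mode now also counts as
  ;; derived from the C language symbol.
  (setq my-lang-demo-result
        (and (provided-mode-derived-p 'my-c++-ts-mode 'my-c-lang) t)))
```

A caller that asks "is this mode derived from `my-c-lang`?" gets a positive answer for the C++ mode, which is the ambiguity described above.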

OTOH, the major mode can only run the language hook, I think, if any major
mode can correspond only to one language.

Not so.  A major mode can easily do
      (run-mode-hooks (compute-the-hook))

I guess that would mean that the language hook is not run automatically,
that each major mode would need explicit code to compute it and run.


Not necessarily, e.g. you could specify the language to
`define-derived-mode` with something like

     :language (compute-the-language)

and then have `define-derived-mode` compute the hook name from that.
This said, I suspect that generic major modes supporting many languages
will not be very numerous (after all, that's the point of being generic,
no?), so it should be OK if they have to do some things manually.
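
A minimal sketch of the "manual" variant, where a generic major mode computes its language and runs the matching hook itself. All names here (`my-generic-mode`, `my-generic--language`, the `<LANG>-lang-hook` naming scheme) are assumptions made for illustration; the `:language` keyword discussed above is only a proposal and is not used.

```elisp
;; Hypothetical names throughout; nothing here is an existing Emacs API
;; except `define-derived-mode', `run-mode-hooks' and `prog-mode'.

(defvar my-generic-default-language 'json
  "Fallback language symbol when nothing better can be computed.")

(defun my-generic--language ()
  "Guess a language symbol for the current buffer from its file extension."
  (let ((ext (and buffer-file-name
                  (file-name-extension buffer-file-name))))
    (if (and ext (> (length ext) 0))
        (intern (downcase ext))
      my-generic-default-language)))

(defun my-generic--language-hook (lang)
  "Return the hook symbol for LANG, e.g. `json-lang-hook' for `json'."
  (intern (format "%s-lang-hook" lang)))

(define-derived-mode my-generic-mode prog-mode "Generic"
  "Hypothetical major mode that serves several languages."
  ;; Inside the mode body this call is delayed and then run together
  ;; with the mode's other hooks, before `my-generic-mode-hook'.
  (run-mode-hooks (my-generic--language-hook (my-generic--language))))
```

With this, a user function added to `json-lang-hook` would run whenever `my-generic-mode` is enabled in a buffer whose computed language is `json`.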


Okay.

The side-effect of this approach is that we basically declare a mode's language twice: once in the attribute above, and once in the major-mode-remap-alist which is put into autoloads. But it's probably minor enough.

And if languages are distinct from major modes in naming, the :language attribute in define-derived-mode could make it run the corresponding hook at the end. Which seems good.

Though I suppose if set-auto-mode-0 saves the currently "detected"
language somewhere, the major mode definitions could pick it up and
call the corresponding hook.

Major modes are not activated solely via `set-auto-mode-0`, so relying
on that is a crutch/hack, not something on which to base a design.

The major mode could compute which language it is for. But the algorithm
could be undecidable if the buffer is not visiting a file yet, doesn't have
an interpreter comment, etc. That's where the command set-buffer-language
was supposed to come in handy.


That still doesn't justify the major mode relying on `set-auto-mode-0`.

AFAICT you seem to want to standardize how the user controls the language of
language-generic major modes.  I'm not sure we need such a standard.
Do we even have such a generic major mode yet?

In my picture that was just the natural conclusion. What I was trying to do is put a level of control above the major modes (the mapping from languages to modes) and make it easier to control and configure.

It didn't seem that the presence of a major mode was required to detect the expected language, hence the addition of the new values. At that point it seemed natural to both allow the absence of a configured major mode (why not), and to run the language hook anyway, for reliability.

If we really don't need any of that, then the auto-mode-alist and the companion vars don't even have to change, and the only place where the language name could feature (aside from the code looking it up) is the mode definitions.

I'm not comfortable enshrining the "-ts-mode" convention here.

We can still go the "strict" approach, where when no language is assigned,
we don't try to guess it.

I think the `<LANG>-mode` heuristic is acceptable, because it's been
*the* convention used in Emacs.

We are now getting a whole set of new modes for which this heuristic isn't
going to work


Cue the patch I submitted when I opened this bug report 🙂
Now `<LANG>-mode` is again included in `derived-mode-all-parents` for
those new modes.

If the language is called <LANG>-lang instead (or without suffix), then the major mode could also run the language-specific hook, which in your patch it cannot do.

Admittedly, it doesn't fully give a solution to the problem of computing
"the" language of a buffer.  But that gets us back to one of my recent
questions: beside Eglot, which other package needs that?  Is "the"
language always unique and always the same for all those packages?
Is it really the only thing those packages need?

In the case of Eglot, at least it doesn't seem to be the case: we don't
just need the language, but also the name of the language server to use.
And for some buffers there can be several applicable language servers,
and they don't necessarily all accept the same language.

So we need either the major mode to provide the name of the server, or
a central database that maps from language/type/mode to server name.
In both cases, adding the language info to the server name is
a non-issue.  And in neither case is it necessary to know "the" language
in order to find the server.  My patch makes the central database
work better.
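
As a rough illustration of the "central database" option, here is a sketch of a lookup keyed on mode symbols that also matches ancestors. The database, the server names and the `my-` symbols are invented for this example and are not Eglot's actual data structure; the extra-parent declaration assumes `derived-mode-add-parents` (Emacs 30) and is skipped when it is unavailable.

```elisp
;; Hypothetical database: keys are mode symbols, values are made-up
;; server names.
(defvar my-server-database
  '((my-c-mode . "c-server")
    (my-python-mode . "python-server"))
  "Alist mapping a mode symbol to a (made-up) language server name.")

(defun my-server-for-mode (mode)
  "Return the server name for MODE or one of its ancestors, else nil."
  (let ((entries my-server-database)
        found)
    (while (and entries (not found))
      ;; An entry applies if MODE is its key or derives from its key.
      (when (provided-mode-derived-p mode (caar entries))
        (setq found (cdar entries)))
      (setq entries (cdr entries)))
    found))

;; Declare a hypothetical TS mode whose extra parent is the non-TS mode,
;; the kind of relation the patch under discussion sets up.
(when (fboundp 'derived-mode-add-parents)
  (derived-mode-add-parents 'my-c-ts-mode '(my-c-mode)))
```

Once the TS mode lists the non-TS mode among its parents, `(my-server-for-mode 'my-c-ts-mode)` finds the `my-c-mode` entry without the database needing a separate line for the TS variant.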

I think I've included some thoughts on this subject in my previous email. They don't seem to be quoted/commented on here.

The major-mode could be fundamental-mode. If the language were to be
specifiable through settings external to major modes, we could still do
useful things while in fundamental-mode (e.g. do some useful editing with
Eglot, provided it supports indentation and completion), or suggest which
major modes to install from ELPA.


I don't see the interest of using specifically `fundamental-mode` for
that.  In any case, this seems too hypothetical at this stage to have
a good idea of what we'd need in such circumstances.

The latter feature (suggest which major modes to install) has come up recently. It's not that difficult to implement (with a whitelist of packages), and fundamental-mode is most likely *the* major mode which would be used until the suitable major mode is installed.

Would we really care that an xml file inside an archive is applied both
archive-subfile-mode and xml-mode dir-locals settings?


No, I wasn't thinking of XML files inside archives, but about files
which are both archives and something else (e.g. ODT).  The same applies
for most other "generic" data containers, like XML and JSON.

Okay, ODT. Which we can view with either doc-view-mode or xml-mode. Languages :doc or :xml. We configure one of these languages to be used by default, and switch to another at will.


Not sure it's useful to consider both modes somehow active at the same time.

Although this example does underscore the problem of major modes needing to be able to specify/change the current language themselves. At least if we don't want such modes as doc-view to have to be rewritten.

On the third hand, external tools (lsp servers, ripgrep, etc) will view such files as a certain type only - just ODT. Which might do us a disservice if the current detected language changes as we change the major mode. Hmm. And since xml-mode itself doesn't know ODT, it won't be able to "compute" that language value either (same would likely be true for other "container" modes).

Anyway, as I mentioned elsewhere, I think this discussion of "languages"
is only tangentially related to my proposed patch.  There is some
overlap, but they serve different purposes, and they're not
mutually exclusive.

I think the "languages" feature seems to cover the same functionality as your patch, and then some. Although at the expense of the downstream callers having to use the new feature, rather than having things work "automagically" (as soon as they stop supporting Emacs 29.1, that is).
