
## [AUCTeX-devel] Syntax specification for font locking

From: Ralf Angeli
Subject: [AUCTeX-devel] Syntax specification for font locking
Date: Sun, 29 Apr 2007 14:36:44 +0200
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.0.96 (gnu/linux)

```Hi,

the current code in font-latex.el is really quite complicated.  So I
thought of simplifying it a bit by getting rid of keyword classes and
letting every macro be specified on its own.  With this change it
would also be nice to make it possible to specify different colors for
different macro arguments (and possibly the macro itself) as well as
giving the user the possibility to override single macros by
specifying them in the user's list of macros.

In order to know how to highlight a macro there has to be a syntax
description.  Currently there are three symbols for macro types:
command (e.g. \foo{bar}), declaration (e.g. {\foo bar}) and noarg
(e.g. \foo).  In case of the command type one can specify the type and
sequence of arguments with a list of opening and closing delimiters
separated with spaces.  In addition a | character can be used to
denote alternatives.  For example in case of \newcommand the specifier
is "* {}|\\ [] [] {}", meaning \newcommand can be followed by an
asterisk, then a mandatory argument which can be either a group or a
macro and then two optional and one mandatory argument.

I'd like to get rid of the symbols and do everything with the
specifier string.  The specifier syntax should be easily
understandable in order for users to be able to create such
specifiers.  The current syntax with spaces looks a bit clunky but
makes it easier to find alternatives given with |.  Another
possibility would be to use single characters per token.  So in case
of \newcommand such a string could look like "\\*{|\\[[{".  The
leading backslash denotes the macro string itself and indicates that
it is a command (with or without arguments).  Since | is used between
two alternatives one always has to look one char ahead in order to
find an alternative.  From a parsing perspective it might be easier
to use it as a prefix, i.e. "|{\\" would mean: Use the next two tokens
as alternatives.  If the need arises to specify more than two
alternatives, the | char could be repeated to indicate the number of
alternatives.

The approach with one char per token has the drawback that you can no
longer type in an arbitrary delimiter pair (e.g. "<>" in case of
\frametitle) and have font-latex match it as an optional argument
without further code changes.  But regarding a delimiter pair other
than "{}" as an optional argument is probably a heuristic bound to
fail sooner or later anyway.  In order to make this bulletproof the
user would have to explicitly indicate the type of argument (optional
or mandatory).  One could probably support this with some kind of
prefix for one type of argument, e.g. "!" for mandatory ones.

In case of declarations a specifier like "{" would suffice.  To be
more expressive one could use "{\\" or "{\\.}" where the backslash
denotes the macro string and the period any kind of text before the
closing brace.  If such a more verbose syntax is used for declarations
it should probably be used with commands as well which would make
parsing more difficult.

I guess for font-latex the approach with one char per token, with a
prefix where necessary, would suffice.  I am not sure if we want to
use this kind of specification later for other purposes as well,
e.g. for providing a help text or a list of completions for each macro
argument.  Then it might be necessary to distinguish delimiters and
prefixes from tokens more explicitly or to mark tokens in some way.