help-gnu-emacs

Re: Own programming language mode - syntax highlighting


From: Tim X
Subject: Re: Own programming language mode - syntax highlighting
Date: Wed, 08 Dec 2010 15:28:34 -0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

Gary <help-gnu-emacs@garydjones.name> writes:

> Following Xah Lee's excellent tutorial, I have been able to get the
> basics done - syntax highlighting, indentation, and so on. What I am
> missing is a small part of the syntax highlighting related to variables.
>
> Declarations work fine - for example
> int x = 0
> is correctly highlighted. What I can't work out how to do is to
> highlight declared variables in the rest of the code, for example when I
> later use x such as
> x = x+1
>
> Does anyone have any ideas? Ideally I'd like to only highlight those
> variables I have really declared, not something that just looks like it
> *might* be a variable, so I can see immediately if I've made a mistake
> in my coding or typing.

A lot depends on the language, but in general you cannot do this
reliably without some sort of parsing support. Some people have tried
doing it with regexps, but unless the language is /very/ simple, the
regexps become too complex. To do it correctly, Emacs needs to
understand the code (i.e. parse it) to determine what class each token
belongs to. This means you need a mechanism to specify the grammar and
an engine to apply that grammar to the code. Consider something like
the following to see why regexps alone will not work:
int a;
int b;

    a = b;
    b = foo( a + 1 );
    c = bar() + b;

For Emacs to recognise that a, b and c are all variables, it needs to
know how they would be parsed. Worse still, to know that c has not been
declared as a variable, it needs to remember which variables have been
declared and recognise that c is not among them (or maybe it was
declared in an earlier context, e.g. as a global). It is fairly evident
that regexps are insufficient in this respect.
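
To make the "remember what has been declared" point concrete, here is a
rough sketch in Python (not elisp, and not a real font-lock setup). It
assumes a toy language where every declaration looks like "int NAME",
there is a single flat scope, and there are no comments or strings. The
names DECL, IDENT, declared_names and classify are made up for this
illustration. A first pass builds a table of declared names, and a
second pass classifies each identifier against that table.

```python
import re

# Assumption: declarations always have the form "int NAME" at line start.
DECL = re.compile(r"^\s*int\s+([A-Za-z_]\w*)", re.MULTILINE)
# An identifier, optionally followed by "(" (which marks a function call).
IDENT = re.compile(r"([A-Za-z_]\w*)(\s*\()?")
KEYWORDS = {"int"}

SNIPPET = """int a;
int b;

    a = b;
    b = foo( a + 1 );
    c = bar() + b;
"""


def declared_names(source):
    """First pass: collect every name introduced by an 'int NAME' declaration."""
    return set(DECL.findall(source))


def classify(source):
    """Second pass: label each identifier, in order of appearance.

    Returns a list of (name, kind) pairs where kind is one of
    'keyword', 'call', 'declared' or 'undeclared'.
    """
    declared = declared_names(source)
    result = []
    for match in IDENT.finditer(source):
        name = match.group(1)
        if name in KEYWORDS:
            kind = "keyword"
        elif match.group(2) is not None:
            kind = "call"
        elif name in declared:
            kind = "declared"
        else:
            kind = "undeclared"
        result.append((name, kind))
    return result
```

Run on the snippet above, classify(SNIPPET) marks a and b as declared,
foo and bar as calls, and c as undeclared. Even so, it is only a name
table bolted onto two regexps. Add nested scopes, comments, strings or
declarations of other types, and you are writing a parser.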

Things become further complicated when you're editing code, because the
buffer is often in a state where it cannot be parsed: statements are
incomplete or incorrect. At that point, you need to make a decision
about what to do with the font-locking of the code: leave it
incorrectly font-locked, remove the existing font-locking, or something
in between. To complicate matters further, you also need to consider
performance. Depending on the size of the files being edited,
continuously parsing the buffer is likely to degrade performance and
slow down editing.
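
One possible "in between" policy, sketched in Python under the same toy
assumptions: keep the result of the last successful analysis, reuse it
while the buffer is unparseable, and skip the work entirely when the
text has not changed. The analyse function here is a stand-in that
treats unbalanced parentheses as the only parse error, and the class
name HighlightCache is hypothetical. A real mode would also want to
defer the work, for example to an idle timer, rather than run it on
every keystroke.

```python
import re


def analyse(text):
    """Toy analysis: return the set of declared names.

    Assumption: unbalanced parentheses stand in for "cannot be parsed".
    """
    if text.count("(") != text.count(")"):
        raise SyntaxError("unbalanced parentheses")
    return set(re.findall(r"^\s*int\s+([A-Za-z_]\w*)", text, re.MULTILINE))


class HighlightCache:
    """Remember the last good analysis so half-typed code keeps its highlighting."""

    def __init__(self, analyse_fn):
        self._analyse = analyse_fn
        self._last_text = None
        self._last_good = None
        self.runs = 0  # how many times the analysis actually ran

    def update(self, text):
        """Return the freshest usable analysis for the given buffer text."""
        if text == self._last_text:
            return self._last_good  # nothing changed, so do no work
        self._last_text = text
        self.runs += 1
        try:
            self._last_good = self._analyse(text)
        except SyntaxError:
            pass  # buffer is mid-edit: fall back to the stale result
        return self._last_good
```

The trade-off is that the stale result can be wrong for a while: a
variable declared while the buffer is broken stays unhighlighted until
the code parses again.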

The CEDET tools and Semantic can be used to implement simple parsing of
code, but they are fairly complex, and you still have the issue of
handling incomplete code and deciding what to do with it.

In general, while it is theoretically possible to do what you want, the
amount of work required is often too high to be worth the effort.  

Tim
-- 
tcross (at) rapttech dot com dot au

