m4-discuss

ideas on changesyntax


From: Eric Blake
Subject: ideas on changesyntax
Date: Sun, 29 Oct 2006 21:53:28 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Thunderbird/1.5.0.7 Mnenhy/0.7.4.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm trying to reconcile how changesyntax plays with changequote, and had
some thoughts about changing the semantics of changesyntax.

Right now, changesyntax understands character ranges (such as a-z), but
not escape sequences.  It can be awkward specifying non-printing
characters, and the frozen file format supports escape sequences, so I
propose making changesyntax understand escape sequences (such as \\, \n,
\t, \f, \001, ...).
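
To make the escape-sequence idea concrete, here is a rough sketch in
Python (not m4's actual code) of how a category argument might be expanded
into byte values.  The function name expand_spec, the exact set of escapes,
and the choice to leave an unknown escape as a literal backslash are
assumptions for illustration only.

```python
def expand_spec(spec):
    """Expand a spec such as 'a-z' or r'\000-\377' into a list of byte values."""
    simple = {"\\": 0x5C, "n": 0x0A, "t": 0x09, "f": 0x0C}
    # First pass: decode escapes into (value, was_escaped) tokens.
    tokens = []
    i = 0
    while i < len(spec):
        ch = spec[i]
        if ch == "\\" and i + 1 < len(spec):
            nxt = spec[i + 1]
            if nxt in "01234567":
                # Up to three octal digits, e.g. \001 or \377.
                j = i + 1
                while j < len(spec) and j < i + 4 and spec[j] in "01234567":
                    j += 1
                tokens.append((int(spec[i + 1:j], 8), True))
                i = j
                continue
            if nxt in simple:
                tokens.append((simple[nxt], True))
                i += 2
                continue
        # Ordinary character (or a backslash that starts no known escape).
        tokens.append((ord(ch), False))
        i += 1
    # Second pass: an unescaped '-' between two tokens denotes a range.
    dash = (ord("-"), False)
    out = []
    k = 0
    while k < len(tokens):
        value = tokens[k][0]
        if k + 2 < len(tokens) and tokens[k + 1] == dash:
            out.extend(range(value, tokens[k + 2][0] + 1))
            k += 3
        else:
            out.append(value)
            k += 1
    return out
```

With this sketch, the spec \000-\377 expands to all 256 byte values, and a
trailing or leading dash stays a literal dash.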

Right now, changesyntax() is a no-op.  I propose making an empty argument
revert the entire syntax table back to the startup default (similar to how
changequote without arguments reverts to the default; note that
changesyntax itself should remain blind, i.e. recognized only when followed
by an open parenthesis, since it is a new builtin in 2.0).

Right now, changesyntax(w) makes all 256 characters part of the word
syntax category, which is an unrecoverable action, because from then on,
all characters you type will be appended to the ever-growing macro name.
I assume it was intended that you would do something like changesyntax(w,
`((', `))') to start with everything as word, then revert the few
exceptions with additional arguments to the same changesyntax before
things are finalized.  But that seems rather heavy-handed (not to mention,
still relatively easy to do with escape sequences, as in
changesyntax(w\000-\377), if it is still desired).  I propose making a
one-character argument restore the default state of that syntax category.
In other words, changesyntax(w0-9b-zB-Z) followed by changesyntax(w)
would restore aA_ as words and remove 0-9 from the word category; then,
since every character must always belong to some category, it would also
restore 0-9 to their default category of digits.  Likewise, changesyntax(@)
would disable the escape category, since it is not enabled by default.
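
The proposed reset rules can be modeled with a small Python sketch of the
syntax table.  This is only an illustration of the semantics, not the m4
implementation: the names DEFAULT and change_syntax, the labels "digit" and
"other", and the choice to drop unlisted members of a reassigned category
to "other" are all assumptions.

```python
import string

def default_table():
    """Startup table: every byte value belongs to exactly one category."""
    table = {c: "other" for c in range(256)}
    for ch in string.ascii_letters + "_":
        table[ord(ch)] = "w"
    for ch in string.digits:
        table[ord(ch)] = "digit"
    return table

DEFAULT = default_table()

def change_syntax(table, category="", chars=""):
    """Apply one changesyntax argument under the proposed semantics."""
    if not category:
        # Empty argument: revert the whole table to the startup default.
        table.clear()
        table.update(DEFAULT)
        return
    if not chars:
        # Category letter alone: restore that category's default members and
        # send its current non-default members back to their own defaults.
        for c in range(256):
            if DEFAULT[c] == category or table[c] == category:
                table[c] = DEFAULT[c]
        return
    wanted = {ord(ch) for ch in chars}
    for c in range(256):
        if c in wanted:
            table[c] = category
        elif table[c] == category:
            # Assumption: former members that are not listed become "other".
            table[c] = "other"

# The example from the text: changesyntax(w0-9b-zB-Z), then changesyntax(w).
table = dict(DEFAULT)
members = string.digits + string.ascii_lowercase[1:] + string.ascii_uppercase[1:]
change_syntax(table, "w", members)
change_syntax(table, "w")
```

After the two calls at the end, the table is identical to DEFAULT again:
aA_ are words and 0-9 are back in the digit category.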

I recently changed the L (left-quote) and B (begin-comment) categories to be
mutually exclusive with other categories, rather than add-on attributes,
since that made the input parser easier.  But this has the unintended
consequence that changesyntax(L[, O[) currently leaves the syntax engine
in an inconsistent setup, where it remembers that [ was a left-quote
character, but currently no character is specified as a left quote, so
quotes are effectively disabled.  GNU already differs from other
implementations in that changequote() disables quotes (Solaris reverts to
the default `' quoting, and BSD retains the previous quoting unchanged); M4
is pretty difficult to use without quoting.  So I'm proposing that if any
changesyntax action leaves the syntax table with no characters in the L
syntax category, the default `' be reinstalled in the same manner as
multi-character quotes from changequote (in other words, ` can still start
quotes even though it no longer belongs to the L syntax category).
Disabling comments is not as drastic, and POSIX requires that comments can
be disabled, so emptying the B syntax category can remain as an
alternative way to disable comments.
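
As a rough Python sketch of the fallback rule (again only a model of the
proposal, with hypothetical names; the category letter R for right quotes
is an assumption, while L, B and O are the letters used above):

```python
DEFAULT_LEFT_QUOTE = "`"

def left_quote_starters(table):
    """Return (characters that start a quote, whether the fallback applied)."""
    members = {ch for ch, cat in table.items() if cat == "L"}
    if members:
        return members, False
    # No character is in the L category: the default ` starts quotes as a
    # delimiter string, without being moved into the L category itself.
    return {DEFAULT_LEFT_QUOTE}, True

def comments_enabled(table):
    """Comments stay disabled when the B category is empty (no fallback)."""
    return any(cat == "B" for cat in table.values())

# State after changesyntax(L[): [ is the only left-quote character.
table = {"[": "L", "]": "R", "#": "B", "`": "O"}
before = left_quote_starters(table)
# Effect of a later O[ argument: the L category becomes empty.
table["["] = "O"
after = left_quote_starters(table)
```

Here before reports [ as the quote starter, and after reports ` via the
fallback while ` itself still sits in the O category.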

Any comments before I proceed on this path?

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFRYVI84KuGfSFAYARAkcOAKCm1kYdSZTWdYAr/NA+FlOtRl5aWgCfcJ8i
C3UZb29qB/dR7fDGNmnqVkk=
=grA+
-----END PGP SIGNATURE-----



