help-gnu-emacs

Re: Meta-Characters, Special Characters


From: xah
Subject: Re: Meta-Characters, Special Characters
Date: 30 May 2007 18:20:17 -0700
User-agent: G2/1.0

The following is a modified and extended version of the previous
article.
The HTML formatted version is available at
http://xahlee.org/emacs/keystroke_rep.html

The Confusion of Emacs's Keystroke Representation

Xah Lee, 2007-05-29

Someone wrote:
«
[about the various ways to input or represent keystrokes and or
non-printable characters in Emacs]

As far as I can see in all those situations entering meta-characters
is addressed in a different way which I find confusing, e.g.:
a) <key> _or_ C-q <key>
b) C-q C-[, C-q C-m, C-q C-j, C-q C-i
c) \e, \r, \n, \t
d) (define-key [(meta c) (control c) (tab c)] "This is confusing!")
»

None of this complexity is intrinsic, except your item d. Your first
item:

«C-q <key>»

C-q (hold down the Control key and type q) is the keyboard shortcut
that invokes the command quoted-insert. After this command is invoked,
the next key you press makes emacs insert the character represented by
that key, and suppresses that key's normal function.

For example, suppose you are doing string replacement and you want to
replace tabs by returns. When emacs prompts you for the string to
replace, you can't just press the tab key, because in that prompt the
tab key normally tries to do completion. (In other applications, it
usually moves you to the next input field.) So, here you type C-q
first, then press the tab key. Similarly, you can't press the return
key and expect it to insert a return character, because normally the
return key activates the OK button or means “end of input”.

This input mechanism usually doesn't exist in other text editors. In
popular applications such as Microsoft Word or those on the Mac, you
usually bring up a window showing all the special characters, then
press a button to insert the char you want.

«C-q C-[, C-q C-m, C-q C-j, C-q C-i»

Here, C-q is again the keyboard shortcut for the command
quoted-insert, which inserts a literal character for whatever key you
type next. So, for example, C-q followed by the tab key inserts the
non-printable character “tab”.

The C-[, C-m, C-j etc. key combinations (holding down the Control key
while pressing “[”, “m”, “j”) are ways to input non-printable
characters that may not have a corresponding key on the keyboard.

For example, suppose you want to do string replacement, replacing
Carriage Return (ASCII 13) by Line Feed (ASCII 10). Depending on your
operating system and keyboard, your keyboard usually has a key that
corresponds to just one of these characters. But with this method of
inputting non-printable characters, you can type any of them
directly: C-q C-m inserts a Carriage Return, and C-q C-j inserts a
Line Feed.
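
The same Carriage Return to Line Feed replacement can also be done
from elisp instead of interactively. What follows is only a minimal
sketch: the function name my-cr-to-lf is made up for illustration and
is not part of Emacs, and it works on a string rather than on a
buffer.

```elisp
;; Hypothetical helper (the name is illustrative, not an Emacs built-in).
;; Return STRING with every Carriage Return (ASCII 13) replaced by a
;; Line Feed (ASCII 10).
(defun my-cr-to-lf (string)
  ;; (string 13) and (string 10) build one-character strings from the
  ;; character codes.  The two trailing t arguments ask for a
  ;; case-preserving, literal replacement.
  (replace-regexp-in-string (string 13) (string 10) string t t))

;; Example call: the CR between "one" and "two" becomes a LF.
(my-cr-to-lf (concat "one" (string 13) "two"))
```

Interactively, the rough equivalent is to run query-replace and type
C-q C-m at the first prompt and C-q C-j at the second.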

When speaking of non-printable characters, some standard character set
is implied by the context. Implicitly, we are talking about ASCII, and
this applies to emacs. Now, in ASCII, there are 33 non-printable
(control) characters: codes 0 through 31, plus 127. Each of these is
given a standard abbreviation, and several representations for
different purposes. For example, ASCII 13 is the “Carriage Return”
character, with the abbreviation CR, and ^M as its control-key-input
representation (M being the 13th letter of the English alphabet).
Control-m is the conventional means to input the character, and the
conventional way to write a control key combination is a caret “^”
followed by the character.
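
The caret notation is plain arithmetic on ASCII codes, which can be
checked in elisp. This is a small sketch meant to be evaluated one
form at a time (for example with C-x C-e). It relies only on the fact
that uppercase “A” is ASCII 65, so the letter's code minus 64 is the
code of the matching control character.

```elisp
;; ?M is the character code of "M" (77).  Subtracting 64 gives the
;; control character written ^M.
(- ?M 64)      ; 13, Carriage Return
(- ?J 64)      ; 10, Line Feed
(- ?I 64)      ; 9, Horizontal Tab
(- ?\[ 64)     ; 27, Escape

;; Elisp's own read syntax for control characters gives the same codes.
(= ?\C-m 13)   ; t
(= ?\^M 13)    ; t
```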

For the full detail, look at the table in the Wikipedia article on
ASCII.

In general, the practical issues involved for a non-printable
character, in the context of a programing language for text editing,
are: its display representation, its input method, and the display
representation for the character's input method.

(Note: Emacs also has a general way to input non-printable and/or
non-typable characters of the Unicode standard. See Emacs and Unicode
Tips.)

«\e, \r, \n, \t »

This is an ad hoc set of input and display representations for a few
non-printable characters. This set was started by the motherfucking
unix tech geeking morons, and by its free and speedy nature, like
cigarettes given to children, has today spread to many languages
(Perl, Java, C++, C#, Python, JavaScript ...) and is a de facto
standard. The damage is to such a degree that the general concept of
unprintable characters, their representation, and their method of
input, all treated in one systematic, simple way, is not in the
consciousness of average industrial programers.

I do not know the history of these display representations. My guess
is that part of the reason for them is that the unix text editor vi
doesn't have a general way to input and/or represent non-printable
chars. Other reasons are that these particular non-printable chars are
far more frequently needed in text/string manipulation in programing
languages, that the backslash representations are somewhat more
intuitive, and that processing backslashed characters as a “string
escape” mechanism works better inside strings in programing languages
than the representation that prefixes a caret “^”.
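
The backslash escapes and the control-key notation name the same four
characters, which can be checked in elisp. This is a small sketch to
evaluate one form at a time. It assumes a stock Emacs, where kbd
returns a plain string for these keys.

```elisp
;; Each backslash escape is just a character code.
(list ?\e ?\r ?\n ?\t)        ; (27 13 10 9)

;; The control-key spellings produce the very same characters.
(equal (kbd "C-[") "\e")      ; t
(equal (kbd "C-m") "\r")      ; t
(equal (kbd "C-j") "\n")      ; t
(equal (kbd "C-i") "\t")      ; t
```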

«
(global-set-key (kbd "M-a") 'func-name)      ; meta a
(global-set-key (kbd "C-a") 'func-name)      ; control a
(global-set-key [f2]   'func-name)      ; F2 key
(global-set-key [kp-2] 'func-name)      ; the 2 key on the number keypad
(global-set-key [M-f2] 'func-name)      ; meta f2
(global-set-key [(meta shift a)] 'func-name)  ; Meta shift a (capital A)
(global-set-key [?\C-x ?a] 'func-name)  ; control x, followed by a
(global-set-key [?\C-x f2] 'func-name)  ; control x, followed by f2
[This is confusing!]
»

This is elisp code to define keyboard shortcuts. It is the only part
of the complexity in our context that we can blame on emacs's design.

Emacs today has several rather confusing ways to represent
keystrokes, for mostly historical reasons. One example is the need to
keep compatibility between Emacs and XEmacs. Another is that elisp the
language uses integers to represent characters. So, for example, the
number 97 in lisp's keystroke code also means the keystroke “a”. These
mostly historical reasons are exacerbated by the influence of the unix
mentality “Why Change when it ain't broken”.
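
To see that several of these notations denote the same keystrokes, one
can ask emacs to describe them. This is a small sketch to evaluate one
form at a time. It uses only the built-in functions kbd and
key-description, and the results in the comments are what a stock
Emacs is expected to return.

```elisp
;; A character is an integer: "a" is 97.
(= ?a 97)                          ; t
(key-description [97])             ; "a"

;; The vector form and the kbd form describe the same key sequence.
(key-description [?\C-x ?a])       ; "C-x a"
(key-description (kbd "C-x a"))    ; "C-x a"
(key-description [?\C-x f2])       ; "C-x <f2>"

;; For function keys, kbd produces the same vector as the literal form.
(equal (kbd "<f2>") [f2])          ; t
(equal (kbd "M-<f2>") [M-f2])      ; t
```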

Note here that keystroke combinations and sequences are not the same
as, and cannot be mapped to, the input or representation of characters
in a character set such as ASCII. For example, the F1 key on the vast
majority of keyboards isn't a character. The Alt modifier key isn't a
character, nor does it correspond to any of ASCII's non-printable
characters. The keys on the number keypad need a different
representation than the ones on the main keyboard section.
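
The difference between a keystroke and a character shows up directly
in elisp. This is a small sketch to evaluate one form at a time. It
uses only built-in predicates and event functions, and the results in
the comments are what a stock Emacs is expected to return.

```elisp
;; An ordinary key is a character, that is, a small integer.
(characterp ?a)                  ; t

;; A function key is represented by a symbol, not a character.
(symbolp (aref [f1] 0))          ; t

;; The keypad 2 is the symbol kp-2, distinct from the character "2".
(eq (aref [kp-2] 0) ?2)          ; nil

;; Meta-a is an integer with a modifier bit set, too large to be a
;; valid character code.
(characterp ?\M-a)               ; nil
(event-modifiers ?\M-a)          ; (meta)
(event-basic-type ?\M-a)         ; 97, that is, ?a
```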

So, this means that when you have an editor with a language, such as
emacs, that allows users to define arbitrary keystroke combinations
and sequences, you necessarily have to come up with a system to
represent keystrokes. So, this complexity is an intrinsic complexity.

(Side note: An easy way to understand what's intrinsic vs extraneous
complexity is to think: “My god, why is math so complex? God must have
fucked up in its design.” The gist is that certain things are
inherently complex by nature, while others have extraneous complexity
that is artificially created by lousy design or historical baggage.
As a concrete example in computing, languages like Lisp are in general
very well designed. Due to their simplicity and almost no artificial
complexity, programers are immediately exposed to much of the
intrinsic complexity of computing. Meanwhile, languages like C and its
litter such as C++, Java, C#, Perl etc., and in general software in
unix, created by the unix motherfuckers, are filled to the brim with
artificial complexity due mostly to laziness/hack, ignorance, and
lies.)

For various ways to represent keystrokes in emacs, see How to Define
Keyboard Shortcuts in Emacs.

For the unix mentality “Why Change when it ain't broken”, see Why
Change when it ain't broken.

We, as software creators, must not have unix's “why change when it
ain't broken” attitude. Emacs itself, although far better thought out
than the majority of software, has nevertheless acquired much baggage
in its 30 or so years. I would recommend that we start an effort to
eliminate some of this outdated baggage. Please see: The
Modernization of Emacs.

  Xah
  xah@xahlee.org
∑ http://xahlee.org/


