From: Artur Malabarba
Subject: Re: Single quotes in Info
Date: Tue, 27 Jan 2015 23:15:22 -0200

Eli, if I may ask, did you get a chance to look at the code? (It's quite short.)
The last couple of emails give me the impression that we're not quite on the same page.
On 27 Jan 2015 19:18, "Eli Zaretskii" <address@hidden> wrote:
>
> > Date: Tue, 27 Jan 2015 18:24:09 -0200
> > From: Artur Malabarba <address@hidden>
> > Cc: Marcin Borkowski <address@hidden>, emacs-devel <address@hidden>
> >
> > > If this is implemented in isearch, then IMO doing it for quotes alone
> > > makes very little sense.
> >
> > The quotes are just proof of concept.
>
> Yes, but what concept is that? Does it scale up to a general-purpose
> feature of the kind that suits isearch.el? Just replacing one
> character for another doesn't, IMO.
No. It replaces one character with an arbitrary regexp. In the quotes case that's used to match about a dozen different quotation characters, but it's not limited to that. You can also use that to implement lax-whi
> > > If we do this via our private database, that database is going to be
> > > huge.
> >
> > Is it? I would expect something on the order of 50 lines.
>
> There are more than 5000 characters in the Unicode database that have
> equivalence and canonical decompositions. (Look for entries in
> UnicodeData.txt whose 6th field is non-empty.)
The purpose of this is to allow the user to search for complex characters (such as curly quotes or any of these "“””„⹂〞‟‟❞❝❠“„〝〟🙷🙶🙸) by typing a simple character available on simple keyboards (such as the plain double quote "). Each simple character needs an entry in the `isearch-groups-alist' variable. The maximum number of entries this alist will ever need (in the very worst case) is the number of simple characters on a simple keyboard, which is far fewer than 5000, last I checked.
This might be easier to understand by looking at the code.
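
For readers without the patch at hand, here is a rough Python sketch of the idea described above. It is not the Emacs Lisp code under discussion: `SEARCH_GROUPS` and `lax_pattern` are hypothetical names standing in for the alist and for whatever isearch would do with it, and the character lists are illustrative rather than complete.

```python
import re

# Hypothetical stand-in for the alist idea: each simple character maps to
# a regexp fragment that matches the character itself and its complex
# equivalents. Only keyboard characters ever need an entry.
SEARCH_GROUPS = {
    '"': '["“”„‟〝〞〟]',
    "'": "['‘’‚‛]",
}

def lax_pattern(query):
    """Build a regexp in which grouped characters match their whole group."""
    # Characters without an entry are escaped and match only themselves.
    parts = [SEARCH_GROUPS.get(ch, re.escape(ch)) for ch in query]
    return re.compile("".join(parts))

pattern = lax_pattern('say "hi"')
print(bool(pattern.search("say “hi”")))
```

Since the value is an arbitrary regexp fragment, an entry is not restricted to one-for-one character substitution.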
> > > We already have infrastructure for that, see
> > > the description of the 'decomposition' character property in the ELisp
> > > manual.
> >
> > Building this on preexisting infrastructure would be great, but does that go
> > the right way? Does it relate a simple character to all its complex
> > equivalents? Or does it relate each complex character to a simple alternative?
> The latter. Read paragraph 1.1 of UAX #15 for the starting point, and
> also section 3.7 of the Unicode Standard.
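
The direction of that mapping can be checked outside Emacs with Python's standard `unicodedata` module, whose `decomposition` function returns the decomposition field of UnicodeData.txt for a character. The sketch below is only an illustration of the point being discussed: `base_of` and `invert` are hypothetical helpers, only the first level of decomposition is followed, and the scan is limited to code points below U+3000 to keep it quick.

```python
import unicodedata
from collections import defaultdict

def base_of(ch):
    """Return the first character of ch's decomposition mapping, or None."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings start with a tag such as <compat>; skip it.
    if fields and fields[0].startswith("<"):
        fields = fields[1:]
    return chr(int(fields[0], 16)) if fields else None

def invert(limit=0x3000):
    """Scan code points below limit and group complex characters by base."""
    inverse = defaultdict(set)
    for cp in range(limit):
        ch = chr(cp)
        base = base_of(ch)
        if base is not None:
            inverse[base].add(ch)
    return inverse

inverse = invert()
print(base_of("é"))
print("é" in inverse["e"])
```

The database answers "what does this complex character decompose to?" directly, while the simple-to-complex relation has to be built by inverting it, as `invert` does here.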