help-gnu-emacs

Re: alist keys: strings or symbols


From: tomas
Subject: Re: alist keys: strings or symbols
Date: Mon, 20 Jul 2020 11:01:35 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Sun, Jul 19, 2020 at 06:23:52PM +0200, excalamus--- via Users list for the 
GNU Emacs text editor wrote:
> Some questions about alists:
> 
> - Is it a better practice to convert string keys to symbols?

It depends. Strings have an "inner life", i.e. they are sequences
of characters; symbols are atomic and have no innards (but
see below).

So if you just want to know whether two keys are equal or not,
symbols are the more appropriate choice, and comparing them is
faster, too. If you find yourself asking whether one key is "greater"
(that'd be lexicographically, I guess) or "less" than another, or
whether it has such-and-such a prefix, you'd rather want a string.

The borders are somewhat fuzzy, since it's possible to extract
the string representation of a symbol (its name). In Emacs Lisp
they are even fuzzier, since, given the right context, you can
treat a symbol as a string. This works in Emacs Lisp:

  (string< 'boo "far")
  => t

Emacs Lisp transforms 'boo to "boo" and compares the strings
lexicographically.
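
A small sketch of both directions, using only built-ins (=symbol-name=, =string-prefix-p=, =string<=, =intern=). The key name =post-title= is made up for illustration:

```emacs-lisp
;; A symbol's name is reachable as a string via `symbol-name'.
(symbol-name 'post-title)                           ; => "post-title"

;; Once you have the string, string operations such as prefix tests work.
(string-prefix-p "post-" (symbol-name 'post-title)) ; => t

;; `string<' accepts symbols directly and compares their names.
(string< 'boo "far")                                ; => t
(string< 'far 'boo)                                 ; => nil

;; `intern' goes the other way, from string to symbol.
(eq (intern "post-title") 'post-title)              ; => t
```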

* Different equalities:

What you have to bear in mind is that there are different measures
of equality: if you are comparing just the "objects" (if you come
from C, that's --basically-- the objects' addresses), you use =eq=.
In that case, asking for "greater" or "less" doesn't make much sense.

If you are comparing the objects' "innards", you use =equal=.
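
To see the difference, here is a minimal sketch. =copy-sequence= is used only to make sure the two strings really are separate objects:

```emacs-lisp
;; Two separately created strings with the same contents are
;; distinct objects, so `eq' says no while `equal' says yes.
(let ((a (copy-sequence "title"))
      (b (copy-sequence "title")))
  (list (eq a b) (equal a b)))        ; => (nil t)

;; Interned symbols with the same name are the same object,
;; so `eq' is enough.
(eq 'title 'title)                    ; => t
(eq (intern "title") 'title)          ; => t
```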

>  Is =intern= best for this?  What about handling illegal symbol names?

Yes. And... there are few, if any, illegal symbol names. Try

  (setq foo (intern ".("))

It works. It's a funny symbol, but who cares ;-)
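
A slightly fuller sketch of the same point, going through =intern= explicitly and then using the odd symbol as an alist key (the value =funny= is just a placeholder):

```emacs-lisp
(let ((odd (intern ".(")))
  (list (symbolp odd)                              ; it really is a symbol
        (symbol-name odd)                          ; and its name is intact
        (alist-get odd (list (cons odd 'funny))))) ; usable as a key
;; => (t ".(" funny)
```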

> - If a symbol is used as a key and that symbol is already in use
>   elsewhere, is there potential for conflict with the existing symbol?

No. Interning something gives you an address (well, there's a type
tag attached to it). If the symbol already exists, =intern= returns
that same symbol; otherwise a new one is created. Since a symbol's
identity is immutable, and an alist key uses nothing but that
identity, you don't care.
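
The reuse can be checked directly. In this sketch the long name is invented and assumed not to be interned by anything else in the session:

```emacs-lisp
;; Interning the same name twice yields the very same symbol.
(eq (intern "author") (intern "author"))     ; => t

;; `intern-soft' only looks a name up and never creates a symbol,
;; so after `intern' it finds exactly the symbol `intern' returned.
(let* ((name "exc-demo-surely-unused-name-314159") ; hypothetical name
       (sym  (intern name)))
  (eq sym (intern-soft name)))               ; => t
```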

[...]

> Notice that the keys are strings.  This means that they require
> an equality predicate like ='string-equal= to retrieve unless I use
> =assoc= and =cdr=:

They only require it because you want them compared _as strings_. Had
you put symbols in there, you could have used =eq= as the comparison,
which is the default for =alist-get= (so you can leave it out).
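
With symbol keys the lookup needs no predicate at all. A minimal sketch, with made-up keys =title= and =author=:

```emacs-lisp
(let ((meta '((title  . "Test post")
              (author . "Excalamus"))))
  (list (alist-get 'title meta)       ; `eq' is the default test
        (cdr (assq 'author meta))))   ; `assq' also compares with `eq'
;; => ("Test post" "Excalamus")
```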

[...]

> This works, but now the code is getting messy. There are two forms of
> lookup: the verbose =alist-get= and the brute force =assoc/cdr=.  One
> requires ='string-equal=, the other does not.  If I forget the
> predicate, the lookup will fail silently.

"Fail silently" here means that, without the predicate, the lookup
compares with =eq=, finds no identical object in your assoc list,
and just returns nil.
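
A sketch of that silent miss with string keys, assuming an Emacs recent enough for =alist-get= to take a TESTFN argument (26 or later):

```emacs-lisp
(let ((meta (list (cons "TITLE" "Test post"))))
  (list
   ;; Default test is `eq': this key string is a different object
   ;; from the one in the alist, so nothing is found and nil comes back.
   (alist-get (copy-sequence "TITLE") meta)
   ;; Passing a DEFAULT value makes the miss visible.
   (alist-get (copy-sequence "TITLE") meta 'missing)
   ;; With an explicit string test the lookup succeeds.
   (alist-get "TITLE" meta nil nil #'string-equal)
   ;; `assoc' compares with `equal' by default, so it works as well.
   (cdr (assoc "TITLE" meta))))
;; => (nil missing "Test post" "Test post")
```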

> I could convert the keys to symbols using =intern=.  

All that said, I think you should go with this... unless you find
yourself looking at the innards of your keys too often (extracting
prefixes, doing case-insensitive search, that kind of thing). Remember
that =eq= is just one comparison (address, basically), whereas =equal=
has to first dereference the string and then compare character by
character.

Your keywords are a choice from a limited set, and are immutable,
so to me, they /look/ like symbols. That seems to be the fitting
representation.
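
If the alist already exists with string keys, the conversion could look roughly like this. =exc-demo-intern-keys= is a hypothetical helper, not something from the original code:

```emacs-lisp
(defun exc-demo-intern-keys (alist)
  "Return a copy of ALIST with each string key replaced by a symbol.
Hypothetical helper, not part of the original poster's code."
  (mapcar (lambda (pair)
            (cons (intern (car pair)) (cdr pair)))
          alist))

(let ((meta (exc-demo-intern-keys '(("TITLE"  . "Test post")
                                    ("AUTHOR" . "Excalamus")))))
  (alist-get 'AUTHOR meta))           ; => "Excalamus"
```

After the conversion every lookup can use the default =eq= test, so there is only one form of lookup left to remember.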

> This has several apparent problems.
> 
> As I understand it, this would pollute the global obarray. Is that a
> real concern?

Shouldn't be. The global obarray is built for this.

> [...]  Regardless, I
> don't want my package to conflict with (i.e. overwrite) a person's
> environment unknowingly.

It won't. The obarray just maps a string to some immutable thingy
(basically a pointer with some decorations). This thingy can be
used for many things in different contexts. If some package out
there, say =shiny-widgets.el=, binds some variable to the symbol
named "THE TITLE", that won't interfere with your usage. You just
happen to both use the symbol =0xdeadbeef-plus-some-type-tags=
(which points to the symbol "THE TITLE" in the obarray) for
different things.
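
A sketch of that coexistence. The value =shiny-widgets-value= stands in for whatever the other (imaginary) package stores:

```emacs-lisp
(let ((key (intern "THE TITLE")))
  ;; Pretend some other package uses the same symbol as a variable.
  (set key 'shiny-widgets-value)
  ;; Our alist only relies on the symbol's identity as a key.
  (let ((meta (list (cons key "Test post"))))
    (list (alist-get (intern "THE TITLE") meta)
          (symbol-value key))))
;; => ("Test post" shiny-widgets-value)
```

The alist never reads or writes the symbol's value cell, so both uses go their own way.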

> 
> The string may also have characters illegal for use as a symbol.  
> Here's what happens with illegal symbol characters in the string.
> #+begin_src emacs-lisp :results verbatim :session exc
> (setq exc-bad-meta-data
>   (concat
>    "#+THE TITLE: Test post\n"
>    "#+AUTHOR: Excalamus\n"
>    "#+DATE: 2020-07-17\n"
>    "#+POST TAGS: blogging tests\n"
>    "\n"))
> 
> (setq exc-alist-i-bad (exc-parse-org-meta-data-intern exc-bad-meta-data))

I haven't had a look at your code, but "THE TITLE" interns fine as a
symbol here.
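
For illustration only, here is how such a parser might intern its keys. =exc-demo-parse-meta= is a hypothetical stand-in, since the original =exc-parse-org-meta-data-intern= is not shown in this message, and the "#+KEY: value" line format is assumed from the sample data:

```emacs-lisp
(defun exc-demo-parse-meta (text)
  "Parse lines of the form \"#+KEY: value\" in TEXT into an alist.
Keys are interned as symbols.  Hypothetical stand-in for the
original poster's parser, which is not shown here."
  (let ((start 0)
        (result '()))
    ;; Group 1 is the key (anything up to the colon), group 2 the value.
    (while (string-match "^#\\+\\([^:\n]+\\):[ \t]*\\(.*\\)$" text start)
      (push (cons (intern (match-string 1 text))
                  (match-string 2 text))
            result)
      (setq start (match-end 0)))
    (nreverse result)))

(let ((meta (exc-demo-parse-meta
             (concat "#+THE TITLE: Test post\n"
                     "#+AUTHOR: Excalamus\n"))))
  (list (alist-get (intern "THE TITLE") meta)
        (alist-get 'AUTHOR meta)))
;; => ("Test post" "Excalamus")
```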

The important thing is that you make a choice and stick consistently
to it. That includes being aware of the comparison functions used.

Cheers
-- t
