alist keys: strings or symbols


From: excalamus
Subject: alist keys: strings or symbols
Date: Sun, 19 Jul 2020 18:23:52 +0200 (CEST)

Some questions about alists:

- Is it a better practice to convert string keys to symbols?  Is
  =intern= best for this?  What about handling illegal symbol names?
- If a symbol is used as a key and that symbol is already in use
  elsewhere, is there potential for conflict with the existing symbol?

I have an alist created from parsing meta data from a file.  The file
looks like:

#+begin_src emacs-lisp :results verbatim :session exc
(defvar exc-post-meta-data
  (concat
   "#+TITLE: Test post\n"
   "#+AUTHOR: Excalamus\n"
   "#+DATE: 2020-07-17\n"
   "#+TAGS: blogging tests\n"
   "\n")
  "Sample post meta information.")

(defvar exc-post-content
  (concat
   "* Header\n"
   "** Subheader\n"
   "Hello, world!\n\n"
   "#+begin_src python\n"
   "    print('Goodbye, cruel world...')\n"
   "#+end_src\n")
  "Sample post file without meta information.")

(defvar exc-post
  (concat
   exc-post-meta-data
   exc-post-content)
  "Sample post file.")

(message "%s" exc-post)
#+end_src

#+RESULTS:
#+begin_example
"#+TITLE: Test post
,#+AUTHOR: Excalamus
,#+DATE: 2020-07-17
,#+TAGS: blogging tests

,* Header
,** Subheader
Hello, world!

,#+begin_src python
    print('Goodbye, cruel world...')
,#+end_src
"
#+end_example

The meta data is parsed into an alist:

#+begin_src emacs-lisp :results verbatim :session exc
(defun exc-parse-org-meta-data (data)
  "Parse Org formatted meta DATA into an alist.

Keywords are the '#+' options given within an Org file.  These
are things like TITLE, DATE, and FILETAGS.  Keywords are
case-sensitive!  Values are whatever remains on that line."
  (with-temp-buffer
    (insert data)
    (org-element-map (org-element-parse-buffer 'element) 'keyword
      (lambda (x) (cons (org-element-property :key x)
                        (org-element-property :value x))))))

(setq exc-alist (exc-parse-org-meta-data exc-post))
exc-alist
#+end_src

#+RESULTS:
: (("TITLE" . "Test post") ("AUTHOR" . "Excalamus") ("DATE" . "2020-07-17") ("TAGS" . "blogging tests"))

Notice that the keys are strings.  =alist-get= compares keys with
=eq= by default, so retrieving a string key requires passing an
equality predicate like ='string-equal=, unless I use =assoc= (which
compares with =equal=) and =cdr=:

#+begin_src emacs-lisp :results verbatim :session exc
(alist-get "TITLE" exc-alist)
#+end_src

#+RESULTS:
: nil

#+begin_src emacs-lisp :results verbatim :session exc
(cdr (assoc "TITLE" exc-alist))
#+end_src

#+RESULTS:
: "Test post"
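
A minimal sketch of why the two lookups above differ, assuming the
default comparison functions described in the Elisp manual (=assq=
uses =eq=, =assoc= uses =equal=).  The alist here is a small
stand-in for illustration, not the parsed one:

#+begin_src emacs-lisp :results verbatim
;; Two strings with the same contents are `equal' but, in general,
;; not `eq', so an `eq'-based lookup misses string keys.
(let ((alist '(("TITLE" . "Test post"))))
  (list (assq "TITLE" alist)                        ; compares with `eq'
        (assoc "TITLE" alist)                       ; compares with `equal'
        (alist-get "TITLE" alist)                   ; `eq' by default
        (alist-get "TITLE" alist nil nil #'equal))) ; explicit TESTFN
#+end_src

When evaluated interpreted, as Org Babel does, the =assq= and plain
=alist-get= lookups should come back nil while the other two find the
entry.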

I can use =assoc/cdr= well enough.  The bother starts when I need
a default.  It looks like =alist-get= is what I need.

#+begin_src emacs-lisp :results verbatim :session exc
(alist-get "TYPE" exc-alist 'post nil 'string-equal)
#+end_src

#+RESULTS:
: post

This works, but now the code is getting messy. There are two forms of
lookup: the verbose =alist-get= and the brute force =assoc/cdr=.  One
requires ='string-equal=, the other does not.  If I forget the
predicate, the lookup will fail silently.
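
For comparison, a default can also be bolted onto the =assoc/cdr=
form with =or=.  This is only a sketch on a stand-in alist (the
="DRAFT"= entry is hypothetical), and it shows one way the two forms
are not quite interchangeable: =or= cannot tell a missing key from a
key whose value is nil.

#+begin_src emacs-lisp :results verbatim
(let ((alist '(("TITLE" . "Test post") ("DRAFT"))))  ; ("DRAFT") has a nil value
  (list (or (cdr (assoc "TYPE" alist)) 'post)         ; key missing
        (or (cdr (assoc "DRAFT" alist)) 'post)        ; key present, value nil
        (alist-get "DRAFT" alist 'post nil #'equal))) ; key present, value nil
#+end_src

#+RESULTS:
: (post post nil)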

I could create a wrapper for =alist-get= which uses =string-equal=:

#+begin_src emacs-lisp :results none :session exc
(defun exc-alist-get (key alist &optional default remove)
  "Get value associated with KEY in ALIST using `string-equal'.

See `alist-get' for explanation of DEFAULT and REMOVE."
  (alist-get key alist default remove 'string-equal))
#+end_src

Now my calls are uniform and a bit safer:

#+begin_src emacs-lisp :results verbatim :session exc
(exc-alist-get "TITLE" exc-alist)
#+end_src

#+RESULTS:
: "Test post"

#+begin_src emacs-lisp :results verbatim :session exc
(exc-alist-get "TYPE" exc-alist 'post)
#+end_src

#+RESULTS:
: post
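
One property of =string-equal= that may matter here: it also accepts
symbols and compares them by name.  A sketch on a stand-in alist,
assuming an Emacs recent enough for =alist-get= to take a TESTFN
argument (26.1 or later, as far as I know):

#+begin_src emacs-lisp :results verbatim
;; With `string-equal' as the test, a string key and a symbol of the
;; same name both find the string-keyed entry.
(let ((alist '(("TITLE" . "Test post"))))
  (list (string-equal "TITLE" 'TITLE)
        (alist-get "TITLE" alist nil nil #'string-equal)
        (alist-get 'TITLE alist nil nil #'string-equal)))
#+end_src

#+RESULTS:
: (t "Test post" "Test post")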

This works, but seems like a code smell.  All these problems go
back to strings as keys.  Maybe there's a better way?

I could convert the keys to symbols using =intern=.  

#+begin_src emacs-lisp :results verbatim :session exc
(defun exc-parse-org-meta-data-intern (data)
  "Parse Org formatted meta DATA into an alist.

Keywords are the '#+' options given within an Org file.  These
are things like TITLE, DATE, and FILETAGS.  Keywords are
case-sensitive!  Values are whatever remains on that line."
  (with-temp-buffer
    (insert data)
    (org-element-map (org-element-parse-buffer 'element) 'keyword
      (lambda (x) (cons (intern (org-element-property :key x))
                        (org-element-property :value x))))))

(setq exc-alist-i (exc-parse-org-meta-data-intern exc-post))
exc-alist-i
#+end_src

#+RESULTS:
: ((TITLE . "Test post") (AUTHOR . "Excalamus") (DATE . "2020-07-17") (TAGS . "blogging tests"))

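With symbol keys, the default =eq= comparison is enough, so lookups
need no predicate.  A sketch on a literal alist shaped like the
result above:

#+begin_src emacs-lisp :results verbatim
(let ((alist '((TITLE . "Test post") (AUTHOR . "Excalamus"))))
  (list (alist-get 'TITLE alist)              ; plain lookup
        (alist-get 'TYPE alist 'post)         ; missing key, default returned
        (alist-get (intern "AUTHOR") alist))) ; key built from a string
#+end_src

#+RESULTS:
: ("Test post" post "Excalamus")
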
This has several apparent problems.

As I understand it, this would pollute the global obarray. Is that a
real concern?  I know the symbol is only being used as a lookup; the
variable, function, and properties shouldn't change.  Regardless, I
don't want my package to conflict with (i.e. overwrite) a person's
environment unknowingly.
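
A small sketch of what =intern= does when the name is already in
use.  The variable =exc-demo-existing= is hypothetical and stands in
for someone else's symbol:

#+begin_src emacs-lisp :results verbatim
(defvar exc-demo-existing 42
  "Hypothetical variable standing in for a symbol already in use.")

(let* ((sym (intern "exc-demo-existing"))      ; returns the existing symbol
       (alist (list (cons sym "some value")))) ; symbol used only as a key
  (list (eq sym 'exc-demo-existing)            ; same symbol object
        (symbol-value sym)                     ; value cell is untouched
        (alist-get 'exc-demo-existing alist)))
#+end_src

#+RESULTS:
: (t 42 "some value")

As far as this sketch goes, using the symbol as a key only relies on
its identity.  Interning a name that did not exist before does add a
new symbol to the global obarray, though.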

The string may also contain characters that look illegal in a symbol
name, such as spaces.  Here's what happens when the keys contain
them:

#+begin_src emacs-lisp :results verbatim :session exc
(setq exc-bad-meta-data
  (concat
   "#+THE TITLE: Test post\n"
   "#+AUTHOR: Excalamus\n"
   "#+DATE: 2020-07-17\n"
   "#+POST TAGS: blogging tests\n"
   "\n"))

(setq exc-alist-i-bad (exc-parse-org-meta-data-intern exc-bad-meta-data))
exc-alist-i-bad
#+end_src

#+RESULTS:
: ((AUTHOR . "Excalamus") (DATE . "2020-07-17"))
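
For what it's worth, =intern= on its own seems to accept any string,
spaces included; the printed form just escapes them.  The missing
entries above may therefore come from the Org parser not treating
=#+THE TITLE:= as a keyword, rather than from =intern=.  That is an
assumption I have not checked against the Org source.  A sketch that
calls =intern= directly:

#+begin_src emacs-lisp :results verbatim
(let* ((sym (intern "THE TITLE"))              ; name contains a space
       (alist (list (cons sym "Test post"))))
  (list (symbol-name sym)       ; the name is stored unchanged
        (prin1-to-string sym)   ; printed with the space escaped
        (alist-get sym alist))) ; still usable as a key
#+end_src

#+RESULTS:
: ("THE TITLE" "THE\\ TITLE" "Test post")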

How are situations like these best handled?


