Converting backslash sequences to (non-ascii) characters
From: Eduardo Ochs
Subject: Converting backslash sequences to (non-ascii) characters
Date: Wed, 9 Jun 2021 00:02:37 -0300
Hi Erich,
I won't be able to approve posts with big attachments until I
recover my moderator password for the eev mailing list, so this
is a public answer to your e-mail that was intended to be public
but that I only received in private... you said that you were
trying to generate links to a PDF using this,
http://angg.twu.net/eev-intros/find-pdf-like-intro.html#10
(find-pdf-like-intro "10. Generating a pair with the page number")
but you were getting sexps like these
# (find-leo9page 1 "Th\351orie de la Science")
# (find-leo9text 1 "Th\351orie de la Science")
# (find-leo9page (+ 0 1) "Th\351orie de la Science")
# (find-leo9text (+ 0 1) "Th\351orie de la Science")
instead of:
# (find-leo9page 1 "Théorie de la Science")
# (find-leo9text 1 "Théorie de la Science")
# (find-leo9page (+ 0 1) "Théorie de la Science")
# (find-leo9text (+ 0 1) "Théorie de la Science")
I thought a bit about making `M-h M-p' convert some of its
backslash sequences to the corresponding characters and realized
that it wouldn't be easy to find a robust way to do that...
...so here is a workaround. If you mark a region and then run
`M-x bsl', the `bsl' will convert the right (???) backslash
sequences in that region to the corresponding characters. This
code is something that I can test and maintain easily, and I will
be able to reuse its structure in other things that I need to do.
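To make the basic idea concrete, here is a minimal sketch that handles
only the octal case and works on a string instead of on a region. The
name `my-octal-bsl-to-string' is hypothetical and is not part of eev;
it only illustrates the "parse the digits, then build the character"
step.

```elisp
;; Hypothetical helper, for illustration only (not part of eev).
;; It converts only \ooo octal sequences and ignores all other escapes.
(defun my-octal-bsl-to-string (str)
  "Return STR with each \\ooo octal sequence replaced by its character."
  (replace-regexp-in-string
   "\\\\[0-7][0-7][0-7]"          ; a backslash followed by three octal digits
   (lambda (m)
     ;; M is the matched text, e.g. "\\351"; drop the backslash,
     ;; read the digits in base 8, and build a one-character string.
     (string (string-to-number (substring m 1) 8)))
   str 'fixedcase 'literal))

(my-octal-bsl-to-string "Th\\351orie de la Science")
;; → "Théorie de la Science"
```

Note that this naive version would also convert the "\351" in an
escaped backslash followed by "351". As far as I can tell, that is the
kind of case that makes a robust solution harder, and it is why a
fuller regexp also needs to match sequences like "\\" and leave them
alone.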
Hope it helps! =)
Cheers,
Eduardo
--snip--snip-- Here is the code: --snip--snip--
;; (find-elnode "Basic Char Syntax" "\\a")
;; (find-elnode "General Escape Syntax" "\\uXXXX")
;; (find-elnode "General Escape Syntax" "\\x41")
;; (find-elnode "General Escape Syntax" "octal")
;; (find-elnode "Rx Notation")
;; (find-elnode "Rx Constructs" "Capture groups")
;; (find-elnode "Search and Replace")
(setq ee-elisp-bsl-re
      (rx "\\" (or (group-n 1 (any "abtnvfres\\d"))
                   (group-n 2 (regexp "[0-7][0-7][0-7]"))
                   (and (any "x") (group-n 3 hex hex))
                   (and (any "uU") (group-n 4 hex hex hex hex
                                            (opt hex hex) (opt hex hex)))
                   )))
(defun ee-elisp-bsl-convert1 (str)
  (save-match-data
    (string-match ee-elisp-bsl-re str)
    (let ((char  (match-string 1 str))
          (octal (match-string 2 str))
          (hexa  (match-string 3 str))
          (hexb  (match-string 4 str)))
      (cond (char  (match-string 0 str))
            (octal (format "%c" (string-to-number octal 8)))
            (hexa  (format "%c" (string-to-number hexa 16)))
            (hexb  (format "%c" (string-to-number hexb 16)))
            ))))
(defun ee-elisp-bsl-replace (s e)
  "Replace some backslash sequences in the region.
The sequences \\ooo, \\xhh and \\uhhhh, where the `o's are octal
digits and the `h's are hex digits, are replaced by their
corresponding characters."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region s e)
      (goto-char (point-min))
      (while (re-search-forward ee-elisp-bsl-re nil 'noerror)
        (replace-match (ee-elisp-bsl-convert1 (match-string 0))
                       'fixedcase 'literal)
        ))))
(defalias 'bsl 'ee-elisp-bsl-replace)
;; Tests:
;; (ee-elisp-bsl-convert1 "\\n")
;; (ee-elisp-bsl-convert1 "\\234")
;; (ee-elisp-bsl-convert1 "\\351")
;; (ee-elisp-bsl-convert1 "_\\351_")
;; (ee-elisp-bsl-convert1 "\\x41")
;; (ee-elisp-bsl-convert1 "\\u2022")
;; (find-einsert '((0 255)))
;;
;; ...and mark the next line and run `M-x bsl' on it:
;; "Foo \n\\\234\u2022\xa2"