help-gnu-emacs

Re: Coding system to encode arguments to groff?


From: Tim Landscheidt
Subject: Re: Coding system to encode arguments to groff?
Date: Sun, 03 Oct 2021 13:14:04 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> wrote:

>> I pass text arguments from Emacs Lisp to a groff command
>> with the "-d" option.  For ASCII strings, this is trivial;
>> for strings with umlauts, I need to use:

>> | (encode-coding-string variable-to-pass 'iso-latin-1)

> What is your default locale's codeset on that system?  In general, if
> the default locale matches the encoding you need to use, the above
> should happen automagically.

If I understand your question correctly, UTF-8:

| [tim@vagabond ~]$ locale
| LANG=de_DE.UTF-8
| LC_CTYPE="de_DE.UTF-8"
| LC_NUMERIC="de_DE.UTF-8"
| LC_TIME="de_DE.UTF-8"
| LC_COLLATE="de_DE.UTF-8"
| LC_MONETARY="de_DE.UTF-8"
| LC_MESSAGES="de_DE.UTF-8"
| LC_PAPER="de_DE.UTF-8"
| LC_NAME="de_DE.UTF-8"
| LC_ADDRESS="de_DE.UTF-8"
| LC_TELEPHONE="de_DE.UTF-8"
| LC_MEASUREMENT="de_DE.UTF-8"
| LC_IDENTIFICATION="de_DE.UTF-8"
| LC_ALL=
| [tim@vagabond ~]$

>> For strings with other Unicode characters like "–" (#x2013),
>> I need to call groff's preconv like:

>> | (shell-command-to-string (concat "preconv -r <(echo "
>> |   (shell-quote-argument variable-to-pass) ")"))

>> which for "ä–ö" returns something like:

>> | \[u00E4]\[u2013]\[u00F6]

> This is just the original "ä–ö" string, so I'm not quite sure what did
> the above accomplish.

The output is literal, i.e.:

| 0000000   \   [   u   0   0   E   4   ]   \   [   u   2   0   1   3   ]
| 0000020   \   [   u   0   0   F   6   ]  \n

>> Now in Emacs, this looks very much like what a coding system
>> would do.  The info documentation for elisp just laconically
>> says:

>> |    How to define a coding system is an arcane matter, and is not
>> | documented here.

>> Has someone implemented such a coding system for groff so
>> that something like:

>> | (encode-coding-string variable-to-pass 'x-groff)

> I don't think you should need a new coding-system.  But you didn't
> explain why you need to explicitly encode the command-line arguments,
> so it's hard to give an accurate advice.  What kind of Groff command
> needs this jumping through hoops from you?  E.g., why isn't it enough
> to bind coding-system-for-write to whatever you need, around the call
> to call-process or whatever?

> IOW, please describe in more detail the Groff-related context in which
> this problem happens, so that we could have an intelligent discussion
> of the issues you might have.

On Fedora 34 with GNU groff 1.22.4:

| (let
|     ((temp-ps-buffer (generate-new-buffer "*test ps*"))
|      (test-arg "a-o"))
|   (with-temp-buffer
|     (insert ".fam H\n\\*[test-arg]\n")
|     (call-process-region
|      (point-min)
|      (point-max)
|      "groff"
|      nil
|      temp-ps-buffer
|      nil
|      "-Tps"
|      "-d" (concat "test-arg=" test-arg)))
|   (switch-to-buffer temp-ps-buffer)
|   (ps-mode)
|   (doc-view-mode))

produces a PostScript buffer with the text "a-o".

With test-arg = "ä-ö" (ä minus ö), it produces gibberish minus
gibberish.

With test-arg = (encode-coding-string "ä-ö" 'iso-latin-1) (ä
minus ö), it produces the text "ä-ö".

With test-arg = (encode-coding-string "ä–ö" 'iso-latin-1) (ä
endash ö), it produces the text "ä[white space]ö".

With test-arg = (shell-command-to-string (concat "preconv -r
<(echo " (shell-quote-argument "ä–ö") ")")) (ä endash ö), it
produces the intended text "ä–ö".

(Passing "-k" as an additional option to groff does not
change the output as "-k" only converts standard input, not
macro definitions set as command line arguments.)
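
For illustration only, the conversion that preconv performs on the
argument could probably be approximated in Emacs Lisp without spawning
a shell. This is a minimal sketch, not an existing coding system: the
function name my-groff-escape-string is hypothetical, and it assumes
that rewriting every non-ASCII character as a groff Unicode escape
(the form shown in the preconv output above) is sufficient. It does
not escape backslashes or handle anything else groff treats specially.

```elisp
;; Hypothetical helper, not part of Emacs or groff.
;; Assumption: each non-ASCII character may be written as a groff
;; Unicode escape with at least four uppercase hex digits.
(defun my-groff-escape-string (string)
  "Return STRING with each non-ASCII character as a groff Unicode escape."
  (replace-regexp-in-string
   "[^[:ascii:]]"
   (lambda (match)
     ;; `string-to-char' yields the code point of the matched character.
     (format "\\[u%04X]" (string-to-char match)))
   string
   t    ; FIXEDCASE: do not alter the case of the replacement
   t))  ; LITERAL: insert the replacement verbatim, backslash included
```

With this, (my-groff-escape-string "ä–ö") returns the same text that
preconv printed above, \[u00E4]\[u2013]\[u00F6], and the result could
be passed as test-arg in the call-process-region example.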

Tim
