help-bash

Re: string escaping in bash


From: Eli Schwartz
Subject: Re: string escaping in bash
Date: Fri, 12 Mar 2021 14:37:06 -0500

On 3/12/21 10:05 AM, Peng Yu wrote:
> I am wondering if there is a simple but robust way to implement string escaping.

Yes. printf (with %b), $'...' quoting, and various other tools that use
the standard mini-language for string escaping.
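
As a rough illustration of the standard tools (the variable names here are
made up for the example), both of the following expand the full set of
standard escapes, not only the three asked about:

```bash
#!/usr/bin/env bash
# Input: a, backslash-t, b, then two backslashes followed by n.
s='a\tb\\n'

# printf's %b format expands backslash escapes found in its argument.
# -v stores the result in a variable instead of printing it.
# Note that %b also expands other escapes (\a, \0NNN, \c, ...).
printf -v expanded '%b' "$s"

# $'...' quoting applies similar escape processing when the shell parses
# the word, so it only helps for strings written literally in the script.
literal=$'a\tb\\n'

declare -p expanded literal
```

Both variables should end up holding a, a tab, b, a backslash and the
letter n.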

> Specifically, the string "\n" (a backslash and the letter "n") should be
> replaced with a newline character, the string "\t" (a backslash and the
> letter "t") should be replaced with a tab character, and "\\" (two
> consecutive backslashes) should be replaced with a single backslash. All
> other characters and their preceding backslash (if there is one) should
> remain as is.

No, because you want only a subset of the standardized string escaping
mini-language: you wish to support ONLY some of its replacement tokens
and refuse to replace the others.

Your specification is ambiguous -- for robustness, any other character
preceded by a literal backslash should be written with the backslash
doubled, e.g.

\\a

to guarantee they cannot be mistreated as an escape.
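
A small sketch of why the doubled form is the safe one, assuming the string
is later fed to a standard expander such as printf %b (the variable names
are just for illustration):

```bash
#!/usr/bin/env bash
# A single backslash before "a" is a standard escape: %b turns it into
# the BEL (alert) character.
printf -v single '%b' '\a'

# With the backslash doubled, %b yields a literal backslash followed by
# the letter "a", which is what "remain as is" presumably means.
printf -v doubled '%b' '\\a'

declare -p single doubled
```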

Anyway, since you wish to invent your own string escaping mini-language
(albeit a subset of an existing language), you must code your own
implementation and institute your own correctness guarantees.

Good luck. :)

> If I use a multi-string-replacement strategy, it will not be robust.
> For example, if I do it in the order 1) \\ -> \, 2) \n -> newline, \\n
> will not be replaced correctly. The result should be "\n" (a backslash
> followed by the letter "n").
> 
> $ x='\\n'; x=${x//\\\\/\\}; x=${x//\\n/$'\n'}; declare -p x
> declare -- x="
> "
> 
> Does anybody have a robust way to implement this in bash?

Instead of doing successive, full passes over the string as you replace
each token, your program (e.g. /usr/bin/pengus-whitespace-only-printf)
might want to read in the string character by character, detect whichever
replacement token might be there, replace it, and move on. This would
retain progress state, and therefore avoid processing the same tokens
twice.
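
A minimal sketch of that single-pass approach in pure bash follows. The
function name unescape_subset and its calling convention are made up for
this example, and it assumes the three-token rule set from the question
(\n, \t and \\ are replaced; any other backslash is kept as is):

```bash
#!/usr/bin/env bash
# unescape_subset VAR STRING   (hypothetical helper, not a bash builtin)
# Expands only \n, \t and \\ in STRING and stores the result in VAR.
# Every other character, including a backslash before any other
# character or at the end of the string, is copied through unchanged.
unescape_subset() {
    local _in=$2 _out='' _c _next
    local -i _i=0 _len=${#2}
    while (( _i < _len )); do
        _c=${_in:_i:1}
        if [[ $_c == '\' ]] && (( _i + 1 < _len )); then
            _next=${_in:_i+1:1}
            case $_next in
                # Consume both characters of a recognized token, so the
                # second one is never examined again.
                n)   _out+=$'\n'; (( _i += 2 )); continue ;;
                t)   _out+=$'\t'; (( _i += 2 )); continue ;;
                '\') _out+='\';   (( _i += 2 )); continue ;;
            esac
        fi
        # Not a recognized token: copy one character and advance.
        _out+=$_c
        (( _i += 1 ))
    done
    # Assign via printf -v so trailing newlines survive (command
    # substitution would strip them). VAR must not be one of the
    # underscore-prefixed local names used above.
    printf -v "$1" '%s' "$_out"
}

unescape_subset x '\\n'
declare -p x
```

With the input from the question, x should hold a backslash followed by
the letter n rather than a newline, because the doubled backslash is
consumed as one token before the n is ever looked at.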

-- 
Eli Schwartz
Arch Linux Bug Wrangler and Trusted User
