Re: sed with Variable Substitution in the command
From: Alan D. Salewski
Subject: Re: sed with Variable Substitution in the command
Date: Sat, 21 Sep 2024 13:54:41 -0400
User-agent: Mutt/2.0.5 (2021-01-21)
On 2024-09-19 10:03:01, Steve Matzura <sm@noisynotes.com> spake thus:
> Can't get any of these to work, despite reading and looking at several
> examples. Variable substitution has always been a problem for me because
> there doesn't seem to be a single hard-and-fast rule--sometimes the
> variable is quoted, sometimes it's wrapped in braces, sometimes the "$"
> is interpreted by the shell as a string positional directive ("$", "%").
> All of the following fail:
> sed -i 's/replace-this/$with_this_variable/'
> sed -i 's/replace-this/${with_this_variable}/'
> sed -i 's/replace-this/"${with_this_variable}/'
> What am I missing?
Hi Steve,
Those all fail for variations of the same reason: They all attempt
to use string interpolation with single-quoted strings. The last one
is nearly what you wanted[0]:
sed -i 's/replace-this/'"${with_this_variable}/"
Explanation:
------------
String interpolation is performed by the shell on double-quoted
strings, but not on single-quoted strings. That is to say, shell
variables get expanded within double-quoted strings, but not within
single-quoted strings.
When bash encounters a single-quoted string, the contents are taken
verbatim. This is very useful in lots of situations, and the only
significant restriction is that such a string cannot itself contain
the single-quote character (').
When bash encounters a double-quoted string, its contents are
subject to possible further expansions, one of which is "parameter
and variable expansion".
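To see the difference for yourself, you can feed the same text to sed under each quoting style. This is just a sketch: the variable name comes from your question, and the value 'NEW' is made up for illustration.

```shell
with_this_variable='NEW'   # made-up value, for illustration only

# Single quotes: sed receives the literal text ${with_this_variable}
single=$(printf '%s\n' 'replace-this' | sed 's/replace-this/${with_this_variable}/')

# Double quotes: the shell expands the variable before sed ever runs
double=$(printf '%s\n' 'replace-this' | sed "s/replace-this/${with_this_variable}/")

printf '%s\n' "$single" "$double"
```

The first line printed should be the unexpanded text ${with_this_variable}, and the second should be NEW.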
For more on this, see in bash(1) the sections "QUOTING" and
"EXPANSION".
Because the shell treats immediately adjacent strings as string
concatenation[1], it is often useful to use single-quoted and
double-quoted strings together when working with tools (such as
'sed') that take regex arguments:
sed -e 's/complicated-regex/'"${known_clean_variable}"'/g'
(In order to visually emphasize that the trailing single-quoted
string is a continuation of the sed 's' command, I've tweaked your
example slightly to add the 'g' flag.)
In this example, <complicated-regex> can contain anything at all
(except a single-quote char), and no special escaping is needed.
The variable $known_clean_variable must be known to contain a value
that sed will not choke on. So as was mentioned elsewhere on this
thread, this is only safe to do when you know the contents of the
variable will not create a syntax problem. But in many common
situations, that is the case. And in many of the situations where it
is not, you can often simply change the '/' delimiter used for the sed
's' command to something else, such as '!' or '#' or ',' -- any
character that is known not to occur in the variable's value:
sed -e 's!complicated-regex!'"${known_clean_variable}"'!g'
or:
sed -e 's#complicated-regex#'"${known_clean_variable}"'#g'
or:
sed -e 's,complicated-regex,'"${known_clean_variable}"',g'
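As a rough illustration of why the delimiter matters, here is a small sketch in which the variable holds a path. The value '/usr/local/bin' and the input text 'PATH=OLD' are made up for the example; they are not from the original question.

```shell
known_clean_variable='/usr/local/bin'   # hypothetical value containing '/'

# With '/' as the delimiter, the slashes in the value end the replacement
# early and sed rejects the command.
if ! printf '%s\n' 'PATH=OLD' |
        sed -e 's/OLD/'"${known_clean_variable}"'/g' >/dev/null 2>&1; then
    slash_failed=yes
fi

# '#' does not occur in the value, so it is a safe delimiter here.
result=$(printf '%s\n' 'PATH=OLD' | sed -e 's#OLD#'"${known_clean_variable}"'#g')

printf '%s\n' "$result"
```

The second command should print PATH=/usr/local/bin. Of course, this only moves the problem: a value that contained '#' would need yet another delimiter.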
Now, for braces around variable names...
Note that I have used braces around the variable name above. It is
/never/ incorrect to do so. The braces can be omitted in a lot of
situations when doing so does not introduce ambiguity or otherwise
change the intended meaning; the examples above are such a
situation, and could have been written like this:
sed -e 's/complicated-regex/'"$known_clean_variable"'/g'
But (for example) if we wanted within our substitution to add an
'XX' suffix to whatever value was in the variable, then not having
the braces /would/ change the meaning:
sed -e 's/complicated-regex/'"$known_clean_variableXX"'/g'
The above line is attempting to use a variable named
'known_clean_variableXX', which is not what was
intended. Re-introducing the braces avoids that problem:
sed -e 's/complicated-regex/'"${known_clean_variable}XX"'/g'
For completeness, I'll also note that this specific situation could
be worked around by putting the 'XX' suffix in the single-quoted
string (but that does not express the intent quite as cleanly):
sed -e 's/complicated-regex/'"$known_clean_variable"'XX/g'
When writing programs, (almost) always using the braces can help
keep things consistent and avoid bugs.
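If you want to watch the braces make a difference, a sketch along these lines should do it. The value 'value' is made up, and the example assumes that no variable named known_clean_variableXX happens to be set in your shell.

```shell
known_clean_variable='value'   # made-up value, for illustration only

# Without braces the shell looks up a different (unset) variable named
# known_clean_variableXX, which expands to the empty string.
without_braces=$(printf '%s\n' 'complicated-regex' |
    sed -e 's/complicated-regex/'"$known_clean_variableXX"'/g')

# With braces the variable name ends at the closing brace, and XX is a suffix.
with_braces=$(printf '%s\n' 'complicated-regex' |
    sed -e 's/complicated-regex/'"${known_clean_variable}XX"'/g')

printf '[%s] [%s]\n' "$without_braces" "$with_braces"
```

This should print [] [valueXX]. Note that the first form fails silently: the text is replaced with nothing. Running under 'set -u' would turn that silent empty expansion into an error instead.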
When typing at the command line, whatever is fast and efficient is
fine.
Sometimes using the braces would "look odd" to experienced shell
folks, and they are omitted by convention:
echo $1 $2 $3 # fine
echo "$1" "$2" "$3" # same here, and safer in general
echo ${1} ${2} ${3} # looks "odd"
echo "${1}" "${2}" "${3}" # ditto
If in doubt, include the braces. It's better to be an odd duck than
a dead duck (or even a lucky duck)[2].
Some folks are surprised to find out that numeric parameters are one
of the cases in which braces are sometimes required. Specifically,
any time you want to use a positional parameter that needs to be
referenced by two or more digits. Here we demonstrate by setting the
first one hundred positional parameters to known strings, and then
showing how lack of braces could lead to referencing the wrong
parameter:
$ set -- one two three four five six seven eight nine ten num-{11..99} ONE_HUNDRED
$ echo $1 # fine
one
$ echo $10 # oops!
one0
$ echo $100 # oops!
one00
$ echo ${10} # correct
ten
$ echo ${100} # correct
ONE_HUNDRED
$ # these are all correct (sanity checks)
$ echo ${11}
num-11
$ echo ${42}
num-42
$ echo ${99}
num-99
HTH,
-Al
[0] You noted elsewhere on this thread that you solved your
immediate problem by putting the entire argument to the sed
command in double-quotes. That would look like either this:
sed -i "s/replace-this/$with_this_variable/"
or this:
sed -i "s/replace-this/${with_this_variable}/"
You won't hear any quibbles from me on that when it works in a
specific situation -- anything that works that allows you to get
your job done is good enough. I will point out, however, that
that approach requires more work to make it correct in the
general case, as it requires escaping characters in the regular
expression to "hide them" from the shell's string
interpolation.
Even when it's not too difficult to get right, it is still
unnecessary cognitive load that can be avoided (both when
writing, and for future readers) by placing the regex within
single quotes.
A useful rule of thumb is to always use single-quoted string
literals, except where you explicitly want string interpolation
to happen.
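To make the extra escaping work concrete, here is a sketch that replaces literal backslashes. Both the input string and the replacement value '-' are made up for illustration.

```shell
with_this_variable='-'        # made-up value, for illustration only
input='C:\tmp\file'           # made-up input containing literal backslashes

# Regex in single quotes: what you type is what sed sees
# (the regex \\ matches one literal backslash).
mixed=$(printf '%s\n' "$input" | sed -e 's/\\/'"${with_this_variable}"'/g')

# Everything in double quotes: the shell consumes one level of backslashes
# first, so the same regex has to be written with four of them.
doubled=$(printf '%s\n' "$input" | sed -e "s/\\\\/${with_this_variable}/g")

printf '%s\n' "$mixed" "$doubled"
```

Both commands should print C:-tmp-file. Writing only two backslashes in the double-quoted form would hand sed the text s/\/-/g, which it rejects as an unterminated 's' command.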
[1] E.g., this shell syntax:
$ printf '%s\n' foo"BAR"'baz'QUUX
prints:
fooBARbazQUUX
rather than:
foo
BAR
baz
QUUX
[2] Dead duck: Your program crashes and/or errors mysteriously.
Lucky duck: Your program happens to work correctly, but you
don't know why. Or your program keeps going after having done
something otherwise incorrectly or unintentionally, but no
serious harm was done.
For folks who are not native English speakers:
https://englishbyday.com/duck-idioms/
--
a l a n d. s a l e w s k i
ads@salewski.email
salewski@att.net
https://github.com/salewski