help-bash

Re: sed with Variable Substitution in the command


From: Alan D. Salewski
Subject: Re: sed with Variable Substitution in the command
Date: Sat, 21 Sep 2024 13:54:41 -0400
User-agent: Mutt/2.0.5 (2021-01-21)

On 2024-09-19 10:03:01, Steve Matzura <sm@noisynotes.com> spake thus:
> Can't get any of these to work, despite reading and looking at several examples. Variable substitution has always been a problem for me because there doesn't seem to be a single hard-and-fast rule--sometimes the variable is quoted, sometimes it's wrapped in braces, sometimes the "$" is interpreted by the shell as a string positional directive ("$", "%").
>
> All of the following fail:
>
> sed -i 's/replace-this/$with_this_variable/'
>
> sed -i 's/replace-this/${with_this_variable}/'
>
> sed -i 's/replace-this/"${with_this_variable}/'
>
> What am I missing?

Hi Steve,

Those all fail for variations of the same reason: they all attempt to use string interpolation within single-quoted strings. The last one is nearly what you wanted[0]:

    sed -i 's/replace-this/'"${with_this_variable}/"

Explanation:
------------
String interpolation is performed by the shell on double-quoted strings, but not on single-quoted strings. That is to say, shell variables get expanded within double-quoted strings, but not within single-quoted strings.

When bash encounters a single-quoted string, the contents are taken verbatim. This is very useful in lots of situations, and the only significant restriction is that such a string cannot itself contain the single-quote character (').

When bash encounters a double-quoted string, its contents are subject to possible further expansions, one of which is "parameter and variable expansion".

For more on this, see in bash(1) the sections "QUOTING" and "EXPANSION".
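
As a minimal sketch of that difference (the sample input line and the value 'NEW' are made up for illustration, and the text is piped into sed rather than edited in place with -i so that no file is needed):

```shell
with_this_variable='NEW'

# Single quotes: the shell passes $with_this_variable to sed verbatim,
# so sed inserts the literal text "$with_this_variable".
single=$(printf '%s\n' 'replace-this here' |
    sed 's/replace-this/$with_this_variable/')

# Single-quoted pieces around a double-quoted piece: the shell expands
# the variable before sed ever sees the command.
mixed=$(printf '%s\n' 'replace-this here' |
    sed 's/replace-this/'"${with_this_variable}"'/')

printf '%s\n' "$single" "$mixed"
```

The first line printed should be "$with_this_variable here" and the second "NEW here".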

Because the shell treats immediately adjacent strings as string concatenation[1], it is often useful to use single-quoted and double-quoted strings together when working with tools (such as 'sed') that take regex arguments:

    sed -e 's/complicated-regex/'"${known_clean_variable}"'/g'

(In order to visually emphasize that the trailing single-quoted string is a continuation of the sed 's' command, I've tweaked your example slightly to add the 'g' flag.)

In this example, <complicated-regex> can contain anything at all (except a single-quote char), and no special escaping is needed.

The variable $known_clean_variable must be known to contain a value that sed will not choke on. So as was mentioned elsewhere on this thread, this is only safe to do when you know the contents of the variable will not create a syntax problem. But in many common situations, that is the case. And for many in which it is not, you can often simply change the '/' delimiter used for the sed 's' command to something else, such as '!' or '#' or ',' -- anything that is known not to occur in either the regex or the variable:

    sed -e 's!complicated-regex!'"${known_clean_variable}"'!g'
or:
    sed -e 's#complicated-regex#'"${known_clean_variable}"'#g'
or:
    sed -e 's,complicated-regex,'"${known_clean_variable}"',g'
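
To make the delimiter point concrete, here is a small sketch under stated assumptions: the directory value and the 'PREFIX/tool' input are invented for the demonstration. A value containing '/' breaks the '/'-delimited form, while the '#'-delimited form handles it:

```shell
known_clean_variable='/usr/local/bin'   # hypothetical value containing '/'

# With '/' as the delimiter, sed sees s/PREFIX//usr/local/bin/g and
# rejects it, because everything after the third '/' is parsed as flags.
if printf '%s\n' 'PREFIX/tool' |
    sed -e 's/PREFIX/'"${known_clean_variable}"'/g' >/dev/null 2>&1
then
    slash_ok=yes
else
    slash_ok=no
fi

# With '#' as the delimiter, the slashes in the value are ordinary text.
result=$(printf '%s\n' 'PREFIX/tool' |
    sed -e 's#PREFIX#'"${known_clean_variable}"'#g')

printf '%s\n' "$slash_ok" "$result"
```

Changing the delimiter only addresses the delimiter itself. In the replacement part, sed still treats '&' (the matched text) and '\' specially, so a value containing those needs separate care.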

Now, for braces around variable names...

Note that I have used braces around the variable name above. It is /never/ incorrect to do so. The braces can be omitted in a lot of situations when doing so does not introduce ambiguity or otherwise change the intended meaning; the examples above are such a situation, and could have been written like this:

    sed -e 's/complicated-regex/'"$known_clean_variable"'/g'

But (for example) if we wanted within our substitution to add an 'XX' suffix to whatever value was in the variable, then not having the braces /would/ change the meaning:

    sed -e 's/complicated-regex/'"$known_clean_variableXX"'/g'

The above line is attempting to use a variable named 'known_clean_variableXX', which is not what was intended. Re-introducing the braces avoids that problem:

    sed -e 's/complicated-regex/'"${known_clean_variable}XX"'/g'

For completeness, I'll also note that this specific situation could be worked around by putting the 'XX' suffix in the single-quoted string (but that does not express the intent quite as cleanly):

    sed -e 's/complicated-regex/'"$known_clean_variable"'XX/g'
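
The three variants above can be compared side by side. This is only a sketch: the value 'VALUE' and the one-line input are assumptions, and 'set +u' is there so that the unset variable expands to an empty string instead of aborting the script:

```shell
set +u                           # assumption: unset variables expand to ""
unset known_clean_variableXX     # make sure the "wrong" variable is unset
known_clean_variable='VALUE'

# No braces: the shell looks up known_clean_variableXX, which is empty.
without_braces=$(printf '%s\n' 'complicated-regex' |
    sed -e 's/complicated-regex/'"$known_clean_variableXX"'/g')

# Braces end the variable name before the XX suffix.
with_braces=$(printf '%s\n' 'complicated-regex' |
    sed -e 's/complicated-regex/'"${known_clean_variable}XX"'/g')

# Suffix moved into the single-quoted string instead.
suffix_outside=$(printf '%s\n' 'complicated-regex' |
    sed -e 's/complicated-regex/'"$known_clean_variable"'XX/g')

printf '[%s]\n' "$without_braces" "$with_braces" "$suffix_outside"
```

The first result should be empty, and the other two should both be "VALUEXX".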

When writing programs, (almost) always using the braces can help keep things consistent and avoid bugs.

When typing at the command line, whatever is fast and efficient is fine.

Sometimes the braces would "look odd" to experienced shell folks, and so are omitted by convention:

    echo $1 $2 $3              # looks "normal"

    echo "$1" "$2" "$3"        # same here, and safer in general

    echo ${1} ${2} ${3}        # looks "odd"

    echo "${1}" "${2}" "${3}"  # ditto

If in doubt, include the braces. It's better to be an odd duck than a dead duck (or even a lucky duck)[2].

Some folks are surprised to find out that numeric parameters are one of the cases in which braces are sometimes required. Specifically, braces are needed any time you want to use a positional parameter whose number has two or more digits. Here we demonstrate by setting the first one hundred positional parameters to known strings, and then showing how lack of braces could lead to referencing the wrong parameter:

    $ set -- one two three four five six seven eight nine ten num-{11..99} ONE_HUNDRED

    $ echo $1      # fine
    one

    $ echo $10     # oops!
    one0

    $ echo $100    # oops!
    one00

    $ echo ${10}   # correct
    ten

    $ echo ${100}  # correct
    ONE_HUNDRED

    $ # these are all correct (sanity checks)
    $ echo ${11}
    num-11
    $ echo ${42}
    num-42
    $ echo ${99}
    num-99

HTH,
-Al

[0] You noted elsewhere on this thread that you solved your immediate problem by putting the entire argument to the sed command in double-quotes. That would look like either this:

        sed -i "s/replace-this/$with_this_variable/"

    or this:

        sed -i "s/replace-this/${with_this_variable}/"

You won't hear any quibbles from me on that when it works in a specific situation -- anything that works that allows you to get your job done is good enough. I will point out, however, that that approach requires more work to make it correct in the general case, as it requires escaping characters in the regular expression to "hide them" from the shell's string interpolation.

Even when it's not too difficult to get right, it is still unnecessary cognitive load that can be avoided (both when writing, and for future readers) by placing the regex within single quotes.

A useful rule of thumb is to always use single-quoted string literals, except where you explicitly want string interpolation to happen.
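
As one illustration of that extra escaping work (a sketch only; the input 'a\b' is made up), consider a regex that matches a single literal backslash. Inside single quotes it can be written exactly as sed expects it. Inside double quotes the shell processes the backslashes first, so they have to be doubled again:

```shell
input='a\b'

# Single quotes: sed receives s/\\/X/ exactly as typed.
single=$(printf '%s\n' "$input" | sed -e 's/\\/X/')

# Same text in double quotes: the shell collapses \\ to \, so sed
# receives s/\/X/ and rejects it as an unterminated 's' command.
if printf '%s\n' "$input" | sed -e "s/\\/X/" >/dev/null 2>&1
then
    naive_double=ok
else
    naive_double=failed
fi

# Double quotes with the extra escaping: \\\\ reaches sed as \\.
double=$(printf '%s\n' "$input" | sed -e "s/\\\\/X/")

printf '%s\n' "$single" "$naive_double" "$double"
```

Both working forms should print "aXb"; the single-quoted one is the easier of the two to read and to get right.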


[1] E.g., this shell syntax:

        $ printf '%s\n' foo"BAR"'baz'QUUX

    prints:

        fooBARbazQUUX

    rather than:

        foo
        BAR
        baz
        QUUX


[2] Dead duck: Your program crashes and/or errors mysteriously.

Lucky duck: Your program happens to work correctly, but you don't know why. Or your program keeps going after having done something otherwise incorrectly or unintentionally, but no serious harm was done.

    For folks who are not native English speakers:
        https://englishbyday.com/duck-idioms/

--
a l a n   d.   s a l e w s k i
ads@salewski.email
salewski@att.net
https://github.com/salewski

