Re: Help with String Extraction


From: Andreas Kähäri
Subject: Re: Help with String Extraction
Date: Tue, 17 Sep 2024 07:51:55 +0200

On Mon, Sep 16, 2024 at 06:17:49PM -0700, Eduardo A. Bustamante López wrote:
> There are a few issues with your script.
> 
> On Mon, Sep 16, 2024 at 02:06:22PM -0400, Steve Matzura wrote:
> (...)
> > #!/bin/sh

#!/bin/bash ?

> > for f in $(ls *.xml)
> 
> Do not parse the output of ls [1]. Instead, just use:
> 
>   for f in *.xml

Or ./*.xml, unless you know that no name will ever start with a dash
and be mistaken for an option by e.g. grep.
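
A minimal sketch of the ./ prefix at work, assuming a scratch directory
and two made-up file names (neither comes from the original script):

```bash
#!/bin/bash
# Work in a scratch directory so the demonstration leaves nothing behind.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Hypothetical file names; the first one looks like an option.
touch -- '-v.xml' 'server.xml'

names=()
for f in ./*.xml; do
    # Every name starts with "./", so no command can take it for an option.
    names+=( "$f" )
done
printf '%s\n' "${names[@]}"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```

With a plain *.xml the first name would reach grep as -v.xml, which
grep would try to parse as options unless a -- is given before it.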

> > do
> >   echo $f # Display the filename
> 
> The use of (the right kind of) quotes is crucial in shell scripts. As a
> general rule, wrap parameter expansions [2] in double quotes:
> 
>     echo "$f"

Don't use echo for variable data. You don't know if it is interpreting
backslashes or not (unless you examine whether xpg_echo is set or not).

See e.g. https://unix.stackexchange.com/questions/65803
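
A small sketch of the difference, assuming a made-up value that happens
to contain a backslash sequence:

```bash
#!/bin/bash
# Hypothetical value; the single quotes keep the backslash literal.
f='dir\name.xml'

# printf with a fixed format string outputs the data verbatim.
printf '%s\n' "$f"
printf_out=$(printf '%s\n' "$f")

# With xpg_echo enabled, echo turns the "\n" into a newline.
shopt -s xpg_echo
echo_out=$(echo "$f")
shopt -u xpg_echo
```

Here printf_out is identical to $f, while echo_out has a newline where
the backslash and the n used to be.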

> 
> >   stringZ = $(grep -i svrinfodesc $f) # Find the string I want
> 
> This is not a correct variable assignment. Do not include spaces before or
> after the '=' character. Instead do something like this:
> 
>     stringZ=$(some command)

In this case, assuming grep -w works and the query string only occurs in
the locations that we care about,

        stringZ=$( grep -w -F svrinfodesc -- "$f" )

... which has the potential to return several lines of text. That has
consequences for the rest of the parsing, unless you know something
about the data that lets you trust that it doesn't.
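
One way to guard against that, as a sketch only: read the matches into
an array and check how many there are. The sample data below is made up,
and mapfile needs bash 4 or later.

```bash
#!/bin/bash
tmpfile=$(mktemp) || exit 1

# Hypothetical sample data with two matching lines.
printf '%s\n' '<svrinfodesc>first</svrinfodesc>' \
              '<other>ignored</other>' \
              '<svrinfodesc>second</svrinfodesc>' >"$tmpfile"

# Read every matching line into an array, one element per line.
mapfile -t matches < <( grep -w -F svrinfodesc -- "$tmpfile" )

# Warn when the data does not contain exactly one match.
if (( ${#matches[@]} != 1 )); then
    printf 'expected 1 match, got %d\n' "${#matches[@]}" >&2
fi

# Continue with the first match only.
stringZ=${matches[0]}

rm -f -- "$tmpfile"
```

Whether to take the first match, the last, or to give up entirely
depends on what you know about the data.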

> 
> >   echo $stringZ # Let's see the string
> > pos1 = `expr index "$string" \>` + 1 # Position of first character after ">"
> 
> Putting aside the incorrect assignment,
> here the  '+ 1' is outside of the command substitution (the stuff between
> backticks). I guess you intended to have an additional command substitution
> wrapping the  "expr index"  one. Keep in mind that nesting command
> substitutions using backticks is not pretty, e.g.
> 
>   $ name=John; x=`echo hello \`echo "$name"\``; echo "$x"
>   hello John

Or, since this is a bash list,

        printf -v x 'hello %s' "$name"

> 
> So I'd use the  $(...)  command substitution form instead, it's much easier to
> nest.
> 
>   $ name=John; x=$(echo hello $(echo "$name")); echo "$x"
>   hello John
> 
> > len = (pos2 - pos1) # How long will the substring be?
> 
> This is not how you do integer subtraction with Bash. You have a few options,
> e.g.
> 
>    # Start with these values
>    $ pos2=10 pos1=5
> 
>    # Option 1
>    $ echo "$((pos2 - pos1))"
>    5
> 
>    # Option 2
>    $ ((diff = pos2 - pos1))
>    $ echo "$diff"
>    5
> 
> 
> In addition, please consider enabling the xtrace option (set -x) at the
> beginning of your script, which will cause Bash to output a trace of commands
> as it executes them. This comes in handy when troubleshooting, as you can
> review if bash is executing commands in the way you expect.
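
For the position arithmetic itself, a bash-only sketch that needs no
expr at all. It uses parameter expansion in place of "expr index", and
the sample line is an assumption, since the real data was not shown.
Note that ${var:offset:length} counts offsets from zero, whereas
"expr index" counts from one.

```bash
#!/bin/bash
# Hypothetical sample line; the real data was not shown in the thread.
stringZ='<svrinfodesc>Main file server</svrinfodesc>'

# Everything before the first ">" and everything before the first "</".
before_gt=${stringZ%%>*}
before_close=${stringZ%%</*}

pos1=$(( ${#before_gt} + 1 ))   # zero-based offset of the character after ">"
pos2=${#before_close}           # zero-based offset of the closing tag's "<"
len=$(( pos2 - pos1 ))          # length of the text in between

desc=${stringZ:pos1:len}
printf '%s\n' "$desc"
```

The same text could be had without any arithmetic by stripping the
prefix with ${stringZ#*>} and then the suffix with ${...%%<*} on the
result, but the version above stays closer to the original script.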

I would still use xmlstarlet, xmllint or some other XML parser for this.
There are too many ways that the XML could be formatted that would make
the above approach fail (again, unless you know something about the data
that lets you trust that it doesn't).

> 
> 
> [1] <https://mywiki.wooledge.org/ParsingLs>
> 
> [2] A parameter expansion is what a construct like  $foo  is called. When you
> omit the double quotes around the expansion, the shell does additional
> processing on the expanded value, such as word splitting and expanding file
> name patterns.

-- 
Andreas (Kusalananda) Kähäri
Uppsala, Sweden
