help-bash
Re: [Help-bash] Sustitute to bash read or util-linux line in pure bash?


From: Greg Wooledge
Subject: Re: [Help-bash] Sustitute to bash read or util-linux line in pure bash?
Date: Tue, 13 Mar 2018 16:32:15 -0400
User-agent: NeoMutt/20170113 (1.7.2)

On Tue, Mar 13, 2018 at 07:20:02PM +0100, Garreau, Alexandre wrote:
> By doing research on how to do proper redirections while processing text
> in a shell loop with read I did end on these two stackexchange posts [0]
> [1] from Stéphane Chazelas. I think the main points here are about
> portability, and, maybe, since you confirm it, more of readability and
> complexity to use without error, than any other problem.
[...]

> [0] Why is using a shell loop to process text considered bad practice? -
> Unix & Linux Stack Exchange <https://unix.stackexchange.com/a/169765>

Let's just take this one.  You have overlooked the context here and
got yourself all confused.  Here's what it says:

====================================================================
Yes, we see a number of things like:

while read line; do
  echo $line | cut -c3
done

Or worse:

for line in `cat file`; do
  foo=`echo $line | awk '{print $2}'`
  echo whatever $foo
done

(don't laugh, I've seen many of those).
====================================================================

Then it goes on to give a different example:

====================================================================
cut -c4-5 < in | tr a b > out
====================================================================

And later:

====================================================================
while read line; do
  echo ${line:2:1}
done
====================================================================


So.  What is the page trying to tell you?

-->   Don't do expensive things inside a loop.   <--

That's all.  The examples it is criticizing are calling an external
command inside a loop.  The command gets called once for each iteration
(line of input text).  This takes a lot of time.  It is inefficient.

The corrected examples that it gives show two ALTERNATIVE ways to
achieve similar goals.  The first is to use a filter program that
reads the ENTIRE input and just call it one time.  This is more
efficient than calling the filter 10000 times.

The second is to use shell builtin features to perform the slicing
and dicing of the input lines, instead of an external tool.
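
As a rough sketch of the difference (the sample input and the variable
names are made up for illustration; this is not from the page being
discussed), here are the criticized pattern and the two alternatives side
by side. All three produce the same output; only the first starts one
external process per line.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; any small text would do.
input=$'abcdef\nghijkl\nmnopqr'

# Criticized pattern: one external process (cut) is started per line.
slow=$(while IFS= read -r line; do
  printf '%s\n' "$line" | cut -c3
done <<< "$input")

# Alternative 1: run the filter once over the entire input.
once=$(cut -c3 <<< "$input")

# Alternative 2: keep the loop, but slice with a builtin parameter
# expansion (offset 2, length 1 is the third character).
builtin_out=$(while IFS= read -r line; do
  printf '%s\n' "${line:2:1}"
done <<< "$input")

printf '%s\n' "$once"
```

With 3 lines you will not notice the difference.  With 10000 lines the
first version forks 10000 times and the other two do not.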

There is NOTHING wrong with using read, singly or in a loop.  The
criticism was about the BODY of the loop, in those specific examples.


You seem to be on some kind of quest.  I'm not sure what your ultimate
goal is, but you are at the start of this journey, so let's start at
the beginning.

A shell script is a primitive hack of a program that you write to
perform one task.  You write it to save yourself some labor, because
it's a task that you perform all the time, and you want to do less
typing.

It could be backing up your computer.  Or playing a random MP3 file
from a directory.  Or telling a web browser to open a new tab with a
specific Google search as the URL.  That kind of thing.

Another purpose of a shell script is to set up an execution environment
for a process.  These kinds of scripts are used by various system-level
service managers (/etc/init.d/ or daemontools run scripts), and those
will have their own idiosyncratic rules and styles.

For you, the beginner, the steps are as follows:

1) Learn the basic syntax of the shell programming language.

2) Learn what NOT to do, because 99% of the shell scripts in the world
   are horrible rubbish and should NOT be followed.  There are so many
   ways to fuck up a shell script that it's not even funny.

3) Learn some basic tricks and techniques that you CAN safely use in
   shell scripts.  These are your building blocks for scripts, and
   you will use them frequently.


=== Things to do ===

Learn to quote properly.
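
A minimal sketch of why quoting matters, assuming a made-up value with
embedded spaces (count_args is a hypothetical helper, not a real command):

```shell
#!/usr/bin/env bash
# Hypothetical value containing two consecutive spaces.
name='my  file.txt'

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell word-splits the value (and would also try to
# glob-expand it if it contained *, ? or [ characters).
unquoted=$(count_args $name)

# Quoted: exactly one argument, with the contents preserved.
quoted=$(count_args "$name")

printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
```

The unquoted expansion hands the command two words; the quoted one hands
it the single string you actually meant.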

Learn how and when to use the while read loop.  You use it when you are
processing text line-by-line, in small to moderate quantities, in ways
that make sense for a shell script to do.
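
A sketch of the usual form of that loop in bash, assuming a throwaway
input file created just for the example (the file contents and the
variable names are made up):

```shell
#!/usr/bin/env bash
# Hypothetical input file created just for this sketch.
tmpfile=$(mktemp)
printf '%s\n' '  leading spaces' 'back\slash' 'last line' > "$tmpfile"

count=0
lines=()
# IFS= keeps leading and trailing whitespace intact;
# -r stops read from treating backslashes as escapes.
while IFS= read -r line; do
  lines+=("$line")
  count=$((count + 1))
done < "$tmpfile"

rm -f "$tmpfile"
printf 'read %d lines\n' "$count"
```

Redirecting the file into the loop (rather than piping into it) keeps the
loop in the current shell, so count and lines are still set afterward.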

Learn how to use the shell's builtin string manipulations ("parameter
expansion").  Use these instead of the vast majority of the cut or
sed or awk calls that you see in crappy scripts.
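
A few illustrative parameter expansions, assuming bash and a made-up
path.  Each one replaces an external call that crappy scripts tend to
make (the comments name the rough equivalent):

```shell
#!/usr/bin/env bash
path='/home/user/music/song.name.mp3'   # hypothetical path

file=${path##*/}       # strip longest prefix up to last "/" (like basename)
dir=${path%/*}         # strip shortest suffix from last "/" (like dirname)
ext=${file##*.}        # everything after the last "."
stem=${file%.*}        # everything before the last "."
swapped=${file//./_}   # replace every "." with "_" (like sed 's/\./_/g')
first3=${file:0:3}     # first three characters (like cut -c1-3)

printf '%s\n' "$file" "$dir" "$ext" "$stem" "$swapped" "$first3"
```

The ## and % forms are POSIX; the // substitution and the :offset:length
substring are bash extensions.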

https://mywiki.wooledge.org/Quotes
https://mywiki.wooledge.org/BashGuide
https://mywiki.wooledge.org/BashFAQ/001
https://mywiki.wooledge.org/BashFAQ/100

Start with those.


=== Things not to do ===

Don't try to use a while read loop with 1 billion lines of input.  The
shell processes each line far too slowly to scale to those numbers.

Don't call awk or perl or sed or cut or any other external program
for every line of input.  That's inefficient.  That's what this web
page was about.  Either do what those programs would do using the
shell's own builtin tools, or write the ENTIRE LOOP using that other
tool.
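
To sketch those two choices for the awk case (the input data and variable
names here are invented for the example): either awk runs the whole loop
itself, or the shell's read does the field splitting and awk is not
needed at all.

```shell
#!/usr/bin/env bash
# Hypothetical whitespace-separated records.
input=$'alice 30 admin\nbob 25 user\ncarol 41 user'

# What not to do: one awk process per input line.
per_line=$(while IFS= read -r line; do
  printf '%s\n' "$line" | awk '{print $2}'
done <<< "$input")

# Choice A: write the ENTIRE loop in awk; it is started once.
awk_once=$(awk '{print $2}' <<< "$input")

# Choice B: stay in the shell and let read split the fields.
shell_only=$(while read -r first second rest; do
  printf '%s\n' "$second"
done <<< "$input")

printf '%s\n' "$awk_once"
```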

https://mywiki.wooledge.org/BashPitfalls
https://mywiki.wooledge.org/BashWeaknesses


