help-bash

Re: [Help-bash] Definitive answer of the difference between [ and [[


From: Bob Proulx
Subject: Re: [Help-bash] Definitive answer of the difference between [ and [[
Date: Sat, 1 Feb 2014 14:38:37 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

Peng Yu wrote:
> I'm wondering what the difference is between [ and [[. After reading
> the manual, it is not clearly to me that I know all the differences
> between them without missing anything. There are some online
> discussions. But they are not definitive. Does anyone have a
> definitive answer to this question? Thanks.

Definitively they are different commands and have different syntax and
semantics.  They are only peripherally similar in the same way that
"echo" and "printf" are similar but yet different.

Let me tell a story that describes a mental model that should help
make the difference between them clear.  It doesn't matter if the
story is factually 100% correct.  The important part is the model that
is developed and how to remember the differences.

In the beginning '[' was an external command.  It still exists as an
external command for compatibility reasons.

  $ ls -ldog /usr/bin/[ /usr/bin/test
  -rwxr-xr-x 1 39464 Jul 20  2013 /usr/bin/[
  -rwxr-xr-x 1 35368 Jul 20  2013 /usr/bin/test

It is often used in the COMMANDS part of an 'if' statement.

  $ type if
  if is a shell keyword
  $ help if
  if: if COMMANDS; then ...
         ^^^^^^^^

If '[' is an external command then all of the arguments up through to
the ';' that terminates the command must be external arguments.
Meaning that arguments cannot use shell metacharacters or there would
be problems.

Can you do this?

  [ 2 > 1 ]
  [ * = 1 ]

No.  Because ">" is special to the shell as a redirection, it cannot
be used bare as a comparison operator.  The first command above would
create a file named "1".  In order to use '<' or '>' or others
those would need to be quoted:  [ 2 '>' 1 ].  That is inconvenient.
(Just as it is inconvenient when using the 'expr' command, which
requires quoting of the special arguments and which users find very
confusing.)  Also the '*' in the above is special to the shell.  It
will be glob expanded to match all files in the current directory.
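A minimal sketch of that redirection problem, assuming bash and a
scratch directory made with mktemp (the variable names are only
illustrative).  It uses the operands the other way around, 1 and 2, so
that the wrong answer is visible as well as the stray file:

```bash
# Work in a scratch directory so the stray file is easy to spot and remove.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Unquoted: the shell takes "> 2" as a redirection, so '[' only sees
# "[ 1 ]", which is true because "1" is a non-empty string.
[ 1 > 2 ]
unquoted_status=$?
created=$(ls)            # the redirection left a file named "2" behind

rm -f -- 2

# Quoted: '>' reaches '[' as an ordinary argument and bash's '[' does a
# string comparison, which is false for "1" versus "2".
[ 1 '>' 2 ]
quoted_status=$?
leftover=$(ls)           # no file was created this time

echo "unquoted: status=$unquoted_status files='$created'"
echo "quoted:   status=$quoted_status files='$leftover'"

cd / && rm -rf -- "$scratch"
```

The unquoted form reports success and creates a file.  The quoted
form reports failure and creates nothing.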

Therefore the external '[' does not use any of the characters used by
the shell.  Instead the external '[' uses other similar operators.
These were very familiar to the programmers of FORTRAN and other
languages of the day.  "-eq", "-ge", "-gt", "-le", "-lt", "-ne", and
others.  None of those are special to the shell.  All can be used
without conflict or problems.

  [ 2 -gt 1 ]

Those commands _were_ primarily external.  But external commands are
slower than internal commands due to the need to fork off another
process and exec the program.  Internal commands are much faster.  A
shell that must run an external '[' and other external commands will
be much slower than one with an internal '['.  So what does the shell
author do?  The shell author creates an internal alternative '[' that
is much faster than the external '['.

  $ type -a [
  [ is a shell builtin
  [ is /usr/bin/[

Can the shell author who is trying to support a lot of scripts
suddenly change the syntax from external to internal and not have a
lot of complaints from the users?  No.  Therefore the shell author
creates an internal '[' that acts *exactly* like the external version
of the command.  Yes '[' is a shell builtin.  But also yes '[' behaves
*exactly* as if it were an external command.  Any difference would
lead to breaking scripts.  Therefore any difference would be a bug.

This means that when processing '[' any arguments must be processed
just like any other external command argument.

  echo *
  ...matches all filenames in the current directory...
  ...echo receives them all as arguments...prints them all out...

And so must file globbing happen with '[' too.  If one of the
arguments is a bare '*' character then file globbing will happen with
it too.  Also the same with any other special character.

This is why a very common error is insufficient quoting of arguments.
People use '[' but do not quote the arguments sufficiently to protect
them from expansion.  When using '[' you must think of it as an
external command.  All of the arguments go through the normal word
splitting and file glob expansion and everything else that any
argument to any external command will go through.  If there is a
character that must be protected then it must be quoted sufficiently.
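A short sketch of the quoting point, assuming bash.  The path and the
variable names are hypothetical and chosen only so that the value
contains a space and the file does not exist:

```bash
file="/nonexistent/my report.txt"   # hypothetical value containing a space
empty=""

# Unquoted: word splitting turns this into [ -f /nonexistent/my report.txt ],
# a malformed test.  '[' prints an error and returns a status greater than 1.
[ -f $file ] 2>/dev/null
unquoted_status=$?

# Quoted: '[' receives exactly one operand.  The file does not exist, so
# the result is an ordinary false.
[ -f "$file" ]
quoted_status=$?

# Unquoted empty value: the argument vanishes and '[' sees [ = x ].
[ $empty = x ] 2>/dev/null
empty_unquoted_status=$?

# Quoted empty value: '[' sees an empty string compared with "x".
[ "$empty" = x ]
empty_quoted_status=$?

echo "file:  unquoted=$unquoted_status quoted=$quoted_status"
echo "empty: unquoted=$empty_unquoted_status quoted=$empty_quoted_status"
```

A status of 1 means the test was evaluated and was false.  A status
greater than 1 means '[' could not make sense of its arguments at
all, which is a different failure and usually a bug in the script.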

Time goes by.  ksh wants to add a syntax that can improve upon the
very common '[' statement but avoid the problems that are seen with it
being an external command.  Enter '[['.  That is a new syntax.  It has
never before been seen.  There are no legacy scripts to support.  It
is a clean slate.  It is written from the beginning to be an internal
keyword.  It is a keyword not a command.  It is a keyword in the new
syntax of ksh.  It is a keyword just like 'if' and 'while'.  As a
keyword it can introduce constructs such as an internal expression.

  type [[
  [[ is a shell keyword
  help [[
  [[ ... ]]: [[ expression ]]
    Execute conditional command.

It doesn't say "COMMANDS" there.  It says "expression".  It introduces
an internal expression.  It doesn't need to worry about external
argument processing or shell metacharacters.  It can process them
differently within the new keyword.

  [[ 2 > 1 ]]
  [[ * = 1 ]]

Yes.  That works fine.  It does not create a file named "1".  In this
context the '>' is now a lexicographical comparison and not a
redirection.  The '*' is not a file glob.  The manual says:

  Expressions are composed of the primaries described below under
  CONDITIONAL EXPRESSIONS. Word splitting and pathname expansion are
  not performed on the words between the [[ and ]]; tilde expansion,
  parameter and variable expansion, arithmetic expansion, command
  substitution, process substitution, and quote removal are performed.
  Conditional operators such as -f must be unquoted to be recognized
  as primaries.
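The rules quoted above can be sketched as follows, assuming bash.  The
variable names and values are hypothetical:

```bash
file="my report.txt"    # hypothetical value containing a space
empty=""

# No word splitting: $file stays one word even though it is unquoted.
[[ $file = "my report.txt" ]]; split_status=$?

# An empty expansion does not vanish, so the expression stays well formed.
[[ $empty = "" ]]; empty_status=$?

# No pathname expansion: the bare * on the left is the one-character
# string "*".  The right side is quoted so it is compared literally.
[[ * = "*" ]]; star_status=$?

# '>' is a string comparison here, so "10" sorts before "9".
[[ 10 > 9 ]]; string_status=$?

# For a numeric comparison the arithmetic operators are still needed.
[[ 10 -gt 9 ]]; numeric_status=$?

echo "split=$split_status empty=$empty_status star=$star_status"
echo "string=$string_status numeric=$numeric_status"
```

Note the fourth test.  Since '>' inside '[[' compares strings it gives
a surprising answer for numbers of different lengths, so "-gt" and
friends remain the operators to use for integers.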

And '[[' has its own set of CONDITIONAL EXPRESSIONS that are not
related to '['.  They are related in that they are similar and provide
a similar function.  But in the mental model consider them completely
separate and distinct.

The '[[' is an internal rewrite version 2 of the external '[' as
conceived of by the shell author.  Over the years the feature set of
'[[' has changed and improved.  In different versions of bash various
parts of it have changed.  New features have been added.  Bugs were
introduced.  Bugs were fixed.  Many people like '[[' because, being
internal, it has a set of rules that match other internal syntax, such
as '<' and '>' not being file redirections and pathname expansion not
being done within '[['.

However '[[' is a new shell feature of ksh, bash, zsh, probably
others.  It isn't a _standard_ shell feature.  It cannot be relied
upon in "#!/bin/sh" scripts.  When using '[[' you need to use
"#!/bin/bash" or "#!/bin/ksh" as it is a feature specific to ksh and
ksh-like shells.
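For scripts that must stay with "#!/bin/sh", pattern matching of the
kind people often reach to '[[' for can be done with 'case', which is
standard.  A minimal sketch, where the function name ends_in_txt is
hypothetical:

```sh
# Hypothetical helper: succeed if the first argument ends in ".txt".
# 'case' does not word split or glob expand the word being matched, so
# $1 needs no quoting here.
ends_in_txt() {
    case $1 in
        *.txt) return 0 ;;
        *)     return 1 ;;
    esac
}

if ends_in_txt "my report.txt"; then
    echo "looks like a text file"
fi
```

In bash or ksh the same check could be written [[ $1 = *.txt ]], at
the cost of no longer running under a plain POSIX sh.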

And so in the evolution of shell scripts '[[' is similar to '[' in the
same way that 'printf' is similar to 'echo'.  The echo command has
legacy baggage.  The '[' / 'test' command has legacy baggage.  The
'printf' command is a new invention to address the problems in 'echo'.
The '[[' is a new invention to address the problems in '['.

Are they perfect?  No.  Are there good reasons to continue to use the
previous legacy versions?  Some will say no but I definitely say yes.
The older versions are perfectly well behaved if you understand the
operating model.  The newer versions are simply a different model.
Sometimes the newer version introduces its own problems.  For example,
the regular expression handling in '[[' has been problematic.
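One example of that, as a sketch assuming bash 3.2 or later with its
default compatibility settings: quoting the right-hand side of '=~'
makes it match as a literal string rather than as a regular
expression.  Keeping the pattern in a variable and expanding it
unquoted avoids the surprise.  The variable names and the sample
string are hypothetical:

```bash
version="bash-4.2"                  # hypothetical input string
pattern='^bash-[0-9]+\.[0-9]+$'     # the regex kept in a variable

# Unquoted expansion on the right of =~ is used as a regular expression.
[[ $version =~ $pattern ]]; regex_status=$?

# Quoted expansion is matched literally, so this looks for the text
# "^bash-[0-9]+\.[0-9]+$" inside "bash-4.2" and fails.
[[ $version =~ "$pattern" ]]; literal_status=$?

echo "regex=$regex_status literal=$literal_status"
```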

Hope this helps,
Bob


