bug-gawk

Re: Can "gawk -i extension" be made safer?


From: arnold
Subject: Re: Can "gawk -i extension" be made safer?
Date: Tue, 27 Jun 2023 04:40:37 -0600
User-agent: Heirloom mailx 12.5 7/5/10

Hi.

Gawk is not going to change.

I see no need for yet another command line option or some
kind of additional syntax at the program level. In particular
the code that deals with AWKPATH is already quite complicated;
it does not need to become even more so.

I suggest instead of using

        #! /usr/bin/gawk ..

that software should fix AWKPATH in the right place, which is
via a shell script:

        #! /bin/sh

        export AWKPATH=/usr/local/lib/magicprog:/usr/share/awk:
        gawk -f magicprog.awk -i inplace "$@"

This is straightforward and easy to do.  Autoconf can be used by
program authors to customize their shell scripts as needed.
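
Where the wrapper cannot hard-code the whole path, it could instead
filter whatever AWKPATH it inherits. The following is only a minimal
sketch, not something gawk provides: absolute_only is a hypothetical
helper name, and it assumes that an empty component in AWKPATH stands
for the current directory, so empty components are dropped along with
"." and other relative entries.

```shell
#! /bin/sh

# Hypothetical helper (not part of gawk): print only the absolute
# components of a colon-separated search path. Relative entries such
# as "." or "lib" are dropped, and so are empty components (assumed
# here to mean the current directory).
# The body runs in a subshell so that the IFS and "set -f" changes
# do not leak into the caller.
absolute_only() (
    result=
    IFS=:
    set -f                      # no filename expansion while splitting
    for dir in $1; do
        case $dir in
            /*) result=${result:+$result:}$dir ;;
        esac
    done
    printf '%s\n' "$result"
)
```

A wrapper might then use it as follows, with the directories taken
from the example above:

        AWKPATH=$(absolute_only "/usr/local/lib/magicprog:/usr/share/awk:$AWKPATH")
        export AWKPATH
        gawk -f magicprog.awk -i inplace "$@"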

Thank you for starting the discussion.

Arnold

Stephane CHAZELAS <stephane@chazelas.org> wrote:

> 2023-06-26 21:40:55 -0400, Andrew J. Schorr:
> > Hi,
> [...]
> > > Note that the new -I I was suggesting would not change gawk's
> > > established behaviour. That would not fix scripts that currently
> > > use -i or @include, but at least would allow script writers to
> > > switch to a safer API going forward.
> > 
> > There is actually already an -I option used for tracing:
> > 
> > bash-5.1$ ./gawk --help | fgrep -e -I
> >         -I                      --trace
>
> My bad, I only checked with 5.1.0 on Ubuntu 22.04; -I was
> apparently added in 5.1.1.
>
> Then maybe gawk -m / --module?
>
> > > Maybe a @use (a la perl) or @import (a la python) or @require...
> > > could be the corresponding directive.
> > 
> > Is it really worth adding a new directive when the problem here is actually
> > that the path is not set appropriately?
>
> The problem is that $AWKPATH is also used for -f/-E.
>
> awk -f script.awk
>
> has always been intended (in any awk, since the 70s) to be the
> same as awk -f ./script.awk (and in implementations other
> than gawk never to be the same as awk -f
> /some/library/script.awk nor awk -f script.awk.awk or awk -f
> /some/library/script.awk.awk).
>
> If we remove . from $AWKPATH globally, we break that. We also
> break #! /usr/bin/gawk -E scripts for the cases where they're
> invoked as execve("myscript", args, env) (admittedly rare in
> practice).
>
> As someone mentioned at
> https://unix.stackexchange.com/questions/749645/how-to-safely-use-gawks-i-option-or-include-directive/749910#749910
> C has
>       #include "file path on the filesystem (relative to the file it is included from though)"
> vs
>       #include <file looked up in a search path>
> which removes the problem here.
>
> So gawk could do something similar where @include <inplace>
> looks up inplace or inplace.awk in the absolute components of
> $AWKPATH while @include "file.awk" remains unchanged for
> backward compatibility but ideally would only treat "file.awk" as
> a file path on the filesystem.
>
> And -i/--include inplace does the same as @include "inplace" and
> -m/--module inplace does the same as @include <inplace>.
>
> > > My understanding is that the -Wposix mode is intended for users
> > > that don't care about gawk extensions and want an awk that
> > > behaves the standard way. The current behaviour of -f that looks
> > > for files in $AWKPATH or looks for them with ".awk" added breaks
> > > POSIX compliance, so changing the behaviour as I suggested in
> > > POSIX mode would restore compliance and would be unlikely to
> > > break any script.
> > 
> > I'm going to leave the language lawyering to Arnold. I don't know whether
> > gawk --posix mode should care much about how to find source code.
>
> See
> https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/awk.html
> for the specification
>
> -f  progfile
>      Specify the pathname of the file progfile containing an awk
>      program
>
> progfile is to be interpreted as a pathname, itself defined at
> https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap03.html#tag_03_271
>
> [...]
> > > I don't think AWKLIBPATH has a problem. Its default value
> > > doesn't include "." or the empty string AFAICT.
> > 
> > I agree that the default value is fine, but presumably somebody could
> > change it.
>
> Yes, and if they do have
>
> cd mysoftware && AWKLIBPATH=./lib gawk -l mylib -l myotherlib
>
> We should probably not get in their way.
>
> [...]
> > > Also searching in the current working directory is desirable for
> > > -f or -E (where IMO arguments should only be interpreted as
> > > paths) and some usages of -i/@include while it is unwelcome for
> > > usages of -i extension.
> > 
> > I'm confused by that point. You seem to say that it's desirable for
> > some usages of -i, and then say that it is unwelcome. And why is it
> > desirable for -E?
>
> See above (and below).
>
> -i is --include, which is also intended to include actual files, as
> opposed to "standard" extensions. Looking at some usages of gawk -i
> through a GitHub code search, I saw some cases of:
>
> cd mysoftware && gawk -i myfile-to-include.awk ...
>
> Where the intention was also to look for the file in the current
> working directory.
>
> While in gawk -i inplace, the inplace extension is intended to
> be loaded from some "standard" place regardless of what the
> current working directory is, hence the need to distinguish the
> 2 cases with --include vs --module and @include "file.awk" vs
> @include <module>.
>
> [...]
> > > #! /usr/bin/safegawk -f/-E
> > > 
> > > We do *not* want the argument of -f/-E (which here is filled-in
> > > by the system from the first (file path) argument to execve())
> > > to be looked-up in $AWKPATH, only be interpreted as a file path.
> > 
> > That is true. I'm not sure how best to handle the gawk shebang case.
> > As you suggest, it may be inadvisable for -E to do any searching at all.
> > And perhaps -E should have a side-effect of disabling relative paths
> > in AWKPATH and in AWKLIBPATH, on the theory that -E is used only 
> > in shebang situations.
> > 
> > Regards,
> > Andy


