Re: Can "gawk -i extension" be made safer?
From: Stephane CHAZELAS
Subject: Re: Can "gawk -i extension" be made safer?
Date: Tue, 27 Jun 2023 09:14:56 +0100
2023-06-26 21:40:55 -0400, Andrew J. Schorr:
> Hi,
[...]
> > Note that the new -I I was suggesting would not change gawk's
> > established behaviour. That would not fix scripts that currently
> > use -i or @include, but at least would allow script writers to
> > switch to a safer API going forward.
>
> There is actually already an -I option used for tracing:
>
> bash-5.1$ ./gawk --help | fgrep -e -I
> -I --trace
My bad, I only checked with 5.1.0 on Ubuntu 22.04; that -I was
apparently added in 5.1.1.
Then maybe gawk -m / --module?
> > Maybe a @use (a la perl) or @import (a la python) or @require...
> > could be the corresponding directive.
>
> Is it really worth adding a new directive when the problem here is actually
> that the path is not set appropriately?
The problem is that $AWKPATH is also used for -f/-E.
awk -f script.awk
has always been intended (in any awk, since the 70s) to be the
same as awk -f ./script.awk (and, in implementations other
than gawk, never to be the same as awk -f
/some/library/script.awk, awk -f script.awk.awk or awk -f
/some/library/script.awk.awk).
If we remove . from $AWKPATH globally, we break that. We also
break #! /usr/bin/gawk -E scripts for the cases where they're
invoked as execve("myscript", args, env) (admittedly rare in
practice).
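The equivalence between awk -f script.awk and awk -f ./script.awk described above can be spelled out by the caller. The sketch below is only an illustration: as_path is a hypothetical helper name, and it relies on the rule in the gawk man page that a file name containing a "/" is not searched for in $AWKPATH.

```shell
# Hypothetical helper (not part of gawk): turn a bare file name into an
# explicit relative path so that it can only be read as a pathname.
as_path() {
    case $1 in
        */*) printf '%s\n' "$1" ;;    # already contains a slash: keep as-is
        *)   printf './%s\n' "$1" ;;  # bare name: anchor it to the cwd
    esac
}
```

A caller could then run gawk -f "$(as_path "$script")". This covers the cases where a file in the current directory is wanted; it does nothing for -i inplace, where a search outside the current directory is the whole point.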
As someone mentioned at
https://unix.stackexchange.com/questions/749645/how-to-safely-use-gawks-i-option-or-include-directive/749910#749910
C has #include "file path on the filesystem (relative to the file it is
included from though)" vs
#include <file looked up in a search path>
which removes the problem here.
So gawk could do something similar, where @include <inplace>
looks up inplace or inplace.awk in the absolute components of
$AWKPATH while @include "file.awk" remains unchanged for
backward compatibility but ideally would only treat "file.awk" as
a file path on the filesystem.
And -i/--include inplace does the same as @include "inplace" and
-m/--module inplace does the same as @include <inplace>
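The lookup suggested for @include <inplace> / --module can be modelled in a few lines of POSIX shell. This is a sketch of the proposal only, not of anything gawk implements: find_module is a hypothetical name, and trying the bare name before the ".awk" name within each directory is an assumption, since the proposal does not fix an order.

```shell
# Hypothetical model of the proposed module lookup: search only the
# absolute components of $AWKPATH, ignoring "", "." and other relative ones.
# The body runs in a subshell so IFS and globbing changes do not leak out.
find_module() (
    name=$1
    # A name containing a slash is a path, not a module name.
    case $name in
        */*) return 1 ;;
    esac
    IFS=:
    set -f                      # no globbing while splitting $AWKPATH
    for dir in ${AWKPATH-}; do
        case $dir in
            /*) ;;              # absolute component: search it
            *)  continue ;;     # relative or empty component: skip it
        esac
        # Assumed order: bare name first, then name.awk.
        for candidate in "$dir/$name" "$dir/$name.awk"; do
            if [ -f "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1                    # not found in any absolute component
)
```

With AWKPATH=.:/usr/share/awk, find_module inplace would skip an inplace.awk sitting in the current directory and only report one found under /usr/share/awk, which is the distinction the <module> form is meant to provide.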
> > My understanding is that the -Wposix mode is intended for users
> > that don't care about gawk extensions and want an awk that
> > behaves the standard way. The current behaviour of -f that looks
> > for files in $AWKPATH or looks for them with ".awk" added breaks
> > POSIX compliance, so changing the behaviour as I suggested in
> > POSIX mode would restore compliance and would be unlikely to
> > break any script.
>
> I'm going to leave the language lawyering to Arnold. I don't know whether
> gawk --posix mode should care much about how to find source code.
See
https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/awk.html
for the specification
-f progfile
Specify the pathname of the file progfile containing an awk
program
progfile is to be interpreted as a pathname, itself defined at
https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap03.html#tag_03_271
[...]
> > I don't think AWKLIBPATH has a problem. Its default value
> > doesn't include "." or the empty string AFAICT.
>
> I agree that the default value is fine, but presumably somebody could
> change it.
Yes, and if they do have
cd mysoftware && AWKLIBPATH=./lib gawk -l mylib -l myotherlib
we should probably not get in their way.
[...]
> > Also searching in the current working directory is desirable for
> > -f or -E (where IMO arguments should only be interpreted as
> > paths) and some usages of -i/@include while it is unwelcome for
> > usages of -i extension.
>
> I'm confused by that point. You seem to say that it's desirable for
> some usages of -i, and then say that it is unwelcome. And why is it desirable
> for -E?
See above (and below).
-i is --include, and is also intended to include actual files, as
opposed to "standard" extensions. Looking at some usages of gawk -i
through a github code search, I saw some cases of:
cd mysoftware && gawk -i myfile-to-include.awk ...
where the intention was also to look for the file in the current
working directory.
While in gawk -i inplace, the inplace extension is intended to
be loaded from some "standard" place regardless of what the
current working directory is, hence the need to distinguish the
two cases with --include vs --module and @include "file.awk" vs
@include <module>.
[...]
> > #! /usr/bin/safegawk -f/-E
> >
> > We do *not* want the argument of -f/-E (which here is filled-in
> > by the system from the first (file path) argument to execve())
> > to be looked-up in $AWKPATH, only be interpreted as a file path.
>
> That is true. I'm not sure how best to handle the gawk shebang case.
> As you suggest, it may be inadvisable for -E to do any searching at all.
> And perhaps -E should have a side-effect of disabling relative paths
> in AWKPATH and in AWKLIBPATH, on the theory that -E is used only
> in shebang situations.
>
> Regards,
> Andy
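The idea quoted above, of ignoring relative entries in AWKPATH and AWKLIBPATH, amounts to a filter over a colon-separated list. A minimal sketch of that filter follows; strip_relative is a hypothetical name, and nothing here reflects an existing gawk option or behaviour.

```shell
# Hypothetical helper: print a colon-separated path list with every
# non-absolute component ("", ".", "./lib", ...) removed.
# The body runs in a subshell so the IFS change stays local.
strip_relative() (
    IFS=:
    set -f                      # no globbing while splitting the list
    out=
    for dir in $1; do
        case $dir in
            /*) out=${out:+$out:}$dir ;;   # keep absolute components
        esac
    done
    printf '%s\n' "$out"
)
```

For example, strip_relative '.:/usr/share/awk::./lib:/opt/awk' prints /usr/share/awk:/opt/awk. The result can be empty when no absolute component exists, and what gawk should do in that case is left open by this sketch.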