Re: -F fs_val handles backslash-newline differently, compared to -v FS=v
From: Andrew J. Schorr
Subject: Re: -F fs_val handles backslash-newline differently, compared to -v FS=val and FS=val
Date: Thu, 8 Jun 2023 12:12:44 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
It's an interesting case. Inside main.c:parse_args, the -F
option sets a preassign of type PRE_ASSIGN_FS, whereas
-v results in a generic PRE_ASSIGN. Then in main(), the
PRE_ASSIGN_FS case results in a call to cmdline_fs instead
of arg_assign. And cmdline_fs does minimal processing of the
value. It checks for '\t', and then just calls
    *tmp = make_str_node(str, strlen(str), SCAN); /* do process escapes */
The arg_assign function is much more complicated. It has logic
for disallowing newline in posix mode, and then it calls
    it = make_str_node(cp, strlen(cp), SCAN | ELIDE_BACK_NL);
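To make the effect of the extra flag concrete, here is a small standalone model of the two behaviours. It is only a sketch and not gawk's code: sketch_scan and SKETCH_ELIDE_BACK_NL are made-up names, and it handles just a few escapes. It assumes that backslash-newline always consumes the newline and that the flag only decides whether the backslash survives, which is consistent with the outputs shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag for this sketch; it only mirrors the role of ELIDE_BACK_NL. */
enum { SKETCH_ELIDE_BACK_NL = 1 };

/*
 * Copy src to dst, processing a few backslash sequences.
 * dst must have room for strlen(src) + 1 bytes (the result never grows).
 * Returns the length of the result.
 */
size_t sketch_scan(const char *src, char *dst, int flags)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        char c = *src++;

        if (c != '\\') {
            dst[n++] = c;
            continue;
        }
        /* c is a backslash: decide based on the character after it */
        switch (*src) {
        case '\n':
            src++;                      /* the newline is dropped either way */
            if ((flags & SKETCH_ELIDE_BACK_NL) == 0)
                dst[n++] = '\\';        /* without the flag, keep the backslash */
            break;
        case 't':
            src++;
            dst[n++] = '\t';
            break;
        case 'n':
            src++;
            dst[n++] = '\n';
            break;
        case '\\':
            src++;
            dst[n++] = '\\';
            break;
        default:
            /* unknown escape or trailing backslash: keep it literally (sketch only) */
            dst[n++] = '\\';
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Feeding it backslash, newline, 'a' with flags 0 gives backslash followed by 'a' (the -F result), while SKETCH_ELIDE_BACK_NL gives just 'a' (the -v result).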
Using the master branch:
bash-5.1$ gawk -F '\
a' 'BEGIN { print "FS1=" FS }'
FS1=\a
bash-5.1$ gawk --lint -F '\
a' 'BEGIN { print "FS1=" FS }'
gawk: warning: backslash string continuation is not portable
FS1=\a
bash-5.1$ gawk --lint --posix -F '\
a' 'BEGIN { print "FS1=" FS }'
gawk: warning: backslash string continuation is not portable
FS1=\a
bash-5.1$ gawk -v FS='\
a' 'BEGIN { print "FS2=" FS }'
FS2=a
bash-5.1$ gawk --lint -v FS='\
a' 'BEGIN { print "FS2=" FS }'
gawk: warning: backslash string continuation is not portable
FS2=a
bash-5.1$ gawk --posix --lint -v FS='\
a' 'BEGIN { print "FS2=" FS }'
gawk: fatal: POSIX does not allow physical newlines in string values
If one patches cmdline_fs to add ELIDE_BACK_NL, then all 3 examples
give the same result, but I have no idea whether that's the desirable
outcome. The actual string argument processed contains backslash followed
by newline followed by 'a'. Should that get mapped to 'a'?
I also don't know if a -F arg containing a newline in posix mode
should trigger that same fatal error.
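For reference, the posix-mode check that produces the fatal error above can be modelled as a simple test for a physical newline in the value. This is a hedged sketch, not the gawk source: sketch_value_allowed and its posix_mode parameter are hypothetical names, and whether -F should go through the same test is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns false when the value would be rejected,
 * i.e. when posix_mode is set and the value contains a physical newline.
 * Outside posix mode every value is accepted.
 */
bool sketch_value_allowed(const char *value, bool posix_mode)
{
    assert(value != NULL);
    if (posix_mode && strchr(value, '\n') != NULL)
        return false;   /* models the "physical newlines" fatal error */
    return true;
}
```

With this model, the backslash-newline-'a' value is accepted normally and rejected in posix mode, matching the -v behaviour shown earlier; the -F path currently performs no such check.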
Regards,
Andy
On Thu, Jun 08, 2023 at 08:43:57AM -0600, arnold@skeeve.com wrote:
> That is an interesting report. I will (eventually)
> investigate; I don't have a lot of free time at the moment.
>
> Thanks,
>
> Arnold
>
> Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
> > GNU awk 5.1.1
> >
> > gawk -F '\
> > a' 'BEGIN { print "FS1=" FS }'
> >
> > gawk -v FS='\
> > a' 'BEGIN { print "FS2=" FS }'
> >
> > echo | gawk '{ print "FS3=" FS }' FS='\
> > a'
> >
> > The first command treats "backslash+newline" as backslash:
> >
> > FS1=\a
> >
> > The second and third commands treat the same as empty string:
> >
> > FS2=a
> > FS3=a
> >
> > I think it would be better if all forms have the same rules.