bug-gawk

Re: -F fs_val handles backslash-newline differently, compared to -v FS=v


From: Andrew J. Schorr
Subject: Re: -F fs_val handles backslash-newline differently, compared to -v FS=val and FS=val
Date: Thu, 8 Jun 2023 12:12:44 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

It's an interesting case. Inside main.c:parse_args, the -F
option sets a preassign of type PRE_ASSIGN_FS, whereas
-v results in a generic PRE_ASSIGN. Then in main(), the
PRE_ASSIGN_FS case results in a call to cmdline_fs instead
of arg_assign. And cmdline_fs does minimal processing of the
value. It checks for '\t', and then just calls

        *tmp = make_str_node(str, strlen(str), SCAN); /* do process escapes */

The arg_assign function is much more complicated. It has logic
for disallowing newline in posix mode, and then it calls

        it = make_str_node(cp, strlen(cp), SCAN | ELIDE_BACK_NL);
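To make the difference between the two calls concrete, here is a minimal
sketch of what the extra flag appears to change. It is not gawk's
make_str_node: the function name scan_escapes and the elide_back_nl
parameter are hypothetical, and only the backslash-newline case is modeled
(all other escapes are copied through unchanged). The assumption, based on
the behavior observed in this thread, is that the newline is consumed either
way and the flag only decides whether the backslash survives.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified model of escape scanning; NOT gawk's code.
 * Copies src to dst, handling only backslash-newline:
 *   - the newline is always dropped;
 *   - the backslash is kept unless elide_back_nl is nonzero.
 * dst must have room for strlen(src) + 1 bytes.
 */
static void scan_escapes(const char *src, char *dst, int elide_back_nl)
{
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;              /* skip backslash and newline */
            if (!elide_back_nl)
                *dst++ = '\\';     /* keep a literal backslash (the -F result) */
            continue;
        }
        *dst++ = *src++;           /* everything else is copied as-is */
    }
    *dst = '\0';
}
```

Under that model, the three-byte argument backslash, newline, 'a' becomes
"\a" (backslash followed by 'a') when elide_back_nl is 0, and plain "a" when
it is 1.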

Using the master branch:

bash-5.1$ gawk -F '\
a' 'BEGIN { print "FS1=" FS }'
FS1=\a
bash-5.1$ gawk --lint -F '\
a' 'BEGIN { print "FS1=" FS }'
gawk: warning: backslash string continuation is not portable
FS1=\a
bash-5.1$ gawk --lint --posix -F '\
a' 'BEGIN { print "FS1=" FS }'
gawk: warning: backslash string continuation is not portable
FS1=\a
bash-5.1$ gawk -v FS='\
a' 'BEGIN { print "FS2=" FS }'
FS2=a
bash-5.1$ gawk --lint -v FS='\
a' 'BEGIN { print "FS2=" FS }'
gawk: warning: backslash string continuation is not portable
FS2=a
bash-5.1$ gawk --posix --lint -v FS='\
a' 'BEGIN { print "FS2=" FS }'
gawk: fatal: POSIX does not allow physical newlines in string values

If one patches cmdline_fs to add ELIDE_BACK_NL, then all 3 examples
give the same result, but I have no idea whether that's the desirable
outcome. The actual string argument processed contains backslash followed
by newline followed by 'a'. Should that get mapped to 'a'?

I also don't know if a -F arg containing a newline in posix mode
should trigger that same fatal error.

Regards,
Andy

On Thu, Jun 08, 2023 at 08:43:57AM -0600, arnold@skeeve.com wrote:
> That is an interesting report. I will (eventually)
> investigate; I don't have a lot of free time at the moment.
> 
> Thanks,
> 
> Arnold
> 
> Denys Vlasenko <dvlasenk@redhat.com> wrote:
> 
> > GNU awk 5.1.1
> >
> > gawk -F '\
> > a' 'BEGIN { print "FS1=" FS }'
> >
> > gawk -v FS='\
> > a' 'BEGIN { print "FS2=" FS }'
> >
> > echo | gawk '{ print "FS3=" FS }' FS='\
> > a'
> >
> > The first command treats "backslash+newline" as backslash:
> >
> > FS1=\a
> >
> > The second and third commands treat the same as empty string:
> >
> > FS2=a
> > FS3=a
> >
> > I think it would be better if all forms have the same rules.


