bug-gawk

gensub() treats trailing backslash in replacement string inconsistently


From: Denys Vlasenko
Subject: gensub() treats trailing backslash in replacement string inconsistently / surprisingly
Date: Sat, 3 Jun 2023 18:33:50 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

Awk 5.1.1

In gensub(), a backslash in the replacement string can be used to escape
the special char '&', and \0 - \9 denote "replace by Nth matched substring"
operations (\0 being the whole match).

It is not specified what would happen if backslash is followed
by some other char, such as \k. Experimentally, backslash gets
removed - \k acts the same as k. Good.

It is also not specified what would happen if backslash is the last char.
And here, it's inconsistent. The replacement string which is
just one backslash uses that string verbatim:

awk 'BEGIN { s="\\";print "s=" s;g=gensub("a",s,1,"a");print g }'
s=\
\

The replacement string which has something non-empty and then ends
in backslash, at first glance, seems to drop it:

awk 'BEGIN { s="b\\";print "s=" s;g=gensub("a",s,1,"a");print g }'
s=b\
b

but in fact, it uses a NUL char (!) there:

awk 'BEGIN { s="b\\";print "s=" s;g=gensub("a",s,1,"a");print g; print length(g) }'
s=b\
b
2  <============== HUH??

awk 'BEGIN { s="b\\";g=gensub("a",s,1,"a");print g }' | hexdump -vC
00000000  62 00 0a                                          |b..|
00000003     ^^------------- AHA!!!

I think it would be better to do something consistent.
Insertion of NUL char is particularly odd.



