Re: [bug#67841] [PATCH] Clarify error messages for misuse of m4

autoconf-patches

From:	Jacob Bachmeyer
Subject:	Re: [bug#67841] [PATCH] Clarify error messages for misuse of m4_warn and --help for -W.
Date:	Mon, 18 Dec 2023 23:44:47 -0600
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.22) Gecko/20090807 MultiZilla/1.8.3.4e SeaMonkey/1.1.17 Mnenhy/0.7.6.0

Zack Weinberg wrote:

On Fri, Dec 15, 2023, at 7:08 PM, Jacob Bachmeyer wrote:

Zack Weinberg wrote:

[...]
Also, there’s a perl 5.14ism in one place (s///a) which I need
to figure out how to make 5.6-compatible before it can land.

...

+  $q_channel =~ s/([^\x20-\x7e])/"\\x".sprintf("%02x", ord($1))/aeg;

...

If I am reading perlre correctly, you should be able to simply drop the /a modifier because it has no effect on the pattern you have written, since you are using an explicit character class and are *not* using the /i modifier.


Thanks, you've made me realize that /a wasn't even what I wanted in the
first place.  What I thought /a would do is force s/// to act byte by
byte -- or, in the terms of perlunitut, force the target string to be
treated as a binary string.  That might be clearer with a concrete example:

$ perl -e '$_ = "\xE2\x88\x85"; s/([^\x20-\x7e])/sprintf("\\x%02x", ord($1))/eg; print "$_\n";'
\xe2\x88\x85
$ perl -e '$_ = "\N{EMPTY SET}"; s/([^\x20-\x7e])/sprintf("\\x%02x", ord($1))/eg; print "$_\n";'
\x2205

What change do I need to make to the second one-liner to make it also
print \xe2\x88\x85?

Add -MEncode to the one-liner and insert "$_ = encode_utf8($_);" before the substitution to declare that you want the string as UTF-8 bytes. The Encode documentation states: "All possible characters have a UTF-8 representation so this function [encode_utf8] cannot fail."

In the actual patch, try "my $q_channel = encode_utf8($channel);" when initially copying the channel name.
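
A minimal sketch of that suggestion, assuming Perl 5.8 or later so that Encode is available. The helper name quote_nonprintable is hypothetical and does not appear in the patch; it only shows the order of operations (encode first, then escape octet by octet):

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper, named here for illustration only.
sub quote_nonprintable {
    my ($text) = @_;
    # Encode first, so the substitution below sees one octet at a time.
    my $octets = encode_utf8($text);
    # Replace every octet outside printable ASCII with a \xNN escape.
    $octets =~ s/([^\x20-\x7e])/sprintf("\\x%02x", ord($1))/eg;
    return $octets;
}

print quote_nonprintable("\x{2205}"), "\n";   # U+2205 EMPTY SET
print quote_nonprintable("obsolete"), "\n";   # plain ASCII is left alone
```

With the encode step in place, the first call prints \xe2\x88\x85 rather than \x2205, and the second prints its input unchanged.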

  How do I express that in a way that is backward
compatible all the way to 5.6.0?

Now the fun part... Perl 5.6 had serious deficiencies in Unicode support; Encode was introduced with 5.8. You will need to make the Encode import conditional and generate a stub for encode_utf8 if the import fails. This should not be a problem, since non-ASCII is unlikely here in the first place, and I think Perl 5.6 would treat non-ASCII as exactly the octet string you want anyway.


Something like:  (untested)

BEGIN {
 my $have_Encode = 0;
 eval { require Encode; $have_Encode = 1; };
 if ($have_Encode) {
   Encode->import('encode_utf8');
 } else {
   # for Perl 5.6, which did not really have Unicode support anyway
   eval 'sub encode_utf8 { return pop }';
 }
}

Note that the stub is defined using eval STRING rather than eval BLOCK because "sub" has compile-time effects in Perl and we only want it if Encode could not be loaded.
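
To illustrate that compile-time effect in isolation, here is a small sketch; the sub names from_block and from_string are made up for the demonstration and have nothing to do with the patch. Neither branch ever runs, yet only one of the two subs ends up defined:

```perl
use strict;
use warnings;

# A named sub inside a block is compiled and installed at compile time,
# even though the surrounding branch never executes.
if (0) {
    sub from_block { return 'block' }
}

# A sub inside eval STRING is compiled only when the eval itself runs,
# which here it never does.
if (0) {
    eval 'sub from_string { return "string" }';
}

print "from_block defined:  ", (defined(&from_block)  ? "yes" : "no"), "\n";
print "from_string defined: ", (defined(&from_string) ? "yes" : "no"), "\n";
```

This reports "yes" for from_block and "no" for from_string, which is why a plain "sub encode_utf8" in the else branch would be installed unconditionally.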

  And finally, how do I ensure that there is absolutely nothing I can put in
the initial assignment to $_ that will cause the rest of the one-liner to
crash?  For example, over in the Python universe it's very easy to get
Unicode conversion to crash:

$ python3 -c 'print("\uDC00".encode("utf-8"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc00' in position 0: surrogates not allowed


Not a problem in Perl:

$ perl -MEncode -e '$_ = "\x{dc00}"; $_ = encode_utf8($_); s/([^\x20-\x7e])/sprintf("\\x%02x", ord($1))/eg; print "$_\n";'

\xed\xb0\x80

:-)

-- Jacob
