Paolo Bonzini wrote:
> Jim's fix for the fgrep infinite loop would erroneously miss matches
> in SJIS character sets. In this character set low bytes (i.e. ASCII
> bytes) are also valid second bytes in a double-byte character, so you
> have to continue looking for a match, even if you match in the middle
> of a double-byte character.

Good catch!
Thank you.
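
To make the overlap concrete, here is a small sketch (not part of the patch).
It relies on the Shift_JIS encoding of KATAKANA LETTER A being the two bytes
0x83 0x41, the second of which is also ASCII 'A'. The function name sjis_a is
made up for illustration. A byte-oriented search (LC_ALL=C) finds 'A' inside
that character, which is the kind of mid-character hit the matcher has to step
past while still searching the rest of the line:

```shell
# Hypothetical helper: emit the Shift_JIS bytes for KATAKANA LETTER A
# (0x83 0x41) followed by a newline.  \203 is octal for 0x83.
sjis_a() { printf '\203A\n'; }

# Dump the bytes in hex to confirm the trail byte is 0x41, i.e. ASCII 'A'.
sjis_a | od -An -tx1

# In the C locale grep compares bytes, so 'A' "matches" in the middle of
# the double-byte character.  -c prints the number of matching lines.
sjis_a | LC_ALL=C grep -c A
```
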
The attached test will be skipped unless (on a glibc system) you run
something like

  mkdir /usr/lib/locale/ja_JP.SHIFT_JIS
  zcat /usr/share/i18n/charmaps/SHIFT_JIS.gz | \
    localedef \
      -f - \
      -i /usr/share/i18n/locales/ja_JP \
      /usr/lib/locale/ja_JP.SHIFT_JIS

It is telling that when you run those commands,
you see this diagnostic:

  character map `SHIFT_JIS' is not ASCII compatible, locale not ISO C compliant
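
Once the locale is in place, a check along these lines can confirm that it is
usable. This is only a sketch, assuming a system with the POSIX `locale'
utility; the function name have_sjis_locale is hypothetical, and this is not
necessarily how the attached test decides whether to skip:

```shell
# Hypothetical helper: succeed only if the ja_JP.SHIFT_JIS locale can be
# selected and reports SHIFT_JIS as its character map.
have_sjis_locale() {
  cm=$(LC_ALL=ja_JP.SHIFT_JIS locale charmap 2>/dev/null)
  test "$cm" = SHIFT_JIS
}

if have_sjis_locale; then
  echo "ja_JP.SHIFT_JIS is available"
else
  echo "ja_JP.SHIFT_JIS is missing; the test would be skipped"
fi
```
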

> +# % becomes an half-width katakana in SJIS, and an invalid sequence

s/an/a/

> +seq=0

I find s/seq/k/ to be slightly more readable.

> +  timeout 10s grep $1 `encode "$3"`> out$seq 2>&1

Please use $(...), rather than `...` in tests.
init.sh ensures that the shell we are using is capable enough.
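
One practical reason to prefer it: $(...) nests without extra escaping,
whereas nested backquotes need backslashes. A small sketch, with made-up
variable names:

```shell
# Nested command substitution with backquotes: the inner pair must be
# escaped with backslashes.
old=`echo \`echo inner\``

# The same thing with $(...): no escaping needed, and it is easier to read.
new=$(echo $(echo inner))

# Both forms produce the same string.
test "$old" = "$new" && echo "same result: $new"
```
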