From: | Paolo Bonzini |
Subject: | Re: [PATCH 1/5] maint: ensure that MB_CUR_MAX is defined even when !MBS_SUPPORT |
Date: | Fri, 16 Sep 2011 15:12:37 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20110906 Thunderbird/6.0.2 |
On 09/16/2011 03:03 PM, address@hidden wrote:
> Please remember that dfa.[ch] are shared code with gawk and I think also gettext (although I don't know how up to date gettext's version is). I'd really prefer not to have too many GREP_xxx kinds of things in those files. (It's ok in the rest of grep, of course.:-)
We could separate the variables for dfa and the rest of grep. Grep just needs "#define DFA_MB_CUR_MAX GREP_MB_CUR_MAX" then (and you can similarly "#define DFA_MB_CUR_MAX gawk_mb_cur_max" in gawk).
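A minimal sketch of that indirection, assuming the shared code only ever names DFA_MB_CUR_MAX and each client supplies the mapping. The definition of GREP_MB_CUR_MAX, the fallback, and the helper dfa_locale_is_multibyte are hypothetical and only illustrate the idea:

```c
#include <stdlib.h>   /* MB_CUR_MAX */

/* Client side (would live in a grep header).
   Assumption: how grep defines GREP_MB_CUR_MAX is up to grep. */
#define GREP_MB_CUR_MAX MB_CUR_MAX
#define DFA_MB_CUR_MAX  GREP_MB_CUR_MAX

/* Shared side (stands in for dfa.c): fall back to the libc value
   when the client did not provide a mapping. */
#ifndef DFA_MB_CUR_MAX
# define DFA_MB_CUR_MAX MB_CUR_MAX
#endif

/* Hypothetical helper: the shared code never mentions GREP_xxx. */
static int
dfa_locale_is_multibyte (void)
{
  return DFA_MB_CUR_MAX > 1;
}
```

gawk would use the same shared code with its own one-line mapping, so no GREP_xxx name has to appear in dfa.[ch].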
> For what it's worth, MB_CUR_MAX is a function call in GLIBC. There were some cases in gawk where I was losing a noticeable amount of time calling it a lot. So I set up a global variable gawk_mb_cur_max and initialize it in main(), since the result should never change during a single run of the program. It made a difference.
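The caching described above could look roughly like the following. This is a sketch, not gawk's actual code: only the variable name gawk_mb_cur_max comes from the message, while init_mb_cur_max and is_multibyte_locale are hypothetical helpers.

```c
#include <locale.h>
#include <stdlib.h>   /* MB_CUR_MAX, size_t */

/* Cached copy of MB_CUR_MAX; 1 is the value in the initial "C" locale. */
static size_t gawk_mb_cur_max = 1;

/* Hypothetical helper, meant to be called once from main() right after
   the locale is set.  The locale is assumed not to change afterwards. */
static void
init_mb_cur_max (void)
{
  setlocale (LC_ALL, "");
  gawk_mb_cur_max = MB_CUR_MAX;   /* one libc call instead of many */
}

/* Hot paths read the plain variable instead of expanding MB_CUR_MAX. */
static int
is_multibyte_locale (void)
{
  return gawk_mb_cur_max > 1;
}
```

The cached value is only valid as long as nothing calls setlocale again later in the run.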
Interesting. We do have a field for mb_cur_max in dfaexec, but it is there because some UTF-8 regexes can be run as if the locale were single-byte. I suspect however that awk programs (especially badly written ones!) do more regex compilation than grep, up to 1 compilation per match. For grep it shouldn't really matter.
Having variables grep_mb_cur_max and dfa_mb_cur_max (separate for the reasons Arnold explained) would work, but it would make it impossible for the compiler to throw away the multibyte code when MBS_SUPPORT is zero.
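To illustrate the concern, here is a sketch under the assumption that MBS_SUPPORT is a 0/1 preprocessor constant set at configure time. The mapping of DFA_MB_CUR_MAX and the function uses_multibyte_path are hypothetical:

```c
#include <stdlib.h>   /* MB_CUR_MAX */

#ifndef MBS_SUPPORT
# define MBS_SUPPORT 0   /* assumption: normally provided by configure */
#endif

/* With a macro, the !MBS_SUPPORT case is a literal 1. */
#if MBS_SUPPORT
# define DFA_MB_CUR_MAX MB_CUR_MAX
#else
# define DFA_MB_CUR_MAX 1
#endif

/* Hypothetical stand-in for code with a multibyte branch. */
static int
uses_multibyte_path (void)
{
  if (DFA_MB_CUR_MAX > 1)
    return 1;   /* multibyte code; "1 > 1" lets the compiler drop it */
  return 0;     /* single-byte code */
}
```

If the condition read an ordinary global such as dfa_mb_cur_max instead, the compiler would generally have to keep both branches, because the value is only known at run time.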
Paolo