[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Faster token table lookups
From: |
Hans Aberg |
Subject: |
Re: Faster token table lookups |
Date: |
Fri, 22 Dec 2000 11:21:31 +0100 |
At 11:42 +0100 0-12-20, Akim Demaille wrote:
>> It would merely be a convenience to those that want to make a
>> lot of name table lookups. I think there is a suggestion one can
>> implement it in the lexer function yylex: instead of returning a
>> macro number directly, it makes a table lookup, and returns that
>> number to Bison.
>
>Then why not, send a patch :)
I made that patch: The patched file "output.c" is attached. (Instructions
for putting it in, see below.)
I realized that it was unnecessary to sort the names, but merely an array
with the typecodes, and a comparison function using the unsorted lookup
"tags" table expanding to the lookup names. Then I applied a heapsort to
that, with max number of comparisons O(n log n), where n is the table size.
Writing it out then was simple; just a modification of the script Bison
uses to wrote out the unsorted table.
- It also writes out two functions that can be used to find the typecodes:
int yytypecode(const char* s);
int yytypecode_cache(const char* s, int* cache);
using fast binary search, at most 1 + log_2 n comparisons.
- These functions will translate to Yacc numbers if those are used; with
%raw, they will produce Bison's internal numbers.
- If the name-prefix is changed, the "yy" of these functions will change
accordingly (by two extra written out macros).
- If the name is not found, both functions return -1.
- One simply uses these functions as replacements for the token macros:
If
%token identifier
is declared, one can in the lexer write
return yytokencode("identifier");
And if
%token "=>"
one can write
return yytokencode("\"=>\"");
including the extra as the name on the table us exactly "=>" and not =>.
A character code '+' would be reached with
return yytokencode("\'+\'");
The yytypecode_cache function can be used to cache the lookup number so it
only lookup the number once: If the "cache" argument is NULL or have a
dereferenced value < -1, a lookup is made just as in the case of
yytokencode(). If is != NULL and the dereferenced value < -1, the lookup
value will be written into it. And if it is != NULL and the dereferenced
value >= -1, it simply returns that value.
So one simply uses it as follows:
int identifier_ = -2; /* Define a cache */
...
return yytypecode_cache("identifier", &identifier_);
Then, the first time yytypecode_cache is called, it will make a table
lookup, but on subsequent calls, it will merely use the lookup number found
earlier.
- In order to implement the patch, one should add between output_stos() and
output_rule_data() the following
#define SWAP(x, y) ...
static void adjust(int tc[], int m, register int n) ...
static void output_name_typecode_table(void) ...
In output_rule_data(), one should add the
output_name_typecode_table();
call right after
/* add a NULL entry to list of tokens */
fprintf (ftable, ", NULL\n};\n");
And in output_headers() one should add the lines
fprintf (ftable, "#define yytypecode %stypecode\n", spec_name_prefix);
fprintf (ftable, "#define yytypecode_cache %stypecode_cache\n",
spec_name_prefix);
in the "if (spec_name_prefix)" conditional.
output.c
Description: Text document
Hans Aberg
- Faster token table lookups, Hans Aberg, 2000/12/15
- Re: Faster token table lookups, Akim Demaille, 2000/12/19
- Re: Faster token table lookups, Hans Aberg, 2000/12/19
- Re: Faster token table lookups, Akim Demaille, 2000/12/20
- Re: Faster token table lookups, Hans Aberg, 2000/12/20
- Re: Faster token table lookups,
Hans Aberg <=
- Re: Faster token table lookups, Akim Demaille, 2000/12/22
- Re: Faster token table lookups, Hans Aberg, 2000/12/22
- Re: Faster token table lookups, Hans Aberg, 2000/12/22