tinycc-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Tinycc-devel] vectorize the curent hash implementation


From: Michael Matz
Subject: Re: [Tinycc-devel] vectorize the curent hash implementation
Date: Sat, 23 Apr 2016 20:45:37 +0200 (CEST)
User-agent: Alpine 2.20 (LSU 67 2015-01-07)

Hi,

On Sat, 23 Apr 2016, Vladimir Vissoultchev wrote:

Rolling hashes have a nice property -- associativity -- i.e. hash(concat(a,
b)) = [hash(a) * TOK_HASH_PRIME^len(b) + hash(b)] in GF(TOK_HASH_SIZE)

What I tried based on this was to pre-hash all of file->buffer and calc hash
on current (pos, len) with no loop like this:

   h = (ph[len] - (ph[0] - TOK_HASH_INIT) * hash_prime_powers[len]) &
(TOK_HASH_SIZE - 1)

... where ph is pointing to current pos in pre-hash buffer.

Unfortunately complete file->buffer pre-hashing is slow, I'm currently working on impl pre-hash part in SSE2 but it looks like it's going to be too slow again as there is no performant way to calculate all intermediate hashes.

The hashing of identifiers (when tcc is compiled with GCC on -O0) in next_nomacro1 itself takes only 2.3 % of the overall cycle estimate (valgrind cachegrind measurement of compiling tcc with tcc itself). The stepping forward of the characters itself takes more than the hashing (namely 3.3%). So any optimization of the hashing needs to be _extremely_ efficient to lead to any measurable improvement at all. I.e. this loop:

3.3%        while (c = *++p, isidnum_table[c - CH_EOF] & (IS_ID|IS_NUM))
2.3%            h = TOK_HASH_FUNC(h, c);

Even optimizing this loop to run twice as fast (which would be quite an achievement) will make tcc run only 2.8% faster (nothing to sneeze at of course, I just wanted to mention that it's not easy to improve the speed by an fantastic amount).


Ciao,
Michael.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]