Re: [Tinycc-devel] vectorize the curent hash implementation

tinycc-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Tinycc-devel] vectorize the curent hash implementation

From:	Michael B. Smith
Subject:	Re: [Tinycc-devel] vectorize the curent hash implementation
Date:	Tue, 26 Apr 2016 01:09:17 +0000

When last I looked, tcc didn't do any constant-folding (although perhaps that 
has changed).

So, precomputing (as a constant) (IS_ID|IS_NUM) and another array which, for a 
value of 'c' is the answer to "c - CH_EOF" -- should provide most (if not all) 
of that optimization.

-----Original Message-----
From: Tinycc-devel [mailto:address@hidden On Behalf Of Michael Matz
Sent: Saturday, April 23, 2016 2:46 PM
To: address@hidden
Subject: Re: [Tinycc-devel] vectorize the curent hash implementation

Hi,

On Sat, 23 Apr 2016, Vladimir Vissoultchev wrote:

> Rolling hashes have a nice property -- associativity -- i.e. 
> hash(concat(a,
> b)) = [hash(a) * TOK_HASH_PRIME^len(b) + hash(b)] in GF(TOK_HASH_SIZE)
>
> What I tried based on this was to pre-hash all of file->buffer and 
> calc hash on current (pos, len) with no loop like this:
>
>    h = (ph[len] - (ph[0] - TOK_HASH_INIT) * hash_prime_powers[len]) & 
> (TOK_HASH_SIZE - 1)
>
> ... where ph is pointing to current pos in pre-hash buffer.
>
> Unfortunately complete file->buffer pre-hashing is slow, I'm currently 
> working on impl pre-hash part in SSE2 but it looks like it's going to 
> be too slow again as there is no performant way to calculate all 
> intermediate hashes.

The hashing of identifiers (when tcc is compiled with GCC on -O0) in
next_nomacro1 itself takes only 2.3 % of the overall cycle estimate (valgrind 
cachegrind measurement of compiling tcc with tcc itself).  The stepping forward 
of the characters itself takes more than the hashing (namely 3.3%).  So any 
optimization of the hashing needs to be _extremely_ efficient to lead to any 
measurable improvement at all.  I.e. this loop:

3.3%        while (c = *++p, isidnum_table[c - CH_EOF] & (IS_ID|IS_NUM))
2.3%            h = TOK_HASH_FUNC(h, c);

Even optimizing this loop to run twice as fast (which would be quite an
achievement) will make tcc run only 2.8% faster (nothing to sneeze at of 
course, I just wanted to mention that it's not easy to improve the speed by an 
fantastic amount).

Ciao,
Michael.

_______________________________________________
Tinycc-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

[Prev in Thread]

Current Thread

[Next in Thread]

[Tinycc-devel] vectorize the curent hash implementation, Sergey Korshunoff, 2016/04/23
- Re: [Tinycc-devel] vectorize the curent hash implementation, Vladimir Vissoultchev, 2016/04/23
  - Re: [Tinycc-devel] vectorize the curent hash implementation, KHMan, 2016/04/23
    - Re: [Tinycc-devel] vectorize the curent hash implementation, Vladimir Vissoultchev, 2016/04/23
    - Re: [Tinycc-devel] vectorize the curent hash implementation, KHMan, 2016/04/23
    - Re: [Tinycc-devel] vectorize the curent hash implementation, Michael Matz, 2016/04/23
    - Re: [Tinycc-devel] vectorize the curent hash implementation, Michael B. Smith <=

Prev by Date: Re: [Tinycc-devel] gcc/tcc nested macros difference
Next by Date: [Tinycc-devel] X(X(1)) (nested macro substitution)
Previous by thread: Re: [Tinycc-devel] vectorize the curent hash implementation
Next by thread: [Tinycc-devel] -fno-hex-floats option
Index(es):
- Date
- Thread