On 30.06.2023 01:14, Detlef Riekenberg wrote:
Hi Herman
On 25.06.2023 20:30, Herman ten Brugge via Tinycc-devel wrote:
I just pushed a patch to fix this.
Hi Hermann,
some numbers from Win32:
before:
# 6.334 s, 85768 lines/s, 27.9 MB/s
after first patch:
# 11.825 s, 45941 lines/s, 14.9 MB/s
after second patch:
# 10.406 s, 52206 lines/s, 17.0 MB/s
Hm ...
I do not think that we really need a 64-bit hash (with a 64-bit multiply)
over the complete file content.
Actually, it has nothing to do with the hash yet. Also, #pragma once
is not used at all here. Here is some more data:
# 25401 idents, 4838227 lines, 176764178 bytes (168.6 MB)
# 10.405 s, 52211 lines/s, 17.0 MB/s
# text 4705836, data.rw 3084, data.ro 483724, bss 524940 bytes
# 172 files compiled, 13771 included, 5087 skipped, 43749 not found
# 72308 files stat'ed, 0 hashed
This means tcc compiled 172 files on one command line, each of them
including on average ~110 headers, of which ~30 are skipped by the
include cache mechanism (checking the #ifndef _XXX_H_ around the file).
The result now is that the new stat() is called 72308 times, mostly
failing (due to the include path search). So, at least on Windows,
those stat() calls alone take about the same time as tcc parsing
~169MB of source code, and the cache makes tcc much slower than
no cache at all.
(If you step a bit into what that stat() from msvcr90.dll does, then
it's no surprise really.)
In addition to the filename and the file size, I suggest using
"st_mtime": much cheaper, and available for free.
gcc, at least 3.4.6, checks st_size and st_mtime, and then does a plain
memcmp() over the entire buffers (cppfiles.c:should_stack_file()).
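A minimal sketch of that kind of check, for illustration only. This is
not the gcc code; the struct cached_file and the function same_content()
are hypothetical names, and it assumes both files are already loaded
into memory buffers:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache entry for a header that was already read. */
struct cached_file {
    off_t size;                 /* st_size at the time it was read */
    time_t mtime;               /* st_mtime at the time it was read */
    const unsigned char *buf;   /* file content, 'size' bytes */
};

/* Return 1 if the candidate (described by st and buf) is considered the
   same file as the cached one: cheap checks first, memcmp() last. */
static int same_content(const struct cached_file *c,
                        const struct stat *st,
                        const unsigned char *buf)
{
    if (c->size != st->st_size)
        return 0;               /* different size: cannot be identical */
    if (c->mtime != st->st_mtime)
        return 0;               /* different timestamp: treat as different */
    return memcmp(c->buf, buf, (size_t)c->size) == 0;
}
```

Note that this still needs one stat() per candidate, so it does not
by itself reduce the number of stat() calls counted above.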
BUT: tinycc does have a mission that gcc does not have, which is to be
fast and simple. So I guess it will have to put some restrictions
on the feature, as to what extent it can be supported sensibly.
For example, tinycc could require at least identical basenames (as in
the reported case). That would drastically reduce the number of
possible candidates and would still work for all purposes of
#pragma once, except when 'b.h' is a link to 'a.h' and both are used
in the same translation unit.
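Roughly, such a basename pre-filter could look like the sketch below.
The helper names base_name() and may_be_same_file() are made up for
this mail and are not existing tcc functions; the comparison is
case-sensitive, which is an assumption that may not fit Windows:

```c
#include <string.h>

/* Return a pointer to the last path component of 'path'.
   Both '/' and '\\' are accepted as separators. */
static const char *base_name(const char *path)
{
    const char *p = path + strlen(path);
    while (p > path && p[-1] != '/' && p[-1] != '\\')
        --p;
    return p;
}

/* Cheap pre-filter: only paths with equal basenames are candidates
   for being the same file. No stat() or file access is needed here. */
static int may_be_same_file(const char *a, const char *b)
{
    return strcmp(base_name(a), base_name(b)) == 0;
}
```

Only candidates that pass this string comparison would then need the
more expensive checks (stat(), size, content).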