Re: [Groff] caching result of charinfo::get_flags
From: Werner LEMBERG
Subject: Re: [Groff] caching result of charinfo::get_flags
Date: Tue, 21 Dec 2010 08:35:57 +0100 (CET)
> > Maybe the look-up algorithm of `get_flags' (without caching) could
> > also be optimized. IIUC currently it does not sort/merge the
> > ranges and check all of them linearly.
>
> Certainly, but for the current state, I think this isn't necessary.
> It might be worth looking at eventually, since everything which
> makes GNU troff faster is good...
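
A minimal sketch of the sort/merge idea mentioned in the quoted text, not taken from groff's sources: the ranges are normalized once into sorted, disjoint intervals, and each look-up then becomes a binary search instead of a linear scan.  The names `flag_range', `normalize' and `lookup_flags' are hypothetical, and the sketch assumes that the flags of overlapping ranges are combined with bitwise OR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a flagged code range; not groff's actual type.
struct flag_range {
  int lo;          // first code covered (inclusive)
  int hi;          // last code covered (inclusive)
  unsigned flags;  // flag bits that apply to every code in [lo, hi]
};

// Turn possibly overlapping ranges into sorted, disjoint ranges.
// Assumption: flags of overlapping ranges are combined with OR.
std::vector<flag_range> normalize(const std::vector<flag_range> &in)
{
  // Collect every position where the set of covering ranges can change.
  std::vector<long> cuts;
  for (const flag_range &r : in) {
    assert(r.lo <= r.hi);
    cuts.push_back(r.lo);
    cuts.push_back(static_cast<long>(r.hi) + 1);
  }
  std::sort(cuts.begin(), cuts.end());
  cuts.erase(std::unique(cuts.begin(), cuts.end()), cuts.end());

  std::vector<flag_range> out;
  for (std::size_t i = 0; i + 1 < cuts.size(); i++) {
    int lo = static_cast<int>(cuts[i]);
    int hi = static_cast<int>(cuts[i + 1] - 1);
    // This linear pass runs once per elementary interval, at set-up time.
    unsigned flags = 0;
    for (const flag_range &r : in)
      if (r.lo <= lo && hi <= r.hi)
        flags |= r.flags;
    if (flags == 0)
      continue;  // gap between ranges (or ranges without any flag)
    // Merge with the previous interval if adjacent and identical in flags.
    if (!out.empty() && out.back().hi + 1 == lo && out.back().flags == flags)
      out.back().hi = hi;
    else
      out.push_back({lo, hi, flags});
  }
  return out;
}

// Binary search over the normalized ranges; returns 0 if no range matches.
unsigned lookup_flags(const std::vector<flag_range> &sorted, int code)
{
  // Find the first range whose upper bound is not below `code'.
  auto it = std::lower_bound(sorted.begin(), sorted.end(), code,
                             [](const flag_range &r, int c) { return r.hi < c; });
  if (it != sorted.end() && it->lo <= code)
    return it->flags;
  return 0;
}
```

With this layout the cost of a look-up grows with the logarithm of the number of disjoint ranges; the normalization step has to be repeated whenever a range is added or changed.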
BTW, I've just tried the file `bash.1' version 2.05 from the linuxjm
project (at 217 kByte it is about 30 times larger than `gprof.1'),
and profiling shows a completely different hot spot:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 43.14      0.22     0.22    78453     0.00     0.00  unicode_decompose_ptable::lookup
  5.88      0.25     0.03   695361     0.00     0.00  token::next
  5.88      0.28     0.03     9941     0.00     0.00  file_iterator::fill
  3.92      0.30     0.02   202744     0.00     0.00  tfont::get_width
  3.92      0.32     0.02   108869     0.00     0.00  read_long_escape_name
 ...
Doing the same for the English bash.1 (version 4.1, about 276kByte), I
get this:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 11.11      0.03     0.03   915059     0.00     0.00  token::next
 11.11      0.06     0.03   406191     0.00     0.00  font::get_width
  7.41      0.08     0.02   609951     0.00     0.00  glyph_to_unicode
  7.41      0.10     0.02   154972     0.00     0.00  symbol::symbol
  7.41      0.12     0.02   150763     0.00     0.00  string_iterator::fill
 ...
The timings are from a normal build (-O2).
Bruno, who has worked a lot on groff's Unicode support, already
pointed out in a comment in ptable.cpp that groff's `mythical
Aho-Hopcroft-Ullman hash function' can be improved; see
http://www.haible.de/bruno/hashfunc.html
While of virtually no importance for Latin man pages, non-Latin man
pages (CJK, Russian, Greek, etc.) which contain zillions of \[uXXXX]
entries would benefit a lot...
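
To get a feeling for how a string hash distributes such keys, here is a small sketch that counts how many keys share a table slot.  It is not groff's code: `pjw_hash' is the shift-by-4 textbook hash as I remember it (assumed to be similar to what ptable.cpp uses), FNV-1a is swapped in merely as one well-known alternative, and the key format (four upper-case hex digits), the names `hex_key', `bucket_stats' and `measure', and the table size are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Textbook shift-by-4 string hash (PJW style); an assumption, not a copy
// of groff's hash_string.
std::uint32_t pjw_hash(const std::string &s)
{
  std::uint32_t h = 0;
  for (unsigned char c : s) {
    h = (h << 4) + c;
    std::uint32_t g = h & 0xf0000000u;
    if (g != 0) {
      h ^= g >> 24;  // fold the top nibble back into the low bits
      h ^= g;        // and clear it
    }
  }
  return h;
}

// 32-bit FNV-1a, used here only as a point of comparison.
std::uint32_t fnv1a_hash(const std::string &s)
{
  std::uint32_t h = 2166136261u;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;               // FNV prime
  }
  return h;
}

// Hypothetical key format: the code point as upper-case hex digits.
std::string hex_key(unsigned code)
{
  char buf[16];
  std::snprintf(buf, sizeof buf, "%04X", code);
  return buf;
}

struct bucket_stats {
  std::size_t keys;           // number of keys hashed
  std::size_t used_buckets;   // slots hit by at least one key
  std::size_t longest_chain;  // largest number of keys in one slot
};

// Hash all codes in [first, last] (last must be below UINT_MAX) into
// `table_size' slots and report how evenly they spread.
template <class Hash>
bucket_stats measure(Hash hash, unsigned first, unsigned last,
                     std::size_t table_size)
{
  assert(table_size > 0);
  std::vector<std::size_t> chain(table_size, 0);
  bucket_stats st = {0, 0, 0};
  for (unsigned code = first; code <= last; code++) {
    std::size_t b = hash(hex_key(code)) % table_size;
    if (chain[b]++ == 0)
      st.used_buckets++;
    st.longest_chain = std::max(st.longest_chain, chain[b]);
    st.keys++;
  }
  return st;
}
```

Calling, say, `measure(pjw_hash, 0x4E00, 0x9FFF, 503)' and the same with `fnv1a_hash' gives two sets of numbers to compare; the closer `used_buckets' is to the smaller of the key count and the table size, and the smaller `longest_chain', the fewer collisions a look-up has to work through.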
Werner