
Re: [groff] 28/28: [pdf]: Implement linear bookmark tag search.


From: G. Branden Robinson
Subject: Re: [groff] 28/28: [pdf]: Implement linear bookmark tag search.
Date: Mon, 4 Mar 2024 15:29:13 -0600

Hi Doug, Dave, and Peter,

Thank you for the swift feedback, gentlemen.

At 2024-03-04T11:36:55-0500, Douglas McIlroy wrote:
> > (pdfbookmark, pdf*href-M): Use the new mechanism to record a
> > bookmark tag if `PRINTSTYLE` (a mom(7) macro) is _not_ defined
> 
> This feels backwards to me. I understand pdf.tmac to be a low-level
> macro package that other packages can invoke. It shouldn't have to
> know what packages are going to use it.  Moreover, PRINTSTYLE is a
> lousy mnemonic for a flag that alters a bookkeeping mechanism. It is
> also stylistically out of step with other macro names in the package.

I entirely agree.  It was the best I could do in a pinch since (1) Peter
and I have agreed that I won't make intrusive changes to om.tmac without
prior arrangement; and (2) I'm aware of no other established mechanism
for API versioning or dependency expression among *roff macro packages.

> Aside from this quibble about a name, the proposal looks sound. I'm
> surprised that the time cost is so steep. Is that perhaps because the
> density of bookmarks in the test case is unusually high?

That sounds plausible; I learned from instrumentation while working up
this change that there are ~900 bookmark tags in the 381-page
groff-man-pages.pdf document.

Should be easy to measure other characteristic documents; let me try my
20-trial approach on automake.mom and mom's own documents instead.

[A few minutes later...]

Your hypothesis is consistent with observations.  Surprisingly, for the
mom(7) documents, the linear search appears to bring a (slight)
performance _improvement_.  I did not see that coming.

I'm attaching the shell script and raw output.  The only other
ingredient one needs is a touch of awk, to massage the elapsed time into
a form datamash(1) will accept.

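Values are in seconds.  The columns are the range, mean, and sample
standard deviation; the first row is "before", the second "after".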

$ for n in before after; do \
  awk '/Elapsed/ {time = $NF; sub("0:0", "", time); print time};' \
  mom-$n.log | datamash range 1 mean 1 sstdev 1; done
0.35    2.319   0.12719235329548
0.24    2.304   0.083502284872615
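
For anyone without datamash(1) to hand, the same three statistics can be
had from awk alone.  What follows is only a sketch, not the attached
script: the function names `elapsed_seconds` and `summarize` are made
up for illustration, and it assumes each trial logged a GNU time(1)
"-v" style line such as "Elapsed (wall clock) time (h:mm:ss or m:ss):
0:02.31".  As with the one-liner above, stripping "0:0" is only valid
for runs shorter than ten seconds.

```shell
#!/bin/sh
# Assumption: log lines look like GNU time -v output, e.g.
#   Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.31

# Print the elapsed seconds from each "Elapsed" line on stdin.
# (Hypothetical helper; only valid for times under ten seconds.)
elapsed_seconds() {
  awk '/Elapsed/ { t = $NF; sub("0:0", "", t); print t }'
}

# Print range, mean, and sample standard deviation (tab-separated)
# of the numbers on stdin, one per line.  Needs at least two values.
summarize() {
  awk '
    { x = $1 + 0 }                      # force numeric comparison
    NR == 1 { min = x; max = x }
    { if (x < min) min = x
      if (x > max) max = x
      sum += x; sumsq += x * x }
    END {
      if (NR < 2) exit 1
      mean = sum / NR
      var = (sumsq - NR * mean * mean) / (NR - 1)
      if (var < 0) var = 0              # guard against rounding error
      printf "%.2f\t%.3f\t%.4f\n", max - min, mean, sqrt(var)
    }'
}
```

Usage would be along the lines of
`elapsed_seconds < mom-before.log | summarize`.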

At 2024-03-04T13:23:41-0600, Dave Kemper wrote:
> On 3/4/24, G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
> > The following change could probably use some additional eyeballs.
> > It's one of the things Deri and I disagreed about.
> 
> Is Deri's view archived anywhere?

Yes, it's here.

https://lists.gnu.org/archive/html/groff/2024-02/msg00025.html

At 2024-03-04T14:22:34-0500, Peter Schaffter wrote:
> > The other thing to ask is of Peter: assuming you are among the
> > non-horrified, would you like me to prepare a patch to om.tmac to
> > migrate it to this new `pdf:lookup` macro?
> 
> Assuming the migration in no way interferes with the status quo of
> mom,

That's my intention.  I'll review all the PDFs generated by mom (those
covered in the measurement above), hover my mouse over hyperlinks, and
see if they continue to produce the expected results.

Or I could write a regression test based on the scraping out of device
control commands from -Z output.  This might depend on whether pdfmom(1)
is prepared for that...
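
A rough sketch of what such a test might compare, assuming the device
control commands appear in the intermediate output as lines starting
with "x X ", possibly followed by "+" continuation lines.  The function
names are hypothetical, and how to get pdfmom(1) to emit the
intermediate output in the first place is the open question above; this
only covers the scraping and comparison.

```shell
#!/bin/sh
# Print the device control commands from troff intermediate output on
# stdin: each "x X ..." line plus any "+" continuation lines that
# immediately follow it.  (Hypothetical helper for illustration.)
device_controls() {
  awk '
    /^x X /      { keep = 1; print; next }
    /^\+/ && keep { print; next }
                 { keep = 0 }'
}

# Succeed if the two named files carry identical device control
# commands in the same order; everything else is ignored.
same_device_controls() {
  a=$(device_controls < "$1")
  b=$(device_controls < "$2")
  [ "$a" = "$b" ]
}
```

One would generate the intermediate output before and after the
migration and run `same_device_controls before.out after.out`.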

> yes, please prepare a patch and send it to me.

Will do.

> > If that is done (regardless of who does it), I can also chop out the
> > `ie d PRINTSTYLE` branches from pdf.tmac shown below.
> 
> Which would address Doug's concern about PRINTSTYLE (mom specific
> macro) appearing in pdf.tmac, which should be macroset agnostic.

Agreed.

Regards,
Branden

Attachment: 20times.sh
Description: 20times.sh

Attachment: mom-before.log
Description: mom-before.log

Attachment: mom-after.log
Description: mom-after.log

Attachment: signature.asc
Description: PGP signature

