Re: [Monotone-devel] mtndumb & public cert_id


From: Zbigniew Zagórski
Subject: Re: [Monotone-devel] mtndumb & public cert_id
Date: Wed, 17 Dec 2008 21:22:31 +0100

2008/12/17 Thomas Keller <address@hidden>:
> Zbigniew Zagórski wrote:...
>>> And you're
>>> basically looking for something which is faster than packets_for_certs
>>> REVID which only takes revisions, but not selectors, and which outputs
>>> more things than you actually need, correct?
>>
>> No, in fact I'm looking for "get all certs" and "give me specific cert" ...
>
> So, you're actually triggering `mtn automate select_cert '*'` then, right?

Only because the "get all at once" strategy beat "get it when it's needed"
on speed. My original design was:

foreach REV:
   certs = automate.select_certs(REV)
   for each cert:
       ...

But it was replaced with a greedy "get everything into a multimap" step,
and then later only the map is used. It's always like that: a pretty and
simple design has to be messed up because of performance issues.
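
To make the difference in call patterns concrete, here is a minimal sketch.
FakeAutomate is a hypothetical stand-in for an automate stdio connection
that only counts round trips, and select_certs here models the command
proposed in this thread, not an existing monotone interface; the tuple
layout it returns is an assumption made for illustration.

```python
from collections import defaultdict


class FakeAutomate:
    """Hypothetical stand-in for an automate connection; counts round trips."""

    def __init__(self, certs_by_rev):
        self._certs = certs_by_rev
        self.calls = 0

    def select_certs(self, selector):
        # Assumed shape: a list of (rev_id, cert_name, cert_value) tuples.
        self.calls += 1
        if selector == "*":
            return [(rev, n, v) for rev, cs in self._certs.items() for n, v in cs]
        return [(selector, n, v) for n, v in self._certs.get(selector, [])]


def certs_per_revision(automate, revs):
    """Original design: one automate call per revision."""
    return {rev: [(n, v) for _, n, v in automate.select_certs(rev)] for rev in revs}


def certs_bulk(automate, revs):
    """Greedy design: one call, grouped into a multimap keyed by revision."""
    multimap = defaultdict(list)
    for rev, name, value in automate.select_certs("*"):
        multimap[rev].append((name, value))
    return {rev: multimap.get(rev, []) for rev in revs}


data = {
    "rev1": [("author", "zbigg"), ("branch", "main")],
    "rev2": [("author", "zbigg")],
}
revs = list(data)
slow, fast = FakeAutomate(data), FakeAutomate(data)

# Both strategies yield the same mapping; only the number of calls differs.
assert certs_per_revision(slow, revs) == certs_bulk(fast, revs)
print(slow.calls, fast.calls)  # 2 1
```

With N revisions the first variant costs N round trips and the second
costs one, which is the whole point of the greedy version.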

>>> ...
>>> revision ID, to a selector. Now you then still get all cert packets in
>>> return, not just the IDs which you need for the merkle tree, but is this
>>> really such a huge speed penalty?
>>
>> Well, after thinking a little bit I need to clarify. The biggest
>> penalty is the cost of thousands of automate calls. It's rather big when
>> it comes to thousands (or tens of thousands) of invocations. And it's
>> kind of a waste when you _always_ want all of them and only the id, not the packet.
>
> This should have gotten a bit better in 0.42 since Timothy fixed an
> issue with stdio which could not reuse the opened database instance for
> every call.

I haven't checked it against trunk, but thanks for pointing this out: I'll check.
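
For readers who have not used automate stdio: commands are sent to one
long-lived mtn process, so the per-call cost is framing plus a database
lookup rather than process startup. A minimal sketch of the command
framing as I understand it from the monotone documentation ('l', then
<size>:<data> for each argument, then 'e'); encode_stdio_command is a
hypothetical helper name, and select_certs is the command proposed in
this thread, not one that exists in a release.

```python
def encode_stdio_command(*args):
    """Frame one automate stdio command: 'l', <size>:<data> per argument, 'e'."""
    parts = []
    for arg in args:
        data = arg.encode("utf-8")
        # The size counts bytes, not characters.
        parts.append(str(len(data)).encode("ascii") + b":" + data)
    return b"l" + b"".join(parts) + b"e"


print(encode_stdio_command("leaves"))  # b'l6:leavese'

# The proposed bulk query from this thread would be framed the same way.
bulk_query = encode_stdio_command("select_certs", "*")
```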

>> ...
>> Regarding timing:
>>
>> For my private database (~1200 revs):
>>    old approach      15s    (~1200 automate calls)
>>    new approach      2s     (~15 automate calls)
>>
>> For net.venge.monotone db (14348 revs):
>>    old approach      180s   (~14400 automate calls)
>>    new approach      10s    (~70 automate calls)
...
>
> Wow, ouch, but you're using automate stdio, right?

Yes, I'm always politically correct. ;)

[Can you imagine making 14000 process invocations in 180 s
on win32? ;) ]
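
Working the quoted timings through gives a rough per-call cost. This is
only arithmetic on the approximate figures above; the call counts are the
rounded ones from the table, and ms_per_call is a helper made up for
this example.

```python
# (seconds, automate calls) for each approach, taken from the quoted timings.
timings = {
    "private db (~1200 revs)": {"old": (15, 1200), "new": (2, 15)},
    "net.venge.monotone (14348 revs)": {"old": (180, 14400), "new": (10, 70)},
}


def ms_per_call(seconds, calls):
    """Average wall-clock milliseconds spent per automate call."""
    return 1000.0 * seconds / calls


for name, t in timings.items():
    old_s, old_calls = t["old"]
    new_s, _new_calls = t["new"]
    print(
        f"{name}: old ~{ms_per_call(old_s, old_calls):.1f} ms/call, "
        f"overall speedup ~{old_s / new_s:.1f}x"
    )
```

Both databases come out at about 12.5 ms per call with the old approach,
which suggests the cost is dominated by per-call overhead rather than by
the amount of data returned.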

>> All I want is a ~constant number of automate calls that
>> return bulk data from which I can safely build the merkle tree.
>>
>> With the old approach I need 2+N calls (two to get the revisions and
>> toposort them, N to get all certs for each revision).
>>
>> With the new one I have 3 calls (all revs, toposort, all certs).
>
> Understood. I've just asked here a bit more because I think what you
> (and maybe others) are really looking for is an sql-like interface to
> some of the internal data structures. Sure, one could use the 'unstable'
> mtn execute "interface", but I wonder if we should wrap something around
> this and provide access to the mtn internals for exactly these
> high-performance use cases.

Well. I could happily use mtn db execute to get all certs. No problem for
me (not that I would be proud of it ;) ...
However, I don't have a good replacement for 'get_cert_for_packet
 CERTID'. No one in their right mind would want to duplicate packet
generation on the client side - that's a real implementation detail.
Since that change was needed anyway, I decided: why not make two good
changes at once and also get rid of the need to use "db execute".

That's how select_certs emerged. Its design is not well thought out; it's
just what was easiest to write with the current infrastructure and my
sparse orientation in it.

> Other than that I'm ok with your patch if you add some documentation and
> tests to them.

Of course, I wanted to do it that way.

>  If you like to see this in 0.42 you probably have to do
> this until the end of this week, otherwise it'll go into 0.43.

In fact, I've had this change sitting in one of my views for about a month
or so, and I delurked because of the next release. I would like to see it there ...

Nevertheless, I don't know if I'll manage (esp. the doc & test part), so
don't wait; it's trivial.

-- 
Zbigniew Zagórski
/ software developer / geek / happy daddy /
