Re: [Sks-devel] Dumps/importing & de-peering (WAS: Re: SKS apocalypse mitigation)


From: Andrew Gallagher
Subject: Re: [Sks-devel] Dumps/importing & de-peering (WAS: Re: SKS apocalypse mitigation)
Date: Mon, 21 May 2018 19:27:05 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 05/05/18 17:28, brent s. wrote:
> 
>>> but the problem for
>>> (b) is the "standard place" - SKS/recon/HKP/peering is, by nature,
>>> unfederated/decentralized. sure, there's the SKS pool, but that
>>> certainly isn't required for peering (even with keyservers that ARE in
>>> the pool) nor running sks. how does one decide the "canonical" dump to
>>> be downloaded in (b)?
>>
>> There can be no canonical dump of course. Each peer can provide its own dump 
>> at a well known local URL. This is even more important if and when we allow 
>> divergent policy. 
> 
> hrm. i suppose, but i'm under the impression not many keyserver admins
> run their own dumps? (which i don't fault them for; the current dump i
> have in its uncompressed form is 11 GB (5054255 keys). granted, you
> don't see new keyserver turnups often, but still -- that can be a
> lengthy download, plus the fairly sizeable chunk of time it takes for
> the initial import.)

Right.

I've thought about this a bit more, and the bootstrapping issue can be
solved without requiring every keyserver to produce a unique dump. We
just need one more database [table]...!

Let us call it Limbo. It contains the hashes of objects that the local
server does not have and has never seen (so has never had the chance to
test against policy), but knows must exist because they were in another
server's blacklist.
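
Sticking with the SQL-style pseudocode used further down, a minimal
Limbo table might look something like this (table and column names are
purely illustrative, not an existing SKS schema):

   CREATE TABLE limbo (
       hash        CHAR(32) PRIMARY KEY, -- hash of the object we know must exist
       first_seen  TIMESTAMP,            -- when we first learned of it
       source_peer TEXT                  -- whose blacklist/dump told us about it
   );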

When bootstrapping, all that the new server needs to know is a
reasonably complete list of hashes. If it knows the real data as well,
all the better. But for recon to get started, given that we can perform
fake recon, the hashes are sufficient.

When performing a dump, a reference server also dumps its local
blacklist. When loading that dump, the blacklist of the reference is
used to populate the fresh server's Limbo. Now, the fresh server can
generate a low-delta fake recon immediately, by merging the DB, Local-BL
(initially empty) and Limbo hash lists. Recon then proceeds as discussed
before, and so long as the peer graph is well-connected, new peers can
be added without having to reference their dumps.
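
In the same notation, the hash list advertised during fake recon would
simply be the union of the three sets (again, table names are
illustrative):

   SELECT hash FROM keydb      -- keys we actually hold
   UNION
   SELECT hash FROM local_bl   -- the local blacklist (initially empty)
   UNION
   SELECT hash FROM limbo;     -- keys we only know must exist somewhere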

Limbo entries will return 404, just like missing entries (and unlike
blacklist entries). But the server will request a proportion of the
Limbo entries from its peers during each catchup. This would happen at a
much higher rate than the blacklist cache refresh, but still low enough
that its peers shouldn't suffer from the extra load.

Let's say that at each recon, the number of missing keys is found to be
N. The local server will then request these N keys from its peer. If at
the same time it were also to request M = a*N Limbo entries, e.g.:

   SELECT hash FROM limbo
    WHERE hash NOT IN (SELECT hash FROM peer_bl_cache WHERE peer = $PEER)
    LIMIT $M;

then the extra load on the peer should not be excessive, and Limbo
should be drained at a rate roughly proportional to the parameter `a`
and the rate of new keys.

(This would also be a good place to perform the peer_bl_cache refresh).
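
For instance, if the peer's current blacklist has just been fetched
into a temporary list as part of the same catchup, the refresh could be
as simple as (hypothetical table names again):

   DELETE FROM peer_bl_cache WHERE peer = $PEER;
   INSERT INTO peer_bl_cache (peer, hash)
       SELECT $PEER, hash FROM fetched_peer_bl;  -- temporary list of the fetched blacklist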

When calculating key deltas for pool membership purposes, the fresh
server should not include its Limbo database in the count. This will
ensure that servers do not get added to the pool until their Limbo is
well drained. Alternatively, we could make an explicitly drained Limbo a
condition for pool membership.
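
In SQL-ish terms (same illustrative table names), the key count used
when computing that delta would be just

   SELECT COUNT(hash) FROM keydb;   -- Limbo and blacklists not included

and the stricter alternative would be to require

   SELECT COUNT(hash) FROM limbo;   -- must be zero (or near zero)

before a server is admitted to the pool.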

This still leaves the issue of eventual consistency as an open problem,
but it can be addressed manually by encouraging good graph connectivity.

-- 
Andrew Gallagher


