Re: [Sks-devel] Clocks, timers, PTree and wiki advice


From: Jeffrey Johnson
Subject: Re: [Sks-devel] Clocks, timers, PTree and wiki advice
Date: Sat, 24 Mar 2012 21:08:16 -0400

On Mar 24, 2012, at 6:26 PM, Phil Pennock wrote:

> Okay, we appear to have two people for whom the TSC clocksource in the
> kernel fixes PTree corruption, so I've currently got this in the wiki:
> 
>  Virtual Machine issues
> 
>  There are some issues with clock-keeping mechanisms in some virtual
>  machines (VMs) affecting the Berkeley DB used for PTrees; if the clock
>  resolution is too low, multiple entries occur at the same timestamp
>  and the DB becomes corrupted.
> 
>  If you are running Linux inside a VM, then pass clocksource=tsc as
>  part of the kernel command-line in Grub/Lilo/... to switch the VM's
>  timer-system away from Jiffies towards the Time Stamp Counter of the
>  processors. Note that this can itself be problematic with older
>  kernels locking up on SMP instances. If running SKS in a VM instance,
>  you should probably constrain it to a single CPU.
> 
> http://code.google.com/p/sks-keyserver/wiki/Peering  (er, kernel
> configuration in a page about peering ... ah well).
> 
> Does this make sense to the folks who've encountered and fixed this
> problem?  Is it accurate?
> 
> I wonder whether people setting up or buying VMs tend to have a VM per
> job they run, or be buying a "personal toy" VM with a whole bunch of
> different things running, so that the single CPU constraint might be an
> issue.  I also have usually run systems on bare metal, rather than in
> VMs, so this is beyond my expertise.  Folks?
> 

Ick.

I question the analysis (but not what is observed).

For portability, all of the OS vectors that Berkeley DB needs can be overridden.
E.g. here is an override of file open used in RPM:
        xx = db_env_set_func_open((int (*)(const char *, int, ...))Open);
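
To illustrate the override pattern (this is a toy sketch, not Berkeley DB
code): the names toy_set_func_open, toy_open and counting_open below are
hypothetical, invented only for this example. The idea is that the library
calls the OS through a replaceable function pointer, so the application can
substitute its own implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Signature of a replaceable "open" vector (hypothetical, for illustration). */
typedef int (*open_fn)(const char *path, int flags, int mode);

static open_fn toy_open_hook = NULL;  /* NULL means "use the OS default" */
static int toy_open_calls = 0;        /* counts calls made through the override */

/* Install an override; plays the role that db_env_set_func_open plays above. */
static int toy_set_func_open(open_fn fn)
{
    toy_open_hook = fn;
    return 0;
}

/* What the "library" calls internally instead of calling open(2) directly. */
static int toy_open(const char *path, int flags, int mode)
{
    assert(path != NULL);
    if (toy_open_hook != NULL)
        return toy_open_hook(path, flags, mode);
    return open(path, flags, mode);
}

/* An application-supplied replacement that counts calls, then defers to open(2). */
static int counting_open(const char *path, int flags, int mode)
{
    toy_open_calls++;
    return open(path, flags, mode);
}
```

After toy_set_func_open(counting_open), every toy_open() call goes through
counting_open. A library that depended on OS time in the same way would be
expected to expose a similar hook for it.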

There are no overrides to retrieve time info from the OS listed here:
        http://docs.oracle.com/cd/E17076_02/html/api_reference/C/frame_main.html
This is about what I would expect: a database implementation doesn't depend
critically on time, only serialization.

I have not yet heard a plausible mechanism by which VM time jitter gets
coupled into Berkeley DB "corruption" (but I haven't been paying strict
attention so far).
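
For anyone who wants to test the "multiple entries at the same timestamp"
idea on their own VM, here is a minimal sketch in C, assuming Linux and the
POSIX clock functions. The helper names are made up for this example; the
sysfs path is the usual Linux location for the active clocksource. It only
measures how often consecutive clock readings collide; it says nothing about
whether Berkeley DB cares.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Resolution the kernel reports for CLOCK_REALTIME, in ns, or -1 on error. */
static long realtime_resolution_ns(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_REALTIME, &res) != 0)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}

/* Take n back-to-back readings; count how many equal the previous reading. */
static long count_duplicate_readings(long n)
{
    struct timespec prev, cur;
    long dups = 0;

    assert(n > 0);
    clock_gettime(CLOCK_REALTIME, &prev);
    for (long i = 1; i < n; i++) {
        clock_gettime(CLOCK_REALTIME, &cur);
        if (cur.tv_sec == prev.tv_sec && cur.tv_nsec == prev.tv_nsec)
            dups++;
        prev = cur;
    }
    return dups;
}

/* Copy the active clocksource name (e.g. "tsc") into buf; 0 on success. */
static int read_clocksource(char *buf, size_t len)
{
    FILE *f = fopen(
        "/sys/devices/system/clocksource/clocksource0/current_clocksource",
        "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';  /* strip trailing newline */
    return 0;
}
```

Calling count_duplicate_readings(1000000) under each clocksource and
comparing the counts would show whether the coarse-timer claim holds on a
given VM.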

The text you have written asserts a causal relation to "corruption" without
demonstrating one:
        … and the DB becomes corrupted. Q.E.D.

I have run multiple SKS key servers within a VM on RHEL6 for several
years without seeing an issue with "corruption". But I also always run
ntpd.

Again I'm only questioning the analysis, not what others have actually seen,
or what seemed to fix the problem.

73 de Jeff


> -Phil
> 
> _______________________________________________
> Sks-devel mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/sks-devel
