From: Karsten Hilbert
Subject: Re: [Gnumed-devel] Approaches to maintain clinical data uptime
Date: Mon, 1 May 2006 17:53:53 +0200
User-agent: Mutt/1.5.11+cvs20060403

On Sun, Apr 30, 2006 at 11:24:45PM -0700, Jim Busser wrote:

> >>- primary server... should have 2 hard drives in a hardware RAID 1
> >>array,
> >I would have one disc for the OS and a RAID 1 mirror (2
> >discs) for the data (home dirs and clinical data). Putting
> >the OS on another mirror adds comfort (less downtime).
> 
> One disc for the OS because... db speed would be improved?
For that you'd have to separate the table files and the WAL log
onto separate discs, or even put tables into tablespaces on
different discs. No, it's more convenient to have /home/ and
/my/database/data/ on physically separate storage simply so the
data can be moved around if needed.
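
A rough sketch of the tablespace variant (the mount point, the
tablespace name, and the table are all made up for illustration),
driven from Python with psycopg2:

    import psycopg2  # PostgreSQL driver, assumed to be installed

    # connection parameters are hypothetical - adjust to your setup
    conn = psycopg2.connect("dbname=gnumed user=gm-dbo")
    conn.autocommit = True  # CREATE TABLESPACE cannot run inside a transaction
    cur = conn.cursor()

    # /mnt/disc2/pgdata is a hypothetical mount point on the second disc
    cur.execute("CREATE TABLESPACE fast_space LOCATION '/mnt/disc2/pgdata'")
    # move a (hypothetical) heavily used table onto the second disc
    cur.execute("ALTER TABLE clin.clin_narrative SET TABLESPACE fast_space")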

> >Such would protect you against *immediately* losing your
> >data when one data drive fails... The RAID 1 does not give
> >you *zero* downtime - not even concerning failure of a
> >single drive.
> 
> By "zero downtime" I just meant no "sudden" downtime in the event  
> that one drive fails.
Yes, that's true.

> And therefore as part of a RAID 1 array choosing, when  
> possible, a controller that would continue to run the single drive  
> while issuing alerts as one member of the array is failing (or fails).
Definitely.

> Mine aren't in drive bays, so a satisfactory scenario would include a  
> server array (as above) that could function with one drive failed,  
> with a site practice that includes IT support checking the logs  
> for such alerts,
Such alerts should be mailed/paged/texted to IT support.
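
A minimal cron-able sketch of such an alert in Python (the
addresses and the /proc/mdstat heuristic are assumptions on my
part, not a tested monitoring setup):

    import smtplib
    from email.mime.text import MIMEText

    # a healthy two-disc md array reports "[UU]" in /proc/mdstat,
    # a degraded one "[U_]" or "[_U]" - a deliberately crude check
    with open('/proc/mdstat') as f:
        status = f.read()

    if '[U_]' in status or '[_U]' in status:
        msg = MIMEText(status)
        msg['Subject'] = 'RAID array degraded on primary server'
        msg['From'] = 'root@primary.example.org'  # hypothetical addresses
        msg['To'] = 'it-support@example.org'
        smtplib.SMTP('localhost').send_message(msg)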

> and taking down the server at a time chosen to be  
> the least disruptive.
yes

> >LAN switches are pretty much plugin. Outside connectivity
> >(eg router) should not be mandatory for a working in-house
> >LAN.
> Unless the server is located out-of house or is much-accessed from  
> outside of house (say from a second surgery location, or else from an  
> emergency room where one of the GPs often sees patients who had  
> received care in the surgery).
Ah, true. If so, yes, need backup hardware there.

> >Firewall replacement should be: boot firewall boot CD in
> >another machine with two network cards.
> 
> Here are you using dedicated firewall hardware, or a PC, as the  
> firewall (or does it matter)?
A PC with a bootable Linux/BSD-based firewall CD-ROM, with the
configuration put onto a customized CD-R, a USB stick, a floppy
disc, etc.

> Does "boot CD" mean to upload a config  
> file to the firewall which, in order to work, would depend that the  
> new firewall is interoperable for that config syntax and settings?
No, it would mean booting a CD which brings up a firewall system
on generic hardware - much like KNOPPIX does for a desktop.
Some configuration will have to be read from somewhere,
however - or typed in at boot time.

> >[re backup contingency] My approach would be:

> For someone who knew what they were doing ("IT support"), how long  
> might the above typically take?

Assuming trained staff:

> 2) primary goes down
> 3) GNUmed clients are shut down
> 4) primary is taken offline for maintenance
I would expect 15 minutes at most to have passed.

> 5) secondary is consistency-checked and promoted to be
>    master DB via Slony
Anywhere from 5 to 30 extra minutes.

> 6) GNUmed clients are brought back up
> 7) in the login dialog "GNUmed database on backup" server
>    is selected
10-15 minutes

> 8) a few things may have to be re-entered (we don't support
>   keeping that yet)
Variable; optional in the die-hard case.

> 9) everything is intended to proceed as normal
Steps 1-9 could take anywhere from 30 minutes to 12 hours. I would
plan for 2 hours in the typical case (IOW, be able to easily
bridge 2 hours of outage) and expect/hope for 45-60 minutes
per incident.

> Is it the type of procedure that  
> could be written down
Most definitely. That'd be best practice.

> and followed by a GNUmed office administrator  
> in the event that the "IT help" is not quickly enough available?
If educated on it, practiced, and perhaps talked through it over
the phone, yes.

> What would be the response of the backup server if it would be  
> selected by the GNUmed client without it having been promoted to  
> master DB via Slony? Would the slave DB refuse to respond?
The backup server's database port should be firewalled off
from all machines but the primary server until after it has been
promoted to master. Opening the port would be the last step
in promoting. By that point the primary server's database
port would already have been firewalled off, during the
take-down-for-maintenance step of the procedure. So
there never really is a chance to connect to the wrong host
if the protocol is followed (and followable).
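
A sketch of that ordering in Python (hostnames, the port, and the
slonik script name are placeholders; the actual failover commands
would live in the slonik script):

    import subprocess

    def run(cmd):
        # execute a command, aborting the procedure on any failure
        subprocess.run(cmd, check=True)

    # 1) cut all clients off from the (possibly half-dead) primary first
    run(['ssh', 'primary', 'iptables', '-A', 'INPUT',
         '-p', 'tcp', '--dport', '5432', '-j', 'DROP'])

    # 2) promote the backup; promote.slonik is a hypothetical script
    #    holding the appropriate Slony FAILOVER/MOVE SET commands
    run(['ssh', 'backup', 'slonik', '/etc/slony/promote.slonik'])

    # 3) only as the very last step open the database port on the
    #    backup, by removing the DROP rule that kept it closed
    run(['ssh', 'backup', 'iptables', '-D', 'INPUT',
         '-p', 'tcp', '--dport', '5432', '-j', 'DROP'])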

> >>Would people run their media archive off their primary
> >primary, should be replicated, too, if downtime is critical
> 
> Not clear about the "replication" here. Is it to maintain twin sets  
> of media stored in separate physical locations? Would you invest in  
> some type of "mirroring" media setup, like 2 cd burners each holding  
> a CD for archiving, with a separate cron job backing up to each?
Ah, wait, you mean backup? There are good points to each
method. Preparing the backup on the primary and shipping it
off to another machine for physical archival is probably the
best bet. Running physical archival (CD burning etc.) on the
primary during off-hours is another good approach. I have
taken both approaches with good success.
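
For the ship-it-off variant, a bare-bones nightly sketch in Python
(database name, user, hosts, and paths are all made up):

    import datetime
    import subprocess

    stamp = datetime.date.today().isoformat()
    dumpfile = '/var/backups/gnumed-%s.sql' % stamp

    # prepare the backup on the primary ...
    subprocess.run(['pg_dump', '-U', 'gm-dbo', '-f', dumpfile, 'gnumed'],
                   check=True)
    # ... then ship it to the archival machine, which burns the CDs
    subprocess.run(['scp', dumpfile, 'archive:/srv/gnumed-archive/'],
                   check=True)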

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



