
Re: [Freeipmi-devel] watchdog problems on Sun X4100M2


From: Al Chu
Subject: Re: [Freeipmi-devel] watchdog problems on Sun X4100M2
Date: Tue, 01 Jun 2010 09:59:29 -0700

Hi Frank,

This sounds familiar.  There was a thread about it not too long ago:

http://lists.gnu.org/archive/html/freeipmi-users/2010-05/msg00017.html

Not sure if it's the same ILOM, but they are both Sun machines, so I bet
the firmware is broken in similar ways.  As I said in the previous post:

> I think ignoring the flag is a bad idea (it would get around it).
> But it would probably solve the problem.
> 
> Perhaps I can edit the code to do some workaround check to see if the
> countdown is changing.  If it is, assume the timer is running regardless
> of what the flag says??  I'm willing to give it a shot if you're
> interested.

The same applies here.  I can give a patch a shot, and you can try it
out?  It definitely is a bug.  Unlike other workarounds, I'm just a little
scared of implementing this one.  After all, it's a daemon that, if
implemented incorrectly, can lead to a machine rebooting :-)
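
To make the idea concrete, here is a rough sketch of the check (not the
actual patch, and in Python rather than the daemon's C).  It assumes two
`bmc-watchdog -g` outputs taken a few seconds apart, in the "Key: Value"
layout shown in this thread.  The function names are made up for
illustration; the sample values are the ones Frank posted.

```python
import re


def parse_watchdog_output(text):
    """Turn 'Key: Value' lines (as printed by `bmc-watchdog -g`) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def countdown_seconds(fields):
    """Extract the integer from a 'Current Countdown: N seconds' field."""
    match = re.match(r"(\d+)\s+seconds", fields["Current Countdown"])
    if match is None:
        raise ValueError("unexpected countdown format")
    return int(match.group(1))


def timer_effectively_running(before, after):
    """Hypothetical workaround check: trust the flag when it says Running,
    otherwise treat a shrinking countdown as evidence the timer is running."""
    if after.get("Timer") == "Running":
        return True
    return countdown_seconds(after) < countdown_seconds(before)


# Abbreviated from the two outputs in Frank's report.
SAMPLE_BEFORE = """Timer Use:                   SMS/OS
Timer:                       Stopped
Initial Countdown:           900 seconds
Current Countdown:           900 seconds
"""

SAMPLE_AFTER = """Timer Use:                   SMS/OS
Timer:                       Stopped
Initial Countdown:           900 seconds
Current Countdown:           896 seconds
"""

before = parse_watchdog_output(SAMPLE_BEFORE)
after = parse_watchdog_output(SAMPLE_AFTER)
print(timer_effectively_running(before, after))
```

A real implementation would have to make sure nothing resets the timer
between the two samples, and wait long enough for the countdown to
visibly change, which is part of why I'm nervous about it.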

Al

On Tue, 2010-06-01 at 05:15 -0700, Frank Steiner wrote:
> Hi,
> 
> I'm successfully running the bmc-watchdog on our Sun x2200M2 machines.
> But on the X4100M2 I see a strange problem: The timer starts counting, but
> claims to be stopped:
> 
> 
> sunserver2 /root# bmc-watchdog -g
> Timer Use:                   SMS/OS
> Timer:                       Stopped
> Logging:                     Enabled
> Timeout Action:              Power Cycle
> Pre-Timeout Interrupt:       None
> Pre-Timeout Interval:        0 seconds
> Timer Use BIOS FRB2 Flag:    Set
> Timer Use BIOS POST Flag:    Set
> Timer Use BIOS OS Load Flag: Set
> Timer Use BIOS SMS/OS Flag:  Set
> Timer Use BIOS OEM Flag:     Set
> Initial Countdown:           900 seconds
> Current Countdown:           900 seconds
> 
> 
> sunserver2 /root# /usr/sbin/bmc-watchdog -d -u 4 -p 0 -a 3 -F -P -L -S -O -i 900 -e 540
> sunserver2 /root# bmc-watchdog -g
> Timer Use:                   SMS/OS
> Timer:                       Stopped
> Logging:                     Enabled
> Timeout Action:              Power Cycle
> Pre-Timeout Interrupt:       None
> Pre-Timeout Interval:        0 seconds
> Timer Use BIOS FRB2 Flag:    Set
> Timer Use BIOS POST Flag:    Set
> Timer Use BIOS OS Load Flag: Set
> Timer Use BIOS SMS/OS Flag:  Set
> Timer Use BIOS OEM Flag:     Set
> Initial Countdown:           900 seconds
> Current Countdown:           896 seconds
> 
> And it continues to count down.
> 
> The problem is that I cannot use the watchdog as a daemon because, according
> to strace, it exits when it sees the timer field is still "Stopped".
> 
> Is anything known about this? I didn't find a hint in the workarounds.
> The watchdog daemon is disabled in the BIOS to avoid it resetting when
> e.g. a file system check takes some time. But the same is true on the
> x2200M2, and those correctly set the field to "Running" after the above call.
> 
> The ILOM on the 4100M2 is up-to-date (just flashed today).
> 
> cu,
> Frank
> 
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory



