freeipmi-users

Re: [Freeipmi-users] problems with bmc-watchdog


From: Dave Love
Subject: Re: [Freeipmi-users] problems with bmc-watchdog
Date: Thu, 06 May 2010 00:00:31 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/21.4 (gnu/linux)

Al Chu <address@hidden> writes:

> Let's try some tests.  Could you run bmc-watchdog "by hand" to make sure
> things look like it's working right?  "by hand", I mean something like
> run:
>
> bmc-watchdog --get (see what the current watchdog settings are)
> bmc-watchdog --set ... (with the same options as the daemon, except not
> the reset interval '-e 60')
> bmc-watchdog --get (see that things are set)
> bmc-watchdog --start
> bmc-watchdog --get (make sure things changed, timer is running)
> bmc-watchdog --get (make sure timer is counting down)
> bmc-watchdog --reset
> bmc-watchdog --get (make sure timer has reset)
>
> (and you probably want to do bmc-watchdog --stop at the end)

I should have said that what puzzled me was when it reports "Stopped".
This is an RH5, Sun ILOM 2 system (not ELOM as I thinko'd before).

  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Set
  Timer Use BIOS POST Flag:    Set
  Timer Use BIOS OS Load Flag: Set
  Timer Use BIOS SMS/OS Flag:  Set
  Timer Use BIOS OEM Flag:     Set
  Initial Countdown:           900 seconds
  Current Countdown:           900 seconds
  # bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
  # bmc-watchdog --get 
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Set
  Timer Use BIOS POST Flag:    Set
  Timer Use BIOS OS Load Flag: Set
  Timer Use BIOS SMS/OS Flag:  Set
  Timer Use BIOS OEM Flag:     Set
  Initial Countdown:           900 seconds
  Current Countdown:           900 seconds
  # bmc-watchdog --start
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           900 seconds
  # sleep 2
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           898 seconds
  # bmc-watchdog --reset
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           900 seconds
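
To make the oddity in the ILOM transcript above easier to spot mechanically, here is a minimal Python sketch that parses `bmc-watchdog --get` output and flags the case where the timer reports "Stopped" while the countdown is nonetheless decreasing. The function names are hypothetical and not part of FreeIPMI; the sample text is abbreviated from the transcript.

```python
def parse_watchdog_get(text):
    """Parse 'Key: value' lines from `bmc-watchdog --get` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def countdown_seconds(fields):
    """Return 'Current Countdown' as an int, e.g. '898 seconds' -> 898."""
    return int(fields["Current Countdown"].split()[0])


def state_is_inconsistent(earlier, later):
    """True if the timer reports 'Stopped' yet the countdown decreased."""
    return (
        later["Timer"] == "Stopped"
        and countdown_seconds(later) < countdown_seconds(earlier)
    )


# Abbreviated samples from the ILOM transcript (after --start, after sleep 2).
after_start = parse_watchdog_get(
    "Timer:                       Stopped\n"
    "Current Countdown:           900 seconds\n"
)
after_sleep = parse_watchdog_get(
    "Timer:                       Stopped\n"
    "Current Countdown:           898 seconds\n"
)
print(state_is_inconsistent(after_start, after_sleep))  # prints True
```

On the ILOM readings this prints True, which matches what puzzled me: the state field says Stopped while the counter is clearly running.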
  
> This can help us isolate things.  If the above works, then maybe there
> is a timing issue within your BMC that we need to get around.  I'm a
> little perplexed as to why it would work with the openipmi driver.  It's
> possible it's more generous on some timeouts of packets and such.  Or
> maybe the openipmi driver's own watchdog implementation/code has done
> something to massage the BMC that I'm unaware of.

I probably wasn't clear.  What I meant was:

  # bmc-watchdog -g --config-file /dev/null
  ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
  ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
  ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
  ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
  ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
  ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
  bmc-watchdog: Get Watchdog Timer Error: BMC Busy

in contrast to:

  # bmc-watchdog -g --config-file /dev/null -D OPENIPMI|head -1
  Timer Use:                   SMS/OS
  ...

and

  # bmc-info --config-file /dev/null
  Device ID             : 32
  ...
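
The driver comparison above could be scripted roughly as follows. This is only a sketch: it assumes that a successful run prints a "Timer Use:" line and that a failure mentions "BMC Busy" somewhere in its output, as in the transcripts. `run_get`, `classify` and `compare_drivers` are hypothetical helper names, and the demonstration uses a fake runner with text copied from the transcripts rather than calling the real tool.

```python
import subprocess


def run_get(driver=None):
    """Run `bmc-watchdog -g`, optionally forcing a driver with -D."""
    cmd = ["bmc-watchdog", "-g", "--config-file", "/dev/null"]
    if driver:
        cmd += ["-D", driver]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr


def classify(stdout, stderr):
    """Label one run as 'ok', 'busy' or 'error' from its output text."""
    if "Timer Use:" in stdout:
        return "ok"
    if "bmc busy" in (stdout + stderr).lower():
        return "busy"
    return "error"


def compare_drivers(drivers, runner=run_get):
    """Map each driver name (None = default driver) to its classification."""
    return {d: classify(*runner(d)) for d in drivers}


def fake_runner(driver):
    """Stand-in for run_get that replays the outputs seen on the ILOM."""
    if driver == "OPENIPMI":
        return "Timer Use:                   SMS/OS\n", ""
    return "", "bmc-watchdog: Get Watchdog Timer Error: BMC Busy\n"


print(compare_drivers([None, "OPENIPMI"], runner=fake_runner))
```

On a real host, `compare_drivers([None, "OPENIPMI"])` without the `runner` argument would invoke bmc-watchdog itself (as root).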

Actually now it's obvious there's something wrong with the ILOM, thanks.
I've now tried on an x2200M2 with ELOM with the results below (and I
don't have to specify the openipmi driver).  I guess I won't get
anywhere with a service request on this -- especially as I'm only doing
it because Sun couldn't fix the hangups on the Thumper -- but perhaps
you have a simple idea for a fix?

  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           0 seconds
  # bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
  # bmc-watchdog --get 
  Timer Use:                   SMS/OS
  Timer:                       Stopped
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           0 seconds
  # bmc-watchdog --start
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Running
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           900 seconds
  # sleep 2
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Running
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           898 seconds
  # bmc-watchdog --reset
  # bmc-watchdog --get
  Timer Use:                   SMS/OS
  Timer:                       Running
  Logging:                     Enabled
  Timeout Action:              Hard Reset
  Pre-Timeout Interrupt:       None
  Pre-Timeout Interval:        0 seconds
  Timer Use BIOS FRB2 Flag:    Clear
  Timer Use BIOS POST Flag:    Clear
  Timer Use BIOS OS Load Flag: Clear
  Timer Use BIOS SMS/OS Flag:  Clear
  Timer Use BIOS OEM Flag:     Clear
  Initial Countdown:           900 seconds
  Current Countdown:           899 seconds
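
Note that the last reading shows 899 rather than 900 seconds, presumably just the time elapsed between `--reset` and `--get`. A small sketch of a check that tolerates this follows; the `slack` parameter and the function names are my own assumptions, not anything bmc-watchdog provides.

```python
import re

COUNTDOWN_RE = re.compile(
    r"^\s*(Initial|Current) Countdown:\s+(\d+) seconds", re.M
)


def countdowns(text):
    """Return {'Initial': n, 'Current': m} from `bmc-watchdog --get` output."""
    return {name: int(secs) for name, secs in COUNTDOWN_RE.findall(text)}


def reset_took_effect(before, after, slack=2):
    """True if the countdown rose again and is within `slack` seconds
    of the initial value after a --reset."""
    b, a = countdowns(before), countdowns(after)
    return a["Current"] > b["Current"] and a["Initial"] - a["Current"] <= slack


# Abbreviated samples from the ELOM transcript above.
before_reset = (
    "  Initial Countdown:           900 seconds\n"
    "  Current Countdown:           898 seconds\n"
)
after_reset = (
    "  Initial Countdown:           900 seconds\n"
    "  Current Countdown:           899 seconds\n"
)
print(reset_took_effect(before_reset, after_reset))  # prints True
```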



