Re: [Freeipmi-users] ipmipower

From: Kevin Fox
Subject: Re: [Freeipmi-users] ipmipower
Date: Tue, 11 Dec 2007 09:33:47 -0800

On Mon, 2007-12-10 at 18:01 -0800, Al Chu wrote:
> Hey Kevin,

Hi Al,

> I've never played with the IBM BMCs before, so I can't be 100% sure.  
> It does seem as though the core issue is an arp issue.  Ipmipower is
> likely losing packets and timing out in sessions.


> When you look at your arp table (/sbin/arp -a), are the entries correct?

Yup. Stuff like:
sc-bmc76 ( at 00:1A:64:11:24:73 [ether] PERM on eth0
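With 192 BMCs, checking every entry by eye gets tedious. Below is a minimal sketch of scripting the PERM check, assuming the Linux net-tools `arp -a` line layout shown above. The function name `parse_arp_line` is hypothetical, and any address you feed it is your own; nothing here comes from the actual cluster.

```python
import re

# Assumed net-tools `arp -a` layout: "host (ip) at mac [ether] PERM on iface".
# The PERM flag is absent for dynamically learned entries.
ARP_LINE = re.compile(
    r"^(?P<host>\S+) \((?P<ip>[^)]*)\) at (?P<mac>[0-9A-Fa-f:]{17}) "
    r"\[(?P<hw>\w+)\]\s+(?P<perm>PERM )?on (?P<iface>\S+)$"
)


def parse_arp_line(line):
    """Return a dict describing one arp entry, or None if the line does not match."""
    m = ARP_LINE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["perm"] = entry["perm"] is not None  # True only for static entries
    entry["mac"] = entry["mac"].lower()        # normalise case for comparison
    return entry
```

Running each line of the arp output through this and flagging entries where `perm` is false, or where `mac` differs from /etc/ethers, would show whether any mapping changed between runs.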

> Are they possibly changing?  At the beginning and after the arp is sent from
> the remote BMC?

Doesn't appear to be changing. Since I fixed the mac addresses
in /etc/ethers and loaded them with arp -f, the BMC should stop talking
to me altogether if its address changed. After the bmc gets the arp
response back from the calling host, it starts talking again on the same
mac address. Which leads me to the "bmc dropping all packets while it
does an arp request" theory.
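One way to test that theory against ping data is to see whether the lost packets cluster into bursts at a regular spacing. This is a rough sketch only, assuming probes go out about once per second with increasing sequence numbers, so that gaps in sequence numbers approximate seconds. The names `loss_bursts` and `burst_spacing` are hypothetical, and the list of received sequence numbers has to be collected separately.

```python
def loss_bursts(received, total):
    """Group missing sequence numbers in range(total) into (start, length) bursts."""
    got = set(received)
    bursts = []
    start = None
    for seq in range(total):
        if seq not in got:
            if start is None:
                start = seq              # first lost packet of a new burst
        elif start is not None:
            bursts.append((start, seq - start))
            start = None
    if start is not None:                # loss ran to the end of the capture
        bursts.append((start, total - start))
    return bursts


def burst_spacing(bursts):
    """Return the distance between the starts of consecutive bursts."""
    starts = [s for s, _ in bursts]
    return [b - a for a, b in zip(starts, starts[1:])]
```

If the spacing comes out near 120 for every BMC, that is consistent with a periodic arp refresh on the BMC side rather than random network loss.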

> I could see a situation where your BMCs are configured with the wrong IP
> and/or MAC address, and thus advertising IP -> MAC address mappings
> incorrectly. So it gets cached incorrectly for some period of time
> (leading to packets sent to the wrong location and subsequent time
> outs), but is later corrected by some other mechanism on your network
> (normal IP traffic).  This is just a guess at this point.

Shouldn't be an issue since the IP->MAC mapping is permanent in the arp
table.

> Perhaps it'd be a good idea to double check your BMC configuration to
> make sure the IP/MAC/etc. are configured correctly.  You can use bmc-config
> --checkout to check on this.

Looks good to me.

> Also, are you using a different IP for the BMC than what you do for the
> normal communication?  Perhaps something could be confused in there too.

Yup. The IP is dedicated to the BMC. The compute interface is on a
different subnet and different hardware.

Thanks for the help,

> Hopefully that's a starting point.  PLMK what you find out.
> Al
> On Mon, 2007-12-10 at 17:44 -0800, Kevin Fox wrote:
> > Having problems with ipmipower. I have a 192 node cluster and am trying
> > to use it with Powerman/ipmipower. Powerman -q is showing a large,
> > random assortment of nodes in the unknown state. It changes each run.
> > 
> > I went to the underlying ipmipower commands and stat in interactive mode
> > comes up with a random set of timed out nodes.
> > 
> > The man page refers to setting fixed mac addrs in the arp tables, and I
> > tried that.
> > 
> > Doing a tcpdump while doing an ipmiping shows the bmc, at roughly every
> > two minutes, makes an arp request towards the machine running ipmiping
> > and drops all packets on the floor while it is waiting for the arp
> > response. So every bmc drops some packets. Also, when starting up
> > ipmiping, it drops some packets while it does an initial arp request.
> > 
> > I think this is what is confusing ipmipower. I've tried lots of
> > different settings for ipmipower but have been unable to find a set of
> > options that come up with a reliable stat. Any ideas what I should set
> > things to? (These nodes are IBM x3550s using an RSA II, if that helps.)
> > 
> > Thanks,
> > Kevin
