qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up


From: Fam Zheng
Subject: Re: [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up
Date: Mon, 29 Jun 2015 19:03:02 +0800
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, 06/29 11:39, Stefan Hajnoczi wrote:
> On Fri, Jun 26, 2015 at 05:06:01PM +0800, Jason Wang wrote:
> > 
> > 
> > On 06/25/2015 05:18 PM, Stefan Hajnoczi wrote:
> > > e1000_can_receive() checks the link up status register bit.  If the bit
> > > is clear, packets will be queued and the peer may disable receive to
> > > avoid wasting CPU reading packets that cannot be delivered.  The queue
> > > must be flushed once the link comes back up again.
> > >
> > > This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests
> > > and tap networking.  Flushing the queue invokes the async send callback,
> > > which re-enables tap fd read.
> > >
> > > Reported-by: Jonathan Liu <address@hidden>
> > > Signed-off-by: Stefan Hajnoczi <address@hidden>
> > > ---
> > >  hw/net/e1000.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > > index bab8e2a..5c6bcd0 100644
> > > --- a/hw/net/e1000.c
> > > +++ b/hw/net/e1000.c
> > > @@ -185,6 +185,9 @@ e1000_link_up(E1000State *s)
> > >  {
> > >      s->mac_reg[STATUS] |= E1000_STATUS_LU;
> > >      s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
> > > +
> > > +    /* E1000_STATUS_LU is tested by e1000_can_receive() */
> > > +    qemu_flush_queued_packets(qemu_get_queue(s->nic));
> > >  }
> > >  
> > >  static bool
> > 
> > This only solves the issue partially and just for e1000. After checking
> > all can_receive() functions, another card with similar behaviour is vmxnet3.
> 
> No, this bug is specific to e1000 (details on that below in case 2b).
> 
> There are more issues though, so additional patches are welcome.
> 
> > Looking at commit a90a7425cf592a3afeff3eaf32f543b83050ee5c ("tap: Drop
> > tap_can_send") again. The commit disable tap read poll when
> > qemu_net_queue_send() returns zero. Which is usually the following cases:
> > 
> > 1) queue->delivering is 1
> > 2) qemu_can_send_packet() returns zero, which is:
> > 2a) vm_running is false
> > or
> > 2b) can_receive() return zero
> > 3) qemu_net_queue_deliver() returns zero, which is:
> > 3a) nc->receive_disabled is true
> > or
> > 3b) info->receive_raw() or nc->receive->receive() returns zero
> > 
> > This means we should enable tap read poll when one of those conditions
> > is not existed. This patch fixes 2b) only for e1000.
> > 
> > for 1, I'm not quite sure it's a real problem or how to fix.
> 
> This is not a problem.  When the delivery code returns it flushes the
> queue.  This will invoke send callback functions, so tap will re-enable
> read poll.
> 
> > for 2a, we may probably need a vm state change handler to flush the
> > queue, otherwise network will stall after a stop and cont.
> 
> Yes, I'm surprised the code isn't doing this already.

It's rare that a .can_receive checks vm_running. Why is it only necessary in
virtio-net, or is it that other NIC models are wrong?

> 
> > for 2b, we can either fixes the card that check link status in its
> > can_receive() or just simply can qemu_flush_queued_packets() in set_link.
> 
> No, this doesn't work because the e1000 link status bit is *not*
> equivalent to nc->link_down.  The link auto-negotiation code changes the
> e1000 bit independently from nc->link_down using a timer.
> 
> The net subsystem cannot know when this e1000-specific bit is supposed
> to change.  Therefore there is no general fix.
> 
> Also, qemu_send_packet() drops packets while nc->link_down is true.
> We're not supposed to queue so it shouldn't be necessary to flush
> packets in set_link at the net subsystem level.
> 
> > 3a and 3b looks ok, since this happen when there's no space for rx,
> > after guest refill, qemu will call qemu_flush_queued_packets() to flush
> > and re-enable tap read poll.
> > 
> > 2a and 2b does not exits before this commit, since tap_send check
> > qemu_can_send_packet() before.
> 
> The delivery and queuing mechanism in QEMU is overcomplicated.  The
> semantics of .can_receive() and .receive() need to be clarified before
> we can do real cleanup though:
> 
> .receive() returning 0 is not quite the same as .can_receive() returning
> false.
> 
> When .receive() returns 0 this means the peer should wait before sending
> more.  In most cases the net queue only needs to hold 1 packet.  But
> there are cases like slirp where QEMU code generates packets and may not
> be able to continue without calling qemu_send_packet() on more packets
> (I'm not sure but I suspect this is the case).
> 
> When .can_receive() returns false it also casues packets to be queued,
> but NIC emulation code often includes more conditions in .can_receive()
> than .receive() returning 0.
> 
> Cleaning this up requires auditing all the NICs.  Maybe we can get rid
> of .can_receive() and just have a simplified .receive():
> 
>   false - queue packet and tell peer to stop sending.  Flush must be
>           called later to re-enable sending.
>   true - packet delivered or dropped

Yes, I find that some of the .can_receive callbacks look OK for a simple
dropping, for example net-hub and virtio-net.  I can try this, or even better
if we can split the work.

As a starter maybe we can drop the .can_receive callback and move existing
implementations to the beginning of .receive() - something that a few NICs
already do.

Fam



reply via email to

[Prev in Thread] Current Thread [Next in Thread]