qemu-devel
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Date: Thu, 12 Dec 2019 06:24:55 -0500

On Thu, Dec 12, 2019 at 11:05:25AM +0000, Stefan Hajnoczi wrote:
> On Thu, Nov 28, 2019 at 07:26:47PM +0800, address@hidden wrote:
> > Let me describe the issue with an example using `nc-vsock`:
> > 
> > Assume the Guest cid is 3.
> > Run 'rmmod vmw_vsock_virtio_transport' in the Guest,
> > then run 'while true; do ./nc-vsock 3 1234; done' in the Host.
> > 
> > Host                             Guest
> >                                  # rmmod vmw_vsock_virtio_transport
> > 
> > # while true; do ./nc-vsock 3 1234; done
> > (after 2 seconds)
> > connect: Connection timed out
> > (after 2 seconds)
> > connect: Connection timed out
> > ...
> > 
> >                                  # modprobe vmw_vsock_virtio_transport
> > 
> > connect: Connection reset by peer
> > connect: Connection reset by peer
> > connect: Connection reset by peer
> > ...
> > 
> >                                  # nc-vsock -l 1234
> >                                  Connection from cid 2 port ***...
> > (stop printing)
> > 
> > 
> > The above process simulates the communication between
> > `kata-runtime` and `kata-agent` after the Guest starts.
> > In order to connect to `kata-agent` as soon as possible,
> > `kata-runtime` continuously tries to connect to `kata-agent` in a loop.
> > See
> > https://github.com/kata-containers/runtime/blob/d054556f60f092335a22a288011fa29539ad4ccc/vendor/github.com/kata-containers/agent/protocols/client/client.go#L327
> > But when the vsock device in the Guest is not ready, each connection
> > attempt blocks for 2 seconds. This actually slows down
> > the entire startup of `kata-runtime`.
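The cost described above can be sketched in a short, self-contained example. This is not kata-runtime's actual code: it uses a plain TCP socket to an unused localhost port as a stand-in for the vsock connection, and the port, timeout, and attempt count are illustrative.

```python
# Sketch of a kata-runtime-style retry loop. A TCP connection to
# 127.0.0.1 stands in for the vsock connection; port 1 has no listener,
# so every attempt fails. With vsock, each failed attempt can instead
# block for the full 2-second connect timeout, multiplying startup cost.
import socket

def connect_with_retries(host, port, per_attempt_timeout, max_attempts):
    """Retry until connected; each failed attempt may cost up to the timeout."""
    for attempt in range(1, max_attempts + 1):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(per_attempt_timeout)
        try:
            s.connect((host, port))
            return s, attempt          # success: hand back the live socket
        except OSError:
            s.close()                  # refused or timed out; try again
    return None, max_attempts

conn, attempts = connect_with_retries("127.0.0.1", 1, 0.2, 3)
print("connected:", conn is not None, "attempts:", attempts)
```

With a listener absent, all attempts fail; the point is that the total wait grows linearly with the per-attempt timeout, which is why a 2-second block per attempt hurts.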
> 
> This can be done efficiently as follows:
> 1. kata-runtime listens on a vsock port
> 2. kata-agent-port=PORT is added to the kernel command-line options
> 3. kata-agent parses the port number and connects to the host
> 
> This eliminates the reconnection attempts.
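The listen-first scheme can be sketched as follows, again with TCP on localhost standing in for vsock and threads standing in for host and guest. The message contents and the way the port is handed over are illustrative; in the real scheme the port travels on the guest kernel command line.

```python
# Sketch of the reverse scheme: the "runtime" listens before the guest
# boots, and the "agent" connects out exactly once when it is ready, so
# no reconnection loop is needed on either side.
import socket
import threading

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))        # runtime: listen before guest boot
listener.listen(1)
port = listener.getsockname()[1]       # would be passed via kernel cmdline

def agent():
    # agent: parses the port and connects to the host when it is up
    s = socket.create_connection(("127.0.0.1", port))
    s.sendall(b"ready")
    s.close()

t = threading.Thread(target=agent)
t.start()
conn, _ = listener.accept()            # blocks only until the agent is up
msg = conn.recv(16)
t.join()
conn.close()
listener.close()
print(msg.decode())
```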

Then we'll get the same problem in reverse, won't we?
Agent must now be running before guest can boot ...
Or did I miss anything?

> > > I think that adding a QMP event is working around the issue rather than
> > > fixing the root cause.  This is probably a vhost_vsock.ko problem and
> > > should be fixed there.
> > 
> > After looking at the source code of vhost_vsock.ko,
> > I think it is possible to optimize the logic there too.
> > A simple patch is below. Do you think the modification is
> > appropriate?
> > 
> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > index 9f57736f..8fad67be 100644
> > --- a/drivers/vhost/vsock.c
> > +++ b/drivers/vhost/vsock.c
> > @@ -51,6 +51,7 @@ struct vhost_vsock {
> >     atomic_t queued_replies;
> > 
> >     u32 guest_cid;
> > +   u32 state;
> >  };
> > 
> >  static u32 vhost_transport_get_local_cid(void)
> > @@ -497,6 +541,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
> > 
> >             mutex_unlock(&vq->mutex);
> >     }
> > +   vsock->state = 1;
> > 
> >     mutex_unlock(&vsock->dev.mutex);
> >     return 0;
> > @@ -535,6 +580,7 @@ static int vhost_vsock_stop(struct vhost_vsock *vsock)
> >             vq->private_data = NULL;
> >             mutex_unlock(&vq->mutex);
> >     }
> > +   vsock->state = 0;
> > 
> >  err:
> >     mutex_unlock(&vsock->dev.mutex);
> > @@ -786,6 +832,27 @@ static struct miscdevice vhost_vsock_misc = {
> >     .fops = &vhost_vsock_fops,
> >  };
> > 
> > +int vhost_transport_connect(struct vsock_sock *vsk)
> > +{
> > +   struct vhost_vsock *vsock;
> > +   u32 state;
> > +
> > +   rcu_read_lock();
> > +
> > +   /* Find the vhost_vsock according to guest context id */
> > +   vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
> > +   if (!vsock) {
> > +           rcu_read_unlock();
> > +           return -ENODEV;
> > +   }
> > +
> > +   /* Read the state before dropping the RCU read lock,
> > +    * so vsock cannot be freed underneath us.
> > +    */
> > +   state = vsock->state;
> > +
> > +   rcu_read_unlock();
> > +
> > +   if (state == 1)
> > +           return virtio_transport_connect(vsk);
> > +
> > +   return -ECONNRESET;
> > +}
> > +
> >  static struct virtio_transport vhost_transport = {
> >     .transport = {
> >             .get_local_cid            = vhost_transport_get_local_cid,
> > @@ -793,7 +860,7 @@ static struct virtio_transport vhost_transport = {
> >             .init                     = virtio_transport_do_socket_init,
> >             .destruct                 = virtio_transport_destruct,
> >             .release                  = virtio_transport_release,
> > -           .connect                  = virtio_transport_connect,
> > +           .connect                  = vhost_transport_connect,
> >             .shutdown                 = virtio_transport_shutdown,
> >             .cancel_pkt               = vhost_transport_cancel_pkt,
> 
> I'm not keen on adding a special case for vhost_vsock.ko connect.
> 
> Userspace APIs to avoid the 2 second wait already exist:
> 
> 1. The SO_VM_SOCKETS_CONNECT_TIMEOUT socket option controls the connect
>    timeout for this socket.
> 
> 2. Non-blocking connect allows the userspace process to do other things
>    while a connection attempt is being made.
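For reference, the non-blocking connect pattern from option 2 looks like this. The sketch uses TCP on localhost as a stand-in because an AF_VSOCK device is not assumed to be available; the same connect_ex/select pattern applies to a vsock socket. For option 1, one would instead setsockopt() SO_VM_SOCKETS_CONNECT_TIMEOUT (a struct timeval, at option level AF_VSOCK) on the socket before connecting.

```python
# Non-blocking connect: the process is free to do other work while the
# connection attempt is in flight, instead of blocking for 2 seconds.
import errno
import select
import socket

# Stand-in peer so the connect attempt has something to complete against.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setblocking(False)                   # connect() returns immediately
rc = s.connect_ex(("127.0.0.1", port)) # usually errno.EINPROGRESS
assert rc in (0, errno.EINPROGRESS)

# ... do other work here, then poll for completion ...
_, writable, _ = select.select([], [s], [], 2.0)
err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)  # 0 on success
print("completed:", bool(writable), "err:", err)

s.close()
listener.close()
```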
> 
> But the best solution is the one I mentioned above.
> 
> Stefan




