All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 15:35 ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 15:35 UTC (permalink / raw)
  To: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen; +Cc: kvm, qemu-devel

When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.

We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

I need to report some error from virtio-pci
that would be handled specially (disable but don't
report an error) so I wanted one that's never likely to be used by a
userspace ioctl. I selected ERANGE but it'd
be easy to switch to something else. Comments?

 hw/vhost.c      |    4 +++-
 hw/virtio-net.c |    6 ++++--
 hw/virtio-pci.c |    3 +++
 hw/virtio.h     |    2 ++
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/vhost.c b/hw/vhost.c
index 1d09ed0..c79765a 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 
     r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
     if (r < 0) {
-        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
+	if (r != -EVIRTIO_DISABLED) {
+		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
+	}
         goto fail_notifiers;
     }
 
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ccb3e63..5de3fee 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     if (!n->vhost_started) {
         int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
         if (r < 0) {
-            error_report("unable to start vhost net: %d: "
-                         "falling back on userspace virtio", -r);
+            if (r != -EVIRTIO_DISABLED) {
+                error_report("unable to start vhost net: %d: "
+                             "falling back on userspace virtio", -r);
+            }
         } else {
             n->vhost_started = 1;
         }
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index dd8887a..dbf4be0 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
     EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
 
     if (assign) {
+        if (!msix_enabled(&proxy->pci_dev)) {
+            return -EVIRTIO_DISABLED;
+        }
         int r = event_notifier_init(notifier, 0);
         if (r < 0) {
             return r;
diff --git a/hw/virtio.h b/hw/virtio.h
index d8546d5..53bbdba 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -98,6 +98,8 @@ typedef struct {
     void (*vmstate_change)(void * opaque, bool running);
 } VirtIOBindings;
 
+#define EVIRTIO_DISABLED ERANGE
+
 #define VIRTIO_PCI_QUEUE_MAX 64
 
 #define VIRTIO_NO_VECTOR 0xffff
-- 
1.7.3.2.91.g446ac

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 15:35 ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 15:35 UTC (permalink / raw)
  To: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen; +Cc: qemu-devel, kvm

When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.

We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

I need to report some error from virtio-pci
that would be handled specially (disable but don't
report an error) so I wanted one that's never likely to be used by a
userspace ioctl. I selected ERANGE but it'd
be easy to switch to something else. Comments?

 hw/vhost.c      |    4 +++-
 hw/virtio-net.c |    6 ++++--
 hw/virtio-pci.c |    3 +++
 hw/virtio.h     |    2 ++
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/vhost.c b/hw/vhost.c
index 1d09ed0..c79765a 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 
     r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
     if (r < 0) {
-        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
+	if (r != -EVIRTIO_DISABLED) {
+		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
+	}
         goto fail_notifiers;
     }
 
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ccb3e63..5de3fee 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     if (!n->vhost_started) {
         int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
         if (r < 0) {
-            error_report("unable to start vhost net: %d: "
-                         "falling back on userspace virtio", -r);
+            if (r != -EVIRTIO_DISABLED) {
+                error_report("unable to start vhost net: %d: "
+                             "falling back on userspace virtio", -r);
+            }
         } else {
             n->vhost_started = 1;
         }
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index dd8887a..dbf4be0 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
     EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
 
     if (assign) {
+        if (!msix_enabled(&proxy->pci_dev)) {
+            return -EVIRTIO_DISABLED;
+        }
         int r = event_notifier_init(notifier, 0);
         if (r < 0) {
             return r;
diff --git a/hw/virtio.h b/hw/virtio.h
index d8546d5..53bbdba 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -98,6 +98,8 @@ typedef struct {
     void (*vmstate_change)(void * opaque, bool running);
 } VirtIOBindings;
 
+#define EVIRTIO_DISABLED ERANGE
+
 #define VIRTIO_PCI_QUEUE_MAX 64
 
 #define VIRTIO_NO_VECTOR 0xffff
-- 
1.7.3.2.91.g446ac

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 15:35 ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-20 15:43   ` Anthony Liguori
  -1 siblings, 0 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-20 15:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen, qemu-devel, kvm

On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
>
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>    

I actually think this should be a terminal error.  The user asks for 
vhost-net, if we cannot enable it, we should exit.

Or we should warn the user that they should expect bad performance.  
Silently doing something that the user has explicitly asked us not to do 
is not a good behavior.

Regards,

Anthony Liguori

> ---
>
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?
>
>   hw/vhost.c      |    4 +++-
>   hw/virtio-net.c |    6 ++++--
>   hw/virtio-pci.c |    3 +++
>   hw/virtio.h     |    2 ++
>   4 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>
>       r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>       if (r<  0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}
>           goto fail_notifiers;
>       }
>
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>       if (!n->vhost_started) {
>           int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer),&n->vdev);
>           if (r<  0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>           } else {
>               n->vhost_started = 1;
>           }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>       EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>
>       if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>           int r = event_notifier_init(notifier, 0);
>           if (r<  0) {
>               return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>       void (*vmstate_change)(void * opaque, bool running);
>   } VirtIOBindings;
>
> +#define EVIRTIO_DISABLED ERANGE
> +
>   #define VIRTIO_PCI_QUEUE_MAX 64
>
>   #define VIRTIO_NO_VECTOR 0xffff
>    


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 15:43   ` Anthony Liguori
  0 siblings, 0 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-20 15:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
>
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>    

I actually think this should be a terminal error.  The user asks for 
vhost-net, if we cannot enable it, we should exit.

Or we should warn the user that they should expect bad performance.  
Silently doing something that the user has explicitly asked us not to do 
is not a good behavior.

Regards,

Anthony Liguori

> ---
>
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?
>
>   hw/vhost.c      |    4 +++-
>   hw/virtio-net.c |    6 ++++--
>   hw/virtio-pci.c |    3 +++
>   hw/virtio.h     |    2 ++
>   4 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>
>       r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>       if (r<  0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}
>           goto fail_notifiers;
>       }
>
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>       if (!n->vhost_started) {
>           int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer),&n->vdev);
>           if (r<  0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>           } else {
>               n->vhost_started = 1;
>           }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>       EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>
>       if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>           int r = event_notifier_init(notifier, 0);
>           if (r<  0) {
>               return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>       void (*vmstate_change)(void * opaque, bool running);
>   } VirtIOBindings;
>
> +#define EVIRTIO_DISABLED ERANGE
> +
>   #define VIRTIO_PCI_QUEUE_MAX 64
>
>   #define VIRTIO_NO_VECTOR 0xffff
>    

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 15:43   ` Anthony Liguori
@ 2011-01-20 16:07     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 16:07 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen, qemu-devel, kvm

On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >When MSI is off, each interrupt needs to be bounced through the io
> >thread when it's set/cleared, so vhost-net causes more context switches and
> >higher CPU utilization than userspace virtio which handles networking in
> >the same thread.
> >
> >We'll need to fix this by adding level irq support in kvm irqfd,
> >for now disable vhost-net in these configurations.
> >
> >Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> 
> I actually think this should be a terminal error.  The user asks for
> vhost-net, if we cannot enable it, we should exit.
> 
> Or we should warn the user that they should expect bad performance.
> Silently doing something that the user has explicitly asked us not
> to do is not a good behavior.
> 
> Regards,
> 
> Anthony Liguori

The issue is that user has no control of the guest, and can not know
whether the guest enables MSI. So what you ask for will just make
some guests fail, and others fail sometimes.
The user also has no way to know that version X of kvm does not expose a
way to inject level interrupts with irqfd.

We could have *another* flag that says "use vhost where it helps" but
then I think this is what everyone wants to do, anyway, and libvirt
already sets vhost=on so I prefer redefining the meaning of an existing
flag.

Maybe this is best handled by a documentation update?

We always said:
	    "                use vhost=on to enable experimental in kernel accelerator\n"

note 'enable' not 'require'. This is similar to how we specify
nvectors : you can not make guest use the feature.

How about this:

diff --git a/qemu-options.hx b/qemu-options.hx
index 898561d..3c937c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "                use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
     "                use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
     "                use vhost=on to enable experimental in kernel accelerator\n"
+    "                (note: vhost=on has no effect unless guest uses MSI-X)\n"
     "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
 #endif
     "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 16:07     ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 16:07 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >When MSI is off, each interrupt needs to be bounced through the io
> >thread when it's set/cleared, so vhost-net causes more context switches and
> >higher CPU utilization than userspace virtio which handles networking in
> >the same thread.
> >
> >We'll need to fix this by adding level irq support in kvm irqfd,
> >for now disable vhost-net in these configurations.
> >
> >Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> 
> I actually think this should be a terminal error.  The user asks for
> vhost-net, if we cannot enable it, we should exit.
> 
> Or we should warn the user that they should expect bad performance.
> Silently doing something that the user has explicitly asked us not
> to do is not a good behavior.
> 
> Regards,
> 
> Anthony Liguori

The issue is that user has no control of the guest, and can not know
whether the guest enables MSI. So what you ask for will just make
some guests fail, and others fail sometimes.
The user also has no way to know that version X of kvm does not expose a
way to inject level interrupts with irqfd.

We could have *another* flag that says "use vhost where it helps" but
then I think this is what everyone wants to do, anyway, and libvirt
already sets vhost=on so I prefer redefining the meaning of an existing
flag.

Maybe this is best handled by a documentation update?

We always said:
	    "                use vhost=on to enable experimental in kernel accelerator\n"

note 'enable' not 'require'. This is similar to how we specify
nvectors : you can not make guest use the feature.

How about this:

diff --git a/qemu-options.hx b/qemu-options.hx
index 898561d..3c937c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "                use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
     "                use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
     "                use vhost=on to enable experimental in kernel accelerator\n"
+    "                (note: vhost=on has no effect unless guest uses MSI-X)\n"
     "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
 #endif
     "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 15:35 ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-20 16:31   ` Sridhar Samudrala
  -1 siblings, 0 replies; 37+ messages in thread
From: Sridhar Samudrala @ 2011-01-20 16:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen, kvm, qemu-devel

On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
> 
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?

Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?

-Sridhar
> 
>  hw/vhost.c      |    4 +++-
>  hw/virtio-net.c |    6 ++++--
>  hw/virtio-pci.c |    3 +++
>  hw/virtio.h     |    2 ++
>  4 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  
>      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>      if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}
>          goto fail_notifiers;
>      }
>  
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      if (!n->vhost_started) {
>          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
>          if (r < 0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>          } else {
>              n->vhost_started = 1;
>          }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>  
>      if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>          int r = event_notifier_init(notifier, 0);
>          if (r < 0) {
>              return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>      void (*vmstate_change)(void * opaque, bool running);
>  } VirtIOBindings;
>  
> +#define EVIRTIO_DISABLED ERANGE
> +
>  #define VIRTIO_PCI_QUEUE_MAX 64
>  
>  #define VIRTIO_NO_VECTOR 0xffff


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 16:31   ` Sridhar Samudrala
  0 siblings, 0 replies; 37+ messages in thread
From: Sridhar Samudrala @ 2011-01-20 16:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
> 
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?

Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?

-Sridhar
> 
>  hw/vhost.c      |    4 +++-
>  hw/virtio-net.c |    6 ++++--
>  hw/virtio-pci.c |    3 +++
>  hw/virtio.h     |    2 ++
>  4 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  
>      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>      if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}
>          goto fail_notifiers;
>      }
>  
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      if (!n->vhost_started) {
>          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
>          if (r < 0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>          } else {
>              n->vhost_started = 1;
>          }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>  
>      if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>          int r = event_notifier_init(notifier, 0);
>          if (r < 0) {
>              return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>      void (*vmstate_change)(void * opaque, bool running);
>  } VirtIOBindings;
>  
> +#define EVIRTIO_DISABLED ERANGE
> +
>  #define VIRTIO_PCI_QUEUE_MAX 64
>  
>  #define VIRTIO_NO_VECTOR 0xffff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 16:31   ` [Qemu-devel] " Sridhar Samudrala
@ 2011-01-20 17:47     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 17:47 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen, kvm, qemu-devel

On Thu, Jan 20, 2011 at 08:31:53AM -0800, Sridhar Samudrala wrote:
> On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> > When MSI is off, each interrupt needs to be bounced through the io
> > thread when it's set/cleared, so vhost-net causes more context switches and
> > higher CPU utilization than userspace virtio which handles networking in
> > the same thread.
> > 
> > We'll need to fix this by adding level irq support in kvm irqfd,
> > for now disable vhost-net in these configurations.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > 
> > I need to report some error from virtio-pci
> > that would be handled specially (disable but don't
> > report an error) so I wanted one that's never likely to be used by a
> > userspace ioctl. I selected ERANGE but it'd
> > be easy to switch to something else. Comments?
> 
> Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?
> 
> -Sridhar

The error is reported by virtio-pci which does not know about vhost.
I started with EVIRTIO_MSIX_DISABLED and made is shorter.
Would EVIRTIO_MSIX_DISABLED be better?

> > 
> >  hw/vhost.c      |    4 +++-
> >  hw/virtio-net.c |    6 ++++--
> >  hw/virtio-pci.c |    3 +++
> >  hw/virtio.h     |    2 ++
> >  4 files changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/vhost.c b/hw/vhost.c
> > index 1d09ed0..c79765a 100644
> > --- a/hw/vhost.c
> > +++ b/hw/vhost.c
> > @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >  
> >      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
> >      if (r < 0) {
> > -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > +	if (r != -EVIRTIO_DISABLED) {
> > +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > +	}
> >          goto fail_notifiers;
> >      }
> >  
> > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > index ccb3e63..5de3fee 100644
> > --- a/hw/virtio-net.c
> > +++ b/hw/virtio-net.c
> > @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> >      if (!n->vhost_started) {
> >          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
> >          if (r < 0) {
> > -            error_report("unable to start vhost net: %d: "
> > -                         "falling back on userspace virtio", -r);
> > +            if (r != -EVIRTIO_DISABLED) {
> > +                error_report("unable to start vhost net: %d: "
> > +                             "falling back on userspace virtio", -r);
> > +            }
> >          } else {
> >              n->vhost_started = 1;
> >          }
> > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > index dd8887a..dbf4be0 100644
> > --- a/hw/virtio-pci.c
> > +++ b/hw/virtio-pci.c
> > @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
> >      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
> >  
> >      if (assign) {
> > +        if (!msix_enabled(&proxy->pci_dev)) {
> > +            return -EVIRTIO_DISABLED;
> > +        }
> >          int r = event_notifier_init(notifier, 0);
> >          if (r < 0) {
> >              return r;
> > diff --git a/hw/virtio.h b/hw/virtio.h
> > index d8546d5..53bbdba 100644
> > --- a/hw/virtio.h
> > +++ b/hw/virtio.h
> > @@ -98,6 +98,8 @@ typedef struct {
> >      void (*vmstate_change)(void * opaque, bool running);
> >  } VirtIOBindings;
> >  
> > +#define EVIRTIO_DISABLED ERANGE
> > +
> >  #define VIRTIO_PCI_QUEUE_MAX 64
> >  
> >  #define VIRTIO_NO_VECTOR 0xffff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 17:47     ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-20 17:47 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Thu, Jan 20, 2011 at 08:31:53AM -0800, Sridhar Samudrala wrote:
> On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> > When MSI is off, each interrupt needs to be bounced through the io
> > thread when it's set/cleared, so vhost-net causes more context switches and
> > higher CPU utilization than userspace virtio which handles networking in
> > the same thread.
> > 
> > We'll need to fix this by adding level irq support in kvm irqfd,
> > for now disable vhost-net in these configurations.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > 
> > I need to report some error from virtio-pci
> > that would be handled specially (disable but don't
> > report an error) so I wanted one that's never likely to be used by a
> > userspace ioctl. I selected ERANGE but it'd
> > be easy to switch to something else. Comments?
> 
> Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?
> 
> -Sridhar

The error is reported by virtio-pci which does not know about vhost.
I started with EVIRTIO_MSIX_DISABLED and made is shorter.
Would EVIRTIO_MSIX_DISABLED be better?

> > 
> >  hw/vhost.c      |    4 +++-
> >  hw/virtio-net.c |    6 ++++--
> >  hw/virtio-pci.c |    3 +++
> >  hw/virtio.h     |    2 ++
> >  4 files changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/vhost.c b/hw/vhost.c
> > index 1d09ed0..c79765a 100644
> > --- a/hw/vhost.c
> > +++ b/hw/vhost.c
> > @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >  
> >      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
> >      if (r < 0) {
> > -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > +	if (r != -EVIRTIO_DISABLED) {
> > +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > +	}
> >          goto fail_notifiers;
> >      }
> >  
> > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > index ccb3e63..5de3fee 100644
> > --- a/hw/virtio-net.c
> > +++ b/hw/virtio-net.c
> > @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> >      if (!n->vhost_started) {
> >          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
> >          if (r < 0) {
> > -            error_report("unable to start vhost net: %d: "
> > -                         "falling back on userspace virtio", -r);
> > +            if (r != -EVIRTIO_DISABLED) {
> > +                error_report("unable to start vhost net: %d: "
> > +                             "falling back on userspace virtio", -r);
> > +            }
> >          } else {
> >              n->vhost_started = 1;
> >          }
> > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > index dd8887a..dbf4be0 100644
> > --- a/hw/virtio-pci.c
> > +++ b/hw/virtio-pci.c
> > @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
> >      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
> >  
> >      if (assign) {
> > +        if (!msix_enabled(&proxy->pci_dev)) {
> > +            return -EVIRTIO_DISABLED;
> > +        }
> >          int r = event_notifier_init(notifier, 0);
> >          if (r < 0) {
> >              return r;
> > diff --git a/hw/virtio.h b/hw/virtio.h
> > index d8546d5..53bbdba 100644
> > --- a/hw/virtio.h
> > +++ b/hw/virtio.h
> > @@ -98,6 +98,8 @@ typedef struct {
> >      void (*vmstate_change)(void * opaque, bool running);
> >  } VirtIOBindings;
> >  
> > +#define EVIRTIO_DISABLED ERANGE
> > +
> >  #define VIRTIO_PCI_QUEUE_MAX 64
> >  
> >  #define VIRTIO_NO_VECTOR 0xffff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 15:35 ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-20 18:05   ` Alex Williamson
  -1 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-20 18:05 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Juan Quintela, jasowang, Jes.Sorensen, kvm, qemu-devel

On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
> 
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.

We need this if we want avoid bouncing vfio INtx through qemu too.

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?
> 
>  hw/vhost.c      |    4 +++-
>  hw/virtio-net.c |    6 ++++--
>  hw/virtio-pci.c |    3 +++
>  hw/virtio.h     |    2 ++
>  4 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  
>      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>      if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}

style - the above is tab indented.

>          goto fail_notifiers;
>      }
>  
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      if (!n->vhost_started) {
>          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
>          if (r < 0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>          } else {
>              n->vhost_started = 1;
>          }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>  
>      if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>          int r = event_notifier_init(notifier, 0);
>          if (r < 0) {
>              return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>      void (*vmstate_change)(void * opaque, bool running);
>  } VirtIOBindings;
>  
> +#define EVIRTIO_DISABLED ERANGE
> +
>  #define VIRTIO_PCI_QUEUE_MAX 64
>  
>  #define VIRTIO_NO_VECTOR 0xffff

I'm not a fan of having this special return value.  Why doesn't
virtio-pci only setup the set_guest_notifiers function pointer when msix
is enabled?  If not that, it could at least expose some
virtio_foo_enabled() type feature that vhost could check.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 18:05   ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-20 18:05 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jes.Sorensen, jasowang, qemu-devel, kvm, Juan Quintela

On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
> 
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.

We need this if we want avoid bouncing vfio INtx through qemu too.

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?
> 
>  hw/vhost.c      |    4 +++-
>  hw/virtio-net.c |    6 ++++--
>  hw/virtio-pci.c |    3 +++
>  hw/virtio.h     |    2 ++
>  4 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  
>      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>      if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	if (r != -EVIRTIO_DISABLED) {
> +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +	}

style - the above is tab indented.

>          goto fail_notifiers;
>      }
>  
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      if (!n->vhost_started) {
>          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
>          if (r < 0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>          } else {
>              n->vhost_started = 1;
>          }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>  
>      if (assign) {
> +        if (!msix_enabled(&proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>          int r = event_notifier_init(notifier, 0);
>          if (r < 0) {
>              return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>      void (*vmstate_change)(void * opaque, bool running);
>  } VirtIOBindings;
>  
> +#define EVIRTIO_DISABLED ERANGE
> +
>  #define VIRTIO_PCI_QUEUE_MAX 64
>  
>  #define VIRTIO_NO_VECTOR 0xffff

I'm not a fan of having this special return value.  Why doesn't
virtio-pci only setup the set_guest_notifiers function pointer when msix
is enabled?  If not that, it could at least expose some
virtio_foo_enabled() type feature that vhost could check.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 17:47     ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-20 23:43       ` Sridhar Samudrala
  -1 siblings, 0 replies; 37+ messages in thread
From: Sridhar Samudrala @ 2011-01-20 23:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen, kvm, qemu-devel

On Thu, 2011-01-20 at 19:47 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 08:31:53AM -0800, Sridhar Samudrala wrote:
> > On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> > > When MSI is off, each interrupt needs to be bounced through the io
> > > thread when it's set/cleared, so vhost-net causes more context switches and
> > > higher CPU utilization than userspace virtio which handles networking in
> > > the same thread.
> > > 
> > > We'll need to fix this by adding level irq support in kvm irqfd,
> > > for now disable vhost-net in these configurations.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > > 
> > > I need to report some error from virtio-pci
> > > that would be handled specially (disable but don't
> > > report an error) so I wanted one that's never likely to be used by a
> > > userspace ioctl. I selected ERANGE but it'd
> > > be easy to switch to something else. Comments?
> > 
> > Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?
> > 
> > -Sridhar
> 
> The error is reported by virtio-pci which does not know about vhost.
> I started with EVIRTIO_MSIX_DISABLED and made is shorter.
> Would EVIRTIO_MSIX_DISABLED be better?

I think so. This makes it more clear.
-Sridhar
> 
> > > 
> > >  hw/vhost.c      |    4 +++-
> > >  hw/virtio-net.c |    6 ++++--
> > >  hw/virtio-pci.c |    3 +++
> > >  hw/virtio.h     |    2 ++
> > >  4 files changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/vhost.c b/hw/vhost.c
> > > index 1d09ed0..c79765a 100644
> > > --- a/hw/vhost.c
> > > +++ b/hw/vhost.c
> > > @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> > >  
> > >      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
> > >      if (r < 0) {
> > > -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > > +	if (r != -EVIRTIO_DISABLED) {
> > > +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > > +	}
> > >          goto fail_notifiers;
> > >      }
> > >  
> > > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > > index ccb3e63..5de3fee 100644
> > > --- a/hw/virtio-net.c
> > > +++ b/hw/virtio-net.c
> > > @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> > >      if (!n->vhost_started) {
> > >          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
> > >          if (r < 0) {
> > > -            error_report("unable to start vhost net: %d: "
> > > -                         "falling back on userspace virtio", -r);
> > > +            if (r != -EVIRTIO_DISABLED) {
> > > +                error_report("unable to start vhost net: %d: "
> > > +                             "falling back on userspace virtio", -r);
> > > +            }
> > >          } else {
> > >              n->vhost_started = 1;
> > >          }
> > > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > > index dd8887a..dbf4be0 100644
> > > --- a/hw/virtio-pci.c
> > > +++ b/hw/virtio-pci.c
> > > @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
> > >      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
> > >  
> > >      if (assign) {
> > > +        if (!msix_enabled(&proxy->pci_dev)) {
> > > +            return -EVIRTIO_DISABLED;
> > > +        }
> > >          int r = event_notifier_init(notifier, 0);
> > >          if (r < 0) {
> > >              return r;
> > > diff --git a/hw/virtio.h b/hw/virtio.h
> > > index d8546d5..53bbdba 100644
> > > --- a/hw/virtio.h
> > > +++ b/hw/virtio.h
> > > @@ -98,6 +98,8 @@ typedef struct {
> > >      void (*vmstate_change)(void * opaque, bool running);
> > >  } VirtIOBindings;
> > >  
> > > +#define EVIRTIO_DISABLED ERANGE
> > > +
> > >  #define VIRTIO_PCI_QUEUE_MAX 64
> > >  
> > >  #define VIRTIO_NO_VECTOR 0xffff


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-20 23:43       ` Sridhar Samudrala
  0 siblings, 0 replies; 37+ messages in thread
From: Sridhar Samudrala @ 2011-01-20 23:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Thu, 2011-01-20 at 19:47 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 08:31:53AM -0800, Sridhar Samudrala wrote:
> > On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> > > When MSI is off, each interrupt needs to be bounced through the io
> > > thread when it's set/cleared, so vhost-net causes more context switches and
> > > higher CPU utilization than userspace virtio which handles networking in
> > > the same thread.
> > > 
> > > We'll need to fix this by adding level irq support in kvm irqfd,
> > > for now disable vhost-net in these configurations.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > > 
> > > I need to report some error from virtio-pci
> > > that would be handled specially (disable but don't
> > > report an error) so I wanted one that's never likely to be used by a
> > > userspace ioctl. I selected ERANGE but it'd
> > > be easy to switch to something else. Comments?
> > 
> > Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?
> > 
> > -Sridhar
> 
> The error is reported by virtio-pci which does not know about vhost.
> I started with EVIRTIO_MSIX_DISABLED and made is shorter.
> Would EVIRTIO_MSIX_DISABLED be better?

I think so. This makes it more clear.
-Sridhar
> 
> > > 
> > >  hw/vhost.c      |    4 +++-
> > >  hw/virtio-net.c |    6 ++++--
> > >  hw/virtio-pci.c |    3 +++
> > >  hw/virtio.h     |    2 ++
> > >  4 files changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/vhost.c b/hw/vhost.c
> > > index 1d09ed0..c79765a 100644
> > > --- a/hw/vhost.c
> > > +++ b/hw/vhost.c
> > > @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> > >  
> > >      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
> > >      if (r < 0) {
> > > -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > > +	if (r != -EVIRTIO_DISABLED) {
> > > +		fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> > > +	}
> > >          goto fail_notifiers;
> > >      }
> > >  
> > > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > > index ccb3e63..5de3fee 100644
> > > --- a/hw/virtio-net.c
> > > +++ b/hw/virtio-net.c
> > > @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> > >      if (!n->vhost_started) {
> > >          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
> > >          if (r < 0) {
> > > -            error_report("unable to start vhost net: %d: "
> > > -                         "falling back on userspace virtio", -r);
> > > +            if (r != -EVIRTIO_DISABLED) {
> > > +                error_report("unable to start vhost net: %d: "
> > > +                             "falling back on userspace virtio", -r);
> > > +            }
> > >          } else {
> > >              n->vhost_started = 1;
> > >          }
> > > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > > index dd8887a..dbf4be0 100644
> > > --- a/hw/virtio-pci.c
> > > +++ b/hw/virtio-pci.c
> > > @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
> > >      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
> > >  
> > >      if (assign) {
> > > +        if (!msix_enabled(&proxy->pci_dev)) {
> > > +            return -EVIRTIO_DISABLED;
> > > +        }
> > >          int r = event_notifier_init(notifier, 0);
> > >          if (r < 0) {
> > >              return r;
> > > diff --git a/hw/virtio.h b/hw/virtio.h
> > > index d8546d5..53bbdba 100644
> > > --- a/hw/virtio.h
> > > +++ b/hw/virtio.h
> > > @@ -98,6 +98,8 @@ typedef struct {
> > >      void (*vmstate_change)(void * opaque, bool running);
> > >  } VirtIOBindings;
> > >  
> > > +#define EVIRTIO_DISABLED ERANGE
> > > +
> > >  #define VIRTIO_PCI_QUEUE_MAX 64
> > >  
> > >  #define VIRTIO_NO_VECTOR 0xffff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 16:07     ` Michael S. Tsirkin
  (?)
@ 2011-01-21  0:23     ` Anthony Liguori
  2011-01-21  1:35         ` Alex Williamson
  2011-01-21  9:48       ` Michael S. Tsirkin
  -1 siblings, 2 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-21  0:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>    
>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>      
>>> When MSI is off, each interrupt needs to be bounced through the io
>>> thread when it's set/cleared, so vhost-net causes more context switches and
>>> higher CPU utilization than userspace virtio which handles networking in
>>> the same thread.
>>>
>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>> for now disable vhost-net in these configurations.
>>>
>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>        
>> I actually think this should be a terminal error.  The user asks for
>> vhost-net, if we cannot enable it, we should exit.
>>
>> Or we should warn the user that they should expect bad performance.
>> Silently doing something that the user has explicitly asked us not
>> to do is not a good behavior.
>>
>> Regards,
>>
>> Anthony Liguori
>>      
> The issue is that user has no control of the guest, and can not know
> whether the guest enables MSI. So what you ask for will just make
> some guests fail, and others fail sometimes.
> The user also has no way to know that version X of kvm does not expose a
> way to inject level interrupts with irqfd.
>
> We could have *another* flag that says "use vhost where it helps" but
> then I think this is what everyone wants to do, anyway, and libvirt
> already sets vhost=on so I prefer redefining the meaning of an existing
> flag.
>    

In the very least, there needs to be a vhost=force.

Having some sort of friendly default policy is fine but we need to 
provide a mechanism for a user to have the final say.  If you want to 
redefine vhost=on to really mean, use the friendly default, that's fine 
by me, but only if the vhost=force option exists.

I actually would think libvirt would want to use vhost=force.  Debugging 
with vhost=on is going to be a royal pain in the ass if a user reports 
bad performance.  Given the libvirt XML, you can't actually tell from 
the guest and the XML whether or not vhost was actually in use or not.

Regards,

Anthony Liguori

> Maybe this is best handled by a documentation update?
>
> We always said:
> 	    "                use vhost=on to enable experimental in kernel accelerator\n"
>
> note 'enable' not 'require'. This is similar to how we specify
> nvectors : you can not make guest use the feature.
>
> How about this:
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 898561d..3c937c1 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
>       "                use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
>       "                use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
>       "                use vhost=on to enable experimental in kernel accelerator\n"
> +    "                (note: vhost=on has no effect unless guest uses MSI-X)\n"
>       "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
>   #endif
>       "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
>
>
>    


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  0:23     ` Anthony Liguori
@ 2011-01-21  1:35         ` Alex Williamson
  2011-01-21  9:48       ` Michael S. Tsirkin
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-21  1:35 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michael S. Tsirkin, kvm, Juan Quintela, Jes.Sorensen, jasowang,
	qemu-devel

On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> >    
> >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >>      
> >>> When MSI is off, each interrupt needs to be bounced through the io
> >>> thread when it's set/cleared, so vhost-net causes more context switches and
> >>> higher CPU utilization than userspace virtio which handles networking in
> >>> the same thread.
> >>>
> >>> We'll need to fix this by adding level irq support in kvm irqfd,
> >>> for now disable vhost-net in these configurations.
> >>>
> >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >>>        
> >> I actually think this should be a terminal error.  The user asks for
> >> vhost-net, if we cannot enable it, we should exit.
> >>
> >> Or we should warn the user that they should expect bad performance.
> >> Silently doing something that the user has explicitly asked us not
> >> to do is not a good behavior.
> >>
> >> Regards,
> >>
> >> Anthony Liguori
> >>      
> > The issue is that user has no control of the guest, and can not know
> > whether the guest enables MSI. So what you ask for will just make
> > some guests fail, and others fail sometimes.
> > The user also has no way to know that version X of kvm does not expose a
> > way to inject level interrupts with irqfd.
> >
> > We could have *another* flag that says "use vhost where it helps" but
> > then I think this is what everyone wants to do, anyway, and libvirt
> > already sets vhost=on so I prefer redefining the meaning of an existing
> > flag.
> >    
> 
> In the very least, there needs to be a vhost=force.
> 
> Having some sort of friendly default policy is fine but we need to 
> provide a mechanism for a user to have the final say.  If you want to 
> redefine vhost=on to really mean, use the friendly default, that's fine 
> by me, but only if the vhost=force option exists.
> 
> I actually would think libvirt would want to use vhost=force.  Debugging 
> with vhost=on is going to be a royal pain in the ass if a user reports 
> bad performance.  Given the libvirt XML, you can't actually tell from 
> the guest and the XML whether or not vhost was actually in use or not.

If we add a force option, let's please distinguish hotplug from VM
creation time.  The latter can abort.  Hotplug should print an error and
fail the initfn.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-21  1:35         ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-21  1:35 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Quintela, Jes.Sorensen, Juan, Michael S. Tsirkin,
	qemu-devel, jasowang

On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> >    
> >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >>      
> >>> When MSI is off, each interrupt needs to be bounced through the io
> >>> thread when it's set/cleared, so vhost-net causes more context switches and
> >>> higher CPU utilization than userspace virtio which handles networking in
> >>> the same thread.
> >>>
> >>> We'll need to fix this by adding level irq support in kvm irqfd,
> >>> for now disable vhost-net in these configurations.
> >>>
> >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >>>        
> >> I actually think this should be a terminal error.  The user asks for
> >> vhost-net, if we cannot enable it, we should exit.
> >>
> >> Or we should warn the user that they should expect bad performance.
> >> Silently doing something that the user has explicitly asked us not
> >> to do is not a good behavior.
> >>
> >> Regards,
> >>
> >> Anthony Liguori
> >>      
> > The issue is that user has no control of the guest, and can not know
> > whether the guest enables MSI. So what you ask for will just make
> > some guests fail, and others fail sometimes.
> > The user also has no way to know that version X of kvm does not expose a
> > way to inject level interrupts with irqfd.
> >
> > We could have *another* flag that says "use vhost where it helps" but
> > then I think this is what everyone wants to do, anyway, and libvirt
> > already sets vhost=on so I prefer redefining the meaning of an existing
> > flag.
> >    
> 
> In the very least, there needs to be a vhost=force.
> 
> Having some sort of friendly default policy is fine but we need to 
> provide a mechanism for a user to have the final say.  If you want to 
> redefine vhost=on to really mean, use the friendly default, that's fine 
> by me, but only if the vhost=force option exists.
> 
> I actually would think libvirt would want to use vhost=force.  Debugging 
> with vhost=on is going to be a royal pain in the ass if a user reports 
> bad performance.  Given the libvirt XML, you can't actually tell from 
> the guest and the XML whether or not vhost was actually in use or not.

If we add a force option, let's please distinguish hotplug from VM
creation time.  The latter can abort.  Hotplug should print an error and
fail the initfn.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  0:23     ` Anthony Liguori
  2011-01-21  1:35         ` Alex Williamson
@ 2011-01-21  9:48       ` Michael S. Tsirkin
  2011-01-21 14:37         ` Anthony Liguori
  1 sibling, 1 reply; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-21  9:48 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote:
> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> >On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> >>On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >>>When MSI is off, each interrupt needs to be bounced through the io
> >>>thread when it's set/cleared, so vhost-net causes more context switches and
> >>>higher CPU utilization than userspace virtio which handles networking in
> >>>the same thread.
> >>>
> >>>We'll need to fix this by adding level irq support in kvm irqfd,
> >>>for now disable vhost-net in these configurations.
> >>>
> >>>Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >>I actually think this should be a terminal error.  The user asks for
> >>vhost-net, if we cannot enable it, we should exit.
> >>
> >>Or we should warn the user that they should expect bad performance.
> >>Silently doing something that the user has explicitly asked us not
> >>to do is not a good behavior.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
> >The issue is that user has no control of the guest, and can not know
> >whether the guest enables MSI. So what you ask for will just make
> >some guests fail, and others fail sometimes.
> >The user also has no way to know that version X of kvm does not expose a
> >way to inject level interrupts with irqfd.
> >
> >We could have *another* flag that says "use vhost where it helps" but
> >then I think this is what everyone wants to do, anyway, and libvirt
> >already sets vhost=on so I prefer redefining the meaning of an existing
> >flag.
> 
> In the very least, there needs to be a vhost=force.
> Having some sort of friendly default policy is fine but we need to
> provide a mechanism for a user to have the final say.  If you want
> to redefine vhost=on to really mean, use the friendly default,
> that's fine by me, but only if the vhost=force option exists.

OK, I will add that, probably as a separate flag as vhost
is a boolean.  This will get worse performance but it will be what the
user asked for.

> 
> I actually would think libvirt would want to use vhost=force.
> Debugging with vhost=on is going to be a royal pain in the ass if a
> user reports bad performance.  Given the libvirt XML, you can't
> actually tell from the guest and the XML whether or not vhost was
> actually in use or not.

Yes you can: check MSI enabled in the guest, if it is -
check vhost enabled in the XML. Not that bad at all, is it?

> 
> Regards,
> 
> Anthony Liguori

We get worse performance without MSI anyway, how is this different?

> >Maybe this is best handled by a documentation update?
> >
> >We always said:
> >	    "                use vhost=on to enable experimental in kernel accelerator\n"
> >
> >note 'enable' not 'require'. This is similar to how we specify
> >nvectors : you can not make guest use the feature.
> >
> >How about this:
> >
> >diff --git a/qemu-options.hx b/qemu-options.hx
> >index 898561d..3c937c1 100644
> >--- a/qemu-options.hx
> >+++ b/qemu-options.hx
> >@@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
> >      "                use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
> >      "                use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
> >      "                use vhost=on to enable experimental in kernel accelerator\n"
> >+    "                (note: vhost=on has no effect unless guest uses MSI-X)\n"
> >      "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
> >  #endif
> >      "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
> >
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  1:35         ` Alex Williamson
@ 2011-01-21  9:55           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-21  9:55 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Anthony Liguori, kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > >    
> > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > >>      
> > >>> When MSI is off, each interrupt needs to be bounced through the io
> > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > >>> higher CPU utilization than userspace virtio which handles networking in
> > >>> the same thread.
> > >>>
> > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > >>> for now disable vhost-net in these configurations.
> > >>>
> > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > >>>        
> > >> I actually think this should be a terminal error.  The user asks for
> > >> vhost-net, if we cannot enable it, we should exit.
> > >>
> > >> Or we should warn the user that they should expect bad performance.
> > >> Silently doing something that the user has explicitly asked us not
> > >> to do is not a good behavior.
> > >>
> > >> Regards,
> > >>
> > >> Anthony Liguori
> > >>      
> > > The issue is that user has no control of the guest, and can not know
> > > whether the guest enables MSI. So what you ask for will just make
> > > some guests fail, and others fail sometimes.
> > > The user also has no way to know that version X of kvm does not expose a
> > > way to inject level interrupts with irqfd.
> > >
> > > We could have *another* flag that says "use vhost where it helps" but
> > > then I think this is what everyone wants to do, anyway, and libvirt
> > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > flag.
> > >    
> > 
> > In the very least, there needs to be a vhost=force.
> > 
> > Having some sort of friendly default policy is fine but we need to 
> > provide a mechanism for a user to have the final say.  If you want to 
> > redefine vhost=on to really mean, use the friendly default, that's fine 
> > by me, but only if the vhost=force option exists.
> > 
> > I actually would think libvirt would want to use vhost=force.  Debugging 
> > with vhost=on is going to be a royal pain in the ass if a user reports 
> > bad performance.  Given the libvirt XML, you can't actually tell from 
> > the guest and the XML whether or not vhost was actually in use or not.
> 
> If we add a force option, let's please distinguish hotplug from VM
> creation time.  The latter can abort.  Hotplug should print an error and
> fail the initfn.

It can't abort at init - MSI is disabled at init, it needs to be enabled
by the guest later. And aborting the guest in the middle of the run
is a very bad idea.

What vhostforce=true will do is force vhost backend to be used even if
it is slower.

>  Thanks,
> 
> Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-21  9:55           ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-21  9:55 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > >    
> > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > >>      
> > >>> When MSI is off, each interrupt needs to be bounced through the io
> > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > >>> higher CPU utilization than userspace virtio which handles networking in
> > >>> the same thread.
> > >>>
> > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > >>> for now disable vhost-net in these configurations.
> > >>>
> > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > >>>        
> > >> I actually think this should be a terminal error.  The user asks for
> > >> vhost-net, if we cannot enable it, we should exit.
> > >>
> > >> Or we should warn the user that they should expect bad performance.
> > >> Silently doing something that the user has explicitly asked us not
> > >> to do is not a good behavior.
> > >>
> > >> Regards,
> > >>
> > >> Anthony Liguori
> > >>      
> > > The issue is that user has no control of the guest, and can not know
> > > whether the guest enables MSI. So what you ask for will just make
> > > some guests fail, and others fail sometimes.
> > > The user also has no way to know that version X of kvm does not expose a
> > > way to inject level interrupts with irqfd.
> > >
> > > We could have *another* flag that says "use vhost where it helps" but
> > > then I think this is what everyone wants to do, anyway, and libvirt
> > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > flag.
> > >    
> > 
> > In the very least, there needs to be a vhost=force.
> > 
> > Having some sort of friendly default policy is fine but we need to 
> > provide a mechanism for a user to have the final say.  If you want to 
> > redefine vhost=on to really mean, use the friendly default, that's fine 
> > by me, but only if the vhost=force option exists.
> > 
> > I actually would think libvirt would want to use vhost=force.  Debugging 
> > with vhost=on is going to be a royal pain in the ass if a user reports 
> > bad performance.  Given the libvirt XML, you can't actually tell from 
> > the guest and the XML whether or not vhost was actually in use or not.
> 
> If we add a force option, let's please distinguish hotplug from VM
> creation time.  The latter can abort.  Hotplug should print an error and
> fail the initfn.

It can't abort at init - MSI is disabled at init, it needs to be enabled
by the guest later. And aborting the guest in the middle of the run
is a very bad idea.

What vhostforce=true will do is force vhost backend to be used even if
it is slower.

>  Thanks,
> 
> Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  9:55           ` Michael S. Tsirkin
@ 2011-01-21 13:19             ` Alex Williamson
  -1 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-21 13:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > >    
> > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > >>      
> > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > > >>> higher CPU utilization than userspace virtio which handles networking in
> > > >>> the same thread.
> > > >>>
> > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > >>> for now disable vhost-net in these configurations.
> > > >>>
> > > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > > >>>        
> > > >> I actually think this should be a terminal error.  The user asks for
> > > >> vhost-net, if we cannot enable it, we should exit.
> > > >>
> > > >> Or we should warn the user that they should expect bad performance.
> > > >> Silently doing something that the user has explicitly asked us not
> > > >> to do is not a good behavior.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Anthony Liguori
> > > >>      
> > > > The issue is that user has no control of the guest, and can not know
> > > > whether the guest enables MSI. So what you ask for will just make
> > > > some guests fail, and others fail sometimes.
> > > > The user also has no way to know that version X of kvm does not expose a
> > > > way to inject level interrupts with irqfd.
> > > >
> > > > We could have *another* flag that says "use vhost where it helps" but
> > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > > flag.
> > > >    
> > > 
> > > In the very least, there needs to be a vhost=force.
> > > 
> > > Having some sort of friendly default policy is fine but we need to 
> > > provide a mechanism for a user to have the final say.  If you want to 
> > > redefine vhost=on to really mean, use the friendly default, that's fine 
> > > by me, but only if the vhost=force option exists.
> > > 
> > > I actually would think libvirt would want to use vhost=force.  Debugging 
> > > with vhost=on is going to be a royal pain in the ass if a user reports 
> > > bad performance.  Given the libvirt XML, you can't actually tell from 
> > > the guest and the XML whether or not vhost was actually in use or not.
> > 
> > If we add a force option, let's please distinguish hotplug from VM
> > creation time.  The latter can abort.  Hotplug should print an error and
> > fail the initfn.
> 
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run
> is a very bad idea.

Yeah, I was thinking about the ordering of device being added vs guest
enabling MSI this morning.  Waiting until the guest decides to try to
start using the device to NAK it with an abort is very undesirable.
What if when we have vhost=on,force, the device doesn't advertise an
INTx (PCI_INTERRUPT_PIN = 0)?

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-21 13:19             ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-01-21 13:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Quintela, Jes.Sorensen, jasowang, qemu-devel, Juan

On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > >    
> > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > >>      
> > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > > >>> higher CPU utilization than userspace virtio which handles networking in
> > > >>> the same thread.
> > > >>>
> > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > >>> for now disable vhost-net in these configurations.
> > > >>>
> > > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > > >>>        
> > > >> I actually think this should be a terminal error.  The user asks for
> > > >> vhost-net, if we cannot enable it, we should exit.
> > > >>
> > > >> Or we should warn the user that they should expect bad performance.
> > > >> Silently doing something that the user has explicitly asked us not
> > > >> to do is not a good behavior.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Anthony Liguori
> > > >>      
> > > > The issue is that user has no control of the guest, and can not know
> > > > whether the guest enables MSI. So what you ask for will just make
> > > > some guests fail, and others fail sometimes.
> > > > The user also has no way to know that version X of kvm does not expose a
> > > > way to inject level interrupts with irqfd.
> > > >
> > > > We could have *another* flag that says "use vhost where it helps" but
> > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > > flag.
> > > >    
> > > 
> > > In the very least, there needs to be a vhost=force.
> > > 
> > > Having some sort of friendly default policy is fine but we need to 
> > > provide a mechanism for a user to have the final say.  If you want to 
> > > redefine vhost=on to really mean, use the friendly default, that's fine 
> > > by me, but only if the vhost=force option exists.
> > > 
> > > I actually would think libvirt would want to use vhost=force.  Debugging 
> > > with vhost=on is going to be a royal pain in the ass if a user reports 
> > > bad performance.  Given the libvirt XML, you can't actually tell from 
> > > the guest and the XML whether or not vhost was actually in use or not.
> > 
> > If we add a force option, let's please distinguish hotplug from VM
> > creation time.  The latter can abort.  Hotplug should print an error and
> > fail the initfn.
> 
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run
> is a very bad idea.

Yeah, I was thinking about the ordering of device being added vs guest
enabling MSI this morning.  Waiting until the guest decides to try to
start using the device to NAK it with an abort is very undesirable.
What if when we have vhost=on,force, the device doesn't advertise an
INTx (PCI_INTERRUPT_PIN = 0)?

Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21 13:19             ` Alex Williamson
@ 2011-01-21 13:43               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-21 13:43 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Anthony Liguori, kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Fri, Jan 21, 2011 at 06:19:13AM -0700, Alex Williamson wrote:
> On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > > >    
> > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > > >>      
> > > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > > > >>> higher CPU utilization than userspace virtio which handles networking in
> > > > >>> the same thread.
> > > > >>>
> > > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > > >>> for now disable vhost-net in these configurations.
> > > > >>>
> > > > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > > > >>>        
> > > > >> I actually think this should be a terminal error.  The user asks for
> > > > >> vhost-net, if we cannot enable it, we should exit.
> > > > >>
> > > > >> Or we should warn the user that they should expect bad performance.
> > > > >> Silently doing something that the user has explicitly asked us not
> > > > >> to do is not a good behavior.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Anthony Liguori
> > > > >>      
> > > > > The issue is that user has no control of the guest, and can not know
> > > > > whether the guest enables MSI. So what you ask for will just make
> > > > > some guests fail, and others fail sometimes.
> > > > > The user also has no way to know that version X of kvm does not expose a
> > > > > way to inject level interrupts with irqfd.
> > > > >
> > > > > We could have *another* flag that says "use vhost where it helps" but
> > > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > > > flag.
> > > > >    
> > > > 
> > > > In the very least, there needs to be a vhost=force.
> > > > 
> > > > Having some sort of friendly default policy is fine but we need to 
> > > > provide a mechanism for a user to have the final say.  If you want to 
> > > > redefine vhost=on to really mean, use the friendly default, that's fine 
> > > > by me, but only if the vhost=force option exists.
> > > > 
> > > > I actually would think libvirt would want to use vhost=force.  Debugging 
> > > > with vhost=on is going to be a royal pain in the ass if a user reports 
> > > > bad performance.  Given the libvirt XML, you can't actually tell from 
> > > > the guest and the XML whether or not vhost was actually in use or not.
> > > 
> > > If we add a force option, let's please distinguish hotplug from VM
> > > creation time.  The latter can abort.  Hotplug should print an error and
> > > fail the initfn.
> > 
> > It can't abort at init - MSI is disabled at init, it needs to be enabled
> > by the guest later. And aborting the guest in the middle of the run
> > is a very bad idea.
> 
> Yeah, I was thinking about the ordering of device being added vs guest
> enabling MSI this morning.  Waiting until the guest decides to try to
> start using the device to NAK it with an abort is very undesirable.
> What if when we have vhost=on,force, the device doesn't advertise an
> INTx (PCI_INTERRUPT_PIN = 0)?
> 
> Alex

Then we break backward compatibility with old guests.
I don't see what the issue is really:
It is trivial to check that the guest uses MSIX.

-- 
MST

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-21 13:43               ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-01-21 13:43 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Fri, Jan 21, 2011 at 06:19:13AM -0700, Alex Williamson wrote:
> On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > > >    
> > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > > >>      
> > > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > > >>> thread when it's set/cleared, so vhost-net causes more context switches and
> > > > >>> higher CPU utilization than userspace virtio which handles networking in
> > > > >>> the same thread.
> > > > >>>
> > > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > > >>> for now disable vhost-net in these configurations.
> > > > >>>
> > > > >>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > > > >>>        
> > > > >> I actually think this should be a terminal error.  The user asks for
> > > > >> vhost-net, if we cannot enable it, we should exit.
> > > > >>
> > > > >> Or we should warn the user that they should expect bad performance.
> > > > >> Silently doing something that the user has explicitly asked us not
> > > > >> to do is not a good behavior.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Anthony Liguori
> > > > >>      
> > > > > The issue is that user has no control of the guest, and can not know
> > > > > whether the guest enables MSI. So what you ask for will just make
> > > > > some guests fail, and others fail sometimes.
> > > > > The user also has no way to know that version X of kvm does not expose a
> > > > > way to inject level interrupts with irqfd.
> > > > >
> > > > > We could have *another* flag that says "use vhost where it helps" but
> > > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > > already sets vhost=on so I prefer redefining the meaning of an existing
> > > > > flag.
> > > > >    
> > > > 
> > > > In the very least, there needs to be a vhost=force.
> > > > 
> > > > Having some sort of friendly default policy is fine but we need to 
> > > > provide a mechanism for a user to have the final say.  If you want to 
> > > > redefine vhost=on to really mean, use the friendly default, that's fine 
> > > > by me, but only if the vhost=force option exists.
> > > > 
> > > > I actually would think libvirt would want to use vhost=force.  Debugging 
> > > > with vhost=on is going to be a royal pain in the ass if a user reports 
> > > > bad performance.  Given the libvirt XML, you can't actually tell from 
> > > > the guest and the XML whether or not vhost was actually in use or not.
> > > 
> > > If we add a force option, let's please distinguish hotplug from VM
> > > creation time.  The latter can abort.  Hotplug should print an error and
> > > fail the initfn.
> > 
> > It can't abort at init - MSI is disabled at init, it needs to be enabled
> > by the guest later. And aborting the guest in the middle of the run
> > is a very bad idea.
> 
> Yeah, I was thinking about the ordering of device being added vs guest
> enabling MSI this morning.  Waiting until the guest decides to try to
> start using the device to NAK it with an abort is very undesirable.
> What if when we have vhost=on,force, the device doesn't advertise an
> INTx (PCI_INTERRUPT_PIN = 0)?
> 
> Alex

Then we break backward compatibility with old guests.
I don't see what the issue is really:
It is trivial to check that the guest uses MSIX.

-- 
MST

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  9:48       ` Michael S. Tsirkin
@ 2011-01-21 14:37         ` Anthony Liguori
  0 siblings, 0 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-21 14:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On 01/21/2011 03:48 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote:
>    
>> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
>>      
>>> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>>>        
>>>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>>>          
>>>>> When MSI is off, each interrupt needs to be bounced through the io
>>>>> thread when it's set/cleared, so vhost-net causes more context switches and
>>>>> higher CPU utilization than userspace virtio which handles networking in
>>>>> the same thread.
>>>>>
>>>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>>>> for now disable vhost-net in these configurations.
>>>>>
>>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>>>            
>>>> I actually think this should be a terminal error.  The user asks for
>>>> vhost-net, if we cannot enable it, we should exit.
>>>>
>>>> Or we should warn the user that they should expect bad performance.
>>>> Silently doing something that the user has explicitly asked us not
>>>> to do is not a good behavior.
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>>          
>>> The issue is that user has no control of the guest, and can not know
>>> whether the guest enables MSI. So what you ask for will just make
>>> some guests fail, and others fail sometimes.
>>> The user also has no way to know that version X of kvm does not expose a
>>> way to inject level interrupts with irqfd.
>>>
>>> We could have *another* flag that says "use vhost where it helps" but
>>> then I think this is what everyone wants to do, anyway, and libvirt
>>> already sets vhost=on so I prefer redefining the meaning of an existing
>>> flag.
>>>        
>> In the very least, there needs to be a vhost=force.
>> Having some sort of friendly default policy is fine but we need to
>> provide a mechanism for a user to have the final say.  If you want
>> to redefine vhost=on to really mean, use the friendly default,
>> that's fine by me, but only if the vhost=force option exists.
>>      
> OK, I will add that, probably as a separate flag as vhost
> is a boolean.  This will get worse performance but it will be what the
> user asked for.
>
>    
>> I actually would think libvirt would want to use vhost=force.
>> Debugging with vhost=on is going to be a royal pain in the ass if a
>> user reports bad performance.  Given the libvirt XML, you can't
>> actually tell from the guest and the XML whether or not vhost was
>> actually in use or not.
>>      
> Yes you can: check MSI enabled in the guest, if it is -
> check vhost enabled in the XML. Not that bad at all, is it?
>    

Until you automatically detect level triggered interrupt support for 
irqfd.  This means it's also dependent on a kernel feature too.

Is there any way to tell in QEMU that vhost was silently disabled?

Regards,

Anthony Liguori

>    
>> Regards,
>>
>> Anthony Liguori
>>      
> We get worse performance without MSI anyway, how is this different?
>
>    
>>> Maybe this is best handled by a documentation update?
>>>
>>> We always said:
>>> 	    "                use vhost=on to enable experimental in kernel accelerator\n"
>>>
>>> note 'enable' not 'require'. This is similar to how we specify
>>> nvectors : you can not make guest use the feature.
>>>
>>> How about this:
>>>
>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>> index 898561d..3c937c1 100644
>>> --- a/qemu-options.hx
>>> +++ b/qemu-options.hx
>>> @@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
>>>       "                use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
>>>       "                use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
>>>       "                use vhost=on to enable experimental in kernel accelerator\n"
>>> +    "                (note: vhost=on has no effect unless guest uses MSI-X)\n"
>>>       "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
>>>   #endif
>>>       "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
>>>
>>>
>>>        
>    


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-21  9:55           ` Michael S. Tsirkin
@ 2011-01-21 14:40             ` Anthony Liguori
  -1 siblings, 0 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-21 14:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On 01/21/2011 03:55 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
>    
>> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
>>      
>>> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
>>>        
>>>> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>>>>
>>>>          
>>>>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>>>>
>>>>>            
>>>>>> When MSI is off, each interrupt needs to be bounced through the io
>>>>>> thread when it's set/cleared, so vhost-net causes more context switches and
>>>>>> higher CPU utilization than userspace virtio which handles networking in
>>>>>> the same thread.
>>>>>>
>>>>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>>>>> for now disable vhost-net in these configurations.
>>>>>>
>>>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>>>>
>>>>>>              
>>>>> I actually think this should be a terminal error.  The user asks for
>>>>> vhost-net, if we cannot enable it, we should exit.
>>>>>
>>>>> Or we should warn the user that they should expect bad performance.
>>>>> Silently doing something that the user has explicitly asked us not
>>>>> to do is not a good behavior.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Anthony Liguori
>>>>>
>>>>>            
>>>> The issue is that user has no control of the guest, and can not know
>>>> whether the guest enables MSI. So what you ask for will just make
>>>> some guests fail, and others fail sometimes.
>>>> The user also has no way to know that version X of kvm does not expose a
>>>> way to inject level interrupts with irqfd.
>>>>
>>>> We could have *another* flag that says "use vhost where it helps" but
>>>> then I think this is what everyone wants to do, anyway, and libvirt
>>>> already sets vhost=on so I prefer redefining the meaning of an existing
>>>> flag.
>>>>
>>>>          
>>> In the very least, there needs to be a vhost=force.
>>>
>>> Having some sort of friendly default policy is fine but we need to
>>> provide a mechanism for a user to have the final say.  If you want to
>>> redefine vhost=on to really mean, use the friendly default, that's fine
>>> by me, but only if the vhost=force option exists.
>>>
>>> I actually would think libvirt would want to use vhost=force.  Debugging
>>> with vhost=on is going to be a royal pain in the ass if a user reports
>>> bad performance.  Given the libvirt XML, you can't actually tell from
>>> the guest and the XML whether or not vhost was actually in use or not.
>>>        
>> If we add a force option, let's please distinguish hotplug from VM
>> creation time.  The latter can abort.  Hotplug should print an error and
>> fail the initfn.
>>      
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run
> is a very bad idea.
>
> What vhostforce=true will do is force vhost backend to be used even if
> it is slower.
>    

vhost=on,vhostforce=false              use vhost if we think it will 
improve performance
vhost=on,vhostforce=true               always use vhost
vhost=off,vhostforce=*                    do not use vhost

Regards,

Anthony Liguori

>    
>>   Thanks,
>>
>> Alex
>>      
>    


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-01-21 14:40             ` Anthony Liguori
  0 siblings, 0 replies; 37+ messages in thread
From: Anthony Liguori @ 2011-01-21 14:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On 01/21/2011 03:55 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
>    
>> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
>>      
>>> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
>>>        
>>>> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>>>>
>>>>          
>>>>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>>>>
>>>>>            
>>>>>> When MSI is off, each interrupt needs to be bounced through the io
>>>>>> thread when it's set/cleared, so vhost-net causes more context switches and
>>>>>> higher CPU utilization than userspace virtio which handles networking in
>>>>>> the same thread.
>>>>>>
>>>>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>>>>> for now disable vhost-net in these configurations.
>>>>>>
>>>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>>>>
>>>>>>              
>>>>> I actually think this should be a terminal error.  The user asks for
>>>>> vhost-net, if we cannot enable it, we should exit.
>>>>>
>>>>> Or we should warn the user that they should expect bad performance.
>>>>> Silently doing something that the user has explicitly asked us not
>>>>> to do is not a good behavior.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Anthony Liguori
>>>>>
>>>>>            
>>>> The issue is that user has no control of the guest, and can not know
>>>> whether the guest enables MSI. So what you ask for will just make
>>>> some guests fail, and others fail sometimes.
>>>> The user also has no way to know that version X of kvm does not expose a
>>>> way to inject level interrupts with irqfd.
>>>>
>>>> We could have *another* flag that says "use vhost where it helps" but
>>>> then I think this is what everyone wants to do, anyway, and libvirt
>>>> already sets vhost=on so I prefer redefining the meaning of an existing
>>>> flag.
>>>>
>>>>          
>>> In the very least, there needs to be a vhost=force.
>>>
>>> Having some sort of friendly default policy is fine but we need to
>>> provide a mechanism for a user to have the final say.  If you want to
>>> redefine vhost=on to really mean, use the friendly default, that's fine
>>> by me, but only if the vhost=force option exists.
>>>
>>> I actually would think libvirt would want to use vhost=force.  Debugging
>>> with vhost=on is going to be a royal pain in the ass if a user reports
>>> bad performance.  Given the libvirt XML, you can't actually tell from
>>> the guest and the XML whether or not vhost was actually in use or not.
>>>        
>> If we add a force option, let's please distinguish hotplug from VM
>> creation time.  The latter can abort.  Hotplug should print an error and
>> fail the initfn.
>>      
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run
> is a very bad idea.
>
> What vhostforce=true will do is force vhost backend to be used even if
> it is slower.
>    

vhost=on,vhostforce=false              use vhost if we think it will 
improve performance
vhost=on,vhostforce=true               always use vhost
vhost=off,vhostforce=*                    do not use vhost

Regards,

Anthony Liguori

>    
>>   Thanks,
>>
>> Alex
>>      
>    

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-01-20 15:35 ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-03-09 12:19   ` rukhsana ansari
  -1 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-09 12:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

[-- Attachment #1: Type: text/plain, Size: 671 bytes --]

Hi,

On Thu, Jan 20, 2011 at 9:05 PM, Michael S. Tsirkin <mst@redhat.com> wrote:

> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
>
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
>
>
> This maybe a novice question - Would appreciate it if you can you provide a
pointer to documentation or relevant code that explains what is the
limitation in supporting level irq support in kvm irqfd.


Thanks
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 1016 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-03-09 12:19   ` rukhsana ansari
  0 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-09 12:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

[-- Attachment #1: Type: text/plain, Size: 671 bytes --]

Hi,

On Thu, Jan 20, 2011 at 9:05 PM, Michael S. Tsirkin <mst@redhat.com> wrote:

> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
>
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
>
>
> This maybe a novice question - Would appreciate it if you can you provide a
pointer to documentation or relevant code that explains what is the
limitation in supporting level irq support in kvm irqfd.


Thanks
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 1016 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-03-09 12:19   ` [Qemu-devel] " rukhsana ansari
@ 2011-03-14 17:05     ` rukhsana ansari
  -1 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-14 17:05 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel, kvm
  Cc: Jes.Sorensen, Alex Williamson, jasowang, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]

Seeking clarification to the original question I posted:
>>
>>
> This maybe a novice question - Would appreciate it if you can you provide
a
> pointer to documentation or relevant code that explains what is the
> *limitation in supporting level irq support in kvm irqfd.*
>
>
>
After browsing the KVM kernel code, it does look like direct assignment of
PCI devices allows support for level-triggered interrupts to be injected to
the guest from the kernel.  (*as opposed to not supporting it for vhost
irqfd mechanism*)
This occurs when the guest device supports INTX.
Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
calls kvm_set_irq()
with the guest_irq.
This function in turn invokes the assigned set function  (either
kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip
creation time when kvm_setup_default_irq_routing () called for handling
ioctl KVM_CREATE_IRQCHIP.

So, it isn't clear why level-triggered interrupt isn't supported for irqfd
mechanism.
Would greatly appreciate clarification here

Thanks
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 1186 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-03-14 17:05     ` rukhsana ansari
  0 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-14 17:05 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel, kvm
  Cc: Jes.Sorensen, Alex Williamson, jasowang, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]

Seeking clarification to the original question I posted:
>>
>>
> This maybe a novice question - Would appreciate it if you can you provide
a
> pointer to documentation or relevant code that explains what is the
> *limitation in supporting level irq support in kvm irqfd.*
>
>
>
After browsing the KVM kernel code, it does look like direct assignment of
PCI devices allows support for level-triggered interrupts to be injected to
the guest from the kernel.  (*as opposed to not supporting it for vhost
irqfd mechanism*)
This occurs when the guest device supports INTX.
Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
calls kvm_set_irq()
with the guest_irq.
This function in turn invokes the assigned set function  (either
kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip
creation time when kvm_setup_default_irq_routing () called for handling
ioctl KVM_CREATE_IRQCHIP.

So, it isn't clear why level-triggered interrupt isn't supported for irqfd
mechanism.
Would greatly appreciate clarification here

Thanks
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 1186 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-03-14 17:05     ` [Qemu-devel] " rukhsana ansari
@ 2011-03-14 19:00       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-03-14 19:00 UTC (permalink / raw)
  To: rukhsana ansari
  Cc: qemu-devel, kvm, Alex Williamson, Juan Quintela, jasowang, Jes.Sorensen

On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> Seeking clarification to the original question I posted:
> >>
> >>
> > This maybe a novice question - Would appreciate it if you can you provide a
> > pointer to documentation or relevant code that explains what is the
> > limitation in supporting level irq support in kvm irqfd.
> >
> >
> >
> After browsing the KVM kernel code, it does look like direct assignment of PCI
> devices allows support for level-triggered interrupts to be injected to the
> guest from the kernel.  (as opposed to not supporting it for vhost irqfd
> mechanism)
> This occurs when the guest device supports INTX.
> Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c calls
> kvm_set_irq()
> with the guest_irq.
> This function in turn invokes the assigned set function  (either
> kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip creation
> time when kvm_setup_default_irq_routing () called for handling ioctl
> KVM_CREATE_IRQCHIP.
> 
> So, it isn't clear why level-triggered interrupt isn't supported for irqfd
> mechanism.
> Would greatly appreciate clarification here
> 
> Thanks
> -Rukhsana
> 

Mostly, no one came up with an implementation so far.

If the point is to use irqfd with vhost-net, there's also
a question of adding interfaces to
1. pass IO read transactions directly to another kernel module
2. add an interface to clear the irq level

Maybe the right thing is to combine the two somehow:
irqfd might get an oiption to set a bit in memory,
ioeventfd might get an option to read and clear from memory
and clear irqfd line at the same time.

-- 
MST

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-03-14 19:00       ` Michael S. Tsirkin
  0 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2011-03-14 19:00 UTC (permalink / raw)
  To: rukhsana ansari
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel, Alex Williamson

On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> Seeking clarification to the original question I posted:
> >>
> >>
> > This maybe a novice question - Would appreciate it if you can you provide a
> > pointer to documentation or relevant code that explains what is the
> > limitation in supporting level irq support in kvm irqfd.
> >
> >
> >
> After browsing the KVM kernel code, it does look like direct assignment of PCI
> devices allows support for level-triggered interrupts to be injected to the
> guest from the kernel.  (as opposed to not supporting it for vhost irqfd
> mechanism)
> This occurs when the guest device supports INTX.
> Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c calls
> kvm_set_irq()
> with the guest_irq.
> This function in turn invokes the assigned set function  (either
> kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip creation
> time when kvm_setup_default_irq_routing () called for handling ioctl
> KVM_CREATE_IRQCHIP.
> 
> So, it isn't clear why level-triggered interrupt isn't supported for irqfd
> mechanism.
> Would greatly appreciate clarification here
> 
> Thanks
> -Rukhsana
> 

Mostly, no one came up with an implementation so far.

If the point is to use irqfd with vhost-net, there's also
a question of adding interfaces to
1. pass IO read transactions directly to another kernel module
2. add an interface to clear the irq level

Maybe the right thing is to combine the two somehow:
irqfd might get an oiption to set a bit in memory,
ioeventfd might get an option to read and clear from memory
and clear irqfd line at the same time.

-- 
MST

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
  2011-03-14 19:00       ` Michael S. Tsirkin
@ 2011-03-14 19:31         ` Alex Williamson
  -1 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-03-14 19:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: rukhsana ansari, qemu-devel, kvm, Juan Quintela, jasowang, Jes.Sorensen

On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > Seeking clarification to the original question I posted:
> > >>
> > >>
> > > This maybe a novice question - Would appreciate it if you can you provide a
> > > pointer to documentation or relevant code that explains what is the
> > > limitation in supporting level irq support in kvm irqfd.
> > >
> > >
> > >
> > After browsing the KVM kernel code, it does look like direct assignment of PCI
> > devices allows support for level-triggered interrupts to be injected to the
> > guest from the kernel.  (as opposed to not supporting it for vhost irqfd
> > mechanism)
> > This occurs when the guest device supports INTX.
> > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c calls
> > kvm_set_irq()
> > with the guest_irq.
> > This function in turn invokes the assigned set function  (either
> > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip creation
> > time when kvm_setup_default_irq_routing () called for handling ioctl
> > KVM_CREATE_IRQCHIP.
> > 
> > So, it isn't clear why level-triggered interrupt isn't supported for irqfd
> > mechanism.
> > Would greatly appreciate clarification here
> > 
> > Thanks
> > -Rukhsana
> > 
> 
> Mostly, no one came up with an implementation so far.
> 
> If the point is to use irqfd with vhost-net, there's also
> a question of adding interfaces to
> 1. pass IO read transactions directly to another kernel module
> 2. add an interface to clear the irq level
> 
> Maybe the right thing is to combine the two somehow:
> irqfd might get an oiption to set a bit in memory,
> ioeventfd might get an option to read and clear from memory
> and clear irqfd line at the same time.

I had wanted this for VFIO too and it gets pretty complicated.  The
first problem with level triggered interrupts is that you need to know
which GSI your device triggers.  This means translating PCI INTA through
bridge swizzles and chipset mapping to an IOAPIC.  Current device
assignment does this through a complete hack in qemu.  Then you can set
the IRQ, but being level triggered, we need to know when the guest has
serviced the IRQ so we can de-assert it.  This requires a hook into the
in-kernel APIC to sent the EOI back out to userspace.

I posted RFC patches for doing all this a while back, but they didn't go
anywhere.  I think the feeling was that it was too intrusive for "slow"
interrupts.  The current thinking for VFIO based device assignment is to
use qemu for level interrupts until we find something that actually
needs low latency in this path.  We generally consider INTx to be like
supporting i/o port space or non-4k BARs, ie. necessary for
compatibility, but not necessarily a performance path.  High performance
devices should always be using some kind of MSI because it bypasses all
of the APIC complications and slowness.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-03-14 19:31         ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-03-14 19:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: rukhsana ansari, kvm, Juan Quintela, Jes.Sorensen, jasowang, qemu-devel

On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > Seeking clarification to the original question I posted:
> > >>
> > >>
> > > This maybe a novice question - Would appreciate it if you can you provide a
> > > pointer to documentation or relevant code that explains what is the
> > > limitation in supporting level irq support in kvm irqfd.
> > >
> > >
> > >
> > After browsing the KVM kernel code, it does look like direct assignment of PCI
> > devices allows support for level-triggered interrupts to be injected to the
> > guest from the kernel.  (as opposed to not supporting it for vhost irqfd
> > mechanism)
> > This occurs when the guest device supports INTX.
> > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c calls
> > kvm_set_irq()
> > with the guest_irq.
> > This function in turn invokes the assigned set function  (either
> > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip creation
> > time when kvm_setup_default_irq_routing () called for handling ioctl
> > KVM_CREATE_IRQCHIP.
> > 
> > So, it isn't clear why level-triggered interrupt isn't supported for irqfd
> > mechanism.
> > Would greatly appreciate clarification here
> > 
> > Thanks
> > -Rukhsana
> > 
> 
> Mostly, no one came up with an implementation so far.
> 
> If the point is to use irqfd with vhost-net, there's also
> a question of adding interfaces to
> 1. pass IO read transactions directly to another kernel module
> 2. add an interface to clear the irq level
> 
> Maybe the right thing is to combine the two somehow:
> irqfd might get an oiption to set a bit in memory,
> ioeventfd might get an option to read and clear from memory
> and clear irqfd line at the same time.

I had wanted this for VFIO too and it gets pretty complicated.  The
first problem with level triggered interrupts is that you need to know
which GSI your device triggers.  This means translating PCI INTA through
bridge swizzles and chipset mapping to an IOAPIC.  Current device
assignment does this through a complete hack in qemu.  Then you can set
the IRQ, but being level triggered, we need to know when the guest has
serviced the IRQ so we can de-assert it.  This requires a hook into the
in-kernel APIC to sent the EOI back out to userspace.

I posted RFC patches for doing all this a while back, but they didn't go
anywhere.  I think the feeling was that it was too intrusive for "slow"
interrupts.  The current thinking for VFIO based device assignment is to
use qemu for level interrupts until we find something that actually
needs low latency in this path.  We generally consider INTx to be like
supporting i/o port space or non-4k BARs, ie. necessary for
compatibility, but not necessarily a performance path.  High performance
devices should always be using some kind of MSI because it bypasses all
of the APIC complications and slowness.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] vhost: force vhost off for non-MSI guests
  2011-03-14 19:31         ` Alex Williamson
@ 2011-03-17 15:34           ` rukhsana ansari
  -1 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-17 15:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, Michael S. Tsirkin,
	qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3243 bytes --]

Alex, Michael,

Thank you for the clarification.

On Tue, Mar 15, 2011 at 1:01 AM, Alex Williamson <alex.williamson@redhat.com
> wrote:

> On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> > On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > > Seeking clarification to the original question I posted:
> > > >>
> > > >>
> > > > This maybe a novice question - Would appreciate it if you can you
> provide a
> > > > pointer to documentation or relevant code that explains what is the
> > > > limitation in supporting level irq support in kvm irqfd.
> > > >
> > > >
> > > >
> > > After browsing the KVM kernel code, it does look like direct assignment
> of PCI
> > > devices allows support for level-triggered interrupts to be injected to
> the
> > > guest from the kernel.  (as opposed to not supporting it for vhost
> irqfd
> > > mechanism)
> > > This occurs when the guest device supports INTX.
> > > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
> calls
> > > kvm_set_irq()
> > > with the guest_irq.
> > > This function in turn invokes the assigned set function  (either
> > > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip
> creation
> > > time when kvm_setup_default_irq_routing () called for handling ioctl
> > > KVM_CREATE_IRQCHIP.
> > >
> > > So, it isn't clear why level-triggered interrupt isn't supported for
> irqfd
> > > mechanism.
> > > Would greatly appreciate clarification here
> > >
> > > Thanks
> > > -Rukhsana
> > >
> >
> > Mostly, no one came up with an implementation so far.
> >
> > If the point is to use irqfd with vhost-net, there's also
> > a question of adding interfaces to
> > 1. pass IO read transactions directly to another kernel module
> > 2. add an interface to clear the irq level
> >
> > Maybe the right thing is to combine the two somehow:
> > irqfd might get an oiption to set a bit in memory,
> > ioeventfd might get an option to read and clear from memory
> > and clear irqfd line at the same time.
>
> I had wanted this for VFIO too and it gets pretty complicated.  The
> first problem with level triggered interrupts is that you need to know
> which GSI your device triggers.  This means translating PCI INTA through
> bridge swizzles and chipset mapping to an IOAPIC.  Current device
> assignment does this through a complete hack in qemu.  Then you can set
> the IRQ, but being level triggered, we need to know when the guest has
> serviced the IRQ so we can de-assert it.  This requires a hook into the
> in-kernel APIC to sent the EOI back out to userspace.
>
> I posted RFC patches for doing all this a while back, but they didn't go
> anywhere.  I think the feeling was that it was too intrusive for "slow"
> interrupts.  The current thinking for VFIO based device assignment is to
> use qemu for level interrupts until we find something that actually
> needs low latency in this path.  We generally consider INTx to be like
> supporting i/o port space or non-4k BARs, ie. necessary for
> compatibility, but not necessarily a performance path.  High performance
> devices should always be using some kind of MSI because it bypasses all
> of the APIC complications and slowness.  Thanks,
>
> Alex
>
>


-- 
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 4065 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
@ 2011-03-17 15:34           ` rukhsana ansari
  0 siblings, 0 replies; 37+ messages in thread
From: rukhsana ansari @ 2011-03-17 15:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Juan Quintela, Jes.Sorensen, jasowang, Michael S. Tsirkin,
	qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3243 bytes --]

Alex, Michael,

Thank you for the clarification.

On Tue, Mar 15, 2011 at 1:01 AM, Alex Williamson <alex.williamson@redhat.com
> wrote:

> On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> > On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > > Seeking clarification to the original question I posted:
> > > >>
> > > >>
> > > > This maybe a novice question - Would appreciate it if you can you
> provide a
> > > > pointer to documentation or relevant code that explains what is the
> > > > limitation in supporting level irq support in kvm irqfd.
> > > >
> > > >
> > > >
> > > After browsing the KVM kernel code, it does look like direct assignment
> of PCI
> > > devices allows support for level-triggered interrupts to be injected to
> the
> > > guest from the kernel.  (as opposed to not supporting it for vhost
> irqfd
> > > mechanism)
> > > This occurs when the guest device supports INTX.
> > > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
> calls
> > > kvm_set_irq()
> > > with the guest_irq.
> > > This function in turn invokes the assigned set function  (either
> > > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip
> creation
> > > time when kvm_setup_default_irq_routing () called for handling ioctl
> > > KVM_CREATE_IRQCHIP.
> > >
> > > So, it isn't clear why level-triggered interrupt isn't supported for
> irqfd
> > > mechanism.
> > > Would greatly appreciate clarification here
> > >
> > > Thanks
> > > -Rukhsana
> > >
> >
> > Mostly, no one came up with an implementation so far.
> >
> > If the point is to use irqfd with vhost-net, there's also
> > a question of adding interfaces to
> > 1. pass IO read transactions directly to another kernel module
> > 2. add an interface to clear the irq level
> >
> > Maybe the right thing is to combine the two somehow:
> > irqfd might get an oiption to set a bit in memory,
> > ioeventfd might get an option to read and clear from memory
> > and clear irqfd line at the same time.
>
> I had wanted this for VFIO too and it gets pretty complicated.  The
> first problem with level triggered interrupts is that you need to know
> which GSI your device triggers.  This means translating PCI INTA through
> bridge swizzles and chipset mapping to an IOAPIC.  Current device
> assignment does this through a complete hack in qemu.  Then you can set
> the IRQ, but being level triggered, we need to know when the guest has
> serviced the IRQ so we can de-assert it.  This requires a hook into the
> in-kernel APIC to sent the EOI back out to userspace.
>
> I posted RFC patches for doing all this a while back, but they didn't go
> anywhere.  I think the feeling was that it was too intrusive for "slow"
> interrupts.  The current thinking for VFIO based device assignment is to
> use qemu for level interrupts until we find something that actually
> needs low latency in this path.  We generally consider INTx to be like
> supporting i/o port space or non-4k BARs, ie. necessary for
> compatibility, but not necessarily a performance path.  High performance
> devices should always be using some kind of MSI because it bypasses all
> of the APIC complications and slowness.  Thanks,
>
> Alex
>
>


-- 
-Rukhsana

[-- Attachment #2: Type: text/html, Size: 4065 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2011-03-17 15:35 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-20 15:35 [PATCH] vhost: force vhost off for non-MSI guests Michael S. Tsirkin
2011-01-20 15:35 ` [Qemu-devel] " Michael S. Tsirkin
2011-01-20 15:43 ` Anthony Liguori
2011-01-20 15:43   ` Anthony Liguori
2011-01-20 16:07   ` Michael S. Tsirkin
2011-01-20 16:07     ` Michael S. Tsirkin
2011-01-21  0:23     ` Anthony Liguori
2011-01-21  1:35       ` Alex Williamson
2011-01-21  1:35         ` Alex Williamson
2011-01-21  9:55         ` Michael S. Tsirkin
2011-01-21  9:55           ` Michael S. Tsirkin
2011-01-21 13:19           ` Alex Williamson
2011-01-21 13:19             ` Alex Williamson
2011-01-21 13:43             ` Michael S. Tsirkin
2011-01-21 13:43               ` Michael S. Tsirkin
2011-01-21 14:40           ` Anthony Liguori
2011-01-21 14:40             ` Anthony Liguori
2011-01-21  9:48       ` Michael S. Tsirkin
2011-01-21 14:37         ` Anthony Liguori
2011-01-20 16:31 ` Sridhar Samudrala
2011-01-20 16:31   ` [Qemu-devel] " Sridhar Samudrala
2011-01-20 17:47   ` Michael S. Tsirkin
2011-01-20 17:47     ` [Qemu-devel] " Michael S. Tsirkin
2011-01-20 23:43     ` Sridhar Samudrala
2011-01-20 23:43       ` [Qemu-devel] " Sridhar Samudrala
2011-01-20 18:05 ` Alex Williamson
2011-01-20 18:05   ` [Qemu-devel] " Alex Williamson
2011-03-09 12:19 ` rukhsana ansari
2011-03-09 12:19   ` [Qemu-devel] " rukhsana ansari
2011-03-14 17:05   ` rukhsana ansari
2011-03-14 17:05     ` [Qemu-devel] " rukhsana ansari
2011-03-14 19:00     ` Michael S. Tsirkin
2011-03-14 19:00       ` Michael S. Tsirkin
2011-03-14 19:31       ` Alex Williamson
2011-03-14 19:31         ` Alex Williamson
2011-03-17 15:34         ` rukhsana ansari
2011-03-17 15:34           ` [Qemu-devel] " rukhsana ansari

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.