* vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 7:42 ` Nicholas A. Bellinger
0 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 7:42 UTC (permalink / raw)
To: Anthony Liguori
Cc: qemu-devel, kvm-devel, target-devel, Stefan Hajnoczi,
Paolo Bonzini, Jan Kiszka, Hannes Reinecke, Michael S. Tsirkin,
Christoph Hellwig, Avi Kivity
Hi Anthony, Stefan & QEMU folks,
So during the process of separating out the patches from Zhi's
vhost-scsi tree this evening, I managed to squash everything down to
nine nicely reviewable patches that apply against the current
qemu.git/master:
Nicholas Bellinger (1):
virtio-scsi: Set max_target=0 during vhost-scsi operation
Stefan Hajnoczi (8):
notifier: add validity check and notify function
virtio-pci: support host notifiers in TCG mode
virtio-pci: check that event notification worked
vhost: Pass device path to vhost_dev_init()
virtio-scsi: Add wwpn and tgpt properties
virtio-scsi: Open and initialize /dev/vhost-scsi
virtio-scsi: Start/stop vhost
vhost-scsi: add -vhost-scsi host device
However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
I/O to actually work against the latest qemu.git/master..
So while doing a (manual) bisection w/ this series to track down the
issue with qemu/master, I managed to run across something else.. With
the vhost-scsi series applied, everything is working as expected up
until the following commit:
commit 1523ed9e1d46b0b54540049d491475ccac7e6421
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date: Thu May 17 10:32:39 2012 -0300
virtio/vhost: Add support for KVM in-kernel MSI injection
This commit ends up triggering the following assert immediately after
starting qemu with virtio-scsi <-> tcm_vhost:
qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
dev->msix_vector_release_notifier' failed.
OK, so adding the following hack allows me to boot:
diff --git a/hw/msix.c b/hw/msix.c
index 59c7a83..6036909 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
{
int vector;
+ if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
+ printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
+ return;
+ }
+
assert(dev->msix_vector_use_notifier &&
dev->msix_vector_release_notifier);
--
and virtio-scsi is then able to load + detect tcm_vhost LUNs as
expected.
However the random I/O performance with commit 1523ed9e1d46b is off by a
couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
block flash using just the previous commit bdd00bdc64ba in Jan's series.
So AFAICT there appears to be a serious performance regression that is
easily reproducible with that patch, which is about as far along as I've
been able to diagnose yet.
Interestingly enough, virtio-scsi-raw performance does not seem to be
effected AFAICT by this regression, and is still able to go ~20K IOPs
with the same workload using commit 1523ed9e1d46b. (Roughly the same as
before)
Does anyone have any idea why commit 1523ed9e1d46b would be killing
vhost / tcm_vhost performance so terribly, or is there something else
that vhost / vhost-scsi should be doing with new code..?
Thanks!
--nab
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 7:42 ` Nicholas A. Bellinger
0 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 7:42 UTC (permalink / raw)
To: Anthony Liguori
Cc: kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
target-devel, Hannes Reinecke, Jan Kiszka, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
Hi Anthony, Stefan & QEMU folks,
So during the process of separating out the patches from Zhi's
vhost-scsi tree this evening, I managed to squash everything down to
nine nicely reviewable patches that apply against the current
qemu.git/master:
Nicholas Bellinger (1):
virtio-scsi: Set max_target=0 during vhost-scsi operation
Stefan Hajnoczi (8):
notifier: add validity check and notify function
virtio-pci: support host notifiers in TCG mode
virtio-pci: check that event notification worked
vhost: Pass device path to vhost_dev_init()
virtio-scsi: Add wwpn and tgpt properties
virtio-scsi: Open and initialize /dev/vhost-scsi
virtio-scsi: Start/stop vhost
vhost-scsi: add -vhost-scsi host device
However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
I/O to actually work against the latest qemu.git/master..
So while doing a (manual) bisection w/ this series to track down the
issue with qemu/master, I managed to run across something else.. With
the vhost-scsi series applied, everything is working as expected up
until the following commit:
commit 1523ed9e1d46b0b54540049d491475ccac7e6421
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date: Thu May 17 10:32:39 2012 -0300
virtio/vhost: Add support for KVM in-kernel MSI injection
This commit ends up triggering the following assert immediately after
starting qemu with virtio-scsi <-> tcm_vhost:
qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
dev->msix_vector_release_notifier' failed.
OK, so adding the following hack allows me to boot:
diff --git a/hw/msix.c b/hw/msix.c
index 59c7a83..6036909 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
{
int vector;
+ if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
+ printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
+ return;
+ }
+
assert(dev->msix_vector_use_notifier &&
dev->msix_vector_release_notifier);
--
and virtio-scsi is then able to load + detect tcm_vhost LUNs as
expected.
However the random I/O performance with commit 1523ed9e1d46b is off by a
couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
block flash using just the previous commit bdd00bdc64ba in Jan's series.
So AFAICT there appears to be a serious performance regression that is
easily reproducible with that patch, which is about as far along as I've
been able to diagnose yet.
Interestingly enough, virtio-scsi-raw performance does not seem to be
effected AFAICT by this regression, and is still able to go ~20K IOPs
with the same workload using commit 1523ed9e1d46b. (Roughly the same as
before)
Does anyone have any idea why commit 1523ed9e1d46b would be killing
vhost / tcm_vhost performance so terribly, or is there something else
that vhost / vhost-scsi should be doing with new code..?
Thanks!
--nab
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 7:42 ` [Qemu-devel] " Nicholas A. Bellinger
@ 2012-07-24 7:57 ` Jan Kiszka
-1 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-24 7:57 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> Hi Anthony, Stefan & QEMU folks,
>
> So during the process of separating out the patches from Zhi's
> vhost-scsi tree this evening, I managed to squash everything down to
> nine nicely reviewable patches that apply against the current
> qemu.git/master:
>
> Nicholas Bellinger (1):
> virtio-scsi: Set max_target=0 during vhost-scsi operation
>
> Stefan Hajnoczi (8):
> notifier: add validity check and notify function
> virtio-pci: support host notifiers in TCG mode
> virtio-pci: check that event notification worked
> vhost: Pass device path to vhost_dev_init()
> virtio-scsi: Add wwpn and tgpt properties
> virtio-scsi: Open and initialize /dev/vhost-scsi
> virtio-scsi: Start/stop vhost
> vhost-scsi: add -vhost-scsi host device
>
>
> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> I/O to actually work against the latest qemu.git/master..
>
> So while doing a (manual) bisection w/ this series to track down the
> issue with qemu/master, I managed to run across something else.. With
> the vhost-scsi series applied, everything is working as expected up
> until the following commit:
>
> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date: Thu May 17 10:32:39 2012 -0300
>
> virtio/vhost: Add support for KVM in-kernel MSI injection
>
>
> This commit ends up triggering the following assert immediately after
> starting qemu with virtio-scsi <-> tcm_vhost:
>
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> dev->msix_vector_release_notifier' failed.
>
> OK, so adding the following hack allows me to boot:
>
> diff --git a/hw/msix.c b/hw/msix.c
> index 59c7a83..6036909 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> {
> int vector;
>
> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> + return;
> + }
> +
> assert(dev->msix_vector_use_notifier &&
> dev->msix_vector_release_notifier);
>
> --
Can you post a backtrace from gdb?
Also, is there a git tree and a way to reproduce this without special
hardware needs?
>
> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> expected.
>
> However the random I/O performance with commit 1523ed9e1d46b is off by a
> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>
> So AFAICT there appears to be a serious performance regression that is
> easily reproducible with that patch, which is about as far along as I've
> been able to diagnose yet.
>
> Interestingly enough, virtio-scsi-raw performance does not seem to be
> effected AFAICT by this regression, and is still able to go ~20K IOPs
> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> before)
>
> Does anyone have any idea why commit 1523ed9e1d46b would be killing
> vhost / tcm_vhost performance so terribly, or is there something else
> that vhost / vhost-scsi should be doing with new code..?
No good idea yet, will have to look closer.
Maybe you are somehow deassigning (via set_guest_notifiers) before
assigning. But that would not yet explain performance regressions. Your
target is exposing MSI-X, isn't it?
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 7:57 ` Jan Kiszka
0 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-24 7:57 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> Hi Anthony, Stefan & QEMU folks,
>
> So during the process of separating out the patches from Zhi's
> vhost-scsi tree this evening, I managed to squash everything down to
> nine nicely reviewable patches that apply against the current
> qemu.git/master:
>
> Nicholas Bellinger (1):
> virtio-scsi: Set max_target=0 during vhost-scsi operation
>
> Stefan Hajnoczi (8):
> notifier: add validity check and notify function
> virtio-pci: support host notifiers in TCG mode
> virtio-pci: check that event notification worked
> vhost: Pass device path to vhost_dev_init()
> virtio-scsi: Add wwpn and tgpt properties
> virtio-scsi: Open and initialize /dev/vhost-scsi
> virtio-scsi: Start/stop vhost
> vhost-scsi: add -vhost-scsi host device
>
>
> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> I/O to actually work against the latest qemu.git/master..
>
> So while doing a (manual) bisection w/ this series to track down the
> issue with qemu/master, I managed to run across something else.. With
> the vhost-scsi series applied, everything is working as expected up
> until the following commit:
>
> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date: Thu May 17 10:32:39 2012 -0300
>
> virtio/vhost: Add support for KVM in-kernel MSI injection
>
>
> This commit ends up triggering the following assert immediately after
> starting qemu with virtio-scsi <-> tcm_vhost:
>
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> dev->msix_vector_release_notifier' failed.
>
> OK, so adding the following hack allows me to boot:
>
> diff --git a/hw/msix.c b/hw/msix.c
> index 59c7a83..6036909 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> {
> int vector;
>
> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> + return;
> + }
> +
> assert(dev->msix_vector_use_notifier &&
> dev->msix_vector_release_notifier);
>
> --
Can you post a backtrace from gdb?
Also, is there a git tree and a way to reproduce this without special
hardware needs?
>
> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> expected.
>
> However the random I/O performance with commit 1523ed9e1d46b is off by a
> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>
> So AFAICT there appears to be a serious performance regression that is
> easily reproducible with that patch, which is about as far along as I've
> been able to diagnose yet.
>
> Interestingly enough, virtio-scsi-raw performance does not seem to be
> effected AFAICT by this regression, and is still able to go ~20K IOPs
> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> before)
>
> Does anyone have any idea why commit 1523ed9e1d46b would be killing
> vhost / tcm_vhost performance so terribly, or is there something else
> that vhost / vhost-scsi should be doing with new code..?
No good idea yet, will have to look closer.
Maybe you are somehow deassigning (via set_guest_notifiers) before
assigning. But that would not yet explain performance regressions. Your
target is exposing MSI-X, isn't it?
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 7:57 ` [Qemu-devel] " Jan Kiszka
@ 2012-07-24 12:05 ` Stefan Hajnoczi
-1 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2012-07-24 12:05 UTC (permalink / raw)
To: Jan Kiszka
Cc: Nicholas A. Bellinger, Anthony Liguori, qemu-devel, kvm-devel,
target-devel, Paolo Bonzini, Hannes Reinecke, Michael S. Tsirkin,
Christoph Hellwig, Avi Kivity
On Tue, Jul 24, 2012 at 8:57 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>> Hi Anthony, Stefan & QEMU folks,
>>
>> So during the process of separating out the patches from Zhi's
>> vhost-scsi tree this evening, I managed to squash everything down to
>> nine nicely reviewable patches that apply against the current
>> qemu.git/master:
>>
>> Nicholas Bellinger (1):
>> virtio-scsi: Set max_target=0 during vhost-scsi operation
>>
>> Stefan Hajnoczi (8):
>> notifier: add validity check and notify function
>> virtio-pci: support host notifiers in TCG mode
>> virtio-pci: check that event notification worked
>> vhost: Pass device path to vhost_dev_init()
>> virtio-scsi: Add wwpn and tgpt properties
>> virtio-scsi: Open and initialize /dev/vhost-scsi
>> virtio-scsi: Start/stop vhost
>> vhost-scsi: add -vhost-scsi host device
>>
>>
>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>> I/O to actually work against the latest qemu.git/master..
>>
>> So while doing a (manual) bisection w/ this series to track down the
>> issue with qemu/master, I managed to run across something else.. With
>> the vhost-scsi series applied, everything is working as expected up
>> until the following commit:
>>
>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>> Date: Thu May 17 10:32:39 2012 -0300
>>
>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>
>>
>> This commit ends up triggering the following assert immediately after
>> starting qemu with virtio-scsi <-> tcm_vhost:
>>
>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>> dev->msix_vector_release_notifier' failed.
>>
>> OK, so adding the following hack allows me to boot:
>>
>> diff --git a/hw/msix.c b/hw/msix.c
>> index 59c7a83..6036909 100644
>> --- a/hw/msix.c
>> +++ b/hw/msix.c
>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>> {
>> int vector;
>>
>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>> + return;
>> + }
>> +
>> assert(dev->msix_vector_use_notifier &&
>> dev->msix_vector_release_notifier);
>>
>> --
>
> Can you post a backtrace from gdb?
>
> Also, is there a git tree and a way to reproduce this without special
> hardware needs?
>
>>
>> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
>> expected.
>>
>> However the random I/O performance with commit 1523ed9e1d46b is off by a
>> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
>> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>>
>> So AFAICT there appears to be a serious performance regression that is
>> easily reproducible with that patch, which is about as far along as I've
>> been able to diagnose yet.
>>
>> Interestingly enough, virtio-scsi-raw performance does not seem to be
>> effected AFAICT by this regression, and is still able to go ~20K IOPs
>> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
>> before)
>>
>> Does anyone have any idea why commit 1523ed9e1d46b would be killing
>> vhost / tcm_vhost performance so terribly, or is there something else
>> that vhost / vhost-scsi should be doing with new code..?
>
> No good idea yet, will have to look closer.
>
> Maybe you are somehow deassigning (via set_guest_notifiers) before
> assigning. But that would not yet explain performance regressions. Your
> target is exposing MSI-X, isn't it?
Regarding performance, have the command-line options to enable irqchip
or anything like that changed after the qemu-kvm.git -> qemu.git
refactoring?
Stefan
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 12:05 ` Stefan Hajnoczi
0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2012-07-24 12:05 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm-devel, Michael S. Tsirkin, qemu-devel, Nicholas A. Bellinger,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On Tue, Jul 24, 2012 at 8:57 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>> Hi Anthony, Stefan & QEMU folks,
>>
>> So during the process of separating out the patches from Zhi's
>> vhost-scsi tree this evening, I managed to squash everything down to
>> nine nicely reviewable patches that apply against the current
>> qemu.git/master:
>>
>> Nicholas Bellinger (1):
>> virtio-scsi: Set max_target=0 during vhost-scsi operation
>>
>> Stefan Hajnoczi (8):
>> notifier: add validity check and notify function
>> virtio-pci: support host notifiers in TCG mode
>> virtio-pci: check that event notification worked
>> vhost: Pass device path to vhost_dev_init()
>> virtio-scsi: Add wwpn and tgpt properties
>> virtio-scsi: Open and initialize /dev/vhost-scsi
>> virtio-scsi: Start/stop vhost
>> vhost-scsi: add -vhost-scsi host device
>>
>>
>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>> I/O to actually work against the latest qemu.git/master..
>>
>> So while doing a (manual) bisection w/ this series to track down the
>> issue with qemu/master, I managed to run across something else.. With
>> the vhost-scsi series applied, everything is working as expected up
>> until the following commit:
>>
>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>> Date: Thu May 17 10:32:39 2012 -0300
>>
>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>
>>
>> This commit ends up triggering the following assert immediately after
>> starting qemu with virtio-scsi <-> tcm_vhost:
>>
>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>> dev->msix_vector_release_notifier' failed.
>>
>> OK, so adding the following hack allows me to boot:
>>
>> diff --git a/hw/msix.c b/hw/msix.c
>> index 59c7a83..6036909 100644
>> --- a/hw/msix.c
>> +++ b/hw/msix.c
>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>> {
>> int vector;
>>
>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>> + return;
>> + }
>> +
>> assert(dev->msix_vector_use_notifier &&
>> dev->msix_vector_release_notifier);
>>
>> --
>
> Can you post a backtrace from gdb?
>
> Also, is there a git tree and a way to reproduce this without special
> hardware needs?
>
>>
>> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
>> expected.
>>
>> However the random I/O performance with commit 1523ed9e1d46b is off by a
>> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
>> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>>
>> So AFAICT there appears to be a serious performance regression that is
>> easily reproducible with that patch, which is about as far along as I've
>> been able to diagnose yet.
>>
>> Interestingly enough, virtio-scsi-raw performance does not seem to be
>> effected AFAICT by this regression, and is still able to go ~20K IOPs
>> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
>> before)
>>
>> Does anyone have any idea why commit 1523ed9e1d46b would be killing
>> vhost / tcm_vhost performance so terribly, or is there something else
>> that vhost / vhost-scsi should be doing with new code..?
>
> No good idea yet, will have to look closer.
>
> Maybe you are somehow deassigning (via set_guest_notifiers) before
> assigning. But that would not yet explain performance regressions. Your
> target is exposing MSI-X, isn't it?
Regarding performance, have the command-line options to enable irqchip
or anything like that changed after the qemu-kvm.git -> qemu.git
refactoring?
Stefan
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 12:05 ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-07-24 12:10 ` Jan Kiszka
-1 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-24 12:10 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Nicholas A. Bellinger, Anthony Liguori, qemu-devel, kvm-devel,
target-devel, Paolo Bonzini, Hannes Reinecke, Michael S. Tsirkin,
Christoph Hellwig, Avi Kivity
On 2012-07-24 14:05, Stefan Hajnoczi wrote:
> On Tue, Jul 24, 2012 at 8:57 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>>> Hi Anthony, Stefan & QEMU folks,
>>>
>>> So during the process of separating out the patches from Zhi's
>>> vhost-scsi tree this evening, I managed to squash everything down to
>>> nine nicely reviewable patches that apply against the current
>>> qemu.git/master:
>>>
>>> Nicholas Bellinger (1):
>>> virtio-scsi: Set max_target=0 during vhost-scsi operation
>>>
>>> Stefan Hajnoczi (8):
>>> notifier: add validity check and notify function
>>> virtio-pci: support host notifiers in TCG mode
>>> virtio-pci: check that event notification worked
>>> vhost: Pass device path to vhost_dev_init()
>>> virtio-scsi: Add wwpn and tgpt properties
>>> virtio-scsi: Open and initialize /dev/vhost-scsi
>>> virtio-scsi: Start/stop vhost
>>> vhost-scsi: add -vhost-scsi host device
>>>
>>>
>>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>>> I/O to actually work against the latest qemu.git/master..
>>>
>>> So while doing a (manual) bisection w/ this series to track down the
>>> issue with qemu/master, I managed to run across something else.. With
>>> the vhost-scsi series applied, everything is working as expected up
>>> until the following commit:
>>>
>>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date: Thu May 17 10:32:39 2012 -0300
>>>
>>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>>
>>>
>>> This commit ends up triggering the following assert immediately after
>>> starting qemu with virtio-scsi <-> tcm_vhost:
>>>
>>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier' failed.
>>>
>>> OK, so adding the following hack allows me to boot:
>>>
>>> diff --git a/hw/msix.c b/hw/msix.c
>>> index 59c7a83..6036909 100644
>>> --- a/hw/msix.c
>>> +++ b/hw/msix.c
>>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>>> {
>>> int vector;
>>>
>>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>>> + return;
>>> + }
>>> +
>>> assert(dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier);
>>>
>>> --
>>
>> Can you post a backtrace from gdb?
>>
>> Also, is there a git tree and a way to reproduce this without special
>> hardware needs?
>>
>>>
>>> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
>>> expected.
>>>
>>> However the random I/O performance with commit 1523ed9e1d46b is off by a
>>> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
>>> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>>>
>>> So AFAICT there appears to be a serious performance regression that is
>>> easily reproducible with that patch, which is about as far along as I've
>>> been able to diagnose yet.
>>>
>>> Interestingly enough, virtio-scsi-raw performance does not seem to be
>>> effected AFAICT by this regression, and is still able to go ~20K IOPs
>>> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
>>> before)
>>>
>>> Does anyone have any idea why commit 1523ed9e1d46b would be killing
>>> vhost / tcm_vhost performance so terribly, or is there something else
>>> that vhost / vhost-scsi should be doing with new code..?
>>
>> No good idea yet, will have to look closer.
>>
>> Maybe you are somehow deassigning (via set_guest_notifiers) before
>> assigning. But that would not yet explain performance regressions. Your
>> target is exposing MSI-X, isn't it?
>
> Regarding performance, have the command-line options to enable irqchip
> or anything like that changed after the qemu-kvm.git -> qemu.git
> refactoring?
Current qemu.git has kernel_irqchip=on by default now, so the command
line should not matter (unless you explicitly turn it off).
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 12:10 ` Jan Kiszka
0 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-24 12:10 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: kvm-devel, Michael S. Tsirkin, qemu-devel, Nicholas A. Bellinger,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On 2012-07-24 14:05, Stefan Hajnoczi wrote:
> On Tue, Jul 24, 2012 at 8:57 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>>> Hi Anthony, Stefan & QEMU folks,
>>>
>>> So during the process of separating out the patches from Zhi's
>>> vhost-scsi tree this evening, I managed to squash everything down to
>>> nine nicely reviewable patches that apply against the current
>>> qemu.git/master:
>>>
>>> Nicholas Bellinger (1):
>>> virtio-scsi: Set max_target=0 during vhost-scsi operation
>>>
>>> Stefan Hajnoczi (8):
>>> notifier: add validity check and notify function
>>> virtio-pci: support host notifiers in TCG mode
>>> virtio-pci: check that event notification worked
>>> vhost: Pass device path to vhost_dev_init()
>>> virtio-scsi: Add wwpn and tgpt properties
>>> virtio-scsi: Open and initialize /dev/vhost-scsi
>>> virtio-scsi: Start/stop vhost
>>> vhost-scsi: add -vhost-scsi host device
>>>
>>>
>>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>>> I/O to actually work against the latest qemu.git/master..
>>>
>>> So while doing a (manual) bisection w/ this series to track down the
>>> issue with qemu/master, I managed to run across something else.. With
>>> the vhost-scsi series applied, everything is working as expected up
>>> until the following commit:
>>>
>>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date: Thu May 17 10:32:39 2012 -0300
>>>
>>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>>
>>>
>>> This commit ends up triggering the following assert immediately after
>>> starting qemu with virtio-scsi <-> tcm_vhost:
>>>
>>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier' failed.
>>>
>>> OK, so adding the following hack allows me to boot:
>>>
>>> diff --git a/hw/msix.c b/hw/msix.c
>>> index 59c7a83..6036909 100644
>>> --- a/hw/msix.c
>>> +++ b/hw/msix.c
>>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>>> {
>>> int vector;
>>>
>>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>>> + return;
>>> + }
>>> +
>>> assert(dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier);
>>>
>>> --
>>
>> Can you post a backtrace from gdb?
>>
>> Also, is there a git tree and a way to reproduce this without special
>> hardware needs?
>>
>>>
>>> and virtio-scsi is then able to load + detect tcm_vhost LUNs as
>>> expected.
>>>
>>> However the random I/O performance with commit 1523ed9e1d46b is off by a
>>> couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
>>> block flash using just the previous commit bdd00bdc64ba in Jan's series.
>>>
>>> So AFAICT there appears to be a serious performance regression that is
>>> easily reproducible with that patch, which is about as far along as I've
>>> been able to diagnose yet.
>>>
>>> Interestingly enough, virtio-scsi-raw performance does not seem to be
>>> effected AFAICT by this regression, and is still able to go ~20K IOPs
>>> with the same workload using commit 1523ed9e1d46b. (Roughly the same as
>>> before)
>>>
>>> Does anyone have any idea why commit 1523ed9e1d46b would be killing
>>> vhost / tcm_vhost performance so terribly, or is there something else
>>> that vhost / vhost-scsi should be doing with new code..?
>>
>> No good idea yet, will have to look closer.
>>
>> Maybe you are somehow deassigning (via set_guest_notifiers) before
>> assigning. But that would not yet explain performance regressions. Your
>> target is exposing MSI-X, isn't it?
>
> Regarding performance, have the command-line options to enable irqchip
> or anything like that changed after the qemu-kvm.git -> qemu.git
> refactoring?
Current qemu.git has kernel_irqchip=on by default now, so the command
line should not matter (unless you explicitly turn it off).
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 7:57 ` [Qemu-devel] " Jan Kiszka
@ 2012-07-24 20:20 ` Nicholas A. Bellinger
-1 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 20:20 UTC (permalink / raw)
To: Jan Kiszka
Cc: Anthony Liguori, qemu-devel, kvm-devel, target-devel,
Stefan Hajnoczi, Paolo Bonzini, Hannes Reinecke,
Michael S. Tsirkin, Christoph Hellwig, Avi Kivity
On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > Hi Anthony, Stefan & QEMU folks,
> >
<SNIP>
> > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > I/O to actually work against the latest qemu.git/master..
> >
> > So while doing a (manual) bisection w/ this series to track down the
> > issue with qemu/master, I managed to run across something else.. With
> > the vhost-scsi series applied, everything is working as expected up
> > until the following commit:
> >
> > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > Date: Thu May 17 10:32:39 2012 -0300
> >
> > virtio/vhost: Add support for KVM in-kernel MSI injection
> >
> >
> > This commit ends up triggering the following assert immediately after
> > starting qemu with virtio-scsi <-> tcm_vhost:
> >
> > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > dev->msix_vector_release_notifier' failed.
> >
> > OK, so adding the following hack allows me to boot:
> >
> > diff --git a/hw/msix.c b/hw/msix.c
> > index 59c7a83..6036909 100644
> > --- a/hw/msix.c
> > +++ b/hw/msix.c
> > @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> > {
> > int vector;
> >
> > + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> > + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> > + return;
> > + }
> > +
> > assert(dev->msix_vector_use_notifier &&
> > dev->msix_vector_release_notifier);
> >
> > --
>
> Can you post a backtrace from gdb?
>
Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
like the following:
(gdb) run
Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
[Thread debugging using libthread_db enabled]
wwpn = "vhost-scsi0" tpgt = "1"
[New Thread 0x7ffff45f8700 (LWP 26508)]
[New Thread 0x7ffff3bf6700 (LWP 26509)]
[New Thread 0x7ffff33f5700 (LWP 26510)]
vhost_scsi_stop
Failed to clear endpoint
qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff5e8b165 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
#1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
#2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
#3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
#4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
at /usr/src/qemu.git/hw/virtio-pci.c:703
#5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
#6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
#7 0x000000000062f5c9 in virtio_scsi_set_status (vdev=0x1465220, val=<value optimized out>)
at /usr/src/qemu.git/hw/virtio-scsi.c:631
#8 0x0000000000632082 in virtio_set_status (vdev=0x1465220, val=136 '\210') at /usr/src/qemu.git/hw/virtio.c:507
#9 0x0000000000633410 in virtio_reset (opaque=0x6788) at /usr/src/qemu.git/hw/virtio.c:517
#10 0x00000000004d30a9 in virtio_pci_reset (d=0x1463a70) at /usr/src/qemu.git/hw/virtio-pci.c:280
#11 0x00000000004fc909 in qdev_reset_one (dev=0x6788, opaque=0x6788) at /usr/src/qemu.git/hw/qdev.c:207
#12 0x00000000004fc670 in qdev_walk_children (dev=0x1463a70, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:372
#13 0x00000000004ae43d in pci_device_reset (dev=0x6788) at /usr/src/qemu.git/hw/pci.c:163
#14 0x00000000004ae64f in pci_bus_reset (bus=0x1415bd0) at /usr/src/qemu.git/hw/pci.c:206
#15 0x00000000004ae699 in pcibus_reset (qbus=0x6788) at /usr/src/qemu.git/hw/pci.c:213
#16 0x00000000004fc710 in qbus_walk_children (bus=0x1415bd0, devfn=0x4fc8f0 <qdev_reset_one>, busfn=0x6, opaque=0x0)
at /usr/src/qemu.git/hw/qdev.c:349
#17 0x00000000004fc6a3 in qdev_walk_children (dev=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:379
#18 0x00000000004fc745 in qbus_walk_children (bus=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:356
#19 0x00000000004d5822 in qemu_system_reset (report=false) at /usr/src/qemu.git/vl.c:1412
#20 0x00000000004d70bb in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
at /usr/src/qemu.git/vl.c:3647
(gdb)
> Also, is there a git tree and a way to reproduce this without special
> hardware needs?
>
I'll push this series + branches to demonstrate the issue into an public
tree this afternoon.
Also, the particular backend is a Fusion-IO raw block flash device, but
I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
the same type of behavior. (Will double check on that shortly..)
> >
> > and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> > expected.
> >
> > However the random I/O performance with commit 1523ed9e1d46b is off by a
> > couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> > block flash using just the previous commit bdd00bdc64ba in Jan's series.
> >
> > So AFAICT there appears to be a serious performance regression that is
> > easily reproducible with that patch, which is about as far along as I've
> > been able to diagnose yet.
> >
> > Interestingly enough, virtio-scsi-raw performance does not seem to be
> > effected AFAICT by this regression, and is still able to go ~20K IOPs
> > with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> > before)
> >
> > Does anyone have any idea why commit 1523ed9e1d46b would be killing
> > vhost / tcm_vhost performance so terribly, or is there something else
> > that vhost / vhost-scsi should be doing with new code..?
>
> No good idea yet, will have to look closer.
>
<nod>, thanks for your help Jan. ;)
> Maybe you are somehow deassigning (via set_guest_notifiers) before
> assigning. But that would not yet explain performance regressions. Your
> target is exposing MSI-X, isn't it?
>
I believe that is correct.
--nab
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 20:20 ` Nicholas A. Bellinger
0 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 20:20 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > Hi Anthony, Stefan & QEMU folks,
> >
<SNIP>
> > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > I/O to actually work against the latest qemu.git/master..
> >
> > So while doing a (manual) bisection w/ this series to track down the
> > issue with qemu/master, I managed to run across something else.. With
> > the vhost-scsi series applied, everything is working as expected up
> > until the following commit:
> >
> > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > Date: Thu May 17 10:32:39 2012 -0300
> >
> > virtio/vhost: Add support for KVM in-kernel MSI injection
> >
> >
> > This commit ends up triggering the following assert immediately after
> > starting qemu with virtio-scsi <-> tcm_vhost:
> >
> > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > dev->msix_vector_release_notifier' failed.
> >
> > OK, so adding the following hack allows me to boot:
> >
> > diff --git a/hw/msix.c b/hw/msix.c
> > index 59c7a83..6036909 100644
> > --- a/hw/msix.c
> > +++ b/hw/msix.c
> > @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> > {
> > int vector;
> >
> > + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> > + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> > + return;
> > + }
> > +
> > assert(dev->msix_vector_use_notifier &&
> > dev->msix_vector_release_notifier);
> >
> > --
>
> Can you post a backtrace from gdb?
>
Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
like the following:
(gdb) run
Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
[Thread debugging using libthread_db enabled]
wwpn = "vhost-scsi0" tpgt = "1"
[New Thread 0x7ffff45f8700 (LWP 26508)]
[New Thread 0x7ffff3bf6700 (LWP 26509)]
[New Thread 0x7ffff33f5700 (LWP 26510)]
vhost_scsi_stop
Failed to clear endpoint
qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff5e8b165 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
#1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
#2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
#3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
#4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
at /usr/src/qemu.git/hw/virtio-pci.c:703
#5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
#6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
#7 0x000000000062f5c9 in virtio_scsi_set_status (vdev=0x1465220, val=<value optimized out>)
at /usr/src/qemu.git/hw/virtio-scsi.c:631
#8 0x0000000000632082 in virtio_set_status (vdev=0x1465220, val=136 '\210') at /usr/src/qemu.git/hw/virtio.c:507
#9 0x0000000000633410 in virtio_reset (opaque=0x6788) at /usr/src/qemu.git/hw/virtio.c:517
#10 0x00000000004d30a9 in virtio_pci_reset (d=0x1463a70) at /usr/src/qemu.git/hw/virtio-pci.c:280
#11 0x00000000004fc909 in qdev_reset_one (dev=0x6788, opaque=0x6788) at /usr/src/qemu.git/hw/qdev.c:207
#12 0x00000000004fc670 in qdev_walk_children (dev=0x1463a70, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:372
#13 0x00000000004ae43d in pci_device_reset (dev=0x6788) at /usr/src/qemu.git/hw/pci.c:163
#14 0x00000000004ae64f in pci_bus_reset (bus=0x1415bd0) at /usr/src/qemu.git/hw/pci.c:206
#15 0x00000000004ae699 in pcibus_reset (qbus=0x6788) at /usr/src/qemu.git/hw/pci.c:213
#16 0x00000000004fc710 in qbus_walk_children (bus=0x1415bd0, devfn=0x4fc8f0 <qdev_reset_one>, busfn=0x6, opaque=0x0)
at /usr/src/qemu.git/hw/qdev.c:349
#17 0x00000000004fc6a3 in qdev_walk_children (dev=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:379
#18 0x00000000004fc745 in qbus_walk_children (bus=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:356
#19 0x00000000004d5822 in qemu_system_reset (report=false) at /usr/src/qemu.git/vl.c:1412
#20 0x00000000004d70bb in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
at /usr/src/qemu.git/vl.c:3647
(gdb)
> Also, is there a git tree and a way to reproduce this without special
> hardware needs?
>
I'll push this series + branches to demonstrate the issue into an public
tree this afternoon.
Also, the particular backend is a Fusion-IO raw block flash device, but
I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
the same type of behavior. (Will double check on that shortly..)
> >
> > and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> > expected.
> >
> > However the random I/O performance with commit 1523ed9e1d46b is off by a
> > couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> > block flash using just the previous commit bdd00bdc64ba in Jan's series.
> >
> > So AFAICT there appears to be a serious performance regression that is
> > easily reproducible with that patch, which is about as far along as I've
> > been able to diagnose yet.
> >
> > Interestingly enough, virtio-scsi-raw performance does not seem to be
> > effected AFAICT by this regression, and is still able to go ~20K IOPs
> > with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> > before)
> >
> > Does anyone have any idea why commit 1523ed9e1d46b would be killing
> > vhost / tcm_vhost performance so terribly, or is there something else
> > that vhost / vhost-scsi should be doing with new code..?
>
> No good idea yet, will have to look closer.
>
<nod>, thanks for your help Jan. ;)
> Maybe you are somehow deassigning (via set_guest_notifiers) before
> assigning. But that would not yet explain performance regressions. Your
> target is exposing MSI-X, isn't it?
>
I believe that is correct.
--nab
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 20:20 ` [Qemu-devel] " Nicholas A. Bellinger
@ 2012-07-24 21:43 ` Nicholas A. Bellinger
-1 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 21:43 UTC (permalink / raw)
To: Jan Kiszka
Cc: Anthony Liguori, qemu-devel, kvm-devel, target-devel,
Stefan Hajnoczi, Paolo Bonzini, Hannes Reinecke,
Michael S. Tsirkin, Christoph Hellwig, Avi Kivity, Jens Axboe
On Tue, 2012-07-24 at 13:20 -0700, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> > On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > > Hi Anthony, Stefan & QEMU folks,
> > >
>
> <SNIP>
>
> > > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > > I/O to actually work against the latest qemu.git/master..
> > >
> > > So while doing a (manual) bisection w/ this series to track down the
> > > issue with qemu/master, I managed to run across something else.. With
> > > the vhost-scsi series applied, everything is working as expected up
> > > until the following commit:
> > >
> > > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > > Date: Thu May 17 10:32:39 2012 -0300
> > >
> > > virtio/vhost: Add support for KVM in-kernel MSI injection
> > >
> > >
> > > This commit ends up triggering the following assert immediately after
> > > starting qemu with virtio-scsi <-> tcm_vhost:
> > >
> > > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier' failed.
> > >
> > > OK, so adding the following hack allows me to boot:
<SNIP>
> > Also, is there a git tree and a way to reproduce this without special
> > hardware needs?
> >
>
> I'll push this series + branches to demonstrate the issue into an public
> tree this afternoon.
>
OK, the updated vhost-scsi QEMU working tree is now available here:
http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=summary
Please ignore the qemu-kvm.git name for the moment, as it really is
based on upstream qemu. ;)
The branch breakdown looks like:
master 401a663 remove unused QemuOpts parameter from net init functions
* vhost-scsi bdd00bd msix: Add msix_nr_vectors_allocated
vhost-scsi-merge e4fdb8c [ahead 9] vhost-scsi: add -vhost-scsi host device
vhost-scsi-workaround 32ed0db msix: Work-around for vhost-scsi with KVM in-kernel MSI injection
- 'vhost-scsi' contains vhost-scsi patches + pre commit 1523ed9e1d logic
(Running stable at ~67.5K IOPs, see below)
- 'vhost-scsi-merge' contains a vhost-scsi series against yesterday's qemu.git HEAD
(not working atm, but will be sending out for RFC shortly)
- 'vhost-scsi-workaround' contains the msix.c workaround to get commit 1523ed9e1d to boot
(reproducible performance regression, down to 6K IOPs)
> Also, the particular backend is a Fusion-IO raw block flash device, but
> I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
> the same type of behavior. (Will double check on that shortly..)
>
Interestingly enough, after running some test with the vhost-scsi branch
above with the last good working commit (bdd00bd), the same fio
benchmark inline below is now reaching ~67.5K IOPs using v3.5-rc2 guest
+ host..
rw=randrw
rwmixwrite=25
rwmixread=75
size=32768m
ioengine=libaio
direct=1
iodepth=64
blocksize=4k
numjobs=2
filename=/dev/sdb
This ends up being within a few percentage of what the bare-metal HW is
capable of.. (~70K IOPs per LUN for an ioDrive Duo, Jen's CC'ed). Note
the older QEMU vhost-scsi tree based on pre 1.1.0 code was doing ~60K
IOPs for the same test..
So this looks promising pre commit 1523ed9e1d vs. what we've been using
with vhost-scsi so far.. ;)
--nab
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-24 21:43 ` Nicholas A. Bellinger
0 siblings, 0 replies; 16+ messages in thread
From: Nicholas A. Bellinger @ 2012-07-24 21:43 UTC (permalink / raw)
To: Jan Kiszka
Cc: Jens Axboe, kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi,
qemu-devel, target-devel, Hannes Reinecke, Anthony Liguori,
Paolo Bonzini, Christoph Hellwig, Avi Kivity
On Tue, 2012-07-24 at 13:20 -0700, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> > On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > > Hi Anthony, Stefan & QEMU folks,
> > >
>
> <SNIP>
>
> > > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > > I/O to actually work against the latest qemu.git/master..
> > >
> > > So while doing a (manual) bisection w/ this series to track down the
> > > issue with qemu/master, I managed to run across something else.. With
> > > the vhost-scsi series applied, everything is working as expected up
> > > until the following commit:
> > >
> > > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > > Date: Thu May 17 10:32:39 2012 -0300
> > >
> > > virtio/vhost: Add support for KVM in-kernel MSI injection
> > >
> > >
> > > This commit ends up triggering the following assert immediately after
> > > starting qemu with virtio-scsi <-> tcm_vhost:
> > >
> > > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier' failed.
> > >
> > > OK, so adding the following hack allows me to boot:
<SNIP>
> > Also, is there a git tree and a way to reproduce this without special
> > hardware needs?
> >
>
> I'll push this series + branches to demonstrate the issue into an public
> tree this afternoon.
>
OK, the updated vhost-scsi QEMU working tree is now available here:
http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=summary
Please ignore the qemu-kvm.git name for the moment, as it really is
based on upstream qemu. ;)
The branch breakdown looks like:
master 401a663 remove unused QemuOpts parameter from net init functions
* vhost-scsi bdd00bd msix: Add msix_nr_vectors_allocated
vhost-scsi-merge e4fdb8c [ahead 9] vhost-scsi: add -vhost-scsi host device
vhost-scsi-workaround 32ed0db msix: Work-around for vhost-scsi with KVM in-kernel MSI injection
- 'vhost-scsi' contains vhost-scsi patches + pre commit 1523ed9e1d logic
(Running stable at ~67.5K IOPs, see below)
- 'vhost-scsi-merge' contains a vhost-scsi series against yesterday's qemu.git HEAD
(not working atm, but will be sending out for RFC shortly)
- 'vhost-scsi-workaround' contains the msix.c workaround to get commit 1523ed9e1d to boot
(reproducible performance regression, down to 6K IOPs)
> Also, the particular backend is a Fusion-IO raw block flash device, but
> I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
> the same type of behavior. (Will double check on that shortly..)
>
Interestingly enough, after running some test with the vhost-scsi branch
above with the last good working commit (bdd00bd), the same fio
benchmark inline below is now reaching ~67.5K IOPs using v3.5-rc2 guest
+ host..
rw=randrw
rwmixwrite=25
rwmixread=75
size=32768m
ioengine=libaio
direct=1
iodepth=64
blocksize=4k
numjobs=2
filename=/dev/sdb
This ends up being within a few percentage of what the bare-metal HW is
capable of.. (~70K IOPs per LUN for an ioDrive Duo, Jen's CC'ed). Note
the older QEMU vhost-scsi tree based on pre 1.1.0 code was doing ~60K
IOPs for the same test..
So this looks promising pre commit 1523ed9e1d vs. what we've been using
with vhost-scsi so far.. ;)
--nab
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 20:20 ` [Qemu-devel] " Nicholas A. Bellinger
@ 2012-07-25 9:26 ` Jan Kiszka
-1 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-25 9:26 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: Anthony Liguori, qemu-devel, kvm-devel, target-devel,
Stefan Hajnoczi, Paolo Bonzini, Hannes Reinecke,
Michael S. Tsirkin, Christoph Hellwig, Avi Kivity
On 2012-07-24 22:20, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
>> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>>> Hi Anthony, Stefan & QEMU folks,
>>>
>
> <SNIP>
>
>>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>>> I/O to actually work against the latest qemu.git/master..
>>>
>>> So while doing a (manual) bisection w/ this series to track down the
>>> issue with qemu/master, I managed to run across something else.. With
>>> the vhost-scsi series applied, everything is working as expected up
>>> until the following commit:
>>>
>>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date: Thu May 17 10:32:39 2012 -0300
>>>
>>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>>
>>>
>>> This commit ends up triggering the following assert immediately after
>>> starting qemu with virtio-scsi <-> tcm_vhost:
>>>
>>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier' failed.
>>>
>>> OK, so adding the following hack allows me to boot:
>>>
>>> diff --git a/hw/msix.c b/hw/msix.c
>>> index 59c7a83..6036909 100644
>>> --- a/hw/msix.c
>>> +++ b/hw/msix.c
>>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>>> {
>>> int vector;
>>>
>>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>>> + return;
>>> + }
>>> +
>>> assert(dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier);
>>>
>>> --
>>
>> Can you post a backtrace from gdb?
>>
>
> Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
> like the following:
>
> (gdb) run
> Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
> [Thread debugging using libthread_db enabled]
> wwpn = "vhost-scsi0" tpgt = "1"
> [New Thread 0x7ffff45f8700 (LWP 26508)]
> [New Thread 0x7ffff3bf6700 (LWP 26509)]
> [New Thread 0x7ffff33f5700 (LWP 26510)]
> vhost_scsi_stop
> Failed to clear endpoint
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> (gdb) bt
> #0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> #1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
> #2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
> #3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
> #4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
There are obviously random parameter values submitted. This is broken.
> at /usr/src/qemu.git/hw/virtio-pci.c:703
> #5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
> #6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
And I suspect this is the origin: A stop on an unstarted vhost device.
Am I right?
But such a bug may not yet explain a performance regression - unless the
spurious stop creates a state that slows things down later on.
Did you check that direct MSI-X vector injection is actually in use
later on?
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-07-25 9:26 ` Jan Kiszka
0 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2012-07-25 9:26 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: kvm-devel, Michael S. Tsirkin, Stefan Hajnoczi, qemu-devel,
target-devel, Hannes Reinecke, Anthony Liguori, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On 2012-07-24 22:20, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
>> On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
>>> Hi Anthony, Stefan & QEMU folks,
>>>
>
> <SNIP>
>
>>> However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
>>> I/O to actually work against the latest qemu.git/master..
>>>
>>> So while doing a (manual) bisection w/ this series to track down the
>>> issue with qemu/master, I managed to run across something else.. With
>>> the vhost-scsi series applied, everything is working as expected up
>>> until the following commit:
>>>
>>> commit 1523ed9e1d46b0b54540049d491475ccac7e6421
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date: Thu May 17 10:32:39 2012 -0300
>>>
>>> virtio/vhost: Add support for KVM in-kernel MSI injection
>>>
>>>
>>> This commit ends up triggering the following assert immediately after
>>> starting qemu with virtio-scsi <-> tcm_vhost:
>>>
>>> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
>>> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier' failed.
>>>
>>> OK, so adding the following hack allows me to boot:
>>>
>>> diff --git a/hw/msix.c b/hw/msix.c
>>> index 59c7a83..6036909 100644
>>> --- a/hw/msix.c
>>> +++ b/hw/msix.c
>>> @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
>>> {
>>> int vector;
>>>
>>> + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
>>> + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
>>> + return;
>>> + }
>>> +
>>> assert(dev->msix_vector_use_notifier &&
>>> dev->msix_vector_release_notifier);
>>>
>>> --
>>
>> Can you post a backtrace from gdb?
>>
>
> Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
> like the following:
>
> (gdb) run
> Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
> [Thread debugging using libthread_db enabled]
> wwpn = "vhost-scsi0" tpgt = "1"
> [New Thread 0x7ffff45f8700 (LWP 26508)]
> [New Thread 0x7ffff3bf6700 (LWP 26509)]
> [New Thread 0x7ffff33f5700 (LWP 26510)]
> vhost_scsi_stop
> Failed to clear endpoint
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> (gdb) bt
> #0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> #1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
> #2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
> #3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
> #4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
There are obviously random parameter values submitted. This is broken.
> at /usr/src/qemu.git/hw/virtio-pci.c:703
> #5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
> #6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
And I suspect this is the origin: A stop on an unstarted vhost device.
Am I right?
But such a bug may not yet explain a performance regression - unless the
spurious stop creates a state that slows things down later on.
Did you check that direct MSI-X vector injection is actually in use
later on?
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: vhost-scsi port to v1.1.0 + MSI-X performance regression
2012-07-24 20:20 ` [Qemu-devel] " Nicholas A. Bellinger
@ 2012-08-13 18:15 ` Michael S. Tsirkin
-1 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2012-08-13 18:15 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: Jan Kiszka, Anthony Liguori, qemu-devel, kvm-devel, target-devel,
Stefan Hajnoczi, Paolo Bonzini, Hannes Reinecke,
Christoph Hellwig, Avi Kivity
On Tue, Jul 24, 2012 at 01:20:56PM -0700, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> > On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > > Hi Anthony, Stefan & QEMU folks,
> > >
>
> <SNIP>
>
> > > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > > I/O to actually work against the latest qemu.git/master..
> > >
> > > So while doing a (manual) bisection w/ this series to track down the
> > > issue with qemu/master, I managed to run across something else.. With
> > > the vhost-scsi series applied, everything is working as expected up
> > > until the following commit:
> > >
> > > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > > Date: Thu May 17 10:32:39 2012 -0300
> > >
> > > virtio/vhost: Add support for KVM in-kernel MSI injection
> > >
> > >
> > > This commit ends up triggering the following assert immediately after
> > > starting qemu with virtio-scsi <-> tcm_vhost:
> > >
> > > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier' failed.
> > >
> > > OK, so adding the following hack allows me to boot:
> > >
> > > diff --git a/hw/msix.c b/hw/msix.c
> > > index 59c7a83..6036909 100644
> > > --- a/hw/msix.c
> > > +++ b/hw/msix.c
> > > @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> > > {
> > > int vector;
> > >
> > > + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> > > + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> > > + return;
> > > + }
> > > +
> > > assert(dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier);
> > >
> > > --
> >
> > Can you post a backtrace from gdb?
> >
>
> Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
> like the following:
>
> (gdb) run
> Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
> [Thread debugging using libthread_db enabled]
> wwpn = "vhost-scsi0" tpgt = "1"
> [New Thread 0x7ffff45f8700 (LWP 26508)]
> [New Thread 0x7ffff3bf6700 (LWP 26509)]
> [New Thread 0x7ffff33f5700 (LWP 26510)]
> vhost_scsi_stop
> Failed to clear endpoint
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> (gdb) bt
> #0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> #1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
> #2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
> #3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
> #4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
> at /usr/src/qemu.git/hw/virtio-pci.c:703
> #5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
> #6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
> #7 0x000000000062f5c9 in virtio_scsi_set_status (vdev=0x1465220, val=<value optimized out>)
> at /usr/src/qemu.git/hw/virtio-scsi.c:631
> #8 0x0000000000632082 in virtio_set_status (vdev=0x1465220, val=136 '\210') at /usr/src/qemu.git/hw/virtio.c:507
> #9 0x0000000000633410 in virtio_reset (opaque=0x6788) at /usr/src/qemu.git/hw/virtio.c:517
This is strange, code shows virtio_reset calls virtio_set_status
with value 0. Why is it 136 here?
Stack corruption?
We actually did have a memory corruptor with virtio scsi:
You sent patch
virtio-scsi: Fix incorrect VirtIOSCSI->cmd_vqs[0] definition
maybe the bug goes away if you apply that?
> #10 0x00000000004d30a9 in virtio_pci_reset (d=0x1463a70) at /usr/src/qemu.git/hw/virtio-pci.c:280
> #11 0x00000000004fc909 in qdev_reset_one (dev=0x6788, opaque=0x6788) at /usr/src/qemu.git/hw/qdev.c:207
> #12 0x00000000004fc670 in qdev_walk_children (dev=0x1463a70, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:372
> #13 0x00000000004ae43d in pci_device_reset (dev=0x6788) at /usr/src/qemu.git/hw/pci.c:163
> #14 0x00000000004ae64f in pci_bus_reset (bus=0x1415bd0) at /usr/src/qemu.git/hw/pci.c:206
> #15 0x00000000004ae699 in pcibus_reset (qbus=0x6788) at /usr/src/qemu.git/hw/pci.c:213
> #16 0x00000000004fc710 in qbus_walk_children (bus=0x1415bd0, devfn=0x4fc8f0 <qdev_reset_one>, busfn=0x6, opaque=0x0)
> at /usr/src/qemu.git/hw/qdev.c:349
> #17 0x00000000004fc6a3 in qdev_walk_children (dev=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:379
> #18 0x00000000004fc745 in qbus_walk_children (bus=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:356
> #19 0x00000000004d5822 in qemu_system_reset (report=false) at /usr/src/qemu.git/vl.c:1412
> #20 0x00000000004d70bb in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
> at /usr/src/qemu.git/vl.c:3647
> (gdb)
>
>
> > Also, is there a git tree and a way to reproduce this without special
> > hardware needs?
> >
>
> I'll push this series + branches to demonstrate the issue into an public
> tree this afternoon.
>
> Also, the particular backend is a Fusion-IO raw block flash device, but
> I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
> the same type of behavior. (Will double check on that shortly..)
>
> > >
> > > and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> > > expected.
> > >
> > > However the random I/O performance with commit 1523ed9e1d46b is off by a
> > > couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> > > block flash using just the previous commit bdd00bdc64ba in Jan's series.
> > >
> > > So AFAICT there appears to be a serious performance regression that is
> > > easily reproducible with that patch, which is about as far along as I've
> > > been able to diagnose yet.
> > >
> > > Interestingly enough, virtio-scsi-raw performance does not seem to be
> > > effected AFAICT by this regression, and is still able to go ~20K IOPs
> > > with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> > > before)
> > >
> > > Does anyone have any idea why commit 1523ed9e1d46b would be killing
> > > vhost / tcm_vhost performance so terribly, or is there something else
> > > that vhost / vhost-scsi should be doing with new code..?
> >
> > No good idea yet, will have to look closer.
> >
>
> <nod>, thanks for your help Jan. ;)
>
> > Maybe you are somehow deassigning (via set_guest_notifiers) before
> > assigning. But that would not yet explain performance regressions. Your
> > target is exposing MSI-X, isn't it?
> >
>
> I believe that is correct.
>
> --nab
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] vhost-scsi port to v1.1.0 + MSI-X performance regression
@ 2012-08-13 18:15 ` Michael S. Tsirkin
0 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2012-08-13 18:15 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: kvm-devel, Stefan Hajnoczi, qemu-devel, target-devel,
Hannes Reinecke, Anthony Liguori, Jan Kiszka, Paolo Bonzini,
Christoph Hellwig, Avi Kivity
On Tue, Jul 24, 2012 at 01:20:56PM -0700, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-24 at 09:57 +0200, Jan Kiszka wrote:
> > On 2012-07-24 09:42, Nicholas A. Bellinger wrote:
> > > Hi Anthony, Stefan & QEMU folks,
> > >
>
> <SNIP>
>
> > > However, thus far I've not been able to get virtio-scsi <-> tcm_vhost
> > > I/O to actually work against the latest qemu.git/master..
> > >
> > > So while doing a (manual) bisection w/ this series to track down the
> > > issue with qemu/master, I managed to run across something else.. With
> > > the vhost-scsi series applied, everything is working as expected up
> > > until the following commit:
> > >
> > > commit 1523ed9e1d46b0b54540049d491475ccac7e6421
> > > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > > Date: Thu May 17 10:32:39 2012 -0300
> > >
> > > virtio/vhost: Add support for KVM in-kernel MSI injection
> > >
> > >
> > > This commit ends up triggering the following assert immediately after
> > > starting qemu with virtio-scsi <-> tcm_vhost:
> > >
> > > qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515:
> > > msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier' failed.
> > >
> > > OK, so adding the following hack allows me to boot:
> > >
> > > diff --git a/hw/msix.c b/hw/msix.c
> > > index 59c7a83..6036909 100644
> > > --- a/hw/msix.c
> > > +++ b/hw/msix.c
> > > @@ -511,6 +511,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
> > > {
> > > int vector;
> > >
> > > + if (!dev->msix_vector_use_notifier && !dev->msix_vector_release_notifier) {
> > > + printf("Hit NULL msix_unset_vector_notifiers for: %s\n", dev->name);
> > > + return;
> > > + }
> > > +
> > > assert(dev->msix_vector_use_notifier &&
> > > dev->msix_vector_release_notifier);
> > >
> > > --
> >
> > Can you post a backtrace from gdb?
> >
>
> Sure, w/o the above patch the backtrace with commit 1523ed9e1d looks
> like the following:
>
> (gdb) run
> Starting program: /usr/src/qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-vhost.git/debian_squeeze_amd64_standard-old.qcow2 -vhost-scsi id=vhost-scsi0,wwpn=naa.600140579ad21088,tpgt=1 -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off
> [Thread debugging using libthread_db enabled]
> wwpn = "vhost-scsi0" tpgt = "1"
> [New Thread 0x7ffff45f8700 (LWP 26508)]
> [New Thread 0x7ffff3bf6700 (LWP 26509)]
> [New Thread 0x7ffff33f5700 (LWP 26510)]
> vhost_scsi_stop
> Failed to clear endpoint
> qemu-system-x86_64: /usr/src/qemu.git/hw/msix.c:515: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> (gdb) bt
> #0 0x00007ffff5e8b165 in raise () from /lib/libc.so.6
> #1 0x00007ffff5e8df70 in abort () from /lib/libc.so.6
> #2 0x00007ffff5e842b1 in __assert_fail () from /lib/libc.so.6
> #3 0x00000000004a84a1 in msix_unset_vector_notifiers (dev=0x1463a70) at /usr/src/qemu.git/hw/msix.c:514
> #4 0x00000000004d2865 in virtio_pci_set_guest_notifiers (opaque=0x6788, assign=136)
> at /usr/src/qemu.git/hw/virtio-pci.c:703
> #5 0x000000000062955f in vhost_dev_stop (hdev=0x126c8a8, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost.c:954
> #6 0x0000000000628989 in vhost_scsi_stop (vs=0x126c890, vdev=0x1465220) at /usr/src/qemu.git/hw/vhost-scsi.c:115
> #7 0x000000000062f5c9 in virtio_scsi_set_status (vdev=0x1465220, val=<value optimized out>)
> at /usr/src/qemu.git/hw/virtio-scsi.c:631
> #8 0x0000000000632082 in virtio_set_status (vdev=0x1465220, val=136 '\210') at /usr/src/qemu.git/hw/virtio.c:507
> #9 0x0000000000633410 in virtio_reset (opaque=0x6788) at /usr/src/qemu.git/hw/virtio.c:517
This is strange, code shows virtio_reset calls virtio_set_status
with value 0. Why is it 136 here?
Stack corruption?
We actually did have a memory corruptor with virtio scsi:
You sent patch
virtio-scsi: Fix incorrect VirtIOSCSI->cmd_vqs[0] definition
maybe the bug goes away if you apply that?
> #10 0x00000000004d30a9 in virtio_pci_reset (d=0x1463a70) at /usr/src/qemu.git/hw/virtio-pci.c:280
> #11 0x00000000004fc909 in qdev_reset_one (dev=0x6788, opaque=0x6788) at /usr/src/qemu.git/hw/qdev.c:207
> #12 0x00000000004fc670 in qdev_walk_children (dev=0x1463a70, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:372
> #13 0x00000000004ae43d in pci_device_reset (dev=0x6788) at /usr/src/qemu.git/hw/pci.c:163
> #14 0x00000000004ae64f in pci_bus_reset (bus=0x1415bd0) at /usr/src/qemu.git/hw/pci.c:206
> #15 0x00000000004ae699 in pcibus_reset (qbus=0x6788) at /usr/src/qemu.git/hw/pci.c:213
> #16 0x00000000004fc710 in qbus_walk_children (bus=0x1415bd0, devfn=0x4fc8f0 <qdev_reset_one>, busfn=0x6, opaque=0x0)
> at /usr/src/qemu.git/hw/qdev.c:349
> #17 0x00000000004fc6a3 in qdev_walk_children (dev=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:379
> #18 0x00000000004fc745 in qbus_walk_children (bus=<value optimized out>, devfn=0x4fc8f0 <qdev_reset_one>,
> busfn=0x4fc510 <qbus_reset_one>, opaque=0x0) at /usr/src/qemu.git/hw/qdev.c:356
> #19 0x00000000004d5822 in qemu_system_reset (report=false) at /usr/src/qemu.git/vl.c:1412
> #20 0x00000000004d70bb in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
> at /usr/src/qemu.git/vl.c:3647
> (gdb)
>
>
> > Also, is there a git tree and a way to reproduce this without special
> > hardware needs?
> >
>
> I'll push this series + branches to demonstrate the issue into an public
> tree this afternoon.
>
> Also, the particular backend is a Fusion-IO raw block flash device, but
> I'm pretty sure that using a TCM RAMDISK into tcm_vhost would exhibit
> the same type of behavior. (Will double check on that shortly..)
>
> > >
> > > and virtio-scsi is then able to load + detect tcm_vhost LUNs as
> > > expected.
> > >
> > > However the random I/O performance with commit 1523ed9e1d46b is off by a
> > > couple of orders of magnitude, ~6K IOPs compared to ~60K IOPs on raw
> > > block flash using just the previous commit bdd00bdc64ba in Jan's series.
> > >
> > > So AFAICT there appears to be a serious performance regression that is
> > > easily reproducible with that patch, which is about as far along as I've
> > > been able to diagnose yet.
> > >
> > > Interestingly enough, virtio-scsi-raw performance does not seem to be
> > > effected AFAICT by this regression, and is still able to go ~20K IOPs
> > > with the same workload using commit 1523ed9e1d46b. (Roughly the same as
> > > before)
> > >
> > > Does anyone have any idea why commit 1523ed9e1d46b would be killing
> > > vhost / tcm_vhost performance so terribly, or is there something else
> > > that vhost / vhost-scsi should be doing with new code..?
> >
> > No good idea yet, will have to look closer.
> >
>
> <nod>, thanks for your help Jan. ;)
>
> > Maybe you are somehow deassigning (via set_guest_notifiers) before
> > assigning. But that would not yet explain performance regressions. Your
> > target is exposing MSI-X, isn't it?
> >
>
> I believe that is correct.
>
> --nab
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-08-13 18:14 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-24 7:42 vhost-scsi port to v1.1.0 + MSI-X performance regression Nicholas A. Bellinger
2012-07-24 7:42 ` [Qemu-devel] " Nicholas A. Bellinger
2012-07-24 7:57 ` Jan Kiszka
2012-07-24 7:57 ` [Qemu-devel] " Jan Kiszka
2012-07-24 12:05 ` Stefan Hajnoczi
2012-07-24 12:05 ` [Qemu-devel] " Stefan Hajnoczi
2012-07-24 12:10 ` Jan Kiszka
2012-07-24 12:10 ` [Qemu-devel] " Jan Kiszka
2012-07-24 20:20 ` Nicholas A. Bellinger
2012-07-24 20:20 ` [Qemu-devel] " Nicholas A. Bellinger
2012-07-24 21:43 ` Nicholas A. Bellinger
2012-07-24 21:43 ` [Qemu-devel] " Nicholas A. Bellinger
2012-07-25 9:26 ` Jan Kiszka
2012-07-25 9:26 ` [Qemu-devel] " Jan Kiszka
2012-08-13 18:15 ` Michael S. Tsirkin
2012-08-13 18:15 ` [Qemu-devel] " Michael S. Tsirkin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.