linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
@ 2017-03-23  0:30 Laura Abbott
  2017-03-23  5:13 ` Jason Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Laura Abbott @ 2017-03-23  0:30 UTC (permalink / raw)
  To: Christoph Hellwig, Michael S. Tsirkin, Jason Wang
  Cc: Linux Kernel Mailing List, virtualization

Hi,

Fedora has received multiple reports of crashes when running
4.11 as a guest

https://bugzilla.redhat.com/show_bug.cgi?id=1430297
https://bugzilla.redhat.com/show_bug.cgi?id=1434462
https://bugzilla.kernel.org/show_bug.cgi?id=194911
https://bugzilla.redhat.com/show_bug.cgi?id=1433899

The crashes are not always consistent but they are generally
some flavor of oops or GPF in virtio related code. Multiple people
have done bisections (Thank you Thorsten Leemhuis and 
Richard W.M. Jones) and found this commit to be at fault

07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
Author: Christoph Hellwig <hch@lst.de>
Date:   Sun Feb 5 18:15:19 2017 +0100

    virtio_pci: use shared interrupts for virtqueues
    
    This lets IRQ layer handle dispatching IRQs to separate handlers for the
    case where we don't have per-VQ MSI-X vectors, and allows us to greatly
    simplify the code based on the assumption that we always have interrupt
    vector 0 (legacy INTx or config interrupt for MSI-X) available, and
    any other interrupt is request/freed throught the VQ, even if the
    actual interrupt line might be shared in some cases.
    
    This allows removing a great deal of variables keeping track of the
    interrupt state in struct virtio_pci_device, as we can now simply walk the
    list of VQs and deal with per-VQ interrupt handlers there, and only treat
    vector 0 special.
    
    Additionally clean up the VQ allocation code to properly unwind on error
    instead of having a single global cleanup label, which is error prone,
    and in this case also leads to more code.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

:040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M	drivers

It doesn't revert cleanly so we haven't been able to do a clean
test. Any ideas?

Thanks,
Laura

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
  2017-03-23  0:30 [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest Laura Abbott
@ 2017-03-23  5:13 ` Jason Wang
  2017-03-23 14:22   ` Christoph Hellwig
  2017-03-23 15:19   ` Richard W.M. Jones
  0 siblings, 2 replies; 6+ messages in thread
From: Jason Wang @ 2017-03-23  5:13 UTC (permalink / raw)
  To: Laura Abbott, Christoph Hellwig, Michael S. Tsirkin
  Cc: Linux Kernel Mailing List, virtualization

[-- Attachment #1: Type: text/plain, Size: 2268 bytes --]



On 2017年03月23日 08:30, Laura Abbott wrote:
> Hi,
>
> Fedora has received multiple reports of crashes when running
> 4.11 as a guest
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1430297
> https://bugzilla.redhat.com/show_bug.cgi?id=1434462
> https://bugzilla.kernel.org/show_bug.cgi?id=194911
> https://bugzilla.redhat.com/show_bug.cgi?id=1433899
>
> The crashes are not always consistent but they are generally
> some flavor of oops or GPF in virtio related code. Multiple people
> have done bisections (Thank you Thorsten Leemhuis and
> Richard W.M. Jones) and found this commit to be at fault
>
> 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
> commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Sun Feb 5 18:15:19 2017 +0100
>
>      virtio_pci: use shared interrupts for virtqueues
>      
>      This lets IRQ layer handle dispatching IRQs to separate handlers for the
>      case where we don't have per-VQ MSI-X vectors, and allows us to greatly
>      simplify the code based on the assumption that we always have interrupt
>      vector 0 (legacy INTx or config interrupt for MSI-X) available, and
>      any other interrupt is request/freed throught the VQ, even if the
>      actual interrupt line might be shared in some cases.
>      
>      This allows removing a great deal of variables keeping track of the
>      interrupt state in struct virtio_pci_device, as we can now simply walk the
>      list of VQs and deal with per-VQ interrupt handlers there, and only treat
>      vector 0 special.
>      
>      Additionally clean up the VQ allocation code to properly unwind on error
>      instead of having a single global cleanup label, which is error prone,
>      and in this case also leads to more code.
>      
>      Signed-off-by: Christoph Hellwig <hch@lst.de>
>      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> :040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M	drivers
>
> It doesn't revert cleanly so we haven't been able to do a clean
> test. Any ideas?
>
> Thanks,
> Laura

Hello:

Can you try the attached patch to see if it solves the problem? (At 
least it silent KASan warnings for me).

Thanks



[-- Attachment #2: 0001-virtio_pci-fix-out-of-bound-access-for-msix_names.patch --]
[-- Type: text/x-patch, Size: 2041 bytes --]

>From 312859b596e83a2164a8430343d31fce2a5ad808 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Thu, 23 Mar 2017 13:07:16 +0800
Subject: [PATCH] virtio_pci: fix out of bound access for msix_names

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_pci_common.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index df548a6..5905349 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -147,7 +147,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	const char *name = dev_name(&vp_dev->vdev.dev);
-	int i, err = -ENOMEM, allocated_vectors, nvectors;
+	int i, j, err = -ENOMEM, allocated_vectors, nvectors;
 	unsigned flags = PCI_IRQ_MSIX;
 	bool shared = false;
 	u16 msix_vec;
@@ -212,7 +212,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
 	if (!vp_dev->msix_vector_map)
 		goto out_disable_config_irq;
 
-	allocated_vectors = 1; /* vector 0 is the config interrupt */
+	allocated_vectors = j = 1; /* vector 0 is the config interrupt */
 	for (i = 0; i < nvqs; ++i) {
 		if (!names[i]) {
 			vqs[i] = NULL;
@@ -236,18 +236,19 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
 			continue;
 		}
 
-		snprintf(vp_dev->msix_names[i + 1],
+		snprintf(vp_dev->msix_names[j],
 			 sizeof(*vp_dev->msix_names), "%s-%s",
 			 dev_name(&vp_dev->vdev.dev), names[i]);
 		err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
 				  vring_interrupt, IRQF_SHARED,
-				  vp_dev->msix_names[i + 1], vqs[i]);
+				  vp_dev->msix_names[j], vqs[i]);
 		if (err) {
 			/* don't free this irq on error */
 			vp_dev->msix_vector_map[i] = VIRTIO_MSI_NO_VECTOR;
 			goto out_remove_vqs;
 		}
 		vp_dev->msix_vector_map[i] = msix_vec;
+		j++;
 
 		/*
 		 * Use a different vector for each queue if they are available,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
  2017-03-23  5:13 ` Jason Wang
@ 2017-03-23 14:22   ` Christoph Hellwig
  2017-03-23 15:19   ` Richard W.M. Jones
  1 sibling, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2017-03-23 14:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: Laura Abbott, Christoph Hellwig, Michael S. Tsirkin,
	Linux Kernel Mailing List, virtualization

On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
>
>
> On 2017年03月23日 08:30, Laura Abbott wrote:
>> Hi,
>>
>> Fedora has received multiple reports of crashes when running
>> 4.11 as a guest
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1430297
>> https://bugzilla.redhat.com/show_bug.cgi?id=1434462
>> https://bugzilla.kernel.org/show_bug.cgi?id=194911
>> https://bugzilla.redhat.com/show_bug.cgi?id=1433899
>>
>> The crashes are not always consistent but they are generally
>> some flavor of oops or GPF in virtio related code. Multiple people
>> have done bisections (Thank you Thorsten Leemhuis and
>> Richard W.M. Jones) and found this commit to be at fault
>>
>> 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
>> commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
>> Author: Christoph Hellwig <hch@lst.de>
>> Date:   Sun Feb 5 18:15:19 2017 +0100
>>
>>      virtio_pci: use shared interrupts for virtqueues
>>           This lets IRQ layer handle dispatching IRQs to separate handlers 
>> for the
>>      case where we don't have per-VQ MSI-X vectors, and allows us to greatly
>>      simplify the code based on the assumption that we always have interrupt
>>      vector 0 (legacy INTx or config interrupt for MSI-X) available, and
>>      any other interrupt is request/freed throught the VQ, even if the
>>      actual interrupt line might be shared in some cases.
>>           This allows removing a great deal of variables keeping track of 
>> the
>>      interrupt state in struct virtio_pci_device, as we can now simply walk the
>>      list of VQs and deal with per-VQ interrupt handlers there, and only treat
>>      vector 0 special.
>>           Additionally clean up the VQ allocation code to properly unwind 
>> on error
>>      instead of having a single global cleanup label, which is error prone,
>>      and in this case also leads to more code.
>>           Signed-off-by: Christoph Hellwig <hch@lst.de>
>>      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> :040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M	drivers
>>
>> It doesn't revert cleanly so we haven't been able to do a clean
>> test. Any ideas?
>>
>> Thanks,
>> Laura
>
> Hello:
>
> Can you try the attached patch to see if it solves the problem? (At least 
> it silent KASan warnings for me).

This looks like a correct fix to me, independent of fixing the original
bug or not:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
  2017-03-23  5:13 ` Jason Wang
  2017-03-23 14:22   ` Christoph Hellwig
@ 2017-03-23 15:19   ` Richard W.M. Jones
  2017-03-23 16:41     ` Michael S. Tsirkin
  1 sibling, 1 reply; 6+ messages in thread
From: Richard W.M. Jones @ 2017-03-23 15:19 UTC (permalink / raw)
  To: Jason Wang
  Cc: Laura Abbott, Christoph Hellwig, Michael S. Tsirkin,
	Linux Kernel Mailing List, virtualization, Thorsten Leemhuis

On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
> >From 312859b596e83a2164a8430343d31fce2a5ad808 Mon Sep 17 00:00:00 2001
> From: Jason Wang <jasowang@redhat.com>
> Date: Thu, 23 Mar 2017 13:07:16 +0800
> Subject: [PATCH] virtio_pci: fix out of bound access for msix_names
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

I tested this, and it does appear to fix the crashes in
vp_modern_find_vqs.  Therefore:

Tested-by: Richard W.M. Jones <rjones@redhat.com>

Thanks,

Rich.

>  drivers/virtio/virtio_pci_common.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index df548a6..5905349 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -147,7 +147,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
>  {
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	const char *name = dev_name(&vp_dev->vdev.dev);
> -	int i, err = -ENOMEM, allocated_vectors, nvectors;
> +	int i, j, err = -ENOMEM, allocated_vectors, nvectors;
>  	unsigned flags = PCI_IRQ_MSIX;
>  	bool shared = false;
>  	u16 msix_vec;
> @@ -212,7 +212,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
>  	if (!vp_dev->msix_vector_map)
>  		goto out_disable_config_irq;
>  
> -	allocated_vectors = 1; /* vector 0 is the config interrupt */
> +	allocated_vectors = j = 1; /* vector 0 is the config interrupt */
>  	for (i = 0; i < nvqs; ++i) {
>  		if (!names[i]) {
>  			vqs[i] = NULL;
> @@ -236,18 +236,19 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
>  			continue;
>  		}
>  
> -		snprintf(vp_dev->msix_names[i + 1],
> +		snprintf(vp_dev->msix_names[j],
>  			 sizeof(*vp_dev->msix_names), "%s-%s",
>  			 dev_name(&vp_dev->vdev.dev), names[i]);
>  		err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
>  				  vring_interrupt, IRQF_SHARED,
> -				  vp_dev->msix_names[i + 1], vqs[i]);
> +				  vp_dev->msix_names[j], vqs[i]);
>  		if (err) {
>  			/* don't free this irq on error */
>  			vp_dev->msix_vector_map[i] = VIRTIO_MSI_NO_VECTOR;
>  			goto out_remove_vqs;
>  		}
>  		vp_dev->msix_vector_map[i] = msix_vec;
> +		j++;
>  
>  		/*
>  		 * Use a different vector for each queue if they are available,
> -- 
> 2.7.4
> 


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
  2017-03-23 15:19   ` Richard W.M. Jones
@ 2017-03-23 16:41     ` Michael S. Tsirkin
  2017-03-23 17:15       ` Thorsten Leemhuis
  0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2017-03-23 16:41 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Jason Wang, Laura Abbott, Christoph Hellwig,
	Linux Kernel Mailing List, virtualization, Thorsten Leemhuis

On Thu, Mar 23, 2017 at 03:19:07PM +0000, Richard W.M. Jones wrote:
> On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
> > >From 312859b596e83a2164a8430343d31fce2a5ad808 Mon Sep 17 00:00:00 2001
> > From: Jason Wang <jasowang@redhat.com>
> > Date: Thu, 23 Mar 2017 13:07:16 +0800
> > Subject: [PATCH] virtio_pci: fix out of bound access for msix_names
> > 
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> 
> I tested this, and it does appear to fix the crashes in
> vp_modern_find_vqs.  Therefore:
> 
> Tested-by: Richard W.M. Jones <rjones@redhat.com>
> 
> Thanks,
> 
> Rich.

I've queued the fix, thanks everyone!

> >  drivers/virtio/virtio_pci_common.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> > index df548a6..5905349 100644
> > --- a/drivers/virtio/virtio_pci_common.c
> > +++ b/drivers/virtio/virtio_pci_common.c
> > @@ -147,7 +147,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
> >  {
> >  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >  	const char *name = dev_name(&vp_dev->vdev.dev);
> > -	int i, err = -ENOMEM, allocated_vectors, nvectors;
> > +	int i, j, err = -ENOMEM, allocated_vectors, nvectors;
> >  	unsigned flags = PCI_IRQ_MSIX;
> >  	bool shared = false;
> >  	u16 msix_vec;
> > @@ -212,7 +212,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
> >  	if (!vp_dev->msix_vector_map)
> >  		goto out_disable_config_irq;
> >  
> > -	allocated_vectors = 1; /* vector 0 is the config interrupt */
> > +	allocated_vectors = j = 1; /* vector 0 is the config interrupt */
> >  	for (i = 0; i < nvqs; ++i) {
> >  		if (!names[i]) {
> >  			vqs[i] = NULL;
> > @@ -236,18 +236,19 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
> >  			continue;
> >  		}
> >  
> > -		snprintf(vp_dev->msix_names[i + 1],
> > +		snprintf(vp_dev->msix_names[j],
> >  			 sizeof(*vp_dev->msix_names), "%s-%s",
> >  			 dev_name(&vp_dev->vdev.dev), names[i]);
> >  		err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
> >  				  vring_interrupt, IRQF_SHARED,
> > -				  vp_dev->msix_names[i + 1], vqs[i]);
> > +				  vp_dev->msix_names[j], vqs[i]);
> >  		if (err) {
> >  			/* don't free this irq on error */
> >  			vp_dev->msix_vector_map[i] = VIRTIO_MSI_NO_VECTOR;
> >  			goto out_remove_vqs;
> >  		}
> >  		vp_dev->msix_vector_map[i] = msix_vec;
> > +		j++;
> >  
> >  		/*
> >  		 * Use a different vector for each queue if they are available,
> > -- 
> > 2.7.4
> > 
> 
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-builder quickly builds VMs from scratch
> http://libguestfs.org/virt-builder.1.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
  2017-03-23 16:41     ` Michael S. Tsirkin
@ 2017-03-23 17:15       ` Thorsten Leemhuis
  0 siblings, 0 replies; 6+ messages in thread
From: Thorsten Leemhuis @ 2017-03-23 17:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Richard W.M. Jones
  Cc: Jason Wang, Laura Abbott, Christoph Hellwig,
	Linux Kernel Mailing List, virtualization

On 23.03.2017 17:41, Michael S. Tsirkin wrote:
> On Thu, Mar 23, 2017 at 03:19:07PM +0000, Richard W.M. Jones wrote:
>> On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
>>> >From 312859b596e83a2164a8430343d31fce2a5ad808 Mon Sep 17 00:00:00 2001
>>> From: Jason Wang <jasowang@redhat.com>
>>> Date: Thu, 23 Mar 2017 13:07:16 +0800
>>> Subject: [PATCH] virtio_pci: fix out of bound access for msix_names
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> I tested this, and it does appear to fix the crashes in
>> vp_modern_find_vqs.  Therefore:
>> Tested-by: Richard W.M. Jones <rjones@redhat.com>
> I've queued the fix, thanks everyone!

Thx. Feel free to add

Tested-by: Thorsten Leemhuis <linux@leemhuis.info>

as a kernel with that patch works well in my reboot tests so far.
 Ciao, Thorsten

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-03-23 17:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-23  0:30 [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest Laura Abbott
2017-03-23  5:13 ` Jason Wang
2017-03-23 14:22   ` Christoph Hellwig
2017-03-23 15:19   ` Richard W.M. Jones
2017-03-23 16:41     ` Michael S. Tsirkin
2017-03-23 17:15       ` Thorsten Leemhuis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).