All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joerg Roedel <joro@8bytes.org>
To: Andreas Hartmann <andihartmann@freenet.de>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
	iommu@lists.linux-foundation.org, Leo Duran <leo.duran@amd.com>,
	Christoph Hellwig <hch@lst.de>,
	device-mapper development <dm-devel@redhat.com>,
	Milan Broz <mbroz@redhat.com>, Jens Axboe <axboe@fb.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach READ FPDMA QUEUED errors since Linux 4.0
Date: Fri, 9 Oct 2015 16:45:28 +0200	[thread overview]
Message-ID: <20151009144528.GA27420@8bytes.org> (raw)
In-Reply-To: <5616C850.2000906@maya.org>

Hi Andreas,

On Thu, Oct 08, 2015 at 09:47:28PM +0200, Andreas Hartmann wrote:
> This time, the oops was caused by the second PCI card I'm passing
> through to another VM (the ath9k card worked fine this time - chance?).
> I added the lspci output to the attached file, too.

I digged a little bit around here and found a 32bit PCI card and plugged
it into the AMD IOMMU box. I could reproduce the problem and here is
patch which fixes it for me. Can you test it too please? I'd like to
send a pull-req with this fix included to Linus for rc5.

Thanks,

	Joerg

>From d07307c04edffaaa045fb83713f8808e55ffa895 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@suse.de>
Date: Fri, 9 Oct 2015 16:23:33 +0200
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach

When a device group is detached from its domain, the iommu
core code calls into the iommu driver to detach each device
individually.

Before this functionality went into the iommu core code, it
was implemented in the drivers, also in the AMD IOMMU
driver as the device alias handling code.

This code is still present, as there might be aliases that
don't exist as real PCI devices (and are therefore invisible
to the iommu core code).

Unfortunatly it might happen now, that a device is unbound
multiple times from its domain, first by the alias handling
code and then by the iommu core code (or vice verca).

This ends up in the do_detach function which dereferences
the dev_data->domain pointer. When the device is already
detached, this pointer is NULL and we get a kernel oops.

Removing the alias code completly is not an option, as that
would also remove the code which handles invisible aliases.
The code could be simplified, but this is too big of a
change outside the merge window.

For now, just check the dev_data->domain pointer in
do_detach and bail out if it is NULL.

Andreas Hartmann <andihartmann@freenet.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f82060e7..08d2775 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2006,6 +2006,15 @@ static void do_detach(struct iommu_dev_data *dev_data)
 {
 	struct amd_iommu *iommu;
 
+	/*
+	 * First check if the device is still attached. It might already
+	 * be detached from its domain because the generic
+	 * iommu_detach_group code detached it and we try again here in
+	 * our alias handling.
+	 */
+	if (!dev_data->domain)
+		return;
+
 	iommu = amd_iommu_rlookup_table[dev_data->devid];
 
 	/* decrease reference counters */
-- 
2.5.1


WARNING: multiple messages have this Message-ID (diff)
From: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
To: Andreas Hartmann <andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>
Cc: linux-pci <linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	device-mapper development
	<dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Mikulas Patocka
	<mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>,
	Milan Broz <mbroz-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach READ FPDMA QUEUED errors since Linux 4.0
Date: Fri, 9 Oct 2015 16:45:28 +0200	[thread overview]
Message-ID: <20151009144528.GA27420@8bytes.org> (raw)
In-Reply-To: <5616C850.2000906-YKS6W9RDU/w@public.gmane.org>

Hi Andreas,

On Thu, Oct 08, 2015 at 09:47:28PM +0200, Andreas Hartmann wrote:
> This time, the oops was caused by the second PCI card I'm passing
> through to another VM (the ath9k card worked fine this time - chance?).
> I added the lspci output to the attached file, too.

I digged a little bit around here and found a 32bit PCI card and plugged
it into the AMD IOMMU box. I could reproduce the problem and here is
patch which fixes it for me. Can you test it too please? I'd like to
send a pull-req with this fix included to Linus for rc5.

Thanks,

	Joerg

>From d07307c04edffaaa045fb83713f8808e55ffa895 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Fri, 9 Oct 2015 16:23:33 +0200
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach

When a device group is detached from its domain, the iommu
core code calls into the iommu driver to detach each device
individually.

Before this functionality went into the iommu core code, it
was implemented in the drivers, also in the AMD IOMMU
driver as the device alias handling code.

This code is still present, as there might be aliases that
don't exist as real PCI devices (and are therefore invisible
to the iommu core code).

Unfortunatly it might happen now, that a device is unbound
multiple times from its domain, first by the alias handling
code and then by the iommu core code (or vice verca).

This ends up in the do_detach function which dereferences
the dev_data->domain pointer. When the device is already
detached, this pointer is NULL and we get a kernel oops.

Removing the alias code completly is not an option, as that
would also remove the code which handles invisible aliases.
The code could be simplified, but this is too big of a
change outside the merge window.

For now, just check the dev_data->domain pointer in
do_detach and bail out if it is NULL.

Andreas Hartmann <andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>
Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/amd_iommu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f82060e7..08d2775 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2006,6 +2006,15 @@ static void do_detach(struct iommu_dev_data *dev_data)
 {
 	struct amd_iommu *iommu;
 
+	/*
+	 * First check if the device is still attached. It might already
+	 * be detached from its domain because the generic
+	 * iommu_detach_group code detached it and we try again here in
+	 * our alias handling.
+	 */
+	if (!dev_data->domain)
+		return;
+
 	iommu = amd_iommu_rlookup_table[dev_data->devid];
 
 	/* decrease reference counters */
-- 
2.5.1

WARNING: multiple messages have this Message-ID (diff)
From: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
To: Andreas Hartmann <andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>
Cc: linux-pci <linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	device-mapper development
	<dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Mikulas Patocka
	<mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>,
	Milan Broz <mbroz-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach READ FPDMA QUEUED errors since Linux 4.0
Date: Fri, 9 Oct 2015 16:45:28 +0200	[thread overview]
Message-ID: <20151009144528.GA27420@8bytes.org> (raw)
In-Reply-To: <5616C850.2000906-YKS6W9RDU/w@public.gmane.org>

Hi Andreas,

On Thu, Oct 08, 2015 at 09:47:28PM +0200, Andreas Hartmann wrote:
> This time, the oops was caused by the second PCI card I'm passing
> through to another VM (the ath9k card worked fine this time - chance?).
> I added the lspci output to the attached file, too.

I digged a little bit around here and found a 32bit PCI card and plugged
it into the AMD IOMMU box. I could reproduce the problem and here is
patch which fixes it for me. Can you test it too please? I'd like to
send a pull-req with this fix included to Linus for rc5.

Thanks,

	Joerg

From d07307c04edffaaa045fb83713f8808e55ffa895 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Fri, 9 Oct 2015 16:23:33 +0200
Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach

When a device group is detached from its domain, the iommu
core code calls into the iommu driver to detach each device
individually.

Before this functionality went into the iommu core code, it
was implemented in the drivers, also in the AMD IOMMU
driver as the device alias handling code.

This code is still present, as there might be aliases that
don't exist as real PCI devices (and are therefore invisible
to the iommu core code).

Unfortunatly it might happen now, that a device is unbound
multiple times from its domain, first by the alias handling
code and then by the iommu core code (or vice verca).

This ends up in the do_detach function which dereferences
the dev_data->domain pointer. When the device is already
detached, this pointer is NULL and we get a kernel oops.

Removing the alias code completly is not an option, as that
would also remove the code which handles invisible aliases.
The code could be simplified, but this is too big of a
change outside the merge window.

For now, just check the dev_data->domain pointer in
do_detach and bail out if it is NULL.

Andreas Hartmann <andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>
Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/amd_iommu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f82060e7..08d2775 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2006,6 +2006,15 @@ static void do_detach(struct iommu_dev_data *dev_data)
 {
 	struct amd_iommu *iommu;
 
+	/*
+	 * First check if the device is still attached. It might already
+	 * be detached from its domain because the generic
+	 * iommu_detach_group code detached it and we try again here in
+	 * our alias handling.
+	 */
+	if (!dev_data->domain)
+		return;
+
 	iommu = amd_iommu_rlookup_table[dev_data->devid];
 
 	/* decrease reference counters */
-- 
2.5.1

  parent reply	other threads:[~2015-10-09 14:45 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-28 17:40 AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0 Andreas Hartmann
2015-07-28 17:50 ` Mike Snitzer
2015-07-28 18:20   ` Andreas Hartmann
2015-07-28 18:58     ` Mike Snitzer
2015-07-28 19:23       ` Andreas Hartmann
2015-07-28 19:31         ` Mike Snitzer
2015-07-28 20:08           ` Andreas Hartmann
2015-07-28 21:24             ` Mike Snitzer
2015-07-29  6:17               ` [dm-devel] " Ondrej Kozina
2015-07-29  6:41                 ` Milan Broz
2015-07-29 17:23                   ` Andreas Hartmann
2015-07-30 20:30                     ` Andreas Hartmann
2015-07-31  7:23                       ` Milan Broz
2015-07-31  7:55                         ` Andreas Hartmann
2015-07-31  8:15                           ` Andreas Hartmann
2015-07-31  8:28                           ` Milan Broz
2015-07-29 10:37               ` Milan Broz
2015-07-28 18:56   ` Andreas Hartmann
2015-07-28 19:29     ` Mike Snitzer
2015-08-01 14:20       ` [dm-devel] " Andreas Hartmann
2015-08-02 13:38         ` Andreas Hartmann
2015-08-02 17:57           ` Mikulas Patocka
2015-08-02 18:48             ` Andreas Hartmann
2015-08-02 18:48               ` Andreas Hartmann
2015-08-03  8:12               ` Joerg Roedel
2015-08-04 14:47                 ` Mike Snitzer
2015-08-04 16:10                   ` Jeff Moyer
     [not found]                     ` <x4937zzm3uc.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-08-04 18:11                       ` Andreas Hartmann
2015-08-04 18:11                         ` Andreas Hartmann
2015-08-07  6:04                         ` Andreas Hartmann
2015-09-20  6:50             ` [dm-devel] " Andreas Hartmann
2015-09-20  6:50               ` Andreas Hartmann
2015-09-29 15:21               ` Joerg Roedel
2015-09-29 15:21                 ` Joerg Roedel
2015-09-29 15:58                 ` Mikulas Patocka
2015-09-29 15:58                   ` Mikulas Patocka
2015-09-29 16:20                   ` Joerg Roedel
2015-09-30 14:52                     ` Andreas Hartmann
2015-09-30 14:52                       ` Andreas Hartmann
2015-10-06 10:13                       ` Joerg Roedel
2015-10-06 18:37                         ` Andreas Hartmann
2015-10-06 18:37                           ` Andreas Hartmann
     [not found]                           ` <56141507.7040103-YKS6W9RDU/w@public.gmane.org>
2015-10-07  2:57                             ` Andreas Hartmann
2015-10-07 16:10                               ` Joerg Roedel
2015-10-07 16:10                                 ` Joerg Roedel
2015-10-07 16:52                                 ` Andreas Hartmann
2015-10-07 16:52                                   ` Andreas Hartmann
2015-10-08 16:39                                   ` Joerg Roedel
2015-10-08 18:21                                     ` Andreas Hartmann
2015-10-08 18:21                                       ` Andreas Hartmann
2015-10-08 19:52                                       ` Andreas Hartmann
2015-10-08 19:52                                         ` Andreas Hartmann
2015-10-09  5:20                                         ` Andreas Hartmann
2015-10-09  5:20                                           ` Andreas Hartmann
2015-10-09  9:15                                           ` Andreas Hartmann
2015-10-09  9:15                                             ` Andreas Hartmann
2015-10-09 14:59                                             ` Joerg Roedel
2015-10-09 14:59                                               ` Joerg Roedel
2015-10-09 17:46                                               ` Andreas Hartmann
2015-10-09 17:46                                                 ` Andreas Hartmann
2015-10-11 12:23                                                 ` Andreas Hartmann
2015-10-11 12:23                                                   ` Andreas Hartmann
2015-10-12 12:07                                                   ` Andreas Hartmann
2015-10-12 12:34                                                 ` Mikulas Patocka
2015-10-07 15:40                           ` Joerg Roedel
2015-10-07 17:02                             ` Andreas Hartmann
2015-10-08 17:30                               ` Joerg Roedel
2015-10-08 18:59                                 ` Andreas Hartmann
2015-10-08 18:59                                   ` Andreas Hartmann
2015-10-08 19:47                                   ` Andreas Hartmann
2015-10-08 19:47                                     ` Andreas Hartmann
2015-10-09 10:40                                     ` Joerg Roedel
2015-10-09 14:45                                     ` Joerg Roedel [this message]
2015-10-09 14:45                                       ` [PATCH] iommu/amd: Fix NULL pointer deref on device detach " Joerg Roedel
2015-10-09 14:45                                       ` Joerg Roedel
2015-10-09 17:42                                       ` Andreas Hartmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151009144528.GA27420@8bytes.org \
    --to=joro@8bytes.org \
    --cc=andihartmann@freenet.de \
    --cc=axboe@fb.com \
    --cc=dm-devel@redhat.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=leo.duran@amd.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=mbroz@redhat.com \
    --cc=mpatocka@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.