All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-09-30 11:02 ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu, linux-kernel, joro, jroedel

Hi,

here is a patch-set to fix an issue recently discovered when
the Intel IOMMU is in use with devices that need RMRR
mappings.

The problem is that the RMRR mappings are destroyed when the
device driver is unbound from the device, causing DMAR
faults.

To solve this problem a device driver core change is
necessary to catch the right point in time for the IOMMU
code to destroy any mappings for a device.

With this patch-set the RMRR mappings are only destroyed
when the device is actually removed from the system.

Please review.

Thanks,

	Joerg

Joerg Roedel (2):
  driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
  iommu/vt-d: Only remove domain when device is removed

 drivers/base/core.c         |  3 +++
 drivers/iommu/intel-iommu.c | 11 +----------
 include/linux/device.h      | 11 ++++++-----
 3 files changed, 10 insertions(+), 15 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-09-30 11:02 ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jroedel-l3A5Bk7waGM, linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi,

here is a patch-set to fix an issue recently discovered when
the Intel IOMMU is in use with devices that need RMRR
mappings.

The problem is that the RMRR mappings are destroyed when the
device driver is unbound from the device, causing DMAR
faults.

To solve this problem a device driver core change is
necessary to catch the right point in time for the IOMMU
code to destroy any mappings for a device.

With this patch-set the RMRR mappings are only destroyed
when the device is actually removed from the system.

Please review.

Thanks,

	Joerg

Joerg Roedel (2):
  driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
  iommu/vt-d: Only remove domain when device is removed

 drivers/base/core.c         |  3 +++
 drivers/iommu/intel-iommu.c | 11 +----------
 include/linux/device.h      | 11 ++++++-----
 3 files changed, 10 insertions(+), 15 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2] driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
@ 2014-09-30 11:02   ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu, linux-kernel, joro, jroedel

From: Joerg Roedel <jroedel@suse.de>

This event closes an important gap in the bus notifiers.
There is already the BUS_NOTIFY_DEL_DEVICE event, but that
is sent when the device is still bound to its device driver.

This is too early for the IOMMU code to destroy any mappings
for the device, as they might still be in use by the driver.

The new BUS_NOTIFY_REMOVED_DEVICE event introduced with this
patch closes this gap as it is sent when the device is
already unbound from its device driver and almost completly
removed from the driver core.

With this event the IOMMU code can safely destroy any
mappings and other data structures when a device is removed.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/base/core.c    |  3 +++
 include/linux/device.h | 11 ++++++-----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 20da3ad..7b270a2 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1211,6 +1211,9 @@ void device_del(struct device *dev)
 	 */
 	if (platform_notify_remove)
 		platform_notify_remove(dev);
+	if (dev->bus)
+		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
+					     BUS_NOTIFY_REMOVED_DEVICE, dev);
 	kobject_uevent(&dev->kobj, KOBJ_REMOVE);
 	cleanup_device_parent(dev);
 	kobject_del(&dev->kobj);
diff --git a/include/linux/device.h b/include/linux/device.h
index 43d183a..d0d5c5d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -181,13 +181,14 @@ extern int bus_unregister_notifier(struct bus_type *bus,
  * with the device lock held in the core, so be careful.
  */
 #define BUS_NOTIFY_ADD_DEVICE		0x00000001 /* device added */
-#define BUS_NOTIFY_DEL_DEVICE		0x00000002 /* device removed */
-#define BUS_NOTIFY_BIND_DRIVER		0x00000003 /* driver about to be
+#define BUS_NOTIFY_DEL_DEVICE		0x00000002 /* device to be removed */
+#define BUS_NOTIFY_REMOVED_DEVICE	0x00000003 /* device removed */
+#define BUS_NOTIFY_BIND_DRIVER		0x00000004 /* driver about to be
 						      bound */
-#define BUS_NOTIFY_BOUND_DRIVER		0x00000004 /* driver bound to device */
-#define BUS_NOTIFY_UNBIND_DRIVER	0x00000005 /* driver about to be
+#define BUS_NOTIFY_BOUND_DRIVER		0x00000005 /* driver bound to device */
+#define BUS_NOTIFY_UNBIND_DRIVER	0x00000006 /* driver about to be
 						      unbound */
-#define BUS_NOTIFY_UNBOUND_DRIVER	0x00000006 /* driver is unbound
+#define BUS_NOTIFY_UNBOUND_DRIVER	0x00000007 /* driver is unbound
 						      from the device */
 
 extern struct kset *bus_get_kset(struct bus_type *bus);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/2] driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
@ 2014-09-30 11:02   ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jroedel-l3A5Bk7waGM, linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>

This event closes an important gap in the bus notifiers.
There is already the BUS_NOTIFY_DEL_DEVICE event, but that
is sent when the device is still bound to its device driver.

This is too early for the IOMMU code to destroy any mappings
for the device, as they might still be in use by the driver.

The new BUS_NOTIFY_REMOVED_DEVICE event introduced with this
patch closes this gap as it is sent when the device is
already unbound from its device driver and almost completly
removed from the driver core.

With this event the IOMMU code can safely destroy any
mappings and other data structures when a device is removed.

Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/base/core.c    |  3 +++
 include/linux/device.h | 11 ++++++-----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 20da3ad..7b270a2 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1211,6 +1211,9 @@ void device_del(struct device *dev)
 	 */
 	if (platform_notify_remove)
 		platform_notify_remove(dev);
+	if (dev->bus)
+		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
+					     BUS_NOTIFY_REMOVED_DEVICE, dev);
 	kobject_uevent(&dev->kobj, KOBJ_REMOVE);
 	cleanup_device_parent(dev);
 	kobject_del(&dev->kobj);
diff --git a/include/linux/device.h b/include/linux/device.h
index 43d183a..d0d5c5d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -181,13 +181,14 @@ extern int bus_unregister_notifier(struct bus_type *bus,
  * with the device lock held in the core, so be careful.
  */
 #define BUS_NOTIFY_ADD_DEVICE		0x00000001 /* device added */
-#define BUS_NOTIFY_DEL_DEVICE		0x00000002 /* device removed */
-#define BUS_NOTIFY_BIND_DRIVER		0x00000003 /* driver about to be
+#define BUS_NOTIFY_DEL_DEVICE		0x00000002 /* device to be removed */
+#define BUS_NOTIFY_REMOVED_DEVICE	0x00000003 /* device removed */
+#define BUS_NOTIFY_BIND_DRIVER		0x00000004 /* driver about to be
 						      bound */
-#define BUS_NOTIFY_BOUND_DRIVER		0x00000004 /* driver bound to device */
-#define BUS_NOTIFY_UNBIND_DRIVER	0x00000005 /* driver about to be
+#define BUS_NOTIFY_BOUND_DRIVER		0x00000005 /* driver bound to device */
+#define BUS_NOTIFY_UNBIND_DRIVER	0x00000006 /* driver about to be
 						      unbound */
-#define BUS_NOTIFY_UNBOUND_DRIVER	0x00000006 /* driver is unbound
+#define BUS_NOTIFY_UNBOUND_DRIVER	0x00000007 /* driver is unbound
 						      from the device */
 
 extern struct kset *bus_get_kset(struct bus_type *bus);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-09-30 11:02   ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu, linux-kernel, joro, jroedel

From: Joerg Roedel <jroedel@suse.de>

This makes sure any RMRR mappings stay in place when the
driver is unbound from the device.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/intel-iommu.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5619f26..eaf825a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3865,16 +3865,7 @@ static int device_notifier(struct notifier_block *nb,
 	if (iommu_dummy(dev))
 		return 0;
 
-	if (action != BUS_NOTIFY_UNBOUND_DRIVER &&
-	    action != BUS_NOTIFY_DEL_DEVICE)
-		return 0;
-
-	/*
-	 * If the device is still attached to a device driver we can't
-	 * tear down the domain yet as DMA mappings may still be in use.
-	 * Wait for the BUS_NOTIFY_UNBOUND_DRIVER event to do that.
-	 */
-	if (action == BUS_NOTIFY_DEL_DEVICE && dev->driver != NULL)
+	if (action != BUS_NOTIFY_REMOVED_DEVICE)
 		return 0;
 
 	domain = find_domain(dev);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-09-30 11:02   ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-09-30 11:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Woodhouse, Jiang Liu
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jroedel-l3A5Bk7waGM, linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>

This makes sure any RMRR mappings stay in place when the
driver is unbound from the device.

Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/intel-iommu.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5619f26..eaf825a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3865,16 +3865,7 @@ static int device_notifier(struct notifier_block *nb,
 	if (iommu_dummy(dev))
 		return 0;
 
-	if (action != BUS_NOTIFY_UNBOUND_DRIVER &&
-	    action != BUS_NOTIFY_DEL_DEVICE)
-		return 0;
-
-	/*
-	 * If the device is still attached to a device driver we can't
-	 * tear down the domain yet as DMA mappings may still be in use.
-	 * Wait for the BUS_NOTIFY_UNBOUND_DRIVER event to do that.
-	 */
-	if (action == BUS_NOTIFY_DEL_DEVICE && dev->driver != NULL)
+	if (action != BUS_NOTIFY_REMOVED_DEVICE)
 		return 0;
 
 	domain = find_domain(dev);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-01 22:35   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2014-10-01 22:35 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: David Woodhouse, Jiang Liu, iommu, linux-kernel, jroedel

On Tue, Sep 30, 2014 at 01:02:01PM +0200, Joerg Roedel wrote:
> Hi,
> 
> here is a patch-set to fix an issue recently discovered when
> the Intel IOMMU is in use with devices that need RMRR
> mappings.
> 
> The problem is that the RMRR mappings are destroyed when the
> device driver is unbound from the device, causing DMAR
> faults.
> 
> To solve this problem a device driver core change is
> necessary to catch the right point in time for the IOMMU
> code to destroy any mappings for a device.
> 
> With this patch-set the RMRR mappings are only destroyed
> when the device is actually removed from the system.
> 
> Please review.

I have no objection to these, do you want me to take them through my
tree?  Or if they are going through some other one feel free to add:

	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

To the first patch.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-01 22:35   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2014-10-01 22:35 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jroedel-l3A5Bk7waGM, David Woodhouse, Jiang Liu,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Sep 30, 2014 at 01:02:01PM +0200, Joerg Roedel wrote:
> Hi,
> 
> here is a patch-set to fix an issue recently discovered when
> the Intel IOMMU is in use with devices that need RMRR
> mappings.
> 
> The problem is that the RMRR mappings are destroyed when the
> device driver is unbound from the device, causing DMAR
> faults.
> 
> To solve this problem a device driver core change is
> necessary to catch the right point in time for the IOMMU
> code to destroy any mappings for a device.
> 
> With this patch-set the RMRR mappings are only destroyed
> when the device is actually removed from the system.
> 
> Please review.

I have no objection to these, do you want me to take them through my
tree?  Or if they are going through some other one feel free to add:

	Acked-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>

To the first patch.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-02  0:30   ` Jerry Hoemann
  0 siblings, 0 replies; 26+ messages in thread
From: Jerry Hoemann @ 2014-10-02  0:30 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Greg Kroah-Hartman, David Woodhouse, Jiang Liu, iommu, jroedel,
	linux-kernel

On Tue, Sep 30, 2014 at 01:02:01PM +0200, Joerg Roedel wrote:
> Hi,
> 
> here is a patch-set to fix an issue recently discovered when
> the Intel IOMMU is in use with devices that need RMRR
> mappings.
> 
> The problem is that the RMRR mappings are destroyed when the
> device driver is unbound from the device, causing DMAR
> faults.
> 
> To solve this problem a device driver core change is
> necessary to catch the right point in time for the IOMMU
> code to destroy any mappings for a device.
> 
> With this patch-set the RMRR mappings are only destroyed
> when the device is actually removed from the system.
> 
> Please review.
> 
> Thanks,
> 
> 	Joerg
> 
> Joerg Roedel (2):
>   driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
>   iommu/vt-d: Only remove domain when device is removed
> 
>  drivers/base/core.c         |  3 +++
>  drivers/iommu/intel-iommu.c | 11 +----------
>  include/linux/device.h      | 11 ++++++-----
>  3 files changed, 10 insertions(+), 15 deletions(-)
> 
> -- 
> 1.9.1


Joerg,

I tested on HP Gen7 and Gen9 systems for which we experience dmar faults
when we rmmod a driver whose device had RMRR regions associated with it.

We don't see problem when patch set is applied.

Thanks,

Tested-by: Jerry Hoemann <jerry.hoemann@hp.com>

-- 

----------------------------------------------------------------------------
Jerry Hoemann            Software Engineer              Hewlett-Packard

3404 E Harmony Rd. MS 36                        phone:  (970) 898-1022
Ft. Collins, CO 80528                           FAX:    (970) 898-0707
                                                email:  jerry.hoemann@hp.com
----------------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-02  0:30   ` Jerry Hoemann
  0 siblings, 0 replies; 26+ messages in thread
From: Jerry Hoemann @ 2014-10-02  0:30 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: jroedel-l3A5Bk7waGM, Greg Kroah-Hartman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	David Woodhouse, Jiang Liu

On Tue, Sep 30, 2014 at 01:02:01PM +0200, Joerg Roedel wrote:
> Hi,
> 
> here is a patch-set to fix an issue recently discovered when
> the Intel IOMMU is in use with devices that need RMRR
> mappings.
> 
> The problem is that the RMRR mappings are destroyed when the
> device driver is unbound from the device, causing DMAR
> faults.
> 
> To solve this problem a device driver core change is
> necessary to catch the right point in time for the IOMMU
> code to destroy any mappings for a device.
> 
> With this patch-set the RMRR mappings are only destroyed
> when the device is actually removed from the system.
> 
> Please review.
> 
> Thanks,
> 
> 	Joerg
> 
> Joerg Roedel (2):
>   driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
>   iommu/vt-d: Only remove domain when device is removed
> 
>  drivers/base/core.c         |  3 +++
>  drivers/iommu/intel-iommu.c | 11 +----------
>  include/linux/device.h      | 11 ++++++-----
>  3 files changed, 10 insertions(+), 15 deletions(-)
> 
> -- 
> 1.9.1


Joerg,

I tested on HP Gen7 and Gen9 systems for which we experience dmar faults
when we rmmod a driver whose device had RMRR regions associated with it.

We don't see problem when patch set is applied.

Thanks,

Tested-by: Jerry Hoemann <jerry.hoemann-VXdhtT5mjnY@public.gmane.org>

-- 

----------------------------------------------------------------------------
Jerry Hoemann            Software Engineer              Hewlett-Packard

3404 E Harmony Rd. MS 36                        phone:  (970) 898-1022
Ft. Collins, CO 80528                           FAX:    (970) 898-0707
                                                email:  jerry.hoemann-VXdhtT5mjnY@public.gmane.org
----------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-02  9:20     ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-10-02  9:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Joerg Roedel, David Woodhouse, Jiang Liu, iommu, linux-kernel

On Wed, Oct 01, 2014 at 03:35:10PM -0700, Greg Kroah-Hartman wrote:
> I have no objection to these, do you want me to take them through my
> tree?  Or if they are going through some other one feel free to add:
> 
> 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> To the first patch.

Thanks Greg! I added the patches to the IOMMU tree.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind
@ 2014-10-02  9:20     ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-10-02  9:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: David Woodhouse, Jiang Liu,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Oct 01, 2014 at 03:35:10PM -0700, Greg Kroah-Hartman wrote:
> I have no objection to these, do you want me to take them through my
> tree?  Or if they are going through some other one feel free to add:
> 
> 	Acked-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
> 
> To the first patch.

Thanks Greg! I added the patches to the IOMMU tree.


	Joerg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-04 16:12     ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-04 16:12 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Greg Kroah-Hartman, David Woodhouse, Jiang Liu, iommu, jroedel,
	linux-kernel, Myron Stowe

On Tue, 2014-09-30 at 13:02 +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> This makes sure any RMRR mappings stay in place when the
> driver is unbound from the device.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/intel-iommu.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5619f26..eaf825a 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3865,16 +3865,7 @@ static int device_notifier(struct notifier_block *nb,
>  	if (iommu_dummy(dev))
>  		return 0;
>  
> -	if (action != BUS_NOTIFY_UNBOUND_DRIVER &&
> -	    action != BUS_NOTIFY_DEL_DEVICE)
> -		return 0;
> -
> -	/*
> -	 * If the device is still attached to a device driver we can't
> -	 * tear down the domain yet as DMA mappings may still be in use.
> -	 * Wait for the BUS_NOTIFY_UNBOUND_DRIVER event to do that.
> -	 */
> -	if (action == BUS_NOTIFY_DEL_DEVICE && dev->driver != NULL)
> +	if (action != BUS_NOTIFY_REMOVED_DEVICE)

I haven't tested it, but I'm concerned whether this has introduced a
domain leak.  If we think about the case of unbinding a device from a
host driver and attaching it to a domain through the IOMMU API, I think
we used to count on this path to call domain_exit(), which made the
domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
this change, isn't the test in intel_iommu_attach_device() now neither
likely nor unlikely and we're only removing the dev_info from the domain
and not destroying the domain itself?  Thanks,

Alex


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-04 16:12     ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-04 16:12 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: jroedel-l3A5Bk7waGM, Greg Kroah-Hartman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

On Tue, 2014-09-30 at 13:02 +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> 
> This makes sure any RMRR mappings stay in place when the
> driver is unbound from the device.
> 
> Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/iommu/intel-iommu.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5619f26..eaf825a 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3865,16 +3865,7 @@ static int device_notifier(struct notifier_block *nb,
>  	if (iommu_dummy(dev))
>  		return 0;
>  
> -	if (action != BUS_NOTIFY_UNBOUND_DRIVER &&
> -	    action != BUS_NOTIFY_DEL_DEVICE)
> -		return 0;
> -
> -	/*
> -	 * If the device is still attached to a device driver we can't
> -	 * tear down the domain yet as DMA mappings may still be in use.
> -	 * Wait for the BUS_NOTIFY_UNBOUND_DRIVER event to do that.
> -	 */
> -	if (action == BUS_NOTIFY_DEL_DEVICE && dev->driver != NULL)
> +	if (action != BUS_NOTIFY_REMOVED_DEVICE)

I haven't tested it, but I'm concerned whether this has introduced a
domain leak.  If we think about the case of unbinding a device from a
host driver and attaching it to a domain through the IOMMU API, I think
we used to count on this path to call domain_exit(), which made the
domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
this change, isn't the test in intel_iommu_attach_device() now neither
likely nor unlikely and we're only removing the dev_info from the domain
and not destroying the domain itself?  Thanks,

Alex

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 12:54       ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-11-06 12:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Joerg Roedel, Greg Kroah-Hartman, David Woodhouse, Jiang Liu,
	iommu, linux-kernel, Myron Stowe

Hi Alex,

On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> I haven't tested it, but I'm concerned whether this has introduced a
> domain leak.  If we think about the case of unbinding a device from a
> host driver and attaching it to a domain through the IOMMU API, I think
> we used to count on this path to call domain_exit(), which made the
> domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> this change, isn't the test in intel_iommu_attach_device() now neither
> likely nor unlikely and we're only removing the dev_info from the domain
> and not destroying the domain itself?  Thanks,

As I see it, there is no leak. The DMA-API domains are kept in the
device_domain_list and re-used when the device driver re-attaches. But
your are right that the unlikely in intel_iommu_attach_device() isn't
true anymore. We could probably remove it.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 12:54       ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-11-06 12:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Greg Kroah-Hartman, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

Hi Alex,

On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> I haven't tested it, but I'm concerned whether this has introduced a
> domain leak.  If we think about the case of unbinding a device from a
> host driver and attaching it to a domain through the IOMMU API, I think
> we used to count on this path to call domain_exit(), which made the
> domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> this change, isn't the test in intel_iommu_attach_device() now neither
> likely nor unlikely and we're only removing the dev_info from the domain
> and not destroying the domain itself?  Thanks,

As I see it, there is no leak. The DMA-API domains are kept in the
device_domain_list and re-used when the device driver re-attaches. But
your are right that the unlikely in intel_iommu_attach_device() isn't
true anymore. We could probably remove it.


	Joerg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 16:16         ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-06 16:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, Greg Kroah-Hartman, David Woodhouse, Jiang Liu,
	iommu, linux-kernel, Myron Stowe

On Thu, 2014-11-06 at 13:54 +0100, Joerg Roedel wrote:
> Hi Alex,
> 
> On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> > I haven't tested it, but I'm concerned whether this has introduced a
> > domain leak.  If we think about the case of unbinding a device from a
> > host driver and attaching it to a domain through the IOMMU API, I think
> > we used to count on this path to call domain_exit(), which made the
> > domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> > this change, isn't the test in intel_iommu_attach_device() now neither
> > likely nor unlikely and we're only removing the dev_info from the domain
> > and not destroying the domain itself?  Thanks,
> 
> As I see it, there is no leak. The DMA-API domains are kept in the
> device_domain_list and re-used when the device driver re-attaches. But
> your are right that the unlikely in intel_iommu_attach_device() isn't
> true anymore. We could probably remove it.

But the domains are unlinked from device_domain_list using
unlink_domain_info() which is called from both domain_remove_dev_info()
and domain_remove_one_dev_info() which are both part of that more
likely, unlikely branch in intel_iommu_attach_device().  So it seems
like any time we switch a device from the DMA-API to the IOMMU-API, we
lose the reference to the domain.  Is that incorrect?  I'll try to test.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 16:16         ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-06 16:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Greg Kroah-Hartman, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

On Thu, 2014-11-06 at 13:54 +0100, Joerg Roedel wrote:
> Hi Alex,
> 
> On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> > I haven't tested it, but I'm concerned whether this has introduced a
> > domain leak.  If we think about the case of unbinding a device from a
> > host driver and attaching it to a domain through the IOMMU API, I think
> > we used to count on this path to call domain_exit(), which made the
> > domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> > this change, isn't the test in intel_iommu_attach_device() now neither
> > likely nor unlikely and we're only removing the dev_info from the domain
> > and not destroying the domain itself?  Thanks,
> 
> As I see it, there is no leak. The DMA-API domains are kept in the
> device_domain_list and re-used when the device driver re-attaches. But
> your are right that the unlikely in intel_iommu_attach_device() isn't
> true anymore. We could probably remove it.

But the domains are unlinked from device_domain_list using
unlink_domain_info() which is called from both domain_remove_dev_info()
and domain_remove_one_dev_info() which are both part of that more
likely, unlikely branch in intel_iommu_attach_device().  So it seems
like any time we switch a device from the DMA-API to the IOMMU-API, we
lose the reference to the domain.  Is that incorrect?  I'll try to test.
Thanks,

Alex

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 16:43           ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-06 16:43 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Greg Kroah-Hartman, linux-kernel, iommu, Myron Stowe,
	David Woodhouse, Jiang Liu

On Thu, 2014-11-06 at 09:16 -0700, Alex Williamson wrote:
> On Thu, 2014-11-06 at 13:54 +0100, Joerg Roedel wrote:
> > Hi Alex,
> > 
> > On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> > > I haven't tested it, but I'm concerned whether this has introduced a
> > > domain leak.  If we think about the case of unbinding a device from a
> > > host driver and attaching it to a domain through the IOMMU API, I think
> > > we used to count on this path to call domain_exit(), which made the
> > > domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> > > this change, isn't the test in intel_iommu_attach_device() now neither
> > > likely nor unlikely and we're only removing the dev_info from the domain
> > > and not destroying the domain itself?  Thanks,
> > 
> > As I see it, there is no leak. The DMA-API domains are kept in the
> > device_domain_list and re-used when the device driver re-attaches. But
> > your are right that the unlikely in intel_iommu_attach_device() isn't
> > true anymore. We could probably remove it.
> 
> But the domains are unlinked from device_domain_list using
> unlink_domain_info() which is called from both domain_remove_dev_info()
> and domain_remove_one_dev_info() which are both part of that more
> likely, unlikely branch in intel_iommu_attach_device().  So it seems
> like any time we switch a device from the DMA-API to the IOMMU-API, we
> lose the reference to the domain.  Is that incorrect?  I'll try to test.

Trying the simple approach, a printk in each of alloc_domain() and
free_domain_mem(), this is what I see when I start and stop a VM with an
assigned device:

alloc_domain(): ffff8801e22ac000
free_domain_mem(ffff8801e22ac000)
alloc_domain(): ffff8801e3425c80

The IOMMU API domain is alloc'd and free'd, then a new DMA-API domain is
alloc'd.  There are no frees of the DMA-API domain.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-11-06 16:43           ` Alex Williamson
  0 siblings, 0 replies; 26+ messages in thread
From: Alex Williamson @ 2014-11-06 16:43 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Greg Kroah-Hartman, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

On Thu, 2014-11-06 at 09:16 -0700, Alex Williamson wrote:
> On Thu, 2014-11-06 at 13:54 +0100, Joerg Roedel wrote:
> > Hi Alex,
> > 
> > On Tue, Nov 04, 2014 at 09:12:17AM -0700, Alex Williamson wrote:
> > > I haven't tested it, but I'm concerned whether this has introduced a
> > > domain leak.  If we think about the case of unbinding a device from a
> > > host driver and attaching it to a domain through the IOMMU API, I think
> > > we used to count on this path to call domain_exit(), which made the
> > > domain_context_mapped() in intel_iommu_attach_device() "unlikely".  With
> > > this change, isn't the test in intel_iommu_attach_device() now neither
> > > likely nor unlikely and we're only removing the dev_info from the domain
> > > and not destroying the domain itself?  Thanks,
> > 
> > As I see it, there is no leak. The DMA-API domains are kept in the
> > device_domain_list and re-used when the device driver re-attaches. But
> > your are right that the unlikely in intel_iommu_attach_device() isn't
> > true anymore. We could probably remove it.
> 
> But the domains are unlinked from device_domain_list using
> unlink_domain_info() which is called from both domain_remove_dev_info()
> and domain_remove_one_dev_info() which are both part of that more
> likely, unlikely branch in intel_iommu_attach_device().  So it seems
> like any time we switch a device from the DMA-API to the IOMMU-API, we
> lose the reference to the domain.  Is that incorrect?  I'll try to test.

Trying the simple approach, a printk in each of alloc_domain() and
free_domain_mem(), this is what I see when I start and stop a VM with an
assigned device:

alloc_domain(): ffff8801e22ac000
free_domain_mem(ffff8801e22ac000)
alloc_domain(): ffff8801e3425c80

The IOMMU API domain is alloc'd and free'd, then a new DMA-API domain is
alloc'd.  There are no frees of the DMA-API domain.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-09 12:15           ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-12-09 12:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Joerg Roedel, Greg Kroah-Hartman, David Woodhouse, Jiang Liu,
	iommu, linux-kernel, Myron Stowe

On Thu, Nov 06, 2014 at 09:16:05AM -0700, Alex Williamson wrote:
> But the domains are unlinked from device_domain_list using
> unlink_domain_info() which is called from both domain_remove_dev_info()
> and domain_remove_one_dev_info() which are both part of that more
> likely, unlikely branch in intel_iommu_attach_device().  So it seems
> like any time we switch a device from the DMA-API to the IOMMU-API, we
> lose the reference to the domain.  Is that incorrect?  I'll try to test.

Okay, I thought a while about that and it looks like a real fix needs a
rewrite of the domain handling code in the VT-d driver to better handle
domain lifetime. We'll get this for free when we add default domains and
more domain handling logic to the iommu core, so I think we don't need
to start rewriting the VT-d driver for this.
But for the time being, here is a simple fix for the leak in
iommu_attach_domain:

>From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@suse.de>
Date: Tue, 9 Dec 2014 12:56:45 +0100
Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device

Since commit 1196c2f a domain is only destroyed in the
notifier path if it is hot-unplugged. This caused a
domain leakage in iommu_attach_device when a driver was
unbound from the device and bound to VFIO. In this case the
device is attached to a new domain and unlinked from the old
domain. At this point nothing points to the old domain
anymore and its memory is leaked.
Fix this by explicitly freeing the old domain in
iommu_attach_domain.

Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/intel-iommu.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1232336..9ef8e89 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 
 		old_domain = find_domain(dev);
 		if (old_domain) {
-			if (domain_type_is_vm_or_si(dmar_domain))
+			if (domain_type_is_vm_or_si(dmar_domain)) {
 				domain_remove_one_dev_info(old_domain, dev);
-			else
+			} else {
 				domain_remove_dev_info(old_domain);
+				if (list_empty(&old_domain->devices))
+					domain_exit(old_domain);
+			}
 		}
 	}
 
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-09 12:15           ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-12-09 12:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Joerg Roedel, Greg Kroah-Hartman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

On Thu, Nov 06, 2014 at 09:16:05AM -0700, Alex Williamson wrote:
> But the domains are unlinked from device_domain_list using
> unlink_domain_info() which is called from both domain_remove_dev_info()
> and domain_remove_one_dev_info() which are both part of that more
> likely, unlikely branch in intel_iommu_attach_device().  So it seems
> like any time we switch a device from the DMA-API to the IOMMU-API, we
> lose the reference to the domain.  Is that incorrect?  I'll try to test.

Okay, I thought a while about that and it looks like a real fix needs a
rewrite of the domain handling code in the VT-d driver to better handle
domain lifetime. We'll get this for free when we add default domains and
more domain handling logic to the iommu core, so I think we don't need
to start rewriting the VT-d driver for this.
But for the time being, here is a simple fix for the leak in
iommu_attach_domain:

>From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Tue, 9 Dec 2014 12:56:45 +0100
Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device

Since commit 1196c2f a domain is only destroyed in the
notifier path if it is hot-unplugged. This caused a
domain leakage in iommu_attach_device when a driver was
unbound from the device and bound to VFIO. In this case the
device is attached to a new domain and unlinked from the old
domain. At this point nothing points to the old domain
anymore and its memory is leaked.
Fix this by explicitly freeing the old domain in
iommu_attach_domain.

Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/intel-iommu.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1232336..9ef8e89 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 
 		old_domain = find_domain(dev);
 		if (old_domain) {
-			if (domain_type_is_vm_or_si(dmar_domain))
+			if (domain_type_is_vm_or_si(dmar_domain)) {
 				domain_remove_one_dev_info(old_domain, dev);
-			else
+			} else {
 				domain_remove_dev_info(old_domain);
+				if (list_empty(&old_domain->devices))
+					domain_exit(old_domain);
+			}
 		}
 	}
 
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-11 16:35             ` Jerry Hoemann
  0 siblings, 0 replies; 26+ messages in thread
From: Jerry Hoemann @ 2014-12-11 16:35 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Alex Williamson, Joerg Roedel, Greg Kroah-Hartman, linux-kernel,
	iommu, Myron Stowe, David Woodhouse, Jiang Liu

On Tue, Dec 09, 2014 at 01:15:25PM +0100, Joerg Roedel wrote:
> On Thu, Nov 06, 2014 at 09:16:05AM -0700, Alex Williamson wrote:
> > But the domains are unlinked from device_domain_list using
> > unlink_domain_info() which is called from both domain_remove_dev_info()
> > and domain_remove_one_dev_info() which are both part of that more
> > likely, unlikely branch in intel_iommu_attach_device().  So it seems
> > like any time we switch a device from the DMA-API to the IOMMU-API, we
> > lose the reference to the domain.  Is that incorrect?  I'll try to test.
> 
> Okay, I thought a while about that and it looks like a real fix needs a
> rewrite of the domain handling code in the VT-d driver to better handle
> domain lifetime. We'll get this for free when we add default domains and
> more domain handling logic to the iommu core, so I think we don't need
> to start rewriting the VT-d driver for this.
> But for the time being, here is a simple fix for the leak in
> iommu_attach_domain:


Joerg,

This patch doesn't seem to be fixing the memory leak.

I am testing with a 3.18.0-rc7 kernel applied to a RHEL 7.0 system.

I added printk in free_domain_mem and alloc_domain to first reproduce
Alex's observation.  I created a VM and assigned a PCI NIC w/ no associated
RMRR to the VM.

The pattern I see is when starting the VM is:
alloc_domain -> X

When powering off the VM:
free_domain_mem(X)
alloc_domain -> Y

I then applied the patch below and I still see the same pattern of two
alloc_domain and one free_domain_mem when powering on/off the VM.

I added some additional instrumentation and I do not see the new call
to domain_exit being executed.  (See inline comments below.)



Jerry


> 
> >From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <jroedel@suse.de>
> Date: Tue, 9 Dec 2014 12:56:45 +0100
> Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
> 
> Since commit 1196c2f a domain is only destroyed in the
> notifier path if it is hot-unplugged. This caused a
> domain leakage in iommu_attach_device when a driver was
> unbound from the device and bound to VFIO. In this case the
> device is attached to a new domain and unlinked from the old
> domain. At this point nothing points to the old domain
> anymore and its memory is leaked.
> Fix this by explicitly freeing the old domain in
> iommu_attach_domain.
> 
> Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/intel-iommu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 1232336..9ef8e89 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>  
>  		old_domain = find_domain(dev);
>  		if (old_domain) {
> -			if (domain_type_is_vm_or_si(dmar_domain))
> +			if (domain_type_is_vm_or_si(dmar_domain)) {


JAH>  This path is executed when starting the VM.


>  				domain_remove_one_dev_info(old_domain, dev);
> -			else
> +			} else {


JAH>  I don't see this path being executed.

>  				domain_remove_dev_info(old_domain);
> +				if (list_empty(&old_domain->devices))
> +					domain_exit(old_domain);
> +			}
>  		}
>  	}
>  
> -- 
> 1.8.4.5
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

-- 

----------------------------------------------------------------------------
Jerry Hoemann            Software Engineer              Hewlett-Packard

3404 E Harmony Rd. MS 36                        phone:  (970) 898-1022
Ft. Collins, CO 80528                           FAX:    (970) 898-0707
                                                email:  jerry.hoemann@hp.com
----------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-11 16:35             ` Jerry Hoemann
  0 siblings, 0 replies; 26+ messages in thread
From: Jerry Hoemann @ 2014-12-11 16:35 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, Greg Kroah-Hartman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

On Tue, Dec 09, 2014 at 01:15:25PM +0100, Joerg Roedel wrote:
> On Thu, Nov 06, 2014 at 09:16:05AM -0700, Alex Williamson wrote:
> > But the domains are unlinked from device_domain_list using
> > unlink_domain_info() which is called from both domain_remove_dev_info()
> > and domain_remove_one_dev_info() which are both part of that more
> > likely, unlikely branch in intel_iommu_attach_device().  So it seems
> > like any time we switch a device from the DMA-API to the IOMMU-API, we
> > lose the reference to the domain.  Is that incorrect?  I'll try to test.
> 
> Okay, I thought a while about that and it looks like a real fix needs a
> rewrite of the domain handling code in the VT-d driver to better handle
> domain lifetime. We'll get this for free when we add default domains and
> more domain handling logic to the iommu core, so I think we don't need
> to start rewriting the VT-d driver for this.
> But for the time being, here is a simple fix for the leak in
> iommu_attach_domain:


Joerg,

This patch doesn't seem to be fixing the memory leak.

I am testing with a 3.18.0-rc7 kernel applied to a RHEL 7.0 system.

I added printk in free_domain_mem and alloc_domain to first reproduce
Alex's observation.  I created a VM and assigned a PCI NIC w/ no associated
RMRR to the VM.

The pattern I see is when starting the VM is:
alloc_domain -> X

When powering off the VM:
free_domain_mem(X)
alloc_domain -> Y

I then applied the patch below and I still see the same pattern of two
alloc_domain and one free_domain_mem when powering on/off the VM.

I added some additional instrumentation and I do not see the new call
to domain_exit being executed.  (See inline comments below.)



Jerry


> 
> >From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> Date: Tue, 9 Dec 2014 12:56:45 +0100
> Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
> 
> Since commit 1196c2f a domain is only destroyed in the
> notifier path if it is hot-unplugged. This caused a
> domain leakage in iommu_attach_device when a driver was
> unbound from the device and bound to VFIO. In this case the
> device is attached to a new domain and unlinked from the old
> domain. At this point nothing points to the old domain
> anymore and its memory is leaked.
> Fix this by explicitly freeing the old domain in
> iommu_attach_domain.
> 
> Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
> Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/iommu/intel-iommu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 1232336..9ef8e89 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>  
>  		old_domain = find_domain(dev);
>  		if (old_domain) {
> -			if (domain_type_is_vm_or_si(dmar_domain))
> +			if (domain_type_is_vm_or_si(dmar_domain)) {


JAH>  This path is executed when starting the VM.


>  				domain_remove_one_dev_info(old_domain, dev);
> -			else
> +			} else {


JAH>  I don't see this path being executed.

>  				domain_remove_dev_info(old_domain);
> +				if (list_empty(&old_domain->devices))
> +					domain_exit(old_domain);
> +			}
>  		}
>  	}
>  
> -- 
> 1.8.4.5
> 
> _______________________________________________
> iommu mailing list
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

-- 

----------------------------------------------------------------------------
Jerry Hoemann            Software Engineer              Hewlett-Packard

3404 E Harmony Rd. MS 36                        phone:  (970) 898-1022
Ft. Collins, CO 80528                           FAX:    (970) 898-0707
                                                email:  jerry.hoemann-VXdhtT5mjnY@public.gmane.org
----------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-12 15:56               ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-12-12 15:56 UTC (permalink / raw)
  To: Jerry Hoemann
  Cc: Alex Williamson, Joerg Roedel, Greg Kroah-Hartman, linux-kernel,
	iommu, Myron Stowe, David Woodhouse, Jiang Liu

Hi Jerry,

On Thu, Dec 11, 2014 at 09:35:34AM -0700, Jerry Hoemann wrote:
> On Tue, Dec 09, 2014 at 01:15:25PM +0100, Joerg Roedel wrote:
> > >From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
> > From: Joerg Roedel <jroedel@suse.de>
> > Date: Tue, 9 Dec 2014 12:56:45 +0100
> > Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
> > 
> > Since commit 1196c2f a domain is only destroyed in the
> > notifier path if it is hot-unplugged. This caused a
> > domain leakage in iommu_attach_device when a driver was
> > unbound from the device and bound to VFIO. In this case the
> > device is attached to a new domain and unlinked from the old
> > domain. At this point nothing points to the old domain
> > anymore and its memory is leaked.
> > Fix this by explicitly freeing the old domain in
> > iommu_attach_domain.
> > 
> > Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
> > Signed-off-by: Joerg Roedel <jroedel@suse.de>
> > ---
> >  drivers/iommu/intel-iommu.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 1232336..9ef8e89 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
> >  
> >  		old_domain = find_domain(dev);
> >  		if (old_domain) {
> > -			if (domain_type_is_vm_or_si(dmar_domain))
> > +			if (domain_type_is_vm_or_si(dmar_domain)) {
> 
> 
> JAH>  This path is executed when starting the VM.
> 
> 
> >  				domain_remove_one_dev_info(old_domain, dev);
> > -			else
> > +			} else {
> 
> 
> JAH>  I don't see this path being executed.
> 
> >  				domain_remove_dev_info(old_domain);
> > +				if (list_empty(&old_domain->devices))
> > +					domain_exit(old_domain);
> > +			}

You are right, thanks for testing. The reason is that the check for
domain_type_is_vm_or_si(dmar_domain) uses the new domain and not the old
one. I'll post a new patch.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed
@ 2014-12-12 15:56               ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-12-12 15:56 UTC (permalink / raw)
  To: Jerry Hoemann
  Cc: Joerg Roedel, Greg Kroah-Hartman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Myron Stowe,
	David Woodhouse, Jiang Liu

Hi Jerry,

On Thu, Dec 11, 2014 at 09:35:34AM -0700, Jerry Hoemann wrote:
> On Tue, Dec 09, 2014 at 01:15:25PM +0100, Joerg Roedel wrote:
> > >From d65b236d0f27fe3ef7ac4d12cceb0da67aec86ce Mon Sep 17 00:00:00 2001
> > From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> > Date: Tue, 9 Dec 2014 12:56:45 +0100
> > Subject: [PATCH] iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
> > 
> > Since commit 1196c2f a domain is only destroyed in the
> > notifier path if it is hot-unplugged. This caused a
> > domain leakage in iommu_attach_device when a driver was
> > unbound from the device and bound to VFIO. In this case the
> > device is attached to a new domain and unlinked from the old
> > domain. At this point nothing points to the old domain
> > anymore and its memory is leaked.
> > Fix this by explicitly freeing the old domain in
> > iommu_attach_domain.
> > 
> > Fixes: 1196c2f 'iommu/vt-d: Only remove domain when device is removed'
> > Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> > ---
> >  drivers/iommu/intel-iommu.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 1232336..9ef8e89 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -4424,10 +4424,13 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
> >  
> >  		old_domain = find_domain(dev);
> >  		if (old_domain) {
> > -			if (domain_type_is_vm_or_si(dmar_domain))
> > +			if (domain_type_is_vm_or_si(dmar_domain)) {
> 
> 
> JAH>  This path is executed when starting the VM.
> 
> 
> >  				domain_remove_one_dev_info(old_domain, dev);
> > -			else
> > +			} else {
> 
> 
> JAH>  I don't see this path being executed.
> 
> >  				domain_remove_dev_info(old_domain);
> > +				if (list_empty(&old_domain->devices))
> > +					domain_exit(old_domain);
> > +			}

You are right, thanks for testing. The reason is that the check for
domain_type_is_vm_or_si(dmar_domain) uses the new domain and not the old
one. I'll post a new patch.


	Joerg

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-12-12 15:56 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-30 11:02 [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind Joerg Roedel
2014-09-30 11:02 ` Joerg Roedel
2014-09-30 11:02 ` [PATCH 1/2] driver core: Add BUS_NOTIFY_REMOVED_DEVICE event Joerg Roedel
2014-09-30 11:02   ` Joerg Roedel
2014-09-30 11:02 ` [PATCH 2/2] iommu/vt-d: Only remove domain when device is removed Joerg Roedel
2014-09-30 11:02   ` Joerg Roedel
2014-11-04 16:12   ` Alex Williamson
2014-11-04 16:12     ` Alex Williamson
2014-11-06 12:54     ` Joerg Roedel
2014-11-06 12:54       ` Joerg Roedel
2014-11-06 16:16       ` Alex Williamson
2014-11-06 16:16         ` Alex Williamson
2014-11-06 16:43         ` Alex Williamson
2014-11-06 16:43           ` Alex Williamson
2014-12-09 12:15         ` Joerg Roedel
2014-12-09 12:15           ` Joerg Roedel
2014-12-11 16:35           ` Jerry Hoemann
2014-12-11 16:35             ` Jerry Hoemann
2014-12-12 15:56             ` Joerg Roedel
2014-12-12 15:56               ` Joerg Roedel
2014-10-01 22:35 ` [PATCH 0/2] iommu/vt-d: Keep RMRR mappings around on driver unbind Greg Kroah-Hartman
2014-10-01 22:35   ` Greg Kroah-Hartman
2014-10-02  9:20   ` Joerg Roedel
2014-10-02  9:20     ` Joerg Roedel
2014-10-02  0:30 ` Jerry Hoemann
2014-10-02  0:30   ` Jerry Hoemann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.