All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] iommu: Fix domain check on device release
@ 2024-02-23  5:13 Lu Baolu
  2024-02-23  5:13 ` [PATCH 1/2] iommu: Add static iommu_ops->release_domain Lu Baolu
  2024-02-23  5:13 ` [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release Lu Baolu
  0 siblings, 2 replies; 10+ messages in thread
From: Lu Baolu @ 2024-02-23  5:13 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe,
	Kevin Tian, Eric Badger
  Cc: iommu, linux-kernel, Lu Baolu

This is a follow-up to the discussion thread here:

https://lore.kernel.org/linux-iommu/20240221154012.GC13491@ziepe.ca/

It fixes a NULL pointer reference in the Intel iommu driver and
strengthens the iommu core to possibly prevent similar failures in other
iommu drivers.

Best regards,
baolu

Lu Baolu (2):
  iommu: Add static iommu_ops->release_domain
  iommu/vt-d: Fix NULL domain on device release

 include/linux/iommu.h       |  1 +
 drivers/iommu/intel/iommu.c | 26 +-------------------------
 drivers/iommu/iommu.c       | 12 ++++++++++++
 3 files changed, 14 insertions(+), 25 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] iommu: Add static iommu_ops->release_domain
  2024-02-23  5:13 [PATCH 0/2] iommu: Fix domain check on device release Lu Baolu
@ 2024-02-23  5:13 ` Lu Baolu
  2024-02-27  7:32   ` Tian, Kevin
  2024-02-23  5:13 ` [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release Lu Baolu
  1 sibling, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2024-02-23  5:13 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe,
	Kevin Tian, Eric Badger
  Cc: iommu, linux-kernel, Lu Baolu, Jason Gunthorpe

The current device_release callback for individual iommu drivers does the
following:

1) Silent IOMMU DMA translation: It detaches any existing domain from the
   device and puts it into a blocking state (some drivers might use the
   identity state).
2) Resource release: It releases resources allocated during the
   device_probe callback and restores the device to its pre-probe state.

Step 1 is challenging for individual iommu drivers because each must check
if a domain is already attached to the device. Additionally, if a deferred
attach never occurred, the device_release should avoid modifying hardware
configuration regardless of the reason for its call.

To simplify this process, introduce a static release_domain within the
iommu_ops structure. It can be either a blocking or identity domain
depending on the iommu hardware. The iommu core will decide whether to
attach this domain before the device_release callback, eliminating the
need for repetitive code in various drivers.

Consequently, the device_release callback can focus solely on the opposite
operations of device_probe, including releasing all resources allocated
during that callback.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h |  1 +
 drivers/iommu/iommu.c | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index de839fd01bb8..e3d9365b0fa9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -585,6 +585,7 @@ struct iommu_ops {
 	struct module *owner;
 	struct iommu_domain *identity_domain;
 	struct iommu_domain *blocked_domain;
+	struct iommu_domain *release_domain;
 	struct iommu_domain *default_domain;
 };
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 210dc7b4c8cf..fb06c3f47320 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -459,6 +459,18 @@ static void iommu_deinit_device(struct device *dev)
 
 	iommu_device_unlink(dev->iommu->iommu_dev, dev);
 
+	/*
+	 * If the iommu driver provides release_domain then the core code
+	 * ensures that domain is attached prior to calling release_device.
+	 * Drivers can use this to enforce a translation on the idle iommu.
+	 * Usually the global static blocked_domain is a good choice.
+	 *
+	 * Anyway, if a deferred attach never happened then the release
+	 * should still avoid touching any hardware configuration either.
+	 */
+	if (!dev->iommu->attach_deferred && ops->release_domain)
+		ops->release_domain->ops->attach_dev(ops->release_domain, dev);
+
 	/*
 	 * release_device() must stop using any attached domain on the device.
 	 * If there are still other devices in the group they are not effected
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release
  2024-02-23  5:13 [PATCH 0/2] iommu: Fix domain check on device release Lu Baolu
  2024-02-23  5:13 ` [PATCH 1/2] iommu: Add static iommu_ops->release_domain Lu Baolu
@ 2024-02-23  5:13 ` Lu Baolu
  2024-02-27  7:40   ` Tian, Kevin
  1 sibling, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2024-02-23  5:13 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe,
	Kevin Tian, Eric Badger
  Cc: iommu, linux-kernel, Lu Baolu

In the kdump kernel, the IOMMU operates in deferred_attach mode. In this
mode, info->domain may not yet be assigned by the time the release_device
function is called. It leads to the following crash in the crash kernel:

    BUG: kernel NULL pointer dereference, address: 000000000000003c
    ...
    RIP: 0010:do_raw_spin_lock+0xa/0xa0
    ...
    _raw_spin_lock_irqsave+0x1b/0x30
    intel_iommu_release_device+0x96/0x170
    iommu_deinit_device+0x39/0xf0
    __iommu_group_remove_device+0xa0/0xd0
    iommu_bus_notifier+0x55/0xb0
    notifier_call_chain+0x5a/0xd0
    blocking_notifier_call_chain+0x41/0x60
    bus_notify+0x34/0x50
    device_del+0x269/0x3d0
    pci_remove_bus_device+0x77/0x100
    p2sb_bar+0xae/0x1d0
    ...
    i801_probe+0x423/0x740

Use the release_domain mechanism to fix it.

Fixes: 586081d3f6b1 ("iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO")
Reported-by: Eric Badger <ebadger@purestorage.com>
Closes: https://lore.kernel.org/r/20240113181713.1817855-1-ebadger@purestorage.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 26 +-------------------------
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 61bb35046ea4..2c04ea90d22f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3798,30 +3798,6 @@ static void domain_context_clear(struct device_domain_info *info)
 			       &domain_context_clear_one_cb, info);
 }
 
-static void dmar_remove_one_dev_info(struct device *dev)
-{
-	struct device_domain_info *info = dev_iommu_priv_get(dev);
-	struct dmar_domain *domain = info->domain;
-	struct intel_iommu *iommu = info->iommu;
-	unsigned long flags;
-
-	if (!dev_is_real_dma_subdevice(info->dev)) {
-		if (dev_is_pci(info->dev) && sm_supported(iommu))
-			intel_pasid_tear_down_entry(iommu, info->dev,
-					IOMMU_NO_PASID, false);
-
-		iommu_disable_pci_caps(info);
-		domain_context_clear(info);
-	}
-
-	spin_lock_irqsave(&domain->lock, flags);
-	list_del(&info->link);
-	spin_unlock_irqrestore(&domain->lock, flags);
-
-	domain_detach_iommu(domain, iommu);
-	info->domain = NULL;
-}
-
 /*
  * Clear the page table pointer in context or pasid table entries so that
  * all DMA requests without PASID from the device are blocked. If the page
@@ -4348,7 +4324,6 @@ static void intel_iommu_release_device(struct device *dev)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 
-	dmar_remove_one_dev_info(dev);
 	intel_pasid_free_table(dev);
 	intel_iommu_debugfs_remove_dev(info);
 	kfree(info);
@@ -4839,6 +4814,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = {
 
 const struct iommu_ops intel_iommu_ops = {
 	.blocked_domain		= &blocking_domain,
+	.release_domain		= &blocking_domain,
 	.capable		= intel_iommu_capable,
 	.hw_info		= intel_iommu_hw_info,
 	.domain_alloc		= intel_iommu_domain_alloc,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/2] iommu: Add static iommu_ops->release_domain
  2024-02-23  5:13 ` [PATCH 1/2] iommu: Add static iommu_ops->release_domain Lu Baolu
@ 2024-02-27  7:32   ` Tian, Kevin
  2024-02-28  1:13     ` Baolu Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2024-02-27  7:32 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: iommu, linux-kernel, Jason Gunthorpe

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Friday, February 23, 2024 1:13 PM
> 
> 
> +	/*
> +	 * If the iommu driver provides release_domain then the core code
> +	 * ensures that domain is attached prior to calling release_device.
> +	 * Drivers can use this to enforce a translation on the idle iommu.

'enforce a translation' is confusing in the context of blocking/identity
domain.

otherwise looks good to me:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release
  2024-02-23  5:13 ` [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release Lu Baolu
@ 2024-02-27  7:40   ` Tian, Kevin
  2024-02-28  1:22     ` Baolu Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2024-02-27  7:40 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Friday, February 23, 2024 1:13 PM
> 
> -static void dmar_remove_one_dev_info(struct device *dev)
> -{
> -	struct device_domain_info *info = dev_iommu_priv_get(dev);
> -	struct dmar_domain *domain = info->domain;
> -	struct intel_iommu *iommu = info->iommu;
> -	unsigned long flags;
> -
> -	if (!dev_is_real_dma_subdevice(info->dev)) {
> -		if (dev_is_pci(info->dev) && sm_supported(iommu))
> -			intel_pasid_tear_down_entry(iommu, info->dev,
> -					IOMMU_NO_PASID, false);
> -
> -		iommu_disable_pci_caps(info);
> -		domain_context_clear(info);
> -	}
> -
> -	spin_lock_irqsave(&domain->lock, flags);
> -	list_del(&info->link);
> -	spin_unlock_irqrestore(&domain->lock, flags);
> -
> -	domain_detach_iommu(domain, iommu);
> -	info->domain = NULL;
> -}
> -

what's required here is slightly different from device_block_translation()
which leaves context entry uncleared in scalable mode (implying the
pasid table must be valid). but in the release path the pasid table will
be freed right after then leading to a use-after-free case.

let's add an explicit domain_context_clear() in intel_iommu_release_device().

with that,

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] iommu: Add static iommu_ops->release_domain
  2024-02-27  7:32   ` Tian, Kevin
@ 2024-02-28  1:13     ` Baolu Lu
  2024-02-28  3:08       ` Tian, Kevin
  0 siblings, 1 reply; 10+ messages in thread
From: Baolu Lu @ 2024-02-28  1:13 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: baolu.lu, iommu, linux-kernel, Jason Gunthorpe

On 2/27/24 3:32 PM, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Friday, February 23, 2024 1:13 PM
>>
>>
>> +	/*
>> +	 * If the iommu driver provides release_domain then the core code
>> +	 * ensures that domain is attached prior to calling release_device.
>> +	 * Drivers can use this to enforce a translation on the idle iommu.
> 'enforce a translation' is confusing in the context of blocking/identity
> domain.

Blocking or identity domain is also kind of a translation from the
core's perspective. The core does not care what type of translation the
release_domain is; it just enforces this type of translation before
device release if the driver has specified one.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release
  2024-02-27  7:40   ` Tian, Kevin
@ 2024-02-28  1:22     ` Baolu Lu
  2024-02-28  3:08       ` Tian, Kevin
  0 siblings, 1 reply; 10+ messages in thread
From: Baolu Lu @ 2024-02-28  1:22 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: baolu.lu, iommu, linux-kernel

On 2/27/24 3:40 PM, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Friday, February 23, 2024 1:13 PM
>>
>> -static void dmar_remove_one_dev_info(struct device *dev)
>> -{
>> -	struct device_domain_info *info = dev_iommu_priv_get(dev);
>> -	struct dmar_domain *domain = info->domain;
>> -	struct intel_iommu *iommu = info->iommu;
>> -	unsigned long flags;
>> -
>> -	if (!dev_is_real_dma_subdevice(info->dev)) {
>> -		if (dev_is_pci(info->dev) && sm_supported(iommu))
>> -			intel_pasid_tear_down_entry(iommu, info->dev,
>> -					IOMMU_NO_PASID, false);
>> -
>> -		iommu_disable_pci_caps(info);
>> -		domain_context_clear(info);
>> -	}
>> -
>> -	spin_lock_irqsave(&domain->lock, flags);
>> -	list_del(&info->link);
>> -	spin_unlock_irqrestore(&domain->lock, flags);
>> -
>> -	domain_detach_iommu(domain, iommu);
>> -	info->domain = NULL;
>> -}
>> -
> what's required here is slightly different from device_block_translation()
> which leaves context entry uncleared in scalable mode (implying the
> pasid table must be valid). but in the release path the pasid table will
> be freed right after then leading to a use-after-free case.
> 
> let's add an explicit domain_context_clear() in intel_iommu_release_device().

Nice catch!

How about moving the scalable mode context entry management to probe and
release path? Currently, it's part of domain switch, that's really
irrelevant.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/2] iommu: Add static iommu_ops->release_domain
  2024-02-28  1:13     ` Baolu Lu
@ 2024-02-28  3:08       ` Tian, Kevin
  2024-02-28  3:30         ` Baolu Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2024-02-28  3:08 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: iommu, linux-kernel, Jason Gunthorpe

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Wednesday, February 28, 2024 9:14 AM
> 
> On 2/27/24 3:32 PM, Tian, Kevin wrote:
> >> From: Lu Baolu<baolu.lu@linux.intel.com>
> >> Sent: Friday, February 23, 2024 1:13 PM
> >>
> >>
> >> +	/*
> >> +	 * If the iommu driver provides release_domain then the core code
> >> +	 * ensures that domain is attached prior to calling release_device.
> >> +	 * Drivers can use this to enforce a translation on the idle iommu.
> > 'enforce a translation' is confusing in the context of blocking/identity
> > domain.
> 
> Blocking or identity domain is also kind of a translation from the
> core's perspective. The core does not care what type of translation the
> release_domain is; it just enforces this type of translation before
> device release if the driver has specified one.
> 

OK.

btw you may also want to update the following comment together:

	/*
	 * release_device() must stop using any attached domain on the device.
	 * If there are still other devices in the group they are not effected
	 * by this callback.
	 *
	 * The IOMMU driver must set the device to either an identity or
	 * blocking translation and stop using any domain pointer, as it is
	 * going to be freed.
	 */

all the purpose is to set the device to identity or blocking, either via attaching
to an explicit release_domain or implicitly by release_device().

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release
  2024-02-28  1:22     ` Baolu Lu
@ 2024-02-28  3:08       ` Tian, Kevin
  0 siblings, 0 replies; 10+ messages in thread
From: Tian, Kevin @ 2024-02-28  3:08 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Wednesday, February 28, 2024 9:23 AM
> 
> On 2/27/24 3:40 PM, Tian, Kevin wrote:
> >> From: Lu Baolu<baolu.lu@linux.intel.com>
> >> Sent: Friday, February 23, 2024 1:13 PM
> >>
> >> -static void dmar_remove_one_dev_info(struct device *dev)
> >> -{
> >> -	struct device_domain_info *info = dev_iommu_priv_get(dev);
> >> -	struct dmar_domain *domain = info->domain;
> >> -	struct intel_iommu *iommu = info->iommu;
> >> -	unsigned long flags;
> >> -
> >> -	if (!dev_is_real_dma_subdevice(info->dev)) {
> >> -		if (dev_is_pci(info->dev) && sm_supported(iommu))
> >> -			intel_pasid_tear_down_entry(iommu, info->dev,
> >> -					IOMMU_NO_PASID, false);
> >> -
> >> -		iommu_disable_pci_caps(info);
> >> -		domain_context_clear(info);
> >> -	}
> >> -
> >> -	spin_lock_irqsave(&domain->lock, flags);
> >> -	list_del(&info->link);
> >> -	spin_unlock_irqrestore(&domain->lock, flags);
> >> -
> >> -	domain_detach_iommu(domain, iommu);
> >> -	info->domain = NULL;
> >> -}
> >> -
> > what's required here is slightly different from device_block_translation()
> > which leaves context entry uncleared in scalable mode (implying the
> > pasid table must be valid). but in the release path the pasid table will
> > be freed right after then leading to a use-after-free case.
> >
> > let's add an explicit domain_context_clear() in
> intel_iommu_release_device().
> 
> Nice catch!
> 
> How about moving the scalable mode context entry management to probe
> and
> release path? Currently, it's part of domain switch, that's really
> irrelevant.
> 

sounds good.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] iommu: Add static iommu_ops->release_domain
  2024-02-28  3:08       ` Tian, Kevin
@ 2024-02-28  3:30         ` Baolu Lu
  0 siblings, 0 replies; 10+ messages in thread
From: Baolu Lu @ 2024-02-28  3:30 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
	Jason Gunthorpe, Badger, Eric
  Cc: baolu.lu, iommu, linux-kernel, Jason Gunthorpe

On 2/28/24 11:08 AM, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Wednesday, February 28, 2024 9:14 AM
>>
>> On 2/27/24 3:32 PM, Tian, Kevin wrote:
>>>> From: Lu Baolu<baolu.lu@linux.intel.com>
>>>> Sent: Friday, February 23, 2024 1:13 PM
>>>>
>>>>
>>>> +	/*
>>>> +	 * If the iommu driver provides release_domain then the core code
>>>> +	 * ensures that domain is attached prior to calling release_device.
>>>> +	 * Drivers can use this to enforce a translation on the idle iommu.
>>> 'enforce a translation' is confusing in the context of blocking/identity
>>> domain.
>>
>> Blocking or identity domain is also kind of a translation from the
>> core's perspective. The core does not care what type of translation the
>> release_domain is; it just enforces this type of translation before
>> device release if the driver has specified one.
>>
> 
> OK.
> 
> btw you may also want to update the following comment together:
> 
> 	/*
> 	 * release_device() must stop using any attached domain on the device.
> 	 * If there are still other devices in the group they are not effected
> 	 * by this callback.
> 	 *
> 	 * The IOMMU driver must set the device to either an identity or
> 	 * blocking translation and stop using any domain pointer, as it is
> 	 * going to be freed.
> 	 */
> 
> all the purpose is to set the device to identity or blocking, either via attaching
> to an explicit release_domain or implicitly by release_device().

Done.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-02-28  3:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-23  5:13 [PATCH 0/2] iommu: Fix domain check on device release Lu Baolu
2024-02-23  5:13 ` [PATCH 1/2] iommu: Add static iommu_ops->release_domain Lu Baolu
2024-02-27  7:32   ` Tian, Kevin
2024-02-28  1:13     ` Baolu Lu
2024-02-28  3:08       ` Tian, Kevin
2024-02-28  3:30         ` Baolu Lu
2024-02-23  5:13 ` [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release Lu Baolu
2024-02-27  7:40   ` Tian, Kevin
2024-02-28  1:22     ` Baolu Lu
2024-02-28  3:08       ` Tian, Kevin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.