linux-kernel.vger.kernel.org archive mirror
* [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
@ 2022-01-13 13:23 Jacob Pan
  2022-01-14  0:58 ` Lu Baolu
  0 siblings, 1 reply; 5+ messages in thread
From: Jacob Pan @ 2022-01-13 13:23 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Lu Baolu; +Cc: Jacob Pan, Raj Ashok, Kumar, Sanjay K

During PCI bus rescan, adding a new device involves two notifiers:
1. dmar_pci_bus_notifier()
2. iommu_bus_notifier()
The current code registers #1 with the lowest priority (INT_MIN), so #2
is invoked first. As a result, the struct device pointer cannot be found
in the DRHD search for the new device's DMAR/IOMMU, and the device is
put under the "catch-all" IOMMU instead of the correct one.

This can cause a system hang when a device TLB invalidation is sent to
the wrong IOMMU. Invalidation timeout errors or a hard lockup can be
observed.

Fix the issue by raising the priority of dmar_pci_bus_notifier to
INT_MAX, so that it runs before iommu_bus_notifier and the DRHD search
for a new device finds the correct IOMMU.
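
The ordering problem described above can be reproduced with a small
userspace model. This is only a sketch, not kernel code: every model_*
name is hypothetical, and the model assumes what the description implies,
namely that a notifier chain calls blocks in descending priority order
and that the iommu notifier sits at the default priority of 0.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-in for a notifier block (hypothetical, not the kernel struct). */
struct model_nb {
	int (*call)(struct model_nb *nb, unsigned long action, void *data);
	int priority;
	struct model_nb *next;
};

/* Stand-in for a hot-added device. */
struct model_dev {
	int scope_known;	/* set once the DMAR-side callback has run */
	int on_correct_iommu;	/* set by the IOMMU-side callback */
};

/* Insert in descending priority order; equal priorities keep registration order. */
static void model_chain_register(struct model_nb **head, struct model_nb *nb)
{
	assert(nb != NULL);
	while (*head != NULL && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Call every block starting from the head; return how many callbacks ran. */
static int model_chain_call(struct model_nb *head, unsigned long action, void *data)
{
	int calls = 0;

	for (; head != NULL; head = head->next, calls++)
		head->call(head, action, data);
	return calls;
}

/* Models the DMAR side: records the device in the scope tables. */
static int model_dmar_cb(struct model_nb *nb, unsigned long action, void *data)
{
	struct model_dev *dev = data;

	(void)nb;
	(void)action;
	dev->scope_known = 1;
	return 0;
}

/* Models the IOMMU side: the lookup only succeeds if the scope is already known. */
static int model_iommu_cb(struct model_nb *nb, unsigned long action, void *data)
{
	struct model_dev *dev = data;

	(void)nb;
	(void)action;
	dev->on_correct_iommu = dev->scope_known;
	return 0;
}

/* Returns 1 if the device ends up on the correct IOMMU for a given DMAR priority. */
static int model_hot_add(int dmar_priority)
{
	struct model_nb dmar = { model_dmar_cb, dmar_priority, NULL };
	struct model_nb iommu = { model_iommu_cb, 0, NULL };
	struct model_nb *chain = NULL;
	struct model_dev dev = { 0, 0 };

	model_chain_register(&chain, &iommu);
	model_chain_register(&chain, &dmar);
	model_chain_call(chain, 0, &dev);
	return dev.on_correct_iommu;
}
```

In this model, model_hot_add(INT_MIN) runs the IOMMU callback before the
scope is recorded and returns 0, while model_hot_add(INT_MAX) returns 1.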

Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
Reported-by: Zhang, Bernice <bernice.zhang@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 915bff76fe96..5d07e5b89c2e 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -385,7 +385,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
 
 static struct notifier_block dmar_pci_bus_nb = {
 	.notifier_call = dmar_pci_bus_notifier,
-	.priority = INT_MIN,
+	.priority = INT_MAX,
 };
 
 static struct dmar_drhd_unit *
-- 
2.25.1



* Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
  2022-01-13 13:23 [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add Jacob Pan
@ 2022-01-14  0:58 ` Lu Baolu
  2022-01-14  3:11   ` Jacob Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Lu Baolu @ 2022-01-14  0:58 UTC (permalink / raw)
  To: Jacob Pan, iommu, LKML, Joerg Roedel
  Cc: baolu.lu, Jacob Pan, Raj Ashok, Kumar, Sanjay K

Hi Jacob,

On 1/13/22 9:23 PM, Jacob Pan wrote:
> During PCI bus rescan, adding new devices involve two notifiers.
> 1. dmar_pci_bus_notifier()
> 2. iommu_bus_notifier()
> The current code sets #1 as low priority (INT_MIN) which resulted in #2
> being invoked first. The result is that struct device pointer cannot be
> found in DRHD search for the new device's DMAR/IOMMU. Subsequently, the
> device is put under the "catch-all" IOMMU instead of the correct one.
> 
> This could cause system hang when device TLB invalidation is sent to the
> wrong IOMMU. Invalidation timeout error or hard lockup can be observed.
> 
> This patch fixes the issue by setting a higher priority for
> dmar_pci_bus_notifier. DRHD search for a new device will find the
> correct IOMMU.
> 
> Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
> Reported-by: Zhang, Bernice <bernice.zhang@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>   drivers/iommu/intel/dmar.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 915bff76fe96..5d07e5b89c2e 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -385,7 +385,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
>   
>   static struct notifier_block dmar_pci_bus_nb = {
>   	.notifier_call = dmar_pci_bus_notifier,
> -	.priority = INT_MIN,
> +	.priority = INT_MAX,
>   };
>   
>   static struct dmar_drhd_unit *
> 

Nice catch! dmar_pci_bus_add_dev() should take place *before*
iommu_probe_device(). This change enforces that by giving the dmar
callback a higher notifier priority.

Similarly, dmar_pci_bus_del_dev() should take place *after*
iommu_release_device(). Perhaps we can use two notifiers, one for
ADD_DEVICE (with .priority=INT_MAX) and the other for REMOVE_DEVICE
(with .priority=INT_MIN)?
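
A rough userspace sketch of the two-notifier idea, to make the intended
call order concrete. All model_* names are hypothetical and this is not
the kernel API. It assumes the chain runs callbacks in descending
priority order, that the iommu notifier has priority 0, and that each
dmar block only acts on its own event.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the bus add/remove events. */
enum model_action { MODEL_ADD_DEVICE, MODEL_DEL_DEVICE };

struct model_nb {
	void (*call)(enum model_action action, char *trace);
	int priority;
	struct model_nb *next;
};

/* Insert in descending priority order, so higher priority runs first. */
static void model_register(struct model_nb **head, struct model_nb *nb)
{
	assert(nb != NULL);
	while (*head != NULL && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* DMAR add path: only reacts to ADD, appends 'D' to the trace. */
static void model_dmar_add(enum model_action action, char *trace)
{
	if (action == MODEL_ADD_DEVICE)
		strcat(trace, "D");
}

/* DMAR remove path: only reacts to DEL, appends 'D' to the trace. */
static void model_dmar_del(enum model_action action, char *trace)
{
	if (action == MODEL_DEL_DEVICE)
		strcat(trace, "D");
}

/* IOMMU probe/release: reacts to both events, appends 'I'. */
static void model_iommu(enum model_action action, char *trace)
{
	(void)action;
	strcat(trace, "I");
}

/* Build the chain, fire one event, and write the call order into trace. */
static void model_fire(enum model_action action, char *trace)
{
	struct model_nb add = { model_dmar_add, INT_MAX, NULL };
	struct model_nb del = { model_dmar_del, INT_MIN, NULL };
	struct model_nb iommu = { model_iommu, 0, NULL };
	struct model_nb *chain = NULL, *nb;

	trace[0] = '\0';
	model_register(&chain, &iommu);
	model_register(&chain, &add);
	model_register(&chain, &del);
	for (nb = chain; nb != NULL; nb = nb->next)
		nb->call(action, trace);
}
```

With a buffer of at least three bytes, model_fire(MODEL_ADD_DEVICE, buf)
leaves "DI" in buf (dmar first) and model_fire(MODEL_DEL_DEVICE, buf)
leaves "ID" (dmar last). Note that if both blocks shared one callback,
it would run twice per event, so each block filters on the action here.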

Best regards,
baolu


* Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
  2022-01-14  0:58 ` Lu Baolu
@ 2022-01-14  3:11   ` Jacob Pan
  2022-01-14  3:12     ` Lu Baolu
  0 siblings, 1 reply; 5+ messages in thread
From: Jacob Pan @ 2022-01-14  3:11 UTC (permalink / raw)
  To: Lu Baolu
  Cc: iommu, LKML, Joerg Roedel, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	jacob.jun.pan

Hi BaoLu,

On Fri, 14 Jan 2022 08:58:53 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> Nice catch! dmar_pci_bus_add_dev() should take place *before*
> iommu_probe_device(). This change enforces this with a higher notifier
> priority for dmar callback.
> 
> Comparably, dmar_pci_bus_del_dev() should take place *after*
> iommu_release_device(). Perhaps we can use two notifiers, one for
> ADD_DEVICE (with .priority=INT_MAX) and the other for REMOVE_DEVICE
> (with .priority=INT_MIN)?
> 

Since the device_to_iommu() lookup in intel_iommu_release_device() only
checks whether the device is under "an" IOMMU, not "the" IOMMU, the
ordering on the remove path is not needed, right?

I know this is not robust, but having so many notifiers with implicit
priorities is not clean either.

Perhaps we should define an explicit priority for the iommu_bus
notifier, e.g.:

@@ -1841,6 +1841,7 @@ static int iommu_bus_init(struct bus_type *bus, const struct iommu_ops *ops)
                return -ENOMEM;
        nb->notifier_call = iommu_bus_notifier;
+       nb->priority = IOMMU_BUS_NOTIFY_PRIORITY;

 static struct notifier_block dmar_pci_bus_add_nb = {
        .notifier_call = dmar_pci_bus_notifier,
-       .priority = INT_MIN,
+       .priority = IOMMU_BUS_NOTIFY_PRIORITY + 1,
 };

 static struct notifier_block dmar_pci_bus_remove_nb = {
        .notifier_call = dmar_pci_bus_notifier,
-       .priority = INT_MIN,
+       .priority = IOMMU_BUS_NOTIFY_PRIORITY - 1,
 };
               

> Best regards,
> baolu


Thanks,

Jacob


* Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
  2022-01-14  3:11   ` Jacob Pan
@ 2022-01-14  3:12     ` Lu Baolu
  2022-01-14 15:24       ` Jacob Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Lu Baolu @ 2022-01-14  3:12 UTC (permalink / raw)
  To: Jacob Pan
  Cc: baolu.lu, iommu, LKML, Joerg Roedel, Jacob Pan, Raj Ashok, Kumar,
	Sanjay K

On 1/14/22 11:11 AM, Jacob Pan wrote:
> Since device_to_iommu() lookup in intel_iommu_release_device() only
> checks if device is under "an" IOMMU, not "the" IOMMU. Then the remove path
> order is not needed, right?
> 
> I know this is not robust, but having so many notifiers with implicit
> priority is not clean either.
> 
> Perhaps, we should have explicit priority defined around iommu_bus
> notifier? i.e.
> 
> @@ -1841,6 +1841,7 @@ static int iommu_bus_init(struct bus_type *bus, const
> struct iommu_ops *ops) return -ENOMEM;
>          nb->notifier_call = iommu_bus_notifier;
>                         
> +       nb->priority = IOMMU_BUS_NOTIFY_PRIORITY;
>                         
> 
>   static struct notifier_block dmar_pci_bus_add_nb = {
>          .notifier_call = dmar_pci_bus_notifier,
> -       .priority = INT_MIN,
> +       .priority = IOMMU_BUS_NOTIFY_PRIORITY + 1,
>   };
> 
>   static struct notifier_block dmar_pci_bus_remove_nb = {
>          .notifier_call = dmar_pci_bus_notifier,
> -       .priority = INT_MIN,
> +       .priority = IOMMU_BUS_NOTIFY_PRIORITY - 1,
>   };

IOMMU_BUS_NOTIFY_PRIORITY is 0 by default, so you can simply use 1 and
-1? Adding a comment around it would be helpful.

Best regards,
baolu


* Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
  2022-01-14  3:12     ` Lu Baolu
@ 2022-01-14 15:24       ` Jacob Pan
  0 siblings, 0 replies; 5+ messages in thread
From: Jacob Pan @ 2022-01-14 15:24 UTC (permalink / raw)
  To: Lu Baolu
  Cc: iommu, LKML, Joerg Roedel, Jacob Pan, Raj Ashok, Kumar, Sanjay K,
	jacob.jun.pan

Hi Lu,

On Fri, 14 Jan 2022 11:12:45 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> IOMMU_BUS_NOTIFY_PRIORITY by default is 0. So you can simply use 1 and
> -1? Adding a comment around it will be helpful.
> 
Yeah, I will add a comment.


Thanks,

Jacob
