iommu.lists.linux-foundation.org archive mirror
* [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
@ 2024-03-21 20:50 Dimitri Sivanich
  2024-03-22  4:41 ` Zhang, Tina
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Dimitri Sivanich @ 2024-03-21 20:50 UTC (permalink / raw)
  To: Thomas Gleixner, Joerg Roedel, Suravee Suthikulpanit,
	Will Deacon, Robin Murphy, David Woodhouse, Lu Baolu,
	Mark Rutland, Peter Zijlstra, Arnd Bergmann, YueHaibing, iommu,
	Dimitri Sivanich
  Cc: linux-kernel, Steve Wahl, Russ Anderson

The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
vectors on the boot cpu.  On large systems with high DMAR counts this
results in vector exhaustion, and most of the vectors are not initially
allocated socket local.

Instead, have a cpu on each node do the vector allocation for the DMARs on
that node.  The boot cpu still does the allocation for its node during its
boot sequence.

Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
---

v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
    this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
    CPUHP_BP_PREPARE_DYN space.

 drivers/iommu/amd/amd_iommu.h | 2 +-
 drivers/iommu/amd/init.c      | 2 +-
 drivers/iommu/intel/dmar.c    | 9 +++++++--
 drivers/iommu/irq_remapping.c | 5 ++++-
 drivers/iommu/irq_remapping.h | 2 +-
 include/linux/dmar.h          | 2 +-
 6 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f482aab420f7..410c360e7e24 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -33,7 +33,7 @@ int amd_iommu_prepare(void);
 int amd_iommu_enable(void);
 void amd_iommu_disable(void);
 int amd_iommu_reenable(int mode);
-int amd_iommu_enable_faulting(void);
+int amd_iommu_enable_faulting(unsigned int cpu);
 extern int amd_iommu_guest_ir;
 extern enum io_pgtable_fmt amd_iommu_pgtable;
 extern int amd_iommu_gpt_level;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index e7a44929f0da..4782f690ed97 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3389,7 +3389,7 @@ int amd_iommu_reenable(int mode)
 	return 0;
 }
 
-int __init amd_iommu_enable_faulting(void)
+int __init amd_iommu_enable_faulting(unsigned int cpu)
 {
 	/* We enable MSI later when PCI is initialized */
 	return 0;
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 36d7427b1202..7644a42f283c 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -2122,7 +2122,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 	return ret;
 }
 
-int __init enable_drhd_fault_handling(void)
+int enable_drhd_fault_handling(unsigned int cpu)
 {
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
@@ -2132,7 +2132,12 @@ int __init enable_drhd_fault_handling(void)
 	 */
 	for_each_iommu(iommu, drhd) {
 		u32 fault_status;
-		int ret = dmar_set_interrupt(iommu);
+		int ret;
+
+		if (iommu->irq || iommu->node != cpu_to_node(cpu))
+			continue;
+
+		ret = dmar_set_interrupt(iommu);
 
 		if (ret) {
 			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index ee59647c2050..2f7281ccc05f 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -151,7 +151,10 @@ int __init irq_remap_enable_fault_handling(void)
 	if (!remap_ops->enable_faulting)
 		return -ENODEV;
 
-	return remap_ops->enable_faulting();
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "dmar:enable_fault_handling",
+			  remap_ops->enable_faulting, NULL);
+
+	return remap_ops->enable_faulting(smp_processor_id());
 }
 
 void panic_if_irq_remap(const char *msg)
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 8c89cb947cdb..0d6f140b5e01 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -41,7 +41,7 @@ struct irq_remap_ops {
 	int  (*reenable)(int);
 
 	/* Enable fault handling */
-	int  (*enable_faulting)(void);
+	int  (*enable_faulting)(unsigned int);
 };
 
 extern struct irq_remap_ops intel_irq_remap_ops;
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index e34b601b71fd..499bb2c63483 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -117,7 +117,7 @@ extern int dmar_remove_dev_scope(struct dmar_pci_notify_info *info,
 				 int count);
 /* Intel IOMMU detection */
 void detect_intel_iommu(void);
-extern int enable_drhd_fault_handling(void);
+extern int enable_drhd_fault_handling(unsigned int cpu);
 extern int dmar_device_add(acpi_handle handle);
 extern int dmar_device_remove(acpi_handle handle);
 
-- 
2.35.3
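
The node filter that this patch adds to enable_drhd_fault_handling() can be modelled in user space roughly as follows. This is only a sketch, not kernel code: struct sim_iommu, sim_cpu_to_node() and sim_enable_fault_handling() are hypothetical stand-ins, the CPU-to-node mapping (consecutive CPU ids per node) is an assumption, and the irq numbers are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_iommu: only the two fields the filter tests. */
struct sim_iommu {
	int irq;	/* 0 means no fault interrupt has been set up yet */
	int node;	/* NUMA node the unit belongs to */
};

/* Assumed topology: cpus_per_node consecutive CPU ids on each node. */
static int sim_cpu_to_node(unsigned int cpu, unsigned int cpus_per_node)
{
	assert(cpus_per_node > 0);
	return (int)(cpu / cpus_per_node);
}

/*
 * Mirrors the patched loop: skip units that already have an irq or that
 * sit on a different node than the CPU coming online.
 * Returns how many units this CPU newly set up.
 */
static size_t sim_enable_fault_handling(struct sim_iommu *units, size_t n,
					unsigned int cpu,
					unsigned int cpus_per_node)
{
	size_t allocated = 0;

	for (size_t i = 0; i < n; i++) {
		if (units[i].irq ||
		    units[i].node != sim_cpu_to_node(cpu, cpus_per_node))
			continue;

		units[i].irq = 100 + (int)i;	/* placeholder, any nonzero value */
		allocated++;
	}
	return allocated;
}
```

Under these assumptions, only the first CPU to come online on a node does any work; later CPUs on the same node find iommu->irq already set and skip every unit.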



* RE: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-03-21 20:50 [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally Dimitri Sivanich
@ 2024-03-22  4:41 ` Zhang, Tina
  2024-03-22 15:03   ` Dimitri Sivanich
  2024-03-22 23:01 ` Jacob Pan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Zhang, Tina @ 2024-03-22  4:41 UTC (permalink / raw)
  To: Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Lu Baolu, Mark Rutland, Peter Zijlstra,
	Arnd Bergmann, YueHaibing, iommu
  Cc: linux-kernel, Steve Wahl, Anderson, Russ

Hi Dimitri,


> -----Original Message-----
> From: Dimitri Sivanich <sivanich@hpe.com>
> Sent: Friday, March 22, 2024 4:51 AM
> To: Thomas Gleixner <tglx@linutronix.de>; Joerg Roedel <joro@8bytes.org>;
> Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Will Deacon
> <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>; David
> Woodhouse <dwmw2@infradead.org>; Lu Baolu <baolu.lu@linux.intel.com>;
> Mark Rutland <mark.rutland@arm.com>; Peter Zijlstra
> <peterz@infradead.org>; Arnd Bergmann <arnd@arndb.de>; YueHaibing
> <yuehaibing@huawei.com>; iommu@lists.linux.dev; Dimitri Sivanich
> <sivanich@hpe.com>
> Cc: linux-kernel@vger.kernel.org; Steve Wahl <steve.wahl@hpe.com>;
> Anderson, Russ <russ.anderson@hpe.com>
> Subject: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
> 
> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> vectors on the boot cpu.  On large systems with high DMAR counts this
> results in vector exhaustion, and most of the vectors are not initially allocated
> socket local.
> 
> Instead, have a cpu on each node do the vector allocation for the DMARs on
> that node.  The boot cpu still does the allocation for its node during its boot
> sequence.
> 
> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
> ---
> 
> v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
>     this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
>     CPUHP_BP_PREPARE_DYN space.
> 
>  drivers/iommu/amd/amd_iommu.h | 2 +-
>  drivers/iommu/amd/init.c      | 2 +-
>  drivers/iommu/intel/dmar.c    | 9 +++++++--
>  drivers/iommu/irq_remapping.c | 5 ++++-
>  drivers/iommu/irq_remapping.h | 2 +-
>  include/linux/dmar.h          | 2 +-
>  6 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> index f482aab420f7..410c360e7e24 100644
> --- a/drivers/iommu/amd/amd_iommu.h
> +++ b/drivers/iommu/amd/amd_iommu.h
> @@ -33,7 +33,7 @@ int amd_iommu_prepare(void);
>  int amd_iommu_enable(void);
>  void amd_iommu_disable(void);
>  int amd_iommu_reenable(int mode);
> -int amd_iommu_enable_faulting(void);
> +int amd_iommu_enable_faulting(unsigned int cpu);
>  extern int amd_iommu_guest_ir;
>  extern enum io_pgtable_fmt amd_iommu_pgtable;
>  extern int amd_iommu_gpt_level;
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index e7a44929f0da..4782f690ed97 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -3389,7 +3389,7 @@ int amd_iommu_reenable(int mode)
>  	return 0;
>  }
> 
> -int __init amd_iommu_enable_faulting(void)
> +int __init amd_iommu_enable_faulting(unsigned int cpu)
>  {
>  	/* We enable MSI later when PCI is initialized */
>  	return 0;
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 36d7427b1202..7644a42f283c 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2122,7 +2122,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>  	return ret;
>  }
> 
> -int __init enable_drhd_fault_handling(void)
> +int enable_drhd_fault_handling(unsigned int cpu)
>  {
>  	struct dmar_drhd_unit *drhd;
>  	struct intel_iommu *iommu;
> @@ -2132,7 +2132,12 @@ int __init enable_drhd_fault_handling(void)
>  	 */
>  	for_each_iommu(iommu, drhd) {
>  		u32 fault_status;
> -		int ret = dmar_set_interrupt(iommu);
> +		int ret;
> +
> +		if (iommu->irq || iommu->node != cpu_to_node(cpu))
> +			continue;

If iommu->irq is set, the current logic still clears any previous faults by accessing DMAR_FSTS_REG. However, the code change in this patch seems to be missing that.

The current logic:
int dmar_set_interrupt(struct intel_iommu *iommu)
{
        int irq, ret;

        /*
         * Check if the fault interrupt is already initialized.
         */
        if (iommu->irq)
                return 0;
        ...

int __init enable_drhd_fault_handling(void)
{
	...
        for_each_iommu(iommu, drhd) {
                u32 fault_status;
                int ret = dmar_set_interrupt(iommu);

                if (ret) {
                        pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
                               (unsigned long long)drhd->reg_base_addr, ret);
                        return -1;
                }

                /*
                 * Clear any previous faults.
                 */
                dmar_fault(iommu->irq, iommu);
                fault_status = readl(iommu->reg + DMAR_FSTS_REG);
                writel(fault_status, iommu->reg + DMAR_FSTS_REG);
        }
	...

Regards,

-Tina


* Re: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-03-22  4:41 ` Zhang, Tina
@ 2024-03-22 15:03   ` Dimitri Sivanich
  0 siblings, 0 replies; 9+ messages in thread
From: Dimitri Sivanich @ 2024-03-22 15:03 UTC (permalink / raw)
  To: Zhang, Tina
  Cc: Thomas Gleixner, Joerg Roedel, Suravee Suthikulpanit,
	Will Deacon, Robin Murphy, David Woodhouse, Lu Baolu,
	Mark Rutland, Peter Zijlstra, Arnd Bergmann, YueHaibing, iommu,
	linux-kernel, Steve Wahl, Anderson, Russ

Hi Tina,

On Fri, Mar 22, 2024 at 04:41:01AM +0000, Zhang, Tina wrote:
> Hi Dimitri,
> 
> 
> > -----Original Message-----
> > From: Dimitri Sivanich <sivanich@hpe.com>
> > Sent: Friday, March 22, 2024 4:51 AM
> > To: Thomas Gleixner <tglx@linutronix.de>; Joerg Roedel <joro@8bytes.org>;
> > Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Will Deacon
> > <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>; David
> > Woodhouse <dwmw2@infradead.org>; Lu Baolu <baolu.lu@linux.intel.com>;
> > Mark Rutland <mark.rutland@arm.com>; Peter Zijlstra
> > <peterz@infradead.org>; Arnd Bergmann <arnd@arndb.de>; YueHaibing
> > <yuehaibing@huawei.com>; iommu@lists.linux.dev; Dimitri Sivanich
> > <sivanich@hpe.com>
> > Cc: linux-kernel@vger.kernel.org; Steve Wahl <steve.wahl@hpe.com>;
> > Anderson, Russ <russ.anderson@hpe.com>
> > Subject: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
> > 
> > The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> > vectors on the boot cpu.  On large systems with high DMAR counts this
> > results in vector exhaustion, and most of the vectors are not initially allocated
> > socket local.
> > 
> > Instead, have a cpu on each node do the vector allocation for the DMARs on
> > that node.  The boot cpu still does the allocation for its node during its boot
> > sequence.
> > 
> > Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
> > ---
> > 
> > v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
> >     this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
> >     CPUHP_BP_PREPARE_DYN space.
> > 
> >  drivers/iommu/amd/amd_iommu.h | 2 +-
> >  drivers/iommu/amd/init.c      | 2 +-
> >  drivers/iommu/intel/dmar.c    | 9 +++++++--
> >  drivers/iommu/irq_remapping.c | 5 ++++-
> >  drivers/iommu/irq_remapping.h | 2 +-
> >  include/linux/dmar.h          | 2 +-
> >  6 files changed, 15 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> > index f482aab420f7..410c360e7e24 100644
> > --- a/drivers/iommu/amd/amd_iommu.h
> > +++ b/drivers/iommu/amd/amd_iommu.h
> > @@ -33,7 +33,7 @@ int amd_iommu_prepare(void);
> >  int amd_iommu_enable(void);
> >  void amd_iommu_disable(void);
> >  int amd_iommu_reenable(int mode);
> > -int amd_iommu_enable_faulting(void);
> > +int amd_iommu_enable_faulting(unsigned int cpu);
> >  extern int amd_iommu_guest_ir;
> >  extern enum io_pgtable_fmt amd_iommu_pgtable;
> >  extern int amd_iommu_gpt_level;
> > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> > index e7a44929f0da..4782f690ed97 100644
> > --- a/drivers/iommu/amd/init.c
> > +++ b/drivers/iommu/amd/init.c
> > @@ -3389,7 +3389,7 @@ int amd_iommu_reenable(int mode)
> >  	return 0;
> >  }
> > 
> > -int __init amd_iommu_enable_faulting(void)
> > +int __init amd_iommu_enable_faulting(unsigned int cpu)
> >  {
> >  	/* We enable MSI later when PCI is initialized */
> >  	return 0;
> > diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> > index 36d7427b1202..7644a42f283c 100644
> > --- a/drivers/iommu/intel/dmar.c
> > +++ b/drivers/iommu/intel/dmar.c
> > @@ -2122,7 +2122,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
> >  	return ret;
> >  }
> > 
> > -int __init enable_drhd_fault_handling(void)
> > +int enable_drhd_fault_handling(unsigned int cpu)
> >  {
> >  	struct dmar_drhd_unit *drhd;
> >  	struct intel_iommu *iommu;
> > @@ -2132,7 +2132,12 @@ int __init enable_drhd_fault_handling(void)
> >  	 */
> >  	for_each_iommu(iommu, drhd) {
> >  		u32 fault_status;
> > -		int ret = dmar_set_interrupt(iommu);
> > +		int ret;
> > +
> > +		if (iommu->irq || iommu->node != cpu_to_node(cpu))
> > +			continue;
> If iommu->irq is set, the current logic still clears any previous faults by accessing DMAR_FSTS_REG. However, the code change in this patch seems to be missing that.

Yes, the current logic does clear the faults even in the iommu->irq!=0 case,
but enable_drhd_fault_handling is currently only run once by the boot cpu,
prior to startup of the other processors.  So each iommu will, initially at
least, only have its faults cleared once.

With this patch, enable_drhd_fault_handling will still run on the boot cpu
prior to the startup of other processors, but will also run on each AP as it is
brought up (hotplugged).  It will now only operate on the iommus that are on
its own socket, however.  If we add back the fault clearing for the
iommu->irq!=0 case, each iommu on a socket would end up having faults cleared
once for each cpu on the socket (so 120 times for a 60-core socket with HT
enabled).  In addition, this would happen again each time a cpu is hot-plugged
after having been hot-unplugged.

So is there a specific reason why faults would need to be cleared if the
iommu->irq has already been set, since I assume they would've already been
cleared during this boot?

I'm definitely willing to add back the fault clearing for the iommu->irq!=0
case, but am looking for guidance from others like yourself on this.
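
To make the count above concrete, here is a small user-space sketch of how often a single IOMMU would have its faults cleared while every CPU of its socket comes online once. It is an illustration only: sim_clears_per_iommu() and its parameters are hypothetical, and the model assumes the fault clearing runs whenever a unit is not skipped by the iommu->irq check.

```c
#include <assert.h>

/*
 * Count how many times one IOMMU's faults are cleared while every CPU of
 * its socket runs the hotplug callback once.
 *
 * skip_if_irq_set != 0 models the patch as posted (units with iommu->irq
 * already set are skipped); 0 models adding the clearing back for the
 * iommu->irq != 0 case.
 */
static unsigned int sim_clears_per_iommu(unsigned int cores,
					 unsigned int threads_per_core,
					 int skip_if_irq_set)
{
	unsigned int cpus = cores * threads_per_core;
	unsigned int clears = 0;
	int irq = 0;	/* 0: fault interrupt not yet allocated */

	assert(threads_per_core > 0);

	for (unsigned int cpu = 0; cpu < cpus; cpu++) {
		if (skip_if_irq_set && irq)
			continue;

		irq = 1;	/* allocation happens once, later calls are no-ops */
		clears++;	/* faults are cleared on every non-skipped pass */
	}
	return clears;
}
```

With 60 cores and 2 threads per core, the model gives 1 clear per IOMMU when already-initialized units are skipped, and 120 when they are not, matching the figure quoted above.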

> 
> The current logic:
> int dmar_set_interrupt(struct intel_iommu *iommu)
> {
>         int irq, ret;
> 
>         /*
>          * Check if the fault interrupt is already initialized.
>          */
>         if (iommu->irq)
>                 return 0;
>         ...
> 
> int __init enable_drhd_fault_handling(void)
> {
> 	...
>         for_each_iommu(iommu, drhd) {
>                 u32 fault_status;
>                 int ret = dmar_set_interrupt(iommu);
> 
>                 if (ret) {
>                         pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
>                                (unsigned long long)drhd->reg_base_addr, ret);
>                         return -1;
>                 }
> 
>                 /*
>                  * Clear any previous faults.
>                  */
>                 dmar_fault(iommu->irq, iommu);
>                 fault_status = readl(iommu->reg + DMAR_FSTS_REG);
>                 writel(fault_status, iommu->reg + DMAR_FSTS_REG);
>         }
> 	...
> 
> Regards,
> 
> -Tina


* Re: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-03-21 20:50 [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally Dimitri Sivanich
  2024-03-22  4:41 ` Zhang, Tina
@ 2024-03-22 23:01 ` Jacob Pan
  2024-04-08  6:54 ` Tian, Kevin
  2024-04-24  3:43 ` Baolu Lu
  3 siblings, 0 replies; 9+ messages in thread
From: Jacob Pan @ 2024-03-22 23:01 UTC (permalink / raw)
  To: Dimitri Sivanich
  Cc: Thomas Gleixner, Joerg Roedel, Suravee Suthikulpanit,
	Will Deacon, Robin Murphy, David Woodhouse, Lu Baolu,
	Mark Rutland, Peter Zijlstra, Arnd Bergmann, YueHaibing, iommu,
	linux-kernel, Steve Wahl, Russ Anderson, jacob.jun.pan

Hi Dimitri,

On Thu, 21 Mar 2024 15:50:46 -0500, Dimitri Sivanich <sivanich@hpe.com>
wrote:

> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> vectors on the boot cpu.  On large systems with high DMAR counts this
> results in vector exhaustion, and most of the vectors are not initially
> allocated socket local.
> 
> Instead, have a cpu on each node do the vector allocation for the DMARs on
> that node.  The boot cpu still does the allocation for its node during its
> boot sequence.
> 
> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
> ---
> 
> v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
>     this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
>     CPUHP_BP_PREPARE_DYN space.
> 

I tested this successfully on a dual-socket system (192 cores) with the following:
1. After boot, DMAR-MSI is spread from the BSP to the first CPU of the second
   NUMA node
2. Offline/online all CPUs in the second node

Code looks good to me.


Thanks,

Jacob

>  drivers/iommu/amd/amd_iommu.h | 2 +-
>  drivers/iommu/amd/init.c      | 2 +-
>  drivers/iommu/intel/dmar.c    | 9 +++++++--
>  drivers/iommu/irq_remapping.c | 5 ++++-
>  drivers/iommu/irq_remapping.h | 2 +-
>  include/linux/dmar.h          | 2 +-
>  6 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> index f482aab420f7..410c360e7e24 100644
> --- a/drivers/iommu/amd/amd_iommu.h
> +++ b/drivers/iommu/amd/amd_iommu.h
> @@ -33,7 +33,7 @@ int amd_iommu_prepare(void);
>  int amd_iommu_enable(void);
>  void amd_iommu_disable(void);
>  int amd_iommu_reenable(int mode);
> -int amd_iommu_enable_faulting(void);
> +int amd_iommu_enable_faulting(unsigned int cpu);
>  extern int amd_iommu_guest_ir;
>  extern enum io_pgtable_fmt amd_iommu_pgtable;
>  extern int amd_iommu_gpt_level;
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index e7a44929f0da..4782f690ed97 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -3389,7 +3389,7 @@ int amd_iommu_reenable(int mode)
>  	return 0;
>  }
>  
> -int __init amd_iommu_enable_faulting(void)
> +int __init amd_iommu_enable_faulting(unsigned int cpu)
>  {
>  	/* We enable MSI later when PCI is initialized */
>  	return 0;
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 36d7427b1202..7644a42f283c 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2122,7 +2122,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>  	return ret;
>  }
>  
> -int __init enable_drhd_fault_handling(void)
> +int enable_drhd_fault_handling(unsigned int cpu)
>  {
>  	struct dmar_drhd_unit *drhd;
>  	struct intel_iommu *iommu;
> @@ -2132,7 +2132,12 @@ int __init enable_drhd_fault_handling(void)
>  	 */
>  	for_each_iommu(iommu, drhd) {
>  		u32 fault_status;
> -		int ret = dmar_set_interrupt(iommu);
> +		int ret;
> +
> +		if (iommu->irq || iommu->node != cpu_to_node(cpu))
> +			continue;
> +
> +		ret = dmar_set_interrupt(iommu);
>  
>  		if (ret) {
>  			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index ee59647c2050..2f7281ccc05f 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c
> @@ -151,7 +151,10 @@ int __init irq_remap_enable_fault_handling(void)
>  	if (!remap_ops->enable_faulting)
>  		return -ENODEV;
>  
> -	return remap_ops->enable_faulting();
> +	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "dmar:enable_fault_handling",
> +			  remap_ops->enable_faulting, NULL);
> +
> +	return remap_ops->enable_faulting(smp_processor_id());
>  }
>  
>  void panic_if_irq_remap(const char *msg)
> diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
> index 8c89cb947cdb..0d6f140b5e01 100644
> --- a/drivers/iommu/irq_remapping.h
> +++ b/drivers/iommu/irq_remapping.h
> @@ -41,7 +41,7 @@ struct irq_remap_ops {
>  	int  (*reenable)(int);
>  
>  	/* Enable fault handling */
> -	int  (*enable_faulting)(void);
> +	int  (*enable_faulting)(unsigned int);
>  };
>  
>  extern struct irq_remap_ops intel_irq_remap_ops;
> diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> index e34b601b71fd..499bb2c63483 100644
> --- a/include/linux/dmar.h
> +++ b/include/linux/dmar.h
> @@ -117,7 +117,7 @@ extern int dmar_remove_dev_scope(struct dmar_pci_notify_info *info,
>  				 int count);
>  /* Intel IOMMU detection */
>  void detect_intel_iommu(void);
> -extern int enable_drhd_fault_handling(void);
> +extern int enable_drhd_fault_handling(unsigned int cpu);
>  extern int dmar_device_add(acpi_handle handle);
>  extern int dmar_device_remove(acpi_handle handle);
>  


Thanks,

Jacob


* RE: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-03-21 20:50 [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally Dimitri Sivanich
  2024-03-22  4:41 ` Zhang, Tina
  2024-03-22 23:01 ` Jacob Pan
@ 2024-04-08  6:54 ` Tian, Kevin
  2024-04-08  7:21   ` Baolu Lu
  2024-04-24  3:43 ` Baolu Lu
  3 siblings, 1 reply; 9+ messages in thread
From: Tian, Kevin @ 2024-04-08  6:54 UTC (permalink / raw)
  To: Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Lu Baolu, Mark Rutland, Peter Zijlstra,
	Arnd Bergmann, YueHaibing, iommu
  Cc: linux-kernel, Steve Wahl, Anderson, Russ

> From: Dimitri Sivanich <sivanich@hpe.com>
> Sent: Friday, March 22, 2024 4:51 AM
> 
> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> vectors on the boot cpu.  On large systems with high DMAR counts this
> results in vector exhaustion, and most of the vectors are not initially
> allocated socket local.
> 
> Instead, have a cpu on each node do the vector allocation for the DMARs on
> that node.  The boot cpu still does the allocation for its node during its
> boot sequence.
> 
> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-04-08  6:54 ` Tian, Kevin
@ 2024-04-08  7:21   ` Baolu Lu
  2024-04-08  9:00     ` Tian, Kevin
  0 siblings, 1 reply; 9+ messages in thread
From: Baolu Lu @ 2024-04-08  7:21 UTC (permalink / raw)
  To: Tian, Kevin, Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Mark Rutland, Peter Zijlstra, Arnd Bergmann,
	YueHaibing, iommu
  Cc: baolu.lu, linux-kernel, Steve Wahl, Anderson, Russ, Jacob Pan

On 2024/4/8 14:54, Tian, Kevin wrote:
>> From: Dimitri Sivanich <sivanich@hpe.com>
>> Sent: Friday, March 22, 2024 4:51 AM
>>
>> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
>> vectors on the boot cpu.  On large systems with high DMAR counts this
>> results in vector exhaustion, and most of the vectors are not initially
>> allocated socket local.
>>
>> Instead, have a cpu on each node do the vector allocation for the DMARs on
>> that node.  The boot cpu still does the allocation for its node during its
>> boot sequence.
>>
>> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 

Kevin,

Jacob has another proposal which shares the irq among all IOMMUs.

https://lore.kernel.org/linux-iommu/20240403234548.989061-1-jacob.jun.pan@linux.intel.com/

How do you like this?

Best regards,
baolu


* RE: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-04-08  7:21   ` Baolu Lu
@ 2024-04-08  9:00     ` Tian, Kevin
  2024-04-08 16:39       ` Jacob Pan
  0 siblings, 1 reply; 9+ messages in thread
From: Tian, Kevin @ 2024-04-08  9:00 UTC (permalink / raw)
  To: Baolu Lu, Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Mark Rutland, Peter Zijlstra, Arnd Bergmann,
	YueHaibing, iommu
  Cc: linux-kernel, Steve Wahl, Anderson, Russ, Jacob Pan

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Monday, April 8, 2024 3:22 PM
> 
> On 2024/4/8 14:54, Tian, Kevin wrote:
> >> From: Dimitri Sivanich <sivanich@hpe.com>
> >> Sent: Friday, March 22, 2024 4:51 AM
> >>
> >> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> >> vectors on the boot cpu.  On large systems with high DMAR counts this
> >> results in vector exhaustion, and most of the vectors are not initially
> >> allocated socket local.
> >>
> >> Instead, have a cpu on each node do the vector allocation for the DMARs on
> >> that node.  The boot cpu still does the allocation for its node during its
> >> boot sequence.
> >>
> >> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >
> 
> Kevin,
> 
> Jacob has another proposal which shares the irq among all IOMMUs.
> 
> https://lore.kernel.org/linux-iommu/20240403234548.989061-1-jacob.jun.pan@linux.intel.com/
> 
> How do you like this?
> 

I'm a bit concerned about the need to loop over all IOMMUs in the DMAR
irqchip mask/unmask handlers. This one sounds simpler to me.


* Re: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-04-08  9:00     ` Tian, Kevin
@ 2024-04-08 16:39       ` Jacob Pan
  0 siblings, 0 replies; 9+ messages in thread
From: Jacob Pan @ 2024-04-08 16:39 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Baolu Lu, Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Mark Rutland, Peter Zijlstra, Arnd Bergmann,
	YueHaibing, iommu, linux-kernel, Steve Wahl, Anderson, Russ,
	jacob.jun.pan

Hi Kevin,

On Mon, 8 Apr 2024 09:00:05 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Baolu Lu <baolu.lu@linux.intel.com>
> > Sent: Monday, April 8, 2024 3:22 PM
> > 
> > On 2024/4/8 14:54, Tian, Kevin wrote:  
> > >> From: Dimitri Sivanich <sivanich@hpe.com>
> > >> Sent: Friday, March 22, 2024 4:51 AM
> > >>
> > >> The Intel IOMMU code currently tries to allocate all DMAR fault
> > >> interrupt vectors on the boot cpu.  On large systems with high DMAR
> > >> counts this results in vector exhaustion, and most of the vectors
> > >> are not initially allocated socket local.
> > >>
> > >> Instead, have a cpu on each node do the vector allocation for the
> > >> DMARs on that node.  The boot cpu still does the allocation for its
> > >> node during its boot sequence.
> > >>
> > >> Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>  
> > >
> > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > >  
> > 
> > Kevin,
> > 
> > Jacob has another proposal which shares the irq among all IOMMUs.
> > 
> > https://lore.kernel.org/linux-iommu/20240403234548.989061-1-jacob.jun.pan@linux.intel.com/
> > 
> > How do you like this?
> >   
> 
> I'm a bit concerned about the need to loop over all IOMMUs in the DMAR
> irqchip mask/unmask handlers. This one sounds simpler to me.

The difference is that with this patch, we still burn a few vectors on the BSP
and on the leading CPU of each socket.

E.g. on Sapphire Rapids, we lose 8 vectors to DMAR fault IRQs on the BSP.

Thanks,

Jacob
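
The vector cost discussed above can be put into numbers with a rough model. This is a sketch under stated assumptions, not a measurement: it assumes every socket has the same number of DMAR units (iommus_per_socket is a hypothetical parameter), one fault vector per unit, and that the first CPU to come online on a socket takes all of that socket's vectors. struct sim_vector_load and sim_fault_vectors() are illustrative names.

```c
#include <assert.h>

/* Fault vectors consumed on the boot CPU and on each other socket's leading CPU. */
struct sim_vector_load {
	unsigned int on_boot_cpu;
	unsigned int on_each_other_leading_cpu;
};

/*
 * node_local == 0: every fault vector is allocated on the boot CPU
 *                  (the behaviour the patch description starts from).
 * node_local != 0: each socket's leading CPU allocates the vectors for
 *                  the DMAR units on its own socket (this patch).
 */
static struct sim_vector_load sim_fault_vectors(unsigned int sockets,
						unsigned int iommus_per_socket,
						int node_local)
{
	struct sim_vector_load load;

	assert(sockets > 0);

	if (node_local) {
		load.on_boot_cpu = iommus_per_socket;
		load.on_each_other_leading_cpu =
			sockets > 1 ? iommus_per_socket : 0;
	} else {
		load.on_boot_cpu = sockets * iommus_per_socket;
		load.on_each_other_leading_cpu = 0;
	}
	return load;
}
```

For example, with 8 units per socket the model keeps the boot CPU at 8 vectors regardless of socket count when allocation is node local, whereas 16 such sockets would put 128 vectors on the boot CPU otherwise.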


* Re: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
  2024-03-21 20:50 [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally Dimitri Sivanich
                   ` (2 preceding siblings ...)
  2024-04-08  6:54 ` Tian, Kevin
@ 2024-04-24  3:43 ` Baolu Lu
  3 siblings, 0 replies; 9+ messages in thread
From: Baolu Lu @ 2024-04-24  3:43 UTC (permalink / raw)
  To: Dimitri Sivanich, Thomas Gleixner, Joerg Roedel,
	Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	David Woodhouse, Mark Rutland, Peter Zijlstra, Arnd Bergmann,
	YueHaibing, iommu
  Cc: baolu.lu, linux-kernel, Steve Wahl, Russ Anderson

On 3/22/24 4:50 AM, Dimitri Sivanich wrote:
> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> vectors on the boot cpu.  On large systems with high DMAR counts this
> results in vector exhaustion, and most of the vectors are not initially
> allocated socket local.
> 
> Instead, have a cpu on each node do the vector allocation for the DMARs on
> that node.  The boot cpu still does the allocation for its node during its
> boot sequence.
> 
> Signed-off-by: Dimitri Sivanich<sivanich@hpe.com>
> ---
> 
> v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
>      this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
>      CPUHP_BP_PREPARE_DYN space.

Patch has been queued for iommu/vt-d.

Best regards,
baolu


Thread overview: 9+ messages
2024-03-21 20:50 [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally Dimitri Sivanich
2024-03-22  4:41 ` Zhang, Tina
2024-03-22 15:03   ` Dimitri Sivanich
2024-03-22 23:01 ` Jacob Pan
2024-04-08  6:54 ` Tian, Kevin
2024-04-08  7:21   ` Baolu Lu
2024-04-08  9:00     ` Tian, Kevin
2024-04-08 16:39       ` Jacob Pan
2024-04-24  3:43 ` Baolu Lu
