linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
@ 2024-04-18 15:31 Nirmal Patel
  2024-04-25 21:10 ` Nirmal Patel
  2024-04-26 22:39 ` Bjorn Helgaas
  0 siblings, 2 replies; 5+ messages in thread
From: Nirmal Patel @ 2024-04-18 15:31 UTC (permalink / raw)
  To: nirmal.patel, linux-pci

VMD MSI remapping is disabled by default for all the CPUs with 28c0 VMD
deviceID. We used to disable remapping because drives supported more
vectors than the VMD so the performance was better without remapping.
Now with CPUs that support more than 64 (128 VMD MSIx vectors for gen5)
we no longer need to disable this feature.

Note, pci_msix_vec_count() failure is translated to ENODEV per typical
expectations that drivers may return ENODEV when some driver-known
fundamental detail of the device is missing.

Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
---
v1->v2: Updating commit message.
v2->v3: Use VMD MSI count instead of cpu count.
v3->v4: Updating commit message.
---
 drivers/pci/controller/vmd.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 769eedeb8802..ba63af70bb63 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -34,6 +34,8 @@
 #define MB2_SHADOW_OFFSET	0x2000
 #define MB2_SHADOW_SIZE		16
 
+#define VMD_MIN_MSI_VECTOR_COUNT 64
+
 enum vmd_features {
 	/*
 	 * Device may contain registers which hint the physical location of the
@@ -807,13 +809,20 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 
 	sd->node = pcibus_to_node(vmd->dev->bus);
 
+	vmd->msix_count = pci_msix_vec_count(vmd->dev);
+	if (vmd->msix_count < 0)
+		return -ENODEV;
+
 	/*
 	 * Currently MSI remapping must be enabled in guest passthrough mode
 	 * due to some missing interrupt remapping plumbing. This is probably
 	 * acceptable because the guest is usually CPU-limited and MSI
 	 * remapping doesn't become a performance bottleneck.
+	 * Disable MSI remapping only if supported by VMD hardware and when
+	 * VMD MSI count is less than or equal to minimum MSI count.
 	 */
 	if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
+	    vmd->msix_count > VMD_MIN_MSI_VECTOR_COUNT ||
 	    offset[0] || offset[1]) {
 		ret = vmd_alloc_irqs(vmd);
 		if (ret)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
  2024-04-18 15:31 [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count Nirmal Patel
@ 2024-04-25 21:10 ` Nirmal Patel
  2024-04-26 22:39 ` Bjorn Helgaas
  1 sibling, 0 replies; 5+ messages in thread
From: Nirmal Patel @ 2024-04-25 21:10 UTC (permalink / raw)
  To: lpieralisi, linux-pci, paul.e.luse

On Thu, 18 Apr 2024 11:31:21 -0400
Nirmal Patel <nirmal.patel@linux.intel.com> wrote:

> VMD MSI remapping is disabled by default for all the CPUs with 28c0
> VMD deviceID. We used to disable remapping because drives supported
> more vectors than the VMD so the performance was better without
> remapping. Now with CPUs that support more than 64 (128 VMD MSIx
> vectors for gen5) we no longer need to disable this feature.
> 
> Note, pci_msix_vec_count() failure is translated to ENODEV per typical
> expectations that drivers may return ENODEV when some driver-known
> fundamental detail of the device is missing.
> 
> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> ---
> v1->v2: Updating commit message.
> v2->v3: Use VMD MSI count instead of cpu count.
> v3->v4: Updating commit message.
> ---
>  drivers/pci/controller/vmd.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/controller/vmd.c
> b/drivers/pci/controller/vmd.c index 769eedeb8802..ba63af70bb63 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -34,6 +34,8 @@
>  #define MB2_SHADOW_OFFSET	0x2000
>  #define MB2_SHADOW_SIZE		16
>  
> +#define VMD_MIN_MSI_VECTOR_COUNT 64
> +
>  enum vmd_features {
>  	/*
>  	 * Device may contain registers which hint the physical
> location of the @@ -807,13 +809,20 @@ static int
> vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) 
>  	sd->node = pcibus_to_node(vmd->dev->bus);
>  
> +	vmd->msix_count = pci_msix_vec_count(vmd->dev);
> +	if (vmd->msix_count < 0)
> +		return -ENODEV;
> +
>  	/*
>  	 * Currently MSI remapping must be enabled in guest
> passthrough mode
>  	 * due to some missing interrupt remapping plumbing. This is
> probably
>  	 * acceptable because the guest is usually CPU-limited and
> MSI
>  	 * remapping doesn't become a performance bottleneck.
> +	 * Disable MSI remapping only if supported by VMD hardware
> and when
> +	 * VMD MSI count is less than or equal to minimum MSI count.
>  	 */
>  	if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> +	    vmd->msix_count > VMD_MIN_MSI_VECTOR_COUNT ||
>  	    offset[0] || offset[1]) {
>  		ret = vmd_alloc_irqs(vmd);
>  		if (ret)

Gentle ping!

Thanks
nirmal

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
  2024-04-18 15:31 [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count Nirmal Patel
  2024-04-25 21:10 ` Nirmal Patel
@ 2024-04-26 22:39 ` Bjorn Helgaas
  2024-05-07  0:39   ` Nirmal Patel
  1 sibling, 1 reply; 5+ messages in thread
From: Bjorn Helgaas @ 2024-04-26 22:39 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci

I think this refers specifically to MSI-X, so use "MSI-X" in the
subject.

On Thu, Apr 18, 2024 at 11:31:21AM -0400, Nirmal Patel wrote:
> VMD MSI remapping is disabled by default for all the CPUs with 28c0 VMD
> deviceID. We used to disable remapping because drives supported more
> vectors than the VMD so the performance was better without remapping.
> Now with CPUs that support more than 64 (128 VMD MSIx vectors for gen5)
> we no longer need to disable this feature.

"because drives supported more vectors" ... I guess you are referring
to typical devices that might be behind a VMD?  But I assume there's
no actual requirement that those devices be "drives", right?

"CPUs that support more than 64 ... 128 VMD vectors"  Are we talking
about *CPUs* that support more vectors, or *VMDs* that support more
vectors?

I guess you probably think of CPUs here because VMD is integrated into
the same package, right?  That would explain the "CPUs with 28c0 VMD"
comment.  But the vmd driver doesn't care about that; it just claims a
PCI device.

s/MSI remapping/MSI-X remapping/ (I think?)
s/MSIx/MSI-X/ to match spec usage.

A reference to ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when
possible"), which added VMD_FEAT_CAN_BYPASS_MSI_REMAP, might be
useful because it has nice context.

IIUC this will keep MSI-X remapping enabled in more cases, e.g., on
new devices that support more vectors.  What is the benefit of keeping
it enabled?

The ee81ee84f873 commit log suggests two issues:

  - Number of vectors available to child domain is limited by size of
    VMD MSI-X table.

  - Remapping means child interrupts have to go through the VMD domain
    interrupt handler instead of going straight to the device handler.

But this commit log suggests that with more vectors, you can enable
remapping even without a performance penalty?  Maybe the VMD domain
interrupt handler was only needed because of vector sharing?

I'm just a little confused because this commit log doesn't say what
the actual benefit is, other than "keeping remapping enabled", and I
don't know enough to know why that's good.

> Note, pci_msix_vec_count() failure is translated to ENODEV per typical
> expectations that drivers may return ENODEV when some driver-known
> fundamental detail of the device is missing.
> 
> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> ---
> v1->v2: Updating commit message.
> v2->v3: Use VMD MSI count instead of cpu count.
> v3->v4: Updating commit message.
> ---
>  drivers/pci/controller/vmd.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 769eedeb8802..ba63af70bb63 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -34,6 +34,8 @@
>  #define MB2_SHADOW_OFFSET	0x2000
>  #define MB2_SHADOW_SIZE		16
>  
> +#define VMD_MIN_MSI_VECTOR_COUNT 64
> +
>  enum vmd_features {
>  	/*
>  	 * Device may contain registers which hint the physical location of the
> @@ -807,13 +809,20 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  
>  	sd->node = pcibus_to_node(vmd->dev->bus);
>  
> +	vmd->msix_count = pci_msix_vec_count(vmd->dev);
> +	if (vmd->msix_count < 0)
> +		return -ENODEV;
> +
>  	/*
>  	 * Currently MSI remapping must be enabled in guest passthrough mode
>  	 * due to some missing interrupt remapping plumbing. This is probably
>  	 * acceptable because the guest is usually CPU-limited and MSI
>  	 * remapping doesn't become a performance bottleneck.

This part of the comment might need some updating.  I don't see the
connection with guest passthrough mode in the code.

> +	 * Disable MSI remapping only if supported by VMD hardware and when
> +	 * VMD MSI count is less than or equal to minimum MSI count.

Add blank line between paragraphs or rewrap into a single paragraph.

>  	 */
>  	if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> +	    vmd->msix_count > VMD_MIN_MSI_VECTOR_COUNT ||
>  	    offset[0] || offset[1]) {

I think this conditional might be easier to read if it were inverted:

  if (features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) && ...) {
    vmd_set_msi_remapping(vmd, false);
  } else {
    ret = vmd_alloc_irqs(vmd);
    ...
  }

Maybe a separate patch though.

>  		ret = vmd_alloc_irqs(vmd);
>  		if (ret)
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
  2024-04-26 22:39 ` Bjorn Helgaas
@ 2024-05-07  0:39   ` Nirmal Patel
  2024-05-08  1:46     ` Bjorn Helgaas
  0 siblings, 1 reply; 5+ messages in thread
From: Nirmal Patel @ 2024-05-07  0:39 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci

On Fri, 26 Apr 2024 17:39:57 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> I think this refers specifically to MSI-X, so use "MSI-X" in the
> subject.
ACK.
> 
> On Thu, Apr 18, 2024 at 11:31:21AM -0400, Nirmal Patel wrote:
> > VMD MSI remapping is disabled by default for all the CPUs with 28c0
> > VMD deviceID. We used to disable remapping because drives supported
> > more vectors than the VMD so the performance was better without
> > remapping. Now with CPUs that support more than 64 (128 VMD MSIx
> > vectors for gen5) we no longer need to disable this feature.  
> 
> "because drives supported more vectors" ... I guess you are referring
> to typical devices that might be behind a VMD?  But I assume there's
> no actual requirement that those devices be "drives", right?
Yes, I am referring to PCIe device with equal or more MSI-X vectors
than VMD. If a device has lesser than minimum VMD MSI-X count (64) than
VMD will not cause any performance issue while using MSI-X remapping.
> 
> "CPUs that support more than 64 ... 128 VMD vectors"  Are we talking
> about *CPUs* that support more vectors, or *VMDs* that support more
> vectors?
> 
> I guess you probably think of CPUs here because VMD is integrated into
> the same package, right?  That would explain the "CPUs with 28c0 VMD"
> comment.  But the vmd driver doesn't care about that; it just claims a
> PCI device.
Yes, I will adjust my wordings; but I meant newer VMD which still has
same deviceID (28c0) as previous generations but it has more MSI-X
vectors.(i.e. 128)
> 
> s/MSI remapping/MSI-X remapping/ (I think?)
> s/MSIx/MSI-X/ to match spec usage.
I will make adjustments.
> 
> A reference to ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when
> possible"), which added VMD_FEAT_CAN_BYPASS_MSI_REMAP, might be
> useful because it has nice context.
> 
> IIUC this will keep MSI-X remapping enabled in more cases, e.g., on
> new devices that support more vectors.  What is the benefit of keeping
> it enabled?
Sorry I took longer to respond.

VMD MSI-X remapping was a performance bottleneck in certain
situations. Starting from 28c0, VMD has a capability to disable MSI-X
remapping and improve the I/O performance. The first iteration of 28c0
VMD HW had only 64 MSI-X vectors support while the newer iterations can
support up to 128 and VMD is no longer a bottleneck. So I thought it
would be a good idea to change it to MSI-X remapping default ON.

Also upon further testings, I noticed huge boost in performance because
of this CID patch:
https://lore.kernel.org/kvm/20240423174114.526704-5-jacob.jun.pan@linux.intel.com/T/ 

The performance boost we get from the CID patch as follow:
Kernel 6.8.8 : 1Drive: 2000, 4Drives: 2300
6.9.0-rc6 + CID + MSI-X remap Disable: 1Drive: 2700, 4Drives: 6010
6.9.0-rc6 + CID + MSI-X remap Enabled: 1Drive: 2700, 4Drives: 6100

Since there is no significant performance difference between MSI-X
enable and disable after addition of CID patch, I think we can drop this
patch for now until we see significant change in I/O performance due to
VMD's MSI-X remapping policy.

Thanks for your time.

-nirmal
> 
> The ee81ee84f873 commit log suggests two issues:
> 
>   - Number of vectors available to child domain is limited by size of
>     VMD MSI-X table.
> 
>   - Remapping means child interrupts have to go through the VMD domain
>     interrupt handler instead of going straight to the device handler.
> 
> But this commit log suggests that with more vectors, you can enable
> remapping even without a performance penalty?  Maybe the VMD domain
> interrupt handler was only needed because of vector sharing?
> 
> I'm just a little confused because this commit log doesn't say what
> the actual benefit is, other than "keeping remapping enabled", and I
> don't know enough to know why that's good.
> 
> > Note, pci_msix_vec_count() failure is translated to ENODEV per
> > typical expectations that drivers may return ENODEV when some
> > driver-known fundamental detail of the device is missing.
> > 
> > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > ---
> > v1->v2: Updating commit message.
> > v2->v3: Use VMD MSI count instead of cpu count.
> > v3->v4: Updating commit message.
> > ---
> >  drivers/pci/controller/vmd.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c index 769eedeb8802..ba63af70bb63
> > 100644 --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -34,6 +34,8 @@
> >  #define MB2_SHADOW_OFFSET	0x2000
> >  #define MB2_SHADOW_SIZE		16
> >  
> > +#define VMD_MIN_MSI_VECTOR_COUNT 64
> > +
> >  enum vmd_features {
> >  	/*
> >  	 * Device may contain registers which hint the physical
> > location of the @@ -807,13 +809,20 @@ static int
> > vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) 
> >  	sd->node = pcibus_to_node(vmd->dev->bus);
> >  
> > +	vmd->msix_count = pci_msix_vec_count(vmd->dev);
> > +	if (vmd->msix_count < 0)
> > +		return -ENODEV;
> > +
> >  	/*
> >  	 * Currently MSI remapping must be enabled in guest
> > passthrough mode
> >  	 * due to some missing interrupt remapping plumbing. This
> > is probably
> >  	 * acceptable because the guest is usually CPU-limited and
> > MSI
> >  	 * remapping doesn't become a performance bottleneck.  
> 
> This part of the comment might need some updating.  I don't see the
> connection with guest passthrough mode in the code.
> 
> > +	 * Disable MSI remapping only if supported by VMD hardware
> > and when
> > +	 * VMD MSI count is less than or equal to minimum MSI
> > count.  
> 
> Add blank line between paragraphs or rewrap into a single paragraph.
> 
> >  	 */
> >  	if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> > +	    vmd->msix_count > VMD_MIN_MSI_VECTOR_COUNT ||
> >  	    offset[0] || offset[1]) {  
> 
> I think this conditional might be easier to read if it were inverted:
> 
>   if (features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) && ...) {
>     vmd_set_msi_remapping(vmd, false);
>   } else {
>     ret = vmd_alloc_irqs(vmd);
>     ...
>   }
> 
> Maybe a separate patch though.
> 
> >  		ret = vmd_alloc_irqs(vmd);
> >  		if (ret)
> > -- 
> > 2.31.1
> >   


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count
  2024-05-07  0:39   ` Nirmal Patel
@ 2024-05-08  1:46     ` Bjorn Helgaas
  0 siblings, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2024-05-08  1:46 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci

On Mon, May 06, 2024 at 05:39:01PM -0700, Nirmal Patel wrote:
> On Fri, 26 Apr 2024 17:39:57 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Thu, Apr 18, 2024 at 11:31:21AM -0400, Nirmal Patel wrote:
> ...

> > A reference to ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when
> > possible"), which added VMD_FEAT_CAN_BYPASS_MSI_REMAP, might be
> > useful because it has nice context.
> > 
> > IIUC this will keep MSI-X remapping enabled in more cases, e.g., on
> > new devices that support more vectors.  What is the benefit of keeping
> > it enabled?
> 
> VMD MSI-X remapping was a performance bottleneck in certain
> situations. Starting from 28c0, VMD has a capability to disable MSI-X
> remapping and improve the I/O performance. The first iteration of 28c0
> VMD HW had only 64 MSI-X vectors support while the newer iterations can
> support up to 128 and VMD is no longer a bottleneck. So I thought it
> would be a good idea to change it to MSI-X remapping default ON.
> 
> Also upon further testings, I noticed huge boost in performance because
> of this CID patch:
> https://lore.kernel.org/kvm/20240423174114.526704-5-jacob.jun.pan@linux.intel.com/T/ 
> 
> The performance boost we get from the CID patch as follow:
> Kernel 6.8.8 : 1Drive: 2000, 4Drives: 2300
> 6.9.0-rc6 + CID + MSI-X remap Disable: 1Drive: 2700, 4Drives: 6010
> 6.9.0-rc6 + CID + MSI-X remap Enabled: 1Drive: 2700, 4Drives: 6100
> 
> Since there is no significant performance difference between MSI-X
> enable and disable after addition of CID patch, I think we can drop this
> patch for now until we see significant change in I/O performance due to
> VMD's MSI-X remapping policy.

OK, great, thanks for the background and the performance testing!

Bjorn

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-05-08  1:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-18 15:31 [PATCH v4] PCI: vmd: Disable MSI remap only for low MSI count Nirmal Patel
2024-04-25 21:10 ` Nirmal Patel
2024-04-26 22:39 ` Bjorn Helgaas
2024-05-07  0:39   ` Nirmal Patel
2024-05-08  1:46     ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).