linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/1] limit the i40e msix vectors based on housekeeping CPUs
@ 2020-06-15 20:21 Nitesh Narayan Lal
  2020-06-15 20:21 ` [Patch v1] i40e: limit the " Nitesh Narayan Lal
  0 siblings, 1 reply; 7+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-15 20:21 UTC (permalink / raw)
  To: linux-kernel, frederic, mtosatti, sassmann, jeffrey.t.kirsher,
	jacob.e.keller, jlelli

Issue
=====
With the current implementation, at the time of i40e_init_msix() i40e
creates vectors based only on the number of online CPUs. This is
problematic for an RT setup that has a large number of isolated CPUs
but very few housekeeping CPUs, because in such a setup an attempt to
move all IRQs from the isolated to the housekeeping CPUs can easily
fail due to the per-CPU vector limit.
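
As a rough illustration of why the move can fail, the check below is a
simplified user-space model, not kernel code. It assumes every IRQ needs
exactly one vector on one housekeeping CPU and that each housekeeping CPU
has the same number of free vectors; the function name and all numbers
used with it are hypothetical.

```c
#include <stdbool.h>

/*
 * Simplified model: can nr_irqs interrupts all be placed on hk_cpus
 * housekeeping CPUs if each of them has free_vectors_per_cpu vectors
 * left? Real vector allocation is more involved; this only captures
 * the capacity argument.
 */
bool irqs_fit_on_housekeeping(unsigned int nr_irqs, unsigned int hk_cpus,
			      unsigned int free_vectors_per_cpu)
{
	unsigned long long capacity =
		(unsigned long long)hk_cpus * free_vectors_per_cpu;

	return nr_irqs <= capacity;
}
```

With made-up figures of 1000 IRQs and 200 free vectors per CPU, the IRQs
fit when spread over 72 CPUs but not when squeezed onto 4.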

Setup For The Issue
===================
I triggered this issue on a setup with a total of 72 cores, of which
68 were isolated and only 4 were left for housekeeping tasks. I was
using tuned's realtime-virtual-host profile to configure the system.
However, tuned reported the error message
'Failed to set SMP affinity of IRQ xxx to '00000040,00000010,00000005':
[Errno 28] No space left on the device' for several IRQs in tuned.log.
Note: Other IRQs, generated by other drivers, were also pinned to the
      housekeeping CPUs.
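
For reference, the mask in the tuned error can be decoded by hand. The
sketch below is a user-space illustration and not part of the patch. It
assumes the usual kernel bitmap text format of comma-separated 32-bit hex
words with the most significant word first; the helper name
affinity_mask_to_cpus() is made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Decode a comma-separated hex affinity mask (most significant 32-bit
 * word first) into CPU numbers. Stores up to max_out CPU ids in out[]
 * in ascending order and returns the number of CPUs set in the mask,
 * or -1 on malformed input.
 */
int affinity_mask_to_cpus(const char *mask, int *out, int max_out)
{
	uint32_t words[32];
	int nwords = 0;
	int count = 0;
	const char *p = mask;

	/* Split the string into 32-bit words, in the order written. */
	while (*p) {
		char *end;
		unsigned long v;

		if (nwords == 32)
			return -1;
		v = strtoul(p, &end, 16);
		if (end == p || v > 0xffffffffUL)
			return -1;
		words[nwords++] = (uint32_t)v;
		if (*end == ',')
			p = end + 1;
		else if (*end == '\0')
			p = end;
		else
			return -1;
	}

	/* The last word written covers CPUs 0-31, the one before 32-63, ... */
	for (int w = nwords - 1; w >= 0; w--) {
		int base = (nwords - 1 - w) * 32;

		for (int bit = 0; bit < 32; bit++) {
			if (words[w] & (UINT32_C(1) << bit)) {
				if (count < max_out)
					out[count] = base + bit;
				count++;
			}
		}
	}
	return count;
}
```

Under that assumption, '00000040,00000010,00000005' has four bits set,
CPUs 0, 2, 36 and 70, which is consistent with the 4 housekeeping CPUs
in this setup.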

Fix
===
- In this proposed fix I have replaced num_online_cpus() in i40e_init_msix()
  with the number of housekeeping CPUs.
- I chose to include both HK_FLAG_DOMAIN & HK_FLAG_WQ because we would
  also need IRQ isolation with something like systemd's CPU affinity.
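
The effect on the LAN vector budget can be modelled outside the kernel.
This is only a sketch: mask_weight() is a user-space stand-in for
cpumask_weight() over a plain bool array, lan_msix_budget() repeats the
min_t(int, cpus, vectors_left / 2) arithmetic from i40e_init_msix(), and
both names are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for cpumask_weight(): count the CPUs set in mask[]. */
int mask_weight(const bool *mask, size_t nr_cpus)
{
	int weight = 0;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++)
		if (mask[cpu])
			weight++;
	return weight;
}

/* Same arithmetic as min_t(int, cpus, vectors_left / 2) in i40e_init_msix(). */
int lan_msix_budget(int cpus, int vectors_left)
{
	int half = vectors_left / 2;

	return cpus < half ? cpus : half;
}
```

With a hypothetical vectors_left of 128, 72 online CPUs give a budget of
64 LAN vectors, while 4 housekeeping CPUs give a budget of 4.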


Testing
=======
To test this change I added a tracepoint in i40e_init_msix() to find
the number of CPUs derived for vector creation with and without tuned's
realtime-virtual-host profile. As expected, with the profile applied I
got only the number of housekeeping CPUs, and without it I got all
available CPUs.


Nitesh Narayan Lal (1):
  i40e: limit the msix vectors based on housekeeping CPUs

 drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

-- 



* [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-15 20:21 [PATCH v1 0/1] limit the i40e msix vectors based on housekeeping CPUs Nitesh Narayan Lal
@ 2020-06-15 20:21 ` Nitesh Narayan Lal
  2020-06-15 20:48   ` Keller, Jacob E
  2020-06-16  8:03   ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-15 20:21 UTC (permalink / raw)
  To: linux-kernel, frederic, mtosatti, sassmann, jeffrey.t.kirsher,
	jacob.e.keller, jlelli

In a realtime environment, it is essential to isolate
unwanted IRQs from isolated CPUs to prevent latency overheads.
Creating MSIX vectors only based on the online CPUs could lead
to a potential issue on an RT setup that has several isolated
CPUs but a very few housekeeping CPUs. This is because in these
kinds of setups an attempt to move the IRQs to the limited
housekeeping CPUs from isolated CPUs might fail due to the per
CPU vector limit. This could eventually result in latency spikes
because of the IRQ threads that we fail to move from isolated
CPUs. This patch stops i40e from sizing its vector count solely
on the online CPUs; it uses housekeeping_cpumask() to derive
the number of available housekeeping CPUs instead.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5d807c8004f8..9691bececb86 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5,6 +5,7 @@
 #include <linux/of_net.h>
 #include <linux/pci.h>
 #include <linux/bpf.h>
+#include <linux/sched/isolation.h>
 
 /* Local includes */
 #include "i40e.h"
@@ -10933,11 +10934,13 @@ static int i40e_reserve_msix_vectors(struct i40e_pf *pf, int vectors)
 static int i40e_init_msix(struct i40e_pf *pf)
 {
 	struct i40e_hw *hw = &pf->hw;
+	const struct cpumask *mask;
 	int cpus, extra_vectors;
 	int vectors_left;
 	int v_budget, i;
 	int v_actual;
 	int iwarp_requested = 0;
+	int hk_flags;
 
 	if (!(pf->flags & I40E_FLAG_MSIX_ENABLED))
 		return -ENODEV;
@@ -10968,12 +10971,15 @@ static int i40e_init_msix(struct i40e_pf *pf)
 
 	/* reserve some vectors for the main PF traffic queues. Initially we
 	 * only reserve at most 50% of the available vectors, in the case that
-	 * the number of online CPUs is large. This ensures that we can enable
-	 * extra features as well. Once we've enabled the other features, we
-	 * will use any remaining vectors to reach as close as we can to the
-	 * number of online CPUs.
+	 * the number of online (housekeeping) CPUs is large. This ensures that
+	 * we can enable extra features as well. Once we've enabled the other
+	 * features, we will use any remaining vectors to reach as close as we
+	 * can to the number of online (housekeeping) CPUs.
 	 */
-	cpus = num_online_cpus();
+	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+	mask = housekeeping_cpumask(hk_flags);
+	cpus = cpumask_weight(mask);
+
 	pf->num_lan_msix = min_t(int, cpus, vectors_left / 2);
 	vectors_left -= pf->num_lan_msix;
 
-- 
2.18.4



* RE: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-15 20:21 ` [Patch v1] i40e: limit the " Nitesh Narayan Lal
@ 2020-06-15 20:48   ` Keller, Jacob E
  2020-06-15 20:55     ` Nitesh Narayan Lal
  2020-06-16  8:03   ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Keller, Jacob E @ 2020-06-15 20:48 UTC (permalink / raw)
  To: Nitesh Narayan Lal, linux-kernel, frederic, mtosatti, sassmann,
	Kirsher, Jeffrey T, jlelli



> -----Original Message-----
> From: Nitesh Narayan Lal <nitesh@redhat.com>
> Sent: Monday, June 15, 2020 1:21 PM
> To: linux-kernel@vger.kernel.org; frederic@kernel.org; mtosatti@redhat.com;
> sassmann@redhat.com; Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Keller,
> Jacob E <jacob.e.keller@intel.com>; jlelli@redhat.com
> Subject: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
> 
> In a realtime environment, it is essential to isolate
> unwanted IRQs from isolated CPUs to prevent latency overheads.
> Creating MSIX vectors only based on the online CPUs could lead
> to a potential issue on an RT setup that has several isolated
> CPUs but a very few housekeeping CPUs. This is because in these
> kinds of setups an attempt to move the IRQs to the limited
> housekeeping CPUs from isolated CPUs might fail due to the per
> CPU vector limit. This could eventually result in latency spikes
> because of the IRQ threads that we fail to move from isolated
> CPUs. This patch stops i40e from sizing its vector count solely
> on the online CPUs; it uses housekeeping_cpumask() to derive
> the number of available housekeeping CPUs instead.
> 
> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
> ---

Ok, so the idea is that "housekeeping" CPUs are to be used for general purpose configuration, and thus are a subset of online CPUs. By reducing the limit to just housekeeping CPUs, we ensure that we do not overload the system with more queues than can be handled by the general purpose CPUs?

Thanks,
Jake


* Re: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-15 20:48   ` Keller, Jacob E
@ 2020-06-15 20:55     ` Nitesh Narayan Lal
  0 siblings, 0 replies; 7+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-15 20:55 UTC (permalink / raw)
  To: Keller, Jacob E, linux-kernel, frederic, mtosatti, sassmann,
	Kirsher, Jeffrey T, jlelli




On 6/15/20 4:48 PM, Keller, Jacob E wrote:
>
>> -----Original Message-----
>> From: Nitesh Narayan Lal <nitesh@redhat.com>
>> Sent: Monday, June 15, 2020 1:21 PM
>> To: linux-kernel@vger.kernel.org; frederic@kernel.org; mtosatti@redhat.com;
>> sassmann@redhat.com; Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Keller,
>> Jacob E <jacob.e.keller@intel.com>; jlelli@redhat.com
>> Subject: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
>>
>> In a realtime environment, it is essential to isolate
>> unwanted IRQs from isolated CPUs to prevent latency overheads.
>> Creating MSIX vectors only based on the online CPUs could lead
>> to a potential issue on an RT setup that has several isolated
>> CPUs but a very few housekeeping CPUs. This is because in these
>> kinds of setups an attempt to move the IRQs to the limited
>> housekeeping CPUs from isolated CPUs might fail due to the per
>> CPU vector limit. This could eventually result in latency spikes
>> because of the IRQ threads that we fail to move from isolated
>> CPUs. This patch stops i40e from sizing its vector count solely
>> on the online CPUs; it uses housekeeping_cpumask() to derive
>> the number of available housekeeping CPUs instead.
>>
>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
>> ---
> Ok, so the idea is that "housekeeping" CPUs are to be used for general purpose configuration, and thus are a subset of online CPUs. By reducing the limit to just housekeeping CPUs, we ensure that we do not overload the system with more queues than can be handled by the general purpose CPUs?

Yes.
The general purpose CPUs, i.e. the housekeeping (non-isolated) CPUs.

>
> Thanks,
> Jake
>
-- 
Nitesh



* Re: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-15 20:21 ` [Patch v1] i40e: limit the " Nitesh Narayan Lal
  2020-06-15 20:48   ` Keller, Jacob E
@ 2020-06-16  8:03   ` Christoph Hellwig
  2020-06-16 17:29     ` Nitesh Narayan Lal
  2020-06-26 20:11     ` Nitesh Narayan Lal
  1 sibling, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-06-16  8:03 UTC (permalink / raw)
  To: Nitesh Narayan Lal
  Cc: linux-kernel, frederic, mtosatti, sassmann, jeffrey.t.kirsher,
	jacob.e.keller, jlelli

On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
> +	mask = housekeeping_cpumask(hk_flags);
> +	cpus = cpumask_weight(mask);

Code like this has no business inside a driver.  Please provide a
proper core API for it instead.  Also please wire up
pci_alloc_irq_vectors* to use this API as well.


* Re: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-16  8:03   ` Christoph Hellwig
@ 2020-06-16 17:29     ` Nitesh Narayan Lal
  2020-06-26 20:11     ` Nitesh Narayan Lal
  1 sibling, 0 replies; 7+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-16 17:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, frederic, mtosatti, sassmann, jeffrey.t.kirsher,
	jacob.e.keller, jlelli




On 6/16/20 4:03 AM, Christoph Hellwig wrote:
> On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> +	mask = housekeeping_cpumask(hk_flags);
>> +	cpus = cpumask_weight(mask);
> Code like this has no business inside a driver.  Please provide a
> proper core API for it instead. 

Ok, I will think of a better way of doing this.

>  Also please wire up
> pci_alloc_irq_vectors* to use this API as well.

Understood, I will include this in a separate patch.

>
-- 
Thanks
Nitesh



* Re: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
  2020-06-16  8:03   ` Christoph Hellwig
  2020-06-16 17:29     ` Nitesh Narayan Lal
@ 2020-06-26 20:11     ` Nitesh Narayan Lal
  1 sibling, 0 replies; 7+ messages in thread
From: Nitesh Narayan Lal @ 2020-06-26 20:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, frederic, mtosatti, sassmann, jeffrey.t.kirsher,
	jacob.e.keller, jlelli




On 6/16/20 4:03 AM, Christoph Hellwig wrote:
> On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
>> +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> +	mask = housekeeping_cpumask(hk_flags);
>> +	cpus = cpumask_weight(mask);
> Code like this has no business inside a driver.  Please provide a
> proper core API for it instead.  Also please wire up
> pci_alloc_irq_vectors* to use this API as well.
>

Hi Christoph,

I have been looking into using the nr_housekeeping_* API, which I will be
defining, within pci_alloc_irq_vectors* to limit the number of vectors.
However, I am wondering about a few things:

- Some drivers, such as i40e, have until now used num_online_cpus() to
  restrict the number of vectors that they create. Would it make sense to
  restrict the maximum vectors requested based on
  nr_online/housekeeping_cpus (though I will have to make sure that
  min_vecs is always satisfied)?

  The other option would be to check the total number of vectors available
  across all online/housekeeping CPUs when limiting maxvecs; this would
  probably be more accurate?

- Another thing that I am wondering about is the right way to test this change.

Please let me know if you have any suggestions.
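
To make the first option above concrete, here is a minimal user-space
sketch of the clamping rule. It is not the eventual core API:
limit_max_vecs() and its parameters are hypothetical names, and the only
behaviour taken from the discussion is that the maximum is capped at the
housekeeping CPU count while min_vecs is still honoured.

```c
#include <errno.h>

/*
 * Cap max_vecs at the number of housekeeping CPUs, but never go below
 * min_vecs. Returns the adjusted maximum, or -EINVAL if the requested
 * range is invalid (min_vecs is 0 or larger than max_vecs).
 */
int limit_max_vecs(unsigned int min_vecs, unsigned int max_vecs,
		   unsigned int hk_cpus)
{
	if (min_vecs == 0 || min_vecs > max_vecs)
		return -EINVAL;

	if (hk_cpus < max_vecs)
		max_vecs = hk_cpus;
	if (max_vecs < min_vecs)
		max_vecs = min_vecs;

	return (int)max_vecs;
}
```

For example, a request for 1 to 72 vectors on a system with 4
housekeeping CPUs would be capped at 4, while a request for 8 to 72
vectors would stay at 8 so that min_vecs is satisfied.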

-- 
Nitesh

