* [PATCH net] i40e: use valid online CPU on q_vector initialization
@ 2016-06-27 15:16 ` Guilherme G. Piccoli
From: Guilherme G. Piccoli @ 2016-06-27 15:16 UTC (permalink / raw)
  To: jeffrey.t.kirsher, intel-wired-lan; +Cc: netdev, gpiccoli

Currently, the q_vector initialization routine sets the affinity_mask
of a q_vector based on the v_idx value: a loop iterates over v_idx,
which is an incremental value, and the cpumask is created based on
this value.

This is a problem in systems with multiple logical CPUs per core (as in
SMT scenarios). If we disable some logical CPUs, by turning SMT off for
example, we end up with a sparse cpu_online_mask, i.e., only the first
CPU in each core is online, and filling the q_vector cpumask with
incremental indexes might assign offline CPUs to q_vectors.

Example: if we have a system with 8 cores, each containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.

In the general case, when SMT is off the cpu_online_mask has only C bits
set: 0, 1*N, 2*N, ..., (C-1)*N, where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.

This patch changes the way a q_vector's affinity_mask is created: the loop
still iterates over v_idx, but it consumes CPU indexes from the
cpu_online_mask instead of using the incremental v_idx value directly.
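
For illustration, here is a minimal userspace sketch of this round-robin
pick. It assumes the 8-core/SMT-8 example above and an arbitrary count of
16 q_vectors; next_online() is a stand-in for the kernel's cpumask_next(),
not the real API:

#include <stdio.h>

#define NR_CPUS 64

static int online[NR_CPUS];

/* behaves like cpumask_next(): first online CPU after @cpu */
static int next_online(int cpu)
{
	for (cpu++; cpu < NR_CPUS; cpu++)
		if (online[cpu])
			return cpu;
	return NR_CPUS;	/* past the end, like nr_cpu_ids */
}

int main(void)
{
	int v_idx, cpu;

	/* SMT off: only CPUs 0, 8, ..., 56 are online */
	for (cpu = 0; cpu < NR_CPUS; cpu += 8)
		online[cpu] = 1;

	cpu = next_online(-1);	/* like cpumask_first() */
	for (v_idx = 0; v_idx < 16; v_idx++) {
		printf("q_vector %2d -> CPU %2d\n", v_idx, cpu);
		cpu = next_online(cpu);
		if (cpu >= NR_CPUS)	/* wrap around, as in the patch */
			cpu = next_online(-1);
	}
	return 0;
}

Every q_vector lands on an online CPU, wrapping back to the first online
CPU once the mask is exhausted.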

No functional changes were introduced.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5ea2200..a89bddd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7726,10 +7726,11 @@ static int i40e_init_msix(struct i40e_pf *pf)
  * i40e_vsi_alloc_q_vector - Allocate memory for a single interrupt vector
  * @vsi: the VSI being configured
  * @v_idx: index of the vector in the vsi struct
+ * @cpu: cpu to be used on affinity_mask
  *
  * We allocate one q_vector.  If allocation fails we return -ENOMEM.
  **/
-static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
+static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx, int cpu)
 {
 	struct i40e_q_vector *q_vector;
 
@@ -7740,7 +7741,8 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
 
 	q_vector->vsi = vsi;
 	q_vector->v_idx = v_idx;
-	cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
+	cpumask_set_cpu(cpu, &q_vector->affinity_mask);
+
 	if (vsi->netdev)
 		netif_napi_add(vsi->netdev, &q_vector->napi,
 			       i40e_napi_poll, NAPI_POLL_WEIGHT);
@@ -7764,8 +7766,7 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
 static int i40e_vsi_alloc_q_vectors(struct i40e_vsi *vsi)
 {
 	struct i40e_pf *pf = vsi->back;
-	int v_idx, num_q_vectors;
-	int err;
+	int err, v_idx, num_q_vectors, current_cpu;
 
 	/* if not MSIX, give the one vector only to the LAN VSI */
 	if (pf->flags & I40E_FLAG_MSIX_ENABLED)
@@ -7775,10 +7776,15 @@ static int i40e_vsi_alloc_q_vectors(struct i40e_vsi *vsi)
 	else
 		return -EINVAL;
 
+	current_cpu = cpumask_first(cpu_online_mask);
+
 	for (v_idx = 0; v_idx < num_q_vectors; v_idx++) {
-		err = i40e_vsi_alloc_q_vector(vsi, v_idx);
+		err = i40e_vsi_alloc_q_vector(vsi, v_idx, current_cpu);
 		if (err)
 			goto err_out;
+		current_cpu = cpumask_next(current_cpu, cpu_online_mask);
+		if (unlikely(current_cpu >= nr_cpu_ids))
+			current_cpu = cpumask_first(cpu_online_mask);
 	}
 
 	return 0;
-- 
2.1.0

* Re: [PATCH net] i40e: use valid online CPU on q_vector initialization
  2016-06-27 15:16 ` [Intel-wired-lan] " Guilherme G. Piccoli
@ 2016-07-11 17:22   ` Guilherme G. Piccoli
From: Guilherme G. Piccoli @ 2016-07-11 17:22 UTC (permalink / raw)
  To: jeffrey.t.kirsher, intel-wired-lan
  Cc: netdev, jesse.brandeburg, shannon.nelson, anjali.singhai,
	mitch.a.williams

On 06/27/2016 12:16 PM, Guilherme G. Piccoli wrote:
> Currently, the q_vector initialization routine sets the affinity_mask
> of a q_vector based on v_idx value. Meaning a loop iterates on v_idx,
> which is an incremental value, and the cpumask is created based on
> this value.
>
> This is a problem in systems with multiple logical CPUs per core (like in
> SMT scenarios). If we disable some logical CPUs, by turning SMT off for
> example, we will end up with a sparse cpu_online_mask, i.e., only the first
> CPU in a core is online, and incremental filling in q_vector cpumask might
> lead to multiple offline CPUs being assigned to q_vectors.
>
> Example: if we have a system with 8 cores each one containing 8 logical
> CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
> disabled, only the 1st CPU in each core remains online, so the
> cpu_online_mask in this case would have only 8 bits set, in a sparse way.
>
> In the general case, when SMT is off the cpu_online_mask has only C bits set:
> 0, 1*N, 2*N, ..., (C-1)*N  where
> C == # of cores;
> N == # of logical CPUs per core.
> In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
>
> This patch changes the way q_vector's affinity_mask is created: it iterates
> on v_idx, but consumes the CPU index from the cpu_online_mask instead of
> just using the v_idx incremental value.
>
> No functional changes were introduced.
>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 5ea2200..a89bddd 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -7726,10 +7726,11 @@ static int i40e_init_msix(struct i40e_pf *pf)
>    * i40e_vsi_alloc_q_vector - Allocate memory for a single interrupt vector
>    * @vsi: the VSI being configured
>    * @v_idx: index of the vector in the vsi struct
> + * @cpu: cpu to be used on affinity_mask
>    *
>    * We allocate one q_vector.  If allocation fails we return -ENOMEM.
>    **/
> -static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
> +static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx, int cpu)
>   {
>   	struct i40e_q_vector *q_vector;
>
> @@ -7740,7 +7741,8 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
>
>   	q_vector->vsi = vsi;
>   	q_vector->v_idx = v_idx;
> -	cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
> +	cpumask_set_cpu(cpu, &q_vector->affinity_mask);
> +
>   	if (vsi->netdev)
>   		netif_napi_add(vsi->netdev, &q_vector->napi,
>   			       i40e_napi_poll, NAPI_POLL_WEIGHT);
> @@ -7764,8 +7766,7 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
>   static int i40e_vsi_alloc_q_vectors(struct i40e_vsi *vsi)
>   {
>   	struct i40e_pf *pf = vsi->back;
> -	int v_idx, num_q_vectors;
> -	int err;
> +	int err, v_idx, num_q_vectors, current_cpu;
>
>   	/* if not MSIX, give the one vector only to the LAN VSI */
>   	if (pf->flags & I40E_FLAG_MSIX_ENABLED)
> @@ -7775,10 +7776,15 @@ static int i40e_vsi_alloc_q_vectors(struct i40e_vsi *vsi)
>   	else
>   		return -EINVAL;
>
> +	current_cpu = cpumask_first(cpu_online_mask);
> +
>   	for (v_idx = 0; v_idx < num_q_vectors; v_idx++) {
> -		err = i40e_vsi_alloc_q_vector(vsi, v_idx);
> +		err = i40e_vsi_alloc_q_vector(vsi, v_idx, current_cpu);
>   		if (err)
>   			goto err_out;
> +		current_cpu = cpumask_next(current_cpu, cpu_online_mask);
> +		if (unlikely(current_cpu >= nr_cpu_ids))
> +			current_cpu = cpumask_first(cpu_online_mask);
>   	}
>
>   	return 0;
>


Ping?

Sorry to bother you; if you think I need to improve something here, let
me know :)

I'm adding other Intel people to this thread, based on patches to i40e.

Thanks in advance,



Guilherme

* [Intel-wired-lan] [PATCH net] i40e: use valid online CPU on q_vector initialization
  2016-06-27 15:16 ` [Intel-wired-lan] " Guilherme G. Piccoli
@ 2016-07-13 15:01 ` Bowers, AndrewX
From: Bowers, AndrewX @ 2016-07-13 15:01 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces at lists.osuosl.org] On
> Behalf Of Guilherme G. Piccoli
> Sent: Monday, June 27, 2016 8:17 AM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; intel-wired-
> lan at lists.osuosl.org
> Cc: netdev at vger.kernel.org; gpiccoli at linux.vnet.ibm.com
> Subject: [Intel-wired-lan] [PATCH net] i40e: use valid online CPU on q_vector
> initialization
> 
> Currently, the q_vector initialization routine sets the affinity_mask of a
> q_vector based on v_idx value. Meaning a loop iterates on v_idx, which is an
> incremental value, and the cpumask is created based on this value.
> 
> This is a problem in systems with multiple logical CPUs per core (like in SMT
> scenarios). If we disable some logical CPUs, by turning SMT off for example,
> we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a
> core is online, and incremental filling in q_vector cpumask might lead to
> multiple offline CPUs being assigned to q_vectors.
> 
> Example: if we have a system with 8 cores each one containing 8 logical CPUs
> (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only
> the 1st CPU in each core remains online, so the cpu_online_mask in this case
> would have only 8 bits set, in a sparse way.
> 
> In the general case, when SMT is off the cpu_online_mask has only C bits set:
> 0, 1*N, 2*N, ..., (C-1)*N  where
> C == # of cores;
> N == # of logical CPUs per core.
> In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
> 
> This patch changes the way q_vector's affinity_mask is created: it iterates on
> v_idx, but consumes the CPU index from the cpu_online_mask instead of
> just using the v_idx incremental value.
> 
> No functional changes were introduced.
> 
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
