All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
@ 2022-10-13 22:54 ` Joe Damato
  0 siblings, 0 replies; 8+ messages in thread
From: Joe Damato @ 2022-10-13 22:54 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg, Joe Damato

Sending this first as an RFC to ensure that this is the correct and desired
behavior when changing queue counts and flowhashes in i40e.

If this is approved, I can send an official "v1".

Before this change, reconfiguring the queue count using ethtool doesn't
always work, even for queue counts that were previously accepted because
the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
is cleared by the driver.

For example:

$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 34 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33     0     1     2     3     4     5
[...snip...]

As you can see, the flow indirection hash distributes flows to 34 queues.

Increasing the number of queues from 34 to 64 works, and the flow
indirection hash is reset automatically:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33    34    35    36    37    38    39
   40:     40    41    42    43    44    45    46    47
   48:     48    49    50    51    52    53    54    55
   56:     56    57    58    59    60    61    62    63

However, reducing the queue count back to 34 (which previously worked)
fails:

$ sudo ethtool -L eth0 combined 34
Cannot set device channel parameters: Invalid argument

This happens because the kernel thinks that the user configured the flow
hash (since the IFF_RXFH_CONFIGURED bit is not cleared by the driver when
the driver reset it) and thus returns -EINVAL, presumably to prevent the
driver from resizing the queues and resetting the user-defined flowhash.

With this patch applied, the queue count can be reduced to fewer queues
than the flow indirection hash is set to distribute flows to if the flow
indirection hash was not modified by the user.

For example, with the patch applied:

$ sudo ethtool -L eth0 combined 32
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 32 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
[..snip..]

I can now reduce the queue count to below 32 without error (unlike earlier):

$ sudo ethtool -L eth0 combined 24
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23

This works because I was using the default flow hash, so the driver discards
it and regenerates it.

However, if I manually set the flow hash to some user defined value:

$ sudo ethtool -X eth0 equal 20
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3
[..snip..]

I will now not be able to shrink the queue count again:

$ sudo ethtool -L eth0 combined 16
Cannot set device channel parameters: Invalid argument

But, I can increase the queue count and the flow hash is preserved:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3

Fixes: 28c5869f2bc4 ("i40e: add new fields to store user configuration")
Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index feabd26..0e8dca7 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11522,6 +11522,8 @@ static void i40e_clear_rss_config_user(struct i40e_vsi *vsi)
 
 	kfree(vsi->rss_lut_user);
 	vsi->rss_lut_user = NULL;
+
+	vsi->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
 }
 
 /**
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Intel-wired-lan] [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
@ 2022-10-13 22:54 ` Joe Damato
  0 siblings, 0 replies; 8+ messages in thread
From: Joe Damato @ 2022-10-13 22:54 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Sending this first as an RFC to ensure that this is the correct and desired
behavior when changing queue counts and flowhashes in i40e.

If this is approved, I can send an official "v1".

Before this change, reconfiguring the queue count using ethtool doesn't
always work, even for queue counts that were previously accepted because
the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
is cleared by the driver.

For example:

$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 34 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33     0     1     2     3     4     5
[...snip...]

As you can see, the flow indirection hash distributes flows to 34 queues.

Increasing the number of queues from 34 to 64 works, and the flow
indirection hash is reset automatically:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33    34    35    36    37    38    39
   40:     40    41    42    43    44    45    46    47
   48:     48    49    50    51    52    53    54    55
   56:     56    57    58    59    60    61    62    63

However, reducing the queue count back to 34 (which previously worked)
fails:

$ sudo ethtool -L eth0 combined 34
Cannot set device channel parameters: Invalid argument

This happens because the kernel thinks that the user configured the flow
hash (since the IFF_RXFH_CONFIGURED bit is not cleared by the driver when
the driver reset it) and thus returns -EINVAL, presumably to prevent the
driver from resizing the queues and resetting the user-defined flowhash.

With this patch applied, the queue count can be reduced to fewer queues
than the flow indirection hash is set to distribute flows to if the flow
indirection hash was not modified by the user.

For example, with the patch applied:

$ sudo ethtool -L eth0 combined 32
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 32 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
[..snip..]

I can now reduce the queue count to below 32 without error (unlike earlier):

$ sudo ethtool -L eth0 combined 24
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23

This works because I was using the default flow hash, so the driver discards
it and regenerates it.

However, if I manually set the flow hash to some user defined value:

$ sudo ethtool -X eth0 equal 20
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3
[..snip..]

I will now not be able to shrink the queue count again:

$ sudo ethtool -L eth0 combined 16
Cannot set device channel parameters: Invalid argument

But, I can increase the queue count and the flow hash is preserved:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3

Fixes: 28c5869f2bc4 ("i40e: add new fields to store user configuration")
Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index feabd26..0e8dca7 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11522,6 +11522,8 @@ static void i40e_clear_rss_config_user(struct i40e_vsi *vsi)
 
 	kfree(vsi->rss_lut_user);
 	vsi->rss_lut_user = NULL;
+
+	vsi->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
 }
 
 /**
-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
  2022-10-13 22:54 ` [Intel-wired-lan] " Joe Damato
@ 2022-10-17 19:45   ` Jakub Kicinski
  -1 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-10-17 19:45 UTC (permalink / raw)
  To: Joe Damato
  Cc: intel-wired-lan, netdev, davem, anthony.l.nguyen, jesse.brandeburg

On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> Before this change, reconfiguring the queue count using ethtool doesn't
> always work, even for queue counts that were previously accepted because
> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> is cleared by the driver.

It's not cleared but when was it set? Could you describe the flow that
gets us to this set a bit more?

Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
acceptable on error recovery paths, and should come with a "this should
never happen" warning.

> For example:
> 
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 34 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33     0     1     2     3     4     5
> [...snip...]
> 
> As you can see, the flow indirection hash distributes flows to 34 queues.
> 
> Increasing the number of queues from 34 to 64 works, and the flow
> indirection hash is reset automatically:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33    34    35    36    37    38    39
>    40:     40    41    42    43    44    45    46    47
>    48:     48    49    50    51    52    53    54    55
>    56:     56    57    58    59    60    61    62    63

This is odd, if IFF_RXFH_CONFIGURED is set driver should not
re-initialize the indirection table. Which I believe is what
you describe at the end of your message:

> But, I can increase the queue count and the flow hash is preserved:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19     0     1     2     3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-wired-lan] [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
@ 2022-10-17 19:45   ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-10-17 19:45 UTC (permalink / raw)
  To: Joe Damato; +Cc: netdev, intel-wired-lan, davem

On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> Before this change, reconfiguring the queue count using ethtool doesn't
> always work, even for queue counts that were previously accepted because
> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> is cleared by the driver.

It's not cleared but when was it set? Could you describe the flow that
gets us to this set a bit more?

Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
acceptable on error recovery paths, and should come with a "this should
never happen" warning.

> For example:
> 
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 34 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33     0     1     2     3     4     5
> [...snip...]
> 
> As you can see, the flow indirection hash distributes flows to 34 queues.
> 
> Increasing the number of queues from 34 to 64 works, and the flow
> indirection hash is reset automatically:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33    34    35    36    37    38    39
>    40:     40    41    42    43    44    45    46    47
>    48:     48    49    50    51    52    53    54    55
>    56:     56    57    58    59    60    61    62    63

This is odd, if IFF_RXFH_CONFIGURED is set driver should not
re-initialize the indirection table. Which I believe is what
you describe at the end of your message:

> But, I can increase the queue count and the flow hash is preserved:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19     0     1     2     3
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-wired-lan] [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
  2022-10-17 19:45   ` [Intel-wired-lan] " Jakub Kicinski
@ 2022-10-17 20:25     ` Jacob Keller
  -1 siblings, 0 replies; 8+ messages in thread
From: Jacob Keller @ 2022-10-17 20:25 UTC (permalink / raw)
  To: Jakub Kicinski, Joe Damato; +Cc: netdev, intel-wired-lan, davem



On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
>> Before this change, reconfiguring the queue count using ethtool doesn't
>> always work, even for queue counts that were previously accepted because
>> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
>> is cleared by the driver.
> 
> It's not cleared but when was it set? Could you describe the flow that
> gets us to this set a bit more?
> 
> Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> acceptable on error recovery paths, and should come with a "this should
> never happen" warning.
> 

Correct. The whole point of IFF_RXFH_CONFIGURED is to be able for the
driver to know whether or not the current config was the default or a
user specified value. If this flag is set, we should not be changing the
config except in exceptional circumstances.

>> For example:
>>
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 34 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33     0     1     2     3     4     5
>> [...snip...]
>>
>> As you can see, the flow indirection hash distributes flows to 34 queues.
>>
>> Increasing the number of queues from 34 to 64 works, and the flow
>> indirection hash is reset automatically:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33    34    35    36    37    38    39
>>    40:     40    41    42    43    44    45    46    47
>>    48:     48    49    50    51    52    53    54    55
>>    56:     56    57    58    59    60    61    62    63
> 
> This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> re-initialize the indirection table. Which I believe is what
> you describe at the end of your message:
> 

Right. It seems like the driver should actually be checking this flag
somewhere else and preventing the flow where we clear the indirection
table...

We are at least in some places according to your report here, but
perhaps there is a gap....

>> But, I can increase the queue count and the flow hash is preserved:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19     0     1     2     3
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
@ 2022-10-17 20:25     ` Jacob Keller
  0 siblings, 0 replies; 8+ messages in thread
From: Jacob Keller @ 2022-10-17 20:25 UTC (permalink / raw)
  To: Jakub Kicinski, Joe Damato
  Cc: intel-wired-lan, netdev, davem, anthony.l.nguyen, jesse.brandeburg



On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
>> Before this change, reconfiguring the queue count using ethtool doesn't
>> always work, even for queue counts that were previously accepted because
>> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
>> is cleared by the driver.
> 
> It's not cleared but when was it set? Could you describe the flow that
> gets us to this set a bit more?
> 
> Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> acceptable on error recovery paths, and should come with a "this should
> never happen" warning.
> 

Correct. The whole point of IFF_RXFH_CONFIGURED is to be able for the
driver to know whether or not the current config was the default or a
user specified value. If this flag is set, we should not be changing the
config except in exceptional circumstances.

>> For example:
>>
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 34 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33     0     1     2     3     4     5
>> [...snip...]
>>
>> As you can see, the flow indirection hash distributes flows to 34 queues.
>>
>> Increasing the number of queues from 34 to 64 works, and the flow
>> indirection hash is reset automatically:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33    34    35    36    37    38    39
>>    40:     40    41    42    43    44    45    46    47
>>    48:     48    49    50    51    52    53    54    55
>>    56:     56    57    58    59    60    61    62    63
> 
> This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> re-initialize the indirection table. Which I believe is what
> you describe at the end of your message:
> 

Right. It seems like the driver should actually be checking this flag
somewhere else and preventing the flow where we clear the indirection
table...

We are at least in some places according to your report here, but
perhaps there is a gap....

>> But, I can increase the queue count and the flow hash is preserved:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19     0     1     2     3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-wired-lan] [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
  2022-10-17 20:25     ` Jacob Keller
@ 2022-10-17 20:36       ` Joe Damato
  -1 siblings, 0 replies; 8+ messages in thread
From: Joe Damato @ 2022-10-17 20:36 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev, Jakub Kicinski, intel-wired-lan, davem

On Mon, Oct 17, 2022 at 01:25:39PM -0700, Jacob Keller wrote:
> 
> 
> On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> > On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> >> Before this change, reconfiguring the queue count using ethtool doesn't
> >> always work, even for queue counts that were previously accepted because
> >> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> >> is cleared by the driver.
> > 
> > It's not cleared but when was it set? Could you describe the flow that
> > gets us to this set a bit more?
> > 
> > Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> > acceptable on error recovery paths, and should come with a "this should
> > never happen" warning.
> > 
> 
> Correct. The whole point of IFF_RXFH_CONFIGURED is to be able for the
> driver to know whether or not the current config was the default or a
> user specified value. If this flag is set, we should not be changing the
> config except in exceptional circumstances.
> 
> >> For example:
> >>
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 34 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33     0     1     2     3     4     5
> >> [...snip...]
> >>
> >> As you can see, the flow indirection hash distributes flows to 34 queues.
> >>
> >> Increasing the number of queues from 34 to 64 works, and the flow
> >> indirection hash is reset automatically:
> >>
> >> $ sudo ethtool -L eth0 combined 64
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 64 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33    34    35    36    37    38    39
> >>    40:     40    41    42    43    44    45    46    47
> >>    48:     48    49    50    51    52    53    54    55
> >>    56:     56    57    58    59    60    61    62    63
> > 
> > This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> > re-initialize the indirection table. Which I believe is what
> > you describe at the end of your message:
> > 
> 
> Right. It seems like the driver should actually be checking this flag
> somewhere else and preventing the flow where we clear the indirection
> table...
> 
> We are at least in some places according to your report here, but
> perhaps there is a gap....

Thanks for the comments / information. I noticed that one other driver
(mlx5) tweaks this bit, which is what led me down this rabbit hole.

I'll have to re-read the i40e code and re-run some experiments with the
queue count and flow hash to get a better understanding of the current
behavior and verify/double check the results.

I'll follow-up with an email to intel-wired-lan about the current
(unpatched) behavior I'm seeing with i40e to double check if there's
a bug or if I've simply made a mistake somewhere in my testing.

I did run the experiments a few times, so it is possible I got into some
weird state. It is worth revisiting fresh from a reboot with a kernel built
from net-next.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset
@ 2022-10-17 20:36       ` Joe Damato
  0 siblings, 0 replies; 8+ messages in thread
From: Joe Damato @ 2022-10-17 20:36 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jakub Kicinski, intel-wired-lan, netdev, davem, anthony.l.nguyen,
	jesse.brandeburg

On Mon, Oct 17, 2022 at 01:25:39PM -0700, Jacob Keller wrote:
> 
> 
> On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> > On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> >> Before this change, reconfiguring the queue count using ethtool doesn't
> >> always work, even for queue counts that were previously accepted because
> >> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> >> is cleared by the driver.
> > 
> > It's not cleared but when was it set? Could you describe the flow that
> > gets us to this set a bit more?
> > 
> > Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> > acceptable on error recovery paths, and should come with a "this should
> > never happen" warning.
> > 
> 
> Correct. The whole point of IFF_RXFH_CONFIGURED is to be able for the
> driver to know whether or not the current config was the default or a
> user specified value. If this flag is set, we should not be changing the
> config except in exceptional circumstances.
> 
> >> For example:
> >>
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 34 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33     0     1     2     3     4     5
> >> [...snip...]
> >>
> >> As you can see, the flow indirection hash distributes flows to 34 queues.
> >>
> >> Increasing the number of queues from 34 to 64 works, and the flow
> >> indirection hash is reset automatically:
> >>
> >> $ sudo ethtool -L eth0 combined 64
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 64 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33    34    35    36    37    38    39
> >>    40:     40    41    42    43    44    45    46    47
> >>    48:     48    49    50    51    52    53    54    55
> >>    56:     56    57    58    59    60    61    62    63
> > 
> > This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> > re-initialize the indirection table. Which I believe is what
> > you describe at the end of your message:
> > 
> 
> Right. It seems like the driver should actually be checking this flag
> somewhere else and preventing the flow where we clear the indirection
> table...
> 
> We are at least in some places according to your report here, but
> perhaps there is a gap....

Thanks for the comments / information. I noticed that one other driver
(mlx5) tweaks this bit, which is what led me down this rabbit hole.

I'll have to re-read the i40e code and re-run some experiments with the
queue count and flow hash to get a better understanding of the current
behavior and verify/double check the results.

I'll follow-up with an email to intel-wired-lan about the current
(unpatched) behavior I'm seeing with i40e to double check if there's
a bug or if I've simply made a mistake somewhere in my testing.

I did run the experiments a few times, so it is possible I got into some
weird state. It is worth revisiting fresh from a reboot with a kernel built
from net-next.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-10-17 20:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-13 22:54 [net-queue bugfix RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset Joe Damato
2022-10-13 22:54 ` [Intel-wired-lan] " Joe Damato
2022-10-17 19:45 ` Jakub Kicinski
2022-10-17 19:45   ` [Intel-wired-lan] " Jakub Kicinski
2022-10-17 20:25   ` Jacob Keller
2022-10-17 20:25     ` Jacob Keller
2022-10-17 20:36     ` [Intel-wired-lan] " Joe Damato
2022-10-17 20:36       ` Joe Damato

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.