* [PATCH net] selftests: net: cope with slow env in gro.sh test
@ 2024-02-06 15:27 Paolo Abeni
  2024-02-06 15:46 ` Willem de Bruijn
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Paolo Abeni @ 2024-02-06 15:27 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest

The gro self-tests send the packets to be aggregated with
multiple write operations.

When running in a slow environment, it's hard to guarantee that
the GRO engine will wait for the last packet in an intended
train.

The above causes almost deterministic failures in our CI for
the 'large' test-case.

Address the issue by explicitly ignoring failures for such a case
in slow environments (KSFT_MACHINE_SLOW==true).

Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note that the fixes tag is there mainly to justify targeting the net
tree, and this is aiming at net to hopefully make the test more stable
ASAP for both trees.

I experimented with a largish refactor replacing the multiple writes
with a single GSO packet, but exhausted my time budget before reaching
any good result.
---
 tools/testing/selftests/net/gro.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
index 19352f106c1d..114b5281a3f5 100755
--- a/tools/testing/selftests/net/gro.sh
+++ b/tools/testing/selftests/net/gro.sh
@@ -31,6 +31,10 @@ run_test() {
       1>>log.txt
     wait "${server_pid}"
     exit_code=$?
+    if [ ${test} == "large" -a -n "${KSFT_MACHINE_SLOW}" ]; then
+        echo "Ignoring errors due to slow environment" 1>&2
+        exit_code=0
+    fi
     if [[ "${exit_code}" -eq 0 ]]; then
         break;
     fi
-- 
2.43.0


* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-06 15:27 [PATCH net] selftests: net: cope with slow env in gro.sh test Paolo Abeni
@ 2024-02-06 15:46 ` Willem de Bruijn
  2024-02-07 11:16 ` Matthieu Baerts
  2024-02-08  2:42 ` Jakub Kicinski
  2 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2024-02-06 15:46 UTC (permalink / raw)
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest

Paolo Abeni wrote:
> The gro self-tests send the packets to be aggregated with
> multiple write operations.
> 
> When running in a slow environment, it's hard to guarantee that
> the GRO engine will wait for the last packet in an intended
> train.
> 
> The above causes almost deterministic failures in our CI for
> the 'large' test-case.
> 
> Address the issue by explicitly ignoring failures for such a case
> in slow environments (KSFT_MACHINE_SLOW==true).
> 
> Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-06 15:27 [PATCH net] selftests: net: cope with slow env in gro.sh test Paolo Abeni
  2024-02-06 15:46 ` Willem de Bruijn
@ 2024-02-07 11:16 ` Matthieu Baerts
  2024-02-07 14:35   ` Paolo Abeni
  2024-02-08  2:42 ` Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: Matthieu Baerts @ 2024-02-07 11:16 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest, netdev

Hi Paolo,

On 06/02/2024 16:27, Paolo Abeni wrote:
> The gro self-tests send the packets to be aggregated with
> multiple write operations.
> 
> When running in a slow environment, it's hard to guarantee that
> the GRO engine will wait for the last packet in an intended
> train.
> 
> The above causes almost deterministic failures in our CI for
> the 'large' test-case.
> 
> Address the issue by explicitly ignoring failures for such a case
> in slow environments (KSFT_MACHINE_SLOW==true).

To what value is KSFT_MACHINE_SLOW set in the CI?

Is it set to a different value if the machine is not slow? e.g.

  KSFT_MACHINE_SLOW == false

(please see below)

> Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note that the fixes tag is there mainly to justify targeting the net
> tree, and this is aiming at net to hopefully make the test more stable
> ASAP for both trees.
> 
> I experimented with a largish refactor replacing the multiple writes
> with a single GSO packet, but exhausted my time budget before reaching
> any good result.
> ---
>  tools/testing/selftests/net/gro.sh | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
> index 19352f106c1d..114b5281a3f5 100755
> --- a/tools/testing/selftests/net/gro.sh
> +++ b/tools/testing/selftests/net/gro.sh
> @@ -31,6 +31,10 @@ run_test() {
>        1>>log.txt
>      wait "${server_pid}"
>      exit_code=$?
> +    if [ ${test} == "large" -a -n "${KSFT_MACHINE_SLOW}" ]; then

Maybe best to avoid using:

  -n "${KSFT_MACHINE_SLOW}"

Otherwise, we have the same behaviour if KSFT_MACHINE_SLOW is set to
1/yes/true or 0/no/false.

But maybe it is fine like that, and what is just missing is adding
somewhere how KSFT_MACHINE_SLOW is supposed to be set/used? :)
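
To illustrate the concern above, here is a minimal sketch (not part of the
patch). The helper names `is_slow_nonempty` and `is_slow_strict`, and the
accepted values 1/yes/true, are assumptions made for illustration; the thread
does not define which values KSFT_MACHINE_SLOW may take.

```shell
# Hypothetical helper, not part of gro.sh: mirrors the patch's `-n` check,
# which only asks whether the variable is non-empty.
is_slow_nonempty() {
    [ -n "${KSFT_MACHINE_SLOW}" ]
}

# Hypothetical stricter variant: accepts only an explicit list of
# "truthy" values (the list itself is an assumption).
is_slow_strict() {
    case "${KSFT_MACHINE_SLOW}" in
        1|yes|true) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare both checks over a few candidate values.
for value in "" 0 no false 1 yes true; do
    KSFT_MACHINE_SLOW="${value}"
    if is_slow_nonempty; then nonempty=slow; else nonempty=fast; fi
    if is_slow_strict; then strict=slow; else strict=fast; fi
    printf 'value=%-7s nonempty=%s strict=%s\n' \
        "'${value}'" "${nonempty}" "${strict}"
done
```

With the `-n` check, only an empty or unset variable counts as "not slow";
values such as 0, no, or false still enable the workaround.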


Not linked to that, but a small detail about the new line, just in case
you need to send a v2: it looks like it is better to avoid using '-a':

  https://www.shellcheck.net/wiki/SC2166

(but here, it looks like the usage is fine)
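
For reference, a minimal sketch of the form SC2166 recommends, with two
separate `[ ]` tests joined by `&&` instead of one test using `-a`. The
function name `should_ignore_failure` is hypothetical and does not exist in
gro.sh; the sketch also quotes the test name and uses `=`, the POSIX spelling
of string equality inside `[ ]`.

```shell
# Hypothetical helper, not part of gro.sh: succeeds when a failure of the
# given test case should be ignored because the environment is marked slow.
should_ignore_failure() {
    test_name="$1"
    # Two independent tests joined by the shell's && operator,
    # rather than a single [ ... -a ... ] expression.
    [ "${test_name}" = "large" ] && [ -n "${KSFT_MACHINE_SLOW}" ]
}

# Example use, assuming the CI exported KSFT_MACHINE_SLOW=true.
KSFT_MACHINE_SLOW=true
if should_ignore_failure "large"; then
    echo "Ignoring errors due to slow environment" 1>&2
fi
```

Quoting the test name also keeps the comparison well-formed if the variable
happens to be empty.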

> +        echo "Ignoring errors due to slow environment" 1>&2
> +        exit_code=0
> +    fi
>      if [[ "${exit_code}" -eq 0 ]]; then
>          break;
>      fi

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-07 11:16 ` Matthieu Baerts
@ 2024-02-07 14:35   ` Paolo Abeni
  2024-02-07 14:45     ` Matthieu Baerts
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2024-02-07 14:35 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest, netdev

On Wed, 2024-02-07 at 12:16 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 06/02/2024 16:27, Paolo Abeni wrote:
> > The gro self-tests send the packets to be aggregated with
> > multiple write operations.
> > 
> > When running in a slow environment, it's hard to guarantee that
> > the GRO engine will wait for the last packet in an intended
> > train.
> > 
> > The above causes almost deterministic failures in our CI for
> > the 'large' test-case.
> > 
> > Address the issue by explicitly ignoring failures for such a case
> > in slow environments (KSFT_MACHINE_SLOW==true).
> 
> To what value is KSFT_MACHINE_SLOW set in the CI?

AFAIK, the CI initializes KSFT_MACHINE_SLOW (to true) only in slow environments.

> Is it set to a different value if the machine is not slow? e.g.
> 
>   KSFT_MACHINE_SLOW == false
> 
> (please see below)
> 
> > Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > Note that the fixes tag is there mainly to justify targeting the net
> > tree, and this is aiming at net to hopefully make the test more stable
> > ASAP for both trees.
> > 
> > I experimented with a largish refactor replacing the multiple writes
> > with a single GSO packet, but exhausted my time budget before reaching
> > any good result.
> > ---
> >  tools/testing/selftests/net/gro.sh | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
> > index 19352f106c1d..114b5281a3f5 100755
> > --- a/tools/testing/selftests/net/gro.sh
> > +++ b/tools/testing/selftests/net/gro.sh
> > @@ -31,6 +31,10 @@ run_test() {
> >        1>>log.txt
> >      wait "${server_pid}"
> >      exit_code=$?
> > +    if [ ${test} == "large" -a -n "${KSFT_MACHINE_SLOW}" ]; then
> 
> Maybe best to avoid using:
> 
>   -n "${KSFT_MACHINE_SLOW}"
> 
> Otherwise, we have the same behaviour if KSFT_MACHINE_SLOW is set to
> 1/yes/true or 0/no/false.

For consistency, I followed the logic already in place in commit
c41dfb0dfbec ("selftests/net: ignore timing errors in so_txtime if
KSFT_MACHINE_SLOW").

> But maybe it is fine like that, and what is just missing is adding
> somewhere how KSFT_MACHINE_SLOW is supposed to be set/used? :)
> 
> 
> Not linked to that, but a small detail about the new line, just in case
> you need to send a v2: it looks like it is better to avoid using '-a':
> 
>   https://www.shellcheck.net/wiki/SC2166

Thanks for the pointer, I was not aware of that.

I guess a v2 dropping '-a' would be better.

Thanks,

Paolo


* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-07 14:35   ` Paolo Abeni
@ 2024-02-07 14:45     ` Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2024-02-07 14:45 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest, netdev

Hi Paolo,

On 07/02/2024 15:35, Paolo Abeni wrote:
> On Wed, 2024-02-07 at 12:16 +0100, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 06/02/2024 16:27, Paolo Abeni wrote:
>>> The gro self-tests send the packets to be aggregated with
>>> multiple write operations.
>>>
>>> When running in a slow environment, it's hard to guarantee that
>>> the GRO engine will wait for the last packet in an intended
>>> train.
>>>
>>> The above causes almost deterministic failures in our CI for
>>> the 'large' test-case.
>>>
>>> Address the issue by explicitly ignoring failures for such a case
>>> in slow environments (KSFT_MACHINE_SLOW==true).
>>
>> To what value is KSFT_MACHINE_SLOW set in the CI?
> 
> AFAIK, the CI initializes KSFT_MACHINE_SLOW (to true) only in slow environments.

Should be good, then!

>> Is it set to a different value if the machine is not slow? e.g.
>>
>>   KSFT_MACHINE_SLOW == false
>>
>> (please see below)
>>
>>> Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>> Note that the fixes tag is there mainly to justify targeting the net
>>> tree, and this is aiming at net to hopefully make the test more stable
>>> ASAP for both trees.
>>>
>>> I experimented with a largish refactor replacing the multiple writes
>>> with a single GSO packet, but exhausted my time budget before reaching
>>> any good result.
>>> ---
>>>  tools/testing/selftests/net/gro.sh | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
>>> index 19352f106c1d..114b5281a3f5 100755
>>> --- a/tools/testing/selftests/net/gro.sh
>>> +++ b/tools/testing/selftests/net/gro.sh
>>> @@ -31,6 +31,10 @@ run_test() {
>>>        1>>log.txt
>>>      wait "${server_pid}"
>>>      exit_code=$?
>>> +    if [ ${test} == "large" -a -n "${KSFT_MACHINE_SLOW}" ]; then
>>
>> Maybe best to avoid using:
>>
>>   -n "${KSFT_MACHINE_SLOW}"
>>
>> Otherwise, we have the same behaviour if KSFT_MACHINE_SLOW is set to
>> 1/yes/true or 0/no/false.
> 
> For consistency, I followed the logic already in place in commit
> c41dfb0dfbec ("selftests/net: ignore timing errors in so_txtime if
> KSFT_MACHINE_SLOW").

I only checked code in -net, I forgot to look at net-next. Thanks for
the pointer! I thought it was "fragile", but if that's how we are
supposed to use this env var, that's OK then :)

>> But maybe it is fine like that, and what is just missing is adding
>> somewhere how KSFT_MACHINE_SLOW is supposed to be set/used? :)
>>
>>
>> Not linked to that, but a small detail about the new line, just in case
>> you need to send a v2: it looks like it is better to avoid using '-a':
>>
>>   https://www.shellcheck.net/wiki/SC2166
> 
> Thanks for the pointer, I was not aware of that.
> 
> I guess a v2 dropping '-a' would be better.

I'm not even sure a v2 is really needed. "-a" seems OK if you don't use
(or don't plan to use) "!" or "-" in the expression from what I read.

Another way to fix this is to use [[ ]]:

  [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" ]]

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-06 15:27 [PATCH net] selftests: net: cope with slow env in gro.sh test Paolo Abeni
  2024-02-06 15:46 ` Willem de Bruijn
  2024-02-07 11:16 ` Matthieu Baerts
@ 2024-02-08  2:42 ` Jakub Kicinski
  2024-02-08  2:51   ` Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2024-02-08  2:42 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Eric Dumazet, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest

On Tue,  6 Feb 2024 16:27:40 +0100 Paolo Abeni wrote:
> The gro self-tests send the packets to be aggregated with
> multiple write operations.
> 
> When running in a slow environment, it's hard to guarantee that
> the GRO engine will wait for the last packet in an intended
> train.
> 
> The above causes almost deterministic failures in our CI for
> the 'large' test-case.
> 
> Address the issue by explicitly ignoring failures for such a case
> in slow environments (KSFT_MACHINE_SLOW==true).
> 
> Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note that the fixes tag is there mainly to justify targeting the net
> tree, and this is aiming at net to hopefully make the test more stable
> ASAP for both trees.
> 
> I experimented with a largish refactor replacing the multiple writes
> with a single GSO packet, but exhausted my time budget before reaching
> any good result.

It does make things a lot more stable, but there was still a failure
recently:

https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/455661/36-gro-sh/stdout

:(

* Re: [PATCH net] selftests: net: cope with slow env in gro.sh test
  2024-02-08  2:42 ` Jakub Kicinski
@ 2024-02-08  2:51   ` Jakub Kicinski
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2024-02-08  2:51 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Eric Dumazet, Shuah Khan,
	Willem de Bruijn, Coco Li, linux-kselftest

On Wed, 7 Feb 2024 18:42:52 -0800 Jakub Kicinski wrote:
> On Tue,  6 Feb 2024 16:27:40 +0100 Paolo Abeni wrote:
> > The gro self-tests send the packets to be aggregated with
> > multiple write operations.
> > 
> > When running in a slow environment, it's hard to guarantee that
> > the GRO engine will wait for the last packet in an intended
> > train.
> > 
> > The above causes almost deterministic failures in our CI for
> > the 'large' test-case.
> > 
> > Address the issue by explicitly ignoring failures for such a case
> > in slow environments (KSFT_MACHINE_SLOW==true).
> > 
> > Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > Note that the fixes tag is there mainly to justify targeting the net
> > tree, and this is aiming at net to hopefully make the test more stable
> > ASAP for both trees.
> > 
> > I experimented with a largish refactor replacing the multiple writes
> > with a single GSO packet, but exhausted my time budget before reaching
> > any good result.
> 
> It does make things a lot more stable, but there was still a failure
> recently:
> 
> https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/455661/36-gro-sh/stdout
> 
> :(

Ah, sorry, I missed the v2. That must have been between v1 and v2.

Thread overview: 7+ messages
2024-02-06 15:27 [PATCH net] selftests: net: cope with slow env in gro.sh test Paolo Abeni
2024-02-06 15:46 ` Willem de Bruijn
2024-02-07 11:16 ` Matthieu Baerts
2024-02-07 14:35   ` Paolo Abeni
2024-02-07 14:45     ` Matthieu Baerts
2024-02-08  2:42 ` Jakub Kicinski
2024-02-08  2:51   ` Jakub Kicinski
