* [PATCH bpf-next] selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
From: Daniel Müller @ 2022-07-27 18:29 UTC
To: bpf, ast, andrii, daniel, kernel-team
The send_signal/send_signal_tracepoint test is pretty flaky, with at least
one failure in every ten runs over the few attempts I have made:
> test_send_signal_common:PASS:pipe_c2p 0 nsec
> test_send_signal_common:PASS:pipe_p2c 0 nsec
> test_send_signal_common:PASS:fork 0 nsec
> test_send_signal_common:PASS:skel_open_and_load 0 nsec
> test_send_signal_common:PASS:skel_attach 0 nsec
> test_send_signal_common:PASS:pipe_read 0 nsec
> test_send_signal_common:PASS:pipe_write 0 nsec
> test_send_signal_common:PASS:reading pipe 0 nsec
> test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
> test_send_signal_common:FAIL:incorrect result unexpected incorrect result: actual 48 != expected 50
> test_send_signal_common:PASS:pipe_write 0 nsec
> #139/1 send_signal/send_signal_tracepoint:FAIL
The reason does not appear to be a correctness issue in the strict
sense. Rather, we merely do not receive the signal we are waiting for
within the provided timeout.
Let's bump the timeout by a factor of ten. With that change I have not
been able to reproduce the failure in 150+ iterations. I am also sneaking
in a small simplification to the test_progs test selection logic.
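As a rough illustration of the wait described above, here is a minimal sketch of a bounded busy-wait on a signal flag. It is not the selftest itself: the helper name wait_for_sigusr1() and the max_iters parameter are hypothetical, and only the flag name, the loop body, and the '2'/'0' result bytes are taken from the patch context. It also shows why the log reads "actual 48 != expected 50": 48 is ASCII '0' (signal not seen before the loop gave up) and 50 is ASCII '2' (signal seen).

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler and polled by the loop below. volatile
 * sig_atomic_t makes every loop iteration re-read the flag. */
static volatile sig_atomic_t sigusr1_received;

static void sigusr1_handler(int signo)
{
	(void)signo;
	sigusr1_received = 1;
}

/* Hypothetical helper: spin for at most max_iters iterations and stop
 * early once the flag is set. Returns '2' (ASCII 50) if the signal was
 * observed and '0' (ASCII 48) if the iteration budget ran out first. */
static char wait_for_sigusr1(long max_iters)
{
	volatile int j = 0;

	for (long i = 0; i < max_iters && !sigusr1_received; i++)
		j /= i + j + 1;	/* busy work; the divisor is always >= 1 */

	return sigusr1_received ? '2' : '0';
}
```

The iteration count acts as a timeout only indirectly: how long the loop runs depends on the CPU and its load, which is presumably why a fixed bound can be too short on some machines.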
Signed-off-by: Daniel Müller <deso@posteo.net>
---
tools/testing/selftests/bpf/prog_tests/send_signal.c | 2 +-
tools/testing/selftests/bpf/test_progs.c | 7 ++-----
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index d71226e..d63a20 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -64,7 +64,7 @@ static void test_send_signal_common(struct perf_event_attr *attr,
 		ASSERT_EQ(read(pipe_p2c[0], buf, 1), 1, "pipe_read");
 
 		/* wait a little for signal handler */
-		for (int i = 0; i < 100000000 && !sigusr1_received; i++)
+		for (int i = 0; i < 1000000000 && !sigusr1_received; i++)
 			j /= i + j + 1;
 
 		buf[0] = sigusr1_received ? '2' : '0';
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index c639f2e..3561c9 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -1604,11 +1604,8 @@ int main(int argc, char **argv)
 		struct prog_test_def *test = &prog_test_defs[i];
 
 		test->test_num = i + 1;
-		if (should_run(&env.test_selector,
-			       test->test_num, test->test_name))
-			test->should_run = true;
-		else
-			test->should_run = false;
+		test->should_run = should_run(&env.test_selector,
+					      test->test_num, test->test_name);
 
 		if ((test->run_test == NULL && test->run_serial_test == NULL) ||
 		    (test->run_test != NULL && test->run_serial_test != NULL)) {
--
2.30.2
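For readers skimming the second hunk, the following sketch shows why the two forms of the selection assignment behave the same when the predicate returns a bool. The predicate is_selected() and its range arguments are hypothetical stand-ins, not the real should_run() or its test_selector argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical selection predicate; NOT the real should_run(). */
static bool is_selected(int test_num, int first, int last)
{
	return test_num >= first && test_num <= last;
}

/* Branching form, shaped like the lines the patch removes. */
static bool select_with_branch(int test_num, int first, int last)
{
	bool flag;

	if (is_selected(test_num, first, last))
		flag = true;
	else
		flag = false;
	return flag;
}

/* Direct assignment, shaped like the lines the patch adds. */
static bool select_direct(int test_num, int first, int last)
{
	return is_selected(test_num, first, last);
}
```

The equivalence relies on the result being stored in a bool. With a plain int field and a predicate returning arbitrary non-zero values, the two forms could store different numbers.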
* Re: [PATCH bpf-next] selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
From: Jiri Olsa @ 2022-07-28 13:58 UTC
To: Daniel Müller; +Cc: bpf, ast, andrii, daniel, kernel-team
On Wed, Jul 27, 2022 at 06:29:55PM +0000, Daniel Müller wrote:
> The send_signal/send_signal_tracepoint test is pretty flaky, with at least
> one failure in every ten runs over the few attempts I have made:
> > test_send_signal_common:PASS:pipe_c2p 0 nsec
> > test_send_signal_common:PASS:pipe_p2c 0 nsec
> > test_send_signal_common:PASS:fork 0 nsec
> > test_send_signal_common:PASS:skel_open_and_load 0 nsec
> > test_send_signal_common:PASS:skel_attach 0 nsec
> > test_send_signal_common:PASS:pipe_read 0 nsec
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > test_send_signal_common:PASS:reading pipe 0 nsec
> > test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
> > test_send_signal_common:FAIL:incorrect result unexpected incorrect result: actual 48 != expected 50
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > #139/1 send_signal/send_signal_tracepoint:FAIL
>
> The reason does not appear to be a correctness issue in the strict
> sense. Rather, we merely do not receive the signal we are waiting for
> within the provided timeout.
> Let's bump the timeout by a factor of ten. With that change I have not
> been able to reproduce the failure in 150+ iterations. I am also sneaking
> in a small simplification to the test_progs test selection logic.
>
> Signed-off-by: Daniel Müller <deso@posteo.net>
I reproduced the failure and can no longer reproduce it with the fix.
Acked-by: Jiri Olsa <jolsa@kernel.org>
jirka
* Re: [PATCH bpf-next] selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
From: Yonghong Song @ 2022-07-28 17:28 UTC
To: Daniel Müller, bpf, ast, andrii, daniel, kernel-team
On 7/27/22 11:29 AM, Daniel Müller wrote:
> The send_signal/send_signal_tracepoint test is pretty flaky, with at least
> one failure in every ten runs over the few attempts I have made:
> > test_send_signal_common:PASS:pipe_c2p 0 nsec
> > test_send_signal_common:PASS:pipe_p2c 0 nsec
> > test_send_signal_common:PASS:fork 0 nsec
> > test_send_signal_common:PASS:skel_open_and_load 0 nsec
> > test_send_signal_common:PASS:skel_attach 0 nsec
> > test_send_signal_common:PASS:pipe_read 0 nsec
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > test_send_signal_common:PASS:reading pipe 0 nsec
> > test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
> > test_send_signal_common:FAIL:incorrect result unexpected incorrect result: actual 48 != expected 50
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > #139/1 send_signal/send_signal_tracepoint:FAIL
>
> The reason does not appear to be a correctness issue in the strict
> sense. Rather, we merely do not receive the signal we are waiting for
> within the provided timeout.
> Let's bump the timeout by a factor of ten. With that change I have not
> been able to reproduce the failure in 150+ iterations. I am also sneaking
> in a small simplification to the test_progs test selection logic.
>
> Signed-off-by: Daniel Müller <deso@posteo.net>
Okay, this test has been improved *multiple* times to address its
flakiness. We tried very hard not to increase its runtime, so that
the overall test_progs run time does not grow. But it looks like
we have to do it to make the test robust. Hopefully a 10x increase
in the number of iterations can finally address the flakiness issue.
Acked-by: Yonghong Song <yhs@fb.com>
* Re: [PATCH bpf-next] selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
From: patchwork-bot+netdevbpf @ 2022-07-29 18:20 UTC
To: Daniel Müller <deso@posteo.net>
Cc: bpf, ast, andrii, daniel, kernel-team
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:
On Wed, 27 Jul 2022 18:29:55 +0000 you wrote:
> The send_signal/send_signal_tracepoint test is pretty flaky, with at least
> one failure in every ten runs over the few attempts I have made:
> > test_send_signal_common:PASS:pipe_c2p 0 nsec
> > test_send_signal_common:PASS:pipe_p2c 0 nsec
> > test_send_signal_common:PASS:fork 0 nsec
> > test_send_signal_common:PASS:skel_open_and_load 0 nsec
> > test_send_signal_common:PASS:skel_attach 0 nsec
> > test_send_signal_common:PASS:pipe_read 0 nsec
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > test_send_signal_common:PASS:reading pipe 0 nsec
> > test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
> > test_send_signal_common:FAIL:incorrect result unexpected incorrect result: actual 48 != expected 50
> > test_send_signal_common:PASS:pipe_write 0 nsec
> > #139/1 send_signal/send_signal_tracepoint:FAIL
>
> [...]
Here is the summary with links:
- [bpf-next] selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
https://git.kernel.org/bpf/bpf-next/c/639de43ef0dd
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html