* [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test
@ 2023-04-03 21:58 YiFei Zhu
2023-04-03 22:16 ` Stanislav Fomichev
2023-04-03 23:44 ` Martin KaFai Lau
0 siblings, 2 replies; 3+ messages in thread
From: YiFei Zhu @ 2023-04-03 21:58 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
Martin KaFai Lau, Andrii Nakryiko
In some cases the loopback latency might be large enough, causing
the assertion on invocations to be run before ingress prog getting
executed. The assertion would fail and the test would flake.
This can be reliably reproduced by arbitrarily increaing the loopback
latency (thanks to [1]):
tc qdisc add dev lo root handle 1: htb default 12
tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps
tc qdisc add dev lo parent 1:12 netem delay 100ms
Fix this by polling on the receive end and waiting for up to a
second, instead of instantly returning to the assert.
[1] https://gist.github.com/kstevens715/4598301
Reported-by: Martin KaFai Lau <martin.lau@linux.dev>
Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/
Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
.../testing/selftests/bpf/prog_tests/cg_storage_multi.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
index 621c57222191..3b0094a2a353 100644
--- a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
+++ b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
@@ -7,6 +7,7 @@
#include <test_progs.h>
#include <cgroup_helpers.h>
#include <network_helpers.h>
+#include <poll.h>
#include "progs/cg_storage_multi.h"
@@ -56,8 +57,9 @@ static bool assert_storage_noexist(struct bpf_map *map, const void *key)
static bool connect_send(const char *cgroup_path)
{
- bool res = true;
int server_fd = -1, client_fd = -1;
+ struct pollfd pollfd;
+ bool res = true;
if (join_cgroup(cgroup_path))
goto out_clean;
@@ -73,6 +75,11 @@ static bool connect_send(const char *cgroup_path)
if (send(client_fd, "message", strlen("message"), 0) < 0)
goto out_clean;
+ pollfd.fd = server_fd;
+ pollfd.events = POLLIN;
+ if (poll(&pollfd, 1, 1000) != 1)
+ goto out_clean;
+
res = false;
out_clean:
--
2.40.0.348.gf938b09366-goog
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test
2023-04-03 21:58 [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test YiFei Zhu
@ 2023-04-03 22:16 ` Stanislav Fomichev
2023-04-03 23:44 ` Martin KaFai Lau
1 sibling, 0 replies; 3+ messages in thread
From: Stanislav Fomichev @ 2023-04-03 22:16 UTC (permalink / raw)
To: YiFei Zhu
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
Andrii Nakryiko
On Mon, Apr 3, 2023 at 2:59 PM YiFei Zhu <zhuyifei@google.com> wrote:
>
> In some cases the loopback latency might be large enough, causing
> the assertion on invocations to be run before ingress prog getting
> executed. The assertion would fail and the test would flake.
>
> This can be reliably reproduced by arbitrarily increaing the loopback
> latency (thanks to [1]):
> tc qdisc add dev lo root handle 1: htb default 12
> tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps
> tc qdisc add dev lo parent 1:12 netem delay 100ms
>
> Fix this by polling on the receive end and waiting for up to a
> second, instead of instantly returning to the assert.
>
> [1] https://gist.github.com/kstevens715/4598301
>
> Reported-by: Martin KaFai Lau <martin.lau@linux.dev>
> Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/
> Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress")
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Thank you!
Acked-by: Stanislav Fomichev <sdf@google.com>
> ---
> .../testing/selftests/bpf/prog_tests/cg_storage_multi.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> index 621c57222191..3b0094a2a353 100644
> --- a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> +++ b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> @@ -7,6 +7,7 @@
> #include <test_progs.h>
> #include <cgroup_helpers.h>
> #include <network_helpers.h>
> +#include <poll.h>
>
> #include "progs/cg_storage_multi.h"
>
> @@ -56,8 +57,9 @@ static bool assert_storage_noexist(struct bpf_map *map, const void *key)
>
> static bool connect_send(const char *cgroup_path)
> {
> - bool res = true;
> int server_fd = -1, client_fd = -1;
> + struct pollfd pollfd;
> + bool res = true;
>
> if (join_cgroup(cgroup_path))
> goto out_clean;
> @@ -73,6 +75,11 @@ static bool connect_send(const char *cgroup_path)
> if (send(client_fd, "message", strlen("message"), 0) < 0)
> goto out_clean;
>
> + pollfd.fd = server_fd;
> + pollfd.events = POLLIN;
> + if (poll(&pollfd, 1, 1000) != 1)
> + goto out_clean;
> +
> res = false;
>
> out_clean:
> --
> 2.40.0.348.gf938b09366-goog
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test
2023-04-03 21:58 [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test YiFei Zhu
2023-04-03 22:16 ` Stanislav Fomichev
@ 2023-04-03 23:44 ` Martin KaFai Lau
1 sibling, 0 replies; 3+ messages in thread
From: Martin KaFai Lau @ 2023-04-03 23:44 UTC (permalink / raw)
To: YiFei Zhu
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
Andrii Nakryiko, bpf
On 4/3/23 2:58 PM, YiFei Zhu wrote:
> In some cases the loopback latency might be large enough, causing
> the assertion on invocations to be run before ingress prog getting
> executed. The assertion would fail and the test would flake.
>
> This can be reliably reproduced by arbitrarily increaing the loopback
> latency (thanks to [1]):
> tc qdisc add dev lo root handle 1: htb default 12
> tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps
> tc qdisc add dev lo parent 1:12 netem delay 100ms
>
> Fix this by polling on the receive end and waiting for up to a
> second, instead of instantly returning to the assert.
>
> [1] https://gist.github.com/kstevens715/4598301
>
> Reported-by: Martin KaFai Lau <martin.lau@linux.dev>
> Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/
> Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress")
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> ---
> .../testing/selftests/bpf/prog_tests/cg_storage_multi.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> index 621c57222191..3b0094a2a353 100644
> --- a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> +++ b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c
> @@ -7,6 +7,7 @@
> #include <test_progs.h>
> #include <cgroup_helpers.h>
> #include <network_helpers.h>
> +#include <poll.h>
>
> #include "progs/cg_storage_multi.h"
>
> @@ -56,8 +57,9 @@ static bool assert_storage_noexist(struct bpf_map *map, const void *key)
>
> static bool connect_send(const char *cgroup_path)
> {
> - bool res = true;
> int server_fd = -1, client_fd = -1;
> + struct pollfd pollfd;
> + bool res = true;
>
> if (join_cgroup(cgroup_path))
> goto out_clean;
> @@ -73,6 +75,11 @@ static bool connect_send(const char *cgroup_path)
> if (send(client_fd, "message", strlen("message"), 0) < 0)
> goto out_clean;
>
> + pollfd.fd = server_fd;
> + pollfd.events = POLLIN;
> + if (poll(&pollfd, 1, 1000) != 1)
> + goto out_clean;
Thanks for the fix. The slowness explanation makes sense.
A nit. All start_server() has a 3s SO_RCVTIMEO by default. How about a read()
here instead of a poll(). Easier to change the default read timeout for all
tests if needed.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-04-03 23:44 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-03 21:58 [PATCH bpf] selftests/bpf: Poll for receive in cg_storage_multi test YiFei Zhu
2023-04-03 22:16 ` Stanislav Fomichev
2023-04-03 23:44 ` Martin KaFai Lau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).