* [PATCH bpf v2 0/2] bpf: Fix deadlock with rq_lock in bpf_send_signal()
@ 2020-03-04 17:53 Yonghong Song
  2020-03-04 17:53 ` [PATCH bpf v2 1/2] " Yonghong Song
  2020-03-04 17:53 ` [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test Yonghong Song
  0 siblings, 2 replies; 7+ messages in thread
From: Yonghong Song @ 2020-03-04 17:53 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team

Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper") introduced
the bpf_send_signal() helper and commit 8482941f0906 ("bpf: Add
bpf_send_signal_thread() helper") added the bpf_send_signal_thread()
helper. Both helpers try to send a signal to the current process or
thread.

When a bpf prog is called with the scheduler rq_lock held, a deadlock can
happen since bpf_send_signal() and bpf_send_signal_thread() call
group_send_sig_info(), which may ultimately want to acquire rq_lock again.
This happened with perf_sw_event on 5.2 and 4.16 based kernels in our
production environment.

In a different scenario, the following is also possible on the latest
kernel:

  cpu 1:  do_task_stat          <- holding sighand->siglock
          ...
          task_sched_runtime    <- trying to grab rq_lock

  cpu 2:  __schedule            <- holding rq_lock
          ...
          do_send_sig_info      <- trying to grab sighand->siglock

Commit eac9153f2b58 ("bpf/stackmap: Fix deadlock with rq_lock in
bpf_get_stack()") fixed a similar rq_lock deadlock. This patch set
addresses the issue in a similar way. Patch #1 provides the kernel fix
and Patch #2 adds a selftest.

Changelog:
  v1 -> v2:
    . The previous fix using task_work in nmi() was incorrect; there is
      no nmi() involvement here. Using task_work in all cases might be a
      solution, but I decided to use a fix similar to the one in
      commit eac9153f2b58.
Yonghong Song (2):
  bpf: Fix deadlock with rq_lock in bpf_send_signal()
  selftests/bpf: add send_signal_sched_switch test

 kernel/trace/bpf_trace.c                      |  5 +-
 .../bpf/prog_tests/send_signal_sched_switch.c | 89 +++++++++++++++++++
 .../bpf/progs/test_send_signal_kern.c         |  6 ++
 3 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/send_signal_sched_switch.c

-- 
2.17.1

^ permalink raw reply	[flat|nested] 7+ messages in thread
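The fix described in the cover letter boils down to a check-and-defer pattern: if signal delivery is not safe in the current context, validate the signal early and punt the actual delivery to irq_work. A plain-C userspace model of that decision (illustrative only; the names `send_signal_common_model`, `MODEL_NSIG`, and the enum are hypothetical and not the kernel code) looks like this:

```c
#include <stdbool.h>

/* Userspace model of the fix: when "irqs" are disabled (i.e. we may be
 * running under rq_lock), do not call into the signal path directly;
 * validate the signal number early and defer delivery instead.
 * All names here are hypothetical illustrations, not kernel APIs. */

#define MODEL_NSIG 64

enum model_delivery {
	MODEL_INVALID,   /* bad signal number, caught before deferring   */
	MODEL_DEFERRED,  /* would be delivered later from irq_work       */
	MODEL_DIRECT     /* safe to deliver via do_send_sig_info() now   */
};

enum model_delivery send_signal_common_model(unsigned int sig,
					     bool irqs_disabled)
{
	if (sig == 0 || sig >= MODEL_NSIG)
		return MODEL_INVALID;	/* error would be lost once deferred */
	if (irqs_disabled)
		return MODEL_DEFERRED;	/* avoid rq_lock recursion */
	return MODEL_DIRECT;
}
```

The early validity check matters because, once delivery is deferred, there is no way to report an invalid signal number back to the caller.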
* [PATCH bpf v2 1/2] bpf: Fix deadlock with rq_lock in bpf_send_signal()
  2020-03-04 17:53 [PATCH bpf v2 0/2] bpf: Fix deadlock with rq_lock in bpf_send_signal() Yonghong Song
@ 2020-03-04 17:53 ` Yonghong Song
  2020-03-04 17:53 ` [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test Yonghong Song
  1 sibling, 0 replies; 7+ messages in thread
From: Yonghong Song @ 2020-03-04 17:53 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Song Liu

When experimenting with the bpf_send_signal() helper in our production
environment (5.2 based), we experienced a deadlock in NMI mode:

   #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24
   #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012
   #7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd
   #8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55
   #9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602
  #10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a
  #11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227
  #12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140
  #13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf
  #14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09
  #15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47
  #16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d
  #17 [ffffc9002219fc90] schedule at ffffffff81a3e219
  #18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9
  #19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529
  #20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc
  #21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c
  #22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602
  #23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068

The above call stack is very similar to the issue reported in
commit eac9153f2b58 ("bpf/stackmap: Fix deadlock with rq_lock in
bpf_get_stack()") by Song Liu.
The only difference is that it involves the bpf_send_signal() helper
instead of the bpf_get_stack() helper. The above deadlock is triggered
with a perf_sw_event. Similar to commit eac9153f2b58, the reproducer
below uses the tracepoint sched/sched_switch so the issue can be easily
caught.

/* stress_test.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>	/* usleep() */
#include <time.h>	/* nanosleep() */
#include <sys/mman.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define THREAD_COUNT 1000

char *filename;

void *worker(void *p)
{
	void *ptr;
	char *pptr;
	int fd;

	fd = open(filename, O_RDONLY);
	if (fd < 0)
		return NULL;

	while (1) {
		struct timespec ts = {0, 1000 + rand() % 2000};

		ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
		usleep(1);
		if (ptr == MAP_FAILED) {
			printf("failed to mmap\n");
			break;
		}
		munmap(ptr, 4096 * 64);
		usleep(1);
		pptr = malloc(1);
		usleep(1);
		pptr[0] = 1;
		usleep(1);
		free(pptr);
		usleep(1);
		nanosleep(&ts, NULL);
	}
	close(fd);
	return NULL;
}

int main(int argc, char *argv[])
{
	pthread_t threads[THREAD_COUNT];
	int i;

	if (argc < 2)
		return 0;

	filename = argv[1];
	for (i = 0; i < THREAD_COUNT; i++) {
		if (pthread_create(threads + i, NULL, worker, NULL)) {
			fprintf(stderr, "Error creating thread\n");
			return 0;
		}
	}
	for (i = 0; i < THREAD_COUNT; i++)
		pthread_join(threads[i], NULL);
	return 0;
}

and the following steps:

1. run `stress_test /bin/ls` in one window
2. hack bcc trace.py with the following change:

--- a/tools/trace.py
+++ b/tools/trace.py
@@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s);
         __data.tgid = __tgid;
         __data.pid = __pid;
         bpf_get_current_comm(&__data.comm, sizeof(__data.comm));
+        bpf_send_signal(10);
         %s
         %s
         %s.perf_submit(%s, &__data, sizeof(__data));

3. in a different window, run

   ./trace.py -p $(pidof stress_test) t:sched:sched_switch

The deadlock can be reproduced. Similar to Song's fix, the fix here is to
delay sending the signal if irqs are disabled, to avoid deadlocks
involving rq_lock.
With this change, the above stress test in our production system no
longer causes a deadlock. I also implemented a scaled-down version of the
reproducer in the selftests (a subsequent commit). With the latest
bpf-next, it complains about the following deadlock:

[   32.832450] -> #1 (&p->pi_lock){-.-.}:
[   32.833100]        _raw_spin_lock_irqsave+0x44/0x80
[   32.833696]        task_rq_lock+0x2c/0xa0
[   32.834182]        task_sched_runtime+0x59/0xd0
[   32.834721]        thread_group_cputime+0x250/0x270
[   32.835304]        thread_group_cputime_adjusted+0x2e/0x70
[   32.835959]        do_task_stat+0x8a7/0xb80
[   32.836461]        proc_single_show+0x51/0xb0
               ...
[   32.839512] -> #0 (&(&sighand->siglock)->rlock){....}:
[   32.840275]        __lock_acquire+0x1358/0x1a20
[   32.840826]        lock_acquire+0xc7/0x1d0
[   32.841309]        _raw_spin_lock_irqsave+0x44/0x80
[   32.841916]        __lock_task_sighand+0x79/0x160
[   32.842465]        do_send_sig_info+0x35/0x90
[   32.842977]        bpf_send_signal+0xa/0x10
[   32.843464]        bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000
[   32.844301]        trace_call_bpf+0x115/0x270
[   32.844809]        perf_trace_run_bpf_submit+0x4a/0xc0
[   32.845411]        perf_trace_sched_switch+0x10f/0x180
[   32.846014]        __schedule+0x45d/0x880
[   32.846483]        schedule+0x5f/0xd0
               ...
[   32.853148] Chain exists of:
[   32.853148]   &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock
[   32.853148]
[   32.854451]  Possible unsafe locking scenario:
[   32.854451]
[   32.855173]        CPU0                    CPU1
[   32.855745]        ----                    ----
[   32.856278]   lock(&rq->lock);
[   32.856671]                               lock(&p->pi_lock);
[   32.857332]                               lock(&rq->lock);
[   32.857999]   lock(&(&sighand->siglock)->rlock);

The deadlock happens on CPU0 when it tries to acquire &sighand->siglock,
but that lock is held by CPU1 while CPU1 tries to grab &rq->lock and
cannot get it.

This is not exactly the call stack seen in our production environment,
but the symptom is similar: both locks are acquired with
spin_lock_irqsave(), and both chains involve rq_lock. The fix to delay
sending the signal when irqs are disabled also fixes this issue.
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 kernel/trace/bpf_trace.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 07764c761073..43ccfdbacb03 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -730,7 +730,10 @@ static int bpf_send_signal_common(u32 sig, enum pid_type type)
 	if (unlikely(!nmi_uaccess_okay()))
 		return -EPERM;
 
-	if (in_nmi()) {
+	/* Delay sending signal if irq is disabled. Otherwise,
+	 * we risk deadlock with rq_lock.
+	 */
+	if (irqs_disabled()) {
 		/* Do an early check on signal validity. Otherwise,
 		 * the error is lost in deferred irq_work.
 		 */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread
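The lockdep chain in the commit message above is the classic AB-BA pattern: two CPUs acquiring the same pair of locks in opposite order. The kernel fix sidesteps it by not entering the siglock path from this context at all, but for illustration, the generic userspace defence is to impose a single global acquisition order. The helper names below (`lock_pair`, `ordered_locking_demo`) are hypothetical, not from the patch:

```c
#include <pthread.h>

/* Illustrative only: acquire two mutexes in a fixed global order
 * (by address), so two threads calling lock_pair() with the arguments
 * swapped can never hold the pair in opposite order -- which is exactly
 * the inversion the lockdep report above warns about. */

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	if (x > y) {		/* always lock the lower address first */
		pthread_mutex_t *t = x;
		x = y;
		y = t;
	}
	pthread_mutex_lock(x);
	pthread_mutex_lock(y);
}

static void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	pthread_mutex_unlock(x);	/* unlock order does not matter */
	pthread_mutex_unlock(y);
}

/* Single-threaded smoke test of the helpers: take the pair in both
 * "apparent" orders. The ordering guarantee itself matters when two
 * threads call lock_pair() concurrently with swapped arguments.
 * Returns 1 on completion. */
int ordered_locking_demo(void)
{
	lock_pair(&lock_a, &lock_b);
	unlock_pair(&lock_a, &lock_b);
	lock_pair(&lock_b, &lock_a);
	unlock_pair(&lock_b, &lock_a);
	return 1;
}
```

Inside the scheduler this ordering discipline is not available (rq_lock and siglock belong to different subsystems), which is why the patch defers the signal instead of reordering locks.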
* [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test
  2020-03-04 17:53 [PATCH bpf v2 0/2] bpf: Fix deadlock with rq_lock in bpf_send_signal() Yonghong Song
  2020-03-04 17:53 ` [PATCH bpf v2 1/2] " Yonghong Song
@ 2020-03-04 17:53 ` Yonghong Song
  2020-03-04 18:08   ` Andrii Nakryiko
  1 sibling, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-03-04 17:53 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Song Liu

Added one test, send_signal_sched_switch, to test the bpf_send_signal()
helper triggered by the sched/sched_switch tracepoint. This test can be
used to verify the kernel deadlock fixed by the previous commit. The test
itself is heavily borrowed from commit eac9153f2b58 ("bpf/stackmap: Fix
deadlock with rq_lock in bpf_get_stack()").

Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 .../bpf/prog_tests/send_signal_sched_switch.c | 89 +++++++++++++++++++
 .../bpf/progs/test_send_signal_kern.c         |  6 ++
 2 files changed, 95 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/send_signal_sched_switch.c

diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal_sched_switch.c b/tools/testing/selftests/bpf/prog_tests/send_signal_sched_switch.c
new file mode 100644
index 000000000000..f5c9dbdeb173
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal_sched_switch.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <pthread.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include "test_send_signal_kern.skel.h"
+
+static void sigusr1_handler(int signum)
+{
+}
+
+#define THREAD_COUNT 100
+
+static char *filename;
+
+static void *worker(void *p)
+{
+	int err, fd, i = 0;
+	u32 duration = 0;
+	char *pptr;
+	void *ptr;
+
+	fd = open(filename, O_RDONLY);
+	if (CHECK(fd < 0, "open", "open failed %s\n", strerror(errno)))
+		return NULL;
+
+	while (i < 100) {
+		struct timespec ts = {0, 1000 + rand() % 2000};
+
+		ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
+		err = errno;
+		usleep(1);
+		if (CHECK(ptr == MAP_FAILED, "mmap", "mmap failed: %s\n",
+			  strerror(err)))
+			break;
+
+		munmap(ptr, 4096 * 64);
+		usleep(1);
+		pptr = malloc(1);
+		usleep(1);
+		pptr[0] = 1;
+		usleep(1);
+		free(pptr);
+		usleep(1);
+		nanosleep(&ts, NULL);
+		i++;
+	}
+	close(fd);
+	return NULL;
+}
+
+void test_send_signal_sched_switch(void)
+{
+	struct test_send_signal_kern *skel;
+	pthread_t threads[THREAD_COUNT];
+	u32 duration = 0;
+	int i, err;
+
+	filename = "/bin/ls";
+	signal(SIGUSR1, sigusr1_handler);
+
+	skel = test_send_signal_kern__open_and_load();
+	if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
+		return;
+
+	skel->bss->pid = getpid();
+	skel->bss->sig = SIGUSR1;
+
+	err = test_send_signal_kern__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed\n"))
+		goto destroy_skel;
+
+	for (i = 0; i < THREAD_COUNT; i++) {
+		err = pthread_create(threads + i, NULL, worker, NULL);
+		if (CHECK(err, "pthread_create", "Error creating thread, %s\n",
+			  strerror(errno)))
+			goto destroy_skel;
+	}
+
+	for (i = 0; i < THREAD_COUNT; i++)
+		pthread_join(threads[i], NULL);
+
+destroy_skel:
+	test_send_signal_kern__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_send_signal_kern.c b/tools/testing/selftests/bpf/progs/test_send_signal_kern.c
index 1acc91e87bfc..b4233d3efac2 100644
--- a/tools/testing/selftests/bpf/progs/test_send_signal_kern.c
+++ b/tools/testing/selftests/bpf/progs/test_send_signal_kern.c
@@ -31,6 +31,12 @@ int send_signal_tp(void *ctx)
 	return bpf_send_signal_test(ctx);
 }
 
+SEC("tracepoint/sched/sched_switch")
+int send_signal_tp_sched(void *ctx)
+{
+	return bpf_send_signal_test(ctx);
+}
+
 SEC("perf_event")
 int send_signal_perf(void *ctx)
 {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test
  2020-03-04 17:53 ` [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test Yonghong Song
@ 2020-03-04 18:08   ` Andrii Nakryiko
  2020-03-04 18:12     ` Yonghong Song
  0 siblings, 1 reply; 7+ messages in thread
From: Andrii Nakryiko @ 2020-03-04 18:08 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Song Liu

On Wed, Mar 4, 2020 at 9:53 AM Yonghong Song <yhs@fb.com> wrote:
>
> Added one test, send_signal_sched_switch, to test the bpf_send_signal()
> helper triggered by the sched/sched_switch tracepoint. This test can be
> used to verify the kernel deadlock fixed by the previous commit. The test
> itself is heavily borrowed from commit eac9153f2b58 ("bpf/stackmap: Fix
> deadlock with rq_lock in bpf_get_stack()").
>
> Cc: Song Liu <songliubraving@fb.com>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---

[...]

> +	while (i < 100) {
> +		struct timespec ts = {0, 1000 + rand() % 2000};
> +
> +		ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
> +		err = errno;
> +		usleep(1);
> +		if (CHECK(ptr == MAP_FAILED, "mmap", "mmap failed: %s\n",
> +			  strerror(err)))
> +			break;
> +
> +		munmap(ptr, 4096 * 64);
> +		usleep(1);
> +		pptr = malloc(1);
> +		usleep(1);
> +		pptr[0] = 1;
> +		usleep(1);
> +		free(pptr);
> +		usleep(1);
> +		nanosleep(&ts, NULL);
> +		i++;

Any specific reason to do mmap()/munmap() in a loop? Would, say,
getpid syscall work just fine as well to trigger a bunch of context
switches? Or even just usleep(1) alone?

[...]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test
  2020-03-04 18:08   ` Andrii Nakryiko
@ 2020-03-04 18:12     ` Yonghong Song
  2020-03-04 18:30       ` Song Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-03-04 18:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Song Liu

On 3/4/20 10:08 AM, Andrii Nakryiko wrote:
> On Wed, Mar 4, 2020 at 9:53 AM Yonghong Song <yhs@fb.com> wrote:
>>
>> Added one test, send_signal_sched_switch, to test the bpf_send_signal()
>> helper triggered by the sched/sched_switch tracepoint.

[...]

>> +		free(pptr);
>> +		usleep(1);
>> +		nanosleep(&ts, NULL);
>> +		i++;
>
> Any specific reason to do mmap()/munmap() in a loop? Would, say,
> getpid syscall work just fine as well to trigger a bunch of context
> switches? Or even just usleep(1) alone?

No particular reason. Copied from Song's original reproducer and
it is working. Maybe Song can comment on why it is written this way.

[...]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test
  2020-03-04 18:12     ` Yonghong Song
@ 2020-03-04 18:30       ` Song Liu
  2020-03-04 18:34         ` Yonghong Song
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2020-03-04 18:30 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team

> On Mar 4, 2020, at 10:12 AM, Yonghong Song <yhs@fb.com> wrote:
>
> On 3/4/20 10:08 AM, Andrii Nakryiko wrote:

[...]

>> Any specific reason to do mmap()/munmap() in a loop? Would, say,
>> getpid syscall work just fine as well to trigger a bunch of context
>> switches? Or even just usleep(1) alone?
>
> No particular reason. Copied from Song's original reproducer and
> it is working. Maybe Song can comment on why it is written this way.

In that test, mmap/munmap are used to lock mmap_sem. I guess we
don't really need them in this test.

Thanks,
Song

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test
  2020-03-04 18:30       ` Song Liu
@ 2020-03-04 18:34         ` Yonghong Song
  0 siblings, 0 replies; 7+ messages in thread
From: Yonghong Song @ 2020-03-04 18:34 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On 3/4/20 10:30 AM, Song Liu wrote:
>> On Mar 4, 2020, at 10:12 AM, Yonghong Song <yhs@fb.com> wrote:
>>
>> On 3/4/20 10:08 AM, Andrii Nakryiko wrote:
>>> Any specific reason to do mmap()/munmap() in a loop? Would, say,
>>> getpid syscall work just fine as well to trigger a bunch of context
>>> switches? Or even just usleep(1) alone?
>>
>> No particular reason. Copied from Song's original reproducer and
>> it is working. Maybe Song can comment on why it is written this way.
>
> In that test, mmap/munmap are used to lock mmap_sem. I guess we
> don't really need them in this test.

Thanks. I will simplify the test then. The goal is to get more
context switches.

> Thanks,
> Song

^ permalink raw reply	[flat|nested] 7+ messages in thread
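Following Song's point, a simplified worker that drops the mmap()/munmap() churn and keeps only the sleeps that force context switches might look like the sketch below (a hypothetical illustration of the proposed simplification; `worker_simple` is not the name used in any later revision):

```c
#include <unistd.h>

/* Hypothetical simplified worker: each usleep() blocks the thread and
 * forces a sched_switch, which is all the test needs to fire the
 * sched/sched_switch tracepoint program repeatedly. The mmap/munmap
 * calls in the original loop existed only to exercise mmap_sem. */
static void *worker_simple(void *p)
{
	int i;

	for (i = 0; i < 100; i++)
		usleep(1);
	return p;	/* pass the argument through, pthread-style */
}
```

With THREAD_COUNT such workers running concurrently, the signal handler still gets exercised under heavy scheduling, which is what the deadlock reproduction depends on.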
end of thread, other threads:[~2020-03-04 18:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04 17:53 [PATCH bpf v2 0/2] bpf: Fix deadlock with rq_lock in bpf_send_signal() Yonghong Song
2020-03-04 17:53 ` [PATCH bpf v2 1/2] " Yonghong Song
2020-03-04 17:53 ` [PATCH bpf v2 2/2] selftests/bpf: add send_signal_sched_switch test Yonghong Song
2020-03-04 18:08   ` Andrii Nakryiko
2020-03-04 18:12     ` Yonghong Song
2020-03-04 18:30       ` Song Liu
2020-03-04 18:34         ` Yonghong Song