From: Jakub Sitnicki <jakub@cloudflare.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com,
John Fastabend <john.fastabend@gmail.com>
Subject: [PATCH bpf 0/3] Fix locking order and synchronization on sockmap/sockhash tear-down
Date: Thu, 6 Feb 2020 12:16:49 +0100 [thread overview]
Message-ID: <20200206111652.694507-1-jakub@cloudflare.com> (raw)
Couple of fixes that came from recent discussion [0] on commit
7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down").
This series doesn't address the sleeping while holding a spinlock
problem. We're still trying to decide how to fix that [1].
Until then sockmap users might see the following warnings:
| BUG: sleeping function called from invalid context at net/core/sock.c:2935
| in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 62, name: kworker/0:1
| 3 locks held by kworker/0:1/62:
| #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
| #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
| #2: ffff8881381f6df8 (&stab->lock){+...}, at: sock_map_free+0x26/0x180
| CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04008-g7b083332376e #454
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: events bpf_map_free_deferred
| Call Trace:
| dump_stack+0x71/0xa0
| ___might_sleep.cold+0xa6/0xb6
| lock_sock_nested+0x28/0x90
| sock_map_free+0x5f/0x180
| bpf_map_free_deferred+0x58/0x80
| process_one_work+0x260/0x5e0
| worker_thread+0x4d/0x3e0
| kthread+0x108/0x140
| ? process_one_work+0x5e0/0x5e0
| ? kthread_park+0x90/0x90
| ret_from_fork+0x3a/0x50
|
| ======================================================
| WARNING: possible circular locking dependency detected
| 5.5.0-04008-g7b083332376e #454 Tainted: G W
| ------------------------------------------------------
| kworker/0:1/62 is trying to acquire lock:
| ffff88813b280130 (sk_lock-AF_INET){+.+.}, at: sock_map_free+0x5f/0x180
|
| but task is already holding lock:
| ffff8881381f6df8 (&stab->lock){+...}, at: sock_map_free+0x26/0x180
|
| which lock already depends on the new lock.
|
|
| the existing dependency chain (in reverse order) is:
|
| -> #1 (&stab->lock){+...}:
| _raw_spin_lock_bh+0x39/0x80
| sock_map_update_common+0xdc/0x300
| sock_map_update_elem+0xc3/0x150
| __do_sys_bpf+0x1285/0x1620
| do_syscall_64+0x6d/0x690
| entry_SYSCALL_64_after_hwframe+0x49/0xbe
|
| -> #0 (sk_lock-AF_INET){+.+.}:
| __lock_acquire+0xe2f/0x19f0
| lock_acquire+0x95/0x190
| lock_sock_nested+0x6b/0x90
| sock_map_free+0x5f/0x180
| bpf_map_free_deferred+0x58/0x80
| process_one_work+0x260/0x5e0
| worker_thread+0x4d/0x3e0
| kthread+0x108/0x140
| ret_from_fork+0x3a/0x50
|
| other info that might help us debug this:
|
| Possible unsafe locking scenario:
|
| CPU0 CPU1
| ---- ----
| lock(&stab->lock);
| lock(sk_lock-AF_INET);
| lock(&stab->lock);
| lock(sk_lock-AF_INET);
|
| *** DEADLOCK ***
|
| 3 locks held by kworker/0:1/62:
| #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
| #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
| #2: ffff8881381f6df8 (&stab->lock){+...}, at: sock_map_free+0x26/0x180
|
| stack backtrace:
| CPU: 0 PID: 62 Comm: kworker/0:1 Tainted: G W 5.5.0-04008-g7b083332376e #454
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: events bpf_map_free_deferred
| Call Trace:
| dump_stack+0x71/0xa0
| check_noncircular+0x176/0x190
| __lock_acquire+0xe2f/0x19f0
| lock_acquire+0x95/0x190
| ? sock_map_free+0x5f/0x180
| lock_sock_nested+0x6b/0x90
| ? sock_map_free+0x5f/0x180
| sock_map_free+0x5f/0x180
| bpf_map_free_deferred+0x58/0x80
| process_one_work+0x260/0x5e0
| worker_thread+0x4d/0x3e0
| kthread+0x108/0x140
| ? process_one_work+0x5e0/0x5e0
| ? kthread_park+0x90/0x90
| ret_from_fork+0x3a/0x50
[0] https://lore.kernel.org/bpf/8736boor55.fsf@cloudflare.com/
[1] https://lore.kernel.org/bpf/5e3ba96ca7889_6b512aafe4b145b812@john-XPS-13-9370.notmuch/
Jakub Sitnicki (3):
bpf, sockmap: Don't sleep while holding RCU lock on tear-down
bpf, sockhash: synchronize_rcu before free'ing map
selftests/bpf: Test freeing sockmap/sockhash with a socket in it
net/core/sock_map.c | 12 ++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 74 +++++++++++++++++++
2 files changed, 82 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
--
2.24.1
next reply other threads:[~2020-02-06 11:17 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-06 11:16 Jakub Sitnicki [this message]
2020-02-06 11:16 ` [PATCH bpf 1/3] bpf, sockmap: Don't sleep while holding RCU lock on tear-down Jakub Sitnicki
2020-02-06 18:59 ` John Fastabend
2020-02-06 11:16 ` [PATCH bpf 2/3] bpf, sockhash: synchronize_rcu before free'ing map Jakub Sitnicki
2020-02-06 19:01 ` John Fastabend
2020-02-06 11:16 ` [PATCH bpf 3/3] selftests/bpf: Test freeing sockmap/sockhash with a socket in it Jakub Sitnicki
2020-02-06 19:03 ` John Fastabend
2020-02-09 2:41 ` Alexei Starovoitov
2020-02-09 4:12 ` Yonghong Song
2020-02-09 15:29 ` Jakub Sitnicki
2020-02-10 3:55 ` John Fastabend
2020-02-10 11:52 ` Jakub Sitnicki
2020-02-10 23:38 ` Daniel Borkmann
2020-02-06 19:43 ` [PATCH bpf 0/3] Fix locking order and synchronization on sockmap/sockhash tear-down John Fastabend
2020-02-07 10:45 ` Jakub Sitnicki
2020-03-10 16:44 ` John Fastabend
2020-03-11 11:51 ` Jakub Sitnicki
2020-03-10 11:30 ` Jakub Sitnicki
2020-02-07 21:56 ` Daniel Borkmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200206111652.694507-1-jakub@cloudflare.com \
--to=jakub@cloudflare.com \
--cc=bpf@vger.kernel.org \
--cc=john.fastabend@gmail.com \
--cc=kernel-team@cloudflare.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).