* [PATCH bpf 0/2] Fixes for sock_hash_free
@ 2020-06-07 20:52 Jakub Sitnicki
  2020-06-07 20:52 ` [PATCH bpf 1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free Jakub Sitnicki
  2020-06-07 20:52 ` [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free Jakub Sitnicki
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Sitnicki @ 2020-06-07 20:52 UTC (permalink / raw)
  To: bpf; +Cc: netdev, kernel-team, Eric Dumazet, John Fastabend

This series is an attempt to fix a race in sock_hash_free recently reported
by Eric [0]. The race, and a memory leak I found along the way, can be
triggered by the crude reproducer posted below: it fills a sockhash with
listening sockets, then races close() on the map against close() on the
sockets, so that sock_hash_free runs concurrently with
sock_hash_delete_from_link.

[0] https://lore.kernel.org/bpf/6f8bb6d8-bb70-4533-f15b-310db595d334@gmail.com/

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: John Fastabend <john.fastabend@gmail.com>

--8<--

/* Build hint (an assumption, not part of the original posting):
 *   gcc -o race_sock_unlink repro.c -lbpf -lpthread
 */
#include <errno.h>
#include <error.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#include <bpf/bpf.h>

enum { NUM_SOCKS = 1000 };

/* Worker thread: close the map FD, triggering sock_hash_free. */
static void *close_map(void *map)
{
	close(*(int *)map);
	return NULL;
}

int main(void)
{
	int sock[NUM_SOCKS];
	pthread_t worker;
	int map;
	int i, err;

	map = bpf_create_map(BPF_MAP_TYPE_SOCKHASH, sizeof(int), sizeof(int),
			     NUM_SOCKS, 0);
	if (map < 0)
		error(1, errno, "map create");

	/* Fill the sockhash with listening TCP sockets. */
	for (i = 0; i < NUM_SOCKS; i++) {
		int fd = socket(AF_INET, SOCK_STREAM, 0);
		if (fd < 0)
			error(1, errno, "socket");

		err = listen(fd, SOMAXCONN);
		if (err)
			error(1, errno, "listen");

		sock[i] = fd;
		err = bpf_map_update_elem(map, &i, &fd, BPF_ANY);
		if (err)
			error(1, errno, "map update");
	}

	/* Race map destruction (sock_hash_free) against the sockets
	 * unlinking themselves (sock_hash_delete_from_link) below.
	 */
	err = pthread_create(&worker, NULL, close_map, &map);
	if (err)
		error(1, err, "thread create");

	/* usleep(100); */	/* uncomment to give map close a head start */

	for (i = 0; i < NUM_SOCKS; i++)
		close(sock[i]);

	pthread_join(worker, NULL);
	return 0;
}
-->8--
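
For reference, the leak can be observed after running the reproducer by
triggering a scan via the standard kmemleak debugfs interface:

	echo scan > /sys/kernel/debug/kmemleak
	cat /sys/kernel/debug/kmemleak

Catching the race reliably needs a KASAN-enabled kernel; see the report
in patch 2.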

Jakub Sitnicki (2):
  bpf, sockhash: Fix memory leak when unlinking sockets in
    sock_hash_free
  bpf, sockhash: Synchronize delete from bucket list on map free

 net/core/sock_map.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

-- 
2.25.4



* [PATCH bpf 1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free
  2020-06-07 20:52 [PATCH bpf 0/2] Fixes for sock_hash_free Jakub Sitnicki
@ 2020-06-07 20:52 ` Jakub Sitnicki
  2020-06-09 16:56   ` John Fastabend
  2020-06-07 20:52 ` [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free Jakub Sitnicki
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Sitnicki @ 2020-06-07 20:52 UTC (permalink / raw)
  To: bpf; +Cc: netdev, kernel-team

When sockhash gets destroyed while sockets are still linked to it, we will
walk the bucket lists and delete the links. However, we are not freeing the
list elements after processing them, leaking the memory.
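
For context, the element-unlinking loop in sock_hash_free currently looks
as follows (reconstructed from the diffs in this series; note that nothing
frees elem after release_sock):

	for (i = 0; i < htab->buckets_num; i++) {
		bucket = sock_hash_select_bucket(htab, i);
		hlist_for_each_entry_safe(elem, node, &bucket->head, node) {
			hlist_del_rcu(&elem->node);
			lock_sock(elem->sk);
			rcu_read_lock();
			sock_map_unref(elem->sk, elem);
			rcu_read_unlock();
			release_sock(elem->sk);
			/* elem is leaked here */
		}
	}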

The leak can be triggered by close()'ing a sockhash map when it still
contains sockets, and observed with kmemleak:

  unreferenced object 0xffff888116e86f00 (size 64):
    comm "race_sock_unlin", pid 223, jiffies 4294731063 (age 217.404s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      81 de e8 41 00 00 00 00 c0 69 2f 15 81 88 ff ff  ...A.....i/.....
    backtrace:
      [<00000000dd089ebb>] sock_hash_update_common+0x4ca/0x760
      [<00000000b8219bd5>] sock_hash_update_elem+0x1d2/0x200
      [<000000005e2c23de>] __do_sys_bpf+0x2046/0x2990
      [<00000000d0084618>] do_syscall_64+0xad/0x9a0
      [<000000000d96f263>] entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fix it by freeing the list element when we're done with it.
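
For reference, sock_hash_free_elem already exists in net/core/sock_map.c;
it drops the element count and frees the element after an RCU grace
period, roughly (sketch, the exact body may differ):

	static void sock_hash_free_elem(struct bpf_htab *htab,
					struct bpf_htab_elem *elem)
	{
		atomic_dec(&htab->count);
		kfree_rcu(elem, rcu);
	}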

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 00a26cf2cfe9..ea46f07a22d8 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1031,6 +1031,7 @@ static void sock_hash_free(struct bpf_map *map)
 			sock_map_unref(elem->sk, elem);
 			rcu_read_unlock();
 			release_sock(elem->sk);
+			sock_hash_free_elem(htab, elem);
 		}
 	}
 
-- 
2.25.4



* [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free
  2020-06-07 20:52 [PATCH bpf 0/2] Fixes for sock_hash_free Jakub Sitnicki
  2020-06-07 20:52 ` [PATCH bpf 1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free Jakub Sitnicki
@ 2020-06-07 20:52 ` Jakub Sitnicki
  2020-06-09 17:09   ` John Fastabend
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Sitnicki @ 2020-06-07 20:52 UTC (permalink / raw)
  To: bpf; +Cc: netdev, kernel-team, Eric Dumazet

We can end up modifying the sockhash bucket list from two CPUs when a
sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
that is in the sockhash is unlinking itself from it on another CPU
(sock_hash_delete_from_link).
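
For context, the unlink path removes the element while holding the bucket
lock. Abridged, reconstructed sketch of sock_hash_delete_from_link (the
exact source may differ in detail):

	raw_spin_lock_bh(&bucket->lock);
	elem_probe = sock_hash_lookup_elem_raw(&bucket->head, elem->hash,
					       elem->key, map->key_size);
	if (elem_probe && elem_probe == elem) {
		hlist_del_rcu(&elem->node);
		sock_map_unref(elem->sk, elem);
		sock_hash_free_elem(htab, elem);
	}
	raw_spin_unlock_bh(&bucket->lock);

sock_hash_free, on the other hand, has walked the bucket lists without
taking bucket->lock since commit 90db6d772f74, so the two deletes can run
concurrently.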

This results in accessing a list element that is in an undefined state as
reported by KASAN:

| ==================================================================
| BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280
| Write of size 8 at addr dead000000000122 by task kworker/2:1/95
|
| CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: events bpf_map_free_deferred
| Call Trace:
|  dump_stack+0x97/0xe0
|  ? sock_hash_free+0x13c/0x280
|  __kasan_report.cold+0x5/0x40
|  ? mark_lock+0xbc1/0xc00
|  ? sock_hash_free+0x13c/0x280
|  kasan_report+0x38/0x50
|  ? sock_hash_free+0x152/0x280
|  sock_hash_free+0x13c/0x280
|  bpf_map_free_deferred+0xb2/0xd0
|  ? bpf_map_charge_finish+0x50/0x50
|  ? rcu_read_lock_sched_held+0x81/0xb0
|  ? rcu_read_lock_bh_held+0x90/0x90
|  process_one_work+0x59a/0xac0
|  ? lock_release+0x3b0/0x3b0
|  ? pwq_dec_nr_in_flight+0x110/0x110
|  ? rwlock_bug.part.0+0x60/0x60
|  worker_thread+0x7a/0x680
|  ? _raw_spin_unlock_irqrestore+0x4c/0x60
|  kthread+0x1cc/0x220
|  ? process_one_work+0xac0/0xac0
|  ? kthread_create_on_node+0xa0/0xa0
|  ret_from_fork+0x24/0x30
| ==================================================================

The write address, dead000000000122, is LIST_POISON2: the element's pprev
pointer had already been poisoned by a concurrent hlist_del_rcu, that is,
the element had been unlinked once before.

Fix it by reintroducing a spin-lock protected critical section around the
code that removes the elements from the bucket on sockhash free.

To do that we also need to defer processing of the removed elements until
we are out of atomic context, so that we can hold the sock lock while
unlinking the socket from the map.

Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index ea46f07a22d8..17a40a947546 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1013,6 +1013,7 @@ static void sock_hash_free(struct bpf_map *map)
 {
 	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
 	struct bpf_htab_bucket *bucket;
+	struct hlist_head unlink_list;
 	struct bpf_htab_elem *elem;
 	struct hlist_node *node;
 	int i;
@@ -1024,13 +1025,31 @@ static void sock_hash_free(struct bpf_map *map)
 	synchronize_rcu();
 	for (i = 0; i < htab->buckets_num; i++) {
 		bucket = sock_hash_select_bucket(htab, i);
-		hlist_for_each_entry_safe(elem, node, &bucket->head, node) {
-			hlist_del_rcu(&elem->node);
+
+		/* We are racing with sock_hash_delete_from_link to
+		 * enter the spin-lock critical section. Every socket on
+		 * the list is still linked to sockhash. Since link
+		 * exists, psock exists and holds a ref to socket. That
+		 * lets us grab a socket ref too.
+		 */
+		raw_spin_lock_bh(&bucket->lock);
+		hlist_for_each_entry(elem, &bucket->head, node)
+			sock_hold(elem->sk);
+		hlist_move_list(&bucket->head, &unlink_list);
+		raw_spin_unlock_bh(&bucket->lock);
+
+		/* Process removed entries out of atomic context to
+		 * block for socket lock before deleting the psock's
+		 * link to sockhash.
+		 */
+		hlist_for_each_entry_safe(elem, node, &unlink_list, node) {
+			hlist_del(&elem->node);
 			lock_sock(elem->sk);
 			rcu_read_lock();
 			sock_map_unref(elem->sk, elem);
 			rcu_read_unlock();
 			release_sock(elem->sk);
+			sock_put(elem->sk);
 			sock_hash_free_elem(htab, elem);
 		}
 	}
-- 
2.25.4



* RE: [PATCH bpf 1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free
  2020-06-07 20:52 ` [PATCH bpf 1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free Jakub Sitnicki
@ 2020-06-09 16:56   ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2020-06-09 16:56 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf; +Cc: netdev, kernel-team

Jakub Sitnicki wrote:
> When sockhash gets destroyed while sockets are still linked to it, we will
> walk the bucket lists and delete the links. However, we are not freeing the
> list elements after processing them, leaking the memory.
> 
> The leak can be triggered by close()'ing a sockhash map when it still
> contains sockets, and observed with kmemleak:
> 
>   unreferenced object 0xffff888116e86f00 (size 64):
>     comm "race_sock_unlin", pid 223, jiffies 4294731063 (age 217.404s)
>     hex dump (first 32 bytes):
>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>       81 de e8 41 00 00 00 00 c0 69 2f 15 81 88 ff ff  ...A.....i/.....
>     backtrace:
>       [<00000000dd089ebb>] sock_hash_update_common+0x4ca/0x760
>       [<00000000b8219bd5>] sock_hash_update_elem+0x1d2/0x200
>       [<000000005e2c23de>] __do_sys_bpf+0x2046/0x2990
>       [<00000000d0084618>] do_syscall_64+0xad/0x9a0
>       [<000000000d96f263>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
> 
> Fix it by freeing the list element when we're done with it.
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/core/sock_map.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 00a26cf2cfe9..ea46f07a22d8 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -1031,6 +1031,7 @@ static void sock_hash_free(struct bpf_map *map)
>  			sock_map_unref(elem->sk, elem);
>  			rcu_read_unlock();
>  			release_sock(elem->sk);
> +			sock_hash_free_elem(htab, elem);
>  		}
>  	}
>  
> -- 
> 2.25.4
> 

In Cilium we pin the map and never release it; thanks for catching this.

Acked-by: John Fastabend <john.fastabend@gmail.com>


* RE: [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free
  2020-06-07 20:52 ` [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free Jakub Sitnicki
@ 2020-06-09 17:09   ` John Fastabend
  2020-06-09 18:04     ` Alexei Starovoitov
  0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2020-06-09 17:09 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf; +Cc: netdev, kernel-team, Eric Dumazet

Jakub Sitnicki wrote:
> We can end up modifying the sockhash bucket list from two CPUs when a
> sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
> that is in the sockhash is unlinking itself from it on another CPU
> (sock_hash_delete_from_link).
> 
> This results in accessing a list element that is in an undefined state as
> reported by KASAN:
> 
> | ==================================================================
> | BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280
> | Write of size 8 at addr dead000000000122 by task kworker/2:1/95
> |
> | CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691
> | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> | Workqueue: events bpf_map_free_deferred
> | Call Trace:
> |  dump_stack+0x97/0xe0
> |  ? sock_hash_free+0x13c/0x280
> |  __kasan_report.cold+0x5/0x40
> |  ? mark_lock+0xbc1/0xc00
> |  ? sock_hash_free+0x13c/0x280
> |  kasan_report+0x38/0x50
> |  ? sock_hash_free+0x152/0x280
> |  sock_hash_free+0x13c/0x280
> |  bpf_map_free_deferred+0xb2/0xd0
> |  ? bpf_map_charge_finish+0x50/0x50
> |  ? rcu_read_lock_sched_held+0x81/0xb0
> |  ? rcu_read_lock_bh_held+0x90/0x90
> |  process_one_work+0x59a/0xac0
> |  ? lock_release+0x3b0/0x3b0
> |  ? pwq_dec_nr_in_flight+0x110/0x110
> |  ? rwlock_bug.part.0+0x60/0x60
> |  worker_thread+0x7a/0x680
> |  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> |  kthread+0x1cc/0x220
> |  ? process_one_work+0xac0/0xac0
> |  ? kthread_create_on_node+0xa0/0xa0
> |  ret_from_fork+0x24/0x30
> | ==================================================================
> 
> Fix it by reintroducing a spin-lock protected critical section around the
> code that removes the elements from the bucket on sockhash free.
> 
> To do that we also need to defer processing of the removed elements until
> we are out of atomic context, so that we can hold the sock lock while
> unlinking the socket from the map.
> 
> Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free")
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/core/sock_map.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)

Thanks.

Acked-by: John Fastabend <john.fastabend@gmail.com>


* Re: [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free
  2020-06-09 17:09   ` John Fastabend
@ 2020-06-09 18:04     ` Alexei Starovoitov
  2020-06-09 20:29       ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2020-06-09 18:04 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jakub Sitnicki, bpf, Network Development, kernel-team, Eric Dumazet

On Tue, Jun 9, 2020 at 10:51 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Jakub Sitnicki wrote:
> > We can end up modifying the sockhash bucket list from two CPUs when a
> > sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
> > that is in the sockhash is unlinking itself from it on another CPU
> > (sock_hash_delete_from_link).
> >
> > This results in accessing a list element that is in an undefined state as
> > reported by KASAN:
> >
> > | ==================================================================
> > | BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280
> > | Write of size 8 at addr dead000000000122 by task kworker/2:1/95
> > |
> > | CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691
> > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > | Workqueue: events bpf_map_free_deferred
> > | Call Trace:
> > |  dump_stack+0x97/0xe0
> > |  ? sock_hash_free+0x13c/0x280
> > |  __kasan_report.cold+0x5/0x40
> > |  ? mark_lock+0xbc1/0xc00
> > |  ? sock_hash_free+0x13c/0x280
> > |  kasan_report+0x38/0x50
> > |  ? sock_hash_free+0x152/0x280
> > |  sock_hash_free+0x13c/0x280
> > |  bpf_map_free_deferred+0xb2/0xd0
> > |  ? bpf_map_charge_finish+0x50/0x50
> > |  ? rcu_read_lock_sched_held+0x81/0xb0
> > |  ? rcu_read_lock_bh_held+0x90/0x90
> > |  process_one_work+0x59a/0xac0
> > |  ? lock_release+0x3b0/0x3b0
> > |  ? pwq_dec_nr_in_flight+0x110/0x110
> > |  ? rwlock_bug.part.0+0x60/0x60
> > |  worker_thread+0x7a/0x680
> > |  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> > |  kthread+0x1cc/0x220
> > |  ? process_one_work+0xac0/0xac0
> > |  ? kthread_create_on_node+0xa0/0xa0
> > |  ret_from_fork+0x24/0x30
> > | ==================================================================
> >
> > Fix it by reintroducing a spin-lock protected critical section around the
> > code that removes the elements from the bucket on sockhash free.
> >
> > To do that we also need to defer processing of the removed elements until
> > we are out of atomic context, so that we can hold the sock lock while
> > unlinking the socket from the map.
> >
> > Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free")
> > Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > ---
> >  net/core/sock_map.c | 23 +++++++++++++++++++++--
> >  1 file changed, 21 insertions(+), 2 deletions(-)
>
> Thanks.
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Applied both to bpf tree.

FYI I see this splat:
 ./test_sockmap
# 1/ 6  sockmap::txmsg test passthrough:OK
# 2/ 6  sockmap::txmsg test redirect:OK
# 3/ 6  sockmap::txmsg test drop:OK
# 4/ 6  sockmap::txmsg test ingress redirect:OK
[   19.180397]
[   19.180633] =============================
[   19.181042] WARNING: suspicious RCU usage
[   19.181517] 5.7.0-07177-g75e68e5bf2c7 #688 Not tainted
[   19.182048] -----------------------------
[   19.182570] include/linux/skmsg.h:284 suspicious rcu_dereference_check() usage!
[   19.183341]
[   19.183341] other info that might help us debug this:
[   19.183341]
[   19.184215]
[   19.184215] rcu_scheduler_active = 2, debug_locks = 1
[   19.184875] 1 lock held by test_sockmap/291:
[   19.185356]  #0: ffff8881315b5b20 (sk_lock-AF_INET){+.+.}-{0:0}, at: tls_sw_recvmsg+0x128/0x810
[   19.186315]
[   19.186315] stack backtrace:
[   19.186774] CPU: 0 PID: 291 Comm: test_sockmap Not tainted 5.7.0-07177-g75e68e5bf2c7 #688
[   19.188497] Call Trace:
[   19.188757]  dump_stack+0x71/0xa0
[   19.189140]  sk_psock_skb_redirect.isra.0+0xa6/0xf0
[   19.189651]  sk_psock_tls_strp_read+0x1cc/0x250
[   19.190142]  tls_sw_recvmsg+0x6e4/0x810
[   19.190584]  inet_recvmsg+0x55/0x1d0
[   19.190963]  ____sys_recvmsg+0x73/0x130
[   19.191365]  ? import_iovec+0x27/0xd0
[   19.191800]  ? copy_msghdr_from_user+0x4c/0x70
[   19.192271]  ___sys_recvmsg+0x68/0x90
[   19.192682]  ? __might_fault+0x36/0x80
[   19.193078]  ? __audit_syscall_exit+0x242/0x2b0
[   19.193576]  __sys_recvmsg+0x46/0x80
[   19.193964]  do_syscall_64+0x4a/0x1b0
[   19.194355]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[   19.194924] RIP: 0033:0x7fc915cc4490


* Re: [PATCH bpf 2/2] bpf, sockhash: Synchronize delete from bucket list on map free
  2020-06-09 18:04     ` Alexei Starovoitov
@ 2020-06-09 20:29       ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2020-06-09 20:29 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: Jakub Sitnicki, bpf, Network Development, kernel-team, Eric Dumazet

Alexei Starovoitov wrote:
> On Tue, Jun 9, 2020 at 10:51 AM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Jakub Sitnicki wrote:
> > > We can end up modifying the sockhash bucket list from two CPUs when a
> > > sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
> > > that is in the sockhash is unlinking itself from it on another CPU
> > > (sock_hash_delete_from_link).
> > >
> > > This results in accessing a list element that is in an undefined state as
> > > reported by KASAN:
> > >
> > > | ==================================================================
> > > | BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280
> > > | Write of size 8 at addr dead000000000122 by task kworker/2:1/95
> > > |
> > > | CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691
> > > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > > | Workqueue: events bpf_map_free_deferred
> > > | Call Trace:
> > > |  dump_stack+0x97/0xe0
> > > |  ? sock_hash_free+0x13c/0x280
> > > |  __kasan_report.cold+0x5/0x40
> > > |  ? mark_lock+0xbc1/0xc00
> > > |  ? sock_hash_free+0x13c/0x280
> > > |  kasan_report+0x38/0x50
> > > |  ? sock_hash_free+0x152/0x280
> > > |  sock_hash_free+0x13c/0x280
> > > |  bpf_map_free_deferred+0xb2/0xd0
> > > |  ? bpf_map_charge_finish+0x50/0x50
> > > |  ? rcu_read_lock_sched_held+0x81/0xb0
> > > |  ? rcu_read_lock_bh_held+0x90/0x90
> > > |  process_one_work+0x59a/0xac0
> > > |  ? lock_release+0x3b0/0x3b0
> > > |  ? pwq_dec_nr_in_flight+0x110/0x110
> > > |  ? rwlock_bug.part.0+0x60/0x60
> > > |  worker_thread+0x7a/0x680
> > > |  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> > > |  kthread+0x1cc/0x220
> > > |  ? process_one_work+0xac0/0xac0
> > > |  ? kthread_create_on_node+0xa0/0xa0
> > > |  ret_from_fork+0x24/0x30
> > > | ==================================================================
> > >
> > > Fix it by reintroducing a spin-lock protected critical section around the
> > > code that removes the elements from the bucket on sockhash free.
> > >
> > > To do that we also need to defer processing of the removed elements until
> > > we are out of atomic context, so that we can hold the sock lock while
> > > unlinking the socket from the map.
> > >
> > > Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free")
> > > Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > > ---
> > >  net/core/sock_map.c | 23 +++++++++++++++++++++--
> > >  1 file changed, 21 insertions(+), 2 deletions(-)
> >
> > Thanks.
> >
> > Acked-by: John Fastabend <john.fastabend@gmail.com>
> 
> Applied both to bpf tree.
> 
> FYI I see this splat:
>  ./test_sockmap
> # 1/ 6  sockmap::txmsg test passthrough:OK
> # 2/ 6  sockmap::txmsg test redirect:OK
> # 3/ 6  sockmap::txmsg test drop:OK
> # 4/ 6  sockmap::txmsg test ingress redirect:OK
> [   19.180397]
> [   19.180633] =============================
> [   19.181042] WARNING: suspicious RCU usage
> [   19.181517] 5.7.0-07177-g75e68e5bf2c7 #688 Not tainted
> [   19.182048] -----------------------------
> [   19.182570] include/linux/skmsg.h:284 suspicious rcu_dereference_check() usage!

I'll have a fix for this splat shortly. Thanks.

