bpf.vger.kernel.org archive mirror
* [PATCH bpf] bpf: Fix a race in reuseport_array_free()
@ 2019-09-27 16:52 Martin KaFai Lau
  2019-09-27 17:24 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Martin KaFai Lau @ 2019-09-27 16:52 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, kernel-team

In reuseport_array_free(), the rcu_read_lock() cannot ensure that sk is
still valid.  This is because bpf_sk_reuseport_detach() can be called
from __sk_destruct(), which is invoked through call_rcu(..., __sk_destruct),
i.e. after the grace period has already elapsed.

This patch takes the reuseport_lock in reuseport_array_free(), which
is not a fast path.  The lock is taken and released inside the loop in
case the bpf map is big.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/reuseport_array.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 50c083ba978c..9e593ac31ad7 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -103,29 +103,11 @@ static void reuseport_array_free(struct bpf_map *map)
 	 * array now. Hence, this function only races with
 	 * bpf_sk_reuseport_detach() which was triggerred by
 	 * close() or disconnect().
-	 *
-	 * This function and bpf_sk_reuseport_detach() are
-	 * both removing sk from "array".  Who removes it
-	 * first does not matter.
-	 *
-	 * The only concern here is bpf_sk_reuseport_detach()
-	 * may access "array" which is being freed here.
-	 * bpf_sk_reuseport_detach() access this "array"
-	 * through sk->sk_user_data _and_ with sk->sk_callback_lock
-	 * held which is enough because this "array" is not freed
-	 * until all sk->sk_user_data has stopped referencing this "array".
-	 *
-	 * Hence, due to the above, taking "reuseport_lock" is not
-	 * needed here.
 	 */
-
-	/*
-	 * Since reuseport_lock is not taken, sk is accessed under
-	 * rcu_read_lock()
-	 */
-	rcu_read_lock();
 	for (i = 0; i < map->max_entries; i++) {
-		sk = rcu_dereference(array->ptrs[i]);
+		spin_lock_bh(&reuseport_lock);
+		sk = rcu_dereference_protected(array->ptrs[i],
+					lockdep_is_held(&reuseport_lock));
 		if (sk) {
 			write_lock_bh(&sk->sk_callback_lock);
 			/*
@@ -137,8 +119,9 @@ static void reuseport_array_free(struct bpf_map *map)
 			write_unlock_bh(&sk->sk_callback_lock);
 			RCU_INIT_POINTER(array->ptrs[i], NULL);
 		}
+		spin_unlock_bh(&reuseport_lock);
+		cond_resched();
 	}
-	rcu_read_unlock();
 
 	/*
 	 * Once reaching here, all sk->sk_user_data is not
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf] bpf: Fix a race in reuseport_array_free()
  2019-09-27 16:52 [PATCH bpf] bpf: Fix a race in reuseport_array_free() Martin KaFai Lau
@ 2019-09-27 17:24 ` Eric Dumazet
  2019-09-27 18:17   ` Martin Lau
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2019-09-27 17:24 UTC (permalink / raw)
  To: Martin KaFai Lau, bpf, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, kernel-team



On 9/27/19 9:52 AM, Martin KaFai Lau wrote:
> In reuseport_array_free(), the rcu_read_lock() cannot ensure sk is still
> valid.  It is because bpf_sk_reuseport_detach() can be called from
> __sk_destruct() which is invoked through call_rcu(..., __sk_destruct).

We could question why reuseport_detach_sock(sk) is called from __sk_destruct()
(i.e. after the rcu grace period) instead of from sk_destruct()?

> 
> This patch takes the reuseport_lock in reuseport_array_free() which
> is not the fast path.  The lock is taken inside the loop in case
> that the bpf map is big.
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

Fixes: 5dc4c4b7d4e8 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")



* Re: [PATCH bpf] bpf: Fix a race in reuseport_array_free()
  2019-09-27 17:24 ` Eric Dumazet
@ 2019-09-27 18:17   ` Martin Lau
  2019-09-27 20:47     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Martin Lau @ 2019-09-27 18:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann, David Miller,
	Kernel Team

On Fri, Sep 27, 2019 at 10:24:49AM -0700, Eric Dumazet wrote:
> 
> 
> On 9/27/19 9:52 AM, Martin KaFai Lau wrote:
> > In reuseport_array_free(), the rcu_read_lock() cannot ensure sk is still
> > valid.  It is because bpf_sk_reuseport_detach() can be called from
> > __sk_destruct() which is invoked through call_rcu(..., __sk_destruct).
> 
> We could question why reuseport_detach_sock(sk) is called from __sk_destruct()
> (after the rcu grace period) instead of sk_destruct() ?
Agreed.  That is another way to fix it.

In this patch, I chose to avoid singling out reuseport_detach_sock() for
special treatment in sk_destruct().

I am happy either way.  What do you think?

> 
> > 
> > This patch takes the reuseport_lock in reuseport_array_free() which
> > is not the fast path.  The lock is taken inside the loop in case
> > that the bpf map is big.
> > 
> > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> 
> Fixes: 5dc4c4b7d4e8 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Ah...missed that.  Thanks!



* Re: [PATCH bpf] bpf: Fix a race in reuseport_array_free()
  2019-09-27 18:17   ` Martin Lau
@ 2019-09-27 20:47     ` Eric Dumazet
  2019-09-27 21:22       ` Martin Lau
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2019-09-27 20:47 UTC (permalink / raw)
  To: Martin Lau, Eric Dumazet
  Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann, David Miller,
	Kernel Team



On 9/27/19 11:17 AM, Martin Lau wrote:
> On Fri, Sep 27, 2019 at 10:24:49AM -0700, Eric Dumazet wrote:
>>
>>
>> On 9/27/19 9:52 AM, Martin KaFai Lau wrote:
>>> In reuseport_array_free(), the rcu_read_lock() cannot ensure sk is still
>>> valid.  It is because bpf_sk_reuseport_detach() can be called from
>>> __sk_destruct() which is invoked through call_rcu(..., __sk_destruct).
>>
>> We could question why reuseport_detach_sock(sk) is called from __sk_destruct()
>> (after the rcu grace period) instead of sk_destruct() ?
> Agree.  It is another way to fix it.
> 
> In this patch, I chose to avoid the need to single out a special treatment for
> reuseport_detach_sock() in sk_destruct().
> 
> I am happy either way.  What do you think?

It seems that, since we call reuseport_detach_sock() after the rcu grace
period, another cpu could catch the sk pointer in the reuse->socks[] array
and use it right up until our cpu frees the socket.

I think the RCU rules are not properly applied here.

The rules for deletion are:

1) unpublish object from various lists/arrays/hashes.
2) rcu_grace_period
3) free the object.

If we fix the unpublish (which we need to do anyway, to make the data
path safe), then your patch is not needed?

What about (totally untested, might be horribly wrong)

diff --git a/net/core/sock.c b/net/core/sock.c
index 07863edbe6fc4842e47ebebf00bc21bc406d9264..d31a4b094797f73ef89110c954aa0a164879362d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1700,8 +1700,6 @@ static void __sk_destruct(struct rcu_head *head)
                sk_filter_uncharge(sk, filter);
                RCU_INIT_POINTER(sk->sk_filter, NULL);
        }
-       if (rcu_access_pointer(sk->sk_reuseport_cb))
-               reuseport_detach_sock(sk);
 
        sock_disable_timestamp(sk, SK_FLAGS_TIMESTAMP);
 
@@ -1728,7 +1726,13 @@ static void __sk_destruct(struct rcu_head *head)
 
 void sk_destruct(struct sock *sk)
 {
-       if (sock_flag(sk, SOCK_RCU_FREE))
+       bool use_call_rcu = sock_flag(sk, SOCK_RCU_FREE);
+
+       if (rcu_access_pointer(sk->sk_reuseport_cb)) {
+               reuseport_detach_sock(sk);
+               use_call_rcu = true;
+       }
+       if (use_call_rcu)
                call_rcu(&sk->sk_rcu, __sk_destruct);
        else
                __sk_destruct(&sk->sk_rcu);


* Re: [PATCH bpf] bpf: Fix a race in reuseport_array_free()
  2019-09-27 20:47     ` Eric Dumazet
@ 2019-09-27 21:22       ` Martin Lau
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Lau @ 2019-09-27 21:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann, David Miller,
	Kernel Team

On Fri, Sep 27, 2019 at 01:47:32PM -0700, Eric Dumazet wrote:
> 
> 
> On 9/27/19 11:17 AM, Martin Lau wrote:
> > On Fri, Sep 27, 2019 at 10:24:49AM -0700, Eric Dumazet wrote:
> >>
> >>
> >> On 9/27/19 9:52 AM, Martin KaFai Lau wrote:
> >>> In reuseport_array_free(), the rcu_read_lock() cannot ensure sk is still
> >>> valid.  It is because bpf_sk_reuseport_detach() can be called from
> >>> __sk_destruct() which is invoked through call_rcu(..., __sk_destruct).
> >>
> >> We could question why reuseport_detach_sock(sk) is called from __sk_destruct()
> >> (after the rcu grace period) instead of sk_destruct() ?
> > Agree.  It is another way to fix it.
> > 
> > In this patch, I chose to avoid the need to single out a special treatment for
> > reuseport_detach_sock() in sk_destruct().
> > 
> > I am happy either way.  What do you think?
> 
> It seems that since we call reuseport_detach_sock() after the rcu grace period,
> another cpu could catch the sk pointer in reuse->socks[] array and use
> it right before our cpu frees the socket.
> 
> RCU rules are not properly applied here I think.
> 
> The rules for deletion are :
> 
> 1) unpublish object from various lists/arrays/hashes.
Thanks for the analysis.  Agreed.  Indeed, there is an issue:
reuse->socks[] is shared with other sockets, and they may pick up the
already-destructed sk from it.

> 2) rcu_grace_period
> 3) free the object.
> 
> If we fix the unpublish (we need to anyway to make the data path safe),
> then your patch is not needed ?
Correct, not needed.

> 
> What about (totally untested, might be horribly wrong)
I also had something similar in mind.  I will take a closer look and
re-spin a v2.

> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 07863edbe6fc4842e47ebebf00bc21bc406d9264..d31a4b094797f73ef89110c954aa0a164879362d 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1700,8 +1700,6 @@ static void __sk_destruct(struct rcu_head *head)
>                 sk_filter_uncharge(sk, filter);
>                 RCU_INIT_POINTER(sk->sk_filter, NULL);
>         }
> -       if (rcu_access_pointer(sk->sk_reuseport_cb))
> -               reuseport_detach_sock(sk);
>  
>         sock_disable_timestamp(sk, SK_FLAGS_TIMESTAMP);
>  
> @@ -1728,7 +1726,13 @@ static void __sk_destruct(struct rcu_head *head)
>  
>  void sk_destruct(struct sock *sk)
>  {
> -       if (sock_flag(sk, SOCK_RCU_FREE))
> +       bool use_call_rcu = sock_flag(sk, SOCK_RCU_FREE);
> +
> +       if (rcu_access_pointer(sk->sk_reuseport_cb)) {
> +               reuseport_detach_sock(sk);
> +               use_call_rcu = true;
> +       }
> +       if (use_call_rcu)
>                 call_rcu(&sk->sk_rcu, __sk_destruct);
>         else
>                 __sk_destruct(&sk->sk_rcu);

