netdev.vger.kernel.org archive mirror
* [bpf PATCH v2 0/6] BPF fixes for sockhash
@ 2018-06-14 16:44 John Fastabend
  2018-06-14 16:44 ` [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added John Fastabend
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

This addresses two syzbot issues. The first led to identifying (by Eric
and Wei) a class of bugs where we don't correctly check for IPv4/v6
sockets and their associated state. The second issue was a locking
error in sockhash.

The first two patches address handling IPv4/v6 sockets correctly and
then ensure that only sockets in ESTABLISHED state can be added. There
is then a follow-up fix (patch 4) for the other issue Eric noted,
namely that we depend on sockets calling tcp_close() to remove them
from the map. However, we missed that a socket can transition through
tcp_disconnect() and never call tcp_close(), missing our hook. To
resolve this, implement the unhash hook, which is also called from the
tcp_disconnect() flow.
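
For reference, the hook wiring in patch 4 boils down to the following
(condensed from the patch, details trimmed),

   psock->save_unhash = sk->sk_prot->unhash;    /* original handler */
   prot[SOCKMAP_BASE].unhash = bpf_tcp_unhash;  /* sockmap hook */

where bpf_tcp_unhash() drops the map/hash references (bpf_tcp_remove)
and then calls the saved handler, mirroring what bpf_tcp_close() does.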

The other issue syzbot found is that the tcp_close() handler omitted
taking the hash bucket lock, which could result in corrupting the
sockhash bucket list if delete and close ran at the same time. To
fix this we had to restructure the tcp_close() lock handling. This is
done in patch 3.

Finally, during review I noticed the release handler was omitted
from the upstream code (patch 5) due to an incorrect merge conflict
resolution when I ported the code to the latest bpf-next before
submitting. Patch 6 then fixes up the selftests for the above.

The tcp_disconnect() catch also appears to be missing in kTLS, so
a follow-up patch will need to address that as well.

v2: Added sock lock to the update paths in patch 2; Martin noticed this
during review. I was planning to do this in a follow-up patch, but
I agree it's a bit odd not to do it upfront, so it is incorporated
into 'bpf: sockmap only allow ESTABLISHED sock state'. In bpf-next we
may consider also taking the sock lock on delete/map_free, at which
point we could drop some usages of sk_callback_lock, but we need to
think a bit about the trade-offs of this.

---

John Fastabend (6):
      bpf: sockmap, fix crash when ipv6 sock is added
      bpf: sockmap only allow ESTABLISHED sock state
      bpf: sockhash fix omitted bucket lock in sock_close
      bpf: sockmap, tcp_disconnect to listen transition
      bpf: sockhash, add release routine
      bpf: selftest remove attempts to add LISTEN sockets to sockmap


 0 files changed


* [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
@ 2018-06-14 16:44 ` John Fastabend
  2018-06-14 23:53   ` Martin KaFai Lau
  2018-06-14 16:44 ` [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

This fixes a crash where we assign tcp_prot to IPv6 sockets instead
of tcpv6_prot.

Previously we overwrote the sk->sk_prot field with tcp_prot even in the
AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
are used. Further, only allow ESTABLISHED connections to join the
map, per the note in the TLS ULP:

   /* The TLS ulp is currently supported only for TCP sockets
    * in ESTABLISHED state.
    * Supporting sockets in LISTEN state will require us
    * to modify the accept implementation to clone rather then
    * share the ulp context.
    */

Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
crashing case here.

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Wei Wang <weiwan@google.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..f6dd4cd 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -140,6 +140,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
 			    int offset, size_t size, int flags);
+static void bpf_tcp_close(struct sock *sk, long timeout);
 
 static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
 {
@@ -161,7 +162,42 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
 	return !empty;
 }
 
-static struct proto tcp_bpf_proto;
+enum {
+	SOCKMAP_IPV4,
+	SOCKMAP_IPV6,
+	SOCKMAP_NUM_PROTS,
+};
+
+enum {
+	SOCKMAP_BASE,
+	SOCKMAP_TX,
+	SOCKMAP_NUM_CONFIGS,
+};
+
+static struct proto *saved_tcpv6_prot;
+static DEFINE_MUTEX(tcpv6_prot_mutex);
+static struct proto bpf_tcp_prots[SOCKMAP_NUM_PROTS][SOCKMAP_NUM_CONFIGS];
+static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
+			 struct proto *base)
+{
+	prot[SOCKMAP_BASE]			= *base;
+	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
+	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
+	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
+
+	prot[SOCKMAP_TX]			= prot[SOCKMAP_BASE];
+	prot[SOCKMAP_TX].sendmsg		= bpf_tcp_sendmsg;
+	prot[SOCKMAP_TX].sendpage		= bpf_tcp_sendpage;
+}
+
+static void update_sk_prot(struct sock *sk, struct smap_psock *psock)
+{
+	int family = sk->sk_family == AF_INET6 ? SOCKMAP_IPV6 : SOCKMAP_IPV4;
+	int conf = psock->bpf_tx_msg ? SOCKMAP_TX : SOCKMAP_BASE;
+
+	sk->sk_prot = &bpf_tcp_prots[family][conf];
+}
+
 static int bpf_tcp_init(struct sock *sk)
 {
 	struct smap_psock *psock;
@@ -181,14 +217,17 @@ static int bpf_tcp_init(struct sock *sk)
 	psock->save_close = sk->sk_prot->close;
 	psock->sk_proto = sk->sk_prot;
 
-	if (psock->bpf_tx_msg) {
-		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
-		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
-		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
-		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
+	/* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
+	if (sk->sk_family == AF_INET6 &&
+	    unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
+		mutex_lock(&tcpv6_prot_mutex);
+		if (likely(sk->sk_prot != saved_tcpv6_prot)) {
+			build_protos(bpf_tcp_prots[SOCKMAP_IPV6], sk->sk_prot);
+			smp_store_release(&saved_tcpv6_prot, sk->sk_prot);
+		}
+		mutex_unlock(&tcpv6_prot_mutex);
 	}
-
-	sk->sk_prot = &tcp_bpf_proto;
+	update_sk_prot(sk, psock);
 	rcu_read_unlock();
 	return 0;
 }
@@ -1111,8 +1150,7 @@ static void bpf_tcp_msg_add(struct smap_psock *psock,
 
 static int bpf_tcp_ulp_register(void)
 {
-	tcp_bpf_proto = tcp_prot;
-	tcp_bpf_proto.close = bpf_tcp_close;
+	build_protos(bpf_tcp_prots[SOCKMAP_IPV4], &tcp_prot);
 	/* Once BPF TX ULP is registered it is never unregistered. It
 	 * will be in the ULP list for the lifetime of the system. Doing
 	 * duplicate registers is not a problem.


* [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
  2018-06-14 16:44 ` [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added John Fastabend
@ 2018-06-14 16:44 ` John Fastabend
  2018-06-15  0:18   ` Martin KaFai Lau
  2018-06-14 16:44 ` [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close John Fastabend
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

Per the note in the TLS ULP (which is actually a generic statement
regarding ULPs)

 /* The TLS ulp is currently supported only for TCP sockets
  * in ESTABLISHED state.
  * Supporting sockets in LISTEN state will require us
  * to modify the accept implementation to clone rather then
  * share the ulp context.
  */

After this patch we only allow socks that are in ESTABLISHED state or
are being added via a sock_ops event that is transitioning into an
ESTABLISHED state. By allowing sock_ops events we allow users to
manage sockmaps directly from sock ops programs. The two supported
sock_ops ops are BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB and
BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB.
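
As an illustration only (not part of this patch; the map definition,
key choice and build setup follow the samples/selftests conventions
and are made up for the example), a sock_ops program managing a
sockmap from these two callbacks could look roughly like,

   #include <linux/bpf.h>
   #include "bpf_helpers.h"

   struct bpf_map_def SEC("maps") sock_map = {
           .type        = BPF_MAP_TYPE_SOCKMAP,
           .key_size    = sizeof(int),
           .value_size  = sizeof(int),
           .max_entries = 20,
   };

   SEC("sockops")
   int bpf_sockmap(struct bpf_sock_ops *skops)
   {
           int op = (int)skops->op;
           int key = 0;

           /* only sockets transitioning into ESTABLISHED are added */
           if (op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB ||
               op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
                   bpf_sock_map_update(skops, &sock_map, &key, BPF_NOEXIST);
           return 0;
   }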

From the userspace BPF update API the sock lock is also taken now
to ensure we don't race with state changes after the ESTABLISHED
check. The BPF program sock ops hook already has the sock lock
taken.
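
On the user-space side the update path is the usual
bpf_map_update_elem() call with a TCP socket fd as the value, e.g.
(rough sketch, error handling omitted),

   int key = 0;

   /* sock_fd must be a connect()ed/accept()ed TCP socket; a listening
    * socket now fails with -EOPNOTSUPP.
    */
   err = bpf_map_update_elem(map_fd, &key, &sock_fd, BPF_ANY);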

Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index f6dd4cd..f1ab52d 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -1976,13 +1976,20 @@ static int sock_map_update_elem(struct bpf_map *map,
 		return -EINVAL;
 	}
 
+	lock_sock(skops.sk);
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state.
+	 */
 	if (skops.sk->sk_type != SOCK_STREAM ||
-	    skops.sk->sk_protocol != IPPROTO_TCP) {
-		fput(socket->file);
-		return -EOPNOTSUPP;
+	    skops.sk->sk_protocol != IPPROTO_TCP ||
+	    skops.sk->sk_state != TCP_ESTABLISHED) {
+		err = -EOPNOTSUPP;
+		goto out;
 	}
 
 	err = sock_map_ctx_update_elem(&skops, map, key, flags);
+out:
+	release_sock(skops.sk);
 	fput(socket->file);
 	return err;
 }
@@ -2247,10 +2254,6 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 
 	sock = skops->sk;
 
-	if (sock->sk_type != SOCK_STREAM ||
-	    sock->sk_protocol != IPPROTO_TCP)
-		return -EOPNOTSUPP;
-
 	if (unlikely(map_flags > BPF_EXIST))
 		return -EINVAL;
 
@@ -2338,7 +2341,20 @@ static int sock_hash_update_elem(struct bpf_map *map,
 		return -EINVAL;
 	}
 
+	lock_sock(skops.sk);
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state.
+	 */
+	if (skops.sk->sk_type != SOCK_STREAM ||
+	    skops.sk->sk_protocol != IPPROTO_TCP ||
+	    skops.sk->sk_state != TCP_ESTABLISHED) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
 	err = sock_hash_ctx_update_elem(&skops, map, key, flags);
+out:
+	release_sock(skops.sk);
 	fput(socket->file);
 	return err;
 }
@@ -2423,10 +2439,19 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
 	.map_delete_elem = sock_hash_delete_elem,
 };
 
+static bool bpf_is_valid_sock(struct bpf_sock_ops_kern *ops)
+{
+	return ops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB ||
+	       ops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB;
+}
+
 BPF_CALL_4(bpf_sock_map_update, struct bpf_sock_ops_kern *, bpf_sock,
 	   struct bpf_map *, map, void *, key, u64, flags)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	if (!bpf_is_valid_sock(bpf_sock))
+		return -EOPNOTSUPP;
 	return sock_map_ctx_update_elem(bpf_sock, map, key, flags);
 }
 
@@ -2445,6 +2470,9 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
 	   struct bpf_map *, map, void *, key, u64, flags)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	if (!bpf_is_valid_sock(bpf_sock))
+		return -EOPNOTSUPP;
 	return sock_hash_ctx_update_elem(bpf_sock, map, key, flags);
 }
 


* [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
  2018-06-14 16:44 ` [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added John Fastabend
  2018-06-14 16:44 ` [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
@ 2018-06-14 16:44 ` John Fastabend
  2018-06-15  5:41   ` Martin KaFai Lau
  2018-06-14 16:45 ` [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition John Fastabend
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

First, in tcp_close(), reduce the scope of sk_callback_lock: the lock
is only needed for protecting smap_release_sock(); the ingress and cork
lists are protected by the sock lock. Having the lock in a wider scope
is harmless, but it may confuse the reader, who may infer it is in fact
needed.

Next, in sock_hash_delete_elem() the pattern is as follows,

  sock_hash_delete_elem()
     [...]
     spin_lock(bucket_lock)
     l = lookup_elem_raw()
     if (l)
        hlist_del_rcu()
        write_lock(sk_callback_lock)
         .... destroy psock ...
        write_unlock(sk_callback_lock)
     spin_unlock(bucket_lock)

The ordering is necessary because we only know the {p}sock after
dereferencing the hash table, which we can't do unless we have the
bucket lock held. Once we have the bucket lock and the psock element,
the element is deleted from the hashmap to ensure any other path doing
a lookup will fail. Finally, the refcnt is decremented and, if zero,
the psock is destroyed.

In parallel with the above (or with freeing the map) a tcp close event
may trigger tcp_close(), which at the moment omits the bucket lock
altogether (oops!). The flow looks like this,

  bpf_tcp_close()
     [...]
     write_lock(sk_callback_lock)
     for each psock->maps // list of maps this sock is part of
         hlist_del_rcu(ref_hash_node);
         .... destroy psock ...
     write_unlock(sk_callback_lock)

Obviously, and as demonstrated by syzbot, this is broken because
we can have multiple threads deleting entries via hlist_del_rcu().

To fix this we might be tempted to wrap the hlist operation in a
bucket lock, but that would create a lock inversion problem. In
summary, to follow the locking rules the maps list needs the
sk_callback_lock, but we need the bucket lock to do the hlist_del_rcu.
To resolve the lock inversion problem, note that when bpf_tcp_close()
is called no updates can happen in parallel, due to the ESTABLISHED
state check in the update logic, so pop the head of the list repeatedly
and remove the reference until no more are left. If a delete happens in
parallel from the BPF API that is OK as well, because it will do a
similar action: look up the sock in the map/hash, delete it from the
map/hash, and decrement the refcnt. We check for this case before doing
a destroy on the psock to ensure we don't have two threads tearing down
the same psock. The new logic is as follows,

  bpf_tcp_close()
  e = psock_map_pop(psock->maps) // done with sk_callback_lock
  bucket_lock() // lock hash list bucket
  l = lookup_elem_raw(head, hash, key, key_size);
  if (l) {
     //only get here if elmnt was not already removed
     hlist_del_rcu()
     ... destroy psock...
  }
  bucket_unlock()

And finally, for all the above to work, add the missing
sk_callback_lock around smap_list_remove() in
sock_hash_ctx_update_elem(). Otherwise delete and update may corrupt
the maps list.

(As an aside, the sk_callback_lock serves two purposes. The
 first is to protect updates to the sock callbacks sk_data_ready,
 sk_write_space, etc. The second is to protect the psock 'maps' list.
 The 'maps' list is used (as shown above) to delete all map/hash
 references to a sock when the sock is closed.)

(If we did not have the ESTABLISHED state guarantee from tcp_close(),
 then we could not ensure completion, because updates could happen
 forever and pin the thread in the delete loop.)

Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index f1ab52d..04764f5 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -258,16 +258,54 @@ static void bpf_tcp_release(struct sock *sk)
 	rcu_read_unlock();
 }
 
+static struct htab_elem *lookup_elem_raw(struct hlist_head *head,
+					 u32 hash, void *key, u32 key_size)
+{
+	struct htab_elem *l;
+
+	hlist_for_each_entry_rcu(l, head, hash_node) {
+		if (l->hash == hash && !memcmp(&l->key, key, key_size))
+			return l;
+	}
+
+	return NULL;
+}
+
+static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
+{
+	return &htab->buckets[hash & (htab->n_buckets - 1)];
+}
+
+static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
+{
+	return &__select_bucket(htab, hash)->head;
+}
+
 static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 {
 	atomic_dec(&htab->count);
 	kfree_rcu(l, rcu);
 }
 
+struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
+					   struct smap_psock *psock)
+{
+	struct smap_psock_map_entry *e;
+
+	write_lock_bh(&sk->sk_callback_lock);
+	e = list_first_entry_or_null(&psock->maps,
+				     struct smap_psock_map_entry,
+				     list);
+	if (e)
+		list_del(&e->list);
+	write_unlock_bh(&sk->sk_callback_lock);
+	return e;
+}
+
 static void bpf_tcp_close(struct sock *sk, long timeout)
 {
 	void (*close_fun)(struct sock *sk, long timeout);
-	struct smap_psock_map_entry *e, *tmp;
+	struct smap_psock_map_entry *e;
 	struct sk_msg_buff *md, *mtmp;
 	struct smap_psock *psock;
 	struct sock *osk;
@@ -286,7 +324,6 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 	 */
 	close_fun = psock->save_close;
 
-	write_lock_bh(&sk->sk_callback_lock);
 	if (psock->cork) {
 		free_start_sg(psock->sock, psock->cork);
 		kfree(psock->cork);
@@ -299,20 +336,48 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 		kfree(md);
 	}
 
-	list_for_each_entry_safe(e, tmp, &psock->maps, list) {
+	/* Sock is in TCP_CLOSE state so any concurrent adds or updates will be
+	 * blocked by ESTABLISHED check. However, tcp_close() + delete + free
+	 * can all run at the same time. If a tcp_close + delete happens each
+	 * code path will remove the entry for the map/hash before deleting it.
+	 * In the map case a xchg and then check to verify we have a sk protects
+	 * two paths from tearing down the same object. For hash map we lock the
+	 * bucket and remove the object from the hash map before destroying to
+	 * ensure that only one reference exists. By pulling object off the head
+	 * of the list with (with sk_callback_lock) if multiple deleters are
+	 * running we avoid duplicate references.
+	 */
+	e = psock_map_pop(sk, psock);
+	while (e) {
 		if (e->entry) {
 			osk = cmpxchg(e->entry, sk, NULL);
 			if (osk == sk) {
-				list_del(&e->list);
 				smap_release_sock(psock, sk);
 			}
 		} else {
-			hlist_del_rcu(&e->hash_link->hash_node);
-			smap_release_sock(psock, e->hash_link->sk);
-			free_htab_elem(e->htab, e->hash_link);
+			struct htab_elem *link = e->hash_link;
+			struct hlist_head *head;
+			struct htab_elem *l;
+			struct bucket *b;
+
+			b = __select_bucket(e->htab, link->hash);
+			head = &b->head;
+			raw_spin_lock_bh(&b->lock);
+			l = lookup_elem_raw(head,
+					    link->hash, link->key,
+					    e->htab->elem_size);
+			/* If another thread deleted this object skip deletion.
+			 * The refcnt on psock may or may not be zero.
+			 */
+			if (l) {
+				hlist_del_rcu(&e->hash_link->hash_node);
+				smap_release_sock(psock, e->hash_link->sk);
+				free_htab_elem(e->htab, e->hash_link);
+			}
+			raw_spin_unlock_bh(&b->lock);
 		}
+		e = psock_map_pop(sk, psock);
 	}
-	write_unlock_bh(&sk->sk_callback_lock);
 	rcu_read_unlock();
 	close_fun(sk, timeout);
 }
@@ -2088,16 +2153,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	return ERR_PTR(err);
 }
 
-static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
-{
-	return &htab->buckets[hash & (htab->n_buckets - 1)];
-}
-
-static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
-{
-	return &__select_bucket(htab, hash)->head;
-}
-
 static void sock_hash_free(struct bpf_map *map)
 {
 	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
@@ -2114,10 +2169,13 @@ static void sock_hash_free(struct bpf_map *map)
 	 */
 	rcu_read_lock();
 	for (i = 0; i < htab->n_buckets; i++) {
-		struct hlist_head *head = select_bucket(htab, i);
+		struct bucket *b = __select_bucket(htab, i);
+		struct hlist_head *head;
 		struct hlist_node *n;
 		struct htab_elem *l;
 
+		raw_spin_lock_bh(&b->lock);
+		head = &b->head;
 		hlist_for_each_entry_safe(l, n, head, hash_node) {
 			struct sock *sock = l->sk;
 			struct smap_psock *psock;
@@ -2137,6 +2195,7 @@ static void sock_hash_free(struct bpf_map *map)
 			write_unlock_bh(&sock->sk_callback_lock);
 			kfree(l);
 		}
+		raw_spin_unlock_bh(&b->lock);
 	}
 	rcu_read_unlock();
 	bpf_map_area_free(htab->buckets);
@@ -2167,19 +2226,6 @@ static struct htab_elem *alloc_sock_hash_elem(struct bpf_htab *htab,
 	return l_new;
 }
 
-static struct htab_elem *lookup_elem_raw(struct hlist_head *head,
-					 u32 hash, void *key, u32 key_size)
-{
-	struct htab_elem *l;
-
-	hlist_for_each_entry_rcu(l, head, hash_node) {
-		if (l->hash == hash && !memcmp(&l->key, key, key_size))
-			return l;
-	}
-
-	return NULL;
-}
-
 static inline u32 htab_map_hash(const void *key, u32 key_len)
 {
 	return jhash(key, key_len, 0);
@@ -2307,8 +2353,10 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 		psock = smap_psock_sk(l_old->sk);
 
 		hlist_del_rcu(&l_old->hash_node);
+		write_lock_bh(&l_old->sk->sk_callback_lock);
 		smap_list_remove(psock, NULL, l_old);
 		smap_release_sock(psock, l_old->sk);
+		write_unlock_bh(&l_old->sk->sk_callback_lock);
 		free_htab_elem(htab, l_old);
 	}
 	raw_spin_unlock_bh(&b->lock);


* [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
                   ` (2 preceding siblings ...)
  2018-06-14 16:44 ` [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close John Fastabend
@ 2018-06-14 16:45 ` John Fastabend
  2018-06-15  6:04   ` Martin KaFai Lau
  2018-06-14 16:45 ` [bpf PATCH v2 5/6] bpf: sockhash, add release routine John Fastabend
  2018-06-14 16:45 ` [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap John Fastabend
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:45 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

After adding checks to ensure TCP is in ESTABLISHED state when a
sock is added, we also need to ensure that the user does not transition
through tcp_disconnect() and back into ESTABLISHED state without
sockmap removing the sock.

To do this, add an unhash hook and remove the sock from the map there.

Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 04764f5..ffc5152 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -130,6 +130,7 @@ struct smap_psock {
 
 	struct proto *sk_proto;
 	void (*save_close)(struct sock *sk, long timeout);
+	void (*save_unhash)(struct sock *sk);
 	void (*save_data_ready)(struct sock *sk);
 	void (*save_write_space)(struct sock *sk);
 };
@@ -141,6 +142,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
 			    int offset, size_t size, int flags);
 static void bpf_tcp_close(struct sock *sk, long timeout);
+static void bpf_tcp_unhash(struct sock *sk);
 
 static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
 {
@@ -182,6 +184,7 @@ static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
 {
 	prot[SOCKMAP_BASE]			= *base;
 	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
+	prot[SOCKMAP_BASE].unhash		= bpf_tcp_unhash;
 	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
 	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
 
@@ -215,6 +218,7 @@ static int bpf_tcp_init(struct sock *sk)
 	}
 
 	psock->save_close = sk->sk_prot->close;
+	psock->save_unhash = sk->sk_prot->unhash;
 	psock->sk_proto = sk->sk_prot;
 
 	/* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
@@ -302,28 +306,12 @@ struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
 	return e;
 }
 
-static void bpf_tcp_close(struct sock *sk, long timeout)
+static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
 {
-	void (*close_fun)(struct sock *sk, long timeout);
 	struct smap_psock_map_entry *e;
 	struct sk_msg_buff *md, *mtmp;
-	struct smap_psock *psock;
 	struct sock *osk;
 
-	rcu_read_lock();
-	psock = smap_psock_sk(sk);
-	if (unlikely(!psock)) {
-		rcu_read_unlock();
-		return sk->sk_prot->close(sk, timeout);
-	}
-
-	/* The psock may be destroyed anytime after exiting the RCU critial
-	 * section so by the time we use close_fun the psock may no longer
-	 * be valid. However, bpf_tcp_close is called with the sock lock
-	 * held so the close hook and sk are still valid.
-	 */
-	close_fun = psock->save_close;
-
 	if (psock->cork) {
 		free_start_sg(psock->sock, psock->cork);
 		kfree(psock->cork);
@@ -378,6 +366,51 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 		}
 		e = psock_map_pop(sk, psock);
 	}
+}
+
+static void bpf_tcp_unhash(struct sock *sk)
+{
+	void (*unhash_fun)(struct sock *sk);
+	struct smap_psock *psock;
+
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		return sk->sk_prot->unhash(sk);
+	}
+
+	/* The psock may be destroyed anytime after exiting the RCU critial
+	 * section so by the time we use close_fun the psock may no longer
+	 * be valid. However, bpf_tcp_close is called with the sock lock
+	 * held so the close hook and sk are still valid.
+	 */
+	unhash_fun = psock->save_unhash;
+	bpf_tcp_remove(sk, psock);
+	rcu_read_unlock();
+	unhash_fun(sk);
+
+}
+
+static void bpf_tcp_close(struct sock *sk, long timeout)
+{
+	void (*close_fun)(struct sock *sk, long timeout);
+	struct smap_psock *psock;
+
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		return sk->sk_prot->close(sk, timeout);
+	}
+
+	/* The psock may be destroyed anytime after exiting the RCU critial
+	 * section so by the time we use close_fun the psock may no longer
+	 * be valid. However, bpf_tcp_close is called with the sock lock
+	 * held so the close hook and sk are still valid.
+	 */
+	close_fun = psock->save_close;
+	bpf_tcp_remove(sk, psock);
 	rcu_read_unlock();
 	close_fun(sk, timeout);
 }


* [bpf PATCH v2 5/6] bpf: sockhash, add release routine
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
                   ` (3 preceding siblings ...)
  2018-06-14 16:45 ` [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition John Fastabend
@ 2018-06-14 16:45 ` John Fastabend
  2018-06-15  6:05   ` Martin KaFai Lau
  2018-06-14 16:45 ` [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap John Fastabend
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:45 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

Add the map_release_uref pointer to the hashmap ops. This was dropped
when the original sockhash code was ported into bpf-next before the
initial commit.

Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index ffc5152..77fe204 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -2518,6 +2518,7 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
 	.map_get_next_key = sock_hash_get_next_key,
 	.map_update_elem = sock_hash_update_elem,
 	.map_delete_elem = sock_hash_delete_elem,
+	.map_release_uref = sock_map_release,
 };
 
 static bool bpf_is_valid_sock(struct bpf_sock_ops_kern *ops)


* [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap
  2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
                   ` (4 preceding siblings ...)
  2018-06-14 16:45 ` [bpf PATCH v2 5/6] bpf: sockhash, add release routine John Fastabend
@ 2018-06-14 16:45 ` John Fastabend
  2018-06-15  6:07   ` Martin KaFai Lau
  5 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-14 16:45 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

In selftest test_maps the sockmap test case attempts to add a socket
in the listening state to the sockmap. This is no longer a valid
operation, so it fails as expected. However, the test now wrongly
reports this as an error. Fix the test to avoid adding sockets in the
listening state.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 6c25334..9fed5f0 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -564,7 +564,7 @@ static void test_sockmap(int tasks, void *data)
 	}
 
 	/* Test update without programs */
-	for (i = 0; i < 6; i++) {
+	for (i = 2; i < 6; i++) {
 		err = bpf_map_update_elem(fd, &i, &sfd[i], BPF_ANY);
 		if (err) {
 			printf("Failed noprog update sockmap '%i:%i'\n",
@@ -727,7 +727,7 @@ static void test_sockmap(int tasks, void *data)
 	}
 
 	/* Test map update elem afterwards fd lives in fd and map_fd */
-	for (i = 0; i < 6; i++) {
+	for (i = 2; i < 6; i++) {
 		err = bpf_map_update_elem(map_fd_rx, &i, &sfd[i], BPF_ANY);
 		if (err) {
 			printf("Failed map_fd_rx update sockmap %i '%i:%i'\n",


* Re: [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added
  2018-06-14 16:44 ` [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added John Fastabend
@ 2018-06-14 23:53   ` Martin KaFai Lau
  2018-06-15  4:46     ` John Fastabend
  0 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-14 23:53 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:44:46AM -0700, John Fastabend wrote:
> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
> of tcpv6_prot.
> 
> Previously we overwrote the sk->prot field with tcp_prot even in the
> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
> are used.

> Further, only allow ESTABLISHED connections to join the
> map per note in TLS ULP,
> 
>    /* The TLS ulp is currently supported only for TCP sockets
>     * in ESTABLISHED state.
>     * Supporting sockets in LISTEN state will require us
>     * to modify the accept implementation to clone rather then
>     * share the ulp context.
>     */
This bit has been moved to patch 2.

> 
> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
> crashing case here.
> 
> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Wei Wang <weiwan@google.com>
> ---
>  0 files changed
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index 52a91d8..f6dd4cd 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -140,6 +140,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>  static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
>  static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
>  			    int offset, size_t size, int flags);
> +static void bpf_tcp_close(struct sock *sk, long timeout);
>  
>  static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
>  {
> @@ -161,7 +162,42 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
>  	return !empty;
>  }
>  
> -static struct proto tcp_bpf_proto;
> +enum {
> +	SOCKMAP_IPV4,
> +	SOCKMAP_IPV6,
> +	SOCKMAP_NUM_PROTS,
> +};
> +
> +enum {
> +	SOCKMAP_BASE,
> +	SOCKMAP_TX,
> +	SOCKMAP_NUM_CONFIGS,
> +};
> +
> +static struct proto *saved_tcpv6_prot;
__read_mostly

> +static DEFINE_MUTEX(tcpv6_prot_mutex);
> +static struct proto bpf_tcp_prots[SOCKMAP_NUM_PROTS][SOCKMAP_NUM_CONFIGS];
> +static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
> +			 struct proto *base)
> +{
> +	prot[SOCKMAP_BASE]			= *base;
> +	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
> +	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
> +	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
> +
> +	prot[SOCKMAP_TX]			= prot[SOCKMAP_BASE];
> +	prot[SOCKMAP_TX].sendmsg		= bpf_tcp_sendmsg;
> +	prot[SOCKMAP_TX].sendpage		= bpf_tcp_sendpage;
> +}
> +
> +static void update_sk_prot(struct sock *sk, struct smap_psock *psock)
> +{
> +	int family = sk->sk_family == AF_INET6 ? SOCKMAP_IPV6 : SOCKMAP_IPV4;
> +	int conf = psock->bpf_tx_msg ? SOCKMAP_TX : SOCKMAP_BASE;
> +
> +	sk->sk_prot = &bpf_tcp_prots[family][conf];
> +}
> +
>  static int bpf_tcp_init(struct sock *sk)
>  {
>  	struct smap_psock *psock;
> @@ -181,14 +217,17 @@ static int bpf_tcp_init(struct sock *sk)
>  	psock->save_close = sk->sk_prot->close;
>  	psock->sk_proto = sk->sk_prot;
>  
> -	if (psock->bpf_tx_msg) {
> -		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
> -		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
> -		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
> -		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
> +	/* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
> +	if (sk->sk_family == AF_INET6 &&
> +	    unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
> +		mutex_lock(&tcpv6_prot_mutex);
bpf_tcp_init() can be called by skops?
Can mutex_lock() be used here?

> +		if (likely(sk->sk_prot != saved_tcpv6_prot)) {
> +			build_protos(bpf_tcp_prots[SOCKMAP_IPV6], sk->sk_prot);
> +			smp_store_release(&saved_tcpv6_prot, sk->sk_prot);
> +		}
> +		mutex_unlock(&tcpv6_prot_mutex);
>  	}
> -
> -	sk->sk_prot = &tcp_bpf_proto;
> +	update_sk_prot(sk, psock);
>  	rcu_read_unlock();
>  	return 0;
>  }
> @@ -1111,8 +1150,7 @@ static void bpf_tcp_msg_add(struct smap_psock *psock,
>  
>  static int bpf_tcp_ulp_register(void)
>  {
> -	tcp_bpf_proto = tcp_prot;
> -	tcp_bpf_proto.close = bpf_tcp_close;
> +	build_protos(bpf_tcp_prots[SOCKMAP_IPV4], &tcp_prot);
>  	/* Once BPF TX ULP is registered it is never unregistered. It
>  	 * will be in the ULP list for the lifetime of the system. Doing
>  	 * duplicate registers is not a problem.
> 


* Re: [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state
  2018-06-14 16:44 ` [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
@ 2018-06-15  0:18   ` Martin KaFai Lau
  2018-06-18 14:50     ` John Fastabend
  0 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15  0:18 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:44:52AM -0700, John Fastabend wrote:
> Per the note in the TLS ULP (which is actually a generic statement
> regarding ULPs)
> 
>  /* The TLS ulp is currently supported only for TCP sockets
>   * in ESTABLISHED state.
>   * Supporting sockets in LISTEN state will require us
>   * to modify the accept implementation to clone rather then
>   * share the ulp context.
>   */
Can you explain how that applies to the bpf_tcp ulp?

My understanding is the "ulp context" referred to in the TLS ulp is
the tls_context stored in icsk_ulp_data, but I don't see bpf_tcp's
ulp using icsk_ulp_data.

Others LGTM.

> 
> After this patch we only allow socks that are in ESTABLISHED state or
> are being added via a sock_ops event that is transitioning into an
> ESTABLISHED state. By allowing sock_ops events we allow users to
> manage sockmaps directly from sock ops programs. The two supported
> sock_ops ops are BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB and
> BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB.
> 
> From the userspace BPF update API the sock lock is also taken now
> to ensure we don't race with state changes after the ESTABLISHED
> check. The BPF program sock ops hook already has the sock lock
> taken.
> 
> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
> 'netperf -H [IPv4]'.
> 
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  0 files changed
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index f6dd4cd..f1ab52d 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -1976,13 +1976,20 @@ static int sock_map_update_elem(struct bpf_map *map,
>  		return -EINVAL;
>  	}
>  
> +	lock_sock(skops.sk);
> +	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
> +	 * state.
> +	 */
>  	if (skops.sk->sk_type != SOCK_STREAM ||
> -	    skops.sk->sk_protocol != IPPROTO_TCP) {
> -		fput(socket->file);
> -		return -EOPNOTSUPP;
> +	    skops.sk->sk_protocol != IPPROTO_TCP ||
> +	    skops.sk->sk_state != TCP_ESTABLISHED) {
> +		err = -EOPNOTSUPP;
> +		goto out;
>  	}
>  
>  	err = sock_map_ctx_update_elem(&skops, map, key, flags);
> +out:
> +	release_sock(skops.sk);
>  	fput(socket->file);
>  	return err;
>  }
> @@ -2247,10 +2254,6 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
>  
>  	sock = skops->sk;
>  
> -	if (sock->sk_type != SOCK_STREAM ||
> -	    sock->sk_protocol != IPPROTO_TCP)
> -		return -EOPNOTSUPP;
> -
>  	if (unlikely(map_flags > BPF_EXIST))
>  		return -EINVAL;
>  
> @@ -2338,7 +2341,20 @@ static int sock_hash_update_elem(struct bpf_map *map,
>  		return -EINVAL;
>  	}
>  
> +	lock_sock(skops.sk);
> +	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
> +	 * state.
> +	 */
> +	if (skops.sk->sk_type != SOCK_STREAM ||
> +	    skops.sk->sk_protocol != IPPROTO_TCP ||
> +	    skops.sk->sk_state != TCP_ESTABLISHED) {
> +		err = -EOPNOTSUPP;
> +		goto out;
> +	}
> +
>  	err = sock_hash_ctx_update_elem(&skops, map, key, flags);
> +out:
> +	release_sock(skops.sk);
>  	fput(socket->file);
>  	return err;
>  }
> @@ -2423,10 +2439,19 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
>  	.map_delete_elem = sock_hash_delete_elem,
>  };
>  
> +static bool bpf_is_valid_sock(struct bpf_sock_ops_kern *ops)
> +{
> +	return ops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB ||
> +	       ops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB;
> +}
> +
>  BPF_CALL_4(bpf_sock_map_update, struct bpf_sock_ops_kern *, bpf_sock,
>  	   struct bpf_map *, map, void *, key, u64, flags)
>  {
>  	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	if (!bpf_is_valid_sock(bpf_sock))
> +		return -EOPNOTSUPP;
>  	return sock_map_ctx_update_elem(bpf_sock, map, key, flags);
>  }
>  
> @@ -2445,6 +2470,9 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
>  	   struct bpf_map *, map, void *, key, u64, flags)
>  {
>  	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	if (!bpf_is_valid_sock(bpf_sock))
> +		return -EOPNOTSUPP;
>  	return sock_hash_ctx_update_elem(bpf_sock, map, key, flags);
>  }
>  
> 


* Re: [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added
  2018-06-14 23:53   ` Martin KaFai Lau
@ 2018-06-15  4:46     ` John Fastabend
  0 siblings, 0 replies; 19+ messages in thread
From: John Fastabend @ 2018-06-15  4:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: ast, daniel, netdev

On 06/14/2018 04:53 PM, Martin KaFai Lau wrote:
> On Thu, Jun 14, 2018 at 09:44:46AM -0700, John Fastabend wrote:
>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>> of tcpv6_prot.
>>
>> Previously we overwrote the sk->prot field with tcp_prot even in the
>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
>> are used.
> 
>> Further, only allow ESTABLISHED connections to join the
>> map per note in TLS ULP,
>>
>>    /* The TLS ulp is currently supported only for TCP sockets
>>     * in ESTABLISHED state.
>>     * Supporting sockets in LISTEN state will require us
>>     * to modify the accept implementation to clone rather then
>>     * share the ulp context.
>>     */
> This bit has been moved to patch 2.

Yep better cut the comment as well.

> 
>>
>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
>> crashing case here.
>>
>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Wei Wang <weiwan@google.com>
>> ---
>>  0 files changed
>>

Regarding the "0 files changed" diffstat, will fix that as well.

>> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
>> index 52a91d8..f6dd4cd 100644
>> --- a/kernel/bpf/sockmap.c
>> +++ b/kernel/bpf/sockmap.c
>> @@ -140,6 +140,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>>  static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
>>  static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
>>  			    int offset, size_t size, int flags);
>> +static void bpf_tcp_close(struct sock *sk, long timeout);
>>  
>>  static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
>>  {
>> @@ -161,7 +162,42 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
>>  	return !empty;
>>  }
>>  
>> -static struct proto tcp_bpf_proto;
>> +enum {
>> +	SOCKMAP_IPV4,
>> +	SOCKMAP_IPV6,
>> +	SOCKMAP_NUM_PROTS,
>> +};
>> +
>> +enum {
>> +	SOCKMAP_BASE,
>> +	SOCKMAP_TX,
>> +	SOCKMAP_NUM_CONFIGS,
>> +};
>> +
>> +static struct proto *saved_tcpv6_prot;
> __read_mostly
> 

Sure makes sense.

>> +static DEFINE_MUTEX(tcpv6_prot_mutex);
>> +static struct proto bpf_tcp_prots[SOCKMAP_NUM_PROTS][SOCKMAP_NUM_CONFIGS];
>> +static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
>> +			 struct proto *base)
>> +{
>> +	prot[SOCKMAP_BASE]			= *base;
>> +	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
>> +	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
>> +	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
>> +
>> +	prot[SOCKMAP_TX]			= prot[SOCKMAP_BASE];
>> +	prot[SOCKMAP_TX].sendmsg		= bpf_tcp_sendmsg;
>> +	prot[SOCKMAP_TX].sendpage		= bpf_tcp_sendpage;
>> +}
>> +
>> +static void update_sk_prot(struct sock *sk, struct smap_psock *psock)
>> +{
>> +	int family = sk->sk_family == AF_INET6 ? SOCKMAP_IPV6 : SOCKMAP_IPV4;
>> +	int conf = psock->bpf_tx_msg ? SOCKMAP_TX : SOCKMAP_BASE;
>> +
>> +	sk->sk_prot = &bpf_tcp_prots[family][conf];
>> +}
>> +
>>  static int bpf_tcp_init(struct sock *sk)
>>  {
>>  	struct smap_psock *psock;
>> @@ -181,14 +217,17 @@ static int bpf_tcp_init(struct sock *sk)
>>  	psock->save_close = sk->sk_prot->close;
>>  	psock->sk_proto = sk->sk_prot;
>>  
>> -	if (psock->bpf_tx_msg) {
>> -		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>> -		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
>> -		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>> -		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>> +	/* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
>> +	if (sk->sk_family == AF_INET6 &&
>> +	    unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
>> +		mutex_lock(&tcpv6_prot_mutex);
> bpf_tcp_init() can be called by skops?
> Can mutex_lock() be used here?
> 

No, mutex_lock() cannot be used here. Both are called
with rcu_read_lock() held and we cannot sleep. Thanks
for catching this. Also, this will give a kernel splat
now that I have the right config options. Guess we need
a v3 :/

Thanks,
John


* Re: [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close
  2018-06-14 16:44 ` [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close John Fastabend
@ 2018-06-15  5:41   ` Martin KaFai Lau
  2018-06-15 15:23     ` John Fastabend
  0 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15  5:41 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:44:57AM -0700, John Fastabend wrote:
> First in tcp_close, reduce scope of sk_callback_lock() the lock is
> only needed for protecting smap_release_sock() the ingress and cork
> lists are protected by sock lock. Having the lock in wider scope is
> harmless but may confuse the reader who may infer it is in fact
> needed.
> 
> Next, in sock_hash_delete_elem() the pattern is as follows,
> 
>   sock_hash_delete_elem()
>      [...]
>      spin_lock(bucket_lock)
>      l = lookup_elem_raw()
>      if (l)
>         hlist_del_rcu()
>         write_lock(sk_callback_lock)
>          .... destroy psock ...
>         write_unlock(sk_callback_lock)
>      spin_unlock(bucket_lock)
> 
> The ordering is necessary because we only know the {p}sock after
> dereferencing the hash table which we can't do unless we have the
> bucket lock held. Once we have the bucket lock and the psock element
> it is deleted from the hashmap to ensure any other path doing a lookup
> will fail. Finally, the refcnt is decremented and if zero the psock
> is destroyed.
> 
> In parallel with the above (or free'ing the map) a tcp close event
> may trigger tcp_close(). Which at the moment omits the bucket lock
> altogether (oops!) where the flow looks like this,
> 
>   bpf_tcp_close()
>      [...]
>      write_lock(sk_callback_lock)
>      for each psock->maps // list of maps this sock is part of
>          hlist_del_rcu(ref_hash_node);
>          .... destroy psock ...
>      write_unlock(sk_callback_lock)
> 
> Obviously, and demonstrated by syzbot, this is broken because
> we can have multiple threads deleting entries via hlist_del_rcu().
> 
> To fix this we might be tempted to wrap the hlist operation in a
> bucket lock but that would create a lock inversion problem. In
> summary to follow locking rules maps needs the sk_callback_lock but we
> need the bucket lock to do the hlist_del_rcu. To resolve the lock
> inversion problem note that when bpf_tcp_close is called no updates
> can happen in parallel, due to ESTABLISH state check in update logic,
> so pop the head of the list repeatedly and remove the reference until
> no more are left. If a delete happens in parallel from the BPF API
> that is OK as well because it will do a similar action, lookup the
> sock in the map/hash, delete it from the map/hash, and dec the refcnt.
> We check for this case before doing a destroy on the psock to ensure
> we don't have two threads tearing down a psock. The new logic is
> as follows,
> 
>   bpf_tcp_close()
>   e = psock_map_pop(psock->maps) // done with sk_callback_lock
>   bucket_lock() // lock hash list bucket
>   l = lookup_elem_raw(head, hash, key, key_size);
>   if (l) {
>      //only get here if elmnt was not already removed
>      hlist_del_rcu()
>      ... destroy psock...
>   }
>   bucket_unlock()
> 
> And finally for all the above to work add missing sk_callback_lock
> around smap_list_remove in sock_hash_ctx_update_elem(). Otherwise
> delete and update may corrupt maps list.
> 
> (As an aside the sk_callback_lock serves two purposes. The
>  first, is to update the sock callbacks sk_data_ready, sk_write_space,
>  etc. The second is to protect the psock 'maps' list. The 'maps' list
>  is used to (as shown above) to delete all map/hash references to a
>  sock when the sock is closed)
> 
> (If we did not have the ESTABLISHED state guarantee from tcp_close
>  then we could not ensure completion because updates could happen
>  forever and pin thread in delete loop.)
> 
> Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  0 files changed
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index f1ab52d..04764f5 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -258,16 +258,54 @@ static void bpf_tcp_release(struct sock *sk)
>  	rcu_read_unlock();
>  }
>  
> +static struct htab_elem *lookup_elem_raw(struct hlist_head *head,
> +					 u32 hash, void *key, u32 key_size)
> +{
> +	struct htab_elem *l;
> +
> +	hlist_for_each_entry_rcu(l, head, hash_node) {
> +		if (l->hash == hash && !memcmp(&l->key, key, key_size))
> +			return l;
> +	}
> +
> +	return NULL;
> +}
> +
> +static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
> +{
> +	return &htab->buckets[hash & (htab->n_buckets - 1)];
> +}
> +
> +static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
> +{
> +	return &__select_bucket(htab, hash)->head;
> +}
> +
>  static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
>  {
>  	atomic_dec(&htab->count);
>  	kfree_rcu(l, rcu);
>  }
>  
> +struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
> +					   struct smap_psock *psock)
> +{
> +	struct smap_psock_map_entry *e;
> +
> +	write_lock_bh(&sk->sk_callback_lock);
> +	e = list_first_entry_or_null(&psock->maps,
> +				     struct smap_psock_map_entry,
> +				     list);
> +	if (e)
> +		list_del(&e->list);
> +	write_unlock_bh(&sk->sk_callback_lock);
> +	return e;
> +}
> +
>  static void bpf_tcp_close(struct sock *sk, long timeout)
>  {
>  	void (*close_fun)(struct sock *sk, long timeout);
> -	struct smap_psock_map_entry *e, *tmp;
> +	struct smap_psock_map_entry *e;
>  	struct sk_msg_buff *md, *mtmp;
>  	struct smap_psock *psock;
>  	struct sock *osk;
> @@ -286,7 +324,6 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
>  	 */
>  	close_fun = psock->save_close;
>  
> -	write_lock_bh(&sk->sk_callback_lock);
>  	if (psock->cork) {
>  		free_start_sg(psock->sock, psock->cork);
>  		kfree(psock->cork);
> @@ -299,20 +336,48 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
>  		kfree(md);
>  	}
>  
> -	list_for_each_entry_safe(e, tmp, &psock->maps, list) {
> +	/* Sock is in TCP_CLOSE state so any concurrent adds or updates will be
> +	 * blocked by ESTABLISHED check. However, tcp_close() + delete + free
> +	 * can all run at the same time. If a tcp_close + delete happens each
> +	 * code path will remove the entry for the map/hash before deleting it.
> +	 * In the map case a xchg and then check to verify we have a sk protects
> +	 * two paths from tearing down the same object. For hash map we lock the
> +	 * bucket and remove the object from the hash map before destroying to
> +	 * ensure that only one reference exists. By pulling object off the head
> +	 * of the list with (with sk_callback_lock) if multiple deleters are
> +	 * running we avoid duplicate references.
> +	 */
> +	e = psock_map_pop(sk, psock);
> +	while (e) {
>  		if (e->entry) {
>  			osk = cmpxchg(e->entry, sk, NULL);
>  			if (osk == sk) {
> -				list_del(&e->list);
>  				smap_release_sock(psock, sk);
>  			}
>  		} else {
> -			hlist_del_rcu(&e->hash_link->hash_node);
> -			smap_release_sock(psock, e->hash_link->sk);
> -			free_htab_elem(e->htab, e->hash_link);
> +			struct htab_elem *link = e->hash_link;
> +			struct hlist_head *head;
> +			struct htab_elem *l;
> +			struct bucket *b;
> +
> +			b = __select_bucket(e->htab, link->hash);
> +			head = &b->head;
> +			raw_spin_lock_bh(&b->lock);
> +			l = lookup_elem_raw(head,
> +					    link->hash, link->key,
> +					    e->htab->elem_size);
> +			/* If another thread deleted this object skip deletion.
> +			 * The refcnt on psock may or may not be zero.
> +			 */
> +			if (l) {
> +				hlist_del_rcu(&e->hash_link->hash_node);
> +				smap_release_sock(psock, e->hash_link->sk);
> +				free_htab_elem(e->htab, e->hash_link);
> +			}
> +			raw_spin_unlock_bh(&b->lock);
>  		}
> +		e = psock_map_pop(sk, psock);
>  	}
> -	write_unlock_bh(&sk->sk_callback_lock);
>  	rcu_read_unlock();
>  	close_fun(sk, timeout);
>  }
> @@ -2088,16 +2153,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>  	return ERR_PTR(err);
>  }
>  
> -static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
> -{
> -	return &htab->buckets[hash & (htab->n_buckets - 1)];
> -}
> -
> -static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
> -{
> -	return &__select_bucket(htab, hash)->head;
> -}
> -
>  static void sock_hash_free(struct bpf_map *map)
>  {
>  	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> @@ -2114,10 +2169,13 @@ static void sock_hash_free(struct bpf_map *map)
>  	 */
>  	rcu_read_lock();
>  	for (i = 0; i < htab->n_buckets; i++) {
> -		struct hlist_head *head = select_bucket(htab, i);
> +		struct bucket *b = __select_bucket(htab, i);
> +		struct hlist_head *head;
>  		struct hlist_node *n;
>  		struct htab_elem *l;
>  
> +		raw_spin_lock_bh(&b->lock);
There is a synchronize_rcu() at the beginning of sock_hash_free().
Taking the bucket lock of the map being freed at this point is a bit
suspicious.  What may still access the map after synchronize_rcu()?


> +		head = &b->head;
>  		hlist_for_each_entry_safe(l, n, head, hash_node) {
>  			struct sock *sock = l->sk;
>  			struct smap_psock *psock;
> @@ -2137,6 +2195,7 @@ static void sock_hash_free(struct bpf_map *map)
>  			write_unlock_bh(&sock->sk_callback_lock);
>  			kfree(l);
>  		}
> +		raw_spin_unlock_bh(&b->lock);
>  	}
>  	rcu_read_unlock();
>  	bpf_map_area_free(htab->buckets);
> @@ -2167,19 +2226,6 @@ static struct htab_elem *alloc_sock_hash_elem(struct bpf_htab *htab,
>  	return l_new;
>  }
>  
> -static struct htab_elem *lookup_elem_raw(struct hlist_head *head,
> -					 u32 hash, void *key, u32 key_size)
> -{
> -	struct htab_elem *l;
> -
> -	hlist_for_each_entry_rcu(l, head, hash_node) {
> -		if (l->hash == hash && !memcmp(&l->key, key, key_size))
> -			return l;
> -	}
> -
> -	return NULL;
> -}
> -
>  static inline u32 htab_map_hash(const void *key, u32 key_len)
>  {
>  	return jhash(key, key_len, 0);
> @@ -2307,8 +2353,10 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
>  		psock = smap_psock_sk(l_old->sk);
>  
>  		hlist_del_rcu(&l_old->hash_node);
> +		write_lock_bh(&l_old->sk->sk_callback_lock);
>  		smap_list_remove(psock, NULL, l_old);
>  		smap_release_sock(psock, l_old->sk);
> +		write_unlock_bh(&l_old->sk->sk_callback_lock);
>  		free_htab_elem(htab, l_old);
>  	}
>  	raw_spin_unlock_bh(&b->lock);
> 


* Re: [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition
  2018-06-14 16:45 ` [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition John Fastabend
@ 2018-06-15  6:04   ` Martin KaFai Lau
  0 siblings, 0 replies; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15  6:04 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:45:02AM -0700, John Fastabend wrote:
> After adding checks to ensure TCP is in ESTABLISHED state when a
> sock is added we need to also ensure that user does not transition
> through tcp_disconnect() and back into ESTABLISHED state without
> sockmap removing the sock.
> 
> To do this add unhash hook and remove sock from map there.
> 
> Reported-by: Eric Dumazet <edumazet@google.com>
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
LGTM. One nit.

Acked-by: Martin KaFai Lau <kafai@fb.com>

> ---
>  0 files changed
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index 04764f5..ffc5152 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -130,6 +130,7 @@ struct smap_psock {
>  
>  	struct proto *sk_proto;
>  	void (*save_close)(struct sock *sk, long timeout);
> +	void (*save_unhash)(struct sock *sk);
>  	void (*save_data_ready)(struct sock *sk);
>  	void (*save_write_space)(struct sock *sk);
>  };
> @@ -141,6 +142,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>  static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
>  			    int offset, size_t size, int flags);
>  static void bpf_tcp_close(struct sock *sk, long timeout);
> +static void bpf_tcp_unhash(struct sock *sk);
>  
>  static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
>  {
> @@ -182,6 +184,7 @@ static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
>  {
>  	prot[SOCKMAP_BASE]			= *base;
>  	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
> +	prot[SOCKMAP_BASE].unhash		= bpf_tcp_unhash;
>  	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
>  	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
>  
> @@ -215,6 +218,7 @@ static int bpf_tcp_init(struct sock *sk)
>  	}
>  
>  	psock->save_close = sk->sk_prot->close;
> +	psock->save_unhash = sk->sk_prot->unhash;
>  	psock->sk_proto = sk->sk_prot;
>  
>  	/* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
> @@ -302,28 +306,12 @@ struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
>  	return e;
>  }
>  
> -static void bpf_tcp_close(struct sock *sk, long timeout)
> +static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
>  {
> -	void (*close_fun)(struct sock *sk, long timeout);
>  	struct smap_psock_map_entry *e;
>  	struct sk_msg_buff *md, *mtmp;
> -	struct smap_psock *psock;
>  	struct sock *osk;
>  
> -	rcu_read_lock();
> -	psock = smap_psock_sk(sk);
> -	if (unlikely(!psock)) {
> -		rcu_read_unlock();
> -		return sk->sk_prot->close(sk, timeout);
> -	}
> -
> -	/* The psock may be destroyed anytime after exiting the RCU critial
> -	 * section so by the time we use close_fun the psock may no longer
> -	 * be valid. However, bpf_tcp_close is called with the sock lock
> -	 * held so the close hook and sk are still valid.
> -	 */
> -	close_fun = psock->save_close;
> -
>  	if (psock->cork) {
>  		free_start_sg(psock->sock, psock->cork);
>  		kfree(psock->cork);
> @@ -378,6 +366,51 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
>  		}
>  		e = psock_map_pop(sk, psock);
>  	}
> +}
> +
> +static void bpf_tcp_unhash(struct sock *sk)
> +{
> +	void (*unhash_fun)(struct sock *sk);
> +	struct smap_psock *psock;
> +
> +	rcu_read_lock();
> +	psock = smap_psock_sk(sk);
> +	if (unlikely(!psock)) {
> +		rcu_read_unlock();
> +		return sk->sk_prot->unhash(sk);
> +	}
> +
> +	/* The psock may be destroyed anytime after exiting the RCU critial
> +	 * section so by the time we use close_fun the psock may no longer
> +	 * be valid. However, bpf_tcp_close is called with the sock lock
> +	 * held so the close hook and sk are still valid.
> +	 */
Nit. s/close/unhash/

> +	unhash_fun = psock->save_unhash;
> +	bpf_tcp_remove(sk, psock);
> +	rcu_read_unlock();
> +	unhash_fun(sk);
> +
> +}
> +
> +static void bpf_tcp_close(struct sock *sk, long timeout)
> +{
> +	void (*close_fun)(struct sock *sk, long timeout);
> +	struct smap_psock *psock;
> +
> +	rcu_read_lock();
> +	psock = smap_psock_sk(sk);
> +	if (unlikely(!psock)) {
> +		rcu_read_unlock();
> +		return sk->sk_prot->close(sk, timeout);
> +	}
> +
> +	/* The psock may be destroyed anytime after exiting the RCU critial
> +	 * section so by the time we use close_fun the psock may no longer
> +	 * be valid. However, bpf_tcp_close is called with the sock lock
> +	 * held so the close hook and sk are still valid.
> +	 */
> +	close_fun = psock->save_close;
> +	bpf_tcp_remove(sk, psock);
>  	rcu_read_unlock();
>  	close_fun(sk, timeout);
>  }
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 5/6] bpf: sockhash, add release routine
  2018-06-14 16:45 ` [bpf PATCH v2 5/6] bpf: sockhash, add release routine John Fastabend
@ 2018-06-15  6:05   ` Martin KaFai Lau
  0 siblings, 0 replies; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15  6:05 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:45:07AM -0700, John Fastabend wrote:
> Add a map_release_uref pointer to the hashmap ops. This was dropped when
> the original sockhash code was ported into bpf-next before the initial
> commit.
> 
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>

> ---
>  0 files changed
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index ffc5152..77fe204 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -2518,6 +2518,7 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
>  	.map_get_next_key = sock_hash_get_next_key,
>  	.map_update_elem = sock_hash_update_elem,
>  	.map_delete_elem = sock_hash_delete_elem,
> +	.map_release_uref = sock_map_release,
>  };
>  
>  static bool bpf_is_valid_sock(struct bpf_sock_ops_kern *ops)
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap
  2018-06-14 16:45 ` [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap John Fastabend
@ 2018-06-15  6:07   ` Martin KaFai Lau
  0 siblings, 0 replies; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15  6:07 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Thu, Jun 14, 2018 at 09:45:12AM -0700, John Fastabend wrote:
> In selftest test_maps the sockmap test case attempts to add a socket
> in listening state to the sockmap. This is no longer a valid operation
> so it fails as expected. However, the test wrongly reports this as an
> error now. Fix the test to avoid adding sockets in listening state.
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>

> ---
>  0 files changed
> 
> diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> index 6c25334..9fed5f0 100644
> --- a/tools/testing/selftests/bpf/test_maps.c
> +++ b/tools/testing/selftests/bpf/test_maps.c
> @@ -564,7 +564,7 @@ static void test_sockmap(int tasks, void *data)
>  	}
>  
>  	/* Test update without programs */
> -	for (i = 0; i < 6; i++) {
> +	for (i = 2; i < 6; i++) {
>  		err = bpf_map_update_elem(fd, &i, &sfd[i], BPF_ANY);
>  		if (err) {
>  			printf("Failed noprog update sockmap '%i:%i'\n",
> @@ -727,7 +727,7 @@ static void test_sockmap(int tasks, void *data)
>  	}
>  
>  	/* Test map update elem afterwards fd lives in fd and map_fd */
> -	for (i = 0; i < 6; i++) {
> +	for (i = 2; i < 6; i++) {
>  		err = bpf_map_update_elem(map_fd_rx, &i, &sfd[i], BPF_ANY);
>  		if (err) {
>  			printf("Failed map_fd_rx update sockmap %i '%i:%i'\n",
> 
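
As a user-space illustration of what the selftest now expects, a minimal
sketch along the following lines could be used. It assumes the libbpf
helpers already used by test_maps.c (bpf_create_map() and
bpf_map_update_elem()); the program itself is hypothetical, error handling
is omitted, and it needs root:

/* Editor's sketch: only ESTABLISHED TCP sockets may be added to a sockmap. */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/bpf.h>
#include <bpf/bpf.h>

int main(void)
{
    struct sockaddr_in addr = { .sin_family = AF_INET };
    socklen_t len = sizeof(addr);
    int map_fd, srv, cli, err;
    int key = 0;

    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    map_fd = bpf_create_map(BPF_MAP_TYPE_SOCKMAP, sizeof(key),
                            sizeof(srv), 2, 0);

    srv = socket(AF_INET, SOCK_STREAM, 0);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 1);
    getsockname(srv, (struct sockaddr *)&addr, &len);

    /* LISTEN socket: expected to be rejected once this series is applied. */
    err = bpf_map_update_elem(map_fd, &key, &srv, BPF_ANY);
    printf("update(LISTEN): %d (%s)\n", err, strerror(errno));

    /* Connected (ESTABLISHED) socket: expected to be accepted. */
    cli = socket(AF_INET, SOCK_STREAM, 0);
    connect(cli, (struct sockaddr *)&addr, sizeof(addr));
    err = bpf_map_update_elem(map_fd, &key, &cli, BPF_ANY);
    printf("update(ESTABLISHED): %d (%s)\n", err, strerror(errno));

    close(cli);
    close(srv);
    close(map_fd);
    return 0;
}

The first update, on the listening socket, should fail; the second, on the
connected socket, should succeed, which matches the selftest's change from
index 0 to index 2.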

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close
  2018-06-15  5:41   ` Martin KaFai Lau
@ 2018-06-15 15:23     ` John Fastabend
  2018-06-15 15:45       ` Martin KaFai Lau
  0 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-15 15:23 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: ast, daniel, netdev

On 06/14/2018 10:41 PM, Martin KaFai Lau wrote:
> On Thu, Jun 14, 2018 at 09:44:57AM -0700, John Fastabend wrote:
>> First, in tcp_close, reduce the scope of sk_callback_lock(): the lock is
>> only needed to protect smap_release_sock(); the ingress and cork
>> lists are protected by the sock lock. Having the lock in wider scope is
>> harmless but may confuse the reader into inferring it is in fact
>> needed.
>>
>> Next, in sock_hash_delete_elem() the pattern is as follows,
>>
>>   sock_hash_delete_elem()
>>      [...]
>>      spin_lock(bucket_lock)
>>      l = lookup_elem_raw()
>>      if (l)
>>         hlist_del_rcu()
>>         write_lock(sk_callback_lock)
>>          .... destroy psock ...
>>         write_unlock(sk_callback_lock)
>>      spin_unlock(bucket_lock)
>>
>> The ordering is necessary because we only know the {p}sock after
>> dereferencing the hash table which we can't do unless we have the
>> bucket lock held. Once we have the bucket lock and the psock element
>> it is deleted from the hashmap to ensure any other path doing a lookup
>> will fail. Finally, the refcnt is decremented and if zero the psock
>> is destroyed.
>>
>> In parallel with the above (or with freeing the map) a tcp close event
>> may trigger tcp_close(), which at the moment omits the bucket lock
>> altogether (oops!). The flow looks like this,
>>
>>   bpf_tcp_close()
>>      [...]
>>      write_lock(sk_callback_lock)
>>      for each psock->maps // list of maps this sock is part of
>>          hlist_del_rcu(ref_hash_node);
>>          .... destroy psock ...
>>      write_unlock(sk_callback_lock)
>>
>> Obviously, and demonstrated by syzbot, this is broken because
>> we can have multiple threads deleting entries via hlist_del_rcu().
>>
>> To fix this we might be tempted to wrap the hlist operation in a
>> bucket lock but that would create a lock inversion problem. In
>> summary, to follow the locking rules the maps list needs the sk_callback_lock,
>> but we need the bucket lock to do the hlist_del_rcu. To resolve the lock
>> inversion problem, note that when bpf_tcp_close is called no updates
>> can happen in parallel, due to the ESTABLISHED state check in the update logic,
>> so pop the head of the list repeatedly and remove the reference until
>> no more are left. If a delete happens in parallel from the BPF API
>> that is OK as well because it will do a similar action, lookup the
>> sock in the map/hash, delete it from the map/hash, and dec the refcnt.
>> We check for this case before doing a destroy on the psock to ensure
>> we don't have two threads tearing down a psock. The new logic is
>> as follows,
>>
>>   bpf_tcp_close()
>>   e = psock_map_pop(psock->maps) // done with sk_callback_lock
>>   bucket_lock() // lock hash list bucket
>>   l = lookup_elem_raw(head, hash, key, key_size);
>>   if (l) {
>>      //only get here if elmnt was not already removed
>>      hlist_del_rcu()
>>      ... destroy psock...
>>   }
>>   bucket_unlock()
>>
>> And finally, for all the above to work, add the missing sk_callback_lock
>> around smap_list_remove in sock_hash_ctx_update_elem(). Otherwise
>> delete and update may corrupt the maps list.
>>
>> (As an aside, the sk_callback_lock serves two purposes. The
>>  first is to protect updates to the sock callbacks sk_data_ready,
>>  sk_write_space, etc. The second is to protect the psock 'maps' list.
>>  The 'maps' list is used (as shown above) to delete all map/hash
>>  references to a sock when the sock is closed.)
>>
>> (If we did not have the ESTABLISHED state guarantee from tcp_close
>>  then we could not ensure completion because updates could happen
>>  forever and pin thread in delete loop.)
>>
>> Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
>> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> ---
>>  0 files changed
>>

^^^^ Will fix this '0 files changed' as well.

>>  	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>> @@ -2114,10 +2169,13 @@ static void sock_hash_free(struct bpf_map *map)
>>  	 */
>>  	rcu_read_lock();
>>  	for (i = 0; i < htab->n_buckets; i++) {
>> -		struct hlist_head *head = select_bucket(htab, i);
>> +		struct bucket *b = __select_bucket(htab, i);
>> +		struct hlist_head *head;
>>  		struct hlist_node *n;
>>  		struct htab_elem *l;
>>  
>> +		raw_spin_lock_bh(&b->lock);
> There is a synchronize_rcu() at the beginning of sock_hash_free().
> Taking the bucket's lock of the free-ing map at this point is a bit
> suspicious.  What may still access the map after synchronize_rcu()?
> 

tcp_close() may be called while the map is being freed. The sync_rcu will
only sync the BPF side.

.John
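
For readers following the locking discussion, a condensed sketch of the
free path with the bucket lock applied, assembled from the hunks quoted in
this thread (helper names are those of kernel/bpf/sockmap.c; this is an
illustration, not the verbatim upstream function):

static void sock_hash_free(struct bpf_map *map)
{
    struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
    int i;

    synchronize_rcu();      /* waits out BPF-side readers only */
    rcu_read_lock();
    for (i = 0; i < htab->n_buckets; i++) {
        struct bucket *b = __select_bucket(htab, i);
        struct hlist_head *head = &b->head;
        struct hlist_node *n;
        struct htab_elem *l;

        /* Serializes against a parallel tcp_close()/delete that takes
         * the same bucket lock before unlinking an element.
         */
        raw_spin_lock_bh(&b->lock);
        hlist_for_each_entry_safe(l, n, head, hash_node) {
            struct sock *sock = l->sk;
            struct smap_psock *psock;

            hlist_del_rcu(&l->hash_node);
            write_lock_bh(&sock->sk_callback_lock);
            psock = smap_psock_sk(sock);
            if (psock) {    /* may already have been torn down */
                smap_list_remove(psock, NULL, l);
                smap_release_sock(psock, sock);
            }
            write_unlock_bh(&sock->sk_callback_lock);
            kfree(l);
        }
        raw_spin_unlock_bh(&b->lock);
    }
    rcu_read_unlock();
    bpf_map_area_free(htab->buckets);
    kfree(htab);
}

With the bucket lock held around the walk, a tcp_close() running at the
same time must take the same lock before it can unlink an element, which
is the race being closed here.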

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close
  2018-06-15 15:23     ` John Fastabend
@ 2018-06-15 15:45       ` Martin KaFai Lau
  0 siblings, 0 replies; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-15 15:45 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Fri, Jun 15, 2018 at 08:23:14AM -0700, John Fastabend wrote:
> On 06/14/2018 10:41 PM, Martin KaFai Lau wrote:
> > On Thu, Jun 14, 2018 at 09:44:57AM -0700, John Fastabend wrote:
> >> First, in tcp_close, reduce the scope of sk_callback_lock(): the lock is
> >> only needed to protect smap_release_sock(); the ingress and cork
> >> lists are protected by the sock lock. Having the lock in wider scope is
> >> harmless but may confuse the reader into inferring it is in fact
> >> needed.
> >>
> >> Next, in sock_hash_delete_elem() the pattern is as follows,
> >>
> >>   sock_hash_delete_elem()
> >>      [...]
> >>      spin_lock(bucket_lock)
> >>      l = lookup_elem_raw()
> >>      if (l)
> >>         hlist_del_rcu()
> >>         write_lock(sk_callback_lock)
> >>          .... destroy psock ...
> >>         write_unlock(sk_callback_lock)
> >>      spin_unlock(bucket_lock)
> >>
> >> The ordering is necessary because we only know the {p}sock after
> >> dereferencing the hash table which we can't do unless we have the
> >> bucket lock held. Once we have the bucket lock and the psock element
> >> it is deleted from the hashmap to ensure any other path doing a lookup
> >> will fail. Finally, the refcnt is decremented and if zero the psock
> >> is destroyed.
> >>
> >> In parallel with the above (or with freeing the map) a tcp close event
> >> may trigger tcp_close(), which at the moment omits the bucket lock
> >> altogether (oops!). The flow looks like this,
> >>
> >>   bpf_tcp_close()
> >>      [...]
> >>      write_lock(sk_callback_lock)
> >>      for each psock->maps // list of maps this sock is part of
> >>          hlist_del_rcu(ref_hash_node);
> >>          .... destroy psock ...
> >>      write_unlock(sk_callback_lock)
> >>
> >> Obviously, and demonstrated by syzbot, this is broken because
> >> we can have multiple threads deleting entries via hlist_del_rcu().
> >>
> >> To fix this we might be tempted to wrap the hlist operation in a
> >> bucket lock but that would create a lock inversion problem. In
> >> summary, to follow the locking rules the maps list needs the sk_callback_lock,
> >> but we need the bucket lock to do the hlist_del_rcu. To resolve the lock
> >> inversion problem, note that when bpf_tcp_close is called no updates
> >> can happen in parallel, due to the ESTABLISHED state check in the update logic,
> >> so pop the head of the list repeatedly and remove the reference until
> >> no more are left. If a delete happens in parallel from the BPF API
> >> that is OK as well because it will do a similar action, lookup the
> >> sock in the map/hash, delete it from the map/hash, and dec the refcnt.
> >> We check for this case before doing a destroy on the psock to ensure
> >> we don't have two threads tearing down a psock. The new logic is
> >> as follows,
> >>
> >>   bpf_tcp_close()
> >>   e = psock_map_pop(psock->maps) // done with sk_callback_lock
> >>   bucket_lock() // lock hash list bucket
> >>   l = lookup_elem_raw(head, hash, key, key_size);
> >>   if (l) {
> >>      //only get here if elmnt was not already removed
> >>      hlist_del_rcu()
> >>      ... destroy psock...
> >>   }
> >>   bucket_unlock()
> >>
> >> And finally, for all the above to work, add the missing sk_callback_lock
> >> around smap_list_remove in sock_hash_ctx_update_elem(). Otherwise
> >> delete and update may corrupt the maps list.
> >>
> >> (As an aside, the sk_callback_lock serves two purposes. The
> >>  first is to protect updates to the sock callbacks sk_data_ready,
> >>  sk_write_space, etc. The second is to protect the psock 'maps' list.
> >>  The 'maps' list is used (as shown above) to delete all map/hash
> >>  references to a sock when the sock is closed.)
> >>
> >> (If we did not have the ESTABLISHED state guarantee from tcp_close
> >>  then we could not ensure completion because updates could happen
> >>  forever and pin thread in delete loop.)
> >>
> >> Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
> >> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> >> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> >> ---
> >>  0 files changed
> >>
> 
> ^^^^ Will fix this 0 files changes as well.
> 
> >>  	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> >> @@ -2114,10 +2169,13 @@ static void sock_hash_free(struct bpf_map *map)
> >>  	 */
> >>  	rcu_read_lock();
> >>  	for (i = 0; i < htab->n_buckets; i++) {
> >> -		struct hlist_head *head = select_bucket(htab, i);
> >> +		struct bucket *b = __select_bucket(htab, i);
> >> +		struct hlist_head *head;
> >>  		struct hlist_node *n;
> >>  		struct htab_elem *l;
> >>  
> >> +		raw_spin_lock_bh(&b->lock);
> > There is a synchronize_rcu() at the beginning of sock_hash_free().
> > Taking the bucket's lock of the free-ing map at this point is a bit
> > suspicious.  What may still access the map after synchronize_rcu()?
> > 
> 
> tcp_close() may be called while the map is being freed. The sync_rcu will
> only sync the BPF side.
Thanks for the explanation.

hmm.... so bpf_tcp_close(), which is under rcu_read_lock(), may still
have a reference to this hash.  Should it wait for another rcu grace period
before freeing the htab->buckets and htab?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state
  2018-06-15  0:18   ` Martin KaFai Lau
@ 2018-06-18 14:50     ` John Fastabend
  2018-06-18 21:17       ` Martin KaFai Lau
  0 siblings, 1 reply; 19+ messages in thread
From: John Fastabend @ 2018-06-18 14:50 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: ast, daniel, netdev

On 06/14/2018 05:18 PM, Martin KaFai Lau wrote:
> On Thu, Jun 14, 2018 at 09:44:52AM -0700, John Fastabend wrote:
>> Per the note in the TLS ULP (which is actually a generic statement
>> regarding ULPs)
>>
>>  /* The TLS ulp is currently supported only for TCP sockets
>>   * in ESTABLISHED state.
>>   * Supporting sockets in LISTEN state will require us
>>   * to modify the accept implementation to clone rather then
>>   * share the ulp context.
>>   */
> Can you explain how that applies to the bpf_tcp ulp?
> 
> My understanding is the "ulp context" referred to in the TLS ulp is
> the tls_context stored in icsk_ulp_data, but I don't see bpf_tcp's
> ulp using icsk_ulp_data.
> 
> Others LGTM.
> 

So I think you are right, we could probably allow it
here, but I am thinking I'll leave the check for now
anyway, for a couple of reasons. First, we will shortly
add support to allow ULP types to coexist. At the moment
the two ULP types cannot coexist. When this happens it
looks like we will need to restrict to only ESTABLISHED
types or somehow make all ULPs work in all states.

Second, I don't have any use cases (nor can I think of
any) for the sock{map|hash} ULP to be running on a
non-ESTABLISHED socket. It's not clear to me that having the
sendmsg/sendpage hooks for a LISTEN socket makes sense.
I would rather restrict it now, and if we add something
later where it makes sense to run on non-ESTABLISHED
socks we can remove the check.

Thanks for reviewing,
John

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state
  2018-06-18 14:50     ` John Fastabend
@ 2018-06-18 21:17       ` Martin KaFai Lau
  2018-06-20 22:15         ` John Fastabend
  0 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2018-06-18 21:17 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Mon, Jun 18, 2018 at 07:50:19AM -0700, John Fastabend wrote:
> On 06/14/2018 05:18 PM, Martin KaFai Lau wrote:
> > On Thu, Jun 14, 2018 at 09:44:52AM -0700, John Fastabend wrote:
> >> Per the note in the TLS ULP (which is actually a generic statement
> >> regarding ULPs)
> >>
> >>  /* The TLS ulp is currently supported only for TCP sockets
> >>   * in ESTABLISHED state.
> >>   * Supporting sockets in LISTEN state will require us
> >>   * to modify the accept implementation to clone rather then
> >>   * share the ulp context.
> >>   */
> > Can you explain how that applies to the bpf_tcp ulp?
> > 
> > My understanding is the "ulp context" referred to in the TLS ulp is
> > the tls_context stored in icsk_ulp_data, but I don't see bpf_tcp's
> > ulp using icsk_ulp_data.
> > 
> > Others LGTM.
> > 
> 
> So I think you are right, we could probably allow it
> here, but I am thinking I'll leave the check for now
> anyway, for a couple of reasons. First, we will shortly
> add support to allow ULP types to coexist. At the moment
> the two ULP types cannot coexist. When this happens it
> looks like we will need to restrict to only ESTABLISHED
> types or somehow make all ULPs work in all states.
> 
> Second, I don't have any use cases (nor can I think of
> any) for the sock{map|hash} ULP to be running on a
> non-ESTABLISHED socket. It's not clear to me that having the
> sendmsg/sendpage hooks for a LISTEN socket makes sense.
> I would rather restrict it now, and if we add something
> later where it makes sense to run on non-ESTABLISHED
> socks we can remove the check.
Make sense if there is no use case.  It will be helpful if the commit log
is updated accordingly.  Thanks!

Acked-by: Martin KaFai Lau <kafai@fb.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state
  2018-06-18 21:17       ` Martin KaFai Lau
@ 2018-06-20 22:15         ` John Fastabend
  0 siblings, 0 replies; 19+ messages in thread
From: John Fastabend @ 2018-06-20 22:15 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: ast, daniel, netdev

On 06/18/2018 02:17 PM, Martin KaFai Lau wrote:
> On Mon, Jun 18, 2018 at 07:50:19AM -0700, John Fastabend wrote:
>> On 06/14/2018 05:18 PM, Martin KaFai Lau wrote:
>>> On Thu, Jun 14, 2018 at 09:44:52AM -0700, John Fastabend wrote:
>>>> Per the note in the TLS ULP (which is actually a generic statement
>>>> regarding ULPs)
>>>>
>>>>  /* The TLS ulp is currently supported only for TCP sockets
>>>>   * in ESTABLISHED state.
>>>>   * Supporting sockets in LISTEN state will require us
>>>>   * to modify the accept implementation to clone rather then
>>>>   * share the ulp context.
>>>>   */
>>> Can you explain how that applies to the bpf_tcp ulp?
>>>
>>> My understanding is the "ulp context" referred to in the TLS ulp is
>>> the tls_context stored in icsk_ulp_data, but I don't see bpf_tcp's
>>> ulp using icsk_ulp_data.
>>>
>>> Others LGTM.
>>>
>>
>> So I think you are right, we could probably allow it
>> here, but I am thinking I'll leave the check for now
>> anyway, for a couple of reasons. First, we will shortly
>> add support to allow ULP types to coexist. At the moment
>> the two ULP types cannot coexist. When this happens it
>> looks like we will need to restrict to only ESTABLISHED
>> types or somehow make all ULPs work in all states.
>>
>> Second, I don't have any use cases (nor can I think of
>> any) for the sock{map|hash} ULP to be running on a
>> non-ESTABLISHED socket. It's not clear to me that having the
>> sendmsg/sendpage hooks for a LISTEN socket makes sense.
>> I would rather restrict it now, and if we add something
>> later where it makes sense to run on non-ESTABLISHED
>> socks we can remove the check.
> Make sense if there is no use case.  It will be helpful if the commit log
> is updated accordingly.  Thanks!
> 
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> 

The fall-out of this patch got a bit ugly. It doesn't
make much sense to me to allow transitioning into
ESTABLISHED state (via tcp_disconnect) and not allow adding
the socks up front. But the fix via the unhash callback, in the
subsequent patch, ended up causing a few issues. First, to avoid
racing with transitions through the update logic we had to take
the sock lock in the update handler, which means we
can't use the normal ./kernel/bpf/syscall.c map update
logic and had to special-case it so that preemption is not
disabled and the RCU read-side section is not entered until
after the lock is taken, because the sock lock can sleep.
Then, after running overnight, I noticed a couple of WARNINGs
related to sk_forward_alloc not being zeroed correctly on sock
teardown. The issue is that unhash doesn't hold the sock lock
either and can run while a sendmsg/sendpage is in flight,
resulting in scatterlists being removed incorrectly. :(

All in all the "fix" got ugly, so let's stay with the minimal
required set and allow non-established socks. It shouldn't
hurt anything, even if from a use-case perspective it's a bit
useless. Then in bpf-next, when we allow ULPs to coexist, we
need to have some fix to handle this. But I would rather do
that in the *next branches instead of the bpf/net branches.

Thanks for all the useful feedback. I'm going to send a
set of three patches now to address the specific issues:
(a) IPv6 socks, (b) the missing bucket lock, and (c) the
missing uref release.

Thanks,
John

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-06-20 22:16 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-14 16:44 [bpf PATCH v2 0/6] BPF fixes for sockhash John Fastabend
2018-06-14 16:44 ` [bpf PATCH v2 1/6] bpf: sockmap, fix crash when ipv6 sock is added John Fastabend
2018-06-14 23:53   ` Martin KaFai Lau
2018-06-15  4:46     ` John Fastabend
2018-06-14 16:44 ` [bpf PATCH v2 2/6] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
2018-06-15  0:18   ` Martin KaFai Lau
2018-06-18 14:50     ` John Fastabend
2018-06-18 21:17       ` Martin KaFai Lau
2018-06-20 22:15         ` John Fastabend
2018-06-14 16:44 ` [bpf PATCH v2 3/6] bpf: sockhash fix omitted bucket lock in sock_close John Fastabend
2018-06-15  5:41   ` Martin KaFai Lau
2018-06-15 15:23     ` John Fastabend
2018-06-15 15:45       ` Martin KaFai Lau
2018-06-14 16:45 ` [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition John Fastabend
2018-06-15  6:04   ` Martin KaFai Lau
2018-06-14 16:45 ` [bpf PATCH v2 5/6] bpf: sockhash, add release routine John Fastabend
2018-06-15  6:05   ` Martin KaFai Lau
2018-06-14 16:45 ` [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap John Fastabend
2018-06-15  6:07   ` Martin KaFai Lau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).