netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [bpf PATCH v3 0/3] bpf, sockmap ESTABLISHED state only
@ 2018-09-18  4:36 John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 1/3] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: John Fastabend @ 2018-09-18  4:36 UTC (permalink / raw)
  To: edumazet, ast, daniel; +Cc: netdev

Eric noted that using the close callback is not sufficient
to catch all transitions from ESTABLISHED state to a LISTEN
state. So this series does two things. First, only allow
adding socks in ESTABLISH state and second use unhash callback
to catch tcp_disconnect() transitions.

v2: added check for ESTABLISH state in hash update sockmap as well
v3: Do not release lock from unhash in error path, no lock was
    used in the first place. And drop not so useful code comments

Thanks for reviewing Yonghong I carried your ACK forward
on patch 1/3.

Thanks,
John

---

John Fastabend (3):
      bpf: sockmap only allow ESTABLISHED sock state
      bpf: sockmap, fix transition through disconnect without close
      bpf: test_maps, only support ESTABLISHED socks


 tools/testing/selftests/bpf/test_maps.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bpf PATCH v3 1/3] bpf: sockmap only allow ESTABLISHED sock state
  2018-09-18  4:36 [bpf PATCH v3 0/3] bpf, sockmap ESTABLISHED state only John Fastabend
@ 2018-09-18  4:36 ` John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks John Fastabend
  2 siblings, 0 replies; 6+ messages in thread
From: John Fastabend @ 2018-09-18  4:36 UTC (permalink / raw)
  To: edumazet, ast, daniel; +Cc: netdev

After this patch we only allow socks that are in ESTABLISHED state or
are being added via a sock_ops event that is transitioning into an
ESTABLISHED state. By allowing sock_ops events we allow users to
manage sockmaps directly from sock ops programs. The two supported
sock_ops ops are BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB and
BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB.

Similar to TLS ULP this ensures sk_user_data is correct.

Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 488ef96..1f97b55 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -2097,8 +2097,12 @@ static int sock_map_update_elem(struct bpf_map *map,
 		return -EINVAL;
 	}
 
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state.
+	 */
 	if (skops.sk->sk_type != SOCK_STREAM ||
-	    skops.sk->sk_protocol != IPPROTO_TCP) {
+	    skops.sk->sk_protocol != IPPROTO_TCP ||
+	    skops.sk->sk_state != TCP_ESTABLISHED) {
 		fput(socket->file);
 		return -EOPNOTSUPP;
 	}
@@ -2453,6 +2457,16 @@ static int sock_hash_update_elem(struct bpf_map *map,
 		return -EINVAL;
 	}
 
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state.
+	 */
+	if (skops.sk->sk_type != SOCK_STREAM ||
+	    skops.sk->sk_protocol != IPPROTO_TCP ||
+	    skops.sk->sk_state != TCP_ESTABLISHED) {
+		fput(socket->file);
+		return -EOPNOTSUPP;
+	}
+
 	lock_sock(skops.sk);
 	preempt_disable();
 	rcu_read_lock();
@@ -2543,10 +2557,22 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
 	.map_check_btf = map_check_no_btf,
 };
 
+static bool bpf_is_valid_sock_op(struct bpf_sock_ops_kern *ops)
+{
+	return ops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB ||
+	       ops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB;
+}
 BPF_CALL_4(bpf_sock_map_update, struct bpf_sock_ops_kern *, bpf_sock,
 	   struct bpf_map *, map, void *, key, u64, flags)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state. This checks that the sock ops triggering the update is
+	 * one indicating we are (or will be soon) in an ESTABLISHED state.
+	 */
+	if (!bpf_is_valid_sock_op(bpf_sock))
+		return -EOPNOTSUPP;
 	return sock_map_ctx_update_elem(bpf_sock, map, key, flags);
 }
 
@@ -2565,6 +2591,9 @@ struct sock  *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
 	   struct bpf_map *, map, void *, key, u64, flags)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	if (!bpf_is_valid_sock_op(bpf_sock))
+		return -EOPNOTSUPP;
 	return sock_hash_ctx_update_elem(bpf_sock, map, key, flags);
 }
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close
  2018-09-18  4:36 [bpf PATCH v3 0/3] bpf, sockmap ESTABLISHED state only John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 1/3] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
@ 2018-09-18  4:36 ` John Fastabend
  2018-09-18  5:21   ` Y Song
  2018-09-18  4:36 ` [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks John Fastabend
  2 siblings, 1 reply; 6+ messages in thread
From: John Fastabend @ 2018-09-18  4:36 UTC (permalink / raw)
  To: edumazet, ast, daniel; +Cc: netdev

It is possible (via shutdown()) for TCP socks to go trough TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call our bpf_tcp_close() callback. Because of this a user
could disconnect a socket then put it in a LISTEN state which would
break our assumptions about sockets always being ESTABLISHED state.

To resolve this rely on the unhash hook, which is called in the
disconnect case, to remove the sock from the sockmap.

Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 0 files changed

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 1f97b55..5680d65 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -132,6 +132,7 @@ struct smap_psock {
 	struct work_struct gc_work;
 
 	struct proto *sk_proto;
+	void (*save_unhash)(struct sock *sk);
 	void (*save_close)(struct sock *sk, long timeout);
 	void (*save_data_ready)(struct sock *sk);
 	void (*save_write_space)(struct sock *sk);
@@ -143,6 +144,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
 			    int offset, size_t size, int flags);
+static void bpf_tcp_unhash(struct sock *sk);
 static void bpf_tcp_close(struct sock *sk, long timeout);
 
 static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
@@ -184,6 +186,7 @@ static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
 			 struct proto *base)
 {
 	prot[SOCKMAP_BASE]			= *base;
+	prot[SOCKMAP_BASE].unhash		= bpf_tcp_unhash;
 	prot[SOCKMAP_BASE].close		= bpf_tcp_close;
 	prot[SOCKMAP_BASE].recvmsg		= bpf_tcp_recvmsg;
 	prot[SOCKMAP_BASE].stream_memory_read	= bpf_tcp_stream_read;
@@ -217,6 +220,7 @@ static int bpf_tcp_init(struct sock *sk)
 		return -EBUSY;
 	}
 
+	psock->save_unhash = sk->sk_prot->unhash;
 	psock->save_close = sk->sk_prot->close;
 	psock->sk_proto = sk->sk_prot;
 
@@ -305,30 +309,12 @@ static struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
 	return e;
 }
 
-static void bpf_tcp_close(struct sock *sk, long timeout)
+static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
 {
-	void (*close_fun)(struct sock *sk, long timeout);
 	struct smap_psock_map_entry *e;
 	struct sk_msg_buff *md, *mtmp;
-	struct smap_psock *psock;
 	struct sock *osk;
 
-	lock_sock(sk);
-	rcu_read_lock();
-	psock = smap_psock_sk(sk);
-	if (unlikely(!psock)) {
-		rcu_read_unlock();
-		release_sock(sk);
-		return sk->sk_prot->close(sk, timeout);
-	}
-
-	/* The psock may be destroyed anytime after exiting the RCU critial
-	 * section so by the time we use close_fun the psock may no longer
-	 * be valid. However, bpf_tcp_close is called with the sock lock
-	 * held so the close hook and sk are still valid.
-	 */
-	close_fun = psock->save_close;
-
 	if (psock->cork) {
 		free_start_sg(psock->sock, psock->cork, true);
 		kfree(psock->cork);
@@ -379,6 +365,42 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 		kfree(e);
 		e = psock_map_pop(sk, psock);
 	}
+}
+
+static void bpf_tcp_unhash(struct sock *sk)
+{
+	void (*unhash_fun)(struct sock *sk);
+	struct smap_psock *psock;
+
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		if (sk->sk_prot->unhash)
+			return sk->sk_prot->unhash(sk);
+		return;
+	}
+	unhash_fun = psock->save_unhash;
+	bpf_tcp_remove(sk, psock);
+	rcu_read_unlock();
+	unhash_fun(sk);
+}
+
+static void bpf_tcp_close(struct sock *sk, long timeout)
+{
+	void (*close_fun)(struct sock *sk, long timeout);
+	struct smap_psock *psock;
+
+	lock_sock(sk);
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		release_sock(sk);
+		return sk->sk_prot->close(sk, timeout);
+	}
+	close_fun = psock->save_close;
+	bpf_tcp_remove(sk, psock);
 	rcu_read_unlock();
 	release_sock(sk);
 	close_fun(sk, timeout);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks
  2018-09-18  4:36 [bpf PATCH v3 0/3] bpf, sockmap ESTABLISHED state only John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 1/3] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
  2018-09-18  4:36 ` [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close John Fastabend
@ 2018-09-18  4:36 ` John Fastabend
  2018-09-18  5:22   ` Y Song
  2 siblings, 1 reply; 6+ messages in thread
From: John Fastabend @ 2018-09-18  4:36 UTC (permalink / raw)
  To: edumazet, ast, daniel; +Cc: netdev

Ensure that sockets added to a sock{map|hash} that is not in the
ESTABLISHED state is rejected.

Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/testing/selftests/bpf/test_maps.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 6f54f84..9b552c0 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -580,7 +580,11 @@ static void test_sockmap(int tasks, void *data)
 	/* Test update without programs */
 	for (i = 0; i < 6; i++) {
 		err = bpf_map_update_elem(fd, &i, &sfd[i], BPF_ANY);
-		if (err) {
+		if (i < 2 && !err) {
+			printf("Allowed update sockmap '%i:%i' not in ESTABLISHED\n",
+			       i, sfd[i]);
+			goto out_sockmap;
+		} else if (i >= 2 && err) {
 			printf("Failed noprog update sockmap '%i:%i'\n",
 			       i, sfd[i]);
 			goto out_sockmap;
@@ -741,7 +745,7 @@ static void test_sockmap(int tasks, void *data)
 	}
 
 	/* Test map update elem afterwards fd lives in fd and map_fd */
-	for (i = 0; i < 6; i++) {
+	for (i = 2; i < 6; i++) {
 		err = bpf_map_update_elem(map_fd_rx, &i, &sfd[i], BPF_ANY);
 		if (err) {
 			printf("Failed map_fd_rx update sockmap %i '%i:%i'\n",
@@ -845,7 +849,7 @@ static void test_sockmap(int tasks, void *data)
 	}
 
 	/* Delete the elems without programs */
-	for (i = 0; i < 6; i++) {
+	for (i = 2; i < 6; i++) {
 		err = bpf_map_delete_elem(fd, &i);
 		if (err) {
 			printf("Failed delete sockmap %i '%i:%i'\n",

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close
  2018-09-18  4:36 ` [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close John Fastabend
@ 2018-09-18  5:21   ` Y Song
  0 siblings, 0 replies; 6+ messages in thread
From: Y Song @ 2018-09-18  5:21 UTC (permalink / raw)
  To: John Fastabend; +Cc: Eric Dumazet, Alexei Starovoitov, Daniel Borkmann, netdev

On Mon, Sep 17, 2018 at 9:39 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> It is possible (via shutdown()) for TCP socks to go trough TCP_CLOSE
> state via tcp_disconnect() without actually calling tcp_close which
> would then call our bpf_tcp_close() callback. Because of this a user
> could disconnect a socket then put it in a LISTEN state which would
> break our assumptions about sockets always being ESTABLISHED state.
>
> To resolve this rely on the unhash hook, which is called in the
> disconnect case, to remove the sock from the sockmap.
>
> Reported-by: Eric Dumazet <edumazet@google.com>
> Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  0 files changed
>
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index 1f97b55..5680d65 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -132,6 +132,7 @@ struct smap_psock {
>         struct work_struct gc_work;
>
>         struct proto *sk_proto;
> +       void (*save_unhash)(struct sock *sk);
>         void (*save_close)(struct sock *sk, long timeout);
>         void (*save_data_ready)(struct sock *sk);
>         void (*save_write_space)(struct sock *sk);
> @@ -143,6 +144,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>  static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
>  static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
>                             int offset, size_t size, int flags);
> +static void bpf_tcp_unhash(struct sock *sk);
>  static void bpf_tcp_close(struct sock *sk, long timeout);
>
>  static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
> @@ -184,6 +186,7 @@ static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
>                          struct proto *base)
>  {
>         prot[SOCKMAP_BASE]                      = *base;
> +       prot[SOCKMAP_BASE].unhash               = bpf_tcp_unhash;
>         prot[SOCKMAP_BASE].close                = bpf_tcp_close;
>         prot[SOCKMAP_BASE].recvmsg              = bpf_tcp_recvmsg;
>         prot[SOCKMAP_BASE].stream_memory_read   = bpf_tcp_stream_read;
> @@ -217,6 +220,7 @@ static int bpf_tcp_init(struct sock *sk)
>                 return -EBUSY;
>         }
>
> +       psock->save_unhash = sk->sk_prot->unhash;
>         psock->save_close = sk->sk_prot->close;
>         psock->sk_proto = sk->sk_prot;
>
> @@ -305,30 +309,12 @@ static struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
>         return e;
>  }
>
> -static void bpf_tcp_close(struct sock *sk, long timeout)
> +static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
>  {
> -       void (*close_fun)(struct sock *sk, long timeout);
>         struct smap_psock_map_entry *e;
>         struct sk_msg_buff *md, *mtmp;
> -       struct smap_psock *psock;
>         struct sock *osk;
>
> -       lock_sock(sk);
> -       rcu_read_lock();
> -       psock = smap_psock_sk(sk);
> -       if (unlikely(!psock)) {
> -               rcu_read_unlock();
> -               release_sock(sk);
> -               return sk->sk_prot->close(sk, timeout);
> -       }
> -
> -       /* The psock may be destroyed anytime after exiting the RCU critial
> -        * section so by the time we use close_fun the psock may no longer
> -        * be valid. However, bpf_tcp_close is called with the sock lock
> -        * held so the close hook and sk are still valid.
> -        */
> -       close_fun = psock->save_close;
> -
>         if (psock->cork) {
>                 free_start_sg(psock->sock, psock->cork, true);
>                 kfree(psock->cork);
> @@ -379,6 +365,42 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
>                 kfree(e);
>                 e = psock_map_pop(sk, psock);
>         }
> +}
> +
> +static void bpf_tcp_unhash(struct sock *sk)
> +{
> +       void (*unhash_fun)(struct sock *sk);
> +       struct smap_psock *psock;
> +
> +       rcu_read_lock();
> +       psock = smap_psock_sk(sk);
> +       if (unlikely(!psock)) {
> +               rcu_read_unlock();
> +               if (sk->sk_prot->unhash)
> +                       return sk->sk_prot->unhash(sk);
> +               return;
Nit: the above code can be
    if (sk->sk_prot->unhash)
       sk->sk_prot->unhash(sk);
    return;
Other than this,
Acked-by: Yonghong Song <yhs@fb.com>

> +       }
> +       unhash_fun = psock->save_unhash;
> +       bpf_tcp_remove(sk, psock);
> +       rcu_read_unlock();
> +       unhash_fun(sk);
> +}
> +
> +static void bpf_tcp_close(struct sock *sk, long timeout)
> +{
> +       void (*close_fun)(struct sock *sk, long timeout);
> +       struct smap_psock *psock;
> +
> +       lock_sock(sk);
> +       rcu_read_lock();
> +       psock = smap_psock_sk(sk);
> +       if (unlikely(!psock)) {
> +               rcu_read_unlock();
> +               release_sock(sk);
> +               return sk->sk_prot->close(sk, timeout);
> +       }
> +       close_fun = psock->save_close;
> +       bpf_tcp_remove(sk, psock);
>         rcu_read_unlock();
>         release_sock(sk);
>         close_fun(sk, timeout);
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks
  2018-09-18  4:36 ` [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks John Fastabend
@ 2018-09-18  5:22   ` Y Song
  0 siblings, 0 replies; 6+ messages in thread
From: Y Song @ 2018-09-18  5:22 UTC (permalink / raw)
  To: John Fastabend; +Cc: Eric Dumazet, Alexei Starovoitov, Daniel Borkmann, netdev

On Mon, Sep 17, 2018 at 9:38 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Ensure that sockets added to a sock{map|hash} that is not in the
> ESTABLISHED state is rejected.
>
> Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Acked-by: Yonghong Song <yhs@fb.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-09-18 10:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-18  4:36 [bpf PATCH v3 0/3] bpf, sockmap ESTABLISHED state only John Fastabend
2018-09-18  4:36 ` [bpf PATCH v3 1/3] bpf: sockmap only allow ESTABLISHED sock state John Fastabend
2018-09-18  4:36 ` [bpf PATCH v3 2/3] bpf: sockmap, fix transition through disconnect without close John Fastabend
2018-09-18  5:21   ` Y Song
2018-09-18  4:36 ` [bpf PATCH v3 3/3] bpf: test_maps, only support ESTABLISHED socks John Fastabend
2018-09-18  5:22   ` Y Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).