bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes
@ 2019-08-22  9:13 Björn Töpel
  2019-08-22  9:13 ` [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Björn Töpel @ 2019-08-22  9:13 UTC (permalink / raw)
  To: ast, daniel, netdev
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, bpf,
	bjorn.topel, jonathan.lemon, syzbot+c82697e3043781e08802,
	hdanton, i.maximets

This is a four patch series of various barrier, {READ, WRITE}_ONCE
cleanups in the AF_XDP socket code. More details can be found in the
corresponding commit message.

For an AF_XDP socket, most control plane operations are done under the
control mutex (struct xdp_sock, mutex), but there are some places
where members of the struct is read outside the control mutex. This,
as pointed out by Daniel in [1], requires proper {READ,
WRITE}_ONCE-correctness [2] [3]. To address this, and to simplify the
code, the state variable (introduced by Ilya), is now used a point of
synchronization ("is the socket in a valid state, or not").


Thanks,
Björn

[1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/
[2] https://lwn.net/Articles/793253/
[3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE

Björn Töpel (4):
  xsk: avoid store-tearing when assigning queues
  xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state
  xsk: avoid store-tearing when assigning umem
  xsk: lock the control mutex in sock_diag interface

 net/xdp/xsk.c      | 61 ++++++++++++++++++++++++++++++++--------------
 net/xdp/xsk_diag.c |  3 +++
 2 files changed, 46 insertions(+), 18 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues
  2019-08-22  9:13 [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes Björn Töpel
@ 2019-08-22  9:13 ` Björn Töpel
       [not found]   ` <5d5e980f.1c69fb81.f8d9b.71f2SMTPIN_ADDED_MISSING@mx.google.com>
  2019-08-23 16:43   ` Jonathan Lemon
  2019-08-22  9:13 ` [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state Björn Töpel
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Björn Töpel @ 2019-08-22  9:13 UTC (permalink / raw)
  To: ast, daniel, netdev
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, bpf,
	jonathan.lemon, syzbot+c82697e3043781e08802, hdanton, i.maximets

From: Björn Töpel <bjorn.topel@intel.com>

Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid
potential store-tearing. These members are read outside of the control
mutex in the mmap implementation.

Fixes: 37b076933a8e ("xsk: add missing write- and data-dependency barrier")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index ee4428a892fa..f3351013c2a5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -409,7 +409,7 @@ static int xsk_init_queue(u32 entries, struct xsk_queue **queue,
 
 	/* Make sure queue is ready before it can be seen by others */
 	smp_wmb();
-	*queue = q;
+	WRITE_ONCE(*queue, q);
 	return 0;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state
  2019-08-22  9:13 [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes Björn Töpel
  2019-08-22  9:13 ` [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
@ 2019-08-22  9:13 ` Björn Töpel
  2019-08-23 16:46   ` Jonathan Lemon
  2019-08-22  9:13 ` [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem Björn Töpel
  2019-08-22  9:13 ` [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface Björn Töpel
  3 siblings, 1 reply; 11+ messages in thread
From: Björn Töpel @ 2019-08-22  9:13 UTC (permalink / raw)
  To: ast, daniel, netdev
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, bpf,
	jonathan.lemon, syzbot+c82697e3043781e08802, hdanton, i.maximets

From: Björn Töpel <bjorn.topel@intel.com>

The state variable was read, and written outside the control mutex
(struct xdp_sock, mutex), without proper barriers and {READ,
WRITE}_ONCE correctness.

In this commit this issue is addressed, and the state member is now
used a point of synchronization whether the socket is setup correctly
or not.

This also fixes a race, found by syzcaller, in xsk_poll() where umem
could be accessed when stale.

Suggested-by: Hillf Danton <hdanton@sina.com>
Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk.c | 57 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index f3351013c2a5..31236e61069b 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -162,10 +162,23 @@ static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 	return err;
 }
 
+static bool xsk_is_bound(struct xdp_sock *xs)
+{
+	if (READ_ONCE(xs->state) == XSK_BOUND) {
+		/* Matches smp_wmb() in bind(). */
+		smp_rmb();
+		return true;
+	}
+	return false;
+}
+
 int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
 	u32 len;
 
+	if (!xsk_is_bound(xs))
+		return -EINVAL;
+
 	if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
 		return -EINVAL;
 
@@ -362,6 +375,8 @@ static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
 	struct sock *sk = sock->sk;
 	struct xdp_sock *xs = xdp_sk(sk);
 
+	if (unlikely(!xsk_is_bound(xs)))
+		return -ENXIO;
 	if (unlikely(!xs->dev))
 		return -ENXIO;
 	if (unlikely(!(xs->dev->flags & IFF_UP)))
@@ -378,10 +393,15 @@ static unsigned int xsk_poll(struct file *file, struct socket *sock,
 			     struct poll_table_struct *wait)
 {
 	unsigned int mask = datagram_poll(file, sock, wait);
-	struct sock *sk = sock->sk;
-	struct xdp_sock *xs = xdp_sk(sk);
-	struct net_device *dev = xs->dev;
-	struct xdp_umem *umem = xs->umem;
+	struct xdp_sock *xs = xdp_sk(sock->sk);
+	struct net_device *dev;
+	struct xdp_umem *umem;
+
+	if (unlikely(!xsk_is_bound(xs)))
+		return mask;
+
+	dev = xs->dev;
+	umem = xs->umem;
 
 	if (umem->need_wakeup)
 		dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
@@ -417,10 +437,9 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
 {
 	struct net_device *dev = xs->dev;
 
-	if (!dev || xs->state != XSK_BOUND)
+	if (xs->state != XSK_BOUND)
 		return;
-
-	xs->state = XSK_UNBOUND;
+	WRITE_ONCE(xs->state, XSK_UNBOUND);
 
 	/* Wait for driver to stop using the xdp socket. */
 	xdp_del_sk_umem(xs->umem, xs);
@@ -495,7 +514,9 @@ static int xsk_release(struct socket *sock)
 	local_bh_enable();
 
 	xsk_delete_from_maps(xs);
+	mutex_lock(&xs->mutex);
 	xsk_unbind_dev(xs);
+	mutex_unlock(&xs->mutex);
 
 	xskq_destroy(xs->rx);
 	xskq_destroy(xs->tx);
@@ -589,19 +610,18 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 		}
 
 		umem_xs = xdp_sk(sock->sk);
-		if (!umem_xs->umem) {
-			/* No umem to inherit. */
+		if (!xsk_is_bound(umem_xs)) {
 			err = -EBADF;
 			sockfd_put(sock);
 			goto out_unlock;
-		} else if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
+		}
+		if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
 			err = -EINVAL;
 			sockfd_put(sock);
 			goto out_unlock;
 		}
-
 		xdp_get_umem(umem_xs->umem);
-		xs->umem = umem_xs->umem;
+		WRITE_ONCE(xs->umem, umem_xs->umem);
 		sockfd_put(sock);
 	} else if (!xs->umem || !xdp_umem_validate_queues(xs->umem)) {
 		err = -EINVAL;
@@ -626,10 +646,15 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 	xdp_add_sk_umem(xs->umem, xs);
 
 out_unlock:
-	if (err)
+	if (err) {
 		dev_put(dev);
-	else
-		xs->state = XSK_BOUND;
+	} else {
+		/* Matches smp_rmb() in bind() for shared umem
+		 * sockets, and xsk_is_bound().
+		 */
+		smp_wmb();
+		WRITE_ONCE(xs->state, XSK_BOUND);
+	}
 out_release:
 	mutex_unlock(&xs->mutex);
 	rtnl_unlock();
@@ -869,7 +894,7 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 	unsigned long pfn;
 	struct page *qpg;
 
-	if (xs->state != XSK_READY)
+	if (READ_ONCE(xs->state) != XSK_READY)
 		return -EBUSY;
 
 	if (offset == XDP_PGOFF_RX_RING) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem
  2019-08-22  9:13 [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes Björn Töpel
  2019-08-22  9:13 ` [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
  2019-08-22  9:13 ` [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state Björn Töpel
@ 2019-08-22  9:13 ` Björn Töpel
  2019-08-23 16:44   ` Jonathan Lemon
  2019-08-22  9:13 ` [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface Björn Töpel
  3 siblings, 1 reply; 11+ messages in thread
From: Björn Töpel @ 2019-08-22  9:13 UTC (permalink / raw)
  To: ast, daniel, netdev
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, bpf,
	jonathan.lemon, syzbot+c82697e3043781e08802, hdanton, i.maximets

From: Björn Töpel <bjorn.topel@intel.com>

The umem member of struct xdp_sock is read outside of the control
mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid
potentional store-tearing.

Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 31236e61069b..6dde1857ed52 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -718,7 +718,7 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
 
 		/* Make sure umem is ready before it can be seen by others */
 		smp_wmb();
-		xs->umem = umem;
+		WRITE_ONCE(xs->umem, umem);
 		mutex_unlock(&xs->mutex);
 		return 0;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface
  2019-08-22  9:13 [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes Björn Töpel
                   ` (2 preceding siblings ...)
  2019-08-22  9:13 ` [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem Björn Töpel
@ 2019-08-22  9:13 ` Björn Töpel
  2019-08-23 16:44   ` Jonathan Lemon
  3 siblings, 1 reply; 11+ messages in thread
From: Björn Töpel @ 2019-08-22  9:13 UTC (permalink / raw)
  To: ast, daniel, netdev
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson, bpf,
	jonathan.lemon, syzbot+c82697e3043781e08802, hdanton, i.maximets

From: Björn Töpel <bjorn.topel@intel.com>

When accessing the members of an XDP socket, the control mutex should
be held. This commit fixes that.

Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk_diag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/xdp/xsk_diag.c b/net/xdp/xsk_diag.c
index d5e06c8e0cbf..c8f4f11edbbc 100644
--- a/net/xdp/xsk_diag.c
+++ b/net/xdp/xsk_diag.c
@@ -97,6 +97,7 @@ static int xsk_diag_fill(struct sock *sk, struct sk_buff *nlskb,
 	msg->xdiag_ino = sk_ino;
 	sock_diag_save_cookie(sk, msg->xdiag_cookie);
 
+	mutex_lock(&xs->mutex);
 	if ((req->xdiag_show & XDP_SHOW_INFO) && xsk_diag_put_info(xs, nlskb))
 		goto out_nlmsg_trim;
 
@@ -117,10 +118,12 @@ static int xsk_diag_fill(struct sock *sk, struct sk_buff *nlskb,
 	    sock_diag_put_meminfo(sk, nlskb, XDP_DIAG_MEMINFO))
 		goto out_nlmsg_trim;
 
+	mutex_unlock(&xs->mutex);
 	nlmsg_end(nlskb, nlh);
 	return 0;
 
 out_nlmsg_trim:
+	mutex_unlock(&xs->mutex);
 	nlmsg_cancel(nlskb, nlh);
 	return -EMSGSIZE;
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues
       [not found]   ` <5d5e980f.1c69fb81.f8d9b.71f2SMTPIN_ADDED_MISSING@mx.google.com>
@ 2019-08-22 13:51     ` Björn Töpel
  0 siblings, 0 replies; 11+ messages in thread
From: Björn Töpel @ 2019-08-22 13:51 UTC (permalink / raw)
  To: Hillf Danton
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	magnus.karlsson, bpf, jonathan.lemon,
	syzbot+c82697e3043781e08802, i.maximets

On Thu, 22 Aug 2019 at 15:26, Hillf Danton <hdanton@sina.com> wrote:
>
> >
>
> >    /* Make sure queue is ready before it can be seen by others */
>
> >    smp_wmb();
>
>
>
> Hehe, who put mb here and for what?
>

That was from an earlier commit, and it's a barrier paired with the
lock-less reading of queues in xsk_mmap. Uhm... not sure I answered
your question?


Björn




>
>
> >-   *queue = q;
>
> >+  WRITE_ONCE(*queue, q);
>
> >    return 0;
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues
  2019-08-22  9:13 ` [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
       [not found]   ` <5d5e980f.1c69fb81.f8d9b.71f2SMTPIN_ADDED_MISSING@mx.google.com>
@ 2019-08-23 16:43   ` Jonathan Lemon
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Lemon @ 2019-08-23 16:43 UTC (permalink / raw)
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	magnus.karlsson, bpf, syzbot+c82697e3043781e08802, hdanton,
	i.maximets



On 22 Aug 2019, at 2:13, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid
> potential store-tearing. These members are read outside of the control
> mutex in the mmap implementation.
>
> Fixes: 37b076933a8e ("xsk: add missing write- and data-dependency barrier")
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem
  2019-08-22  9:13 ` [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem Björn Töpel
@ 2019-08-23 16:44   ` Jonathan Lemon
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Lemon @ 2019-08-23 16:44 UTC (permalink / raw)
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	magnus.karlsson, bpf, syzbot+c82697e3043781e08802, hdanton,
	i.maximets



On 22 Aug 2019, at 2:13, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The umem member of struct xdp_sock is read outside of the control
> mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid
> potentional store-tearing.
>
> Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap")
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface
  2019-08-22  9:13 ` [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface Björn Töpel
@ 2019-08-23 16:44   ` Jonathan Lemon
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Lemon @ 2019-08-23 16:44 UTC (permalink / raw)
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	magnus.karlsson, bpf, syzbot+c82697e3043781e08802, hdanton,
	i.maximets



On 22 Aug 2019, at 2:13, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> When accessing the members of an XDP socket, the control mutex should
> be held. This commit fixes that.
>
> Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP")
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state
  2019-08-22  9:13 ` [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state Björn Töpel
@ 2019-08-23 16:46   ` Jonathan Lemon
  2019-08-25 17:06     ` Björn Töpel
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Lemon @ 2019-08-23 16:46 UTC (permalink / raw)
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	magnus.karlsson, bpf, syzbot+c82697e3043781e08802, hdanton,
	i.maximets



On 22 Aug 2019, at 2:13, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The state variable was read, and written outside the control mutex
> (struct xdp_sock, mutex), without proper barriers and {READ,
> WRITE}_ONCE correctness.
>
> In this commit this issue is addressed, and the state member is now
> used a point of synchronization whether the socket is setup correctly
> or not.
>
> This also fixes a race, found by syzcaller, in xsk_poll() where umem
> could be accessed when stale.
>
> Suggested-by: Hillf Danton <hdanton@sina.com>
> Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com
> Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP 
> rings")
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> ---
>  net/xdp/xsk.c | 57 
> ++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 41 insertions(+), 16 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index f3351013c2a5..31236e61069b 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -162,10 +162,23 @@ static int __xsk_rcv_zc(struct xdp_sock *xs, 
> struct xdp_buff *xdp, u32 len)
>  	return err;
>  }
>
> +static bool xsk_is_bound(struct xdp_sock *xs)
> +{
> +	if (READ_ONCE(xs->state) == XSK_BOUND) {
> +		/* Matches smp_wmb() in bind(). */
> +		smp_rmb();
> +		return true;
> +	}
> +	return false;
> +}
> +
>  int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
>  {
>  	u32 len;
>
> +	if (!xsk_is_bound(xs))
> +		return -EINVAL;
> +
>  	if (xs->dev != xdp->rxq->dev || xs->queue_id != 
> xdp->rxq->queue_index)
>  		return -EINVAL;
>
> @@ -362,6 +375,8 @@ static int xsk_sendmsg(struct socket *sock, struct 
> msghdr *m, size_t total_len)
>  	struct sock *sk = sock->sk;
>  	struct xdp_sock *xs = xdp_sk(sk);
>
> +	if (unlikely(!xsk_is_bound(xs)))
> +		return -ENXIO;
>  	if (unlikely(!xs->dev))
>  		return -ENXIO;

Can probably remove the xs->dev check now, replaced by checking 
xs->state, right?


>  	if (unlikely(!(xs->dev->flags & IFF_UP)))
> @@ -378,10 +393,15 @@ static unsigned int xsk_poll(struct file *file, 
> struct socket *sock,
>  			     struct poll_table_struct *wait)
>  {
>  	unsigned int mask = datagram_poll(file, sock, wait);
> -	struct sock *sk = sock->sk;
> -	struct xdp_sock *xs = xdp_sk(sk);
> -	struct net_device *dev = xs->dev;
> -	struct xdp_umem *umem = xs->umem;
> +	struct xdp_sock *xs = xdp_sk(sock->sk);
> +	struct net_device *dev;
> +	struct xdp_umem *umem;
> +
> +	if (unlikely(!xsk_is_bound(xs)))
> +		return mask;
> +
> +	dev = xs->dev;
> +	umem = xs->umem;
>
>  	if (umem->need_wakeup)
>  		dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
> @@ -417,10 +437,9 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
>  {
>  	struct net_device *dev = xs->dev;
>
> -	if (!dev || xs->state != XSK_BOUND)
> +	if (xs->state != XSK_BOUND)
>  		return;
> -
> -	xs->state = XSK_UNBOUND;
> +	WRITE_ONCE(xs->state, XSK_UNBOUND);
>
>  	/* Wait for driver to stop using the xdp socket. */
>  	xdp_del_sk_umem(xs->umem, xs);
> @@ -495,7 +514,9 @@ static int xsk_release(struct socket *sock)
>  	local_bh_enable();
>
>  	xsk_delete_from_maps(xs);
> +	mutex_lock(&xs->mutex);
>  	xsk_unbind_dev(xs);
> +	mutex_unlock(&xs->mutex);
>
>  	xskq_destroy(xs->rx);
>  	xskq_destroy(xs->tx);
> @@ -589,19 +610,18 @@ static int xsk_bind(struct socket *sock, struct 
> sockaddr *addr, int addr_len)
>  		}
>
>  		umem_xs = xdp_sk(sock->sk);
> -		if (!umem_xs->umem) {
> -			/* No umem to inherit. */
> +		if (!xsk_is_bound(umem_xs)) {
>  			err = -EBADF;
>  			sockfd_put(sock);
>  			goto out_unlock;
> -		} else if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
> +		}
> +		if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
>  			err = -EINVAL;
>  			sockfd_put(sock);
>  			goto out_unlock;
>  		}
> -
>  		xdp_get_umem(umem_xs->umem);
> -		xs->umem = umem_xs->umem;
> +		WRITE_ONCE(xs->umem, umem_xs->umem);
>  		sockfd_put(sock);
>  	} else if (!xs->umem || !xdp_umem_validate_queues(xs->umem)) {
>  		err = -EINVAL;
> @@ -626,10 +646,15 @@ static int xsk_bind(struct socket *sock, struct 
> sockaddr *addr, int addr_len)
>  	xdp_add_sk_umem(xs->umem, xs);
>
>  out_unlock:
> -	if (err)
> +	if (err) {
>  		dev_put(dev);
> -	else
> -		xs->state = XSK_BOUND;
> +	} else {
> +		/* Matches smp_rmb() in bind() for shared umem
> +		 * sockets, and xsk_is_bound().
> +		 */
> +		smp_wmb();
> +		WRITE_ONCE(xs->state, XSK_BOUND);
> +	}
>  out_release:
>  	mutex_unlock(&xs->mutex);
>  	rtnl_unlock();
> @@ -869,7 +894,7 @@ static int xsk_mmap(struct file *file, struct 
> socket *sock,
>  	unsigned long pfn;
>  	struct page *qpg;
>
> -	if (xs->state != XSK_READY)
> +	if (READ_ONCE(xs->state) != XSK_READY)
>  		return -EBUSY;
>
>  	if (offset == XDP_PGOFF_RX_RING) {
> -- 
> 2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state
  2019-08-23 16:46   ` Jonathan Lemon
@ 2019-08-25 17:06     ` Björn Töpel
  0 siblings, 0 replies; 11+ messages in thread
From: Björn Töpel @ 2019-08-25 17:06 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Alexei Starovoitov, Daniel Borkmann, Netdev,
	Björn Töpel, Karlsson, Magnus, Magnus Karlsson, bpf,
	syzbot+c82697e3043781e08802, Hillf Danton, Ilya Maximets

On Fri, 23 Aug 2019 at 18:46, Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 22 Aug 2019, at 2:13, Björn Töpel wrote:
>
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > The state variable was read, and written outside the control mutex
> > (struct xdp_sock, mutex), without proper barriers and {READ,
> > WRITE}_ONCE correctness.
> >
> > In this commit this issue is addressed, and the state member is now
> > used a point of synchronization whether the socket is setup correctly
> > or not.
> >
> > This also fixes a race, found by syzcaller, in xsk_poll() where umem
> > could be accessed when stale.
> >
> > Suggested-by: Hillf Danton <hdanton@sina.com>
> > Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com
> > Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP
> > rings")
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> > ---
> >  net/xdp/xsk.c | 57
> > ++++++++++++++++++++++++++++++++++++---------------
> >  1 file changed, 41 insertions(+), 16 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index f3351013c2a5..31236e61069b 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -162,10 +162,23 @@ static int __xsk_rcv_zc(struct xdp_sock *xs,
> > struct xdp_buff *xdp, u32 len)
> >       return err;
> >  }
> >
> > +static bool xsk_is_bound(struct xdp_sock *xs)
> > +{
> > +     if (READ_ONCE(xs->state) == XSK_BOUND) {
> > +             /* Matches smp_wmb() in bind(). */
> > +             smp_rmb();
> > +             return true;
> > +     }
> > +     return false;
> > +}
> > +
> >  int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
> >  {
> >       u32 len;
> >
> > +     if (!xsk_is_bound(xs))
> > +             return -EINVAL;
> > +
> >       if (xs->dev != xdp->rxq->dev || xs->queue_id !=
> > xdp->rxq->queue_index)
> >               return -EINVAL;
> >
> > @@ -362,6 +375,8 @@ static int xsk_sendmsg(struct socket *sock, struct
> > msghdr *m, size_t total_len)
> >       struct sock *sk = sock->sk;
> >       struct xdp_sock *xs = xdp_sk(sk);
> >
> > +     if (unlikely(!xsk_is_bound(xs)))
> > +             return -ENXIO;
> >       if (unlikely(!xs->dev))
> >               return -ENXIO;
>
> Can probably remove the xs->dev check now, replaced by checking
> xs->state, right?
>

Indeed! Thanks for catching; I'll send out a v2.


Björn

>
> >       if (unlikely(!(xs->dev->flags & IFF_UP)))
> > @@ -378,10 +393,15 @@ static unsigned int xsk_poll(struct file *file,
> > struct socket *sock,
> >                            struct poll_table_struct *wait)
> >  {
> >       unsigned int mask = datagram_poll(file, sock, wait);
> > -     struct sock *sk = sock->sk;
> > -     struct xdp_sock *xs = xdp_sk(sk);
> > -     struct net_device *dev = xs->dev;
> > -     struct xdp_umem *umem = xs->umem;
> > +     struct xdp_sock *xs = xdp_sk(sock->sk);
> > +     struct net_device *dev;
> > +     struct xdp_umem *umem;
> > +
> > +     if (unlikely(!xsk_is_bound(xs)))
> > +             return mask;
> > +
> > +     dev = xs->dev;
> > +     umem = xs->umem;
> >
> >       if (umem->need_wakeup)
> >               dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
> > @@ -417,10 +437,9 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
> >  {
> >       struct net_device *dev = xs->dev;
> >
> > -     if (!dev || xs->state != XSK_BOUND)
> > +     if (xs->state != XSK_BOUND)
> >               return;
> > -
> > -     xs->state = XSK_UNBOUND;
> > +     WRITE_ONCE(xs->state, XSK_UNBOUND);
> >
> >       /* Wait for driver to stop using the xdp socket. */
> >       xdp_del_sk_umem(xs->umem, xs);
> > @@ -495,7 +514,9 @@ static int xsk_release(struct socket *sock)
> >       local_bh_enable();
> >
> >       xsk_delete_from_maps(xs);
> > +     mutex_lock(&xs->mutex);
> >       xsk_unbind_dev(xs);
> > +     mutex_unlock(&xs->mutex);
> >
> >       xskq_destroy(xs->rx);
> >       xskq_destroy(xs->tx);
> > @@ -589,19 +610,18 @@ static int xsk_bind(struct socket *sock, struct
> > sockaddr *addr, int addr_len)
> >               }
> >
> >               umem_xs = xdp_sk(sock->sk);
> > -             if (!umem_xs->umem) {
> > -                     /* No umem to inherit. */
> > +             if (!xsk_is_bound(umem_xs)) {
> >                       err = -EBADF;
> >                       sockfd_put(sock);
> >                       goto out_unlock;
> > -             } else if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
> > +             }
> > +             if (umem_xs->dev != dev || umem_xs->queue_id != qid) {
> >                       err = -EINVAL;
> >                       sockfd_put(sock);
> >                       goto out_unlock;
> >               }
> > -
> >               xdp_get_umem(umem_xs->umem);
> > -             xs->umem = umem_xs->umem;
> > +             WRITE_ONCE(xs->umem, umem_xs->umem);
> >               sockfd_put(sock);
> >       } else if (!xs->umem || !xdp_umem_validate_queues(xs->umem)) {
> >               err = -EINVAL;
> > @@ -626,10 +646,15 @@ static int xsk_bind(struct socket *sock, struct
> > sockaddr *addr, int addr_len)
> >       xdp_add_sk_umem(xs->umem, xs);
> >
> >  out_unlock:
> > -     if (err)
> > +     if (err) {
> >               dev_put(dev);
> > -     else
> > -             xs->state = XSK_BOUND;
> > +     } else {
> > +             /* Matches smp_rmb() in bind() for shared umem
> > +              * sockets, and xsk_is_bound().
> > +              */
> > +             smp_wmb();
> > +             WRITE_ONCE(xs->state, XSK_BOUND);
> > +     }
> >  out_release:
> >       mutex_unlock(&xs->mutex);
> >       rtnl_unlock();
> > @@ -869,7 +894,7 @@ static int xsk_mmap(struct file *file, struct
> > socket *sock,
> >       unsigned long pfn;
> >       struct page *qpg;
> >
> > -     if (xs->state != XSK_READY)
> > +     if (READ_ONCE(xs->state) != XSK_READY)
> >               return -EBUSY;
> >
> >       if (offset == XDP_PGOFF_RX_RING) {
> > --
> > 2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-08-25 17:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-22  9:13 [PATCH bpf-next 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE fixes Björn Töpel
2019-08-22  9:13 ` [PATCH bpf-next 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
     [not found]   ` <5d5e980f.1c69fb81.f8d9b.71f2SMTPIN_ADDED_MISSING@mx.google.com>
2019-08-22 13:51     ` Björn Töpel
2019-08-23 16:43   ` Jonathan Lemon
2019-08-22  9:13 ` [PATCH bpf-next 2/4] xsk: add proper barriers and {READ, WRITE}_ONCE-correctness for state Björn Töpel
2019-08-23 16:46   ` Jonathan Lemon
2019-08-25 17:06     ` Björn Töpel
2019-08-22  9:13 ` [PATCH bpf-next 3/4] xsk: avoid store-tearing when assigning umem Björn Töpel
2019-08-23 16:44   ` Jonathan Lemon
2019-08-22  9:13 ` [PATCH bpf-next 4/4] xsk: lock the control mutex in sock_diag interface Björn Töpel
2019-08-23 16:44   ` Jonathan Lemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).