* [PATCH bpf 0/3] AF_XDP Socket Creation Fixes
@ 2021-03-24 14:13 Ciara Loftus
  2021-03-24 14:13 ` [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing Ciara Loftus
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ciara Loftus @ 2021-03-24 14:13 UTC
  To: netdev, bpf, magnus.karlsson, bjorn; +Cc: Ciara Loftus

This series fixes some issues around socket creation for AF_XDP.

Patch 1 fixes a potential NULL pointer dereference in
xsk_socket__create_shared.

Patch 2 ensures that the umem passed to xsk_socket__create(_shared)
remains unchanged in the event of failure.

Patch 3 makes it possible for xsk_socket__create(_shared) to
succeed even if the rx and tx XDP rings have already been set up, by
ignoring the return value of the XDP_RX_RING/XDP_TX_RING setsockopt.
This removes a limitation which existed whereby a user could not retry
socket creation after a previous failed attempt.

The problem is solved by ignoring the return values in libbpf rather
than by modifying the setsockopt handling code in the kernel, so that
the fix is available on all kernels, provided a new enough libbpf is
in use.
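
As a rough illustration of the retry scenario this series is meant to
enable (not part of the patches), a caller could wrap socket creation
in a bounded retry loop. The helper and callback names below are
hypothetical; in real use the callback would presumably wrap
xsk_socket__create() with the same umem on every attempt:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*create_fn)(void *arg);

/* Call create() until it succeeds or max_attempts is exhausted. */
static int create_with_retry(create_fn create, void *arg, int max_attempts)
{
	int err = -EINVAL;	/* returned if there is nothing to attempt */
	int i;

	if (!create)
		return -EFAULT;

	for (i = 0; i < max_attempts; i++) {
		err = create(arg);
		if (!err)
			break;
	}
	return err;
}

/* Stub for experimentation (assumption): fails until the counter hits zero. */
static int fail_n_times(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

The retry only makes sense if a failed attempt leaves the umem and the
socket rings in a reusable state, which is what patches 2 and 3 aim for.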

This series applies on commit 87d77e59d1ebc31850697341ab15ca013004b81b

Ciara Loftus (3):
  libbpf: ensure umem pointer is non-NULL before dereferencing
  libbpf: restore umem state after socket create failure
  libbpf: ignore return values of setsockopt for XDP rings.

 tools/lib/bpf/xsk.c | 66 +++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 29 deletions(-)

-- 
2.17.1



* [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing
  2021-03-24 14:13 [PATCH bpf 0/3] AF_XDP Socket Creation Fixes Ciara Loftus
@ 2021-03-24 14:13 ` Ciara Loftus
  2021-03-26  9:14   ` Magnus Karlsson
  2021-03-24 14:13 ` [PATCH bpf 2/3] libbpf: restore umem state after socket create failure Ciara Loftus
  2021-03-24 14:13 ` [PATCH bpf 3/3] libbpf: ignore return values of setsockopt for XDP rings Ciara Loftus
  2 siblings, 1 reply; 9+ messages in thread
From: Ciara Loftus @ 2021-03-24 14:13 UTC
  To: netdev, bpf, magnus.karlsson, bjorn; +Cc: Ciara Loftus

Calls to xsk_socket__create dereference the umem to access the
fill_save and comp_save pointers. Make sure the umem is non-NULL
before doing this.
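
A minimal sketch of why the guard is needed, using hypothetical
stand-in types rather than the libbpf definitions: the wrapper reads
two members of the umem to build the arguments for the shared variant,
so the NULL check has to happen in the wrapper itself, before those
members are evaluated.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libbpf ring and umem types. */
struct ring_model { int id; };

struct umem_model {
	struct ring_model *fill_save;
	struct ring_model *comp_save;
};

/* Models the shared variant, which validates its own arguments. */
static int create_shared_model(struct umem_model *umem,
			       struct ring_model *fill,
			       struct ring_model *comp)
{
	if (!umem || !fill || !comp)
		return -EFAULT;
	return 0;
}

/* Models the wrapper: check umem before reading umem->fill_save. */
static int create_model(struct umem_model *umem)
{
	if (!umem)
		return -EFAULT;

	return create_shared_model(umem, umem->fill_save, umem->comp_save);
}
```

Without the check in create_model(), a NULL umem would be dereferenced
while evaluating the call arguments, before the callee's own validation
has a chance to run.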

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 tools/lib/bpf/xsk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 526fc35c0b23..443b0cfb45e8 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -1019,6 +1019,9 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 		       struct xsk_ring_cons *rx, struct xsk_ring_prod *tx,
 		       const struct xsk_socket_config *usr_config)
 {
+	if (!umem)
+		return -EFAULT;
+
 	return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem,
 					 rx, tx, umem->fill_save,
 					 umem->comp_save, usr_config);
-- 
2.17.1



* [PATCH bpf 2/3] libbpf: restore umem state after socket create failure
  2021-03-24 14:13 [PATCH bpf 0/3] AF_XDP Socket Creation Fixes Ciara Loftus
  2021-03-24 14:13 ` [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing Ciara Loftus
@ 2021-03-24 14:13 ` Ciara Loftus
  2021-03-26  9:06   ` Magnus Karlsson
  2021-03-24 14:13 ` [PATCH bpf 3/3] libbpf: ignore return values of setsockopt for XDP rings Ciara Loftus
  2 siblings, 1 reply; 9+ messages in thread
From: Ciara Loftus @ 2021-03-24 14:13 UTC
  To: netdev, bpf, magnus.karlsson, bjorn; +Cc: Ciara Loftus

If the call to xsk_socket__create fails, the user may want to retry the
socket creation using the same umem. Ensure that the umem is in the
same state on exit if the call failed by restoring the _save pointers
and not unmapping the set of umem rings if those pointers are non-NULL.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 tools/lib/bpf/xsk.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 443b0cfb45e8..ec3c23299329 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -743,21 +743,23 @@ static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex,
 	return NULL;
 }
 
-static void xsk_put_ctx(struct xsk_ctx *ctx)
+static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
 {
 	struct xsk_umem *umem = ctx->umem;
 	struct xdp_mmap_offsets off;
 	int err;
 
 	if (--ctx->refcount == 0) {
-		err = xsk_get_mmap_offsets(umem->fd, &off);
-		if (!err) {
-			munmap(ctx->fill->ring - off.fr.desc,
-			       off.fr.desc + umem->config.fill_size *
-			       sizeof(__u64));
-			munmap(ctx->comp->ring - off.cr.desc,
-			       off.cr.desc + umem->config.comp_size *
-			       sizeof(__u64));
+		if (unmap) {
+			err = xsk_get_mmap_offsets(umem->fd, &off);
+			if (!err) {
+				munmap(ctx->fill->ring - off.fr.desc,
+				       off.fr.desc + umem->config.fill_size *
+				sizeof(__u64));
+				munmap(ctx->comp->ring - off.cr.desc,
+				       off.cr.desc + umem->config.comp_size *
+				sizeof(__u64));
+			}
 		}
 
 		list_del(&ctx->list);
@@ -854,6 +856,9 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 	struct xsk_socket *xsk;
 	struct xsk_ctx *ctx;
 	int err, ifindex;
+	struct xsk_ring_prod *fsave = umem->fill_save;
+	struct xsk_ring_cons *csave = umem->comp_save;
+	bool unmap = !fsave;
 
 	if (!umem || !xsk_ptr || !(rx || tx))
 		return -EFAULT;
@@ -1005,7 +1010,9 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 		munmap(rx_map, off.rx.desc +
 		       xsk->config.rx_size * sizeof(struct xdp_desc));
 out_put_ctx:
-	xsk_put_ctx(ctx);
+	umem->fill_save = fsave;
+	umem->comp_save = csave;
+	xsk_put_ctx(ctx, unmap);
 out_socket:
 	if (--umem->refcount)
 		close(xsk->fd);
@@ -1071,7 +1078,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
 		}
 	}
 
-	xsk_put_ctx(ctx);
+	xsk_put_ctx(ctx, true);
 
 	umem->refcount--;
 	/* Do not close an fd that also has an associated umem connected
-- 
2.17.1



* [PATCH bpf 3/3] libbpf: ignore return values of setsockopt for XDP rings.
  2021-03-24 14:13 [PATCH bpf 0/3] AF_XDP Socket Creation Fixes Ciara Loftus
  2021-03-24 14:13 ` [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing Ciara Loftus
  2021-03-24 14:13 ` [PATCH bpf 2/3] libbpf: restore umem state after socket create failure Ciara Loftus
@ 2021-03-24 14:13 ` Ciara Loftus
  2021-03-26  9:14   ` Magnus Karlsson
  2 siblings, 1 reply; 9+ messages in thread
From: Ciara Loftus @ 2021-03-24 14:13 UTC
  To: netdev, bpf, magnus.karlsson, bjorn; +Cc: Ciara Loftus

During xsk_socket__create the XDP_RX_RING and XDP_TX_RING setsockopts
are called to create the rx and tx rings for the AF_XDP socket. If the ring
has already been set up, the setsockopt will return an error. However,
in the event of a failure during xsk_socket__create(_shared) after the
rings have been set up, the user may wish to retry the socket creation
using these pre-existing rings. In this case we can ignore the error
returned by the setsockopts. If there is a true error, the subsequent
call to mmap() will catch it.
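
The reasoning can be modeled in isolation. The sketch below does not
call the real setsockopt() or mmap(); it uses a hypothetical socket
model in which ring creation fails when the ring already exists and
mapping fails when it does not. The specific error code is an
assumption made for illustration:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a socket whose ring can only be created once. */
struct sock_model {
	bool ring_created;
};

/* Models the XDP_RX_RING/XDP_TX_RING setsockopt: fails on a second call. */
static int model_set_ring(struct sock_model *s)
{
	if (s->ring_created)
		return -EINVAL;	/* assumed error code */
	s->ring_created = true;
	return 0;
}

/* Models the later mmap(): it can only succeed if the ring exists. */
static int model_map_ring(const struct sock_model *s)
{
	return s->ring_created ? 0 : -EINVAL;
}

/* Ring setup as proposed: ignore the create result, rely on the mapping. */
static int model_setup(struct sock_model *s)
{
	(void)model_set_ring(s);	/* result deliberately ignored */
	return model_map_ring(s);
}
```

In this model a first call and a retry both succeed, while a socket
whose ring never came into existence still fails at the mapping step.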

Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 tools/lib/bpf/xsk.c | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index ec3c23299329..1f1c4c11c292 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -904,24 +904,22 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 	}
 	xsk->ctx = ctx;
 
-	if (rx) {
-		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
-				 &xsk->config.rx_size,
-				 sizeof(xsk->config.rx_size));
-		if (err) {
-			err = -errno;
-			goto out_put_ctx;
-		}
-	}
-	if (tx) {
-		err = setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
-				 &xsk->config.tx_size,
-				 sizeof(xsk->config.tx_size));
-		if (err) {
-			err = -errno;
-			goto out_put_ctx;
-		}
-	}
+	/* The return values of these setsockopt calls are intentionally not checked.
+	 * If the ring has already been set up setsockopt will return an error. However,
+	 * this scenario is acceptable as the user may be retrying the socket creation
+	 * with rings which were set up in a previous but ultimately unsuccessful call
+	 * to xsk_socket__create(_shared). The call later to mmap() will fail if there
+	 * is a real issue and we handle that return value appropriately there.
+	 */
+	if (rx)
+		setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
+			   &xsk->config.rx_size,
+			   sizeof(xsk->config.rx_size));
+
+	if (tx)
+		setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
+			   &xsk->config.tx_size,
+			   sizeof(xsk->config.tx_size));
 
 	err = xsk_get_mmap_offsets(xsk->fd, &off);
 	if (err) {
-- 
2.17.1



* Re: [PATCH bpf 2/3] libbpf: restore umem state after socket create failure
  2021-03-24 14:13 ` [PATCH bpf 2/3] libbpf: restore umem state after socket create failure Ciara Loftus
@ 2021-03-26  9:06   ` Magnus Karlsson
  2021-03-26 14:56     ` Loftus, Ciara
  0 siblings, 1 reply; 9+ messages in thread
From: Magnus Karlsson @ 2021-03-26  9:06 UTC
  To: Ciara Loftus
  Cc: Network Development, bpf, Karlsson, Magnus, Björn Töpel

On Wed, Mar 24, 2021 at 3:46 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
>
> If the call to xsk_socket__create fails, the user may want to retry the
> socket creation using the same umem. Ensure that the umem is in the
> same state on exit if the call failed by restoring the _save pointers
> and not unmapping the set of umem rings if those pointers are non-NULL.
>
> Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  tools/lib/bpf/xsk.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 443b0cfb45e8..ec3c23299329 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -743,21 +743,23 @@ static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex,
>         return NULL;
>  }
>
> -static void xsk_put_ctx(struct xsk_ctx *ctx)
> +static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
>  {
>         struct xsk_umem *umem = ctx->umem;
>         struct xdp_mmap_offsets off;
>         int err;
>
>         if (--ctx->refcount == 0) {
> -               err = xsk_get_mmap_offsets(umem->fd, &off);
> -               if (!err) {
> -                       munmap(ctx->fill->ring - off.fr.desc,
> -                              off.fr.desc + umem->config.fill_size *
> -                              sizeof(__u64));
> -                       munmap(ctx->comp->ring - off.cr.desc,
> -                              off.cr.desc + umem->config.comp_size *
> -                              sizeof(__u64));
> +               if (unmap) {
> +                       err = xsk_get_mmap_offsets(umem->fd, &off);
> +                       if (!err) {
> +                               munmap(ctx->fill->ring - off.fr.desc,
> +                                      off.fr.desc + umem->config.fill_size *
> +                               sizeof(__u64));
> +                               munmap(ctx->comp->ring - off.cr.desc,
> +                                      off.cr.desc + umem->config.comp_size *
> +                               sizeof(__u64));
> +                       }
>                 }

By not unmapping these rings we actually leave more state after a
failed socket creation. So how about skipping this logic (and
everything below) and always unmap the rings at failure as before, but
we move the fill_save = NULL and comp_save = NULL from xsk_create_ctx
to the end of xsk_socket__create_shared just before the "return 0"
where we know that the whole operation has succeeded. This way the
mappings would be redone during the next xsk_socket__create and if
someone decides not to retry (for some reason) we do not leave two
mappings behind. Would simplify things. What do you think?
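
Editorial sketch of the "clear the _save pointers only on success"
idea described above, using hypothetical stand-in types and a
fail_late flag that simulates an error after the context has been set
up. It is a model of the ordering only, not the libbpf code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libbpf ring and umem types. */
struct ring_model { int id; };

struct umem_model {
	struct ring_model *fill_save;
	struct ring_model *comp_save;
};

/* fail_late simulates a failure after the context has been created. */
static int create_model(struct umem_model *umem, bool fail_late)
{
	if (!umem)
		return -EFAULT;

	/* Context creation would consume fill_save/comp_save here. */

	if (fail_late)
		return -EBUSY;	/* _save pointers untouched for a retry */

	/* The whole operation succeeded: only now clear the pointers. */
	umem->fill_save = NULL;
	umem->comp_save = NULL;
	return 0;
}
```

Because the pointers are cleared at a single point just before the
successful return, no error path needs to restore them.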

>
>                 list_del(&ctx->list);
> @@ -854,6 +856,9 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
>         struct xsk_socket *xsk;
>         struct xsk_ctx *ctx;
>         int err, ifindex;
> +       struct xsk_ring_prod *fsave = umem->fill_save;
> +       struct xsk_ring_cons *csave = umem->comp_save;
> +       bool unmap = !fsave;
>
>         if (!umem || !xsk_ptr || !(rx || tx))
>                 return -EFAULT;
> @@ -1005,7 +1010,9 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
>                 munmap(rx_map, off.rx.desc +
>                        xsk->config.rx_size * sizeof(struct xdp_desc));
>  out_put_ctx:
> -       xsk_put_ctx(ctx);
> +       umem->fill_save = fsave;
> +       umem->comp_save = csave;
> +       xsk_put_ctx(ctx, unmap);
>  out_socket:
>         if (--umem->refcount)
>                 close(xsk->fd);
> @@ -1071,7 +1078,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
>                 }
>         }
>
> -       xsk_put_ctx(ctx);
> +       xsk_put_ctx(ctx, true);
>
>         umem->refcount--;
>         /* Do not close an fd that also has an associated umem connected
> --
> 2.17.1
>


* Re: [PATCH bpf 3/3] libbpf: ignore return values of setsockopt for XDP rings.
  2021-03-24 14:13 ` [PATCH bpf 3/3] libbpf: ignore return values of setsockopt for XDP rings Ciara Loftus
@ 2021-03-26  9:14   ` Magnus Karlsson
  0 siblings, 0 replies; 9+ messages in thread
From: Magnus Karlsson @ 2021-03-26  9:14 UTC
  To: Ciara Loftus
  Cc: Network Development, bpf, Karlsson, Magnus, Björn Töpel

On Wed, Mar 24, 2021 at 3:46 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
>
> During xsk_socket__create the XDP_RX_RING and XDP_TX_RING setsockopts
> are called to create the rx and tx rings for the AF_XDP socket. If the ring
> has already been set up, the setsockopt will return an error. However,
> in the event of a failure during xsk_socket__create(_shared) after the
> rings have been set up, the user may wish to retry the socket creation
> using these pre-existing rings. In this case we can ignore the error
> returned by the setsockopts. If there is a true error, the subsequent
> call to mmap() will catch it.
>
> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  tools/lib/bpf/xsk.c | 34 ++++++++++++++++------------------
>  1 file changed, 16 insertions(+), 18 deletions(-)
>
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index ec3c23299329..1f1c4c11c292 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -904,24 +904,22 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
>         }
>         xsk->ctx = ctx;
>
> -       if (rx) {
> -               err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
> -                                &xsk->config.rx_size,
> -                                sizeof(xsk->config.rx_size));
> -               if (err) {
> -                       err = -errno;
> -                       goto out_put_ctx;
> -               }
> -       }
> -       if (tx) {
> -               err = setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
> -                                &xsk->config.tx_size,
> -                                sizeof(xsk->config.tx_size));
> -               if (err) {
> -                       err = -errno;
> -                       goto out_put_ctx;
> -               }
> -       }
> +       /* The return values of these setsockopt calls are intentionally not checked.
> +        * If the ring has already been set up setsockopt will return an error. However,
> +        * this scenario is acceptable as the user may be retrying the socket creation
> +        * with rings which were set up in a previous but ultimately unsuccessful call
> +        * to xsk_socket__create(_shared). The call later to mmap() will fail if there
> +        * is a real issue and we handle that return value appropriately there.
> +        */
> +       if (rx)
> +               setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
> +                          &xsk->config.rx_size,
> +                          sizeof(xsk->config.rx_size));
> +
> +       if (tx)
> +               setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
> +                          &xsk->config.tx_size,
> +                          sizeof(xsk->config.tx_size));

Thanks Ciara!

This is a pragmatic solution, but I do not see a better way around
it, since these operations are irreversible. It works without any
kernel fix, which is good, and the comment explains things clearly.
That said, it would be nice, as a follow-up for bpf-next, to return
a unique error value (among the ones this function can return) when
the rings have already been mapped. That way the user can react to
this in a more informed way in the future.

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

>
>         err = xsk_get_mmap_offsets(xsk->fd, &off);
>         if (err) {
> --
> 2.17.1
>


* Re: [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing
  2021-03-24 14:13 ` [PATCH bpf 1/3] libbpf: ensure umem pointer is non-NULL before dereferencing Ciara Loftus
@ 2021-03-26  9:14   ` Magnus Karlsson
  0 siblings, 0 replies; 9+ messages in thread
From: Magnus Karlsson @ 2021-03-26  9:14 UTC
  To: Ciara Loftus
  Cc: Network Development, bpf, Karlsson, Magnus, Björn Töpel

On Wed, Mar 24, 2021 at 3:46 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
>
> Calls to xsk_socket__create dereference the umem to access the
> fill_save and comp_save pointers. Make sure the umem is non-NULL
> before doing this.
>
> Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  tools/lib/bpf/xsk.c | 3 +++
>  1 file changed, 3 insertions(+)

Thank you for the fix!

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 526fc35c0b23..443b0cfb45e8 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -1019,6 +1019,9 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>                        struct xsk_ring_cons *rx, struct xsk_ring_prod *tx,
>                        const struct xsk_socket_config *usr_config)
>  {
> +       if (!umem)
> +               return -EFAULT;
> +
>         return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem,
>                                          rx, tx, umem->fill_save,
>                                          umem->comp_save, usr_config);
> --
> 2.17.1
>


* RE: [PATCH bpf 2/3] libbpf: restore umem state after socket create failure
  2021-03-26  9:06   ` Magnus Karlsson
@ 2021-03-26 14:56     ` Loftus, Ciara
  2021-03-26 15:20       ` Magnus Karlsson
  0 siblings, 1 reply; 9+ messages in thread
From: Loftus, Ciara @ 2021-03-26 14:56 UTC
  To: Magnus Karlsson
  Cc: Network Development, bpf, Karlsson, Magnus, Björn Töpel

> 
> On Wed, Mar 24, 2021 at 3:46 PM Ciara Loftus <ciara.loftus@intel.com>
> wrote:
> >
> > If the call to xsk_socket__create fails, the user may want to retry the
> > socket creation using the same umem. Ensure that the umem is in the
> > same state on exit if the call failed by restoring the _save pointers
> > and not unmapping the set of umem rings if those pointers are non-NULL.
> >
> > Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and
> devices")
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> > ---
> >  tools/lib/bpf/xsk.c | 29 ++++++++++++++++++-----------
> >  1 file changed, 18 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > index 443b0cfb45e8..ec3c23299329 100644
> > --- a/tools/lib/bpf/xsk.c
> > +++ b/tools/lib/bpf/xsk.c
> > @@ -743,21 +743,23 @@ static struct xsk_ctx *xsk_get_ctx(struct
> xsk_umem *umem, int ifindex,
> >         return NULL;
> >  }
> >
> > -static void xsk_put_ctx(struct xsk_ctx *ctx)
> > +static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
> >  {
> >         struct xsk_umem *umem = ctx->umem;
> >         struct xdp_mmap_offsets off;
> >         int err;
> >
> >         if (--ctx->refcount == 0) {
> > -               err = xsk_get_mmap_offsets(umem->fd, &off);
> > -               if (!err) {
> > -                       munmap(ctx->fill->ring - off.fr.desc,
> > -                              off.fr.desc + umem->config.fill_size *
> > -                              sizeof(__u64));
> > -                       munmap(ctx->comp->ring - off.cr.desc,
> > -                              off.cr.desc + umem->config.comp_size *
> > -                              sizeof(__u64));
> > +               if (unmap) {
> > +                       err = xsk_get_mmap_offsets(umem->fd, &off);
> > +                       if (!err) {
> > +                               munmap(ctx->fill->ring - off.fr.desc,
> > +                                      off.fr.desc + umem->config.fill_size *
> > +                               sizeof(__u64));
> > +                               munmap(ctx->comp->ring - off.cr.desc,
> > +                                      off.cr.desc + umem->config.comp_size *
> > +                               sizeof(__u64));
> > +                       }
> >                 }
> 
> By not unmapping these rings we actually leave more state after a
> failed socket creation. So how about skipping this logic (and

In the case of the _save rings, the maps existed before the call to
xsk_socket__create. They were created during xsk_umem__create.
So we should preserve these maps in the event of failure.
I was using the wrong condition to trigger the unmap in v1, however.
We should unmap 'fill' only if
        umem->fill_save != fill
I will update this in a v2.
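
Editorial sketch of the condition described above, with hypothetical
stand-in types: on the error path, a ring is torn down only if it is
not the pre-existing one saved in the umem. The mapped flag stands in
for the real munmap() call:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; mapped models whether the ring is still mmap'ed. */
struct ring_model { bool mapped; };

struct umem_model {
	struct ring_model *fill_save;
};

/* Unmap the fill ring on failure only if this call created it. */
static void cleanup_fill(struct umem_model *umem, struct ring_model *fill)
{
	if (!umem || !fill)
		return;

	if (umem->fill_save != fill)
		fill->mapped = false;	/* stands in for munmap() */
}
```

A ring that was set up together with the umem therefore survives a
failed socket creation, while a ring mapped only for the failed attempt
is released.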

> everything below) and always unmap the rings at failure as before, but
> we move the fill_save = NULL and comp_save = NULL from xsk_create_ctx
> to the end of xsk_socket__create_shared just before the "return 0"
> where we know that the whole operation has succeeded. This way the

I think moving these still makes sense and will add this in the next rev.

Thanks for the feedback and suggestions!

Ciara

> mappings would be redone during the next xsk_socket__create and if
> someone decides not to retry (for some reason) we do not leave two
> mappings behind. Would simplify things. What do you think?

> 
> >
> >                 list_del(&ctx->list);
> > @@ -854,6 +856,9 @@ int xsk_socket__create_shared(struct xsk_socket
> **xsk_ptr,
> >         struct xsk_socket *xsk;
> >         struct xsk_ctx *ctx;
> >         int err, ifindex;
> > +       struct xsk_ring_prod *fsave = umem->fill_save;
> > +       struct xsk_ring_cons *csave = umem->comp_save;
> > +       bool unmap = !fsave;
> >
> >         if (!umem || !xsk_ptr || !(rx || tx))
> >                 return -EFAULT;
> > @@ -1005,7 +1010,9 @@ int xsk_socket__create_shared(struct xsk_socket
> **xsk_ptr,
> >                 munmap(rx_map, off.rx.desc +
> >                        xsk->config.rx_size * sizeof(struct xdp_desc));
> >  out_put_ctx:
> > -       xsk_put_ctx(ctx);
> > +       umem->fill_save = fsave;
> > +       umem->comp_save = csave;
> > +       xsk_put_ctx(ctx, unmap);
> >  out_socket:
> >         if (--umem->refcount)
> >                 close(xsk->fd);
> > @@ -1071,7 +1078,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
> >                 }
> >         }
> >
> > -       xsk_put_ctx(ctx);
> > +       xsk_put_ctx(ctx, true);
> >
> >         umem->refcount--;
> >         /* Do not close an fd that also has an associated umem connected
> > --
> > 2.17.1
> >


* Re: [PATCH bpf 2/3] libbpf: restore umem state after socket create failure
  2021-03-26 14:56     ` Loftus, Ciara
@ 2021-03-26 15:20       ` Magnus Karlsson
  0 siblings, 0 replies; 9+ messages in thread
From: Magnus Karlsson @ 2021-03-26 15:20 UTC
  To: Loftus, Ciara
  Cc: Network Development, bpf, Karlsson, Magnus, Björn Töpel

On Fri, Mar 26, 2021 at 3:56 PM Loftus, Ciara <ciara.loftus@intel.com> wrote:
>
> >
> > On Wed, Mar 24, 2021 at 3:46 PM Ciara Loftus <ciara.loftus@intel.com>
> > wrote:
> > >
> > > If the call to xsk_socket__create fails, the user may want to retry the
> > > socket creation using the same umem. Ensure that the umem is in the
> > > same state on exit if the call failed by restoring the _save pointers
> > > and not unmapping the set of umem rings if those pointers are non-NULL.
> > >
> > > Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and
> > devices")
> > >
> > > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> > > ---
> > >  tools/lib/bpf/xsk.c | 29 ++++++++++++++++++-----------
> > >  1 file changed, 18 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > > index 443b0cfb45e8..ec3c23299329 100644
> > > --- a/tools/lib/bpf/xsk.c
> > > +++ b/tools/lib/bpf/xsk.c
> > > @@ -743,21 +743,23 @@ static struct xsk_ctx *xsk_get_ctx(struct
> > xsk_umem *umem, int ifindex,
> > >         return NULL;
> > >  }
> > >
> > > -static void xsk_put_ctx(struct xsk_ctx *ctx)
> > > +static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
> > >  {
> > >         struct xsk_umem *umem = ctx->umem;
> > >         struct xdp_mmap_offsets off;
> > >         int err;
> > >
> > >         if (--ctx->refcount == 0) {
> > > -               err = xsk_get_mmap_offsets(umem->fd, &off);
> > > -               if (!err) {
> > > -                       munmap(ctx->fill->ring - off.fr.desc,
> > > -                              off.fr.desc + umem->config.fill_size *
> > > -                              sizeof(__u64));
> > > -                       munmap(ctx->comp->ring - off.cr.desc,
> > > -                              off.cr.desc + umem->config.comp_size *
> > > -                              sizeof(__u64));
> > > +               if (unmap) {
> > > +                       err = xsk_get_mmap_offsets(umem->fd, &off);
> > > +                       if (!err) {
> > > +                               munmap(ctx->fill->ring - off.fr.desc,
> > > +                                      off.fr.desc + umem->config.fill_size *
> > > +                               sizeof(__u64));
> > > +                               munmap(ctx->comp->ring - off.cr.desc,
> > > +                                      off.cr.desc + umem->config.comp_size *
> > > +                               sizeof(__u64));
> > > +                       }
> > >                 }
> >
> > By not unmapping these rings we actually leave more state after a
> > failed socket creation. So how about skipping this logic (and
>
> In the case of the _save rings, the maps existed before the call to
> xsk_socket__create. They were created during xsk_umem__create.
> So we should preserve these maps in the event of failure.
> I was using the wrong condition to trigger the unmap in v1, however.
> We should unmap 'fill' only if
>         umem->fill_save != fill
> I will update this in a v2.

Ahh, you are correct. There are two ways these rings can get allocated
so that has to be taken care of. Please ignore my comment.

> > everything below) and always unmap the rings at failure as before, but
> > we move the fill_save = NULL and comp_save = NULL from xsk_create_ctx
> > to the end of xsk_socket__create_shared just before the "return 0"
> > where we know that the whole operation has succeeded. This way the
>
> I think moving these still makes sense and will add this in the next rev.
>
> Thanks for the feedback and suggestions!
>
> Ciara
>
> > mappings would be redone during the next xsk_socket__create and if
> > someone decides not to retry (for some reason) we do not leave two
> > mappings behind. Would simplify things. What do you think?
>
> >
> > >
> > >                 list_del(&ctx->list);
> > > @@ -854,6 +856,9 @@ int xsk_socket__create_shared(struct xsk_socket
> > **xsk_ptr,
> > >         struct xsk_socket *xsk;
> > >         struct xsk_ctx *ctx;
> > >         int err, ifindex;
> > > +       struct xsk_ring_prod *fsave = umem->fill_save;
> > > +       struct xsk_ring_cons *csave = umem->comp_save;
> > > +       bool unmap = !fsave;
> > >
> > >         if (!umem || !xsk_ptr || !(rx || tx))
> > >                 return -EFAULT;
> > > @@ -1005,7 +1010,9 @@ int xsk_socket__create_shared(struct xsk_socket
> > **xsk_ptr,
> > >                 munmap(rx_map, off.rx.desc +
> > >                        xsk->config.rx_size * sizeof(struct xdp_desc));
> > >  out_put_ctx:
> > > -       xsk_put_ctx(ctx);
> > > +       umem->fill_save = fsave;
> > > +       umem->comp_save = csave;
> > > +       xsk_put_ctx(ctx, unmap);
> > >  out_socket:
> > >         if (--umem->refcount)
> > >                 close(xsk->fd);
> > > @@ -1071,7 +1078,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
> > >                 }
> > >         }
> > >
> > > -       xsk_put_ctx(ctx);
> > > +       xsk_put_ctx(ctx, true);
> > >
> > >         umem->refcount--;
> > >         /* Do not close an fd that also has an associated umem connected
> > > --
> > > 2.17.1
> > >

