bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 00/12] xsk: clean up ring access functions
@ 2019-12-19 12:39 Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 01/12] xsk: eliminate the lazy update threshold Magnus Karlsson
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

This patch set cleans up the ring access functions of AF_XDP in hope
that it will now be easier to understand and maintain. I used to get a
headache every time I looked at this code in order to really understand it,
but now I do think it is a lot less painful.

The code has been simplified a lot and as a bonus we get better
performance in nearly all cases. On my new 2.1 GHz Cascade Lake
machine with a standard default config plus AF_XDP support and
CONFIG_PREEMPT on I get the following results in percent performance
increases with this patch set compared to without it:

Zero-copy (-N):
          rxdrop        txpush        l2fwd
1 core:    -2%            0%            3%
2 cores:    4%            0%            3%

Zero-copy with poll() (-N -p):
          rxdrop        txpush        l2fwd
1 core:     3%            0%            1%
2 cores:   21%            0%            9%

Skb mode (-S):
Shows a 0% to 5% performance improvement over the same benchmarks as
above.

Here 1 core means that we are running the driver processing and the
application on the same core, while 2 cores means that they execute on
separate cores. The applications are from the xdpsock sample app.

On my older 2.0 Ghz Broadwell machine that I used for the v1, I get
the following results:

Zero-copy (-N):
          rxdrop        txpush        l2fwd
1 core:     4%            5%            4%
2 cores:    1%            0%            2%

Zero-copy with poll() (-N -p):
          rxdrop        txpush        l2fwd
1 core:     1%            3%            3%
2 cores:   22%            0%            5%

Skb mode (-S):
Shows a 0% to 1% performance improvement over the same benchmarks as
above.

When a results says 21 or 22% better, as in the case of poll mode with
2 cores and rxdrop, my first reaction is that it must be a
bug. Everything else shows between 0% and 5% performance
improvement. What is giving rise to 22%? A quick bisect indicates that
it is patches 2, 3, 4, 5, and 6 that are giving rise to most of this
improvement. So not one patch in particular, but something around 4%
improvement from each one of them. Note that exactly this benchmark
has previously had an extraordinary slow down compared to when running
without poll syscalls. For all the other poll tests above, the
slowdown has always been around 4% for using poll syscalls. But with
the bad performing test in question, it was above 25%. Interestingly,
after this clean up, the slow down is 4%, just like all the other poll
tests. Please take an extra peek at this so I have not messed up
something.

The 0% for several txpush results are due to the test bottlenecking on
a non-CPU HW resource. If I eliminated that bottleneck on my system, I
would expect to see an increase there too.

Changes v1 -> v2:
* Corrected textual errors in the commit logs (Sergei and Martin)
* Fixed the functions that detect empty and full rings so that they
  now operate on the global ring state (Maxim)

This patch has been applied against commit a352a82496d1 ("Merge branch 'libbpf-extern-followups'")

Structure of the patch set:

Patch 1: Eliminate the lazy update threshold used when preallocating
         entries in the completion ring
Patch 2: Simplify the detection of empty and full rings
Patch 3: Consolidate the two local producer pointers into one
Patch 4: Standardize the naming of the producer ring access functions
Patch 5: Eliminate the Rx batch size used for the fill ring
Patch 6: Simplify the functions xskq_nb_avail and xskq_nb_free
Patch 7: Simplify and standardize the naming of the consumer ring
         access functions
Patch 8: Change the names of the validation functions to improve
         readability and also the return value of these functions
Patch 9: Change the name of xsk_umem_discard_addr() to
         xsk_umem_release_addr() to better reflect the new
         names. Requires a name change in the drivers that support AF_XDP
         zero-copy.
Patch 10: Remove unnecessary READ_ONCE of data in the ring
Patch 11: Add overall function naming comment and reorder the functions
          for easier reference
Patch 12: Use the struct_size helper function when allocating rings

Thanks: Magnus

Magnus Karlsson (12):
  xsk: eliminate the lazy update threshold
  xsk: simplify detection of empty and full rings
  xsk: consolidate to one single cached producer pointer
  xsk: standardize naming of producer ring access functions
  xsk: eliminate the RX batch size
  xsk: simplify xskq_nb_avail and xskq_nb_free
  xsk: simplify the consumer ring access functions
  xsk: change names of validation functions
  xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to
    xsk_umem_release_addr
  xsk: remove unnecessary READ_ONCE of data
  xsk: add function naming comments and reorder functions
  xsk: use struct_size() helper

 drivers/net/ethernet/intel/i40e/i40e_xsk.c         |   4 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c           |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c       |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   2 +-
 include/net/xdp_sock.h                             |  14 +-
 net/xdp/xsk.c                                      |  62 ++--
 net/xdp/xsk_queue.c                                |  15 +-
 net/xdp/xsk_queue.h                                | 371 +++++++++++----------
 8 files changed, 246 insertions(+), 230 deletions(-)

--
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 01/12] xsk: eliminate the lazy update threshold
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 02/12] xsk: simplify detection of empty and full rings Magnus Karlsson
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

The lazy update threshold was introduced to keep the producer and
consumer some distance apart in the completion ring. This was
important in the beginning of the development of AF_XDP as the ring
format as that point in time was very sensitive to the producer and
consumer being on the same cache line. This is not the case
anymore as the current ring format does not degrade in any noticeable
way when this happens. Moreover, this threshold makes it impossible
to run the system with rings that have less than 128 entries.

So let us remove this threshold and just get one entry from the ring
as in all other functions. This will enable us to remove this function
in a later commit. Note that xskq_produce_addr_lazy followed by
xskq_produce_flush_addr_n are still not the same function as
xskq_produce_addr() as it operates on another cached pointer.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index eddae46..a2f0ba6 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -11,7 +11,6 @@
 #include <net/xdp_sock.h>
 
 #define RX_BATCH_SIZE 16
-#define LAZY_UPDATE_THRESHOLD 128
 
 struct xdp_ring {
 	u32 producer ____cacheline_aligned_in_smp;
@@ -239,7 +238,7 @@ static inline int xskq_produce_addr_lazy(struct xsk_queue *q, u64 addr)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
 
-	if (xskq_nb_free(q, q->prod_head, LAZY_UPDATE_THRESHOLD) == 0)
+	if (xskq_nb_free(q, q->prod_head, 1) == 0)
 		return -ENOSPC;
 
 	/* A, matches D */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 02/12] xsk: simplify detection of empty and full rings
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 01/12] xsk: eliminate the lazy update threshold Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 03/12] xsk: consolidate to one single cached producer pointer Magnus Karlsson
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

In order to set the correct return flags for poll, the xsk code has to
check if the Rx queue is empty and if the Tx queue is full. This code
was unnecessarily large and complex as it used the functions that are
used to update the local state from the global state (xskq_nb_free and
xskq_nb_avail). Since we are not doing this nor updating any data
dependent on this state, we can simplify the functions. Another
benefit from this is that we can also simplify the xskq_nb_free and
xskq_nb_avail functions in a later commit.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index a2f0ba6..b28f9da 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -362,12 +362,15 @@ static inline void xskq_produce_flush_desc(struct xsk_queue *q)
 
 static inline bool xskq_full_desc(struct xsk_queue *q)
 {
-	return xskq_nb_avail(q, q->nentries) == q->nentries;
+	/* No barriers needed since data is not accessed */
+	return READ_ONCE(q->ring->producer) - READ_ONCE(q->ring->consumer) ==
+		q->nentries;
 }
 
 static inline bool xskq_empty_desc(struct xsk_queue *q)
 {
-	return xskq_nb_free(q, q->prod_tail, q->nentries) == q->nentries;
+	/* No barriers needed since data is not accessed */
+	return READ_ONCE(q->ring->consumer) == READ_ONCE(q->ring->producer);
 }
 
 void xskq_set_umem(struct xsk_queue *q, u64 size, u64 chunk_mask);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 03/12] xsk: consolidate to one single cached producer pointer
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 01/12] xsk: eliminate the lazy update threshold Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 02/12] xsk: simplify detection of empty and full rings Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 04/12] xsk: standardize naming of producer ring access functions Magnus Karlsson
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Currently, the xsk ring code has two cached producer pointers:
prod_head and prod_tail. This patch consolidates these two into a
single one called cached_prod to make the code simpler and easier to
maintain. This will be in line with the user space part of the the
code found in libbpf, that only uses a single cached pointer.

The Rx path only uses the two top level functions
xskq_produce_batch_desc and xskq_produce_flush_desc and they both use
prod_head and never prod_tail. So just move them over to
cached_prod.

The Tx XDP_DRV path uses xskq_produce_addr_lazy and
xskq_produce_flush_addr_n and unnecessarily operates on both prod_tail
and prod_head, so move them over to just use cached_prod by skipping
the intermediate step of updating prod_tail.

The Tx path in XDP_SKB mode uses xskq_reserve_addr and
xskq_produce_addr. They currently use both cached pointers, but we can
operate on the global producer pointer in xskq_produce_addr since it
has to be updated anyway, thus eliminating the use of both cached
pointers. We can also remove the xskq_nb_free in xskq_produce_addr
since it is already called in xskq_reserve_addr. No need to do it
twice.

When there is only one cached producer pointer, we can also simplify
xskq_nb_free by removing one argument.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 47 +++++++++++++++++++++--------------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index b28f9da..7ad8074 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -35,8 +35,7 @@ struct xsk_queue {
 	u64 size;
 	u32 ring_mask;
 	u32 nentries;
-	u32 prod_head;
-	u32 prod_tail;
+	u32 cached_prod;
 	u32 cons_head;
 	u32 cons_tail;
 	struct xdp_ring *ring;
@@ -94,39 +93,39 @@ static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q)
 
 static inline u32 xskq_nb_avail(struct xsk_queue *q, u32 dcnt)
 {
-	u32 entries = q->prod_tail - q->cons_tail;
+	u32 entries = q->cached_prod - q->cons_tail;
 
 	if (entries == 0) {
 		/* Refresh the local pointer */
-		q->prod_tail = READ_ONCE(q->ring->producer);
-		entries = q->prod_tail - q->cons_tail;
+		q->cached_prod = READ_ONCE(q->ring->producer);
+		entries = q->cached_prod - q->cons_tail;
 	}
 
 	return (entries > dcnt) ? dcnt : entries;
 }
 
-static inline u32 xskq_nb_free(struct xsk_queue *q, u32 producer, u32 dcnt)
+static inline u32 xskq_nb_free(struct xsk_queue *q, u32 dcnt)
 {
-	u32 free_entries = q->nentries - (producer - q->cons_tail);
+	u32 free_entries = q->nentries - (q->cached_prod - q->cons_tail);
 
 	if (free_entries >= dcnt)
 		return free_entries;
 
 	/* Refresh the local tail pointer */
 	q->cons_tail = READ_ONCE(q->ring->consumer);
-	return q->nentries - (producer - q->cons_tail);
+	return q->nentries - (q->cached_prod - q->cons_tail);
 }
 
 static inline bool xskq_has_addrs(struct xsk_queue *q, u32 cnt)
 {
-	u32 entries = q->prod_tail - q->cons_tail;
+	u32 entries = q->cached_prod - q->cons_tail;
 
 	if (entries >= cnt)
 		return true;
 
 	/* Refresh the local pointer. */
-	q->prod_tail = READ_ONCE(q->ring->producer);
-	entries = q->prod_tail - q->cons_tail;
+	q->cached_prod = READ_ONCE(q->ring->producer);
+	entries = q->cached_prod - q->cons_tail;
 
 	return entries >= cnt;
 }
@@ -220,17 +219,15 @@ static inline void xskq_discard_addr(struct xsk_queue *q)
 static inline int xskq_produce_addr(struct xsk_queue *q, u64 addr)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
-
-	if (xskq_nb_free(q, q->prod_tail, 1) == 0)
-		return -ENOSPC;
+	unsigned int idx = q->ring->producer;
 
 	/* A, matches D */
-	ring->desc[q->prod_tail++ & q->ring_mask] = addr;
+	ring->desc[idx++ & q->ring_mask] = addr;
 
 	/* Order producer and data */
 	smp_wmb(); /* B, matches C */
 
-	WRITE_ONCE(q->ring->producer, q->prod_tail);
+	WRITE_ONCE(q->ring->producer, idx);
 	return 0;
 }
 
@@ -238,11 +235,11 @@ static inline int xskq_produce_addr_lazy(struct xsk_queue *q, u64 addr)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
 
-	if (xskq_nb_free(q, q->prod_head, 1) == 0)
+	if (xskq_nb_free(q, 1) == 0)
 		return -ENOSPC;
 
 	/* A, matches D */
-	ring->desc[q->prod_head++ & q->ring_mask] = addr;
+	ring->desc[q->cached_prod++ & q->ring_mask] = addr;
 	return 0;
 }
 
@@ -252,17 +249,16 @@ static inline void xskq_produce_flush_addr_n(struct xsk_queue *q,
 	/* Order producer and data */
 	smp_wmb(); /* B, matches C */
 
-	q->prod_tail += nb_entries;
-	WRITE_ONCE(q->ring->producer, q->prod_tail);
+	WRITE_ONCE(q->ring->producer, q->ring->producer + nb_entries);
 }
 
 static inline int xskq_reserve_addr(struct xsk_queue *q)
 {
-	if (xskq_nb_free(q, q->prod_head, 1) == 0)
+	if (xskq_nb_free(q, 1) == 0)
 		return -ENOSPC;
 
 	/* A, matches D */
-	q->prod_head++;
+	q->cached_prod++;
 	return 0;
 }
 
@@ -340,11 +336,11 @@ static inline int xskq_produce_batch_desc(struct xsk_queue *q,
 	struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
 	unsigned int idx;
 
-	if (xskq_nb_free(q, q->prod_head, 1) == 0)
+	if (xskq_nb_free(q, 1) == 0)
 		return -ENOSPC;
 
 	/* A, matches D */
-	idx = (q->prod_head++) & q->ring_mask;
+	idx = q->cached_prod++ & q->ring_mask;
 	ring->desc[idx].addr = addr;
 	ring->desc[idx].len = len;
 
@@ -356,8 +352,7 @@ static inline void xskq_produce_flush_desc(struct xsk_queue *q)
 	/* Order producer and data */
 	smp_wmb(); /* B, matches C */
 
-	q->prod_tail = q->prod_head;
-	WRITE_ONCE(q->ring->producer, q->prod_tail);
+	WRITE_ONCE(q->ring->producer, q->cached_prod);
 }
 
 static inline bool xskq_full_desc(struct xsk_queue *q)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 04/12] xsk: standardize naming of producer ring access functions
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (2 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 03/12] xsk: consolidate to one single cached producer pointer Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 05/12] xsk: eliminate the RX batch size Magnus Karlsson
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Adopt the naming of the producer ring access functions to have a
similar naming convention as the functions in libbpf, but adapted to
the kernel. You first reserve a number of entries that you later
submit to the global state of the ring. This is much clearer, IMO,
than the one that was in the kernel part. Once renamed, we also
discover that two functions are actually the same, so remove one of
them. Some of the primitive ring submission operations are also the
same so break these out into __xskq_prod_submit that the upper level
ring access functions can use.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c       | 20 +++++++++---------
 net/xdp/xsk_queue.h | 58 +++++++++++++++++++++++++----------------------------
 2 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 9567938..f93cf76 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -165,7 +165,7 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 
 	offset += metalen;
 	addr = xsk_umem_adjust_offset(xs->umem, addr, offset);
-	err = xskq_produce_batch_desc(xs->rx, addr, len);
+	err = xskq_prod_reserve_desc(xs->rx, addr, len);
 	if (!err) {
 		xskq_discard_addr(xs->umem->fq);
 		xdp_return_buff(xdp);
@@ -178,7 +178,7 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 
 static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 {
-	int err = xskq_produce_batch_desc(xs->rx, (u64)xdp->handle, len);
+	int err = xskq_prod_reserve_desc(xs->rx, xdp->handle, len);
 
 	if (err)
 		xs->rx_dropped++;
@@ -214,7 +214,7 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 
 static void xsk_flush(struct xdp_sock *xs)
 {
-	xskq_produce_flush_desc(xs->rx);
+	xskq_prod_submit(xs->rx);
 	xs->sk.sk_data_ready(&xs->sk);
 }
 
@@ -245,12 +245,12 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	memcpy(buffer, xdp->data_meta, len + metalen);
 
 	addr = xsk_umem_adjust_offset(xs->umem, addr, metalen);
-	err = xskq_produce_batch_desc(xs->rx, addr, len);
+	err = xskq_prod_reserve_desc(xs->rx, addr, len);
 	if (err)
 		goto out_drop;
 
 	xskq_discard_addr(xs->umem->fq);
-	xskq_produce_flush_desc(xs->rx);
+	xskq_prod_submit(xs->rx);
 
 	spin_unlock_bh(&xs->rx_lock);
 
@@ -295,7 +295,7 @@ void __xsk_map_flush(struct bpf_map *map)
 
 void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries)
 {
-	xskq_produce_flush_addr_n(umem->cq, nb_entries);
+	xskq_prod_submit_n(umem->cq, nb_entries);
 }
 EXPORT_SYMBOL(xsk_umem_complete_tx);
 
@@ -320,7 +320,7 @@ bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc)
 		if (!xskq_peek_desc(xs->tx, desc, umem))
 			continue;
 
-		if (xskq_produce_addr_lazy(umem->cq, desc->addr))
+		if (xskq_prod_reserve_addr(umem->cq, desc->addr))
 			goto out;
 
 		xskq_discard_desc(xs->tx);
@@ -349,7 +349,7 @@ static void xsk_destruct_skb(struct sk_buff *skb)
 	unsigned long flags;
 
 	spin_lock_irqsave(&xs->tx_completion_lock, flags);
-	WARN_ON_ONCE(xskq_produce_addr(xs->umem->cq, addr));
+	xskq_prod_submit_addr(xs->umem->cq, addr);
 	spin_unlock_irqrestore(&xs->tx_completion_lock, flags);
 
 	sock_wfree(skb);
@@ -390,7 +390,7 @@ static int xsk_generic_xmit(struct sock *sk)
 		addr = desc.addr;
 		buffer = xdp_umem_get_data(xs->umem, addr);
 		err = skb_store_bits(skb, 0, buffer, len);
-		if (unlikely(err) || xskq_reserve_addr(xs->umem->cq)) {
+		if (unlikely(err) || xskq_prod_reserve(xs->umem->cq)) {
 			kfree_skb(skb);
 			goto out;
 		}
@@ -471,7 +471,7 @@ static __poll_t xsk_poll(struct file *file, struct socket *sock,
 			__xsk_sendmsg(sk);
 	}
 
-	if (xs->rx && !xskq_empty_desc(xs->rx))
+	if (xs->rx && !xskq_prod_is_empty(xs->rx))
 		mask |= EPOLLIN | EPOLLRDNORM;
 	if (xs->tx && !xskq_full_desc(xs->tx))
 		mask |= EPOLLOUT | EPOLLWRNORM;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 7ad8074..1b9a350 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -216,22 +216,17 @@ static inline void xskq_discard_addr(struct xsk_queue *q)
 	q->cons_tail++;
 }
 
-static inline int xskq_produce_addr(struct xsk_queue *q, u64 addr)
+static inline int xskq_prod_reserve(struct xsk_queue *q)
 {
-	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
-	unsigned int idx = q->ring->producer;
+	if (xskq_nb_free(q, 1) == 0)
+		return -ENOSPC;
 
 	/* A, matches D */
-	ring->desc[idx++ & q->ring_mask] = addr;
-
-	/* Order producer and data */
-	smp_wmb(); /* B, matches C */
-
-	WRITE_ONCE(q->ring->producer, idx);
+	q->cached_prod++;
 	return 0;
 }
 
-static inline int xskq_produce_addr_lazy(struct xsk_queue *q, u64 addr)
+static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
 
@@ -243,23 +238,32 @@ static inline int xskq_produce_addr_lazy(struct xsk_queue *q, u64 addr)
 	return 0;
 }
 
-static inline void xskq_produce_flush_addr_n(struct xsk_queue *q,
-					     u32 nb_entries)
+static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
 {
 	/* Order producer and data */
 	smp_wmb(); /* B, matches C */
 
-	WRITE_ONCE(q->ring->producer, q->ring->producer + nb_entries);
+	WRITE_ONCE(q->ring->producer, idx);
 }
 
-static inline int xskq_reserve_addr(struct xsk_queue *q)
+static inline void xskq_prod_submit(struct xsk_queue *q)
 {
-	if (xskq_nb_free(q, 1) == 0)
-		return -ENOSPC;
+	__xskq_prod_submit(q, q->cached_prod);
+}
 
-	/* A, matches D */
-	q->cached_prod++;
-	return 0;
+static inline void xskq_prod_submit_addr(struct xsk_queue *q, u64 addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+	u32 idx = q->ring->producer;
+
+	ring->desc[idx++ & q->ring_mask] = addr;
+
+	__xskq_prod_submit(q, idx);
+}
+
+static inline void xskq_prod_submit_n(struct xsk_queue *q, u32 nb_entries)
+{
+	__xskq_prod_submit(q, q->ring->producer + nb_entries);
 }
 
 /* Rx/Tx queue */
@@ -330,11 +334,11 @@ static inline void xskq_discard_desc(struct xsk_queue *q)
 	q->cons_tail++;
 }
 
-static inline int xskq_produce_batch_desc(struct xsk_queue *q,
-					  u64 addr, u32 len)
+static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
+					 u64 addr, u32 len)
 {
 	struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
-	unsigned int idx;
+	u32 idx;
 
 	if (xskq_nb_free(q, 1) == 0)
 		return -ENOSPC;
@@ -347,14 +351,6 @@ static inline int xskq_produce_batch_desc(struct xsk_queue *q,
 	return 0;
 }
 
-static inline void xskq_produce_flush_desc(struct xsk_queue *q)
-{
-	/* Order producer and data */
-	smp_wmb(); /* B, matches C */
-
-	WRITE_ONCE(q->ring->producer, q->cached_prod);
-}
-
 static inline bool xskq_full_desc(struct xsk_queue *q)
 {
 	/* No barriers needed since data is not accessed */
@@ -362,7 +358,7 @@ static inline bool xskq_full_desc(struct xsk_queue *q)
 		q->nentries;
 }
 
-static inline bool xskq_empty_desc(struct xsk_queue *q)
+static inline bool xskq_prod_is_empty(struct xsk_queue *q)
 {
 	/* No barriers needed since data is not accessed */
 	return READ_ONCE(q->ring->consumer) == READ_ONCE(q->ring->producer);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 05/12] xsk: eliminate the RX batch size
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (3 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 04/12] xsk: standardize naming of producer ring access functions Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 06/12] xsk: simplify xskq_nb_avail and xskq_nb_free Magnus Karlsson
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

In the xsk consumer ring code there is a variable called RX_BATCH_SIZE
that dictates the minimum number of entries that we try to grab from
the fill and Tx rings. In fact, the code always try to grab the
maximum amount of entries from these rings. The only thing this
variable does is to throw an error if there is less than 16 (as it is
defined) entries on the ring. There is no reason to do this and it
will just lead to weird behavior from user space's point of view. So
eliminate this variable.

With this change, we will be able to simplify the xskq_nb_free and
xskq_nb_avail code in the next commit.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 1b9a350..6ca5fed 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -10,8 +10,6 @@
 #include <linux/if_xdp.h>
 #include <net/xdp_sock.h>
 
-#define RX_BATCH_SIZE 16
-
 struct xdp_ring {
 	u32 producer ____cacheline_aligned_in_smp;
 	u32 consumer ____cacheline_aligned_in_smp;
@@ -202,7 +200,7 @@ static inline u64 *xskq_peek_addr(struct xsk_queue *q, u64 *addr,
 	if (q->cons_tail == q->cons_head) {
 		smp_mb(); /* D, matches A */
 		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q, RX_BATCH_SIZE);
+		q->cons_head = q->cons_tail + xskq_nb_avail(q, 1);
 
 		/* Order consumer and data */
 		smp_rmb();
@@ -320,7 +318,7 @@ static inline struct xdp_desc *xskq_peek_desc(struct xsk_queue *q,
 	if (q->cons_tail == q->cons_head) {
 		smp_mb(); /* D, matches A */
 		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q, RX_BATCH_SIZE);
+		q->cons_head = q->cons_tail + xskq_nb_avail(q, 1);
 
 		/* Order consumer and data */
 		smp_rmb(); /* C, matches B */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 06/12] xsk: simplify xskq_nb_avail and xskq_nb_free
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (4 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 05/12] xsk: eliminate the RX batch size Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 07/12] xsk: simplify the consumer ring access functions Magnus Karlsson
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

At this point, there are no users of the functions xskq_nb_avail and
xskq_nb_free that take any other number of entries argument than 1, so
let us get rid of the second argument that takes the number of
entries.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 6ca5fed..8bfa2ee 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -89,7 +89,7 @@ static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q)
 	return q ? q->invalid_descs : 0;
 }
 
-static inline u32 xskq_nb_avail(struct xsk_queue *q, u32 dcnt)
+static inline u32 xskq_nb_avail(struct xsk_queue *q)
 {
 	u32 entries = q->cached_prod - q->cons_tail;
 
@@ -99,19 +99,21 @@ static inline u32 xskq_nb_avail(struct xsk_queue *q, u32 dcnt)
 		entries = q->cached_prod - q->cons_tail;
 	}
 
-	return (entries > dcnt) ? dcnt : entries;
+	return entries;
 }
 
-static inline u32 xskq_nb_free(struct xsk_queue *q, u32 dcnt)
+static inline bool xskq_prod_is_full(struct xsk_queue *q)
 {
 	u32 free_entries = q->nentries - (q->cached_prod - q->cons_tail);
 
-	if (free_entries >= dcnt)
-		return free_entries;
+	if (free_entries)
+		return false;
 
 	/* Refresh the local tail pointer */
 	q->cons_tail = READ_ONCE(q->ring->consumer);
-	return q->nentries - (q->cached_prod - q->cons_tail);
+	free_entries = q->nentries - (q->cached_prod - q->cons_tail);
+
+	return !free_entries;
 }
 
 static inline bool xskq_has_addrs(struct xsk_queue *q, u32 cnt)
@@ -200,7 +202,7 @@ static inline u64 *xskq_peek_addr(struct xsk_queue *q, u64 *addr,
 	if (q->cons_tail == q->cons_head) {
 		smp_mb(); /* D, matches A */
 		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q, 1);
+		q->cons_head = q->cons_tail + xskq_nb_avail(q);
 
 		/* Order consumer and data */
 		smp_rmb();
@@ -216,7 +218,7 @@ static inline void xskq_discard_addr(struct xsk_queue *q)
 
 static inline int xskq_prod_reserve(struct xsk_queue *q)
 {
-	if (xskq_nb_free(q, 1) == 0)
+	if (xskq_prod_is_full(q))
 		return -ENOSPC;
 
 	/* A, matches D */
@@ -228,7 +230,7 @@ static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
 
-	if (xskq_nb_free(q, 1) == 0)
+	if (xskq_prod_is_full(q))
 		return -ENOSPC;
 
 	/* A, matches D */
@@ -318,7 +320,7 @@ static inline struct xdp_desc *xskq_peek_desc(struct xsk_queue *q,
 	if (q->cons_tail == q->cons_head) {
 		smp_mb(); /* D, matches A */
 		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q, 1);
+		q->cons_head = q->cons_tail + xskq_nb_avail(q);
 
 		/* Order consumer and data */
 		smp_rmb(); /* C, matches B */
@@ -338,7 +340,7 @@ static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 	struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
 	u32 idx;
 
-	if (xskq_nb_free(q, 1) == 0)
+	if (xskq_prod_is_full(q))
 		return -ENOSPC;
 
 	/* A, matches D */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 07/12] xsk: simplify the consumer ring access functions
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (5 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 06/12] xsk: simplify xskq_nb_avail and xskq_nb_free Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 08/12] xsk: change names of validation functions Magnus Karlsson
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Simplify and refactor consumer ring functions. The consumer first
"peeks" to find descriptors or addresses that are available to
read from the ring, then reads them and finally "releases" these
descriptors once it is done. The two local variables cons_tail
and cons_head are turned into one single variable called
cached_cons. cached_tail referred to the cached value of the
global consumer pointer and will be stored in cached_cons. For
cached_head, we just use cached_prod instead as it was not used
for a consumer queue before. It also better reflects what it
really is now: a cached copy of the producer pointer.

The names of the functions are also renamed in the same manner as
the producer functions. The new functions are called xskq_cons_
followed by what it does.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c       |  24 ++++++-------
 net/xdp/xsk_queue.h | 102 ++++++++++++++++++++++++----------------------------
 2 files changed, 58 insertions(+), 68 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index f93cf76..fb13b64 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -39,19 +39,19 @@ bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs)
 
 bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt)
 {
-	return xskq_has_addrs(umem->fq, cnt);
+	return xskq_cons_has_entries(umem->fq, cnt);
 }
 EXPORT_SYMBOL(xsk_umem_has_addrs);
 
 u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
 {
-	return xskq_peek_addr(umem->fq, addr, umem);
+	return xskq_cons_peek_addr(umem->fq, addr, umem);
 }
 EXPORT_SYMBOL(xsk_umem_peek_addr);
 
 void xsk_umem_discard_addr(struct xdp_umem *umem)
 {
-	xskq_discard_addr(umem->fq);
+	xskq_cons_release(umem->fq);
 }
 EXPORT_SYMBOL(xsk_umem_discard_addr);
 
@@ -146,7 +146,7 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 	u32 metalen;
 	int err;
 
-	if (!xskq_peek_addr(xs->umem->fq, &addr, xs->umem) ||
+	if (!xskq_cons_peek_addr(xs->umem->fq, &addr, xs->umem) ||
 	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
 		xs->rx_dropped++;
 		return -ENOSPC;
@@ -167,7 +167,7 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 	addr = xsk_umem_adjust_offset(xs->umem, addr, offset);
 	err = xskq_prod_reserve_desc(xs->rx, addr, len);
 	if (!err) {
-		xskq_discard_addr(xs->umem->fq);
+		xskq_cons_release(xs->umem->fq);
 		xdp_return_buff(xdp);
 		return 0;
 	}
@@ -234,7 +234,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 		goto out_unlock;
 	}
 
-	if (!xskq_peek_addr(xs->umem->fq, &addr, xs->umem) ||
+	if (!xskq_cons_peek_addr(xs->umem->fq, &addr, xs->umem) ||
 	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
 		err = -ENOSPC;
 		goto out_drop;
@@ -249,7 +249,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	if (err)
 		goto out_drop;
 
-	xskq_discard_addr(xs->umem->fq);
+	xskq_cons_release(xs->umem->fq);
 	xskq_prod_submit(xs->rx);
 
 	spin_unlock_bh(&xs->rx_lock);
@@ -317,13 +317,13 @@ bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(xs, &umem->xsk_list, list) {
-		if (!xskq_peek_desc(xs->tx, desc, umem))
+		if (!xskq_cons_peek_desc(xs->tx, desc, umem))
 			continue;
 
 		if (xskq_prod_reserve_addr(umem->cq, desc->addr))
 			goto out;
 
-		xskq_discard_desc(xs->tx);
+		xskq_cons_release(xs->tx);
 		rcu_read_unlock();
 		return true;
 	}
@@ -369,7 +369,7 @@ static int xsk_generic_xmit(struct sock *sk)
 	if (xs->queue_id >= xs->dev->real_num_tx_queues)
 		goto out;
 
-	while (xskq_peek_desc(xs->tx, &desc, xs->umem)) {
+	while (xskq_cons_peek_desc(xs->tx, &desc, xs->umem)) {
 		char *buffer;
 		u64 addr;
 		u32 len;
@@ -402,7 +402,7 @@ static int xsk_generic_xmit(struct sock *sk)
 		skb->destructor = xsk_destruct_skb;
 
 		err = dev_direct_xmit(skb, xs->queue_id);
-		xskq_discard_desc(xs->tx);
+		xskq_cons_release(xs->tx);
 		/* Ignore NET_XMIT_CN as packet might have been sent */
 		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
 			/* SKB completed but not sent */
@@ -473,7 +473,7 @@ static __poll_t xsk_poll(struct file *file, struct socket *sock,
 
 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
 		mask |= EPOLLIN | EPOLLRDNORM;
-	if (xs->tx && !xskq_full_desc(xs->tx))
+	if (xs->tx && !xskq_cons_is_full(xs->tx))
 		mask |= EPOLLOUT | EPOLLWRNORM;
 
 	return mask;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 8bfa2ee..1436116 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -34,8 +34,7 @@ struct xsk_queue {
 	u32 ring_mask;
 	u32 nentries;
 	u32 cached_prod;
-	u32 cons_head;
-	u32 cons_tail;
+	u32 cached_cons;
 	struct xdp_ring *ring;
 	u64 invalid_descs;
 };
@@ -89,43 +88,48 @@ static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q)
 	return q ? q->invalid_descs : 0;
 }
 
-static inline u32 xskq_nb_avail(struct xsk_queue *q)
+static inline void __xskq_cons_release(struct xsk_queue *q)
 {
-	u32 entries = q->cached_prod - q->cons_tail;
+	smp_mb(); /* D, matches A */
+	WRITE_ONCE(q->ring->consumer, q->cached_cons);
+}
 
-	if (entries == 0) {
-		/* Refresh the local pointer */
-		q->cached_prod = READ_ONCE(q->ring->producer);
-		entries = q->cached_prod - q->cons_tail;
-	}
+static inline void __xskq_cons_peek(struct xsk_queue *q)
+{
+	/* Refresh the local pointer */
+	q->cached_prod = READ_ONCE(q->ring->producer);
+	smp_rmb(); /* C, matches B */
+}
 
-	return entries;
+static inline void xskq_cons_get_entries(struct xsk_queue *q)
+{
+	__xskq_cons_release(q);
+	__xskq_cons_peek(q);
 }
 
 static inline bool xskq_prod_is_full(struct xsk_queue *q)
 {
-	u32 free_entries = q->nentries - (q->cached_prod - q->cons_tail);
+	u32 free_entries = q->nentries - (q->cached_prod - q->cached_cons);
 
 	if (free_entries)
 		return false;
 
 	/* Refresh the local tail pointer */
-	q->cons_tail = READ_ONCE(q->ring->consumer);
-	free_entries = q->nentries - (q->cached_prod - q->cons_tail);
+	q->cached_cons = READ_ONCE(q->ring->consumer);
+	free_entries = q->nentries - (q->cached_prod - q->cached_cons);
 
 	return !free_entries;
 }
 
-static inline bool xskq_has_addrs(struct xsk_queue *q, u32 cnt)
+static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
 {
-	u32 entries = q->cached_prod - q->cons_tail;
+	u32 entries = q->cached_prod - q->cached_cons;
 
 	if (entries >= cnt)
 		return true;
 
-	/* Refresh the local pointer. */
-	q->cached_prod = READ_ONCE(q->ring->producer);
-	entries = q->cached_prod - q->cons_tail;
+	__xskq_cons_peek(q);
+	entries = q->cached_prod - q->cached_cons;
 
 	return entries >= cnt;
 }
@@ -172,9 +176,10 @@ static inline bool xskq_is_valid_addr_unaligned(struct xsk_queue *q, u64 addr,
 static inline u64 *xskq_validate_addr(struct xsk_queue *q, u64 *addr,
 				      struct xdp_umem *umem)
 {
-	while (q->cons_tail != q->cons_head) {
-		struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
-		unsigned int idx = q->cons_tail & q->ring_mask;
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+
+	while (q->cached_cons != q->cached_prod) {
+		u32 idx = q->cached_cons & q->ring_mask;
 
 		*addr = READ_ONCE(ring->desc[idx]) & q->chunk_mask;
 
@@ -190,30 +195,27 @@ static inline u64 *xskq_validate_addr(struct xsk_queue *q, u64 *addr,
 			return addr;
 
 out:
-		q->cons_tail++;
+		q->cached_cons++;
 	}
 
 	return NULL;
 }
 
-static inline u64 *xskq_peek_addr(struct xsk_queue *q, u64 *addr,
-				  struct xdp_umem *umem)
+static inline u64 *xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
+				       struct xdp_umem *umem)
 {
-	if (q->cons_tail == q->cons_head) {
-		smp_mb(); /* D, matches A */
-		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q);
-
-		/* Order consumer and data */
-		smp_rmb();
-	}
-
+	if (q->cached_prod == q->cached_cons)
+		xskq_cons_get_entries(q);
 	return xskq_validate_addr(q, addr, umem);
 }
 
-static inline void xskq_discard_addr(struct xsk_queue *q)
+static inline void xskq_cons_release(struct xsk_queue *q)
 {
-	q->cons_tail++;
+	/* To improve performance, only update local state here.
+	 * Do the actual release operation when we get new entries
+	 * from the ring in xskq_cons_get_entries() instead.
+	 */
+	q->cached_cons++;
 }
 
 static inline int xskq_prod_reserve(struct xsk_queue *q)
@@ -299,41 +301,29 @@ static inline struct xdp_desc *xskq_validate_desc(struct xsk_queue *q,
 						  struct xdp_desc *desc,
 						  struct xdp_umem *umem)
 {
-	while (q->cons_tail != q->cons_head) {
+	while (q->cached_cons != q->cached_prod) {
 		struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
-		unsigned int idx = q->cons_tail & q->ring_mask;
+		u32 idx = q->cached_cons & q->ring_mask;
 
 		*desc = READ_ONCE(ring->desc[idx]);
 		if (xskq_is_valid_desc(q, desc, umem))
 			return desc;
 
-		q->cons_tail++;
+		q->cached_cons++;
 	}
 
 	return NULL;
 }
 
-static inline struct xdp_desc *xskq_peek_desc(struct xsk_queue *q,
-					      struct xdp_desc *desc,
-					      struct xdp_umem *umem)
+static inline struct xdp_desc *xskq_cons_peek_desc(struct xsk_queue *q,
+						   struct xdp_desc *desc,
+						   struct xdp_umem *umem)
 {
-	if (q->cons_tail == q->cons_head) {
-		smp_mb(); /* D, matches A */
-		WRITE_ONCE(q->ring->consumer, q->cons_tail);
-		q->cons_head = q->cons_tail + xskq_nb_avail(q);
-
-		/* Order consumer and data */
-		smp_rmb(); /* C, matches B */
-	}
-
+	if (q->cached_prod == q->cached_cons)
+		xskq_cons_get_entries(q);
 	return xskq_validate_desc(q, desc, umem);
 }
 
-static inline void xskq_discard_desc(struct xsk_queue *q)
-{
-	q->cons_tail++;
-}
-
 static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 					 u64 addr, u32 len)
 {
@@ -351,7 +341,7 @@ static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 	return 0;
 }
 
-static inline bool xskq_full_desc(struct xsk_queue *q)
+static inline bool xskq_cons_is_full(struct xsk_queue *q)
 {
 	/* No barriers needed since data is not accessed */
 	return READ_ONCE(q->ring->producer) - READ_ONCE(q->ring->consumer) ==
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 08/12] xsk: change names of validation functions
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (6 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 07/12] xsk: simplify the consumer ring access functions Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 09/12] xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to xsk_umem_release_addr Magnus Karlsson
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Change the names of the validation functions to better reflect what
they are doing. The uppermost ones are reading entries from the rings
and only the bottom ones validate entries. So xskq_cons_read_ is a
better prefix name.

Also change the xskq_cons_read_ functions to return a bool
as the the descriptor or address is already returned by reference
in the parameters. Everyone is using the return value as a bool
anyway.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 include/net/xdp_sock.h |  4 ++--
 net/xdp/xsk.c          |  4 ++--
 net/xdp/xsk_queue.h    | 59 ++++++++++++++++++++++++++------------------------
 3 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index e3780e4..0742aa9 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -119,7 +119,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
 bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs);
 /* Used from netdev driver */
 bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt);
-u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr);
+bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr);
 void xsk_umem_discard_addr(struct xdp_umem *umem);
 void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries);
 bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc);
@@ -199,7 +199,7 @@ static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt)
 	return xsk_umem_has_addrs(umem, cnt - rq->length);
 }
 
-static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr)
+static inline bool xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr)
 {
 	struct xdp_umem_fq_reuse *rq = umem->fq_reuse;
 
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index fb13b64..62e27a4 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -43,7 +43,7 @@ bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt)
 }
 EXPORT_SYMBOL(xsk_umem_has_addrs);
 
-u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
+bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
 {
 	return xskq_cons_peek_addr(umem->fq, addr, umem);
 }
@@ -124,7 +124,7 @@ static void __xsk_rcv_memcpy(struct xdp_umem *umem, u64 addr, void *from_buf,
 	void *to_buf = xdp_umem_get_data(umem, addr);
 
 	addr = xsk_umem_add_offset_to_addr(addr);
-	if (xskq_crosses_non_contig_pg(umem, addr, len + metalen)) {
+	if (xskq_cons_crosses_non_contig_pg(umem, addr, len + metalen)) {
 		void *next_pg_addr = umem->pages[(addr >> PAGE_SHIFT) + 1].addr;
 		u64 page_start = addr & ~(PAGE_SIZE - 1);
 		u64 first_len = PAGE_SIZE - (addr - page_start);
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 1436116..6d04a96 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -136,8 +136,9 @@ static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
 
 /* UMEM queue */
 
-static inline bool xskq_crosses_non_contig_pg(struct xdp_umem *umem, u64 addr,
-					      u64 length)
+static inline bool xskq_cons_crosses_non_contig_pg(struct xdp_umem *umem,
+						   u64 addr,
+						   u64 length)
 {
 	bool cross_pg = (addr & (PAGE_SIZE - 1)) + length > PAGE_SIZE;
 	bool next_pg_contig =
@@ -147,7 +148,7 @@ static inline bool xskq_crosses_non_contig_pg(struct xdp_umem *umem, u64 addr,
 	return cross_pg && !next_pg_contig;
 }
 
-static inline bool xskq_is_valid_addr(struct xsk_queue *q, u64 addr)
+static inline bool xskq_cons_is_valid_addr(struct xsk_queue *q, u64 addr)
 {
 	if (addr >= q->size) {
 		q->invalid_descs++;
@@ -157,7 +158,8 @@ static inline bool xskq_is_valid_addr(struct xsk_queue *q, u64 addr)
 	return true;
 }
 
-static inline bool xskq_is_valid_addr_unaligned(struct xsk_queue *q, u64 addr,
+static inline bool xskq_cons_is_valid_unaligned(struct xsk_queue *q,
+						u64 addr,
 						u64 length,
 						struct xdp_umem *umem)
 {
@@ -165,7 +167,7 @@ static inline bool xskq_is_valid_addr_unaligned(struct xsk_queue *q, u64 addr,
 
 	addr = xsk_umem_add_offset_to_addr(addr);
 	if (base_addr >= q->size || addr >= q->size ||
-	    xskq_crosses_non_contig_pg(umem, addr, length)) {
+	    xskq_cons_crosses_non_contig_pg(umem, addr, length)) {
 		q->invalid_descs++;
 		return false;
 	}
@@ -173,8 +175,8 @@ static inline bool xskq_is_valid_addr_unaligned(struct xsk_queue *q, u64 addr,
 	return true;
 }
 
-static inline u64 *xskq_validate_addr(struct xsk_queue *q, u64 *addr,
-				      struct xdp_umem *umem)
+static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr,
+				       struct xdp_umem *umem)
 {
 	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
 
@@ -184,29 +186,29 @@ static inline u64 *xskq_validate_addr(struct xsk_queue *q, u64 *addr,
 		*addr = READ_ONCE(ring->desc[idx]) & q->chunk_mask;
 
 		if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) {
-			if (xskq_is_valid_addr_unaligned(q, *addr,
+			if (xskq_cons_is_valid_unaligned(q, *addr,
 							 umem->chunk_size_nohr,
 							 umem))
-				return addr;
+				return true;
 			goto out;
 		}
 
-		if (xskq_is_valid_addr(q, *addr))
-			return addr;
+		if (xskq_cons_is_valid_addr(q, *addr))
+			return true;
 
 out:
 		q->cached_cons++;
 	}
 
-	return NULL;
+	return false;
 }
 
-static inline u64 *xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
+static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
 				       struct xdp_umem *umem)
 {
 	if (q->cached_prod == q->cached_cons)
 		xskq_cons_get_entries(q);
-	return xskq_validate_addr(q, addr, umem);
+	return xskq_cons_read_addr(q, addr, umem);
 }
 
 static inline void xskq_cons_release(struct xsk_queue *q)
@@ -270,11 +272,12 @@ static inline void xskq_prod_submit_n(struct xsk_queue *q, u32 nb_entries)
 
 /* Rx/Tx queue */
 
-static inline bool xskq_is_valid_desc(struct xsk_queue *q, struct xdp_desc *d,
-				      struct xdp_umem *umem)
+static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q,
+					   struct xdp_desc *d,
+					   struct xdp_umem *umem)
 {
 	if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) {
-		if (!xskq_is_valid_addr_unaligned(q, d->addr, d->len, umem))
+		if (!xskq_cons_is_valid_unaligned(q, d->addr, d->len, umem))
 			return false;
 
 		if (d->len > umem->chunk_size_nohr || d->options) {
@@ -285,7 +288,7 @@ static inline bool xskq_is_valid_desc(struct xsk_queue *q, struct xdp_desc *d,
 		return true;
 	}
 
-	if (!xskq_is_valid_addr(q, d->addr))
+	if (!xskq_cons_is_valid_addr(q, d->addr))
 		return false;
 
 	if (((d->addr + d->len) & q->chunk_mask) != (d->addr & q->chunk_mask) ||
@@ -297,31 +300,31 @@ static inline bool xskq_is_valid_desc(struct xsk_queue *q, struct xdp_desc *d,
 	return true;
 }
 
-static inline struct xdp_desc *xskq_validate_desc(struct xsk_queue *q,
-						  struct xdp_desc *desc,
-						  struct xdp_umem *umem)
+static inline bool xskq_cons_read_desc(struct xsk_queue *q,
+				       struct xdp_desc *desc,
+				       struct xdp_umem *umem)
 {
 	while (q->cached_cons != q->cached_prod) {
 		struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
 		u32 idx = q->cached_cons & q->ring_mask;
 
 		*desc = READ_ONCE(ring->desc[idx]);
-		if (xskq_is_valid_desc(q, desc, umem))
-			return desc;
+		if (xskq_cons_is_valid_desc(q, desc, umem))
+			return true;
 
 		q->cached_cons++;
 	}
 
-	return NULL;
+	return false;
 }
 
-static inline struct xdp_desc *xskq_cons_peek_desc(struct xsk_queue *q,
-						   struct xdp_desc *desc,
-						   struct xdp_umem *umem)
+static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
+				       struct xdp_desc *desc,
+				       struct xdp_umem *umem)
 {
 	if (q->cached_prod == q->cached_cons)
 		xskq_cons_get_entries(q);
-	return xskq_validate_desc(q, desc, umem);
+	return xskq_cons_read_desc(q, desc, umem);
 }
 
 static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 09/12] xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to xsk_umem_release_addr
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (7 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 08/12] xsk: change names of validation functions Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 10/12] xsk: remove unnecessary READ_ONCE of data Magnus Karlsson
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to
better reflect the new naming of the AF_XDP queue manipulation
functions. As this functions is used by drivers implementing support
for AF_XDP zero-copy, it requires a name change to these drivers. The
function xsk_umem_release_addr_rq has also changed name in the same
fashion.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c          |  4 ++--
 drivers/net/ethernet/intel/ice/ice_xsk.c            |  4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c        |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c |  2 +-
 include/net/xdp_sock.h                              | 10 +++++-----
 net/xdp/xsk.c                                       |  4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index d07e1a8..20d6f11 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -269,7 +269,7 @@ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring,
 
 	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
-	xsk_umem_discard_addr(umem);
+	xsk_umem_release_addr(umem);
 	return true;
 }
 
@@ -306,7 +306,7 @@ static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring,
 
 	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
-	xsk_umem_discard_addr_rq(umem);
+	xsk_umem_release_addr_rq(umem);
 	return true;
 }
 
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index cf9b8b2..0af1f0b 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -555,7 +555,7 @@ ice_alloc_buf_fast_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf)
 
 	rx_buf->handle = handle + umem->headroom;
 
-	xsk_umem_discard_addr(umem);
+	xsk_umem_release_addr(umem);
 	return true;
 }
 
@@ -591,7 +591,7 @@ ice_alloc_buf_slow_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf)
 
 	rx_buf->handle = handle + umem->headroom;
 
-	xsk_umem_discard_addr_rq(umem);
+	xsk_umem_release_addr_rq(umem);
 	return true;
 }
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index d6feaac..1501d0a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -277,7 +277,7 @@ static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
 
 	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
-	xsk_umem_discard_addr(umem);
+	xsk_umem_release_addr(umem);
 	return true;
 }
 
@@ -304,7 +304,7 @@ static bool ixgbe_alloc_buffer_slow_zc(struct ixgbe_ring *rx_ring,
 
 	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
-	xsk_umem_discard_addr_rq(umem);
+	xsk_umem_release_addr_rq(umem);
 	return true;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index 475b6bd..62fc8a1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -35,7 +35,7 @@ int mlx5e_xsk_page_alloc_umem(struct mlx5e_rq *rq,
 	 */
 	dma_info->addr = xdp_umem_get_dma(umem, handle);
 
-	xsk_umem_discard_addr_rq(umem);
+	xsk_umem_release_addr_rq(umem);
 
 	dma_sync_single_for_device(rq->pdev, dma_info->addr, PAGE_SIZE,
 				   DMA_BIDIRECTIONAL);
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 0742aa9..b23f60d 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -120,7 +120,7 @@ bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs);
 /* Used from netdev driver */
 bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt);
 bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr);
-void xsk_umem_discard_addr(struct xdp_umem *umem);
+void xsk_umem_release_addr(struct xdp_umem *umem);
 void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries);
 bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc);
 void xsk_umem_consume_tx_done(struct xdp_umem *umem);
@@ -210,12 +210,12 @@ static inline bool xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr)
 	return addr;
 }
 
-static inline void xsk_umem_discard_addr_rq(struct xdp_umem *umem)
+static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem)
 {
 	struct xdp_umem_fq_reuse *rq = umem->fq_reuse;
 
 	if (!rq->length)
-		xsk_umem_discard_addr(umem);
+		xsk_umem_release_addr(umem);
 	else
 		rq->length--;
 }
@@ -260,7 +260,7 @@ static inline u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
 	return NULL;
 }
 
-static inline void xsk_umem_discard_addr(struct xdp_umem *umem)
+static inline void xsk_umem_release_addr(struct xdp_umem *umem)
 {
 }
 
@@ -334,7 +334,7 @@ static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr)
 	return NULL;
 }
 
-static inline void xsk_umem_discard_addr_rq(struct xdp_umem *umem)
+static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem)
 {
 }
 
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 62e27a4..f0dd740 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -49,11 +49,11 @@ bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
 }
 EXPORT_SYMBOL(xsk_umem_peek_addr);
 
-void xsk_umem_discard_addr(struct xdp_umem *umem)
+void xsk_umem_release_addr(struct xdp_umem *umem)
 {
 	xskq_cons_release(umem->fq);
 }
-EXPORT_SYMBOL(xsk_umem_discard_addr);
+EXPORT_SYMBOL(xsk_umem_release_addr);
 
 void xsk_set_rx_need_wakeup(struct xdp_umem *umem)
 {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 10/12] xsk: remove unnecessary READ_ONCE of data
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (8 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 09/12] xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to xsk_umem_release_addr Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 11/12] xsk: add function naming comments and reorder functions Magnus Karlsson
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

There are two unnecessary READ_ONCE of descriptor data. These are not
needed since the data is written by the producer before it signals
that the data is available by incrementing the producer pointer. As the
access to this producer pointer is serialized and the consumer always
reads the descriptor after it has read and synchronized with the
producer counter, the write of the descriptor will have fully
completed and it does not matter if the consumer has any read tearing.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 6d04a96..4c049ca 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -183,7 +183,7 @@ static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr,
 	while (q->cached_cons != q->cached_prod) {
 		u32 idx = q->cached_cons & q->ring_mask;
 
-		*addr = READ_ONCE(ring->desc[idx]) & q->chunk_mask;
+		*addr = ring->desc[idx] & q->chunk_mask;
 
 		if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) {
 			if (xskq_cons_is_valid_unaligned(q, *addr,
@@ -308,7 +308,7 @@ static inline bool xskq_cons_read_desc(struct xsk_queue *q,
 		struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
 		u32 idx = q->cached_cons & q->ring_mask;
 
-		*desc = READ_ONCE(ring->desc[idx]);
+		*desc = ring->desc[idx];
 		if (xskq_cons_is_valid_desc(q, desc, umem))
 			return true;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 11/12] xsk: add function naming comments and reorder functions
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (9 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 10/12] xsk: remove unnecessary READ_ONCE of data Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-19 12:39 ` [PATCH bpf-next v2 12/12] xsk: use struct_size() helper Magnus Karlsson
  2019-12-20 10:09 ` [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Björn Töpel
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Add comments on how the ring access functions are named and how they
are supposed to be used for producers and consumers. The functions are
also reordered so that the consumer functions are in the beginning and
the producer functions in the end, for easier reference. Put this in a
separate patch as the diff might look a little odd, but no
functionality has changed in this patch.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c       |  10 ++
 net/xdp/xsk_queue.h | 293 ++++++++++++++++++++++++++++------------------------
 2 files changed, 167 insertions(+), 136 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index f0dd740..e1ffcf1 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -320,6 +320,11 @@ bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc)
 		if (!xskq_cons_peek_desc(xs->tx, desc, umem))
 			continue;
 
+		/* This is the backpreassure mechanism for the Tx path.
+		 * Reserve space in the completion queue and only proceed
+		 * if there is space in it. This avoids having to implement
+		 * any buffering in the Tx path.
+		 */
 		if (xskq_prod_reserve_addr(umem->cq, desc->addr))
 			goto out;
 
@@ -390,6 +395,11 @@ static int xsk_generic_xmit(struct sock *sk)
 		addr = desc.addr;
 		buffer = xdp_umem_get_data(xs->umem, addr);
 		err = skb_store_bits(skb, 0, buffer, len);
+		/* This is the backpreassure mechanism for the Tx path.
+		 * Reserve space in the completion queue and only proceed
+		 * if there is space in it. This avoids having to implement
+		 * any buffering in the Tx path.
+		 */
 		if (unlikely(err) || xskq_prod_reserve(xs->umem->cq)) {
 			kfree_skb(skb);
 			goto out;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 4c049ca..bec2af1 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -81,60 +81,27 @@ struct xsk_queue {
  * now and again after circling through the ring.
  */
 
-/* Common functions operating for both RXTX and umem queues */
-
-static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q)
-{
-	return q ? q->invalid_descs : 0;
-}
-
-static inline void __xskq_cons_release(struct xsk_queue *q)
-{
-	smp_mb(); /* D, matches A */
-	WRITE_ONCE(q->ring->consumer, q->cached_cons);
-}
-
-static inline void __xskq_cons_peek(struct xsk_queue *q)
-{
-	/* Refresh the local pointer */
-	q->cached_prod = READ_ONCE(q->ring->producer);
-	smp_rmb(); /* C, matches B */
-}
-
-static inline void xskq_cons_get_entries(struct xsk_queue *q)
-{
-	__xskq_cons_release(q);
-	__xskq_cons_peek(q);
-}
-
-static inline bool xskq_prod_is_full(struct xsk_queue *q)
-{
-	u32 free_entries = q->nentries - (q->cached_prod - q->cached_cons);
-
-	if (free_entries)
-		return false;
-
-	/* Refresh the local tail pointer */
-	q->cached_cons = READ_ONCE(q->ring->consumer);
-	free_entries = q->nentries - (q->cached_prod - q->cached_cons);
-
-	return !free_entries;
-}
-
-static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
-{
-	u32 entries = q->cached_prod - q->cached_cons;
-
-	if (entries >= cnt)
-		return true;
-
-	__xskq_cons_peek(q);
-	entries = q->cached_prod - q->cached_cons;
-
-	return entries >= cnt;
-}
+/* The operations on the rings are the following:
+ *
+ * producer                           consumer
+ *
+ * RESERVE entries                    PEEK in the ring for entries
+ * WRITE data into the ring           READ data from the ring
+ * SUBMIT entries                     RELEASE entries
+ *
+ * The producer reserves one or more entries in the ring. It can then
+ * fill in these entries and finally submit them so that they can be
+ * seen and read by the consumer.
+ *
+ * The consumer peeks into the ring to see if the producer has written
+ * any new entries. If so, the producer can then read these entries
+ * and when it is done reading them release them back to the producer
+ * so that the producer can use these slots to fill in new entries.
+ *
+ * The function names below reflect these operations.
+ */
 
-/* UMEM queue */
+/* Functions that read and validate content from consumer rings. */
 
 static inline bool xskq_cons_crosses_non_contig_pg(struct xdp_umem *umem,
 						   u64 addr,
@@ -148,16 +115,6 @@ static inline bool xskq_cons_crosses_non_contig_pg(struct xdp_umem *umem,
 	return cross_pg && !next_pg_contig;
 }
 
-static inline bool xskq_cons_is_valid_addr(struct xsk_queue *q, u64 addr)
-{
-	if (addr >= q->size) {
-		q->invalid_descs++;
-		return false;
-	}
-
-	return true;
-}
-
 static inline bool xskq_cons_is_valid_unaligned(struct xsk_queue *q,
 						u64 addr,
 						u64 length,
@@ -175,6 +132,16 @@ static inline bool xskq_cons_is_valid_unaligned(struct xsk_queue *q,
 	return true;
 }
 
+static inline bool xskq_cons_is_valid_addr(struct xsk_queue *q, u64 addr)
+{
+	if (addr >= q->size) {
+		q->invalid_descs++;
+		return false;
+	}
+
+	return true;
+}
+
 static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr,
 				       struct xdp_umem *umem)
 {
@@ -203,75 +170,6 @@ static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr,
 	return false;
 }
 
-static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
-				       struct xdp_umem *umem)
-{
-	if (q->cached_prod == q->cached_cons)
-		xskq_cons_get_entries(q);
-	return xskq_cons_read_addr(q, addr, umem);
-}
-
-static inline void xskq_cons_release(struct xsk_queue *q)
-{
-	/* To improve performance, only update local state here.
-	 * Do the actual release operation when we get new entries
-	 * from the ring in xskq_cons_get_entries() instead.
-	 */
-	q->cached_cons++;
-}
-
-static inline int xskq_prod_reserve(struct xsk_queue *q)
-{
-	if (xskq_prod_is_full(q))
-		return -ENOSPC;
-
-	/* A, matches D */
-	q->cached_prod++;
-	return 0;
-}
-
-static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr)
-{
-	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
-
-	if (xskq_prod_is_full(q))
-		return -ENOSPC;
-
-	/* A, matches D */
-	ring->desc[q->cached_prod++ & q->ring_mask] = addr;
-	return 0;
-}
-
-static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
-{
-	/* Order producer and data */
-	smp_wmb(); /* B, matches C */
-
-	WRITE_ONCE(q->ring->producer, idx);
-}
-
-static inline void xskq_prod_submit(struct xsk_queue *q)
-{
-	__xskq_prod_submit(q, q->cached_prod);
-}
-
-static inline void xskq_prod_submit_addr(struct xsk_queue *q, u64 addr)
-{
-	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
-	u32 idx = q->ring->producer;
-
-	ring->desc[idx++ & q->ring_mask] = addr;
-
-	__xskq_prod_submit(q, idx);
-}
-
-static inline void xskq_prod_submit_n(struct xsk_queue *q, u32 nb_entries)
-{
-	__xskq_prod_submit(q, q->ring->producer + nb_entries);
-}
-
-/* Rx/Tx queue */
-
 static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q,
 					   struct xdp_desc *d,
 					   struct xdp_umem *umem)
@@ -318,6 +216,48 @@ static inline bool xskq_cons_read_desc(struct xsk_queue *q,
 	return false;
 }
 
+/* Functions for consumers */
+
+static inline void __xskq_cons_release(struct xsk_queue *q)
+{
+	smp_mb(); /* D, matches A */
+	WRITE_ONCE(q->ring->consumer, q->cached_cons);
+}
+
+static inline void __xskq_cons_peek(struct xsk_queue *q)
+{
+	/* Refresh the local pointer */
+	q->cached_prod = READ_ONCE(q->ring->producer);
+	smp_rmb(); /* C, matches B */
+}
+
+static inline void xskq_cons_get_entries(struct xsk_queue *q)
+{
+	__xskq_cons_release(q);
+	__xskq_cons_peek(q);
+}
+
+static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt)
+{
+	u32 entries = q->cached_prod - q->cached_cons;
+
+	if (entries >= cnt)
+		return true;
+
+	__xskq_cons_peek(q);
+	entries = q->cached_prod - q->cached_cons;
+
+	return entries >= cnt;
+}
+
+static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
+				       struct xdp_umem *umem)
+{
+	if (q->cached_prod == q->cached_cons)
+		xskq_cons_get_entries(q);
+	return xskq_cons_read_addr(q, addr, umem);
+}
+
 static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
 				       struct xdp_desc *desc,
 				       struct xdp_umem *umem)
@@ -327,6 +267,60 @@ static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
 	return xskq_cons_read_desc(q, desc, umem);
 }
 
+static inline void xskq_cons_release(struct xsk_queue *q)
+{
+	/* To improve performance, only update local state here.
+	 * Reflect this to global state when we get new entries
+	 * from the ring in xskq_cons_get_entries().
+	 */
+	q->cached_cons++;
+}
+
+static inline bool xskq_cons_is_full(struct xsk_queue *q)
+{
+	/* No barriers needed since data is not accessed */
+	return READ_ONCE(q->ring->producer) - READ_ONCE(q->ring->consumer) ==
+		q->nentries;
+}
+
+/* Functions for producers */
+
+static inline bool xskq_prod_is_full(struct xsk_queue *q)
+{
+	u32 free_entries = q->nentries - (q->cached_prod - q->cached_cons);
+
+	if (free_entries)
+		return false;
+
+	/* Refresh the local tail pointer */
+	q->cached_cons = READ_ONCE(q->ring->consumer);
+	free_entries = q->nentries - (q->cached_prod - q->cached_cons);
+
+	return !free_entries;
+}
+
+static inline int xskq_prod_reserve(struct xsk_queue *q)
+{
+	if (xskq_prod_is_full(q))
+		return -ENOSPC;
+
+	/* A, matches D */
+	q->cached_prod++;
+	return 0;
+}
+
+static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+
+	if (xskq_prod_is_full(q))
+		return -ENOSPC;
+
+	/* A, matches D */
+	ring->desc[q->cached_prod++ & q->ring_mask] = addr;
+	return 0;
+}
+
 static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 					 u64 addr, u32 len)
 {
@@ -344,11 +338,31 @@ static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 	return 0;
 }
 
-static inline bool xskq_cons_is_full(struct xsk_queue *q)
+static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
 {
-	/* No barriers needed since data is not accessed */
-	return READ_ONCE(q->ring->producer) - READ_ONCE(q->ring->consumer) ==
-		q->nentries;
+	smp_wmb(); /* B, matches C */
+
+	WRITE_ONCE(q->ring->producer, idx);
+}
+
+static inline void xskq_prod_submit(struct xsk_queue *q)
+{
+	__xskq_prod_submit(q, q->cached_prod);
+}
+
+static inline void xskq_prod_submit_addr(struct xsk_queue *q, u64 addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+	u32 idx = q->ring->producer;
+
+	ring->desc[idx++ & q->ring_mask] = addr;
+
+	__xskq_prod_submit(q, idx);
+}
+
+static inline void xskq_prod_submit_n(struct xsk_queue *q, u32 nb_entries)
+{
+	__xskq_prod_submit(q, q->ring->producer + nb_entries);
 }
 
 static inline bool xskq_prod_is_empty(struct xsk_queue *q)
@@ -357,6 +371,13 @@ static inline bool xskq_prod_is_empty(struct xsk_queue *q)
 	return READ_ONCE(q->ring->consumer) == READ_ONCE(q->ring->producer);
 }
 
+/* For both producers and consumers */
+
+static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q)
+{
+	return q ? q->invalid_descs : 0;
+}
+
 void xskq_set_umem(struct xsk_queue *q, u64 size, u64 chunk_mask);
 struct xsk_queue *xskq_create(u32 nentries, bool umem_queue);
 void xskq_destroy(struct xsk_queue *q_ops);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v2 12/12] xsk: use struct_size() helper
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (10 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 11/12] xsk: add function naming comments and reorder functions Magnus Karlsson
@ 2019-12-19 12:39 ` Magnus Karlsson
  2019-12-20 10:09 ` [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Björn Töpel
  12 siblings, 0 replies; 15+ messages in thread
From: Magnus Karlsson @ 2019-12-19 12:39 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon
  Cc: bpf, saeedm, jeffrey.t.kirsher, maciej.fijalkowski,
	maciejromanfijalkowski

Improve readability and maintainability by using the struct_size()
helper when allocating the AF_XDP rings.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk_queue.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c
index b665045..c90e9c1 100644
--- a/net/xdp/xsk_queue.c
+++ b/net/xdp/xsk_queue.c
@@ -18,14 +18,14 @@ void xskq_set_umem(struct xsk_queue *q, u64 size, u64 chunk_mask)
 	q->chunk_mask = chunk_mask;
 }
 
-static u32 xskq_umem_get_ring_size(struct xsk_queue *q)
+static size_t xskq_get_ring_size(struct xsk_queue *q, bool umem_queue)
 {
-	return sizeof(struct xdp_umem_ring) + q->nentries * sizeof(u64);
-}
+	struct xdp_umem_ring *umem_ring;
+	struct xdp_rxtx_ring *rxtx_ring;
 
-static u32 xskq_rxtx_get_ring_size(struct xsk_queue *q)
-{
-	return sizeof(struct xdp_ring) + q->nentries * sizeof(struct xdp_desc);
+	if (umem_queue)
+		return struct_size(umem_ring, desc, q->nentries);
+	return struct_size(rxtx_ring, desc, q->nentries);
 }
 
 struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
@@ -43,8 +43,7 @@ struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
 
 	gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
 		    __GFP_COMP  | __GFP_NORETRY;
-	size = umem_queue ? xskq_umem_get_ring_size(q) :
-	       xskq_rxtx_get_ring_size(q);
+	size = xskq_get_ring_size(q, umem_queue);
 
 	q->ring = (struct xdp_ring *)__get_free_pages(gfp_flags,
 						      get_order(size));
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 00/12] xsk: clean up ring access functions
  2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
                   ` (11 preceding siblings ...)
  2019-12-19 12:39 ` [PATCH bpf-next v2 12/12] xsk: use struct_size() helper Magnus Karlsson
@ 2019-12-20 10:09 ` Björn Töpel
  2019-12-21  0:03   ` Alexei Starovoitov
  12 siblings, 1 reply; 15+ messages in thread
From: Björn Töpel @ 2019-12-20 10:09 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Netdev, Jonathan Lemon, bpf, Saeed Mahameed, Jeff Kirsher,
	Fijalkowski, Maciej, Maciej Fijalkowski

On Thu, 19 Dec 2019 at 13:40, Magnus Karlsson <magnus.karlsson@intel.com> wrote:
>
> This patch set cleans up the ring access functions of AF_XDP in hope
> that it will now be easier to understand and maintain. I used to get a
> headache every time I looked at this code in order to really understand it,
> but now I do think it is a lot less painful.
>
> The code has been simplified a lot and as a bonus we get better
> performance in nearly all cases. On my new 2.1 GHz Cascade Lake
> machine with a standard default config plus AF_XDP support and
> CONFIG_PREEMPT on I get the following results in percent performance
> increases with this patch set compared to without it:
>
> Zero-copy (-N):
>           rxdrop        txpush        l2fwd
> 1 core:    -2%            0%            3%
> 2 cores:    4%            0%            3%
>
> Zero-copy with poll() (-N -p):
>           rxdrop        txpush        l2fwd
> 1 core:     3%            0%            1%
> 2 cores:   21%            0%            9%
>
> Skb mode (-S):
> Shows a 0% to 5% performance improvement over the same benchmarks as
> above.
>
> Here 1 core means that we are running the driver processing and the
> application on the same core, while 2 cores means that they execute on
> separate cores. The applications are from the xdpsock sample app.
>
> On my older 2.0 Ghz Broadwell machine that I used for the v1, I get
> the following results:
>
> Zero-copy (-N):
>           rxdrop        txpush        l2fwd
> 1 core:     4%            5%            4%
> 2 cores:    1%            0%            2%
>
> Zero-copy with poll() (-N -p):
>           rxdrop        txpush        l2fwd
> 1 core:     1%            3%            3%
> 2 cores:   22%            0%            5%
>
> Skb mode (-S):
> Shows a 0% to 1% performance improvement over the same benchmarks as
> above.
>
> When a results says 21 or 22% better, as in the case of poll mode with
> 2 cores and rxdrop, my first reaction is that it must be a
> bug. Everything else shows between 0% and 5% performance
> improvement. What is giving rise to 22%? A quick bisect indicates that
> it is patches 2, 3, 4, 5, and 6 that are giving rise to most of this
> improvement. So not one patch in particular, but something around 4%
> improvement from each one of them. Note that exactly this benchmark
> has previously had an extraordinary slow down compared to when running
> without poll syscalls. For all the other poll tests above, the
> slowdown has always been around 4% for using poll syscalls. But with
> the bad performing test in question, it was above 25%. Interestingly,
> after this clean up, the slow down is 4%, just like all the other poll
> tests. Please take an extra peek at this so I have not messed up
> something.
>
> The 0% for several txpush results are due to the test bottlenecking on
> a non-CPU HW resource. If I eliminated that bottleneck on my system, I
> would expect to see an increase there too.
>
> Changes v1 -> v2:
> * Corrected textual errors in the commit logs (Sergei and Martin)
> * Fixed the functions that detect empty and full rings so that they
>   now operate on the global ring state (Maxim)
>
> This patch has been applied against commit a352a82496d1 ("Merge branch 'libbpf-extern-followups'")
>
> Structure of the patch set:
>
> Patch 1: Eliminate the lazy update threshold used when preallocating
>          entries in the completion ring
> Patch 2: Simplify the detection of empty and full rings
> Patch 3: Consolidate the two local producer pointers into one
> Patch 4: Standardize the naming of the producer ring access functions
> Patch 5: Eliminate the Rx batch size used for the fill ring
> Patch 6: Simplify the functions xskq_nb_avail and xskq_nb_free
> Patch 7: Simplify and standardize the naming of the consumer ring
>          access functions
> Patch 8: Change the names of the validation functions to improve
>          readability and also the return value of these functions
> Patch 9: Change the name of xsk_umem_discard_addr() to
>          xsk_umem_release_addr() to better reflect the new
>          names. Requires a name change in the drivers that support AF_XDP
>          zero-copy.
> Patch 10: Remove unnecessary READ_ONCE of data in the ring
> Patch 11: Add overall function naming comment and reorder the functions
>           for easier reference
> Patch 12: Use the struct_size helper function when allocating rings
>
> Thanks: Magnus
>

Very nice cleanup (and performance boost)!

For the series:
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>


> Magnus Karlsson (12):
>   xsk: eliminate the lazy update threshold
>   xsk: simplify detection of empty and full rings
>   xsk: consolidate to one single cached producer pointer
>   xsk: standardize naming of producer ring access functions
>   xsk: eliminate the RX batch size
>   xsk: simplify xskq_nb_avail and xskq_nb_free
>   xsk: simplify the consumer ring access functions
>   xsk: change names of validation functions
>   xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to
>     xsk_umem_release_addr
>   xsk: remove unnecessary READ_ONCE of data
>   xsk: add function naming comments and reorder functions
>   xsk: use struct_size() helper
>
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c         |   4 +-
>  drivers/net/ethernet/intel/ice/ice_xsk.c           |   4 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c       |   4 +-
>  .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   2 +-
>  include/net/xdp_sock.h                             |  14 +-
>  net/xdp/xsk.c                                      |  62 ++--
>  net/xdp/xsk_queue.c                                |  15 +-
>  net/xdp/xsk_queue.h                                | 371 +++++++++++----------
>  8 files changed, 246 insertions(+), 230 deletions(-)
>
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v2 00/12] xsk: clean up ring access functions
  2019-12-20 10:09 ` [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Björn Töpel
@ 2019-12-21  0:03   ` Alexei Starovoitov
  0 siblings, 0 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2019-12-21  0:03 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Netdev, Jonathan Lemon, bpf, Saeed Mahameed,
	Jeff Kirsher, Fijalkowski, Maciej, Maciej Fijalkowski

On Fri, Dec 20, 2019 at 11:09:57AM +0100, Björn Töpel wrote:
> On Thu, 19 Dec 2019 at 13:40, Magnus Karlsson <magnus.karlsson@intel.com> wrote:
> >
> > This patch set cleans up the ring access functions of AF_XDP in hope
> > that it will now be easier to understand and maintain. I used to get a
> > headache every time I looked at this code in order to really understand it,
> > but now I do think it is a lot less painful.
> >
> > The code has been simplified a lot and as a bonus we get better
> > performance in nearly all cases. On my new 2.1 GHz Cascade Lake
> > machine with a standard default config plus AF_XDP support and
> > CONFIG_PREEMPT on I get the following results in percent performance
> > increases with this patch set compared to without it:
> >
> > Zero-copy (-N):
> >           rxdrop        txpush        l2fwd
> > 1 core:    -2%            0%            3%
> > 2 cores:    4%            0%            3%
> >
> > Zero-copy with poll() (-N -p):
> >           rxdrop        txpush        l2fwd
> > 1 core:     3%            0%            1%
> > 2 cores:   21%            0%            9%
> >
> > Skb mode (-S):
> > Shows a 0% to 5% performance improvement over the same benchmarks as
> > above.
> >
> > Here 1 core means that we are running the driver processing and the
> > application on the same core, while 2 cores means that they execute on
> > separate cores. The applications are from the xdpsock sample app.
> >
> > On my older 2.0 Ghz Broadwell machine that I used for the v1, I get
> > the following results:
> >
> > Zero-copy (-N):
> >           rxdrop        txpush        l2fwd
> > 1 core:     4%            5%            4%
> > 2 cores:    1%            0%            2%
> >
> > Zero-copy with poll() (-N -p):
> >           rxdrop        txpush        l2fwd
> > 1 core:     1%            3%            3%
> > 2 cores:   22%            0%            5%
> >
> > Skb mode (-S):
> > Shows a 0% to 1% performance improvement over the same benchmarks as
> > above.
> >
> > When a results says 21 or 22% better, as in the case of poll mode with
> > 2 cores and rxdrop, my first reaction is that it must be a
> > bug. Everything else shows between 0% and 5% performance
> > improvement. What is giving rise to 22%? A quick bisect indicates that
> > it is patches 2, 3, 4, 5, and 6 that are giving rise to most of this
> > improvement. So not one patch in particular, but something around 4%
> > improvement from each one of them. Note that exactly this benchmark
> > has previously had an extraordinary slow down compared to when running
> > without poll syscalls. For all the other poll tests above, the
> > slowdown has always been around 4% for using poll syscalls. But with
> > the bad performing test in question, it was above 25%. Interestingly,
> > after this clean up, the slow down is 4%, just like all the other poll
> > tests. Please take an extra peek at this so I have not messed up
> > something.
> >
> > The 0% for several txpush results are due to the test bottlenecking on
> > a non-CPU HW resource. If I eliminated that bottleneck on my system, I
> > would expect to see an increase there too.
> >
> > Changes v1 -> v2:
> > * Corrected textual errors in the commit logs (Sergei and Martin)
> > * Fixed the functions that detect empty and full rings so that they
> >   now operate on the global ring state (Maxim)
> >
> > This patch has been applied against commit a352a82496d1 ("Merge branch 'libbpf-extern-followups'")
> >
> > Structure of the patch set:
> >
> > Patch 1: Eliminate the lazy update threshold used when preallocating
> >          entries in the completion ring
> > Patch 2: Simplify the detection of empty and full rings
> > Patch 3: Consolidate the two local producer pointers into one
> > Patch 4: Standardize the naming of the producer ring access functions
> > Patch 5: Eliminate the Rx batch size used for the fill ring
> > Patch 6: Simplify the functions xskq_nb_avail and xskq_nb_free
> > Patch 7: Simplify and standardize the naming of the consumer ring
> >          access functions
> > Patch 8: Change the names of the validation functions to improve
> >          readability and also the return value of these functions
> > Patch 9: Change the name of xsk_umem_discard_addr() to
> >          xsk_umem_release_addr() to better reflect the new
> >          names. Requires a name change in the drivers that support AF_XDP
> >          zero-copy.
> > Patch 10: Remove unnecessary READ_ONCE of data in the ring
> > Patch 11: Add overall function naming comment and reorder the functions
> >           for easier reference
> > Patch 12: Use the struct_size helper function when allocating rings
> >
> > Thanks: Magnus
> >
> 
> Very nice cleanup (and performance boost)!
> 
> For the series:
> Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
> Tested-by: Björn Töpel <bjorn.topel@intel.com>
> Acked-by: Björn Töpel <bjorn.topel@intel.com>

Applied, Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-12-21  0:03 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-19 12:39 [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 01/12] xsk: eliminate the lazy update threshold Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 02/12] xsk: simplify detection of empty and full rings Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 03/12] xsk: consolidate to one single cached producer pointer Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 04/12] xsk: standardize naming of producer ring access functions Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 05/12] xsk: eliminate the RX batch size Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 06/12] xsk: simplify xskq_nb_avail and xskq_nb_free Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 07/12] xsk: simplify the consumer ring access functions Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 08/12] xsk: change names of validation functions Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 09/12] xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to xsk_umem_release_addr Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 10/12] xsk: remove unnecessary READ_ONCE of data Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 11/12] xsk: add function naming comments and reorder functions Magnus Karlsson
2019-12-19 12:39 ` [PATCH bpf-next v2 12/12] xsk: use struct_size() helper Magnus Karlsson
2019-12-20 10:09 ` [PATCH bpf-next v2 00/12] xsk: clean up ring access functions Björn Töpel
2019-12-21  0:03   ` Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).