netdev.vger.kernel.org archive mirror
* [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
@ 2019-12-18 10:53 Björn Töpel
  2019-12-18 10:53 ` [PATCH bpf-next 1/8] xdp: simplify devmap cleanup Björn Töpel
                   ` (8 more replies)
  0 siblings, 9 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

This series aims to simplify the XDP maps and
xdp_do_redirect_map()/xdp_do_flush_map(), and to crank out some more
performance from XDP_REDIRECT scenarios.

The first part of the series simplifies all XDP_REDIRECT-capable maps,
so that __XXX_flush_map() no longer requires a map parameter, by
moving the flush list from the map to global scope.

As a result, the map_to_flush member, and its corresponding logic, can
be removed from struct bpf_redirect_info.

The outcome is simpler code and better performance, since per-packet
checks are moved to the flush stage.

Pre-series performance:
  $ sudo taskset -c 22 ./xdpsock -i enp134s0f0 -q 20 -n 1 -r -z
  
   sock0@enp134s0f0:20 rxdrop xdp-drv 
                  pps         pkts        1.00       
  rx              20,797,350  230,942,399
  tx              0           0          
  
  $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
  
  Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
  XDP-cpumap      CPU:to  pps            drop-pps    extra-info
  XDP-RX          20      7723038        0           0
  XDP-RX          total   7723038        0
  cpumap_kthread  total   0              0           0
  redirect_err    total   0              0
  xdp_exception   total   0              0

Post-series performance:
  $ sudo taskset -c 22 ./xdpsock -i enp134s0f0 -q 20 -n 1 -r -z

   sock0@enp134s0f0:20 rxdrop xdp-drv 
                  pps         pkts        1.00       
  rx              21,524,979  86,835,327 
  tx              0           0          
  
  $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0

  Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
  XDP-cpumap      CPU:to  pps            drop-pps    extra-info
  XDP-RX          20      7840124        0           0          
  XDP-RX          total   7840124        0          
  cpumap_kthread  total   0              0           0          
  redirect_err    total   0              0          
  xdp_exception   total   0              0          
  
Results: +3.5% and +1.5% for the two microbenchmarks.

Björn Töpel (8):
  xdp: simplify devmap cleanup
  xdp: simplify cpumap cleanup
  xdp: fix graze->grace type-o in cpumap comments
  xsk: make xskmap flush_list common for all map instances
  xdp: make devmap flush_list common for all map instances
  xdp: make cpumap flush_list common for all map instances
  xdp: remove map_to_flush and map swap detection
  xdp: simplify __bpf_tx_xdp_map()

 include/linux/bpf.h    |  8 ++---
 include/linux/filter.h |  1 -
 include/net/xdp_sock.h | 11 +++---
 kernel/bpf/cpumap.c    | 76 ++++++++++++++--------------------------
 kernel/bpf/devmap.c    | 78 ++++++++++--------------------------------
 kernel/bpf/xskmap.c    | 16 ++-------
 net/core/filter.c      | 63 ++++++----------------------------
 net/xdp/xsk.c          | 17 ++++-----
 8 files changed, 74 insertions(+), 196 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH bpf-next 1/8] xdp: simplify devmap cleanup
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:14   ` Toke Høiland-Jørgensen
  2019-12-18 10:53 ` [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup Björn Töpel
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

After the RCU flavor consolidation [1], call_rcu() and
synchronize_rcu() wait for preempt-disable regions (NAPI) in addition
to the read-side critical sections. As a result of this, the cleanup
code in devmap can be simplified:

* There is no longer a need to flush in __dev_map_entry_free(), since
  the flush is known to have completed by the time the call_rcu()
  callback is triggered.

* When freeing the map, there is no need to explicitly wait for a
  flush. It is guaranteed to be done after the synchronize_rcu() call
  in dev_map_free(). The rcu_barrier() is still needed, so that the
  map is not freed prior to its elements.

[1] https://lwn.net/Articles/777036/

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 kernel/bpf/devmap.c | 41 ++++-------------------------------------
 1 file changed, 4 insertions(+), 37 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 3d3d61b5985b..1fcafc641c12 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -221,18 +221,6 @@ static void dev_map_free(struct bpf_map *map)
 	/* Make sure prior __dev_map_entry_free() have completed. */
 	rcu_barrier();
 
-	/* To ensure all pending flush operations have completed wait for flush
-	 * list to empty on _all_ cpus.
-	 * Because the above synchronize_rcu() ensures the map is disconnected
-	 * from the program we can assume no new items will be added.
-	 */
-	for_each_online_cpu(cpu) {
-		struct list_head *flush_list = per_cpu_ptr(dtab->flush_list, cpu);
-
-		while (!list_empty(flush_list))
-			cond_resched();
-	}
-
 	if (dtab->map.map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		for (i = 0; i < dtab->n_buckets; i++) {
 			struct bpf_dtab_netdev *dev;
@@ -345,8 +333,7 @@ static int dev_map_hash_get_next_key(struct bpf_map *map, void *key,
 	return -ENOENT;
 }
 
-static int bq_xmit_all(struct xdp_bulk_queue *bq, u32 flags,
-		       bool in_napi_ctx)
+static int bq_xmit_all(struct xdp_bulk_queue *bq, u32 flags)
 {
 	struct bpf_dtab_netdev *obj = bq->obj;
 	struct net_device *dev = obj->dev;
@@ -384,11 +371,7 @@ static int bq_xmit_all(struct xdp_bulk_queue *bq, u32 flags,
 	for (i = 0; i < bq->count; i++) {
 		struct xdp_frame *xdpf = bq->q[i];
 
-		/* RX path under NAPI protection, can return frames faster */
-		if (likely(in_napi_ctx))
-			xdp_return_frame_rx_napi(xdpf);
-		else
-			xdp_return_frame(xdpf);
+		xdp_return_frame_rx_napi(xdpf);
 		drops++;
 	}
 	goto out;
@@ -409,7 +392,7 @@ void __dev_map_flush(struct bpf_map *map)
 
 	rcu_read_lock();
 	list_for_each_entry_safe(bq, tmp, flush_list, flush_node)
-		bq_xmit_all(bq, XDP_XMIT_FLUSH, true);
+		bq_xmit_all(bq, XDP_XMIT_FLUSH);
 	rcu_read_unlock();
 }
 
@@ -440,7 +423,7 @@ static int bq_enqueue(struct bpf_dtab_netdev *obj, struct xdp_frame *xdpf,
 	struct xdp_bulk_queue *bq = this_cpu_ptr(obj->bulkq);
 
 	if (unlikely(bq->count == DEV_MAP_BULK_SIZE))
-		bq_xmit_all(bq, 0, true);
+		bq_xmit_all(bq, 0);
 
 	/* Ingress dev_rx will be the same for all xdp_frame's in
 	 * bulk_queue, because bq stored per-CPU and must be flushed
@@ -509,27 +492,11 @@ static void *dev_map_hash_lookup_elem(struct bpf_map *map, void *key)
 	return dev ? &dev->ifindex : NULL;
 }
 
-static void dev_map_flush_old(struct bpf_dtab_netdev *dev)
-{
-	if (dev->dev->netdev_ops->ndo_xdp_xmit) {
-		struct xdp_bulk_queue *bq;
-		int cpu;
-
-		rcu_read_lock();
-		for_each_online_cpu(cpu) {
-			bq = per_cpu_ptr(dev->bulkq, cpu);
-			bq_xmit_all(bq, XDP_XMIT_FLUSH, false);
-		}
-		rcu_read_unlock();
-	}
-}
-
 static void __dev_map_entry_free(struct rcu_head *rcu)
 {
 	struct bpf_dtab_netdev *dev;
 
 	dev = container_of(rcu, struct bpf_dtab_netdev, rcu);
-	dev_map_flush_old(dev);
 	free_percpu(dev->bulkq);
 	dev_put(dev->dev);
 	kfree(dev);
-- 
2.20.1



* [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
  2019-12-18 10:53 ` [PATCH bpf-next 1/8] xdp: simplify devmap cleanup Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:15   ` Toke Høiland-Jørgensen
  2019-12-18 17:47   ` Jakub Kicinski
  2019-12-18 10:53 ` [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments Björn Töpel
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

After the RCU flavor consolidation [1], call_rcu() and
synchronize_rcu() wait for preempt-disable regions (NAPI) in addition
to the read-side critical sections. As a result of this, the cleanup
code in cpumap can be simplified:

* There is no longer a need to flush in __cpu_map_entry_free(), since
  the flush is known to have completed by the time the call_rcu()
  callback is triggered.

* When freeing the map, there is no need to explicitly wait for a
  flush. It's guaranteed to be done after the synchronize_rcu() call
  in cpu_map_free().

[1] https://lwn.net/Articles/777036/

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 kernel/bpf/cpumap.c | 33 +++++----------------------------
 1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index ef49e17ae47c..fbf176e0a2ab 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -75,7 +75,7 @@ struct bpf_cpu_map {
 	struct list_head __percpu *flush_list;
 };
 
-static int bq_flush_to_queue(struct xdp_bulk_queue *bq, bool in_napi_ctx);
+static int bq_flush_to_queue(struct xdp_bulk_queue *bq);
 
 static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 {
@@ -399,7 +399,6 @@ static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
 static void __cpu_map_entry_free(struct rcu_head *rcu)
 {
 	struct bpf_cpu_map_entry *rcpu;
-	int cpu;
 
 	/* This cpu_map_entry have been disconnected from map and one
 	 * RCU graze-period have elapsed.  Thus, XDP cannot queue any
@@ -408,13 +407,6 @@ static void __cpu_map_entry_free(struct rcu_head *rcu)
 	 */
 	rcpu = container_of(rcu, struct bpf_cpu_map_entry, rcu);
 
-	/* Flush remaining packets in percpu bulkq */
-	for_each_online_cpu(cpu) {
-		struct xdp_bulk_queue *bq = per_cpu_ptr(rcpu->bulkq, cpu);
-
-		/* No concurrent bq_enqueue can run at this point */
-		bq_flush_to_queue(bq, false);
-	}
 	free_percpu(rcpu->bulkq);
 	/* Cannot kthread_stop() here, last put free rcpu resources */
 	put_cpu_map_entry(rcpu);
@@ -522,18 +514,6 @@ static void cpu_map_free(struct bpf_map *map)
 	bpf_clear_redirect_map(map);
 	synchronize_rcu();
 
-	/* To ensure all pending flush operations have completed wait for flush
-	 * list be empty on _all_ cpus. Because the above synchronize_rcu()
-	 * ensures the map is disconnected from the program we can assume no new
-	 * items will be added to the list.
-	 */
-	for_each_online_cpu(cpu) {
-		struct list_head *flush_list = per_cpu_ptr(cmap->flush_list, cpu);
-
-		while (!list_empty(flush_list))
-			cond_resched();
-	}
-
 	/* For cpu_map the remote CPUs can still be using the entries
 	 * (struct bpf_cpu_map_entry).
 	 */
@@ -599,7 +579,7 @@ const struct bpf_map_ops cpu_map_ops = {
 	.map_check_btf		= map_check_no_btf,
 };
 
-static int bq_flush_to_queue(struct xdp_bulk_queue *bq, bool in_napi_ctx)
+static int bq_flush_to_queue(struct xdp_bulk_queue *bq)
 {
 	struct bpf_cpu_map_entry *rcpu = bq->obj;
 	unsigned int processed = 0, drops = 0;
@@ -620,10 +600,7 @@ static int bq_flush_to_queue(struct xdp_bulk_queue *bq, bool in_napi_ctx)
 		err = __ptr_ring_produce(q, xdpf);
 		if (err) {
 			drops++;
-			if (likely(in_napi_ctx))
-				xdp_return_frame_rx_napi(xdpf);
-			else
-				xdp_return_frame(xdpf);
+			xdp_return_frame_rx_napi(xdpf);
 		}
 		processed++;
 	}
@@ -646,7 +623,7 @@ static int bq_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf)
 	struct xdp_bulk_queue *bq = this_cpu_ptr(rcpu->bulkq);
 
 	if (unlikely(bq->count == CPU_MAP_BULK_SIZE))
-		bq_flush_to_queue(bq, true);
+		bq_flush_to_queue(bq);
 
 	/* Notice, xdp_buff/page MUST be queued here, long enough for
 	 * driver to code invoking us to finished, due to driver
@@ -688,7 +665,7 @@ void __cpu_map_flush(struct bpf_map *map)
 	struct xdp_bulk_queue *bq, *tmp;
 
 	list_for_each_entry_safe(bq, tmp, flush_list, flush_node) {
-		bq_flush_to_queue(bq, true);
+		bq_flush_to_queue(bq);
 
 		/* If already running, costs spin_lock_irqsave + smb_mb */
 		wake_up_process(bq->obj->kthread);
-- 
2.20.1



* [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
  2019-12-18 10:53 ` [PATCH bpf-next 1/8] xdp: simplify devmap cleanup Björn Töpel
  2019-12-18 10:53 ` [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:18   ` Toke Høiland-Jørgensen
  2019-12-18 10:53 ` [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances Björn Töpel
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

Simple spelling fix.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 kernel/bpf/cpumap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index fbf176e0a2ab..66948fbc58d8 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -401,7 +401,7 @@ static void __cpu_map_entry_free(struct rcu_head *rcu)
 	struct bpf_cpu_map_entry *rcpu;
 
 	/* This cpu_map_entry have been disconnected from map and one
-	 * RCU graze-period have elapsed.  Thus, XDP cannot queue any
+	 * RCU grace-period have elapsed.  Thus, XDP cannot queue any
 	 * new packets and cannot change/set flush_needed that can
 	 * find this entry.
 	 */
@@ -428,7 +428,7 @@ static void __cpu_map_entry_free(struct rcu_head *rcu)
  * percpu bulkq to queue.  Due to caller map_delete_elem() disable
  * preemption, cannot call kthread_stop() to make sure queue is empty.
  * Instead a work_queue is started for stopping kthread,
- * cpu_map_kthread_stop, which waits for an RCU graze period before
+ * cpu_map_kthread_stop, which waits for an RCU grace period before
  * stopping kthread, emptying the queue.
  */
 static void __cpu_map_entry_replace(struct bpf_cpu_map *cmap,
@@ -524,7 +524,7 @@ static void cpu_map_free(struct bpf_map *map)
 		if (!rcpu)
 			continue;
 
-		/* bq flush and cleanup happens after RCU graze-period */
+		/* bq flush and cleanup happens after RCU grace-period */
 		__cpu_map_entry_replace(cmap, i, NULL); /* call_rcu */
 	}
 	free_percpu(cmap->flush_list);
-- 
2.20.1



* [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (2 preceding siblings ...)
  2019-12-18 10:53 ` [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:19   ` Toke Høiland-Jørgensen
  2019-12-18 10:53 ` [PATCH bpf-next 5/8] xdp: make devmap " Björn Töpel
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

The xskmap flush list is used to track entries that need to be
flushed by the xdp_do_flush_map() function. This list used to be
per-map, but there is really no reason for that. Instead, make the
flush list global for all xskmaps, which simplifies __xsk_map_flush()
and xsk_map_alloc().

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/net/xdp_sock.h | 11 ++++-------
 kernel/bpf/xskmap.c    | 16 ++--------------
 net/core/filter.c      |  9 ++++-----
 net/xdp/xsk.c          | 17 +++++++++--------
 4 files changed, 19 insertions(+), 34 deletions(-)

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index e3780e4b74e1..48594740d67c 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -72,7 +72,6 @@ struct xdp_umem {
 
 struct xsk_map {
 	struct bpf_map map;
-	struct list_head __percpu *flush_list;
 	spinlock_t lock; /* Synchronize map updates */
 	struct xdp_sock *xsk_map[];
 };
@@ -139,9 +138,8 @@ void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs,
 			     struct xdp_sock **map_entry);
 int xsk_map_inc(struct xsk_map *map);
 void xsk_map_put(struct xsk_map *map);
-int __xsk_map_redirect(struct bpf_map *map, struct xdp_buff *xdp,
-		       struct xdp_sock *xs);
-void __xsk_map_flush(struct bpf_map *map);
+int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp);
+void __xsk_map_flush(void);
 
 static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map,
 						     u32 key)
@@ -369,13 +367,12 @@ static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 handle,
 	return 0;
 }
 
-static inline int __xsk_map_redirect(struct bpf_map *map, struct xdp_buff *xdp,
-				     struct xdp_sock *xs)
+static inline int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
 	return -EOPNOTSUPP;
 }
 
-static inline void __xsk_map_flush(struct bpf_map *map)
+static inline void __xsk_map_flush(void)
 {
 }
 
diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
index 90c4fce1c981..3757a7a50ab7 100644
--- a/kernel/bpf/xskmap.c
+++ b/kernel/bpf/xskmap.c
@@ -74,7 +74,7 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	struct bpf_map_memory mem;
 	int cpu, err, numa_node;
 	struct xsk_map *m;
-	u64 cost, size;
+	u64 size;
 
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
@@ -86,9 +86,8 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 
 	numa_node = bpf_map_attr_numa_node(attr);
 	size = struct_size(m, xsk_map, attr->max_entries);
-	cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());
 
-	err = bpf_map_charge_init(&mem, cost);
+	err = bpf_map_charge_init(&mem, size);
 	if (err < 0)
 		return ERR_PTR(err);
 
@@ -102,16 +101,6 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	bpf_map_charge_move(&m->map.memory, &mem);
 	spin_lock_init(&m->lock);
 
-	m->flush_list = alloc_percpu(struct list_head);
-	if (!m->flush_list) {
-		bpf_map_charge_finish(&m->map.memory);
-		bpf_map_area_free(m);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	for_each_possible_cpu(cpu)
-		INIT_LIST_HEAD(per_cpu_ptr(m->flush_list, cpu));
-
 	return &m->map;
 }
 
@@ -121,7 +110,6 @@ static void xsk_map_free(struct bpf_map *map)
 
 	bpf_clear_redirect_map(map);
 	synchronize_net();
-	free_percpu(m->flush_list);
 	bpf_map_area_free(m);
 }
 
diff --git a/net/core/filter.c b/net/core/filter.c
index a411f7835dee..c51678c473c5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3511,8 +3511,7 @@ xdp_do_redirect_slow(struct net_device *dev, struct xdp_buff *xdp,
 
 static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
 			    struct bpf_map *map,
-			    struct xdp_buff *xdp,
-			    u32 index)
+			    struct xdp_buff *xdp)
 {
 	int err;
 
@@ -3537,7 +3536,7 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
 	case BPF_MAP_TYPE_XSKMAP: {
 		struct xdp_sock *xs = fwd;
 
-		err = __xsk_map_redirect(map, xdp, xs);
+		err = __xsk_map_redirect(xs, xdp);
 		return err;
 	}
 	default:
@@ -3562,7 +3561,7 @@ void xdp_do_flush_map(void)
 			__cpu_map_flush(map);
 			break;
 		case BPF_MAP_TYPE_XSKMAP:
-			__xsk_map_flush(map);
+			__xsk_map_flush();
 			break;
 		default:
 			break;
@@ -3619,7 +3618,7 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 	if (ri->map_to_flush && unlikely(ri->map_to_flush != map))
 		xdp_do_flush_map();
 
-	err = __bpf_tx_xdp_map(dev, fwd, map, xdp, index);
+	err = __bpf_tx_xdp_map(dev, fwd, map, xdp);
 	if (unlikely(err))
 		goto err;
 
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 956793893c9d..e45c27f5cfca 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -31,6 +31,8 @@
 
 #define TX_BATCH_SIZE 16
 
+static DEFINE_PER_CPU(struct list_head, xskmap_flush_list);
+
 bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs)
 {
 	return READ_ONCE(xs->rx) &&  READ_ONCE(xs->umem) &&
@@ -264,11 +266,9 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	return err;
 }
 
-int __xsk_map_redirect(struct bpf_map *map, struct xdp_buff *xdp,
-		       struct xdp_sock *xs)
+int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
-	struct xsk_map *m = container_of(map, struct xsk_map, map);
-	struct list_head *flush_list = this_cpu_ptr(m->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&xskmap_flush_list);
 	int err;
 
 	err = xsk_rcv(xs, xdp);
@@ -281,10 +281,9 @@ int __xsk_map_redirect(struct bpf_map *map, struct xdp_buff *xdp,
 	return 0;
 }
 
-void __xsk_map_flush(struct bpf_map *map)
+void __xsk_map_flush(void)
 {
-	struct xsk_map *m = container_of(map, struct xsk_map, map);
-	struct list_head *flush_list = this_cpu_ptr(m->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&xskmap_flush_list);
 	struct xdp_sock *xs, *tmp;
 
 	list_for_each_entry_safe(xs, tmp, flush_list, flush_node) {
@@ -1177,7 +1176,7 @@ static struct pernet_operations xsk_net_ops = {
 
 static int __init xsk_init(void)
 {
-	int err;
+	int err, cpu;
 
 	err = proto_register(&xsk_proto, 0 /* no slab */);
 	if (err)
@@ -1195,6 +1194,8 @@ static int __init xsk_init(void)
 	if (err)
 		goto out_pernet;
 
+	for_each_possible_cpu(cpu)
+		INIT_LIST_HEAD(&per_cpu(xskmap_flush_list, cpu));
 	return 0;
 
 out_pernet:
-- 
2.20.1



* [PATCH bpf-next 5/8] xdp: make devmap flush_list common for all map instances
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (3 preceding siblings ...)
  2019-12-18 10:53 ` [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:19   ` Toke Høiland-Jørgensen
  2019-12-18 10:53 ` [PATCH bpf-next 6/8] xdp: make cpumap " Björn Töpel
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

The devmap flush list is used to track entries that need to be
flushed by the xdp_do_flush_map() function. This list used to be
per-map, but there is really no reason for that. Instead, make the
flush list global for all devmaps, which simplifies __dev_map_flush()
and dev_map_init_map().

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/bpf.h |  4 ++--
 kernel/bpf/devmap.c | 37 ++++++++++++++-----------------------
 net/core/filter.c   |  2 +-
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index d467983e61bb..31191804ca09 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -959,7 +959,7 @@ struct sk_buff;
 
 struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key);
 struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key);
-void __dev_map_flush(struct bpf_map *map);
+void __dev_map_flush(void);
 int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 		    struct net_device *dev_rx);
 int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
@@ -1068,7 +1068,7 @@ static inline struct net_device  *__dev_map_hash_lookup_elem(struct bpf_map *map
 	return NULL;
 }
 
-static inline void __dev_map_flush(struct bpf_map *map)
+static inline void __dev_map_flush(void)
 {
 }
 
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 1fcafc641c12..da9c832fc5c8 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -75,7 +75,6 @@ struct bpf_dtab_netdev {
 struct bpf_dtab {
 	struct bpf_map map;
 	struct bpf_dtab_netdev **netdev_map; /* DEVMAP type only */
-	struct list_head __percpu *flush_list;
 	struct list_head list;
 
 	/* these are only used for DEVMAP_HASH type maps */
@@ -85,6 +84,7 @@ struct bpf_dtab {
 	u32 n_buckets;
 };
 
+static DEFINE_PER_CPU(struct list_head, dev_map_flush_list);
 static DEFINE_SPINLOCK(dev_map_lock);
 static LIST_HEAD(dev_map_list);
 
@@ -109,8 +109,8 @@ static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab,
 
 static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 {
-	int err, cpu;
-	u64 cost;
+	u64 cost = 0;
+	int err;
 
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
@@ -125,9 +125,6 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 
 	bpf_map_init_from_attr(&dtab->map, attr);
 
-	/* make sure page count doesn't overflow */
-	cost = (u64) sizeof(struct list_head) * num_possible_cpus();
-
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
 
@@ -143,17 +140,10 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 	if (err)
 		return -EINVAL;
 
-	dtab->flush_list = alloc_percpu(struct list_head);
-	if (!dtab->flush_list)
-		goto free_charge;
-
-	for_each_possible_cpu(cpu)
-		INIT_LIST_HEAD(per_cpu_ptr(dtab->flush_list, cpu));
-
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets);
 		if (!dtab->dev_index_head)
-			goto free_percpu;
+			goto free_charge;
 
 		spin_lock_init(&dtab->index_lock);
 	} else {
@@ -161,13 +151,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 						      sizeof(struct bpf_dtab_netdev *),
 						      dtab->map.numa_node);
 		if (!dtab->netdev_map)
-			goto free_percpu;
+			goto free_charge;
 	}
 
 	return 0;
 
-free_percpu:
-	free_percpu(dtab->flush_list);
 free_charge:
 	bpf_map_charge_finish(&dtab->map.memory);
 	return -ENOMEM;
@@ -201,7 +189,7 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 static void dev_map_free(struct bpf_map *map)
 {
 	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
-	int i, cpu;
+	int i;
 
 	/* At this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0,
 	 * so the programs (can be more than one that used this map) were
@@ -254,7 +242,6 @@ static void dev_map_free(struct bpf_map *map)
 		bpf_map_area_free(dtab->netdev_map);
 	}
 
-	free_percpu(dtab->flush_list);
 	kfree(dtab);
 }
 
@@ -384,10 +371,9 @@ static int bq_xmit_all(struct xdp_bulk_queue *bq, u32 flags)
  * net device can be torn down. On devmap tear down we ensure the flush list
  * is empty before completing to ensure all flush operations have completed.
  */
-void __dev_map_flush(struct bpf_map *map)
+void __dev_map_flush(void)
 {
-	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
-	struct list_head *flush_list = this_cpu_ptr(dtab->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&dev_map_flush_list);
 	struct xdp_bulk_queue *bq, *tmp;
 
 	rcu_read_lock();
@@ -419,7 +405,7 @@ static int bq_enqueue(struct bpf_dtab_netdev *obj, struct xdp_frame *xdpf,
 		      struct net_device *dev_rx)
 
 {
-	struct list_head *flush_list = this_cpu_ptr(obj->dtab->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&dev_map_flush_list);
 	struct xdp_bulk_queue *bq = this_cpu_ptr(obj->bulkq);
 
 	if (unlikely(bq->count == DEV_MAP_BULK_SIZE))
@@ -777,10 +763,15 @@ static struct notifier_block dev_map_notifier = {
 
 static int __init dev_map_init(void)
 {
+	int cpu;
+
 	/* Assure tracepoint shadow struct _bpf_dtab_netdev is in sync */
 	BUILD_BUG_ON(offsetof(struct bpf_dtab_netdev, dev) !=
 		     offsetof(struct _bpf_dtab_netdev, dev));
 	register_netdevice_notifier(&dev_map_notifier);
+
+	for_each_possible_cpu(cpu)
+		INIT_LIST_HEAD(&per_cpu(dev_map_flush_list, cpu));
 	return 0;
 }
 
diff --git a/net/core/filter.c b/net/core/filter.c
index c51678c473c5..b7570cb84902 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3555,7 +3555,7 @@ void xdp_do_flush_map(void)
 		switch (map->map_type) {
 		case BPF_MAP_TYPE_DEVMAP:
 		case BPF_MAP_TYPE_DEVMAP_HASH:
-			__dev_map_flush(map);
+			__dev_map_flush();
 			break;
 		case BPF_MAP_TYPE_CPUMAP:
 			__cpu_map_flush(map);
-- 
2.20.1



* [PATCH bpf-next 6/8] xdp: make cpumap flush_list common for all map instances
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (4 preceding siblings ...)
  2019-12-18 10:53 ` [PATCH bpf-next 5/8] xdp: make devmap " Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:19   ` Toke Høiland-Jørgensen
  2019-12-18 10:53 ` [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection Björn Töpel
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

The cpumap flush list is used to track entries that need to be
flushed by the xdp_do_flush_map() function. This list used to be
per-map, but there is really no reason for that. Instead, make the
flush list global for all cpumaps, which simplifies __cpu_map_flush()
and cpu_map_alloc().

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/bpf.h |  4 ++--
 kernel/bpf/cpumap.c | 37 ++++++++++++++++++-------------------
 net/core/filter.c   |  2 +-
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 31191804ca09..8f3e00c84f39 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -966,7 +966,7 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
 			     struct bpf_prog *xdp_prog);
 
 struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key);
-void __cpu_map_flush(struct bpf_map *map);
+void __cpu_map_flush(void);
 int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
 		    struct net_device *dev_rx);
 
@@ -1097,7 +1097,7 @@ struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
 	return NULL;
 }
 
-static inline void __cpu_map_flush(struct bpf_map *map)
+static inline void __cpu_map_flush(void)
 {
 }
 
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 66948fbc58d8..70f71b154fa5 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -72,17 +72,18 @@ struct bpf_cpu_map {
 	struct bpf_map map;
 	/* Below members specific for map type */
 	struct bpf_cpu_map_entry **cpu_map;
-	struct list_head __percpu *flush_list;
 };
 
+static DEFINE_PER_CPU(struct list_head, cpu_map_flush_list);
+
 static int bq_flush_to_queue(struct xdp_bulk_queue *bq);
 
 static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 {
 	struct bpf_cpu_map *cmap;
 	int err = -ENOMEM;
-	int ret, cpu;
 	u64 cost;
+	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return ERR_PTR(-EPERM);
@@ -106,7 +107,6 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 
 	/* make sure page count doesn't overflow */
 	cost = (u64) cmap->map.max_entries * sizeof(struct bpf_cpu_map_entry *);
-	cost += sizeof(struct list_head) * num_possible_cpus();
 
 	/* Notice returns -EPERM on if map size is larger than memlock limit */
 	ret = bpf_map_charge_init(&cmap->map.memory, cost);
@@ -115,23 +115,14 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 		goto free_cmap;
 	}
 
-	cmap->flush_list = alloc_percpu(struct list_head);
-	if (!cmap->flush_list)
-		goto free_charge;
-
-	for_each_possible_cpu(cpu)
-		INIT_LIST_HEAD(per_cpu_ptr(cmap->flush_list, cpu));
-
 	/* Alloc array for possible remote "destination" CPUs */
 	cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
 					   sizeof(struct bpf_cpu_map_entry *),
 					   cmap->map.numa_node);
 	if (!cmap->cpu_map)
-		goto free_percpu;
+		goto free_charge;
 
 	return &cmap->map;
-free_percpu:
-	free_percpu(cmap->flush_list);
 free_charge:
 	bpf_map_charge_finish(&cmap->map.memory);
 free_cmap:
@@ -499,7 +490,6 @@ static int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
 static void cpu_map_free(struct bpf_map *map)
 {
 	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
-	int cpu;
 	u32 i;
 
 	/* At this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0,
@@ -527,7 +517,6 @@ static void cpu_map_free(struct bpf_map *map)
 		/* bq flush and cleanup happens after RCU grace-period */
 		__cpu_map_entry_replace(cmap, i, NULL); /* call_rcu */
 	}
-	free_percpu(cmap->flush_list);
 	bpf_map_area_free(cmap->cpu_map);
 	kfree(cmap);
 }
@@ -619,7 +608,7 @@ static int bq_flush_to_queue(struct xdp_bulk_queue *bq)
  */
 static int bq_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf)
 {
-	struct list_head *flush_list = this_cpu_ptr(rcpu->cmap->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&cpu_map_flush_list);
 	struct xdp_bulk_queue *bq = this_cpu_ptr(rcpu->bulkq);
 
 	if (unlikely(bq->count == CPU_MAP_BULK_SIZE))
@@ -658,10 +647,9 @@ int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
 	return 0;
 }
 
-void __cpu_map_flush(struct bpf_map *map)
+void __cpu_map_flush(void)
 {
-	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
-	struct list_head *flush_list = this_cpu_ptr(cmap->flush_list);
+	struct list_head *flush_list = this_cpu_ptr(&cpu_map_flush_list);
 	struct xdp_bulk_queue *bq, *tmp;
 
 	list_for_each_entry_safe(bq, tmp, flush_list, flush_node) {
@@ -671,3 +659,14 @@ void __cpu_map_flush(struct bpf_map *map)
 		wake_up_process(bq->obj->kthread);
 	}
 }
+
+static int __init cpu_map_init(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		INIT_LIST_HEAD(&per_cpu(cpu_map_flush_list, cpu));
+	return 0;
+}
+
+subsys_initcall(cpu_map_init);
diff --git a/net/core/filter.c b/net/core/filter.c
index b7570cb84902..c706325b3e66 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3558,7 +3558,7 @@ void xdp_do_flush_map(void)
 			__dev_map_flush();
 			break;
 		case BPF_MAP_TYPE_CPUMAP:
-			__cpu_map_flush(map);
+			__cpu_map_flush();
 			break;
 		case BPF_MAP_TYPE_XSKMAP:
 			__xsk_map_flush();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (5 preceding siblings ...)
  2019-12-18 10:53 ` [PATCH bpf-next 6/8] xdp: make cpumap " Björn Töpel
@ 2019-12-18 10:53 ` Björn Töpel
  2019-12-18 11:20   ` Toke Høiland-Jørgensen
  2019-12-18 13:03   ` Jesper Dangaard Brouer
  2019-12-18 10:54 ` [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map() Björn Töpel
  2019-12-18 11:11 ` [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Jesper Dangaard Brouer
  8 siblings, 2 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:53 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

Now that all XDP maps that can be used with bpf_redirect_map() track
entries to be flushed in a global fashion, there is no need to track
that the map has changed and to flush from xdp_do_generic_map()
anymore. All entries will be flushed in xdp_do_flush_map().

This means that map_to_flush, and the corresponding checks, can be
removed. Moving the flush logic to one place, xdp_do_flush_map(), gives
a bulking behavior and a performance boost.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/filter.h |  1 -
 net/core/filter.c      | 27 +++------------------------
 2 files changed, 3 insertions(+), 25 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 37ac7025031d..69d6706fc889 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -592,7 +592,6 @@ struct bpf_redirect_info {
 	u32 tgt_index;
 	void *tgt_value;
 	struct bpf_map *map;
-	struct bpf_map *map_to_flush;
 	u32 kern_flags;
 };
 
diff --git a/net/core/filter.c b/net/core/filter.c
index c706325b3e66..d9caa3e57ea1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3547,26 +3547,9 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
 
 void xdp_do_flush_map(void)
 {
-	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
-	struct bpf_map *map = ri->map_to_flush;
-
-	ri->map_to_flush = NULL;
-	if (map) {
-		switch (map->map_type) {
-		case BPF_MAP_TYPE_DEVMAP:
-		case BPF_MAP_TYPE_DEVMAP_HASH:
-			__dev_map_flush();
-			break;
-		case BPF_MAP_TYPE_CPUMAP:
-			__cpu_map_flush();
-			break;
-		case BPF_MAP_TYPE_XSKMAP:
-			__xsk_map_flush();
-			break;
-		default:
-			break;
-		}
-	}
+	__dev_map_flush();
+	__cpu_map_flush();
+	__xsk_map_flush();
 }
 EXPORT_SYMBOL_GPL(xdp_do_flush_map);
 
@@ -3615,14 +3598,10 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 	ri->tgt_value = NULL;
 	WRITE_ONCE(ri->map, NULL);
 
-	if (ri->map_to_flush && unlikely(ri->map_to_flush != map))
-		xdp_do_flush_map();
-
 	err = __bpf_tx_xdp_map(dev, fwd, map, xdp);
 	if (unlikely(err))
 		goto err;
 
-	ri->map_to_flush = map;
 	_trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index);
 	return 0;
 err:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map()
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (6 preceding siblings ...)
  2019-12-18 10:53 ` [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection Björn Töpel
@ 2019-12-18 10:54 ` Björn Töpel
  2019-12-18 11:20   ` Toke Høiland-Jørgensen
  2019-12-18 11:11 ` [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Jesper Dangaard Brouer
  8 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 10:54 UTC (permalink / raw)
  To: netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

From: Björn Töpel <bjorn.topel@intel.com>

The explicit error checking is not needed. Simply return the error
instead.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/core/filter.c | 33 +++++++--------------------------
 1 file changed, 7 insertions(+), 26 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index d9caa3e57ea1..217af9974c86 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3510,35 +3510,16 @@ xdp_do_redirect_slow(struct net_device *dev, struct xdp_buff *xdp,
 }
 
 static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
-			    struct bpf_map *map,
-			    struct xdp_buff *xdp)
+			    struct bpf_map *map, struct xdp_buff *xdp)
 {
-	int err;
-
 	switch (map->map_type) {
 	case BPF_MAP_TYPE_DEVMAP:
-	case BPF_MAP_TYPE_DEVMAP_HASH: {
-		struct bpf_dtab_netdev *dst = fwd;
-
-		err = dev_map_enqueue(dst, xdp, dev_rx);
-		if (unlikely(err))
-			return err;
-		break;
-	}
-	case BPF_MAP_TYPE_CPUMAP: {
-		struct bpf_cpu_map_entry *rcpu = fwd;
-
-		err = cpu_map_enqueue(rcpu, xdp, dev_rx);
-		if (unlikely(err))
-			return err;
-		break;
-	}
-	case BPF_MAP_TYPE_XSKMAP: {
-		struct xdp_sock *xs = fwd;
-
-		err = __xsk_map_redirect(xs, xdp);
-		return err;
-	}
+	case BPF_MAP_TYPE_DEVMAP_HASH:
+		return dev_map_enqueue(fwd, xdp, dev_rx);
+	case BPF_MAP_TYPE_CPUMAP:
+		return cpu_map_enqueue(fwd, xdp, dev_rx);
+	case BPF_MAP_TYPE_XSKMAP:
+		return __xsk_map_redirect(fwd, xdp);
 	default:
 		break;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
                   ` (7 preceding siblings ...)
  2019-12-18 10:54 ` [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map() Björn Töpel
@ 2019-12-18 11:11 ` Jesper Dangaard Brouer
  2019-12-18 11:39   ` Björn Töpel
  8 siblings, 1 reply; 32+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-18 11:11 UTC (permalink / raw)
  To: Björn Töpel
  Cc: brouer, netdev, ast, daniel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

On Wed, 18 Dec 2019 11:53:52 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

>   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
>   
>   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
>   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
>   XDP-RX          20      7723038        0           0
>   XDP-RX          total   7723038        0
>   cpumap_kthread  total   0              0           0
>   redirect_err    total   0              0
>   xdp_exception   total   0              0

Hmm... I'm missing some counters on the kthread side.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 1/8] xdp: simplify devmap cleanup
  2019-12-18 10:53 ` [PATCH bpf-next 1/8] xdp: simplify devmap cleanup Björn Töpel
@ 2019-12-18 11:14   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:14 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> After the RCU flavor consolidation [1], call_rcu() and
> synchronize_rcu() waits for preempt-disable regions (NAPI) in addition
> to the read-side critical sections. As a result of this, the cleanup
> code in devmap can be simplified
>
> * There is no longer a need to flush in __dev_map_entry_free, since we
>   know that this has been done when the call_rcu() callback is
>   triggered.
>
> * When freeing the map, there is no need to explicitly wait for a
>   flush. It's guaranteed to be done after the synchronize_rcu() call
>   in dev_map_free(). The rcu_barrier() is still needed, so that the
>   map is not freed prior the elements.
>
> [1] https://lwn.net/Articles/777036/
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup
  2019-12-18 10:53 ` [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup Björn Töpel
@ 2019-12-18 11:15   ` Toke Høiland-Jørgensen
  2019-12-18 17:47   ` Jakub Kicinski
  1 sibling, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:15 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> After the RCU flavor consolidation [1], call_rcu() and
> synchronize_rcu() waits for preempt-disable regions (NAPI) in addition
> to the read-side critical sections. As a result of this, the cleanup
> code in cpumap can be simplified
>
> * There is no longer a need to flush in __cpu_map_entry_free, since we
>   know that this has been done when the call_rcu() callback is
>   triggered.
>
> * When freeing the map, there is no need to explicitly wait for a
>   flush. It's guaranteed to be done after the synchronize_rcu() call
>   in cpu_map_free().
>
> [1] https://lwn.net/Articles/777036/
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments
  2019-12-18 10:53 ` [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments Björn Töpel
@ 2019-12-18 11:18   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:18 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> Simple spelling fix.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Even though I kinda like the mental image of RCU going out to the
savanna for a mouthful of grass that the typo seems to imply:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances
  2019-12-18 10:53 ` [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances Björn Töpel
@ 2019-12-18 11:19   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:19 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The xskmap flush list is used to track entries that need to be flushed
> via the xdp_do_flush_map() function. This list used to be per-map, but
> there is really no reason for that. Instead, make the flush list global
> for all xskmaps, which simplifies __xsk_map_flush() and xsk_map_alloc().
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 5/8] xdp: make devmap flush_list common for all map instances
  2019-12-18 10:53 ` [PATCH bpf-next 5/8] xdp: make devmap " Björn Töpel
@ 2019-12-18 11:19   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:19 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The devmap flush list is used to track entries that need to be flushed
> via the xdp_do_flush_map() function. This list used to be per-map, but
> there is really no reason for that. Instead, make the flush list global
> for all devmaps, which simplifies __dev_map_flush() and
> dev_map_init_map().
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 6/8] xdp: make cpumap flush_list common for all map instances
  2019-12-18 10:53 ` [PATCH bpf-next 6/8] xdp: make cpumap " Björn Töpel
@ 2019-12-18 11:19   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:19 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The cpumap flush list is used to track entries that need to be flushed
> via the xdp_do_flush_map() function. This list used to be per-map, but
> there is really no reason for that. Instead, make the flush list global
> for all cpumaps, which simplifies __cpu_map_flush() and cpu_map_alloc().
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection
  2019-12-18 10:53 ` [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection Björn Töpel
@ 2019-12-18 11:20   ` Toke Høiland-Jørgensen
  2019-12-18 13:03   ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:20 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> Now that all XDP maps that can be used with bpf_redirect_map() track
> entries to be flushed in a global fashion, there is no need to track
> that the map has changed and to flush from xdp_do_generic_map()
> anymore. All entries will be flushed in xdp_do_flush_map().
>
> This means that map_to_flush, and the corresponding checks, can be
> removed. Moving the flush logic to one place, xdp_do_flush_map(),
> gives a bulking behavior and a performance boost.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map()
  2019-12-18 10:54 ` [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map() Björn Töpel
@ 2019-12-18 11:20   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 32+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-18 11:20 UTC (permalink / raw)
  To: Björn Töpel, netdev, ast, daniel
  Cc: Björn Töpel, bpf, davem, jakub.kicinski, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The explicit error checking is not needed. Simply return the error
> instead.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 11:11 ` [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Jesper Dangaard Brouer
@ 2019-12-18 11:39   ` Björn Töpel
  2019-12-18 12:03     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 11:39 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon

On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 11:53:52 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> >
> >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> >   XDP-RX          20      7723038        0           0
> >   XDP-RX          total   7723038        0
> >   cpumap_kthread  total   0              0           0
> >   redirect_err    total   0              0
> >   xdp_exception   total   0              0
>
> Hmm... I'm missing some counters on the kthread side.
>

Oh? Any ideas why? I just ran the upstream sample straight off.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 11:39   ` Björn Töpel
@ 2019-12-18 12:03     ` Jesper Dangaard Brouer
  2019-12-18 12:18       ` Björn Töpel
  2019-12-19  0:39       ` Andrii Nakryiko
  0 siblings, 2 replies; 32+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-18 12:03 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon, brouer, Maciej Fijalkowski

On Wed, 18 Dec 2019 12:39:53 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > On Wed, 18 Dec 2019 11:53:52 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >  
> > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > >
> > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > >   XDP-RX          20      7723038        0           0
> > >   XDP-RX          total   7723038        0
> > >   cpumap_kthread  total   0              0           0
> > >   redirect_err    total   0              0
> > >   xdp_exception   total   0              0  
> >
> > Hmm... I'm missing some counters on the kthread side.
> >  
> 
> Oh? Any ideas why? I just ran the upstream sample straight off.

Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
XDP samples to libbpf usage") (Cc Maciej).

The old bpf_load.c would auto-attach the tracepoints, but with libbpf
you have to be explicit about it.

Can I ask you to also run a test with --stress-mode for
./xdp_redirect_cpu, to flush out any potential RCU race-conditions
(don't provide output, this is just a robustness test).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 12:03     ` Jesper Dangaard Brouer
@ 2019-12-18 12:18       ` Björn Töpel
  2019-12-18 12:32         ` Björn Töpel
  2019-12-18 12:40         ` Jesper Dangaard Brouer
  2019-12-19  0:39       ` Andrii Nakryiko
  1 sibling, 2 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 12:18 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon, Maciej Fijalkowski

On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 12:39:53 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > >
> > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > >
> > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > >   XDP-RX          20      7723038        0           0
> > > >   XDP-RX          total   7723038        0
> > > >   cpumap_kthread  total   0              0           0
> > > >   redirect_err    total   0              0
> > > >   xdp_exception   total   0              0
> > >
> > > Hmm... I'm missing some counters on the kthread side.
> > >
> >
> > Oh? Any ideas why? I just ran the upstream sample straight off.
>
> Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> XDP samples to libbpf usage") (Cc Maciej).
>
> The old bpf_load.c would auto-attach the tracepoints, but with libbpf
> you have to be explicit about it.
>
> Can I ask you to also run a test with --stress-mode for
> ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> (don't provide output, this is just a robustness test).
>

Sure! Other than that, does the command line above make sense? I'm
blasting UDP packets to core 20, and the idea was to re-route them to
22.


Björn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 12:18       ` Björn Töpel
@ 2019-12-18 12:32         ` Björn Töpel
  2019-12-18 12:40         ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 12:32 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon, Maciej Fijalkowski

On Wed, 18 Dec 2019 at 13:18, Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > On Wed, 18 Dec 2019 12:39:53 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > > >
> > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > >
> > > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > >
> > > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > > >   XDP-RX          20      7723038        0           0
> > > > >   XDP-RX          total   7723038        0
> > > > >   cpumap_kthread  total   0              0           0
> > > > >   redirect_err    total   0              0
> > > > >   xdp_exception   total   0              0
> > > >
> > > > Hmm... I'm missing some counters on the kthread side.
> > > >
> > >
> > > Oh? Any ideas why? I just ran the upstream sample straight off.
> >
> > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > XDP samples to libbpf usage") (Cc Maciej).
> >
> > The old bpf_load.c would auto-attach the tracepoints, but with libbpf
> > you have to be explicit about it.
> >
> > Can I ask you to also run a test with --stress-mode for
> > ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> > (don't provide output, this is just a robustness test).
> >
>
> Sure! Other than that, does the command line above make sense? I'm
> blasting UDP packets to core 20, and the idea was to re-route them to
> 22.
>

No crash with --stress-mode/-x. (Still no tracepoint output.) And
bpf_redirect_map() is executed and the cpu_map thread is running. :-P

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 12:18       ` Björn Töpel
  2019-12-18 12:32         ` Björn Töpel
@ 2019-12-18 12:40         ` Jesper Dangaard Brouer
  2019-12-18 12:48           ` Björn Töpel
  1 sibling, 1 reply; 32+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-18 12:40 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon, Maciej Fijalkowski, brouer

On Wed, 18 Dec 2019 13:18:10 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > On Wed, 18 Dec 2019 12:39:53 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >  
> > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:  
> > > >
> > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > >  
> > > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > >
> > > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > > >   XDP-RX          20      7723038        0           0
> > > > >   XDP-RX          total   7723038        0
> > > > >   cpumap_kthread  total   0              0           0
> > > > >   redirect_err    total   0              0
> > > > >   xdp_exception   total   0              0  
> > > >
> > > > Hmm... I'm missing some counters on the kthread side.
> > > >  
> > >
> > > Oh? Any ideas why? I just ran the upstream sample straight off.  
> >
> > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > XDP samples to libbpf usage") (Cc Maciej).
> >
> > The old bpf_load.c would auto-attach the tracepoints, but with libbpf
> > you have to be explicit about it.
> >
> > Can I ask you to also run a test with --stress-mode for
> > ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> > (don't provide output, this is just a robustness test).
> >  
> 
> Sure! Other than that, does the command line above make sense? I'm
> blasting UDP packets to core 20, and the idea was to re-route them to
> 22.

Yes, and I love that you are using CPUMAP xdp_redirect_cpu as a test.

Explaining what is going on (so you can say if this is what you wanted
to test):

The "XDP-RX" number is the raw XDP redirect number, but the remote CPU,
where the network stack is started, cannot operate at 7.7 Mpps, which
the missing tracepoint numbers should have shown. You can still observe
results via nstat, e.g.:

 # nstat -n && sleep 1 && nstat

On the remote CPU 22, the SKB will be constructed, and likely dropped
due to overloading the network stack and to not having a UDP listen port.

I sometimes use:
 # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP
to drop the UDP packets at an earlier, consistent stage.

The CPUMAP has carefully been designed so that a "producer" cannot be
slowed down by memory operations done by the "consumer"; this is
mostly achieved via ptr_ring and careful bulking (cache-lines). As
your driver i40e doesn't have 'page_pool', you are not affected by
the return channel.

Funny test/details: i40e uses a refcnt recycle scheme, based on the
size of the RX ring, thus it is affected by a longer outstanding queue.
The CPUMAP has an intermediate queue, which will be full in this
overload setting. Try to increase or decrease the parameter --qsize
(remember to place it as the first argument), and see if this was the
limiting factor for your XDP-RX number.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 12:40         ` Jesper Dangaard Brouer
@ 2019-12-18 12:48           ` Björn Töpel
  0 siblings, 0 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 12:48 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, bpf, David Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Karlsson,
	Magnus, Jonathan Lemon, Maciej Fijalkowski

On Wed, 18 Dec 2019 at 13:40, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 13:18:10 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > >
> > > On Wed, 18 Dec 2019 12:39:53 +0100
> > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > > > >
> > > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > > >
> > > > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > > >
> > > > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > > > >   XDP-RX          20      7723038        0           0
> > > > > >   XDP-RX          total   7723038        0
> > > > > >   cpumap_kthread  total   0              0           0
> > > > > >   redirect_err    total   0              0
> > > > > >   xdp_exception   total   0              0
> > > > >
> > > > > Hmm... I'm missing some counters on the kthread side.
> > > > >
> > > >
> > > > Oh? Any ideas why? I just ran the upstream sample straight off.
> > >
> > > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > > XDP samples to libbpf usage") (Cc Maciej).
> > >
> > > The old bpf_load.c will auto-attach the tracepoints... with libbpf
> > > you have to be explicit about it.
> > >
> > > Can I ask you to also run a test with --stress-mode for
> > > ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> > > (don't provide output, this is just a robustness test).
> > >
> >
> > Sure! Other than that, does the command line above make sense? I'm
> > blasting UDP packets to core 20, and the idea was to re-route them to
> > 22.
>
> Yes, and I love that you are using CPUMAP xdp_redirect_cpu as a test.
>
> Explaining what is going on (so you can say if this is what you wanted
> to test):
>

I wanted to see whether one could receive more (Rx + bpf_redirect_map)
with the change. I figured out that at least bpf_redirect_map was
correctly executed, and that the numbers went up. :-P

> The "XDP-RX" number is the raw XDP redirect number, but the remote CPU,
> where the network stack is started, cannot operate at 7.7Mpps.  Which the
> lacking tracepoint numbers should have shown. You still can observe
> results via nstat, e.g.:
>
>  # nstat -n && sleep 1 && nstat
>
> On the remote CPU 22, the SKB will be constructed, and likely dropped
> due to overloading the network stack and due to not having a UDP listen port.
>
> I sometimes use:
>  # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP
> to drop the UDP packets at an earlier and consistent stage.
>
> The CPUMAP has been carefully designed to avoid a "producer" being
> slowed down by memory operations done by the "consumer"; this is
> mostly achieved via ptr_ring and careful bulking (cache-lines).  As
> your driver i40e doesn't have 'page_pool', you are not affected by
> the return channel.
>
> Funny test/details: i40e uses a refcnt recycle scheme, based on the
> size of the RX-ring, thus it is affected by a longer outstanding queue.
> The CPUMAP has an intermediate queue, which will be full in this
> overload setting.  Try increasing or decreasing the parameter --qsize
> (remember to place it as the first argument), and see if this was the
> limiting factor for your XDP-RX number.
>

Thanks for the elaborate description!

(Maybe it's time for samples/bpf manpages? ;-))


Björn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection
  2019-12-18 10:53 ` [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection Björn Töpel
  2019-12-18 11:20   ` Toke Høiland-Jørgensen
@ 2019-12-18 13:03   ` Jesper Dangaard Brouer
  2019-12-18 13:09     ` Björn Töpel
  1 sibling, 1 reply; 32+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-18 13:03 UTC (permalink / raw)
  To: Björn Töpel
  Cc: netdev, ast, daniel, Björn Töpel, bpf, davem,
	jakub.kicinski, hawk, john.fastabend, magnus.karlsson,
	jonathan.lemon

On Wed, 18 Dec 2019 11:53:59 +0100
Björn Töpel <bjorn.topel@gmail.com> wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
> 
> Now that all XDP maps that can be used with bpf_redirect_map() track
> entries to be flushed in a global fashion, there is no need to track
> that the map has changed and flush from xdp_do_generic_map()
> anymore. All entries will be flushed in xdp_do_flush_map().
> 
> This means that map_to_flush, and the corresponding checks, can be
> removed. Moving the flush logic to one place, xdp_do_flush_map(),
> gives a bulking behavior and a performance boost.
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> ---
>  include/linux/filter.h |  1 -
>  net/core/filter.c      | 27 +++------------------------
>  2 files changed, 3 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index 37ac7025031d..69d6706fc889 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -592,7 +592,6 @@ struct bpf_redirect_info {
>  	u32 tgt_index;
>  	void *tgt_value;
>  	struct bpf_map *map;
> -	struct bpf_map *map_to_flush;
>  	u32 kern_flags;
>  };
>  
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c706325b3e66..d9caa3e57ea1 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3547,26 +3547,9 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
>  
>  void xdp_do_flush_map(void)
>  {
> -	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
> -	struct bpf_map *map = ri->map_to_flush;
> -
> -	ri->map_to_flush = NULL;
> -	if (map) {
> -		switch (map->map_type) {
> -		case BPF_MAP_TYPE_DEVMAP:
> -		case BPF_MAP_TYPE_DEVMAP_HASH:
> -			__dev_map_flush();
> -			break;
> -		case BPF_MAP_TYPE_CPUMAP:
> -			__cpu_map_flush();
> -			break;
> -		case BPF_MAP_TYPE_XSKMAP:
> -			__xsk_map_flush();
> -			break;
> -		default:
> -			break;
> -		}
> -	}
> +	__dev_map_flush();
> +	__cpu_map_flush();
> +	__xsk_map_flush();
>  }
>  EXPORT_SYMBOL_GPL(xdp_do_flush_map);
>  
> @@ -3615,14 +3598,10 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
>  	ri->tgt_value = NULL;
>  	WRITE_ONCE(ri->map, NULL);
>  
> -	if (ri->map_to_flush && unlikely(ri->map_to_flush != map))
> -		xdp_do_flush_map();
> -

I guess I need to read the other patches to understand why this is valid.

The idea here is to detect if the BPF prog is using several different
redirect maps, and to do the flush if the program uses another map.  I
assume you handle this?


>  	err = __bpf_tx_xdp_map(dev, fwd, map, xdp);
>  	if (unlikely(err))
>  		goto err;
>  
> -	ri->map_to_flush = map;
>  	_trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index);
>  	return 0;
>  err:



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection
  2019-12-18 13:03   ` Jesper Dangaard Brouer
@ 2019-12-18 13:09     ` Björn Töpel
  0 siblings, 0 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 13:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, David Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Karlsson, Magnus,
	Jonathan Lemon

On Wed, 18 Dec 2019 at 14:03, Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 11:53:59 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > Now that all XDP maps that can be used with bpf_redirect_map() track
> > entries to be flushed in a global fashion, there is no need to track
> > that the map has changed and flush from xdp_do_generic_map()
> > anymore. All entries will be flushed in xdp_do_flush_map().
> >
> > This means that map_to_flush, and the corresponding checks, can be
> > removed. Moving the flush logic to one place, xdp_do_flush_map(),
> > gives a bulking behavior and a performance boost.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> > ---
> >  include/linux/filter.h |  1 -
> >  net/core/filter.c      | 27 +++------------------------
> >  2 files changed, 3 insertions(+), 25 deletions(-)
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index 37ac7025031d..69d6706fc889 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -592,7 +592,6 @@ struct bpf_redirect_info {
> >       u32 tgt_index;
> >       void *tgt_value;
> >       struct bpf_map *map;
> > -     struct bpf_map *map_to_flush;
> >       u32 kern_flags;
> >  };
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index c706325b3e66..d9caa3e57ea1 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3547,26 +3547,9 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
> >
> >  void xdp_do_flush_map(void)
> >  {
> > -     struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
> > -     struct bpf_map *map = ri->map_to_flush;
> > -
> > -     ri->map_to_flush = NULL;
> > -     if (map) {
> > -             switch (map->map_type) {
> > -             case BPF_MAP_TYPE_DEVMAP:
> > -             case BPF_MAP_TYPE_DEVMAP_HASH:
> > -                     __dev_map_flush();
> > -                     break;
> > -             case BPF_MAP_TYPE_CPUMAP:
> > -                     __cpu_map_flush();
> > -                     break;
> > -             case BPF_MAP_TYPE_XSKMAP:
> > -                     __xsk_map_flush();
> > -                     break;
> > -             default:
> > -                     break;
> > -             }
> > -     }
> > +     __dev_map_flush();
> > +     __cpu_map_flush();
> > +     __xsk_map_flush();
> >  }
> >  EXPORT_SYMBOL_GPL(xdp_do_flush_map);
> >
> > @@ -3615,14 +3598,10 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
> >       ri->tgt_value = NULL;
> >       WRITE_ONCE(ri->map, NULL);
> >
> > -     if (ri->map_to_flush && unlikely(ri->map_to_flush != map))
> > -             xdp_do_flush_map();
> > -
>
> I guess I need to read the other patches to understand why this is valid.
>

Please do; Review would be very much appreciated!

> The idea here is to detect if the BPF prog is using several different
> redirect maps, and to do the flush if the program uses another map.  I
> assume you handle this?
>

Yes, the high-level idea is that since the maps are already partly
dealing with this (but for swaps internal to the map), let the map code
deal with map swaps as well.
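The idea discussed here — flush state lives with the maps, so the redirect path needs no per-map bookkeeping — can be sketched as a userspace toy in C. All names here (`toy_redirect`, `toy_flush`, `flush_entry`, MAX_FLUSH) are hypothetical; in the kernel the flush lists are per-CPU and the flush walks devmap, cpumap and xskmap entries:

```c
#include <assert.h>
#include <stddef.h>

/* One global flush list shared by all "maps".  Entries from several
 * maps coexist on it, so a program that swaps redirect maps mid-poll
 * no longer forces an early flush. */
#define MAX_FLUSH 64

struct flush_entry {
	int map_id;	/* which map this entry belongs to */
	int pending;	/* frames queued since the last flush */
};

static struct flush_entry *flush_list[MAX_FLUSH];
static size_t flush_cnt;

/* Redirect path: the first frame for an entry puts it on the global
 * list; no map_to_flush tracking, no "did the map change?" check. */
static void toy_redirect(struct flush_entry *e)
{
	if (!e->pending && flush_cnt < MAX_FLUSH)
		flush_list[flush_cnt++] = e;
	e->pending++;
}

/* End of the NAPI poll cycle: one walk flushes everything, like
 * xdp_do_flush_map() now unconditionally calling __dev_map_flush(),
 * __cpu_map_flush() and __xsk_map_flush(). */
static int toy_flush(void)
{
	int flushed = 0;

	for (size_t i = 0; i < flush_cnt; i++) {
		flushed += flush_list[i]->pending;
		flush_list[i]->pending = 0;
	}
	flush_cnt = 0;
	return flushed;
}
```

Interleaving entries from different map_ids and flushing once at the end is exactly the map-swap case that the removed `ri->map_to_flush != map` check used to handle.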

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup
  2019-12-18 10:53 ` [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup Björn Töpel
  2019-12-18 11:15   ` Toke Høiland-Jørgensen
@ 2019-12-18 17:47   ` Jakub Kicinski
  2019-12-18 17:48     ` Björn Töpel
  1 sibling, 1 reply; 32+ messages in thread
From: Jakub Kicinski @ 2019-12-18 17:47 UTC (permalink / raw)
  To: Björn Töpel
  Cc: netdev, ast, daniel, Björn Töpel, bpf, davem, hawk,
	john.fastabend, magnus.karlsson, jonathan.lemon

On Wed, 18 Dec 2019 11:53:54 +0100, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> After the RCU flavor consolidation [1], call_rcu() and
> synchronize_rcu() wait for preempt-disable regions (NAPI) in addition
> to the read-side critical sections. As a result of this, the cleanup
> code in cpumap can be simplified:
> 
> * There is no longer a need to flush in __cpu_map_entry_free, since we
>   know that this has been done when the call_rcu() callback is
>   triggered.
> 
> * When freeing the map, there is no need to explicitly wait for a
>   flush. It's guaranteed to be done after the synchronize_rcu() call
>   in cpu_map_free().
> 
> [1] https://lwn.net/Articles/777036/
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Probably transient but:

../kernel/bpf/cpumap.c: In function "cpu_map_free":
../kernel/bpf/cpumap.c:502:6: warning: unused variable "cpu" [-Wunused-variable]
  502 |  int cpu;
      |      ^~~

I think there are also warnings in patch 4.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup
  2019-12-18 17:47   ` Jakub Kicinski
@ 2019-12-18 17:48     ` Björn Töpel
  0 siblings, 0 replies; 32+ messages in thread
From: Björn Töpel @ 2019-12-18 17:48 UTC (permalink / raw)
  To: Jakub Kicinski, Björn Töpel
  Cc: netdev, ast, daniel, bpf, davem, hawk, john.fastabend,
	magnus.karlsson, jonathan.lemon

On 2019-12-18 18:47, Jakub Kicinski wrote:
> On Wed, 18 Dec 2019 11:53:54 +0100, Björn Töpel wrote:
>> From: Björn Töpel <bjorn.topel@intel.com>
>>
>> After the RCU flavor consolidation [1], call_rcu() and
>> synchronize_rcu() wait for preempt-disable regions (NAPI) in addition
>> to the read-side critical sections. As a result of this, the cleanup
>> code in cpumap can be simplified:
>>
>> * There is no longer a need to flush in __cpu_map_entry_free, since we
>>    know that this has been done when the call_rcu() callback is
>>    triggered.
>>
>> * When freeing the map, there is no need to explicitly wait for a
>>    flush. It's guaranteed to be done after the synchronize_rcu() call
>>    in cpu_map_free().
>>
>> [1] https://lwn.net/Articles/777036/
>>
>> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> 
> Probably transient but:
> 
> ../kernel/bpf/cpumap.c: In function "cpu_map_free":
> ../kernel/bpf/cpumap.c:502:6: warning: unused variable "cpu" [-Wunused-variable]
>    502 |  int cpu;
>        |      ^~~
> 
> I think there are also warnings in patch 4.
> 

Ugh. Thanks, I'll respin!


Björn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-18 12:03     ` Jesper Dangaard Brouer
  2019-12-18 12:18       ` Björn Töpel
@ 2019-12-19  0:39       ` Andrii Nakryiko
  2019-12-19 19:33         ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 32+ messages in thread
From: Andrii Nakryiko @ 2019-12-19  0:39 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Björn Töpel, Netdev, Alexei Starovoitov,
	Daniel Borkmann, bpf, David Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Karlsson, Magnus,
	Jonathan Lemon, Maciej Fijalkowski

On Wed, Dec 18, 2019 at 4:04 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 12:39:53 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > >
> > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > >
> > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > >   XDP-RX          20      7723038        0           0
> > > >   XDP-RX          total   7723038        0
> > > >   cpumap_kthread  total   0              0           0
> > > >   redirect_err    total   0              0
> > > >   xdp_exception   total   0              0
> > >
> > > Hmm... I'm missing some counters on the kthread side.
> > >
> >
> > Oh? Any ideas why? I just ran the upstream sample straight off.
>
> Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> XDP samples to libbpf usage") (Cc Maciej).
>
> The old bpf_load.c will auto-attach the tracepoints... with libbpf
> you have to be explicit about it.

... or you can use skeleton, which will auto-attach them as well,
provided BPF program's section names follow expected naming
convention. So it might be a good idea to try it out.

>
> Can I ask you to also run a test with --stress-mode for
> ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> (don't provide output, this is just a robustness test).
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-19  0:39       ` Andrii Nakryiko
@ 2019-12-19 19:33         ` Jesper Dangaard Brouer
  2019-12-19 20:08           ` Daniel Borkmann
  0 siblings, 1 reply; 32+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-19 19:33 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Björn Töpel, Netdev, Alexei Starovoitov,
	Daniel Borkmann, bpf, David Miller, Jakub Kicinski,
	John Fastabend, Karlsson, Magnus, Jonathan Lemon,
	Maciej Fijalkowski, brouer

On Wed, 18 Dec 2019 16:39:08 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Dec 18, 2019 at 4:04 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Wed, 18 Dec 2019 12:39:53 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >  
> > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:  
> > > >
> > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > >  
> > > > >   $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > >
> > > > >   Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > >   XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > > >   XDP-RX          20      7723038        0           0
> > > > >   XDP-RX          total   7723038        0
> > > > >   cpumap_kthread  total   0              0           0
> > > > >   redirect_err    total   0              0
> > > > >   xdp_exception   total   0              0  
> > > >
> > > > Hmm... I'm missing some counters on the kthread side.
> > > >  
> > >
> > > Oh? Any ideas why? I just ran the upstream sample straight off.  
> >
> > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > XDP samples to libbpf usage") (Cc Maciej).
> >
> > The old bpf_load.c will auto attach the tracepoints... for and libbpf
> > you have to be explicit about it.  
> 
> ... or you can use skeleton, which will auto-attach them as well,
> provided BPF program's section names follow expected naming
> convention. So it might be a good idea to try it out.

To Andrii, can you provide some more info on how to use this new
skeleton system of yours?  (Pointers to code examples?)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-19 19:33         ` Jesper Dangaard Brouer
@ 2019-12-19 20:08           ` Daniel Borkmann
  2019-12-19 22:56             ` Andrii Nakryiko
  0 siblings, 1 reply; 32+ messages in thread
From: Daniel Borkmann @ 2019-12-19 20:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Andrii Nakryiko
  Cc: Björn Töpel, Netdev, Alexei Starovoitov, bpf,
	David Miller, Jakub Kicinski, John Fastabend, Karlsson, Magnus,
	Jonathan Lemon, Maciej Fijalkowski

On 12/19/19 8:33 PM, Jesper Dangaard Brouer wrote:
> On Wed, 18 Dec 2019 16:39:08 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
>> On Wed, Dec 18, 2019 at 4:04 AM Jesper Dangaard Brouer
>> <brouer@redhat.com> wrote:
>>>
>>> On Wed, 18 Dec 2019 12:39:53 +0100
>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>   
>>>> On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>>>>>
>>>>> On Wed, 18 Dec 2019 11:53:52 +0100
>>>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>>>   
>>>>>>    $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
>>>>>>
>>>>>>    Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
>>>>>>    XDP-cpumap      CPU:to  pps            drop-pps    extra-info
>>>>>>    XDP-RX          20      7723038        0           0
>>>>>>    XDP-RX          total   7723038        0
>>>>>>    cpumap_kthread  total   0              0           0
>>>>>>    redirect_err    total   0              0
>>>>>>    xdp_exception   total   0              0
>>>>>
>>>>> Hmm... I'm missing some counters on the kthread side.
>>>>>   
>>>>
>>>> Oh? Any ideas why? I just ran the upstream sample straight off.
>>>
>>> Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
>>> XDP samples to libbpf usage") (Cc Maciej).
>>>
>>> The old bpf_load.c will auto-attach the tracepoints... with libbpf
>>> you have to be explicit about it.
>>
>> ... or you can use skeleton, which will auto-attach them as well,
>> provided BPF program's section names follow expected naming
>> convention. So it might be a good idea to try it out.
> 
> To Andrii, can you provide some more info on how to use this new
> skeleton system of yours?  (Pointers to code examples?)

There's a man page ;-)

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/bpf/bpftool/Documentation/bpftool-gen.rst

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps
  2019-12-19 20:08           ` Daniel Borkmann
@ 2019-12-19 22:56             ` Andrii Nakryiko
  0 siblings, 0 replies; 32+ messages in thread
From: Andrii Nakryiko @ 2019-12-19 22:56 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jesper Dangaard Brouer, Björn Töpel, Netdev,
	Alexei Starovoitov, bpf, David Miller, Jakub Kicinski,
	John Fastabend, Karlsson, Magnus, Jonathan Lemon,
	Maciej Fijalkowski

On Thu, Dec 19, 2019 at 12:08 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/19/19 8:33 PM, Jesper Dangaard Brouer wrote:
> > On Wed, 18 Dec 2019 16:39:08 -0800
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> >> On Wed, Dec 18, 2019 at 4:04 AM Jesper Dangaard Brouer
> >> <brouer@redhat.com> wrote:
> >>>
> >>> On Wed, 18 Dec 2019 12:39:53 +0100
> >>> Björn Töpel <bjorn.topel@gmail.com> wrote:
> >>>
> >>>> On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >>>>>
> >>>>> On Wed, 18 Dec 2019 11:53:52 +0100
> >>>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
> >>>>>
> >>>>>>    $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> >>>>>>
> >>>>>>    Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> >>>>>>    XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> >>>>>>    XDP-RX          20      7723038        0           0
> >>>>>>    XDP-RX          total   7723038        0
> >>>>>>    cpumap_kthread  total   0              0           0
> >>>>>>    redirect_err    total   0              0
> >>>>>>    xdp_exception   total   0              0
> >>>>>
> >>>>> Hmm... I'm missing some counters on the kthread side.
> >>>>>
> >>>>
> >>>> Oh? Any ideas why? I just ran the upstream sample straight off.
> >>>
> >>> Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> >>> XDP samples to libbpf usage") (Cc Maciej).
> >>>
> >>> The old bpf_load.c will auto-attach the tracepoints... with libbpf
> >>> you have to be explicit about it.
> >>
> >> ... or you can use skeleton, which will auto-attach them as well,
> >> provided BPF program's section names follow expected naming
> >> convention. So it might be a good idea to try it out.
> >
> > To Andrii, can you provide some more info on how to use this new
> > skeleton system of yours?  (Pointers to code examples?)
>
> There's a man page ;-)
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/bpf/bpftool/Documentation/bpftool-gen.rst

Also see the runqslower patch set for an end-to-end setup. There are a bunch
of selftests already using skeletons (test_attach_probe,
test_core_extern, test_skeleton).

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-12-19 22:56 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-18 10:53 [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Björn Töpel
2019-12-18 10:53 ` [PATCH bpf-next 1/8] xdp: simplify devmap cleanup Björn Töpel
2019-12-18 11:14   ` Toke Høiland-Jørgensen
2019-12-18 10:53 ` [PATCH bpf-next 2/8] xdp: simplify cpumap cleanup Björn Töpel
2019-12-18 11:15   ` Toke Høiland-Jørgensen
2019-12-18 17:47   ` Jakub Kicinski
2019-12-18 17:48     ` Björn Töpel
2019-12-18 10:53 ` [PATCH bpf-next 3/8] xdp: fix graze->grace type-o in cpumap comments Björn Töpel
2019-12-18 11:18   ` Toke Høiland-Jørgensen
2019-12-18 10:53 ` [PATCH bpf-next 4/8] xsk: make xskmap flush_list common for all map instances Björn Töpel
2019-12-18 11:19   ` Toke Høiland-Jørgensen
2019-12-18 10:53 ` [PATCH bpf-next 5/8] xdp: make devmap " Björn Töpel
2019-12-18 11:19   ` Toke Høiland-Jørgensen
2019-12-18 10:53 ` [PATCH bpf-next 6/8] xdp: make cpumap " Björn Töpel
2019-12-18 11:19   ` Toke Høiland-Jørgensen
2019-12-18 10:53 ` [PATCH bpf-next 7/8] xdp: remove map_to_flush and map swap detection Björn Töpel
2019-12-18 11:20   ` Toke Høiland-Jørgensen
2019-12-18 13:03   ` Jesper Dangaard Brouer
2019-12-18 13:09     ` Björn Töpel
2019-12-18 10:54 ` [PATCH bpf-next 8/8] xdp: simplify __bpf_tx_xdp_map() Björn Töpel
2019-12-18 11:20   ` Toke Høiland-Jørgensen
2019-12-18 11:11 ` [PATCH bpf-next 0/8] Simplify xdp_do_redirect_map()/xdp_do_flush_map() and XDP maps Jesper Dangaard Brouer
2019-12-18 11:39   ` Björn Töpel
2019-12-18 12:03     ` Jesper Dangaard Brouer
2019-12-18 12:18       ` Björn Töpel
2019-12-18 12:32         ` Björn Töpel
2019-12-18 12:40         ` Jesper Dangaard Brouer
2019-12-18 12:48           ` Björn Töpel
2019-12-19  0:39       ` Andrii Nakryiko
2019-12-19 19:33         ` Jesper Dangaard Brouer
2019-12-19 20:08           ` Daniel Borkmann
2019-12-19 22:56             ` Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).