* [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
@ 2020-07-14 13:56 Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 1/9] cpumap: use non-locked version __ptr_ring_consume_batched Lorenzo Bianconi
` (10 more replies)
0 siblings, 11 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
capability to attach and run an XDP program on CPUMAP entries.
The idea behind this feature is to add the possibility to define on which CPU
the eBPF program should run if the underlying hw does not support RSS.
I respun patch 1/6 from a previous series sent by David [2].
The functionality has been tested on Marvell ESPRESSObin, i40e and mlx5.
Detailed tests results can be found here:
https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
Changes since v6:
- rebase on top of bpf-next
- move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
Changes since v5:
- move bpf_prog_put() in put_cpu_map_entry()
- remove READ_ONCE(rcpu->prog) in cpu_map_bpf_prog_run_xdp
- rely on bpf_prog_get_type() instead of bpf_prog_get_type_dev() in
__cpu_map_load_bpf_program()
Changes since v4:
- move xdp_clear_return_frame_no_direct inside rcu section
- update David Ahern's email address
Changes since v3:
- fix typo in commit message
- fix access to ctx->ingress_ifindex in cpumap bpf selftest
Changes since v2:
- improved comments
- fix return value in xdp_convert_buff_to_frame
- added patch 1/9: "cpumap: use non-locked version __ptr_ring_consume_batched"
- do not run kmem_cache_alloc_bulk if all frames have been consumed by the XDP
program attached to the CPUMAP entry
- removed bpf_trace_printk in kselftest
Changes since v1:
- added performance test results
- added kselftest support
- fixed memory accounting with page_pool
- extended xdp_redirect_cpu_user.c to load an external program to perform
redirect
- reported ifindex to attached eBPF program
- moved bpf_cpumap_val definition to include/uapi/linux/bpf.h
[1] https://patchwork.ozlabs.org/project/netdev/cover/20200529220716.75383-1-dsahern@kernel.org/
[2] https://patchwork.ozlabs.org/project/netdev/patch/20200513014607.40418-2-dsahern@kernel.org/
David Ahern (1):
net: refactor xdp_convert_buff_to_frame
Jesper Dangaard Brouer (1):
cpumap: use non-locked version __ptr_ring_consume_batched
Lorenzo Bianconi (7):
samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option
loop
cpumap: formalize map value as a named struct
bpf: cpumap: add the possibility to attach an eBPF program to cpumap
bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map
entries
libbpf: add SEC name for xdp programs attached to CPUMAP
samples/bpf: xdp_redirect_cpu: load an eBPF program on cpumap
selftest: add tests for XDP programs in CPUMAP entries
include/linux/bpf.h | 6 +
include/net/xdp.h | 41 ++--
include/trace/events/xdp.h | 16 +-
include/uapi/linux/bpf.h | 14 ++
kernel/bpf/cpumap.c | 162 +++++++++++---
net/core/dev.c | 9 +
samples/bpf/xdp_redirect_cpu_kern.c | 25 ++-
samples/bpf/xdp_redirect_cpu_user.c | 209 ++++++++++++++++--
tools/include/uapi/linux/bpf.h | 14 ++
tools/lib/bpf/libbpf.c | 2 +
.../bpf/prog_tests/xdp_cpumap_attach.c | 70 ++++++
.../bpf/progs/test_xdp_with_cpumap_helpers.c | 36 +++
12 files changed, 531 insertions(+), 73 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
--
2.26.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v7 bpf-next 1/9] cpumap: use non-locked version __ptr_ring_consume_batched
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 2/9] net: refactor xdp_convert_buff_to_frame Lorenzo Bianconi
` (9 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
From: Jesper Dangaard Brouer <brouer@redhat.com>
Commit 77361825bb01 ("bpf: cpumap use ptr_ring_consume_batched") changed
away from the single-frame ptr_ring dequeue (__ptr_ring_consume) to
consuming a batch, but it uses the locked version, which, as the comment
explains, isn't needed.
Change to use the non-locked version __ptr_ring_consume_batched.
Fixes: 77361825bb01 ("bpf: cpumap use ptr_ring_consume_batched")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
kernel/bpf/cpumap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index bd8658055c16..323c91c4fab0 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -259,7 +259,7 @@ static int cpu_map_kthread_run(void *data)
* kthread CPU pinned. Lockless access to ptr_ring
* consume side valid as no-resize allowed of queue.
*/
- n = ptr_ring_consume_batched(rcpu->queue, frames, CPUMAP_BATCH);
+ n = __ptr_ring_consume_batched(rcpu->queue, frames, CPUMAP_BATCH);
for (i = 0; i < n; i++) {
void *f = frames[i];
--
2.26.2
* [PATCH v7 bpf-next 2/9] net: refactor xdp_convert_buff_to_frame
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 1/9] cpumap: use non-locked version __ptr_ring_consume_batched Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 3/9] samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option loop Lorenzo Bianconi
` (8 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
From: David Ahern <dsahern@kernel.org>
Move the guts of xdp_convert_buff_to_frame to a new helper,
xdp_update_frame_from_buff, so it can be reused, removing code duplication.
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
include/net/xdp.h | 35 ++++++++++++++++++++++-------------
1 file changed, 22 insertions(+), 13 deletions(-)
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 609f819ed08b..5b383c450858 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -121,39 +121,48 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
xdp->frame_sz = frame->frame_sz;
}
-/* Convert xdp_buff to xdp_frame */
static inline
-struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
+int xdp_update_frame_from_buff(struct xdp_buff *xdp,
+ struct xdp_frame *xdp_frame)
{
- struct xdp_frame *xdp_frame;
- int metasize;
- int headroom;
-
- if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL)
- return xdp_convert_zc_to_xdp_frame(xdp);
+ int metasize, headroom;
/* Assure headroom is available for storing info */
headroom = xdp->data - xdp->data_hard_start;
metasize = xdp->data - xdp->data_meta;
metasize = metasize > 0 ? metasize : 0;
if (unlikely((headroom - metasize) < sizeof(*xdp_frame)))
- return NULL;
+ return -ENOSPC;
/* Catch if driver didn't reserve tailroom for skb_shared_info */
if (unlikely(xdp->data_end > xdp_data_hard_end(xdp))) {
XDP_WARN("Driver BUG: missing reserved tailroom");
- return NULL;
+ return -ENOSPC;
}
- /* Store info in top of packet */
- xdp_frame = xdp->data_hard_start;
-
xdp_frame->data = xdp->data;
xdp_frame->len = xdp->data_end - xdp->data;
xdp_frame->headroom = headroom - sizeof(*xdp_frame);
xdp_frame->metasize = metasize;
xdp_frame->frame_sz = xdp->frame_sz;
+ return 0;
+}
+
+/* Convert xdp_buff to xdp_frame */
+static inline
+struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
+{
+ struct xdp_frame *xdp_frame;
+
+ if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL)
+ return xdp_convert_zc_to_xdp_frame(xdp);
+
+ /* Store info in top of packet */
+ xdp_frame = xdp->data_hard_start;
+ if (unlikely(xdp_update_frame_from_buff(xdp, xdp_frame) < 0))
+ return NULL;
+
/* rxq only valid until napi_schedule ends, convert to xdp_mem_info */
xdp_frame->mem = xdp->rxq->mem;
--
2.26.2
* [PATCH v7 bpf-next 3/9] samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option loop
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 1/9] cpumap: use non-locked version __ptr_ring_consume_batched Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 2/9] net: refactor xdp_convert_buff_to_frame Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 4/9] cpumap: formalize map value as a named struct Lorenzo Bianconi
` (7 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Do not update xdp_redirect_cpu maps while running the option loop, but
defer the updates until all available options have been parsed. This is a
preliminary patch to allow passing the name of the program we want to
attach to the map entries as a user option.
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
samples/bpf/xdp_redirect_cpu_user.c | 36 +++++++++++++++++++++--------
1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c
index f4e755e0dd73..6bb2d95cb26c 100644
--- a/samples/bpf/xdp_redirect_cpu_user.c
+++ b/samples/bpf/xdp_redirect_cpu_user.c
@@ -681,6 +681,7 @@ int main(int argc, char **argv)
int add_cpu = -1;
int opt, err;
int prog_fd;
+ int *cpu, i;
__u32 qsize;
n_cpus = get_nprocs_conf();
@@ -716,6 +717,13 @@ int main(int argc, char **argv)
}
mark_cpus_unavailable();
+ cpu = malloc(n_cpus * sizeof(int));
+ if (!cpu) {
+ fprintf(stderr, "failed to allocate cpu array\n");
+ return EXIT_FAIL;
+ }
+ memset(cpu, 0, n_cpus * sizeof(int));
+
/* Parse commands line args */
while ((opt = getopt_long(argc, argv, "hSd:s:p:q:c:xzF",
long_options, &longindex)) != -1) {
@@ -760,8 +768,7 @@ int main(int argc, char **argv)
errno, strerror(errno));
goto error;
}
- create_cpu_entry(add_cpu, qsize, added_cpus, true);
- added_cpus++;
+ cpu[added_cpus++] = add_cpu;
break;
case 'q':
qsize = atoi(optarg);
@@ -772,6 +779,7 @@ int main(int argc, char **argv)
case 'h':
error:
default:
+ free(cpu);
usage(argv, obj);
return EXIT_FAIL_OPTION;
}
@@ -784,16 +792,21 @@ int main(int argc, char **argv)
if (ifindex == -1) {
fprintf(stderr, "ERR: required option --dev missing\n");
usage(argv, obj);
- return EXIT_FAIL_OPTION;
+ err = EXIT_FAIL_OPTION;
+ goto out;
}
/* Required option */
if (add_cpu == -1) {
fprintf(stderr, "ERR: required option --cpu missing\n");
fprintf(stderr, " Specify multiple --cpu option to add more\n");
usage(argv, obj);
- return EXIT_FAIL_OPTION;
+ err = EXIT_FAIL_OPTION;
+ goto out;
}
+ for (i = 0; i < added_cpus; i++)
+ create_cpu_entry(cpu[i], qsize, i, true);
+
/* Remove XDP program when program is interrupted or killed */
signal(SIGINT, int_exit);
signal(SIGTERM, int_exit);
@@ -801,27 +814,32 @@ int main(int argc, char **argv)
prog = bpf_object__find_program_by_title(obj, prog_name);
if (!prog) {
fprintf(stderr, "bpf_object__find_program_by_title failed\n");
- return EXIT_FAIL;
+ err = EXIT_FAIL;
+ goto out;
}
prog_fd = bpf_program__fd(prog);
if (prog_fd < 0) {
fprintf(stderr, "bpf_program__fd failed\n");
- return EXIT_FAIL;
+ err = EXIT_FAIL;
+ goto out;
}
if (bpf_set_link_xdp_fd(ifindex, prog_fd, xdp_flags) < 0) {
fprintf(stderr, "link set xdp fd failed\n");
- return EXIT_FAIL_XDP;
+ err = EXIT_FAIL_XDP;
+ goto out;
}
err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
if (err) {
printf("can't get prog info - %s\n", strerror(errno));
- return err;
+ goto out;
}
prog_id = info.id;
stats_poll(interval, use_separators, prog_name, stress_mode);
- return EXIT_OK;
+out:
+ free(cpu);
+ return err;
}
--
2.26.2
* [PATCH v7 bpf-next 4/9] cpumap: formalize map value as a named struct
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (2 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 3/9] samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option loop Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 5/9] bpf: cpumap: add the possibility to attach an eBPF program to cpumap Lorenzo Bianconi
` (6 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
As has already been done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update the cpumap code to use the struct.
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
include/uapi/linux/bpf.h | 9 +++++++++
kernel/bpf/cpumap.c | 28 +++++++++++++++-------------
tools/include/uapi/linux/bpf.h | 9 +++++++++
3 files changed, 33 insertions(+), 13 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 5e386389913a..109623527358 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3849,6 +3849,15 @@ struct bpf_devmap_val {
} bpf_prog;
};
+/* CPUMAP map-value layout
+ *
+ * The struct data-layout of map-value is a configuration interface.
+ * New members can only be added to the end of this structure.
+ */
+struct bpf_cpumap_val {
+ __u32 qsize; /* queue size to remote target CPU */
+};
+
enum sk_action {
SK_DROP = 0,
SK_PASS,
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 323c91c4fab0..ff48dc00e8d0 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -52,7 +52,6 @@ struct xdp_bulk_queue {
struct bpf_cpu_map_entry {
u32 cpu; /* kthread CPU and map index */
int map_id; /* Back reference to map */
- u32 qsize; /* Queue size placeholder for map lookup */
/* XDP can run multiple RX-ring queues, need __percpu enqueue store */
struct xdp_bulk_queue __percpu *bulkq;
@@ -62,10 +61,13 @@ struct bpf_cpu_map_entry {
/* Queue with potential multi-producers, and single-consumer kthread */
struct ptr_ring *queue;
struct task_struct *kthread;
- struct work_struct kthread_stop_wq;
+
+ struct bpf_cpumap_val value;
atomic_t refcnt; /* Control when this struct can be free'ed */
struct rcu_head rcu;
+
+ struct work_struct kthread_stop_wq;
};
struct bpf_cpu_map {
@@ -307,8 +309,8 @@ static int cpu_map_kthread_run(void *data)
return 0;
}
-static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
- int map_id)
+static struct bpf_cpu_map_entry *
+__cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
{
gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
struct bpf_cpu_map_entry *rcpu;
@@ -338,13 +340,13 @@ static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
if (!rcpu->queue)
goto free_bulkq;
- err = ptr_ring_init(rcpu->queue, qsize, gfp);
+ err = ptr_ring_init(rcpu->queue, value->qsize, gfp);
if (err)
goto free_queue;
rcpu->cpu = cpu;
rcpu->map_id = map_id;
- rcpu->qsize = qsize;
+ rcpu->value.qsize = value->qsize;
/* Setup kthread */
rcpu->kthread = kthread_create_on_node(cpu_map_kthread_run, rcpu, numa,
@@ -437,12 +439,12 @@ static int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
u64 map_flags)
{
struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
+ struct bpf_cpumap_val cpumap_value = {};
struct bpf_cpu_map_entry *rcpu;
-
/* Array index key correspond to CPU number */
u32 key_cpu = *(u32 *)key;
- /* Value is the queue size */
- u32 qsize = *(u32 *)value;
+
+ memcpy(&cpumap_value, value, map->value_size);
if (unlikely(map_flags > BPF_EXIST))
return -EINVAL;
@@ -450,18 +452,18 @@ static int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
return -E2BIG;
if (unlikely(map_flags == BPF_NOEXIST))
return -EEXIST;
- if (unlikely(qsize > 16384)) /* sanity limit on qsize */
+ if (unlikely(cpumap_value.qsize > 16384)) /* sanity limit on qsize */
return -EOVERFLOW;
/* Make sure CPU is a valid possible cpu */
if (key_cpu >= nr_cpumask_bits || !cpu_possible(key_cpu))
return -ENODEV;
- if (qsize == 0) {
+ if (cpumap_value.qsize == 0) {
rcpu = NULL; /* Same as deleting */
} else {
/* Updating qsize cause re-allocation of bpf_cpu_map_entry */
- rcpu = __cpu_map_entry_alloc(qsize, key_cpu, map->id);
+ rcpu = __cpu_map_entry_alloc(&cpumap_value, key_cpu, map->id);
if (!rcpu)
return -ENOMEM;
rcpu->cmap = cmap;
@@ -523,7 +525,7 @@ static void *cpu_map_lookup_elem(struct bpf_map *map, void *key)
struct bpf_cpu_map_entry *rcpu =
__cpu_map_lookup_elem(map, *(u32 *)key);
- return rcpu ? &rcpu->qsize : NULL;
+ return rcpu ? &rcpu->value : NULL;
}
static int cpu_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e386389913a..109623527358 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3849,6 +3849,15 @@ struct bpf_devmap_val {
} bpf_prog;
};
+/* CPUMAP map-value layout
+ *
+ * The struct data-layout of map-value is a configuration interface.
+ * New members can only be added to the end of this structure.
+ */
+struct bpf_cpumap_val {
+ __u32 qsize; /* queue size to remote target CPU */
+};
+
enum sk_action {
SK_DROP = 0,
SK_PASS,
--
2.26.2
* [PATCH v7 bpf-next 5/9] bpf: cpumap: add the possibility to attach an eBPF program to cpumap
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (3 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 4/9] cpumap: formalize map value as a named struct Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 6/9] bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map entries Lorenzo Bianconi
` (5 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU the eBPF program should run if the underlying hw does not
support RSS. The currently supported verdicts are XDP_DROP and XDP_PASS.
This patch has been tested on Marvell ESPRESSObin using the
xdp_redirect_cpu sample available in the kernel tree to identify possible
performance regressions. Results show there are no observable differences
in packets-per-second:
$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
include/linux/bpf.h | 6 ++
include/net/xdp.h | 5 ++
include/trace/events/xdp.h | 14 ++--
include/uapi/linux/bpf.h | 5 ++
kernel/bpf/cpumap.c | 121 +++++++++++++++++++++++++++++----
net/core/dev.c | 9 +++
tools/include/uapi/linux/bpf.h | 5 ++
7 files changed, 148 insertions(+), 17 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c67c88ad35f8..54ad426dbea1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1272,6 +1272,7 @@ struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key);
void __cpu_map_flush(void);
int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
struct net_device *dev_rx);
+bool cpu_map_prog_allowed(struct bpf_map *map);
/* Return map's numa specified by userspace */
static inline int bpf_map_attr_numa_node(const union bpf_attr *attr)
@@ -1432,6 +1433,11 @@ static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu,
return 0;
}
+static inline bool cpu_map_prog_allowed(struct bpf_map *map)
+{
+ return false;
+}
+
static inline struct bpf_prog *bpf_prog_get_type_path(const char *name,
enum bpf_prog_type type)
{
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5b383c450858..83b9e0142b52 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -98,6 +98,11 @@ struct xdp_frame {
struct net_device *dev_rx; /* used by cpumap */
};
+struct xdp_cpumap_stats {
+ unsigned int pass;
+ unsigned int drop;
+};
+
/* Clear kernel pointers in xdp_frame */
static inline void xdp_scrub_frame(struct xdp_frame *frame)
{
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index b73d3e141323..e2c99f5bee39 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -177,9 +177,9 @@ DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map_err,
TRACE_EVENT(xdp_cpumap_kthread,
TP_PROTO(int map_id, unsigned int processed, unsigned int drops,
- int sched),
+ int sched, struct xdp_cpumap_stats *xdp_stats),
- TP_ARGS(map_id, processed, drops, sched),
+ TP_ARGS(map_id, processed, drops, sched, xdp_stats),
TP_STRUCT__entry(
__field(int, map_id)
@@ -188,6 +188,8 @@ TRACE_EVENT(xdp_cpumap_kthread,
__field(unsigned int, drops)
__field(unsigned int, processed)
__field(int, sched)
+ __field(unsigned int, xdp_pass)
+ __field(unsigned int, xdp_drop)
),
TP_fast_assign(
@@ -197,16 +199,20 @@ TRACE_EVENT(xdp_cpumap_kthread,
__entry->drops = drops;
__entry->processed = processed;
__entry->sched = sched;
+ __entry->xdp_pass = xdp_stats->pass;
+ __entry->xdp_drop = xdp_stats->drop;
),
TP_printk("kthread"
" cpu=%d map_id=%d action=%s"
" processed=%u drops=%u"
- " sched=%d",
+ " sched=%d"
+ " xdp_pass=%u xdp_drop=%u",
__entry->cpu, __entry->map_id,
__print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
__entry->processed, __entry->drops,
- __entry->sched)
+ __entry->sched,
+ __entry->xdp_pass, __entry->xdp_drop)
);
TRACE_EVENT(xdp_cpumap_enqueue,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 109623527358..c010b57fce3f 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -227,6 +227,7 @@ enum bpf_attach_type {
BPF_CGROUP_INET6_GETSOCKNAME,
BPF_XDP_DEVMAP,
BPF_CGROUP_INET_SOCK_RELEASE,
+ BPF_XDP_CPUMAP,
__MAX_BPF_ATTACH_TYPE
};
@@ -3856,6 +3857,10 @@ struct bpf_devmap_val {
*/
struct bpf_cpumap_val {
__u32 qsize; /* queue size to remote target CPU */
+ union {
+ int fd; /* prog fd on map write */
+ __u32 id; /* prog id on map read */
+ } bpf_prog;
};
enum sk_action {
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index ff48dc00e8d0..b3a8aea81ee5 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -63,6 +63,7 @@ struct bpf_cpu_map_entry {
struct task_struct *kthread;
struct bpf_cpumap_val value;
+ struct bpf_prog *prog;
atomic_t refcnt; /* Control when this struct can be free'ed */
struct rcu_head rcu;
@@ -82,6 +83,7 @@ static int bq_flush_to_queue(struct xdp_bulk_queue *bq);
static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
{
+ u32 value_size = attr->value_size;
struct bpf_cpu_map *cmap;
int err = -ENOMEM;
u64 cost;
@@ -92,7 +94,9 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
/* check sanity of attributes */
if (attr->max_entries == 0 || attr->key_size != 4 ||
- attr->value_size != 4 || attr->map_flags & ~BPF_F_NUMA_NODE)
+ (value_size != offsetofend(struct bpf_cpumap_val, qsize) &&
+ value_size != offsetofend(struct bpf_cpumap_val, bpf_prog.fd)) ||
+ attr->map_flags & ~BPF_F_NUMA_NODE)
return ERR_PTR(-EINVAL);
cmap = kzalloc(sizeof(*cmap), GFP_USER);
@@ -214,6 +218,8 @@ static void __cpu_map_ring_cleanup(struct ptr_ring *ring)
static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
{
if (atomic_dec_and_test(&rcpu->refcnt)) {
+ if (rcpu->prog)
+ bpf_prog_put(rcpu->prog);
/* The queue should be empty at this point */
__cpu_map_ring_cleanup(rcpu->queue);
ptr_ring_cleanup(rcpu->queue, NULL);
@@ -222,6 +228,62 @@ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
}
}
+static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
+ void **frames, int n,
+ struct xdp_cpumap_stats *stats)
+{
+ struct xdp_rxq_info rxq;
+ struct xdp_buff xdp;
+ int i, nframes = 0;
+
+ if (!rcpu->prog)
+ return n;
+
+ rcu_read_lock();
+
+ xdp_set_return_frame_no_direct();
+ xdp.rxq = &rxq;
+
+ for (i = 0; i < n; i++) {
+ struct xdp_frame *xdpf = frames[i];
+ u32 act;
+ int err;
+
+ rxq.dev = xdpf->dev_rx;
+ rxq.mem = xdpf->mem;
+ /* TODO: report queue_index to xdp_rxq_info */
+
+ xdp_convert_frame_to_buff(xdpf, &xdp);
+
+ act = bpf_prog_run_xdp(rcpu->prog, &xdp);
+ switch (act) {
+ case XDP_PASS:
+ err = xdp_update_frame_from_buff(&xdp, xdpf);
+ if (err < 0) {
+ xdp_return_frame(xdpf);
+ stats->drop++;
+ } else {
+ frames[nframes++] = xdpf;
+ stats->pass++;
+ }
+ break;
+ default:
+ bpf_warn_invalid_xdp_action(act);
+ /* fallthrough */
+ case XDP_DROP:
+ xdp_return_frame(xdpf);
+ stats->drop++;
+ break;
+ }
+ }
+
+ xdp_clear_return_frame_no_direct();
+
+ rcu_read_unlock();
+
+ return nframes;
+}
+
#define CPUMAP_BATCH 8
static int cpu_map_kthread_run(void *data)
@@ -236,11 +298,12 @@ static int cpu_map_kthread_run(void *data)
* kthread_stop signal until queue is empty.
*/
while (!kthread_should_stop() || !__ptr_ring_empty(rcpu->queue)) {
+ struct xdp_cpumap_stats stats = {}; /* zero stats */
+ gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
unsigned int drops = 0, sched = 0;
void *frames[CPUMAP_BATCH];
void *skbs[CPUMAP_BATCH];
- gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
- int i, n, m;
+ int i, n, m, nframes;
/* Release CPU reschedule checks */
if (__ptr_ring_empty(rcpu->queue)) {
@@ -261,8 +324,8 @@ static int cpu_map_kthread_run(void *data)
* kthread CPU pinned. Lockless access to ptr_ring
* consume side valid as no-resize allowed of queue.
*/
- n = __ptr_ring_consume_batched(rcpu->queue, frames, CPUMAP_BATCH);
-
+ n = __ptr_ring_consume_batched(rcpu->queue, frames,
+ CPUMAP_BATCH);
for (i = 0; i < n; i++) {
void *f = frames[i];
struct page *page = virt_to_page(f);
@@ -274,15 +337,19 @@ static int cpu_map_kthread_run(void *data)
prefetchw(page);
}
- m = kmem_cache_alloc_bulk(skbuff_head_cache, gfp, n, skbs);
- if (unlikely(m == 0)) {
- for (i = 0; i < n; i++)
- skbs[i] = NULL; /* effect: xdp_return_frame */
- drops = n;
+ /* Support running another XDP prog on this CPU */
+ nframes = cpu_map_bpf_prog_run_xdp(rcpu, frames, n, &stats);
+ if (nframes) {
+ m = kmem_cache_alloc_bulk(skbuff_head_cache, gfp, nframes, skbs);
+ if (unlikely(m == 0)) {
+ for (i = 0; i < nframes; i++)
+ skbs[i] = NULL; /* effect: xdp_return_frame */
+ drops += nframes;
+ }
}
local_bh_disable();
- for (i = 0; i < n; i++) {
+ for (i = 0; i < nframes; i++) {
struct xdp_frame *xdpf = frames[i];
struct sk_buff *skb = skbs[i];
int ret;
@@ -299,7 +366,7 @@ static int cpu_map_kthread_run(void *data)
drops++;
}
/* Feedback loop via tracepoint */
- trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched);
+ trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
local_bh_enable(); /* resched point, may call do_softirq() */
}
@@ -309,13 +376,38 @@ static int cpu_map_kthread_run(void *data)
return 0;
}
+bool cpu_map_prog_allowed(struct bpf_map *map)
+{
+ return map->map_type == BPF_MAP_TYPE_CPUMAP &&
+ map->value_size != offsetofend(struct bpf_cpumap_val, qsize);
+}
+
+static int __cpu_map_load_bpf_program(struct bpf_cpu_map_entry *rcpu, int fd)
+{
+ struct bpf_prog *prog;
+
+ prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_XDP);
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
+
+ if (prog->expected_attach_type != BPF_XDP_CPUMAP) {
+ bpf_prog_put(prog);
+ return -EINVAL;
+ }
+
+ rcpu->value.bpf_prog.id = prog->aux->id;
+ rcpu->prog = prog;
+
+ return 0;
+}
+
static struct bpf_cpu_map_entry *
__cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
{
+ int numa, err, i, fd = value->bpf_prog.fd;
gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
struct bpf_cpu_map_entry *rcpu;
struct xdp_bulk_queue *bq;
- int numa, err, i;
/* Have map->numa_node, but choose node of redirect target CPU */
numa = cpu_to_node(cpu);
@@ -357,6 +449,9 @@ __cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
get_cpu_map_entry(rcpu); /* 1-refcnt for being in cmap->cpu_map[] */
get_cpu_map_entry(rcpu); /* 1-refcnt for kthread */
+ if (fd > 0 && __cpu_map_load_bpf_program(rcpu, fd))
+ goto free_ptr_ring;
+
/* Make sure kthread runs on a single CPU */
kthread_bind(rcpu->kthread, cpu);
wake_up_process(rcpu->kthread);
diff --git a/net/core/dev.c b/net/core/dev.c
index b61075828358..b820527f0a8d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5448,6 +5448,8 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
for (i = 0; i < new->aux->used_map_cnt; i++) {
if (dev_map_can_have_prog(new->aux->used_maps[i]))
return -EINVAL;
+ if (cpu_map_prog_allowed(new->aux->used_maps[i]))
+ return -EINVAL;
}
}
@@ -8875,6 +8877,13 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
return -EINVAL;
}
+ if (prog->expected_attach_type == BPF_XDP_CPUMAP) {
+ NL_SET_ERR_MSG(extack,
+ "BPF_XDP_CPUMAP programs can not be attached to a device");
+ bpf_prog_put(prog);
+ return -EINVAL;
+ }
+
/* prog->aux->id may be 0 for orphaned device-bound progs */
if (prog->aux->id && prog->aux->id == prog_id) {
bpf_prog_put(prog);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 109623527358..c010b57fce3f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -227,6 +227,7 @@ enum bpf_attach_type {
BPF_CGROUP_INET6_GETSOCKNAME,
BPF_XDP_DEVMAP,
BPF_CGROUP_INET_SOCK_RELEASE,
+ BPF_XDP_CPUMAP,
__MAX_BPF_ATTACH_TYPE
};
@@ -3856,6 +3857,10 @@ struct bpf_devmap_val {
*/
struct bpf_cpumap_val {
__u32 qsize; /* queue size to remote target CPU */
+ union {
+ int fd; /* prog fd on map write */
+ __u32 id; /* prog id on map read */
+ } bpf_prog;
};
enum sk_action {
--
2.26.2
* [PATCH v7 bpf-next 6/9] bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map entries
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (4 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 5/9] bpf: cpumap: add the possibility to attach an eBPF program to cpumap Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 7/9] libbpf: add SEC name for xdp programs attached to CPUMAP Lorenzo Bianconi
` (4 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Introduce XDP_REDIRECT support for eBPF programs attached to cpumap
entries.
This patch has been tested on Marvell ESPRESSObin using a modified
version of the xdp_redirect_cpu sample in order to attach an XDP program
to CPUMAP entries and perform a redirect on the mvneta interface.
In particular, the following scenario has been tested:
rq (cpu0) --> mvneta - XDP_REDIRECT (cpu0) --> CPUMAP - XDP_REDIRECT (cpu1) --> mvneta
$./xdp_redirect_cpu -p xdp_cpu_map0 -d eth0 -c 1 -e xdp_redirect \
-f xdp_redirect_kern.o -m tx_port -r eth0
tx: 285.2 Kpps rx: 285.2 Kpps
Attaching a simple XDP program on eth0 to perform XDP_TX gives
comparable results:
tx: 288.4 Kpps rx: 288.4 Kpps
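The per-frame bookkeeping this patch adds can be sketched in plain C. This is a hedged, userspace-only model of the accounting in cpu_map_bpf_prog_run_xdp: the stats layout mirrors the patch, but the enum values, the redirect_err flag, and the needs_flush() helper are illustrative stand-ins, not kernel code.

```c
#include <assert.h>

/* Mirrors struct xdp_cpumap_stats after this patch. */
struct xdp_cpumap_stats {
	unsigned int redirect;
	unsigned int pass;
	unsigned int drop;
};

/* Illustrative subset of the XDP verdicts handled in the patch. */
enum xdp_action { XDP_DROP = 1, XDP_PASS = 2, XDP_REDIRECT = 4 };

/* Account one frame; redirect_err models xdp_do_redirect() failing,
 * in which case the frame is returned and counted as a drop.
 */
static void account(enum xdp_action act, int redirect_err,
		    struct xdp_cpumap_stats *stats)
{
	switch (act) {
	case XDP_PASS:
		stats->pass++;
		break;
	case XDP_REDIRECT:
		if (redirect_err)
			stats->drop++;
		else
			stats->redirect++;
		break;
	default: /* XDP_DROP and unknown actions */
		stats->drop++;
		break;
	}
}

/* Mirrors the patch's "if (stats->redirect) xdp_do_flush_map();":
 * a flush is only needed when at least one frame was redirected.
 */
static int needs_flush(const struct xdp_cpumap_stats *stats)
{
	return stats->redirect != 0;
}
```

A batch of {PASS, successful REDIRECT, failed REDIRECT, DROP} would thus end with pass=1, redirect=1, drop=2, and needs_flush() true.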
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
include/net/xdp.h | 1 +
include/trace/events/xdp.h | 6 ++++--
kernel/bpf/cpumap.c | 17 +++++++++++++++--
3 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 83b9e0142b52..5be0d4d65b94 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -99,6 +99,7 @@ struct xdp_frame {
};
struct xdp_cpumap_stats {
+ unsigned int redirect;
unsigned int pass;
unsigned int drop;
};
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index e2c99f5bee39..cd24e8a59529 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -190,6 +190,7 @@ TRACE_EVENT(xdp_cpumap_kthread,
__field(int, sched)
__field(unsigned int, xdp_pass)
__field(unsigned int, xdp_drop)
+ __field(unsigned int, xdp_redirect)
),
TP_fast_assign(
@@ -201,18 +202,19 @@ TRACE_EVENT(xdp_cpumap_kthread,
__entry->sched = sched;
__entry->xdp_pass = xdp_stats->pass;
__entry->xdp_drop = xdp_stats->drop;
+ __entry->xdp_redirect = xdp_stats->redirect;
),
TP_printk("kthread"
" cpu=%d map_id=%d action=%s"
" processed=%u drops=%u"
" sched=%d"
- " xdp_pass=%u xdp_drop=%u",
+ " xdp_pass=%u xdp_drop=%u xdp_redirect=%u",
__entry->cpu, __entry->map_id,
__print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
__entry->processed, __entry->drops,
__entry->sched,
- __entry->xdp_pass, __entry->xdp_drop)
+ __entry->xdp_pass, __entry->xdp_drop, __entry->xdp_redirect)
);
TRACE_EVENT(xdp_cpumap_enqueue,
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index b3a8aea81ee5..4c95d0615ca2 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -239,7 +239,7 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
if (!rcpu->prog)
return n;
- rcu_read_lock();
+ rcu_read_lock_bh();
xdp_set_return_frame_no_direct();
xdp.rxq = &rxq;
@@ -267,6 +267,16 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
stats->pass++;
}
break;
+ case XDP_REDIRECT:
+ err = xdp_do_redirect(xdpf->dev_rx, &xdp,
+ rcpu->prog);
+ if (unlikely(err)) {
+ xdp_return_frame(xdpf);
+ stats->drop++;
+ } else {
+ stats->redirect++;
+ }
+ break;
default:
bpf_warn_invalid_xdp_action(act);
/* fallthrough */
@@ -277,9 +287,12 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
}
}
+ if (stats->redirect)
+ xdp_do_flush_map();
+
xdp_clear_return_frame_no_direct();
- rcu_read_unlock();
+ rcu_read_unlock_bh(); /* resched point, may call do_softirq() */
return nframes;
}
--
2.26.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v7 bpf-next 7/9] libbpf: add SEC name for xdp programs attached to CPUMAP
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (5 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 6/9] bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map entries Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 8/9] samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap Lorenzo Bianconi
` (3 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
As for DEVMAP, support SEC("xdp_cpumap/") as a shortcut for loading
the program with type BPF_PROG_TYPE_XDP and expected attach type
BPF_XDP_CPUMAP.
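The section lookup works by prefix, so everything after the slash is free-form. A minimal userspace sketch of that lookup follows; the table contents reflect the entries touched by this patch, but the struct, enum values, and helper are illustrative, not libbpf's actual internals. Note the ordering: the more specific "xdp_cpumap/" and "xdp_devmap/" prefixes must be tried before the plain "xdp" prefix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the relevant section_defs[] entries. */
enum { BPF_XDP_NONE = 0, BPF_XDP_DEVMAP = 1, BPF_XDP_CPUMAP = 2 };

struct sec_def {
	const char *prefix;
	int expected_attach_type;
};

static const struct sec_def defs[] = {
	{ "xdp_devmap/", BPF_XDP_DEVMAP },
	{ "xdp_cpumap/", BPF_XDP_CPUMAP },
	{ "xdp",         BPF_XDP_NONE },
};

/* Return the expected attach type for a SEC() name, or -1 if no
 * prefix in the table matches.
 */
static int attach_type_for(const char *sec)
{
	size_t i;

	for (i = 0; i < sizeof(defs) / sizeof(defs[0]); i++)
		if (!strncmp(sec, defs[i].prefix, strlen(defs[i].prefix)))
			return defs[i].expected_attach_type;
	return -1;
}
```

With this, SEC("xdp_cpumap/dummy_cm") resolves to expected attach type BPF_XDP_CPUMAP, while a plain SEC("xdp") program keeps no expected attach type.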
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
tools/lib/bpf/libbpf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4489f95f1d1a..f55fd8a5c008 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6912,6 +6912,8 @@ static const struct bpf_sec_def section_defs[] = {
.attach_fn = attach_iter),
BPF_EAPROG_SEC("xdp_devmap/", BPF_PROG_TYPE_XDP,
BPF_XDP_DEVMAP),
+ BPF_EAPROG_SEC("xdp_cpumap/", BPF_PROG_TYPE_XDP,
+ BPF_XDP_CPUMAP),
BPF_PROG_SEC("xdp", BPF_PROG_TYPE_XDP),
BPF_PROG_SEC("perf_event", BPF_PROG_TYPE_PERF_EVENT),
BPF_PROG_SEC("lwt_in", BPF_PROG_TYPE_LWT_IN),
--
2.26.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v7 bpf-next 8/9] samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (6 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 7/9] libbpf: add SEC name for xdp programs attached to CPUMAP Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 9/9] selftest: add tests for XDP programs in CPUMAP entries Lorenzo Bianconi
` (2 subsequent siblings)
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Extend xdp_redirect_cpu_{user,kern}.c, adding the possibility to load
an XDP program on cpumap entries. The following options have been added:
- mprog-name: cpumap entry program name
- mprog-filename: cpumap entry program filename
- redirect-device: output interface if the cpumap program performs an
XDP_REDIRECT to an egress interface
- redirect-map: bpf map used to perform XDP_REDIRECT to an egress
interface
- mprog-disable: disable loading XDP program on cpumap entries
Add xdp_pass, xdp_drop and xdp_redirect stats accounting.
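The per-interval rate reporting follows the sample's existing calc_*_pps helpers. The new logic can be shown as a standalone sketch; the struct below is trimmed to just the fields this patch adds, so it is a model of the helper rather than a drop-in copy.

```c
/* Subset of the sample's struct datarec: only the fields added here. */
struct datarec {
	unsigned long long xdp_pass;
	unsigned long long xdp_drop;
	unsigned long long xdp_redirect;
};

/* Packets-per-second over one sampling period, computed as the delta
 * between the current record r and the previous record p. A non-positive
 * period yields all-zero rates.
 */
static void calc_xdp_pps(const struct datarec *r, const struct datarec *p,
			 double *xdp_pass, double *xdp_drop,
			 double *xdp_redirect, double period)
{
	*xdp_pass = 0;
	*xdp_drop = 0;
	*xdp_redirect = 0;
	if (period > 0) {
		*xdp_redirect = (r->xdp_redirect - p->xdp_redirect) / period;
		*xdp_pass = (r->xdp_pass - p->xdp_pass) / period;
		*xdp_drop = (r->xdp_drop - p->xdp_drop) / period;
	}
}
```

For example, 200 additional xdp_pass packets over a 2-second period reports 100 pps.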
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
samples/bpf/xdp_redirect_cpu_kern.c | 25 ++--
samples/bpf/xdp_redirect_cpu_user.c | 175 +++++++++++++++++++++++++---
2 files changed, 178 insertions(+), 22 deletions(-)
diff --git a/samples/bpf/xdp_redirect_cpu_kern.c b/samples/bpf/xdp_redirect_cpu_kern.c
index 2baf8db1f7e7..8255025dea97 100644
--- a/samples/bpf/xdp_redirect_cpu_kern.c
+++ b/samples/bpf/xdp_redirect_cpu_kern.c
@@ -21,7 +21,7 @@
struct {
__uint(type, BPF_MAP_TYPE_CPUMAP);
__uint(key_size, sizeof(u32));
- __uint(value_size, sizeof(u32));
+ __uint(value_size, sizeof(struct bpf_cpumap_val));
__uint(max_entries, MAX_CPUS);
} cpu_map SEC(".maps");
@@ -30,6 +30,9 @@ struct datarec {
__u64 processed;
__u64 dropped;
__u64 issue;
+ __u64 xdp_pass;
+ __u64 xdp_drop;
+ __u64 xdp_redirect;
};
/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success
@@ -692,13 +695,16 @@ int trace_xdp_cpumap_enqueue(struct cpumap_enqueue_ctx *ctx)
* Code in: kernel/include/trace/events/xdp.h
*/
struct cpumap_kthread_ctx {
- u64 __pad; // First 8 bytes are not accessible by bpf code
- int map_id; // offset:8; size:4; signed:1;
- u32 act; // offset:12; size:4; signed:0;
- int cpu; // offset:16; size:4; signed:1;
- unsigned int drops; // offset:20; size:4; signed:0;
- unsigned int processed; // offset:24; size:4; signed:0;
- int sched; // offset:28; size:4; signed:1;
+ u64 __pad; // First 8 bytes are not accessible
+ int map_id; // offset:8; size:4; signed:1;
+ u32 act; // offset:12; size:4; signed:0;
+ int cpu; // offset:16; size:4; signed:1;
+ unsigned int drops; // offset:20; size:4; signed:0;
+ unsigned int processed; // offset:24; size:4; signed:0;
+ int sched; // offset:28; size:4; signed:1;
+ unsigned int xdp_pass; // offset:32; size:4; signed:0;
+ unsigned int xdp_drop; // offset:36; size:4; signed:0;
+ unsigned int xdp_redirect; // offset:40; size:4; signed:0;
};
SEC("tracepoint/xdp/xdp_cpumap_kthread")
@@ -712,6 +718,9 @@ int trace_xdp_cpumap_kthread(struct cpumap_kthread_ctx *ctx)
return 0;
rec->processed += ctx->processed;
rec->dropped += ctx->drops;
+ rec->xdp_pass += ctx->xdp_pass;
+ rec->xdp_drop += ctx->xdp_drop;
+ rec->xdp_redirect += ctx->xdp_redirect;
/* Count times kthread yielded CPU via schedule call */
if (ctx->sched)
diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c
index 6bb2d95cb26c..004c0622c913 100644
--- a/samples/bpf/xdp_redirect_cpu_user.c
+++ b/samples/bpf/xdp_redirect_cpu_user.c
@@ -70,6 +70,11 @@ static const struct option long_options[] = {
{"stress-mode", no_argument, NULL, 'x' },
{"no-separators", no_argument, NULL, 'z' },
{"force", no_argument, NULL, 'F' },
+ {"mprog-disable", no_argument, NULL, 'n' },
+ {"mprog-name", required_argument, NULL, 'e' },
+ {"mprog-filename", required_argument, NULL, 'f' },
+ {"redirect-device", required_argument, NULL, 'r' },
+ {"redirect-map", required_argument, NULL, 'm' },
{0, 0, NULL, 0 }
};
@@ -156,6 +161,9 @@ struct datarec {
__u64 processed;
__u64 dropped;
__u64 issue;
+ __u64 xdp_pass;
+ __u64 xdp_drop;
+ __u64 xdp_redirect;
};
struct record {
__u64 timestamp;
@@ -175,6 +183,9 @@ static bool map_collect_percpu(int fd, __u32 key, struct record *rec)
/* For percpu maps, userspace gets a value per possible CPU */
unsigned int nr_cpus = bpf_num_possible_cpus();
struct datarec values[nr_cpus];
+ __u64 sum_xdp_redirect = 0;
+ __u64 sum_xdp_pass = 0;
+ __u64 sum_xdp_drop = 0;
__u64 sum_processed = 0;
__u64 sum_dropped = 0;
__u64 sum_issue = 0;
@@ -196,10 +207,19 @@ static bool map_collect_percpu(int fd, __u32 key, struct record *rec)
sum_dropped += values[i].dropped;
rec->cpu[i].issue = values[i].issue;
sum_issue += values[i].issue;
+ rec->cpu[i].xdp_pass = values[i].xdp_pass;
+ sum_xdp_pass += values[i].xdp_pass;
+ rec->cpu[i].xdp_drop = values[i].xdp_drop;
+ sum_xdp_drop += values[i].xdp_drop;
+ rec->cpu[i].xdp_redirect = values[i].xdp_redirect;
+ sum_xdp_redirect += values[i].xdp_redirect;
}
rec->total.processed = sum_processed;
rec->total.dropped = sum_dropped;
rec->total.issue = sum_issue;
+ rec->total.xdp_pass = sum_xdp_pass;
+ rec->total.xdp_drop = sum_xdp_drop;
+ rec->total.xdp_redirect = sum_xdp_redirect;
return true;
}
@@ -300,17 +320,33 @@ static __u64 calc_errs_pps(struct datarec *r,
return pps;
}
+static void calc_xdp_pps(struct datarec *r, struct datarec *p,
+ double *xdp_pass, double *xdp_drop,
+ double *xdp_redirect, double period_)
+{
+ *xdp_pass = 0, *xdp_drop = 0, *xdp_redirect = 0;
+ if (period_ > 0) {
+ *xdp_redirect = (r->xdp_redirect - p->xdp_redirect) / period_;
+ *xdp_pass = (r->xdp_pass - p->xdp_pass) / period_;
+ *xdp_drop = (r->xdp_drop - p->xdp_drop) / period_;
+ }
+}
+
static void stats_print(struct stats_record *stats_rec,
struct stats_record *stats_prev,
- char *prog_name)
+ char *prog_name, char *mprog_name, int mprog_fd)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
double pps = 0, drop = 0, err = 0;
+ bool mprog_enabled = false;
struct record *rec, *prev;
int to_cpu;
double t;
int i;
+ if (mprog_fd > 0)
+ mprog_enabled = true;
+
/* Header */
printf("Running XDP/eBPF prog_name:%s\n", prog_name);
printf("%-15s %-7s %-14s %-11s %-9s\n",
@@ -455,6 +491,34 @@ static void stats_print(struct stats_record *stats_rec,
printf(fm2_err, "xdp_exception", "total", pps, drop);
}
+ /* CPUMAP attached XDP program that runs on remote/destination CPU */
+ if (mprog_enabled) {
+ char *fmt_k = "%-15s %-7d %'-14.0f %'-11.0f %'-10.0f\n";
+ char *fm2_k = "%-15s %-7s %'-14.0f %'-11.0f %'-10.0f\n";
+ double xdp_pass, xdp_drop, xdp_redirect;
+
+ printf("\n2nd remote XDP/eBPF prog_name: %s\n", mprog_name);
+ printf("%-15s %-7s %-14s %-11s %-9s\n",
+ "XDP-cpumap", "CPU:to", "xdp-pass", "xdp-drop", "xdp-redir");
+
+ rec = &stats_rec->kthread;
+ prev = &stats_prev->kthread;
+ t = calc_period(rec, prev);
+ for (i = 0; i < nr_cpus; i++) {
+ struct datarec *r = &rec->cpu[i];
+ struct datarec *p = &prev->cpu[i];
+
+ calc_xdp_pps(r, p, &xdp_pass, &xdp_drop,
+ &xdp_redirect, t);
+ if (xdp_pass > 0 || xdp_drop > 0 || xdp_redirect > 0)
+ printf(fmt_k, "xdp-in-kthread", i, xdp_pass, xdp_drop,
+ xdp_redirect);
+ }
+ calc_xdp_pps(&rec->total, &prev->total, &xdp_pass, &xdp_drop,
+ &xdp_redirect, t);
+ printf(fm2_k, "xdp-in-kthread", "total", xdp_pass, xdp_drop, xdp_redirect);
+ }
+
printf("\n");
fflush(stdout);
}
@@ -491,7 +555,7 @@ static inline void swap(struct stats_record **a, struct stats_record **b)
*b = tmp;
}
-static int create_cpu_entry(__u32 cpu, __u32 queue_size,
+static int create_cpu_entry(__u32 cpu, struct bpf_cpumap_val *value,
__u32 avail_idx, bool new)
{
__u32 curr_cpus_count = 0;
@@ -501,7 +565,7 @@ static int create_cpu_entry(__u32 cpu, __u32 queue_size,
/* Add a CPU entry to cpumap, as this allocate a cpu entry in
* the kernel for the cpu.
*/
- ret = bpf_map_update_elem(cpu_map_fd, &cpu, &queue_size, 0);
+ ret = bpf_map_update_elem(cpu_map_fd, &cpu, value, 0);
if (ret) {
fprintf(stderr, "Create CPU entry failed (err:%d)\n", ret);
exit(EXIT_FAIL_BPF);
@@ -532,9 +596,9 @@ static int create_cpu_entry(__u32 cpu, __u32 queue_size,
}
}
/* map_fd[7] = cpus_iterator */
- printf("%s CPU:%u as idx:%u queue_size:%d (total cpus_count:%u)\n",
+ printf("%s CPU:%u as idx:%u qsize:%d prog_fd: %d (cpus_count:%u)\n",
new ? "Add-new":"Replace", cpu, avail_idx,
- queue_size, curr_cpus_count);
+ value->qsize, value->bpf_prog.fd, curr_cpus_count);
return 0;
}
@@ -558,21 +622,26 @@ static void mark_cpus_unavailable(void)
}
/* Stress cpumap management code by concurrently changing underlying cpumap */
-static void stress_cpumap(void)
+static void stress_cpumap(struct bpf_cpumap_val *value)
{
/* Changing qsize will cause kernel to free and alloc a new
* bpf_cpu_map_entry, with an associated/complicated tear-down
* procedure.
*/
- create_cpu_entry(1, 1024, 0, false);
- create_cpu_entry(1, 8, 0, false);
- create_cpu_entry(1, 16000, 0, false);
+ value->qsize = 1024;
+ create_cpu_entry(1, value, 0, false);
+ value->qsize = 8;
+ create_cpu_entry(1, value, 0, false);
+ value->qsize = 16000;
+ create_cpu_entry(1, value, 0, false);
}
static void stats_poll(int interval, bool use_separators, char *prog_name,
+ char *mprog_name, struct bpf_cpumap_val *value,
bool stress_mode)
{
struct stats_record *record, *prev;
+ int mprog_fd;
record = alloc_stats_record();
prev = alloc_stats_record();
@@ -584,11 +653,12 @@ static void stats_poll(int interval, bool use_separators, char *prog_name,
while (1) {
swap(&prev, &record);
+ mprog_fd = value->bpf_prog.fd;
stats_collect(record);
- stats_print(record, prev, prog_name);
+ stats_print(record, prev, prog_name, mprog_name, mprog_fd);
sleep(interval);
if (stress_mode)
- stress_cpumap();
+ stress_cpumap(value);
}
free_stats_record(record);
@@ -661,15 +731,66 @@ static int init_map_fds(struct bpf_object *obj)
return 0;
}
+static int load_cpumap_prog(char *file_name, char *prog_name,
+ char *redir_interface, char *redir_map)
+{
+ struct bpf_prog_load_attr prog_load_attr = {
+ .prog_type = BPF_PROG_TYPE_XDP,
+ .expected_attach_type = BPF_XDP_CPUMAP,
+ .file = file_name,
+ };
+ struct bpf_program *prog;
+ struct bpf_object *obj;
+ int fd;
+
+ if (bpf_prog_load_xattr(&prog_load_attr, &obj, &fd))
+ return -1;
+
+ if (fd < 0) {
+ fprintf(stderr, "ERR: bpf_prog_load_xattr: %s\n",
+ strerror(errno));
+ return fd;
+ }
+
+ if (redir_interface && redir_map) {
+ int err, map_fd, ifindex_out, key = 0;
+
+ map_fd = bpf_object__find_map_fd_by_name(obj, redir_map);
+ if (map_fd < 0)
+ return map_fd;
+
+ ifindex_out = if_nametoindex(redir_interface);
+ if (!ifindex_out)
+ return -1;
+
+ err = bpf_map_update_elem(map_fd, &key, &ifindex_out, 0);
+ if (err < 0)
+ return err;
+ }
+
+ prog = bpf_object__find_program_by_title(obj, prog_name);
+ if (!prog) {
+ fprintf(stderr, "bpf_object__find_program_by_title failed\n");
+ return EXIT_FAIL;
+ }
+
+ return bpf_program__fd(prog);
+}
+
int main(int argc, char **argv)
{
struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
char *prog_name = "xdp_cpu_map5_lb_hash_ip_pairs";
+ char *mprog_filename = "xdp_redirect_kern.o";
+ char *redir_interface = NULL, *redir_map = NULL;
+ char *mprog_name = "xdp_redirect_dummy";
+ bool mprog_disable = false;
struct bpf_prog_load_attr prog_load_attr = {
.prog_type = BPF_PROG_TYPE_UNSPEC,
};
struct bpf_prog_info info = {};
__u32 info_len = sizeof(info);
+ struct bpf_cpumap_val value;
bool use_separators = true;
bool stress_mode = false;
struct bpf_program *prog;
@@ -725,7 +846,7 @@ int main(int argc, char **argv)
memset(cpu, 0, n_cpus * sizeof(int));
/* Parse commands line args */
- while ((opt = getopt_long(argc, argv, "hSd:s:p:q:c:xzF",
+ while ((opt = getopt_long(argc, argv, "hSd:s:p:q:c:xzFf:e:r:m:",
long_options, &longindex)) != -1) {
switch (opt) {
case 'd':
@@ -759,6 +880,21 @@ int main(int argc, char **argv)
/* Selecting eBPF prog to load */
prog_name = optarg;
break;
+ case 'n':
+ mprog_disable = true;
+ break;
+ case 'f':
+ mprog_filename = optarg;
+ break;
+ case 'e':
+ mprog_name = optarg;
+ break;
+ case 'r':
+ redir_interface = optarg;
+ break;
+ case 'm':
+ redir_map = optarg;
+ break;
case 'c':
/* Add multiple CPUs */
add_cpu = strtoul(optarg, NULL, 0);
@@ -804,8 +940,18 @@ int main(int argc, char **argv)
goto out;
}
+ value.bpf_prog.fd = 0;
+ if (!mprog_disable)
+ value.bpf_prog.fd = load_cpumap_prog(mprog_filename, mprog_name,
+ redir_interface, redir_map);
+ if (value.bpf_prog.fd < 0) {
+ err = value.bpf_prog.fd;
+ goto out;
+ }
+ value.qsize = qsize;
+
for (i = 0; i < added_cpus; i++)
- create_cpu_entry(cpu[i], qsize, i, true);
+ create_cpu_entry(cpu[i], &value, i, true);
/* Remove XDP program when program is interrupted or killed */
signal(SIGINT, int_exit);
@@ -838,7 +984,8 @@ int main(int argc, char **argv)
}
prog_id = info.id;
- stats_poll(interval, use_separators, prog_name, stress_mode);
+ stats_poll(interval, use_separators, prog_name, mprog_name,
+ &value, stress_mode);
out:
free(cpu);
return err;
--
2.26.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v7 bpf-next 9/9] selftest: add tests for XDP programs in CPUMAP entries
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (7 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 8/9] samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap Lorenzo Bianconi
@ 2020-07-14 13:56 ` Lorenzo Bianconi
2020-07-14 15:19 ` [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Alexei Starovoitov
2020-07-17 10:00 ` Jakub Sitnicki
10 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 13:56 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, brouer, daniel, toke, lorenzo.bianconi, dsahern,
andrii.nakryiko
Similar to what has been done for DEVMAP, introduce tests to verify the
ability to add an XDP program to an entry in a CPUMAP.
Verify that CPUMAP programs cannot be attached to devices as a normal
XDP program, and that only programs with the BPF_XDP_CPUMAP attach type
can be loaded into a CPUMAP.
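The kernel-side rule the selftest exercises can be modeled as two checks: netdev attach rejects CPUMAP-typed programs, and a CPUMAP entry update rejects everything else. This is a simplified stand-in for the real checks (in the netdev XDP attach path and __cpu_map_load_bpf_program); the struct and helpers are illustrative, only the -EINVAL outcome mirrors the kernel behavior.

```c
#include <errno.h>

enum { BPF_XDP_NONE = 0, BPF_XDP_CPUMAP = 1 };

/* Illustrative stand-in for the relevant bit of bpf_prog state. */
struct prog {
	int expected_attach_type;
};

/* Attaching to a netdev: CPUMAP-typed programs must be rejected. */
static int netdev_attach(const struct prog *p)
{
	return p->expected_attach_type == BPF_XDP_CPUMAP ? -EINVAL : 0;
}

/* Storing in a CPUMAP entry: only CPUMAP-typed programs are accepted. */
static int cpumap_update(const struct prog *p)
{
	return p->expected_attach_type == BPF_XDP_CPUMAP ? 0 : -EINVAL;
}
```

The selftest asserts exactly these four outcomes: both "wrong place" combinations fail, both matching combinations succeed.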
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
.../bpf/prog_tests/xdp_cpumap_attach.c | 70 +++++++++++++++++++
.../bpf/progs/test_xdp_with_cpumap_helpers.c | 36 ++++++++++
2 files changed, 106 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
new file mode 100644
index 000000000000..0176573fe4e7
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <uapi/linux/bpf.h>
+#include <linux/if_link.h>
+#include <test_progs.h>
+
+#include "test_xdp_with_cpumap_helpers.skel.h"
+
+#define IFINDEX_LO 1
+
+void test_xdp_with_cpumap_helpers(void)
+{
+ struct test_xdp_with_cpumap_helpers *skel;
+ struct bpf_prog_info info = {};
+ struct bpf_cpumap_val val = {
+ .qsize = 192,
+ };
+ __u32 duration = 0, idx = 0;
+ __u32 len = sizeof(info);
+ int err, prog_fd, map_fd;
+
+ skel = test_xdp_with_cpumap_helpers__open_and_load();
+ if (CHECK_FAIL(!skel)) {
+ perror("test_xdp_with_cpumap_helpers__open_and_load");
+ return;
+ }
+
+ /* can not attach program with cpumaps that allow programs
+ * as xdp generic
+ */
+ prog_fd = bpf_program__fd(skel->progs.xdp_redir_prog);
+ err = bpf_set_link_xdp_fd(IFINDEX_LO, prog_fd, XDP_FLAGS_SKB_MODE);
+ CHECK(err == 0, "Generic attach of program with 8-byte CPUMAP",
+ "should have failed\n");
+
+ prog_fd = bpf_program__fd(skel->progs.xdp_dummy_cm);
+ map_fd = bpf_map__fd(skel->maps.cpu_map);
+ err = bpf_obj_get_info_by_fd(prog_fd, &info, &len);
+ if (CHECK_FAIL(err))
+ goto out_close;
+
+ val.bpf_prog.fd = prog_fd;
+ err = bpf_map_update_elem(map_fd, &idx, &val, 0);
+ CHECK(err, "Add program to cpumap entry", "err %d errno %d\n",
+ err, errno);
+
+ err = bpf_map_lookup_elem(map_fd, &idx, &val);
+ CHECK(err, "Read cpumap entry", "err %d errno %d\n", err, errno);
+ CHECK(info.id != val.bpf_prog.id, "Expected program id in cpumap entry",
+ "expected %u read %u\n", info.id, val.bpf_prog.id);
+
+ /* can not attach BPF_XDP_CPUMAP program to a device */
+ err = bpf_set_link_xdp_fd(IFINDEX_LO, prog_fd, XDP_FLAGS_SKB_MODE);
+ CHECK(err == 0, "Attach of BPF_XDP_CPUMAP program",
+ "should have failed\n");
+
+ val.qsize = 192;
+ val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_dummy_prog);
+ err = bpf_map_update_elem(map_fd, &idx, &val, 0);
+ CHECK(err == 0, "Add non-BPF_XDP_CPUMAP program to cpumap entry",
+ "should have failed\n");
+
+out_close:
+ test_xdp_with_cpumap_helpers__destroy(skel);
+}
+
+void test_xdp_cpumap_attach(void)
+{
+ if (test__start_subtest("cpumap_with_progs"))
+ test_xdp_with_cpumap_helpers();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c b/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
new file mode 100644
index 000000000000..59ee4f182ff8
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+#define IFINDEX_LO 1
+
+struct {
+ __uint(type, BPF_MAP_TYPE_CPUMAP);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(struct bpf_cpumap_val));
+ __uint(max_entries, 4);
+} cpu_map SEC(".maps");
+
+SEC("xdp_redir")
+int xdp_redir_prog(struct xdp_md *ctx)
+{
+ return bpf_redirect_map(&cpu_map, 1, 0);
+}
+
+SEC("xdp_dummy")
+int xdp_dummy_prog(struct xdp_md *ctx)
+{
+ return XDP_PASS;
+}
+
+SEC("xdp_cpumap/dummy_cm")
+int xdp_dummy_cm(struct xdp_md *ctx)
+{
+ if (ctx->ingress_ifindex == IFINDEX_LO)
+ return XDP_DROP;
+
+ return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.26.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (8 preceding siblings ...)
2020-07-14 13:56 ` [PATCH v7 bpf-next 9/9] selftest: add tests for XDP programs in CPUMAP entries Lorenzo Bianconi
@ 2020-07-14 15:19 ` Alexei Starovoitov
2020-07-14 15:35 ` Lorenzo Bianconi
2020-07-17 10:00 ` Jakub Sitnicki
10 siblings, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2020-07-14 15:19 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
Jesper Dangaard Brouer, Daniel Borkmann,
Toke Høiland-Jørgensen, lorenzo.bianconi, David Ahern,
Andrii Nakryiko
On Tue, Jul 14, 2020 at 6:56 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
> capability to attach and run a XDP program to CPUMAP entries.
> The idea behind this feature is to add the possibility to define on which CPU
> run the eBPF program if the underlying hw does not support RSS.
> I respin patch 1/6 from a previous series sent by David [2].
> The functionality has been tested on Marvell Espressobin, i40e and mlx5.
> Detailed tests results can be found here:
> https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
>
> Changes since v6:
> - rebase on top of bpf-next
> - move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
fyi. I'm waiting on Daniel to do one more look, since he commented in the past.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-14 15:19 ` [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Alexei Starovoitov
@ 2020-07-14 15:35 ` Lorenzo Bianconi
2020-07-16 16:27 ` Daniel Borkmann
0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-14 15:35 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
Jesper Dangaard Brouer, Daniel Borkmann,
Toke Høiland-Jørgensen, lorenzo.bianconi, David Ahern,
Andrii Nakryiko
> On Tue, Jul 14, 2020 at 6:56 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
> > Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
> > capability to attach and run a XDP program to CPUMAP entries.
> > The idea behind this feature is to add the possibility to define on which CPU
> > run the eBPF program if the underlying hw does not support RSS.
> > I respin patch 1/6 from a previous series sent by David [2].
> > The functionality has been tested on Marvell Espressobin, i40e and mlx5.
> > Detailed tests results can be found here:
> > https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
> >
> > Changes since v6:
> > - rebase on top of bpf-next
> > - move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
>
> fyi. I'm waiting on Daniel to do one more look, since he commented in the past.
ack, thx. I have just figured out today that v6 is not applying anymore.
Regards,
Lorenzo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-14 15:35 ` Lorenzo Bianconi
@ 2020-07-16 16:27 ` Daniel Borkmann
0 siblings, 0 replies; 21+ messages in thread
From: Daniel Borkmann @ 2020-07-16 16:27 UTC (permalink / raw)
To: Lorenzo Bianconi, Alexei Starovoitov
Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
lorenzo.bianconi, David Ahern, Andrii Nakryiko
On 7/14/20 5:35 PM, Lorenzo Bianconi wrote:
>> On Tue, Jul 14, 2020 at 6:56 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>>>
>>> Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
>>> capability to attach and run a XDP program to CPUMAP entries.
>>> The idea behind this feature is to add the possibility to define on which CPU
>>> run the eBPF program if the underlying hw does not support RSS.
>>> I respin patch 1/6 from a previous series sent by David [2].
>>> The functionality has been tested on Marvell Espressobin, i40e and mlx5.
>>> Detailed tests results can be found here:
>>> https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
>>>
>>> Changes since v6:
>>> - rebase on top of bpf-next
>>> - move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
>>
>> fyi. I'm waiting on Daniel to do one more look, since he commented in the past.
>
> ack, thx. I have just figured out today that v6 is not applying anymore.
LGTM, I've applied it, thanks!
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
` (9 preceding siblings ...)
2020-07-14 15:19 ` [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Alexei Starovoitov
@ 2020-07-17 10:00 ` Jakub Sitnicki
2020-07-17 10:08 ` Jakub Sitnicki
2020-07-17 11:01 ` Lorenzo Bianconi
10 siblings, 2 replies; 21+ messages in thread
From: Jakub Sitnicki @ 2020-07-17 10:00 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
On Tue, 14 Jul 2020 15:56:33 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
> capability to attach and run a XDP program to CPUMAP entries.
> The idea behind this feature is to add the possibility to define on which CPU
> run the eBPF program if the underlying hw does not support RSS.
> I respin patch 1/6 from a previous series sent by David [2].
> The functionality has been tested on Marvell Espressobin, i40e and mlx5.
> Detailed tests results can be found here:
> https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
>
> Changes since v6:
> - rebase on top of bpf-next
> - move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
>
> Changes since v5:
> - move bpf_prog_put() in put_cpu_map_entry()
> - remove READ_ONCE(rcpu->prog) in cpu_map_bpf_prog_run_xdp
> - rely on bpf_prog_get_type() instead of bpf_prog_get_type_dev() in
> __cpu_map_load_bpf_program()
>
> Changes since v4:
> - move xdp_clear_return_frame_no_direct inside rcu section
> - update David Ahern's email address
>
> Changes since v3:
> - fix typo in commit message
> - fix access to ctx->ingress_ifindex in cpumap bpf selftest
>
> Changes since v2:
> - improved comments
> - fix return value in xdp_convert_buff_to_frame
> - added patch 1/9: "cpumap: use non-locked version __ptr_ring_consume_batched"
> - do not run kmem_cache_alloc_bulk if all frames have been consumed by the XDP
> program attached to the CPUMAP entry
> - removed bpf_trace_printk in kselftest
>
> Changes since v1:
> - added performance test results
> - added kselftest support
> - fixed memory accounting with page_pool
> - extended xdp_redirect_cpu_user.c to load an external program to perform
> redirect
> - reported ifindex to attached eBPF program
> - moved bpf_cpumap_val definition to include/uapi/linux/bpf.h
>
> [1] https://patchwork.ozlabs.org/project/netdev/cover/20200529220716.75383-1-dsahern@kernel.org/
> [2] https://patchwork.ozlabs.org/project/netdev/patch/20200513014607.40418-2-dsahern@kernel.org/
>
> David Ahern (1):
> net: refactor xdp_convert_buff_to_frame
>
> Jesper Dangaard Brouer (1):
> cpumap: use non-locked version __ptr_ring_consume_batched
>
> Lorenzo Bianconi (7):
> samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option
> loop
> cpumap: formalize map value as a named struct
> bpf: cpumap: add the possibility to attach an eBPF program to cpumap
> bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map
> entries
> libbpf: add SEC name for xdp programs attached to CPUMAP
> samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap
> selftest: add tests for XDP programs in CPUMAP entries
>
> include/linux/bpf.h | 6 +
> include/net/xdp.h | 41 ++--
> include/trace/events/xdp.h | 16 +-
> include/uapi/linux/bpf.h | 14 ++
> kernel/bpf/cpumap.c | 162 +++++++++++---
> net/core/dev.c | 9 +
> samples/bpf/xdp_redirect_cpu_kern.c | 25 ++-
> samples/bpf/xdp_redirect_cpu_user.c | 209 ++++++++++++++++--
> tools/include/uapi/linux/bpf.h | 14 ++
> tools/lib/bpf/libbpf.c | 2 +
> .../bpf/prog_tests/xdp_cpumap_attach.c | 70 ++++++
> .../bpf/progs/test_xdp_with_cpumap_helpers.c | 36 +++
> 12 files changed, 531 insertions(+), 73 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
>
This started showing up when running ./test_progs from recent
bpf-next (bfdfa51702de). Any chance it is related?
[ 2950.440613] =============================================
[ 3073.281578] INFO: task cpumap/0/map:26:536 blocked for more than 860 seconds.
[ 3073.285492] Tainted: G W 5.8.0-rc4-01471-g15d51f3a516b #814
[ 3073.289177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3073.293021] cpumap/0/map:26 D 0 536 2 0x00004000
[ 3073.295755] Call Trace:
[ 3073.297143] __schedule+0x5ad/0xf10
[ 3073.299032] ? pci_mmcfg_check_reserved+0xd0/0xd0
[ 3073.301416] ? static_obj+0x31/0x80
[ 3073.303277] ? mark_held_locks+0x24/0x90
[ 3073.305313] ? cpu_map_update_elem+0x6d0/0x6d0
[ 3073.307544] schedule+0x6f/0x160
[ 3073.309282] schedule_preempt_disabled+0x14/0x20
[ 3073.311593] kthread+0x175/0x240
[ 3073.313299] ? kthread_create_on_node+0xd0/0xd0
[ 3073.315106] ret_from_fork+0x1f/0x30
[ 3073.316365]
Showing all locks held in the system:
[ 3073.318423] 1 lock held by khungtaskd/33:
[ 3073.319642] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
[ 3073.322249] =============================================
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 10:00 ` Jakub Sitnicki
@ 2020-07-17 10:08 ` Jakub Sitnicki
2020-07-17 11:06 ` Lorenzo Bianconi
2020-07-17 11:01 ` Lorenzo Bianconi
1 sibling, 1 reply; 21+ messages in thread
From: Jakub Sitnicki @ 2020-07-17 10:08 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
On Fri, 17 Jul 2020 12:00:13 +0200
Jakub Sitnicki <jakub@cloudflare.com> wrote:
> On Tue, 14 Jul 2020 15:56:33 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > Similar to what David Ahern proposed in [1] for DEVMAPs, introduce the
> > capability to attach and run an XDP program on CPUMAP entries.
> > The idea behind this feature is to make it possible to define on which CPU
> > the eBPF program runs if the underlying hw does not support RSS.
> > I respun patch 1/6 from a previous series sent by David [2].
> > The functionality has been tested on Marvell Espressobin, i40e and mlx5.
> > Detailed test results can be found here:
> > https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap04-map-xdp-prog.org
> >
> > Changes since v6:
> > - rebase on top of bpf-next
> > - move bpf_cpumap_val and bpf_prog in the first bpf_cpu_map_entry cache-line
> >
> > Changes since v5:
> > - move bpf_prog_put() in put_cpu_map_entry()
> > - remove READ_ONCE(rcpu->prog) in cpu_map_bpf_prog_run_xdp
> > - rely on bpf_prog_get_type() instead of bpf_prog_get_type_dev() in
> > __cpu_map_load_bpf_program()
> >
> > Changes since v4:
> > - move xdp_clear_return_frame_no_direct inside rcu section
> > - update David Ahern's email address
> >
> > Changes since v3:
> > - fix typo in commit message
> > - fix access to ctx->ingress_ifindex in cpumap bpf selftest
> >
> > Changes since v2:
> > - improved comments
> > - fix return value in xdp_convert_buff_to_frame
> > - added patch 1/9: "cpumap: use non-locked version __ptr_ring_consume_batched"
> > - do not run kmem_cache_alloc_bulk if all frames have been consumed by the XDP
> > program attached to the CPUMAP entry
> > - removed bpf_trace_printk in kselftest
> >
> > Changes since v1:
> > - added performance test results
> > - added kselftest support
> > - fixed memory accounting with page_pool
> > - extended xdp_redirect_cpu_user.c to load an external program to perform
> > redirect
> > - reported ifindex to attached eBPF program
> > - moved bpf_cpumap_val definition to include/uapi/linux/bpf.h
> >
> > [1] https://patchwork.ozlabs.org/project/netdev/cover/20200529220716.75383-1-dsahern@kernel.org/
> > [2] https://patchwork.ozlabs.org/project/netdev/patch/20200513014607.40418-2-dsahern@kernel.org/
> >
> > David Ahern (1):
> > net: refactor xdp_convert_buff_to_frame
> >
> > Jesper Dangaard Brouer (1):
> > cpumap: use non-locked version __ptr_ring_consume_batched
> >
> > Lorenzo Bianconi (7):
> > samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option
> > loop
> > cpumap: formalize map value as a named struct
> > bpf: cpumap: add the possibility to attach an eBPF program to cpumap
> > bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map
> > entries
> > libbpf: add SEC name for xdp programs attached to CPUMAP
> > samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap
> > selftest: add tests for XDP programs in CPUMAP entries
> >
> > include/linux/bpf.h | 6 +
> > include/net/xdp.h | 41 ++--
> > include/trace/events/xdp.h | 16 +-
> > include/uapi/linux/bpf.h | 14 ++
> > kernel/bpf/cpumap.c | 162 +++++++++++---
> > net/core/dev.c | 9 +
> > samples/bpf/xdp_redirect_cpu_kern.c | 25 ++-
> > samples/bpf/xdp_redirect_cpu_user.c | 209 ++++++++++++++++--
> > tools/include/uapi/linux/bpf.h | 14 ++
> > tools/lib/bpf/libbpf.c | 2 +
> > .../bpf/prog_tests/xdp_cpumap_attach.c | 70 ++++++
> > .../bpf/progs/test_xdp_with_cpumap_helpers.c | 36 +++
> > 12 files changed, 531 insertions(+), 73 deletions(-)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c
> >
>
> This started showing up when running ./test_progs from recent
> bpf-next (bfdfa51702de). Any chance it is related?
>
> [ 2950.440613] =============================================
>
> [ 3073.281578] INFO: task cpumap/0/map:26:536 blocked for more than 860 seconds.
> [ 3073.285492] Tainted: G W 5.8.0-rc4-01471-g15d51f3a516b #814
> [ 3073.289177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 3073.293021] cpumap/0/map:26 D 0 536 2 0x00004000
> [ 3073.295755] Call Trace:
> [ 3073.297143] __schedule+0x5ad/0xf10
> [ 3073.299032] ? pci_mmcfg_check_reserved+0xd0/0xd0
> [ 3073.301416] ? static_obj+0x31/0x80
> [ 3073.303277] ? mark_held_locks+0x24/0x90
> [ 3073.305313] ? cpu_map_update_elem+0x6d0/0x6d0
> [ 3073.307544] schedule+0x6f/0x160
> [ 3073.309282] schedule_preempt_disabled+0x14/0x20
> [ 3073.311593] kthread+0x175/0x240
> [ 3073.313299] ? kthread_create_on_node+0xd0/0xd0
> [ 3073.315106] ret_from_fork+0x1f/0x30
> [ 3073.316365]
> Showing all locks held in the system:
> [ 3073.318423] 1 lock held by khungtaskd/33:
> [ 3073.319642] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
>
> [ 3073.322249] =============================================
'test_maps' looks sad :-( Not sure if related.
bash-5.0# ./test_maps
Fork 1024 tasks to 'test_update_delete'
Fork 1024 tasks to 'test_update_delete'
Fork 100 tasks to 'test_hashmap'
Fork 100 tasks to 'test_hashmap_percpu'
Fork 100 tasks to 'test_hashmap_sizes'
[ 66.961150] test_maps invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 66.962490] CPU: 3 PID: 3263 Comm: test_maps Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
[ 66.963617] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 66.965958] Call Trace:
[ 66.966404] dump_stack+0x9e/0xe0
[ 66.966978] dump_header+0x89/0x49a
[ 66.967624] oom_kill_process.cold+0xb/0x10
[ 66.968379] out_of_memory+0x1b1/0x820
[ 66.969060] ? oom_killer_disable+0x210/0x210
[ 66.969833] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
[ 66.970852] ? warn_alloc+0x120/0x120
[ 66.971508] ? __alloc_pages_nodemask+0x30f/0x5c0
[ 66.972237] __alloc_pages_nodemask+0x4fd/0x5c0
[ 66.972863] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
[ 66.973681] ? find_held_lock+0x85/0xa0
[ 66.974223] ? lock_downgrade+0x360/0x360
[ 66.974760] ? policy_nodemask+0x19/0x90
[ 66.975266] ? policy_node+0x56/0x60
[ 66.975719] pagecache_get_page+0xf7/0x360
[ 66.976259] filemap_fault+0xe4a/0xfe0
[ 66.976743] ? read_cache_page_gfp+0x20/0x20
[ 66.977274] ? find_held_lock+0x85/0xa0
[ 66.977792] ? filemap_page_mkwrite+0x140/0x140
[ 66.978383] __do_fault+0x6a/0x1e0
[ 66.978789] handle_mm_fault+0x16eb/0x1fb0
[ 66.979319] ? copy_page_range+0xf80/0xf80
[ 66.979755] ? vmacache_find+0xba/0x100
[ 66.980166] do_user_addr_fault+0x2ce/0x5ed
[ 66.980625] exc_page_fault+0x5e/0xc0
[ 66.981005] ? asm_exc_page_fault+0x8/0x30
[ 66.981445] asm_exc_page_fault+0x1e/0x30
[ 66.981860] RIP: 0033:0x7f33d40901bd
[ 66.982248] Code: Bad RIP value.
[ 66.982579] RSP: 002b:00007ffe84e1dd18 EFLAGS: 00010202
[ 66.983143] RAX: fffffffffffffff4 RBX: 0000000000000002 RCX: 00007f33d40901bd
[ 66.984038] RDX: 0000000000000078 RSI: 00007ffe84e1dd50 RDI: 0000000000000000
[ 66.984831] RBP: 00007ffe84e1dd30 R08: 00007ffe84e1dd50 R09: 00007ffe84e1dd50
[ 66.985619] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000020000
[ 66.986399] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000060
[ 66.987656] Mem-Info:
[ 66.987924] active_anon:1424 inactive_anon:126 isolated_anon:0
[ 66.987924] active_file:53 inactive_file:33 isolated_file:0
[ 66.987924] unevictable:0 dirty:0 writeback:0
[ 66.987924] slab_reclaimable:11120 slab_unreclaimable:98031
[ 66.987924] mapped:88 shmem:137 pagetables:120 bounce:0
[ 66.987924] free:21175 free_pcp:725 free_cma:0
[ 66.993308] Node 0 active_anon:5696kB inactive_anon:504kB active_file:212kB inactive_file:132kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:352kB dirty:0kB writeback:0kB shmem:548kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? yes
[ 66.997665] Node 0 DMA free:13708kB min:308kB low:384kB high:460kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:96kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 67.002330] lowmem_reserve[]: 0 2925 3354 3354
[ 67.003090] Node 0 DMA32 free:60008kB min:58668kB low:73332kB high:87996kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129212kB managed:3001752kB mlocked:0kB kernel_stack:25536kB pagetables:292kB bounce:0kB free_pcp:844kB local_pcp:248kB free_cma:0kB
[ 67.008157] lowmem_reserve[]: 0 0 428 428
[ 67.008843] Node 0 Normal free:9528kB min:8600kB low:10748kB high:12896kB reserved_highatomic:2048KB active_anon:5404kB inactive_anon:504kB active_file:336kB inactive_file:0kB unevictable:0kB writepending:0kB present:1048576kB managed:439260kB mlocked:0kB kernel_stack:5824kB pagetables:128kB bounce:0kB free_pcp:1596kB local_pcp:660kB free_cma:0kB
[ 67.014007] lowmem_reserve[]: 0 0 0 0
[ 67.014673] Node 0 DMA: 1*4kB (U) 1*8kB (U) 2*16kB (UE) 1*32kB (U) 1*64kB (E) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 3*2048kB (UME) 1*4096kB (M) = 13708kB
[ 67.016793] Node 0 DMA32: 8*4kB (UM) 5*8kB (UM) 4*16kB (UME) 8*32kB (UME) 7*64kB (ME) 8*128kB (UME) 6*256kB (UM) 8*512kB (ME) 3*1024kB (M) 2*2048kB (ME) 11*4096kB (M) = 59720kB
[ 67.018915] Node 0 Normal: 256*4kB (UME) 149*8kB (UM) 89*16kB (UM) 41*32kB (UM) 20*64kB (UMH) 8*128kB (UM) 2*256kB (MH) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 9304kB
[ 67.020673] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 67.021811] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 67.022768] 184 total pagecache pages
[ 67.023418] 0 pages in swap cache
[ 67.023912] Swap cache stats: add 0, delete 0, find 0/0
[ 67.024726] Free swap = 0kB
[ 67.025150] Total swap = 0kB
[ 67.025530] 1048445 pages RAM
[ 67.025865] 0 pages HighMem/MovableOnly
[ 67.026510] 184215 pages reserved
[ 67.027002] 0 pages cma reserved
[ 67.027542] 0 pages hwpoisoned
[ 67.027928] Unreclaimable slab info:
[ 67.028500] Name Used Total
[ 67.029057] 9p-fcall-cache 297KB 445KB
[ 67.029793] 9p-fcall-cache 49KB 49KB
[ 67.030375] 9p-fcall-cache 123KB 272KB
[ 67.030897] 9p-fcall-cache 346KB 495KB
[ 67.031539] p9_req_t 16KB 16KB
[ 67.032326] fib6_nodes 4KB 4KB
[ 67.032950] RAWv6 31KB 31KB
[ 67.033960] mqueue_inode_cache 31KB 31KB
[ 67.034708] ext4_bio_post_read_ctx 15KB 15KB
[ 67.035647] bio-2 7KB 7KB
[ 67.036471] UNIX 372KB 372KB
[ 67.037769] tcp_bind_bucket 4KB 4KB
[ 67.038701] ip_fib_trie 4KB 4KB
[ 67.039496] ip_fib_alias 3KB 3KB
[ 67.040414] ip_dst_cache 4KB 4KB
[ 67.041365] RAW 31KB 31KB
[ 67.042341] UDP 121KB 121KB
[ 67.043540] tw_sock_TCP 7KB 7KB
[ 67.044491] request_sock_TCP 7KB 7KB
[ 67.045447] TCP 58KB 58KB
[ 67.046445] hugetlbfs_inode_cache 31KB 31KB
[ 67.047443] bio-1 15KB 15KB
[ 67.048375] eventpoll_pwq 23KB 23KB
[ 67.049388] eventpoll_epi 35KB 35KB
[ 67.050361] inotify_inode_mark 3KB 3KB
[ 67.051399] request_queue 62KB 62KB
[ 67.052563] blkdev_ioc 7KB 7KB
[ 67.053623] bio-0 20KB 20KB
[ 67.054603] biovec-max 327KB 327KB
[ 67.055593] skbuff_fclone_cache 15KB 15KB
[ 67.056627] skbuff_head_cache 281KB 312KB
[ 67.057868] file_lock_cache 31KB 31KB
[ 67.058878] file_lock_ctx 15KB 15KB
[ 67.059874] fsnotify_mark_connector 4KB 4KB
[ 67.060957] task_delay_info 451KB 451KB
[ 67.061644] proc_dir_entry 125KB 125KB
[ 67.062404] pde_opener 59KB 59KB
[ 67.063696] seq_file 210KB 241KB
[ 67.064759] sigqueue 19KB 19KB
[ 67.065763] shmem_inode_cache 795KB 795KB
[ 67.066778] kernfs_node_cache 2565KB 2565KB
[ 67.067762] mnt_cache 31KB 31KB
[ 67.068798] filp 11820KB 11820KB
[ 67.069807] names_cache 327KB 476KB
[ 67.070795] key_jar 15KB 15KB
[ 67.071787] nsproxy 3KB 3KB
[ 67.072802] vm_area_struct 3236KB 3400KB
[ 67.073773] mm_struct 870KB 1154KB
[ 67.074787] fs_cache 232KB 264KB
[ 67.076008] files_cache 860KB 972KB
[ 67.076969] signal_cache 2626KB 2686KB
[ 67.077911] sighand_cache 2370KB 2405KB
[ 67.078963] task_struct 8564KB 8564KB
[ 67.080472] cred_jar 392KB 420KB
[ 67.081767] anon_vma_chain 1191KB 1413KB
[ 67.082736] anon_vma 200KB 220KB
[ 67.083805] pid 593KB 640KB
[ 67.084733] Acpi-Operand 265KB 308KB
[ 67.085598] Acpi-ParseExt 47KB 47KB
[ 67.086504] Acpi-Parse 189KB 205KB
[ 67.091509] Acpi-State 200KB 216KB
[ 67.092511] Acpi-Namespace 24KB 24KB
[ 67.093388] numa_policy 3KB 3KB
[ 67.094249] trace_event_file 151KB 151KB
[ 67.095076] ftrace_event_field 196KB 196KB
[ 67.099230] pool_workqueue 16KB 16KB
[ 67.100094] vmap_area 567KB 567KB
[ 67.105224] page->ptl 385KB 433KB
[ 67.106079] kmemleak_scan_area 286KB 286KB
[ 67.107190] kmemleak_object 163919KB 168558KB
[ 67.108081] kmalloc-8k 21840KB 21840KB
[ 67.108932] kmalloc-4k 24592KB 24592KB
[ 67.114003] kmalloc-2k 15684KB 15684KB
[ 67.114896] kmalloc-1k 44936KB 44936KB
[ 67.121018] kmalloc-512 4192KB 4192KB
[ 67.122212] kmalloc-256 1128KB 1128KB
[ 67.123089] kmalloc-192 5008KB 5008KB
[ 67.128978] kmalloc-128 356KB 356KB
[ 67.129877] kmalloc-96 145KB 156KB
[ 67.132262] kmalloc-64 1032KB 1032KB
[ 67.133085] kmalloc-32 233KB 244KB
[ 67.138993] kmalloc-16 108KB 108KB
[ 67.139847] kmalloc-8 440KB 533KB
[ 67.142221] kmem_cache_node 43KB 43KB
[ 67.143038] kmem_cache 140KB 140KB
[ 67.149977] Tasks state (memory values in pages):
[ 67.150743] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 67.153263] [ 103] 0 103 7766 1042 81920 0 -1000 systemd-udevd
[ 67.157980] [ 194] 0 194 1032 136 40960 0 0 bash
[ 67.161986] [ 195] 0 195 758 40 45056 0 0 test_maps
[ 67.167074] [ 3178] 0 3178 758 39 45056 0 0 test_maps
[ 67.170572] [ 3186] 0 3186 758 39 45056 0 0 test_maps
[ 67.172052] [ 3205] 0 3205 758 39 45056 0 0 test_maps
[ 67.177580] [ 3213] 0 3213 758 39 45056 0 0 test_maps
[ 67.179026] [ 3221] 0 3221 758 39 45056 0 0 test_maps
[ 67.184143] [ 3222] 0 3222 758 39 45056 0 0 test_maps
[ 67.187571] [ 3230] 0 3230 758 39 45056 0 0 test_maps
[ 67.189053] [ 3241] 0 3241 758 39 45056 0 0 test_maps
[ 67.194580] [ 3243] 0 3243 758 39 45056 0 0 test_maps
[ 67.196058] [ 3250] 0 3250 758 39 45056 0 0 test_maps
[ 67.201124] [ 3263] 0 3263 758 39 45056 0 0 test_maps
[ 67.206581] [ 3298] 0 3298 758 39 45056 0 0 test_maps
[ 67.210563] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=bash,pid=194,uid=0
[ 67.214666] Out of memory: Killed process 194 (bash) total-vm:4128kB, anon-rss:544kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:40kB oom_score_adj:0
[ 67.226601] oom_reaper: reaped process 3298 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.228355] oom_reaper: reaped process 3178 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.234148] oom_reaper: reaped process 3222 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.236873] oom_reaper: reaped process 3213 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.239028] oom_reaper: reaped process 3186 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.268146] oom_reaper: reaped process 3221 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 67.275959] kthreadd invoked oom-killer: gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=1, oom_score_adj=0
[ 67.277227] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
[ 67.278278] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 67.279895] Call Trace:
[ 67.280250] dump_stack+0x9e/0xe0
[ 67.280674] dump_header+0x89/0x49a
[ 67.281113] out_of_memory.cold+0xa/0xbb
[ 67.281608] ? oom_killer_disable+0x210/0x210
[ 67.282157] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
[ 67.282894] ? warn_alloc+0x120/0x120
[ 67.283377] ? __alloc_pages_nodemask+0x30f/0x5c0
[ 67.283925] __alloc_pages_nodemask+0x4fd/0x5c0
[ 67.284446] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
[ 67.285146] alloc_slab_page+0x2e/0x7a0
[ 67.285590] ? new_slab+0x22e/0x2b0
[ 67.285999] new_slab+0x276/0x2b0
[ 67.286402] ___slab_alloc+0x4ba/0x6d0
[ 67.286892] ? copy_process+0x256d/0x2f80
[ 67.287391] ? lock_downgrade+0x360/0x360
[ 67.287908] ? copy_process+0x256d/0x2f80
[ 67.288482] ? __slab_alloc.isra.0+0x4b/0x90
[ 67.289207] __slab_alloc.isra.0+0x4b/0x90
[ 67.289874] ? copy_process+0x256d/0x2f80
[ 67.290565] kmem_cache_alloc_node+0xb7/0x330
[ 67.291163] ? trace_hardirqs_on+0x1e/0x130
[ 67.291699] copy_process+0x256d/0x2f80
[ 67.292231] ? mark_lock+0x13f/0xc30
[ 67.292746] ? find_held_lock+0x85/0xa0
[ 67.293314] ? __cleanup_sighand+0x60/0x60
[ 67.293889] _do_fork+0xcf/0x840
[ 67.294354] ? copy_init_mm+0x20/0x20
[ 67.294856] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 67.295530] ? _raw_spin_unlock_irq+0x24/0x50
[ 67.296036] ? trace_hardirqs_on+0x1e/0x130
[ 67.296529] ? preempt_count_sub+0x14/0xc0
[ 67.297010] ? lock_acquire+0x133/0x4e0
[ 67.297467] kernel_thread+0xa8/0xe0
[ 67.297886] ? legacy_clone_args_valid+0x30/0x30
[ 67.298429] ? kthread_create_on_node+0xd0/0xd0
[ 67.298966] ? do_raw_spin_unlock+0xa3/0x130
[ 67.299458] ? preempt_count_sub+0x14/0xc0
[ 67.299907] kthreadd+0x2be/0x340
[ 67.300290] ? kthread_create_on_cpu+0x120/0x120
[ 67.300775] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 67.301324] ? _raw_spin_unlock_irq+0x24/0x50
[ 67.301771] ? trace_hardirqs_on+0x1e/0x130
[ 67.302206] ? kthread_create_on_cpu+0x120/0x120
[ 67.302685] ret_from_fork+0x1f/0x30
[ 67.303234] Mem-Info:
[ 67.303485] active_anon:1210 inactive_anon:126 isolated_anon:0
[ 67.303485] active_file:5 inactive_file:12 isolated_file:0
[ 67.303485] unevictable:0 dirty:0 writeback:0
[ 67.303485] slab_reclaimable:11187 slab_unreclaimable:98423
[ 67.303485] mapped:12 shmem:137 pagetables:70 bounce:0
[ 67.303485] free:20722 free_pcp:1 free_cma:0
[ 67.307090] Node 0 active_anon:4840kB inactive_anon:504kB active_file:20kB inactive_file:48kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:48kB dirty:0kB writeback:0kB shmem:548kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? yes
[ 67.309863] Node 0 DMA free:13708kB min:308kB low:384kB high:460kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:96kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 67.312641] lowmem_reserve[]: 0 2925 3354 3354
[ 67.313081] Node 0 DMA32 free:60868kB min:58668kB low:73332kB high:87996kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129212kB managed:3001752kB mlocked:0kB kernel_stack:25824kB pagetables:240kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB
[ 67.316022] lowmem_reserve[]: 0 0 428 428
[ 67.316560] Node 0 Normal free:8312kB min:8600kB low:10748kB high:12896kB reserved_highatomic:0KB active_anon:4916kB inactive_anon:504kB active_file:308kB inactive_file:32kB unevictable:0kB writepending:0kB present:1048576kB managed:439260kB mlocked:0kB kernel_stack:8192kB pagetables:40kB bounce:0kB free_pcp:392kB local_pcp:0kB free_cma:0kB
[ 67.319987] lowmem_reserve[]: 0 0 0 0
[ 67.320421] Node 0 DMA: 1*4kB (U) 1*8kB (U) 2*16kB (UE) 1*32kB (U) 1*64kB (E) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 3*2048kB (UME) 1*4096kB (M) = 13708kB
[ 67.321913] Node 0 DMA32: 6*4kB (M) 26*8kB (UM) 29*16kB (UM) 11*32kB (UME) 9*64kB (UME) 10*128kB (UME) 6*256kB (UM) 7*512kB (ME) 4*1024kB (M) 2*2048kB (ME) 11*4096kB (M) = 61272kB
[ 67.323562] Node 0 Normal: 488*4kB (UME) 251*8kB (UM) 86*16kB (UM) 37*32kB (UM) 18*64kB (UM) 6*128kB (M) 4*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9464kB
[ 67.325773] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 67.326999] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 67.328287] 155 total pagecache pages
[ 67.328836] 0 pages in swap cache
[ 67.329365] Swap cache stats: add 0, delete 0, find 0/0
[ 67.330140] Free swap = 0kB
[ 67.330596] Total swap = 0kB
[ 67.331043] 1048445 pages RAM
[ 67.331478] 0 pages HighMem/MovableOnly
[ 67.332080] 184215 pages reserved
[ 67.332617] 0 pages cma reserved
[ 67.333114] 0 pages hwpoisoned
[ 67.333816] Unreclaimable slab info:
[ 67.334439] Name Used Total
[ 67.335297] 9p-fcall-cache 297KB 445KB
[ 67.336094] 9p-fcall-cache 49KB 49KB
[ 67.336933] 9p-fcall-cache 123KB 272KB
[ 67.337751] 9p-fcall-cache 445KB 495KB
[ 67.338572] p9_req_t 16KB 16KB
[ 67.339391] fib6_nodes 4KB 4KB
[ 67.340229] RAWv6 31KB 31KB
[ 67.341033] mqueue_inode_cache 31KB 31KB
[ 67.341997] ext4_bio_post_read_ctx 15KB 15KB
[ 67.342835] bio-2 7KB 7KB
[ 67.343695] UNIX 372KB 372KB
[ 67.344560] tcp_bind_bucket 4KB 4KB
[ 67.345368] ip_fib_trie 4KB 4KB
[ 67.346190] ip_fib_alias 3KB 3KB
[ 67.346997] ip_dst_cache 4KB 4KB
[ 67.347819] RAW 31KB 31KB
[ 67.348668] UDP 121KB 121KB
[ 67.349856] tw_sock_TCP 7KB 7KB
[ 67.350723] request_sock_TCP 7KB 7KB
[ 67.351567] TCP 58KB 58KB
[ 67.352412] hugetlbfs_inode_cache 31KB 31KB
[ 67.353319] bio-1 15KB 15KB
[ 67.354116] eventpoll_pwq 23KB 23KB
[ 67.354919] eventpoll_epi 35KB 35KB
[ 67.355735] inotify_inode_mark 3KB 3KB
[ 67.356618] request_queue 62KB 62KB
[ 67.357680] blkdev_ioc 7KB 7KB
[ 67.358569] bio-0 20KB 20KB
[ 67.359520] biovec-max 327KB 327KB
[ 67.360396] skbuff_fclone_cache 15KB 15KB
[ 67.361277] skbuff_head_cache 281KB 312KB
[ 67.362116] file_lock_cache 31KB 31KB
[ 67.362967] file_lock_ctx 15KB 15KB
[ 67.363597] fsnotify_mark_connector 4KB 4KB
[ 67.364279] task_delay_info 455KB 455KB
[ 67.364832] proc_dir_entry 125KB 125KB
[ 67.365663] pde_opener 59KB 59KB
[ 67.366265] seq_file 210KB 241KB
[ 67.366805] sigqueue 19KB 19KB
[ 67.367386] shmem_inode_cache 795KB 795KB
[ 67.367909] kernfs_node_cache 2565KB 2565KB
[ 67.368489] mnt_cache 31KB 31KB
[ 67.369014] filp 11820KB 11820KB
[ 67.369585] names_cache 327KB 476KB
[ 67.370113] key_jar 15KB 15KB
[ 67.370664] nsproxy 3KB 3KB
[ 67.371230] vm_area_struct 2870KB 3100KB
[ 67.371754] mm_struct 831KB 1154KB
[ 67.372337] fs_cache 176KB 212KB
[ 67.372862] files_cache 843KB 956KB
[ 67.373625] signal_cache 2809KB 2809KB
[ 67.374210] sighand_cache 2615KB 2615KB
[ 67.374750] task_struct 9207KB 9207KB
[ 67.375335] cred_jar 392KB 420KB
[ 67.375856] anon_vma_chain 1058KB 1291KB
[ 67.376437] anon_vma 144KB 160KB
[ 67.376952] pid 609KB 640KB
[ 67.377519] Acpi-Operand 265KB 308KB
[ 67.378063] Acpi-ParseExt 47KB 47KB
[ 67.378645] Acpi-Parse 189KB 205KB
[ 67.379216] Acpi-State 200KB 216KB
[ 67.379747] Acpi-Namespace 24KB 24KB
[ 67.380345] numa_policy 3KB 3KB
[ 67.380870] trace_event_file 151KB 151KB
[ 67.381649] ftrace_event_field 196KB 196KB
[ 67.382199] pool_workqueue 16KB 16KB
[ 67.382722] vmap_area 567KB 567KB
[ 67.383261] page->ptl 384KB 429KB
[ 67.383781] kmemleak_scan_area 286KB 286KB
[ 67.384527] kmemleak_object 162939KB 167943KB
[ 67.385080] kmalloc-8k 21840KB 21840KB
[ 67.385618] kmalloc-4k 25736KB 25736KB
[ 67.386158] kmalloc-2k 15684KB 15684KB
[ 67.386679] kmalloc-1k 44936KB 44936KB
[ 67.387208] kmalloc-512 4192KB 4192KB
[ 67.387826] kmalloc-256 1184KB 1184KB
[ 67.388514] kmalloc-192 5032KB 5032KB
[ 67.389397] kmalloc-128 356KB 356KB
[ 67.390100] kmalloc-96 145KB 156KB
[ 67.390775] kmalloc-64 1052KB 1052KB
[ 67.391489] kmalloc-32 233KB 244KB
[ 67.392261] kmalloc-16 108KB 108KB
[ 67.392964] kmalloc-8 440KB 533KB
[ 67.393657] kmem_cache_node 43KB 43KB
[ 67.394424] kmem_cache 140KB 140KB
[ 67.395296] Tasks state (memory values in pages):
[ 67.395889] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 67.397031] [ 103] 0 103 7766 1042 81920 0 -1000 systemd-udevd
[ 67.398411] Out of memory and no killable processes...
[ 67.399203] Kernel panic - not syncing: System is deadlocked on memory
[ 67.400146] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
[ 67.401254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 67.403154] Call Trace:
[ 67.403536] dump_stack+0x9e/0xe0
[ 67.404021] panic+0x1ab/0x3ae
[ 67.404497] ? __warn_printk+0xf3/0xf3
[ 67.405075] ? __rcu_read_unlock+0x58/0x250
[ 67.405705] ? out_of_memory.cold+0x2d/0xbb
[ 67.406272] ? out_of_memory.cold+0x1f/0xbb
[ 67.406910] out_of_memory.cold+0x45/0xbb
[ 67.407540] ? oom_killer_disable+0x210/0x210
[ 67.408225] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
[ 67.409178] ? warn_alloc+0x120/0x120
[ 67.409849] ? __alloc_pages_nodemask+0x30f/0x5c0
[ 67.410722] __alloc_pages_nodemask+0x4fd/0x5c0
[ 67.411533] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
[ 67.412641] alloc_slab_page+0x2e/0x7a0
[ 67.413345] ? new_slab+0x22e/0x2b0
[ 67.413883] new_slab+0x276/0x2b0
[ 67.414534] ___slab_alloc+0x4ba/0x6d0
[ 67.415221] ? copy_process+0x256d/0x2f80
[ 67.415930] ? lock_downgrade+0x360/0x360
[ 67.416659] ? copy_process+0x256d/0x2f80
[ 67.417390] ? __slab_alloc.isra.0+0x4b/0x90
[ 67.418171] __slab_alloc.isra.0+0x4b/0x90
[ 67.418903] ? copy_process+0x256d/0x2f80
[ 67.419600] kmem_cache_alloc_node+0xb7/0x330
[ 67.420339] ? trace_hardirqs_on+0x1e/0x130
[ 67.421078] copy_process+0x256d/0x2f80
[ 67.421722] ? mark_lock+0x13f/0xc30
[ 67.422284] ? find_held_lock+0x85/0xa0
[ 67.422838] ? __cleanup_sighand+0x60/0x60
[ 67.423515] _do_fork+0xcf/0x840
[ 67.423978] ? copy_init_mm+0x20/0x20
[ 67.424496] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 67.425321] ? _raw_spin_unlock_irq+0x24/0x50
[ 67.426018] ? trace_hardirqs_on+0x1e/0x130
[ 67.426702] ? preempt_count_sub+0x14/0xc0
[ 67.427328] ? lock_acquire+0x133/0x4e0
[ 67.427938] kernel_thread+0xa8/0xe0
[ 67.428523] ? legacy_clone_args_valid+0x30/0x30
[ 67.429283] ? kthread_create_on_node+0xd0/0xd0
[ 67.430026] ? do_raw_spin_unlock+0xa3/0x130
[ 67.430708] ? preempt_count_sub+0x14/0xc0
[ 67.431402] kthreadd+0x2be/0x340
[ 67.431958] ? kthread_create_on_cpu+0x120/0x120
[ 67.432726] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 67.433575] ? _raw_spin_unlock_irq+0x24/0x50
[ 67.434288] ? trace_hardirqs_on+0x1e/0x130
[ 67.434976] ? kthread_create_on_cpu+0x120/0x120
[ 67.435746] ret_from_fork+0x1f/0x30
[ 67.436548] Kernel Offset: disabled
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 10:00 ` Jakub Sitnicki
2020-07-17 10:08 ` Jakub Sitnicki
@ 2020-07-17 11:01 ` Lorenzo Bianconi
2020-07-17 15:13 ` Jakub Sitnicki
1 sibling, 1 reply; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-17 11:01 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]
[...]
> This started showing up when running ./test_progs from recent
> bpf-next (bfdfa51702de). Any chance it is related?
>
> [ 2950.440613] =============================================
>
> [ 3073.281578] INFO: task cpumap/0/map:26:536 blocked for more than 860 seconds.
> [ 3073.285492] Tainted: G W 5.8.0-rc4-01471-g15d51f3a516b #814
> [ 3073.289177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 3073.293021] cpumap/0/map:26 D 0 536 2 0x00004000
> [ 3073.295755] Call Trace:
> [ 3073.297143] __schedule+0x5ad/0xf10
> [ 3073.299032] ? pci_mmcfg_check_reserved+0xd0/0xd0
> [ 3073.301416] ? static_obj+0x31/0x80
> [ 3073.303277] ? mark_held_locks+0x24/0x90
> [ 3073.305313] ? cpu_map_update_elem+0x6d0/0x6d0
> [ 3073.307544] schedule+0x6f/0x160
> [ 3073.309282] schedule_preempt_disabled+0x14/0x20
> [ 3073.311593] kthread+0x175/0x240
> [ 3073.313299] ? kthread_create_on_node+0xd0/0xd0
> [ 3073.315106] ret_from_fork+0x1f/0x30
> [ 3073.316365]
> Showing all locks held in the system:
> [ 3073.318423] 1 lock held by khungtaskd/33:
> [ 3073.319642] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
>
> [ 3073.322249] =============================================
Hi Jakub,
can you please provide more info? Can you identify the test that triggers
the issue? I ran test_progs with the bpf-next master branch and it works fine for me.
I ran the tests in a VM with 4 vCPUs and 4G of memory.
Regards,
Lorenzo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 10:08 ` Jakub Sitnicki
@ 2020-07-17 11:06 ` Lorenzo Bianconi
0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-17 11:06 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: netdev, davem, ast, brouer, daniel, toke, dsahern, andrii.nakryiko, bpf
[-- Attachment #1: Type: text/plain, Size: 33116 bytes --]
> On Fri, 17 Jul 2020 12:00:13 +0200
> Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
[...]
> 'test_maps' looks sad :-( Not sure if related.
I can trigger this issue even after switching back to commit:
commit 59632b220f2d61df274ed3a14a204e941051fdad
Author: Randy Dunlap <rdunlap@infradead.org>
Date: Wed Jul 15 09:42:46 2020 -0700
net: ipv6: drop duplicate word in comment
[ 244.327081] [ 8645] 0 8645 760 31 40960 0 0 test_maps
[ 244.327148] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),global_oom,task_memcg=/system.slice/polkit.service,task=polkitd,pid=211,uid=997
[ 244.327264] Out of memory: Killed process 211 (polkitd) total-vm:1541300kB, anon-rss:7516kB, file-rss:0kB, shmem-rss:0kB, UID:997 pgtables:180kB oom_score_adj:0
[ 244.333067] test_maps invoked oom-killer: gfp_mask=0x140cc0(GFP_USER|__GFP_COMP), order=0, oom_score_adj=0
[ 244.333141] CPU: 0 PID: 8632 Comm: test_maps Not tainted 5.8.0-rc4-kvm+ #196
[ 244.333196] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
[ 244.333270] Call Trace:
[ 244.333301] dump_stack+0x57/0x70
[ 244.333334] dump_header+0x4a/0x1d7
[ 244.333367] oom_kill_process.cold+0xb/0x10
[ 244.333398] out_of_memory+0x1ca/0x280
[ 244.333428] __alloc_pages_slowpath.constprop.0+0x946/0xc00
[ 244.333489] __alloc_pages_nodemask+0x1f7/0x230
[ 244.333529] alloc_slab_page+0x157/0x2b0
[ 244.333556] allocate_slab+0x2da/0x320
[ 244.333585] ___slab_alloc.constprop.0+0x283/0x4f0
[ 244.333622] ? update_load_avg+0x5b/0x520
[ 244.333649] ? htab_map_alloc+0x3a/0x480
[ 244.333677] ? set_next_entity+0x60/0x80
[ 244.333705] ? pick_next_task_fair+0x25f/0x2c0
[ 244.333742] ? kvm_sched_clock_read+0xd/0x20
[ 244.333776] kmem_cache_alloc_trace+0x1ca/0x1e0
[ 244.333811] ? htab_map_alloc+0x3a/0x480
[ 244.333839] htab_map_alloc+0x3a/0x480
[ 244.333869] __do_sys_bpf+0x2a4/0x1c30
[ 244.333897] ? hrtimer_nanosleep+0xb8/0x190
[ 244.333926] do_syscall_64+0x45/0x240
[ 244.333953] ? exc_page_fault+0x1b4/0x4a0
[ 244.333993] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 244.334037] RIP: 0033:0x7fe678a5b43d
[ 244.334072] Code: Bad RIP value.
[ 244.334103] RSP: 002b:00007ffed0c408a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
[ 244.334166] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fe678a5b43d
[ 244.334219] RDX: 0000000000000078 RSI: 00007ffed0c408e0 RDI: 0000000000000000
[ 244.334268] RBP: 00007ffed0c408c0 R08: 00007ffed0c408e0 R09: 00007ffed0c408e0
[ 244.334326] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000001
[ 244.334393] R13: 0000000000000002 R14: 0000000000000054 R15: 0000000000000000
[ 244.334457] Mem-Info:
[ 244.334488] active_anon:13749 inactive_anon:81 isolated_anon:0
active_file:0 inactive_file:2 isolated_file:0
unevictable:0 dirty:0 writeback:0
slab_reclaimable:4723 slab_unreclaimable:25886
mapped:38 shmem:125 pagetables:1432 bounce:0
free:25932 free_pcp:676 free_cma:0
[ 244.334684] Node 0 active_anon:54996kB inactive_anon:324kB active_file:0kB inactive_file:8kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:152kB dirty:0kB writeback:0kB shmem:500kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 4096kB writeback_tmp:0kB all_unreclaimable? no
[ 244.334839] DMA free:8528kB min:704kB low:880kB high:1056kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:16kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 244.334991] lowmem_reserve[]: 0 1972 1972 1972
[ 244.335029] DMA32 free:95200kB min:97596kB low:119944kB high:142292kB reserved_highatomic:4096KB active_anon:54996kB inactive_anon:324kB active_file:100kB inactive_file:492kB unevictable:0kB writepending:0kB present:2080624kB managed:2024508kB mlocked:0kB kernel_stack:6496kB pagetables:5728kB bounce:0kB free_pcp:2704kB local_pcp:1468kB free_cma:0kB
[ 244.335202] lowmem_reserve[]: 0 0 0 0
[ 244.335230] DMA: 6*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 0*64kB 0*128kB 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 2*4096kB (ME) = 8528kB
[ 244.335313] DMA32: 1607*4kB (MEH) 1383*8kB (UMEH) 913*16kB (UMH) 556*32kB (UMEH) 368*64kB (MEH) 82*128kB (MH) 27*256kB (MH) 0*512kB 0*1024kB 2*2048kB (UM) 0*4096kB = 94948kB
[ 244.335415] 159 total pagecache pages
[ 244.335444] 0 pages in swap cache
[ 244.335472] Swap cache stats: add 0, delete 0, find 0/0
[ 244.335503] Free swap = 0kB
[ 244.335530] Total swap = 0kB
[ 244.335559] 524154 pages RAM
[ 244.335589] 0 pages HighMem/MovableOnly
[ 244.335618] 14050 pages reserved
[ 244.335645] Unreclaimable slab info:
[ 244.335672] Name Used Total
[ 244.335760] 9p-fcall-cache 32KB 32KB
[ 244.335808] p9_req_t 3KB 3KB
[ 244.335842] PINGv6 31KB 31KB
[ 244.335880] RAWv6 78KB 78KB
[ 244.335915] UDPv6 30KB 30KB
[ 244.335951] tw_sock_TCPv6 3KB 3KB
[ 244.335987] request_sock_TCPv6 3KB 3KB
[ 244.336022] TCPv6 30KB 30KB
Increasing the VM memory from 2 GB to 4 GB fixes the issue for me. Can you
please double-check?
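For readers trying to reproduce this: Jakub's later log shows virtme in use, so the extra guest memory can be passed through to QEMU along these lines (a sketch only — the kernel tree path is a placeholder, not taken from the thread):

```shell
# Boot the test kernel with 4 GB of guest memory instead of 2 GB.
# --qemu-opts forwards the remaining arguments straight to qemu;
# replace ~/src/linux with your own kernel checkout.
virtme-run --kdir ~/src/linux --mods=auto --cpus 4 --qemu-opts -m 4G
```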
Regards,
Lorenzo
>
> bash-5.0# ./test_maps
> Fork 1024 tasks to 'test_update_delete'
> Fork 1024 tasks to 'test_update_delete'
> Fork 100 tasks to 'test_hashmap'
> Fork 100 tasks to 'test_hashmap_percpu'
> Fork 100 tasks to 'test_hashmap_sizes'
> [ 66.961150] test_maps invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
> [ 66.962490] CPU: 3 PID: 3263 Comm: test_maps Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
> [ 66.963617] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ 66.965958] Call Trace:
> [ 66.966404] dump_stack+0x9e/0xe0
> [ 66.966978] dump_header+0x89/0x49a
> [ 66.967624] oom_kill_process.cold+0xb/0x10
> [ 66.968379] out_of_memory+0x1b1/0x820
> [ 66.969060] ? oom_killer_disable+0x210/0x210
> [ 66.969833] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
> [ 66.970852] ? warn_alloc+0x120/0x120
> [ 66.971508] ? __alloc_pages_nodemask+0x30f/0x5c0
> [ 66.972237] __alloc_pages_nodemask+0x4fd/0x5c0
> [ 66.972863] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
> [ 66.973681] ? find_held_lock+0x85/0xa0
> [ 66.974223] ? lock_downgrade+0x360/0x360
> [ 66.974760] ? policy_nodemask+0x19/0x90
> [ 66.975266] ? policy_node+0x56/0x60
> [ 66.975719] pagecache_get_page+0xf7/0x360
> [ 66.976259] filemap_fault+0xe4a/0xfe0
> [ 66.976743] ? read_cache_page_gfp+0x20/0x20
> [ 66.977274] ? find_held_lock+0x85/0xa0
> [ 66.977792] ? filemap_page_mkwrite+0x140/0x140
> [ 66.978383] __do_fault+0x6a/0x1e0
> [ 66.978789] handle_mm_fault+0x16eb/0x1fb0
> [ 66.979319] ? copy_page_range+0xf80/0xf80
> [ 66.979755] ? vmacache_find+0xba/0x100
> [ 66.980166] do_user_addr_fault+0x2ce/0x5ed
> [ 66.980625] exc_page_fault+0x5e/0xc0
> [ 66.981005] ? asm_exc_page_fault+0x8/0x30
> [ 66.981445] asm_exc_page_fault+0x1e/0x30
> [ 66.981860] RIP: 0033:0x7f33d40901bd
> [ 66.982248] Code: Bad RIP value.
> [ 66.982579] RSP: 002b:00007ffe84e1dd18 EFLAGS: 00010202
> [ 66.983143] RAX: fffffffffffffff4 RBX: 0000000000000002 RCX: 00007f33d40901bd
> [ 66.984038] RDX: 0000000000000078 RSI: 00007ffe84e1dd50 RDI: 0000000000000000
> [ 66.984831] RBP: 00007ffe84e1dd30 R08: 00007ffe84e1dd50 R09: 00007ffe84e1dd50
> [ 66.985619] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000020000
> [ 66.986399] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000060
> [ 66.987656] Mem-Info:
> [ 66.987924] active_anon:1424 inactive_anon:126 isolated_anon:0
> [ 66.987924] active_file:53 inactive_file:33 isolated_file:0
> [ 66.987924] unevictable:0 dirty:0 writeback:0
> [ 66.987924] slab_reclaimable:11120 slab_unreclaimable:98031
> [ 66.987924] mapped:88 shmem:137 pagetables:120 bounce:0
> [ 66.987924] free:21175 free_pcp:725 free_cma:0
> [ 66.993308] Node 0 active_anon:5696kB inactive_anon:504kB active_file:212kB inactive_file:132kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:352kB dirty:0kB writeback:0kB shmem:548kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? yes
> [ 66.997665] Node 0 DMA free:13708kB min:308kB low:384kB high:460kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:96kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 67.002330] lowmem_reserve[]: 0 2925 3354 3354
> [ 67.003090] Node 0 DMA32 free:60008kB min:58668kB low:73332kB high:87996kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129212kB managed:3001752kB mlocked:0kB kernel_stack:25536kB pagetables:292kB bounce:0kB free_pcp:844kB local_pcp:248kB free_cma:0kB
> [ 67.008157] lowmem_reserve[]: 0 0 428 428
> [ 67.008843] Node 0 Normal free:9528kB min:8600kB low:10748kB high:12896kB reserved_highatomic:2048KB active_anon:5404kB inactive_anon:504kB active_file:336kB inactive_file:0kB unevictable:0kB writepending:0kB present:1048576kB managed:439260kB mlocked:0kB kernel_stack:5824kB pagetables:128kB bounce:0kB free_pcp:1596kB local_pcp:660kB free_cma:0kB
> [ 67.014007] lowmem_reserve[]: 0 0 0 0
> [ 67.014673] Node 0 DMA: 1*4kB (U) 1*8kB (U) 2*16kB (UE) 1*32kB (U) 1*64kB (E) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 3*2048kB (UME) 1*4096kB (M) = 13708kB
> [ 67.016793] Node 0 DMA32: 8*4kB (UM) 5*8kB (UM) 4*16kB (UME) 8*32kB (UME) 7*64kB (ME) 8*128kB (UME) 6*256kB (UM) 8*512kB (ME) 3*1024kB (M) 2*2048kB (ME) 11*4096kB (M) = 59720kB
> [ 67.018915] Node 0 Normal: 256*4kB (UME) 149*8kB (UM) 89*16kB (UM) 41*32kB (UM) 20*64kB (UMH) 8*128kB (UM) 2*256kB (MH) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 9304kB
> [ 67.020673] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [ 67.021811] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [ 67.022768] 184 total pagecache pages
> [ 67.023418] 0 pages in swap cache
> [ 67.023912] Swap cache stats: add 0, delete 0, find 0/0
> [ 67.024726] Free swap = 0kB
> [ 67.025150] Total swap = 0kB
> [ 67.025530] 1048445 pages RAM
> [ 67.025865] 0 pages HighMem/MovableOnly
> [ 67.026510] 184215 pages reserved
> [ 67.027002] 0 pages cma reserved
> [ 67.027542] 0 pages hwpoisoned
> [ 67.027928] Unreclaimable slab info:
> [ 67.028500] Name Used Total
> [ 67.029057] 9p-fcall-cache 297KB 445KB
> [ 67.029793] 9p-fcall-cache 49KB 49KB
> [ 67.030375] 9p-fcall-cache 123KB 272KB
> [ 67.030897] 9p-fcall-cache 346KB 495KB
> [ 67.031539] p9_req_t 16KB 16KB
> [ 67.032326] fib6_nodes 4KB 4KB
> [ 67.032950] RAWv6 31KB 31KB
> [ 67.033960] mqueue_inode_cache 31KB 31KB
> [ 67.034708] ext4_bio_post_read_ctx 15KB 15KB
> [ 67.035647] bio-2 7KB 7KB
> [ 67.036471] UNIX 372KB 372KB
> [ 67.037769] tcp_bind_bucket 4KB 4KB
> [ 67.038701] ip_fib_trie 4KB 4KB
> [ 67.039496] ip_fib_alias 3KB 3KB
> [ 67.040414] ip_dst_cache 4KB 4KB
> [ 67.041365] RAW 31KB 31KB
> [ 67.042341] UDP 121KB 121KB
> [ 67.043540] tw_sock_TCP 7KB 7KB
> [ 67.044491] request_sock_TCP 7KB 7KB
> [ 67.045447] TCP 58KB 58KB
> [ 67.046445] hugetlbfs_inode_cache 31KB 31KB
> [ 67.047443] bio-1 15KB 15KB
> [ 67.048375] eventpoll_pwq 23KB 23KB
> [ 67.049388] eventpoll_epi 35KB 35KB
> [ 67.050361] inotify_inode_mark 3KB 3KB
> [ 67.051399] request_queue 62KB 62KB
> [ 67.052563] blkdev_ioc 7KB 7KB
> [ 67.053623] bio-0 20KB 20KB
> [ 67.054603] biovec-max 327KB 327KB
> [ 67.055593] skbuff_fclone_cache 15KB 15KB
> [ 67.056627] skbuff_head_cache 281KB 312KB
> [ 67.057868] file_lock_cache 31KB 31KB
> [ 67.058878] file_lock_ctx 15KB 15KB
> [ 67.059874] fsnotify_mark_connector 4KB 4KB
> [ 67.060957] task_delay_info 451KB 451KB
> [ 67.061644] proc_dir_entry 125KB 125KB
> [ 67.062404] pde_opener 59KB 59KB
> [ 67.063696] seq_file 210KB 241KB
> [ 67.064759] sigqueue 19KB 19KB
> [ 67.065763] shmem_inode_cache 795KB 795KB
> [ 67.066778] kernfs_node_cache 2565KB 2565KB
> [ 67.067762] mnt_cache 31KB 31KB
> [ 67.068798] filp 11820KB 11820KB
> [ 67.069807] names_cache 327KB 476KB
> [ 67.070795] key_jar 15KB 15KB
> [ 67.071787] nsproxy 3KB 3KB
> [ 67.072802] vm_area_struct 3236KB 3400KB
> [ 67.073773] mm_struct 870KB 1154KB
> [ 67.074787] fs_cache 232KB 264KB
> [ 67.076008] files_cache 860KB 972KB
> [ 67.076969] signal_cache 2626KB 2686KB
> [ 67.077911] sighand_cache 2370KB 2405KB
> [ 67.078963] task_struct 8564KB 8564KB
> [ 67.080472] cred_jar 392KB 420KB
> [ 67.081767] anon_vma_chain 1191KB 1413KB
> [ 67.082736] anon_vma 200KB 220KB
> [ 67.083805] pid 593KB 640KB
> [ 67.084733] Acpi-Operand 265KB 308KB
> [ 67.085598] Acpi-ParseExt 47KB 47KB
> [ 67.086504] Acpi-Parse 189KB 205KB
> [ 67.091509] Acpi-State 200KB 216KB
> [ 67.092511] Acpi-Namespace 24KB 24KB
> [ 67.093388] numa_policy 3KB 3KB
> [ 67.094249] trace_event_file 151KB 151KB
> [ 67.095076] ftrace_event_field 196KB 196KB
> [ 67.099230] pool_workqueue 16KB 16KB
> [ 67.100094] vmap_area 567KB 567KB
> [ 67.105224] page->ptl 385KB 433KB
> [ 67.106079] kmemleak_scan_area 286KB 286KB
> [ 67.107190] kmemleak_object 163919KB 168558KB
> [ 67.108081] kmalloc-8k 21840KB 21840KB
> [ 67.108932] kmalloc-4k 24592KB 24592KB
> [ 67.114003] kmalloc-2k 15684KB 15684KB
> [ 67.114896] kmalloc-1k 44936KB 44936KB
> [ 67.121018] kmalloc-512 4192KB 4192KB
> [ 67.122212] kmalloc-256 1128KB 1128KB
> [ 67.123089] kmalloc-192 5008KB 5008KB
> [ 67.128978] kmalloc-128 356KB 356KB
> [ 67.129877] kmalloc-96 145KB 156KB
> [ 67.132262] kmalloc-64 1032KB 1032KB
> [ 67.133085] kmalloc-32 233KB 244KB
> [ 67.138993] kmalloc-16 108KB 108KB
> [ 67.139847] kmalloc-8 440KB 533KB
> [ 67.142221] kmem_cache_node 43KB 43KB
> [ 67.143038] kmem_cache 140KB 140KB
> [ 67.149977] Tasks state (memory values in pages):
> [ 67.150743] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
> [ 67.153263] [ 103] 0 103 7766 1042 81920 0 -1000 systemd-udevd
> [ 67.157980] [ 194] 0 194 1032 136 40960 0 0 bash
> [ 67.161986] [ 195] 0 195 758 40 45056 0 0 test_maps
> [ 67.167074] [ 3178] 0 3178 758 39 45056 0 0 test_maps
> [ 67.170572] [ 3186] 0 3186 758 39 45056 0 0 test_maps
> [ 67.172052] [ 3205] 0 3205 758 39 45056 0 0 test_maps
> [ 67.177580] [ 3213] 0 3213 758 39 45056 0 0 test_maps
> [ 67.179026] [ 3221] 0 3221 758 39 45056 0 0 test_maps
> [ 67.184143] [ 3222] 0 3222 758 39 45056 0 0 test_maps
> [ 67.187571] [ 3230] 0 3230 758 39 45056 0 0 test_maps
> [ 67.189053] [ 3241] 0 3241 758 39 45056 0 0 test_maps
> [ 67.194580] [ 3243] 0 3243 758 39 45056 0 0 test_maps
> [ 67.196058] [ 3250] 0 3250 758 39 45056 0 0 test_maps
> [ 67.201124] [ 3263] 0 3263 758 39 45056 0 0 test_maps
> [ 67.206581] [ 3298] 0 3298 758 39 45056 0 0 test_maps
> [ 67.210563] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=bash,pid=194,uid=0
> [ 67.214666] Out of memory: Killed process 194 (bash) total-vm:4128kB, anon-rss:544kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:40kB oom_score_adj:0
> [ 67.226601] oom_reaper: reaped process 3298 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.228355] oom_reaper: reaped process 3178 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.234148] oom_reaper: reaped process 3222 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.236873] oom_reaper: reaped process 3213 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.239028] oom_reaper: reaped process 3186 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.268146] oom_reaper: reaped process 3221 (test_maps), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 67.275959] kthreadd invoked oom-killer: gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=1, oom_score_adj=0
> [ 67.277227] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
> [ 67.278278] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ 67.279895] Call Trace:
> [ 67.280250] dump_stack+0x9e/0xe0
> [ 67.280674] dump_header+0x89/0x49a
> [ 67.281113] out_of_memory.cold+0xa/0xbb
> [ 67.281608] ? oom_killer_disable+0x210/0x210
> [ 67.282157] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
> [ 67.282894] ? warn_alloc+0x120/0x120
> [ 67.283377] ? __alloc_pages_nodemask+0x30f/0x5c0
> [ 67.283925] __alloc_pages_nodemask+0x4fd/0x5c0
> [ 67.284446] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
> [ 67.285146] alloc_slab_page+0x2e/0x7a0
> [ 67.285590] ? new_slab+0x22e/0x2b0
> [ 67.285999] new_slab+0x276/0x2b0
> [ 67.286402] ___slab_alloc+0x4ba/0x6d0
> [ 67.286892] ? copy_process+0x256d/0x2f80
> [ 67.287391] ? lock_downgrade+0x360/0x360
> [ 67.287908] ? copy_process+0x256d/0x2f80
> [ 67.288482] ? __slab_alloc.isra.0+0x4b/0x90
> [ 67.289207] __slab_alloc.isra.0+0x4b/0x90
> [ 67.289874] ? copy_process+0x256d/0x2f80
> [ 67.290565] kmem_cache_alloc_node+0xb7/0x330
> [ 67.291163] ? trace_hardirqs_on+0x1e/0x130
> [ 67.291699] copy_process+0x256d/0x2f80
> [ 67.292231] ? mark_lock+0x13f/0xc30
> [ 67.292746] ? find_held_lock+0x85/0xa0
> [ 67.293314] ? __cleanup_sighand+0x60/0x60
> [ 67.293889] _do_fork+0xcf/0x840
> [ 67.294354] ? copy_init_mm+0x20/0x20
> [ 67.294856] ? lockdep_hardirqs_on_prepare+0x14c/0x240
> [ 67.295530] ? _raw_spin_unlock_irq+0x24/0x50
> [ 67.296036] ? trace_hardirqs_on+0x1e/0x130
> [ 67.296529] ? preempt_count_sub+0x14/0xc0
> [ 67.297010] ? lock_acquire+0x133/0x4e0
> [ 67.297467] kernel_thread+0xa8/0xe0
> [ 67.297886] ? legacy_clone_args_valid+0x30/0x30
> [ 67.298429] ? kthread_create_on_node+0xd0/0xd0
> [ 67.298966] ? do_raw_spin_unlock+0xa3/0x130
> [ 67.299458] ? preempt_count_sub+0x14/0xc0
> [ 67.299907] kthreadd+0x2be/0x340
> [ 67.300290] ? kthread_create_on_cpu+0x120/0x120
> [ 67.300775] ? lockdep_hardirqs_on_prepare+0x14c/0x240
> [ 67.301324] ? _raw_spin_unlock_irq+0x24/0x50
> [ 67.301771] ? trace_hardirqs_on+0x1e/0x130
> [ 67.302206] ? kthread_create_on_cpu+0x120/0x120
> [ 67.302685] ret_from_fork+0x1f/0x30
> [ 67.303234] Mem-Info:
> [ 67.303485] active_anon:1210 inactive_anon:126 isolated_anon:0
> [ 67.303485] active_file:5 inactive_file:12 isolated_file:0
> [ 67.303485] unevictable:0 dirty:0 writeback:0
> [ 67.303485] slab_reclaimable:11187 slab_unreclaimable:98423
> [ 67.303485] mapped:12 shmem:137 pagetables:70 bounce:0
> [ 67.303485] free:20722 free_pcp:1 free_cma:0
> [ 67.307090] Node 0 active_anon:4840kB inactive_anon:504kB active_file:20kB inactive_file:48kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:48kB dirty:0kB writeback:0kB shmem:548kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? yes
> [ 67.309863] Node 0 DMA free:13708kB min:308kB low:384kB high:460kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:96kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 67.312641] lowmem_reserve[]: 0 2925 3354 3354
> [ 67.313081] Node 0 DMA32 free:60868kB min:58668kB low:73332kB high:87996kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129212kB managed:3001752kB mlocked:0kB kernel_stack:25824kB pagetables:240kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB
> [ 67.316022] lowmem_reserve[]: 0 0 428 428
> [ 67.316560] Node 0 Normal free:8312kB min:8600kB low:10748kB high:12896kB reserved_highatomic:0KB active_anon:4916kB inactive_anon:504kB active_file:308kB inactive_file:32kB unevictable:0kB writepending:0kB present:1048576kB managed:439260kB mlocked:0kB kernel_stack:8192kB pagetables:40kB bounce:0kB free_pcp:392kB local_pcp:0kB free_cma:0kB
> [ 67.319987] lowmem_reserve[]: 0 0 0 0
> [ 67.320421] Node 0 DMA: 1*4kB (U) 1*8kB (U) 2*16kB (UE) 1*32kB (U) 1*64kB (E) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 3*2048kB (UME) 1*4096kB (M) = 13708kB
> [ 67.321913] Node 0 DMA32: 6*4kB (M) 26*8kB (UM) 29*16kB (UM) 11*32kB (UME) 9*64kB (UME) 10*128kB (UME) 6*256kB (UM) 7*512kB (ME) 4*1024kB (M) 2*2048kB (ME) 11*4096kB (M) = 61272kB
> [ 67.323562] Node 0 Normal: 488*4kB (UME) 251*8kB (UM) 86*16kB (UM) 37*32kB (UM) 18*64kB (UM) 6*128kB (M) 4*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9464kB
> [ 67.325773] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [ 67.326999] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [ 67.328287] 155 total pagecache pages
> [ 67.328836] 0 pages in swap cache
> [ 67.329365] Swap cache stats: add 0, delete 0, find 0/0
> [ 67.330140] Free swap = 0kB
> [ 67.330596] Total swap = 0kB
> [ 67.331043] 1048445 pages RAM
> [ 67.331478] 0 pages HighMem/MovableOnly
> [ 67.332080] 184215 pages reserved
> [ 67.332617] 0 pages cma reserved
> [ 67.333114] 0 pages hwpoisoned
> [ 67.333816] Unreclaimable slab info:
> [ 67.334439] Name Used Total
> [ 67.335297] 9p-fcall-cache 297KB 445KB
> [ 67.336094] 9p-fcall-cache 49KB 49KB
> [ 67.336933] 9p-fcall-cache 123KB 272KB
> [ 67.337751] 9p-fcall-cache 445KB 495KB
> [ 67.338572] p9_req_t 16KB 16KB
> [ 67.339391] fib6_nodes 4KB 4KB
> [ 67.340229] RAWv6 31KB 31KB
> [ 67.341033] mqueue_inode_cache 31KB 31KB
> [ 67.341997] ext4_bio_post_read_ctx 15KB 15KB
> [ 67.342835] bio-2 7KB 7KB
> [ 67.343695] UNIX 372KB 372KB
> [ 67.344560] tcp_bind_bucket 4KB 4KB
> [ 67.345368] ip_fib_trie 4KB 4KB
> [ 67.346190] ip_fib_alias 3KB 3KB
> [ 67.346997] ip_dst_cache 4KB 4KB
> [ 67.347819] RAW 31KB 31KB
> [ 67.348668] UDP 121KB 121KB
> [ 67.349856] tw_sock_TCP 7KB 7KB
> [ 67.350723] request_sock_TCP 7KB 7KB
> [ 67.351567] TCP 58KB 58KB
> [ 67.352412] hugetlbfs_inode_cache 31KB 31KB
> [ 67.353319] bio-1 15KB 15KB
> [ 67.354116] eventpoll_pwq 23KB 23KB
> [ 67.354919] eventpoll_epi 35KB 35KB
> [ 67.355735] inotify_inode_mark 3KB 3KB
> [ 67.356618] request_queue 62KB 62KB
> [ 67.357680] blkdev_ioc 7KB 7KB
> [ 67.358569] bio-0 20KB 20KB
> [ 67.359520] biovec-max 327KB 327KB
> [ 67.360396] skbuff_fclone_cache 15KB 15KB
> [ 67.361277] skbuff_head_cache 281KB 312KB
> [ 67.362116] file_lock_cache 31KB 31KB
> [ 67.362967] file_lock_ctx 15KB 15KB
> [ 67.363597] fsnotify_mark_connector 4KB 4KB
> [ 67.364279] task_delay_info 455KB 455KB
> [ 67.364832] proc_dir_entry 125KB 125KB
> [ 67.365663] pde_opener 59KB 59KB
> [ 67.366265] seq_file 210KB 241KB
> [ 67.366805] sigqueue 19KB 19KB
> [ 67.367386] shmem_inode_cache 795KB 795KB
> [ 67.367909] kernfs_node_cache 2565KB 2565KB
> [ 67.368489] mnt_cache 31KB 31KB
> [ 67.369014] filp 11820KB 11820KB
> [ 67.369585] names_cache 327KB 476KB
> [ 67.370113] key_jar 15KB 15KB
> [ 67.370664] nsproxy 3KB 3KB
> [ 67.371230] vm_area_struct 2870KB 3100KB
> [ 67.371754] mm_struct 831KB 1154KB
> [ 67.372337] fs_cache 176KB 212KB
> [ 67.372862] files_cache 843KB 956KB
> [ 67.373625] signal_cache 2809KB 2809KB
> [ 67.374210] sighand_cache 2615KB 2615KB
> [ 67.374750] task_struct 9207KB 9207KB
> [ 67.375335] cred_jar 392KB 420KB
> [ 67.375856] anon_vma_chain 1058KB 1291KB
> [ 67.376437] anon_vma 144KB 160KB
> [ 67.376952] pid 609KB 640KB
> [ 67.377519] Acpi-Operand 265KB 308KB
> [ 67.378063] Acpi-ParseExt 47KB 47KB
> [ 67.378645] Acpi-Parse 189KB 205KB
> [ 67.379216] Acpi-State 200KB 216KB
> [ 67.379747] Acpi-Namespace 24KB 24KB
> [ 67.380345] numa_policy 3KB 3KB
> [ 67.380870] trace_event_file 151KB 151KB
> [ 67.381649] ftrace_event_field 196KB 196KB
> [ 67.382199] pool_workqueue 16KB 16KB
> [ 67.382722] vmap_area 567KB 567KB
> [ 67.383261] page->ptl 384KB 429KB
> [ 67.383781] kmemleak_scan_area 286KB 286KB
> [ 67.384527] kmemleak_object 162939KB 167943KB
> [ 67.385080] kmalloc-8k 21840KB 21840KB
> [ 67.385618] kmalloc-4k 25736KB 25736KB
> [ 67.386158] kmalloc-2k 15684KB 15684KB
> [ 67.386679] kmalloc-1k 44936KB 44936KB
> [ 67.387208] kmalloc-512 4192KB 4192KB
> [ 67.387826] kmalloc-256 1184KB 1184KB
> [ 67.388514] kmalloc-192 5032KB 5032KB
> [ 67.389397] kmalloc-128 356KB 356KB
> [ 67.390100] kmalloc-96 145KB 156KB
> [ 67.390775] kmalloc-64 1052KB 1052KB
> [ 67.391489] kmalloc-32 233KB 244KB
> [ 67.392261] kmalloc-16 108KB 108KB
> [ 67.392964] kmalloc-8 440KB 533KB
> [ 67.393657] kmem_cache_node 43KB 43KB
> [ 67.394424] kmem_cache 140KB 140KB
> [ 67.395296] Tasks state (memory values in pages):
> [ 67.395889] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
> [ 67.397031] [ 103] 0 103 7766 1042 81920 0 -1000 systemd-udevd
> [ 67.398411] Out of memory and no killable processes...
> [ 67.399203] Kernel panic - not syncing: System is deadlocked on memory
> [ 67.400146] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.8.0-rc4-01471-g15d51f3a516b #814
> [ 67.401254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ 67.403154] Call Trace:
> [ 67.403536] dump_stack+0x9e/0xe0
> [ 67.404021] panic+0x1ab/0x3ae
> [ 67.404497] ? __warn_printk+0xf3/0xf3
> [ 67.405075] ? __rcu_read_unlock+0x58/0x250
> [ 67.405705] ? out_of_memory.cold+0x2d/0xbb
> [ 67.406272] ? out_of_memory.cold+0x1f/0xbb
> [ 67.406910] out_of_memory.cold+0x45/0xbb
> [ 67.407540] ? oom_killer_disable+0x210/0x210
> [ 67.408225] __alloc_pages_slowpath.constprop.0+0x125f/0x1460
> [ 67.409178] ? warn_alloc+0x120/0x120
> [ 67.409849] ? __alloc_pages_nodemask+0x30f/0x5c0
> [ 67.410722] __alloc_pages_nodemask+0x4fd/0x5c0
> [ 67.411533] ? __alloc_pages_slowpath.constprop.0+0x1460/0x1460
> [ 67.412641] alloc_slab_page+0x2e/0x7a0
> [ 67.413345] ? new_slab+0x22e/0x2b0
> [ 67.413883] new_slab+0x276/0x2b0
> [ 67.414534] ___slab_alloc+0x4ba/0x6d0
> [ 67.415221] ? copy_process+0x256d/0x2f80
> [ 67.415930] ? lock_downgrade+0x360/0x360
> [ 67.416659] ? copy_process+0x256d/0x2f80
> [ 67.417390] ? __slab_alloc.isra.0+0x4b/0x90
> [ 67.418171] __slab_alloc.isra.0+0x4b/0x90
> [ 67.418903] ? copy_process+0x256d/0x2f80
> [ 67.419600] kmem_cache_alloc_node+0xb7/0x330
> [ 67.420339] ? trace_hardirqs_on+0x1e/0x130
> [ 67.421078] copy_process+0x256d/0x2f80
> [ 67.421722] ? mark_lock+0x13f/0xc30
> [ 67.422284] ? find_held_lock+0x85/0xa0
> [ 67.422838] ? __cleanup_sighand+0x60/0x60
> [ 67.423515] _do_fork+0xcf/0x840
> [ 67.423978] ? copy_init_mm+0x20/0x20
> [ 67.424496] ? lockdep_hardirqs_on_prepare+0x14c/0x240
> [ 67.425321] ? _raw_spin_unlock_irq+0x24/0x50
> [ 67.426018] ? trace_hardirqs_on+0x1e/0x130
> [ 67.426702] ? preempt_count_sub+0x14/0xc0
> [ 67.427328] ? lock_acquire+0x133/0x4e0
> [ 67.427938] kernel_thread+0xa8/0xe0
> [ 67.428523] ? legacy_clone_args_valid+0x30/0x30
> [ 67.429283] ? kthread_create_on_node+0xd0/0xd0
> [ 67.430026] ? do_raw_spin_unlock+0xa3/0x130
> [ 67.430708] ? preempt_count_sub+0x14/0xc0
> [ 67.431402] kthreadd+0x2be/0x340
> [ 67.431958] ? kthread_create_on_cpu+0x120/0x120
> [ 67.432726] ? lockdep_hardirqs_on_prepare+0x14c/0x240
> [ 67.433575] ? _raw_spin_unlock_irq+0x24/0x50
> [ 67.434288] ? trace_hardirqs_on+0x1e/0x130
> [ 67.434976] ? kthread_create_on_cpu+0x120/0x120
> [ 67.435746] ret_from_fork+0x1f/0x30
> [ 67.436548] Kernel Offset: disabled
>
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 11:01 ` Lorenzo Bianconi
@ 2020-07-17 15:13 ` Jakub Sitnicki
2020-07-17 16:31 ` Lorenzo Bianconi
2020-07-17 19:12 ` Lorenzo Bianconi
0 siblings, 2 replies; 21+ messages in thread
From: Jakub Sitnicki @ 2020-07-17 15:13 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
On Fri, 17 Jul 2020 13:01:36 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> [...]
>
> > This started showing up with when running ./test_progs from recent
> > bpf-next (bfdfa51702de). Any chance it is related?
> >
> > [ 2950.440613] =============================================
> >
> > [ 3073.281578] INFO: task cpumap/0/map:26:536 blocked for more than 860 seconds.
> > [ 3073.285492] Tainted: G W 5.8.0-rc4-01471-g15d51f3a516b #814
> > [ 3073.289177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 3073.293021] cpumap/0/map:26 D 0 536 2 0x00004000
> > [ 3073.295755] Call Trace:
> > [ 3073.297143] __schedule+0x5ad/0xf10
> > [ 3073.299032] ? pci_mmcfg_check_reserved+0xd0/0xd0
> > [ 3073.301416] ? static_obj+0x31/0x80
> > [ 3073.303277] ? mark_held_locks+0x24/0x90
> > [ 3073.305313] ? cpu_map_update_elem+0x6d0/0x6d0
> > [ 3073.307544] schedule+0x6f/0x160
> > [ 3073.309282] schedule_preempt_disabled+0x14/0x20
> > [ 3073.311593] kthread+0x175/0x240
> > [ 3073.313299] ? kthread_create_on_node+0xd0/0xd0
> > [ 3073.315106] ret_from_fork+0x1f/0x30
> > [ 3073.316365]
> > Showing all locks held in the system:
> > [ 3073.318423] 1 lock held by khungtaskd/33:
> > [ 3073.319642] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
> >
> > [ 3073.322249] =============================================
>
> Hi Jakub,
>
> can you please provide more info? Can you identify the test that triggers
> the issue? I ran test_progs on the bpf-next master branch and it works fine for me.
> I ran the tests in a VM with 4 vCPUs and 4 GB of memory.
>
> Regards,
> Lorenzo
>
I was able to trigger it by running the newly added selftest:
virtme-init: console is ttyS0
bash-5.0# ./test_progs -n 100
#100/1 cpumap_with_progs:OK
#100 xdp_cpumap_attach:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
bash-5.0# [ 247.177168] INFO: task cpumap/0/map:3:198 blocked for more than 122 seconds.
[ 247.181306] Not tainted 5.8.0-rc4-01456-gbfdfa51702de #815
[ 247.184487] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.188876] cpumap/0/map:3 D 0 198 2 0x00004000
[ 247.192624] Call Trace:
[ 247.194327] __schedule+0x5ad/0xf10
[ 247.196860] ? pci_mmcfg_check_reserved+0xd0/0xd0
[ 247.199853] ? static_obj+0x31/0x80
[ 247.201917] ? mark_held_locks+0x24/0x90
[ 247.204398] ? cpu_map_update_elem+0x6d0/0x6d0
[ 247.207098] schedule+0x6f/0x160
[ 247.209079] schedule_preempt_disabled+0x14/0x20
[ 247.211863] kthread+0x175/0x240
[ 247.213698] ? kthread_create_on_node+0xd0/0xd0
[ 247.216054] ret_from_fork+0x1f/0x30
[ 247.218363]
[ 247.218363] Showing all locks held in the system:
[ 247.222150] 1 lock held by khungtaskd/33:
[ 247.224894] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
[ 247.231113]
[ 247.232335] =============================================
[ 247.232335]
QEMU is running with 4 vCPUs and 4 GB of memory. The .config is uploaded at
https://paste.centos.org/view/0c14663d
HTH,
-jkbs
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 15:13 ` Jakub Sitnicki
@ 2020-07-17 16:31 ` Lorenzo Bianconi
2020-07-17 19:12 ` Lorenzo Bianconi
1 sibling, 0 replies; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-17 16:31 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
> On Fri, 17 Jul 2020 13:01:36 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > [...]
> >
[...]
> Was able to trigger it running the newly added selftest:
>
> virtme-init: console is ttyS0
> bash-5.0# ./test_progs -n 100
> #100/1 cpumap_with_progs:OK
> #100 xdp_cpumap_attach:OK
> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> bash-5.0# [ 247.177168] INFO: task cpumap/0/map:3:198 blocked for more than 122 seconds.
> [ 247.181306] Not tainted 5.8.0-rc4-01456-gbfdfa51702de #815
> [ 247.184487] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 247.188876] cpumap/0/map:3 D 0 198 2 0x00004000
> [ 247.192624] Call Trace:
> [ 247.194327] __schedule+0x5ad/0xf10
> [ 247.196860] ? pci_mmcfg_check_reserved+0xd0/0xd0
> [ 247.199853] ? static_obj+0x31/0x80
> [ 247.201917] ? mark_held_locks+0x24/0x90
> [ 247.204398] ? cpu_map_update_elem+0x6d0/0x6d0
> [ 247.207098] schedule+0x6f/0x160
> [ 247.209079] schedule_preempt_disabled+0x14/0x20
> [ 247.211863] kthread+0x175/0x240
> [ 247.213698] ? kthread_create_on_node+0xd0/0xd0
> [ 247.216054] ret_from_fork+0x1f/0x30
> [ 247.218363]
> [ 247.218363] Showing all locks held in the system:
> [ 247.222150] 1 lock held by khungtaskd/33:
> [ 247.224894] #0: ffffffff82d246a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x28/0x1c3
> [ 247.231113]
> [ 247.232335] =============================================
> [ 247.232335]
>
> qemu running with 4 vCPUs, 4 GB of memory. .config uploaded at
> https://paste.centos.org/view/0c14663d
ack, thx Jakub. I will look at it.
Regards,
Lorenzo
>
> HTH,
> -jkbs
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 15:13 ` Jakub Sitnicki
2020-07-17 16:31 ` Lorenzo Bianconi
@ 2020-07-17 19:12 ` Lorenzo Bianconi
2020-07-17 23:13 ` Alexei Starovoitov
1 sibling, 1 reply; 21+ messages in thread
From: Lorenzo Bianconi @ 2020-07-17 19:12 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: netdev, davem, ast, brouer, daniel, toke, lorenzo.bianconi,
dsahern, andrii.nakryiko, bpf
> On Fri, 17 Jul 2020 13:01:36 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
[...]
>
> HTH,
> -jkbs
Hi Jakub,
can you please test the patch below when you have some free cycles? It fixes
the issue in my setup.
Regards,
Lorenzo
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 4c95d0615ca2..f1c46529929b 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -453,24 +453,27 @@ __cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
rcpu->map_id = map_id;
rcpu->value.qsize = value->qsize;
+ if (fd > 0 && __cpu_map_load_bpf_program(rcpu, fd))
+ goto free_ptr_ring;
+
/* Setup kthread */
rcpu->kthread = kthread_create_on_node(cpu_map_kthread_run, rcpu, numa,
"cpumap/%d/map:%d", cpu, map_id);
if (IS_ERR(rcpu->kthread))
- goto free_ptr_ring;
+ goto free_prog;
get_cpu_map_entry(rcpu); /* 1-refcnt for being in cmap->cpu_map[] */
get_cpu_map_entry(rcpu); /* 1-refcnt for kthread */
- if (fd > 0 && __cpu_map_load_bpf_program(rcpu, fd))
- goto free_ptr_ring;
-
/* Make sure kthread runs on a single CPU */
kthread_bind(rcpu->kthread, cpu);
wake_up_process(rcpu->kthread);
return rcpu;
+free_prog:
+ if (rcpu->prog)
+ bpf_prog_put(rcpu->prog);
free_ptr_ring:
ptr_ring_cleanup(rcpu->queue, NULL);
free_queue:
* Re: [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP
2020-07-17 19:12 ` Lorenzo Bianconi
@ 2020-07-17 23:13 ` Alexei Starovoitov
0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2020-07-17 23:13 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: Jakub Sitnicki, Network Development, David S. Miller,
Alexei Starovoitov, Jesper Dangaard Brouer, Daniel Borkmann,
Toke Høiland-Jørgensen, lorenzo.bianconi, David Ahern,
Andrii Nakryiko, bpf
On Fri, Jul 17, 2020 at 12:13 PM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > On Fri, 17 Jul 2020 13:01:36 +0200
> > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
>
> [...]
>
> >
> > HTH,
> > -jkbs
>
> Hi Jakub,
>
> can you please test the patch below when you have some free cycles? It fixes
> the issue in my setup.
>
> Regards,
> Lorenzo
>
> [...]
Please send it as a proper patch.
end of thread, other threads:[~2020-07-17 23:14 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-14 13:56 [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 1/9] cpumap: use non-locked version __ptr_ring_consume_batched Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 2/9] net: refactor xdp_convert_buff_to_frame Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 3/9] samples/bpf: xdp_redirect_cpu_user: do not update bpf maps in option loop Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 4/9] cpumap: formalize map value as a named struct Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 5/9] bpf: cpumap: add the possibility to attach an eBPF program to cpumap Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 6/9] bpf: cpumap: implement XDP_REDIRECT for eBPF programs attached to map entries Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 7/9] libbpf: add SEC name for xdp programs attached to CPUMAP Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 8/9] samples/bpf: xdp_redirect_cpu: load a eBPF program on cpumap Lorenzo Bianconi
2020-07-14 13:56 ` [PATCH v7 bpf-next 9/9] selftest: add tests for XDP programs in CPUMAP entries Lorenzo Bianconi
2020-07-14 15:19 ` [PATCH v7 bpf-next 0/9] introduce support for XDP programs in CPUMAP Alexei Starovoitov
2020-07-14 15:35 ` Lorenzo Bianconi
2020-07-16 16:27 ` Daniel Borkmann
2020-07-17 10:00 ` Jakub Sitnicki
2020-07-17 10:08 ` Jakub Sitnicki
2020-07-17 11:06 ` Lorenzo Bianconi
2020-07-17 11:01 ` Lorenzo Bianconi
2020-07-17 15:13 ` Jakub Sitnicki
2020-07-17 16:31 ` Lorenzo Bianconi
2020-07-17 19:12 ` Lorenzo Bianconi
2020-07-17 23:13 ` Alexei Starovoitov