BPF Archive on lore.kernel.org
* [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting
@ 2020-07-27 18:44 Roman Gushchin
  2020-07-27 18:44 ` [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
                   ` (34 more replies)
  0 siblings, 35 replies; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Currently bpf uses the memlock rlimit for memory accounting.
This approach has its downsides and over time has created a significant
number of problems:

1) The limit is per-user, but because most bpf operations are performed
   as root, the limit has little value.

2) It's hard to come up with a specific maximum value, especially because
   the counter is shared with non-bpf users (e.g. mlock() users).
   Any specific value is either too low and creates false failures
   or too high and therefore useless.

3) Charging is not connected to the actual memory allocation. Bpf code
   has to manually calculate the estimated cost and precharge the counter,
   and then take care of uncharging, including on all failure paths.
   This adds to the code complexity and makes it easy to leak a charge.

4) There is no simple way of getting the current value of the counter.
   We've used drgn for it, but it's far from being convenient.

5) A cryptic -EPERM is returned on exceeding the limit. Libbpf even had
   a function to "explain" this case for users.
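
To make problem 3 concrete, here is a minimal userspace sketch of the
"precharge, allocate, uncharge on failure" pattern. It is a toy model
only: struct toy_counter and the toy_*() helpers are hypothetical names
invented for this illustration and are not the kernel's API. The point
it tries to show is that every failure path has to undo the charge by
hand.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a per-user locked-memory counter. */
struct toy_counter {
	unsigned long charged;	/* pages currently charged */
	unsigned long limit;	/* maximum number of pages allowed */
};

/* Precharge an estimated cost; fails with -EPERM when over the limit. */
static int toy_charge(struct toy_counter *c, unsigned long pages)
{
	if (pages > c->limit - c->charged)
		return -EPERM;
	c->charged += pages;
	return 0;
}

static void toy_uncharge(struct toy_counter *c, unsigned long pages)
{
	c->charged -= pages;
}

/*
 * Allocate under the manual scheme: charge first, then allocate, and
 * undo the charge by hand if the allocation fails.
 */
static void *toy_alloc(struct toy_counter *c, size_t size,
		       unsigned long pages, int *err)
{
	void *p;

	*err = toy_charge(c, pages);
	if (*err)
		return NULL;

	p = malloc(size);
	if (!p) {
		toy_uncharge(c, pages);	/* forget this and the charge leaks */
		*err = -ENOMEM;
		return NULL;
	}
	return p;
}

/* Freeing also needs a matching manual uncharge. */
static void toy_free(struct toy_counter *c, void *p, unsigned long pages)
{
	free(p);
	toy_uncharge(c, pages);
}
```

Note that the charged amount is an estimate supplied by the caller and
is not derived from what malloc() actually returns, which mirrors the
disconnect described in problem 3.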

To overcome these problems, let's switch to memcg-based memory
accounting of bpf objects. With the recent addition of percpu memory
accounting, it is now possible to provide comprehensive accounting
of the memory used by bpf programs and maps.

This approach has the following advantages:
1) The limit is per-cgroup and hierarchical. It's far more flexible and allows
   better control over memory usage by different workloads.

2) The actual memory consumption is taken into account. Charging happens
   automatically at allocation time if the __GFP_ACCOUNT flag is passed.
   Uncharging is also performed automatically when the memory is released.
   So the code on the bpf side becomes simpler and safer.

3) There is a simple way to get the current value and statistics.
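
As an illustration of advantage 3, the sketch below reads a single
counter from a cgroup v2 control file such as memory.current. It is a
minimal userspace example, not part of this series. The helper name
read_cgroup_counter() is hypothetical, and the file location depends on
where cgroup v2 is mounted and on the cgroup in question, so the path
must be supplied by the caller.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value from a cgroup v2 control file such as
 * memory.current. Returns 0 on success or a negative errno value.
 */
static int read_cgroup_counter(const char *path, unsigned long long *value)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;
	/* The file is expected to hold a single number followed by '\n'. */
	if (fscanf(f, "%llu", value) != 1)
		ret = -EINVAL;
	fclose(f);
	return ret;
}
```

Assuming cgroup v2 is mounted at /sys/fs/cgroup, a caller could pass
"/sys/fs/cgroup/<group>/memory.current", with <group> replaced by the
cgroup of interest, to get its total charged memory in bytes.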

The patchset consists of the following parts:
1) memcg-based accounting for various bpf objects: progs and maps
2) removal of the rlimit-based accounting
3) removal of rlimit adjustments in userspace tools and tests
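
For context on part 3, the rlimit adjustment that userspace tools and
tests carried looks roughly like the sketch below. This is an
illustration under assumptions rather than a copy of any removed file:
the helper name bump_memlock_rlimit() is hypothetical, and only
setrlimit() and RLIMIT_MEMLOCK are standard.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Set both the soft and the hard RLIMIT_MEMLOCK limit to the given value.
 * Returns 0 on success or a negative errno value, typically -EPERM when
 * an unprivileged process tries to raise the hard limit.
 */
static int bump_memlock_rlimit(rlim_t limit)
{
	struct rlimit r = { .rlim_cur = limit, .rlim_max = limit };

	if (setrlimit(RLIMIT_MEMLOCK, &r))
		return -errno;
	return 0;
}
```

A tool would typically call something like
bump_memlock_rlimit(RLIM_INFINITY) before creating maps or loading
programs. With memcg-based accounting this step is no longer needed.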

v2:
  - fixed build issue, caused by the remaining rlimit-based accounting
    for sockhash maps


Roman Gushchin (35):
  bpf: memcg-based memory accounting for bpf progs
  bpf: memcg-based memory accounting for bpf maps
  bpf: refine memcg-based memory accounting for arraymap maps
  bpf: refine memcg-based memory accounting for cpumap maps
  bpf: memcg-based memory accounting for cgroup storage maps
  bpf: refine memcg-based memory accounting for devmap maps
  bpf: refine memcg-based memory accounting for hashtab maps
  bpf: memcg-based memory accounting for lpm_trie maps
  bpf: memcg-based memory accounting for bpf ringbuffer
  bpf: memcg-based memory accounting for socket storage maps
  bpf: refine memcg-based memory accounting for sockmap and sockhash
    maps
  bpf: refine memcg-based memory accounting for xskmap maps
  bpf: eliminate rlimit-based memory accounting for arraymap maps
  bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps
  bpf: eliminate rlimit-based memory accounting for cpumap maps
  bpf: eliminate rlimit-based memory accounting for cgroup storage maps
  bpf: eliminate rlimit-based memory accounting for devmap maps
  bpf: eliminate rlimit-based memory accounting for hashtab maps
  bpf: eliminate rlimit-based memory accounting for lpm_trie maps
  bpf: eliminate rlimit-based memory accounting for queue_stack_maps
    maps
  bpf: eliminate rlimit-based memory accounting for reuseport_array maps
  bpf: eliminate rlimit-based memory accounting for bpf ringbuffer
  bpf: eliminate rlimit-based memory accounting for sockmap and sockhash
    maps
  bpf: eliminate rlimit-based memory accounting for stackmap maps
  bpf: eliminate rlimit-based memory accounting for socket storage maps
  bpf: eliminate rlimit-based memory accounting for xskmap maps
  bpf: eliminate rlimit-based memory accounting infra for bpf maps
  bpf: eliminate rlimit-based memory accounting for bpf progs
  bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  bpf: bpftool: do not touch RLIMIT_MEMLOCK
  bpf: runqslower: don't touch RLIMIT_MEMLOCK
  bpf: selftests: delete bpf_rlimit.h
  bpf: selftests: don't touch RLIMIT_MEMLOCK
  bpf: samples: do not touch RLIMIT_MEMLOCK
  perf: don't touch RLIMIT_MEMLOCK

 include/linux/bpf.h                           |  23 ---
 kernel/bpf/arraymap.c                         |  30 +---
 kernel/bpf/bpf_struct_ops.c                   |  19 +--
 kernel/bpf/core.c                             |  20 +--
 kernel/bpf/cpumap.c                           |  20 +--
 kernel/bpf/devmap.c                           |  23 +--
 kernel/bpf/hashtab.c                          |  33 +---
 kernel/bpf/local_storage.c                    |  38 ++---
 kernel/bpf/lpm_trie.c                         |  17 +-
 kernel/bpf/queue_stack_maps.c                 |  16 +-
 kernel/bpf/reuseport_array.c                  |  12 +-
 kernel/bpf/ringbuf.c                          |  33 ++--
 kernel/bpf/stackmap.c                         |  16 +-
 kernel/bpf/syscall.c                          | 152 ++----------------
 net/core/bpf_sk_storage.c                     |  23 +--
 net/core/sock_map.c                           |  40 ++---
 net/xdp/xskmap.c                              |  13 +-
 samples/bpf/hbm.c                             |   1 -
 samples/bpf/map_perf_test_user.c              |  11 --
 samples/bpf/offwaketime_user.c                |   2 -
 samples/bpf/sockex2_user.c                    |   2 -
 samples/bpf/sockex3_user.c                    |   2 -
 samples/bpf/spintest_user.c                   |   2 -
 samples/bpf/syscall_tp_user.c                 |   2 -
 samples/bpf/task_fd_query_user.c              |   5 -
 samples/bpf/test_lru_dist.c                   |   3 -
 samples/bpf/test_map_in_map_user.c            |   9 --
 samples/bpf/test_overhead_user.c              |   2 -
 samples/bpf/trace_event_user.c                |   2 -
 samples/bpf/tracex2_user.c                    |   6 -
 samples/bpf/tracex3_user.c                    |   6 -
 samples/bpf/tracex4_user.c                    |   6 -
 samples/bpf/tracex5_user.c                    |   3 -
 samples/bpf/tracex6_user.c                    |   3 -
 samples/bpf/xdp1_user.c                       |   6 -
 samples/bpf/xdp_adjust_tail_user.c            |   6 -
 samples/bpf/xdp_monitor_user.c                |   6 -
 samples/bpf/xdp_redirect_cpu_user.c           |   6 -
 samples/bpf/xdp_redirect_map_user.c           |   6 -
 samples/bpf/xdp_redirect_user.c               |   6 -
 samples/bpf/xdp_router_ipv4_user.c            |   6 -
 samples/bpf/xdp_rxq_info_user.c               |   6 -
 samples/bpf/xdp_sample_pkts_user.c            |   6 -
 samples/bpf/xdp_tx_iptunnel_user.c            |   6 -
 samples/bpf/xdpsock_user.c                    |   7 -
 tools/bpf/bpftool/common.c                    |   7 -
 tools/bpf/bpftool/feature.c                   |   2 -
 tools/bpf/bpftool/main.h                      |   2 -
 tools/bpf/bpftool/map.c                       |   2 -
 tools/bpf/bpftool/pids.c                      |   1 -
 tools/bpf/bpftool/prog.c                      |   3 -
 tools/bpf/bpftool/struct_ops.c                |   2 -
 tools/bpf/runqslower/runqslower.c             |  16 --
 tools/lib/bpf/libbpf.c                        |  31 +---
 tools/lib/bpf/libbpf.h                        |   5 -
 tools/perf/builtin-trace.c                    |  10 --
 tools/perf/tests/builtin-test.c               |   6 -
 tools/perf/util/Build                         |   1 -
 tools/perf/util/rlimit.c                      |  29 ----
 tools/perf/util/rlimit.h                      |   6 -
 tools/testing/selftests/bpf/bench.c           |  16 --
 tools/testing/selftests/bpf/bpf_rlimit.h      |  28 ----
 .../selftests/bpf/flow_dissector_load.c       |   1 -
 .../selftests/bpf/get_cgroup_id_user.c        |   1 -
 .../bpf/prog_tests/select_reuseport.c         |   1 -
 .../selftests/bpf/prog_tests/sk_lookup.c      |   1 -
 .../selftests/bpf/progs/bpf_iter_bpf_map.c    |   5 +-
 .../selftests/bpf/progs/map_ptr_kern.c        |   5 -
 tools/testing/selftests/bpf/test_btf.c        |   1 -
 .../selftests/bpf/test_cgroup_storage.c       |   1 -
 tools/testing/selftests/bpf/test_dev_cgroup.c |   1 -
 tools/testing/selftests/bpf/test_lpm_map.c    |   1 -
 tools/testing/selftests/bpf/test_lru_map.c    |   1 -
 tools/testing/selftests/bpf/test_maps.c       |   1 -
 tools/testing/selftests/bpf/test_netcnt.c     |   1 -
 tools/testing/selftests/bpf/test_progs.c      |   1 -
 .../selftests/bpf/test_skb_cgroup_id_user.c   |   1 -
 tools/testing/selftests/bpf/test_sock.c       |   1 -
 tools/testing/selftests/bpf/test_sock_addr.c  |   1 -
 .../testing/selftests/bpf/test_sock_fields.c  |   1 -
 .../selftests/bpf/test_socket_cookie.c        |   1 -
 tools/testing/selftests/bpf/test_sockmap.c    |   1 -
 tools/testing/selftests/bpf/test_sysctl.c     |   1 -
 tools/testing/selftests/bpf/test_tag.c        |   1 -
 .../bpf/test_tcp_check_syncookie_user.c       |   1 -
 .../testing/selftests/bpf/test_tcpbpf_user.c  |   1 -
 .../selftests/bpf/test_tcpnotify_user.c       |   1 -
 tools/testing/selftests/bpf/test_verifier.c   |   1 -
 .../testing/selftests/bpf/test_verifier_log.c |   2 -
 tools/testing/selftests/bpf/xdping.c          |   6 -
 tools/testing/selftests/net/reuseport_bpf.c   |  20 ---
 91 files changed, 97 insertions(+), 794 deletions(-)
 delete mode 100644 tools/perf/util/rlimit.c
 delete mode 100644 tools/perf/util/rlimit.h
 delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h

-- 
2.26.2



* [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 22:11   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps Roman Gushchin
                   ` (33 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include memory used by bpf programs in the memcg-based accounting.
This covers the memory used by the programs themselves, auxiliary data
and statistics.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/core.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index bde93344164d..daab8dcafbd4 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -77,7 +77,7 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
 
 struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags)
 {
-	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
+	gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
 	struct bpf_prog_aux *aux;
 	struct bpf_prog *fp;
 
@@ -86,7 +86,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
 	if (fp == NULL)
 		return NULL;
 
-	aux = kzalloc(sizeof(*aux), GFP_KERNEL | gfp_extra_flags);
+	aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT | gfp_extra_flags);
 	if (aux == NULL) {
 		vfree(fp);
 		return NULL;
@@ -104,7 +104,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
 
 struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
 {
-	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
+	gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
 	struct bpf_prog *prog;
 	int cpu;
 
@@ -217,7 +217,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog)
 struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
 				  gfp_t gfp_extra_flags)
 {
-	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
+	gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
 	struct bpf_prog *fp;
 	u32 pages, delta;
 	int ret;
-- 
2.26.2



* [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
  2020-07-27 18:44 ` [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 22:12   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps Roman Gushchin
                   ` (32 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

This patch enables memcg-based memory accounting for memory allocated
by __bpf_map_area_alloc(), which is used by most map types for
large allocations.

The following patches in the series refine the accounting for
some map types.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ee290b1f2d9e..501b2c071d7b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -275,7 +275,7 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 	 * __GFP_RETRY_MAYFAIL to avoid such situations.
 	 */
 
-	const gfp_t gfp = __GFP_NOWARN | __GFP_ZERO;
+	const gfp_t gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_ACCOUNT;
 	unsigned int flags = 0;
 	unsigned long align = 1;
 	void *area;
-- 
2.26.2



* [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
  2020-07-27 18:44 ` [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
  2020-07-27 18:44 ` [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 22:30   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps Roman Gushchin
                   ` (31 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include percpu arrays and auxiliary data in the memcg-based memory
accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/arraymap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 8ff419b632a6..9597fecff8da 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -28,12 +28,12 @@ static void bpf_array_free_percpu(struct bpf_array *array)
 
 static int bpf_array_alloc_percpu(struct bpf_array *array)
 {
+	const gfp_t gfp = GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT;
 	void __percpu *ptr;
 	int i;
 
 	for (i = 0; i < array->map.max_entries; i++) {
-		ptr = __alloc_percpu_gfp(array->elem_size, 8,
-					 GFP_USER | __GFP_NOWARN);
+		ptr = __alloc_percpu_gfp(array->elem_size, 8, gfp);
 		if (!ptr) {
 			bpf_array_free_percpu(array);
 			return -ENOMEM;
@@ -969,7 +969,7 @@ static struct bpf_map *prog_array_map_alloc(union bpf_attr *attr)
 	struct bpf_array_aux *aux;
 	struct bpf_map *map;
 
-	aux = kzalloc(sizeof(*aux), GFP_KERNEL);
+	aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT);
 	if (!aux)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (2 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 22:48   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps Roman Gushchin
                   ` (30 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include metadata and percpu data in the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/cpumap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index f1c46529929b..74ae9fcbe82e 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -99,7 +99,7 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	    attr->map_flags & ~BPF_F_NUMA_NODE)
 		return ERR_PTR(-EINVAL);
 
-	cmap = kzalloc(sizeof(*cmap), GFP_USER);
+	cmap = kzalloc(sizeof(*cmap), GFP_USER | __GFP_ACCOUNT);
 	if (!cmap)
 		return ERR_PTR(-ENOMEM);
 
@@ -418,7 +418,7 @@ static struct bpf_cpu_map_entry *
 __cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
 {
 	int numa, err, i, fd = value->bpf_prog.fd;
-	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_NOWARN;
 	struct bpf_cpu_map_entry *rcpu;
 	struct xdp_bulk_queue *bq;
 
-- 
2.26.2



* [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (3 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:05   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps Roman Gushchin
                   ` (29 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Account memory used by cgroup storage maps, including map metadata and
the percpu memory used by the percpu flavor of cgroup storage.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/local_storage.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index 3b2c70197d78..117acb2e80fb 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -166,7 +166,8 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *key,
 
 	new = kmalloc_node(sizeof(struct bpf_storage_buffer) +
 			   map->value_size,
-			   __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN,
+			   __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN |
+			   __GFP_ACCOUNT,
 			   map->numa_node);
 	if (!new)
 		return -ENOMEM;
@@ -313,7 +314,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
 		return ERR_PTR(ret);
 
 	map = kmalloc_node(sizeof(struct bpf_cgroup_storage_map),
-			   __GFP_ZERO | GFP_USER, numa_node);
+			   __GFP_ZERO | GFP_USER | __GFP_ACCOUNT, numa_node);
 	if (!map) {
 		bpf_map_charge_finish(&mem);
 		return ERR_PTR(-ENOMEM);
@@ -496,9 +497,9 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
 struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
 					enum bpf_cgroup_storage_type stype)
 {
+	const gfp_t gfp = __GFP_ZERO | GFP_USER | __GFP_ACCOUNT;
 	struct bpf_cgroup_storage *storage;
 	struct bpf_map *map;
-	gfp_t flags;
 	size_t size;
 	u32 pages;
 
@@ -511,20 +512,18 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
 	if (bpf_map_charge_memlock(map, pages))
 		return ERR_PTR(-EPERM);
 
-	storage = kmalloc_node(sizeof(struct bpf_cgroup_storage),
-			       __GFP_ZERO | GFP_USER, map->numa_node);
+	storage = kmalloc_node(sizeof(struct bpf_cgroup_storage), gfp,
+			       map->numa_node);
 	if (!storage)
 		goto enomem;
 
-	flags = __GFP_ZERO | GFP_USER;
-
 	if (stype == BPF_CGROUP_STORAGE_SHARED) {
-		storage->buf = kmalloc_node(size, flags, map->numa_node);
+		storage->buf = kmalloc_node(size, gfp, map->numa_node);
 		if (!storage->buf)
 			goto enomem;
 		check_and_init_map_lock(map, storage->buf->data);
 	} else {
-		storage->percpu_buf = __alloc_percpu_gfp(size, 8, flags);
+		storage->percpu_buf = __alloc_percpu_gfp(size, 8, gfp);
 		if (!storage->percpu_buf)
 			goto enomem;
 	}
-- 
2.26.2



* [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (4 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:35   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps Roman Gushchin
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include map metadata and the node (struct bpf_dtab_netdev) allocated on
element update in the accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/devmap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 10abb06065bb..05bf93088063 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -175,7 +175,7 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
 
-	dtab = kzalloc(sizeof(*dtab), GFP_USER);
+	dtab = kzalloc(sizeof(*dtab), GFP_USER | __GFP_ACCOUNT);
 	if (!dtab)
 		return ERR_PTR(-ENOMEM);
 
@@ -603,7 +603,8 @@ static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net,
 	struct bpf_prog *prog = NULL;
 	struct bpf_dtab_netdev *dev;
 
-	dev = kmalloc_node(sizeof(*dev), GFP_ATOMIC | __GFP_NOWARN,
+	dev = kmalloc_node(sizeof(*dev),
+			   GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT,
 			   dtab->map.numa_node);
 	if (!dev)
 		return ERR_PTR(-ENOMEM);
-- 
2.26.2



* [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (5 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:36   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps Roman Gushchin
                   ` (27 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include percpu objects and the map metadata in the accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/hashtab.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 024276787055..9d0432170812 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -263,10 +263,11 @@ static int prealloc_init(struct bpf_htab *htab)
 		goto skip_percpu_elems;
 
 	for (i = 0; i < num_entries; i++) {
+		const gfp_t gfp = GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT;
 		u32 size = round_up(htab->map.value_size, 8);
 		void __percpu *pptr;
 
-		pptr = __alloc_percpu_gfp(size, 8, GFP_USER | __GFP_NOWARN);
+		pptr = __alloc_percpu_gfp(size, 8, gfp);
 		if (!pptr)
 			goto free_elems;
 		htab_elem_set_ptr(get_htab_elem(htab, i), htab->map.key_size,
@@ -321,7 +322,7 @@ static int alloc_extra_elems(struct bpf_htab *htab)
 	int cpu;
 
 	pptr = __alloc_percpu_gfp(sizeof(struct htab_elem *), 8,
-				  GFP_USER | __GFP_NOWARN);
+				  GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!pptr)
 		return -ENOMEM;
 
@@ -424,7 +425,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	u64 cost;
 	int err;
 
-	htab = kzalloc(sizeof(*htab), GFP_USER);
+	htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT);
 	if (!htab)
 		return ERR_PTR(-ENOMEM);
 
@@ -827,6 +828,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 					 bool percpu, bool onallcpus,
 					 struct htab_elem *old_elem)
 {
+	const gfp_t gfp = GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT;
 	u32 size = htab->map.value_size;
 	bool prealloc = htab_is_prealloc(htab);
 	struct htab_elem *l_new, **pl_new;
@@ -859,8 +861,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 				l_new = ERR_PTR(-E2BIG);
 				goto dec_count;
 			}
-		l_new = kmalloc_node(htab->elem_size, GFP_ATOMIC | __GFP_NOWARN,
-				     htab->map.numa_node);
+		l_new = kmalloc_node(htab->elem_size, gfp, htab->map.numa_node);
 		if (!l_new) {
 			l_new = ERR_PTR(-ENOMEM);
 			goto dec_count;
@@ -876,8 +877,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 			pptr = htab_elem_get_ptr(l_new, key_size);
 		} else {
 			/* alloc_percpu zero-fills */
-			pptr = __alloc_percpu_gfp(size, 8,
-						  GFP_ATOMIC | __GFP_NOWARN);
+			pptr = __alloc_percpu_gfp(size, 8, gfp);
 			if (!pptr) {
 				kfree(l_new);
 				l_new = ERR_PTR(-ENOMEM);
-- 
2.26.2



* [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (6 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:55   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer Roman Gushchin
                   ` (26 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include lpm trie and lpm trie node objects in the memcg-based memory
accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/lpm_trie.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 44474bf3ab7a..d85e0fc2cafc 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -282,7 +282,7 @@ static struct lpm_trie_node *lpm_trie_node_alloc(const struct lpm_trie *trie,
 	if (value)
 		size += trie->map.value_size;
 
-	node = kmalloc_node(size, GFP_ATOMIC | __GFP_NOWARN,
+	node = kmalloc_node(size, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT,
 			    trie->map.numa_node);
 	if (!node)
 		return NULL;
@@ -557,7 +557,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 	    attr->value_size > LPM_VAL_SIZE_MAX)
 		return ERR_PTR(-EINVAL);
 
-	trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN);
+	trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!trie)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (7 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:56   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps Roman Gushchin
                   ` (25 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Enable the memcg-based memory accounting for the memory used by
the bpf ringbuffer.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/ringbuf.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 002f8a5c9e51..e8e2c39cbdc9 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -60,8 +60,8 @@ struct bpf_ringbuf_hdr {
 
 static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
 {
-	const gfp_t flags = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN |
-			    __GFP_ZERO;
+	const gfp_t flags = GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL |
+			    __GFP_NOWARN | __GFP_ZERO;
 	int nr_meta_pages = RINGBUF_PGOFF + RINGBUF_POS_PAGES;
 	int nr_data_pages = data_sz >> PAGE_SHIFT;
 	int nr_pages = nr_meta_pages + nr_data_pages;
@@ -89,7 +89,8 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
 	 */
 	array_size = (nr_meta_pages + 2 * nr_data_pages) * sizeof(*pages);
 	if (array_size > PAGE_SIZE)
-		pages = vmalloc_node(array_size, numa_node);
+		pages = __vmalloc_node(array_size, 1, GFP_KERNEL_ACCOUNT,
+				       numa_node, __builtin_return_address(0));
 	else
 		pages = kmalloc_node(array_size, flags, numa_node);
 	if (!pages)
@@ -167,7 +168,7 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 		return ERR_PTR(-E2BIG);
 #endif
 
-	rb_map = kzalloc(sizeof(*rb_map), GFP_USER);
+	rb_map = kzalloc(sizeof(*rb_map), GFP_USER | __GFP_ACCOUNT);
 	if (!rb_map)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (8 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:57   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps Roman Gushchin
                   ` (24 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Account memory used by the socket storage.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/core/bpf_sk_storage.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index eafcd15e7dfd..fbcd03cd00d3 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -130,7 +130,8 @@ static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
 	if (charge_omem && omem_charge(sk, smap->elem_size))
 		return NULL;
 
-	selem = kzalloc(smap->elem_size, GFP_ATOMIC | __GFP_NOWARN);
+	selem = kzalloc(smap->elem_size,
+			GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (selem) {
 		if (value)
 			memcpy(SDATA(selem)->data, value, smap->map.value_size);
@@ -337,7 +338,8 @@ static int sk_storage_alloc(struct sock *sk,
 	if (err)
 		return err;
 
-	sk_storage = kzalloc(sizeof(*sk_storage), GFP_ATOMIC | __GFP_NOWARN);
+	sk_storage = kzalloc(sizeof(*sk_storage),
+			     GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!sk_storage) {
 		err = -ENOMEM;
 		goto uncharge;
@@ -677,7 +679,7 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
 	u64 cost;
 	int ret;
 
-	smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN);
+	smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!smap)
 		return ERR_PTR(-ENOMEM);
 	bpf_map_init_from_attr(&smap->map, attr);
@@ -695,7 +697,7 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
 	}
 
 	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
-				 GFP_USER | __GFP_NOWARN);
+				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!smap->buckets) {
 		bpf_map_charge_finish(&smap->map.memory);
 		kfree(smap);
@@ -1024,7 +1026,7 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
 	}
 
 	diag = kzalloc(sizeof(*diag) + sizeof(diag->maps[0]) * nr_maps,
-		       GFP_KERNEL);
+		       GFP_KERNEL | __GFP_ACCOUNT);
 	if (!diag)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (9 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-27 23:58   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps Roman Gushchin
                   ` (23 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Include internal metadata in the memcg-based memory accounting.
Also account the memory allocated when updating an element.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/core/sock_map.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 119f52a99dc1..bc797adca44c 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -38,7 +38,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 	    attr->map_flags & ~SOCK_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
 
-	stab = kzalloc(sizeof(*stab), GFP_USER);
+	stab = kzalloc(sizeof(*stab), GFP_USER | __GFP_ACCOUNT);
 	if (!stab)
 		return ERR_PTR(-ENOMEM);
 
@@ -829,7 +829,8 @@ static struct bpf_shtab_elem *sock_hash_alloc_elem(struct bpf_shtab *htab,
 		}
 	}
 
-	new = kmalloc_node(htab->elem_size, GFP_ATOMIC | __GFP_NOWARN,
+	new = kmalloc_node(htab->elem_size,
+			   GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT,
 			   htab->map.numa_node);
 	if (!new) {
 		atomic_dec(&htab->count);
@@ -1011,7 +1012,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	if (attr->key_size > MAX_BPF_STACK)
 		return ERR_PTR(-E2BIG);
 
-	htab = kzalloc(sizeof(*htab), GFP_USER);
+	htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT);
 	if (!htab)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (10 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  0:01   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps Roman Gushchin
                   ` (22 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Extend xskmap memory accounting to include the memory taken by
the xsk_map_node structure.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/xdp/xskmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index 8367adbbe9df..e574b22defe5 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -28,7 +28,8 @@ static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map,
 	struct xsk_map_node *node;
 	int err;
 
-	node = kzalloc(sizeof(*node), GFP_ATOMIC | __GFP_NOWARN);
+	node = kzalloc(sizeof(*node),
+		       GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!node)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.26.2



* [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (11 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  0:04   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps Roman Gushchin
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for arraymap maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/arraymap.c | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 9597fecff8da..41581c38b31d 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -75,11 +75,10 @@ int array_map_alloc_check(union bpf_attr *attr)
 static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 {
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
-	int ret, numa_node = bpf_map_attr_numa_node(attr);
+	int numa_node = bpf_map_attr_numa_node(attr);
 	u32 elem_size, index_mask, max_entries;
 	bool bypass_spec_v1 = bpf_bypass_spec_v1();
-	u64 cost, array_size, mask64;
-	struct bpf_map_memory mem;
+	u64 array_size, mask64;
 	struct bpf_array *array;
 
 	elem_size = round_up(attr->value_size, 8);
@@ -120,44 +119,29 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 		}
 	}
 
-	/* make sure there is no u32 overflow later in round_up() */
-	cost = array_size;
-	if (percpu)
-		cost += (u64)attr->max_entries * elem_size * num_possible_cpus();
-
-	ret = bpf_map_charge_init(&mem, cost);
-	if (ret < 0)
-		return ERR_PTR(ret);
-
 	/* allocate all map elements and zero-initialize them */
 	if (attr->map_flags & BPF_F_MMAPABLE) {
 		void *data;
 
 		/* kmalloc'ed memory can't be mmap'ed, use explicit vmalloc */
 		data = bpf_map_area_mmapable_alloc(array_size, numa_node);
-		if (!data) {
-			bpf_map_charge_finish(&mem);
+		if (!data)
 			return ERR_PTR(-ENOMEM);
-		}
 		array = data + PAGE_ALIGN(sizeof(struct bpf_array))
 			- offsetof(struct bpf_array, value);
 	} else {
 		array = bpf_map_area_alloc(array_size, numa_node);
 	}
-	if (!array) {
-		bpf_map_charge_finish(&mem);
+	if (!array)
 		return ERR_PTR(-ENOMEM);
-	}
 	array->index_mask = index_mask;
 	array->map.bypass_spec_v1 = bypass_spec_v1;
 
 	/* copy mandatory map attributes */
 	bpf_map_init_from_attr(&array->map, attr);
-	bpf_map_charge_move(&array->map.memory, &mem);
 	array->elem_size = elem_size;
 
 	if (percpu && bpf_array_alloc_percpu(array)) {
-		bpf_map_charge_finish(&array->map.memory);
 		bpf_map_area_free(array);
 		return ERR_PTR(-ENOMEM);
 	}
-- 
2.26.2
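
The diff above removes the precharge/uncharge bookkeeping from every failure
path of array_map_alloc(). As a rough illustration of why that becomes
possible, here is a toy userspace model in which the charge is taken inside
the allocator itself, so a failed allocation never leaves a stale charge
behind. Everything here (struct demo_cgroup, demo_alloc_accounted(),
demo_free_accounted()) is hypothetical and is not kernel API; the real memcg
charging is far more involved.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory cgroup: a usage counter and a hard limit. */
struct demo_cgroup {
	size_t charged;
	size_t limit;
};

/*
 * Allocate zeroed memory and charge it to @cg in one step.
 * Returns NULL, with the counter untouched, if the limit would be
 * exceeded or the allocation itself fails.
 */
static void *demo_alloc_accounted(struct demo_cgroup *cg, size_t size)
{
	void *p;

	/* charged never exceeds limit, so the subtraction cannot wrap */
	if (size > cg->limit - cg->charged)
		return NULL;

	p = calloc(1, size);
	if (!p)
		return NULL;

	cg->charged += size;
	return p;
}

/* Free memory and drop its charge; a NULL pointer is ignored. */
static void demo_free_accounted(struct demo_cgroup *cg, void *p, size_t size)
{
	if (!p)
		return;

	cg->charged -= size;
	free(p);
}
```

A caller in this model only needs the usual "if (!p) return error" check.
There is no separate charge object to move or release, which is the same
simplification the removed bpf_map_charge_*() calls reflect.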



* [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (12 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:29   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps Roman Gushchin
                   ` (20 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for bpf_struct_ops maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/bpf_struct_ops.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 969c5d47f81f..22bfa236683b 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -550,12 +550,10 @@ static int bpf_struct_ops_map_alloc_check(union bpf_attr *attr)
 static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 {
 	const struct bpf_struct_ops *st_ops;
-	size_t map_total_size, st_map_size;
+	size_t st_map_size;
 	struct bpf_struct_ops_map *st_map;
 	const struct btf_type *t, *vt;
-	struct bpf_map_memory mem;
 	struct bpf_map *map;
-	int err;
 
 	if (!bpf_capable())
 		return ERR_PTR(-EPERM);
@@ -575,20 +573,11 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 		 * struct bpf_struct_ops_tcp_congestions_ops
 		 */
 		(vt->size - sizeof(struct bpf_struct_ops_value));
-	map_total_size = st_map_size +
-		/* uvalue */
-		sizeof(vt->size) +
-		/* struct bpf_progs **progs */
-		 btf_type_vlen(t) * sizeof(struct bpf_prog *);
-	err = bpf_map_charge_init(&mem, map_total_size);
-	if (err < 0)
-		return ERR_PTR(err);
 
 	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
-	if (!st_map) {
-		bpf_map_charge_finish(&mem);
+	if (!st_map)
 		return ERR_PTR(-ENOMEM);
-	}
+
 	st_map->st_ops = st_ops;
 	map = &st_map->map;
 
@@ -599,14 +588,12 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 	st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
 	if (!st_map->uvalue || !st_map->progs || !st_map->image) {
 		bpf_struct_ops_map_free(map);
-		bpf_map_charge_finish(&mem);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	mutex_init(&st_map->lock);
 	set_vm_flush_reset_perms(st_map->image);
 	bpf_map_init_from_attr(map, attr);
-	bpf_map_charge_move(&map->memory, &mem);
 
 	return map;
 }
-- 
2.26.2



* [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (13 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:30   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps Roman Gushchin
                   ` (19 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for cpumap maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/cpumap.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 74ae9fcbe82e..50f3444a3301 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -86,8 +86,6 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	u32 value_size = attr->value_size;
 	struct bpf_cpu_map *cmap;
 	int err = -ENOMEM;
-	u64 cost;
-	int ret;
 
 	if (!bpf_capable())
 		return ERR_PTR(-EPERM);
@@ -111,26 +109,14 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 		goto free_cmap;
 	}
 
-	/* make sure page count doesn't overflow */
-	cost = (u64) cmap->map.max_entries * sizeof(struct bpf_cpu_map_entry *);
-
-	/* Notice returns -EPERM on if map size is larger than memlock limit */
-	ret = bpf_map_charge_init(&cmap->map.memory, cost);
-	if (ret) {
-		err = ret;
-		goto free_cmap;
-	}
-
 	/* Alloc array for possible remote "destination" CPUs */
 	cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
 					   sizeof(struct bpf_cpu_map_entry *),
 					   cmap->map.numa_node);
 	if (!cmap->cpu_map)
-		goto free_charge;
+		goto free_cmap;
 
 	return &cmap->map;
-free_charge:
-	bpf_map_charge_finish(&cmap->map.memory);
 free_cmap:
 	kfree(cmap);
 	return ERR_PTR(err);
-- 
2.26.2



* [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (14 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:31   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps Roman Gushchin
                   ` (18 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for cgroup storage maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/local_storage.c | 21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index 117acb2e80fb..5f29a420849c 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -288,8 +288,6 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
 {
 	int numa_node = bpf_map_attr_numa_node(attr);
 	struct bpf_cgroup_storage_map *map;
-	struct bpf_map_memory mem;
-	int ret;
 
 	if (attr->key_size != sizeof(struct bpf_cgroup_storage_key) &&
 	    attr->key_size != sizeof(__u64))
@@ -309,18 +307,10 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
 		/* max_entries is not used and enforced to be 0 */
 		return ERR_PTR(-EINVAL);
 
-	ret = bpf_map_charge_init(&mem, sizeof(struct bpf_cgroup_storage_map));
-	if (ret < 0)
-		return ERR_PTR(ret);
-
 	map = kmalloc_node(sizeof(struct bpf_cgroup_storage_map),
 			   __GFP_ZERO | GFP_USER | __GFP_ACCOUNT, numa_node);
-	if (!map) {
-		bpf_map_charge_finish(&mem);
+	if (!map)
 		return ERR_PTR(-ENOMEM);
-	}
-
-	bpf_map_charge_move(&map->map.memory, &mem);
 
 	/* copy mandatory map attributes */
 	bpf_map_init_from_attr(&map->map, attr);
@@ -509,9 +499,6 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
 
 	size = bpf_cgroup_storage_calculate_size(map, &pages);
 
-	if (bpf_map_charge_memlock(map, pages))
-		return ERR_PTR(-EPERM);
-
 	storage = kmalloc_node(sizeof(struct bpf_cgroup_storage), gfp,
 			       map->numa_node);
 	if (!storage)
@@ -533,7 +520,6 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
 	return storage;
 
 enomem:
-	bpf_map_uncharge_memlock(map, pages);
 	kfree(storage);
 	return ERR_PTR(-ENOMEM);
 }
@@ -560,16 +546,11 @@ void bpf_cgroup_storage_free(struct bpf_cgroup_storage *storage)
 {
 	enum bpf_cgroup_storage_type stype;
 	struct bpf_map *map;
-	u32 pages;
 
 	if (!storage)
 		return;
 
 	map = &storage->map->map;
-
-	bpf_cgroup_storage_calculate_size(map, &pages);
-	bpf_map_uncharge_memlock(map, pages);
-
 	stype = cgroup_storage_type(map);
 	if (stype == BPF_CGROUP_STORAGE_SHARED)
 		call_rcu(&storage->rcu, free_shared_cgroup_storage_rcu);
-- 
2.26.2



* [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (15 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:31   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps Roman Gushchin
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for devmap maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/devmap.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 05bf93088063..8148c7260a54 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -109,8 +109,6 @@ static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab,
 static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 {
 	u32 valsize = attr->value_size;
-	u64 cost = 0;
-	int err;
 
 	/* check sanity of attributes. 2 value sizes supported:
 	 * 4 bytes: ifindex
@@ -135,21 +133,13 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 
 		if (!dtab->n_buckets) /* Overflow check */
 			return -EINVAL;
-		cost += (u64) sizeof(struct hlist_head) * dtab->n_buckets;
-	} else {
-		cost += (u64) dtab->map.max_entries * sizeof(struct bpf_dtab_netdev *);
 	}
 
-	/* if map size is larger than memlock limit, reject it */
-	err = bpf_map_charge_init(&dtab->map.memory, cost);
-	if (err)
-		return -EINVAL;
-
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
 							   dtab->map.numa_node);
 		if (!dtab->dev_index_head)
-			goto free_charge;
+			return -ENOMEM;
 
 		spin_lock_init(&dtab->index_lock);
 	} else {
@@ -157,14 +147,10 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 						      sizeof(struct bpf_dtab_netdev *),
 						      dtab->map.numa_node);
 		if (!dtab->netdev_map)
-			goto free_charge;
+			return -ENOMEM;
 	}
 
 	return 0;
-
-free_charge:
-	bpf_map_charge_finish(&dtab->map.memory);
-	return -ENOMEM;
 }
 
 static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
-- 
2.26.2



* [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (16 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:32   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps Roman Gushchin
                   ` (16 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for hashtab maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/hashtab.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 9d0432170812..9372b559b4e7 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -422,7 +422,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
 	bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
 	struct bpf_htab *htab;
-	u64 cost;
 	int err;
 
 	htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT);
@@ -459,26 +458,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	    htab->n_buckets > U32_MAX / sizeof(struct bucket))
 		goto free_htab;
 
-	cost = (u64) htab->n_buckets * sizeof(struct bucket) +
-	       (u64) htab->elem_size * htab->map.max_entries;
-
-	if (percpu)
-		cost += (u64) round_up(htab->map.value_size, 8) *
-			num_possible_cpus() * htab->map.max_entries;
-	else
-	       cost += (u64) htab->elem_size * num_possible_cpus();
-
-	/* if map size is larger than memlock limit, reject it */
-	err = bpf_map_charge_init(&htab->map.memory, cost);
-	if (err)
-		goto free_htab;
-
 	err = -ENOMEM;
 	htab->buckets = bpf_map_area_alloc(htab->n_buckets *
 					   sizeof(struct bucket),
 					   htab->map.numa_node);
 	if (!htab->buckets)
-		goto free_charge;
+		goto free_htab;
 
 	if (htab->map.map_flags & BPF_F_ZERO_SEED)
 		htab->hashrnd = 0;
@@ -508,8 +493,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	prealloc_destroy(htab);
 free_buckets:
 	bpf_map_area_free(htab->buckets);
-free_charge:
-	bpf_map_charge_finish(&htab->map.memory);
 free_htab:
 	kfree(htab);
 	return ERR_PTR(err);
-- 
2.26.2



* [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (17 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:32   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps Roman Gushchin
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for lpm_trie maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/lpm_trie.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index d85e0fc2cafc..c747f0835eb1 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -540,8 +540,6 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
 static struct bpf_map *trie_alloc(union bpf_attr *attr)
 {
 	struct lpm_trie *trie;
-	u64 cost = sizeof(*trie), cost_per_node;
-	int ret;
 
 	if (!bpf_capable())
 		return ERR_PTR(-EPERM);
@@ -567,20 +565,9 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 			  offsetof(struct bpf_lpm_trie_key, data);
 	trie->max_prefixlen = trie->data_size * 8;
 
-	cost_per_node = sizeof(struct lpm_trie_node) +
-			attr->value_size + trie->data_size;
-	cost += (u64) attr->max_entries * cost_per_node;
-
-	ret = bpf_map_charge_init(&trie->map.memory, cost);
-	if (ret)
-		goto out_err;
-
 	spin_lock_init(&trie->lock);
 
 	return &trie->map;
-out_err:
-	kfree(trie);
-	return ERR_PTR(ret);
 }
 
 static void trie_free(struct bpf_map *map)
-- 
2.26.2



* [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (18 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:35   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps Roman Gushchin
                   ` (14 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for queue_stack maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/queue_stack_maps.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index 44184f82916a..92e73c35a34a 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -66,29 +66,21 @@ static int queue_stack_map_alloc_check(union bpf_attr *attr)
 
 static struct bpf_map *queue_stack_map_alloc(union bpf_attr *attr)
 {
-	int ret, numa_node = bpf_map_attr_numa_node(attr);
-	struct bpf_map_memory mem = {0};
+	int numa_node = bpf_map_attr_numa_node(attr);
 	struct bpf_queue_stack *qs;
-	u64 size, queue_size, cost;
+	u64 size, queue_size;
 
 	size = (u64) attr->max_entries + 1;
-	cost = queue_size = sizeof(*qs) + size * attr->value_size;
-
-	ret = bpf_map_charge_init(&mem, cost);
-	if (ret < 0)
-		return ERR_PTR(ret);
+	queue_size = sizeof(*qs) + size * attr->value_size;
 
 	qs = bpf_map_area_alloc(queue_size, numa_node);
-	if (!qs) {
-		bpf_map_charge_finish(&mem);
+	if (!qs)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	memset(qs, 0, sizeof(*qs));
 
 	bpf_map_init_from_attr(&qs->map, attr);
 
-	bpf_map_charge_move(&qs->map.memory, &mem);
 	qs->size = size;
 
 	raw_spin_lock_init(&qs->lock);
-- 
2.26.2



* [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (19 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:36   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
                   ` (13 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for reuseport_array maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/reuseport_array.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 90b29c5b1da7..9d0161fdfec7 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -150,9 +150,8 @@ static void reuseport_array_free(struct bpf_map *map)
 
 static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
 {
-	int err, numa_node = bpf_map_attr_numa_node(attr);
+	int numa_node = bpf_map_attr_numa_node(attr);
 	struct reuseport_array *array;
-	struct bpf_map_memory mem;
 	u64 array_size;
 
 	if (!bpf_capable())
@@ -161,20 +160,13 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
 	array_size = sizeof(*array);
 	array_size += (u64)attr->max_entries * sizeof(struct sock *);
 
-	err = bpf_map_charge_init(&mem, array_size);
-	if (err)
-		return ERR_PTR(err);
-
 	/* allocate all map elements and zero-initialize them */
 	array = bpf_map_area_alloc(array_size, numa_node);
-	if (!array) {
-		bpf_map_charge_finish(&mem);
+	if (!array)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	/* copy mandatory map attributes */
 	bpf_map_init_from_attr(&array->map, attr);
-	bpf_map_charge_move(&array->map.memory, &mem);
 
 	return &array->map;
 }
-- 
2.26.2



* [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (20 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:37   ` Song Liu
  2020-07-28  5:56   ` Andrii Nakryiko
  2020-07-27 18:44 ` [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps Roman Gushchin
                   ` (12 subsequent siblings)
  34 siblings, 2 replies; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for bpf ringbuffer.
It has been replaced with the memcg-based memory accounting.

bpf_ringbuf_alloc() can only return ERR_PTR(-ENOMEM) or a valid
pointer, so to simplify the code make it return NULL in the first
case. This allows dropping a couple of lines in ringbuf_map_alloc()
and also makes it look similar to other memory allocation functions
such as kmalloc().

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/ringbuf.c | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index e8e2c39cbdc9..e687b798d097 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -48,7 +48,6 @@ struct bpf_ringbuf {
 
 struct bpf_ringbuf_map {
 	struct bpf_map map;
-	struct bpf_map_memory memory;
 	struct bpf_ringbuf *rb;
 };
 
@@ -135,7 +134,7 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
 
 	rb = bpf_ringbuf_area_alloc(data_sz, numa_node);
 	if (!rb)
-		return ERR_PTR(-ENOMEM);
+		return NULL;
 
 	spin_lock_init(&rb->spinlock);
 	init_waitqueue_head(&rb->waitq);
@@ -151,8 +150,6 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
 static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 {
 	struct bpf_ringbuf_map *rb_map;
-	u64 cost;
-	int err;
 
 	if (attr->map_flags & ~RINGBUF_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
@@ -174,26 +171,13 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 
 	bpf_map_init_from_attr(&rb_map->map, attr);
 
-	cost = sizeof(struct bpf_ringbuf_map) +
-	       sizeof(struct bpf_ringbuf) +
-	       attr->max_entries;
-	err = bpf_map_charge_init(&rb_map->map.memory, cost);
-	if (err)
-		goto err_free_map;
-
 	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node);
-	if (IS_ERR(rb_map->rb)) {
-		err = PTR_ERR(rb_map->rb);
-		goto err_uncharge;
+	if (!rb_map->rb) {
+		kfree(rb_map);
+		return ERR_PTR(-ENOMEM);
 	}
 
 	return &rb_map->map;
-
-err_uncharge:
-	bpf_map_charge_finish(&rb_map->map.memory);
-err_free_map:
-	kfree(rb_map);
-	return ERR_PTR(err);
 }
 
 static void bpf_ringbuf_free(struct bpf_ringbuf *rb)
-- 
2.26.2



* [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (21 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:37   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps Roman Gushchin
                   ` (11 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for sockmap and sockhash maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/core/sock_map.c | 33 ++++++---------------------------
 1 file changed, 6 insertions(+), 27 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index bc797adca44c..07c90baf8db1 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -26,8 +26,6 @@ struct bpf_stab {
 static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 {
 	struct bpf_stab *stab;
-	u64 cost;
-	int err;
 
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
@@ -45,22 +43,15 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 	bpf_map_init_from_attr(&stab->map, attr);
 	raw_spin_lock_init(&stab->lock);
 
-	/* Make sure page count doesn't overflow. */
-	cost = (u64) stab->map.max_entries * sizeof(struct sock *);
-	err = bpf_map_charge_init(&stab->map.memory, cost);
-	if (err)
-		goto free_stab;
-
 	stab->sks = bpf_map_area_alloc(stab->map.max_entries *
 				       sizeof(struct sock *),
 				       stab->map.numa_node);
-	if (stab->sks)
-		return &stab->map;
-	err = -ENOMEM;
-	bpf_map_charge_finish(&stab->map.memory);
-free_stab:
-	kfree(stab);
-	return ERR_PTR(err);
+	if (!stab->sks) {
+		kfree(stab);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return &stab->map;
 }
 
 int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog)
@@ -999,7 +990,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 {
 	struct bpf_shtab *htab;
 	int i, err;
-	u64 cost;
 
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
@@ -1027,21 +1017,10 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 		goto free_htab;
 	}
 
-	cost = (u64) htab->buckets_num * sizeof(struct bpf_shtab_bucket) +
-	       (u64) htab->elem_size * htab->map.max_entries;
-	if (cost >= U32_MAX - PAGE_SIZE) {
-		err = -EINVAL;
-		goto free_htab;
-	}
-	err = bpf_map_charge_init(&htab->map.memory, cost);
-	if (err)
-		goto free_htab;
-
 	htab->buckets = bpf_map_area_alloc(htab->buckets_num *
 					   sizeof(struct bpf_shtab_bucket),
 					   htab->map.numa_node);
 	if (!htab->buckets) {
-		bpf_map_charge_finish(&htab->map.memory);
 		err = -ENOMEM;
 		goto free_htab;
 	}
-- 
2.26.2



* [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (22 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:38   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps Roman Gushchin
                   ` (10 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for stackmap maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 kernel/bpf/stackmap.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 5beb2f8c23da..9ac0f405beef 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -90,7 +90,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 {
 	u32 value_size = attr->value_size;
 	struct bpf_stack_map *smap;
-	struct bpf_map_memory mem;
 	u64 cost, n_buckets;
 	int err;
 
@@ -119,15 +118,9 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 
 	cost = n_buckets * sizeof(struct stack_map_bucket *) + sizeof(*smap);
 	cost += n_buckets * (value_size + sizeof(struct stack_map_bucket));
-	err = bpf_map_charge_init(&mem, cost);
-	if (err)
-		return ERR_PTR(err);
-
 	smap = bpf_map_area_alloc(cost, bpf_map_attr_numa_node(attr));
-	if (!smap) {
-		bpf_map_charge_finish(&mem);
+	if (!smap)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	bpf_map_init_from_attr(&smap->map, attr);
 	smap->map.value_size = value_size;
@@ -135,20 +128,17 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 
 	err = get_callchain_buffers(sysctl_perf_event_max_stack);
 	if (err)
-		goto free_charge;
+		goto free_smap;
 
 	err = prealloc_elems_and_freelist(smap);
 	if (err)
 		goto put_buffers;
 
-	bpf_map_charge_move(&smap->map.memory, &mem);
-
 	return &smap->map;
 
 put_buffers:
 	put_callchain_buffers();
-free_charge:
-	bpf_map_charge_finish(&mem);
+free_smap:
 	bpf_map_area_free(smap);
 	return ERR_PTR(err);
 }
-- 
2.26.2



* [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (23 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:41   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps Roman Gushchin
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for socket storage maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/core/bpf_sk_storage.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index fbcd03cd00d3..c0a35b6368af 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -676,8 +676,6 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
 	struct bpf_sk_storage_map *smap;
 	unsigned int i;
 	u32 nbuckets;
-	u64 cost;
-	int ret;
 
 	smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!smap)
@@ -688,18 +686,9 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
 	/* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
 	nbuckets = max_t(u32, 2, nbuckets);
 	smap->bucket_log = ilog2(nbuckets);
-	cost = sizeof(*smap->buckets) * nbuckets + sizeof(*smap);
-
-	ret = bpf_map_charge_init(&smap->map.memory, cost);
-	if (ret < 0) {
-		kfree(smap);
-		return ERR_PTR(ret);
-	}
-
 	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
 				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!smap->buckets) {
-		bpf_map_charge_finish(&smap->map.memory);
 		kfree(smap);
 		return ERR_PTR(-ENOMEM);
 	}
-- 
2.26.2



* [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (24 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:42   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps Roman Gushchin
                   ` (8 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for xskmap maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 net/xdp/xskmap.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index e574b22defe5..0366013f13c6 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -74,7 +74,6 @@ static void xsk_map_sock_delete(struct xdp_sock *xs,
 
 static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 {
-	struct bpf_map_memory mem;
 	int err, numa_node;
 	struct xsk_map *m;
 	u64 size;
@@ -90,18 +89,11 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	numa_node = bpf_map_attr_numa_node(attr);
 	size = struct_size(m, xsk_map, attr->max_entries);
 
-	err = bpf_map_charge_init(&mem, size);
-	if (err < 0)
-		return ERR_PTR(err);
-
 	m = bpf_map_area_alloc(size, numa_node);
-	if (!m) {
-		bpf_map_charge_finish(&mem);
+	if (!m)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	bpf_map_init_from_attr(&m->map, attr);
-	bpf_map_charge_move(&m->map.memory, &mem);
 	spin_lock_init(&m->lock);
 
 	return &m->map;
-- 
2.26.2



* [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (25 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:47   ` Song Liu
  2020-07-27 18:44 ` [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs Roman Gushchin
                   ` (7 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Remove rlimit-based accounting infrastructure code, which is not used
anymore.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
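
Note for readers: the accounting scheme that this series retires can be
modelled in user space. The sketch below is an illustration under
stated assumptions, not the kernel implementation: struct model_user,
size_to_pages(), model_charge(), model_charge_init() and
model_uncharge() are hypothetical names, the page size is assumed to be
4 KiB, and C11 atomics stand in for the kernel's atomic_long_t. It
mirrors the shape of bpf_map_charge_init() and the memlock check it
relies on: convert a byte size to pages, add the pages to a per-user
counter, and roll back with -EPERM if the limit is exceeded.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the real value is architecture dependent. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE (1ULL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the per-user locked_vm counter. */
struct model_user {
	atomic_long locked_vm;
};

/* Round a byte size up to whole pages. */
static uint32_t size_to_pages(uint64_t size)
{
	return (uint32_t)((size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT);
}

/* Add pages to the counter; undo and fail with -EPERM above the limit. */
static int model_charge(struct model_user *u, uint32_t pages, long limit_pages)
{
	long now = atomic_fetch_add(&u->locked_vm, (long)pages) + (long)pages;

	if (now > limit_pages) {
		atomic_fetch_sub(&u->locked_vm, (long)pages);
		return -EPERM;
	}
	return 0;
}

/* Every successful charge must be paired with exactly one uncharge. */
static void model_uncharge(struct model_user *u, uint32_t pages)
{
	atomic_fetch_sub(&u->locked_vm, (long)pages);
}

/* Precharge an estimated cost before the memory is actually allocated. */
static int model_charge_init(struct model_user *u, uint64_t size,
			     long limit_pages, uint32_t *pages)
{
	if (size >= UINT32_MAX - MODEL_PAGE_SIZE)
		return -E2BIG;

	*pages = size_to_pages(size);
	return model_charge(u, *pages, limit_pages);
}
```

The caller has to remember the charged page count and uncharge it on
every failure path, which is the bookkeeping that the removed
bpf_map_charge_finish() and bpf_map_charge_move() calls existed for.
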
 include/linux/bpf.h                           | 12 ----
 kernel/bpf/syscall.c                          | 64 +------------------
 .../selftests/bpf/progs/map_ptr_kern.c        |  5 --
 3 files changed, 2 insertions(+), 79 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8357be349133..055c693d9928 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -112,11 +112,6 @@ struct bpf_map_ops {
 	const struct bpf_iter_seq_info *iter_seq_info;
 };
 
-struct bpf_map_memory {
-	u32 pages;
-	struct user_struct *user;
-};
-
 struct bpf_map {
 	/* The first two cachelines with read-mostly members of which some
 	 * are also accessed in fast-path (e.g. ops, max_entries).
@@ -137,7 +132,6 @@ struct bpf_map {
 	u32 btf_key_type_id;
 	u32 btf_value_type_id;
 	struct btf *btf;
-	struct bpf_map_memory memory;
 	char name[BPF_OBJ_NAME_LEN];
 	u32 btf_vmlinux_value_type_id;
 	bool bypass_spec_v1;
@@ -1117,12 +1111,6 @@ void bpf_map_inc_with_uref(struct bpf_map *map);
 struct bpf_map * __must_check bpf_map_inc_not_zero(struct bpf_map *map);
 void bpf_map_put_with_uref(struct bpf_map *map);
 void bpf_map_put(struct bpf_map *map);
-int bpf_map_charge_memlock(struct bpf_map *map, u32 pages);
-void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages);
-int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size);
-void bpf_map_charge_finish(struct bpf_map_memory *mem);
-void bpf_map_charge_move(struct bpf_map_memory *dst,
-			 struct bpf_map_memory *src);
 void *bpf_map_area_alloc(u64 size, int numa_node);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
 void bpf_map_area_free(void *base);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 501b2c071d7b..ae51e2363cc1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -354,60 +354,6 @@ static void bpf_uncharge_memlock(struct user_struct *user, u32 pages)
 		atomic_long_sub(pages, &user->locked_vm);
 }
 
-int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size)
-{
-	u32 pages = round_up(size, PAGE_SIZE) >> PAGE_SHIFT;
-	struct user_struct *user;
-	int ret;
-
-	if (size >= U32_MAX - PAGE_SIZE)
-		return -E2BIG;
-
-	user = get_current_user();
-	ret = bpf_charge_memlock(user, pages);
-	if (ret) {
-		free_uid(user);
-		return ret;
-	}
-
-	mem->pages = pages;
-	mem->user = user;
-
-	return 0;
-}
-
-void bpf_map_charge_finish(struct bpf_map_memory *mem)
-{
-	bpf_uncharge_memlock(mem->user, mem->pages);
-	free_uid(mem->user);
-}
-
-void bpf_map_charge_move(struct bpf_map_memory *dst,
-			 struct bpf_map_memory *src)
-{
-	*dst = *src;
-
-	/* Make sure src will not be used for the redundant uncharging. */
-	memset(src, 0, sizeof(struct bpf_map_memory));
-}
-
-int bpf_map_charge_memlock(struct bpf_map *map, u32 pages)
-{
-	int ret;
-
-	ret = bpf_charge_memlock(map->memory.user, pages);
-	if (ret)
-		return ret;
-	map->memory.pages += pages;
-	return ret;
-}
-
-void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages)
-{
-	bpf_uncharge_memlock(map->memory.user, pages);
-	map->memory.pages -= pages;
-}
-
 static int bpf_map_alloc_id(struct bpf_map *map)
 {
 	int id;
@@ -456,13 +402,10 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
 static void bpf_map_free_deferred(struct work_struct *work)
 {
 	struct bpf_map *map = container_of(work, struct bpf_map, work);
-	struct bpf_map_memory mem;
 
-	bpf_map_charge_move(&mem, &map->memory);
 	security_bpf_map_free(map);
 	/* implementation dependent freeing */
 	map->ops->map_free(map);
-	bpf_map_charge_finish(&mem);
 }
 
 static void bpf_map_put_uref(struct bpf_map *map)
@@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 		   "value_size:\t%u\n"
 		   "max_entries:\t%u\n"
 		   "map_flags:\t%#x\n"
-		   "memlock:\t%llu\n"
+		   "memlock:\t%llu\n" /* deprecated */
 		   "map_id:\t%u\n"
 		   "frozen:\t%u\n",
 		   map->map_type,
@@ -549,7 +492,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 		   map->value_size,
 		   map->max_entries,
 		   map->map_flags,
-		   map->memory.pages * 1ULL << PAGE_SHIFT,
+		   0LLU,
 		   map->id,
 		   READ_ONCE(map->frozen));
 	if (type) {
@@ -790,7 +733,6 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 static int map_create(union bpf_attr *attr)
 {
 	int numa_node = bpf_map_attr_numa_node(attr);
-	struct bpf_map_memory mem;
 	struct bpf_map *map;
 	int f_flags;
 	int err;
@@ -887,9 +829,7 @@ static int map_create(union bpf_attr *attr)
 	security_bpf_map_free(map);
 free_map:
 	btf_put(map->btf);
-	bpf_map_charge_move(&mem, &map->memory);
 	map->ops->map_free(map);
-	bpf_map_charge_finish(&mem);
 	return err;
 }
 
diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
index 473665cac67e..49d1dcaf7999 100644
--- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c
+++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
@@ -26,17 +26,12 @@ __u32 g_line = 0;
 		return 0;	\
 })
 
-struct bpf_map_memory {
-	__u32 pages;
-} __attribute__((preserve_access_index));
-
 struct bpf_map {
 	enum bpf_map_type map_type;
 	__u32 key_size;
 	__u32 value_size;
 	__u32 max_entries;
 	__u32 id;
-	struct bpf_map_memory memory;
 } __attribute__((preserve_access_index));
 
 static inline int check_bpf_map_fields(struct bpf_map *map, __u32 key_size,
-- 
2.26.2



* [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (26 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps Roman Gushchin
@ 2020-07-27 18:44 ` Roman Gushchin
  2020-07-28  5:55   ` Song Liu
  2020-07-27 18:45 ` [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage Roman Gushchin
                   ` (6 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:44 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Do not use rlimit-based memory accounting for bpf progs. It has been
replaced with memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 include/linux/bpf.h  | 11 ------
 kernel/bpf/core.c    | 12 ++-----
 kernel/bpf/syscall.c | 86 ++++++--------------------------------------
 3 files changed, 12 insertions(+), 97 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 055c693d9928..0c443468200e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1095,8 +1095,6 @@ void bpf_prog_sub(struct bpf_prog *prog, int i);
 void bpf_prog_inc(struct bpf_prog *prog);
 struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog);
 void bpf_prog_put(struct bpf_prog *prog);
-int __bpf_prog_charge(struct user_struct *user, u32 pages);
-void __bpf_prog_uncharge(struct user_struct *user, u32 pages);
 void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 			  struct bpf_map **used_maps, u32 len);
 
@@ -1380,15 +1378,6 @@ bpf_prog_inc_not_zero(struct bpf_prog *prog)
 	return ERR_PTR(-EOPNOTSUPP);
 }
 
-static inline int __bpf_prog_charge(struct user_struct *user, u32 pages)
-{
-	return 0;
-}
-
-static inline void __bpf_prog_uncharge(struct user_struct *user, u32 pages)
-{
-}
-
 static inline int bpf_obj_get_user(const char __user *pathname, int flags)
 {
 	return -EOPNOTSUPP;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index daab8dcafbd4..23b8ff109ac8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -219,23 +219,15 @@ struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
 {
 	gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
 	struct bpf_prog *fp;
-	u32 pages, delta;
-	int ret;
+	u32 pages;
 
 	size = round_up(size, PAGE_SIZE);
 	pages = size / PAGE_SIZE;
 	if (pages <= fp_old->pages)
 		return fp_old;
 
-	delta = pages - fp_old->pages;
-	ret = __bpf_prog_charge(fp_old->aux->user, delta);
-	if (ret)
-		return NULL;
-
 	fp = __vmalloc(size, gfp_flags);
-	if (fp == NULL) {
-		__bpf_prog_uncharge(fp_old->aux->user, delta);
-	} else {
+	if (fp) {
 		memcpy(fp, fp_old, fp_old->pages * PAGE_SIZE);
 		fp->pages = pages;
 		fp->aux->prog = fp;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ae51e2363cc1..7f0bf60f5218 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -337,23 +337,6 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
 	map->numa_node = bpf_map_attr_numa_node(attr);
 }
 
-static int bpf_charge_memlock(struct user_struct *user, u32 pages)
-{
-	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-
-	if (atomic_long_add_return(pages, &user->locked_vm) > memlock_limit) {
-		atomic_long_sub(pages, &user->locked_vm);
-		return -EPERM;
-	}
-	return 0;
-}
-
-static void bpf_uncharge_memlock(struct user_struct *user, u32 pages)
-{
-	if (user)
-		atomic_long_sub(pages, &user->locked_vm);
-}
-
 static int bpf_map_alloc_id(struct bpf_map *map)
 {
 	int id;
@@ -1563,51 +1546,6 @@ static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op)
 	audit_log_end(ab);
 }
 
-int __bpf_prog_charge(struct user_struct *user, u32 pages)
-{
-	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-	unsigned long user_bufs;
-
-	if (user) {
-		user_bufs = atomic_long_add_return(pages, &user->locked_vm);
-		if (user_bufs > memlock_limit) {
-			atomic_long_sub(pages, &user->locked_vm);
-			return -EPERM;
-		}
-	}
-
-	return 0;
-}
-
-void __bpf_prog_uncharge(struct user_struct *user, u32 pages)
-{
-	if (user)
-		atomic_long_sub(pages, &user->locked_vm);
-}
-
-static int bpf_prog_charge_memlock(struct bpf_prog *prog)
-{
-	struct user_struct *user = get_current_user();
-	int ret;
-
-	ret = __bpf_prog_charge(user, prog->pages);
-	if (ret) {
-		free_uid(user);
-		return ret;
-	}
-
-	prog->aux->user = user;
-	return 0;
-}
-
-static void bpf_prog_uncharge_memlock(struct bpf_prog *prog)
-{
-	struct user_struct *user = prog->aux->user;
-
-	__bpf_prog_uncharge(user, prog->pages);
-	free_uid(user);
-}
-
 static int bpf_prog_alloc_id(struct bpf_prog *prog)
 {
 	int id;
@@ -1657,7 +1595,7 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
 
 	kvfree(aux->func_info);
 	kfree(aux->func_info_aux);
-	bpf_prog_uncharge_memlock(aux->prog);
+	free_uid(aux->user);
 	security_bpf_prog_free(aux);
 	bpf_prog_free(aux->prog);
 }
@@ -2090,7 +2028,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 		tgt_prog = bpf_prog_get(attr->attach_prog_fd);
 		if (IS_ERR(tgt_prog)) {
 			err = PTR_ERR(tgt_prog);
-			goto free_prog_nouncharge;
+			goto free_prog;
 		}
 		prog->aux->linked_prog = tgt_prog;
 	}
@@ -2099,18 +2037,15 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 
 	err = security_bpf_prog_alloc(prog->aux);
 	if (err)
-		goto free_prog_nouncharge;
-
-	err = bpf_prog_charge_memlock(prog);
-	if (err)
-		goto free_prog_sec;
+		goto free_prog;
 
+	prog->aux->user = get_current_user();
 	prog->len = attr->insn_cnt;
 
 	err = -EFAULT;
 	if (copy_from_user(prog->insns, u64_to_user_ptr(attr->insns),
 			   bpf_prog_insn_size(prog)) != 0)
-		goto free_prog;
+		goto free_prog_sec;
 
 	prog->orig_prog = NULL;
 	prog->jited = 0;
@@ -2121,19 +2056,19 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	if (bpf_prog_is_dev_bound(prog->aux)) {
 		err = bpf_prog_offload_init(prog, attr);
 		if (err)
-			goto free_prog;
+			goto free_prog_sec;
 	}
 
 	/* find program type: socket_filter vs tracing_filter */
 	err = find_prog_type(type, prog);
 	if (err < 0)
-		goto free_prog;
+		goto free_prog_sec;
 
 	prog->aux->load_time = ktime_get_boottime_ns();
 	err = bpf_obj_name_cpy(prog->aux->name, attr->prog_name,
 			       sizeof(attr->prog_name));
 	if (err < 0)
-		goto free_prog;
+		goto free_prog_sec;
 
 	/* run eBPF verifier */
 	err = bpf_check(&prog, attr, uattr);
@@ -2178,11 +2113,10 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	 */
 	__bpf_prog_put_noref(prog, prog->aux->func_cnt);
 	return err;
-free_prog:
-	bpf_prog_uncharge_memlock(prog);
 free_prog_sec:
+	free_uid(prog->aux->user);
 	security_bpf_prog_free(prog->aux);
-free_prog_nouncharge:
+free_prog:
 	bpf_prog_free(prog);
 	return err;
 }
-- 
2.26.2



* [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (27 preceding siblings ...)
  2020-07-27 18:44 ` [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-27 22:05   ` Andrii Nakryiko
  2020-07-27 18:45 ` [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK Roman Gushchin
                   ` (5 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

As bpf is not using memlock rlimit for memory accounting anymore,
let's remove the related code from libbpf.

BPF operations can no longer fail because the memlock limit was
exceeded.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
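
Note for readers: the helper removed here, pr_perm_msg(), read
RLIMIT_MEMLOCK and printed it in a human-readable unit as a hint for
-EPERM failures. The sketch below reproduces only that lookup and
formatting step as a standalone user-space example. format_limit() and
describe_memlock_limit() are hypothetical names, and the "unlimited"
string is an addition for the sketch (the removed helper simply
returned in that case).

```c
#include <stdio.h>
#include <string.h>	/* strcmp() for callers that compare the result */
#include <sys/resource.h>

/* Format a byte count as bytes, KiB or MiB, one decimal for the latter two. */
static void format_limit(unsigned long long bytes, char *buf, size_t len)
{
	if (bytes < 1024)
		snprintf(buf, len, "%llu bytes", bytes);
	else if (bytes < 1024 * 1024)
		snprintf(buf, len, "%.1f KiB", (double)bytes / 1024);
	else
		snprintf(buf, len, "%.1f MiB", (double)bytes / (1024 * 1024));
}

/* Describe the soft RLIMIT_MEMLOCK; returns 0 on success, -1 on error. */
static int describe_memlock_limit(char *buf, size_t len)
{
	struct rlimit limit;

	if (getrlimit(RLIMIT_MEMLOCK, &limit))
		return -1;

	if (limit.rlim_cur == RLIM_INFINITY)
		snprintf(buf, len, "unlimited");
	else
		format_limit((unsigned long long)limit.rlim_cur, buf, len);
	return 0;
}
```

With memcg-based accounting the hint has no remaining purpose, since
the memlock limit is no longer consulted when creating maps or loading
programs.
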
 tools/lib/bpf/libbpf.c | 31 +------------------------------
 tools/lib/bpf/libbpf.h |  5 -----
 2 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e51479d60285..841060f5cee3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -112,32 +112,6 @@ void libbpf_print(enum libbpf_print_level level, const char *format, ...)
 	va_end(args);
 }
 
-static void pr_perm_msg(int err)
-{
-	struct rlimit limit;
-	char buf[100];
-
-	if (err != -EPERM || geteuid() != 0)
-		return;
-
-	err = getrlimit(RLIMIT_MEMLOCK, &limit);
-	if (err)
-		return;
-
-	if (limit.rlim_cur == RLIM_INFINITY)
-		return;
-
-	if (limit.rlim_cur < 1024)
-		snprintf(buf, sizeof(buf), "%zu bytes", (size_t)limit.rlim_cur);
-	else if (limit.rlim_cur < 1024*1024)
-		snprintf(buf, sizeof(buf), "%.1f KiB", (double)limit.rlim_cur / 1024);
-	else
-		snprintf(buf, sizeof(buf), "%.1f MiB", (double)limit.rlim_cur / (1024*1024));
-
-	pr_warn("permission error while running as root; try raising 'ulimit -l'? current value: %s\n",
-		buf);
-}
-
 #define STRERR_BUFSIZE  128
 
 /* Copied from tools/perf/util/util.h */
@@ -3420,8 +3394,7 @@ bpf_object__probe_loading(struct bpf_object *obj)
 		cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
 		pr_warn("Error in %s():%s(%d). Couldn't load trivial BPF "
 			"program. Make sure your kernel supports BPF "
-			"(CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is "
-			"set to big enough value.\n", __func__, cp, ret);
+			"(CONFIG_BPF_SYSCALL=y).\n", __func__, cp, ret);
 		return -ret;
 	}
 	close(ret);
@@ -3918,7 +3891,6 @@ bpf_object__create_maps(struct bpf_object *obj)
 err_out:
 	cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg));
 	pr_warn("map '%s': failed to create: %s(%d)\n", map->name, cp, err);
-	pr_perm_msg(err);
 	for (j = 0; j < i; j++)
 		zclose(obj->maps[j].fd);
 	return err;
@@ -5419,7 +5391,6 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	ret = -errno;
 	cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
 	pr_warn("load bpf program failed: %s\n", cp);
-	pr_perm_msg(ret);
 
 	if (log_buf && log_buf[0] != '\0') {
 		ret = -LIBBPF_ERRNO__VERIFY;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index c6813791fa7e..8d2f1194cb02 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -610,11 +610,6 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo,
 
 /*
  * Probe for supported system features
- *
- * Note that running many of these probes in a short amount of time can cause
- * the kernel to reach the maximal size of lockable memory allowed for the
- * user, causing subsequent probes to fail. In this case, the caller may want
- * to adjust that limit with setrlimit().
  */
 LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type,
 				    __u32 ifindex);
-- 
2.26.2



* [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (28 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:00   ` Song Liu
  2020-07-28  6:00   ` Andrii Nakryiko
  2020-07-27 18:45 ` [PATCH bpf-next v2 31/35] bpf: runqslower: don't " Roman Gushchin
                   ` (4 subsequent siblings)
  34 siblings, 2 replies; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Since bpf no longer uses the memlock rlimit to limit memory usage,
bpftool has no reason to raise its own limit.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 tools/bpf/bpftool/common.c     | 7 -------
 tools/bpf/bpftool/feature.c    | 2 --
 tools/bpf/bpftool/main.h       | 2 --
 tools/bpf/bpftool/map.c        | 2 --
 tools/bpf/bpftool/pids.c       | 1 -
 tools/bpf/bpftool/prog.c       | 3 ---
 tools/bpf/bpftool/struct_ops.c | 2 --
 7 files changed, 19 deletions(-)

diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 65303664417e..01b87e8c3040 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -109,13 +109,6 @@ static bool is_bpffs(char *path)
 	return (unsigned long)st_fs.f_type == BPF_FS_MAGIC;
 }
 
-void set_max_rlimit(void)
-{
-	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
-
-	setrlimit(RLIMIT_MEMLOCK, &rinf);
-}
-
 static int
 mnt_fs(const char *target, const char *type, char *buff, size_t bufflen)
 {
diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c
index 1cd75807673e..2d6c6bff934e 100644
--- a/tools/bpf/bpftool/feature.c
+++ b/tools/bpf/bpftool/feature.c
@@ -885,8 +885,6 @@ static int do_probe(int argc, char **argv)
 	__u32 ifindex = 0;
 	char *ifname;
 
-	set_max_rlimit();
-
 	while (argc) {
 		if (is_prefix(*argv, "kernel")) {
 			if (target != COMPONENT_UNSPEC) {
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index e3a79b5a9960..0a3bd1ff14da 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -95,8 +95,6 @@ int detect_common_prefix(const char *arg, ...);
 void fprint_hex(FILE *f, void *arg, unsigned int n, const char *sep);
 void usage(void) __noreturn;
 
-void set_max_rlimit(void);
-
 int mount_tracefs(const char *target);
 
 struct pinned_obj_table {
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 3a27d31a1856..f08b9e707511 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -1315,8 +1315,6 @@ static int do_create(int argc, char **argv)
 		return -1;
 	}
 
-	set_max_rlimit();
-
 	fd = bpf_create_map_xattr(&attr);
 	if (fd < 0) {
 		p_err("map create failed: %s", strerror(errno));
diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
index e3b116325403..4c559a8ae4e8 100644
--- a/tools/bpf/bpftool/pids.c
+++ b/tools/bpf/bpftool/pids.c
@@ -96,7 +96,6 @@ int build_obj_refs_table(struct obj_refs_table *table, enum bpf_obj_type type)
 	libbpf_print_fn_t default_print;
 
 	hash_init(table->table);
-	set_max_rlimit();
 
 	skel = pid_iter_bpf__open();
 	if (!skel) {
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 3e6ecc6332e2..40e50db60332 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1291,8 +1291,6 @@ static int load_with_options(int argc, char **argv, bool first_prog_only)
 		}
 	}
 
-	set_max_rlimit();
-
 	obj = bpf_object__open_file(file, &open_opts);
 	if (IS_ERR_OR_NULL(obj)) {
 		p_err("failed to open object file");
@@ -1833,7 +1831,6 @@ static int do_profile(int argc, char **argv)
 		}
 	}
 
-	set_max_rlimit();
 	err = profiler_bpf__load(profile_obj);
 	if (err) {
 		p_err("failed to load profile_obj");
diff --git a/tools/bpf/bpftool/struct_ops.c b/tools/bpf/bpftool/struct_ops.c
index b58b91f62ffb..0915e1e9b7c0 100644
--- a/tools/bpf/bpftool/struct_ops.c
+++ b/tools/bpf/bpftool/struct_ops.c
@@ -498,8 +498,6 @@ static int do_register(int argc, char **argv)
 	if (IS_ERR_OR_NULL(obj))
 		return -1;
 
-	set_max_rlimit();
-
 	load_attr.obj = obj;
 	if (verifier_logs)
 		/* log_level1 + log_level2 + stats, but not stable UAPI */
-- 
2.26.2



* [PATCH bpf-next v2 31/35] bpf: runqslower: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (29 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:03   ` Andrii Nakryiko
  2020-07-27 18:45 ` [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h Roman Gushchin
                   ` (3 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Since bpf no longer uses the memlock rlimit for memory accounting,
there is no reason left to bump the limit.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
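Note for out-of-tree tools: a tool that must also run on kernels without
memcg-based bpf accounting may still want to raise the limit. A minimal
sketch, assuming a hypothetical helper try_bump_memlock_rlimit() that is
not part of this series; unlike the caller of the removed
bump_memlock_rlimit(), it treats a failure as non-fatal:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper, not part of this series: try to lift
 * RLIMIT_MEMLOCK, but only report a failure instead of aborting,
 * because with memcg-based accounting the limit no longer matters
 * for bpf. Returns true if the limit was raised.
 */
static bool try_bump_memlock_rlimit(void)
{
	struct rlimit rlim_new = {
		.rlim_cur	= RLIM_INFINITY,
		.rlim_max	= RLIM_INFINITY,
	};

	if (setrlimit(RLIMIT_MEMLOCK, &rlim_new)) {
		fprintf(stderr, "note: RLIMIT_MEMLOCK not raised: %s\n",
			strerror(errno));
		return false;
	}

	return true;
}
```

A caller would invoke it once before opening the BPF object and carry on
regardless of the result.
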
 tools/bpf/runqslower/runqslower.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/tools/bpf/runqslower/runqslower.c b/tools/bpf/runqslower/runqslower.c
index d89715844952..a3380b53ce0c 100644
--- a/tools/bpf/runqslower/runqslower.c
+++ b/tools/bpf/runqslower/runqslower.c
@@ -88,16 +88,6 @@ int libbpf_print_fn(enum libbpf_print_level level,
 	return vfprintf(stderr, format, args);
 }
 
-static int bump_memlock_rlimit(void)
-{
-	struct rlimit rlim_new = {
-		.rlim_cur	= RLIM_INFINITY,
-		.rlim_max	= RLIM_INFINITY,
-	};
-
-	return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
-}
-
 void handle_event(void *ctx, int cpu, void *data, __u32 data_sz)
 {
 	const struct event *e = data;
@@ -134,12 +124,6 @@ int main(int argc, char **argv)
 
 	libbpf_set_print(libbpf_print_fn);
 
-	err = bump_memlock_rlimit();
-	if (err) {
-		fprintf(stderr, "failed to increase rlimit: %d", err);
-		return 1;
-	}
-
 	obj = runqslower_bpf__open();
 	if (!obj) {
 		fprintf(stderr, "failed to open and/or load BPF object\n");
-- 
2.26.2



* [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (30 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 31/35] bpf: runqslower: don't " Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:06   ` Andrii Nakryiko
  2020-07-27 18:45 ` [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK Roman Gushchin
                   ` (2 subsequent siblings)
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

As rlimit-based memory accounting is no longer used by bpf,
there is no reason left to adjust the memlock rlimit.

Delete bpf_rlimit.h, which contained the code to bump the limit.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 samples/bpf/hbm.c                             |  1 -
 tools/testing/selftests/bpf/bpf_rlimit.h      | 28 -------------------
 .../selftests/bpf/flow_dissector_load.c       |  1 -
 .../selftests/bpf/get_cgroup_id_user.c        |  1 -
 .../bpf/prog_tests/select_reuseport.c         |  1 -
 .../selftests/bpf/prog_tests/sk_lookup.c      |  1 -
 tools/testing/selftests/bpf/test_btf.c        |  1 -
 .../selftests/bpf/test_cgroup_storage.c       |  1 -
 tools/testing/selftests/bpf/test_dev_cgroup.c |  1 -
 tools/testing/selftests/bpf/test_lpm_map.c    |  1 -
 tools/testing/selftests/bpf/test_lru_map.c    |  1 -
 tools/testing/selftests/bpf/test_maps.c       |  1 -
 tools/testing/selftests/bpf/test_netcnt.c     |  1 -
 tools/testing/selftests/bpf/test_progs.c      |  1 -
 .../selftests/bpf/test_skb_cgroup_id_user.c   |  1 -
 tools/testing/selftests/bpf/test_sock.c       |  1 -
 tools/testing/selftests/bpf/test_sock_addr.c  |  1 -
 .../testing/selftests/bpf/test_sock_fields.c  |  1 -
 .../selftests/bpf/test_socket_cookie.c        |  1 -
 tools/testing/selftests/bpf/test_sockmap.c    |  1 -
 tools/testing/selftests/bpf/test_sysctl.c     |  1 -
 tools/testing/selftests/bpf/test_tag.c        |  1 -
 .../bpf/test_tcp_check_syncookie_user.c       |  1 -
 .../testing/selftests/bpf/test_tcpbpf_user.c  |  1 -
 .../selftests/bpf/test_tcpnotify_user.c       |  1 -
 tools/testing/selftests/bpf/test_verifier.c   |  1 -
 .../testing/selftests/bpf/test_verifier_log.c |  2 --
 27 files changed, 55 deletions(-)
 delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h

diff --git a/samples/bpf/hbm.c b/samples/bpf/hbm.c
index 7d7153777678..e4b38ceb20a7 100644
--- a/samples/bpf/hbm.c
+++ b/samples/bpf/hbm.c
@@ -46,7 +46,6 @@
 #include <getopt.h>
 
 #include "bpf_load.h"
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 #include "hbm.h"
 #include "bpf_util.h"
diff --git a/tools/testing/selftests/bpf/bpf_rlimit.h b/tools/testing/selftests/bpf/bpf_rlimit.h
deleted file mode 100644
index 9dac9b30f8ef..000000000000
--- a/tools/testing/selftests/bpf/bpf_rlimit.h
+++ /dev/null
@@ -1,28 +0,0 @@
-#include <sys/resource.h>
-#include <stdio.h>
-
-static  __attribute__((constructor)) void bpf_rlimit_ctor(void)
-{
-	struct rlimit rlim_old, rlim_new = {
-		.rlim_cur	= RLIM_INFINITY,
-		.rlim_max	= RLIM_INFINITY,
-	};
-
-	getrlimit(RLIMIT_MEMLOCK, &rlim_old);
-	/* For the sake of running the test cases, we temporarily
-	 * set rlimit to infinity in order for kernel to focus on
-	 * errors from actual test cases and not getting noise
-	 * from hitting memlock limits. The limit is on per-process
-	 * basis and not a global one, hence destructor not really
-	 * needed here.
-	 */
-	if (setrlimit(RLIMIT_MEMLOCK, &rlim_new) < 0) {
-		perror("Unable to lift memlock rlimit");
-		/* Trying out lower limit, but expect potential test
-		 * case failures from this!
-		 */
-		rlim_new.rlim_cur = rlim_old.rlim_cur + (1UL << 20);
-		rlim_new.rlim_max = rlim_old.rlim_max + (1UL << 20);
-		setrlimit(RLIMIT_MEMLOCK, &rlim_new);
-	}
-}
diff --git a/tools/testing/selftests/bpf/flow_dissector_load.c b/tools/testing/selftests/bpf/flow_dissector_load.c
index 3fd83b9dc1bf..75818141f318 100644
--- a/tools/testing/selftests/bpf/flow_dissector_load.c
+++ b/tools/testing/selftests/bpf/flow_dissector_load.c
@@ -11,7 +11,6 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
-#include "bpf_rlimit.h"
 #include "flow_dissector_load.h"
 
 const char *cfg_pin_path = "/sys/fs/bpf/flow_dissector";
diff --git a/tools/testing/selftests/bpf/get_cgroup_id_user.c b/tools/testing/selftests/bpf/get_cgroup_id_user.c
index e8da7b39158d..597bc70286f2 100644
--- a/tools/testing/selftests/bpf/get_cgroup_id_user.c
+++ b/tools/testing/selftests/bpf/get_cgroup_id_user.c
@@ -19,7 +19,6 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 
 #define CHECK(condition, tag, format...) ({		\
 	int __ret = !!(condition);			\
diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
index 821b4146b7b6..520c8de8ee03 100644
--- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
+++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
@@ -18,7 +18,6 @@
 #include <netinet/in.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 
 #include "test_progs.h"
diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
index 9bbd2b2b7630..9d3faf6cf92d 100644
--- a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
+++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
@@ -30,7 +30,6 @@
 #include <bpf/bpf.h>
 
 #include "test_progs.h"
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "cgroup_helpers.h"
 #include "network_helpers.h"
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index 305fae8f80a9..e4b7bd9e3abf 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -22,7 +22,6 @@
 #include <bpf/libbpf.h>
 #include <bpf/btf.h>
 
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "test_btf.h"
 
diff --git a/tools/testing/selftests/bpf/test_cgroup_storage.c b/tools/testing/selftests/bpf/test_cgroup_storage.c
index 655729004391..0bde741ad84c 100644
--- a/tools/testing/selftests/bpf/test_cgroup_storage.c
+++ b/tools/testing/selftests/bpf/test_cgroup_storage.c
@@ -6,7 +6,6 @@
 #include <stdlib.h>
 #include <sys/sysinfo.h>
 
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 
 char bpf_log_buf[BPF_LOG_BUF_SIZE];
diff --git a/tools/testing/selftests/bpf/test_dev_cgroup.c b/tools/testing/selftests/bpf/test_dev_cgroup.c
index d850fb9076b5..4d6df9d99d50 100644
--- a/tools/testing/selftests/bpf/test_dev_cgroup.c
+++ b/tools/testing/selftests/bpf/test_dev_cgroup.c
@@ -14,7 +14,6 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 
 #define DEV_CGROUP_PROG "./dev_cgroup.o"
 
diff --git a/tools/testing/selftests/bpf/test_lpm_map.c b/tools/testing/selftests/bpf/test_lpm_map.c
index 006be3963977..ec595b5135e2 100644
--- a/tools/testing/selftests/bpf/test_lpm_map.c
+++ b/tools/testing/selftests/bpf/test_lpm_map.c
@@ -26,7 +26,6 @@
 #include <bpf/bpf.h>
 
 #include "bpf_util.h"
-#include "bpf_rlimit.h"
 
 struct tlpm_node {
 	struct tlpm_node *next;
diff --git a/tools/testing/selftests/bpf/test_lru_map.c b/tools/testing/selftests/bpf/test_lru_map.c
index 6a5349f9eb14..76748ff51de8 100644
--- a/tools/testing/selftests/bpf/test_lru_map.c
+++ b/tools/testing/selftests/bpf/test_lru_map.c
@@ -18,7 +18,6 @@
 #include <bpf/libbpf.h>
 
 #include "bpf_util.h"
-#include "bpf_rlimit.h"
 #include "../../../include/linux/filter.h"
 
 #define LOCAL_FREE_TARGET	(128)
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 754cf611723e..350fee74a6b3 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -23,7 +23,6 @@
 #include <bpf/libbpf.h>
 
 #include "bpf_util.h"
-#include "bpf_rlimit.h"
 #include "test_maps.h"
 
 #ifndef ENOTSUPP
diff --git a/tools/testing/selftests/bpf/test_netcnt.c b/tools/testing/selftests/bpf/test_netcnt.c
index c1da5404454a..7a3e07b4627d 100644
--- a/tools/testing/selftests/bpf/test_netcnt.c
+++ b/tools/testing/selftests/bpf/test_netcnt.c
@@ -12,7 +12,6 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 #include "netcnt_common.h"
 
 #define BPF_PROG "./netcnt_prog.o"
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index b1e4dadacd9b..406716d305dc 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -4,7 +4,6 @@
 #define _GNU_SOURCE
 #include "test_progs.h"
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 #include <argp.h>
 #include <pthread.h>
 #include <sched.h>
diff --git a/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c b/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c
index 356351c0ac28..8155e2c1d6ce 100644
--- a/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c
+++ b/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c
@@ -15,7 +15,6 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 
 #define CGROUP_PATH		"/skb_cgroup_test"
diff --git a/tools/testing/selftests/bpf/test_sock.c b/tools/testing/selftests/bpf/test_sock.c
index 52bf14955797..cd1ebce8b1a7 100644
--- a/tools/testing/selftests/bpf/test_sock.c
+++ b/tools/testing/selftests/bpf/test_sock.c
@@ -14,7 +14,6 @@
 
 #include "cgroup_helpers.h"
 #include <bpf/bpf_endian.h>
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 
 #define CG_PATH		"/foo"
diff --git a/tools/testing/selftests/bpf/test_sock_addr.c b/tools/testing/selftests/bpf/test_sock_addr.c
index 0358814c67dc..7b8cd4fafb3d 100644
--- a/tools/testing/selftests/bpf/test_sock_addr.c
+++ b/tools/testing/selftests/bpf/test_sock_addr.c
@@ -19,7 +19,6 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 
 #ifndef ENOTSUPP
diff --git a/tools/testing/selftests/bpf/test_sock_fields.c b/tools/testing/selftests/bpf/test_sock_fields.c
index f0fc103261a4..8ffdda96aeb6 100644
--- a/tools/testing/selftests/bpf/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/test_sock_fields.c
@@ -14,7 +14,6 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 
 enum bpf_addr_array_idx {
 	ADDR_SRV_IDX,
diff --git a/tools/testing/selftests/bpf/test_socket_cookie.c b/tools/testing/selftests/bpf/test_socket_cookie.c
index 15653b0e26eb..998efb7158b7 100644
--- a/tools/testing/selftests/bpf/test_socket_cookie.c
+++ b/tools/testing/selftests/bpf/test_socket_cookie.c
@@ -12,7 +12,6 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 
 #define CG_PATH			"/foo"
diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 78789b27e573..7094b93f44ec 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -37,7 +37,6 @@
 #include <bpf/libbpf.h>
 
 #include "bpf_util.h"
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 
 int running;
diff --git a/tools/testing/selftests/bpf/test_sysctl.c b/tools/testing/selftests/bpf/test_sysctl.c
index d196e2a4a6e0..b5fd51efb4c7 100644
--- a/tools/testing/selftests/bpf/test_sysctl.c
+++ b/tools/testing/selftests/bpf/test_sysctl.c
@@ -14,7 +14,6 @@
 #include <bpf/libbpf.h>
 
 #include <bpf/bpf_endian.h>
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "cgroup_helpers.h"
 
diff --git a/tools/testing/selftests/bpf/test_tag.c b/tools/testing/selftests/bpf/test_tag.c
index 6272c784ca2a..bcbf14dd00e1 100644
--- a/tools/testing/selftests/bpf/test_tag.c
+++ b/tools/testing/selftests/bpf/test_tag.c
@@ -20,7 +20,6 @@
 #include <bpf/bpf.h>
 
 #include "../../../include/linux/filter.h"
-#include "bpf_rlimit.h"
 
 static struct bpf_insn prog[BPF_MAXINSNS];
 
diff --git a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c b/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c
index b9e991d43155..894eb0710d6f 100644
--- a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c
+++ b/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c
@@ -15,7 +15,6 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
-#include "bpf_rlimit.h"
 #include "cgroup_helpers.h"
 
 static int start_server(const struct sockaddr *addr, socklen_t len)
diff --git a/tools/testing/selftests/bpf/test_tcpbpf_user.c b/tools/testing/selftests/bpf/test_tcpbpf_user.c
index 3ae127620463..100393afeb12 100644
--- a/tools/testing/selftests/bpf/test_tcpbpf_user.c
+++ b/tools/testing/selftests/bpf/test_tcpbpf_user.c
@@ -10,7 +10,6 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "cgroup_helpers.h"
 
diff --git a/tools/testing/selftests/bpf/test_tcpnotify_user.c b/tools/testing/selftests/bpf/test_tcpnotify_user.c
index f9765ddf0761..9d14fedd47e4 100644
--- a/tools/testing/selftests/bpf/test_tcpnotify_user.c
+++ b/tools/testing/selftests/bpf/test_tcpnotify_user.c
@@ -19,7 +19,6 @@
 #include <linux/perf_event.h>
 #include <linux/err.h>
 
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "cgroup_helpers.h"
 
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 78a6bae56ea6..7c5e005c237f 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -41,7 +41,6 @@
 #  define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 # endif
 #endif
-#include "bpf_rlimit.h"
 #include "bpf_rand.h"
 #include "bpf_util.h"
 #include "test_btf.h"
diff --git a/tools/testing/selftests/bpf/test_verifier_log.c b/tools/testing/selftests/bpf/test_verifier_log.c
index 8d6918c3b4a2..4bca0a7344cc 100644
--- a/tools/testing/selftests/bpf/test_verifier_log.c
+++ b/tools/testing/selftests/bpf/test_verifier_log.c
@@ -11,8 +11,6 @@
 
 #include <bpf/bpf.h>
 
-#include "bpf_rlimit.h"
-
 #define LOG_SIZE (1 << 20)
 
 #define err(str...)	printf("ERROR: " str)
-- 
2.26.2



* [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (31 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:08   ` Andrii Nakryiko
  2020-07-27 18:45 ` [PATCH bpf-next v2 34/35] bpf: samples: do not " Roman Gushchin
  2020-07-27 18:45 ` [PATCH bpf-next v2 35/35] perf: don't " Roman Gushchin
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Since bpf no longer uses the memlock rlimit for memory accounting,
there is no reason left to bump the limit.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
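Note: with locked_vm gone from the iterator output, memory charged to bpf
objects would be observed through the memory cgroup instead. A minimal
sketch, assuming cgroup v2 and a hypothetical helper read_cgroup_counter()
that is not part of this series; it parses a single-value file such as
memory.current:

```c
#include <stdio.h>

/* Hypothetical helper, not part of this series: read one unsigned
 * counter (e.g. the contents of a cgroup v2 "memory.current" file)
 * from the file at path. Returns 0 on success, -1 on any error.
 */
static int read_cgroup_counter(const char *path, unsigned long long *val)
{
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;

	/* The file is expected to hold a single decimal number. */
	if (fscanf(f, "%llu", val) == 1)
		ret = 0;

	fclose(f);
	return ret;
}
```

For example, read_cgroup_counter("/sys/fs/cgroup/<group>/memory.current",
&bytes), where the mount point and the group name are assumptions about
the local setup. The value covers all memory charged to the cgroup, not
only bpf objects.
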
 tools/testing/selftests/bpf/bench.c           | 16 ---------------
 .../selftests/bpf/progs/bpf_iter_bpf_map.c    |  7 +++----
 tools/testing/selftests/bpf/xdping.c          |  6 ------
 tools/testing/selftests/net/reuseport_bpf.c   | 20 -------------------
 4 files changed, 3 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 944ad4721c83..f66610541c8a 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -29,25 +29,9 @@ static int libbpf_print_fn(enum libbpf_print_level level,
 	return vfprintf(stderr, format, args);
 }
 
-static int bump_memlock_rlimit(void)
-{
-	struct rlimit rlim_new = {
-		.rlim_cur	= RLIM_INFINITY,
-		.rlim_max	= RLIM_INFINITY,
-	};
-
-	return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
-}
-
 void setup_libbpf()
 {
-	int err;
-
 	libbpf_set_print(libbpf_print_fn);
-
-	err = bump_memlock_rlimit();
-	if (err)
-		fprintf(stderr, "failed to increase RLIMIT_MEMLOCK: %d", err);
 }
 
 void hits_drops_report_progress(int iter, struct bench_res *res, long delta_ns)
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c
index 08651b23edba..5fe76df58dd4 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c
@@ -19,10 +19,9 @@ int dump_bpf_map(struct bpf_iter__bpf_map *ctx)
 	}
 
 	if (seq_num == 0)
-		BPF_SEQ_PRINTF(seq, "      id   refcnt  usercnt  locked_vm\n");
+		BPF_SEQ_PRINTF(seq, "      id   refcnt  usercnt\n");
 
-	BPF_SEQ_PRINTF(seq, "%8u %8ld %8ld %10lu\n", map->id, map->refcnt.counter,
-		       map->usercnt.counter,
-		       map->memory.user->locked_vm.counter);
+	BPF_SEQ_PRINTF(seq, "%8u %8ld %8ld\n", map->id, map->refcnt.counter,
+		       map->usercnt.counter);
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/xdping.c b/tools/testing/selftests/bpf/xdping.c
index 842d9155d36c..488021169171 100644
--- a/tools/testing/selftests/bpf/xdping.c
+++ b/tools/testing/selftests/bpf/xdping.c
@@ -88,7 +88,6 @@ int main(int argc, char **argv)
 {
 	__u32 mode_flags = XDP_FLAGS_DRV_MODE | XDP_FLAGS_SKB_MODE;
 	struct addrinfo *a, hints = { .ai_family = AF_INET };
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	__u16 count = XDPING_DEFAULT_COUNT;
 	struct pinginfo pinginfo = { 0 };
 	const char *optstr = "c:I:NsS";
@@ -166,11 +165,6 @@ int main(int argc, char **argv)
 		freeaddrinfo(a);
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 
 	if (bpf_prog_load(filename, BPF_PROG_TYPE_XDP, &obj, &prog_fd)) {
diff --git a/tools/testing/selftests/net/reuseport_bpf.c b/tools/testing/selftests/net/reuseport_bpf.c
index b5277106df1f..88709898bae5 100644
--- a/tools/testing/selftests/net/reuseport_bpf.c
+++ b/tools/testing/selftests/net/reuseport_bpf.c
@@ -437,26 +437,6 @@ void enable_fastopen(void)
 	}
 }
 
-static struct rlimit rlim_old;
-
-static  __attribute__((constructor)) void main_ctor(void)
-{
-	getrlimit(RLIMIT_MEMLOCK, &rlim_old);
-
-	if (rlim_old.rlim_cur != RLIM_INFINITY) {
-		struct rlimit rlim_new;
-
-		rlim_new.rlim_cur = rlim_old.rlim_cur + (1UL << 20);
-		rlim_new.rlim_max = rlim_old.rlim_max + (1UL << 20);
-		setrlimit(RLIMIT_MEMLOCK, &rlim_new);
-	}
-}
-
-static __attribute__((destructor)) void main_dtor(void)
-{
-	setrlimit(RLIMIT_MEMLOCK, &rlim_old);
-}
-
 int main(void)
 {
 	fprintf(stderr, "---- IPv4 UDP ----\n");
-- 
2.26.2



* [PATCH bpf-next v2 34/35] bpf: samples: do not touch RLIMIT_MEMLOCK
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (32 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:14   ` Song Liu
  2020-07-27 18:45 ` [PATCH bpf-next v2 35/35] perf: don't " Roman Gushchin
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Since bpf no longer uses the memlock rlimit for memory accounting
and control, do not change the limit in sample applications.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 samples/bpf/map_perf_test_user.c    | 11 -----------
 samples/bpf/offwaketime_user.c      |  2 --
 samples/bpf/sockex2_user.c          |  2 --
 samples/bpf/sockex3_user.c          |  2 --
 samples/bpf/spintest_user.c         |  2 --
 samples/bpf/syscall_tp_user.c       |  2 --
 samples/bpf/task_fd_query_user.c    |  5 -----
 samples/bpf/test_lru_dist.c         |  3 ---
 samples/bpf/test_map_in_map_user.c  |  9 ---------
 samples/bpf/test_overhead_user.c    |  2 --
 samples/bpf/trace_event_user.c      |  2 --
 samples/bpf/tracex2_user.c          |  6 ------
 samples/bpf/tracex3_user.c          |  6 ------
 samples/bpf/tracex4_user.c          |  6 ------
 samples/bpf/tracex5_user.c          |  3 ---
 samples/bpf/tracex6_user.c          |  3 ---
 samples/bpf/xdp1_user.c             |  6 ------
 samples/bpf/xdp_adjust_tail_user.c  |  6 ------
 samples/bpf/xdp_monitor_user.c      |  6 ------
 samples/bpf/xdp_redirect_cpu_user.c |  6 ------
 samples/bpf/xdp_redirect_map_user.c |  6 ------
 samples/bpf/xdp_redirect_user.c     |  6 ------
 samples/bpf/xdp_router_ipv4_user.c  |  6 ------
 samples/bpf/xdp_rxq_info_user.c     |  6 ------
 samples/bpf/xdp_sample_pkts_user.c  |  6 ------
 samples/bpf/xdp_tx_iptunnel_user.c  |  6 ------
 samples/bpf/xdpsock_user.c          |  7 -------
 27 files changed, 133 deletions(-)

diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index 8b13230b4c46..4c198bc55beb 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -421,20 +421,9 @@ static void fixup_map(struct bpf_object *obj)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
-	int nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
-	struct bpf_link *links[8];
-	struct bpf_program *prog;
-	struct bpf_object *obj;
-	struct bpf_map *map;
 	char filename[256];
 	int i = 0;
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	if (argc > 1)
 		test_flags = atoi(argv[1]) ? : test_flags;
 
diff --git a/samples/bpf/offwaketime_user.c b/samples/bpf/offwaketime_user.c
index 51c7da5341cc..9e51dd011a2a 100644
--- a/samples/bpf/offwaketime_user.c
+++ b/samples/bpf/offwaketime_user.c
@@ -95,12 +95,10 @@ static void int_exit(int sig)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	char filename[256];
 	int delay = 1;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	signal(SIGINT, int_exit);
 	signal(SIGTERM, int_exit);
diff --git a/samples/bpf/sockex2_user.c b/samples/bpf/sockex2_user.c
index af925a5afd1d..bafa567b840c 100644
--- a/samples/bpf/sockex2_user.c
+++ b/samples/bpf/sockex2_user.c
@@ -16,7 +16,6 @@ struct pair {
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_object *obj;
 	int map_fd, prog_fd;
 	char filename[256];
@@ -24,7 +23,6 @@ int main(int ac, char **argv)
 	FILE *f;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	if (bpf_prog_load(filename, BPF_PROG_TYPE_SOCKET_FILTER,
 			  &obj, &prog_fd))
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index 4dbee7427d47..6ee7b7a4b9b7 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -26,7 +26,6 @@ struct pair {
 int main(int argc, char **argv)
 {
 	int i, sock, key, fd, main_prog_fd, jmp_table_fd, hash_map_fd;
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_program *prog;
 	struct bpf_object *obj;
 	char filename[256];
@@ -34,7 +33,6 @@ int main(int argc, char **argv)
 	FILE *f;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/spintest_user.c b/samples/bpf/spintest_user.c
index fb430ea2ef51..458f1439e670 100644
--- a/samples/bpf/spintest_user.c
+++ b/samples/bpf/spintest_user.c
@@ -11,14 +11,12 @@
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	long key, next_key, value;
 	char filename[256];
 	struct ksym *sym;
 	int i;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	if (load_kallsyms()) {
 		printf("failed to process /proc/kallsyms\n");
diff --git a/samples/bpf/syscall_tp_user.c b/samples/bpf/syscall_tp_user.c
index 57014bab7cbe..caa3891ee774 100644
--- a/samples/bpf/syscall_tp_user.c
+++ b/samples/bpf/syscall_tp_user.c
@@ -85,7 +85,6 @@ static int test(char *filename, int num_progs)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	int opt, num_progs = 1;
 	char filename[256];
 
@@ -101,7 +100,6 @@ int main(int argc, char **argv)
 		}
 	}
 
-	setrlimit(RLIMIT_MEMLOCK, &r);
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 
 	return test(filename, num_progs);
diff --git a/samples/bpf/task_fd_query_user.c b/samples/bpf/task_fd_query_user.c
index ff2e9c1c7266..e2c1cacb781c 100644
--- a/samples/bpf/task_fd_query_user.c
+++ b/samples/bpf/task_fd_query_user.c
@@ -290,16 +290,11 @@ static int test_debug_fs_uprobe(char *binary_path, long offset, bool is_return)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {1024*1024, RLIM_INFINITY};
 	extern char __executable_start;
 	char filename[256], buf[256];
 	__u64 uprobe_file_offset;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
 
 	if (load_kallsyms()) {
 		printf("failed to process /proc/kallsyms\n");
diff --git a/samples/bpf/test_lru_dist.c b/samples/bpf/test_lru_dist.c
index b313dba4111b..c92c5c06b965 100644
--- a/samples/bpf/test_lru_dist.c
+++ b/samples/bpf/test_lru_dist.c
@@ -489,7 +489,6 @@ static void test_parallel_lru_loss(int map_type, int map_flags, int nr_tasks)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	int map_flags[] = {0, BPF_F_NO_COMMON_LRU};
 	const char *dist_file;
 	int nr_tasks = 1;
@@ -508,8 +507,6 @@ int main(int argc, char **argv)
 
 	setbuf(stdout, NULL);
 
-	assert(!setrlimit(RLIMIT_MEMLOCK, &r));
-
 	srand(time(NULL));
 
 	nr_cpus = bpf_num_possible_cpus();
diff --git a/samples/bpf/test_map_in_map_user.c b/samples/bpf/test_map_in_map_user.c
index 98656de56b83..0e65753a157a 100644
--- a/samples/bpf/test_map_in_map_user.c
+++ b/samples/bpf/test_map_in_map_user.c
@@ -114,17 +114,8 @@ static void test_map_in_map(void)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
-	struct bpf_link *link = NULL;
-	struct bpf_program *prog;
-	struct bpf_object *obj;
 	char filename[256];
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/test_overhead_user.c b/samples/bpf/test_overhead_user.c
index 94f74112a20e..c100fd46cd8a 100644
--- a/samples/bpf/test_overhead_user.c
+++ b/samples/bpf/test_overhead_user.c
@@ -125,12 +125,10 @@ static void unload_progs(void)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	char filename[256];
 	int num_cpu = 8;
 	int test_flags = ~0;
 
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	if (argc > 1)
 		test_flags = atoi(argv[1]) ? : test_flags;
diff --git a/samples/bpf/trace_event_user.c b/samples/bpf/trace_event_user.c
index ac1ba368195c..9664749bf618 100644
--- a/samples/bpf/trace_event_user.c
+++ b/samples/bpf/trace_event_user.c
@@ -294,13 +294,11 @@ static void test_bpf_perf_event(void)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_object *obj = NULL;
 	char filename[256];
 	int error = 1;
 
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
-	setrlimit(RLIMIT_MEMLOCK, &r);
 
 	signal(SIGINT, err_exit);
 	signal(SIGTERM, err_exit);
diff --git a/samples/bpf/tracex2_user.c b/samples/bpf/tracex2_user.c
index 3e36b3e4e3ef..1626d51dfffd 100644
--- a/samples/bpf/tracex2_user.c
+++ b/samples/bpf/tracex2_user.c
@@ -116,7 +116,6 @@ static void int_exit(int sig)
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {1024*1024, RLIM_INFINITY};
 	long key, next_key, value;
 	struct bpf_link *links[2];
 	struct bpf_program *prog;
@@ -125,11 +124,6 @@ int main(int ac, char **argv)
 	int i, j = 0;
 	FILE *f;
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/tracex3_user.c b/samples/bpf/tracex3_user.c
index 70e987775c15..33e16ba39f25 100644
--- a/samples/bpf/tracex3_user.c
+++ b/samples/bpf/tracex3_user.c
@@ -107,7 +107,6 @@ static void print_hist(int fd)
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {1024*1024, RLIM_INFINITY};
 	struct bpf_link *links[2];
 	struct bpf_program *prog;
 	struct bpf_object *obj;
@@ -127,11 +126,6 @@ int main(int ac, char **argv)
 		}
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/tracex4_user.c b/samples/bpf/tracex4_user.c
index e8faf8f184ae..cea399424bca 100644
--- a/samples/bpf/tracex4_user.c
+++ b/samples/bpf/tracex4_user.c
@@ -48,18 +48,12 @@ static void print_old_objects(int fd)
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_link *links[2];
 	struct bpf_program *prog;
 	struct bpf_object *obj;
 	char filename[256];
 	int map_fd, i, j = 0;
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/tracex5_user.c b/samples/bpf/tracex5_user.c
index 98dad57a96c4..1549fa3ec65c 100644
--- a/samples/bpf/tracex5_user.c
+++ b/samples/bpf/tracex5_user.c
@@ -34,7 +34,6 @@ static void install_accept_all_seccomp(void)
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_link *link = NULL;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
@@ -43,8 +42,6 @@ int main(int ac, char **argv)
 	const char *title;
 	FILE *f;
 
-	setrlimit(RLIMIT_MEMLOCK, &r);
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/tracex6_user.c b/samples/bpf/tracex6_user.c
index 33df9784775d..28296f40c133 100644
--- a/samples/bpf/tracex6_user.c
+++ b/samples/bpf/tracex6_user.c
@@ -175,15 +175,12 @@ static void test_bpf_perf_event(void)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_link *links[2];
 	struct bpf_program *prog;
 	struct bpf_object *obj;
 	char filename[256];
 	int i = 0;
 
-	setrlimit(RLIMIT_MEMLOCK, &r);
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	obj = bpf_object__open_file(filename, NULL);
 	if (libbpf_get_error(obj)) {
diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
index c447ad9e3a1d..116e39f6b666 100644
--- a/samples/bpf/xdp1_user.c
+++ b/samples/bpf/xdp1_user.c
@@ -79,7 +79,6 @@ static void usage(const char *prog)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -117,11 +116,6 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	ifindex = if_nametoindex(argv[optind]);
 	if (!ifindex) {
 		perror("if_nametoindex");
diff --git a/samples/bpf/xdp_adjust_tail_user.c b/samples/bpf/xdp_adjust_tail_user.c
index ba482dc3da33..a70b094c8ec5 100644
--- a/samples/bpf/xdp_adjust_tail_user.c
+++ b/samples/bpf/xdp_adjust_tail_user.c
@@ -82,7 +82,6 @@ static void usage(const char *cmd)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -143,11 +142,6 @@ int main(int argc, char **argv)
 		}
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
-		return 1;
-	}
-
 	if (!ifindex) {
 		fprintf(stderr, "Invalid ifname\n");
 		return 1;
diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c
index ef53b93db573..25e6a24f8d7b 100644
--- a/samples/bpf/xdp_monitor_user.c
+++ b/samples/bpf/xdp_monitor_user.c
@@ -645,7 +645,6 @@ static void print_bpf_prog_info(void)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	int longindex = 0, opt;
 	int ret = EXIT_SUCCESS;
 	char bpf_obj_file[256];
@@ -676,11 +675,6 @@ int main(int argc, char **argv)
 		}
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return EXIT_FAILURE;
-	}
-
 	if (load_bpf_file(bpf_obj_file)) {
 		printf("ERROR - bpf_log_buf: %s", bpf_log_buf);
 		return EXIT_FAILURE;
diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c
index 004c0622c913..6773027b2a89 100644
--- a/samples/bpf/xdp_redirect_cpu_user.c
+++ b/samples/bpf/xdp_redirect_cpu_user.c
@@ -779,7 +779,6 @@ static int load_cpumap_prog(char *file_name, char *prog_name,
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
 	char *prog_name = "xdp_cpu_map5_lb_hash_ip_pairs";
 	char *mprog_filename = "xdp_redirect_kern.o";
 	char *redir_interface = NULL, *redir_map = NULL;
@@ -818,11 +817,6 @@ int main(int argc, char **argv)
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	prog_load_attr.file = filename;
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
 		return EXIT_FAIL;
 
diff --git a/samples/bpf/xdp_redirect_map_user.c b/samples/bpf/xdp_redirect_map_user.c
index 35e16dee613e..31131b6e7782 100644
--- a/samples/bpf/xdp_redirect_map_user.c
+++ b/samples/bpf/xdp_redirect_map_user.c
@@ -96,7 +96,6 @@ static void usage(const char *prog)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -135,11 +134,6 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	ifindex_in = if_nametoindex(argv[optind]);
 	if (!ifindex_in)
 		ifindex_in = strtoul(argv[optind], NULL, 0);
diff --git a/samples/bpf/xdp_redirect_user.c b/samples/bpf/xdp_redirect_user.c
index 9ca2bf457cda..41d705c3a1f7 100644
--- a/samples/bpf/xdp_redirect_user.c
+++ b/samples/bpf/xdp_redirect_user.c
@@ -97,7 +97,6 @@ static void usage(const char *prog)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -136,11 +135,6 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	ifindex_in = if_nametoindex(argv[optind]);
 	if (!ifindex_in)
 		ifindex_in = strtoul(argv[optind], NULL, 0);
diff --git a/samples/bpf/xdp_router_ipv4_user.c b/samples/bpf/xdp_router_ipv4_user.c
index c2da1b51ff95..b5f03cb17a3c 100644
--- a/samples/bpf/xdp_router_ipv4_user.c
+++ b/samples/bpf/xdp_router_ipv4_user.c
@@ -625,7 +625,6 @@ static void usage(const char *prog)
 
 int main(int ac, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -670,11 +669,6 @@ int main(int ac, char **argv)
 		return 1;
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
 		return 1;
 
diff --git a/samples/bpf/xdp_rxq_info_user.c b/samples/bpf/xdp_rxq_info_user.c
index caa4e7ffcfc7..74a2926eba08 100644
--- a/samples/bpf/xdp_rxq_info_user.c
+++ b/samples/bpf/xdp_rxq_info_user.c
@@ -450,7 +450,6 @@ static void stats_poll(int interval, int action, __u32 cfg_opt)
 int main(int argc, char **argv)
 {
 	__u32 cfg_options= NO_TOUCH ; /* Default: Don't touch packet memory */
-	struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -474,11 +473,6 @@ int main(int argc, char **argv)
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	prog_load_attr.file = filename;
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
 		return EXIT_FAIL;
 
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
index 991ef6f0880b..551c6839f593 100644
--- a/samples/bpf/xdp_sample_pkts_user.c
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -110,7 +110,6 @@ static void usage(const char *prog)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
@@ -144,11 +143,6 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK)");
-		return 1;
-	}
-
 	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 	prog_load_attr.file = filename;
 
diff --git a/samples/bpf/xdp_tx_iptunnel_user.c b/samples/bpf/xdp_tx_iptunnel_user.c
index a419bee151a8..1d4f305d02aa 100644
--- a/samples/bpf/xdp_tx_iptunnel_user.c
+++ b/samples/bpf/xdp_tx_iptunnel_user.c
@@ -155,7 +155,6 @@ int main(int argc, char **argv)
 	struct bpf_prog_load_attr prog_load_attr = {
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	int min_port = 0, max_port = 0, vip2tnl_map_fd;
 	const char *optstr = "i:a:p:s:d:m:T:P:FSNh";
 	unsigned char opt_flags[256] = {};
@@ -254,11 +253,6 @@ int main(int argc, char **argv)
 		}
 	}
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
-		return 1;
-	}
-
 	if (!ifindex) {
 		fprintf(stderr, "Invalid ifname\n");
 		return 1;
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 19c679456a0e..b3bd60433546 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -1216,7 +1216,6 @@ static void enter_xsks_into_map(struct bpf_object *obj)
 
 int main(int argc, char **argv)
 {
-	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	bool rx = false, tx = false;
 	struct xsk_umem_info *umem;
 	struct bpf_object *obj;
@@ -1226,12 +1225,6 @@ int main(int argc, char **argv)
 
 	parse_command_line(argc, argv);
 
-	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
-		fprintf(stderr, "ERROR: setrlimit(RLIMIT_MEMLOCK) \"%s\"\n",
-			strerror(errno));
-		exit(EXIT_FAILURE);
-	}
-
 	if (opt_num_xsks > 1)
 		load_xdp_program(argv, &obj);
 
-- 
2.26.2


* [PATCH bpf-next v2 35/35] perf: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
                   ` (33 preceding siblings ...)
  2020-07-27 18:45 ` [PATCH bpf-next v2 34/35] bpf: samples: do not " Roman Gushchin
@ 2020-07-27 18:45 ` Roman Gushchin
  2020-07-28  6:09   ` Andrii Nakryiko
  34 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 18:45 UTC (permalink / raw)
  To: bpf
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	linux-kernel, Roman Gushchin

Since bpf no longer uses the memlock rlimit to limit memory usage,
there is no longer any reason for perf to alter its own limit.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 tools/perf/builtin-trace.c      | 10 ----------
 tools/perf/tests/builtin-test.c |  6 ------
 tools/perf/util/Build           |  1 -
 tools/perf/util/rlimit.c        | 29 -----------------------------
 tools/perf/util/rlimit.h        |  6 ------
 5 files changed, 52 deletions(-)
 delete mode 100644 tools/perf/util/rlimit.c
 delete mode 100644 tools/perf/util/rlimit.h

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4cbb64edc998..3d6a98a12537 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -19,7 +19,6 @@
 #include <api/fs/tracing_path.h>
 #include <bpf/bpf.h>
 #include "util/bpf_map.h"
-#include "util/rlimit.h"
 #include "builtin.h"
 #include "util/cgroup.h"
 #include "util/color.h"
@@ -4838,15 +4837,6 @@ int cmd_trace(int argc, const char **argv)
 		goto out;
 	}
 
-	/*
-	 * Parsing .perfconfig may entail creating a BPF event, that may need
-	 * to create BPF maps, so bump RLIM_MEMLOCK as the default 64K setting
-	 * is too small. This affects just this process, not touching the
-	 * global setting. If it fails we'll get something in 'perf trace -v'
-	 * to help diagnose the problem.
-	 */
-	rlimit__bump_memlock();
-
 	err = perf_config(trace__config, &trace);
 	if (err)
 		goto out;
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index da5b6cc23f25..e4efbba8202b 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -22,7 +22,6 @@
 #include <subcmd/parse-options.h>
 #include "string2.h"
 #include "symbol.h"
-#include "util/rlimit.h"
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <subcmd/exec-cmd.h>
@@ -794,11 +793,6 @@ int cmd_test(int argc, const char **argv)
 
 	if (skip != NULL)
 		skiplist = intlist__new(skip);
-	/*
-	 * Tests that create BPF maps, for instance, need more than the 64K
-	 * default:
-	 */
-	rlimit__bump_memlock();
 
 	return __cmd_test(argc, argv, skiplist);
 }
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 8d18380ecd10..4902cd3b3b58 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -26,7 +26,6 @@ perf-y += parse-events.o
 perf-y += perf_regs.o
 perf-y += path.o
 perf-y += print_binary.o
-perf-y += rlimit.o
 perf-y += argv_split.o
 perf-y += rbtree.o
 perf-y += libstring.o
diff --git a/tools/perf/util/rlimit.c b/tools/perf/util/rlimit.c
deleted file mode 100644
index 13521d392a22..000000000000
--- a/tools/perf/util/rlimit.c
+++ /dev/null
@@ -1,29 +0,0 @@
-/* SPDX-License-Identifier: LGPL-2.1 */
-
-#include "util/debug.h"
-#include "util/rlimit.h"
-#include <sys/time.h>
-#include <sys/resource.h>
-
-/*
- * Bump the memlock so that we can get bpf maps of a reasonable size,
- * like the ones used with 'perf trace' and with 'perf test bpf',
- * improve this to some specific request if needed.
- */
-void rlimit__bump_memlock(void)
-{
-	struct rlimit rlim;
-
-	if (getrlimit(RLIMIT_MEMLOCK, &rlim) == 0) {
-		rlim.rlim_cur *= 4;
-		rlim.rlim_max *= 4;
-
-		if (setrlimit(RLIMIT_MEMLOCK, &rlim) < 0) {
-			rlim.rlim_cur /= 2;
-			rlim.rlim_max /= 2;
-
-			if (setrlimit(RLIMIT_MEMLOCK, &rlim) < 0)
-				pr_debug("Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc\n");
-		}
-	}
-}
diff --git a/tools/perf/util/rlimit.h b/tools/perf/util/rlimit.h
deleted file mode 100644
index 9f59d8e710a3..000000000000
--- a/tools/perf/util/rlimit.h
+++ /dev/null
@@ -1,6 +0,0 @@
-#ifndef __PERF_RLIMIT_H_
-#define __PERF_RLIMIT_H_
-/* SPDX-License-Identifier: LGPL-2.1 */
-
-void rlimit__bump_memlock(void);
-#endif // __PERF_RLIMIT_H_
-- 
2.26.2


* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-27 18:45 ` [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage Roman Gushchin
@ 2020-07-27 22:05   ` Andrii Nakryiko
  2020-07-27 22:44     ` Song Liu
  2020-07-27 23:15     ` Roman Gushchin
  0 siblings, 2 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-27 22:05 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> As bpf is not using memlock rlimit for memory accounting anymore,
> let's remove the related code from libbpf.
>
> Bpf operations can't fail because of exceeding the limit anymore.
>

They can't in the newest kernel, but libbpf will keep working and
supporting old kernels for a very long time now. So please don't
remove any of this.

But it would be nice to add a detection of whether kernel needs a
RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
detect this from user-space?
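
One conservative option that avoids up-front detection altogether is to
attempt the operation first and raise RLIMIT_MEMLOCK only if it fails with
EPERM. The sketch below is illustrative only and is not libbpf code:
bpf_op_fn, bump_memlock(), run_with_memlock_retry() and fake_op() are
hypothetical names, and the callback is assumed to return 0 on success or
-1 with errno set. Note that EPERM can also mean missing privileges, so the
retry is not guaranteed to succeed.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical callback type: returns 0 on success, -1 with errno set. */
typedef int (*bpf_op_fn)(void *ctx);

/*
 * Raise the RLIMIT_MEMLOCK soft limit. Try "unlimited" first (this needs
 * privileges); otherwise fall back to raising the soft limit up to the
 * current hard limit, which an unprivileged process is allowed to do.
 */
static int bump_memlock(void)
{
	struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };

	if (setrlimit(RLIMIT_MEMLOCK, &r) == 0)
		return 0;
	if (getrlimit(RLIMIT_MEMLOCK, &r) != 0)
		return -1;
	r.rlim_cur = r.rlim_max;
	return setrlimit(RLIMIT_MEMLOCK, &r);
}

/*
 * Run op once. Only if it fails with EPERM (the error old kernels return
 * when the memlock limit is exceeded), bump the limit and retry once.
 */
static int run_with_memlock_retry(bpf_op_fn op, void *ctx)
{
	int ret = op(ctx);

	if (ret == 0 || errno != EPERM)
		return ret;
	if (bump_memlock() != 0)
		return -1;	/* errno comes from getrlimit/setrlimit */
	return op(ctx);
}

/* Stand-in for a real BPF call: fails with EPERM once, then succeeds. */
static int fake_op(void *ctx)
{
	int *calls = ctx;

	if ((*calls)++ == 0) {
		errno = EPERM;
		return -1;
	}
	return 0;
}
```

On a kernel with memcg-based accounting the first attempt would succeed and
the rlimit would never be touched; on older kernels the cost is one extra
failed call.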


> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---
>  tools/lib/bpf/libbpf.c | 31 +------------------------------
>  tools/lib/bpf/libbpf.h |  5 -----
>  2 files changed, 1 insertion(+), 35 deletions(-)
>

[...]

* Re: [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs
  2020-07-27 18:44 ` [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
@ 2020-07-27 22:11   ` Song Liu
  2020-07-28  0:08     ` Roman Gushchin
  0 siblings, 1 reply; 88+ messages in thread
From: Song Liu @ 2020-07-27 22:11 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:20 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include memory used by bpf programs into the memcg-based accounting.
> This includes the memory used by programs itself, auxiliary data
> and statistics.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---
>  kernel/bpf/core.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index bde93344164d..daab8dcafbd4 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -77,7 +77,7 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
>
>  struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags)
>  {
> -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
>         struct bpf_prog_aux *aux;
>         struct bpf_prog *fp;
>
> @@ -86,7 +86,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
>         if (fp == NULL)
>                 return NULL;
>
> -       aux = kzalloc(sizeof(*aux), GFP_KERNEL | gfp_extra_flags);
> +       aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT | gfp_extra_flags);
>         if (aux == NULL) {
>                 vfree(fp);
>                 return NULL;
> @@ -104,7 +104,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
>
>  struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
>  {
> -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
>         struct bpf_prog *prog;
>         int cpu;
>
> @@ -217,7 +217,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog)
>  struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
>                                   gfp_t gfp_extra_flags)
>  {
> -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
>         struct bpf_prog *fp;
>         u32 pages, delta;
>         int ret;
> --

Do we need similar changes in

bpf_prog_array_copy()
bpf_prog_alloc_jited_linfo()
bpf_prog_clone_create()

and maybe a few more?

Thanks,
Song

* Re: [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps Roman Gushchin
@ 2020-07-27 22:12   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 22:12 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:23 PM Roman Gushchin <guro@fb.com> wrote:
>
> This patch enables memcg-based memory accounting for memory allocated
> by __bpf_map_area_alloc(), which is used by most map types for
> large allocations.
>
> Following patches in the series will refine the accounting for
> some map types.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps Roman Gushchin
@ 2020-07-27 22:30   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 22:30 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:23 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include percpu arrays and auxiliary data into the memcg-based memory
> accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-27 22:05   ` Andrii Nakryiko
@ 2020-07-27 22:44     ` Song Liu
  2020-07-27 23:15     ` Roman Gushchin
  1 sibling, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 22:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

On Mon, Jul 27, 2020 at 3:07 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > As bpf is not using memlock rlimit for memory accounting anymore,
> > let's remove the related code from libbpf.
> >
> > Bpf operations can't fail because of exceeding the limit anymore.
> >
>
> They can't in the newest kernel, but libbpf will keep working and
> supporting old kernels for a very long time now. So please don't
> remove any of this.
>
> But it would be nice to add a detection of whether kernel needs a
> RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> detect this from user-space?
>

Agreed. We will need compatibility or similar detection for perf as well.

Thanks,
Song

* Re: [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps Roman Gushchin
@ 2020-07-27 22:48   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 22:48 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:23 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include metadata and percpu data into the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/cpumap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index f1c46529929b..74ae9fcbe82e 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -99,7 +99,7 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
>             attr->map_flags & ~BPF_F_NUMA_NODE)
>                 return ERR_PTR(-EINVAL);
>
> -       cmap = kzalloc(sizeof(*cmap), GFP_USER);
> +       cmap = kzalloc(sizeof(*cmap), GFP_USER | __GFP_ACCOUNT);
>         if (!cmap)
>                 return ERR_PTR(-ENOMEM);
>
> @@ -418,7 +418,7 @@ static struct bpf_cpu_map_entry *
>  __cpu_map_entry_alloc(struct bpf_cpumap_val *value, u32 cpu, int map_id)
>  {
>         int numa, err, i, fd = value->bpf_prog.fd;
> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> +       gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_NOWARN;
>         struct bpf_cpu_map_entry *rcpu;
>         struct xdp_bulk_queue *bq;
>
> --
> 2.26.2
>

* Re: [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps Roman Gushchin
@ 2020-07-27 23:05   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:05 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Account memory used by cgroup storage maps including the percpu memory
> for the percpu flavor of cgroup storage and map metadata.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-27 22:05   ` Andrii Nakryiko
  2020-07-27 22:44     ` Song Liu
@ 2020-07-27 23:15     ` Roman Gushchin
  2020-07-28  5:59       ` Andrii Nakryiko
  1 sibling, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-27 23:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 03:05:11PM -0700, Andrii Nakryiko wrote:
> On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > As bpf is not using memlock rlimit for memory accounting anymore,
> > let's remove the related code from libbpf.
> >
> > Bpf operations can't fail because of exceeding the limit anymore.
> >
> 
> They can't in the newest kernel, but libbpf will keep working and
> supporting old kernels for a very long time now. So please don't
> remove any of this.

Yeah, good point, agreed.
So we can just drop this patch from the series; no other changes
are needed.

> 
> But it would be nice to add a detection of whether kernel needs a
> RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> detect this from user-space?

Hm, the best idea I can think of is to wait for -EPERM before bumping.
We can in theory look for the presence of memory.stat::percpu in cgroupfs,
but it's way too cryptic.
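
For reference, the memory.stat check could look roughly like the sketch
below. It is only a heuristic: it assumes a cgroup v2 memory.stat file, and
a "percpu" line shows that percpu memory is accounted by memcg, not that bpf
itself has switched away from the memlock rlimit. The function names and the
path argument are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the stream contains a line of the form "<key> <value>",
 * 0 otherwise. memory.stat lines are short, so a fixed buffer is enough.
 */
static int stat_stream_has_key(FILE *f, const char *key)
{
	char line[256];
	size_t klen = strlen(key);

	while (fgets(line, sizeof(line), f)) {
		/* Require a space after the key so "perc" != "percpu". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ')
			return 1;
	}
	return 0;
}

/*
 * path is supplied by the caller, e.g. the memory.stat file of the
 * cgroup the process runs in (location is an assumption, not probed here).
 * Returns 1 if a "percpu" entry exists, 0 if not, -1 if unreadable.
 */
static int memcg_stat_has_percpu(const char *path)
{
	FILE *f = fopen(path, "r");
	int found;

	if (!f)
		return -1;
	found = stat_stream_has_key(f, "percpu");
	fclose(f);
	return found;
}
```

A return of -1 (file missing or unreadable, e.g. cgroup v1 only) has to be
treated as "unknown", which is part of why this is less attractive than
simply reacting to -EPERM.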

Thanks!

* Re: [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps Roman Gushchin
@ 2020-07-27 23:35   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:35 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include map metadata and the node size (struct bpf_dtab_netdev) on
> element update into the accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps Roman Gushchin
@ 2020-07-27 23:36   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:36 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:20 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include percpu objects and the size of map metadata into the
> accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps Roman Gushchin
@ 2020-07-27 23:55   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:55 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include lpm trie and lpm trie node objects into the memcg-based memory
> accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/lpm_trie.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 44474bf3ab7a..d85e0fc2cafc 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -282,7 +282,7 @@ static struct lpm_trie_node *lpm_trie_node_alloc(const struct lpm_trie *trie,
>         if (value)
>                 size += trie->map.value_size;
>
> -       node = kmalloc_node(size, GFP_ATOMIC | __GFP_NOWARN,
> +       node = kmalloc_node(size, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT,
>                             trie->map.numa_node);
>         if (!node)
>                 return NULL;
> @@ -557,7 +557,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
>             attr->value_size > LPM_VAL_SIZE_MAX)
>                 return ERR_PTR(-EINVAL);
>
> -       trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN);
> +       trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!trie)
>                 return ERR_PTR(-ENOMEM);
>
> --
> 2.26.2
>

* Re: [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer
  2020-07-27 18:44 ` [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer Roman Gushchin
@ 2020-07-27 23:56   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:56 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Enable the memcg-based memory accounting for the memory used by
> the bpf ringbuffer.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/ringbuf.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
> index 002f8a5c9e51..e8e2c39cbdc9 100644
> --- a/kernel/bpf/ringbuf.c
> +++ b/kernel/bpf/ringbuf.c
> @@ -60,8 +60,8 @@ struct bpf_ringbuf_hdr {
>
>  static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
>  {
> -       const gfp_t flags = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN |
> -                           __GFP_ZERO;
> +       const gfp_t flags = GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL |
> +                           __GFP_NOWARN | __GFP_ZERO;
>         int nr_meta_pages = RINGBUF_PGOFF + RINGBUF_POS_PAGES;
>         int nr_data_pages = data_sz >> PAGE_SHIFT;
>         int nr_pages = nr_meta_pages + nr_data_pages;
> @@ -89,7 +89,8 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
>          */
>         array_size = (nr_meta_pages + 2 * nr_data_pages) * sizeof(*pages);
>         if (array_size > PAGE_SIZE)
> -               pages = vmalloc_node(array_size, numa_node);
> +               pages = __vmalloc_node(array_size, 1, GFP_KERNEL_ACCOUNT,
> +                                      numa_node, __builtin_return_address(0));
>         else
>                 pages = kmalloc_node(array_size, flags, numa_node);
>         if (!pages)
> @@ -167,7 +168,7 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
>                 return ERR_PTR(-E2BIG);
>  #endif
>
> -       rb_map = kzalloc(sizeof(*rb_map), GFP_USER);
> +       rb_map = kzalloc(sizeof(*rb_map), GFP_USER | __GFP_ACCOUNT);
>         if (!rb_map)
>                 return ERR_PTR(-ENOMEM);
>
> --
> 2.26.2
>

* Re: [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps Roman Gushchin
@ 2020-07-27 23:57   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:57 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:28 PM Roman Gushchin <guro@fb.com> wrote:
>
> Account memory used by the socket storage.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/core/bpf_sk_storage.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index eafcd15e7dfd..fbcd03cd00d3 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -130,7 +130,8 @@ static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
>         if (charge_omem && omem_charge(sk, smap->elem_size))
>                 return NULL;
>
> -       selem = kzalloc(smap->elem_size, GFP_ATOMIC | __GFP_NOWARN);
> +       selem = kzalloc(smap->elem_size,
> +                       GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (selem) {
>                 if (value)
>                         memcpy(SDATA(selem)->data, value, smap->map.value_size);
> @@ -337,7 +338,8 @@ static int sk_storage_alloc(struct sock *sk,
>         if (err)
>                 return err;
>
> -       sk_storage = kzalloc(sizeof(*sk_storage), GFP_ATOMIC | __GFP_NOWARN);
> +       sk_storage = kzalloc(sizeof(*sk_storage),
> +                            GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!sk_storage) {
>                 err = -ENOMEM;
>                 goto uncharge;
> @@ -677,7 +679,7 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
>         u64 cost;
>         int ret;
>
> -       smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN);
> +       smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!smap)
>                 return ERR_PTR(-ENOMEM);
>         bpf_map_init_from_attr(&smap->map, attr);
> @@ -695,7 +697,7 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
>         }
>
>         smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
> -                                GFP_USER | __GFP_NOWARN);
> +                                GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!smap->buckets) {
>                 bpf_map_charge_finish(&smap->map.memory);
>                 kfree(smap);
> @@ -1024,7 +1026,7 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
>         }
>
>         diag = kzalloc(sizeof(*diag) + sizeof(diag->maps[0]) * nr_maps,
> -                      GFP_KERNEL);
> +                      GFP_KERNEL | __GFP_ACCOUNT);
>         if (!diag)
>                 return ERR_PTR(-ENOMEM);
>
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps Roman Gushchin
@ 2020-07-27 23:58   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-27 23:58 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:27 PM Roman Gushchin <guro@fb.com> wrote:
>
> Include internal metadata in the memcg-based memory accounting.
> Also include the memory allocated when updating an element.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/core/sock_map.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 119f52a99dc1..bc797adca44c 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -38,7 +38,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
>             attr->map_flags & ~SOCK_CREATE_FLAG_MASK)
>                 return ERR_PTR(-EINVAL);
>
> -       stab = kzalloc(sizeof(*stab), GFP_USER);
> +       stab = kzalloc(sizeof(*stab), GFP_USER | __GFP_ACCOUNT);
>         if (!stab)
>                 return ERR_PTR(-ENOMEM);
>
> @@ -829,7 +829,8 @@ static struct bpf_shtab_elem *sock_hash_alloc_elem(struct bpf_shtab *htab,
>                 }
>         }
>
> -       new = kmalloc_node(htab->elem_size, GFP_ATOMIC | __GFP_NOWARN,
> +       new = kmalloc_node(htab->elem_size,
> +                          GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT,
>                            htab->map.numa_node);
>         if (!new) {
>                 atomic_dec(&htab->count);
> @@ -1011,7 +1012,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>         if (attr->key_size > MAX_BPF_STACK)
>                 return ERR_PTR(-E2BIG);
>
> -       htab = kzalloc(sizeof(*htab), GFP_USER);
> +       htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT);
>         if (!htab)
>                 return ERR_PTR(-ENOMEM);
>
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps Roman Gushchin
@ 2020-07-28  0:01   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  0:01 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
>
> Extend xskmap memory accounting to include the memory taken by
> the xsk_map_node structure.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/xdp/xskmap.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
> index 8367adbbe9df..e574b22defe5 100644
> --- a/net/xdp/xskmap.c
> +++ b/net/xdp/xskmap.c
> @@ -28,7 +28,8 @@ static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map,
>         struct xsk_map_node *node;
>         int err;
>
> -       node = kzalloc(sizeof(*node), GFP_ATOMIC | __GFP_NOWARN);
> +       node = kzalloc(sizeof(*node),
> +                      GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!node)
>                 return ERR_PTR(-ENOMEM);
>
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps Roman Gushchin
@ 2020-07-28  0:04   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  0:04 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for arraymap maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/arraymap.c | 24 ++++--------------------
>  1 file changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 9597fecff8da..41581c38b31d 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -75,11 +75,10 @@ int array_map_alloc_check(union bpf_attr *attr)
>  static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>  {
>         bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
> -       int ret, numa_node = bpf_map_attr_numa_node(attr);
> +       int numa_node = bpf_map_attr_numa_node(attr);
>         u32 elem_size, index_mask, max_entries;
>         bool bypass_spec_v1 = bpf_bypass_spec_v1();
> -       u64 cost, array_size, mask64;
> -       struct bpf_map_memory mem;
> +       u64 array_size, mask64;
>         struct bpf_array *array;
>
>         elem_size = round_up(attr->value_size, 8);
> @@ -120,44 +119,29 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>                 }
>         }
>
> -       /* make sure there is no u32 overflow later in round_up() */
> -       cost = array_size;
> -       if (percpu)
> -               cost += (u64)attr->max_entries * elem_size * num_possible_cpus();
> -
> -       ret = bpf_map_charge_init(&mem, cost);
> -       if (ret < 0)
> -               return ERR_PTR(ret);
> -
>         /* allocate all map elements and zero-initialize them */
>         if (attr->map_flags & BPF_F_MMAPABLE) {
>                 void *data;
>
>                 /* kmalloc'ed memory can't be mmap'ed, use explicit vmalloc */
>                 data = bpf_map_area_mmapable_alloc(array_size, numa_node);
> -               if (!data) {
> -                       bpf_map_charge_finish(&mem);
> +               if (!data)
>                         return ERR_PTR(-ENOMEM);
> -               }
>                 array = data + PAGE_ALIGN(sizeof(struct bpf_array))
>                         - offsetof(struct bpf_array, value);
>         } else {
>                 array = bpf_map_area_alloc(array_size, numa_node);
>         }
> -       if (!array) {
> -               bpf_map_charge_finish(&mem);
> +       if (!array)
>                 return ERR_PTR(-ENOMEM);
> -       }
>         array->index_mask = index_mask;
>         array->map.bypass_spec_v1 = bypass_spec_v1;
>
>         /* copy mandatory map attributes */
>         bpf_map_init_from_attr(&array->map, attr);
> -       bpf_map_charge_move(&array->map.memory, &mem);
>         array->elem_size = elem_size;
>
>         if (percpu && bpf_array_alloc_percpu(array)) {
> -               bpf_map_charge_finish(&array->map.memory);
>                 bpf_map_area_free(array);
>                 return ERR_PTR(-ENOMEM);
>         }
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs
  2020-07-27 22:11   ` Song Liu
@ 2020-07-28  0:08     ` Roman Gushchin
  2020-07-28  4:42       ` Song Liu
  0 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-28  0:08 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 03:11:42PM -0700, Song Liu wrote:
> On Mon, Jul 27, 2020 at 12:20 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > Include memory used by bpf programs in the memcg-based accounting.
> > This includes the memory used by the programs themselves, auxiliary data
> > and statistics.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > ---
> >  kernel/bpf/core.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index bde93344164d..daab8dcafbd4 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -77,7 +77,7 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
> >
> >  struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags)
> >  {
> > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> >         struct bpf_prog_aux *aux;
> >         struct bpf_prog *fp;
> >
> > @@ -86,7 +86,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> >         if (fp == NULL)
> >                 return NULL;
> >
> > -       aux = kzalloc(sizeof(*aux), GFP_KERNEL | gfp_extra_flags);
> > +       aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT | gfp_extra_flags);
> >         if (aux == NULL) {
> >                 vfree(fp);
> >                 return NULL;
> > @@ -104,7 +104,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> >
> >  struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
> >  {
> > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> >         struct bpf_prog *prog;
> >         int cpu;
> >
> > @@ -217,7 +217,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog)
> >  struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
> >                                   gfp_t gfp_extra_flags)
> >  {
> > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> >         struct bpf_prog *fp;
> >         u32 pages, delta;
> >         int ret;
> > --

Hi Song!

Thank you for looking into the patchset!

> 
> Do we need similar changes in
> 
> bpf_prog_array_copy()
> bpf_prog_alloc_jited_linfo()
> bpf_prog_clone_create()
> 
> and maybe a few more?

I've tried to follow the rlimit-based accounting: objects that were not
accounted before are mostly still not accounted, and vice versa. The reason
is simple: I don't know many parts of the bpf code well enough to decide
whether they need accounting or not.

In general, with memcg-based accounting we can easily cover places which were
not covered previously, e.g. the memory used by the verifier. But I guess it's
better to do it case-by-case.

If you're aware of any big objects that should definitely be accounted,
please let me know.

Thanks!
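
To make the difference with the old precharge scheme concrete, here is a
minimal userspace sketch, not kernel code. Everything prefixed with mock_ or
MOCK_ is a hypothetical stand-in invented for illustration; the only fact
taken from the kernel is that GFP_KERNEL_ACCOUNT is GFP_KERNEL with
__GFP_ACCOUNT set. The point is that the charge happens inside the allocator
and the uncharge inside the free path, so callers have no separate charge to
unwind on their error paths.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define MOCK_GFP_KERNEL         0x1u
#define MOCK_GFP_ACCOUNT        0x2u
#define MOCK_GFP_KERNEL_ACCOUNT (MOCK_GFP_KERNEL | MOCK_GFP_ACCOUNT)

/* Hypothetical byte counter standing in for a memory cgroup. */
struct mock_memcg {
	size_t charged;	/* bytes currently charged */
	size_t limit;	/* maximum bytes that may be charged */
};

/* Header kept in front of each allocation so that free can uncharge. */
struct mock_hdr {
	struct mock_memcg *memcg;	/* NULL if the allocation is not accounted */
	size_t size;
};

/* Allocate zeroed memory; charge the cgroup only if the ACCOUNT bit is set. */
static void *mock_alloc(struct mock_memcg *cg, size_t size, unsigned int flags)
{
	int account = (flags & MOCK_GFP_ACCOUNT) != 0;
	struct mock_hdr *hdr;

	/* Over the limit: fail the allocation itself, nothing to unwind. */
	if (account && size > cg->limit - cg->charged)
		return NULL;

	hdr = calloc(1, sizeof(*hdr) + size);
	if (!hdr)
		return NULL;

	hdr->memcg = account ? cg : NULL;
	hdr->size = size;
	if (account)
		cg->charged += size;

	return hdr + 1;
}

/* Free memory and drop the charge taken at allocation time, if any. */
static void mock_free(void *ptr)
{
	struct mock_hdr *hdr;

	if (!ptr)
		return;

	hdr = (struct mock_hdr *)ptr - 1;
	if (hdr->memcg)
		hdr->memcg->charged -= hdr->size;
	free(hdr);
}
```

With this shape, adding accounting to a previously unaccounted object is a
one-flag change at the allocation site, which is why covering more places
later can be done case-by-case.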



* Re: [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs
  2020-07-28  0:08     ` Roman Gushchin
@ 2020-07-28  4:42       ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  4:42 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 5:08 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Mon, Jul 27, 2020 at 03:11:42PM -0700, Song Liu wrote:
> > On Mon, Jul 27, 2020 at 12:20 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > Include memory used by bpf programs in the memcg-based accounting.
> > > This includes the memory used by the programs themselves, auxiliary data
> > > and statistics.
> > >
> > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > ---
> > >  kernel/bpf/core.c | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index bde93344164d..daab8dcafbd4 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -77,7 +77,7 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
> > >
> > >  struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags)
> > >  {
> > > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> > >         struct bpf_prog_aux *aux;
> > >         struct bpf_prog *fp;
> > >
> > > @@ -86,7 +86,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> > >         if (fp == NULL)
> > >                 return NULL;
> > >
> > > -       aux = kzalloc(sizeof(*aux), GFP_KERNEL | gfp_extra_flags);
> > > +       aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT | gfp_extra_flags);
> > >         if (aux == NULL) {
> > >                 vfree(fp);
> > >                 return NULL;
> > > @@ -104,7 +104,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> > >
> > >  struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
> > >  {
> > > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> > >         struct bpf_prog *prog;
> > >         int cpu;
> > >
> > > @@ -217,7 +217,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog)
> > >  struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
> > >                                   gfp_t gfp_extra_flags)
> > >  {
> > > -       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
> > > +       gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
> > >         struct bpf_prog *fp;
> > >         u32 pages, delta;
> > >         int ret;
> > > --
>
> Hi Song!
>
> Thank you for looking into the patchset!
>
> >
> > Do we need similar changes in
> >
> > bpf_prog_array_copy()
> > bpf_prog_alloc_jited_linfo()
> > bpf_prog_clone_create()
> >
> > and maybe a few more?
>
> I've tried to follow the rlimit-based accounting: objects that were not
> accounted before are mostly still not accounted, and vice versa. The reason
> is simple: I don't know many parts of the bpf code well enough to decide
> whether they need accounting or not.
>
> In general, with memcg-based accounting we can easily cover places which were
> not covered previously, e.g. the memory used by the verifier. But I guess it's
> better to do it case-by-case.
>
> If you're aware of any big objects that should definitely be accounted,
> please let me know.

Thanks for the explanation. I think we can do a one-to-one migration to
memcg-based accounting for now.

Song


* Re: [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps Roman Gushchin
@ 2020-07-28  5:29   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:29 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for bpf_struct_ops maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


* Re: [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps Roman Gushchin
@ 2020-07-28  5:30   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:30 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for cpumap maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/cpumap.c | 16 +---------------
>  1 file changed, 1 insertion(+), 15 deletions(-)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index 74ae9fcbe82e..50f3444a3301 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -86,8 +86,6 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
>         u32 value_size = attr->value_size;
>         struct bpf_cpu_map *cmap;
>         int err = -ENOMEM;
> -       u64 cost;
> -       int ret;
>
>         if (!bpf_capable())
>                 return ERR_PTR(-EPERM);
> @@ -111,26 +109,14 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
>                 goto free_cmap;
>         }
>
> -       /* make sure page count doesn't overflow */
> -       cost = (u64) cmap->map.max_entries * sizeof(struct bpf_cpu_map_entry *);
> -
> -       /* Notice returns -EPERM on if map size is larger than memlock limit */
> -       ret = bpf_map_charge_init(&cmap->map.memory, cost);
> -       if (ret) {
> -               err = ret;
> -               goto free_cmap;
> -       }
> -
>         /* Alloc array for possible remote "destination" CPUs */
>         cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
>                                            sizeof(struct bpf_cpu_map_entry *),
>                                            cmap->map.numa_node);
>         if (!cmap->cpu_map)
> -               goto free_charge;
> +               goto free_cmap;
>
>         return &cmap->map;
> -free_charge:
> -       bpf_map_charge_finish(&cmap->map.memory);
>  free_cmap:
>         kfree(cmap);
>         return ERR_PTR(err);
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps Roman Gushchin
@ 2020-07-28  5:31   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:31 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for cgroup storage maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/local_storage.c | 21 +--------------------
>  1 file changed, 1 insertion(+), 20 deletions(-)
>
> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> index 117acb2e80fb..5f29a420849c 100644
> --- a/kernel/bpf/local_storage.c
> +++ b/kernel/bpf/local_storage.c
> @@ -288,8 +288,6 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
>  {
>         int numa_node = bpf_map_attr_numa_node(attr);
>         struct bpf_cgroup_storage_map *map;
> -       struct bpf_map_memory mem;
> -       int ret;
>
>         if (attr->key_size != sizeof(struct bpf_cgroup_storage_key) &&
>             attr->key_size != sizeof(__u64))
> @@ -309,18 +307,10 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
>                 /* max_entries is not used and enforced to be 0 */
>                 return ERR_PTR(-EINVAL);
>
> -       ret = bpf_map_charge_init(&mem, sizeof(struct bpf_cgroup_storage_map));
> -       if (ret < 0)
> -               return ERR_PTR(ret);
> -
>         map = kmalloc_node(sizeof(struct bpf_cgroup_storage_map),
>                            __GFP_ZERO | GFP_USER | __GFP_ACCOUNT, numa_node);
> -       if (!map) {
> -               bpf_map_charge_finish(&mem);
> +       if (!map)
>                 return ERR_PTR(-ENOMEM);
> -       }
> -
> -       bpf_map_charge_move(&map->map.memory, &mem);
>
>         /* copy mandatory map attributes */
>         bpf_map_init_from_attr(&map->map, attr);
> @@ -509,9 +499,6 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
>
>         size = bpf_cgroup_storage_calculate_size(map, &pages);
>
> -       if (bpf_map_charge_memlock(map, pages))
> -               return ERR_PTR(-EPERM);
> -
>         storage = kmalloc_node(sizeof(struct bpf_cgroup_storage), gfp,
>                                map->numa_node);
>         if (!storage)
> @@ -533,7 +520,6 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
>         return storage;
>
>  enomem:
> -       bpf_map_uncharge_memlock(map, pages);
>         kfree(storage);
>         return ERR_PTR(-ENOMEM);
>  }
> @@ -560,16 +546,11 @@ void bpf_cgroup_storage_free(struct bpf_cgroup_storage *storage)
>  {
>         enum bpf_cgroup_storage_type stype;
>         struct bpf_map *map;
> -       u32 pages;
>
>         if (!storage)
>                 return;
>
>         map = &storage->map->map;
> -
> -       bpf_cgroup_storage_calculate_size(map, &pages);
> -       bpf_map_uncharge_memlock(map, pages);
> -
>         stype = cgroup_storage_type(map);
>         if (stype == BPF_CGROUP_STORAGE_SHARED)
>                 call_rcu(&storage->rcu, free_shared_cgroup_storage_rcu);
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps Roman Gushchin
@ 2020-07-28  5:31   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:31 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:20 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for devmap maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/devmap.c | 18 ++----------------
>  1 file changed, 2 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index 05bf93088063..8148c7260a54 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -109,8 +109,6 @@ static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab,
>  static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
>  {
>         u32 valsize = attr->value_size;
> -       u64 cost = 0;
> -       int err;
>
>         /* check sanity of attributes. 2 value sizes supported:
>          * 4 bytes: ifindex
> @@ -135,21 +133,13 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
>
>                 if (!dtab->n_buckets) /* Overflow check */
>                         return -EINVAL;
> -               cost += (u64) sizeof(struct hlist_head) * dtab->n_buckets;
> -       } else {
> -               cost += (u64) dtab->map.max_entries * sizeof(struct bpf_dtab_netdev *);
>         }
>
> -       /* if map size is larger than memlock limit, reject it */
> -       err = bpf_map_charge_init(&dtab->map.memory, cost);
> -       if (err)
> -               return -EINVAL;
> -
>         if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
>                 dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
>                                                            dtab->map.numa_node);
>                 if (!dtab->dev_index_head)
> -                       goto free_charge;
> +                       return -ENOMEM;
>
>                 spin_lock_init(&dtab->index_lock);
>         } else {
> @@ -157,14 +147,10 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
>                                                       sizeof(struct bpf_dtab_netdev *),
>                                                       dtab->map.numa_node);
>                 if (!dtab->netdev_map)
> -                       goto free_charge;
> +                       return -ENOMEM;
>         }
>
>         return 0;
> -
> -free_charge:
> -       bpf_map_charge_finish(&dtab->map.memory);
> -       return -ENOMEM;
>  }
>
>  static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
> --
> 2.26.2
>


* Re: [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps Roman Gushchin
@ 2020-07-28  5:32   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:32 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for hashtab maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/hashtab.c | 19 +------------------
>  1 file changed, 1 insertion(+), 18 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 9d0432170812..9372b559b4e7 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -422,7 +422,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>         bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
>         bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
>         struct bpf_htab *htab;
> -       u64 cost;
>         int err;
>
>         htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT);
> @@ -459,26 +458,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>             htab->n_buckets > U32_MAX / sizeof(struct bucket))
>                 goto free_htab;
>
> -       cost = (u64) htab->n_buckets * sizeof(struct bucket) +
> -              (u64) htab->elem_size * htab->map.max_entries;
> -
> -       if (percpu)
> -               cost += (u64) round_up(htab->map.value_size, 8) *
> -                       num_possible_cpus() * htab->map.max_entries;
> -       else
> -              cost += (u64) htab->elem_size * num_possible_cpus();
> -
> -       /* if map size is larger than memlock limit, reject it */
> -       err = bpf_map_charge_init(&htab->map.memory, cost);
> -       if (err)
> -               goto free_htab;
> -
>         err = -ENOMEM;
>         htab->buckets = bpf_map_area_alloc(htab->n_buckets *
>                                            sizeof(struct bucket),
>                                            htab->map.numa_node);
>         if (!htab->buckets)
> -               goto free_charge;
> +               goto free_htab;
>
>         if (htab->map.map_flags & BPF_F_ZERO_SEED)
>                 htab->hashrnd = 0;
> @@ -508,8 +493,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>         prealloc_destroy(htab);
>  free_buckets:
>         bpf_map_area_free(htab->buckets);
> -free_charge:
> -       bpf_map_charge_finish(&htab->map.memory);
>  free_htab:
>         kfree(htab);
>         return ERR_PTR(err);
> --
> 2.26.2
>
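
For reference, the precharge estimate deleted by the hashtab patch above can
be restated as a standalone userspace sketch. The struct and function names
here (htab_cost_params, htab_precharge_cost, round_up8) are hypothetical; the
arithmetic follows the removed lines, with sizeof(struct bucket) and
num_possible_cpus() passed in as plain parameters because neither is
available outside the kernel.

```c
#include <stdint.h>

/* Round x up to the next multiple of 8, like round_up(x, 8) in the patch. */
static uint64_t round_up8(uint64_t x)
{
	return (x + 7) & ~(uint64_t)7;
}

/* Hypothetical parameter bundle; field names mirror the removed code. */
struct htab_cost_params {
	uint32_t n_buckets;
	uint32_t bucket_size;	/* stands in for sizeof(struct bucket) */
	uint32_t elem_size;
	uint32_t value_size;
	uint32_t max_entries;
	uint32_t nr_cpus;	/* stands in for num_possible_cpus() */
	int percpu;		/* non-zero for per-cpu hash maps */
};

/* Estimate, in bytes, that the old code charged against the memlock rlimit. */
static uint64_t htab_precharge_cost(const struct htab_cost_params *p)
{
	/* Bucket array plus one element per entry. */
	uint64_t cost = (uint64_t)p->n_buckets * p->bucket_size +
			(uint64_t)p->elem_size * p->max_entries;

	if (p->percpu)
		/* Per-cpu values: one 8-byte-aligned value per cpu per entry. */
		cost += round_up8(p->value_size) * p->nr_cpus * p->max_entries;
	else
		/* Otherwise one extra element per cpu. */
		cost += (uint64_t)p->elem_size * p->nr_cpus;

	return cost;
}
```

An estimate like this has to be kept in sync by hand with what the map
actually allocates. With memcg-based accounting the charge follows the
allocations themselves, so the formula and its free_charge unwinding can
simply be dropped.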


* Re: [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps Roman Gushchin
@ 2020-07-28  5:32   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:32 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for lpm_trie maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/lpm_trie.c | 13 -------------
>  1 file changed, 13 deletions(-)
>
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index d85e0fc2cafc..c747f0835eb1 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -540,8 +540,6 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
>  static struct bpf_map *trie_alloc(union bpf_attr *attr)
>  {
>         struct lpm_trie *trie;
> -       u64 cost = sizeof(*trie), cost_per_node;
> -       int ret;
>
>         if (!bpf_capable())
>                 return ERR_PTR(-EPERM);
> @@ -567,20 +565,9 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
>                           offsetof(struct bpf_lpm_trie_key, data);
>         trie->max_prefixlen = trie->data_size * 8;
>
> -       cost_per_node = sizeof(struct lpm_trie_node) +
> -                       attr->value_size + trie->data_size;
> -       cost += (u64) attr->max_entries * cost_per_node;
> -
> -       ret = bpf_map_charge_init(&trie->map.memory, cost);
> -       if (ret)
> -               goto out_err;
> -
>         spin_lock_init(&trie->lock);
>
>         return &trie->map;
> -out_err:
> -       kfree(trie);
> -       return ERR_PTR(ret);
>  }
>
>  static void trie_free(struct bpf_map *map)
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps Roman Gushchin
@ 2020-07-28  5:35   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:35 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for queue_stack maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/queue_stack_maps.c | 16 ++++------------
>  1 file changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
> index 44184f82916a..92e73c35a34a 100644
> --- a/kernel/bpf/queue_stack_maps.c
> +++ b/kernel/bpf/queue_stack_maps.c
> @@ -66,29 +66,21 @@ static int queue_stack_map_alloc_check(union bpf_attr *attr)
>
>  static struct bpf_map *queue_stack_map_alloc(union bpf_attr *attr)
>  {
> -       int ret, numa_node = bpf_map_attr_numa_node(attr);
> -       struct bpf_map_memory mem = {0};
> +       int numa_node = bpf_map_attr_numa_node(attr);
>         struct bpf_queue_stack *qs;
> -       u64 size, queue_size, cost;
> +       u64 size, queue_size;
>
>         size = (u64) attr->max_entries + 1;
> -       cost = queue_size = sizeof(*qs) + size * attr->value_size;
> -
> -       ret = bpf_map_charge_init(&mem, cost);
> -       if (ret < 0)
> -               return ERR_PTR(ret);
> +       queue_size = sizeof(*qs) + size * attr->value_size;
>
>         qs = bpf_map_area_alloc(queue_size, numa_node);
> -       if (!qs) {
> -               bpf_map_charge_finish(&mem);
> +       if (!qs)
>                 return ERR_PTR(-ENOMEM);
> -       }
>
>         memset(qs, 0, sizeof(*qs));
>
>         bpf_map_init_from_attr(&qs->map, attr);
>
> -       bpf_map_charge_move(&qs->map.memory, &mem);
>         qs->size = size;
>
>         raw_spin_lock_init(&qs->lock);
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps Roman Gushchin
@ 2020-07-28  5:36   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:36 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:23 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for reuseport_array maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/reuseport_array.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
> index 90b29c5b1da7..9d0161fdfec7 100644
> --- a/kernel/bpf/reuseport_array.c
> +++ b/kernel/bpf/reuseport_array.c
> @@ -150,9 +150,8 @@ static void reuseport_array_free(struct bpf_map *map)
>
>  static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
>  {
> -       int err, numa_node = bpf_map_attr_numa_node(attr);
> +       int numa_node = bpf_map_attr_numa_node(attr);
>         struct reuseport_array *array;
> -       struct bpf_map_memory mem;
>         u64 array_size;
>
>         if (!bpf_capable())
> @@ -161,20 +160,13 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
>         array_size = sizeof(*array);
>         array_size += (u64)attr->max_entries * sizeof(struct sock *);
>
> -       err = bpf_map_charge_init(&mem, array_size);
> -       if (err)
> -               return ERR_PTR(err);
> -
>         /* allocate all map elements and zero-initialize them */
>         array = bpf_map_area_alloc(array_size, numa_node);
> -       if (!array) {
> -               bpf_map_charge_finish(&mem);
> +       if (!array)
>                 return ERR_PTR(-ENOMEM);
> -       }
>
>         /* copy mandatory map attributes */
>         bpf_map_init_from_attr(&array->map, attr);
> -       bpf_map_charge_move(&array->map.memory, &mem);
>
>         return &array->map;
>  }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer
  2020-07-27 18:44 ` [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
@ 2020-07-28  5:37   ` Song Liu
  2020-07-28  5:56   ` Andrii Nakryiko
  1 sibling, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:37 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for bpf ringbuffer.
> It has been replaced with the memcg-based memory accounting.
>
> bpf_ringbuf_alloc() can't return anything except ERR_PTR(-ENOMEM)
> or a valid pointer, so to simplify the code make it return NULL
> in the first case. This allows dropping a couple of lines in
> ringbuf_map_alloc() and also makes it look similar to other memory
> allocating functions like kmalloc().
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>
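
For readers unfamiliar with the two error conventions the commit message
contrasts, here is a minimal userspace sketch. ERR_PTR(), PTR_ERR() and
IS_ERR() below are re-creations of the kernel helpers written for this
illustration, and alloc_errptr(), alloc_null(), create_old(), create_new()
and the "fail" parameter are hypothetical names that are not part of the
patch. It only shows why a NULL-returning allocator needs less code in the
caller when -ENOMEM is the only possible failure.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's err.h helpers, for illustration. */
void *ERR_PTR(long error) { return (void *)error; }
long PTR_ERR(const void *ptr) { return (long)ptr; }
int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Old style: failure is encoded as ERR_PTR(-ENOMEM). "fail" simulates OOM. */
void *alloc_errptr(size_t sz, int fail)
{
	void *p = fail ? NULL : malloc(sz);

	return p ? p : ERR_PTR(-ENOMEM);
}

/* New style: failure is plain NULL, like kmalloc(). */
void *alloc_null(size_t sz, int fail)
{
	return fail ? NULL : malloc(sz);
}

/* A caller of the old style has to test and decode the error pointer. */
int create_old(int fail)
{
	void *rb = alloc_errptr(16, fail);

	if (IS_ERR(rb))
		return (int)PTR_ERR(rb);
	free(rb);
	return 0;
}

/* A caller of the new style only needs a NULL check. */
int create_new(int fail)
{
	void *rb = alloc_null(16, fail);

	if (!rb)
		return -ENOMEM;
	free(rb);
	return 0;
}
```

Both callers report the same error to their own caller; the second one just
does not need the IS_ERR()/PTR_ERR() round trip.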

> ---
>  kernel/bpf/ringbuf.c | 24 ++++--------------------
>  1 file changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
> index e8e2c39cbdc9..e687b798d097 100644
> --- a/kernel/bpf/ringbuf.c
> +++ b/kernel/bpf/ringbuf.c
> @@ -48,7 +48,6 @@ struct bpf_ringbuf {
>
>  struct bpf_ringbuf_map {
>         struct bpf_map map;
> -       struct bpf_map_memory memory;
>         struct bpf_ringbuf *rb;
>  };
>
> @@ -135,7 +134,7 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
>
>         rb = bpf_ringbuf_area_alloc(data_sz, numa_node);
>         if (!rb)
> -               return ERR_PTR(-ENOMEM);
> +               return NULL;
>
>         spin_lock_init(&rb->spinlock);
>         init_waitqueue_head(&rb->waitq);
> @@ -151,8 +150,6 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
>  static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
>  {
>         struct bpf_ringbuf_map *rb_map;
> -       u64 cost;
> -       int err;
>
>         if (attr->map_flags & ~RINGBUF_CREATE_FLAG_MASK)
>                 return ERR_PTR(-EINVAL);
> @@ -174,26 +171,13 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
>
>         bpf_map_init_from_attr(&rb_map->map, attr);
>
> -       cost = sizeof(struct bpf_ringbuf_map) +
> -              sizeof(struct bpf_ringbuf) +
> -              attr->max_entries;
> -       err = bpf_map_charge_init(&rb_map->map.memory, cost);
> -       if (err)
> -               goto err_free_map;
> -
>         rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node);
> -       if (IS_ERR(rb_map->rb)) {
> -               err = PTR_ERR(rb_map->rb);
> -               goto err_uncharge;
> +       if (!rb_map->rb) {
> +               kfree(rb_map);
> +               return ERR_PTR(-ENOMEM);
>         }
>
>         return &rb_map->map;
> -
> -err_uncharge:
> -       bpf_map_charge_finish(&rb_map->map.memory);
> -err_free_map:
> -       kfree(rb_map);
> -       return ERR_PTR(err);
>  }
>
>  static void bpf_ringbuf_free(struct bpf_ringbuf *rb)
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps Roman Gushchin
@ 2020-07-28  5:37   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:37 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for sockmap and sockhash maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/core/sock_map.c | 33 ++++++---------------------------
>  1 file changed, 6 insertions(+), 27 deletions(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index bc797adca44c..07c90baf8db1 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -26,8 +26,6 @@ struct bpf_stab {
>  static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
>  {
>         struct bpf_stab *stab;
> -       u64 cost;
> -       int err;
>
>         if (!capable(CAP_NET_ADMIN))
>                 return ERR_PTR(-EPERM);
> @@ -45,22 +43,15 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
>         bpf_map_init_from_attr(&stab->map, attr);
>         raw_spin_lock_init(&stab->lock);
>
> -       /* Make sure page count doesn't overflow. */
> -       cost = (u64) stab->map.max_entries * sizeof(struct sock *);
> -       err = bpf_map_charge_init(&stab->map.memory, cost);
> -       if (err)
> -               goto free_stab;
> -
>         stab->sks = bpf_map_area_alloc(stab->map.max_entries *
>                                        sizeof(struct sock *),
>                                        stab->map.numa_node);
> -       if (stab->sks)
> -               return &stab->map;
> -       err = -ENOMEM;
> -       bpf_map_charge_finish(&stab->map.memory);
> -free_stab:
> -       kfree(stab);
> -       return ERR_PTR(err);
> +       if (!stab->sks) {
> +               kfree(stab);
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       return &stab->map;
>  }
>
>  int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog)
> @@ -999,7 +990,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>  {
>         struct bpf_shtab *htab;
>         int i, err;
> -       u64 cost;
>
>         if (!capable(CAP_NET_ADMIN))
>                 return ERR_PTR(-EPERM);
> @@ -1027,21 +1017,10 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>                 goto free_htab;
>         }
>
> -       cost = (u64) htab->buckets_num * sizeof(struct bpf_shtab_bucket) +
> -              (u64) htab->elem_size * htab->map.max_entries;
> -       if (cost >= U32_MAX - PAGE_SIZE) {
> -               err = -EINVAL;
> -               goto free_htab;
> -       }
> -       err = bpf_map_charge_init(&htab->map.memory, cost);
> -       if (err)
> -               goto free_htab;
> -
>         htab->buckets = bpf_map_area_alloc(htab->buckets_num *
>                                            sizeof(struct bpf_shtab_bucket),
>                                            htab->map.numa_node);
>         if (!htab->buckets) {
> -               bpf_map_charge_finish(&htab->map.memory);
>                 err = -ENOMEM;
>                 goto free_htab;
>         }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps Roman Gushchin
@ 2020-07-28  5:38   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:38 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:22 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for stackmap maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/stackmap.c | 16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 5beb2f8c23da..9ac0f405beef 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -90,7 +90,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
>  {
>         u32 value_size = attr->value_size;
>         struct bpf_stack_map *smap;
> -       struct bpf_map_memory mem;
>         u64 cost, n_buckets;
>         int err;
>
> @@ -119,15 +118,9 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
>
>         cost = n_buckets * sizeof(struct stack_map_bucket *) + sizeof(*smap);
>         cost += n_buckets * (value_size + sizeof(struct stack_map_bucket));
> -       err = bpf_map_charge_init(&mem, cost);
> -       if (err)
> -               return ERR_PTR(err);
> -
>         smap = bpf_map_area_alloc(cost, bpf_map_attr_numa_node(attr));
> -       if (!smap) {
> -               bpf_map_charge_finish(&mem);
> +       if (!smap)
>                 return ERR_PTR(-ENOMEM);
> -       }
>
>         bpf_map_init_from_attr(&smap->map, attr);
>         smap->map.value_size = value_size;
> @@ -135,20 +128,17 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
>
>         err = get_callchain_buffers(sysctl_perf_event_max_stack);
>         if (err)
> -               goto free_charge;
> +               goto free_smap;
>
>         err = prealloc_elems_and_freelist(smap);
>         if (err)
>                 goto put_buffers;
>
> -       bpf_map_charge_move(&smap->map.memory, &mem);
> -
>         return &smap->map;
>
>  put_buffers:
>         put_callchain_buffers();
> -free_charge:
> -       bpf_map_charge_finish(&mem);
> +free_smap:
>         bpf_map_area_free(smap);
>         return ERR_PTR(err);
>  }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps Roman Gushchin
@ 2020-07-28  5:41   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:41 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for socket storage maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/core/bpf_sk_storage.c | 11 -----------
>  1 file changed, 11 deletions(-)
>
> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index fbcd03cd00d3..c0a35b6368af 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -676,8 +676,6 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
>         struct bpf_sk_storage_map *smap;
>         unsigned int i;
>         u32 nbuckets;
> -       u64 cost;
> -       int ret;
>
>         smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!smap)
> @@ -688,18 +686,9 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
>         /* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
>         nbuckets = max_t(u32, 2, nbuckets);
>         smap->bucket_log = ilog2(nbuckets);
> -       cost = sizeof(*smap->buckets) * nbuckets + sizeof(*smap);
> -
> -       ret = bpf_map_charge_init(&smap->map.memory, cost);
> -       if (ret < 0) {
> -               kfree(smap);
> -               return ERR_PTR(ret);
> -       }
> -
>         smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
>                                  GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
>         if (!smap->buckets) {
> -               bpf_map_charge_finish(&smap->map.memory);
>                 kfree(smap);
>                 return ERR_PTR(-ENOMEM);
>         }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps Roman Gushchin
@ 2020-07-28  5:42   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:42 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for xskmap maps.
> It has been replaced with the memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


> ---
>  net/xdp/xskmap.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
> index e574b22defe5..0366013f13c6 100644
> --- a/net/xdp/xskmap.c
> +++ b/net/xdp/xskmap.c
> @@ -74,7 +74,6 @@ static void xsk_map_sock_delete(struct xdp_sock *xs,
>
>  static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
>  {
> -       struct bpf_map_memory mem;
>         int err, numa_node;
>         struct xsk_map *m;
>         u64 size;
> @@ -90,18 +89,11 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
>         numa_node = bpf_map_attr_numa_node(attr);
>         size = struct_size(m, xsk_map, attr->max_entries);
>
> -       err = bpf_map_charge_init(&mem, size);
> -       if (err < 0)
> -               return ERR_PTR(err);
> -
>         m = bpf_map_area_alloc(size, numa_node);
> -       if (!m) {
> -               bpf_map_charge_finish(&mem);
> +       if (!m)
>                 return ERR_PTR(-ENOMEM);
> -       }
>
>         bpf_map_init_from_attr(&m->map, attr);
> -       bpf_map_charge_move(&m->map.memory, &mem);
>         spin_lock_init(&m->lock);
>
>         return &m->map;
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-27 18:44 ` [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps Roman Gushchin
@ 2020-07-28  5:47   ` Song Liu
  2020-07-28  5:58     ` Andrii Nakryiko
  0 siblings, 1 reply; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:47 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Remove rlimit-based accounting infrastructure code, which is not used
> anymore.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
[...]
>
>  static void bpf_map_put_uref(struct bpf_map *map)
> @@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
>                    "value_size:\t%u\n"
>                    "max_entries:\t%u\n"
>                    "map_flags:\t%#x\n"
> -                  "memlock:\t%llu\n"
> +                  "memlock:\t%llu\n" /* deprecated */

I am not sure whether we can deprecate this one. How difficult is it
to keep this statistic?

Thanks,
Song

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs
  2020-07-27 18:44 ` [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs Roman Gushchin
@ 2020-07-28  5:55   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  5:55 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for bpf progs. It has been
> replaced with memcg-based memory accounting.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer
  2020-07-27 18:44 ` [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
  2020-07-28  5:37   ` Song Liu
@ 2020-07-28  5:56   ` Andrii Nakryiko
  1 sibling, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  5:56 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Do not use rlimit-based memory accounting for bpf ringbuffer.
> It has been replaced with the memcg-based memory accounting.
>
> bpf_ringbuf_alloc() can't return anything except ERR_PTR(-ENOMEM)
> or a valid pointer, so to simplify the code make it return NULL
> in the first case. This allows dropping a couple of lines in
> ringbuf_map_alloc() and also makes it look similar to other memory
> allocating functions like kmalloc().
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---

LGTM.

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  kernel/bpf/ringbuf.c | 24 ++++--------------------
>  1 file changed, 4 insertions(+), 20 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-28  5:47   ` Song Liu
@ 2020-07-28  5:58     ` Andrii Nakryiko
  2020-07-28  6:06       ` Song Liu
  0 siblings, 1 reply; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  5:58 UTC (permalink / raw)
  To: Song Liu
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

On Mon, Jul 27, 2020 at 10:47 PM Song Liu <song@kernel.org> wrote:
>
> On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > Remove rlimit-based accounting infrastructure code, which is not used
> > anymore.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> [...]
> >
> >  static void bpf_map_put_uref(struct bpf_map *map)
> > @@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> >                    "value_size:\t%u\n"
> >                    "max_entries:\t%u\n"
> >                    "map_flags:\t%#x\n"
> > -                  "memlock:\t%llu\n"
> > +                  "memlock:\t%llu\n" /* deprecated */
>
> I am not sure whether we can deprecate this one. How difficult is it
> to keep this statistic?
>

It's factually correct now that a BPF map doesn't use any memlock memory, no?

This is actually one way to detect whether RLIMIT_MEMLOCK is necessary
or not: create a small map and check whether its fdinfo has memlock: 0 or not
:)
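
A rough sketch of the fdinfo side of that check, assuming the "memlock:\t%llu"
line format shown in the quoted bpf_map_show_fdinfo() hunk. parse_memlock()
and read_fd_memlock() are hypothetical helper names, map_fd is assumed to be
a descriptor for a freshly created small map (creating it is left out), and
whether new kernels will actually report 0 here is still being discussed in
this thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the "memlock:" field in fdinfo-style text and store its value in *out.
 * Returns 0 on success, -1 if the field is missing or has no number.
 */
int parse_memlock(const char *fdinfo, unsigned long long *out)
{
	const char *line = fdinfo;

	while (line && *line) {
		if (strncmp(line, "memlock:", 8) == 0) {
			const char *val = line + 8;
			char *end;
			unsigned long long v = strtoull(val, &end, 10);

			if (end == val)
				return -1;	/* no digits after the key */
			*out = v;
			return 0;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read /proc/self/fdinfo/<map_fd> and extract the memlock value from it. */
int read_fd_memlock(int map_fd, unsigned long long *out)
{
	char path[64], buf[4096];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", map_fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_memlock(buf, out);
}
```

A caller would treat a successful read with a value of 0 as "no
RLIMIT_MEMLOCK bump needed" and fall back to bumping in every other case,
including a read or parse failure.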

> Thanks,
> Song

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-27 23:15     ` Roman Gushchin
@ 2020-07-28  5:59       ` Andrii Nakryiko
  2020-07-30  1:38         ` Roman Gushchin
  0 siblings, 1 reply; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  5:59 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 4:15 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Mon, Jul 27, 2020 at 03:05:11PM -0700, Andrii Nakryiko wrote:
> > On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > As bpf is not using memlock rlimit for memory accounting anymore,
> > > let's remove the related code from libbpf.
> > >
> > > Bpf operations can't fail because of exceeding the limit anymore.
> > >
> >
> > They can't in the newest kernel, but libbpf will keep working and
> > supporting old kernels for a very long time now. So please don't
> > remove any of this.
>
> Yeah, good point, agree.
> So we can just drop this patch from the series, no other changes
> are needed.
>
> >
> > But it would be nice to add a detection of whether kernel needs a
> > RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> > detect this from user-space?
>
> Hm, the best idea I can think of is to wait for -EPERM before bumping.
> We can in theory look for the presence of memory.stat::percpu in cgroupfs,
> but it's way too cryptic.
>

As I just mentioned on another thread, checking fdinfo's "memlock: 0"
should be reliable enough, no?
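
For comparison, the "wait for -EPERM before bumping" alternative from the
quoted text could look roughly like this. It is only a sketch:
run_with_lazy_bump() and the fake_* stand-ins are hypothetical names made up
for illustration, and the op callback is assumed to return a negative errno
on failure. bump_memlock_rlimit() mirrors the helper that tools in this
series already carry. Note that -EPERM can also mean missing capabilities,
so the retry is a heuristic rather than a precise probe.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>

/* Raise RLIMIT_MEMLOCK to infinity, as the tools in this series used to do. */
int bump_memlock_rlimit(void)
{
	struct rlimit rlim_new = {
		.rlim_cur = RLIM_INFINITY,
		.rlim_max = RLIM_INFINITY,
	};

	return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
}

/*
 * Run op(arg); if it fails with -EPERM, call bump() once and retry.
 * op returns a non-negative value on success or a negative errno.
 */
int run_with_lazy_bump(int (*op)(void *), int (*bump)(void), void *arg)
{
	int ret = op(arg);

	if (ret != -EPERM)
		return ret;
	if (bump())
		return ret;	/* could not raise the limit: keep the original error */
	return op(arg);
}

/* Stand-ins for a BPF syscall wrapper and for the bump, used for a dry run. */
int limit_raised;
int bump_calls;

int fake_map_create(void *arg)
{
	(void)arg;
	return limit_raised ? 3 : -EPERM;	/* 3 plays the role of a map fd */
}

int fake_bump(void)
{
	bump_calls++;
	limit_raised = 1;
	return 0;
}
```

In a real tool the call would be something like
run_with_lazy_bump(create_my_map, bump_memlock_rlimit, &opts), where
create_my_map and opts are again placeholders. On kernels with memcg-based
accounting the first attempt succeeds and the limit is never touched.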

> Thanks!

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK Roman Gushchin
@ 2020-07-28  6:00   ` Song Liu
  2020-07-28  6:00   ` Andrii Nakryiko
  1 sibling, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  6:00 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf stopped using memlock rlimit to limit the memory usage,
> there is no more reason for bpftool to alter its own limits.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

I think we will need a feature check for memcg-based accounting.

Thanks,
Song

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK Roman Gushchin
  2020-07-28  6:00   ` Song Liu
@ 2020-07-28  6:00   ` Andrii Nakryiko
  1 sibling, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  6:00 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf stopped using memlock rlimit to limit the memory usage,
> there is no more reason for bpftool to alter its own limits.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---

This can't be removed either, due to old kernel support. We probably
should have a helper function to probe RLIMIT_MEMLOCK use by the BPF
subsystem, though, and not call set_max_rlimit() if it is not necessary.

>  tools/bpf/bpftool/common.c     | 7 -------
>  tools/bpf/bpftool/feature.c    | 2 --
>  tools/bpf/bpftool/main.h       | 2 --
>  tools/bpf/bpftool/map.c        | 2 --
>  tools/bpf/bpftool/pids.c       | 1 -
>  tools/bpf/bpftool/prog.c       | 3 ---
>  tools/bpf/bpftool/struct_ops.c | 2 --
>  7 files changed, 19 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 31/35] bpf: runqslower: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 31/35] bpf: runqslower: don't " Roman Gushchin
@ 2020-07-28  6:03   ` Andrii Nakryiko
  0 siblings, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  6:03 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:24 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf is not using memlock rlimit for memory accounting,
> there are no more reasons to bump the limit.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---
>  tools/bpf/runqslower/runqslower.c | 16 ----------------
>  1 file changed, 16 deletions(-)
>

This can go, I suppose. We still have a runqslower variant in BCC with
this logic, as an example of how to do this on kernels without this
patch set applied.

Acked-by: Andrii Nakryiko <andriin@fb.com>

> diff --git a/tools/bpf/runqslower/runqslower.c b/tools/bpf/runqslower/runqslower.c
> index d89715844952..a3380b53ce0c 100644
> --- a/tools/bpf/runqslower/runqslower.c
> +++ b/tools/bpf/runqslower/runqslower.c
> @@ -88,16 +88,6 @@ int libbpf_print_fn(enum libbpf_print_level level,
>         return vfprintf(stderr, format, args);
>  }
>
> -static int bump_memlock_rlimit(void)
> -{
> -       struct rlimit rlim_new = {
> -               .rlim_cur       = RLIM_INFINITY,
> -               .rlim_max       = RLIM_INFINITY,
> -       };
> -
> -       return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
> -}
> -
>  void handle_event(void *ctx, int cpu, void *data, __u32 data_sz)
>  {
>         const struct event *e = data;
> @@ -134,12 +124,6 @@ int main(int argc, char **argv)
>
>         libbpf_set_print(libbpf_print_fn);
>
> -       err = bump_memlock_rlimit();
> -       if (err) {
> -               fprintf(stderr, "failed to increase rlimit: %d", err);
> -               return 1;
> -       }
> -
>         obj = runqslower_bpf__open();
>         if (!obj) {
>                 fprintf(stderr, "failed to open and/or load BPF object\n");
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-28  5:58     ` Andrii Nakryiko
@ 2020-07-28  6:06       ` Song Liu
  2020-07-28 19:08         ` Roman Gushchin
  0 siblings, 1 reply; 88+ messages in thread
From: Song Liu @ 2020-07-28  6:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

On Mon, Jul 27, 2020 at 10:58 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jul 27, 2020 at 10:47 PM Song Liu <song@kernel.org> wrote:
> >
> > On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > Remove rlimit-based accounting infrastructure code, which is not used
> > > anymore.
> > >
> > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > [...]
> > >
> > >  static void bpf_map_put_uref(struct bpf_map *map)
> > > @@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> > >                    "value_size:\t%u\n"
> > >                    "max_entries:\t%u\n"
> > >                    "map_flags:\t%#x\n"
> > > -                  "memlock:\t%llu\n"
> > > +                  "memlock:\t%llu\n" /* deprecated */
> >
> > I am not sure whether we can deprecate this one. How difficult is it
> > to keep this statistic?
> >
>
> It's factually correct now that a BPF map doesn't use any memlock memory, no?

I am not sure whether memlock really means memlock for all users... I bet there
are users who use memlock to check total memory used by the map.

>
> This is actually one way to detect whether RLIMIT_MEMLOCK is necessary
> or not: create a small map and check whether its fdinfo has memlock: 0 or not
> :)

If we do show memlock=0, this is a good check...

Thanks,
Song

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h
  2020-07-27 18:45 ` [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h Roman Gushchin
@ 2020-07-28  6:06   ` Andrii Nakryiko
  2020-07-28  6:11     ` Song Liu
  0 siblings, 1 reply; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  6:06 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
>
> As rlimit-based memory accounting is not used by bpf anymore,
> there are no more reasons to play with memlock rlimit.
>
> Delete bpf_rlimit.h, which contained code to bump the limit.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---

We run test_progs on old kernels as part of libbpf Github CI. We'll
need to either leave setrlimit() or do it conditionally, depending on
detected kernel feature support.

>  samples/bpf/hbm.c                             |  1 -
>  tools/testing/selftests/bpf/bpf_rlimit.h      | 28 -------------------
>  .../selftests/bpf/flow_dissector_load.c       |  1 -
>  .../selftests/bpf/get_cgroup_id_user.c        |  1 -
>  .../bpf/prog_tests/select_reuseport.c         |  1 -
>  .../selftests/bpf/prog_tests/sk_lookup.c      |  1 -
>  tools/testing/selftests/bpf/test_btf.c        |  1 -
>  .../selftests/bpf/test_cgroup_storage.c       |  1 -
>  tools/testing/selftests/bpf/test_dev_cgroup.c |  1 -
>  tools/testing/selftests/bpf/test_lpm_map.c    |  1 -
>  tools/testing/selftests/bpf/test_lru_map.c    |  1 -
>  tools/testing/selftests/bpf/test_maps.c       |  1 -
>  tools/testing/selftests/bpf/test_netcnt.c     |  1 -
>  tools/testing/selftests/bpf/test_progs.c      |  1 -
>  .../selftests/bpf/test_skb_cgroup_id_user.c   |  1 -
>  tools/testing/selftests/bpf/test_sock.c       |  1 -
>  tools/testing/selftests/bpf/test_sock_addr.c  |  1 -
>  .../testing/selftests/bpf/test_sock_fields.c  |  1 -
>  .../selftests/bpf/test_socket_cookie.c        |  1 -
>  tools/testing/selftests/bpf/test_sockmap.c    |  1 -
>  tools/testing/selftests/bpf/test_sysctl.c     |  1 -
>  tools/testing/selftests/bpf/test_tag.c        |  1 -
>  .../bpf/test_tcp_check_syncookie_user.c       |  1 -
>  .../testing/selftests/bpf/test_tcpbpf_user.c  |  1 -
>  .../selftests/bpf/test_tcpnotify_user.c       |  1 -
>  tools/testing/selftests/bpf/test_verifier.c   |  1 -
>  .../testing/selftests/bpf/test_verifier_log.c |  2 --
>  27 files changed, 55 deletions(-)
>  delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h
>

[...]


* Re: [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK Roman Gushchin
@ 2020-07-28  6:08   ` Andrii Nakryiko
  0 siblings, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  6:08 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf is not using memlock rlimit for memory accounting,
> there are no more reasons to bump the limit.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---

Similarly for bench: it's a tool that's not coupled to the latest
kernel version, so it would be a big step down if the tool didn't bump
rlimit on its own on slightly older kernels. Let's just keep it for
now.

>  tools/testing/selftests/bpf/bench.c           | 16 ---------------
>  .../selftests/bpf/progs/bpf_iter_bpf_map.c    |  5 ++---
>  tools/testing/selftests/bpf/xdping.c          |  6 ------
>  tools/testing/selftests/net/reuseport_bpf.c   | 20 -------------------
>  4 files changed, 2 insertions(+), 45 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
> index 944ad4721c83..f66610541c8a 100644
> --- a/tools/testing/selftests/bpf/bench.c
> +++ b/tools/testing/selftests/bpf/bench.c
> @@ -29,25 +29,9 @@ static int libbpf_print_fn(enum libbpf_print_level level,
>         return vfprintf(stderr, format, args);
>  }
>
> -static int bump_memlock_rlimit(void)
> -{
> -       struct rlimit rlim_new = {
> -               .rlim_cur       = RLIM_INFINITY,
> -               .rlim_max       = RLIM_INFINITY,
> -       };
> -
> -       return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
> -}
> -
>  void setup_libbpf()
>  {
> -       int err;
> -
>         libbpf_set_print(libbpf_print_fn);
> -
> -       err = bump_memlock_rlimit();
> -       if (err)
> -               fprintf(stderr, "failed to increase RLIMIT_MEMLOCK: %d", err);
>  }
>

[...]


* Re: [PATCH bpf-next v2 35/35] perf: don't touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 35/35] perf: don't " Roman Gushchin
@ 2020-07-28  6:09   ` Andrii Nakryiko
  2020-07-28 12:13     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28  6:09 UTC (permalink / raw)
  To: Roman Gushchin, Arnaldo Carvalho de Melo
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf stopped using memlock rlimit to limit the memory usage,
> there is no more reason for perf to alter its own limit.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---

Cc'd Arnaldo, but I'm guessing it's a similar situation: the latest
perf might be running on an older kernel and should keep working.

>  tools/perf/builtin-trace.c      | 10 ----------
>  tools/perf/tests/builtin-test.c |  6 ------
>  tools/perf/util/Build           |  1 -
>  tools/perf/util/rlimit.c        | 29 -----------------------------
>  tools/perf/util/rlimit.h        |  6 ------
>  5 files changed, 52 deletions(-)
>  delete mode 100644 tools/perf/util/rlimit.c
>  delete mode 100644 tools/perf/util/rlimit.h
>

[...]


* Re: [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h
  2020-07-28  6:06   ` Andrii Nakryiko
@ 2020-07-28  6:11     ` Song Liu
  2020-07-28 18:30       ` Andrii Nakryiko
  0 siblings, 1 reply; 88+ messages in thread
From: Song Liu @ 2020-07-28  6:11 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list



> On Jul 27, 2020, at 11:06 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
>> 
>> As rlimit-based memory accounting is not used by bpf anymore,
>> there are no more reasons to play with memlock rlimit.
>> 
>> Delete bpf_rlimit.h, which contained code to bump the limit.
>> 
>> Signed-off-by: Roman Gushchin <guro@fb.com>
>> ---
> 
> We run test_progs on old kernels as part of libbpf Github CI. We'll
> need to either leave setrlimit() or do it conditionally, depending on
> detected kernel feature support.

Hmm... I am surprised that running test_progs on old kernels is not 
too noisy. Have we got any issue with that?

Thanks,
Song

> 
>> samples/bpf/hbm.c                             |  1 -
>> tools/testing/selftests/bpf/bpf_rlimit.h      | 28 -------------------
>> .../selftests/bpf/flow_dissector_load.c       |  1 -
>> .../selftests/bpf/get_cgroup_id_user.c        |  1 -
>> .../bpf/prog_tests/select_reuseport.c         |  1 -
>> .../selftests/bpf/prog_tests/sk_lookup.c      |  1 -
>> tools/testing/selftests/bpf/test_btf.c        |  1 -
>> .../selftests/bpf/test_cgroup_storage.c       |  1 -
>> tools/testing/selftests/bpf/test_dev_cgroup.c |  1 -
>> tools/testing/selftests/bpf/test_lpm_map.c    |  1 -
>> tools/testing/selftests/bpf/test_lru_map.c    |  1 -
>> tools/testing/selftests/bpf/test_maps.c       |  1 -
>> tools/testing/selftests/bpf/test_netcnt.c     |  1 -
>> tools/testing/selftests/bpf/test_progs.c      |  1 -
>> .../selftests/bpf/test_skb_cgroup_id_user.c   |  1 -
>> tools/testing/selftests/bpf/test_sock.c       |  1 -
>> tools/testing/selftests/bpf/test_sock_addr.c  |  1 -
>> .../testing/selftests/bpf/test_sock_fields.c  |  1 -
>> .../selftests/bpf/test_socket_cookie.c        |  1 -
>> tools/testing/selftests/bpf/test_sockmap.c    |  1 -
>> tools/testing/selftests/bpf/test_sysctl.c     |  1 -
>> tools/testing/selftests/bpf/test_tag.c        |  1 -
>> .../bpf/test_tcp_check_syncookie_user.c       |  1 -
>> .../testing/selftests/bpf/test_tcpbpf_user.c  |  1 -
>> .../selftests/bpf/test_tcpnotify_user.c       |  1 -
>> tools/testing/selftests/bpf/test_verifier.c   |  1 -
>> .../testing/selftests/bpf/test_verifier_log.c |  2 --
>> 27 files changed, 55 deletions(-)
>> delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h
>> 
> 
> [...]



* Re: [PATCH bpf-next v2 34/35] bpf: samples: do not touch RLIMIT_MEMLOCK
  2020-07-27 18:45 ` [PATCH bpf-next v2 34/35] bpf: samples: do not " Roman Gushchin
@ 2020-07-28  6:14   ` Song Liu
  0 siblings, 0 replies; 88+ messages in thread
From: Song Liu @ 2020-07-28  6:14 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
>
> Since bpf is not using rlimit memlock for the memory accounting
> and control, do not change the limit in sample applications.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
[...]
>  samples/bpf/xdp_rxq_info_user.c     |  6 ------
>  samples/bpf/xdp_sample_pkts_user.c  |  6 ------
>  samples/bpf/xdp_tx_iptunnel_user.c  |  6 ------
>  samples/bpf/xdpsock_user.c          |  7 -------
>  27 files changed, 133 deletions(-)

133 (-) no (+), nice! :)

[...]


* Re: [PATCH bpf-next v2 35/35] perf: don't touch RLIMIT_MEMLOCK
  2020-07-28  6:09   ` Andrii Nakryiko
@ 2020-07-28 12:13     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 88+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-07-28 12:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

Em Mon, Jul 27, 2020 at 11:09:43PM -0700, Andrii Nakryiko escreveu:
> On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > Since bpf stopped using memlock rlimit to limit the memory usage,
> > there is no more reason for perf to alter its own limit.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > ---
> 
> Cc'd Arnaldo, but I'm guessing it's a similar situation: the latest
> perf might be running on an older kernel and should keep working.

Yes, please leave it as is, the latest perf should continue working with
older kernels, so if there is a way to figure out if the kernel running
is one where BPF doesn't use memlock rlimit for that purpose, then in
those cases we shouldn't use it.

- Arnaldo
 
> >  tools/perf/builtin-trace.c      | 10 ----------
> >  tools/perf/tests/builtin-test.c |  6 ------
> >  tools/perf/util/Build           |  1 -
> >  tools/perf/util/rlimit.c        | 29 -----------------------------
> >  tools/perf/util/rlimit.h        |  6 ------
> >  5 files changed, 52 deletions(-)
> >  delete mode 100644 tools/perf/util/rlimit.c
> >  delete mode 100644 tools/perf/util/rlimit.h
> >
> 
> [...]

-- 

- Arnaldo


* Re: [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h
  2020-07-28  6:11     ` Song Liu
@ 2020-07-28 18:30       ` Andrii Nakryiko
  0 siblings, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28 18:30 UTC (permalink / raw)
  To: Song Liu
  Cc: Roman Gushchin, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

On Mon, Jul 27, 2020 at 11:11 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 27, 2020, at 11:06 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Jul 27, 2020 at 12:25 PM Roman Gushchin <guro@fb.com> wrote:
> >>
> >> As rlimit-based memory accounting is not used by bpf anymore,
> >> there are no more reasons to play with memlock rlimit.
> >>
> >> Delete bpf_rlimit.h, which contained code to bump the limit.
> >>
> >> Signed-off-by: Roman Gushchin <guro@fb.com>
> >> ---
> >
> > We run test_progs on old kernels as part of libbpf Github CI. We'll
> > need to either leave setrlimit() or do it conditionally, depending on
> > detected kernel feature support.
>
> Hmm... I am surprised that running test_progs on old kernels is not
> too noisy. Have we got any issue with that?
>

For libbpf CI we maintain a list of enabled/disabled tests that are
not supposed to succeed on a given kernel. So it works OK in practice,
just needs an occasional update to those lists.


> Thanks,
> Song
>
> >
> >> samples/bpf/hbm.c                             |  1 -
> >> tools/testing/selftests/bpf/bpf_rlimit.h      | 28 -------------------
> >> .../selftests/bpf/flow_dissector_load.c       |  1 -
> >> .../selftests/bpf/get_cgroup_id_user.c        |  1 -
> >> .../bpf/prog_tests/select_reuseport.c         |  1 -
> >> .../selftests/bpf/prog_tests/sk_lookup.c      |  1 -
> >> tools/testing/selftests/bpf/test_btf.c        |  1 -
> >> .../selftests/bpf/test_cgroup_storage.c       |  1 -
> >> tools/testing/selftests/bpf/test_dev_cgroup.c |  1 -
> >> tools/testing/selftests/bpf/test_lpm_map.c    |  1 -
> >> tools/testing/selftests/bpf/test_lru_map.c    |  1 -
> >> tools/testing/selftests/bpf/test_maps.c       |  1 -
> >> tools/testing/selftests/bpf/test_netcnt.c     |  1 -
> >> tools/testing/selftests/bpf/test_progs.c      |  1 -
> >> .../selftests/bpf/test_skb_cgroup_id_user.c   |  1 -
> >> tools/testing/selftests/bpf/test_sock.c       |  1 -
> >> tools/testing/selftests/bpf/test_sock_addr.c  |  1 -
> >> .../testing/selftests/bpf/test_sock_fields.c  |  1 -
> >> .../selftests/bpf/test_socket_cookie.c        |  1 -
> >> tools/testing/selftests/bpf/test_sockmap.c    |  1 -
> >> tools/testing/selftests/bpf/test_sysctl.c     |  1 -
> >> tools/testing/selftests/bpf/test_tag.c        |  1 -
> >> .../bpf/test_tcp_check_syncookie_user.c       |  1 -
> >> .../testing/selftests/bpf/test_tcpbpf_user.c  |  1 -
> >> .../selftests/bpf/test_tcpnotify_user.c       |  1 -
> >> tools/testing/selftests/bpf/test_verifier.c   |  1 -
> >> .../testing/selftests/bpf/test_verifier_log.c |  2 --
> >> 27 files changed, 55 deletions(-)
> >> delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h
> >>
> >
> > [...]
>


* Re: [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-28  6:06       ` Song Liu
@ 2020-07-28 19:08         ` Roman Gushchin
  2020-07-28 19:16           ` Andrii Nakryiko
  0 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-28 19:08 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list

On Mon, Jul 27, 2020 at 11:06:42PM -0700, Song Liu wrote:
> On Mon, Jul 27, 2020 at 10:58 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Jul 27, 2020 at 10:47 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > Remove rlimit-based accounting infrastructure code, which is not used
> > > > anymore.
> > > >
> > > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > [...]
> > > >
> > > >  static void bpf_map_put_uref(struct bpf_map *map)
> > > > @@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> > > >                    "value_size:\t%u\n"
> > > >                    "max_entries:\t%u\n"
> > > >                    "map_flags:\t%#x\n"
> > > > -                  "memlock:\t%llu\n"
> > > > +                  "memlock:\t%llu\n" /* deprecated */
> > >
> > > > I am not sure whether we can deprecate this one... How difficult is it
> > > > to keep this statistic?
> > >
> >
> > It's factually correct now, that BPF map doesn't use any memlock memory, no?

Right.

> 
> I am not sure whether memlock really means memlock for all users... I bet there
> are users who use memlock to check total memory used by the map.

But this is just a part of struct bpf_map, so I agree with Andrii,
it's a safe check.

> 
> >
> > This is actually one way to detect whether RLIMIT_MEMLOCK is necessary
> > or not: create a small map, check if its fdinfo has memlock: 0 or not
> > :)
> 
> If we do show memlock=0, this is a good check...

The only question I have is whether it's worth checking at all. Bumping the
rlimit is a way cheaper operation than creating a temporary map and checking
its properties.

So is there any win in comparison to just leaving the userspace code* as it is
for now?

* except runqslower and samples


* Re: [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps
  2020-07-28 19:08         ` Roman Gushchin
@ 2020-07-28 19:16           ` Andrii Nakryiko
  0 siblings, 0 replies; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-28 19:16 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Song Liu, bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Tue, Jul 28, 2020 at 12:09 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Mon, Jul 27, 2020 at 11:06:42PM -0700, Song Liu wrote:
> > On Mon, Jul 27, 2020 at 10:58 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Mon, Jul 27, 2020 at 10:47 PM Song Liu <song@kernel.org> wrote:
> > > >
> > > > On Mon, Jul 27, 2020 at 12:26 PM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > Remove rlimit-based accounting infrastructure code, which is not used
> > > > > anymore.
> > > > >
> > > > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > > [...]
> > > > >
> > > > >  static void bpf_map_put_uref(struct bpf_map *map)
> > > > > @@ -541,7 +484,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> > > > >                    "value_size:\t%u\n"
> > > > >                    "max_entries:\t%u\n"
> > > > >                    "map_flags:\t%#x\n"
> > > > > -                  "memlock:\t%llu\n"
> > > > > +                  "memlock:\t%llu\n" /* deprecated */
> > > >
> > > > > I am not sure whether we can deprecate this one... How difficult is it
> > > > > to keep this statistic?
> > > >
> > >
> > > It's factually correct now, that BPF map doesn't use any memlock memory, no?
>
> Right.
>
> >
> > I am not sure whether memlock really means memlock for all users... I bet there
> > are users who use memlock to check total memory used by the map.
>
> But this is just a part of struct bpf_map, so I agree with Andrii,
> it's a safe check.
>
> >
> > >
> > > This is actually one way to detect whether RLIMIT_MEMLOCK is necessary
> > > or not: create a small map, check if its fdinfo has memlock: 0 or not
> > > :)
> >
> > If we do show memlock=0, this is a good check...
>
> The only question I have is whether it's worth checking at all. Bumping the
> rlimit is a way cheaper operation than creating a temporary map and checking
> its properties.
>

For perf and libbpf, I think it's totally worth it. Bumping
RLIMIT_MEMLOCK automatically risks messing up some other parts of the
system (e.g., BCC just bumps it to INFINITY, which potentially allows
over-allocating too much memory, for unrelated applications that do
rely on RLIMIT_MEMLOCK). It's one of the reasons why libbpf doesn't do
it automatically, actually. So knowing when the bump is not necessary
would let libbpf improve its diagnostic messages, and would let
perf/BCC/etc. avoid a potentially risky operation.

> So is there any win in comparison to just leaving the userspace code* as it is
> for now?
>
> * except runqslower and samples


* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-28  5:59       ` Andrii Nakryiko
@ 2020-07-30  1:38         ` Roman Gushchin
  2020-07-30 19:39           ` Andrii Nakryiko
  0 siblings, 1 reply; 88+ messages in thread
From: Roman Gushchin @ 2020-07-30  1:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Mon, Jul 27, 2020 at 10:59:33PM -0700, Andrii Nakryiko wrote:
> On Mon, Jul 27, 2020 at 4:15 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Mon, Jul 27, 2020 at 03:05:11PM -0700, Andrii Nakryiko wrote:
> > > On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > As bpf is not using memlock rlimit for memory accounting anymore,
> > > > let's remove the related code from libbpf.
> > > >
> > > > Bpf operations can't fail because of exceeding the limit anymore.
> > > >
> > >
> > > They can't in the newest kernel, but libbpf will keep working and
> > > supporting old kernels for a very long time now. So please don't
> > > remove any of this.
> >
> > Yeah, good point, agree.
> > So we just can drop this patch from the series, no other changes
> > are needed.
> >
> > >
> > > But it would be nice to add a detection of whether kernel needs a
> > > RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> > > detect this from user-space?

Btw, do you mean we should add a new function to the libbpf API?
Or just extend pr_perm_msg() to skip guessing on new kernels?

The problem with the latter one is that it's called on a failed attempt
to create a map, so it's unlikely we'll be able to create a new one just to
test for the "memlock" value. But it also raises a question: what should we
do if the creation of this temporary map fails? Assume the old kernel and
bump the limit?
Idk, maybe it's better to just leave the userspace code as it is for some time.

Thanks!


* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-30  1:38         ` Roman Gushchin
@ 2020-07-30 19:39           ` Andrii Nakryiko
  2020-07-30 20:46             ` Roman Gushchin
  0 siblings, 1 reply; 88+ messages in thread
From: Andrii Nakryiko @ 2020-07-30 19:39 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Wed, Jul 29, 2020 at 6:38 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Mon, Jul 27, 2020 at 10:59:33PM -0700, Andrii Nakryiko wrote:
> > On Mon, Jul 27, 2020 at 4:15 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Mon, Jul 27, 2020 at 03:05:11PM -0700, Andrii Nakryiko wrote:
> > > > On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > As bpf is not using memlock rlimit for memory accounting anymore,
> > > > > let's remove the related code from libbpf.
> > > > >
> > > > > Bpf operations can't fail because of exceeding the limit anymore.
> > > > >
> > > >
> > > > They can't in the newest kernel, but libbpf will keep working and
> > > > supporting old kernels for a very long time now. So please don't
> > > > remove any of this.
> > >
> > > Yeah, good point, agree.
> > > So we just can drop this patch from the series, no other changes
> > > are needed.
> > >
> > > >
> > > > But it would be nice to add a detection of whether kernel needs a
> > > > RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> > > > detect this from user-space?
>
> Btw, do you mean we should add a new function to the libbpf API?
> Or just extend pr_perm_msg() to skip guessing on new kernels?
>

I think we have to do both. There is libbpf_util.h in libbpf, we could
add two functions there:

- libbpf_needs_memlock() that would return true/false if kernel is old
and needs RLIMIT_MEMLOCK
- as a convenience, we can also add libbpf_inc_memlock_by() and
libbpf_set_memlock_to(), which will optionally (if kernel needs it)
adjust RLIMIT_MEMLOCK?
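
A rough sketch of how such convenience helpers could behave, assuming
the result of the "does this kernel need RLIMIT_MEMLOCK" detection is
passed in as a flag. The sketch_* names are hypothetical placeholders
for the functions proposed above, which do not exist in libbpf as of
this thread; only getrlimit()/setrlimit() are real calls here:

```c
#include <stdbool.h>
#include <sys/resource.h>

/* Set the soft RLIMIT_MEMLOCK to `bytes`, but only if the kernel
 * still needs it. `needs_memlock` stands in for the detection result. */
static int sketch_set_memlock_to(bool needs_memlock, rlim_t bytes)
{
	struct rlimit rl;

	if (!needs_memlock)
		return 0;
	if (getrlimit(RLIMIT_MEMLOCK, &rl))
		return -1;

	rl.rlim_cur = bytes;
	/* On Linux RLIM_INFINITY is the largest rlim_t value, so this
	 * comparison also covers bytes == RLIM_INFINITY. Raising the
	 * hard limit needs CAP_SYS_RESOURCE; setrlimit() fails otherwise. */
	if (bytes > rl.rlim_max)
		rl.rlim_max = bytes;
	return setrlimit(RLIMIT_MEMLOCK, &rl);
}

/* Raise the soft RLIMIT_MEMLOCK by `bytes`, saturating at infinity,
 * again only if the kernel still needs it. */
static int sketch_inc_memlock_by(bool needs_memlock, rlim_t bytes)
{
	struct rlimit rl;

	if (!needs_memlock)
		return 0;
	if (getrlimit(RLIMIT_MEMLOCK, &rl))
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 0;
	if (bytes >= RLIM_INFINITY - rl.rlim_cur)
		return sketch_set_memlock_to(true, RLIM_INFINITY);
	return sketch_set_memlock_to(true, rl.rlim_cur + bytes);
}
```

Incrementing by a bounded amount, rather than jumping straight to
RLIM_INFINITY, keeps the limit meaningful for anything else in the
process that relies on it.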

I think for your patch set, given it's pretty big already, let's not
touch runqslower, libbpf, and perf code (I think samples/bpf are fine
to just remove memlock adjustment), and we'll deal with detection and
optional bumping of RLIMIT_MEMLOCK as a separate patch once your
change lands.


> The problem with the latter one is that it's called on a failed attempt
> to create a map, so it's unlikely we'll be able to create a new one just to
> test for the "memlock" value. But it also raises a question: what should we
> do if the creation of this temporary map fails? Assume the old kernel and
> bump the limit?

Yeah, I think we'll have to make assumptions like that. Ideally, of
course, detection of this would be just a simple sysfs value or
something, don't know. Maybe there is already a way for kernel to
communicate something like that?

> Idk, maybe it's better to just leave the userspace code as it is for some time.
>
> Thanks!


* Re: [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage
  2020-07-30 19:39           ` Andrii Nakryiko
@ 2020-07-30 20:46             ` Roman Gushchin
  0 siblings, 0 replies; 88+ messages in thread
From: Roman Gushchin @ 2020-07-30 20:46 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list

On Thu, Jul 30, 2020 at 12:39:40PM -0700, Andrii Nakryiko wrote:
> On Wed, Jul 29, 2020 at 6:38 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Mon, Jul 27, 2020 at 10:59:33PM -0700, Andrii Nakryiko wrote:
> > > On Mon, Jul 27, 2020 at 4:15 PM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Mon, Jul 27, 2020 at 03:05:11PM -0700, Andrii Nakryiko wrote:
> > > > > On Mon, Jul 27, 2020 at 12:21 PM Roman Gushchin <guro@fb.com> wrote:
> > > > > >
> > > > > > As bpf is not using memlock rlimit for memory accounting anymore,
> > > > > > let's remove the related code from libbpf.
> > > > > >
> > > > > > Bpf operations can't fail because of exceeding the limit anymore.
> > > > > >
> > > > >
> > > > > They can't in the newest kernel, but libbpf will keep working and
> > > > > supporting old kernels for a very long time now. So please don't
> > > > > remove any of this.
> > > >
> > > > Yeah, good point, agree.
> > > > So we just can drop this patch from the series, no other changes
> > > > are needed.
> > > >
> > > > >
> > > > > But it would be nice to add a detection of whether kernel needs a
> > > > > RLIMIT_MEMLOCK bump or not. Is there some simple and reliable way to
> > > > > detect this from user-space?
> >
> > Btw, do you mean we should add a new function to the libbpf API?
> > Or just extend pr_perm_msg() to skip guessing on new kernels?
> >
> 
> I think we have to do both. There is libbpf_util.h in libbpf, we could
> add two functions there:
> 
> - libbpf_needs_memlock() that would return true/false if kernel is old
> and needs RLIMIT_MEMLOCK
> - as a convenience, we can also add libbpf_inc_memlock_by() and
> libbpf_set_memlock_to(), which will optionally (if kernel needs it)
> adjust RLIMIT_MEMLOCK?
> 
> I think for your patch set, given it's pretty big already, let's not
> touch runqslower, libbpf, and perf code (I think samples/bpf are fine
> to just remove memlock adjustment), and we'll deal with detection and
> optional bumping of RLIMIT_MEMLOCK as a separate patch once your
> change lands.

Ok, works for me. Let me repost the kernel part + samples as v3.

> 
> 
> > The problem with the latter one is that it's called on a failed attempt
> > to create a map, so it's unlikely we'll be able to create a new one just to
> > test for the "memlock" value. But it also raises a question: what should we
> > do if the creation of this temporary map fails? Assume the old kernel and
> > bump the limit?
> 
> Yeah, I think we'll have to make assumptions like that. Ideally, of
> course, detection of this would be just a simple sysfs value or
> something, don't know. Maybe there is already a way for kernel to
> communicate something like that?

For instance, we have /sys/kernel/cgroup/features for cgroup features:
it's a list of supported mount options for cgroup fs.
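
To illustrate what consuming such a file looks like from userspace,
here is a minimal sketch that checks whether a name is listed in
/sys/kernel/cgroup/features, assuming one feature per line. The helper
names feature_listed() and cgroup_has_feature() are made up for this
example, and no equivalent file for bpf is implied to exist:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `name` appears as a whole line in a newline-separated list. */
static bool feature_listed(const char *list, const char *name)
{
	size_t len = strlen(name);
	const char *p = list;

	while (p && *p) {
		if (!strncmp(p, name, len) &&
		    (p[len] == '\n' || p[len] == '\0'))
			return true;
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return false;
}

/* Read /sys/kernel/cgroup/features and look for `name`.
 * Returns false if the file cannot be read at all. */
static bool cgroup_has_feature(const char *name)
{
	char buf[512];
	size_t n;
	FILE *f = fopen("/sys/kernel/cgroup/features", "r");

	if (!f)
		return false;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	buf[n] = '\0';
	fclose(f);
	return feature_listed(buf, name);
}
```

A missing file and a missing entry both yield false here, which is the
same "assume the old behavior" fallback discussed earlier in the thread.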

Idk if bpf deserves something similar, but as far as I remember,
we discussed it a couple of years ago, and at that time the consensus
was that it's too hard to keep such a list up to date, so the userspace
should just try and fail. Idk if it's still valid.

Thank you!



Thread overview: 88+ messages
2020-07-27 18:44 [PATCH bpf-next v2 00/35] bpf: switch to memcg-based memory accounting Roman Gushchin
2020-07-27 18:44 ` [PATCH bpf-next v2 01/35] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
2020-07-27 22:11   ` Song Liu
2020-07-28  0:08     ` Roman Gushchin
2020-07-28  4:42       ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 02/35] bpf: memcg-based memory accounting for bpf maps Roman Gushchin
2020-07-27 22:12   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 03/35] bpf: refine memcg-based memory accounting for arraymap maps Roman Gushchin
2020-07-27 22:30   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 04/35] bpf: refine memcg-based memory accounting for cpumap maps Roman Gushchin
2020-07-27 22:48   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 05/35] bpf: memcg-based memory accounting for cgroup storage maps Roman Gushchin
2020-07-27 23:05   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 06/35] bpf: refine memcg-based memory accounting for devmap maps Roman Gushchin
2020-07-27 23:35   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 07/35] bpf: refine memcg-based memory accounting for hashtab maps Roman Gushchin
2020-07-27 23:36   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 08/35] bpf: memcg-based memory accounting for lpm_trie maps Roman Gushchin
2020-07-27 23:55   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 09/35] bpf: memcg-based memory accounting for bpf ringbuffer Roman Gushchin
2020-07-27 23:56   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 10/35] bpf: memcg-based memory accounting for socket storage maps Roman Gushchin
2020-07-27 23:57   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 11/35] bpf: refine memcg-based memory accounting for sockmap and sockhash maps Roman Gushchin
2020-07-27 23:58   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 12/35] bpf: refine memcg-based memory accounting for xskmap maps Roman Gushchin
2020-07-28  0:01   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 13/35] bpf: eliminate rlimit-based memory accounting for arraymap maps Roman Gushchin
2020-07-28  0:04   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 14/35] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps Roman Gushchin
2020-07-28  5:29   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 15/35] bpf: eliminate rlimit-based memory accounting for cpumap maps Roman Gushchin
2020-07-28  5:30   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 16/35] bpf: eliminate rlimit-based memory accounting for cgroup storage maps Roman Gushchin
2020-07-28  5:31   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 17/35] bpf: eliminate rlimit-based memory accounting for devmap maps Roman Gushchin
2020-07-28  5:31   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 18/35] bpf: eliminate rlimit-based memory accounting for hashtab maps Roman Gushchin
2020-07-28  5:32   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 19/35] bpf: eliminate rlimit-based memory accounting for lpm_trie maps Roman Gushchin
2020-07-28  5:32   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 20/35] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps Roman Gushchin
2020-07-28  5:35   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 21/35] bpf: eliminate rlimit-based memory accounting for reuseport_array maps Roman Gushchin
2020-07-28  5:36   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 22/35] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
2020-07-28  5:37   ` Song Liu
2020-07-28  5:56   ` Andrii Nakryiko
2020-07-27 18:44 ` [PATCH bpf-next v2 23/35] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps Roman Gushchin
2020-07-28  5:37   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 24/35] bpf: eliminate rlimit-based memory accounting for stackmap maps Roman Gushchin
2020-07-28  5:38   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 25/35] bpf: eliminate rlimit-based memory accounting for socket storage maps Roman Gushchin
2020-07-28  5:41   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 26/35] bpf: eliminate rlimit-based memory accounting for xskmap maps Roman Gushchin
2020-07-28  5:42   ` Song Liu
2020-07-27 18:44 ` [PATCH bpf-next v2 27/35] bpf: eliminate rlimit-based memory accounting infra for bpf maps Roman Gushchin
2020-07-28  5:47   ` Song Liu
2020-07-28  5:58     ` Andrii Nakryiko
2020-07-28  6:06       ` Song Liu
2020-07-28 19:08         ` Roman Gushchin
2020-07-28 19:16           ` Andrii Nakryiko
2020-07-27 18:44 ` [PATCH bpf-next v2 28/35] bpf: eliminate rlimit-based memory accounting for bpf progs Roman Gushchin
2020-07-28  5:55   ` Song Liu
2020-07-27 18:45 ` [PATCH bpf-next v2 29/35] bpf: libbpf: cleanup RLIMIT_MEMLOCK usage Roman Gushchin
2020-07-27 22:05   ` Andrii Nakryiko
2020-07-27 22:44     ` Song Liu
2020-07-27 23:15     ` Roman Gushchin
2020-07-28  5:59       ` Andrii Nakryiko
2020-07-30  1:38         ` Roman Gushchin
2020-07-30 19:39           ` Andrii Nakryiko
2020-07-30 20:46             ` Roman Gushchin
2020-07-27 18:45 ` [PATCH bpf-next v2 30/35] bpf: bpftool: do not touch RLIMIT_MEMLOCK Roman Gushchin
2020-07-28  6:00   ` Song Liu
2020-07-28  6:00   ` Andrii Nakryiko
2020-07-27 18:45 ` [PATCH bpf-next v2 31/35] bpf: runqslower: don't touch RLIMIT_MEMLOCK Roman Gushchin
2020-07-28  6:03   ` Andrii Nakryiko
2020-07-27 18:45 ` [PATCH bpf-next v2 32/35] bpf: selftests: delete bpf_rlimit.h Roman Gushchin
2020-07-28  6:06   ` Andrii Nakryiko
2020-07-28  6:11     ` Song Liu
2020-07-28 18:30       ` Andrii Nakryiko
2020-07-27 18:45 ` [PATCH bpf-next v2 33/35] bpf: selftests: don't touch RLIMIT_MEMLOCK Roman Gushchin
2020-07-28  6:08   ` Andrii Nakryiko
2020-07-27 18:45 ` [PATCH bpf-next v2 34/35] bpf: samples: do not touch RLIMIT_MEMLOCK Roman Gushchin
2020-07-28  6:14   ` Song Liu
2020-07-27 18:45 ` [PATCH bpf-next v2 35/35] perf: don't touch RLIMIT_MEMLOCK Roman Gushchin
2020-07-28  6:09   ` Andrii Nakryiko
2020-07-28 12:13     ` Arnaldo Carvalho de Melo

