From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Tejun Heo <tj@kernel.org>,
dennis@kernel.org, Chris Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Hyeonggon Yoo <42.hyeyoo@gmail.com>,
Vlastimil Babka <vbabka@suse.cz>,
urezki@gmail.com, linux-mm <linux-mm@kvack.org>,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 7/7] bpf: hashtab memory usage
Date: Tue, 7 Feb 2023 17:56:44 -0800 [thread overview]
Message-ID: <CAADnVQLe9OVF2xUpsA6buPPRhFuRgPEgk2Mxe8UWPZrhfFAwCw@mail.gmail.com> (raw)
In-Reply-To: <CALOAHbDUk7CymPco4s12EdFauFroVBhQM6yNJQzji1D=jmkM5g@mail.gmail.com>
On Sat, Feb 4, 2023 at 7:56 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Sat, Feb 4, 2023 at 10:01 AM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Yafang Shao wrote:
> > > Get htab memory usage from the htab pointers we have allocated. Some
> > > small pointers are ignored as their size are quite small compared with
> > > the total size.
> > >
> > > The result as follows,
> > > - before this change
> > > 1: hash name count_map flags 0x0 <<<< prealloc
> > > key 16B value 24B max_entries 1048576 memlock 41943040B
> > > 2: hash name count_map flags 0x1 <<<< non prealloc, fully set
> > > key 16B value 24B max_entries 1048576 memlock 41943040B
> > > 3: hash name count_map flags 0x1 <<<< non prealloc, non set
> > > key 16B value 24B max_entries 1048576 memlock 41943040B
> > >
> > > The memlock is always a fixed number whatever it is preallocated or
> > > not, and whatever the allocated elements number is.
> > >
> > > - after this change
> > > 1: hash name count_map flags 0x0 <<<< prealloc
> > > key 16B value 24B max_entries 1048576 memlock 109064464B
> > > 2: hash name count_map flags 0x1 <<<< non prealloc, fully set
> > > key 16B value 24B max_entries 1048576 memlock 117464320B
> > > 3: hash name count_map flags 0x1 <<<< non prealloc, non set
> > > key 16B value 24B max_entries 1048576 memlock 16797952B
> > >
> > > The memlock now is hashtab actually allocated.
> > >
> > > At worst, the difference can be 10x, for example,
> > > - before this change
> > > 4: hash name count_map flags 0x0
> > > key 4B value 4B max_entries 1048576 memlock 8388608B
> > >
> > > - after this change
> > > 4: hash name count_map flags 0x0
> > > key 4B value 4B max_entries 1048576 memlock 83898640B
> > >
> >
> > This walks the entire map and buckets to get the size? Inside a
> > rcu critical section as well :/ it seems.
> >
>
> No, it doesn't walk the entire map and buckets, but just gets one
> random element.
> So it won't be a problem to do it inside a rcu critical section.
>
> > What am I missing, if you know how many elements are added (which
> > you can track on map updates) how come we can't just calculate the
> > memory size directly from this?
> >
>
> It is less accurate and hard to understand. Take non-preallocated
> percpu hashtab for example,
> The size can be calculated as follows,
> key_size = round_up(htab->map.key_size, 8);
> value_size = round_up(htab->map.value_size, 8);
> pcpu_meta_size = sizeof(struct llist_node) + sizeof(void *);
> usage = ((value_size * num_possible_cpus() +\
> pcpu_meta_size + key_size) * max_entries
>
> That is quite unfriendly to the newbies, and may be error-prone.
Please do that instead.
map_mem_usage callback is a no go as I mentioned earlier.
> Furthermore, it is less accurate because there is underlying memory
> allocation in the MM subsystem.
> Now we can get a more accurate usage with little overhead. Why not do it?
because htab_mem_usage() is not maintainable long term.
100% accuracy is a non-goal.
next prev parent reply other threads:[~2023-02-08 1:57 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-02 1:41 [PATCH bpf-next 0/7] bpf, mm: bpf memory usage Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 1/7] mm: percpu: fix incorrect size in pcpu_obj_full_size() Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 2/7] mm: percpu: introduce percpu_size() Yafang Shao
2023-02-02 14:32 ` Christoph Lameter
2023-02-02 15:01 ` Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 3/7] mm: vmalloc: introduce vsize() Yafang Shao
2023-02-02 10:23 ` Christoph Hellwig
2023-02-02 14:10 ` Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 4/7] mm: util: introduce kvsize() Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 5/7] bpf: add new map ops ->map_mem_usage Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 6/7] bpf: introduce bpf_mem_alloc_size() Yafang Shao
2023-02-02 4:53 ` kernel test robot
2023-02-02 14:11 ` Yafang Shao
2023-02-02 1:41 ` [PATCH bpf-next 7/7] bpf: hashtab memory usage Yafang Shao
2023-02-04 2:01 ` John Fastabend
2023-02-05 3:55 ` Yafang Shao
2023-02-08 1:56 ` Alexei Starovoitov [this message]
2023-02-08 3:33 ` Yafang Shao
2023-02-08 4:29 ` Alexei Starovoitov
2023-02-08 14:22 ` Yafang Shao
2023-02-05 22:14 ` Cong Wang
2023-02-06 11:52 ` Yafang Shao
2023-02-04 2:15 ` [PATCH bpf-next 0/7] bpf, mm: bpf " John Fastabend
2023-02-05 4:03 ` Yafang Shao
2023-02-07 0:48 ` Ho-Ren Chuang
2023-02-07 7:02 ` Yafang Shao
2023-02-07 0:53 ` Ho-Ren Chuang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAADnVQLe9OVF2xUpsA6buPPRhFuRgPEgk2Mxe8UWPZrhfFAwCw@mail.gmail.com \
--to=alexei.starovoitov@gmail.com \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=cl@linux.com \
--cc=daniel@iogearbox.net \
--cc=dennis@kernel.org \
--cc=haoluo@google.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kafai@fb.com \
--cc=kpsingh@kernel.org \
--cc=laoar.shao@gmail.com \
--cc=linux-mm@kvack.org \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=sdf@google.com \
--cc=songliubraving@fb.com \
--cc=tj@kernel.org \
--cc=urezki@gmail.com \
--cc=vbabka@suse.cz \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.