From: Quentin Monnet <quentin.monnet@netronome.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	oss-drivers@netronome.com
Subject: Re: [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map entries
Date: Wed, 14 Aug 2019 10:42:05 +0100
Message-ID: <ab11a9f2-0fbd-d35f-fee1-784554a2705a@netronome.com>
In-Reply-To: <20190814015149.b4pmubo3s4ou5yek@ast-mbp>

2019-08-13 18:51 UTC-0700 ~ Alexei Starovoitov
<alexei.starovoitov@gmail.com>
> On Tue, Aug 13, 2019 at 02:09:18PM +0100, Quentin Monnet wrote:
>> This series adds a "bpftool map count" subcommand to count the number of
>> entries present in a BPF map. This results from a customer request for a
>> tool to count the number of entries in BPF maps used in production (for
>> example, to know how many free entries are left in a given map).
>>
>> The first two commits actually contain some clean-up in preparation for the
>> new subcommand.
>>
>> The third commit adds the new subcommand. Because what data should count as
>> an entry is not entirely clear for all map types, we actually dump several
>> counters, and leave it to the users to interpret the values.
>>
>> Sending as an RFC because I'm looking for feedback on the approach. Is
>> printing several values the right thing to do? Also, note that some map
>> types such as queue/stack maps do not support any type of counting, this
>> would need to be implemented in the kernel I believe.
>>
>> More generally, we have a use case where (hash) maps are under pressure
>> (many additions/deletions from the BPF program), and counting the entries
>> by iterating over the different keys is not at all reliable. Would it
>> make sense to add a new bpf() subcommand to count the entries on the kernel
>> side instead of cycling over the entries in bpftool? If so, we would need
>> to agree on what makes an entry for each kind of map.
> 
> I don't mind a new bpftool sub-command, but I am against adding a kernel
> interface. Can you elaborate on the actual use case?

Hi Alexei, thanks for your feedback.

The use case is a network processing application (close to a NAT), where
a hash map is used to keep track of flows, many of them being
short-lived. The BPF program spends a good chunk of time adding and
deleting entries to/from the map. The overall size (number of entries)
increases slowly, and when it grows past a certain threshold some action
must be taken (some flows are deleted from user space, possibly copied
to another map or whatever) to ensure we still have some room for new
incoming flows.

> The same can be achieved by 'bpftool map dump|grep key|wc -l', no?

To some extent (with subtleties for some other map types); and we use a
similar command line as a workaround for now. But because of the rate of
inserts/deletes in the map, the process often reports a number higher
than the max number of entries (we observed up to ~750k when max_entries
is 500k), even if the map is only half-full on average during the count.
In the worst case (though not frequent), an entry is deleted just before
we get the next key from it, and iteration starts all over again. This
is not reliable to determine how much space is left in the map.
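
To make the restart effect concrete, here is a toy simulation in Python.
It is only a sketch, not kernel or bpftool code: SimMap,
count_by_iteration and churn are made-up names, and the map is modelled
as a plain ordered list. The one behaviour it assumes from the real hash
map is the one described above: asking for the key after a deleted key
sends the walk back to the first entry.

```python
class SimMap:
    """Toy stand-in for a BPF hash map; keys are kept in a fixed walk order."""

    def __init__(self, keys):
        self.keys = list(keys)

    def get_next_key(self, key):
        # Assumed semantics: a missing (or None) key sends the caller
        # back to the first key in the map.
        if key is None or key not in self.keys:
            return self.keys[0] if self.keys else None
        idx = self.keys.index(key)
        return self.keys[idx + 1] if idx + 1 < len(self.keys) else None

    def delete(self, key):
        self.keys.remove(key)

    def insert(self, key):
        # Simplification: new keys land at the end of the walk order.
        self.keys.append(key)


def count_by_iteration(m, mutate=None):
    """Count entries the way a user-space walker would, key by key."""
    count = 0
    key = m.get_next_key(None)
    while key is not None:
        count += 1
        if mutate is not None:
            # Stands in for the BPF program changing the map mid-walk.
            mutate(m, key, count)
        key = m.get_next_key(key)
    return count


def churn(m, key, step):
    # On the third step only: delete the key the walker is holding and
    # insert a fresh one, so the map size stays constant.
    if step == 3:
        m.delete(key)
        m.insert(5)


if __name__ == "__main__":
    m = SimMap(range(5))
    print(count_by_iteration(m, churn), len(m.keys))  # prints "8 5"
```

A single ill-timed delete makes the walker report 8 entries for a map
that never holds more than 5; with a sustained rate of inserts and
deletes the overcount can grow much larger.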

Is there a solution that would provide a more accurate count from user
space when the map is under pressure? I cannot see one.

> 
>> Note that we are also facing similar issues for purging maps of their
>> entries (deleting all entries at once). We can iterate on the keys and
>> delete elements one by one, but this is very inefficient when entries are
>> being added/removed in parallel from the BPF program, and having another
>> dedicated command accessible from the bpf() system call might help here as
>> well.
> 
> I think that fits into the batch processing of map commands discussion.
> 

This is also what we do at the moment, but we hit similar limitations
when iterating over the keys.

Thanks,
Quentin
