netdev.vger.kernel.org archive mirror
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Edward Cree <ecree@solarflare.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	oss-drivers@netronome.com
Subject: Re: [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map entries
Date: Wed, 14 Aug 2019 13:18:27 -0700
Message-ID: <CAEf4BzYqsT4OmWQ9WK9dmnKT9cMcjbhgHZmboUBgkEvtbaeUeA@mail.gmail.com>
In-Reply-To: <bec14521-dec1-5e1b-2f29-5c0492500272@netronome.com>

On Wed, Aug 14, 2019 at 10:12 AM Quentin Monnet
<quentin.monnet@netronome.com> wrote:
>
> 2019-08-14 09:58 UTC-0700 ~ Alexei Starovoitov
> <alexei.starovoitov@gmail.com>
> > On Wed, Aug 14, 2019 at 9:45 AM Edward Cree <ecree@solarflare.com> wrote:
> >>
> >> On 14/08/2019 10:42, Quentin Monnet wrote:
> >>> 2019-08-13 18:51 UTC-0700 ~ Alexei Starovoitov
> >>> <alexei.starovoitov@gmail.com>
> >>>> The same can be achieved by 'bpftool map dump|grep key|wc -l', no?
> >>> To some extent (with subtleties for some other map types); and we use a
> >>> similar command line as a workaround for now. But because of the rate of
> >>> inserts/deletes in the map, the process often reports a number higher
> >>> than the max number of entries (we observed up to ~750k when max_entries
> >>> is 500k), even if the map is only half-full on average during the count.
> >>> In the worst case (though not frequent), an entry is deleted just before
> >>> we get the next key from it, and iteration starts all over again. This
> >>> is not reliable to determine how much space is left in the map.
> >>>
> >>> I cannot see a solution that would provide a more accurate count from
> >>> user space, when the map is under pressure?
> >> This might be a really dumb suggestion, but: you're wanting to collect a
> >>  summary statistic over an in-kernel data structure in a single syscall,
> >>  because making a series of syscalls to examine every entry is slow and
> >>  racy.  Isn't that exactly a job for an in-kernel virtual machine, and
> >>  could you not supply an eBPF program which the kernel runs on each entry
> >>  in the map, thus supporting people who want to calculate something else
> >>  (mean, min and max, whatever) instead of count?
> >
> > Pretty much my suggestion as well :)

I also support the suggestion to count it from the BPF side. It's a
flexible and powerful approach, and it doesn't require adding more and
more nuanced sub-APIs to the kernel to support a subset of bulk
operations on maps. (A subset, because we'll expose count, but what
about, e.g., p50? There will always be something more that someone
will want, and it just doesn't scale.)
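
To illustrate why a per-entry callback scales better than one dedicated
sub-API per statistic, here is a minimal user-space sketch in plain C. It
is only a model of the idea, not kernel or BPF code: the table layout and
the names for_each_entry, collect_stats, struct entry and struct stats are
all hypothetical. The walker knows nothing about the statistic; the
callback decides what to compute (here count, min and max in one pass).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a map entry; not a real kernel structure. */
struct entry {
	uint32_t key;
	uint64_t value;
	int in_use;		/* non-zero if the slot holds a live entry */
};

/* Aggregates gathered in a single pass; any other statistic could be added. */
struct stats {
	uint64_t count;
	uint64_t min;
	uint64_t max;
};

typedef void (*entry_cb)(const struct entry *e, void *ctx);

/* Generic walker: calls cb once for every live entry in the table. */
void for_each_entry(const struct entry *tbl, size_t n, entry_cb cb, void *ctx)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].in_use)
			cb(&tbl[i], ctx);
}

/* Example callback: updates count, min and max from one entry. */
void collect_stats(const struct entry *e, void *ctx)
{
	struct stats *s = ctx;

	if (s->count == 0 || e->value < s->min)
		s->min = e->value;
	if (s->count == 0 || e->value > s->max)
		s->max = e->value;
	s->count++;
}
```

Swapping collect_stats for another callback gives a different summary
without touching the walker, which is the property argued for above.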

> >
> > It seems the better fix for your nat threshold is to keep count of
> > elements in the map in a separate global variable that
> > bpf program manually increments and decrements.
> > bpftool will dump it just as regular map of single element.
> > (I believe it doesn't recognize global variables properly yet)
> > and BTF will be there to pick exactly that 'count' variable.
> >
>
> It would be with an offloaded map, but yes, I suppose we could keep
> track of the numbers in a separate map. We'll have a look into this.

See if you can use a global variable; that way you completely
eliminate any overhead from the BPF side of things, except for the
atomic increment.
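
A rough sketch of the counting pattern, written as a user-space model in
plain C so it can be compiled and run anywhere. Everything named here is
an assumption for illustration: the slots array stands in for the real
map, and nat_entry_count, tracked_insert, tracked_delete and read_count
are hypothetical names. In an actual BPF program the counter would be a
global variable updated around the map update and delete calls, with the
same __sync_fetch_and_add() builtin used for the atomic add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENTRIES 8	/* hypothetical capacity of the modelled map */

/* Stand-in for the map: one "occupied" flag per slot. */
bool slots[MAX_ENTRIES];

/* The global counter that mirrors the number of live entries. */
int64_t nat_entry_count;

/*
 * Insert an entry and bump the counter only if the slot was empty.
 * Only the counter update is atomic in this model; the slot check is not.
 */
bool tracked_insert(uint32_t slot)
{
	assert(slot < MAX_ENTRIES);
	if (slots[slot])
		return false;	/* existing entry overwritten, count unchanged */
	slots[slot] = true;
	__sync_fetch_and_add(&nat_entry_count, 1);
	return true;
}

/* Delete an entry and decrement the counter only if it was present. */
bool tracked_delete(uint32_t slot)
{
	assert(slot < MAX_ENTRIES);
	if (!slots[slot])
		return false;	/* nothing to delete, count unchanged */
	slots[slot] = false;
	__sync_fetch_and_sub(&nat_entry_count, 1);
	return true;
}

/* Read the counter atomically (adding 0 returns the current value). */
int64_t read_count(void)
{
	return __sync_fetch_and_add(&nat_entry_count, 0);
}
```

The point of the design is that reading the count is a single load of
one variable, so it cannot exceed the number of entries the way an
iterate-and-count walk can when entries are deleted mid-iteration.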

>
> Thanks to both of you for the suggestions.
> Quentin

Thread overview: 15+ messages
2019-08-13 13:09 [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map entries Quentin Monnet
2019-08-13 13:09 ` [RFC bpf-next 1/3] tools: bpftool: clean up dump_map_elem() return value Quentin Monnet
2019-08-13 13:09 ` [RFC bpf-next 2/3] tools: bpftool: make comment more explicit for count of dumped entries Quentin Monnet
2019-08-13 13:09 ` [RFC bpf-next 3/3] tools: bpftool: add "bpftool map count" to count entries in map Quentin Monnet
2019-08-14  1:51 ` [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map entries Alexei Starovoitov
2019-08-14  9:42   ` Quentin Monnet
2019-08-14 16:45     ` Edward Cree
2019-08-14 16:58       ` Alexei Starovoitov
2019-08-14 17:12         ` Quentin Monnet
2019-08-14 20:18           ` Andrii Nakryiko [this message]
2019-08-15 14:02             ` Quentin Monnet
2019-08-14 16:58       ` Quentin Monnet
2019-08-14 17:14         ` Edward Cree
2019-08-15 14:15           ` Quentin Monnet
2019-08-16 18:13             ` Edward Cree
