From: Hou Tao <hotforest@gmail.com>
To: andrii.nakryiko@gmail.com
Cc: andrii@kernel.org, ast@kernel.org, bpf@vger.kernel.org,
	daniel@iogearbox.net, davem@davemloft.net, hotforest@gmail.com,
	houtao1@huawei.com, kafai@fb.com, kuba@kernel.org,
	netdev@vger.kernel.org, yhs@fb.com
Subject: Re: [PATCH bpf-next] selftests/bpf: use getpagesize() to initialize ring buffer size
Date: Thu,  3 Feb 2022 19:12:45 +0800
Message-ID: <20220203111245.3495-1-houtao1@huawei.com>
In-Reply-To: <CAEf4BzY_BGV_8d8+gUMva6dpnHq=JSo8oU0p3tc_o=7ii2gU4A@mail.gmail.com>

Hi,

> On Tue, Feb 1, 2022 at 6:36 PM Hou Tao <hotforest@gmail.com> wrote:
> >
> > Hi,
> >
> > > >
> > > > Hi Andrii,
> > > >
> > > > > >
> > > > > > 4096 is OK for x86-64, but for other archs with greater than 4KB
> > > > > > page size (e.g. 64KB under arm64), test_verifier for test case
> > > > > > "check valid spill/fill, ptr to mem" will fail, so just use
> > > > > > getpagesize() to initialize the ring buffer size. Do this for
> > > > > > test_progs as well.
> > > > > >
> > > > [...]
> > > >
> > > > > > diff --git a/tools/testing/selftests/bpf/progs/ima.c b/tools/testing/selftests/bpf/progs/ima.c
> > > > > > index 96060ff4ffc6..e192a9f16aea 100644
> > > > > > --- a/tools/testing/selftests/bpf/progs/ima.c
> > > > > > +++ b/tools/testing/selftests/bpf/progs/ima.c
> > > > > > @@ -13,7 +13,6 @@ u32 monitored_pid = 0;
> > > > > >
> > > > > >  struct {
> > > > > >         __uint(type, BPF_MAP_TYPE_RINGBUF);
> > > > > > -       __uint(max_entries, 1 << 12);
> > > > >
> > > > > Should we just bump it to 64/128/256KB instead? It's quite annoying to
> > > > > do a split open and then load just due to this...
> > > > >
> > > > Agreed.
> > > >
> > > > > I'm also wondering if we should either teach kernel to round up to
> > > > > closest power-of-2 of page_size internally, or teach libbpf to do this
> > > > > for RINGBUF maps. Thoughts?
> > > > >
[...]
> > >
> > > No, if you read BPF ringbuf code carefully you'll see that we map the
> > > entire ringbuf data twice in the memory (see [0] for lame ASCII
> > > diagram), so that records that are wrapped at the end of the ringbuf
> > > and go back to the start are still accessible as a linear array. It's
> > > a very important guarantee, so it has to be page size multiple. But
> > > auto-increasing it to the closest power-of-2 of page size seems like a
> > > pretty low-impact change. Hard to imagine breaking anything except
> > > some carefully crafted tests for ENOSPC behavior.
> > >
> >
> > Yes, I know the double map trick. What I tried to say is:
> > (1) remove the page-aligned constraint for max_entries
> > (2) still allocate page-aligned memory for ringbuf
> >
> > instead of rounding max_entries up to closest power-of-2 page size
> > directly, so max_entries from userspace is unchanged and double map trick
> > still works.
> 
> I don't see how. Knowing the correct and exact size of the ringbuf
> data area is mandatory for correctly consuming ringbuf data from
> user-space. But if I'm missing something, feel free to give it a try
> and see if it actually works.
> 
You are right. Userspace needs max_entries to mmap() the data area, so
max_entries must be page-size aligned.
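
To make the constraint concrete, here is a minimal sketch of the size
arithmetic on the consumer side. It assumes the layout used by libbpf's
ring buffer consumer as I understand it: one read-only mapping covering
the producer page followed by the data area mapped twice. The function
names are hypothetical and are not libbpf or kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does max_entries satisfy what the kernel expects
 * for a BPF_MAP_TYPE_RINGBUF map, i.e. a power of 2 that is also a
 * multiple of the page size?
 */
bool ringbuf_size_is_valid(uint32_t max_entries, uint32_t page_size)
{
	bool pow2 = max_entries != 0 &&
		    (max_entries & (max_entries - 1)) == 0;

	return pow2 && page_size != 0 && max_entries % page_size == 0;
}

/*
 * Hypothetical helper: length of the read-only mapping that covers the
 * producer page plus the double-mapped data area (assumed layout).
 * The consumer has to derive this from the exact max_entries value.
 */
uint64_t ringbuf_data_mmap_len(uint32_t max_entries, uint32_t page_size)
{
	return (uint64_t)page_size + 2 * (uint64_t)max_entries;
}
```

With a 64KB page size, max_entries = 4096 is rejected by the first check,
which is the failure the original patch ran into.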

If we want to do the automatic round-up, I think libbpf is the better
place. If the round-up is done in the kernel, a userspace program may
still pass the original max_entries to mmap(), so the consumer side will
not work, which is confusing for users. If the round-up is done in
libbpf, the adjustment stays hidden from the libbpf user. I will add the
auto round-up and its tests to libbpf.
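
A rough sketch of the round-up, only to illustrate the idea: pick the
smallest power-of-2 multiple of the page size that is not smaller than
the requested max_entries. The helper name is hypothetical and this is
not the eventual libbpf code; page_size would come from getpagesize()
in practice.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing libbpf function.
 * Returns the smallest page_size * 2^n that is >= max_entries,
 * or 0 if that value does not fit in 32 bits.
 * A request of 0 is rounded up to one page in this sketch.
 */
uint32_t ringbuf_adjust_max_entries(uint32_t max_entries, uint32_t page_size)
{
	uint32_t sz = page_size;

	/* page sizes are powers of 2; the loop below relies on that */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	while (sz < max_entries) {
		if (sz > UINT32_MAX / 2)
			return 0; /* doubling would overflow */
		sz <<= 1;
	}
	return sz;
}
```

For example, a map declared with max_entries = 1 << 12 would stay at
4096 on a 4KB-page system and become 65536 on a 64KB-page system.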

Regards
Tao
> 
> >
> > > [0] https://github.com/torvalds/linux/blob/master/kernel/bpf/ringbuf.c#L73-L89
> >

Thread overview: 7+ messages
2022-01-27  2:49 [PATCH bpf-next] selftests/bpf: use getpagesize() to initialize ring buffer size Hou Tao
2022-02-01  0:02 ` Andrii Nakryiko
2022-02-01  8:43   ` Hou Tao
2022-02-02  1:29     ` Andrii Nakryiko
2022-02-02  2:36       ` Hou Tao
2022-02-02  6:45         ` Andrii Nakryiko
2022-02-03 11:12           ` Hou Tao [this message]
