bpf.vger.kernel.org archive mirror
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	bpf <bpf@vger.kernel.org>
Subject: Re: Selftest failures related to kern_sync_rcu()
Date: Wed, 14 Apr 2021 15:27:39 -0700
Message-ID: <20210414222739.GY4510@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <CAEf4BzZ=oFbTaS2DPOry8jbunb2Qtu4omF3VsYMNJ5_8VNHoQw@mail.gmail.com>

On Wed, Apr 14, 2021 at 03:13:38PM -0700, Andrii Nakryiko wrote:
> On Wed, Apr 14, 2021 at 2:25 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Apr 14, 2021 at 09:18:09PM +0200, Toke Høiland-Jørgensen wrote:
> > > "Paul E. McKenney" <paulmck@kernel.org> writes:
> > >
> > > > On Wed, Apr 14, 2021 at 08:39:04PM +0200, Toke Høiland-Jørgensen wrote:
> > > >> "Paul E. McKenney" <paulmck@kernel.org> writes:
> > > >>
> > > >> > On Wed, Apr 14, 2021 at 10:59:23AM -0700, Alexei Starovoitov wrote:
> > > >> >> On Wed, Apr 14, 2021 at 10:52 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >> >> >
> > > >> >> > > > > >                 if (num_online_cpus() > 1)
> > > >> >> > > > > >                         synchronize_rcu();
> > > >> >> >
> > > >> >> > In CONFIG_PREEMPT_NONE=y and CONFIG_PREEMPT_VOLUNTARY=y kernels, this
> > > >> >> > synchronize_rcu() will be a no-op anyway due to there only being the
> > > >> >> > one CPU.  Or are these failures all happening in CONFIG_PREEMPT=y kernels,
> > > >> >> > and in tests where preemption could result in the observed failures?
> > > >> >> >
> > > >> >> > Could you please send your .config file, or at least the relevant portions
> > > >> >> > of it?
> > > >> >>
> > > >> >> That's my understanding as well. I assumed Toke has preempt=y.
> > > >> >> Otherwise the whole thing needs to be root caused properly.
> > > >> >
> > > >> > Given that there is only a single CPU, I am still confused about what
> > > >> > the tests are expecting the membarrier() system call to do for them.
> > > >>
> > > >> It's basically a proxy for waiting until the objects are freed on the
> > > >> kernel side, as far as I understand...
> > > >
> > > > There are in-kernel objects that are freed via call_rcu(), and the idea
> > > > is to wait until these objects really are freed?  Or am I still missing
> > > > out on what is going on?
> > >
> > > Something like that? Although I'm not actually sure these are using
> > > call_rcu()? One of them needs __put_task_struct() to run, and the other
> > > waits for map freeing, with this comment:
> > >
> > >
> > >       /* we need to either wait for or force synchronize_rcu(), before
> > >        * checking for "still exists" condition, otherwise map could still be
> > >        * resolvable by ID, causing false positives.
> > >        *
> > >        * Older kernels (5.8 and earlier) freed map only after two
> > >        * synchronize_rcu()s, so trigger two, to be entirely sure.
> > >        */
> > >       CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
> > >       CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
> >
> > OK, so the issue is that the membarrier() system call is designed to force
> > ordering only within a user process, and you need it in the kernel.
> >
> > Give or take my being puzzled as to why the membarrier() system call
> > doesn't do it for you on a CONFIG_PREEMPT_NONE=y system, this brings
> > us back to the question Alexei asked me in the first place, what is the
> > best way to invoke an in-kernel synchronize_rcu() from userspace?
> >
> > You guys gave some reasonable examples.  Here are a few others:
> >
> > o       Bring a CPU online, then force it offline, or vice versa.
> >         But in this case, sys_membarrier() would do what you need
> >         given more than one CPU.
> >
> > o       Use the membarrier() system call, but require that the tests
> >         run on systems with at least two CPUs.
> >
> > o       Create a kernel module whose init function does a
> >         synchronize_rcu() and then returns failure.  This will
> >         avoid the overhead of removing that kernel module.
> >
> > o       Create a sysfs or debugfs interface that does a
> >         synchronize_rcu().
> >
> > But I am still concerned that you are needing more than synchronize_rcu()
> > can do.  Otherwise, the membarrier() system call would work just fine
> > on a single CPU on your CONFIG_PREEMPT_VOLUNTARY=y kernel.
> 
> Selftests know internals of kernel implementation and wait for some
> objects to be freed with call_rcu(). So I think at this point the best
> way is just to go back to map-in-map or socket local storage.
> Map-in-map will probably work on older kernels, so I'd stick with that
> (plus all the code is there in the referenced commit). The performance
> and number of syscalls performed don't matter, really.

Ah!  If they need to wait for objects to be freed with call_rcu(), then
they need to make the kernel execute an rcu_barrier().  One way to make
this happen is to unmount an ext4 filesystem.  This would explain why
the membarrier() system call wasn't doing the job on single-CPU systems
even in kernels built with CONFIG_PREEMPT_VOLUNTARY=y.

But if you have a more direct way to wait for the required period of time,
so much the better!
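
For reference, the membarrier()-based option from the list quoted above,
combined with the "at least two CPUs" requirement, might look roughly like
the userspace sketch below.  The helper names online_cpus() and
sync_rcu_via_membarrier() are hypothetical and are not taken from the
selftests.  As discussed in this thread, this only asks the kernel for a
synchronize_rcu(); it is not an rcu_barrier(), so it should not be assumed
to wait for objects that are freed via call_rcu().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* Hypothetical helper: number of CPUs currently online. */
long online_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/*
 * Hypothetical helper, not the selftests' actual code.
 *
 * Returns 0 on success, -1 with errno set on failure.  On a system
 * with fewer than two online CPUs it refuses to run (ENOTSUP), since
 * the kernel skips synchronize_rcu() in that case.
 */
int sync_rcu_via_membarrier(void)
{
	if (online_cpus() < 2) {
		errno = ENOTSUP;
		return -1;
	}

	/* MEMBARRIER_CMD_SHARED is the older name of MEMBARRIER_CMD_GLOBAL. */
	return (int)syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0, 0);
}
```

A test would call sync_rcu_via_membarrier() and skip, rather than fail,
when it returns -1 with errno set to ENOTSUP.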

							Thanx, Paul

Thread overview: 16+ messages
2021-04-08 19:34 Selftest failures related to kern_sync_rcu() Toke Høiland-Jørgensen
2021-04-13  3:38 ` Andrii Nakryiko
2021-04-13  8:50   ` Toke Høiland-Jørgensen
2021-04-13 21:43     ` Andrii Nakryiko
2021-04-14 15:54       ` Alexei Starovoitov
2021-04-14 17:52         ` Paul E. McKenney
2021-04-14 17:59           ` Alexei Starovoitov
2021-04-14 18:19             ` Paul E. McKenney
2021-04-14 18:39               ` Toke Høiland-Jørgensen
2021-04-14 18:41                 ` Paul E. McKenney
2021-04-14 19:18                   ` Toke Høiland-Jørgensen
2021-04-14 21:25                     ` Paul E. McKenney
2021-04-14 22:13                       ` Andrii Nakryiko
2021-04-14 22:27                         ` Paul E. McKenney [this message]
2021-04-14 22:47                         ` Toke Høiland-Jørgensen
2021-04-14 18:27             ` Toke Høiland-Jørgensen
