From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Joel Fernandes, Google" <joel@joelfernandes.org>,
	rcu <rcu@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	dipankar <dipankar@in.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	rostedt <rostedt@goodmis.org>,
	David Howells <dhowells@redhat.com>,
	Eric Dumazet <edumazet@google.com>, fweisbec <fweisbec@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	amd-gfx <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules
Date: Tue, 9 Apr 2019 09:40:31 -0700	[thread overview]
Message-ID: <20190409164031.GE14111@linux.ibm.com> (raw)
In-Reply-To: <534133139.2374.1554825363211.JavaMail.zimbra@efficios.com>

On Tue, Apr 09, 2019 at 11:56:03AM -0400, Mathieu Desnoyers wrote:
> ----- On Apr 9, 2019, at 11:40 AM, Joel Fernandes, Google joel@joelfernandes.org wrote:
> 
> > On Mon, Apr 08, 2019 at 01:24:47PM -0400, Mathieu Desnoyers wrote:
> >> ----- On Apr 8, 2019, at 11:46 AM, paulmck paulmck@linux.ibm.com wrote:
> >> 
> >> > On Mon, Apr 08, 2019 at 10:49:32AM -0400, Mathieu Desnoyers wrote:
> >> >> ----- On Apr 8, 2019, at 10:22 AM, paulmck paulmck@linux.ibm.com wrote:
> >> >> 
> >> >> > On Mon, Apr 08, 2019 at 09:05:34AM -0400, Mathieu Desnoyers wrote:
> >> >> >> ----- On Apr 7, 2019, at 10:27 PM, paulmck paulmck@linux.ibm.com wrote:
> >> >> >> 
> >> >> >> > On Sun, Apr 07, 2019 at 09:07:18PM +0000, Joel Fernandes wrote:
> >> >> >> >> On Sun, Apr 07, 2019 at 04:41:36PM -0400, Mathieu Desnoyers wrote:
> >> >> >> >> > 
> >> >> >> >> > ----- On Apr 7, 2019, at 3:32 PM, Joel Fernandes, Google joel@joelfernandes.org
> >> >> >> >> > wrote:
> >> >> >> >> > 
> >> >> >> >> > > On Sun, Apr 07, 2019 at 03:26:16PM -0400, Mathieu Desnoyers wrote:
> >> >> >> >> > >> ----- On Apr 7, 2019, at 9:59 AM, paulmck paulmck@linux.ibm.com wrote:
> >> >> >> >> > >> 
> >> >> >> >> > >> > On Sun, Apr 07, 2019 at 06:39:41AM -0700, Paul E. McKenney wrote:
> >> >> >> >> > >> >> On Sat, Apr 06, 2019 at 07:06:13PM -0400, Joel Fernandes wrote:
> >> >> >> >> > >> > 
> >> >> >> >> > >> > [ . . . ]
> >> >> >> >> > >> > 
> >> >> >> >> > >> >> > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> >> >> >> >> > >> >> > > index f8f6f04c4453..c2d919a1566e 100644
> >> >> >> >> > >> >> > > --- a/include/asm-generic/vmlinux.lds.h
> >> >> >> >> > >> >> > > +++ b/include/asm-generic/vmlinux.lds.h
> >> >> >> >> > >> >> > > @@ -338,6 +338,10 @@
> >> >> >> >> > >> >> > >  		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
> >> >> >> >> > >> >> > >  		__stop___tracepoints_ptrs = .;				\
> >> >> >> >> > >> >> > >  		*(__tracepoints_strings)/* Tracepoints: strings */	\
> >> >> >> >> > >> >> > > +		. = ALIGN(8);						\
> >> >> >> >> > >> >> > > +		__start___srcu_struct = .;				\
> >> >> >> >> > >> >> > > +		*(___srcu_struct_ptrs)					\
> >> >> >> >> > >> >> > > +		__end___srcu_struct = .;				\
> >> >> >> >> > >> >> > >  	}								\
> >> >> >> >> > >> >> > 
> >> >> >> >> > >> >> > This vmlinux linker modification is not needed. I tested without it and srcu
> >> >> >> >> > >> >> > torture works fine with rcutorture built as a module. Putting further prints
> >> >> >> >> > >> >> > in kernel/module.c verified that the kernel is able to find the srcu structs
> >> >> >> >> > >> >> > just fine. You could squash the below patch into this one or apply it on top
> >> >> >> >> > >> >> > of the dev branch.
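
For context, a rough sketch of how the module loader can pick these
structures up from the module's own ELF section, along the lines of what
this series adds to find_module_sections() in kernel/module.c (the fields
below are introduced by the series, not present in older kernels):

	#ifdef CONFIG_SRCU
		mod->srcu_struct_ptrs = section_objs(info, "___srcu_struct_ptrs",
						     sizeof(*mod->srcu_struct_ptrs),
						     &mod->num_srcu_structs);
	#endif
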
> >> >> >> >> > >> >> 
> >> >> >> >> > >> >> Good point, given that otherwise FORTRAN named common blocks would not
> >> >> >> >> > >> >> work.
> >> >> >> >> > >> >> 
> >> >> >> >> > >> >> But isn't one advantage of leaving that stuff in the RO_DATA_SECTION()
> >> >> >> >> > >> >> macro that it can be mapped read-only?  Or am I suffering from excessive
> >> >> >> >> > >> >> optimism?
> >> >> >> >> > >> > 
> >> >> >> >> > >> > And to answer the other question, in the case where I am suffering from
> >> >> >> >> > >> > excessive optimism, it should be a separate commit.  Please see below
> >> >> >> >> > >> > for the updated original commit thus far.
> >> >> >> >> > >> > 
> >> >> >> >> > >> > And may I have your Tested-by?
> >> >> >> >> > >> 
> >> >> >> >> > >> Just to confirm: does the cleanup performed in the modules going
> >> >> >> >> > >> notifier end up acting as a barrier first before freeing the memory ?
> >> >> >> >> > >> If not, is it explicitly stated that a barrier must be issued before
> >> >> >> >> > >> module unload ?
> >> >> >> >> > >> 
> >> >> >> >> > > 
> >> >> >> >> > > You mean rcu_barrier? It is mentioned in the documentation that this is the
> >> >> >> >> > > responsibility of the module writer to prevent delays for all modules.
> >> >> >> >> > 
> >> >> >> >> > It's a srcu barrier, yes. Considering it would be a barrier specific to the
> >> >> >> >> > srcu domain within that module, I don't see how it would cause delays for
> >> >> >> >> > "all" modules if we implicitly issue the barrier on module unload. What
> >> >> >> >> > am I missing ?
> >> >> >> >> 
> >> >> >> >> Yes, you are right. I thought of this right after I sent my email. I think
> >> >> >> >> it makes sense to do this for the SRCU case, and it could avoid a class of bugs.
> >> >> >> > 
> >> >> >> > If there are call_srcu() callbacks outstanding, the module writer still
> >> >> >> > needs the srcu_barrier() because otherwise callbacks arrive after
> >> >> >> > the module text has gone, which will disappoint the CPU when it
> >> >> >> > tries to fetch instructions that are no longer mapped.  If there are
> >> >> >> > no call_srcu() callbacks from that module, then there is no need for
> >> >> >> > srcu_barrier() either way.
> >> >> >> > 
> >> >> >> > So if an srcu_barrier() is needed, the module developer needs to
> >> >> >> > supply it.
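
To make the expected pattern concrete, here is a minimal sketch of a
module that uses call_srcu(); all names (my_srcu, my_obj, and so on) are
illustrative only:

	/* Needs <linux/module.h>, <linux/srcu.h>, <linux/slab.h>. */
	DEFINE_SRCU(my_srcu);

	struct my_obj {
		struct rcu_head rh;
		/* ... payload ... */
	};

	static void my_free_cb(struct rcu_head *rhp)
	{
		kfree(container_of(rhp, struct my_obj, rh));
	}

	static void my_retire(struct my_obj *p)
	{
		call_srcu(&my_srcu, &p->rh, my_free_cb);
	}

	static void __exit my_exit(void)
	{
		/* Wait for all callbacks queued above to finish before
		 * the module text can safely go away. */
		srcu_barrier(&my_srcu);
	}
	module_exit(my_exit);
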
> >> >> >> 
> >> >> >> When you say "callbacks arrive after the module text has gone",
> >> >> >> I think you assume that free_module() is invoked before the
> >> >> >> MODULE_STATE_GOING notifiers are called. But it's done in the
> >> >> >> opposite order: going notifiers are called first, and then
> >> >> >> free_module() is invoked.
> >> >> >> 
> >> >> >> So AFAIU it would be safe to issue the srcu_barrier() from the module
> >> >> >> going notifier.
> >> >> >> 
> >> >> >> Or am I missing something ?
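
For reference, the ordering being relied on here, heavily simplified from
the delete_module() path in kernel/module.c (a fragment, not runnable
code):

	mod->state = MODULE_STATE_GOING;
	if (mod->exit)
		mod->exit();			/* module_exit() handler */
	blocking_notifier_call_chain(&module_notify_list,
				     MODULE_STATE_GOING, mod);
	/* ... */
	free_module(mod);			/* module text unmapped only here */
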
> >> >> > 
> >> >> > We do seem to be talking past each other.  ;-)
> >> >> > 
> >> >> > This has nothing to do with the order of events at module-unload time.
> >> >> > 
> >> >> > So please let me try again.
> >> >> > 
> >> >> > If a given srcu_struct in a module never has call_srcu() invoked, there
> >> >> > is no need to invoke rcu_barrier() at any time, whether at module-unload
> >> >> > time or not.  Adding rcu_barrier() in this case adds overhead and latency
> >> >> > for no good reason.
> >> >> 
> >> >> Not if we invoke srcu_barrier() for that specific domain. If
> >> >> call_srcu was never invoked for a srcu domain, I don't see why
> >> >> srcu_barrier() should be more expensive than a simple check that
> >> >> the domain does not have any srcu work queued.
> >> > 
> >> > But that simple check does involve a cache miss for each possible CPU (not
> >> > just each online CPU), so it is non-trivial, especially on large systems.
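
Concretely, such a check would amount to something like the sketch below,
touching one srcu_data cache line per possible CPU (the helper name is
made up for illustration; the field names follow kernel/rcu/srcutree.c):

	static bool srcu_no_pending_cbs(struct srcu_struct *ssp)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);

			if (rcu_segcblist_n_cbs(&sdp->srcu_cblist))
				return false;
		}
		return true;
	}
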
> >> > 
> >> >> > If a given srcu_struct in a module does have at least one call_srcu()
> >> >> > invoked, it is already that module's responsibility to make sure that
> >> >> > the code sticks around long enough for the callback to be invoked.
> >> >> 
> >> >> I understand that when users do explicit dynamic allocation/cleanup of
> >> >> srcu domains, they indeed need to take care of doing explicit srcu_barrier().
> >> >> However, if they do static definition of srcu domains, it would be nice
> >> >> if we can handle the barriers under the hood.
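
For clarity, the two usage styles being contrasted, in minimal
(illustrative) form:

	/* Static: the domain is defined at module scope. */
	DEFINE_STATIC_SRCU(my_static_srcu);

	/* Dynamic: the module sets up and tears down the domain itself. */
	static struct srcu_struct my_dyn_srcu;

	static int __init my_init(void)
	{
		return init_srcu_struct(&my_dyn_srcu);
	}

	static void __exit my_exit(void)
	{
		srcu_barrier(&my_dyn_srcu);	/* only needed if call_srcu() was used */
		cleanup_srcu_struct(&my_dyn_srcu);
	}
	module_init(my_init);
	module_exit(my_exit);
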
> >> > 
> >> > All else being equal, of course.  But...
> >> > 
> >> >> > This means that correct SRCU users that invoke call_srcu() already
> >> >> > have srcu_barrier() at module-unload time.  Incorrect SRCU users, with
> >> >> > reasonable probability, now get a WARN_ON() at module-unload time, with
> >> >> > the per-CPU state getting leaked.  Before this change, they would (also
> >> >> > with reasonable probability) instead get an instruction-fetch fault when
> >> >> > the SRCU callback was invoked after the completion of the module unload.
> >> >> > Furthermore, in all cases where they would previously have gotten the
> >> >> > instruction-fetch fault, they now get the WARN_ON(), like this:
> >> >> > 
> >> >> >	if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
> >> >> >		return; /* Forgot srcu_barrier(), so just leak it! */
> >> >> > 
> >> >> > So this change already represents an improvement in usability.
> >> >> 
> >> >> Considering that we can do a srcu_barrier() for the specific domain,
> >> >> and that it should add no noticeable overhead if there are no queued
> >> >> callbacks, I don't see a good reason for leaving the srcu_barrier
> >> >> invocation to the user rather than implicitly doing it from the
> >> >> module going notifier.
> >> > 
> >> > Now, I could automatically add an indicator of whether or not a
> >> > call_srcu() had happened, but then again, that would either add a
> >> > call_srcu() scalability bottleneck or again require a scan of all possible
> >> > CPUs...  to figure out if it was necessary to scan all possible CPUs.
> >> > 
> >> > Or is scanning all possible CPUs down in the noise in this case?  Or
> >> > am I missing a trick that would reduce the overhead?
> >> 
> >> Module unloading implicitly does a synchronize_rcu (for RCU-sched), and
> >> a stop_machine. So I would be tempted to say that the overhead of iterating
> >> over all CPUs might not matter that much compared to the rest.
> >> 
> >> About notifying that a call_srcu has happened for the srcu domain in a
> >> scalable fashion, let's see... We could have a flag "call_srcu_used"
> >> for each srcu domain. Whenever call_srcu is invoked, it would check
> >> that flag and set it on first use.
> >> 
> >> The idea here is to only use that flag when srcu_barrier is performed
> >> right before the srcu domain cleanup (it could become part of that
> >> cleanup). Else, using it in all srcu_barrier() might be tricky, because
> >> we may then need to add memory barriers or locking to the call_srcu
> >> fast-path, which is an overhead we try to avoid.
> >> 
> >> However, if we only use that flag as part of the srcu domain cleanup,
> >> it's already prohibited to invoke call_srcu concurrently with the
> >> cleanup of the same domain, so I don't think we would need any
> >> memory barriers in call_srcu.
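
In sketch form (the ->srcu_used flag and the cleanup helper are
hypothetical, purely to illustrate the proposal):

	void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
		       rcu_callback_t func)
	{
		if (!READ_ONCE(ssp->srcu_used))		/* hypothetical flag */
			WRITE_ONCE(ssp->srcu_used, true);
		/* ... existing enqueue path unchanged ... */
	}

	/* In the cleanup path, call_srcu() may no longer run concurrently
	 * on this domain, so no extra memory barriers are needed here. */
	static void srcu_cleanup_barrier(struct srcu_struct *ssp)
	{
		if (READ_ONCE(ssp->srcu_used))
			srcu_barrier(ssp);
	}
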
> > 
> > About the last part of your email: if the module could be unloaded on some
> > other CPU after call_srcu() has returned, then that CPU would need to see
> > the flag stored by the preceding call_srcu(), so I believe there would need
> > to be a memory barrier between the two operations (call_srcu and module
> > unload).
> 
> In order for the module unload not to race against module execution, it needs
> to happen after the call_srcu() in a way that is already ordered by other
> means; otherwise the unload would be racing against the module code anyway.
> 
> > 
> > Also, about doing the unconditional srcu_barrier(): since a module could be
> > unloaded at any time, don't all SRCU-using modules need to invoke
> > srcu_barrier() during their cleanup anyway, so we are incurring the barrier
> > overhead regardless? Or am I missing a design pattern here? The rcutorture
> > module, for one, definitely calls srcu_barrier() before it is unloaded.
> 
> I think a valid approach, which is even simpler, might be: if a module
> statically defines a SRCU domain, it should be expected to use it, so adding
> a srcu_barrier() to its module going notifier should not hurt. The case where
> a module defines a static SRCU domain *and* does not actually use it with
> call_srcu() does not seem common enough to be worth optimizing for.
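
A sketch of what that notifier might look like, assuming the per-module
srcu_struct pointer array this series introduces (mod->srcu_struct_ptrs
and mod->num_srcu_structs):

	static int srcu_module_notify(struct notifier_block *self,
				      unsigned long val, void *data)
	{
		struct module *mod = data;
		int i;

		if (val == MODULE_STATE_GOING)
			for (i = 0; i < mod->num_srcu_structs; i++)
				srcu_barrier(mod->srcu_struct_ptrs[i]);
		return NOTIFY_OK;
	}
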
> 
> Thoughts ?

Most SRCU users use only synchronize_srcu(), and don't ever use
call_srcu().  Which is not too surprising given that call_srcu() showed
up late in the game.

But something still bothers me about this, and I am not yet sure
what.  One thing that seems to reduce anxiety somewhat is doing the
srcu_barrier() on all calls to cleanup_srcu_struct() rather than just
those invoked from the modules infrastructure, but I don't see why at
the moment.
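
In sketch form, that alternative would hoist the barrier into the common
cleanup path rather than the module notifier (not the actual
implementation, just the shape of the idea):

	void cleanup_srcu_struct(struct srcu_struct *ssp)
	{
		/* Wait for any outstanding callbacks before tearing down
		 * the per-CPU state. */
		srcu_barrier(ssp);
		/* ... existing teardown: flush workqueues, check for
		 * remaining callbacks, free ssp->sda ... */
	}
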

							Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> 
> > 
> > thanks,
> > 
> > - Joel
> > 
> >> Thoughts ?
> >> 
> >> Thanks,
> >> 
> >> Mathieu
> >> 
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> >> http://www.efficios.com
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 
