linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: joel@joelfernandes.org
To: Frederic Weisbecker <frederic@kernel.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Marco Elver <elver@google.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	"Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	fweisbec@gmail.com, neeraj.iitr10@gmail.com,
	stern@rowland.harvard.edu
Subject: Re: [PATCH v7 6/6] rcu/segcblist: Add additional comments to explain smp_mb()
Date: Fri, 16 Oct 2020 21:27:53 -0400	[thread overview]
Message-ID: <20201017012753.GB4015033@google.com> (raw)
In-Reply-To: <20201015133511.GB127222@lothringen>

Adding Alan as well as its memory barrier discussion ;-)

On Thu, Oct 15, 2020 at 03:35:11PM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 14, 2020 at 08:23:01PM -0400, Joel Fernandes (Google) wrote:
> > Memory barriers are needed when updating the full length of the
> > segcblist, however it is not fully clearly why one is needed before and
> > after. This patch therefore adds additional comments to the function
> > header to explain it.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> >  kernel/rcu/rcu_segcblist.c | 38 ++++++++++++++++++++++++++++++++++----
> >  1 file changed, 34 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 271d5d9d7f60..25ffd07f9951 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -147,17 +147,47 @@ static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
> >   * field to disagree with the actual number of callbacks on the structure.
> >   * This increase is fully ordered with respect to the callers accesses
> >   * both before and after.
> > + *
> > + * About memory barriers:
> > + * There is a situation where rcu_barrier() locklessly samples the full
> > + * length of the segmented cblist before deciding what to do. That can
> > + * race with another path that calls this function. rcu_barrier() should
> > + * not wrongly assume there are no callbacks, so any transitions from 1->0
> > + * and 0->1 have to be carefully ordered with respect to list modifications.
> > + *
> > + * Memory barrier is needed before adding to length, for the case where
> > + * v is negative which does not happen in current code, but used
> > + * to happen. Keep the memory barrier for robustness reasons.
> 
> Heh, I seem to recongnize someone's decision's style ;-)

Actually, the last paragraph I added is bogus. Indeed this memory barrier is
not just for robustness reasons. It is needed because rcu_do_batch() adjusts
the length of the list (possibly to 0) _after_ executing the callbacks, so
that's a negative number:
        rcu_segcblist_add_len(&rdp->cblist, -count);

> >     When/If the
> > + * length transitions from 1 -> 0, the write to 0 has to be ordered *after*
> > + * the memory accesses of the CBs that were dequeued and the segcblist
> > + * modifications:
> > + * P0 (what P1 sees)	P1
> > + * set len = 0
> > + *                      rcu_barrier sees len as 0
> > + * dequeue from list
> > + *                      rcu_barrier does nothing.
> 
> It's a bit difficult to read that way. So that would be:
> 
> 
>       rcu_do_batch()                rcu_barrier()
>       --                            --
>       dequeue                       l = READ(len)
>       smp_mb()                      if (!l)
>       WRITE(len, 0)                     check next CPU...
> 
> But I'm a bit confused against what it pairs in rcu_barrier().

I believe it pairs with an implied memory barrier via control dependency.

The following litmus test would confirm it:

C rcubarrier+ctrldep

(*
 * Result: Never
 *
 * This litmus test shows that rcu_barrier (P1) prematurely
 * returning by reading len 0 can cause issues if P0 does
 * NOT have a smb_mb() before WRITE_ONCE().
 *
 * mod_data == 2 means garbage which the callback should never see.
 *)

{ int len = 1; }

P0(int *len, int *mod_data)
{
        int r0;

        // accessed by say RCU callback in rcu_do_batch();
        r0 = READ_ONCE(*mod_data);
        smp_mb(); // Remove this and the "exists" will become true.
        WRITE_ONCE(*len, 0);
}

P1(int *len, int *mod_data)
{
        int r0;

        r0 = READ_ONCE(*len);

        // rcu_barrier will return early if len is 0
        if (r0 == 0)
                WRITE_ONCE(*mod_data, 2);
}

// Is it possible?
exists (0:r0=2 /\ 1:r0=0)

> > + *
> > + * Memory barrier is needed after adding to length for the case
> > + * where length transitions from 0 -> 1. This is because rcu_barrier()
> > + * should never miss an update to the length. So the update to length
> > + * has to be seen *before* any modifications to the segmented list. Otherwise a
> > + * race can happen.
> > + * P0 (what P1 sees)	P1
> > + * queue to list
> > + *                      rcu_barrier sees len as 0
> > + * set len = 1.
> > + *                      rcu_barrier does nothing.
> 
> So that would be:
> 
>       call_rcu()                    rcu_barrier()
>       --                            --
>       WRITE(len, len + 1)           l = READ(len)
>       smp_mb()                      if (!l)
>       queue                            check next CPU...
> 
> 
> But I still don't see against what it pairs in rcu_barrier.

Actually, for the second case maybe a similar reasoning can be applied
(control dependency) but I'm unable to come up with a litmus test.
In fact, now I'm wondering how is it possible that call_rcu() races with
rcu_barrier(). The module should ensure that no more call_rcu() should happen
before rcu_barrier() is called.

confused,

 - Joel


  reply	other threads:[~2020-10-17  6:03 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15  0:22 [PATCH v7 0/6] Add support for length of each segment in the segcblist Joel Fernandes (Google)
2020-10-15  0:22 ` [PATCH v7 1/6] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
2020-10-15  0:22 ` [PATCH v7 2/6] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
2020-10-15 12:21   ` Frederic Weisbecker
2020-10-17  1:31     ` joel
2020-10-21 15:33     ` joel
2020-10-21 21:53       ` Frederic Weisbecker
2020-10-21 22:31         ` Joel Fernandes
2020-10-18  8:23   ` [rcu/segcblist] e08055898f: WARNING:at_kernel/rcu/srcutree.c:#cleanup_srcu_struct kernel test robot
2020-10-21 14:40     ` joel
2020-10-15  0:22 ` [PATCH v7 3/6] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
2020-10-15  0:22 ` [PATCH v7 4/6] rcu/segcblist: Remove useless rcupdate.h include Joel Fernandes (Google)
2020-10-15  0:23 ` [PATCH v7 5/6] rcu/tree: Remove redundant smp_mb() in rcu_do_batch Joel Fernandes (Google)
2020-10-15  0:23 ` [PATCH v7 6/6] rcu/segcblist: Add additional comments to explain smp_mb() Joel Fernandes (Google)
2020-10-15 13:35   ` Frederic Weisbecker
2020-10-17  1:27     ` joel [this message]
2020-10-17  3:19       ` joel
2020-10-17 13:29         ` Frederic Weisbecker
2020-10-18  0:35           ` joel
2020-10-19 12:37             ` Frederic Weisbecker
2020-10-21 18:57               ` Joel Fernandes
2020-10-21 21:16                 ` Frederic Weisbecker
2020-10-17 20:24         ` Alan Stern
2020-10-18 20:45           ` Joel Fernandes
2020-10-18 21:15             ` Alan Stern
2020-10-17 14:31       ` Alan Stern
2020-10-18 20:16         ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201017012753.GB4015033@google.com \
    --to=joel@joelfernandes.org \
    --cc=elver@google.com \
    --cc=frederic@kernel.org \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stern@rowland.harvard.edu \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).