All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKMM Maintainers -- Akira Yokosawa <akiyks@gmail.com>,
	Andrea Parri <andrea.parri@amarulasolutions.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	David Howells <dhowells@redhat.com>,
	Daniel Lustig <dlustig@nvidia.com>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	Kernel development list <linux-kernel@vger.kernel.org>,
	Luc Maranget <luc.maranget@inria.fr>,
	Alan Stern <stern@rowland.harvard.edu>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH] Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic()
Date: Sat, 20 Apr 2019 01:54:40 -0700	[thread overview]
Message-ID: <20190420085440.GK14111@linux.ibm.com> (raw)
In-Reply-To: <1555719429.t9n8gkf70y.astroid@bobo.none>

On Sat, Apr 20, 2019 at 10:26:15AM +1000, Nicholas Piggin wrote:
> Paul E. McKenney's on April 20, 2019 4:26 am:
> > On Fri, Apr 19, 2019 at 08:00:17PM +0200, Peter Zijlstra wrote:
> >> On Fri, Apr 19, 2019 at 01:21:45PM -0400, Alan Stern wrote:
> >> > Index: usb-devel/Documentation/atomic_t.txt
> >> > ===================================================================
> >> > --- usb-devel.orig/Documentation/atomic_t.txt
> >> > +++ usb-devel/Documentation/atomic_t.txt
> >> > @@ -171,7 +171,10 @@ The barriers:
> >> >    smp_mb__{before,after}_atomic()
> >> >  
> >> >  only apply to the RMW ops and can be used to augment/upgrade the ordering
> >> > -inherent to the used atomic op. These barriers provide a full smp_mb().
> >> > +inherent to the used atomic op. Unlike normal smp_mb() barriers, they order
> >> > +only the RMW op itself against the instructions preceding the
> >> > +smp_mb__before_atomic() or following the smp_mb__after_atomic(); they do
> >> > +not order instructions on the other side of the RMW op at all.
> >> 
> >> Now it is I who is confused; what?
> >> 
> >> 	x = 1;
> >> 	smp_mb__before_atomic();
> >> 	atomic_add(1, &a);
> >> 	y = 1;
> >> 
> >> the stores to both x and y will be ordered as if an smp_mb() where
> >> there. There is no order between a and y otoh.
> > 
> > Let's look at x86.  And a slightly different example:
> > 
> > 	x = 1;
> > 	smp_mb__before_atomic();
> > 	atomic_add(1, &a);
> > 	r1 = y;
> > 
> > The atomic_add() asm does not have the "memory" constraint, which is
> > completely legitimate because atomic_add() does not return a value,
> > and thus guarantees no ordering.  The compiler is therefore within
> > its rights to transform the code into the following:
> > 
> > 	x = 1;
> > 	smp_mb__before_atomic();
> > 	r1 = y;
> > 	atomic_add(1, &a);
> > 
> > But x86's smp_mb__before_atomic() is just a compiler barrier, and
> > x86 is further allowed to reorder prior stores with later loads.
> > The CPU can therefore execute this code as follows:
> > 
> > 	r1 = y;
> > 	x = 1;
> > 	smp_mb__before_atomic();
> > 	atomic_add(1, &a);
> > 
> > So in general, the ordering is guaranteed only to the atomic itself,
> > not to accesses on the other side of the atomic.
> 
> That's interesting. I don't think that's what all our code expects.
> I had the same idea as Peter.
> 
> IIRC the primitive was originally introduced exactly so x86 would not
> need to have the unnecessary hardware barrier with sequences like
> 
>   smp_mb();
>   ...
>   atomic_inc(&v);
> 
> The "new" semantics are a bit subtle. One option might be just to
> replace it entirely with atomic_xxx_mb() ?

Hmmm...  There are more than 2,000 uses of atomic_inc() in the kernel.
There are about 300-400 total between smp_mb__before_atomic() and
smp_mb__after_atomic().

So what are our options?

1.	atomic_xxx_mb() as you say.

	From a quick scan of smp_mb__before_atomic() uses, we need this
	for atomic_inc(), atomic_dec(), atomic_add(), atomic_sub(),
	clear_bit(), set_bit(), test_bit(), atomic_long_dec(),
	atomic_long_add(), refcount_dec(), cmpxchg_relaxed(),
	set_tsk_thread_flag(), clear_bit_unlock().

	Another random look identifies atomic_andnot().

	And atomic_set(): set_preempt_state().	This fails
	on x86, s390, and TSO friends, does it not?  Or is
	this ARM-only?	Still, why not just smp_mb() before and
	after?	Same issue in __kernfs_new_node(), bio_cnt_set(),
	sbitmap_queue_update_wake_batch(), 

	Ditto for atomic64_set() in __ceph_dir_set_complete().

	Ditto for atomic_read() in rvt_qp_is_avail().  This function
	has a couple of other oddly placed smp_mb__before_atomic().

	And atomic_cmpxchg(): msc_buffer_alloc().  This instance
	of smp_mb__before_atomic() can be removed unless I am missing
	something subtle.  Ditto for kvm_vcpu_exiting_guest_mode(),
	pv_kick_node(), __sbq_wake_up(), 

	And lock acquisition??? acm_read_bulk_callback().

	In nfnl_acct_fill_info(), a smp_mb__before_atomic() after
	a atomic64_xchg()???  Also before a clear_bit(), but the
	clear_bit() is inside an "if".

	There are a few cases that would see added overhead.  For example,
	svc_get_next_xprt() has the following:

		smp_mb__before_atomic();
		clear_bit(SP_CONGESTED, &pool->sp_flags);
		clear_bit(RQ_BUSY, &rqstp->rq_flags);
		smp_mb__after_atomic();

	And xs_sock_reset_connection_flags() has this:

		smp_mb__before_atomic();
		clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
		clear_bit(XPRT_CLOSING, &xprt->state);
		xs_sock_reset_state_flags(xprt);  /* Also a clear_bit(). */
		smp_mb__after_atomic();

	Yeah, there are more than a few misuses, aren't there?  :-/
	A coccinelle script seems in order.  In 0day test robot.

	But there are a number of helper functions whose purpose
	seems to be to wrap an atomic in smp_mb__before_atomic() and
	smp_mb__after_atomic(), so some of the atomic_xxx_mb() functions
	might be a good idea just for improved readability.

2.	Add something to checkpatch.pl warning about non-total ordering,
	with the error message explaining that the ordering is partial.

3.	Make non-value-returning atomics provide full ordering.
	This would of course need some benchmarking, but would be a
	simple change to make and would eliminate a large class of
	potential bugs.  My guess is that the loss in performance
	would be non-negligible, but who knows?

4.	Your idea here!

							Thanx, Paul


  reply	other threads:[~2019-04-20  8:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-19 17:21 [PATCH] Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic() Alan Stern
2019-04-19 17:54 ` Paul E. McKenney
2019-04-19 18:00 ` Peter Zijlstra
2019-04-19 18:26   ` Paul E. McKenney
2019-04-20  0:26     ` Nicholas Piggin
2019-04-20  8:54       ` Paul E. McKenney [this message]
2019-04-23 12:17         ` Peter Zijlstra
2019-04-23 13:21           ` Paul E. McKenney
2019-04-23 13:26             ` Peter Zijlstra
2019-04-23 20:16               ` Paul E. McKenney
2019-04-23 20:28                 ` Peter Zijlstra
2019-04-24  8:29                   ` Paul E. McKenney
2019-04-24  8:44                     ` Peter Zijlstra
2019-04-23 12:32         ` Peter Zijlstra
2019-04-23 13:30           ` Paul E. McKenney
2019-04-23 13:40             ` Peter Zijlstra
2019-04-23 20:19               ` Paul E. McKenney
2019-04-27  8:17             ` Andrea Parri
2019-04-27  8:36               ` Paul E. McKenney
2019-04-29  9:24             ` Johan Hovold
2019-04-29 14:49               ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190420085440.GK14111@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=akiyks@gmail.com \
    --cc=andrea.parri@amarulasolutions.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dlustig@nvidia.com \
    --cc=j.alglave@ucl.ac.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luc.maranget@inria.fr \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=stern@rowland.harvard.edu \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.