From: Peter Zijlstra <peterz@infradead.org>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: mingo@redhat.com, mcgrof@kernel.org, viro@zeniv.linux.org.uk,
	sfr@canb.auug.org.au, nicolas.dichtel@6wind.com,
	rmk+kernel@armlinux.org.uk, msalter@redhat.com,
	tklauser@distanz.ch, will.deacon@arm.com, james.hogan@imgtec.com,
	paul.gortmaker@windriver.com, linux@roeck-us.net,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	albert@sifive.com, patches@groups.riscv.org
Subject: Re: [PATCH 2/9] RISC-V: Atomic and Locking Code
Date: Wed, 5 Jul 2017 10:43:21 +0200
Message-ID: <20170705084321.hqy4qsm45mv772k4@hirez.programming.kicks-ass.net>
In-Reply-To: <20170704195102.3974-3-palmer@dabbelt.com>

On Tue, Jul 04, 2017 at 12:50:55PM -0700, Palmer Dabbelt wrote:
> +/*
> + * FIXME: I could only find documentation that atomic_{add,sub,inc,dec} are
> + * barrier-free.  I'm assuming that and/or/xor have the same constraints as the
> + * others.
> + */

Yes.. we have new documentation in the works to which I would post a link,
but for some reason copy/paste stopped working again (Konsole does that
at times and is #$%#$%#4# annoying).

Ha, found it using google...

  https://marc.info/?l=linux-kernel&m=14972790112580


> +

> +/*
> + * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
> + * {cmp,}xchg and the operations that return, so they need a barrier.  We just
> + * use the other implementations directly.
> + */

cmpxchg triggers an extra rule: conditional operations only need to
imply barriers on success. So a cmpxchg that fails need not imply any
ordering whatsoever.
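
For illustration only (a sketch of the usual LR/SC shape, not code from
this patch; the function name is made up): the compare failure branches
out before any fence is executed, so only the successful path pays for
the ordering.

	static inline u32 __cmpxchg32_sketch(u32 *p, u32 old, u32 new)
	{
		u32 prev;
		int rc;

		__asm__ __volatile__ (
			"0:	lr.w     %0, %2\n"
			"	bne      %0, %3, 1f\n"	/* compare failed: no barrier at all */
			"	sc.w.rl  %1, %4, %2\n"	/* release ordering on the store */
			"	bnez     %1, 0b\n"
			"	fence    rw, rw\n"	/* full ordering, only on success */
			"1:\n"
			: "=&r" (prev), "=&r" (rc), "+A" (*p)
			: "r" (old), "r" (new)
			: "memory");

		return prev;
	}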

> +/*
> + * Our atomic operations set the AQ and RL bits and therefor we don't need to
> + * fence around atomics.
> + */
> +#define __smb_mb__before_atomic()	barrier()
> +#define __smb_mb__after_atomic()	barrier()

Ah, not quite... you need full barriers here, because your regular
atomic ops imply no ordering whatsoever.
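
Something along these lines is needed instead (a sketch, assuming a
plain "fence rw, rw" full barrier; the helper name is illustrative):

	/*
	 * The non-returning AMOs are completely unordered, so these must be
	 * real memory barriers, not just compiler barriers.
	 */
	#define __riscv_full_fence()		__asm__ __volatile__ ("fence rw, rw" ::: "memory")

	#define __smp_mb__before_atomic()	__riscv_full_fence()
	#define __smp_mb__after_atomic()	__riscv_full_fence()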

> +/*
> + * These barries are meant to prevent memory operations inside a spinlock from
> + * moving outside of that spinlock.  Since we set the AQ and RL bits when
> + * entering or leaving spinlocks, no additional fence needs to be performed.
> + */
> +#define smb_mb__before_spinlock()	barrier()
> +#define smb_mb__after_spinlock()	barrier()

Also probably not true. I _think_ you want a full barrier here, but
given the total lack of documentation on your end and the fact that I've
not yet read the spinlock code (which I suppose is below), I cannot yet
say more.


> +
> +/* FIXME: I don't think RISC-V is allowed to perform a speculative load. */
> +#define smp_acquire__after_ctrl_dep()	barrier()

That would be a very weird thing to disallow... speculative loads are
the awesome ;-) Note that you can get the very same effect from caches
when your stores are not globally atomic.
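
As an aside, a hedged usage sketch (not code from this patch) of what
smp_acquire__after_ctrl_dep() must guarantee: it upgrades a load feeding
a conditional branch into an acquire, so the data load below may not be
satisfied early, whether by load speculation or by forwarding from a
non-multi-copy-atomic store.

	/* Wait for a flag, then consume data published before the flag was set. */
	while (!READ_ONCE(flag))
		cpu_relax();
	smp_acquire__after_ctrl_dep();	/* order the flag load before the data load */
	val = READ_ONCE(data);

A bare barrier() is only enough here if the hardware truly cannot make
the data load visible before the flag load.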

> +/*
> + * The RISC-V ISA doesn't support byte or half-word AMOs, so we fall back to a
> + * regular store and a fence here.  Otherwise we emit an AMO with an AQ or RL
> + * bit set and allow the microarchitecture to avoid the other half of the AMO.
> + */
> +#define __smp_store_release(p, v)					\
> +do {									\
> +	union { typeof(*p) __val; char __c[1]; } __u =			\
> +		{ .__val = (__force typeof(*p)) (v) };			\
> +	compiletime_assert_atomic_type(*p);				\
> +	switch (sizeof(*p)) {						\
> +	case 1:								\
> +	case 2:								\
> +		smb_mb();						\
> +		WRITE_ONCE(*p, __u.__val);				\
> +		break;							\
> +	case 4:								\
> +		__asm__ __volatile__ (					\
> +			"amoswap.w.rl zero, %1, %0"			\
> +			: "+A" (*p), "r" (__u.__val)			\
> +			: 						\
> +			: "memory");					\
> +		break;							\
> +	case 8:								\
> +		__asm__ __volatile__ (					\
> +			"amoswap.d.rl zero, %1, %0"			\
> +			: "+A" (*p), "r" (__u.__val)			\
> +			: 						\
> +			: "memory");					\
> +		break;							\
> +	}								\
> +} while (0)
> +
> +#define __smp_load_acquire(p)						\
> +do {									\
> +	union { typeof(*p) __val; char __c[1]; } __u =			\
> +		{ .__val = (__force typeof(*p)) (v) };			\
> +	compiletime_assert_atomic_type(*p);				\
> +	switch (sizeof(*p)) {						\
> +	case 1:								\
> +	case 2:								\
> +		__u.__val = READ_ONCE(*p);				\
> +		smb_mb();						\
> +		break;							\
> +	case 4:								\
> +		__asm__ __volatile__ (					\
> +			"amoor.w.aq %1, zero, %0"			\
> +			: "+A" (*p)					\
> +			: "=r" (__u.__val)				\
> +			: "memory");					\
> +		break;							\
> +	case 8:								\
> +		__asm__ __volatile__ (					\
> +			"amoor.d.aq %1, zero, %0"			\
> +			: "+A" (*p)					\
> +			: "=r" (__u.__val)				\
> +			: "memory");					\
> +		break;							\
> +	}								\
> +	__u.__val;							\
> +} while (0)

'creative' use of amoswap and amoor :-)

You should really look at a normal load with an ordering instruction
though; that amoor.aq is an RMW and will promote the cacheline to
exclusive (and dirty it).
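
Concretely, something like this (a sketch, under the assumption that a
plain load followed by "fence r, rw" gives acquire semantics, and
"fence rw, w" before a plain store gives release semantics) avoids the
read-modify-write entirely:

	#define __smp_load_acquire_sketch(p)					\
	({									\
		typeof(*(p)) ___v = READ_ONCE(*(p));				\
		__asm__ __volatile__ ("fence r, rw" ::: "memory");		\
		___v;								\
	})

	#define __smp_store_release_sketch(p, v)				\
	do {									\
		__asm__ __volatile__ ("fence rw, w" ::: "memory");		\
		WRITE_ONCE(*(p), (v));						\
	} while (0)

A plain load keeps the cacheline shared, whereas the amoor.aq writes the
(unchanged) value back, forcing the line exclusive and dirtying it.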

> +/*
> + * Simple spin lock operations.  These provide no fairness guarantees.
> + */
> +
> +/* FIXME: Replace this with a ticket lock, like MIPS. */
> +
> +#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
> +#define arch_spin_is_locked(x)	((x)->lock != 0)
> +#define arch_spin_unlock_wait(x) \
> +		do { cpu_relax(); } while ((x)->lock)
> +
> +static inline void arch_spin_unlock(arch_spinlock_t *lock)
> +{
> +	__asm__ __volatile__ (
> +		"amoswap.w.rl x0, x0, %0"
> +		: "=A" (lock->lock)
> +		:: "memory");
> +}
> +
> +static inline int arch_spin_trylock(arch_spinlock_t *lock)
> +{
> +	int tmp = 1, busy;
> +
> +	__asm__ __volatile__ (
> +		"amoswap.w.aq %0, %2, %1"
> +		: "=r" (busy), "+A" (lock->lock)
> +		: "r" (tmp)
> +		: "memory");
> +
> +	return !busy;
> +}
> +
> +static inline void arch_spin_lock(arch_spinlock_t *lock)
> +{
> +	while (1) {
> +		if (arch_spin_is_locked(lock))
> +			continue;
> +
> +		if (arch_spin_trylock(lock))
> +			break;
> +	}
> +}

OK, so back to smp_mb__{before,after}_spinlock(), which wants to order
things like:

	wakeup:					block:

	COND = 1;				p->state = UNINTERRUPTIBLE;
						smp_mb();
	smp_mb__before_spinlock();
	spin_lock(&lock);			if (!COND)
						  schedule()
	if (p->state & state)
		goto out;


And here it is important that the COND store not happen _after_ the
p->state load.

Now, your spin_lock() only implies the AQ thing, which should only
constrain later loads/stores but does nothing for the prior loads/stores.
So our COND store can drop into the lock and even happen after the
p->state load.

So you very much want your smp_mb__{before,after}_spinlock thingies to
be full barriers.
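
In other words, a sketch of what is being asked for (assuming smp_mb()
expands to a full "fence rw, rw" on this architecture):

	/*
	 * Full barriers around the lock acquisition, so the COND store above
	 * cannot sink past the lock acquire and the p->state load.
	 */
	#define smp_mb__before_spinlock()	smp_mb()
	#define smp_mb__after_spinlock()	smp_mb()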
