All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Alex Kogan <alex.kogan@oracle.com>
Cc: linux@armlinux.org.uk, mingo@redhat.com, will.deacon@arm.com,
	arnd@arndb.de, longman@redhat.com, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, guohanjun@huawei.com,
	jglauber@marvell.com, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, dave.dice@oracle.com
Subject: Re: [PATCH v9 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Thu, 23 Jan 2020 11:16:49 +0100	[thread overview]
Message-ID: <20200123101649.GF14946@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200123100635.GE14946@hirez.programming.kicks-ass.net>

On Thu, Jan 23, 2020 at 11:06:35AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 10:26:58AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 14, 2020 at 10:59:18PM -0500, Alex Kogan wrote:
> > > +/* this function is called only when the primary queue is empty */
> > > +static inline bool cna_try_change_tail(struct qspinlock *lock, u32 val,
> > > +				       struct mcs_spinlock *node)
> > > +{
> > > +	struct mcs_spinlock *head_2nd, *tail_2nd;
> > > +	u32 new;
> > > +
> > > +	/* If the secondary queue is empty, do what MCS does. */
> > > +	if (node->locked <= 1)
> > > +		return __try_clear_tail(lock, val, node);
> > > +
> > > +	/*
> > > +	 * Try to update the tail value to the last node in the secondary queue.
> > > +	 * If successful, pass the lock to the first thread in the secondary
> > > +	 * queue. Doing those two actions effectively moves all nodes from the
> > > +	 * secondary queue into the main one.
> > > +	 */
> > > +	tail_2nd = decode_tail(node->locked);
> > > +	head_2nd = tail_2nd->next;
> > > +	new = ((struct cna_node *)tail_2nd)->encoded_tail + _Q_LOCKED_VAL;
> > > +
> > > +	if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
> > > +		/*
> > > +		 * Try to reset @next in tail_2nd to NULL, but no need to check
> > > +		 * the result - if failed, a new successor has updated it.
> > > +		 */
> > 
> > I think you actually have an ordering bug here; the load of head_2nd
> > *must* happen before the atomic_try_cmpxchg(), otherwise it might
> > observe the new next and clear a valid next pointer.
> > 
> > What would be the best fix for that; I'm thinking:
> > 
> > 	head_2nd = smp_load_acquire(&tail_2nd->next);
> > 
> > Will?
> 
> Hmm, given we've not passed the lock around yet; why wouldn't something
> like this work:
> 
> 	smp_store_release(&tail_2nd->next, NULL);

Argh, make that:

	tail_2nd->next = NULL;

	smp_wmb();

> 	if (!atomic_try_cmpxchg_relaxed(&lock, &val, new)) {
> 		tail_2nd->next = head_2nd;
> 		return false;
> 	}
> 
> The whole second queue is only ever modified by the lock owner, and that
> is us, so we can pre-terminate the secondary queue (break the circular
> link), try the cmpxchg and fix it back up when it fails.

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Alex Kogan <alex.kogan@oracle.com>
Cc: linux-arch@vger.kernel.org, guohanjun@huawei.com, arnd@arndb.de,
	dave.dice@oracle.com, jglauber@marvell.com, x86@kernel.org,
	will.deacon@arm.com, linux@armlinux.org.uk,
	steven.sistare@oracle.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	longman@redhat.com, tglx@linutronix.de,
	daniel.m.jordan@oracle.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v9 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Thu, 23 Jan 2020 11:16:49 +0100	[thread overview]
Message-ID: <20200123101649.GF14946@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200123100635.GE14946@hirez.programming.kicks-ass.net>

On Thu, Jan 23, 2020 at 11:06:35AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 10:26:58AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 14, 2020 at 10:59:18PM -0500, Alex Kogan wrote:
> > > +/* this function is called only when the primary queue is empty */
> > > +static inline bool cna_try_change_tail(struct qspinlock *lock, u32 val,
> > > +				       struct mcs_spinlock *node)
> > > +{
> > > +	struct mcs_spinlock *head_2nd, *tail_2nd;
> > > +	u32 new;
> > > +
> > > +	/* If the secondary queue is empty, do what MCS does. */
> > > +	if (node->locked <= 1)
> > > +		return __try_clear_tail(lock, val, node);
> > > +
> > > +	/*
> > > +	 * Try to update the tail value to the last node in the secondary queue.
> > > +	 * If successful, pass the lock to the first thread in the secondary
> > > +	 * queue. Doing those two actions effectively moves all nodes from the
> > > +	 * secondary queue into the main one.
> > > +	 */
> > > +	tail_2nd = decode_tail(node->locked);
> > > +	head_2nd = tail_2nd->next;
> > > +	new = ((struct cna_node *)tail_2nd)->encoded_tail + _Q_LOCKED_VAL;
> > > +
> > > +	if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
> > > +		/*
> > > +		 * Try to reset @next in tail_2nd to NULL, but no need to check
> > > +		 * the result - if failed, a new successor has updated it.
> > > +		 */
> > 
> > I think you actually have an ordering bug here; the load of head_2nd
> > *must* happen before the atomic_try_cmpxchg(), otherwise it might
> > observe the new next and clear a valid next pointer.
> > 
> > What would be the best fix for that; I'm thinking:
> > 
> > 	head_2nd = smp_load_acquire(&tail_2nd->next);
> > 
> > Will?
> 
> Hmm, given we've not passed the lock around yet; why wouldn't something
> like this work:
> 
> 	smp_store_release(&tail_2nd->next, NULL);

Argh, make that:

	tail_2nd->next = NULL;

	smp_wmb();

> 	if (!atomic_try_cmpxchg_relaxed(&lock, &val, new)) {
> 		tail_2nd->next = head_2nd;
> 		return false;
> 	}
> 
> The whole second queue is only ever modified by the lock owner, and that
> is us, so we can pre-terminate the secondary queue (break the circular
> link), try the cmpxchg and fix it back up when it fails.

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Alex Kogan <alex.kogan@oracle.com>
Cc: linux-arch@vger.kernel.org, guohanjun@huawei.com, arnd@arndb.de,
	dave.dice@oracle.com, jglauber@marvell.com, x86@kernel.org,
	will.deacon@arm.com, linux@armlinux.org.uk,
	steven.sistare@oracle.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	longman@redhat.com, tglx@linutronix.de,
	daniel.m.jordan@oracle.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v9 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Thu, 23 Jan 2020 11:16:49 +0100	[thread overview]
Message-ID: <20200123101649.GF14946@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200123100635.GE14946@hirez.programming.kicks-ass.net>

On Thu, Jan 23, 2020 at 11:06:35AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 10:26:58AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 14, 2020 at 10:59:18PM -0500, Alex Kogan wrote:
> > > +/* this function is called only when the primary queue is empty */
> > > +static inline bool cna_try_change_tail(struct qspinlock *lock, u32 val,
> > > +				       struct mcs_spinlock *node)
> > > +{
> > > +	struct mcs_spinlock *head_2nd, *tail_2nd;
> > > +	u32 new;
> > > +
> > > +	/* If the secondary queue is empty, do what MCS does. */
> > > +	if (node->locked <= 1)
> > > +		return __try_clear_tail(lock, val, node);
> > > +
> > > +	/*
> > > +	 * Try to update the tail value to the last node in the secondary queue.
> > > +	 * If successful, pass the lock to the first thread in the secondary
> > > +	 * queue. Doing those two actions effectively moves all nodes from the
> > > +	 * secondary queue into the main one.
> > > +	 */
> > > +	tail_2nd = decode_tail(node->locked);
> > > +	head_2nd = tail_2nd->next;
> > > +	new = ((struct cna_node *)tail_2nd)->encoded_tail + _Q_LOCKED_VAL;
> > > +
> > > +	if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
> > > +		/*
> > > +		 * Try to reset @next in tail_2nd to NULL, but no need to check
> > > +		 * the result - if failed, a new successor has updated it.
> > > +		 */
> > 
> > I think you actually have an ordering bug here; the load of head_2nd
> > *must* happen before the atomic_try_cmpxchg(), otherwise it might
> > observe the new next and clear a valid next pointer.
> > 
> > What would be the best fix for that; I'm thinking:
> > 
> > 	head_2nd = smp_load_acquire(&tail_2nd->next);
> > 
> > Will?
> 
> Hmm, given we've not passed the lock around yet; why wouldn't something
> like this work:
> 
> 	smp_store_release(&tail_2nd->next, NULL);

Argh, make that:

	tail_2nd->next = NULL;

	smp_wmb();

> 	if (!atomic_try_cmpxchg_relaxed(&lock, &val, new)) {
> 		tail_2nd->next = head_2nd;
> 		return false;
> 	}
> 
> The whole second queue is only ever modified by the lock owner, and that
> is us, so we can pre-terminate the secondary queue (break the circular
> link), try the cmpxchg and fix it back up when it fails.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-01-23 10:17 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-15  3:59 [PATCH v9 0/5] Add NUMA-awareness to qspinlock Alex Kogan
2020-01-15  3:59 ` Alex Kogan
2020-01-15  3:59 ` [PATCH v9 1/5] locking/qspinlock: Rename mcs lock/unlock macros and make them more generic Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-15  3:59 ` [PATCH v9 2/5] locking/qspinlock: Refactor the qspinlock slow path Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-15  3:59 ` [PATCH v9 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-23  9:26   ` Peter Zijlstra
2020-01-23  9:26     ` Peter Zijlstra
2020-01-23  9:26     ` Peter Zijlstra
2020-01-23 10:06     ` Peter Zijlstra
2020-01-23 10:06       ` Peter Zijlstra
2020-01-23 10:06       ` Peter Zijlstra
2020-01-23 10:16       ` Peter Zijlstra [this message]
2020-01-23 10:16         ` Peter Zijlstra
2020-01-23 10:16         ` Peter Zijlstra
2020-01-23 11:22         ` Will Deacon
2020-01-23 11:22           ` Will Deacon
2020-01-23 13:17           ` Peter Zijlstra
2020-01-23 13:17             ` Peter Zijlstra
2020-01-23 13:17             ` Peter Zijlstra
2020-01-23 14:15   ` Waiman Long
2020-01-23 14:15     ` Waiman Long
2020-01-23 15:29     ` Peter Zijlstra
2020-01-23 15:29       ` Peter Zijlstra
2020-01-23 15:29       ` Peter Zijlstra
2020-01-15  3:59 ` [PATCH v9 4/5] locking/qspinlock: Introduce starvation avoidance into CNA Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-01-23 19:55   ` Waiman Long
2020-01-23 19:55     ` Waiman Long
2020-01-23 20:39     ` Waiman Long
2020-01-23 20:39       ` Waiman Long
2020-01-23 23:39       ` Alex Kogan
2020-01-23 23:39         ` Alex Kogan
2020-01-15  3:59 ` [PATCH v9 5/5] locking/qspinlock: Introduce the shuffle reduction optimization " Alex Kogan
2020-01-15  3:59   ` Alex Kogan
2020-03-02  1:14   ` [locking/qspinlock] 7b6da71157: unixbench.score 8.4% improvement kernel test robot
2020-03-02  1:14     ` kernel test robot
2020-03-02  1:14     ` kernel test robot
2020-01-22 11:45 ` [PATCH v9 0/5] Add NUMA-awareness to qspinlock Lihao Liang
2020-01-22 11:45   ` Lihao Liang
2020-01-22 17:24   ` Waiman Long
2020-01-22 17:24     ` Waiman Long
2020-01-23 11:35     ` Will Deacon
2020-01-23 11:35       ` Will Deacon
2020-01-23 15:25       ` Waiman Long
2020-01-23 15:25         ` Waiman Long
2020-01-23 19:08         ` Waiman Long
2020-01-23 19:08           ` Waiman Long
2020-01-22 19:29   ` Alex Kogan
2020-01-22 19:29     ` Alex Kogan
2020-01-26  0:32     ` Lihao Liang
2020-01-26  0:32       ` Lihao Liang
2020-01-26  1:58       ` Lihao Liang
2020-01-26  1:58         ` Lihao Liang
2020-01-26  1:58         ` Lihao Liang
2020-01-27 16:01         ` Alex Kogan
2020-01-27 16:01           ` Alex Kogan
2020-01-29  1:39           ` Lihao Liang
2020-01-29  1:39             ` Lihao Liang
2020-01-27  6:16       ` Alex Kogan
2020-01-27  6:16         ` Alex Kogan
2020-01-24 22:24 ` Paul E. McKenney
2020-01-24 22:24   ` Paul E. McKenney
     [not found]   ` <6AAE7FC6-F5DE-4067-8BC4-77F27948CD09@oracle.com>
2020-01-25  0:57     ` Paul E. McKenney
2020-01-25  0:57       ` Paul E. McKenney
2020-01-25  1:59       ` Waiman Long
2020-01-25  1:59         ` Waiman Long
     [not found]         ` <adb4fb09-f374-4d64-096b-ba9ad8b35fd5@redhat.com>
2020-01-25  4:58           ` Paul E. McKenney
2020-01-25  4:58             ` Paul E. McKenney
2020-01-25 19:41             ` Waiman Long
2020-01-25 19:41               ` Waiman Long
2020-01-26 15:35               ` Paul E. McKenney
2020-01-26 15:35                 ` Paul E. McKenney
2020-01-26 22:42                 ` Paul E. McKenney
2020-01-26 22:42                   ` Paul E. McKenney
2020-01-26 23:32                   ` Paul E. McKenney
2020-01-26 23:32                     ` Paul E. McKenney
2020-01-27  6:04                   ` Alex Kogan
2020-01-27  6:04                     ` Alex Kogan
2020-01-27 14:11                   ` Waiman Long
2020-01-27 14:11                     ` Waiman Long
2020-01-27 15:09                     ` Paul E. McKenney
2020-01-27 15:09                       ` Paul E. McKenney
     [not found]                       ` <9b3a3f16-5405-b6d1-d023-b85f4aab46dd@redhat.com>
2020-01-27 17:17                         ` Waiman Long
2020-01-27 17:17                           ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200123101649.GF14946@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=alex.kogan@oracle.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave.dice@oracle.com \
    --cc=guohanjun@huawei.com \
    --cc=hpa@zytor.com \
    --cc=jglauber@marvell.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=steven.sistare@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.