From: Alex Kogan <alex.kogan@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org, guohanjun@huawei.com, arnd@arndb.de,
	dave.dice@oracle.com, jglauber@marvell.com, x86@kernel.org,
	will.deacon@arm.com, linux@armlinux.org.uk,
	steven.sistare@oracle.com, linux-kernel@vger.kernel.org,
	rahul.x.yadav@oracle.com, mingo@redhat.com, bp@alien8.de,
	hpa@zytor.com, longman@redhat.com, tglx@linutronix.de,
	daniel.m.jordan@oracle.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Tue, 16 Jul 2019 13:19:16 -0400
Message-ID: <193BBB31-F376-451F-BDE1-D4807140EB51@oracle.com>
In-Reply-To: <20190716155022.GR3419@hirez.programming.kicks-ass.net>

Hi, Peter.

Thanks for the review and all the suggestions!

A couple of comments are inlined below.

> On Jul 16, 2019, at 11:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Mon, Jul 15, 2019 at 03:25:34PM -0400, Alex Kogan wrote:
>> +static struct cna_node *find_successor(struct mcs_spinlock *me)
>> +{
>> +	struct cna_node *me_cna = CNA_NODE(me);
>> +	struct cna_node *head_other, *tail_other, *cur;
>> +	struct cna_node *next = CNA_NODE(READ_ONCE(me->next));
>> +	int my_node;
>> +
>> +	/* @next should be set, else we would not be calling this function. */
>> +	WARN_ON_ONCE(next == NULL);
>> +
>> +	my_node = me_cna->numa_node;
>> +
>> +	/*
>> +	 * Fast path - check whether the immediate successor runs on
>> +	 * the same node.
>> +	 */
>> +	if (next->numa_node == my_node)
>> +		return next;
>> +
>> +	head_other = next;
>> +	tail_other = next;
>> +
>> +	/*
>> +	 * Traverse the main waiting queue starting from the successor of my
>> +	 * successor, and look for a thread running on the same node.
>> +	 */
>> +	cur = CNA_NODE(READ_ONCE(next->mcs.next));
>> +	while (cur) {
>> +		if (cur->numa_node == my_node) {
>> +			/*
>> +			 * Found a thread on the same node. Move threads
>> +			 * between me and that node into the secondary queue.
>> +			 */
>> +			if (me->locked > 1)
>> +				CNA_NODE(me->locked)->tail->mcs.next =
>> +					(struct mcs_spinlock *)head_other;
>> +			else
>> +				me->locked = (uintptr_t)head_other;
>> +			tail_other->mcs.next = NULL;
>> +			CNA_NODE(me->locked)->tail = tail_other;
>> +			return cur;
>> +		}
>> +		tail_other = cur;
>> +		cur = CNA_NODE(READ_ONCE(cur->mcs.next));
>> +	}
>> +	return NULL;
>> +}
> 
> static void cna_move(struct cna_node *cn, struct cna_node *cni)
> {
> 	struct cna_node *head, *tail;
> 
> 	/* remove @cni */
> 	WRITE_ONCE(cn->mcs.next, cni->mcs.next);
> 
> 	/* stick @cni on the 'other' list tail */
> 	cni->mcs.next = NULL;
> 
> 	if (cn->mcs.locked <= 1) {
> 		/* head = tail = cni */
> 		head = cni;
> 		head->tail = cni;
> 		cn->mcs.locked = head->encoded_tail;
> 	} else {
> 		/* add to tail */
> 		head = (struct cna_node *)decode_tail(cn->mcs.locked);
> 		tail = head->tail;
> 		tail->mcs.next = &cni->mcs;
> 		head->tail = cni;
> 	}
> }
> 
> static struct cna_node *cna_find_next(struct mcs_spinlock *node)
> {
> 	struct cna_node *cni, *cn = (struct cna_node *)node;
> 
> 	while ((cni = (struct cna_node *)READ_ONCE(cn->mcs.next))) {
> 		if (likely(cni->node == cn->node))
> 			break;
> 
> 		cna_move(cn, cni);
> 	}
> 
> 	return cni;
> }
But then you move nodes from the main list to the ‘other’ list one by one.
I’m afraid this would be unnecessarily expensive.
Plus, all this extra work is wasted if you do not find a thread on the same
NUMA node (you move everyone to the ‘other’ list only to move them back in
cna_mcs_pass_lock()).
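To illustrate the alternative, here is a rough, untested sketch (the helper
name is made up) of splicing the whole [first, last] span onto the ‘other’
list in one step, which is essentially what find_successor() does today:

static void cna_splice_span(struct mcs_spinlock *me, struct cna_node *first,
			    struct cna_node *last)
{
	/* append the span after the current 'other' tail, or start the list */
	if (me->locked > 1)
		CNA_NODE(me->locked)->tail->mcs.next =
			(struct mcs_spinlock *)first;
	else
		me->locked = (uintptr_t)first;

	/* the last node of the span becomes the new 'other' tail */
	last->mcs.next = NULL;
	CNA_NODE(me->locked)->tail = last;
}

This touches only the two endpoints of the span, no matter how many nodes
sit in between.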

> 
>> +static inline bool cna_set_locked_empty_mcs(struct qspinlock *lock, u32 val,
>> +					struct mcs_spinlock *node)
>> +{
>> +	/* Check whether the secondary queue is empty. */
>> +	if (node->locked <= 1) {
>> +		if (atomic_try_cmpxchg_relaxed(&lock->val, &val,
>> +				_Q_LOCKED_VAL))
>> +			return true; /* No contention */
>> +	} else {
>> +		/*
>> +		 * Pass the lock to the first thread in the secondary
>> +		 * queue, but first try to update the queue's tail to
>> +		 * point to the last node in the secondary queue.
> 
> 
> That comment doesn't make sense; there's at least one conditional
> missing.
In CNA, we cannot just clear the tail when the MCS chain is empty, as
there might still be nodes in the ‘other’ chain. In that case (this is the
“else” part), we want to pass the lock to the first node in the ‘other’ chain,
but first we need to make the last node of that chain the new queue tail.
Perhaps the comment should read “… but first try to update the *primary*
queue's tail …”, if that makes more sense.
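Concretely, the “else” branch with the revised comment would read roughly
like this (wording still tentative):

	} else {
		/*
		 * Pass the lock to the first thread in the secondary
		 * queue, but first try to update the primary queue's
		 * tail to point to the last node in the secondary queue
		 * (which becomes the tail of the whole queue).
		 */
		struct cna_node *succ = CNA_NODE(node->locked);
		u32 new = succ->tail->encoded_tail + _Q_LOCKED_VAL;

		if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
			arch_mcs_spin_unlock_contended(&succ->mcs.locked, 1);
			return true;
		}
	}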

> 
>> +		 */
>> +		struct cna_node *succ = CNA_NODE(node->locked);
>> +		u32 new = succ->tail->encoded_tail + _Q_LOCKED_VAL;
>> +
>> +		if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
>> +			arch_mcs_spin_unlock_contended(&succ->mcs.locked, 1);
>> +			return true;
>> +		}
>> +	}
>> +
>> +	return false;
>> +}
> 
> static bool cna_try_clear_tail(struct qspinlock *lock, u32 val, struct mcs_spinlock *node)
> {
> 	if (node->locked <= 1)
> 		return __try_clear_tail(lock, val, node);
> 
> 	/* the other case */
> }
Good point, thanks.

> 
>> +static inline void cna_pass_mcs_lock(struct mcs_spinlock *node,
>> +				     struct mcs_spinlock *next)
>> +{
>> +	struct cna_node *succ = NULL;
>> +	u64 *var = &next->locked;
>> +	u64 val = 1;
>> +
>> +	succ = find_successor(node);
> 
> This makes unlock O(n), which is 'funneh' and undocumented.
I will add a comment above the call to find_successor() / cna_find_next().
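Something along these lines, perhaps:

	/*
	 * Try to find a successor running on the same NUMA node as the
	 * current lock holder. Note that this traverses the main queue,
	 * so the cost of passing the lock is O(n) in the worst case,
	 * where n is the number of waiting threads.
	 */
	succ = find_successor(node);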

> 
>> +
>> +	if (succ) {
>> +		var = &succ->mcs.locked;
>> +		/*
>> +		 * We unlock a successor by passing a non-zero value,
>> +		 * so set @val to 1 iff @locked is 0, which will happen
>> +		 * if we acquired the MCS lock when its queue was empty
>> +		 */
>> +		val = node->locked + (node->locked == 0);
>> +	} else if (node->locked > 1) { /* if the secondary queue is not empty */
>> +		/* pass the lock to the first node in that queue */
>> +		succ = CNA_NODE(node->locked);
>> +		succ->tail->mcs.next = next;
>> +		var = &succ->mcs.locked;
> 
>> +	}	/*
>> +		 * Otherwise, pass the lock to the immediate successor
>> +		 * in the main queue.
>> +		 */
> 
> I don't think this mis-indented comment can happen. The call-site
> guarantees @next is non-null.
> 
> Therefore, cna_find_next() will either return it, or place it on the
> secondary list. If it (cna_find_next) returns NULL, we must have a
> non-empty secondary list.
> 
> In no case do I see this tertiary condition being possible.
find_successor() will return NULL if it does not find a thread running on the
same NUMA node, and the secondary queue might well be empty at that point.
That is exactly the case the last branch covers.
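To spell out the three cases the current code distinguishes (control flow
only, not the final code):

	succ = find_successor(node);

	if (succ) {
		/* 1) found a successor on the same NUMA node */
	} else if (node->locked > 1) {
		/* 2) no same-node successor, but the secondary queue is
		 *    not empty: pass the lock to its first node */
	} else {
		/* 3) no same-node successor and the secondary queue is
		 *    empty: pass the lock to the immediate successor
		 *    @next in the main queue */
	}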

> 
>> +
>> +	arch_mcs_spin_unlock_contended(var, val);
>> +}
> 
> This also renders this @next argument superfluous.
> 
> static void cna_mcs_pass_lock(struct mcs_spinlock *node, struct mcs_spinlock *next)
> {
> 	next = (struct mcs_spinlock *)cna_find_next(node);
> 	if (!next) {
> 		BUG_ON(node->locked <= 1);
> 		next = decode_tail(node->locked);
> 		node->locked = 1;
> 	}
> 
> 	arch_mcs_pass_lock(&next->locked, node->locked);
> }

@next is passed to avoid reloading it from @node.
This is probably most important for the native code (__pass_mcs_lock()).
That function should be inlined, however, so the extra load should not matter.
Bottom line, I agree that we can remove the @next argument.

Best regards,
— Alex


