From: Waiman Long <longman@redhat.com>
To: Alex Kogan <alex.kogan@oracle.com>
Cc: linux@armlinux.org.uk, peterz@infradead.org, mingo@redhat.com,
	will.deacon@arm.com, arnd@arndb.de, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, guohanjun@huawei.com,
	jglauber@marvell.com, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, dave.dice@oracle.com
Subject: Re: [PATCH v10 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Tue, 1 Sep 2020 13:38:29 -0400	[thread overview]
Message-ID: <84fdb30e-387d-7bc1-e1f9-fa57cd9a32c9@redhat.com> (raw)
In-Reply-To: <08E77224-563F-49C7-9E7F-BD98E4FD121D@oracle.com>

On 8/31/20 5:39 PM, Alex Kogan wrote:
>> On Jul 28, 2020, at 4:00 PM, Waiman Long <longman@redhat.com> wrote:
>>
>> On 4/3/20 4:59 PM, Alex Kogan wrote:
>>> In CNA, spinning threads are organized in two queues, a primary queue for
>>> threads running on the same node as the current lock holder, and a
>>> secondary queue for threads running on other nodes. After acquiring the
>>> MCS lock and before acquiring the spinlock, the lock holder scans the
>>> primary queue looking for a thread running on the same node (pre-scan). If
>>> found (call it thread T), all threads in the primary queue between the
>>> current lock holder and T are moved to the end of the secondary queue.
>>> If no such T is found, we make another scan of the primary queue when
>>> unlocking the MCS lock (post-scan), starting at the position where the
>>> pre-scan stopped. If both scans fail to find such T, the MCS lock is
>>> passed to the first thread in the secondary queue. If the secondary queue
>>> is empty, the lock is passed to the next thread in the primary queue.
>>> For more details, see https://arxiv.org/abs/1810.05600 .
>>>
>>> Note that this variant of CNA may introduce starvation by continuously
>>> passing the lock to threads running on the same node. This issue
>>> will be addressed later in the series.
>>>
>>> Enabling CNA is controlled via a new configuration option
>>> (NUMA_AWARE_SPINLOCKS). By default, the CNA variant is patched in at
>>> boot time only if we run on a multi-node machine in a native environment
>>> and the new config is enabled. (For the time being, the patching requires
>>> CONFIG_PARAVIRT_SPINLOCKS to be enabled as well. However, this should be
>>> resolved once static_call() is available.) This default behavior can be
>>> overridden with the new kernel boot command-line option
>>> "numa_spinlock=on/off" (default is "auto").
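For reference, the Kconfig wiring for such an option would look roughly like the following. This is a sketch reconstructed from the cover text above, not the literal hunk from the patch:

```kconfig
config NUMA_AWARE_SPINLOCKS
	bool "NUMA-aware spinlocks"
	depends on NUMA
	depends on QUEUED_SPINLOCKS
	# For now the boot-time patching goes through the paravirt
	# infrastructure; per the text above, this dependency should
	# go away once static_call() is available.
	depends on PARAVIRT_SPINLOCKS
	help
	  Introduce NUMA awareness into the slow path of spinlocks by
	  patching in the CNA variant at boot on multi-node machines.
```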
>>>
>>> Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
>>> Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
>>> Reviewed-by: Waiman Long <longman@redhat.com>
>>> ---
>> There is also a concern that the worst-case latency for a lock transfer can be close to O(n), which can be quite large for large SMP systems. I have a patch on top that modifies the current behavior to limit the number of node scans after the lock is freed.
> I understand the concern. While your patch addresses it, I am afraid it makes
> the code somewhat more complex, and duplicates some of the slow path
> functionality (e.g., the spin loop until the lock value changes to a certain
> value).
>
> Let me suggest a different idea for gradually restructuring the main queue
> that has some similarity to the way you suggested to handle prioritized waiters.
> Basically, instead of scanning the entire chain of main queue waiters,
> we can check only the next waiter and, if present and it runs on a different
> node, move it to the secondary queue. In addition, to maintain the preference
> for a certain numa node ID, we set the numa node of the next-next waiter,
> if present, to that of the current lock holder. This is the part similar to the
> way you suggested to handle prioritized waiters.
>
> This way, the worst-case complexity of cna_order_queue() decreases from O(n)
> down to O(1), as we always "scan" only one waiter. And as before, we change
> the preference (and flush the secondary queue) after X handovers (or after
> Y ms, as in your other patch).
>
> I attach the patch that applies on top of your patch for prioritized nodes
> (0006), but does not include your patch 0007 (time based threshold),
> which I will integrate into the series in the next revision.
>
> Please, let me know what you think.
>
That is an interesting idea. I don't have any fundamental objection to 
that. I just wonder how it will impact the kind of performance test that 
you ran before. It would be nice to see the performance impact with that 
change.

Cheers,
Longman


Thread overview: 27+ messages
2020-04-03 20:59 [PATCH v10 0/5] Add NUMA-awareness to qspinlock Alex Kogan
2020-04-03 20:59 ` [PATCH v10 1/5] locking/qspinlock: Rename mcs lock/unlock macros and make them more generic Alex Kogan
2020-04-03 20:59 ` [PATCH v10 2/5] locking/qspinlock: Refactor the qspinlock slow path Alex Kogan
2020-04-03 20:59 ` [PATCH v10 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock Alex Kogan
2020-04-04 23:25   ` kbuild test robot
2020-04-07 21:57     ` Alex Kogan
2020-07-28 20:00   ` Waiman Long
2020-08-31 21:39     ` Alex Kogan
2020-09-01 17:38       ` Waiman Long [this message]
2020-04-03 20:59 ` [PATCH v10 4/5] locking/qspinlock: Introduce starvation avoidance into CNA Alex Kogan
2020-07-28 19:39   ` Waiman Long
2020-04-03 20:59 ` [PATCH v10 5/5] locking/qspinlock: Avoid moving certain threads between waiting queues in CNA Alex Kogan
2020-07-28 19:34   ` Waiman Long
2020-05-04 14:17 ` [PATCH v10 0/5] Add NUMA-awareness to qspinlock Alex Kogan
