linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Guo Ren <guoren@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Alex Kogan <alex.kogan@oracle.com>,
	linux@armlinux.org.uk, mingo@redhat.com, will.deacon@arm.com,
	arnd@arndb.de, longman@redhat.com, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, guohanjun@huawei.com,
	jglauber@marvell.com, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, dave.dice@oracle.com
Subject: Re: [PATCH v15 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Fri, 4 Aug 2023 20:19:25 -0400	[thread overview]
Message-ID: <CAJF2gTQB7pvLYCkZ+9Xmdtmv4ZVynC2embAza-sgPwL8c-D-sw@mail.gmail.com> (raw)
In-Reply-To: <20230804182312.GO212435@hirez.programming.kicks-ass.net>

On Fri, Aug 4, 2023 at 2:24 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Aug 04, 2023 at 10:17:35AM -0400, Guo Ren wrote:
>
> > > See, this is where the ARM64 WFE would come in handy; I don't suppose
> > > RISC-V has anything like that?
> > Em... arm64 smp_cond_load only could save power consumption or release
> > the pipeline resources of an SMT processor. When (Node1 cpu64) is in
> > the WFE state, it still needs (Node0 cpu1) to write the value to give
> > a cross-NUMA signal. So I didn't see what WFE related to reducing
> > cross-Numa transactions, or I missed something. Sorry
>
> The benefit is that WFE significantly reduces the memory traffic. Since
> it 'suspends' the core and waits for a write-notification instead of
> busy polling the memory location you get a ton less loads.
Em... I had a different observation: When a long lock queue appeared
by a store buffer delay problem in the lock torture test, we observed
all interconnects get into a quiet state, and there was no more memory
traffic. All the cores are loop-loading "different" cacheline from
their L1 cache, caused by queued_spinlock. So I don't see any memory
traffics on the bus.

For the LL + WFE, AFAIK, LL is a load instruction that would grab the
cacheline from the bus into the L1-cache and set the reservation set
(arm may call it exclusive-monitor). If any cacheline invalidation
requests (readunique/cleanunique/...) come in, WFE would retire, and
the reservation set would be cleared. So from a cacheline perspective,
there is no difference between "LL+WFE" and "looping loads."

Let's see two scenarios of LL+WFE, multi-cores, and muti-threadings of one core:
 - In the multi-cores case, WFE didn't give any more benefits than the
loop loading from my perspective. Because the only thing WFE could do
is to "suspend core" (I borrowed your word here), but it can't be deep
sleep because the response from WFE is the most prior thing. As you
said, we should prevent "terribly contended" situations, so WFE must
keep fast reactions in the pipeline, not deep sleep. That's WFI stuff.
And loop loading also could reduce power consumption through the
proper micro-arch design: When the pipeline gets into a loop loading
state, the loop buffer mechanism start, no instructions fetch happens,
the frontend component can suspend for a while, and the only working
components are "loop buffer" and "LSU load path." Other components
could suspend for a while. So loop loading is not as terrible as you
thought.

 - In the multi-threading of one core case, introducing an ISA
instruction (WFE) to solve the loop loading problem is worthwhile
because the thread could release the resource of the processor's pipe
line.


>


--
Best Regards
 Guo Ren

  reply	other threads:[~2023-08-05  0:19 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14 20:07 [PATCH v15 0/6] Add NUMA-awareness to qspinlock Alex Kogan
2021-05-14 20:07 ` [PATCH v15 1/6] locking/qspinlock: Rename mcs lock/unlock macros and make them more generic Alex Kogan
2021-05-14 20:07 ` [PATCH v15 2/6] locking/qspinlock: Refactor the qspinlock slow path Alex Kogan
2021-05-14 20:07 ` [PATCH v15 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock Alex Kogan
2021-09-22 19:25   ` Davidlohr Bueso
2021-09-22 19:52     ` Waiman Long
2023-08-04  1:49     ` Guo Ren
2021-09-30 10:05   ` Barry Song
2023-08-02 23:14   ` Guo Ren
2023-08-03  8:50     ` Peter Zijlstra
2023-08-03 10:28       ` Guo Ren
2023-08-03 11:56         ` Peter Zijlstra
2023-08-04  1:33           ` Guo Ren
2023-08-04  1:38             ` Guo Ren
2023-08-04  8:25             ` Peter Zijlstra
2023-08-04 14:17               ` Guo Ren
2023-08-04 18:23                 ` Peter Zijlstra
2023-08-05  0:19                   ` Guo Ren [this message]
2021-05-14 20:07 ` [PATCH v15 4/6] locking/qspinlock: Introduce starvation avoidance into CNA Alex Kogan
2021-05-14 20:07 ` [PATCH v15 5/6] locking/qspinlock: Avoid moving certain threads between waiting queues in CNA Alex Kogan
2021-05-14 20:07 ` [PATCH v15 6/6] locking/qspinlock: Introduce the shuffle reduction optimization into CNA Alex Kogan
2021-09-30  9:44 ` [PATCH v15 0/6] Add NUMA-awareness to qspinlock Barry Song
2021-09-30 16:58   ` Waiman Long
2021-09-30 22:57     ` Barry Song
2021-09-30 23:51   ` Alex Kogan
2021-12-13 20:37 ` Alex Kogan
2021-12-15 15:13   ` Alex Kogan
2022-04-11 17:09 ` Alex Kogan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJF2gTQB7pvLYCkZ+9Xmdtmv4ZVynC2embAza-sgPwL8c-D-sw@mail.gmail.com \
    --to=guoren@kernel.org \
    --cc=alex.kogan@oracle.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave.dice@oracle.com \
    --cc=guohanjun@huawei.com \
    --cc=hpa@zytor.com \
    --cc=jglauber@marvell.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=steven.sistare@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).