From: Will Deacon <will.deacon@arm.com>
To: Vikram Mulukutla <markivx@codeaurora.org>
Cc: qiaozhou <qiaozhou@asrmicro.com>,
Thomas Gleixner <tglx@linutronix.de>,
John Stultz <john.stultz@linaro.org>,
sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
Wang Wilbur <wilburwang@asrmicro.com>,
Marc Zyngier <marc.zyngier@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Fri, 28 Jul 2017 10:28:32 +0100 [thread overview]
Message-ID: <20170728092831.GA24839@arm.com> (raw)
In-Reply-To: <e1cc02c5e7dfd4d6bec937b6dc97bfc7@codeaurora.org>
On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:
> On 2017-07-26 18:29, qiaozhou wrote:
> >On 2017年07月26日 22:16, Thomas Gleixner wrote:
> >>On Wed, 26 Jul 2017, qiaozhou wrote:
> >>For that particular timer case we can clear base->running_timer w/o the
> >>lock held (see patch below), but this kind of
> >>
> >> lock -> test -> unlock -> retry
> >>
> >>loops are all over the place in the kernel, so this is going to hurt you
> >>sooner than later in some other place.
> >It's true. This is the way spinlock is used normally and widely in
> >kernel. I'll also ask ARM experts whether we can do something to avoid
> >or reduce the chance of such issue. ARMv8.1 has one single
> >instruction(ldadda) to replace the ldaxr/stxr loop. Hope it can
> >improve and reduce the chance.
>
> I think we should have this discussion now - I brought this up earlier [1]
> and I promised a test case that I completely forgot about - but here it
> is (attached). Essentially a Big CPU in an acquire-check-release loop
> will have an unfair advantage over a little CPU concurrently attempting
> to acquire the same lock, in spite of the ticket implementation. If the Big
> CPU needs the little CPU to make forward progress : livelock.
>
> We've run into the same loop construct in other spots in the kernel and
> the reason that a real symptom is so rare is that the retry-loop on the
> 'Big'
> CPU needs to be interrupted just once by say an IRQ/FIQ and the live-lock
> is broken. If the entire retry loop is within an interrupt-disabled critical
> section then the odds of live-locking are much higher.
>
> An example of the problem on a previous kernel is here [2]. Changes to the
> workqueue code since may have fixed this particular instance.
>
> One solution was to use udelay(1) in such loops instead of cpu_relax(), but
> that's not very 'relaxing'. I'm not sure if there's something we could do
> within the ticket spin-lock implementation to deal with this.
Does bodging cpu_relax to back-off to wfe after a while help? The event
stream will wake it up if nothing else does. Nasty patch below, but I'd be
interested to know whether or not it helps.
Will
--->8
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78f9882..1f5a29c8612e 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -149,9 +149,11 @@ extern void release_thread(struct task_struct *);
unsigned long get_wchan(struct task_struct *p);
+void __cpu_relax(unsigned long pc);
+
static inline void cpu_relax(void)
{
- asm volatile("yield" ::: "memory");
+ __cpu_relax(_THIS_IP_);
}
/* Thread switching */
diff --git a/arch/arm64/kernel/arm64ksyms.c b/arch/arm64/kernel/arm64ksyms.c
index 67368c7329c0..be8a698ea680 100644
--- a/arch/arm64/kernel/arm64ksyms.c
+++ b/arch/arm64/kernel/arm64ksyms.c
@@ -72,6 +72,8 @@ EXPORT_SYMBOL(_mcount);
NOKPROBE_SYMBOL(_mcount);
#endif
+EXPORT_SYMBOL(__cpu_relax);
+
/* arm-smccc */
EXPORT_SYMBOL(__arm_smccc_smc);
EXPORT_SYMBOL(__arm_smccc_hvc);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 659ae8094ed5..c394c3704b7f 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -403,6 +403,31 @@ unsigned long get_wchan(struct task_struct *p)
return ret;
}
+static DEFINE_PER_CPU(u64, __cpu_relax_data);
+
+#define CPU_RELAX_WFE_THRESHOLD 10000
+void __cpu_relax(unsigned long pc)
+{
+ u64 new, old = raw_cpu_read(__cpu_relax_data);
+ u32 old_pc, new_pc;
+ bool wfe = false;
+
+ old_pc = (u32)old;
+ new = new_pc = (u32)pc;
+
+ if (old_pc == new_pc) {
+ if ((old >> 32) > CPU_RELAX_WFE_THRESHOLD) {
+ asm volatile("sevl; wfe; wfe\n" ::: "memory");
+ wfe = true;
+ } else {
+ new = old + (1UL << 32);
+ }
+ }
+
+ if (this_cpu_cmpxchg(__cpu_relax_data, old, new) == old && !wfe)
+ asm volatile("yield" ::: "memory");
+}
+
unsigned long arch_align_stack(unsigned long sp)
{
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
next prev parent reply other threads:[~2017-07-28 9:28 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>
2017-07-26 14:16 ` [Question]: try to fix contention between expire_timers and try_to_del_timer_sync Thomas Gleixner
2017-07-27 1:29 ` qiaozhou
2017-07-27 15:14 ` Will Deacon
2017-07-27 15:19 ` Thomas Gleixner
2017-07-28 1:10 ` Vikram Mulukutla
2017-07-28 9:28 ` Peter Zijlstra
2017-07-28 19:11 ` Vikram Mulukutla
2017-07-28 9:28 ` Will Deacon [this message]
2017-07-28 19:09 ` Vikram Mulukutla
2017-07-31 11:20 ` qiaozhou
2017-08-01 7:37 ` qiaozhou
2017-08-03 23:32 ` Vikram Mulukutla
2017-08-04 3:15 ` qiaozhou
2017-07-31 13:13 ` Will Deacon
2017-08-03 23:25 ` Vikram Mulukutla
2017-08-15 18:40 ` Will Deacon
2017-08-25 19:48 ` Vikram Mulukutla
2017-08-25 20:25 ` Vikram Mulukutla
2017-08-28 23:12 ` Vikram Mulukutla
2017-09-06 11:19 ` qiaozhou
2017-09-25 11:02 ` qiaozhou
2017-10-02 14:14 ` Will Deacon
2017-10-11 8:33 ` qiaozhou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170728092831.GA24839@arm.com \
--to=will.deacon@arm.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel-owner@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=markivx@codeaurora.org \
--cc=peterz@infradead.org \
--cc=qiaozhou@asrmicro.com \
--cc=sboyd@codeaurora.org \
--cc=sudeep.holla@arm.com \
--cc=tglx@linutronix.de \
--cc=wilburwang@asrmicro.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).