From: Thomas Gleixner <tglx@linutronix.de>
To: qiaozhou <qiaozhou@asrmicro.com>
Cc: John Stultz <john.stultz@linaro.org>,
sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
Wang Wilbur <wilburwang@asrmicro.com>,
Marc Zyngier <marc.zyngier@arm.com>,
Will Deacon <will.deacon@arm.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Wed, 26 Jul 2017 16:16:37 +0200 (CEST) [thread overview]
Message-ID: <alpine.DEB.2.20.1707261548560.2186@nanos> (raw)
In-Reply-To: <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>
On Wed, 26 Jul 2017, qiaozhou wrote:
Cc'ed ARM folks.
> I want to ask you for suggestions about how to fix one contention between
> expire_timers and try_to_del_timer_sync. Thanks in advance.
> The issue is a hard-lockup issue detected on our platform(arm64, one cluster
> with 4 a53, and the other with 4 a73). The sequence is as below:
> 1. core0 checks expired timers, and wakes up a process on a73, in step 1.3.
> 2. core4 starts to run the process and try to delete the timer in sync method,
> in step 2.1.
> 3. Before core0 can run to 1.4 and get the lock, core4 has already run from
> 2.1 to 2.6 in while loop. And in step 2.4, it fails since the timer is still
> the running timer.
>
> core0(a53): core4(a73):
> run_timer_softirq
> __run_timers
> spin_lock_irq(&base->lock)
> expire_timers()
> 1.1: base->running_timer = timer;
> 1.2: spin_unlock_irq(&base->lock);
> 1.3: call_timer_fn(timer, fn, data);
> schedule()
> 1.4: spin_lock_irq(&base->lock); del_timer_sync(&timer)
> 1.5: back to 1.1 in while loop
> 2.1: try_to_del_timer_sync()
> 2.2: get_timer_base()
> 2.3: spin_lock_irqsave(&base->lock, flags);
> 2.4: check "base->running_timer != timer"
> 2.5: spin_unlock_irqrestore(&base->lock, flags);
> 2.6:cpu_relax(); (back to 2.1 in while loop)
This is horribly formatted.
> Core0 runs @832MHz, and core4 runs @1.8GHz. A73 is also much more powerful in
> design than a53. So the actual running is that a73 keeps looping in spin_lock
> -> check running_timer -> spin_unlock -> spin_lock ->...., while a53 can't
> even succeed to complete one store exclusive instruction, before it can enter
> WFE to wait for lock release. It just keeps looping in +30 and+3c in below
> asm, due to stxr fails.
>
> So it deadloop, a53 can't jump out ldaxr/stxr loop, while a73 can't pass the
> running_timer check. Finally the hard-lockup is triggered.(BTW, the issue
> needs a long time to occur.)
>
> <_raw_spin_lock_irq+0x2c>: prfm pstl1strm, [x19]
> /<_raw_spin_lock_irq+0x30>: ldaxr w0, [x19]//
> //<_raw_spin_lock_irq+0x34>: add w1, w0, #0x10, lsl #12//
> //<_raw_spin_lock_irq+0x38>: stxr w2, w1, [x19]//
> //<_raw_spin_lock_irq+0x3c>: cbnz w2, 0xffffff80089f52e0
> <_raw_spin_lock_irq+0x30>/
> <_raw_spin_lock_irq+0x40>: eor w1, w0, w0, ror #16
> <_raw_spin_lock_irq+0x44>: cbz w1, 0xffffff80089f530c
> <_raw_spin_lock_irq+0x5c>
> <_raw_spin_lock_irq+0x48>: sevl
> <_raw_spin_lock_irq+0x4c>: wfe
> <_raw_spin_lock_irq+0x50>: ldaxrh w2, [x19]
> <_raw_spin_lock_irq+0x54>: eor w1, w2, w0, lsr #16
> <_raw_spin_lock_irq+0x58>: cbnz w1, 0xffffff80089f52fc
> <_raw_spin_lock_irq+0x4c>
>
> The loop on a53 only has 4 instructions, and loop on a73 has ~100
> instructions. Still a53 has no chance to store exclusive successfully. It may
> be related with ldaxr/stxr cost, core frequency, core number etc.
>
> I have no idea of fixing it in spinlock/unlock implement, so I try to fix it
> in timer driver. I prepared a raw patch, not sure it's the correct direction
> to solve this issue. Could you help to give some suggestions? Thanks.
>
> From fb8fbfeb8f9f92fdadd9920ce234fd433bc883e1 Mon Sep 17 00:00:00 2001
> From: Qiao Zhou <qiaozhou@asrmicro.com>
> Date: Wed, 26 Jul 2017 20:30:33 +0800
> Subject: [PATCH] RFC: timers: try to fix contention between expire_timers and
> del_timer_sync
>
> try to fix contention between expire_timers and del_timer_sync by
> adding TIMER_WOKEN status, which means that the timer has already
> woken up corresponding process.
Timers do a lot more things than waking up a process.
Just because this happens in a particular case for you where a wakeup is
involved this does not mean, that this is generally true.
And that whole flag business is completely broken. There is a comment in
call_timer_fn() which says:
/*
* It is permissible to free the timer from inside the
* function that is called from it ....
So you _CANNOT_ touch timer after the function returned. It might be gone
already.
For that particular timer case we can clear base->running_timer w/o the
lock held (see patch below), but this kind of
lock -> test -> unlock -> retry
loops are all over the place in the kernel, so this is going to hurt you
sooner than later in some other place.
Thanks,
tglx
8<------------
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1301,10 +1301,12 @@ static void expire_timers(struct timer_b
if (timer->flags & TIMER_IRQSAFE) {
raw_spin_unlock(&base->lock);
call_timer_fn(timer, fn, data);
+ base->running_timer = NULL;
raw_spin_lock(&base->lock);
} else {
raw_spin_unlock_irq(&base->lock);
call_timer_fn(timer, fn, data);
+ base->running_timer = NULL;
raw_spin_lock_irq(&base->lock);
}
}
next parent reply other threads:[~2017-07-26 14:16 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>
2017-07-26 14:16 ` Thomas Gleixner [this message]
2017-07-27 1:29 ` [Question]: try to fix contention between expire_timers and try_to_del_timer_sync qiaozhou
2017-07-27 15:14 ` Will Deacon
2017-07-27 15:19 ` Thomas Gleixner
2017-07-28 1:10 ` Vikram Mulukutla
2017-07-28 9:28 ` Peter Zijlstra
2017-07-28 19:11 ` Vikram Mulukutla
2017-07-28 9:28 ` Will Deacon
2017-07-28 19:09 ` Vikram Mulukutla
2017-07-31 11:20 ` qiaozhou
2017-08-01 7:37 ` qiaozhou
2017-08-03 23:32 ` Vikram Mulukutla
2017-08-04 3:15 ` qiaozhou
2017-07-31 13:13 ` Will Deacon
2017-08-03 23:25 ` Vikram Mulukutla
2017-08-15 18:40 ` Will Deacon
2017-08-25 19:48 ` Vikram Mulukutla
2017-08-25 20:25 ` Vikram Mulukutla
2017-08-28 23:12 ` Vikram Mulukutla
2017-09-06 11:19 ` qiaozhou
2017-09-25 11:02 ` qiaozhou
2017-10-02 14:14 ` Will Deacon
2017-10-11 8:33 ` qiaozhou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.20.1707261548560.2186@nanos \
--to=tglx@linutronix.de \
--cc=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=peterz@infradead.org \
--cc=qiaozhou@asrmicro.com \
--cc=sboyd@codeaurora.org \
--cc=wilburwang@asrmicro.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).