From: qiaozhou <qiaozhou@asrmicro.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>, <sboyd@codeaurora.org>,
LKML <linux-kernel@vger.kernel.org>,
Wang Wilbur <wilburwang@asrmicro.com>,
Marc Zyngier <marc.zyngier@arm.com>,
Will Deacon <will.deacon@arm.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Thu, 27 Jul 2017 09:29:20 +0800 [thread overview]
Message-ID: <dcb18367-c747-96a8-9927-d8ba6954c496@asrmicro.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1707261548560.2186@nanos>
On 2017年07月26日 22:16, Thomas Gleixner wrote:
> On Wed, 26 Jul 2017, qiaozhou wrote:
>
> Cc'ed ARM folks.
>
>> I want to ask you for suggestions about how to fix one contention between
>> expire_timers and try_to_del_timer_sync. Thanks in advance.
>> The issue is a hard-lockup issue detected on our platform(arm64, one cluster
>> with 4 a53, and the other with 4 a73). The sequence is as below:
>> 1. core0 checks expired timers, and wakes up a process on a73, in step 1.3.
>> 2. core4 starts to run the process and try to delete the timer in sync method,
>> in step 2.1.
>> 3. Before core0 can run to 1.4 and get the lock, core4 has already run from
>> 2.1 to 2.6 in while loop. And in step 2.4, it fails since the timer is still
>> the running timer.
>>
>> core0(a53): core4(a73):
>> run_timer_softirq
>> __run_timers
>> spin_lock_irq(&base->lock)
>> expire_timers()
>> 1.1: base->running_timer = timer;
>> 1.2: spin_unlock_irq(&base->lock);
>> 1.3: call_timer_fn(timer, fn, data);
>> schedule()
>> 1.4: spin_lock_irq(&base->lock); del_timer_sync(&timer)
>> 1.5: back to 1.1 in while loop
>> 2.1: try_to_del_timer_sync()
>> 2.2: get_timer_base()
>> 2.3: spin_lock_irqsave(&base->lock, flags);
>> 2.4: check "base->running_timer != timer"
>> 2.5: spin_unlock_irqrestore(&base->lock, flags);
>> 2.6:cpu_relax(); (back to 2.1 in while loop)
> This is horribly formatted.
Sorry for the format. Updated the calling sequence on two cores.
core0(a53): core4(a73):
run_timer_softirq
__run_timers
spin_lock_irq(&base->lock)
expire_timers()
1.1: base->running_timer = timer;
1.2: spin_unlock_irq(&base->lock);
1.3: call_timer_fn(timer, fn, data);
1.4: spin_lock_irq(&base->lock);
1.5: back to 1.1 in while loop
core4(a73):
schedule()
del_timer_sync(&timer)
2.1: try_to_del_timer_sync()
2.2: get_timer_base()
2.3: spin_lock_irqsave(&base->lock, flags);
2.4: check "base->running_timer != timer"
2.5: spin_unlock_irqrestore(&base->lock, flags);
2.6:cpu_relax(); (back to 2.1 in while loop)
>
>> Core0 runs @832MHz, and core4 runs @1.8GHz. A73 is also much more powerful in
>> design than a53. So the actual running is that a73 keeps looping in spin_lock
>> -> check running_timer -> spin_unlock -> spin_lock ->...., while a53 can't
>> even succeed to complete one store exclusive instruction, before it can enter
>> WFE to wait for lock release. It just keeps looping in +30 and+3c in below
>> asm, due to stxr fails.
>>
>> So it deadloop, a53 can't jump out ldaxr/stxr loop, while a73 can't pass the
>> running_timer check. Finally the hard-lockup is triggered.(BTW, the issue
>> needs a long time to occur.)
>>
>> <_raw_spin_lock_irq+0x2c>: prfm pstl1strm, [x19]
>> /<_raw_spin_lock_irq+0x30>: ldaxr w0, [x19]//
>> //<_raw_spin_lock_irq+0x34>: add w1, w0, #0x10, lsl #12//
>> //<_raw_spin_lock_irq+0x38>: stxr w2, w1, [x19]//
>> //<_raw_spin_lock_irq+0x3c>: cbnz w2, 0xffffff80089f52e0
>> <_raw_spin_lock_irq+0x30>/
>> <_raw_spin_lock_irq+0x40>: eor w1, w0, w0, ror #16
>> <_raw_spin_lock_irq+0x44>: cbz w1, 0xffffff80089f530c
>> <_raw_spin_lock_irq+0x5c>
>> <_raw_spin_lock_irq+0x48>: sevl
>> <_raw_spin_lock_irq+0x4c>: wfe
>> <_raw_spin_lock_irq+0x50>: ldaxrh w2, [x19]
>> <_raw_spin_lock_irq+0x54>: eor w1, w2, w0, lsr #16
>> <_raw_spin_lock_irq+0x58>: cbnz w1, 0xffffff80089f52fc
>> <_raw_spin_lock_irq+0x4c>
>>
>> The loop on a53 only has 4 instructions, and loop on a73 has ~100
>> instructions. Still a53 has no chance to store exclusive successfully. It may
>> be related with ldaxr/stxr cost, core frequency, core number etc.
>>
>> I have no idea of fixing it in spinlock/unlock implement, so I try to fix it
>> in timer driver. I prepared a raw patch, not sure it's the correct direction
>> to solve this issue. Could you help to give some suggestions? Thanks.
>>
>> From fb8fbfeb8f9f92fdadd9920ce234fd433bc883e1 Mon Sep 17 00:00:00 2001
>> From: Qiao Zhou <qiaozhou@asrmicro.com>
>> Date: Wed, 26 Jul 2017 20:30:33 +0800
>> Subject: [PATCH] RFC: timers: try to fix contention between expire_timers and
>> del_timer_sync
>>
>> try to fix contention between expire_timers and del_timer_sync by
>> adding TIMER_WOKEN status, which means that the timer has already
>> woken up corresponding process.
> Timers do a lot more things than waking up a process.
>
> Just because this happens in a particular case for you where a wakeup is
> involved this does not mean, that this is generally true.
Yes, you're right. My intention is that as the timer function is
executed, maybe we can do something.
>
> And that whole flag business is completely broken. There is a comment in
> call_timer_fn() which says:
>
> /*
> * It is permissible to free the timer from inside the
> * function that is called from it ....
>
> So you _CANNOT_ touch timer after the function returned. It might be gone
> already.
You're right. Touching timer here has risk.
>
> For that particular timer case we can clear base->running_timer w/o the
> lock held (see patch below), but this kind of
>
> lock -> test -> unlock -> retry
>
> loops are all over the place in the kernel, so this is going to hurt you
> sooner than later in some other place.
It's true. This is the way spinlock is used normally and widely in
kernel. I'll also ask ARM experts whether we can do something to avoid
or reduce the chance of such issue. ARMv8.1 has one single
instruction(ldadda) to replace the ldaxr/stxr loop. Hope it can improve
and reduce the chance.
>
> Thanks,
>
> tglx
>
> 8<------------
>
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1301,10 +1301,12 @@ static void expire_timers(struct timer_b
> if (timer->flags & TIMER_IRQSAFE) {
> raw_spin_unlock(&base->lock);
> call_timer_fn(timer, fn, data);
> + base->running_timer = NULL;
> raw_spin_lock(&base->lock);
> } else {
> raw_spin_unlock_irq(&base->lock);
> call_timer_fn(timer, fn, data);
> + base->running_timer = NULL;
> raw_spin_lock_irq(&base->lock);
> }
> }
It should work for this particular issue and I'll test it. Previously I
thought it was unsafe to touch base->running_timer without holding lock.
Thanks a lot for all the suggestions.
Best Regards
Qiao
next prev parent reply other threads:[~2017-07-27 1:30 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>
2017-07-26 14:16 ` [Question]: try to fix contention between expire_timers and try_to_del_timer_sync Thomas Gleixner
2017-07-27 1:29 ` qiaozhou [this message]
2017-07-27 15:14 ` Will Deacon
2017-07-27 15:19 ` Thomas Gleixner
2017-07-28 1:10 ` Vikram Mulukutla
2017-07-28 9:28 ` Peter Zijlstra
2017-07-28 19:11 ` Vikram Mulukutla
2017-07-28 9:28 ` Will Deacon
2017-07-28 19:09 ` Vikram Mulukutla
2017-07-31 11:20 ` qiaozhou
2017-08-01 7:37 ` qiaozhou
2017-08-03 23:32 ` Vikram Mulukutla
2017-08-04 3:15 ` qiaozhou
2017-07-31 13:13 ` Will Deacon
2017-08-03 23:25 ` Vikram Mulukutla
2017-08-15 18:40 ` Will Deacon
2017-08-25 19:48 ` Vikram Mulukutla
2017-08-25 20:25 ` Vikram Mulukutla
2017-08-28 23:12 ` Vikram Mulukutla
2017-09-06 11:19 ` qiaozhou
2017-09-25 11:02 ` qiaozhou
2017-10-02 14:14 ` Will Deacon
2017-10-11 8:33 ` qiaozhou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dcb18367-c747-96a8-9927-d8ba6954c496@asrmicro.com \
--to=qiaozhou@asrmicro.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=peterz@infradead.org \
--cc=sboyd@codeaurora.org \
--cc=tglx@linutronix.de \
--cc=wilburwang@asrmicro.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).