From: Thomas Gleixner <tglx@linutronix.de>
To: qiaozhou <qiaozhou@asrmicro.com>
Cc: John Stultz <john.stultz@linaro.org>,
	sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
	Wang Wilbur <wilburwang@asrmicro.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Wed, 26 Jul 2017 16:16:37 +0200 (CEST)
Message-ID: <alpine.DEB.2.20.1707261548560.2186@nanos>
In-Reply-To: <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>

On Wed, 26 Jul 2017, qiaozhou wrote:

Cc'ed ARM folks.

> I want to ask you for suggestions on how to fix a contention between
> expire_timers and try_to_del_timer_sync. Thanks in advance.
> The issue is a hard lockup detected on our platform (arm64, one cluster
> with 4 A53 cores and the other with 4 A73 cores). The sequence is as below:
> 1. core0 expires timers and, in step 1.3, wakes up a process on the A73 cluster.
> 2. core4 starts to run that process and tries to delete the timer
> synchronously, in step 2.1.
> 3. Before core0 can reach 1.4 and take the lock, core4 has already run from
> 2.1 to 2.6 in its while loop. In step 2.4 it fails, since the timer is still
> the running timer.
> 
> core0 (a53):                              core4 (a73):
> 
> run_timer_softirq()
>   __run_timers()
>     spin_lock_irq(&base->lock);
>     expire_timers()
>       1.1: base->running_timer = timer;
>       1.2: spin_unlock_irq(&base->lock);
>       1.3: call_timer_fn(timer, fn, data);
>                                           schedule()
>                                           del_timer_sync(&timer)
>       1.4: spin_lock_irq(&base->lock);      2.1: try_to_del_timer_sync()
>       1.5: back to 1.1 in while loop        2.2: get_timer_base()
>                                             2.3: spin_lock_irqsave(&base->lock, flags);
>                                             2.4: check "base->running_timer != timer"
>                                             2.5: spin_unlock_irqrestore(&base->lock, flags);
>                                             2.6: cpu_relax(); (back to 2.1 in while loop)

This is horribly formatted.
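
[Editor's note: steps 2.1-2.6 above are the del_timer_sync() retry loop. A
condensed sketch of that loop, paraphrased from kernel/time/timer.c of this
era; it is not the verbatim source, lockdep annotations and related checks
are omitted:]

int try_to_del_timer_sync(struct timer_list *timer)
{
	struct timer_base *base;
	unsigned long flags;
	int ret = -1;

	base = lock_timer_base(timer, &flags);		/* 2.2 + 2.3 */

	if (base->running_timer != timer)		/* 2.4 */
		ret = detach_if_pending(timer, base, true);

	raw_spin_unlock_irqrestore(&base->lock, flags);	/* 2.5 */
	return ret;
}

int del_timer_sync(struct timer_list *timer)
{
	for (;;) {
		int ret = try_to_del_timer_sync(timer);	/* 2.1 */

		if (ret >= 0)
			return ret;
		cpu_relax();				/* 2.6 */
	}
}

As long as the callback is still running on core0, base->running_timer ==
timer, so try_to_del_timer_sync() keeps returning -1 and core4 keeps
re-taking base->lock.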

> Core0 runs at 832MHz and core4 at 1.8GHz, and the A73 is also a much more
> powerful design than the A53. So what actually happens is that the A73 keeps
> looping in spin_lock -> check running_timer -> spin_unlock -> spin_lock ->
> ..., while the A53 cannot even complete a single store-exclusive, which it
> would need to do before it could enter WFE to wait for the lock release. It
> just keeps looping between +0x30 and +0x3c in the asm below, because stxr
> keeps failing.
> 
> So it ends up in an endless loop: the A53 can't get out of the ldaxr/stxr
> loop, while the A73 can't pass the running_timer check. Finally the hard
> lockup is triggered. (BTW, the issue takes a long time to reproduce.)
> 
> <_raw_spin_lock_irq+0x2c>:   prfm    pstl1strm, [x19]
> <_raw_spin_lock_irq+0x30>:   ldaxr   w0, [x19]
> <_raw_spin_lock_irq+0x34>:   add     w1, w0, #0x10, lsl #12
> <_raw_spin_lock_irq+0x38>:   stxr    w2, w1, [x19]
> <_raw_spin_lock_irq+0x3c>:   cbnz    w2, 0xffffff80089f52e0 <_raw_spin_lock_irq+0x30>
> <_raw_spin_lock_irq+0x40>:   eor     w1, w0, w0, ror #16
> <_raw_spin_lock_irq+0x44>:   cbz     w1, 0xffffff80089f530c <_raw_spin_lock_irq+0x5c>
> <_raw_spin_lock_irq+0x48>:   sevl
> <_raw_spin_lock_irq+0x4c>:   wfe
> <_raw_spin_lock_irq+0x50>:   ldaxrh  w2, [x19]
> <_raw_spin_lock_irq+0x54>:   eor     w1, w2, w0, lsr #16
> <_raw_spin_lock_irq+0x58>:   cbnz    w1, 0xffffff80089f52fc <_raw_spin_lock_irq+0x4c>
> 
> The loop on the A53 is only 4 instructions, while the loop on the A73 is
> ~100 instructions, yet the A53 still never manages to complete its
> store-exclusive. It may be related to the ldaxr/stxr cost, core frequency,
> core count, etc.
> 
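
[Editor's note: for context on why the A53 starves at the stxr, the asm
above is the arm64 ticket-lock acquire path. A rough C sketch of its
structure; the names are illustrative, this is not the kernel's
arch_spin_lock() source:]

/*
 * Illustrative sketch of an arm64-style ticket-lock acquire, matching the
 * structure of the disassembly above.
 */
typedef struct {
	/*
	 * One 32-bit word: low 16 bits = owner (ticket currently being
	 * served), high 16 bits = next (next ticket to hand out).
	 */
	unsigned int val;
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *lock)
{
	/*
	 * Take a ticket by atomically adding 1 << 16 to the lock word
	 * (the "add w1, w0, #0x10, lsl #12" above). Without LSE atomics
	 * this compiles to an ldaxr/stxr retry loop - the +0x30..+0x3c
	 * loop that the A53 never escapes, because its store-exclusive
	 * keeps failing while the A73 keeps rewriting the lock word.
	 */
	unsigned int old = __atomic_fetch_add(&lock->val, 1u << 16,
					      __ATOMIC_ACQUIRE);
	unsigned short my_ticket = (unsigned short)(old >> 16);

	/*
	 * Only after a ticket has been taken does the CPU wait (the
	 * sevl/wfe path at +0x48..+0x58) for owner == my_ticket. The
	 * ticket lock is only fair once you hold a ticket; a CPU that
	 * cannot complete the LL/SC above gets no fairness at all.
	 */
	while ((unsigned short)__atomic_load_n(&lock->val, __ATOMIC_ACQUIRE)
	       != my_ticket)
		; /* cpu_relax() / wfe in the real implementation */
}
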
> I have no idea how to fix this in the spinlock/unlock implementation, so I
> tried to fix it in the timer code instead. I prepared a rough patch, but I'm
> not sure it's the right direction for solving this issue. Could you give
> some suggestions? Thanks.
> 
> From fb8fbfeb8f9f92fdadd9920ce234fd433bc883e1 Mon Sep 17 00:00:00 2001
> From: Qiao Zhou <qiaozhou@asrmicro.com>
> Date: Wed, 26 Jul 2017 20:30:33 +0800
> Subject: [PATCH] RFC: timers: try to fix contention between expire_timers and
>  del_timer_sync
> 
> Try to fix the contention between expire_timers and del_timer_sync by
> adding a TIMER_WOKEN status, which means that the timer has already
> woken up the corresponding process.

Timers do a lot more things than waking up a process.

Just because this happens in a particular case for you where a wakeup is
involved does not mean that this is generally true.

And that whole flag business is completely broken. There is a comment in
call_timer_fn() which says:

        /*
         * It is permissible to free the timer from inside the
         * function that is called from it ....

So you _CANNOT_ touch timer after the function returned. It might be gone
already.
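
[Editor's note: to make that concrete, here is a hypothetical example with
illustrative names, using the old-style "unsigned long data" callback that
call_timer_fn() still passes in this kernel version, of a timer that is
legitimately freed from inside its own callback:]

#include <linux/slab.h>		/* kfree() */
#include <linux/timer.h>	/* struct timer_list */

/* Hypothetical example; 'foo' and foo_timeout() are not kernel names. */
struct foo {
	struct timer_list timer;
	/* ... other state ... */
};

static void foo_timeout(unsigned long data)
{
	struct foo *f = (struct foo *)data;

	/* ... handle the timeout ... */

	kfree(f);	/* allowed: frees the object containing the timer */
}

Once call_timer_fn() returns in expire_timers(), 'timer' may therefore point
into freed memory, so setting a TIMER_WOKEN flag in the timer at that point
is a potential use-after-free; base->running_timer, by contrast, lives in
the per-CPU timer base and remains valid.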

For that particular timer case we can clear base->running_timer w/o the
lock held (see patch below), but this kind of

     lock -> test -> unlock -> retry

loop is all over the place in the kernel, so this is going to hurt you
sooner rather than later in some other place.

Thanks,

	tglx

8<------------

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1301,10 +1301,12 @@ static void expire_timers(struct timer_b
 		if (timer->flags & TIMER_IRQSAFE) {
 			raw_spin_unlock(&base->lock);
 			call_timer_fn(timer, fn, data);
+			base->running_timer = NULL;
 			raw_spin_lock(&base->lock);
 		} else {
 			raw_spin_unlock_irq(&base->lock);
 			call_timer_fn(timer, fn, data);
+			base->running_timer = NULL;
 			raw_spin_lock_irq(&base->lock);
 		}
 	}

