* [PATCH] dump_stack: Avoid the livelock of the dump_lock
@ 2019-10-29 9:24 Kevin Hao
[not found] ` <CAHk-=wjU9ASiPYFqmGJtOqG-0KtuNtu-aNPPY4M1AbcPdrfz7A@mail.gmail.com>
0 siblings, 1 reply; 3+ messages in thread
From: Kevin Hao @ 2019-10-29 9:24 UTC (permalink / raw)
To: linux-kernel; +Cc: Andrew Morton, Linus Torvalds
In the current code, we uses the atomic_cmpxchg() to serialize the
output of the dump_stack(), but this implementation suffers the
thundering herd problem. We have observed such kind of livelock on a
Marvell cn96xx board(24 cpus) when heavily using the dump_stack() in
a kprobe handler. Actually we can use a spinlock here and leverage the
implementation of the spinlock(either ticket or queued spinlock) to
mediate such kind of livelock. Since the dump_stack() runs with the
irq disabled, so use the raw_spinlock_t to make it safe for rt kernel.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
---
lib/dump_stack.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 5cff72f18c4a..fa971f75f1e2 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -83,37 +83,35 @@ static void __dump_stack(void)
* Architectures can override this implementation by implementing its own.
*/
#ifdef CONFIG_SMP
-static atomic_t dump_lock = ATOMIC_INIT(-1);
+static DEFINE_RAW_SPINLOCK(dump_lock);
+static int dump_cpu = -1;
asmlinkage __visible void dump_stack(void)
{
unsigned long flags;
int was_locked;
- int old;
int cpu;
/*
* Permit this cpu to perform nested stack dumps while serialising
* against other CPUs
*/
-retry:
local_irq_save(flags);
cpu = smp_processor_id();
- old = atomic_cmpxchg(&dump_lock, -1, cpu);
- if (old == -1) {
+
+ if (READ_ONCE(dump_cpu) != cpu) {
+ raw_spin_lock(&dump_lock);
+ dump_cpu = cpu;
was_locked = 0;
- } else if (old == cpu) {
+ } else
was_locked = 1;
- } else {
- local_irq_restore(flags);
- cpu_relax();
- goto retry;
- }
__dump_stack();
- if (!was_locked)
- atomic_set(&dump_lock, -1);
+ if (!was_locked) {
+ dump_cpu = -1;
+ raw_spin_unlock(&dump_lock);
+ }
local_irq_restore(flags);
}
--
2.14.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] dump_stack: Avoid the livelock of the dump_lock
[not found] ` <CAHk-=wjU9ASiPYFqmGJtOqG-0KtuNtu-aNPPY4M1AbcPdrfz7A@mail.gmail.com>
@ 2019-10-29 14:09 ` Kevin Hao
2019-10-29 15:23 ` Linus Torvalds
0 siblings, 1 reply; 3+ messages in thread
From: Kevin Hao @ 2019-10-29 14:09 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]
On Tue, Oct 29, 2019 at 01:46:40PM +0100, Linus Torvalds wrote:
> Sorry for html crud, I'm at a conference without my laptop, so reading email on
> my phone..
>
> On Tue, Oct 29, 2019, 10:31 Kevin Hao <haokexin@gmail.com> wrote:
>
> In the current code, we uses the atomic_cmpxchg() to serialize the
> output of the dump_stack(), but this implementation suffers the
> thundering herd problem. Actually we can use a spinlock here and leverage
> the
>
> implementation of the spinlock(either ticket or queued spinlock) to
> mediate such kind of livelock. Since the dump_stack() runs with the
> irq disabled, so use the raw_spinlock_t to make it safe for rt kernel.
>
>
> The problem with this is that I think it will deadlock easily with NMI's, won't
> it?
>
> There is a window between getting the spin lock and setting 'dump_cpu' where an
> incoming NMI would then deadlock because it also tries to get the lock, because
> it hasn't seen the fact that this cpu already got it.
Indeed, sorry I missed that.
>
> The cmpxchg is supposed to protect against that, and make the locking be atomic
> with the CPU setting.
>
> Can you try instead to make the retry loop not jump right back to the cmpxchg,
> but start looping just reading the CPU value, and only jumping back when it
> becomes -1?
Do you mean something like this?
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 5cff72f18c4a..a0f1293a3638 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -107,6 +107,7 @@ asmlinkage __visible void dump_stack(void)
} else {
local_irq_restore(flags);
cpu_relax();
+ while (atomic_read(&dump_lock) != -1);
goto retry;
}
It does help. I will send a v2 if it pass more stress test.
Thanks,
Kevin
>
> Linus
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] dump_stack: Avoid the livelock of the dump_lock
2019-10-29 14:09 ` Kevin Hao
@ 2019-10-29 15:23 ` Linus Torvalds
0 siblings, 0 replies; 3+ messages in thread
From: Linus Torvalds @ 2019-10-29 15:23 UTC (permalink / raw)
To: Kevin Hao; +Cc: Linux Kernel Mailing List, Andrew Morton
On Tue, Oct 29, 2019 at 3:09 PM Kevin Hao <haokexin@gmail.com> wrote:
>
> Do you mean something like this?
Yeah, except I'd make it be something like
local_irq_restore(flags);
- cpu_relax();
+ do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
goto retry;
instead, ie doing the cpu_relax() inside that loop.
Obviously untested.
Linus
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-10-29 15:23 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-29 9:24 [PATCH] dump_stack: Avoid the livelock of the dump_lock Kevin Hao
[not found] ` <CAHk-=wjU9ASiPYFqmGJtOqG-0KtuNtu-aNPPY4M1AbcPdrfz7A@mail.gmail.com>
2019-10-29 14:09 ` Kevin Hao
2019-10-29 15:23 ` Linus Torvalds
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).