linux-kernel.vger.kernel.org archive mirror
* [PATCH] dump_stack: Avoid the livelock of the dump_lock
@ 2019-10-29  9:24 Kevin Hao
       [not found] ` <CAHk-=wjU9ASiPYFqmGJtOqG-0KtuNtu-aNPPY4M1AbcPdrfz7A@mail.gmail.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Kevin Hao @ 2019-10-29  9:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Linus Torvalds

In the current code, we use atomic_cmpxchg() to serialize the
output of dump_stack(), but this implementation suffers from the
thundering herd problem. We have observed such a livelock on a
Marvell cn96xx board (24 CPUs) when heavily using dump_stack() in
a kprobe handler. We can instead use a spinlock here and leverage
the spinlock implementation (either ticket or queued spinlock) to
avoid this kind of livelock. Since dump_stack() runs with irqs
disabled, use raw_spinlock_t to keep it safe for the rt kernel.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
---
 lib/dump_stack.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 5cff72f18c4a..fa971f75f1e2 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -83,37 +83,35 @@ static void __dump_stack(void)
  * Architectures can override this implementation by implementing its own.
  */
 #ifdef CONFIG_SMP
-static atomic_t dump_lock = ATOMIC_INIT(-1);
+static DEFINE_RAW_SPINLOCK(dump_lock);
+static int dump_cpu = -1;
 
 asmlinkage __visible void dump_stack(void)
 {
 	unsigned long flags;
 	int was_locked;
-	int old;
 	int cpu;
 
 	/*
 	 * Permit this cpu to perform nested stack dumps while serialising
 	 * against other CPUs
 	 */
-retry:
 	local_irq_save(flags);
 	cpu = smp_processor_id();
-	old = atomic_cmpxchg(&dump_lock, -1, cpu);
-	if (old == -1) {
+
+	if (READ_ONCE(dump_cpu) != cpu) {
+		raw_spin_lock(&dump_lock);
+		dump_cpu = cpu;
 		was_locked = 0;
-	} else if (old == cpu) {
+	} else
 		was_locked = 1;
-	} else {
-		local_irq_restore(flags);
-		cpu_relax();
-		goto retry;
-	}
 
 	__dump_stack();
 
-	if (!was_locked)
-		atomic_set(&dump_lock, -1);
+	if (!was_locked) {
+		dump_cpu = -1;
+		raw_spin_unlock(&dump_lock);
+	}
 
 	local_irq_restore(flags);
 }
-- 
2.14.4



* Re: [PATCH] dump_stack: Avoid the livelock of the dump_lock
       [not found] ` <CAHk-=wjU9ASiPYFqmGJtOqG-0KtuNtu-aNPPY4M1AbcPdrfz7A@mail.gmail.com>
@ 2019-10-29 14:09   ` Kevin Hao
  2019-10-29 15:23     ` Linus Torvalds
  0 siblings, 1 reply; 3+ messages in thread
From: Kevin Hao @ 2019-10-29 14:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Andrew Morton


On Tue, Oct 29, 2019 at 01:46:40PM +0100, Linus Torvalds wrote:
> Sorry for html crud, I'm at a conference without my laptop, so reading email on
> my phone..
> 
> On Tue, Oct 29, 2019, 10:31 Kevin Hao <haokexin@gmail.com> wrote:
> 
>     In the current code, we use atomic_cmpxchg() to serialize the
>     output of dump_stack(), but this implementation suffers from the
>     thundering herd problem. We can instead use a spinlock here and
>     leverage the spinlock implementation (either ticket or queued
>     spinlock) to avoid this kind of livelock. Since dump_stack() runs
>     with irqs disabled, use raw_spinlock_t to keep it safe for the rt
>     kernel.
> 
> 
> The problem with this is that I think it will deadlock easily with NMI's, won't
> it?
> 
> There is a window between getting the spin lock and setting 'dump_cpu' where an
> incoming NMI would then deadlock because it also tries to get the lock, because
> it hasn't seen the fact that this cpu already got it.

Indeed, sorry I missed that.
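The window can be drawn as an interleaving (illustrative sketch reconstructed from the discussion, not part of the original thread):

```
  task on cpu N                          NMI on the same cpu N
  -------------                          ---------------------
  READ_ONCE(dump_cpu) != cpu  -> true
  raw_spin_lock(&dump_lock)   /* acquired */
                                         READ_ONCE(dump_cpu) != cpu -> true
                                         (dump_cpu not yet updated)
                                         raw_spin_lock(&dump_lock)
                                         -> spins forever: deadlock
  dump_cpu = cpu   /* never reached */
```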

> 
> The cmpxchg is supposed to protect against that, and make the locking be atomic
> with the CPU setting.
> 
> Can you try instead to make the retry loop not jump right back to the cmpxchg,
> but start looping just reading the CPU value, and only jumping back when it
> becomes -1?

Do you mean something like this?
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 5cff72f18c4a..a0f1293a3638 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -107,6 +107,7 @@ asmlinkage __visible void dump_stack(void)
 	} else {
 		local_irq_restore(flags);
 		cpu_relax();
+		while (atomic_read(&dump_lock) != -1);
 		goto retry;
 	}
 
It does help. I will send a v2 if it passes more stress tests.

Thanks,
Kevin

> 
>       Linus



* Re: [PATCH] dump_stack: Avoid the livelock of the dump_lock
  2019-10-29 14:09   ` Kevin Hao
@ 2019-10-29 15:23     ` Linus Torvalds
  0 siblings, 0 replies; 3+ messages in thread
From: Linus Torvalds @ 2019-10-29 15:23 UTC (permalink / raw)
  To: Kevin Hao; +Cc: Linux Kernel Mailing List, Andrew Morton

On Tue, Oct 29, 2019 at 3:09 PM Kevin Hao <haokexin@gmail.com> wrote:
>
> Do you mean something like this?

Yeah, except I'd make it be something like

                local_irq_restore(flags);
-               cpu_relax();
+               do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
                goto retry;

instead, ie doing the cpu_relax() inside that loop.

Obviously untested.

          Linus

