linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] kdb: Fix locking in vkdb_printf()
@ 2016-11-29  9:37 Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 1/3] kdb: Remove unused kdb_event handling Petr Mladek
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Petr Mladek @ 2016-11-29  9:37 UTC (permalink / raw)
  To: Jason Wessel
  Cc: Daniel Thompson, Peter Zijlstra, Andrew Morton,
	Sergey Senozhatsky, linux-kernel, Petr Mladek

This patchset fixes some races that I noticed when looked
over the kdbd_printf() code. v2 is based on the feedback
from Daniel Thompson and Sergey Senozhatsky.

Changes agaist v1:

  + Completely remove kdb_event (Daniel)
  + Avoid using smp_processor_id() (Sergey)
  + Better describe the whole situation in 3rd patch (Daniel)

Petr Mladek (3):
  kdb: Remove unused kdb_event handling
  kdb: Properly synchronize vkdb_printf() calls with other CPUs
  kdb: Call vkdb_printf() from vprintk_default() only when wanted

 include/linux/kdb.h            |  2 +-
 kernel/debug/kdb/kdb_io.c      | 37 +++++++++++++------------------------
 kernel/debug/kdb/kdb_main.c    |  1 -
 kernel/debug/kdb/kdb_private.h |  1 -
 kernel/printk/printk.c         |  3 ++-
 5 files changed, 16 insertions(+), 28 deletions(-)

-- 
1.8.5.6

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 1/3] kdb: Remove unused kdb_event handling
  2016-11-29  9:37 [PATCH v2 0/3] kdb: Fix locking in vkdb_printf() Petr Mladek
@ 2016-11-29  9:37 ` Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 2/3] kdb: Properly synchronize vkdb_printf() calls with other CPUs Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 3/3] kdb: Call vkdb_printf() from vprintk_default() only when wanted Petr Mladek
  2 siblings, 0 replies; 4+ messages in thread
From: Petr Mladek @ 2016-11-29  9:37 UTC (permalink / raw)
  To: Jason Wessel
  Cc: Daniel Thompson, Peter Zijlstra, Andrew Morton,
	Sergey Senozhatsky, linux-kernel, Petr Mladek

kdb_event state variable is only set but never checked in
the kernel code.

http://www.spinics.net/lists/kdb/msg01733.html suggests that
this variable affected WARN_CONSOLE_UNLOCKED() in the original
implementation. But this check never went upstream.

The semantic is unclear and racy. The value is updated after
the kdb_printf_lock is acquired and after it is released.
It should be symmetric at minimum. The value should be manipulated
either inside or outside the locked area.

Fortunately, it seems that the original function is gone and
we could simply remove the state variable.

Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kdb.h         | 1 -
 kernel/debug/kdb/kdb_io.c   | 2 --
 kernel/debug/kdb/kdb_main.c | 1 -
 3 files changed, 4 deletions(-)

diff --git a/include/linux/kdb.h b/include/linux/kdb.h
index 410decacff8f..eb706188dc23 100644
--- a/include/linux/kdb.h
+++ b/include/linux/kdb.h
@@ -77,7 +77,6 @@
  * number whenever the kernel debugger is entered.
  */
 extern int kdb_initial_cpu;
-extern atomic_t kdb_event;
 
 /* Types and messages used for dynamically added kdb shell commands */
 
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index fc1ef736253c..3faadc988ec0 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -576,7 +576,6 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 		KDB_STATE_SET(PRINTF_LOCK);
 		spin_lock_irqsave(&kdb_printf_lock, flags);
 		got_printf_lock = 1;
-		atomic_inc(&kdb_event);
 	} else {
 		__acquire(kdb_printf_lock);
 	}
@@ -851,7 +850,6 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 		got_printf_lock = 0;
 		spin_unlock_irqrestore(&kdb_printf_lock, flags);
 		KDB_STATE_CLEAR(PRINTF_LOCK);
-		atomic_dec(&kdb_event);
 	} else {
 		__release(kdb_printf_lock);
 	}
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 2a20c0dfdafc..ca183919d302 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -60,7 +60,6 @@
  * Kernel debugger state flags
  */
 int kdb_flags;
-atomic_t kdb_event;
 
 /*
  * kdb_lock protects updates to kdb_initial_cpu.  Used to
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 2/3] kdb: Properly synchronize vkdb_printf() calls with other CPUs
  2016-11-29  9:37 [PATCH v2 0/3] kdb: Fix locking in vkdb_printf() Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 1/3] kdb: Remove unused kdb_event handling Petr Mladek
@ 2016-11-29  9:37 ` Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 3/3] kdb: Call vkdb_printf() from vprintk_default() only when wanted Petr Mladek
  2 siblings, 0 replies; 4+ messages in thread
From: Petr Mladek @ 2016-11-29  9:37 UTC (permalink / raw)
  To: Jason Wessel
  Cc: Daniel Thompson, Peter Zijlstra, Andrew Morton,
	Sergey Senozhatsky, linux-kernel, Petr Mladek

kdb_printf_lock does not prevent other CPUs from entering the critical
section because it is ignored when KDB_STATE_PRINTF_LOCK is set.

The problematic situation might look like:

CPU0					CPU1

vkdb_printf()
  if (!KDB_STATE(PRINTF_LOCK))
    KDB_STATE_SET(PRINTF_LOCK);
    spin_lock_irqsave(&kdb_printf_lock, flags);

					vkdb_printf()
					  if (!KDB_STATE(PRINTF_LOCK))

BANG: The PRINTF_LOCK state is set and CPU1 is entering the critical
section without spinning on the lock.

The problem is that the code tries to implement locking using two state
variables that are not handled atomically. Well, we need a custom
locking because we want to allow reentering the critical section on
the very same CPU.

Let's use solution from Petr Zijlstra that was proposed for a similar
scenario, see
https://lkml.kernel.org/r/20161018171513.734367391@infradead.org

This patch uses the same trick with cmpxchg(). The only difference
is that we want to handle only recursion from the same context and
therefore we disable interrupts.

In addition, KDB_STATE_PRINTF_LOCK is removed. In fact, we are not
able to set it a non-racy way.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 kernel/debug/kdb/kdb_io.c      | 30 +++++++++++++-----------------
 kernel/debug/kdb/kdb_private.h |  1 -
 2 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 3faadc988ec0..cf495c7a8519 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -555,16 +555,16 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	int colcount;
 	int logging, saved_loglevel = 0;
 	int saved_trap_printk;
-	int got_printf_lock = 0;
 	int retlen = 0;
 	int fnd, len;
+	int this_cpu, old_cpu;
+	static int kdb_printf_cpu = -1;
 	char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
 	char *moreprompt = "more> ";
 	struct console *c = console_drivers;
-	static DEFINE_SPINLOCK(kdb_printf_lock);
 	unsigned long uninitialized_var(flags);
 
-	preempt_disable();
+	local_irq_save(flags);
 	saved_trap_printk = kdb_trap_printk;
 	kdb_trap_printk = 0;
 
@@ -572,12 +572,13 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	 * But if any cpu goes recursive in kdb, just print the output,
 	 * even if it is interleaved with any other text.
 	 */
-	if (!KDB_STATE(PRINTF_LOCK)) {
-		KDB_STATE_SET(PRINTF_LOCK);
-		spin_lock_irqsave(&kdb_printf_lock, flags);
-		got_printf_lock = 1;
-	} else {
-		__acquire(kdb_printf_lock);
+	this_cpu = smp_processor_id();
+	for (;;) {
+		old_cpu = cmpxchg(&kdb_printf_cpu, -1, this_cpu);
+		if (old_cpu == -1 || old_cpu == this_cpu)
+			break;
+
+		cpu_relax();
 	}
 
 	diag = kdbgetintenv("LINES", &linecount);
@@ -846,15 +847,10 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	suspend_grep = 0; /* end of what may have been a recursive call */
 	if (logging)
 		console_loglevel = saved_loglevel;
-	if (KDB_STATE(PRINTF_LOCK) && got_printf_lock) {
-		got_printf_lock = 0;
-		spin_unlock_irqrestore(&kdb_printf_lock, flags);
-		KDB_STATE_CLEAR(PRINTF_LOCK);
-	} else {
-		__release(kdb_printf_lock);
-	}
+	/* kdb_printf_cpu locked the code above. */
+	smp_store_release(&kdb_printf_cpu, old_cpu);
 	kdb_trap_printk = saved_trap_printk;
-	preempt_enable();
+	local_irq_restore(flags);
 	return retlen;
 }
 
diff --git a/kernel/debug/kdb/kdb_private.h b/kernel/debug/kdb/kdb_private.h
index 75014d7f4568..fc224fbcf954 100644
--- a/kernel/debug/kdb/kdb_private.h
+++ b/kernel/debug/kdb/kdb_private.h
@@ -132,7 +132,6 @@ extern int kdbgetaddrarg(int, const char **, int*, unsigned long *,
 #define KDB_STATE_PAGER		0x00000400	/* pager is available */
 #define KDB_STATE_GO_SWITCH	0x00000800	/* go is switching
 						 * back to initial cpu */
-#define KDB_STATE_PRINTF_LOCK	0x00001000	/* Holds kdb_printf lock */
 #define KDB_STATE_WAIT_IPI	0x00002000	/* Waiting for kdb_ipi() NMI */
 #define KDB_STATE_RECURSE	0x00004000	/* Recursive entry to kdb */
 #define KDB_STATE_IP_ADJUSTED	0x00008000	/* Restart IP has been
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 3/3] kdb: Call vkdb_printf() from vprintk_default() only when wanted
  2016-11-29  9:37 [PATCH v2 0/3] kdb: Fix locking in vkdb_printf() Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 1/3] kdb: Remove unused kdb_event handling Petr Mladek
  2016-11-29  9:37 ` [PATCH v2 2/3] kdb: Properly synchronize vkdb_printf() calls with other CPUs Petr Mladek
@ 2016-11-29  9:37 ` Petr Mladek
  2 siblings, 0 replies; 4+ messages in thread
From: Petr Mladek @ 2016-11-29  9:37 UTC (permalink / raw)
  To: Jason Wessel
  Cc: Daniel Thompson, Peter Zijlstra, Andrew Morton,
	Sergey Senozhatsky, linux-kernel, Petr Mladek

kdb_trap_printk allows to pass normal printk() messages to kdb via
vkdb_printk(). For example, it is used to get backtrace using
the classic show_stack(), see kdb_show_stack().

vkdb_printf() tries to avoid a potential infinite loop by disabling
the trap. But this approach is racy, for example:

CPU1					CPU2

vkdb_printf()
  // assume that kdb_trap_printk == 0
  saved_trap_printk = kdb_trap_printk;
  kdb_trap_printk = 0;

					kdb_show_stack()
					  kdb_trap_printk++;

Problem1: Now, a nested printk() on CPU0 calls vkdb_printf()
	  even when it should have been disabled. It will not
	  cause a deadlock but...

   // using the outdated saved value: 0
   kdb_trap_printk = saved_trap_printk;

					  kdb_trap_printk--;

Problem2: Now, kdb_trap_printk == -1 and will stay like this.
   It means that all messages will get passed to kdb from
   now on.

This patch removes the racy saved_trap_printk handling. Instead,
the recursion is prevented by a check for the locked CPU.

The solution is still kind of racy. A non-related printk(), from
another process, might get trapped by vkdb_printf(). And the wanted
printk() might not get trapped because kdb_printf_cpu is assigned.
But this problem existed even with the original code.

A proper solution would be to get_cpu() before setting kdb_trap_printk
and trap messages only from this CPU. I am not sure if it is worth
the effort, though.

In fact, the race is very theoretical. When kdb is running any of
the commands that use kdb_trap_printk there is a single active CPU
and the other CPUs should be in a holding pen inside kgdb_cpu_enter().

The only time this is violated is when there is a timeout waiting
for the other CPUs to report to the holding pen.

Finally, note that the situation is a bit schizophrenic. vkdb_printf()
explicitly allows recursion but only from KDB code that calls
kdb_printf() directly. On the other hand, the generic printk()
recursion is not allowed because it might cause an infinite loop.
This is why we could not hide the decision inside vkdb_printf()
easily.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kdb.h       | 1 +
 kernel/debug/kdb/kdb_io.c | 9 ++-------
 kernel/printk/printk.c    | 3 ++-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/include/linux/kdb.h b/include/linux/kdb.h
index eb706188dc23..68bd88223417 100644
--- a/include/linux/kdb.h
+++ b/include/linux/kdb.h
@@ -161,6 +161,7 @@ enum kdb_msgsrc {
 };
 
 extern int kdb_trap_printk;
+extern int kdb_printf_cpu;
 extern __printf(2, 0) int vkdb_printf(enum kdb_msgsrc src, const char *fmt,
 				      va_list args);
 extern __printf(1, 2) int kdb_printf(const char *, ...);
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index cf495c7a8519..351b4f785270 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -30,6 +30,7 @@
 char kdb_prompt_str[CMD_BUFLEN];
 
 int kdb_trap_printk;
+int kdb_printf_cpu = -1;
 
 static int kgdb_transition_check(char *buffer)
 {
@@ -554,24 +555,19 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	int linecount;
 	int colcount;
 	int logging, saved_loglevel = 0;
-	int saved_trap_printk;
 	int retlen = 0;
 	int fnd, len;
 	int this_cpu, old_cpu;
-	static int kdb_printf_cpu = -1;
 	char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
 	char *moreprompt = "more> ";
 	struct console *c = console_drivers;
 	unsigned long uninitialized_var(flags);
 
-	local_irq_save(flags);
-	saved_trap_printk = kdb_trap_printk;
-	kdb_trap_printk = 0;
-
 	/* Serialize kdb_printf if multiple cpus try to write at once.
 	 * But if any cpu goes recursive in kdb, just print the output,
 	 * even if it is interleaved with any other text.
 	 */
+	local_irq_save(flags);
 	this_cpu = smp_processor_id();
 	for (;;) {
 		old_cpu = cmpxchg(&kdb_printf_cpu, -1, this_cpu);
@@ -849,7 +845,6 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 		console_loglevel = saved_loglevel;
 	/* kdb_printf_cpu locked the code above. */
 	smp_store_release(&kdb_printf_cpu, old_cpu);
-	kdb_trap_printk = saved_trap_printk;
 	local_irq_restore(flags);
 	return retlen;
 }
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f7a55e9ff2f7..2403800bb5f9 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1926,7 +1926,8 @@ int vprintk_default(const char *fmt, va_list args)
 	int r;
 
 #ifdef CONFIG_KGDB_KDB
-	if (unlikely(kdb_trap_printk)) {
+	/* Allow to pass printk() to kdb but avoid a recursion. */
+	if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) {
 		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
 		return r;
 	}
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-29  9:40 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-29  9:37 [PATCH v2 0/3] kdb: Fix locking in vkdb_printf() Petr Mladek
2016-11-29  9:37 ` [PATCH v2 1/3] kdb: Remove unused kdb_event handling Petr Mladek
2016-11-29  9:37 ` [PATCH v2 2/3] kdb: Properly synchronize vkdb_printf() calls with other CPUs Petr Mladek
2016-11-29  9:37 ` [PATCH v2 3/3] kdb: Call vkdb_printf() from vprintk_default() only when wanted Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).