* [PATCH 0/4 v5] Avoid softlockups in console_unlock()
From: Jan Kara @ 2013-08-14 13:28 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Steven Rostedt, mhocko, Jan Kara

  Hello,

  so I finally got back to the patches to avoid softlockups when a CPU gets
caught in console_unlock() for a long time during heavy printing from other
CPUs. As discussed in patch 3/4, it isn't enough to just silence the
watchdog: if a CPU spends too long in console_unlock(), RCU will also
complain, other CPUs can get blocked waiting for the printing CPU to process
an IPI, and even a disk can be offlined because commands couldn't be
delivered to it for too long.

This patch series solves the problem by stopping printing in console_unlock()
after 1000 characters and postponing the rest to irq work. To avoid hogging a
single CPU (irq work is normally processed on the same CPU where it was
queued, so by itself that wouldn't reduce the printing load on that CPU), we
introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
processed by any CPU.
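
For orientation, the resulting flow - a simplified sketch of what patches 3
and 4 add up to, not a verbatim excerpt - is roughly:

	void console_unlock(void)
	{
		/* print at most ~1000 chars; returns true if output remains */
		if (__console_unlock()) {
			set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
			irq_work_queue(&printk_irq_work); /* IRQ_WORK_UNBOUND */
		}
	}
	/*
	 * The queued work later runs from a timer tick on whichever CPU gets
	 * there first and does console_trylock() + console_unlock() again.
	 */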

The patch series has survived my testing without any softlockup reports.  I've
tested running sysrq-t (lots of printk output) while inserting modules (to
generate IPIs and also some printk traffic) and also running two delayed works
printing 10 KB of text each. All this was with a simulated 9600 baud
serial console.

								Honza


* [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead
From: Jan Kara @ 2013-08-14 13:28 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Steven Rostedt, mhocko, Jan Kara

From: Steven Rostedt <rostedt@goodmis.org>

To prevent deadlocks when doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs an rq lock - the kind of
lock that may already be held in the scheduler. This leads to a possible
deadlock if the wake up uses the same rq as the one whose lock is
already held.
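
The problematic call chain, roughly (sketch for illustration only):

	/* scheduler code, some rq->lock already held */
	printk()
	  vprintk_emit()
	    console_unlock()
	      up(&console_sem)
	        wake_up_process(waiter)	/* needs that task's rq lock */
	          -> deadlock if that is the rq whose lock we already hold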

What printk_sched() does is save the printk write in a per-cpu buffer
and set the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There are a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is quite a bit
of space wasted for something that is seldom used.

In order to remove this, printk_sched() can use the printk buffer
instead, and delay the console_trylock()/console_unlock() to the queued
irq work.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the
up(&console_sem) may do a wake up of any pending waiters. This must be
avoided while holding the logbuf_lock.

Luckily, there are not many places that do the unlock or hold the
logbuf_lock. By moving things around a little, the console_sem can be
released without ever holding the logbuf_lock, and we can safely have
printk_sched() use the printk buffer directly.
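
For illustration, a caller in scheduler context is unchanged by this; the
call site below is just an example:

	/* e.g. while holding an rq->lock */
	printk_sched("sched: RT throttling activated\n");

With this patch the text is stored straight into the main printk buffer,
prefixed with "[sched_delayed] ", and the console is flushed later from the
queued irq work rather than from under the rq lock.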

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 86 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 54 insertions(+), 32 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5b5a708..9208e17 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -208,6 +208,9 @@ struct printk_log {
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters. It is also
  * used in interesting ways to provide interlocking in console_unlock();
+ * This can be taken within the scheduler's rq lock. It must be released
+ * before calling console_unlock() or anything else that might wake up
+ * a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1343,27 +1346,43 @@ static inline int can_use_console(unsigned int cpu)
  * interrupts disabled. It should return with 'lockbuf_lock'
  * released but interrupts still disabled.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
 	__releases(&logbuf_lock)
 {
 	int retval = 0, wake = 0;
 
-	if (console_trylock()) {
-		retval = 1;
+	/* if called from the scheduler, we can not call up() */
+	if (in_sched)
+		goto out;
 
-		/*
-		 * If we can't use the console, we need to release
-		 * the console semaphore by hand to avoid flushing
-		 * the buffer. We need to hold the console semaphore
-		 * in order to do this test safely.
-		 */
-		if (!can_use_console(cpu)) {
-			console_locked = 0;
-			wake = 1;
-			retval = 0;
-		}
+	if (down_trylock(&console_sem))
+		goto out;
+
+	/*
+	 * If we can't use the console, we need to release
+	 * the console semaphore by hand to avoid flushing
+	 * the buffer. We need to hold the console semaphore
+	 * in order to do this test safely.
+	 */
+	if (console_suspended || !can_use_console(cpu)) {
+		wake = 1;
+		goto out;
 	}
+
+	/* console is now locked */
+
+	console_locked = 1;
+	console_may_schedule = 0;
+	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
+
+	retval = 1;
+
+out:
 	logbuf_cpu = UINT_MAX;
+	/*
+	 * The logbuf_lock must not be held when doing a wake up,
+	 * which the up(&console_sem) can do.
+	 */
 	raw_spin_unlock(&logbuf_lock);
 	if (wake)
 		up(&console_sem);
@@ -1495,11 +1514,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
+
+	if (level == -2) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1545,7 +1570,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1620,7 +1650,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The console_trylock_for_printk() function will release 'logbuf_lock'
 	 * regardless of whether it actually gets the console semaphore or not.
 	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk(this_cpu, in_sched))
 		console_unlock();
 
 	lockdep_on();
@@ -2428,21 +2458,19 @@ late_initcall(printk_late_init);
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-#define PRINTK_BUF_SIZE		512
-
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 static void wake_up_klogd_work_func(struct irq_work *irq_work)
 {
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_SCHED) {
-		char *buf = __get_cpu_var(printk_sched_buf);
-		printk(KERN_WARNING "[sched_delayed] %s", buf);
+	if (pending & PRINTK_PENDING_OUTPUT) {
+		/* If trylock fails, someone else is doing the printing */
+		if (console_trylock())
+			console_unlock();
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
@@ -2466,21 +2494,15 @@ void wake_up_klogd(void)
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
-	local_irq_restore(flags);
 
 	return r;
 }
-- 
1.8.1.4



* [PATCH 2/4] irq_work: Provide an irq work that can be processed on any cpu
From: Jan Kara @ 2013-08-14 13:28 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Steven Rostedt, mhocko, Jan Kara

Provide a new irq work flag - IRQ_WORK_UNBOUND - meaning that the work
can be processed on any cpu. This flag implies IRQ_WORK_LAZY so that
things stay simple and we don't have to pick any particular cpu to do
the work. We just do the work from a timer tick on whichever cpu gets
there first. This is useful as a lightweight and simple code path,
without locking or other dependencies, for offloading work to another
cpu when possible.

We will use this type of irq work to guarantee forward progress of
printing to a (serial) console when printing on one cpu would cause
interrupts to be disabled for too long.
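
Usage is the same as for any other irq work, only the flag differs. A
minimal sketch (the names my_work / my_work_func are made up for the
example):

	static void my_work_func(struct irq_work *work)
	{
		/* runs from a timer tick (or the irq work interrupt) on some
		 * CPU, not necessarily the one that queued the work */
	}

	static struct irq_work my_work = {
		.func	= my_work_func,
		.flags	= IRQ_WORK_UNBOUND,	/* implies IRQ_WORK_LAZY */
	};

	/* from any context that must not sleep: */
	irq_work_queue(&my_work);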

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/irq_work.h |  2 ++
 kernel/irq_work.c        | 51 +++++++++++++++++++++++++++++-------------------
 2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
index 6601702..ca07a16 100644
--- a/include/linux/irq_work.h
+++ b/include/linux/irq_work.h
@@ -16,6 +16,8 @@
 #define IRQ_WORK_BUSY		2UL
 #define IRQ_WORK_FLAGS		3UL
 #define IRQ_WORK_LAZY		4UL /* Doesn't want IPI, wait for tick */
+#define __IRQ_WORK_UNBOUND      8UL /* Use IRQ_WORK_UNBOUND instead! */
+#define IRQ_WORK_UNBOUND	(__IRQ_WORK_UNBOUND | IRQ_WORK_LAZY) /* Any cpu can process this work */
 
 struct irq_work {
 	unsigned long flags;
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 55fcce6..446cd81 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -22,6 +22,9 @@
 static DEFINE_PER_CPU(struct llist_head, irq_work_list);
 static DEFINE_PER_CPU(int, irq_work_raised);
 
+/* List of irq-work any CPU can pick up */
+static LLIST_HEAD(unbound_irq_work_list);
+
 /*
  * Claim the entry so that no one else will poke at it.
  */
@@ -70,12 +73,16 @@ void irq_work_queue(struct irq_work *work)
 	/* Queue the entry and raise the IPI if needed. */
 	preempt_disable();
 
-	llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+	if (!(work->flags & __IRQ_WORK_UNBOUND))
+		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+	else
+		llist_add(&work->llnode, &unbound_irq_work_list);
 
 	/*
 	 * If the work is not "lazy" or the tick is stopped, raise the irq
 	 * work interrupt (if supported by the arch), otherwise, just wait
-	 * for the next tick.
+	 * for the next tick. We do this even for unbound work to make sure
+	 * *some* CPU will be doing the work.
 	 */
 	if (!(work->flags & IRQ_WORK_LAZY) || tick_nohz_tick_stopped()) {
 		if (!this_cpu_cmpxchg(irq_work_raised, 0, 1))
@@ -100,28 +107,11 @@ bool irq_work_needs_cpu(void)
 	return true;
 }
 
-static void __irq_work_run(void)
+static void process_irq_work_list(struct llist_node *llnode)
 {
 	unsigned long flags;
 	struct irq_work *work;
-	struct llist_head *this_list;
-	struct llist_node *llnode;
-
-
-	/*
-	 * Reset the "raised" state right before we check the list because
-	 * an NMI may enqueue after we find the list empty from the runner.
-	 */
-	__this_cpu_write(irq_work_raised, 0);
-	barrier();
 
-	this_list = &__get_cpu_var(irq_work_list);
-	if (llist_empty(this_list))
-		return;
-
-	BUG_ON(!irqs_disabled());
-
-	llnode = llist_del_all(this_list);
 	while (llnode != NULL) {
 		work = llist_entry(llnode, struct irq_work, llnode);
 
@@ -146,6 +136,27 @@ static void __irq_work_run(void)
 	}
 }
 
+static void __irq_work_run(void)
+{
+	struct llist_head *this_list;
+
+	/*
+	 * Reset the "raised" state right before we check the list because
+	 * an NMI may enqueue after we find the list empty from the runner.
+	 */
+	__this_cpu_write(irq_work_raised, 0);
+	barrier();
+
+	this_list = &__get_cpu_var(irq_work_list);
+	if (llist_empty(this_list) && llist_empty(&unbound_irq_work_list))
+		return;
+
+	BUG_ON(!irqs_disabled());
+
+	process_irq_work_list(llist_del_all(this_list));
+	process_irq_work_list(llist_del_all(&unbound_irq_work_list));
+}
+
 /*
  * Run the irq_work entries on this cpu. Requires to be ran from hardirq
  * context with local IRQs disabled.
-- 
1.8.1.4



* [PATCH 3/4] printk: Defer printing to irq work when we printed too much
From: Jan Kara @ 2013-08-14 13:28 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Steven Rostedt, mhocko, Jan Kara

A CPU can be caught in console_unlock() for a long time (tens of seconds
are reported by our customers) when other CPUs are using printk heavily
and a serial console makes printing slow. Although serial console drivers
call touch_nmi_watchdog(), this triggers softlockup warnings because
interrupts are disabled for the whole time console_unlock() runs (e.g.
vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
cannot be processed and other CPUs get stuck spinning in calls like
smp_call_function_many(). RCU also eventually starts reporting lockups.

In my artificial testing I can also easily trigger a situation where a
disk disappears from the system, apparently because commands to / from
it could not be delivered for long enough. This is why just silencing
watchdogs isn't a reliable solution to the problem and we simply have to
avoid spending too long in console_unlock() with interrupts disabled.

The solution this patch works toward is to postpone printing to a later
moment / different CPU once we have already printed over 1000 characters
in the current console_unlock() invocation. This is a crude heuristic,
but measuring the time spent printing doesn't seem to be really viable -
we cannot rely on high-resolution time being available, and with
interrupts disabled jiffies are not updated. The value 1000 was chosen
so that things are still bearable with a 9600 baud serial console and
OTOH it shouldn't cause offloading of printing in common cases.
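
As a back-of-the-envelope check (assuming the usual 10 bits on the wire per
character): 9600 baud / 10 gives roughly 960 characters per second, so 1000
characters correspond to about one second of printing in a single
console_unlock() invocation before the rest is offloaded.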

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 62 ++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 52 insertions(+), 10 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9208e17..28c220a 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2026,31 +2026,49 @@ out:
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 }
 
+/*
+ * How many characters can we print in one call of printk before offloading
+ * printing work
+ */
+#define MAX_PRINTK_CHARS 1000
+
+/* Should we stop printing on this cpu? */
+static bool cpu_stop_printing(int printed_chars)
+{
+	/* Oops? Print everything now to maximize chances user will see it */
+	if (oops_in_progress)
+		return false;
+	return printed_chars > MAX_PRINTK_CHARS;
+}
+
 /**
- * console_unlock - unlock the console system
+ * __console_unlock - unlock the console system
  *
  * Releases the console_lock which the caller holds on the console system
  * and the console driver list.
  *
- * While the console_lock was held, console output may have been buffered
- * by printk().  If this is the case, console_unlock(); emits
- * the output prior to releasing the lock.
+ * While the console_lock was held, console output may have been buffered by
+ * printk(). If this is the case, __console_unlock() emits the output prior to
+ * releasing the lock. However, we need not write all the data in the buffer
+ * if that would hog the CPU for too long. The function returns true if there
+ * is more data in the buffer that still needs printing.
  *
  * If there is output waiting, we wake /dev/kmsg and syslog() users.
  *
- * console_unlock(); may be called from any context.
+ * __console_unlock(); may be called from any context.
  */
-void console_unlock(void)
+static bool __console_unlock(void)
 {
 	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
 	bool retry;
+	int printed_chars = 0;
 
 	if (console_suspended) {
 		up(&console_sem);
-		return;
+		return false;
 	}
 
 	console_may_schedule = 0;
@@ -2078,6 +2096,8 @@ again:
 skip:
 		if (console_seq == log_next_seq)
 			break;
+		if (cpu_stop_printing(printed_chars))
+			break;
 
 		msg = log_from_idx(console_idx);
 		if (msg->flags & LOG_NOCONS) {
@@ -2108,6 +2128,7 @@ skip:
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(level, text, len);
 		start_critical_timings();
+		printed_chars += len;
 		local_irq_restore(flags);
 	}
 	console_locked = 0;
@@ -2131,13 +2152,20 @@ skip:
 	retry = console_seq != log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	if (retry && console_trylock())
-		goto again;
+	if (retry && !cpu_stop_printing(printed_chars)) {
+		if (console_trylock())
+			goto again;
+		/*
+		 * Someone else is printing so the caller doesn't have to
+		 * schedule irq work
+		 */
+		retry = false;
+	}
 
 	if (wake_klogd)
 		wake_up_klogd();
+	return retry;
 }
-EXPORT_SYMBOL(console_unlock);
 
 /**
  * console_conditional_schedule - yield the CPU if required
@@ -2508,6 +2536,20 @@ int printk_sched(const char *fmt, ...)
 }
 
 /*
+ * This is a wrapper for __console_unlock() that makes sure the rest of the
+ * buffer is printed in the future.
+ */
+void console_unlock(void)
+{
+       if (__console_unlock()) {
+               /* Leave the rest of printing for a timer tick */
+		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	}
+}
+EXPORT_SYMBOL(console_unlock);
+
+/*
  * printk rate limiting, lifted from the networking subsystem.
  *
  * This enforces a rate limit: not more than 10 kernel messages
-- 
1.8.1.4



* [PATCH 4/4] printk: Use unbound irq work for printing and waking
From: Jan Kara @ 2013-08-14 13:28 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Steven Rostedt, mhocko, Jan Kara

Now that per-cpu printk buffers are gone, there's no need to have printk
flags or a printk irq_work per cpu. Just make printk_pending a single
variable operated on by atomic operations and have a single unbound irq
work do the waking and printing. This has the advantage that any cpu can
do the printing / wakeup work, thus lowering the latency of printing and
better distributing the printing work over cpus.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 28c220a..118ba7b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2486,36 +2486,41 @@ late_initcall(printk_late_init);
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
+#define PRINTK_PENDING_WAKEUP	0
+#define PRINTK_PENDING_OUTPUT	1
 
-static DEFINE_PER_CPU(int, printk_pending);
+static unsigned long printk_pending;
 
-static void wake_up_klogd_work_func(struct irq_work *irq_work)
+static void printk_irq_work_func(struct irq_work *irq_work)
 {
-	int pending = __this_cpu_xchg(printk_pending, 0);
+	if (printk_pending) {
+		unsigned long pending = xchg(&printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_OUTPUT) {
-		/* If trylock fails, someone else is doing the printing */
-		if (console_trylock())
-			console_unlock();
-	}
+		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
+			/*
+			 * If trylock fails, someone else is doing the
+			 * printing
+			 */
+			if (console_trylock())
+				console_unlock();
+		}
 
-	if (pending & PRINTK_PENDING_WAKEUP)
-		wake_up_interruptible(&log_wait);
+		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
+			wake_up_interruptible(&log_wait);
+	}
 }
 
-static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
-	.func = wake_up_klogd_work_func,
-	.flags = IRQ_WORK_LAZY,
+static struct irq_work printk_irq_work = {
+	.func = printk_irq_work_func,
+	.flags = IRQ_WORK_UNBOUND,
 };
 
 void wake_up_klogd(void)
 {
 	preempt_disable();
 	if (waitqueue_active(&log_wait)) {
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		set_bit(PRINTK_PENDING_WAKEUP, &printk_pending);
+		irq_work_queue(&printk_irq_work);
 	}
 	preempt_enable();
 }
@@ -2529,8 +2534,8 @@ int printk_sched(const char *fmt, ...)
 	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+	irq_work_queue(&printk_irq_work);
 
 	return r;
 }
@@ -2543,8 +2548,8 @@ void console_unlock(void)
 {
        if (__console_unlock()) {
                /* Leave the rest of printing for a timer tick */
-		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+		irq_work_queue(&printk_irq_work);
 	}
 }
 EXPORT_SYMBOL(console_unlock);
-- 
1.8.1.4



* Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much
From: Steven Rostedt @ 2013-08-15  1:38 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, Andrew Morton, mhocko

On Wed, 14 Aug 2013 15:28:27 +0200
Jan Kara <jack@suse.cz> wrote:

> A CPU can be caught in console_unlock() for a long time (tens of seconds
> are reported by our customers) when other CPUs are using printk heavily
> and a serial console makes printing slow. Although serial console drivers
> call touch_nmi_watchdog(), this triggers softlockup warnings because
> interrupts are disabled for the whole time console_unlock() runs (e.g.
> vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> cannot be processed and other CPUs get stuck spinning in calls like
> smp_call_function_many(). RCU also eventually starts reporting lockups.
> 
> In my artificial testing I can also easily trigger a situation where a
> disk disappears from the system, apparently because commands to / from
> it could not be delivered for long enough. This is why just silencing
> watchdogs isn't a reliable solution to the problem and we simply have to
> avoid spending too long in console_unlock() with interrupts disabled.
> 
> The solution this patch works toward is to postpone printing to a later
> moment / different CPU once we have already printed over 1000 characters
> in the current console_unlock() invocation. This is a crude heuristic,
> but measuring the time spent printing doesn't seem to be really viable -
> we cannot rely on high-resolution time being available, and with
> interrupts disabled jiffies are not updated. The value 1000 was chosen
> so that things are still bearable with a 9600 baud serial console and
> OTOH it shouldn't cause offloading of printing in common cases.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/printk/printk.c | 62 ++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 52 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9208e17..28c220a 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2026,31 +2026,49 @@ out:
>  	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
>  }
>  
> +/*
> + * How many characters can we print in one call of printk before offloading
> + * printing work
> + */
> +#define MAX_PRINTK_CHARS 1000

This really needs to be configurable. 1000 may be good for you, but may
not be good for others. Heck, we should be able to disable this
offloading too.

This should probably be added to the printk /proc/sys/kernel/... files.

If it is set to zero, then disable any offloading.
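
(For concreteness, one possible shape of such a knob - the sysctl name
printk_offload_chars and its placement are made up here, not part of the
series:

	static int printk_offload_chars = 1000;	/* 0 == never offload */

	/* entry added to kern_table[] in kernel/sysctl.c: */
	{
		.procname	= "printk_offload_chars",
		.data		= &printk_offload_chars,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec,
	},

cpu_stop_printing() would then test "printk_offload_chars && printed_chars >
printk_offload_chars" instead of the constant.)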

-- Steve


> +
> +/* Should we stop printing on this cpu? */
> +static bool cpu_stop_printing(int printed_chars)
> +{
> +	/* Oops? Print everything now to maximize chances user will see it */
> +	if (oops_in_progress)
> +		return false;
> +	return printed_chars > MAX_PRINTK_CHARS;
> +}
> +
>  /**


* Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much
From: Jan Kara @ 2013-08-15  7:52 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Jan Kara, LKML, Andrew Morton, mhocko

On Wed 14-08-13 21:38:18, Steven Rostedt wrote:
> On Wed, 14 Aug 2013 15:28:27 +0200
> Jan Kara <jack@suse.cz> wrote:
> 
> > A CPU can be caught in console_unlock() for a long time (tens of seconds
> > are reported by our customers) when other CPUs are using printk heavily
> > and a serial console makes printing slow. Although serial console drivers
> > call touch_nmi_watchdog(), this triggers softlockup warnings because
> > interrupts are disabled for the whole time console_unlock() runs (e.g.
> > vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> > cannot be processed and other CPUs get stuck spinning in calls like
> > smp_call_function_many(). RCU also eventually starts reporting lockups.
> > 
> > In my artificial testing I can also easily trigger a situation where a
> > disk disappears from the system, apparently because commands to / from
> > it could not be delivered for long enough. This is why just silencing
> > watchdogs isn't a reliable solution to the problem and we simply have to
> > avoid spending too long in console_unlock() with interrupts disabled.
> > 
> > The solution this patch works toward is to postpone printing to a later
> > moment / different CPU once we have already printed over 1000 characters
> > in the current console_unlock() invocation. This is a crude heuristic,
> > but measuring the time spent printing doesn't seem to be really viable -
> > we cannot rely on high-resolution time being available, and with
> > interrupts disabled jiffies are not updated. The value 1000 was chosen
> > so that things are still bearable with a 9600 baud serial console and
> > OTOH it shouldn't cause offloading of printing in common cases.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  kernel/printk/printk.c | 62 ++++++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 52 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 9208e17..28c220a 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2026,31 +2026,49 @@ out:
> >  	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> >  }
> >  
> > +/*
> > + * How many characters can we print in one call of printk before offloading
> > + * printing work
> > + */
> > +#define MAX_PRINTK_CHARS 1000
> 
> This really needs to be configurable. 1000 may be good for you, but may
> not be good for others. Heck, we should be able to disable this
> offloading too.
> 
> This should probably be added to the printk /proc/sys/kernel/... files.
> 
> If it is set to zero, then disable any offloading.
  OK, I can easily do this. I'm somewhat skeptical people would want to
tune this but maybe they would.

								Honza

> > +
> > +/* Should we stop printing on this cpu? */
> > +static bool cpu_stop_printing(int printed_chars)
> > +{
> > +	/* Oops? Print everything now to maximize chances user will see it */
> > +	if (oops_in_progress)
> > +		return false;
> > +	return printed_chars > MAX_PRINTK_CHARS;
> > +}
> > +
> >  /**
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much
From: Steven Rostedt @ 2013-08-15 13:26 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, Andrew Morton, mhocko

On Thu, 15 Aug 2013 09:52:51 +0200
Jan Kara <jack@suse.cz> wrote:

 
> > > +/*
> > > + * How many characters can we print in one call of printk before offloading
> > > + * printing work
> > > + */
> > > +#define MAX_PRINTK_CHARS 1000
> > 
> > This really needs to be configurable. 1000 may be good for you, but may
> > not be good for others. Heck, we should be able to disable this
> > offloading too.
> > 
> > This should probably be added to the printk /proc/sys/kernel/... files.
> > 
> > If it is set to zero, then disable any offloading.
>   OK, I can easily do this. I'm somewhat skeptical people would want to
> tune this but maybe they would.
> 

Most people wouldn't. But as this is a new heuristic, if things are not
behaving as people expect, then it would be good to be able to modify
that heuristic - including disabling it - to see whether it is the issue.

-- Steve


