linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4 v6] Avoid softlockups in console_unlock()
@ 2013-08-21  8:08 Jan Kara
  2013-08-21  8:08 ` [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead Jan Kara
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Jan Kara @ 2013-08-21  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, hare, Steven Rostedt, Jan Kara

  Hello,

  This is the next iteration of my printk patchset. Since v5 I've made the
limit for printing configurable via sysctl.

Summary:

These patches avoid softlockups when a CPU gets caught in console_unlock() for
a long time during heavy printing from other CPUs. As discussed in patch 3/4,
it isn't enough to just silence the watchdog: if a CPU spends too long in
console_unlock(), RCU will also complain, other CPUs can be blocked waiting
for the printing CPU to process IPIs, and even a disk can be offlined because
commands couldn't be delivered to it for too long.

This patch series solves the problem by stopping printing in console_unlock()
after 1000 characters and postponing the rest of the printing to irq work. To
avoid hogging a single CPU (irq work is processed on the same CPU where it was
queued, so by itself it wouldn't reduce the printing load on that CPU) we
introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
processed by any CPU.

The patch series has survived my testing without any softlockup reports.  I've
tested running sysrq-t (lots of printk output) while inserting modules (to
generate IPIs and also some printk traffic) and also running two delayed works
printing 10 KB of text each. All this was with a simulated 9600 baud serial
console.

								Honza

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
@ 2013-08-21  8:08 ` Jan Kara
  2013-08-21  8:08 ` [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu Jan Kara
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2013-08-21  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, hare, Steven Rostedt, Jan Kara

From: Steven Rostedt <rostedt@goodmis.org>

To prevent deadlocks from doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake-up if there's a
task pending on the sem, and this wake-up grabs the rq lock that is
held in the scheduler. This leads to a possible deadlock if the wake-up
uses the same rq as the one whose lock is already held.

What printk_sched() does is save the printk output in a per-cpu buffer
and set the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There are a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is quite a
bit of space wasted for something that is seldom used.

To remove these issues, printk_sched() can use the printk buffer
instead and delay the console_trylock()/console_unlock() to the queued
irq work.
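For illustration, issue 1 above can be sketched in plain userspace C (a
hypothetical model, not the kernel code; the buffer size and names merely
mirror the old implementation):

```c
#include <stdio.h>
#include <string.h>

#define PRINTK_BUF_SIZE 512

/* Stand-ins for the old per-cpu buffer and pending flag */
static char printk_sched_buf[PRINTK_BUF_SIZE];
static int printk_pending;

/* Model of the old printk_sched(): each call overwrites the buffer,
 * so a second message before the tick destroys the first one. */
static void old_printk_sched(const char *msg)
{
	snprintf(printk_sched_buf, PRINTK_BUF_SIZE, "%s", msg);
	printk_pending = 1;
}
```

Calling old_printk_sched("first") and then old_printk_sched("second") before
the simulated tick leaves only "second" in the buffer - exactly the
lost-message problem this patch removes.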

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the
up(&console_sem) may do a wake up of any pending waiters. This must be
avoided while holding the logbuf_lock.

Luckily, there are not many places that do the unlock or hold the
logbuf_lock. By moving things around a little, the console_sem can be
released without ever holding the logbuf_lock, and we can safely have
printk_sched() use the printk buffer directly.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 86 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 54 insertions(+), 32 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5b5a708..9208e17 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -208,6 +208,9 @@ struct printk_log {
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters. It is also
  * used in interesting ways to provide interlocking in console_unlock();
+ * This can be taken within the scheduler's rq lock. It must be released
+ * before calling console_unlock() or anything else that might wake up
+ * a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1343,27 +1346,43 @@ static inline int can_use_console(unsigned int cpu)
  * interrupts disabled. It should return with 'lockbuf_lock'
  * released but interrupts still disabled.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
 	__releases(&logbuf_lock)
 {
 	int retval = 0, wake = 0;
 
-	if (console_trylock()) {
-		retval = 1;
+	/* if called from the scheduler, we can not call up() */
+	if (in_sched)
+		goto out;
 
-		/*
-		 * If we can't use the console, we need to release
-		 * the console semaphore by hand to avoid flushing
-		 * the buffer. We need to hold the console semaphore
-		 * in order to do this test safely.
-		 */
-		if (!can_use_console(cpu)) {
-			console_locked = 0;
-			wake = 1;
-			retval = 0;
-		}
+	if (down_trylock(&console_sem))
+		goto out;
+
+	/*
+	 * If we can't use the console, we need to release
+	 * the console semaphore by hand to avoid flushing
+	 * the buffer. We need to hold the console semaphore
+	 * in order to do this test safely.
+	 */
+	if (console_suspended || !can_use_console(cpu)) {
+		wake = 1;
+		goto out;
 	}
+
+	/* console is now locked */
+
+	console_locked = 1;
+	console_may_schedule = 0;
+	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
+
+	retval = 1;
+
+out:
 	logbuf_cpu = UINT_MAX;
+	/*
+	 * The logbuf_lock must not be held when doing a wake up,
+	 * which the up(&console_sem) can do.
+	 */
 	raw_spin_unlock(&logbuf_lock);
 	if (wake)
 		up(&console_sem);
@@ -1495,11 +1514,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
+
+	if (level == -2) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1545,7 +1570,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1620,7 +1650,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The console_trylock_for_printk() function will release 'logbuf_lock'
 	 * regardless of whether it actually gets the console semaphore or not.
 	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk(this_cpu, in_sched))
 		console_unlock();
 
 	lockdep_on();
@@ -2428,21 +2458,19 @@ late_initcall(printk_late_init);
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-#define PRINTK_BUF_SIZE		512
-
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 static void wake_up_klogd_work_func(struct irq_work *irq_work)
 {
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_SCHED) {
-		char *buf = __get_cpu_var(printk_sched_buf);
-		printk(KERN_WARNING "[sched_delayed] %s", buf);
+	if (pending & PRINTK_PENDING_OUTPUT) {
+		/* If trylock fails, someone else is doing the printing */
+		if (console_trylock())
+			console_unlock();
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
@@ -2466,21 +2494,15 @@ void wake_up_klogd(void)
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
-	local_irq_restore(flags);
 
 	return r;
 }
-- 
1.8.1.4



* [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu
  2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
  2013-08-21  8:08 ` [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead Jan Kara
@ 2013-08-21  8:08 ` Jan Kara
  2013-08-21 18:49   ` Steven Rostedt
  2013-08-21  8:08 ` [PATCH 3/4] printk: Defer printing to irq work when we printed too much Jan Kara
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-08-21  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, hare, Steven Rostedt, Jan Kara

Provide a new irq work flag - IRQ_WORK_UNBOUND - meaning that the work
can be processed on any cpu. This flag implies IRQ_WORK_LAZY so that things
are simple and we don't have to pick any particular cpu to do the work. We
just do the work from a timer tick on whichever cpu gets there first. This
is useful as a lightweight and simple code path, without locking or other
dependencies, to offload work to another cpu when possible.

We will use this type of irq work to guarantee forward progress of
printing to a (serial) console when printing on one cpu would cause
interrupts to be disabled for too long.
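The shared list this relies on is the kernel's lock-free llist. As a rough
userspace analogue (a sketch using C11 atomics, not the kernel's
implementation; names are stand-ins), the push and the take-all drain that
let any CPU pick up unbound work look like:

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int payload;
};

/* Stand-in for unbound_irq_work_list */
static _Atomic(struct node *) unbound_list;

/* cf. llist_add(): lock-free push onto the shared list */
static void list_push(struct node *n)
{
	struct node *head = atomic_load(&unbound_list);

	do {
		n->next = head;
	} while (!atomic_compare_exchange_weak(&unbound_list, &head, n));
}

/* cf. llist_del_all(): whichever thread runs this first atomically
 * takes the whole list and becomes responsible for processing it */
static struct node *list_take_all(void)
{
	return atomic_exchange(&unbound_list, NULL);
}
```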

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/irq_work.h |  2 ++
 kernel/irq_work.c        | 51 +++++++++++++++++++++++++++++-------------------
 2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
index 6601702..ca07a16 100644
--- a/include/linux/irq_work.h
+++ b/include/linux/irq_work.h
@@ -16,6 +16,8 @@
 #define IRQ_WORK_BUSY		2UL
 #define IRQ_WORK_FLAGS		3UL
 #define IRQ_WORK_LAZY		4UL /* Doesn't want IPI, wait for tick */
+#define __IRQ_WORK_UNBOUND      8UL /* Use IRQ_WORK_UNBOUND instead! */
+#define IRQ_WORK_UNBOUND	(__IRQ_WORK_UNBOUND | IRQ_WORK_LAZY) /* Any cpu can process this work */
 
 struct irq_work {
 	unsigned long flags;
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 55fcce6..446cd81 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -22,6 +22,9 @@
 static DEFINE_PER_CPU(struct llist_head, irq_work_list);
 static DEFINE_PER_CPU(int, irq_work_raised);
 
+/* List of irq-work any CPU can pick up */
+static LLIST_HEAD(unbound_irq_work_list);
+
 /*
  * Claim the entry so that no one else will poke at it.
  */
@@ -70,12 +73,16 @@ void irq_work_queue(struct irq_work *work)
 	/* Queue the entry and raise the IPI if needed. */
 	preempt_disable();
 
-	llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+	if (!(work->flags & __IRQ_WORK_UNBOUND))
+		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+	else
+		llist_add(&work->llnode, &unbound_irq_work_list);
 
 	/*
 	 * If the work is not "lazy" or the tick is stopped, raise the irq
 	 * work interrupt (if supported by the arch), otherwise, just wait
-	 * for the next tick.
+	 * for the next tick. We do this even for unbound work to make sure
+	 * *some* CPU will be doing the work.
 	 */
 	if (!(work->flags & IRQ_WORK_LAZY) || tick_nohz_tick_stopped()) {
 		if (!this_cpu_cmpxchg(irq_work_raised, 0, 1))
@@ -100,28 +107,11 @@ bool irq_work_needs_cpu(void)
 	return true;
 }
 
-static void __irq_work_run(void)
+static void process_irq_work_list(struct llist_node *llnode)
 {
 	unsigned long flags;
 	struct irq_work *work;
-	struct llist_head *this_list;
-	struct llist_node *llnode;
-
-
-	/*
-	 * Reset the "raised" state right before we check the list because
-	 * an NMI may enqueue after we find the list empty from the runner.
-	 */
-	__this_cpu_write(irq_work_raised, 0);
-	barrier();
 
-	this_list = &__get_cpu_var(irq_work_list);
-	if (llist_empty(this_list))
-		return;
-
-	BUG_ON(!irqs_disabled());
-
-	llnode = llist_del_all(this_list);
 	while (llnode != NULL) {
 		work = llist_entry(llnode, struct irq_work, llnode);
 
@@ -146,6 +136,27 @@ static void __irq_work_run(void)
 	}
 }
 
+static void __irq_work_run(void)
+{
+	struct llist_head *this_list;
+
+	/*
+	 * Reset the "raised" state right before we check the list because
+	 * an NMI may enqueue after we find the list empty from the runner.
+	 */
+	__this_cpu_write(irq_work_raised, 0);
+	barrier();
+
+	this_list = &__get_cpu_var(irq_work_list);
+	if (llist_empty(this_list) && llist_empty(&unbound_irq_work_list))
+		return;
+
+	BUG_ON(!irqs_disabled());
+
+	process_irq_work_list(llist_del_all(this_list));
+	process_irq_work_list(llist_del_all(&unbound_irq_work_list));
+}
+
 /*
  * Run the irq_work entries on this cpu. Requires to be ran from hardirq
  * context with local IRQs disabled.
-- 
1.8.1.4



* [PATCH 3/4] printk: Defer printing to irq work when we printed too much
  2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
  2013-08-21  8:08 ` [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead Jan Kara
  2013-08-21  8:08 ` [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu Jan Kara
@ 2013-08-21  8:08 ` Jan Kara
  2013-08-21 19:06   ` Steven Rostedt
  2013-08-21  8:08 ` [PATCH 4/4] printk: Use unbound irq work for printing and waking Jan Kara
  2013-08-21 21:27 ` [PATCH 0/4 v6] Avoid softlockups in console_unlock() Andrew Morton
  4 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-08-21  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, hare, Steven Rostedt, Jan Kara

A CPU can be caught in console_unlock() for a long time (tens of seconds
are reported by our customers) when other CPUs are using printk heavily
and a serial console makes printing slow. Although serial console drivers
call touch_nmi_watchdog(), softlockup warnings are still triggered because
interrupts are disabled for the whole time console_unlock() runs (e.g.
vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
cannot be processed and other CPUs get stuck spinning in calls like
smp_call_function_many(). Also RCU eventually starts reporting lockups.

In my artificial testing I can also easily trigger a situation in which a
disk disappears from the system, apparently because commands to / from it
could not be delivered for long enough. This is why just silencing
watchdogs isn't a reliable solution to the problem and we simply have to
avoid spending too long in console_unlock() with interrupts disabled.

The solution this patch works toward is to postpone printing to a later
moment / a different CPU once we have already printed over 1000 characters
in the current console_unlock() invocation. This is a crude heuristic, but
measuring the time we spend printing doesn't seem to be really viable - we
cannot rely on high-resolution time being available, and with interrupts
disabled jiffies are not updated. The value 1000 was chosen so that things
are still bearable with a 9600 baud serial console and OTOH it shouldn't
cause offloading of printing in common cases.
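The heuristic itself is a small pure function. A userspace sketch (with
hypothetical names, mirroring the cpu_stop_printing() added by the diff
below) is:

```c
#include <stdbool.h>

/* Sketch of the offload heuristic: stop printing on this CPU once more
 * than offload_chars characters were emitted in one console_unlock()
 * invocation. During an oops we print everything, and a limit of 0
 * disables offloading entirely. */
static bool should_offload(int printed_chars, int offload_chars,
			   bool oops_in_progress)
{
	if (oops_in_progress)
		return false;
	return offload_chars && printed_chars > offload_chars;
}
```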

Signed-off-by: Jan Kara <jack@suse.cz>
---
 Documentation/sysctl/kernel.txt | 19 +++++++++++++
 include/linux/printk.h          |  1 +
 kernel/printk/printk.c          | 62 ++++++++++++++++++++++++++++++++++-------
 kernel/sysctl.c                 |  8 ++++++
 4 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index ab7d16e..0967bcc 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -57,6 +57,7 @@ show up in /proc/sys/kernel:
 - printk_delay
 - printk_ratelimit
 - printk_ratelimit_burst
+- printk_offload_chars
 - randomize_va_space
 - real-root-dev               ==> Documentation/initrd.txt
 - reboot-cmd                  [ SPARC only ]
@@ -524,6 +525,24 @@ send before ratelimiting kicks in.
 
 ==============================================================
 
+printk_offload_chars:
+
+Printing to console can be relatively slow especially in case of
+serial console. When there is intensive printing happening from
+several cpus (as is the case during boot), a cpu can be spending
+significant time (seconds or more) doing printing. To avoid
+softlockups, lost interrupts, and similar problems a cpu stops
+printing to console after printing 'printk_offload_chars' characters.
+Another cpu will then continue printing. Higher value means possibly
+longer interrupt and other latencies but lower latency of printing
+and lower chance something goes wrong during system crash and
+important message is not printed.
+
+By default we offload printing from a cpu after printing 1000
+characters. Setting printk_offload_chars to 0 disables offloading.
+
+==============================================================
+
 randomize_va_space:
 
 This option can be used to select the type of process address
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 22c7052..135d958 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -141,6 +141,7 @@ extern bool printk_timed_ratelimit(unsigned long *caller_jiffies,
 extern int printk_delay_msec;
 extern int dmesg_restrict;
 extern int kptr_restrict;
+extern int printk_offload_chars;
 
 extern void wake_up_klogd(void);
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9208e17..e97971b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -253,6 +253,12 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 static char *log_buf = __log_buf;
 static u32 log_buf_len = __LOG_BUF_LEN;
 
+/*
+ * How much characters can we print in one call of printk before offloading
+ * printing work. Tunable via /proc/sys/kernel/printk_offload_chars.
+ */
+int __read_mostly printk_offload_chars = 1000;
+
 /* cpu currently holding logbuf_lock */
 static volatile unsigned int logbuf_cpu = UINT_MAX;
 
@@ -2026,31 +2032,43 @@ out:
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 }
 
+/* Should we stop printing on this cpu? */
+static bool cpu_stop_printing(int printed_chars)
+{
+	/* Oops? Print everything now to maximize chances user will see it */
+	if (oops_in_progress)
+		return false;
+	return printk_offload_chars && printed_chars > printk_offload_chars;
+}
+
 /**
- * console_unlock - unlock the console system
+ * __console_unlock - unlock the console system
  *
  * Releases the console_lock which the caller holds on the console system
  * and the console driver list.
  *
- * While the console_lock was held, console output may have been buffered
- * by printk().  If this is the case, console_unlock(); emits
- * the output prior to releasing the lock.
+ * While the console_lock was held, console output may have been buffered by
+ * printk(). If this is the case, __console_unlock() emits the output prior to
+ * releasing the lock. However we need not write all the data in the buffer if
+ * we would hog the CPU for too long. Function returns true, if there's more
+ * data that needs printing in the buffer.
  *
  * If there is output waiting, we wake /dev/kmsg and syslog() users.
  *
- * console_unlock(); may be called from any context.
+ * __console_unlock(); may be called from any context.
  */
-void console_unlock(void)
+static bool __console_unlock(void)
 {
 	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
 	bool retry;
+	int printed_chars = 0;
 
 	if (console_suspended) {
 		up(&console_sem);
-		return;
+		return false;
 	}
 
 	console_may_schedule = 0;
@@ -2078,6 +2096,8 @@ again:
 skip:
 		if (console_seq == log_next_seq)
 			break;
+		if (cpu_stop_printing(printed_chars))
+			break;
 
 		msg = log_from_idx(console_idx);
 		if (msg->flags & LOG_NOCONS) {
@@ -2108,6 +2128,7 @@ skip:
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(level, text, len);
 		start_critical_timings();
+		printed_chars += len;
 		local_irq_restore(flags);
 	}
 	console_locked = 0;
@@ -2131,13 +2152,20 @@ skip:
 	retry = console_seq != log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	if (retry && console_trylock())
-		goto again;
+	if (retry && !cpu_stop_printing(printed_chars)) {
+		if (console_trylock())
+			goto again;
+		/*
+		 * Someone else is printing so the caller doesn't have to
+		 * schedule irq work
+		 */
+		retry = false;
+	}
 
 	if (wake_klogd)
 		wake_up_klogd();
+	return retry;
 }
-EXPORT_SYMBOL(console_unlock);
 
 /**
  * console_conditional_schedule - yield the CPU if required
@@ -2508,6 +2536,20 @@ int printk_sched(const char *fmt, ...)
 }
 
 /*
+ * This is a wrapper for __console_unlock() that makes sure the rest of buffer
+ * is printed in future.
+ */
+void console_unlock(void)
+{
+       if (__console_unlock()) {
+               /* Leave the rest of printing for a timer tick */
+		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	}
+}
+EXPORT_SYMBOL(console_unlock);
+
+/*
  * printk rate limiting, lifted from the networking subsystem.
  *
  * This enforces a rate limit: not more than 10 kernel messages
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 07f6fc4..002788f 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -772,6 +772,14 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &ten_thousand,
 	},
 	{
+		.procname	= "printk_offload_chars",
+		.data		= &printk_offload_chars,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+	},
+	{
 		.procname	= "dmesg_restrict",
 		.data		= &dmesg_restrict,
 		.maxlen		= sizeof(int),
-- 
1.8.1.4



* [PATCH 4/4] printk: Use unbound irq work for printing and waking
  2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
                   ` (2 preceding siblings ...)
  2013-08-21  8:08 ` [PATCH 3/4] printk: Defer printing to irq work when we printed too much Jan Kara
@ 2013-08-21  8:08 ` Jan Kara
  2013-08-21 19:24   ` Steven Rostedt
  2013-08-21 21:27 ` [PATCH 0/4 v6] Avoid softlockups in console_unlock() Andrew Morton
  4 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-08-21  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, hare, Steven Rostedt, Jan Kara

Now that the per-cpu printk buffers are gone, there's no need to have
printk flags or the printk irq_work per cpu. Just make printk_pending a
single variable operated on by atomic operations and have a single unbound
irq work do the waking and printing. This has the advantage that any cpu
can do the printing / wakeup work, thus lowering the latency of printing
and better distributing the printing work over cpus.
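The single pending word works like a small bitmask mailbox: producers set
bits, and whichever CPU runs the irq work atomically takes and clears all of
them at once. A userspace sketch with C11 atomics (not the kernel's
set_bit()/xchg(), but the same idea; names are stand-ins):

```c
#include <stdatomic.h>

#define PENDING_WAKEUP	0
#define PENDING_OUTPUT	1

/* Stand-in for the single printk_pending word */
static atomic_ulong printk_pending_model;

/* cf. set_bit(): any producer can mark work as pending */
static void set_pending(int bit)
{
	atomic_fetch_or(&printk_pending_model, 1UL << bit);
}

/* cf. xchg(&printk_pending, 0): the consumer takes all pending bits
 * in one atomic step, so each request is handled exactly once */
static unsigned long take_pending(void)
{
	return atomic_exchange(&printk_pending_model, 0);
}
```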

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e97971b..d5ee3d0 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2486,36 +2486,41 @@ late_initcall(printk_late_init);
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
+#define PRINTK_PENDING_WAKEUP	0
+#define PRINTK_PENDING_OUTPUT	1
 
-static DEFINE_PER_CPU(int, printk_pending);
+static unsigned long printk_pending;
 
-static void wake_up_klogd_work_func(struct irq_work *irq_work)
+static void printk_irq_work_func(struct irq_work *irq_work)
 {
-	int pending = __this_cpu_xchg(printk_pending, 0);
+	if (printk_pending) {
+		unsigned long pending = xchg(&printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_OUTPUT) {
-		/* If trylock fails, someone else is doing the printing */
-		if (console_trylock())
-			console_unlock();
-	}
+		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
+			/*
+			 * If trylock fails, someone else is doing the
+			 * printing
+			 */
+			if (console_trylock())
+				console_unlock();
+		}
 
-	if (pending & PRINTK_PENDING_WAKEUP)
-		wake_up_interruptible(&log_wait);
+		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
+			wake_up_interruptible(&log_wait);
+	}
 }
 
-static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
-	.func = wake_up_klogd_work_func,
-	.flags = IRQ_WORK_LAZY,
+static struct irq_work printk_irq_work = {
+	.func = printk_irq_work_func,
+	.flags = IRQ_WORK_UNBOUND,
 };
 
 void wake_up_klogd(void)
 {
 	preempt_disable();
 	if (waitqueue_active(&log_wait)) {
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		set_bit(PRINTK_PENDING_WAKEUP, &printk_pending);
+		irq_work_queue(&printk_irq_work);
 	}
 	preempt_enable();
 }
@@ -2529,8 +2534,8 @@ int printk_sched(const char *fmt, ...)
 	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+	irq_work_queue(&printk_irq_work);
 
 	return r;
 }
@@ -2543,8 +2548,8 @@ void console_unlock(void)
 {
        if (__console_unlock()) {
                /* Leave the rest of printing for a timer tick */
-		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+		irq_work_queue(&printk_irq_work);
 	}
 }
 EXPORT_SYMBOL(console_unlock);
-- 
1.8.1.4



* Re: [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu
  2013-08-21  8:08 ` [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu Jan Kara
@ 2013-08-21 18:49   ` Steven Rostedt
  2013-09-05 15:56     ` Jan Kara
  0 siblings, 1 reply; 22+ messages in thread
From: Steven Rostedt @ 2013-08-21 18:49 UTC (permalink / raw)
  To: Jan Kara; +Cc: Andrew Morton, LKML, mhocko, hare

On Wed, 21 Aug 2013 10:08:30 +0200
Jan Kara <jack@suse.cz> wrote:

  
>  struct irq_work {
>  	unsigned long flags;
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 55fcce6..446cd81 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -22,6 +22,9 @@
>  static DEFINE_PER_CPU(struct llist_head, irq_work_list);
>  static DEFINE_PER_CPU(int, irq_work_raised);
>  
> +/* List of irq-work any CPU can pick up */
> +static LLIST_HEAD(unbound_irq_work_list);
> +
>  /*
>   * Claim the entry so that no one else will poke at it.
>   */
> @@ -70,12 +73,16 @@ void irq_work_queue(struct irq_work *work)
>  	/* Queue the entry and raise the IPI if needed. */
>  	preempt_disable();
>  
> -	llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
> +	if (!(work->flags & __IRQ_WORK_UNBOUND))
> +		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
> +	else
> +		llist_add(&work->llnode, &unbound_irq_work_list);


Just for better readability I would have:

	if (work->flags & __IRQ_WORK_UNBOUND)
		llist_add(&work->llnode, &unbound_irq_work_list);
	else
		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));

>  
>  	/*
>  	 * If the work is not "lazy" or the tick is stopped, raise the irq
>  	 * work interrupt (if supported by the arch), otherwise, just wait
> -	 * for the next tick.
> +	 * for the next tick. We do this even for unbound work to make sure
> +	 * *some* CPU will be doing the work.
>  	 */
>  	if (!(work->flags & IRQ_WORK_LAZY) || tick_nohz_tick_stopped()) {
>  		if (!this_cpu_cmpxchg(irq_work_raised, 0, 1))
> @@ -100,28 +107,11 @@ bool irq_work_needs_cpu(void)
>  	return true;
>  }
>  
> -static void __irq_work_run(void)
> +static void process_irq_work_list(struct llist_node *llnode)
>  {
>  	unsigned long flags;
>  	struct irq_work *work;
> -	struct llist_head *this_list;
> -	struct llist_node *llnode;
> -
> -
> -	/*
> -	 * Reset the "raised" state right before we check the list because
> -	 * an NMI may enqueue after we find the list empty from the runner.
> -	 */
> -	__this_cpu_write(irq_work_raised, 0);
> -	barrier();
>  
> -	this_list = &__get_cpu_var(irq_work_list);
> -	if (llist_empty(this_list))
> -		return;
> -
> -	BUG_ON(!irqs_disabled());
> -
> -	llnode = llist_del_all(this_list);
>  	while (llnode != NULL) {
>  		work = llist_entry(llnode, struct irq_work, llnode);
>  
> @@ -146,6 +136,27 @@ static void __irq_work_run(void)
>  	}
>  }
>  
> +static void __irq_work_run(void)
> +{
> +	struct llist_head *this_list;
> +
> +	/*
> +	 * Reset the "raised" state right before we check the list because
> +	 * an NMI may enqueue after we find the list empty from the runner.
> +	 */
> +	__this_cpu_write(irq_work_raised, 0);
> +	barrier();
> +
> +	this_list = &__get_cpu_var(irq_work_list);
> +	if (llist_empty(this_list) && llist_empty(&unbound_irq_work_list))
> +		return;
> +
> +	BUG_ON(!irqs_disabled());
> +
> +	process_irq_work_list(llist_del_all(this_list));
> +	process_irq_work_list(llist_del_all(&unbound_irq_work_list));

OK, I'm being a bit of a micro-optimization freak here, but...

How about moving the llist_empty() and BUG_ON() checks into
process_irq_work_list()?

That is,

	if (list_empty(llnode))
		return;

	BUG_ON(!irqs_disabled());

That way we avoid the double xchg() that is done with the two calls to
llist_del_all().

-- Steve



> +}
> +
>  /*
>   * Run the irq_work entries on this cpu. Requires to be ran from hardirq
>   * context with local IRQs disabled.



* Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much
  2013-08-21  8:08 ` [PATCH 3/4] printk: Defer printing to irq work when we printed too much Jan Kara
@ 2013-08-21 19:06   ` Steven Rostedt
  0 siblings, 0 replies; 22+ messages in thread
From: Steven Rostedt @ 2013-08-21 19:06 UTC (permalink / raw)
  To: Jan Kara; +Cc: Andrew Morton, LKML, mhocko, hare

On Wed, 21 Aug 2013 10:08:31 +0200
Jan Kara <jack@suse.cz> wrote:

> A CPU can be caught in console_unlock() for a long time (tens of seconds
> are reported by our customers) when other CPUs are using printk heavily
> and serial console makes printing slow. Despite serial console drivers
> are calling touch_nmi_watchdog() this triggers softlockup warnings
> because interrupts are disabled for the whole time console_unlock() runs
> (e.g. vprintk() calls console_unlock() with interrupts disabled). Thus
> IPIs cannot be processed and other CPUs get stuck spinning in calls like
> smp_call_function_many(). Also RCU eventually starts reporting lockups.
> 
> In my artifical testing I can also easily trigger a situation when disk
> disappears from the system apparently because commands to / from it
> could not be delivered for long enough. This is why just silencing
> watchdogs isn't a reliable solution to the problem and we simply have to
> avoid spending too long in console_unlock() with interrupts disabled.
> 
> The solution this patch works toward is to postpone printing to a later
> moment / different CPU once we have already printed over 1000 characters
> in the current console_unlock() invocation. This is a crude heuristic,
> but measuring the time spent printing doesn't seem viable - we cannot
> rely on high resolution time being available, and with interrupts
> disabled jiffies are not updated. The value 1000 was chosen so that
> things are still bearable with a 9600 baud serial console while, OTOH,
> it shouldn't cause offloading of printing in common cases.
> 

BTW, you can also test by running:

 # cd /sys/kernel/debug/tracing
 # echo function > current_tracer
 # sleep 1
 # echo z > /proc/sysrq-trigger


> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

* Re: [PATCH 4/4] printk: Use unbound irq work for printing and waking
  2013-08-21  8:08 ` [PATCH 4/4] printk: Use unbound irq work for printing and waking Jan Kara
@ 2013-08-21 19:24   ` Steven Rostedt
  0 siblings, 0 replies; 22+ messages in thread
From: Steven Rostedt @ 2013-08-21 19:24 UTC (permalink / raw)
  To: Jan Kara; +Cc: Andrew Morton, LKML, mhocko, hare

On Wed, 21 Aug 2013 10:08:32 +0200
Jan Kara <jack@suse.cz> wrote:

> Now that per-cpu printk buffers are gone, there's no need to have printk
> flags or printk irq_work per cpu. Just make printk_pending a single
> variable operated on by atomic operations and have a single unbound irq
> work do the waking and printing. This has the advantage that any CPU can
> do the printing / wakeup work, thus lowering the latency of printing and
> distributing the printing work better over CPUs.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
                   ` (3 preceding siblings ...)
  2013-08-21  8:08 ` [PATCH 4/4] printk: Use unbound irq work for printing and waking Jan Kara
@ 2013-08-21 21:27 ` Andrew Morton
  2013-08-21 22:59   ` Jan Kara
  4 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2013-08-21 21:27 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, mhocko, hare, Steven Rostedt

On Wed, 21 Aug 2013 10:08:28 +0200 Jan Kara <jack@suse.cz> wrote:

> These patches avoid softlockups when a CPU gets caught in console_unlock() for
> a long time during heavy printing from other CPU. As is discussed in patch 3/4
> it isn't enough to just silence the watchdog because if CPU spends too long in
> console_unlock() also RCU will complain, other CPUs can be blocked waiting for
> printing CPU to process IPI, and even disk can be offlined because commands
> couldn't be delivered to it for too long.
> 
> This patch series solves the problem by stopping printing in console_unlock()
> after 1000 characters and the printing is postponed to irq work. To avoid
> hogging a single CPU (irq work gets processed on the same CPU where it was
> queued so it doesn't really help to reduce the printing load on that CPU) we
> introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
> processed by any CPU.

I still hate the patchset :(

Remind us why we need this?  Whose kernel is spewing so much logging and why?

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-21 21:27 ` [PATCH 0/4 v6] Avoid softlockups in console_unlock() Andrew Morton
@ 2013-08-21 22:59   ` Jan Kara
  2013-08-22 19:49     ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-08-21 22:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Wed 21-08-13 14:27:23, Andrew Morton wrote:
> On Wed, 21 Aug 2013 10:08:28 +0200 Jan Kara <jack@suse.cz> wrote:
> 
> > These patches avoid softlockups when a CPU gets caught in console_unlock() for
> > a long time during heavy printing from other CPU. As is discussed in patch 3/4
> > it isn't enough to just silence the watchdog because if CPU spends too long in
> > console_unlock() also RCU will complain, other CPUs can be blocked waiting for
> > printing CPU to process IPI, and even disk can be offlined because commands
> > couldn't be delivered to it for too long.
> > 
> > This patch series solves the problem by stopping printing in console_unlock()
> > after 1000 characters and the printing is postponed to irq work. To avoid
> > hogging a single CPU (irq work gets processed on the same CPU where it was
> > queued so it doesn't really help to reduce the printing load on that CPU) we
> > introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
> > processed by any CPU.
> 
> I still hate the patchset :(
> 
> Remind us why we need this?  Whose kernel is spewing so much logging and
> why?
  We have customers (quite a few of them, actually) who have machines with
lots of SCSI disks attached (due to multipath etc.), and during boot, when
these disks are discovered and partitions are set up, quite a lot of
printing happens - multiplied by the number of devices (1000+), it is too
much for a serial console to handle quickly enough. So these machines
aren't able to boot with a serial console enabled.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-21 22:59   ` Jan Kara
@ 2013-08-22 19:49     ` Andrew Morton
  2013-08-22 21:57       ` Jan Kara
  2013-08-23 19:48       ` Jiri Kosina
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2013-08-22 19:49 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, mhocko, hare, Steven Rostedt

On Thu, 22 Aug 2013 00:59:15 +0200 Jan Kara <jack@suse.cz> wrote:

> On Wed 21-08-13 14:27:23, Andrew Morton wrote:
> > On Wed, 21 Aug 2013 10:08:28 +0200 Jan Kara <jack@suse.cz> wrote:
> > 
> > > These patches avoid softlockups when a CPU gets caught in console_unlock() for
> > > a long time during heavy printing from other CPU. As is discussed in patch 3/4
> > > it isn't enough to just silence the watchdog because if CPU spends too long in
> > > console_unlock() also RCU will complain, other CPUs can be blocked waiting for
> > > printing CPU to process IPI, and even disk can be offlined because commands
> > > couldn't be delivered to it for too long.
> > > 
> > > This patch series solves the problem by stopping printing in console_unlock()
> > > after 1000 characters and the printing is postponed to irq work. To avoid
> > > hogging a single CPU (irq work gets processed on the same CPU where it was
> > > queued so it doesn't really help to reduce the printing load on that CPU) we
> > > introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
> > > processed by any CPU.
> > 
> > I still hate the patchset :(
> > 
> > Remind us why we need this?  Whose kernel is spewing so much logging and
> > why?
>   We have customers (quite a few of them actually) which have machines with
> lots of SCSI disks attached (due to multipath etc.) and during boot when
> these disks are discovered and partitions set up quite some printing
> happens - multiplied by the number of devices (1000+) it is too much for a
> serial console to handle quickly enough. So these machines aren't able to
> boot with serial console enabled.

It sounds like rather a corner case, not worth mucking up the critical
core logging code.

Desperately seeking alternatives...

I suppose there's some reason why we can't just make those drivers shut
up?  If the messages are in the log buffer but aren't displayed,
they're still accessible after boot?

Or how about passing those messages over to a kernel thread, to be
printed out at a lower rate?  A linked list and schedule_work() would
suffice.

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-22 19:49     ` Andrew Morton
@ 2013-08-22 21:57       ` Jan Kara
  2013-08-22 22:14         ` Andrew Morton
  2013-08-23 19:48       ` Jiri Kosina
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-08-22 21:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Thu 22-08-13 12:49:13, Andrew Morton wrote:
> On Thu, 22 Aug 2013 00:59:15 +0200 Jan Kara <jack@suse.cz> wrote:
> 
> > On Wed 21-08-13 14:27:23, Andrew Morton wrote:
> > > On Wed, 21 Aug 2013 10:08:28 +0200 Jan Kara <jack@suse.cz> wrote:
> > > 
> > > > These patches avoid softlockups when a CPU gets caught in console_unlock() for
> > > > a long time during heavy printing from other CPU. As is discussed in patch 3/4
> > > > it isn't enough to just silence the watchdog because if CPU spends too long in
> > > > console_unlock() also RCU will complain, other CPUs can be blocked waiting for
> > > > printing CPU to process IPI, and even disk can be offlined because commands
> > > > couldn't be delivered to it for too long.
> > > > 
> > > > This patch series solves the problem by stopping printing in console_unlock()
> > > > after 1000 characters and the printing is postponed to irq work. To avoid
> > > > hogging a single CPU (irq work gets processed on the same CPU where it was
> > > > queued so it doesn't really help to reduce the printing load on that CPU) we
> > > > introduce a new type of lazy irq work - IRQ_WORK_UNBOUND - which can be
> > > > processed by any CPU.
> > > 
> > > I still hate the patchset :(
> > > 
> > > Remind us why we need this?  Whose kernel is spewing so much logging and
> > > why?
> >   We have customers (quite a few of them actually) which have machines with
> > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > these disks are discovered and partitions set up quite some printing
> > happens - multiplied by the number of devices (1000+) it is too much for a
> > serial console to handle quickly enough. So these machines aren't able to
> > boot with serial console enabled.
> 
> It sounds like rather a corner case, not worth mucking up the critical
> core logging code.
> 
> Desperately seeking alternatives...
> 
> I suppose there's some reason why we can't just make those drivers shut
> up?  If the messages are in the log buffer but aren't displayed,
> they're still accessible after boot?
> 
> Or how about passing those messages over to a kernel thread, to be
> printed out at a lower rate?  A linked list and schedule_work() would
> suffice.
  Andrew, you seem really desperate ;-) I don't really like modifying
individual drivers, partitioning code, or the SCSI core to be less verbose -
IMHO that's tilting at windmills, and it's not like any one of those parts
is excessively verbose. Every part prints its bit and it all accumulates. I
cannot really imagine this working long term.

Handing over printing to someone else is exactly what I'm doing - when
there's so much traffic that one CPU is forced to write a lot of stuff on
behalf of other CPUs. The only difference from what you suggest seems to be
that you would like to explicitly mark printks that can be passed to someone
else. We could technically do that, but I have trouble seeing how to
identify which printks to mark - I could experimentally find them for some
machine, but finding them for all machines is difficult. And then you have
cases like 'echo t >/proc/sysrq-trigger', which currently kills a machine
with a serial console and lots of processes (I've tried that, although no
customer has complained about it yet). And marking 'less important' printks
wouldn't bring any code simplification anyway, since you would still have
to handle offload for the marked printks.

So as much as I understand that the uncertainty of giving up the printing
CPU and relying on a timer tick on some CPU to pick up printing disturbs
you, it seems like the most maintainable solution to me... Or do you have
other concerns?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-22 21:57       ` Jan Kara
@ 2013-08-22 22:14         ` Andrew Morton
  2013-08-23 19:50           ` Jiri Kosina
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2013-08-22 22:14 UTC (permalink / raw)
  To: Jan Kara; +Cc: LKML, mhocko, hare, Steven Rostedt

On Thu, 22 Aug 2013 23:57:42 +0200 Jan Kara <jack@suse.cz> wrote:

> On Thu 22-08-13 12:49:13, Andrew Morton wrote:
> > Desperately seeking alternatives...
> > 
> > I suppose there's some reason why we can't just make those drivers shut
> > up?  If the messages are in the log buffer but aren't displayed,
> > they're still accessible after boot?
> > 
> > Or how about passing those messages over to a kernel thread, to be
> > printed out at a lower rate?  A linked list and schedule_work() would
> > suffice.
> 
>   Andrew, you seem really desperate ;-)

Years of hard experience have taught me: don't muck with printk.  It
needs to be robust, simple and to have minimum dependency on both the
calling environment and on correctly functioning kernel components. 
printk and NMI are the harshest environments we have to deal with.

> I don't really like modifying
> individual drivers, partitioning code, or SCSI core to be less verbose -
> IMHO that's fighting with windmills and it's not like any of those parts is
> excessively verbose. Every part prints its bits and it accumulates. I
> cannot really imagine this would work long term.
> 
> Handing over printing to someone else is exactly what I'm doing

No you aren't - you're mucking with printk!

What I'm proposing is to put a front-end *before* printk.  Thus leaving
printk unmucked with.

But it was just a suggestion.  Maybe there are other ways.  I'm
inviting you to suggest ways of fixing this obscure corner case
without mucking with everyone's printk.


* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-22 19:49     ` Andrew Morton
  2013-08-22 21:57       ` Jan Kara
@ 2013-08-23 19:48       ` Jiri Kosina
  2013-08-23 19:58         ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Jiri Kosina @ 2013-08-23 19:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Thu, 22 Aug 2013, Andrew Morton wrote:

> >   We have customers (quite a few of them actually) which have machines with
> > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > these disks are discovered and partitions set up quite some printing
> > happens - multiplied by the number of devices (1000+) it is too much for a
> > serial console to handle quickly enough. So these machines aren't able to
> > boot with serial console enabled.
> 
> It sounds like rather a corner case, not worth mucking up the critical
> core logging code.

Andrew, I have to admit I don't understand this argument at all. Do you
feel comfortable having code in the kernel that can spin for tens of
seconds with interrupts disabled? Because that's the exact description of
the current printk().

I feel very uncomfortable with a core kernel logging facility having such
a property.

-- 
Jiri Kosina
SUSE Labs

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-22 22:14         ` Andrew Morton
@ 2013-08-23 19:50           ` Jiri Kosina
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Kosina @ 2013-08-23 19:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Thu, 22 Aug 2013, Andrew Morton wrote:

> Years of hard experience have taught me: don't muck with printk.  It
> needs to be robust, simple and to have minimum dependency on both the
> calling environment and on correctly functioning kernel components. 
> printk and NMI are the harshest environments we have to deal with.

NMI is harsh due to its design.

printk() is harsh because of the way we have implemented it. So we'd 
better work on fixing it, methinks.

-- 
Jiri Kosina
SUSE Labs

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-23 19:48       ` Jiri Kosina
@ 2013-08-23 19:58         ` Andrew Morton
  2013-08-26 11:48           ` Jiri Kosina
  2013-09-05 15:46           ` Jan Kara
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2013-08-23 19:58 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Fri, 23 Aug 2013 21:48:36 +0200 (CEST) Jiri Kosina <jkosina@suse.cz> wrote:

> > >   We have customers (quite a few of them actually) which have machines with
> > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > these disks are discovered and partitions set up quite some printing
> > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > serial console to handle quickly enough. So these machines aren't able to
> > > boot with serial console enabled.
> > 
> > It sounds like rather a corner case, not worth mucking up the critical
> > core logging code.
> 
> Andrew, I have to admit I don't understand this argument at all.

Of course you do.  printk should be simple, robust and have minimum
dependency on other kernel parts.

I suppose that if you make the proposed
/proc/sys/kernel/max_printk_chars settable from the boot command line
and default it to zero, any risks are minimized.

Bailing out if oops_in_progress was a good thing also.

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-23 19:58         ` Andrew Morton
@ 2013-08-26 11:48           ` Jiri Kosina
  2013-09-05 15:46           ` Jan Kara
  1 sibling, 0 replies; 22+ messages in thread
From: Jiri Kosina @ 2013-08-26 11:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, LKML, mhocko, hare, Steven Rostedt

On Fri, 23 Aug 2013, Andrew Morton wrote:

> > > >   We have customers (quite a few of them actually) which have machines with
> > > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > > these disks are discovered and partitions set up quite some printing
> > > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > > serial console to handle quickly enough. So these machines aren't able to
> > > > boot with serial console enabled.
> > > 
> > > It sounds like rather a corner case, not worth mucking up the critical
> > > core logging code.
> > 
> > Andrew, I have to admit I don't understand this argument at all.
> 
> Of course you do.  printk should be simple, robust and have minimum
> dependency on other kernel parts.

I agree. That would need a complete printk() rewrite, though. Before that
happens, fixing an obvious lockup in the code seems like a very proper
thing to do.

-- 
Jiri Kosina
SUSE Labs

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-08-23 19:58         ` Andrew Morton
  2013-08-26 11:48           ` Jiri Kosina
@ 2013-09-05 15:46           ` Jan Kara
  2013-09-19 21:26             ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Kara @ 2013-09-05 15:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jiri Kosina, Jan Kara, LKML, mhocko, hare, Steven Rostedt

  Sorry for the delayed reply. I was on vacation...

On Fri 23-08-13 12:58:22, Andrew Morton wrote:
> On Fri, 23 Aug 2013 21:48:36 +0200 (CEST) Jiri Kosina <jkosina@suse.cz> wrote:
> 
> > > >   We have customers (quite a few of them actually) which have machines with
> > > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > > these disks are discovered and partitions set up quite some printing
> > > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > > serial console to handle quickly enough. So these machines aren't able to
> > > > boot with serial console enabled.
> > > 
> > > It sounds like rather a corner case, not worth mucking up the critical
> > > core logging code.
> > 
> > Andrew, I have to admit I don't understand this argument at all.
> 
> Of course you do.  printk should be simple, robust and have minimum
> dependency on other kernel parts.
> 
> I suppose that if you make the proposed
> /proc/sys/kernel/max_printk_chars settable from the boot command line
> and default to zero, any risks are minimized.
  That's easy enough to do, so if it makes you happy I'll go for that.
During my vacation I was also thinking about how I could address some of
your concerns. The only idea I found plausible was a scheme where a CPU
that wants to stop printing would raise a flag but still keep printing,
releasing and reacquiring console_sem from time to time. In
console_trylock_for_printk() we would block waiting for console_sem
if we see the flag raised.

This way we would be guaranteed that someone has really taken over printing
before we leave console_unlock(). We would still need to use irq_work so
that we have someone to take over printing in case a printk storm has
filled our dmesg buffer and we are now slowly getting it out to the
console.

So all in all this would be a bit more complex than my current solution
(an additional flag and some logic around it). The advantage is that we
would rely on irq_work only for achieving reasonable irq latency, but it
wouldn't be necessary for getting printk output to the console. If this
addresses your concerns better I could try implementing that. Thoughts?

> Bailing out if oops_in_progress was a good thing also.
  That's what's happening already, yes.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu
  2013-08-21 18:49   ` Steven Rostedt
@ 2013-09-05 15:56     ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2013-09-05 15:56 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Jan Kara, Andrew Morton, LKML, mhocko, hare

On Wed 21-08-13 14:49:12, Steven Rostedt wrote:
> On Wed, 21 Aug 2013 10:08:30 +0200
> Jan Kara <jack@suse.cz> wrote:
> 
>   
> >  struct irq_work {
> >  	unsigned long flags;
> > diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> > index 55fcce6..446cd81 100644
> > --- a/kernel/irq_work.c
> > +++ b/kernel/irq_work.c
> > @@ -22,6 +22,9 @@
> >  static DEFINE_PER_CPU(struct llist_head, irq_work_list);
> >  static DEFINE_PER_CPU(int, irq_work_raised);
> >  
> > +/* List of irq-work any CPU can pick up */
> > +static LLIST_HEAD(unbound_irq_work_list);
> > +
> >  /*
> >   * Claim the entry so that no one else will poke at it.
> >   */
> > @@ -70,12 +73,16 @@ void irq_work_queue(struct irq_work *work)
> >  	/* Queue the entry and raise the IPI if needed. */
> >  	preempt_disable();
> >  
> > -	llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
> > +	if (!(work->flags & __IRQ_WORK_UNBOUND))
> > +		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
> > +	else
> > +		llist_add(&work->llnode, &unbound_irq_work_list);
> 
> 
> Just for better readability I would have:
> 
> 	if (work->flags & __IRQ_WORK_UNBOUND)
> 		llist_add(&work->llnode, &unbound_irq_work_list);
> 	else
> 		llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
  OK, done.

> >  
> >  	/*
> >  	 * If the work is not "lazy" or the tick is stopped, raise the irq
> >  	 * work interrupt (if supported by the arch), otherwise, just wait
> > -	 * for the next tick.
> > +	 * for the next tick. We do this even for unbound work to make sure
> > +	 * *some* CPU will be doing the work.
> >  	 */
> >  	if (!(work->flags & IRQ_WORK_LAZY) || tick_nohz_tick_stopped()) {
> >  		if (!this_cpu_cmpxchg(irq_work_raised, 0, 1))
> > @@ -100,28 +107,11 @@ bool irq_work_needs_cpu(void)
> >  	return true;
> >  }
> >  
> > -static void __irq_work_run(void)
> > +static void process_irq_work_list(struct llist_node *llnode)
> >  {
> >  	unsigned long flags;
> >  	struct irq_work *work;
> > -	struct llist_head *this_list;
> > -	struct llist_node *llnode;
> > -
> > -
> > -	/*
> > -	 * Reset the "raised" state right before we check the list because
> > -	 * an NMI may enqueue after we find the list empty from the runner.
> > -	 */
> > -	__this_cpu_write(irq_work_raised, 0);
> > -	barrier();
> >  
> > -	this_list = &__get_cpu_var(irq_work_list);
> > -	if (llist_empty(this_list))
> > -		return;
> > -
> > -	BUG_ON(!irqs_disabled());
> > -
> > -	llnode = llist_del_all(this_list);
> >  	while (llnode != NULL) {
> >  		work = llist_entry(llnode, struct irq_work, llnode);
> >  
> > @@ -146,6 +136,27 @@ static void __irq_work_run(void)
> >  	}
> >  }
> >  
> > +static void __irq_work_run(void)
> > +{
> > +	struct llist_head *this_list;
> > +
> > +	/*
> > +	 * Reset the "raised" state right before we check the list because
> > +	 * an NMI may enqueue after we find the list empty from the runner.
> > +	 */
> > +	__this_cpu_write(irq_work_raised, 0);
> > +	barrier();
> > +
> > +	this_list = &__get_cpu_var(irq_work_list);
> > +	if (llist_empty(this_list) && llist_empty(&unbound_irq_work_list))
> > +		return;
> > +
> > +	BUG_ON(!irqs_disabled());
> > +
> > +	process_irq_work_list(llist_del_all(this_list));
> > +	process_irq_work_list(llist_del_all(&unbound_irq_work_list));
> 
> OK, I'm being a bit of an micro optimization freak here, but...
> 
> How about moving the list_empty() and BUG_ON into the
> process_irq_work_list() call.
> 
> That is,
> 
> 	if (list_empty(llnode))
> 		return;
> 
> 	BUG_ON(!irqs_disabled());
> 
> That way we avoid the double xchg() that is done with the two calls to
> llist_del_all().
  OK, done that.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-09-05 15:46           ` Jan Kara
@ 2013-09-19 21:26             ` Andrew Morton
  2013-09-20 21:55               ` Jan Kara
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2013-09-19 21:26 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jiri Kosina, LKML, mhocko, hare, Steven Rostedt

On Thu, 5 Sep 2013 17:46:12 +0200 Jan Kara <jack@suse.cz> wrote:

>   Sorry for a delayed reply. I was on vacation...
> 
> On Fri 23-08-13 12:58:22, Andrew Morton wrote:
> > On Fri, 23 Aug 2013 21:48:36 +0200 (CEST) Jiri Kosina <jkosina@suse.cz> wrote:
> > 
> > > > >   We have customers (quite a few of them actually) which have machines with
> > > > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > > > these disks are discovered and partitions set up quite some printing
> > > > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > > > serial console to handle quickly enough. So these machines aren't able to
> > > > > boot with serial console enabled.
> > > > 
> > > > It sounds like rather a corner case, not worth mucking up the critical
> > > > core logging code.
> > > 
> > > Andrew, I have to admit I don't understand this argument at all.
> > 
> > Of course you do.  printk should be simple, robust and have minimum
> > dependency on other kernel parts.
> > 
> > I suppose that if you make the proposed
> > /proc/sys/kernel/max_printk_chars settable from the boot command line
> > and default to zero, any risks are minimized.
>   That's easy enough to do so if it makes you happy I'll go for that.
> During my vacation I was also thinking how I could address some of your
> concerns. The only idea I found plausible was a scheme where CPU that
> wants to stop printing would raise some flag but still keep printing
> releasing and reacquiring the console_sem from time to time. In
> console_trylock_for_printk() we would block waiting for console_sem
> if we see the flag raised.
> 
> This way we would be guaranteed someone has really taken over printing
> before we leave console_unlock(). We would still need to use irq_work so
> that we have someone to take over printing in case printk storm has filled
> our dmesg buffer and we are now slowly getting it out to the console.
> 
> So all in all this would be a bit more complex than my current solution
> (additional flag and some logic around it). The advantage is that we would
> rely on irq_work only to achieve reasonable irq latency but it won't be
> necessary for getting printk out to console. If this addresses your
> concerns better I could try implementing that. Thoughts?
> 

What driver is in use here, anyway? 
Is it drivers/tty/serial/8250/8250_early.c?

I'm not sure that I fully understand the problem yet.  You say "These
patches avoid softlockups when a CPU gets caught in console_unlock()
for a long time during heavy printing from other CPU".  But *why* is
the printing CPU holding the lock for so long?  A single printk won't
take a huge amount of time, so that CPU must be spinning waiting for
previously-printk'd characters to drain?  Or something else?

See, if we can get to the bottom of this then perhaps we can pace the
printing CPU so that it somehow twiddles thumbs outside console_lock().


* Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()
  2013-09-19 21:26             ` Andrew Morton
@ 2013-09-20 21:55               ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2013-09-20 21:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, Jiri Kosina, LKML, mhocko, hare, Steven Rostedt

On Thu 19-09-13 14:26:27, Andrew Morton wrote:
> On Thu, 5 Sep 2013 17:46:12 +0200 Jan Kara <jack@suse.cz> wrote:
> > On Fri 23-08-13 12:58:22, Andrew Morton wrote:
> > > On Fri, 23 Aug 2013 21:48:36 +0200 (CEST) Jiri Kosina <jkosina@suse.cz> wrote:
> > > 
> > > > > >   We have customers (quite a few of them actually) which have machines with
> > > > > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > > > > these disks are discovered and partitions set up quite some printing
> > > > > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > > > > serial console to handle quickly enough. So these machines aren't able to
> > > > > > boot with serial console enabled.
> > > > > 
> > > > > It sounds like rather a corner case, not worth mucking up the critical
> > > > > core logging code.
> > > > 
> > > > Andrew, I have to admit I don't understand this argument at all.
> > > 
> > > Of course you do.  printk should be simple, robust and have minimum
> > > dependency on other kernel parts.
> > > 
> > > I suppose that if you make the proposed
> > > /proc/sys/kernel/max_printk_chars settable from the boot command line
> > > and default to zero, any risks are minimized.
> >   That's easy enough to do so if it makes you happy I'll go for that.
> > During my vacation I was also thinking how I could address some of your
> > concerns. The only idea I found plausible was a scheme where CPU that
> > wants to stop printing would raise some flag but still keep printing
> > releasing and reacquiring the console_sem from time to time. In
> > console_trylock_for_printk() we would block waiting for console_sem
> > if we see the flag raised.
> > 
> > This way we would be guaranteed someone has really taken over printing
> > before we leave console_unlock(). We would still need to use irq_work so
> > that we have someone to take over printing in case printk storm has filled
> > our dmesg buffer and we are now slowly getting it out to the console.
> > 
> > So all in all this would be a bit more complex than my current solution
> > (additional flag and some logic around it). The advantage is that we would
> > rely on irq_work only to achieve reasonable irq latency but it won't be
> > necessary for getting printk out to console. If this addresses your
> > concerns better I could try implementing that. Thoughts?
> > 
> 
> What driver is in use here, anyway? 
> Is it drivers/tty/serial/8250/8250_early.c?
  It is 8250 serial (at least in the report I have checked now). However I
don't think it's the early console - the lockup reports contain traces e.g.
with serial8250_console_putchar(), which is the standard serial console (but
I may be wrong here).
 
> I'm not sure that I fully understand the problem yet.  You say "These
> patches avoid softlockups when a CPU gets caught in console_unlock()
> for a long time during heavy printing from other CPU".  But *why* is
> the printing CPU holding the lock for so long?  A single printk won't
> take a huge amount of time, so that CPU must be spinning waiting for
> previously-printk'd characters to drain?
  Yes. We want to print a message, we enter console_unlock() on some CPU,
and start sending characters via the serial console. While we are doing that,
other CPUs come and add new messages to the kernel buffer. We also have to
send those to the serial console, which further prolongs the window during
which other CPUs have time to add new messages to the buffer. So the time we
can spend in console_unlock() doing printing is currently unbounded - while
other CPUs keep coming with messages, the current holder of console_sem keeps
printing.

> Or something else?
> See, if we can get to the bottom of this then perhaps we can pace the
> printing CPU so that it somehow twiddles thumbs outside console_lock().
  I've actually tried to do this in some of my previous attempts to tackle
the problem. The problem I hit with that approach was that printk() itself
is often called with interrupts disabled. So to be able to reenable
interrupts within a reasonable time, we have to return from such a printk()
first.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* [PATCH 0/4 v6] Avoid softlockups in console_unlock()
@ 2013-11-07 21:48 Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2013-11-07 21:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, mhocko, Steven Rostedt, Jan Kara

  Hello,

  This is the next iteration of my printk patchset. Since v5 I've made the
limit for printing configurable via a kernel parameter and let it default to 0.
So unless the user sets printk.offload_chars on the kernel command line, there
will be no difference from the current printk behavior.
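For example, enabling the offload on the kernel command line might look like
this (a GRUB-style fragment with an illustrative value; printk.offload_chars
is the parameter name proposed in this series, and 0, the default, keeps
today's behavior):

```shell
# illustrative boot entry fragment; only printk.offload_chars is from
# this series, the rest of the command line is an example
linux /vmlinuz root=/dev/sda1 console=ttyS0,9600 printk.offload_chars=1000
```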

Summary:

These patches avoid softlockups when a CPU gets caught in console_unlock() for
a long time during heavy printing from another CPU. As discussed in patch 3/4,
it isn't enough to just silence the watchdog: if a CPU spends too long in
console_unlock(), RCU will also complain, other CPUs can be blocked waiting for
the printing CPU to process an IPI, and even a disk can be offlined because
interrupts would be disabled for too long.

This patch series solves the problem by stopping printing in console_unlock()
after X (a tunable number of) characters and postponing the rest of the
printing to irq work. To avoid hogging a single CPU (irq work gets processed
on the same CPU where it was queued, so it doesn't really help to reduce the
printing load on that CPU) we introduce a new type of lazy irq work -
IRQ_WORK_UNBOUND - which can be processed by any CPU.

The patch series has survived my testing without any softlockup reports.  I've
tested running sysrq-t (lots of printk output) while inserting modules (to
generate IPIs and also some printk traffic) and also running two delayed works
printing 10 KB of text each. All this was with simulated 9600 baud serial
console.

								Honza


end of thread, other threads:[~2013-11-07 21:48 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-21  8:08 [PATCH 0/4 v6] Avoid softlockups in console_unlock() Jan Kara
2013-08-21  8:08 ` [PATCH 1/4] printk: Remove separate printk_sched buffers and use printk buf instead Jan Kara
2013-08-21  8:08 ` [PATCH 2/4] irq_work: Provide a irq work that can be processed on any cpu Jan Kara
2013-08-21 18:49   ` Steven Rostedt
2013-09-05 15:56     ` Jan Kara
2013-08-21  8:08 ` [PATCH 3/4] printk: Defer printing to irq work when we printed too much Jan Kara
2013-08-21 19:06   ` Steven Rostedt
2013-08-21  8:08 ` [PATCH 4/4] printk: Use unbound irq work for printing and waking Jan Kara
2013-08-21 19:24   ` Steven Rostedt
2013-08-21 21:27 ` [PATCH 0/4 v6] Avoid softlockups in console_unlock() Andrew Morton
2013-08-21 22:59   ` Jan Kara
2013-08-22 19:49     ` Andrew Morton
2013-08-22 21:57       ` Jan Kara
2013-08-22 22:14         ` Andrew Morton
2013-08-23 19:50           ` Jiri Kosina
2013-08-23 19:48       ` Jiri Kosina
2013-08-23 19:58         ` Andrew Morton
2013-08-26 11:48           ` Jiri Kosina
2013-09-05 15:46           ` Jan Kara
2013-09-19 21:26             ` Andrew Morton
2013-09-20 21:55               ` Jan Kara
2013-11-07 21:48 Jan Kara
