All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Petr Mladek <pmladek@suse.cz>, KY Sri nivasan <kys@microsoft.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [PATCH 1/7] printk: Hand over printing to console if printing too long
Date: Tue, 22 Dec 2015 23:48:38 +0900	[thread overview]
Message-ID: <20151222144838.GA611@swordfish> (raw)
In-Reply-To: <20151222134730.GD7266@quack.suse.cz>

On (12/22/15 14:47), Jan Kara wrote:
[..]
> Thanks for looking into my patches and sorry for replying with a delay. As
> I wrote in my previous email [1] even the referenced patches are not quite
> enough. Over last few days I have worked on redoing the stuff as we
> discussed with Linus and Andrew at Kernel Summit and I have new patches
> which are working fine for me. I still want to test them on some machines
> having real issues with udev during boot but so far stress-testing with
> serial console slowed down to ~1000 chars/sec on other machines and VMs
> looks promising.
> 
> I'm attaching them in case you want to have a look. They are on top of
> Tejun's patch adding cond_resched() (which is essential). I'll officially
> submit the patches once the testing is finished (but I'm not sure when I
> get to the problematic HW...).
> 

Hello,

Thanks a lot! Will take a look.

	-ss

> From 2e9675abbfc0df4a24a8c760c58e8150b9a31259 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Mon, 21 Dec 2015 13:10:31 +0100
> Subject: [PATCH 1/2] printk: Make printk() completely async
> 
> Currently, printk() sometimes waits for message to be printed to console
> and sometimes it does not (when console_sem is held by some other
> process). In case printk() grabs console_sem and starts printing to
> console, it prints messages from kernel printk buffer until the buffer
> is empty. When serial console is attached, printing is slow and thus
> other CPUs in the system have plenty of time to append new messages to
> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> amount of time doing printing in console_unlock().  This is especially
> serious problem if the printk() calling console_unlock() was called with
> interrupts disabled.
> 
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Another observed issue is that due to slow printk,
> hardware discovery is slow and udev times out before kernel manages to
> discover all the attached HW. Also during artificial stress testing SATA
> disk disappears from the system because its interrupts aren't served for
> too long.
> 
> This patch makes printk() completely asynchronous (similar to what
> printk_deferred() did until now). It appends message to the kernel
> printk buffer and queues work to do the printing to console. This has
> the advantage that printing always happens from a schedulable contex and
> thus we don't lockup any particular CPU or even interrupts. Also it has
> the advantage that printk() is fast and thus kernel booting is not
> slowed down by slow serial console. Disadvantage of this method is that
> in case of crash there is higher chance that important messages won't
> appear in console output (we may need working scheduling to print
> message to console). We somewhat mitigate this risk by switching printk
> to the original method of immediate printing to console if oops is in
> progress.  Also for debugging purposes we provide printk.synchronous
> kernel parameter which resorts to the original printk behavior.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  Documentation/kernel-parameters.txt |  10 +++
>  kernel/printk/printk.c              | 144 +++++++++++++++++++++---------------
>  2 files changed, 95 insertions(+), 59 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 742f69d18fc8..4cf1bddeffc7 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3000,6 +3000,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  	printk.time=	Show timing data prefixed to each printk message line
>  			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
>  
> +	printk.synchronous=
> +			By default kernel messages are printed to console
> +			asynchronously (except during early boot or when oops
> +			is happening). That avoids kernel stalling behind slow
> +			serial console and thus avoids softlockups, interrupt
> +			timeouts, or userspace timing out during heavy printing.
> +			However for debugging problems, printing messages to
> +			console immediately may be desirable. This option
> +			enables such behavior.
> +
>  	processor.max_cstate=	[HW,ACPI]
>  			Limit processor to maximum C-state
>  			max_cstate=9 overrides any DMI blacklist limit.
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 299c2f0e7350..d455d1bd0d2c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -283,6 +283,73 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
>  static char *log_buf = __log_buf;
>  static u32 log_buf_len = __LOG_BUF_LEN;
>  
> +/*
> + * When true, printing to console will happen synchronously unless someone else
> + * is already printing messages.
> + */
> +static bool __read_mostly printk_sync;
> +module_param_named(synchronous, printk_sync, bool, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
> +
> +#define PRINTK_PENDING_WAKEUP	0x01
> +#define PRINTK_PENDING_OUTPUT	0x02
> +
> +static DEFINE_PER_CPU(int, printk_pending);
> +
> +static void printing_work_func(struct work_struct *work)
> +{
> +	console_lock();
> +	console_unlock();
> +}
> +
> +static DECLARE_WORK(printing_work, printing_work_func);
> +
> +static void wake_up_klogd_work_func(struct irq_work *irq_work)
> +{
> +	int pending = __this_cpu_xchg(printk_pending, 0);
> +
> +	/*
> +	 * We just schedule regular work to do the printing from irq work. We
> +	 * don't want to do printing here directly as that happens with
> +	 * interrupts disabled and thus is bad for interrupt latency. We also
> +	 * don't want to queue regular work from vprintk_emit() as that gets
> +	 * called in various difficult contexts where schedule_work() could
> +	 * deadlock.
> +	 */
> +	if (pending & PRINTK_PENDING_OUTPUT)
> +		schedule_work(&printing_work);
> +
> +	if (pending & PRINTK_PENDING_WAKEUP)
> +		wake_up_interruptible(&log_wait);
> +}
> +
> +static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
> +	.func = wake_up_klogd_work_func,
> +	.flags = IRQ_WORK_LAZY,
> +};
> +
> +void wake_up_klogd(void)
> +{
> +	preempt_disable();
> +	if (waitqueue_active(&log_wait)) {
> +		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
> +		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> +	}
> +	preempt_enable();
> +}
> +
> +int printk_deferred(const char *fmt, ...)
> +{
> +	va_list args;
> +	int r;
> +
> +	va_start(args, fmt);
> +	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> +	va_end(args);
> +
> +	return r;
> +}
> +
>  /* Return log buffer address */
>  char *log_buf_addr_get(void)
>  {
> @@ -1668,15 +1735,14 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> -	bool in_sched = false;
> +	bool sync_print = printk_sync;
>  	/* cpu currently holding logbuf_lock in this function */
>  	static unsigned int logbuf_cpu = UINT_MAX;
>  
>  	if (level == LOGLEVEL_SCHED) {
>  		level = LOGLEVEL_DEFAULT;
> -		in_sched = true;
> +		sync_print = false;
>  	}
> -
>  	boot_delay_msec(level);
>  	printk_delay();
>  
> @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	logbuf_cpu = UINT_MAX;
>  	raw_spin_unlock(&logbuf_lock);
>  	lockdep_on();
> +	/*
> +	 * By default we print message to console asynchronously so that kernel
> +	 * doesn't get stalled due to slow serial console. That can lead to
> +	 * softlockups, lost interrupts, or userspace timing out under heavy
> +	 * printing load.
> +	 *
> +	 * However we resort to synchronous printing of messages during early
> +	 * boot, when oops is in progress, or when synchronous printing was
> +	 * explicitely requested by kernel parameter.
> +	 */
> +	if (keventd_up() && !oops_in_progress && !sync_print) {
> +		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> +		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> +	} else
> +		sync_print = true;
>  	local_irq_restore(flags);
>  
> -	/* If called from the scheduler, we can not call up(). */
> -	if (!in_sched) {
> +	if (sync_print) {
>  		lockdep_off();
>  		/*
>  		 * Disable preemption to avoid being preempted while holding
> @@ -2688,60 +2768,6 @@ late_initcall(printk_late_init);
>  
>  #if defined CONFIG_PRINTK
>  /*
> - * Delayed printk version, for scheduler-internal messages:
> - */
> -#define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_OUTPUT	0x02
> -
> -static DEFINE_PER_CPU(int, printk_pending);
> -
> -static void wake_up_klogd_work_func(struct irq_work *irq_work)
> -{
> -	int pending = __this_cpu_xchg(printk_pending, 0);
> -
> -	if (pending & PRINTK_PENDING_OUTPUT) {
> -		/* If trylock fails, someone else is doing the printing */
> -		if (console_trylock())
> -			console_unlock();
> -	}
> -
> -	if (pending & PRINTK_PENDING_WAKEUP)
> -		wake_up_interruptible(&log_wait);
> -}
> -
> -static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
> -	.func = wake_up_klogd_work_func,
> -	.flags = IRQ_WORK_LAZY,
> -};
> -
> -void wake_up_klogd(void)
> -{
> -	preempt_disable();
> -	if (waitqueue_active(&log_wait)) {
> -		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
> -		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> -	}
> -	preempt_enable();
> -}
> -
> -int printk_deferred(const char *fmt, ...)
> -{
> -	va_list args;
> -	int r;
> -
> -	preempt_disable();
> -	va_start(args, fmt);
> -	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> -	va_end(args);
> -
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> -	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> -	preempt_enable();
> -
> -	return r;
> -}
> -
> -/*
>   * printk rate limiting, lifted from the networking subsystem.
>   *
>   * This enforces a rate limit: not more than 10 kernel messages
> -- 
> 2.6.2
> 

> From be116ae18f15f0d2d05ddf0b53eaac184943d312 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Mon, 21 Dec 2015 14:26:13 +0100
> Subject: [PATCH 2/2] printk: Skip messages on oops
> 
> When there are too many messages in the kernel printk buffer it can take
> very long to print them to console (especially when using slow serial
> console). This is undesirable during oops so when we encounter oops and
> there are more than 100 messages to print, print just the newest 100
> messages and then the oops message.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/printk/printk.c | 34 +++++++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index d455d1bd0d2c..fc67ab70e9c7 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -262,6 +262,9 @@ static u64 console_seq;
>  static u32 console_idx;
>  static enum log_flags console_prev;
>  
> +/* current record sequence when oops happened */
> +static u64 oops_start_seq;
> +
>  /* the next printk record to read after the last 'clear' command */
>  static u64 clear_seq;
>  static u32 clear_idx;
> @@ -1783,6 +1786,8 @@ asmlinkage int vprintk_emit(int facility, int level,
>  					 NULL, 0, recursion_msg,
>  					 strlen(recursion_msg));
>  	}
> +	if (oops_in_progress && !sync_print && !oops_start_seq)
> +		oops_start_seq = log_next_seq;
>  
>  	/*
>  	 * The printf needs to come first; we need the syslog
> @@ -2292,6 +2297,12 @@ out:
>  	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
>  }
>  
> +/*
> + * When oops happens and there are more messages to be printed in the printk
> + * buffer that this, skip some mesages and print only this many newest messages.
> + */
> +#define PRINT_MSGS_BEFORE_OOPS 100
> +
>  /**
>   * console_unlock - unlock the console system
>   *
> @@ -2348,7 +2359,28 @@ again:
>  			seen_seq = log_next_seq;
>  		}
>  
> -		if (console_seq < log_first_seq) {
> +		/*
> +		 * If oops happened and there are more than
> +		 * PRINT_MSGS_BEFORE_OOPS messages pending before oops message,
> +		 * skip them to make oops appear faster.
> +		 */
> +		if (oops_start_seq &&
> +		    console_seq + PRINT_MSGS_BEFORE_OOPS < oops_start_seq) {
> +			len = sprintf(text,
> +				"** %u printk messages dropped due to oops ** ",
> +				(unsigned)(oops_start_seq - console_seq -
> +					   PRINT_MSGS_BEFORE_OOPS));
> +			if (console_seq < log_first_seq) {
> +				console_seq = log_first_seq;
> +				console_idx = log_first_idx;
> +			}
> +			while (console_seq <
> +			       oops_start_seq - PRINT_MSGS_BEFORE_OOPS) {
> +				console_idx = log_next(console_idx);
> +				console_seq++;
> +			}
> +			console_prev = 0;
> +		} else if (console_seq < log_first_seq) {
>  			len = sprintf(text, "** %u printk messages dropped ** ",
>  				      (unsigned)(log_first_seq - console_seq));
>  
> -- 
> 2.6.2
> 


  reply	other threads:[~2015-12-22 14:50 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-10 14:52 [PATCH 1/7] printk: Hand over printing to console if printing too long Sergey Senozhatsky
2015-12-10 15:24 ` Sergey Senozhatsky
2015-12-11  4:27 ` Sergey Senozhatsky
2015-12-11  6:29   ` Sergey Senozhatsky
2015-12-22 13:47 ` Jan Kara
2015-12-22 14:48   ` Sergey Senozhatsky [this message]
2015-12-23  1:54   ` Sergey Senozhatsky
2015-12-23  3:37     ` Sergey Senozhatsky
2015-12-23  3:57       ` Sergey Senozhatsky
2015-12-23  4:15         ` Sergey Senozhatsky
2016-01-05 14:37     ` Jan Kara
2016-01-06  1:41       ` Sergey Senozhatsky
2016-01-06  6:48       ` Sergey Senozhatsky
2016-01-06 12:25         ` Jan Kara
2016-01-11 13:25           ` Sergey Senozhatsky
2015-12-31  2:44   ` Sergey Senozhatsky
2015-12-31  3:13     ` Sergey Senozhatsky
2015-12-31  4:58       ` Sergey Senozhatsky
2016-01-05 14:48         ` Jan Kara
2016-01-06  3:38           ` Sergey Senozhatsky
2016-01-06  8:36             ` Sergey Senozhatsky
2016-01-06 10:21               ` Jan Kara
2016-01-06 11:10                 ` Sergey Senozhatsky
2016-01-11 12:54   ` Petr Mladek
2016-01-12 14:00     ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2015-10-26  4:52 [PATCH 0/6 v2] printk: Softlockup avoidance Jan Kara
2015-10-26  4:52 ` [PATCH 1/7] printk: Hand over printing to console if printing too long Jan Kara
2016-03-01 17:22   ` Denys Vlasenko
2016-03-02  9:30     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151222144838.GA611@swordfish \
    --to=sergey.senozhatsky@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.cz \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.