From: Tetsuo Handa
To: joe@perches.com, linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, pmladek@suse.com,
    sergey.senozhatsky@gmail.com, rostedt@goodmis.org
Subject: Re: [PATCH] printk: Add best-effort printk() buffering.
Date: Wed, 3 May 2017 15:21:56 +0900
Message-Id: <201705031521.EIJ39594.MFtOVOHSFLFOJQ@I-love.SAKURA.ne.jp>
In-Reply-To: <1493568716.1874.7.camel@perches.com>
References: <1493560477-3016-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
 <1493568716.1874.7.camel@perches.com>

Joe Perches wrote:
> On Sun, 2017-04-30 at 22:54 +0900, Tetsuo Handa wrote:
> > Sometimes we want to printk() multiple lines in a group without being
> > disturbed by concurrent printk() from interrupts and/or other threads.
> > For example, mixed printk() output of multiple threads' dumps makes it
> > hard to interpret.
> >
> > This patch introduces fixed-sized statically allocated buffers for
> > buffering printk() output for each thread/context in best effort
> > (i.e. up to PAGE_SIZE bytes, up to 16 concurrent printk() users).
> []
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> []
> > +#define MAX_PRINTK_BUFFERS 16
> > +static struct printk_buffer {
> > +	unsigned long context;	/* printk_context() */
> > +	unsigned int nested;
> > +	unsigned int used;	/* Valid bytes in buf[]. */
> > +	char buf[PAGE_SIZE];
> > +} printk_buffers[MAX_PRINTK_BUFFERS];
>
> Perhaps these buffers could be acquired by
> alloc_page rather than be static structures and
> the sizeof buf[PAGE_SIZE] should be reduced by
> 	sizeof(unsigned long) +
> 	sizeof(unsigned int) +
> 	sizeof(unsigned int)
> so that struct printk_buffer is exactly
> PAGE_SIZE.

When should the buffers be allocated?

If upon boot, there will be little difference from static allocation. If
on first use of each buffer, we introduce a risk of failing to allocate
memory with alloc_page(GFP_ATOMIC | __GFP_NOWARN), and a risk of stack
overflow inside alloc_page(), because printk() has to be prepared for
being called from stack-tight situations.

Also, while dynamic allocation would allow keeping the buffers on a
linked list, we would need to introduce a lock for traversing that list,
which might become more expensive than walking a fixed-sized array of
buffers. We could avoid list traversal by passing a "struct
printk_buffer" argument around, but since there are so many functions
which expect pr_cont() behavior, scattering a "struct printk_buffer"
argument would be a big patch. We could avoid list traversal by storing
a "struct printk_buffer *[3]" array inside "struct task_struct" (for
hard IRQ context, soft IRQ context and task context), but occupying
sizeof(struct printk_buffer *) * 3 bytes per task only for buffers which
will be used only for a moment is not smart.

Thus, I think fixed-sized statically allocated buffers are the most
reasonable choice.
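To illustrate what "walking a fixed-sized array" means here, a minimal
sketch of the lookup helper, written against the printk_buffers[] array
quoted above (this is just an illustration of the idea; the actual
find_printk_buffer() in the patch may differ in detail):

static struct printk_buffer *find_printk_buffer(void)
{
	const unsigned long context = printk_context();
	int i;

	/*
	 * No lock is needed for this walk: a slot's ->context field is
	 * claimed atomically via cmpxchg() and matches only the context
	 * that claimed it, so a stale read simply skips that slot.
	 */
	for (i = 0; i < MAX_PRINTK_BUFFERS; i++)
		if (printk_buffers[i].context == context)
			return &printk_buffers[i];
	return NULL;
}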
Using a CONFIG_ option for controlling how many pages should be
allocated for "struct printk_buffer" might make sense for systems with
little RAM.

Tetsuo Handa wrote:
> +void get_printk_buffer(void)
> +{
> +	const unsigned long context = printk_context();
> +	int i;
> +
> +	/* No-op if called from NMI context. */
> +	if ((context & 3) == 3)
> +		return;
> +	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
> +		struct printk_buffer *ptr = &printk_buffers[i];
> +
> +		if (ptr->context != context) {
> +			if (ptr->context ||
> +			    cmpxchg(&ptr->context, 0, context))
> +				continue;
> +			ptr->nested = 0;
> +			ptr->used = 0;
> +		} else {
> +			ptr->nested++;
> +		}
> +		break;
> +	}
> +}

Oops, I over-optimized here. We must not reserve a ptr->context == 0
slot before checking whether some slot already holds this context, or a
nested call may grab a second slot instead of reusing the outer one. It
should be:

void get_printk_buffer(void)
{
	const unsigned long context = printk_context();
	struct printk_buffer *ptr;
	int i;

	/* No-op if called from NMI context. */
	if ((context & 3) == 3)
		return;
	ptr = find_printk_buffer();
	if (ptr) {
		ptr->nested++;
		return;
	}
	for (i = 0; i < MAX_PRINTK_BUFFERS; i++) {
		ptr = &printk_buffers[i];
		if (ptr->context || cmpxchg(&ptr->context, 0, context))
			continue;
		ptr->nested = 0;
		ptr->used = 0;
		break;
	}
}

> int vprintk_default(const char *fmt, va_list args)
> {
> +	struct printk_buffer *ptr;
> +	va_list args2;
> +	unsigned int i;
> 	int r;
>
> #ifdef CONFIG_KGDB_KDB
> @@ -1806,6 +1960,43 @@ int vprintk_default(const char *fmt, va_list args)
> 		return r;
> 	}
> #endif
> +	ptr = find_printk_buffer();
> +	if (!ptr)
> +		goto use_unbuffered;
> +	/*
> +	 * Try to store to printk_buffer first. If it fails, flush completed
> +	 * lines in printk_buffer, and then try to store to printk_buffer
> +	 * again. If it still fails, flush incomplete line in printk_buffer
> +	 * and use unbuffered printing.
> +	 *
> +	 * Since printk_buffer is identified by current thread and interrupt
> +	 * context and same level of context does not recurse, we don't need
> +	 * logbuf_lock_irqsave()/logbuf_unlock_irqrestore() here except
> +	 * __flush_printk_buffer().
> +	 */

Before I continue, I'd like to confirm that this assumption is correct.
My understanding is that

  (1) task context can be preempted by soft IRQ context, hard IRQ
      context and NMI context.

  (2) soft IRQ context can be preempted by hard IRQ context and NMI
      context.

  (3) hard IRQ context can be preempted by NMI context.

  (4) a kernel-oops event can interrupt task context, soft IRQ context,
      hard IRQ context and NMI context, but the interrupted context
      cannot continue execution of vprintk_default() after returning
      from the kernel-oops event, even if the kernel-oops event occurred
      in schedulable context and panic_on_oops == 0.

and, as a result, it is guaranteed that a kernel-oops event which
occurred between

> +	for (i = 0; i < 2; i++) {

here

> +		unsigned int pos = ptr->used;
> +		char *text = ptr->buf + pos;
> +
> +		va_copy(args2, args);
> +		r = vsnprintf(text, sizeof(ptr->buf) - pos, fmt, args2);
> +		va_end(args2);
> +		if (r + pos < sizeof(ptr->buf)) {
> +			/*
> +			 * Eliminate KERN_CONT at this point because we can
> +			 * concatenate incomplete lines inside printk_buffer.
> +			 */
> +			if (r >= 2 && printk_get_level(text) == 'c') {
> +				memmove(text, text + 2, r - 2);
> +				ptr->used += r - 2;
> +			} else {
> +				ptr->used += r;
> +			}
> +			return r;
> +		}

and here shall not make ptr->used >= sizeof(ptr->buf) become true.
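(As an aside, before the rest of the hunk: the "(context & 3) == 3" test
and assumptions (1)-(3) presume that printk_context() returns a
per-thread value whose low two bits encode the context level. A minimal
sketch of the encoding I have in mind; purely illustrative, the actual
printk_context() in the patch may differ:

static unsigned long printk_context(void)
{
	/*
	 * "struct task_struct" is at least 4-byte aligned, so the low
	 * two bits of the pointer are free to encode the level.
	 * in_nmi() must be checked first because NMI entry also marks
	 * the hard IRQ bits in the preempt count.
	 */
	unsigned long base = (unsigned long) current;

	if (in_nmi())
		return base | 3;	/* NMI: get_printk_buffer() is a no-op. */
	if (in_irq())
		return base | 2;	/* hard IRQ context */
	if (in_serving_softirq())
		return base | 1;	/* soft IRQ context */
	return base;			/* task context */
}

Back to the hunk.)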
> +		__flush_printk_buffer(ptr, i);
> +	}
> +use_unbuffered:
> 	r = vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
>
> 	return r;

Well, should vprintk_default() refrain from buffering if
oops_in_progress == 1 ?
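If it should, a minimal way to do that (an untested sketch on top of the
hunk above, not part of the posted patch) would be to check
oops_in_progress before committing to the buffer, so that output during
an oops bypasses buffering and reaches the consoles immediately:

 	ptr = find_printk_buffer();
-	if (!ptr)
+	/* Don't buffer while an oops is in progress. */
+	if (!ptr || oops_in_progress)
 		goto use_unbuffered;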