Re: [PATCH] printk: Add best-effort printk() buffering.

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: sergey.senozhatsky@gmail.com
Cc: sergey.senozhatsky.work@gmail.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, peterz@infradead.org, pmladek@suse.com,
	rostedt@goodmis.org
Subject: Re: [PATCH] printk: Add best-effort printk() buffering.
Date: Tue, 9 May 2017 20:41:14 +0900	[thread overview]
Message-ID: <201705092041.DFJ04165.VMHSFQFFJLOOOt@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20170509010442.GA397@tigerII.localdomain>

Sergey Senozhatsky wrote:
> Hello,
> 
> On (05/08/17 22:05), Tetsuo Handa wrote:
> > > On (04/30/17 22:54), Tetsuo Handa wrote:
> > > > Sometimes we want to printk() multiple lines in a group without being
> > > > disturbed by concurrent printk() from interrupts and/or other threads.
> > > > For example, mixed printk() output of multiple thread's dump makes it
> > > > hard to interpret.
> > > 
> > > hm, it's very close to what printk-safe does [and printk-nmi, of course].
> > > the difference is that buffered-printk does not disable local IRQs,
> > > unlike printk-safe, which has to do it by design. so the question is,
> > > can buffered-printk impose atomicity requirements? it seems that it can
> > > (am I wrong?). and, if so, then can we use printk-safe instead? we can
> > > add a new printk_buffered_begin/printk_buffered_end API, for example,
> > > (or enter/exit) for that purpose, that would set a buffered-printk
> > > `printk_context' bit so we can flush buffers in a "special way", not via IRQ
> > > work, and may be avoid message loss (printk-safe buffers are bigger in size
> > > than proposed PAGE_SIZE buffers).
> > 
> > printk_buffered_begin()/printk_buffered_end() corresponds to
> > get_printk_buffer()/put_printk_buffer().
> > printk_context() distinguishes atomic contexts.
> > flush_printk_buffer() flushes from non-NMI context.
> > 
> > What does atomicity requirements mean?
> 
> what I meant was -- "can we sleep under printk_buffered_begin() or not".
> printk-safe disables local IRQs. so what I propose is something like this
> 
> 	printk-safe-enter    //disable local IRQs, use per-CPU buffer
> 	backtrace
> 	printk-safe-exit     //flush per-CPU buffer, enable local IRQs
> 
> except that 'printk-safe-enter/exit' will have new names here, say
> printk-buffered-begin/end, and, probably, handle flush differently.

OK. Then, answer is that we are allowed to sleep after get_printk_buffer()
if get_printk_buffer() is called from schedulable context because different
printk_buffer will be assigned by get_printk_buffer() if get_printk_buffer()
is called from non-schedulable context.

> 
> 
> > > hm, 16 is rather random, it's too much for UP and probably not enough for
> > > a 240 CPUs system. for the time being there are 3 buffered-printk users
> > > (as far as I can see), but who knows how more will be added in the future.
> > > each CPU can have overlapping printks from process, IRQ and NMI contexts.
> > > for NMI we use printk-nmi buffers, so it's out of the list; but, in general,
> > > *it seems* that we better depend on the number of CPUs the system has.
> > > which, once again, returns us back to printk-safe...
> > > 
> > > thoughts?
> > 
> > I can make 16 a CONFIG_ option.
> 
> but still, why use additional N buffers, when we already have per-CPU
> buffers? what am I missing?

Per-CPU buffers need to disable preemption by disabling local hard
IRQ / soft IRQ. But printk_buffers need not to disable preemption.

> 
> > Would you read 201705031521.EIJ39594.MFtOVOHSFLFOJQ@I-love.SAKURA.ne.jp ?
> 
> sure.
> 
> > But as long as actually writing to console devices is slow, message loss
> > is inevitable no matter how big buffer is used. Rather, I'd expect an API
> > which allows printk() users in schedulable context (e.g. kmallocwd and/or
> > warn_alloc() for reporting allocation stalls) to wait until written to
> > console devices. That will more likely to reduce message loss.
> 
> hm, from a schedulable context you can do *something* like
> 
> 	console_lock()
> 	printk()
> 	...
> 	printk()
> 	console_unlock()
> 
> 
> you won't be able to console_lock() until all pending messages are
> flushed. since you are in a schedulable context, you can sleep on
> console_sem in console_lock(). well, just saying.

console_lock()/console_unlock() pair is different from what I want.

console_lock()/console_unlock() pair blocks as long as somebody else
is printk()ing. What I want is an API for

  current thread waits for N bytes to be written to console devices
  if current thread stored N bytes using printk(), but allow using some
  timeout and killable because waiting unconditionally forever is not good
  (e.g. current thread is expected to bail out soon if OOM-killed during
  waiting for N bytes to be written to console devices)

so that kmallocwd-like thread can wait for necessary and sufficient period
for not overflowing the printk buffer. Even after we offload to a dedicated
kernel thread (e.g. "[RFC][PATCHv3 0/5] printk: introduce printing kernel
thread"), this API is needed for not to overflow the printk buffer

"Wait for N bytes to be written to console devices" could be replaced by
"wait until usage of printk buffer becomes less than XX%".

> 
> 
> > > > +	while (1) {
> > > > +		char *text = ptr->buf;
> > > > +		unsigned int text_len = ptr->used;
> > > > +		char *cp = memchr(text, '\n', text_len);
> > > > +		char c;
> > > 
> > > what guarantees that there'll always be a terminating newline?
> > 
> > Nothing guarantees. Why need such a guarantee?
> 
>  : The memchr() and memrchr() functions return a pointer to the matching
>  : byte or NULL if the character does not occur in the given memory area.
> 
> 
> so `cp' can be NULL here?
> 

Yes.

> +               if (cp++)
> +                       text_len = cp - text;
> +               else if (all)
> +                       cp = text + text_len;
> +               else
> +                       break;
> +               /* printk_get_level() depends on text '\0'-terminated. */

But `cp' cannot be NULL here.

> +               c = *cp;
> +               *cp = '\0';
> +               process_log(0, LOGLEVEL_DEFAULT, NULL, 0, text, text_len);
> +               ptr->used -= text_len;
> +               if (!ptr->used)
> +                       break;
> +               *cp = c;