All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH printk-rework 11/12] printk: remove logbuf_lock
Date: Tue, 02 Feb 2021 12:47:20 +0106	[thread overview]
Message-ID: <87czxivgrj.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <YBkYOKL22kADKTeG@alley>

On 2021-02-02, Petr Mladek <pmladek@suse.com> wrote:
> On Tue 2021-01-26 22:21:50, John Ogness wrote:
>> Since the ringbuffer is lockless, there is no need for it to be
>> protected by @logbuf_lock. Remove @logbuf_lock.
>> 
>> This means that printk_nmi_direct and printk_safe_flush_on_panic()
>> no longer need to acquire any lock to run.
>> 
>> @console_seq, @exclusive_console_stop_seq, @console_dropped are
>> protected by @console_lock.
>> 
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index d14a4afc5b72..b57dba7f077d 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -401,6 +366,7 @@ static u64 syslog_seq;
>>  static size_t syslog_partial;
>>  static bool syslog_time;
>>  
>> +/* All 3 protected by @console_sem. */
>>  /* the next printk record to write to the console */
>>  static u64 console_seq;
>>  static u64 exclusive_console_stop_seq;
>> @@ -762,27 +728,27 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>>  	if (ret)
>>  		return ret;
>>  
>> -	logbuf_lock_irq();
>> +	printk_safe_enter_irq();
>
> What is the exact reason to keep this, please?

As Sergey pointed out [0], logbuf_lock_irq() does 2 things: logbuf_lock
and safe buffers. This series is not trying to remove the safe buffers
(a later series will). The series is only removing logbuf_lock. So all
logbuf_lock_*() calls will turn into printk_safe_*() calls. There are a
few exceptions, which you noticed and I will respond to.

[0] https://lkml.kernel.org/r/20201208203539.GB1667627@google.com

> 1. The primary function of the printk_safe context is to avoid deadlock
>    caused by logbuf_lock. It might have happened with recursive or nested
>    printk(). But logbuf_lock is gone now.

Agreed. Deadlock is not a concern anymore.

> 2. There are still some hidded locks that were guarded by this as
>    well. For example, console_owner_lock, or spinlock inside
>    console_sem, or scheduler locks taken when console_sem()
>    wakes another waiting process. It might still make sense
>    to somehow guard these.

This was not my motivation and I do not think it is an issue. I am not
aware of any technical need for the safe buffers to protect such
synchronization.

> 3. It kind of prevented infinite printk() recursion by using another
>    code path. The other path was limited by the size of the per-cpu
>    buffer. Well, recursion inside printk_safe code would likely
>    hit end of the stack first.

Yes, this was my main motivation. The safe buffers carry this
responsibility in mainline. So until a replacement for recursion
protection is in place, the safe buffers should remain.

And even if we decide we do not need/want recursion protection, I still
do not think this series should be the one to remove it. I only wanted
to remove logbuf_lock for now.

If we later have regressions, it will be helpful to bisect if the safe
buffers (with their local_irq_disable()) or the logbuf_lock were
involved.

> IMHO, we do not need printk safe context here in devkmsg_read().
> It does not belong into any categoty that is described above.
> logbug_lock() is gone. devkmsg_read() is never called directly
> from printk().

No. But it is calling printk_ringbuffer functions that can trigger
WARN_ONs that can trigger printk's.

> The same is true for almost entire patch. There are only two or so
> exceptions, see below.
>
>
>>  	if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) {
>>  		if (file->f_flags & O_NONBLOCK) {
>>  			ret = -EAGAIN;
>> -			logbuf_unlock_irq();
>> +			printk_safe_exit_irq();
>>  			goto out;
>>  		}
>>  
>> -		logbuf_unlock_irq();
>> +		printk_safe_exit_irq();
>>  		ret = wait_event_interruptible(log_wait,
>>  				prb_read_valid(prb, atomic64_read(&user->seq), r));
>>  		if (ret)
>>  			goto out;
>> -		logbuf_lock_irq();
>> +		printk_safe_enter_irq();
>>  	}
>>  
>>  	if (atomic64_read(&user->seq) < prb_first_valid_seq(prb)) {
>>  		/* our last seen message is gone, return error and reset */
>>  		atomic64_set(&user->seq, prb_first_valid_seq(prb));
>>  		ret = -EPIPE;
>> -		logbuf_unlock_irq();
>> +		printk_safe_exit_irq();
>>  		goto out;
>>  	}
>>  
>
>
>> @@ -2593,7 +2559,6 @@ void console_unlock(void)
>>  		size_t len;
>>  
>>  		printk_safe_enter_irqsave(flags);
>> -		raw_spin_lock(&logbuf_lock);
>>  skip:
>>  		if (!prb_read_valid(prb, console_seq, &r))
>>  			break;
>
> I agree that printk_safe context makes sense in console_unlock().
> It prevents eventual deadlocks caused, for example by
> console_lock_owner.
>
> It also somehow prevents an infinite loop when a console driver would
> add a new message that would need to get flushed out as well which
> would trigger another messages ...
>
>
>> @@ -2973,9 +2933,7 @@ void register_console(struct console *newcon)
>>  		/*
>>  		 * console_unlock(); will print out the buffered messages
>>  		 * for us.
>> -		 */
>> -		logbuf_lock_irqsave(flags);
>
> I am just curious what was the motivation to remove printk_safe
> context here? It is a bit inconsistent with the other locations
> where you kept it.

This never should have been logbuf_lock_irqsave(flags) in the first
place. It would have been enough to just grab @logbuf_lock. The safe
buffers only make sense if printk or printk_ringbuffer functions are
called. Here we are just copying some variables (which are already
protected by console_sem and syslog_lock).

> IMHO, it can be removed. It does not fit into any category
> where it would help as described above.
>
> Anyway, we have to be consistent and explain it in the commit message.

I could add an earlier patch that changes logbuf_lock_irqsave() here to
spin_lock_irqsave() and explain. Then for this patch it would be clear
that it is just dropped.

>> -		/*
>> +		 *
>>  		 * We're about to replay the log buffer.  Only do this to the
>>  		 * just-registered console to avoid excessive message spam to
>>  		 * the already-registered consoles.
>> @@ -2988,11 +2946,9 @@ void register_console(struct console *newcon)
>>  		exclusive_console_stop_seq = console_seq;
>>  
>>  		/* Get a consistent copy of @syslog_seq. */
>> -		raw_spin_lock(&syslog_lock);
>> +		raw_spin_lock_irqsave(&syslog_lock, flags);
>
> I guess that you have added _irqsafe() variant here to preserve the
> original behavior.

Yes.

> I just wonder if it makes sense. We take the sleeping console_lock()
> in this function. This should not be allowed in a context
> with disabled interrupts.
>
> IMHO, we do not need the irqsafe variant here. But it probably should
> be removed in a separate patch. We should not hide it in this huge patch.

Yes. Let's put that in another patch. It is not related to logbuf_lock
removal.

>>  		console_seq = syslog_seq;
>> -		raw_spin_unlock(&syslog_lock);
>> -
>> -		logbuf_unlock_irqrestore(flags);
>> +		raw_spin_unlock_irqrestore(&syslog_lock, flags);
>>  	}
>>  	console_unlock();
>>  	console_sysfs_notify();
>> @@ -3414,9 +3366,11 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog,
>>  	struct printk_info info;
>>  	unsigned int line_count;
>>  	struct printk_record r;
>> +	unsigned long flags;
>>  	size_t l = 0;
>>  	bool ret = false;
>>  
>> +	printk_safe_enter_irqsave(flags);
>
> This change is neither obvious nor documented.

I noticed that this function was calling ringbuffer functions without
marking the region for safe buffers. I should have included a separate
patch before this one adding the safe buffers so that it would be
clear. Sorry.

> I guess that this is preparation step for unfying
> kmsg_dump_get_line_nolock() and kmsg_dump_get_line().
>
> Please, do it in the next patch and keep this one strightforward.

Or I will just do it in the following unification patch.

> That said, IMHO, the printk_safe() context is not needed here at all.
> This code is not called from printk() directly. The recursion is
> prevented by iter->next_seq or the buffer size.

My logic was: "If it is calling prb_*(), it should be protected by safe
buffers."

>>  	prb_rec_init_rd(&r, &info, line, size);
>>  
>>  	if (!iter->active)
>
>
>> @@ -3569,8 +3517,12 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
>>   */
>>  void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter)
>>  {
>> +	unsigned long flags;
>> +
>> +	printk_safe_enter_irqsave(flags);
>
> This is the same as in kmsg_dump_get_line_nolock().
>
> Please, keep this huge patch strightforward. Either replace
> logbuf_lock*() with the corresponding printk_safe*() calls.
> Or do not add printk_safe*() where it is not needed.

Agreed. Sorry.

>>  	iter->cur_seq = latched_seq_read_nolock(&clear_seq);
>>  	iter->next_seq = prb_next_seq(prb);
>> +	printk_safe_exit_irqrestore(flags);
>>  }
>>  
>>  /**
>
>> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
>> index a0e6f746de6c..a9a3137bd972 100644
>> --- a/kernel/printk/printk_safe.c
>> +++ b/kernel/printk/printk_safe.c
>> @@ -368,20 +354,21 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>>  #endif
>>  
>>  	/*
>> -	 * Try to use the main logbuf even in NMI. But avoid calling console
>> +	 * Use the main logbuf even in NMI. But avoid calling console
>>  	 * drivers that might have their own locks.
>>  	 */
>> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK) &&
>> -	    raw_spin_trylock(&logbuf_lock)) {
>> +	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
>> +		unsigned long flags;
>>  		int len;
>>  
>> +		printk_safe_enter_irqsave(flags);
>
> The printk_safe context does not make much sense here. The per-context
> redirection is done in vprintk_func(). It will always use this path
> because PRINTK_NMI_DIRECT_CONTEXT_MASK is handled before
> PRINTK_SAFE_CONTEXT_MASK.

If the following vprintk_store() calls printk(), those printk's should
land in safe buffers. If @printk_context is not incremented, they land
here again.

vprintk_store() relies on its caller to update @printk_context.

>>  		len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
>> -		raw_spin_unlock(&logbuf_lock);
>> +		printk_safe_exit_irqrestore(flags);
>>  		defer_console_output();
>>  		return len;
>>  	}
>>  
>> -	/* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */
>> +	/* Use extra buffer in NMI. */
>>  	if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
>>  		return vprintk_nmi(fmt, args);
>
> I agree that it makes sense to keep vprintk_nmi() for now. It prevents
> an infinite recursion. I still hope that we will agree on a better
> solution later.
>
>
> Summary:
>
> I do not have a strong opinion whether to remove printk_safe
> enter/exit calls in this patch or in a separate one. I would
> be fine with both solutions.
>
> Anyway, please keep the patch as straightforward as possible.
> Please, do not move printk_safe context into another locations.
>
> Also provide more reasoning in the commit message. Why printk_safe
> enter/exit calls are preserved or why they can be removed.
>
> Feel free to mention that printk_safe context still makes sense on
> some locations, explain why, and say that it will be removed later.

OK. Thank you for the excellent response. I will go over this again.

John Ogness

  reply	other threads:[~2021-02-02 11:42 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26 21:15 [PATCH printk-rework 00/12] printk: remove logbuf_lock John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 01/12] printk: kmsg_dump: remove unused fields John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 02/12] printk: refactor kmsg_dump_get_buffer() John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 03/12] printk: consolidate kmsg_dump_get_buffer/syslog_print_all code John Ogness
     [not found]   ` <YBQgTQYTA5p6Wgj6@alley>
2021-02-01  9:49     ` John Ogness
2021-02-02 12:31       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 04/12] printk: define CONSOLE_LOG_MAX in printk.h John Ogness
     [not found]   ` <YBQtbKrdwUAZQB9v@alley>
2021-02-01  8:24     ` LINE_MAX: was: " John Ogness
2021-02-02 11:22       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 05/12] printk: use seqcount_latch for clear_seq John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 06/12] printk: use atomic64_t for devkmsg_user.seq John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 07/12] printk: add syslog_lock John Ogness
2021-02-01 12:26   ` Petr Mladek
2021-02-01 13:11     ` John Ogness
2021-02-02 12:50       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 08/12] printk: introduce a kmsg_dump iterator John Ogness
2021-02-01 13:17   ` Petr Mladek
2021-02-01 13:32     ` John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 09/12] um: synchronize kmsg_dumper John Ogness
2021-02-01 10:26   ` Petr Mladek
2021-02-01 14:15   ` Petr Mladek
2021-02-01 16:51     ` John Ogness
2021-02-01 16:54       ` Richard Weinberger
2021-02-01 20:25         ` John Ogness
2021-02-01 20:40           ` Richard Weinberger
2021-02-02 13:26       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 10/12] hv: " John Ogness
2021-01-27 21:32   ` Michael Kelley
2021-02-01 10:56     ` John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 11/12] printk: remove logbuf_lock John Ogness
2021-02-02  9:15   ` Petr Mladek
2021-02-02 11:41     ` John Ogness [this message]
2021-02-02 16:11       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 12/12] printk: kmsg_dump: remove _nolock() variants John Ogness
2021-02-02  9:45   ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czxivgrj.fsf@jogness.linutronix.de \
    --to=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.