All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "Petr Mladek" <pmladek@suse.com>,
	"Sergey Senozhatsky" <senozhatsky@chromium.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Paul Mackerras" <paulus@samba.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Jason Wessel" <jason.wessel@windriver.com>,
	"Douglas Anderson" <dianders@chromium.org>,
	"Srikar Dronamraju" <srikar@linux.vnet.ibm.com>,
	"Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	"Chengyang Fan" <cy.fan@huawei.com>,
	"Christophe Leroy" <christophe.leroy@csgroup.eu>,
	"Bhaskar Chowdhury" <unixbhaskar@gmail.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linuxppc-dev@lists.ozlabs.org,
	kgdb-bugreport@lists.sourceforge.net
Subject: Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock
Date: Tue, 03 Aug 2021 17:36:32 +0206	[thread overview]
Message-ID: <87tuk635rb.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <20210803142558.cz7apumpgijs5y4y@maple.lan>

On 2021-08-03, Daniel Thompson <daniel.thompson@linaro.org> wrote:
> On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
>> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
>> during cpu roundup. This will conflict with the printk cpulock.
>
> When the full vision is realized what will be the purpose of the printk
> cpulock?
>
> I'm asking largely because it's current role is actively unhelpful
> w.r.t. kdb. It is possible that cautious use of in_dbg_master() might
> be a better (and safer) solution. However it sounds like there is a
> larger role planned for the printk cpulock...

The printk cpulock is used as a synchronization mechanism for
implementing atomic consoles, which need to be able to safely interrupt
the console write() activity at any time and immediately continue with
their own printing. The ultimate goal is to move all console printing
into per-console dedicated kthreads, so the primary function of the
printk cpulock is really to immediately _stop_ the CPU/kthread
performing write() in order to allow write_atomic() (from any context on
any CPU) to safely and reliably take over.

Atomic consoles are actually quite similar to the kgdb_io ops. For
example, comparing:

serial8250_console_write_atomic() + serial8250_console_putchar_locked()

with

serial8250_put_poll_char()

The difference is that serial8250_console_write_atomic() is line-based
and synchronizing with serial8250_console_write() so that if the kernel
crashes while outputing to the console, write() can be interrupted by
write_atomic() and cleanly formatted crash data can be output.

Also serial8250_put_poll_char() is calling into __pm_runtime_resume(),
which includes a spinlock and possibly sleeping. This would not be
acceptable for atomic consoles. Although, as Andy pointed out [0], I
will need to figure out how to deal with suspended consoles. Or just
implement a policy that registered atomic consoles may never be
suspended.

I had not considered merging kgdb_io ops with atomic console ops. But
now that I look at it more closely, there may be some useful overlap. I
will consider this. Thank you for this idea.

>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 3d0c933937b4..1b546e117f10 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -214,6 +215,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
>>  #ifdef CONFIG_SMP
>>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>>  static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>> +static unsigned int kgdb_cpu = -1;
>
> Is this the flag to provoke retriggering? It appears to be a write-only
> variable (at least in this patch). How is it consumed?

Critical catch! Thank you. I am quite unhappy to see these hunks were
accidentally dropped when generating this series.

@@ -3673,6 +3675,9 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
  */
 void __printk_cpu_unlock(void)
 {
+	bool trigger_kgdb = false;
+	unsigned int cpu;
+
 	if (atomic_read(&printk_cpulock_nested)) {
 		atomic_dec(&printk_cpulock_nested);
 		return;
@@ -3683,6 +3688,12 @@ void __printk_cpu_unlock(void)
 	 * LMM(__printk_cpu_unlock:A)
 	 */
 
+	cpu = smp_processor_id();
+	if (kgdb_cpu == cpu) {
+		trigger_kgdb = true;
+		kgdb_cpu = -1;
+	}
+
 	/*
 	 * Guarantee loads and stores from this CPU when it was the
 	 * lock owner are visible to the next lock owner. This pairs
@@ -3703,6 +3714,21 @@ void __printk_cpu_unlock(void)
 	 */
 	atomic_set_release(&printk_cpulock_owner,
 			   -1); /* LMM(__printk_cpu_unlock:B) */
+
+	if (trigger_kgdb) {
+		pr_warn("re-triggering kgdb roundup for CPU#%d\n", cpu);
+		kgdb_roundup_cpu(cpu);
+	}
 }
 EXPORT_SYMBOL(__printk_cpu_unlock);

John Ogness

[0] https://lore.kernel.org/lkml/YQlKAeXS9MPmE284@smile.fi.intel.com

WARNING: multiple messages have this Message-ID (diff)
From: John Ogness <john.ogness@linutronix.de>
To: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	"Douglas Anderson" <dianders@chromium.org>,
	"Srikar Dronamraju" <srikar@linux.vnet.ibm.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, "Paul Mackerras" <paulus@samba.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Chengyang Fan" <cy.fan@huawei.com>,
	"Bhaskar Chowdhury" <unixbhaskar@gmail.com>,
	x86@kernel.org, "Ingo Molnar" <mingo@redhat.com>,
	kgdb-bugreport@lists.sourceforge.net,
	"Petr Mladek" <pmladek@suse.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	"Sergey Senozhatsky" <senozhatsky@chromium.org>,
	"Jason Wessel" <jason.wessel@windriver.com>,
	linuxppc-dev@lists.ozlabs.org, "Cédric Le Goater" <clg@kaod.org>
Subject: Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock
Date: Tue, 03 Aug 2021 17:36:32 +0206	[thread overview]
Message-ID: <87tuk635rb.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <20210803142558.cz7apumpgijs5y4y@maple.lan>

On 2021-08-03, Daniel Thompson <daniel.thompson@linaro.org> wrote:
> On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
>> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
>> during cpu roundup. This will conflict with the printk cpulock.
>
> When the full vision is realized what will be the purpose of the printk
> cpulock?
>
> I'm asking largely because it's current role is actively unhelpful
> w.r.t. kdb. It is possible that cautious use of in_dbg_master() might
> be a better (and safer) solution. However it sounds like there is a
> larger role planned for the printk cpulock...

The printk cpulock is used as a synchronization mechanism for
implementing atomic consoles, which need to be able to safely interrupt
the console write() activity at any time and immediately continue with
their own printing. The ultimate goal is to move all console printing
into per-console dedicated kthreads, so the primary function of the
printk cpulock is really to immediately _stop_ the CPU/kthread
performing write() in order to allow write_atomic() (from any context on
any CPU) to safely and reliably take over.

Atomic consoles are actually quite similar to the kgdb_io ops. For
example, comparing:

serial8250_console_write_atomic() + serial8250_console_putchar_locked()

with

serial8250_put_poll_char()

The difference is that serial8250_console_write_atomic() is line-based
and synchronizing with serial8250_console_write() so that if the kernel
crashes while outputing to the console, write() can be interrupted by
write_atomic() and cleanly formatted crash data can be output.

Also serial8250_put_poll_char() is calling into __pm_runtime_resume(),
which includes a spinlock and possibly sleeping. This would not be
acceptable for atomic consoles. Although, as Andy pointed out [0], I
will need to figure out how to deal with suspended consoles. Or just
implement a policy that registered atomic consoles may never be
suspended.

I had not considered merging kgdb_io ops with atomic console ops. But
now that I look at it more closely, there may be some useful overlap. I
will consider this. Thank you for this idea.

>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 3d0c933937b4..1b546e117f10 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -214,6 +215,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
>>  #ifdef CONFIG_SMP
>>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>>  static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>> +static unsigned int kgdb_cpu = -1;
>
> Is this the flag to provoke retriggering? It appears to be a write-only
> variable (at least in this patch). How is it consumed?

Critical catch! Thank you. I am quite unhappy to see these hunks were
accidentally dropped when generating this series.

@@ -3673,6 +3675,9 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
  */
 void __printk_cpu_unlock(void)
 {
+	bool trigger_kgdb = false;
+	unsigned int cpu;
+
 	if (atomic_read(&printk_cpulock_nested)) {
 		atomic_dec(&printk_cpulock_nested);
 		return;
@@ -3683,6 +3688,12 @@ void __printk_cpu_unlock(void)
 	 * LMM(__printk_cpu_unlock:A)
 	 */
 
+	cpu = smp_processor_id();
+	if (kgdb_cpu == cpu) {
+		trigger_kgdb = true;
+		kgdb_cpu = -1;
+	}
+
 	/*
 	 * Guarantee loads and stores from this CPU when it was the
 	 * lock owner are visible to the next lock owner. This pairs
@@ -3703,6 +3714,21 @@ void __printk_cpu_unlock(void)
 	 */
 	atomic_set_release(&printk_cpulock_owner,
 			   -1); /* LMM(__printk_cpu_unlock:B) */
+
+	if (trigger_kgdb) {
+		pr_warn("re-triggering kgdb roundup for CPU#%d\n", cpu);
+		kgdb_roundup_cpu(cpu);
+	}
 }
 EXPORT_SYMBOL(__printk_cpu_unlock);

John Ogness

[0] https://lore.kernel.org/lkml/YQlKAeXS9MPmE284@smile.fi.intel.com

  reply	other threads:[~2021-08-03 15:30 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-03 13:12 [PATCH printk v1 00/10] printk: introduce atomic consoles and sync mode John Ogness
2021-08-03 13:12 ` John Ogness
2021-08-03 13:12 ` John Ogness
2021-08-03 13:12 ` [PATCH printk v1 01/10] printk: relocate printk cpulock functions John Ogness
2021-08-04  9:24   ` Petr Mladek
2021-08-03 13:12 ` [PATCH printk v1 02/10] printk: rename printk cpulock API and always disable interrupts John Ogness
2021-08-04  9:52   ` Petr Mladek
2021-08-03 13:12 ` [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock John Ogness
2021-08-03 13:12   ` John Ogness
2021-08-03 14:25   ` Daniel Thompson
2021-08-03 14:25     ` Daniel Thompson
2021-08-03 15:30     ` John Ogness [this message]
2021-08-03 15:30       ` John Ogness
2021-08-04 11:31       ` Daniel Thompson
2021-08-04 11:31         ` Daniel Thompson
2021-08-04 12:12         ` Petr Mladek
2021-08-04 12:12           ` Petr Mladek
2021-08-04 15:04           ` Daniel Thompson
2021-08-04 15:04             ` Daniel Thompson
2021-08-05  3:46             ` John Ogness
2021-08-05  3:46               ` John Ogness
2021-08-06 12:06               ` Daniel Thompson
2021-08-06 12:06                 ` Daniel Thompson
2021-08-04 12:31       ` Petr Mladek
2021-08-04 12:31         ` Petr Mladek
2021-08-03 13:12 ` [PATCH printk v1 04/10] printk: relocate printk_delay() John Ogness
2021-08-04 13:07   ` Petr Mladek
2021-08-03 13:12 ` [PATCH printk v1 05/10] printk: call boot_delay_msec() in printk_delay() John Ogness
2021-08-04 13:09   ` Petr Mladek
2021-08-31  1:04   ` Sergey Senozhatsky
2021-08-03 13:12 ` [PATCH printk v1 06/10] printk: use seqcount_latch for console_seq John Ogness
2021-08-05 12:16   ` Petr Mladek
2021-08-05 15:26     ` John Ogness
2021-08-06 15:56       ` Petr Mladek
2021-08-31  3:05         ` Sergey Senozhatsky
2021-08-03 13:12 ` [PATCH printk v1 07/10] console: add write_atomic interface John Ogness
2021-08-03 14:02   ` Andy Shevchenko
2021-08-06 10:56     ` John Ogness
2021-08-06 11:18       ` Andy Shevchenko
2021-08-31  2:55   ` Sergey Senozhatsky
2021-08-03 13:12 ` [PATCH printk v1 08/10] printk: introduce kernel sync mode John Ogness
2021-08-05 17:11   ` Petr Mladek
2021-08-05 21:25     ` John Ogness
2021-08-03 13:13 ` [PATCH printk v1 09/10] kdb: if available, only use atomic consoles for output mirroring John Ogness
2021-08-03 13:13 ` [PATCH printk v1 10/10] serial: 8250: implement write_atomic John Ogness
2021-08-03 13:13   ` John Ogness
2021-08-03 13:13   ` John Ogness
2021-08-03 14:07   ` Andy Shevchenko
2021-08-03 14:07     ` Andy Shevchenko
2021-08-03 14:07     ` Andy Shevchenko
2021-08-05  7:47     ` Jiri Slaby
2021-08-05  7:47       ` Jiri Slaby
2021-08-05  7:47       ` Jiri Slaby
2021-08-05  8:26       ` John Ogness
2021-08-05  8:26         ` John Ogness
2021-08-05  8:26         ` John Ogness
2021-08-03 13:52 ` [PATCH printk v1 00/10] printk: introduce atomic consoles and sync mode Andy Shevchenko
2021-08-03 13:52   ` Andy Shevchenko
2021-08-03 13:52   ` Andy Shevchenko
2021-08-05 15:47 ` Petr Mladek
2021-08-05 15:47   ` Petr Mladek
2021-08-05 15:47   ` Petr Mladek
2021-08-31  0:33   ` Sergey Senozhatsky
2021-08-31  0:33     ` Sergey Senozhatsky
2021-08-31  0:33     ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tuk635rb.fsf@jogness.linutronix.de \
    --to=john.ogness@linutronix.de \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=christophe.leroy@csgroup.eu \
    --cc=clg@kaod.org \
    --cc=cy.fan@huawei.com \
    --cc=daniel.thompson@linaro.org \
    --cc=dianders@chromium.org \
    --cc=ego@linux.vnet.ibm.com \
    --cc=gustavoars@kernel.org \
    --cc=hpa@zytor.com \
    --cc=jason.wessel@windriver.com \
    --cc=kgdb-bugreport@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=unixbhaskar@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.