All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-amlogic@lists.infradead.org
Subject: Re: [PATCH printk v5 1/1] printk: extend console_lock for per-console locking
Date: Wed, 27 Apr 2022 13:44:52 +0200	[thread overview]
Message-ID: <b6c1a8ac-c691-a84d-d3a1-f99984d32f06@samsung.com> (raw)
In-Reply-To: <Ymjy3rHRenba7r7R@alley>

Hi,

On 27.04.2022 09:38, Petr Mladek wrote:
> On Wed 2022-04-27 09:08:33, Marek Szyprowski wrote:
>> On 26.04.2022 15:16, Petr Mladek wrote:
>>> On Tue 2022-04-26 14:07:42, Petr Mladek wrote:
>>>> On Mon 2022-04-25 23:04:28, John Ogness wrote:
>>>>> Currently threaded console printers synchronize against each
>>>>> other using console_lock(). However, different console drivers
>>>>> are unrelated and do not require any synchronization between
>>>>> each other. Removing the synchronization between the threaded
>>>>> console printers will allow each console to print at its own
>>>>> speed.
>>>>>
>>>>> But the threaded consoles printers do still need to synchronize
>>>>> against console_lock() callers. Introduce a per-console mutex
>>>>> and a new console boolean field @blocked to provide this
>>>>> synchronization.
>>>>>
>>>>> console_lock() is modified so that it must acquire the mutex
>>>>> of each console in order to set the @blocked field. Console
>>>>> printing threads will acquire their mutex while printing a
>>>>> record. If @blocked was set, the thread will go back to sleep
>>>>> instead of printing.
>>>>>
>>>>> The reason for the @blocked boolean field is so that
>>>>> console_lock() callers do not need to acquire multiple console
>>>>> mutexes simultaneously, which would introduce unnecessary
>>>>> complexity due to nested mutex locking. Also, a new field
>>>>> was chosen instead of adding a new @flags value so that the
>>>>> blocked status could be checked without concern of reading
>>>>> inconsistent values due to @flags updates from other contexts.
>>>>>
>>>>> Threaded console printers also need to synchronize against
>>>>> console_trylock() callers. Since console_trylock() may be
>>>>> called from any context, the per-console mutex cannot be used
>>>>> for this synchronization. (mutex_trylock() cannot be called
>>>>> from atomic contexts.) Introduce a global atomic counter to
>>>>> identify if any threaded printers are active. The threaded
>>>>> printers will also check the atomic counter to identify if the
>>>>> console has been locked by another task via console_trylock().
>>>>>
>>>>> Note that @console_sem is still used to provide synchronization
>>>>> between console_lock() and console_trylock() callers.
>>>>>
>>>>> A locking overview for console_lock(), console_trylock(), and the
>>>>> threaded printers is as follows (pseudo code):
>>>>>
>>>>> console_lock()
>>>>> {
>>>>>           down(&console_sem);
>>>>>           for_each_console(con) {
>>>>>                   mutex_lock(&con->lock);
>>>>>                   con->blocked = true;
>>>>>                   mutex_unlock(&con->lock);
>>>>>           }
>>>>>           /* console_lock acquired */
>>>>> }
>>>>>
>>>>> console_trylock()
>>>>> {
>>>>>           if (down_trylock(&console_sem) == 0) {
>>>>>                   if (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0) {
>>>>>                           /* console_lock acquired */
>>>>>                   }
>>>>>           }
>>>>> }
>>>>>
>>>>> threaded_printer()
>>>>> {
>>>>>           mutex_lock(&con->lock);
>>>>>           if (!con->blocked) {
>>>>> 		/* console_lock() callers blocked */
>>>>>
>>>>>                   if (atomic_inc_unless_negative(&console_kthreads_active)) {
>>>>>                           /* console_trylock() callers blocked */
>>>>>
>>>>>                           con->write();
>>>>>
>>>>>                           atomic_dec(&console_lock_count);
>>>>>                   }
>>>>>           }
>>>>>           mutex_unlock(&con->lock);
>>>>> }
>>>>>
>>>>> The console owner and waiter logic now only applies between contexts
>>>>> that have taken the console_lock via console_trylock(). Threaded
>>>>> printers never take the console_lock, so they do not have a
>>>>> console_lock to handover. Tasks that have used console_lock() will
>>>>> block the threaded printers using a mutex and if the console_lock
>>>>> is handed over to an atomic context, it would be unable to unblock
>>>>> the threaded printers. However, the console_trylock() case is
>>>>> really the only scenario that is interesting for handovers anyway.
>>>>>
>>>>> @panic_console_dropped must change to atomic_t since it is no longer
>>>>> protected exclusively by the console_lock.
>>>>>
>>>>> Since threaded printers remain asleep if they see that the console
>>>>> is locked, they now must be explicitly woken in __console_unlock().
>>>>> This means wake_up_klogd() calls following a console_unlock() are
>>>>> no longer necessary and are removed.
>>>>>
>>>>> Also note that threaded printers no longer need to check
>>>>> @console_suspended. The check for the @blocked field implicitly
>>>>> covers the suspended console case.
>>>>>
>>>>> Signed-off-by: John Ogness <john.ogness@linutronix.de>
>>>> Nice, it it better than v4. I am going to push this for linux-next.
>>>>
>>>> Reviewed-by: Petr Mladek <pmladek@suse.com>
>>> JFYI, I have just pushed this patch instead of the one
>>> from v4 into printk/linux.git, branch rework/kthreads.
>>>
>>> It means that this branch has been rebased. It will be
>>> used in the next refresh of linux-next.
>> This patchset landed in linux next-20220426. In my tests I've found that
>> it causes deadlock on all my Amlogic Meson G12B/SM1 based boards: Odroid
>> C4/N2 and Khadas VIM3/VIM3l. The deadlock happens when system boots to
>> userspace and getty (with automated login) is executed. I even see the
>> bash prompt, but then the console is freezed. Reverting this patch
>> (e00cc0e1cbf4) on top of linux-next (together with 6b3d71e87892 to make
>> revert clean) fixes the issue.
> Thanks a lot for the report!
>
> Just by chance, do you have the log from the dead-locked boot stored
> in userspace and can you share it? I mean the log stored in
> /var/log/dmesg or journaltctl.

If there would be any messages, I expect them to be visible on the 
serial kernel console.

> In the worst case, it might help to see log from the boot with
> the reverted patch. I would help us to see the ordering of various
> console-related operations on your system.
>
> And regarding the console. Is it the graphics console (ttyX)
> or a serial one (ttyS) or yet another one?

Serial console, /dev/ttyAML0 with kernel console enabled. Later a DRM 
driver is loaded (meson_drm), which initializes its fbdev emulation with 
its console.

However it looks that I've trusted automatic bisect a bit too much and 
had a bit of luck while checking the reverts on top of linux-next. The 
issue is not 100% reproducible, so I've did this bisection again 
manually with more tries. The real commit causing the issue is 
09c5ba0aa2fc ("printk: add kthread console printers"). Reverting the 
following 3 commits 6b3d71e878920b085dd823bc422951bb6f143505, 
e00cc0e1cbf4ea5a63d66c8de8d79519855fb231 and 
09c5ba0aa2fcfdadb17d045c3ee6f86d69270df7 on top of linux-next makes the 
system fully operational again.

I've also tried to disable the DRM driver and its fbdev and console (by 
adding modprobe.blacklist=meson_drm to kernel cmdline), but this didn't 
help. Here is the full serial console log:

https://pastebin.com/E5CDH88L

If there is anything you would like me to try, let me know.


Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


WARNING: multiple messages have this Message-ID (diff)
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-amlogic@lists.infradead.org
Subject: Re: [PATCH printk v5 1/1] printk: extend console_lock for per-console locking
Date: Wed, 27 Apr 2022 13:44:52 +0200	[thread overview]
Message-ID: <b6c1a8ac-c691-a84d-d3a1-f99984d32f06@samsung.com> (raw)
In-Reply-To: <Ymjy3rHRenba7r7R@alley>

Hi,

On 27.04.2022 09:38, Petr Mladek wrote:
> On Wed 2022-04-27 09:08:33, Marek Szyprowski wrote:
>> On 26.04.2022 15:16, Petr Mladek wrote:
>>> On Tue 2022-04-26 14:07:42, Petr Mladek wrote:
>>>> On Mon 2022-04-25 23:04:28, John Ogness wrote:
>>>>> Currently threaded console printers synchronize against each
>>>>> other using console_lock(). However, different console drivers
>>>>> are unrelated and do not require any synchronization between
>>>>> each other. Removing the synchronization between the threaded
>>>>> console printers will allow each console to print at its own
>>>>> speed.
>>>>>
>>>>> But the threaded consoles printers do still need to synchronize
>>>>> against console_lock() callers. Introduce a per-console mutex
>>>>> and a new console boolean field @blocked to provide this
>>>>> synchronization.
>>>>>
>>>>> console_lock() is modified so that it must acquire the mutex
>>>>> of each console in order to set the @blocked field. Console
>>>>> printing threads will acquire their mutex while printing a
>>>>> record. If @blocked was set, the thread will go back to sleep
>>>>> instead of printing.
>>>>>
>>>>> The reason for the @blocked boolean field is so that
>>>>> console_lock() callers do not need to acquire multiple console
>>>>> mutexes simultaneously, which would introduce unnecessary
>>>>> complexity due to nested mutex locking. Also, a new field
>>>>> was chosen instead of adding a new @flags value so that the
>>>>> blocked status could be checked without concern of reading
>>>>> inconsistent values due to @flags updates from other contexts.
>>>>>
>>>>> Threaded console printers also need to synchronize against
>>>>> console_trylock() callers. Since console_trylock() may be
>>>>> called from any context, the per-console mutex cannot be used
>>>>> for this synchronization. (mutex_trylock() cannot be called
>>>>> from atomic contexts.) Introduce a global atomic counter to
>>>>> identify if any threaded printers are active. The threaded
>>>>> printers will also check the atomic counter to identify if the
>>>>> console has been locked by another task via console_trylock().
>>>>>
>>>>> Note that @console_sem is still used to provide synchronization
>>>>> between console_lock() and console_trylock() callers.
>>>>>
>>>>> A locking overview for console_lock(), console_trylock(), and the
>>>>> threaded printers is as follows (pseudo code):
>>>>>
>>>>> console_lock()
>>>>> {
>>>>>           down(&console_sem);
>>>>>           for_each_console(con) {
>>>>>                   mutex_lock(&con->lock);
>>>>>                   con->blocked = true;
>>>>>                   mutex_unlock(&con->lock);
>>>>>           }
>>>>>           /* console_lock acquired */
>>>>> }
>>>>>
>>>>> console_trylock()
>>>>> {
>>>>>           if (down_trylock(&console_sem) == 0) {
>>>>>                   if (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0) {
>>>>>                           /* console_lock acquired */
>>>>>                   }
>>>>>           }
>>>>> }
>>>>>
>>>>> threaded_printer()
>>>>> {
>>>>>           mutex_lock(&con->lock);
>>>>>           if (!con->blocked) {
>>>>> 		/* console_lock() callers blocked */
>>>>>
>>>>>                   if (atomic_inc_unless_negative(&console_kthreads_active)) {
>>>>>                           /* console_trylock() callers blocked */
>>>>>
>>>>>                           con->write();
>>>>>
>>>>>                           atomic_dec(&console_lock_count);
>>>>>                   }
>>>>>           }
>>>>>           mutex_unlock(&con->lock);
>>>>> }
>>>>>
>>>>> The console owner and waiter logic now only applies between contexts
>>>>> that have taken the console_lock via console_trylock(). Threaded
>>>>> printers never take the console_lock, so they do not have a
>>>>> console_lock to handover. Tasks that have used console_lock() will
>>>>> block the threaded printers using a mutex and if the console_lock
>>>>> is handed over to an atomic context, it would be unable to unblock
>>>>> the threaded printers. However, the console_trylock() case is
>>>>> really the only scenario that is interesting for handovers anyway.
>>>>>
>>>>> @panic_console_dropped must change to atomic_t since it is no longer
>>>>> protected exclusively by the console_lock.
>>>>>
>>>>> Since threaded printers remain asleep if they see that the console
>>>>> is locked, they now must be explicitly woken in __console_unlock().
>>>>> This means wake_up_klogd() calls following a console_unlock() are
>>>>> no longer necessary and are removed.
>>>>>
>>>>> Also note that threaded printers no longer need to check
>>>>> @console_suspended. The check for the @blocked field implicitly
>>>>> covers the suspended console case.
>>>>>
>>>>> Signed-off-by: John Ogness <john.ogness@linutronix.de>
>>>> Nice, it it better than v4. I am going to push this for linux-next.
>>>>
>>>> Reviewed-by: Petr Mladek <pmladek@suse.com>
>>> JFYI, I have just pushed this patch instead of the one
>>> from v4 into printk/linux.git, branch rework/kthreads.
>>>
>>> It means that this branch has been rebased. It will be
>>> used in the next refresh of linux-next.
>> This patchset landed in linux next-20220426. In my tests I've found that
>> it causes deadlock on all my Amlogic Meson G12B/SM1 based boards: Odroid
>> C4/N2 and Khadas VIM3/VIM3l. The deadlock happens when system boots to
>> userspace and getty (with automated login) is executed. I even see the
>> bash prompt, but then the console is freezed. Reverting this patch
>> (e00cc0e1cbf4) on top of linux-next (together with 6b3d71e87892 to make
>> revert clean) fixes the issue.
> Thanks a lot for the report!
>
> Just by chance, do you have the log from the dead-locked boot stored
> in userspace and can you share it? I mean the log stored in
> /var/log/dmesg or journaltctl.

If there would be any messages, I expect them to be visible on the 
serial kernel console.

> In the worst case, it might help to see log from the boot with
> the reverted patch. I would help us to see the ordering of various
> console-related operations on your system.
>
> And regarding the console. Is it the graphics console (ttyX)
> or a serial one (ttyS) or yet another one?

Serial console, /dev/ttyAML0 with kernel console enabled. Later a DRM 
driver is loaded (meson_drm), which initializes its fbdev emulation with 
its console.

However it looks that I've trusted automatic bisect a bit too much and 
had a bit of luck while checking the reverts on top of linux-next. The 
issue is not 100% reproducible, so I've did this bisection again 
manually with more tries. The real commit causing the issue is 
09c5ba0aa2fc ("printk: add kthread console printers"). Reverting the 
following 3 commits 6b3d71e878920b085dd823bc422951bb6f143505, 
e00cc0e1cbf4ea5a63d66c8de8d79519855fb231 and 
09c5ba0aa2fcfdadb17d045c3ee6f86d69270df7 on top of linux-next makes the 
system fully operational again.

I've also tried to disable the DRM driver and its fbdev and console (by 
adding modprobe.blacklist=meson_drm to kernel cmdline), but this didn't 
help. Here is the full serial console log:

https://pastebin.com/E5CDH88L

If there is anything you would like me to try, let me know.


Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

  reply	other threads:[~2022-04-27 11:45 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-21 21:22 [PATCH printk v4 00/15] implement threaded console printing John Ogness
2022-04-21 21:22 ` [PATCH printk v4 01/15] printk: rename cpulock functions John Ogness
2022-04-21 21:22 ` [PATCH printk v4 02/15] printk: cpu sync always disable interrupts John Ogness
2022-04-21 21:22 ` [PATCH printk v4 03/15] printk: add missing memory barrier to wake_up_klogd() John Ogness
2022-04-21 21:22 ` [PATCH printk v4 04/15] printk: wake up all waiters John Ogness
2022-04-21 21:22 ` [PATCH printk v4 05/15] printk: wake waiters for safe and NMI contexts John Ogness
2022-04-21 21:22 ` [PATCH printk v4 06/15] printk: get caller_id/timestamp after migration disable John Ogness
2022-04-21 21:22 ` [PATCH printk v4 07/15] printk: call boot_delay_msec() in printk_delay() John Ogness
2022-04-21 21:22 ` [PATCH printk v4 08/15] printk: add con_printk() macro for console details John Ogness
2022-04-21 21:22 ` [PATCH printk v4 09/15] printk: refactor and rework printing logic John Ogness
2022-04-21 21:22 ` [PATCH printk v4 10/15] printk: move buffer definitions into console_emit_next_record() caller John Ogness
2022-04-21 21:22 ` [PATCH printk v4 11/15] printk: add pr_flush() John Ogness
2022-04-21 21:22 ` [PATCH printk v4 12/15] printk: add functions to prefer direct printing John Ogness
2022-04-21 21:22 ` [PATCH printk v4 13/15] printk: add kthread console printers John Ogness
2022-04-22  7:48   ` Petr Mladek
2022-04-21 21:22 ` [PATCH printk v4 14/15] printk: extend console_lock for proper kthread support John Ogness
2022-04-21 21:40   ` John Ogness
2022-04-22  9:21   ` Petr Mladek
2022-04-25 20:58   ` [PATCH printk v5 1/1] printk: extend console_lock for per-console locking John Ogness
2022-04-26 12:07     ` Petr Mladek
2022-04-26 13:16       ` Petr Mladek
     [not found]         ` <CGME20220427070833eucas1p27a32ce7c41c0da26f05bd52155f0031c@eucas1p2.samsung.com>
2022-04-27  7:08           ` Marek Szyprowski
2022-04-27  7:08             ` Marek Szyprowski
2022-04-27  7:38             ` Petr Mladek
2022-04-27  7:38               ` Petr Mladek
2022-04-27 11:44               ` Marek Szyprowski [this message]
2022-04-27 11:44                 ` Marek Szyprowski
2022-04-27 16:15                 ` John Ogness
2022-04-27 16:15                   ` John Ogness
2022-04-27 16:48                   ` Petr Mladek
2022-04-27 16:48                     ` Petr Mladek
2022-04-28 14:54                   ` Petr Mladek
2022-04-28 14:54                     ` Petr Mladek
2022-04-29 13:53                   ` Marek Szyprowski
2022-04-29 13:53                     ` Marek Szyprowski
2022-04-30 16:00                     ` John Ogness
2022-04-30 16:00                       ` John Ogness
2022-05-02  9:19                       ` Marek Szyprowski
2022-05-02  9:19                         ` Marek Szyprowski
2022-05-02 13:11                         ` John Ogness
2022-05-02 13:11                           ` John Ogness
2022-05-02 22:29                           ` Marek Szyprowski
2022-05-02 22:29                             ` Marek Szyprowski
2022-05-04  5:56                             ` John Ogness
2022-05-04  5:56                               ` John Ogness
2022-05-04  6:52                               ` Marek Szyprowski
2022-05-04  6:52                                 ` Marek Szyprowski
2022-06-08 15:10                           ` Geert Uytterhoeven
2022-06-08 15:10                             ` Geert Uytterhoeven
2022-06-09 11:19                             ` John Ogness
2022-06-09 11:19                               ` John Ogness
2022-06-09 11:58                               ` Jason A. Donenfeld
2022-06-09 11:58                                 ` Jason A. Donenfeld
2022-06-09 12:18                                 ` Dmitry Vyukov
2022-06-09 12:18                                   ` Dmitry Vyukov
2022-06-09 12:27                                   ` Jason A. Donenfeld
2022-06-09 12:27                                     ` Jason A. Donenfeld
2022-06-09 12:32                                     ` Dmitry Vyukov
2022-06-09 12:32                                       ` Dmitry Vyukov
2022-06-17 16:51                                     ` Sebastian Andrzej Siewior
2022-06-17 16:51                                       ` Sebastian Andrzej Siewior
2022-06-09 12:18                                 ` Jason A. Donenfeld
2022-06-09 12:18                                   ` Jason A. Donenfeld
2022-05-02 13:17                         ` Petr Mladek
2022-05-02 13:17                           ` Petr Mladek
2022-05-02 23:13                           ` Marek Szyprowski
2022-05-02 23:13                             ` Marek Szyprowski
2022-05-03  6:49                             ` Petr Mladek
2022-05-03  6:49                               ` Petr Mladek
2022-05-04  6:05                               ` Marek Szyprowski
2022-05-04  6:05                                 ` Marek Szyprowski
2022-05-04 21:11                             ` John Ogness
2022-05-04 21:11                               ` John Ogness
2022-05-04 22:42                               ` John Ogness
2022-05-04 22:42                                 ` John Ogness
2022-05-05 22:33                                 ` John Ogness
2022-05-05 22:33                                   ` John Ogness
2022-05-06  6:43                                   ` Marek Szyprowski
2022-05-06  6:43                                     ` Marek Szyprowski
2022-05-06  7:55                                     ` Neil Armstrong
2022-05-06  7:55                                       ` Neil Armstrong
2022-05-08 11:02                                       ` John Ogness
2022-05-08 11:02                                         ` John Ogness
2022-05-06  8:16                                     ` Petr Mladek
2022-05-06  8:16                                       ` Petr Mladek
2022-05-06  9:20                                     ` John Ogness
2022-05-06  9:20                                       ` John Ogness
     [not found]             ` <CGME20220506112526eucas1p2a3688f87d3ed8331b99f2f876bf6c2f6@eucas1p2.samsung.com>
2022-05-06 11:25               ` Marek Szyprowski
2022-05-06 12:41                 ` John Ogness
2022-05-06 13:04                   ` Marek Szyprowski
2022-06-22  9:03     ` Geert Uytterhoeven
2022-06-22  9:03       ` Geert Uytterhoeven
2022-06-22 22:37       ` John Ogness
2022-06-22 22:37         ` John Ogness
2022-06-23 10:10         ` Geert Uytterhoeven
2022-06-23 10:10           ` Geert Uytterhoeven
2022-04-21 21:22 ` [PATCH printk v4 15/15] printk: remove @console_locked John Ogness
2022-04-22  9:39 ` [PATCH printk v4 00/15] implement threaded console printing Petr Mladek
2022-04-22 20:29   ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b6c1a8ac-c691-a84d-d3a1-f99984d32f06@samsung.com \
    --to=m.szyprowski@samsung.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=john.ogness@linutronix.de \
    --cc=linux-amlogic@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.