* [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
@ 2021-06-29 14:33 Petr Mladek
2021-06-29 14:48 ` John Ogness
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Petr Mladek @ 2021-06-29 14:33 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
linux-kernel, Petr Mladek, stable
The standard printk() tries to flush the message to the console
immediately. It tries to take the console lock. If the lock is
already taken then the current owner is responsible for flushing
even the new message.
There is a small race window between checking whether a new message is
available and releasing the console lock. It is solved by re-checking
the state after releasing the console lock. If the check is positive
then console_unlock() tries to take the lock again and process the new
message as well.
The commit 996e966640ddea7b535c ("printk: remove logbuf_lock") causes that
console_seq is not longer read atomically. As a result, the re-check might
be done with an inconsistent 64-bit index.
Solve it by using the last sequence number that has been checked under
the console lock. In the worst case, it will take the lock again only
to realized that the new message has already been proceed. But it
was possible even before.
Fixes: commit 996e966640ddea7b535c ("printk: remove logbuf_lock")
Cc: stable@vger.kernel.org # 5.13
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
kernel/printk/printk.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 142a58d124d9..87411084075e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2545,6 +2545,7 @@ void console_unlock(void)
bool do_cond_resched, retry;
struct printk_info info;
struct printk_record r;
+ u64 next_seq;
if (console_suspended) {
up_console_sem();
@@ -2654,8 +2655,10 @@ void console_unlock(void)
cond_resched();
}
- console_locked = 0;
+ /* Get consistent value of the next-to-be-used sequence number. */
+ next_seq = console_seq;
+ console_locked = 0;
up_console_sem();
/*
@@ -2664,7 +2667,7 @@ void console_unlock(void)
* there's a new owner and the console_unlock() from them will do the
* flush, no worries.
*/
- retry = prb_read_valid(prb, console_seq, NULL);
+ retry = prb_read_valid(prb, next_seq, NULL);
printk_safe_exit_irqrestore(flags);
if (retry && console_trylock())
--
2.26.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 14:33 [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock() Petr Mladek
@ 2021-06-29 14:48 ` John Ogness
2021-06-29 15:42 ` Sergey Senozhatsky
2021-06-29 19:44 ` kernel test robot
2 siblings, 0 replies; 11+ messages in thread
From: John Ogness @ 2021-06-29 14:48 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
linux-kernel, Petr Mladek, stable
On 2021-06-29, Petr Mladek <pmladek@suse.com> wrote:
> The standard printk() tries to flush the message to the console
> immediately. It tries to take the console lock. If the lock is
> already taken then the current owner is responsible for flushing
> even the new message.
>
> There is a small race window between checking whether a new message is
> available and releasing the console lock. It is solved by re-checking
> the state after releasing the console lock. If the check is positive
> then console_unlock() tries to take the lock again and process the new
> message as well.
>
> The commit 996e966640ddea7b535c ("printk: remove logbuf_lock") causes that
> console_seq is not longer read atomically. As a result, the re-check might
> be done with an inconsistent 64-bit index.
>
> Solve it by using the last sequence number that has been checked under
> the console lock. In the worst case, it will take the lock again only
> to realized that the new message has already been proceed. But it
> was possible even before.
>
> Fixes: commit 996e966640ddea7b535c ("printk: remove logbuf_lock")
> Cc: stable@vger.kernel.org # 5.13
> Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Good catch.
John
> ---
> kernel/printk/printk.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 142a58d124d9..87411084075e 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2545,6 +2545,7 @@ void console_unlock(void)
> bool do_cond_resched, retry;
> struct printk_info info;
> struct printk_record r;
> + u64 next_seq;
>
> if (console_suspended) {
> up_console_sem();
> @@ -2654,8 +2655,10 @@ void console_unlock(void)
> cond_resched();
> }
>
> - console_locked = 0;
> + /* Get consistent value of the next-to-be-used sequence number. */
> + next_seq = console_seq;
>
> + console_locked = 0;
> up_console_sem();
>
> /*
> @@ -2664,7 +2667,7 @@ void console_unlock(void)
> * there's a new owner and the console_unlock() from them will do the
> * flush, no worries.
> */
> - retry = prb_read_valid(prb, console_seq, NULL);
> + retry = prb_read_valid(prb, next_seq, NULL);
> printk_safe_exit_irqrestore(flags);
>
> if (retry && console_trylock())
> --
> 2.26.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 14:33 [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock() Petr Mladek
2021-06-29 14:48 ` John Ogness
@ 2021-06-29 15:42 ` Sergey Senozhatsky
2021-06-30 7:58 ` Petr Mladek
2021-06-29 19:44 ` kernel test robot
2 siblings, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2021-06-29 15:42 UTC (permalink / raw)
To: Petr Mladek
Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
linux-kernel, stable
On (21/06/29 16:33), Petr Mladek wrote:
> The standard printk() tries to flush the message to the console
> immediately. It tries to take the console lock. If the lock is
> already taken then the current owner is responsible for flushing
> even the new message.
>
> There is a small race window between checking whether a new message is
> available and releasing the console lock. It is solved by re-checking
> the state after releasing the console lock. If the check is positive
> then console_unlock() tries to take the lock again and process the new
> message as well.
[..]
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 142a58d124d9..87411084075e 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2545,6 +2545,7 @@ void console_unlock(void)
> bool do_cond_resched, retry;
> struct printk_info info;
> struct printk_record r;
> + u64 next_seq;
>
> if (console_suspended) {
> up_console_sem();
> @@ -2654,8 +2655,10 @@ void console_unlock(void)
> cond_resched();
> }
>
> - console_locked = 0;
> + /* Get consistent value of the next-to-be-used sequence number. */
> + next_seq = console_seq;
>
> + console_locked = 0;
> up_console_sem();
>
> /*
> @@ -2664,7 +2667,7 @@ void console_unlock(void)
> * there's a new owner and the console_unlock() from them will do the
> * flush, no worries.
> */
> - retry = prb_read_valid(prb, console_seq, NULL);
> + retry = prb_read_valid(prb, next_seq, NULL);
> printk_safe_exit_irqrestore(flags);
>
> if (retry && console_trylock())
Maybe it's too late here in my time zone, but what are the consequences
of this race?
`retry` can be falsely set, console_trylock() does not spin on owner,
so the context that just released the lock can grab it again only if
it's unlocked. For the context that just has released the console_sem
and then acquired it again, because of the race, - console_seq will be
valid after it acquires the lock, then it'll jump to `retry` and
re-validated the console_seq - prb_read_valid(). If it's valid, it'll
print the message; and should another CPU printk that CPU will spin on
owner and then the current console_sem owner will yield to it via
console_lock_spinning branch.
One way or the other, good catch and nice to have it fixed.
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 14:33 [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock() Petr Mladek
@ 2021-06-29 19:44 ` kernel test robot
2021-06-29 15:42 ` Sergey Senozhatsky
2021-06-29 19:44 ` kernel test robot
2 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-06-29 19:44 UTC (permalink / raw)
To: Petr Mladek, John Ogness
Cc: kbuild-all, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
linux-kernel, Petr Mladek, stable
[-- Attachment #1: Type: text/plain, Size: 7244 bytes --]
Hi Petr,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v5.13 next-20210629]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Petr-Mladek/printk-console-Check-consistent-sequence-number-when-handling-race-in-console_unlock/20210629-223408
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git c54b245d011855ea91c5beff07f1db74143ce614
config: x86_64-buildonly-randconfig-r006-20210629 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/24c9c4e757be2efe5bb225e443b3502338abf64c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Petr-Mladek/printk-console-Check-consistent-sequence-number-when-handling-race-in-console_unlock/20210629-223408
git checkout 24c9c4e757be2efe5bb225e443b3502338abf64c
# save the attached .config to linux build tree
make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
kernel/printk/printk.c:175:5: warning: no previous prototype for 'devkmsg_sysctl_set_loglvl' [-Wmissing-prototypes]
175 | int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
kernel/printk/printk.c: In function 'console_unlock':
>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
2548 | u64 next_seq;
| ^~~~~~~~
vim +/next_seq +2548 kernel/printk/printk.c
2525
2526 /**
2527 * console_unlock - unlock the console system
2528 *
2529 * Releases the console_lock which the caller holds on the console system
2530 * and the console driver list.
2531 *
2532 * While the console_lock was held, console output may have been buffered
2533 * by printk(). If this is the case, console_unlock(); emits
2534 * the output prior to releasing the lock.
2535 *
2536 * If there is output waiting, we wake /dev/kmsg and syslog() users.
2537 *
2538 * console_unlock(); may be called from any context.
2539 */
2540 void console_unlock(void)
2541 {
2542 static char ext_text[CONSOLE_EXT_LOG_MAX];
2543 static char text[CONSOLE_LOG_MAX];
2544 unsigned long flags;
2545 bool do_cond_resched, retry;
2546 struct printk_info info;
2547 struct printk_record r;
> 2548 u64 next_seq;
2549
2550 if (console_suspended) {
2551 up_console_sem();
2552 return;
2553 }
2554
2555 prb_rec_init_rd(&r, &info, text, sizeof(text));
2556
2557 /*
2558 * Console drivers are called with interrupts disabled, so
2559 * @console_may_schedule should be cleared before; however, we may
2560 * end up dumping a lot of lines, for example, if called from
2561 * console registration path, and should invoke cond_resched()
2562 * between lines if allowable. Not doing so can cause a very long
2563 * scheduling stall on a slow console leading to RCU stall and
2564 * softlockup warnings which exacerbate the issue with more
2565 * messages practically incapacitating the system.
2566 *
2567 * console_trylock() is not able to detect the preemptive
2568 * context reliably. Therefore the value must be stored before
2569 * and cleared after the "again" goto label.
2570 */
2571 do_cond_resched = console_may_schedule;
2572 again:
2573 console_may_schedule = 0;
2574
2575 /*
2576 * We released the console_sem lock, so we need to recheck if
2577 * cpu is online and (if not) is there at least one CON_ANYTIME
2578 * console.
2579 */
2580 if (!can_use_console()) {
2581 console_locked = 0;
2582 up_console_sem();
2583 return;
2584 }
2585
2586 for (;;) {
2587 size_t ext_len = 0;
2588 size_t len;
2589
2590 printk_safe_enter_irqsave(flags);
2591 skip:
2592 if (!prb_read_valid(prb, console_seq, &r))
2593 break;
2594
2595 if (console_seq != r.info->seq) {
2596 console_dropped += r.info->seq - console_seq;
2597 console_seq = r.info->seq;
2598 }
2599
2600 if (suppress_message_printing(r.info->level)) {
2601 /*
2602 * Skip record we have buffered and already printed
2603 * directly to the console when we received it, and
2604 * record that has level above the console loglevel.
2605 */
2606 console_seq++;
2607 goto skip;
2608 }
2609
2610 /* Output to all consoles once old messages replayed. */
2611 if (unlikely(exclusive_console &&
2612 console_seq >= exclusive_console_stop_seq)) {
2613 exclusive_console = NULL;
2614 }
2615
2616 /*
2617 * Handle extended console text first because later
2618 * record_print_text() will modify the record buffer in-place.
2619 */
2620 if (nr_ext_console_drivers) {
2621 ext_len = info_print_ext_header(ext_text,
2622 sizeof(ext_text),
2623 r.info);
2624 ext_len += msg_print_ext_body(ext_text + ext_len,
2625 sizeof(ext_text) - ext_len,
2626 &r.text_buf[0],
2627 r.info->text_len,
2628 &r.info->dev_info);
2629 }
2630 len = record_print_text(&r,
2631 console_msg_format & MSG_FORMAT_SYSLOG,
2632 printk_time);
2633 console_seq++;
2634
2635 /*
2636 * While actively printing out messages, if another printk()
2637 * were to occur on another CPU, it may wait for this one to
2638 * finish. This task can not be preempted if there is a
2639 * waiter waiting to take over.
2640 */
2641 console_lock_spinning_enable();
2642
2643 stop_critical_timings(); /* don't trace print latency */
2644 call_console_drivers(ext_text, ext_len, text, len);
2645 start_critical_timings();
2646
2647 if (console_lock_spinning_disable_and_check()) {
2648 printk_safe_exit_irqrestore(flags);
2649 return;
2650 }
2651
2652 printk_safe_exit_irqrestore(flags);
2653
2654 if (do_cond_resched)
2655 cond_resched();
2656 }
2657
2658 /* Get consistent value of the next-to-be-used sequence number. */
2659 next_seq = console_seq;
2660
2661 console_locked = 0;
2662 up_console_sem();
2663
2664 /*
2665 * Someone could have filled up the buffer again, so re-check if there's
2666 * something to flush. In case we cannot trylock the console_sem again,
2667 * there's a new owner and the console_unlock() from them will do the
2668 * flush, no worries.
2669 */
2670 retry = prb_read_valid(prb, next_seq, NULL);
2671 printk_safe_exit_irqrestore(flags);
2672
2673 if (retry && console_trylock())
2674 goto again;
2675 }
2676 EXPORT_SYMBOL(console_unlock);
2677
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35718 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
@ 2021-06-29 19:44 ` kernel test robot
0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-06-29 19:44 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 7441 bytes --]
Hi Petr,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v5.13 next-20210629]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Petr-Mladek/printk-console-Check-consistent-sequence-number-when-handling-race-in-console_unlock/20210629-223408
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git c54b245d011855ea91c5beff07f1db74143ce614
config: x86_64-buildonly-randconfig-r006-20210629 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/24c9c4e757be2efe5bb225e443b3502338abf64c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Petr-Mladek/printk-console-Check-consistent-sequence-number-when-handling-race-in-console_unlock/20210629-223408
git checkout 24c9c4e757be2efe5bb225e443b3502338abf64c
# save the attached .config to linux build tree
make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
kernel/printk/printk.c:175:5: warning: no previous prototype for 'devkmsg_sysctl_set_loglvl' [-Wmissing-prototypes]
175 | int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
kernel/printk/printk.c: In function 'console_unlock':
>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
2548 | u64 next_seq;
| ^~~~~~~~
vim +/next_seq +2548 kernel/printk/printk.c
2525
2526 /**
2527 * console_unlock - unlock the console system
2528 *
2529 * Releases the console_lock which the caller holds on the console system
2530 * and the console driver list.
2531 *
2532 * While the console_lock was held, console output may have been buffered
2533 * by printk(). If this is the case, console_unlock(); emits
2534 * the output prior to releasing the lock.
2535 *
2536 * If there is output waiting, we wake /dev/kmsg and syslog() users.
2537 *
2538 * console_unlock(); may be called from any context.
2539 */
2540 void console_unlock(void)
2541 {
2542 static char ext_text[CONSOLE_EXT_LOG_MAX];
2543 static char text[CONSOLE_LOG_MAX];
2544 unsigned long flags;
2545 bool do_cond_resched, retry;
2546 struct printk_info info;
2547 struct printk_record r;
> 2548 u64 next_seq;
2549
2550 if (console_suspended) {
2551 up_console_sem();
2552 return;
2553 }
2554
2555 prb_rec_init_rd(&r, &info, text, sizeof(text));
2556
2557 /*
2558 * Console drivers are called with interrupts disabled, so
2559 * @console_may_schedule should be cleared before; however, we may
2560 * end up dumping a lot of lines, for example, if called from
2561 * console registration path, and should invoke cond_resched()
2562 * between lines if allowable. Not doing so can cause a very long
2563 * scheduling stall on a slow console leading to RCU stall and
2564 * softlockup warnings which exacerbate the issue with more
2565 * messages practically incapacitating the system.
2566 *
2567 * console_trylock() is not able to detect the preemptive
2568 * context reliably. Therefore the value must be stored before
2569 * and cleared after the "again" goto label.
2570 */
2571 do_cond_resched = console_may_schedule;
2572 again:
2573 console_may_schedule = 0;
2574
2575 /*
2576 * We released the console_sem lock, so we need to recheck if
2577 * cpu is online and (if not) is there at least one CON_ANYTIME
2578 * console.
2579 */
2580 if (!can_use_console()) {
2581 console_locked = 0;
2582 up_console_sem();
2583 return;
2584 }
2585
2586 for (;;) {
2587 size_t ext_len = 0;
2588 size_t len;
2589
2590 printk_safe_enter_irqsave(flags);
2591 skip:
2592 if (!prb_read_valid(prb, console_seq, &r))
2593 break;
2594
2595 if (console_seq != r.info->seq) {
2596 console_dropped += r.info->seq - console_seq;
2597 console_seq = r.info->seq;
2598 }
2599
2600 if (suppress_message_printing(r.info->level)) {
2601 /*
2602 * Skip record we have buffered and already printed
2603 * directly to the console when we received it, and
2604 * record that has level above the console loglevel.
2605 */
2606 console_seq++;
2607 goto skip;
2608 }
2609
2610 /* Output to all consoles once old messages replayed. */
2611 if (unlikely(exclusive_console &&
2612 console_seq >= exclusive_console_stop_seq)) {
2613 exclusive_console = NULL;
2614 }
2615
2616 /*
2617 * Handle extended console text first because later
2618 * record_print_text() will modify the record buffer in-place.
2619 */
2620 if (nr_ext_console_drivers) {
2621 ext_len = info_print_ext_header(ext_text,
2622 sizeof(ext_text),
2623 r.info);
2624 ext_len += msg_print_ext_body(ext_text + ext_len,
2625 sizeof(ext_text) - ext_len,
2626 &r.text_buf[0],
2627 r.info->text_len,
2628 &r.info->dev_info);
2629 }
2630 len = record_print_text(&r,
2631 console_msg_format & MSG_FORMAT_SYSLOG,
2632 printk_time);
2633 console_seq++;
2634
2635 /*
2636 * While actively printing out messages, if another printk()
2637 * were to occur on another CPU, it may wait for this one to
2638 * finish. This task can not be preempted if there is a
2639 * waiter waiting to take over.
2640 */
2641 console_lock_spinning_enable();
2642
2643 stop_critical_timings(); /* don't trace print latency */
2644 call_console_drivers(ext_text, ext_len, text, len);
2645 start_critical_timings();
2646
2647 if (console_lock_spinning_disable_and_check()) {
2648 printk_safe_exit_irqrestore(flags);
2649 return;
2650 }
2651
2652 printk_safe_exit_irqrestore(flags);
2653
2654 if (do_cond_resched)
2655 cond_resched();
2656 }
2657
2658 /* Get consistent value of the next-to-be-used sequence number. */
2659 next_seq = console_seq;
2660
2661 console_locked = 0;
2662 up_console_sem();
2663
2664 /*
2665 * Someone could have filled up the buffer again, so re-check if there's
2666 * something to flush. In case we cannot trylock the console_sem again,
2667 * there's a new owner and the console_unlock() from them will do the
2668 * flush, no worries.
2669 */
2670 retry = prb_read_valid(prb, next_seq, NULL);
2671 printk_safe_exit_irqrestore(flags);
2672
2673 if (retry && console_trylock())
2674 goto again;
2675 }
2676 EXPORT_SYMBOL(console_unlock);
2677
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35718 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 19:44 ` kernel test robot
@ 2021-06-29 20:53 ` John Ogness
-1 siblings, 0 replies; 11+ messages in thread
From: John Ogness @ 2021-06-29 20:53 UTC (permalink / raw)
To: kernel test robot, Petr Mladek
Cc: kbuild-all, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
linux-kernel, Petr Mladek, stable
On 2021-06-30, kernel test robot <lkp@intel.com> wrote:
>>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
I suppose the correct fix for this warning would be to change the NOP
macros. Currently they are:
#define prb_read_valid(rb, seq, r) false
#define prb_first_valid_seq(rb) 0
They should probably be something like (untested):
#define prb_read_valid(rb, seq, r) \
({ \
(void)(rb); \
(void)(seq); \
(void)(r); \
false; \
})
#define prb_first_valid_seq(rb) \
({ \
(void)(rb); \
0; \
})
John Ogness
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
@ 2021-06-29 20:53 ` John Ogness
0 siblings, 0 replies; 11+ messages in thread
From: John Ogness @ 2021-06-29 20:53 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 860 bytes --]
On 2021-06-30, kernel test robot <lkp@intel.com> wrote:
>>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
I suppose the correct fix for this warning would be to change the NOP
macros. Currently they are:
#define prb_read_valid(rb, seq, r) false
#define prb_first_valid_seq(rb) 0
They should probably be something like (untested):
#define prb_read_valid(rb, seq, r) \
({ \
(void)(rb); \
(void)(seq); \
(void)(r); \
false; \
})
#define prb_first_valid_seq(rb) \
({ \
(void)(rb); \
0; \
})
John Ogness
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 15:42 ` Sergey Senozhatsky
@ 2021-06-30 7:58 ` Petr Mladek
2021-07-05 3:53 ` Sergey Senozhatsky
0 siblings, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2021-06-30 7:58 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: John Ogness, Steven Rostedt, Thomas Gleixner, linux-kernel, stable
On Wed 2021-06-30 00:42:13, Sergey Senozhatsky wrote:
> On (21/06/29 16:33), Petr Mladek wrote:
> > The standard printk() tries to flush the message to the console
> > immediately. It tries to take the console lock. If the lock is
> > already taken then the current owner is responsible for flushing
> > even the new message.
> >
> > There is a small race window between checking whether a new message is
> > available and releasing the console lock. It is solved by re-checking
> > the state after releasing the console lock. If the check is positive
> > then console_unlock() tries to take the lock again and process the new
> > message as well.
> [..]
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 142a58d124d9..87411084075e 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2545,6 +2545,7 @@ void console_unlock(void)
> > bool do_cond_resched, retry;
> > struct printk_info info;
> > struct printk_record r;
> > + u64 next_seq;
> >
> > if (console_suspended) {
> > up_console_sem();
> > @@ -2654,8 +2655,10 @@ void console_unlock(void)
> > cond_resched();
> > }
> >
> > - console_locked = 0;
> > + /* Get consistent value of the next-to-be-used sequence number. */
> > + next_seq = console_seq;
> >
> > + console_locked = 0;
> > up_console_sem();
> >
> > /*
> > @@ -2664,7 +2667,7 @@ void console_unlock(void)
> > * there's a new owner and the console_unlock() from them will do the
> > * flush, no worries.
> > */
> > - retry = prb_read_valid(prb, console_seq, NULL);
> > + retry = prb_read_valid(prb, next_seq, NULL);
> > printk_safe_exit_irqrestore(flags);
> >
> > if (retry && console_trylock())
>
> Maybe it's too late here in my time zone, but what are the consequences
> of this race?
>
> `retry` can be falsely set, console_trylock() does not spin on owner,
> so the context that just released the lock can grab it again only if
> it's unlocked. For the context that just has released the console_sem
> and then acquired it again, because of the race, - console_seq will be
> valid after it acquires the lock, then it'll jump to `retry` and
> re-validated the console_seq - prb_read_valid(). If it's valid, it'll
> print the message; and should another CPU printk that CPU will spin on
> owner and then the current console_sem owner will yield to it via
> console_lock_spinning branch.
I am not sure that I follow it correctly. IMHO, there are two possible
races. I believe that you are talking about the 2nd scenario:
1st scenario: console_unlock() retries but the message has been proceed
in the meantime:
CPU0 CPU1
console_unlock()
// process all pending messages
next_seq = 100;
up_console_sem();
printk()
vprintk_store()
// storing message with seq = 100;
console_trylock_spinning()
//succees
console_unlock()
// show message with seq == 100
console_seq++; (101)
next_seq = 101;
up_consle_sem()
retry = prb_read_valid(prb, next_seq, NULL);
// true because next_seq == 100
goto retry;
if (!prb_read_valid(prb, console_seq, &r))
break;
// breaks because console_seq == 101
Result: CPU0 retired just to realize that the message
has already been procceed.
2nd scenario: printk() caller spins when other process is already
processing it's message
CPU0 CPU1
printk()
vprintk_store()
// storing message with seq == 100
console_trylock_spinning()
// succees
console_unlock()
// show message with seq == 100
console_seq++; (=> 101)
next_seq = 101;
up_console_sem();
printk()
vprintk_store()
// storing message with seq == 101
retry = prb_read_valid(prb, next_seq, NULL);
// true because next_seq == 101
goto retry
if (!prb_read_valid(prb, console_seq, &r))
// read messages with seq == 101
console_seq++; (=> 102)
console_lock_spinning_enable();
call_console_drivers();
console_trylock_spinning()
// spinning
if (console_lock_spinning_disable_and_check())
return;
// returns because there is a waiter
// got the console lock
console_unlock()
if (!prb_read_valid(prb, console_seq, &r))
break;
// breaks because console_seq == 102
Result: CPU1 was spinning just to realize that the message has already
been proceed.
It is not ideal. But the result is always correct.
The races have been there already before. Only the race window in 1st
scenario was a bit smaller.
Anyway, thanks for the review.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-29 20:53 ` John Ogness
@ 2021-06-30 8:57 ` Petr Mladek
-1 siblings, 0 replies; 11+ messages in thread
From: Petr Mladek @ 2021-06-30 8:57 UTC (permalink / raw)
To: John Ogness
Cc: kernel test robot, kbuild-all, Sergey Senozhatsky,
Steven Rostedt, Thomas Gleixner, linux-kernel, stable
On Tue 2021-06-29 22:59:57, John Ogness wrote:
> On 2021-06-30, kernel test robot <lkp@intel.com> wrote:
> >>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
>
> I suppose the correct fix for this warning would be to change the NOP
> macros. Currently they are:
>
> #define prb_read_valid(rb, seq, r) false
> #define prb_first_valid_seq(rb) 0
>
> They should probably be something like (untested):
>
> #define prb_read_valid(rb, seq, r) \
> ({ \
> (void)(rb); \
> (void)(seq); \
> (void)(r); \
> false; \
> })
This did not work:
kernel/printk/printk.c: In function ‘console_unlock’:
kernel/printk/printk.c:2600:23: error: ‘prb’ undeclared (first use in this function)
if (!prb_read_valid(prb, console_seq, &r))
^
kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’
(void)(rb); \
^~
kernel/printk/printk.c:2600:23: note: each undeclared identifier is reported only once for each function it appears in
if (!prb_read_valid(prb, console_seq, &r))
^
kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’
(void)(rb); \
^~
Instead, it might be solved by declaring next_seq as:
u64 __maybe_unused next_seq;
Any better idea is welcome. Well, it is not worth any big complexity.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
@ 2021-06-30 8:57 ` Petr Mladek
0 siblings, 0 replies; 11+ messages in thread
From: Petr Mladek @ 2021-06-30 8:57 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 1674 bytes --]
On Tue 2021-06-29 22:59:57, John Ogness wrote:
> On 2021-06-30, kernel test robot <lkp@intel.com> wrote:
> >>> kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
>
> I suppose the correct fix for this warning would be to change the NOP
> macros. Currently they are:
>
> #define prb_read_valid(rb, seq, r) false
> #define prb_first_valid_seq(rb) 0
>
> They should probably be something like (untested):
>
> #define prb_read_valid(rb, seq, r) \
> ({ \
> (void)(rb); \
> (void)(seq); \
> (void)(r); \
> false; \
> })
This did not work:
kernel/printk/printk.c: In function ‘console_unlock’:
kernel/printk/printk.c:2600:23: error: ‘prb’ undeclared (first use in this function)
if (!prb_read_valid(prb, console_seq, &r))
^
kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’
(void)(rb); \
^~
kernel/printk/printk.c:2600:23: note: each undeclared identifier is reported only once for each function it appears in
if (!prb_read_valid(prb, console_seq, &r))
^
kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’
(void)(rb); \
^~
Instead, it might be solved by declaring next_seq as:
u64 __maybe_unused next_seq;
Any better idea is welcome. Well, it is not worth any big complexity.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()
2021-06-30 7:58 ` Petr Mladek
@ 2021-07-05 3:53 ` Sergey Senozhatsky
0 siblings, 0 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2021-07-05 3:53 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, John Ogness, Steven Rostedt, Thomas Gleixner,
linux-kernel, stable
On (21/06/30 09:58), Petr Mladek wrote:
[..]
> > `retry` can be falsely set, console_trylock() does not spin on owner,
> > so the context that just released the lock can grab it again only if
> > it's unlocked. For the context that just has released the console_sem
> > and then acquired it again, because of the race, - console_seq will be
> > valid after it acquires the lock, then it'll jump to `retry` and
> > re-validated the console_seq - prb_read_valid(). If it's valid, it'll
> > print the message; and should another CPU printk that CPU will spin on
> > owner and then the current console_sem owner will yield to it via
> > console_lock_spinning branch.
>
> I am not sure that I follow it correctly. IMHO, there are two possible
> races. I believe that you are talking about the 2nd scenario:
I guess I was thinking about two scenarios simultaneously, but you
certainly did a much better job describing them.
Thanks a lot for spending time on this!
> 1st scenario: console_unlock() retries but the message has been proceed
> in the meantime:
[..]
> Result: CPU0 retired just to realize that the message
> has already been procceed.
Ack.
> 2nd scenario: printk() caller spins when other process is already
> processing it's message
[..]
> Result: CPU1 was spinning just to realize that the message has already
> been proceed.
Ack.
> It is not ideal. But the result is always correct.
>
> The races have been there already before. Only the race window in 1st
> scenario was a bit smaller.
Yeah, this was my assertion as well, but I wanted to double check.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-07-05 3:53 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-29 14:33 [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock() Petr Mladek
2021-06-29 14:48 ` John Ogness
2021-06-29 15:42 ` Sergey Senozhatsky
2021-06-30 7:58 ` Petr Mladek
2021-07-05 3:53 ` Sergey Senozhatsky
2021-06-29 19:44 ` kernel test robot
2021-06-29 19:44 ` kernel test robot
2021-06-29 20:53 ` John Ogness
2021-06-29 20:53 ` John Ogness
2021-06-30 8:57 ` Petr Mladek
2021-06-30 8:57 ` Petr Mladek
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.