Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()

From: Sergey Senozhatsky @ 2016-09-28  1:18 UTC
  To: Petr Mladek
  Cc: Matt Fleming, Byungchul Park, Frederic Weisbecker, Jan Kara,
	Luca Abeni, Rik van Riel, Thomas Gleixner, Wanpeng Li, Yuyang Du,
	Mel Gorman, Mike Galbraith, Tejun Heo, Calvin Owens,
	linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

On (09/27/16 18:02), Petr Mladek wrote:
> The main trick is that we replace the per-CPU function pointer
> by a preempt_count-like variable that could track the printk context.
> 
> I know that Sergey has other ideas in this area. But I wanted to see
> what this approach would look like.

well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I
think, the maintenance cost of that solution is just too high:

a) every existing WARN_* in sched/timekeeping/who knows where else
   must be evaluated to ensure that it can't be called from printk()
   path. if `false' -- then the corresponding macro must be replaced
   with _DEFERRED flavor.

b) any patch that adds new WARN_* usages must be additionally checked
   to ensure that each of new WARN_* macros cannot be called from printk
   path. if `false' -- the corresponding macro must be replaced with
   _DEFERRED flavor.

c) any patch that refactors the code or moves some function calls around
   etc. must be additionally checked for any accidental WARN_* from printk
   path. even if none of the patches added any new WARN_* to the code.

d) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not
   necessarily newly added (see 'c').

that's too much.
for example [not blaming anyone], a recent patch [2] added a reasonable
WARN_ON_ONCE to assert_clock_updated(), which, however, can result in a
possible printk() deadlock scenario that you, Petr, outlined [3]:

:+ printk()
:  + vprintk_func -> vprintk_default()
:    + vprintk_emit()
:      + console_unlock()
:        + up_console_sem()
:          + up()                # takes &sem->lock
:            + __up()
:              + wake_up_process()
:                + try_to_wake_up()
:                  + ttwu_queue()
:                    + ttwu_do_activate()
:                      + ttwu_do_wakeup()
:                        + rq_clock()
:                          + lockdep_assert_held()
:                            + WARN_ON_ONCE()
:                              + printk()
:                                + vprintk_func -> vprintk_default()
:                                  + vprintk_emit()
:                                    + console_trylock()
:                                      + down_trylock_console_sem()
:                                        + __down_trylock_console_sem()
:                                          + down_trylock()
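
The chain above can be reduced to a small user-space model. This is only a sketch: every name in it (model_lock(), model_up(), model_printk() and so on) is hypothetical and stands in for the kernel path, and instead of spinning forever on the second acquisition of the lock it counts the attempt, so the problem can be observed without hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these are the kernel's functions. */
static bool sem_lock_held;	/* stands in for &sem->lock */
static bool warn_pending;	/* pretend the scheduler assertion fails once */
static int deadlocks_detected;

static void model_printk(const char *msg);

/* Non-recursive lock: a second acquisition on the same CPU never succeeds. */
static bool model_lock(void)
{
	if (sem_lock_held) {
		deadlocks_detected++;	/* a real spinlock would spin forever here */
		printf("model: lock already held, this would deadlock\n");
		return false;
	}
	sem_lock_held = true;
	return true;
}

static void model_unlock(void)
{
	assert(sem_lock_held);
	sem_lock_held = false;
}

/* Stands in for wake_up_process() ... WARN_ON_ONCE(). */
static void model_wake_up(void)
{
	if (warn_pending) {
		warn_pending = false;	/* _ONCE semantics */
		model_printk("WARNING: raised from the wake-up path");
	}
}

/* Stands in for down_trylock(): takes the lock briefly. */
static int model_down_trylock(void)
{
	if (!model_lock())
		return -1;
	model_unlock();
	return 0;
}

/* Stands in for up(): wakes a waiter while the lock is still held. */
static void model_up(void)
{
	if (!model_lock())
		return;
	model_wake_up();
	model_unlock();
}

static void model_printk(const char *msg)
{
	(void)msg;
	if (model_down_trylock() != 0)	/* console_trylock() step */
		return;
	model_up();			/* console_unlock() step */
}

/* Runs one printk() and returns how many re-acquisitions were attempted. */
int run_deadlock_model(bool warn)
{
	sem_lock_held = false;
	deadlocks_detected = 0;
	warn_pending = warn;
	model_printk("hello");
	return deadlocks_detected;
}
```

Calling run_deadlock_model(true) reports one attempted re-acquisition, which is the point where the real path would hang; run_deadlock_model(false) reports none.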

it takes a lot of additional effort, because both reviewer and contributor
must consider printk() internals. and, what's worse, if something goes
unnoticed we end up having a printk() deadlock again.

so I decided to address some of printk() issues in printk.c, not in
kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else.
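
To make the idea of handling this inside printk itself more concrete, here is a minimal user-space sketch of a preempt_count-like context counter, along the lines of what the quoted cover letter describes. It is not the code from either patch set: all names (ctx_printk(), ctx_emit(), the fixed-size buffer) are assumptions made for illustration, and a real implementation would need per-CPU state and care about NMI and interrupt contexts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names throughout; a user-space model, not kernel code. */
#define CTX_SLOTS	8
#define CTX_MSG_LEN	64

static int ctx_depth;				/* preempt_count-like nesting depth */
static char ctx_deferred[CTX_SLOTS][CTX_MSG_LEN];
static int ctx_deferred_count;
static int ctx_emitted;				/* messages that reached the "console" */
static int ctx_warn_pending;

static void ctx_printk(const char *msg);

/* Stands in for the console path, which may itself trigger a warning. */
static void ctx_emit(const char *msg)
{
	ctx_emitted++;
	printf("console: %s\n", msg);
	if (ctx_warn_pending) {
		ctx_warn_pending = 0;
		ctx_printk("WARNING: raised from the console path");
	}
}

/* Emits stashed messages once we are back in a safe context. */
static void ctx_flush_deferred(void)
{
	assert(ctx_depth == 0);
	for (int i = 0; i < ctx_deferred_count; i++) {
		ctx_depth++;
		ctx_emit(ctx_deferred[i]);
		ctx_depth--;
	}
	ctx_deferred_count = 0;
}

static void ctx_printk(const char *msg)
{
	if (ctx_depth > 0) {
		/* Nested call: stay away from console locks, only stash the text. */
		if (ctx_deferred_count < CTX_SLOTS) {
			snprintf(ctx_deferred[ctx_deferred_count],
				 CTX_MSG_LEN, "%s", msg);
			ctx_deferred_count++;
		}
		return;
	}

	ctx_depth++;
	ctx_emit(msg);
	ctx_depth--;

	ctx_flush_deferred();
}

/* Runs one printk() and returns how many messages were emitted in total. */
int run_deferred_model(int warn)
{
	ctx_depth = 0;
	ctx_deferred_count = 0;
	ctx_emitted = 0;
	ctx_warn_pending = warn;
	ctx_printk("hello");
	return ctx_emitted;
}
```

With the warning enabled, both the original message and the nested warning reach the model console, and the nested one never touches the console path while the outer call is still inside it. The callers in sched or timekeeping stay unchanged, which is the property argued for above.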

> Mid-air collision:
>
> I have just realized that Sergey sent another patchset that was
> more generic, complicated, and had some similarities, see
> https://lkml.kernel.org/r/20160927142237.5539-1-sergey.senozhatsky@gmail.com

yeah, I should have Cc-ed a wider audience. do I need to resend the
patch set with the `extended' Cc list?

[1] https://marc.info/?l=linux-kernel&m=147158843319944
[2] https://marc.info/?l=linux-kernel&m=147446511924573
[3] https://marc.info/?l=linux-kernel&m=147447352127741

	-ss
