linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] RFC: console: hack up console_trylock more
@ 2019-05-02 14:16 Daniel Vetter
  2019-05-03 15:14 ` Petr Mladek
  2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
  0 siblings, 2 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-05-02 14:16 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, John Ogness, linux-kernel

console_trylock, called from within printk, can be called from pretty
much anywhere. Including try_to_wake_up. Note that this isn't common,
usually the box is in pretty bad shape at that point already. But it
really doesn't help when then lockdep jumps in and spams the logs,
potentially obscuring the real backtrace we're really interested in.
One case I've seen (slightly simplified backtrace):

 Call Trace:
  <IRQ>
  console_trylock+0xe/0x60
  vprintk_emit+0xf1/0x320
  printk+0x4d/0x69
  __warn_printk+0x46/0x90
  native_smp_send_reschedule+0x2f/0x40
  check_preempt_curr+0x81/0xa0
  ttwu_do_wakeup+0x14/0x220
  try_to_wake_up+0x218/0x5f0
  pollwake+0x6f/0x90
  credit_entropy_bits+0x204/0x310
  add_interrupt_randomness+0x18f/0x210
  handle_irq+0x67/0x160
  do_IRQ+0x5e/0x130
  common_interrupt+0xf/0xf
  </IRQ>

This alone isn't a problem, but the spinlock in the semaphore is also
still held while waking up waiters (up() -> __up() -> try_to_wake_up()
callchain), which then closes the runqueue vs. semaphore.lock loop,
and upsets lockdep, which issues a circular locking splat to dmesg.
Worse it upsets developers, since we don't want to spam dmesg with
clutter when the machine is dying already.

Fix this by creating a __down_trylock which only trylocks the
semaphore.lock. This isn't correct in full generality, but good enough
for console_lock:

- there's only ever one console_lock holder, we won't fail spuriously
  because someone is doing a down() or up() while there's still room
  (unlike other semaphores with count > 1).

- console_unlock() has one massive retry loop, which will catch anyone
  who races the trylock against the up(). This makes sure that no
  printk lines will get lost. Making the trylock more racy therefore
  has no further impact.

Also cc'ing John Ogness since perhaps his printk rework fixes this all
properly.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org
---
 include/linux/semaphore.h  |  1 +
 kernel/locking/semaphore.c | 33 +++++++++++++++++++++++++++++++++
 kernel/printk/printk.c     |  2 +-
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
index 11c86fbfeb98..82f6db6121c3 100644
--- a/include/linux/semaphore.h
+++ b/include/linux/semaphore.h
@@ -40,6 +40,7 @@ extern void down(struct semaphore *sem);
 extern int __must_check down_interruptible(struct semaphore *sem);
 extern int __must_check down_killable(struct semaphore *sem);
 extern int __must_check down_trylock(struct semaphore *sem);
+extern int __must_check __down_trylock(struct semaphore *sem);
 extern int __must_check down_timeout(struct semaphore *sem, long jiffies);
 extern void up(struct semaphore *sem);
 
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 561acdd39960..345123336019 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -143,6 +143,39 @@ int down_trylock(struct semaphore *sem)
 }
 EXPORT_SYMBOL(down_trylock);
 
+/**
+ * __down_trylock - try to acquire the semaphore, without any locking
+ * @sem: the semaphore to be acquired
+ *
+ * Try to acquire the semaphore atomically.  Returns 0 if the semaphore has
+ * been acquired successfully or 1 if it it cannot be acquired.
+ *
+ * NOTE: This return value is inverted from both spin_trylock and
+ * mutex_trylock!  Be careful about this when converting code.
+ *
+ * Unlike mutex_trylock, this function can be used from interrupt context,
+ * and the semaphore can be released by any task or interrupt.
+ *
+ * WARNING: Unlike down_trylock this function doesn't guarantee that that the
+ * semaphore will be acquired if it could, it's best effort only. Use for
+ * down_trylock_console_sem only.
+ */
+int __down_trylock(struct semaphore *sem)
+{
+	unsigned long flags;
+	int count;
+
+	if (!raw_spin_trylock_irqsave(&sem->lock, flags))
+		return 1;
+	count = sem->count - 1;
+	if (likely(count >= 0))
+		sem->count = count;
+	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	return (count < 0);
+}
+EXPORT_SYMBOL(__down_trylock);
+
 /**
  * down_timeout - acquire the semaphore within a specified time
  * @sem: the semaphore to be acquired
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02ca827b8fac..5293803eec1f 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -217,7 +217,7 @@ static int __down_trylock_console_sem(unsigned long ip)
 	 * deadlock in printk()->down_trylock_console_sem() otherwise.
 	 */
 	printk_safe_enter_irqsave(flags);
-	lock_failed = down_trylock(&console_sem);
+	lock_failed = __down_trylock(&console_sem);
 	printk_safe_exit_irqrestore(flags);
 
 	if (lock_failed)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_trylock more
  2019-05-02 14:16 [PATCH] RFC: console: hack up console_trylock more Daniel Vetter
@ 2019-05-03 15:14 ` Petr Mladek
  2019-05-06  7:11   ` Daniel Vetter
  2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
  1 sibling, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2019-05-03 15:14 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, linux-kernel

On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
> 
>  Call Trace:
>   <IRQ>
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0
>   pollwake+0x6f/0x90
>   credit_entropy_bits+0x204/0x310
>   add_interrupt_randomness+0x18f/0x210
>   handle_irq+0x67/0x160
>   do_IRQ+0x5e/0x130
>   common_interrupt+0xf/0xf
>   </IRQ>
> 
> This alone isn't a problem, but the spinlock in the semaphore is also
> still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> callchain), which then closes the runqueue vs. semaphore.lock loop,
> and upsets lockdep, which issues a circular locking splat to dmesg.
> Worse it upsets developers, since we don't want to spam dmesg with
> clutter when the machine is dying already.
>
> Fix this by creating a __down_trylock which only trylocks the
> semaphore.lock. This isn't correct in full generality, but good enough
> for console_lock:
> 
> - there's only ever one console_lock holder, we won't fail spuriously
>   because someone is doing a down() or up() while there's still room
>   (unlike other semaphores with count > 1).
>
> - console_unlock() has one massive retry loop, which will catch anyone
>   who races the trylock against the up(). This makes sure that no
>   printk lines will get lost. Making the trylock more racy therefore
>   has no further impact.

To be honest, I do not see how this could solve the problem.

The circular dependency is still there. If the new __down_trylock()
succeeds then console_unlock() will get called in the same context
and it will still need to call up() -> try_to_wake_up().

Note that there are many other console_lock() callers that might
happen in parallel and might appear in the wait queue.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_trylock more
  2019-05-03 15:14 ` Petr Mladek
@ 2019-05-06  7:11   ` Daniel Vetter
  2019-05-06  7:48     ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-05-06  7:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: DRI Development, Intel Graphics Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, Linux Kernel Mailing List

On Fri, May 3, 2019 at 5:14 PM Petr Mladek <pmladek@suse.com> wrote:
> On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> > console_trylock, called from within printk, can be called from pretty
> > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > usually the box is in pretty bad shape at that point already. But it
> > really doesn't help when then lockdep jumps in and spams the logs,
> > potentially obscuring the real backtrace we're really interested in.
> > One case I've seen (slightly simplified backtrace):
> >
> >  Call Trace:
> >   <IRQ>
> >   console_trylock+0xe/0x60
> >   vprintk_emit+0xf1/0x320
> >   printk+0x4d/0x69
> >   __warn_printk+0x46/0x90
> >   native_smp_send_reschedule+0x2f/0x40
> >   check_preempt_curr+0x81/0xa0
> >   ttwu_do_wakeup+0x14/0x220
> >   try_to_wake_up+0x218/0x5f0
> >   pollwake+0x6f/0x90
> >   credit_entropy_bits+0x204/0x310
> >   add_interrupt_randomness+0x18f/0x210
> >   handle_irq+0x67/0x160
> >   do_IRQ+0x5e/0x130
> >   common_interrupt+0xf/0xf
> >   </IRQ>
> >
> > This alone isn't a problem, but the spinlock in the semaphore is also
> > still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> > callchain), which then closes the runqueue vs. semaphore.lock loop,
> > and upsets lockdep, which issues a circular locking splat to dmesg.
> > Worse it upsets developers, since we don't want to spam dmesg with
> > clutter when the machine is dying already.
> >
> > Fix this by creating a __down_trylock which only trylocks the
> > semaphore.lock. This isn't correct in full generality, but good enough
> > for console_lock:
> >
> > - there's only ever one console_lock holder, we won't fail spuriously
> >   because someone is doing a down() or up() while there's still room
> >   (unlike other semaphores with count > 1).
> >
> > - console_unlock() has one massive retry loop, which will catch anyone
> >   who races the trylock against the up(). This makes sure that no
> >   printk lines will get lost. Making the trylock more racy therefore
> >   has no further impact.
>
> To be honest, I do not see how this could solve the problem.
>
> The circular dependency is still there. If the new __down_trylock()
> succeeds then console_unlock() will get called in the same context
> and it will still need to call up() -> try_to_wake_up().
>
> Note that there are many other console_lock() callers that might
> happen in parallel and might appear in the wait queue.

Hm right. It's very rare we hit this in our CI and I don't know how to
repro otherwise, so just threw this out at the wall to see if it
sticks. I'll try and come up with a new trick then.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] RFC: console: hack up console_lock more v2
  2019-05-02 14:16 [PATCH] RFC: console: hack up console_trylock more Daniel Vetter
  2019-05-03 15:14 ` Petr Mladek
@ 2019-05-06  7:45 ` Daniel Vetter
  2019-05-06  8:16   ` Petr Mladek
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-05-06  7:45 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: Daniel Vetter, Daniel Vetter, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

console_trylock, called from within printk, can be called from pretty
much anywhere. Including try_to_wake_up. Note that this isn't common,
usually the box is in pretty bad shape at that point already. But it
really doesn't help when then lockdep jumps in and spams the logs,
potentially obscuring the real backtrace we're really interested in.
One case I've seen (slightly simplified backtrace):

 Call Trace:
  <IRQ>
  console_trylock+0xe/0x60
  vprintk_emit+0xf1/0x320
  printk+0x4d/0x69
  __warn_printk+0x46/0x90
  native_smp_send_reschedule+0x2f/0x40
  check_preempt_curr+0x81/0xa0
  ttwu_do_wakeup+0x14/0x220
  try_to_wake_up+0x218/0x5f0
  pollwake+0x6f/0x90
  credit_entropy_bits+0x204/0x310
  add_interrupt_randomness+0x18f/0x210
  handle_irq+0x67/0x160
  do_IRQ+0x5e/0x130
  common_interrupt+0xf/0xf
  </IRQ>

This alone isn't a problem, but the spinlock in the semaphore is also
still held while waking up waiters (up() -> __up() -> try_to_wake_up()
callchain), which then closes the runqueue vs. semaphore.lock loop,
and upsets lockdep, which issues a circular locking splat to dmesg.
Worse it upsets developers, since we don't want to spam dmesg with
clutter when the machine is dying already.

Fix this by creating a prinkt_safe_up() which calls wake_up_process
outside of the spinlock. This isn't correct in full generality, but
good enough for console_lock:

- console_lock doesn't use interruptible or killable or timeout down()
  calls, hence an up() is the only thing that can wake up a process.
  Hence the process can't get woken and killed and reaped while we try
  to wake it up too.

- semaphore.c always updates the waiter list while under the spinlock,
  so there's no other races. Specifically another process that races
  with a quick console_lock/unlock while we've dropped the spinlock
  already won't see our own waiter.

Also cc'ing John Ogness since perhaps his printk rework fixes this all
properly.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org
---
v2: My first attempt at patching down_trylock didn't work (thanks to
Petr for pointing out), now try to have a special up() that breaks the
cycle.
-Daniel
---
 include/linux/semaphore.h  |  1 +
 kernel/locking/semaphore.c | 31 +++++++++++++++++++++++++++++++
 kernel/printk/printk.c     |  2 +-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
index 11c86fbfeb98..7e839c72809d 100644
--- a/include/linux/semaphore.h
+++ b/include/linux/semaphore.h
@@ -41,6 +41,7 @@ extern int __must_check down_interruptible(struct semaphore *sem);
 extern int __must_check down_killable(struct semaphore *sem);
 extern int __must_check down_trylock(struct semaphore *sem);
 extern int __must_check down_timeout(struct semaphore *sem, long jiffies);
+extern void printk_safe_up(struct semaphore *sem);
 extern void up(struct semaphore *sem);
 
 #endif /* __LINUX_SEMAPHORE_H */
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 561acdd39960..84675993dc59 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -197,6 +197,37 @@ struct semaphore_waiter {
 	bool up;
 };
 
+/**
+ * printk_safe_up - release the semaphore in console_unlock
+ * @sem: the semaphore to release
+ *
+ * Release the semaphore.  Unlike mutexes, up() may be called from any
+ * context and even by tasks which have never called down().
+ *
+ * NOTE: This is a special version of up() for console_unlock only. It is only
+ * safe if there are no killable, interruptible or timing out down() calls.
+ */
+void printk_safe_up(struct semaphore *sem)
+{
+	unsigned long flags;
+	struct semaphore_waiter *waiter = NULL;
+
+	raw_spin_lock_irqsave(&sem->lock, flags);
+	if (likely(list_empty(&sem->wait_list))) {
+		sem->count++;
+	} else {
+		waiter = list_first_entry(&sem->wait_list,
+					  struct semaphore_waiter, list);
+		list_del(&waiter->list);
+		waiter->up = true;
+	}
+	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	if (waiter)
+		wake_up_process(waiter->task);
+}
+EXPORT_SYMBOL(printk_safe_up);
+
 /*
  * Because this function is inlined, the 'state' parameter will be
  * constant, and thus optimised away by the compiler.  Likewise the
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02ca827b8fac..62303929afda 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -234,7 +234,7 @@ static void __up_console_sem(unsigned long ip)
 	mutex_release(&console_lock_dep_map, 1, ip);
 
 	printk_safe_enter_irqsave(flags);
-	up(&console_sem);
+	printk_safe_up(&console_sem);
 	printk_safe_exit_irqrestore(flags);
 }
 #define up_console_sem() __up_console_sem(_RET_IP_)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_trylock more
  2019-05-06  7:11   ` Daniel Vetter
@ 2019-05-06  7:48     ` Petr Mladek
  2019-05-06  8:40       ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2019-05-06  7:48 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, Linux Kernel Mailing List

On Mon 2019-05-06 09:11:37, Daniel Vetter wrote:
> On Fri, May 3, 2019 at 5:14 PM Petr Mladek <pmladek@suse.com> wrote:
> > On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> > > console_trylock, called from within printk, can be called from pretty
> > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > usually the box is in pretty bad shape at that point already. But it
> > > really doesn't help when then lockdep jumps in and spams the logs,
> > > potentially obscuring the real backtrace we're really interested in.
> > > One case I've seen (slightly simplified backtrace):
> > >
> > >  Call Trace:
> > >   <IRQ>
> > >   console_trylock+0xe/0x60
> > >   vprintk_emit+0xf1/0x320
> > >   printk+0x4d/0x69
> > >   __warn_printk+0x46/0x90
> > >   native_smp_send_reschedule+0x2f/0x40
> > >   check_preempt_curr+0x81/0xa0
> > >   ttwu_do_wakeup+0x14/0x220
> > >   try_to_wake_up+0x218/0x5f0
> > >   pollwake+0x6f/0x90
> > >   credit_entropy_bits+0x204/0x310
> > >   add_interrupt_randomness+0x18f/0x210
> > >   handle_irq+0x67/0x160
> > >   do_IRQ+0x5e/0x130
> > >   common_interrupt+0xf/0xf
> > >   </IRQ>
> > >
> > > This alone isn't a problem, but the spinlock in the semaphore is also
> > > still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> > > callchain), which then closes the runqueue vs. semaphore.lock loop,
> > > and upsets lockdep, which issues a circular locking splat to dmesg.
> > > Worse it upsets developers, since we don't want to spam dmesg with
> > > clutter when the machine is dying already.
> > >
> > > Fix this by creating a __down_trylock which only trylocks the
> > > semaphore.lock. This isn't correct in full generality, but good enough
> > > for console_lock:
> > >
> > > - there's only ever one console_lock holder, we won't fail spuriously
> > >   because someone is doing a down() or up() while there's still room
> > >   (unlike other semaphores with count > 1).
> > >
> > > - console_unlock() has one massive retry loop, which will catch anyone
> > >   who races the trylock against the up(). This makes sure that no
> > >   printk lines will get lost. Making the trylock more racy therefore
> > >   has no further impact.
> >
> > To be honest, I do not see how this could solve the problem.
> >
> > The circular dependency is still there. If the new __down_trylock()
> > succeeds then console_unlock() will get called in the same context
> > and it will still need to call up() -> try_to_wake_up().
> >
> > Note that there are many other console_lock() callers that might
> > happen in parallel and might appear in the wait queue.
> 
> Hm right. It's very rare we hit this in our CI and I don't know how to
> repro otherwise, so just threw this out at the wall to see if it
> sticks. I'll try and come up with a new trick then.

Single messages are printed from scheduler via printk_deferred().
WARN() might be solved by introducing printk deferred context,
see the per-cpu variable printk_context.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
@ 2019-05-06  8:16   ` Petr Mladek
  2019-05-06  8:26     ` Petr Mladek
  2019-05-06  8:20   ` Petr Mladek
  2019-05-09 10:32   ` [Intel-gfx] " Chris Wilson
  2 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2019-05-06  8:16 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
> 
>  Call Trace:
>   <IRQ>
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0

try_to_wake_up() takes p->pi_lock. It could deadlock because it
can get called recursively from printk_safe_up().

And there are more locks taken from try_to_wake_up(), for example,
__task_rq_lock() taken from ttwu_remote().

IMHO, the most reliable solution would be do call the entire
up_console_sem() from printk deferred context. We could assign
few bytes for this context in the per-CPU printk_deferred
variable.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
  2019-05-06  8:16   ` Petr Mladek
@ 2019-05-06  8:20   ` Petr Mladek
  2019-05-09 10:32   ` [Intel-gfx] " Chris Wilson
  2 siblings, 0 replies; 15+ messages in thread
From: Petr Mladek @ 2019-05-06  8:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
> 
>  Call Trace:
>   <IRQ>
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0

try_to_wake_up() takes p->pi_lock. It could deadlock because it
can get called recursively from printk_safe_up().

And there are more locks taken from try_to_wake_up(), for example,
__task_rq_lock() taken from ttwu_remote().

IMHO, the most reliable solution would be do call the entire
up_console_sem() from printk deferred context. We could assign
few bytes for this context in the per-CPU printk_deferred
variable.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  8:16   ` Petr Mladek
@ 2019-05-06  8:26     ` Petr Mladek
  2019-05-06  9:38       ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2019-05-06  8:26 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > console_trylock, called from within printk, can be called from pretty
> > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > usually the box is in pretty bad shape at that point already. But it
> > really doesn't help when then lockdep jumps in and spams the logs,
> > potentially obscuring the real backtrace we're really interested in.
> > One case I've seen (slightly simplified backtrace):
> > 
> >  Call Trace:
> >   <IRQ>
> >   console_trylock+0xe/0x60
> >   vprintk_emit+0xf1/0x320
> >   printk+0x4d/0x69
> >   __warn_printk+0x46/0x90
> >   native_smp_send_reschedule+0x2f/0x40
> >   check_preempt_curr+0x81/0xa0
> >   ttwu_do_wakeup+0x14/0x220
> >   try_to_wake_up+0x218/0x5f0
> 
> try_to_wake_up() takes p->pi_lock. It could deadlock because it
> can get called recursively from printk_safe_up().
> 
> And there are more locks taken from try_to_wake_up(), for example,
> __task_rq_lock() taken from ttwu_remote().
> 
> IMHO, the most reliable solution would be do call the entire
> up_console_sem() from printk deferred context. We could assign
> few bytes for this context in the per-CPU printk_deferred
> variable.

Ah, I was too fast and did the same mistake. This won't help because
it would still call try_to_wake_up() recursively.

We need to call all printk's that can be called under locks
taken in try_to_wake_up() path in printk deferred context.
Unfortunately it is whack a mole approach.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_trylock more
  2019-05-06  7:48     ` Petr Mladek
@ 2019-05-06  8:40       ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-05-06  8:40 UTC (permalink / raw)
  To: Petr Mladek
  Cc: DRI Development, Intel Graphics Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, Linux Kernel Mailing List

On Mon, May 6, 2019 at 9:48 AM Petr Mladek <pmladek@suse.com> wrote:
> On Mon 2019-05-06 09:11:37, Daniel Vetter wrote:
> > On Fri, May 3, 2019 at 5:14 PM Petr Mladek <pmladek@suse.com> wrote:
> > > On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> > > > console_trylock, called from within printk, can be called from pretty
> > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > usually the box is in pretty bad shape at that point already. But it
> > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > potentially obscuring the real backtrace we're really interested in.
> > > > One case I've seen (slightly simplified backtrace):
> > > >
> > > >  Call Trace:
> > > >   <IRQ>
> > > >   console_trylock+0xe/0x60
> > > >   vprintk_emit+0xf1/0x320
> > > >   printk+0x4d/0x69
> > > >   __warn_printk+0x46/0x90
> > > >   native_smp_send_reschedule+0x2f/0x40
> > > >   check_preempt_curr+0x81/0xa0
> > > >   ttwu_do_wakeup+0x14/0x220
> > > >   try_to_wake_up+0x218/0x5f0
> > > >   pollwake+0x6f/0x90
> > > >   credit_entropy_bits+0x204/0x310
> > > >   add_interrupt_randomness+0x18f/0x210
> > > >   handle_irq+0x67/0x160
> > > >   do_IRQ+0x5e/0x130
> > > >   common_interrupt+0xf/0xf
> > > >   </IRQ>
> > > >
> > > > This alone isn't a problem, but the spinlock in the semaphore is also
> > > > still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> > > > callchain), which then closes the runqueue vs. semaphore.lock loop,
> > > > and upsets lockdep, which issues a circular locking splat to dmesg.
> > > > Worse it upsets developers, since we don't want to spam dmesg with
> > > > clutter when the machine is dying already.
> > > >
> > > > Fix this by creating a __down_trylock which only trylocks the
> > > > semaphore.lock. This isn't correct in full generality, but good enough
> > > > for console_lock:
> > > >
> > > > - there's only ever one console_lock holder, we won't fail spuriously
> > > >   because someone is doing a down() or up() while there's still room
> > > >   (unlike other semaphores with count > 1).
> > > >
> > > > - console_unlock() has one massive retry loop, which will catch anyone
> > > >   who races the trylock against the up(). This makes sure that no
> > > >   printk lines will get lost. Making the trylock more racy therefore
> > > >   has no further impact.
> > >
> > > To be honest, I do not see how this could solve the problem.
> > >
> > > The circular dependency is still there. If the new __down_trylock()
> > > succeeds then console_unlock() will get called in the same context
> > > and it will still need to call up() -> try_to_wake_up().
> > >
> > > Note that there are many other console_lock() callers that might
> > > happen in parallel and might appear in the wait queue.
> >
> > Hm right. It's very rare we hit this in our CI and I don't know how to
> > repro otherwise, so just threw this out at the wall to see if it
> > sticks. I'll try and come up with a new trick then.
>
> Single messages are printed from scheduler via printk_deferred().
> WARN() might be solved by introducing printk deferred context,
> see the per-cpu variable printk_context.

I convinced myself that I can take the wake_up_process out from under
the spinlock, for the limited case of the console lock. I think that's
a cleaner and more robust fix than leaking printk_context trickery
into the console_unlock code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  8:26     ` Petr Mladek
@ 2019-05-06  9:38       ` Daniel Vetter
  2019-05-06 11:24         ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-05-06  9:38 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Daniel Vetter, Intel Graphics Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, linux-kernel

On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > console_trylock, called from within printk, can be called from pretty
> > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > usually the box is in pretty bad shape at that point already. But it
> > > really doesn't help when then lockdep jumps in and spams the logs,
> > > potentially obscuring the real backtrace we're really interested in.
> > > One case I've seen (slightly simplified backtrace):
> > > 
> > >  Call Trace:
> > >   <IRQ>
> > >   console_trylock+0xe/0x60
> > >   vprintk_emit+0xf1/0x320
> > >   printk+0x4d/0x69
> > >   __warn_printk+0x46/0x90
> > >   native_smp_send_reschedule+0x2f/0x40
> > >   check_preempt_curr+0x81/0xa0
> > >   ttwu_do_wakeup+0x14/0x220
> > >   try_to_wake_up+0x218/0x5f0
> > 
> > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > can get called recursively from printk_safe_up().
> > 
> > And there are more locks taken from try_to_wake_up(), for example,
> > __task_rq_lock() taken from ttwu_remote().
> > 
> > IMHO, the most reliable solution would be do call the entire
> > up_console_sem() from printk deferred context. We could assign
> > few bytes for this context in the per-CPU printk_deferred
> > variable.
> 
> Ah, I was too fast and did the same mistake. This won't help because
> it would still call try_to_wake_up() recursively.

Uh :-/

> We need to call all printk's that can be called under locks
> taken in try_to_wake_up() path in printk deferred context.
> Unfortunately it is whack a mole approach.

Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
a prinkt_deferred, like all the other scheduler related code? Feels a
notch more consistent to me than leaking the printk_context into areas it
wasn't really meant built for. Scheduler code already fully subscribed to
the whack-a-mole approach after all.

This would mean we drop the backtrace from the native_smp_send_reschedule,
or maybe we need a printk_deferred_backtrace()?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  9:38       ` Daniel Vetter
@ 2019-05-06 11:24         ` Petr Mladek
  2019-05-08  8:17           ` [Intel-gfx] " Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2019-05-06 11:24 UTC (permalink / raw)
  To: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Mon 2019-05-06 11:38:13, Daniel Vetter wrote:
> On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> > On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > > console_trylock, called from within printk, can be called from pretty
> > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > usually the box is in pretty bad shape at that point already. But it
> > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > potentially obscuring the real backtrace we're really interested in.
> > > > One case I've seen (slightly simplified backtrace):
> > > > 
> > > >  Call Trace:
> > > >   <IRQ>
> > > >   console_trylock+0xe/0x60
> > > >   vprintk_emit+0xf1/0x320
> > > >   printk+0x4d/0x69
> > > >   __warn_printk+0x46/0x90
> > > >   native_smp_send_reschedule+0x2f/0x40
> > > >   check_preempt_curr+0x81/0xa0
> > > >   ttwu_do_wakeup+0x14/0x220
> > > >   try_to_wake_up+0x218/0x5f0
> > > 
> > > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > > can get called recursively from printk_safe_up().
> > > 
> > > And there are more locks taken from try_to_wake_up(), for example,
> > > __task_rq_lock() taken from ttwu_remote().
> > > 
> > > IMHO, the most reliable solution would be do call the entire
> > > up_console_sem() from printk deferred context. We could assign
> > > few bytes for this context in the per-CPU printk_deferred
> > > variable.
> > 
> > Ah, I was too fast and did the same mistake. This won't help because
> > it would still call try_to_wake_up() recursively.
> 
> Uh :-/
> 
> > We need to call all printk's that can be called under locks
> > taken in try_to_wake_up() path in printk deferred context.
> > Unfortunately it is whack a mole approach.
> 
> Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
> a prinkt_deferred, like all the other scheduler related code? Feels a
> notch more consistent to me than leaking the printk_context into areas it
> wasn't really meant built for. Scheduler code already fully subscribed to
> the whack-a-mole approach after all.

I am not sure how exactly you mean the conversion.

Anyway, we do not want to use printk_deferred() treewide. It reduces
the chance that the messages reach consoles. Scheduler is an
exception because of the possible deadlocks.

A solution would be to define WARN_ON_DEFERRED() that would
call normal WARN_ON() in printk deferred context and
use in scheduler.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06 11:24         ` Petr Mladek
@ 2019-05-08  8:17           ` Daniel Vetter
  2019-05-09 15:08             ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-05-08  8:17 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Mon, May 06, 2019 at 01:24:48PM +0200, Petr Mladek wrote:
> On Mon 2019-05-06 11:38:13, Daniel Vetter wrote:
> > On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> > > On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > > > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > > > console_trylock, called from within printk, can be called from pretty
> > > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > > usually the box is in pretty bad shape at that point already. But it
> > > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > > potentially obscuring the real backtrace we're really interested in.
> > > > > One case I've seen (slightly simplified backtrace):
> > > > > 
> > > > >  Call Trace:
> > > > >   <IRQ>
> > > > >   console_trylock+0xe/0x60
> > > > >   vprintk_emit+0xf1/0x320
> > > > >   printk+0x4d/0x69
> > > > >   __warn_printk+0x46/0x90
> > > > >   native_smp_send_reschedule+0x2f/0x40
> > > > >   check_preempt_curr+0x81/0xa0
> > > > >   ttwu_do_wakeup+0x14/0x220
> > > > >   try_to_wake_up+0x218/0x5f0
> > > > 
> > > > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > > > can get called recursively from printk_safe_up().
> > > > 
> > > > And there are more locks taken from try_to_wake_up(), for example,
> > > > __task_rq_lock() taken from ttwu_remote().
> > > > 
> > > > IMHO, the most reliable solution would be do call the entire
> > > > up_console_sem() from printk deferred context. We could assign
> > > > few bytes for this context in the per-CPU printk_deferred
> > > > variable.
> > > 
> > > Ah, I was too fast and did the same mistake. This won't help because
> > > it would still call try_to_wake_up() recursively.
> > 
> > Uh :-/
> > 
> > > We need to call all printk's that can be called under locks
> > > taken in try_to_wake_up() path in printk deferred context.
> > > Unfortunately it is whack a mole approach.
> > 
> > Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
> > a prinkt_deferred, like all the other scheduler related code? Feels a
> > notch more consistent to me than leaking the printk_context into areas it
> > wasn't really meant built for. Scheduler code already fully subscribed to
> > the whack-a-mole approach after all.
> 
> I am not sure how exactly you mean the conversion.
> 
> Anyway, we do not want to use printk_deferred() treewide. It reduces
> the chance that the messages reach consoles. Scheduler is an
> exception because of the possible deadlocks.
> 
> A solution would be to define WARN_ON_DEFERRED() that would
> call normal WARN_ON() in printk deferred context and
> use in scheduler.

Sent it out, and then Sergey pointed out printk_safe_enter/exit (which I
guess is what you meant, and which I missed), but we're doing this already
around the up() call in __up_console_sem.

So I think these further recursions you're pointed out are already handled
correctly, and all we need to do is to break the loop involving
semaphore.lock of the console_lock semaphore only. Which I think this
patch here achieves.

Thoughts? Or are we again missing something here?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2
  2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
  2019-05-06  8:16   ` Petr Mladek
  2019-05-06  8:20   ` Petr Mladek
@ 2019-05-09 10:32   ` Chris Wilson
  2019-05-09 13:05     ` Peter Zijlstra
  2 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-09 10:32 UTC (permalink / raw)
  To: Daniel Vetter, Intel Graphics Development
  Cc: Petr Mladek, John Ogness, Peter Zijlstra, Daniel Vetter,
	Will Deacon, linux-kernel, Steven Rostedt, Sergey Senozhatsky,
	Ingo Molnar, Daniel Vetter

Quoting Daniel Vetter (2019-05-06 08:45:53)
> +/**
> + * printk_safe_up - release the semaphore in console_unlock
> + * @sem: the semaphore to release
> + *
> + * Release the semaphore.  Unlike mutexes, up() may be called from any
> + * context and even by tasks which have never called down().
> + *
> + * NOTE: This is a special version of up() for console_unlock only. It is only
> + * safe if there are no killable, interruptible or timing out down() calls.
> + */
> +void printk_safe_up(struct semaphore *sem)
> +{
> +       unsigned long flags;
> +       struct semaphore_waiter *waiter = NULL;
> +
> +       raw_spin_lock_irqsave(&sem->lock, flags);
> +       if (likely(list_empty(&sem->wait_list))) {
> +               sem->count++;
> +       } else {
> +               waiter = list_first_entry(&sem->wait_list,
> +                                         struct semaphore_waiter, list);
> +               list_del(&waiter->list);
> +               waiter->up = true;
> +       }
> +       raw_spin_unlock_irqrestore(&sem->lock, flags);
> +
> +       if (waiter)
> +               wake_up_process(waiter->task);

From comparing against __down_common() there's a risk here that as soon
as waiter->up == true, the waiter may complete and make the onstack
struct semaphore_waiter invalid. If you store waiter->task locally under
the spinlock that problem is resolved.

Then there is the issue of an unprotected dereference of the task in
wake_up_process() -- I think you can wrap this function with
rcu_read_lock() to keep that safe, and wake_up_process() should be a
no-op if it races against process termination.
-Chris

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2
  2019-05-09 10:32   ` [Intel-gfx] " Chris Wilson
@ 2019-05-09 13:05     ` Peter Zijlstra
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2019-05-09 13:05 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Daniel Vetter, Intel Graphics Development, Petr Mladek,
	John Ogness, Will Deacon, linux-kernel, Steven Rostedt,
	Sergey Senozhatsky, Ingo Molnar, Daniel Vetter

On Thu, May 09, 2019 at 11:32:57AM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-05-06 08:45:53)
> > +/**
> > + * printk_safe_up - release the semaphore in console_unlock
> > + * @sem: the semaphore to release
> > + *
> > + * Release the semaphore.  Unlike mutexes, up() may be called from any
> > + * context and even by tasks which have never called down().
> > + *
> > + * NOTE: This is a special version of up() for console_unlock only. It is only
> > + * safe if there are no killable, interruptible or timing out down() calls.
> > + */
> > +void printk_safe_up(struct semaphore *sem)
> > +{
> > +       unsigned long flags;
> > +       struct semaphore_waiter *waiter = NULL;
> > +
> > +       raw_spin_lock_irqsave(&sem->lock, flags);
> > +       if (likely(list_empty(&sem->wait_list))) {
> > +               sem->count++;
> > +       } else {
> > +               waiter = list_first_entry(&sem->wait_list,
> > +                                         struct semaphore_waiter, list);
> > +               list_del(&waiter->list);
> > +               waiter->up = true;
> > +       }
> > +       raw_spin_unlock_irqrestore(&sem->lock, flags);
> > +
> > +       if (waiter)
> > +               wake_up_process(waiter->task);
> 
> From comparing against __down_common() there's a risk here that as soon
> as waiter->up == true, the waiter may complete and make the onstack
> struct semaphore_waiter invalid. If you store waiter->task locally under
> the spinlock that problem is resolved.
> 
> Then there is the issue of an unprotected dereference of the task in
> wake_up_process() -- I think you can wrap this function with
> rcu_read_lock() to keep that safe, and wake_up_process() should be a
> no-op if it races against process termination.

task_struct is not RCU protected, see task_rcu_dereference() for magic.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2
  2019-05-08  8:17           ` [Intel-gfx] " Daniel Vetter
@ 2019-05-09 15:08             ` Petr Mladek
  0 siblings, 0 replies; 15+ messages in thread
From: Petr Mladek @ 2019-05-09 15:08 UTC (permalink / raw)
  To: Intel Graphics Development, Daniel Vetter, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt,
	John Ogness, linux-kernel

On Wed 2019-05-08 10:17:12, Daniel Vetter wrote:
> On Mon, May 06, 2019 at 01:24:48PM +0200, Petr Mladek wrote:
> > On Mon 2019-05-06 11:38:13, Daniel Vetter wrote:
> > > On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> > > > On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > > > > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > > > > console_trylock, called from within printk, can be called from pretty
> > > > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > > > usually the box is in pretty bad shape at that point already. But it
> > > > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > > > potentially obscuring the real backtrace we're really interested in.
> > > > > > One case I've seen (slightly simplified backtrace):
> > > > > > 
> > > > > >  Call Trace:
> > > > > >   <IRQ>
> > > > > >   console_trylock+0xe/0x60
> > > > > >   vprintk_emit+0xf1/0x320
> > > > > >   printk+0x4d/0x69
> > > > > >   __warn_printk+0x46/0x90
> > > > > >   native_smp_send_reschedule+0x2f/0x40
> > > > > >   check_preempt_curr+0x81/0xa0
> > > > > >   ttwu_do_wakeup+0x14/0x220
> > > > > >   try_to_wake_up+0x218/0x5f0
> > > > > 
> > > > > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > > > > can get called recursively from printk_safe_up().
> > > > > 
> > > > > And there are more locks taken from try_to_wake_up(), for example,
> > > > > __task_rq_lock() taken from ttwu_remote().
> > > > > 
> > > > > IMHO, the most reliable solution would be do call the entire
> > > > > up_console_sem() from printk deferred context. We could assign
> > > > > few bytes for this context in the per-CPU printk_deferred
> > > > > variable.
> > > > 
> > > > Ah, I was too fast and did the same mistake. This won't help because
> > > > it would still call try_to_wake_up() recursively.
> > > 
> > > Uh :-/
> > > 
> > > > We need to call all printk's that can be called under locks
> > > > taken in try_to_wake_up() path in printk deferred context.
> > > > Unfortunately it is whack a mole approach.
> > > 
> > > Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
> > > a prinkt_deferred, like all the other scheduler related code? Feels a
> > > notch more consistent to me than leaking the printk_context into areas it
> > > wasn't really meant built for. Scheduler code already fully subscribed to
> > > the whack-a-mole approach after all.
> > 
> > I am not sure how exactly you mean the conversion.
> > 
> > Anyway, we do not want to use printk_deferred() treewide. It reduces
> > the chance that the messages reach consoles. Scheduler is an
> > exception because of the possible deadlocks.
> > 
> > A solution would be to define WARN_ON_DEFERRED() that would
> > call normal WARN_ON() in printk deferred context and
> > use in scheduler.
> 
> Sent it out, and then Sergey pointed out printk_safe_enter/exit (which I
> guess is what you meant, and which I missed)

No, I meant introducing a deferred printk context that would behave
like printk_deferred().

printk safe context is more limiting. It prevents deadlock on
logbuf_lock by temporary saving the messages into per-CPU
buffers.

In scheduler, we could store the messages directly into
the main log buffer. We just need to deffer the console
handling to avoid deadlock on the scheduler locks.

> , but we're doing this already around the up() call
> in __up_console_sem.
>
> So I think these further recursions you're pointed out are already handled
> correctly, and all we need to do is to break the loop involving
> semaphore.lock of the console_lock semaphore only. Which I think this
> patch here achieves.

printk safe context would help only when try_to_wake_up()
and all other functions using the same locks _all over
the system_ are called printk safe (or deferred) context.

By other words, printk safe context solves only printk()
recursion. It does not solve recursion of the scheduler
operations.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-05-09 15:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-02 14:16 [PATCH] RFC: console: hack up console_trylock more Daniel Vetter
2019-05-03 15:14 ` Petr Mladek
2019-05-06  7:11   ` Daniel Vetter
2019-05-06  7:48     ` Petr Mladek
2019-05-06  8:40       ` Daniel Vetter
2019-05-06  7:45 ` [PATCH] RFC: console: hack up console_lock more v2 Daniel Vetter
2019-05-06  8:16   ` Petr Mladek
2019-05-06  8:26     ` Petr Mladek
2019-05-06  9:38       ` Daniel Vetter
2019-05-06 11:24         ` Petr Mladek
2019-05-08  8:17           ` [Intel-gfx] " Daniel Vetter
2019-05-09 15:08             ` Petr Mladek
2019-05-06  8:20   ` Petr Mladek
2019-05-09 10:32   ` [Intel-gfx] " Chris Wilson
2019-05-09 13:05     ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).