linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG: sleeping function called from invalid context at kernel/mutex.c
@ 2011-04-02 11:47 Xiaotian Feng
  2011-04-02 19:12 ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Xiaotian Feng @ 2011-04-02 11:47 UTC (permalink / raw)
  To: tglx, linux-kernel; +Cc: mingo

Hi,

     I've got following warnings when I'm trying to resume from suspend, so
sparse_lock is not safe enough as a mutex for the suspend resume case, can
we simply convert sparse_lock back to spinlock?

BUG: sleeping function called from invalid context at kernel/mutex.c:278
in_atomic(): 1, irqs_disabled(): 1, pid: 2345, name: bash
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<          (null)>]           (null)
hardirqs last disabled at (0): [<ffffffff81068391>] 
copy_process+0x4d1/0x1070
softirqs last  enabled at (0): [<ffffffff81068391>] 
copy_process+0x4d1/0x1070
softirqs last disabled at (0): [<          (null)>]           (null)
Pid: 2345, comm: bash Tainted: G          I 2.6.39-rc1+ #59
Call Trace:
  [<ffffffff81052d77>] __might_sleep+0x107/0x140
  [<ffffffff816a57df>] mutex_lock_nested+0x2f/0x50
  [<ffffffff810e191a>] irq_reserve_irqs+0x4a/0xa0
  [<ffffffff810e18a8>] ? __irq_put_desc_unlock+0x28/0x50
  [<ffffffff810e4e78>] irq_set_chip+0x58/0x70
  [<ffffffff810e4fc0>] ? handle_level_irq+0xe0/0xe0
  [<ffffffff810e4eb6>] irq_set_chip_and_handler_name+0x26/0x50
  [<ffffffff8102fbb0>] arch_setup_hpet_msi+0xa0/0xf0
  [<ffffffff8103613c>] hpet_setup_msi_irq+0x1c/0x40
  [<ffffffff81036ad3>] hpet_set_mode+0x1a3/0x1f0
  [<ffffffff81036b34>] hpet_msi_set_mode+0x14/0x20
  [<ffffffff810a03aa>] clockevents_set_mode+0x2a/0x70
  [<ffffffff810a0693>] tick_resume+0x53/0xa0
  [<ffffffff810a0cbd>] tick_notify+0x1cd/0x1e0
  [<ffffffff810a00a8>] ? clockevents_notify+0x28/0x140
  [<ffffffff816ab1f4>] notifier_call_chain+0x94/0xd0
  [<ffffffff810970c6>] raw_notifier_call_chain+0x16/0x20
  [<ffffffff810a00bd>] clockevents_notify+0x3d/0x140
  [<ffffffff8109c000>] timekeeping_resume+0x190/0x1d0
  [<ffffffff813c5fd5>] syscore_resume+0x45/0xf0
  [<ffffffff810bb977>] suspend_enter+0xe7/0x120
  [<ffffffff810bba7e>] suspend_devices_and_enter+0x8e/0x2c0
  [<ffffffff810bbd86>] enter_state+0xd6/0x130
  [<ffffffff810baf99>] state_store+0x99/0x100
  [<ffffffff812c82b7>] kobj_attr_store+0x17/0x20
  [<ffffffff811fc6f3>] sysfs_write_file+0xa3/0x100
  [<ffffffff81187310>] vfs_write+0xd0/0x1a0
  [<ffffffff811874e4>] sys_write+0x54/0xa0
  [<ffffffff816af382>] system_call_fastpath+0x16/0x1b

Thanks
Xiaotian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-02 11:47 BUG: sleeping function called from invalid context at kernel/mutex.c Xiaotian Feng
@ 2011-04-02 19:12 ` Thomas Gleixner
  2011-04-05  7:41   ` Peter Zijlstra
  2011-04-07  5:22   ` Xiaotian Feng
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Gleixner @ 2011-04-02 19:12 UTC (permalink / raw)
  To: Xiaotian Feng
  Cc: LKML, Ingo Molnar, Venkatesh Pallipadi, Suresh Siddha,
	H. Peter Anvin, Rafael J. Wysocki, Peter Zijlstra

On Sat, 2 Apr 2011, Xiaotian Feng wrote:
>     I've got following warnings when I'm trying to resume from suspend, so
> sparse_lock is not safe enough as a mutex for the suspend resume case, can
> we simply convert sparse_lock back to spinlock?

What the heck? We do not just convert a mutex to a spinlock because a
warning triggers. And it was a mutex already in 2.6.38 w/o any such
problems.

And it is safe enough for the resume case as well, it simply cannot go
to sleep because nothing else can hold that mutex in the early resume
code. We take the mutex during early boot as well, but there the might
sleep checks are disabled. And we should do the very same thing in the
early resume code as well. Rafael, PeterZ ???

But that's a different issue and not the real problem.

So now instead of thinking about spinlock it would have been pretty
simple to figure out that the warning is triggered due to commit
d72274e5.

But of course we dont want to revert d72274e5 either, though it's easy
to figure out from the call chain that the fundamental stupidity is in
the hpet resume code.

Why the hell does it call arch_setup_hpet_msi()? That interrupt was
completely setup _BEFORE_ we went into suspend. And I don't see any
code which tears down the interrupt on suspend. So we run through a
full setup for nothing. And when we have interrupt remapping enabled
it even leaks an irte, because it allocates a new one. Utter crap.
init_one_hpet_msi_clockevent() has the same stupid call.

So no, the mutex stays a mutex and we fix crap where the crap is.

The following completely untested patch should solve that. Actually we
should be more clever and do the affinity magic in the new function as
well, so we can get rid of this weird disable/set_affinity/enable
hack, but that's a different story.

Thanks,

	tglx
---
 arch/x86/include/asm/hpet.h    |    2 ++
 arch/x86/kernel/apic/io_apic.c |   11 +++++++++++
 arch/x86/kernel/hpet.c         |    5 +++--
 3 files changed, 16 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/include/asm/hpet.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/hpet.h
+++ linux-2.6/arch/x86/include/asm/hpet.h
@@ -83,11 +83,13 @@ extern void hpet_msi_read(struct hpet_de
 
 #ifdef CONFIG_PCI_MSI
 extern int arch_setup_hpet_msi(unsigned int irq, unsigned int id);
+extern int arch_start_hpet_msi(unsigned int irq, unsigned int id);
 #else
 static inline int arch_setup_hpet_msi(unsigned int irq, unsigned int id)
 {
 	return -EINVAL;
 }
+static inline int arch_start_hpet_msi(unsigned int irq, unsigned int id) { }
 #endif
 
 #ifdef CONFIG_HPET_EMULATE_RTC
Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -3477,6 +3477,17 @@ int arch_setup_hpet_msi(unsigned int irq
 	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
 	return 0;
 }
+
+int arch_start_hpet_msi(unsigned int irq, unsigned int id)
+{
+	struct msi_msg msg;
+	int ret = msi_compose_msg(NULL, irq, &msg, id);
+
+	if (!ret)
+		hpet_msi_write(irq_get_handler_data(irq), &msg);
+	return ret;
+}
+
 #endif
 
 #endif /* CONFIG_PCI_MSI */
Index: linux-2.6/arch/x86/kernel/hpet.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/hpet.c
+++ linux-2.6/arch/x86/kernel/hpet.c
@@ -370,7 +370,8 @@ static void hpet_set_mode(enum clock_eve
 			hpet_enable_legacy_int();
 		} else {
 			struct hpet_dev *hdev = EVT_TO_HPET_DEV(evt);
-			hpet_setup_msi_irq(hdev->irq);
+
+			arch_start_hpet_msi(hdev->irq, hpet_blockid);
 			disable_irq(hdev->irq);
 			irq_set_affinity(hdev->irq, cpumask_of(hdev->cpu));
 			enable_irq(hdev->irq);
@@ -555,7 +556,7 @@ static void init_one_hpet_msi_clockevent
 	if (!(hdev->flags & HPET_DEV_VALID))
 		return;
 
-	if (hpet_setup_msi_irq(hdev->irq))
+	if (arch_start_hpet_msi(hdev->irq, hpet_blockid))
 		return;
 
 	hdev->cpu = cpu;



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-02 19:12 ` Thomas Gleixner
@ 2011-04-05  7:41   ` Peter Zijlstra
  2011-04-06  4:43     ` Rafael J. Wysocki
  2011-04-07  5:22   ` Xiaotian Feng
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2011-04-05  7:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Xiaotian Feng, LKML, Ingo Molnar, Venkatesh Pallipadi,
	Suresh Siddha, H. Peter Anvin, Rafael J. Wysocki

On Sat, 2011-04-02 at 21:12 +0200, Thomas Gleixner wrote:
> 
> And it is safe enough for the resume case as well, it simply cannot go
> to sleep because nothing else can hold that mutex in the early resume
> code. We take the mutex during early boot as well, but there the might
> sleep checks are disabled. And we should do the very same thing in the
> early resume code as well. Rafael, PeterZ ??? 

the $subject text sound like it triggered might_sleep(), and that had a
system_state != SYSTEM_RUNNING bail condition, but then, I've no clue
what resume looks like.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-05  7:41   ` Peter Zijlstra
@ 2011-04-06  4:43     ` Rafael J. Wysocki
  2011-04-06  8:07       ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2011-04-06  4:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Xiaotian Feng, LKML, Ingo Molnar,
	Venkatesh Pallipadi, Suresh Siddha, H. Peter Anvin

On Tuesday, April 05, 2011, Peter Zijlstra wrote:
> On Sat, 2011-04-02 at 21:12 +0200, Thomas Gleixner wrote:
> > 
> > And it is safe enough for the resume case as well, it simply cannot go
> > to sleep because nothing else can hold that mutex in the early resume
> > code. We take the mutex during early boot as well, but there the might
> > sleep checks are disabled. And we should do the very same thing in the
> > early resume code as well. Rafael, PeterZ ??? 
> 
> the $subject text sound like it triggered might_sleep(), and that had a
> system_state != SYSTEM_RUNNING bail condition, but then, I've no clue
> what resume looks like.

Early resume looks pretty much like the system startup, e.g. everything
called from syscore_ops should not be sleepable (although mutexes shouldn't
trigger, because that code is effectively single-threaded, unless somebody
holds the mutex in question when that code is being executed, but that would
deadlock anyway).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-06  4:43     ` Rafael J. Wysocki
@ 2011-04-06  8:07       ` Peter Zijlstra
  2011-04-07  5:53         ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2011-04-06  8:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Xiaotian Feng, LKML, Ingo Molnar,
	Venkatesh Pallipadi, Suresh Siddha, H. Peter Anvin

On Wed, 2011-04-06 at 06:43 +0200, Rafael J. Wysocki wrote:
> > the $subject text sound like it triggered might_sleep(), and that had a
> > system_state != SYSTEM_RUNNING bail condition, but then, I've no clue
> > what resume looks like.
> 
> Early resume looks pretty much like the system startup, e.g. everything
> called from syscore_ops should not be sleepable (although mutexes shouldn't
> trigger, because that code is effectively single-threaded, unless somebody
> holds the mutex in question when that code is being executed, but that would
> deadlock anyway). 

Right, so system_state != SYSTEM_RUNNING should be true for resume?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-02 19:12 ` Thomas Gleixner
  2011-04-05  7:41   ` Peter Zijlstra
@ 2011-04-07  5:22   ` Xiaotian Feng
  1 sibling, 0 replies; 7+ messages in thread
From: Xiaotian Feng @ 2011-04-07  5:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Venkatesh Pallipadi, Suresh Siddha,
	H. Peter Anvin, Rafael J. Wysocki, Peter Zijlstra

On 04/03/2011 03:12 AM, Thomas Gleixner wrote:
> On Sat, 2 Apr 2011, Xiaotian Feng wrote:
>>      I've got following warnings when I'm trying to resume from suspend, so
>> sparse_lock is not safe enough as a mutex for the suspend resume case, can
>> we simply convert sparse_lock back to spinlock?
>
> What the heck? We do not just convert a mutex to a spinlock because a
> warning triggers. And it was a mutex already in 2.6.38 w/o any such
> problems.
>
> And it is safe enough for the resume case as well, it simply cannot go
> to sleep because nothing else can hold that mutex in the early resume
> code. We take the mutex during early boot as well, but there the might
> sleep checks are disabled. And we should do the very same thing in the
> early resume code as well. Rafael, PeterZ ???
>
> But that's a different issue and not the real problem.
>
> So now instead of thinking about spinlock it would have been pretty
> simple to figure out that the warning is triggered due to commit
> d72274e5.
>
> But of course we dont want to revert d72274e5 either, though it's easy
> to figure out from the call chain that the fundamental stupidity is in
> the hpet resume code.
>
> Why the hell does it call arch_setup_hpet_msi()? That interrupt was
> completely setup _BEFORE_ we went into suspend. And I don't see any
> code which tears down the interrupt on suspend. So we run through a
> full setup for nothing. And when we have interrupt remapping enabled
> it even leaks an irte, because it allocates a new one. Utter crap.
> init_one_hpet_msi_clockevent() has the same stupid call.
>
> So no, the mutex stays a mutex and we fix crap where the crap is.
>
> The following completely untested patch should solve that. Actually we
> should be more clever and do the affinity magic in the new function as
> well, so we can get rid of this weird disable/set_affinity/enable
> hack, but that's a different story.
>

Yes, this patch resolves the warnings, thanks.

> Thanks,
>
> 	tglx
> ---
>   arch/x86/include/asm/hpet.h    |    2 ++
>   arch/x86/kernel/apic/io_apic.c |   11 +++++++++++
>   arch/x86/kernel/hpet.c         |    5 +++--
>   3 files changed, 16 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/arch/x86/include/asm/hpet.h
> ===================================================================
> --- linux-2.6.orig/arch/x86/include/asm/hpet.h
> +++ linux-2.6/arch/x86/include/asm/hpet.h
> @@ -83,11 +83,13 @@ extern void hpet_msi_read(struct hpet_de
>
>   #ifdef CONFIG_PCI_MSI
>   extern int arch_setup_hpet_msi(unsigned int irq, unsigned int id);
> +extern int arch_start_hpet_msi(unsigned int irq, unsigned int id);
>   #else
>   static inline int arch_setup_hpet_msi(unsigned int irq, unsigned int id)
>   {
>   	return -EINVAL;
>   }
> +static inline int arch_start_hpet_msi(unsigned int irq, unsigned int id) { }
>   #endif
>
>   #ifdef CONFIG_HPET_EMULATE_RTC
> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
> @@ -3477,6 +3477,17 @@ int arch_setup_hpet_msi(unsigned int irq
>   	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
>   	return 0;
>   }
> +
> +int arch_start_hpet_msi(unsigned int irq, unsigned int id)
> +{
> +	struct msi_msg msg;
> +	int ret = msi_compose_msg(NULL, irq,&msg, id);
> +
> +	if (!ret)
> +		hpet_msi_write(irq_get_handler_data(irq),&msg);
> +	return ret;
> +}
> +
>   #endif
>
>   #endif /* CONFIG_PCI_MSI */
> Index: linux-2.6/arch/x86/kernel/hpet.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/hpet.c
> +++ linux-2.6/arch/x86/kernel/hpet.c
> @@ -370,7 +370,8 @@ static void hpet_set_mode(enum clock_eve
>   			hpet_enable_legacy_int();
>   		} else {
>   			struct hpet_dev *hdev = EVT_TO_HPET_DEV(evt);
> -			hpet_setup_msi_irq(hdev->irq);
> +
> +			arch_start_hpet_msi(hdev->irq, hpet_blockid);
>   			disable_irq(hdev->irq);
>   			irq_set_affinity(hdev->irq, cpumask_of(hdev->cpu));
>   			enable_irq(hdev->irq);
> @@ -555,7 +556,7 @@ static void init_one_hpet_msi_clockevent
>   	if (!(hdev->flags&  HPET_DEV_VALID))
>   		return;
>
> -	if (hpet_setup_msi_irq(hdev->irq))
> +	if (arch_start_hpet_msi(hdev->irq, hpet_blockid))
>   		return;
>
>   	hdev->cpu = cpu;
>
>
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/mutex.c
  2011-04-06  8:07       ` Peter Zijlstra
@ 2011-04-07  5:53         ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2011-04-07  5:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Xiaotian Feng, LKML, Ingo Molnar,
	Venkatesh Pallipadi, Suresh Siddha, H. Peter Anvin

On Wednesday, April 06, 2011, Peter Zijlstra wrote:
> On Wed, 2011-04-06 at 06:43 +0200, Rafael J. Wysocki wrote:
> > > the $subject text sound like it triggered might_sleep(), and that had a
> > > system_state != SYSTEM_RUNNING bail condition, but then, I've no clue
> > > what resume looks like.
> > 
> > Early resume looks pretty much like the system startup, e.g. everything
> > called from syscore_ops should not be sleepable (although mutexes shouldn't
> > trigger, because that code is effectively single-threaded, unless somebody
> > holds the mutex in question when that code is being executed, but that would
> > deadlock anyway). 
> 
> Right, so system_state != SYSTEM_RUNNING should be true for resume?

It is not, although it probably should be.

First, some time ago there was opposition to adding more different possible
values of system_state and it wasn't clear which of the existing values should
be used instead of SYSTEM_RUNNING during suspend/resume.  We ended up sticking
to SYSTEM_RUNNING for that reason long enough for some code to develop the
expectation of system_state == SYSTEM_RUNNING during suspend/resume (IOW,
changing that right now would probably break stuff).

Second, even if we decide to switch from SYSTEM_RUNNING to something else
during suspend (and back during resume), it's not entirely clear what's the
right place to do so.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-04-07  5:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-02 11:47 BUG: sleeping function called from invalid context at kernel/mutex.c Xiaotian Feng
2011-04-02 19:12 ` Thomas Gleixner
2011-04-05  7:41   ` Peter Zijlstra
2011-04-06  4:43     ` Rafael J. Wysocki
2011-04-06  8:07       ` Peter Zijlstra
2011-04-07  5:53         ` Rafael J. Wysocki
2011-04-07  5:22   ` Xiaotian Feng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).