All of lore.kernel.org
 help / color / mirror / Atom feed
* scheduling while atomic acpi_idle_enter_bm
@ 2009-10-27 23:40 Luis R. Rodriguez
  2009-10-27 23:53 ` Arjan van de Ven
  0 siblings, 1 reply; 7+ messages in thread
From: Luis R. Rodriguez @ 2009-10-27 23:40 UTC (permalink / raw)
  To: linux-kernel, linux-acpi; +Cc: Len Brown

I get this when modprobing some module I am working on. I figured it
was the module's fault but the EIP points to something else so I am
not sure. I get the following repeating about 4 times on 2.6.32-rc5:

[ 4806.829545] BUG: scheduling while atomic: swapper/0/0x10010000
[ 4806.829551] Modules linked in: ath9k_htc ath9k_hw ath_hif_usb
ath_htc binfmt_misc i915 drm_kms_helper drm i2c_algo_bit bri
dge stp bnep lp snd_hda_codec_analog snd_hda_intel snd_hda_codec
snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq
_oss snd_seq_midi snd_rawmidi ath5k snd_seq_midi_event snd_seq
snd_timer snd_seq_device mac80211 ath joydev pcmcia ppdev thin
kpad_acpi snd hid_microsoft video parport_pc psmouse soundcore
cfg80211 usbhid output yenta_socket rsrc_nonstatic pcmcia_core
 parport btusb serio_raw pcspkr nvram snd_page_alloc led_class
intel_agp agpgart iTCO_wdt iTCO_vendor_support ohci1394 ieee13
94 e1000e
[ 4806.829683]
[ 4806.829690] Pid: 0, comm: swapper Not tainted (2.6.32-rc5-wl #27) 7660A14
[ 4806.829700] EIP: 0060:[<c03a1a7b>] EFLAGS: 00000292 CPU: 0
[ 4806.829714] EIP is at acpi_idle_enter_bm+0x251/0x282
[ 4806.829720] EAX: c0884bec EBX: 000002ba ECX: 00000000 EDX: 0137f000
[ 4806.829727] ESI: 00000000 EDI: f70bacc0 EBP: c0773f9c ESP: c0773f70
[ 4806.829733]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 4806.829740] CR0: 8005003b CR2: 0804c000 CR3: 00892000 CR4: 00000690
[ 4806.829748] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4806.829753] DR6: ffff0ff0 DR7: 00000400
[ 4806.829758] Call Trace:
[ 4806.829773]  [<c04a9537>] cpuidle_idle_call+0x77/0xf0
[ 4806.829784]  [<c01020da>] cpu_idle+0x5a/0x90
[ 4806.829795]  [<c05913c8>] rest_init+0x58/0x70
[ 4806.829808]  [<c07e8945>] start_kernel+0x330/0x398
[ 4806.829818]  [<c07e83ce>] ? unknown_bootoption+0x0/0x1f6
[ 4806.829828]  [<c07e807e>] i386_start_kernel+0x7e/0xa8

Then I get some 3 of these:

[ 4808.518136] BUG: scheduling while atomic: swapper/0/0x10010000
[ 4808.518143] Modules linked in: ath9k_htc ath9k_hw ath_hif_usb
ath_htc binfmt_misc i915 drm_kms_helper drm i2c_algo_bit bridge stp
bnep lp snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_pcm_oss
snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq_oss snd_seq_midi
snd_rawmidi ath5k snd_seq_midi_event snd_seq snd_timer snd_seq_device
mac80211 ath joydev pcmcia ppdev thinkpad_acpi snd hid_microsoft video
parport_pc psmouse soundcore cfg80211 usbhid output yenta_socket
rsrc_nonstatic pcmcia_core parport btusb serio_raw pcspkr nvram
snd_page_alloc led_class intel_agp agpgart iTCO_wdt
iTCO_vendor_support ohci1394 ieee1394 e1000e
[ 4808.518258]
[ 4808.518264] Pid: 0, comm: swapper Not tainted (2.6.32-rc5-wl #27) 7660A14
[ 4808.518271] EIP: 0060:[<c05a65cf>] EFLAGS: 00000296 CPU: 1
[ 4808.518280] EIP is at _spin_unlock_irqrestore+0xf/0x20
[ 4808.518286] EAX: 00000296 EBX: f70bb008 ECX: 0000195e EDX: 00000296
[ 4808.518292] ESI: 00000005 EDI: 00000296 EBP: f70b3f20 ESP: f70b3f20
[ 4808.518299]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 4808.518304] CR0: 8005003b CR2: 08204008 CR3: 36b26000 CR4: 00000690
[ 4808.518310] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4808.518315] DR6: ffff0ff0 DR7: 00000400
[ 4808.518319] Call Trace:
[ 4808.518328]  [<c017615a>] clockevents_notify+0x3a/0x80
[ 4808.518338]  [<c03a14d4>] lapic_timer_state_broadcast+0x36/0x39
[ 4808.518347]  [<c03a1a99>] acpi_idle_enter_bm+0x26f/0x282
[ 4808.518356]  [<c04a9537>] cpuidle_idle_call+0x77/0xf0
[ 4808.518364]  [<c01020da>] cpu_idle+0x5a/0x90
[ 4808.518373]  [<c05a1204>] start_secondary+0x170/0x1dc
[ 4808.518382]  [<c05a1204>] ? start_secondary+0x170/0x1dc

  Luis


  Luis

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-10-27 23:40 scheduling while atomic acpi_idle_enter_bm Luis R. Rodriguez
@ 2009-10-27 23:53 ` Arjan van de Ven
  2009-11-03  3:02   ` Len Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Arjan van de Ven @ 2009-10-27 23:53 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: linux-kernel, linux-acpi, Len Brown

On Tue, 27 Oct 2009 16:40:03 -0700
"Luis R. Rodriguez" <mcgrof@gmail.com> wrote:

> I get this when modprobing some module I am working on. I figured it
> was the module's fault but the EIP points to something else so I am
> not sure. I get the following repeating about 4 times on 2.6.32-rc5:


you can get this if your own code leaves interrupts disabled in a
kernel thread and then lets the cpu go idle...


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-10-27 23:53 ` Arjan van de Ven
@ 2009-11-03  3:02   ` Len Brown
  2009-11-03  3:09     ` Luis R. Rodriguez
  0 siblings, 1 reply; 7+ messages in thread
From: Len Brown @ 2009-11-03  3:02 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Luis R. Rodriguez, linux-kernel, linux-acpi

> > I get this when modprobing some module I am working on. I figured it
> > was the module's fault but the EIP points to something else so I am
> > not sure. I get the following repeating about 4 times on 2.6.32-rc5:
> 
> 
> you can get this if your own code leaves interrupts disabled in a
> kernel thread and then lets the cpu go idle...

Unclear.

acpi_enter_idle_bm() assumes that it is entered with irqs enabled,
and so it we unconditionally disables IRQs.

Then we unconditionally re-enable them.

The problem seems to be that right after we enable them,
we find that they are actually disabled, perhaps as
a side-effect of SMM.

Is your machine a Dell, per chance?

Please test the patches in this bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=14101

thanks,
Len Brown, Intel Open Source Technology Center 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-11-03  3:02   ` Len Brown
@ 2009-11-03  3:09     ` Luis R. Rodriguez
  2009-11-06  1:23       ` ykzhao
  0 siblings, 1 reply; 7+ messages in thread
From: Luis R. Rodriguez @ 2009-11-03  3:09 UTC (permalink / raw)
  To: Len Brown; +Cc: Arjan van de Ven, linux-kernel, linux-acpi

On Mon, Nov 2, 2009 at 7:02 PM, Len Brown <lenb@kernel.org> wrote:
>> > I get this when modprobing some module I am working on. I figured it
>> > was the module's fault but the EIP points to something else so I am
>> > not sure. I get the following repeating about 4 times on 2.6.32-rc5:
>>
>>
>> you can get this if your own code leaves interrupts disabled in a
>> kernel thread and then lets the cpu go idle...
>
> Unclear.
>
> acpi_enter_idle_bm() assumes that it is entered with irqs enabled,
> and so it we unconditionally disables IRQs.
>
> Then we unconditionally re-enable them.
>
> The problem seems to be that right after we enable them,
> we find that they are actually disabled, perhaps as
> a side-effect of SMM.
>
> Is your machine a Dell, per chance?

Nope.

> Please test the patches in this bug report:
> http://bugzilla.kernel.org/show_bug.cgi?id=14101

In my case it was as Arjan pointed out and I've fixed it in my driver.
Sorry for not reporting back and thanks for your review.

  Luis

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-11-03  3:09     ` Luis R. Rodriguez
@ 2009-11-06  1:23       ` ykzhao
  2009-11-06  1:50         ` Luis R. Rodriguez
  0 siblings, 1 reply; 7+ messages in thread
From: ykzhao @ 2009-11-06  1:23 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: Len Brown, Arjan van de Ven, linux-acpi

On Tue, 2009-11-03 at 11:09 +0800, Luis R. Rodriguez wrote:
> On Mon, Nov 2, 2009 at 7:02 PM, Len Brown <lenb@kernel.org> wrote:
> >> > I get this when modprobing some module I am working on. I figured it
> >> > was the module's fault but the EIP points to something else so I am
> >> > not sure. I get the following repeating about 4 times on 2.6.32-rc5:
> >>
> >>
> >> you can get this if your own code leaves interrupts disabled in a
> >> kernel thread and then lets the cpu go idle...
> >
> > Unclear.
> >
> > acpi_enter_idle_bm() assumes that it is entered with irqs enabled,
> > and so it we unconditionally disables IRQs.
> >
> > Then we unconditionally re-enable them.
> >
> > The problem seems to be that right after we enable them,
> > we find that they are actually disabled, perhaps as
> > a side-effect of SMM.
> >
> > Is your machine a Dell, per chance?
> 
> Nope.
> 
> > Please test the patches in this bug report:
> > http://bugzilla.kernel.org/show_bug.cgi?id=14101
> 
> In my case it was as Arjan pointed out and I've fixed it in my driver.
> Sorry for not reporting back and thanks for your review.
Hi, Luis
   It is very great that this issue is fixed in your driver. 
But it seems that there exist so many similar issues on kerneloops.
   >BUG: scheduling while atomic: swapper/0/0x10000100
   >Call Trace:
 [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
 [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
 [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
  
   >BUG: scheduling while atomic: swapper/0/0x10010000
   >Call Trace:
 [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
 [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
 [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
 [<ffffffff8151de43>] ? start_secondary+0xa9/0xab

   From the above log it seems that the preempt_count is 0x10010000, which means that
this happens in softirq. After the cpu is waked up from C-state, the interrupt is enabled.
Then it can handle the interrupt ISR and soft IRQ if the interrupt is triggered. 
Is the above issue caused by that the might_sleep is called in the ISR/softIRQ?

Hi, Luis
   Can you describe how you fix this issue in your driver? It will be great if you can
give us some example codes that can trigger this issue.

Thanks.
   
> 
>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-11-06  1:23       ` ykzhao
@ 2009-11-06  1:50         ` Luis R. Rodriguez
  2009-11-06  5:21           ` ykzhao
  0 siblings, 1 reply; 7+ messages in thread
From: Luis R. Rodriguez @ 2009-11-06  1:50 UTC (permalink / raw)
  To: ykzhao; +Cc: Len Brown, Arjan van de Ven, linux-acpi

On Thu, Nov 5, 2009 at 5:23 PM, ykzhao <yakui.zhao@intel.com> wrote:
> On Tue, 2009-11-03 at 11:09 +0800, Luis R. Rodriguez wrote:
>> On Mon, Nov 2, 2009 at 7:02 PM, Len Brown <lenb@kernel.org> wrote:
>> >> > I get this when modprobing some module I am working on. I figured it
>> >> > was the module's fault but the EIP points to something else so I am
>> >> > not sure. I get the following repeating about 4 times on 2.6.32-rc5:
>> >>
>> >>
>> >> you can get this if your own code leaves interrupts disabled in a
>> >> kernel thread and then lets the cpu go idle...
>> >
>> > Unclear.
>> >
>> > acpi_enter_idle_bm() assumes that it is entered with irqs enabled,
>> > and so it we unconditionally disables IRQs.
>> >
>> > Then we unconditionally re-enable them.
>> >
>> > The problem seems to be that right after we enable them,
>> > we find that they are actually disabled, perhaps as
>> > a side-effect of SMM.
>> >
>> > Is your machine a Dell, per chance?
>>
>> Nope.
>>
>> > Please test the patches in this bug report:
>> > http://bugzilla.kernel.org/show_bug.cgi?id=14101
>>
>> In my case it was as Arjan pointed out and I've fixed it in my driver.
>> Sorry for not reporting back and thanks for your review.
> Hi, Luis
>   It is very great that this issue is fixed in your driver.
> But it seems that there exist so many similar issues on kerneloops.
>   >BUG: scheduling while atomic: swapper/0/0x10000100
>   >Call Trace:
>  [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
>  [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
>  [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
>
>   >BUG: scheduling while atomic: swapper/0/0x10010000
>   >Call Trace:
>  [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
>  [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
>  [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
>  [<ffffffff8151de43>] ? start_secondary+0xa9/0xab
>
> From the above log it seems that the preempt_count is 0x10010000,
> which means that this happens in softirq.

What's the preempt_count and how does it get changed?

> After the cpu is awoken from C-state, the interrupt is enabled.
> Then it can handle the interrupt ISR and soft IRQ if the interrupt is triggered.
> Is the above issue caused by that the might_sleep is called in the ISR/softIRQ?

Think so.

>   Can you describe how you fix this issue in your driver? It will be great if you can
> give us some example codes that can trigger this issue.

You can view the git commit here:

http://tinyurl.com/add-rx-support-ath9k-htc

Its a bit big but anything that has to do with mutex->spinlock is what fixed it.

Let me summarize what I did.

I took Arjan's tip for granted:

"you can get this if your own code leaves interrupts disabled in a
kernel thread and then lets the cpu go idle..."

So I went and checked code I might have which would do this. In my
case my USB irq handler was taking a nap with mutex lock somewhere
down the pipeline, once the workqueue has been kicked off and it grabs
the mutex_lock() and the ISR then wants to contend but sleeps.

I changed the ISR code to spin_lock_irqsave() while it pumps skbs into
an skb queue I had set up, and changed my workqueue which eats those
skbs on the skb queue to use spin_lock_bh() (this is also wrong so I
just changed it to irq_save as well).

FWIW the git tree is at:

git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/ath9k_htc.git

and the commit was 88f284ae6a6a7ed7404bcf52c1a5f0692b01ea7f

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: scheduling while atomic acpi_idle_enter_bm
  2009-11-06  1:50         ` Luis R. Rodriguez
@ 2009-11-06  5:21           ` ykzhao
  0 siblings, 0 replies; 7+ messages in thread
From: ykzhao @ 2009-11-06  5:21 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: Len Brown, Arjan van de Ven, linux-acpi

On Fri, 2009-11-06 at 09:50 +0800, Luis R. Rodriguez wrote:
> On Thu, Nov 5, 2009 at 5:23 PM, ykzhao <yakui.zhao@intel.com> wrote:
> > On Tue, 2009-11-03 at 11:09 +0800, Luis R. Rodriguez wrote:
> >> On Mon, Nov 2, 2009 at 7:02 PM, Len Brown <lenb@kernel.org> wrote:
> >> >> > I get this when modprobing some module I am working on. I figured it
> >> >> > was the module's fault but the EIP points to something else so I am
> >> >> > not sure. I get the following repeating about 4 times on 2.6.32-rc5:
> >> >>
> >> >>
> >> >> you can get this if your own code leaves interrupts disabled in a
> >> >> kernel thread and then lets the cpu go idle...
> >> >
> >> > Unclear.
> >> >
> >> > acpi_enter_idle_bm() assumes that it is entered with irqs enabled,
> >> > and so it we unconditionally disables IRQs.
> >> >
> >> > Then we unconditionally re-enable them.
> >> >
> >> > The problem seems to be that right after we enable them,
> >> > we find that they are actually disabled, perhaps as
> >> > a side-effect of SMM.
> >> >
> >> > Is your machine a Dell, per chance?
> >>
> >> Nope.
> >>
> >> > Please test the patches in this bug report:
> >> > http://bugzilla.kernel.org/show_bug.cgi?id=14101
> >>
> >> In my case it was as Arjan pointed out and I've fixed it in my driver.
> >> Sorry for not reporting back and thanks for your review.
> > Hi, Luis
> >   It is very great that this issue is fixed in your driver.
> > But it seems that there exist so many similar issues on kerneloops.
> >   >BUG: scheduling while atomic: swapper/0/0x10000100
> >   >Call Trace:
> >  [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
> >  [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
> >  [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
> >
> >   >BUG: scheduling while atomic: swapper/0/0x10010000
> >   >Call Trace:
> >  [<ffffffff812d2efa>] ? acpi_idle_enter_bm+0x284/0x2bf
> >  [<ffffffff813f931b>] ? cpuidle_idle_call+0x9b/0xf0
> >  [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
> >  [<ffffffff8151de43>] ? start_secondary+0xa9/0xab
> >
> > From the above log it seems that the preempt_count is 0x10010000,
> > which means that this happens in softirq.
> 
> What's the preempt_count and how does it get changed?
thanks for your help.
After looking at the commit, I understand how this happens.
Now it seems clear that this issue is caused by that might_sleep is
called in the ISR/softirq. Sometimes it is called imlicitly. For
example: it will be called in mutex_lock.
    When it enters the ISR/softirq, we will add
HARDIRQ_OFFSET/SOFTIRQ_OFFSET to preempt_count(0x10000/0x100).
    And the might_sleep will call the function of 
 __cond_resched, which will add the PREEMPT_ACTIVE.


thanks.
> 
> > After the cpu is awoken from C-state, the interrupt is enabled.
> > Then it can handle the interrupt ISR and soft IRQ if the interrupt is triggered.
> > Is the above issue caused by that the might_sleep is called in the ISR/softIRQ?
> 
> Think so.
> 
> >   Can you describe how you fix this issue in your driver? It will be great if you can
> > give us some example codes that can trigger this issue.
> 
> You can view the git commit here:
> 
> http://tinyurl.com/add-rx-support-ath9k-htc
> 
> Its a bit big but anything that has to do with mutex->spinlock is what fixed it.
> 
> Let me summarize what I did.
> 
> I took Arjan's tip for granted:
> 
> "you can get this if your own code leaves interrupts disabled in a
> kernel thread and then lets the cpu go idle..."
> 
> So I went and checked code I might have which would do this. In my
> case my USB irq handler was taking a nap with mutex lock somewhere
> down the pipeline, once the workqueue has been kicked off and it grabs
> the mutex_lock() and the ISR then wants to contend but sleeps.
> 
> I changed the ISR code to spin_lock_irqsave() while it pumps skbs into
> an skb queue I had set up, and changed my workqueue which eats those
> skbs on the skb queue to use spin_lock_bh() (this is also wrong so I
> just changed it to irq_save as well).
> 
> FWIW the git tree is at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/ath9k_htc.git
> 
> and the commit was 88f284ae6a6a7ed7404bcf52c1a5f0692b01ea7f
> 
>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-11-06  5:23 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-27 23:40 scheduling while atomic acpi_idle_enter_bm Luis R. Rodriguez
2009-10-27 23:53 ` Arjan van de Ven
2009-11-03  3:02   ` Len Brown
2009-11-03  3:09     ` Luis R. Rodriguez
2009-11-06  1:23       ` ykzhao
2009-11-06  1:50         ` Luis R. Rodriguez
2009-11-06  5:21           ` ykzhao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.