All of lore.kernel.org
 help / color / mirror / Atom feed
* [v3.4-rc1] ACPI regression bisected
@ 2012-04-02 15:05 Jörg Otte
  2012-04-02 23:54 ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Jörg Otte @ 2012-04-02 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown, Andi Kleen

Hi,

my Notebook ( Thinkpad T43) doesn't boot with V3.4-rc1.
The kernel stops booting after displaying the following:

 ACPI: Core revision 20120320
 ACPI: setting ELCR to 0200 (from 0800)
 Performance Events: p6 PMU driver.
 ... version:                0
 ... bit width:              32
 ... generic registers:      2
 ... value mask:             00000000ffffffff
 ... max period:             000000007fffffff
 ... fixed-purpose events:   0
 ... event mask:             0000000000000003
 devtmpfs: initialized
 NET: Registered protocol family 16
 ACPI: bus type pci registered
 PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
0xe0000000-0xefffffff] (base 0xe0000000)
 PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved in E820
 PCI: Using MMCONFIG for extended config space
 PCI: Using configuration type 1 for base access
 bio: create slab <bio-0> at 0
 ACPI: Added _OSI(Module Device)
 ACPI: Added _OSI(Processor Device)
 ACPI: Added _OSI(3.0 _SCP Extensions)
 ACPI: Added _OSI(Processor Aggregator Device)
 ACPI: EC: EC description table is found, configuring boot EC

Comparing this with the displays of kernel v3.3 the next expected
outputs would be:

 ACPI: Interpreter enabled
 ACPI: (supports S0 S5)


I bisected the problem to commit:
6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d  "ACPI: Make ACPI interrupt threaded"

Please cc me I'm not subscribed.

Regards,
Joerg Otte

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-02 15:05 [v3.4-rc1] ACPI regression bisected Jörg Otte
@ 2012-04-02 23:54 ` Andi Kleen
  2012-04-03  8:11   ` Thomas Gleixner
  2012-04-03  9:13   ` Jörg Otte
  0 siblings, 2 replies; 20+ messages in thread
From: Andi Kleen @ 2012-04-02 23:54 UTC (permalink / raw)
  To: Jörg Otte; +Cc: linux-kernel, Len Brown

>  ACPI: Added _OSI(Processor Aggregator Device)
>  ACPI: EC: EC description table is found, configuring boot EC
> 
> Comparing this with the displays of kernel v3.3 the next expected
> outputs would be:
> 
>  ACPI: Interpreter enabled
>  ACPI: (supports S0 S5)
> 
> 
> I bisected the problem to commit:
> 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d  "ACPI: Make ACPI interrupt threaded"

Hmm, that's odd. It should not really change any semantics. I assume you double
checked you hit the right commit?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-02 23:54 ` Andi Kleen
@ 2012-04-03  8:11   ` Thomas Gleixner
  2012-04-12  8:35     ` Jörg Otte
  2012-04-03  9:13   ` Jörg Otte
  1 sibling, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-03  8:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jörg Otte, linux-kernel, Len Brown

On Mon, 2 Apr 2012, Andi Kleen wrote:

> >  ACPI: Added _OSI(Processor Aggregator Device)
> >  ACPI: EC: EC description table is found, configuring boot EC
> > 
> > Comparing this with the displays of kernel v3.3 the next expected
> > outputs would be:
> > 
> >  ACPI: Interpreter enabled
> >  ACPI: (supports S0 S5)
> > 
> > 
> > I bisected the problem to commit:
> > 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d  "ACPI: Make ACPI interrupt threaded"
> 
> Hmm, that's odd. It should not really change any semantics. I assume you double
> checked you hit the right commit?

Well it does change the semantics depending on the type of the
interrupt. The above commit does:

-       if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
+       if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
+                                acpi_irq)) {

which creates a threaded handler and the primary handler is the
default core one which merily returns IRQ_WAKE_THREAD. Though that
leaves the interrupt line unmasked and for level type interrupts this
is not working. So you either need a primary handler which masks the
irq at the device level or you can add IRQF_ONESHOT to let the core
code deal with the mask/unmask business. See patch below.

Now the problem with this approach is that if this is really a shared
interrupt the core code will either reject this request or the one of
the other device which shares that interrupt line, depending which one
request first. The reason is that we require all devices to agree on
that flag - otherwise a long running/waiting/sleeping threaded handler
might block out the other device for a long time.

If it's never shared, nothing to worry about. Otherwise you need to
provide a primary handler which silences the interrupt at the device
level and unmask it when your threaded handler is done.

Thanks,

	tglx

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index ba14fb9..1b81bc9 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -607,7 +607,8 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
 
 	acpi_irq_handler = handler;
 	acpi_irq_context = context;
-	if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
+	if (request_threaded_irq(irq, NULL, acpi_irq,
+				 IRQF_SHARED | IRQF_ONESHOT, "acpi",
 				 acpi_irq)) {
 		printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
 		acpi_irq_handler = NULL;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-02 23:54 ` Andi Kleen
  2012-04-03  8:11   ` Thomas Gleixner
@ 2012-04-03  9:13   ` Jörg Otte
  1 sibling, 0 replies; 20+ messages in thread
From: Jörg Otte @ 2012-04-03  9:13 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, Len Brown

Yes , its really that commit!
My current top of source tree is:

jojo@eiche:/data/kernel/linux$ git log --pretty=oneline
3a5307cb7c25047242cf01c7698f25600f60b32e Revert "ACPI: Make ACPI
interrupt threaded"
dd775ae2549217d3ae09363e3edb305d0fa19928 Linux 3.4-rc1
b7ffff4bb3fef527a26144a95d9dfe8c11a6b927 Merge branch 's3-for-3.4' of
git://git.kernel.org/pub/scm/linux/kernel/git/amit/virtio-console
8bb1f229527dee95644e0f8496980bb767c6f620 Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
f22e08a79f3765fecf060b225a46931c94fb0a92 Merge branch
'sched-urgent-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
f187e9fd68577cdd5f914659b6f7f11124e40485 Merge branch
'perf-urgent-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
adb3b1f3fc1c6edb501808ebf80a81e81c52eb73 Merge tag 'parisc-misc' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
a75ee6ecd411a50bf4da927c2fdb2cb56246a2bd Merge tag 'scsi-misc' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6


Am 3. April 2012 01:54 schrieb Andi Kleen <ak@linux.intel.com>:
>>  ACPI: Added _OSI(Processor Aggregator Device)
>>  ACPI: EC: EC description table is found, configuring boot EC
>>
>> Comparing this with the displays of kernel v3.3 the next expected
>> outputs would be:
>>
>>  ACPI: Interpreter enabled
>>  ACPI: (supports S0 S5)
>>
>>
>> I bisected the problem to commit:
>> 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d  "ACPI: Make ACPI interrupt threaded"
>
> Hmm, that's odd. It should not really change any semantics. I assume you double
> checked you hit the right commit?
>
> -Andi
>
> --
> ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-03  8:11   ` Thomas Gleixner
@ 2012-04-12  8:35     ` Jörg Otte
  2012-04-12  8:57       ` Thomas Gleixner
  0 siblings, 1 reply; 20+ messages in thread
From: Jörg Otte @ 2012-04-12  8:35 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Andi Kleen, linux-kernel, Len Brown

I tried that patch, for me it works.

Thanks, Joerg


Am 3. April 2012 10:11 schrieb Thomas Gleixner <tglx@linutronix.de>:
> On Mon, 2 Apr 2012, Andi Kleen wrote:
>
>> >  ACPI: Added _OSI(Processor Aggregator Device)
>> >  ACPI: EC: EC description table is found, configuring boot EC
>> >
>> > Comparing this with the displays of kernel v3.3 the next expected
>> > outputs would be:
>> >
>> >  ACPI: Interpreter enabled
>> >  ACPI: (supports S0 S5)
>> >
>> >
>> > I bisected the problem to commit:
>> > 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d  "ACPI: Make ACPI interrupt threaded"
>>
>> Hmm, that's odd. It should not really change any semantics. I assume you double
>> checked you hit the right commit?
>
> Well it does change the semantics depending on the type of the
> interrupt. The above commit does:
>
> -       if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
> +       if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
> +                                acpi_irq)) {
>
> which creates a threaded handler and the primary handler is the
> default core one which merily returns IRQ_WAKE_THREAD. Though that
> leaves the interrupt line unmasked and for level type interrupts this
> is not working. So you either need a primary handler which masks the
> irq at the device level or you can add IRQF_ONESHOT to let the core
> code deal with the mask/unmask business. See patch below.
>
> Now the problem with this approach is that if this is really a shared
> interrupt the core code will either reject this request or the one of
> the other device which shares that interrupt line, depending which one
> request first. The reason is that we require all devices to agree on
> that flag - otherwise a long running/waiting/sleeping threaded handler
> might block out the other device for a long time.
>
> If it's never shared, nothing to worry about. Otherwise you need to
> provide a primary handler which silences the interrupt at the device
> level and unmask it when your threaded handler is done.
>
> Thanks,
>
>        tglx
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index ba14fb9..1b81bc9 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -607,7 +607,8 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
>
>        acpi_irq_handler = handler;
>        acpi_irq_context = context;
> -       if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
> +       if (request_threaded_irq(irq, NULL, acpi_irq,
> +                                IRQF_SHARED | IRQF_ONESHOT, "acpi",
>                                 acpi_irq)) {
>                printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
>                acpi_irq_handler = NULL;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12  8:35     ` Jörg Otte
@ 2012-04-12  8:57       ` Thomas Gleixner
  2012-04-12 11:42         ` Paul Bolle
  2012-04-12 14:29         ` Matthew Garrett
  0 siblings, 2 replies; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-12  8:57 UTC (permalink / raw)
  To: Jörg Otte; +Cc: Andi Kleen, linux-kernel, Len Brown

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1002 bytes --]

On Thu, 12 Apr 2012, Jörg Otte wrote:

> I tried that patch, for me it works.

> > --- a/drivers/acpi/osl.c
> > +++ b/drivers/acpi/osl.c
> > @@ -607,7 +607,8 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
> >
> >        acpi_irq_handler = handler;
> >        acpi_irq_context = context;
> > -       if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
> > +       if (request_threaded_irq(irq, NULL, acpi_irq,
> > +                                IRQF_SHARED | IRQF_ONESHOT, "acpi",
> >                                 acpi_irq)) {
> >                printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
> >                acpi_irq_handler = NULL;

OK, so now the question is whether the ACPI interrupt can end up being
shared or not.

If it can be shared, then we need a proper primary handler which
silences the interrupt at the device level and the threaded handler
needs to reenable it after finishing the processing.

Len, Andi ???

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12  8:57       ` Thomas Gleixner
@ 2012-04-12 11:42         ` Paul Bolle
  2012-04-12 17:06           ` Thomas Gleixner
  2012-04-12 14:29         ` Matthew Garrett
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Bolle @ 2012-04-12 11:42 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, 2012-04-12 at 10:57 +0200, Thomas Gleixner wrote:
> OK, so now the question is whether the ACPI interrupt can end up being
> shared or not.
> 
> If it can be shared, then we need a proper primary handler which
> silences the interrupt at the device level and the threaded handler
> needs to reenable it after finishing the processing.

0) For what it's worth, I've got a ThinkPad T41 that triggers the same
problem Jörg reported (both in v3.4-rc1 and in v3.4-rc2). I managed to
capture some further info over a serial line.

1) After printing
    "ACPI: EC: EC description table is found, configuring boot EC"

Fedora Rawhide's 3.4.0-0.rc1.git3.1.fc18.i686 loops printing:

BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
Modules linked in:
irq event stamp: 6634591
hardirqs last  enabled at (6634590): [<c09d903d>] restore_all_notrace+0x0/0x18
hardirqs last disabled at (6634591): [<c09e10ee>] common_interrupt+0x2e/0x3c
softirqs last  enabled at (302952): [<c0443645>] __do_softirq+0x115/0x330
softirqs last disabled at (303313): [<c04053e5>] do_softirq+0xa5/0x100
Modules linked in:

Pid: 1, comm: swapper/0 Not tainted 3.4.0-0.rc1.git3.1.fc18.i686 #1 IBM        /       
EIP: 0060:[<c0443592>] EFLAGS: 00000206 CPU: 0
EIP is at __do_softirq+0x62/0x330
EAX: f4538000 EBX: f4538000 ECX: 00000002 EDX: f4538484
ESI: 00000002 EDI: 00000002 EBP: f4413ff8 ESP: f4413fc4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: ff996000 CR3: 00ea7000 CR4: 000007d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process swapper/0 (pid: 1, ti=f4410000 task=f4538000 task.ti=f452a000)
Stack:
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000a
 00000100 c0c1da08 f452b950 f452a000 c0443530 f452b960 c04053e5
Call Trace:
 [<c0443530>] ? local_bh_enable_ip+0xd0/0xd0
 <IRQ> 
 [<c0443b25>] ? irq_exit+0xb5/0xc0
 [<c040501b>] ? do_IRQ+0x4b/0xc0
 [<c04a1084>] ? trace_hardirqs_on_caller+0xf4/0x180
 [<c09e10f5>] ? common_interrupt+0x35/0x3c
 [<c04a007b>] ? __lock_acquire+0x154b/0x1640
 [<c09d8cea>] ? _raw_spin_unlock_irqrestore+0x3a/0x70
 [<c046054e>] ? prepare_to_wait+0x4e/0x70
 [<c070dd6e>] ? acpi_ec_transaction_unlocked+0x13f/0x1f0
 [<c0460330>] ? wake_up_bit+0x30/0x30
 [<c070e0dd>] ? acpi_ec_transaction+0x18c/0x24f
 [<c04a111b>] ? trace_hardirqs_on+0xb/0x10
 [<c070962f>] ? acpi_os_signal_semaphore+0x67/0x76
 [<c070e23c>] ? acpi_ec_read+0x43/0x50
 [<c070e3aa>] ? acpi_ec_space_handler+0x84/0xf2
 [<c070e326>] ? acpi_ec_burst_disable+0x40/0x40
 [<c0719cd7>] ? acpi_ev_address_space_dispatch+0x201/0x253
 [<c070e326>] ? acpi_ec_burst_disable+0x40/0x40
 [<c071e2a1>] ? acpi_ex_access_region+0x2c9/0x361
 [<c071e719>] ? acpi_ex_field_datum_io+0x110/0x35f
 [<c0551180>] ? kmem_cache_alloc+0xd0/0x230
 [<c04055c7>] ? dump_trace+0x97/0x100
 [<c0733523>] ? acpi_ut_allocate_object_desc_dbg+0x3a/0xaf
 [<c071e9f6>] ? acpi_ex_extract_from_field+0x8e/0x223
 [<c0733778>] ? acpi_ut_create_integer_object+0x2a/0x39
 [<c071dd15>] ? acpi_ex_read_data_from_field+0x195/0x1ca
 [<c0721e44>] ? acpi_ex_resolve_node_to_value+0x21c/0x2a8
 [<c072217b>] ? acpi_ex_resolve_to_value+0x2ab/0x2bb
 [<c0715b67>] ? acpi_ds_evaluate_name_path+0x6d/0xd6
 [<c0715fb2>] ? acpi_ds_exec_end_op+0x8b/0x585
 [<c0715f02>] ? acpi_ds_exec_begin_op+0x169/0x18e
 [<c072bb1e>] ? acpi_ps_parse_loop+0x805/0x9b7
 [<c072c102>] ? acpi_ps_parse_aml+0x105/0x34f
 [<c072c9ea>] ? acpi_ps_execute_method+0x1ea/0x2c8
 [<c07266f3>] ? acpi_ns_evaluate+0x167/0x2cb
 [<c0726aa0>] ? acpi_ns_init_one_device+0xa0/0x14e
 [<c0726a00>] ? acpi_ns_exec_module_code_list+0x1a9/0x1a9
 [<c0729a17>] ? acpi_ns_walk_namespace+0xc8/0x171
 [<c0cd33d4>] ? acpi_sleep_proc_init+0x2e/0x2e
 [<c0726fc2>] ? acpi_ns_initialize_devices+0x132/0x1c8
 [<c0726a00>] ? acpi_ns_exec_module_code_list+0x1a9/0x1a9
 [<c07340e4>] ? acpi_initialize_objects+0xed/0xf7
 [<c0cd34e3>] ? acpi_init+0x10f/0x28c
 [<c0401202>] ? do_one_initcall+0x112/0x160
 [<c0ca88bf>] ? kernel_init+0x10f/0x1a6
 [<c0ca8154>] ? rdinit_setup+0x22/0x22
 [<c0ca87b0>] ? start_kernel+0x39f/0x39f
 [<c09e1102>] ? kernel_thread_helper+0x6/0x10
Code: 00 01 64 a1 6c 88 d5 c0 c7 45 e8 0a 00 00 00 89 45 e4 8d b4 26 00 00 00 00 64 c7 05 40 d6 e9 c0 00 00 00 00 e8 80 db 05 00 fb 90 <8d> 74 26 00 c7 45 f0 00 da c1 c0 eb 09 90 83 45 f0 04 d1 ef 74
Call Trace:
 [<c0443530>] ? local_bh_enable_ip+0xd0/0xd0
 <IRQ>  [<c0443b25>] ? irq_exit+0xb5/0xc0
 [<c040501b>] ? do_IRQ+0x4b/0xc0
 [<c04a1084>] ? trace_hardirqs_on_caller+0xf4/0x180
 [<c09e10f5>] ? common_interrupt+0x35/0x3c
 [<c04a007b>] ? __lock_acquire+0x154b/0x1640
 [<c09d8cea>] ? _raw_spin_unlock_irqrestore+0x3a/0x70
 [<c046054e>] ? prepare_to_wait+0x4e/0x70
 [<c070dd6e>] ? acpi_ec_transaction_unlocked+0x13f/0x1f0
 [<c0460330>] ? wake_up_bit+0x30/0x30
 [<c070e0dd>] ? acpi_ec_transaction+0x18c/0x24f
 [<c04a111b>] ? trace_hardirqs_on+0xb/0x10
 [<c070962f>] ? acpi_os_signal_semaphore+0x67/0x76
 [<c070e23c>] ? acpi_ec_read+0x43/0x50
 [<c070e3aa>] ? acpi_ec_space_handler+0x84/0xf2
 [<c070e326>] ? acpi_ec_burst_disable+0x40/0x40
 [<c0719cd7>] ? acpi_ev_address_space_dispatch+0x201/0x253
 [<c070e326>] ? acpi_ec_burst_disable+0x40/0x40
 [<c071e2a1>] ? acpi_ex_access_region+0x2c9/0x361
 [<c071e719>] ? acpi_ex_field_datum_io+0x110/0x35f
 [<c0551180>] ? kmem_cache_alloc+0xd0/0x230
 [<c04055c7>] ? dump_trace+0x97/0x100
 [<c0733523>] ? acpi_ut_allocate_object_desc_dbg+0x3a/0xaf
 [<c071e9f6>] ? acpi_ex_extract_from_field+0x8e/0x223
 [<c0733778>] ? acpi_ut_create_integer_object+0x2a/0x39
 [<c071dd15>] ? acpi_ex_read_data_from_field+0x195/0x1ca
 [<c0721e44>] ? acpi_ex_resolve_node_to_value+0x21c/0x2a8
 [<c072217b>] ? acpi_ex_resolve_to_value+0x2ab/0x2bb
 [<c0715b67>] ? acpi_ds_evaluate_name_path+0x6d/0xd6
 [<c0715fb2>] ? acpi_ds_exec_end_op+0x8b/0x585
 [<c0715f02>] ? acpi_ds_exec_begin_op+0x169/0x18e
 [<c072bb1e>] ? acpi_ps_parse_loop+0x805/0x9b7
 [<c072c102>] ? acpi_ps_parse_aml+0x105/0x34f
 [<c072c9ea>] ? acpi_ps_execute_method+0x1ea/0x2c8
 [<c07266f3>] ? acpi_ns_evaluate+0x167/0x2cb
 [<c0726aa0>] ? acpi_ns_init_one_device+0xa0/0x14e
 [<c0726a00>] ? acpi_ns_exec_module_code_list+0x1a9/0x1a9
 [<c0729a17>] ? acpi_ns_walk_namespace+0xc8/0x171
 [<c0cd33d4>] ? acpi_sleep_proc_init+0x2e/0x2e
 [<c0726fc2>] ? acpi_ns_initialize_devices+0x132/0x1c8
 [<c0726a00>] ? acpi_ns_exec_module_code_list+0x1a9/0x1a9
 [<c07340e4>] ? acpi_initialize_objects+0xed/0xf7
 [<c0cd34e3>] ? acpi_init+0x10f/0x28c
 [<c0401202>] ? do_one_initcall+0x112/0x160
 [<c0ca88bf>] ? kernel_init+0x10f/0x1a6
 [<c0ca8154>] ? rdinit_setup+0x22/0x22
 [<c0ca87b0>] ? start_kernel+0x39f/0x39f
 [<c09e1102>] ? kernel_thread_helper+0x6/0x10

2) And, also after printing
    "ACPI: EC: EC description table is found, configuring boot EC"

Fedora Rawhide's 3.4.0-0.rc2.git0.2.fc18.i686 loops printing:
BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
Modules linked in:
Modules linked in:

Pid: 0, comm: swapper/0 Not tainted 3.4.0-0.rc2.git0.2.fc18.i686 #1 IBM        /       
EIP: 0060:[<c043e405>] EFLAGS: 00000206 CPU: 0
EIP is at __do_softirq+0x55/0x190
EAX: 00000000 EBX: 00000002 ECX: 00000000 EDX: 00000000
ESI: c0b52000 EDI: c043e3b0 EBP: f440bff8 ESP: f440bfc4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: ff996000 CR3: 00c74000 CR4: 000007d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process swapper/0 (pid: 0, ti=f4408000 task=c0b5e020 task.ti=c0b52000)
Stack:
 bf00827e 28fffff7 ff18043a 04f7ffff ff00907f 00ddfff9 00000000 0000000a
 00000100 00000002 c0b53dd8 c0b52000 c043e3b0 c0b53de8 c0405246
Call Trace:
 [<c043e3b0>] ? local_bh_enable_ip+0x90/0x90
 <IRQ> 
 [<c043e79d>] ? irq_exit+0x9d/0xb0
 [<c0404ebb>] ? do_IRQ+0x4b/0xc0
 [<c0409588>] ? sched_clock+0x8/0x10
 [<c0467df0>] ? sched_clock_local+0xf0/0x1e0
 [<c093f170>] ? common_interrupt+0x30/0x38
 [<c065c6c3>] ? sha_transform+0xe03/0x1070
 [<c070abb6>] ? extract_buf+0x56/0x130
 [<c0460195>] ? __wake_up+0x45/0x60
 [<c070a9bd>] ? account+0xad/0x100
 [<c070b0c2>] ? extract_entropy+0x62/0x110
 [<c070b1dd>] ? get_random_bytes+0x6d/0x80
 [<c040b029>] ? cpu_idle+0x19/0xe0
 [<c0936bf3>] ? schedule+0x23/0x60
 [<c09186d5>] ? rest_init+0x5d/0x68
 [<c0bd0778>] ? start_kernel+0x367/0x36d
 [<c0bd0186>] ? loglevel+0x2b/0x2b
 [<c0bd0078>] ? i386_start_kernel+0x78/0x7d
Code: 00 01 00 00 64 a1 2c 57 c6 c0 89 5d f0 89 45 e4 c7 45 e8 0a 00 00 00 8d b4 26 00 00 00 00 64 c7 05 80 a1 c6 c0 00 00 00 00 fb 90 <8d> 74 26 00 bf 00 7a b5 c0 eb 08 83 c7 04 d1 6d f0 74 57 f6 45
Call Trace:
 [<c043e3b0>] ? local_bh_enable_ip+0x90/0x90
 <IRQ>  [<c043e79d>] ? irq_exit+0x9d/0xb0
 [<c0404ebb>] ? do_IRQ+0x4b/0xc0
 [<c0409588>] ? sched_clock+0x8/0x10
 [<c0467df0>] ? sched_clock_local+0xf0/0x1e0
 [<c093f170>] ? common_interrupt+0x30/0x38
 [<c065c6c3>] ? sha_transform+0xe03/0x1070
 [<c070abb6>] ? extract_buf+0x56/0x130
 [<c0460195>] ? __wake_up+0x45/0x60
 [<c070a9bd>] ? account+0xad/0x100
 [<c070b0c2>] ? extract_entropy+0x62/0x110
 [<c070b1dd>] ? get_random_bytes+0x6d/0x80
 [<c040b029>] ? cpu_idle+0x19/0xe0
 [<c0936bf3>] ? schedule+0x23/0x60
 [<c09186d5>] ? rest_init+0x5d/0x68
 [<c0bd0778>] ? start_kernel+0x367/0x36d
 [<c0bd0186>] ? loglevel+0x2b/0x2b
 [<c0bd0078>] ? i386_start_kernel+0x78/0x7d

3) Feel free to ask for more information.


Paul Bolle


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12  8:57       ` Thomas Gleixner
  2012-04-12 11:42         ` Paul Bolle
@ 2012-04-12 14:29         ` Matthew Garrett
  2012-04-12 17:04           ` Thomas Gleixner
  1 sibling, 1 reply; 20+ messages in thread
From: Matthew Garrett @ 2012-04-12 14:29 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, Apr 12, 2012 at 10:57:45AM +0200, Thomas Gleixner wrote:

> OK, so now the question is whether the ACPI interrupt can end up being
> shared or not.

Per spec:

"OSPM is required to treat the ACPI SCI interrupt as a sharable, level, 
active low interrupt."

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 14:29         ` Matthew Garrett
@ 2012-04-12 17:04           ` Thomas Gleixner
  2012-04-12 17:09             ` Thomas Gleixner
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-12 17:04 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, 12 Apr 2012, Matthew Garrett wrote:

> On Thu, Apr 12, 2012 at 10:57:45AM +0200, Thomas Gleixner wrote:
> 
> > OK, so now the question is whether the ACPI interrupt can end up being
> > shared or not.
> 
> Per spec:
> 
> "OSPM is required to treat the ACPI SCI interrupt as a sharable, level, 
> active low interrupt."

I feared that. So now is the question whether there is a way to mask
the interrupt at the "device" level.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 11:42         ` Paul Bolle
@ 2012-04-12 17:06           ` Thomas Gleixner
  2012-04-12 21:52             ` Paul Bolle
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-12 17:06 UTC (permalink / raw)
  To: Paul Bolle; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, 12 Apr 2012, Paul Bolle wrote:
> 3) Feel free to ask for more information.

Does the test patch I sent solve the problem for you as well?

See: https://lkml.org/lkml/2012/4/3/88

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 17:04           ` Thomas Gleixner
@ 2012-04-12 17:09             ` Thomas Gleixner
  2012-04-12 19:26               ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-12 17:09 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, 12 Apr 2012, Thomas Gleixner wrote:

> On Thu, 12 Apr 2012, Matthew Garrett wrote:
> 
> > On Thu, Apr 12, 2012 at 10:57:45AM +0200, Thomas Gleixner wrote:
> > 
> > > OK, so now the question is whether the ACPI interrupt can end up being
> > > shared or not.
> > 
> > Per spec:
> > 
> > "OSPM is required to treat the ACPI SCI interrupt as a sharable, level, 
> > active low interrupt."
> 
> I feared that. So now is the question whether there is a way to mask
> the interrupt at the "device" level.

If not, I could be persuaded to provide a mechanism to force thread
all interrupts which are on that line, if there's a good argument to
do so.

All machines in my farm do not share the line, so I guess it's only
relevant when you boot with apic disabled or such.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 17:09             ` Thomas Gleixner
@ 2012-04-12 19:26               ` Andi Kleen
  2012-04-12 20:42                 ` Thomas Gleixner
  2012-04-19  9:30                 ` Thomas Gleixner
  0 siblings, 2 replies; 20+ messages in thread
From: Andi Kleen @ 2012-04-12 19:26 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Matthew Garrett, Jörg Otte, linux-kernel, Len Brown


>> I feared that. So now is the question whether there is a way to mask
>> the interrupt at the "device" level.

Probably not. Len?

> If not, I could be persuaded to provide a mechanism to force thread
> all interrupts which are on that line, if there's a good argument to
> do so.

Hmm, we could probably just use a work queue for this instead. I guess i 
drank the threaded
interrupt kool aid too much when I did the patch :-)

I'll switch over to a workqueue.

Meanwhile the patch can be just reverted. The patch that needed it 
(delays in EC access to
work around buggy Acer EC) was not merged by  Len so far, so there's 
nothing in tree right
now that relies on the thread.

For 3.4 I think reverting is the right approach.

-Andi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 19:26               ` Andi Kleen
@ 2012-04-12 20:42                 ` Thomas Gleixner
  2012-04-18  9:25                   ` Paul Bolle
  2012-04-19  9:30                 ` Thomas Gleixner
  1 sibling, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-12 20:42 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Matthew Garrett, Jörg Otte, linux-kernel, Len Brown

On Thu, 12 Apr 2012, Andi Kleen wrote:
> > > I feared that. So now is the question whether there is a way to mask
> > > the interrupt at the "device" level.
> 
> Probably not. Len?
> 
> > If not, I could be persuaded to provide a mechanism to force thread
> > all interrupts which are on that line, if there's a good argument to
> > do so.
> 
> Hmm, we could probably just use a work queue for this instead. I guess i drank
> the threaded
> interrupt kool aid too much when I did the patch :-)
> 
> I'll switch over to a workqueue.
> 
> Meanwhile the patch can be just reverted. The patch that needed it (delays in
> EC access to
> work around buggy Acer EC) was not merged by  Len so far, so there's nothing
> in tree right
> now that relies on the thread.
> 
> For 3.4 I think reverting is the right approach.

Ack.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 17:06           ` Thomas Gleixner
@ 2012-04-12 21:52             ` Paul Bolle
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Bolle @ 2012-04-12 21:52 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Jörg Otte, Andi Kleen, linux-kernel, Len Brown

On Thu, 2012-04-12 at 19:06 +0200, Thomas Gleixner wrote:
> Does the test patch I sent solve the problem for you as well?
> 
> See: https://lkml.org/lkml/2012/4/3/88

Yes, it also solves the problem on my laptop.


Paul Bolle


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 20:42                 ` Thomas Gleixner
@ 2012-04-18  9:25                   ` Paul Bolle
  2012-04-18 10:29                     ` Thomas Gleixner
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Bolle @ 2012-04-18  9:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andi Kleen, Matthew Garrett, Jörg Otte, linux-kernel, Len Brown

On Thu, 2012-04-12 at 22:42 +0200, Thomas Gleixner wrote:
> > For 3.4 I think reverting is the right approach.
> 
> Ack.

A revert of 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d ("ACPI: Make ACPI
interrupt threaded") wasn't included in v3.4-rc3. Has it been pushed
somewhere? Is testing needed before that revert can be pushed?


Paul Bolle


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-18  9:25                   ` Paul Bolle
@ 2012-04-18 10:29                     ` Thomas Gleixner
  2012-04-18 11:07                       ` Paul Bolle
  2012-04-18 17:24                       ` Linus Torvalds
  0 siblings, 2 replies; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-18 10:29 UTC (permalink / raw)
  To: Paul Bolle
  Cc: Andi Kleen, Matthew Garrett, Jörg Otte, LKML, Len Brown,
	Linus Torvalds, Rafael J. Wysocki

B1;2601;0cOn Wed, 18 Apr 2012, Paul Bolle wrote:
> On Thu, 2012-04-12 at 22:42 +0200, Thomas Gleixner wrote:
> > > For 3.4 I think reverting is the right approach.
> > 
> > Ack.
> 
> A revert of 6fe0d0628245fdcd6fad8b837c81e8f7ebc3364d ("ACPI: Make ACPI
> interrupt threaded") wasn't included in v3.4-rc3. Has it been pushed
> somewhere? Is testing needed before that revert can be pushed?

Seems those who broke it are too busy to care.

Linus, can you please apply directly?

------------>
Subject: ACPI: Revert 6fe0d06 ("ACPI: Make ACPI interrupt threaded")
From: Thomas Gleixner <tglx@linutronix.de>

Paul bisected this regression to 6fe0d06 ("ACPI: Make ACPI interrupt
threaded"). 

The conversion was done blindly and is wrong, as it does not provide a
primary handler to disable the level type irq on the device
level. Neither does it set the IRQF_ONESHOT flag which handles that at
the irq line level. This can't be done as the interrupt might be
shared, though we might extend the core to force it.

So an interrupt on this line will wake up the thread, but immediately
unmask the irq after that. Due to the interrupt being level type the
hardware interrupt is raised over and over and prevents the irq thread
from handling it. Fail.

request_irq() unfortunately does not refuse such a request and the
patch was obviously never tested with real interrupts.

Bisected-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git b/drivers/acpi/osl.c a/drivers/acpi/osl.c
index 02367a8..412a1e0 100644
--- b/drivers/acpi/osl.c
+++ a/drivers/acpi/osl.c
@@ -595,8 +595,7 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
 
 	acpi_irq_handler = handler;
 	acpi_irq_context = context;
-	if (request_threaded_irq(irq, NULL, acpi_irq, IRQF_SHARED, "acpi",
-				 acpi_irq)) {
+	if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
 		printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
 		acpi_irq_handler = NULL;
 		return AE_NOT_ACQUIRED;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-18 10:29                     ` Thomas Gleixner
@ 2012-04-18 11:07                       ` Paul Bolle
  2012-04-18 17:24                       ` Linus Torvalds
  1 sibling, 0 replies; 20+ messages in thread
From: Paul Bolle @ 2012-04-18 11:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andi Kleen, Matthew Garrett, Jörg Otte, LKML, Len Brown,
	Linus Torvalds, Rafael J. Wysocki

On Wed, 2012-04-18 at 12:29 +0200, Thomas Gleixner wrote:
> Subject: ACPI: Revert 6fe0d06 ("ACPI: Make ACPI interrupt threaded")
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Paul bisected this regression to 6fe0d06 ("ACPI: Make ACPI interrupt
> threaded"). 
> 
> [...]
>
> Bisected-by: Paul Bolle <pebolle@tiscali.nl>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Please do s/Paul/Jörg/ and s/Paul Bolle <pebolle@tiscali.nl>/Jörg Otte
<jrg.otte@googlemail.com>/. Jörg did all that work! I just provided some
extra information that apparently wasn't really needed anymore.


Paul Bolle


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-18 10:29                     ` Thomas Gleixner
  2012-04-18 11:07                       ` Paul Bolle
@ 2012-04-18 17:24                       ` Linus Torvalds
  2012-04-18 19:25                         ` Thomas Gleixner
  1 sibling, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2012-04-18 17:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul Bolle, Andi Kleen, Matthew Garrett, Jörg Otte, LKML,
	Len Brown, Rafael J. Wysocki

On Wed, Apr 18, 2012 at 3:29 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Linus, can you please apply directly?

Applied. Can we perhaps also make request_threaded_irq() noisily
reject things that are not marked IRQF_ONESHOT and have a NULL
low-level handler?

That still allows people to do IRQF_SHARED | IRQF_ONESHOT, and that
won't be rejected until somebody else actually tries to share it. Then
it breaks that somebody else ;(

Sadly, that would tend to break fairly silently.

I wonder if we should make the mismatch error printout be
unconditional (and make just the "dump_stack()" part be conditional on
CONFIG_DEBUG_SHIRQ)?

                 Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-18 17:24                       ` Linus Torvalds
@ 2012-04-18 19:25                         ` Thomas Gleixner
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-18 19:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Bolle, Andi Kleen, Matthew Garrett, Jörg Otte, LKML,
	Len Brown, Rafael J. Wysocki

On Wed, 18 Apr 2012, Linus Torvalds wrote:

> On Wed, Apr 18, 2012 at 3:29 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Linus, can you please apply directly?
> 
> Applied. Can we perhaps also make request_threaded_irq() noisily
> reject things that are not marked IRQF_ONESHOT and have a NULL
> low-level handler?

Yes, it's on my (ever growing) todo list.
 
> That still allows people to do IRQF_SHARED | IRQF_ONESHOT, and that
> won't be rejected until somebody else actually tries to share it. Then
> it breaks that somebody else ;(
> 
> Sadly, that would tend to break fairly silently.
> 
> I wonder if we should make the mismatch error printout be
> unconditional (and make just the "dump_stack()" part be conditional on
> CONFIG_DEBUG_SHIRQ)?

That's my plan.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v3.4-rc1] ACPI regression bisected
  2012-04-12 19:26               ` Andi Kleen
  2012-04-12 20:42                 ` Thomas Gleixner
@ 2012-04-19  9:30                 ` Thomas Gleixner
  1 sibling, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2012-04-19  9:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Matthew Garrett, Jörg Otte, linux-kernel, Len Brown

On Thu, 12 Apr 2012, Andi Kleen wrote:

> 
> > > I feared that. So now is the question whether there is a way to mask
> > > the interrupt at the "device" level.
> 
> Probably not. Len?
> 
> > If not, I could be persuaded to provide a mechanism to force thread
> > all interrupts which are on that line, if there's a good argument to
> > do so.
> 
> Hmm, we could probably just use a work queue for this instead. I guess i drank
> the threaded
> interrupt kool aid too much when I did the patch :-)
> 
> I'll switch over to a workqueue.

That still requires that you can silence the interrupt at the device
level in the first place unless you play silly games with
disable_irq_nosync(), which is even worse than having a
IRQF_FORCE_ONESHOT mechanism in the core.

If you go the workqueue way, then you basically split the interrupt
into a hard irq context and a thread handler. So it's the same to
request a real threaded interrupt with a primary and a thread handler
and return IRQF_WAKE_THREAD conditionally from the primary handler,
when you need the thread context for further processing.

Thoughts?

	tglx


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-04-19  9:30 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-02 15:05 [v3.4-rc1] ACPI regression bisected Jörg Otte
2012-04-02 23:54 ` Andi Kleen
2012-04-03  8:11   ` Thomas Gleixner
2012-04-12  8:35     ` Jörg Otte
2012-04-12  8:57       ` Thomas Gleixner
2012-04-12 11:42         ` Paul Bolle
2012-04-12 17:06           ` Thomas Gleixner
2012-04-12 21:52             ` Paul Bolle
2012-04-12 14:29         ` Matthew Garrett
2012-04-12 17:04           ` Thomas Gleixner
2012-04-12 17:09             ` Thomas Gleixner
2012-04-12 19:26               ` Andi Kleen
2012-04-12 20:42                 ` Thomas Gleixner
2012-04-18  9:25                   ` Paul Bolle
2012-04-18 10:29                     ` Thomas Gleixner
2012-04-18 11:07                       ` Paul Bolle
2012-04-18 17:24                       ` Linus Torvalds
2012-04-18 19:25                         ` Thomas Gleixner
2012-04-19  9:30                 ` Thomas Gleixner
2012-04-03  9:13   ` Jörg Otte

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.