* 3.11.0-rc1:  Crash on rmmod of ath10k_pci
@ 2013-07-17 20:30 Ben Greear
  2013-07-18  5:25 ` Michal Kazior
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2013-07-17 20:30 UTC
  To: ath10k

So, started testing on a v2 ath10k board today.  wlanX showed up
once I updated to 3.11.0-rc1 (didn't work on a 3.10 -wl kernel I
had lying around).

rmmod of ath10k_pci blows up pretty spectacularly however...

I'll go looking for a different tree to test upon...

[root@LEC2270-1 ~]# BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa054c6bd>] ath10k_ce_per_engine_service+0x2c/0x17a [ath10k_pci]
PGD 1f12d9067 PUD 1f1299067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: nf_nat_ipv4 nf_nat 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen lockd sunrpc f71882fg coretemp hwmon snd_hda_codec_hdmi snd_hda_codec_rea]
CPU: 1 PID: 5085 Comm: rmmod Tainted: G         C O 3.11.0-rc1+ #2
Hardware name: To be filled by O.E.M. To be filled by O.E.M./HURONRIVER, BIOS 4.6.5 05/02/2012
task: ffff8801ecb7ae80 ti: ffff8801e8934000 task.ti: ffff8801e8934000
RIP: 0010:[<ffffffffa054c6bd>]  [<ffffffffa054c6bd>] ath10k_ce_per_engine_service+0x2c/0x17a [ath10k_pci]
RSP: 0018:ffff88021fa83e78  EFLAGS: 00010286
RAX: ffff88021598c000 RBX: 0000000000000000 RCX: ffff88021fa83ec8
RDX: ffff88021598c0f8 RSI: 0000000000000000 RDI: ffff88020d015f20
RBP: ffff88021fa83ed8 R08: ffff88021fa83f88 R09: ffff88021fa8e640
R10: ffff88021fa930a0 R11: ffff88021fa83ee8 R12: ffff88020d015f20
R13: ffff88021598c450 R14: 0000000000000009 R15: 0000000000000030
FS:  00007f3fe79f6740(0000) GS:ffff88021fa80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 00000001f1223000 CR4: 00000000000407e0
Stack:
  ffff88021fa83e88 ffffffff81579152 ffff88021fa83f08 ffffffff810ab7be
  ffffffff810305d4 ffff88021fa8db40 ffff88021fa83ed8 ffff88021598c168
  ffff88021598c100 0000000000000101 0000000000000009 0000000000000030
Call Trace:
  <IRQ>
  [<ffffffff81579152>] ? _raw_spin_unlock_irq+0x26/0x31
  [<ffffffff810ab7be>] ? run_timer_softirq+0x1dd/0x1ec
  [<ffffffff810305d4>] ? lapic_next_deadline+0x2f/0x36
  [<ffffffffa05496bb>] ath10k_pci_ce_tasklet+0x15/0x17 [ath10k_pci]
  [<ffffffff810a5872>] tasklet_action+0x78/0xc6
  [<ffffffff810a606c>] __do_softirq+0xc4/0x19d
  [<ffffffff810a61cd>] irq_exit+0x46/0xa3
  [<ffffffff810317de>] smp_apic_timer_interrupt+0x2a/0x37
  [<ffffffff8157eddd>] apic_timer_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff81579104>] ? _raw_spin_unlock_irqrestore+0xf/0x37
  [<ffffffff81108871>] __free_irq+0x116/0x1a4
  [<ffffffff81108971>] free_irq+0x72/0x8b
  [<ffffffffa0549479>] ath10k_pci_stop_intr+0x35/0x5c [ath10k_pci]
  [<ffffffffa0549504>] ath10k_pci_remove+0x64/0xad [ath10k_pci]
  [<ffffffff812d42c8>] pci_device_remove+0x3a/0x91
  [<ffffffff8139227f>] __device_release_driver+0x84/0xda
  [<ffffffff8139235f>] driver_detach+0x8a/0xb0
  [<ffffffff81391356>] bus_remove_driver+0xb4/0xd7
  [<ffffffff81392dda>] driver_unregister+0x67/0x6f
  [<ffffffff812d444b>] pci_unregister_driver+0x20/0x85
  [<ffffffffa054d238>] ath10k_pci_exit+0x10/0x12 [ath10k_pci]
  [<ffffffff810ed068>] SyS_delete_module+0x1f7/0x25b
  [<ffffffff8100b6a2>] ? do_notify_resume+0x58/0x69
  [<ffffffff8157e1e9>] system_call_fastpath+0x16/0x1b
Code: 89 f6 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 38 4c 8b af 90 01 00 00 49 8b 9c f5 58 04 00 00 49 81 c5 50 04 00 00 <44> 8b 73 10 e8 c3 fc ff ff 4
RIP  [<ffffffffa054c6bd>] ath10k_ce_per_engine_service+0x2c/0x17a [ath10k_pci]
  RSP <ffff88021fa83e78>
CR2: 0000000000000010
---[ end trace ca9bd6378a42a1a7 ]---

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


* Re: 3.11.0-rc1: Crash on rmmod of ath10k_pci
  2013-07-17 20:30 3.11.0-rc1: Crash on rmmod of ath10k_pci Ben Greear
@ 2013-07-18  5:25 ` Michal Kazior
  2013-07-18 14:30   ` Ben Greear
  0 siblings, 1 reply; 4+ messages in thread
From: Michal Kazior @ 2013-07-18  5:25 UTC
  To: Ben Greear; +Cc: ath10k

Hi Ben,


On 17 July 2013 22:30, Ben Greear <greearb@candelatech.com> wrote:
> So, started testing on a v2 ath10k board today.  wlanX showed up
> once I updated to 3.11.0-rc1 (didn't work on a 3.10 -wl kernel I
> had lying around).
>
> rmmod of ath10k_pci blows up pretty spectacularly however...
>
> I'll go looking for a different tree to test upon...

You seem to have been lucky enough to trigger a race. CE is torn down
before we really stop interrupts, so there's a very small window in which
an interrupt can still come in. This is strange, since CE interrupts are
disabled and only the firmware (i.e. crash) interrupt may come in. I've
never seen the firmware crash during module unloading. I have a patch for
this, but it's based on my recovery patchset and it's not trivial to
rebase the fix.
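
For illustration only, here is a minimal userspace sketch of the ordering
problem described above. It is not the driver code and not the pending
patch: struct ce_state, struct dev, make_dev(), fire_interrupt() and the two
teardown functions are hypothetical names invented for this model. The one
detail taken from the report is that the oops shows RBX = 0 and a fault at
address 0x10, which is consistent with reading a field at offset 0x10
through a NULL per-engine pointer, so the model places a field at that
offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-engine CE state; not the real ath10k struct. */
struct ce_state {
    unsigned char pad[0x10];   /* puts the next field at offset 0x10 */
    unsigned int ctrl_addr;
};

struct dev {
    struct ce_state *ce;       /* NULL once CE has been torn down */
    int irq_registered;        /* 1 while the handler can still run */
};

static struct dev make_dev(void)
{
    struct dev d;

    d.ce = calloc(1, sizeof(*d.ce));
    assert(d.ce != NULL);
    d.irq_registered = 1;
    return d;
}

/*
 * Models an interrupt arriving right now.
 * Returns 0 if no handler runs, 1 if the handler runs safely, and
 * -1 if the handler would read ce->ctrl_addr through a NULL pointer.
 */
static int fire_interrupt(const struct dev *d)
{
    if (!d->irq_registered)
        return 0;
    if (d->ce == NULL)
        return -1;
    (void)d->ce->ctrl_addr;
    return 1;
}

/* Racy order: state is freed while the handler can still be invoked. */
static int teardown_racy(struct dev *d)
{
    int r;

    free(d->ce);
    d->ce = NULL;
    r = fire_interrupt(d);     /* late interrupt lands in the window */
    d->irq_registered = 0;
    return r;
}

/* Safe order: stop the handler first, then free the state. */
static int teardown_safe(struct dev *d)
{
    int r;

    d->irq_registered = 0;     /* stands in for free_irq() and tasklet_kill() */
    r = fire_interrupt(d);     /* the same late interrupt now finds no handler */
    free(d->ce);
    d->ce = NULL;
    return r;
}
```

In this model, teardown_racy() on a fresh make_dev() returns -1 (the handler
would dereference NULL), while teardown_safe() returns 0 because the handler
can no longer run by the time the state is freed. The real fix may of course
look different; the sketch only shows why the order of the two steps matters.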


Pozdrawiam / Best regards,
Michał Kazior.


* Re: 3.11.0-rc1: Crash on rmmod of ath10k_pci
  2013-07-18  5:25 ` Michal Kazior
@ 2013-07-18 14:30   ` Ben Greear
  2013-07-19 10:41     ` Kalle Valo
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2013-07-18 14:30 UTC
  To: Michal Kazior; +Cc: ath10k

On 07/17/2013 10:25 PM, Michal Kazior wrote:
> Hi Ben,
>
>
> On 17 July 2013 22:30, Ben Greear <greearb@candelatech.com> wrote:
>> So, started testing on a v2 ath10k board today.  wlanX showed up
>> once I updated to 3.11.0-rc1 (didn't work on a 3.10 -wl kernel I
>> had lying around).
>>
>> rmmod of ath10k_pci blows up pretty spectacularly however...
>>
>> I'll go looking for a different tree to test upon...
>
> You seem to have been lucky enough to trigger a race. CE is torn down
> before we really stop interrupts, so there's a very small window in which
> an interrupt can still come in. This is strange, since CE interrupts are
> disabled and only the firmware (i.e. crash) interrupt may come in. I've
> never seen the firmware crash during module unloading. I have a patch for
> this, but it's based on my recovery patchset and it's not trivial to
> rebase the fix.

I plan to do further testing on the 'ath' repository, so hopefully there
is no need to port patches around.  If/when that is stable, I can see
about testing mainline.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: 3.11.0-rc1: Crash on rmmod of ath10k_pci
  2013-07-18 14:30   ` Ben Greear
@ 2013-07-19 10:41     ` Kalle Valo
  0 siblings, 0 replies; 4+ messages in thread
From: Kalle Valo @ 2013-07-19 10:41 UTC
  To: Ben Greear; +Cc: Michal Kazior, ath10k

Ben Greear <greearb@candelatech.com> writes:

>> You seem to have been lucky enough to trigger a race. CE is torn down
>> before we really stop interrupts, so there's a very small window in which
>> an interrupt can still come in. This is strange, since CE interrupts are
>> disabled and only the firmware (i.e. crash) interrupt may come in. I've
>> never seen the firmware crash during module unloading. I have a patch for
>> this, but it's based on my recovery patchset and it's not trivial to
>> rebase the fix.
>
> I plan to do further testing on the 'ath' repository, so hopefully there
> is no need to port patches around.  If/when that is stable, I can see
> about testing mainline.

Yes, please base your work on the ath.git repository instead of mainline
(i.e. Linus' releases). At this stage, when we don't have that many users,
we would rather not spend time on mainline; that time is much better
spent on improving the driver itself.

-- 
Kalle Valo
