linux-arm-kernel.lists.infradead.org archive mirror
* [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
@ 2021-06-01 11:02 Alexandru Elisei
  2021-06-03  3:07 ` Peter Chen
  2021-06-03  6:30 ` Felipe Balbi
  0 siblings, 2 replies; 9+ messages in thread
From: Alexandru Elisei @ 2021-06-01 11:02 UTC (permalink / raw)
  To: balbi, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm

I've been seeing the following panic when shutting down my rockpro64:

[   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
[   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
00000000000000a0
[   21.683858] Mem abort info:
[   21.684104]   ESR = 0x96000004
[   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
[   21.684841]   SET = 0, FnV = 0
[   21.685111]   EA = 0, S1PTW = 0
[   21.685389] Data abort info:
[   21.685644]   ISV = 0, ISS = 0x00000004
[   21.686024]   CM = 0, WnR = 0
[   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
[   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
[   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   21.687941] Modules linked in:
[   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
5.12.0-rc7-00262-g568262bf5492 #33
[   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
[   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   21.689884] pc : down_read_interruptible+0xec/0x200
[   21.690321] lr : simple_recursive_removal+0x48/0x280
[   21.690761] sp : ffff800011f4b940
[   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
[   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
[   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
[   21.692459] x23: 0000000000000080 x22: 0000000000000000
[   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
[   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
[   21.693863] x17: 0000000000000000 x16: 0000000000000000
[   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
[   21.694799] x13: 0000000000000000 x12: 0000000000000020
[   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
[   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
[   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
[   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
[   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
[   21.698075] Call trace:
[   21.698291]  down_read_interruptible+0xec/0x200
[   21.698690]  debugfs_remove+0x60/0x84
[   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
[   21.699363]  dwc3_remove+0x34/0x1a0
[   21.699672]  platform_remove+0x28/0x60
[   21.700005]  __device_release_driver+0x188/0x230
[   21.700414]  device_release_driver+0x2c/0x44
[   21.700791]  bus_remove_device+0x124/0x130
[   21.701154]  device_del+0x168/0x420
[   21.701462]  platform_device_del.part.0+0x1c/0x90
[   21.701877]  platform_device_unregister+0x28/0x44
[   21.702291]  of_platform_device_destroy+0xe8/0x100
[   21.702716]  device_for_each_child_reverse+0x64/0xb4
[   21.703153]  of_platform_depopulate+0x40/0x84
[   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
[   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
[   21.704337]  platform_shutdown+0x28/0x40
[   21.704683]  device_shutdown+0x158/0x330
[   21.705029]  kernel_power_off+0x38/0x7c
[   21.705372]  __do_sys_reboot+0x16c/0x2a0
[   21.705719]  __arm64_sys_reboot+0x28/0x34
[   21.706074]  el0_svc_common.constprop.0+0x60/0x120
[   21.706499]  do_el0_svc+0x28/0x94
[   21.706794]  el0_svc+0x2c/0x54
[   21.707067]  el0_sync_handler+0xa4/0x130
[   21.707414]  el0_sync+0x170/0x180
[   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
[   21.708250] ---[ end trace 5ae08147542eb468 ]---
[   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   21.709456] Kernel Offset: disabled
[   21.709762] CPU features: 0x00240022,2100600c
[   21.710146] Memory Limit: 2048 MB
[   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b ]---

I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
information if needed and I can help test the fix.

Thanks,

Alex 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-01 11:02 [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() Alexandru Elisei
@ 2021-06-03  3:07 ` Peter Chen
  2021-06-03  6:30 ` Felipe Balbi
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Chen @ 2021-06-03  3:07 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: balbi, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm

On 21-06-01 12:02:34, Alexandru Elisei wrote:
> I've been seeing the following panic when shutting down my rockpro64:
> 
> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
> 00000000000000a0
> [   21.683858] Mem abort info:
> [   21.684104]   ESR = 0x96000004
> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   21.684841]   SET = 0, FnV = 0
> [   21.685111]   EA = 0, S1PTW = 0
> [   21.685389] Data abort info:
> [   21.685644]   ISV = 0, ISS = 0x00000004
> [   21.686024]   CM = 0, WnR = 0
> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
> [   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [   21.687941] Modules linked in:
> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
> 5.12.0-rc7-00262-g568262bf5492 #33
> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [   21.689884] pc : down_read_interruptible+0xec/0x200
> [   21.690321] lr : simple_recursive_removal+0x48/0x280
> [   21.690761] sp : ffff800011f4b940
> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
> [   21.698075] Call trace:
> [   21.698291]  down_read_interruptible+0xec/0x200
> [   21.698690]  debugfs_remove+0x60/0x84
> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
> [   21.699363]  dwc3_remove+0x34/0x1a0
> [   21.699672]  platform_remove+0x28/0x60
> [   21.700005]  __device_release_driver+0x188/0x230
> [   21.700414]  device_release_driver+0x2c/0x44
> [   21.700791]  bus_remove_device+0x124/0x130
> [   21.701154]  device_del+0x168/0x420
> [   21.701462]  platform_device_del.part.0+0x1c/0x90
> [   21.701877]  platform_device_unregister+0x28/0x44
> [   21.702291]  of_platform_device_destroy+0xe8/0x100
> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
> [   21.703153]  of_platform_depopulate+0x40/0x84
> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
> [   21.704337]  platform_shutdown+0x28/0x40
> [   21.704683]  device_shutdown+0x158/0x330
> [   21.705029]  kernel_power_off+0x38/0x7c
> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
> [   21.705719]  __arm64_sys_reboot+0x28/0x34
> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
> [   21.706499]  do_el0_svc+0x28/0x94
> [   21.706794]  el0_svc+0x2c/0x54
> [   21.707067]  el0_sync_handler+0xa4/0x130
> [   21.707414]  el0_sync+0x170/0x180
> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [   21.709456] Kernel Offset: disabled
> [   21.709762] CPU features: 0x00240022,2100600c
> [   21.710146] Memory Limit: 2048 MB
> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b ]---
> 

I see that down_read_interruptible() is called from sys_perf_event_open(). Could you
work out how removing the debugfs entries relates to the perf event functions?

-- 

Thanks,
Peter Chen


* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-01 11:02 [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() Alexandru Elisei
  2021-06-03  3:07 ` Peter Chen
@ 2021-06-03  6:30 ` Felipe Balbi
  2021-06-03 10:41   ` Alexandru Elisei
  1 sibling, 1 reply; 9+ messages in thread
From: Felipe Balbi @ 2021-06-03  6:30 UTC (permalink / raw)
  To: Alexandru Elisei, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm


Hi,

Alexandru Elisei <alexandru.elisei@arm.com> writes:
> I've been seeing the following panic when shutting down my rockpro64:
>
> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
> 00000000000000a0
> [   21.683858] Mem abort info:
> [   21.684104]   ESR = 0x96000004
> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   21.684841]   SET = 0, FnV = 0
> [   21.685111]   EA = 0, S1PTW = 0
> [   21.685389] Data abort info:
> [   21.685644]   ISV = 0, ISS = 0x00000004
> [   21.686024]   CM = 0, WnR = 0
> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
> [   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [   21.687941] Modules linked in:
> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
> 5.12.0-rc7-00262-g568262bf5492 #33
> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [   21.689884] pc : down_read_interruptible+0xec/0x200
> [   21.690321] lr : simple_recursive_removal+0x48/0x280
> [   21.690761] sp : ffff800011f4b940
> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
> [   21.698075] Call trace:
> [   21.698291]  down_read_interruptible+0xec/0x200
> [   21.698690]  debugfs_remove+0x60/0x84
> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
> [   21.699363]  dwc3_remove+0x34/0x1a0
> [   21.699672]  platform_remove+0x28/0x60
> [   21.700005]  __device_release_driver+0x188/0x230
> [   21.700414]  device_release_driver+0x2c/0x44
> [   21.700791]  bus_remove_device+0x124/0x130
> [   21.701154]  device_del+0x168/0x420
> [   21.701462]  platform_device_del.part.0+0x1c/0x90
> [   21.701877]  platform_device_unregister+0x28/0x44
> [   21.702291]  of_platform_device_destroy+0xe8/0x100
> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
> [   21.703153]  of_platform_depopulate+0x40/0x84
> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
> [   21.704337]  platform_shutdown+0x28/0x40
> [   21.704683]  device_shutdown+0x158/0x330
> [   21.705029]  kernel_power_off+0x38/0x7c
> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
> [   21.705719]  __arm64_sys_reboot+0x28/0x34
> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
> [   21.706499]  do_el0_svc+0x28/0x94
> [   21.706794]  el0_svc+0x2c/0x54
> [   21.707067]  el0_sync_handler+0xa4/0x130
> [   21.707414]  el0_sync+0x170/0x180
> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [   21.709456] Kernel Offset: disabled
> [   21.709762] CPU features: 0x00240022,2100600c
> [   21.710146] Memory Limit: 2048 MB
> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b ]---
>
> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
> information if needed and I can help test the fix.

if you simply revert that commit in HEAD, does the problem really go
away?

Oh wait, it should go away, yes. dwc3_shutdown() just calls
dwc3_remove() directly, so we end up calling
debugfs_remove_recursive() twice.

Sandeep, can you fix this one?

-- 
balbi

* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-03  6:30 ` Felipe Balbi
@ 2021-06-03 10:41   ` Alexandru Elisei
  2021-06-03 11:40     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandru Elisei @ 2021-06-03 10:41 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm

Hello Felipe,

Thank you for having a look!

On 6/3/21 7:30 AM, Felipe Balbi wrote:
> Hi,
>
> Alexandru Elisei <alexandru.elisei@arm.com> writes:
>> I've been seeing the following panic when shutting down my rockpro64:
>>
>> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
>> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
>> 00000000000000a0
>> [   21.683858] Mem abort info:
>> [   21.684104]   ESR = 0x96000004
>> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   21.684841]   SET = 0, FnV = 0
>> [   21.685111]   EA = 0, S1PTW = 0
>> [   21.685389] Data abort info:
>> [   21.685644]   ISV = 0, ISS = 0x00000004
>> [   21.686024]   CM = 0, WnR = 0
>> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
>> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
>> [   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>> [   21.687941] Modules linked in:
>> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
>> 5.12.0-rc7-00262-g568262bf5492 #33
>> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
>> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
>> [   21.689884] pc : down_read_interruptible+0xec/0x200
>> [   21.690321] lr : simple_recursive_removal+0x48/0x280
>> [   21.690761] sp : ffff800011f4b940
>> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
>> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
>> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
>> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
>> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
>> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
>> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
>> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
>> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
>> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
>> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
>> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
>> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
>> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
>> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
>> [   21.698075] Call trace:
>> [   21.698291]  down_read_interruptible+0xec/0x200
>> [   21.698690]  debugfs_remove+0x60/0x84
>> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
>> [   21.699363]  dwc3_remove+0x34/0x1a0
>> [   21.699672]  platform_remove+0x28/0x60
>> [   21.700005]  __device_release_driver+0x188/0x230
>> [   21.700414]  device_release_driver+0x2c/0x44
>> [   21.700791]  bus_remove_device+0x124/0x130
>> [   21.701154]  device_del+0x168/0x420
>> [   21.701462]  platform_device_del.part.0+0x1c/0x90
>> [   21.701877]  platform_device_unregister+0x28/0x44
>> [   21.702291]  of_platform_device_destroy+0xe8/0x100
>> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
>> [   21.703153]  of_platform_depopulate+0x40/0x84
>> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
>> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
>> [   21.704337]  platform_shutdown+0x28/0x40
>> [   21.704683]  device_shutdown+0x158/0x330
>> [   21.705029]  kernel_power_off+0x38/0x7c
>> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
>> [   21.705719]  __arm64_sys_reboot+0x28/0x34
>> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
>> [   21.706499]  do_el0_svc+0x28/0x94
>> [   21.706794]  el0_svc+0x2c/0x54
>> [   21.707067]  el0_sync_handler+0xa4/0x130
>> [   21.707414]  el0_sync+0x170/0x180
>> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
>> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
>> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>> [   21.709456] Kernel Offset: disabled
>> [   21.709762] CPU features: 0x00240022,2100600c
>> [   21.710146] Memory Limit: 2048 MB
>> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
>> exitcode=0x0000000b ]---
>>
>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
>> information if needed and I can help test the fix.
> if you simply revert that commit in HEAD, does the problem really go
> away?

With a kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic
is still there. Reverting the offending commit, 568262bf5492, makes the panic disappear.

Thanks,

Alex

>
> Oh wait, it should go away, yes. dwc3_shutdown() just calls
> dwc3_remove() directly, so we end up calling
> debugfs_remove_recursive() twice.
>
> Sandeep, can you fix this one?
>

* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-03 10:41   ` Alexandru Elisei
@ 2021-06-03 11:40     ` Greg Kroah-Hartman
  2021-06-03 12:05       ` Alexandru Elisei
  0 siblings, 1 reply; 9+ messages in thread
From: Greg Kroah-Hartman @ 2021-06-03 11:40 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Felipe Balbi, p.zabel, linux-usb, Linux Kernel Mailing List,
	arm-mail-list, sanm

On Thu, Jun 03, 2021 at 11:41:45AM +0100, Alexandru Elisei wrote:
> Hello Felipe,
> 
> Thank you for having a look!
> 
> On 6/3/21 7:30 AM, Felipe Balbi wrote:
> > Hi,
> >
> > Alexandru Elisei <alexandru.elisei@arm.com> writes:
> >> I've been seeing the following panic when shutting down my rockpro64:
> >>
> >> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
> >> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
> >> 00000000000000a0
> >> [   21.683858] Mem abort info:
> >> [   21.684104]   ESR = 0x96000004
> >> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
> >> [   21.684841]   SET = 0, FnV = 0
> >> [   21.685111]   EA = 0, S1PTW = 0
> >> [   21.685389] Data abort info:
> >> [   21.685644]   ISV = 0, ISS = 0x00000004
> >> [   21.686024]   CM = 0, WnR = 0
> >> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
> >> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
> >> [   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> >> [   21.687941] Modules linked in:
> >> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
> >> 5.12.0-rc7-00262-g568262bf5492 #33
> >> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
> >> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> >> [   21.689884] pc : down_read_interruptible+0xec/0x200
> >> [   21.690321] lr : simple_recursive_removal+0x48/0x280
> >> [   21.690761] sp : ffff800011f4b940
> >> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
> >> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
> >> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
> >> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
> >> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
> >> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
> >> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
> >> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
> >> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
> >> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> >> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
> >> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
> >> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
> >> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
> >> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
> >> [   21.698075] Call trace:
> >> [   21.698291]  down_read_interruptible+0xec/0x200
> >> [   21.698690]  debugfs_remove+0x60/0x84
> >> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
> >> [   21.699363]  dwc3_remove+0x34/0x1a0
> >> [   21.699672]  platform_remove+0x28/0x60
> >> [   21.700005]  __device_release_driver+0x188/0x230
> >> [   21.700414]  device_release_driver+0x2c/0x44
> >> [   21.700791]  bus_remove_device+0x124/0x130
> >> [   21.701154]  device_del+0x168/0x420
> >> [   21.701462]  platform_device_del.part.0+0x1c/0x90
> >> [   21.701877]  platform_device_unregister+0x28/0x44
> >> [   21.702291]  of_platform_device_destroy+0xe8/0x100
> >> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
> >> [   21.703153]  of_platform_depopulate+0x40/0x84
> >> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
> >> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
> >> [   21.704337]  platform_shutdown+0x28/0x40
> >> [   21.704683]  device_shutdown+0x158/0x330
> >> [   21.705029]  kernel_power_off+0x38/0x7c
> >> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
> >> [   21.705719]  __arm64_sys_reboot+0x28/0x34
> >> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
> >> [   21.706499]  do_el0_svc+0x28/0x94
> >> [   21.706794]  el0_svc+0x2c/0x54
> >> [   21.707067]  el0_sync_handler+0xa4/0x130
> >> [   21.707414]  el0_sync+0x170/0x180
> >> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
> >> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
> >> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> >> [   21.709456] Kernel Offset: disabled
> >> [   21.709762] CPU features: 0x00240022,2100600c
> >> [   21.710146] Memory Limit: 2048 MB
> >> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
> >> exitcode=0x0000000b ]---
> >>
> >> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
> >> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
> >> information if needed and I can help test the fix.
> > if you simply revert that commit in HEAD, does the problem really go
> > away?
> 
> > With a kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic
> > is still there. Reverting the offending commit, 568262bf5492, makes the panic disappear.

Want to send a revert so I can take it now?

* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-03 11:40     ` Greg Kroah-Hartman
@ 2021-06-03 12:05       ` Alexandru Elisei
  2021-06-03 14:35         ` Felipe Balbi
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandru Elisei @ 2021-06-03 12:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Felipe Balbi, p.zabel, linux-usb, Linux Kernel Mailing List,
	arm-mail-list, sanm

Hi Greg,

On 6/3/21 12:40 PM, Greg Kroah-Hartman wrote:
> On Thu, Jun 03, 2021 at 11:41:45AM +0100, Alexandru Elisei wrote:
>> Hello Felipe,
>>
>> Thank you for having a look!
>>
>> On 6/3/21 7:30 AM, Felipe Balbi wrote:
>>> Hi,
>>>
>>> Alexandru Elisei <alexandru.elisei@arm.com> writes:
>>>> I've been seeing the following panic when shutting down my rockpro64:
>>>>
>>>> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
>>>> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
>>>> 00000000000000a0
>>>> [   21.683858] Mem abort info:
>>>> [   21.684104]   ESR = 0x96000004
>>>> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>> [   21.684841]   SET = 0, FnV = 0
>>>> [   21.685111]   EA = 0, S1PTW = 0
>>>> [   21.685389] Data abort info:
>>>> [   21.685644]   ISV = 0, ISS = 0x00000004
>>>> [   21.686024]   CM = 0, WnR = 0
>>>> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
>>>> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
>>>> [   21.687452] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>>> [   21.687941] Modules linked in:
>>>> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
>>>> 5.12.0-rc7-00262-g568262bf5492 #33
>>>> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
>>>> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
>>>> [   21.689884] pc : down_read_interruptible+0xec/0x200
>>>> [   21.690321] lr : simple_recursive_removal+0x48/0x280
>>>> [   21.690761] sp : ffff800011f4b940
>>>> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
>>>> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
>>>> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
>>>> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
>>>> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
>>>> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
>>>> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
>>>> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
>>>> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
>>>> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
>>>> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
>>>> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
>>>> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
>>>> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
>>>> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
>>>> [   21.698075] Call trace:
>>>> [   21.698291]  down_read_interruptible+0xec/0x200
>>>> [   21.698690]  debugfs_remove+0x60/0x84
>>>> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
>>>> [   21.699363]  dwc3_remove+0x34/0x1a0
>>>> [   21.699672]  platform_remove+0x28/0x60
>>>> [   21.700005]  __device_release_driver+0x188/0x230
>>>> [   21.700414]  device_release_driver+0x2c/0x44
>>>> [   21.700791]  bus_remove_device+0x124/0x130
>>>> [   21.701154]  device_del+0x168/0x420
>>>> [   21.701462]  platform_device_del.part.0+0x1c/0x90
>>>> [   21.701877]  platform_device_unregister+0x28/0x44
>>>> [   21.702291]  of_platform_device_destroy+0xe8/0x100
>>>> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
>>>> [   21.703153]  of_platform_depopulate+0x40/0x84
>>>> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
>>>> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
>>>> [   21.704337]  platform_shutdown+0x28/0x40
>>>> [   21.704683]  device_shutdown+0x158/0x330
>>>> [   21.705029]  kernel_power_off+0x38/0x7c
>>>> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
>>>> [   21.705719]  __arm64_sys_reboot+0x28/0x34
>>>> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
>>>> [   21.706499]  do_el0_svc+0x28/0x94
>>>> [   21.706794]  el0_svc+0x2c/0x54
>>>> [   21.707067]  el0_sync_handler+0xa4/0x130
>>>> [   21.707414]  el0_sync+0x170/0x180
>>>> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
>>>> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
>>>> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>>>> [   21.709456] Kernel Offset: disabled
>>>> [   21.709762] CPU features: 0x00240022,2100600c
>>>> [   21.710146] Memory Limit: 2048 MB
>>>> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
>>>> exitcode=0x0000000b ]---
>>>>
>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
>>>> information if needed and I can help test the fix.
>>> if you simply revert that commit in HEAD, does the problem really go
>>> away?
>> With a kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic
>> is still there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
> Want to send a revert so I can take it now?

I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
so I'll leave it up to Felipe to decide how to proceed.

Thanks,

Alex


* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
  2021-06-03 12:05       ` Alexandru Elisei
@ 2021-06-03 14:35         ` Felipe Balbi
       [not found]           ` <20210603173632.GA25299@jackp-linux.qualcomm.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Felipe Balbi @ 2021-06-03 14:35 UTC (permalink / raw)
  To: Alexandru Elisei, Greg Kroah-Hartman
  Cc: p.zabel, linux-usb, Linux Kernel Mailing List, arm-mail-list, sanm


Hi,

Alexandru Elisei <alexandru.elisei@arm.com> writes:
> On 6/3/21 12:40 PM, Greg Kroah-Hartman wrote:
>> On Thu, Jun 03, 2021 at 11:41:45AM +0100, Alexandru Elisei wrote:
>>> Hello Felipe,
>>>
>>> Thank you for having a look!
>>>
>>> On 6/3/21 7:30 AM, Felipe Balbi wrote:
>>>> Hi,
>>>>
>>>> Alexandru Elisei <alexandru.elisei@arm.com> writes:
>>>>> I've been seeing the following panic when shutting down my rockpro64:
>>>>>
>>>>> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
>>>>> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
>>>>> 00000000000000a0
>>>>> [   21.683858] Mem abort info:
>>>>> [   21.684104]   ESR = 0x96000004
>>>>> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>>> [   21.684841]   SET = 0, FnV = 0
>>>>> [   21.685111]   EA = 0, S1PTW = 0
>>>>> [   21.685389] Data abort info:
>>>>> [   21.685644]   ISV = 0, ISS = 0x00000004
>>>>> [   21.686024]   CM = 0, WnR = 0
>>>>> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
>>>>> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
>>>>> [   21.687452] Internal error: Oops: 96000004EEMPT SMP
>>>>> [   21.687941] Modules linked in:
>>>>> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
>>>>> 5.12.0-rc7-00262-g568262bf5492 #33
>>>>> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
>>>>> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
>>>>> [   21.689884] pc : down_read_interruptible+0xec/0x200
>>>>> [   21.690321] lr : simple_recursive_removal+0x48/0x280
>>>>> [   21.690761] sp : ffff800011f4b940
>>>>> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
>>>>> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
>>>>> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
>>>>> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
>>>>> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
>>>>> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
>>>>> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
>>>>> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
>>>>> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
>>>>> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
>>>>> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
>>>>> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
>>>>> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
>>>>> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
>>>>> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
>>>>> [   21.698075] Call trace:
>>>>> [   21.698291]  down_read_interruptible+0xec/0x200
>>>>> [   21.698690]  debugfs_remove+0x60/0x84
>>>>> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
>>>>> [   21.699363]  dwc3_remove+0x34/0x1a0
>>>>> [   21.699672]  platform_remove+0x28/0x60
>>>>> [   21.700005]  __device_release_driver+0x188/0x230
>>>>> [   21.700414]  device_release_driver+0x2c/0x44
>>>>> [   21.700791]  bus_remove_device+0x124/0x130
>>>>> [   21.701154]  device_del+0x168/0x420
>>>>> [   21.701462]  platform_device_del.part.0+0x1c/0x90
>>>>> [   21.701877]  platform_device_unregister+0x28/0x44
>>>>> [   21.702291]  of_platform_device_destroy+0xe8/0x100
>>>>> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
>>>>> [   21.703153]  of_platform_depopulate+0x40/0x84
>>>>> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
>>>>> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
>>>>> [   21.704337]  platform_shutdown+0x28/0x40
>>>>> [   21.704683]  device_shutdown+0x158/0x330
>>>>> [   21.705029]  kernel_power_off+0x38/0x7c
>>>>> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
>>>>> [   21.705719]  __arm64_sys_reboot+0x28/0x34
>>>>> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
>>>>> [   21.706499]  do_el0_svc+0x28/0x94
>>>>> [   21.706794]  el0_svc+0x2c/0x54
>>>>> [   21.707067]  el0_sync_handler+0xa4/0x130
>>>>> [   21.707414]  el0_sync+0x170/0x180
>>>>> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
>>>>> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
>>>>> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>>>>> [   21.709456] Kernel Offset: disabled
>>>>> [   21.709762] CPU features: 0x00240022,2100600c
>>>>> [   21.710146] Memory Limit: 2048 MB
>>>>> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
>>>>> exitcode=0x0000000b ]---
>>>>>
>>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
>>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
>>>>> information if needed and I can help test the fix.
>>>> if you simply revert that commit in HEAD, does the problem really go
>>>> away?
>>> Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is
>>> there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
>> Want to send a revert so I can take it now?
>
> I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
> so I'll leave it up to Felipe to decide how to proceed.

I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi
<balbi@kernel.org> to it.

Sandeep, please send a new version that doesn't encounter the same
issue. Make sure to test by reloading the driver in a tight loop for
several iterations.

-- 
balbi


* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
       [not found]           ` <20210603173632.GA25299@jackp-linux.qualcomm.com>
@ 2021-06-04  8:20             ` Felipe Balbi
       [not found]               ` <20210607180023.GA23045@jackp-linux.qualcomm.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Felipe Balbi @ 2021-06-04  8:20 UTC (permalink / raw)
  To: Jack Pham
  Cc: Alexandru Elisei, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm




Hi,

Jack Pham <jackp@codeaurora.org> writes:
>> >>>> Alexandru Elisei <alexandru.elisei@arm.com> writes:
>> >>>>> I've been seeing the following panic when shutting down my rockpro64:
>> >>>>>
>> >>>>> [   21.459064] xhci-hcd xhci-hcd.0.auto: USB bus 5 deregistered
>> >>>>> [   21.683077] Unable to handle kernel NULL pointer dereference at virtual address
>> >>>>> 00000000000000a0
>> >>>>> [   21.683858] Mem abort info:
>> >>>>> [   21.684104]   ESR = 0x96000004
>> >>>>> [   21.684375]   EC = 0x25: DABT (current EL), IL = 32 bits
>> >>>>> [   21.684841]   SET = 0, FnV = 0
>> >>>>> [   21.685111]   EA = 0, S1PTW = 0
>> >>>>> [   21.685389] Data abort info:
>> >>>>> [   21.685644]   ISV = 0, ISS = 0x00000004
>> >>>>> [   21.686024]   CM = 0, WnR = 0
>> >>>>> [   21.686288] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000757a000
>> >>>>> [   21.686853] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
>> >>>>> [   21.687452] Internal error: Oops: 96000004EEMPT SMP
>> >>>>> [   21.687941] Modules linked in:
>> >>>>> [   21.688214] CPU: 4 PID: 1 Comm: shutdown Not tainted
>> >>>>> 5.12.0-rc7-00262-g568262bf5492 #33
>> >>>>> [   21.688915] Hardware name: Pine64 RockPro64 v2.0 (DT)
>> >>>>> [   21.689357] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
>> >>>>> [   21.689884] pc : down_read_interruptible+0xec/0x200
>> >>>>> [   21.690321] lr : simple_recursive_removal+0x48/0x280
>> >>>>> [   21.690761] sp : ffff800011f4b940
>> >>>>> [   21.691053] x29: ffff800011f4b940 x28: ffff000000809b40
>> >>>>> [   21.691522] x27: ffff000000809b98 x26: ffff8000114f5170
>> >>>>> [   21.691990] x25: 00000000000000a0 x24: ffff800011e84030
>> >>>>> [   21.692459] x23: 0000000000000080 x22: 0000000000000000
>> >>>>> [   21.692927] x21: ffff800011ecaa5c x20: ffff800011ecaa60
>> >>>>> [   21.693395] x19: ffff000000809b40 x18: ffffffffffffffff
>> >>>>> [   21.693863] x17: 0000000000000000 x16: 0000000000000000
>> >>>>> [   21.694331] x15: ffff800091f4ba6d x14: 0000000000000004
>> >>>>> [   21.694799] x13: 0000000000000000 x12: 0000000000000020
>> >>>>> [   21.695267] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
>> >>>>> [   21.695735] x9 : 6f6c746364716e62 x8 : 7f7f7f7f7f7f7f7f
>> >>>>> [   21.696203] x7 : fefefeff6364626d x6 : 0000000000001bd8
>> >>>>> [   21.696671] x5 : 0000000000000000 x4 : 0000000000000000
>> >>>>> [   21.697138] x3 : 00000000000000a0 x2 : 0000000000000001
>> >>>>> [   21.697606] x1 : 0000000000000000 x0 : 00000000000000a0
>> >>>>> [   21.698075] Call trace:
>> >>>>> [   21.698291]  down_read_interruptible+0xec/0x200
>> >>>>> [   21.698690]  debugfs_remove+0x60/0x84
>> >>>>> [   21.699016]  dwc3_debugfs_exit+0x1c/0x6c
>> >>>>> [   21.699363]  dwc3_remove+0x34/0x1a0
>> >>>>> [   21.699672]  platform_remove+0x28/0x60
>> >>>>> [   21.700005]  __device_release_driver+0x188/0x230
>> >>>>> [   21.700414]  device_release_driver+0x2c/0x44
>> >>>>> [   21.700791]  bus_remove_device+0x124/0x130
>> >>>>> [   21.701154]  device_del+0x168/0x420
>> >>>>> [   21.701462]  platform_device_del.part.0+0x1c/0x90
>> >>>>> [   21.701877]  platform_device_unregister+0x28/0x44
>> >>>>> [   21.702291]  of_platform_device_destroy+0xe8/0x100
>> >>>>> [   21.702716]  device_for_each_child_reverse+0x64/0xb4
>> >>>>> [   21.703153]  of_platform_depopulate+0x40/0x84
>> >>>>> [   21.703538]  __dwc3_of_simple_teardown+0x20/0xd4
>> >>>>> [   21.703945]  dwc3_of_simple_shutdown+0x14/0x20
>> >>>>> [   21.704337]  platform_shutdown+0x28/0x40
>> >>>>> [   21.704683]  device_shutdown+0x158/0x330
>> >>>>> [   21.705029]  kernel_power_off+0x38/0x7c
>> >>>>> [   21.705372]  __do_sys_reboot+0x16c/0x2a0
>> >>>>> [   21.705719]  __arm64_sys_reboot+0x28/0x34
>> >>>>> [   21.706074]  el0_svc_common.constprop.0+0x60/0x120
>> >>>>> [   21.706499]  do_el0_svc+0x28/0x94
>> >>>>> [   21.706794]  el0_svc+0x2c/0x54
>> >>>>> [   21.707067]  el0_sync_handler+0xa4/0x130
>> >>>>> [   21.707414]  el0_sync+0x170/0x180
>> >>>>> [   21.707711] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
>> >>>>> [   21.708250] ---[ end trace 5ae08147542eb468 ]---
>> >>>>> [   21.708667] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>> >>>>> [   21.709456] Kernel Offset: disabled
>> >>>>> [   21.709762] CPU features: 0x00240022,2100600c
>> >>>>> [   21.710146] Memory Limit: 2048 MB
>> >>>>> [   21.710443] ---[ end Kernel panic - not syncing: Attempted to kill init!
>> >>>>> exitcode=0x0000000b ]---
>> >>>>>
>> >>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
>> >>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
>> >>>>> information if needed and I can help test the fix.
>> >>>> if you simply revert that commit in HEAD, does the problem really go
>> >>>> away?
>> >>> Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is
>> >>> there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
>> >> Want to send a revert so I can take it now?
>> >
>> > I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
>> > so I'll leave it up to Felipe to decide how to proceed.
>> 
>> I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi
>> <balbi@kernel.org> to it.
>> 
>> Sandeep, please send a new version that doesn't encounter the same
>> issue. Make sure to test by reloading the driver in a tight loop for
>> several iterations.
>
> This would probably be tricky to test on other "glue" drivers as the
> problem appears to be specific only to dwc3_of_simple.  It looks like
> both dwc3_of_simple and the dwc3 core now (due to 568262bf5492) each
> implement respective .shutdown callbacks. The latter is simply a wrapper
> around dwc3_remove(). And from the panic call stack above we see that
> dwc3_of_simple_shutdown() calls of_platform_depopulate() which will 
> again call dwc3_remove() resulting in the double remove.
>
> So would an alternative approach be to protect against dwc3_remove()
> getting called multiple times? IMO it'd be a bit messy to have to add

no, I don't think so. That sounds like a workaround. We should be able
to guarantee that ->remove() doesn't get called twice using the driver
model properly.

> additional checks there to know if it had already been called. So maybe
> avoid it altogether--should dwc3_of_simple_shutdown() just skip calling
> of_platform_depopulate()?

I don't know what the idiomatic approach is nowadays, but at least early
on, we had to call depopulate.
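
To make the double remove described above concrete, here is a small
userspace model in C. It is only a sketch: the toy_* names and the
struct are hypothetical stand-ins, not the real dwc3 code, and it
assumes the core's .shutdown runs before the glue's .shutdown. The
point it illustrates is that a .shutdown which calls the remove path
directly leaves the driver model believing the device is still bound,
so the later depopulate runs the remove path a second time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the core device; not the real struct dwc3. */
struct toy_core {
	bool bound;        /* what the driver model believes */
	int remove_calls;  /* how many times the remove path actually ran */
};

/* Stand-in for dwc3_remove(): written to run exactly once per bind. */
static void toy_core_remove(struct toy_core *c)
{
	c->remove_calls++;
}

/*
 * Stand-in for a core .shutdown that simply wraps remove. It bypasses
 * the driver model, so 'bound' is never cleared.
 */
static void toy_core_shutdown(struct toy_core *c)
{
	toy_core_remove(c);
}

/*
 * Stand-in for the glue .shutdown calling of_platform_depopulate():
 * the driver model unbinds every child it still considers bound.
 */
static void toy_glue_shutdown(struct toy_core *child)
{
	if (child->bound) {
		toy_core_remove(child);
		child->bound = false;
	}
}

/* Runs both callbacks and reports how often the remove path executed. */
static int toy_shutdown_sequence(void)
{
	struct toy_core core = { .bound = true, .remove_calls = 0 };

	toy_core_shutdown(&core);  /* assumed to run first (child device) */
	toy_glue_shutdown(&core);  /* parent glue, depopulates children */
	return core.remove_calls;
}
```

In this model toy_shutdown_sequence() reports two runs of the remove
path for a single bind, which is the situation the oops points at.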

-- 
balbi



* Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
       [not found]               ` <20210607180023.GA23045@jackp-linux.qualcomm.com>
@ 2021-06-10 10:11                 ` Felipe Balbi
  0 siblings, 0 replies; 9+ messages in thread
From: Felipe Balbi @ 2021-06-10 10:11 UTC (permalink / raw)
  To: Jack Pham
  Cc: Alexandru Elisei, Greg Kroah-Hartman, p.zabel, linux-usb,
	Linux Kernel Mailing List, arm-mail-list, sanm




Hi,

Jack Pham <jackp@codeaurora.org> writes:
> On Fri, Jun 04, 2021 at 11:20:12AM +0300, Felipe Balbi wrote:
>> Jack Pham <jackp@codeaurora.org> writes:
>> >> >>>> Alexandru Elisei <alexandru.elisei@arm.com> writes:
>> >> >>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
>> >> >>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
>> >> >>>>> information if needed and I can help test the fix.
>> >> >>>> if you simply revert that commit in HEAD, does the problem really go
>> >> >>>> away?
>> >> >>> Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is
>> >> >>> there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
>> >> >> Want to send a revert so I can take it now?
>> >> >
>> >> > I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
>> >> > so I'll leave it up to Felipe to decide how to proceed.
>> >> 
>> >> I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi
>> >> <balbi@kernel.org> to it.
>> >> 
>> >> Sandeep, please send a new version that doesn't encounter the same
>> >> issue. Make sure to test by reloading the driver in a tight loop for
>> >> several iterations.
>> >
>> > This would probably be tricky to test on other "glue" drivers as the
>> > problem appears to be specific only to dwc3_of_simple.  It looks like
>> > both dwc3_of_simple and the dwc3 core now (due to 568262bf5492) each
>> > implement respective .shutdown callbacks. The latter is simply a wrapper
>> > around dwc3_remove(). And from the panic call stack above we see that
>> > dwc3_of_simple_shutdown() calls of_platform_depopulate() which will 
>> > again call dwc3_remove() resulting in the double remove.
>> >
>> > So would an alternative approach be to protect against dwc3_remove()
>> > getting called multiple times? IMO it'd be a bit messy to have to add
>> 
>> no, I don't think so. That sounds like a workaround. We should be able
>> to guarantee that ->remove() doesn't get called twice using the driver
>> model properly.
>
> Completely fair.  So then having a .shutdown callback that directly calls
> dwc3_remove() is probably not the right thing to do, as it completely
> bypasses the driver model. If and when the driver core later releases
> the device from the driver, that's how we end up with the double
> remove.

yeah, I would agree with that.

>> > additional checks there to know if it had already been called. So maybe
>> > avoid it altogether--should dwc3_of_simple_shutdown() just skip calling
>> > of_platform_depopulate()?
>> 
>> I don't know what the idiomatic approach is nowadays, but at least early
>> on, we had to call depopulate.
>
> So any suggestions on how to fix the original issue Sandeep was trying
> to fix with 568262bf5492? Maybe implement .shutdown in dwc3_qcom and have
> it follow what dwc3_of_simple does with of_platform_depopulate()? But
> then wouldn't other "glues" want/need to follow suit?

I think we can implement shutdown in core, but we need to be careful with
it. Instead of just blindly calling remove, let's extract the common
parts to another internal function that both remove and shutdown
call. debugfs removal should not be part of that generic method :-)

Anything in that generic method should, probably, be idempotent.
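
A rough userspace sketch of that split, in C, assuming a made-up
struct and made-up toy_* function names (none of this is the actual
dwc3 code or a proposed patch). It only shows the shape: a common
teardown guarded by its own state so repeated calls are harmless, with
debugfs cleanup kept in remove alone.

```c
#include <stdbool.h>

/* Hypothetical model of the controller state; names are illustrative. */
struct toy_dwc {
	bool hw_active;        /* controller still running */
	bool debugfs_present;  /* debugfs entries still exist */
	int hw_stops;          /* counts real hardware stop operations */
	int debugfs_removals;  /* counts debugfs cleanups */
};

/*
 * Common teardown shared by remove and shutdown. Idempotent: once the
 * hardware has been stopped, further calls do nothing.
 */
static void toy_teardown_common(struct toy_dwc *d)
{
	if (!d->hw_active)
		return;
	d->hw_active = false;
	d->hw_stops++;
}

/* shutdown: only quiesce the hardware, leave driver-model state alone. */
static void toy_shutdown(struct toy_dwc *d)
{
	toy_teardown_common(d);
}

/*
 * remove: full unbind. debugfs cleanup lives only here, since the
 * driver model calls remove once per bind.
 */
static void toy_remove(struct toy_dwc *d)
{
	d->debugfs_present = false;
	d->debugfs_removals++;
	toy_teardown_common(d);
}
```

With this shape, calling toy_shutdown() and then toy_remove() on the
same device stops the hardware once and removes debugfs once.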

-- 
balbi


end of thread, other threads:[~2021-06-10 10:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-01 11:02 [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() Alexandru Elisei
2021-06-03  3:07 ` Peter Chen
2021-06-03  6:30 ` Felipe Balbi
2021-06-03 10:41   ` Alexandru Elisei
2021-06-03 11:40     ` Greg Kroah-Hartman
2021-06-03 12:05       ` Alexandru Elisei
2021-06-03 14:35         ` Felipe Balbi
     [not found]           ` <20210603173632.GA25299@jackp-linux.qualcomm.com>
2021-06-04  8:20             ` Felipe Balbi
     [not found]               ` <20210607180023.GA23045@jackp-linux.qualcomm.com>
2021-06-10 10:11                 ` Felipe Balbi
