linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD?
@ 2017-11-16  2:00 John Stultz
  2017-11-16 20:58 ` John Stultz
  0 siblings, 1 reply; 13+ messages in thread
From: John Stultz @ 2017-11-16  2:00 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: Rob Herring, lkml, Sumit Semwal

Hey folks,
   After updating to Linus' HEAD today, I'm seeing the following odd
boot time crash with the HiKey board (which uses the serdev driver).

[    1.963009] Unable to handle kernel read from unreadable memory at
virtual address 406f127000
[    1.963012] Mem abort info:
[    1.963015]   ESR = 0x96000005
[    1.963018]   Exception class = DABT (current EL), IL = 32 bits
[    1.963021]   SET = 0, FnV = 0
[    1.963023]   EA = 0, S1PTW = 0
[    1.963025] Data abort info:
[    1.963027]   ISV = 0, ISS = 0x00000005
[    1.963030]   CM = 0, WnR = 0
[    1.963032] [000000406f127000] user address but active_mm is swapper
[    1.963038] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    1.963046] CPU: 1 PID: 1282 Comm: kworker/u17:1 Not tainted
4.14.0-07281-g1b386f4 #666
[    1.963050] Hardware name: HiKey Development Board (DT)
[    1.963068] Workqueue: hci0 hci_cmd_work
[    1.963074] task: ffffffc0753c8000 task.stack: ffffff800b6c0000
[    1.963079] pstate: 80400005 (Nzcv daif +PAN -UAO)
[    1.963090] pc : hci_uart_tx_wakeup+0x38/0x148
[    1.963095] lr : hci_uart_tx_wakeup+0x30/0x148
[    1.963098] sp : ffffff800b6c3d10
[    1.963101] x29: ffffff800b6c3d10 x28: 0000000000000000
[    1.963107] x27: ffffffc074a79c78 x26: ffffffc07504d830
[    1.963112] x25: ffffff8008f76a30 x24: ffffffc07510a840
[    1.963117] x23: ffffffc07510aa10 x22: 0000000000000001
[    1.963122] x21: ffffff8008cd2000 x20: ffffffc0351cf288
[    1.963128] x19: ffffffc0351cf218 x18: 0000000000000000
[    1.963132] x17: 0000000000000000 x16: 0000000000000000
[    1.963137] x15: 0000000000000000 x14: ffffffc005fa1c00
[    1.963142] x13: 000000406f127000 x12: 0000000034d5d91d
[    1.963147] x11: 0000000000000400 x10: ffffffc077f8c480
[    1.963152] x9 : 0000000000000000 x8 : 0000000000000005
[    1.963157] x7 : 0000000000000000 x6 : ffffff8008f59e88
[    1.963162] x5 : 0000000000000003 x4 : 000000406f127000
[    1.963167] x3 : 0000000000000001 x2 : 000000406f127000
[    1.963172] x1 : ffffff8008cd2d28 x0 : 0000000000000000
[    1.963179] Process kworker/u17:1 (pid: 1282, stack limit =
0xffffff800b6c0000)
[    1.963182] Call trace:
[    1.963188]  hci_uart_tx_wakeup+0x38/0x148
[    1.963193]  hci_uart_send_frame+0x28/0x38
[    1.963197]  hci_send_frame+0x64/0xc0
[    1.963201]  hci_cmd_work+0x98/0x110
[    1.963209]  process_one_work+0x134/0x330
[    1.963214]  worker_thread+0x130/0x468
[    1.963220]  kthread+0xf8/0x128
[    1.963227]  ret_from_fork+0x10/0x18
[    1.963234] Code: 9134a2a0 97f3be03 f9402280 d538d082 (b8626801)
[    1.963239] ---[ end trace 457a26b9096bec64 ]---


I need to spend some more time doing more in depth debugging here, but
I have to close up for the night and run for dinner, so I wanted to
send out the initial report now.

thanks
-john

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD?
  2017-11-16  2:00 Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD? John Stultz
@ 2017-11-16 20:58 ` John Stultz
  2017-11-16 21:30   ` Lukas Wunner
  0 siblings, 1 reply; 13+ messages in thread
From: John Stultz @ 2017-11-16 20:58 UTC (permalink / raw)
  To: Marcel Holtmann, Ronald Tschalär, Lukas Wunner
  Cc: Rob Herring, lkml, Sumit Semwal

On Wed, Nov 15, 2017 at 6:00 PM, John Stultz <john.stultz@linaro.org> wrote:
> Hey folks,
>    After updating to Linus' HEAD today, I'm seeing the following odd
> boot time crash with the HiKey board (which uses the serdev driver).
>
> [    1.963009] Unable to handle kernel read from unreadable memory at
> virtual address 406f127000
> [    1.963012] Mem abort info:
> [    1.963015]   ESR = 0x96000005
> [    1.963018]   Exception class = DABT (current EL), IL = 32 bits
> [    1.963021]   SET = 0, FnV = 0
> [    1.963023]   EA = 0, S1PTW = 0
> [    1.963025] Data abort info:
> [    1.963027]   ISV = 0, ISS = 0x00000005
> [    1.963030]   CM = 0, WnR = 0
> [    1.963032] [000000406f127000] user address but active_mm is swapper
> [    1.963038] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [    1.963046] CPU: 1 PID: 1282 Comm: kworker/u17:1 Not tainted
> 4.14.0-07281-g1b386f4 #666
> [    1.963050] Hardware name: HiKey Development Board (DT)
> [    1.963068] Workqueue: hci0 hci_cmd_work
> [    1.963074] task: ffffffc0753c8000 task.stack: ffffff800b6c0000
> [    1.963079] pstate: 80400005 (Nzcv daif +PAN -UAO)
> [    1.963090] pc : hci_uart_tx_wakeup+0x38/0x148
> [    1.963095] lr : hci_uart_tx_wakeup+0x30/0x148
> [    1.963098] sp : ffffff800b6c3d10
> [    1.963101] x29: ffffff800b6c3d10 x28: 0000000000000000
> [    1.963107] x27: ffffffc074a79c78 x26: ffffffc07504d830
> [    1.963112] x25: ffffff8008f76a30 x24: ffffffc07510a840
> [    1.963117] x23: ffffffc07510aa10 x22: 0000000000000001
> [    1.963122] x21: ffffff8008cd2000 x20: ffffffc0351cf288
> [    1.963128] x19: ffffffc0351cf218 x18: 0000000000000000
> [    1.963132] x17: 0000000000000000 x16: 0000000000000000
> [    1.963137] x15: 0000000000000000 x14: ffffffc005fa1c00
> [    1.963142] x13: 000000406f127000 x12: 0000000034d5d91d
> [    1.963147] x11: 0000000000000400 x10: ffffffc077f8c480
> [    1.963152] x9 : 0000000000000000 x8 : 0000000000000005
> [    1.963157] x7 : 0000000000000000 x6 : ffffff8008f59e88
> [    1.963162] x5 : 0000000000000003 x4 : 000000406f127000
> [    1.963167] x3 : 0000000000000001 x2 : 000000406f127000
> [    1.963172] x1 : ffffff8008cd2d28 x0 : 0000000000000000
> [    1.963179] Process kworker/u17:1 (pid: 1282, stack limit =
> 0xffffff800b6c0000)
> [    1.963182] Call trace:
> [    1.963188]  hci_uart_tx_wakeup+0x38/0x148
> [    1.963193]  hci_uart_send_frame+0x28/0x38
> [    1.963197]  hci_send_frame+0x64/0xc0
> [    1.963201]  hci_cmd_work+0x98/0x110
> [    1.963209]  process_one_work+0x134/0x330
> [    1.963214]  worker_thread+0x130/0x468
> [    1.963220]  kthread+0xf8/0x128
> [    1.963227]  ret_from_fork+0x10/0x18
> [    1.963234] Code: 9134a2a0 97f3be03 f9402280 d538d082 (b8626801)
> [    1.963239] ---[ end trace 457a26b9096bec64 ]---
>

So I bisected this down and the issue looks like the boot regression is from:
   67d2f8781b9f ("Bluetooth: hci_ldisc: Allow sleeping while proto
locks are held")

Reverting that change makes things work again.

thanks
-john

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD?
  2017-11-16 20:58 ` John Stultz
@ 2017-11-16 21:30   ` Lukas Wunner
  2017-11-16 21:45     ` Lukas Wunner
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Wunner @ 2017-11-16 21:30 UTC (permalink / raw)
  To: John Stultz
  Cc: Marcel Holtmann, Ronald Tschalär, Rob Herring, lkml, Sumit Semwal

Hi John,

thanks for reporting this and apologies for the breakage.

On Thu, Nov 16, 2017 at 12:58:27PM -0800, John Stultz wrote:
> On Wed, Nov 15, 2017 at 6:00 PM, John Stultz <john.stultz@linaro.org> wrote:
> >    After updating to Linus' HEAD today, I'm seeing the following odd
> > boot time crash with the HiKey board (which uses the serdev driver).
> >
> > [    1.963009] Unable to handle kernel read from unreadable memory at
> > virtual address 406f127000
> > [    1.963012] Mem abort info:
> > [    1.963015]   ESR = 0x96000005
> > [    1.963018]   Exception class = DABT (current EL), IL = 32 bits
> > [    1.963021]   SET = 0, FnV = 0
> > [    1.963023]   EA = 0, S1PTW = 0
> > [    1.963025] Data abort info:
> > [    1.963027]   ISV = 0, ISS = 0x00000005
> > [    1.963030]   CM = 0, WnR = 0
> > [    1.963032] [000000406f127000] user address but active_mm is swapper
> > [    1.963038] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> > [    1.963046] CPU: 1 PID: 1282 Comm: kworker/u17:1 Not tainted
> > 4.14.0-07281-g1b386f4 #666
> > [    1.963050] Hardware name: HiKey Development Board (DT)
> > [    1.963068] Workqueue: hci0 hci_cmd_work
> > [    1.963074] task: ffffffc0753c8000 task.stack: ffffff800b6c0000
> > [    1.963079] pstate: 80400005 (Nzcv daif +PAN -UAO)
> > [    1.963090] pc : hci_uart_tx_wakeup+0x38/0x148
> > [    1.963095] lr : hci_uart_tx_wakeup+0x30/0x148
> > [    1.963098] sp : ffffff800b6c3d10
> > [    1.963101] x29: ffffff800b6c3d10 x28: 0000000000000000
> > [    1.963107] x27: ffffffc074a79c78 x26: ffffffc07504d830
> > [    1.963112] x25: ffffff8008f76a30 x24: ffffffc07510a840
> > [    1.963117] x23: ffffffc07510aa10 x22: 0000000000000001
> > [    1.963122] x21: ffffff8008cd2000 x20: ffffffc0351cf288
> > [    1.963128] x19: ffffffc0351cf218 x18: 0000000000000000
> > [    1.963132] x17: 0000000000000000 x16: 0000000000000000
> > [    1.963137] x15: 0000000000000000 x14: ffffffc005fa1c00
> > [    1.963142] x13: 000000406f127000 x12: 0000000034d5d91d
> > [    1.963147] x11: 0000000000000400 x10: ffffffc077f8c480
> > [    1.963152] x9 : 0000000000000000 x8 : 0000000000000005
> > [    1.963157] x7 : 0000000000000000 x6 : ffffff8008f59e88
> > [    1.963162] x5 : 0000000000000003 x4 : 000000406f127000
> > [    1.963167] x3 : 0000000000000001 x2 : 000000406f127000
> > [    1.963172] x1 : ffffff8008cd2d28 x0 : 0000000000000000
> > [    1.963179] Process kworker/u17:1 (pid: 1282, stack limit =
> > 0xffffff800b6c0000)
> > [    1.963182] Call trace:
> > [    1.963188]  hci_uart_tx_wakeup+0x38/0x148
> > [    1.963193]  hci_uart_send_frame+0x28/0x38
> > [    1.963197]  hci_send_frame+0x64/0xc0
> > [    1.963201]  hci_cmd_work+0x98/0x110
> > [    1.963209]  process_one_work+0x134/0x330
> > [    1.963214]  worker_thread+0x130/0x468
> > [    1.963220]  kthread+0xf8/0x128
> > [    1.963227]  ret_from_fork+0x10/0x18
> > [    1.963234] Code: 9134a2a0 97f3be03 f9402280 d538d082 (b8626801)
> > [    1.963239] ---[ end trace 457a26b9096bec64 ]---
> >
> 
> So I bisected this down and the issue looks like the boot regression is from:
>    67d2f8781b9f ("Bluetooth: hci_ldisc: Allow sleeping while proto
> locks are held")
> 
> Reverting that change makes things work again.

Hm, I notice percpu_init_rwsem() is only ever called in the hci_ldisc
case (from hci_uart_tty_open()) but apparently never in the hci_serdev
case.  Could that be the culprit?  Wondering why it's working in the
rwlock case then, perhaps by luck?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD?
  2017-11-16 21:30   ` Lukas Wunner
@ 2017-11-16 21:45     ` Lukas Wunner
  2017-11-16 22:29       ` John Stultz
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Wunner @ 2017-11-16 21:45 UTC (permalink / raw)
  To: John Stultz
  Cc: Marcel Holtmann, Ronald Tschalär, Rob Herring, lkml,
	Sumit Semwal, linux-bluetooth

On Thu, Nov 16, 2017 at 10:30:45PM +0100, Lukas Wunner wrote:
> On Thu, Nov 16, 2017 at 12:58:27PM -0800, John Stultz wrote:
> > On Wed, Nov 15, 2017 at 6:00 PM, John Stultz <john.stultz@linaro.org> wrote:
> > >    After updating to Linus' HEAD today, I'm seeing the following odd
> > > boot time crash with the HiKey board (which uses the serdev driver).
> > >
> > > [    1.963009] Unable to handle kernel read from unreadable memory at
> > > virtual address 406f127000
> > > [    1.963012] Mem abort info:
> > > [    1.963015]   ESR = 0x96000005
> > > [    1.963018]   Exception class = DABT (current EL), IL = 32 bits
> > > [    1.963021]   SET = 0, FnV = 0
> > > [    1.963023]   EA = 0, S1PTW = 0
> > > [    1.963025] Data abort info:
> > > [    1.963027]   ISV = 0, ISS = 0x00000005
> > > [    1.963030]   CM = 0, WnR = 0
> > > [    1.963032] [000000406f127000] user address but active_mm is swapper
> > > [    1.963038] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> > > [    1.963046] CPU: 1 PID: 1282 Comm: kworker/u17:1 Not tainted
> > > 4.14.0-07281-g1b386f4 #666
> > > [    1.963050] Hardware name: HiKey Development Board (DT)
> > > [    1.963068] Workqueue: hci0 hci_cmd_work
> > > [    1.963074] task: ffffffc0753c8000 task.stack: ffffff800b6c0000
> > > [    1.963079] pstate: 80400005 (Nzcv daif +PAN -UAO)
> > > [    1.963090] pc : hci_uart_tx_wakeup+0x38/0x148
> > > [    1.963095] lr : hci_uart_tx_wakeup+0x30/0x148
> > > [    1.963098] sp : ffffff800b6c3d10
> > > [    1.963101] x29: ffffff800b6c3d10 x28: 0000000000000000
> > > [    1.963107] x27: ffffffc074a79c78 x26: ffffffc07504d830
> > > [    1.963112] x25: ffffff8008f76a30 x24: ffffffc07510a840
> > > [    1.963117] x23: ffffffc07510aa10 x22: 0000000000000001
> > > [    1.963122] x21: ffffff8008cd2000 x20: ffffffc0351cf288
> > > [    1.963128] x19: ffffffc0351cf218 x18: 0000000000000000
> > > [    1.963132] x17: 0000000000000000 x16: 0000000000000000
> > > [    1.963137] x15: 0000000000000000 x14: ffffffc005fa1c00
> > > [    1.963142] x13: 000000406f127000 x12: 0000000034d5d91d
> > > [    1.963147] x11: 0000000000000400 x10: ffffffc077f8c480
> > > [    1.963152] x9 : 0000000000000000 x8 : 0000000000000005
> > > [    1.963157] x7 : 0000000000000000 x6 : ffffff8008f59e88
> > > [    1.963162] x5 : 0000000000000003 x4 : 000000406f127000
> > > [    1.963167] x3 : 0000000000000001 x2 : 000000406f127000
> > > [    1.963172] x1 : ffffff8008cd2d28 x0 : 0000000000000000
> > > [    1.963179] Process kworker/u17:1 (pid: 1282, stack limit =
> > > 0xffffff800b6c0000)
> > > [    1.963182] Call trace:
> > > [    1.963188]  hci_uart_tx_wakeup+0x38/0x148
> > > [    1.963193]  hci_uart_send_frame+0x28/0x38
> > > [    1.963197]  hci_send_frame+0x64/0xc0
> > > [    1.963201]  hci_cmd_work+0x98/0x110
> > > [    1.963209]  process_one_work+0x134/0x330
> > > [    1.963214]  worker_thread+0x130/0x468
> > > [    1.963220]  kthread+0xf8/0x128
> > > [    1.963227]  ret_from_fork+0x10/0x18
> > > [    1.963234] Code: 9134a2a0 97f3be03 f9402280 d538d082 (b8626801)
> > > [    1.963239] ---[ end trace 457a26b9096bec64 ]---
> > >
> > 
> > So I bisected this down and the issue looks like the boot regression is from:
> >    67d2f8781b9f ("Bluetooth: hci_ldisc: Allow sleeping while proto
> > locks are held")
> > 
> > Reverting that change makes things work again.
> 
> Hm, I notice percpu_init_rwsem() is only ever called in the hci_ldisc
> case (from hci_uart_tty_open()) but apparently never in the hci_serdev
> case.  Could that be the culprit?  Wondering why it's working in the
> rwlock case then, perhaps by luck?

As a shot in the dark, does the below patch help?

-- >8 --
diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
index 71664b2..2582bef 100644
--- a/drivers/bluetooth/hci_serdev.c
+++ b/drivers/bluetooth/hci_serdev.c
@@ -304,6 +304,8 @@ int hci_uart_register_device(struct hci_uart *hu,
 
 	INIT_WORK(&hu->write_work, hci_uart_write_work);
 
+	percpu_init_rwsem(&hu->proto_lock);
+
 	/* Only when vendor specific setup callback is provided, consider
 	 * the manufacturer information valid. This avoids filling in the
 	 * value for Ericsson when nothing is specified.

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD?
  2017-11-16 21:45     ` Lukas Wunner
@ 2017-11-16 22:29       ` John Stultz
  2017-11-16 23:54         ` [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops Lukas Wunner
  0 siblings, 1 reply; 13+ messages in thread
From: John Stultz @ 2017-11-16 22:29 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marcel Holtmann, Ronald Tschalär, Rob Herring, lkml,
	Sumit Semwal, linux-bluetooth

On Thu, Nov 16, 2017 at 1:45 PM, Lukas Wunner <lukas@wunner.de> wrote:
> On Thu, Nov 16, 2017 at 10:30:45PM +0100, Lukas Wunner wrote:
>> On Thu, Nov 16, 2017 at 12:58:27PM -0800, John Stultz wrote:
>> > On Wed, Nov 15, 2017 at 6:00 PM, John Stultz <john.stultz@linaro.org> wrote:
>> > >    After updating to Linus' HEAD today, I'm seeing the following odd
>> > > boot time crash with the HiKey board (which uses the serdev driver).
>> > >
>> > > [    1.963009] Unable to handle kernel read from unreadable memory at
>> > > virtual address 406f127000
>> > > [    1.963012] Mem abort info:
>> > > [    1.963015]   ESR = 0x96000005
>> > > [    1.963018]   Exception class = DABT (current EL), IL = 32 bits
>> > > [    1.963021]   SET = 0, FnV = 0
>> > > [    1.963023]   EA = 0, S1PTW = 0
>> > > [    1.963025] Data abort info:
>> > > [    1.963027]   ISV = 0, ISS = 0x00000005
>> > > [    1.963030]   CM = 0, WnR = 0
>> > > [    1.963032] [000000406f127000] user address but active_mm is swapper
>> > > [    1.963038] Internal error: Oops: 96000005 [#1] PREEMPT SMP
>> > > [    1.963046] CPU: 1 PID: 1282 Comm: kworker/u17:1 Not tainted
>> > > 4.14.0-07281-g1b386f4 #666
>> > > [    1.963050] Hardware name: HiKey Development Board (DT)
>> > > [    1.963068] Workqueue: hci0 hci_cmd_work
>> > > [    1.963074] task: ffffffc0753c8000 task.stack: ffffff800b6c0000
>> > > [    1.963079] pstate: 80400005 (Nzcv daif +PAN -UAO)
>> > > [    1.963090] pc : hci_uart_tx_wakeup+0x38/0x148
>> > > [    1.963095] lr : hci_uart_tx_wakeup+0x30/0x148
>> > > [    1.963098] sp : ffffff800b6c3d10
>> > > [    1.963101] x29: ffffff800b6c3d10 x28: 0000000000000000
>> > > [    1.963107] x27: ffffffc074a79c78 x26: ffffffc07504d830
>> > > [    1.963112] x25: ffffff8008f76a30 x24: ffffffc07510a840
>> > > [    1.963117] x23: ffffffc07510aa10 x22: 0000000000000001
>> > > [    1.963122] x21: ffffff8008cd2000 x20: ffffffc0351cf288
>> > > [    1.963128] x19: ffffffc0351cf218 x18: 0000000000000000
>> > > [    1.963132] x17: 0000000000000000 x16: 0000000000000000
>> > > [    1.963137] x15: 0000000000000000 x14: ffffffc005fa1c00
>> > > [    1.963142] x13: 000000406f127000 x12: 0000000034d5d91d
>> > > [    1.963147] x11: 0000000000000400 x10: ffffffc077f8c480
>> > > [    1.963152] x9 : 0000000000000000 x8 : 0000000000000005
>> > > [    1.963157] x7 : 0000000000000000 x6 : ffffff8008f59e88
>> > > [    1.963162] x5 : 0000000000000003 x4 : 000000406f127000
>> > > [    1.963167] x3 : 0000000000000001 x2 : 000000406f127000
>> > > [    1.963172] x1 : ffffff8008cd2d28 x0 : 0000000000000000
>> > > [    1.963179] Process kworker/u17:1 (pid: 1282, stack limit =
>> > > 0xffffff800b6c0000)
>> > > [    1.963182] Call trace:
>> > > [    1.963188]  hci_uart_tx_wakeup+0x38/0x148
>> > > [    1.963193]  hci_uart_send_frame+0x28/0x38
>> > > [    1.963197]  hci_send_frame+0x64/0xc0
>> > > [    1.963201]  hci_cmd_work+0x98/0x110
>> > > [    1.963209]  process_one_work+0x134/0x330
>> > > [    1.963214]  worker_thread+0x130/0x468
>> > > [    1.963220]  kthread+0xf8/0x128
>> > > [    1.963227]  ret_from_fork+0x10/0x18
>> > > [    1.963234] Code: 9134a2a0 97f3be03 f9402280 d538d082 (b8626801)
>> > > [    1.963239] ---[ end trace 457a26b9096bec64 ]---
>> > >
>> >
>> > So I bisected this down and the issue looks like the boot regression is from:
>> >    67d2f8781b9f ("Bluetooth: hci_ldisc: Allow sleeping while proto
>> > locks are held")
>> >
>> > Reverting that change makes things work again.
>>
>> Hm, I notice percpu_init_rwsem() is only ever called in the hci_ldisc
>> case (from hci_uart_tty_open()) but apparently never in the hci_serdev
>> case.  Could that be the culprit?  Wondering why it's working in the
>> rwlock case then, perhaps by luck?
>
> As a shot in the dark, does the below patch help?
>
> -- >8 --
> diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
> index 71664b2..2582bef 100644
> --- a/drivers/bluetooth/hci_serdev.c
> +++ b/drivers/bluetooth/hci_serdev.c
> @@ -304,6 +304,8 @@ int hci_uart_register_device(struct hci_uart *hu,
>
>         INIT_WORK(&hu->write_work, hci_uart_write_work);
>
> +       percpu_init_rwsem(&hu->proto_lock);
> +

Yes. This seems to avoid the issue!

thanks
-john

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2017-11-16 22:29       ` John Stultz
@ 2017-11-16 23:54         ` Lukas Wunner
  2017-11-16 23:59           ` Lukas Wunner
  2017-11-17  6:07           ` Marcel Holtmann
  0 siblings, 2 replies; 13+ messages in thread
From: Lukas Wunner @ 2017-11-16 23:54 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo, Padovan <gustavo, Johan,
	Hedberg <johan.hedberg, Rob Herring, Johan Hovold
  Cc: Ronald Tschalaer, Sumit Semwal, linux-bluetooth, linux-kernel,
	John Stultz

John Stultz reports a boot time crash with the HiKey board (which uses
hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
It acquires the proto_lock in struct hci_uart and it turns out that we
forgot to init the lock in the serdev code path, thus causing the crash.

John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
Allow sleeping while proto locks are held"), but the issue was present
before and the commit merely exposed it.  (Perhaps by luck, the crash
did not occur with rwlocks.)

Init the proto_lock in the serdev code path to avoid the oops.

Stack trace for posterity:

Unable to handle kernel read from unreadable memory at 406f127000
[000000406f127000] user address but active_mm is swapper
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Hardware name: HiKey Development Board (DT)
Call trace:
 hci_uart_tx_wakeup+0x38/0x148
 hci_uart_send_frame+0x28/0x38
 hci_send_frame+0x64/0xc0
 hci_cmd_work+0x98/0x110
 process_one_work+0x134/0x330
 worker_thread+0x130/0x468
 kthread+0xf8/0x128
 ret_from_fork+0x10/0x18

Link: https://lkml.org/lkml/2017/11/15/908
Reported-and-tested-by: John Stultz <john.stultz@linaro.org>
Cc: Ronald Tschalär <ronald@innovation.ch>
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
@Rob (and everyone else):  I'm not sure if this is in fact the correct
approach, or if we should instead duplicate hci_uart_tx_wakeup() in
hci_serdev.c (sans locking?), much as we've duplicated a lot of other
functions there.  Let me know what your preference is.  Thanks!

 drivers/bluetooth/hci_serdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
index 71664b2..e0e6461 100644
--- a/drivers/bluetooth/hci_serdev.c
+++ b/drivers/bluetooth/hci_serdev.c
@@ -303,6 +303,7 @@ int hci_uart_register_device(struct hci_uart *hu,
 	hci_set_drvdata(hdev, hu);
 
 	INIT_WORK(&hu->write_work, hci_uart_write_work);
+	percpu_init_rwsem(&hu->proto_lock);
 
 	/* Only when vendor specific setup callback is provided, consider
 	 * the manufacturer information valid. This avoids filling in the
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2017-11-16 23:54         ` [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops Lukas Wunner
@ 2017-11-16 23:59           ` Lukas Wunner
  2017-11-17  6:07           ` Marcel Holtmann
  1 sibling, 0 replies; 13+ messages in thread
From: Lukas Wunner @ 2017-11-16 23:59 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo Padovan, Johan Hedberg, Rob Herring,
	Johan Hovold
  Cc: Ronald Tschalaer, Sumit Semwal, linux-bluetooth, linux-kernel,
	John Stultz

On Fri, Nov 17, 2017 at 12:54:53AM +0100, Lukas Wunner wrote:
> John Stultz reports a boot time crash with the HiKey board (which uses
> hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
> contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
> It acquires the proto_lock in struct hci_uart and it turns out that we
> forgot to init the lock in the serdev code path, thus causing the crash.
> 
> John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
> Allow sleeping while proto locks are held"), but the issue was present
> before and the commit merely exposed it.  (Perhaps by luck, the crash
> did not occur with rwlocks.)
> 
> Init the proto_lock in the serdev code path to avoid the oops.

Apologies, I botched Gustavo Padovan's and Johan Hedberg's e-mail
address in the To: header, please fix up when replying.

Lukas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2017-11-16 23:54         ` [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops Lukas Wunner
  2017-11-16 23:59           ` Lukas Wunner
@ 2017-11-17  6:07           ` Marcel Holtmann
  2017-12-11 21:49             ` John Stultz
  1 sibling, 1 reply; 13+ messages in thread
From: Marcel Holtmann @ 2017-11-17  6:07 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Rob Herring, Johan Hovold, Ronald Tschalaer, Sumit Semwal,
	open list:BLUETOOTH DRIVERS, Linux Kernel Mailing List,
	John Stultz

Hi Lukas,

> John Stultz reports a boot time crash with the HiKey board (which uses
> hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
> contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
> It acquires the proto_lock in struct hci_uart and it turns out that we
> forgot to init the lock in the serdev code path, thus causing the crash.
> 
> John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
> Allow sleeping while proto locks are held"), but the issue was present
> before and the commit merely exposed it.  (Perhaps by luck, the crash
> did not occur with rwlocks.)
> 
> Init the proto_lock in the serdev code path to avoid the oops.
> 
> Stack trace for posterity:
> 
> Unable to handle kernel read from unreadable memory at 406f127000
> [000000406f127000] user address but active_mm is swapper
> Internal error: Oops: 96000005 [#1] PREEMPT SMP
> Hardware name: HiKey Development Board (DT)
> Call trace:
> hci_uart_tx_wakeup+0x38/0x148
> hci_uart_send_frame+0x28/0x38
> hci_send_frame+0x64/0xc0
> hci_cmd_work+0x98/0x110
> process_one_work+0x134/0x330
> worker_thread+0x130/0x468
> kthread+0xf8/0x128
> ret_from_fork+0x10/0x18
> 
> Link: https://lkml.org/lkml/2017/11/15/908
> Reported-and-tested-by: John Stultz <john.stultz@linaro.org>
> Cc: Ronald Tschalär <ronald@innovation.ch>
> Cc: Rob Herring <rob.herring@linaro.org>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
> @Rob (and everyone else):  I'm not sure if this is in fact the correct
> approach, or if we should instead duplicate hci_uart_tx_wakeup() in
> hci_serdev.c (sans locking?), much as we've duplicated a lot of other
> functions there.  Let me know what your preference is.  Thanks!
> 
> drivers/bluetooth/hci_serdev.c | 1 +
> 1 file changed, 1 insertion(+)

patch has been applied to bluetooth-next tree.

Regards

Marcel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2017-11-17  6:07           ` Marcel Holtmann
@ 2017-12-11 21:49             ` John Stultz
  2018-01-02 20:39               ` John Stultz
  0 siblings, 1 reply; 13+ messages in thread
From: John Stultz @ 2017-12-11 21:49 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Lukas Wunner, Rob Herring, Johan Hovold, Ronald Tschalaer,
	Sumit Semwal, open list:BLUETOOTH DRIVERS,
	Linux Kernel Mailing List

On Thu, Nov 16, 2017 at 10:07 PM, Marcel Holtmann <marcel@holtmann.org> wrote:
> Hi Lukas,
>
>> John Stultz reports a boot time crash with the HiKey board (which uses
>> hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
>> contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
>> It acquires the proto_lock in struct hci_uart and it turns out that we
>> forgot to init the lock in the serdev code path, thus causing the crash.
>>
>> John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
>> Allow sleeping while proto locks are held"), but the issue was present
>> before and the commit merely exposed it.  (Perhaps by luck, the crash
>> did not occur with rwlocks.)
>>
>> Init the proto_lock in the serdev code path to avoid the oops.
>>
[snip]
> patch has been applied to bluetooth-next tree.

Sorry to be a nuisance if its just a timing thing, but I wanted to
follow up on this just to make sure it didn't fall through the cracks,
as I noticed w/ -rc3 it hasn't landed yet.

thanks
-john

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2017-12-11 21:49             ` John Stultz
@ 2018-01-02 20:39               ` John Stultz
  2018-01-02 20:45                 ` Marcel Holtmann
  2018-02-01  5:50                 ` Backport d73e17281665 to v4.15 stable Lukas Wunner
  0 siblings, 2 replies; 13+ messages in thread
From: John Stultz @ 2018-01-02 20:39 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Lukas Wunner, Rob Herring, Johan Hovold, Ronald Tschalaer,
	Sumit Semwal, open list:BLUETOOTH DRIVERS,
	Linux Kernel Mailing List

On Mon, Dec 11, 2017 at 1:49 PM, John Stultz <john.stultz@linaro.org> wrote:
> On Thu, Nov 16, 2017 at 10:07 PM, Marcel Holtmann <marcel@holtmann.org> wrote:
>> Hi Lukas,
>>
>>> John Stultz reports a boot time crash with the HiKey board (which uses
>>> hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
>>> contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
>>> It acquires the proto_lock in struct hci_uart and it turns out that we
>>> forgot to init the lock in the serdev code path, thus causing the crash.
>>>
>>> John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
>>> Allow sleeping while proto locks are held"), but the issue was present
>>> before and the commit merely exposed it.  (Perhaps by luck, the crash
>>> did not occur with rwlocks.)
>>>
>>> Init the proto_lock in the serdev code path to avoid the oops.
>>>
> [snip]
>> patch has been applied to bluetooth-next tree.
>
> Sorry to be a nuisance if its just a timing thing, but I wanted to
> follow up on this just to make sure it didn't fall through the cracks,
> as I noticed w/ -rc3 it hasn't landed yet.

Happy new year all,
  Just wanted to send another ping on this as it seems it hasn't made
it into -rc6. Did this get missed as being tagged as a (at least
functional) regression fix?

thanks
-john

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
  2018-01-02 20:39               ` John Stultz
@ 2018-01-02 20:45                 ` Marcel Holtmann
  2018-02-01  5:50                 ` Backport d73e17281665 to v4.15 stable Lukas Wunner
  1 sibling, 0 replies; 13+ messages in thread
From: Marcel Holtmann @ 2018-01-02 20:45 UTC (permalink / raw)
  To: John Stultz
  Cc: Lukas Wunner, Rob Herring, Johan Hovold, Ronald Tschalaer,
	Sumit Semwal, open list:BLUETOOTH DRIVERS,
	Linux Kernel Mailing List

Hi John,

>>>> John Stultz reports a boot time crash with the HiKey board (which uses
>>>> hci_serdev) occurring in hci_uart_tx_wakeup().  That function is
>>>> contained in hci_ldisc.c, but also called from the newer hci_serdev.c.
>>>> It acquires the proto_lock in struct hci_uart and it turns out that we
>>>> forgot to init the lock in the serdev code path, thus causing the crash.
>>>> 
>>>> John bisected the crash to commit 67d2f8781b9f ("Bluetooth: hci_ldisc:
>>>> Allow sleeping while proto locks are held"), but the issue was present
>>>> before and the commit merely exposed it.  (Perhaps by luck, the crash
>>>> did not occur with rwlocks.)
>>>> 
>>>> Init the proto_lock in the serdev code path to avoid the oops.
>>>> 
>> [snip]
>>> patch has been applied to bluetooth-next tree.
>> 
>> Sorry to be a nuisance if its just a timing thing, but I wanted to
>> follow up on this just to make sure it didn't fall through the cracks,
>> as I noticed w/ -rc3 it hasn't landed yet.
> 
> Happy new year all,
>  Just wanted to send another ping on this as it seems it hasn't made
> it into -rc6. Did this get missed as being tagged as a (at least
> functional) regression fix?

since it wasn’t marked as urgent fix, it never went on the path to -rc. It is on the path for the next release. And then can be included in -stable if needed.

Regards

Marcel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Backport d73e17281665 to v4.15 stable
  2018-01-02 20:39               ` John Stultz
  2018-01-02 20:45                 ` Marcel Holtmann
@ 2018-02-01  5:50                 ` Lukas Wunner
  2018-02-01  6:42                   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 13+ messages in thread
From: Lukas Wunner @ 2018-02-01  5:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: John Stultz, Marcel Holtmann, Rob Herring, Johan Hovold,
	Ronald Tschalaer, Sumit Semwal, linux-bluetooth, linux-kernel

Dear Greg,

when you refill the patch queue for v4.15 stable one of these days,
please consider adding

    commit d73e172816652772114827abaa2dbc053eecbbd7
    Author: Lukas Wunner <lukas@wunner.de>
    Date:   Fri Nov 17 00:54:53 2017 +0100
    
    Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops

to fix a boot crash on certain machines with UART-attached Bluetooth
which was kindly reported by John Stultz.  The patch unfortunately
lacked a Fixes: or Cc: stable tag.  It landed in Linus' tree 7 hours ago.

Thanks!

Lukas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Backport d73e17281665 to v4.15 stable
  2018-02-01  5:50                 ` Backport d73e17281665 to v4.15 stable Lukas Wunner
@ 2018-02-01  6:42                   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 13+ messages in thread
From: Greg Kroah-Hartman @ 2018-02-01  6:42 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: John Stultz, Marcel Holtmann, Rob Herring, Johan Hovold,
	Ronald Tschalaer, Sumit Semwal, linux-bluetooth, linux-kernel

On Thu, Feb 01, 2018 at 06:50:52AM +0100, Lukas Wunner wrote:
> Dear Greg,
> 
> when you refill the patch queue for v4.15 stable one of these days,
> please consider adding
> 
>     commit d73e172816652772114827abaa2dbc053eecbbd7
>     Author: Lukas Wunner <lukas@wunner.de>
>     Date:   Fri Nov 17 00:54:53 2017 +0100
>     
>     Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops
> 
> to fix a boot crash on certain machines with UART-attached Bluetooth
> which was kindly reported by John Stultz.  The patch unfortunately
> lacked a Fixes: or Cc: stable tag.  It landed in Linus' tree 7 hours ago.

Now queued up, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-02-01  6:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-16  2:00 Boot crash @hci_uart_tx_wakeup+0x38/0x148 w/ Linus' HEAD? John Stultz
2017-11-16 20:58 ` John Stultz
2017-11-16 21:30   ` Lukas Wunner
2017-11-16 21:45     ` Lukas Wunner
2017-11-16 22:29       ` John Stultz
2017-11-16 23:54         ` [PATCH] Bluetooth: hci_serdev: Init hci_uart proto_lock to avoid oops Lukas Wunner
2017-11-16 23:59           ` Lukas Wunner
2017-11-17  6:07           ` Marcel Holtmann
2017-12-11 21:49             ` John Stultz
2018-01-02 20:39               ` John Stultz
2018-01-02 20:45                 ` Marcel Holtmann
2018-02-01  5:50                 ` Backport d73e17281665 to v4.15 stable Lukas Wunner
2018-02-01  6:42                   ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).