* "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock @ 2019-12-06 3:24 Marek Marczykowski-Górecki 2019-12-06 6:50 ` Suwan Kim 0 siblings, 1 reply; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-06 3:24 UTC (permalink / raw) To: linux-usb; +Cc: Suwan Kim, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 5097 bytes --] Hello, I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I connect Yubikey 4 and use its CCID interface (for example `ykman oath list` command), the client side hangs (100% reliably). After 60s I get a message that a CPU hangs waiting for a spinlock (see below). I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support to vhci-hcd and stub driver") commit. Which indeed is also backported to 4.19. Any idea what is going on here? I can easily provide more information, if you tell me how to get it. The kernel log: [ 6452.701016] usb 1-1: New USB device found, idVendor=1050, idProduct=0407, bcdDevice= 4.27 [ 6452.701049] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 6452.701075] usb 1-1: Product: Yubikey 4 OTP+U2F+CCID [ 6452.701092] usb 1-1: Manufacturer: Yubico [ 6452.711566] input: Yubico Yubikey 4 OTP+U2F+CCID as /devices/platform/vhci_hcd.0/usb1/1-1/1-1:1.0/0003:1050:0407.0001/input/input1 [ 6452.762251] hid-generic 0003:1050:0407.0001: input,hidraw0: USB HID v1.10 Keyboard [Yubico Yubikey 4 OTP+U2F+CCID] on usb-vhci_hcd.0-1/input0 [ 6452.770270] hid-generic 0003:1050:0407.0002: hiddev96,hidraw1: USB HID v1.10 Device [Yubico Yubikey 4 OTP+U2F+CCID] on usb-vhci_hcd.0-1/input1 [ 6530.805002] vhci_hcd: unlink->seqnum 111 [ 6530.805024] vhci_hcd: urb->status -104 [ 6531.386607] usb 1-1: recv xbuf, 42 [ 6531.386701] vhci_hcd: stop threads [ 6531.386718] vhci_hcd: release socket [ 6531.386734] vhci_hcd: disconnect device [ 6531.386800] usb 1-1: USB disconnect, device number 2 [ 6591.409099] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 6591.409130] rcu: 1-...!: (0 ticks this GP) idle=53a/1/0x4000000000000000 softirq=18978/18978 fqs=0 [ 6591.409158] rcu: (detected by 0, t=60002 jiffies, g=17933, q=893) [ 6591.409181] Sending NMI from CPU 0 to CPUs 1: [ 6591.410415] NMI backtrace for cpu 1 [ 6591.410416] CPU: 1 PID: 338 Comm: kworker/1:2 Tainted: G O 4.19.84-1.pvops.qubes.x86_64 #1 [ 6591.410417] Workqueue: usb_hub_wq hub_event [ 6591.410417] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40 [ 6591.410418] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 9c 58 0f 1f 44 00 00 48 89 c3 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00 f0 0f b1 17 <85> c0 75 05 48 89 d8 5b c3 89 c6 e8 3e 1c 7e ff 66 90 48 89 d8 5b [ 6591.410419] RSP: 0000:ffffc900009dfaf8 EFLAGS: 00000046 [ 6591.410419] RAX: 0000000000000000 RBX: 0000000000000082 RCX: 0000000000000000 [ 6591.410420] RDX: 0000000000000001 RSI: ffff88801e633e40 RDI: ffff888006740be0 [ 6591.410420] RBP: ffff888006740be0 R08: 0000000000000000 R09: ffffffff81346f00 [ 6591.410420] R10: ffff888014f5b5d8 R11: 0000000000000000 R12: 00000000ffffff94 [ 6591.410421] R13: ffff88803ff06000 R14: ffff88803ff06000 R15: ffff88801e633e40 [ 6591.410421] FS: 0000000000000000(0000) GS:ffff8880f5b00000(0000) knlGS:0000000000000000 [ 6591.410421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6591.410422] CR2: 00005eede8b122a8 CR3: 000000000220a002 CR4: 00000000003606e0 [ 6591.410422] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6591.410423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6591.410423] Call Trace: [ 6591.410423] vhci_urb_dequeue+0x2b/0x280 [vhci_hcd] [ 6591.410423] usb_hcd_flush_endpoint+0x119/0x190 [ 6591.410424] usb_disable_endpoint+0x7b/0xa0 [ 6591.410424] usb_disable_interface+0x3e/0x50 [ 6591.410424] usb_unbind_interface+0x117/0x250 [ 6591.410425] device_release_driver_internal+0x17d/0x240 [ 6591.410425] bus_remove_device+0xe5/0x150 [ 6591.410425] device_del+0x161/0x360 [ 6591.410426] ? usb_remove_ep_devs+0x1b/0x30 [ 6591.410426] usb_disable_device+0x93/0x240 [ 6591.410426] usb_disconnect+0x90/0x270 [ 6591.410427] hub_port_connect+0x83/0xab0 [ 6591.410427] hub_event+0x8d1/0xab0 [ 6591.410427] process_one_work+0x191/0x370 [ 6591.410428] worker_thread+0x4f/0x3b0 [ 6591.410428] kthread+0xf8/0x130 [ 6591.410428] ? rescuer_thread+0x340/0x340 [ 6591.410428] ? kthread_create_worker_on_cpu+0x70/0x70 [ 6591.410429] ret_from_fork+0x35/0x40 [ 6591.410432] rcu: rcu_sched kthread starved for 60002 jiffies! g17933 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 [ 6591.411962] rcu: RCU grace-period kthread stack dump: [ 6591.411980] rcu_sched I 0 10 2 0x80000000 [ 6591.411998] Call Trace: [ 6591.412015] ? __schedule+0x3f5/0x870 [ 6591.412030] schedule+0x32/0x80 [ 6591.412044] schedule_timeout+0x16f/0x350 [ 6591.412059] ? __next_timer_interrupt+0xc0/0xc0 [ 6591.412077] rcu_gp_kthread+0x569/0x950 [ 6591.412092] kthread+0xf8/0x130 [ 6591.412106] ? rcu_nocb_kthread+0x560/0x560 [ 6591.412119] ? kthread_create_worker_on_cpu+0x70/0x70 [ 6591.412136] ret_from_fork+0x35/0x40 Some more details are available here: https://github.com/QubesOS/qubes-issues/issues/5498 -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-06 3:24 "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock Marek Marczykowski-Górecki @ 2019-12-06 6:50 ` Suwan Kim 2019-12-06 20:57 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-06 6:50 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > Hello, > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > connect Yubikey 4 and use its CCID interface (for example `ykman oath > list` command), the client side hangs (100% reliably). After 60s I get a > message that a CPU hangs waiting for a spinlock (see below). > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > to vhci-hcd and stub driver") commit. Which indeed is also backported to > 4.19. > > Any idea what is going on here? I can easily provide more information, > if you tell me how to get it. > Hi, Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG in kernel config and send dmesg log? It will be helpful to figure out lock dependency in vhci_hcd. Regards, Suwan Kim > The kernel log: > [ 6452.701016] usb 1-1: New USB device found, idVendor=1050, idProduct=0407, bcdDevice= 4.27 > [ 6452.701049] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0 > [ 6452.701075] usb 1-1: Product: Yubikey 4 OTP+U2F+CCID > [ 6452.701092] usb 1-1: Manufacturer: Yubico > [ 6452.711566] input: Yubico Yubikey 4 OTP+U2F+CCID as /devices/platform/vhci_hcd.0/usb1/1-1/1-1:1.0/0003:1050:0407.0001/input/input1 > [ 6452.762251] hid-generic 0003:1050:0407.0001: input,hidraw0: USB HID v1.10 Keyboard [Yubico Yubikey 4 OTP+U2F+CCID] on usb-vhci_hcd.0-1/input0 > [ 6452.770270] hid-generic 0003:1050:0407.0002: hiddev96,hidraw1: USB HID v1.10 Device [Yubico Yubikey 4 OTP+U2F+CCID] on usb-vhci_hcd.0-1/input1 > [ 6530.805002] vhci_hcd: unlink->seqnum 111 > [ 6530.805024] vhci_hcd: urb->status -104 > [ 6531.386607] usb 1-1: recv xbuf, 42 > [ 6531.386701] vhci_hcd: stop threads > [ 6531.386718] vhci_hcd: release socket > [ 6531.386734] vhci_hcd: disconnect device > [ 6531.386800] usb 1-1: USB disconnect, device number 2 > [ 6591.409099] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > [ 6591.409130] rcu: 1-...!: (0 ticks this GP) idle=53a/1/0x4000000000000000 softirq=18978/18978 fqs=0 > [ 6591.409158] rcu: (detected by 0, t=60002 jiffies, g=17933, q=893) > [ 6591.409181] Sending NMI from CPU 0 to CPUs 1: > [ 6591.410415] NMI backtrace for cpu 1 > [ 6591.410416] CPU: 1 PID: 338 Comm: kworker/1:2 Tainted: G O 4.19.84-1.pvops.qubes.x86_64 #1 > [ 6591.410417] Workqueue: usb_hub_wq hub_event > [ 6591.410417] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40 > [ 6591.410418] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 9c 58 0f 1f 44 00 00 48 89 c3 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00 f0 0f b1 17 <85> c0 75 05 48 89 d8 5b c3 89 c6 e8 3e 1c 7e ff 66 90 48 89 d8 5b > [ 6591.410419] RSP: 0000:ffffc900009dfaf8 EFLAGS: 00000046 > [ 6591.410419] RAX: 0000000000000000 RBX: 0000000000000082 RCX: 0000000000000000 > [ 6591.410420] RDX: 0000000000000001 RSI: ffff88801e633e40 RDI: ffff888006740be0 > [ 6591.410420] RBP: ffff888006740be0 R08: 0000000000000000 R09: ffffffff81346f00 > [ 6591.410420] R10: ffff888014f5b5d8 R11: 0000000000000000 R12: 00000000ffffff94 > [ 6591.410421] R13: ffff88803ff06000 R14: ffff88803ff06000 R15: ffff88801e633e40 > [ 6591.410421] FS: 0000000000000000(0000) GS:ffff8880f5b00000(0000) knlGS:0000000000000000 > [ 6591.410421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 6591.410422] CR2: 00005eede8b122a8 CR3: 000000000220a002 CR4: 00000000003606e0 > [ 6591.410422] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 6591.410423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 6591.410423] Call Trace: > [ 6591.410423] vhci_urb_dequeue+0x2b/0x280 [vhci_hcd] > [ 6591.410423] usb_hcd_flush_endpoint+0x119/0x190 > [ 6591.410424] usb_disable_endpoint+0x7b/0xa0 > [ 6591.410424] usb_disable_interface+0x3e/0x50 > [ 6591.410424] usb_unbind_interface+0x117/0x250 > [ 6591.410425] device_release_driver_internal+0x17d/0x240 > [ 6591.410425] bus_remove_device+0xe5/0x150 > [ 6591.410425] device_del+0x161/0x360 > [ 6591.410426] ? usb_remove_ep_devs+0x1b/0x30 > [ 6591.410426] usb_disable_device+0x93/0x240 > [ 6591.410426] usb_disconnect+0x90/0x270 > [ 6591.410427] hub_port_connect+0x83/0xab0 > [ 6591.410427] hub_event+0x8d1/0xab0 > [ 6591.410427] process_one_work+0x191/0x370 > [ 6591.410428] worker_thread+0x4f/0x3b0 > [ 6591.410428] kthread+0xf8/0x130 > [ 6591.410428] ? rescuer_thread+0x340/0x340 > [ 6591.410428] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 6591.410429] ret_from_fork+0x35/0x40 > [ 6591.410432] rcu: rcu_sched kthread starved for 60002 jiffies! g17933 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 > [ 6591.411962] rcu: RCU grace-period kthread stack dump: > [ 6591.411980] rcu_sched I 0 10 2 0x80000000 > [ 6591.411998] Call Trace: > [ 6591.412015] ? __schedule+0x3f5/0x870 > [ 6591.412030] schedule+0x32/0x80 > [ 6591.412044] schedule_timeout+0x16f/0x350 > [ 6591.412059] ? __next_timer_interrupt+0xc0/0xc0 > [ 6591.412077] rcu_gp_kthread+0x569/0x950 > [ 6591.412092] kthread+0xf8/0x130 > [ 6591.412106] ? rcu_nocb_kthread+0x560/0x560 > [ 6591.412119] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 6591.412136] ret_from_fork+0x35/0x40 > > Some more details are available here: > https://github.com/QubesOS/qubes-issues/issues/5498 > > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-06 6:50 ` Suwan Kim @ 2019-12-06 20:57 ` Marek Marczykowski-Górecki 2019-12-06 21:12 ` Shuah Khan 2019-12-09 2:01 ` Suwan Kim 0 siblings, 2 replies; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-06 20:57 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 4201 bytes --] On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > Hello, > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > list` command), the client side hangs (100% reliably). After 60s I get a > > message that a CPU hangs waiting for a spinlock (see below). > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > 4.19. > > > > Any idea what is going on here? I can easily provide more information, > > if you tell me how to get it. > > > > Hi, > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > in kernel config and send dmesg log? It will be helpful to figure > out lock dependency in vhci_hcd. Hmm, I've tried, but I don't see much more information there (see below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do anything more, like some boot option? Also, this one (as the previous one) is from 4.19.84. Interestingly, on 4.19.87 I don't get the message at all. Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG support to vhci-hcd and stub driver" reverted and it still hangs... I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). [ 212.890519] usb 1-1: recv xbuf, 42 [ 212.891177] vhci_hcd: vhci_shutdown_connection:1024: stop threads [ 212.891228] vhci_hcd: vhci_shutdown_connection:1032: release socket [ 212.891388] vhci_hcd: vhci_shutdown_connection:1058: disconnect device [ 212.891637] usb 1-1: USB disconnect, device number 2 [ 277.967398] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 277.967456] rcu: 1-...0: (1 GPs behind) idle=d66/1/0x4000000000000000 softirq=32543/32544 fqs=15856 [ 277.967544] rcu: (detected by 2, t=65009 jiffies, g=65285, q=5921) [ 277.967560] Sending NMI from CPU 2 to CPUs 1: [ 277.968606] NMI backtrace for cpu 1 [ 277.968607] CPU: 1 PID: 450 Comm: kworker/1:3 Tainted: G O 4.19.84-1.pvops.qubes.x86_64 #2 [ 277.968608] Workqueue: usb_hub_wq hub_event [ 277.968608] RIP: 0010:_raw_spin_lock+0x2c/0x40 [ 277.968609] Code: 44 00 00 55 48 89 fd 65 ff 05 40 b8 60 7c ff 74 24 08 48 8d 7f 18 45 31 c9 31 c9 41 b8 01 00 00 00 31 d2 31 f6 e8 64 f0 71 ff <48> 89 ef 58 5d e9 1a 3a 72 ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f [ 277.968610] RSP: 0018:ffffb29f00bbbb90 EFLAGS: 00000082 [ 277.968610] RAX: ffff8fa938768000 RBX: ffff8fa981fe3950 RCX: a50d71adbc2b783b [ 277.968611] RDX: 00000000c093239f RSI: 00000000158aa627 RDI: 0000000000000046 [ 277.968611] RBP: ffffffff8450c860 R08: 0000000000000001 R09: 00000000000c0490 [ 277.968612] R10: 0000000000000000 R11: a50d71adbc2b783b R12: ffff8fa981fe3968 [ 277.968612] R13: ffff8fa91728e000 R14: ffffffff8434e626 R15: 0000000000040200 [ 277.968613] FS: 0000000000000000(0000) GS:ffff8fa9f5a80000(0000) knlGS:0000000000000000 [ 277.968613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 277.968614] CR2: 00006335daf937b8 CR3: 00000000b0410002 CR4: 00000000001606e0 [ 277.968614] Call Trace: [ 277.968614] ? usb_hcd_flush_endpoint+0x122/0x170 [ 277.968615] usb_hcd_flush_endpoint+0x122/0x170 [ 277.968615] usb_disable_interface+0x3c/0x50 [ 277.968616] usb_unbind_interface+0x181/0x260 [ 277.968616] device_release_driver_internal+0x18b/0x250 [ 277.968617] bus_remove_device+0xfc/0x170 [ 277.968617] device_del+0x165/0x380 [ 277.968617] usb_disable_device+0x93/0x240 [ 277.968618] usb_disconnect+0xc1/0x2c0 [ 277.968618] hub_event+0xcc4/0x1620 [ 277.968618] process_one_work+0x221/0x580 [ 277.968619] worker_thread+0x50/0x3b0 [ 277.968619] ? process_one_work+0x580/0x580 [ 277.968620] kthread+0x122/0x140 [ 277.968620] ? kthread_create_worker_on_cpu+0x70/0x70 [ 277.968620] ret_from_fork+0x3a/0x50 -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-06 20:57 ` Marek Marczykowski-Górecki @ 2019-12-06 21:12 ` Shuah Khan 2019-12-07 0:58 ` Marek Marczykowski-Górecki 2019-12-09 2:01 ` Suwan Kim 1 sibling, 1 reply; 17+ messages in thread From: Shuah Khan @ 2019-12-06 21:12 UTC (permalink / raw) To: Marek Marczykowski-Górecki, Suwan Kim Cc: linux-usb, Valentina Manea, Shuah Khan On 12/6/19 1:57 PM, Marek Marczykowski-Górecki wrote: > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: >> On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: >>> Hello, >>> >>> I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I >>> connect Yubikey 4 and use its CCID interface (for example `ykman oath >>> list` command), the client side hangs (100% reliably). After 60s I get a >>> message that a CPU hangs waiting for a spinlock (see below). >>> >>> I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support >>> to vhci-hcd and stub driver") commit. Which indeed is also backported to >>> 4.19. >>> >>> Any idea what is going on here? I can easily provide more information, >>> if you tell me how to get it. >>> >> >> Hi, >> >> Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG >> in kernel config and send dmesg log? It will be helpful to figure >> out lock dependency in vhci_hcd. > > Hmm, I've tried, but I don't see much more information there (see > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > anything more, like some boot option? > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > 4.19.87 I don't get the message at all. > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > support to vhci-hcd and stub driver" reverted and it still hangs... > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). Does 4.19.82 work? thanks, -- Shuah ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-06 21:12 ` Shuah Khan @ 2019-12-07 0:58 ` Marek Marczykowski-Górecki 2019-12-07 18:45 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-07 0:58 UTC (permalink / raw) To: Shuah Khan; +Cc: Suwan Kim, linux-usb, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 1982 bytes --] On Fri, Dec 06, 2019 at 02:12:08PM -0700, Shuah Khan wrote: > On 12/6/19 1:57 PM, Marek Marczykowski-Górecki wrote: > > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > > > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > > > Hello, > > > > > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > > > list` command), the client side hangs (100% reliably). After 60s I get a > > > > message that a CPU hangs waiting for a spinlock (see below). > > > > > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > > > 4.19. > > > > > > > > Any idea what is going on here? I can easily provide more information, > > > > if you tell me how to get it. > > > > > > > > > > Hi, > > > > > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > > > in kernel config and send dmesg log? It will be helpful to figure > > > out lock dependency in vhci_hcd. > > > > Hmm, I've tried, but I don't see much more information there (see > > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > > anything more, like some boot option? > > > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > > 4.19.87 I don't get the message at all. > > > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > > support to vhci-hcd and stub driver" reverted and it still hangs... > > > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). > > Does 4.19.82 work? Yes, it does, or at least something after 4.19.82 works (9da271c1c). -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-07 0:58 ` Marek Marczykowski-Górecki @ 2019-12-07 18:45 ` Marek Marczykowski-Górecki 0 siblings, 0 replies; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-07 18:45 UTC (permalink / raw) To: Shuah Khan; +Cc: Suwan Kim, linux-usb, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 4274 bytes --] On Sat, Dec 07, 2019 at 01:58:09AM +0100, Marek Marczykowski-Górecki wrote: > On Fri, Dec 06, 2019 at 02:12:08PM -0700, Shuah Khan wrote: > > On 12/6/19 1:57 PM, Marek Marczykowski-Górecki wrote: > > > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > > > > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > > > > Hello, > > > > > > > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > > > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > > > > list` command), the client side hangs (100% reliably). After 60s I get a > > > > > message that a CPU hangs waiting for a spinlock (see below). > > > > > > > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > > > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > > > > 4.19. > > > > > > > > > > Any idea what is going on here? I can easily provide more information, > > > > > if you tell me how to get it. > > > > > > > > > > > > > Hi, > > > > > > > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > > > > in kernel config and send dmesg log? It will be helpful to figure > > > > out lock dependency in vhci_hcd. > > > > > > Hmm, I've tried, but I don't see much more information there (see > > > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > > > anything more, like some boot option? > > > > > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > > > 4.19.87 I don't get the message at all. > > > > > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > > > support to vhci-hcd and stub driver" reverted and it still hangs... > > > > > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). > > > > Does 4.19.82 work? > > Yes, it does, or at least something after 4.19.82 works (9da271c1c). I've done bisect again and again got into the same commit. Most of the time I didn't get any message about the deadlock, it simply hanged. Full bisect log: git bisect start # good: [ef244c3088856cf048c77231653b4c92a7b2213c] Linux 4.19.81 git bisect good ef244c3088856cf048c77231653b4c92a7b2213c # bad: [c555efaf14026c7751fa68d87403a5eb5ae7dcaf] Linux 4.19.84 git bisect bad c555efaf14026c7751fa68d87403a5eb5ae7dcaf # good: [9da271c1cdc14839b694e23889a653c1ed0b5f8f] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq() git bisect good 9da271c1cdc14839b694e23889a653c1ed0b5f8f # bad: [8181146cd7de890cdfdda68ddc3730250887d7fc] PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30 git bisect bad 8181146cd7de890cdfdda68ddc3730250887d7fc # good: [8e6bf4bc3a88e4b84e5c4ec50143a71a61503336] mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges git bisect good 8e6bf4bc3a88e4b84e5c4ec50143a71a61503336 # good: [0327c7818da27b018464c0b9f541c5d276c57172] can: flexcan: disable completely the ECC mechanism git bisect good 0327c7818da27b018464c0b9f541c5d276c57172 # good: [5e36cf8edb5812e378b57511263d1a0a9172eeb9] configfs: fix a deadlock in configfs_symlink() git bisect good 5e36cf8edb5812e378b57511263d1a0a9172eeb9 # good: [4f6c5200269998f0066e1bd4db931297b7b2b906] ALSA: usb-audio: Fix possible NULL dereference at create_yamaha_midi_quirk() git bisect good 4f6c5200269998f0066e1bd4db931297b7b2b906 # good: [502bd151448c2c76a927b26783e5538875c534ff] sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices git bisect good 502bd151448c2c76a927b26783e5538875c534ff # good: [f865ae473c16fb2b8b8601fa04f4f6517ad557b3] usbip: Fix vhci_urb_enqueue() URB null transfer buffer error path git bisect good f865ae473c16fb2b8b8601fa04f4f6517ad557b3 # bad: [e2dd254bde5cdac24e7774584d6f3c2c61fe09e5] usbip: Implement SG support to vhci-hcd and stub driver git bisect bad e2dd254bde5cdac24e7774584d6f3c2c61fe09e5 # first bad commit: [e2dd254bde5cdac24e7774584d6f3c2c61fe09e5] usbip: Implement SG support to vhci-hcd and stub driver -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-06 20:57 ` Marek Marczykowski-Górecki 2019-12-06 21:12 ` Shuah Khan @ 2019-12-09 2:01 ` Suwan Kim 2019-12-09 3:37 ` Marek Marczykowski-Górecki 1 sibling, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-09 2:01 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > > Hello, > > > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > > list` command), the client side hangs (100% reliably). After 60s I get a > > > message that a CPU hangs waiting for a spinlock (see below). > > > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > > 4.19. > > > > > > Any idea what is going on here? I can easily provide more information, > > > if you tell me how to get it. > > > > > > > Hi, > > > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > > in kernel config and send dmesg log? It will be helpful to figure > > out lock dependency in vhci_hcd. > > Hmm, I've tried, but I don't see much more information there (see > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > anything more, like some boot option? > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > 4.19.87 I don't get the message at all. > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > support to vhci-hcd and stub driver" reverted and it still hangs... If so, deadlock is caused by other causes, and why is it different from the results of bisect? > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). > > [ 212.890519] usb 1-1: recv xbuf, 42 This message is printed by receive error and before that, driver canceled URB transmission. we need to know the exact situation before this message. Could you send me a longer log messages showing the situation before "[ 212.890519] usb 1-1: recv xbuf, 42"? And please also send the result of "lsusb -v". Regards, Suwan Kim > [ 212.891177] vhci_hcd: vhci_shutdown_connection:1024: stop threads > [ 212.891228] vhci_hcd: vhci_shutdown_connection:1032: release socket > [ 212.891388] vhci_hcd: vhci_shutdown_connection:1058: disconnect device > [ 212.891637] usb 1-1: USB disconnect, device number 2 > [ 277.967398] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > [ 277.967456] rcu: 1-...0: (1 GPs behind) idle=d66/1/0x4000000000000000 softirq=32543/32544 fqs=15856 > [ 277.967544] rcu: (detected by 2, t=65009 jiffies, g=65285, q=5921) > [ 277.967560] Sending NMI from CPU 2 to CPUs 1: > [ 277.968606] NMI backtrace for cpu 1 > [ 277.968607] CPU: 1 PID: 450 Comm: kworker/1:3 Tainted: G O 4.19.84-1.pvops.qubes.x86_64 #2 > [ 277.968608] Workqueue: usb_hub_wq hub_event > [ 277.968608] RIP: 0010:_raw_spin_lock+0x2c/0x40 > [ 277.968609] Code: 44 00 00 55 48 89 fd 65 ff 05 40 b8 60 7c ff 74 24 08 48 8d 7f 18 45 31 c9 31 c9 41 b8 01 00 00 00 31 d2 31 f6 e8 64 f0 71 ff <48> 89 ef 58 5d e9 1a 3a 72 ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > [ 277.968610] RSP: 0018:ffffb29f00bbbb90 EFLAGS: 00000082 > [ 277.968610] RAX: ffff8fa938768000 RBX: ffff8fa981fe3950 RCX: a50d71adbc2b783b > [ 277.968611] RDX: 00000000c093239f RSI: 00000000158aa627 RDI: 0000000000000046 > [ 277.968611] RBP: ffffffff8450c860 R08: 0000000000000001 R09: 00000000000c0490 > [ 277.968612] R10: 0000000000000000 R11: a50d71adbc2b783b R12: ffff8fa981fe3968 > [ 277.968612] R13: ffff8fa91728e000 R14: ffffffff8434e626 R15: 0000000000040200 > [ 277.968613] FS: 0000000000000000(0000) GS:ffff8fa9f5a80000(0000) knlGS:0000000000000000 > [ 277.968613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 277.968614] CR2: 00006335daf937b8 CR3: 00000000b0410002 CR4: 00000000001606e0 > [ 277.968614] Call Trace: > [ 277.968614] ? usb_hcd_flush_endpoint+0x122/0x170 > [ 277.968615] usb_hcd_flush_endpoint+0x122/0x170 > [ 277.968615] usb_disable_interface+0x3c/0x50 > [ 277.968616] usb_unbind_interface+0x181/0x260 > [ 277.968616] device_release_driver_internal+0x18b/0x250 > [ 277.968617] bus_remove_device+0xfc/0x170 > [ 277.968617] device_del+0x165/0x380 > [ 277.968617] usb_disable_device+0x93/0x240 > [ 277.968618] usb_disconnect+0xc1/0x2c0 > [ 277.968618] hub_event+0xcc4/0x1620 > [ 277.968618] process_one_work+0x221/0x580 > [ 277.968619] worker_thread+0x50/0x3b0 > [ 277.968619] ? process_one_work+0x580/0x580 > [ 277.968620] kthread+0x122/0x140 > [ 277.968620] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 277.968620] ret_from_fork+0x3a/0x50 > > > > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-09 2:01 ` Suwan Kim @ 2019-12-09 3:37 ` Marek Marczykowski-Górecki 2019-12-09 6:35 ` Suwan Kim 0 siblings, 1 reply; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-09 3:37 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 15813 bytes --] On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > > > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > > > Hello, > > > > > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > > > list` command), the client side hangs (100% reliably). After 60s I get a > > > > message that a CPU hangs waiting for a spinlock (see below). > > > > > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > > > 4.19. > > > > > > > > Any idea what is going on here? I can easily provide more information, > > > > if you tell me how to get it. > > > > > > > > > > Hi, > > > > > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > > > in kernel config and send dmesg log? It will be helpful to figure > > > out lock dependency in vhci_hcd. > > > > Hmm, I've tried, but I don't see much more information there (see > > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > > anything more, like some boot option? > > > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > > 4.19.87 I don't get the message at all. > > > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > > support to vhci-hcd and stub driver" reverted and it still hangs... > > If so, deadlock is caused by other causes, and why is it different > from the results of bisect? No idea, but as you've seen in the other email, another bisect returned exactly the same commit. > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > This message is printed by receive error and before that, driver > canceled URB transmission. we need to know the exact situation > before this message. I've added some more messages and found recv_size is 0. > Could you send me a longer log messages showing the situation > before "[ 212.890519] usb 1-1: recv xbuf, 42"? Sure, with added extra messages (debug patch below). [ 131.397522] usb 1-1: num_sgs 0 [ 131.406588] usb 1-1: num_sgs 0 [ 131.410621] usb 1-1: num_sgs 0 [ 131.411950] usb 1-1: num_sgs 0 [ 131.413186] usb 1-1: num_sgs 0 [ 131.414590] usb 1-1: num_sgs 0 [ 131.417086] usb 1-1: num_sgs 0 [ 131.418188] usb 1-1: num_sgs 0 [ 131.419228] usb 1-1: num_sgs 0 [ 131.420248] usb 1-1: num_sgs 0 [ 131.457315] usb 1-1: num_sgs 5 [ 131.457345] usb 1-1: size 42, copy 42 recv 42, recv_size 42, sg->length 16384 [ 131.457359] usb 1-1: size 42, copy 0 recv -22, recv_size 0, sg->length 16384 [ 131.457372] usb 1-1: recv xbuf, 42 size 42 [ 131.458263] vhci_hcd: vhci_shutdown_connection:1024: stop threads [ 131.458318] vhci_hcd: vhci_shutdown_connection:1032: release socket [ 131.458431] vhci_hcd: vhci_shutdown_connection:1058: disconnect device [ 131.460171] usb 1-1: USB disconnect, device number 2 (...) If I add "if (!recv_size) continue;" there, it works! > And please also send the result of "lsusb -v". Bus 003 Device 006: ID 1050:0407 Yubico.com Yubikey 4 OTP+U2F+CCID Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1050 Yubico.com idProduct 0x0407 Yubikey 4 OTP+U2F+CCID bcdDevice 4.34 iManufacturer 1 Yubico iProduct 2 Yubikey 4 OTP+U2F+CCID iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0096 bNumInterfaces 3 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 30mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 1 Boot Interface Subclass bInterfaceProtocol 1 Keyboard iInterface 0 HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.10 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 71 Report Descriptor: (length is 71) Item(Global): Usage Page, data= [ 0x01 ] 1 Generic Desktop Controls Item(Local ): Usage, data= [ 0x06 ] 6 Keyboard Item(Main ): Collection, data= [ 0x01 ] 1 Application Item(Global): Usage Page, data= [ 0x07 ] 7 Keyboard Item(Local ): Usage Minimum, data= [ 0xe0 ] 224 Control Left Item(Local ): Usage Maximum, data= [ 0xe7 ] 231 GUI Right Item(Global): Logical Minimum, data= [ 0x00 ] 0 Item(Global): Logical Maximum, data= [ 0x01 ] 1 Item(Global): Report Size, data= [ 0x01 ] 1 Item(Global): Report Count, data= [ 0x08 ] 8 Item(Main ): Input, data= [ 0x02 ] 2 Data Variable Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Global): Report Count, data= [ 0x01 ] 1 Item(Global): Report Size, data= [ 0x08 ] 8 Item(Main ): Input, data= [ 0x01 ] 1 Constant Array Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Global): Report Count, data= [ 0x05 ] 5 Item(Global): Report Size, data= [ 0x01 ] 1 Item(Global): Usage Page, data= [ 0x08 ] 8 LEDs Item(Local ): Usage Minimum, data= [ 0x01 ] 1 NumLock Item(Local ): Usage Maximum, data= [ 0x05 ] 5 Kana Item(Main ): Output, data= [ 0x02 ] 2 Data Variable Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Global): Report Count, data= [ 0x01 ] 1 Item(Global): Report Size, data= [ 0x03 ] 3 Item(Main ): Output, data= [ 0x01 ] 1 Constant Array Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Global): Report Count, data= [ 0x06 ] 6 Item(Global): Report Size, data= [ 0x08 ] 8 Item(Global): Logical Minimum, data= [ 0x00 ] 0 Item(Global): Logical Maximum, data= [ 0x65 ] 101 Item(Global): Usage Page, data= [ 0x07 ] 7 Keyboard Item(Local ): Usage Minimum, data= [ 0x00 ] 0 No Event Item(Local ): Usage Maximum, data= [ 0x65 ] 101 Keyboard Application (Windows Key for Win95 or Compose) Item(Main ): Input, data= [ 0x00 ] 0 Data Array Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Local ): Usage, data= [ 0x03 ] 3 Keyboard Error Undefined Item(Global): Report Size, data= [ 0x08 ] 8 Item(Global): Report Count, data= [ 0x08 ] 8 Item(Main ): Feature, data= [ 0x02 ] 2 Data Variable Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Main ): End Collection, data=none Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 10 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.10 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 34 Report Descriptor: (length is 34) Item(Global): Usage Page, data= [ 0xd0 0xf1 ] 61904 (null) Item(Local ): Usage, data= [ 0x01 ] 1 (null) Item(Main ): Collection, data= [ 0x01 ] 1 Application Item(Local ): Usage, data= [ 0x20 ] 32 (null) Item(Global): Logical Minimum, data= [ 0x00 ] 0 Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255 Item(Global): Report Size, data= [ 0x08 ] 8 Item(Global): Report Count, data= [ 0x40 ] 64 Item(Main ): Input, data= [ 0x02 ] 2 Data Variable Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Local ): Usage, data= [ 0x21 ] 33 (null) Item(Global): Logical Minimum, data= [ 0x00 ] 0 Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255 Item(Global): Report Size, data= [ 0x08 ] 8 Item(Global): Report Count, data= [ 0x40 ] 64 Item(Main ): Output, data= [ 0x02 ] 2 Data Variable Absolute No_Wrap Linear Preferred_State No_Null_Position Non_Volatile Bitfield Item(Main ): End Collection, data=none Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x04 EP 4 OUT bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 2 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x84 EP 4 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 2 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 11 Chip/SmartCard bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 ChipCard Interface Descriptor: bLength 54 bDescriptorType 33 bcdCCID 1.00 nMaxSlotIndex 0 bVoltageSupport 7 5.0V 3.0V 1.8V dwProtocols 2 T=1 dwDefaultClock 4000 dwMaxiumumClock 4000 bNumClockSupported 0 dwDataRate 307200 bps dwMaxDataRate 307200 bps bNumDataRatesSupp. 0 dwMaxIFSD 2038 dwSyncProtocols 00000000 dwMechanical 00000000 dwFeatures 000400FE Auto configuration based on ATR Auto activation on insert Auto voltage selection Auto clock change Auto baud rate change Auto parameter negotiation made by CCID Short and extended APDU level exchange dwMaxCCIDMsgLen 3072 bClassGetResponse echo bClassEnvelope echo wlcdLayout none bPINSupport 0 bMaxCCIDBusySlots 1 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 32 Device Status: 0x0000 (Bus Powered) And the debug patch: diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c index d88a5b15f..89b87d5ed 100644 --- a/drivers/usb/usbip/usbip_common.c +++ b/drivers/usb/usbip/usbip_common.c @@ -709,6 +709,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) /* should not happen, probably malicious packet */ goto error; + dev_err(&urb->dev->dev, "num_sgs %d\n", urb->num_sgs); if (urb->num_sgs) { copy = size; for_each_sg(urb->sg, sg, urb->num_sgs, i) { @@ -721,6 +722,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) recv = usbip_recv(ud->tcp_socket, sg_virt(sg), recv_size); + dev_err(&urb->dev->dev, "size %d, copy %d recv %d, recv_size %d, sg->length %d\n", size, copy, recv, recv_size, sg->length); if (recv != recv_size) goto error; @@ -740,7 +742,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) return ret; error: - dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret); + dev_err(&urb->dev->dev, "recv xbuf, %d size %d\n", ret, size); if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) usbip_event_add(ud, SDEV_EVENT_ERROR_TCP); else -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-09 3:37 ` Marek Marczykowski-Górecki @ 2019-12-09 6:35 ` Suwan Kim 2019-12-09 14:19 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-09 6:35 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote: > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > > On Fri, Dec 06, 2019 at 03:50:58PM +0900, Suwan Kim wrote: > > > > On Fri, Dec 06, 2019 at 04:24:06AM +0100, Marek Marczykowski-Górecki wrote: > > > > > Hello, > > > > > > > > > > I've hit an issue with recent 4.19 and 5.4 kernels. In short: if I > > > > > connect Yubikey 4 and use its CCID interface (for example `ykman oath > > > > > list` command), the client side hangs (100% reliably). After 60s I get a > > > > > message that a CPU hangs waiting for a spinlock (see below). > > > > > > > > > > I've bisected it to a ea44d190764b4422af ("usbip: Implement SG support > > > > > to vhci-hcd and stub driver") commit. Which indeed is also backported to > > > > > 4.19. > > > > > > > > > > Any idea what is going on here? I can easily provide more information, > > > > > if you tell me how to get it. > > > > > > > > > > > > > Hi, > > > > > > > > Thanks for reporting. Could you turn on lockdep and USBIP_DEBUG > > > > in kernel config and send dmesg log? It will be helpful to figure > > > > out lock dependency in vhci_hcd. > > > > > > Hmm, I've tried, but I don't see much more information there (see > > > below). I've just enabled PROVE_LOCKING and USBIP_DEBUG. Do I need to do > > > anything more, like some boot option? > > > > > > Also, this one (as the previous one) is from 4.19.84. Interestingly, on > > > 4.19.87 I don't get the message at all. > > > > > > Hmm, I've done also another test: got 4.19.84 with "usbip: Implement SG > > > support to vhci-hcd and stub driver" reverted and it still hangs... > > > > If so, deadlock is caused by other causes, and why is it different > > from the results of bisect? > > No idea, but as you've seen in the other email, another bisect returned > exactly the same commit. > > > > > I'm going for another bisect round (4.19.81 works, 4.19.84 doesn't). > > > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > > > This message is printed by receive error and before that, driver > > canceled URB transmission. we need to know the exact situation > > before this message. > > I've added some more messages and found recv_size is 0. That is the bug point. "size" is urb->actual_length that means amount of data actually received from device. And "copy" is amount of data received from usbip server. So, in this situation, vhci-hcd received all the data from usbip server even if there are more sg entries left. So, "copy == 0" means vhci-hcd receives all data from the server and we should check "if (copy == 0)" in for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not to add error event. > > > Could you send me a longer log messages showing the situation > > before "[ 212.890519] usb 1-1: recv xbuf, 42"? > > Sure, with added extra messages (debug patch below). > > [ 131.397522] usb 1-1: num_sgs 0 > [ 131.406588] usb 1-1: num_sgs 0 > [ 131.410621] usb 1-1: num_sgs 0 > [ 131.411950] usb 1-1: num_sgs 0 > [ 131.413186] usb 1-1: num_sgs 0 > [ 131.414590] usb 1-1: num_sgs 0 > [ 131.417086] usb 1-1: num_sgs 0 > [ 131.418188] usb 1-1: num_sgs 0 > [ 131.419228] usb 1-1: num_sgs 0 > [ 131.420248] usb 1-1: num_sgs 0 > [ 131.457315] usb 1-1: num_sgs 5 > [ 131.457345] usb 1-1: size 42, copy 42 recv 42, recv_size 42, sg->length 16384 Device sent 42 bytes data (size 42) and vhci-hcd received 42 bytes data from the server. vhci-hcd received all the data and It should exit the loop. > [ 131.457359] usb 1-1: size 42, copy 0 recv -22, recv_size 0, sg->length 16384 > [ 131.457372] usb 1-1: recv xbuf, 42 size 42 > [ 131.458263] vhci_hcd: vhci_shutdown_connection:1024: stop threads > [ 131.458318] vhci_hcd: vhci_shutdown_connection:1032: release socket > [ 131.458431] vhci_hcd: vhci_shutdown_connection:1058: disconnect device > [ 131.460171] usb 1-1: USB disconnect, device number 2 > > (...) > > If I add "if (!recv_size) continue;" there, it works! I think we should check "copy" not the "recv_size" because "copy" shows the amount of data received from the server. int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) ... ... if (urb->num_sgs) { copy = size; for_each_sg(urb->sg, sg, urb->num_sgs, i) { int recv_size; if (copy < sg->length) recv_size = copy; else recv_size = sg->length; recv = usbip_recv(ud->tcp_socket, sg_virt(sg), recv_size); if (recv != recv_size) goto error; copy -= recv; ret += recv; /* Add here */ if (!copy) break; ^^^^^^^^^^^^^^ } Regards, Suwan Kim > > And please also send the result of "lsusb -v". > > Bus 003 Device 006: ID 1050:0407 Yubico.com Yubikey 4 OTP+U2F+CCID > Device Descriptor: > bLength 18 > bDescriptorType 1 > bcdUSB 2.00 > bDeviceClass 0 > bDeviceSubClass 0 > bDeviceProtocol 0 > bMaxPacketSize0 64 > idVendor 0x1050 Yubico.com > idProduct 0x0407 Yubikey 4 OTP+U2F+CCID > bcdDevice 4.34 > iManufacturer 1 Yubico > iProduct 2 Yubikey 4 OTP+U2F+CCID > iSerial 0 > bNumConfigurations 1 > Configuration Descriptor: > bLength 9 > bDescriptorType 2 > wTotalLength 0x0096 > bNumInterfaces 3 > bConfigurationValue 1 > iConfiguration 0 > bmAttributes 0x80 > (Bus Powered) > MaxPower 30mA > Interface Descriptor: > bLength 9 > bDescriptorType 4 > bInterfaceNumber 0 > bAlternateSetting 0 > bNumEndpoints 1 > bInterfaceClass 3 Human Interface Device > bInterfaceSubClass 1 Boot Interface Subclass > bInterfaceProtocol 1 Keyboard > iInterface 0 > HID Device Descriptor: > bLength 9 > bDescriptorType 33 > bcdHID 1.10 > bCountryCode 0 Not supported > bNumDescriptors 1 > bDescriptorType 34 Report > wDescriptorLength 71 > Report Descriptor: (length is 71) > Item(Global): Usage Page, data= [ 0x01 ] 1 > Generic Desktop Controls > Item(Local ): Usage, data= [ 0x06 ] 6 > Keyboard > Item(Main ): Collection, data= [ 0x01 ] 1 > Application > Item(Global): Usage Page, data= [ 0x07 ] 7 > Keyboard > Item(Local ): Usage Minimum, data= [ 0xe0 ] 224 > Control Left > Item(Local ): Usage Maximum, data= [ 0xe7 ] 231 > GUI Right > Item(Global): Logical Minimum, data= [ 0x00 ] 0 > Item(Global): Logical Maximum, data= [ 0x01 ] 1 > Item(Global): Report Size, data= [ 0x01 ] 1 > Item(Global): Report Count, data= [ 0x08 ] 8 > Item(Main ): Input, data= [ 0x02 ] 2 > Data Variable Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Global): Report Count, data= [ 0x01 ] 1 > Item(Global): Report Size, data= [ 0x08 ] 8 > Item(Main ): Input, data= [ 0x01 ] 1 > Constant Array Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Global): Report Count, data= [ 0x05 ] 5 > Item(Global): Report Size, data= [ 0x01 ] 1 > Item(Global): Usage Page, data= [ 0x08 ] 8 > LEDs > Item(Local ): Usage Minimum, data= [ 0x01 ] 1 > NumLock > Item(Local ): Usage Maximum, data= [ 0x05 ] 5 > Kana > Item(Main ): Output, data= [ 0x02 ] 2 > Data Variable Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Global): Report Count, data= [ 0x01 ] 1 > Item(Global): Report Size, data= [ 0x03 ] 3 > Item(Main ): Output, data= [ 0x01 ] 1 > Constant Array Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Global): Report Count, data= [ 0x06 ] 6 > Item(Global): Report Size, data= [ 0x08 ] 8 > Item(Global): Logical Minimum, data= [ 0x00 ] 0 > Item(Global): Logical Maximum, data= [ 0x65 ] 101 > Item(Global): Usage Page, data= [ 0x07 ] 7 > Keyboard > Item(Local ): Usage Minimum, data= [ 0x00 ] 0 > No Event > Item(Local ): Usage Maximum, data= [ 0x65 ] 101 > Keyboard Application (Windows Key for Win95 or Compose) > Item(Main ): Input, data= [ 0x00 ] 0 > Data Array Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Local ): Usage, data= [ 0x03 ] 3 > Keyboard Error Undefined > Item(Global): Report Size, data= [ 0x08 ] 8 > Item(Global): Report Count, data= [ 0x08 ] 8 > Item(Main ): Feature, data= [ 0x02 ] 2 > Data Variable Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Main ): End Collection, data=none > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x81 EP 1 IN > bmAttributes 3 > Transfer Type Interrupt > Synch Type None > Usage Type Data > wMaxPacketSize 0x0008 1x 8 bytes > bInterval 10 > Interface Descriptor: > bLength 9 > bDescriptorType 4 > bInterfaceNumber 1 > bAlternateSetting 0 > bNumEndpoints 2 > bInterfaceClass 3 Human Interface Device > bInterfaceSubClass 0 > bInterfaceProtocol 0 > iInterface 0 > HID Device Descriptor: > bLength 9 > bDescriptorType 33 > bcdHID 1.10 > bCountryCode 0 Not supported > bNumDescriptors 1 > bDescriptorType 34 Report > wDescriptorLength 34 > Report Descriptor: (length is 34) > Item(Global): Usage Page, data= [ 0xd0 0xf1 ] 61904 > (null) > Item(Local ): Usage, data= [ 0x01 ] 1 > (null) > Item(Main ): Collection, data= [ 0x01 ] 1 > Application > Item(Local ): Usage, data= [ 0x20 ] 32 > (null) > Item(Global): Logical Minimum, data= [ 0x00 ] 0 > Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255 > Item(Global): Report Size, data= [ 0x08 ] 8 > Item(Global): Report Count, data= [ 0x40 ] 64 > Item(Main ): Input, data= [ 0x02 ] 2 > Data Variable Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Local ): Usage, data= [ 0x21 ] 33 > (null) > Item(Global): Logical Minimum, data= [ 0x00 ] 0 > Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255 > Item(Global): Report Size, data= [ 0x08 ] 8 > Item(Global): Report Count, data= [ 0x40 ] 64 > Item(Main ): Output, data= [ 0x02 ] 2 > Data Variable Absolute No_Wrap Linear > Preferred_State No_Null_Position Non_Volatile Bitfield > Item(Main ): End Collection, data=none > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x04 EP 4 OUT > bmAttributes 3 > Transfer Type Interrupt > Synch Type None > Usage Type Data > wMaxPacketSize 0x0040 1x 64 bytes > bInterval 2 > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x84 EP 4 IN > bmAttributes 3 > Transfer Type Interrupt > Synch Type None > Usage Type Data > wMaxPacketSize 0x0040 1x 64 bytes > bInterval 2 > Interface Descriptor: > bLength 9 > bDescriptorType 4 > bInterfaceNumber 2 > bAlternateSetting 0 > bNumEndpoints 3 > bInterfaceClass 11 Chip/SmartCard > bInterfaceSubClass 0 > bInterfaceProtocol 0 > iInterface 0 > ChipCard Interface Descriptor: > bLength 54 > bDescriptorType 33 > bcdCCID 1.00 > nMaxSlotIndex 0 > bVoltageSupport 7 5.0V 3.0V 1.8V > dwProtocols 2 T=1 > dwDefaultClock 4000 > dwMaxiumumClock 4000 > bNumClockSupported 0 > dwDataRate 307200 bps > dwMaxDataRate 307200 bps > bNumDataRatesSupp. 0 > dwMaxIFSD 2038 > dwSyncProtocols 00000000 > dwMechanical 00000000 > dwFeatures 000400FE > Auto configuration based on ATR > Auto activation on insert > Auto voltage selection > Auto clock change > Auto baud rate change > Auto parameter negotiation made by CCID > Short and extended APDU level exchange > dwMaxCCIDMsgLen 3072 > bClassGetResponse echo > bClassEnvelope echo > wlcdLayout none > bPINSupport 0 > bMaxCCIDBusySlots 1 > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x02 EP 2 OUT > bmAttributes 2 > Transfer Type Bulk > Synch Type None > Usage Type Data > wMaxPacketSize 0x0040 1x 64 bytes > bInterval 0 > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x82 EP 2 IN > bmAttributes 2 > Transfer Type Bulk > Synch Type None > Usage Type Data > wMaxPacketSize 0x0040 1x 64 bytes > bInterval 0 > Endpoint Descriptor: > bLength 7 > bDescriptorType 5 > bEndpointAddress 0x83 EP 3 IN > bmAttributes 3 > Transfer Type Interrupt > Synch Type None > Usage Type Data > wMaxPacketSize 0x0008 1x 8 bytes > bInterval 32 > Device Status: 0x0000 > (Bus Powered) > > > > And the debug patch: > > diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c > index d88a5b15f..89b87d5ed 100644 > --- a/drivers/usb/usbip/usbip_common.c > +++ b/drivers/usb/usbip/usbip_common.c > @@ -709,6 +709,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) > /* should not happen, probably malicious packet */ > goto error; > > + dev_err(&urb->dev->dev, "num_sgs %d\n", urb->num_sgs); > if (urb->num_sgs) { > copy = size; > for_each_sg(urb->sg, sg, urb->num_sgs, i) { > @@ -721,6 +722,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) > > recv = usbip_recv(ud->tcp_socket, sg_virt(sg), > recv_size); > + dev_err(&urb->dev->dev, "size %d, copy %d recv %d, recv_size %d, sg->length %d\n", size, copy, recv, recv_size, sg->length); > > if (recv != recv_size) > goto error; > @@ -740,7 +742,7 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) > return ret; > > error: > - dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret); > + dev_err(&urb->dev->dev, "recv xbuf, %d size %d\n", ret, size); > if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) > usbip_event_add(ud, SDEV_EVENT_ERROR_TCP); > else > > > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-09 6:35 ` Suwan Kim @ 2019-12-09 14:19 ` Marek Marczykowski-Górecki 2019-12-10 14:25 ` Suwan Kim 0 siblings, 1 reply; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-09 14:19 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 3675 bytes --] On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote: > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote: > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > > > > > This message is printed by receive error and before that, driver > > > canceled URB transmission. we need to know the exact situation > > > before this message. > > > > I've added some more messages and found recv_size is 0. > > That is the bug point. "size" is urb->actual_length that means > amount of data actually received from device. And "copy" is > amount of data received from usbip server. So, in this situation, > vhci-hcd received all the data from usbip server even if there > are more sg entries left. So, "copy == 0" means vhci-hcd receives > all data from the server and we should check "if (copy == 0)" in > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not > to add error event. That makes sense. But I think there is also another issue here: hang in case of an error. Here it was EINVAL, but there are probably other reasons why usbip_recv can fail, like network error or misbehaving server. This definitely should not cause the client to fail this way... And also, the actual error code is lost. > > > Could you send me a longer log messages showing the situation > > > before "[ 212.890519] usb 1-1: recv xbuf, 42"? > > > > Sure, with added extra messages (debug patch below). > > > > [ 131.397522] usb 1-1: num_sgs 0 > > [ 131.406588] usb 1-1: num_sgs 0 > > [ 131.410621] usb 1-1: num_sgs 0 > > [ 131.411950] usb 1-1: num_sgs 0 > > [ 131.413186] usb 1-1: num_sgs 0 > > [ 131.414590] usb 1-1: num_sgs 0 > > [ 131.417086] usb 1-1: num_sgs 0 > > [ 131.418188] usb 1-1: num_sgs 0 > > [ 131.419228] usb 1-1: num_sgs 0 > > [ 131.420248] usb 1-1: num_sgs 0 > > [ 131.457315] usb 1-1: num_sgs 5 > > [ 131.457345] usb 1-1: size 42, copy 42 recv 42, recv_size 42, sg->length 16384 > > Device sent 42 bytes data (size 42) and vhci-hcd received 42 bytes > data from the server. vhci-hcd received all the data and It should > exit the loop. > > > [ 131.457359] usb 1-1: size 42, copy 0 recv -22, recv_size 0, sg->length 16384 > > [ 131.457372] usb 1-1: recv xbuf, 42 size 42 > > [ 131.458263] vhci_hcd: vhci_shutdown_connection:1024: stop threads > > [ 131.458318] vhci_hcd: vhci_shutdown_connection:1032: release socket > > [ 131.458431] vhci_hcd: vhci_shutdown_connection:1058: disconnect device > > [ 131.460171] usb 1-1: USB disconnect, device number 2 > > > > (...) > > > > If I add "if (!recv_size) continue;" there, it works! > > I think we should check "copy" not the "recv_size" because "copy" > shows the amount of data received from the server. > > int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) > ... > ... > if (urb->num_sgs) { > copy = size; > for_each_sg(urb->sg, sg, urb->num_sgs, i) { > int recv_size; > > if (copy < sg->length) > recv_size = copy; > else > recv_size = sg->length; > > recv = usbip_recv(ud->tcp_socket, sg_virt(sg), > recv_size); > > if (recv != recv_size) > goto error; > > copy -= recv; > ret += recv; > > /* Add here */ > if (!copy) > break; > ^^^^^^^^^^^^^^ > } This helps too. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-09 14:19 ` Marek Marczykowski-Górecki @ 2019-12-10 14:25 ` Suwan Kim 2019-12-10 15:32 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-10 14:25 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Mon, Dec 09, 2019 at 03:19:59PM +0100, Marek Marczykowski-Górecki wrote: > On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote: > > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote: > > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > > > > > > > This message is printed by receive error and before that, driver > > > > canceled URB transmission. we need to know the exact situation > > > > before this message. > > > > > > I've added some more messages and found recv_size is 0. > > > > That is the bug point. "size" is urb->actual_length that means > > amount of data actually received from device. And "copy" is > > amount of data received from usbip server. So, in this situation, > > vhci-hcd received all the data from usbip server even if there > > are more sg entries left. So, "copy == 0" means vhci-hcd receives > > all data from the server and we should check "if (copy == 0)" in > > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not > > to add error event. > > That makes sense. But I think there is also another issue here: hang in > case of an error. Here it was EINVAL, but there are probably other > reasons why usbip_recv can fail, like network error or misbehaving > server. This definitely should not cause the client to fail this way... > And also, the actual error code is lost. I agree. I have been taking a look at it and trying to reproduce the same issue on my machine. I guess race condition between hub irq thread and driver (vhci_urb_enqueue or hcd?) But I'm not sure... > > > > > Could you send me a longer log messages showing the situation > > > > before "[ 212.890519] usb 1-1: recv xbuf, 42"? > > > > > > Sure, with added extra messages (debug patch below). > > > > > > [ 131.397522] usb 1-1: num_sgs 0 > > > [ 131.406588] usb 1-1: num_sgs 0 > > > [ 131.410621] usb 1-1: num_sgs 0 > > > [ 131.411950] usb 1-1: num_sgs 0 > > > [ 131.413186] usb 1-1: num_sgs 0 > > > [ 131.414590] usb 1-1: num_sgs 0 > > > [ 131.417086] usb 1-1: num_sgs 0 > > > [ 131.418188] usb 1-1: num_sgs 0 > > > [ 131.419228] usb 1-1: num_sgs 0 > > > [ 131.420248] usb 1-1: num_sgs 0 > > > [ 131.457315] usb 1-1: num_sgs 5 > > > [ 131.457345] usb 1-1: size 42, copy 42 recv 42, recv_size 42, sg->length 16384 > > > > Device sent 42 bytes data (size 42) and vhci-hcd received 42 bytes > > data from the server. vhci-hcd received all the data and It should > > exit the loop. > > > > > [ 131.457359] usb 1-1: size 42, copy 0 recv -22, recv_size 0, sg->length 16384 > > > [ 131.457372] usb 1-1: recv xbuf, 42 size 42 > > > [ 131.458263] vhci_hcd: vhci_shutdown_connection:1024: stop threads > > > [ 131.458318] vhci_hcd: vhci_shutdown_connection:1032: release socket > > > [ 131.458431] vhci_hcd: vhci_shutdown_connection:1058: disconnect device > > > [ 131.460171] usb 1-1: USB disconnect, device number 2 > > > > > > (...) > > > > > > If I add "if (!recv_size) continue;" there, it works! > > > > I think we should check "copy" not the "recv_size" because "copy" > > shows the amount of data received from the server. > > > > int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) > > ... > > ... > > if (urb->num_sgs) { > > copy = size; > > for_each_sg(urb->sg, sg, urb->num_sgs, i) { > > int recv_size; > > > > if (copy < sg->length) > > recv_size = copy; > > else > > recv_size = sg->length; > > > > recv = usbip_recv(ud->tcp_socket, sg_virt(sg), > > recv_size); > > > > if (recv != recv_size) > > goto error; > > > > copy -= recv; > > ret += recv; > > > > /* Add here */ > > if (!copy) > > break; > > ^^^^^^^^^^^^^^ > > } > > This helps too. Good. We should fix this issue first. I will submit a patch soon. Regrads, Suwan Kim ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-10 14:25 ` Suwan Kim @ 2019-12-10 15:32 ` Marek Marczykowski-Górecki 2019-12-11 3:07 ` Suwan Kim 0 siblings, 1 reply; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-10 15:32 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 3683 bytes --] On Tue, Dec 10, 2019 at 11:25:35PM +0900, Suwan Kim wrote: > On Mon, Dec 09, 2019 at 03:19:59PM +0100, Marek Marczykowski-Górecki wrote: > > On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote: > > > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote: > > > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > > > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > > > > > > > > > This message is printed by receive error and before that, driver > > > > > canceled URB transmission. we need to know the exact situation > > > > > before this message. > > > > > > > > I've added some more messages and found recv_size is 0. > > > > > > That is the bug point. "size" is urb->actual_length that means > > > amount of data actually received from device. And "copy" is > > > amount of data received from usbip server. So, in this situation, > > > vhci-hcd received all the data from usbip server even if there > > > are more sg entries left. So, "copy == 0" means vhci-hcd receives > > > all data from the server and we should check "if (copy == 0)" in > > > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not > > > to add error event. > > > > That makes sense. But I think there is also another issue here: hang in > > case of an error. Here it was EINVAL, but there are probably other > > reasons why usbip_recv can fail, like network error or misbehaving > > server. This definitely should not cause the client to fail this way... > > And also, the actual error code is lost. > > I agree. I have been taking a look at it and trying to reproduce > the same issue on my machine. I guess race condition between hub > irq thread and driver (vhci_urb_enqueue or hcd?) > But I'm not sure... I've tried it some more time and one time I've got a traceback pointing at lock_release() called from vhci_urb_dequeue, not lock_acquire(). This get me thinking it may not be deadlock on a spinlock, but some infinite loop. Looking at the source, I think it's about usb_hcd_flush_endpoint looping indefinitely because vhci_urb_dequeue() exit early on this: spin_lock_irqsave(&vhci->lock, flags); priv = urb->hcpriv; if (!priv) { /* URB was never linked! or will be soon given back by * vhci_rx. */ spin_unlock_irqrestore(&vhci->lock, flags); return -EIDRM; } Adding a print there confirms it. And I think it's because of vhci_recv_ret_submit(): spin_lock_irqsave(&vdev->priv_lock, flags); urb = pickup_urb_and_free_priv(vdev, pdu->base.seqnum); // ****** priv freed here spin_unlock_irqrestore(&vdev->priv_lock, flags); if (!urb) { pr_err("cannot find a urb of seqnum %u max seqnum %d\n", pdu->base.seqnum, atomic_read(&vhci_hcd->seqnum)); usbip_event_add(ud, VDEV_EVENT_ERROR_TCP); return; } /* unpack the pdu to a urb */ usbip_pack_pdu(pdu, urb, USBIP_RET_SUBMIT, 0); /* recv transfer buffer */ if (usbip_recv_xbuff(ud, urb) < 0) // ***** exit early here return; /* recv iso_packet_descriptor */ if (usbip_recv_iso(ud, urb) < 0) return; I'm not really sure what should happen, but I think some cleanup in case of usbip_recv_xbuff() failure is missing. And probably in case of usbip_recv_iso() failure too. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-10 15:32 ` Marek Marczykowski-Górecki @ 2019-12-11 3:07 ` Suwan Kim 2019-12-11 3:20 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-11 3:07 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Tue, Dec 10, 2019 at 04:32:21PM +0100, Marek Marczykowski-Górecki wrote: > On Tue, Dec 10, 2019 at 11:25:35PM +0900, Suwan Kim wrote: > > On Mon, Dec 09, 2019 at 03:19:59PM +0100, Marek Marczykowski-Górecki wrote: > > > On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote: > > > > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote: > > > > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote: > > > > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote: > > > > > > > [ 212.890519] usb 1-1: recv xbuf, 42 > > > > > > > > > > > > This message is printed by receive error and before that, driver > > > > > > canceled URB transmission. we need to know the exact situation > > > > > > before this message. > > > > > > > > > > I've added some more messages and found recv_size is 0. > > > > > > > > That is the bug point. "size" is urb->actual_length that means > > > > amount of data actually received from device. And "copy" is > > > > amount of data received from usbip server. So, in this situation, > > > > vhci-hcd received all the data from usbip server even if there > > > > are more sg entries left. So, "copy == 0" means vhci-hcd receives > > > > all data from the server and we should check "if (copy == 0)" in > > > > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not > > > > to add error event. > > > > > > That makes sense. But I think there is also another issue here: hang in > > > case of an error. Here it was EINVAL, but there are probably other > > > reasons why usbip_recv can fail, like network error or misbehaving > > > server. This definitely should not cause the client to fail this way... > > > And also, the actual error code is lost. > > > > I agree. I have been taking a look at it and trying to reproduce > > the same issue on my machine. I guess race condition between hub > > irq thread and driver (vhci_urb_enqueue or hcd?) > > But I'm not sure... > > I've tried it some more time and one time I've got a traceback pointing > at lock_release() called from vhci_urb_dequeue, not lock_acquire(). This > get me thinking it may not be deadlock on a spinlock, but some infinite > loop. Looking at the source, I think it's about usb_hcd_flush_endpoint > looping indefinitely because vhci_urb_dequeue() exit early on this: That really makes sense. Very good debugging. I can also see the infinite loop now. When there is transaction error, vhci rx thread adds error event and event handler tries to shutdown the connection and also changes the port status of vhci that kicks hub_irq. Then hub_irq cleans up the device resources including pending URBs (vhci_urb_dequeue). But as you pointed out, "priv" is already released in vhci rx thread. > > > spin_lock_irqsave(&vhci->lock, flags); > > priv = urb->hcpriv; > if (!priv) { > /* URB was never linked! or will be soon given back by > * vhci_rx. */ > spin_unlock_irqrestore(&vhci->lock, flags); > return -EIDRM; > } > > Adding a print there confirms it. > > And I think it's because of vhci_recv_ret_submit(): > > > spin_lock_irqsave(&vdev->priv_lock, flags); > urb = pickup_urb_and_free_priv(vdev, pdu->base.seqnum); > > // ****** priv freed here > > spin_unlock_irqrestore(&vdev->priv_lock, flags); > > if (!urb) { > pr_err("cannot find a urb of seqnum %u max seqnum %d\n", > pdu->base.seqnum, > atomic_read(&vhci_hcd->seqnum)); > usbip_event_add(ud, VDEV_EVENT_ERROR_TCP); > return; > } > > /* unpack the pdu to a urb */ > usbip_pack_pdu(pdu, urb, USBIP_RET_SUBMIT, 0); > > /* recv transfer buffer */ > if (usbip_recv_xbuff(ud, urb) < 0) > // ***** exit early here > return; > > /* recv iso_packet_descriptor */ > if (usbip_recv_iso(ud, urb) < 0) > return; > > I'm not really sure what should happen, but I think some cleanup in case > of usbip_recv_xbuff() failure is missing. And probably in case of > usbip_recv_iso() failure too. HCD should giveback URB to driver calling usb_hcd_giveback_urb(). But in the case of transaction error, vhci_recv_ret_submit() terminates without giveback URB. So, I think the error path in usbip_recv_xbuff and usbip_recv_iso should unlink URB from ep and insert proper error code in urb->status that indicates error status to driver and handle giveback URB to driver. Then hub_irq doesn't need to flush error URB. That will be helpful to graceful connection shutdown. static void vhci_recv_ret_submit(struct vhci_device *vdev, struct usbip_header *pdu) ... ... if (usbip_recv_xbuff(ud, urb) < 0) { urb->status = -EPIPE; goto error; // goto error path } /* recv iso_packet_descriptor */ if (usbip_recv_iso(ud, urb) < 0) { urb->status = -EPIPE; goto error; // goto error path } ... ... error://error path spin_lock_irqsave(&vhci->lock, flags); usb_hcd_unlink_urb_from_ep(vhci_hcd_to_hcd(vhci_hcd), urb); spin_unlock_irqrestore(&vhci->lock, flags); usb_hcd_giveback_urb(vhci_hcd_to_hcd(vhci_hcd), urb, urb->status); usbip_dbg_vhci_rx("Leave\n"); } Regards, Suwan Kim > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-11 3:07 ` Suwan Kim @ 2019-12-11 3:20 ` Marek Marczykowski-Górecki 2019-12-11 4:53 ` Suwan Kim 2019-12-11 6:27 ` Suwan Kim 0 siblings, 2 replies; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-11 3:20 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 1504 bytes --] On Wed, Dec 11, 2019 at 12:07:54PM +0900, Suwan Kim wrote: > HCD should giveback URB to driver calling usb_hcd_giveback_urb(). > But in the case of transaction error, vhci_recv_ret_submit() > terminates without giveback URB. So, I think the error path in > usbip_recv_xbuff and usbip_recv_iso should unlink URB from ep and > insert proper error code in urb->status that indicates error > status to driver and handle giveback URB to driver. > > Then hub_irq doesn't need to flush error URB. > That will be helpful to graceful connection shutdown. > > > static void vhci_recv_ret_submit(struct vhci_device *vdev, > struct usbip_header *pdu) > ... > ... > if (usbip_recv_xbuff(ud, urb) < 0) { > urb->status = -EPIPE; > goto error; // goto error path > } > > /* recv iso_packet_descriptor */ > if (usbip_recv_iso(ud, urb) < 0) { > urb->status = -EPIPE; > goto error; // goto error path > } > ... > ... > error://error path > spin_lock_irqsave(&vhci->lock, flags); > usb_hcd_unlink_urb_from_ep(vhci_hcd_to_hcd(vhci_hcd), urb); > spin_unlock_irqrestore(&vhci->lock, flags); > > usb_hcd_giveback_urb(vhci_hcd_to_hcd(vhci_hcd), urb, urb->status); > > usbip_dbg_vhci_rx("Leave\n"); > } Yup, that works! Now the error is handled gracefully instead of hanging. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-11 3:20 ` Marek Marczykowski-Górecki @ 2019-12-11 4:53 ` Suwan Kim 2019-12-11 6:27 ` Suwan Kim 1 sibling, 0 replies; 17+ messages in thread From: Suwan Kim @ 2019-12-11 4:53 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Wed, Dec 11, 2019 at 04:20:32AM +0100, Marek Marczykowski-Górecki wrote: > On Wed, Dec 11, 2019 at 12:07:54PM +0900, Suwan Kim wrote: > > HCD should giveback URB to driver calling usb_hcd_giveback_urb(). > > But in the case of transaction error, vhci_recv_ret_submit() > > terminates without giveback URB. So, I think the error path in > > usbip_recv_xbuff and usbip_recv_iso should unlink URB from ep and > > insert proper error code in urb->status that indicates error > > status to driver and handle giveback URB to driver. > > > > Then hub_irq doesn't need to flush error URB. > > That will be helpful to graceful connection shutdown. > > > > > > static void vhci_recv_ret_submit(struct vhci_device *vdev, > > struct usbip_header *pdu) > > ... > > ... > > if (usbip_recv_xbuff(ud, urb) < 0) { > > urb->status = -EPIPE; > > goto error; // goto error path > > } > > > > /* recv iso_packet_descriptor */ > > if (usbip_recv_iso(ud, urb) < 0) { > > urb->status = -EPIPE; > > goto error; // goto error path > > } > > ... > > ... > > error://error path > > spin_lock_irqsave(&vhci->lock, flags); > > usb_hcd_unlink_urb_from_ep(vhci_hcd_to_hcd(vhci_hcd), urb); > > spin_unlock_irqrestore(&vhci->lock, flags); > > > > usb_hcd_giveback_urb(vhci_hcd_to_hcd(vhci_hcd), urb, urb->status); > > > > usbip_dbg_vhci_rx("Leave\n"); > > } > > Yup, that works! Now the error is handled gracefully instead of hanging. Ok, good! I will send a patch including Reported-by tag with you :) Thanks for reporting the bugs. Regards, Suwan Kim ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-11 3:20 ` Marek Marczykowski-Górecki 2019-12-11 4:53 ` Suwan Kim @ 2019-12-11 6:27 ` Suwan Kim 2019-12-11 11:01 ` Marek Marczykowski-Górecki 1 sibling, 1 reply; 17+ messages in thread From: Suwan Kim @ 2019-12-11 6:27 UTC (permalink / raw) To: Marek Marczykowski-Górecki; +Cc: linux-usb, Shuah Khan, Valentina Manea On Wed, Dec 11, 2019 at 04:20:32AM +0100, Marek Marczykowski-Górecki wrote: > On Wed, Dec 11, 2019 at 12:07:54PM +0900, Suwan Kim wrote: > > HCD should giveback URB to driver calling usb_hcd_giveback_urb(). > > But in the case of transaction error, vhci_recv_ret_submit() > > terminates without giveback URB. So, I think the error path in > > usbip_recv_xbuff and usbip_recv_iso should unlink URB from ep and > > insert proper error code in urb->status that indicates error > > status to driver and handle giveback URB to driver. > > > > Then hub_irq doesn't need to flush error URB. > > That will be helpful to graceful connection shutdown. > > > > > > static void vhci_recv_ret_submit(struct vhci_device *vdev, > > struct usbip_header *pdu) > > ... > > ... > > if (usbip_recv_xbuff(ud, urb) < 0) { > > urb->status = -EPIPE; > > goto error; // goto error path > > } > > > > /* recv iso_packet_descriptor */ > > if (usbip_recv_iso(ud, urb) < 0) { > > urb->status = -EPIPE; > > goto error; // goto error path > > } > > ... > > ... > > error://error path > > spin_lock_irqsave(&vhci->lock, flags); > > usb_hcd_unlink_urb_from_ep(vhci_hcd_to_hcd(vhci_hcd), urb); > > spin_unlock_irqrestore(&vhci->lock, flags); > > > > usb_hcd_giveback_urb(vhci_hcd_to_hcd(vhci_hcd), urb, urb->status); > > > > usbip_dbg_vhci_rx("Leave\n"); > > } > > Yup, that works! Now the error is handled gracefully instead of hanging. Marek, Could you test if it works too? Regards, Suwan Kim diff --git a/drivers/usb/usbip/vhci_rx.c b/drivers/usb/usbip/vhci_rx.c index 33f8972ba842..19a807e398eb 100644 --- a/drivers/usb/usbip/vhci_rx.c +++ b/drivers/usb/usbip/vhci_rx.c @@ -78,11 +78,11 @@ static void vhci_recv_ret_submit(struct vhci_device *vdev, /* recv transfer buffer */ if (usbip_recv_xbuff(ud, urb) < 0) - return; + urb->status = -EPIPE; /* recv iso_packet_descriptor */ if (usbip_recv_iso(ud, urb) < 0) - return; + urb->status = -EPIPE; /* restore the padding in iso packets */ usbip_pad_iso(ud, urb); ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock 2019-12-11 6:27 ` Suwan Kim @ 2019-12-11 11:01 ` Marek Marczykowski-Górecki 0 siblings, 0 replies; 17+ messages in thread From: Marek Marczykowski-Górecki @ 2019-12-11 11:01 UTC (permalink / raw) To: Suwan Kim; +Cc: linux-usb, Shuah Khan, Valentina Manea [-- Attachment #1: Type: text/plain, Size: 2595 bytes --] On Wed, Dec 11, 2019 at 03:27:05PM +0900, Suwan Kim wrote: > On Wed, Dec 11, 2019 at 04:20:32AM +0100, Marek Marczykowski-Górecki wrote: > > On Wed, Dec 11, 2019 at 12:07:54PM +0900, Suwan Kim wrote: > > > HCD should giveback URB to driver calling usb_hcd_giveback_urb(). > > > But in the case of transaction error, vhci_recv_ret_submit() > > > terminates without giveback URB. So, I think the error path in > > > usbip_recv_xbuff and usbip_recv_iso should unlink URB from ep and > > > insert proper error code in urb->status that indicates error > > > status to driver and handle giveback URB to driver. > > > > > > Then hub_irq doesn't need to flush error URB. > > > That will be helpful to graceful connection shutdown. > > > > > > > > > static void vhci_recv_ret_submit(struct vhci_device *vdev, > > > struct usbip_header *pdu) > > > ... > > > ... > > > if (usbip_recv_xbuff(ud, urb) < 0) { > > > urb->status = -EPIPE; > > > goto error; // goto error path > > > } > > > > > > /* recv iso_packet_descriptor */ > > > if (usbip_recv_iso(ud, urb) < 0) { > > > urb->status = -EPIPE; > > > goto error; // goto error path > > > } > > > ... > > > ... > > > error://error path > > > spin_lock_irqsave(&vhci->lock, flags); > > > usb_hcd_unlink_urb_from_ep(vhci_hcd_to_hcd(vhci_hcd), urb); > > > spin_unlock_irqrestore(&vhci->lock, flags); > > > > > > usb_hcd_giveback_urb(vhci_hcd_to_hcd(vhci_hcd), urb, urb->status); > > > > > > usbip_dbg_vhci_rx("Leave\n"); > > > } > > > > Yup, that works! Now the error is handled gracefully instead of hanging. > > Marek, Could you test if it works too? Yes, this seems to work too. > Regards, > Suwan Kim > > > diff --git a/drivers/usb/usbip/vhci_rx.c b/drivers/usb/usbip/vhci_rx.c > index 33f8972ba842..19a807e398eb 100644 > --- a/drivers/usb/usbip/vhci_rx.c > +++ b/drivers/usb/usbip/vhci_rx.c > @@ -78,11 +78,11 @@ static void vhci_recv_ret_submit(struct vhci_device *vdev, > > /* recv transfer buffer */ > if (usbip_recv_xbuff(ud, urb) < 0) > - return; > + urb->status = -EPIPE; > > /* recv iso_packet_descriptor */ > if (usbip_recv_iso(ud, urb) < 0) > - return; > + urb->status = -EPIPE; > > /* restore the padding in iso packets */ > usbip_pad_iso(ud, urb); -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2019-12-11 11:01 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-06 3:24 "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock Marek Marczykowski-Górecki 2019-12-06 6:50 ` Suwan Kim 2019-12-06 20:57 ` Marek Marczykowski-Górecki 2019-12-06 21:12 ` Shuah Khan 2019-12-07 0:58 ` Marek Marczykowski-Górecki 2019-12-07 18:45 ` Marek Marczykowski-Górecki 2019-12-09 2:01 ` Suwan Kim 2019-12-09 3:37 ` Marek Marczykowski-Górecki 2019-12-09 6:35 ` Suwan Kim 2019-12-09 14:19 ` Marek Marczykowski-Górecki 2019-12-10 14:25 ` Suwan Kim 2019-12-10 15:32 ` Marek Marczykowski-Górecki 2019-12-11 3:07 ` Suwan Kim 2019-12-11 3:20 ` Marek Marczykowski-Górecki 2019-12-11 4:53 ` Suwan Kim 2019-12-11 6:27 ` Suwan Kim 2019-12-11 11:01 ` Marek Marczykowski-Górecki
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.