* Re: WARNING in usb_composite_setup_continue [not found] <CAEAjamsxe9OuMVpHfox3w57HtGsE3mPXOty9bdXW-iPdx=TXMA@mail.gmail.com> @ 2020-11-09 19:29 ` Kyungtae Kim 2020-11-10 15:56 ` Alan Stern 0 siblings, 1 reply; 11+ messages in thread From: Kyungtae Kim @ 2020-11-09 19:29 UTC (permalink / raw) To: Felipe Balbi, Greg KH; +Cc: USB list, LKML, syzkaller We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version of syzkaller). (corrected analysis) This bug happens while continuing a delayed setup message in mass storage gadget. To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via fsg_set_alt() (line 1793), and followed by cdev->delayed_status++ (line 1798). Meanwile, the mass gadget tries check cdev->delayed_status == 0 through handle_exception() (line 2428), which occurs in between the two operations above. Such a race causes invalid operations eventually. ================================================================== usb_composite_setup_continue: Unexpected call WARNING: CPU: 1 PID: 1882 at drivers/usb/gadget/composite.c:2457 usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 1882 Comm: file-storage Not tainted 5.8.13 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xa7/0xea lib/dump_stack.c:118 panic+0x298/0x521 kernel/panic.c:231 __warn.cold.12+0x25/0x32 kernel/panic.c:600 report_bug+0x1b2/0x260 lib/bug.c:198 handle_bug+0x43/0x70 arch/x86/kernel/traps.c:235 exc_invalid_op+0x18/0x40 arch/x86/kernel/traps.c:255 asm_exc_invalid_op+0x12/0x20 ./arch/x86/include/asm/idtentry.h:540 RIP: 0010:usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457 Code: 62 48 8b 7b 58 4c 89 fe e8 66 fb ff ff e9 e6 fe ff ff e8 6c 89 87 fd 48 c7 c6 60 c9 41 86 48 c7 c7 60 c6 41 86 e8 78 0c 5c fd <0f> 0b e9 c7 fe ff ff e8 3d a4 ae fd e9 a3 fe ff ff e8 53 a4 ae fd RSP: 0000:ffff8880446dfd38 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff888069eee4b0 RCX: 0000000000000000 RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed10088dbf9d RBP: ffff8880446dfd60 R08: ffffed100d953fd2 R09: ffffed100d953fd2 R10: ffff88806ca9fe8b R11: ffffed100d953fd1 R12: 0000000000000000 R13: ffff888069eee540 R14: 0000000000000246 R15: ffff888040609310 handle_exception drivers/usb/gadget/function/f_mass_storage.c:2428 [inline] fsg_main_thread+0x12f4/0x59f1 drivers/usb/gadget/function/f_mass_storage.c:2466 kthread+0x374/0x480 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled ================================================================== Regards, Kyungtae Kim On Mon, Nov 9, 2020 at 2:08 PM Kyungtae Kim <kt0755@gmail.com> wrote: > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > of syzkaller). > > The bug happens when the freed object tty->port is accessed in > tty_init_dev (line 1358). > It seems tty->port is freed during the locking (i.e., tty_ldisc_lock) > ahead (line 1355). > > ================================================================== > usb_composite_setup_continue: Unexpected call > WARNING: CPU: 1 PID: 1882 at drivers/usb/gadget/composite.c:2457 usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457 > Kernel panic - not syncing: panic_on_warn set ... > CPU: 1 PID: 1882 Comm: file-storage Not tainted 5.8.13 #4 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0xa7/0xea lib/dump_stack.c:118 > panic+0x298/0x521 kernel/panic.c:231 > __warn.cold.12+0x25/0x32 kernel/panic.c:600 > report_bug+0x1b2/0x260 lib/bug.c:198 > handle_bug+0x43/0x70 arch/x86/kernel/traps.c:235 > exc_invalid_op+0x18/0x40 arch/x86/kernel/traps.c:255 > asm_exc_invalid_op+0x12/0x20 ./arch/x86/include/asm/idtentry.h:540 > RIP: 0010:usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457 > Code: 62 48 8b 7b 58 4c 89 fe e8 66 fb ff ff e9 e6 fe ff ff e8 6c 89 87 fd 48 c7 c6 60 c9 41 86 48 c7 c7 60 c6 41 86 e8 78 0c 5c fd <0f> 0b e9 c7 fe ff ff e8 3d a4 ae fd e9 a3 fe ff ff e8 53 a4 ae fd > RSP: 0000:ffff8880446dfd38 EFLAGS: 00010086 > RAX: 0000000000000000 RBX: ffff888069eee4b0 RCX: 0000000000000000 > RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed10088dbf9d > RBP: ffff8880446dfd60 R08: ffffed100d953fd2 R09: ffffed100d953fd2 > R10: ffff88806ca9fe8b R11: ffffed100d953fd1 R12: 0000000000000000 > R13: ffff888069eee540 R14: 0000000000000246 R15: ffff888040609310 > handle_exception drivers/usb/gadget/function/f_mass_storage.c:2428 [inline] > fsg_main_thread+0x12f4/0x59f1 drivers/usb/gadget/function/f_mass_storage.c:2466 > kthread+0x374/0x480 kernel/kthread.c:291 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 > Dumping ftrace buffer: > (ftrace buffer empty) > Kernel Offset: disabled > ================================================================== > > Regards, > Kyungtae Kim ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-09 19:29 ` WARNING in usb_composite_setup_continue Kyungtae Kim @ 2020-11-10 15:56 ` Alan Stern 2020-11-11 7:59 ` Peter Chen 2020-11-11 19:47 ` Alan Stern 0 siblings, 2 replies; 11+ messages in thread From: Alan Stern @ 2020-11-10 15:56 UTC (permalink / raw) To: Kyungtae Kim; +Cc: Felipe Balbi, Greg KH, USB list, syzkaller Felipe: On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote: > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > of syzkaller). > > (corrected analysis) > This bug happens while continuing a delayed setup message in mass > storage gadget. > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via > fsg_set_alt() (line 1793), > and followed by cdev->delayed_status++ (line 1798). > Meanwile, the mass gadget tries check cdev->delayed_status == 0 > through handle_exception() (line 2428), > which occurs in between the two operations above. > Such a race causes invalid operations eventually. Do you know who maintains composite.c (or the composite framework) these days? This is a real race, and it needs to be fixed. Part of the problem seems to be that cdev->delayed_status is sometimes accessed without the protection of cdev->lock. But I don't know when it is safe to take that lock, so I can't tell what changes to make. Another part of the problem is that cdev->delayed_status doesn't count things properly. That is, its value is incremented each time a function driver asks for a delayed status and decremented each time a function driver calls usb_composite_setup_continue(), and the delayed status response is sent when the value reaches 0. But there's nothing to stop this from happening (imagine a gadget with two functions A and B): Function driver A asks for delayed status; Function driver A calls setup_continue(): Now the value of the counter is 0 so a status message is queued too early; Function driver B asks for delayed status; Function driver B calls setup_continue(): Now a second status message is queued. I'm willing to help fix these issues, but I need assistance from someone who fully understands the composite framework. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-10 15:56 ` Alan Stern @ 2020-11-11 7:59 ` Peter Chen 2020-11-11 15:57 ` Alan Stern 2020-11-11 19:47 ` Alan Stern 1 sibling, 1 reply; 11+ messages in thread From: Peter Chen @ 2020-11-11 7:59 UTC (permalink / raw) To: Alan Stern; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On 20-11-10 10:56:50, Alan Stern wrote: > Felipe: > > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote: > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > > of syzkaller). > > > > (corrected analysis) > > This bug happens while continuing a delayed setup message in mass > > storage gadget. > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via > > fsg_set_alt() (line 1793), > > and followed by cdev->delayed_status++ (line 1798). > > Meanwile, the mass gadget tries check cdev->delayed_status == 0 > > through handle_exception() (line 2428), > > which occurs in between the two operations above. > > Such a race causes invalid operations eventually. > > Do you know who maintains composite.c (or the composite framework) these > days? This is a real race, and it needs to be fixed. > > Part of the problem seems to be that cdev->delayed_status is sometimes > accessed without the protection of cdev->lock. But I don't know when it > is safe to take that lock, so I can't tell what changes to make. > > Another part of the problem is that cdev->delayed_status doesn't count > things properly. That is, its value is incremented each time a function > driver asks for a delayed status and decremented each time a function > driver calls usb_composite_setup_continue(), and the delayed status > response is sent when the value reaches 0. But there's nothing to stop > this from happening (imagine a gadget with two functions A and B): > > Function driver A asks for delayed status; > Function driver A calls setup_continue(): Now the value > of the counter is 0 so a status message is queued > too early; > Function driver B asks for delayed status; > Function driver B calls setup_continue(): Now a second > status message is queued. > > I'm willing to help fix these issues, but I need assistance from someone > who fully understands the composite framework. > Hi Alan & Kyungtae, I quite not understand why this occurs, since cdev->delayed_status's increment and decrement are both protected by cdev->lock. cdev->delayed_status's increment: Place 1: case USB_REQ_GET_CONFIGURATION: spin_lock(&cdev->lock); set_config(cdev, ctrl, w_value); f->set_alt; cdev->delayed_status++; spin_unlock(&cdev->lock); Place 2: case USB_REQ_SET_INTERFACE: spin_lock(&cdev->lock); value = f->set_alt(f, w_index, w_value); if (value == USB_GADGET_DELAYED_STATUS) { DBG(cdev, "%s: interface %d (%s) requested delayed status\n", __func__, intf, f->name); cdev->delayed_status++; DBG(cdev, "delayed_status count %d\n", cdev->delayed_status); } spin_unlock(&cdev->lock); cdev->delayed_status's decrement: function: usb_composite_setup_continue which called by fsg_main_thread due to FSG_STATE_CONFIG_CHANGE. spin_lock_irqsave(&cdev->lock, flags); if (cdev->delayed_status == 0) { WARN(cdev, "%s: Unexpected call\n", __func__); } else if (--cdev->delayed_status == 0) { ... spin_unlock_irqrestore(&cdev->lock, flags); -- Thanks, Peter Chen ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-11 7:59 ` Peter Chen @ 2020-11-11 15:57 ` Alan Stern 0 siblings, 0 replies; 11+ messages in thread From: Alan Stern @ 2020-11-11 15:57 UTC (permalink / raw) To: Peter Chen; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On Wed, Nov 11, 2020 at 07:59:34AM +0000, Peter Chen wrote: > On 20-11-10 10:56:50, Alan Stern wrote: > > Felipe: > > > > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote: > > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > > > of syzkaller). > > > > > > (corrected analysis) > > > This bug happens while continuing a delayed setup message in mass > > > storage gadget. > > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via > > > fsg_set_alt() (line 1793), > > > and followed by cdev->delayed_status++ (line 1798). > > > Meanwile, the mass gadget tries check cdev->delayed_status == 0 > > > through handle_exception() (line 2428), > > > which occurs in between the two operations above. > > > Such a race causes invalid operations eventually. > > > > Do you know who maintains composite.c (or the composite framework) these > > days? This is a real race, and it needs to be fixed. > > > > Part of the problem seems to be that cdev->delayed_status is sometimes > > accessed without the protection of cdev->lock. But I don't know when it > > is safe to take that lock, so I can't tell what changes to make. > > > > Another part of the problem is that cdev->delayed_status doesn't count > > things properly. That is, its value is incremented each time a function > > driver asks for a delayed status and decremented each time a function > > driver calls usb_composite_setup_continue(), and the delayed status > > response is sent when the value reaches 0. But there's nothing to stop > > this from happening (imagine a gadget with two functions A and B): > > > > Function driver A asks for delayed status; > > Function driver A calls setup_continue(): Now the value > > of the counter is 0 so a status message is queued > > too early; > > Function driver B asks for delayed status; > > Function driver B calls setup_continue(): Now a second > > status message is queued. > > > > I'm willing to help fix these issues, but I need assistance from someone > > who fully understands the composite framework. > > > > Hi Alan & Kyungtae, > > I quite not understand why this occurs, since cdev->delayed_status's > increment and decrement are both protected by cdev->lock. > > cdev->delayed_status's increment: > > Place 1: > case USB_REQ_GET_CONFIGURATION: > spin_lock(&cdev->lock); > set_config(cdev, ctrl, w_value); > f->set_alt; > cdev->delayed_status++; > > spin_unlock(&cdev->lock); > > Place 2: > case USB_REQ_SET_INTERFACE: > spin_lock(&cdev->lock); > value = f->set_alt(f, w_index, w_value); > if (value == USB_GADGET_DELAYED_STATUS) { > DBG(cdev, > "%s: interface %d (%s) requested delayed status\n", > __func__, intf, f->name); > cdev->delayed_status++; > DBG(cdev, "delayed_status count %d\n", > cdev->delayed_status); > } > spin_unlock(&cdev->lock); > > cdev->delayed_status's decrement: > function: usb_composite_setup_continue which called by fsg_main_thread > due to FSG_STATE_CONFIG_CHANGE. > > spin_lock_irqsave(&cdev->lock, flags); > > if (cdev->delayed_status == 0) { > WARN(cdev, "%s: Unexpected call\n", __func__); > > } else if (--cdev->delayed_status == 0) { > ... > > spin_unlock_irqrestore(&cdev->lock, flags); You are right. I didn't follow the call changes enough to see that set_config and reset_config are always called with cdev->lock held. However, the other problem outlined in my earlier email still remains. But now I understand what's happening well enough to write a patch. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-10 15:56 ` Alan Stern 2020-11-11 7:59 ` Peter Chen @ 2020-11-11 19:47 ` Alan Stern 2020-11-12 9:01 ` Peter Chen 1 sibling, 1 reply; 11+ messages in thread From: Alan Stern @ 2020-11-11 19:47 UTC (permalink / raw) To: Peter Chen, Kyungtae Kim; +Cc: Felipe Balbi, Greg KH, USB list, syzkaller On Tue, Nov 10, 2020 at 10:56:50AM -0500, Alan Stern wrote: > Felipe: > > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote: > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > > of syzkaller). > > > > (corrected analysis) > > This bug happens while continuing a delayed setup message in mass > > storage gadget. > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via > > fsg_set_alt() (line 1793), > > and followed by cdev->delayed_status++ (line 1798). > > Meanwile, the mass gadget tries check cdev->delayed_status == 0 > > through handle_exception() (line 2428), > > which occurs in between the two operations above. > > Such a race causes invalid operations eventually. > > Do you know who maintains composite.c (or the composite framework) these > days? This is a real race, and it needs to be fixed. > > Part of the problem seems to be that cdev->delayed_status is sometimes > accessed without the protection of cdev->lock. But I don't know when it > is safe to take that lock, so I can't tell what changes to make. > > Another part of the problem is that cdev->delayed_status doesn't count > things properly. That is, its value is incremented each time a function > driver asks for a delayed status and decremented each time a function > driver calls usb_composite_setup_continue(), and the delayed status > response is sent when the value reaches 0. But there's nothing to stop > this from happening (imagine a gadget with two functions A and B): > > Function driver A asks for delayed status; > Function driver A calls setup_continue(): Now the value > of the counter is 0 so a status message is queued > too early; > Function driver B asks for delayed status; > Function driver B calls setup_continue(): Now a second > status message is queued. > > I'm willing to help fix these issues, but I need assistance from someone > who fully understands the composite framework. Okay, so as Peter pointed out, these comments were wrong. The cdev lock is held and the scenario described above can't occur, because the lock isn't released until after both functions have asked for delayed status. This means that Kyungtae's analysis of the FuzzUSB report is wrong, and we still don't know what really happened. Here's my guess... There's still a possible race, although it's a different one: The gadget's delayed status reply can race with the host timing out and sending a new SETUP packet: Host sends SETUP packet A Function receives A and decides to send a delayed status reply Function thread starts to process packet A Host times out waiting for A status Host sends new SETUP packet B Composite core receives packet B and resets cdev->delayed_status Function thread finishes and calls usb_composite_setup_continue() The composite core sends a status reply for packet A, not packet B Host receives status for A but thinks it is the status for B! Function thread processes packet B Function thread finishes and calls usb_composite_setup_continue() The composite core sees cdev->delayed_status == 0 and WARNs. At the moment I don't see how to prevent this sort of race from happening. We may need to change the API, giving the composite core a way to match up calls to usb_composite_setup_continue() with the corresponding call to composite_setup(). But even that wouldn't fix the entire problem. Suggestions? Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-11 19:47 ` Alan Stern @ 2020-11-12 9:01 ` Peter Chen 2020-11-12 15:59 ` Alan Stern 0 siblings, 1 reply; 11+ messages in thread From: Peter Chen @ 2020-11-12 9:01 UTC (permalink / raw) To: Alan Stern; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On 20-11-11 14:47:10, Alan Stern wrote: > On Tue, Nov 10, 2020 at 10:56:50AM -0500, Alan Stern wrote: > > Felipe: > > > > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote: > > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version > > > of syzkaller). > > > > > > (corrected analysis) > > > This bug happens while continuing a delayed setup message in mass > > > storage gadget. > > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via > > > fsg_set_alt() (line 1793), > > > and followed by cdev->delayed_status++ (line 1798). > > > Meanwile, the mass gadget tries check cdev->delayed_status == 0 > > > through handle_exception() (line 2428), > > > which occurs in between the two operations above. > > > Such a race causes invalid operations eventually. > > > > Do you know who maintains composite.c (or the composite framework) these > > days? This is a real race, and it needs to be fixed. > > > > Part of the problem seems to be that cdev->delayed_status is sometimes > > accessed without the protection of cdev->lock. But I don't know when it > > is safe to take that lock, so I can't tell what changes to make. > > > > Another part of the problem is that cdev->delayed_status doesn't count > > things properly. That is, its value is incremented each time a function > > driver asks for a delayed status and decremented each time a function > > driver calls usb_composite_setup_continue(), and the delayed status > > response is sent when the value reaches 0. But there's nothing to stop > > this from happening (imagine a gadget with two functions A and B): > > > > Function driver A asks for delayed status; > > Function driver A calls setup_continue(): Now the value > > of the counter is 0 so a status message is queued > > too early; > > Function driver B asks for delayed status; > > Function driver B calls setup_continue(): Now a second > > status message is queued. > > > > I'm willing to help fix these issues, but I need assistance from someone > > who fully understands the composite framework. > > Okay, so as Peter pointed out, these comments were wrong. The cdev lock > is held and the scenario described above can't occur, because the lock > isn't released until after both functions have asked for delayed status. > > This means that Kyungtae's analysis of the FuzzUSB report is wrong, and > we still don't know what really happened. Here's my guess... > > There's still a possible race, although it's a different one: The > gadget's delayed status reply can race with the host timing out and > sending a new SETUP packet: > > Host sends SETUP packet A > > Function receives A and decides > to send a delayed status reply > > Function thread starts to > process packet A > > Host times out waiting for A status > > Host sends new SETUP packet B > > Composite core receives packet B > and resets cdev->delayed_status resets cdev->delayed_status? Where to do that? Even the host re-try the control transfer, it should send the same control request, eg, SET_CONFIGURATION, and increase cdev->delayed_status. There is a description for possible host sends next control request before receiving status packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences If a Setup transaction is received by an endpoint before a previously initiated control transfer is completed, the device must abort the current transfer/operation and handle the new control Setup transaction. A Setup transaction should not normally be sent before the completion of a previous control transfer. However, if a transfer is aborted, for example, due to errors on the bus, the host can send the next Setup transaction prematurely from the endpoint’s perspective. > > Function thread finishes and calls > usb_composite_setup_continue() > > The composite core sends a status > reply for packet A, not packet B > > Host receives status for A but thinks > it is the status for B! > > Function thread processes packet B > > Function thread finishes and calls > usb_composite_setup_continue() > > The composite core sees > cdev->delayed_status == 0 and WARNs. > > At the moment I don't see how to prevent this sort of race from > happening. We may need to change the API, giving the composite core a > way to match up calls to usb_composite_setup_continue() with the > corresponding call to composite_setup(). But even that wouldn't fix > the entire problem. > Hi Alan, I more think a possible reset or disconnect occurrence between them, and composite_disconnect is called. Kyungtae, would you please help test below code? Besides, would you tell me the test environments during FuzzUSB test? A really host or a host emulated host? Thanks. diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c index c6d455f2bb92..4a8cc1277206 100644 --- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -2454,7 +2454,8 @@ void usb_composite_setup_continue(struct usb_composite_dev *cdev) spin_lock_irqsave(&cdev->lock, flags); if (cdev->delayed_status == 0) { - WARN(cdev, "%s: Unexpected call\n", __func__); + WARN(cdev, "%s: Unexpected call, %s\n", __func__, + cdev->config ? "connecting state" : "already disconnected"); } else if (--cdev->delayed_status == 0) { DBG(cdev, "%s: Completing delayed status\n", __func__); -- Thanks, Peter Chen ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-12 9:01 ` Peter Chen @ 2020-11-12 15:59 ` Alan Stern 2020-11-13 10:02 ` Peter Chen 0 siblings, 1 reply; 11+ messages in thread From: Alan Stern @ 2020-11-12 15:59 UTC (permalink / raw) To: Peter Chen, g; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On Thu, Nov 12, 2020 at 09:01:11AM +0000, Peter Chen wrote: > On 20-11-11 14:47:10, Alan Stern wrote: > > There's still a possible race, although it's a different one: The > > gadget's delayed status reply can race with the host timing out and > > sending a new SETUP packet: > > > > Host sends SETUP packet A > > > > Function receives A and decides > > to send a delayed status reply > > > > Function thread starts to > > process packet A > > > > Host times out waiting for A status > > > > Host sends new SETUP packet B > > > > Composite core receives packet B > > and resets cdev->delayed_status > > resets cdev->delayed_status? Where to do that? Sorry, for some reason I though I read a line that set delayed_status to 0 in composite_setup(). I must have been fantasizing... > Even the host re-try the > control transfer, it should send the same control request, eg, > SET_CONFIGURATION, and increase cdev->delayed_status. There is a description > for possible host sends next control request before receiving status > packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences > > If a Setup transaction is received by an endpoint before a previously > initiated control transfer is completed, > the device must abort the current transfer/operation and > handle the new control Setup transaction. A Setup > transaction should not normally be sent before the completion > of a previous control transfer. However, if a > transfer is aborted, for example, due to errors on the bus, > the host can send the next Setup transaction > prematurely from the endpoint’s perspective. Yes. The composite core does not abort the current transfer/operation when a new Setup transaction occurs. But it should -- and it should set delayed_status back to 0. (Maybe this is what I was fantasizing.) > > Function thread finishes and calls > > usb_composite_setup_continue() > > > > The composite core sends a status > > reply for packet A, not packet B > > > > Host receives status for A but thinks > > it is the status for B! > > > > Function thread processes packet B > > > > Function thread finishes and calls > > usb_composite_setup_continue() > > > > The composite core sees > > cdev->delayed_status == 0 and WARNs. > > > > At the moment I don't see how to prevent this sort of race from > > happening. We may need to change the API, giving the composite core a > > way to match up calls to usb_composite_setup_continue() with the > > corresponding call to composite_setup(). But even that wouldn't fix > > the entire problem. > > > > Hi Alan, > > I more think a possible reset or disconnect occurrence between them, and > composite_disconnect is called. That sounds reasonable. This is why I think we will need to change the API. There has to be a way to tell usb_composite_setup_continue() which SETUP it is being called for. A new SETUP or a disconnect should invalidate the old SETUP, and then usb_composite_setup_continue() would ignore any calls that were for the old SETUP packet. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-12 15:59 ` Alan Stern @ 2020-11-13 10:02 ` Peter Chen 2020-11-13 17:00 ` Alan Stern 0 siblings, 1 reply; 11+ messages in thread From: Peter Chen @ 2020-11-13 10:02 UTC (permalink / raw) To: Alan Stern; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On 20-11-12 10:59:05, Alan Stern wrote: > On Thu, Nov 12, 2020 at 09:01:11AM +0000, Peter Chen wrote: > > On 20-11-11 14:47:10, Alan Stern wrote: > > > There's still a possible race, although it's a different one: The > > > gadget's delayed status reply can race with the host timing out and > > > sending a new SETUP packet: > > > > > > Host sends SETUP packet A > > > > > > Function receives A and decides > > > to send a delayed status reply > > > > > > Function thread starts to > > > process packet A > > > > > > Host times out waiting for A status > > > > > > Host sends new SETUP packet B > > > > > > Composite core receives packet B > > > and resets cdev->delayed_status > > > > resets cdev->delayed_status? Where to do that? > > Sorry, for some reason I though I read a line that set delayed_status to > 0 in composite_setup(). I must have been fantasizing... > > > Even the host re-try the > > control transfer, it should send the same control request, eg, > > SET_CONFIGURATION, and increase cdev->delayed_status. There is a description > > for possible host sends next control request before receiving status > > packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences > > > > If a Setup transaction is received by an endpoint before a previously > > initiated control transfer is completed, > > the device must abort the current transfer/operation and > > handle the new control Setup transaction. A Setup > > transaction should not normally be sent before the completion > > of a previous control transfer. However, if a > > transfer is aborted, for example, due to errors on the bus, > > the host can send the next Setup transaction > > prematurely from the endpoint’s perspective. > > Yes. The composite core does not abort the current transfer/operation > when a new Setup transaction occurs. But it should -- and it should set > delayed_status back to 0. (Maybe this is what I was fantasizing.) > > > > Function thread finishes and calls > > > usb_composite_setup_continue() > > > > > > The composite core sends a status > > > reply for packet A, not packet B > > > > > > Host receives status for A but thinks > > > it is the status for B! > > > > > > Function thread processes packet B > > > > > > Function thread finishes and calls > > > usb_composite_setup_continue() > > > > > > The composite core sees > > > cdev->delayed_status == 0 and WARNs. > > > > > > At the moment I don't see how to prevent this sort of race from > > > happening. We may need to change the API, giving the composite core a > > > way to match up calls to usb_composite_setup_continue() with the > > > corresponding call to composite_setup(). But even that wouldn't fix > > > the entire problem. > > > > > > > Hi Alan, > > > > I more think a possible reset or disconnect occurrence between them, and > > composite_disconnect is called. > > That sounds reasonable. > > This is why I think we will need to change the API. There has to be a > way to tell usb_composite_setup_continue() which SETUP it is being > called for. A new SETUP or a disconnect should invalidate the old > SETUP, and then usb_composite_setup_continue() would ignore any calls > that were for the old SETUP packet. > I think usb_composite_setup_continue logic has no issue, we only need to consider if changing WARN to DBG is necessary, FuzzUSB seems to be trigger panic for WARN. See below message on its trace log > Kernel panic - not syncing: panic_on_warn set ... For new SETUP, current composite layer makes sure the pending request will not get the status since the stage request is only queued when cdev->delayed_status is 1. But the UDC driver should make sure no new request if old request has not finished, consider the corner case that the new SETUP comes after the pending request calls usb_composite_setup_continue and queues the status stage, then, the two zero-length packets from device will be the bus, but host only wants get one. Meanwhile, there is only one request for composite device (see: usb_composite_dev.req), it means the composite layer can't handle multiple ep0 request. For disconnect event, it is an unexpected event between SETUP(DATA) stage and status stage, and usb_composite_setup_continue just does nothing since the bus has already not at configured state. -- Thanks, Peter Chen ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-13 10:02 ` Peter Chen @ 2020-11-13 17:00 ` Alan Stern 2020-11-16 10:02 ` Peter Chen 0 siblings, 1 reply; 11+ messages in thread From: Alan Stern @ 2020-11-13 17:00 UTC (permalink / raw) To: Peter Chen; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On Fri, Nov 13, 2020 at 10:02:58AM +0000, Peter Chen wrote: > On 20-11-12 10:59:05, Alan Stern wrote: > > This is why I think we will need to change the API. There has to be a > > way to tell usb_composite_setup_continue() which SETUP it is being > > called for. A new SETUP or a disconnect should invalidate the old > > SETUP, and then usb_composite_setup_continue() would ignore any calls > > that were for the old SETUP packet. > > > > I think usb_composite_setup_continue logic has no issue, we only need to > consider if changing WARN to DBG is necessary, FuzzUSB seems to be > trigger panic for WARN. Yes, changing the WARN to DBG will make FuzzUSB happy. But I still think there is a logical flaw in the design of the API. > See below message on its trace log > > Kernel panic - not syncing: panic_on_warn set ... > > > For new SETUP, current composite layer makes sure the pending request > will not get the status since the stage request is only queued when > cdev->delayed_status is 1. > > But the UDC driver should make sure no new > request if old request has not finished, consider the corner case that > the new SETUP comes after the pending request calls usb_composite_setup_continue > and queues the status stage, then, the two zero-length packets from device > will be the bus, but host only wants get one. Meanwhile, there is only one > request for composite device (see: usb_composite_dev.req), it means the > composite layer can't handle multiple ep0 request. That's right. Unfortunately, I think fixing this will require changes to the UDC drivers as well as to the composite core. Similar to what I said earlier, there will have to be a way for the composite core to tell the UDC driver which SETUP packet the zero-length status reply is meant for. > For disconnect event, it is an unexpected event between SETUP(DATA) stage > and status stage, and usb_composite_setup_continue just does nothing > since the bus has already not at configured state. Yes. But there still is another problem in the API. Suppose the host sends a Set-Interface request, and the function driver is not able to handle it (maybe a memory allocation fails). The gadget should report this failure to the host by STALLing ep0. But there is no way for the function driver to tell the composite core that the failure occurred! You can see this problem in f_mass_storage.c. If do_set_interface() encounters an error, it will return a negative error code. But the caller (handle_exception()) ignores the return code! When Dave Brownell designed the Gadget and Composite APIs, he really did not give sufficient thought to the issues involved in delayed responses to control-OUT transfers. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-13 17:00 ` Alan Stern @ 2020-11-16 10:02 ` Peter Chen 2020-11-16 16:00 ` Alan Stern 0 siblings, 1 reply; 11+ messages in thread From: Peter Chen @ 2020-11-16 10:02 UTC (permalink / raw) To: Alan Stern; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On 20-11-13 12:00:15, Alan Stern wrote: > On Fri, Nov 13, 2020 at 10:02:58AM +0000, Peter Chen wrote: > > On 20-11-12 10:59:05, Alan Stern wrote: > > > This is why I think we will need to change the API. There has to be a > > > way to tell usb_composite_setup_continue() which SETUP it is being > > > called for. A new SETUP or a disconnect should invalidate the old > > > SETUP, and then usb_composite_setup_continue() would ignore any calls > > > that were for the old SETUP packet. > > > > > > > I think usb_composite_setup_continue logic has no issue, we only need to > > consider if changing WARN to DBG is necessary, FuzzUSB seems to be > > trigger panic for WARN. > > Yes, changing the WARN to DBG will make FuzzUSB happy. But I still > think there is a logical flaw in the design of the API. > > > See below message on its trace log > > > Kernel panic - not syncing: panic_on_warn set ... > > > > > > For new SETUP, current composite layer makes sure the pending request > > will not get the status since the stage request is only queued when > > cdev->delayed_status is 1. > > > > But the UDC driver should make sure no new > > request if old request has not finished, consider the corner case that > > the new SETUP comes after the pending request calls usb_composite_setup_continue > > and queues the status stage, then, the two zero-length packets from device > > will be the bus, but host only wants get one. Meanwhile, there is only one > > request for composite device (see: usb_composite_dev.req), it means the > > composite layer can't handle multiple ep0 request. > > That's right. Unfortunately, I think fixing this will require changes > to the UDC drivers as well as to the composite core. Similar to what I > said earlier, there will have to be a way for the composite core to tell > the UDC driver which SETUP packet the zero-length status reply is meant > for. It needs the composite layer to support multiple requests for EP0, the effort is big. It is better the real problem is reported, then, it has environment to test the solution. The reported FuzzUSB issue is not this issue. > > > For disconnect event, it is an unexpected event between SETUP(DATA) stage > > and status stage, and usb_composite_setup_continue just does nothing > > since the bus has already not at configured state. > > Yes. But there still is another problem in the API. > > Suppose the host sends a Set-Interface request, and the function driver > is not able to handle it (maybe a memory allocation fails). The gadget > should report this failure to the host by STALLing ep0. But there is no > way for the function driver to tell the composite core that the failure > occurred! > > You can see this problem in f_mass_storage.c. If do_set_interface() > encounters an error, it will return a negative error code. But the > caller (handle_exception()) ignores the return code! > > When Dave Brownell designed the Gadget and Composite APIs, he really did > not give sufficient thought to the issues involved in delayed responses > to control-OUT transfers. > We could add one parameter for usb_composite_setup_continue to indicate error occurred during function's deferred operation, and stall the ep0 at usb_composite_setup_continue, do you think so? -- Thanks, Peter Chen ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: WARNING in usb_composite_setup_continue 2020-11-16 10:02 ` Peter Chen @ 2020-11-16 16:00 ` Alan Stern 0 siblings, 0 replies; 11+ messages in thread From: Alan Stern @ 2020-11-16 16:00 UTC (permalink / raw) To: Peter Chen; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller On Mon, Nov 16, 2020 at 10:02:22AM +0000, Peter Chen wrote: > On 20-11-13 12:00:15, Alan Stern wrote: > > That's right. Unfortunately, I think fixing this will require changes > > to the UDC drivers as well as to the composite core. Similar to what I > > said earlier, there will have to be a way for the composite core to tell > > the UDC driver which SETUP packet the zero-length status reply is meant > > for. > > It needs the composite layer to support multiple requests for EP0, the > effort is big. It is better the real problem is reported, then, it has > environment to test the solution. The reported FuzzUSB issue is not this issue. WE don't really need to support multiple ep0 requests. We just need to know whether a particular response was meant for the most recent ep0 transfer or for an earlier one. For example, the UDC driver could keep a counter of SETUP packets, and it could pass this counter to the composite core's setup routine. The status reply request could include the corresponding counter value, and the UDC driver could then ignore replies that are not meant for the most recent SETUP. Of course, this would be a big API change, affecting all UDC and gadget drivers. Maybe you can think of a better way to fix the problem. > > Suppose the host sends a Set-Interface request, and the function driver > > is not able to handle it (maybe a memory allocation fails). The gadget > > should report this failure to the host by STALLing ep0. But there is no > > way for the function driver to tell the composite core that the failure > > occurred! > > > > You can see this problem in f_mass_storage.c. If do_set_interface() > > encounters an error, it will return a negative error code. But the > > caller (handle_exception()) ignores the return code! > > > > When Dave Brownell designed the Gadget and Composite APIs, he really did > > not give sufficient thought to the issues involved in delayed responses > > to control-OUT transfers. > > > > We could add one parameter for usb_composite_setup_continue to indicate > error occurred during function's deferred operation, and stall the ep0 > at usb_composite_setup_continue, do you think so? That would work only if the function driver also had a way to tell the composite core which control transfer failed. The basic problem is the race between a control status response and arrival of a new SETUP packet. This race exists between the UDC driver and the gadget driver (that is, the composiste core), and between the composite core and the function drivers. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-11-16 16:00 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CAEAjamsxe9OuMVpHfox3w57HtGsE3mPXOty9bdXW-iPdx=TXMA@mail.gmail.com> 2020-11-09 19:29 ` WARNING in usb_composite_setup_continue Kyungtae Kim 2020-11-10 15:56 ` Alan Stern 2020-11-11 7:59 ` Peter Chen 2020-11-11 15:57 ` Alan Stern 2020-11-11 19:47 ` Alan Stern 2020-11-12 9:01 ` Peter Chen 2020-11-12 15:59 ` Alan Stern 2020-11-13 10:02 ` Peter Chen 2020-11-13 17:00 ` Alan Stern 2020-11-16 10:02 ` Peter Chen 2020-11-16 16:00 ` Alan Stern
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.