All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: WARNING in usb_composite_setup_continue
       [not found] <CAEAjamsxe9OuMVpHfox3w57HtGsE3mPXOty9bdXW-iPdx=TXMA@mail.gmail.com>
@ 2020-11-09 19:29 ` Kyungtae Kim
  2020-11-10 15:56   ` Alan Stern
  0 siblings, 1 reply; 11+ messages in thread
From: Kyungtae Kim @ 2020-11-09 19:29 UTC (permalink / raw)
  To: Felipe Balbi, Greg KH; +Cc: USB list, LKML, syzkaller

We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
of syzkaller).

(corrected analysis)
This bug happens while continuing a delayed setup message in mass
storage gadget.
To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
fsg_set_alt() (line 1793),
and followed by cdev->delayed_status++ (line 1798).
Meanwile, the mass gadget tries  check cdev->delayed_status == 0
through handle_exception() (line 2428),
which occurs in between the two operations above.
Such a race causes invalid operations eventually.

==================================================================
usb_composite_setup_continue: Unexpected call
WARNING: CPU: 1 PID: 1882 at drivers/usb/gadget/composite.c:2457
usb_composite_setup_continue+0x1c7/0x220
drivers/usb/gadget/composite.c:2457
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 1882 Comm: file-storage Not tainted 5.8.13 #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xa7/0xea lib/dump_stack.c:118
 panic+0x298/0x521 kernel/panic.c:231
 __warn.cold.12+0x25/0x32 kernel/panic.c:600
 report_bug+0x1b2/0x260 lib/bug.c:198
 handle_bug+0x43/0x70 arch/x86/kernel/traps.c:235
 exc_invalid_op+0x18/0x40 arch/x86/kernel/traps.c:255
 asm_exc_invalid_op+0x12/0x20 ./arch/x86/include/asm/idtentry.h:540
RIP: 0010:usb_composite_setup_continue+0x1c7/0x220
drivers/usb/gadget/composite.c:2457
Code: 62 48 8b 7b 58 4c 89 fe e8 66 fb ff ff e9 e6 fe ff ff e8 6c 89
87 fd 48 c7 c6 60 c9 41 86 48 c7 c7 60 c6 41 86 e8 78 0c 5c fd <0f> 0b
e9 c7 fe ff ff e8 3d a4 ae fd e9 a3 fe ff ff e8 53 a4 ae fd
RSP: 0000:ffff8880446dfd38 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff888069eee4b0 RCX: 0000000000000000
RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed10088dbf9d
RBP: ffff8880446dfd60 R08: ffffed100d953fd2 R09: ffffed100d953fd2
R10: ffff88806ca9fe8b R11: ffffed100d953fd1 R12: 0000000000000000
R13: ffff888069eee540 R14: 0000000000000246 R15: ffff888040609310
 handle_exception drivers/usb/gadget/function/f_mass_storage.c:2428 [inline]
 fsg_main_thread+0x12f4/0x59f1 drivers/usb/gadget/function/f_mass_storage.c:2466
 kthread+0x374/0x480 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
==================================================================

Regards,
Kyungtae Kim

On Mon, Nov 9, 2020 at 2:08 PM Kyungtae Kim <kt0755@gmail.com> wrote:
>
> We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> of syzkaller).
>
> The bug happens when the freed object tty->port is accessed in
> tty_init_dev (line 1358).
> It seems tty->port is freed during the locking (i.e., tty_ldisc_lock)
> ahead (line 1355).
>
> ==================================================================
> usb_composite_setup_continue: Unexpected call
> WARNING: CPU: 1 PID: 1882 at drivers/usb/gadget/composite.c:2457 usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 1882 Comm: file-storage Not tainted 5.8.13 #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0xa7/0xea lib/dump_stack.c:118
>  panic+0x298/0x521 kernel/panic.c:231
>  __warn.cold.12+0x25/0x32 kernel/panic.c:600
>  report_bug+0x1b2/0x260 lib/bug.c:198
>  handle_bug+0x43/0x70 arch/x86/kernel/traps.c:235
>  exc_invalid_op+0x18/0x40 arch/x86/kernel/traps.c:255
>  asm_exc_invalid_op+0x12/0x20 ./arch/x86/include/asm/idtentry.h:540
> RIP: 0010:usb_composite_setup_continue+0x1c7/0x220 drivers/usb/gadget/composite.c:2457
> Code: 62 48 8b 7b 58 4c 89 fe e8 66 fb ff ff e9 e6 fe ff ff e8 6c 89 87 fd 48 c7 c6 60 c9 41 86 48 c7 c7 60 c6 41 86 e8 78 0c 5c fd <0f> 0b e9 c7 fe ff ff e8 3d a4 ae fd e9 a3 fe ff ff e8 53 a4 ae fd
> RSP: 0000:ffff8880446dfd38 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: ffff888069eee4b0 RCX: 0000000000000000
> RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed10088dbf9d
> RBP: ffff8880446dfd60 R08: ffffed100d953fd2 R09: ffffed100d953fd2
> R10: ffff88806ca9fe8b R11: ffffed100d953fd1 R12: 0000000000000000
> R13: ffff888069eee540 R14: 0000000000000246 R15: ffff888040609310
>  handle_exception drivers/usb/gadget/function/f_mass_storage.c:2428 [inline]
>  fsg_main_thread+0x12f4/0x59f1 drivers/usb/gadget/function/f_mass_storage.c:2466
>  kthread+0x374/0x480 kernel/kthread.c:291
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> ==================================================================
>
> Regards,
> Kyungtae Kim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-09 19:29 ` WARNING in usb_composite_setup_continue Kyungtae Kim
@ 2020-11-10 15:56   ` Alan Stern
  2020-11-11  7:59     ` Peter Chen
  2020-11-11 19:47     ` Alan Stern
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Stern @ 2020-11-10 15:56 UTC (permalink / raw)
  To: Kyungtae Kim; +Cc: Felipe Balbi, Greg KH, USB list, syzkaller

Felipe:

On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote:
> We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> of syzkaller).
> 
> (corrected analysis)
> This bug happens while continuing a delayed setup message in mass
> storage gadget.
> To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
> fsg_set_alt() (line 1793),
> and followed by cdev->delayed_status++ (line 1798).
> Meanwile, the mass gadget tries  check cdev->delayed_status == 0
> through handle_exception() (line 2428),
> which occurs in between the two operations above.
> Such a race causes invalid operations eventually.

Do you know who maintains composite.c (or the composite framework) these 
days?  This is a real race, and it needs to be fixed.

Part of the problem seems to be that cdev->delayed_status is sometimes 
accessed without the protection of cdev->lock.  But I don't know when it 
is safe to take that lock, so I can't tell what changes to make.

Another part of the problem is that cdev->delayed_status doesn't count 
things properly.  That is, its value is incremented each time a function 
driver asks for a delayed status and decremented each time a function 
driver calls usb_composite_setup_continue(), and the delayed status 
response is sent when the value reaches 0.  But there's nothing to stop 
this from happening (imagine a gadget with two functions A and B):

	Function driver A asks for delayed status;
	Function driver A calls setup_continue(): Now the value
		of the counter is 0 so a status message is queued
		too early;
	Function driver B asks for delayed status;
	Function driver B calls setup_continue(): Now a second
		status message is queued.

I'm willing to help fix these issues, but I need assistance from someone 
who fully understands the composite framework.

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-10 15:56   ` Alan Stern
@ 2020-11-11  7:59     ` Peter Chen
  2020-11-11 15:57       ` Alan Stern
  2020-11-11 19:47     ` Alan Stern
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Chen @ 2020-11-11  7:59 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On 20-11-10 10:56:50, Alan Stern wrote:
> Felipe:
> 
> On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote:
> > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> > of syzkaller).
> > 
> > (corrected analysis)
> > This bug happens while continuing a delayed setup message in mass
> > storage gadget.
> > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
> > fsg_set_alt() (line 1793),
> > and followed by cdev->delayed_status++ (line 1798).
> > Meanwile, the mass gadget tries  check cdev->delayed_status == 0
> > through handle_exception() (line 2428),
> > which occurs in between the two operations above.
> > Such a race causes invalid operations eventually.
> 
> Do you know who maintains composite.c (or the composite framework) these 
> days?  This is a real race, and it needs to be fixed.
> 
> Part of the problem seems to be that cdev->delayed_status is sometimes 
> accessed without the protection of cdev->lock.  But I don't know when it 
> is safe to take that lock, so I can't tell what changes to make.
> 
> Another part of the problem is that cdev->delayed_status doesn't count 
> things properly.  That is, its value is incremented each time a function 
> driver asks for a delayed status and decremented each time a function 
> driver calls usb_composite_setup_continue(), and the delayed status 
> response is sent when the value reaches 0.  But there's nothing to stop 
> this from happening (imagine a gadget with two functions A and B):
> 
> 	Function driver A asks for delayed status;
> 	Function driver A calls setup_continue(): Now the value
> 		of the counter is 0 so a status message is queued
> 		too early;
> 	Function driver B asks for delayed status;
> 	Function driver B calls setup_continue(): Now a second
> 		status message is queued.
> 
> I'm willing to help fix these issues, but I need assistance from someone 
> who fully understands the composite framework.
> 

Hi Alan & Kyungtae,

I quite not understand why this occurs, since cdev->delayed_status's
increment and decrement are both protected by cdev->lock.

cdev->delayed_status's increment:

Place 1:
case USB_REQ_GET_CONFIGURATION:
spin_lock(&cdev->lock);
set_config(cdev, ctrl, w_value);
	f->set_alt;
	cdev->delayed_status++;

spin_unlock(&cdev->lock);

Place 2:
case USB_REQ_SET_INTERFACE:
spin_lock(&cdev->lock);
value = f->set_alt(f, w_index, w_value);
if (value == USB_GADGET_DELAYED_STATUS) {
	DBG(cdev,
	 "%s: interface %d (%s) requested delayed status\n",
			__func__, intf, f->name);
	cdev->delayed_status++;
	DBG(cdev, "delayed_status count %d\n",
			cdev->delayed_status);
}
spin_unlock(&cdev->lock);

cdev->delayed_status's decrement:
function: usb_composite_setup_continue which called by fsg_main_thread
due to FSG_STATE_CONFIG_CHANGE.

spin_lock_irqsave(&cdev->lock, flags);

	if (cdev->delayed_status == 0) {
		WARN(cdev, "%s: Unexpected call\n", __func__);

	} else if (--cdev->delayed_status == 0) {
		...

spin_unlock_irqrestore(&cdev->lock, flags);
-- 

Thanks,
Peter Chen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-11  7:59     ` Peter Chen
@ 2020-11-11 15:57       ` Alan Stern
  0 siblings, 0 replies; 11+ messages in thread
From: Alan Stern @ 2020-11-11 15:57 UTC (permalink / raw)
  To: Peter Chen; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On Wed, Nov 11, 2020 at 07:59:34AM +0000, Peter Chen wrote:
> On 20-11-10 10:56:50, Alan Stern wrote:
> > Felipe:
> > 
> > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote:
> > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> > > of syzkaller).
> > > 
> > > (corrected analysis)
> > > This bug happens while continuing a delayed setup message in mass
> > > storage gadget.
> > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
> > > fsg_set_alt() (line 1793),
> > > and followed by cdev->delayed_status++ (line 1798).
> > > Meanwile, the mass gadget tries  check cdev->delayed_status == 0
> > > through handle_exception() (line 2428),
> > > which occurs in between the two operations above.
> > > Such a race causes invalid operations eventually.
> > 
> > Do you know who maintains composite.c (or the composite framework) these 
> > days?  This is a real race, and it needs to be fixed.
> > 
> > Part of the problem seems to be that cdev->delayed_status is sometimes 
> > accessed without the protection of cdev->lock.  But I don't know when it 
> > is safe to take that lock, so I can't tell what changes to make.
> > 
> > Another part of the problem is that cdev->delayed_status doesn't count 
> > things properly.  That is, its value is incremented each time a function 
> > driver asks for a delayed status and decremented each time a function 
> > driver calls usb_composite_setup_continue(), and the delayed status 
> > response is sent when the value reaches 0.  But there's nothing to stop 
> > this from happening (imagine a gadget with two functions A and B):
> > 
> > 	Function driver A asks for delayed status;
> > 	Function driver A calls setup_continue(): Now the value
> > 		of the counter is 0 so a status message is queued
> > 		too early;
> > 	Function driver B asks for delayed status;
> > 	Function driver B calls setup_continue(): Now a second
> > 		status message is queued.
> > 
> > I'm willing to help fix these issues, but I need assistance from someone 
> > who fully understands the composite framework.
> > 
> 
> Hi Alan & Kyungtae,
> 
> I quite not understand why this occurs, since cdev->delayed_status's
> increment and decrement are both protected by cdev->lock.
> 
> cdev->delayed_status's increment:
> 
> Place 1:
> case USB_REQ_GET_CONFIGURATION:
> spin_lock(&cdev->lock);
> set_config(cdev, ctrl, w_value);
> 	f->set_alt;
> 	cdev->delayed_status++;
> 
> spin_unlock(&cdev->lock);
> 
> Place 2:
> case USB_REQ_SET_INTERFACE:
> spin_lock(&cdev->lock);
> value = f->set_alt(f, w_index, w_value);
> if (value == USB_GADGET_DELAYED_STATUS) {
> 	DBG(cdev,
> 	 "%s: interface %d (%s) requested delayed status\n",
> 			__func__, intf, f->name);
> 	cdev->delayed_status++;
> 	DBG(cdev, "delayed_status count %d\n",
> 			cdev->delayed_status);
> }
> spin_unlock(&cdev->lock);
> 
> cdev->delayed_status's decrement:
> function: usb_composite_setup_continue which called by fsg_main_thread
> due to FSG_STATE_CONFIG_CHANGE.
> 
> spin_lock_irqsave(&cdev->lock, flags);
> 
> 	if (cdev->delayed_status == 0) {
> 		WARN(cdev, "%s: Unexpected call\n", __func__);
> 
> 	} else if (--cdev->delayed_status == 0) {
> 		...
> 
> spin_unlock_irqrestore(&cdev->lock, flags);

You are right.  I didn't follow the call changes enough to see that 
set_config and reset_config are always called with cdev->lock held.

However, the other problem outlined in my earlier email still remains.  
But now I understand what's happening well enough to write a patch.

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-10 15:56   ` Alan Stern
  2020-11-11  7:59     ` Peter Chen
@ 2020-11-11 19:47     ` Alan Stern
  2020-11-12  9:01       ` Peter Chen
  1 sibling, 1 reply; 11+ messages in thread
From: Alan Stern @ 2020-11-11 19:47 UTC (permalink / raw)
  To: Peter Chen, Kyungtae Kim; +Cc: Felipe Balbi, Greg KH, USB list, syzkaller

On Tue, Nov 10, 2020 at 10:56:50AM -0500, Alan Stern wrote:
> Felipe:
> 
> On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote:
> > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> > of syzkaller).
> > 
> > (corrected analysis)
> > This bug happens while continuing a delayed setup message in mass
> > storage gadget.
> > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
> > fsg_set_alt() (line 1793),
> > and followed by cdev->delayed_status++ (line 1798).
> > Meanwile, the mass gadget tries  check cdev->delayed_status == 0
> > through handle_exception() (line 2428),
> > which occurs in between the two operations above.
> > Such a race causes invalid operations eventually.
> 
> Do you know who maintains composite.c (or the composite framework) these 
> days?  This is a real race, and it needs to be fixed.
> 
> Part of the problem seems to be that cdev->delayed_status is sometimes 
> accessed without the protection of cdev->lock.  But I don't know when it 
> is safe to take that lock, so I can't tell what changes to make.
> 
> Another part of the problem is that cdev->delayed_status doesn't count 
> things properly.  That is, its value is incremented each time a function 
> driver asks for a delayed status and decremented each time a function 
> driver calls usb_composite_setup_continue(), and the delayed status 
> response is sent when the value reaches 0.  But there's nothing to stop 
> this from happening (imagine a gadget with two functions A and B):
> 
> 	Function driver A asks for delayed status;
> 	Function driver A calls setup_continue(): Now the value
> 		of the counter is 0 so a status message is queued
> 		too early;
> 	Function driver B asks for delayed status;
> 	Function driver B calls setup_continue(): Now a second
> 		status message is queued.
> 
> I'm willing to help fix these issues, but I need assistance from someone 
> who fully understands the composite framework.

Okay, so as Peter pointed out, these comments were wrong.  The cdev lock 
is held and the scenario described above can't occur, because the lock 
isn't released until after both functions have asked for delayed status.

This means that Kyungtae's analysis of the FuzzUSB report is wrong, and 
we still don't know what really happened.  Here's my guess...

There's still a possible race, although it's a different one: The 
gadget's delayed status reply can race with the host timing out and 
sending a new SETUP packet:

	Host sends SETUP packet A

	Function receives A and decides
	to send a delayed status reply

					Function thread starts to
					process packet A

	Host times out waiting for A status

	Host sends new SETUP packet B

	Composite core receives packet B
	and resets cdev->delayed_status

					Function thread finishes and calls
					usb_composite_setup_continue()

					The composite core sends a status
					reply for packet A, not packet B

	Host receives status for A but thinks
	it is the status for B!

					Function thread processes packet B

					Function thread finishes and calls
					usb_composite_setup_continue()

					The composite core sees
					cdev->delayed_status == 0 and WARNs.

At the moment I don't see how to prevent this sort of race from 
happening.  We may need to change the API, giving the composite core a 
way to match up calls to usb_composite_setup_continue() with the 
corresponding call to composite_setup().  But even that wouldn't fix 
the entire problem.

Suggestions?

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-11 19:47     ` Alan Stern
@ 2020-11-12  9:01       ` Peter Chen
  2020-11-12 15:59         ` Alan Stern
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Chen @ 2020-11-12  9:01 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On 20-11-11 14:47:10, Alan Stern wrote:
> On Tue, Nov 10, 2020 at 10:56:50AM -0500, Alan Stern wrote:
> > Felipe:
> > 
> > On Mon, Nov 09, 2020 at 02:29:42PM -0500, Kyungtae Kim wrote:
> > > We report a bug (in linux-5.8.13) found by FuzzUSB (a modified version
> > > of syzkaller).
> > > 
> > > (corrected analysis)
> > > This bug happens while continuing a delayed setup message in mass
> > > storage gadget.
> > > To be specific, composite_setup() sets FSG_STATE_CONFIG_CHANGE via
> > > fsg_set_alt() (line 1793),
> > > and followed by cdev->delayed_status++ (line 1798).
> > > Meanwile, the mass gadget tries  check cdev->delayed_status == 0
> > > through handle_exception() (line 2428),
> > > which occurs in between the two operations above.
> > > Such a race causes invalid operations eventually.
> > 
> > Do you know who maintains composite.c (or the composite framework) these 
> > days?  This is a real race, and it needs to be fixed.
> > 
> > Part of the problem seems to be that cdev->delayed_status is sometimes 
> > accessed without the protection of cdev->lock.  But I don't know when it 
> > is safe to take that lock, so I can't tell what changes to make.
> > 
> > Another part of the problem is that cdev->delayed_status doesn't count 
> > things properly.  That is, its value is incremented each time a function 
> > driver asks for a delayed status and decremented each time a function 
> > driver calls usb_composite_setup_continue(), and the delayed status 
> > response is sent when the value reaches 0.  But there's nothing to stop 
> > this from happening (imagine a gadget with two functions A and B):
> > 
> > 	Function driver A asks for delayed status;
> > 	Function driver A calls setup_continue(): Now the value
> > 		of the counter is 0 so a status message is queued
> > 		too early;
> > 	Function driver B asks for delayed status;
> > 	Function driver B calls setup_continue(): Now a second
> > 		status message is queued.
> > 
> > I'm willing to help fix these issues, but I need assistance from someone 
> > who fully understands the composite framework.
> 
> Okay, so as Peter pointed out, these comments were wrong.  The cdev lock 
> is held and the scenario described above can't occur, because the lock 
> isn't released until after both functions have asked for delayed status.
> 
> This means that Kyungtae's analysis of the FuzzUSB report is wrong, and 
> we still don't know what really happened.  Here's my guess...
> 
> There's still a possible race, although it's a different one: The 
> gadget's delayed status reply can race with the host timing out and 
> sending a new SETUP packet:
> 
> 	Host sends SETUP packet A
> 
> 	Function receives A and decides
> 	to send a delayed status reply
> 
> 					Function thread starts to
> 					process packet A
> 
> 	Host times out waiting for A status
> 
> 	Host sends new SETUP packet B
> 
> 	Composite core receives packet B
> 	and resets cdev->delayed_status

resets cdev->delayed_status? Where to do that? Even the host re-try the
control transfer, it should send the same control request, eg,
SET_CONFIGURATION, and increase cdev->delayed_status. There is a description
for possible host sends next control request before receiving status
packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences

	  If a Setup transaction is received by an endpoint before a previously
	  initiated control transfer is completed,
	  the device must abort the current transfer/operation and
	  handle the new control Setup transaction. A Setup
	  transaction should not normally be sent before the completion
	  of a previous control transfer. However, if a
	  transfer is aborted, for example, due to errors on the bus,
	  the host can send the next Setup transaction
	  prematurely from the endpoint’s perspective.

> 
> 					Function thread finishes and calls
> 					usb_composite_setup_continue()
> 
> 					The composite core sends a status
> 					reply for packet A, not packet B
> 
> 	Host receives status for A but thinks
> 	it is the status for B!
> 
> 					Function thread processes packet B
> 
> 					Function thread finishes and calls
> 					usb_composite_setup_continue()
> 
> 					The composite core sees
> 					cdev->delayed_status == 0 and WARNs.
> 
> At the moment I don't see how to prevent this sort of race from 
> happening.  We may need to change the API, giving the composite core a 
> way to match up calls to usb_composite_setup_continue() with the 
> corresponding call to composite_setup().  But even that wouldn't fix 
> the entire problem.
> 

Hi Alan,

I more think a possible reset or disconnect occurrence between them, and
composite_disconnect is called. Kyungtae, would you please help test below code?
Besides, would you tell me the test environments during FuzzUSB test? A really host or a
host emulated host? Thanks.

diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index c6d455f2bb92..4a8cc1277206 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -2454,7 +2454,8 @@ void usb_composite_setup_continue(struct usb_composite_dev *cdev)
 	spin_lock_irqsave(&cdev->lock, flags);
 
 	if (cdev->delayed_status == 0) {
-		WARN(cdev, "%s: Unexpected call\n", __func__);
+		WARN(cdev, "%s: Unexpected call, %s\n", __func__,
+		cdev->config ? "connecting state" : "already disconnected");
 
 	} else if (--cdev->delayed_status == 0) {
 		DBG(cdev, "%s: Completing delayed status\n", __func__);
-- 

Thanks,
Peter Chen

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-12  9:01       ` Peter Chen
@ 2020-11-12 15:59         ` Alan Stern
  2020-11-13 10:02           ` Peter Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Stern @ 2020-11-12 15:59 UTC (permalink / raw)
  To: Peter Chen, g; +Cc: Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On Thu, Nov 12, 2020 at 09:01:11AM +0000, Peter Chen wrote:
> On 20-11-11 14:47:10, Alan Stern wrote:
> > There's still a possible race, although it's a different one: The 
> > gadget's delayed status reply can race with the host timing out and 
> > sending a new SETUP packet:
> > 
> > 	Host sends SETUP packet A
> > 
> > 	Function receives A and decides
> > 	to send a delayed status reply
> > 
> > 					Function thread starts to
> > 					process packet A
> > 
> > 	Host times out waiting for A status
> > 
> > 	Host sends new SETUP packet B
> > 
> > 	Composite core receives packet B
> > 	and resets cdev->delayed_status
> 
> resets cdev->delayed_status? Where to do that?

Sorry, for some reason I though I read a line that set delayed_status to 
0 in composite_setup().  I must have been fantasizing...

>  Even the host re-try the
> control transfer, it should send the same control request, eg,
> SET_CONFIGURATION, and increase cdev->delayed_status. There is a description
> for possible host sends next control request before receiving status
> packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences
> 
> 	  If a Setup transaction is received by an endpoint before a previously
> 	  initiated control transfer is completed,
> 	  the device must abort the current transfer/operation and
> 	  handle the new control Setup transaction. A Setup
> 	  transaction should not normally be sent before the completion
> 	  of a previous control transfer. However, if a
> 	  transfer is aborted, for example, due to errors on the bus,
> 	  the host can send the next Setup transaction
> 	  prematurely from the endpoint’s perspective.

Yes.  The composite core does not abort the current transfer/operation 
when a new Setup transaction occurs.  But it should -- and it should set 
delayed_status back to 0.  (Maybe this is what I was fantasizing.)

> > 					Function thread finishes and calls
> > 					usb_composite_setup_continue()
> > 
> > 					The composite core sends a status
> > 					reply for packet A, not packet B
> > 
> > 	Host receives status for A but thinks
> > 	it is the status for B!
> > 
> > 					Function thread processes packet B
> > 
> > 					Function thread finishes and calls
> > 					usb_composite_setup_continue()
> > 
> > 					The composite core sees
> > 					cdev->delayed_status == 0 and WARNs.
> > 
> > At the moment I don't see how to prevent this sort of race from 
> > happening.  We may need to change the API, giving the composite core a 
> > way to match up calls to usb_composite_setup_continue() with the 
> > corresponding call to composite_setup().  But even that wouldn't fix 
> > the entire problem.
> > 
> 
> Hi Alan,
> 
> I more think a possible reset or disconnect occurrence between them, and
> composite_disconnect is called.

That sounds reasonable.

This is why I think we will need to change the API.  There has to be a 
way to tell usb_composite_setup_continue() which SETUP it is being 
called for.  A new SETUP or a disconnect should invalidate the old 
SETUP, and then usb_composite_setup_continue() would ignore any calls 
that were for the old SETUP packet.

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-12 15:59         ` Alan Stern
@ 2020-11-13 10:02           ` Peter Chen
  2020-11-13 17:00             ` Alan Stern
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Chen @ 2020-11-13 10:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On 20-11-12 10:59:05, Alan Stern wrote:
> On Thu, Nov 12, 2020 at 09:01:11AM +0000, Peter Chen wrote:
> > On 20-11-11 14:47:10, Alan Stern wrote:
> > > There's still a possible race, although it's a different one: The 
> > > gadget's delayed status reply can race with the host timing out and 
> > > sending a new SETUP packet:
> > > 
> > > 	Host sends SETUP packet A
> > > 
> > > 	Function receives A and decides
> > > 	to send a delayed status reply
> > > 
> > > 					Function thread starts to
> > > 					process packet A
> > > 
> > > 	Host times out waiting for A status
> > > 
> > > 	Host sends new SETUP packet B
> > > 
> > > 	Composite core receives packet B
> > > 	and resets cdev->delayed_status
> > 
> > resets cdev->delayed_status? Where to do that?
> 
> Sorry, for some reason I though I read a line that set delayed_status to 
> 0 in composite_setup().  I must have been fantasizing...
> 
> >  Even the host re-try the
> > control transfer, it should send the same control request, eg,
> > SET_CONFIGURATION, and increase cdev->delayed_status. There is a description
> > for possible host sends next control request before receiving status
> > packet at USB 2.0 Spec, CH 5.5.5 Control Transfer Data Sequences
> > 
> > 	  If a Setup transaction is received by an endpoint before a previously
> > 	  initiated control transfer is completed,
> > 	  the device must abort the current transfer/operation and
> > 	  handle the new control Setup transaction. A Setup
> > 	  transaction should not normally be sent before the completion
> > 	  of a previous control transfer. However, if a
> > 	  transfer is aborted, for example, due to errors on the bus,
> > 	  the host can send the next Setup transaction
> > 	  prematurely from the endpoint’s perspective.
> 
> Yes.  The composite core does not abort the current transfer/operation 
> when a new Setup transaction occurs.  But it should -- and it should set 
> delayed_status back to 0.  (Maybe this is what I was fantasizing.)
> 
> > > 					Function thread finishes and calls
> > > 					usb_composite_setup_continue()
> > > 
> > > 					The composite core sends a status
> > > 					reply for packet A, not packet B
> > > 
> > > 	Host receives status for A but thinks
> > > 	it is the status for B!
> > > 
> > > 					Function thread processes packet B
> > > 
> > > 					Function thread finishes and calls
> > > 					usb_composite_setup_continue()
> > > 
> > > 					The composite core sees
> > > 					cdev->delayed_status == 0 and WARNs.
> > > 
> > > At the moment I don't see how to prevent this sort of race from 
> > > happening.  We may need to change the API, giving the composite core a 
> > > way to match up calls to usb_composite_setup_continue() with the 
> > > corresponding call to composite_setup().  But even that wouldn't fix 
> > > the entire problem.
> > > 
> > 
> > Hi Alan,
> > 
> > I more think a possible reset or disconnect occurrence between them, and
> > composite_disconnect is called.
> 
> That sounds reasonable.
> 
> This is why I think we will need to change the API.  There has to be a 
> way to tell usb_composite_setup_continue() which SETUP it is being 
> called for.  A new SETUP or a disconnect should invalidate the old 
> SETUP, and then usb_composite_setup_continue() would ignore any calls 
> that were for the old SETUP packet.
> 

I think usb_composite_setup_continue logic has no issue, we only need to
consider if changing WARN to DBG is necessary, FuzzUSB seems to be
trigger panic for WARN.

See below message on its trace log
> Kernel panic - not syncing: panic_on_warn set ...


For new SETUP, current composite layer makes sure the pending request
will not get the status since the stage request is only queued when
cdev->delayed_status is 1. But the UDC driver should make sure no new
request if old request has not finished, consider the corner case that
the new SETUP comes after the pending request calls usb_composite_setup_continue
and queues the status stage, then, the two zero-length packets from device
will be the bus, but host only wants get one. Meanwhile, there is only one
request for composite device (see: usb_composite_dev.req), it means the
composite layer can't handle multiple ep0 request.

For disconnect event, it is an unexpected event between SETUP(DATA) stage
and status stage, and usb_composite_setup_continue just does nothing
since the bus has already not at configured state.

-- 

Thanks,
Peter Chen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-13 10:02           ` Peter Chen
@ 2020-11-13 17:00             ` Alan Stern
  2020-11-16 10:02               ` Peter Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Stern @ 2020-11-13 17:00 UTC (permalink / raw)
  To: Peter Chen; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On Fri, Nov 13, 2020 at 10:02:58AM +0000, Peter Chen wrote:
> On 20-11-12 10:59:05, Alan Stern wrote:
> > This is why I think we will need to change the API.  There has to be a 
> > way to tell usb_composite_setup_continue() which SETUP it is being 
> > called for.  A new SETUP or a disconnect should invalidate the old 
> > SETUP, and then usb_composite_setup_continue() would ignore any calls 
> > that were for the old SETUP packet.
> > 
> 
> I think usb_composite_setup_continue logic has no issue, we only need to
> consider if changing WARN to DBG is necessary, FuzzUSB seems to be
> trigger panic for WARN.

Yes, changing the WARN to DBG will make FuzzUSB happy.  But I still 
think there is a logical flaw in the design of the API.

> See below message on its trace log
> > Kernel panic - not syncing: panic_on_warn set ...
> 
> 
> For new SETUP, current composite layer makes sure the pending request
> will not get the status since the stage request is only queued when
> cdev->delayed_status is 1.
> 
> But the UDC driver should make sure no new
> request if old request has not finished, consider the corner case that
> the new SETUP comes after the pending request calls usb_composite_setup_continue
> and queues the status stage, then, the two zero-length packets from device
> will be the bus, but host only wants get one. Meanwhile, there is only one
> request for composite device (see: usb_composite_dev.req), it means the
> composite layer can't handle multiple ep0 request.

That's right.  Unfortunately, I think fixing this will require changes 
to the UDC drivers as well as to the composite core.  Similar to what I 
said earlier, there will have to be a way for the composite core to tell 
the UDC driver which SETUP packet the zero-length status reply is meant 
for.

> For disconnect event, it is an unexpected event between SETUP(DATA) stage
> and status stage, and usb_composite_setup_continue just does nothing
> since the bus has already not at configured state.

Yes.  But there still is another problem in the API.

Suppose the host sends a Set-Interface request, and the function driver 
is not able to handle it (maybe a memory allocation fails).  The gadget 
should report this failure to the host by STALLing ep0.  But there is no 
way for the function driver to tell the composite core that the failure 
occurred!

You can see this problem in f_mass_storage.c.  If do_set_interface() 
encounters an error, it will return a negative error code.  But the 
caller (handle_exception()) ignores the return code!

When Dave Brownell designed the Gadget and Composite APIs, he really did 
not give sufficient thought to the issues involved in delayed responses 
to control-OUT transfers.

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-13 17:00             ` Alan Stern
@ 2020-11-16 10:02               ` Peter Chen
  2020-11-16 16:00                 ` Alan Stern
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Chen @ 2020-11-16 10:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On 20-11-13 12:00:15, Alan Stern wrote:
> On Fri, Nov 13, 2020 at 10:02:58AM +0000, Peter Chen wrote:
> > On 20-11-12 10:59:05, Alan Stern wrote:
> > > This is why I think we will need to change the API.  There has to be a 
> > > way to tell usb_composite_setup_continue() which SETUP it is being 
> > > called for.  A new SETUP or a disconnect should invalidate the old 
> > > SETUP, and then usb_composite_setup_continue() would ignore any calls 
> > > that were for the old SETUP packet.
> > > 
> > 
> > I think usb_composite_setup_continue logic has no issue, we only need to
> > consider if changing WARN to DBG is necessary, FuzzUSB seems to be
> > trigger panic for WARN.
> 
> Yes, changing the WARN to DBG will make FuzzUSB happy.  But I still 
> think there is a logical flaw in the design of the API.
> 
> > See below message on its trace log
> > > Kernel panic - not syncing: panic_on_warn set ...
> > 
> > 
> > For new SETUP, current composite layer makes sure the pending request
> > will not get the status since the stage request is only queued when
> > cdev->delayed_status is 1.
> > 
> > But the UDC driver should make sure no new
> > request if old request has not finished, consider the corner case that
> > the new SETUP comes after the pending request calls usb_composite_setup_continue
> > and queues the status stage, then, the two zero-length packets from device
> > will be the bus, but host only wants get one. Meanwhile, there is only one
> > request for composite device (see: usb_composite_dev.req), it means the
> > composite layer can't handle multiple ep0 request.
> 
> That's right.  Unfortunately, I think fixing this will require changes 
> to the UDC drivers as well as to the composite core.  Similar to what I 
> said earlier, there will have to be a way for the composite core to tell 
> the UDC driver which SETUP packet the zero-length status reply is meant 
> for.

It needs the composite layer to support multiple requests for EP0, the
effort is big. It is better the real problem is reported, then, it has
environment to test the solution. The reported FuzzUSB issue is not this issue.

> 
> > For disconnect event, it is an unexpected event between SETUP(DATA) stage
> > and status stage, and usb_composite_setup_continue just does nothing
> > since the bus has already not at configured state.
> 
> Yes.  But there still is another problem in the API.
> 
> Suppose the host sends a Set-Interface request, and the function driver 
> is not able to handle it (maybe a memory allocation fails).  The gadget 
> should report this failure to the host by STALLing ep0.  But there is no 
> way for the function driver to tell the composite core that the failure 
> occurred!
> 
> You can see this problem in f_mass_storage.c.  If do_set_interface() 
> encounters an error, it will return a negative error code.  But the 
> caller (handle_exception()) ignores the return code!
> 
> When Dave Brownell designed the Gadget and Composite APIs, he really did 
> not give sufficient thought to the issues involved in delayed responses 
> to control-OUT transfers.
> 

We could add one parameter for usb_composite_setup_continue to indicate
error occurred during function's deferred operation, and stall the ep0
at usb_composite_setup_continue, do you think so?

-- 

Thanks,
Peter Chen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING in usb_composite_setup_continue
  2020-11-16 10:02               ` Peter Chen
@ 2020-11-16 16:00                 ` Alan Stern
  0 siblings, 0 replies; 11+ messages in thread
From: Alan Stern @ 2020-11-16 16:00 UTC (permalink / raw)
  To: Peter Chen; +Cc: g, Kyungtae Kim, Felipe Balbi, Greg KH, USB list, syzkaller

On Mon, Nov 16, 2020 at 10:02:22AM +0000, Peter Chen wrote:
> On 20-11-13 12:00:15, Alan Stern wrote:
> > That's right.  Unfortunately, I think fixing this will require changes 
> > to the UDC drivers as well as to the composite core.  Similar to what I 
> > said earlier, there will have to be a way for the composite core to tell 
> > the UDC driver which SETUP packet the zero-length status reply is meant 
> > for.
> 
> It needs the composite layer to support multiple requests for EP0, the
> effort is big. It is better the real problem is reported, then, it has
> environment to test the solution. The reported FuzzUSB issue is not this issue.

WE don't really need to support multiple ep0 requests.  We just need to 
know whether a particular response was meant for the most recent 
ep0 transfer or for an earlier one.

For example, the UDC driver could keep a counter of SETUP packets, and 
it could pass this counter to the composite core's setup routine.  The 
status reply request could include the corresponding counter value, and 
the UDC driver could then ignore replies that are not meant for the most 
recent SETUP.

Of course, this would be a big API change, affecting all UDC and gadget 
drivers.  Maybe you can think of a better way to fix the problem.

> > Suppose the host sends a Set-Interface request, and the function driver 
> > is not able to handle it (maybe a memory allocation fails).  The gadget 
> > should report this failure to the host by STALLing ep0.  But there is no 
> > way for the function driver to tell the composite core that the failure 
> > occurred!
> > 
> > You can see this problem in f_mass_storage.c.  If do_set_interface() 
> > encounters an error, it will return a negative error code.  But the 
> > caller (handle_exception()) ignores the return code!
> > 
> > When Dave Brownell designed the Gadget and Composite APIs, he really did 
> > not give sufficient thought to the issues involved in delayed responses 
> > to control-OUT transfers.
> > 
> 
> We could add one parameter for usb_composite_setup_continue to indicate
> error occurred during function's deferred operation, and stall the ep0
> at usb_composite_setup_continue, do you think so?

That would work only if the function driver also had a way to tell the 
composite core which control transfer failed.

The basic problem is the race between a control status response and 
arrival of a new SETUP packet.  This race exists between the UDC driver 
and the gadget driver (that is, the composiste core), and between the 
composite core and the function drivers.

Alan Stern

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-11-16 16:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAEAjamsxe9OuMVpHfox3w57HtGsE3mPXOty9bdXW-iPdx=TXMA@mail.gmail.com>
2020-11-09 19:29 ` WARNING in usb_composite_setup_continue Kyungtae Kim
2020-11-10 15:56   ` Alan Stern
2020-11-11  7:59     ` Peter Chen
2020-11-11 15:57       ` Alan Stern
2020-11-11 19:47     ` Alan Stern
2020-11-12  9:01       ` Peter Chen
2020-11-12 15:59         ` Alan Stern
2020-11-13 10:02           ` Peter Chen
2020-11-13 17:00             ` Alan Stern
2020-11-16 10:02               ` Peter Chen
2020-11-16 16:00                 ` Alan Stern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.