linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Virtual hub, resets etc...
@ 2019-07-04  5:52 Benjamin Herrenschmidt
  2019-07-04  8:02 ` f_mass_storage configuration races (Was: Virtual hub, resets etc...) Benjamin Herrenschmidt
  2019-07-04 16:13 ` Virtual hub, resets etc Alan Stern
  0 siblings, 2 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-04  5:52 UTC (permalink / raw)
  To: Alan Stern, Felipe Balbi; +Cc: linux-usb, Michal Nazarewicz

Hi Folks !

(Michal: Mass storage issue near the end...)

So I'd like to pick your brains on what you think is the best policy to
implement for this case:

The issue is around the Aspeed vhub driver which I wrote.

To recap, the HW shows on the host as a USB hub with 5 ports for which
I create five UDC for gadgets. The actual hub emulation is largely done
in SW with HW assist.

At the moment, the hub always pulls up, so it's always present on the
host when there's a host connected. So far so good (this is the subject
of another discussion).

When any of the child UDC pulls up, I show the USB_PORT_STAT_CONNECTION
on that port, so the host enumerates. At the moment,
USB_PORT_STAT_POWER is always set and I don't emulate power control.

The opposite with the child pulling down of course.

The interesting question however is how to react to events on the
upstream leg of the hub such as suspend, resume and USB reset. So far I
don't seem to be able to detect connection/disconnection but I'll dig a
bit more.

So far, I've tried to implement what I understand of the USB spec (I
might have misread) which consists of the following:

 - suspend: For each enabled port that isn't explicitely in
USB_PORT_STAT_SUSPEND state already (host initiated port suspend), call
the corresponding gadget suspend callback if any. I do NOT set
USB_PORT_STAT_SUSPEND in the port state.

 - resume: As above but with the resume callback

(Note: See below about issues with suspend)

 - bus reset: When I sense a bus reset, that's where I'm not too sure
what to do. Currently I clear all the status bits of the ports
except USB_PORT_STAT_SUSPEND. Thus I clear USB_PORT_STAT_ENABLE.
But I'm not sure what to do with the gadget. I currently call
the gadget suspend as "hinted" by the spec calling for S0 state iirc,
but I don't think it's the best thing to do, it doesn't make that much
sense... Should I do a gadget reset instead ? 

 - If the host clears USB_PORT_STAT_ENABLE, what should I do ? I
currently do a suspend as well, which isn't great... mostly it does
nothing and keep potentially the gadget trying to do stuff. I could
do a reset. I don't want to do a disconnect because we are still
connected to the hub so that's not really the right call, but at least
for composite it's the same thing...

Now, a few things i noticed while at it:

 - At some point I had code to reject EP queue() if the device is
suspended with -ESHUTDOWN. The end result was bad ... f_mass_storage
goes into an infinite loop of trying to queue the same stuff in
start_out_transfer() when that happens. It looks like it's not really
handling errors from queue() in a particularily useful way.

 - With my current code doing suspend/resume on bus resets, when I
reboot some hosts, and they re-enumerate, I tend to hit the WARN_ON
drivers/usb/gadget/function/f_mass_storage.c:341

static inline int __fsg_is_set(struct fsg_common *common,
			       const char *func, unsigned line)
{
	if (common->fsg)
		return 1;
	ERROR(common, "common->fsg is NULL in %s at %u\n", func, line);
	WARN_ON(1);
	return 0;
}

This happens a little while after a successul set_configuration. Here's
a trace:

[   94.918471] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 00/09/0001/0000/0000 [out] st=0
[   94.918487] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   94.918501] gadget: high-speed config #1: c
[   94.918522] gadget: raise_exception: 3 (was 0)
[   94.918543] gadget: set_config: interface 0 (Mass Storage Function) requested delayed status
[   94.918550] gadget: delayed_status count 1
[   94.918567] 1e6a0000.usb-vhub: port1:EP0 driver returned 32767
[   95.009831] 1e6a0000.usb-vhub: port1:EP1 Disabling !
[   95.014894] 1e6a0000.usb-vhub: port1:EP1 Nuking...
[   95.019706] 1e6a0000.usb-vhub: port1:EP1 Nuked 0 request(s)
[   95.025367] 1e6a0000.usb-vhub: port1:EP2 Disabling !
[   95.030426] 1e6a0000.usb-vhub: port1:EP2 Nuking...
[   95.035231] 1e6a0000.usb-vhub: port1:EP2 Nuked 0 request(s)
[   95.040882] gadget: usb_composite_setup_continue
[   95.045514] gadget: usb_composite_setup_continue: Completing delayed status
[   95.052571] gadget: handle_exception
[   95.056188] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0300/0000/0002 [in] st=0
[   95.056205] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.056234] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.064016] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0300/0000/0004 [in] st=0
[   95.064031] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.064056] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.070215] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0301/0409/0002 [in] st=0
[   95.070232] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.070261] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.075888] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0301/0409/0020 [in] st=0
[   95.075904] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.075932] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.083714] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0302/0409/0002 [in] st=0
[   95.083728] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.083755] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.089893] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0302/0409/0026 [in] st=0
[   95.089909] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.089937] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.095468] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0303/0409/0002 [in] st=0
[   95.095482] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.095506] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.103379] 1e6a0000.usb-vhub: port1:EP0 SETUP packet 80/06/0303/0409/0016 [in] st=0
[   95.103395] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.103422] 1e6a0000.usb-vhub: port1:EP0 driver returned 0
[   95.109558] 1e6a0000.usb-vhub: port1:EP0 SETUP packet a1/fe/0000/0000/0001 [in] st=0
[   95.109575] 1e6a0000.usb-vhub: port1:EP0 forwarding to gadget...
[   95.109590] gadget: common->fsg is NULL in fsg_setup at 489
[   95.109597] ------------[ cut here ]------------

I have to get my head around that code, but if one of you have a clue, I
would welcome it :-)

Interestingly it recovers. The host seems to then reset the prot, then reconfigure and
the second time around it all works fine.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* f_mass_storage configuration races (Was: Virtual hub, resets etc...)
  2019-07-04  5:52 Virtual hub, resets etc Benjamin Herrenschmidt
@ 2019-07-04  8:02 ` Benjamin Herrenschmidt
  2019-07-04  8:04   ` [PATCH] usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt Benjamin Herrenschmidt
  2019-07-04 16:13 ` Virtual hub, resets etc Alan Stern
  1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-04  8:02 UTC (permalink / raw)
  To: Alan Stern, Felipe Balbi; +Cc: linux-usb, Michal Nazarewicz

Sooo...

I think I found what's going on with some of the issues triggering
things like

	usb_composite_setup_continue: Unexpected call 

Or possibly

	gadget: common->fsg is NULL in fsg_setup at 489

But I mostly tracked down the former.

Fundamentally, it boils down to the storage going through multiple
attempts at FSG_STATE_CONFIG_CHANGE too quickly. In my case:

 - The hub port gets reset, this eventually calls fsg_disable

 - In the middle of handle_exception, we get a fsg_set_alt(), after
common->state is set back to FSG_STATE_NORMAL and before we get to
call do_set_interface (or inside it).

What happens is that not only new_fsg is indeterminate and possibly
racy (maybe not a huge deal per-se), but we end up in that
interesting situation where the handle_exception caused by fsg_disable
ends up applying "new_fsg" *and* calling usb_composite_setup_continue
because it sees new_fsg being set by fsg_set_alt.

But *then*, fsg_set_alt() also queues up a new exception. So we come
back a second time around. We call do_set_interface() again, which
resets everything for no reason, re-established the fsg and ... we call
usb_composite_setup_continue() again, this time completely at the wrong
time since there's nothing to continue.

I think the right fix is to replace that racy exception crap with a
little queue so we remove those races.

In the meantime however, I think the simpler patch that I'll send as
a reply to this works around it, provided the host doesn't do multiple
set_alt too quickly. The latter could be handled by setting new_fsg
with the lock used for the state, and reading it from that same lock.
But I haven't observed that problem in practice.

With this patch, I can now unplug & replug on my host solidly, this
wasn't the case before.

Cheers,
Ben.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt
  2019-07-04  8:02 ` f_mass_storage configuration races (Was: Virtual hub, resets etc...) Benjamin Herrenschmidt
@ 2019-07-04  8:04   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-04  8:04 UTC (permalink / raw)
  To: Alan Stern, Felipe Balbi; +Cc: linux-usb, Michal Nazarewicz

If fsg_disable() and fsg_set_alt() are called too closely to each
other (for example due to a quick reset/reconnect), what can happen
is that fsg_set_alt sets common->new_fsg from an interrupt while
handle_exception is trying to process the config change caused by
fsg_disable():

	fsg_disable()
	...
	handle_exception()
		sets state back to FSG_STATE_NORMAL
		hasn't yet called do_set_interface()
		or is inside it.

 ---> interrupt
	fsg_set_alt
		sets common->new_fsg
		queues a new FSG_STATE_CONFIG_CHANGE
 <---

Now, the first handle_exception can "see" the updated
new_fsg, treats it as if it was a fsg_set_alt() response,
call usb_composite_setup_continue() etc...

But then, the thread sees the second FSG_STATE_CONFIG_CHANGE,
and goes back down the same path, wipes and reattaches a now
active fsg, and .. calls usb_composite_setup_continue() which
at this point is wrong.

Not only we get a backtrace, but I suspect the second set_interface
wrecks some state causing the host to get upset in my case.

This fixes it by using a separate state for "clearing" the fsg
so that new_fsg isn't clobbered, and FSG_STATE_CONFIG_CHANGE
is only ever called for the real thing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/usb/gadget/function/f_mass_storage.c | 13 +++++++------
 drivers/usb/gadget/function/storage_common.h |  1 +
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
index 043f97ad8f22..72f12bdd0dae 100644
--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -2293,8 +2293,7 @@ static int fsg_set_alt(struct usb_function *f, unsigned intf, unsigned alt)
 static void fsg_disable(struct usb_function *f)
 {
 	struct fsg_dev *fsg = fsg_from_func(f);
-	fsg->common->new_fsg = NULL;
-	raise_exception(fsg->common, FSG_STATE_CONFIG_CHANGE);
+	raise_exception(fsg->common, FSG_STATE_CONFIG_CLEAR);
 }
 
 
@@ -2412,10 +2411,13 @@ static void handle_exception(struct fsg_common *common)
 		/*			SS_RESET_OCCURRED;  */
 		break;
 
+	case FSG_STATE_CONFIG_CLEAR:
+		do_set_interface(common, NULL);
+		break;
+
 	case FSG_STATE_CONFIG_CHANGE:
 		do_set_interface(common, common->new_fsg);
-		if (common->new_fsg)
-			usb_composite_setup_continue(common->cdev);
+		usb_composite_setup_continue(common->cdev);
 		break;
 
 	case FSG_STATE_EXIT:
@@ -2989,8 +2991,7 @@ static void fsg_unbind(struct usb_configuration *c, struct usb_function *f)
 
 	DBG(fsg, "unbind\n");
 	if (fsg->common->fsg == fsg) {
-		fsg->common->new_fsg = NULL;
-		raise_exception(fsg->common, FSG_STATE_CONFIG_CHANGE);
+		raise_exception(fsg->common, FSG_STATE_CONFIG_CLEAR);
 		/* FIXME: make interruptible or killable somehow? */
 		wait_event(common->fsg_wait, common->fsg != fsg);
 	}
diff --git a/drivers/usb/gadget/function/storage_common.h b/drivers/usb/gadget/function/storage_common.h
index e5e3a2553aaa..5f8cdc08ad85 100644
--- a/drivers/usb/gadget/function/storage_common.h
+++ b/drivers/usb/gadget/function/storage_common.h
@@ -160,6 +160,7 @@ enum fsg_state {
 	FSG_STATE_NORMAL,
 	FSG_STATE_ABORT_BULK_OUT,
 	FSG_STATE_PROTOCOL_RESET,
+	FSG_STATE_CONFIG_CLEAR,
 	FSG_STATE_CONFIG_CHANGE,
 	FSG_STATE_EXIT,
 	FSG_STATE_TERMINATED



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-04  5:52 Virtual hub, resets etc Benjamin Herrenschmidt
  2019-07-04  8:02 ` f_mass_storage configuration races (Was: Virtual hub, resets etc...) Benjamin Herrenschmidt
@ 2019-07-04 16:13 ` Alan Stern
  2019-07-04 21:44   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-04 16:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Thu, 4 Jul 2019, Benjamin Herrenschmidt wrote:

> Hi Folks !
> 
> (Michal: Mass storage issue near the end...)
> 
> So I'd like to pick your brains on what you think is the best policy to
> implement for this case:
> 
> The issue is around the Aspeed vhub driver which I wrote.
> 
> To recap, the HW shows on the host as a USB hub with 5 ports for which
> I create five UDC for gadgets. The actual hub emulation is largely done
> in SW with HW assist.
> 
> At the moment, the hub always pulls up, so it's always present on the
> host when there's a host connected. So far so good (this is the subject
> of another discussion).
> 
> When any of the child UDC pulls up, I show the USB_PORT_STAT_CONNECTION
> on that port, so the host enumerates. At the moment,
> USB_PORT_STAT_POWER is always set and I don't emulate power control.
> 
> The opposite with the child pulling down of course.
> 
> The interesting question however is how to react to events on the
> upstream leg of the hub such as suspend, resume and USB reset. So far I
> don't seem to be able to detect connection/disconnection but I'll dig a
> bit more.
> 
> So far, I've tried to implement what I understand of the USB spec (I
> might have misread) which consists of the following:
> 
>  - suspend: For each enabled port that isn't explicitely in
> USB_PORT_STAT_SUSPEND state already (host initiated port suspend), call
> the corresponding gadget suspend callback if any. I do NOT set
> USB_PORT_STAT_SUSPEND in the port state.
> 
>  - resume: As above but with the resume callback

That sounds right.

> (Note: See below about issues with suspend)
> 
>  - bus reset: When I sense a bus reset, that's where I'm not too sure
> what to do. Currently I clear all the status bits of the ports
> except USB_PORT_STAT_SUSPEND. Thus I clear USB_PORT_STAT_ENABLE.
> But I'm not sure what to do with the gadget. I currently call
> the gadget suspend as "hinted" by the spec calling for S0 state iirc,
> but I don't think it's the best thing to do, it doesn't make that much
> sense... Should I do a gadget reset instead ? 

You should also clear USB_PORT_STAT_SUSPEND.  Calling the gadget's
suspend routine (if the gadget isn't already suspended) is the right
thing to do; the spec says a USB device goes into suspend if it doesn't
receive any packets for a period of 3 (or 5? -- something like that)  
ms, and that certainly would be the case here.

>  - If the host clears USB_PORT_STAT_ENABLE, what should I do ? I
> currently do a suspend as well, which isn't great... mostly it does
> nothing and keep potentially the gadget trying to do stuff. I could
> do a reset. I don't want to do a disconnect because we are still
> connected to the hub so that's not really the right call, but at least
> for composite it's the same thing...

As above, doing a suspend is the right thing.

> Now, a few things i noticed while at it:
> 
>  - At some point I had code to reject EP queue() if the device is
> suspended with -ESHUTDOWN. The end result was bad ... f_mass_storage
> goes into an infinite loop of trying to queue the same stuff in
> start_out_transfer() when that happens. It looks like it's not really
> handling errors from queue() in a particularily useful way.

Don't reject EP queue requests.  Accept them as you would at any time;
they will complete after the port is resumed.

As for f_mass_storage, repeatedly attempting to queue an OUT transfer
is normal behavior.  The fact that one attempt gets an error doesn't
stop the driver from making more attempts; the only thing that would
stop it is being disabled by a config change, a suspend, a disconnect,
or an unbind.

>  - With my current code doing suspend/resume on bus resets, when I
> reboot some hosts, and they re-enumerate, I tend to hit the WARN_ON
> drivers/usb/gadget/function/f_mass_storage.c:341
> 
> static inline int __fsg_is_set(struct fsg_common *common,
> 			       const char *func, unsigned line)
> {
> 	if (common->fsg)
> 		return 1;
> 	ERROR(common, "common->fsg is NULL in %s at %u\n", func, line);
> 	WARN_ON(1);
> 	return 0;
> }
> 
> This happens a little while after a successul set_configuration. Here's
> a trace:

...

> I have to get my head around that code, but if one of you have a clue, I
> would welcome it :-)
> 
> Interestingly it recovers. The host seems to then reset the prot, then reconfigure and
> the second time around it all works fine.

I suspect this is related to the race you found.  EJ Hsu has been 
working on much the same thing (see the mailing list archive).

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-04 16:13 ` Virtual hub, resets etc Alan Stern
@ 2019-07-04 21:44   ` Benjamin Herrenschmidt
  2019-07-04 21:58     ` Benjamin Herrenschmidt
  2019-07-05  1:34     ` Alan Stern
  0 siblings, 2 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-04 21:44 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Thu, 2019-07-04 at 12:13 -0400, Alan Stern wrote:
> >   - bus reset: When I sense a bus reset, that's where I'm not too sure
> > what to do. Currently I clear all the status bits of the ports
> > except USB_PORT_STAT_SUSPEND. Thus I clear USB_PORT_STAT_ENABLE.
> > But I'm not sure what to do with the gadget. I currently call
> > the gadget suspend as "hinted" by the spec calling for S0 state iirc,
> > but I don't think it's the best thing to do, it doesn't make that much
> > sense... Should I do a gadget reset instead ? 
> 
> You should also clear USB_PORT_STAT_SUSPEND.  Calling the gadget's
> suspend routine (if the gadget isn't already suspended) is the right
> thing to do; the spec says a USB device goes into suspend if it doesn't
> receive any packets for a period of 3 (or 5? -- something like that)  
> ms, and that certainly would be the case here.
> 
> >   - If the host clears USB_PORT_STAT_ENABLE, what should I do ? I
> > currently do a suspend as well, which isn't great... mostly it does
> > nothing and keep potentially the gadget trying to do stuff. I could
> > do a reset. I don't want to do a disconnect because we are still
> > connected to the hub so that's not really the right call, but at least
> > for composite it's the same thing...
> 
> As above, doing a suspend is the right thing.

It is the right HW behaviour. But with our gadget stack, it doesn't
reset or cleanup anything. Though since the port gets disabled, I
suppose re-enabling it will cause a reset and will sort that out.

> > Now, a few things i noticed while at it:
> > 
> >   - At some point I had code to reject EP queue() if the device is
> > suspended with -ESHUTDOWN. The end result was bad ... f_mass_storage
> > goes into an infinite loop of trying to queue the same stuff in
> > start_out_transfer() when that happens. It looks like it's not really
> > handling errors from queue() in a particularily useful way.
> 
> Don't reject EP queue requests.  Accept them as you would at any time;
> they will complete after the port is resumed.

Except the suspend on a bus reset clears the port enable. You can't
resume from that, only reset the port no ? Or am I missing something ?

> As for f_mass_storage, repeatedly attempting to queue an OUT transfer
> is normal behavior.  The fact that one attempt gets an error doesn't
> stop the driver from making more attempts; the only thing that would
> stop it is being disabled by a config change, a suspend, a disconnect,
> or an unbind.

Except it does that in a tight loop and locks up the machine...

> >   - With my current code doing suspend/resume on bus resets, when I
> > reboot some hosts, and they re-enumerate, I tend to hit the WARN_ON
> > drivers/usb/gadget/function/f_mass_storage.c:341
> > 
> > static inline int __fsg_is_set(struct fsg_common *common,
> >                               const char *func, unsigned line)
> > {
> >        if (common->fsg)
> >                return 1;
> >        ERROR(common, "common->fsg is NULL in %s at %u\n", func, line);
> >        WARN_ON(1);
> >        return 0;
> > }
> > 
> > This happens a little while after a successul set_configuration. Here's
> > a trace:
> 
> ...
> 
> > I have to get my head around that code, but if one of you have a clue, I
> > would welcome it :-)
> > 
> > Interestingly it recovers. The host seems to then reset the prot, then reconfigure and
> > the second time around it all works fine.
> 
> I suspect this is related to the race you found.  EJ Hsu has been 
> working on much the same thing (see the mailing list archive).

Right. I debugged the race and produced the fix I posted *after* I had
change my code to do a reset rather than a suspend on the hub receiving
an upstream bus reset.

I will switch back to doing suspend instead and see whether that stays
fixed.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-04 21:44   ` Benjamin Herrenschmidt
@ 2019-07-04 21:58     ` Benjamin Herrenschmidt
  2019-07-05  1:37       ` Alan Stern
  2019-07-05  1:34     ` Alan Stern
  1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-04 21:58 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 2019-07-05 at 07:44 +1000, Benjamin Herrenschmidt wrote:
> > >    - At some point I had code to reject EP queue() if the device
> > > is
> > > suspended with -ESHUTDOWN. The end result was bad ...
> > > f_mass_storage
> > > goes into an infinite loop of trying to queue the same stuff in
> > > start_out_transfer() when that happens. It looks like it's not
> > > really
> > > handling errors from queue() in a particularily useful way.
> > 
> > Don't reject EP queue requests.  Accept them as you would at any
> > time;
> > they will complete after the port is resumed.
>
> Except the suspend on a bus reset clears the port enable. You can't
> resume from that, only reset the port no ? Or am I missing something
> ?

Talking of which... do we need this ?

--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1976,6 +1976,7 @@ void composite_disconnect(struct usb_gadget *gadget)
 	 * disconnect callbacks?
 	 */
 	spin_lock_irqsave(&cdev->lock, flags);
+	cdev->suspended = 0;
 	if (cdev->config)
 		reset_config(cdev);
 	if (cdev->driver->disconnect)

Otherwise with my vhub or with dummy_hcd, a suspend followed by a reset
will keep that stale suspended flag to 1 (which has no effect at the moment
but still...)

If yes, I'll submit a patch accordingly...

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-04 21:44   ` Benjamin Herrenschmidt
  2019-07-04 21:58     ` Benjamin Herrenschmidt
@ 2019-07-05  1:34     ` Alan Stern
  2019-07-05  2:08       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-05  1:34 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:

> On Thu, 2019-07-04 at 12:13 -0400, Alan Stern wrote:
> > >   - bus reset: When I sense a bus reset, that's where I'm not too sure
> > > what to do. Currently I clear all the status bits of the ports
> > > except USB_PORT_STAT_SUSPEND. Thus I clear USB_PORT_STAT_ENABLE.
> > > But I'm not sure what to do with the gadget. I currently call
> > > the gadget suspend as "hinted" by the spec calling for S0 state iirc,
> > > but I don't think it's the best thing to do, it doesn't make that much
> > > sense... Should I do a gadget reset instead ? 
> > 
> > You should also clear USB_PORT_STAT_SUSPEND.  Calling the gadget's
> > suspend routine (if the gadget isn't already suspended) is the right
> > thing to do; the spec says a USB device goes into suspend if it doesn't
> > receive any packets for a period of 3 (or 5? -- something like that)  
> > ms, and that certainly would be the case here.
> > 
> > >   - If the host clears USB_PORT_STAT_ENABLE, what should I do ? I
> > > currently do a suspend as well, which isn't great... mostly it does
> > > nothing and keep potentially the gadget trying to do stuff. I could
> > > do a reset. I don't want to do a disconnect because we are still
> > > connected to the hub so that's not really the right call, but at least
> > > for composite it's the same thing...
> > 
> > As above, doing a suspend is the right thing.
> 
> It is the right HW behaviour. But with our gadget stack, it doesn't
> reset or cleanup anything. Though since the port gets disabled, I
> suppose re-enabling it will cause a reset and will sort that out.

That's right.  (Except that you got the cause and effect reversed; 
resetting the port is what causes it to be re-enabled.  :-)

> > > Now, a few things i noticed while at it:
> > > 
> > >   - At some point I had code to reject EP queue() if the device is
> > > suspended with -ESHUTDOWN. The end result was bad ... f_mass_storage
> > > goes into an infinite loop of trying to queue the same stuff in
> > > start_out_transfer() when that happens. It looks like it's not really
> > > handling errors from queue() in a particularily useful way.
> > 
> > Don't reject EP queue requests.  Accept them as you would at any time;
> > they will complete after the port is resumed.
> 
> Except the suspend on a bus reset clears the port enable. You can't
> resume from that, only reset the port no ? Or am I missing something ?

You're right.  Nevertheless, the driver should accept queue requests, 
even if they will eventually fail.  It's what the gadget drivers 
expect.

> > As for f_mass_storage, repeatedly attempting to queue an OUT transfer
> > is normal behavior.  The fact that one attempt gets an error doesn't
> > stop the driver from making more attempts; the only thing that would
> > stop it is being disabled by a config change, a suspend, a disconnect,
> > or an unbind.
> 
> Except it does that in a tight loop and locks up the machine...

Well, that wouldn't happen if your UDC accepted the requests, right?  

Besides, f_mass_storage won't repeatedly try to queue an OUT transfer 
once it knows that it is suspended.

Alan Stern

> > >   - With my current code doing suspend/resume on bus resets, when I
> > > reboot some hosts, and they re-enumerate, I tend to hit the WARN_ON
> > > drivers/usb/gadget/function/f_mass_storage.c:341
> > > 
> > > static inline int __fsg_is_set(struct fsg_common *common,
> > >                               const char *func, unsigned line)
> > > {
> > >        if (common->fsg)
> > >                return 1;
> > >        ERROR(common, "common->fsg is NULL in %s at %u\n", func, line);
> > >        WARN_ON(1);
> > >        return 0;
> > > }
> > > 
> > > This happens a little while after a successul set_configuration. Here's
> > > a trace:
> > 
> > ...
> > 
> > > I have to get my head around that code, but if one of you have a clue, I
> > > would welcome it :-)
> > > 
> > > Interestingly it recovers. The host seems to then reset the prot, then reconfigure and
> > > the second time around it all works fine.
> > 
> > I suspect this is related to the race you found.  EJ Hsu has been 
> > working on much the same thing (see the mailing list archive).
> 
> Right. I debugged the race and produced the fix I posted *after* I had
> change my code to do a reset rather than a suspend on the hub receiving
> an upstream bus reset.
> 
> I will switch back to doing suspend instead and see whether that stays
> fixed.
> 
> Cheers,
> Ben.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-04 21:58     ` Benjamin Herrenschmidt
@ 2019-07-05  1:37       ` Alan Stern
  2019-07-05  2:15         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-05  1:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:

> On Fri, 2019-07-05 at 07:44 +1000, Benjamin Herrenschmidt wrote:
> > > >    - At some point I had code to reject EP queue() if the device
> > > > is
> > > > suspended with -ESHUTDOWN. The end result was bad ...
> > > > f_mass_storage
> > > > goes into an infinite loop of trying to queue the same stuff in
> > > > start_out_transfer() when that happens. It looks like it's not
> > > > really
> > > > handling errors from queue() in a particularily useful way.
> > > 
> > > Don't reject EP queue requests.  Accept them as you would at any
> > > time;
> > > they will complete after the port is resumed.
> >
> > Except the suspend on a bus reset clears the port enable. You can't
> > resume from that, only reset the port no ? Or am I missing something
> > ?
> 
> Talking of which... do we need this ?
> 
> --- a/drivers/usb/gadget/composite.c
> +++ b/drivers/usb/gadget/composite.c
> @@ -1976,6 +1976,7 @@ void composite_disconnect(struct usb_gadget *gadget)
>  	 * disconnect callbacks?
>  	 */
>  	spin_lock_irqsave(&cdev->lock, flags);
> +	cdev->suspended = 0;
>  	if (cdev->config)
>  		reset_config(cdev);
>  	if (cdev->driver->disconnect)
> 
> Otherwise with my vhub or with dummy_hcd, a suspend followed by a reset
> will keep that stale suspended flag to 1 (which has no effect at the moment
> but still...)
> 
> If yes, I'll submit a patch accordingly...

According to the USB spec, a host is not supposed to reset a suspended 
port (it's supposed to resume the port and then do the reset).  But of 
course this can happen anyway, so we should handle it properly.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05  1:34     ` Alan Stern
@ 2019-07-05  2:08       ` Benjamin Herrenschmidt
  2019-07-05 14:08         ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-05  2:08 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Thu, 2019-07-04 at 21:34 -0400, Alan Stern wrote:
> > It is the right HW behaviour. But with our gadget stack, it doesn't
> > reset or cleanup anything. Though since the port gets disabled, I
> > suppose re-enabling it will cause a reset and will sort that out.
> 
> That's right.  (Except that you got the cause and effect reversed; 
> resetting the port is what causes it to be re-enabled.  :-)

Right.

> > > > Now, a few things i noticed while at it:
> > > > 
> > > >   - At some point I had code to reject EP queue() if the device is
> > > > suspended with -ESHUTDOWN. The end result was bad ... f_mass_storage
> > > > goes into an infinite loop of trying to queue the same stuff in
> > > > start_out_transfer() when that happens. It looks like it's not really
> > > > handling errors from queue() in a particularily useful way.
> > > 
> > > Don't reject EP queue requests.  Accept them as you would at any time;
> > > they will complete after the port is resumed.
> > 
> > Except the suspend on a bus reset clears the port enable. You can't
> > resume from that, only reset the port no ? Or am I missing something ?
> 
> You're right.  Nevertheless, the driver should accept queue requests, 
> even if they will eventually fail.  It's what the gadget drivers 
> expect.

Ok. Things seem to work. I went back to suspend on bus reset, and with
some other fixes I did to my enable/reset logic and the mass storage
fix I posted yesterday it seems to be solid. Yay !

> > > As for f_mass_storage, repeatedly attempting to queue an OUT transfer
> > > is normal behavior.  The fact that one attempt gets an error doesn't
> > > stop the driver from making more attempts; the only thing that would
> > > stop it is being disabled by a config change, a suspend, a disconnect,
> > > or an unbind.
> > 
> > Except it does that in a tight loop and locks up the machine...
> 
> Well, that wouldn't happen if your UDC accepted the requests, right?  

Sure but it would be nice if the mass storage dealt with -ESHUTDOWN
properly and stopped :-) Or other errors... if the UDC HW for example
dies for some reason, mass storage will lockup.

> Besides, f_mass_storage won't repeatedly try to queue an OUT transfer 
> once it knows that it is suspended.

Not afaik. But I might have missed something. I didn't see any suspend
callback in f_mass_storage.c...

Cheers,
Ben.

> Alan Stern
> 
> > > >   - With my current code doing suspend/resume on bus resets, when I
> > > > reboot some hosts, and they re-enumerate, I tend to hit the WARN_ON
> > > > drivers/usb/gadget/function/f_mass_storage.c:341
> > > > 
> > > > static inline int __fsg_is_set(struct fsg_common *common,
> > > >                               const char *func, unsigned line)
> > > > {
> > > >        if (common->fsg)
> > > >                return 1;
> > > >        ERROR(common, "common->fsg is NULL in %s at %u\n", func, line);
> > > >        WARN_ON(1);
> > > >        return 0;
> > > > }
> > > > 
> > > > This happens a little while after a successul set_configuration. Here's
> > > > a trace:
> > > 
> > > ...
> > > 
> > > > I have to get my head around that code, but if one of you have a clue, I
> > > > would welcome it :-)
> > > > 
> > > > Interestingly it recovers. The host seems to then reset the prot, then reconfigure and
> > > > the second time around it all works fine.
> > > 
> > > I suspect this is related to the race you found.  EJ Hsu has been 
> > > working on much the same thing (see the mailing list archive).
> > 
> > Right. I debugged the race and produced the fix I posted *after* I had
> > change my code to do a reset rather than a suspend on the hub receiving
> > an upstream bus reset.
> > 
> > I will switch back to doing suspend instead and see whether that stays
> > fixed.
> > 
> > Cheers,
> > Ben.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05  1:37       ` Alan Stern
@ 2019-07-05  2:15         ` Benjamin Herrenschmidt
  2019-07-05 14:20           ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-05  2:15 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Thu, 2019-07-04 at 21:37 -0400, Alan Stern wrote:
> > 
> > Talking of which... do we need this ?
> > 
> > --- a/drivers/usb/gadget/composite.c
> > +++ b/drivers/usb/gadget/composite.c
> > @@ -1976,6 +1976,7 @@ void composite_disconnect(struct usb_gadget *gadget)
> >         * disconnect callbacks?
> >         */
> >        spin_lock_irqsave(&cdev->lock, flags);
> > +     cdev->suspended = 0;
> >        if (cdev->config)
> >                reset_config(cdev);
> >        if (cdev->driver->disconnect)
> > 
> > Otherwise with my vhub or with dummy_hcd, a suspend followed by a reset
> > will keep that stale suspended flag to 1 (which has no effect at the moment
> > but still...)
> > 
> > If yes, I'll submit a patch accordingly...
> 
> According to the USB spec, a host is not supposed to reset a suspended 
> port (it's supposed to resume the port and then do the reset).  But of 
> course this can happen anyway, so we should handle it properly.

Right. I do see the resume coming in, but I don't forward it to the
gadget because here's what happens in that order:

 1- Host gets shutdown (or cable disconnected)

 2- Upstream bus suspend: I call ->suspend on the gadgets on all
enabled ports that don't have USB_PORT_STAT_SUSPEND already set. I
don't change the port status, I don't set USB_PORT_STAT_SUSPEND

 3- Machine gets turned back on (or cable reconnected)
 
 4- Upstream bus resume: I call ->resume on the gadgets on all
enabled ports that don't have USB_PORT_STAT_SUSPEND set.

 5- Upstream bus reset: I call ->suspend on all enabled ports after
clearing their status (I preserve only USB_PORT_STAT_SUSPEND and
USB_PORT_STAT_POWER which is always set for me). Note: I currently do
this even if the port had USB_PORT_STAT_SUSPEND set, so such as port
will get a double suspend ... maybe I shouldn't.

 6- Hosts sets port reset: I reset the gadget since it's already
bound/enabled. It's still "suspended".

So we do have a legitimate case of "reset while suspended".

I'll tidy up the patch and submit it.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05  2:08       ` Benjamin Herrenschmidt
@ 2019-07-05 14:08         ` Alan Stern
  2019-07-05 22:01           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-05 14:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:

> > > > As for f_mass_storage, repeatedly attempting to queue an OUT transfer
> > > > is normal behavior.  The fact that one attempt gets an error doesn't
> > > > stop the driver from making more attempts; the only thing that would
> > > > stop it is being disabled by a config change, a suspend, a disconnect,
> > > > or an unbind.
> > > 
> > > Except it does that in a tight loop and locks up the machine...
> > 
> > Well, that wouldn't happen if your UDC accepted the requests, right?  
> 
> Sure but it would be nice if the mass storage dealt with -ESHUTDOWN
> properly and stopped :-) Or other errors... if the UDC HW for example
> dies for some reason, mass storage will lockup.

I suppose we could add code to check for this case and handle it, 
although I'm not sure what would be the right thing to do.  Delay for 
one second and try again?  Disable the gadget until the host does a 
reset?

> > Besides, f_mass_storage won't repeatedly try to queue an OUT transfer 
> > once it knows that it is suspended.
> 
> Not afaik. But I might have missed something. I didn't see any suspend
> callback in f_mass_storage.c...

Oops, right.  Sorry about that; my memory is slowly decaying.  I need
to upgrade my brain's RAM...

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05  2:15         ` Benjamin Herrenschmidt
@ 2019-07-05 14:20           ` Alan Stern
  2019-07-05 22:06             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-05 14:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:

> On Thu, 2019-07-04 at 21:37 -0400, Alan Stern wrote:
> > > 
> > > Talking of which... do we need this ?
> > > 
> > > --- a/drivers/usb/gadget/composite.c
> > > +++ b/drivers/usb/gadget/composite.c
> > > @@ -1976,6 +1976,7 @@ void composite_disconnect(struct usb_gadget *gadget)
> > >         * disconnect callbacks?
> > >         */
> > >        spin_lock_irqsave(&cdev->lock, flags);
> > > +     cdev->suspended = 0;
> > >        if (cdev->config)
> > >                reset_config(cdev);
> > >        if (cdev->driver->disconnect)
> > > 
> > > Otherwise with my vhub or with dummy_hcd, a suspend followed by a reset
> > > will keep that stale suspended flag to 1 (which has no effect at the moment
> > > but still...)
> > > 
> > > If yes, I'll submit a patch accordingly...
> > 
> > According to the USB spec, a host is not supposed to reset a suspended 
> > port (it's supposed to resume the port and then do the reset).  But of 
> > course this can happen anyway, so we should handle it properly.
> 
> Right. I do see the resume coming in, but I don't forward it to the
> gadget because here's what happens in that order:
> 
>  1- Host gets shutdown (or cable disconnected)
> 
>  2- Upstream bus suspend: I call ->suspend on the gadgets on all
> enabled ports that don't have USB_PORT_STAT_SUSPEND already set. I
> don't change the port status, I don't set USB_PORT_STAT_SUSPEND

Hmmm.  Does the descriptor for your hub say that it is self-powered?  A 
bus-powered hub would turn off completely when its upstream cable was 
unplugged, thereby sending a disconnect signal to all its child 
devices.

I don't recall what the USB spec says a self-powered hub should do.  
Maybe it doesn't say anything about it.

>  3- Machine gets turned back on (or cable reconnected)
>  
>  4- Upstream bus resume: I call ->resume on the gadgets on all
> enabled ports that don't have USB_PORT_STAT_SUSPEND set.

No, the upstream bus doesn't resume upon cable reconnect.  A resume
would require packets to be received over the cable, but the host won't
send any packets to the hub until the upstream port has been reset and
enabled.  So you should eliminate this step.

>  5- Upstream bus reset: I call ->suspend on all enabled ports after
> clearing their status (I preserve only USB_PORT_STAT_SUSPEND and
> USB_PORT_STAT_POWER which is always set for me). Note: I currently do
> this even if the port had USB_PORT_STAT_SUSPEND set, so such as port
> will get a double suspend ... maybe I shouldn't.

I believe the upstream reset should cause the hub to clear all the 
downstream port statuses.  Even if the reset doesn't do this, the 
Set-Config request which follows the reset should.

Whether you tell the gadget drivers they are no longer suspended is up 
to you.  I suspect it doesn't matter much.

>  6- Hosts sets port reset: I reset the gadget since it's already
> bound/enabled. It's still "suspended".
> 
> So we do have a legitimate case of "reset while suspended".

Ah, but it doesn't contradict what I wrote earlier.  There's a
difference between resuming a suspended _device_ and resuming a
suspended _port_.

Nevertheless, in practice the difference doesn't matter and the 
composite core should do the right thing.

> I'll tidy up the patch and submit it.

Good.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05 14:08         ` Alan Stern
@ 2019-07-05 22:01           ` Benjamin Herrenschmidt
  2019-07-06 18:37             ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-05 22:01 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 2019-07-05 at 10:08 -0400, Alan Stern wrote:
> On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:
> 
> > > > > As for f_mass_storage, repeatedly attempting to queue an OUT transfer
> > > > > is normal behavior.  The fact that one attempt gets an error doesn't
> > > > > stop the driver from making more attempts; the only thing that would
> > > > > stop it is being disabled by a config change, a suspend, a disconnect,
> > > > > or an unbind.
> > > > 
> > > > Except it does that in a tight loop and locks up the machine...
> > > 
> > > Well, that wouldn't happen if your UDC accepted the requests, right?  
> > 
> > Sure but it would be nice if the mass storage dealt with -ESHUTDOWN
> > properly and stopped :-) Or other errors... if the UDC HW for example
> > dies for some reason, mass storage will lockup.
> 
> I suppose we could add code to check for this case and handle it, 
> although I'm not sure what would be the right thing to do.  Delay for 
> one second and try again?  Disable the gadget until the host does a 
> reset?

I think just stop it until the next reset yes.

> > > Besides, f_mass_storage won't repeatedly try to queue an OUT transfer 
> > > once it knows that it is suspended.
> > 
> > Not afaik. But I might have missed something. I didn't see any suspend
> > callback in f_mass_storage.c...
> 
> Oops, right.  Sorry about that; my memory is slowly decaying.  I need
> to upgrade my brain's RAM...

Haha, I wish I didn't have that problem too :)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05 14:20           ` Alan Stern
@ 2019-07-05 22:06             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-05 22:06 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Fri, 2019-07-05 at 10:20 -0400, Alan Stern wrote:
> 
> > Right. I do see the resume coming in, but I don't forward it to the
> > gadget because here's what happens in that order:
> > 
> >  1- Host gets shutdown (or cable disconnected)
> > 
> >  2- Upstream bus suspend: I call ->suspend on the gadgets on all
> > enabled ports that don't have USB_PORT_STAT_SUSPEND already set. I
> > don't change the port status, I don't set USB_PORT_STAT_SUSPEND
> 
> Hmmm.  Does the descriptor for your hub say that it is self-powered?  A 
> bus-powered hub would turn off completely when its upstream cable was 
> unplugged, thereby sending a disconnect signal to all its child 
> devices.
> 
> I don't recall what the USB spec says a self-powered hub should do.  
> Maybe it doesn't say anything about it.

Yes it's self powered. I took the cable as an example, this scenario
happens with a host power off as well of course whihc is more relevant
in our case since we are a BMC).

> >  3- Machine gets turned back on (or cable reconnected)
> >  
> >  4- Upstream bus resume: I call ->resume on the gadgets on all
> > enabled ports that don't have USB_PORT_STAT_SUSPEND set.
> 
> No, the upstream bus doesn't resume upon cable reconnect.

It does. At least my HW detects a resume signaling right before the but
reset.

>   A resume
> would require packets to be received over the cable, but the host won't
> send any packets to the hub until the upstream port has been reset and
> enabled.  So you should eliminate this step.

It's not a step I created. I do observe resume signaling by the hub HW
(it's an interrupt) right before the bus reset when the host comes back
up. In any case, no impact on what happens below...

> >  5- Upstream bus reset: I call ->suspend on all enabled ports after
> > clearing their status (I preserve only USB_PORT_STAT_SUSPEND and
> > USB_PORT_STAT_POWER which is always set for me). Note: I currently do
> > this even if the port had USB_PORT_STAT_SUSPEND set, so such as port
> > will get a double suspend ... maybe I shouldn't.
> 
> I believe the upstream reset should cause the hub to clear all the 
> downstream port statuses.  Even if the reset doesn't do this, the 
> Set-Config request which follows the reset should.
> 
> Whether you tell the gadget drivers they are no longer suspended is up 
> to you.  I suspect it doesn't matter much.

Typo. I preserve POWER and CONNECTION. Not SUSPEND.

POWER because I'm self powered so it's always set and CONNECTION for
obvious reasons.

> >  6- Hosts sets port reset: I reset the gadget since it's already
> > bound/enabled. It's still "suspended".
> > 
> > So we do have a legitimate case of "reset while suspended".
> 
> Ah, but it doesn't contradict what I wrote earlier.  There's a
> difference between resuming a suspended _device_ and resuming a
> suspended _port_.
> 
> Nevertheless, in practice the difference doesn't matter and the 
> composite core should do the right thing.

Yes :-)

> > I'll tidy up the patch and submit it.
> 
> Good.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-05 22:01           ` Benjamin Herrenschmidt
@ 2019-07-06 18:37             ` Alan Stern
  2019-07-06 23:44               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2019-07-06 18:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Sat, 6 Jul 2019, Benjamin Herrenschmidt wrote:

> On Fri, 2019-07-05 at 10:08 -0400, Alan Stern wrote:
> > On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:

> > > Sure but it would be nice if the mass storage dealt with -ESHUTDOWN
> > > properly and stopped :-) Or other errors... if the UDC HW for example
> > > dies for some reason, mass storage will lockup.
> > 
> > I suppose we could add code to check for this case and handle it, 
> > although I'm not sure what would be the right thing to do.  Delay for 
> > one second and try again?  Disable the gadget until the host does a 
> > reset?
> 
> I think just stop it until the next reset yes.

Can you test this patch?

Alan Stern



Index: usb-devel/drivers/usb/gadget/function/f_mass_storage.c
===================================================================
--- usb-devel.orig/drivers/usb/gadget/function/f_mass_storage.c
+++ usb-devel/drivers/usb/gadget/function/f_mass_storage.c
@@ -552,13 +552,14 @@ static int start_transfer(struct fsg_dev
 
 		/* We can't do much more than wait for a reset */
 		req->status = rc;
+		if (rc == -ESHUTDOWN)
+			fsg->common->running = 0;
 
 		/*
 		 * Note: currently the net2280 driver fails zero-length
 		 * submissions if DMA is enabled.
 		 */
-		if (rc != -ESHUTDOWN &&
-				!(rc == -EOPNOTSUPP && req->length == 0))
+		else if (!(rc == -EOPNOTSUPP && req->length == 0))
 			WARNING(fsg, "error in submission: %s --> %d\n",
 					ep->name, rc);
 	}


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Virtual hub, resets etc...
  2019-07-06 18:37             ` Alan Stern
@ 2019-07-06 23:44               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-06 23:44 UTC (permalink / raw)
  To: Alan Stern; +Cc: Felipe Balbi, linux-usb, Michal Nazarewicz

On Sat, 2019-07-06 at 14:37 -0400, Alan Stern wrote:
> On Sat, 6 Jul 2019, Benjamin Herrenschmidt wrote:
> 
> > On Fri, 2019-07-05 at 10:08 -0400, Alan Stern wrote:
> > > On Fri, 5 Jul 2019, Benjamin Herrenschmidt wrote:
> 
> > > > Sure but it would be nice if the mass storage dealt with
> -ESHUTDOWN
> > > > properly and stopped :-) Or other errors... if the UDC HW for
> example
> > > > dies for some reason, mass storage will lockup.
> > > 
> > > I suppose we could add code to check for this case and handle
> it, 
> > > although I'm not sure what would be the right thing to do.  Delay
> for 
> > > one second and try again?  Disable the gadget until the host does
> a 
> > > reset?
> > 
> > I think just stop it until the next reset yes.
> 
> Can you test this patch?

Not for a few days, I'm away from the machine and that test requires me
being physically present, but I will when I'm back.

Thanks !

Cheers,
Ben.

> Alan Stern
> 
> 
> 
> Index: usb-devel/drivers/usb/gadget/function/f_mass_storage.c
> ===================================================================
> --- usb-devel.orig/drivers/usb/gadget/function/f_mass_storage.c
> +++ usb-devel/drivers/usb/gadget/function/f_mass_storage.c
> @@ -552,13 +552,14 @@ static int start_transfer(struct fsg_dev
>  
>                 /* We can't do much more than wait for a reset */
>                 req->status = rc;
> +               if (rc == -ESHUTDOWN)
> +                       fsg->common->running = 0;
>  
>                 /*
>                  * Note: currently the net2280 driver fails zero-
> length
>                  * submissions if DMA is enabled.
>                  */
> -               if (rc != -ESHUTDOWN &&
> -                               !(rc == -EOPNOTSUPP && req->length ==
> 0))
> +               else if (!(rc == -EOPNOTSUPP && req->length == 0))
>                         WARNING(fsg, "error in submission: %s -->
> %d\n",
>                                         ep->name, rc);
>         }


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-07-06 23:48 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-04  5:52 Virtual hub, resets etc Benjamin Herrenschmidt
2019-07-04  8:02 ` f_mass_storage configuration races (Was: Virtual hub, resets etc...) Benjamin Herrenschmidt
2019-07-04  8:04   ` [PATCH] usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt Benjamin Herrenschmidt
2019-07-04 16:13 ` Virtual hub, resets etc Alan Stern
2019-07-04 21:44   ` Benjamin Herrenschmidt
2019-07-04 21:58     ` Benjamin Herrenschmidt
2019-07-05  1:37       ` Alan Stern
2019-07-05  2:15         ` Benjamin Herrenschmidt
2019-07-05 14:20           ` Alan Stern
2019-07-05 22:06             ` Benjamin Herrenschmidt
2019-07-05  1:34     ` Alan Stern
2019-07-05  2:08       ` Benjamin Herrenschmidt
2019-07-05 14:08         ` Alan Stern
2019-07-05 22:01           ` Benjamin Herrenschmidt
2019-07-06 18:37             ` Alan Stern
2019-07-06 23:44               ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).