[v3] usb: hub: add retry routine after intr URB submit error
diff mbox series

Message ID 20181120143438.18296-1-nsaenzjulienne@suse.de
State New, archived
Headers show
Series
  • [v3] usb: hub: add retry routine after intr URB submit error
Related show

Commit Message

Nicolas Saenz Julienne Nov. 20, 2018, 2:34 p.m. UTC
The hub sends hot-plug events to the host trough it's interrupt URB. The
driver takes care of completing the URB and re-submitting it. Completion
errors are handled in the hub_event() work, yet submission errors are
ignored, rendering the device unresponsive. All further events are lost.

It is fairly hard to find this issue in the wild, since you have to time
the USB hot-plug event with the URB submission failure. For instance it
could be the system running out of memory or some malfunction in the USB
controller driver. Nevertheless, it's pretty reasonable to think it'll
happen sometime. One can trigger this issue using eBPF's function
override feature (see BCC's inject.py script).

This patch adds a retry routine to the event of a submission error. The
HUB driver will try to re-submit the URB once every second until it's
successful or the HUB is disconnected.

As some USB subsystems already take care of this issue, the
implementation was inspired from usbhid/hid_core.c's.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>

---

v3: As per Oliver's request:
  - Take care of race condition between disconnect and irq

v2: as per Alan's and Oliver's comments:
  - Rename timer
  - Delete the timer on disconnect
  - Don't reset HUB nor exponential slowdown the timer, 1s fixed retry
    period
  - Check for -ESHUTDOWN prior kicking the timer

 drivers/usb/core/hub.c | 45 ++++++++++++++++++++++++++++++++++++------
 drivers/usb/core/hub.h |  2 ++
 2 files changed, 41 insertions(+), 6 deletions(-)

Comments

Oliver Neukum Nov. 20, 2018, 2:57 p.m. UTC | #1
On Di, 2018-11-20 at 15:34 +0100, Nicolas Saenz Julienne wrote:
> The hub sends hot-plug events to the host trough it's interrupt URB. The
> driver takes care of completing the URB and re-submitting it. Completion
> errors are handled in the hub_event() work, yet submission errors are
> ignored, rendering the device unresponsive. All further events are lost.
> 

Hi,

almost. There is no point in kicking of an error handling while aq
reset is underway. You are checking only "quiescing" but not
"in_reset".

	Regards
		Oliver
Oliver Neukum Nov. 20, 2018, 3:12 p.m. UTC | #2
> Anytime "in_reset" is set "quiescing" is also set:
> 
> static int hub_pre_reset(struct usb_interface *intf)
> {
> 	struct usb_hub *hub = usb_get_intfdata(intf);
> 
> 	hub_quiesce(hub, HUB_PRE_RESET); //sets quiesce
> 	hub->in_reset = 1;
> 	hub_pm_barrier_for_all_ports(hub);
> 	return 0;
> }
> 
> static int hub_post_reset(struct usb_interface *intf)
> {
> 	struct usb_hub *hub = usb_get_intfdata(intf);
> 
> 	hub->in_reset = 0;
> 	hub_pm_barrier_for_all_ports(hub);
> 	hub_activate(hub, HUB_POST_RESET); //clears quiesce
> 	return 0;
> }
> 
> I should be OK isn't it?

Sorry, yes, I overlooked that there are two flags.

	Regards
		Oliver
Nicolas Saenz Julienne Nov. 20, 2018, 3:17 p.m. UTC | #3
Hi Oliver,

On Tue, 2018-11-20 at 15:57 +0100, Oliver Neukum wrote:
> On Di, 2018-11-20 at 15:34 +0100, Nicolas Saenz Julienne wrote:
> > The hub sends hot-plug events to the host trough it's interrupt
> > URB. The
> > driver takes care of completing the URB and re-submitting it.
> > Completion
> > errors are handled in the hub_event() work, yet submission errors
> > are
> > ignored, rendering the device unresponsive. All further events are
> > lost.
> > 
> 
> Hi,
> 
> almost. There is no point in kicking of an error handling while aq
> reset is underway. You are checking only "quiescing" but not
> "in_reset".

Anytime "in_reset" is set "quiescing" is also set:

static int hub_pre_reset(struct usb_interface *intf)
{
	struct usb_hub *hub = usb_get_intfdata(intf);

	hub_quiesce(hub, HUB_PRE_RESET); //sets quiesce
	hub->in_reset = 1;
	hub_pm_barrier_for_all_ports(hub);
	return 0;
}

static int hub_post_reset(struct usb_interface *intf)
{
	struct usb_hub *hub = usb_get_intfdata(intf);

	hub->in_reset = 0;
	hub_pm_barrier_for_all_ports(hub);
	hub_activate(hub, HUB_POST_RESET); //clears quiesce
	return 0;
}

I should be OK isn't it?

Regards,
Nicolas

> 
> 	Regards
> 		Oliver
>
Greg KH Jan. 7, 2019, 4:33 p.m. UTC | #4
On Tue, Nov 20, 2018 at 03:34:38PM +0100, Nicolas Saenz Julienne wrote:
> The hub sends hot-plug events to the host trough it's interrupt URB. The
> driver takes care of completing the URB and re-submitting it. Completion
> errors are handled in the hub_event() work, yet submission errors are
> ignored, rendering the device unresponsive. All further events are lost.
> 
> It is fairly hard to find this issue in the wild, since you have to time
> the USB hot-plug event with the URB submission failure. For instance it
> could be the system running out of memory or some malfunction in the USB
> controller driver. Nevertheless, it's pretty reasonable to think it'll
> happen sometime. One can trigger this issue using eBPF's function
> override feature (see BCC's inject.py script).
> 
> This patch adds a retry routine to the event of a submission error. The
> HUB driver will try to re-submit the URB once every second until it's
> successful or the HUB is disconnected.
> 
> As some USB subsystems already take care of this issue, the
> implementation was inspired from usbhid/hid_core.c's.
> 
> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> 
> ---

What ever happened to this patch?  Is it still needed?  Oliver and Alan,
any objections about it anymore?

thanks,

greg k-h

> 
> v3: As per Oliver's request:
>   - Take care of race condition between disconnect and irq
> 
> v2: as per Alan's and Oliver's comments:
>   - Rename timer
>   - Delete the timer on disconnect
>   - Don't reset HUB nor exponential slowdown the timer, 1s fixed retry
>     period
>   - Check for -ESHUTDOWN prior kicking the timer
> 
>  drivers/usb/core/hub.c | 45 ++++++++++++++++++++++++++++++++++++------
>  drivers/usb/core/hub.h |  2 ++
>  2 files changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index c6077d582d29..630490a35301 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -607,6 +607,38 @@ static int hub_port_status(struct usb_hub *hub, int port1,
>  				   status, change, NULL);
>  }
>  
> +static void hub_resubmit_irq_urb(struct usb_hub *hub)
> +{
> +	unsigned long flags;
> +	int status;
> +
> +	spin_lock_irqsave(&hub->irq_urb_lock, flags);
> +
> +	if (hub->quiescing) {
> +		spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
> +		return;
> +	}
> +
> +	status = usb_submit_urb(hub->urb, GFP_ATOMIC);
> +	if (status && status != -ENODEV && status != -EPERM &&
> +	    status != -ESHUTDOWN) {
> +		dev_err(hub->intfdev, "resubmit --> %d\n", status);
> +		mod_timer(&hub->irq_urb_retry,
> +			  jiffies + msecs_to_jiffies(MSEC_PER_SEC));
> +
> +	}
> +
> +	spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
> +}
> +
> +static void hub_retry_irq_urb(struct timer_list *t)
> +{
> +	struct usb_hub *hub = from_timer(hub, t, irq_urb_retry);
> +
> +	hub_resubmit_irq_urb(hub);
> +}
> +
> +
>  static void kick_hub_wq(struct usb_hub *hub)
>  {
>  	struct usb_interface *intf;
> @@ -709,12 +741,7 @@ static void hub_irq(struct urb *urb)
>  	kick_hub_wq(hub);
>  
>  resubmit:
> -	if (hub->quiescing)
> -		return;
> -
> -	status = usb_submit_urb(hub->urb, GFP_ATOMIC);
> -	if (status != 0 && status != -ENODEV && status != -EPERM)
> -		dev_err(hub->intfdev, "resubmit --> %d\n", status);
> +	hub_resubmit_irq_urb(hub);
>  }
>  
>  /* USB 2.0 spec Section 11.24.2.3 */
> @@ -1254,10 +1281,13 @@ enum hub_quiescing_type {
>  static void hub_quiesce(struct usb_hub *hub, enum hub_quiescing_type type)
>  {
>  	struct usb_device *hdev = hub->hdev;
> +	unsigned long flags;
>  	int i;
>  
>  	/* hub_wq and related activity won't re-trigger */
> +	spin_lock_irqsave(&hub->irq_urb_lock, flags);
>  	hub->quiescing = 1;
> +	spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
>  
>  	if (type != HUB_SUSPEND) {
>  		/* Disconnect all the children */
> @@ -1268,6 +1298,7 @@ static void hub_quiesce(struct usb_hub *hub, enum hub_quiescing_type type)
>  	}
>  
>  	/* Stop hub_wq and related activity */
> +	del_timer_sync(&hub->irq_urb_retry);
>  	usb_kill_urb(hub->urb);
>  	if (hub->has_indicators)
>  		cancel_delayed_work_sync(&hub->leds);
> @@ -1800,6 +1831,8 @@ static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id)
>  	INIT_DELAYED_WORK(&hub->leds, led_work);
>  	INIT_DELAYED_WORK(&hub->init_work, NULL);
>  	INIT_WORK(&hub->events, hub_event);
> +	spin_lock_init(&hub->irq_urb_lock);
> +	timer_setup(&hub->irq_urb_retry, hub_retry_irq_urb, 0);
>  	usb_get_intf(intf);
>  	usb_get_dev(hdev);
>  
> diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
> index 4accfb63f7dc..a9e24e4b8df1 100644
> --- a/drivers/usb/core/hub.h
> +++ b/drivers/usb/core/hub.h
> @@ -69,6 +69,8 @@ struct usb_hub {
>  	struct delayed_work	leds;
>  	struct delayed_work	init_work;
>  	struct work_struct      events;
> +	spinlock_t		irq_urb_lock;
> +	struct timer_list	irq_urb_retry;
>  	struct usb_port		**ports;
>  };
>  
> -- 
> 2.19.1
>
Alan Stern Jan. 7, 2019, 6:59 p.m. UTC | #5
On Mon, 7 Jan 2019, Greg KH wrote:

> On Tue, Nov 20, 2018 at 03:34:38PM +0100, Nicolas Saenz Julienne wrote:
> > The hub sends hot-plug events to the host trough it's interrupt URB. The
> > driver takes care of completing the URB and re-submitting it. Completion
> > errors are handled in the hub_event() work, yet submission errors are
> > ignored, rendering the device unresponsive. All further events are lost.
> > 
> > It is fairly hard to find this issue in the wild, since you have to time
> > the USB hot-plug event with the URB submission failure. For instance it
> > could be the system running out of memory or some malfunction in the USB
> > controller driver. Nevertheless, it's pretty reasonable to think it'll
> > happen sometime. One can trigger this issue using eBPF's function
> > override feature (see BCC's inject.py script).
> > 
> > This patch adds a retry routine to the event of a submission error. The
> > HUB driver will try to re-submit the URB once every second until it's
> > successful or the HUB is disconnected.
> > 
> > As some USB subsystems already take care of this issue, the
> > implementation was inspired from usbhid/hid_core.c's.
> > 
> > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > 
> > ---
> 
> What ever happened to this patch?  Is it still needed?  Oliver and Alan,
> any objections about it anymore?

I have just one very minor nit to pick (see below).  Mostly I've been
waiting to hear from Oliver.

Alan Stern

> 
> thanks,
> 
> greg k-h
> 
> > 
> > v3: As per Oliver's request:
> >   - Take care of race condition between disconnect and irq
> > 
> > v2: as per Alan's and Oliver's comments:
> >   - Rename timer
> >   - Delete the timer on disconnect
> >   - Don't reset HUB nor exponential slowdown the timer, 1s fixed retry
> >     period
> >   - Check for -ESHUTDOWN prior kicking the timer
> > 
> >  drivers/usb/core/hub.c | 45 ++++++++++++++++++++++++++++++++++++------
> >  drivers/usb/core/hub.h |  2 ++
> >  2 files changed, 41 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > index c6077d582d29..630490a35301 100644
> > --- a/drivers/usb/core/hub.c
> > +++ b/drivers/usb/core/hub.c
> > @@ -607,6 +607,38 @@ static int hub_port_status(struct usb_hub *hub, int port1,
> >  				   status, change, NULL);
> >  }
> >  
> > +static void hub_resubmit_irq_urb(struct usb_hub *hub)
> > +{
> > +	unsigned long flags;
> > +	int status;
> > +
> > +	spin_lock_irqsave(&hub->irq_urb_lock, flags);
> > +
> > +	if (hub->quiescing) {
> > +		spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
> > +		return;
> > +	}
> > +
> > +	status = usb_submit_urb(hub->urb, GFP_ATOMIC);
> > +	if (status && status != -ENODEV && status != -EPERM &&
> > +	    status != -ESHUTDOWN) {
> > +		dev_err(hub->intfdev, "resubmit --> %d\n", status);
> > +		mod_timer(&hub->irq_urb_retry,
> > +			  jiffies + msecs_to_jiffies(MSEC_PER_SEC));

This can be written more simply as:

		mod_timer(&hub->irq_urb_retry, jiffies + HZ);
Oliver Neukum Jan. 7, 2019, 9:25 p.m. UTC | #6
On Mo, 2019-01-07 at 13:59 -0500, Alan Stern wrote:
> On Mon, 7 Jan 2019, Greg KH wrote:
> 
> 
> > What ever happened to this patch?  Is it still needed?  Oliver and Alan,
> > any objections about it anymore?
> 
> I have just one very minor nit to pick (see below).  Mostly I've been
> waiting to hear from Oliver.

Sorry, I have been away over Christmas. It is looking good to me.

	Regards
		Oliver

Patch
diff mbox series

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index c6077d582d29..630490a35301 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -607,6 +607,38 @@  static int hub_port_status(struct usb_hub *hub, int port1,
 				   status, change, NULL);
 }
 
+static void hub_resubmit_irq_urb(struct usb_hub *hub)
+{
+	unsigned long flags;
+	int status;
+
+	spin_lock_irqsave(&hub->irq_urb_lock, flags);
+
+	if (hub->quiescing) {
+		spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
+		return;
+	}
+
+	status = usb_submit_urb(hub->urb, GFP_ATOMIC);
+	if (status && status != -ENODEV && status != -EPERM &&
+	    status != -ESHUTDOWN) {
+		dev_err(hub->intfdev, "resubmit --> %d\n", status);
+		mod_timer(&hub->irq_urb_retry,
+			  jiffies + msecs_to_jiffies(MSEC_PER_SEC));
+
+	}
+
+	spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
+}
+
+static void hub_retry_irq_urb(struct timer_list *t)
+{
+	struct usb_hub *hub = from_timer(hub, t, irq_urb_retry);
+
+	hub_resubmit_irq_urb(hub);
+}
+
+
 static void kick_hub_wq(struct usb_hub *hub)
 {
 	struct usb_interface *intf;
@@ -709,12 +741,7 @@  static void hub_irq(struct urb *urb)
 	kick_hub_wq(hub);
 
 resubmit:
-	if (hub->quiescing)
-		return;
-
-	status = usb_submit_urb(hub->urb, GFP_ATOMIC);
-	if (status != 0 && status != -ENODEV && status != -EPERM)
-		dev_err(hub->intfdev, "resubmit --> %d\n", status);
+	hub_resubmit_irq_urb(hub);
 }
 
 /* USB 2.0 spec Section 11.24.2.3 */
@@ -1254,10 +1281,13 @@  enum hub_quiescing_type {
 static void hub_quiesce(struct usb_hub *hub, enum hub_quiescing_type type)
 {
 	struct usb_device *hdev = hub->hdev;
+	unsigned long flags;
 	int i;
 
 	/* hub_wq and related activity won't re-trigger */
+	spin_lock_irqsave(&hub->irq_urb_lock, flags);
 	hub->quiescing = 1;
+	spin_unlock_irqrestore(&hub->irq_urb_lock, flags);
 
 	if (type != HUB_SUSPEND) {
 		/* Disconnect all the children */
@@ -1268,6 +1298,7 @@  static void hub_quiesce(struct usb_hub *hub, enum hub_quiescing_type type)
 	}
 
 	/* Stop hub_wq and related activity */
+	del_timer_sync(&hub->irq_urb_retry);
 	usb_kill_urb(hub->urb);
 	if (hub->has_indicators)
 		cancel_delayed_work_sync(&hub->leds);
@@ -1800,6 +1831,8 @@  static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	INIT_DELAYED_WORK(&hub->leds, led_work);
 	INIT_DELAYED_WORK(&hub->init_work, NULL);
 	INIT_WORK(&hub->events, hub_event);
+	spin_lock_init(&hub->irq_urb_lock);
+	timer_setup(&hub->irq_urb_retry, hub_retry_irq_urb, 0);
 	usb_get_intf(intf);
 	usb_get_dev(hdev);
 
diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 4accfb63f7dc..a9e24e4b8df1 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -69,6 +69,8 @@  struct usb_hub {
 	struct delayed_work	leds;
 	struct delayed_work	init_work;
 	struct work_struct      events;
+	spinlock_t		irq_urb_lock;
+	struct timer_list	irq_urb_retry;
 	struct usb_port		**ports;
 };