From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934483AbdA0Lgg (ORCPT ); Fri, 27 Jan 2017 06:36:36 -0500 Received: from mx2.suse.de ([195.135.220.15]:36791 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933577AbdA0LNA (ORCPT ); Fri, 27 Jan 2017 06:13:00 -0500 X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "References" From: Jiri Slaby To: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Mathias Nyman , Jiri Slaby Subject: [PATCH 3.12 160/235] xhci: fix deadlock at host remove by running watchdog correctly Date: Fri, 27 Jan 2017 11:54:53 +0100 Message-Id: X-Mailer: git-send-email 2.11.0 In-Reply-To: <5b46dc789ca2be4046e4e40a131858d386cac741.1485514374.git.jslaby@suse.cz> References: <5b46dc789ca2be4046e4e40a131858d386cac741.1485514374.git.jslaby@suse.cz> In-Reply-To: References: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Mathias Nyman 3.12-stable review patch. If anyone has any objections, please let me know. =============== commit d6169d04097fd9ddf811e63eae4e5cd71e6666e2 upstream. If a URB is killed while the host is removed we can end up in a situation where the hub thread takes the roothub device lock, and waits for the URB to be given back by xhci-hcd, blocking the host remove code. xhci-hcd tries to stop the endpoint and give back the urb, but can't as the host is removed from PCI bus at the same time, preventing the normal way of giving back urb. Instead we need to rely on the stop command timeout function to give back the urb. This xhci_stop_endpoint_command_watchdog() timeout function used a XHCI_STATE_DYING flag to indicate if the timeout function is already running, but later this flag has been taking into use in other places to mark that xhci is dying. Remove checks for XHCI_STATE_DYING in xhci_urb_dequeue. We are still checking that reading from pci state does not return 0xffffffff or that host is not halted before trying to stop the endpoint. This whole area of stopping endpoints, giving back URBs, and the wathdog timeout need rework, this fix focuses on solving a specific deadlock issue that we can then send to stable before any major rework. Signed-off-by: Mathias Nyman Signed-off-by: Jiri Slaby --- drivers/usb/host/xhci-ring.c | 7 ------- drivers/usb/host/xhci.c | 13 ------------- 2 files changed, 20 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 4bcea54f60cd..8f1159612593 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -948,13 +948,6 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg) spin_lock_irqsave(&xhci->lock, flags); ep->stop_cmds_pending--; - if (xhci->xhc_state & XHCI_STATE_DYING) { - xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb, - "Stop EP timer ran, but another timer marked " - "xHCI as DYING, exiting."); - spin_unlock_irqrestore(&xhci->lock, flags); - return; - } if (!(ep->stop_cmds_pending == 0 && (ep->ep_state & EP_HALT_PENDING))) { xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb, "Stop EP timer ran, but no command pending, " diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index ea185eaeae28..04ba50b05075 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1538,19 +1538,6 @@ int xhci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) xhci_urb_free_priv(xhci, urb_priv); return ret; } - if ((xhci->xhc_state & XHCI_STATE_DYING) || - (xhci->xhc_state & XHCI_STATE_HALTED)) { - xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb, - "Ep 0x%x: URB %p to be canceled on " - "non-responsive xHCI host.", - urb->ep->desc.bEndpointAddress, urb); - /* Let the stop endpoint command watchdog timer (which set this - * state) finish cleaning up the endpoint TD lists. We must - * have caught it in the middle of dropping a lock and giving - * back an URB. - */ - goto done; - } ep_index = xhci_get_endpoint_index(&urb->ep->desc); ep = &xhci->devs[urb->dev->slot_id]->eps[ep_index]; -- 2.11.0