From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Lu Baolu <baolu.lu@linux.intel.com>,
Baolin Wang <baolin.wang@linaro.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>,
Greg KH <gregkh@linuxfoundation.org>,
USB <linux-usb@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Mark Brown <broonie@kernel.org>, "Lu, Baolu" <baolu.lu@intel.com>
Subject: Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command
Date: Wed, 21 Dec 2016 14:57:04 +0200 [thread overview]
Message-ID: <585A7C20.9060500@linux.intel.com> (raw)
In-Reply-To: <585A27DF.4030003@linux.intel.com>
On 21.12.2016 08:57, Lu Baolu wrote:
> Hi Mathias,
>
> I have some comments for the implementation of
> xhci_handle_command_timeout() as well.
>
> On 12/20/2016 11:13 PM, Mathias Nyman wrote:
>> On 20.12.2016 09:30, Baolin Wang wrote:
>> ...
>>
>> Alright, I gathered all current work related to xhci races and timeouts
>> and put them into a branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes
>>
>> Its based on 4.9
>> It includes a few other patches just to avoid conflicts and make my life easier
>>
>> Interesting patches are:
>>
>> ee4eb91 xhci: remove unnecessary check for pending timer
>> 0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
>> 4f2535f xhci: Handle command completion and timeout race
>> b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
>> 529a5a0 usb: xhci: fix possible wild pointer
>> 4766555 xhci: Fix race related to abort operation
>> de834a3 xhci: Use delayed_work instead of timer for command timeout
>> 69973b8 Linux 4.9
>>
>> The fixes for command queue races will go to usb-linus and stable, the
>> reworks for stop ep watchdog timer will go to usb-next.
>>
>> Still completely untested, (well it compiles)
>>
>> Felipe gave instructions how to modify dwc3 driver to timeout on address
>> devicecommands to test these, I'll try to set that up.
>>
>> All additional testing is welcome, especially if you can trigger timeouts
>> and races
>>
>> -Mathias
>>
>>
>
> I post the code below and add my comments in line.
>
> 1276 void xhci_handle_command_timeout(struct work_struct *work)
> 1277 {
> 1278 struct xhci_hcd *xhci;
> 1279 int ret;
> 1280 unsigned long flags;
> 1281 u64 hw_ring_state;
> 1282
> 1283 xhci = container_of(to_delayed_work(work), struct xhci_hcd, cmd_timer);
> 1284
> 1285 spin_lock_irqsave(&xhci->lock, flags);
> 1286
> 1287 /*
> 1288 * If timeout work is pending, or current_cmd is NULL, it means we
> 1289 * raced with command completion. Command is handled so just return.
> 1290 */
> 1291 if (!xhci->current_cmd || delayed_work_pending(&xhci->cmd_timer)) {
> 1292 spin_unlock_irqrestore(&xhci->lock, flags);
> 1293 return;
> 1294 }
> 1295 /* mark this command to be cancelled */
> 1296 xhci->current_cmd->status = COMP_CMD_ABORT;
> 1297
> 1298 /* Make sure command ring is running before aborting it */
> 1299 hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
> 1300 if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
> 1301 (hw_ring_state & CMD_RING_RUNNING)) {
> 1302 /* Prevent new doorbell, and start command abort */
> 1303 xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
> 1304 spin_unlock_irqrestore(&xhci->lock, flags);
> 1305 xhci_dbg(xhci, "Command timeout\n");
> 1306 ret = xhci_abort_cmd_ring(xhci);
> 1307 if (unlikely(ret == -ESHUTDOWN)) {
> 1308 xhci_err(xhci, "Abort command ring failed\n");
> 1309 xhci_cleanup_command_queue(xhci);
> 1310 usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
> 1311 xhci_dbg(xhci, "xHCI host controller is dead.\n");
> 1312 }
> 1313 return;
> 1314 }
> 1315
> 1316 /* host removed. Bail out */
> 1317 if (xhci->xhc_state & XHCI_STATE_REMOVING) {
> 1318 spin_unlock_irqrestore(&xhci->lock, flags);
> 1319 xhci_dbg(xhci, "host removed, ring start fail?\n");
> 1320 xhci_cleanup_command_queue(xhci);
> 1321 return;
> 1322 }
>
> I think this part of code should be moved up to line 1295.
The XHCI_STATE_REMOVING and XHCI_STATE_DYING needs a rework,
I'm working on that.
Basically we want XHCI_STATE_REMOVING to mean that all devices are going,
away and driver will be removed. Don't bother with re-calculating available
bandwidths after every device removal, but do use xhci hardware to disable
devices cleanly etc.
XHCI_STATE_DYING should mean hardware is not working/responding. Don't
bother writing any registers or queuing anything. Just return all
pending and cancelled URBs, notify core we died, and free all allocated memory.
>
> 1323
> 1324 /* command timeout on stopped ring, ring can't be aborted */
> 1325 xhci_dbg(xhci, "Command timeout on stopped ring\n");
> 1326 xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
> 1327 spin_unlock_irqrestore(&xhci->lock, flags);
>
> This part of code is tricky. I have no idea about in which case should this
> code be executed? Anyway, we shouldn't call xhci_handle_stopped_cmd_ring()
> here, right?
>
This isn't changed it these patches.
It will remove the aborted commands and restart the ring. It's useful if we
want to abort a command but command ring was not running. (if for some
unkown reason it was stopped, or forgot to restart.
-Mathias
next prev parent reply other threads:[~2016-12-21 12:56 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-05 7:51 [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Baolin Wang
2016-12-05 7:51 ` [PATCH 2/2] usb: host: xhci: Handle the right timeout command Baolin Wang
2016-12-12 15:52 ` Mathias Nyman
2016-12-13 3:21 ` Baolin Wang
2016-12-19 10:33 ` Mathias Nyman
2016-12-19 11:34 ` Baolin Wang
2016-12-19 12:13 ` Mathias Nyman
2016-12-20 3:23 ` Baolin Wang
2016-12-20 4:29 ` Lu Baolu
2016-12-20 6:06 ` Baolin Wang
2016-12-20 6:39 ` Lu Baolu
2016-12-20 6:46 ` Baolin Wang
2016-12-20 7:18 ` Lu Baolu
2016-12-20 7:30 ` Baolin Wang
2016-12-20 15:13 ` Mathias Nyman
2016-12-21 2:22 ` Baolin Wang
2016-12-21 13:00 ` Mathias Nyman
2016-12-27 3:07 ` Baolin Wang
2017-01-02 14:57 ` Mathias Nyman
2017-01-03 6:20 ` Baolin Wang
2016-12-21 6:17 ` Lu Baolu
2016-12-21 12:48 ` Mathias Nyman
2016-12-21 14:10 ` OGAWA Hirofumi
2016-12-21 15:04 ` Mathias Nyman
2016-12-21 15:18 ` OGAWA Hirofumi
2016-12-22 1:46 ` Lu Baolu
2016-12-23 12:54 ` Mathias Nyman
2016-12-22 1:43 ` Lu Baolu
2016-12-21 6:57 ` Lu Baolu
2016-12-21 12:57 ` Mathias Nyman [this message]
2016-12-22 1:39 ` Lu Baolu
2016-12-05 14:08 ` [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Mathias Nyman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=585A7C20.9060500@linux.intel.com \
--to=mathias.nyman@linux.intel.com \
--cc=baolin.wang@linaro.org \
--cc=baolu.lu@intel.com \
--cc=baolu.lu@linux.intel.com \
--cc=broonie@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=mathias.nyman@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.