linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Lu Baolu <baolu.lu@linux.intel.com>,
	Baolin Wang <baolin.wang@linaro.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	USB <linux-usb@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Mark Brown <broonie@kernel.org>, "Lu, Baolu" <baolu.lu@intel.com>
Subject: Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command
Date: Wed, 21 Dec 2016 14:57:04 +0200	[thread overview]
Message-ID: <585A7C20.9060500@linux.intel.com> (raw)
In-Reply-To: <585A27DF.4030003@linux.intel.com>

On 21.12.2016 08:57, Lu Baolu wrote:
> Hi Mathias,
>
> I have some comments for the implementation of
> xhci_handle_command_timeout() as well.
>
> On 12/20/2016 11:13 PM, Mathias Nyman wrote:
>> On 20.12.2016 09:30, Baolin Wang wrote:
>> ...
>>
>> Alright, I gathered all current work related to xhci races and timeouts
>> and put them into a branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes
>>
>> Its based on 4.9
>> It includes a few other patches just to avoid conflicts and  make my life easier
>>
>> Interesting patches are:
>>
>> ee4eb91 xhci: remove unnecessary check for pending timer
>> 0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
>> 4f2535f xhci: Handle command completion and timeout race
>> b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
>> 529a5a0 usb: xhci: fix possible wild pointer
>> 4766555 xhci: Fix race related to abort operation
>> de834a3 xhci: Use delayed_work instead of timer for command timeout
>> 69973b8 Linux 4.9
>>
>> The fixes for command queue races will go to usb-linus and stable, the
>> reworks for stop ep watchdog timer will go to usb-next.
>>
>> Still completely untested, (well it compiles)
>>
>> Felipe gave instructions how to modify dwc3 driver to timeout on address
>> devicecommands to test these, I'll try to set that up.
>>
>> All additional testing is welcome, especially if you can trigger timeouts
>> and races
>>
>> -Mathias
>>
>>
>
> I post the code below and add my comments in line.
>
> 1276 void xhci_handle_command_timeout(struct work_struct *work)
> 1277 {
> 1278         struct xhci_hcd *xhci;
> 1279         int ret;
> 1280         unsigned long flags;
> 1281         u64 hw_ring_state;
> 1282
> 1283         xhci = container_of(to_delayed_work(work), struct xhci_hcd, cmd_timer);
> 1284
> 1285         spin_lock_irqsave(&xhci->lock, flags);
> 1286
> 1287         /*
> 1288          * If timeout work is pending, or current_cmd is NULL, it means we
> 1289          * raced with command completion. Command is handled so just return.
> 1290          */
> 1291         if (!xhci->current_cmd || delayed_work_pending(&xhci->cmd_timer)) {
> 1292                 spin_unlock_irqrestore(&xhci->lock, flags);
> 1293                 return;
> 1294         }
> 1295         /* mark this command to be cancelled */
> 1296         xhci->current_cmd->status = COMP_CMD_ABORT;
> 1297
> 1298         /* Make sure command ring is running before aborting it */
> 1299         hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
> 1300         if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
> 1301             (hw_ring_state & CMD_RING_RUNNING))  {
> 1302                 /* Prevent new doorbell, and start command abort */
> 1303                 xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
> 1304                 spin_unlock_irqrestore(&xhci->lock, flags);
> 1305                 xhci_dbg(xhci, "Command timeout\n");
> 1306                 ret = xhci_abort_cmd_ring(xhci);
> 1307                 if (unlikely(ret == -ESHUTDOWN)) {
> 1308                         xhci_err(xhci, "Abort command ring failed\n");
> 1309                         xhci_cleanup_command_queue(xhci);
> 1310                         usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
> 1311                         xhci_dbg(xhci, "xHCI host controller is dead.\n");
> 1312                 }
> 1313                 return;
> 1314         }
> 1315
> 1316         /* host removed. Bail out */
> 1317         if (xhci->xhc_state & XHCI_STATE_REMOVING) {
> 1318                 spin_unlock_irqrestore(&xhci->lock, flags);
> 1319                 xhci_dbg(xhci, "host removed, ring start fail?\n");
> 1320                 xhci_cleanup_command_queue(xhci);
> 1321                 return;
> 1322         }
>
> I think this part of code should be moved up to line 1295.

The XHCI_STATE_REMOVING and XHCI_STATE_DYING needs a rework,
I'm working on that.

Basically we want XHCI_STATE_REMOVING to mean that  all devices are going,
away and driver will be removed. Don't bother with re-calculating available
bandwidths after every device removal, but do use xhci hardware to disable
devices cleanly etc.

XHCI_STATE_DYING should mean hardware is not working/responding. Don't
bother writing any registers or queuing anything. Just return all
pending and cancelled URBs, notify core we died, and free all allocated memory.

>
> 1323
> 1324         /* command timeout on stopped ring, ring can't be aborted */
> 1325         xhci_dbg(xhci, "Command timeout on stopped ring\n");
> 1326         xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
> 1327         spin_unlock_irqrestore(&xhci->lock, flags);
>
> This part of code is tricky. I have no idea about in which case should this
> code be executed? Anyway, we shouldn't call xhci_handle_stopped_cmd_ring()
> here, right?
>

This isn't changed it these patches.

It will remove the aborted commands and restart the ring. It's useful if we
want to abort a command but command ring was not running. (if for some
unkown reason it was stopped, or forgot to restart.

-Mathias

  reply	other threads:[~2016-12-21 12:56 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-05  7:51 [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Baolin Wang
2016-12-05  7:51 ` [PATCH 2/2] usb: host: xhci: Handle the right timeout command Baolin Wang
2016-12-12 15:52   ` Mathias Nyman
2016-12-13  3:21     ` Baolin Wang
2016-12-19 10:33       ` Mathias Nyman
2016-12-19 11:34         ` Baolin Wang
2016-12-19 12:13           ` Mathias Nyman
2016-12-20  3:23             ` Baolin Wang
2016-12-20  4:29             ` Lu Baolu
2016-12-20  6:06               ` Baolin Wang
2016-12-20  6:39                 ` Lu Baolu
2016-12-20  6:46                   ` Baolin Wang
2016-12-20  7:18                     ` Lu Baolu
2016-12-20  7:30                       ` Baolin Wang
2016-12-20 15:13                         ` Mathias Nyman
2016-12-21  2:22                           ` Baolin Wang
2016-12-21 13:00                             ` Mathias Nyman
2016-12-27  3:07                               ` Baolin Wang
2017-01-02 14:57                                 ` Mathias Nyman
2017-01-03  6:20                                   ` Baolin Wang
2016-12-21  6:17                           ` Lu Baolu
2016-12-21 12:48                             ` Mathias Nyman
2016-12-21 14:10                               ` OGAWA Hirofumi
2016-12-21 15:04                                 ` Mathias Nyman
2016-12-21 15:18                                   ` OGAWA Hirofumi
2016-12-22  1:46                                     ` Lu Baolu
2016-12-23 12:54                                       ` Mathias Nyman
2016-12-22  1:43                               ` Lu Baolu
2016-12-21  6:57                           ` Lu Baolu
2016-12-21 12:57                             ` Mathias Nyman [this message]
2016-12-22  1:39                               ` Lu Baolu
2016-12-05 14:08 ` [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585A7C20.9060500@linux.intel.com \
    --to=mathias.nyman@linux.intel.com \
    --cc=baolin.wang@linaro.org \
    --cc=baolu.lu@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=broonie@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).