All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavan Kondeti <quic_pkondeti@quicinc.com>
To: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Pavan Kondeti <pkondeti@codeaurora.org>,
	youling 257 <youling257@gmail.com>, <gregkh@linuxfoundation.org>,
	<linux-usb@vger.kernel.org>
Subject: Re: [PATCH 4/5] xhci: Fix command ring pointer corruption while aborting a command
Date: Thu, 28 Oct 2021 13:33:02 +0530	[thread overview]
Message-ID: <20211028080302.GA14180@hu-pkondeti-hyd.qualcomm.com> (raw)
In-Reply-To: <64b2d3ef-7470-ddc4-0af5-5cbd889d8092@linux.intel.com>

On Wed, Oct 27, 2021 at 04:59:11PM +0300, Mathias Nyman wrote:
> > This patch passes my test. Can you please clarify the below question that is
> > bothering me.
> 
> Sure, currently this works as we only have one command ring segment.
> But lets look at a theoretical case with two segments far apart where
> the last command on a segment is aborted
> 
> > 
> > Here crcr points to the DMA address of the command which is getting
> > executed (soon will be aborted) by the xHC. After the ring is stopped,
> > we want xHC to execute the *next* command.
> > 
> > Is it guaranteed that the upper 32 bits of previous and next commands will
> > be same? Because when the issue happens, xHC takes the 32 bits of crcr
> > variable and update it's internal pointer from which it will fetch the commands
> > next time the ring is started. I think it is guaranteed because the upper 32
> > bits only may change when we cross the segments but in which case there will
> > be a link TRB in the middle. Since xHC command ring is stopped, it won't be
> > fetching the link TRB until door bell is rung again. Is my understanding correct?
> 
> Good point. 
> 
> This could be an issue as xHC will move the internal pointer to the next
> command TRB (past link) right after generating a "command aborted" event,
> before generating the "command ring stopped" event. The command ring stopped
> event should thus point to the next command TRB to handle.
> 
> I'll assume that the "command aborted" and "command ring stopped" events are generated
> before xHC updates the internal pointer based on CRCR write.
> 
> The CRCR write may thus move the internal dequeue pointer back to previous segment.
> 
> This may mess things up, even if the interrupt handler turns the aborted command TRB
> to a no-op, it's possible the partial CRCR update could cause internal pointer to 
> point to first command of old segment. 
> 
> > 
> > Also, what if there is a race with xHC just finishing this command (which we
> > are currently aborting) and next link TRB is traversed and processing the next
> > command in a different segment. For some reason, we could not update our
> > deq pointer and in middle of aborting the command (which is already completed)
> > and updating the higher 32 bit with the previous deq segment. This is a
> > hypothetical question as we are only using 1 segment for the command ring.
> 
> In this case we won't see the "command aborted" event. Only a "command ring
> stopped" event. and yes, the CRCR write could cause the internal dequeue pointer
> to be moved back to the old segment, and we have the same issues as in the 
> previous case.
> 
> These two cases could probably solved by writing the next command pointer to CRCR
> instead.
> 
> Only problem I see with this is if xHC hasn't even fetched the command TRB we are
> aborting. Then ring will stop at this last command TRB on segment.
> Writing the next command TRB pointer to CRCR forces xHC to move to the next
> segment without xHC processing the Link TRB, and cycle bit might get out of sync.
> 
> Hard to imagine how we end up in this situation.  
> 
> Then there is the third case where xHC handled multiple command TRBs, but interrupts
> are blocked so driver is unaware of their completion.
> Here its possible we write the wrong segment to CRCR even if we use the next
> command TRB pointer. xHC could be way past what driver thinks is the next command.
> 
> For this case we proably need to check for pending interrupts and unhandled command
> completions on the event ring before aborting the command.
> But that is probably another patch.
> 
> But for now the best solution seems to be just to write next command TRB
> pointer to CRCR during abort.
> 
Thanks Mathias for the detailed explanation. Agree that your patch is good
enough for the current situation. May be a TODO comment here would help. 

Thanks,
Pavan

  reply	other threads:[~2021-10-28  8:03 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-08  9:25 [PATCH 0/5] xhci fixes for usb-linus Mathias Nyman
2021-10-08  9:25 ` [PATCH 1/5] xhci: guard accesses to ep_state in xhci_endpoint_reset() Mathias Nyman
2021-10-08  9:25 ` [PATCH 2/5] xhci: add quirk for host controllers that don't update endpoint DCS Mathias Nyman
2021-10-08  9:25 ` [PATCH 3/5] USB: xhci: dbc: fix tty registration race Mathias Nyman
2021-10-08  9:25 ` [PATCH 4/5] xhci: Fix command ring pointer corruption while aborting a command Mathias Nyman
2021-10-22 10:59   ` youling257
2021-10-22 11:00     ` youling 257
2021-10-25 11:21       ` Mathias Nyman
2021-10-25 15:01         ` youling 257
2021-10-25 15:21           ` Mathias Nyman
2021-10-26 10:49             ` Pavan Kondeti
2021-10-27 13:59               ` Mathias Nyman
2021-10-28  8:03                 ` Pavan Kondeti [this message]
2021-10-29 12:48                   ` Mathias Nyman
2021-10-29 12:51                     ` [RFT PATCH] xhci: Fix commad ring abort, write all 64 bits to CRCR register Mathias Nyman
2021-10-29 15:35                       ` youling 257
2021-11-01  8:37                         ` Mathias Nyman
2021-11-01  3:31                       ` Pavan Kondeti
2021-11-01  8:36                         ` Mathias Nyman
2021-10-08  9:25 ` [PATCH 5/5] xhci: Enable trust tx length quirk for Fresco FL11 USB controller Mathias Nyman
2021-10-08 16:13 ` [PATCH 0/5] xhci fixes for usb-linus Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211028080302.GA14180@hu-pkondeti-hyd.qualcomm.com \
    --to=quic_pkondeti@quicinc.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    --cc=pkondeti@codeaurora.org \
    --cc=youling257@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.