Hi Heikki, On Mon, 2021-05-17 at 15:15 +0300, Heikki Krogerus wrote: > On Mon, May 17, 2021 at 01:03:12PM +0300, Heikki Krogerus wrote: > > On Sat, May 15, 2021 at 09:09:53PM -0700, Bjorn Andersson wrote: > > > It's possible that the interrupt handler for the UCSI driver > > > signals a > > > connector changes after the handler clears the PENDING bit, but > > > before > > > it has sent the acknowledge request. The result is that the handler > > > is > > > invoked yet again, to ack the same connector change. > > > > > > At least some versions of the Qualcomm UCSI firmware will not > > > handle the > > > second - "spurious" - acknowledgment gracefully. So make sure to > > > not > > > clear the pending flag until the change is acknowledged. > > > > > > Any connector changes coming in after the acknowledgment, that > > > would > > > have the pending flag incorrectly cleared, would afaict be covered > > > by > > > the subsequent connector status check. > > > > > > Fixes: 217504a05532 ("usb: typec: ucsi: Work around PPM losing > > > change information") > > > Signed-off-by: Bjorn Andersson > > > > I'm OK with this if Bejamin does not see any problems with it. I'll > > wait for his comments before giving my reviewed-by tag. > > > > That workaround (commit 217504a05532) is unfortunately too fragile. > > I'm going to now separate the processing of the connector state from > > the event handler (interrupt handler). That way we should be fairly > > sure we don't loose any of the connector states even if an event is > > generated while we are still in the middle of processing the previous > > one(s), and at the same time be sure that we also don't confuse the > > firmware. > > > > So the event handler shall after that only read the connector status, > > schedule the unique job where it's processed and ACK the event. > > Nothing else. > > Seems to be straightforward to implement. I'm attaching the patch I > made for that. I think it should actually also remove the problem you > are seeing. Can you test it? Hmm, I am happy to try the patch tomorrow in the scenario where I observed my race condition. Unfortunately, I don't feel it'll work. The problem that I was seeing looked like a race condition in the PPM itself, where the window is the time between the UCSI_GET_CONNECTOR_STATUS command and the subsequent ACK. For such a firmware level bug in the PPM, we need a way to detect the race condition when it happens (or get a fix for the firmware). Note that there was also some code to always sent a UCSI_GET_CAM_SUPPORTED command "to ensure the PPM does not get stuck in case it assumes we do". I have no idea where this came from or what PPMs might require this workaround. But I doubt it is a good idea to drop it. Benjamin