From: Jack Pham <firstname.lastname@example.org>
To: Felipe Balbi <email@example.com>,
Sandeep Maheswaram <firstname.lastname@example.org>
Cc: Alexandru Elisei <email@example.com>,
Greg Kroah-Hartman <firstname.lastname@example.org>,
Linux Kernel Mailing List <email@example.com>,
Subject: Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
Date: Thu, 10 Jun 2021 08:33:46 -0700 [thread overview]
Message-ID: <20210610153346.GA26872@jackp-linux.qualcomm.com> (raw)
On Thu, Jun 10, 2021 at 01:11:42PM +0300, Felipe Balbi wrote:
> Jack Pham <firstname.lastname@example.org> writes:
> > On Fri, Jun 04, 2021 at 11:20:12AM +0300, Felipe Balbi wrote:
> >> Jack Pham <email@example.com> writes:
> >> >> >>>> Alexandru Elisei <firstname.lastname@example.org> writes:
> >> >> >>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
> >> >> >>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
> >> >> >>>>> information if needed and I can help test the fix.
> >> >> >>>> if you simply revert that commit in HEAD, does the problem really go
> >> >> >>>> away?
> >> >> >>> Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is
> >> >> >>> there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
> >> >> >> Want to send a revert so I can take it now?
> >> >> >
> >> >> > I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
> >> >> > so I'll leave it up to Felipe to decide how to proceed.
> >> >>
> >> >> I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi
> >> >> <email@example.com> or it.
> >> >>
> >> >> Sandeep, please send a new version that doesn't encounter the same
> >> >> issue. Make sure to test by reloading the driver in a tight loop for
> >> >> several iterations.
> >> >
> >> > This would probably be tricky to test on other "glue" drivers as the
> >> > problem appears to be specific only to dwc3_of_simple. It looks like
> >> > both dwc3_of_simple and the dwc3 core now (due to 568262bf5492) each
> >> > implement respective .shutdown callbacks. The latter is simply a wrapper
> >> > around dwc3_remove(). And from the panic call stack above we see that
> >> > dwc3_of_simple_shutdown() calls of_platform_depopulate() which will
> >> > again call dwc3_remove() resulting in the double remove.
> >> >
> >> > So would an alternative approach be to protect against dwc3_remove()
> >> > getting called multiple times? IMO it'd be a bit messy to have to add
> >> no, I don't think so. That sounds like a workaround. We should be able
> >> to guarantee that ->remove() doesn't get called twice using the driver
> >> model properly.
> > Completely fair. So then having a .shutdown callback that directly calls
> > dwc3_remove() is probably not the right thing to do as it completely
> > bypasses the driver model so if and when the driver core does later
> > release the device from the driver that's how we end up with the double
> > remove.
> yeah, I would agree with that.
> >> > additional checks there to know if it had already been called. So maybe
> >> > avoid it altogether--should dwc3_of_simple_shutdown() just skip calling
> >> > of_platform_depopulate()?
> >> I don't know what the idiomatic is nowadays, but at least early on, we
> >> had to call depopulate.
> > So any suggestions on how to fix the original issue Sandeep was trying
> > to fix with 568262bf5492? Maybe implement .shutdown in dwc3_qcom and have
> > it follow what dwc3_of_simple does with of_platform_depopulate()? But
> > then wouldn't other "glues" want/need to follow suit?
> I think we can implement shutdown in core, but we need to careful with
> it. Instead of just blindly calling remove, let's extract the common
> parts to another internal function that both remove and shutdown
> call. debugfs removal should not be part of that generic method :-)
Upon re-reading your description in 568262bf5492 it sounds like the
original intention of your patch is basically to quiesce the HW so that
it doesn't continue to run after SMMU/IOMMU is disabled right?
If that is the case, couldn't we simply call only dwc3_core_exit_mode()
assuming there is no other requirement to do any other cleanup/teardown
(PHYs, clocks, resets, runtime PM, et al)? This function should do the
bare minimum of stopping the controller in whatever mode (host or
peripheral) it is currently operating in.
> Anything in that generic method should, probably, be idempotent.
Yes we'll need to ensure that dwc3_core_exit_mode() can be called
multiple times without additional side effects. At first glance this
probably means setting dwc->xhci and dwc->gadget to NULL from
dwc3_host_exit() and dwc3_gadget_exit(), respectively.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
next parent reply other threads:[~2021-06-10 15:34 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <firstname.lastname@example.org>
[not found] ` <email@example.com>
[not found] ` <firstname.lastname@example.org>
[not found] ` <YLi/u9J5f+nQO4Cm@kroah.com>
[not found] ` <email@example.com>
[not found] ` <firstname.lastname@example.org>
[not found] ` <20210603173632.GA25299@jackp-linux.qualcomm.com>
[not found] ` <email@example.com>
[not found] ` <20210607180023.GA23045@jackp-linux.qualcomm.com>
[not found] ` <firstname.lastname@example.org>
2021-06-10 15:33 ` Jack Pham [this message]
2021-06-14 7:37 ` [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() Sandeep Maheswaram
2021-06-16 0:13 ` Jack Pham
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).