All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks
       [not found] <1341620786-13598-1-git-send-email-khilman@ti.com>
@ 2012-07-10  4:25 ` S, Venkatraman
  2012-07-10 14:17   ` Kevin Hilman
  0 siblings, 1 reply; 4+ messages in thread
From: S, Venkatraman @ 2012-07-10  4:25 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Chris Ball, linux-omap, Balaji T K, Rajendra Nayak,
	Adrian Hunter, open list:MULTIMEDIA CARD (...),
	linux-kernel@vger.kernel.org (open list)

On Sat, Jul 7, 2012 at 5:56 AM, Kevin Hilman <khilman@ti.com> wrote:
> Due to the way the driver core takes runtime PM references during
> probe, a driver's runtime PM callbacks may not be called until probe
> returns.  During probe, drvdata is set to the 'host' pointer but if
> probe fails, drvdata is set to NULL.
>
> The runtime PM callbacks currently blindly dereference this host
> pointer, but if probe fails, and the runtime PM callbacks are called
> after probe returns, a NULL pointer dereference will result.
>
Pardon my ignorance, but why would runtime suspend be called for a
device whose probe has failed ? AFAIK, MMC stack wouldn't make
the _enable()/disable() calls, which call runtime PM APIs, unless the
probe is successful.
 Is there a stacktrace for the offending caller ?

> Fix this by simply checking to be sure the host pointer is non-NULL.
>
> Signed-off-by: Kevin Hilman <khilman@ti.com>
> ---
> Applies to v3.5-rc5.  Fix is needed for v3.5-rc.
>
Can you please let me know which commit introduced this problem
in the first place ? There were not many changes in MMC PM code
in this merge window.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks
  2012-07-10  4:25 ` [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks S, Venkatraman
@ 2012-07-10 14:17   ` Kevin Hilman
  2012-07-10 14:47     ` S, Venkatraman
  0 siblings, 1 reply; 4+ messages in thread
From: Kevin Hilman @ 2012-07-10 14:17 UTC (permalink / raw)
  To: S, Venkatraman
  Cc: Chris Ball, linux-omap, Balaji T K, Rajendra Nayak,
	Adrian Hunter, open list:MULTIMEDIA CARD (...),
	linux-kernel@vger.kernel.org (open list)

"S, Venkatraman" <svenkatr@ti.com> writes:

> On Sat, Jul 7, 2012 at 5:56 AM, Kevin Hilman <khilman@ti.com> wrote:
>> Due to the way the driver core takes runtime PM references during
>> probe, a driver's runtime PM callbacks may not be called until probe
>> returns.  During probe, drvdata is set to the 'host' pointer but if
>> probe fails, drvdata is set to NULL.
>>
>> The runtime PM callbacks currently blindly dereference this host
>> pointer, but if probe fails, and the runtime PM callbacks are called
>> after probe returns, a NULL pointer dereference will result.
>>
> Pardon my ignorance, but why would runtime suspend be called for a
> device whose probe has failed ? AFAIK, MMC stack wouldn't make the
> _enable()/disable() calls, which call runtime PM APIs, unless the
> probe is successful.  Is there a stacktrace for the offending caller ?

First thing to note: there are several error conditions *after*
pm_runtime_enable() and the first _get_sync() are called.

So here's what can happen:

The driver core does a _get_noresume() before calling the driver's
probe, so the runtime PM usecount is already 1 when the driver is
called.  The _get_sync() in _probe enables the device and increases the
usecount to 2.  The _put_sync() at the end of ->probe() will decrease
the usecount to 1, but since the usecount is still non-zero, the
driver's callbacks are still not called.  It's not until the driver core
does its _put_sync (to balance the _get_noresume() that the usecount
goes to zero and the driver's callbacks are called.

When probe succeeds, this all works fine.  However, if probe fails the
host pointer is set to NULL, but the runtime PM callbacks are still
called and attempt to dereference a NULL pointer.

>> Fix this by simply checking to be sure the host pointer is non-NULL.
>>
>> Signed-off-by: Kevin Hilman <khilman@ti.com>
>> ---
>> Applies to v3.5-rc5.  Fix is needed for v3.5-rc.
>>
> Can you please let me know which commit introduced this problem
> in the first place ? There were not many changes in MMC PM code
> in this merge window.

I guess this problem is probably not a regression and has been around
for some time.  It has not been seen because probe has not failed.

In using recent l-o master though, with Russell's dmaengine changes
merged, probe is failing for me becasue it can't allocate a DMA channel,
and then I'm seeing this problem.

Kevin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks
  2012-07-10 14:17   ` Kevin Hilman
@ 2012-07-10 14:47     ` S, Venkatraman
  2012-07-10 22:57       ` Kevin Hilman
  0 siblings, 1 reply; 4+ messages in thread
From: S, Venkatraman @ 2012-07-10 14:47 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Chris Ball, linux-omap, Balaji T K, Rajendra Nayak,
	Adrian Hunter, open list:MULTIMEDIA CARD (...),
	linux-kernel@vger.kernel.org (open list)

On Tue, Jul 10, 2012 at 7:47 PM, Kevin Hilman <khilman@ti.com> wrote:
> "S, Venkatraman" <svenkatr@ti.com> writes:
>
>> On Sat, Jul 7, 2012 at 5:56 AM, Kevin Hilman <khilman@ti.com> wrote:
>>> Due to the way the driver core takes runtime PM references during
>>> probe, a driver's runtime PM callbacks may not be called until probe
>>> returns.  During probe, drvdata is set to the 'host' pointer but if
>>> probe fails, drvdata is set to NULL.
>>>
>>> The runtime PM callbacks currently blindly dereference this host
>>> pointer, but if probe fails, and the runtime PM callbacks are called
>>> after probe returns, a NULL pointer dereference will result.
>>>
>> Pardon my ignorance, but why would runtime suspend be called for a
>> device whose probe has failed ? AFAIK, MMC stack wouldn't make the
>> _enable()/disable() calls, which call runtime PM APIs, unless the
>> probe is successful.  Is there a stacktrace for the offending caller ?
>
> First thing to note: there are several error conditions *after*
> pm_runtime_enable() and the first _get_sync() are called.
>
> So here's what can happen:
>
> The driver core does a _get_noresume() before calling the driver's
> probe, so the runtime PM usecount is already 1 when the driver is
> called.  The _get_sync() in _probe enables the device and increases the
> usecount to 2.  The _put_sync() at the end of ->probe() will decrease
> the usecount to 1, but since the usecount is still non-zero, the
> driver's callbacks are still not called.  It's not until the driver core
> does its _put_sync (to balance the _get_noresume() that the usecount
> goes to zero and the driver's callbacks are called.
>
Thanks for the detailed explanation.
But pm_runtime_disable() is called in the error path, so shouldn't
it prevent the driver callbacks from being called ? (The docuementation
talks about disable_depth field, but it doesn't sound right that runtime
pm would be enabled before _probe() is called. So disable_depth should
be 1 when pm_runtime_disable is called in _probe error path, right ?)

> When probe succeeds, this all works fine.  However, if probe fails the
> host pointer is set to NULL, but the runtime PM callbacks are still
> called and attempt to dereference a NULL pointer.
>
>>> Fix this by simply checking to be sure the host pointer is non-NULL.
>>>
>>> Signed-off-by: Kevin Hilman <khilman@ti.com>
>>> ---
>>> Applies to v3.5-rc5.  Fix is needed for v3.5-rc.
>>>
>> Can you please let me know which commit introduced this problem
>> in the first place ? There were not many changes in MMC PM code
>> in this merge window.
>
> I guess this problem is probably not a regression and has been around
> for some time.  It has not been seen because probe has not failed.
>
> In using recent l-o master though, with Russell's dmaengine changes
> merged, probe is failing for me becasue it can't allocate a DMA channel,
> and then I'm seeing this problem.
>
> Kevin

Ok. I'll let Chris decide on this - DMA engine is not yet in Linus's tree
though.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks
  2012-07-10 14:47     ` S, Venkatraman
@ 2012-07-10 22:57       ` Kevin Hilman
  0 siblings, 0 replies; 4+ messages in thread
From: Kevin Hilman @ 2012-07-10 22:57 UTC (permalink / raw)
  To: S, Venkatraman
  Cc: Chris Ball, linux-omap, Balaji T K, Rajendra Nayak,
	Adrian Hunter, open list:MULTIMEDIA CARD (...),
	linux-kernel@vger.kernel.org (open list)

"S, Venkatraman" <svenkatr@ti.com> writes:

> On Tue, Jul 10, 2012 at 7:47 PM, Kevin Hilman <khilman@ti.com> wrote:
>> "S, Venkatraman" <svenkatr@ti.com> writes:
>>
>>> On Sat, Jul 7, 2012 at 5:56 AM, Kevin Hilman <khilman@ti.com> wrote:
>>>> Due to the way the driver core takes runtime PM references during
>>>> probe, a driver's runtime PM callbacks may not be called until probe
>>>> returns.  During probe, drvdata is set to the 'host' pointer but if
>>>> probe fails, drvdata is set to NULL.
>>>>
>>>> The runtime PM callbacks currently blindly dereference this host
>>>> pointer, but if probe fails, and the runtime PM callbacks are called
>>>> after probe returns, a NULL pointer dereference will result.
>>>>
>>> Pardon my ignorance, but why would runtime suspend be called for a
>>> device whose probe has failed ? AFAIK, MMC stack wouldn't make the
>>> _enable()/disable() calls, which call runtime PM APIs, unless the
>>> probe is successful.  Is there a stacktrace for the offending caller ?
>>
>> First thing to note: there are several error conditions *after*
>> pm_runtime_enable() and the first _get_sync() are called.
>>
>> So here's what can happen:
>>
>> The driver core does a _get_noresume() before calling the driver's
>> probe, so the runtime PM usecount is already 1 when the driver is
>> called.  The _get_sync() in _probe enables the device and increases the
>> usecount to 2.  The _put_sync() at the end of ->probe() will decrease
>> the usecount to 1, but since the usecount is still non-zero, the
>> driver's callbacks are still not called.  It's not until the driver core
>> does its _put_sync (to balance the _get_noresume() that the usecount
>> goes to zero and the driver's callbacks are called.
>>
> Thanks for the detailed explanation.
> But pm_runtime_disable() is called in the error path, so shouldn't
> it prevent the driver callbacks from being called ? 

Yes, you're right.  

I'm getting a few different problems mixed together here.  To work on a
different problem, I had commented out the pm_runtime_disable() call in
the error path and so I was seeing the callbacks being called even on
probe failure.  Withe the pm_runtime_disable() added back, that doesn't
happen.

However, there is another case where the runtime PM callbacks may be
called: the PM domain code in omap_device can still attempts to call the
runtime PM callbacks during late suspend for devices that are not
already runtime suspended.  This is done even when there is no driver
bound to a device, but this is a bug in the omap_device code, not in the
MMC driver.  So I'll be sending a patch shortly to fix that.

So, all that to say, this patch can be dropped.

Thanks for the detailed review.

Kevin

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-07-10 22:57 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1341620786-13598-1-git-send-email-khilman@ti.com>
2012-07-10  4:25 ` [PATCH] mmc: omap_hsmmc: fix runtime suspend crash: add host checks in callbacks S, Venkatraman
2012-07-10 14:17   ` Kevin Hilman
2012-07-10 14:47     ` S, Venkatraman
2012-07-10 22:57       ` Kevin Hilman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.