linux-kernel.vger.kernel.org archive mirror
* [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
@ 2016-01-28  9:42 Haibo Chen
  2016-01-28 10:20 ` Russell King - ARM Linux
  0 siblings, 1 reply; 8+ messages in thread
From: Haibo Chen @ 2016-01-28  9:42 UTC (permalink / raw)
  To: ulf.hansson; +Cc: rmk+kernel, haibo.chen, linux-mmc, linux-kernel

Currently the sdhci driver frees the irq in host suspend and calls
request_threaded_irq() in host resume. But during host resume,
Ctrl+C can make the sdhci host resume fail; see the error log:

CPU1 is up
PM: noirq resume of devices complete after 0.637 msecs
imx-sdma 30bd0000.sdma: loaded firmware 4.1
PM: early resume of devices complete after 0.774 msecs
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b40000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b50000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b60000.usdhc failed to resume: error -4
fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: error -110 during resume (card was removed?)
mmc2: Timeout waiting for hardware interrupt.
mmc2: Timeout waiting for hardware interrupt.
mmc2: error -110 during resume (card was removed?)

In request_threaded_irq() -> __setup_irq() -> kthread_create()
-> kthread_create_on_node(), the comment indicates that a SIGKILL can
interrupt kthread creation, which then returns -EINTR.

This patch replaces the free/request pair with disable_irq()/enable_irq(),
which prevents IRQs from being propagated to the sdhci driver.

Fixes: 781e989cf593 ("mmc: sdhci: convert to new SDIO IRQ handling")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
---
 drivers/mmc/host/sdhci.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index d622435..4b1646b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
 		host->ier = 0;
 		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
 		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
-		free_irq(host->irq, host);
+		disable_irq(host->irq);
 	} else {
 		sdhci_enable_irq_wakeups(host);
 		enable_irq_wake(host->irq);
@@ -2698,8 +2698,6 @@ EXPORT_SYMBOL_GPL(sdhci_suspend_host);
 
 int sdhci_resume_host(struct sdhci_host *host)
 {
-	int ret = 0;
-
 	if (host->flags & (SDHCI_USE_SDMA | SDHCI_USE_ADMA)) {
 		if (host->ops->enable_dma)
 			host->ops->enable_dma(host);
@@ -2718,11 +2716,7 @@ int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
-		ret = request_threaded_irq(host->irq, sdhci_irq,
-					   sdhci_thread_irq, IRQF_SHARED,
-					   mmc_hostname(host->mmc), host);
-		if (ret)
-			return ret;
+		enable_irq(host->irq);
 	} else {
 		sdhci_disable_irq_wakeups(host);
 		disable_irq_wake(host->irq);
@@ -2730,7 +2724,7 @@ int sdhci_resume_host(struct sdhci_host *host)
 
 	sdhci_enable_card_detection(host);
 
-	return ret;
+	return 0;
 }
 
 EXPORT_SYMBOL_GPL(sdhci_resume_host);
-- 
1.9.1


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28  9:42 [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq Haibo Chen
@ 2016-01-28 10:20 ` Russell King - ARM Linux
  2016-01-28 15:47   ` Ulf Hansson
  0 siblings, 1 reply; 8+ messages in thread
From: Russell King - ARM Linux @ 2016-01-28 10:20 UTC (permalink / raw)
  To: Haibo Chen; +Cc: ulf.hansson, linux-mmc, linux-kernel

On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
> Currently sdhci driver free irq in host suspend, and call
> request_threaded_irq() in host resume. But during host resume,
> Ctrl+C can impact sdhci host resume, see the error log:

Ctrl+C should have no effect on this - that seems to imply that there's
some other bug elsewhere.

> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index d622435..4b1646b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
>  		host->ier = 0;
>  		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
>  		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
> -		free_irq(host->irq, host);
> +		disable_irq(host->irq);

This is really not acceptable I'm afraid.  While it's common on ARM for
each interrupt to be uniquely allocated to a peripheral, not all SDHCI
platforms have that luxury.

SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
interrupts shared between (sometimes many) different PCI devices.

For example, on my laptop:

 18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0

the SDHCI interrupt is shared with two other peripherals - one USB
controller and a NAND device.

Disabling the interrupt will adversely impact other peripherals and
cause regressions where the interrupt is shared.

So, I'm afraid I'm going to have to NAK this patch.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 10:20 ` Russell King - ARM Linux
@ 2016-01-28 15:47   ` Ulf Hansson
  2016-01-28 16:21     ` Thomas Gleixner
  2016-01-28 16:38     ` Russell King - ARM Linux
  0 siblings, 2 replies; 8+ messages in thread
From: Ulf Hansson @ 2016-01-28 15:47 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Haibo Chen, linux-mmc, linux-kernel, Thomas Gleixner, Jon Hunter

+tglx, Jon

On 28 January 2016 at 11:20, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
>> Currently sdhci driver free irq in host suspend, and call
>> request_threaded_irq() in host resume. But during host resume,
>> Ctrl+C can impact sdhci host resume, see the error log:
>
> Ctrl+C should have no effect on this - that seems to imply that there's
> some other bug elsewhere.
>
>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
>> index d622435..4b1646b 100644
>> --- a/drivers/mmc/host/sdhci.c
>> +++ b/drivers/mmc/host/sdhci.c
>> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
>>               host->ier = 0;
>>               sdhci_writel(host, 0, SDHCI_INT_ENABLE);
>>               sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
>> -             free_irq(host->irq, host);
>> +             disable_irq(host->irq);
>
> This is really not acceptable I'm afraid.  While it's common on ARM for
> each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> platforms have that luxury.
>
> SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> interrupts shared between (sometimes many) different PCI devices.
>
> For example, on my laptop:
>
>  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
>
> the SDHCI interrupt is shared with two other peripherals - one USB
> controller and a NAND device.
>
> Disabling the interrupt will adversely impact other peripherals and
> cause regressions where the interrupt is shared.

I thought disable|enable_irq() was reference counted, so it
shouldn't impact the other peripherals for shared IRQs. I might have
misunderstood this though!?

Although, if that's the case, it also means that the IRQ can still
reach sdhci's irq handler as it hasn't actually been disabled.

Therefore, the only way we can currently make sure we don't get the
IRQ is to free and later re-request it. Now, apparently that has
issues when using threaded IRQ handlers.

I have recently discussed a related change to the genirq framework,
where we concluded that, in principle, we need a new API
to deal with PM related enable/disable IRQ cases.
http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

Perhaps that's actually what we need to cover this case.

>
> So, I'm afraid I'm going to have to NAK this patch.

I agree. We need another solution!

Kind regards
Uffe


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 15:47   ` Ulf Hansson
@ 2016-01-28 16:21     ` Thomas Gleixner
  2016-01-28 16:27       ` Thomas Gleixner
  2016-01-28 16:38     ` Russell King - ARM Linux
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-01-28 16:21 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Russell King - ARM Linux, Haibo Chen, linux-mmc, linux-kernel,
	Jon Hunter

On Thu, 28 Jan 2016, Ulf Hansson wrote:
> On 28 January 2016 at 11:20, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >> -             free_irq(host->irq, host);
> >> +             disable_irq(host->irq);
> >
> > This is really not acceptable I'm afraid.  While it's common on ARM for
> > each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> > platforms have that luxury.
> >
> > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> > interrupts shared between (sometimes many) different PCI devices.
> >
> > For example, on my laptop:
> >
> >  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
> >
> > the SDHCI interrupt is shared with two other peripherals - one USB
> > controller and a NAND device.
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
> 
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

It's reference counted. But it disables the irq line and not a particular
interrupt handler.
 
> Although, as if that's the case it also means that the IRQ can still
> reach sdhci's irq handler as it hasn't actually been disabled.

No. The result is that the other devices on the same irq line won't get any
interrupt anymore.

> Therefore, the only way we currently can make sure to don't get the
> IRQ is to free and later re-request it. Now, apparently that has
> issues when using threaded IRQ handlers.

What's the issue?
 
Thanks,

	tglx


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 16:21     ` Thomas Gleixner
@ 2016-01-28 16:27       ` Thomas Gleixner
  2017-12-27  2:54         ` Peng Fan
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-01-28 16:27 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Russell King - ARM Linux, Haibo Chen, linux-mmc, linux-kernel,
	Jon Hunter

On Thu, 28 Jan 2016, Thomas Gleixner wrote:
> On Thu, 28 Jan 2016, Ulf Hansson wrote:
> > Therefore, the only way we currently can make sure to don't get the
> > IRQ is to free and later re-request it. Now, apparently that has
> > issues when using threaded IRQ handlers.
> 
> What's the issue?

Ah, you mean that one:

> Currently sdhci driver free irq in host suspend, and call
> request_threaded_irq() in host resume. But during host resume,
> Ctrl+C can impact sdhci host resume, see the error log:

> CPU1 is up
> PM: noirq resume of devices complete after 0.637 msecs imx-sdma 30bd0000.sdma: loaded firmware 4.1
> PM: early resume of devices complete after 0.774 msecs
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b40000.usdhc failed to resume: error -4
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b50000.usdhc failed to resume: error -4
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b60000.usdhc failed to resume: error -4 fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: error -110 during resume (card was removed?)
> mmc2: Timeout waiting for hardware interrupt.
> mmc2: Timeout waiting for hardware interrupt.
> mmc2: error -110 during resume (card was removed?)

In request_threaded_irq-> __setup_irq-> kthread_create
->kthread_create_on_node, the comment shows that SIGKILLed will
impact the kthread create, and return -EINTR.

And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
not affect any kernel internal thread. Hitting Ctrl+C affects solely the
process which is running on that console.

And if it would, then that would be a completely different, serious bug which
needs to be fixed.

How was it verified that the thread was not created and that the creation
failed due to a SIGKILL?

Thanks,

	tglx


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 15:47   ` Ulf Hansson
  2016-01-28 16:21     ` Thomas Gleixner
@ 2016-01-28 16:38     ` Russell King - ARM Linux
  1 sibling, 0 replies; 8+ messages in thread
From: Russell King - ARM Linux @ 2016-01-28 16:38 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Haibo Chen, linux-mmc, linux-kernel, Thomas Gleixner, Jon Hunter

On Thu, Jan 28, 2016 at 04:47:23PM +0100, Ulf Hansson wrote:
> +tglx, Jon
> 
> On 28 January 2016 at 11:20, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
> >> Currently sdhci driver free irq in host suspend, and call
> >> request_threaded_irq() in host resume. But during host resume,
> >> Ctrl+C can impact sdhci host resume, see the error log:
> >
> > Ctrl+C should have no effect on this - that seems to imply that there's
> > some other bug elsewhere.
> >
> >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> >> index d622435..4b1646b 100644
> >> --- a/drivers/mmc/host/sdhci.c
> >> +++ b/drivers/mmc/host/sdhci.c
> >> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
> >>               host->ier = 0;
> >>               sdhci_writel(host, 0, SDHCI_INT_ENABLE);
> >>               sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
> >> -             free_irq(host->irq, host);
> >> +             disable_irq(host->irq);
> >
> > This is really not acceptable I'm afraid.  While it's common on ARM for
> > each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> > platforms have that luxury.
> >
> > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> > interrupts shared between (sometimes many) different PCI devices.
> >
> > For example, on my laptop:
> >
> >  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
> >
> > the SDHCI interrupt is shared with two other peripherals - one USB
> > controller and a NAND device.
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
> 
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

They are.  When anything disables an IRQ, the IRQ is disabled.  Only
once the N disable_irq()s have been balanced with N enable_irq()s will
the interrupt be re-enabled.  disable_irq() doesn't work on a per-device
level, but on a per-interrupt line level.

So, if sdhci calls disable_irq() in its suspend path, it disables
the IRQ for _everything_ that's sharing that interrupt.  If (eg) USB or
r852 needs an interrupt to complete its own suspend, it won't see that
interrupt because SDHCI disabled it.
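
As a rough illustration of the point above, here is a small userspace
model of a shared interrupt line with a nested disable count. It is only
a sketch: the model_* names and struct model_irq_line are hypothetical
and are not the kernel's irq_desc or genirq API. It only shows that the
count belongs to the line, so one sharer disabling it silences every
handler on that line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct irq_desc. */
#define MODEL_MAX_HANDLERS 4

typedef bool (*model_handler_t)(void *dev_id);

struct model_irq_line {
	unsigned int depth;		/* nested disable count for the whole line */
	size_t nr_handlers;		/* number of devices sharing the line */
	model_handler_t handler[MODEL_MAX_HANDLERS];
	void *dev_id[MODEL_MAX_HANDLERS];
};

/* Register one more sharer on the line. */
static void model_request_irq(struct model_irq_line *line,
			      model_handler_t fn, void *dev_id)
{
	assert(line->nr_handlers < MODEL_MAX_HANDLERS);
	line->handler[line->nr_handlers] = fn;
	line->dev_id[line->nr_handlers] = dev_id;
	line->nr_handlers++;
}

/* Disabling is per line, not per handler: there is no dev_id argument. */
static void model_disable_irq(struct model_irq_line *line)
{
	line->depth++;
}

/* The line is live again only once every disable has been balanced. */
static void model_enable_irq(struct model_irq_line *line)
{
	assert(line->depth > 0);
	line->depth--;
}

/* Deliver one interrupt; returns how many handlers were invoked. */
static size_t model_raise_irq(struct model_irq_line *line)
{
	size_t i;

	if (line->depth > 0)
		return 0;	/* masked for every sharer */

	for (i = 0; i < line->nr_handlers; i++)
		line->handler[i](line->dev_id[i]);

	return line->nr_handlers;
}

/* Example handler: dev_id points at a per-device hit counter. */
static bool model_count_handler(void *dev_id)
{
	unsigned int *hits = dev_id;

	(*hits)++;
	return true;
}
```

With two counters registered (say one standing in for USB and one for
mmc0), a single model_disable_irq() makes model_raise_irq() reach
neither of them until the matching model_enable_irq().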

It might appear to work as-is even so, if SDHCI happens to be (on the test
setup) suspended after (eg) both the USB and r852 drivers.

It is probably much better if SDHCI writes to the device on suspend to
disable interrupts, synchronise with the IRQ, and then set a flag to
indicate that the interrupt handler should immediately return IRQ_NONE
in case any of the other peripherals sharing the IRQ line trigger an
interrupt.

> I have recently discussed a related change on the genirq framework,
> which in principle turned out that we concluded on needing a new API
> to deal with PM related enable/disable IRQ cases.
> http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

I haven't read your link, but I don't think we really need yet more
APIs to deal with this, except possibly one thing - a way to tell
genirq that a specific IRQ handler should not be called because its
device is suspended.

IOW, moving:

static irqreturn_t foo_device_irq(int irq, void *dev_id)
{
	struct foo_device_priv *priv = dev_id;

	/* Device is suspended: this interrupt cannot be ours. */
	if (priv->suspended)
		return IRQ_NONE;

	/* ... rest of IRQ handling ... */
}

into genirq code, so that we don't end up with that pattern repeated
many times in drivers.
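
To make the idea concrete, here is a userspace sketch of what moving
that check into the core dispatch loop could look like. Everything in it
is hypothetical: struct model_irqaction, its suspended flag and the
model_* functions are made up for illustration and do not correspond to
an existing genirq interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for irqreturn_t values. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

typedef enum model_irqreturn (*model_irq_fn)(int irq, void *dev_id);

/* One entry per device sharing the line. */
struct model_irqaction {
	model_irq_fn handler;
	void *dev_id;
	bool suspended;		/* assumed flag, set from the driver's suspend path */
};

/* What a driver would call instead of keeping its own priv->suspended. */
static void model_set_suspended(struct model_irqaction *action, bool suspended)
{
	assert(action != NULL);
	action->suspended = suspended;
}

/*
 * Core dispatch: walk every action on the shared line and skip the ones
 * whose device is suspended, so the handlers need no check of their own.
 */
static enum model_irqreturn
model_handle_shared_irq(int irq, struct model_irqaction *actions, size_t n)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (actions[i].suspended)
			continue;
		if (actions[i].handler(irq, actions[i].dev_id) == MODEL_IRQ_HANDLED)
			ret = MODEL_IRQ_HANDLED;
	}

	return ret;
}

/* Example handler: dev_id points at a per-device hit counter. */
static enum model_irqreturn model_count_irq(int irq, void *dev_id)
{
	unsigned int *hits = dev_id;

	(void)irq;
	(*hits)++;
	return MODEL_IRQ_HANDLED;
}
```

Unlike disable_irq(), marking one action as suspended here leaves the
other sharers of the line fully serviced.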

It may be that's exactly what's being proposed in the link, but as I
say, I've not read it yet.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 16:27       ` Thomas Gleixner
@ 2017-12-27  2:54         ` Peng Fan
  2021-06-11 14:22           ` Martin Kaiser
  0 siblings, 1 reply; 8+ messages in thread
From: Peng Fan @ 2017-12-27  2:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ulf Hansson, Russell King - ARM Linux, Haibo Chen, linux-mmc,
	linux-kernel, Jon Hunter, aisheng.dong

Hi All,

Sorry for bringing back this old topic again.

On Thu, Jan 28, 2016 at 05:27:46PM +0100, Thomas Gleixner wrote:
>On Thu, 28 Jan 2016, Thomas Gleixner wrote:
>> On Thu, 28 Jan 2016, Ulf Hansson wrote:
>> > Therefore, the only way we currently can make sure to don't get the
>> > IRQ is to free and later re-request it. Now, apparently that has
>> > issues when using threaded IRQ handlers.
>> 
>> What's the issue?
>
>Ah, you mean that one:
>
>> Currently sdhci driver free irq in host suspend, and call
>> request_threaded_irq() in host resume. But during host resume,
>> Ctrl+C can impact sdhci host resume, see the error log:
>
>> CPU1 is up
>> PM: noirq resume of devices complete after 0.637 msecs imx-sdma 30bd0000.sdma: loaded firmware 4.1
>> PM: early resume of devices complete after 0.774 msecs
>> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
>> PM: Device 30b40000.usdhc failed to resume: error -4
>> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
>> PM: Device 30b50000.usdhc failed to resume: error -4
>> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
>> PM: Device 30b60000.usdhc failed to resume: error -4 fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: Timeout waiting for hardware interrupt.
>> mmc0: error -110 during resume (card was removed?)
>> mmc2: Timeout waiting for hardware interrupt.
>> mmc2: Timeout waiting for hardware interrupt.
>> mmc2: error -110 during resume (card was removed?)
>
>In request_threaded_irq-> __setup_irq-> kthread_create
>->kthread_create_on_node, the comment shows that SIGKILLed will
>impact the kthread create, and return -EINTR.
>
>And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
>not affect any kernel internal thread. Hitting Ctrl+C affects solely the
>process which is running on that console.
>
>And if it would, then that would be a completely different, serious bug which
>needs to be fixed.
>
>How was verified, that the thread was not created and that the creation failed
>due to a SIGKILL?

This is the testcase:
"/unit_tests/SRTC/rtcwakeup.out -d rtc0 -m mem -s 2;"
It does the equivalent of "echo mem > /sys/power/state"; the rtc
interrupt then wakes up the system.

My understanding is:
The suspend/resume sequence runs in the rtcwakeup.out process context;
during resume, "get_current()->comm" shows "rtcwakeup.out". So if we
send SIGKILL from userspace, an interrupt will occur, and the interrupt
handler will return directly to kernel space to continue resuming.

In __setup_irq->kthread_create->wait_for_completion_killable,
wait_for_completion_killable sees the pending SIGKILL and the thread
creation fails with -EINTR, so the sdhci resume fails because the sdhci
interrupt thread is not created.

During suspend/resume, the OOM killer is disabled and later re-enabled.
When request_threaded_irq is called in sdhci resume, the OOM killer is
still disabled. According to the kthread_create comment on
wait_for_completion_killable, the killable wait is there to catch an
OOM SIGKILL. But during resume the OOM killer is disabled, so how about
the following patch to disable SIGKILL for a short while?

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e9290a3439d5..84c4c99b1acb 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -19,6 +19,7 @@
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/dma-mapping.h>
+#include <linux/signal.h>
 #include <linux/slab.h>
 #include <linux/scatterlist.h>
 #include <linux/swiotlb.h>
@@ -2895,9 +2896,11 @@ int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
+		disallow_signal(SIGKILL);
 		ret = request_threaded_irq(host->irq, sdhci_irq,
 					   sdhci_thread_irq, IRQF_SHARED,
 					   mmc_hostname(host->mmc), host);
+		allow_signal(SIGKILL);
 		if (ret)
 			return ret;
 	} else {

Thanks,
Peng.

>
>Thanks,
>
>	tglx

-- 


* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2017-12-27  2:54         ` Peng Fan
@ 2021-06-11 14:22           ` Martin Kaiser
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Kaiser @ 2021-06-11 14:22 UTC (permalink / raw)
  To: Peng Fan
  Cc: Thomas Gleixner, Ulf Hansson, Russell King - ARM Linux,
	Haibo Chen, linux-mmc, linux-kernel, Jon Hunter, aisheng.dong

Dear all,

I have to resurrect this discussion once again. The issue is still present
in linux-next and can be reproduced easily on my imx25-based system.

Thus wrote Peng Fan (van.freenix@gmail.com):

> On Thu, Jan 28, 2016 at 05:27:46PM +0100, Thomas Gleixner wrote:
> >On Thu, 28 Jan 2016, Thomas Gleixner wrote:
> >> On Thu, 28 Jan 2016, Ulf Hansson wrote:
> >> > Therefore, the only way we currently can make sure to don't get the
> >> > IRQ is to free and later re-request it. Now, apparently that has
> >> > issues when using threaded IRQ handlers.

> >> What's the issue?

> >Ah, you mean that one:

> >> Currently sdhci driver free irq in host suspend, and call
> >> request_threaded_irq() in host resume. But during host resume,
> >> Ctrl+C can impact sdhci host resume, see the error log:

> >> [...]

My test setup uses rtc as a wakeup source. Additionally, I also define my
console uart as wakeup source

echo enabled > /sys/class/tty/ttymxc3/power/wakeup

and then run rtcwake in a loop

while true ; do rtcwake -s 2 -m mem ; done

Pressing Ctrl-C while the system is sleeping reproduces the problem quickly.

sdhci_resume_host fails because the kthread for the threaded irq can't be
created.

ps confirms that there's no [irq/25-mmc0] kernel thread any more and sdhci
starts printing register dumps periodically

[  101.603339] mmc0: Timeout waiting for hardware cmd interrupt.
[  101.609225] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[  101.615725] mmc0: sdhci: Sys addr:  0x00000008 | Version:  0x00001001
...
[  101.700086] mmc0: sdhci: Host ctl2: 0x00000000
[  101.704568] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[  101.712179] mmc0: sdhci-esdhc-imx: cmd debug status:  0x0000
...
[  101.746827] mmc0: sdhci-esdhc-imx: async fifo debug status:  0x0000
[  101.753137] mmc0: sdhci: ============================================


I guess that we're taking this path from sdhci_resume_host

request_threaded_irq
   __setup_irq
      setup_irq_thread
         kthread_create
            kthread_create_on_node
               __kthread_create_on_node
                  wake_up_process(kthreadd_task);
                  if (unlikely(wait_for_completion_killable(&done))) {
                     if (xchg(&create->done, NULL))
                        return ERR_PTR(-EINTR);

I can confirm Peng's observation that current->pid in
__kthread_create_on_node is the pid of the rtcwake process.


This is an ftrace log of the error case, I used
trace-cmd start -e task -e mmc -e power -e signal -p function -l tty_flip_buffer_push -l sdhci_resume_host -l __kthread_create_on_node

         rtcwake-185     [000] .n..   321.166724: device_pm_callback_start: imx-uart 50008000.serial, parent: 50000000.spba, bus [resume]
         rtcwake-185     [000] dnh.   321.166823: tty_flip_buffer_push <-__imx_uart_rxint.constprop.0
         rtcwake-185     [000] .n..   321.166906: device_pm_callback_end: imx-uart 50008000.serial, err=0
...
         rtcwake-185     [000] .n..   321.167472: device_pm_callback_start: imx_rngc 53fb0000.rngb, parent: 53f00000.bus, bus [resume]
         rtcwake-185     [000] .n..   321.167486: device_pm_callback_end: imx_rngc 53fb0000.rngb, err=0
         rtcwake-185     [000] .n..   321.167502: device_pm_callback_start: sdhci-esdhc-imx 53fb4000.mmc, parent: 53f00000.bus, bus [resume]
         rtcwake-185     [000] .n..   321.167516: sdhci_resume_host <-sdhci_esdhc_resume
         rtcwake-185     [000] .n..   321.167584: __kthread_create_on_node <-kthread_create_on_node
         rtcwake-185     [000] .n..   321.167652: device_pm_callback_end: sdhci-esdhc-imx 53fb4000.mmc, err=-4
         rtcwake-185     [000] .n..   321.167922: device_pm_callback_start: imx-fb 53fbc000.lcdc, parent: 53f00000.bus, bus [resume]
         rtcwake-185     [000] .n..   321.167978: device_pm_callback_end: imx-fb 53fbc000.lcdc, err=0

If wakeup is successful, the irq's thread is created

         rtcwake-168     [000] .n..   320.566614: device_pm_callback_start: imx-uart 50008000.serial, parent: 50000000.spba, bus [resume]
         rtcwake-168     [000] dnh.   320.566718: tty_flip_buffer_push <-__imx_uart_rxint.constprop.0
         rtcwake-168     [000] .n..   320.566808: device_pm_callback_end: imx-uart 50008000.serial, err=0
...
         rtcwake-168     [000] .n..   320.567359: device_pm_callback_start: imx_rngc 53fb0000.rngb, parent: 53f00000.bus, bus [resume]
         rtcwake-168     [000] .n..   320.567372: device_pm_callback_end: imx_rngc 53fb0000.rngb, err=0
         rtcwake-168     [000] .n..   320.567386: device_pm_callback_start: sdhci-esdhc-imx 53fb4000.mmc, parent: 53f00000.bus, bus [resume]
         rtcwake-168     [000] .n..   320.567402: sdhci_resume_host <-sdhci_esdhc_resume
         rtcwake-168     [000] .n..   320.567460: __kthread_create_on_node <-kthread_create_on_node
        kthreadd-2       [000] ....   320.567922: task_newtask: pid=174 comm=kthreadd clone_flags=800700 oom_score_adj=0
    kworker/u2:3-173     [000] ....   320.568183: __kthread_create_on_node <-kthread_create_on_node
        kthreadd-2       [000] ....   320.568504: task_newtask: pid=175 comm=kthreadd clone_flags=800700 oom_score_adj=0
         rtcwake-168     [000] ....   320.568659: task_rename: pid=174 oldcomm=kthreadd newcomm=irq/25-mmc0 oom_score_adj=0
         rtcwake-168     [000] .n..   320.568824: device_pm_callback_end: sdhci-esdhc-imx 53fb4000.mmc, err=0

> >In request_threaded_irq-> __setup_irq-> kthread_create
> >->kthread_create_on_node, the comment shows that SIGKILLed will
> >impact the kthread create, and return -EINTR.

> >And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
> >not affect any kernel internal thread. Hitting Ctrl+C affects solely the
> >process which is running on that console.

> >And if it would, then that would be a completely different, serious bug which
> >needs to be fixed.

> >How was verified, that the thread was not created and that the creation failed
> >due to a SIGKILL?

See above, the irq thread is missing, it was running before the
suspend+wakeup. I hope that the ftrace output confirms my assumption about
the code path that was taken. I didn't find any other way we could return
-EINTR in sdhci_resume_host.

> My understanding is:
> The issue is during suspend resume, it is in rtwakeup.out process space,
> during resume, "get_current()->comm" shows "rtcwakeup.out", so if we
> send SIGKILL from userspace, a interrupt will occur, interrupt
> handler will directly return to kernel space to continue resuming.

> __setup_irq->kthread_create->wait_for_completion_killable, here
> wait_for_completion_killable see SIGKILL pending and return -EINTR,
> then sdhci resume process failure, because of sdhci interrupt thread
> not created.

> During suspend/resume, OOM Killer will be disabled and enalbed. When
> request_threaded_irq in sdhci resume, OOM Killer is still disabled.
> According to kthread_create comments for wait_for_completion_killable,
> using killable is to catch OOM sigkill. But during resume, OOM Killer
> is disabled, So how about the following patch to disable SIGKILL for
> a short while?

I tried this patch, it didn't fix the problem for me.

I could make the problem disappear if I moved sdhci_resume_host's
request_threaded_irq call into a worker. Instead of calling
request_threaded_irq, I'd schedule the work on system_unbound_wq. The new
thread for sdhci's irq is then requested by someone other than rtcwake.

Generally, it makes sense to me that kthreadd aborts a request for a new
thread if the requester is killed during the request. However, in the case
of sdhci resume, the thread should always be created, regardless of the
requester's state...
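
To spell out why the workqueue hack changes the outcome, here is a toy
userspace model of the behaviour described above. It is a sketch under
loud assumptions: struct model_task and the model_* functions are
invented for illustration, and the whole killable wait is reduced to a
single flag on the requesting task. The only thing it shows is that the
result depends on who asks for the thread, not on the device being
resumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task that requests the new kthread. */
struct model_task {
	const char *comm;		/* e.g. "rtcwake" or a kworker name */
	bool fatal_signal_pending;	/* assumed: set once the task is being killed */
};

/*
 * Model of the killable wait in the creation path: if the requester has
 * a fatal signal pending, the request is abandoned with -EINTR.
 */
static int model_kthread_create(const struct model_task *requester)
{
	assert(requester != NULL);

	if (requester->fatal_signal_pending)
		return -EINTR;

	return 0;
}

/* Resume runs in the context of whoever triggered the suspend. */
static int model_resume_direct(const struct model_task *current_task)
{
	return model_kthread_create(current_task);
}

/*
 * Deferred variant: the request is issued from a worker, so the state
 * of the task that triggered the suspend no longer matters.
 */
static int model_resume_deferred(const struct model_task *current_task,
				 const struct model_task *worker)
{
	(void)current_task;
	return model_kthread_create(worker);
}
```

On Linux EINTR is 4, which matches the err=-4 in the trace above when
the requester is the killed rtcwake process.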

Of course, the workqueue hack is not an acceptable way to fix this. I'd
appreciate if anyone could point me in the right direction for a proper fix.

Thanks in advance for your help,

   Martin

> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index e9290a3439d5..84c4c99b1acb 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -19,6 +19,7 @@
>  #include <linux/io.h>
>  #include <linux/module.h>
>  #include <linux/dma-mapping.h>
> +#include <linux/signal.h>
>  #include <linux/slab.h>
>  #include <linux/scatterlist.h>
>  #include <linux/swiotlb.h>
> @@ -2895,9 +2896,11 @@ int sdhci_resume_host(struct sdhci_host *host)
>  	}

>  	if (!device_may_wakeup(mmc_dev(host->mmc))) {
> +		disallow_signal(SIGKILL);
>  		ret = request_threaded_irq(host->irq, sdhci_irq,
>  					   sdhci_thread_irq, IRQF_SHARED,
>  					   mmc_hostname(host->mmc), host);
> +		allow_signal(SIGKILL);
>  		if (ret)
>  			return ret;
>  	} else {

> Thanks,
> Peng.


> >Thanks,

> >	tglx

> -- 


