* [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
@ 2016-01-28  9:42 Haibo Chen
  2016-01-28 10:20 ` Russell King - ARM Linux
  0 siblings, 1 reply; 8+ messages in thread
From: Haibo Chen @ 2016-01-28  9:42 UTC (permalink / raw)
  To: ulf.hansson; +Cc: rmk+kernel, haibo.chen, linux-mmc, linux-kernel

Currently sdhci driver free irq in host suspend, and call
request_threaded_irq() in host resume. But during host resume,
Ctrl+C can impact sdhci host resume, see the error log:

CPU1 is up
PM: noirq resume of devices complete after 0.637 msecs
imx-sdma 30bd0000.sdma: loaded firmware 4.1
PM: early resume of devices complete after 0.774 msecs
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b40000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b50000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b60000.usdhc failed to resume: error -4
fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: error -110 during resume (card was removed?)
mmc2: Timeout waiting for hardware interrupt.
mmc2: Timeout waiting for hardware interrupt.
mmc2: error -110 during resume (card was removed?)

In request_threaded_irq-> __setup_irq-> kthread_create
->kthread_create_on_node, the comment shows that SIGKILLed will
impact the kthread create, and return -EINTR.

This patch replace them with disable|enable_irq(), that will prevent
IRQs from being propagated to the sdhci driver.

Fixes: 781e989cf593 ("mmc: sdhci: convert to new SDIO IRQ handling")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
---
 drivers/mmc/host/sdhci.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index d622435..4b1646b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
 		host->ier = 0;
 		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
 		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
-		free_irq(host->irq, host);
+		disable_irq(host->irq);
 	} else {
 		sdhci_enable_irq_wakeups(host);
 		enable_irq_wake(host->irq);
@@ -2698,8 +2698,6 @@ EXPORT_SYMBOL_GPL(sdhci_suspend_host);
 
 int sdhci_resume_host(struct sdhci_host *host)
 {
-	int ret = 0;
-
 	if (host->flags & (SDHCI_USE_SDMA | SDHCI_USE_ADMA)) {
 		if (host->ops->enable_dma)
 			host->ops->enable_dma(host);
@@ -2718,11 +2716,7 @@ int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
-		ret = request_threaded_irq(host->irq, sdhci_irq,
-					   sdhci_thread_irq, IRQF_SHARED,
-					   mmc_hostname(host->mmc), host);
-		if (ret)
-			return ret;
+		enable_irq(host->irq);
 	} else {
 		sdhci_disable_irq_wakeups(host);
 		disable_irq_wake(host->irq);
@@ -2730,7 +2724,7 @@ int sdhci_resume_host(struct sdhci_host *host)
 
 	sdhci_enable_card_detection(host);
 
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(sdhci_resume_host);
 
-- 
1.9.1

* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28  9:42 [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq Haibo Chen
@ 2016-01-28 10:20 ` Russell King - ARM Linux
  2016-01-28 15:47   ` Ulf Hansson
  0 siblings, 1 reply; 8+ messages in thread
From: Russell King - ARM Linux @ 2016-01-28 10:20 UTC (permalink / raw)
  To: Haibo Chen; +Cc: ulf.hansson, linux-mmc, linux-kernel

On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
> Currently sdhci driver free irq in host suspend, and call
> request_threaded_irq() in host resume. But during host resume,
> Ctrl+C can impact sdhci host resume, see the error log:

Ctrl+C should have no effect on this - that seems to imply that there's
some other bug elsewhere.

> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index d622435..4b1646b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
>  		host->ier = 0;
>  		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
>  		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
> -		free_irq(host->irq, host);
> +		disable_irq(host->irq);

This is really not acceptable I'm afraid. While it's common on ARM for
each interrupt to be uniquely allocated to a peripheral, not all SDHCI
platforms have that luxury.

SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
interrupts shared between (sometimes many) different PCI devices.

For example, on my laptop:

 18: 1089806 286185 IO-APIC-fasteoi uhci_hcd:usb8, r852, mmc0

the SDHCI interrupt is shared with two other peripherals - one USB
controller and a NAND device.

Disabling the interrupt will adversely impact other peripherals and
cause regressions where the interrupt is shared.

So, I'm afraid I'm going to have to NAK this patch.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq 2016-01-28 10:20 ` Russell King - ARM Linux @ 2016-01-28 15:47 ` Ulf Hansson 2016-01-28 16:21 ` Thomas Gleixner 2016-01-28 16:38 ` Russell King - ARM Linux 0 siblings, 2 replies; 8+ messages in thread From: Ulf Hansson @ 2016-01-28 15:47 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Haibo Chen, linux-mmc, linux-kernel, Thomas Gleixner, Jon Hunter +tglx, Jon On 28 January 2016 at 11:20, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote: >> Currently sdhci driver free irq in host suspend, and call >> request_threaded_irq() in host resume. But during host resume, >> Ctrl+C can impact sdhci host resume, see the error log: > > Ctrl+C should have no effect on this - that seems to imply that there's > some other bug elsewhere. > >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c >> index d622435..4b1646b 100644 >> --- a/drivers/mmc/host/sdhci.c >> +++ b/drivers/mmc/host/sdhci.c >> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host) >> host->ier = 0; >> sdhci_writel(host, 0, SDHCI_INT_ENABLE); >> sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE); >> - free_irq(host->irq, host); >> + disable_irq(host->irq); > > This is really not acceptable I'm afraid. While it's common on ARM for > each interrupt to be uniquely allocated to a peripheral, not all SDHCI > platforms have that luxury. > > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI > interrupts shared between (sometimes many) different PCI devices. > > For example, on my laptop: > > 18: 1089806 286185 IO-APIC-fasteoi uhci_hcd:usb8, r852, mmc0 > > the SDHCI interrupt is shared with two other peripherals - one USB > controller and a NAND device. > > Disabling the interrupt will adversely impact other peripherals and > cause regressions where the interrupt is shared. 
I thought disable|enable_irq() was being reference counted, so it
shouldn't impact the other peripherals for shared IRQs. I might have
understood this wrong though!?

Although, as if that's the case it also means that the IRQ can still
reach sdhci's irq handler as it hasn't actually been disabled.

Therefore, the only way we currently can make sure to don't get the
IRQ is to free and later re-request it. Now, apparently that has
issues when using threaded IRQ handlers.

I have recently discussed a related change on the genirq framework,
which in principle turned out that we concluded on needing a new API
to deal with PM related enable/disable IRQ cases.
http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

Perhaps that's actually what we need to cover this case.

>
> So, I'm afraid I'm going to have to NAK this patch.

I agree. We need another solution!

Kind regards
Uffe

* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2016-01-28 15:47   ` Ulf Hansson
@ 2016-01-28 16:21     ` Thomas Gleixner
  2016-01-28 16:27       ` Thomas Gleixner
  2016-01-28 16:38     ` Russell King - ARM Linux
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-01-28 16:21 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Russell King - ARM Linux, Haibo Chen, linux-mmc, linux-kernel,
	Jon Hunter

On Thu, 28 Jan 2016, Ulf Hansson wrote:
> On 28 January 2016 at 11:20, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >> -		free_irq(host->irq, host);
> >> +		disable_irq(host->irq);
> >
> > This is really not acceptable I'm afraid. While it's common on ARM for
> > each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> > platforms have that luxury.
> >
> > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> > interrupts shared between (sometimes many) different PCI devices.
> >
> > For example, on my laptop:
> >
> > 18: 1089806 286185 IO-APIC-fasteoi uhci_hcd:usb8, r852, mmc0
> >
> > the SDHCI interrupt is shared with two other peripherals - one USB
> > controller and a NAND device.
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
>
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

It's reference counted. But it disables the irq line and not a particular
interrupt handler.

> Although, as if that's the case it also means that the IRQ can still
> reach sdhci's irq handler as it hasn't actually been disabled.

No. The result is that the other devices on the same irq line won't get any
interrupt anymore.

> Therefore, the only way we currently can make sure to don't get the
> IRQ is to free and later re-request it. Now, apparently that has
> issues when using threaded IRQ handlers.

What's the issue?

Thanks,

	tglx

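The distinction drawn in this reply (disable_irq() is reference counted, but it acts on the whole interrupt line rather than on one handler) can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct irq_line, line_add_handler(), line_disable(), line_enable() and line_raise() are hypothetical names invented for the illustration, and the depth counter merely imitates the nesting behaviour described above.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void (*handler_fn)(void *dev);

/* One shared interrupt line with several registered handlers. */
struct irq_line {
	int depth;                        /* nested disables; 0 means enabled */
	size_t nr_handlers;
	handler_fn handler[MAX_HANDLERS];
	void *dev[MAX_HANDLERS];
};

static void line_add_handler(struct irq_line *line, handler_fn fn, void *dev)
{
	assert(line->nr_handlers < MAX_HANDLERS);
	line->handler[line->nr_handlers] = fn;
	line->dev[line->nr_handlers] = dev;
	line->nr_handlers++;
}

/* Disabling is per line: there is no way to name a single handler here. */
static void line_disable(struct irq_line *line)
{
	line->depth++;
}

/* The line is live again only once every disable has been balanced. */
static void line_enable(struct irq_line *line)
{
	assert(line->depth > 0);
	line->depth--;
}

/* Deliver one interrupt; returns how many handlers were invoked. */
static size_t line_raise(struct irq_line *line)
{
	if (line->depth > 0)
		return 0;                 /* every sharer is starved, not just one */

	for (size_t i = 0; i < line->nr_handlers; i++)
		line->handler[i](line->dev[i]);
	return line->nr_handlers;
}

/* Trivial handler for the model: count deliveries per device. */
static void count_irq(void *dev)
{
	++*(int *)dev;
}
```

With three handlers registered on one line (for example the USB controller, r852 and mmc0 from the /proc/interrupts excerpt quoted above), a single line_disable() makes line_raise() return 0 for all of them until every disable has been balanced by a line_enable().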
* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq 2016-01-28 16:21 ` Thomas Gleixner @ 2016-01-28 16:27 ` Thomas Gleixner 2017-12-27 2:54 ` Peng Fan 0 siblings, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2016-01-28 16:27 UTC (permalink / raw) To: Ulf Hansson Cc: Russell King - ARM Linux, Haibo Chen, linux-mmc, linux-kernel, Jon Hunter On Thu, 28 Jan 2016, Thomas Gleixner wrote: > On Thu, 28 Jan 2016, Ulf Hansson wrote: > > Therefore, the only way we currently can make sure to don't get the > > IRQ is to free and later re-request it. Now, apparently that has > > issues when using threaded IRQ handlers. > > What's the issue? Ah, you mean that one: > Currently sdhci driver free irq in host suspend, and call > request_threaded_irq() in host resume. But during host resume, > Ctrl+C can impact sdhci host resume, see the error log: > CPU1 is up > PM: noirq resume of devices complete after 0.637 msecs imx-sdma 30bd0000.sdma: loaded firmware 4.1 > PM: early resume of devices complete after 0.774 msecs > dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 > PM: Device 30b40000.usdhc failed to resume: error -4 > dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 > PM: Device 30b50000.usdhc failed to resume: error -4 > dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 > PM: Device 30b60000.usdhc failed to resume: error -4 fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx > mmc0: Timeout waiting for hardware interrupt. > mmc0: Timeout waiting for hardware interrupt. > mmc0: Timeout waiting for hardware interrupt. > mmc0: Timeout waiting for hardware interrupt. > mmc0: Timeout waiting for hardware interrupt. > mmc0: Timeout waiting for hardware interrupt. > mmc0: error -110 during resume (card was removed?) > mmc2: Timeout waiting for hardware interrupt. > mmc2: Timeout waiting for hardware interrupt. > mmc2: error -110 during resume (card was removed?) 
In request_threaded_irq-> __setup_irq-> kthread_create
->kthread_create_on_node, the comment shows that SIGKILLed will
impact the kthread create, and return -EINTR.

And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
not affect any kernel internal thread. Hitting Ctrl+C affects solely the
process which is running on that console.

And if it would, then that would be a completely different, serious bug which
needs to be fixed.

How was verified, that the thread was not created and that the creation failed
due to a SIGKILL?

Thanks,

	tglx

* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq 2016-01-28 16:27 ` Thomas Gleixner @ 2017-12-27 2:54 ` Peng Fan 2021-06-11 14:22 ` Martin Kaiser 0 siblings, 1 reply; 8+ messages in thread From: Peng Fan @ 2017-12-27 2:54 UTC (permalink / raw) To: Thomas Gleixner Cc: Ulf Hansson, Russell King - ARM Linux, Haibo Chen, linux-mmc, linux-kernel, Jon Hunter, aisheng.dong Hi All, Sorry for bring back this old topic again. On Thu, Jan 28, 2016 at 05:27:46PM +0100, Thomas Gleixner wrote: >On Thu, 28 Jan 2016, Thomas Gleixner wrote: >> On Thu, 28 Jan 2016, Ulf Hansson wrote: >> > Therefore, the only way we currently can make sure to don't get the >> > IRQ is to free and later re-request it. Now, apparently that has >> > issues when using threaded IRQ handlers. >> >> What's the issue? > >Ah, you mean that one: > >> Currently sdhci driver free irq in host suspend, and call >> request_threaded_irq() in host resume. But during host resume, >> Ctrl+C can impact sdhci host resume, see the error log: > >> CPU1 is up >> PM: noirq resume of devices complete after 0.637 msecs imx-sdma 30bd0000.sdma: loaded firmware 4.1 >> PM: early resume of devices complete after 0.774 msecs >> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 >> PM: Device 30b40000.usdhc failed to resume: error -4 >> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 >> PM: Device 30b50000.usdhc failed to resume: error -4 >> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4 >> PM: Device 30b60000.usdhc failed to resume: error -4 fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: Timeout waiting for hardware interrupt. >> mmc0: error -110 during resume (card was removed?) 
>> mmc2: Timeout waiting for hardware interrupt.
>> mmc2: Timeout waiting for hardware interrupt.
>> mmc2: error -110 during resume (card was removed?)
>
>In request_threaded_irq-> __setup_irq-> kthread_create
>->kthread_create_on_node, the comment shows that SIGKILLed will
>impact the kthread create, and return -EINTR.
>
>And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
>not affect any kernel internal thread. Hitting Ctrl+C affects solely the
>process which is running on that console.
>
>And if it would, then that would be a completely different, serious bug which
>needs to be fixed.
>
>How was verified, that the thread was not created and that the creation failed
>due to a SIGKILL?

This is the testcase.
"/unit_tests/SRTC/rtcwakeup.out -d rtc0 -m mem -s 2;"
it acts "echo mem > /sys/power/state", then rtc interrupt will wakeup
the system.

My understanding is:
The issue is during suspend resume, it is in rtcwakeup.out process space,
during resume, "get_current()->comm" shows "rtcwakeup.out", so if we
send SIGKILL from userspace, an interrupt will occur, interrupt
handler will directly return to kernel space to continue resuming.
__setup_irq->kthread_create->wait_for_completion_killable, here
wait_for_completion_killable see SIGKILL pending and return -EINTR,
then sdhci resume process failure, because of sdhci interrupt thread
not created.

During suspend/resume, OOM Killer will be disabled and enabled. When
request_threaded_irq in sdhci resume, OOM Killer is still disabled.
According to kthread_create comments for wait_for_completion_killable,
using killable is to catch OOM sigkill. But during resume, OOM Killer
is disabled, So how about the following patch to disable SIGKILL for
a short while?

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e9290a3439d5..84c4c99b1acb 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -19,6 +19,7 @@
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/dma-mapping.h>
+#include <linux/signal.h>
 #include <linux/slab.h>
 #include <linux/scatterlist.h>
 #include <linux/swiotlb.h>
@@ -2895,9 +2896,11 @@ int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
+		disallow_signal(SIGKILL);
 		ret = request_threaded_irq(host->irq, sdhci_irq,
 					   sdhci_thread_irq, IRQF_SHARED,
 					   mmc_hostname(host->mmc), host);
+		allow_signal(SIGKILL);
 		if (ret)
 			return ret;
 	} else {

Thanks,
Peng.

>
>Thanks,
>
>	tglx

-- 

* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq
  2017-12-27  2:54         ` Peng Fan
@ 2021-06-11 14:22           ` Martin Kaiser
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Kaiser @ 2021-06-11 14:22 UTC (permalink / raw)
  To: Peng Fan
  Cc: Thomas Gleixner, Ulf Hansson, Russell King - ARM Linux,
	Haibo Chen, linux-mmc, linux-kernel, Jon Hunter, aisheng.dong

Dear all,

I have to resurrect this discussion once again. The issue is still
present in linux-next and can be reproduced easily on my imx25-based
system.

Thus wrote Peng Fan (van.freenix@gmail.com):

> On Thu, Jan 28, 2016 at 05:27:46PM +0100, Thomas Gleixner wrote:
> >On Thu, 28 Jan 2016, Thomas Gleixner wrote:
> >> On Thu, 28 Jan 2016, Ulf Hansson wrote:
> >> > Therefore, the only way we currently can make sure to don't get the
> >> > IRQ is to free and later re-request it. Now, apparently that has
> >> > issues when using threaded IRQ handlers.
> >> What's the issue?
> >Ah, you mean that one:
> >> Currently sdhci driver free irq in host suspend, and call
> >> request_threaded_irq() in host resume. But during host resume,
> >> Ctrl+C can impact sdhci host resume, see the error log:
> >> [...]

My test setup uses rtc as a wakeup source. Additionally, I also define
my console uart as wakeup source

   echo enabled > /sys/class/tty/ttymxc3/power/wakeup

and then run rtcwake in a loop

   while true ; do rtcwake -s 2 -m mem ; done

Pressing Ctrl-C while the system is sleeping reproduces the problem
quickly. sdhci_resume_host fails because the kthread for the threaded
irq can't be created. ps confirms that there's no [irq/25-mmc0] kernel
thread any more and sdhci starts printing register dumps periodically

[ 101.603339] mmc0: Timeout waiting for hardware cmd interrupt.
[ 101.609225] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 101.615725] mmc0: sdhci: Sys addr: 0x00000008 | Version: 0x00001001
...
[ 101.700086] mmc0: sdhci: Host ctl2: 0x00000000
[ 101.704568] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 101.712179] mmc0: sdhci-esdhc-imx: cmd debug status: 0x0000
...
[ 101.746827] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x0000
[ 101.753137] mmc0: sdhci: ============================================

I guess that we're taking this path from sdhci_resume_host

request_threaded_irq
   __setup_irq
      setup_irq_thread
         kthread_create
            kthread_create_on_node
               __kthread_create_on_node
                  wake_up_process(kthreadd_task);
                  if (unlikely(wait_for_completion_killable(&done))) {
                     if (xchg(&create->done, NULL))
                        return ERR_PTR(-EINTR);

I can confirm Peng's observation that current->pid in
__kthread_create_on_node is the pid of the rtcwake process.

This is an ftrace log of the error case, I used

   trace-cmd start -e task -e mmc -e power -e signal -p function -l tty_flip_buffer_push -l sdhci_resume_host -l __kthread_create_on_node

rtcwake-185 [000] .n.. 321.166724: device_pm_callback_start: imx-uart 50008000.serial, parent: 50000000.spba, bus [resume]
rtcwake-185 [000] dnh. 321.166823: tty_flip_buffer_push <-__imx_uart_rxint.constprop.0
rtcwake-185 [000] .n.. 321.166906: device_pm_callback_end: imx-uart 50008000.serial, err=0
...
rtcwake-185 [000] .n.. 321.167472: device_pm_callback_start: imx_rngc 53fb0000.rngb, parent: 53f00000.bus, bus [resume]
rtcwake-185 [000] .n.. 321.167486: device_pm_callback_end: imx_rngc 53fb0000.rngb, err=0
rtcwake-185 [000] .n.. 321.167502: device_pm_callback_start: sdhci-esdhc-imx 53fb4000.mmc, parent: 53f00000.bus, bus [resume]
rtcwake-185 [000] .n.. 321.167516: sdhci_resume_host <-sdhci_esdhc_resume
rtcwake-185 [000] .n.. 321.167584: __kthread_create_on_node <-kthread_create_on_node
rtcwake-185 [000] .n.. 321.167652: device_pm_callback_end: sdhci-esdhc-imx 53fb4000.mmc, err=-4
rtcwake-185 [000] .n.. 321.167922: device_pm_callback_start: imx-fb 53fbc000.lcdc, parent: 53f00000.bus, bus [resume]
rtcwake-185 [000] .n.. 321.167978: device_pm_callback_end: imx-fb 53fbc000.lcdc, err=0

If wakeup is successful, the irq's thread is created

rtcwake-168 [000] .n.. 320.566614: device_pm_callback_start: imx-uart 50008000.serial, parent: 50000000.spba, bus [resume]
rtcwake-168 [000] dnh. 320.566718: tty_flip_buffer_push <-__imx_uart_rxint.constprop.0
rtcwake-168 [000] .n.. 320.566808: device_pm_callback_end: imx-uart 50008000.serial, err=0
...
rtcwake-168 [000] .n.. 320.567359: device_pm_callback_start: imx_rngc 53fb0000.rngb, parent: 53f00000.bus, bus [resume]
rtcwake-168 [000] .n.. 320.567372: device_pm_callback_end: imx_rngc 53fb0000.rngb, err=0
rtcwake-168 [000] .n.. 320.567386: device_pm_callback_start: sdhci-esdhc-imx 53fb4000.mmc, parent: 53f00000.bus, bus [resume]
rtcwake-168 [000] .n.. 320.567402: sdhci_resume_host <-sdhci_esdhc_resume
rtcwake-168 [000] .n.. 320.567460: __kthread_create_on_node <-kthread_create_on_node
kthreadd-2 [000] .... 320.567922: task_newtask: pid=174 comm=kthreadd clone_flags=800700 oom_score_adj=0
kworker/u2:3-173 [000] .... 320.568183: __kthread_create_on_node <-kthread_create_on_node
kthreadd-2 [000] .... 320.568504: task_newtask: pid=175 comm=kthreadd clone_flags=800700 oom_score_adj=0
rtcwake-168 [000] .... 320.568659: task_rename: pid=174 oldcomm=kthreadd newcomm=irq/25-mmc0 oom_score_adj=0
rtcwake-168 [000] .n.. 320.568824: device_pm_callback_end: sdhci-esdhc-imx 53fb4000.mmc, err=0

> >In request_threaded_irq-> __setup_irq-> kthread_create
> >->kthread_create_on_node, the comment shows that SIGKILLed will
> >impact the kthread create, and return -EINTR.
> >And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
> >not affect any kernel internal thread. Hitting Ctrl+C affects solely the
> >process which is running on that console.
> >And if it would, then that would be a completely different, serious bug which
> >needs to be fixed.
> >How was verified, that the thread was not created and that the creation failed
> >due to a SIGKILL?

See above, the irq thread is missing, it was running before the
suspend+wakeup. I hope that the ftrace output confirms my assumption
about the code path that was taken. I didn't find any other way we could
return -EINTR in sdhci_resume_host.

> My understanding is:
> The issue is during suspend resume, it is in rtcwakeup.out process space,
> during resume, "get_current()->comm" shows "rtcwakeup.out", so if we
> send SIGKILL from userspace, an interrupt will occur, interrupt
> handler will directly return to kernel space to continue resuming.
> __setup_irq->kthread_create->wait_for_completion_killable, here
> wait_for_completion_killable see SIGKILL pending and return -EINTR,
> then sdhci resume process failure, because of sdhci interrupt thread
> not created.
> During suspend/resume, OOM Killer will be disabled and enabled. When
> request_threaded_irq in sdhci resume, OOM Killer is still disabled.
> According to kthread_create comments for wait_for_completion_killable,
> using killable is to catch OOM sigkill. But during resume, OOM Killer
> is disabled, So how about the following patch to disable SIGKILL for
> a short while?

I tried this patch, it didn't fix the problem for me.

I could make the problem disappear if I moved sdhci_resume_host's
request_threaded_irq call into a worker. Instead of calling
request_threaded_irq, I'd schedule the work on system_unbound_wq. The
new thread for sdhci's irq is then requested by someone other than
rtcwake.

Generally, it makes sense to me that kthreadd aborts a request for a new
thread if the requester is killed during the request. However, in the
case of sdhci resume, the thread should always be created, regardless of
the requester's state...

Of course, the workqueue hack is not an acceptable way to fix this. I'd
appreciate if anyone could point me in the right direction for a proper
fix.

Thanks in advance for your help,
Martin

> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index e9290a3439d5..84c4c99b1acb 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -19,6 +19,7 @@
>  #include <linux/io.h>
>  #include <linux/module.h>
>  #include <linux/dma-mapping.h>
> +#include <linux/signal.h>
>  #include <linux/slab.h>
>  #include <linux/scatterlist.h>
>  #include <linux/swiotlb.h>
> @@ -2895,9 +2896,11 @@ int sdhci_resume_host(struct sdhci_host *host)
>  	}
>  	if (!device_may_wakeup(mmc_dev(host->mmc))) {
> +		disallow_signal(SIGKILL);
>  		ret = request_threaded_irq(host->irq, sdhci_irq,
>  					   sdhci_thread_irq, IRQF_SHARED,
>  					   mmc_hostname(host->mmc), host);
> +		allow_signal(SIGKILL);
>  		if (ret)
>  			return ret;
>  	} else {
> Thanks,
> Peng.
> >Thanks,
> >	tglx
> --

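The failure mode and the workqueue observation described in this message can be summarised in a toy userspace model. It assumes, as the thread suggests, that the thread-creation request is abandoned with -EINTR when the requesting task has a fatal signal pending. struct model_task and the model_*() functions are hypothetical names made up for the illustration, not kernel interfaces, and the model says nothing about what a proper fix should look like.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Minimal stand-in for the task that issues the request. */
struct model_task {
	const char *comm;              /* task name, e.g. "rtcwake" */
	bool fatal_signal_pending;     /* e.g. after Ctrl-C while suspended */
};

/*
 * Models a killable wait for the new thread: the outcome depends on the
 * state of whoever is waiting, not on the device being resumed.
 */
static int model_create_irq_thread(const struct model_task *requester)
{
	if (requester->fatal_signal_pending)
		return -EINTR;
	return 0;
}

/* Direct path: the resume callback runs in the context of the suspending task. */
static int model_resume_direct(const struct model_task *current_task)
{
	return model_create_irq_thread(current_task);
}

/* Deferred path: the request is issued by a worker, so the state of the
 * suspending task no longer matters for the thread creation. */
static int model_resume_via_worker(const struct model_task *current_task,
				   const struct model_task *worker)
{
	(void)current_task;
	return model_create_irq_thread(worker);
}
```

In this model a requester named rtcwake with a fatal signal pending gets -EINTR from model_resume_direct(), which corresponds to the err=-4 seen in the trace, while the same situation routed through model_resume_via_worker() succeeds because the worker has no signal pending.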
* Re: [PATCH] mmc: sdhci: disable irq in sdhci host suspend ranther than free this irq 2016-01-28 15:47 ` Ulf Hansson 2016-01-28 16:21 ` Thomas Gleixner @ 2016-01-28 16:38 ` Russell King - ARM Linux 1 sibling, 0 replies; 8+ messages in thread From: Russell King - ARM Linux @ 2016-01-28 16:38 UTC (permalink / raw) To: Ulf Hansson Cc: Haibo Chen, linux-mmc, linux-kernel, Thomas Gleixner, Jon Hunter On Thu, Jan 28, 2016 at 04:47:23PM +0100, Ulf Hansson wrote: > +tglx, Jon > > On 28 January 2016 at 11:20, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote: > >> Currently sdhci driver free irq in host suspend, and call > >> request_threaded_irq() in host resume. But during host resume, > >> Ctrl+C can impact sdhci host resume, see the error log: > > > > Ctrl+C should have no effect on this - that seems to imply that there's > > some other bug elsewhere. > > > >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c > >> index d622435..4b1646b 100644 > >> --- a/drivers/mmc/host/sdhci.c > >> +++ b/drivers/mmc/host/sdhci.c > >> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host) > >> host->ier = 0; > >> sdhci_writel(host, 0, SDHCI_INT_ENABLE); > >> sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE); > >> - free_irq(host->irq, host); > >> + disable_irq(host->irq); > > > > This is really not acceptable I'm afraid. While it's common on ARM for > > each interrupt to be uniquely allocated to a peripheral, not all SDHCI > > platforms have that luxury. > > > > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI > > interrupts shared between (sometimes many) different PCI devices. > > > > For example, on my laptop: > > > > 18: 1089806 286185 IO-APIC-fasteoi uhci_hcd:usb8, r852, mmc0 > > > > the SDHCI interrupt is shared with two other peripherals - one USB > > controller and a NAND device. 
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
>
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

They are. When anything disables an IRQ, the IRQ is disabled. Only
once the N disable_irq()s have been balanced with N enable_irq()s will
the interrupt be re-enabled.

disable_irq() doesn't work on a per-device level, but on a per-interrupt
line level. So, if sdhci calls disable_irq() in its suspend path,
it disables the IRQ for _everything_ that's sharing that interrupt. If
(eg) USB or r852 needs an interrupt to complete its own suspend, it
won't see that interrupt because SDHCI disabled it.

It might appear to work as-is even so, if SDHCI happens to be (on the
test setup) suspended after (eg) both the USB and r852 drivers.

It is probably much better if SDHCI writes to the device on suspend to
disable interrupts, synchronises with the IRQ, and then sets a flag to
indicate that the interrupt handler should immediately return IRQ_NONE
in case any of the other peripherals sharing the IRQ line trigger an
interrupt.

> I have recently discussed a related change on the genirq framework,
> which in principle turned out that we concluded on needing a new API
> to deal with PM related enable/disable IRQ cases.
> http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

I haven't read your link, but I don't think we really need yet more APIs
to deal with this, except possibly one thing - a way to tell genirq that
a specific IRQ handler should not be called because its device is
suspended. IOW, moving:

static irqreturn_t foo_device_irq(int irq, void *dev_id)
{
	struct foo_device_priv *priv = dev_id;

	if (priv->suspended)
		return IRQ_NONE;

	... rest of IRQ handling
}

into genirq code, so that we don't end up with that pattern repeated many
times in drivers.

It may be that's exactly what's being proposed in the link, but as I say,
I've not read it yet.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

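The per-driver pattern suggested in this message (set a flag on suspend and have the handler return IRQ_NONE while it is set) can be tried out in a compilable userspace model. The sketch below is not driver code: enum model_irqreturn stands in for the kernel's irqreturn_t, the irq_pending field and the foo_device_suspend()/foo_device_resume() helpers are assumptions added for the illustration, and a real driver would also mask the device's interrupt sources and synchronise with the IRQ before setting the flag, as the message describes.

```c
/* Userspace model only: a stand-in for the kernel pattern, not driver code. */
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

struct foo_device_priv {
	bool suspended;        /* set once the device has been quiesced */
	bool irq_pending;      /* models the device's interrupt status bit */
	unsigned int handled;  /* number of interrupts actually serviced */
};

/* Handler on a shared line: it may be called for other devices' interrupts. */
static enum model_irqreturn foo_device_irq(int irq, void *dev_id)
{
	struct foo_device_priv *priv = dev_id;

	(void)irq;

	/* While suspended, claim nothing and leave the line to the sharers. */
	if (priv->suspended)
		return MODEL_IRQ_NONE;

	if (!priv->irq_pending)
		return MODEL_IRQ_NONE;

	priv->irq_pending = false;
	priv->handled++;
	return MODEL_IRQ_HANDLED;
}

/* Assumption: device interrupts were already masked before this point. */
static void foo_device_suspend(struct foo_device_priv *priv)
{
	priv->suspended = true;
}

static void foo_device_resume(struct foo_device_priv *priv)
{
	priv->suspended = false;
}
```

Because the line itself stays enabled, other devices sharing it keep receiving their interrupts during suspend, and only this device's handler steps aside.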