From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Gao, Yunpeng" Subject: RE: Is it possible for sdhci host controller driver to schedule inside sdhci_request() function? Date: Thu, 27 Jan 2011 15:29:22 +0800 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT Return-path: Received: from mga09.intel.com ([134.134.136.24]:5088 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750934Ab1A0Hat convert rfc822-to-8bit (ORCPT ); Thu, 27 Jan 2011 02:30:49 -0500 In-Reply-To: Content-Language: en-US Sender: linux-mmc-owner@vger.kernel.org List-Id: linux-mmc@vger.kernel.org To: "Gao, Yunpeng" , "linux-mmc@vger.kernel.org" >Our platform use sdhci host controller. And now there's a discussion to schedule >inside sdhci_request() function of drivers/mmc/host/sdhci.c. >According to the mmc stack code, seems the sdhci_request() is in the context >of kernel thread (mmcqd), so I think it should be OK if schedule inside of it. > >But when I tried it with below patch: >--------------------------------------------------------------------------- >diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c >index 5dead05..2c2ac66 100644 >--- a/drivers/mmc/host/sdhci.c >+++ b/drivers/mmc/host/sdhci.c >@@ -1184,6 +1184,10 @@ static void sdhci_request(struct mmc_host *mmc, struct >mmc_request *mrq) > > host = mmc_priv(mmc); > >+ printk(KERN_ERR "%s: Enter into sdhci_request:\n", >mmc_hostname(host->mmc)); >+ msleep(10); >+ printk(KERN_ERR "%s: Go on after msleep 10ms ...\n", >mmc_hostname(host->mmc)); >+ > spin_lock_irqsave(&host->lock, flags); > > WARN_ON(host->mrq != NULL); >--------------------------------------------------------------------------- >The kernel reported 'scheduling while atomic' BUG as below: >--------------------------------------------------------------------------- >... ><3>[ 4.985928] mmc0: Enter into sdhci_request: ><3>[ 4.996750] mmc0: Go on after msleep 10ms ... ><3>[ 5.007822] mmc0: Enter into sdhci_request: ><3>[ 5.018790] mmc0: Go on after msleep 10ms ... ><3>[ 5.028727] mmc0: Enter into sdhci_request: ><3>[ 5.039841] mmc0: Go on after msleep 10ms ... ><3>[ 5.049772] mmc0: Enter into sdhci_request: ><3>[ 5.049815] BUG: scheduling while atomic: swapper/0/0x00000104 ><4>[ 5.049845] no locks held by swapper/0. ><4>[ 5.049866] Modules linked in: ><4>[ 5.049889] Modules linked in: ><4>[ 5.049912] ><4>[ 5.049930] Pid: 0, comm: swapper Not tainted 2.6.35.3+ #8 / ><4>[ 5.049960] EIP: 0060:[] EFLAGS: 00000206 CPU: 0 ><4>[ 5.049997] EIP is at _raw_spin_unlock_irq+0x25/0x3b ><4>[ 5.050024] EAX: c1751f28 EBX: c2607bc0 ECX: 00000007 EDX: 00000007 ><4>[ 5.050055] ESI: c1779a1c EDI: f74400b0 EBP: c1751f2c ESP: c1751f28 ><4>[ 5.050086] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 ><0>[ 5.050116] Process swapper (pid: 0, ti=c1750000 task=c176dea0 >task.ti=c1750000) ><0>[ 5.050150] Stack: ><4>[ 5.050165] c2607bc0 c1751f58 c10288bc 00000000 00000002 00000000 >c1028865 c176dea0 ><4>[ 5.050229] <0> 00000001 c184cbc0 f74400b0 c176dea0 c1751fb4 c15285f8 >2c66255a 00000001 ><4>[ 5.050300] <0> c184cbc0 c1845000 c184cbc0 c184cbc0 c184cbc0 f74400b0 >00000000 c176e108 ><0>[ 5.050376] Call Trace: ><4>[ 5.050407] [] ? finish_task_switch+0x57/0x90 ><4>[ 5.050441] [] ? finish_task_switch+0x0/0x90 ><4>[ 5.050474] [] ? schedule+0x78e/0x869 ><4>[ 5.050509] [] ? tick_nohz_restart_sched_tick+0x154/0x15d ><4>[ 5.050545] [] ? cpu_idle+0x6a/0x76 ><4>[ 5.050576] [] ? rest_init+0xd4/0xd9 ><4>[ 5.050609] [] ? start_kernel+0x2f7/0x2fc ><4>[ 5.050642] [] ? i386_start_kernel+0xd0/0xd7 ><0>[ 5.050667] Code: fc b0 ff 5b 5d c3 55 ba 01 00 00 00 89 e5 8b 4d 04 53 >89 c3 8d 40 10 e8 97 2b b3 ff 89 d8 e8 f0 b2 cb ff e8 84 06 b3 ff fb 89 e0 <25> >00 e0 ff ff ff 48 14 f6 40 08 08 74 05 e8 e0 dc ff ff 5b 5d ><4>[ 5.051060] Call Trace: ><4>[ 5.051088] [] finish_task_switch+0x57/0x90 ><4>[ 5.051121] [] ? finish_task_switch+0x0/0x90 ><4>[ 5.051153] [] schedule+0x78e/0x869 ><4>[ 5.051185] [] ? tick_nohz_restart_sched_tick+0x154/0x15d ><4>[ 5.051221] [] cpu_idle+0x6a/0x76 ><4>[ 5.051250] [] rest_init+0xd4/0xd9 ><4>[ 5.051280] [] start_kernel+0x2f7/0x2fc ><4>[ 5.051312] [] i386_start_kernel+0xd0/0xd7 ><3>[ 5.051340] bad: scheduling from the idle thread! ><4>[ 5.051368] Pid: 0, comm: swapper Not tainted 2.6.35.3+ #8 ><4>[ 5.051393] Call Trace: ><4>[ 5.051415] [] ? printk+0xf/0x11 ><4>[ 5.051444] [] dequeue_task_idle+0x1a/0x27 ><4>[ 5.051475] [] dequeue_task+0xca/0xd9 ><4>[ 5.051506] [] deactivate_task+0x1f/0x25 ><4>[ 5.051535] [] schedule+0x14f/0x869 ><4>[ 5.051567] [] ? trace_hardirqs_on+0xb/0xd ><4>[ 5.051598] [] schedule_timeout+0x27c/0x2cf ><4>[ 5.051630] [] ? process_timeout+0x0/0xa ><4>[ 5.051662] [] schedule_timeout_uninterruptible+0x15/0x17 ><4>[ 5.051723] [] msleep+0x10/0x16 ><4>[ 5.051758] [] sdhci_request+0x33/0xfc ><4>[ 5.051790] [] mmc_request_done+0x42/0x6a ><4>[ 5.051820] [] sdhci_tasklet_finish+0xc4/0xca ><4>[ 5.051854] [] tasklet_action+0x69/0xb0 ><4>[ 5.051883] [] __do_softirq+0x122/0x27a ><4>[ 5.051912] [] do_softirq+0x2b/0x43 ><4>[ 5.051940] [] irq_exit+0x35/0x72 ><4>[ 5.051969] [] smp_apic_timer_interrupt+0x6e/0x7c ><4>[ 5.052002] [] apic_timer_interrupt+0x2f/0x34 ><4>[ 5.052035] [] ? down_trylock+0x18/0x27 ><4>[ 5.052064] [] ? _raw_spin_unlock_irq+0x25/0x3b ><4>[ 5.052097] [] finish_task_switch+0x57/0x90 ><4>[ 5.052129] [] ? finish_task_switch+0x0/0x90 ><4>[ 5.052159] [] schedule+0x78e/0x869 ><4>[ 5.052190] [] ? tick_nohz_restart_sched_tick+0x154/0x15d ><4>[ 5.052224] [] cpu_idle+0x6a/0x76 ><4>[ 5.052252] [] rest_init+0xd4/0xd9 ><4>[ 5.052282] [] start_kernel+0x2f7/0x2fc ><4>[ 5.052312] [] i386_start_kernel+0xd0/0xd7 ><1>[ 5.062718] BUG: unable to handle kernel NULL pointer dereference at >(null) ><1>[ 5.062774] IP: [<(null)>] (null) ><4>[ 5.062806] *pdpt = 0000000001856001 *pde = 0000000000000000 ><0>[ 5.062851] Oops: 0010 [#1] PREEMPT SMP ><0>[ 5.062890] last sysfs file: ><4>[ 5.062917] Modules linked in: ><4>[ 5.062944] ><4>[ 5.062967] Pid: 0, comm: swapper Not tainted 2.6.35.3+ #8 / ><4>[ 5.063003] EIP: 0060:[<00000000>] EFLAGS: 00010046 CPU: 1 ><4>[ 5.063037] EIP is at 0x0 ><4>[ 5.063061] EAX: c2607bc0 EBX: c15304d4 ECX: 00000001 EDX: c176dea0 ><4>[ 5.063099] ESI: c176dea0 EDI: c2607bc0 EBP: f747de3c ESP: f747de28 ><4>[ 5.063137] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 ><0>[ 5.063173] Process swapper (pid: 0, ti=f747c000 task=f747a230 >task.ti=f747c000) ><0>[ 5.063216] Stack: ><4>[ 5.063236] c10298a7 00000001 c2607bc0 c176dea0 c2607bc0 f747de48 >c10298d3 00000000 ><4>[ 5.063309] <0> f747de70 c1031fad 00000000 00000001 00000001 00000000 >00000282 c1751dd8 ><4>[ 5.063394] <0> c1751e18 f747debc f747de78 c1032183 f747de80 c1041022 >f747ded0 c1040b47 ><0>[ 5.063488] Call Trace: ><4>[ 5.063520] [] ? enqueue_task+0x72/0x7f ><4>[ 5.063559] [] ? activate_task+0x1f/0x25 ><4>[ 5.063597] [] ? try_to_wake_up+0x268/0x416 ><4>[ 5.063636] [] ? wake_up_process+0xf/0x11 ><4>[ 5.063674] [] ? process_timeout+0x8/0xa ><4>[ 5.063712] [] ? run_timer_softirq+0x299/0x414 ><4>[ 5.063752] [] ? process_timeout+0x0/0xa ><4>[ 5.063790] [] ? __do_softirq+0x122/0x27a ><4>[ 5.063828] [] ? do_softirq+0x2b/0x43 ><4>[ 5.063864] [] ? irq_exit+0x35/0x72 ><4>[ 5.063900] [] ? smp_apic_timer_interrupt+0x6e/0x7c ><4>[ 5.063942] [] ? apic_timer_interrupt+0x2f/0x34 ><4>[ 5.063984] [] ? down_trylock+0x18/0x27 ><4>[ 5.064022] [] ? intel_idle+0x169/0x1aa ><4>[ 5.064062] [] ? cpuidle_idle_call+0x71/0x14c ><4>[ 5.064101] [] ? cpu_idle+0x49/0x76 ><4>[ 5.064138] [] ? start_secondary+0x1c1/0x1c6 ><0>[ 5.064171] Code: Bad EIP value. ><0>[ 5.064208] EIP: [<00000000>] 0x0 SS:ESP 0068:f747de28 ><0>[ 5.064251] CR2: 0000000000000000 ><6>[ 0.000000] Initializing cgroup subsys cpu ><5>[ 0.000000] Linux version 2.6.35.3+ (root@yunpeng) (gcc version 4.4.4 >20100630 (Red Hat 4.4.4-10) (GCC) ) #8 SMP PREEMPT Wed Jan 26 22:49:08 CST 2011 >--------------------------------------------------------------------------- According to the call trace, in the mmc_request_done() of drivers/mmc/core/core.c, it will retry to issue the same command as below: ------------------------------------------------ if (err && cmd->retries) { ... host->ops->request(host, mrq); ----------------------------------------------- Since the mmc_request_done() is called in the tasklet context, so it's not surprise why kernel reports BUG once mmc_request_done() try to re-send command to card. Thanks. Regards, Yunpeng