* Hang on resume if sd/mmc card was removed while system was suspended/hibernated
@ 2010-02-03 17:47 Maxim Levitsky
  2010-02-04 23:18 ` [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled Maxim Levitsky
  0 siblings, 1 reply; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-03 17:47 UTC (permalink / raw)
  To: linux-mmc; +Cc: linux-pm, linux-kernel

Hi,

This is what I get, if I remove mmc card while system is suspended:

<4>[15241.041945] Call Trace:
<4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
<4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
<4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
<4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
<4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
<4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
<4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
<4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
<4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
<4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
<4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
<4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
<4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
<4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
<4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
<4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
<4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
<4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
<4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
<4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
<4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
<4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
<4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
<4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
<4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
<4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
<4>[15241.045159]  [<ffffffff8124b0a2>] pci_legacy_resume+0x42/0x60
<4>[15241.045268]  [<ffffffff8124b148>] pci_pm_restore+0x88/0xb0
<4>[15241.045378]  [<ffffffff812d3942>] pm_op+0x1a2/0x1c0
<4>[15241.045483]  [<ffffffff812d44cd>] dpm_resume_end+0x14d/0x520
<4>[15241.045593]  [<ffffffff8108c0f1>] hibernation_snapshot+0xd1/0x290
<4>[15241.045704]  [<ffffffff8108c3ad>] hibernate+0xfd/0x200
<4>[15241.045811]  [<ffffffff8108ac5c>] state_store+0xec/0x100
<4>[15241.045919]  [<ffffffff81172e17>] ? sysfs_get_active_two+0x27/0x60
<4>[15241.046032]  [<ffffffff8122db07>] kobj_attr_store+0x17/0x20
<4>[15241.046141]  [<ffffffff811710a6>] sysfs_write_file+0xe6/0x170
<4>[15241.046253]  [<ffffffff811087f8>] vfs_write+0xb8/0x1a0
<4>[15241.046361]  [<ffffffff811089d1>] sys_write+0x51/0x90
<4>[15241.046470]  [<ffffffff8100305b>] system_call_fastpath+0x16/0x1b
<4>[15241.046579] INFO: lockdep is turned off.


It seems that del_disk can't be called from .resume methods:
it sleeps waiting for threads that are still frozen at that point.

I have seen the same problem in a driver I wrote myself (for xD cards).

I solved it there (it is just very nice that way anyway) with a freezable
kernel thread that polls for card state changes,
and thus calls del_disk (indirectly) only after the system has fully resumed.
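
The deferral idea can be modelled in plain userspace C. This is only a minimal sketch, not the xD driver's code: `struct card_poller`, `poll_card_once()` and the `tasks_frozen` flag are hypothetical names made up for illustration, and the flag stands in for the kernel freezer, which would simply not schedule a freezable thread between suspend and thaw.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one card slot watched by a freezable poller. */
struct card_poller {
	bool tasks_frozen;    /* stands in for the freezer: true from suspend until thaw */
	bool card_present;    /* what the hardware reports right now */
	bool disk_registered; /* whether the block device is still registered */
	int  removals;        /* how many times teardown has run */
};

/*
 * One iteration of the polling loop. A real freezable kernel thread would
 * not be scheduled at all while frozen; the early return models that.
 * Returns true if the disk was torn down during this iteration.
 */
static bool poll_card_once(struct card_poller *p)
{
	assert(p);

	if (p->tasks_frozen)
		return false; /* nothing runs before the thaw */

	if (!p->card_present && p->disk_registered) {
		/* Teardown happens here, i.e. only after resume has finished. */
		p->disk_registered = false;
		p->removals++;
		return true;
	}
	return false;
}
```

In this model the .resume path never tears anything down itself; it only has to leave `card_present` accurate, and the first poll after the thaw does the removal.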

What do you think?

Best regards,
	Maxim Levitsky



* [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-03 17:47 Hang on resume if sd/mmc card was removed while system was suspended/hibernated Maxim Levitsky
@ 2010-02-04 23:18 ` Maxim Levitsky
  2010-02-05  0:09   ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-04 23:18 UTC (permalink / raw)
  To: linux-mmc; +Cc: Philip Langdale, Andrew Morton, linux-kernel, Maxim Levitsky

Currently removal of the card leads to del_disk called indirectly by mmc core.
This function expects userspace to be running, which it isn't when .resume is called.

Fix that by removing the code that did that in mmc_resume_host. This is possible
because the card detection logic will kick in later and remove the card.

Also make the mmc workqueue freezeable, so it won't attempt to add/remove the card
while userspace is frozen.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
---
 drivers/mmc/core/core.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 30acd52..879d48d 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
 	if (host->caps & MMC_CAP_DISABLE)
 		cancel_delayed_work(&host->disable);
 	cancel_delayed_work(&host->detect);
-	mmc_flush_scheduled_work();
 
 	mmc_bus_get(host);
 	if (host->bus_ops && !host->bus_dead) {
@@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
 		mmc_select_voltage(host, host->ocr);
 		BUG_ON(!host->bus_ops->resume);
 		err = host->bus_ops->resume(host);
+
 		if (err) {
 			printk(KERN_WARNING "%s: error %d during resume "
 					    "(card was removed?)\n",
 					    mmc_hostname(host), err);
-			if (host->bus_ops->remove)
-				host->bus_ops->remove(host);
-			mmc_claim_host(host);
-			mmc_detach_bus(host);
-			mmc_release_host(host);
 			/* no need to bother upper layers */
 			err = 0;
 		}
@@ -1332,7 +1327,7 @@ static int __init mmc_init(void)
 {
 	int ret;
 
-	workqueue = create_singlethread_workqueue("kmmcd");
+	workqueue = create_freezeable_workqueue("kmmcd");
 	if (!workqueue)
 		return -ENOMEM;
 
-- 
1.6.3.3
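
The effect of switching "kmmcd" to a freezeable workqueue can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct model_wq`, `model_wq_queue()`, `model_wq_thaw()` and `detect_work()` are hypothetical names, and the model only captures the one property the patch relies on, namely that work queued while the queue is frozen stays pending until the thaw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*work_fn)(void *data);

/* Hypothetical userspace stand-in for a freezeable workqueue. */
struct model_wq {
	bool frozen;
	size_t npending;
	work_fn fn[MAX_PENDING];
	void *data[MAX_PENDING];
};

/* Run everything queued so far, in FIFO order. Work items must not
 * queue further work in this simplified model. */
static void model_wq_run(struct model_wq *wq)
{
	for (size_t i = 0; i < wq->npending; i++)
		wq->fn[i](wq->data[i]);
	wq->npending = 0;
}

/* Queue one item. It runs at once unless the queue is frozen, in which
 * case it stays pending. Returns false if the queue is full. */
static bool model_wq_queue(struct model_wq *wq, work_fn fn, void *data)
{
	assert(wq && fn);
	if (wq->npending == MAX_PENDING)
		return false;
	wq->fn[wq->npending] = fn;
	wq->data[wq->npending] = data;
	wq->npending++;
	if (!wq->frozen)
		model_wq_run(wq);
	return true;
}

/* Thaw the queue and run whatever accumulated while it was frozen. */
static void model_wq_thaw(struct model_wq *wq)
{
	wq->frozen = false;
	model_wq_run(wq);
}

/* Example work item: stands in for the detect work noticing a missing card. */
static void detect_work(void *data)
{
	int *cards_removed = data;

	(*cards_removed)++;
}
```

Queuing `detect_work` on a frozen `model_wq` leaves the counter untouched until `model_wq_thaw()` is called, which mirrors the intent of the change: card removal is left to the detection work, and that work cannot run while userspace is frozen.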



* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-04 23:18 ` [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled Maxim Levitsky
@ 2010-02-05  0:09   ` Andrew Morton
  2010-02-05  8:31       ` Maxim Levitsky
  2010-02-05 10:17     ` Adrian Hunter
  0 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2010-02-05  0:09 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer

On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:

> Currently removal of the card leads to del_disk called indirectly by mmc core.
> This function expects userspace to be running, which isn't when .resume is called
> 
> Fix that by removing the code that did that in mmc_resume_host. It is possible
> because card detection logic will kick it later and remove the card.

I don't really understand.  The above implies that to trigger this bug,
one needs to physically remove the card during a resume operation.  ie:
a human-vs-computer race.  Sounds unlikely?

So...  exactly what steps does the user need to take to trigger this
bug?

> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
> while userspace is frozen.
> 
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 30acd52..879d48d 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
>  	if (host->caps & MMC_CAP_DISABLE)
>  		cancel_delayed_work(&host->disable);
>  	cancel_delayed_work(&host->detect);
> -	mmc_flush_scheduled_work();
>  
>  	mmc_bus_get(host);
>  	if (host->bus_ops && !host->bus_dead) {
> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
>  		mmc_select_voltage(host, host->ocr);
>  		BUG_ON(!host->bus_ops->resume);
>  		err = host->bus_ops->resume(host);
> +
>  		if (err) {
>  			printk(KERN_WARNING "%s: error %d during resume "
>  					    "(card was removed?)\n",
>  					    mmc_hostname(host), err);
> -			if (host->bus_ops->remove)
> -				host->bus_ops->remove(host);
> -			mmc_claim_host(host);
> -			mmc_detach_bus(host);
> -			mmc_release_host(host);

afaict that code's been there since March 2009.  I'd have thought that
someone would have noticed "kernel hangs on resume" before now.

Do you think the patch should be backported into 2.6.32.x and earlier?

>  			/* no need to bother upper layers */
>  			err = 0;
>  		}
> @@ -1332,7 +1327,7 @@ static int __init mmc_init(void)
>  {
>  	int ret;
>  
> -	workqueue = create_singlethread_workqueue("kmmcd");
> +	workqueue = create_freezeable_workqueue("kmmcd");
>  	if (!workqueue)
>  		return -ENOMEM;


* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05  0:09   ` Andrew Morton
@ 2010-02-05  8:31       ` Maxim Levitsky
  2010-02-05 10:17     ` Adrian Hunter
  1 sibling, 0 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05  8:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote: 
> On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> 
> > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > This function expects userspace to be running, which isn't when .resume is called
> > 
> > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > because card detection logic will kick it later and remove the card.
> 
> I don't really understand.  The above implies that to trigger this bug,
> one needs to physically remove the card during a resume operation.  ie:
> a human-vs-computer race.  Sounds unlikely?
> 
> So...  exactly what steps does the user need to take to trigger this

Sorry for describing this poorly.
The steps are:

-> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
-> Insert MMC/SD card
-> Suspend/hibernate the system
-> While the system is hibernated/suspended, pull the card out
-> Resume the system
-> Hang
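
The suspend/hibernate step above is triggered by a write to /sys/power/state (the state_store frame in the backtrace is that write arriving through sysfs). A minimal sketch of such a write follows; `request_sleep_state()` is a hypothetical helper name, and the path is passed in as a parameter so the function can be exercised against an ordinary file. Calling it as root with "/sys/power/state" and "disk" (hibernate) or "mem" (suspend to RAM) puts the machine to sleep immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Write a sleep-state keyword to a power-state control file.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int request_sleep_state(const char *state_path, const char *state)
{
	FILE *f;
	int err = 0;

	assert(state_path && state);

	f = fopen(state_path, "w");
	if (!f)
		return errno ? -errno : -EIO;

	if (fputs(state, f) == EOF)
		err = -EIO;

	/* The write may only be flushed (and fail) at close time. */
	if (fclose(f) == EOF && !err)
		err = errno ? -errno : -EIO;

	return err;
}
```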


If CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
suspend/resume the card normally, assuming they won't change the card or
modify it in another system. The former case is actually handled quite
well.

If CONFIG_MMC_UNSAFE_RESUME isn't set, mmc core removes the card during
suspend, and I now think (and will test) that this will still hang the
system, this time on suspend.

Maybe we can make del_disk behave well if called with userspace frozen?
After all, if a caller invokes it at that point, the hardware is very likely
absent, so there is no point in syncing (which I think triggers the hang)....

Best regards,
Maxim Levitsky



* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05  0:09   ` Andrew Morton
  2010-02-05  8:31       ` Maxim Levitsky
@ 2010-02-05 10:17     ` Adrian Hunter
  2010-02-05 13:42       ` Maxim Levitsky
  1 sibling, 1 reply; 24+ messages in thread
From: Adrian Hunter @ 2010-02-05 10:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Maxim Levitsky, linux-mmc, Philip Langdale, linux-kernel,
	Schummer Jorg.2 (EXT-Tieto/Espoo),
	nico, nico

ext Andrew Morton wrote:
> On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> 
>> Currently removal of the card leads to del_disk called indirectly by mmc core.
>> This function expects userspace to be running, which isn't when .resume is called
>>
>> Fix that by removing the code that did that in mmc_resume_host. It is possible
>> because card detection logic will kick it later and remove the card.
> 
> I don't really understand.  The above implies that to trigger this bug,
> one needs to physically remove the card during a resume operation.  ie:
> a human-vs-computer race.  Sounds unlikely?
> 
> So...  exactly what steps does the user need to take to trigger this
> bug?
> 
>> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
>> while userspace is frozen.
>>
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 30acd52..879d48d 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
>>  	if (host->caps & MMC_CAP_DISABLE)
>>  		cancel_delayed_work(&host->disable);
>>  	cancel_delayed_work(&host->detect);
>> -	mmc_flush_scheduled_work();
>>  
>>  	mmc_bus_get(host);
>>  	if (host->bus_ops && !host->bus_dead) {
>> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
>>  		mmc_select_voltage(host, host->ocr);
>>  		BUG_ON(!host->bus_ops->resume);
>>  		err = host->bus_ops->resume(host);
>> +
>>  		if (err) {
>>  			printk(KERN_WARNING "%s: error %d during resume "
>>  					    "(card was removed?)\n",
>>  					    mmc_hostname(host), err);
>> -			if (host->bus_ops->remove)
>> -				host->bus_ops->remove(host);
>> -			mmc_claim_host(host);
>> -			mmc_detach_bus(host);
>> -			mmc_release_host(host);
> 
> afaict that code's been there since March 2009.  I'd have thought that
> someone would have noticed "kernel hangs on resume" before now.
> 
> Do you think the patch should be backported into 2.6.32.x and earlier?

It looks like the code was introduced in 2.6.32.x by commit

95cdfb72b9bc568803f395c266152c71b034b461

cc'ing the author Nicolas Pitre

> 
>>  			/* no need to bother upper layers */
>>  			err = 0;
>>  		}
>> @@ -1332,7 +1327,7 @@ static int __init mmc_init(void)
>>  {
>>  	int ret;
>>  
>> -	workqueue = create_singlethread_workqueue("kmmcd");
>> +	workqueue = create_freezeable_workqueue("kmmcd");
>>  	if (!workqueue)
>>  		return -ENOMEM;



* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 10:17     ` Adrian Hunter
@ 2010-02-05 13:42       ` Maxim Levitsky
  0 siblings, 0 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05 13:42 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Andrew Morton, linux-mmc, Philip Langdale, linux-kernel,
	Schummer Jorg.2 (EXT-Tieto/Espoo),
	nico, nico

On Fri, 2010-02-05 at 12:17 +0200, Adrian Hunter wrote: 
> ext Andrew Morton wrote:
> > On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> > 
> >> Currently removal of the card leads to del_disk called indirectly by mmc core.
> >> This function expects userspace to be running, which isn't when .resume is called
> >>
> >> Fix that by removing the code that did that in mmc_resume_host. It is possible
> >> because card detection logic will kick it later and remove the card.
> > 
> > I don't really understand.  The above implies that to trigger this bug,
> > one needs to physically remove the card during a resume operation.  ie:
> > a human-vs-computer race.  Sounds unlikely?
> > 
> > So...  exactly what steps does the user need to take to trigger this
> > bug?
> > 
> >> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
> >> while userspace is frozen.
> >>
> >>
> >> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> >> index 30acd52..879d48d 100644
> >> --- a/drivers/mmc/core/core.c
> >> +++ b/drivers/mmc/core/core.c
> >> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
> >>  	if (host->caps & MMC_CAP_DISABLE)
> >>  		cancel_delayed_work(&host->disable);
> >>  	cancel_delayed_work(&host->detect);
> >> -	mmc_flush_scheduled_work();
> >>  
> >>  	mmc_bus_get(host);
> >>  	if (host->bus_ops && !host->bus_dead) {
> >> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
> >>  		mmc_select_voltage(host, host->ocr);
> >>  		BUG_ON(!host->bus_ops->resume);
> >>  		err = host->bus_ops->resume(host);
> >> +
> >>  		if (err) {
> >>  			printk(KERN_WARNING "%s: error %d during resume "
> >>  					    "(card was removed?)\n",
> >>  					    mmc_hostname(host), err);
> >> -			if (host->bus_ops->remove)
> >> -				host->bus_ops->remove(host);
> >> -			mmc_claim_host(host);
> >> -			mmc_detach_bus(host);
> >> -			mmc_release_host(host);
> > 
> > afaict that code's been there since March 2009.  I'd have thought that
> > someone would have noticed "kernel hangs on resume" before now.
> > 
> > Do you think the patch should be backported into 2.6.32.x and earlier?
> 
> It looks like the code was introduced in 2.6.32.x by commit
> 
> 95cdfb72b9bc568803f395c266152c71b034b461
> 
> cc'ing the author Nicolas Pitre


I don't think this is this commit's fault.
The problem lies somewhere in the block layer.
del_disk hangs if called while userspace is frozen.
Because I assume that this code was tested, I guess that it was once
possible to call del_disk in this way.

Fixing CONFIG_MMC_UNSAFE_RESUME=n not to do del_disk won't be easy...

Best regards,
Maxim Levitsky




* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05  8:31       ` Maxim Levitsky
  (?)
@ 2010-02-05 14:13       ` Andrew Morton
  2010-02-05 14:19         ` Maxim Levitsky
  2010-02-05 14:19         ` Maxim Levitsky
  -1 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2010-02-05 14:13 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 05 Feb 2010 10:31:42 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:

> On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote: 
> > On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> > 
> > > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > > This function expects userspace to be running, which isn't when .resume is called
> > > 
> > > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > > because card detection logic will kick it later and remove the card.
> > 
> > I don't really understand.  The above implies that to trigger this bug,
> > one needs to physically remove the card during a resume operation.  ie:
> > a human-vs-computer race.  Sounds unlikely?
> > 
> > So...  exactly what steps does the user need to take to trigger this
> 
> Sorry for describing this poorly.
> The steps are:
> 
> -> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
> -> Insert MMC/SD card
> -> Suspend/hibernate the system
> -> While system is hibernated/suspended pull the card off
> -> Resume the system
> -> Hang
> 
> 
> if CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
> suspend/resume the card normally assuming he won't change the card or
> modify it in another system. The former case is actually handled quite
> well.
> 
> if CONFIG_MMC_UNSAFE_RESUME isn't set, it removes the card during
> suspend, and I now think (and will test) that this will still hang the
> system this time on suspend.
> 
> Maybe we can make del_disk behave well if called with userspace frozen?
> After all if user calls it, very likely that hardware is absent thus
> there is no point in syncing (which I think triggers the hang)....
> 

There is no del_disk in the kernel.  Let's be more specific (and
accurate!) about the hang.  I assume it's
mmc_remove_card->device_del->kobject_uevent?

Yes, I'd have thought that it would be a good idea for the
kobject_uevent code (or lower, in call_usermodehelper) to take avoiding
action if userspace is frozen.  However such action would probably
involve doing a WARN_ON() too, so we'd still need MMC changes to avoid
that.
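
A rough userspace model of that suggestion, purely for illustration: `struct uevent_model` and `model_emit_uevent()` are hypothetical names, the `warnings` counter stands in for WARN_ON(), and returning -EBUSY is an assumption chosen for the sketch rather than what kobject_uevent actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of an event path that may need a userspace helper. */
struct uevent_model {
	bool userspace_frozen; /* true between freeze and thaw */
	int delivered;         /* events handed to userspace */
	int warnings;          /* stands in for WARN_ON() hits */
};

/*
 * Emit one event. If userspace is frozen, refuse instead of blocking on a
 * helper that cannot run, and record a warning so the caller gets noticed.
 */
static int model_emit_uevent(struct uevent_model *m)
{
	assert(m);

	if (m->userspace_frozen) {
		m->warnings++;
		return -EBUSY;
	}
	m->delivered++;
	return 0;
}
```

The warning count is the reason the MMC side would still need changing in this model: every removal attempted from .resume would trip it.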




* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 14:13       ` Andrew Morton
@ 2010-02-05 14:19         ` Maxim Levitsky
  2010-02-05 14:39           ` Andrew Morton
  2010-02-05 14:39           ` Andrew Morton
  2010-02-05 14:19         ` Maxim Levitsky
  1 sibling, 2 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05 14:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 2010-02-05 at 06:13 -0800, Andrew Morton wrote: 
> On Fri, 05 Feb 2010 10:31:42 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> 
> > On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote: 
> > > On Fri,  5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> > > 
> > > > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > > > This function expects userspace to be running, which isn't when .resume is called
> > > > 
> > > > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > > > because card detection logic will kick it later and remove the card.
> > > 
> > > I don't really understand.  The above implies that to trigger this bug,
> > > one needs to physically remove the card during a resume operation.  ie:
> > > a human-vs-computer race.  Sounds unlikely?
> > > 
> > > So...  exactly what steps does the user need to take to trigger this
> > 
> > Sorry for describing this poorly.
> > The steps are:
> > 
> > -> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
> > -> Insert MMC/SD card
> > -> Suspend/hibernate the system
> > -> While system is hibernated/suspended pull the card off
> > -> Resume the system
> > -> Hang
> > 
> > 
> > if CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
> > suspend/resume the card normally assuming he won't change the card or
> > modify it in another system. The former case is actually handled quite
> > well.
> > 
> > if CONFIG_MMC_UNSAFE_RESUME isn't set, it removes the card during
> > suspend, and I now think (and will test) that this will still hang the
> > system this time on suspend.
> > 
> > Maybe we can make del_disk behave well if called with userspace frozen?
> > After all if user calls it, very likely that hardware is absent thus
> > there is no point in syncing (which I think triggers the hang)....
> > 
> 
> There is no del_disk in the kernel.  Let's be more specific (and
> accurate!) about the hang.  I assume it's
> mmc_remove_card->device_del->kobject_uevent?
Sorry!
I was referring to del_gendisk. 

<4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
<4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
<4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
<4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
<4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
<4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
<4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
<4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
<4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
<4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
<4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
<4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
<4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
<4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
<4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
<4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
<4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
<4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
<4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
<4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
<4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
<4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
<4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
<4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
<4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
<4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]

> 
> Yes, I'd have thought that it would be a good idea for the
> kobject_uevent code (or lower, in call_usermodehelper) to take avoiding
> action if userspace is frozen.  However such action would probably
> involve doing a WARN_ON() too, so we'd still need MMC changes to avoid
> that.
> 
> 

Best regards,
Maxim Levitsky



<4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]

> 
> Yes, I'd have thought that it would be a good idea for the
> kobject_uevent code (or lower, in call_usermodehelper) to take avoiding
> action if userspace is frozen.  However such action would probably
> involve doing a WARN_ON() too, so we'd still need MMC changes to avoid
> that.
> 
> 

Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 14:19         ` Maxim Levitsky
@ 2010-02-05 14:39           ` Andrew Morton
  2010-02-05 15:52             ` Maxim Levitsky
  2010-02-05 15:52             ` Maxim Levitsky
  2010-02-05 14:39           ` Andrew Morton
  1 sibling, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2010-02-05 14:39 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 05 Feb 2010 16:19:20 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:

> > 
> > There is no del_disk in the kernel.  Let's be more specific (and
> > accurate!) about the hang.  I assume it's
> > mmc_remove_card->device_del->kobject_uevent?
> Sorry!
> I was referring to del_gendisk. 
> 
> <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]

So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
because it was calling kobject_uevent, but userspace is frozen.

Why is it this hard :(

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 14:39           ` Andrew Morton
  2010-02-05 15:52             ` Maxim Levitsky
@ 2010-02-05 15:52             ` Maxim Levitsky
  2010-02-05 16:19               ` Madhusudhan
                                 ` (3 more replies)
  1 sibling, 4 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05 15:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 2010-02-05 at 06:39 -0800, Andrew Morton wrote: 
> On Fri, 05 Feb 2010 16:19:20 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> 
> > > 
> > > There is no del_disk in the kernel.  Let's be more specific (and
> > > accurate!) about the hang.  I assume it's
> > > mmc_remove_card->device_del->kobject_uevent?
> > Sorry!
> > I was referring to del_gendisk. 
> > 
> > <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> > <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> > <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> > <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> > <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> > <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> > <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> > <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> > <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> > <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> > <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> > <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> > <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> > <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> > <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> > <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> > <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> > <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> > <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> > <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> > <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> > <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> > <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> > <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> > <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> > <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
> 
> So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
> because it was calling kobject_uevent, but userspace is frozen.

This is the backtrace of the hang.

The patch I sent tries not to remove the card during suspend, by letting
the card presence logic run after the system is fully resumed.

However, if CONFIG_MMC_UNSAFE_RESUME is not set, the card would be removed
at suspend time, again with userspace frozen.
I will need to add some knobs to remove and re-add (if present) the card
once the system is fully resumed. It is possible, but I will need to dig
into the mmc code a bit deeper.

What I suggested is to make del_gendisk suspend-safe somehow.
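
For illustration, the difference between removing the card directly from
the resume callback and deferring the removal until tasks are thawed can
be modelled in plain userspace C. This is only a sketch, not kernel code:
every name below (model_host, model_remove_card, and so on) is
hypothetical, and the assumption that the removal path cannot complete
while tasks are frozen is inferred from the backtrace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
enum model_phase { MODEL_TASKS_FROZEN, MODEL_TASKS_RUNNING };

struct model_host {
    enum model_phase phase;  /* whether tasks are currently frozen */
    bool card_present;       /* is a card physically in the slot? */
    bool card_registered;    /* is the card still registered with the system? */
    bool rescan_pending;     /* has a deferred rescan been queued? */
    bool hung;               /* set when removal was attempted while frozen */
};

/*
 * Stand-in for the removal path. Assumption: it needs running tasks
 * to complete, so attempting it while frozen is recorded as a hang.
 */
static void model_remove_card(struct model_host *h)
{
    assert(h);
    if (h->phase == MODEL_TASKS_FROZEN) {
        h->hung = true;
        return;
    }
    h->card_registered = false;
}

/* Variant A: the resume callback removes a missing card immediately. */
static void model_resume_remove_now(struct model_host *h)
{
    if (h->card_registered && !h->card_present)
        model_remove_card(h);
}

/* Variant B: the resume callback only queues a rescan for later. */
static void model_resume_defer(struct model_host *h)
{
    h->rescan_pending = true;
}

/* Tasks are thawed, then any queued rescan runs and may remove the card. */
static void model_thaw_and_rescan(struct model_host *h)
{
    h->phase = MODEL_TASKS_RUNNING;
    if (!h->rescan_pending)
        return;
    h->rescan_pending = false;
    if (h->card_registered && !h->card_present)
        model_remove_card(h);
}
```

In this model, variant A marks the host as hung when the card is missing
at resume time, while variant B leaves the stale card registered until
model_thaw_and_rescan() removes it with tasks running again.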

Best regards,
Maxim Levitsky


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 15:52             ` Maxim Levitsky
@ 2010-02-05 16:19                 ` Madhusudhan
  2010-02-05 16:19                 ` Madhusudhan
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Madhusudhan @ 2010-02-05 16:19 UTC (permalink / raw)
  To: 'Maxim Levitsky', 'Andrew Morton'
  Cc: linux-mmc, 'Philip Langdale', 'linux-kernel',
	'Jorg Schummer', 'linux-pm'



> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-owner@vger.kernel.org] On Behalf Of Maxim Levitsky
> Sent: Friday, February 05, 2010 9:52 AM
> To: Andrew Morton
> Cc: linux-mmc@vger.kernel.org; Philip Langdale; linux-kernel; Jorg Schummer; linux-pm
> Subject: Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
> 
> On Fri, 2010-02-05 at 06:39 -0800, Andrew Morton wrote:
> > On Fri, 05 Feb 2010 16:19:20 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> >
> > > >
> > > > There is no del_disk in the kernel.  Let's be more specific (and
> > > > accurate!) about the hang.  I assume it's
> > > > mmc_remove_card->device_del->kobject_uevent?
> > > Sorry!
> > > I was referring to del_gendisk.
> > >
> > > <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> > > <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> > > <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> > > <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> > > <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> > > <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> > > <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> > > <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> > > <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> > > <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> > > <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> > > <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> > > <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> > > <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> > > <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> > > <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> > > <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> > > <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> > > <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> > > <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> > > <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> > > <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> > > <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> > > <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> > > <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> > > <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
> >
> > So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
> > because it was calling kobject_uevent, but userspace is frozen.
> 
> This is a backtrace of a hang.
> 
> The patch I sent tries not to remove the card during suspend, by letting
> card presence logic to run after system is fully resumed.
> 

The assumption with CONFIG_MMC_UNSAFE_RESUME is that the card remains in the
slot during suspend. The controller driver can simply ignore a card removal
event if received in the suspended state. Wouldn't that solve your problem?

The system would resume because of the event and reinsertion of the card
would enumerate it again.
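
As a rough illustration of this suggestion, a controller driver that
ignores card-detect events while suspended could be modelled as below.
This is a userspace sketch with hypothetical names (model_ctrl,
model_card_detect_event, and so on), not code from any real driver. Note
that in the backtrace quoted above the removal is triggered from
mmc_resume_host rather than from a card-detect event, so the sketch only
covers the event path.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct model_ctrl {
    bool suspended;       /* set on suspend, cleared on resume */
    int  events_handled;  /* card-detect events passed on for processing */
    int  events_ignored;  /* card-detect events dropped while suspended */
};

static void model_ctrl_suspend(struct model_ctrl *c)
{
    assert(c);
    c->suspended = true;
}

static void model_ctrl_resume(struct model_ctrl *c)
{
    assert(c);
    c->suspended = false;
}

/*
 * Card-detect event entry point. Returns true if the event was passed
 * on, false if it was ignored because the controller is suspended.
 */
static bool model_card_detect_event(struct model_ctrl *c)
{
    assert(c);
    if (c->suspended) {
        c->events_ignored++;
        return false;
    }
    c->events_handled++;
    return true;
}
```

With this model, an event that arrives between model_ctrl_suspend() and
model_ctrl_resume() is only counted as ignored; events before and after
are handled normally.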

Regards,
Madhu

> However if CONFIG_MMC_UNSAFE_RESUME is not set, card would be removed at
> suspend time, again with userspace frozen.
> I will need to add some knobs to remove and add (if present) the card
> when system is fully resumed. It is possible, but I will need to dig the
> mmc code a bit deeper.
> 
> What I suggested is make del_gendisk suspend safe somehow.
> 
> Best regards,
> Maxim Levitsky
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 16:19                 ` Madhusudhan
@ 2010-02-05 16:32                   ` Maxim Levitsky
  -1 siblings, 0 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05 16:32 UTC (permalink / raw)
  To: Madhusudhan
  Cc: 'Andrew Morton', linux-mmc, 'Philip Langdale',
	'linux-kernel', 'Jorg Schummer',
	'linux-pm'

On Fri, 2010-02-05 at 10:19 -0600, Madhusudhan wrote: 
> 
> > -----Original Message-----
> > From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-owner@vger.kernel.org] On Behalf Of Maxim Levitsky
> > Sent: Friday, February 05, 2010 9:52 AM
> > To: Andrew Morton
> > Cc: linux-mmc@vger.kernel.org; Philip Langdale; linux-kernel; Jorg Schummer; linux-pm
> > Subject: Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
> > 
> > On Fri, 2010-02-05 at 06:39 -0800, Andrew Morton wrote:
> > > On Fri, 05 Feb 2010 16:19:20 +0200 Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> > >
> > > > >
> > > > > There is no del_disk in the kernel.  Let's be more specific (and
> > > > > accurate!) about the hang.  I assume it's
> > > > > mmc_remove_card->device_del->kobject_uevent?
> > > > Sorry!
> > > > I was referring to del_gendisk.
> > > >
> > > > <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> > > > <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> > > > <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> > > > <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > > <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> > > > <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> > > > <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > > <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> > > > <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> > > > <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> > > > <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> > > > <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> > > > <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> > > > <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> > > > <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> > > > <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> > > > <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> > > > <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> > > > <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> > > > <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> > > > <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> > > > <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> > > > <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> > > > <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> > > > <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> > > > <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> > > > <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> > > > <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
> > >
> > > So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
> > > because it was calling kobject_uevent, but userspace is frozen.
> > 
> > This is a backtrace of a hang.
> > 
> > The patch I sent tries not to remove the card during suspend, by letting
> > card presence logic to run after system is fully resumed.
> > 
> 
> The assumption with CONFIG_MMC_UNSAFE_RESUME is that the card remains in the
> slot during suspend. The controller driver can simply ignore a card removal
> event if received in the suspended state. Wouldn't that solve your problem?
The assumption is that the user doesn't play games with the card while
the system is in a low-power state. Why should my computer hang if I remove
the card and then resume the system?


> 
> The system would resume because of the event and reinsertion of the card
> would enumerate it again.

I don't know about resume on removal, because it's not supported by most
controllers, and it is very, very dangerous (think about closing the lid,
removing the card, and putting the system in a bag...).


Best regards,
Maxim Levitsky



* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 15:52             ` Maxim Levitsky
  2010-02-05 16:19               ` Madhusudhan
  2010-02-05 16:19                 ` Madhusudhan
@ 2010-02-05 18:26               ` Andrew Morton
  2010-02-05 19:58                 ` Maxim Levitsky
  2010-02-05 19:58                 ` Maxim Levitsky
  2010-02-05 18:26               ` Andrew Morton
  3 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2010-02-05 18:26 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 05 Feb 2010 17:52:00 +0200
Maxim Levitsky <maximlevitsky@gmail.com> wrote:

> > > <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> > > <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> > > <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> > > <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> > > <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> > > <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> > > <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> > > <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> > > <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> > > <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> > > <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> > > <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> > > <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> > > <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> > > <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> > > <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> > > <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> > > <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> > > <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> > > <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> > > <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> > > <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> > > <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> > > <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> > > <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> > > <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
> > 
> > So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
> > because it was calling kobject_uevent, but userspace is frozen.
> 
> This is a backtrace of a hang.

But why did it hang?  Because the BDI worker threads are trying to
perform IO through a suspended device?



* Re: [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled
  2010-02-05 18:26               ` Andrew Morton
  2010-02-05 19:58                 ` Maxim Levitsky
@ 2010-02-05 19:58                 ` Maxim Levitsky
  1 sibling, 0 replies; 24+ messages in thread
From: Maxim Levitsky @ 2010-02-05 19:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mmc, Philip Langdale, linux-kernel, Jorg Schummer, linux-pm

On Fri, 2010-02-05 at 10:26 -0800, Andrew Morton wrote: 
> On Fri, 05 Feb 2010 17:52:00 +0200
> Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> 
> > > > <4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
> > > > <4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
> > > > <4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
> > > > <4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > > <4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
> > > > <4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
> > > > <4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
> > > > <4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
> > > > <4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
> > > > <4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
> > > > <4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
> > > > <4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
> > > > <4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
> > > > <4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
> > > > <4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
> > > > <4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
> > > > <4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
> > > > <4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
> > > > <4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
> > > > <4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
> > > > <4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
> > > > <4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
> > > > <4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
> > > > <4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
> > > > <4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
> > > > <4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
> > > > <4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
> > > > <4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]
> > > 
> > > So what's the hang?  del_gendisk is doing IO?  I'd assumed that it was
> > > because it was calling kobject_uevent, but userspace is frozen.
> > 
> > This is a backtrace of a hang.
> 
> But why did it hang?  Because the BDI worker threads are trying to
> perform IO through a suspended device?
> 
Something like that, I guess.
Also, this is 100% reproducible, and I can reproduce it with my own
driver too (by making the card detection workqueue non-freezable).
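
A rough userspace sketch of the idea discussed in this thread: card
detection that runs while the system is still suspended only records that
a pass is needed, and the real check (which may tear down the block
device) runs once resume has completed. This is not the patch itself and
not kernel code; `struct model_host` and every `model_*` function are
hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_host {
    bool suspended;       /* system is in, or still leaving, a low-power state */
    bool card_present;    /* physical state of the slot */
    bool card_registered; /* a block device is still registered for the card */
    bool detect_pending;  /* a detection pass was deferred */
};

/* Card detection: drop the block device if the card is gone. */
static void model_detect(struct model_host *h)
{
    assert(h != NULL);
    if (h->suspended) {
        /* Defer: the teardown path syncs the filesystem, and that IO
         * may not be able to complete before resume has finished. */
        h->detect_pending = true;
        return;
    }
    h->detect_pending = false;
    if (!h->card_present && h->card_registered)
        h->card_registered = false; /* stands in for the removal path */
}

static void model_suspend(struct model_host *h)
{
    assert(h != NULL);
    h->suspended = true;
}

/* Called only once the system is fully resumed. */
static void model_resume_complete(struct model_host *h)
{
    assert(h != NULL);
    h->suspended = false;
    /* Always re-check: the card may have been pulled without any event. */
    model_detect(h);
}
```

In this model, a detection pass that fires between `model_suspend()` and
`model_resume_complete()` leaves the card registered. Making the detection
work non-freezable corresponds to dropping the `suspended` check, which is
the situation the backtrace above shows.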


Best regards,
Maxim Levitsky



end of thread, other threads:[~2010-02-05 19:59 UTC | newest]

Thread overview: 24+ messages
2010-02-03 17:47 Hand on resume if sd/mmc card was removed while system was suspended/hibernated Maxim Levitsky
2010-02-04 23:18 ` [PATCH] MMC: fix hang if card was removed during suspend and unsafe resume was enabled Maxim Levitsky
2010-02-05  0:09   ` Andrew Morton
2010-02-05  8:31     ` Maxim Levitsky
2010-02-05  8:31       ` Maxim Levitsky
2010-02-05 14:13       ` Andrew Morton
2010-02-05 14:19         ` Maxim Levitsky
2010-02-05 14:39           ` Andrew Morton
2010-02-05 15:52             ` Maxim Levitsky
2010-02-05 15:52             ` Maxim Levitsky
2010-02-05 16:19               ` Madhusudhan
2010-02-05 16:19               ` Madhusudhan
2010-02-05 16:19                 ` Madhusudhan
2010-02-05 16:32                 ` Maxim Levitsky
2010-02-05 16:32                   ` Maxim Levitsky
2010-02-05 18:26               ` Andrew Morton
2010-02-05 19:58                 ` Maxim Levitsky
2010-02-05 19:58                 ` Maxim Levitsky
2010-02-05 18:26               ` Andrew Morton
2010-02-05 14:39           ` Andrew Morton
2010-02-05 14:19         ` Maxim Levitsky
2010-02-05 14:13       ` Andrew Morton
2010-02-05 10:17     ` Adrian Hunter
2010-02-05 13:42       ` Maxim Levitsky
