* kernel 2.6.38.6 MMC controller problem (fwd)
From: Guennadi Liakhovetski @ 2011-05-16  7:06 UTC
  To: linux-mmc
  Cc: Greg KH, Chris Ball, David Strobach, horms, damm,
	Andrei Warkentin, Linus Walleij

(added the ML and previous discussion participants to CC)

Looks like my patch was indeed less obvious than we thought. We need a 
lock-up backtrace, I guess. David, can you use sysrq to get a trace? 
Something like

echo w > /proc/sysrq-trigger
or even
echo t > /proc/sysrq-trigger

and provide the traces? The one with 't' will probably be huge, so maybe you 
could do it with as few tasks running as possible, perhaps without a 
graphical login. Otherwise I would have to test it on my laptop, to which 
I'll get access tomorrow.
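
The two sysrq requests above can also be scripted. This is only a sketch: the helper name trigger_sysrq and its trigger_path parameter are made up for illustration, and writing to the real /proc/sysrq-trigger needs root. 'w' asks the kernel to dump blocked (uninterruptible) tasks and 't' dumps all tasks; the traces end up in the kernel log (dmesg), not on stdout.

```python
import os
import tempfile

# sysrq keys mentioned in this mail and what the kernel dumps for each
SYSRQ_KEYS = {
    "w": "blocked (uninterruptible) tasks",
    "t": "all tasks",
}


def trigger_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write one sysrq key to the trigger file (hypothetical helper).

    Needs root when trigger_path is the real /proc file.
    """
    if key not in SYSRQ_KEYS:
        raise ValueError(f"unsupported sysrq key: {key!r}")
    with open(trigger_path, "w") as f:
        f.write(key)
    return SYSRQ_KEYS[key]


if __name__ == "__main__":
    # Dry run against a scratch file so nothing is triggered by accident.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        scratch = tmp.name
    print("would dump:", trigger_sysrq("w", trigger_path=scratch))
    os.unlink(scratch)
```

Calling trigger_sysrq("w") without the trigger_path override is the equivalent of the first echo command; the dry run in the main block only writes to a temporary file.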

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

---------- Forwarded message ----------
Date: Mon, 16 May 2011 01:57:04 +0200
From: David Strobach <lalochcz@gmail.com>
To: gregkh@suse.de, g.liakhovetski@gmx.de
Cc: horms@verge.net.au, damm@opensource.se
Subject: kernel 2.6.38.6 MMC controller problem

Hello,

I found by bisection that commit 3fe962c (
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.38.y.git;a=commit;h=3fe962c04818a4634255beb3be9f236d36350543)
introduced a regression in MMC card detection. The card is either not detected
or causes the system to hang. There is a related forum thread at
https://bbs.archlinux.org/viewtopic.php?id=118751. The relevant part of my own
log follows:

May 16 00:15:13 localhost kernel: [  134.670685] mmc0: new SD card at address aaaa
May 16 00:15:23 localhost kernel: [  144.715115] mmc0: Timeout waiting for hardware interrupt.
May 16 00:15:23 localhost kernel: [  144.715119] sdhci: =========== REGISTER DUMP (mmc0)===========
May 16 00:15:23 localhost kernel: [  144.715126] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
May 16 00:15:23 localhost kernel: [  144.715133] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
May 16 00:15:23 localhost kernel: [  144.715140] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
May 16 00:15:23 localhost kernel: [  144.715146] sdhci: Present:  0x01ff0001 | Host ctl: 0x00000003
May 16 00:15:23 localhost kernel: [  144.715153] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
May 16 00:15:23 localhost kernel: [  144.715159] sdhci: Wake-up:  0x00000000 | Clock:    0x00000100
May 16 00:15:23 localhost kernel: [  144.715166] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
May 16 00:15:23 localhost kernel: [  144.715172] sdhci: Int enab: 0x02ff00cb | Sig enab: 0x02ff00cb
May 16 00:15:23 localhost kernel: [  144.715178] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
May 16 00:15:23 localhost kernel: [  144.715185] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
May 16 00:15:23 localhost kernel: [  144.715192] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
May 16 00:15:23 localhost kernel: [  144.715194] sdhci: ===========================================
May 16 00:15:31 localhost kernel: [  152.604505] mmc0: Card removed during transfer!
May 16 00:15:31 localhost kernel: [  152.604511] mmc0: Resetting controller.
May 16 00:15:31 localhost kernel: [  152.604568] mmcblk0: unable to set block size to 512: -123
May 16 00:15:31 localhost kernel: [  152.604687] mmcblk: probe of mmc0:aaaa failed with error -22
May 16 00:15:31 localhost kernel: [  152.801534] mmc0: card aaaa removed
May 16 00:15:31 localhost kernel: [  152.814827] mmc0: Got command interrupt 0x00030000 even though no command operation was in progress.
May 16 00:15:31 localhost kernel: [  152.814835] sdhci: =========== REGISTER DUMP (mmc0)===========
May 16 00:15:31 localhost kernel: [  152.814844] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
May 16 00:15:31 localhost kernel: [  152.814853] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
May 16 00:15:31 localhost kernel: [  152.814860] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
May 16 00:15:31 localhost kernel: [  152.814867] sdhci: Present:  0x01f00001 | Host ctl: 0x00000000
May 16 00:15:31 localhost kernel: [  152.814874] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
May 16 00:15:31 localhost kernel: [  152.814881] sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
May 16 00:15:31 localhost kernel: [  152.814888] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
May 16 00:15:31 localhost kernel: [  152.814895] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
May 16 00:15:31 localhost kernel: [  152.814902] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
May 16 00:15:31 localhost kernel: [  152.814909] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
May 16 00:15:31 localhost kernel: [  152.814916] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
May 16 00:15:31 localhost kernel: [  152.814919] sdhci: ===========================================


Regards
David Strobach
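
A few of the values in the two register dumps above can be decoded by hand. The sketch below does so using the bit layout of the SD Host Controller register set, as mirrored in the kernel's drivers/mmc/host/sdhci.h. The Python table and function names are illustrative only and are not part of any driver; the labels CARD_STABLE, CARD_DETECT_PIN, WP_PIN_LEVEL, DATn_LEVEL and CMD_LEVEL are chosen for this example.

```python
# Present State register, one entry per bit of interest.
PRESENT_BITS = {
    0x00000001: "CMD_INHIBIT",
    0x00000002: "DATA_INHIBIT",
    0x00010000: "CARD_PRESENT",
    0x00020000: "CARD_STABLE",
    0x00040000: "CARD_DETECT_PIN",
    0x00080000: "WP_PIN_LEVEL",
    0x00100000: "DAT0_LEVEL",
    0x00200000: "DAT1_LEVEL",
    0x00400000: "DAT2_LEVEL",
    0x00800000: "DAT3_LEVEL",
    0x01000000: "CMD_LEVEL",
}

# Command-related error bits of the interrupt status register.
INT_ERROR_BITS = {
    0x00010000: "CMD_TIMEOUT",
    0x00020000: "CMD_CRC",
    0x00040000: "CMD_END_BIT",
    0x00080000: "CMD_INDEX",
}

# Response type select, bits 1:0 of the Command register.
RESPONSE_TYPES = ["none", "136-bit", "48-bit", "48-bit with busy"]


def flags(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table.items() if value & bit]


def decode_cmd(reg):
    """Split the 16-bit Command register into its fields."""
    return {
        "index": (reg >> 8) & 0x3F,
        "response": RESPONSE_TYPES[reg & 0x03],
        "crc_check": bool(reg & 0x08),
        "index_check": bool(reg & 0x10),
        "data_present": bool(reg & 0x20),
    }


if __name__ == "__main__":
    first, second = 0x01FF0001, 0x01F00001  # "Present:" in the two dumps
    lost = set(flags(first, PRESENT_BITS)) - set(flags(second, PRESENT_BITS))
    print("cleared between dumps:", sorted(lost))
    print("Cmd 0x101a:", decode_cmd(0x101A))
    print("Argument 0x200 =", 0x200)
    print("stray interrupt:", flags(0x00030000, INT_ERROR_BITS))
```

Read this way, the second dump has lost the card-present and card-detect bits, the Cmd/Argument pair is command index 16 (SET_BLOCKLEN) with an argument of 512, which matches the "unable to set block size to 512" line, and 0x00030000 is a command timeout plus a command CRC error.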

* kernel 2.6.38.6 MMC controller problem (fwd)
From: David Strobach @ 2011-05-16  8:42 UTC
  To: linux-mmc

It's actually an oops. The backtrace is attached.

David

[-- Attachment #2: mmc-oops --]
[-- Type: application/octet-stream, Size: 4004 bytes --]

[  354.484676] mmc0: new SDHC card at address b368
[  354.485567] mmcblk0: mmc0:b368 NCard 3.72 GiB 
[  354.486431] divide error: 0000 [#1] PREEMPT SMP 
[  354.487293] last sysfs file: /sys/devices/virtual/bdi/179:0/uevent
[  354.488271] CPU 3 
[  354.488588] Modules linked in: mmc_block usbhid hid netconsole configfs ext2 btusb bluetooth vboxdrv cpufreq_ondemand acpi_cpufreq freq_table mperf uvcvideo videodev v4l2_compat_ioctl32 snd_hda_codec_hdmi tpm_infineon snd_hda_codec_realtek snd_hda_intel nvidia(P) snd_hda_codec snd_pcm snd_timer snd_page_alloc snd_hwdep joydev i2c_core arc4 snd_mixer_oss ecb tpm_tis ehci_hcd tpm toshiba_acpi sparse_keymap tpm_bios usbcore sg toshiba_bluetooth processor snd thermal battery button psmouse video ac soundcore pcspkr iwlagn iwlcore serio_raw mac80211 cfg80211 rfkill sdhci_pci sdhci mmc_core intel_ips intel_agp iTCO_wdt intel_gtt iTCO_vendor_support evdev e1000e ext4 mbcache jbd2 crc16 dm_mod sr_mod cdrom sd_mod ahci libahci libata scsi_mod
[  354.501153] 
[  354.501389] Pid: 3065, comm: mmcqd/0 Tainted: P            2.6.38-ARCH #1 TOSHIBA TECRA S11/Portable PC
[  354.503019] RIP: 0010:[<ffffffffa0127f95>]  [<ffffffffa0127f95>] sdhci_send_command+0x575/0xbf0 [sdhci]
[  354.504532] RSP: 0018:ffff88012c5d1b60  EFLAGS: 00010046
[  354.505368] RAX: 0000000000000000 RBX: ffff88012e33fc80 RCX: 0000000000000000
[  354.506499] RDX: 0000000000000000 RSI: 0000000008000085 RDI: ffff88012e33f800
[  354.507623] RBP: ffff88012c5d1be0 R08: 0000000000000000 R09: 0000000010624dd3
[  354.508747] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88012c5d1ce0
[  354.509878] R13: ffff88012c5d1d50 R14: 0000000000000003 R15: ffff88012c70fc00
[  354.511002] FS:  0000000000000000(0000) GS:ffff8800beec0000(0000) knlGS:0000000000000000
[  354.512280] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  354.513188] CR2: 00007f2f9c881200 CR3: 0000000001693000 CR4: 00000000000006e0
[  354.514312] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  354.515436] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  354.516567] Process mmcqd/0 (pid: 3065, threadinfo ffff88012c5d0000, task ffff88012b1c75f0)
[  354.517883] Stack:
[  354.518200]  ffff88012c5d1bf0 ffffffffa0152e3c 0000000000000000 0000000000000000
[  354.519487]  ffff88012b1c75f0 0000000000000000 0000000000000000 ffff88012b1c75f0
[  354.520765]  ffffffff81051f20 0000000000000286 ffff88012c5d1bd0 ffff88012e33f800
[  354.522044] Call Trace:
[  354.522441]  [<ffffffffa0152e3c>] ? __mmc_claim_host+0x12c/0x180 [mmc_core]
[  354.523543]  [<ffffffff81051f20>] ? default_wake_function+0x0/0x10
[  354.524518]  [<ffffffffa01286f9>] sdhci_request+0xe9/0x110 [sdhci]
[  354.525494]  [<ffffffffa01524e2>] mmc_wait_for_req+0x102/0x140 [mmc_core]
[  354.526571]  [<ffffffffa11576ae>] mmc_blk_issue_rw_rq+0x20e/0x690 [mmc_block]
[  354.527697]  [<ffffffffa01525b0>] ? mmc_wait_done+0x0/0x10 [mmc_core]
[  354.528715]  [<ffffffff811f8018>] ? cfq_dispatch_requests+0x1b8/0xb60
[  354.529740]  [<ffffffffa115802d>] mmc_blk_issue_rq+0x10d/0x1a0 [mmc_block]
[  354.529742]  [<ffffffff811e5e9e>] ? blk_start_request+0x2e/0x40
[  354.529744]  [<ffffffffa11581c2>] mmc_queue_thread+0x102/0x130 [mmc_block]
[  354.529746]  [<ffffffffa11580c0>] ? mmc_queue_thread+0x0/0x130 [mmc_block]
[  354.529754]  [<ffffffff81079a47>] kthread+0x87/0x90
[  354.529758]  [<ffffffff8100bc24>] kernel_thread_helper+0x4/0x10
[  354.529760]  [<ffffffff810799c0>] ? kthread+0x0/0x90
[  354.529761]  [<ffffffff8100bc20>] ? kernel_thread_helper+0x0/0x10
[  354.529762] Code: 01 45 19 f6 41 83 e6 fe 41 83 c6 03 e9 dc fa ff ff 0f 1f 44 00 00 41 8b 4d 04 44 8b 83 64 01 00 00 31 d2 41 b9 d3 4d 62 10 89 c8 <41> f7 f0 41 8b 55 00 89 c1 89 d0 41 f7 e1 c1 ea 06 01 d1 81 e6 
[  354.529774] RIP  [<ffffffffa0127f95>] sdhci_send_command+0x575/0xbf0 [sdhci]
[  354.529777]  RSP <ffff88012c5d1b60>
[  354.529779] ---[ end trace f2e51128a7e7056a ]---
[  354.529780] note: mmcqd/0[3065] exited with preempt_count 1
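
The byte marked <41> in the Code: line is where the CPU faulted. Below is a minimal sketch of decoding just that one instruction by hand. It handles only the register-direct form of the x86-64 F7 opcode group with a 32-bit operand, and the function name decode_f7 is invented for this example. Which line of C this corresponds to is not established at this point in the thread.

```python
# 32-bit register names indexed by the 4-bit register number (REX.B:rm).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi",
         "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"]

# Opcode 0xF7 is a group: the ModRM reg field selects the operation.
F7_GROUP = {0: "test", 2: "not", 3: "neg", 4: "mul",
            5: "imul", 6: "div", 7: "idiv"}


def decode_f7(code):
    """Decode '[REX] F7 /r' with a register operand into 'op reg'."""
    pos = 0
    rex = 0
    if 0x40 <= code[pos] <= 0x4F:  # optional REX prefix
        rex = code[pos]
        pos += 1
    if rex & 0x08:
        raise ValueError("REX.W (64-bit operand) not handled in this sketch")
    if code[pos] != 0xF7:
        raise ValueError("not an F7-group instruction")
    modrm = code[pos + 1]
    if modrm >> 6 != 0b11:
        raise ValueError("only register operands are handled")
    op = F7_GROUP.get((modrm >> 3) & 0x7)
    if op is None:
        raise ValueError("unhandled F7 sub-opcode")
    reg = REG32[(modrm & 0x7) | ((rex & 0x1) << 3)]
    return f"{op} {reg}"


if __name__ == "__main__":
    faulting = bytes.fromhex("41 f7 f0")  # bytes starting at the <41> marker
    print(decode_f7(faulting))
```

This decodes to a 32-bit divide by r8d. The register dump above shows R08 as zero, which is consistent with the "divide error" reported at the top of the oops.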

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
From: Guennadi Liakhovetski @ 2011-05-16  8:45 UTC
  To: David Strobach
  Cc: linux-mmc, Greg KH, Chris Ball, horms, damm, Andrei Warkentin,
	Linus Walleij

"divide error"???... hmmm, maybe one of SDHCI maintainers has to look at 
it?:)

Thanks
Guennadi

On Mon, 16 May 2011, David Strobach wrote:

> It's actually an oops. The backtrace is attached.
> 
> David
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
From: Jaehoon Chung @ 2011-05-16 10:26 UTC
  To: Guennadi Liakhovetski
  Cc: David Strobach, linux-mmc, Greg KH, Chris Ball, horms, damm,
	Andrei Warkentin, Linus Walleij, Kyungmin Park

Hi,

I found a similar case and wonder how I can resolve it.
I haven't found a solution, and I don't know what the problem is.

I would like to know the solution.

->first card inserted (correct card detect)
# mmc1: new SDHC card at address e624
mmcblk1: mmc1:e624 SU04G 3.69 GiB
mmcblk1: p1
#
# mmc1: card e624 removed
-> second card inserted
mmc1: error -110 whilst initialising SD card
mmc1: Card removed during transfer!
mmc1: Resetting controller.
-> third card inserted 
mmc1: new SDHC card at address e624
mmcblk1: mmc1:e624 SU04G 3.69 GiB
  mmcblk1: p1

Regards,
Jaehoon Chung
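
The numeric codes in the log above and in David's log (-110, -123, -22) are negated Linux errno values. The small lookup below translates them. The table is hard-coded because errno numbers differ between operating systems, and the names LINUX_ERRNO and describe are only for illustration.

```python
# Linux errno numbers seen in this thread (values from the kernel's
# asm-generic errno headers).
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    110: ("ETIMEDOUT", "Connection timed out"),
    123: ("ENOMEDIUM", "No medium found"),
}


def describe(code):
    """Translate a negative kernel return code such as -110."""
    name, text = LINUX_ERRNO[-code]
    return f"{code}: {name} ({text})"


if __name__ == "__main__":
    for code in (-110, -123, -22):
        print(describe(code))
```

So "error -110 whilst initialising SD card" is a timeout, while -123 in "unable to set block size to 512: -123" means the kernel no longer saw a medium.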


* Re: kernel 2.6.38.6 MMC controller problem (fwd)
From: Guennadi Liakhovetski @ 2011-05-16 10:39 UTC
  To: Jaehoon Chung
  Cc: David Strobach, linux-mmc, Greg KH, Chris Ball, horms, damm,
	Andrei Warkentin, Linus Walleij, Kyungmin Park

On Mon, 16 May 2011, Jaehoon Chung wrote:

> Hi,
> 
> I found a similar case and wonder how I can resolve it.
> I haven't found a solution, and I don't know what the problem is.

What MMC host driver? Anything in dmesg?

Thanks
Guennadi

> 
> i want to know that solution..
> 
> ->first card inserted (correct card detect)
> # mmc1: new SDHC card at address e624
> mmcblk1: mmc1:e624 SU04G 3.69 GiB
> mmcblk1: p1
> #
> # mmc1: card e624 removed
> -> second card inserted
> mmc1: error -110 whilst initialising SD card
> mmc1: Card removed during transfer!
> mmc1: Resetting controller.
> -> third card inserted 
> mmc1: new SDHC card at address e624
> mmcblk1: mmc1:e624 SU04G 3.69 GiB
>   mmcblk1: p1
> 
> Regards,
> Jaehoon Chung
> 
> Guennadi Liakhovetski wrote:
> > "divide error"???... hmmm, maybe one of SDHCI maintainers has to look at 
> > it?:)
> > 
> > Thanks
> > Guennadi
> > 
> > On Mon, 16 May 2011, David Strobach wrote:
> > 
> >> It's actually an oops. The backtrace is attached.
> >>
> >> David
> >>
> >> On Mon, May 16, 2011 at 09:06, Guennadi Liakhovetski
> >> <g.liakhovetski@gmx.de>wrote:
> >>
> >>> (added the ML and previous discussion participants to CC)
> >>>
> >>> Looks like my patch was indeed less obviious, than we thought. We need a
> >>> lock-up backtrace, I guess. David, can you use a sysrq to get a trace?
> >>> Something like
> >>>
> >>> echo w > /proc/sysrq-trigger
> >>> or even
> >>> echo t > /proc/sysrq-trigger
> >>>
> >>> and provide traces? The one with 't' will be probably huge, so, maybe you
> >>> could do it with as few tasks running as possible, maybe without a
> >>> graphical login. Or I would have to test it with my Laptop, to which I'll
> >>> get access tomorrow.
> >>>
> >>> Thanks
> >>> Guennadi
> >>> ---
> >>> Guennadi Liakhovetski, Ph.D.
> >>> Freelance Open-Source Software Developer
> >>> http://www.open-technology.de/
> >>>
> >>> ---------- Forwarded message ----------
> >>> Date: Mon, 16 May 2011 01:57:04 +0200
> >>> From: David Strobach <lalochcz@gmail.com>
> >>> To: gregkh@suse.de, g.liakhovetski@gmx.de
> >>> Cc: horms@verge.net.au, damm@opensource.se
> >>> Subject: kernel 2.6.38.6 MMC controller problem
> >>>
> >>> Hello,
> >>>
> >>> I found (by bisection), that the commit 3fe962c (
> >>>
> >>> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.38.y.git;a=commit;h=3fe962c04818a4634255beb3be9f236d36350543
> >>> )
> >>> introduced regression in MMC card detection. The card is either not
> >>> detected
> >>> or causes the system to hang. There is related forum thread at
> >>> https://bbs.archlinux.org/viewtopic.php?id=118751. Relevant part of my own
> >>> log follows:
> >>>
> >>> May 16 00:15:13 localhost kernel: [  134.670685] mmc0: new SD card at address aaaa
> >>> May 16 00:15:23 localhost kernel: [  144.715115] mmc0: Timeout waiting for hardware interrupt.
> >>> May 16 00:15:23 localhost kernel: [  144.715119] sdhci: =========== REGISTER DUMP (mmc0)===========
> >>> May 16 00:15:23 localhost kernel: [  144.715126] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
> >>> May 16 00:15:23 localhost kernel: [  144.715133] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
> >>> May 16 00:15:23 localhost kernel: [  144.715140] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
> >>> May 16 00:15:23 localhost kernel: [  144.715146] sdhci: Present:  0x01ff0001 | Host ctl: 0x00000003
> >>> May 16 00:15:23 localhost kernel: [  144.715153] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
> >>> May 16 00:15:23 localhost kernel: [  144.715159] sdhci: Wake-up:  0x00000000 | Clock:    0x00000100
> >>> May 16 00:15:23 localhost kernel: [  144.715166] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
> >>> May 16 00:15:23 localhost kernel: [  144.715172] sdhci: Int enab: 0x02ff00cb | Sig enab: 0x02ff00cb
> >>> May 16 00:15:23 localhost kernel: [  144.715178] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
> >>> May 16 00:15:23 localhost kernel: [  144.715185] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
> >>> May 16 00:15:23 localhost kernel: [  144.715192] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
> >>> May 16 00:15:23 localhost kernel: [  144.715194] sdhci: ===========================================
> >>> May 16 00:15:31 localhost kernel: [  152.604505] mmc0: Card removed during transfer!
> >>> May 16 00:15:31 localhost kernel: [  152.604511] mmc0: Resetting controller.
> >>> May 16 00:15:31 localhost kernel: [  152.604568] mmcblk0: unable to set block size to 512: -123
> >>> May 16 00:15:31 localhost kernel: [  152.604687] mmcblk: probe of mmc0:aaaa failed with error -22
> >>> May 16 00:15:31 localhost kernel: [  152.801534] mmc0: card aaaa removed
> >>> May 16 00:15:31 localhost kernel: [  152.814827] mmc0: Got command interrupt 0x00030000 even though no command operation was in progress.
> >>> May 16 00:15:31 localhost kernel: [  152.814835] sdhci: =========== REGISTER DUMP (mmc0)===========
> >>> May 16 00:15:31 localhost kernel: [  152.814844] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
> >>> May 16 00:15:31 localhost kernel: [  152.814853] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
> >>> May 16 00:15:31 localhost kernel: [  152.814860] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
> >>> May 16 00:15:31 localhost kernel: [  152.814867] sdhci: Present:  0x01f00001 | Host ctl: 0x00000000
> >>> May 16 00:15:31 localhost kernel: [  152.814874] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
> >>> May 16 00:15:31 localhost kernel: [  152.814881] sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
> >>> May 16 00:15:31 localhost kernel: [  152.814888] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
> >>> May 16 00:15:31 localhost kernel: [  152.814895] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
> >>> May 16 00:15:31 localhost kernel: [  152.814902] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
> >>> May 16 00:15:31 localhost kernel: [  152.814909] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
> >>> May 16 00:15:31 localhost kernel: [  152.814916] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
> >>> May 16 00:15:31 localhost kernel: [  152.814919] sdhci: ===========================================
> >>>
> >>>
> >>> Regards
> >>> David Strobach
> >>>
> > 
> > ---
> > Guennadi Liakhovetski, Ph.D.
> > Freelance Open-Source Software Developer
> > http://www.open-technology.de/
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 10:39       ` Guennadi Liakhovetski
@ 2011-05-16 10:48         ` Jaehoon Chung
  0 siblings, 0 replies; 19+ messages in thread
From: Jaehoon Chung @ 2011-05-16 10:48 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Jaehoon Chung, David Strobach, linux-mmc, Greg KH, Chris Ball,
	horms, damm, Andrei Warkentin, Linus Walleij, Kyungmin Park

Guennadi Liakhovetski wrote:
> On Mon, 16 May 2011, Jaehoon Chung wrote:
> 
>> Hi..
>>
>> I found a similar case and wondered how I could resolve it,
>> but I didn't find a solution and I don't know what the problem is.
> 
> What MMC host driver? Anything in dmesg?

I'm using the sdhci and dw_mmc controllers. I will resend the dmesg log.

Thanks,
Jaehoon Chung

> 
> Thanks
> Guennadi
> 
>> I would like to know the solution.
>>
>> ->first card inserted (correct card detect)
>> # mmc1: new SDHC card at address e624
>> mmcblk1: mmc1:e624 SU04G 3.69 GiB
>> mmcblk1: p1
>> #
>> # mmc1: card e624 removed
>> -> second card inserted
>> mmc1: error -110 whilst initialising SD card
>> mmc1: Card removed during transfer!
>> mmc1: Resetting controller.
>> -> third card inserted 
>> mmc1: new SDHC card at address e624
>> mmcblk1: mmc1:e624 SU04G 3.69 GiB
>>   mmcblk1: p1
>>
>> Regards,
>> Jaehoon Chung
>>
>> Guennadi Liakhovetski wrote:
>>> "divide error"???... hmmm, maybe one of SDHCI maintainers has to look at 
>>> it?:)
>>>
>>> Thanks
>>> Guennadi
>>>
>>> On Mon, 16 May 2011, David Strobach wrote:
>>>
>>>> It's actually an oops. The backtrace is attached.
>>>>
>>>> David
>>>>
>>>> On Mon, May 16, 2011 at 09:06, Guennadi Liakhovetski
>>>> <g.liakhovetski@gmx.de>wrote:
>>>>
>>>>> (added the ML and previous discussion participants to CC)
>>>>>
>>>>> Looks like my patch was indeed less obvious than we thought. We need a
>>>>> lock-up backtrace, I guess. David, can you use a sysrq to get a trace?
>>>>> Something like
>>>>>
>>>>> echo w > /proc/sysrq-trigger
>>>>> or even
>>>>> echo t > /proc/sysrq-trigger
>>>>>
>>>>> and provide traces? The one with 't' will be probably huge, so, maybe you
>>>>> could do it with as few tasks running as possible, maybe without a
>>>>> graphical login. Or I would have to test it with my Laptop, to which I'll
>>>>> get access tomorrow.
>>>>>
>>>>> Thanks
>>>>> Guennadi
>>>>> ---
>>>>> Guennadi Liakhovetski, Ph.D.
>>>>> Freelance Open-Source Software Developer
>>>>> http://www.open-technology.de/
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> Date: Mon, 16 May 2011 01:57:04 +0200
>>>>> From: David Strobach <lalochcz@gmail.com>
>>>>> To: gregkh@suse.de, g.liakhovetski@gmx.de
>>>>> Cc: horms@verge.net.au, damm@opensource.se
>>>>> Subject: kernel 2.6.38.6 MMC controller problem
>>>>>
>>>>> Hello,
>>>>>
>>>>> I found (by bisection) that commit 3fe962c
>>>>> (http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.38.y.git;a=commit;h=3fe962c04818a4634255beb3be9f236d36350543)
>>>>> introduced a regression in MMC card detection. The card is either not
>>>>> detected or causes the system to hang. There is a related forum thread at
>>>>> https://bbs.archlinux.org/viewtopic.php?id=118751. Relevant part of my own
>>>>> log follows:
>>>>>
>>>>> May 16 00:15:13 localhost kernel: [  134.670685] mmc0: new SD card at address aaaa
>>>>> May 16 00:15:23 localhost kernel: [  144.715115] mmc0: Timeout waiting for hardware interrupt.
>>>>> May 16 00:15:23 localhost kernel: [  144.715119] sdhci: =========== REGISTER DUMP (mmc0)===========
>>>>> May 16 00:15:23 localhost kernel: [  144.715126] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
>>>>> May 16 00:15:23 localhost kernel: [  144.715133] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
>>>>> May 16 00:15:23 localhost kernel: [  144.715140] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
>>>>> May 16 00:15:23 localhost kernel: [  144.715146] sdhci: Present:  0x01ff0001 | Host ctl: 0x00000003
>>>>> May 16 00:15:23 localhost kernel: [  144.715153] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
>>>>> May 16 00:15:23 localhost kernel: [  144.715159] sdhci: Wake-up:  0x00000000 | Clock:    0x00000100
>>>>> May 16 00:15:23 localhost kernel: [  144.715166] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
>>>>> May 16 00:15:23 localhost kernel: [  144.715172] sdhci: Int enab: 0x02ff00cb | Sig enab: 0x02ff00cb
>>>>> May 16 00:15:23 localhost kernel: [  144.715178] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
>>>>> May 16 00:15:23 localhost kernel: [  144.715185] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
>>>>> May 16 00:15:23 localhost kernel: [  144.715192] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
>>>>> May 16 00:15:23 localhost kernel: [  144.715194] sdhci: ===========================================
>>>>> May 16 00:15:31 localhost kernel: [  152.604505] mmc0: Card removed during transfer!
>>>>> May 16 00:15:31 localhost kernel: [  152.604511] mmc0: Resetting controller.
>>>>> May 16 00:15:31 localhost kernel: [  152.604568] mmcblk0: unable to set block size to 512: -123
>>>>> May 16 00:15:31 localhost kernel: [  152.604687] mmcblk: probe of mmc0:aaaa failed with error -22
>>>>> May 16 00:15:31 localhost kernel: [  152.801534] mmc0: card aaaa removed
>>>>> May 16 00:15:31 localhost kernel: [  152.814827] mmc0: Got command interrupt 0x00030000 even though no command operation was in progress.
>>>>> May 16 00:15:31 localhost kernel: [  152.814835] sdhci: =========== REGISTER DUMP (mmc0)===========
>>>>> May 16 00:15:31 localhost kernel: [  152.814844] sdhci: Sys addr: 0xbae85840 | Version:  0x00000400
>>>>> May 16 00:15:31 localhost kernel: [  152.814853] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000001
>>>>> May 16 00:15:31 localhost kernel: [  152.814860] sdhci: Argument: 0x00000200 | Trn mode: 0x00000013
>>>>> May 16 00:15:31 localhost kernel: [  152.814867] sdhci: Present:  0x01f00001 | Host ctl: 0x00000000
>>>>> May 16 00:15:31 localhost kernel: [  152.814874] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
>>>>> May 16 00:15:31 localhost kernel: [  152.814881] sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
>>>>> May 16 00:15:31 localhost kernel: [  152.814888] sdhci: Timeout:  0x00000009 | Int stat: 0x00000000
>>>>> May 16 00:15:31 localhost kernel: [  152.814895] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
>>>>> May 16 00:15:31 localhost kernel: [  152.814902] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
>>>>> May 16 00:15:31 localhost kernel: [  152.814909] sdhci: Caps:     0x01e032b2 | Caps_1:   0x00000000
>>>>> May 16 00:15:31 localhost kernel: [  152.814916] sdhci: Cmd:      0x0000101a | Max curr: 0x00000040
>>>>> May 16 00:15:31 localhost kernel: [  152.814919] sdhci: ===========================================
>>>>>
>>>>>
>>>>> Regards
>>>>> David Strobach
>>>>>
>>> ---
>>> Guennadi Liakhovetski, Ph.D.
>>> Freelance Open-Source Software Developer
>>> http://www.open-technology.de/
>>>
> 
> ---
> Guennadi Liakhovetski, Ph.D.
> Freelance Open-Source Software Developer
> http://www.open-technology.de/
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16  8:45   ` Guennadi Liakhovetski
  2011-05-16 10:26     ` Jaehoon Chung
@ 2011-05-16 14:52     ` Chris Ball
  2011-05-16 15:15       ` David Strobach
  1 sibling, 1 reply; 19+ messages in thread
From: Chris Ball @ 2011-05-16 14:52 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

Hi,

On Mon, May 16 2011, Guennadi Liakhovetski wrote:
> "divide error"???... hmmm, maybe one of SDHCI maintainers has to look at 
> it?:)

Just tried to reproduce this, and can't.  David, could you share your
.config please?

I can't see any obvious candidates for a divide error.  Guennadi, do you
think we should back out your patch while investigating?  Is anyone else
able to reproduce the bug?

(Jaehoon's report looks to me like it could be unrelated, since there's
no divide error there..)

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 14:52     ` Chris Ball
@ 2011-05-16 15:15       ` David Strobach
  2011-05-16 15:56         ` Chris Ball
  0 siblings, 1 reply; 19+ messages in thread
From: David Strobach @ 2011-05-16 15:15 UTC (permalink / raw)
  To: Chris Ball
  Cc: Guennadi Liakhovetski, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

Hi,

On Mon, May 16, 2011 at 16:52, Chris Ball <cjb@laptop.org> wrote:

> Just tried to reproduce this, and can't.  David, could you share your
> .config please?

I use the stock archlinux kernel, so the .config is here:
http://projects.archlinux.org/svntogit/packages.git/tree/kernel26/repos/core-x86_64

> (Jaehoon's report looks to me like it could be unrelated, since there's
> no divide error there..)

I don't always end up with the divide error either. See the log in the
OP. We can also ask for logs on the archlinux forum thread I
referenced in the OP.

This is my MMC controller, in case it matters:

02:00.0 SD Host controller: Ricoh Co Ltd MMC/SD Host Controller (rev 01)
        Subsystem: Toshiba America Info Systems Device 0001
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at d7200200 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [78] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME+
        Capabilities: [80] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn+ AttnInd+ PwrInd+ RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Capabilities: [800 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Kernel driver in use: sdhci-pci
        Kernel modules: sdhci-pci

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 15:15       ` David Strobach
@ 2011-05-16 15:56         ` Chris Ball
  2011-05-16 16:07           ` Guennadi Liakhovetski
                             ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Chris Ball @ 2011-05-16 15:56 UTC (permalink / raw)
  To: David Strobach
  Cc: Guennadi Liakhovetski, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

Hi,

On Mon, May 16 2011, David Strobach wrote:
> I don't always end up with the divide error either. See the log in the
> OP. We can also ask for logs on the archlinux forum thread, I
> referenced in the OP.

Oh, that's interesting -- https://bugs.archlinux.org/task/23778 was
opened on April 15th with the divide error crash, but I didn't send
Guennadi's patch to Linus until May 9th.  So this patch can't be the
whole story.

David, are you very confident in the bisection being correct?
Also, perhaps you could confirm that setting CONFIG_MMC_CLKGATE=n
makes all of the problems go away, even with the bad patch applied?

Thanks.  I'll plan on sending Linus a revert of Guennadi's patch today,
assuming he doesn't release 2.6.39 within a few hours..

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 15:56         ` Chris Ball
@ 2011-05-16 16:07           ` Guennadi Liakhovetski
  2011-05-16 16:37             ` Chris Ball
  2011-05-16 16:07           ` David Strobach
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Guennadi Liakhovetski @ 2011-05-16 16:07 UTC (permalink / raw)
  To: Chris Ball
  Cc: David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

On Mon, 16 May 2011, Chris Ball wrote:

> Hi,
> 
> On Mon, May 16 2011, David Strobach wrote:
> > I don't always end up with the divide error either. See the log in the
> > OP. We can also ask for logs on the archlinux forum thread, I
> > referenced in the OP.
> 
> Oh, that's interesting -- https://bugs.archlinux.org/task/23778 was
> opened on April 15th with the divide error crash, but I didn't send
> Guennadi's patch to Linus until May 9th.  So this patch can't be the
> whole story.
> 
> David, are you very confident in the bisection being correct?
> Also, perhaps you could confirm that setting CONFIG_MMC_CLKGATE=n
> makes all of the problems go away, even with the bad patch applied?
> 
> Thanks.  I'll plan on sending Linus a revert of Guennadi's patch today,
> assuming he doesn't release 2.6.39 within a few hours..

Hm, don't know... The patch _definitely_ fixes problems in some 
configurations, and _maybe_ causes problems in others. You cannot 
reproduce the problem. Have you got a .config from the OP? We can also fix 
the problem in stable - whether it turns out to be my patch or not. Can 
you maybe wait for about 6 hours, until I get a chance to test this on my 
laptop? No, I don't have archlinux on it. What distro(s) have you tried? 
Do all reporters use archlinux?

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 15:56         ` Chris Ball
  2011-05-16 16:07           ` Guennadi Liakhovetski
@ 2011-05-16 16:07           ` David Strobach
  2011-05-16 16:29           ` David Strobach
  2011-05-18 13:29           ` David Strobach
  3 siblings, 0 replies; 19+ messages in thread
From: David Strobach @ 2011-05-16 16:07 UTC (permalink / raw)
  To: Chris Ball
  Cc: Guennadi Liakhovetski, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

On Mon, May 16, 2011 at 17:56, Chris Ball <cjb@laptop.org> wrote:
> Oh, that's interesting -- https://bugs.archlinux.org/task/23778 was
> opened on April 15th with the divide error crash, but I didn't send
> Guennadi's patch to Linus until May 9th.  So this patch can't be the
> whole story.

I don't think the Arch issue 23778 is related to this problem. This
crash happens as soon as the card is inserted.

> David, are you very confident in the bisection being correct?

Well, I'm quite sure. I was only searching in patches between 2.6.38.5
and 2.6.38.6 and the problem is 100% reproducible.

> Also, perhaps you could confirm that setting CONFIG_MMC_CLKGATE=n
> makes all of the problems go away, even with the bad patch applied?

OK, I'll try.

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 15:56         ` Chris Ball
  2011-05-16 16:07           ` Guennadi Liakhovetski
  2011-05-16 16:07           ` David Strobach
@ 2011-05-16 16:29           ` David Strobach
  2011-05-18 13:29           ` David Strobach
  3 siblings, 0 replies; 19+ messages in thread
From: David Strobach @ 2011-05-16 16:29 UTC (permalink / raw)
  To: Chris Ball
  Cc: Guennadi Liakhovetski, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

On Mon, May 16, 2011 at 17:56, Chris Ball <cjb@laptop.org> wrote:
> Also, perhaps you could confirm that setting CONFIG_MMC_CLKGATE=n
> makes all of the problems go away, even with the bad patch applied?

Confirmed. No problem with CONFIG_MMC_CLKGATE=n
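
For anyone else who wants to check whether their own kernel was built
with clock gating, here is a minimal sketch. It assumes a plain-text
kernel config file is available. The helper name clkgate_state and the
default path /boot/config-$(uname -r) are illustrative assumptions, not
something taken from this thread; on kernels built with
CONFIG_IKCONFIG_PROC, "zcat /proc/config.gz" should give an equivalent
file.

```sh
#!/bin/sh
# Report the state of CONFIG_MMC_CLKGATE in a kernel .config-style file.
# clkgate_state is a hypothetical helper name; the default path is an
# assumption that only holds on distros that install /boot/config-<version>.
clkgate_state() {
    config_file=${1:-/boot/config-$(uname -r)}
    if grep -q '^CONFIG_MMC_CLKGATE=y$' "$config_file" 2>/dev/null; then
        # Option is built in (it is a bool, so there is no =m case).
        echo enabled
    elif grep -q '^# CONFIG_MMC_CLKGATE is not set$' "$config_file" 2>/dev/null; then
        # Kconfig writes disabled bool options as a comment line.
        echo disabled
    else
        # File missing, unreadable, or the option is not mentioned at all.
        echo unknown
    fi
}
```

Usage: "clkgate_state /path/to/.config" prints enabled, disabled, or
unknown.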

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 16:07           ` Guennadi Liakhovetski
@ 2011-05-16 16:37             ` Chris Ball
  2011-05-16 16:42               ` Guennadi Liakhovetski
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Ball @ 2011-05-16 16:37 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

Hi,

On Mon, May 16 2011, Guennadi Liakhovetski wrote:
>> Thanks.  I'll plan on sending Linus a revert of Guennadi's patch today,
>> assuming he doesn't release 2.6.39 within a few hours..
>
> Hm, don't know... The patch _definitely_ fixes problems in some 
> configurations, and _maybe_ causes problems in others. You cannot 
> reproduce the problem.

It looks like around six different people on:
   https://bbs.archlinux.org/viewtopic.php?id=118751

say that moving from 38.5 to 38.6 makes their system crash at boot
or resume time if they have an SD card inserted -- this seems like
reasonably strong evidence, and the problem being seen after the
patch (full system crash) is significantly worse than the problem
being seen before the patch (occasional misdetection of insertion).

> Have you got a .config from the OP?

Yes, http://projects.archlinux.org/svntogit/packages.git/tree/kernel26/repos/core-x86_64/config.x86_64.
I checked that it does indeed have CLKGATE=y.

> We can also fix the problem in stable - whether it turns out to be my
> patch or not.

It would be a shame to release with a boot crash, though.

> Can you maybe wait for about 6 hours, until I get a chance to test
> this on my laptop?

My intuition is that we should revert it now -- it's been a week since
the final -rc, so you'd expect .39 to be released today, and we can
always add your patch back into stable via .40-rc1 if it's not related
to the crashes.

> No, I don't have archlinux on it. What distro(s) have you tried?  Do
> all reporters use archlinux?

I've tried Ubuntu.  All reporters so far use Archlinux, but it's quite
plausible that we're only seeing the bug from Archlinux users because
only Archlinux has pushed out 2.6.38.6 to its users already..

Thanks,

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 16:37             ` Chris Ball
@ 2011-05-16 16:42               ` Guennadi Liakhovetski
  2011-05-16 17:56                 ` Linus Walleij
  0 siblings, 1 reply; 19+ messages in thread
From: Guennadi Liakhovetski @ 2011-05-16 16:42 UTC (permalink / raw)
  To: Chris Ball
  Cc: David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

On Mon, 16 May 2011, Chris Ball wrote:

> Hi,
> 
> On Mon, May 16 2011, Guennadi Liakhovetski wrote:
> >> Thanks.  I'll plan on sending Linus a revert of Guennadi's patch today,
> >> assuming he doesn't release 2.6.39 within a few hours..
> >
> > Hm, don't know... The patch _definitely_ fixes problems in some 
> > configurations, and _maybe_ causes problems in others. You cannot 
> > reproduce the problem.
> 
> It looks like around six different people on:
>    https://bbs.archlinux.org/viewtopic.php?id=118751
> 
> say that moving from 38.5 to 38.6 makes their system crash at boot
> or resume time if they have an SD card inserted -- this seems like
> reasonably strong evidence, and the problem being seen after the
> patch (full system crash) is significantly worse than the problem
> being seen before the patch (occasional misdetection of insertion).
> 
> > Have you got a .config from the OP?
> 
> Yes, http://projects.archlinux.org/svntogit/packages.git/tree/kernel26/repos/core-x86_64/config.x86_64.
> I checked that it does indeed have CLKGATE=y.
> 
> > We can also fix the problem in stable - whether it turns out to be my
> > patch or not.
> 
> It would be a shame to release with a boot crash, though.
> 
> > Can you maybe wait for about 6 hours, until I get a chance to test
> > this on my laptop?
> 
> My intuition is that we should revert it now -- it's been a week since
> the final -rc, so you'd expect .39 to be released today, and we can
> always add your patch back into stable via .40-rc1 if it's not related
> to the crashes.

Ok, let's revert it then. We just have to remember to keep working on this
problem and find a proper fix.

Thanks
Guennadi

> 
> > No, I don't have archlinux on it. What distro(s) have you tried?  Do
> > all reporters use archlinux?
> 
> I've tried Ubuntu.  All reporters so far use Archlinux, but it's quite
> plausible that we're only seeing the bug from Archlinux users because
> only Archlinux has pushed out 2.6.38.6 to its users already..
> 
> Thanks,
> 
> - Chris.
> -- 
> Chris Ball   <cjb@laptop.org>   <http://printf.net/>
> One Laptop Per Child
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 16:42               ` Guennadi Liakhovetski
@ 2011-05-16 17:56                 ` Linus Walleij
  2011-05-17  9:20                   ` Dong, Chuanxiao
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Walleij @ 2011-05-16 17:56 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Chris Ball, David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin

2011/5/16 Guennadi Liakhovetski <g.liakhovetski@gmx.de>:
> On Mon, 16 May 2011, Chris Ball wrote:
>> My intuition is that we should revert it now -- it's been a week since
>> the final -rc, so you'd expect .39 to be released today, and we can
>> always add your patch back into stable via .40-rc1 if it's not related
>> to the crashes.
>
> Ok, let's revert it then. We just have to remember to keep working on this
> problem and find a proper fix.

I'm still baffled by the whole thing, the solution seems so intuitively
correct. :-(

Is there something strange in the semantics of mmc_claim_host()?

Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 17:56                 ` Linus Walleij
@ 2011-05-17  9:20                   ` Dong, Chuanxiao
  2011-05-18  5:29                     ` Linus Walleij
  0 siblings, 1 reply; 19+ messages in thread
From: Dong, Chuanxiao @ 2011-05-17  9:20 UTC (permalink / raw)
  To: Linus Walleij, Guennadi Liakhovetski
  Cc: Chris Ball, David Strobach, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin

> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org
> [mailto:linux-mmc-owner@vger.kernel.org] On Behalf Of Linus Walleij
> Sent: Tuesday, May 17, 2011 1:56 AM
> To: Guennadi Liakhovetski
> Cc: Chris Ball; David Strobach; linux-mmc@vger.kernel.org; Greg KH;
> horms@verge.net.au; damm@opensource.se; Andrei Warkentin
> Subject: Re: kernel 2.6.38.6 MMC controller problem (fwd)
> 
> 2011/5/16 Guennadi Liakhovetski <g.liakhovetski@gmx.de>:
> > On Mon, 16 May 2011, Chris Ball wrote:
> >> My intuition is that we should revert it now -- it's been a week since
> >> the final -rc, so you'd expect .39 to be released today, and we can
> >> always add your patch back into stable via .40-rc1 if it's not related
> >> to the crashes.
> >
> > Ok, let's revert it then. We just have to remember to keep working on this
> > problem and find a proper fix.
> 
> I'm still baffled by the whole thing, the solution seems so intuitively
> correct. :-(
> 
> Is there something strange in the semantics of mmc_claim_host()?
> 
The mmc clock gating thread may be delayed for a long time in some
scenarios, since mmc_claim_host() is also used by other threads. I think
this can happen, but I'm not sure how the system handles such a long delay.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-17  9:20                   ` Dong, Chuanxiao
@ 2011-05-18  5:29                     ` Linus Walleij
  2011-05-18  6:01                       ` Dong, Chuanxiao
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Walleij @ 2011-05-18  5:29 UTC (permalink / raw)
  To: Dong, Chuanxiao
  Cc: Guennadi Liakhovetski, Chris Ball, David Strobach, linux-mmc,
	Greg KH, horms, damm, Andrei Warkentin

2011/5/17 Dong, Chuanxiao <chuanxiao.dong@intel.com>:

> The mmc clock gating thread may be delayed for a long time in some
> scenarios, since mmc_claim_host() is also used by other threads. I think
> this can happen.

That shouldn't matter? host->lock will make sure the requests are
serialized, will it not?

Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-18  5:29                     ` Linus Walleij
@ 2011-05-18  6:01                       ` Dong, Chuanxiao
  0 siblings, 0 replies; 19+ messages in thread
From: Dong, Chuanxiao @ 2011-05-18  6:01 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Guennadi Liakhovetski, Chris Ball, David Strobach, linux-mmc,
	Greg KH, horms, damm, Andrei Warkentin



> -----Original Message-----
> From: linus.ml.walleij@gmail.com [mailto:linus.ml.walleij@gmail.com] On Behalf Of
> Linus Walleij
> Sent: Wednesday, May 18, 2011 1:30 PM
> To: Dong, Chuanxiao
> Cc: Guennadi Liakhovetski; Chris Ball; David Strobach; linux-mmc@vger.kernel.org;
> Greg KH; horms@verge.net.au; damm@opensource.se; Andrei Warkentin
> Subject: Re: kernel 2.6.38.6 MMC controller problem (fwd)
> 
> 2011/5/17 Dong, Chuanxiao <chuanxiao.dong@intel.com>:
> 
> > The mmc clock gating thread may be delayed for a long time in some
> > scenarios, since mmc_claim_host() is also used by other threads. I think
> > this can happen.
> 
> That shouldn't matter? host->lock will make sure the requests are
> serialized, will it not?

If the mmc clock gating thread is scheduled out while calling
mmc_claim_host(), host->clock is unlocked. By the time it is scheduled
back in, the host may already have been claimed by another thread. If
that can happen, then host->clock cannot serialize this.


> 
> Linus Walleij
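
[Editorial sketch, not part of the original message.] The window described
above can be illustrated with a small, deterministic toy model. It is not
kernel code: the Host class, gate_decide(), claim() and the field names are
all hypothetical, and the real mmc_claim_host would sleep rather than return
False. The interleaving is replayed step by step in one thread, so the
outcome does not depend on scheduling.

```python
import threading


class Host:
    """Toy stand-in for an MMC host; every name here is hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()  # protects clk_requests only
        self.clk_requests = 0         # outstanding clock users
        self.claimed_by = None        # current owner of the host


def gate_decide(host):
    # Gating worker, step 1: the idle check runs under the lock,
    # and the lock is dropped again before the host is claimed.
    with host.lock:
        return host.clk_requests == 0


def claim(host, who):
    # Simplified claim: succeeds only if nobody holds the host.
    # (Assumption: the real call would block instead of failing.)
    if host.claimed_by is None:
        host.claimed_by = who
        return True
    return False


def start_request(host, who):
    # Another thread claims the host and registers a clock user.
    ok = claim(host, who)
    if ok:
        with host.lock:
            host.clk_requests += 1
    return ok


host = Host()
decided_idle = gate_decide(host)        # worker sees an idle host
io_claimed = start_request(host, "io")  # worker is scheduled out here
gate_claimed = claim(host, "gate")      # worker resumes, host is taken

# The lock ordered the check, but not the check plus the claim.
stale = decided_idle and host.clk_requests > 0
print("worker decision is stale:", stale)
```

The lock only covers the check itself, so nothing orders the worker's
decision against the other thread's claim. Whether this is what actually
happens in the reported hang is not established in this thread.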


* Re: kernel 2.6.38.6 MMC controller problem (fwd)
  2011-05-16 15:56         ` Chris Ball
                             ` (2 preceding siblings ...)
  2011-05-16 16:29           ` David Strobach
@ 2011-05-18 13:29           ` David Strobach
  3 siblings, 0 replies; 19+ messages in thread
From: David Strobach @ 2011-05-18 13:29 UTC (permalink / raw)
  To: Chris Ball
  Cc: Guennadi Liakhovetski, linux-mmc, Greg KH, horms, damm,
	Andrei Warkentin, Linus Walleij

Hi,

On Mon, May 16, 2011 at 17:56, Chris Ball <cjb@laptop.org> wrote:
> David, are you very confident in the bisection being correct?

I've built packages for Arch Linux with the suspected commit dropped.
So far there are three confirmations in the forum thread, which also
describes new symptoms:
https://bbs.archlinux.org/viewtopic.php?id=118751

David


Thread overview: 19+ messages
2011-05-16  7:06 kernel 2.6.38.6 MMC controller problem (fwd) Guennadi Liakhovetski
     [not found] ` <BANLkTinbJfDLp5nfkq-VTYLJvihT1thXvA@mail.gmail.com>
2011-05-16  8:42   ` David Strobach
2011-05-16  8:45   ` Guennadi Liakhovetski
2011-05-16 10:26     ` Jaehoon Chung
2011-05-16 10:39       ` Guennadi Liakhovetski
2011-05-16 10:48         ` Jaehoon Chung
2011-05-16 14:52     ` Chris Ball
2011-05-16 15:15       ` David Strobach
2011-05-16 15:56         ` Chris Ball
2011-05-16 16:07           ` Guennadi Liakhovetski
2011-05-16 16:37             ` Chris Ball
2011-05-16 16:42               ` Guennadi Liakhovetski
2011-05-16 17:56                 ` Linus Walleij
2011-05-17  9:20                   ` Dong, Chuanxiao
2011-05-18  5:29                     ` Linus Walleij
2011-05-18  6:01                       ` Dong, Chuanxiao
2011-05-16 16:07           ` David Strobach
2011-05-16 16:29           ` David Strobach
2011-05-18 13:29           ` David Strobach
