linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
@ 2019-05-01 17:54 Raul E Rangel
  2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Raul E Rangel @ 2019-05-01 17:54 UTC (permalink / raw)
  To: linux-mmc
  Cc: djkurtz, Raul E Rangel, hongjiefang, Jennifer Dahm, linux-kernel,
	Shawn Lin, Kyle Roeschley, Avri Altman, Ulf Hansson

There is a race condition between resetting the SDHCI controller and
disconnecting the card.

For example:
0) Card is connected and transferring data
1) mmc_sd_reset is called to reset the controller due to a data error
2) sdhci_set_ios calls sdhci_do_reset
3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
configured.
4) Wait for SOFT_RESET_ALL to clear
5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
   IRQs are not configured a CARD_REMOVED interrupt is never raised.
6) IRQs are enabled again
7) mmc layer never notices the device is disconnected. The SDHCI layer
   will keep returning -ENOMEDIUM. This results in a card that is always
   present and not functional.

Signed-off-by: Raul E Rangel <rrangel@chromium.org>
---
You can see an example of the following two patches here:
https://privatebin.net/?b0f5953716d34ca6#C699bCBQ99NdvspfDW7CMucT8CJG4DgL+yUNPyepDCo=
Line 8213: EILSEQ
Line 8235: SDHC is hard reset
Line 8240: Controller completes reset and card is no longer present
Line 8379: mmc_sd_reset notices card is missing and issues a card_event
           and schedules a detect change.
Line 8402: Don't init the card since it's already gone.
Line 8717: Marks card as removed
Line 8820: mmc_sd_remove removes the block device

I am running into a kernel panic. A task gets stuck for more than 120
seconds. I keep seeing blkdev_close in the stack trace, so maybe I'm not
calling something correctly?

Here is the panic: https://privatebin.net/?8ec48c1547d19975#dq/h189w5jmTlbMKKAwZjUr4bhm7Q2AgvGdRqc5BxAc=

I sometimes see the following:
[  547.943974] udevd[144]: seq 2350 '/devices/pci0000:00/0000:00:14.7/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p1' is taking a long time

I was getting the kernel panic on a 4.14 kernel: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f3dc032faf4d074f20ada437e2d081a28ac699da/drivers/mmc/host
So I'm guessing I'm missing an upstream fix.

Do the patches look correct or am I doing something that would cause a
kernel panic?

I have a DUT setup with a GPIO I can use to toggle the CD pin. I ran a
test where I connect and then randomly, between 0s - 1s disconnect the
card. This got over 20k iterations before the panic. Though when I do it
manually and stop for 2 minutes the panic happens.

Any help would be appreciated.

Thanks,
Raul


 drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 265e1aeeb9d8..9206c4297d66 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
 
 static int mmc_sd_hw_reset(struct mmc_host *host)
 {
+	int present;
 	mmc_power_cycle(host, host->card->ocr);
+
+	present = host->ops->get_cd(host);
+
+	/* The card status could have changed while resetting. */
+	if ((mmc_card_removed(host->card) && present) ||
+	    (!mmc_card_removed(host->card) && !present)) {
+		pr_info("%s: card status changed during reset\n",
+		       mmc_hostname(host));
+		host->ops->card_event(host);
+		mmc_detect_change(host, 0);
+	}
+
+	/* Don't perform unnecessary transactions if the card is missing. */
+	if (!present) {
+		pr_info("%s: card was removed during reset\n",
+			mmc_hostname(host));
+		return -ENOMEDIUM;
+	}
+
 	return mmc_sd_init_card(host, host->card->ocr, host->card);
 }
 
-- 
2.21.0.593.g511ec345e18-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-01 17:54 [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul E Rangel
@ 2019-05-01 17:54 ` Raul E Rangel
  2019-05-02  6:32   ` Adrian Hunter
  2019-05-28  7:41   ` Ulf Hansson
  2019-05-03 15:12 ` [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul Rangel
  2019-05-28  7:38 ` Ulf Hansson
  2 siblings, 2 replies; 20+ messages in thread
From: Raul E Rangel @ 2019-05-01 17:54 UTC (permalink / raw)
  To: linux-mmc
  Cc: djkurtz, Raul E Rangel, linux-kernel, Adrian Hunter, Ulf Hansson

AMD SDHC 0x7906 requires a hard reset to clear all internal state.
Otherwise it can get into a bad state where the DATA lines are always
read as zeros.

This change requires firmware that can transition the device into
D3Cold for it to work correctly. If the firmware does not support
transitioning to D3Cold then the power state transitions are a no-op.

Signed-off-by: Raul E Rangel <rrangel@chromium.org>
---

 drivers/mmc/host/sdhci-pci-core.c | 51 ++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 99b0fec2836b..532fbcbd373b 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -25,6 +25,7 @@
 #include <linux/mmc/mmc.h>
 #include <linux/scatterlist.h>
 #include <linux/io.h>
+#include <linux/iopoll.h>
 #include <linux/gpio.h>
 #include <linux/pm_runtime.h>
 #include <linux/mmc/slot-gpio.h>
@@ -1498,11 +1499,59 @@ static int amd_probe(struct sdhci_pci_chip *chip)
 	return 0;
 }
 
+static u32 sdhci_read_present_state(struct sdhci_host *host)
+{
+	return sdhci_readl(host, SDHCI_PRESENT_STATE);
+}
+
+void amd_sdhci_reset(struct sdhci_host *host, u8 mask)
+{
+	struct sdhci_pci_slot *slot = sdhci_priv(host);
+	struct pci_dev *pdev = slot->chip->pdev;
+	u32 present_state;
+
+	/*
+	 * SDHC 0x7906 requires a hard reset to clear all internal state.
+	 * Otherwise it can get into a bad state where the DATA lines are always
+	 * read as zeros.
+	 */
+	if (pdev->device == 0x7906 && (mask & SDHCI_RESET_ALL)) {
+		pci_clear_master(pdev);
+
+		pci_save_state(pdev);
+
+		pci_set_power_state(pdev, PCI_D3cold);
+		pr_debug("%s: power_state=%u\n", mmc_hostname(host->mmc),
+			pdev->current_state);
+		pci_set_power_state(pdev, PCI_D0);
+
+		pci_restore_state(pdev);
+
+		/*
+		 * SDHCI_RESET_ALL says the card detect logic should not be
+		 * reset, but since we need to reset the entire controller
+		 * we should wait until the card detect logic has stabilized.
+		 *
+		 * This normally takes about 40ms.
+		 */
+		readx_poll_timeout(
+			sdhci_read_present_state,
+			host,
+			present_state,
+			present_state & SDHCI_CD_STABLE,
+			10000,
+			100000
+		);
+	}
+
+	return sdhci_reset(host, mask);
+}
+
 static const struct sdhci_ops amd_sdhci_pci_ops = {
 	.set_clock			= sdhci_set_clock,
 	.enable_dma			= sdhci_pci_enable_dma,
 	.set_bus_width			= sdhci_set_bus_width,
-	.reset				= sdhci_reset,
+	.reset				= amd_sdhci_reset,
 	.set_uhs_signaling		= sdhci_set_uhs_signaling,
 };
 
-- 
2.21.0.593.g511ec345e18-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
@ 2019-05-02  6:32   ` Adrian Hunter
  2019-05-02 15:42     ` Raul Rangel
  2019-05-12 17:04     ` S-k, Shyam-sundar
  2019-05-28  7:41   ` Ulf Hansson
  1 sibling, 2 replies; 20+ messages in thread
From: Adrian Hunter @ 2019-05-02  6:32 UTC (permalink / raw)
  To: Raul E Rangel, linux-mmc, Agrawal, Nitesh-kumar
  Cc: djkurtz, linux-kernel, Ulf Hansson, Shyam Sundar S K, Sen,
	Pankaj, Shah, Nehal-bakulchandra

Cc: some AMD people

On 1/05/19 8:54 PM, Raul E Rangel wrote:
> AMD SDHC 0x7906 requires a hard reset to clear all internal state.
> Otherwise it can get into a bad state where the DATA lines are always
> read as zeros.
> 
> This change requires firmware that can transition the device into
> D3Cold for it to work correctly. If the firmware does not support
> transitioning to D3Cold then the power state transitions are a no-op.
> 
> Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> ---
> 
>  drivers/mmc/host/sdhci-pci-core.c | 51 ++++++++++++++++++++++++++++++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
> index 99b0fec2836b..532fbcbd373b 100644
> --- a/drivers/mmc/host/sdhci-pci-core.c
> +++ b/drivers/mmc/host/sdhci-pci-core.c
> @@ -25,6 +25,7 @@
>  #include <linux/mmc/mmc.h>
>  #include <linux/scatterlist.h>
>  #include <linux/io.h>
> +#include <linux/iopoll.h>
>  #include <linux/gpio.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/mmc/slot-gpio.h>
> @@ -1498,11 +1499,59 @@ static int amd_probe(struct sdhci_pci_chip *chip)
>  	return 0;
>  }
>  
> +static u32 sdhci_read_present_state(struct sdhci_host *host)
> +{
> +	return sdhci_readl(host, SDHCI_PRESENT_STATE);
> +}
> +
> +void amd_sdhci_reset(struct sdhci_host *host, u8 mask)
> +{
> +	struct sdhci_pci_slot *slot = sdhci_priv(host);
> +	struct pci_dev *pdev = slot->chip->pdev;
> +	u32 present_state;
> +
> +	/*
> +	 * SDHC 0x7906 requires a hard reset to clear all internal state.
> +	 * Otherwise it can get into a bad state where the DATA lines are always
> +	 * read as zeros.
> +	 */
> +	if (pdev->device == 0x7906 && (mask & SDHCI_RESET_ALL)) {
> +		pci_clear_master(pdev);
> +
> +		pci_save_state(pdev);
> +
> +		pci_set_power_state(pdev, PCI_D3cold);
> +		pr_debug("%s: power_state=%u\n", mmc_hostname(host->mmc),
> +			pdev->current_state);
> +		pci_set_power_state(pdev, PCI_D0);
> +
> +		pci_restore_state(pdev);
> +
> +		/*
> +		 * SDHCI_RESET_ALL says the card detect logic should not be
> +		 * reset, but since we need to reset the entire controller
> +		 * we should wait until the card detect logic has stabilized.
> +		 *
> +		 * This normally takes about 40ms.
> +		 */
> +		readx_poll_timeout(
> +			sdhci_read_present_state,
> +			host,
> +			present_state,
> +			present_state & SDHCI_CD_STABLE,
> +			10000,
> +			100000
> +		);
> +	}
> +
> +	return sdhci_reset(host, mask);
> +}
> +
>  static const struct sdhci_ops amd_sdhci_pci_ops = {
>  	.set_clock			= sdhci_set_clock,
>  	.enable_dma			= sdhci_pci_enable_dma,
>  	.set_bus_width			= sdhci_set_bus_width,
> -	.reset				= sdhci_reset,
> +	.reset				= amd_sdhci_reset,
>  	.set_uhs_signaling		= sdhci_set_uhs_signaling,
>  };
>  
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-02  6:32   ` Adrian Hunter
@ 2019-05-02 15:42     ` Raul Rangel
  2019-05-12 17:04     ` S-k, Shyam-sundar
  1 sibling, 0 replies; 20+ messages in thread
From: Raul Rangel @ 2019-05-02 15:42 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Agrawal, Nitesh-kumar, djkurtz, linux-kernel,
	Ulf Hansson, Shyam Sundar S K, Sen, Pankaj, Shah,
	Nehal-bakulchandra, Avri Altman, gene.chen, Chris.Wang

Ou Thu, May 02, 2019 at 09:32:16AM +0300, Adrian Hunter wrote:

Gene or Chris,
Can you sign off on the patch.

Thanks,
Raul
> Cc: some AMD people
> 
> On 1/05/19 8:54 PM, Raul E Rangel wrote:
> > AMD SDHC 0x7906 requires a hard reset to clear all internal state.
> > Otherwise it can get into a bad state where the DATA lines are always
> > read as zeros.
> > 
> > This change requires firmware that can transition the device into
> > D3Cold for it to work correctly. If the firmware does not support
> > transitioning to D3Cold then the power state transitions are a no-op.
> > 
> > Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> > ---
> > 
> >  drivers/mmc/host/sdhci-pci-core.c | 51 ++++++++++++++++++++++++++++++-
> >  1 file changed, 50 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
> > index 99b0fec2836b..532fbcbd373b 100644
> > --- a/drivers/mmc/host/sdhci-pci-core.c
> > +++ b/drivers/mmc/host/sdhci-pci-core.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/mmc/mmc.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/io.h>
> > +#include <linux/iopoll.h>
> >  #include <linux/gpio.h>
> >  #include <linux/pm_runtime.h>
> >  #include <linux/mmc/slot-gpio.h>
> > @@ -1498,11 +1499,59 @@ static int amd_probe(struct sdhci_pci_chip *chip)
> >  	return 0;
> >  }
> >  
> > +static u32 sdhci_read_present_state(struct sdhci_host *host)
> > +{
> > +	return sdhci_readl(host, SDHCI_PRESENT_STATE);
> > +}
> > +
> > +void amd_sdhci_reset(struct sdhci_host *host, u8 mask)
> > +{
> > +	struct sdhci_pci_slot *slot = sdhci_priv(host);
> > +	struct pci_dev *pdev = slot->chip->pdev;
> > +	u32 present_state;
> > +
> > +	/*
> > +	 * SDHC 0x7906 requires a hard reset to clear all internal state.
> > +	 * Otherwise it can get into a bad state where the DATA lines are always
> > +	 * read as zeros.
> > +	 */
> > +	if (pdev->device == 0x7906 && (mask & SDHCI_RESET_ALL)) {
> > +		pci_clear_master(pdev);
> > +
> > +		pci_save_state(pdev);
> > +
> > +		pci_set_power_state(pdev, PCI_D3cold);
> > +		pr_debug("%s: power_state=%u\n", mmc_hostname(host->mmc),
> > +			pdev->current_state);
> > +		pci_set_power_state(pdev, PCI_D0);
> > +
> > +		pci_restore_state(pdev);
> > +
> > +		/*
> > +		 * SDHCI_RESET_ALL says the card detect logic should not be
> > +		 * reset, but since we need to reset the entire controller
> > +		 * we should wait until the card detect logic has stabilized.
> > +		 *
> > +		 * This normally takes about 40ms.
> > +		 */
> > +		readx_poll_timeout(
> > +			sdhci_read_present_state,
> > +			host,
> > +			present_state,
> > +			present_state & SDHCI_CD_STABLE,
> > +			10000,
> > +			100000
> > +		);
> > +	}
> > +
> > +	return sdhci_reset(host, mask);
> > +}
> > +
> >  static const struct sdhci_ops amd_sdhci_pci_ops = {
> >  	.set_clock			= sdhci_set_clock,
> >  	.enable_dma			= sdhci_pci_enable_dma,
> >  	.set_bus_width			= sdhci_set_bus_width,
> > -	.reset				= sdhci_reset,
> > +	.reset				= amd_sdhci_reset,
> >  	.set_uhs_signaling		= sdhci_set_uhs_signaling,
> >  };
> >  
> > 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-05-01 17:54 [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul E Rangel
  2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
@ 2019-05-03 15:12 ` Raul Rangel
  2019-05-08 19:00   ` Raul Rangel
  2019-05-28  7:38 ` Ulf Hansson
  2 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-05-03 15:12 UTC (permalink / raw)
  To: linux-mmc
  Cc: djkurtz, hongjiefang, Jennifer Dahm, linux-kernel, Shawn Lin,
	Kyle Roeschley, Avri Altman, Ulf Hansson, Raul Rangel

On Wed, May 01, 2019 at 11:54:56AM -0600, Raul E Rangel wrote:
> I am running into a kernel panic. A task gets stuck for more than 120
> seconds. I keep seeing blkdev_close in the stack trace, so maybe I'm not
> calling something correctly?
> 
> Here is the panic: https://privatebin.net/?8ec48c1547d19975#dq/h189w5jmTlbMKKAwZjUr4bhm7Q2AgvGdRqc5BxAc=
> 
> I sometimes see the following:
> [  547.943974] udevd[144]: seq 2350 '/devices/pci0000:00/0000:00:14.7/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p1' is taking a long time
> 
> I was getting the kernel panic on a 4.14 kernel: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f3dc032faf4d074f20ada437e2d081a28ac699da/drivers/mmc/host
> So I'm guessing I'm missing an upstream fix.
> 
So I tried these patches on the 5.1 with the memory leak patch applied:
https://patchwork.kernel.org/patch/10927541/

Everything works as expected:
    mmc0: new high speed SDHC card at address 0001
    mmcblk0: mmc0:0001 00000 7.41 GiB
     mmcblk0: p1 p2
    mmc0: card status changed during reset
    mmc0: card was removed during reset
    mmc0: tried to HW reset card, got error -123
    print_req_error: I/O error, dev mmcblk0, sector 2072 flags 80700
    print_req_error: I/O error, dev mmcblk0, sector 2073 flags 80700
    mmc0: card 0001 removed

    mmc0: new high speed SDHC card at address 0001
    mmcblk0: mmc0:0001 00000 7.41 GiB
     mmcblk0: p1 p2
    mmc0: card status changed during reset
    mmc0: card was removed during reset
    mmc0: tried to HW reset card, got error -123
    print_req_error: I/O error, dev mmcblk0, sector 12584832 flags 80700
    print_req_error: I/O error, dev mmcblk0, sector 12584833 flags 80700
    mmc0: card 0001 removed
    Buffer I/O error on dev mmcblk0p2, logical block 369904, async page read

    mmc0: new high speed SDHC card at address 0001
    mmcblk0: mmc0:0001 00000 7.41 GiB
     mmcblk0: p1 p2
    mmc0: card 0001 removed

I also ran another test. Using the DUT with a GPIO to randomly toggle the
CD pin I was able to get 4k+ iterations without any problems. So I think
the patches are good.

    --- Iteration 4506 ----
    Disconnecting card
    [ 7444.370169] mmc0: card 0001 removed
    Connecting Card and sleeping for 0.31476s
    [ 7444.682177] mmc0: new high speed SDHC card at address 0001
    [ 7444.719038] mmcblk0: mmc0:0001 00000 7.41 GiB
    [ 7444.727375]  mmcblk0: p1 p2
    Card Disconnected
    [ 7444.831896] mmc0: Card removed during transfer!
    [ 7444.831914] mmc0: Resetting controller.
    --- Iteration 4507 ----
    Connecting Card and sleeping for 0.30259s
    [ 7445.033649] mmc0: card 0001 removed
    Card Disconnected
    [ 7445.307008] mmc0: error -123 whilst initialising SD card
    --- Iteration 4508 ----
    Connecting Card and sleeping for 0.24827s
    Card Disconnected
    [ 7445.716033] mmc0: Card removed during transfer!
    [ 7445.716052] mmc0: Resetting controller.
    [ 7445.716489] mmc0: error -123 whilst initialising SD card
    --- Iteration 4509 ----
    Connecting Card and sleeping for 0.14132s
    Card Disconnected
    --- Iteration 4510 ----
    Connecting Card and sleeping for 0.26001s
    Card Disconnected
    [ 7446.436079] mmc0: error -123 whilst initialising SD card
    Connecting card to verify if it died
    [ 7446.906804] mmc0: new high speed SDHC card at address 0001
    [ 7446.933631] mmcblk0: mmc0:0001 00000 7.41 GiB
    [ 7446.949224]  mmcblk0: p1 p2
    SDHC still functional

Should I send the patches out again without the RFC tag?

I'll keep trying to track down the hung task I was seeing on 4.14. But I
don't think that's related to these patches. I might just end up
backporting the blk-mq patches to our 4.14 branch since I suspect that
fixes it.

Thanks,
Raul

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-05-03 15:12 ` [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul Rangel
@ 2019-05-08 19:00   ` Raul Rangel
  0 siblings, 0 replies; 20+ messages in thread
From: Raul Rangel @ 2019-05-08 19:00 UTC (permalink / raw)
  To: linux-mmc
  Cc: djkurtz, hongjiefang, Jennifer Dahm, linux-kernel, Shawn Lin,
	Kyle Roeschley, Avri Altman, Ulf Hansson, rrangel

On Fri, May 03, 2019 at 09:12:24AM -0600, Raul Rangel wrote:
> On Wed, May 01, 2019 at 11:54:56AM -0600, Raul E Rangel wrote:
> > I am running into a kernel panic. A task gets stuck for more than 120
> > seconds. I keep seeing blkdev_close in the stack trace, so maybe I'm not
> > calling something correctly?
> > 
> > Here is the panic: https://privatebin.net/?8ec48c1547d19975#dq/h189w5jmTlbMKKAwZjUr4bhm7Q2AgvGdRqc5BxAc=
> > 
> > I sometimes see the following:
> > [  547.943974] udevd[144]: seq 2350 '/devices/pci0000:00/0000:00:14.7/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p1' is taking a long time
> > 
> > I was getting the kernel panic on a 4.14 kernel: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f3dc032faf4d074f20ada437e2d081a28ac699da/drivers/mmc/host
> > So I'm guessing I'm missing an upstream fix.
> > 
> 
> I'll keep trying to track down the hung task I was seeing on 4.14. But I
> don't think that's related to these patches. I might just end up
> backporting the blk-mq patches to our 4.14 branch since I suspect that
> fixes it.

So I tracked down the hung task in 4.14, it's a resource leak.
mmc_cleanup_queue stops the worker thread. If there were any requests in
the queue they would be holding onto a reference of mmc_blk_data. When
mmc_blk_remove_req calls mmc_blk_put, there are still references to md, so
it never calls blk_cleanup_queue, and the requests stay in the queue
forever.

Fortunately Adrian already has a fix for this: https://lore.kernel.org/patchwork/patch/856512/
I think we should cherry-pick 41e3efd07d5a02c80f503e29d755aa1bbb4245de
into v4.14. I've tried it locally and it fixes the kernel panic I was
seeing.

I've also sent out two more patches for v4.14 that need to be applied
with Adrian's patch:
* https://patchwork.kernel.org/patch/10936439/
* https://patchwork.kernel.org/patch/10936441/

As for this patch, are there any comments? I have a test running that is
doing random connect/disconnects, and it's over 6k iterations now.

Thanks,
Raul

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-02  6:32   ` Adrian Hunter
  2019-05-02 15:42     ` Raul Rangel
@ 2019-05-12 17:04     ` S-k, Shyam-sundar
  2019-05-13  6:44       ` Adrian Hunter
  1 sibling, 1 reply; 20+ messages in thread
From: S-k, Shyam-sundar @ 2019-05-12 17:04 UTC (permalink / raw)
  To: Adrian Hunter, Raul E Rangel, linux-mmc, Agrawal, Nitesh-kumar
  Cc: djkurtz, linux-kernel, Ulf Hansson, Sen, Pankaj, Shah,
	Nehal-bakulchandra

On 5/2/2019 12:02 PM, Adrian Hunter wrote:
> Cc: some AMD people
>
> On 1/05/19 8:54 PM, Raul E Rangel wrote:
>> AMD SDHC 0x7906 requires a hard reset to clear all internal state.
>> Otherwise it can get into a bad state where the DATA lines are always
>> read as zeros.
>>
>> This change requires firmware that can transition the device into
>> D3Cold for it to work correctly. If the firmware does not support
>> transitioning to D3Cold then the power state transitions are a no-op.
>>
>> Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-12 17:04     ` S-k, Shyam-sundar
@ 2019-05-13  6:44       ` Adrian Hunter
  0 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2019-05-13  6:44 UTC (permalink / raw)
  To: S-k, Shyam-sundar, Raul E Rangel, linux-mmc, Agrawal, Nitesh-kumar
  Cc: djkurtz, linux-kernel, Ulf Hansson, Sen, Pankaj, Shah,
	Nehal-bakulchandra

On 12/05/19 8:04 PM, S-k, Shyam-sundar wrote:
> On 5/2/2019 12:02 PM, Adrian Hunter wrote:
>> Cc: some AMD people
>>
>> On 1/05/19 8:54 PM, Raul E Rangel wrote:
>>> AMD SDHC 0x7906 requires a hard reset to clear all internal state.
>>> Otherwise it can get into a bad state where the DATA lines are always
>>> read as zeros.
>>>
>>> This change requires firmware that can transition the device into
>>> D3Cold for it to work correctly. If the firmware does not support
>>> transitioning to D3Cold then the power state transitions are a no-op.
>>>
>>> Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> 

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-05-01 17:54 [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul E Rangel
  2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
  2019-05-03 15:12 ` [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul Rangel
@ 2019-05-28  7:38 ` Ulf Hansson
  2019-06-07 16:05   ` Raul Rangel
  2 siblings, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2019-05-28  7:38 UTC (permalink / raw)
  To: Raul E Rangel
  Cc: linux-mmc, Daniel Kurtz, hongjiefang, Jennifer Dahm,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
>
> There is a race condition between resetting the SDHCI controller and
> disconnecting the card.
>
> For example:
> 0) Card is connected and transferring data
> 1) mmc_sd_reset is called to reset the controller due to a data error

I assume you refer to mmc_sd_hw_reset()? In that case, I think you
have interpreted the purpose of mmc_sd_hw_reset() slightly wrong. It's
responsibility is to reset the SD-card and not the host/controller.

Whether there some additional "reset" of the controller needed, that
is assumed by the core, to be managed via the ->set_ios() callback for
the host.

> 2) sdhci_set_ios calls sdhci_do_reset
> 3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
> configured.
> 4) Wait for SOFT_RESET_ALL to clear
> 5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
>    IRQs are not configured a CARD_REMOVED interrupt is never raised.
> 6) IRQs are enabled again
> 7) mmc layer never notices the device is disconnected. The SDHCI layer
>    will keep returning -ENOMEDIUM. This results in a card that is always
>    present and not functional.

This sounds like host specific problems, which most likely should be
fixed in host driver, solely. Unless I am missing something, of
course.

>
> Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> ---
> You can see an example of the following two patches here:
> https://privatebin.net/?b0f5953716d34ca6#C699bCBQ99NdvspfDW7CMucT8CJG4DgL+yUNPyepDCo=
> Line 8213: EILSEQ
> Line 8235: SDHC is hard reset
> Line 8240: Controller completes reset and card is no longer present
> Line 8379: mmc_sd_reset notices card is missing and issues a card_event
>            and schedules a detect change.
> Line 8402: Don't init the card since it's already gone.
> Line 8717: Marks card as removed
> Line 8820: mmc_sd_remove removes the block device
>
> I am running into a kernel panic. A task gets stuck for more than 120
> seconds. I keep seeing blkdev_close in the stack trace, so maybe I'm not
> calling something correctly?
>
> Here is the panic: https://privatebin.net/?8ec48c1547d19975#dq/h189w5jmTlbMKKAwZjUr4bhm7Q2AgvGdRqc5BxAc=
>
> I sometimes see the following:
> [  547.943974] udevd[144]: seq 2350 '/devices/pci0000:00/0000:00:14.7/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p1' is taking a long time
>
> I was getting the kernel panic on a 4.14 kernel: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f3dc032faf4d074f20ada437e2d081a28ac699da/drivers/mmc/host
> So I'm guessing I'm missing an upstream fix.
>
> Do the patches look correct or am I doing something that would cause a
> kernel panic?
>
> I have a DUT setup with a GPIO I can use to toggle the CD pin. I ran a
> test where I connect and then randomly, between 0s - 1s disconnect the
> card. This got over 20k iterations before the panic. Though when I do it
> manually and stop for 2 minutes the panic happens.
>
> Any help would be appreciated.
>
> Thanks,
> Raul
>
>
>  drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
> index 265e1aeeb9d8..9206c4297d66 100644
> --- a/drivers/mmc/core/sd.c
> +++ b/drivers/mmc/core/sd.c
> @@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
>
>  static int mmc_sd_hw_reset(struct mmc_host *host)
>  {
> +       int present;
>         mmc_power_cycle(host, host->card->ocr);
> +
> +       present = host->ops->get_cd(host);
> +
> +       /* The card status could have changed while resetting. */
> +       if ((mmc_card_removed(host->card) && present) ||
> +           (!mmc_card_removed(host->card) && !present)) {
> +               pr_info("%s: card status changed during reset\n",
> +                      mmc_hostname(host));
> +               host->ops->card_event(host);
> +               mmc_detect_change(host, 0);
> +       }
> +
> +       /* Don't perform unnecessary transactions if the card is missing. */
> +       if (!present) {
> +               pr_info("%s: card was removed during reset\n",
> +                       mmc_hostname(host));
> +               return -ENOMEDIUM;
> +       }
> +

When doing a  mmc_hw_reset() (which ends up calling mmc_sd_hw_reset()
in case of SD cards), we are making a final attempt to make the card
functional again, via a power cycle and a re-init of it.

In this path, we don't care whether the card is removed, as that
should have been detected already when the block layer calls
mmc_detect_card_removed().

>         return mmc_sd_init_card(host, host->card->ocr, host->card);
>  }
>
> --
> 2.21.0.593.g511ec345e18-goog
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
  2019-05-02  6:32   ` Adrian Hunter
@ 2019-05-28  7:41   ` Ulf Hansson
  2019-06-03 20:34     ` Raul Rangel
  1 sibling, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2019-05-28  7:41 UTC (permalink / raw)
  To: Raul E Rangel
  Cc: linux-mmc, Daniel Kurtz, Linux Kernel Mailing List, Adrian Hunter

On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
>
> AMD SDHC 0x7906 requires a hard reset to clear all internal state.
> Otherwise it can get into a bad state where the DATA lines are always
> read as zeros.
>
> This change requires firmware that can transition the device into
> D3Cold for it to work correctly. If the firmware does not support
> transitioning to D3Cold then the power state transitions are a no-op.
>
> Signed-off-by: Raul E Rangel <rrangel@chromium.org>

Does this also solve the problem you tried to fix in patch1, without patch1?

Should this have a stable tag?

Kind regards
Uffe

> ---
>
>  drivers/mmc/host/sdhci-pci-core.c | 51 ++++++++++++++++++++++++++++++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
> index 99b0fec2836b..532fbcbd373b 100644
> --- a/drivers/mmc/host/sdhci-pci-core.c
> +++ b/drivers/mmc/host/sdhci-pci-core.c
> @@ -25,6 +25,7 @@
>  #include <linux/mmc/mmc.h>
>  #include <linux/scatterlist.h>
>  #include <linux/io.h>
> +#include <linux/iopoll.h>
>  #include <linux/gpio.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/mmc/slot-gpio.h>
> @@ -1498,11 +1499,59 @@ static int amd_probe(struct sdhci_pci_chip *chip)
>         return 0;
>  }
>
> +static u32 sdhci_read_present_state(struct sdhci_host *host)
> +{
> +       return sdhci_readl(host, SDHCI_PRESENT_STATE);
> +}
> +
> +void amd_sdhci_reset(struct sdhci_host *host, u8 mask)
> +{
> +       struct sdhci_pci_slot *slot = sdhci_priv(host);
> +       struct pci_dev *pdev = slot->chip->pdev;
> +       u32 present_state;
> +
> +       /*
> +        * SDHC 0x7906 requires a hard reset to clear all internal state.
> +        * Otherwise it can get into a bad state where the DATA lines are always
> +        * read as zeros.
> +        */
> +       if (pdev->device == 0x7906 && (mask & SDHCI_RESET_ALL)) {
> +               pci_clear_master(pdev);
> +
> +               pci_save_state(pdev);
> +
> +               pci_set_power_state(pdev, PCI_D3cold);
> +               pr_debug("%s: power_state=%u\n", mmc_hostname(host->mmc),
> +                       pdev->current_state);
> +               pci_set_power_state(pdev, PCI_D0);
> +
> +               pci_restore_state(pdev);
> +
> +               /*
> +                * SDHCI_RESET_ALL says the card detect logic should not be
> +                * reset, but since we need to reset the entire controller
> +                * we should wait until the card detect logic has stabilized.
> +                *
> +                * This normally takes about 40ms.
> +                */
> +               readx_poll_timeout(
> +                       sdhci_read_present_state,
> +                       host,
> +                       present_state,
> +                       present_state & SDHCI_CD_STABLE,
> +                       10000,
> +                       100000
> +               );
> +       }
> +
> +       return sdhci_reset(host, mask);
> +}
> +
>  static const struct sdhci_ops amd_sdhci_pci_ops = {
>         .set_clock                      = sdhci_set_clock,
>         .enable_dma                     = sdhci_pci_enable_dma,
>         .set_bus_width                  = sdhci_set_bus_width,
> -       .reset                          = sdhci_reset,
> +       .reset                          = amd_sdhci_reset,
>         .set_uhs_signaling              = sdhci_set_uhs_signaling,
>  };
>
> --
> 2.21.0.593.g511ec345e18-goog
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906
  2019-05-28  7:41   ` Ulf Hansson
@ 2019-06-03 20:34     ` Raul Rangel
  0 siblings, 0 replies; 20+ messages in thread
From: Raul Rangel @ 2019-06-03 20:34 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Daniel Kurtz, Linux Kernel Mailing List, Adrian Hunter

On Tue, May 28, 2019 at 09:41:07AM +0200, Ulf Hansson wrote:
> On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
> >
> > AMD SDHC 0x7906 requires a hard reset to clear all internal state.
> > Otherwise it can get into a bad state where the DATA lines are always
> > read as zeros.
> >
> > This change requires firmware that can transition the device into
> > D3Cold for it to work correctly. If the firmware does not support
> > transitioning to D3Cold then the power state transitions are a no-op.
> >
> > Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> 
> Does this also solve the problem you tried to fix in patch1, without patch1?
Patch one is required. Otherwise there is a possible race condition
where the card remove is not detected. I'll go through the comments you
added on the first patch.

> 
> Should this have a stable tag?
I would not put a stable tag on this patch series yet because there is a
possible resource leak in 4.14. See https://lkml.org/lkml/2019/5/13/763
If we can get those patched merged then I think it's fine to backport
the patch set. 4.19 doesn't suffer from the resource leak. I haven't
audited anything before 4.14 though.

Thanks,
Raul

> 
> Kind regards
> Uffe
> 
> > ---
> >
> >  drivers/mmc/host/sdhci-pci-core.c | 51 ++++++++++++++++++++++++++++++-
> >  1 file changed, 50 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
> > index 99b0fec2836b..532fbcbd373b 100644
> > --- a/drivers/mmc/host/sdhci-pci-core.c
> > +++ b/drivers/mmc/host/sdhci-pci-core.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/mmc/mmc.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/io.h>
> > +#include <linux/iopoll.h>
> >  #include <linux/gpio.h>
> >  #include <linux/pm_runtime.h>
> >  #include <linux/mmc/slot-gpio.h>
> > @@ -1498,11 +1499,59 @@ static int amd_probe(struct sdhci_pci_chip *chip)
> >         return 0;
> >  }
> >
> > +static u32 sdhci_read_present_state(struct sdhci_host *host)
> > +{
> > +       return sdhci_readl(host, SDHCI_PRESENT_STATE);
> > +}
> > +
> > +void amd_sdhci_reset(struct sdhci_host *host, u8 mask)
> > +{
> > +       struct sdhci_pci_slot *slot = sdhci_priv(host);
> > +       struct pci_dev *pdev = slot->chip->pdev;
> > +       u32 present_state;
> > +
> > +       /*
> > +        * SDHC 0x7906 requires a hard reset to clear all internal state.
> > +        * Otherwise it can get into a bad state where the DATA lines are always
> > +        * read as zeros.
> > +        */
> > +       if (pdev->device == 0x7906 && (mask & SDHCI_RESET_ALL)) {
> > +               pci_clear_master(pdev);
> > +
> > +               pci_save_state(pdev);
> > +
> > +               pci_set_power_state(pdev, PCI_D3cold);
> > +               pr_debug("%s: power_state=%u\n", mmc_hostname(host->mmc),
> > +                       pdev->current_state);
> > +               pci_set_power_state(pdev, PCI_D0);
> > +
> > +               pci_restore_state(pdev);
> > +
> > +               /*
> > +                * SDHCI_RESET_ALL says the card detect logic should not be
> > +                * reset, but since we need to reset the entire controller
> > +                * we should wait until the card detect logic has stabilized.
> > +                *
> > +                * This normally takes about 40ms.
> > +                */
> > +               readx_poll_timeout(
> > +                       sdhci_read_present_state,
> > +                       host,
> > +                       present_state,
> > +                       present_state & SDHCI_CD_STABLE,
> > +                       10000,
> > +                       100000
> > +               );
> > +       }
> > +
> > +       return sdhci_reset(host, mask);
> > +}
> > +
> >  static const struct sdhci_ops amd_sdhci_pci_ops = {
> >         .set_clock                      = sdhci_set_clock,
> >         .enable_dma                     = sdhci_pci_enable_dma,
> >         .set_bus_width                  = sdhci_set_bus_width,
> > -       .reset                          = sdhci_reset,
> > +       .reset                          = amd_sdhci_reset,
> >         .set_uhs_signaling              = sdhci_set_uhs_signaling,
> >  };
> >
> > --
> > 2.21.0.593.g511ec345e18-goog
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-05-28  7:38 ` Ulf Hansson
@ 2019-06-07 16:05   ` Raul Rangel
  2019-06-10 16:17     ` Ulf Hansson
  0 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-06-07 16:05 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Daniel Kurtz, hongjiefang, Jennifer Dahm,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On Tue, May 28, 2019 at 09:38:20AM +0200, Ulf Hansson wrote:
> On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:

First off, thanks for the review.

> >
> > There is a race condition between resetting the SDHCI controller and
> > disconnecting the card.
> >
> > For example:
> > 0) Card is connected and transferring data
> > 1) mmc_sd_reset is called to reset the controller due to a data error
> 
> I assume you refer to mmc_sd_hw_reset()? In that case, I think you
> have interpreted the purpose of mmc_sd_hw_reset() slightly wrong. It's
> responsibility is to reset the SD-card and not the host/controller.
You are correct. I was looking at a 4.14 kernel where it's called
mmc_sd_reset. 4.19 and above call it mmc_sd_hw_reset.

All I was trying to convey here was that a block error will eventually
call sdhci_set_ios with SOFT_RESET_ALL via:
mmc_blk_reset
mmc_hw_reset
mmc_sd_hw_reset
mmc_power_cycle
mmc_power_off
mmc_set_initial_state
sdhci_set_ios
sdhci_reinit
sdhci_init
sdhci_do_reset(host, SDHCI_RESET_ALL);

> 
> Whether there some additional "reset" of the controller needed, that
> is assumed by the core, to be managed via the ->set_ios() callback for
> the host.
> 
> > 2) sdhci_set_ios calls sdhci_do_reset
> > 3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
> > configured.
> > 4) Wait for SOFT_RESET_ALL to clear
> > 5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
> >    IRQs are not configured a CARD_REMOVED interrupt is never raised.
> > 6) IRQs are enabled again
> > 7) mmc layer never notices the device is disconnected. The SDHCI layer
> >    will keep returning -ENOMEDIUM. This results in a card that is always
> >    present and not functional.
> 
> This sounds like host specific problems, which most likely should be
> fixed in host driver, solely. Unless I am missing something, of
> course.

So in sdhci_do_reset we call the reset callback which is typically
sdhci_reset. sdhci_reset can wait for up to 100ms waiting for the
controller to reset. If SDHCI_RESET_ALL was passed as the flag, the
controller will clear the IRQ mask. If during that 100ms the card is
removed there is no notification to the MMC system that the card was
removed. So it seems like the card is always present.

> >  drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
> > index 265e1aeeb9d8..9206c4297d66 100644
> > --- a/drivers/mmc/core/sd.c
> > +++ b/drivers/mmc/core/sd.c
> > @@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
> >
> >  static int mmc_sd_hw_reset(struct mmc_host *host)
> >  {
> > +       int present;
> >         mmc_power_cycle(host, host->card->ocr);
> > +
> > +       present = host->ops->get_cd(host);
> > +
> > +       /* The card status could have changed while resetting. */
> > +       if ((mmc_card_removed(host->card) && present) ||
> > +           (!mmc_card_removed(host->card) && !present)) {
> > +               pr_info("%s: card status changed during reset\n",
> > +                      mmc_hostname(host));
> > +               host->ops->card_event(host);
> > +               mmc_detect_change(host, 0);
> > +       }
> > +
> > +       /* Don't perform unnecessary transactions if the card is missing. */
> > +       if (!present) {
> > +               pr_info("%s: card was removed during reset\n",
> > +                       mmc_hostname(host));
> > +               return -ENOMEDIUM;
> > +       }
> > +
> 
> When doing a  mmc_hw_reset() (which ends up calling mmc_sd_hw_reset()
> in case of SD cards), we are making a final attempt to make the card
> functional again, via a power cycle and a re-init of it.
> 
> In this path, we don't care whether the card is removed, as that
> should have been detected already when the block layer calls
> mmc_detect_card_removed().

mmc_detect_card_removed has the guard `host->detect_change` to
prevent it from checking the card status again. So maybe the fix to the
missing interrupt/race condition is to set `host->detect_change = 1`
from sdhci_do_reset after calling the reset handler. This way there will
always be a single poll after a full reset so the correct card status can
be detected?

> 
> >         return mmc_sd_init_card(host, host->card->ocr, host->card);
> >  }
> >
> > --
> > 2.21.0.593.g511ec345e18-goog
> >
> 
> Kind regards
> Uffe
>
Thanks again for the review!


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-06-07 16:05   ` Raul Rangel
@ 2019-06-10 16:17     ` Ulf Hansson
  2019-06-10 16:32       ` Raul Rangel
  0 siblings, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2019-06-10 16:17 UTC (permalink / raw)
  To: Raul Rangel
  Cc: linux-mmc, Daniel Kurtz, hongjiefang, Linux Kernel Mailing List,
	Shawn Lin, Kyle Roeschley, Avri Altman, Adrian Hunter

+ Adrian

On Fri, 7 Jun 2019 at 18:05, Raul Rangel <rrangel@chromium.org> wrote:
>
> On Tue, May 28, 2019 at 09:38:20AM +0200, Ulf Hansson wrote:
> > On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
>
> First off, thanks for the review.
>
> > >
> > > There is a race condition between resetting the SDHCI controller and
> > > disconnecting the card.
> > >
> > > For example:
> > > 0) Card is connected and transferring data
> > > 1) mmc_sd_reset is called to reset the controller due to a data error
> >
> > I assume you refer to mmc_sd_hw_reset()? In that case, I think you
> > have interpreted the purpose of mmc_sd_hw_reset() slightly wrong. It's
> > responsibility is to reset the SD-card and not the host/controller.
> You are correct. I was looking at a 4.14 kernel where it's called
> mmc_sd_reset. 4.19 and above call it mmc_sd_hw_reset.
>
> All I was trying to convey here was that a block error will eventually
> call sdhci_set_ios with SOFT_RESET_ALL via:
> mmc_blk_reset
> mmc_hw_reset
> mmc_sd_hw_reset
> mmc_power_cycle
> mmc_power_off
> mmc_set_initial_state
> sdhci_set_ios
> sdhci_reinit
> sdhci_init
> sdhci_do_reset(host, SDHCI_RESET_ALL);
>
> >
> > Whether there some additional "reset" of the controller needed, that
> > is assumed by the core, to be managed via the ->set_ios() callback for
> > the host.
> >
> > > 2) sdhci_set_ios calls sdhci_do_reset
> > > 3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
> > > configured.
> > > 4) Wait for SOFT_RESET_ALL to clear
> > > 5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
> > >    IRQs are not configured a CARD_REMOVED interrupt is never raised.
> > > 6) IRQs are enabled again
> > > 7) mmc layer never notices the device is disconnected. The SDHCI layer
> > >    will keep returning -ENOMEDIUM. This results in a card that is always
> > >    present and not functional.
> >
> > This sounds like host specific problems, which most likely should be
> > fixed in host driver, solely. Unless I am missing something, of
> > course.
>
> So in sdhci_do_reset we call the reset callback which is typically
> sdhci_reset. sdhci_reset can wait for up to 100ms waiting for the
> controller to reset. If SDHCI_RESET_ALL was passed as the flag, the
> controller will clear the IRQ mask. If during that 100ms the card is
> removed there is no notification to the MMC system that the card was
> removed. So it seems like the card is always present.

So you are saying that the problem occurs when the card is removed
during this 100ms period?

I assume there a way for sdhci to re-validate whether the card has
been removed during this period, right?

>
> > >  drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
> > >  1 file changed, 20 insertions(+)
> > >
> > > diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
> > > index 265e1aeeb9d8..9206c4297d66 100644
> > > --- a/drivers/mmc/core/sd.c
> > > +++ b/drivers/mmc/core/sd.c
> > > @@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
> > >
> > >  static int mmc_sd_hw_reset(struct mmc_host *host)
> > >  {
> > > +       int present;
> > >         mmc_power_cycle(host, host->card->ocr);
> > > +
> > > +       present = host->ops->get_cd(host);
> > > +
> > > +       /* The card status could have changed while resetting. */
> > > +       if ((mmc_card_removed(host->card) && present) ||
> > > +           (!mmc_card_removed(host->card) && !present)) {
> > > +               pr_info("%s: card status changed during reset\n",
> > > +                      mmc_hostname(host));
> > > +               host->ops->card_event(host);
> > > +               mmc_detect_change(host, 0);
> > > +       }
> > > +
> > > +       /* Don't perform unnecessary transactions if the card is missing. */
> > > +       if (!present) {
> > > +               pr_info("%s: card was removed during reset\n",
> > > +                       mmc_hostname(host));
> > > +               return -ENOMEDIUM;
> > > +       }
> > > +
> >
> > When doing a  mmc_hw_reset() (which ends up calling mmc_sd_hw_reset()
> > in case of SD cards), we are making a final attempt to make the card
> > functional again, via a power cycle and a re-init of it.
> >
> > In this path, we don't care whether the card is removed, as that
> > should have been detected already when the block layer calls
> > mmc_detect_card_removed().
>
> mmc_detect_card_removed has the guard `host->detect_change` to
> prevent it from checking the card status again. So maybe the fix to the
> missing interrupt/race condition is to set `host->detect_change = 1`
> from sdhci_do_reset after calling the reset handler. This way there will
> always be a single poll after a full reset so the correct card status can
> be detected?

It sounds like you should call mmc_detect_change(), after the reset to
let the core check for cards that may have been removed/inserted.

Would that work?

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-06-10 16:17     ` Ulf Hansson
@ 2019-06-10 16:32       ` Raul Rangel
  2019-06-11 10:30         ` Adrian Hunter
  0 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-06-10 16:32 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Daniel Kurtz, hongjiefang, Linux Kernel Mailing List,
	Shawn Lin, Kyle Roeschley, Avri Altman, Adrian Hunter

On Mon, Jun 10, 2019 at 06:17:31PM +0200, Ulf Hansson wrote:
> + Adrian
> 
> On Fri, 7 Jun 2019 at 18:05, Raul Rangel <rrangel@chromium.org> wrote:
> >
> > On Tue, May 28, 2019 at 09:38:20AM +0200, Ulf Hansson wrote:
> > > On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
> >
> > First off, thanks for the review.
> >
> > > >
> > > > There is a race condition between resetting the SDHCI controller and
> > > > disconnecting the card.
> > > >
> > > > For example:
> > > > 0) Card is connected and transferring data
> > > > 1) mmc_sd_reset is called to reset the controller due to a data error
> > >
> > > I assume you refer to mmc_sd_hw_reset()? In that case, I think you
> > > have interpreted the purpose of mmc_sd_hw_reset() slightly wrong. It's
> > > responsibility is to reset the SD-card and not the host/controller.
> > You are correct. I was looking at a 4.14 kernel where it's called
> > mmc_sd_reset. 4.19 and above call it mmc_sd_hw_reset.
> >
> > All I was trying to convey here was that a block error will eventually
> > call sdhci_set_ios with SOFT_RESET_ALL via:
> > mmc_blk_reset
> > mmc_hw_reset
> > mmc_sd_hw_reset
> > mmc_power_cycle
> > mmc_power_off
> > mmc_set_initial_state
> > sdhci_set_ios
> > sdhci_reinit
> > sdhci_init
> > sdhci_do_reset(host, SDHCI_RESET_ALL);
> >
> > >
> > > Whether there some additional "reset" of the controller needed, that
> > > is assumed by the core, to be managed via the ->set_ios() callback for
> > > the host.
> > >
> > > > 2) sdhci_set_ios calls sdhci_do_reset
> > > > 3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
> > > > configured.
> > > > 4) Wait for SOFT_RESET_ALL to clear
> > > > 5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
> > > >    IRQs are not configured a CARD_REMOVED interrupt is never raised.
> > > > 6) IRQs are enabled again
> > > > 7) mmc layer never notices the device is disconnected. The SDHCI layer
> > > >    will keep returning -ENOMEDIUM. This results in a card that is always
> > > >    present and not functional.
> > >
> > > This sounds like host specific problems, which most likely should be
> > > fixed in host driver, solely. Unless I am missing something, of
> > > course.
> >
> > So in sdhci_do_reset we call the reset callback which is typically
> > sdhci_reset. sdhci_reset can wait for up to 100ms waiting for the
> > controller to reset. If SDHCI_RESET_ALL was passed as the flag, the
> > controller will clear the IRQ mask. If during that 100ms the card is
> > removed there is no notification to the MMC system that the card was
> > removed. So it seems like the card is always present.
> 
> So you are saying that the problem occurs when the card is removed
> during this 100ms period?
Exactly. Thought I think most controllers reset fast enough to where
that window is very small. But for this AMD controller we need to do a
full reset which takes a while, so we can see the problem.

> 
> I assume there a way for sdhci to re-validate whether the card has
> been removed during this period, right?
I'll let Adrian chime in here.

> >
> > > >  drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
> > > >  1 file changed, 20 insertions(+)
> > > >
> > > > diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
> > > > index 265e1aeeb9d8..9206c4297d66 100644
> > > > --- a/drivers/mmc/core/sd.c
> > > > +++ b/drivers/mmc/core/sd.c
> > > > @@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
> > > >
> > > >  static int mmc_sd_hw_reset(struct mmc_host *host)
> > > >  {
> > > > +       int present;
> > > >         mmc_power_cycle(host, host->card->ocr);
> > > > +
> > > > +       present = host->ops->get_cd(host);
> > > > +
> > > > +       /* The card status could have changed while resetting. */
> > > > +       if ((mmc_card_removed(host->card) && present) ||
> > > > +           (!mmc_card_removed(host->card) && !present)) {
> > > > +               pr_info("%s: card status changed during reset\n",
> > > > +                      mmc_hostname(host));
> > > > +               host->ops->card_event(host);
> > > > +               mmc_detect_change(host, 0);
> > > > +       }
> > > > +
> > > > +       /* Don't perform unnecessary transactions if the card is missing. */
> > > > +       if (!present) {
> > > > +               pr_info("%s: card was removed during reset\n",
> > > > +                       mmc_hostname(host));
> > > > +               return -ENOMEDIUM;
> > > > +       }
> > > > +
> > >
> > > When doing a  mmc_hw_reset() (which ends up calling mmc_sd_hw_reset()
> > > in case of SD cards), we are making a final attempt to make the card
> > > functional again, via a power cycle and a re-init of it.
> > >
> > > In this path, we don't care whether the card is removed, as that
> > > should have been detected already when the block layer calls
> > > mmc_detect_card_removed().
> >
> > mmc_detect_card_removed has the guard `host->detect_change` to
> > prevent it from checking the card status again. So maybe the fix to the
> > missing interrupt/race condition is to set `host->detect_change = 1`
> > from sdhci_do_reset after calling the reset handler. This way there will
> > always be a single poll after a full reset so the correct card status can
> > be detected?
> 
> It sounds like you should call mmc_detect_change(), after the reset to
> let the core check for cards that may have been removed/inserted.
> 
> Would that work?
Yeah, I could add `mmc_detect_change()` to the AMD SDHCI patch. I just
thought it would be better to fix the potential race condition at a
higher level. Though I'm happy to move it to the AMD patch. Let me know.

Thanks
> 
> [...]
> 
> Kind regards
> Uffe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-06-10 16:32       ` Raul Rangel
@ 2019-06-11 10:30         ` Adrian Hunter
  2019-06-19 14:56           ` Raul Rangel
  0 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2019-06-11 10:30 UTC (permalink / raw)
  To: Raul Rangel, Ulf Hansson
  Cc: linux-mmc, Daniel Kurtz, hongjiefang, Linux Kernel Mailing List,
	Shawn Lin, Kyle Roeschley, Avri Altman

On 10/06/19 7:32 PM, Raul Rangel wrote:
> On Mon, Jun 10, 2019 at 06:17:31PM +0200, Ulf Hansson wrote:
>> + Adrian
>>
>> On Fri, 7 Jun 2019 at 18:05, Raul Rangel <rrangel@chromium.org> wrote:
>>>
>>> On Tue, May 28, 2019 at 09:38:20AM +0200, Ulf Hansson wrote:
>>>> On Wed, 1 May 2019 at 19:55, Raul E Rangel <rrangel@chromium.org> wrote:
>>>
>>> First off, thanks for the review.
>>>
>>>>>
>>>>> There is a race condition between resetting the SDHCI controller and
>>>>> disconnecting the card.
>>>>>
>>>>> For example:
>>>>> 0) Card is connected and transferring data
>>>>> 1) mmc_sd_reset is called to reset the controller due to a data error
>>>>
>>>> I assume you refer to mmc_sd_hw_reset()? In that case, I think you
>>>> have interpreted the purpose of mmc_sd_hw_reset() slightly wrong. It's
>>>> responsibility is to reset the SD-card and not the host/controller.
>>> You are correct. I was looking at a 4.14 kernel where it's called
>>> mmc_sd_reset. 4.19 and above call it mmc_sd_hw_reset.
>>>
>>> All I was trying to convey here was that a block error will eventually
>>> call sdhci_set_ios with SOFT_RESET_ALL via:
>>> mmc_blk_reset
>>> mmc_hw_reset
>>> mmc_sd_hw_reset
>>> mmc_power_cycle
>>> mmc_power_off
>>> mmc_set_initial_state
>>> sdhci_set_ios
>>> sdhci_reinit
>>> sdhci_init
>>> sdhci_do_reset(host, SDHCI_RESET_ALL);
>>>
>>>>
>>>> Whether there some additional "reset" of the controller needed, that
>>>> is assumed by the core, to be managed via the ->set_ios() callback for
>>>> the host.
>>>>
>>>>> 2) sdhci_set_ios calls sdhci_do_reset
>>>>> 3) SOFT_RESET_ALL is toggled which clears the IRQs the controller has
>>>>> configured.
>>>>> 4) Wait for SOFT_RESET_ALL to clear
>>>>> 5) CD logic notices card is gone and CARD_PRESENT goes low, but since the
>>>>>    IRQs are not configured a CARD_REMOVED interrupt is never raised.
>>>>> 6) IRQs are enabled again
>>>>> 7) mmc layer never notices the device is disconnected. The SDHCI layer
>>>>>    will keep returning -ENOMEDIUM. This results in a card that is always
>>>>>    present and not functional.
>>>>
>>>> This sounds like host specific problems, which most likely should be
>>>> fixed in host driver, solely. Unless I am missing something, of
>>>> course.
>>>
>>> So in sdhci_do_reset we call the reset callback which is typically
>>> sdhci_reset. sdhci_reset can wait for up to 100ms waiting for the
>>> controller to reset. If SDHCI_RESET_ALL was passed as the flag, the
>>> controller will clear the IRQ mask. If during that 100ms the card is
>>> removed there is no notification to the MMC system that the card was
>>> removed. So it seems like the card is always present.
>>
>> So you are saying that the problem occurs when the card is removed
>> during this 100ms period?
> Exactly. Thought I think most controllers reset fast enough to where
> that window is very small. But for this AMD controller we need to do a
> full reset which takes a while, so we can see the problem.
> 
>>
>> I assume there a way for sdhci to re-validate whether the card has
>> been removed during this period, right?
> I'll let Adrian chime in here.
> 
>>>
>>>>>  drivers/mmc/core/sd.c | 20 ++++++++++++++++++++
>>>>>  1 file changed, 20 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
>>>>> index 265e1aeeb9d8..9206c4297d66 100644
>>>>> --- a/drivers/mmc/core/sd.c
>>>>> +++ b/drivers/mmc/core/sd.c
>>>>> @@ -1242,7 +1242,27 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
>>>>>
>>>>>  static int mmc_sd_hw_reset(struct mmc_host *host)
>>>>>  {
>>>>> +       int present;
>>>>>         mmc_power_cycle(host, host->card->ocr);
>>>>> +
>>>>> +       present = host->ops->get_cd(host);
>>>>> +
>>>>> +       /* The card status could have changed while resetting. */
>>>>> +       if ((mmc_card_removed(host->card) && present) ||
>>>>> +           (!mmc_card_removed(host->card) && !present)) {
>>>>> +               pr_info("%s: card status changed during reset\n",
>>>>> +                      mmc_hostname(host));
>>>>> +               host->ops->card_event(host);
>>>>> +               mmc_detect_change(host, 0);
>>>>> +       }
>>>>> +
>>>>> +       /* Don't perform unnecessary transactions if the card is missing. */
>>>>> +       if (!present) {
>>>>> +               pr_info("%s: card was removed during reset\n",
>>>>> +                       mmc_hostname(host));
>>>>> +               return -ENOMEDIUM;
>>>>> +       }
>>>>> +
>>>>
>>>> When doing a  mmc_hw_reset() (which ends up calling mmc_sd_hw_reset()
>>>> in case of SD cards), we are making a final attempt to make the card
>>>> functional again, via a power cycle and a re-init of it.
>>>>
>>>> In this path, we don't care whether the card is removed, as that
>>>> should have been detected already when the block layer calls
>>>> mmc_detect_card_removed().
>>>
>>> mmc_detect_card_removed has the guard `host->detect_change` to
>>> prevent it from checking the card status again. So maybe the fix to the
>>> missing interrupt/race condition is to set `host->detect_change = 1`
>>> from sdhci_do_reset after calling the reset handler. This way there will
>>> always be a single poll after a full reset so the correct card status can
>>> be detected?
>>
>> It sounds like you should call mmc_detect_change(), after the reset to
>> let the core check for cards that may have been removed/inserted.
>>
>> Would that work?
> Yeah, I could add `mmc_detect_change()` to the AMD SDHCI patch. I just
> thought it would be better to fix the potential race condition at a
> higher level. Though I'm happy to move it to the AMD patch. Let me know.

Does the following work?


diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 0cd5f2ce98df..f672171246b0 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -341,8 +341,19 @@ static void sdhci_init(struct sdhci_host *host, int soft)
 
 static void sdhci_reinit(struct sdhci_host *host)
 {
+	u32 cd = host->ier & (SDHCI_INT_CARD_REMOVE | SDHCI_INT_CARD_INSERT);
+
 	sdhci_init(host, 0);
 	sdhci_enable_card_detection(host);
+
+	/*
+	 * A change to the card detect bits indicates a change in present state,
+	 * refer sdhci_set_card_detection(). A card detect interrupt might have
+	 * been missed while the host controller was being reset, so trigger a
+	 * rescan to check.
+	 */
+	if (cd != (host->ier & (SDHCI_INT_CARD_REMOVE | SDHCI_INT_CARD_INSERT)))
+		mmc_detect_change(host->mmc, msecs_to_jiffies(200));
 }
 
 static void __sdhci_led_activate(struct sdhci_host *host)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-06-11 10:30         ` Adrian Hunter
@ 2019-06-19 14:56           ` Raul Rangel
  2019-08-01 15:16             ` Raul Rangel
  0 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-06-19 14:56 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Daniel Kurtz, hongjiefang,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On Tue, Jun 11, 2019 at 01:30:55PM +0300, Adrian Hunter wrote:
> Does the following work?
> 
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index 0cd5f2ce98df..f672171246b0 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -341,8 +341,19 @@ static void sdhci_init(struct sdhci_host *host, int soft)
>  
>  static void sdhci_reinit(struct sdhci_host *host)
>  {
> +	u32 cd = host->ier & (SDHCI_INT_CARD_REMOVE | SDHCI_INT_CARD_INSERT);
> +
>  	sdhci_init(host, 0);
>  	sdhci_enable_card_detection(host);
> +
> +	/*
> +	 * A change to the card detect bits indicates a change in present state,
> +	 * refer sdhci_set_card_detection(). A card detect interrupt might have
> +	 * been missed while the host controller was being reset, so trigger a
> +	 * rescan to check.
> +	 */
> +	if (cd != (host->ier & (SDHCI_INT_CARD_REMOVE | SDHCI_INT_CARD_INSERT)))
> +		mmc_detect_change(host->mmc, msecs_to_jiffies(200));
>  }
>  
>  static void __sdhci_led_activate(struct sdhci_host *host)

Your patch looks good. I tried it out and got over 57k insertion/removal
iterations. Do you want me to send out your patch, or do you want to do
it?

Just to recap, the patch you proposed + the AMD SDHCI specific patch fix
the problem.

Thanks!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-06-19 14:56           ` Raul Rangel
@ 2019-08-01 15:16             ` Raul Rangel
  2019-08-02  4:58               ` Adrian Hunter
  0 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-08-01 15:16 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Daniel Kurtz, hongjiefang,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On Wed, Jun 19, 2019 at 08:56:25AM -0600, Raul Rangel wrote:
> Your patch looks good. I tried it out and got over 57k insertion/removal
> iterations. Do you want me to send out your patch, or do you want to do
> it?
> 
> Just to recap, the patch you proposed + the AMD SDHCI specific patch fix
> the problem.
> 

Just pinging the thread again to see what you would prefer. I can send
the patch with you in the Signed-off. Or if you would prefer to not
update the sdhci driver at all, I can add the logic into the AMD
specific patch.

I would like to get this in so it lands by 5.4.

Thanks,
Raul

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-08-01 15:16             ` Raul Rangel
@ 2019-08-02  4:58               ` Adrian Hunter
  2019-08-05 16:49                 ` Raul Rangel
  0 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2019-08-02  4:58 UTC (permalink / raw)
  To: Raul Rangel
  Cc: Ulf Hansson, linux-mmc, Daniel Kurtz, hongjiefang,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On 1/08/19 6:16 PM, Raul Rangel wrote:
> On Wed, Jun 19, 2019 at 08:56:25AM -0600, Raul Rangel wrote:
>> Your patch looks good. I tried it out and got over 57k insertion/removal
>> iterations. Do you want me to send out your patch, or do you want to do
>> it?
>>
>> Just to recap, the patch you proposed + the AMD SDHCI specific patch fix
>> the problem.
>>
> 
> Just pinging the thread again to see what you would prefer. I can send
> the patch with you in the Signed-off. Or if you would prefer to not
> update the sdhci driver at all, I can add the logic into the AMD
> specific patch.
> 
> I would like to get this in so it lands by 5.4.

You seem not to have answered to my suggestion for a change to sdhci_reinit() here:

	https://lore.kernel.org/lkml/fcdf6cc4-2729-abe2-85c8-b0d04901c5ae@intel.com/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-08-02  4:58               ` Adrian Hunter
@ 2019-08-05 16:49                 ` Raul Rangel
  2019-08-06  5:51                   ` Adrian Hunter
  0 siblings, 1 reply; 20+ messages in thread
From: Raul Rangel @ 2019-08-05 16:49 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Daniel Kurtz, hongjiefang,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On Fri, Aug 02, 2019 at 07:58:20AM +0300, Adrian Hunter wrote:
> 
> You seem not to have answered to my suggestion for a change to sdhci_reinit() here:
> 
> 	https://lore.kernel.org/lkml/fcdf6cc4-2729-abe2-85c8-b0d04901c5ae@intel.com/
> 
I thought I answered it here: https://lore.kernel.org/lkml/20190619145625.GA50985@google.com/#t

Did I miss something?

Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset
  2019-08-05 16:49                 ` Raul Rangel
@ 2019-08-06  5:51                   ` Adrian Hunter
  0 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2019-08-06  5:51 UTC (permalink / raw)
  To: Raul Rangel
  Cc: Ulf Hansson, linux-mmc, Daniel Kurtz, hongjiefang,
	Linux Kernel Mailing List, Shawn Lin, Kyle Roeschley,
	Avri Altman

On 5/08/19 7:49 PM, Raul Rangel wrote:
> On Fri, Aug 02, 2019 at 07:58:20AM +0300, Adrian Hunter wrote:
>>
>> You seem not to have answered to my suggestion for a change to sdhci_reinit() here:
>>
>> 	https://lore.kernel.org/lkml/fcdf6cc4-2729-abe2-85c8-b0d04901c5ae@intel.com/
>>
> I thought I answered it here: https://lore.kernel.org/lkml/20190619145625.GA50985@google.com/#t
> 
> Did I miss something?

I didn't get that mail for some reason, sorry.

Please send out my patch and your second with my ack.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-08-06  5:53 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-01 17:54 [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul E Rangel
2019-05-01 17:54 ` [RFC PATCH 2/2] mmc: sdhci: Quirk for AMD SDHC Device 0x7906 Raul E Rangel
2019-05-02  6:32   ` Adrian Hunter
2019-05-02 15:42     ` Raul Rangel
2019-05-12 17:04     ` S-k, Shyam-sundar
2019-05-13  6:44       ` Adrian Hunter
2019-05-28  7:41   ` Ulf Hansson
2019-06-03 20:34     ` Raul Rangel
2019-05-03 15:12 ` [RFC PATCH 1/2] mmc: sdhci: Manually check card status after reset Raul Rangel
2019-05-08 19:00   ` Raul Rangel
2019-05-28  7:38 ` Ulf Hansson
2019-06-07 16:05   ` Raul Rangel
2019-06-10 16:17     ` Ulf Hansson
2019-06-10 16:32       ` Raul Rangel
2019-06-11 10:30         ` Adrian Hunter
2019-06-19 14:56           ` Raul Rangel
2019-08-01 15:16             ` Raul Rangel
2019-08-02  4:58               ` Adrian Hunter
2019-08-05 16:49                 ` Raul Rangel
2019-08-06  5:51                   ` Adrian Hunter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).