linux-mmc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yann Gautier <yann.gautier@foss.st.com>
To: Linus Walleij <linus.walleij@linaro.org>,
	<linux-mmc@vger.kernel.org>,
	"Ulf Hansson" <ulf.hansson@linaro.org>
Cc: <stable@vger.kernel.org>, <phone-devel@vger.kernel.org>,
	Ludovic Barre <ludovic.barre@st.com>,
	Stephan Gerhold <stephan@gerhold.net>, <newbyte@disroot.org>
Subject: Re: [PATCH v3] mmc: core: Add a card quirk for non-hw busy detection
Date: Mon, 16 Aug 2021 16:03:31 +0200	[thread overview]
Message-ID: <2f449f6e-bca0-3c70-4255-26619e957d44@foss.st.com> (raw)
In-Reply-To: <20210720144115.1525257-1-linus.walleij@linaro.org>

On 7/20/21 4:41 PM, Linus Walleij wrote:
> Some boot partitions on the Samsung 4GB KLM4G1YE4C "4YMD1R" and "M4G1YC"
> cards appear broken when accessed randomly. CMD6 to switch back to the main
> partition randomly stalls after CMD18 access to the boot partition 1, and
> the card never comes back online. The accesses to the boot partitions work
> several times before this happens, but eventually the card access hangs
> while initializing the card.
> 
> Some problematic eMMC cards are found in the Samsung GT-S7710 (Skomer)
> and SGH-I407 (Kyle) mobile phones.
> 
> I tried using only single blocks with CMD17 on the boot partitions with the
> result that it crashed even faster.
> 
> After a bit of root cause analysis it turns out that these old eMMC cards
> probably cannot do hardware busy detection (monitoring DAT0) properly.
> 
> The card survives on older kernels, but this is because recent kernels have
> added busy detection handling for the SoC used in these phones, exposing
> the issue.
> 
> Construct a quirk that makes the MMC cord avoid using the ->card_busy()
> callback if the card is listed with MMC_QUIRK_BROKEN_HW_BUSY_DETECT and
> register the known problematic cards. The core changes are pretty
> straight-forward with a helper inline to check of we can use hardware
> busy detection.
> 
> On the MMCI host we have to counter the fact that if the host was able to
> use busy detect, it would be used unsolicited in the command IRQ callback.
> Rewrite this so that MMCI will not attempt to use hardware busy detection
> in the command IRQ until:
> - A card is attached to the host and
> - We know that the card can handle this busy detection
> 
> I have glanced over the ->card_busy() callbacks on some other hosts and
> they seem to mostly read a register reflecting the value of DAT0 for this
> which works fine with the quirk in this patch. However if the error appear
> on other hosts they might need additional fixes.
> 
> After applying this patch, the main partition can be accessed and mounted
> without problems on Samsung GT-S7710 and SGH-I407.
> 
> Fixes: cb0335b778c7 ("mmc: mmci: add busy_complete callback")
> Cc: stable@vger.kernel.org
> Cc: phone-devel@vger.kernel.org
> Cc: Ludovic Barre <ludovic.barre@st.com>
> Cc: Stephan Gerhold <stephan@gerhold.net>
> Reported-by: newbyte@disroot.org
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

Hi Linus,

I was just testing your patch on top of mmc/next.
Whereas mmc/next is fine, with your patch I fail to pass MMC test 5 
(Multi-block write).
I've got this error on STM32MP157C-EV1 board:
[  108.956218] mmc0: Starting tests of card mmc0:aaaa...
[  108.959862] mmc0: Test case 5. Multi-block write...
[  108.995615] mmc0: Warning: Host did not wait for busy state to end.
[  109.000483] mmc0: Result: ERROR (-110)
Then nothing more happens.

The test was done on an SD-card Sandisk Extreme Pro SDXC UHS-I mark 3, 
in DDR50 mode.

I'll try to add more traces to see what happens.


Best regards,
Yann


> ---
> ChangeLog v2->v3:
> - Rebase on v5.14-rc1
> - Reword the commit message slightly.
> ChangeLog v1->v2:
> - Rewrite to reflect the actual problem of broken busy detection.
> ---
>   drivers/mmc/core/core.c    |  8 ++++----
>   drivers/mmc/core/core.h    | 17 +++++++++++++++++
>   drivers/mmc/core/mmc_ops.c |  4 ++--
>   drivers/mmc/core/quirks.h  | 21 +++++++++++++++++++++
>   drivers/mmc/host/mmci.c    | 22 ++++++++++++++++++++--
>   include/linux/mmc/card.h   |  1 +
>   6 files changed, 65 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 95fedcf56e4a..e08dd9ea3d46 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -232,7 +232,7 @@ static void __mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
>   	 * And bypass I/O abort, reset and bus suspend operations.
>   	 */
>   	if (sdio_is_io_busy(mrq->cmd->opcode, mrq->cmd->arg) &&
> -	    host->ops->card_busy) {
> +	    mmc_hw_busy_detect(host)) {
>   		int tries = 500; /* Wait aprox 500ms at maximum */
>   
>   		while (host->ops->card_busy(host) && --tries)
> @@ -1200,7 +1200,7 @@ int mmc_set_uhs_voltage(struct mmc_host *host, u32 ocr)
>   	 */
>   	if (!host->ops->start_signal_voltage_switch)
>   		return -EPERM;
> -	if (!host->ops->card_busy)
> +	if (!mmc_hw_busy_detect(host))
>   		pr_warn("%s: cannot verify signal voltage switch\n",
>   			mmc_hostname(host));
>   
> @@ -1220,7 +1220,7 @@ int mmc_set_uhs_voltage(struct mmc_host *host, u32 ocr)
>   	 * after the response of cmd11, but wait 1 ms to be sure
>   	 */
>   	mmc_delay(1);
> -	if (host->ops->card_busy && !host->ops->card_busy(host)) {
> +	if (mmc_hw_busy_detect(host) && !host->ops->card_busy(host)) {
>   		err = -EAGAIN;
>   		goto power_cycle;
>   	}
> @@ -1241,7 +1241,7 @@ int mmc_set_uhs_voltage(struct mmc_host *host, u32 ocr)
>   	 * Failure to switch is indicated by the card holding
>   	 * dat[0:3] low
>   	 */
> -	if (host->ops->card_busy && host->ops->card_busy(host))
> +	if (mmc_hw_busy_detect(host) && host->ops->card_busy(host))
>   		err = -EAGAIN;
>   
>   power_cycle:
> diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
> index 0c4de2030b3f..6a5619eed4a6 100644
> --- a/drivers/mmc/core/core.h
> +++ b/drivers/mmc/core/core.h
> @@ -181,4 +181,21 @@ static inline int mmc_flush_cache(struct mmc_host *host)
>   	return 0;
>   }
>   
> +/**
> + * mmc_hw_busy_detect() - Can we use hw busy detection?
> + * @host: the host in question
> + */
> +static inline bool mmc_hw_busy_detect(struct mmc_host *host)
> +{
> +	struct mmc_card *card = host->card;
> +	bool has_ops;
> +	bool able = true;
> +
> +	has_ops = (host->ops->card_busy != NULL);
> +	if (card)
> +		able = !(card->quirks & MMC_QUIRK_BROKEN_HW_BUSY_DETECT);
> +
> +	return (has_ops && able);
> +}
> +
>   #endif
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index 973756ed4016..546fc799a8e5 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -435,7 +435,7 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
>   	u32 status = 0;
>   	int err;
>   
> -	if (host->ops->card_busy) {
> +	if (mmc_hw_busy_detect(host)) {
>   		*busy = host->ops->card_busy(host);
>   		return 0;
>   	}
> @@ -597,7 +597,7 @@ int __mmc_switch(struct mmc_card *card, u8 set, u8 index, u8 value,
>   	 * when it's not allowed to poll by using CMD13, then we need to rely on
>   	 * waiting the stated timeout to be sufficient.
>   	 */
> -	if (!send_status && !host->ops->card_busy) {
> +	if (!send_status && !mmc_hw_busy_detect(host)) {
>   		mmc_delay(timeout_ms);
>   		goto out_tim;
>   	}
> diff --git a/drivers/mmc/core/quirks.h b/drivers/mmc/core/quirks.h
> index d68e6e513a4f..8da6526f0eb0 100644
> --- a/drivers/mmc/core/quirks.h
> +++ b/drivers/mmc/core/quirks.h
> @@ -99,6 +99,27 @@ static const struct mmc_fixup __maybe_unused mmc_blk_fixups[] = {
>   	MMC_FIXUP("V10016", CID_MANFID_KINGSTON, CID_OEMID_ANY, add_quirk_mmc,
>   		  MMC_QUIRK_TRIM_BROKEN),
>   
> +	/*
> +	 * Some older Samsung eMMCs have broken hardware busy detection.
> +	 * Enabling this feature in the host controller can make the card
> +	 * accesses lock up completely.
> +	 */
> +	MMC_FIXUP("4YMD1R", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	/* Samsung KLMxGxxE4x eMMCs from 2012: 4, 8, 16, 32 and 64 GB */
> +	MMC_FIXUP("M4G1YC", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	MMC_FIXUP("M8G1WA", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	MMC_FIXUP("MAG2WA", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	MMC_FIXUP("MBG4WA", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	MMC_FIXUP("MAG2WA", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +	MMC_FIXUP("MCG8WA", CID_MANFID_SAMSUNG, CID_OEMID_ANY, add_quirk_mmc,
> +		  MMC_QUIRK_BROKEN_HW_BUSY_DETECT),
> +
>   	END_FIXUP
>   };
>   
> diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
> index 984d35055156..3046917b2b67 100644
> --- a/drivers/mmc/host/mmci.c
> +++ b/drivers/mmc/host/mmci.c
> @@ -347,6 +347,24 @@ static int mmci_card_busy(struct mmc_host *mmc)
>   	return busy;
>   }
>   
> +/* Use this if the MMCI variant AND the card supports it */
> +static bool mmci_use_busy_detect(struct mmci_host *host)
> +{
> +	struct mmc_card *card = host->mmc->card;
> +
> +	if (!host->variant->busy_detect)
> +		return false;
> +
> +	/* We don't allow this until we know that the card can handle it */
> +	if (!card)
> +		return false;
> +
> +	if (card->quirks & MMC_QUIRK_BROKEN_HW_BUSY_DETECT)
> +		return false;
> +
> +	return true;
> +}
> +
>   static void mmci_reg_delay(struct mmci_host *host)
>   {
>   	/*
> @@ -1381,7 +1399,7 @@ mmci_cmd_irq(struct mmci_host *host, struct mmc_command *cmd,
>   		return;
>   
>   	/* Handle busy detection on DAT0 if the variant supports it. */
> -	if (busy_resp && host->variant->busy_detect)
> +	if (busy_resp && mmci_use_busy_detect(host))
>   		if (!host->ops->busy_complete(host, status, err_msk))
>   			return;
>   
> @@ -1725,7 +1743,7 @@ static void mmci_set_max_busy_timeout(struct mmc_host *mmc)
>   	struct mmci_host *host = mmc_priv(mmc);
>   	u32 max_busy_timeout = 0;
>   
> -	if (!host->variant->busy_detect)
> +	if (!mmci_use_busy_detect(host))
>   		return;
>   
>   	if (host->variant->busy_timeout && mmc->actual_clock)
> diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
> index 74e6c0624d27..525a39951c6d 100644
> --- a/include/linux/mmc/card.h
> +++ b/include/linux/mmc/card.h
> @@ -280,6 +280,7 @@ struct mmc_card {
>   						/* for byte mode */
>   #define MMC_QUIRK_NONSTD_SDIO	(1<<2)		/* non-standard SDIO card attached */
>   						/* (missing CIA registers) */
> +#define MMC_QUIRK_BROKEN_HW_BUSY_DETECT (1<<3)	/* Disable hardware busy detection on DAT0 */
>   #define MMC_QUIRK_NONSTD_FUNC_IF (1<<4)		/* SDIO card has nonstd function interfaces */
>   #define MMC_QUIRK_DISABLE_CD	(1<<5)		/* disconnect CD/DAT[3] resistor */
>   #define MMC_QUIRK_INAND_CMD38	(1<<6)		/* iNAND devices have broken CMD38 */
> 


  parent reply	other threads:[~2021-08-16 14:03 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20 14:41 [PATCH v3] mmc: core: Add a card quirk for non-hw busy detection Linus Walleij
2021-08-16 13:31 ` Ulf Hansson
2021-08-16 14:03 ` Yann Gautier [this message]
2021-08-16 22:37   ` Linus Walleij
2021-08-24 15:50     ` Yann Gautier
2022-02-13 23:56 ` Linus Walleij

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2f449f6e-bca0-3c70-4255-26619e957d44@foss.st.com \
    --to=yann.gautier@foss.st.com \
    --cc=linus.walleij@linaro.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=ludovic.barre@st.com \
    --cc=newbyte@disroot.org \
    --cc=phone-devel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=stephan@gerhold.net \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).