Re: [PATCH] [v2] mmc: sdhci-pci-gli: Improve Random 4K Read Performance of GL9763E

From: Ulf Hansson <ulf.hansson@linaro.org>
To: Renius Chen <reniuschengl@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	linux-mmc <linux-mmc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ben Chuang <Ben.Chuang@genesyslogic.com.tw>
Subject: Re: [PATCH] [v2] mmc: sdhci-pci-gli: Improve Random 4K Read Performance of GL9763E
Date: Tue, 6 Jul 2021 11:16:17 +0200	[thread overview]
Message-ID: <CAPDyKFq0yHxX7wb4XGeiMiSGGiOf8RKJ5ahhFQ+_vodqnyPV9Q@mail.gmail.com> (raw)
In-Reply-To: <CAJU4x8srB7skGFVcj1SPrzEZSnVkwKiW3OPN0GQxvgtRG7GAAQ@mail.gmail.com>

On Mon, 5 Jul 2021 at 17:09, Renius Chen <reniuschengl@gmail.com> wrote:
>
> Ulf Hansson <ulf.hansson@linaro.org> 於 2021年7月5日 週一 下午8:51寫道：
> >
> > On Mon, 5 Jul 2021 at 12:59, Renius Chen <reniuschengl@gmail.com> wrote:
> > >
> > > Ulf Hansson <ulf.hansson@linaro.org> 於 2021年7月5日 週一 下午6:03寫道：
> > > >
> > > > On Mon, 5 Jul 2021 at 11:00, Renius Chen <reniuschengl@gmail.com> wrote:
> > > > >
> > > > > During a sequence of random 4K read operations, the performance will be
> > > > > reduced due to spending much time on entering/exiting the low power state
> > > > > between requests. We disable the low power state negotiation of GL9763E
> > > > > during a sequence of random 4K read operations to improve the performance
> > > > > and enable it again after the operations have finished.
> > > > >
> > > > > Signed-off-by: Renius Chen <reniuschengl@gmail.com>
> > > > > ---
> > > > >  drivers/mmc/host/sdhci-pci-gli.c | 68 ++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 68 insertions(+)
> > > > >
> > > > > diff --git a/drivers/mmc/host/sdhci-pci-gli.c b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > index 302a7579a9b3..5f1f332b4241 100644
> > > > > --- a/drivers/mmc/host/sdhci-pci-gli.c
> > > > > +++ b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > @@ -88,6 +88,9 @@
> > > > >  #define PCIE_GLI_9763E_SCR      0x8E0
> > > > >  #define   GLI_9763E_SCR_AXI_REQ           BIT(9)
> > > > >
> > > > > +#define PCIE_GLI_9763E_CFG       0x8A0
> > > > > +#define   GLI_9763E_CFG_LPSN_DIS   BIT(12)
> > > > > +
> > > > >  #define PCIE_GLI_9763E_CFG2      0x8A4
> > > > >  #define   GLI_9763E_CFG2_L1DLY     GENMASK(28, 19)
> > > > >  #define   GLI_9763E_CFG2_L1DLY_MID 0x54
> > > > > @@ -128,6 +131,11 @@
> > > > >
> > > > >  #define GLI_MAX_TUNING_LOOP 40
> > > > >
> > > > > +struct gli_host {
> > > > > +       bool start_4k_r;
> > > > > +       int continuous_4k_r;
> > > > > +};
> > > > > +
> > > > >  /* Genesys Logic chipset */
> > > > >  static inline void gl9750_wt_on(struct sdhci_host *host)
> > > > >  {
> > > > > @@ -691,6 +699,62 @@ static void sdhci_gl9763e_dumpregs(struct mmc_host *mmc)
> > > > >         sdhci_dumpregs(mmc_priv(mmc));
> > > > >  }
> > > > >
> > > > > +static void gl9763e_set_low_power_negotiation(struct sdhci_pci_slot *slot, bool enable)
> > > > > +{
> > > > > +       struct pci_dev *pdev = slot->chip->pdev;
> > > > > +       u32 value;
> > > > > +
> > > > > +       pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
> > > > > +       value &= ~GLI_9763E_VHS_REV;
> > > > > +       value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_W);
> > > > > +       pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
> > > > > +
> > > > > +       pci_read_config_dword(pdev, PCIE_GLI_9763E_CFG, &value);
> > > > > +
> > > > > +       if (enable)
> > > > > +               value &= ~GLI_9763E_CFG_LPSN_DIS;
> > > > > +       else
> > > > > +               value |= GLI_9763E_CFG_LPSN_DIS;
> > > > > +
> > > > > +       pci_write_config_dword(pdev, PCIE_GLI_9763E_CFG, value);
> > > > > +
> > > > > +       pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
> > > > > +       value &= ~GLI_9763E_VHS_REV;
> > > > > +       value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_R);
> > > > > +       pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
> > > > > +}
> > > > > +
> > > > > +static void gl9763e_request(struct mmc_host *mmc, struct mmc_request *mrq)
> > > > > +{
> > > > > +       struct sdhci_host *host = mmc_priv(mmc);
> > > > > +       struct mmc_command *cmd;
> > > > > +       struct sdhci_pci_slot *slot = sdhci_priv(host);
> > > > > +       struct gli_host *gli_host = sdhci_pci_priv(slot);
> > > > > +
> > > > > +       cmd = mrq->cmd;
> > > > > +
> > > > > +       if (cmd && (cmd->opcode == MMC_READ_MULTIPLE_BLOCK) && (cmd->data->blocks == 8)) {
> > > > > +               gli_host->continuous_4k_r++;
> > > > > +
> > > > > +               if ((!gli_host->start_4k_r) && (gli_host->continuous_4k_r >= 3)) {
> > > > > +                       gl9763e_set_low_power_negotiation(slot, false);
> > > > > +
> > > > > +                       gli_host->start_4k_r = true;
> > > > > +               }
> > > > > +       } else {
> > > > > +               gli_host->continuous_4k_r = 0;
> > > > > +
> > > > > +               if (gli_host->start_4k_r)       {
> > > > > +                       gl9763e_set_low_power_negotiation(slot, true);
> > > > > +
> > > > > +                       gli_host->start_4k_r = false;
> > > > > +               }
> > > > > +       }
> > > >
> > > > The above code is trying to figure out what kind of storage use case
> > > > that is running, based on information about the buffers. This does not
> > > > work, simply because the buffers don't give you all the information
> > > > you need to make the right decisions.
> > > >
> > > > Moreover, I am sure you would try to follow up with additional changes
> > > > on top, trying to tweak the behaviour to fit another use case - and so
> > > > on. My point is, this code doesn't belong in the lowest layer drivers.
> > > >
> > > > To move forward, I suggest you explore using runtime PM in combination
> > > > with dev PM qos. In this way, the driver could implement a default
> > > > behaviour, which can be tweaked from upper layer governors for
> > > > example, but also from user space (via sysfs) allowing more
> > > > flexibility and potentially support for various more use cases.
> > > >
> > >
> > > Hi Ulf,
> > >
> > > Thanks for advice.
> > >
> > > But we'll meet the performance issue only during a seqence of requests
> > > of read commands with 4K data length.
> > >
> > > So what we have to do is looking into the requests to monitor such
> > > behaviors and disable the low power state negotiation of GL9763e. And
> > > the information from the request buffer is sufficient for this
> > > purpose.
> > >
> > > We don't even care about if we disable the low power state negotiation
> > > by a wrong decision because we'll enable it again by any requests
> > > which are not read commands or their data length is not 4K. Disabling
> > > the low power state negotiation of GL9763e not only has no side
> > > effects but also helps its performance.
> > >
> > > The behavior is only about the low power state negotiation of GL9763e
> > > and 4K reads, and not related to runtime PM, so that we monitor the
> > > requests and implement it in the driver of GL9763e.
> >
> > I don't agree, sorry.
> >
> > The request doesn't tell you about the behavior/performance of the
> > eMMC/SD card. You can have some average idea, but things vary
> > depending on what eMMC/SD card that is being used - and over time when
> > the card gets used, for example.
> >
> > But, let's not discuss use cases and exactly how to tune the behavior,
> > that's a separate discussion.
> >
> > To repeat what I said, my main point is that this kind of code doesn't
> > belong in the driver. Instead, please try using runtime PM and dev PM
> > Qos.
> >
> > A rather simple attempt would be to deploy runtime PM support and play
> > with a default autosuspend timeout instead. Would that work for you?
> >
>
> Hi Ulf,
>
>
> Thanks for your explanation.
>
> I think there may be some misunderstandings here.

I fully understand what you want to do.

>
> Our purpose is to avoid our GL9763e from entering ASPM L1 state during
> a sequence of 4K read requests. So we don't have to consider about the
> behavior/performance of the eMMC/SD card and what eMMC/SD card that is
> being used. We just need to know what kind of requests we are
> receiving now from the PCIe root port.
>
> Besides, the APSM L1 is purely hardware behavior in GL9763e and has no
> corresponding relationship with runtime PM. It's not activated by
> driver and the behaviors are not handled by software. I think runtime
> PM is used to handle the behaviors of D0/D3 of the device, but not the
> link status of ASPM L0s, L1, etc.

Maybe runtime PM isn't the perfect fit for this type of use case.

That still doesn't matter to to me, I will not accept this kind of
governor/policy based code for use cases, in drivers. It doesn't
belong there.

>
> I agree that the policy of balancing performance vs the energy cost is
> a generic problem that all mmc drivers share. But our driver of
> GL9763e is a host driver, the setting in this patch is also only for
> GL9763e, could not be used by other devices. It depends on our
> specific hardware design so that it is not a generic solution or
> policy. So I think to implement such a patch in our specific GL9763e
> driver to execute the specific actions just for our hardware design is
> reasonable.

From the use case point of view, the GL9763e hardware design isn't at
all specific.

In many cases, controllers/platforms have support for low power states
that one want to enter to avoid wasting energy. The difficult part is
to know *when* it makes sense to enter a low power state, as it also
introduces a latency when the power needs to be restored for the
device, to allow it to serve a new request.

To me, it sounds like you may have been too aggressive on avoid
wasting energy. If I understand correctly the idle period you use is
20/21 us, while most other drivers use 50-100 ms as idle period.

[...]

Kind regards
Uffe