All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
@ 2020-09-12 21:07 Douglas Anderson
  2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Douglas Anderson @ 2020-09-12 21:07 UTC (permalink / raw)
  To: Mark Brown
  Cc: Akash Asthana, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
explained that the maximum size we could program the FIFO was
"mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
because I was worried about decreased bandwidth.

Since that time:
* All the interconnect patches have landed, making things run at the
  proper speed.
* I've done more measurements.

This lets me confirm that there's really no downside of using the FIFO
more.  Specifically I did "flashrom -p ec -r /tmp/foo.bin" on a
Chromebook and averaged over several runs.

Before: It took 6.66 seconds and 59669 interrupts fired.
After:  It took 6.66 seconds and 47992 interrupts fired.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/spi/spi-geni-qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 0dc3f4c55b0b..7f0bf0dec466 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -308,7 +308,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
 	 * Hardware programming guide suggests to configure
 	 * RX FIFO RFR level to fifo_depth-2.
 	 */
-	geni_se_init(se, mas->tx_fifo_depth / 2, mas->tx_fifo_depth - 2);
+	geni_se_init(se, mas->tx_fifo_depth - 3, mas->tx_fifo_depth - 2);
 	/* Transmit an entire FIFO worth of data per IRQ */
 	mas->tx_wm = 1;
 	ver = geni_se_get_qup_hw_version(se);
-- 
2.28.0.618.gf4bc123cb7-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again
  2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
@ 2020-09-12 21:08 ` Douglas Anderson
  2020-09-12 23:01   ` Bjorn Andersson
  2020-09-15 10:40   ` Akash Asthana
  2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Douglas Anderson @ 2020-09-12 21:08 UTC (permalink / raw)
  To: Mark Brown
  Cc: Akash Asthana, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

We always toggle the chip select manually in spi-geni-qcom so that we
can properly implement the Linux API.  There's no reason to program
this to the hardware on every transfer.  Program it once at init and
be done with it.

This saves some part of a microsecond of overhead on each transfer.
While not really noticeable on any real world benchmarks, we might as
well save the time.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/spi/spi-geni-qcom.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 7f0bf0dec466..92d88bf85a90 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -290,6 +290,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
 {
 	struct geni_se *se = &mas->se;
 	unsigned int proto, major, minor, ver;
+	u32 spi_tx_cfg;
 
 	pm_runtime_get_sync(mas->dev);
 
@@ -322,6 +323,11 @@ static int spi_geni_init(struct spi_geni_master *mas)
 
 	geni_se_select_mode(se, GENI_SE_FIFO);
 
+	/* We always control CS manually */
+	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
+	spi_tx_cfg &= ~CS_TOGGLE;
+	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
+
 	pm_runtime_put(mas->dev);
 	return 0;
 }
@@ -331,7 +337,7 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 				u16 mode, struct spi_master *spi)
 {
 	u32 m_cmd = 0;
-	u32 spi_tx_cfg, len;
+	u32 len;
 	struct geni_se *se = &mas->se;
 	int ret;
 
@@ -350,7 +356,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	spin_lock_irq(&mas->lock);
 	spin_unlock_irq(&mas->lock);
 
-	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
 	if (xfer->bits_per_word != mas->cur_bits_per_word) {
 		spi_setup_word_len(mas, mode, xfer->bits_per_word);
 		mas->cur_bits_per_word = xfer->bits_per_word;
@@ -364,8 +369,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	mas->tx_rem_bytes = 0;
 	mas->rx_rem_bytes = 0;
 
-	spi_tx_cfg &= ~CS_TOGGLE;
-
 	if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
 		len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
 	else
@@ -384,7 +387,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
 		mas->rx_rem_bytes = xfer->len;
 	}
-	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
 
 	/*
 	 * Lock around right before we start the transfer since our
-- 
2.28.0.618.gf4bc123cb7-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
  2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
@ 2020-09-12 21:08 ` Douglas Anderson
  2020-09-12 22:54   ` Bjorn Andersson
                     ` (2 more replies)
  2020-09-12 22:53 ` [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Bjorn Andersson
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 17+ messages in thread
From: Douglas Anderson @ 2020-09-12 21:08 UTC (permalink / raw)
  To: Mark Brown
  Cc: Akash Asthana, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

When setting up a bidirectional transfer we need to program both the
TX and RX lengths.  We don't need a memory barrier between those two
writes.  Factor out the __iowmb() and use writel_relaxed().  This
saves a fraction of a microsecond of setup overhead on bidirectional
transfers.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/spi/spi-geni-qcom.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 92d88bf85a90..6c7e12b68bf0 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -376,15 +376,22 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	len &= TRANS_LEN_MSK;
 
 	mas->cur_xfer = xfer;
+
+	/*
+	 * Factor out the __iowmb() so that we can use writel_relaxed() for
+	 * both writes below and thus only incur the overhead once even if
+	 * we execute both of them.
+	 */
+	__iowmb();
+
 	if (xfer->tx_buf) {
 		m_cmd |= SPI_TX_ONLY;
 		mas->tx_rem_bytes = xfer->len;
-		writel(len, se->base + SE_SPI_TX_TRANS_LEN);
+		writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
 	}
-
 	if (xfer->rx_buf) {
 		m_cmd |= SPI_RX_ONLY;
-		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
+		writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
 		mas->rx_rem_bytes = xfer->len;
 	}
 
-- 
2.28.0.618.gf4bc123cb7-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
  2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
  2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
  2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
@ 2020-09-12 22:53 ` Bjorn Andersson
  2020-09-13  1:11   ` Doug Anderson
  2020-09-14 14:52 ` Mark Brown
  2020-09-15  7:30 ` Akash Asthana
  4 siblings, 1 reply; 17+ messages in thread
From: Bjorn Andersson @ 2020-09-12 22:53 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Mark Brown, Akash Asthana, swboyd, Andy Gross, linux-arm-msm,
	linux-kernel, linux-spi

On Sat 12 Sep 16:07 CDT 2020, Douglas Anderson wrote:

> In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
> explained that the maximum size we could program the FIFO was
> "mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
> because I was worried about decreased bandwidth.
> 
> Since that time:
> * All the interconnect patches have landed, making things run at the
>   proper speed.
> * I've done more measurements.
> 
> This lets me confirm that there's really no downside of using the FIFO
> more.  Specifically I did "flashrom -p ec -r /tmp/foo.bin" on a
> Chromebook and averaged over several runs.

Wouldn't there be a downside in the form of setting the watermark that
close to the full FIFO we have less room for being late handling the
interrupt? Or is there some mechanism involved that will prevent
the FIFO from being overrun?

Regards,
Bjorn

> 
> Before: It took 6.66 seconds and 59669 interrupts fired.
> After:  It took 6.66 seconds and 47992 interrupts fired.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
>  drivers/spi/spi-geni-qcom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index 0dc3f4c55b0b..7f0bf0dec466 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -308,7 +308,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
>  	 * Hardware programming guide suggests to configure
>  	 * RX FIFO RFR level to fifo_depth-2.
>  	 */
> -	geni_se_init(se, mas->tx_fifo_depth / 2, mas->tx_fifo_depth - 2);
> +	geni_se_init(se, mas->tx_fifo_depth - 3, mas->tx_fifo_depth - 2);
>  	/* Transmit an entire FIFO worth of data per IRQ */
>  	mas->tx_wm = 1;
>  	ver = geni_se_get_qup_hw_version(se);
> -- 
> 2.28.0.618.gf4bc123cb7-goog
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
@ 2020-09-12 22:54   ` Bjorn Andersson
  2020-09-13  1:09     ` Doug Anderson
  2020-09-13  3:45     ` kernel test robot
  2020-09-13  7:31     ` kernel test robot
  2 siblings, 1 reply; 17+ messages in thread
From: Bjorn Andersson @ 2020-09-12 22:54 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Mark Brown, Akash Asthana, swboyd, Andy Gross, linux-arm-msm,
	linux-kernel, linux-spi

On Sat 12 Sep 16:08 CDT 2020, Douglas Anderson wrote:

> When setting up a bidirectional transfer we need to program both the
> TX and RX lengths.  We don't need a memory barrier between those two
> writes.  Factor out the __iowmb() and use writel_relaxed().  This
> saves a fraction of a microsecond of setup overhead on bidirectional
> transfers.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
>  drivers/spi/spi-geni-qcom.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index 92d88bf85a90..6c7e12b68bf0 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -376,15 +376,22 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>  	len &= TRANS_LEN_MSK;
>  
>  	mas->cur_xfer = xfer;
> +
> +	/*
> +	 * Factor out the __iowmb() so that we can use writel_relaxed() for
> +	 * both writes below and thus only incur the overhead once even if
> +	 * we execute both of them.
> +	 */

How many passes through this function do we have to take before saving
the amount of time it took me to read this comment?

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

Regards,
Bjorn

> +	__iowmb();
> +
>  	if (xfer->tx_buf) {
>  		m_cmd |= SPI_TX_ONLY;
>  		mas->tx_rem_bytes = xfer->len;
> -		writel(len, se->base + SE_SPI_TX_TRANS_LEN);
> +		writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
>  	}
> -
>  	if (xfer->rx_buf) {
>  		m_cmd |= SPI_RX_ONLY;
> -		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
> +		writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
>  		mas->rx_rem_bytes = xfer->len;
>  	}
>  
> -- 
> 2.28.0.618.gf4bc123cb7-goog
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again
  2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
@ 2020-09-12 23:01   ` Bjorn Andersson
  2020-09-15 10:40   ` Akash Asthana
  1 sibling, 0 replies; 17+ messages in thread
From: Bjorn Andersson @ 2020-09-12 23:01 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Mark Brown, Akash Asthana, swboyd, Andy Gross, linux-arm-msm,
	linux-kernel, linux-spi

On Sat 12 Sep 16:08 CDT 2020, Douglas Anderson wrote:

> We always toggle the chip select manually in spi-geni-qcom so that we
> can properly implement the Linux API.  There's no reason to program
> this to the hardware on every transfer.  Program it once at init and
> be done with it.
> 
> This saves some part of a microsecond of overhead on each transfer.
> While not really noticeable on any real world benchmarks, we might as
> well save the time.
> 

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

Regards,
Bjorn

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
>  drivers/spi/spi-geni-qcom.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index 7f0bf0dec466..92d88bf85a90 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -290,6 +290,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
>  {
>  	struct geni_se *se = &mas->se;
>  	unsigned int proto, major, minor, ver;
> +	u32 spi_tx_cfg;
>  
>  	pm_runtime_get_sync(mas->dev);
>  
> @@ -322,6 +323,11 @@ static int spi_geni_init(struct spi_geni_master *mas)
>  
>  	geni_se_select_mode(se, GENI_SE_FIFO);
>  
> +	/* We always control CS manually */
> +	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
> +	spi_tx_cfg &= ~CS_TOGGLE;
> +	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
> +
>  	pm_runtime_put(mas->dev);
>  	return 0;
>  }
> @@ -331,7 +337,7 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>  				u16 mode, struct spi_master *spi)
>  {
>  	u32 m_cmd = 0;
> -	u32 spi_tx_cfg, len;
> +	u32 len;
>  	struct geni_se *se = &mas->se;
>  	int ret;
>  
> @@ -350,7 +356,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>  	spin_lock_irq(&mas->lock);
>  	spin_unlock_irq(&mas->lock);
>  
> -	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
>  	if (xfer->bits_per_word != mas->cur_bits_per_word) {
>  		spi_setup_word_len(mas, mode, xfer->bits_per_word);
>  		mas->cur_bits_per_word = xfer->bits_per_word;
> @@ -364,8 +369,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>  	mas->tx_rem_bytes = 0;
>  	mas->rx_rem_bytes = 0;
>  
> -	spi_tx_cfg &= ~CS_TOGGLE;
> -
>  	if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
>  		len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
>  	else
> @@ -384,7 +387,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>  		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
>  		mas->rx_rem_bytes = xfer->len;
>  	}
> -	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
>  
>  	/*
>  	 * Lock around right before we start the transfer since our
> -- 
> 2.28.0.618.gf4bc123cb7-goog
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-12 22:54   ` Bjorn Andersson
@ 2020-09-13  1:09     ` Doug Anderson
  2020-09-13 20:35       ` Doug Anderson
  0 siblings, 1 reply; 17+ messages in thread
From: Doug Anderson @ 2020-09-13  1:09 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Mark Brown, Akash Asthana, Stephen Boyd, Andy Gross,
	linux-arm-msm, LKML, linux-spi

Hi,

On Sat, Sep 12, 2020 at 3:54 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Sat 12 Sep 16:08 CDT 2020, Douglas Anderson wrote:
>
> > When setting up a bidirectional transfer we need to program both the
> > TX and RX lengths.  We don't need a memory barrier between those two
> > writes.  Factor out the __iowmb() and use writel_relaxed().  This
> > saves a fraction of a microsecond of setup overhead on bidirectional
> > transfers.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> >  drivers/spi/spi-geni-qcom.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> > index 92d88bf85a90..6c7e12b68bf0 100644
> > --- a/drivers/spi/spi-geni-qcom.c
> > +++ b/drivers/spi/spi-geni-qcom.c
> > @@ -376,15 +376,22 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
> >       len &= TRANS_LEN_MSK;
> >
> >       mas->cur_xfer = xfer;
> > +
> > +     /*
> > +      * Factor out the __iowmb() so that we can use writel_relaxed() for
> > +      * both writes below and thus only incur the overhead once even if
> > +      * we execute both of them.
> > +      */
>
> How many passes through this function do we have to take before saving
> the amount of time it took me to read this comment?
>
> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

Thanks for the review!  Yeah, in Chrome OS we do a crazy amount of SPI
transfers since our EC and security chip are connected over SPI and we
seem to pile a whole lot of stuff into the EC.  This means we keep
coming back to the SPI controller again and again when profiling
things.  I'm hoping that we'll eventually be able to get DMA enabled
here, but until then at least it's nice to make the FIFO transfers
better...

-Doug

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
  2020-09-12 22:53 ` [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Bjorn Andersson
@ 2020-09-13  1:11   ` Doug Anderson
  2020-09-13  3:12     ` Bjorn Andersson
  0 siblings, 1 reply; 17+ messages in thread
From: Doug Anderson @ 2020-09-13  1:11 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Mark Brown, Akash Asthana, Stephen Boyd, Andy Gross,
	linux-arm-msm, LKML, linux-spi

Hi,

On Sat, Sep 12, 2020 at 3:53 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Sat 12 Sep 16:07 CDT 2020, Douglas Anderson wrote:
>
> > In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
> > explained that the maximum size we could program the FIFO was
> > "mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
> > because I was worried about decreased bandwidth.
> >
> > Since that time:
> > * All the interconnect patches have landed, making things run at the
> >   proper speed.
> > * I've done more measurements.
> >
> > This lets me confirm that there's really no downside of using the FIFO
> > more.  Specifically I did "flashrom -p ec -r /tmp/foo.bin" on a
> > Chromebook and averaged over several runs.
>
> Wouldn't there be a downside in the form of setting the watermark that
> close to the full FIFO we have less room for being late handling the
> interrupt? Or is there some mechanism involved that will prevent
> the FIFO from being overrun?

Yeah, I had that worry too, but, as described in 902481a78ee4 ("spi:
spi-geni-qcom: Actually use our FIFO"), it doesn't seem to be a
problem.  From that commit: "We are the SPI master, so it makes sense
that there would be no problems with overruns, the master should just
stop clocking."

-Doug

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
  2020-09-13  1:11   ` Doug Anderson
@ 2020-09-13  3:12     ` Bjorn Andersson
  0 siblings, 0 replies; 17+ messages in thread
From: Bjorn Andersson @ 2020-09-13  3:12 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Mark Brown, Akash Asthana, Stephen Boyd, Andy Gross,
	linux-arm-msm, LKML, linux-spi

On Sat 12 Sep 20:11 CDT 2020, Doug Anderson wrote:

> Hi,
> 
> On Sat, Sep 12, 2020 at 3:53 PM Bjorn Andersson
> <bjorn.andersson@linaro.org> wrote:
> >
> > On Sat 12 Sep 16:07 CDT 2020, Douglas Anderson wrote:
> >
> > > In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
> > > explained that the maximum size we could program the FIFO was
> > > "mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
> > > because I was worried about decreased bandwidth.
> > >
> > > Since that time:
> > > * All the interconnect patches have landed, making things run at the
> > >   proper speed.
> > > * I've done more measurements.
> > >
> > > This lets me confirm that there's really no downside of using the FIFO
> > > more.  Specifically I did "flashrom -p ec -r /tmp/foo.bin" on a
> > > Chromebook and averaged over several runs.
> >
> > Wouldn't there be a downside in the form of setting the watermark that
> > close to the full FIFO we have less room for being late handling the
> > interrupt? Or is there some mechanism involved that will prevent
> > the FIFO from being overrun?
> 
> Yeah, I had that worry too, but, as described in 902481a78ee4 ("spi:
> spi-geni-qcom: Actually use our FIFO"), it doesn't seem to be a
> problem.  From that commit: "We are the SPI master, so it makes sense
> that there would be no problems with overruns, the master should just
> stop clocking."
> 

Actually read the message of the linked commit now. I share your view
that this indicates that the controller does something wrt the clocking
to handle this case.

Change itself looks good, so:

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
@ 2020-09-13  3:45     ` kernel test robot
  2020-09-13  3:45     ` kernel test robot
  2020-09-13  7:31     ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2020-09-13  3:45 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: kbuild-all, clang-built-linux, Akash Asthana, swboyd,
	Douglas Anderson, Andy Gross, Bjorn Andersson, linux-arm-msm,
	linux-kernel, linux-spi

[-- Attachment #1: Type: text/plain, Size: 4844 bytes --]

Hi Douglas,

I love your patch! Yet something to improve:

[auto build test ERROR on spi/for-next]
[also build test ERROR on v5.9-rc4 next-20200911]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
config: x86_64-randconfig-a002-20200913 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 3170d54842655d6d936aae32b7d0bc92fce7f22e)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/spi/spi-geni-qcom.c:385:2: error: implicit declaration of function '__iowmb' [-Werror,-Wimplicit-function-declaration]
           __iowmb();
           ^
   1 error generated.

# https://github.com/0day-ci/linux/commit/4458adf4007926cfaaa505b54a83059c9ba813ad
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
git checkout 4458adf4007926cfaaa505b54a83059c9ba813ad
vim +/__iowmb +385 drivers/spi/spi-geni-qcom.c

   334	
   335	static void setup_fifo_xfer(struct spi_transfer *xfer,
   336					struct spi_geni_master *mas,
   337					u16 mode, struct spi_master *spi)
   338	{
   339		u32 m_cmd = 0;
   340		u32 len;
   341		struct geni_se *se = &mas->se;
   342		int ret;
   343	
   344		/*
   345		 * Ensure that our interrupt handler isn't still running from some
   346		 * prior command before we start messing with the hardware behind
   347		 * its back.  We don't need to _keep_ the lock here since we're only
   348		 * worried about racing with out interrupt handler.  The SPI core
   349		 * already handles making sure that we're not trying to do two
   350		 * transfers at once or setting a chip select and doing a transfer
   351		 * concurrently.
   352		 *
   353		 * NOTE: we actually _can't_ hold the lock here because possibly we
   354		 * might call clk_set_rate() which needs to be able to sleep.
   355		 */
   356		spin_lock_irq(&mas->lock);
   357		spin_unlock_irq(&mas->lock);
   358	
   359		if (xfer->bits_per_word != mas->cur_bits_per_word) {
   360			spi_setup_word_len(mas, mode, xfer->bits_per_word);
   361			mas->cur_bits_per_word = xfer->bits_per_word;
   362		}
   363	
   364		/* Speed and bits per word can be overridden per transfer */
   365		ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
   366		if (ret)
   367			return;
   368	
   369		mas->tx_rem_bytes = 0;
   370		mas->rx_rem_bytes = 0;
   371	
   372		if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
   373			len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
   374		else
   375			len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
   376		len &= TRANS_LEN_MSK;
   377	
   378		mas->cur_xfer = xfer;
   379	
   380		/*
   381		 * Factor out the __iowmb() so that we can use writel_relaxed() for
   382		 * both writes below and thus only incur the overhead once even if
   383		 * we execute both of them.
   384		 */
 > 385		__iowmb();
   386	
   387		if (xfer->tx_buf) {
   388			m_cmd |= SPI_TX_ONLY;
   389			mas->tx_rem_bytes = xfer->len;
   390			writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
   391		}
   392		if (xfer->rx_buf) {
   393			m_cmd |= SPI_RX_ONLY;
   394			writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
   395			mas->rx_rem_bytes = xfer->len;
   396		}
   397	
   398		/*
   399		 * Lock around right before we start the transfer since our
   400		 * interrupt could come in at any time now.
   401		 */
   402		spin_lock_irq(&mas->lock);
   403		geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
   404	
   405		/*
   406		 * TX_WATERMARK_REG should be set after SPI configuration and
   407		 * setting up GENI SE engine, as driver starts data transfer
   408		 * for the watermark interrupt.
   409		 */
   410		if (m_cmd & SPI_TX_ONLY)
   411			writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
   412		spin_unlock_irq(&mas->lock);
   413	}
   414	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39154 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
@ 2020-09-13  3:45     ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2020-09-13  3:45 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4969 bytes --]

Hi Douglas,

I love your patch! Yet something to improve:

[auto build test ERROR on spi/for-next]
[also build test ERROR on v5.9-rc4 next-20200911]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
config: x86_64-randconfig-a002-20200913 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 3170d54842655d6d936aae32b7d0bc92fce7f22e)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/spi/spi-geni-qcom.c:385:2: error: implicit declaration of function '__iowmb' [-Werror,-Wimplicit-function-declaration]
           __iowmb();
           ^
   1 error generated.

# https://github.com/0day-ci/linux/commit/4458adf4007926cfaaa505b54a83059c9ba813ad
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
git checkout 4458adf4007926cfaaa505b54a83059c9ba813ad
vim +/__iowmb +385 drivers/spi/spi-geni-qcom.c

   334	
   335	static void setup_fifo_xfer(struct spi_transfer *xfer,
   336					struct spi_geni_master *mas,
   337					u16 mode, struct spi_master *spi)
   338	{
   339		u32 m_cmd = 0;
   340		u32 len;
   341		struct geni_se *se = &mas->se;
   342		int ret;
   343	
   344		/*
   345		 * Ensure that our interrupt handler isn't still running from some
   346		 * prior command before we start messing with the hardware behind
   347		 * its back.  We don't need to _keep_ the lock here since we're only
   348		 * worried about racing with out interrupt handler.  The SPI core
   349		 * already handles making sure that we're not trying to do two
   350		 * transfers at once or setting a chip select and doing a transfer
   351		 * concurrently.
   352		 *
   353		 * NOTE: we actually _can't_ hold the lock here because possibly we
   354		 * might call clk_set_rate() which needs to be able to sleep.
   355		 */
   356		spin_lock_irq(&mas->lock);
   357		spin_unlock_irq(&mas->lock);
   358	
   359		if (xfer->bits_per_word != mas->cur_bits_per_word) {
   360			spi_setup_word_len(mas, mode, xfer->bits_per_word);
   361			mas->cur_bits_per_word = xfer->bits_per_word;
   362		}
   363	
   364		/* Speed and bits per word can be overridden per transfer */
   365		ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
   366		if (ret)
   367			return;
   368	
   369		mas->tx_rem_bytes = 0;
   370		mas->rx_rem_bytes = 0;
   371	
   372		if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
   373			len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
   374		else
   375			len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
   376		len &= TRANS_LEN_MSK;
   377	
   378		mas->cur_xfer = xfer;
   379	
   380		/*
   381		 * Factor out the __iowmb() so that we can use writel_relaxed() for
   382		 * both writes below and thus only incur the overhead once even if
   383		 * we execute both of them.
   384		 */
 > 385		__iowmb();
   386	
   387		if (xfer->tx_buf) {
   388			m_cmd |= SPI_TX_ONLY;
   389			mas->tx_rem_bytes = xfer->len;
   390			writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
   391		}
   392		if (xfer->rx_buf) {
   393			m_cmd |= SPI_RX_ONLY;
   394			writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
   395			mas->rx_rem_bytes = xfer->len;
   396		}
   397	
   398		/*
   399		 * Lock around right before we start the transfer since our
   400		 * interrupt could come in at any time now.
   401		 */
   402		spin_lock_irq(&mas->lock);
   403		geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
   404	
   405		/*
   406		 * TX_WATERMARK_REG should be set after SPI configuration and
   407		 * setting up GENI SE engine, as driver starts data transfer
   408		 * for the watermark interrupt.
   409		 */
   410		if (m_cmd & SPI_TX_ONLY)
   411			writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
   412		spin_unlock_irq(&mas->lock);
   413	}
   414	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39154 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
@ 2020-09-13  7:31     ` kernel test robot
  2020-09-13  3:45     ` kernel test robot
  2020-09-13  7:31     ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2020-09-13  7:31 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: kbuild-all, Akash Asthana, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

[-- Attachment #1: Type: text/plain, Size: 4742 bytes --]

Hi Douglas,

I love your patch! Yet something to improve:

[auto build test ERROR on spi/for-next]
[also build test ERROR on v5.9-rc4 next-20200911]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
config: sparc-allyesconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sparc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/spi/spi-geni-qcom.c: In function 'setup_fifo_xfer':
>> drivers/spi/spi-geni-qcom.c:385:2: error: implicit declaration of function '__iowmb' [-Werror=implicit-function-declaration]
     385 |  __iowmb();
         |  ^~~~~~~
   cc1: some warnings being treated as errors

# https://github.com/0day-ci/linux/commit/4458adf4007926cfaaa505b54a83059c9ba813ad
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
git checkout 4458adf4007926cfaaa505b54a83059c9ba813ad
vim +/__iowmb +385 drivers/spi/spi-geni-qcom.c

   334	
   335	static void setup_fifo_xfer(struct spi_transfer *xfer,
   336					struct spi_geni_master *mas,
   337					u16 mode, struct spi_master *spi)
   338	{
   339		u32 m_cmd = 0;
   340		u32 len;
   341		struct geni_se *se = &mas->se;
   342		int ret;
   343	
   344		/*
   345		 * Ensure that our interrupt handler isn't still running from some
   346		 * prior command before we start messing with the hardware behind
   347		 * its back.  We don't need to _keep_ the lock here since we're only
   348		 * worried about racing with out interrupt handler.  The SPI core
   349		 * already handles making sure that we're not trying to do two
   350		 * transfers at once or setting a chip select and doing a transfer
   351		 * concurrently.
   352		 *
   353		 * NOTE: we actually _can't_ hold the lock here because possibly we
   354		 * might call clk_set_rate() which needs to be able to sleep.
   355		 */
   356		spin_lock_irq(&mas->lock);
   357		spin_unlock_irq(&mas->lock);
   358	
   359		if (xfer->bits_per_word != mas->cur_bits_per_word) {
   360			spi_setup_word_len(mas, mode, xfer->bits_per_word);
   361			mas->cur_bits_per_word = xfer->bits_per_word;
   362		}
   363	
   364		/* Speed and bits per word can be overridden per transfer */
   365		ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
   366		if (ret)
   367			return;
   368	
   369		mas->tx_rem_bytes = 0;
   370		mas->rx_rem_bytes = 0;
   371	
   372		if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
   373			len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
   374		else
   375			len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
   376		len &= TRANS_LEN_MSK;
   377	
   378		mas->cur_xfer = xfer;
   379	
   380		/*
   381		 * Factor out the __iowmb() so that we can use writel_relaxed() for
   382		 * both writes below and thus only incur the overhead once even if
   383		 * we execute both of them.
   384		 */
 > 385		__iowmb();
   386	
   387		if (xfer->tx_buf) {
   388			m_cmd |= SPI_TX_ONLY;
   389			mas->tx_rem_bytes = xfer->len;
   390			writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
   391		}
   392		if (xfer->rx_buf) {
   393			m_cmd |= SPI_RX_ONLY;
   394			writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
   395			mas->rx_rem_bytes = xfer->len;
   396		}
   397	
   398		/*
   399		 * Lock around right before we start the transfer since our
   400		 * interrupt could come in at any time now.
   401		 */
   402		spin_lock_irq(&mas->lock);
   403		geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
   404	
   405		/*
   406		 * TX_WATERMARK_REG should be set after SPI configuration and
   407		 * setting up GENI SE engine, as driver starts data transfer
   408		 * for the watermark interrupt.
   409		 */
   410		if (m_cmd & SPI_TX_ONLY)
   411			writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
   412		spin_unlock_irq(&mas->lock);
   413	}
   414	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 66758 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
@ 2020-09-13  7:31     ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2020-09-13  7:31 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4866 bytes --]

Hi Douglas,

I love your patch! Yet something to improve:

[auto build test ERROR on spi/for-next]
[also build test ERROR on v5.9-rc4 next-20200911]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
config: sparc-allyesconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sparc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/spi/spi-geni-qcom.c: In function 'setup_fifo_xfer':
>> drivers/spi/spi-geni-qcom.c:385:2: error: implicit declaration of function '__iowmb' [-Werror=implicit-function-declaration]
     385 |  __iowmb();
         |  ^~~~~~~
   cc1: some warnings being treated as errors

# https://github.com/0day-ci/linux/commit/4458adf4007926cfaaa505b54a83059c9ba813ad
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Douglas-Anderson/spi-spi-geni-qcom-Use-the-FIFO-even-more/20200913-050928
git checkout 4458adf4007926cfaaa505b54a83059c9ba813ad
vim +/__iowmb +385 drivers/spi/spi-geni-qcom.c

   334	
   335	static void setup_fifo_xfer(struct spi_transfer *xfer,
   336					struct spi_geni_master *mas,
   337					u16 mode, struct spi_master *spi)
   338	{
   339		u32 m_cmd = 0;
   340		u32 len;
   341		struct geni_se *se = &mas->se;
   342		int ret;
   343	
   344		/*
   345		 * Ensure that our interrupt handler isn't still running from some
   346		 * prior command before we start messing with the hardware behind
   347		 * its back.  We don't need to _keep_ the lock here since we're only
   348		 * worried about racing with out interrupt handler.  The SPI core
   349		 * already handles making sure that we're not trying to do two
   350		 * transfers at once or setting a chip select and doing a transfer
   351		 * concurrently.
   352		 *
   353		 * NOTE: we actually _can't_ hold the lock here because possibly we
   354		 * might call clk_set_rate() which needs to be able to sleep.
   355		 */
   356		spin_lock_irq(&mas->lock);
   357		spin_unlock_irq(&mas->lock);
   358	
   359		if (xfer->bits_per_word != mas->cur_bits_per_word) {
   360			spi_setup_word_len(mas, mode, xfer->bits_per_word);
   361			mas->cur_bits_per_word = xfer->bits_per_word;
   362		}
   363	
   364		/* Speed and bits per word can be overridden per transfer */
   365		ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
   366		if (ret)
   367			return;
   368	
   369		mas->tx_rem_bytes = 0;
   370		mas->rx_rem_bytes = 0;
   371	
   372		if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
   373			len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
   374		else
   375			len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
   376		len &= TRANS_LEN_MSK;
   377	
   378		mas->cur_xfer = xfer;
   379	
   380		/*
   381		 * Factor out the __iowmb() so that we can use writel_relaxed() for
   382		 * both writes below and thus only incur the overhead once even if
   383		 * we execute both of them.
   384		 */
 > 385		__iowmb();
   386	
   387		if (xfer->tx_buf) {
   388			m_cmd |= SPI_TX_ONLY;
   389			mas->tx_rem_bytes = xfer->len;
   390			writel_relaxed(len, se->base + SE_SPI_TX_TRANS_LEN);
   391		}
   392		if (xfer->rx_buf) {
   393			m_cmd |= SPI_RX_ONLY;
   394			writel_relaxed(len, se->base + SE_SPI_RX_TRANS_LEN);
   395			mas->rx_rem_bytes = xfer->len;
   396		}
   397	
   398		/*
   399		 * Lock around right before we start the transfer since our
   400		 * interrupt could come in at any time now.
   401		 */
   402		spin_lock_irq(&mas->lock);
   403		geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
   404	
   405		/*
   406		 * TX_WATERMARK_REG should be set after SPI configuration and
   407		 * setting up GENI SE engine, as driver starts data transfer
   408		 * for the watermark interrupt.
   409		 */
   410		if (m_cmd & SPI_TX_ONLY)
   411			writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
   412		spin_unlock_irq(&mas->lock);
   413	}
   414	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 66758 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters
  2020-09-13  1:09     ` Doug Anderson
@ 2020-09-13 20:35       ` Doug Anderson
  0 siblings, 0 replies; 17+ messages in thread
From: Doug Anderson @ 2020-09-13 20:35 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Mark Brown, Akash Asthana, Stephen Boyd, Andy Gross,
	linux-arm-msm, LKML, linux-spi

Hi,

On Sat, Sep 12, 2020 at 6:09 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Sat, Sep 12, 2020 at 3:54 PM Bjorn Andersson
> <bjorn.andersson@linaro.org> wrote:
> >
> > On Sat 12 Sep 16:08 CDT 2020, Douglas Anderson wrote:
> >
> > > When setting up a bidirectional transfer we need to program both the
> > > TX and RX lengths.  We don't need a memory barrier between those two
> > > writes.  Factor out the __iowmb() and use writel_relaxed().  This
> > > saves a fraction of a microsecond of setup overhead on bidirectional
> > > transfers.
> > >
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > ---
> > >
> > >  drivers/spi/spi-geni-qcom.c | 13 ++++++++++---
> > >  1 file changed, 10 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> > > index 92d88bf85a90..6c7e12b68bf0 100644
> > > --- a/drivers/spi/spi-geni-qcom.c
> > > +++ b/drivers/spi/spi-geni-qcom.c
> > > @@ -376,15 +376,22 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
> > >       len &= TRANS_LEN_MSK;
> > >
> > >       mas->cur_xfer = xfer;
> > > +
> > > +     /*
> > > +      * Factor out the __iowmb() so that we can use writel_relaxed() for
> > > +      * both writes below and thus only incur the overhead once even if
> > > +      * we execute both of them.
> > > +      */
> >
> > How many passes through this function do we have to take before saving
> > the amount of time it took me to read this comment?
> >
> > Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
>
> Thanks for the review!  Yeah, in Chrome OS we do a crazy amount of SPI
> transfers since our EC and security chip are connected over SPI and we
> seem to pile a whole lot of stuff into the EC.  This means we keep
> coming back to the SPI controller again and again when profiling
> things.  I'm hoping that we'll eventually be able to get DMA enabled
> here, but until then at least it's nice to make the FIFO transfers
> better...

Ugh.  Given the problem that the kernel test robot found, I'm gonna
say just drop this patch but keep the others I sent.  As per the CL
description, it's a pretty minor optimization and even though we do a
lot of SPI transfers it's probably more worth it to work towards DMA
mode than to try to find a cleaner solution for this one.

-Doug

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
  2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
                   ` (2 preceding siblings ...)
  2020-09-12 22:53 ` [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Bjorn Andersson
@ 2020-09-14 14:52 ` Mark Brown
  2020-09-15  7:30 ` Akash Asthana
  4 siblings, 0 replies; 17+ messages in thread
From: Mark Brown @ 2020-09-14 14:52 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: linux-arm-msm, swboyd, linux-spi, Bjorn Andersson, linux-kernel,
	Andy Gross, Akash Asthana

On Sat, 12 Sep 2020 14:07:59 -0700, Douglas Anderson wrote:
> In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
> explained that the maximum size we could program the FIFO was
> "mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
> because I was worried about decreased bandwidth.
> 
> Since that time:
> * All the interconnect patches have landed, making things run at the
>   proper speed.
> * I've done more measurements.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/2] spi: spi-geni-qcom: Use the FIFO even more
      commit: fc129a43aa2705770dc45b2e9c506d2617fd5863
[2/2] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again
      commit: 14ac4e049dc1183440960f177b60b54357e54d90

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more
  2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
                   ` (3 preceding siblings ...)
  2020-09-14 14:52 ` Mark Brown
@ 2020-09-15  7:30 ` Akash Asthana
  4 siblings, 0 replies; 17+ messages in thread
From: Akash Asthana @ 2020-09-15  7:30 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: swboyd, Andy Gross, Bjorn Andersson, linux-arm-msm, linux-kernel,
	linux-spi


On 9/13/2020 2:37 AM, Douglas Anderson wrote:
> In commit 902481a78ee4 ("spi: spi-geni-qcom: Actually use our FIFO") I
> explained that the maximum size we could program the FIFO was
> "mas->tx_fifo_depth - 3" but that I chose "mas->tx_fifo_depth()"
> because I was worried about decreased bandwidth.
>
> Since that time:
> * All the interconnect patches have landed, making things run at the
>    proper speed.
> * I've done more measurements.
>
> This lets me confirm that there's really no downside of using the FIFO
> more.  Specifically I did "flashrom -p ec -r /tmp/foo.bin" on a
> Chromebook and averaged over several runs.
>
> Before: It took 6.66 seconds and 59669 interrupts fired.
> After:  It took 6.66 seconds and 47992 interrupts fired.

Reviewed-by: Akash Asthana <akashast@codeaurora.org>

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,\na Linux Foundation Collaborative Project


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again
  2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
  2020-09-12 23:01   ` Bjorn Andersson
@ 2020-09-15 10:40   ` Akash Asthana
  1 sibling, 0 replies; 17+ messages in thread
From: Akash Asthana @ 2020-09-15 10:40 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: swboyd, Andy Gross, Bjorn Andersson, linux-arm-msm, linux-kernel,
	linux-spi


On 9/13/2020 2:38 AM, Douglas Anderson wrote:
> We always toggle the chip select manually in spi-geni-qcom so that we
> can properly implement the Linux API.  There's no reason to program
> this to the hardware on every transfer.  Program it once at init and
> be done with it.
>
> This saves some part of a microsecond of overhead on each transfer.
> While not really noticeable on any real world benchmarks, we might as
> well save the time.

Yeah this is configuration part, can be moved to one time init function, 
as per HPG CS_TOGGLE bit of SPI_TRANS_CFG register is used to instruct 
FW to toggle CS line btw each words. We never came across any 
usecase/slave who needs this.

Reviewed-by: Akash Asthana <akashast@codeaurora.org>

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>   drivers/spi/spi-geni-qcom.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index 7f0bf0dec466..92d88bf85a90 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -290,6 +290,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
>   {
>   	struct geni_se *se = &mas->se;
>   	unsigned int proto, major, minor, ver;
> +	u32 spi_tx_cfg;
>   
>   	pm_runtime_get_sync(mas->dev);
>   
> @@ -322,6 +323,11 @@ static int spi_geni_init(struct spi_geni_master *mas)
>   
>   	geni_se_select_mode(se, GENI_SE_FIFO);
>   
> +	/* We always control CS manually */
> +	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
> +	spi_tx_cfg &= ~CS_TOGGLE;
> +	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
> +
>   	pm_runtime_put(mas->dev);
>   	return 0;
>   }
> @@ -331,7 +337,7 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>   				u16 mode, struct spi_master *spi)
>   {
>   	u32 m_cmd = 0;
> -	u32 spi_tx_cfg, len;
> +	u32 len;
>   	struct geni_se *se = &mas->se;
>   	int ret;
>   
> @@ -350,7 +356,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>   	spin_lock_irq(&mas->lock);
>   	spin_unlock_irq(&mas->lock);
>   
> -	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
>   	if (xfer->bits_per_word != mas->cur_bits_per_word) {
>   		spi_setup_word_len(mas, mode, xfer->bits_per_word);
>   		mas->cur_bits_per_word = xfer->bits_per_word;
> @@ -364,8 +369,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>   	mas->tx_rem_bytes = 0;
>   	mas->rx_rem_bytes = 0;
>   
> -	spi_tx_cfg &= ~CS_TOGGLE;
> -
>   	if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
>   		len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
>   	else
> @@ -384,7 +387,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>   		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
>   		mas->rx_rem_bytes = xfer->len;
>   	}
> -	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
>   
>   	/*
>   	 * Lock around right before we start the transfer since our

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,\na Linux Foundation Collaborative Project


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-09-15 10:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-12 21:07 [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Douglas Anderson
2020-09-12 21:08 ` [PATCH 2/3] spi: spi-geni-qcom: Don't program CS_TOGGLE again and again Douglas Anderson
2020-09-12 23:01   ` Bjorn Andersson
2020-09-15 10:40   ` Akash Asthana
2020-09-12 21:08 ` [PATCH 3/3] spi: spi-geni-qcom: Slightly optimize setup of bidirectional xfters Douglas Anderson
2020-09-12 22:54   ` Bjorn Andersson
2020-09-13  1:09     ` Doug Anderson
2020-09-13 20:35       ` Doug Anderson
2020-09-13  3:45   ` kernel test robot
2020-09-13  3:45     ` kernel test robot
2020-09-13  7:31   ` kernel test robot
2020-09-13  7:31     ` kernel test robot
2020-09-12 22:53 ` [PATCH 1/3] spi: spi-geni-qcom: Use the FIFO even more Bjorn Andersson
2020-09-13  1:11   ` Doug Anderson
2020-09-13  3:12     ` Bjorn Andersson
2020-09-14 14:52 ` Mark Brown
2020-09-15  7:30 ` Akash Asthana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.