linux-integrity.vger.kernel.org archive mirror
* [PATCH] tpm_tis: fix stall after iowrite*()s
@ 2023-03-23 15:34 Sebastian Andrzej Siewior
  2023-03-30  0:25 ` Jarkko Sakkinen
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-03-23 15:34 UTC (permalink / raw)
  To: linux-integrity
  Cc: Haris Okanovic, Jarkko Sakkinen, Jason Gunthorpe, Peter Huewe,
	Thomas Gleixner

From: Haris Okanovic <haris.okanovic@ni.com>

ioread8() operations to TPM MMIO addresses can stall the CPU when
immediately following a sequence of iowrite*()s to the same region.

For example, cyclictest measures ~400us latency spikes when a non-RT
usermode application communicates with an SPI-based TPM chip (Intel Atom
E3940 system, PREEMPT_RT kernel). The spikes are caused by a
stalling ioread8() operation following a sequence of 30+ iowrite8()s to
the same address. I believe this happens because the write sequence is
buffered (in the CPU or somewhere along the bus), and gets flushed on the
first LOAD instruction (ioread*()) that follows.

The enclosed change appears to fix this issue: read the TPM chip's
access register (status code) after every iowrite*() operation to
amortize the cost of flushing data to the chip across multiple instructions.

Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

I don't know how performance critical this is, or whether it should
therefore be restricted to PREEMPT_RT. This has been in the RT queue
since late 2017 and I have no idea how to deal with it differently/in a
more generic way.
Original thread:
	https://lore.kernel.org/20170804215651.29247-1-haris.okanovic@ni.com

 drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index ed5dabd3c72d6..513e0d1c349a6 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da
 	return container_of(data, struct tpm_tis_tcg_phy, priv);
 }
 
+#ifdef CONFIG_PREEMPT_RT
+/*
+ * Flush previous write operations to the chip so that subsequent
+ * ioread*()s won't stall the CPU.
+ */
+static inline void tpm_tis_flush(void __iomem *iobase)
+{
+	ioread8(iobase + TPM_ACCESS(0));
+}
+#else
+#define tpm_tis_flush(iobase) do { } while (0)
+#endif
+
+static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr)
+{
+	iowrite8(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
+static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
+{
+	iowrite32(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
 static int interrupts = -1;
 module_param(interrupts, int, 0444);
 MODULE_PARM_DESC(interrupts, "Enable interrupts");
@@ -186,12 +211,12 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len,
 	switch (io_mode) {
 	case TPM_TIS_PHYS_8:
 		while (len--)
-			iowrite8(*value++, phy->iobase + addr);
+			tpm_tis_iowrite8(*value++, phy->iobase, addr);
 		break;
 	case TPM_TIS_PHYS_16:
 		return -EINVAL;
 	case TPM_TIS_PHYS_32:
-		iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr);
+		tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr);
 		break;
 	}
 
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] tpm_tis: fix stall after iowrite*()s
  2023-03-23 15:34 [PATCH] tpm_tis: fix stall after iowrite*()s Sebastian Andrzej Siewior
@ 2023-03-30  0:25 ` Jarkko Sakkinen
  2023-04-19 15:41   ` [PATCH v2] " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: Jarkko Sakkinen @ 2023-03-30  0:25 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-integrity, Haris Okanovic, Jason Gunthorpe, Peter Huewe,
	Thomas Gleixner

On Thu, Mar 23, 2023 at 04:34:36PM +0100, Sebastian Andrzej Siewior wrote:
> From: Haris Okanovic <haris.okanovic@ni.com>
> 
> ioread8() operations to TPM MMIO addresses can stall the CPU when
> immediately following a sequence of iowrite*()s to the same region.
> 
> For example, cyclictest measures ~400us latency spikes when a non-RT
> usermode application communicates with an SPI-based TPM chip (Intel Atom
> E3940 system, PREEMPT_RT kernel). The spikes are caused by a
> stalling ioread8() operation following a sequence of 30+ iowrite8()s to
> the same address. I believe this happens because the write sequence is
> buffered (in the CPU or somewhere along the bus), and gets flushed on the
> first LOAD instruction (ioread*()) that follows.
> 
> The enclosed change appears to fix this issue: read the TPM chip's
> access register (status code) after every iowrite*() operation to
> amortize the cost of flushing data to the chip across multiple instructions.
> 
> Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> 
> I don't know how performance critical this is, or whether it should
> therefore be restricted to PREEMPT_RT. This has been in the RT queue
> since late 2017 and I have no idea how to deal with it differently/in a
> more generic way.
> Original thread:
> 	https://lore.kernel.org/20170804215651.29247-1-haris.okanovic@ni.com
> 
>  drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
> index ed5dabd3c72d6..513e0d1c349a6 100644
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da
>  	return container_of(data, struct tpm_tis_tcg_phy, priv);
>  }
>  
> +#ifdef CONFIG_PREEMPT_RT
> +/*
> + * Flush previous write operations to the chip so that subsequent
> + * ioread*()s won't stall the CPU.
> + */

I would replace this with:

/*
 * Flush previous write operations with a dummy read operation to the 
 * TPM MMIO base address.
 */

 I think the rest of the reasoning would be better placed in the
 functions that are the call sites for this helper; here it would make
 more sense to explain what it actually does.

> +static inline void tpm_tis_flush(void __iomem *iobase)
> +{
> +	ioread8(iobase + TPM_ACCESS(0));
> +}
> +#else
> +#define tpm_tis_flush(iobase) do { } while (0)
> +#endif
> +
> +static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr)

/*
 * Write a byte to the TPM MMIO address, and flush the write queue. The
 * flush amortizes the cost of the IO operations, and thus avoids unwanted
 * latency peaks.
 */

> +{
> +	iowrite8(b, iobase + addr);
> +	tpm_tis_flush(iobase);
> +}
> +

/*
 * Write a 32-bit word to the TPM MMIO address, and flush the write queue.
 * The flush amortizes the cost of the IO operations, and thus avoids
 * unwanted latency peaks.
 *

> +static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
> +{
> +	iowrite32(b, iobase + addr);
> +	tpm_tis_flush(iobase);
> +}
> +
>  static int interrupts = -1;
>  module_param(interrupts, int, 0444);
>  MODULE_PARM_DESC(interrupts, "Enable interrupts");
> @@ -186,12 +211,12 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len,
>  	switch (io_mode) {
>  	case TPM_TIS_PHYS_8:
>  		while (len--)
> -			iowrite8(*value++, phy->iobase + addr);
> +			tpm_tis_iowrite8(*value++, phy->iobase, addr);
>  		break;
>  	case TPM_TIS_PHYS_16:
>  		return -EINVAL;
>  	case TPM_TIS_PHYS_32:
> -		iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr);
> +		tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr);
>  		break;
>  	}
>  
> -- 
> 2.40.0
> 

Thanks for picking this up. It is a small code change, but I think it
deserves a bit more documentation, to make sure that the reasoning you
gave is taken into account in future code reviews.

I think that if you append what I suggested above (you can use your
judgement and edit as you will), it should be sufficient.

BR, Jarkko

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2] tpm_tis: fix stall after iowrite*()s
  2023-03-30  0:25 ` Jarkko Sakkinen
@ 2023-04-19 15:41   ` Sebastian Andrzej Siewior
  2023-04-23 15:34     ` Jarkko Sakkinen
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-04-19 15:41 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-integrity, Haris Okanovic, Jason Gunthorpe, Peter Huewe,
	Thomas Gleixner

From: Haris Okanovic <haris.okanovic@ni.com>

ioread8() operations to TPM MMIO addresses can stall the CPU when
immediately following a sequence of iowrite*()s to the same region.

For example, cyclictest measures ~400us latency spikes when a non-RT
usermode application communicates with an SPI-based TPM chip (Intel Atom
E3940 system, PREEMPT_RT kernel). The spikes are caused by a
stalling ioread8() operation following a sequence of 30+ iowrite8()s to
the same address. I believe this happens because the write sequence is
buffered (in the CPU or somewhere along the bus), and gets flushed on the
first LOAD instruction (ioread*()) that follows.

The enclosed change appears to fix this issue: read the TPM chip's
access register (status code) after every iowrite*() operation to
amortize the cost of flushing data to the chip across multiple instructions.

Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
Link: https://lore.kernel.org/r/20230323153436.B2SATnZV@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2:
  - Updated/added comments as per Jarkko Sakkinen.

On 2023-03-30 03:25:34 [+0300], Jarkko Sakkinen wrote:
…
> I would replace this with:
> 
> /*
>  * Flush previous write operations with a dummy read operation to the 
>  * TPM MMIO base address.
>  */
> 
>  I think the rest of the reasoning would be better placed in the
>  functions that are the call sites for this helper; here it would make
>  more sense to explain what it actually does.
> 
> Thanks for picking this up. It is a small code change, but I think it
> deserves a bit more documentation, to make sure that the reasoning you
> gave is taken into account in future code reviews.
> 
> I think that if you append what I suggested above (you can use your
> judgement and edit as you will), it should be sufficient.

Did as asked. However, it is a bit misleading given that the comment
above tpm_tis_iowrite*() describes the flush behaviour which is
conditional on CONFIG_PREEMPT_RT. Do you want this flush to be
unconditional, or are you fine with the way it is?

 drivers/char/tpm/tpm_tis.c |   43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -50,6 +50,45 @@ static inline struct tpm_tis_tcg_phy *to
 	return container_of(data, struct tpm_tis_tcg_phy, priv);
 }
 
+#ifdef CONFIG_PREEMPT_RT
+/*
+ * Flush previous write operations with a dummy read operation to the
+ * TPM MMIO base address.
+ */
+static inline void tpm_tis_flush(void __iomem *iobase)
+{
+	ioread8(iobase + TPM_ACCESS(0));
+}
+#else
+#define tpm_tis_flush(iobase) do { } while (0)
+#endif
+
+/*
+ * Write a byte to the TPM MMIO address, and flush the write queue.
+ * The flush ensures that the data is sent immediately over the bus and not
+ * aggregated with further requests and transferred later in a batch. The large
+ * write requests can lead to unwanted latency spikes by blocking the CPU until
+ * the complete batch has been transferred.
+ */
+static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr)
+{
+	iowrite8(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
+/*
+ * Write a 32-bit word to the TPM MMIO address, and flush the write queue.
+ * The flush ensures that the data is sent immediately over the bus and not
+ * aggregated with further requests and transferred later in a batch. The large
+ * write requests can lead to unwanted latency spikes by blocking the CPU until
+ * the complete batch has been transferred.
+ */
+static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
+{
+	iowrite32(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
 static int interrupts = -1;
 module_param(interrupts, int, 0444);
 MODULE_PARM_DESC(interrupts, "Enable interrupts");
@@ -186,12 +225,12 @@ static int tpm_tcg_write_bytes(struct tp
 	switch (io_mode) {
 	case TPM_TIS_PHYS_8:
 		while (len--)
-			iowrite8(*value++, phy->iobase + addr);
+			tpm_tis_iowrite8(*value++, phy->iobase, addr);
 		break;
 	case TPM_TIS_PHYS_16:
 		return -EINVAL;
 	case TPM_TIS_PHYS_32:
-		iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr);
+		tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr);
 		break;
 	}
 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] tpm_tis: fix stall after iowrite*()s
  2023-04-19 15:41   ` [PATCH v2] " Sebastian Andrzej Siewior
@ 2023-04-23 15:34     ` Jarkko Sakkinen
  0 siblings, 0 replies; 4+ messages in thread
From: Jarkko Sakkinen @ 2023-04-23 15:34 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-integrity, Haris Okanovic, Jason Gunthorpe, Peter Huewe,
	Thomas Gleixner

On Wed Apr 19, 2023 at 6:41 PM EEST, Sebastian Andrzej Siewior wrote:
> From: Haris Okanovic <haris.okanovic@ni.com>
>
> ioread8() operations to TPM MMIO addresses can stall the CPU when
> immediately following a sequence of iowrite*()s to the same region.
>
> For example, cyclictest measures ~400us latency spikes when a non-RT
> usermode application communicates with an SPI-based TPM chip (Intel Atom
> E3940 system, PREEMPT_RT kernel). The spikes are caused by a
> stalling ioread8() operation following a sequence of 30+ iowrite8()s to
> the same address. I believe this happens because the write sequence is
> buffered (in the CPU or somewhere along the bus), and gets flushed on the
> first LOAD instruction (ioread*()) that follows.
>
> The enclosed change appears to fix this issue: read the TPM chip's
> access register (status code) after every iowrite*() operation to
> amortize the cost of flushing data to the chip across multiple instructions.
>
> Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
> Link: https://lore.kernel.org/r/20230323153436.B2SATnZV@linutronix.de
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> v1…v2:
>   - Updated/added comments as per Jarkko Sakkinen.
>
> On 2023-03-30 03:25:34 [+0300], Jarkko Sakkinen wrote:
> …
> > I would replace this with:
> > 
> > /*
> >  * Flush previous write operations with a dummy read operation to the 
> >  * TPM MMIO base address.
> >  */
> > 
> >  I think the rest of the reasoning would be better placed in the
> >  functions that are the call sites for this helper; here it would make
> >  more sense to explain what it actually does.
> …
> > 
> > Thanks for picking this up. It is a small code change, but I think it
> > deserves a bit more documentation, to make sure that the reasoning you
> > gave is taken into account in future code reviews.
> > 
> > I think that if you append what I suggested above (you can use your
> > judgement and edit as you will), it should be sufficient.
>
> Did as asked. However, it is a bit misleading given that the comment
> above tpm_tis_iowrite*() describes the flush behaviour which is
> conditional on CONFIG_PREEMPT_RT. Do you want this flush to be
> unconditional, or are you fine with the way it is?
>
>  drivers/char/tpm/tpm_tis.c |   43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
>
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -50,6 +50,45 @@ static inline struct tpm_tis_tcg_phy *to
>  	return container_of(data, struct tpm_tis_tcg_phy, priv);
>  }
>  
> +#ifdef CONFIG_PREEMPT_RT
> +/*
> + * Flush previous write operations with a dummy read operation to the
> + * TPM MMIO base address.
> + */
> +static inline void tpm_tis_flush(void __iomem *iobase)
> +{
> +	ioread8(iobase + TPM_ACCESS(0));
> +}
> +#else
> +#define tpm_tis_flush(iobase) do { } while (0)
> +#endif
> +
> +/*
> + * Write a byte to the TPM MMIO address, and flush the write queue.
> + * The flush ensures that the data is sent immediately over the bus and not
> + * aggregated with further requests and transferred later in a batch. The large
> + * write requests can lead to unwanted latency spikes by blocking the CPU until
> + * the complete batch has been transferred.
> + */
> +static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr)
> +{
> +	iowrite8(b, iobase + addr);
> +	tpm_tis_flush(iobase);
> +}
> +
> +/*
> + * Write a 32-bit word to the TPM MMIO address, and flush the write queue.
> + * The flush ensures that the data is sent immediately over the bus and not
> + * aggregated with further requests and transferred later in a batch. The large
> + * write requests can lead to unwanted latency spikes by blocking the CPU until
> + * the complete batch has been transferred.
> + */
> +static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
> +{
> +	iowrite32(b, iobase + addr);
> +	tpm_tis_flush(iobase);
> +}
> +
>  static int interrupts = -1;
>  module_param(interrupts, int, 0444);
>  MODULE_PARM_DESC(interrupts, "Enable interrupts");
> @@ -186,12 +225,12 @@ static int tpm_tcg_write_bytes(struct tp
>  	switch (io_mode) {
>  	case TPM_TIS_PHYS_8:
>  		while (len--)
> -			iowrite8(*value++, phy->iobase + addr);
> +			tpm_tis_iowrite8(*value++, phy->iobase, addr);
>  		break;
>  	case TPM_TIS_PHYS_16:
>  		return -EINVAL;
>  	case TPM_TIS_PHYS_32:
> -		iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr);
> +		tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr);
>  		break;
>  	}
>  


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-04-23 15:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-23 15:34 [PATCH] tpm_tis: fix stall after iowrite*()s Sebastian Andrzej Siewior
2023-03-30  0:25 ` Jarkko Sakkinen
2023-04-19 15:41   ` [PATCH v2] " Sebastian Andrzej Siewior
2023-04-23 15:34     ` Jarkko Sakkinen
