Date: Wed, 19 Apr 2023 17:41:30 +0200
From: Sebastian Andrzej Siewior
To: Jarkko Sakkinen
Cc: linux-integrity@vger.kernel.org, Haris Okanovic, Jason Gunthorpe,
 Peter Huewe, Thomas Gleixner
Subject: [PATCH v2] tpm_tis: fix stall after iowrite*()s
Message-ID: <20230419154130.b392MbTl@linutronix.de>
References: <20230323153436.B2SATnZV@linutronix.de>
 <20230330002534.teqpltcmpkdms72t@kernel.org>
In-Reply-To: <20230330002534.teqpltcmpkdms72t@kernel.org>

From: Haris Okanovic

ioread8() operations to TPM MMIO addresses can stall the CPU when they
immediately follow a sequence of iowrite*()s to the same region.

For example, cyclictest measures ~400us latency spikes when a non-RT
usermode application communicates with an SPI-based TPM chip (Intel Atom
E3940 system, PREEMPT_RT kernel). The spikes are caused by a stalling
ioread8() operation following a sequence of 30+ iowrite8()s to the same
address. I believe this happens because the write sequence is buffered
(in the CPU or somewhere along the bus) and gets flushed on the first
LOAD instruction (ioread*()) that follows.

The enclosed change appears to fix this issue: read the TPM chip's
access register (status code) after every iowrite*() operation to
amortize the cost of flushing data to the chip across multiple
instructions.
Signed-off-by: Haris Okanovic
Link: https://lore.kernel.org/r/20230323153436.B2SATnZV@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior
---
v1…v2:
 - Updated/added comments as per Jarkko Sakkinen.

On 2023-03-30 03:25:34 [+0300], Jarkko Sakkinen wrote:
…
> I would replace this with:
> 
> /*
>  * Flush previous write operations with a dummy read operation to the
>  * TPM MMIO base address.
>  */
> 
> I think the rest of the reasoning would be better placed in the functions
> which are call sites for this helper, and here it would make more
> sense to explain what it actually does.
…
> 
> Thanks for catching this. It is a small code change but I think that it
> would deserve just a bit more documentation, as it would make sure that
> the reasoning you gave is taken into account in future code reviews.
> 
> I think that if you append what I suggested above (you can use your
> judgement and edit as you will), it should be sufficient.

Did as asked. However, it is a bit misleading given that the comment
above tpm_tis_iowrite*() describes the flush behaviour, which is
conditional on CONFIG_PREEMPT_RT. Do you want this flush done
unconditionally, or are you fine with the way it is?

 drivers/char/tpm/tpm_tis.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 2 deletions(-)

--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -50,6 +50,45 @@ static inline struct tpm_tis_tcg_phy *to
 	return container_of(data, struct tpm_tis_tcg_phy, priv);
 }
 
+#ifdef CONFIG_PREEMPT_RT
+/*
+ * Flush previous write operations with a dummy read operation to the
+ * TPM MMIO base address.
+ */
+static inline void tpm_tis_flush(void __iomem *iobase)
+{
+	ioread8(iobase + TPM_ACCESS(0));
+}
+#else
+#define tpm_tis_flush(iobase) do { } while (0)
+#endif
+
+/*
+ * Write a byte to the TPM MMIO address, and flush the write queue.
+ * The flush ensures that the data is sent immediately over the bus and not
+ * aggregated with further requests and transferred later in a batch. The large
+ * write requests can lead to unwanted latency spikes by blocking the CPU until
+ * the complete batch has been transferred.
+ */
+static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr)
+{
+	iowrite8(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
+/*
+ * Write a 32-bit word to the TPM MMIO address, and flush the write queue.
+ * The flush ensures that the data is sent immediately over the bus and not
+ * aggregated with further requests and transferred later in a batch. The large
+ * write requests can lead to unwanted latency spikes by blocking the CPU until
+ * the complete batch has been transferred.
+ */
+static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
+{
+	iowrite32(b, iobase + addr);
+	tpm_tis_flush(iobase);
+}
+
 static int interrupts = -1;
 module_param(interrupts, int, 0444);
 MODULE_PARM_DESC(interrupts, "Enable interrupts");
@@ -186,12 +225,12 @@ static int tpm_tcg_write_bytes(struct tp
 	switch (io_mode) {
 	case TPM_TIS_PHYS_8:
 		while (len--)
-			iowrite8(*value++, phy->iobase + addr);
+			tpm_tis_iowrite8(*value++, phy->iobase, addr);
 		break;
 	case TPM_TIS_PHYS_16:
 		return -EINVAL;
 	case TPM_TIS_PHYS_32:
-		iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr);
+		tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr);
 		break;
 	}
 
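
For readers following along outside the thread: the pattern in the patch
is the classic posted-write flush. Below is a minimal standalone sketch
of the same idea, with hypothetical names (my_write8_flushed, MY_STATUS);
it is not the patch itself, just an illustration assuming any read-only
register on the same device can serve as the flush target:

	#include <linux/io.h>
	#include <linux/types.h>

	/* Hypothetical read-only status register at offset 0. */
	#define MY_STATUS	0x00

	/*
	 * MMIO writes are posted: the CPU may continue before the write
	 * reaches the device, and the interconnect may batch consecutive
	 * writes. A dummy read from the same device forces completion of
	 * all prior posted writes, so each write is flushed individually
	 * instead of the whole batch stalling a later load.
	 */
	static inline void my_write8_flushed(u8 b, void __iomem *base, u32 addr)
	{
		iowrite8(b, base + addr);
		(void)ioread8(base + MY_STATUS);	/* read-back flush */
	}

Whether to pay the flush cost on every write unconditionally, or only
under PREEMPT_RT as the patch does, is exactly the open question posed
above: the read-back trades raw throughput for bounded latency.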