All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
	Mark Rutland <mark.rutland@arm.com>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	"Sudeep\ Holla" <sudeep.holla@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Vikram Sethi <vsethi@nvidia.com>,
	Thierry Reding <treding@nvidia.com>
Subject: Re: [PATCH v4] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
Date: Sat, 18 Mar 2023 09:44:27 +0000	[thread overview]
Message-ID: <87wn3egzr8.wl-maz@kernel.org> (raw)
In-Reply-To: <20230318045812.2043117-1-sdonthineni@nvidia.com>

On Sat, 18 Mar 2023 04:58:12 +0000,
Shanker Donthineni <sdonthineni@nvidia.com> wrote:
> 
> The T241 platform suffers from the T241-FABRIC-4 erratum which causes
> unexpected behavior in the GIC when multiple transactions are received
> simultaneously from different sources. This hardware issue impacts
> NVIDIA server platforms that use more than two T241 chips
> interconnected. Each chip has support for 320 {E}SPIs.
> 
> This issue occurs when multiple packets from different GICs are
> incorrectly interleaved at the target chip. The erratum text below
> specifies exactly what can cause multiple transfer packets susceptible
> to interleaving and GIC state corruption. GIC state corruption can
> lead to a range of problems, including kernel panics, and unexpected
> behavior.
> 
> From the erratum text:
>   "In some cases, inter-socket AXI4 Stream packets with multiple
>   transfers, may be interleaved by the fabric when presented to ARM
>   Generic Interrupt Controller. GIC expects all transfers of a packet
>   to be delivered without any interleaving.
> 
>   The following GICv3 commands may result in multiple transfer packets
>   over inter-socket AXI4 Stream interface:
>    - Register reads from GICD_I* and GICD_N*
>    - Register writes to 64-bit GICD registers other than GICD_IROUTERn*
>    - ITS command MOVALL
> 
>   Multiple commands in GICv4+ utilize multiple transfer packets,
>   including VMOVP, VMOVI, VMAPP, and 64-bit register accesses."
> 
>   This issue impacts system configurations with more than 2 sockets,
>   that require multi-transfer packets to be sent over inter-socket
>   AXI4 Stream interface between GIC instances on different sockets.
>   GICv4 cannot be supported. GICv3 SW model can only be supported
>   with the workaround. Single and Dual socket configurations are not
>   impacted by this issue and support GICv3 and GICv4."
> 
> Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf
> 
> Writing to the chip alias region of the GICD_In{E} registers except
> GICD_ICENABLERn has an equivalent effect as writing to the global
> distributor. The SPI interrupt deactivate path is not impacted by
> the erratum.
> 
> To fix this problem, implement a workaround that ensures read accesses
> to the GICD_In{E} registers are directed to the chip that owns the
> SPI, and disables GICv4.x features for KVM. To simplify code changes,
> the gic_configure_irq() function uses the same alias region for both
> read and write operations to GICD_ICFGR.
> 
> Co-developed-by: Vikram Sethi <vsethi@nvidia.com>
> Signed-off-by: Vikram Sethi <vsethi@nvidia.com>
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> ---
> Changes since v2:
>  - Fix the build issue for the 32bit arch
> Changes since v2:
>  - Add accessors for the SOC-ID version & revision
>  - Include "linux/bitfield.h" and "linux/bits.h" in irq-gic-v3.c
> Changes since v1:
>  - Use SMCCC SOC-ID API for detecting the T241 chip
>  - Implement Marc's suggestions
>  - Edit commit text

You seem to have ignored most of my comments on v2[1] apart from the
SOC_ID stuff. I guess I'll wait for v5...

	M.

[1] https://lore.kernel.org/all/871qlqif9v.wl-maz@kernel.org/

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-03-18  9:45 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-18  4:58 [PATCH v4] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 Shanker Donthineni
2023-03-18  4:58 ` Shanker Donthineni
2023-03-18  9:44 ` Marc Zyngier [this message]
2023-03-18 15:29   ` Shanker Donthineni
2023-03-18 15:29     ` Shanker Donthineni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wn3egzr8.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=sdonthineni@nvidia.com \
    --cc=sudeep.holla@arm.com \
    --cc=tglx@linutronix.de \
    --cc=treding@nvidia.com \
    --cc=vsethi@nvidia.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.