From: "Pali Rohár" <pali@kernel.org>
To: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Rob Herring <robh@kernel.org>,
	Gregory Clement <gregory.clement@bootlin.com>
Cc: "Marek Behún" <kabel@kernel.org>,
	"Remi Pommarel" <repk@triplefau.lt>, Xogium <contact@xogium.me>,
	"Tomasz Maciej Nowak" <tmn505@gmail.com>,
	"Nadav Haklai" <nadavh@marvell.com>,
	"Kostya Porotchkin" <kostap@marvell.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RESEND PATCH 5/5] PCI: aardvark: Implement workaround for PCIe Completion Timeout
Date: Wed, 8 Sep 2021 21:42:57 +0200
Message-ID: <20210908194257.nhn4iexqkzdv7ijz@pali>
In-Reply-To: <20210825195953.7ylfnplurhfixabg@pali>

On Wednesday 25 August 2021 21:59:53 Pali Rohár wrote:
> On Friday 25 June 2021 00:26:21 Pali Rohár wrote:
> > Marvell Armada 3700 Functional Errata, Guidelines, and Restrictions
> > document describes in erratum 3.12 PCIe Completion Timeout (Ref #: 251),
> > that PCIe IP does not support a strong-ordered model for inbound posted vs.
> > outbound completion.
> > 
> > As a workaround for this erratum, DIS_ORD_CHK flag in Debug Mux Control
> > register must be set. It disables the ordering check in the core between
> > Completions and Posted requests received from the link.
> > 
> > It was reported that enabling this workaround fixes instability issues and
> > "Unhandled fault" errors when using 60 GHz WiFi 802.11ad card with Qualcomm
> > QCA6335 chip under significant load which were caused by interrupt status
> > stuck in the outbound CMPLT queue traced back to this erratum.
> > 
> > This workaround also fixes a kernel panic triggered after some minutes of
> > using a 5 GHz WiFi 802.11ax card with a Mediatek MT7915 chip:
> > 
> >     Internal error: synchronous external abort: 96000210 [#1] SMP
> >     Kernel panic - not syncing: Fatal exception in interrupt
> > 
> > Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> > Signed-off-by: Pali Rohár <pali@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > This patch was originally written by Thomas and has long been part of the
> > Marvell SDK. I have just rewritten/re-applied it on top of the mainline
> > kernel and also written a new, updated commit message.
> > 
> > Please note that this patch is questionable: Bjorn has some objections,
> > and nobody, including Marvell, has been able to explain the erratum or
> > what exactly the workaround does. Documentation on this topic is basically
> > missing.
> 
> See also https://lore.kernel.org/linux-pci/20210723221710.wtztsrddudnxeoj3@pali/

Hello Lorenzo. For now, please leave just this one patch (5/5) as is, since we
do not know how to proceed with this issue and the question above is still open.

I hope that the Marvell people will respond to the issue above.

The remaining patches in this series are fine.

> > We just know that it fixes real kernel crashes when using WiFi cards.
> > ---
> >  drivers/pci/controller/pci-aardvark.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 9ff68abd8d1e..231f4469d87e 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -167,6 +167,8 @@
> >  #define     LTSSM_L0				0x10
> >  #define     RC_BAR_CONFIG			0x300
> >  #define VENDOR_ID_REG				(LMI_BASE_ADDR + 0x44)
> > +#define DEBUG_MUX_CTRL_REG			(LMI_BASE_ADDR + 0x208)
> > +#define     DIS_ORD_CHK				BIT(30)
> >  
> >  /* PCIe core controller registers */
> >  #define CTRL_CORE_BASE_ADDR			0x18000
> > @@ -450,6 +452,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> >  		PCIE_CORE_CTRL2_TD_ENABLE;
> >  	advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);
> >  
> > +	/* Disable ordering checks, workaround for erratum 3.12 "PCIe completion timeout" */
> > +	reg = advk_readl(pcie, DEBUG_MUX_CTRL_REG);
> > +	reg |= DIS_ORD_CHK;
> > +	advk_writel(pcie, reg, DEBUG_MUX_CTRL_REG);
> > +
> >  	/* Set lane X1 */
> >  	reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
> >  	reg &= ~LANE_CNT_MSK;
> > -- 
> > 2.20.1
> > 
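
For readers following the diff, the register update in the quoted hunk is a
plain read-modify-write that sets one bit and preserves the rest. Below is a
minimal standalone sketch of that pattern, not the driver code itself. It
assumes a hypothetical in-memory array in place of the real MMIO window;
fake_lmi, fake_readl, fake_writel and set_dis_ord_chk are illustrative names
and do not exist in pci-aardvark.c. Only the 0x208 offset (relative to
LMI_BASE_ADDR) and BIT(30) are taken from the patch.

```c
#include <stdint.h>

#define BIT(n)              (UINT32_C(1) << (n))
#define DEBUG_MUX_CTRL_OFF  0x208    /* offset from LMI_BASE_ADDR, from the patch */
#define DIS_ORD_CHK         BIT(30)  /* bit position, from the patch */

/* Hypothetical stand-in for the LMI register window (plain memory, not MMIO). */
static uint32_t fake_lmi[0x400 / 4];

/* Illustrative accessors mirroring the readl/writel argument order. */
static uint32_t fake_readl(uint32_t off)
{
	return fake_lmi[off / 4];
}

static void fake_writel(uint32_t val, uint32_t off)
{
	fake_lmi[off / 4] = val;
}

/* Read-modify-write: set DIS_ORD_CHK and leave every other bit untouched. */
static void set_dis_ord_chk(void)
{
	uint32_t reg = fake_readl(DEBUG_MUX_CTRL_OFF);

	reg |= DIS_ORD_CHK;
	fake_writel(reg, DEBUG_MUX_CTRL_OFF);
}
```

Calling set_dis_ord_chk() on a register that already holds other bits keeps
those bits, and calling it a second time changes nothing, so the sequence is
safe to repeat whenever the hardware setup path runs again.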

