From: "Z.q. Hou" <zhiqiang.hou@nxp.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"M.h. Lian" <minghuan.lian@nxp.com>, Roy Zang <roy.zang@nxp.com>,
	Mingkai Hu <mingkai.hu@nxp.com>, Leo Li <leoyang.li@nxp.com>
Subject: RE: [PATCH] PCI: layerscape: Change back to the default error response behavior
Date: Wed, 30 Sep 2020 05:37:25 +0000
Message-ID: <HE1PR0402MB3371F5BAB03DABFBD63A5F3C84330@HE1PR0402MB3371.eurprd04.prod.outlook.com>
In-Reply-To: <20200929150252.GA2540544@bjorn-Precision-5520>

Hi Bjorn,

Thanks a lot for your comments!

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: September 29, 2020 23:03
> To: Z.q. Hou <zhiqiang.hou@nxp.com>
> Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org; lorenzo.pieralisi@arm.com;
> robh@kernel.org; bhelgaas@google.com; M.h. Lian
> <minghuan.lian@nxp.com>; Roy Zang <roy.zang@nxp.com>; Mingkai Hu
> <mingkai.hu@nxp.com>; Leo Li <leoyang.li@nxp.com>
> Subject: Re: [PATCH] PCI: layerscape: Change back to the default error
> response behavior
> 
> On Tue, Sep 29, 2020 at 09:13:28PM +0800, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> >
> > In the current error response behavior, it will send a SLVERR response
> > to device's internal AXI slave system interface when the PCIe
> > controller experiences an erroneous completion (UR, CA and CT) from an
> > external completer for its outbound non-posted request, which will
> > result in SError and crash the kernel directly.
> 
> Possible wording:
> 
>   As currently configured, when the PCIe controller receives a
>   Completion with UR or CA status, or a Completion Timeout occurs, it
>   sends a SLVERR response to the internal AXI slave system interface,
>   which results in SError and a kernel crash.
> 
> Please add a blank line between paragraphs, and s/This patch change back
> it/Change it/ below.
> 
> > This patch change back it to the default behavior to increase the
> > robustness of the kernel. In the default behavior, it always sends an
> > OKAY response to the internal AXI slave interface when the controller
> > gets these erroneous completions. And the AER driver will report and
> > try to recover these errors.
> 
> This reverts 84d897d69938 ("PCI: layerscape: Change default error response
> behavior"), so please mention that in the commit log, probably as:
> 
> Fixes: 84d897d69938 ("PCI: layerscape: Change default error response
> behavior")
> 
> Maybe it also needs a stable tag, e.g., v4.15+?

Thanks for your good suggestions! Will fix in v2.

> 
> Since this is a pure revert, whatever problem 84d897d69938 fixed must now
> be fixed in some other way.  Otherwise, this revert would just be
> reintroducing the problem fixed by 84d897d69938.
> 
> This commit log should mention that what that other fix is.
> 
> AER is only a reporting mechanism, it is asynchronous to the instruction
> stream, and it's optional (may not be implemented in the hardware, and may
> not be supported by the kernel), so I'm not super convinced that it can be the
> answer to this problem.
>

Commit 84d897d69938 ("PCI: layerscape: Change default error response
behavior") doesn't fix any issue. It just enables a feature of the
DesignWare PCIe IP that forwards error responses to the AXI slave
interface, and this feature is not enabled on any other platform that
uses the DWC IP. As mentioned in that commit, the default is to send an
OKAY response to the AXI slave interface for erroneous completions of
non-posted transactions, which include both CFG and MEM_rd transactions,
so enabling the feature makes both kinds of access abort. However,
upstream won't support platforms that abort on CFG accesses, so we have
to change back to the default error response behavior and accept that
MEM_rd errors aren't forwarded, just like on other DWC IP platforms.

As I recall, the SError interrupt is also an asynchronous abort, and it
too is only a reporting mechanism. Unlike AER, it crashes the kernel. So
neither of these two mechanisms can ensure data integrity. Generally the
upper-layer data transfer protocol has its own mechanism to ensure data
integrity, so this is not an issue for most users. Anyone who really
wants a kernel crash on a MEM_rd error can enable this in their local
code.
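
For reference, such a local change would amount to restoring the single
register write that the patch removes. Below is a minimal userspace
sketch of that write, not kernel code: the register offset and value are
copied from the removed lines, while fake_dbi, mock_iowrite32(),
mock_ioread32() and local_enable_error_forwarding() are hypothetical
names used only to make the example self-contained.

```c
#include <stdint.h>
#include <string.h>

/* Offset and value taken from the lines removed by the patch. */
#define PCIE_ABSERR		0x8d0  /* Bridge Slave Error Response Register */
#define PCIE_ABSERR_SETTING	0x9401 /* Forward error of non-posted request */

/* Hypothetical stand-in for the controller's DBI register space. */
static uint8_t fake_dbi[0x1000];

/* Hypothetical userspace substitute for the kernel's iowrite32(). */
static void mock_iowrite32(uint32_t val, void *addr)
{
	memcpy(addr, &val, sizeof(val));
}

/* Hypothetical userspace substitute for the kernel's ioread32(). */
static uint32_t mock_ioread32(const void *addr)
{
	uint32_t val;

	memcpy(&val, addr, sizeof(val));
	return val;
}

/* Forward error responses of outbound non-posted requests again. */
static void local_enable_error_forwarding(uint8_t *dbi_base)
{
	mock_iowrite32(PCIE_ABSERR_SETTING, dbi_base + PCIE_ABSERR);
}
```

In the real driver the equivalent write goes through pci->dbi_base in
ls_pcie_host_init(), as in the lines removed below.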

Thanks,
Zhiqiang
 
> > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > ---
> >  drivers/pci/controller/dwc/pci-layerscape.c | 11 -----------
> >  1 file changed, 11 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
> > index f24f79a70d9a..e92ab8a77046 100644
> > --- a/drivers/pci/controller/dwc/pci-layerscape.c
> > +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> > @@ -30,8 +30,6 @@
> >
> >  /* PEX Internal Configuration Registers */
> >  #define PCIE_STRFMR1		0x71c /* Symbol Timer & Filter Mask Register1 */
> > -#define PCIE_ABSERR		0x8d0 /* Bridge Slave Error Response Register */
> > -#define PCIE_ABSERR_SETTING	0x9401 /* Forward error of non-posted request */
> >
> >  #define PCIE_IATU_NUM		6
> >
> > @@ -123,14 +121,6 @@ static int ls_pcie_link_up(struct dw_pcie *pci)
> >  	return 1;
> >  }
> >
> > -/* Forward error response of outbound non-posted requests */
> > -static void ls_pcie_fix_error_response(struct ls_pcie *pcie)
> > -{
> > -	struct dw_pcie *pci = pcie->pci;
> > -
> > -	iowrite32(PCIE_ABSERR_SETTING, pci->dbi_base + PCIE_ABSERR);
> > -}
> > -
> >  static int ls_pcie_host_init(struct pcie_port *pp)
> >  {
> >  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> > @@ -142,7 +132,6 @@ static int ls_pcie_host_init(struct pcie_port *pp)
> >  	 * dw_pcie_setup_rc() will reconfigure the outbound windows.
> >  	 */
> >  	ls_pcie_disable_outbound_atus(pcie);
> > -	ls_pcie_fix_error_response(pcie);
> >
> >  	dw_pcie_dbi_ro_wr_en(pci);
> >  	ls_pcie_clear_multifunction(pcie);
> > --
> > 2.17.1
> >
