From: Bjorn Helgaas <helgaas@kernel.org>
To: Huacai Chen <chenhuacai@loongson.cn>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, Xuefeng Li <lixuefeng@loongson.cn>,
	Huacai Chen <chenhuacai@gmail.com>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>
Subject: Re: [PATCH V3 3/4] PCI: Improve the MRRS quirk for LS7A
Date: Sat, 26 Jun 2021 10:48:37 -0500
Message-ID: <20210626154837.GA3738258@bjorn-Precision-5520>
In-Reply-To: <20210625222204.GA3657225@bjorn-Precision-5520>

On Fri, Jun 25, 2021 at 05:22:04PM -0500, Bjorn Helgaas wrote:
> On Fri, Jun 25, 2021 at 05:30:29PM +0800, Huacai Chen wrote:
> > In new revisions of LS7A, some PCIe ports support MRRS values larger
> > than 256, but their maximum supported MRRS values are not detectable.
> > Moreover, the current loongson_mrrs_quirk() cannot prevent devices from
> > increasing their MRRS after pci_enable_device(), and some drivers (e.g.
> > Realtek 8169) do set a large value. So the only possible way is to
> > configure the MRRS of all devices in BIOS, and add a PCI device flag (i.e.,
> > PCI_DEV_FLAGS_NO_INCREASE_MRRS) to stop MRRS from being increased.
> > 
> > However, according to the PCIe spec, it is legal for an OS to program any
> > value for MRRS, and it is also legal for an endpoint to generate a Read
> > Request with any size up to its MRRS. As the hardware engineers say,
> > the root cause here is that LS7A doesn't break up large read requests
> > (yes, that is a problem in the LS7A design).
> 
> "LS7A doesn't break up large read requests" claims to be a root cause,
> but you haven't yet said what the actual *problem* is.
> 
> Is the problem that an endpoint reports a malformed TLP because it
> received a completion bigger than it can handle?  Is it that the LS7A
> root port reports some kind of error if it receives a Memory Read
> request with a size that's "too big"?  Maybe the LS7A doesn't know
> what to do when it receives a Memory Read request with MRRS > MPS?
> What exactly happens when the problem occurs?
> 
> MRRS applies only to the read request.  It is not directly related to
> the size of the completions that carry the data back to the device
> (except that obviously you shouldn't get a completion larger than the
> read you requested).
> 
> The setting that directly controls the size of completions is MPS
> (Max_Payload_Size).  One reason to break up read requests is that
> the endpoint's buffers can't accommodate big TLPs.  One way to deal
> with that is to set MPS in the hierarchy to a smaller value.  Then the
> root port must ensure that no TLP exceeds MPS, regardless of
> what the MRRS in the read request was.
> 
> For example, if the endpoint's MRRS=4096 and the hierarchy's MPS=128,
> it's up to the root port to break up completions into 128-byte chunks.
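
To make the arithmetic in that example concrete, here is a minimal
sketch in plain user-space C (not kernel code).  The helper name
min_completions() is made up for illustration.  It only gives a lower
bound on the number of completion TLPs, assuming each completion
carries at most MPS bytes; a real root port may split the data further,
e.g. at Read Completion Boundaries.

```c
#include <assert.h>

/*
 * Hypothetical helper: lower bound on the number of completion TLPs
 * needed to return read_bytes of data when no completion payload may
 * exceed mps bytes.  This is just a rounded-up division.
 */
static unsigned int min_completions(unsigned int read_bytes, unsigned int mps)
{
	assert(mps != 0);	/* guard against division by zero */
	return (read_bytes + mps - 1) / mps;
}
```

With the numbers above, min_completions(4096, 128) evaluates to 32,
i.e. a single 4096-byte read comes back as at least 32 completions.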
> 
> It's also possible to set the endpoint's MRRS=128, which means reads
> to main memory will never receive completions larger than 128 bytes.
> But it does NOT guarantee that a peer-to-peer DMA from another device
> will be limited to 128 bytes.  The other device is allowed to generate
> Memory Write TLPs with payloads up to its MPS size, and MRRS is not
> involved at all.
> 
> It's not clear yet whether the LS7A problem is with MRRS, with MPS, or
> with some combination.  It's important to understand exactly what is
> broken here so the quirk doesn't get in the way of future changes to
> the generic MRRS and MPS configuration.
> 
> Here's a good overview:
> 
>   https://www.xilinx.com/support/documentation/white_papers/wp350.pdf
> 
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/pci.c    | 5 +++++
> >  drivers/pci/quirks.c | 8 +++++++-
> >  include/linux/pci.h  | 2 ++
> >  3 files changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index b717680377a9..6f0d2f5b6f30 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
> >  
> >  	v = (ffs(rq) - 8) << 12;
> >  
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> > +		if (rq > pcie_get_readrq(dev))
> > +			return -EINVAL;
> > +	}
> > +
> >  	ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
> >  						  PCI_EXP_DEVCTL_READRQ, v);
> >  
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index dee4798a49fc..8284480dc7e4 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -263,7 +263,13 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> >  		 * anything larger than this. So force this limit on
> >  		 * any devices attached under these ports.
> >  		 */
> > -		if (pci_match_id(bridge_devids, bridge)) {
> > +		if (bridge && pci_match_id(bridge_devids, bridge)) {
> > +			dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> > +
> > +			if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> > +			    pcie_bus_config == PCIE_BUS_TUNE_OFF)
> > +				break;

Another approach might be to make a quirk that prevents Linux from
touching MPS and MRRS at all under any circumstances.

You'd have to do this without reference to pcie_bus_config so future
MPS/MRRS algorithm changes wouldn't be affected.  And the quirk bit
would have to be in struct pci_host_bridge, similar to no_ext_tags.
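
A rough user-space sketch of that idea, only to show the shape of the
check.  Everything here is a stand-in: struct host_bridge_model,
struct pci_dev_model, the no_mps_mrrs_tuning bit and model_set_readrq()
are hypothetical names, not existing kernel interfaces, and the choice
of -EPERM is an assumption.  The point is that the decision depends
only on a per-host-bridge bit and never looks at pcie_bus_config.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_host_bridge. */
struct host_bridge_model {
	bool no_mps_mrrs_tuning;	/* assumed quirk bit, similar in spirit to no_ext_tags */
};

/* Hypothetical stand-in for struct pci_dev. */
struct pci_dev_model {
	struct host_bridge_model *bridge;
	int mrrs;			/* current MRRS in bytes, as left by firmware */
};

/* Model of an MRRS setter that honors the host bridge quirk bit. */
static int model_set_readrq(struct pci_dev_model *dev, int rq)
{
	/* MRRS must be a power of two between 128 and 4096 bytes. */
	if (rq < 128 || rq > 4096 || (rq & (rq - 1)))
		return -EINVAL;

	/* Quirked hierarchy: leave the firmware setting alone entirely. */
	if (dev->bridge->no_mps_mrrs_tuning)
		return -EPERM;

	dev->mrrs = rq;
	return 0;
}
```

An MPS setter would need the same check so both settings stay as the
firmware configured them.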

> >  			if (pcie_get_readrq(dev) > 256) {
> >  				pci_info(dev, "limiting MRRS to 256\n");
> >  				pcie_set_readrq(dev, 256);
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 24306504226a..5e0ec3e4318b 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> >  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> >  	/* Don't use Relaxed Ordering for TLPs directed at this device */
> >  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > +	/* Don't increase BIOS's MRRS configuration */
> > +	PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
> >  };
> >  
> >  enum pci_irq_reroute_variant {
> > -- 
> > 2.27.0
> > 
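
For reference, the "v = (ffs(rq) - 8) << 12" line in the quoted
pcie_set_readrq() hunk encodes the byte count into the 3-bit MRRS field
in bits 14:12 of Device Control (128 bytes encodes as 0, 4096 as 5).
Below is a small user-space sketch of that encoding and its inverse.
The helper names and the DEVCTL_READRQ_MASK macro are local to this
example, and it uses the POSIX ffs() from <strings.h> rather than the
kernel's.

```c
#include <assert.h>
#include <strings.h>	/* POSIX ffs() */

#define DEVCTL_READRQ_MASK	0x7000	/* bits 14:12 of Device Control */

/* Encode an MRRS byte count (power of two, 128..4096) into the field. */
static unsigned int mrrs_to_devctl(int rq)
{
	assert(rq >= 128 && rq <= 4096 && (rq & (rq - 1)) == 0);
	return (unsigned int)(ffs(rq) - 8) << 12;
}

/* Decode the field back into a byte count. */
static int devctl_to_mrrs(unsigned int devctl)
{
	return 128 << ((devctl & DEVCTL_READRQ_MASK) >> 12);
}
```

So the 256-byte limit applied by the quirk corresponds to a field value
of 1, i.e. 0x1000 in Device Control.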
