All of lore.kernel.org
 help / color / mirror / Atom feed
* ARM Max Read Req Size and PCIE_BUS_PERFORMANCE stories
@ 2021-08-10 10:40 Krzysztof Hałasa
  2021-08-10 23:27 ` Bjorn Helgaas
  0 siblings, 1 reply; 3+ messages in thread
From: Krzysztof Hałasa @ 2021-08-10 10:40 UTC (permalink / raw)
  To: linux-pci

Hi,

Background: I'm using an ARMv7 (i.MX6) system with an RTL8111 (aka
RTL8169) network interface. By default, the system is using
PCIE_BUS_DEFAULT:

config PCIE_BUS_DEFAULT
          Default choice; ensure that the MPS matches upstream bridge.

and the r8169 driver doesn't work - the RTL chip requests PCIe read
longer than 512 bytes, and the CPU rejects the request.

I've traced the problem to this: (r8169_main.c: rtl_jumbo_config())
	int readrq = 4096;
...
	if (pci_is_pcie(tp->pci_dev) && tp->supports_gmii)
		pcie_set_readrq(tp->pci_dev, readrq);

I've verified that changing the value back to 512 (after r8168 driver
set it to 4096) makes it work again.


We have several PCIE_BUS_* modes, all guarded by CONFIG_EXPERT. I've
verified that PCIE_BUS_PERFORMANCE also fixes the problem. It sets
MaxReadReqSize to MaxPayloadSize which is equal to 128 on i.MX6.
This is, most probably, suboptimal (despite "performance" in the name).


Now, how should it be fixed (so it works by default)?
1. should the drivers be banned from using pcie_set_readrq() etc?
   I believe some chips may require MRRS adjustment by the driver,
   though.
2. should the PCI code limit MRRS to MPS by default?
3. should the PCI code limit MRRS to the maximum safe value (512 on
   this CPU)?

Does hardware like common x86 have a "maximum safe value" (lower than
4096)?

Any other ideas?

i.MX6 details:
There is mysterious CX_REMOTE_RD_REQ_SIZE (CPU design time constant)
and the Remote_Read_Request_Size, a part of PCIE_PL_MRCCR0 register:

"Remote Read Request Size specifies the largest amount of data (bytes)
that will ever be requested (via an inbound MemRd TLP) by a remote
device. Must never be programmed with a value that exceeds the value
represented by the configuration parameter CX_REMOTE_RD_REQ_SIZE as the
Master Response Composer RAM in the AXI bridge is sized using
CX_REMOTE_RD_REQ_SIZE."

Default value is 512 bytes (and works) and while I think it may be
possible to set it to 1024 or even 2048 bytes, it doesn't seem to work.
The "Remote Max Bridge Tag" (which is calculated automatically by the
CPU based on "Remote Read Request Size" changes from 3 to 1 (which may
make sense):

"Remote Read Request Size" vs. "Remote Max Bridge Tag"
 128 13 <<< does that mean 14 simultaneous requests? Or 13?
 256  6
 512  3
1024  1
2048  0 <<< a single request? No requests?
4096 31 <<< apparently some internal logic failure

-- 
Krzysztof "Chris" Hałasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ARM Max Read Req Size and PCIE_BUS_PERFORMANCE stories
  2021-08-10 10:40 ARM Max Read Req Size and PCIE_BUS_PERFORMANCE stories Krzysztof Hałasa
@ 2021-08-10 23:27 ` Bjorn Helgaas
  2021-08-11  6:33   ` Krzysztof Hałasa
  0 siblings, 1 reply; 3+ messages in thread
From: Bjorn Helgaas @ 2021-08-10 23:27 UTC (permalink / raw)
  To: Krzysztof Hałasa
  Cc: linux-pci, Artem Lapkin, Neil Armstrong, Huacai Chen

[+cc Artem, Art, Neil, Huacai]

On Tue, Aug 10, 2021 at 12:40:48PM +0200, Krzysztof Hałasa wrote:
> Hi,
> 
> Background: I'm using an ARMv7 (i.MX6) system with an RTL8111 (aka
> RTL8169) network interface. By default, the system is using
> PCIE_BUS_DEFAULT:
> 
> config PCIE_BUS_DEFAULT
>           Default choice; ensure that the MPS matches upstream bridge.
> 
> and the r8169 driver doesn't work - the RTL chip requests PCIe read
> longer than 512 bytes, and the CPU rejects the request.

Super.  IIUC, i.MX6 is another DWC-based controller, so this looks
like another case of the issue that afflicts amlogic, keystone,
loongson (weirdly apparently *not* DWC-based), meson, and probably
others.

Some previous discussion here:
https://lore.kernel.org/linux-pci/20210707155418.GA897940@bjorn-Precision-5520/

> I've traced the problem to this: (r8169_main.c: rtl_jumbo_config())
> 	int readrq = 4096;
> ...
> 	if (pci_is_pcie(tp->pci_dev) && tp->supports_gmii)
> 		pcie_set_readrq(tp->pci_dev, readrq);
> 
> I've verified that changing the value back to 512 (after r8168 driver
> set it to 4096) makes it work again.

This sounds like a defect in the CPU/PCI host adapter.  If the device
initiates a 4096-byte read and the CPU or whatever can't deal with it,
the host adapter should break it up into whatever the CPU *can*
handle.  I don't think this is the device's problem or the device
driver's problem.

There is no mechanism in the PCIe or ACPI spec for the driver to learn
the maximum MRRS value the CPU can tolerate.

> We have several PCIE_BUS_* modes, all guarded by CONFIG_EXPERT. I've
> verified that PCIE_BUS_PERFORMANCE also fixes the problem. It sets
> MaxReadReqSize to MaxPayloadSize which is equal to 128 on i.MX6.
> This is, most probably, suboptimal (despite "performance" in the name).

Yes.

> Now, how should it be fixed (so it works by default)?
> 1. should the drivers be banned from using pcie_set_readrq() etc?
>    I believe some chips may require MRRS adjustment by the driver,
>    though.

I don't know the details of this particular situation, but ideally,
drivers should not change MRRS or MPS.  If they do, they definitely
need to use a PCI core interface like pcie_set_readrq() because these
are system-level parameters that need to be considered in relation to
other devices in the hierarchy and sometimes to problems in the PCI
host adapter.

> 2. should the PCI code limit MRRS to MPS by default?
> 3. should the PCI code limit MRRS to the maximum safe value (512 on
>    this CPU)?

How do we learn the maximum safe value?  Is this something a native
driver could read from PCIE_PL_MRCCR0 (see below)?

This sounds like a real train wreck if any of these designs want to
use ACPI.  There's nothing in ACPI to communicate a "maximum safe
MRRS" value to the OS.

> Does hardware like common x86 have a "maximum safe value" (lower than
> 4096)?

Not that I'm aware of.

> Any other ideas?
> 
> i.MX6 details:
> There is mysterious CX_REMOTE_RD_REQ_SIZE (CPU design time constant)
> and the Remote_Read_Request_Size, a part of PCIE_PL_MRCCR0 register:
> 
> "Remote Read Request Size specifies the largest amount of data (bytes)
> that will ever be requested (via an inbound MemRd TLP) by a remote
> device. Must never be programmed with a value that exceeds the value
> represented by the configuration parameter CX_REMOTE_RD_REQ_SIZE as the
> Master Response Composer RAM in the AXI bridge is sized using
> CX_REMOTE_RD_REQ_SIZE."
> 
> Default value is 512 bytes (and works) and while I think it may be
> possible to set it to 1024 or even 2048 bytes, it doesn't seem to work.
> The "Remote Max Bridge Tag" (which is calculated automatically by the
> CPU based on "Remote Read Request Size" changes from 3 to 1 (which may
> make sense):
> 
> "Remote Read Request Size" vs. "Remote Max Bridge Tag"
>  128 13 <<< does that mean 14 simultaneous requests? Or 13?
>  256  6
>  512  3
> 1024  1
> 2048  0 <<< a single request? No requests?
> 4096 31 <<< apparently some internal logic failure

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ARM Max Read Req Size and PCIE_BUS_PERFORMANCE stories
  2021-08-10 23:27 ` Bjorn Helgaas
@ 2021-08-11  6:33   ` Krzysztof Hałasa
  0 siblings, 0 replies; 3+ messages in thread
From: Krzysztof Hałasa @ 2021-08-11  6:33 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Artem Lapkin, Neil Armstrong, Huacai Chen

Hi Bjorn,

Bjorn Helgaas <helgaas@kernel.org> writes:

> Super.  IIUC, i.MX6 is another DWC-based controller, so this looks
> like another case of the issue that afflicts amlogic, keystone,
> loongson (weirdly apparently *not* DWC-based), meson, and probably
> others.
>
> Some previous discussion here:
> https://lore.kernel.org/linux-pci/20210707155418.GA897940@bjorn-Precision-5520/

Ok. So I guess the fix for i.MX6 is a simple PCI fixup, limiting MRRS to
512 (while MPS = 128).

Now the Kconfig states that MRRS=MPS is the "best performance" case. Is
it really true? One could think requesting 512 bytes (memory read)
completed with 4 response packets 128 bytes each could be faster than
4 requests and 4 responses. Various docs suggest that lowering MRRS is
good for "interactivity", but not exactly for throughput.
Should I do some benchmarking?

> This sounds like a defect in the CPU/PCI host adapter.  If the device
> initiates a 4096-byte read and the CPU or whatever can't deal with it,
> the host adapter should break it up into whatever the CPU *can*
> handle.  I don't think this is the device's problem or the device
> driver's problem.

As I see it, this is not a CPU (ARM core, IXI bus etc) problem. This is
DWC PCIe thing, and the cause is apparently the size of its buffers.
That's why the max number of tags (which I assume is a number of
individual read requests) depends on the MRRS size.

The PCIe 3 docs state:
Max_Read_Request_Size – This field sets the maximum Read
Request size for the Function as a Requester. The Function
must not generate Read Requests with a size exceeding the set
value.

Not sure about the "set value", but perhaps a limit (lower than 4096) is
permitted - e.g. for the whole system or bus.

The i.MX6 errata list states:
"ERR003754 PCIe: 9000403702—AHB/AXI Bridge Master responds with UR
status instead of CA status for inbound MRd requesting greater than
CX_REMOTE_RD_REQ_SIZE

Description:
         The AHB/AXI Bridge RAM is sized at configuration time to support inbound read requests with a
         maximum size of CX_REMOTE_RD_REQ_SIZE. When this limit is violated the core responds
         with UR status, when it should respond with CA status."

The above suggests that aborting an oversized read request is ok. It
should be done with a CA (Completer Abort) code rather than UR
(Unsupported Request), but that's just a difference in the error code.

>> 2. should the PCI code limit MRRS to MPS by default?
>> 3. should the PCI code limit MRRS to the maximum safe value (512 on
>>    this CPU)?
>
> How do we learn the maximum safe value?  Is this something a native
> driver could read from PCIE_PL_MRCCR0 (see below)?

It will read "512" by default (unless maybe some boot loader etc.
changed it).
I think, for these particular SoCs, a fixed 512 would do.
But perhaps we could experiment with larger values, which need to be
*written* to this register before use.

I will think about it.
-- 
Krzysztof "Chris" Hałasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-11  6:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-10 10:40 ARM Max Read Req Size and PCIE_BUS_PERFORMANCE stories Krzysztof Hałasa
2021-08-10 23:27 ` Bjorn Helgaas
2021-08-11  6:33   ` Krzysztof Hałasa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.