linux-pci.vger.kernel.org archive mirror
* RPi4 can't deal with 64 bit PCI accesses
@ 2021-02-22 15:47 Nicolas Saenz Julienne
  2021-02-22 16:18 ` Ard Biesheuvel
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Nicolas Saenz Julienne @ 2021-02-22 15:47 UTC (permalink / raw)
  To: linux-pci, linux-arm-kernel, linux-rpi-kernel, devicetree
  Cc: Florian Fainelli, Rob Herring, Bjorn Helgaas,
	bcm-kernel-feedback-list, Robin Murphy

Hi everyone,
Raspberry Pi 4, a 64-bit Arm system on chip, contains a PCIe bus that can't
handle 64-bit accesses to its MMIO address space. In other words, writeq() has
to be split into two distinct writel() operations. This isn't ideal, as it
misrepresents PCI's promise of being able to treat device memory as regular
memory, ultimately breaking a bunch of PCI device drivers[1].

I'd like to have a go at fixing this in a way that can be distributed in a
generic distro without prejudice to other users.

AFAIK there is no way to detect this limitation through generic PCIe
capabilities, so one solution would be to expose it through firmware
(devicetree in this case) and pass the limitation through 'struct device', so
that drivers can choose the right access method in a way that doesn't
affect performance much[2]. All in all, most of this doesn't need to be
PCI-centric, as the property could be applied to any MMIO bus.

Thoughts? Opinions? Is it overkill just for a single SoC?

Regards,
Nicolas

[1] https://github.com/raspberrypi/linux/issues/4158#issuecomment-782351510
[2] Things might get even weirder as the order in which the 32bit operations
    are performed might matter (low/high vs high/low).



* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-22 15:47 RPi4 can't deal with 64 bit PCI accesses Nicolas Saenz Julienne
@ 2021-02-22 16:18 ` Ard Biesheuvel
  2021-02-22 16:36   ` Nicolas Saenz Julienne
  2021-02-22 16:56 ` Robin Murphy
  2021-02-22 17:55 ` Russell King - ARM Linux admin
  2 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2021-02-22 16:18 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: linux-pci, linux-arm-kernel, linux-rpi-kernel, devicetree,
	Rob Herring, Florian Fainelli, Bjorn Helgaas, Robin Murphy,
	bcm-kernel-feedback-list

On Mon, 22 Feb 2021 at 16:48, Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi everyone,
> Raspberry Pi 4, a 64bit arm system on chip, contains a PCIe bus that can't
> handle 64bit accesses to its MMIO address space, in other words, writeq() has
> to be split into two distinct writel() operations. This isn't ideal, as it
> misrepresents PCI's promise of being able to treat device memory as regular
> memory, ultimately breaking a bunch of PCI device drivers[1].
>
> I'd like to have a go at fixing this in a way that can be distributed in a
> generic distro without prejudice to other users.
>
> AFAIK there is no way to detect this limitation through generic PCIe
> capabilities, so one solution would be to expose it through firmware
> (devicetree in this case), and pass the limitations through 'struct device' so
> as for the drivers to choose the right access method in a way that doesn't
> affect performance much[2]. All in all, most of this doesn't need to be
> PCI-centric as the property could be applied to any MMIO bus.
>
> Thoughts? Opinions? Is it overkill just for a single SoC?
>

Hi Nicolas,

How does this issue manifest itself? There are other PCIe RC
implementations suffering from the same issue, and some of the drivers
in Linux already work around this, by using split accesses. Look at
this one, for instance:

a310acd7a7ea ("NVMe: use split lo_hi_{read,write}q")

which switches NVMe to lo_hi_readq, which appears to be used in quite
a few other places as well.
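
For readers unfamiliar with the split accessors, here is a minimal userspace
sketch of what a low-then-high split access does. It assumes a little-endian
register layout, and the sim_* names are hypothetical stand-ins invented for
this illustration; the kernel's own helpers live in
<linux/io-64-nonatomic-lo-hi.h> and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for 32-bit MMIO accessors, for illustration only. */
static inline uint32_t sim_readl(const volatile uint32_t *addr)
{
	return *addr;
}

static inline void sim_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Read a 64-bit register as two 32-bit reads, low word first. */
static inline uint64_t sim_lo_hi_readq(const volatile uint32_t *reg)
{
	assert(((uintptr_t)reg & 3) == 0);	/* each half must be 4-byte aligned */
	uint32_t lo = sim_readl(reg);
	uint32_t hi = sim_readl(reg + 1);

	return ((uint64_t)hi << 32) | lo;
}

/* Write a 64-bit register as two 32-bit writes, low word first. */
static inline void sim_lo_hi_writeq(uint64_t val, volatile uint32_t *reg)
{
	assert(((uintptr_t)reg & 3) == 0);
	sim_writel((uint32_t)val, reg);
	sim_writel((uint32_t)(val >> 32), reg + 1);
}
```

The device sees two independent 32-bit transactions, so nothing here is
atomic from its point of view.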

-- 
Ard.


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-22 16:18 ` Ard Biesheuvel
@ 2021-02-22 16:36   ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Saenz Julienne @ 2021-02-22 16:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-pci, linux-arm-kernel, linux-rpi-kernel, devicetree,
	Rob Herring, Florian Fainelli, Bjorn Helgaas, Robin Murphy,
	bcm-kernel-feedback-list

On Mon, 2021-02-22 at 17:18 +0100, Ard Biesheuvel wrote:
> On Mon, 22 Feb 2021 at 16:48, Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > 
> > Hi everyone,
> > Raspberry Pi 4, a 64bit arm system on chip, contains a PCIe bus that can't
> > handle 64bit accesses to its MMIO address space, in other words, writeq() has
> > to be split into two distinct writel() operations. This isn't ideal, as it
> > misrepresents PCI's promise of being able to treat device memory as regular
> > memory, ultimately breaking a bunch of PCI device drivers[1].
> > 
> > I'd like to have a go at fixing this in a way that can be distributed in a
> > generic distro without prejudice to other users.
> > 
> > AFAIK there is no way to detect this limitation through generic PCIe
> > capabilities, so one solution would be to expose it through firmware
> > (devicetree in this case), and pass the limitations through 'struct device' so
> > as for the drivers to choose the right access method in a way that doesn't
> > affect performance much[2]. All in all, most of this doesn't need to be
> > PCI-centric as the property could be applied to any MMIO bus.
> > 
> > Thoughts? Opinions? Is it overkill just for a single SoC?
> > 
> 
> Hi Nicolas,
> 
> How does this issue manifest itself? There are other PCIe RC

Only the low bits would get written/read. As for the high bits, I can't recall
whether they were corrupted or simply ignored (I experienced this some time ago
while bringing up RPi's PCIe in U-Boot).

> implementations suffering from the same issue, and some of the drivers
> in Linux already work around this, by using split accesses. Look at
> this one, for instance:
> 
> a310acd7a7ea ("NVMe: use split lo_hi_{read,write}q")
> 
> which switches NVMe to lo_hi_readq, which appears to be used in quite
> a few other places as well.

Indeed, XHCI does this unconditionally too. But I figured forcing the split on all
drivers wouldn't be a very popular solution just for RPi's 'faulty' bus. But if
it turns out to be a common problem, I guess it isn't such a bad idea.

Regards,
Nicolas



* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-22 15:47 RPi4 can't deal with 64 bit PCI accesses Nicolas Saenz Julienne
  2021-02-22 16:18 ` Ard Biesheuvel
@ 2021-02-22 16:56 ` Robin Murphy
  2021-02-24 16:55   ` Florian Fainelli
  2021-02-22 17:55 ` Russell King - ARM Linux admin
  2 siblings, 1 reply; 13+ messages in thread
From: Robin Murphy @ 2021-02-22 16:56 UTC (permalink / raw)
  To: Nicolas Saenz Julienne, linux-pci, linux-arm-kernel,
	linux-rpi-kernel, devicetree
  Cc: Rob Herring, Florian Fainelli, Bjorn Helgaas, Robin Murphy,
	bcm-kernel-feedback-list

On 2021-02-22 15:47, Nicolas Saenz Julienne wrote:
> Hi everyone,
> Raspberry Pi 4, a 64bit arm system on chip, contains a PCIe bus that can't
> handle 64bit accesses to its MMIO address space, in other words, writeq() has
> to be split into two distinct writel() operations. This isn't ideal, as it
> misrepresents PCI's promise of being able to treat device memory as regular
> memory, ultimately breaking a bunch of PCI device drivers[1].
> 
> I'd like to have a go at fixing this in a way that can be distributed in a
> generic distro without prejudice to other users.
> 
> AFAIK there is no way to detect this limitation through generic PCIe
> capabilities, so one solution would be to expose it through firmware
> (devicetree in this case), and pass the limitations through 'struct device' so
> as for the drivers to choose the right access method in a way that doesn't
> affect performance much[2]. All in all, most of this doesn't need to be
> PCI-centric as the property could be applied to any MMIO bus.

It is indeed something that people can get wrong with internal buses as 
well - for example commit f2d9848aeb9f is such a workaround, also 
conveniently illustrating the case of significant functionality having 
to be disabled where the device *does* require 64-bit atomicity for 
correctness.

Working around kernel I/O accessors is all very well, but another 
concern for PCI in particular is when things like framebuffer memory can 
get mmap'ed into userspace (or even memremap'ed within the kernel). Even 
in AArch32, compiled code may result in 64-bit accesses being generated 
depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc., 
so it's basically not safe to ever let that happen at all.

Robin.

> 
> Thoughts? Opinions? Is it overkill just for a single SoC?
> 
> Regards,
> Nicolas
> 
> [1] https://github.com/raspberrypi/linux/issues/4158#issuecomment-782351510
> [2] Things might get even weirder as the order in which the 32bit operations
>      are performed might matter (low/high vs high/low).
> 
> 


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-22 15:47 RPi4 can't deal with 64 bit PCI accesses Nicolas Saenz Julienne
  2021-02-22 16:18 ` Ard Biesheuvel
  2021-02-22 16:56 ` Robin Murphy
@ 2021-02-22 17:55 ` Russell King - ARM Linux admin
  2 siblings, 0 replies; 13+ messages in thread
From: Russell King - ARM Linux admin @ 2021-02-22 17:55 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: linux-pci, linux-arm-kernel, linux-rpi-kernel, devicetree,
	Rob Herring, Florian Fainelli, Bjorn Helgaas, Robin Murphy,
	bcm-kernel-feedback-list

On Mon, Feb 22, 2021 at 04:47:22PM +0100, Nicolas Saenz Julienne wrote:
> [2] Things might get even weirder as the order in which the 32bit operations
>     are performed might matter (low/high vs high/low).

Note that arm32 does not provide writeq() very purposely because it
is device specific whether writing high-then-low or low-then-high is
the correct approach. See linux/io-64-nonatomic-*.h
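
To illustrate why the order can be device specific, here is a small
userspace sketch. It assumes a purely hypothetical device model that latches
the full 64-bit value whenever the high half is written; the struct and the
sim_* functions are invented for this example and do not describe any real
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device: acts on the value when the HIGH half is written. */
struct sim_dev {
	uint32_t lo;
	uint32_t hi;
	uint64_t latched;	/* value the device actually acts on */
};

/* Device-side model of a single 32-bit write to one half of the register. */
static void sim_dev_write32(struct sim_dev *d, bool high_half, uint32_t val)
{
	assert(d);
	if (high_half) {
		d->hi = val;
		d->latched = ((uint64_t)d->hi << 32) | d->lo;
	} else {
		d->lo = val;
	}
}

/* Low half first: the latch fires after both halves are in place. */
static void sim_lo_hi_write64(struct sim_dev *d, uint64_t val)
{
	sim_dev_write32(d, false, (uint32_t)val);
	sim_dev_write32(d, true, (uint32_t)(val >> 32));
}

/* High half first: the latch fires while the low half is still stale. */
static void sim_hi_lo_write64(struct sim_dev *d, uint64_t val)
{
	sim_dev_write32(d, true, (uint32_t)(val >> 32));
	sim_dev_write32(d, false, (uint32_t)val);
}
```

With this model, low-then-high latches the intended value while
high-then-low latches a stale low half. A device that triggers on the low
half would need the opposite order.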

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-22 16:56 ` Robin Murphy
@ 2021-02-24 16:55   ` Florian Fainelli
  2021-02-24 20:25     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Fainelli @ 2021-02-24 16:55 UTC (permalink / raw)
  To: Robin Murphy, Nicolas Saenz Julienne, linux-pci,
	linux-arm-kernel, linux-rpi-kernel, devicetree
  Cc: Rob Herring, Florian Fainelli, Bjorn Helgaas, Robin Murphy,
	bcm-kernel-feedback-list



On 2/22/2021 8:56 AM, Robin Murphy wrote:
> On 2021-02-22 15:47, Nicolas Saenz Julienne wrote:
>> Hi everyone,
>> Raspberry Pi 4, a 64bit arm system on chip, contains a PCIe bus that
>> can't
>> handle 64bit accesses to its MMIO address space, in other words,
>> writeq() has
>> to be split into two distinct writel() operations. This isn't ideal,
>> as it
>> misrepresents PCI's promise of being able to treat device memory as
>> regular
>> memory, ultimately breaking a bunch of PCI device drivers[1].
>>
>> I'd like to have a go at fixing this in a way that can be distributed
>> in a
>> generic distro without prejudice to other users.
>>
>> AFAIK there is no way to detect this limitation through generic PCIe
>> capabilities, so one solution would be to expose it through firmware
>> (devicetree in this case), and pass the limitations through 'struct
>> device' so
>> as for the drivers to choose the right access method in a way that
>> doesn't
>> affect performance much[2]. All in all, most of this doesn't need to be
>> PCI-centric as the property could be applied to any MMIO bus.
> 
> It is indeed something that people can get wrong with internal buses as
> well - for example commit f2d9848aeb9f is such a workaround, also
> conveniently illustrating the case of significant functionality having
> to be disabled where the device *does* require 64-bit atomicity for
> correctness.
> 
> Working around kernel I/O accessors is all very well, but another
> concern for PCI in particular is when things like framebuffer memory can
> get mmap'ed into userspace (or even memremap'ed within the kernel). Even
> in AArch32, compiled code may result in 64-bit accesses being generated
> depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
> so it's basically not safe to ever let that happen at all.

Agreed, this makes finding a generic solution a tiny bit harder. Do you
have something in mind Nicolas?
-- 
Florian


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-24 16:55   ` Florian Fainelli
@ 2021-02-24 20:25     ` Christoph Hellwig
  2021-02-24 20:35       ` Florian Fainelli
  2021-02-25 10:41       ` David Woodhouse
  0 siblings, 2 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-02-24 20:25 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Robin Murphy, Nicolas Saenz Julienne, linux-pci,
	linux-arm-kernel, linux-rpi-kernel, devicetree, Rob Herring,
	Bjorn Helgaas, Robin Murphy, bcm-kernel-feedback-list

On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
> > Working around kernel I/O accessors is all very well, but another
> > concern for PCI in particular is when things like framebuffer memory can
> > get mmap'ed into userspace (or even memremap'ed within the kernel). Even
> > in AArch32, compiled code may result in 64-bit accesses being generated
> > depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
> > so it's basically not safe to ever let that happen at all.
> 
> Agreed, this makes finding a generic solution a tiny bit harder. Do you
> have something in mind Nicolas?

The only workable solution is a new

bool 64bit_mmio_supported(void)

check that is used like:

	if (64bit_mmio_supported())
		readq(foodev->regs, REG_OFFSET);
	else
		lo_hi_readq(foodev->regs, REG_OFFSET);

where 64bit_mmio_supported() returns false for all 32-bit kernels,
true for all non-broken 64-bit kernels and is an actual function
for arm64 multiplatform builds that include the RPi quirk.

The above would then replace the existing magic from the
<linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
headers.
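
The snippet above is shorthand: a C identifier cannot start with a digit, and
readq() takes a single address rather than a base and an offset. A compilable
userspace sketch of the same idea follows. Everything in it is an assumption
made for illustration: mmio_64bit_supported(), the sim_* helpers and the
access counter are hypothetical names, and the register is modelled as two
32-bit halves rather than real MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated 64-bit register, stored as two 32-bit halves. */
struct sim_reg64 {
	volatile uint32_t lo;
	volatile uint32_t hi;
};

/* Hypothetical platform flag; a real kernel would derive it from a quirk. */
static bool sim_mmio_64bit_ok = true;

/* Counts simulated bus transactions so the chosen path can be observed. */
static unsigned int sim_bus_accesses;

static bool mmio_64bit_supported(void)
{
	return sim_mmio_64bit_ok;
}

/* Models one 64-bit bus transaction. */
static uint64_t sim_readq(const struct sim_reg64 *r)
{
	sim_bus_accesses += 1;
	return ((uint64_t)r->hi << 32) | r->lo;
}

/* Models two 32-bit bus transactions, low half first. */
static uint64_t sim_lo_hi_readq(const struct sim_reg64 *r)
{
	uint32_t lo = r->lo;
	uint32_t hi = r->hi;

	sim_bus_accesses += 2;
	return ((uint64_t)hi << 32) | lo;
}

/* Driver-side read that picks the access width at run time. */
static uint64_t foodev_read_reg(const struct sim_reg64 *r)
{
	assert(r);
	if (mmio_64bit_supported())
		return sim_readq(r);
	return sim_lo_hi_readq(r);
}
```

Both paths return the same value here; what differs is the number of bus
transactions, and with it whether the device sees the access as atomic.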


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-24 20:25     ` Christoph Hellwig
@ 2021-02-24 20:35       ` Florian Fainelli
  2021-02-25 10:29         ` Neil Armstrong
  2021-02-25 10:41       ` David Woodhouse
  1 sibling, 1 reply; 13+ messages in thread
From: Florian Fainelli @ 2021-02-24 20:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Nicolas Saenz Julienne, linux-pci,
	linux-arm-kernel, linux-rpi-kernel, devicetree, Rob Herring,
	Bjorn Helgaas, Robin Murphy, bcm-kernel-feedback-list



On 2/24/2021 12:25 PM, Christoph Hellwig wrote:
> On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
>>> Working around kernel I/O accessors is all very well, but another
>>> concern for PCI in particular is when things like framebuffer memory can
>>> get mmap'ed into userspace (or even memremap'ed within the kernel). Even
>>> in AArch32, compiled code may result in 64-bit accesses being generated
>>> depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
>>> so it's basically not safe to ever let that happen at all.
>>
>> Agreed, this makes finding a generic solution a tiny bit harder. Do you
>> have something in mind Nicolas?
> 
> The only workable solution is a new
> 
> bool 64bit_mmio_supported(void)
> 
> check that is used like:
> 
> 	if (64bit_mmio_supported())
> 		readq(foodev->regs, REG_OFFSET);
> 	else
> 		lo_hi_readq(foodev->regs, REG_OFFSET);
> 
> where 64bit_mmio_supported() returns false for all 32-bit kernels,
> true for all non-broken 64-bit kernels and is an actual function
> for arm64 multiplatform builds that include the RPi quirk.
> 
> The above would then replace the existing magic from the
> <linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
> headers.

That would work. The use case described by Robin is highly unlikely to
exist on the Pi4 given that you cannot easily access the PCIe bus and
plug an arbitrary GPU, so maybe there is nothing to do for framebuffer
memory.
-- 
Florian


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-24 20:35       ` Florian Fainelli
@ 2021-02-25 10:29         ` Neil Armstrong
  2021-02-25 11:10           ` Robin Murphy
  0 siblings, 1 reply; 13+ messages in thread
From: Neil Armstrong @ 2021-02-25 10:29 UTC (permalink / raw)
  To: Florian Fainelli, Christoph Hellwig
  Cc: devicetree, Rob Herring, linux-pci, Bjorn Helgaas,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list, Robin Murphy,
	Robin Murphy, linux-arm-kernel, linux-rpi-kernel

On 24/02/2021 21:35, Florian Fainelli wrote:
> 
> 
> On 2/24/2021 12:25 PM, Christoph Hellwig wrote:
>> On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
>>>> Working around kernel I/O accessors is all very well, but another
>>>> concern for PCI in particular is when things like framebuffer memory can
>>>> get mmap'ed into userspace (or even memremap'ed within the kernel). Even
>>>> in AArch32, compiled code may result in 64-bit accesses being generated
>>>> depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
>>>> so it's basically not safe to ever let that happen at all.
>>>
>>> Agreed, this makes finding a generic solution a tiny bit harder. Do you
>>> have something in mind Nicolas?
>>
>> The only workable solution is a new
>>
>> bool 64bit_mmio_supported(void)
>>
>> check that is used like:
>>
>> 	if (64bit_mmio_supported())
>> 		readq(foodev->regs, REG_OFFSET);
>> 	else
>> 		lo_hi_readq(foodev->regs, REG_OFFSET);
>>
>> where 64bit_mmio_supported() returns false for all 32-bit kernels,
>> true for all non-broken 64-bit kernels and is an actual function
>> for arm64 multiplatform builds that include the RPi quirk.
>>
>> The above would then replace the existing magic from the
>> <linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
>> headers.
> 
> That would work. The use case described by Robin is highly unlikely to
> exist on the Pi4 given that you cannot easily access the PCIe bus and
> plug an arbitrary GPU, so maybe there is nothing to do for framebuffer
> memory.
> 

Erf, not really: with the Compute Module, ATX/ITX boards are being designed with a full PCIe connector, like:
https://www.indiegogo.com/projects/over-board-raspberry-pi-4-mini-itx-motherboard/#/

Neil


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-24 20:25     ` Christoph Hellwig
  2021-02-24 20:35       ` Florian Fainelli
@ 2021-02-25 10:41       ` David Woodhouse
  1 sibling, 0 replies; 13+ messages in thread
From: David Woodhouse @ 2021-02-25 10:41 UTC (permalink / raw)
  To: Christoph Hellwig, Florian Fainelli
  Cc: Robin Murphy, Nicolas Saenz Julienne, linux-pci,
	linux-arm-kernel, linux-rpi-kernel, devicetree, Rob Herring,
	Bjorn Helgaas, Robin Murphy, bcm-kernel-feedback-list

On Wed, 2021-02-24 at 20:25 +0000, Christoph Hellwig wrote:
> On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
> > > Working around kernel I/O accessors is all very well, but another
> > > concern for PCI in particular is when things like framebuffer memory can
> > > get mmap'ed into userspace (or even memremap'ed within the kernel). Even
> > > in AArch32, compiled code may result in 64-bit accesses being generated
> > > depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
> > > so it's basically not safe to ever let that happen at all.
> > 
> > Agreed, this makes finding a generic solution a tiny bit harder. Do you
> > have something in mind Nicolas?
> 
> The only workable solution is a new
> 
> bool 64bit_mmio_supported(void)
> 
> check that is used like:
> 
> 	if (64bit_mmio_supported())
> 		readq(foodev->regs, REG_OFFSET);
> 	else
> 		lo_hi_readq(foodev->regs, REG_OFFSET);
> 
> where 64bit_mmio_supported() returns false for all 32-bit kernels,
> true for all non-broken 64-bit kernels and is an actual function
> for arm64 multiplatform builds that include the RPi quirk.
> 
> The above would then replace the existing magic from the
> <linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
> headers.

Is it completely impossible to do 64-bit cycles with this host bridge?

I'm now having nasty flashbacks to an SH platform with a host bridge
that screwed up byte access for direct MMIO — but *could* do those MMIO
cycles accurately if we did them through an indirect method similar to
config cycles. Is there any such mechanism on the offending hardware?


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-25 10:29         ` Neil Armstrong
@ 2021-02-25 11:10           ` Robin Murphy
  2021-02-25 11:35             ` Nicolas Saenz Julienne
  0 siblings, 1 reply; 13+ messages in thread
From: Robin Murphy @ 2021-02-25 11:10 UTC (permalink / raw)
  To: Neil Armstrong, Florian Fainelli, Christoph Hellwig
  Cc: devicetree, Rob Herring, linux-pci, Bjorn Helgaas,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list, Robin Murphy,
	linux-arm-kernel, linux-rpi-kernel

On 2021-02-25 10:29, Neil Armstrong wrote:
> On 24/02/2021 21:35, Florian Fainelli wrote:
>>
>>
>> On 2/24/2021 12:25 PM, Christoph Hellwig wrote:
>>> On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
>>>>> Working around kernel I/O accessors is all very well, but another
>>>>> concern for PCI in particular is when things like framebuffer memory can
>>>>> get mmap'ed into userspace (or even memremap'ed within the kernel). Even
>>>>> in AArch32, compiled code may result in 64-bit accesses being generated
>>>>> depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
>>>>> so it's basically not safe to ever let that happen at all.
>>>>
>>>> Agreed, this makes finding a generic solution a tiny bit harder. Do you
>>>> have something in mind Nicolas?
>>>
>>> The only workable solution is a new
>>>
>>> bool 64bit_mmio_supported(void)

Note that to be sufficiently generic this would have to be a per-device 
property - a system could have an affected PCIe root complex but still 
have other devices elsewhere in the SoC that can, or even need to, use 
64-bit accesses.
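
A rough userspace sketch of such a per-device check, under loud assumptions:
struct sim_bus, struct sim_device and the mmio_64bit_broken flag are
hypothetical names made up for this example and do not correspond to any
existing field in the kernel's 'struct device'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bus description carrying the limitation. */
struct sim_bus {
	const char *name;
	bool mmio_64bit_broken;	/* would come from firmware in a real system */
};

/* Hypothetical device that inherits the limitation from its parent bus. */
struct sim_device {
	const char *name;
	const struct sim_bus *bus;
};

/* Per-device answer: only devices behind the broken bus are restricted. */
static bool sim_dev_mmio_64bit_supported(const struct sim_device *dev)
{
	assert(dev && dev->bus);
	return !dev->bus->mmio_64bit_broken;
}

/* Number of bus transactions needed to access one 64-bit register. */
static unsigned int sim_dev_accesses_per_u64(const struct sim_device *dev)
{
	return sim_dev_mmio_64bit_supported(dev) ? 1 : 2;
}
```

Keying the decision on the device rather than on the whole system leaves
devices on unaffected buses free to keep using native 64-bit accesses.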

>>> check that is used like:
>>>
>>> 	if (64bit_mmio_supported())
>>> 		readq(foodev->regs, REG_OFFSET);
>>> 	else
>>> 		lo_hi_readq(foodev->regs, REG_OFFSET);
>>>
>>> where 64bit_mmio_supported() returns false for all 32-bit kernels,
>>> true for all non-broken 64-bit kernels and is an actual function
>>> for arm64 multiplatform builds that include the RPi quirk.
>>>
>>> The above would then replace the existing magic from the
>>> <linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
>>> headers.
>>
>> That would work. The use case described by Robin is highly unlikely to
>> exist on the Pi4 given that you cannot easily access the PCIe bus and
>> plug an arbitrary GPU, so maybe there is nothing to do for framebuffer
>> memory.

Framebuffers are only the most obvious example - I don't feel the 
inclination to audit every driver/subsystem that can possibly make a 
non-iomem remapping or userspace mmap of a prefetchable BAR, but I'm 
sure there are more.

> Erf, not really, with the compute module ATX/ITX boards are being designed with a full PCIe connector like:
> https://www.indiegogo.com/projects/over-board-raspberry-pi-4-mini-itx-motherboard/#/

Right, this whole thread looks to have come about due to random 
endpoints getting connected to the exposed bus on compute modules. If it 
was an issue at all for the XHCI on standard Pi 4 boards I don't think 
people would just be starting to notice it now...

Robin.


* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-25 11:10           ` Robin Murphy
@ 2021-02-25 11:35             ` Nicolas Saenz Julienne
  2021-02-26  5:32               ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Saenz Julienne @ 2021-02-25 11:35 UTC (permalink / raw)
  To: Robin Murphy, Neil Armstrong, Florian Fainelli, Christoph Hellwig
  Cc: devicetree, Rob Herring, linux-pci, Bjorn Helgaas,
	bcm-kernel-feedback-list, Robin Murphy, linux-arm-kernel,
	linux-rpi-kernel

On Thu, 2021-02-25 at 11:10 +0000, Robin Murphy wrote:
> On 2021-02-25 10:29, Neil Armstrong wrote:
> > On 24/02/2021 21:35, Florian Fainelli wrote:
> > > 
> > > 
> > > On 2/24/2021 12:25 PM, Christoph Hellwig wrote:
> > > > On Wed, Feb 24, 2021 at 08:55:10AM -0800, Florian Fainelli wrote:
> > > > > > Working around kernel I/O accessors is all very well, but another
> > > > > > concern for PCI in particular is when things like framebuffer memory can
> > > > > > get mmap'ed into userspace (or even memremap'ed within the kernel). Even
> > > > > > in AArch32, compiled code may result in 64-bit accesses being generated
> > > > > > depending on how the CPU and interconnect handle LDRD/STRD/LDM/STM/etc.,
> > > > > > so it's basically not safe to ever let that happen at all.
> > > > > 
> > > > > Agreed, this makes finding a generic solution a tiny bit harder. Do you
> > > > > have something in mind Nicolas?
> > > > 
> > > > The only workable solution is a new
> > > > 
> > > > bool 64bit_mmio_supported(void)
> 
> Note that to be sufficiently generic this would have to be a per-device 
> property - a system could have an affected PCIe root complex but still 
> have other devices elsewhere in the SoC that can, or even need to, use 
> 64-bit accesses.

Yes, that's what I had in mind myself. After all, why penalize the rest of the
buses in the system? What I'm planning is to introduce a '64bit-mmio-broken'
DT property that'll ultimately live somewhere in 'struct device'.

As for why not default to 32-bit accesses in distro images that support
RPi4: my *uneducated* guess is that the performance penalty of checking a
device flag is (way) lower than that of resorting to two distinct write
operations with their assorted memory barriers. I'm sure you can
comment/correct me here.

> > > > check that is used like:
> > > > 
> > > > 	if (64bit_mmio_supported())
> > > > 		readq(foodev->regs, REG_OFFSET);
> > > > 	else
> > > > 		lo_hi_readq(foodev->regs, REG_OFFSET);
> > > > 
> > > > where 64bit_mmio_supported() returns false for all 32-bit kernels,
> > > > true for all non-broken 64-bit kernels and is an actual function
> > > > for arm64 multiplatform builds that include the RPi quirk.
> > > > 
> > > > The above would then replace the existing magic from the
> > > > <linux/io-64-nonatomic-lo-hi.h> and <linux/io-64-nonatomic-hi-lo.h>
> > > > headers.
> > > 
> > > That would work. The use case described by Robin is highly unlikely to
> > > exist on the Pi4 given that you cannot easily access the PCIe bus and
> > > plug an arbitrary GPU, so maybe there is nothing to do for framebuffer
> > > memory.
> 
> Framebuffers are only the most obvious example - I don't feel the 
> inclination to audit every driver/subsystem that can possibly make a 
> non-iomem remapping or userspace mmap of a prefetchable BAR, but I'm 
> sure there are more.

IIUC the only solution to the issue here is to disallow mmapping memory
belonging to a broken bus, right? In that case, the function above would do the
trick.

> > Erf, not really, with the compute module ATX/ITX boards are being designed
> > with a full PCIe connector like:
> > https://www.indiegogo.com/projects/over-board-raspberry-pi-4-mini-itx-motherboard/#/
> 
> Right, this whole thread looks to have come about due to random 
> endpoints getting connected to the exposed bus on compute modules. If it 
> was an issue at all for the XHCI on standard Pi 4 boards I don't think 
> people would just be starting to notice it now...

Indeed. For the record, here's the original complaint, although I'm sure others
exist: https://github.com/raspberrypi/linux/issues/4158

Regards,
Nicolas



* Re: RPi4 can't deal with 64 bit PCI accesses
  2021-02-25 11:35             ` Nicolas Saenz Julienne
@ 2021-02-26  5:32               ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-02-26  5:32 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Robin Murphy, Neil Armstrong, Florian Fainelli,
	Christoph Hellwig, devicetree, Rob Herring, linux-pci,
	Bjorn Helgaas, bcm-kernel-feedback-list, Robin Murphy,
	linux-arm-kernel, linux-rpi-kernel

On Thu, Feb 25, 2021 at 12:35:27PM +0100, Nicolas Saenz Julienne wrote:
> Yes, that's what I had in mind myself. All in all, why penalize the rest of
> busses in the system. What I'm planning is to introduce a '64bit-mmio-broken'
> DT property that'll ultimately live somewhere in 'struct device.'
> 
> WRT why not defaulting to 32-bit accesses for distro images if they support
> RPi4. My *un-educated* guess is that, the performance penalty of checking for a
> device flag is (way) lower than having to resort to two distinct write
> operations with their assorted memory barriers. I'm sure you can
> comment/correct me here.

Various high performance devices rely on the fact that 64-bit MMIO
writes are atomic, and will have to use an extra lock and/or an entirely
different programming model if they are not supported.
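
As a sketch of what "an extra lock" could look like, the userspace example
below serialises the two halves of a split write with a C11 atomic_flag used
as a spinlock. The sim_doorbell structure and its functions are hypothetical.
Note the assumption: the lock only keeps other CPUs that take the same lock
from interleaving their halves; the device still sees two separate 32-bit
writes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit doorbell register split into two 32-bit halves. */
struct sim_doorbell {
	atomic_flag lock;
	volatile uint32_t lo;
	volatile uint32_t hi;
};

static void sim_doorbell_init(struct sim_doorbell *db)
{
	assert(db);
	atomic_flag_clear(&db->lock);
	db->lo = 0;
	db->hi = 0;
}

/* Write both halves while holding the lock so writers cannot interleave. */
static void sim_doorbell_write64(struct sim_doorbell *db, uint64_t val)
{
	assert(db);
	while (atomic_flag_test_and_set_explicit(&db->lock, memory_order_acquire))
		;	/* spin until the previous writer has finished both halves */
	db->lo = (uint32_t)val;
	db->hi = (uint32_t)(val >> 32);
	atomic_flag_clear_explicit(&db->lock, memory_order_release);
}
```

A native atomic 64-bit write needs none of this, which is the cost being
weighed in the paragraph above.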

If that is not the case, always using 32-bit accesses is certainly
easier; that's what we did for the slow-path-only 64-bit registers in
NVMe.


Thread overview: 13+ messages
2021-02-22 15:47 RPi4 can't deal with 64 bit PCI accesses Nicolas Saenz Julienne
2021-02-22 16:18 ` Ard Biesheuvel
2021-02-22 16:36   ` Nicolas Saenz Julienne
2021-02-22 16:56 ` Robin Murphy
2021-02-24 16:55   ` Florian Fainelli
2021-02-24 20:25     ` Christoph Hellwig
2021-02-24 20:35       ` Florian Fainelli
2021-02-25 10:29         ` Neil Armstrong
2021-02-25 11:10           ` Robin Murphy
2021-02-25 11:35             ` Nicolas Saenz Julienne
2021-02-26  5:32               ` Christoph Hellwig
2021-02-25 10:41       ` David Woodhouse
2021-02-22 17:55 ` Russell King - ARM Linux admin
