Linux-PCI Archive on
From: Robin Murphy <>
To: Marek Vasut <>,
	Lorenzo Pieralisi <>
Cc: Rob Herring <>,
	Geert Uytterhoeven <>,
	PCI <>,
	Geert Uytterhoeven <>,
	Wolfram Sang <>,
Subject: Re: [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges
Date: Fri, 18 Oct 2019 18:35:07 +0100
Message-ID: <>
In-Reply-To: <>

On 18/10/2019 17:44, Marek Vasut wrote:
> On 10/18/19 5:44 PM, Robin Murphy wrote:
>> On 18/10/2019 15:26, Marek Vasut wrote:
>>> On 10/18/19 2:53 PM, Robin Murphy wrote:
>>>> On 18/10/2019 13:22, Marek Vasut wrote:
>>>>> On 10/18/19 11:53 AM, Lorenzo Pieralisi wrote:
>>>>>> On Thu, Oct 17, 2019 at 05:01:26PM +0200, Marek Vasut wrote:
>>>>>> [...]
>>>>>>>> Again, just handling the first N dma-ranges entries and ignoring the
>>>>>>>> rest is not 'configure the controller correctly'.
>>>>>>> It's the best effort thing to do. It's well possible the next
>>>>>>> generation
>>>>>>> of the controller will have more windows, so could accommodate the
>>>>>>> whole
>>>>>>> list of ranges.
>>>> In the context of DT describing the platform that doesn't make any
>>>> sense. It's like saying it's fine for U-Boot to also describe a bunch of
>>>> non-existent CPUs just because future SoCs might have them. Just because
>>>> the system would probably still boot doesn't mean it's right.
>>> It's the exact opposite of what you just described -- the last release
>>> of U-Boot currently populates a subset of the DMA ranges, not a
>>> superset. The dma-ranges in the Linux DT currently are a superset of
>>> available DRAM on the platform.
>> I'm not talking about the overall coverage of addresses - I've already
>> made clear what I think about that - I'm talking about the *number* of
>> individual entries. If the DT binding defines that dma-ranges entries
>> directly represent bridge windows, then the bootloader for a given
>> platform should never generate more entries than that platform has
>> actual windows, because to do otherwise would be bogus.
> I have a feeling that's not how Rob defined the dma-ranges in this
> discussion though.
>>>>>>> Thinking about this further, this patch should be OK either way, if
>>>>>>> there is a DT which defines more DMA ranges than the controller can
>>>>>>> handle, handling some is better than failing outright -- a PCI which
>>>>>>> works with a subset of memory is better than PCI that does not work
>>>>>>> at all.
>>>>>> OK to sum it up, this patch is there to deal with u-boot adding
>>>>>> multiple
>>>>>> dma-ranges to DT.
>>>>> Yes, this patch was posted over two months ago, about the same time
>>>>> this
>>>>> functionality was posted for inclusion in U-Boot. It made it into
>>>>> recent
>>>>> U-Boot release, but there was no feedback on the Linux patch until
>>>>> recently.
>>>>> U-Boot can be changed for the next release, assuming we agree on how it
>>>>> should behave.
>>>>>> I still do not understand the benefit given that for
>>>>>> DMA masks they are useless as Rob pointed out and ditto for inbound
>>>>>> windows programming (given that AFAICS the PCI controller filters out
>>>>>> any transaction that does not fall within its inbound windows by
>>>>>> default
>>>>>> so adding dma-ranges has the net effect of widening the DMA'able
>>>>>> address
>>>>>> space rather than limiting it).
>>>>>> In short, what's the benefit of adding more dma-ranges regions to the
>>>>>> DT (and consequently handling them in the kernel) ?
>>>>> The benefit is programming the controller inbound windows correctly.
>>>>> But if there is a better way to do that, I am open to implement that.
>>>>> Are there any suggestions / examples of that ?
>>>> The crucial thing is that once we improve the existing "dma-ranges"
>>>> handling in the DMA layer such that it *does* consider multiple entries
>>>> properly, platforms presenting ranges which don't actually exist will
>>>> almost certainly start going wrong, and are either going to have to fix
>>>> their broken bootloaders or try to make a case for platform-specific
>>>> workarounds in core code.
>>> Again, this is exactly the other way around, the dma-ranges populated by
>>> U-Boot cover only existing DRAM. The single dma-range in Linux DT covers
>>> even the holes without existing DRAM.
>>> So even if the Linux dma-ranges handling changes, there should be no
>>> problem.
>> Say you have a single hardware window, and this DT property (1-cell
>> numbers for simplicity):
>>      dma-ranges = <0x00000000 0x00000000 0x80000000>;
>> Driver reads one entry and programs the window to 2GB@0, DMA setup
>> parses the first entry and sets device masks to 0x7fffffff, and
>> everything's fine.
>> Now say we describe the exact same address range this way instead:
>>      dma-ranges = <0x00000000 0x00000000 0x40000000,
>>                0x40000000 0x40000000 0x40000000>;
>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>> parses the first entry and sets device masks to 0x3fffffff, and *today*,
>> things are suboptimal but happen to work.
>> Now say we finally get round to fixing the of_dma code to properly
>> generate DMA masks that actually include all usable address bits, a user
>> upgrades their kernel package, and reboots with that same DT...
>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>> parses all entries and sets device masks to 0x7fffffff, devices start
>> randomly failing or throwing DMA errors half the time, angry user looks
>> at the changelog to find that somebody decided their now-corrupted
>> filesystem is less important than the fact that hey, at least the
>> machine didn't refuse to boot because the DT was obviously wrong. Are
>> you sure that shouldn't be a problem?
> I think you picked a rather special case here and arrived at a DMA mask
> which just fails in this special case. Such a special case doesn't happen
> here, and even if it did, I would expect Linux to merge those two ranges
> or do something sane ? If the DMA mask is set incorrectly, that's a bug
> of the DMA code I would think.

The mask is not set incorrectly - DMA masks represent the number of 
address bits the device (or intermediate interconnect in the case of the 
bus mask) is capable of driving. Thus when DMA is limited to a specific 
address range, the masks should be wide enough to cover the topmost 
address of that range (unless the device's own capability is inherently 
narrower).

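To make that rule concrete, here is a minimal Python sketch (not kernel 
code; `dma_mask` is a made-up helper, and the tuples reuse the simplified 
1-cell numbers from the quoted example above). It derives the smallest 
all-ones mask that reaches the topmost address of a set of ranges, once 
from the first entry only and once from all entries:

```python
def dma_mask(ranges):
    """Smallest all-ones mask covering the topmost bus address.

    ranges: list of (bus_addr, cpu_addr, size) tuples (hypothetical layout).
    """
    top = max(bus + size - 1 for bus, _cpu, size in ranges)
    return (1 << top.bit_length()) - 1


# One 2GB window versus the same span described as two 1GB entries.
single = [(0x00000000, 0x00000000, 0x80000000)]
split = [(0x00000000, 0x00000000, 0x40000000),
         (0x40000000, 0x40000000, 0x40000000)]

print(hex(dma_mask(single)))     # 0x7fffffff
print(hex(dma_mask(split[:1])))  # 0x3fffffff (first entry only)
print(hex(dma_mask(split)))      # 0x7fffffff (all entries)
```

With the split description, a mask derived from all entries is wider than 
a single window programmed from the first entry alone, which is the 
mismatch described in the quoted scenario.
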
> What DMA mask would you get if those two entries had a gap in between
> them ? E.g.:
>   dma-ranges = <0x00000000 0x00000000 0x20000000,
>                 0x40000000 0x40000000 0x20000000>;

OK, here's a real non-simplified example (note that these windows are 
fixed and not programmed by Linux):

	dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
			     <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
			     <0x43000000 0x8 0x80000000 0x8 0x80000000 0x2 0x00000000>;

The DMA masks for the devices behind this bridge *should* be 36 bits, 
because that's the size of the largest usable address (the third window 
ends at 0xa7fffffff). Currently, 
however, because of the of_dma code's deficiency they would end up being 
an utterly useless 30 bits, which isn't even enough to reach the start 
of RAM. Thus I can't actually have this property in my DT, and as a 
result I can't enable the IOMMU, because *that* also needs to know the 
ranges in order to reserve the unusable gaps between the windows once 
address translation is in play.
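
The arithmetic for this example can be checked with a short Python 
sketch. It is only an illustration: `decode`, `mask_bits` and `gaps` are 
invented names, and the 7-cell layout (flags, 2-cell PCI address, 2-cell 
CPU address, 2-cell size) is an assumption about the #address-cells and 
#size-cells in effect for this node.

```python
# The three dma-ranges entries from the example, one tuple per entry:
# (flags, pci_hi, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo)
ENTRIES = [
    (0x02000000, 0x0, 0x2c1c0000, 0x0, 0x2c1c0000, 0x0, 0x00040000),
    (0x02000000, 0x0, 0x80000000, 0x0, 0x80000000, 0x0, 0x80000000),
    (0x43000000, 0x8, 0x80000000, 0x8, 0x80000000, 0x2, 0x00000000),
]


def decode(entry):
    """Fold the 32-bit cells into (pci_addr, cpu_addr, size)."""
    _flags, pci_hi, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = entry
    return ((pci_hi << 32) | pci_lo,
            (cpu_hi << 32) | cpu_lo,
            (size_hi << 32) | size_lo)


def mask_bits(entries):
    """Address bits needed to reach the last byte of the given entries."""
    top = max(pci + size - 1 for pci, _cpu, size in map(decode, entries))
    return top.bit_length()


def gaps(entries):
    """Holes between the sorted windows, as inclusive (start, end) pairs."""
    spans = sorted((pci, pci + size - 1)
                   for pci, _cpu, size in map(decode, entries))
    return [(a_end + 1, b_start - 1)
            for (_a, a_end), (b_start, _b) in zip(spans, spans[1:])
            if b_start > a_end + 1]


print(mask_bits(ENTRIES))      # 36: all entries considered
print(mask_bits(ENTRIES[:1]))  # 30: first entry only
for start, end in gaps(ENTRIES):
    print(hex(start), hex(end))
```

A 30-bit mask tops out at 0x3fffffff, below the base of the second 
window at 0x80000000. The two holes that `gaps` reports are the sort of 
regions an IOMMU would have to reserve.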

>> Now, if you want to read the DT binding as less strict and let it just
>> describe some arbitrarily-complex set of address ranges that should be
>> valid for DMA, that's not insurmountable; you just need more complex
>> logic in your driver capable of calculating how best to cover *all*
>> those ranges using the available number of windows.
> That's what the driver does with this patchset, except it's not possible
> to cover all those ranges. It covers them as well as it can.

Which means by definition it *doesn't* do what I suggested there...
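
For illustration only, one naive way to cover *all* ranges with a limited 
number of windows is to fuse neighbouring ranges across the smallest gaps 
until the count fits. The Python sketch below is hypothetical throughout: 
`cover_with_windows` is a made-up name, addresses are assumed to be 
identity-mapped, and it ignores the alignment and size constraints that 
real inbound windows have. Note that every gap it closes also makes 
non-existent addresses DMA-able.

```python
def cover_with_windows(ranges, max_windows):
    """Merge (start, size) ranges until at most max_windows remain.

    Every input range stays covered; the gaps pulled in are the smallest
    ones available.
    """
    if max_windows < 1:
        raise ValueError("need at least one window")
    if not ranges:
        return []
    spans = sorted((start, start + size) for start, size in ranges)
    # First fuse spans that already touch or overlap.
    merged = [list(spans[0])]
    for start, end in spans[1:]:
        if start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Then close the smallest remaining gap until the spans fit.
    while len(merged) > max_windows:
        i = min(range(len(merged) - 1),
                key=lambda k: merged[k + 1][0] - merged[k][1])
        merged[i][1] = merged[i + 1][1]
        del merged[i + 1]
    return [(start, end - start) for start, end in merged]


# The two gapped entries from the question above, plus one invented
# range above 4GB to make the choice of gap visible.
ranges = [(0x00000000, 0x20000000),
          (0x40000000, 0x20000000),
          (0x100000000, 0x40000000)]

for start, size in cover_with_windows(ranges, 2):
    print(hex(start), hex(size))
# 0x0 0x60000000
# 0x100000000 0x40000000
```

With two windows the 512MB hole is absorbed and the large hole below 4GB 
is kept out; with a single window everything from 0 to 0x13fffffff would 
be opened up.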



Thread overview: 51+ messages
2019-08-09 17:57 [PATCH V3 1/3] PCI: rcar: Move the inbound index check marek.vasut
2019-08-09 17:57 ` [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges marek.vasut
2019-08-16 13:23   ` Simon Horman
2019-08-16 13:28     ` Marek Vasut
2019-08-16 13:38       ` Simon Horman
2019-08-16 17:41         ` Marek Vasut
2019-10-21 10:18       ` Andrew Murray
2019-10-26 18:03         ` Marek Vasut
2019-10-26 20:36           ` Andrew Murray
2019-10-26 21:06             ` Andrew Murray
2019-11-06 23:37             ` Marek Vasut
2019-11-07 14:19               ` Andrew Murray
2019-11-16 15:48                 ` Marek Vasut
2019-11-18 18:42                   ` Robin Murphy
2019-10-16 15:00   ` Lorenzo Pieralisi
2019-10-16 15:10     ` Marek Vasut
2019-10-16 15:26       ` Lorenzo Pieralisi
2019-10-16 15:29         ` Marek Vasut
2019-10-16 16:18           ` Lorenzo Pieralisi
2019-10-16 18:12             ` Rob Herring
2019-10-16 18:17               ` Marek Vasut
2019-10-16 20:25                 ` Rob Herring
2019-10-16 21:15                   ` Marek Vasut
2019-10-16 22:26                     ` Rob Herring
2019-10-16 22:33                       ` Marek Vasut
2019-10-17  7:06                         ` Geert Uytterhoeven
2019-10-17 10:55                           ` Marek Vasut
2019-10-17 13:06                             ` Robin Murphy
2019-10-17 14:00                               ` Marek Vasut
2019-10-17 14:36                                 ` Rob Herring
2019-10-17 15:01                                   ` Marek Vasut
2019-10-18  9:53                                     ` Lorenzo Pieralisi
2019-10-18 12:22                                       ` Marek Vasut
2019-10-18 12:53                                         ` Robin Murphy
2019-10-18 14:26                                           ` Marek Vasut
2019-10-18 15:44                                             ` Robin Murphy
2019-10-18 16:44                                               ` Marek Vasut
2019-10-18 17:35                                                 ` Robin Murphy [this message]
2019-10-18 18:44                                                   ` Marek Vasut
2019-10-21  8:32                                                     ` Geert Uytterhoeven
2019-11-19 12:10                                                     ` Robin Murphy
2019-10-18 10:06                         ` Andrew Murray
2019-10-18 10:17                           ` Geert Uytterhoeven
2019-10-18 11:40                             ` Andrew Murray
2019-08-09 17:57 ` [PATCH V3 3/3] PCI: rcar: Recalculate inbound range alignment for each controller entry marek.vasut
2019-10-21 10:39   ` Andrew Murray
2019-08-16 10:52 ` [PATCH V3 1/3] PCI: rcar: Move the inbound index check Lorenzo Pieralisi
2019-08-16 10:59   ` Marek Vasut
2019-08-16 11:10     ` Lorenzo Pieralisi
2019-10-15 20:14 ` Marek Vasut
2019-10-21 10:11 ` Andrew Murray
