From: Robin Murphy <robin.murphy@arm.com>
To: Marek Vasut <marek.vasut@gmail.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <robh@kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
PCI <linux-pci@vger.kernel.org>,
Geert Uytterhoeven <geert+renesas@glider.be>,
Wolfram Sang <wsa@the-dreams.de>,
"open list:MEDIA DRIVERS FOR RENESAS - FCP"
<linux-renesas-soc@vger.kernel.org>
Subject: Re: [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges
Date: Fri, 18 Oct 2019 18:35:07 +0100 [thread overview]
Message-ID: <9db05950-cb22-c99d-7544-05e148ef7e1a@arm.com> (raw)
In-Reply-To: <32591ef7-9792-4874-512c-ce3bcc3e9998@gmail.com>
On 18/10/2019 17:44, Marek Vasut wrote:
> On 10/18/19 5:44 PM, Robin Murphy wrote:
>> On 18/10/2019 15:26, Marek Vasut wrote:
>>> On 10/18/19 2:53 PM, Robin Murphy wrote:
>>>> On 18/10/2019 13:22, Marek Vasut wrote:
>>>>> On 10/18/19 11:53 AM, Lorenzo Pieralisi wrote:
>>>>>> On Thu, Oct 17, 2019 at 05:01:26PM +0200, Marek Vasut wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>> Again, just handling the first N dma-ranges entries and ignoring the
>>>>>>>> rest is not 'configure the controller correctly'.
>>>>>>>
>>>>>>> It's the best effort thing to do. It's well possible the next
>>>>>>> generation
>>>>>>> of the controller will have more windows, so could accommodate the
>>>>>>> whole
>>>>>>> list of ranges.
>>>>
>>>> In the context of DT describing the platform that doesn't make any
>>>> sense. It's like saying it's fine for U-Boot to also describe a bunch of
>>>> non-existent CPUs just because future SoCs might have them. Just because
>>>> the system would probably still boot doesn't mean it's right.
>>>
>>> It's the exact opposite of what you just described -- the last release
>>> of U-Boot currently populates a subset of the DMA ranges, not a
>>> superset. The dma-ranges in the Linux DT currently are a superset of
>>> available DRAM on the platform.
>>
>> I'm not talking about the overall coverage of addresses - I've already
>> made clear what I think about that - I'm talking about the *number* of
>> individual entries. If the DT binding defines that dma-ranges entries
>> directly represent bridge windows, then the bootloader for a given
>> platform should never generate more entries than that platform has
>> actual windows, because to do otherwise would be bogus.
>
> I have a feeling that's not how Rob defined the dma-ranges in this
> discussion though.
>
>>>>>>> Thinking about this further, this patch should be OK either way, if
>>>>>>> there is a DT which defines more DMA ranges than the controller can
>>>>>>> handle, handling some is better than failing outright -- a PCI which
>>>>>>> works with a subset of memory is better than PCI that does not work
>>>>>>> at all.
>>>>>>
>>>>>> OK to sum it up, this patch is there to deal with u-boot adding
>>>>>> multiple
>>>>>> dma-ranges to DT.
>>>>>
>>>>> Yes, this patch was posted over two months ago, about the same time
>>>>> this
>>>>> functionality was posted for inclusion in U-Boot. It made it into
>>>>> recent
>>>>> U-Boot release, but there was no feedback on the Linux patch until
>>>>> recently.
>>>>>
>>>>> U-Boot can be changed for the next release, assuming we agree on how it
>>>>> should behave.
>>>>>
>>>>>> I still do not understand the benefit given that for
>>>>>> DMA masks they are useless as Rob pointed out and ditto for inbound
>>>>>> windows programming (given that AFAICS the PCI controller filters out
>>>>>> any transaction that does not fall within its inbound windows by
>>>>>> default
>>>>>> so adding dma-ranges has the net effect of widening the DMA'able
>>>>>> address
>>>>>> space rather than limiting it).
>>>>>>
>>>>>> In short, what's the benefit of adding more dma-ranges regions to the
>>>>>> DT (and consequently handling them in the kernel) ?
>>>>>
>>>>> The benefit is programming the controller inbound windows correctly.
>>>>> But if there is a better way to do that, I am open to implement that.
>>>>> Are there any suggestions / examples of that ?
>>>>
>>>> The crucial thing is that once we improve the existing "dma-ranges"
>>>> handling in the DMA layer such that it *does* consider multiple entries
>>>> properly, platforms presenting ranges which don't actually exist will
>>>> almost certainly start going wrong, and are either going to have to fix
>>>> their broken bootloaders or try to make a case for platform-specific
>>>> workarounds in core code.
>>> Again, this is exactly the other way around, the dma-ranges populated by
>>> U-Boot cover only existing DRAM. The single dma-range in Linux DT covers
>>> even the holes without existing DRAM.
>>>
>>> So even if the Linux dma-ranges handling changes, there should be no
>>> problem.
>>
>> Say you have a single hardware window, and this DT property (1-cell
>> numbers for simplicity:
>>
>> dma-ranges = <0x00000000 0x00000000 0x80000000>;
>>
>> Driver reads one entry and programs the window to 2GB@0, DMA setup
>> parses the first entry and sets device masks to 0x7fffffff, and
>> everything's fine.
>>
>> Now say we describe the exact same address range this way instead:
>>
>> dma-ranges = <0x00000000 0x00000000 0x40000000,
>> 0x40000000 0x40000000 0x40000000>;
>>
>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>> parses the first entry and sets device masks to 0x3fffffff, and *today*,
>> things are suboptimal but happen to work.
>>
>> Now say we finally get round to fixing the of_dma code to properly
>> generate DMA masks that actually include all usable address bits, a user
>> upgrades their kernel package, and reboots with that same DT...
>>
>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>> parses all entries and sets device masks to 0x7fffffff, devices start
>> randomly failing or throwing DMA errors half the time, angry user looks
>> at the changelog to find that somebody decided their now-corrupted
>> filesystem is less important than the fact that hey, at least the
>> machine didn't refuse to boot because the DT was obviously wrong. Are
>> you sure that shouldn't be a problem?
>
> I think you picked a rather special case here and arrived as a DMA mask
> which just fails in this special case. Such special case doesn't happen
> here, and even if it did, I would expect Linux to merge those two ranges
> or do something sane ? If the DMA mask is set incorrectly, that's a bug
> of the DMA code I would think.
The mask is not set incorrectly - DMA masks represent the number of
address bits the device (or intermediate interconnect in the case of the
bus mask) is capable of driving. Thus when DMA is limited to a specific
address range, the masks should be wide enough to cover the topmost
address of that range (unless the device's own capability is inherently
narrower).
> What DMA mask would you get if those two entries had a gap inbetween
> them ? E.g.:
>
> dma-ranges = <0x00000000 0x00000000 0x20000000,
> 0x40000000 0x40000000 0x20000000>;
OK, here's an real non-simplified example (note that these windows are
fixed and not programmed by Linux):
dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
<0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
<0x43000000 0x8 0x80000000 0x8 0x80000000 0x2 0x00000000>;
The DMA masks for the devices behind this bridge *should* be 35 bits,
because that's the size of the largest usable address. Currently,
however, because of the of_dma code's deficiency they would end up being
an utterly useless 30 bits, which isn't even enough to reach the start
of RAM. Thus I can't actually have this property in my DT, and as a
result I can't enable the IOMMU, because *that* also needs to know the
ranges in order to reserve the unusable gaps between the windows once
address translation is in play.
>> Now, if you want to read the DT binding as less strict and let it just
>> describe some arbitrarily-complex set of address ranges that should be
>> valid for DMA, that's not insurmountable; you just need more complex
>> logic in your driver capable of calculating how best to cover *all*
>> those ranges using the available number of windows.
>
> That's what the driver does with this patchset, except it's not possible
> to cover all those ranges. It covers them as well as it can.
Which means by definition it *doesn't* do what I suggested there...
Robin.
next prev parent reply other threads:[~2019-10-18 17:35 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-09 17:57 [PATCH V3 1/3] PCI: rcar: Move the inbound index check marek.vasut
2019-08-09 17:57 ` [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges marek.vasut
2019-08-16 13:23 ` Simon Horman
2019-08-16 13:28 ` Marek Vasut
2019-08-16 13:38 ` Simon Horman
2019-08-16 17:41 ` Marek Vasut
2019-10-21 10:18 ` Andrew Murray
2019-10-26 18:03 ` Marek Vasut
2019-10-26 20:36 ` Andrew Murray
2019-10-26 21:06 ` Andrew Murray
2019-11-06 23:37 ` Marek Vasut
2019-11-07 14:19 ` Andrew Murray
2019-11-16 15:48 ` Marek Vasut
2019-11-18 18:42 ` Robin Murphy
2019-12-22 7:46 ` Marek Vasut
2019-10-16 15:00 ` Lorenzo Pieralisi
2019-10-16 15:10 ` Marek Vasut
2019-10-16 15:26 ` Lorenzo Pieralisi
2019-10-16 15:29 ` Marek Vasut
2019-10-16 16:18 ` Lorenzo Pieralisi
2019-10-16 18:12 ` Rob Herring
2019-10-16 18:17 ` Marek Vasut
2019-10-16 20:25 ` Rob Herring
2019-10-16 21:15 ` Marek Vasut
2019-10-16 22:26 ` Rob Herring
2019-10-16 22:33 ` Marek Vasut
2019-10-17 7:06 ` Geert Uytterhoeven
2019-10-17 10:55 ` Marek Vasut
2019-10-17 13:06 ` Robin Murphy
2019-10-17 14:00 ` Marek Vasut
2019-10-17 14:36 ` Rob Herring
2019-10-17 15:01 ` Marek Vasut
2019-10-18 9:53 ` Lorenzo Pieralisi
2019-10-18 12:22 ` Marek Vasut
2019-10-18 12:53 ` Robin Murphy
2019-10-18 14:26 ` Marek Vasut
2019-10-18 15:44 ` Robin Murphy
2019-10-18 16:44 ` Marek Vasut
2019-10-18 17:35 ` Robin Murphy [this message]
2019-10-18 18:44 ` Marek Vasut
2019-10-21 8:32 ` Geert Uytterhoeven
2019-11-19 12:10 ` Robin Murphy
2019-10-18 10:06 ` Andrew Murray
2019-10-18 10:17 ` Geert Uytterhoeven
2019-10-18 11:40 ` Andrew Murray
2019-08-09 17:57 ` [PATCH V3 3/3] PCI: rcar: Recalculate inbound range alignment for each controller entry marek.vasut
2019-10-21 10:39 ` Andrew Murray
2019-08-16 10:52 ` [PATCH V3 1/3] PCI: rcar: Move the inbound index check Lorenzo Pieralisi
2019-08-16 10:59 ` Marek Vasut
2019-08-16 11:10 ` Lorenzo Pieralisi
2019-10-15 20:14 ` Marek Vasut
2019-10-21 10:11 ` Andrew Murray
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9db05950-cb22-c99d-7544-05e148ef7e1a@arm.com \
--to=robin.murphy@arm.com \
--cc=geert+renesas@glider.be \
--cc=geert@linux-m68k.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-renesas-soc@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=marek.vasut@gmail.com \
--cc=robh@kernel.org \
--cc=wsa@the-dreams.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).