Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: Marek Vasut <marek.vasut@gmail.com>
To: Robin Murphy <robin.murphy@arm.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <robh@kernel.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	PCI <linux-pci@vger.kernel.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Wolfram Sang <wsa@the-dreams.de>,
	"open list:MEDIA DRIVERS FOR RENESAS - FCP" 
	<linux-renesas-soc@vger.kernel.org>
Subject: Re: [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges
Date: Fri, 18 Oct 2019 20:44:26 +0200
Message-ID: <6ae07b1a-bcfa-e48a-e260-da20d6663afc@gmail.com> (raw)
In-Reply-To: <9db05950-cb22-c99d-7544-05e148ef7e1a@arm.com>

On 10/18/19 7:35 PM, Robin Murphy wrote:
> On 18/10/2019 17:44, Marek Vasut wrote:
>> On 10/18/19 5:44 PM, Robin Murphy wrote:
>>> On 18/10/2019 15:26, Marek Vasut wrote:
>>>> On 10/18/19 2:53 PM, Robin Murphy wrote:
>>>>> On 18/10/2019 13:22, Marek Vasut wrote:
>>>>>> On 10/18/19 11:53 AM, Lorenzo Pieralisi wrote:
>>>>>>> On Thu, Oct 17, 2019 at 05:01:26PM +0200, Marek Vasut wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>> Again, just handling the first N dma-ranges entries and
>>>>>>>>> ignoring the
>>>>>>>>> rest is not 'configure the controller correctly'.
>>>>>>>>
>>>>>>>> It's the best effort thing to do. It's well possible the next
>>>>>>>> generation
>>>>>>>> of the controller will have more windows, so could accommodate the
>>>>>>>> whole
>>>>>>>> list of ranges.
>>>>>
>>>>> In the context of DT describing the platform that doesn't make any
>>>>> sense. It's like saying it's fine for U-Boot to also describe a
>>>>> bunch of
>>>>> non-existent CPUs just because future SoCs might have them. Just
>>>>> because
>>>>> the system would probably still boot doesn't mean it's right.
>>>>
>>>> It's the exact opposite of what you just described -- the last release
>>>> of U-Boot currently populates a subset of the DMA ranges, not a
>>>> superset. The dma-ranges in the Linux DT currently are a superset of
>>>> available DRAM on the platform.
>>>
>>> I'm not talking about the overall coverage of addresses - I've already
>>> made clear what I think about that - I'm talking about the *number* of
>>> individual entries. If the DT binding defines that dma-ranges entries
>>> directly represent bridge windows, then the bootloader for a given
>>> platform should never generate more entries than that platform has
>>> actual windows, because to do otherwise would be bogus.
>>
>> I have a feeling that's not how Rob defined the dma-ranges in this
>> discussion though.
>>
>>>>>>>> Thinking about this further, this patch should be OK either way, if
>>>>>>>> there is a DT which defines more DMA ranges than the controller can
>>>>>>>> handle, handling some is better than failing outright -- a PCI
>>>>>>>> which
>>>>>>>> works with a subset of memory is better than PCI that does not work
>>>>>>>> at all.
>>>>>>>
>>>>>>> OK to sum it up, this patch is there to deal with u-boot adding
>>>>>>> multiple
>>>>>>> dma-ranges to DT.
>>>>>>
>>>>>> Yes, this patch was posted over two months ago, about the same time
>>>>>> this
>>>>>> functionality was posted for inclusion in U-Boot. It made it into
>>>>>> recent
>>>>>> U-Boot release, but there was no feedback on the Linux patch until
>>>>>> recently.
>>>>>>
>>>>>> U-Boot can be changed for the next release, assuming we agree on
>>>>>> how it
>>>>>> should behave.
>>>>>>
>>>>>>> I still do not understand the benefit given that for
>>>>>>> DMA masks they are useless as Rob pointed out and ditto for inbound
>>>>>>> windows programming (given that AFAICS the PCI controller filters
>>>>>>> out
>>>>>>> any transaction that does not fall within its inbound windows by
>>>>>>> default
>>>>>>> so adding dma-ranges has the net effect of widening the DMA'able
>>>>>>> address
>>>>>>> space rather than limiting it).
>>>>>>>
>>>>>>> In short, what's the benefit of adding more dma-ranges regions to
>>>>>>> the
>>>>>>> DT (and consequently handling them in the kernel) ?
>>>>>>
>>>>>> The benefit is programming the controller inbound windows correctly.
>>>>>> But if there is a better way to do that, I am open to implement that.
>>>>>> Are there any suggestions / examples of that ?
>>>>>
>>>>> The crucial thing is that once we improve the existing "dma-ranges"
>>>>> handling in the DMA layer such that it *does* consider multiple
>>>>> entries
>>>>> properly, platforms presenting ranges which don't actually exist will
>>>>> almost certainly start going wrong, and are either going to have to
>>>>> fix
>>>>> their broken bootloaders or try to make a case for platform-specific
>>>>> workarounds in core code.
>>>> Again, this is exactly the other way around, the dma-ranges
>>>> populated by
>>>> U-Boot cover only existing DRAM. The single dma-range in Linux DT
>>>> covers
>>>> even the holes without existing DRAM.
>>>>
>>>> So even if the Linux dma-ranges handling changes, there should be no
>>>> problem.
>>>
>>> Say you have a single hardware window, and this DT property (1-cell
>>> numbers for simplicity:
>>>
>>>      dma-ranges = <0x00000000 0x00000000 0x80000000>;
>>>
>>> Driver reads one entry and programs the window to 2GB@0, DMA setup
>>> parses the first entry and sets device masks to 0x7fffffff, and
>>> everything's fine.
>>>
>>> Now say we describe the exact same address range this way instead:
>>>
>>>      dma-ranges = <0x00000000 0x00000000 0x40000000,
>>>                0x40000000 0x40000000 0x40000000>;
>>>
>>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>>> parses the first entry and sets device masks to 0x3fffffff, and *today*,
>>> things are suboptimal but happen to work.
>>>
>>> Now say we finally get round to fixing the of_dma code to properly
>>> generate DMA masks that actually include all usable address bits, a user
>>> upgrades their kernel package, and reboots with that same DT...
>>>
>>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>>> parses all entries and sets device masks to 0x7fffffff, devices start
>>> randomly failing or throwing DMA errors half the time, angry user looks
>>> at the changelog to find that somebody decided their now-corrupted
>>> filesystem is less important than the fact that hey, at least the
>>> machine didn't refuse to boot because the DT was obviously wrong. Are
>>> you sure that shouldn't be a problem?
>>
>> I think you picked a rather special case here and arrived as a DMA mask
>> which just fails in this special case. Such special case doesn't happen
>> here, and even if it did, I would expect Linux to merge those two ranges
>> or do something sane ? If the DMA mask is set incorrectly, that's a bug
>> of the DMA code I would think.
> 
> The mask is not set incorrectly - DMA masks represent the number of
> address bits the device (or intermediate interconnect in the case of the
> bus mask) is capable of driving. Thus when DMA is limited to a specific
> address range, the masks should be wide enough to cover the topmost
> address of that range (unless the device's own capability is inherently
> narrower).

Then the mask should be 0x7fffffff in both cases I'd say.

>> What DMA mask would you get if those two entries had a gap inbetween
>> them ? E.g.:
>>
>>   dma-ranges = <0x00000000 0x00000000 0x20000000,
>>                 0x40000000 0x40000000 0x20000000>;
> 
> OK, here's an real non-simplified example

I would really like an answer to the simple example above before we
start inventing convoluted ones.

> (note that these windows are fixed and not programmed by Linux):
> 
>     dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
>                  <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
>                  <0x43000000 0x8 0x80000000 0x8 0x80000000 0x2 0x00000000>;
> 
> The DMA masks for the devices behind this bridge *should* be 35 bits,
> because that's the size of the largest usable address. Currently,
> however, because of the of_dma code's deficiency they would end up being
> an utterly useless 30 bits, which isn't even enough to reach the start
> of RAM. Thus I can't actually have this property in my DT, and as a
> result I can't enable the IOMMU, because *that* also needs to know the
> ranges in order to reserve the unusable gaps between the windows once
> address translation is in play.

How is this related to this particular patch ? This looks more like some
DMA internal topic.

>>> Now, if you want to read the DT binding as less strict and let it just
>>> describe some arbitrarily-complex set of address ranges that should be
>>> valid for DMA, that's not insurmountable; you just need more complex
>>> logic in your driver capable of calculating how best to cover *all*
>>> those ranges using the available number of windows.
>>
>> That's what the driver does with this patchset, except it's not possible
>> to cover all those ranges. It covers them as well as it can.
> 
> Which means by definition it *doesn't* do what I suggested there...

But then it didn't do this before either, or ... ?

-- 
Best regards,
Marek Vasut

  reply index

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09 17:57 [PATCH V3 1/3] PCI: rcar: Move the inbound index check marek.vasut
2019-08-09 17:57 ` [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges marek.vasut
2019-08-16 13:23   ` Simon Horman
2019-08-16 13:28     ` Marek Vasut
2019-08-16 13:38       ` Simon Horman
2019-08-16 17:41         ` Marek Vasut
2019-10-21 10:18       ` Andrew Murray
2019-10-26 18:03         ` Marek Vasut
2019-10-26 20:36           ` Andrew Murray
2019-10-26 21:06             ` Andrew Murray
2019-11-06 23:37             ` Marek Vasut
2019-11-07 14:19               ` Andrew Murray
2019-11-16 15:48                 ` Marek Vasut
2019-11-18 18:42                   ` Robin Murphy
2019-10-16 15:00   ` Lorenzo Pieralisi
2019-10-16 15:10     ` Marek Vasut
2019-10-16 15:26       ` Lorenzo Pieralisi
2019-10-16 15:29         ` Marek Vasut
2019-10-16 16:18           ` Lorenzo Pieralisi
2019-10-16 18:12             ` Rob Herring
2019-10-16 18:17               ` Marek Vasut
2019-10-16 20:25                 ` Rob Herring
2019-10-16 21:15                   ` Marek Vasut
2019-10-16 22:26                     ` Rob Herring
2019-10-16 22:33                       ` Marek Vasut
2019-10-17  7:06                         ` Geert Uytterhoeven
2019-10-17 10:55                           ` Marek Vasut
2019-10-17 13:06                             ` Robin Murphy
2019-10-17 14:00                               ` Marek Vasut
2019-10-17 14:36                                 ` Rob Herring
2019-10-17 15:01                                   ` Marek Vasut
2019-10-18  9:53                                     ` Lorenzo Pieralisi
2019-10-18 12:22                                       ` Marek Vasut
2019-10-18 12:53                                         ` Robin Murphy
2019-10-18 14:26                                           ` Marek Vasut
2019-10-18 15:44                                             ` Robin Murphy
2019-10-18 16:44                                               ` Marek Vasut
2019-10-18 17:35                                                 ` Robin Murphy
2019-10-18 18:44                                                   ` Marek Vasut [this message]
2019-10-21  8:32                                                     ` Geert Uytterhoeven
2019-10-18 10:06                         ` Andrew Murray
2019-10-18 10:17                           ` Geert Uytterhoeven
2019-10-18 11:40                             ` Andrew Murray
2019-08-09 17:57 ` [PATCH V3 3/3] PCI: rcar: Recalculate inbound range alignment for each controller entry marek.vasut
2019-10-21 10:39   ` Andrew Murray
2019-08-16 10:52 ` [PATCH V3 1/3] PCI: rcar: Move the inbound index check Lorenzo Pieralisi
2019-08-16 10:59   ` Marek Vasut
2019-08-16 11:10     ` Lorenzo Pieralisi
2019-10-15 20:14 ` Marek Vasut
2019-10-21 10:11 ` Andrew Murray

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ae07b1a-bcfa-e48a-e260-da20d6663afc@gmail.com \
    --to=marek.vasut@gmail.com \
    --cc=geert+renesas@glider.be \
    --cc=geert@linux-m68k.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=wsa@the-dreams.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git