From: Marc Zyngier <marc.zyngier@arm.com>
To: Mason <slash.tmp@free.fr>
Cc: Robin Murphy <robin.murphy@arm.com>,
	 Bjorn Helgaas <helgaas@kernel.org>,
	 Mark Rutland <mark.rutland@arm.com>,
	 Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	 linux-pci <linux-pci@vger.kernel.org>,
	 Linux ARM <linux-arm-kernel@lists.infradead.org>,
	 Thibaud Cornic <thibaud_cornic@sigmadesigns.com>,
	 Marc Gonzalez <Marc_Gonzalez@sigmadesigns.com>
Subject: Re: [PATCH v10] PCI: tango: Add MSI controller support
Date: Sat, 26 Aug 2017 14:08:32 +0100	[thread overview]
Message-ID: <86378ewk6n.fsf@arm.com> (raw)
In-Reply-To: <50ba5ba7-6f5d-5a38-9fa3-7ba798165609@free.fr> (Mason's message of "Fri, 25 Aug 2017 18:44:27 +0200")

On Fri, Aug 25 2017 at  6:44:27 pm BST, Mason <slash.tmp@free.fr> wrote:
> On 25/08/2017 17:45, Robin Murphy wrote:
>
>> On 25/08/17 16:35, Mason wrote:
>>
>>> On 25/08/2017 17:25, Robin Murphy wrote:
>>>
>>>> On 25/08/17 16:01, Mason wrote:
>>>>
>>>>> Robin wrote a prophetic post back in March:
>>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/492965.html
>>>>>
>>>>>> The appropriate DT property would be "dma-ranges", i.e.
>>>>>>
>>>>>> pci@... {
>>>>>> 	...
>>>>>> 	dma-ranges = <(PCI bus address) (CPU phys address) (size)>;
>>>>>> }
>>>>>
>>>>> The dma-ranges property seems to be exactly what I'm looking for:
>>>>>
>>>>> Restrict DMA to the first X MB of RAM (use a bounce buffer
>>>>> for other physical addresses).
>>>>>
>>>>> I added the following property to my PCIe node
>>>>>
>>>>>   dma-ranges = <0x0 0x80000000 0x80000000 0x20000000>;
>>>>>
>>>>> with the intent to create a 1:1 mapping for [0x80000000, 0xa0000000[
>>>>>
>>>>> But it does not work. Arg!
>>>>>
>>>>> My PCIe controller driver seems to be correctly calling of_dma_get_range:
>>>>>
>>>>> [ 0.520469] [<c03d85e8>] (of_dma_get_range) from [<c03d5ad8>] (of_dma_configure+0x48/0x234)
>>>>> [ 0.520483] [<c03d5ad8>] (of_dma_configure) from [<c02fa154>] (pci_device_add+0xac/0x350)
>>>>> [ 0.520493] [<c02fa154>] (pci_device_add) from [<c02fa488>] (pci_scan_single_device+0x90/0xb0)
>>>>> [ 0.520501] [<c02fa488>] (pci_scan_single_device) from [<c02fa500>] (pci_scan_slot+0x58/0x100)
>>>>> [ 0.520510] [<c02fa500>] (pci_scan_slot) from [<c02fb418>] (pci_scan_child_bus+0x20/0xf8)
>>>>> [ 0.520519] [<c02fb418>] (pci_scan_child_bus) from [<c02fb6e8>] (pci_scan_root_bus_msi+0xcc/0xd8)
>>>>> [ 0.520527] [<c02fb6e8>] (pci_scan_root_bus_msi) from [<c02fb70c>] (pci_scan_root_bus+0x18/0x20)
>>>>> [ 0.520537] [<c02fb70c>] (pci_scan_root_bus) from [<c0310544>] (pci_host_common_probe+0xc8/0x314)
>>>>> [ 0.520546] [<c0310544>] (pci_host_common_probe) from [<c0310ce8>] (tango_pcie_probe+0x148/0x350)
>>>>> [ 0.520557] [<c0310ce8>] (tango_pcie_probe) from [<c034d398>] (platform_drv_probe+0x34/0x6c)
>>>>>
>>>>> of_dma_get_range() is called on the pcie node (which is expected)
>>>>> but after parsing n_addr_cells and n_size_cells in the while loop,
>>>>> the code jumps to the parent node ("soc")... while my property is
>>>>> attached to the pcie node...
>>>>
>>>> This is not your driver calling of_dma_get_range(), this is the PCI core
>>>> doing so in the act of DMA master configuration for a discovered
>>>> *endpoint*. The fact that the "pass the host controller's OF node
>>>> because we don't have one for the endpoint" bodge only works properly
>>>> for dma-coherent and not dma-ranges is a known, but irrelevant, problem.
>>>>
>>>> If your host controller driver needs to discover its windows from DT to
>>>> configure *itself*, it needs to parse dma-ranges itself; see pcie-iproc,
>>>> pcie-rcar, pcie-xgene, etc. for examples.
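
(For reference, the pattern those drivers follow looks roughly like the
below -- an untested sketch using the generic of_pci_dma_range_parser_init()
/ for_each_of_pci_range() helpers from <linux/of_address.h>; the function
name and the register programming are placeholders, not the actual driver.)

	#include <linux/kernel.h>
	#include <linux/of.h>
	#include <linux/of_address.h>

	/* Illustrative name, not the real probe path. */
	static int tango_pcie_parse_dma_ranges(struct device_node *np)
	{
		struct of_pci_range_parser parser;
		struct of_pci_range range;

		if (of_pci_dma_range_parser_init(&parser, np))
			return -ENOENT;

		/* Each dma-ranges entry: PCI bus address, CPU address, size. */
		for_each_of_pci_range(&parser, &range) {
			pr_info("dma-ranges: pci %#llx -> cpu %#llx size %#llx\n",
				(unsigned long long)range.pci_addr,
				(unsigned long long)range.cpu_addr,
				(unsigned long long)range.size);

			/* Program BAR0.base / region registers from this entry. */
		}

		return 0;
	}
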
>>>
>>> Yes, I'm aware that I need to do my own parsing of dma-ranges.
>>> I can use that information to configure BAR0.base and the
>>> region registers.
>>>
>>> But Linux needs to record my settings at some point, right?
>>> Otherwise, how does the DMA framework know that devices can
>>> only reach cpu addresses [0x80000000, 0xa0000000[ and when
>>> to use bounce buffers?
>>>
>>> What's preventing the XHCI driver from allocating memory
>>> outside of my "safe" range, and having the DMA framework
>>> blindly map that?
>> 
>> At the moment, nothing. Systems that have physical memory that is not
>> visible in PCI mem space are having a bad time and will not go to space
>> today.
>> 
>> But that bears no relation to your MSI controller getting its doorbell
>> address set appropriately.
>
> OK, so this is what I propose for v11 in order to not
> hard code the MSI doorbell address (e.g. 0xa002e07c)
>
> I add the following property to the pcie node:
>
> 	dma-ranges = <0x0 0x80000000 0x80000000 0x20000000>;
>
> I.e. pci_addr = 0x80000000, cpu_addr = 0x80000000, len=0x20000000
>
> Then in the PCIe driver, I parse dma-ranges.
>
> Consequently
>
> 	MSI_doorbell_addr = cpu_addr + len + res.start + 0x7c
>
> Bjorn, Marc, Robin, is that an acceptable solution?
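
Just to make sure I'm reading that right, spelled out in code (untested
sketch, and the variable names below are mine, not the driver's):

	/* range = the dma-ranges entry above, res = the controller's reg resource */
	phys_addr_t msi_doorbell;

	msi_doorbell = range.cpu_addr + range.size	/* end of the 1:1 DMA window */
		     + res->start + 0x7c;		/* + res.start + doorbell offset */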

It seems to work, but I still have my doubts about this BAR0.base and
the associated regions. Are these regions so hardcoded in HW that the RC
cannot DMA outside of this 1GB region? Or can it be reconfigured by some
SW agent to cover more RAM, should someone decide that 1GB is on the
"too little" side?

If the former is true, the HW is remarkably busted and/or inflexible. If
the latter is true, then the dma-ranges property feels very fragile, as
it must be kept in sync with the amount of memory that the system has.

	M.
-- 
Jazz is not dead. It just smells funny.

Thread overview: 44+ messages
2017-08-22 14:56 [PATCH v10] PCI: tango: Add MSI controller support Marc Gonzalez
2017-08-22 16:29 ` Marc Zyngier
2017-08-22 18:02   ` Marc Gonzalez
2017-08-22 20:03     ` Marc Zyngier
2017-08-24 17:04       ` Bjorn Helgaas
2017-08-24 17:51         ` Marc Gonzalez
2017-08-24 18:35           ` Ard Biesheuvel
2017-08-24 20:53             ` Mason
2017-08-25  7:54               ` Marc Zyngier
2017-08-25  8:56                 ` Mason
2017-08-25 15:01                 ` Mason
2017-08-25 15:25                   ` Robin Murphy
2017-08-25 15:35                     ` Mason
2017-08-25 15:45                       ` Robin Murphy
2017-08-25 16:44                         ` Mason
2017-08-26 13:08                           ` Marc Zyngier [this message]
2017-08-26 18:12                             ` Mason
2017-08-28 16:13                     ` Mason
2017-09-18 21:30                       ` Rob Herring
2017-09-19 10:41                         ` Mason
2017-08-23 12:59 ` Marc Gonzalez
2017-08-24 17:01   ` Bjorn Helgaas
