linux-pci.vger.kernel.org archive mirror
* Partial BAR Address Allocation
@ 2017-02-22 18:03 Sinan Kaya
  2017-02-22 18:44 ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Sinan Kaya @ 2017-02-22 18:03 UTC (permalink / raw)
  To: Linux PCI

We are carrying a big burden on our firmware to accommodate 32 bit non-prefetchable memory
for supporting all BAR types. 

Endpoint drivers usually try to allocate both a 64 bit and a 32 bit BAR address
for different platforms but they don't usually need both of them. 

I think the behavior changes from card to card. 

I'm looking for a way to get rid of 32 bit BAR addresses and support 64 bit BAR addresses
only. 

I want to understand the current state of the kernel. Do we know whether all
resources need to be assigned before a driver can become functional? I have seen
some patches to relax this, but I don't know their current status.


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Partial BAR Address Allocation
  2017-02-22 18:03 Partial BAR Address Allocation Sinan Kaya
@ 2017-02-22 18:44 ` Bjorn Helgaas
  2017-02-22 20:44   ` Sinan Kaya
  0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2017-02-22 18:44 UTC (permalink / raw)
  To: Sinan Kaya; +Cc: Linux PCI

On Wed, Feb 22, 2017 at 01:03:32PM -0500, Sinan Kaya wrote:
> We are carrying a big burden on our firmware to accommodate 32 bit
> non-prefetchable memory for supporting all BAR types. 
> 
> Endpoint drivers usually try to allocate both a 64 bit and a 32 bit
> BAR address for different platforms but they don't usually need both
> of them. 
> 
> I think the behavior changes from card to card. 
> 
> I'm looking for a way to get rid of 32 bit BAR addresses and support
> 64 bit BAR addresses only. 
> 
> I want to understand the current state of the kernel. Do we know
> whether all resources need to be assigned before a driver can become
> functional? I have seen some patches to relax this, but I don't know
> their current status.

The kernel tries to assign space for all BARs because it doesn't know
what the driver will require.  It is possible for a driver to claim a
device even if the kernel was unable to assign space for all the BARs.

However, pci_enable_device() will fail if any BARs are unassigned.  If
a driver only requires I/O BARs or only requires memory BARs, it can
use pci_enable_device_io() or pci_enable_device_mem() instead.  Those
will succeed as long as all the I/O BARs (or all the memory BARs) are
assigned.
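
As a rough illustration of the rule above, here is a toy Python model (not
kernel code). The Bar class, the can_enable() helper, and the three-BAR
device are all made up for the example; only the success conditions are
taken from the text.

```python
from dataclasses import dataclass
from typing import List, Set

IO = "io"
MEM = "mem"


@dataclass
class Bar:
    kind: str       # IO or MEM
    assigned: bool  # True if address space was found for this BAR


def can_enable(bars: List[Bar], kinds: Set[str]) -> bool:
    """Enabling succeeds only if every BAR of the requested kinds is assigned."""
    return all(bar.assigned for bar in bars if bar.kind in kinds)


# Hypothetical device: one I/O BAR, one assigned memory BAR,
# and one memory BAR that could not be assigned.
bars = [Bar(IO, True), Bar(MEM, True), Bar(MEM, False)]

enable_all = can_enable(bars, {IO, MEM})  # models pci_enable_device()
enable_io = can_enable(bars, {IO})        # models pci_enable_device_io()
enable_mem = can_enable(bars, {MEM})      # models pci_enable_device_mem()
```

In this model a driver that only needs the I/O BAR can still be enabled,
while any path that turns on memory decoding fails because of the single
unassigned memory BAR.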

There is no way for a driver to say "I only need this memory BAR and
not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
enables *all* the memory BARs; there's no way to enable memory BARs
selectively.  If we enable memory BARs and one of them is unassigned,
that unassigned BAR is enabled, and the device will respond at
whatever address the register happens to contain, and that may cause
conflicts.

I'm not sure this answers your question.  Do you want to get rid of
32-bit BAR addresses because your host bridge doesn't have a window to
32-bit PCI addresses?  It's typical for a bridge to support a window
to the 32-bit PCI space as well as one to the 64-bit PCI space.  Often
it performs address translation for the 32-bit window so it doesn't
have to be in the 32-bit area on the CPU side, e.g., you could have
something like this where we have three host bridges and the 2-4GB
space on each PCI root bus is addressable:

  pci_bus 0000:00: root bus resource [mem 0x1080000000-0x10ffffffff] (bus address [0x80000000-0xffffffff])
  pci_bus 0001:00: root bus resource [mem 0x1180000000-0x11ffffffff] (bus address [0x80000000-0xffffffff])
  pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])
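
To make the translation in those log lines concrete, the following Python
sketch derives the per-bridge offset from each window and translates one
CPU address. The window values are copied from the log lines; the
cpu_to_bus() helper and the sample address are only illustrative.

```python
# (CPU start, CPU end, bus start, bus end), one tuple per root bus window.
windows = [
    (0x1080000000, 0x10ffffffff, 0x80000000, 0xffffffff),
    (0x1180000000, 0x11ffffffff, 0x80000000, 0xffffffff),
    (0x1280000000, 0x12ffffffff, 0x80000000, 0xffffffff),
]


def cpu_to_bus(cpu_addr, window):
    """Translate a CPU physical address to a PCI bus address in one window."""
    cpu_start, cpu_end, bus_start, _bus_end = window
    if not cpu_start <= cpu_addr <= cpu_end:
        raise ValueError("address is outside this host bridge window")
    return cpu_addr - (cpu_start - bus_start)


# Each host bridge applies its own constant offset.
offsets = [cpu_start - bus_start for cpu_start, _, bus_start, _ in windows]

# A sample CPU address inside the second bridge's window.
sample_bus_addr = cpu_to_bus(0x1180001000, windows[1])
```

Each bridge ends up with a different CPU-side offset, yet all three expose
the same 2-4GB range on their own root bus.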

Bjorn


* Re: Partial BAR Address Allocation
  2017-02-22 18:44 ` Bjorn Helgaas
@ 2017-02-22 20:44   ` Sinan Kaya
  2017-02-22 23:39     ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Sinan Kaya @ 2017-02-22 20:44 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Linux PCI

On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> I'm not sure this answers your question.  Do you want to get rid of
> 32-bit BAR addresses because your host bridge doesn't have a window to
> 32-bit PCI addresses?  It's typical for a bridge to support a window
> to the 32-bit PCI space as well as one to the 64-bit PCI space.  Often
> it performs address translation for the 32-bit window so it doesn't
> have to be in the 32-bit area on the CPU side, e.g., you could have
> something like this where we have three host bridges and the 2-4GB
> space on each PCI root bus is addressable:
> 
>   pci_bus 0000:00: root bus resource [mem 0x1080000000-0x10ffffffff] (bus address [0x80000000-0xffffffff])
>   pci_bus 0001:00: root bus resource [mem 0x1180000000-0x11ffffffff] (bus address [0x80000000-0xffffffff])
>   pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])

The problem is that, according to the PCI specification, BAR addresses and DMA
addresses cannot overlap.

From the PCI-to-PCI Bridge Arch. spec.:
"A bridge forwards PCI memory transactions from its primary interface to
its secondary interface (downstream) if a memory address is in the range
defined by the Memory Base and Memory Limit registers (when the base is
less than or equal to the limit) as illustrated in Figure 4-3. Conversely, 
a memory transaction on the secondary interface that is within this address
range will not be forwarded upstream to the primary interface."

To be specific, if your DMA address happens to be in [0x80000000-0xffffffff]
and the root port's aperture includes this range, the DMA will never make it
to system memory.
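
A minimal Python sketch of that forwarding rule, assuming a single memory
window per bridge. The helper name and the two sample DMA addresses are
hypothetical; the window is the one from the log lines quoted above.

```python
def dma_reaches_system_memory(dma_addr, window_start, window_end):
    """Model the quoted rule: a memory transaction arriving on the secondary
    side is forwarded upstream only if it lies outside the bridge window."""
    return not (window_start <= dma_addr <= window_end)


# Bus-address window of the root port.
WINDOW = (0x80000000, 0xffffffff)

# A DMA target inside the window is never forwarded upstream.
inside = dma_reaches_system_memory(0x90000000, *WINDOW)

# A DMA target below the window is forwarded towards system memory.
outside = dma_reaches_system_memory(0x40000000, *WINDOW)
```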

Lorenzo and Robin took some steps to carve PCI addresses out of the DMA
address space in IOMMU drivers by using the iova_reserve_pci_windows() function.

However, I see that we are still exposed when the operating system doesn't
have any IOMMU driver and is using the SWIOTLB for instance. 

The FW solution I'm looking at requires carving out part of the DDR before
OS boot so that the OS doesn't claim that area for DMA.

I'm not very happy with this solution. I'm also surprised that there is
no generic solution in the kernel that takes care of this for all root ports
regardless of IOMMU driver presence.


* Re: Partial BAR Address Allocation
  2017-02-22 20:44   ` Sinan Kaya
@ 2017-02-22 23:39     ` Bjorn Helgaas
  2017-02-23 11:40       ` Robin Murphy
  2017-03-06 11:04       ` Joerg Roedel
  0 siblings, 2 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2017-02-22 23:39 UTC (permalink / raw)
  To: Sinan Kaya; +Cc: Linux PCI, Joerg Roedel, iommu

[+cc Joerg, iommu list]

On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> The problem is that according to PCI specification BAR addresses and
> DMA addresses cannot overlap.
> 
> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
> transactions from its primary interface to its secondary interface
> (downstream) if a memory address is in the range defined by the
> Memory Base and Memory Limit registers (when the base is less than
> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
> memory transaction on the secondary interface that is within this
> address range will not be forwarded upstream to the primary
> interface."
> 
> To be specific, if your DMA address happens to be in
> [0x80000000-0xffffffff] and the root port's aperture includes this
> range, the DMA will never make it to system memory.
> 
> Lorenzo and Robin took some steps to carve out PCI addresses out of
> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
> function.
> 
> However, I see that we are still exposed when the operating system
> doesn't have any IOMMU driver and is using the SWIOTLB for instance. 

Hmmm.  I guess SWIOTLB assumes there's no address translation in the
DMA direction, right?  If there's no address translation in the PIO
direction, PCI bus BAR addresses are identical to the CPU-side
addresses.  In that case, there's no conflict because we already have
to assign BARs so they never look like a system memory address.

But if there *is* address translation in the PIO direction, we can
have conflicts because the bridge can translate CPU-side PIO accesses
to arbitrary PCI bus addresses.

> The FW solution I'm looking at requires carving out some part of the
> DDR from before OS boot so that OS doesn't reclaim that area for
> DMA.

If you want to reach system RAM, I guess you need to make sure you
only DMA to bus addresses outside the host bridge windows, as you said
above.  DMA inside the windows would be handled as peer-to-peer DMA.

> I'm not very happy with this solution. I'm also surprised that there
> is no generic solution in the kernel takes care of this for all root
> ports regardless of IOMMU driver presence.

The PCI core isn't really involved in allocating DMA addresses,
although there definitely is the connection with PCI-to-PCI bridge
windows that you mentioned.  I added IOMMU guys, who would know a lot
more than I do.

Bjorn


* Re: Partial BAR Address Allocation
  2017-02-22 23:39     ` Bjorn Helgaas
@ 2017-02-23 11:40       ` Robin Murphy
  2017-02-23 13:54         ` Sinan Kaya
  2017-03-06 11:04       ` Joerg Roedel
  1 sibling, 1 reply; 9+ messages in thread
From: Robin Murphy @ 2017-02-23 11:40 UTC (permalink / raw)
  To: Bjorn Helgaas, Sinan Kaya; +Cc: Linux PCI, iommu

On 22/02/17 23:39, Bjorn Helgaas wrote:
> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>> However, I see that we are still exposed when the operating system
>> doesn't have any IOMMU driver and is using the SWIOTLB for instance. 
> 
> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
> DMA direction, right?

Not entirely - it does rely on arch-provided dma_to_phys() and
phys_to_dma() helpers which are free to accommodate such translations in
a device-specific manner. On arm64 we use these to account for
dev->dma_pfn_offset describing a straightforward linear offset, but
unless one constant offset would apply to all possible outbound windows
I'm not sure that's much help here.
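
A small Python sketch of such a linear offset, modeled loosely on how
dev->dma_pfn_offset is applied. The 4 KiB page size and the example memory
layout (RAM at CPU address 0x800000000 that devices see at DMA address 0)
are assumptions made for the illustration.

```python
PAGE_SHIFT = 12  # assumes 4 KiB pages


def phys_to_dma(paddr, dma_pfn_offset):
    """CPU physical address -> device-visible DMA address."""
    return paddr - (dma_pfn_offset << PAGE_SHIFT)


def dma_to_phys(dma_addr, dma_pfn_offset):
    """Device-visible DMA address -> CPU physical address."""
    return dma_addr + (dma_pfn_offset << PAGE_SHIFT)


# Hypothetical layout: RAM starts at CPU address 0x800000000 and is seen
# by devices starting at DMA address 0.
RAM_BASE = 0x800000000
dma_pfn_offset = RAM_BASE >> PAGE_SHIFT

dma_addr = phys_to_dma(RAM_BASE + 0x1000, dma_pfn_offset)
round_trip = dma_to_phys(dma_addr, dma_pfn_offset)
```

The same offset is applied to every address, which is why it cannot
describe a setup where different windows need different translations.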

>  If there's no address translation in the PIO
> direction, PCI bus BAR addresses are identical to the CPU-side
> addresses.  In that case, there's no conflict because we already have
> to assign BARs so they never look like a system memory address.
> 
> But if there *is* address translation in the PIO direction, we can
> have conflicts because the bridge can translate CPU-side PIO accesses
> to arbitrary PCI bus addresses.
> 
>> The FW solution I'm looking at requires carving out some part of the
>> DDR from before OS boot so that OS doesn't reclaim that area for
>> DMA.
> 
> If you want to reach system RAM, I guess you need to make sure you
> only DMA to bus addresses outside the host bridge windows, as you said
> above.  DMA inside the windows would be handled as peer-to-peer DMA.
> 
>> I'm not very happy with this solution. I'm also surprised that there
>> is no generic solution in the kernel takes care of this for all root
>> ports regardless of IOMMU driver presence.
> 
> The PCI core isn't really involved in allocating DMA addresses,
> although there definitely is the connection with PCI-to-PCI bridge
> windows that you mentioned.  I added IOMMU guys, who would know a lot
> more than I do.

To me, having the bus addresses of windows shadow assigned physical
addresses sounds mostly like a broken system configuration. Can the
firmware not reprogram them elsewhere, or is the entire bottom 4GB of
the physical memory map occupied by system RAM?

Robin.


* Re: Partial BAR Address Allocation
  2017-02-23 11:40       ` Robin Murphy
@ 2017-02-23 13:54         ` Sinan Kaya
  0 siblings, 0 replies; 9+ messages in thread
From: Sinan Kaya @ 2017-02-23 13:54 UTC (permalink / raw)
  To: Robin Murphy, Bjorn Helgaas; +Cc: Linux PCI, iommu

Hi Robin,

On 2/23/2017 6:40 AM, Robin Murphy wrote:
> On 22/02/17 23:39, Bjorn Helgaas wrote:
>> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
>> DMA direction, right?
> 
> Not entirely - it does rely on arch-provided dma_to_phys() and
> phys_to_dma() helpers which are free to accommodate such translations in
> a device-specific manner. On arm64 we use these to account for
> dev->dma_pfn_offset describing a straightforward linear offset, but
> unless one constant offset would apply to all possible outbound windows
> I'm not sure that's much help here.

Yeah, that won't help. This is a PCI-only problem. An arch-layer solution
would move the DMA ranges of all peripherals in the SoC by a specific offset.
That would be most useful if the entire DDR started at some non-zero offset.

Even then, PCI usually has several ranges: one like this to provide some
space below 4GB, and another, untranslated range for true 64-bit cards.

>>>>   pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])

We have to emulate some range in the first 4GB to make PCI cards happy.

> 
> To me, having the bus addresses of windows shadow assigned physical
> addresses sounds mostly like a broken system configuration. Can the
> firmware not reprogram them elsewhere, or is the entire bottom 4GB of
> the physical memory map occupied by system RAM?

I think your suggestion goes in the same direction: FW moves things
around so that there is a hole in the first 4GB that the OS doesn't
see and that PCI has exclusive access to.

I was looking to see if there is a better solution via some ACPI
table entry like PNP0C02 to tell the OS which range PCI drivers are not
allowed to touch but which could be used for something else.

The problem with a UEFI reserved region is that we prohibit the
region from being used for anything besides PCI. That region
is gone forever.

Another solution, as you suggested, is to move the DDR around so that
I don't need reserved regions.

Thanks for the suggestions,
Sinan


* Re: Partial BAR Address Allocation
  2017-02-22 23:39     ` Bjorn Helgaas
  2017-02-23 11:40       ` Robin Murphy
@ 2017-03-06 11:04       ` Joerg Roedel
  2017-03-08 15:42         ` Sinan Kaya
  2017-03-10 20:14         ` Bjorn Helgaas
  1 sibling, 2 replies; 9+ messages in thread
From: Joerg Roedel @ 2017-03-06 11:04 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Sinan Kaya, Linux PCI, iommu

On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
> [+cc Joerg, iommu list]
> 
> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> > On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > > There is no way for a driver to say "I only need this memory BAR and
> > > not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
> > > enables *all* the memory BARs; there's no way to enable memory BARs
> > > selectively.  If we enable memory BARs and one of them is unassigned,
> > > that unassigned BAR is enabled, and the device will respond at
> > > whatever address the register happens to contain, and that may cause
> > > conflicts.

Hmm, maybe I am missing something, but isn't this only a problem if the
'unassigned' BAR has an address configured that also falls into the
bridge window of the parent bridge? Otherwise no requests should be
routed to the BAR anyway, right?

> > The problem is that according to PCI specification BAR addresses and
> > DMA addresses cannot overlap.
> > 
> > From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
> > transactions from its primary interface to its secondary interface
> > (downstream) if a memory address is in the range defined by the
> > Memory Base and Memory Limit registers (when the base is less than
> > or equal to the limit) as illustrated in Figure 4-3. Conversely, a
> > memory transaction on the secondary interface that is within this
> > address range will not be forwarded upstream to the primary
> > interface."
> > 
> > To be specific, if your DMA address happens to be in
> > [0x80000000-0xffffffff] and the root port's aperture includes this
> > range, the DMA will never make it to system memory.

If there is no translation by an IOMMU this shouldn't be a problem, as
long as the bridge windows don't overlap with system ram. With
translation the IOMMU driver has to take care of that, which they
usually do.

> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
> DMA direction, right?  If there's no address translation in the PIO
> direction, PCI bus BAR addresses are identical to the CPU-side
> addresses.  In that case, there's no conflict because we already have
> to assign BARs so they never look like a system memory address.

Yes, SWIOTLB assumes that IOVA == PA.

> But if there *is* address translation in the PIO direction, we can
> have conflicts because the bridge can translate CPU-side PIO accesses
> to arbitrary PCI bus addresses.

I am not aware of any hardware that does translation on the PIO space.
The IOMMUs I know of don't care about PIO at all.



	Joerg


* Re: Partial BAR Address Allocation
  2017-03-06 11:04       ` Joerg Roedel
@ 2017-03-08 15:42         ` Sinan Kaya
  2017-03-10 20:14         ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Sinan Kaya @ 2017-03-08 15:42 UTC (permalink / raw)
  To: Joerg Roedel, Bjorn Helgaas; +Cc: Linux PCI, iommu

On 3/6/2017 6:04 AM, Joerg Roedel wrote:
> On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
>> [+cc Joerg, iommu list]
>>
>> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>>> There is no way for a driver to say "I only need this memory BAR and
>>>> not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
>>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>>> selectively.  If we enable memory BARs and one of them is unassigned,
>>>> that unassigned BAR is enabled, and the device will respond at
>>>> whatever address the register happens to contain, and that may cause
>>>> conflicts.
> 
> Hmm, maybe I am missing something, but isn't this only a problem if the
> 'unassigned' BAR has an address configured that also falls into the
> bridge window of the parent bridge? Otherwise no requests should be
> routed to the BAR anyway, right?

Correct. For this to happen, you need multiple devices under a bridge.
One device sends a read request towards a system address that happens to
overlap with the address in the unassigned BAR, and the device with the
unassigned resource starts responding. This is one of those P2P use cases.

> 
>>> The problem is that according to PCI specification BAR addresses and
>>> DMA addresses cannot overlap.
>>>
>>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>>> transactions from its primary interface to its secondary interface
>>> (downstream) if a memory address is in the range defined by the
>>> Memory Base and Memory Limit registers (when the base is less than
>>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>>> memory transaction on the secondary interface that is within this
>>> address range will not be forwarded upstream to the primary
>>> interface."
>>>
>>> To be specific, if your DMA address happens to be in
>>> [0x80000000-0xffffffff] and the root port's aperture includes this
>>> range, the DMA will never make it to system memory.
> 
> If there is no translation by an IOMMU this shouldn't be a problem, as
> long as the bridge windows don't overlap with system ram. With
> translation the IOMMU driver has to take care of that, which they
> usually do.

Correct, the IOMMU drivers that I have reviewed all carve the bridge
windows out of the address range the IOMMU driver can allocate from, in
the iova_reserve_pci_windows() function.

> 
>> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
>> DMA direction, right?  If there's no address translation in the PIO
>> direction, PCI bus BAR addresses are identical to the CPU-side
>> addresses.  In that case, there's no conflict because we already have
>> to assign BARs so they never look like a system memory address.
> 
> Yes, SWIOTLB assumes that IOVA == PA.
> 
>> But if there *is* address translation in the PIO direction, we can
>> have conflicts because the bridge can translate CPU-side PIO accesses
>> to arbitrary PCI bus addresses.
> 
> I am not aware of any hardware that does translation on the PIO space.
> The IOMMUs I know of don't care about PIO at all.

IOMMUs are mostly used in the inbound path.

Most HW has some sort of reserved address range in the first 4 GB, with
or without translation, to support existing 32-bit-only cards.
We are talking about a problem in the outbound/PIO path.


* Re: Partial BAR Address Allocation
  2017-03-06 11:04       ` Joerg Roedel
  2017-03-08 15:42         ` Sinan Kaya
@ 2017-03-10 20:14         ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2017-03-10 20:14 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Sinan Kaya, Linux PCI, iommu

On Mon, Mar 06, 2017 at 12:04:39PM +0100, Joerg Roedel wrote:
> On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
> > [+cc Joerg, iommu list]
> > 
> > On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> > > On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > > > There is no way for a driver to say "I only need this memory BAR and
> > > > not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
> > > > enables *all* the memory BARs; there's no way to enable memory BARs
> > > > selectively.  If we enable memory BARs and one of them is unassigned,
> > > > that unassigned BAR is enabled, and the device will respond at
> > > > whatever address the register happens to contain, and that may cause
> > > > conflicts.
> 
> Hmm, maybe I am missing something, but isn't this only a problem if the
> 'unassigned' BAR has an address configured that also falls into the
> bridge window of the parent bridge? Otherwise no requests should be
> routed to the BAR anyway, right?

I guess it's true that we could safely enable a memory BAR if the
upstream bridge would never route anything to it.

But it would depend on the size of the BAR and the upstream bridge's
configuration, so it doesn't feel like it would really be reliable in
general.

> > But if there *is* address translation in the PIO direction, we can
> > have conflicts because the bridge can translate CPU-side PIO accesses
> > to arbitrary PCI bus addresses.
> 
> I am not aware of any hardware that does translation on the PIO space.
> The IOMMUs I know of don't care about PIO at all.

Right, address translation in the PIO direction would be done by the
host bridge, not the IOMMU.  There are a fair number of bridges that
do this -- basically all the callers of pci_add_resource_offset().
They just apply a constant offset, often by chopping off some
high-order bits of the CPU address.
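
For illustration, a short Python sketch of "chopping off high-order bits",
using the CPU-side window addresses quoted earlier in this thread. The
32-bit mask and the chop_high_bits() helper are assumptions for the
example, not a description of any particular bridge.

```python
MASK_32 = 0xffffffff


def chop_high_bits(cpu_addr):
    """Drop everything above bit 31 of the CPU address."""
    return cpu_addr & MASK_32


# Start of the CPU-side window of each of the three host bridges.
cpu_starts = [0x1080000000, 0x1180000000, 0x1280000000]
bus_starts = [chop_high_bits(addr) for addr in cpu_starts]

# Within one window, masking is the same as subtracting a constant offset.
offset0 = cpu_starts[0] - bus_starts[0]
masking_is_constant_offset = all(
    chop_high_bits(addr) == addr - offset0
    for addr in (0x1080000000, 0x10c0000000, 0x10ffffffff)
)
```

All three CPU-side windows land on the same bus address, which only works
because each window belongs to a separate root bus.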

Bjorn

Thread overview: 9 messages
2017-02-22 18:03 Partial BAR Address Allocation Sinan Kaya
2017-02-22 18:44 ` Bjorn Helgaas
2017-02-22 20:44   ` Sinan Kaya
2017-02-22 23:39     ` Bjorn Helgaas
2017-02-23 11:40       ` Robin Murphy
2017-02-23 13:54         ` Sinan Kaya
2017-03-06 11:04       ` Joerg Roedel
2017-03-08 15:42         ` Sinan Kaya
2017-03-10 20:14         ` Bjorn Helgaas
