All of lore.kernel.org
 help / color / mirror / Atom feed
* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 13:57 ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 13:57 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-kernel, linux-arm-kernel, Bjorn Helgaas, Arnd Bergmann,
	Lorenzo Pieralisi, Liviu.Dudau, Magnus

Hi,

I am trying to hook up a PCIe host controller that sits behind an IOMMU,
but having some problems.

I'm using the pcie-rcar PCIe host controller and it works fine without
the IOMMU, and I can attach the IOMMU to the controller such that any calls
to dma_alloc_coherent made by the controller driver uses the iommu_ops
version of dma_ops.

However, I can't see how to make the endpoints to utilise the dma_ops that
the controller uses. Shouldn't the endpoints inherit the dma_ops from the
controller? Any pointers for this?

Thanks
Phil


^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 13:57 ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 13:57 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-kernel, linux-arm-kernel, Bjorn Helgaas, Arnd Bergmann,
	Lorenzo Pieralisi, Liviu.Dudau, Magnus

Hi,

I am trying to hook up a PCIe host controller that sits behind an IOMMU,
but having some problems.

I'm using the pcie-rcar PCIe host controller and it works fine without
the IOMMU, and I can attach the IOMMU to the controller such that any calls
to dma_alloc_coherent made by the controller driver uses the iommu_ops
version of dma_ops.

However, I can't see how to make the endpoints to utilise the dma_ops that
the controller uses. Shouldn't the endpoints inherit the dma_ops from the
controller? Any pointers for this?

Thanks
Phil


^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 13:57 ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I am trying to hook up a PCIe host controller that sits behind an IOMMU,
but having some problems.

I'm using the pcie-rcar PCIe host controller and it works fine without
the IOMMU, and I can attach the IOMMU to the controller such that any calls
to dma_alloc_coherent made by the controller driver uses the iommu_ops
version of dma_ops.

However, I can't see how to make the endpoints to utilise the dma_ops that
the controller uses. Shouldn't the endpoints inherit the dma_ops from the
controller? Any pointers for this?

Thanks
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-04 13:57 ` Phil Edworthy
  (?)
@ 2015-11-04 14:24   ` Liviu.Dudau
  -1 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-04 14:24 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> Hi,
> 
> I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> but having some problems.
> 
> I'm using the pcie-rcar PCIe host controller and it works fine without
> the IOMMU, and I can attach the IOMMU to the controller such that any calls
> to dma_alloc_coherent made by the controller driver uses the iommu_ops
> version of dma_ops.
> 
> However, I can't see how to make the endpoints to utilise the dma_ops that
> the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> controller? 

No, not directly.

> Any pointers for this?

You need to understand the process through which a driver for endpoint get
an address to be passed down to the device. Have a look at
Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
(Hint: EP driver needs to call dma_map_single).

Also, you need to make sure that the bus address that ends up being set into
the endpoint gets translated correctly by the host controller into an address
that the IOMMU can then translate into physical address.

Best regards,
Liviu


> 
> Thanks
> Phil
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 14:24   ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-04 14:24 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> Hi,
> 
> I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> but having some problems.
> 
> I'm using the pcie-rcar PCIe host controller and it works fine without
> the IOMMU, and I can attach the IOMMU to the controller such that any calls
> to dma_alloc_coherent made by the controller driver uses the iommu_ops
> version of dma_ops.
> 
> However, I can't see how to make the endpoints to utilise the dma_ops that
> the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> controller? 

No, not directly.

> Any pointers for this?

You need to understand the process through which a driver for endpoint get
an address to be passed down to the device. Have a look at
Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
(Hint: EP driver needs to call dma_map_single).

Also, you need to make sure that the bus address that ends up being set into
the endpoint gets translated correctly by the host controller into an address
that the IOMMU can then translate into physical address.

Best regards,
Liviu


> 
> Thanks
> Phil
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 14:24   ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau at arm.com @ 2015-11-04 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> Hi,
> 
> I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> but having some problems.
> 
> I'm using the pcie-rcar PCIe host controller and it works fine without
> the IOMMU, and I can attach the IOMMU to the controller such that any calls
> to dma_alloc_coherent made by the controller driver uses the iommu_ops
> version of dma_ops.
> 
> However, I can't see how to make the endpoints to utilise the dma_ops that
> the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> controller? 

No, not directly.

> Any pointers for this?

You need to understand the process through which a driver for endpoint get
an address to be passed down to the device. Have a look at
Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
(Hint: EP driver needs to call dma_map_single).

Also, you need to make sure that the bus address that ends up being set into
the endpoint gets translated correctly by the host controller into an address
that the IOMMU can then translate into physical address.

Best regards,
Liviu


> 
> Thanks
> Phil
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-04 14:24   ` Liviu.Dudau
  (?)
@ 2015-11-04 14:48     ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 14:48 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1932 bytes --]

Hi Liviu,

On 04 November 2015 14:24, Liviu wrote:
> On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > Hi,
> >
> > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > but having some problems.
> >
> > I'm using the pcie-rcar PCIe host controller and it works fine without
> > the IOMMU, and I can attach the IOMMU to the controller such that any calls
> > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > version of dma_ops.
> >
> > However, I can't see how to make the endpoints to utilise the dma_ops that
> > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > controller?
> 
> No, not directly.
> 
> > Any pointers for this?
> 
> You need to understand the process through which a driver for endpoint get
> an address to be passed down to the device. Have a look at
> Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> (Hint: EP driver needs to call dma_map_single).
> 
> Also, you need to make sure that the bus address that ends up being set into
> the endpoint gets translated correctly by the host controller into an address
> that the IOMMU can then translate into physical address.
Sure, though since this is bog standard Intel PCIe ethernet card which works
fine when the IOMMU is effectively unused, I don’t think there is a problem
with that.

The driver for the PCIe controller sets up the IOMMU mapping ok when I
do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
do this, it ends up in arm_iommu_alloc_attrs(), which calls
__iommu_alloc_buffer() and __alloc_iova().

When an endpoint driver allocates and maps a dma coherent buffer it
also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.

Thanks
Phil
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 14:48     ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 14:48 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

SGkgTGl2aXUsDQoNCk9uIDA0IE5vdmVtYmVyIDIwMTUgMTQ6MjQsIExpdml1IHdyb3RlOg0KPiBP
biBXZWQsIE5vdiAwNCwgMjAxNSBhdCAwMTo1Nzo0OFBNICswMDAwLCBQaGlsIEVkd29ydGh5IHdy
b3RlOg0KPiA+IEhpLA0KPiA+DQo+ID4gSSBhbSB0cnlpbmcgdG8gaG9vayB1cCBhIFBDSWUgaG9z
dCBjb250cm9sbGVyIHRoYXQgc2l0cyBiZWhpbmQgYW4gSU9NTVUsDQo+ID4gYnV0IGhhdmluZyBz
b21lIHByb2JsZW1zLg0KPiA+DQo+ID4gSSdtIHVzaW5nIHRoZSBwY2llLXJjYXIgUENJZSBob3N0
IGNvbnRyb2xsZXIgYW5kIGl0IHdvcmtzIGZpbmUgd2l0aG91dA0KPiA+IHRoZSBJT01NVSwgYW5k
IEkgY2FuIGF0dGFjaCB0aGUgSU9NTVUgdG8gdGhlIGNvbnRyb2xsZXIgc3VjaCB0aGF0IGFueSBj
YWxscw0KPiA+IHRvIGRtYV9hbGxvY19jb2hlcmVudCBtYWRlIGJ5IHRoZSBjb250cm9sbGVyIGRy
aXZlciB1c2VzIHRoZSBpb21tdV9vcHMNCj4gPiB2ZXJzaW9uIG9mIGRtYV9vcHMuDQo+ID4NCj4g
PiBIb3dldmVyLCBJIGNhbid0IHNlZSBob3cgdG8gbWFrZSB0aGUgZW5kcG9pbnRzIHRvIHV0aWxp
c2UgdGhlIGRtYV9vcHMgdGhhdA0KPiA+IHRoZSBjb250cm9sbGVyIHVzZXMuIFNob3VsZG4ndCB0
aGUgZW5kcG9pbnRzIGluaGVyaXQgdGhlIGRtYV9vcHMgZnJvbSB0aGUNCj4gPiBjb250cm9sbGVy
Pw0KPiANCj4gTm8sIG5vdCBkaXJlY3RseS4NCj4gDQo+ID4gQW55IHBvaW50ZXJzIGZvciB0aGlz
Pw0KPiANCj4gWW91IG5lZWQgdG8gdW5kZXJzdGFuZCB0aGUgcHJvY2VzcyB0aHJvdWdoIHdoaWNo
IGEgZHJpdmVyIGZvciBlbmRwb2ludCBnZXQNCj4gYW4gYWRkcmVzcyB0byBiZSBwYXNzZWQgZG93
biB0byB0aGUgZGV2aWNlLiBIYXZlIGEgbG9vayBhdA0KPiBEb2N1bWVudGF0aW9uL0RNQS1BUEkt
SE9XVE8udHh0LCB0aGVyZSBpcyBhIG5pY2UgZXhwbGFuYXRpb24gdGhlcmUuDQo+IChIaW50OiBF
UCBkcml2ZXIgbmVlZHMgdG8gY2FsbCBkbWFfbWFwX3NpbmdsZSkuDQo+IA0KPiBBbHNvLCB5b3Ug
bmVlZCB0byBtYWtlIHN1cmUgdGhhdCB0aGUgYnVzIGFkZHJlc3MgdGhhdCBlbmRzIHVwIGJlaW5n
IHNldCBpbnRvDQo+IHRoZSBlbmRwb2ludCBnZXRzIHRyYW5zbGF0ZWQgY29ycmVjdGx5IGJ5IHRo
ZSBob3N0IGNvbnRyb2xsZXIgaW50byBhbiBhZGRyZXNzDQo+IHRoYXQgdGhlIElPTU1VIGNhbiB0
aGVuIHRyYW5zbGF0ZSBpbnRvIHBoeXNpY2FsIGFkZHJlc3MuDQpTdXJlLCB0aG91Z2ggc2luY2Ug
dGhpcyBpcyBib2cgc3RhbmRhcmQgSW50ZWwgUENJZSBldGhlcm5ldCBjYXJkIHdoaWNoIHdvcmtz
DQpmaW5lIHdoZW4gdGhlIElPTU1VIGlzIGVmZmVjdGl2ZWx5IHVudXNlZCwgSSBkb27igJl0IHRo
aW5rIHRoZXJlIGlzIGEgcHJvYmxlbQ0Kd2l0aCB0aGF0Lg0KDQpUaGUgZHJpdmVyIGZvciB0aGUg
UENJZSBjb250cm9sbGVyIHNldHMgdXAgdGhlIElPTU1VIG1hcHBpbmcgb2sgd2hlbiBJDQpkbyBh
IHRlc3QgY2FsbCB0byBkbWFfYWxsb2NfY29oZXJlbnQoKSBpbiB0aGUgY29udHJvbGxlcidzIGRy
aXZlci4gaS5lLiB3aGVuIEkNCmRvIHRoaXMsIGl0IGVuZHMgdXAgaW4gYXJtX2lvbW11X2FsbG9j
X2F0dHJzKCksIHdoaWNoIGNhbGxzDQpfX2lvbW11X2FsbG9jX2J1ZmZlcigpIGFuZCBfX2FsbG9j
X2lvdmEoKS4NCg0KV2hlbiBhbiBlbmRwb2ludCBkcml2ZXIgYWxsb2NhdGVzIGFuZCBtYXBzIGEg
ZG1hIGNvaGVyZW50IGJ1ZmZlciBpdA0KYWxzbyBuZWVkcyB0byBlbmQgdXAgaW4gYXJtX2lvbW11
X2FsbG9jX2F0dHJzKCksIGJ1dCBpdCBkb2Vzbid0Lg0KDQpUaGFua3MNClBoaWwNCg==

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 14:48     ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 14:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Liviu,

On 04 November 2015 14:24, Liviu wrote:
> On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > Hi,
> >
> > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > but having some problems.
> >
> > I'm using the pcie-rcar PCIe host controller and it works fine without
> > the IOMMU, and I can attach the IOMMU to the controller such that any calls
> > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > version of dma_ops.
> >
> > However, I can't see how to make the endpoints to utilise the dma_ops that
> > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > controller?
> 
> No, not directly.
> 
> > Any pointers for this?
> 
> You need to understand the process through which a driver for endpoint get
> an address to be passed down to the device. Have a look at
> Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> (Hint: EP driver needs to call dma_map_single).
> 
> Also, you need to make sure that the bus address that ends up being set into
> the endpoint gets translated correctly by the host controller into an address
> that the IOMMU can then translate into physical address.
Sure, though since this is bog standard Intel PCIe ethernet card which works
fine when the IOMMU is effectively unused, I don?t think there is a problem
with that.

The driver for the PCIe controller sets up the IOMMU mapping ok when I
do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
do this, it ends up in arm_iommu_alloc_attrs(), which calls
__iommu_alloc_buffer() and __alloc_iova().

When an endpoint driver allocates and maps a dma coherent buffer it
also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.

Thanks
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-04 14:48     ` Phil Edworthy
  (?)
@ 2015-11-04 15:01       ` Liviu.Dudau
  -1 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-04 15:01 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> Hi Liviu,
> 
> On 04 November 2015 14:24, Liviu wrote:
> > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > Hi,
> > >
> > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > but having some problems.
> > >
> > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > the IOMMU, and I can attach the IOMMU to the controller such that any calls
> > > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > > version of dma_ops.
> > >
> > > However, I can't see how to make the endpoints to utilise the dma_ops that
> > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > controller?
> > 
> > No, not directly.
> > 
> > > Any pointers for this?
> > 
> > You need to understand the process through which a driver for endpoint get
> > an address to be passed down to the device. Have a look at
> > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > (Hint: EP driver needs to call dma_map_single).
> > 
> > Also, you need to make sure that the bus address that ends up being set into
> > the endpoint gets translated correctly by the host controller into an address
> > that the IOMMU can then translate into physical address.
> Sure, though since this is bog standard Intel PCIe ethernet card which works
> fine when the IOMMU is effectively unused, I don’t think there is a problem
> with that.
> 
> The driver for the PCIe controller sets up the IOMMU mapping ok when I
> do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> do this, it ends up in arm_iommu_alloc_attrs(), which calls
> __iommu_alloc_buffer() and __alloc_iova().
> 
> When an endpoint driver allocates and maps a dma coherent buffer it
> also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.

Why do you think that? Remember that the only thing attached to the IOMMU is the
host controller. The endpoint is on the PCIe bus, which gets a different translation
that the IOMMU knows nothing about. If it helps you to visualise it better, think
of the host controller as another IOMMU device. It's the ops of the host controller
that should be invoked, not the IOMMU's.

Best regards,
Liviu

> 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:01       ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-04 15:01 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> Hi Liviu,
> 
> On 04 November 2015 14:24, Liviu wrote:
> > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > Hi,
> > >
> > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > but having some problems.
> > >
> > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > the IOMMU, and I can attach the IOMMU to the controller such that any calls
> > > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > > version of dma_ops.
> > >
> > > However, I can't see how to make the endpoints to utilise the dma_ops that
> > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > controller?
> > 
> > No, not directly.
> > 
> > > Any pointers for this?
> > 
> > You need to understand the process through which a driver for endpoint get
> > an address to be passed down to the device. Have a look at
> > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > (Hint: EP driver needs to call dma_map_single).
> > 
> > Also, you need to make sure that the bus address that ends up being set into
> > the endpoint gets translated correctly by the host controller into an address
> > that the IOMMU can then translate into physical address.
> Sure, though since this is bog standard Intel PCIe ethernet card which works
> fine when the IOMMU is effectively unused, I don’t think there is a problem
> with that.
> 
> The driver for the PCIe controller sets up the IOMMU mapping ok when I
> do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> do this, it ends up in arm_iommu_alloc_attrs(), which calls
> __iommu_alloc_buffer() and __alloc_iova().
> 
> When an endpoint driver allocates and maps a dma coherent buffer it
> also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.

Why do you think that? Remember that the only thing attached to the IOMMU is the
host controller. The endpoint is on the PCIe bus, which gets a different translation
that the IOMMU knows nothing about. If it helps you to visualise it better, think
of the host controller as another IOMMU device. It's the ops of the host controller
that should be invoked, not the IOMMU's.

Best regards,
Liviu

> 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:01       ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau at arm.com @ 2015-11-04 15:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> Hi Liviu,
> 
> On 04 November 2015 14:24, Liviu wrote:
> > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > Hi,
> > >
> > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > but having some problems.
> > >
> > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > the IOMMU, and I can attach the IOMMU to the controller such that any calls
> > > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > > version of dma_ops.
> > >
> > > However, I can't see how to make the endpoints to utilise the dma_ops that
> > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > controller?
> > 
> > No, not directly.
> > 
> > > Any pointers for this?
> > 
> > You need to understand the process through which a driver for endpoint get
> > an address to be passed down to the device. Have a look at
> > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > (Hint: EP driver needs to call dma_map_single).
> > 
> > Also, you need to make sure that the bus address that ends up being set into
> > the endpoint gets translated correctly by the host controller into an address
> > that the IOMMU can then translate into physical address.
> Sure, though since this is bog standard Intel PCIe ethernet card which works
> fine when the IOMMU is effectively unused, I don?t think there is a problem
> with that.
> 
> The driver for the PCIe controller sets up the IOMMU mapping ok when I
> do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> do this, it ends up in arm_iommu_alloc_attrs(), which calls
> __iommu_alloc_buffer() and __alloc_iova().
> 
> When an endpoint driver allocates and maps a dma coherent buffer it
> also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.

Why do you think that? Remember that the only thing attached to the IOMMU is the
host controller. The endpoint is on the PCIe bus, which gets a different translation
that the IOMMU knows nothing about. If it helps you to visualise it better, think
of the host controller as another IOMMU device. It's the ops of the host controller
that should be invoked, not the IOMMU's.

Best regards,
Liviu

> 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-04 15:01       ` Liviu.Dudau
  (?)
@ 2015-11-04 15:19         ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 15:19 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2683 bytes --]

Hi Liviu,

On 04 November 2015 15:02, Liviu wrote:
> On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > Hi Liviu,
> >
> > On 04 November 2015 14:24, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > Hi,
> > > >
> > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > but having some problems.
> > > >
> > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> calls
> > > > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > > > version of dma_ops.
> > > >
> > > > However, I can't see how to make the endpoints to utilise the dma_ops that
> > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > controller?
> > >
> > > No, not directly.
> > >
> > > > Any pointers for this?
> > >
> > > You need to understand the process through which a driver for endpoint get
> > > an address to be passed down to the device. Have a look at
> > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > (Hint: EP driver needs to call dma_map_single).
> > >
> > > Also, you need to make sure that the bus address that ends up being set into
> > > the endpoint gets translated correctly by the host controller into an address
> > > that the IOMMU can then translate into physical address.
> > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > with that.
> >
> > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > __iommu_alloc_buffer() and __alloc_iova().
> >
> > When an endpoint driver allocates and maps a dma coherent buffer it
> > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> 
> Why do you think that? Remember that the only thing attached to the IOMMU is
> the
> host controller. The endpoint is on the PCIe bus, which gets a different
> translation
> that the IOMMU knows nothing about. If it helps you to visualise it better, think
> of the host controller as another IOMMU device. It's the ops of the host
> controller
> that should be invoked, not the IOMMU's.
Ok, that makes sense. I'll have a think and poke it a bit more...

Thanks for your comments
Phil
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:19         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 15:19 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

SGkgTGl2aXUsDQoNCk9uIDA0IE5vdmVtYmVyIDIwMTUgMTU6MDIsIExpdml1IHdyb3RlOg0KPiBP
biBXZWQsIE5vdiAwNCwgMjAxNSBhdCAwMjo0ODozOFBNICswMDAwLCBQaGlsIEVkd29ydGh5IHdy
b3RlOg0KPiA+IEhpIExpdml1LA0KPiA+DQo+ID4gT24gMDQgTm92ZW1iZXIgMjAxNSAxNDoyNCwg
TGl2aXUgd3JvdGU6DQo+ID4gPiBPbiBXZWQsIE5vdiAwNCwgMjAxNSBhdCAwMTo1Nzo0OFBNICsw
MDAwLCBQaGlsIEVkd29ydGh5IHdyb3RlOg0KPiA+ID4gPiBIaSwNCj4gPiA+ID4NCj4gPiA+ID4g
SSBhbSB0cnlpbmcgdG8gaG9vayB1cCBhIFBDSWUgaG9zdCBjb250cm9sbGVyIHRoYXQgc2l0cyBi
ZWhpbmQgYW4gSU9NTVUsDQo+ID4gPiA+IGJ1dCBoYXZpbmcgc29tZSBwcm9ibGVtcy4NCj4gPiA+
ID4NCj4gPiA+ID4gSSdtIHVzaW5nIHRoZSBwY2llLXJjYXIgUENJZSBob3N0IGNvbnRyb2xsZXIg
YW5kIGl0IHdvcmtzIGZpbmUgd2l0aG91dA0KPiA+ID4gPiB0aGUgSU9NTVUsIGFuZCBJIGNhbiBh
dHRhY2ggdGhlIElPTU1VIHRvIHRoZSBjb250cm9sbGVyIHN1Y2ggdGhhdCBhbnkNCj4gY2FsbHMN
Cj4gPiA+ID4gdG8gZG1hX2FsbG9jX2NvaGVyZW50IG1hZGUgYnkgdGhlIGNvbnRyb2xsZXIgZHJp
dmVyIHVzZXMgdGhlIGlvbW11X29wcw0KPiA+ID4gPiB2ZXJzaW9uIG9mIGRtYV9vcHMuDQo+ID4g
PiA+DQo+ID4gPiA+IEhvd2V2ZXIsIEkgY2FuJ3Qgc2VlIGhvdyB0byBtYWtlIHRoZSBlbmRwb2lu
dHMgdG8gdXRpbGlzZSB0aGUgZG1hX29wcyB0aGF0DQo+ID4gPiA+IHRoZSBjb250cm9sbGVyIHVz
ZXMuIFNob3VsZG4ndCB0aGUgZW5kcG9pbnRzIGluaGVyaXQgdGhlIGRtYV9vcHMgZnJvbSB0aGUN
Cj4gPiA+ID4gY29udHJvbGxlcj8NCj4gPiA+DQo+ID4gPiBObywgbm90IGRpcmVjdGx5Lg0KPiA+
ID4NCj4gPiA+ID4gQW55IHBvaW50ZXJzIGZvciB0aGlzPw0KPiA+ID4NCj4gPiA+IFlvdSBuZWVk
IHRvIHVuZGVyc3RhbmQgdGhlIHByb2Nlc3MgdGhyb3VnaCB3aGljaCBhIGRyaXZlciBmb3IgZW5k
cG9pbnQgZ2V0DQo+ID4gPiBhbiBhZGRyZXNzIHRvIGJlIHBhc3NlZCBkb3duIHRvIHRoZSBkZXZp
Y2UuIEhhdmUgYSBsb29rIGF0DQo+ID4gPiBEb2N1bWVudGF0aW9uL0RNQS1BUEktSE9XVE8udHh0
LCB0aGVyZSBpcyBhIG5pY2UgZXhwbGFuYXRpb24gdGhlcmUuDQo+ID4gPiAoSGludDogRVAgZHJp
dmVyIG5lZWRzIHRvIGNhbGwgZG1hX21hcF9zaW5nbGUpLg0KPiA+ID4NCj4gPiA+IEFsc28sIHlv
dSBuZWVkIHRvIG1ha2Ugc3VyZSB0aGF0IHRoZSBidXMgYWRkcmVzcyB0aGF0IGVuZHMgdXAgYmVp
bmcgc2V0IGludG8NCj4gPiA+IHRoZSBlbmRwb2ludCBnZXRzIHRyYW5zbGF0ZWQgY29ycmVjdGx5
IGJ5IHRoZSBob3N0IGNvbnRyb2xsZXIgaW50byBhbiBhZGRyZXNzDQo+ID4gPiB0aGF0IHRoZSBJ
T01NVSBjYW4gdGhlbiB0cmFuc2xhdGUgaW50byBwaHlzaWNhbCBhZGRyZXNzLg0KPiA+IFN1cmUs
IHRob3VnaCBzaW5jZSB0aGlzIGlzIGJvZyBzdGFuZGFyZCBJbnRlbCBQQ0llIGV0aGVybmV0IGNh
cmQgd2hpY2ggd29ya3MNCj4gPiBmaW5lIHdoZW4gdGhlIElPTU1VIGlzIGVmZmVjdGl2ZWx5IHVu
dXNlZCwgSSBkb27igJl0IHRoaW5rIHRoZXJlIGlzIGEgcHJvYmxlbQ0KPiA+IHdpdGggdGhhdC4N
Cj4gPg0KPiA+IFRoZSBkcml2ZXIgZm9yIHRoZSBQQ0llIGNvbnRyb2xsZXIgc2V0cyB1cCB0aGUg
SU9NTVUgbWFwcGluZyBvayB3aGVuIEkNCj4gPiBkbyBhIHRlc3QgY2FsbCB0byBkbWFfYWxsb2Nf
Y29oZXJlbnQoKSBpbiB0aGUgY29udHJvbGxlcidzIGRyaXZlci4gaS5lLiB3aGVuIEkNCj4gPiBk
byB0aGlzLCBpdCBlbmRzIHVwIGluIGFybV9pb21tdV9hbGxvY19hdHRycygpLCB3aGljaCBjYWxs
cw0KPiA+IF9faW9tbXVfYWxsb2NfYnVmZmVyKCkgYW5kIF9fYWxsb2NfaW92YSgpLg0KPiA+DQo+
ID4gV2hlbiBhbiBlbmRwb2ludCBkcml2ZXIgYWxsb2NhdGVzIGFuZCBtYXBzIGEgZG1hIGNvaGVy
ZW50IGJ1ZmZlciBpdA0KPiA+IGFsc28gbmVlZHMgdG8gZW5kIHVwIGluIGFybV9pb21tdV9hbGxv
Y19hdHRycygpLCBidXQgaXQgZG9lc24ndC4NCj4gDQo+IFdoeSBkbyB5b3UgdGhpbmsgdGhhdD8g
UmVtZW1iZXIgdGhhdCB0aGUgb25seSB0aGluZyBhdHRhY2hlZCB0byB0aGUgSU9NTVUgaXMNCj4g
dGhlDQo+IGhvc3QgY29udHJvbGxlci4gVGhlIGVuZHBvaW50IGlzIG9uIHRoZSBQQ0llIGJ1cywg
d2hpY2ggZ2V0cyBhIGRpZmZlcmVudA0KPiB0cmFuc2xhdGlvbg0KPiB0aGF0IHRoZSBJT01NVSBr
bm93cyBub3RoaW5nIGFib3V0LiBJZiBpdCBoZWxwcyB5b3UgdG8gdmlzdWFsaXNlIGl0IGJldHRl
ciwgdGhpbmsNCj4gb2YgdGhlIGhvc3QgY29udHJvbGxlciBhcyBhbm90aGVyIElPTU1VIGRldmlj
ZS4gSXQncyB0aGUgb3BzIG9mIHRoZSBob3N0DQo+IGNvbnRyb2xsZXINCj4gdGhhdCBzaG91bGQg
YmUgaW52b2tlZCwgbm90IHRoZSBJT01NVSdzLg0KT2ssIHRoYXQgbWFrZXMgc2Vuc2UuIEknbGwg
aGF2ZSBhIHRoaW5rIGFuZCBwb2tlIGl0IGEgYml0IG1vcmUuLi4NCg0KVGhhbmtzIGZvciB5b3Vy
IGNvbW1lbnRzDQpQaGlsDQo=

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:19         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Liviu,

On 04 November 2015 15:02, Liviu wrote:
> On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > Hi Liviu,
> >
> > On 04 November 2015 14:24, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > Hi,
> > > >
> > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > but having some problems.
> > > >
> > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> calls
> > > > to dma_alloc_coherent made by the controller driver uses the iommu_ops
> > > > version of dma_ops.
> > > >
> > > > However, I can't see how to make the endpoints to utilise the dma_ops that
> > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > controller?
> > >
> > > No, not directly.
> > >
> > > > Any pointers for this?
> > >
> > > You need to understand the process through which a driver for endpoint get
> > > an address to be passed down to the device. Have a look at
> > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > (Hint: EP driver needs to call dma_map_single).
> > >
> > > Also, you need to make sure that the bus address that ends up being set into
> > > the endpoint gets translated correctly by the host controller into an address
> > > that the IOMMU can then translate into physical address.
> > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > fine when the IOMMU is effectively unused, I don?t think there is a problem
> > with that.
> >
> > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > __iommu_alloc_buffer() and __alloc_iova().
> >
> > When an endpoint driver allocates and maps a dma coherent buffer it
> > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> 
> Why do you think that? Remember that the only thing attached to the IOMMU is
> the
> host controller. The endpoint is on the PCIe bus, which gets a different
> translation
> that the IOMMU knows nothing about. If it helps you to visualise it better, think
> of the host controller as another IOMMU device. It's the ops of the host
> controller
> that should be invoked, not the IOMMU's.
Ok, that makes sense. I'll have a think and poke it a bit more...

Thanks for your comments
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-04 15:19         ` Phil Edworthy
  (?)
@ 2015-11-04 15:30           ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2015-11-04 15:30 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Arnd Bergmann, linux-pci, Magnus,
	linux-kernel, Bjorn Helgaas, linux-arm-kernel

On Wed, Nov 04, 2015 at 03:19:13PM +0000, Phil Edworthy wrote:
> On 04 November 2015 15:02, Liviu wrote:
> > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > with that.
> > >
> > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > __iommu_alloc_buffer() and __alloc_iova().
> > >
> > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > 
> > Why do you think that? Remember that the only thing attached to the IOMMU is
> > the
> > host controller. The endpoint is on the PCIe bus, which gets a different
> > translation
> > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > of the host controller as another IOMMU device. It's the ops of the host
> > controller
> > that should be invoked, not the IOMMU's.
> Ok, that makes sense. I'll have a think and poke it a bit more...

Take a look at of_iommu_configure, which is currently lacking support
for PCI devices. It should be using a variant on the device-tree bindings
already in use for describing MSI device IDs, so that we can translate
the RequesterID of the endpoint into an ID that the IOMMU can understand.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:30           ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2015-11-04 15:30 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Arnd Bergmann, linux-pci, Magnus,
	linux-kernel, Bjorn Helgaas, linux-arm-kernel

On Wed, Nov 04, 2015 at 03:19:13PM +0000, Phil Edworthy wrote:
> On 04 November 2015 15:02, Liviu wrote:
> > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > with that.
> > >
> > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > __iommu_alloc_buffer() and __alloc_iova().
> > >
> > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > 
> > Why do you think that? Remember that the only thing attached to the IOMMU is
> > the
> > host controller. The endpoint is on the PCIe bus, which gets a different
> > translation
> > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > of the host controller as another IOMMU device. It's the ops of the host
> > controller
> > that should be invoked, not the IOMMU's.
> Ok, that makes sense. I'll have a think and poke it a bit more...

Take a look at of_iommu_configure, which is currently lacking support
for PCI devices. It should be using a variant on the device-tree bindings
already in use for describing MSI device IDs, so that we can translate
the RequesterID of the endpoint into an ID that the IOMMU can understand.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 15:30           ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2015-11-04 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 04, 2015 at 03:19:13PM +0000, Phil Edworthy wrote:
> On 04 November 2015 15:02, Liviu wrote:
> > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > fine when the IOMMU is effectively unused, I don?t think there is a problem
> > > with that.
> > >
> > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > __iommu_alloc_buffer() and __alloc_iova().
> > >
> > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > 
> > Why do you think that? Remember that the only thing attached to the IOMMU is
> > the
> > host controller. The endpoint is on the PCIe bus, which gets a different
> > translation
> > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > of the host controller as another IOMMU device. It's the ops of the host
> > controller
> > that should be invoked, not the IOMMU's.
> Ok, that makes sense. I'll have a think and poke it a bit more...

Take a look at of_iommu_configure, which is currently lacking support
for PCI devices. It should be using a variant on the device-tree bindings
already in use for describing MSI device IDs, so that we can translate
the RequesterID of the endpoint into an ID that the IOMMU can understand.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-04 15:30           ` Will Deacon
  (?)
@ 2015-11-04 18:02             ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 18:02 UTC (permalink / raw)
  To: Will Deacon
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Arnd Bergmann, linux-pci, Magnus,
	linux-kernel, Bjorn Helgaas, linux-arm-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1991 bytes --]

Hi Will,

On 04 November 2015 15:30, Will wrote:
> On Wed, Nov 04, 2015 at 03:19:13PM +0000, Phil Edworthy wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the
> IOMMU is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...
> 
> Take a look at of_iommu_configure, which is currently lacking support
> for PCI devices. It should be using a variant on the device-tree bindings
> already in use for describing MSI device IDs, so that we can translate
> the RequesterID of the endpoint into an ID that the IOMMU can understand.
Whilst we want to introduce isolation at some point, right now we would like
to use the IOMMU as this PCIe controller only uses a 32-bit AXI address.

Thanks
Phil
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-04 18:02             ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 18:02 UTC (permalink / raw)
  To: Will Deacon
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Arnd Bergmann, linux-pci, Magnus,
	linux-kernel, Bjorn Helgaas, linux-arm-kernel

SGkgV2lsbCwNCg0KT24gMDQgTm92ZW1iZXIgMjAxNSAxNTozMCwgV2lsbCB3cm90ZToNCj4gT24g
V2VkLCBOb3YgMDQsIDIwMTUgYXQgMDM6MTk6MTNQTSArMDAwMCwgUGhpbCBFZHdvcnRoeSB3cm90
ZToNCj4gPiBPbiAwNCBOb3ZlbWJlciAyMDE1IDE1OjAyLCBMaXZpdSB3cm90ZToNCj4gPiA+IE9u
IFdlZCwgTm92IDA0LCAyMDE1IGF0IDAyOjQ4OjM4UE0gKzAwMDAsIFBoaWwgRWR3b3J0aHkgd3Jv
dGU6DQo+ID4gPiA+IFN1cmUsIHRob3VnaCBzaW5jZSB0aGlzIGlzIGJvZyBzdGFuZGFyZCBJbnRl
bCBQQ0llIGV0aGVybmV0IGNhcmQgd2hpY2ggd29ya3MNCj4gPiA+ID4gZmluZSB3aGVuIHRoZSBJ
T01NVSBpcyBlZmZlY3RpdmVseSB1bnVzZWQsIEkgZG9u4oCZdCB0aGluayB0aGVyZSBpcyBhIHBy
b2JsZW0NCj4gPiA+ID4gd2l0aCB0aGF0Lg0KPiA+ID4gPg0KPiA+ID4gPiBUaGUgZHJpdmVyIGZv
ciB0aGUgUENJZSBjb250cm9sbGVyIHNldHMgdXAgdGhlIElPTU1VIG1hcHBpbmcgb2sgd2hlbiBJ
DQo+ID4gPiA+IGRvIGEgdGVzdCBjYWxsIHRvIGRtYV9hbGxvY19jb2hlcmVudCgpIGluIHRoZSBj
b250cm9sbGVyJ3MgZHJpdmVyLiBpLmUuIHdoZW4gSQ0KPiA+ID4gPiBkbyB0aGlzLCBpdCBlbmRz
IHVwIGluIGFybV9pb21tdV9hbGxvY19hdHRycygpLCB3aGljaCBjYWxscw0KPiA+ID4gPiBfX2lv
bW11X2FsbG9jX2J1ZmZlcigpIGFuZCBfX2FsbG9jX2lvdmEoKS4NCj4gPiA+ID4NCj4gPiA+ID4g
V2hlbiBhbiBlbmRwb2ludCBkcml2ZXIgYWxsb2NhdGVzIGFuZCBtYXBzIGEgZG1hIGNvaGVyZW50
IGJ1ZmZlciBpdA0KPiA+ID4gPiBhbHNvIG5lZWRzIHRvIGVuZCB1cCBpbiBhcm1faW9tbXVfYWxs
b2NfYXR0cnMoKSwgYnV0IGl0IGRvZXNuJ3QuDQo+ID4gPg0KPiA+ID4gV2h5IGRvIHlvdSB0aGlu
ayB0aGF0PyBSZW1lbWJlciB0aGF0IHRoZSBvbmx5IHRoaW5nIGF0dGFjaGVkIHRvIHRoZQ0KPiBJ
T01NVSBpcw0KPiA+ID4gdGhlDQo+ID4gPiBob3N0IGNvbnRyb2xsZXIuIFRoZSBlbmRwb2ludCBp
cyBvbiB0aGUgUENJZSBidXMsIHdoaWNoIGdldHMgYSBkaWZmZXJlbnQNCj4gPiA+IHRyYW5zbGF0
aW9uDQo+ID4gPiB0aGF0IHRoZSBJT01NVSBrbm93cyBub3RoaW5nIGFib3V0LiBJZiBpdCBoZWxw
cyB5b3UgdG8gdmlzdWFsaXNlIGl0IGJldHRlciwNCj4gdGhpbmsNCj4gPiA+IG9mIHRoZSBob3N0
IGNvbnRyb2xsZXIgYXMgYW5vdGhlciBJT01NVSBkZXZpY2UuIEl0J3MgdGhlIG9wcyBvZiB0aGUg
aG9zdA0KPiA+ID4gY29udHJvbGxlcg0KPiA+ID4gdGhhdCBzaG91bGQgYmUgaW52b2tlZCwgbm90
IHRoZSBJT01NVSdzLg0KPiA+IE9rLCB0aGF0IG1ha2VzIHNlbnNlLiBJJ2xsIGhhdmUgYSB0aGlu
ayBhbmQgcG9rZSBpdCBhIGJpdCBtb3JlLi4uDQo+IA0KPiBUYWtlIGEgbG9vayBhdCBvZl9pb21t
dV9jb25maWd1cmUsIHdoaWNoIGlzIGN1cnJlbnRseSBsYWNraW5nIHN1cHBvcnQNCj4gZm9yIFBD
SSBkZXZpY2VzLiBJdCBzaG91bGQgYmUgdXNpbmcgYSB2YXJpYW50IG9uIHRoZSBkZXZpY2UtdHJl
ZSBiaW5kaW5ncw0KPiBhbHJlYWR5IGluIHVzZSBmb3IgZGVzY3JpYmluZyBNU0kgZGV2aWNlIElE
cywgc28gdGhhdCB3ZSBjYW4gdHJhbnNsYXRlDQo+IHRoZSBSZXF1ZXN0ZXJJRCBvZiB0aGUgZW5k
cG9pbnQgaW50byBhbiBJRCB0aGF0IHRoZSBJT01NVSBjYW4gdW5kZXJzdGFuZC4NCldoaWxzdCB3
ZSB3YW50IHRvIGludHJvZHVjZSBpc29sYXRpb24gYXQgc29tZSBwb2ludCwgcmlnaHQgbm93IHdl
IHdvdWxkIGxpa2UNCnRvIHVzZSB0aGUgSU9NTVUgYXMgdGhpcyBQQ0llIGNvbnRyb2xsZXIgb25s
eSB1c2VzIGEgMzItYml0IEFYSSBhZGRyZXNzLg0KDQpUaGFua3MNClBoaWwNCg==

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-04 18:02             ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-04 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 04 November 2015 15:30, Will wrote:
> On Wed, Nov 04, 2015 at 03:19:13PM +0000, Phil Edworthy wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I don?t think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the
> IOMMU is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...
> 
> Take a look at of_iommu_configure, which is currently lacking support
> for PCI devices. It should be using a variant on the device-tree bindings
> already in use for describing MSI device IDs, so that we can translate
> the RequesterID of the endpoint into an ID that the IOMMU can understand.
Whilst we want to introduce isolation at some point, right now we would like
to use the IOMMU as this PCIe controller only uses a 32-bit AXI address.

Thanks
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-04 15:01       ` Liviu.Dudau
  (?)
@ 2015-11-09 12:32         ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-09 12:32 UTC (permalink / raw)
  To: Liviu.Dudau, Will Deacon
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3991 bytes --]

Hi Liviu, Will,

On 04 November 2015 15:19, Phil wrote:
> On 04 November 2015 15:02, Liviu wrote:
> > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > Hi Liviu,
> > >
> > > On 04 November 2015 14:24, Liviu wrote:
> > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > Hi,
> > > > >
> > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > but having some problems.
> > > > >
> > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > calls
> > > > > to dma_alloc_coherent made by the controller driver uses the
> iommu_ops
> > > > > version of dma_ops.
> > > > >
> > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> that
> > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > controller?
> > > >
> > > > No, not directly.
> > > >
> > > > > Any pointers for this?
> > > >
> > > > You need to understand the process through which a driver for endpoint
> get
> > > > an address to be passed down to the device. Have a look at
> > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > (Hint: EP driver needs to call dma_map_single).
> > > >
> > > > Also, you need to make sure that the bus address that ends up being set
> into
> > > > the endpoint gets translated correctly by the host controller into an address
> > > > that the IOMMU can then translate into physical address.
> > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > with that.
> > >
> > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > __iommu_alloc_buffer() and __alloc_iova().
> > >
> > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> >
> > Why do you think that? Remember that the only thing attached to the IOMMU
> is
> > the
> > host controller. The endpoint is on the PCIe bus, which gets a different
> > translation
> > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > of the host controller as another IOMMU device. It's the ops of the host
> > controller
> > that should be invoked, not the IOMMU's.
> Ok, that makes sense. I'll have a think and poke it a bit more...
Somewhat related to this, since our PCIe controller HW is limited to
32-bit AXI address range, before trying to hook up the IOMMU I have
tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
reason being that Linux uses a 1 to 1 mapping between PCI addresses
and cpu (phys) addresses when there isn't an IOMMU involved, so I
think that we need to limit the PCI address space used.

Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
PCIe controller driver so I can change the mask, if needed, on the
BUS_NOTIFY_BOUND_DRIVER action.
However, I think there is the potential for card drivers to allocate and
map buffers before the bus notifier get called. Additionally, I've seen
drivers change their behaviour based on the success or failure of
dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
driver could, theoretically at least, operate in a way that is not
compatible with a more restricted dma_mask (though I can't think
of any way this would not work with hardware I've seen).

So, I think that using a bus notifier is the wrong way to go, but I don’t
know what other options I have. Any suggestions?

Thanks for your help
Phil
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-09 12:32         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-09 12:32 UTC (permalink / raw)
  To: Liviu.Dudau, Will Deacon
  Cc: linux-pci, linux-kernel, linux-arm-kernel, Bjorn Helgaas,
	Arnd Bergmann, Lorenzo Pieralisi, Magnus

SGkgTGl2aXUsIFdpbGwsDQoNCk9uIDA0IE5vdmVtYmVyIDIwMTUgMTU6MTksIFBoaWwgd3JvdGU6
DQo+IE9uIDA0IE5vdmVtYmVyIDIwMTUgMTU6MDIsIExpdml1IHdyb3RlOg0KPiA+IE9uIFdlZCwg
Tm92IDA0LCAyMDE1IGF0IDAyOjQ4OjM4UE0gKzAwMDAsIFBoaWwgRWR3b3J0aHkgd3JvdGU6DQo+
ID4gPiBIaSBMaXZpdSwNCj4gPiA+DQo+ID4gPiBPbiAwNCBOb3ZlbWJlciAyMDE1IDE0OjI0LCBM
aXZpdSB3cm90ZToNCj4gPiA+ID4gT24gV2VkLCBOb3YgMDQsIDIwMTUgYXQgMDE6NTc6NDhQTSAr
MDAwMCwgUGhpbCBFZHdvcnRoeSB3cm90ZToNCj4gPiA+ID4gPiBIaSwNCj4gPiA+ID4gPg0KPiA+
ID4gPiA+IEkgYW0gdHJ5aW5nIHRvIGhvb2sgdXAgYSBQQ0llIGhvc3QgY29udHJvbGxlciB0aGF0
IHNpdHMgYmVoaW5kIGFuIElPTU1VLA0KPiA+ID4gPiA+IGJ1dCBoYXZpbmcgc29tZSBwcm9ibGVt
cy4NCj4gPiA+ID4gPg0KPiA+ID4gPiA+IEknbSB1c2luZyB0aGUgcGNpZS1yY2FyIFBDSWUgaG9z
dCBjb250cm9sbGVyIGFuZCBpdCB3b3JrcyBmaW5lIHdpdGhvdXQNCj4gPiA+ID4gPiB0aGUgSU9N
TVUsIGFuZCBJIGNhbiBhdHRhY2ggdGhlIElPTU1VIHRvIHRoZSBjb250cm9sbGVyIHN1Y2ggdGhh
dCBhbnkNCj4gPiBjYWxscw0KPiA+ID4gPiA+IHRvIGRtYV9hbGxvY19jb2hlcmVudCBtYWRlIGJ5
IHRoZSBjb250cm9sbGVyIGRyaXZlciB1c2VzIHRoZQ0KPiBpb21tdV9vcHMNCj4gPiA+ID4gPiB2
ZXJzaW9uIG9mIGRtYV9vcHMuDQo+ID4gPiA+ID4NCj4gPiA+ID4gPiBIb3dldmVyLCBJIGNhbid0
IHNlZSBob3cgdG8gbWFrZSB0aGUgZW5kcG9pbnRzIHRvIHV0aWxpc2UgdGhlIGRtYV9vcHMNCj4g
dGhhdA0KPiA+ID4gPiA+IHRoZSBjb250cm9sbGVyIHVzZXMuIFNob3VsZG4ndCB0aGUgZW5kcG9p
bnRzIGluaGVyaXQgdGhlIGRtYV9vcHMgZnJvbSB0aGUNCj4gPiA+ID4gPiBjb250cm9sbGVyPw0K
PiA+ID4gPg0KPiA+ID4gPiBObywgbm90IGRpcmVjdGx5Lg0KPiA+ID4gPg0KPiA+ID4gPiA+IEFu
eSBwb2ludGVycyBmb3IgdGhpcz8NCj4gPiA+ID4NCj4gPiA+ID4gWW91IG5lZWQgdG8gdW5kZXJz
dGFuZCB0aGUgcHJvY2VzcyB0aHJvdWdoIHdoaWNoIGEgZHJpdmVyIGZvciBlbmRwb2ludA0KPiBn
ZXQNCj4gPiA+ID4gYW4gYWRkcmVzcyB0byBiZSBwYXNzZWQgZG93biB0byB0aGUgZGV2aWNlLiBI
YXZlIGEgbG9vayBhdA0KPiA+ID4gPiBEb2N1bWVudGF0aW9uL0RNQS1BUEktSE9XVE8udHh0LCB0
aGVyZSBpcyBhIG5pY2UgZXhwbGFuYXRpb24gdGhlcmUuDQo+ID4gPiA+IChIaW50OiBFUCBkcml2
ZXIgbmVlZHMgdG8gY2FsbCBkbWFfbWFwX3NpbmdsZSkuDQo+ID4gPiA+DQo+ID4gPiA+IEFsc28s
IHlvdSBuZWVkIHRvIG1ha2Ugc3VyZSB0aGF0IHRoZSBidXMgYWRkcmVzcyB0aGF0IGVuZHMgdXAg
YmVpbmcgc2V0DQo+IGludG8NCj4gPiA+ID4gdGhlIGVuZHBvaW50IGdldHMgdHJhbnNsYXRlZCBj
b3JyZWN0bHkgYnkgdGhlIGhvc3QgY29udHJvbGxlciBpbnRvIGFuIGFkZHJlc3MNCj4gPiA+ID4g
dGhhdCB0aGUgSU9NTVUgY2FuIHRoZW4gdHJhbnNsYXRlIGludG8gcGh5c2ljYWwgYWRkcmVzcy4N
Cj4gPiA+IFN1cmUsIHRob3VnaCBzaW5jZSB0aGlzIGlzIGJvZyBzdGFuZGFyZCBJbnRlbCBQQ0ll
IGV0aGVybmV0IGNhcmQgd2hpY2ggd29ya3MNCj4gPiA+IGZpbmUgd2hlbiB0aGUgSU9NTVUgaXMg
ZWZmZWN0aXZlbHkgdW51c2VkLCBJIGRvbuKAmXQgdGhpbmsgdGhlcmUgaXMgYSBwcm9ibGVtDQo+
ID4gPiB3aXRoIHRoYXQuDQo+ID4gPg0KPiA+ID4gVGhlIGRyaXZlciBmb3IgdGhlIFBDSWUgY29u
dHJvbGxlciBzZXRzIHVwIHRoZSBJT01NVSBtYXBwaW5nIG9rIHdoZW4gSQ0KPiA+ID4gZG8gYSB0
ZXN0IGNhbGwgdG8gZG1hX2FsbG9jX2NvaGVyZW50KCkgaW4gdGhlIGNvbnRyb2xsZXIncyBkcml2
ZXIuIGkuZS4gd2hlbiBJDQo+ID4gPiBkbyB0aGlzLCBpdCBlbmRzIHVwIGluIGFybV9pb21tdV9h
bGxvY19hdHRycygpLCB3aGljaCBjYWxscw0KPiA+ID4gX19pb21tdV9hbGxvY19idWZmZXIoKSBh
bmQgX19hbGxvY19pb3ZhKCkuDQo+ID4gPg0KPiA+ID4gV2hlbiBhbiBlbmRwb2ludCBkcml2ZXIg
YWxsb2NhdGVzIGFuZCBtYXBzIGEgZG1hIGNvaGVyZW50IGJ1ZmZlciBpdA0KPiA+ID4gYWxzbyBu
ZWVkcyB0byBlbmQgdXAgaW4gYXJtX2lvbW11X2FsbG9jX2F0dHJzKCksIGJ1dCBpdCBkb2Vzbid0
Lg0KPiA+DQo+ID4gV2h5IGRvIHlvdSB0aGluayB0aGF0PyBSZW1lbWJlciB0aGF0IHRoZSBvbmx5
IHRoaW5nIGF0dGFjaGVkIHRvIHRoZSBJT01NVQ0KPiBpcw0KPiA+IHRoZQ0KPiA+IGhvc3QgY29u
dHJvbGxlci4gVGhlIGVuZHBvaW50IGlzIG9uIHRoZSBQQ0llIGJ1cywgd2hpY2ggZ2V0cyBhIGRp
ZmZlcmVudA0KPiA+IHRyYW5zbGF0aW9uDQo+ID4gdGhhdCB0aGUgSU9NTVUga25vd3Mgbm90aGlu
ZyBhYm91dC4gSWYgaXQgaGVscHMgeW91IHRvIHZpc3VhbGlzZSBpdCBiZXR0ZXIsIHRoaW5rDQo+
ID4gb2YgdGhlIGhvc3QgY29udHJvbGxlciBhcyBhbm90aGVyIElPTU1VIGRldmljZS4gSXQncyB0
aGUgb3BzIG9mIHRoZSBob3N0DQo+ID4gY29udHJvbGxlcg0KPiA+IHRoYXQgc2hvdWxkIGJlIGlu
dm9rZWQsIG5vdCB0aGUgSU9NTVUncy4NCj4gT2ssIHRoYXQgbWFrZXMgc2Vuc2UuIEknbGwgaGF2
ZSBhIHRoaW5rIGFuZCBwb2tlIGl0IGEgYml0IG1vcmUuLi4NClNvbWV3aGF0IHJlbGF0ZWQgdG8g
dGhpcywgc2luY2Ugb3VyIFBDSWUgY29udHJvbGxlciBIVyBpcyBsaW1pdGVkIHRvDQozMi1iaXQg
QVhJIGFkZHJlc3MgcmFuZ2UsIGJlZm9yZSB0cnlpbmcgdG8gaG9vayB1cCB0aGUgSU9NTVUgSSBo
YXZlDQp0cmllZCB0byBsaW1pdCB0aGUgZG1hX21hc2sgZm9yIFBDSSBjYXJkcyB0byBETUFfQklU
X01BU0soMzIpLiBUaGUNCnJlYXNvbiBiZWluZyB0aGF0IExpbnV4IHVzZXMgYSAxIHRvIDEgbWFw
cGluZyBiZXR3ZWVuIFBDSSBhZGRyZXNzZXMNCmFuZCBjcHUgKHBoeXMpIGFkZHJlc3NlcyB3aGVu
IHRoZXJlIGlzbid0IGFuIElPTU1VIGludm9sdmVkLCBzbyBJDQp0aGluayB0aGF0IHdlIG5lZWQg
dG8gbGltaXQgdGhlIFBDSSBhZGRyZXNzIHNwYWNlIHVzZWQuDQoNClNpbmNlIHBjaV9zZXR1cF9k
ZXZpY2UoKSBzZXRzIHVwIGRtYV9tYXNrLCBJIGFkZGVkIGEgYnVzIG5vdGlmaWVyIGluIHRoZQ0K
UENJZSBjb250cm9sbGVyIGRyaXZlciBzbyBJIGNhbiBjaGFuZ2UgdGhlIG1hc2ssIGlmIG5lZWRl
ZCwgb24gdGhlDQpCVVNfTk9USUZZX0JPVU5EX0RSSVZFUiBhY3Rpb24uDQpIb3dldmVyLCBJIHRo
aW5rIHRoZXJlIGlzIHRoZSBwb3RlbnRpYWwgZm9yIGNhcmQgZHJpdmVycyB0byBhbGxvY2F0ZSBh
bmQNCm1hcCBidWZmZXJzIGJlZm9yZSB0aGUgYnVzIG5vdGlmaWVyIGdldCBjYWxsZWQuIEFkZGl0
aW9uYWxseSwgSSd2ZSBzZWVuDQpkcml2ZXJzIGNoYW5nZSB0aGVpciBiZWhhdmlvdXIgYmFzZWQg
b24gdGhlIHN1Y2Nlc3Mgb3IgZmFpbHVyZSBvZg0KZG1hX3NldF9tYXNrX2FuZF9jb2hlcmVudChk
ZXYsIERNQV9CSVRfTUFTSyg2NCkpLCBzbyB0aGUNCmRyaXZlciBjb3VsZCwgdGhlb3JldGljYWxs
eSBhdCBsZWFzdCwgb3BlcmF0ZSBpbiBhIHdheSB0aGF0IGlzIG5vdA0KY29tcGF0aWJsZSB3aXRo
IGEgbW9yZSByZXN0cmljdGVkIGRtYV9tYXNrICh0aG91Z2ggSSBjYW4ndCB0aGluaw0Kb2YgYW55
IHdheSB0aGlzIHdvdWxkIG5vdCB3b3JrIHdpdGggaGFyZHdhcmUgSSd2ZSBzZWVuKS4NCg0KU28s
IEkgdGhpbmsgdGhhdCB1c2luZyBhIGJ1cyBub3RpZmllciBpcyB0aGUgd3Jvbmcgd2F5IHRvIGdv
LCBidXQgSSBkb27igJl0DQprbm93IHdoYXQgb3RoZXIgb3B0aW9ucyBJIGhhdmUuIEFueSBzdWdn
ZXN0aW9ucz8NCg0KVGhhbmtzIGZvciB5b3VyIGhlbHANClBoaWwNCg==

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-09 12:32         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-09 12:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Liviu, Will,

On 04 November 2015 15:19, Phil wrote:
> On 04 November 2015 15:02, Liviu wrote:
> > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > Hi Liviu,
> > >
> > > On 04 November 2015 14:24, Liviu wrote:
> > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > Hi,
> > > > >
> > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > but having some problems.
> > > > >
> > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > calls
> > > > > to dma_alloc_coherent made by the controller driver uses the
> iommu_ops
> > > > > version of dma_ops.
> > > > >
> > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> that
> > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > controller?
> > > >
> > > > No, not directly.
> > > >
> > > > > Any pointers for this?
> > > >
> > > > You need to understand the process through which a driver for endpoint
> get
> > > > an address to be passed down to the device. Have a look at
> > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > (Hint: EP driver needs to call dma_map_single).
> > > >
> > > > Also, you need to make sure that the bus address that ends up being set
> into
> > > > the endpoint gets translated correctly by the host controller into an address
> > > > that the IOMMU can then translate into physical address.
> > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > fine when the IOMMU is effectively unused, I don?t think there is a problem
> > > with that.
> > >
> > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > __iommu_alloc_buffer() and __alloc_iova().
> > >
> > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> >
> > Why do you think that? Remember that the only thing attached to the IOMMU
> is
> > the
> > host controller. The endpoint is on the PCIe bus, which gets a different
> > translation
> > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > of the host controller as another IOMMU device. It's the ops of the host
> > controller
> > that should be invoked, not the IOMMU's.
> Ok, that makes sense. I'll have a think and poke it a bit more...
Somewhat related to this, since our PCIe controller HW is limited to
32-bit AXI address range, before trying to hook up the IOMMU I have
tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
reason being that Linux uses a 1 to 1 mapping between PCI addresses
and cpu (phys) addresses when there isn't an IOMMU involved, so I
think that we need to limit the PCI address space used.

Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
PCIe controller driver so I can change the mask, if needed, on the
BUS_NOTIFY_BOUND_DRIVER action.
However, I think there is the potential for card drivers to allocate and
map buffers before the bus notifier get called. Additionally, I've seen
drivers change their behaviour based on the success or failure of
dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
driver could, theoretically at least, operate in a way that is not
compatible with a more restricted dma_mask (though I can't think
of any way this would not work with hardware I've seen).

So, I think that using a bus notifier is the wrong way to go, but I don?t
know what other options I have. Any suggestions?

Thanks for your help
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-09 12:32         ` Phil Edworthy
  (?)
@ 2015-11-11 18:24           ` Liviu.Dudau
  -1 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-11 18:24 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Will Deacon, linux-pci, linux-kernel, linux-arm-kernel,
	Bjorn Helgaas, Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> Hi Liviu, Will,
> 
> On 04 November 2015 15:19, Phil wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Hi Liviu,
> > > >
> > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > > but having some problems.
> > > > > >
> > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > > calls
> > > > > > to dma_alloc_coherent made by the controller driver uses the
> > iommu_ops
> > > > > > version of dma_ops.
> > > > > >
> > > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> > that
> > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > > controller?
> > > > >
> > > > > No, not directly.
> > > > >
> > > > > > Any pointers for this?
> > > > >
> > > > > You need to understand the process through which a driver for endpoint
> > get
> > > > > an address to be passed down to the device. Have a look at
> > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > (Hint: EP driver needs to call dma_map_single).
> > > > >
> > > > > Also, you need to make sure that the bus address that ends up being set
> > into
> > > > > the endpoint gets translated correctly by the host controller into an address
> > > > > that the IOMMU can then translate into physical address.
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the IOMMU
> > is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...

Hi Phil,

Not trying to ignore your email, but I thought this is more in Will's backyard.

> Somewhat related to this, since our PCIe controller HW is limited to
> 32-bit AXI address range, before trying to hook up the IOMMU I have
> tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> reason being that Linux uses a 1 to 1 mapping between PCI addresses
> and cpu (phys) addresses when there isn't an IOMMU involved, so I
> think that we need to limit the PCI address space used.

I think you're mixing things a bit or not explaining them very well. Having the
PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
carry 64-bit addresses. It depends on how they get translated by the host bridge
or its associated ATS block. I can't see why you can't have a setup where
the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
You just have to be careful on how you setup your mem64 ranges so that they don't
overlap with the 32-bit ranges when translated.

And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
card is not capable of supporting more than 32-bit addresses.

> 
> Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> PCIe controller driver so I can change the mask, if needed, on the
> BUS_NOTIFY_BOUND_DRIVER action.
> However, I think there is the potential for card drivers to allocate and
> map buffers before the bus notifier get called. Additionally, I've seen
> drivers change their behaviour based on the success or failure of
> dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> driver could, theoretically at least, operate in a way that is not
> compatible with a more restricted dma_mask (though I can't think
> of any way this would not work with hardware I've seen).
> 
> So, I think that using a bus notifier is the wrong way to go, but I don’t
> know what other options I have. Any suggestions?

I would first have a look at how the PCIe bus addresses are translated by the
host controller. 

Best regards,
Liviu

> 
> Thanks for your help
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-11 18:24           ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-11 18:24 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Will Deacon, linux-pci, linux-kernel, linux-arm-kernel,
	Bjorn Helgaas, Arnd Bergmann, Lorenzo Pieralisi, Magnus

On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> Hi Liviu, Will,
> 
> On 04 November 2015 15:19, Phil wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Hi Liviu,
> > > >
> > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > > but having some problems.
> > > > > >
> > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > > calls
> > > > > > to dma_alloc_coherent made by the controller driver uses the
> > iommu_ops
> > > > > > version of dma_ops.
> > > > > >
> > > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> > that
> > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > > controller?
> > > > >
> > > > > No, not directly.
> > > > >
> > > > > > Any pointers for this?
> > > > >
> > > > > You need to understand the process through which a driver for endpoint
> > get
> > > > > an address to be passed down to the device. Have a look at
> > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > (Hint: EP driver needs to call dma_map_single).
> > > > >
> > > > > Also, you need to make sure that the bus address that ends up being set
> > into
> > > > > the endpoint gets translated correctly by the host controller into an address
> > > > > that the IOMMU can then translate into physical address.
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I don’t think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the IOMMU
> > is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...

Hi Phil,

Not trying to ignore your email, but I thought this is more in Will's backyard.

> Somewhat related to this, since our PCIe controller HW is limited to
> 32-bit AXI address range, before trying to hook up the IOMMU I have
> tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> reason being that Linux uses a 1 to 1 mapping between PCI addresses
> and cpu (phys) addresses when there isn't an IOMMU involved, so I
> think that we need to limit the PCI address space used.

I think you're mixing things a bit or not explaining them very well. Having the
PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
carry 64-bit addresses. It depends on how they get translated by the host bridge
or its associated ATS block. I can't see why you can't have a setup where
the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
You just have to be careful on how you setup your mem64 ranges so that they don't
overlap with the 32-bit ranges when translated.

And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
card is not capable of supporting more than 32-bit addresses.

> 
> Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> PCIe controller driver so I can change the mask, if needed, on the
> BUS_NOTIFY_BOUND_DRIVER action.
> However, I think there is the potential for card drivers to allocate and
> map buffers before the bus notifier get called. Additionally, I've seen
> drivers change their behaviour based on the success or failure of
> dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> driver could, theoretically at least, operate in a way that is not
> compatible with a more restricted dma_mask (though I can't think
> of any way this would not work with hardware I've seen).
> 
> So, I think that using a bus notifier is the wrong way to go, but I don’t
> know what other options I have. Any suggestions?

I would first have a look at how the PCIe bus addresses are translated by the
host controller. 

Best regards,
Liviu

> 
> Thanks for your help
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-11 18:24           ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau at arm.com @ 2015-11-11 18:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> Hi Liviu, Will,
> 
> On 04 November 2015 15:19, Phil wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Hi Liviu,
> > > >
> > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > > but having some problems.
> > > > > >
> > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > > calls
> > > > > > to dma_alloc_coherent made by the controller driver uses the
> > iommu_ops
> > > > > > version of dma_ops.
> > > > > >
> > > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> > that
> > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > > controller?
> > > > >
> > > > > No, not directly.
> > > > >
> > > > > > Any pointers for this?
> > > > >
> > > > > You need to understand the process through which a driver for endpoint
> > get
> > > > > an address to be passed down to the device. Have a look at
> > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > (Hint: EP driver needs to call dma_map_single).
> > > > >
> > > > > Also, you need to make sure that the bus address that ends up being set
> > into
> > > > > the endpoint gets translated correctly by the host controller into an address
> > > > > that the IOMMU can then translate into physical address.
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I don?t think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the IOMMU
> > is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...

Hi Phil,

Not trying to ignore your email, but I thought this is more in Will's backyard.

> Somewhat related to this, since our PCIe controller HW is limited to
> 32-bit AXI address range, before trying to hook up the IOMMU I have
> tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> reason being that Linux uses a 1 to 1 mapping between PCI addresses
> and cpu (phys) addresses when there isn't an IOMMU involved, so I
> think that we need to limit the PCI address space used.

I think you're mixing things a bit or not explaining them very well. Having the
PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
carry 64-bit addresses. It depends on how they get translated by the host bridge
or its associated ATS block. I can't see why you can't have a setup where
the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
You just have to be careful on how you setup your mem64 ranges so that they don't
overlap with the 32-bit ranges when translated.

And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
card is not capable of supporting more than 32-bit addresses.

> 
> Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> PCIe controller driver so I can change the mask, if needed, on the
> BUS_NOTIFY_BOUND_DRIVER action.
> However, I think there is the potential for card drivers to allocate and
> map buffers before the bus notifier get called. Additionally, I've seen
> drivers change their behaviour based on the success or failure of
> dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> driver could, theoretically at least, operate in a way that is not
> compatible with a more restricted dma_mask (though I can't think
> of any way this would not work with hardware I've seen).
> 
> So, I think that using a bus notifier is the wrong way to go, but I don?t
> know what other options I have. Any suggestions?

I would first have a look at how the PCIe bus addresses are translated by the
host controller. 

Best regards,
Liviu

> 
> Thanks for your help
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-11 18:24           ` Liviu.Dudau
  (?)
@ 2015-11-11 20:22             ` Arnd Bergmann
  -1 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-11 20:22 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: Phil Edworthy, Will Deacon, linux-pci, linux-kernel,
	linux-arm-kernel, Bjorn Helgaas, Lorenzo Pieralisi, Magnus

On Wednesday 11 November 2015 18:24:56 Liviu.Dudau@arm.com wrote:
> 
> > Somewhat related to this, since our PCIe controller HW is limited to
> > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > think that we need to limit the PCI address space used.
> 
> I think you're mixing things a bit or not explaining them very well. Having the
> PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> carry 64-bit addresses. It depends on how they get translated by the host bridge
> or its associated ATS block. I can't see why you can't have a setup where
> the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> You just have to be careful on how you setup your mem64 ranges so that they don't
> overlap with the 32-bit ranges when translated.
> 
> And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> card is not capable of supporting more than 32-bit addresses.

I think we are missing one crucial bit of infrastructure on ARM64 at
the moment: the dma_set_mask() function should fail if a driver asks
for a mask that is larger than the dma-ranges property of the parent
device (or any device higher up in the hierarchy) allows.

Drivers that want a larger mask should try that first, and then fall
back to a 32-bit mask, which is guaranteed to work.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-11 20:22             ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-11 20:22 UTC (permalink / raw)
  To: Liviu.Dudau
  Cc: Phil Edworthy, Will Deacon, linux-pci, linux-kernel,
	linux-arm-kernel, Bjorn Helgaas, Lorenzo Pieralisi, Magnus

On Wednesday 11 November 2015 18:24:56 Liviu.Dudau@arm.com wrote:
> 
> > Somewhat related to this, since our PCIe controller HW is limited to
> > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > think that we need to limit the PCI address space used.
> 
> I think you're mixing things a bit or not explaining them very well. Having the
> PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> carry 64-bit addresses. It depends on how they get translated by the host bridge
> or its associated ATS block. I can't see why you can't have a setup where
> the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> You just have to be careful on how you setup your mem64 ranges so that they don't
> overlap with the 32-bit ranges when translated.
> 
> And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> card is not capable of supporting more than 32-bit addresses.

I think we are missing one crucial bit of infrastructure on ARM64 at
the moment: the dma_set_mask() function should fail if a driver asks
for a mask that is larger than the dma-ranges property of the parent
device (or any device higher up in the hierarchy) allows.

Drivers that want a larger mask should try that first, and then fall
back to a 32-bit mask, which is guaranteed to work.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-11 20:22             ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-11 20:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 11 November 2015 18:24:56 Liviu.Dudau at arm.com wrote:
> 
> > Somewhat related to this, since our PCIe controller HW is limited to
> > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > think that we need to limit the PCI address space used.
> 
> I think you're mixing things a bit or not explaining them very well. Having the
> PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> carry 64-bit addresses. It depends on how they get translated by the host bridge
> or its associated ATS block. I can't see why you can't have a setup where
> the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> You just have to be careful on how you setup your mem64 ranges so that they don't
> overlap with the 32-bit ranges when translated.
> 
> And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> card is not capable of supporting more than 32-bit addresses.

I think we are missing one crucial bit of infrastructure on ARM64 at
the moment: the dma_set_mask() function should fail if a driver asks
for a mask that is larger than the dma-ranges property of the parent
device (or any device higher up in the hierarchy) allows.

Drivers that want a larger mask should try that first, and then fall
back to a 32-bit mask, which is guaranteed to work.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-11 18:24           ` Liviu.Dudau
  (?)
@ 2015-11-12  9:26             ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12  9:26 UTC (permalink / raw)
  To: Liviu.Dudau, Arnd Bergmann
  Cc: Will Deacon, linux-pci, linux-kernel, linux-arm-kernel,
	Bjorn Helgaas, Lorenzo Pieralisi, Magnus

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 5982 bytes --]

Hi Liviu, Arnd,

On 11 November 2015 18:25, LIviu wrote:
> On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > Hi Liviu, Will,
> >
> > On 04 November 2015 15:19, Phil wrote:
> > > On 04 November 2015 15:02, Liviu wrote:
> > > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > > Hi Liviu,
> > > > >
> > > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am trying to hook up a PCIe host controller that sits behind an
> IOMMU,
> > > > > > > but having some problems.
> > > > > > >
> > > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > > the IOMMU, and I can attach the IOMMU to the controller such that
> any
> > > > calls
> > > > > > > to dma_alloc_coherent made by the controller driver uses the
> > > iommu_ops
> > > > > > > version of dma_ops.
> > > > > > >
> > > > > > > However, I can't see how to make the endpoints to utilise the
> dma_ops
> > > that
> > > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from
> the
> > > > > > > controller?
> > > > > >
> > > > > > No, not directly.
> > > > > >
> > > > > > > Any pointers for this?
> > > > > >
> > > > > > You need to understand the process through which a driver for
> endpoint
> > > get
> > > > > > an address to be passed down to the device. Have a look at
> > > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > > (Hint: EP driver needs to call dma_map_single).
> > > > > >
> > > > > > Also, you need to make sure that the bus address that ends up being set
> > > into
> > > > > > the endpoint gets translated correctly by the host controller into an
> address
> > > > > > that the IOMMU can then translate into physical address.
> > > > > Sure, though since this is bog standard Intel PCIe ethernet card which
> works
> > > > > fine when the IOMMU is effectively unused, I don’t think there is a
> problem
> > > > > with that.
> > > > >
> > > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > > __iommu_alloc_buffer() and __alloc_iova().
> > > > >
> > > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > > >
> > > > Why do you think that? Remember that the only thing attached to the
> IOMMU
> > > is
> > > > the
> > > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > > translation
> > > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> think
> > > > of the host controller as another IOMMU device. It's the ops of the host
> > > > controller
> > > > that should be invoked, not the IOMMU's.
> > > Ok, that makes sense. I'll have a think and poke it a bit more...
> 
> Hi Phil,
> 
> Not trying to ignore your email, but I thought this is more in Will's backyard.
> 
> > Somewhat related to this, since our PCIe controller HW is limited to
> > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > think that we need to limit the PCI address space used.
> 
> I think you're mixing things a bit or not explaining them very well. Having the
> PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> carry 64-bit addresses. It depends on how they get translated by the host bridge
> or its associated ATS block. I can't see why you can't have a setup where
> the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> You just have to be careful on how you setup your mem64 ranges so that they
> don't
> overlap with the 32-bit ranges when translated.
>From a HW point of view I agree that we can setup the PCI host bridge such that
it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
this mean that the dma ops used by card drivers has to be provided by our PCI
host bridge driver so we can apply the translation to those PCI addresses?
This comes back to my point below about how to do this. Adding a bus notifier
to do this may be too late, and arm64 doesn't implement set_dma_ops().

> And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> card is not capable of supporting more than 32-bit addresses.
If there was infrastructure that checked all parents dma-ranges when the
dma_set_mask() function is called as Arnd pointed out, this would nicely solve
the problem.

> > Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> > PCIe controller driver so I can change the mask, if needed, on the
> > BUS_NOTIFY_BOUND_DRIVER action.
> > However, I think there is the potential for card drivers to allocate and
> > map buffers before the bus notifier get called. Additionally, I've seen
> > drivers change their behaviour based on the success or failure of
> > dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> > driver could, theoretically at least, operate in a way that is not
> > compatible with a more restricted dma_mask (though I can't think
> > of any way this would not work with hardware I've seen).
> >
> > So, I think that using a bus notifier is the wrong way to go, but I don’t
> > know what other options I have. Any suggestions?
> 
> I would first have a look at how the PCIe bus addresses are translated by the
> host controller.
> 
> Best regards,
> Liviu
> 
Thanks
Phil
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-12  9:26             ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12  9:26 UTC (permalink / raw)
  To: Liviu.Dudau, Arnd Bergmann
  Cc: Will Deacon, linux-pci, linux-kernel, linux-arm-kernel,
	Bjorn Helgaas, Lorenzo Pieralisi, Magnus

SGkgTGl2aXUsIEFybmQsDQoNCk9uIDExIE5vdmVtYmVyIDIwMTUgMTg6MjUsIExJdml1IHdyb3Rl
Og0KPiBPbiBNb24sIE5vdiAwOSwgMjAxNSBhdCAxMjozMjoxM1BNICswMDAwLCBQaGlsIEVkd29y
dGh5IHdyb3RlOg0KPiA+IEhpIExpdml1LCBXaWxsLA0KPiA+DQo+ID4gT24gMDQgTm92ZW1iZXIg
MjAxNSAxNToxOSwgUGhpbCB3cm90ZToNCj4gPiA+IE9uIDA0IE5vdmVtYmVyIDIwMTUgMTU6MDIs
IExpdml1IHdyb3RlOg0KPiA+ID4gPiBPbiBXZWQsIE5vdiAwNCwgMjAxNSBhdCAwMjo0ODozOFBN
ICswMDAwLCBQaGlsIEVkd29ydGh5IHdyb3RlOg0KPiA+ID4gPiA+IEhpIExpdml1LA0KPiA+ID4g
PiA+DQo+ID4gPiA+ID4gT24gMDQgTm92ZW1iZXIgMjAxNSAxNDoyNCwgTGl2aXUgd3JvdGU6DQo+
ID4gPiA+ID4gPiBPbiBXZWQsIE5vdiAwNCwgMjAxNSBhdCAwMTo1Nzo0OFBNICswMDAwLCBQaGls
IEVkd29ydGh5IHdyb3RlOg0KPiA+ID4gPiA+ID4gPiBIaSwNCj4gPiA+ID4gPiA+ID4NCj4gPiA+
ID4gPiA+ID4gSSBhbSB0cnlpbmcgdG8gaG9vayB1cCBhIFBDSWUgaG9zdCBjb250cm9sbGVyIHRo
YXQgc2l0cyBiZWhpbmQgYW4NCj4gSU9NTVUsDQo+ID4gPiA+ID4gPiA+IGJ1dCBoYXZpbmcgc29t
ZSBwcm9ibGVtcy4NCj4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gSSdtIHVzaW5nIHRoZSBw
Y2llLXJjYXIgUENJZSBob3N0IGNvbnRyb2xsZXIgYW5kIGl0IHdvcmtzIGZpbmUgd2l0aG91dA0K
PiA+ID4gPiA+ID4gPiB0aGUgSU9NTVUsIGFuZCBJIGNhbiBhdHRhY2ggdGhlIElPTU1VIHRvIHRo
ZSBjb250cm9sbGVyIHN1Y2ggdGhhdA0KPiBhbnkNCj4gPiA+ID4gY2FsbHMNCj4gPiA+ID4gPiA+
ID4gdG8gZG1hX2FsbG9jX2NvaGVyZW50IG1hZGUgYnkgdGhlIGNvbnRyb2xsZXIgZHJpdmVyIHVz
ZXMgdGhlDQo+ID4gPiBpb21tdV9vcHMNCj4gPiA+ID4gPiA+ID4gdmVyc2lvbiBvZiBkbWFfb3Bz
Lg0KPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiBIb3dldmVyLCBJIGNhbid0IHNlZSBob3cg
dG8gbWFrZSB0aGUgZW5kcG9pbnRzIHRvIHV0aWxpc2UgdGhlDQo+IGRtYV9vcHMNCj4gPiA+IHRo
YXQNCj4gPiA+ID4gPiA+ID4gdGhlIGNvbnRyb2xsZXIgdXNlcy4gU2hvdWxkbid0IHRoZSBlbmRw
b2ludHMgaW5oZXJpdCB0aGUgZG1hX29wcyBmcm9tDQo+IHRoZQ0KPiA+ID4gPiA+ID4gPiBjb250
cm9sbGVyPw0KPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+IE5vLCBub3QgZGlyZWN0bHkuDQo+ID4g
PiA+ID4gPg0KPiA+ID4gPiA+ID4gPiBBbnkgcG9pbnRlcnMgZm9yIHRoaXM/DQo+ID4gPiA+ID4g
Pg0KPiA+ID4gPiA+ID4gWW91IG5lZWQgdG8gdW5kZXJzdGFuZCB0aGUgcHJvY2VzcyB0aHJvdWdo
IHdoaWNoIGEgZHJpdmVyIGZvcg0KPiBlbmRwb2ludA0KPiA+ID4gZ2V0DQo+ID4gPiA+ID4gPiBh
biBhZGRyZXNzIHRvIGJlIHBhc3NlZCBkb3duIHRvIHRoZSBkZXZpY2UuIEhhdmUgYSBsb29rIGF0
DQo+ID4gPiA+ID4gPiBEb2N1bWVudGF0aW9uL0RNQS1BUEktSE9XVE8udHh0LCB0aGVyZSBpcyBh
IG5pY2UgZXhwbGFuYXRpb24gdGhlcmUuDQo+ID4gPiA+ID4gPiAoSGludDogRVAgZHJpdmVyIG5l
ZWRzIHRvIGNhbGwgZG1hX21hcF9zaW5nbGUpLg0KPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+IEFs
c28sIHlvdSBuZWVkIHRvIG1ha2Ugc3VyZSB0aGF0IHRoZSBidXMgYWRkcmVzcyB0aGF0IGVuZHMg
dXAgYmVpbmcgc2V0DQo+ID4gPiBpbnRvDQo+ID4gPiA+ID4gPiB0aGUgZW5kcG9pbnQgZ2V0cyB0
cmFuc2xhdGVkIGNvcnJlY3RseSBieSB0aGUgaG9zdCBjb250cm9sbGVyIGludG8gYW4NCj4gYWRk
cmVzcw0KPiA+ID4gPiA+ID4gdGhhdCB0aGUgSU9NTVUgY2FuIHRoZW4gdHJhbnNsYXRlIGludG8g
cGh5c2ljYWwgYWRkcmVzcy4NCj4gPiA+ID4gPiBTdXJlLCB0aG91Z2ggc2luY2UgdGhpcyBpcyBi
b2cgc3RhbmRhcmQgSW50ZWwgUENJZSBldGhlcm5ldCBjYXJkIHdoaWNoDQo+IHdvcmtzDQo+ID4g
PiA+ID4gZmluZSB3aGVuIHRoZSBJT01NVSBpcyBlZmZlY3RpdmVseSB1bnVzZWQsIEkgZG9u4oCZ
dCB0aGluayB0aGVyZSBpcyBhDQo+IHByb2JsZW0NCj4gPiA+ID4gPiB3aXRoIHRoYXQuDQo+ID4g
PiA+ID4NCj4gPiA+ID4gPiBUaGUgZHJpdmVyIGZvciB0aGUgUENJZSBjb250cm9sbGVyIHNldHMg
dXAgdGhlIElPTU1VIG1hcHBpbmcgb2sgd2hlbiBJDQo+ID4gPiA+ID4gZG8gYSB0ZXN0IGNhbGwg
dG8gZG1hX2FsbG9jX2NvaGVyZW50KCkgaW4gdGhlIGNvbnRyb2xsZXIncyBkcml2ZXIuIGkuZS4g
d2hlbiBJDQo+ID4gPiA+ID4gZG8gdGhpcywgaXQgZW5kcyB1cCBpbiBhcm1faW9tbXVfYWxsb2Nf
YXR0cnMoKSwgd2hpY2ggY2FsbHMNCj4gPiA+ID4gPiBfX2lvbW11X2FsbG9jX2J1ZmZlcigpIGFu
ZCBfX2FsbG9jX2lvdmEoKS4NCj4gPiA+ID4gPg0KPiA+ID4gPiA+IFdoZW4gYW4gZW5kcG9pbnQg
ZHJpdmVyIGFsbG9jYXRlcyBhbmQgbWFwcyBhIGRtYSBjb2hlcmVudCBidWZmZXIgaXQNCj4gPiA+
ID4gPiBhbHNvIG5lZWRzIHRvIGVuZCB1cCBpbiBhcm1faW9tbXVfYWxsb2NfYXR0cnMoKSwgYnV0
IGl0IGRvZXNuJ3QuDQo+ID4gPiA+DQo+ID4gPiA+IFdoeSBkbyB5b3UgdGhpbmsgdGhhdD8gUmVt
ZW1iZXIgdGhhdCB0aGUgb25seSB0aGluZyBhdHRhY2hlZCB0byB0aGUNCj4gSU9NTVUNCj4gPiA+
IGlzDQo+ID4gPiA+IHRoZQ0KPiA+ID4gPiBob3N0IGNvbnRyb2xsZXIuIFRoZSBlbmRwb2ludCBp
cyBvbiB0aGUgUENJZSBidXMsIHdoaWNoIGdldHMgYSBkaWZmZXJlbnQNCj4gPiA+ID4gdHJhbnNs
YXRpb24NCj4gPiA+ID4gdGhhdCB0aGUgSU9NTVUga25vd3Mgbm90aGluZyBhYm91dC4gSWYgaXQg
aGVscHMgeW91IHRvIHZpc3VhbGlzZSBpdCBiZXR0ZXIsDQo+IHRoaW5rDQo+ID4gPiA+IG9mIHRo
ZSBob3N0IGNvbnRyb2xsZXIgYXMgYW5vdGhlciBJT01NVSBkZXZpY2UuIEl0J3MgdGhlIG9wcyBv
ZiB0aGUgaG9zdA0KPiA+ID4gPiBjb250cm9sbGVyDQo+ID4gPiA+IHRoYXQgc2hvdWxkIGJlIGlu
dm9rZWQsIG5vdCB0aGUgSU9NTVUncy4NCj4gPiA+IE9rLCB0aGF0IG1ha2VzIHNlbnNlLiBJJ2xs
IGhhdmUgYSB0aGluayBhbmQgcG9rZSBpdCBhIGJpdCBtb3JlLi4uDQo+IA0KPiBIaSBQaGlsLA0K
PiANCj4gTm90IHRyeWluZyB0byBpZ25vcmUgeW91ciBlbWFpbCwgYnV0IEkgdGhvdWdodCB0aGlz
IGlzIG1vcmUgaW4gV2lsbCdzIGJhY2t5YXJkLg0KPiANCj4gPiBTb21ld2hhdCByZWxhdGVkIHRv
IHRoaXMsIHNpbmNlIG91ciBQQ0llIGNvbnRyb2xsZXIgSFcgaXMgbGltaXRlZCB0bw0KPiA+IDMy
LWJpdCBBWEkgYWRkcmVzcyByYW5nZSwgYmVmb3JlIHRyeWluZyB0byBob29rIHVwIHRoZSBJT01N
VSBJIGhhdmUNCj4gPiB0cmllZCB0byBsaW1pdCB0aGUgZG1hX21hc2sgZm9yIFBDSSBjYXJkcyB0
byBETUFfQklUX01BU0soMzIpLiBUaGUNCj4gPiByZWFzb24gYmVpbmcgdGhhdCBMaW51eCB1c2Vz
IGEgMSB0byAxIG1hcHBpbmcgYmV0d2VlbiBQQ0kgYWRkcmVzc2VzDQo+ID4gYW5kIGNwdSAocGh5
cykgYWRkcmVzc2VzIHdoZW4gdGhlcmUgaXNuJ3QgYW4gSU9NTVUgaW52b2x2ZWQsIHNvIEkNCj4g
PiB0aGluayB0aGF0IHdlIG5lZWQgdG8gbGltaXQgdGhlIFBDSSBhZGRyZXNzIHNwYWNlIHVzZWQu
DQo+IA0KPiBJIHRoaW5rIHlvdSdyZSBtaXhpbmcgdGhpbmdzIGEgYml0IG9yIG5vdCBleHBsYWlu
aW5nIHRoZW0gdmVyeSB3ZWxsLiBIYXZpbmcgdGhlDQo+IFBDSWUgY29udHJvbGxlciBsaW1pdGVk
IHRvIDMyLWJpdCBBWEkgZG9lcyBub3QgbWVhbiB0aGF0IHRoZSBQQ0llIGJ1cyBjYW5ub3QNCj4g
Y2FycnkgNjQtYml0IGFkZHJlc3Nlcy4gSXQgZGVwZW5kcyBvbiBob3cgdGhleSBnZXQgdHJhbnNs
YXRlZCBieSB0aGUgaG9zdCBicmlkZ2UNCj4gb3IgaXRzIGFzc29jaWF0ZWQgQVRTIGJsb2NrLiBJ
IGNhbid0IHNlZSB3aHkgeW91IGNhbid0IGhhdmUgYSBzZXR1cCB3aGVyZQ0KPiB0aGUgQ1BVIGFk
ZHJlc3NlcyBhcmUgMzItYml0IGJ1dCB0aGUgUENJZSBidXMgYWRkcmVzc2VzIGFyZSBhbGwgNjQt
Yml0Lg0KPiBZb3UganVzdCBoYXZlIHRvIGJlIGNhcmVmdWwgb24gaG93IHlvdSBzZXR1cCB5b3Vy
IG1lbTY0IHJhbmdlcyBzbyB0aGF0IHRoZXkNCj4gZG9uJ3QNCj4gb3ZlcmxhcCB3aXRoIHRoZSAz
Mi1iaXQgcmFuZ2VzIHdoZW4gdHJhbnNsYXRlZC4NCkZyb20gYSBIVyBwb2ludCBvZiB2aWV3IEkg
YWdyZWUgdGhhdCB3ZSBjYW4gc2V0dXAgdGhlIFBDSSBob3N0IGJyaWRnZSBzdWNoIHRoYXQNCml0
IHVzZXMgNjQtYml0IFBDSSBhZGRyZXNzLCB3aXRoIDMyLWJpdCBjcHUgYWRkcmVzc2VzLiBUaG91
Z2ggaW4gcHJhY3RpY2UgZG9lc24ndA0KdGhpcyBtZWFuIHRoYXQgdGhlIGRtYSBvcHMgdXNlZCBi
eSBjYXJkIGRyaXZlcnMgaGFzIHRvIGJlIHByb3ZpZGVkIGJ5IG91ciBQQ0kNCmhvc3QgYnJpZGdl
IGRyaXZlciBzbyB3ZSBjYW4gYXBwbHkgdGhlIHRyYW5zbGF0aW9uIHRvIHRob3NlIFBDSSBhZGRy
ZXNzZXM/DQpUaGlzIGNvbWVzIGJhY2sgdG8gbXkgcG9pbnQgYmVsb3cgYWJvdXQgaG93IHRvIGRv
IHRoaXMuIEFkZGluZyBhIGJ1cyBub3RpZmllcg0KdG8gZG8gdGhpcyBtYXkgYmUgdG9vIGxhdGUs
IGFuZCBhcm02NCBkb2Vzbid0IGltcGxlbWVudCBzZXRfZG1hX29wcygpLg0KDQo+IEFuZCBubywg
eW91IHNob3VsZCBub3QgbGltaXQgYXQgdGhlIGNhcmQgZHJpdmVyIHRoZSBETUFfQklUX01BU0so
KSB1bmxlc3MgdGhlDQo+IGNhcmQgaXMgbm90IGNhcGFibGUgb2Ygc3VwcG9ydGluZyBtb3JlIHRo
YW4gMzItYml0IGFkZHJlc3Nlcy4NCklmIHRoZXJlIHdhcyBpbmZyYXN0cnVjdHVyZSB0aGF0IGNo
ZWNrZWQgYWxsIHBhcmVudHMgZG1hLXJhbmdlcyB3aGVuIHRoZQ0KZG1hX3NldF9tYXNrKCkgZnVu
Y3Rpb24gaXMgY2FsbGVkIGFzIEFybmQgcG9pbnRlZCBvdXQsIHRoaXMgd291bGQgbmljZWx5IHNv
bHZlDQp0aGUgcHJvYmxlbS4NCg0KPiA+IFNpbmNlIHBjaV9zZXR1cF9kZXZpY2UoKSBzZXRzIHVw
IGRtYV9tYXNrLCBJIGFkZGVkIGEgYnVzIG5vdGlmaWVyIGluIHRoZQ0KPiA+IFBDSWUgY29udHJv
bGxlciBkcml2ZXIgc28gSSBjYW4gY2hhbmdlIHRoZSBtYXNrLCBpZiBuZWVkZWQsIG9uIHRoZQ0K
PiA+IEJVU19OT1RJRllfQk9VTkRfRFJJVkVSIGFjdGlvbi4NCj4gPiBIb3dldmVyLCBJIHRoaW5r
IHRoZXJlIGlzIHRoZSBwb3RlbnRpYWwgZm9yIGNhcmQgZHJpdmVycyB0byBhbGxvY2F0ZSBhbmQN
Cj4gPiBtYXAgYnVmZmVycyBiZWZvcmUgdGhlIGJ1cyBub3RpZmllciBnZXQgY2FsbGVkLiBBZGRp
dGlvbmFsbHksIEkndmUgc2Vlbg0KPiA+IGRyaXZlcnMgY2hhbmdlIHRoZWlyIGJlaGF2aW91ciBi
YXNlZCBvbiB0aGUgc3VjY2VzcyBvciBmYWlsdXJlIG9mDQo+ID4gZG1hX3NldF9tYXNrX2FuZF9j
b2hlcmVudChkZXYsIERNQV9CSVRfTUFTSyg2NCkpLCBzbyB0aGUNCj4gPiBkcml2ZXIgY291bGQs
IHRoZW9yZXRpY2FsbHkgYXQgbGVhc3QsIG9wZXJhdGUgaW4gYSB3YXkgdGhhdCBpcyBub3QNCj4g
PiBjb21wYXRpYmxlIHdpdGggYSBtb3JlIHJlc3RyaWN0ZWQgZG1hX21hc2sgKHRob3VnaCBJIGNh
bid0IHRoaW5rDQo+ID4gb2YgYW55IHdheSB0aGlzIHdvdWxkIG5vdCB3b3JrIHdpdGggaGFyZHdh
cmUgSSd2ZSBzZWVuKS4NCj4gPg0KPiA+IFNvLCBJIHRoaW5rIHRoYXQgdXNpbmcgYSBidXMgbm90
aWZpZXIgaXMgdGhlIHdyb25nIHdheSB0byBnbywgYnV0IEkgZG9u4oCZdA0KPiA+IGtub3cgd2hh
dCBvdGhlciBvcHRpb25zIEkgaGF2ZS4gQW55IHN1Z2dlc3Rpb25zPw0KPiANCj4gSSB3b3VsZCBm
aXJzdCBoYXZlIGEgbG9vayBhdCBob3cgdGhlIFBDSWUgYnVzIGFkZHJlc3NlcyBhcmUgdHJhbnNs
YXRlZCBieSB0aGUNCj4gaG9zdCBjb250cm9sbGVyLg0KPiANCj4gQmVzdCByZWdhcmRzLA0KPiBM
aXZpdQ0KPiANClRoYW5rcw0KUGhpbA0K

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-12  9:26             ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12  9:26 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Liviu, Arnd,

On 11 November 2015 18:25, LIviu wrote:
> On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > Hi Liviu, Will,
> >
> > On 04 November 2015 15:19, Phil wrote:
> > > On 04 November 2015 15:02, Liviu wrote:
> > > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > > Hi Liviu,
> > > > >
> > > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am trying to hook up a PCIe host controller that sits behind an
> IOMMU,
> > > > > > > but having some problems.
> > > > > > >
> > > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > > the IOMMU, and I can attach the IOMMU to the controller such that
> any
> > > > calls
> > > > > > > to dma_alloc_coherent made by the controller driver uses the
> > > iommu_ops
> > > > > > > version of dma_ops.
> > > > > > >
> > > > > > > However, I can't see how to make the endpoints to utilise the
> dma_ops
> > > that
> > > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from
> the
> > > > > > > controller?
> > > > > >
> > > > > > No, not directly.
> > > > > >
> > > > > > > Any pointers for this?
> > > > > >
> > > > > > You need to understand the process through which a driver for
> endpoint
> > > get
> > > > > > an address to be passed down to the device. Have a look at
> > > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > > (Hint: EP driver needs to call dma_map_single).
> > > > > >
> > > > > > Also, you need to make sure that the bus address that ends up being set
> > > into
> > > > > > the endpoint gets translated correctly by the host controller into an
> address
> > > > > > that the IOMMU can then translate into physical address.
> > > > > Sure, though since this is bog standard Intel PCIe ethernet card which
> works
> > > > > fine when the IOMMU is effectively unused, I don?t think there is a
> problem
> > > > > with that.
> > > > >
> > > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > > __iommu_alloc_buffer() and __alloc_iova().
> > > > >
> > > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > > >
> > > > Why do you think that? Remember that the only thing attached to the
> IOMMU
> > > is
> > > > the
> > > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > > translation
> > > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> think
> > > > of the host controller as another IOMMU device. It's the ops of the host
> > > > controller
> > > > that should be invoked, not the IOMMU's.
> > > Ok, that makes sense. I'll have a think and poke it a bit more...
> 
> Hi Phil,
> 
> Not trying to ignore your email, but I thought this is more in Will's backyard.
> 
> > Somewhat related to this, since our PCIe controller HW is limited to
> > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > think that we need to limit the PCI address space used.
> 
> I think you're mixing things a bit or not explaining them very well. Having the
> PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> carry 64-bit addresses. It depends on how they get translated by the host bridge
> or its associated ATS block. I can't see why you can't have a setup where
> the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> You just have to be careful on how you setup your mem64 ranges so that they
> don't
> overlap with the 32-bit ranges when translated.
>From a HW point of view I agree that we can setup the PCI host bridge such that
it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
this mean that the dma ops used by card drivers has to be provided by our PCI
host bridge driver so we can apply the translation to those PCI addresses?
This comes back to my point below about how to do this. Adding a bus notifier
to do this may be too late, and arm64 doesn't implement set_dma_ops().

> And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> card is not capable of supporting more than 32-bit addresses.
If there was infrastructure that checked all parents dma-ranges when the
dma_set_mask() function is called as Arnd pointed out, this would nicely solve
the problem.

> > Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> > PCIe controller driver so I can change the mask, if needed, on the
> > BUS_NOTIFY_BOUND_DRIVER action.
> > However, I think there is the potential for card drivers to allocate and
> > map buffers before the bus notifier get called. Additionally, I've seen
> > drivers change their behaviour based on the success or failure of
> > dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> > driver could, theoretically at least, operate in a way that is not
> > compatible with a more restricted dma_mask (though I can't think
> > of any way this would not work with hardware I've seen).
> >
> > So, I think that using a bus notifier is the wrong way to go, but I don?t
> > know what other options I have. Any suggestions?
> 
> I would first have a look at how the PCIe bus addresses are translated by the
> host controller.
> 
> Best regards,
> Liviu
> 
Thanks
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-12  9:26             ` Phil Edworthy
@ 2015-11-12  9:49               ` Arnd Bergmann
  -1 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-12  9:49 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Phil Edworthy, Liviu.Dudau, Lorenzo Pieralisi, Magnus, linux-pci,
	Will Deacon, linux-kernel, Bjorn Helgaas

On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> On 11 November 2015 18:25, LIviu wrote:
> > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:

> > I think you're mixing things a bit or not explaining them very well. Having the
> > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > carry 64-bit addresses. It depends on how they get translated by the host bridge
> > or its associated ATS block. I can't see why you can't have a setup where
> > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > You just have to be careful on how you setup your mem64 ranges so that they
> > don't
> > overlap with the 32-bit ranges when translated.
> From a HW point of view I agree that we can setup the PCI host bridge such that
> it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> this mean that the dma ops used by card drivers has to be provided by our PCI
> host bridge driver so we can apply the translation to those PCI addresses?
> This comes back to my point below about how to do this. Adding a bus notifier
> to do this may be too late, and arm64 doesn't implement set_dma_ops().
> 
> > And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> > card is not capable of supporting more than 32-bit addresses.
> If there was infrastructure that checked all parents dma-ranges when the
> dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> the problem.

of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
and then calls arch_setup_dma_ops() so the architecture specific code can
enforce the limits in dma_set_mask and pick an appropriate set of dma
operations. The missing part is in the implementation of arch_setup_dma_ops,
which currently happily ignores the base and limit.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-12  9:49               ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-12  9:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> On 11 November 2015 18:25, LIviu wrote:
> > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:

> > I think you're mixing things a bit or not explaining them very well. Having the
> > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > carry 64-bit addresses. It depends on how they get translated by the host bridge
> > or its associated ATS block. I can't see why you can't have a setup where
> > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > You just have to be careful on how you setup your mem64 ranges so that they
> > don't
> > overlap with the 32-bit ranges when translated.
> From a HW point of view I agree that we can setup the PCI host bridge such that
> it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> this mean that the dma ops used by card drivers has to be provided by our PCI
> host bridge driver so we can apply the translation to those PCI addresses?
> This comes back to my point below about how to do this. Adding a bus notifier
> to do this may be too late, and arm64 doesn't implement set_dma_ops().
> 
> > And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> > card is not capable of supporting more than 32-bit addresses.
> If there was infrastructure that checked all parents dma-ranges when the
> dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> the problem.

of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
and then calls arch_setup_dma_ops() so the architecture specific code can
enforce the limits in dma_set_mask and pick an appropriate set of dma
operations. The missing part is in the implementation of arch_setup_dma_ops,
which currently happily ignores the base and limit.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-12  9:26             ` Phil Edworthy
  (?)
@ 2015-11-12 10:32               ` Liviu.Dudau
  -1 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-12 10:32 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Arnd Bergmann, Will Deacon, linux-pci, linux-kernel,
	linux-arm-kernel, Bjorn Helgaas, Lorenzo Pieralisi, Magnus

On Thu, Nov 12, 2015 at 09:26:33AM +0000, Phil Edworthy wrote:
> Hi Liviu, Arnd,
> 
> On 11 November 2015 18:25, LIviu wrote:
> > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > > Hi Liviu, Will,
> > >
> > > On 04 November 2015 15:19, Phil wrote:
> > > > On 04 November 2015 15:02, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > > > Hi Liviu,
> > > > > >
> > > > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am trying to hook up a PCIe host controller that sits behind an
> > IOMMU,
> > > > > > > > but having some problems.
> > > > > > > >
> > > > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > > > the IOMMU, and I can attach the IOMMU to the controller such that
> > any
> > > > > calls
> > > > > > > > to dma_alloc_coherent made by the controller driver uses the
> > > > iommu_ops
> > > > > > > > version of dma_ops.
> > > > > > > >
> > > > > > > > However, I can't see how to make the endpoints to utilise the
> > dma_ops
> > > > that
> > > > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from
> > the
> > > > > > > > controller?
> > > > > > >
> > > > > > > No, not directly.
> > > > > > >
> > > > > > > > Any pointers for this?
> > > > > > >
> > > > > > > You need to understand the process through which a driver for
> > endpoint
> > > > get
> > > > > > > an address to be passed down to the device. Have a look at
> > > > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > > > (Hint: EP driver needs to call dma_map_single).
> > > > > > >
> > > > > > > Also, you need to make sure that the bus address that ends up being set
> > > > into
> > > > > > > the endpoint gets translated correctly by the host controller into an
> > address
> > > > > > > that the IOMMU can then translate into physical address.
> > > > > > Sure, though since this is bog standard Intel PCIe ethernet card which
> > works
> > > > > > fine when the IOMMU is effectively unused, I don’t think there is a
> > problem
> > > > > > with that.
> > > > > >
> > > > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > > > __iommu_alloc_buffer() and __alloc_iova().
> > > > > >
> > > > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > > > >
> > > > > Why do you think that? Remember that the only thing attached to the
> > IOMMU
> > > > is
> > > > > the
> > > > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > > > translation
> > > > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> > think
> > > > > of the host controller as another IOMMU device. It's the ops of the host
> > > > > controller
> > > > > that should be invoked, not the IOMMU's.
> > > > Ok, that makes sense. I'll have a think and poke it a bit more...
> > 
> > Hi Phil,
> > 
> > Not trying to ignore your email, but I thought this is more in Will's backyard.
> > 
> > > Somewhat related to this, since our PCIe controller HW is limited to
> > > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > > think that we need to limit the PCI address space used.
> > 
> > I think you're mixing things a bit or not explaining them very well. Having the
> > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > carry 64-bit addresses. It depends on how they get translated by the host bridge
> > or its associated ATS block. I can't see why you can't have a setup where
> > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > You just have to be careful on how you setup your mem64 ranges so that they
> > don't
> > overlap with the 32-bit ranges when translated.
> From a HW point of view I agree that we can setup the PCI host bridge such that
> it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> this mean that the dma ops used by card drivers has to be provided by our PCI
> host bridge driver so we can apply the translation to those PCI addresses?

I thought all addresses that are set into the cards go through
pcibios_resource_to_bus() which give you the PCI address to set, although I have to
admit that when DMA gets involved I'm not 100% sure of the whole flow.

Best regards,
Liviu

> This comes back to my point below about how to do this. Adding a bus notifier
> to do this may be too late, and arm64 doesn't implement set_dma_ops().
> 
> > And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> > card is not capable of supporting more than 32-bit addresses.
> If there was infrastructure that checked all parents dma-ranges when the
> dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> the problem.
> 
> > > Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> > > PCIe controller driver so I can change the mask, if needed, on the
> > > BUS_NOTIFY_BOUND_DRIVER action.
> > > However, I think there is the potential for card drivers to allocate and
> > > map buffers before the bus notifier get called. Additionally, I've seen
> > > drivers change their behaviour based on the success or failure of
> > > dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> > > driver could, theoretically at least, operate in a way that is not
> > > compatible with a more restricted dma_mask (though I can't think
> > > of any way this would not work with hardware I've seen).
> > >
> > > So, I think that using a bus notifier is the wrong way to go, but I don’t
> > > know what other options I have. Any suggestions?
> > 
> > I would first have a look at how the PCIe bus addresses are translated by the
> > host controller.
> > 
> > Best regards,
> > Liviu
> > 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-12 10:32               ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau @ 2015-11-12 10:32 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: Arnd Bergmann, Will Deacon, linux-pci, linux-kernel,
	linux-arm-kernel, Bjorn Helgaas, Lorenzo Pieralisi, Magnus

On Thu, Nov 12, 2015 at 09:26:33AM +0000, Phil Edworthy wrote:
> Hi Liviu, Arnd,
> 
> On 11 November 2015 18:25, LIviu wrote:
> > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > > Hi Liviu, Will,
> > >
> > > On 04 November 2015 15:19, Phil wrote:
> > > > On 04 November 2015 15:02, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > > > Hi Liviu,
> > > > > >
> > > > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am trying to hook up a PCIe host controller that sits behind an
> > IOMMU,
> > > > > > > > but having some problems.
> > > > > > > >
> > > > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > > > the IOMMU, and I can attach the IOMMU to the controller such that
> > any
> > > > > calls
> > > > > > > > to dma_alloc_coherent made by the controller driver uses the
> > > > iommu_ops
> > > > > > > > version of dma_ops.
> > > > > > > >
> > > > > > > > However, I can't see how to make the endpoints to utilise the
> > dma_ops
> > > > that
> > > > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from
> > the
> > > > > > > > controller?
> > > > > > >
> > > > > > > No, not directly.
> > > > > > >
> > > > > > > > Any pointers for this?
> > > > > > >
> > > > > > > You need to understand the process through which a driver for
> > endpoint
> > > > get
> > > > > > > an address to be passed down to the device. Have a look at
> > > > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > > > (Hint: EP driver needs to call dma_map_single).
> > > > > > >
> > > > > > > Also, you need to make sure that the bus address that ends up being set
> > > > into
> > > > > > > the endpoint gets translated correctly by the host controller into an
> > address
> > > > > > > that the IOMMU can then translate into physical address.
> > > > > > Sure, though since this is bog standard Intel PCIe ethernet card which
> > works
> > > > > > fine when the IOMMU is effectively unused, I don’t think there is a
> > problem
> > > > > > with that.
> > > > > >
> > > > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > > > __iommu_alloc_buffer() and __alloc_iova().
> > > > > >
> > > > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > > > >
> > > > > Why do you think that? Remember that the only thing attached to the
> > IOMMU
> > > > is
> > > > > the
> > > > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > > > translation
> > > > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> > think
> > > > > of the host controller as another IOMMU device. It's the ops of the host
> > > > > controller
> > > > > that should be invoked, not the IOMMU's.
> > > > Ok, that makes sense. I'll have a think and poke it a bit more...
> > 
> > Hi Phil,
> > 
> > Not trying to ignore your email, but I thought this is more in Will's backyard.
> > 
> > > Somewhat related to this, since our PCIe controller HW is limited to
> > > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > > think that we need to limit the PCI address space used.
> > 
> > I think you're mixing things a bit or not explaining them very well. Having the
> > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > carry 64-bit addresses. It depends on how they get translated by the host bridge
> > or its associated ATS block. I can't see why you can't have a setup where
> > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > You just have to be careful on how you setup your mem64 ranges so that they
> > don't
> > overlap with the 32-bit ranges when translated.
> From a HW point of view I agree that we can setup the PCI host bridge such that
> it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> this mean that the dma ops used by card drivers has to be provided by our PCI
> host bridge driver so we can apply the translation to those PCI addresses?

I thought all addresses that are set into the cards go through
pcibios_resource_to_bus() which give you the PCI address to set, although I have to
admit that when DMA gets involved I'm not 100% sure of the whole flow.

Best regards,
Liviu

> This comes back to my point below about how to do this. Adding a bus notifier
> to do this may be too late, and arm64 doesn't implement set_dma_ops().
> 
> > And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> > card is not capable of supporting more than 32-bit addresses.
> If there was infrastructure that checked all parents dma-ranges when the
> dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> the problem.
> 
> > > Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> > > PCIe controller driver so I can change the mask, if needed, on the
> > > BUS_NOTIFY_BOUND_DRIVER action.
> > > However, I think there is the potential for card drivers to allocate and
> > > map buffers before the bus notifier get called. Additionally, I've seen
> > > drivers change their behaviour based on the success or failure of
> > > dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> > > driver could, theoretically at least, operate in a way that is not
> > > compatible with a more restricted dma_mask (though I can't think
> > > of any way this would not work with hardware I've seen).
> > >
> > > So, I think that using a bus notifier is the wrong way to go, but I don’t
> > > know what other options I have. Any suggestions?
> > 
> > I would first have a look at how the PCIe bus addresses are translated by the
> > host controller.
> > 
> > Best regards,
> > Liviu
> > 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-12 10:32               ` Liviu.Dudau
  0 siblings, 0 replies; 52+ messages in thread
From: Liviu.Dudau at arm.com @ 2015-11-12 10:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 12, 2015 at 09:26:33AM +0000, Phil Edworthy wrote:
> Hi Liviu, Arnd,
> 
> On 11 November 2015 18:25, LIviu wrote:
> > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > > Hi Liviu, Will,
> > >
> > > On 04 November 2015 15:19, Phil wrote:
> > > > On 04 November 2015 15:02, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > > > Hi Liviu,
> > > > > >
> > > > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am trying to hook up a PCIe host controller that sits behind an
> > IOMMU,
> > > > > > > > but having some problems.
> > > > > > > >
> > > > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > > > the IOMMU, and I can attach the IOMMU to the controller such that
> > any
> > > > > calls
> > > > > > > > to dma_alloc_coherent made by the controller driver uses the
> > > > iommu_ops
> > > > > > > > version of dma_ops.
> > > > > > > >
> > > > > > > > However, I can't see how to make the endpoints to utilise the
> > dma_ops
> > > > that
> > > > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from
> > the
> > > > > > > > controller?
> > > > > > >
> > > > > > > No, not directly.
> > > > > > >
> > > > > > > > Any pointers for this?
> > > > > > >
> > > > > > > You need to understand the process through which a driver for
> > endpoint
> > > > get
> > > > > > > an address to be passed down to the device. Have a look at
> > > > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > > > (Hint: EP driver needs to call dma_map_single).
> > > > > > >
> > > > > > > Also, you need to make sure that the bus address that ends up being set
> > > > into
> > > > > > > the endpoint gets translated correctly by the host controller into an
> > address
> > > > > > > that the IOMMU can then translate into physical address.
> > > > > > Sure, though since this is bog standard Intel PCIe ethernet card which
> > works
> > > > > > fine when the IOMMU is effectively unused, I don?t think there is a
> > problem
> > > > > > with that.
> > > > > >
> > > > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > > > __iommu_alloc_buffer() and __alloc_iova().
> > > > > >
> > > > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > > > >
> > > > > Why do you think that? Remember that the only thing attached to the
> > IOMMU
> > > > is
> > > > > the
> > > > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > > > translation
> > > > > that the IOMMU knows nothing about. If it helps you to visualise it better,
> > think
> > > > > of the host controller as another IOMMU device. It's the ops of the host
> > > > > controller
> > > > > that should be invoked, not the IOMMU's.
> > > > Ok, that makes sense. I'll have a think and poke it a bit more...
> > 
> > Hi Phil,
> > 
> > Not trying to ignore your email, but I thought this is more in Will's backyard.
> > 
> > > Somewhat related to this, since our PCIe controller HW is limited to
> > > 32-bit AXI address range, before trying to hook up the IOMMU I have
> > > tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> > > reason being that Linux uses a 1 to 1 mapping between PCI addresses
> > > and cpu (phys) addresses when there isn't an IOMMU involved, so I
> > > think that we need to limit the PCI address space used.
> > 
> > I think you're mixing things a bit or not explaining them very well. Having the
> > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > carry 64-bit addresses. It depends on how they get translated by the host bridge
> > or its associated ATS block. I can't see why you can't have a setup where
> > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > You just have to be careful on how you setup your mem64 ranges so that they
> > don't
> > overlap with the 32-bit ranges when translated.
> From a HW point of view I agree that we can setup the PCI host bridge such that
> it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> this mean that the dma ops used by card drivers has to be provided by our PCI
> host bridge driver so we can apply the translation to those PCI addresses?

I thought all addresses that are set into the cards go through
pcibios_resource_to_bus() which give you the PCI address to set, although I have to
admit that when DMA gets involved I'm not 100% sure of the whole flow.

Best regards,
Liviu

> This comes back to my point below about how to do this. Adding a bus notifier
> to do this may be too late, and arm64 doesn't implement set_dma_ops().
> 
> > And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
> > card is not capable of supporting more than 32-bit addresses.
> If there was infrastructure that checked all parents dma-ranges when the
> dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> the problem.
> 
> > > Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> > > PCIe controller driver so I can change the mask, if needed, on the
> > > BUS_NOTIFY_BOUND_DRIVER action.
> > > However, I think there is the potential for card drivers to allocate and
> > > map buffers before the bus notifier get called. Additionally, I've seen
> > > drivers change their behaviour based on the success or failure of
> > > dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> > > driver could, theoretically at least, operate in a way that is not
> > > compatible with a more restricted dma_mask (though I can't think
> > > of any way this would not work with hardware I've seen).
> > >
> > > So, I think that using a bus notifier is the wrong way to go, but I don?t
> > > know what other options I have. Any suggestions?
> > 
> > I would first have a look at how the PCIe bus addresses are translated by the
> > host controller.
> > 
> > Best regards,
> > Liviu
> > 
> Thanks
> Phil

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-12  9:49               ` Arnd Bergmann
  (?)
@ 2015-11-12 15:33                 ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12 15:33 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Magnus, linux-pci, Will Deacon,
	linux-kernel, Bjorn Helgaas

Hi Arnd,

On 12 November 2015 09:49, Arnd Bergmann wrote:
> On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > On 11 November 2015 18:25, LIviu wrote:
> > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> 
> > > I think you're mixing things a bit or not explaining them very well. Having the
> > > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > > carry 64-bit addresses. It depends on how they get translated by the host
> bridge
> > > or its associated ATS block. I can't see why you can't have a setup where
> > > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > > You just have to be careful on how you setup your mem64 ranges so that
> they
> > > don't
> > > overlap with the 32-bit ranges when translated.
> > From a HW point of view I agree that we can setup the PCI host bridge such that
> > it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> > this mean that the dma ops used by card drivers has to be provided by our PCI
> > host bridge driver so we can apply the translation to those PCI addresses?
> > This comes back to my point below about how to do this. Adding a bus notifier
> > to do this may be too late, and arm64 doesn't implement set_dma_ops().
> >
> > > And no, you should not limit at the card driver the DMA_BIT_MASK() unless
> the
> > > card is not capable of supporting more than 32-bit addresses.
> > If there was infrastructure that checked all parents dma-ranges when the
> > dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> > the problem.
> 
> of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> and then calls arch_setup_dma_ops() so the architecture specific code can
> enforce the limits in dma_set_mask and pick an appropriate set of dma
> operations. The missing part is in the implementation of arch_setup_dma_ops,
> which currently happily ignores the base and limit.
I don't think it's as simple as that, though I could be wrong!

First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
This default is set for the 'platform soc' device. For my own testing I increased
this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
boot failure that I haven't looked into.

Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
calling of_pci_dma_configure(). I assume it does this on the basis that this is a
good default for PCI drivers that don't call dma_set_mask().
So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
this mask.

Finally, dma_set_mask_and_coherent() is called from the PCI card driver
but it doesn't check the parents dma masks either.

Thanks
Phil


^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-12 15:33                 ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12 15:33 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Liviu.Dudau, Lorenzo Pieralisi, Magnus, linux-pci, Will Deacon,
	linux-kernel, Bjorn Helgaas

Hi Arnd,

On 12 November 2015 09:49, Arnd Bergmann wrote:
> On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > On 11 November 2015 18:25, LIviu wrote:
> > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> 
> > > I think you're mixing things a bit or not explaining them very well. Having the
> > > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > > carry 64-bit addresses. It depends on how they get translated by the host
> bridge
> > > or its associated ATS block. I can't see why you can't have a setup where
> > > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > > You just have to be careful on how you setup your mem64 ranges so that
> they
> > > don't
> > > overlap with the 32-bit ranges when translated.
> > From a HW point of view I agree that we can setup the PCI host bridge such that
> > it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> > this mean that the dma ops used by card drivers has to be provided by our PCI
> > host bridge driver so we can apply the translation to those PCI addresses?
> > This comes back to my point below about how to do this. Adding a bus notifier
> > to do this may be too late, and arm64 doesn't implement set_dma_ops().
> >
> > > And no, you should not limit at the card driver the DMA_BIT_MASK() unless
> the
> > > card is not capable of supporting more than 32-bit addresses.
> > If there was infrastructure that checked all parents dma-ranges when the
> > dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> > the problem.
> 
> of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> and then calls arch_setup_dma_ops() so the architecture specific code can
> enforce the limits in dma_set_mask and pick an appropriate set of dma
> operations. The missing part is in the implementation of arch_setup_dma_ops,
> which currently happily ignores the base and limit.
I don't think it's as simple as that, though I could be wrong!

First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
This default is set for the 'platform soc' device. For my own testing I increased
this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
boot failure that I haven't looked into.

Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
calling of_pci_dma_configure(). I assume it does this on the basis that this is a
good default for PCI drivers that don't call dma_set_mask().
So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
this mask.

Finally, dma_set_mask_and_coherent() is called from the PCI card driver
but it doesn't check the parents dma masks either.

Thanks
Phil


^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-12 15:33                 ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-12 15:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

On 12 November 2015 09:49, Arnd Bergmann wrote:
> On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > On 11 November 2015 18:25, LIviu wrote:
> > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> 
> > > I think you're mixing things a bit or not explaining them very well. Having the
> > > PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
> > > carry 64-bit addresses. It depends on how they get translated by the host
> bridge
> > > or its associated ATS block. I can't see why you can't have a setup where
> > > the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
> > > You just have to be careful on how you setup your mem64 ranges so that
> they
> > > don't
> > > overlap with the 32-bit ranges when translated.
> > From a HW point of view I agree that we can setup the PCI host bridge such that
> > it uses 64-bit PCI address, with 32-bit cpu addresses. Though in practice doesn't
> > this mean that the dma ops used by card drivers has to be provided by our PCI
> > host bridge driver so we can apply the translation to those PCI addresses?
> > This comes back to my point below about how to do this. Adding a bus notifier
> > to do this may be too late, and arm64 doesn't implement set_dma_ops().
> >
> > > And no, you should not limit at the card driver the DMA_BIT_MASK() unless
> the
> > > card is not capable of supporting more than 32-bit addresses.
> > If there was infrastructure that checked all parents dma-ranges when the
> > dma_set_mask() function is called as Arnd pointed out, this would nicely solve
> > the problem.
> 
> of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> and then calls arch_setup_dma_ops() so the architecture specific code can
> enforce the limits in dma_set_mask and pick an appropriate set of dma
> operations. The missing part is in the implementation of arch_setup_dma_ops,
> which currently happily ignores the base and limit.
I don't think it's as simple as that, though I could be wrong!

First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
This default is set for the 'platform soc' device. For my own testing I increased
this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
boot failure that I haven't looked into.

Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
calling of_pci_dma_configure(). I assume it does this on the basis that this is a
good default for PCI drivers that don't call dma_set_mask().
So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
this mask.

Finally, dma_set_mask_and_coherent() is called from the PCI card driver
but it doesn't check the parents dma masks either.

Thanks
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-12 15:33                 ` Phil Edworthy
  (?)
@ 2015-11-12 16:16                   ` Arnd Bergmann
  -1 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-12 16:16 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-arm-kernel, Liviu.Dudau, Lorenzo Pieralisi, Magnus,
	linux-pci, Will Deacon, linux-kernel, Bjorn Helgaas

On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> On 12 November 2015 09:49, Arnd Bergmann wrote:
> > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > On 11 November 2015 18:25, LIviu wrote:
> > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > 
> > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > and then calls arch_setup_dma_ops() so the architecture specific code can
> > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > operations. The missing part is in the implementation of arch_setup_dma_ops,
> > which currently happily ignores the base and limit.
> I don't think it's as simple as that, though I could be wrong!
> 
> First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> This default is set for the 'platform soc' device. For my own testing I increased
> this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> boot failure that I haven't looked into.

Most platform devices actually need the 32-bit mask, so we intentionally
followed what PCI does here and default to that and require platform drivers
to explicitly ask for a larger mask if they need it.

> Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> good default for PCI drivers that don't call dma_set_mask().
> So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> this mask.

arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
does this before calling arch_setup_dma_ops(). The PCI devices start out
with the 32-bit mask, but the limit should be whatever PCI host uses.

> Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> but it doesn't check the parents dma masks either.

The way I think this should work is that arch_setup_dma_ops() stores the
allowed mask in the struct device, and that dma_set_mask compares the
mask against that.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
@ 2015-11-12 16:16                   ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-12 16:16 UTC (permalink / raw)
  To: Phil Edworthy
  Cc: linux-arm-kernel, Liviu.Dudau, Lorenzo Pieralisi, Magnus,
	linux-pci, Will Deacon, linux-kernel, Bjorn Helgaas

On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> On 12 November 2015 09:49, Arnd Bergmann wrote:
> > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > On 11 November 2015 18:25, LIviu wrote:
> > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > 
> > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > and then calls arch_setup_dma_ops() so the architecture specific code can
> > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > operations. The missing part is in the implementation of arch_setup_dma_ops,
> > which currently happily ignores the base and limit.
> I don't think it's as simple as that, though I could be wrong!
> 
> First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> This default is set for the 'platform soc' device. For my own testing I increased
> this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> boot failure that I haven't looked into.

Most platform devices actually need the 32-bit mask, so we intentionally
followed what PCI does here and default to that and require platform drivers
to explicitly ask for a larger mask if they need it.

> Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> good default for PCI drivers that don't call dma_set_mask().
> So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> this mask.

arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
does this before calling arch_setup_dma_ops(). The PCI devices start out
with the 32-bit mask, but the limit should be whatever PCI host uses.

> Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> but it doesn't check the parents dma masks either.

The way I think this should work is that arch_setup_dma_ops() stores the
allowed mask in the struct device, and that dma_set_mask compares the
mask against that.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-12 16:16                   ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-12 16:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> On 12 November 2015 09:49, Arnd Bergmann wrote:
> > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > On 11 November 2015 18:25, LIviu wrote:
> > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > 
> > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > and then calls arch_setup_dma_ops() so the architecture specific code can
> > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > operations. The missing part is in the implementation of arch_setup_dma_ops,
> > which currently happily ignores the base and limit.
> I don't think it's as simple as that, though I could be wrong!
> 
> First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> This default is set for the 'platform soc' device. For my own testing I increased
> this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> boot failure that I haven't looked into.

Most platform devices actually need the 32-bit mask, so we intentionally
followed what PCI does here and default to that and require platform drivers
to explicitly ask for a larger mask if they need it.

> Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> good default for PCI drivers that don't call dma_set_mask().
> So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> this mask.

arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
does this before calling arch_setup_dma_ops(). The PCI devices start out
with the 32-bit mask, but the limit should be whatever PCI host uses.

> Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> but it doesn't check the parents dma masks either.

The way I think this should work is that arch_setup_dma_ops() stores the
allowed mask in the struct device, and that dma_set_mask compares the
mask against that.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-12 16:16                   ` Arnd Bergmann
  (?)
@ 2015-11-13 13:03                     ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 13:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Liviu.Dudau, Lorenzo Pieralisi, Magnus,
	linux-pci, Will Deacon, linux-kernel, Bjorn Helgaas

Hi Arnd,

On 12 November 2015 16:17, Arnd Bergmann wrote:
> On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> > On 12 November 2015 09:49, Arnd Bergmann wrote:
> > > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > > On 11 November 2015 18:25, LIviu wrote:
> > > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > >
> > > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > > and then calls arch_setup_dma_ops() so the architecture specific code can
> > > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > > operations. The missing part is in the implementation of
> arch_setup_dma_ops,
> > > which currently happily ignores the base and limit.
> > I don't think it's as simple as that, though I could be wrong!
> >
> > First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> > This default is set for the 'platform soc' device. For my own testing I increased
> > this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> > boot failure that I haven't looked into.
> 
> Most platform devices actually need the 32-bit mask, so we intentionally
> followed what PCI does here and default to that and require platform drivers
> to explicitly ask for a larger mask if they need it.
Ok, that makes sense.


> > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> > calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> > good default for PCI drivers that don't call dma_set_mask().
> > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> > this mask.
> 
> arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> does this before calling arch_setup_dma_ops(). The PCI devices start out
> with the 32-bit mask, but the limit should be whatever PCI host uses.
Ok, so of_dma_configure() could walk up the tree and restrict the dma
mask to whatever parents limit it to. Then it could be overridden by
a dma-ranges entry in the DT node, right?
If so, one problem I can see is PCI controllers already use the
dma-ranges binding but with 3 address cells since it also specifies
the PCI address range.

I noticed that of_dma_get_range() skips straight to the parent node.
Shouldn't it attempt to get the dma-ranges for the device's node
first? I mean most hardware is limited by the peripheral's
capabilities, not the bus. If fact, of_dma_get_range() gets the number
of address and size cells from the device node, but gets the dma-ranges
from the parent. That seems a little odd to me.

The only other problem I can see is that currently all PCI drivers can
try to set their dma mask to 64 bits. At the moment that succeeds
because there are no checks. Until devices using them have their DTs
updated with dma-ranges, we would be limiting them to a 32 bit mask. I
guess that's not much of an issue in practice.


> > Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> > but it doesn't check the parents dma masks either.
> 
> The way I think this should work is that arch_setup_dma_ops() stores the
> allowed mask in the struct device, and that dma_set_mask compares the
> mask against that.
That makes sense.

Thanks for your help,
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-13 13:03                     ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 13:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Liviu.Dudau, Lorenzo Pieralisi, Magnus,
	linux-pci, Will Deacon, linux-kernel, Bjorn Helgaas

Hi Arnd,

On 12 November 2015 16:17, Arnd Bergmann wrote:
> On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> > On 12 November 2015 09:49, Arnd Bergmann wrote:
> > > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > > On 11 November 2015 18:25, LIviu wrote:
> > > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > >
> > > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > > and then calls arch_setup_dma_ops() so the architecture specific code can
> > > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > > operations. The missing part is in the implementation of
> arch_setup_dma_ops,
> > > which currently happily ignores the base and limit.
> > I don't think it's as simple as that, though I could be wrong!
> >
> > First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> > This default is set for the 'platform soc' device. For my own testing I increased
> > this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> > boot failure that I haven't looked into.
> 
> Most platform devices actually need the 32-bit mask, so we intentionally
> followed what PCI does here and default to that and require platform drivers
> to explicitly ask for a larger mask if they need it.
Ok, that makes sense.


> > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> > calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> > good default for PCI drivers that don't call dma_set_mask().
> > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> > this mask.
> 
> arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> does this before calling arch_setup_dma_ops(). The PCI devices start out
> with the 32-bit mask, but the limit should be whatever PCI host uses.
Ok, so of_dma_configure() could walk up the tree and restrict the dma
mask to whatever parents limit it to. Then it could be overridden by
a dma-ranges entry in the DT node, right?
If so, one problem I can see is PCI controllers already use the
dma-ranges binding but with 3 address cells since it also specifies
the PCI address range.

I noticed that of_dma_get_range() skips straight to the parent node.
Shouldn't it attempt to get the dma-ranges for the device's node
first? I mean most hardware is limited by the peripheral's
capabilities, not the bus. If fact, of_dma_get_range() gets the number
of address and size cells from the device node, but gets the dma-ranges
from the parent. That seems a little odd to me.

The only other problem I can see is that currently all PCI drivers can
try to set their dma mask to 64 bits. At the moment that succeeds
because there are no checks. Until devices using them have their DTs
updated with dma-ranges, we would be limiting them to a 32 bit mask. I
guess that's not much of an issue in practice.


> > Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> > but it doesn't check the parents dma masks either.
> 
> The way I think this should work is that arch_setup_dma_ops() stores the
> allowed mask in the struct device, and that dma_set_mask compares the
> mask against that.
That makes sense.

Thanks for your help,
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-13 13:03                     ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 13:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

On 12 November 2015 16:17, Arnd Bergmann wrote:
> On Thursday 12 November 2015 15:33:41 Phil Edworthy wrote:
> > On 12 November 2015 09:49, Arnd Bergmann wrote:
> > > On Thursday 12 November 2015 09:26:33 Phil Edworthy wrote:
> > > > On 11 November 2015 18:25, LIviu wrote:
> > > > > On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> > >
> > > of_dma_configure calls of_dma_get_range to do all this for the PCIe host,
> > > and then calls arch_setup_dma_ops() so the architecture specific code can
> > > enforce the limits in dma_set_mask and pick an appropriate set of dma
> > > operations. The missing part is in the implementation of
> arch_setup_dma_ops,
> > > which currently happily ignores the base and limit.
> > I don't think it's as simple as that, though I could be wrong!
> >
> > First off, of_dma_configure() sets a default coherent_dma_mask to 4GiB.
> > This default is set for the 'platform soc' device. For my own testing I increased
> > this to DMA_BIT_MASK(63). Note that setting it to DMA_BIT_MASK(64) causes
> > boot failure that I haven't looked into.
> 
> Most platform devices actually need the 32-bit mask, so we intentionally
> followed what PCI does here and default to that and require platform drivers
> to explicitly ask for a larger mask if they need it.
Ok, that makes sense.


> > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> > calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> > good default for PCI drivers that don't call dma_set_mask().
> > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> > this mask.
> 
> arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> does this before calling arch_setup_dma_ops(). The PCI devices start out
> with the 32-bit mask, but the limit should be whatever PCI host uses.
Ok, so of_dma_configure() could walk up the tree and restrict the dma
mask to whatever parents limit it to. Then it could be overridden by
a dma-ranges entry in the DT node, right?
If so, one problem I can see is PCI controllers already use the
dma-ranges binding but with 3 address cells since it also specifies
the PCI address range.

I noticed that of_dma_get_range() skips straight to the parent node.
Shouldn't it attempt to get the dma-ranges for the device's node
first? I mean most hardware is limited by the peripheral's
capabilities, not the bus. If fact, of_dma_get_range() gets the number
of address and size cells from the device node, but gets the dma-ranges
from the parent. That seems a little odd to me.

The only other problem I can see is that currently all PCI drivers can
try to set their dma mask to 64 bits. At the moment that succeeds
because there are no checks. Until devices using them have their DTs
updated with dma-ranges, we would be limiting them to a 32 bit mask. I
guess that's not much of an issue in practice.


> > Finally, dma_set_mask_and_coherent() is called from the PCI card driver
> > but it doesn't check the parents dma masks either.
> 
> The way I think this should work is that arch_setup_dma_ops() stores the
> allowed mask in the struct device, and that dma_set_mask compares the
> mask against that.
That makes sense.

Thanks for your help,
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: PCIe host controller behind IOMMU on ARM
  2015-11-13 13:03                     ` Phil Edworthy
@ 2015-11-13 13:59                       ` Arnd Bergmann
  -1 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-13 13:59 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Phil Edworthy, Lorenzo Pieralisi, Magnus, Will Deacon,
	linux-kernel, linux-pci, Bjorn Helgaas, Liviu.Dudau

On Friday 13 November 2015 13:03:11 Phil Edworthy wrote:
> 
> > > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> > > calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> > > good default for PCI drivers that don't call dma_set_mask().
> > > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> > > this mask.
> > 
> > arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> > does this before calling arch_setup_dma_ops(). The PCI devices start out
> > with the 32-bit mask, but the limit should be whatever PCI host uses.
> Ok, so of_dma_configure() could walk up the tree and restrict the dma
> mask to whatever parents limit it to. Then it could be overridden by
> a dma-ranges entry in the DT node, right?

No, the dma-ranges properties tell you what the allowed masks are,
this is what of_dma_configure() looks at.

> If so, one problem I can see is PCI controllers already use the
> dma-ranges binding but with 3 address cells since it also specifies
> the PCI address range.
> 
> I noticed that of_dma_get_range() skips straight to the parent node.
> Shouldn't it attempt to get the dma-ranges for the device's node
> first?

No, the dma-ranges explain the capabilities of the bus, this is
what you have to look at. The device itself may have additional
restrictions, but those are what the driver knows based on the
compatibility value when it passes the device specific mask into
dma_set_mask()

> I mean most hardware is limited by the peripheral's
> capabilities, not the bus. If fact, of_dma_get_range() gets the number
> of address and size cells from the device node, but gets the dma-ranges
> from the parent. That seems a little odd to me.

of_dma_get_range() calls of_n_addr_cells()/of_n_size_cells(), which get
the #address-cells/#size-cells property from the parent device (except
for the root, which is special).

> The only other problem I can see is that currently all PCI drivers can
> try to set their dma mask to 64 bits. At the moment that succeeds
> because there are no checks.

Right, this is the main bug we need to fix.

> Until devices using them have their DTs
> updated with dma-ranges, we would be limiting them to a 32 bit mask. I
> guess that's not much of an issue in practice.

Correct. I've tried to tell everyone about this when they added device
nodes for DMA capable devices. In most cases, they want 32-bit masks
anyway.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-13 13:59                       ` Arnd Bergmann
  0 siblings, 0 replies; 52+ messages in thread
From: Arnd Bergmann @ 2015-11-13 13:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 13 November 2015 13:03:11 Phil Edworthy wrote:
> 
> > > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB before
> > > calling of_pci_dma_configure(). I assume it does this on the basis that this is a
> > > good default for PCI drivers that don't call dma_set_mask().
> > > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll hit
> > > this mask.
> > 
> > arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> > does this before calling arch_setup_dma_ops(). The PCI devices start out
> > with the 32-bit mask, but the limit should be whatever PCI host uses.
> Ok, so of_dma_configure() could walk up the tree and restrict the dma
> mask to whatever parents limit it to. Then it could be overridden by
> a dma-ranges entry in the DT node, right?

No, the dma-ranges properties tell you what the allowed masks are,
this is what of_dma_configure() looks at.

> If so, one problem I can see is PCI controllers already use the
> dma-ranges binding but with 3 address cells since it also specifies
> the PCI address range.
> 
> I noticed that of_dma_get_range() skips straight to the parent node.
> Shouldn't it attempt to get the dma-ranges for the device's node
> first?

No, the dma-ranges explain the capabilities of the bus, this is
what you have to look at. The device itself may have additional
restrictions, but those are what the driver knows based on the
compatibility value when it passes the device specific mask into
dma_set_mask()

> I mean most hardware is limited by the peripheral's
> capabilities, not the bus. If fact, of_dma_get_range() gets the number
> of address and size cells from the device node, but gets the dma-ranges
> from the parent. That seems a little odd to me.

of_dma_get_range() calls of_n_addr_cells()/of_n_size_cells(), which get
the #address-cells/#size-cells property from the parent device (except
for the root, which is special).

> The only other problem I can see is that currently all PCI drivers can
> try to set their dma mask to 64 bits. At the moment that succeeds
> because there are no checks.

Right, this is the main bug we need to fix.

> Until devices using them have their DTs
> updated with dma-ranges, we would be limiting them to a 32 bit mask. I
> guess that's not much of an issue in practice.

Correct. I've tried to tell everyone about this when they added device
nodes for DMA capable devices. In most cases, they want 32-bit masks
anyway.

	Arnd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
  2015-11-13 13:59                       ` Arnd Bergmann
  (?)
@ 2015-11-13 14:11                         ` Phil Edworthy
  -1 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 14:11 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Lorenzo Pieralisi, Magnus, Will Deacon, linux-kernel, linux-pci,
	Bjorn Helgaas, Liviu.Dudau

On 13 November 2015 14:00, Arnd Bergmann wrote:
> On Friday 13 November 2015 13:03:11 Phil Edworthy wrote:
> >
> > > > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB
> before
> > > > calling of_pci_dma_configure(). I assume it does this on the basis that this is
> a
> > > > good default for PCI drivers that don't call dma_set_mask().
> > > > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll
> hit
> > > > this mask.
> > >
> > > arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> > > does this before calling arch_setup_dma_ops(). The PCI devices start out
> > > with the 32-bit mask, but the limit should be whatever PCI host uses.
> > Ok, so of_dma_configure() could walk up the tree and restrict the dma
> > mask to whatever parents limit it to. Then it could be overridden by
> > a dma-ranges entry in the DT node, right?
> 
> No, the dma-ranges properties tell you what the allowed masks are,
> this is what of_dma_configure() looks at.
Ok, I understand now.


> > If so, one problem I can see is PCI controllers already use the
> > dma-ranges binding but with 3 address cells since it also specifies
> > the PCI address range.
> >
> > I noticed that of_dma_get_range() skips straight to the parent node.
> > Shouldn't it attempt to get the dma-ranges for the device's node
> > first?
> 
> No, the dma-ranges explain the capabilities of the bus, this is
> what you have to look at. The device itself may have additional
> restrictions, but those are what the driver knows based on the
> compatibility value when it passes the device specific mask into
> dma_set_mask()
Ok, this is making sense now.


> > I mean most hardware is limited by the peripheral's
> > capabilities, not the bus. If fact, of_dma_get_range() gets the number
> > of address and size cells from the device node, but gets the dma-ranges
> > from the parent. That seems a little odd to me.
> 
> of_dma_get_range() calls of_n_addr_cells()/of_n_size_cells(), which get
> the #address-cells/#size-cells property from the parent device (except
> for the root, which is special).
Right, I should have checked what of_n_addr/size_cell() actually did.


> > The only other problem I can see is that currently all PCI drivers can
> > try to set their dma mask to 64 bits. At the moment that succeeds
> > because there are no checks.
> 
> Right, this is the main bug we need to fix.
Yep.


> > Until devices using them have their DTs
> > updated with dma-ranges, we would be limiting them to a 32 bit mask. I
> > guess that's not much of an issue in practice.
> 
> Correct. I've tried to tell everyone about this when they added device
> nodes for DMA capable devices. In most cases, they want 32-bit masks
> anyway.

Thanks for your help & patience, much appreciated!
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: PCIe host controller behind IOMMU on ARM
@ 2015-11-13 14:11                         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 14:11 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Lorenzo Pieralisi, Magnus, Will Deacon, linux-kernel, linux-pci,
	Bjorn Helgaas, Liviu.Dudau

On 13 November 2015 14:00, Arnd Bergmann wrote:
> On Friday 13 November 2015 13:03:11 Phil Edworthy wrote:
> >
> > > > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB
> before
> > > > calling of_pci_dma_configure(). I assume it does this on the basis that this is
> a
> > > > good default for PCI drivers that don't call dma_set_mask().
> > > > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll
> hit
> > > > this mask.
> > >
> > > arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> > > does this before calling arch_setup_dma_ops(). The PCI devices start out
> > > with the 32-bit mask, but the limit should be whatever PCI host uses.
> > Ok, so of_dma_configure() could walk up the tree and restrict the dma
> > mask to whatever parents limit it to. Then it could be overridden by
> > a dma-ranges entry in the DT node, right?
> 
> No, the dma-ranges properties tell you what the allowed masks are,
> this is what of_dma_configure() looks at.
Ok, I understand now.


> > If so, one problem I can see is PCI controllers already use the
> > dma-ranges binding but with 3 address cells since it also specifies
> > the PCI address range.
> >
> > I noticed that of_dma_get_range() skips straight to the parent node.
> > Shouldn't it attempt to get the dma-ranges for the device's node
> > first?
> 
> No, the dma-ranges explain the capabilities of the bus, this is
> what you have to look at. The device itself may have additional
> restrictions, but those are what the driver knows based on the
> compatibility value when it passes the device specific mask into
> dma_set_mask()
Ok, this is making sense now.


> > I mean most hardware is limited by the peripheral's
> > capabilities, not the bus. If fact, of_dma_get_range() gets the number
> > of address and size cells from the device node, but gets the dma-ranges
> > from the parent. That seems a little odd to me.
> 
> of_dma_get_range() calls of_n_addr_cells()/of_n_size_cells(), which get
> the #address-cells/#size-cells property from the parent device (except
> for the root, which is special).
Right, I should have checked what of_n_addr/size_cell() actually did.


> > The only other problem I can see is that currently all PCI drivers can
> > try to set their dma mask to 64 bits. At the moment that succeeds
> > because there are no checks.
> 
> Right, this is the main bug we need to fix.
Yep.


> > Until devices using them have their DTs
> > updated with dma-ranges, we would be limiting them to a 32 bit mask. I
> > guess that's not much of an issue in practice.
> 
> Correct. I've tried to tell everyone about this when they added device
> nodes for DMA capable devices. In most cases, they want 32-bit masks
> anyway.

Thanks for your help & patience, much appreciated!
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* PCIe host controller behind IOMMU on ARM
@ 2015-11-13 14:11                         ` Phil Edworthy
  0 siblings, 0 replies; 52+ messages in thread
From: Phil Edworthy @ 2015-11-13 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 13 November 2015 14:00, Arnd Bergmann wrote:
> On Friday 13 November 2015 13:03:11 Phil Edworthy wrote:
> >
> > > > Then pci_device_add() sets the devices coherent_dma_mask to 4GiB
> before
> > > > calling of_pci_dma_configure(). I assume it does this on the basis that this is
> a
> > > > good default for PCI drivers that don't call dma_set_mask().
> > > > So if arch_setup_dma_ops() walks up the parents to limit the mask, you'll
> hit
> > > > this mask.
> > >
> > > arch_setup_dma_ops() does not walk up the hierarchy, of_dma_configure()
> > > does this before calling arch_setup_dma_ops(). The PCI devices start out
> > > with the 32-bit mask, but the limit should be whatever PCI host uses.
> > Ok, so of_dma_configure() could walk up the tree and restrict the dma
> > mask to whatever parents limit it to. Then it could be overridden by
> > a dma-ranges entry in the DT node, right?
> 
> No, the dma-ranges properties tell you what the allowed masks are,
> this is what of_dma_configure() looks at.
Ok, I understand now.


> > If so, one problem I can see is PCI controllers already use the
> > dma-ranges binding but with 3 address cells since it also specifies
> > the PCI address range.
> >
> > I noticed that of_dma_get_range() skips straight to the parent node.
> > Shouldn't it attempt to get the dma-ranges for the device's node
> > first?
> 
> No, the dma-ranges explain the capabilities of the bus, this is
> what you have to look at. The device itself may have additional
> restrictions, but those are what the driver knows based on the
> compatibility value when it passes the device specific mask into
> dma_set_mask()
Ok, this is making sense now.


> > I mean most hardware is limited by the peripheral's
> > capabilities, not the bus. If fact, of_dma_get_range() gets the number
> > of address and size cells from the device node, but gets the dma-ranges
> > from the parent. That seems a little odd to me.
> 
> of_dma_get_range() calls of_n_addr_cells()/of_n_size_cells(), which get
> the #address-cells/#size-cells property from the parent device (except
> for the root, which is special).
Right, I should have checked what of_n_addr/size_cell() actually did.


> > The only other problem I can see is that currently all PCI drivers can
> > try to set their dma mask to 64 bits. At the moment that succeeds
> > because there are no checks.
> 
> Right, this is the main bug we need to fix.
Yep.


> > Until devices using them have their DTs
> > updated with dma-ranges, we would be limiting them to a 32 bit mask. I
> > guess that's not much of an issue in practice.
> 
> Correct. I've tried to tell everyone about this when they added device
> nodes for DMA capable devices. In most cases, they want 32-bit masks
> anyway.

Thanks for your help & patience, much appreciated!
Phil

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2015-11-13 14:11 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-04 13:57 PCIe host controller behind IOMMU on ARM Phil Edworthy
2015-11-04 13:57 ` Phil Edworthy
2015-11-04 13:57 ` Phil Edworthy
2015-11-04 14:24 ` Liviu.Dudau
2015-11-04 14:24   ` Liviu.Dudau at arm.com
2015-11-04 14:24   ` Liviu.Dudau
2015-11-04 14:48   ` Phil Edworthy
2015-11-04 14:48     ` Phil Edworthy
2015-11-04 14:48     ` Phil Edworthy
2015-11-04 15:01     ` Liviu.Dudau
2015-11-04 15:01       ` Liviu.Dudau at arm.com
2015-11-04 15:01       ` Liviu.Dudau
2015-11-04 15:19       ` Phil Edworthy
2015-11-04 15:19         ` Phil Edworthy
2015-11-04 15:19         ` Phil Edworthy
2015-11-04 15:30         ` Will Deacon
2015-11-04 15:30           ` Will Deacon
2015-11-04 15:30           ` Will Deacon
2015-11-04 18:02           ` Phil Edworthy
2015-11-04 18:02             ` Phil Edworthy
2015-11-04 18:02             ` Phil Edworthy
2015-11-09 12:32       ` Phil Edworthy
2015-11-09 12:32         ` Phil Edworthy
2015-11-09 12:32         ` Phil Edworthy
2015-11-11 18:24         ` Liviu.Dudau
2015-11-11 18:24           ` Liviu.Dudau at arm.com
2015-11-11 18:24           ` Liviu.Dudau
2015-11-11 20:22           ` Arnd Bergmann
2015-11-11 20:22             ` Arnd Bergmann
2015-11-11 20:22             ` Arnd Bergmann
2015-11-12  9:26           ` Phil Edworthy
2015-11-12  9:26             ` Phil Edworthy
2015-11-12  9:26             ` Phil Edworthy
2015-11-12  9:49             ` Arnd Bergmann
2015-11-12  9:49               ` Arnd Bergmann
2015-11-12 15:33               ` Phil Edworthy
2015-11-12 15:33                 ` Phil Edworthy
2015-11-12 15:33                 ` Phil Edworthy
2015-11-12 16:16                 ` Arnd Bergmann
2015-11-12 16:16                   ` Arnd Bergmann
2015-11-12 16:16                   ` Arnd Bergmann
2015-11-13 13:03                   ` Phil Edworthy
2015-11-13 13:03                     ` Phil Edworthy
2015-11-13 13:03                     ` Phil Edworthy
2015-11-13 13:59                     ` Arnd Bergmann
2015-11-13 13:59                       ` Arnd Bergmann
2015-11-13 14:11                       ` Phil Edworthy
2015-11-13 14:11                         ` Phil Edworthy
2015-11-13 14:11                         ` Phil Edworthy
2015-11-12 10:32             ` Liviu.Dudau
2015-11-12 10:32               ` Liviu.Dudau at arm.com
2015-11-12 10:32               ` Liviu.Dudau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.