* IOAT DMA w/IOMMU
@ 2018-08-09 18:14 Eric Pilmore
  2018-08-09 18:43 ` Bjorn Helgaas
  0 siblings, 1 reply; 48+ messages in thread

From: Eric Pilmore @ 2018-08-09 18:14 UTC (permalink / raw)
To: linux-pci

Didn't get any response on the IRC channel so trying here.

Was wondering if anybody here has used IOAT DMA engines with an
IOMMU turned on (Xeon based system)? My specific question is really
whether it is possible to DMA (w/IOAT) to a PCI BAR address as the
destination without having to map that address to the IOVA space of
the DMA engine first (assuming the IOMMU is on)?

I am encountering issues where I see PTE Errors reported from DMAR
in this scenario, but I do not see them if I use a different DMA
engine that's sitting downstream off the PCI tree. I'm wondering if
the IOAT DMA failure is some artifact of these engines sitting behind
the Host Bridge.

Thanks in advance!
Eric

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 18:14 IOAT DMA w/IOMMU Eric Pilmore
@ 2018-08-09 18:43 ` Bjorn Helgaas
  2018-08-09 18:51   ` Eric Pilmore
  0 siblings, 1 reply; 48+ messages in thread

From: Bjorn Helgaas @ 2018-08-09 18:43 UTC (permalink / raw)
To: Eric Pilmore
Cc: linux-pci, David Woodhouse, Logan Gunthorpe, Alex Williamson, iommu

[+cc David, Logan, Alex, iommu list]

On Thu, Aug 09, 2018 at 11:14:13AM -0700, Eric Pilmore wrote:
> Didn't get any response on the IRC channel so trying here.
>
> Was wondering if anybody here has used IOAT DMA engines with an
> IOMMU turned on (Xeon based system)? My specific question is really
> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the
> destination without having to map that address to the IOVA space of
> the DMA engine first (assuming the IOMMU is on)?

So is this a peer-to-peer DMA scenario?  You mention DMA, which would
be a transaction initiated by a PCI device, to a PCI BAR address, so
it doesn't sound like system memory is involved.

I copied some folks who know a lot more about this than I do.

> I am encountering issues where I see PTE Errors reported from DMAR
> in this scenario, but I do not if I use a different DMA engine
> that's sitting downstream off the PCI tree. I'm wondering if the
> IOAT DMA failure is some artifact of these engines sitting behind
> the Host Bridge.
>
> Thanks in advance!
> Eric

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 18:43 ` Bjorn Helgaas
@ 2018-08-09 18:51   ` Eric Pilmore
  2018-08-09 19:35     ` Logan Gunthorpe
  0 siblings, 1 reply; 48+ messages in thread

From: Eric Pilmore @ 2018-08-09 18:51 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Logan Gunthorpe, Alex Williamson, iommu

On Thu, Aug 9, 2018 at 11:43 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc David, Logan, Alex, iommu list]
>
> On Thu, Aug 09, 2018 at 11:14:13AM -0700, Eric Pilmore wrote:
>> Didn't get any response on the IRC channel so trying here.
>>
>> Was wondering if anybody here has used IOAT DMA engines with an
>> IOMMU turned on (Xeon based system)? My specific question is really
>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the
>> destination without having to map that address to the IOVA space of
>> the DMA engine first (assuming the IOMMU is on)?
>
> So is this a peer-to-peer DMA scenario?  You mention DMA, which would
> be a transaction initiated by a PCI device, to a PCI BAR address, so
> it doesn't sound like system memory is involved.

No, not peer-to-peer. This is from system memory (e.g. SKB buffer which
has had an IOMMU mapping created) to a PCI BAR address.

> I copied some folks who know a lot more about this than I do.

Thank you!

>> I am encountering issues where I see PTE Errors reported from DMAR
>> in this scenario, but I do not if I use a different DMA engine
>> that's sitting downstream off the PCI tree. I'm wondering if the
>> IOAT DMA failure is some artifact of these engines sitting behind
>> the Host Bridge.
>>
>> Thanks in advance!
>> Eric

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 18:51 ` Eric Pilmore
@ 2018-08-09 19:35   ` Logan Gunthorpe
  2018-08-09 19:47     ` Kit Chow
  2018-08-09 21:31     ` Eric Pilmore
  0 siblings, 2 replies; 48+ messages in thread

From: Logan Gunthorpe @ 2018-08-09 19:35 UTC (permalink / raw)
To: Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

Hey,

On 09/08/18 12:51 PM, Eric Pilmore wrote:
>>> Was wondering if anybody here has used IOAT DMA engines with an
>>> IOMMU turned on (Xeon based system)? My specific question is really
>>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the
>>> destination without having to map that address to the IOVA space of
>>> the DMA engine first (assuming the IOMMU is on)?

I haven't tested this scenario but my guess would be that IOAT would
indeed go through the IOMMU and the PCI BAR address would need to be
properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
is probably a good indication that this is the case. I really don't know
why you'd want to DMA something without mapping it.

>> So is this a peer-to-peer DMA scenario? You mention DMA, which would
>> be a transaction initiated by a PCI device, to a PCI BAR address, so
>> it doesn't sound like system memory is involved.
>
> No, not peer-to-peer. This is from system memory (e.g. SKB buffer which
> has had an IOMMU mapping created) to a PCI BAR address.

It's definitely peer-to-peer in the case where you are using a DMA
engine in the PCI tree. You have the DMA PCI device sending TLPs
directly to the PCI BAR device. So, if everything is done right, the
TLPs will avoid the root complex completely. (Though, ACS settings could
also prevent this from working and you'd either get similar DMAR errors
or they'd disappear into a black hole).

When using the IOAT, it is part of the CPU so I wouldn't say it's really
peer-to-peer. But an argument could be made that it is. Though, this is
exactly what the existing ntb_transport is doing: DMAing from system
memory to a PCI BAR and vice versa using IOAT.

Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 19:35 ` Logan Gunthorpe
@ 2018-08-09 19:47   ` Kit Chow
  2018-08-09 20:11     ` Logan Gunthorpe
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-09 19:47 UTC (permalink / raw)
To: Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 08/09/2018 12:35 PM, Logan Gunthorpe wrote:
> Hey,
>
> On 09/08/18 12:51 PM, Eric Pilmore wrote:
>>>> Was wondering if anybody here has used IOAT DMA engines with an
>>>> IOMMU turned on (Xeon based system)? My specific question is really
>>>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the
>>>> destination without having to map that address to the IOVA space of
>>>> the DMA engine first (assuming the IOMMU is on)?
> I haven't tested this scenario but my guess would be that IOAT would
> indeed go through the IOMMU and the PCI BAR address would need to be
> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
> is probably a good indication that this is the case. I really don't know
> why you'd want to DMA something without mapping it.

I have experimented with changing ntb_async_tx_submit to dma_map the
PCI BAR address. With this, I get a different DMAR error:

    DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
    [fault reason 12] non-zero reserved fields in PTE

Kit

>>> So is this a peer-to-peer DMA scenario? You mention DMA, which would
>>> be a transaction initiated by a PCI device, to a PCI BAR address, so
>>> it doesn't sound like system memory is involved.
>> No, not peer-to-peer. This is from system memory (e.g. SKB buffer which
>> has had an IOMMU mapping created) to a PCI BAR address.
> It's definitely peer-to-peer in the case where you are using a DMA
> engine in the PCI tree. You have the DMA PCI device sending TLPs
> directly to the PCI BAR device. So, if everything is done right, the
> TLPs will avoid the root complex completely. (Though, ACS settings could
> also prevent this from working and you'd either get similar DMAR errors
> or they'd disappear into a black hole).
>
> When using the IOAT, it is part of the CPU so I wouldn't say it's really
> peer-to-peer. But an argument could be made that it is. Though, this is
> exactly what the existing ntb_transport is doing: DMAing from system
> memory to a PCI BAR and vice versa using IOAT.
>
> Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 19:47 ` Kit Chow
@ 2018-08-09 20:11   ` Logan Gunthorpe
  2018-08-09 20:57     ` Kit Chow
  0 siblings, 1 reply; 48+ messages in thread

From: Logan Gunthorpe @ 2018-08-09 20:11 UTC (permalink / raw)
To: Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 09/08/18 01:47 PM, Kit Chow wrote:
>> I haven't tested this scenario but my guess would be that IOAT would
>> indeed go through the IOMMU and the PCI BAR address would need to be
>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
>> is probably a good indication that this is the case. I really don't know
>> why you'd want to DMA something without mapping it.
> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
> BAR
> address. With this, I get a different DMAR error:

What code did you use to do this?

> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
> [fault reason 12] non-zero reserved fields in PTE

Also, what device corresponds to 07:00.4 on your system?

Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 20:11 ` Logan Gunthorpe
@ 2018-08-09 20:57   ` Kit Chow
  2018-08-09 21:11     ` Logan Gunthorpe
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-09 20:57 UTC (permalink / raw)
To: Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 08/09/2018 01:11 PM, Logan Gunthorpe wrote:
>
> On 09/08/18 01:47 PM, Kit Chow wrote:
>>> I haven't tested this scenario but my guess would be that IOAT would
>>> indeed go through the IOMMU and the PCI BAR address would need to be
>>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
>>> is probably a good indication that this is the case. I really don't know
>>> why you'd want to DMA something without mapping it.
>> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
>> BAR
>> address. With this, I get a different DMAR error:
> What code did you use to do this?

If you mean version of linux, it is 4.15.7. Or specific dma_map call, I
believe it was dma_map_single.

>> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
>> [fault reason 12] non-zero reserved fields in PTE
> Also, what device corresponds to 07:00.4 on your system?

I believe 07:00.4 was the PLX dma device. I get the same error with ioat.

Kit

> Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 20:57 ` Kit Chow
@ 2018-08-09 21:11   ` Logan Gunthorpe
  2018-08-09 21:47     ` Kit Chow
  0 siblings, 1 reply; 48+ messages in thread

From: Logan Gunthorpe @ 2018-08-09 21:11 UTC (permalink / raw)
To: Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 09/08/18 02:57 PM, Kit Chow wrote:
>
> On 08/09/2018 01:11 PM, Logan Gunthorpe wrote:
>> On 09/08/18 01:47 PM, Kit Chow wrote:
>>>> I haven't tested this scenario but my guess would be that IOAT would
>>>> indeed go through the IOMMU and the PCI BAR address would need to be
>>>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
>>>> is probably a good indication that this is the case. I really don't know
>>>> why you'd want to DMA something without mapping it.
>>> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
>>> BAR
>>> address. With this, I get a different DMAR error:
>> What code did you use to do this?
> If you mean version of linux, it is 4.15.7. Or specific dma_map call, I
> believe it was dma_map_single.

I mean the complete code you use to do the mapping so we can see if it's
correct. dma_map_single() seems like an odd choice, I expected to see
dma_map_resource().

>>> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
>>> [fault reason 12] non-zero reserved fields in PTE
>> Also, what device corresponds to 07:00.4 on your system?
> I believe 07.00.4 was the PLX dma device. I get the same error with ioat.

Using the mapping with the PLX dma device likely converts it from a pure
P2P request to one where the TLPs pass through the IOMMU. So the fact
that you get the same error with both means IOAT almost certainly goes
through the IOMMU and there's something wrong with the mapping setup.

Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 21:11 ` Logan Gunthorpe
@ 2018-08-09 21:47   ` Kit Chow
  2018-08-09 22:40     ` Jiang, Dave
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-09 21:47 UTC (permalink / raw)
To: Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 08/09/2018 02:11 PM, Logan Gunthorpe wrote:
>
> On 09/08/18 02:57 PM, Kit Chow wrote:
>>
>> On 08/09/2018 01:11 PM, Logan Gunthorpe wrote:
>>> On 09/08/18 01:47 PM, Kit Chow wrote:
>>>>> I haven't tested this scenario but my guess would be that IOAT would
>>>>> indeed go through the IOMMU and the PCI BAR address would need to be
>>>>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
>>>>> is probably a good indication that this is the case. I really don't know
>>>>> why you'd want to DMA something without mapping it.
>>>> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
>>>> BAR
>>>> address. With this, I get a different DMAR error:
>>> What code did you use to do this?
>> If you mean version of linux, it is 4.15.7. Or specific dma_map call, I
>> believe it was dma_map_single.
> I mean the complete code you use to do the mapping so we can see if it's
> correct. dma_map_single() seems like an odd choice, I expected to see
> dma_map_resource().

Thanks for the suggestion!  Will try out dma_map_resource and report back.

Kit

>>>> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
>>>> [fault reason 12] non-zero reserved fields in PTE
>>> Also, what device corresponds to 07:00.4 on your system?
>> I believe 07.00.4 was the PLX dma device. I get the same error with ioat.
> Using the mapping with the PLX dma device likely converts it from a pure
> P2P request to one where the TLPs pass through the IOMMU. So the fact
> that you get the same error with both means IOAT almost certainly goes
> through the IOMMU and there's something wrong with the mapping setup.
>
> Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
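[Editor's note: the dma_map_resource() usage being suggested above would look roughly like the sketch below. This is a hedged illustration, not code from this thread; `dma_dev`, `bar_phys`, and `len` are hypothetical names, and the call only succeeds on kernels whose IOMMU driver implements the ->map_resource hook.]

```c
/* Sketch: map a peer device's PCI BAR into the DMA engine's IOVA
 * space before programming the engine with it.  Hypothetical names. */
dma_addr_t iova;

iova = dma_map_resource(dma_dev->dev, bar_phys, len,
			DMA_TO_DEVICE, 0);
if (dma_mapping_error(dma_dev->dev, iova))
	return -EIO;		/* e.g. ->map_resource not implemented */

/* ... hand 'iova' (not bar_phys) to the DMA engine as destination ... */

dma_unmap_resource(dma_dev->dev, iova, len, DMA_TO_DEVICE, 0);
```

Unlike dma_map_single(), dma_map_resource() takes a phys_addr_t directly and never calls virt_to_page(), which is why it is the appropriate interface for MMIO/BAR addresses.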
* RE: IOAT DMA w/IOMMU
  2018-08-09 21:47 ` Kit Chow
@ 2018-08-09 22:40   ` Jiang, Dave
  2018-08-09 22:48     ` Kit Chow
  0 siblings, 1 reply; 48+ messages in thread

From: Jiang, Dave @ 2018-08-09 22:40 UTC (permalink / raw)
To: Kit Chow, Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Kit Chow
> Sent: Thursday, August 9, 2018 2:48 PM
> To: Logan Gunthorpe <logang@deltatee.com>; Eric Pilmore <epilmore@gigaio.com>; Bjorn Helgaas <helgaas@kernel.org>
> Cc: linux-pci@vger.kernel.org; David Woodhouse <dwmw2@infradead.org>; Alex Williamson <alex.williamson@redhat.com>;
> iommu@lists.linux-foundation.org
> Subject: Re: IOAT DMA w/IOMMU
>
> On 08/09/2018 02:11 PM, Logan Gunthorpe wrote:
> >
> > On 09/08/18 02:57 PM, Kit Chow wrote:
> >>
> >> On 08/09/2018 01:11 PM, Logan Gunthorpe wrote:
> >>> On 09/08/18 01:47 PM, Kit Chow wrote:
> >>>>> I haven't tested this scenario but my guess would be that IOAT would
> >>>>> indeed go through the IOMMU and the PCI BAR address would need to be
> >>>>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
> >>>>> is probably a good indication that this is the case. I really don't know
> >>>>> why you'd want to DMA something without mapping it.
> >>>> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
> >>>> BAR
> >>>> address. With this, I get a different DMAR error:
> >>> What code did you use to do this?
> >> If you mean version of linux, it is 4.15.7.  Or specific dma_map call, I
> >> believe it was dma_map_single.
> > I mean the complete code you use to do the mapping so we can see if it's
> > correct. dma_map_single() seems like an odd choice, I expected to see
> > dma_map_resource().
> Thanks for the suggestion!  Will try out dma_map_resource and report back.
>
> Kit

Kit, I was able to try this on the Skylake Xeon platform with Intel NTB and ioatdma and I did not encounter any issues. Unfortunately I do not have my pre-Skylake platform to test anymore to verify.

>
> >>>> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
> >>>> [fault reason 12] non-zero reserved fields in PTE
> >>> Also, what device corresponds to 07:00.4 on your system?
> >> I believe 07.00.4 was the PLX dma device. I get the same error with ioat.
> > Using the mapping with the PLX dma device likely converts it from a pure
> > P2P request to one where the TLPs pass through the IOMMU. So the fact
> > that you get the same error with both means IOAT almost certainly goes
> > through the IOMMU and there's something wrong with the mapping setup.
> >
> > Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 22:40 ` Jiang, Dave
@ 2018-08-09 22:48   ` Kit Chow
  2018-08-09 22:50     ` Logan Gunthorpe
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-09 22:48 UTC (permalink / raw)
To: Jiang, Dave, Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 08/09/2018 03:40 PM, Jiang, Dave wrote:
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Kit Chow
>> Sent: Thursday, August 9, 2018 2:48 PM
>> To: Logan Gunthorpe <logang@deltatee.com>; Eric Pilmore <epilmore@gigaio.com>; Bjorn Helgaas <helgaas@kernel.org>
>> Cc: linux-pci@vger.kernel.org; David Woodhouse <dwmw2@infradead.org>; Alex Williamson <alex.williamson@redhat.com>;
>> iommu@lists.linux-foundation.org
>> Subject: Re: IOAT DMA w/IOMMU
>>
>> On 08/09/2018 02:11 PM, Logan Gunthorpe wrote:
>>> On 09/08/18 02:57 PM, Kit Chow wrote:
>>>> On 08/09/2018 01:11 PM, Logan Gunthorpe wrote:
>>>>> On 09/08/18 01:47 PM, Kit Chow wrote:
>>>>>>> I haven't tested this scenario but my guess would be that IOAT would
>>>>>>> indeed go through the IOMMU and the PCI BAR address would need to be
>>>>>>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors
>>>>>>> is probably a good indication that this is the case. I really don't know
>>>>>>> why you'd want to DMA something without mapping it.
>>>>>> I have experimented with changing ntb_async_tx_submit to dma_map the PCI
>>>>>> BAR
>>>>>> address. With this, I get a different DMAR error:
>>>>> What code did you use to do this?
>>>> If you mean version of linux, it is 4.15.7. Or specific dma_map call, I
>>>> believe it was dma_map_single.
>>> I mean the complete code you use to do the mapping so we can see if it's
>>> correct. dma_map_single() seems like an odd choice, I expected to see
>>> dma_map_resource().
>> Thanks for the suggestion!  Will try out dma_map_resource and report back.
>>
>> Kit
> Kit, I was able to try this on the Skylake Xeon platform with Intel NTB and ioatdma and I did not encounter any issues. Unfortunately I do not have my pre-Skylake platform to test anymore to verify.

We are expecting to get skylakes within a couple of weeks and I'll have
a chance to give it a go on there. Based on Logan's comments, I am very
hopeful that the dma_map_resource will make things work on the older
platforms...

Thanks
Kit

>>>>>> DMAR: [DMA Write] Request device [07:00.4] fault addr ffd00000
>>>>>> [fault reason 12] non-zero reserved fields in PTE
>>>>> Also, what device corresponds to 07:00.4 on your system?
>>>> I believe 07.00.4 was the PLX dma device. I get the same error with ioat.
>>> Using the mapping with the PLX dma device likely converts it from a pure
>>> P2P request to one where the TLPs pass through the IOMMU. So the fact
>>> that you get the same error with both means IOAT almost certainly goes
>>> through the IOMMU and there's something wrong with the mapping setup.
>>>
>>> Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 22:48 ` Kit Chow
@ 2018-08-09 22:50   ` Logan Gunthorpe
  2018-08-09 23:00     ` Kit Chow
  0 siblings, 1 reply; 48+ messages in thread

From: Logan Gunthorpe @ 2018-08-09 22:50 UTC (permalink / raw)
To: Kit Chow, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 09/08/18 04:48 PM, Kit Chow wrote:
> Based on Logan's comments, I am very hopeful that the dma_map_resource
> will make things work on the older platforms...

Well, I *think* dma_map_single() would still work. So I'm not that
confident that's the root of your problem. I'd still like to see the
actual code snippet you are using.

Logan

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 22:50 ` Logan Gunthorpe
@ 2018-08-09 23:00   ` Kit Chow
  2018-08-10 16:02     ` Kit Chow
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-09 23:00 UTC (permalink / raw)
To: Logan Gunthorpe, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

On 08/09/2018 03:50 PM, Logan Gunthorpe wrote:
>
> On 09/08/18 04:48 PM, Kit Chow wrote:
>> Based on Logan's comments, I am very hopeful that the dma_map_resource
>> will make things work on the older platforms...
> Well, I *think* dma_map_single() would still work. So I'm not that
> confident that's the root of your problem. I'd still like to see the
> actual code snippet you are using.
>
> Logan

Here's the code snippet - (ntbdebug & 4) path does dma_map_resource of
the pci bar address.

It was:
	unmap->addr[1] = dma_map_single(device->dev, (void *)dest, len,
					DMA_TO_DEVICE);

Kit
---

static int ntb_async_tx_submit(struct ntb_transport_qp *qp,
			       struct ntb_queue_entry *entry)
{
	struct dma_async_tx_descriptor *txd;
	struct dma_chan *chan = qp->tx_dma_chan;
	struct dma_device *device;
	size_t len = entry->len;
	void *buf = entry->buf;
	size_t dest_off, buff_off;
	struct dmaengine_unmap_data *unmap;
	dma_addr_t dest;
	dma_cookie_t cookie;
	int unmapcnt;

	device = chan->device;

	dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;

	buff_off = (size_t)buf & ~PAGE_MASK;
	dest_off = (size_t)dest & ~PAGE_MASK;

	if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
		goto err;

	if (ntbdebug & 0x4) {
		unmapcnt = 2;
	} else {
		unmapcnt = 1;
	}

	unmap = dmaengine_get_unmap_data(device->dev, unmapcnt, GFP_NOWAIT);
	if (!unmap)
		goto err;

	unmap->len = len;
	unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
				      buff_off, len, DMA_TO_DEVICE);
	if (dma_mapping_error(device->dev, unmap->addr[0]))
		goto err_get_unmap;

	if (ntbdebug & 0x4) {
		unmap->addr[1] = dma_map_resource(device->dev,
				(phys_addr_t)dest, len, DMA_TO_DEVICE, 0);
		if (dma_mapping_error(device->dev, unmap->addr[1]))
			goto err_get_unmap;
		unmap->to_cnt = 2;
	} else {
		unmap->addr[1] = dest;
		unmap->to_cnt = 1;
	}

	txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
			unmap->addr[0], len, DMA_PREP_INTERRUPT);
	if (!txd)
		goto err_get_unmap;

	txd->callback_result = ntb_tx_copy_callback;
	txd->callback_param = entry;
	dma_set_unmap(txd, unmap);

	cookie = dmaengine_submit(txd);
	if (dma_submit_error(cookie))
		goto err_set_unmap;

	dmaengine_unmap_put(unmap);

	dma_async_issue_pending(chan);

	return 0;

err_set_unmap:
	dma_descriptor_unmap(txd);
	txd->desc_free(txd);
err_get_unmap:
	dmaengine_unmap_put(unmap);
err:
	return -ENXIO;
}

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-09 23:00 ` Kit Chow
@ 2018-08-10 16:02   ` Kit Chow
  2018-08-10 16:23     ` Kit Chow
  2018-08-10 16:24     ` Logan Gunthorpe
  0 siblings, 2 replies; 48+ messages in thread

From: Kit Chow @ 2018-08-10 16:02 UTC (permalink / raw)
To: Logan Gunthorpe, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

Turns out there is no dma_map_resource routine on x86. get_dma_ops
returns intel_dma_ops which has map_resource pointing to NULL.

(gdb) p intel_dma_ops
$7 = {alloc = 0xffffffff8150f310 <intel_alloc_coherent>,
  free = 0xffffffff8150ec20 <intel_free_coherent>,
  mmap = 0x0 <irq_stack_union>, get_sgtable = 0x0 <irq_stack_union>,
  map_page = 0xffffffff8150f2d0 <intel_map_page>,
  unmap_page = 0xffffffff8150ec10 <intel_unmap_page>,
  map_sg = 0xffffffff8150ef40 <intel_map_sg>,
  unmap_sg = 0xffffffff8150eb80 <intel_unmap_sg>,
  map_resource = 0x0 <irq_stack_union>,
  unmap_resource = 0x0 <irq_stack_union>,
  sync_single_for_cpu = 0x0 <irq_stack_union>,
  sync_single_for_device = 0x0 <irq_stack_union>,
  sync_sg_for_cpu = 0x0 <irq_stack_union>,
  sync_sg_for_device = 0x0 <irq_stack_union>,
  cache_sync = 0x0 <irq_stack_union>,
  mapping_error = 0xffffffff815095f0 <intel_mapping_error>,
  dma_supported = 0xffffffff81033830 <x86_dma_supported>, is_phys = 0}

Will poke around some in the intel_map_page code but can you actually
get a valid struct page for a pci bar address (dma_map_single calls
virt_to_page)? If not, does a map_resource routine that can properly
map a pci bar address need to be implemented?

Kit

---

static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
					      size_t size,
					      enum dma_data_direction dir,
					      unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);
	dma_addr_t addr;

	BUG_ON(!valid_dma_direction(dir));
	addr = ops->map_page(dev, virt_to_page(ptr),
			     offset_in_page(ptr), size,
			     dir, attrs);
	debug_dma_map_page(dev, virt_to_page(ptr), offset_in_page(ptr), size,
			   dir, addr, true);
	return addr;
}

On 08/09/2018 04:00 PM, Kit Chow wrote:
>
> On 08/09/2018 03:50 PM, Logan Gunthorpe wrote:
>>
>> On 09/08/18 04:48 PM, Kit Chow wrote:
>>> Based on Logan's comments, I am very hopeful that the dma_map_resource
>>> will make things work on the older platforms...
>> Well, I *think* dma_map_single() would still work. So I'm not that
>> confident that's the root of your problem. I'd still like to see the
>> actual code snippet you are using.
>>
>> Logan
> Here's the code snippet - (ntbdebug & 4) path does dma_map_resource of
> the pci bar address.
>
> It was:
>	unmap->addr[1] = dma_map_single(device->dev, (void *)dest, len,
>					DMA_TO_DEVICE);
>
> Kit
> ---
>
> static int ntb_async_tx_submit(struct ntb_transport_qp *qp,
>			       struct ntb_queue_entry *entry)
> {
>	struct dma_async_tx_descriptor *txd;
>	struct dma_chan *chan = qp->tx_dma_chan;
>	struct dma_device *device;
>	size_t len = entry->len;
>	void *buf = entry->buf;
>	size_t dest_off, buff_off;
>	struct dmaengine_unmap_data *unmap;
>	dma_addr_t dest;
>	dma_cookie_t cookie;
>	int unmapcnt;
>
>	device = chan->device;
>
>	dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;
>
>	buff_off = (size_t)buf & ~PAGE_MASK;
>	dest_off = (size_t)dest & ~PAGE_MASK;
>
>	if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
>		goto err;
>
>	if (ntbdebug & 0x4) {
>		unmapcnt = 2;
>	} else {
>		unmapcnt = 1;
>	}
>
>	unmap = dmaengine_get_unmap_data(device->dev, unmapcnt, GFP_NOWAIT);
>	if (!unmap)
>		goto err;
>
>	unmap->len = len;
>	unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
>				      buff_off, len, DMA_TO_DEVICE);
>	if (dma_mapping_error(device->dev, unmap->addr[0]))
>		goto err_get_unmap;
>
>	if (ntbdebug & 0x4) {
>		unmap->addr[1] = dma_map_resource(device->dev,
>				(phys_addr_t)dest, len, DMA_TO_DEVICE, 0);
>		if (dma_mapping_error(device->dev, unmap->addr[1]))
>			goto err_get_unmap;
>		unmap->to_cnt = 2;
>	} else {
>		unmap->addr[1] = dest;
>		unmap->to_cnt = 1;
>	}
>
>	txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
>			unmap->addr[0], len, DMA_PREP_INTERRUPT);
>	if (!txd)
>		goto err_get_unmap;
>
>	txd->callback_result = ntb_tx_copy_callback;
>	txd->callback_param = entry;
>	dma_set_unmap(txd, unmap);
>
>	cookie = dmaengine_submit(txd);
>	if (dma_submit_error(cookie))
>		goto err_set_unmap;
>
>	dmaengine_unmap_put(unmap);
>
>	dma_async_issue_pending(chan);
>
>	return 0;
>
> err_set_unmap:
>	dma_descriptor_unmap(txd);
>	txd->desc_free(txd);
> err_get_unmap:
>	dmaengine_unmap_put(unmap);
> err:
>	return -ENXIO;
> }

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU
  2018-08-10 16:02 ` Kit Chow
@ 2018-08-10 16:23   ` Kit Chow
  2018-08-10 16:24     ` Logan Gunthorpe
  0 siblings, 1 reply; 48+ messages in thread

From: Kit Chow @ 2018-08-10 16:23 UTC (permalink / raw)
To: Logan Gunthorpe, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu

There is an internal routine (__intel_map_single) inside the intel
iommu code that does the actual mapping using a phys_addr_t. Think
I'll try to implement an intel_map_resource routine that calls that
routine directly without all of the conversions done for
dma_map_{single,page} (pci bar addr -> page -> phys_addr)...

On 08/10/2018 09:02 AM, Kit Chow wrote:
> Turns out there is no dma_map_resource routine on x86. get_dma_ops
> returns intel_dma_ops which has map_resource pointing to NULL.
>
> (gdb) p intel_dma_ops
> $7 = {alloc = 0xffffffff8150f310 <intel_alloc_coherent>,
>   free = 0xffffffff8150ec20 <intel_free_coherent>,
>   mmap = 0x0 <irq_stack_union>, get_sgtable = 0x0 <irq_stack_union>,
>   map_page = 0xffffffff8150f2d0 <intel_map_page>,
>   unmap_page = 0xffffffff8150ec10 <intel_unmap_page>,
>   map_sg = 0xffffffff8150ef40 <intel_map_sg>,
>   unmap_sg = 0xffffffff8150eb80 <intel_unmap_sg>,
>   map_resource = 0x0 <irq_stack_union>,
>   unmap_resource = 0x0 <irq_stack_union>,
>   sync_single_for_cpu = 0x0 <irq_stack_union>,
>   sync_single_for_device = 0x0 <irq_stack_union>,
>   sync_sg_for_cpu = 0x0 <irq_stack_union>,
>   sync_sg_for_device = 0x0 <irq_stack_union>,
>   cache_sync = 0x0 <irq_stack_union>,
>   mapping_error = 0xffffffff815095f0 <intel_mapping_error>,
>   dma_supported = 0xffffffff81033830 <x86_dma_supported>, is_phys = 0}
>
> Will poke around some in the intel_map_page code but can you actually
> get a valid struct page for a pci bar address (dma_map_single calls
> virt_to_page)? If not, does a map_resource routine that can properly
> map a pci bar address need to be implemented?
>
> Kit
>
> ---
>
> static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
>					      size_t size,
>					      enum dma_data_direction dir,
>					      unsigned long attrs)
> {
>	const struct dma_map_ops *ops = get_dma_ops(dev);
>	dma_addr_t addr;
>
>	BUG_ON(!valid_dma_direction(dir));
>	addr = ops->map_page(dev, virt_to_page(ptr),
>			     offset_in_page(ptr), size,
>			     dir, attrs);
>	debug_dma_map_page(dev, virt_to_page(ptr), offset_in_page(ptr), size,
>			   dir, addr, true);
>	return addr;
> }
>
> On 08/09/2018 04:00 PM, Kit Chow wrote:
>>
>> On 08/09/2018 03:50 PM, Logan Gunthorpe wrote:
>>>
>>> On 09/08/18 04:48 PM, Kit Chow wrote:
>>>> Based on Logan's comments, I am very hopeful that the dma_map_resource
>>>> will make things work on the older platforms...
>>> Well, I *think* dma_map_single() would still work. So I'm not that
>>> confident that's the root of your problem. I'd still like to see the
>>> actual code snippet you are using.
>>>
>>> Logan
>> Here's the code snippet - (ntbdebug & 4) path does dma_map_resource
>> of the pci bar address.
>>
>> It was:
>>	unmap->addr[1] = dma_map_single(device->dev, (void *)dest, len,
>>					DMA_TO_DEVICE);
>>
>> Kit
>> ---
>>
>> static int ntb_async_tx_submit(struct ntb_transport_qp *qp,
>>			       struct ntb_queue_entry *entry)
>> {
>>	struct dma_async_tx_descriptor *txd;
>>	struct dma_chan *chan = qp->tx_dma_chan;
>>	struct dma_device *device;
>>	size_t len = entry->len;
>>	void *buf = entry->buf;
>>	size_t dest_off, buff_off;
>>	struct dmaengine_unmap_data *unmap;
>>	dma_addr_t dest;
>>	dma_cookie_t cookie;
>>	int unmapcnt;
>>
>>	device = chan->device;
>>
>>	dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;
>>
>>	buff_off = (size_t)buf & ~PAGE_MASK;
>>	dest_off = (size_t)dest & ~PAGE_MASK;
>>
>>	if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
>>		goto err;
>>
>>	if (ntbdebug & 0x4) {
>>		unmapcnt = 2;
>>	} else {
>>		unmapcnt = 1;
>>	}
>>
>>	unmap = dmaengine_get_unmap_data(device->dev, unmapcnt, GFP_NOWAIT);
>>	if (!unmap)
>>		goto err;
>>
>>	unmap->len = len;
>>	unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
>>				      buff_off, len, DMA_TO_DEVICE);
>>	if (dma_mapping_error(device->dev, unmap->addr[0]))
>>		goto err_get_unmap;
>>
>>	if (ntbdebug & 0x4) {
>>		unmap->addr[1] = dma_map_resource(device->dev,
>>				(phys_addr_t)dest, len, DMA_TO_DEVICE, 0);
>>		if (dma_mapping_error(device->dev, unmap->addr[1]))
>>			goto err_get_unmap;
>>		unmap->to_cnt = 2;
>>	} else {
>>		unmap->addr[1] = dest;
>>		unmap->to_cnt = 1;
>>	}
>>
>>	txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
>>			unmap->addr[0], len, DMA_PREP_INTERRUPT);
>>	if (!txd)
>>		goto err_get_unmap;
>>
>>	txd->callback_result = ntb_tx_copy_callback;
>>	txd->callback_param = entry;
>>	dma_set_unmap(txd, unmap);
>>
>>	cookie = dmaengine_submit(txd);
>>	if (dma_submit_error(cookie))
>>		goto err_set_unmap;
>>
>>	dmaengine_unmap_put(unmap);
>>
>>	dma_async_issue_pending(chan);
>>
>>	return 0;
>>
>> err_set_unmap:
>>	dma_descriptor_unmap(txd);
>>	txd->desc_free(txd);
>> err_get_unmap:
>>	dmaengine_unmap_put(unmap);
>> err:
>>	return -ENXIO;
>> }

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 16:23 ` Kit Chow @ 2018-08-10 16:24 ` Logan Gunthorpe 0 siblings, 0 replies; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-10 16:24 UTC (permalink / raw) To: Kit Chow, Jiang, Dave, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 10/08/18 10:23 AM, Kit Chow wrote: > There is an internal routine (__intel_map_single) inside the intel iommu > code that does the actual mapping using a phys_addr_t. Think I'll try to > implement a intel_map_resource routine that calls that routine directly > without all of the conversions done for dma_map_{single,page} (pci bar > addr -> page -> phys_addr)... Nice, yes, that was what I was thinking. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
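[Editor's sketch of the routine proposed above. This is not code from the thread; it assumes __intel_map_single() and intel_unmap() keep the signatures found in drivers/iommu/intel-iommu.c of this era, and both entries would be wired into intel_dma_ops as .map_resource / .unmap_resource.]

```c
/* Sketch only: map a raw physical (BAR) address with no
 * page/virt conversions, as described in the messages above. */
static dma_addr_t intel_map_resource(struct device *dev,
				     phys_addr_t phys_addr, size_t size,
				     enum dma_data_direction dir,
				     unsigned long attrs)
{
	/* Hand the physical address straight to the internal mapper. */
	return __intel_map_single(dev, phys_addr, size, dir,
				  *dev->dma_mask);
}

static void intel_unmap_resource(struct device *dev, dma_addr_t dev_addr,
				 size_t size, enum dma_data_direction dir,
				 unsigned long attrs)
{
	intel_unmap(dev, dev_addr, size);
}
```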
* Re: IOAT DMA w/IOMMU 2018-08-10 16:02 ` Kit Chow 2018-08-10 16:23 ` Kit Chow @ 2018-08-10 16:24 ` Logan Gunthorpe 2018-08-10 16:31 ` Dave Jiang 1 sibling, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-10 16:24 UTC (permalink / raw) To: Kit Chow, Jiang, Dave, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 10/08/18 10:02 AM, Kit Chow wrote: > Turns out there is no dma_map_resource routine on x86. get_dma_ops > returns intel_dma_ops which has map_resource pointing to NULL. Oh, yup. I wasn't aware of that. From a cursory view, it looks like it shouldn't be too hard to implement though. > Will poke around some in the intel_map_page code but can you actually > get a valid struct page for a pci bar address (dma_map_single calls > virt_to_page)? If not, does a map_resource routine that can properly > map a pci bar address need to be implemented? Yes, you can not get a struct page for a PCI bar address unless it's mapped with ZONE_DEVICE like in my p2p work. So that would explain why dma_map_single() didn't work. This all implies that ntb_transport doesn't work with DMA and the IOMMU turned on. I'm not sure I've ever tried that configuration myself but it is a bit surprising. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 16:24 ` Logan Gunthorpe @ 2018-08-10 16:31 ` Dave Jiang 2018-08-10 16:33 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Dave Jiang @ 2018-08-10 16:31 UTC (permalink / raw) To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 08/10/2018 09:24 AM, Logan Gunthorpe wrote: > > > On 10/08/18 10:02 AM, Kit Chow wrote: >> Turns out there is no dma_map_resource routine on x86. get_dma_ops >> returns intel_dma_ops which has map_resource pointing to NULL. > > Oh, yup. I wasn't aware of that. From a cursory view, it looks like it > shouldn't be too hard to implement though. > >> Will poke around some in the intel_map_page code but can you actually >> get a valid struct page for a pci bar address (dma_map_single calls >> virt_to_page)? If not, does a map_resource routine that can properly >> map a pci bar address need to be implemented? > > Yes, you can not get a struct page for a PCI bar address unless it's > mapped with ZONE_DEVICE like in my p2p work. So that would explain why > dma_map_single() didn't work. > > This all implies that ntb_transport doesn't work with DMA and the IOMMU > turned on. I'm not sure I've ever tried that configuration myself but it > is a bit surprising. Hmm....that's surprising because it seems to work on Skylake platform when I tested it yesterday with Intel NTB. Kit is using a Haswell platform at the moment I think. Although I'm curious if it works with the PLX NTB he's using on Skylake. > > Logan > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 16:31 ` Dave Jiang @ 2018-08-10 16:33 ` Logan Gunthorpe 2018-08-10 17:01 ` Dave Jiang 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-10 16:33 UTC (permalink / raw) To: Dave Jiang, Kit Chow, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 10/08/18 10:31 AM, Dave Jiang wrote: > > > On 08/10/2018 09:24 AM, Logan Gunthorpe wrote: >> >> >> On 10/08/18 10:02 AM, Kit Chow wrote: >>> Turns out there is no dma_map_resource routine on x86. get_dma_ops >>> returns intel_dma_ops which has map_resource pointing to NULL. >> >> Oh, yup. I wasn't aware of that. From a cursory view, it looks like it >> shouldn't be too hard to implement though. >> >>> Will poke around some in the intel_map_page code but can you actually >>> get a valid struct page for a pci bar address (dma_map_single calls >>> virt_to_page)? If not, does a map_resource routine that can properly >>> map a pci bar address need to be implemented? >> >> Yes, you can not get a struct page for a PCI bar address unless it's >> mapped with ZONE_DEVICE like in my p2p work. So that would explain why >> dma_map_single() didn't work. >> >> This all implies that ntb_transport doesn't work with DMA and the IOMMU >> turned on. I'm not sure I've ever tried that configuration myself but it >> is a bit surprising. > > Hmm....that's surprising because it seems to work on Skylake platform > when I tested it yesterday with Intel NTB. Kit is using a Haswell > platform at the moment I think. Although I'm curious if it works with > the PLX NTB he's using on Skylake. Does that mean on Skylake the IOAT can bypass the IOMMU? Because it looks like the ntb_transport code doesn't map the physical address of the NTB MW into the IOMMU when doing DMA... Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 16:33 ` Logan Gunthorpe @ 2018-08-10 17:01 ` Dave Jiang 2018-08-10 17:15 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Dave Jiang @ 2018-08-10 17:01 UTC (permalink / raw) To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 08/10/2018 09:33 AM, Logan Gunthorpe wrote: > > > On 10/08/18 10:31 AM, Dave Jiang wrote: >> >> >> On 08/10/2018 09:24 AM, Logan Gunthorpe wrote: >>> >>> >>> On 10/08/18 10:02 AM, Kit Chow wrote: >>>> Turns out there is no dma_map_resource routine on x86. get_dma_ops >>>> returns intel_dma_ops which has map_resource pointing to NULL. >>> >>> Oh, yup. I wasn't aware of that. From a cursory view, it looks like it >>> shouldn't be too hard to implement though. >>> >>>> Will poke around some in the intel_map_page code but can you actually >>>> get a valid struct page for a pci bar address (dma_map_single calls >>>> virt_to_page)? If not, does a map_resource routine that can properly >>>> map a pci bar address need to be implemented? >>> >>> Yes, you can not get a struct page for a PCI bar address unless it's >>> mapped with ZONE_DEVICE like in my p2p work. So that would explain why >>> dma_map_single() didn't work. >>> >>> This all implies that ntb_transport doesn't work with DMA and the IOMMU >>> turned on. I'm not sure I've ever tried that configuration myself but it >>> is a bit surprising. >> >> Hmm....that's surprising because it seems to work on Skylake platform >> when I tested it yesterday with Intel NTB. Kit is using a Haswell >> platform at the moment I think. Although I'm curious if it works with >> the PLX NTB he's using on Skylake. > > Does that mean on Skylake the IOAT can bypass the IOMMU? Because it > looks like the ntb_transport code doesn't map the physical address of > the NTB MW into the IOMMU when doing DMA... Or if the BIOS has provided mapping for the Intel NTB device specifically? Is that a possibility? 
NTB does go through the IOMMU. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 17:01 ` Dave Jiang @ 2018-08-10 17:15 ` Logan Gunthorpe 2018-08-10 17:46 ` Dave Jiang 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-10 17:15 UTC (permalink / raw) To: Dave Jiang, Kit Chow, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 10/08/18 11:01 AM, Dave Jiang wrote: > Or if the BIOS has provided mapping for the Intel NTB device > specifically? Is that a possibility? NTB does go through the IOMMU. I don't know if the BIOS is doing it, but that would at best only work for Intel NTB.... I see no hope of getting the BIOS to map the MW for a Switchtec NTB device... Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 17:15 ` Logan Gunthorpe @ 2018-08-10 17:46 ` Dave Jiang 2018-08-11 0:53 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Dave Jiang @ 2018-08-10 17:46 UTC (permalink / raw) To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 08/10/2018 10:15 AM, Logan Gunthorpe wrote: > > > On 10/08/18 11:01 AM, Dave Jiang wrote: >> Or if the BIOS has provided mapping for the Intel NTB device >> specifically? Is that a possibility? NTB does go through the IOMMU. > > I don't know if the BIOS is doing it, but that would at best only > work for Intel NTB.... I see no hope of getting the BIOS to map the MW > for a Switchtec NTB device... Right. I'm just speculating why it may possibly work. But yes, I think the kernel will need appropriate mapping if it's not happening right now. Hopefully Kit is onto something. > > Logan > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-10 17:46 ` Dave Jiang @ 2018-08-11 0:53 ` Kit Chow 2018-08-11 2:10 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-11 0:53 UTC (permalink / raw) To: Dave Jiang, Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu Success! I've implemented a new intel_map_resource (and intel_unmap_resource) routine which is called by dma_map_resource. As mentioned previously, the primary job of dma_map_resource/intel_map_resource is to call the intel iommu internal mapping routine (__intel_map_single) without translating the pci bar address into a page and then back to a phys addr as dma_map_page and dma_map_single are doing. With the new dma_map_resource routine mapping the pci bar address in ntb_transport_tx_submit, I was still getting the "PTE Write access is not set" error. The __intel_map_single routine sets prot based on the DMA direction. if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL || \ !cap_zlr(iommu->cap)) prot |= DMA_PTE_READ; if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) prot |= DMA_PTE_WRITE; I was able to finally succeed in doing the dma transfers over ioat only when prot has DMA_PTE_WRITE set by setting the direction to either DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings need to be changed? Are there any bad side effects if I used DMA_BIDIRECTIONAL? Given that using the pci bar address as is without getting an iommu address results in the same "PTE Write access" error, I wonder if there is some internal 'prot' associated with the non-translated pci bar address that just needs to be tweaked to include DMA_PTE_WRITE??? Thanks! On 08/10/2018 10:46 AM, Dave Jiang wrote: > > On 08/10/2018 10:15 AM, Logan Gunthorpe wrote: >> >> On 10/08/18 11:01 AM, Dave Jiang wrote: >>> Or if the BIOS has provided mapping for the Intel NTB device >>> specifically? Is that a possibility? NTB does go through the IOMMU. 
>> I don't know but if the BIOS is doing it, but that would only at best >> work for Intel NTB.... I see no hope in getting the BIOS to map the MW >> for a Switchtec NTB device... > Right. I'm just speculating why it may possibly work. But yes, I think > kernel will need appropriate mapping if it's not happening right now. > Hopefully Kit is onto something. > >> Logan >> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-11 0:53 ` Kit Chow @ 2018-08-11 2:10 ` Logan Gunthorpe 2018-08-13 14:23 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-11 2:10 UTC (permalink / raw) To: Kit Chow, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 10/08/18 06:53 PM, Kit Chow wrote: > I was able to finally succeed in doing the dma transfers over ioat only > when prot has DMA_PTE_WRITE set by setting the direction to either > DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings > need to be changed? Are there any bad side effects if I used > DMA_BIDIRECTIONAL? Good to hear it. Without digging into the direction much all I can say is that it can sometimes be very confusing what the direction is. Adding another PCI device just adds to the confusion. I believe, the direction should be from the IOAT's point of view. So if the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. data is going to the IOAT). Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection that the buffer would have in terms of direction. Generally it's good practice to use the strictest direction you can. > Given that using the pci bar address as is without getting an iommu > address results in the same "PTE Write access" error, I wonder if there > is some internal 'prot' associated with the non-translated pci bar > address that just needs to be tweaked to include DMA_PTE_WRITE??? No, I don't think so. The 'prot' will be a property of the IOMMU. Not having an entry is probably just the same (from the perspective of the error you see) as only having an entry for reading. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-11 2:10 ` Logan Gunthorpe @ 2018-08-13 14:23 ` Kit Chow 2018-08-13 14:59 ` Robin Murphy 0 siblings, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-13 14:23 UTC (permalink / raw) To: Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, David Woodhouse, Alex Williamson, iommu On 08/10/2018 07:10 PM, Logan Gunthorpe wrote: > > On 10/08/18 06:53 PM, Kit Chow wrote: >> I was able to finally succeed in doing the dma transfers over ioat only >> when prot has DMA_PTE_WRITE set by setting the direction to either >> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings >> need to be changed? Are there any bad side effects if I used >> DMA_BIDIRECTIONAL? > Good to hear it. Without digging into the direction much all I can say > is that it can sometimes be very confusing what the direction is. Adding > another PCI device just adds to the confusion. Yep, confusing :). ======================= ============================================= DMA_NONE no direction (used for debugging) DMA_TO_DEVICE data is going from the memory to the device DMA_FROM_DEVICE data is coming from the device to the memory DMA_BIDIRECTIONAL direction isn't known ======================= ============================================= > I believe, the direction should be from the IOAT's point of view. So if > the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is > coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. > data is going to the IOAT). It would certainly seem like DMA_TO_DEVICE would be the proper choice; IOAT is the plumbing to move host data (memory) to the bar address (device). Will go with what works and set DMA_FROM_DEVICE. In ntb_async_tx_submit, does the direction used for the dma_map routines for the src and dest addresses need to be consistent? And does the direction setting for the dmaengine_unmap_data have to be consistent with the direction used in dma_map_*? 
BTW, the dmaengine_unmap routine only calls dma_unmap_page. Should it keep track of the dma_map routine used and call the corresponding dma_unmap routine? In the case of the intel iommu, it doesn't matter. Thanks Kit > > Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection > that the buffer would have in terms of direction. Generally it's good > practice to use the strictest direction you can. > >> Given that using the pci bar address as is without getting an iommu >> address results in the same "PTE Write access" error, I wonder if there >> is some internal 'prot' associated with the non-translated pci bar >> address that just needs to be tweaked to include DMA_PTE_WRITE??? > No, I don't think so. The 'prot' will be a property of the IOMMU. Not > having an entry is probably just the same (from the perspective of the > error you see) as only having an entry for reading. > > Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 14:23 ` Kit Chow @ 2018-08-13 14:59 ` Robin Murphy 2018-08-13 15:21 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Robin Murphy @ 2018-08-13 14:59 UTC (permalink / raw) To: Kit Chow, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 13/08/18 15:23, Kit Chow wrote: > On 08/10/2018 07:10 PM, Logan Gunthorpe wrote: >> >> On 10/08/18 06:53 PM, Kit Chow wrote: >>> I was able to finally succeed in doing the dma transfers over ioat only >>> when prot has DMA_PTE_WRITE set by setting the direction to either >>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings >>> need to be changed? Are there any bad side effects if I used >>> DMA_BIDIRECTIONAL? >> Good to hear it. Without digging into the direction much all I can say >> is that it can sometimes be very confusing what the direction is. Adding >> another PCI device just adds to the confusion. > Yep, confusing :). > > ======================= ============================================= > DMA_NONE no direction (used for debugging) > DMA_TO_DEVICE data is going from the memory to the device > DMA_FROM_DEVICE data is coming from the device to the memory > DMA_BIDIRECTIONAL direction isn't known > ======================= ============================================= > >> I believe, the direction should be from the IOAT's point of view. So if >> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is >> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. >> data is going to the IOAT). > It would certainly seem like DMA_TO_DEVICE would be the proper choice; > IOAT is the plumbing to move host data (memory) to the bar address > (device). Except that the "device" in question is the IOAT itself (more generally, it means the device represented by the first argument to dma_map_*() - the one actually emitting the reads and writes). 
The context of a DMA API call is the individual mapping in question, not whatever overall operation it may be part of - your example already involves two separate mappings: one "from" system memory "to" the DMA engine, and one "from" the DMA engine "to" PCI BAR memory. Note that the DMA API's dma_direction is also distinct from the dmaengine API's dma_transfer_direction, and there's plenty of fun to be had mapping between the two - see pl330.c or rcar-dmac.c for other examples of dma_map_resource() for slave devices - no guarantees that those implementations are entirely correct (especially the one I did!), but in practice they do make the "DMA engine behind an IOMMU" case work for UARTs and similar straightforward slaves. > Will go with what works and set DMA_FROM_DEVICE. > > In ntb_async_tx_submit, does the direction used for the dma_map routines > for the src and dest addresses need to be consistent? In general, the mappings of source and destination addresses would typically have opposite directions as above, unless they're both bidirectional. > And does the direction setting for the dmaengine_unmap_data have to be > consistent with the direction used in dma_map_*? Yes, the arguments to an unmap are expected to match whatever was passed to the corresponding map call. CONFIG_DMA_API_DEBUG should help catch any mishaps. Robin. > BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it keep > track of the dma_map routine used and call the corresponding dma_unmap > routine? In the case of the intel iommu, it doesn't matter. > > Thanks > Kit > >> >> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection >> that the buffer would have in terms of direction. Generally it's good >> practice to use the strictest direction you can. 
>> >>> Given that using the pci bar address as is without getting an iommu >>> address results in the same "PTE Write access" error, I wonder if there >>> is some internal 'prot' associated with the non-translated pci bar >>> address that just needs to be tweaked to include DMA_PTE_WRITE??? >> No, I don't think so. The 'prot' will be a property of the IOMMU. Not >> having an entry is probably just the same (from the perspective of the >> error you see) as only having an entry for reading. >> >> Logan > > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 14:59 ` Robin Murphy @ 2018-08-13 15:21 ` Kit Chow 2018-08-13 23:30 ` Kit Chow 2018-08-13 23:36 ` Kit Chow 0 siblings, 2 replies; 48+ messages in thread From: Kit Chow @ 2018-08-13 15:21 UTC (permalink / raw) To: Robin Murphy, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 08/13/2018 07:59 AM, Robin Murphy wrote: > On 13/08/18 15:23, Kit Chow wrote: >> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote: >>> >>> On 10/08/18 06:53 PM, Kit Chow wrote: >>>> I was able to finally succeed in doing the dma transfers over ioat >>>> only >>>> when prot has DMA_PTE_WRITE set by setting the direction to either >>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings >>>> need to be changed? Are there any bad side effects if I used >>>> DMA_BIDIRECTIONAL? >>> Good to hear it. Without digging into the direction much all I can say >>> is that it can sometimes be very confusing what the direction is. >>> Adding >>> another PCI device just adds to the confusion. >> Yep, confusing :). >> >> ======================= ============================================= >> DMA_NONE no direction (used for debugging) >> DMA_TO_DEVICE data is going from the memory to the device >> DMA_FROM_DEVICE data is coming from the device to the memory >> DMA_BIDIRECTIONAL direction isn't known >> ======================= ============================================= >> >>> I believe, the direction should be from the IOAT's point of view. So if >>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is >>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. >>> data is going to the IOAT). >> It would certainly seem like DMA_TO_DEVICE would be the proper >> choice; IOAT is the plumbing to move host data (memory) to the bar >> address (device). 
> > Except that the "device" in question is the IOAT itself (more > generally, it means the device represented by the first argument to > dma_map_*() - the one actually emitting the reads and writes). The > context of a DMA API call is the individual mapping in question, not > whatever overall operation it may be part of - your example already > involves two separate mappings: one "from" system memory "to" the DMA > engine, and one "from" the DMA engine "to" PCI BAR memory. OK, that makes sense. The middleman (aka DMA engine device) is the key in the to/from puzzle. Thanks! > > Note that the DMA API's dma_direction is also distinct from the > dmaengine API's dma_transfer_direction, and there's plenty of fun to > be had mapping between the two - see pl330.c or rcar-dmac.c for other > examples of dma_map_resource() for slave devices - no guarantees that > those implementations are entirely correct (especially the one I > did!), but in practice they do make the "DMA engine behind an IOMMU" > case work for UARTs and similar straightforward slaves. > >> Will go with what works and set DMA_FROM_DEVICE. >> >> In ntb_async_tx_submit, does the direction used for the dma_map >> routines for the src and dest addresses need to be consistent? > > In general, the mappings of source and destination addresses would > typically have opposite directions as above, unless they're both > bidirectional. > >> And does the direction setting for the dmaengine_unmap_data have to >> be consistent with the direction used in dma_map_*? > > Yes, the arguments to an unmap are expected to match whatever was > passed to the corresponding map call. CONFIG_DMA_API_DEBUG should help > catch any mishaps. > > Robin. > >> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it >> keep track of the dma_map routine used and call the corresponding >> dma_unmap routine? In the case of the intel iommu, it doesn't matter. 
>> >> Thanks >> Kit >> >>> >>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection >>> that the buffer would have in terms of direction. Generally it's good >>> practice to use the strictest direction you can. >>> >>>> Given that using the pci bar address as is without getting an iommu >>>> address results in the same "PTE Write access" error, I wonder if >>>> there >>>> is some internal 'prot' associated with the non-translated pci bar >>>> address that just needs to be tweaked to include DMA_PTE_WRITE??? >>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not >>> having an entry is probably just the same (from the perspective of the >>> error you see) as only having an entry for reading. >>> >>> Logan >> >> _______________________________________________ >> iommu mailing list >> iommu@lists.linux-foundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 15:21 ` Kit Chow @ 2018-08-13 23:30 ` Kit Chow 2018-08-13 23:39 ` Logan Gunthorpe 2018-08-13 23:36 ` Kit Chow 1 sibling, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-13 23:30 UTC (permalink / raw) To: Robin Murphy, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu [-- Attachment #1: Type: text/plain, Size: 5857 bytes --] Taking a step back, I was a little surprised that dma_map_single successfully returned an iommu address for the pci bar address passed into it during my initial experiment... Internally, dma_map_single calls virt_to_page() to translate the "virtual address" into a page, and intel_map_page then calls page_to_phys() to convert the page to a dma_addr_t. The virt_to_page and page_to_phys routines don't appear to do any validation and just use arithmetic to do the conversions. The pci bar address (0x383c70e51578) does fall into a valid VA range in x86_64 so it could conceivably be a valid VA. So I tried a virtual address inside the VA hole and it too returned without any errors. virt_to_page(0x800000000000) -> 0xffffede000000000 page_to_phys(0xffffede000000000) -> 0xf80000000000 In arch/x86/include/asm/page.h, there is the following comment regarding validating the virtual address. /* * virt_to_page(kaddr) returns a valid pointer if and only if * virt_addr_valid(kaddr) returns true. */ #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) So it looks like the validation by virt_addr_valid was somehow dropped from the virt_to_page code path. Does anyone have any ideas what happened to it? 
Kit On 08/13/2018 08:21 AM, Kit Chow wrote: > > > On 08/13/2018 07:59 AM, Robin Murphy wrote: >> On 13/08/18 15:23, Kit Chow wrote: >>> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote: >>>> >>>> On 10/08/18 06:53 PM, Kit Chow wrote: >>>>> I was able to finally succeed in doing the dma transfers over ioat >>>>> only >>>>> when prot has DMA_PTE_WRITE set by setting the direction to either >>>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings >>>>> need to be changed? Are there any bad side effects if I used >>>>> DMA_BIDIRECTIONAL? >>>> Good to hear it. Without digging into the direction much all I can say >>>> is that it can sometimes be very confusing what the direction is. >>>> Adding >>>> another PCI device just adds to the confusion. >>> Yep, confusing :). >>> >>> ======================= ============================================= >>> DMA_NONE no direction (used for debugging) >>> DMA_TO_DEVICE data is going from the memory to the device >>> DMA_FROM_DEVICE data is coming from the device to the memory >>> DMA_BIDIRECTIONAL direction isn't known >>> ======================= ============================================= >>> >>>> I believe, the direction should be from the IOAT's point of view. >>>> So if >>>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is >>>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. >>>> data is going to the IOAT). >>> It would certainly seem like DMA_TO_DEVICE would be the proper >>> choice; IOAT is the plumbing to move host data (memory) to the bar >>> address (device). >> >> Except that the "device" in question is the IOAT itself (more >> generally, it means the device represented by the first argument to >> dma_map_*() - the one actually emitting the reads and writes). 
The >> context of a DMA API call is the individual mapping in question, not >> whatever overall operation it may be part of - your example already >> involves two separate mappings: one "from" system memory "to" the DMA >> engine, and one "from" the DMA engine "to" PCI BAR memory. > > OK, that makes sense. The middleman (aka DMA engine device) is the > key in the to/from puzzle. Thanks! > > >> >> Note that the DMA API's dma_direction is also distinct from the >> dmaengine API's dma_transfer_direction, and there's plenty of fun to >> be had mapping between the two - see pl330.c or rcar-dmac.c for other >> examples of dma_map_resource() for slave devices - no guarantees that >> those implementations are entirely correct (especially the one I >> did!), but in practice they do make the "DMA engine behind an IOMMU" >> case work for UARTs and similar straightforward slaves. >> >>> Will go with what works and set DMA_FROM_DEVICE. >>> >>> In ntb_async_tx_submit, does the direction used for the dma_map >>> routines for the src and dest addresses need to be consistent? >> >> In general, the mappings of source and destination addresses would >> typically have opposite directions as above, unless they're both >> bidirectional. >> >>> And does the direction setting for the dmaengine_unmap_data have to >>> be consistent with the direction used in dma_map_*? >> >> Yes, the arguments to an unmap are expected to match whatever was >> passed to the corresponding map call. CONFIG_DMA_API_DEBUG should >> help catch any mishaps. >> >> Robin. >> >>> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it >>> keep track of the dma_map routine used and call the corresponding >>> dma_unmap routine? In the case of the intel iommu, it doesn't matter. >>> >>> Thanks >>> Kit >>> >>>> >>>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / >>>> protection >>>> that the buffer would have in terms of direction. 
Generally it's good >>>> practice to use the strictest direction you can. >>>> >>>>> Given that using the pci bar address as is without getting an iommu >>>>> address results in the same "PTE Write access" error, I wonder if >>>>> there >>>>> is some internal 'prot' associated with the non-translated pci bar >>>>> address that just needs to be tweaked to include DMA_PTE_WRITE??? >>>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not >>>> having an entry is probably just the same (from the perspective of the >>>> error you see) as only having an entry for reading. >>>> >>>> Logan >>> >>> _______________________________________________ >>> iommu mailing list >>> iommu@lists.linux-foundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/iommu > [-- Attachment #2: Type: text/html, Size: 9231 bytes --] ^ permalink raw reply [flat|nested] 48+ messages in thread
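Robin's two-mapping description above can be sketched as kernel-style code (a hedged sketch only, not the actual ntb_transport implementation; the function and variable names are illustrative):

```c
#include <linux/dma-mapping.h>

/*
 * Sketch of the two mappings involved when a DMA engine copies from
 * system memory to a peer's PCI BAR. "dma_dev" is the device that
 * actually emits the reads and writes (the DMA engine), so both
 * directions are expressed from its point of view.
 */
static int sketch_map_for_copy(struct device *dma_dev, void *src_buf,
			       size_t len, phys_addr_t bar_phys,
			       dma_addr_t *src, dma_addr_t *dst)
{
	/* Source: data flows from system memory *to* the DMA engine. */
	*src = dma_map_single(dma_dev, src_buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dma_dev, *src))
		return -EIO;

	/*
	 * Destination: the DMA engine writes *from* itself into the BAR.
	 * BAR (MMIO) addresses are mapped with dma_map_resource(), not
	 * dma_map_single(), since they are not backed by struct page.
	 */
	*dst = dma_map_resource(dma_dev, bar_phys, len, DMA_FROM_DEVICE, 0);
	if (dma_mapping_error(dma_dev, *dst)) {
		dma_unmap_single(dma_dev, *src, len, DMA_TO_DEVICE);
		return -EIO;
	}
	return 0;
}
```

Unmapping would use the same directions: dma_unmap_single(..., DMA_TO_DEVICE) for the source and dma_unmap_resource(..., DMA_FROM_DEVICE) for the BAR, matching the map calls as Robin notes.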
* Re: IOAT DMA w/IOMMU 2018-08-13 23:30 ` Kit Chow @ 2018-08-13 23:39 ` Logan Gunthorpe 2018-08-13 23:48 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-13 23:39 UTC (permalink / raw) To: Kit Chow, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 13/08/18 05:30 PM, Kit Chow wrote: > In arch/x86/include/asm/page.h, there is the following comment in > regards to validating the virtual address. > > /* > * virt_to_page(kaddr) returns a valid pointer if and only if > * virt_addr_valid(kaddr) returns true. > */ > #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) > > So it looks like the validation by virt_addr_valid was somehow dropped > from the virt_to_page code path. Does anyone have any ideas what > happended to it? I don't think it was ever validated (though I haven't been around long enough to say for certain). What the comment is saying is that you shouldn't rely on virt_to_page() unless you know virt_addr_valid() is true (which in most cases you can without the extra check). virt_to_page is meant to be really fast so adding an extra validation would probably be a significant performance regression for the entire kernel. The fact that this can happen through dma_map_single() is non-ideal at best. It assumes the caller is mapping regular memory and doesn't check this at all. It may make sense to fix that but I think people expect dma_map_single() to be as fast as possible as well... Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 23:39 ` Logan Gunthorpe @ 2018-08-13 23:48 ` Kit Chow 2018-08-13 23:50 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-13 23:48 UTC (permalink / raw) To: Logan Gunthorpe, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 08/13/2018 04:39 PM, Logan Gunthorpe wrote: > > On 13/08/18 05:30 PM, Kit Chow wrote: >> In arch/x86/include/asm/page.h, there is the following comment in >> regards to validating the virtual address. >> >> /* >> * virt_to_page(kaddr) returns a valid pointer if and only if >> * virt_addr_valid(kaddr) returns true. >> */ >> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) >> >> So it looks like the validation by virt_addr_valid was somehow dropped >> from the virt_to_page code path. Does anyone have any ideas what >> happended to it? > I don't think it was ever validated (though I haven't been around long > enough to say for certain). What the comment is saying is that you > shouldn't rely on virt_to_page() unless you know virt_addr_valid() is > true (which in most cases you can without the extra check). virt_to_page > is meant to be really fast so adding an extra validation would probably > be a significant performance regression for the entire kernel. > > The fact that this can happen through dma_map_single() is non-ideal at > best. It assumes the caller is mapping regular memory and doesn't check > this at all. It may make sense to fix that but I think people expect > dma_map_single() to be as fast as possible as well... > Perhaps include the validation with some debug turned on? ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 23:48 ` Kit Chow @ 2018-08-13 23:50 ` Logan Gunthorpe 2018-08-14 13:47 ` Kit Chow 2018-08-14 14:03 ` Robin Murphy 0 siblings, 2 replies; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-13 23:50 UTC (permalink / raw) To: Kit Chow, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 13/08/18 05:48 PM, Kit Chow wrote: > On 08/13/2018 04:39 PM, Logan Gunthorpe wrote: >> >> On 13/08/18 05:30 PM, Kit Chow wrote: >>> In arch/x86/include/asm/page.h, there is the following comment in >>> regards to validating the virtual address. >>> >>> /* >>> * virt_to_page(kaddr) returns a valid pointer if and only if >>> * virt_addr_valid(kaddr) returns true. >>> */ >>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) >>> >>> So it looks like the validation by virt_addr_valid was somehow dropped >>> from the virt_to_page code path. Does anyone have any ideas what >>> happended to it? >> I don't think it was ever validated (though I haven't been around long >> enough to say for certain). What the comment is saying is that you >> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is >> true (which in most cases you can without the extra check). virt_to_page >> is meant to be really fast so adding an extra validation would probably >> be a significant performance regression for the entire kernel. >> >> The fact that this can happen through dma_map_single() is non-ideal at >> best. It assumes the caller is mapping regular memory and doesn't check >> this at all. It may make sense to fix that but I think people expect >> dma_map_single() to be as fast as possible as well... >> > Perhaps include the validation with some debug turned on? The problem is how often do you develop code with any of the debug config options turned on? There's already a couple of BUG_ONs in dma_map_single so maybe another one with virt_addr_valid wouldn't be so bad. 
Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
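Logan's suggestion could be expressed as a wrapper like the following (a sketch only; mainline dma_map_single() performs no such check, and the helper name is hypothetical):

```c
#include <linux/dma-mapping.h>
#include <linux/mm.h>

/*
 * Hypothetical guarded variant of dma_map_single(): reject addresses
 * that virt_to_page() cannot legitimately handle, such as PCI BAR
 * (MMIO) or vmalloc addresses fed in by mistake.
 */
static inline dma_addr_t checked_dma_map_single(struct device *dev,
						void *ptr, size_t size,
						enum dma_data_direction dir)
{
	BUG_ON(!virt_addr_valid(ptr));	/* would have caught the BAR VA */
	return dma_map_single(dev, ptr, size, dir);
}
```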
* Re: IOAT DMA w/IOMMU 2018-08-13 23:50 ` Logan Gunthorpe @ 2018-08-14 13:47 ` Kit Chow 2018-08-14 14:03 ` Robin Murphy 1 sibling, 0 replies; 48+ messages in thread From: Kit Chow @ 2018-08-14 13:47 UTC (permalink / raw) To: Logan Gunthorpe, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 08/13/2018 04:50 PM, Logan Gunthorpe wrote: > > On 13/08/18 05:48 PM, Kit Chow wrote: >> On 08/13/2018 04:39 PM, Logan Gunthorpe wrote: >>> On 13/08/18 05:30 PM, Kit Chow wrote: >>>> In arch/x86/include/asm/page.h, there is the following comment in >>>> regards to validating the virtual address. >>>> >>>> /* >>>> * virt_to_page(kaddr) returns a valid pointer if and only if >>>> * virt_addr_valid(kaddr) returns true. >>>> */ >>>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) >>>> >>>> So it looks like the validation by virt_addr_valid was somehow dropped >>>> from the virt_to_page code path. Does anyone have any ideas what >>>> happended to it? >>> I don't think it was ever validated (though I haven't been around long >>> enough to say for certain). What the comment is saying is that you >>> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is >>> true (which in most cases you can without the extra check). virt_to_page >>> is meant to be really fast so adding an extra validation would probably >>> be a significant performance regression for the entire kernel. >>> >>> The fact that this can happen through dma_map_single() is non-ideal at >>> best. It assumes the caller is mapping regular memory and doesn't check >>> this at all. It may make sense to fix that but I think people expect >>> dma_map_single() to be as fast as possible as well... >>> >> Perhaps include the validation with some debug turned on? > The problem is how often do you develop code with any of the debug > config options turned on? 
> > There's already a couple of BUG_ONs in dma_map_single so maybe another > one with virt_addr_valid wouldn't be so bad. Had my very first Linux crash on the dma direction BUG_ON when I tried DMA_NONE :). ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 23:50 ` Logan Gunthorpe 2018-08-14 13:47 ` Kit Chow @ 2018-08-14 14:03 ` Robin Murphy 1 sibling, 0 replies; 48+ messages in thread From: Robin Murphy @ 2018-08-14 14:03 UTC (permalink / raw) To: Logan Gunthorpe, Kit Chow, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu On 14/08/18 00:50, Logan Gunthorpe wrote: > On 13/08/18 05:48 PM, Kit Chow wrote: >> On 08/13/2018 04:39 PM, Logan Gunthorpe wrote: >>> >>> On 13/08/18 05:30 PM, Kit Chow wrote: >>>> In arch/x86/include/asm/page.h, there is the following comment in >>>> regards to validating the virtual address. >>>> >>>> /* >>>> * virt_to_page(kaddr) returns a valid pointer if and only if >>>> * virt_addr_valid(kaddr) returns true. >>>> */ >>>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) >>>> >>>> So it looks like the validation by virt_addr_valid was somehow dropped >>>> from the virt_to_page code path. Does anyone have any ideas what >>>> happended to it? >>> I don't think it was ever validated (though I haven't been around long >>> enough to say for certain). What the comment is saying is that you >>> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is >>> true (which in most cases you can without the extra check). virt_to_page >>> is meant to be really fast so adding an extra validation would probably >>> be a significant performance regression for the entire kernel. >>> >>> The fact that this can happen through dma_map_single() is non-ideal at >>> best. It assumes the caller is mapping regular memory and doesn't check >>> this at all. It may make sense to fix that but I think people expect >>> dma_map_single() to be as fast as possible as well... dma_map_single() is already documented as only supporting lowmem (for which virt_to_page() can be assumed to be valid). 
You might get away with feeding it bogus addresses on x86, but on non-coherent architectures which convert the page back to a virtual address to perform cache maintenance you can expect that to crash and burn rapidly. There may be some minimal-overhead sanity checking of fundamentals, but in general it's not really the DMA API's job to police its callers exhaustively; consider that the mm layer doesn't go out of its way to stop you from doing things like "kfree(kfree);" either. >>> >> Perhaps include the validation with some debug turned on? > > The problem is how often do you develop code with any of the debug > config options turned on? > > There's already a couple of BUG_ONs in dma_map_single so maybe another > one with virt_addr_valid wouldn't be so bad. Note that virt_addr_valid() may be pretty heavyweight in itself. For example the arm64 implementation involves memblock_search(); that really isn't viable in a DMA mapping fastpath. Robin. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-13 15:21 ` Kit Chow 2018-08-13 23:30 ` Kit Chow @ 2018-08-13 23:36 ` Kit Chow 1 sibling, 0 replies; 48+ messages in thread From: Kit Chow @ 2018-08-13 23:36 UTC (permalink / raw) To: Robin Murphy, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas Cc: linux-pci, Alex Williamson, David Woodhouse, iommu Taking a step back, I was a little surprised that dma_map_single successfully returned an iommu address for the pci bar address passed into it during my initial experiment... Internally, dma_map_single calls virt_to_page() to translate the "virtual address" into a page, and intel_map_page then calls page_to_phys() to convert the page to a dma_addr_t. The virt_to_page and page_to_phys routines don't appear to do any validation and just use arithmetic to do the conversions. The pci bar address (0x383c70e51578) does fall into a valid VA range in x86_64, so it could conceivably be a valid VA. So I tried a virtual address inside the VA hole and it too returned without any errors. virt_to_page(0x800000000000) -> 0xffffede000000000 page_to_phys(0xffffede000000000) -> 0xf80000000000 In arch/x86/include/asm/page.h, there is the following comment in regards to validating the virtual address. /* * virt_to_page(kaddr) returns a valid pointer if and only if * virt_addr_valid(kaddr) returns true. */ #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) So it looks like the validation by virt_addr_valid was somehow dropped from the virt_to_page code path. Does anyone have any ideas what happened to it? 
Kit (Resending, earlier message tagged as containing html subpart) On 08/13/2018 08:21 AM, Kit Chow wrote: > > > On 08/13/2018 07:59 AM, Robin Murphy wrote: >> On 13/08/18 15:23, Kit Chow wrote: >>> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote: >>>> >>>> On 10/08/18 06:53 PM, Kit Chow wrote: >>>>> I was able to finally succeed in doing the dma transfers over ioat >>>>> only >>>>> when prot has DMA_PTE_WRITE set by setting the direction to either >>>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings >>>>> need to be changed? Are there any bad side effects if I used >>>>> DMA_BIDIRECTIONAL? >>>> Good to hear it. Without digging into the direction much all I can say >>>> is that it can sometimes be very confusing what the direction is. >>>> Adding >>>> another PCI device just adds to the confusion. >>> Yep, confusing :). >>> >>> ======================= ============================================= >>> DMA_NONE no direction (used for debugging) >>> DMA_TO_DEVICE data is going from the memory to the device >>> DMA_FROM_DEVICE data is coming from the device to the memory >>> DMA_BIDIRECTIONAL direction isn't known >>> ======================= ============================================= >>> >>>> I believe, the direction should be from the IOAT's point of view. >>>> So if >>>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is >>>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie. >>>> data is going to the IOAT). >>> It would certainly seem like DMA_TO_DEVICE would be the proper >>> choice; IOAT is the plumbing to move host data (memory) to the bar >>> address (device). >> >> Except that the "device" in question is the IOAT itself (more >> generally, it means the device represented by the first argument to >> dma_map_*() - the one actually emitting the reads and writes). 
The >> context of a DMA API call is the individual mapping in question, not >> whatever overall operation it may be part of - your example already >> involves two separate mappings: one "from" system memory "to" the DMA >> engine, and one "from" the DMA engine "to" PCI BAR memory. > > OK, that makes sense. The middleman (aka DMA engine device) is the > key in the to/from puzzle. Thanks! > > >> >> Note that the DMA API's dma_direction is also distinct from the >> dmaengine API's dma_transfer_direction, and there's plenty of fun to >> be had mapping between the two - see pl330.c or rcar-dmac.c for other >> examples of dma_map_resource() for slave devices - no guarantees that >> those implementations are entirely correct (especially the one I >> did!), but in practice they do make the "DMA engine behind an IOMMU" >> case work for UARTs and similar straightforward slaves. >> >>> Will go with what works and set DMA_FROM_DEVICE. >>> >>> In ntb_async_tx_submit, does the direction used for the dma_map >>> routines for the src and dest addresses need to be consistent? >> >> In general, the mappings of source and destination addresses would >> typically have opposite directions as above, unless they're both >> bidirectional. >> >>> And does the direction setting for the dmaengine_unmap_data have to >>> be consistent with the direction used in dma_map_*? >> >> Yes, the arguments to an unmap are expected to match whatever was >> passed to the corresponding map call. CONFIG_DMA_API_DEBUG should >> help catch any mishaps. >> >> Robin. >> >>> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it >>> keep track of the dma_map routine used and call the corresponding >>> dma_unmap routine? In the case of the intel iommu, it doesn't matter. >>> >>> Thanks >>> Kit >>> >>>> >>>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / >>>> protection >>>> that the buffer would have in terms of direction. 
Generally it's good >>>> practice to use the strictest direction you can. >>>> >>>>> Given that using the pci bar address as is without getting an iommu >>>>> address results in the same "PTE Write access" error, I wonder if >>>>> there >>>>> is some internal 'prot' associated with the non-translated pci bar >>>>> address that just needs to be tweaked to include DMA_PTE_WRITE??? >>>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not >>>> having an entry is probably just the same (from the perspective of the >>>> error you see) as only having an entry for reading. >>>> >>>> Logan >>> >>> _______________________________________________ >>> iommu mailing list >>> iommu@lists.linux-foundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/iommu > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-09 19:35 ` Logan Gunthorpe 2018-08-09 19:47 ` Kit Chow @ 2018-08-09 21:31 ` Eric Pilmore 2018-08-09 21:36 ` Logan Gunthorpe 1 sibling, 1 reply; 48+ messages in thread From: Eric Pilmore @ 2018-08-09 21:31 UTC (permalink / raw) To: Logan Gunthorpe Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On Thu, Aug 9, 2018 at 12:35 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > Hey, > > On 09/08/18 12:51 PM, Eric Pilmore wrote: >>>> Was wondering if anybody here has used IOAT DMA engines with an >>>> IOMMU turned on (Xeon based system)? My specific question is really >>>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the >>>> destination without having to map that address to the IOVA space of >>>> the DMA engine first (assuming the IOMMU is on)? > > I haven't tested this scenario but my guess would be that IOAT would > indeed go through the IOMMU and the PCI BAR address would need to be > properly mapped into the IOAT's IOVA. The fact that you see DMAR errors > is probably a good indication that this is the case. I really don't know > why you'd want to DMA something without mapping it. The thought was to avoid paying the price of having to go through yet another translation and also because it was not believed to be necessary anyway since the DMA device could go straight to a PCI BAR address without the need for a mapping. We have been playing with two DMA engines, IOAT and PLX. The PLX does not have any issues going straight to the PCI BAR address, but unlike IOAT, PLX is sitting "in the PCI tree". > >>> So is this a peer-to-peer DMA scenario? You mention DMA, which would >>> be a transaction initiated by a PCI device, to a PCI BAR address, so >>> it doesn't sound like system memory is involved. >> >> No, not peer-to-peer. This is from system memory (e.g. SKB buffer which >> has had an IOMMU mapping created) to a PCI BAR address. 
> > It's definitely peer-to-peer in the case where you are using a DMA > engine in the PCI tree. You have the DMA PCI device sending TLPs > directly to the PCI BAR device. So, if everything is done right, the > TLPs will avoid the root complex completely. (Though, ACS settings could > also prevent this from working and you'd either get similar DMAR errors > or they'd disappear into a black hole). > > When using the IOAT, it is part of the CPU so I wouldn't say it's really > peer-to-peer. But an argument could be made that it is. Though, this is > exactly what the existing ntb_transport is doing: DMAing from system > memory to a PCI BAR and vice versa using IOAT. > > Logan > -- Eric Pilmore epilmore@gigaio.com http://gigaio.com Phone: (858) 775 2514 This e-mail message is intended only for the individual(s) to whom it is addressed and may contain information that is privileged, confidential, proprietary, or otherwise exempt from disclosure under applicable law. If you believe you have received this message in error, please advise the sender by return e-mail and delete it from your mailbox. Thank you. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-09 21:31 ` Eric Pilmore @ 2018-08-09 21:36 ` Logan Gunthorpe 2018-08-16 17:16 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-09 21:36 UTC (permalink / raw) To: Eric Pilmore Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 09/08/18 03:31 PM, Eric Pilmore wrote: > On Thu, Aug 9, 2018 at 12:35 PM, Logan Gunthorpe <logang@deltatee.com> wrote: >> Hey, >> >> On 09/08/18 12:51 PM, Eric Pilmore wrote: >>>>> Was wondering if anybody here has used IOAT DMA engines with an >>>>> IOMMU turned on (Xeon based system)? My specific question is really >>>>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the >>>>> destination without having to map that address to the IOVA space of >>>>> the DMA engine first (assuming the IOMMU is on)? >> >> I haven't tested this scenario but my guess would be that IOAT would >> indeed go through the IOMMU and the PCI BAR address would need to be >> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors >> is probably a good indication that this is the case. I really don't know >> why you'd want to DMA something without mapping it. > > The thought was to avoid paying the price of having to go through yet another > translation and also because it was not believed to be necessary anyway since > the DMA device could go straight to a PCI BAR address without the need for a > mapping. We have been playing with two DMA engines, IOAT and PLX. The > PLX does not have any issues going straight to the PCI BAR address, but unlike > IOAT, PLX is sitting "in the PCI tree". Yes, you can do true P2P transactions with the PLX DMA engine. (Though you do have to be careful as technically you need to translate to a PCI bus address not the CPU physical address which are the same on x86; and you need to watch that ACS doesn't mess it up). It doesn't sound like it's possible to avoid the IOMMU when using the IOAT so you need the mapping. 
I would not expect the extra translation in the IOMMU to be noticeable, performance wise. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
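Logan's caveat about bus versus CPU physical addresses can be sketched as follows (hedged; whether the translation matters depends on the host bridge, and on x86 the two are identical):

```c
#include <linux/pci.h>

/*
 * For a true P2P transfer, the DMA engine should be programmed with the
 * *bus* address of the peer's BAR, not the CPU physical address from
 * pci_resource_start(). pci_bus_address() applies any host-bridge
 * translation; on x86 it returns the same value, which is why the
 * distinction is easy to miss there.
 */
static u64 peer_bar_bus_addr(struct pci_dev *peer, int bar)
{
	return (u64)pci_bus_address(peer, bar);
}
```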
* Re: IOAT DMA w/IOMMU 2018-08-09 21:36 ` Logan Gunthorpe @ 2018-08-16 17:16 ` Kit Chow 2018-08-16 17:21 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-16 17:16 UTC (permalink / raw) To: Logan Gunthorpe, Eric Pilmore Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu, linux-pci On 08/09/2018 02:36 PM, Logan Gunthorpe wrote: > > On 09/08/18 03:31 PM, Eric Pilmore wrote: >> On Thu, Aug 9, 2018 at 12:35 PM, Logan Gunthorpe <logang@deltatee.com> wrote: >>> Hey, >>> >>> On 09/08/18 12:51 PM, Eric Pilmore wrote: >>>>>> Was wondering if anybody here has used IOAT DMA engines with an >>>>>> IOMMU turned on (Xeon based system)? My specific question is really >>>>>> whether it is possible to DMA (w/IOAT) to a PCI BAR address as the >>>>>> destination without having to map that address to the IOVA space of >>>>>> the DMA engine first (assuming the IOMMU is on)? >>> I haven't tested this scenario but my guess would be that IOAT would >>> indeed go through the IOMMU and the PCI BAR address would need to be >>> properly mapped into the IOAT's IOVA. The fact that you see DMAR errors >>> is probably a good indication that this is the case. I really don't know >>> why you'd want to DMA something without mapping it. >> The thought was to avoid paying the price of having to go through yet another >> translation and also because it was not believed to be necessary anyway since >> the DMA device could go straight to a PCI BAR address without the need for a >> mapping. We have been playing with two DMA engines, IOAT and PLX. The >> PLX does not have any issues going straight to the PCI BAR address, but unlike >> IOAT, PLX is sitting "in the PCI tree". > Yes, you can do true P2P transactions with the PLX DMA engine. (Though > you do have to be careful as technically you need to translate to a PCI > bus address not the CPU physical address which are the same on x86; and > you need to watch that ACS doesn't mess it up). 
> > It doesn't sound like it's possible to avoid the IOMMU when using the > IOAT so you need the mapping. I would not expect the extra translation > in the IOMMU to be noticeable, performance wise. > > Logan Hi Logan, Based on Dave Jiang's suggestion, I am looking into upstreaming the ntb change to dma map the pci bar address in ntb_async_tx_submit() (along with the intel iommu change to support dma_map_resource). I only have access to intel hosts for testing (and possibly an AMD host currently collecting dust) and am not sure how to go about getting the proper test coverage for other architectures. Any suggestions? Do you anticipate any issues with dma mapping the pci bar address in ntb_async_tx_submit on non-x86 archs? Do you or know of folks who have ntb test setups with non-x86 hosts who might be able to help? Thanks Kit ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-16 17:16 ` Kit Chow @ 2018-08-16 17:21 ` Logan Gunthorpe 2018-08-16 18:53 ` Kit Chow 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-16 17:21 UTC (permalink / raw) To: Kit Chow, Eric Pilmore Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 16/08/18 11:16 AM, Kit Chow wrote: > I only have access to intel hosts for testing (and possibly an AMD > host currently collecting dust) and am not sure how to go about getting > the proper test coverage for other architectures. Well, I thought you were only changing the Intel IOMMU implementation... So testing on Intel hardware seems fine to me. > Any suggestions? Do you anticipate any issues with dma mapping the pci bar > address in ntb_async_tx_submit on non-x86 archs? Do you or know of folks > who have ntb test setups with non-x86 hosts who might be able to help? I expect other IOMMUs will need to be updated to properly support dma_map_resource but that will probably need to be done on a case by case basis as the need arises. Unfortunately it seems the work to add dma_map_resource() didn't really consider that the IOMMUs had to have proper support and probably should have errored out instead of just passing along the physical address if the IOMMU didn't have support. I don't know of anyone with an NTB setup on anything but x86 at the moment. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-16 17:21 ` Logan Gunthorpe @ 2018-08-16 18:53 ` Kit Chow 2018-08-16 18:56 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Kit Chow @ 2018-08-16 18:53 UTC (permalink / raw) To: Logan Gunthorpe, Eric Pilmore Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 08/16/2018 10:21 AM, Logan Gunthorpe wrote: > > On 16/08/18 11:16 AM, Kit Chow wrote: >> I only have access to intel hosts for testing (and possibly an AMD >> host currently collecting dust) and am not sure how to go about getting >> the proper test coverage for other architectures. > Well, I thought you were only changing the Intel IOMMU implementation... > So testing on Intel hardware seems fine to me. For the ntb change, I wasn't sure if there was some other arch where ntb_async_tx_submit would work ok with the pci bar address but would break with the dma map (assuming map_resource has been implemented). If its all x86 at the moment, I guess I'm good to go. Please confirm. Thanks Kit >> Any suggestions? Do you anticipate any issues with dma mapping the pci bar >> address in ntb_async_tx_submit on non-x86 archs? Do you or know of folks >> who have ntb test setups with non-x86 hosts who might be able to help? > I expect other IOMMUs will need to be updated to properly support > dma_map_resource but that will probably need to be done on a case by > case basis as the need arises. Unfortunately it seems the work to add > dma_map_resource() didn't really consider that the IOMMUs had to have > proper support and probably should have errored out instead of just > passing along the physical address if the IOMMU didn't have support. > > I don't know of anyone with an NTB setup on anything but x86 at the moment. > > Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-16 18:53 ` Kit Chow @ 2018-08-16 18:56 ` Logan Gunthorpe 2018-08-21 23:18 ` Eric Pilmore 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-16 18:56 UTC (permalink / raw) To: Kit Chow, Eric Pilmore Cc: Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 16/08/18 12:53 PM, Kit Chow wrote: > > > On 08/16/2018 10:21 AM, Logan Gunthorpe wrote: >> >> On 16/08/18 11:16 AM, Kit Chow wrote: >>> I only have access to intel hosts for testing (and possibly an AMD >>> host currently collecting dust) and am not sure how to go about getting >>> the proper test coverage for other architectures. >> Well, I thought you were only changing the Intel IOMMU implementation... >> So testing on Intel hardware seems fine to me. > For the ntb change, I wasn't sure if there was some other arch where > ntb_async_tx_submit would work ok with the pci bar address but > would break with the dma map (assuming map_resource has been implemented). > If its all x86 at the moment, I guess I'm good to go. Please confirm. I expect lots of arches have issues with dma_map_resource(), it's pretty new and if it didn't work correctly on x86 I imagine many other arches are wrong too. If an arch doesn't work, someone will have to fix that arch's IOMMU implementation. But using dma_map_resource() in ntb_transport is definitely correct compared to what was done before. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
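Since, as Logan notes, an arch's IOMMU may not actually implement map_resource, a defensive caller would always check the result (a sketch; the helper name is illustrative):

```c
#include <linux/dma-mapping.h>

static dma_addr_t map_peer_bar(struct device *dma_dev,
			       phys_addr_t bar_phys, size_t len)
{
	dma_addr_t addr = dma_map_resource(dma_dev, bar_phys, len,
					   DMA_FROM_DEVICE, 0);
	/*
	 * On arches whose IOMMU code lacks map_resource support this can
	 * fail - or, as discussed above, silently hand back an unusable
	 * address - so check dma_mapping_error() before programming the
	 * DMA engine with the result.
	 */
	if (dma_mapping_error(dma_dev, addr))
		return DMA_MAPPING_ERROR;
	return addr;
}
```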
* Re: IOAT DMA w/IOMMU 2018-08-16 18:56 ` Logan Gunthorpe @ 2018-08-21 23:18 ` Eric Pilmore 2018-08-21 23:20 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Eric Pilmore @ 2018-08-21 23:18 UTC (permalink / raw) To: Logan Gunthorpe Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On Thu, Aug 16, 2018 at 11:56 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > > > > On 16/08/18 12:53 PM, Kit Chow wrote: > > > > > > On 08/16/2018 10:21 AM, Logan Gunthorpe wrote: > >> > >> On 16/08/18 11:16 AM, Kit Chow wrote: > >>> I only have access to intel hosts for testing (and possibly an AMD > >>> host currently collecting dust) and am not sure how to go about getting > >>> the proper test coverage for other architectures. > >> Well, I thought you were only changing the Intel IOMMU implementation... > >> So testing on Intel hardware seems fine to me. > > For the ntb change, I wasn't sure if there was some other arch where > > ntb_async_tx_submit would work ok with the pci bar address but > > would break with the dma map (assuming map_resource has been implemented). > > If its all x86 at the moment, I guess I'm good to go. Please confirm. > > I expect lots of arches have issues with dma_map_resource(), it's pretty > new and if it didn't work correctly on x86 I imagine many other arches > are wrong too. I an arch is doesn't work, someone will have to fix that > arches Iommu implementation. But using dma_map_resource() in > ntb_transport is definitely correct compared to what was done before. > > Logan We have been running locally with Kit's change for dma_map_resource and its incorporation in ntb_async_tx_submit for the destination address and it runs fine under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based system, regardless of whether the DMA engine being used is IOAT or a PLX device sitting in the PCIe tree. 
However, when we go back to an i7 (i7-7700K CPU @ 4.20GHz) based system it seems to run into issues, specifically when put under a load. In this case, just having a load using a single ping command with an interval=0, i.e. no delay between ping packets, after a few thousand packets the system just hangs. No panic or watchdogs. Note that in this scenario I can only use a PLX DMA engine. Just wondering if anybody might have a clue as to why the i7 system might run into issues but a Xeon system does not. IOMMU differences? My gut reaction is that the problem may not necessarily be IOMMU hardware related, but perhaps simply a timing issue given the difference in CPU speeds between these systems, and maybe we're hitting some window in the Intel IOMMU mapping/unmapping logic. Thanks, Eric ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-21 23:18 ` Eric Pilmore @ 2018-08-21 23:20 ` Logan Gunthorpe 2018-08-21 23:28 ` Eric Pilmore 2018-08-21 23:30 ` Eric Pilmore 0 siblings, 2 replies; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-21 23:20 UTC (permalink / raw) To: Eric Pilmore Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 21/08/18 05:18 PM, Eric Pilmore wrote: > We have been running locally with Kit's change for dma_map_resource and its > incorporation in ntb_async_tx_submit for the destination address and > it runs fine > under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based system, > regardless of whether the DMA engine being used is IOAT or a PLX > device sitting in > the PCIe tree. However, when we go back to a i7 (i7-7700K CPU @ 4.20GHz) based > system it seems to run into issues, specifically when put under a > load. In this case, > just having a load using a single ping command with an interval=0, i.e. no delay > between ping packets, after a few thousand packets the system just hangs. No > panic or watchdogs. Note that in this scenario I can only use a PLX DMA engine. This is just my best guess: but it sounds to me like a bug in the PLX DMA driver or hardware. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-21 23:20 ` Logan Gunthorpe @ 2018-08-21 23:28 ` Eric Pilmore 2018-08-21 23:35 ` Logan Gunthorpe 2018-08-21 23:30 ` Eric Pilmore 1 sibling, 1 reply; 48+ messages in thread From: Eric Pilmore @ 2018-08-21 23:28 UTC (permalink / raw) To: Logan Gunthorpe Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu [-- Attachment #1: Type: text/plain, Size: 1293 bytes --] On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > > > On 21/08/18 05:18 PM, Eric Pilmore wrote: > > We have been running locally with Kit's change for dma_map_resource and > its > > incorporation in ntb_async_tx_submit for the destination address and > > it runs fine > > under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based > system, > > regardless of whether the DMA engine being used is IOAT or a PLX > > device sitting in > > the PCIe tree. However, when we go back to a i7 (i7-7700K CPU @ 4.20GHz) > based > > system it seems to run into issues, specifically when put under a > > load. In this case, > > just having a load using a single ping command with an interval=0, i.e. > no delay > > between ping packets, after a few thousand packets the system just > hangs. No > > panic or watchdogs. Note that in this scenario I can only use a PLX DMA > engine. > > This is just my best guess: but it sounds to me like a bug in the PLX > DMA driver or hardware. > > The PLX DMA driver? But the PLX driver isn't really even involved in the mapping stage. Are you thinking maybe the stage at which the DMA descriptor is freed and the PLX DMA driver does a dma_descriptor_unmap? Again, PLX did not exhibit any issues on the Xeon system. Eric [-- Attachment #2: Type: text/html, Size: 1888 bytes --] ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-21 23:28 ` Eric Pilmore @ 2018-08-21 23:35 ` Logan Gunthorpe 2018-08-21 23:45 ` Eric Pilmore 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-21 23:35 UTC (permalink / raw) To: Eric Pilmore Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 21/08/18 05:28 PM, Eric Pilmore wrote: > > > On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@deltatee.com > <mailto:logang@deltatee.com>> wrote: > > > > On 21/08/18 05:18 PM, Eric Pilmore wrote: > > We have been running locally with Kit's change for dma_map_resource and its > > incorporation in ntb_async_tx_submit for the destination address and > > it runs fine > > under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based system, > > regardless of whether the DMA engine being used is IOAT or a PLX > > device sitting in > > the PCIe tree. However, when we go back to a i7 (i7-7700K CPU @ 4.20GHz) based > > system it seems to run into issues, specifically when put under a > > load. In this case, > > just having a load using a single ping command with an interval=0, i.e. no delay > > between ping packets, after a few thousand packets the system just hangs. No > > panic or watchdogs. Note that in this scenario I can only use a PLX DMA engine. > > This is just my best guess: but it sounds to me like a bug in the PLX > DMA driver or hardware. > > > The PLX DMA driver? But the PLX driver isn't really even involved in > the mapping > stage. Are you thinking maybe the stage at which the DMA descriptor is > freed and > the PLX DMA driver does a dma_descriptor_unmap? Hmm, well what would make you think the hang is during mapping/unmapping? I would expect a hang to be in handling completions from the DMA engine or something like that. > Again, PLX did not exhibit any issues on the Xeon system. Oh, I missed that. That puts a crinkle in my theory but, as you say, it could be a timing issue. 
Also, it's VERY strange that it would hang the entire system. That makes things very hard to debug... Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-21 23:35 ` Logan Gunthorpe @ 2018-08-21 23:45 ` Eric Pilmore 2018-08-21 23:53 ` Logan Gunthorpe 0 siblings, 1 reply; 48+ messages in thread From: Eric Pilmore @ 2018-08-21 23:45 UTC (permalink / raw) To: Logan Gunthorpe Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On Tue, Aug 21, 2018 at 4:35 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > > > On 21/08/18 05:28 PM, Eric Pilmore wrote: >> >> >> On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@deltatee.com >> <mailto:logang@deltatee.com>> wrote: >> >> >> >> On 21/08/18 05:18 PM, Eric Pilmore wrote: >> > We have been running locally with Kit's change for dma_map_resource and its >> > incorporation in ntb_async_tx_submit for the destination address and >> > it runs fine >> > under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based system, >> > regardless of whether the DMA engine being used is IOAT or a PLX >> > device sitting in >> > the PCIe tree. However, when we go back to a i7 (i7-7700K CPU @ 4.20GHz) based >> > system it seems to run into issues, specifically when put under a >> > load. In this case, >> > just having a load using a single ping command with an interval=0, i.e. no delay >> > between ping packets, after a few thousand packets the system just hangs. No >> > panic or watchdogs. Note that in this scenario I can only use a PLX DMA engine. >> >> This is just my best guess: but it sounds to me like a bug in the PLX >> DMA driver or hardware. >> >> >> The PLX DMA driver? But the PLX driver isn't really even involved in >> the mapping >> stage. Are you thinking maybe the stage at which the DMA descriptor is >> freed and >> the PLX DMA driver does a dma_descriptor_unmap? > > Hmm, well what would make you think the hang is during > mapping/unmapping? Well, the only difference between success and failure is running with the call to dma_map_resource for the destination address, which is a PCI BAR address. 
Prior to Kit introducing this call, we never created a mapping for the destination PCI BAR address and it worked fine on all systems when using PLX DMA. It was only when we went to a Xeon system and attempted to use IOAT DMA that we found we needed a mapping for that destination PCI BAR address. The only thing the PLX driver does related to "mappings" is a call to dma_descriptor_unmap when the descriptor is freed, however that is more of an administrative step to clean up the unmap-data data structure used when the mapping was originally established. > I would expect a hang to be in handling completions > from the DMA engine or something like that. > >> Again, PLX did not exhibit any issues on the Xeon system. > > Oh, I missed that. That puts a crinkle in my theory but, as you say, it > could be a timing issue. > > Also, it's VERY strange that it would hang the entire system. That makes > things very hard to debug... Tell me about it! ;-) Eric ^ permalink raw reply [flat|nested] 48+ messages in thread
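[Editorial note: the flow Eric describes — map the peer's PCI BAR with dma_map_resource() before programming the DMA engine, and hand the engine the returned DMA address rather than the raw BAR address — can be sketched in userspace as below. The fake_* functions are invented stand-ins for the kernel DMA API and the returned IOVA value is arbitrary; this shows the shape of the change, not the actual ntb_transport patch.]

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef uint64_t dma_addr_t;

/* Stand-in for dma_map_resource(): pretend the IOMMU hands back an IOVA
 * for the BAR's physical address (the 0x80000000 base is made up). */
static dma_addr_t fake_dma_map_resource(uint64_t bar_phys, size_t len)
{
    (void)len;
    return 0x80000000ULL | (bar_phys & 0xFFF);
}

/* Stand-in for dma_mapping_error(). */
static int fake_dma_mapping_error(dma_addr_t a)
{
    return a == 0;
}

/* Mirrors the shape of the ntb_async_tx_submit() change: map the
 * destination BAR first, bail on error, then program the engine with
 * the mapped address instead of the raw BAR physical address. */
static int submit_tx(uint64_t dest_bar_phys, size_t len, dma_addr_t *out)
{
    dma_addr_t dest = fake_dma_map_resource(dest_bar_phys, len);

    if (fake_dma_mapping_error(dest))
        return -1;
    *out = dest; /* this, not dest_bar_phys, goes into the descriptor */
    return 0;
}
```

The key point of the discussion is the last step: with the IOMMU on, the value placed in the DMA descriptor must be the IOVA returned by the map call, otherwise the engine's writes hit an untranslated address.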
* Re: IOAT DMA w/IOMMU 2018-08-21 23:45 ` Eric Pilmore @ 2018-08-21 23:53 ` Logan Gunthorpe 2018-08-21 23:59 ` Eric Pilmore 0 siblings, 1 reply; 48+ messages in thread From: Logan Gunthorpe @ 2018-08-21 23:53 UTC (permalink / raw) To: Eric Pilmore Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On 21/08/18 05:45 PM, Eric Pilmore wrote: > Well, the only difference between success and failure is running with the > call to dma_map_resource for the destination address, which is a PCI BAR > address. Prior to Kit introducing this call, we never created a mapping for the > destination PCI BAR address and it worked fine on all systems when using > PLX DMA. It was only when we went to a Xeon system and attempted to use > IOAT DMA that we found we needed a mapping for that destination PCI BAR > address. > > The only thing the PLX driver does related to "mappings" is a call to > dma_descriptor_unmap when the descriptor is freed, however that is more > of an administrative step to clean up the unmap-data data structure used > when the mapping was originally established. Ah, so there's a big difference here in hardware. Without the mapping call you are going to essentially be doing a P2P transaction. So the TLPs won't even hit the CPU. With the mapping call, the TLPs will now go through the CPU and IOMMU. CPUs don't always have good support for routing PCI P2P transactions. However, I would have expected a relatively new i7 CPU to support it well. It may simply be that this CPU does not have good support, however that comes as a bit of a surprise to me. Our plan for the P2P patch set was to have a white-list for CPUs that work. If you can, it would be worth hooking up a PCI analyzer to see what's happening to the TLPs in both cases. Logan ^ permalink raw reply [flat|nested] 48+ messages in thread
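[Editorial note: a minimal sketch of the CPU white-list idea Logan mentions. This is purely hypothetical — the P2P patch set's actual list, matching scheme, and function names may differ; the entries below just reflect the systems reported in this thread (Xeon working, i7 hanging), not a vetted compatibility list.]

```c
#include <string.h>
#include <assert.h>

/* Hypothetical white-list of CPU model substrings known to route PCI
 * peer-to-peer TLPs correctly; NULL-terminated. */
static const char *p2p_whitelist[] = {
    "Xeon(R) CPU E5-2680 v4", /* reported working earlier in this thread */
    NULL,
};

/* Return 1 if the given /proc/cpuinfo-style model string matches a
 * white-listed CPU, 0 otherwise (default-deny, as a white-list implies). */
static int cpu_p2p_supported(const char *model)
{
    const char **w;

    for (w = p2p_whitelist; *w; w++)
        if (strstr(model, *w))
            return 1;
    return 0;
}
```

Default-deny is the point of a white-list here: on an unlisted CPU the transport would fall back to CPU copies (or a mapped path through the IOMMU) rather than risk silently misrouted P2P TLPs.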
* Re: IOAT DMA w/IOMMU 2018-08-21 23:53 ` Logan Gunthorpe @ 2018-08-21 23:59 ` Eric Pilmore 0 siblings, 0 replies; 48+ messages in thread From: Eric Pilmore @ 2018-08-21 23:59 UTC (permalink / raw) To: Logan Gunthorpe Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On Tue, Aug 21, 2018 at 4:53 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > > CPUs don't always have good support for routing PCI P2P transactions. > However, I would have expected a relatively new i7 CPU to support it > well. It may simply be that this CPU does not have good support, however > that comes as a bit of a surprise to me. Our plan for the P2P > patch set was to have a white-list for CPUs that work. > > If you can, it would be worth hooking up a PCI analyzer to see what's > happening to the TLPs in both cases. We do have one, but it will be a bit of a logistical effort to get it set up in this config. Will be a while before I can gather that data. Will forward along when I do. Thanks, Eric ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: IOAT DMA w/IOMMU 2018-08-21 23:20 ` Logan Gunthorpe 2018-08-21 23:28 ` Eric Pilmore @ 2018-08-21 23:30 ` Eric Pilmore 1 sibling, 0 replies; 48+ messages in thread From: Eric Pilmore @ 2018-08-21 23:30 UTC (permalink / raw) To: Logan Gunthorpe Cc: Kit Chow, Bjorn Helgaas, linux-pci, David Woodhouse, Alex Williamson, iommu On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > > > On 21/08/18 05:18 PM, Eric Pilmore wrote: >> We have been running locally with Kit's change for dma_map_resource and its >> incorporation in ntb_async_tx_submit for the destination address and >> it runs fine >> under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based system, >> regardless of whether the DMA engine being used is IOAT or a PLX >> device sitting in >> the PCIe tree. However, when we go back to a i7 (i7-7700K CPU @ 4.20GHz) based >> system it seems to run into issues, specifically when put under a >> load. In this case, >> just having a load using a single ping command with an interval=0, i.e. no delay >> between ping packets, after a few thousand packets the system just hangs. No >> panic or watchdogs. Note that in this scenario I can only use a PLX DMA engine. > > This is just my best guess: but it sounds to me like a bug in the PLX > DMA driver or hardware. > > Logan (sorry, html in last response. resending..) The PLX DMA driver? But the PLX driver isn't really even involved in the mapping stage. Are you thinking maybe the stage at which the DMA descriptor is freed and the PLX DMA driver does a dma_descriptor_unmap? Again, PLX did not exhibit any issues on the Xeon system. Eric ^ permalink raw reply [flat|nested] 48+ messages in thread