* Re: IOAT DMA w/IOMMU
From: Kit Chow @ 2018-08-10 16:02 UTC
To: Logan Gunthorpe, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
Turns out there is no dma_map_resource routine on x86. get_dma_ops
returns intel_dma_ops which has map_resource pointing to NULL.
(gdb) p intel_dma_ops
$7 = {alloc = 0xffffffff8150f310 <intel_alloc_coherent>,
free = 0xffffffff8150ec20 <intel_free_coherent>,
mmap = 0x0 <irq_stack_union>, get_sgtable = 0x0 <irq_stack_union>,
map_page = 0xffffffff8150f2d0 <intel_map_page>,
unmap_page = 0xffffffff8150ec10 <intel_unmap_page>,
map_sg = 0xffffffff8150ef40 <intel_map_sg>,
unmap_sg = 0xffffffff8150eb80 <intel_unmap_sg>,
map_resource = 0x0 <irq_stack_union>,
unmap_resource = 0x0 <irq_stack_union>,
sync_single_for_cpu = 0x0 <irq_stack_union>,
sync_single_for_device = 0x0 <irq_stack_union>,
sync_sg_for_cpu = 0x0 <irq_stack_union>,
sync_sg_for_device = 0x0 <irq_stack_union>,
cache_sync = 0x0 <irq_stack_union>,
mapping_error = 0xffffffff815095f0 <intel_mapping_error>,
dma_supported = 0xffffffff81033830 <x86_dma_supported>, is_phys = 0}
Will poke around some in the intel_map_page code but can you actually
get a valid struct page for a pci bar address (dma_map_single calls
virt_to_page)? If not, does a map_resource routine that can properly
map a pci bar address need to be implemented?
Kit
---
static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
                                              size_t size,
                                              enum dma_data_direction dir,
                                              unsigned long attrs)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);
        dma_addr_t addr;

        BUG_ON(!valid_dma_direction(dir));
        addr = ops->map_page(dev, virt_to_page(ptr),
                             offset_in_page(ptr), size,
                             dir, attrs);
        debug_dma_map_page(dev, virt_to_page(ptr),
                           offset_in_page(ptr), size,
                           dir, addr, true);
        return addr;
}
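
For contrast, the generic dma_map_resource wrapper in
include/linux/dma-mapping.h (lightly paraphrased below) takes a phys_addr_t
directly and never touches struct page. Note that when the ops leave
map_resource NULL, it falls through and hands back the untranslated address:

static inline dma_addr_t dma_map_resource(struct device *dev,
                                          phys_addr_t phys_addr, size_t size,
                                          enum dma_data_direction dir,
                                          unsigned long attrs)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);
        dma_addr_t addr;

        BUG_ON(!valid_dma_direction(dir));
        /* Don't allow RAM to be mapped - this is for MMIO resources only */
        BUG_ON(pfn_valid(PHYS_PFN(phys_addr)));

        addr = phys_addr;
        if (ops->map_resource)
                addr = ops->map_resource(dev, phys_addr, size, dir, attrs);

        debug_dma_map_resource(dev, phys_addr, size, dir, addr);
        return addr;
}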
On 08/09/2018 04:00 PM, Kit Chow wrote:
>
>
> On 08/09/2018 03:50 PM, Logan Gunthorpe wrote:
>>
>> On 09/08/18 04:48 PM, Kit Chow wrote:
>>> Based on Logan's comments, I am very hopeful that the dma_map_resource
>>> will make things work on the older platforms...
>> Well, I *think* dma_map_single() would still work. So I'm not that
>> confident that's the root of your problem. I'd still like to see the
>> actual code snippet you are using.
>>
>> Logan
> Here's the code snippet - (ntbdebug & 4) path does dma_map_resource of
> the pci bar address.
>
> It was:
> unmap->addr[1] = dma_map_single(device->dev, (void
> *)dest, len,
> DMA_TO_DEVICE);
>
> Kit
> ---
>
>
> static int ntb_async_tx_submit(struct ntb_transport_qp *qp,
> struct ntb_queue_entry *entry)
> {
> struct dma_async_tx_descriptor *txd;
> struct dma_chan *chan = qp->tx_dma_chan;
> struct dma_device *device;
> size_t len = entry->len;
> void *buf = entry->buf;
> size_t dest_off, buff_off;
> struct dmaengine_unmap_data *unmap;
> dma_addr_t dest;
> dma_cookie_t cookie;
> int unmapcnt;
>
> device = chan->device;
>
> dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;
>
> buff_off = (size_t)buf & ~PAGE_MASK;
> dest_off = (size_t)dest & ~PAGE_MASK;
>
> if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
> goto err;
>
>
> if (ntbdebug & 0x4) {
> unmapcnt = 2;
> } else {
> unmapcnt = 1;
> }
>
> unmap = dmaengine_get_unmap_data(device->dev, unmapcnt,
> GFP_NOWAIT);
> if (!unmap)
> goto err;
>
> unmap->len = len;
> unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
> buff_off, len, DMA_TO_DEVICE);
> if (dma_mapping_error(device->dev, unmap->addr[0]))
> goto err_get_unmap;
>
> if (ntbdebug & 0x4) {
> unmap->addr[1] = dma_map_resource(device->dev,
> (phys_addr_t)dest, len, DMA_TO_DEVICE, 0);
> if (dma_mapping_error(device->dev, unmap->addr[1]))
> goto err_get_unmap;
> unmap->to_cnt = 2;
> } else {
> unmap->addr[1] = dest;
> unmap->to_cnt = 1;
> }
>
> txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
> unmap->addr[0], len, DMA_PREP_INTERRUPT);
>
> if (!txd)
> goto err_get_unmap;
>
> txd->callback_result = ntb_tx_copy_callback;
> txd->callback_param = entry;
> dma_set_unmap(txd, unmap);
>
> cookie = dmaengine_submit(txd);
> if (dma_submit_error(cookie))
> goto err_set_unmap;
>
> dmaengine_unmap_put(unmap);
>
> dma_async_issue_pending(chan);
>
> return 0;
>
> err_set_unmap:
> dma_descriptor_unmap(txd);
> txd->desc_free(txd);
> err_get_unmap:
> dmaengine_unmap_put(unmap);
> err:
> return -ENXIO;
> }
>
* Re: IOAT DMA w/IOMMU
From: Kit Chow @ 2018-08-10 16:23 UTC
To: Logan Gunthorpe, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
There is an internal routine (__intel_map_single) inside the intel iommu
code that does the actual mapping using a phys_addr_t. Think I'll try to
implement an intel_map_resource routine that calls that routine directly,
without all of the conversions done for dma_map_{single,page} (pci bar
addr -> page -> phys_addr)...
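
A minimal sketch of what that could look like (assuming __intel_map_single
keeps its current (dev, paddr, size, dir, dma_mask) signature and that the
existing intel_unmap can handle the teardown), wired into intel_dma_ops as
.map_resource/.unmap_resource:

static dma_addr_t intel_map_resource(struct device *dev, phys_addr_t phys_addr,
                                     size_t size, enum dma_data_direction dir,
                                     unsigned long attrs)
{
        /* The pci bar address goes straight in - no page/virt round trip */
        return __intel_map_single(dev, phys_addr, size, dir, *dev->dma_mask);
}

static void intel_unmap_resource(struct device *dev, dma_addr_t dev_addr,
                                 size_t size, enum dma_data_direction dir,
                                 unsigned long attrs)
{
        intel_unmap(dev, dev_addr, size);
}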
On 08/10/2018 09:02 AM, Kit Chow wrote:
> Turns out there is no dma_map_resource routine on x86. get_dma_ops
> returns intel_dma_ops which has map_resource pointing to NULL.
>
> (gdb) p intel_dma_ops
> $7 = {alloc = 0xffffffff8150f310 <intel_alloc_coherent>,
> free = 0xffffffff8150ec20 <intel_free_coherent>,
> mmap = 0x0 <irq_stack_union>, get_sgtable = 0x0 <irq_stack_union>,
> map_page = 0xffffffff8150f2d0 <intel_map_page>,
> unmap_page = 0xffffffff8150ec10 <intel_unmap_page>,
> map_sg = 0xffffffff8150ef40 <intel_map_sg>,
> unmap_sg = 0xffffffff8150eb80 <intel_unmap_sg>,
> map_resource = 0x0 <irq_stack_union>,
> unmap_resource = 0x0 <irq_stack_union>,
> sync_single_for_cpu = 0x0 <irq_stack_union>,
> sync_single_for_device = 0x0 <irq_stack_union>,
> sync_sg_for_cpu = 0x0 <irq_stack_union>,
> sync_sg_for_device = 0x0 <irq_stack_union>,
> cache_sync = 0x0 <irq_stack_union>,
> mapping_error = 0xffffffff815095f0 <intel_mapping_error>,
> dma_supported = 0xffffffff81033830 <x86_dma_supported>, is_phys = 0}
>
> Will poke around some in the intel_map_page code but can you actually
> get a valid struct page for a pci bar address (dma_map_single calls
> virt_to_page)? If not, does a map_resource routine that can properly
> map a pci bar address need to be implemented?
>
> Kit
>
> ---
>
>
> static inline dma_addr_t dma_map_single_attrs(struct device *dev, void
> *ptr,
> size_t size,
> enum dma_data_direction
> dir,
> unsigned long attrs)
> {
> const struct dma_map_ops *ops = get_dma_ops(dev);
> dma_addr_t addr;
>
> BUG_ON(!valid_dma_direction(dir));
> addr = ops->map_page(dev, virt_to_page(ptr),
> offset_in_page(ptr), size,
> dir, attrs);
> debug_dma_map_page(dev, virt_to_page(ptr),
> offset_in_page(ptr), size,
> dir, addr, true);
> return addr;
> }
>
>
>
>
>
>
> On 08/09/2018 04:00 PM, Kit Chow wrote:
>>
>>
>> On 08/09/2018 03:50 PM, Logan Gunthorpe wrote:
>>>
>>> On 09/08/18 04:48 PM, Kit Chow wrote:
>>>> Based on Logan's comments, I am very hopeful that the dma_map_resource
>>>> will make things work on the older platforms...
>>> Well, I *think* dma_map_single() would still work. So I'm not that
>>> confident that's the root of your problem. I'd still like to see the
>>> actual code snippet you are using.
>>>
>>> Logan
>> Here's the code snippet - (ntbdebug & 4) path does dma_map_resource
>> of the pci bar address.
>>
>> It was:
>> unmap->addr[1] = dma_map_single(device->dev, (void
>> *)dest, len,
>> DMA_TO_DEVICE);
>>
>> Kit
>> ---
>>
>>
>> static int ntb_async_tx_submit(struct ntb_transport_qp *qp,
>> struct ntb_queue_entry *entry)
>> {
>> struct dma_async_tx_descriptor *txd;
>> struct dma_chan *chan = qp->tx_dma_chan;
>> struct dma_device *device;
>> size_t len = entry->len;
>> void *buf = entry->buf;
>> size_t dest_off, buff_off;
>> struct dmaengine_unmap_data *unmap;
>> dma_addr_t dest;
>> dma_cookie_t cookie;
>> int unmapcnt;
>>
>> device = chan->device;
>>
>> dest = qp->tx_mw_phys + qp->tx_max_frame * entry->tx_index;
>>
>> buff_off = (size_t)buf & ~PAGE_MASK;
>> dest_off = (size_t)dest & ~PAGE_MASK;
>>
>> if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
>> goto err;
>>
>>
>> if (ntbdebug & 0x4) {
>> unmapcnt = 2;
>> } else {
>> unmapcnt = 1;
>> }
>>
>> unmap = dmaengine_get_unmap_data(device->dev, unmapcnt,
>> GFP_NOWAIT);
>> if (!unmap)
>> goto err;
>>
>> unmap->len = len;
>> unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
>> buff_off, len, DMA_TO_DEVICE);
>> if (dma_mapping_error(device->dev, unmap->addr[0]))
>> goto err_get_unmap;
>>
>> if (ntbdebug & 0x4) {
>> unmap->addr[1] = dma_map_resource(device->dev,
>> (phys_addr_t)dest, len, DMA_TO_DEVICE, 0);
>> if (dma_mapping_error(device->dev, unmap->addr[1]))
>> goto err_get_unmap;
>> unmap->to_cnt = 2;
>> } else {
>> unmap->addr[1] = dest;
>> unmap->to_cnt = 1;
>> }
>>
>> txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
>> unmap->addr[0], len, DMA_PREP_INTERRUPT);
>>
>> if (!txd)
>> goto err_get_unmap;
>>
>> txd->callback_result = ntb_tx_copy_callback;
>> txd->callback_param = entry;
>> dma_set_unmap(txd, unmap);
>>
>> cookie = dmaengine_submit(txd);
>> if (dma_submit_error(cookie))
>> goto err_set_unmap;
>>
>> dmaengine_unmap_put(unmap);
>>
>> dma_async_issue_pending(chan);
>>
>> return 0;
>>
>> err_set_unmap:
>> dma_descriptor_unmap(txd);
>> txd->desc_free(txd);
>> err_get_unmap:
>> dmaengine_unmap_put(unmap);
>> err:
>> return -ENXIO;
>> }
>>
>
* Re: IOAT DMA w/IOMMU
From: Logan Gunthorpe @ 2018-08-10 16:24 UTC
To: Kit Chow, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 10/08/18 10:23 AM, Kit Chow wrote:
> There is an internal routine (__intel_map_single) inside the intel iommu
> code that does the actual mapping using a phys_addr_t. Think I'll try to
> implement an intel_map_resource routine that calls that routine directly,
> without all of the conversions done for dma_map_{single,page} (pci bar
> addr -> page -> phys_addr)...
Nice, yes, that was what I was thinking.
Logan
* Re: IOAT DMA w/IOMMU
From: Logan Gunthorpe @ 2018-08-10 16:24 UTC
To: Kit Chow, Jiang, Dave, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 10/08/18 10:02 AM, Kit Chow wrote:
> Turns out there is no dma_map_resource routine on x86. get_dma_ops
> returns intel_dma_ops which has map_resource pointing to NULL.
Oh, yup. I wasn't aware of that. From a cursory view, it looks like it
shouldn't be too hard to implement though.
> Will poke around some in the intel_map_page code but can you actually
> get a valid struct page for a pci bar address (dma_map_single calls
> virt_to_page)? If not, does a map_resource routine that can properly
> map a pci bar address need to be implemented?
Yes, you can not get a struct page for a PCI bar address unless it's
mapped with ZONE_DEVICE like in my p2p work. So that would explain why
dma_map_single() didn't work.
This all implies that ntb_transport doesn't work with DMA and the IOMMU
turned on. I'm not sure I've ever tried that configuration myself but it
is a bit surprising.
Logan
* Re: IOAT DMA w/IOMMU
From: Dave Jiang @ 2018-08-10 16:31 UTC
To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 08/10/2018 09:24 AM, Logan Gunthorpe wrote:
>
>
> On 10/08/18 10:02 AM, Kit Chow wrote:
>> Turns out there is no dma_map_resource routine on x86. get_dma_ops
>> returns intel_dma_ops which has map_resource pointing to NULL.
>
> Oh, yup. I wasn't aware of that. From a cursory view, it looks like it
> shouldn't be too hard to implement though.
>
>> Will poke around some in the intel_map_page code but can you actually
>> get a valid struct page for a pci bar address (dma_map_single calls
>> virt_to_page)? If not, does a map_resource routine that can properly
>> map a pci bar address need to be implemented?
>
> Yes, you can not get a struct page for a PCI bar address unless it's
> mapped with ZONE_DEVICE like in my p2p work. So that would explain why
> dma_map_single() didn't work.
>
> This all implies that ntb_transport doesn't work with DMA and the IOMMU
> turned on. I'm not sure I've ever tried that configuration myself but it
> is a bit surprising.
Hmm....that's surprising because it seems to work on Skylake platform
when I tested it yesterday with Intel NTB. Kit is using a Haswell
platform at the moment I think. Although I'm curious if it works with
the PLX NTB he's using on Skylake.
>
> Logan
>
* Re: IOAT DMA w/IOMMU
From: Logan Gunthorpe @ 2018-08-10 16:33 UTC
To: Dave Jiang, Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 10/08/18 10:31 AM, Dave Jiang wrote:
>
>
> On 08/10/2018 09:24 AM, Logan Gunthorpe wrote:
>>
>>
>> On 10/08/18 10:02 AM, Kit Chow wrote:
>>> Turns out there is no dma_map_resource routine on x86. get_dma_ops
>>> returns intel_dma_ops which has map_resource pointing to NULL.
>>
>> Oh, yup. I wasn't aware of that. From a cursory view, it looks like it
>> shouldn't be too hard to implement though.
>>
>>> Will poke around some in the intel_map_page code but can you actually
>>> get a valid struct page for a pci bar address (dma_map_single calls
>>> virt_to_page)? If not, does a map_resource routine that can properly
>>> map a pci bar address need to be implemented?
>>
>> Yes, you can not get a struct page for a PCI bar address unless it's
>> mapped with ZONE_DEVICE like in my p2p work. So that would explain why
>> dma_map_single() didn't work.
>>
>> This all implies that ntb_transport doesn't work with DMA and the IOMMU
>> turned on. I'm not sure I've ever tried that configuration myself but it
>> is a bit surprising.
>
> Hmm....that's surprising because it seems to work on Skylake platform
> when I tested it yesterday with Intel NTB. Kit is using a Haswell
> platform at the moment I think. Although I'm curious if it works with
> the PLX NTB he's using on Skylake.
Does that mean on Skylake the IOAT can bypass the IOMMU? Because it
looks like the ntb_transport code doesn't map the physical address of
the NTB MW into the IOMMU when doing DMA...
Logan
* Re: IOAT DMA w/IOMMU
From: Dave Jiang @ 2018-08-10 17:01 UTC
To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 08/10/2018 09:33 AM, Logan Gunthorpe wrote:
>
>
> On 10/08/18 10:31 AM, Dave Jiang wrote:
>>
>>
>> On 08/10/2018 09:24 AM, Logan Gunthorpe wrote:
>>>
>>>
>>> On 10/08/18 10:02 AM, Kit Chow wrote:
>>>> Turns out there is no dma_map_resource routine on x86. get_dma_ops
>>>> returns intel_dma_ops which has map_resource pointing to NULL.
>>>
>>> Oh, yup. I wasn't aware of that. From a cursory view, it looks like it
>>> shouldn't be too hard to implement though.
>>>
>>>> Will poke around some in the intel_map_page code but can you actually
>>>> get a valid struct page for a pci bar address (dma_map_single calls
>>>> virt_to_page)? If not, does a map_resource routine that can properly
>>>> map a pci bar address need to be implemented?
>>>
>>> Yes, you can not get a struct page for a PCI bar address unless it's
>>> mapped with ZONE_DEVICE like in my p2p work. So that would explain why
>>> dma_map_single() didn't work.
>>>
>>> This all implies that ntb_transport doesn't work with DMA and the IOMMU
>>> turned on. I'm not sure I've ever tried that configuration myself but it
>>> is a bit surprising.
>>
>> Hmm....that's surprising because it seems to work on Skylake platform
>> when I tested it yesterday with Intel NTB. Kit is using a Haswell
>> platform at the moment I think. Although I'm curious if it works with
>> the PLX NTB he's using on Skylake.
>
> Does that mean on Skylake the IOAT can bypass the IOMMU? Because it
> looks like the ntb_transport code doesn't map the physical address of
> the NTB MW into the IOMMU when doing DMA...
Or if the BIOS has provided mapping for the Intel NTB device
specifically? Is that a possibility? NTB does go through the IOMMU.
* Re: IOAT DMA w/IOMMU
From: Logan Gunthorpe @ 2018-08-10 17:15 UTC
To: Dave Jiang, Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 10/08/18 11:01 AM, Dave Jiang wrote:
> Or if the BIOS has provided mapping for the Intel NTB device
> specifically? Is that a possibility? NTB does go through the IOMMU.
I don't know, but if the BIOS is doing it, that would at best only
work for Intel NTB.... I see no hope in getting the BIOS to map the MW
for a Switchtec NTB device...
Logan
* Re: IOAT DMA w/IOMMU
From: Dave Jiang @ 2018-08-10 17:46 UTC
To: Logan Gunthorpe, Kit Chow, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 08/10/2018 10:15 AM, Logan Gunthorpe wrote:
>
>
> On 10/08/18 11:01 AM, Dave Jiang wrote:
>> Or if the BIOS has provided mapping for the Intel NTB device
>> specifically? Is that a possibility? NTB does go through the IOMMU.
>
> I don't know, but if the BIOS is doing it, that would at best only
> work for Intel NTB.... I see no hope in getting the BIOS to map the MW
> for a Switchtec NTB device...
Right. I'm just speculating why it may possibly work. But yes, I think
the kernel will need an appropriate mapping if it's not happening right now.
Hopefully Kit is onto something.
>
> Logan
>
* Re: IOAT DMA w/IOMMU
From: Kit Chow @ 2018-08-11 0:53 UTC
To: Dave Jiang, Logan Gunthorpe, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
Success!
I've implemented a new intel_map_resource (and intel_unmap_resource)
routine which is called by dma_map_resource. As mentioned previously,
the primary job of dma_map_resource/intel_map_resource is to call the
intel iommu internal mapping routine (__intel_map_single) without
translating the pci bar address into a page and then back to a phys addr
as dma_map_page and dma_map_single are doing.
With the new dma_map_resource routine mapping the pci bar address in
ntb_transport_tx_submit, I was still getting the "PTE Write access is
not set" error.
The __intel_map_single routine sets prot based on the DMA direction.
        if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL ||
                        !cap_zlr(iommu->cap))
                prot |= DMA_PTE_READ;
        if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
                prot |= DMA_PTE_WRITE;
I was able to finally succeed in doing the dma transfers over ioat only
when prot has DMA_PTE_WRITE set by setting the direction to either
DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
need to be changed? Are there any bad side effects if I used
DMA_BIDIRECTIONAL?
Given that using the pci bar address as is without getting an iommu
address results in the same "PTE Write access" error, I wonder if there
is some internal 'prot' associated with the non-translated pci bar
address that just needs to be tweaked to include DMA_PTE_WRITE???
Thanks!
On 08/10/2018 10:46 AM, Dave Jiang wrote:
>
> On 08/10/2018 10:15 AM, Logan Gunthorpe wrote:
>>
>> On 10/08/18 11:01 AM, Dave Jiang wrote:
>>> Or if the BIOS has provided mapping for the Intel NTB device
>>> specifically? Is that a possibility? NTB does go through the IOMMU.
>> I don't know, but if the BIOS is doing it, that would at best only
>> work for Intel NTB.... I see no hope in getting the BIOS to map the MW
>> for a Switchtec NTB device...
> Right. I'm just speculating why it may possibly work. But yes, I think
> the kernel will need an appropriate mapping if it's not happening right now.
> Hopefully Kit is onto something.
>
>> Logan
>>
* Re: IOAT DMA w/IOMMU
From: Logan Gunthorpe @ 2018-08-11 2:10 UTC
To: Kit Chow, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 10/08/18 06:53 PM, Kit Chow wrote:
> I was able to finally succeed in doing the dma transfers over ioat only
> when prot has DMA_PTE_WRITE set by setting the direction to either
> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
> need to be changed? Are there any bad side effects if I used
> DMA_BIDIRECTIONAL?
Good to hear it. Without digging into the direction much all I can say
is that it can sometimes be very confusing what the direction is. Adding
another PCI device just adds to the confusion.
I believe, the direction should be from the IOAT's point of view. So if
the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
data is going to the IOAT).
Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection
that the buffer would have in terms of direction. Generally it's good
practice to use the strictest direction you can.
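
Concretely, for the two mappings in your ntb_async_tx_submit snippet, that
reading would give something like this (sketch only, using your names):

        /* IOAT reads the source buffer: data is going to the device */
        unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
                                      buff_off, len, DMA_TO_DEVICE);

        /* IOAT writes the NTB BAR: data is coming from the device */
        unmap->addr[1] = dma_map_resource(device->dev, (phys_addr_t)dest,
                                          len, DMA_FROM_DEVICE, 0);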
> Given that using the pci bar address as is without getting an iommu
> address results in the same "PTE Write access" error, I wonder if there
> is some internal 'prot' associated with the non-translated pci bar
> address that just needs to be tweaked to include DMA_PTE_WRITE???
No, I don't think so. The 'prot' will be a property of the IOMMU. Not
having an entry is probably just the same (from the perspective of the
error you see) as only having an entry for reading.
Logan
* Re: IOAT DMA w/IOMMU
From: Kit Chow @ 2018-08-13 14:23 UTC
To: Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, David Woodhouse, Alex Williamson, iommu
On 08/10/2018 07:10 PM, Logan Gunthorpe wrote:
>
> On 10/08/18 06:53 PM, Kit Chow wrote:
>> I was able to finally succeed in doing the dma transfers over ioat only
>> when prot has DMA_PTE_WRITE set by setting the direction to either
>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
>> need to be changed? Are there any bad side effects if I used
>> DMA_BIDIRECTIONAL?
> Good to hear it. Without digging into the direction much all I can say
> is that it can sometimes be very confusing what the direction is. Adding
> another PCI device just adds to the confusion.
Yep, confusing :).
=======================  =============================================
DMA_NONE                 no direction (used for debugging)
DMA_TO_DEVICE            data is going from the memory to the device
DMA_FROM_DEVICE          data is coming from the device to the memory
DMA_BIDIRECTIONAL        direction isn't known
=======================  =============================================
> I believe, the direction should be from the IOAT's point of view. So if
> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
> data is going to the IOAT).
It would certainly seem like DMA_TO_DEVICE would be the proper choice;
IOAT is the plumbing to move host data (memory) to the bar address (device).
Will go with what works and set DMA_FROM_DEVICE.
In ntb_async_tx_submit, does the direction used for the dma_map routines
for the src and dest addresses need to be consistent?
And does the direction setting for the dmaengine_unmap_data have to be
consistent with the direction used in dma_map_*?
BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it keep
track of the dma_map routine used and call the corresponding dma_unmap
routine? In the case of the intel iommu, it doesn't matter.
Thanks
Kit
>
> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection
> that the buffer would have in terms of direction. Generally it's good
> practice to use the strictest direction you can.
>
>> Given that using the pci bar address as is without getting an iommu
>> address results in the same "PTE Write access" error, I wonder if there
>> is some internal 'prot' associated with the non-translated pci bar
>> address that just needs to be tweaked to include DMA_PTE_WRITE???
> No, I don't think so. The 'prot' will be a property of the IOMMU. Not
> having an entry is probably just the same (from the perspective of the
> error you see) as only having an entry for reading.
>
> Logan
* Re: IOAT DMA w/IOMMU
From: Robin Murphy @ 2018-08-13 14:59 UTC
To: Kit Chow, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 13/08/18 15:23, Kit Chow wrote:
> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote:
>>
>> On 10/08/18 06:53 PM, Kit Chow wrote:
>>> I was able to finally succeed in doing the dma transfers over ioat only
>>> when prot has DMA_PTE_WRITE set by setting the direction to either
>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
>>> need to be changed? Are there any bad side effects if I used
>>> DMA_BIDIRECTIONAL?
>> Good to hear it. Without digging into the direction much all I can say
>> is that it can sometimes be very confusing what the direction is. Adding
>> another PCI device just adds to the confusion.
> Yep, confusing :).
>
> =======================  =============================================
> DMA_NONE                 no direction (used for debugging)
> DMA_TO_DEVICE            data is going from the memory to the device
> DMA_FROM_DEVICE          data is coming from the device to the memory
> DMA_BIDIRECTIONAL        direction isn't known
> =======================  =============================================
>
>> I believe, the direction should be from the IOAT's point of view. So if
>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
>> data is going to the IOAT).
> It would certainly seem like DMA_TO_DEVICE would be the proper choice;
> IOAT is the plumbing to move host data (memory) to the bar address
> (device).
Except that the "device" in question is the IOAT itself (more generally,
it means the device represented by the first argument to dma_map_*() -
the one actually emitting the reads and writes). The context of a DMA
API call is the individual mapping in question, not whatever overall
operation it may be part of - your example already involves two separate
mappings: one "from" system memory "to" the DMA engine, and one "from"
the DMA engine "to" PCI BAR memory.
Note that the DMA API's dma_direction is also distinct from the
dmaengine API's dma_transfer_direction, and there's plenty of fun to be
had mapping between the two - see pl330.c or rcar-dmac.c for other
examples of dma_map_resource() for slave devices - no guarantees that
those implementations are entirely correct (especially the one I did!),
but in practice they do make the "DMA engine behind an IOMMU" case work
for UARTs and similar straightforward slaves.
> Will go with what works and set DMA_FROM_DEVICE.
>
> In ntb_async_tx_submit, does the direction used for the dma_map routines
> for the src and dest addresses need to be consistent?
In general, the mappings of source and destination addresses would
typically have opposite directions as above, unless they're both
bidirectional.
> And does the direction setting for the dmaengine_unmap_data have to be
> consistent with the direction used in dma_map_*?
Yes, the arguments to an unmap are expected to match whatever was passed
to the corresponding map call. CONFIG_DMA_API_DEBUG should help catch
any mishaps.
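
For instance (a sketch, with hypothetical dma_dev/bar_phys/len names), the
unmap must mirror the map in direction and size:

        dma_addr_t bar_dma = dma_map_resource(dma_dev, bar_phys, len,
                                              DMA_FROM_DEVICE, 0);
        if (dma_mapping_error(dma_dev, bar_dma))
                goto err;
        /* ... DMA engine writes through bar_dma ... */
        dma_unmap_resource(dma_dev, bar_dma, len, DMA_FROM_DEVICE, 0);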
Robin.
> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it keep
> track of the dma_map routine used and call the corresponding dma_unmap
> routine? In the case of the intel iommu, it doesn't matter.
>
> Thanks
> Kit
>
>>
>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection
>> that the buffer would have in terms of direction. Generally it's good
>> practice to use the strictest direction you can.
>>
>>> Given that using the pci bar address as is without getting an iommu
>>> address results in the same "PTE Write access" error, I wonder if there
>>> is some internal 'prot' associated with the non-translated pci bar
>>> address that just needs to be tweaked to include DMA_PTE_WRITE???
>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not
>> having an entry is probably just the same (from the perspective of the
>> error you see) as only having an entry for reading.
>>
>> Logan
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
^ permalink raw reply [flat|nested] 95+ messages in thread
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 14:59 ` Robin Murphy
0 siblings, 0 replies; 95+ messages in thread
From: Robin Murphy @ 2018-08-13 14:59 UTC (permalink / raw)
To: Kit Chow, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, Alex Williamson,
David Woodhouse,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On 13/08/18 15:23, Kit Chow wrote:
> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote:
>>
>> On 10/08/18 06:53 PM, Kit Chow wrote:
>>> I was able to finally succeed in doing the dma transfers over ioat only
>>> when prot has DMA_PTE_WRITE set by setting the direction to either
>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
>>> need to be changed? Are there any bad side effects if I used
>>> DMA_BIDIRECTIONAL?
>> Good to hear it. Without digging into the direction much all I can say
>> is that it can sometimes be very confusing what the direction is. Adding
>> another PCI device just adds to the confusion.
> Yep, confusing :).
>
> ======================= =============================================
> DMA_NONE no direction (used for debugging)
> DMA_TO_DEVICE data is going from the memory to the device
> DMA_FROM_DEVICE data is coming from the device to the memory
> DMA_BIDIRECTIONAL direction isn't known
> ======================= =============================================
>
>> I believe, the direction should be from the IOAT's point of view. So if
>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
>> data is going to the IOAT).
> It would certainly seem like DMA_TO_DEVICE would be the proper choice;
> IOAT is the plumbing to move host data (memory) to the bar address
> (device).
Except that the "device" in question is the IOAT itself (more generally,
it means the device represented by the first argument to dma_map_*() -
the one actually emitting the reads and writes). The context of a DMA
API call is the individual mapping in question, not whatever overall
operation it may be part of - your example already involves two separate
mappings: one "from" system memory "to" the DMA engine, and one "from"
the DMA engine "to" PCI BAR memory.
Note that the DMA API's dma_direction is also distinct from the
dmaengine API's dma_transfer_direction, and there's plenty of fun to be
had mapping between the two - see pl330.c or rcar-dmac.c for other
examples of dma_map_resource() for slave devices - no guarantees that
those implementations are entirely correct (especially the one I did!),
but in practice they do make the "DMA engine behind an IOMMU" case work
for UARTs and similar straightforward slaves.
> Will go with what works and set DMA_FROM_DEVICE.
>
> In ntb_async_tx_submit, does the direction used for the dma_map routines
> for the src and dest addresses need to be consistent?
In general, the mappings of source and destination addresses would
typically have opposite directions as above, unless they're both
bidirectional.
> And does the direction setting for the dmaengine_unmap_data have to be
> consistent with the direction used in dma_map_*?
Yes, the arguments to an unmap are expected to match whatever was passed
to the corresponding map call. CONFIG_DMA_API_DEBUG should help catch
any mishaps.
Robin.
> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it keep
> track of the dma_map routine used and call the corresponding dma_unmap
> routine? In the case of the intel iommu, it doesn't matter.
>
> Thanks
> Kit
>
>>
>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection
>> that the buffer would have in terms of direction. Generally it's good
>> practice to use the strictest direction you can.
>>
>>> Given that using the pci bar address as is without getting an iommu
>>> address results in the same "PTE Write access" error, I wonder if there
>>> is some internal 'prot' associated with the non-translated pci bar
>>> address that just needs to be tweaked to include DMA_PTE_WRITE???
>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not
>> having an entry is probably just the same (from the perspective of the
>> error you see) as only having an entry for reading.
>>
>> Logan
>
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 15:21 ` Kit Chow
0 siblings, 0 replies; 95+ messages in thread
From: Kit Chow @ 2018-08-13 15:21 UTC (permalink / raw)
To: Robin Murphy, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 08/13/2018 07:59 AM, Robin Murphy wrote:
> On 13/08/18 15:23, Kit Chow wrote:
>> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote:
>>>
>>> On 10/08/18 06:53 PM, Kit Chow wrote:
>>>> I was able to finally succeed in doing the dma transfers over ioat
>>>> only
>>>> when prot has DMA_PTE_WRITE set by setting the direction to either
>>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
>>>> need to be changed? Are there any bad side effects if I used
>>>> DMA_BIDIRECTIONAL?
>>> Good to hear it. Without digging into the direction much all I can say
>>> is that it can sometimes be very confusing what the direction is.
>>> Adding
>>> another PCI device just adds to the confusion.
>> Yep, confusing :).
>>
>> ======================= =============================================
>> DMA_NONE no direction (used for debugging)
>> DMA_TO_DEVICE data is going from the memory to the device
>> DMA_FROM_DEVICE data is coming from the device to the memory
>> DMA_BIDIRECTIONAL direction isn't known
>> ======================= =============================================
>>
>>> I believe, the direction should be from the IOAT's point of view. So if
>>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
>>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
>>> data is going to the IOAT).
>> It would certainly seem like DMA_TO_DEVICE would be the proper
>> choice; IOAT is the plumbing to move host data (memory) to the bar
>> address (device).
>
> Except that the "device" in question is the IOAT itself (more
> generally, it means the device represented by the first argument to
> dma_map_*() - the one actually emitting the reads and writes). The
> context of a DMA API call is the individual mapping in question, not
> whatever overall operation it may be part of - your example already
> involves two separate mappings: one "from" system memory "to" the DMA
> engine, and one "from" the DMA engine "to" PCI BAR memory.
OK, that makes sense. The middleman (aka DMA engine device) is the key
in the to/from puzzle. Thanks!
>
> Note that the DMA API's dma_direction is also distinct from the
> dmaengine API's dma_transfer_direction, and there's plenty of fun to
> be had mapping between the two - see pl330.c or rcar-dmac.c for other
> examples of dma_map_resource() for slave devices - no guarantees that
> those implementations are entirely correct (especially the one I
> did!), but in practice they do make the "DMA engine behind an IOMMU"
> case work for UARTs and similar straightforward slaves.
>
>> Will go with what works and set DMA_FROM_DEVICE.
>>
>> In ntb_async_tx_submit, does the direction used for the dma_map
>> routines for the src and dest addresses need to be consistent?
>
> In general, the mappings of source and destination addresses would
> typically have opposite directions as above, unless they're both
> bidirectional.
>
>> And does the direction setting for the dmaengine_unmap_data have to
>> be consistent with the direction used in dma_map_*?
>
> Yes, the arguments to an unmap are expected to match whatever was
> passed to the corresponding map call. CONFIG_DMA_API_DEBUG should help
> catch any mishaps.
>
> Robin.
>
>> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it
>> keep track of the dma_map routine used and call the corresponding
>> dma_unmap routine? In the case of the intel iommu, it doesn't matter.
>>
>> Thanks
>> Kit
>>
>>>
>>> Using DMA_BIDIRECTIONAL just forgoes any hardware security / protection
>>> that the buffer would have in terms of direction. Generally it's good
>>> practice to use the strictest direction you can.
>>>
>>>> Given that using the pci bar address as is without getting an iommu
>>>> address results in the same "PTE Write access" error, I wonder if
>>>> there
>>>> is some internal 'prot' associated with the non-translated pci bar
>>>> address that just needs to be tweaked to include DMA_PTE_WRITE???
>>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not
>>> having an entry is probably just the same (from the perspective of the
>>> error you see) as only having an entry for reading.
>>>
>>> Logan
>>
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 23:30 ` Kit Chow
0 siblings, 0 replies; 95+ messages in thread
From: Kit Chow @ 2018-08-13 23:30 UTC (permalink / raw)
To: Robin Murphy, Logan Gunthorpe, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
Taking a step back, I was a little surprised that dma_map_single
successfully returned an iommu address for the pci bar address passed
into it during my initial experiment...
Internally, dma_map_single calls virt_to_page() to translate the
"virtual address" into a page, and intel_map_page then calls
page_to_phys() to convert the page to a dma_addr_t.
The virt_to_page and page_to_phys routines don't appear to do any
validation and just use arithmetic to do the conversions.
The pci bar address (0x383c70e51578) does fall into a valid VA range on
x86_64, so it could conceivably be a valid VA. So I tried a virtual
address inside the VA hole, and it too returned without any errors.
virt_to_page(0x800000000000) -> 0xffffede000000000
page_to_phys(0xffffede000000000) -> 0xf80000000000
In arch/x86/include/asm/page.h, there is the following comment regarding
validation of the virtual address.
/*
* virt_to_page(kaddr) returns a valid pointer if and only if
* virt_addr_valid(kaddr) returns true.
*/
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
So it looks like the validation by virt_addr_valid was somehow dropped
from the virt_to_page code path. Does anyone have any ideas what
happened to it?
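For what it's worth, an explicit check in the caller should catch both of
the bogus addresses above (untested sketch, using the virt_addr_valid()
mentioned in that comment; "dev", "buf" and "len" are placeholders):

	if (!virt_addr_valid(buf))
		return -EINVAL;	/* e.g. a pci bar address or a VA-hole address */

	addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);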
Kit
On 08/13/2018 08:21 AM, Kit Chow wrote:
>
>
> On 08/13/2018 07:59 AM, Robin Murphy wrote:
>> On 13/08/18 15:23, Kit Chow wrote:
>>> On 08/10/2018 07:10 PM, Logan Gunthorpe wrote:
>>>>
>>>> On 10/08/18 06:53 PM, Kit Chow wrote:
>>>>> I was able to finally succeed in doing the dma transfers over ioat
>>>>> only
>>>>> when prot has DMA_PTE_WRITE set by setting the direction to either
>>>>> DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. Any ideas if the prot settings
>>>>> need to be changed? Are there any bad side effects if I used
>>>>> DMA_BIDIRECTIONAL?
>>>> Good to hear it. Without digging into the direction much all I can say
>>>> is that it can sometimes be very confusing what the direction is.
>>>> Adding
>>>> another PCI device just adds to the confusion.
>>> Yep, confusing :).
>>>
>>> ======================= =============================================
>>> DMA_NONE no direction (used for debugging)
>>> DMA_TO_DEVICE data is going from the memory to the device
>>> DMA_FROM_DEVICE data is coming from the device to the memory
>>> DMA_BIDIRECTIONAL direction isn't known
>>> ======================= =============================================
>>>
>>>> I believe, the direction should be from the IOAT's point of view.
>>>> So if
>>>> the IOAT is writing to the BAR you'd set DMA_FROM_DEVICE (ie. data is
>>>> coming from the IOAT) and if it's reading you'd set DMA_TO_DEVICE (ie.
>>>> data is going to the IOAT).
>>> It would certainly seem like DMA_TO_DEVICE would be the proper
>>> choice; IOAT is the plumbing to move host data (memory) to the bar
>>> address (device).
>>
>> Except that the "device" in question is the IOAT itself (more
>> generally, it means the device represented by the first argument to
>> dma_map_*() - the one actually emitting the reads and writes). The
>> context of a DMA API call is the individual mapping in question, not
>> whatever overall operation it may be part of - your example already
>> involves two separate mappings: one "from" system memory "to" the DMA
>> engine, and one "from" the DMA engine "to" PCI BAR memory.
>
> OK, that makes sense. The middleman (aka DMA engine device) is the
> key in the to/from puzzle. Thanks!
>
>
>>
>> Note that the DMA API's dma_direction is also distinct from the
>> dmaengine API's dma_transfer_direction, and there's plenty of fun to
>> be had mapping between the two - see pl330.c or rcar-dmac.c for other
>> examples of dma_map_resource() for slave devices - no guarantees that
>> those implementations are entirely correct (especially the one I
>> did!), but in practice they do make the "DMA engine behind an IOMMU"
>> case work for UARTs and similar straightforward slaves.
>>
>>> Will go with what works and set DMA_FROM_DEVICE.
>>>
>>> In ntb_async_tx_submit, does the direction used for the dma_map
>>> routines for the src and dest addresses need to be consistent?
>>
>> In general, the mappings of source and destination addresses would
>> typically have opposite directions as above, unless they're both
>> bidirectional.
>>
>>> And does the direction setting for the dmaengine_unmap_data have to
>>> be consistent with the direction used in dma_map_*?
>>
>> Yes, the arguments to an unmap are expected to match whatever was
>> passed to the corresponding map call. CONFIG_DMA_API_DEBUG should
>> help catch any mishaps.
>>
>> Robin.
>>
>>> BTW, dmaengine_unmap routine only calls dma_unmap_page. Should it
>>> keep track of the dma_map routine used and call the corresponding
>>> dma_unmap routine? In the case of the intel iommu, it doesn't matter.
>>>
>>> Thanks
>>> Kit
>>>
>>>>
>>>> Using DMA_BIDIRECTIONAL just forgoes any hardware security /
>>>> protection
>>>> that the buffer would have in terms of direction. Generally it's good
>>>> practice to use the strictest direction you can.
>>>>
>>>>> Given that using the pci bar address as is without getting an iommu
>>>>> address results in the same "PTE Write access" error, I wonder if
>>>>> there
>>>>> is some internal 'prot' associated with the non-translated pci bar
>>>>> address that just needs to be tweaked to include DMA_PTE_WRITE???
>>>> No, I don't think so. The 'prot' will be a property of the IOMMU. Not
>>>> having an entry is probably just the same (from the perspective of the
>>>> error you see) as only having an entry for reading.
>>>>
>>>> Logan
>>>
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 23:39 ` Logan Gunthorpe
0 siblings, 0 replies; 95+ messages in thread
From: Logan Gunthorpe @ 2018-08-13 23:39 UTC (permalink / raw)
To: Kit Chow, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 13/08/18 05:30 PM, Kit Chow wrote:
> In arch/x86/include/asm/page.h, there is the following comment in
> regards to validating the virtual address.
>
> /*
> * virt_to_page(kaddr) returns a valid pointer if and only if
> * virt_addr_valid(kaddr) returns true.
> */
> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>
> So it looks like the validation by virt_addr_valid was somehow dropped
> from the virt_to_page code path. Does anyone have any ideas what
> happened to it?
I don't think it was ever validated (though I haven't been around long
enough to say for certain). What the comment is saying is that you
shouldn't rely on virt_to_page() unless you know virt_addr_valid() is
true (which in most cases you can without the extra check). virt_to_page
is meant to be really fast so adding an extra validation would probably
be a significant performance regression for the entire kernel.
The fact that this can happen through dma_map_single() is non-ideal at
best. It assumes the caller is mapping regular memory and doesn't check
this at all. It may make sense to fix that but I think people expect
dma_map_single() to be as fast as possible as well...
Logan
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 23:48 ` Kit Chow
0 siblings, 0 replies; 95+ messages in thread
From: Kit Chow @ 2018-08-13 23:48 UTC (permalink / raw)
To: Logan Gunthorpe, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 08/13/2018 04:39 PM, Logan Gunthorpe wrote:
>
> On 13/08/18 05:30 PM, Kit Chow wrote:
>> In arch/x86/include/asm/page.h, there is the following comment in
>> regards to validating the virtual address.
>>
>> /*
>> * virt_to_page(kaddr) returns a valid pointer if and only if
>> * virt_addr_valid(kaddr) returns true.
>> */
>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>>
>> So it looks like the validation by virt_addr_valid was somehow dropped
>> from the virt_to_page code path. Does anyone have any ideas what
>> happened to it?
> I don't think it was ever validated (though I haven't been around long
> enough to say for certain). What the comment is saying is that you
> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is
> true (which in most cases you can without the extra check). virt_to_page
> is meant to be really fast so adding an extra validation would probably
> be a significant performance regression for the entire kernel.
>
> The fact that this can happen through dma_map_single() is non-ideal at
> best. It assumes the caller is mapping regular memory and doesn't check
> this at all. It may make sense to fix that but I think people expect
> dma_map_single() to be as fast as possible as well...
>
Perhaps include the validation with some debug turned on?
* Re: IOAT DMA w/IOMMU
@ 2018-08-13 23:50 ` Logan Gunthorpe
0 siblings, 0 replies; 95+ messages in thread
From: Logan Gunthorpe @ 2018-08-13 23:50 UTC (permalink / raw)
To: Kit Chow, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 13/08/18 05:48 PM, Kit Chow wrote:
> On 08/13/2018 04:39 PM, Logan Gunthorpe wrote:
>>
>> On 13/08/18 05:30 PM, Kit Chow wrote:
>>> In arch/x86/include/asm/page.h, there is the following comment in
>>> regards to validating the virtual address.
>>>
>>> /*
>>> * virt_to_page(kaddr) returns a valid pointer if and only if
>>> * virt_addr_valid(kaddr) returns true.
>>> */
>>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>>>
>>> So it looks like the validation by virt_addr_valid was somehow dropped
>>> from the virt_to_page code path. Does anyone have any ideas what
>>> happened to it?
>> I don't think it was ever validated (though I haven't been around long
>> enough to say for certain). What the comment is saying is that you
>> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is
>> true (which in most cases you can without the extra check). virt_to_page
>> is meant to be really fast so adding an extra validation would probably
>> be a significant performance regression for the entire kernel.
>>
>> The fact that this can happen through dma_map_single() is non-ideal at
>> best. It assumes the caller is mapping regular memory and doesn't check
>> this at all. It may make sense to fix that but I think people expect
>> dma_map_single() to be as fast as possible as well...
>>
> Perhaps include the validation with some debug turned on?
The problem is how often do you develop code with any of the debug
config options turned on?
There's already a couple of BUG_ONs in dma_map_single so maybe another
one with virt_addr_valid wouldn't be so bad.
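Something right next to the existing direction check, say (untested, and
assuming virt_addr_valid() is cheap enough on the architectures that
matter):

	BUG_ON(!valid_dma_direction(dir));
	BUG_ON(!virt_addr_valid(ptr));	/* reject non-lowmem addresses */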
Logan
* Re: IOAT DMA w/IOMMU
@ 2018-08-14 13:47 ` Kit Chow
0 siblings, 0 replies; 95+ messages in thread
From: Kit Chow @ 2018-08-14 13:47 UTC (permalink / raw)
To: Logan Gunthorpe, Robin Murphy, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 08/13/2018 04:50 PM, Logan Gunthorpe wrote:
>
> On 13/08/18 05:48 PM, Kit Chow wrote:
>> On 08/13/2018 04:39 PM, Logan Gunthorpe wrote:
>>> On 13/08/18 05:30 PM, Kit Chow wrote:
>>>> In arch/x86/include/asm/page.h, there is the following comment in
>>>> regards to validating the virtual address.
>>>>
>>>> /*
>>>> * virt_to_page(kaddr) returns a valid pointer if and only if
>>>> * virt_addr_valid(kaddr) returns true.
>>>> */
>>>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>>>>
>>>> So it looks like the validation by virt_addr_valid was somehow dropped
>>>> from the virt_to_page code path. Does anyone have any ideas what
>>>> happened to it?
>>> I don't think it was ever validated (though I haven't been around long
>>> enough to say for certain). What the comment is saying is that you
>>> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is
>>> true (which in most cases you can without the extra check). virt_to_page
>>> is meant to be really fast so adding an extra validation would probably
>>> be a significant performance regression for the entire kernel.
>>>
>>> The fact that this can happen through dma_map_single() is non-ideal at
>>> best. It assumes the caller is mapping regular memory and doesn't check
>>> this at all. It may make sense to fix that but I think people expect
>>> dma_map_single() to be as fast as possible as well...
>>>
>> Perhaps include the validation with some debug turned on?
> The problem is how often do you develop code with any of the debug
> config options turned on?
>
> There's already a couple of BUG_ONs in dma_map_single so maybe another
> one with virt_addr_valid wouldn't be so bad.
Had my very first Linux crash on the dma direction BUG_ON when I tried
DMA_NONE :).
* Re: IOAT DMA w/IOMMU
@ 2018-08-14 14:03 ` Robin Murphy
0 siblings, 0 replies; 95+ messages in thread
From: Robin Murphy @ 2018-08-14 14:03 UTC (permalink / raw)
To: Logan Gunthorpe, Kit Chow, Dave Jiang, Eric Pilmore, Bjorn Helgaas
Cc: linux-pci, Alex Williamson, David Woodhouse, iommu
On 14/08/18 00:50, Logan Gunthorpe wrote:
> On 13/08/18 05:48 PM, Kit Chow wrote:
>> On 08/13/2018 04:39 PM, Logan Gunthorpe wrote:
>>>
>>> On 13/08/18 05:30 PM, Kit Chow wrote:
>>>> In arch/x86/include/asm/page.h, there is the following comment in
>>>> regards to validating the virtual address.
>>>>
>>>> /*
>>>> * virt_to_page(kaddr) returns a valid pointer if and only if
>>>> * virt_addr_valid(kaddr) returns true.
>>>> */
>>>> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>>>>
>>>> So it looks like the validation by virt_addr_valid was somehow dropped
>>>> from the virt_to_page code path. Does anyone have any ideas what
>>>> happened to it?
>>> I don't think it was ever validated (though I haven't been around long
>>> enough to say for certain). What the comment is saying is that you
>>> shouldn't rely on virt_to_page() unless you know virt_addr_valid() is
>>> true (which in most cases you can without the extra check). virt_to_page
>>> is meant to be really fast so adding an extra validation would probably
>>> be a significant performance regression for the entire kernel.
>>>
>>> The fact that this can happen through dma_map_single() is non-ideal at
>>> best. It assumes the caller is mapping regular memory and doesn't check
>>> this at all. It may make sense to fix that but I think people expect
>>> dma_map_single() to be as fast as possible as well...
dma_map_single() is already documented as only supporting lowmem (for
which virt_to_page() can be assumed to be valid). You might get away
with feeding it bogus addresses on x86, but on non-coherent
architectures which convert the page back to a virtual address to
perform cache maintenance you can expect that to crash and burn rapidly.
There may be some minimal-overhead sanity checking of fundamentals, but
in general it's not really the DMA API's job to police its callers
exhaustively; consider that the mm layer doesn't go out of its way to
stop you from doing things like "kfree(kfree);" either.
>>>
>> Perhaps include the validation with some debug turned on?
>
> The problem is how often do you develop code with any of the debug
> config options turned on?
>
> There's already a couple of BUG_ONs in dma_map_single so maybe another
> one with virt_addr_valid wouldn't be so bad.
Note that virt_addr_valid() may be pretty heavyweight in itself. For
example the arm64 implementation involves memblock_search(); that really
isn't viable in a DMA mapping fastpath.
Robin.