* [Question] Missing data after DMA read transfer
@ 2016-04-20 14:56 Nicolas Morey-Chaisemartin
  2016-04-21  8:35 ` One Thousand Gnomes
  2016-04-25  6:18 ` Nicolas Morey-Chaisemartin
  0 siblings, 2 replies; 6+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2016-04-20 14:56 UTC (permalink / raw)
  To: linux-kernel

Hi everyone,

Short version:
I'm having an issue with direct DMA transfer from a device to host memory.
It seems some of the data is not transferring to the appropriate page.

Some more details:
I'm debugging a home-made PCI driver for our board (Kalray), attached to an x86_64 host running CentOS 7 (3.10.0-327.el7.x86_64).

In the current case, a userland application transfers data back and forth through read/write operations on a file.
On the kernel side, it triggers DMA transfers through the PCI to/from our board memory.

We followed what pretty much all docs said about direct I/O to user buffers:

1) get_user_pages() (in the current case, at most 16 pages at once)
2) convert to a scatterlist
3) pci_map_sg
4) coalesce the sg entries where possible (Intel IOMMU is enabled, so it usually is)
5) a lot of DMA engine handling code, using the dmaengine layer and virt-dma
6) wait for the transfer to complete; in the meantime, go back to (1) to schedule more work, if any
7) pci_unmap_sg
8) for a read (card2host) transfer, set_page_dirty_lock
9) page_cache_release
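
For readers unfamiliar with the coalescing step in the list above, here is a minimal userland model of it in plain C. It is only a sketch, not the driver's code: struct seg and coalesce_segs() are hypothetical names standing in for mapped scatterlist entries, and the real driver works on struct scatterlist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland stand-in for one mapped scatterlist entry. */
struct seg {
    uint64_t dma_addr; /* bus address as seen by the device */
    uint32_t len;      /* length in bytes */
};

/*
 * Merge entries whose bus addresses are back to back, in place.
 * Returns the number of entries left after merging.
 */
static size_t coalesce_segs(struct seg *segs, size_t n)
{
    size_t out = 0;

    assert(segs != NULL || n == 0);
    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        if (segs[out].dma_addr + segs[out].len == segs[i].dma_addr)
            segs[out].len += segs[i].len; /* contiguous: extend current entry */
        else
            segs[++out] = segs[i];        /* gap: start a new entry */
    }
    return out + 1;
}
```

With an IOMMU, pages that are scattered in physical memory can still receive consecutive bus addresses, which is why merging is usually possible in that configuration.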

In 99.9999% of cases it works perfectly.
However, I have one userland application where a few pages are not written by a read (card2host) transfer.
The buffer is memset to a different value beforehand, so I can tell which pages were never overwritten.
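
The sentinel check described above could look roughly like this on the userland side. This is a sketch under assumptions: PAGE_SZ is taken to be 4096, count_untouched_pages() is a hypothetical helper, and it only works if the transferred payload never contains a whole page of the sentinel byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u /* assumed page size (x86_64 base pages) */

/*
 * Return the number of whole pages in buf that still contain only the
 * sentinel byte, and store the index of the first such page in *first
 * (left untouched when the count is 0).
 */
static size_t count_untouched_pages(const unsigned char *buf, size_t len,
                                    unsigned char sentinel, size_t *first)
{
    size_t count = 0;

    assert(buf != NULL && first != NULL);
    for (size_t page = 0; page < len / PAGE_SZ; page++) {
        const unsigned char *p = buf + page * PAGE_SZ;
        size_t i = 0;

        while (i < PAGE_SZ && p[i] == sentinel)
            i++;
        if (i == PAGE_SZ) {      /* every byte is still the sentinel */
            if (count == 0)
                *first = page;
            count++;
        }
    }
    return count;
}
```

The caller would memset() the buffer to the sentinel, issue the read(), and then call the helper on the same buffer.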

I know (PCI protocol analyser) that the data left our board for the "right" address (the one set in the sg by pci_map_sg).
I tried reading the data between the pci_unmap_sg and the set_page_dirty, using
        uint32_t *addr = page_address(trans->pages[0]);
        dev_warn(&pdata->pdev->dev, "val = %x\n", *addr);
and it has the expected value.
But if I try to copy_from_user (using the address coming from userland, the one passed to get_user_pages), the data has not been written and I see the memset value.

I managed to build a test case that fails every time, but never at the same offset within the buffer.
The missing range is always in the middle (never at the start or end) and spans a few pages (the count varies between runs).


Am I missing something? Could it be possible that I'm not writing to the right page?
If you need more information, feel free to ask


Thanks in advance

Nicolas


* Re: [Question] Missing data after DMA read transfer
  2016-04-20 14:56 [Question] Missing data after DMA read transfer Nicolas Morey-Chaisemartin
@ 2016-04-21  8:35 ` One Thousand Gnomes
  2016-04-21  8:51   ` Nicolas Morey-Chaisemartin
  2016-04-25  6:18 ` Nicolas Morey-Chaisemartin
  1 sibling, 1 reply; 6+ messages in thread
From: One Thousand Gnomes @ 2016-04-21  8:35 UTC (permalink / raw)
  To: Nicolas Morey-Chaisemartin; +Cc: linux-kernel

> But if I try to copy_from_user (using the address coming from userland, the one passed to get_user_pages), the data has not been written and I see the memset value.
> 
> I managed to build a test case that fails every time, but never at the same offset within the buffer.
> The missing range is always in the middle (never at the start or end) and spans a few pages (the count varies between runs).

Always page aligned, always cache line aligned, or arbitrarily aligned?

Alan


* Re: [Question] Missing data after DMA read transfer
  2016-04-21  8:35 ` One Thousand Gnomes
@ 2016-04-21  8:51   ` Nicolas Morey-Chaisemartin
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2016-04-21  8:51 UTC (permalink / raw)
  To: One Thousand Gnomes; +Cc: linux-kernel



On 04/21/2016 at 10:35 AM, One Thousand Gnomes wrote:
>> But if I try to copy_from_user (using the address coming from userland, the one passed to get_user_pages), the data has not been written and I see the memset value.
>>
>> I managed to build a test case that fails every time, but never at the same offset within the buffer.
>> The missing range is always in the middle (never at the start or end) and spans a few pages (the count varies between runs).
> Always page aligned, always cache line aligned or arbitrarily aligned ?
>
> Alan
It's always page aligned at both start and end (meaning the last "untransferred" byte is the last byte of a page).
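
The three cases asked about can be told apart mechanically from the byte offsets of the missing range. A small sketch, assuming a page-aligned buffer, 4096-byte pages and 64-byte cache lines; classify_range() and the enum are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ    4096u /* assumed page size */
#define CACHE_LINE 64u   /* assumed cache line size on the x86_64 host */

enum align_kind { ALIGN_PAGE, ALIGN_CACHE_LINE, ALIGN_ARBITRARY };

/*
 * Classify the half-open byte range [start, end), given as offsets from
 * the start of a page-aligned buffer, by the coarsest alignment that
 * both of its boundaries share.
 */
static enum align_kind classify_range(size_t start, size_t end)
{
    assert(start < end);
    if (start % PAGE_SZ == 0 && end % PAGE_SZ == 0)
        return ALIGN_PAGE;
    if (start % CACHE_LINE == 0 && end % CACHE_LINE == 0)
        return ALIGN_CACHE_LINE;
    return ALIGN_ARBITRARY;
}
```

Here the result is ALIGN_PAGE on every run, so whole pages are affected rather than individual cache lines.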

Nicolas


* Re: [Question] Missing data after DMA read transfer
  2016-04-20 14:56 [Question] Missing data after DMA read transfer Nicolas Morey-Chaisemartin
  2016-04-21  8:35 ` One Thousand Gnomes
@ 2016-04-25  6:18 ` Nicolas Morey-Chaisemartin
  2016-04-26 13:31   ` Nicolas Morey-Chaisemartin
  2016-04-27  9:20   ` Nicolas Morey-Chaisemartin
  1 sibling, 2 replies; 6+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2016-04-25  6:18 UTC (permalink / raw)
  To: linux-kernel

As suggested, I tried to run the app without IOMMU and with DMA_API_DEBUG enabled.

intel_iommu=off changed nothing: the app still fails.
DMA_API_DEBUG showed no warning or error.

I'm open to other tests that could add useful information for debugging this.


I also tried something. I'm not sure what exactly I am looking at but it looks suspicious to me:

When running with intel_iommu=on, I retrieved the page pointer corresponding to the user virtual address by looking at the MM/VMA structs
and compared it to the one I got earlier from get_user_pages.
These pointers regularly do not match, and for the pages which are "not transferred", they never do.

If this is to be expected, why are the pages different? The buffers were memset before the call to the PCI driver, so all the physical pages should be resolved (no COW or anything like that), and I thought the point of get_user_pages was to pin pages so they cannot be moved/swapped until they are put back?


Nicolas


* Re: [Question] Missing data after DMA read transfer
  2016-04-25  6:18 ` Nicolas Morey-Chaisemartin
@ 2016-04-26 13:31   ` Nicolas Morey-Chaisemartin
  2016-04-27  9:20   ` Nicolas Morey-Chaisemartin
  1 sibling, 0 replies; 6+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2016-04-26 13:31 UTC (permalink / raw)
  To: linux-kernel

Ping. I could really use some help/feedback on this.

Thanks in advance


Nicolas



* Re: [Question] Missing data after DMA read transfer
  2016-04-25  6:18 ` Nicolas Morey-Chaisemartin
  2016-04-26 13:31   ` Nicolas Morey-Chaisemartin
@ 2016-04-27  9:20   ` Nicolas Morey-Chaisemartin
  1 sibling, 0 replies; 6+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2016-04-27  9:20 UTC (permalink / raw)
  To: linux-kernel

I ran some more tests:

* The test passes if transparent huge pages (THP) are disabled

* For all the pages where data is not transferred, and only those pages, a call to get_user_pages(user vaddr) just before dma_unmap_sg returns a different page from the original one.
[436477.927279] mppa 0000:03:00.0: org_page= ffffea0009f60080 cur page = ffffea00074e0080
[436477.927298] page:ffffea0009f60080 count:0 mapcount:1 mapping:          (null) index:0x2
[436477.927314] page flags: 0x2fffff00008000(tail)
[436477.927354] page dumped because: org_page
[436477.927369] page:ffffea00074e0080 count:0 mapcount:1 mapping:          (null) index:0x2
[436477.927382] page flags: 0x2fffff00008000(tail)
[436477.927421] page dumped because: cur_page
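
The two struct page pointers in the log above can be turned into page frame numbers with simple arithmetic. The sketch below rests on assumptions that are not stated in the log: the usual x86_64 vmemmap base of 0xffffea0000000000 (no KASLR), a 64-byte struct page, and 2 MiB huge pages. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VMEMMAP_BASE   0xffffea0000000000ULL /* assumed x86_64 vmemmap start */
#define STRUCT_PAGE_SZ 64ULL                 /* assumed sizeof(struct page) */
#define PAGES_PER_THP  512ULL                /* 2 MiB / 4 KiB */

/* Convert a struct page pointer (as printed in the log) to a PFN. */
static uint64_t page_ptr_to_pfn(uint64_t page_ptr)
{
    assert(page_ptr >= VMEMMAP_BASE);
    return (page_ptr - VMEMMAP_BASE) / STRUCT_PAGE_SZ;
}

/* Position of a PFN inside its 2 MiB-aligned group of 512 base pages. */
static uint64_t pfn_index_in_huge_page(uint64_t pfn)
{
    return pfn % PAGES_PER_THP;
}
```

Under those assumptions, org_page decodes to PFN 0x27d802 and cur page to PFN 0x1d3802. Both sit at position 2 within their 2 MiB-aligned frame, but the frames themselves differ.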

I'm not sure what to make of this...
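
One way to narrow down the THP observation without disabling it system-wide might be to opt only the test buffer out from userland, using madvise() with MADV_NOHUGEPAGE. This is a sketch, not something tested in this thread: alloc_no_thp() is a hypothetical helper, PAGE_SZ is assumed to be 4096, and the call has to happen before the buffer is first touched, since the advice does not split huge pages that already back the range.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>

#define PAGE_SZ 4096u /* assumed page size */

/*
 * Allocate a page-aligned buffer and ask the kernel not to back it with
 * transparent huge pages. Returns NULL on allocation failure. *advised is
 * set to 1 if madvise() succeeded and 0 otherwise (for example when the
 * kernel was built without THP support).
 */
static void *alloc_no_thp(size_t len, int *advised)
{
    void *buf = NULL;

    assert(advised != NULL && len % PAGE_SZ == 0);
    *advised = 0;
    if (posix_memalign(&buf, PAGE_SZ, len) != 0)
        return NULL;
#ifdef MADV_NOHUGEPAGE
    *advised = (madvise(buf, len, MADV_NOHUGEPAGE) == 0);
#endif
    return buf;
}
```

If the failure disappears for a buffer allocated this way, that would point at huge-page handling of the buffer rather than at the DMA path itself.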

Nicolas
