From: Robin Murphy <robin.murphy@arm.com>
To: Chao Gao <chao.gao@intel.com>, linux-kernel@vger.kernel.org,
	iommu@lists.linux-foundation.org
Cc: m.szyprowski@samsung.com, hch@lst.de,
	Wang Zhaoyang1 <zhaoyang1.wang@intel.com>,
	Gao Liang <liang.gao@intel.com>,
	Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH] dma-direct: avoid redundant memory sync for swiotlb
Date: Tue, 12 Apr 2022 14:33:05 +0100
Message-ID: <e25fbb7e-a67e-5421-b7be-700fd0209b0d@arm.com>
In-Reply-To: <20220412113805.3210-1-chao.gao@intel.com>

On 12/04/2022 12:38 pm, Chao Gao wrote:
> When we looked into FIO performance with swiotlb enabled in VM, we found
> swiotlb_bounce() is always called one more time than expected for each DMA
> read request.
>
> It turns out that the bounce buffer is copied to original DMA buffer twice
> after the completion of a DMA request (one is done by in
> dma_direct_sync_single_for_cpu(), the other by swiotlb_tbl_unmap_single()).
> But the content in bounce buffer actually doesn't change between the two
> rounds of copy. So, one round of copy is redundant.
>
> Pass DMA_ATTR_SKIP_CPU_SYNC flag to swiotlb_tbl_unmap_single() to
> skip the memory copy in it.

It's still a little suboptimal and non-obvious to call into SWIOTLB
twice though - even better might be for SWIOTLB to call
arch_sync_dma_for_cpu() at the appropriate place internally, then put
the dma_direct_sync in an else path here. I'm really not sure why we
have the current disparity between map and unmap in this regard... :/

Robin.

> This fix increases FIO 64KB sequential read throughput in a guest with
> swiotlb=force by 5.6%.
>
> Reported-by: Wang Zhaoyang1 <zhaoyang1.wang@intel.com>
> Reported-by: Gao Liang <liang.gao@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  kernel/dma/direct.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 4632b0f4f72e..8a6cd53dbe8c 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -114,6 +114,7 @@ static inline void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
>  		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
>
>  	if (unlikely(is_swiotlb_buffer(dev, phys)))
> -		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
> +		swiotlb_tbl_unmap_single(dev, phys, size, dir,
> +					 attrs | DMA_ATTR_SKIP_CPU_SYNC);
>  }
>  #endif /* _KERNEL_DMA_DIRECT_H */
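The copy counts discussed in this message can be illustrated with a rough userspace model. This is only a sketch, not kernel code: every `model_*` and `MODEL_*` name is hypothetical, the buffer is assumed to always be a bounce buffer, the sync-for-CPU step is reduced to a single copy, and cache maintenance such as arch_sync_dma_for_cpu() is left out entirely. The third variant is one possible reading of the "else path" idea above, not a confirmed design.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC; the value is arbitrary. */
#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 0)

/* Number of bounce-buffer -> original-buffer copies in the current run. */
static int copies;

/* Models swiotlb_bounce() for a completed device-to-memory transfer. */
static void model_bounce(char *orig, const char *bounce, size_t size)
{
	memcpy(orig, bounce, size);
	copies++;
}

/* Models the swiotlb unmap step: copy back unless asked to skip it. */
static void model_swiotlb_unmap(char *orig, const char *bounce, size_t size,
				unsigned long attrs)
{
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		model_bounce(orig, bounce, size);
}

enum model_variant {
	MODEL_ORIGINAL,		/* sync, then unmap with unchanged attrs */
	MODEL_SKIP_FLAG,	/* sync, then unmap with the skip flag set */
	MODEL_ELSE_PATH,	/* bounce buffers go through unmap only */
};

/* Models one unmap of a bounced buffer; returns how many copies ran. */
static int model_unmap_page(enum model_variant variant)
{
	const char bounce[8] = "payload";
	char orig[8] = { 0 };

	copies = 0;
	switch (variant) {
	case MODEL_ORIGINAL:
		model_bounce(orig, bounce, sizeof(bounce)); /* sync for CPU */
		model_swiotlb_unmap(orig, bounce, sizeof(bounce), 0);
		break;
	case MODEL_SKIP_FLAG:
		model_bounce(orig, bounce, sizeof(bounce)); /* sync for CPU */
		model_swiotlb_unmap(orig, bounce, sizeof(bounce),
				    MODEL_ATTR_SKIP_CPU_SYNC);
		break;
	case MODEL_ELSE_PATH:
		/* The separate sync would sit in the non-bounced else branch. */
		model_swiotlb_unmap(orig, bounce, sizeof(bounce), 0);
		break;
	}

	/* Whichever path ran, the data must have reached the original buffer. */
	assert(memcmp(orig, bounce, sizeof(bounce)) == 0);
	return copies;
}
```

In this model the unpatched flow copies twice, while both the skip-flag variant and the else-path variant copy once; the else-path variant additionally calls into the bounce-buffer code only once.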
Thread overview:
2022-04-12 11:38 [PATCH] dma-direct: avoid redundant memory sync for swiotlb Chao Gao
2022-04-12 13:21 ` Chao Gao
2022-04-12 13:33 ` Robin Murphy [this message]
2022-04-13  1:02 ` Chao Gao
2022-04-13  4:59 ` Christoph Hellwig
2022-04-13  5:46 ` Chao Gao
2022-04-13  5:49 ` Christoph Hellwig
2022-04-13 13:10 ` Robin Murphy
2022-04-13 16:44 ` Christoph Hellwig