dmaengine Archive on lore.kernel.org
From: Thomas Ruf <freelancer@rufusul.de>
To: Peter Ujfalusi <peter.ujfalusi@ti.com>, Vinod Koul <vkoul@kernel.org>
Cc: Federico Vaga <federico.vaga@cern.ch>,
	Dave Jiang <dave.jiang@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: DMA Engine: Transfer From Userspace
Date: Wed, 1 Jul 2020 18:13:46 +0200 (CEST)
Message-ID: <449688975.51579.1593620026422@mailbusiness.ionos.de> (raw)
In-Reply-To: <391e0aee-e35b-fa39-ab86-18fe33a776ee@ti.com>


> On 30 June 2020 at 14:31 Peter Ujfalusi <peter.ujfalusi@ti.com> wrote:
> 
> On 29/06/2020 18.18, Thomas Ruf wrote:
> > 
> >> On 26 June 2020 at 12:29 Peter Ujfalusi <peter.ujfalusi@ti.com> wrote:
> >>
> >> On 24/06/2020 16.58, Thomas Ruf wrote:
> >>>
> >>>> On 24 June 2020 at 14:07 Peter Ujfalusi <peter.ujfalusi@ti.com> wrote:
> >>>> On 24/06/2020 12.38, Vinod Koul wrote:
> >>>>> On 24-06-20, 11:30, Thomas Ruf wrote:
> >>>>>
> >>>>>> To make it short - i have two questions:
> >>>>>> - what are the chances to revive DMA_SG?
> >>>>>
> >>>>> 100%, if we have a in-kernel user
> >>>>
> >>>> Most DMAs cannot handle differently provisioned sg_lists for src and dst.
> >>>> Even if they could handle a non-symmetric SG setup, it would require an
> >>>> entirely different setup (two independent channels sending the data to
> >>>> each other, one reads, the other writes?).
> >>>
> >>> Ok, I implemented that using zynqmp_dma on a Xilinx Zynq platform (obviously ;-) and it works nicely for us.
> >>
> >> I see, if the HW does not support it then something along the lines of
> >> what the atc_prep_dma_sg did can be implemented for most engines.
> >>
> >> In essence: create a new set of sg_list which is symmetric.
> > 
> > Sorry, I'm not sure if I understood you right.
> > You suggest that, in case DMA_SG gets revived, we should restrict support to symmetric sg_lists?
> 
> No, not at all. That would not make much sense.

Glad that this was just a misunderstanding.

> > Just had a glance at the deleted code, and the *_prep_dma_sg of these drivers had code to support asymmetric lists and, by that, "unaligned" memory (relative to page start):
> > at_hdmac.c         
> > dmaengine.c        
> > dmatest.c          
> > fsldma.c           
> > mv_xor.c           
> > nbpfaxi.c          
> > ste_dma40.c        
> > xgene-dma.c        
> > xilinx/zynqmp_dma.c
> > 
> > Why not just revive that and keep this nice functionality? ;-)
> 
> What I'm saying is that the drivers (at least at_hdmac) in essence
> create an aligned sg_list from the received non-aligned ones.
> They do this without actually creating the sg_list itself, but that's
> just a small detail.
> 
> In the longer run what might make sense is to have a helper function to
> convert two non-symmetric sg_lists into two symmetric ones, so drivers
> will not have to re-implement the same code and will only need to
> care about symmetric sg_lists.

Sounds like a superb idea!
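Such a helper could look roughly like the sketch below. This is a userspace model for illustration only: the `struct seg` type, the function name `make_symmetric`, and the flat arrays are all made up here; a real kernel helper would walk `struct scatterlist` entries with sg_next() and allocate the output lists properly.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace model of a scatterlist entry (hypothetical type;
 * the kernel would use struct scatterlist instead). */
struct seg { unsigned long addr; size_t len; };

/* Split two non-symmetric segment lists into symmetric pairs: each
 * output pair covers min(remaining src, remaining dst) bytes, so every
 * pair has equal length and a DMA engine that only supports symmetric
 * sg lists can execute it segment by segment. */
static size_t make_symmetric(const struct seg *src, size_t nsrc,
                             const struct seg *dst, size_t ndst,
                             struct seg *out_src, struct seg *out_dst,
                             size_t max_out)
{
	size_t i = 0, j = 0, n = 0;
	size_t soff = 0, doff = 0; /* offsets into current src/dst segment */

	while (i < nsrc && j < ndst && n < max_out) {
		size_t slen = src[i].len - soff;
		size_t dlen = dst[j].len - doff;
		size_t len = slen < dlen ? slen : dlen;

		out_src[n] = (struct seg){ src[i].addr + soff, len };
		out_dst[n] = (struct seg){ dst[j].addr + doff, len };
		n++;

		soff += len;
		doff += len;
		if (soff == src[i].len) { i++; soff = 0; }
		if (doff == dst[j].len) { j++; doff = 0; }
	}
	return n;
}
```

For example, a src list of 100+50 bytes against a dst list of 60+90 bytes comes out as three symmetric pairs of 60, 40 and 50 bytes.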

> Note, some DMAs can actually handle non symmetric src and dst lists, but
> I believe it is rare.

So i was a bit lucky that the zynqmp_dma is one of them.
 
> >> What might be plausible is to introduce hw offloading support for memcpy
> >> type of operations, in a similar fashion to how for example crypto does it?
> > 
> > Sounds good to me, my proxy driver implementation could be a good start for that, too!
> 
> It needs to find its place as well... I'm not sure where that would be.
> Simple block-copy offload, sg copy offload, interleaved (frame
> extraction) offload, and dmabuf copy offload come to mind as candidates.

And who would decide that...

> >> The issue with user-space-implemented logic is that it is not portable
> >> between systems with different DMAs. It might be that on one DMA the
> >> setup takes longer than doing a CPU copy of X bytes, while on another
> >> DMA it might be significantly lower or higher.
> > 
> > Fully agree with that!
> > I was also unsure how my approach would perform, but in our case latency increased by ~20% while CPU load roughly stayed the same; of course, this was the benchmark from user memory to user memory.
> > From uncached to user memory the DMA was around 15 times faster.
> 
> It depends on the size of the transfer. Lots of small individual
> transfers might be worse via DMA due to the setup time, completion
> handling, etc.

Yes, exactly.
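That trade-off can be illustrated with a toy cost model: the CPU copy cost scales purely with size, while a DMA transfer pays a fixed setup/completion overhead but moves bytes faster, so there is a crossover size below which the CPU always wins. All constants below are invented for illustration; real numbers would have to be measured per DMA engine, which is exactly why a threshold tuned on one system is wrong on another.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative cost model - these numbers are made-up assumptions. */
#define CPU_NS_PER_BYTE 1.0    /* assumed CPU copy throughput        */
#define DMA_NS_PER_BYTE 0.25   /* assumed DMA engine throughput      */
#define DMA_SETUP_NS    3000.0 /* assumed descriptor setup + IRQ cost */

/* Decide whether a transfer of the given size is worth offloading:
 * DMA wins only once its fixed overhead is amortized over enough bytes. */
static int use_dma(size_t bytes)
{
	double cpu = CPU_NS_PER_BYTE * (double)bytes;
	double dma = DMA_SETUP_NS + DMA_NS_PER_BYTE * (double)bytes;

	return dma < cpu;
}
```

With these (made-up) constants the crossover sits at 4 KiB: below that the setup cost dominates and the CPU copy is cheaper, above it the DMA engine's higher throughput takes over.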

Thanks again for your great input!

best regards,
Thomas

PS: I am on vacation for the next two weeks and probably will not check this mailing list until July 20, but will catch up later.

