From: Bruce Richardson <bruce.richardson@intel.com>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: fengchengwen <fengchengwen@huawei.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@intel.com>, dpdk-dev <dev@dpdk.org>,
	Nipun Gupta <nipun.gupta@nxp.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
	Jerin Jacob <jerinj@marvell.com>,
	David Marchand <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [RFC PATCH] dmadev: introduce DMA device library
Date: Fri, 18 Jun 2021 11:03:58 +0100
Message-ID: <YMxvjhneoTgsGicO@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <CALBAE1OjPPs8dueABG2r2YBJvXn=Sa40JyXCk0Saxi_NpJGNDw@mail.gmail.com>

On Fri, Jun 18, 2021 at 10:46:08AM +0530, Jerin Jacob wrote:
> On Thu, Jun 17, 2021 at 1:30 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Thu, Jun 17, 2021 at 01:12:22PM +0530, Jerin Jacob wrote:
> > > On Thu, Jun 17, 2021 at 12:43 AM Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > >
> > > > On Wed, Jun 16, 2021 at 11:38:08PM +0530, Jerin Jacob wrote:
> > > > > On Wed, Jun 16, 2021 at 11:01 PM Bruce Richardson
> > > > > <bruce.richardson@intel.com> wrote:
> > > > > >
> > > > > > On Wed, Jun 16, 2021 at 05:41:45PM +0800, fengchengwen wrote:
> > > > > > > On 2021/6/16 0:38, Bruce Richardson wrote:
> > > > > > > > On Tue, Jun 15, 2021 at 09:22:07PM +0800, Chengwen Feng wrote:
> > > > > > > >> This patch introduces 'dmadevice' which is a generic type of DMA
> > > > > > > >> device.
> > > > > > > >>
> > > > > > > >> The APIs of dmadev library exposes some generic operations which can
> > > > > > > >> enable configuration and I/O with the DMA devices.
> > > > > > > >>
> > > > > > > >> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > > > > > > >> ---
> > > > > > > > Thanks for sending this.
> > > > > > > >
> > > > > > > > Of most interest to me right now are the key data-plane APIs. While we are
> > > > > > > > still in the prototyping phase, below is a draft of what we are thinking
> > > > > > > > for the key enqueue/perform_ops/completed_ops APIs.
> > > > > > > >
> > > > > > > > Some key differences I note in below vs your original RFC:
> > > > > > > > * Use of void pointers rather than iova addresses. While using iova's makes
> > > > > > > >   sense in the general case when using hardware, in that it can work with
> > > > > > > >   both physical addresses and virtual addresses, if we change the APIs to use
> > > > > > > >   void pointers instead it will still work for DPDK in VA mode, while at the
> > > > > > > >   same time allow use of software fallbacks in error cases, and also a stub
> > > > > > > > >   driver that uses memcpy in the background. Finally, using iova's makes the
> > > > > > > >   APIs a lot more awkward to use with anything but mbufs or similar buffers
> > > > > > > >   where we already have a pre-computed physical address.
> > > > > > >
> > > > > > > > The iova is a hint to the application, and is widely used in DPDK.
> > > > > > > > If we switch to void, how do we pass the address (iova or just va?)
> > > > > > > this may introduce implementation dependencies here.
> > > > > > >
> > > > > > > Or always pass the va, and the driver performs address translation, and this
> > > > > > > translation may cost too much cpu I think.
> > > > > > >
> > > > > >
> > > > > > On the latter point, about driver doing address translation I would agree.
> > > > > > However, we probably need more discussion about the use of iova vs just
> > > > > > virtual addresses. My thinking on this is that if we specify the API using
> > > > > > iovas it will severely hurt usability of the API, since it forces the user
> > > > > > to take more inefficient codepaths in a large number of cases. Given a
> > > > > > pointer to the middle of an mbuf, one cannot just pass that straight as an
> > > > > > iova but must instead do a translation into offset from mbuf pointer and
> > > > > > > then re-add the offset to the mbuf base address.
> > > > > >
> > > > > > My preference therefore is to require the use of an IOMMU when using a
> > > > > > dmadev, so that it can be a much closer analog of memcpy. Once an iommu is
> > > > > > present, DPDK will run in VA mode, allowing virtual addresses to our
> > > > > > hugepage memory to be sent directly to hardware. Also, when using
> > > > > > dmadevs on top of an in-kernel driver, that kernel driver may do all iommu
> > > > > > management for the app, removing further the restrictions on what memory
> > > > > > can be addressed by hardware.
> > > > >
> > > > >
> > > > > > One issue with keeping void * is that memory can come from the stack or heap,
> > > > > > which HW cannot really operate on.
> > > >
> > > > when kernel driver is managing the IOMMU all process memory can be worked
> > > > on, not just hugepage memory, so using iova is wrong in these cases.
> > >
> > > But not for stack and heap memory. Right?
> > >
> > Yes, even stack and heap can be accessed.
> 
> The HW device cannot, as that memory is NOT mapped to the IOMMU. It will
> result in a transaction
> fault.
>

Not if the kernel driver rather than DPDK is managing the IOMMU:
https://www.kernel.org/doc/html/latest/x86/sva.html
"Shared Virtual Addressing (SVA) allows the processor and device to use the
same virtual addresses avoiding the need for software to translate virtual
addresses to physical addresses. SVA is what PCIe calls Shared Virtual
Memory (SVM)."
 
> At least, In octeon, DMA HW job descriptor will have a pointer (IOVA)
> which will be updated by _HW_
> upon copy job completion. That memory can not be from the
> heap(malloc()) or stack as those are not
> mapped by IOMMU.
> 
> 
> >
> > > >
> > > > As I previously said, using iova prevents the creation of a pure software
> > > > dummy driver too using memcpy in the background.
> > >
> > > Why? The memory allocated using rte_alloc/rte_memzone etc. can be touched by the CPU.
> > >
> > Yes, but it can't be accessed using physical address, so again only VA mode
> > where iova's are "void *" make sense.
> 
> I agree that it should be a physical address. My only concern is that
> void * does not express
> that it cannot be from stack/heap. If the API says the memory needs to be
> allocated by rte_alloc() or rte_memzone() etc.,
> that is fine with me.
> 
That could be a capability field too. Hardware supporting SVA/SVM does not
have this limitation so can specify that any virtual address may be used.

I suppose it really doesn't matter whether the APIs are written to take
pointers or iova's so long as the restrictions are clear. Since iova is the
default for other HW ops, I'm ok for functions to take params as iovas and
have the capability definitions provide the info to the user that in some
cases virtual addresses can be used.
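
To make that concrete, here is a rough sketch of how an application might
pick the address to hand to an iova-based API. Every name in it
(DMA_CAPA_SVA, struct dma_info, struct dma_buf, dma_addr_for) is hypothetical
and appears neither in the RFC nor in DPDK. It assumes the device reports a
capability bit when it can take any process virtual address, and that
otherwise the pointer must fall inside a buffer whose base iova is already
known, in which case the offset is re-added to that base as discussed earlier
in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the dmadev API. */
typedef uint64_t iova_t;

/* Assumed capability bit: device accepts any process virtual address (SVA/SVM). */
#define DMA_CAPA_SVA (1u << 0)

struct dma_info {
	uint32_t capa;          /* capability flags reported by the device */
};

/* A DMA-able region with a pre-computed iova for its base, like an mbuf. */
struct dma_buf {
	void *va_base;
	iova_t iova_base;
	size_t len;
};

/*
 * Compute the address to pass to an iova-based enqueue call for 'ptr'.
 * Returns 0 on success, or -1 if the device needs a real iova and 'ptr'
 * does not lie inside 'buf' (e.g. it points to stack or malloc() memory).
 */
static int
dma_addr_for(const struct dma_info *info, const struct dma_buf *buf,
	     const void *ptr, iova_t *out)
{
	assert(info != NULL && out != NULL);

	if (info->capa & DMA_CAPA_SVA) {
		/* Virtual address is usable as-is; no translation needed. */
		*out = (iova_t)(uintptr_t)ptr;
		return 0;
	}

	if (buf == NULL)
		return -1;

	uintptr_t p = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)buf->va_base;

	if (p < base || p - base >= buf->len)
		return -1;

	/* Offset from the buffer base, re-added to the base iova. */
	*out = buf->iova_base + (iova_t)(p - base);
	return 0;
}
```

With a device that lacks the capability bit, a pointer 16 bytes into a
registered buffer yields iova_base + 16, while a pointer to a stack variable
is rejected. With the bit set, the same stack pointer is passed through
unchanged.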
