From: Liang Ma <liangma@liangbit.com>
To: fengchengwen <fengchengwen@huawei.com>
Cc: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Jerin Jacob" <jerinjacobk@gmail.com>,
	"Jerin Jacob" <jerinj@marvell.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"Nipun Gupta" <nipun.gupta@nxp.com>,
	"Thomas Monjalon" <thomas@monjalon.net>,
	"Ferruh Yigit" <ferruh.yigit@intel.com>, dpdk-dev <dev@dpdk.org>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Honnappa Nagarahalli" <honnappa.nagarahalli@arm.com>,
	"David Marchand" <david.marchand@redhat.com>,
	"Satananda Burla" <sburla@marvell.com>,
	"Prasun Kapoor" <pkapoor@marvell.com>
Subject: Re: [dpdk-dev] dmadev discussion summary
Date: Fri, 2 Jul 2021 08:07:35 +0100
Message-ID: <YN67N8Jb5VGQmuw3@C02F33EJML85>
In-Reply-To: <c4a0ee30-f7b8-f8a1-463c-8eedaec82aea@huawei.com>

On Sat, Jun 26, 2021 at 11:59:49AM +0800, fengchengwen wrote:
> Hi, all
>   I analyzed the current DPDK DMA drivers and drew up this summary in
> conjunction with the previous discussion; it will serve as the basis for the
> V2 implementation.
>   Feedback is welcome, thanks
> 
> 
> dpaa2_qdma:
>   [probe]: mainly obtains the number of hardware queues.
>   [dev_configure]: has the following parameters:
>       max_hw_queues_per_core:
>       max_vqs: max number of virt-queues
>       fle_queue_pool_cnt: the size of the FLE pool
>   [queue_setup]: sets up one virt-queue, has the following parameters:
>       lcore_id:
>       flags: some control params, e.g. sg-list, long-format desc, exclusive HW
>              queue...
>       rbp: some misc fields which affect the descriptor
>       Note: this API returns the index of the virt-queue that was
>             successfully set up.
>   [enqueue_bufs]: data-plane API, the key fields:
>       vq_id: the index of the virt-queue
>       job: the pointer to the job array
>       nb_jobs:
>       Note: one job has src/dest/len/flag/cnxt/status/vq_id/use_elem fields;
>             the flag field indicates whether src/dst is a PHY addr.
>   [dequeue_bufs]: gets the pointers of the completed jobs
> 
>   [key point]:
>       ------------    ------------
>       |virt-queue|    |virt-queue|
>       ------------    ------------
>              \           /
>               \         /
>                \       /
>              ------------     ------------
>              | HW-queue |     | HW-queue |
>              ------------     ------------
>                     \            /
>                      \          /
>                       \        /
>                       core/rawdev
>       1) In the probe stage, the driver reports how many HW-queues can be used.
>       2) The user can specify the maximum number of HW-queues managed by a
>          single core in the dev_configure stage.
>       3) The user can create one virt-queue with the queue_setup API. A
>          virt-queue has two types: a) exclusive HW-queue, b) shared HW-queue
>          (as described above); the type is selected by the corresponding bit
>          of the flags field.
>       4) In this mode, queue management is simplified. The user does not need
>          to specify which HW-queue to apply for and then create a virt-queue
>          on it. All the user needs to do is say on which core to create the
>          virt-queue.
>       5) Virt-queues can have different capabilities, e.g. virt-queue-0
>          supports the scatter-gather format and virt-queue-1 doesn't; this is
>          controlled by the flags and rbp fields in the queue_setup stage.
>       6) The data-plane API uses definitions similar to rte_mbuf and
>          rte_eth_rx/tx_burst().
>       PS: I still don't understand how sg-list enqueue/dequeue works, or how
>           the user is meant to use RTE_QDMA_VQ_NO_RESPONSE.
> 
>       Overall, I think it's a flexible design with good scalability. In
>       particular, the queue resource pool architecture simplifies user
>       invocations, although the 'core' concept is introduced a bit abruptly.
> 
> 
> octeontx2_dma:
>   [dev_configure]: has one parameter:
>       chunk_pool: it's strange that it's not managed internally by the driver
>                   but passed in through the API.
>   [enqueue_bufs]: has three important parameters:
>       context: this is what Jerin referred to as 'channel'; it could hold the
>                completion ring of the job.
>       buffers: holds the pointer array of dpi_dma_buf_ptr_s
>       count: the number of dpi_dma_buf_ptr_s
>       Note: one dpi_dma_buf_ptr_s may have many src and dst pairs (it's a
>             scatter-gather list) and has one completed_ptr (when the HW
>             completes, it writes one value to this ptr). Currently the
>             completed_ptr points to this struct:
>                 struct dpi_dma_req_compl_s {
>                     uint64_t cdata;  /* driver inits, HW writes result here */
>                     void (*compl_cb)(void *dev, void *arg);
>                     void *cb_data;
>                 };
>   [dequeue_bufs]: has two important parameters:
>       context: the driver scans its completion ring to get completion info.
>       buffers: holds the pointer array of completed_ptr.
> 
>   [key point]:
>       -----------    -----------
>       | channel |    | channel |
>       -----------    -----------
>              \           /
>               \         /
>                \       /
>              ------------
>              | HW-queue |
>              ------------
>                    |
>                 --------
>                 |rawdev|
>                 --------
>       1) The user can create one channel by initializing a context
>          (dpi_dma_queue_ctx_s); this interface is not standardized and needs
>          to be implemented by users.
>       2) Different channels can support different transfers, e.g. one for
>          inner m2m, and another for inbound copy.
> 
>       Overall, I think the 'channel' is similar to the 'virt-queue' of
>       dpaa2_qdma. The difference is that dpaa2_qdma supports multiple hardware
>       queues. The 'channel' has the following characteristics:
>       1) A channel is an operable unit at the user level. The user can create
>          a channel for each transfer type, for example, a local-to-local
>          channel and a local-to-host channel. The user can also get the
>          completion status of one channel.
>       2) Multiple channels can run on the same HW-queue. In terms of API design,
>          this reduces the number of data-plane API parameters. The channel can
>          hold context info that the data-plane APIs refer to when they execute.
> 
> 
> ioat:
>   [probe]: creates multiple rawdevs if it's a DSA device and has multiple
>            HW-queues.
>   [dev_configure]: has three parameters:
>       ring_size: the size of the HW descriptor ring.
>       hdls_disable: whether to ignore user-supplied handle params
>       no_prefetch_completions:
>   [rte_ioat_enqueue_copy]: has dev_id/src/dst/length/src_hdl/dst_hdl parameters.
>   [rte_ioat_completed_ops]: has dev_id/max_copies/status/num_unsuccessful/
>                             src_hdls/dst_hdls parameters.
> 
>   Overall, there is one rawdev per HW-queue, and there are no multiple
>   'channels' like those in octeontx2_dma.
> 
> 
> Kunpeng_dma:
>   1) The hardware supports multiple modes (e.g. local-to-local/
>      local-to-pciehost/pciehost-to-local/immediate-to-local copy).
>      Note: Currently, we only implement local-to-local copy.
>   2) The hardware supports multiple HW-queues.
> 
> 
> Summary:
>   1) dpaa2/octeontx2/Kunpeng are all ARM SoCs, which may act as an endpoint of
>      an x86 host (e.g. smart NIC), so multiple memory transfer requirements
>      may exist, e.g. local-to-local/local-to-host... From the point of view of
>      API design, I think we should adopt a similar 'channel' or 'virt-queue'
>      concept.
>   2) Whether to create a separate dmadev for each HW-queue? We previously
>      discussed this, and because HW-queues can be managed independently (as in
>      Kunpeng_dma and Intel DSA), we preferred creating a separate dmadev for
>      each HW-queue. But I'm not sure if that's the case with dpaa. I think
>      that can be left to the specific driver; no restriction is imposed at the
>      framework API layer.
>   3) I think we could set up the following abstraction for a dmadev device:
>       ------------    ------------
>       |virt-queue|    |virt-queue|
>       ------------    ------------
>              \           /
>               \         /
>                \       /
>              ------------     ------------
>              | HW-queue |     | HW-queue |
>              ------------     ------------
>                     \            /
>                      \          /
>                       \        /
>                         dmadev
>   4) The driver's ops design (here we only list key points):
>      [dev_info_get]: mainly returns the number of HW-queues
>      [dev_configure]: nothing important
>      [queue_setup]: creates one virt-queue, has the following main parameters:
>          HW-queue-index: the HW-queue index used
>          nb_desc: the number of HW descriptors
>          opaque: driver-specific info
>          Note1: this API returns the virt-queue index, which will be used in
>                 later API calls. If the user wants to create multiple
>                 virt-queues on the same HW-queue, they can do so by calling
>                 queue_setup with the same HW-queue-index.
>          Note2: I think it's hard to define the queue_setup config parameters,
>                 and this is a control API, so I think it's OK to use an opaque
>                 pointer to implement it.
>      [dma_copy/memset/sg]: all have a vq_id input parameter.
>          Note: I notice dpaa can't support single and sg in one virt-queue, and
>                I think it may be a software implementation policy rather than
>                a HW restriction, because virt-queues can share the same
>                HW-queue.
>      Here we use vq_id to tackle different scenarios, like local-to-local/
>      local-to-host etc.
>   5) And the dmadev public data-plane API (just prototype):
>      dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
>        -- flags: used as an extended parameter, it could be uint32_t
>      dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)
>      dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
>        -- sg: struct dma_scatterlist array
>      uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
>                                    uint16_t nb_cpls, bool *has_error)
>        -- nb_cpls: indicates the max number of operations to process
>        -- has_error: indicates whether there is an error
>        -- return value: the number of successfully completed operations.
>        -- example:
>           1) If there are already 32 completed ops, the 4th is an error, and
>              nb_cpls is 32, then ret will be 3 (because the 1st/2nd/3rd are
>              OK), and has_error will be true.
>           2) If there are already 32 completed ops, and all completed
>              successfully, then ret will be min(32, nb_cpls), and has_error
>              will be false.
>           3) If there are already 32 completed ops, and all of them failed,
>              then ret will be 0, and has_error will be true.
>      uint16_t rte_dmadev_completed_status(dev_id, vq_id, dma_cookie_t *cookie,
>                                           uint16_t nb_status, uint32_t *status)
>        -- return value: the number of failed completed operations.
>      And here I agree with Morten: we should design an API which fits DPDK
>      service scenarios. So we don't support some sound-card DMA, or the 2D
>      memory copy which is mainly used in video scenarios.
>   6) The dma_cookie_t is a signed int type; a value <0 means error. It is
>      monotonically increasing per HW-queue (rather than per virt-queue). The
>      driver needs to ensure this because the dmadev framework doesn't manage
>      the dma_cookie's creation.
>   7) Because the data-plane APIs are not thread-safe, and the user determines
>      the virt-queue to HW-queue mapping (at the queue-setup stage), it is the
>      user's duty to ensure thread safety.
>   8) One example:
>      vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
>      if (vq_id < 0) {
>         // create virt-queue failed
>         return;
>      }
>      // submit memcpy task
>      cookie = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
>      if (cookie < 0) {
>         // submit failed
>         return;
>      }
IMO, rte_dmadev_memcpy should return the number of ops successfully
submitted; that makes it easier to re-submit if the previous batch was not
fully submitted.
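
As a rough sketch of this suggestion, under stated assumptions: copy_job,
mock_submit_burst() and submit_all() are hypothetical names used only for
illustration and are not part of the proposed dmadev API. The mock accepts at
most `room` jobs per call and returns how many it took, so the caller can
resume from the first unsubmitted job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy job; not part of the proposed dmadev API. */
struct copy_job {
    const void *src;
    void *dst;
    uint32_t len;
};

/* Hypothetical stand-in for a burst-style submit call: it accepts at most
 * `room` jobs per call and returns how many it accepted. */
static uint16_t
mock_submit_burst(const struct copy_job *jobs, uint16_t nb_jobs, uint16_t room)
{
    (void)jobs;
    return nb_jobs < room ? nb_jobs : room;
}

/* Submit all jobs, re-submitting the unsubmitted tail for up to max_tries
 * calls. Returns the total number of jobs submitted. */
static uint16_t
submit_all(const struct copy_job *jobs, uint16_t nb_jobs, uint16_t room,
           unsigned int max_tries)
{
    uint16_t done = 0;

    assert(jobs != NULL || nb_jobs == 0);
    while (done < nb_jobs && max_tries-- > 0) {
        uint16_t n = mock_submit_burst(&jobs[done],
                                       (uint16_t)(nb_jobs - done), room);
        if (n == 0)
            break;          /* queue full: give up instead of spinning */
        done = (uint16_t)(done + n);
    }
    return done;
}
```

With a submitted-count return value, the retry needs only the `done` offset;
with a single cookie-or-error return, the caller would have to track each op
separately.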
>      // get the completed task
>      ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, &has_error);
>      if (!has_error && ret == 1) {
>         // the memcpy completed successfully
>      }
>   9) As octeontx2_dma supports sg-lists which have many valid buffers in
>      dpi_dma_buf_ptr_s, it could call the rte_dmadev_memcpy_sg API.
>   10) As for ioat, it could declare support for one HW-queue at the
>       dev_configure stage, and only support creating one virt-queue.
>   11) As for dpaa2_qdma, I think it could migrate to the new framework, but
>       we still await feedback from the dpaa2_qdma guys.
>   12) About the prototype src/dst parameters of the rte_dmadev_memcpy API, we
>       have two candidates, iova and void *; how about introducing a
>       dma_addr_t type which could be va or iova?
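
The virt-queue to HW-queue mapping (items 3 and 4) and the per-HW-queue cookie
rule (item 6) in the quoted summary could be modelled roughly as below. This
is only a sketch under assumptions: dmadev_model, model_init(),
model_queue_setup() and model_submit() are hypothetical names, the limits are
arbitrary, and dma_cookie_t is assumed here to be int32_t. It is not the
proposed framework code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_HWQ 4   /* arbitrary limits for this sketch */
#define MAX_VQ  16

typedef int32_t dma_cookie_t;   /* signed: a negative value means error */

struct dmadev_model {
    uint16_t nb_hwq;                    /* HW-queues reported by the driver */
    uint16_t nb_vq;                     /* virt-queues created so far */
    uint16_t vq_to_hwq[MAX_VQ];         /* virt-queue -> HW-queue map */
    dma_cookie_t next_cookie[MAX_HWQ];  /* one counter per HW-queue */
};

static void
model_init(struct dmadev_model *d, uint16_t nb_hwq)
{
    assert(d != NULL && nb_hwq <= MAX_HWQ);
    memset(d, 0, sizeof(*d));
    d->nb_hwq = nb_hwq;
}

/* Create a virt-queue on the given HW-queue; returns the vq_id or -1. */
static int
model_queue_setup(struct dmadev_model *d, uint16_t hwq_index)
{
    if (hwq_index >= d->nb_hwq || d->nb_vq >= MAX_VQ)
        return -1;
    d->vq_to_hwq[d->nb_vq] = hwq_index;
    return d->nb_vq++;
}

/* Hand out the next cookie of the HW-queue behind vq_id; -1 on a bad vq_id. */
static dma_cookie_t
model_submit(struct dmadev_model *d, int vq_id)
{
    dma_cookie_t *next, c;

    if (vq_id < 0 || vq_id >= d->nb_vq)
        return -1;
    next = &d->next_cookie[d->vq_to_hwq[vq_id]];
    c = *next;
    *next = (c == INT32_MAX) ? 0 : c + 1;  /* wrap before going negative */
    return c;
}
```

Two virt-queues created on the same HW-queue draw from one counter, so their
cookies interleave, while a virt-queue on another HW-queue starts from its own
counter.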
> 
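
The return-value rules given for rte_dmadev_completed in item 5 of the quoted
summary can be modelled in a few lines. This is a minimal sketch, not the
proposed implementation: count_completed() and the `ok` array (one flag per
already-completed op) are hypothetical. It also assumes that an error lying
beyond nb_cpls is not reported until a later call, which the summary does not
specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ok[i] is true if the i-th already-completed op succeeded.
 * Returns the number of successful completions before the first error,
 * looking at no more than nb_cpls ops; sets *has_error if an error was hit. */
static uint16_t
count_completed(const bool *ok, uint16_t nb_done, uint16_t nb_cpls,
                bool *has_error)
{
    uint16_t limit = nb_done < nb_cpls ? nb_done : nb_cpls;
    uint16_t n = 0;

    assert(has_error != NULL);
    *has_error = false;
    while (n < limit) {
        if (!ok[n]) {
            *has_error = true;  /* stop at the first failed op */
            break;
        }
        n++;
    }
    return n;
}
```

Run against the three examples in the summary, this gives 3 with has_error
set, min(32, nb_cpls) with has_error clear, and 0 with has_error set.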
