From: fengchengwen <fengchengwen@huawei.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Jerin Jacob" <jerinjacobk@gmail.com>,
	"Jerin Jacob" <jerinj@marvell.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"Nipun Gupta" <nipun.gupta@nxp.com>
Cc: Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@intel.com>, dpdk-dev <dev@dpdk.org>,
	Nipun Gupta <nipun.gupta@nxp.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
	David Marchand <david.marchand@redhat.com>,
	Satananda Burla <sburla@marvell.com>,
	Prasun Kapoor <pkapoor@marvell.com>
Subject: [dpdk-dev] dmadev discussion summary
Date: Sat, 26 Jun 2021 11:59:49 +0800
Message-ID: <c4a0ee30-f7b8-f8a1-463c-8eedaec82aea@huawei.com>
In-Reply-To: <e3735fa2-0416-1f9f-ff0f-1a259c675248@huawei.com>

Hi, all
  I analyzed the current DPDK DMA drivers and drew up this summary in
conjunction with the previous discussion; it will serve as a basis for the V2
implementation.
  Feedback is welcome, thanks


dpaa2_qdma:
  [probe]: mainly obtains the number of hardware queues.
  [dev_configure]: has the following parameters:
      max_hw_queues_per_core:
      max_vqs: max number of virt-queues
      fle_queue_pool_cnt: the size of the FLE pool
  [queue_setup]: sets up one virt-queue, with the following parameters:
      lcore_id:
      flags: some control params, e.g. sg-list, long-format desc, exclusive HW
             queue...
      rbp: some misc fields which affect the descriptor
      Note: this API returns the index of the virt-queue which was
            successfully set up.
  [enqueue_bufs]: data-plane API, the key fields:
      vq_id: the index of the virt-queue
      job: the pointer to the job array
      nb_jobs:
      Note: one job has src/dest/len/flag/cnxt/status/vq_id/use_elem fields;
            the flag field indicates whether src/dst are PHY addresses.
  [dequeue_bufs]: gets pointers to the completed jobs

  [key point]:
      ------------    ------------
      |virt-queue|    |virt-queue|
      ------------    ------------
             \           /
              \         /
               \       /
             ------------     ------------
             | HW-queue |     | HW-queue |
             ------------     ------------
                    \            /
                     \          /
                      \        /
                      core/rawdev
      1) In the probe stage, the driver reports how many HW-queues can be
         used.
      2) The user can specify the maximum number of HW-queues managed by a
         single core in the dev_configure stage.
      3) The user can create one virt-queue with the queue_setup API; the
         virt-queue has two types: a) exclusive HW-queue, b) shared HW-queue
         (as described above), selected by the corresponding bit of the flags
         field.
      4) In this mode, queue management is simplified. The user does not need
         to pick a specific HW-queue and create a virt-queue on it; all the
         user has to say is on which core the virt-queue should be created.
      5) The virt-queues can have different capabilities, e.g. virt-queue-0
         supports the scatter-gather format while virt-queue-1 doesn't; this
         is controlled by the flags and rbp fields at the queue_setup stage.
      6) The data-plane API uses definitions similar to rte_mbuf and
         rte_eth_rx/tx_burst().
      PS: I still don't understand how sg-list enqueue/dequeue works, nor how
          the user is supposed to use RTE_QDMA_VQ_NO_RESPONSE.

      Overall, I think it's a flexible and scalable design. In particular, the
      queue resource pool architecture simplifies user invocations, although
      the 'core' parameter is introduced a bit abruptly. A rough usage sketch
      follows below.
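
      To make the flow above concrete, here is a rough pseudo-C sketch. The
      names (qdma_dev_configure, qdma_queue_setup, qdma_enqueue_bufs, the
      QDMA_* flags, the struct qdma_job fields) are paraphrased from the
      descriptions above, not the exact dpaa2_qdma symbols, so treat it only
      as an illustration of the virt-queue model:

          /* illustrative only, symbol names are assumed */
          struct qdma_config cfg = {
              .max_hw_queues_per_core = 2,
              .max_vqs = 4,
              .fle_queue_pool_cnt = 1024,
          };
          qdma_dev_configure(dev_id, &cfg);

          /* create one virt-queue bound to this lcore; flags select e.g.
           * an exclusive HW-queue or sg-list support */
          int vq_id = qdma_queue_setup(dev_id, rte_lcore_id(),
                                       QDMA_VQ_EXCLUSIVE_PQ, &rbp);
          if (vq_id < 0)
              return vq_id;

          /* one job carries src/dest/len/flag/cnxt/status/vq_id fields */
          struct qdma_job job = {
              .src = src_iova, .dest = dst_iova, .len = len,
              .flags = QDMA_JOB_SRC_PHY | QDMA_JOB_DEST_PHY,
              .vq_id = vq_id,
          };
          struct qdma_job *jobs[] = { &job };
          qdma_enqueue_bufs(dev_id, vq_id, jobs, 1);

          /* later: fetch pointers to completed jobs and check job->status */
          struct qdma_job *done[8];
          int nb = qdma_dequeue_bufs(dev_id, vq_id, done, 8);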


octeontx2_dma:
  [dev_configure]: has one parameter:
      chunk_pool: it's strange that this is not managed internally by the
                  driver but is passed in through the API.
  [enqueue_bufs]: has three important parameters:
      context: this is what Jerin referred to as the 'channel'; it holds the
               completion ring for the jobs.
      buffers: holds the pointer array of dpi_dma_buf_ptr_s
      count: how many dpi_dma_buf_ptr_s
      Note: one dpi_dma_buf_ptr_s may have many src and dst pairs (it's a
            scatter-gather list) and has one completed_ptr (when the HW
            completes, it writes a value through this pointer); currently the
            completed_ptr points to the following struct:
                struct dpi_dma_req_compl_s {
                    uint64_t cdata;  /* driver inits; HW writes the result */
                    void (*compl_cb)(void *dev, void *arg);
                    void *cb_data;
                };
  [dequeue_bufs]: has two important parameters:
      context: the driver scans its completion ring to get completion info.
      buffers: holds the pointer array of completed_ptr.

  [key point]:
      -----------    -----------
      | channel |    | channel |
      -----------    -----------
             \           /
              \         /
               \       /
             ------------
             | HW-queue |
             ------------
                   |
                --------
                |rawdev|
                --------
      1) The user can create one channel by initializing a context
         (dpi_dma_queue_ctx_s); this interface is not standardized and needs
         to be implemented by users.
      2) Different channels can support different transfer types, e.g. one for
         internal mem-to-mem and another for inbound copy.

      Overall, I think the 'channel' is similar to the 'virt-queue' of
      dpaa2_qdma. The difference is that dpaa2_qdma supports multiple hardware
      queues. The 'channel' has the following properties:
      1) A channel is an operable unit at the user level. The user can create
         a channel for each transfer type, for example a local-to-local
         channel and a local-to-host channel. The user can also query the
         completion status of one channel.
      2) Multiple channels can run on the same HW-queue. In terms of API
         design, this reduces the number of data-plane API parameters: the
         channel can carry context info which the data-plane APIs refer to
         when executing. A rough sketch follows below.
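
      To illustrate, a rough pseudo-C sketch of driving one 'channel'. Only
      dpi_dma_queue_ctx_s and dpi_dma_req_compl_s are quoted from above; the
      dpi_dma_buf_ptr_s field handling and the use of the generic rawdev
      enqueue/dequeue calls are assumptions, not the exact driver interface:

          /* illustrative only */
          struct dpi_dma_queue_ctx_s ctx = { 0 };  /* one 'channel' */
          /* ... set the transfer type, e.g. inner m2m or inbound ... */

          struct dpi_dma_req_compl_s comp = { 0 }; /* HW writes comp.cdata */
          struct dpi_dma_buf_ptr_s buf = { 0 };
          /* ... fill the src/dst pointer pairs (sg-list) and point the
           * completed_ptr of buf at &comp ... */

          struct rte_rawdev_buf rbuf = { .buf_addr = &buf };
          struct rte_rawdev_buf *bufs[] = { &rbuf };
          rte_rawdev_enqueue_buffers(dev_id, bufs, 1, &ctx); /* enqueue_bufs */

          /* dequeue_bufs: the driver scans the channel's completion ring and
           * returns the completed_ptr pointers; comp.cdata holds the result */
          rte_rawdev_dequeue_buffers(dev_id, bufs, 1, &ctx);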


ioat:
  [probe]: creates multiple rawdevs if it's a DSA device with multiple
           HW-queues.
  [dev_configure]: has three parameters:
      ring_size: the size of the HW descriptor ring
      hdls_disable: whether to ignore the user-supplied handle params
      no_prefetch_completions:
  [rte_ioat_enqueue_copy]: has dev_id/src/dst/length/src_hdl/dst_hdl parameters.
  [rte_ioat_completed_ops]: has dev_id/max_copies/status/num_unsuccessful/
                            src_hdls/dst_hdls parameters.

  Overall, it is one rawdev per HW-queue, and there is no multi-'channel'
  concept like in octeontx2_dma. A usage sketch follows below.
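
  A minimal usage sketch based on the parameter lists above (assuming the
  21.05-era rte_ioat_rawdev.h; the exact signatures differ between DPDK
  releases):

      #include <rte_ioat_rawdev.h>

      /* src/dst are bus addresses; the *_hdl values are opaque user handles
       * returned again on completion (unless hdls_disable was set) */
      if (rte_ioat_enqueue_copy(dev_id, src_iova, dst_iova, length,
                                (uintptr_t)src_mbuf, (uintptr_t)dst_mbuf) == 0)
          return -ENOSPC;            /* descriptor ring full */
      rte_ioat_perform_ops(dev_id);  /* ring the doorbell, start the copies */

      /* poll for completions; per-op status and the handles are returned */
      uint32_t status[32];
      uint8_t num_fail = 0;
      uintptr_t src_hdls[32], dst_hdls[32];
      int nb = rte_ioat_completed_ops(dev_id, 32, status, &num_fail,
                                      src_hdls, dst_hdls);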


Kunpeng_dma:
  1) The hardware supports multiple modes (e.g. local-to-local/
     local-to-pciehost/pciehost-to-local/immediate-to-local copy).
     Note: Currently, we only implement local-to-local copy.
  2) The hardware supports multiple HW-queues.


Summary:
  1) dpaa2/octeontx2/Kunpeng are all ARM SoCs which may act as endpoints of an
     x86 host (e.g. in a smart NIC), so multiple memory transfer requirements
     may exist, e.g. local-to-local, local-to-host, etc. From the point of
     view of API design, I think we should adopt a similar 'channel' or
     'virt-queue' concept.
  2) Whether to create a separate dmadev for each HW-queue? We discussed this
     previously, and since HW-queues can be managed independently (as in
     Kunpeng_dma and Intel DSA), we preferred to create a separate dmadev for
     each HW-queue. But I'm not sure if that's the case with dpaa. I think
     this can be left to the specific driver; no restriction is imposed at the
     framework API layer.
  3) I think we could set up the following abstraction at the dmadev level:
      ------------    ------------
      |virt-queue|    |virt-queue|
      ------------    ------------
             \           /
              \         /
               \       /
             ------------     ------------
             | HW-queue |     | HW-queue |
             ------------     ------------
                    \            /
                     \          /
                      \        /
                        dmadev
  4) The driver's ops design (here we only list the key points):
     [dev_info_get]: mainly returns the number of HW-queues
     [dev_configure]: nothing important
     [queue_setup]: creates one virt-queue, with the following main
                    parameters:
         HW-queue-index: the index of the HW-queue used
         nb_desc: the number of HW descriptors
         opaque: driver-specific info
         Note1: this API returns the virt-queue index, which will be used in
                later APIs. If the user wants to create multiple virt-queues
                on the same HW-queue, this can be achieved by calling
                queue_setup with the same HW-queue-index.
         Note2: I think it's hard to define the queue_setup config parameters,
                and since this is a control API, I think it's OK to use an
                opaque pointer to implement it.
     [dma_copy/memset/sg]: all have a vq_id input parameter.
         Note: I notice dpaa can't support single-copy and sg in one
               virt-queue, and I think that's maybe a software implementation
               policy rather than a HW restriction, because virt-queues can
               share the same HW-queue.
     Here we use vq_id to handle the different scenarios, e.g. local-to-local,
     local-to-host, etc. A sketch of these ops follows below.
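
     A sketch of what these driver ops could look like as a function-pointer
     table (names and types are placeholders for the proposal, not a final
     definition):

         struct rte_dmadev_ops {
             /* mainly reports the number of HW-queues */
             int (*dev_info_get)(struct rte_dmadev *dev,
                                 struct rte_dmadev_info *info);
             int (*dev_configure)(struct rte_dmadev *dev,
                                  const struct rte_dmadev_conf *conf);
             /* returns the virt-queue index, or < 0 on error; calling it
              * again with the same hw_queue_index creates another
              * virt-queue on that HW-queue */
             int (*queue_setup)(struct rte_dmadev *dev,
                                uint16_t hw_queue_index, uint16_t nb_desc,
                                void *opaque);
         };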
  5) And the dmadev public data-plane API (just prototype):
     dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
       -- flags: used as an extended parameter, it could be uint32_t
     dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)
     dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
       -- sg: struct dma_scatterlist array
     uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
                                   uint16_t nb_cpls, bool *has_error)
       -- nb_cpls: the max number of completed operations to process
       -- has_error: indicates whether an error occurred
       -- return value: the number of successfully completed operations.
       -- example:
          1) If there are already 32 completed ops, the 4th is an error, and
             nb_cpls is 32, then the return will be 3 (because the 1st/2nd/3rd
             are OK) and has_error will be true.
          2) If there are already 32 completed ops and all completed
             successfully, then the return will be min(32, nb_cpls) and
             has_error will be false.
          3) If there are already 32 completed ops and all of them failed,
             then the return will be 0 and has_error will be true.
     uint16_t rte_dmadev_completed_status(dev_id, vq_id, dma_cookie_t *cookie,
                                          uint16_t nb_status, uint32_t *status)
       -- return value: the number of failed completed operations.
     And here I agree with Morten: we should design an API that fits DPDK
     service scenarios. So we don't support things like sound-card DMA, nor
     2D memory copy, which is mainly used in video scenarios.
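
     Written out as a header sketch, the prototypes above could look roughly
     as follows. The concrete types (dma_cookie_t as int32_t, the
     dma_scatterlist layout, dev as a dev_id) are still open and only assumed
     here:

         typedef int32_t dma_cookie_t;     /* < 0 means submission error */

         struct dma_scatterlist {          /* assumed layout */
             void *src;
             void *dst;
             uint32_t length;
         };

         dma_cookie_t rte_dmadev_memset(uint16_t dev_id, uint16_t vq_id,
                                        uint64_t pattern, void *dst,
                                        uint32_t len, uint32_t flags);
         dma_cookie_t rte_dmadev_memcpy(uint16_t dev_id, uint16_t vq_id,
                                        void *src, void *dst,
                                        uint32_t len, uint32_t flags);
         dma_cookie_t rte_dmadev_memcpy_sg(uint16_t dev_id, uint16_t vq_id,
                                           const struct dma_scatterlist *sg,
                                           uint32_t sg_len, uint32_t flags);
         uint16_t rte_dmadev_completed(uint16_t dev_id, uint16_t vq_id,
                                       dma_cookie_t *cookie, uint16_t nb_cpls,
                                       bool *has_error);
         uint16_t rte_dmadev_completed_status(uint16_t dev_id, uint16_t vq_id,
                                              dma_cookie_t *cookie,
                                              uint16_t nb_status,
                                              uint32_t *status);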
  6) The dma_cookie_t is a signed int type; when < 0 it means error. It
     increases monotonically per HW-queue (rather than per virt-queue). The
     driver needs to guarantee this because the dmadev framework doesn't
     manage the dma_cookie's creation, as in the sketch below.
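
     A minimal sketch of point 6 from the driver side, assuming the driver
     keeps a per-HW-queue submission counter (names are illustrative):

         /* the cookie is just the HW-queue's monotonically increasing
          * submission index, shared by all virt-queues mapped onto it */
         struct hw_queue {
             uint32_t submitted;
         };

         static dma_cookie_t
         driver_submit(struct hw_queue *hq /* , descriptor args */)
         {
             /* ... write the descriptor and ring the doorbell ... */
             dma_cookie_t cookie = (dma_cookie_t)(hq->submitted & 0x7fffffff);
             hq->submitted++;
             return cookie;   /* never negative, so never read as an error */
         }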
  7) Because the data-plane APIs are not thread-safe, and the user determines
     the virt-queue to HW-queue mapping (at the queue_setup stage), it is the
     user's duty to ensure thread safety.
  8) One example:
     vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
     if (vq_id < 0) {
        // create virt-queue failed
        return;
     }
     // submit memcpy task
     cookie = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
     if (cookie < 0) {
        // submit failed
        return;
     }
     // get complete task
     ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, &has_error);
     if (!has_error && ret == 1) {
        // the memcpy completed successfully
     }
  9) As octeontx2_dma supports sg-lists which have many valid buffers in
     dpi_dma_buf_ptr_s, it could use the rte_dmadev_memcpy_sg API.
  10) As for ioat, it could declare support for one HW-queue at the
      dev_configure stage and only support creating one virt-queue.
  11) As for dpaa2_qdma, I think it could migrate to the new framework, but
      I'm still waiting for feedback from the dpaa2_qdma guys.
  12) About the src/dst parameters of the rte_dmadev_memcpy prototype, we have
      two candidates, iova and void *; how about introducing a dma_addr_t type
      which could hold either a VA or an IOVA? (A sketch follows below.)
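
      One possible shape of such a type, only as an assumption of what is
      being suggested:

          /* wide enough to hold either a VA or an IOVA; which one it holds
           * would be indicated per virt-queue or via flags */
          typedef uint64_t dma_addr_t;

          dma_cookie_t rte_dmadev_memcpy(uint16_t dev_id, uint16_t vq_id,
                                         dma_addr_t src, dma_addr_t dst,
                                         uint32_t len, uint32_t flags);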

