From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Feifei Wang" <Feifei.Wang2@arm.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Xing,  Beilei" <beilei.xing@intel.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>
Subject: RE: Re: [RFC PATCH v1 0/4] Direct re-arming of buffers on receive side
Date: Wed, 2 Feb 2022 19:46:43 +0000	[thread overview]
Message-ID: <DBAPR08MB58147EAE54ECA9B5423CDB9898279@DBAPR08MB5814.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <DM6PR11MB44916CAEBB72B2A3F662E8419A219@DM6PR11MB4491.namprd11.prod.outlook.com>

<snip>

> 
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > Sent: Tuesday, 18 January 2022 17.54
> > > >
> > > > [quick summary: ethdev API to bypass mempool]
> > > >
> > > > 18/01/2022 16:51, Ferruh Yigit:
> > > > > On 12/28/2021 6:55 AM, Feifei Wang wrote:
> > > > > > Morten Brørup <mb@smartsharesystems.com>:
> > > > > >> The patch provides a significant performance improvement, but I am
> > > > > >> wondering if any real-world applications exist that would use this.
> > > > > >> Only a "router on a stick" (i.e. a single-port router) comes to my
> > > > > >> mind, and that is probably sufficient to call it useful in the real
> > > > > >> world. Do you have any other examples to support the usefulness of
> > > > > >> this patch?
> > > > > >>
> > > > > > One case I have is about network security. For a network firewall,
> > > > > > all packets need to ingress on the specified port and egress on the
> > > > > > specified port to do packet filtering. In this case, we can know the
> > > > > > flow direction in advance.
> > > > >
> > > > > I also have some concerns about how useful this API will be in real
> > > > > life, and whether the use case is worth the complexity it brings. It
> > > > > also looks like too much low-level detail for the application.
> > > >
> > > > That's difficult to judge.
> > > > The use case is limited and the API has some severe limitations.
> > > > The benefit is measured with l3fwd, which is not exactly a real app.
> > > > Do we want an API which improves performance in limited scenarios
> > > > at the cost of breaking some general design assumptions?
> > > >
> > > > Can we achieve the same level of performance with a mempool trick?
> > >
> > > Perhaps the mbuf library could offer bulk functions for alloc/free
> > > of raw mbufs - essentially a shortcut directly to the mempool library.
> > >
> > > There might be a few more details to micro-optimize in the mempool
> > > library, if approached with this use case in mind. E.g. the
> > > rte_mempool_default_cache() could do with a few unlikely() in its
> > > comparisons.
> > >
> > > Also, for this use case, the mempool library adds tracing overhead,
> > > which this API bypasses. And considering how short the code path
> > > through the mempool cache is, the tracing overhead is relatively large.
> > > I.e.: memcpy(NIC->NIC) vs. trace() + memcpy(NIC->cache) + trace() +
> > > memcpy(cache->NIC).
> > >
> > > A key optimization point could be the number of mbufs being moved
> > > to/from the mempool cache. If that number was fixed at compile time,
> > > a faster
> > > memcpy() could be used. However, it seems that different PMDs use
> > > bursts of either 4, 8, or in this case 32 mbufs. If only they could
> > > agree on such a simple detail.
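
Just to illustrate the compile-time fixed burst size point above, a minimal
sketch; the constant and helper name are made up for illustration, not
existing PMD or mempool code:

/* With a burst size known at compile time, the pointer copy to/from the
 * mempool cache becomes a fixed-size memcpy() that the compiler can fully
 * unroll into a few vector loads/stores. */
#include <string.h>
#include <rte_mbuf.h>

#define CACHE_BURST 32	/* would need all PMDs to agree on 4, 8 or 32 */

static inline void
copy_burst_fixed(struct rte_mbuf **dst, struct rte_mbuf * const *src)
{
	memcpy(dst, src, CACHE_BURST * sizeof(*src));
}
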
> > This patch removes the stores and loads, which saves backend cycles. I do
> > not think other optimizations can do the same.
> 
> My thought here was that we can try to introduce a ZC (zero-copy) API for the
> mempool cache, similar to the one we have for the ring.
> Then on the TX free path we wouldn't need to copy the mbufs to be freed into a
> temporary array on the stack.
> Instead we can put them straight from the TX SW ring into the mempool cache.
> That should save an extra store/load per mbuf and might help to achieve some
> performance gain without bypassing the mempool.
Agree, it will remove one set of loads and stores, but not all of them. I am not sure if it can solve the performance problems. We will give it a try.
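
To make sure we are reading the suggestion the same way, below is a rough
sketch of the idea. The mempool-cache ZC helper is hypothetical - only
rte_mempool_default_cache() exists today - so the names and the fallback
policy are assumptions for discussion, not a proposal for the final API.

/* Rough sketch only: a hypothetical zero-copy "put" into the per-lcore
 * mempool cache, so mbuf pointers can go straight from the TX SW ring into
 * the cache without an intermediate array on the stack. */
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>
#include <rte_mempool.h>

/* Reserve room for n objects directly in the default cache; return NULL if
 * the burst does not fit, so the caller falls back to the regular path. */
static inline void **
mempool_cache_zc_put_bulk_sketch(struct rte_mempool *mp, unsigned int n)
{
	struct rte_mempool_cache *cache =
		rte_mempool_default_cache(mp, rte_lcore_id());

	if (cache == NULL || cache->len + n > cache->flushthresh)
		return NULL;

	void **slot = &cache->objs[cache->len];
	cache->len += n;
	return slot;	/* caller writes mbuf pointers straight into the cache */
}

/* TX free path: one copy, SW ring -> mempool cache, no stack array. */
static inline void
tx_free_bufs_zc_sketch(struct rte_mempool *mp, struct rte_mbuf **sw_ring,
		       unsigned int n)
{
	void **dst = mempool_cache_zc_put_bulk_sketch(mp, n);

	if (dst != NULL)
		rte_memcpy(dst, sw_ring, n * sizeof(*sw_ring));
	else
		rte_mempool_put_bulk(mp, (void **)sw_ring, n);
}

Whether to fall back to rte_mempool_put_bulk() or flush part of the cache when
the burst does not fit is an open question; the sketch is only meant to anchor
the discussion.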

> 
> >
> > >
> > > Overall, I strongly agree that it is preferable to optimize the core
> > > libraries, rather than bypass them. Bypassing will eventually lead to
> > > "spaghetti code".
> > IMO, this is not "spaghetti code". There is no design rule in DPDK that says
> > the RX side must allocate buffers from a mempool or that the TX side must
> > free buffers to a mempool. This patch does not break any modular boundaries,
> > for example by accessing the internal details of another library.
> 
> I also have a few concerns about that approach:
> - The proposed implementation breaks the logical boundary between the RX and
>   TX code. Right now they co-exist independently, and the design of the TX
>   path doesn't directly affect the RX path and vice versa. With the proposed
>   approach, the RX path needs to be aware of TX queue details and the mbuf
>   freeing strategy. So if we decide to change the TX code, we probably would
>   not be able to do that without affecting the RX path.
Agree that the two paths will now be coupled in the areas you have mentioned. However, this happens within the driver code; from the application perspective, the two paths still remain separate. Also, the TX free strategy has not changed much.

>   That probably can be fixed by formalizing things a bit more, by introducing
>   a new dev-ops API:
>   eth_dev_tx_queue_free_mbufs(port id, queue id, mbufs_to_free[], ...)
>   But that would probably eat up a significant portion of the gain you are
>   seeing right now.
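
If I read this correctly, the formalized hook would look roughly like the
sketch below. Everything here is hypothetical - neither the callback type nor
the wrapper exists in ethdev today - and it is only meant to make the extra
indirection you mention visible.

#include <stdint.h>
#include <rte_mbuf.h>

/* Driver callback: take a burst of mbufs the RX side has finished with and
 * recycle/free them into the TX queue; returns the number consumed. */
typedef uint16_t (*eth_tx_queue_free_mbufs_t)(void *txq,
		struct rte_mbuf **mbufs_to_free, uint16_t nb_mbufs);

struct tx_free_hook {
	void *txq;				/* driver's TX queue object */
	eth_tx_queue_free_mbufs_t free_cb;	/* NULL if not supported */
};

/* Generic fallback when the driver does not implement the hook: free the
 * mbufs back to their mempool the normal way. */
static uint16_t
tx_queue_free_mbufs(struct tx_free_hook *hook,
		    struct rte_mbuf **mbufs_to_free, uint16_t nb_mbufs)
{
	if (hook->free_cb != NULL)
		return hook->free_cb(hook->txq, mbufs_to_free, nb_mbufs);

	rte_pktmbuf_free_bulk(mbufs_to_free, nb_mbufs);
	return nb_mbufs;
}
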
> 
> - Very limited usage scenario - it will have a positive effect only when we
>   have a fixed forwarding mapping: all (or nearly all) packets from the RX
>   queue are forwarded into the same TX queue.
Agree, it is limited to a few scenarios. But the scenario itself is a major one.

>   Even for l3fwd it doesn't look like a generic scenario.
I think it is possible to have some logic (based on the port mask and the routes involved) to enable this feature. We will try to add that in the next version.
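
Roughly, the kind of check we have in mind is sketched below (illustrative
only: the route table, the field names, and the enable decision are
assumptions, not l3fwd code).

#include <stdbool.h>
#include <stdint.h>

struct route_entry {
	uint16_t rx_port;	/* port the flow ingresses on */
	uint16_t tx_port;	/* port the route egresses on */
};

/* Enable direct re-arm only for an RX port whose configured routes all
 * resolve to one and the same enabled TX port. */
static bool
can_enable_direct_rearm(const struct route_entry *routes,
			unsigned int nb_routes, uint16_t rx_port,
			uint32_t enabled_port_mask, uint16_t *tx_port_out)
{
	int tx_port = -1;

	for (unsigned int i = 0; i < nb_routes; i++) {
		if (routes[i].rx_port != rx_port)
			continue;
		if (tx_port == -1)
			tx_port = routes[i].tx_port;
		else if (tx_port != routes[i].tx_port)
			return false;	/* more than one egress port */
	}

	/* 32-bit port mask, as in l3fwd's enabled_port_mask */
	if (tx_port < 0 || tx_port >= 32 ||
	    !((enabled_port_mask >> tx_port) & 1))
		return false;

	*tx_port_out = (uint16_t)tx_port;
	return true;
}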

> 
> - We effectively link the RX and TX queues - when this feature is enabled,
>   the user can't stop the TX queue without stopping the RX queue first.
Agree. How much of an issue is this? I would think when the application is shutting down, one would stop the RX side first. Are there any other scenarios we need to be aware of?

> 
> 

