From: Morten Brørup
To: "Feifei Wang", "Thomas Monjalon", "Ferruh Yigit", "Andrew Rybchenko"
Cc: "Honnappa Nagarahalli", "Ruifeng Wang"
Subject: RE: [PATCH v6 1/4] ethdev: add API for mbufs recycle mode
Date: Thu, 25 May 2023 17:08:45 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8794D@smartserver.smartshare.dk>
In-Reply-To: <20230525094541.331338-2-feifei.wang2@arm.com>
References: <20211224164613.32569-1-feifei.wang2@arm.com> <20230525094541.331338-1-feifei.wang2@arm.com> <20230525094541.331338-2-feifei.wang2@arm.com>
List-Id: DPDK patches and discussions

> From: Feifei Wang [mailto:feifei.wang2@arm.com]
> Sent: Thursday, 25 May 2023 11.46
>
> Add 'rte_eth_recycle_rx_queue_info_get' and 'rte_eth_recycle_mbufs'
> APIs to recycle used mbufs from a transmit queue of an Ethernet device,
> and move these mbufs into a mbuf ring for a receive queue of an Ethernet
> device. This can bypass mempool 'put/get' operations hence saving CPU
> cycles.
>
> For each recycling mbufs, the rte_eth_recycle_mbufs() function performs
> the following operations:
> - Copy used *rte_mbuf* buffer pointers from Tx mbuf ring into Rx mbuf
> ring.
> - Replenish the Rx descriptors with the recycling *rte_mbuf* mbufs freed
> from the Tx mbuf ring.
>
> Suggested-by: Honnappa Nagarahalli
> Suggested-by: Ruifeng Wang
> Signed-off-by: Feifei Wang
> Reviewed-by: Ruifeng Wang
> Reviewed-by: Honnappa Nagarahalli
> ---

[...]

> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 2c9d615fb5..c6723d5277 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -59,6 +59,10 @@ struct rte_eth_dev {
>  	eth_rx_descriptor_status_t rx_descriptor_status;
>  	/** Check the status of a Tx descriptor */
>  	eth_tx_descriptor_status_t tx_descriptor_status;
> +	/** Pointer to PMD transmit mbufs reuse function */
> +	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> +	/** Pointer to PMD receive descriptors refill function */
> +	eth_recycle_rx_descriptors_refill_t recycle_rx_descriptors_refill;
>
>  	/**
>  	 * Device data that is shared between primary and secondary processes

The rte_eth_dev struct currently looks like this:

/**
 * @internal
 * The generic data structure associated with each Ethernet device.
 *
 * Pointers to burst-oriented packet receive and transmit functions are
 * located at the beginning of the structure, along with the pointer to
 * where all the data elements for the particular device are stored in shared
 * memory. This split allows the function pointer and driver data to be per-
 * process, while the actual configuration data for the device is shared.
 */
struct rte_eth_dev {
	eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function */
	eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function */

	/** Pointer to PMD transmit prepare function */
	eth_tx_prep_t tx_pkt_prepare;
	/** Get the number of used Rx descriptors */
	eth_rx_queue_count_t rx_queue_count;
	/** Check the status of a Rx descriptor */
	eth_rx_descriptor_status_t rx_descriptor_status;
	/** Check the status of a Tx descriptor */
	eth_tx_descriptor_status_t tx_descriptor_status;

	/**
	 * Device data that is shared between primary and secondary processes
	 */
	struct rte_eth_dev_data *data;
	void *process_private; /**< Pointer to per-process device data */
	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
	struct rte_device *device; /**< Backing device */
	struct rte_intr_handle *intr_handle; /**< Device interrupt handle */

	/** User application callbacks for NIC interrupts */
	struct rte_eth_dev_cb_list link_intr_cbs;
	/**
	 * User-supplied functions called from rx_burst to post-process
	 * received packets before passing them to the user
	 */
	struct rte_eth_rxtx_callback *post_rx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
	/**
	 * User-supplied functions called from tx_burst to pre-process
	 * received packets before passing them to the driver for transmission
	 */
	struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];

	enum rte_eth_dev_state state; /**< Flag indicating the port state */
	void *security_ctx; /**< Context for security ops */
} __rte_cache_aligned;

Inserting the two new function pointers (recycle_tx_mbufs_reuse and
recycle_rx_descriptors_refill) as the 7th and 8th fields will move the
'data' and 'process_private' pointers out of the first cache line.
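
To make the offsets concrete, here is a minimal sketch, assuming 8-byte
pointers and 64-byte cache lines. The names dev_before, dev_after,
fn_ptr_t, cache_line_of() and print_layout() are made up for this
illustration; the structs only mirror the leading fields quoted above
and are not the real ethdev_driver.h definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 64-byte cache lines and 8-byte pointers (e.g. x86-64, aarch64). */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for the eth_*_t function pointer types. */
typedef void (*fn_ptr_t)(void);

/* Leading fields of struct rte_eth_dev before the patch. */
struct dev_before {
	fn_ptr_t rx_pkt_burst;
	fn_ptr_t tx_pkt_burst;
	fn_ptr_t tx_pkt_prepare;
	fn_ptr_t rx_queue_count;
	fn_ptr_t rx_descriptor_status;
	fn_ptr_t tx_descriptor_status;
	void *data;
	void *process_private;
};

/* Same fields with the two recycle pointers inserted as the 7th and 8th. */
struct dev_after {
	fn_ptr_t rx_pkt_burst;
	fn_ptr_t tx_pkt_burst;
	fn_ptr_t tx_pkt_prepare;
	fn_ptr_t rx_queue_count;
	fn_ptr_t rx_descriptor_status;
	fn_ptr_t tx_descriptor_status;
	fn_ptr_t recycle_tx_mbufs_reuse;
	fn_ptr_t recycle_rx_descriptors_refill;
	void *data;
	void *process_private;
};

/*
 * Index of the cache line a byte offset falls into, counted from the
 * start of a cache-aligned structure.
 */
size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE_SIZE;
}

/* Print where 'data' and 'process_private' end up in both layouts. */
void print_layout(void)
{
	printf("before: data at %zu (line %zu), process_private at %zu (line %zu)\n",
	       offsetof(struct dev_before, data),
	       cache_line_of(offsetof(struct dev_before, data)),
	       offsetof(struct dev_before, process_private),
	       cache_line_of(offsetof(struct dev_before, process_private)));
	printf("after:  data at %zu (line %zu), process_private at %zu (line %zu)\n",
	       offsetof(struct dev_after, data),
	       cache_line_of(offsetof(struct dev_after, data)),
	       offsetof(struct dev_after, process_private),
	       cache_line_of(offsetof(struct dev_after, process_private)));
}
```

Under those assumptions, calling print_layout() from a small main()
shows 'data' at offset 48 before the patch and at offset 64 after it,
i.e. at the start of the second cache line.
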
If those data pointers are used in the fast path with the rx_pkt_burst
and tx_pkt_burst functions, moving them to a different cache line might
have a performance impact on those two functions.

Disclaimer: This is a big "if", and wild speculation from me, because I
haven't looked at it in detail! If this structure is not used in the
fast path like this, you can ignore my suggestion below.

Please consider moving the 'data' and 'process_private' pointers to the
beginning of this structure, so they are kept in the same cache line as
the rx_pkt_burst and tx_pkt_burst function pointers.

I don't know the relative importance of the remaining six fast path
functions (the four existing ones plus the two new ones in this patch),
so you could also rearrange those, so the least important two functions
are moved out of the first cache line. It doesn't have to be the two
recycle functions that go into a different cache line.

-Morten
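
PS: One possible reading of the reordering suggested above, as a rough
sketch only. It assumes 8-byte pointers and 64-byte cache lines, and it
places the two data pointers directly after the two burst function
pointers, which is just one of several orderings that would satisfy the
suggestion. The names dev_reordered, fn_ptr_t and
fits_first_cache_line() are hypothetical and not part of DPDK.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines and 8-byte pointers. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for the eth_*_t function pointer types. */
typedef void (*fn_ptr_t)(void);

/* Leading fields with 'data' and 'process_private' moved up front. */
struct dev_reordered {
	fn_ptr_t rx_pkt_burst;
	fn_ptr_t tx_pkt_burst;
	void *data;
	void *process_private;
	fn_ptr_t tx_pkt_prepare;
	fn_ptr_t rx_queue_count;
	fn_ptr_t rx_descriptor_status;
	fn_ptr_t tx_descriptor_status;
	/* Here the two recycle pointers are the ones that spill over. */
	fn_ptr_t recycle_tx_mbufs_reuse;
	fn_ptr_t recycle_rx_descriptors_refill;
};

/*
 * Returns 1 if a field occupying [offset, offset + size) lies entirely
 * within the first cache line of a cache-aligned structure, else 0.
 */
int fits_first_cache_line(size_t offset, size_t size)
{
	return offset + size <= CACHE_LINE_SIZE;
}

/* Compile-time guard so a later field insertion cannot silently undo this. */
_Static_assert(offsetof(struct dev_reordered, process_private) +
	       sizeof(void *) <= CACHE_LINE_SIZE,
	       "data pointers must stay in the first cache line");
```

With this ordering, the burst function pointers and both data pointers
share the first cache line, and whichever two function pointers are
judged least important can take the last two slots instead of the
recycle functions.
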