* [PATCH] fm10k: optimize legacy TX func
@ 2016-01-28  9:45 Chen Jing D(Mark)
  2016-02-16 15:27 ` Bruce Richardson
  2016-02-18  9:20 ` He, Shaopeng
  0 siblings, 2 replies; 5+ messages in thread
From: Chen Jing D(Mark) @ 2016-01-28  9:45 UTC
  To: michael.qiu, konstantin.ananyev; +Cc: dev

From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>

When the legacy TX function frees a batch of mbufs, it frees them
one by one. This change scans the free list and merges the requests
for mbufs that belong to the same pool, then frees them in one bulk
operation, which reduces the cycles spent freeing mbufs.

Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
---
 doc/guides/rel_notes/release_2_3.rst |    2 +
 drivers/net/fm10k/fm10k_rxtx.c       |   59 ++++++++++++++++++++++++++++-----
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..20ce78d 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -3,7 +3,9 @@ DPDK Release 2.3
 
 New Features
 ------------
+* **Optimize fm10k Tx func.**
 
+  * Free multiple mbufs at a time to reduce freeing mbuf cycles.
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index e958865..f3de691 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -369,6 +369,51 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	return nb_rcv;
 }
 
+/*
+ * Free multiple TX mbufs at a time if they are in the same pool
+ *
+ * @txep: first software ring entry to free
+ * @num: number of descs to free
+ *
+ */
+static inline void tx_free_bulk_mbuf(struct rte_mbuf **txep, int num)
+{
+	struct rte_mbuf *m, *free[RTE_FM10K_TX_MAX_FREE_BUF_SZ];
+	int i;
+	int nb_free = 0;
+
+	if (unlikely(num == 0))
+		return;
+
+	m = __rte_pktmbuf_prefree_seg(txep[0]);
+	if (likely(m != NULL)) {
+		free[0] = m;
+		nb_free = 1;
+		for (i = 1; i < num; i++) {
+			m = __rte_pktmbuf_prefree_seg(txep[i]);
+			if (likely(m != NULL)) {
+				if (likely(m->pool == free[0]->pool))
+					free[nb_free++] = m;
+				else {
+					rte_mempool_put_bulk(free[0]->pool,
+							(void *)free, nb_free);
+					free[0] = m;
+					nb_free = 1;
+				}
+			}
+			txep[i] = NULL;
+		}
+		rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
+	} else {
+		for (i = 1; i < num; i++) {
+			m = __rte_pktmbuf_prefree_seg(txep[i]);
+			if (m != NULL)
+				rte_mempool_put(m->pool, m);
+			txep[i] = NULL;
+		}
+	}
+}
+
 static inline void tx_free_descriptors(struct fm10k_tx_queue *q)
 {
 	uint16_t next_rs, count = 0;
@@ -385,11 +430,7 @@ static inline void tx_free_descriptors(struct fm10k_tx_queue *q)
 	 * including nb_desc */
 	if (q->last_free > next_rs) {
 		count = q->nb_desc - q->last_free;
-		while (q->last_free < q->nb_desc) {
-			rte_pktmbuf_free_seg(q->sw_ring[q->last_free]);
-			q->sw_ring[q->last_free] = NULL;
-			++q->last_free;
-		}
+		tx_free_bulk_mbuf(&q->sw_ring[q->last_free], count);
 		q->last_free = 0;
 	}
 
@@ -397,10 +438,10 @@ static inline void tx_free_descriptors(struct fm10k_tx_queue *q)
 	q->nb_free += count + (next_rs + 1 - q->last_free);
 
 	/* free buffers from last_free, up to and including next_rs */
-	while (q->last_free <= next_rs) {
-		rte_pktmbuf_free_seg(q->sw_ring[q->last_free]);
-		q->sw_ring[q->last_free] = NULL;
-		++q->last_free;
+	if (q->last_free <= next_rs) {
+		count = next_rs - q->last_free + 1;
+		tx_free_bulk_mbuf(&q->sw_ring[q->last_free], count);
+		q->last_free += count;
 	}
 
 	if (q->last_free == q->nb_desc)
-- 
1.7.7.6
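
The second hunk above replaces two per-entry loops with at most two bulk calls: one for the tail of the ring when the range wraps, and one for the run that ends at next_rs. The index arithmetic can be sketched in plain C as follows. This is an illustration only, not driver code: `struct span` and `ring_spans` are hypothetical names, and the sketch assumes both indexes are already smaller than nb_desc.

```c
#include <assert.h>
#include <stdint.h>

/* One contiguous run of ring entries: start index and length. */
struct span {
    uint16_t start;
    uint16_t count;
};

/*
 * Split the range from last_free up to and including next_rs into at
 * most two contiguous spans of a ring with nb_desc entries.
 * Returns the number of spans written to out (1 or 2).
 */
static int ring_spans(uint16_t last_free, uint16_t next_rs,
                      uint16_t nb_desc, struct span out[2])
{
    int n = 0;

    /* Assumption of this sketch: both indexes are valid ring positions. */
    assert(last_free < nb_desc && next_rs < nb_desc);

    if (last_free > next_rs) {
        /* Wrapped: first the tail, from last_free to the end of the ring. */
        out[n].start = last_free;
        out[n].count = (uint16_t)(nb_desc - last_free);
        n++;
        last_free = 0;
    }

    /* Then the run from last_free up to and including next_rs. */
    out[n].start = last_free;
    out[n].count = (uint16_t)(next_rs - last_free + 1);
    n++;

    return n;
}
```

Each span corresponds to one call of a bulk-free helper such as tx_free_bulk_mbuf(&sw_ring[start], count), so the caller issues at most two such calls per cleanup pass.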

* Re: [PATCH] fm10k: optimize legacy TX func
  2016-01-28  9:45 [PATCH] fm10k: optimize legacy TX func Chen Jing D(Mark)
@ 2016-02-16 15:27 ` Bruce Richardson
  2016-02-16 15:34   ` Chen, Jing D
  2016-02-18  9:20 ` He, Shaopeng
  1 sibling, 1 reply; 5+ messages in thread
From: Bruce Richardson @ 2016-02-16 15:27 UTC
  To: Chen Jing D(Mark); +Cc: dev

On Thu, Jan 28, 2016 at 05:45:59PM +0800, Chen Jing D(Mark) wrote:
> From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> 
> When the legacy TX function frees a batch of mbufs, it frees them
> one by one. This change scans the free list and merges the requests
> for mbufs that belong to the same pool, then frees them in one bulk
> operation, which reduces the cycles spent freeing mbufs.
> 
> Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> ---
>  doc/guides/rel_notes/release_2_3.rst |    2 +
>  drivers/net/fm10k/fm10k_rxtx.c       |   59 ++++++++++++++++++++++++++++-----
>  2 files changed, 52 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
> index 99de186..20ce78d 100644
> --- a/doc/guides/rel_notes/release_2_3.rst
> +++ b/doc/guides/rel_notes/release_2_3.rst
> @@ -3,7 +3,9 @@ DPDK Release 2.3
>  
>  New Features
>  ------------
> +* **Optimize fm10k Tx func.**
>  
> +  * Free multiple mbufs at a time to reduce freeing mbuf cycles.
>  

Is this really a significant enough change to warrant being called out in the
release notes? 
Personally, I don't think so, so if you are ok with it, I'll just apply this
patch without the RN update.

/Bruce

* Re: [PATCH] fm10k: optimize legacy TX func
  2016-02-16 15:27 ` Bruce Richardson
@ 2016-02-16 15:34   ` Chen, Jing D
  0 siblings, 0 replies; 5+ messages in thread
From: Chen, Jing D @ 2016-02-16 15:34 UTC
  To: Richardson, Bruce; +Cc: dev

Hi, Bruce,

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Tuesday, February 16, 2016 11:28 PM
> To: Chen, Jing D
> Cc: Qiu, Michael; Ananyev, Konstantin; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] fm10k: optimize legacy TX func
> 
> On Thu, Jan 28, 2016 at 05:45:59PM +0800, Chen Jing D(Mark) wrote:
> > From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> >
> > When the legacy TX function frees a batch of mbufs, it frees them
> > one by one. This change scans the free list and merges the requests
> > for mbufs that belong to the same pool, then frees them in one bulk
> > operation, which reduces the cycles spent freeing mbufs.
> >
> > Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> > ---
> >  doc/guides/rel_notes/release_2_3.rst |    2 +
> >  drivers/net/fm10k/fm10k_rxtx.c       |   59 ++++++++++++++++++++++++++++-----
> >  2 files changed, 52 insertions(+), 9 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
> > index 99de186..20ce78d 100644
> > --- a/doc/guides/rel_notes/release_2_3.rst
> > +++ b/doc/guides/rel_notes/release_2_3.rst
> > @@ -3,7 +3,9 @@ DPDK Release 2.3
> >
> >  New Features
> >  ------------
> > +* **Optimize fm10k Tx func.**
> >
> > +  * Free multiple mbufs at a time to reduce freeing mbuf cycles.
> >
> 
> Is this really a significant enough change to warrant being called out in the
> release notes?
> Personally, I don't think so, so if you are ok with it, I'll just apply this patch
> without the RN update.
> 
> /Bruce

This change gives some performance gain in the legacy TX function,
which is why I'd like to add a line to the release notes.
If you think it's not necessary, please kindly remove it.
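
For reference, the gain comes from issuing fewer pool operations: consecutive buffers from the same pool go back in one bulk put instead of one put each. Below is a minimal sketch of that grouping in plain C. It is not the driver code. `struct pool`, `struct buf`, `pool_put_bulk`, `free_bulk_by_pool` and `MAX_FREE_BATCH` are hypothetical stand-ins, the reference-count check done by `__rte_pktmbuf_prefree_seg` is left out, and a NULL ring entry stands for a buffer that must not be returned. Clearing every slot and flushing when the batch array is full are choices of this sketch, not statements about the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREE_BATCH 32   /* hypothetical batch limit for this sketch */

/* Stand-in for a memory pool: counts put calls and returned objects. */
struct pool {
    unsigned put_calls;   /* number of put operations issued */
    unsigned objs_put;    /* number of objects handed back */
};

/* Stand-in for an mbuf: only its owning pool matters here. */
struct buf {
    struct pool *pool;
};

/* Return n objects to a pool in one call (models a bulk put). */
static void pool_put_bulk(struct pool *p, struct buf **objs, unsigned n)
{
    (void)objs;
    p->put_calls++;
    p->objs_put += n;
}

/*
 * Release num ring entries. Buffers are collected into a batch, and the
 * batch is flushed whenever the pool changes or the batch array is full.
 * NULL entries are skipped; every slot is cleared.
 */
static void free_bulk_by_pool(struct buf **ring, unsigned num)
{
    struct buf *batch[MAX_FREE_BATCH];
    unsigned nb = 0;
    unsigned i;

    for (i = 0; i < num; i++) {
        struct buf *b = ring[i];

        ring[i] = NULL;
        if (b == NULL)
            continue;

        /* Flush the pending batch before starting a new one. */
        if (nb > 0 &&
            (b->pool != batch[0]->pool || nb == MAX_FREE_BATCH)) {
            pool_put_bulk(batch[0]->pool, batch, nb);
            nb = 0;
        }
        batch[nb++] = b;
    }

    /* Flush whatever is left after the scan. */
    if (nb > 0)
        pool_put_bulk(batch[0]->pool, batch, nb);
}
```

With three buffers from one pool followed by two from another, the sketch issues two put calls where a one-by-one loop would issue five.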

* Re: [PATCH] fm10k: optimize legacy TX func
  2016-01-28  9:45 [PATCH] fm10k: optimize legacy TX func Chen Jing D(Mark)
  2016-02-16 15:27 ` Bruce Richardson
@ 2016-02-18  9:20 ` He, Shaopeng
  2016-02-24 12:01   ` Bruce Richardson
  1 sibling, 1 reply; 5+ messages in thread
From: He, Shaopeng @ 2016-02-18  9:20 UTC
  To: Chen, Jing D, Qiu, Michael, Ananyev, Konstantin; +Cc: dev

Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chen Jing D(Mark)
> Sent: Thursday, January 28, 2016 5:46 PM
> To: Qiu, Michael; Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] fm10k: optimize legacy TX func
> 
> From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> 
> When the legacy TX function frees a batch of mbufs, it frees them
> one by one. This change scans the free list and merges the requests
> for mbufs that belong to the same pool, then frees them in one bulk
> operation, which reduces the cycles spent freeing mbufs.
> 
> Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>

* Re: [PATCH] fm10k: optimize legacy TX func
  2016-02-18  9:20 ` He, Shaopeng
@ 2016-02-24 12:01   ` Bruce Richardson
  0 siblings, 0 replies; 5+ messages in thread
From: Bruce Richardson @ 2016-02-24 12:01 UTC
  To: He, Shaopeng; +Cc: dev

On Thu, Feb 18, 2016 at 09:20:18AM +0000, He, Shaopeng wrote:
> Hi,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chen Jing D(Mark)
> > Sent: Thursday, January 28, 2016 5:46 PM
> > To: Qiu, Michael; Ananyev, Konstantin
> > Cc: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH] fm10k: optimize legacy TX func
> > 
> > From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> > 
> > When the legacy TX function frees a batch of mbufs, it frees them
> > one by one. This change scans the free list and merges the requests
> > for mbufs that belong to the same pool, then frees them in one bulk
> > operation, which reduces the cycles spent freeing mbufs.
> > 
> > Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> Acked-by: Shaopeng He <shaopeng.he@intel.com>

Applied to dpdk-next-net/rel_16_04

/Bruce
