All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Di, ChenxuX" <chenxux.di@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Yang, Qiming" <qiming.yang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v6 3/4] net/ixgbe: cleanup Tx buffers
Date: Sun, 5 Jan 2020 23:36:09 +0000	[thread overview]
Message-ID: <SN6PR11MB255840C8D1C11AD9BAABA8949A3D0@SN6PR11MB2558.namprd11.prod.outlook.com> (raw)
In-Reply-To: <3B926E44943CB04AA3A39AC16328CE39B9262D@SHSMSX101.ccr.corp.intel.com>


> > > Add support to the ixgbe driver for the API rte_eth_tx_done_cleanup to
> > > force free consumed buffers on Tx ring.
> > >
> > > Signed-off-by: Chenxu Di <chenxux.di@intel.com>
> > > ---
> > >  drivers/net/ixgbe/ixgbe_ethdev.c |   2 +
> > >  drivers/net/ixgbe/ixgbe_rxtx.c   | 116 +++++++++++++++++++++++++++++++
> > >  drivers/net/ixgbe/ixgbe_rxtx.h   |   2 +
> > >  3 files changed, 120 insertions(+)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > index 2c6fd0f13..0091405db 100644
> > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > @@ -601,6 +601,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops
> > > = {  .udp_tunnel_port_add  = ixgbe_dev_udp_tunnel_port_add,
> > > .udp_tunnel_port_del  = ixgbe_dev_udp_tunnel_port_del,
> > >  .tm_ops_get           = ixgbe_tm_ops_get,
> > > +.tx_done_cleanup      = ixgbe_tx_done_cleanup,
> >
> > Don't see how we can have one tx_done_cleanup() for different tx functions?
> > Vector and scalar TX path use different  format for sw_ring[] entries.
> > Also offload and simile TX paths use different method to track used/free
> > descriptors, and use different functions to free them:
> > offload uses tx_entry next_id, last_id plus txq. last_desc_cleaned, while simple
> > TX paths use tx_next_dd.
> >
> 
> This patches will be not include function for Vector, and I will update my code to
> Make it work for offload and simple .
> >
> > >  };
> > >
> > >  /*
> > > @@ -649,6 +650,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops
> > = {
> > >  .reta_query           = ixgbe_dev_rss_reta_query,
> > >  .rss_hash_update      = ixgbe_dev_rss_hash_update,
> > >  .rss_hash_conf_get    = ixgbe_dev_rss_hash_conf_get,
> > > +.tx_done_cleanup      = ixgbe_tx_done_cleanup,
> > >  };
> > >
> > >  /* store statistics names and its offset in stats structure */ diff
> > > --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > b/drivers/net/ixgbe/ixgbe_rxtx.c index fa572d184..520b9c756 100644
> > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > @@ -2306,6 +2306,122 @@ ixgbe_tx_queue_release_mbufs(struct
> > > ixgbe_tx_queue *txq)  }  }
> > >
> > > +int ixgbe_tx_done_cleanup(void *q, uint32_t free_cnt)
> >
> > That seems to work only for offload(full) TX path (ixgbe_xmit_pkts).
> > Simple(fast) path seems not covered by this function.
> >
> 
> Same as above
> 
> > > +{
> > > +struct ixgbe_tx_queue *txq = (struct ixgbe_tx_queue *)q; struct
> > > +ixgbe_tx_entry *sw_ring; volatile union ixgbe_adv_tx_desc *txr;
> > > +uint16_t tx_first; /* First segment analyzed. */
> > > +uint16_t tx_id;    /* Current segment being processed. */
> > > +uint16_t tx_last;  /* Last segment in the current packet. */ uint16_t
> > > +tx_next;  /* First segment of the next packet. */ int count;
> > > +
> > > +if (txq == NULL)
> > > +return -ENODEV;
> > > +
> > > +count = 0;
> > > +sw_ring = txq->sw_ring;
> > > +txr = txq->tx_ring;
> > > +
> > > +/*
> > > + * tx_tail is the last sent packet on the sw_ring. Goto the end
> > > + * of that packet (the last segment in the packet chain) and
> > > + * then the next segment will be the start of the oldest segment
> > > + * in the sw_ring.
> >
> > Not sure I understand the sentence above.
> > tx_tail is the value of TDT HW register (most recently armed by SW TD).
> > last_id  is the index of last descriptor for multi-seg packet.
> > next_id is just the index of next descriptor in HW TD ring.
> > How do you conclude that it will be the ' oldest segment in the sw_ring'?
> >
> 
> The tx_tail is the last sent packet on the sw_ring. While the xmit_cleanup or
> Tx_free_bufs will be call when the nb_tx_free < tx_free_thresh .
> So the sw_ring[tx_tail].next_id must be the begin of mbufs which are not used or
>  Already freed . then begin the loop until the mbuf is used and begin to free them.
> 
> 
> 
> > Another question why do you need to write your own functions?
> > Why can't you reuse existing ixgbe_xmit_cleanup() for full(offload) path and
> > ixgbe_tx_free_bufs() for simple path?
> > Yes,  ixgbe_xmit_cleanup() doesn't free mbufs, but at least it could be used to
> > determine finished TX descriptors.
> > Based on that you can you can free appropriate sw_ring[] entries.
> >
> 
> The reason why I don't reuse existing function is that they all free several mbufs
> While the free_cnt of the API rte_eth_tx_done_cleanup() is the number of packets.
> It also need to be done that check which mbuffs are from the same packet.

At first, I don't see anything bad if tx_done_cleanup() will free only some segments from
the packet. As long as it is safe - there is no problem with that.
I think rte_eth_tx_done_cleanup() operates on mbuf, not packet quantities.
But in our case I think it doesn't matter, as ixgbe_xmit_cleanup()
mark TXDs as free only when HW is done with all TXDs for that packet.
As long as there is a way to reuse existing code and avoid duplication
(without introducing any degradation) - we should use it.
And I think there is a very good opportunity here to reuse existing
ixgbe_xmit_cleanup() for tx_done_cleanup() implementation.
Moreover because your code doesn't follow ixgbe_xmit_pkts()/ixgbe_xmit_cleanup()
logic and infrastructure, it introduces unnecessary scans over TXD ring,
and in some cases doesn't work as expected: 

+	while (1) {
+		tx_last = sw_ring[tx_id].last_id;
+
+		if (sw_ring[tx_last].mbuf) {
+			if (txr[tx_last].wb.status &
+					IXGBE_TXD_STAT_DD) {
...
+			} else {
+				/*
+				 * mbuf still in use, nothing left to
+				 * free.
+				 */
+				break;

It is not correct to expect that IXGBE_TXD_STAT_DD will be set on last TXD for *every* packet.
We set IXGBE_TXD_CMD_RS bit only on threshold packet last descriptor.
Plus ixgbe_xmit_cleanup() can cleanup TXD wb.status.

So I strongly recommend to reuse ixgbe_xmit_cleanup() here.
It would be much less error prone and will help to avoid code duplication.

Konstantin 

> 
> 
> > >This is the first packet that will be
> > > + * attempted to be freed.
> > > + */
> > > +
> > > +/* Get last segment in most recently added packet. */ tx_last =
> > > +sw_ring[txq->tx_tail].last_id;
> > > +
> > > +/* Get the next segment, which is the oldest segment in ring. */
> > > +tx_first = sw_ring[tx_last].next_id;
> > > +
> > > +/* Set the current index to the first. */ tx_id = tx_first;
> > > +
> > > +/*
> > > + * Loop through each packet. For each packet, verify that an
> > > + * mbuf exists and that the last segment is free. If so, free
> > > + * it and move on.
> > > + */
> > > +while (1) {
> > > +tx_last = sw_ring[tx_id].last_id;
> > > +
> > > +if (sw_ring[tx_last].mbuf) {
> > > +if (!(txr[tx_last].wb.status &
> > > +IXGBE_TXD_STAT_DD))
> > > +break;
> > > +
> > > +/* Get the start of the next packet. */ tx_next =
> > > +sw_ring[tx_last].next_id;
> > > +
> > > +/*
> > > + * Loop through all segments in a
> > > + * packet.
> > > + */
> > > +do {
> > > +rte_pktmbuf_free_seg(sw_ring[tx_id].mbuf);
> > > +sw_ring[tx_id].mbuf = NULL;
> > > +sw_ring[tx_id].last_id = tx_id;
> > > +
> > > +/* Move to next segment. */
> > > +tx_id = sw_ring[tx_id].next_id;
> > > +
> > > +} while (tx_id != tx_next);
> > > +
> > > +/*
> > > + * Increment the number of packets
> > > + * freed.
> > > + */
> > > +count++;
> > > +
> > > +if (unlikely(count == (int)free_cnt)) break; } else {
> > > +/*
> > > + * There are multiple reasons to be here:
> > > + * 1) All the packets on the ring have been
> > > + *    freed - tx_id is equal to tx_first
> > > + *    and some packets have been freed.
> > > + *    - Done, exit
> > > + * 2) Interfaces has not sent a rings worth of
> > > + *    packets yet, so the segment after tail is
> > > + *    still empty. Or a previous call to this
> > > + *    function freed some of the segments but
> > > + *    not all so there is a hole in the list.
> > > + *    Hopefully this is a rare case.
> > > + *    - Walk the list and find the next mbuf. If
> > > + *      there isn't one, then done.
> > > + */
> > > +if (likely(tx_id == tx_first && count != 0)) break;
> > > +
> > > +/*
> > > + * Walk the list and find the next mbuf, if any.
> > > + */
> > > +do {
> > > +/* Move to next segment. */
> > > +tx_id = sw_ring[tx_id].next_id;
> > > +
> > > +if (sw_ring[tx_id].mbuf)
> > > +break;
> > > +
> > > +} while (tx_id != tx_first);
> > > +
> > > +/*
> > > + * Determine why previous loop bailed. If there
> > > + * is not an mbuf, done.
> > > + */
> > > +if (sw_ring[tx_id].mbuf == NULL)
> > > +break;
> > > +}
> > > +}
> > > +
> > > +return count;
> > > +}
> > > +
> > >  static void __attribute__((cold))
> > >  ixgbe_tx_free_swring(struct ixgbe_tx_queue *txq)  { diff --git
> > > a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
> > > index 505d344b9..2c3770af6 100644
> > > --- a/drivers/net/ixgbe/ixgbe_rxtx.h
> > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.h
> > > @@ -285,6 +285,8 @@ int ixgbe_rx_vec_dev_conf_condition_check(struct
> > > rte_eth_dev *dev);  int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue
> > > *rxq);  void ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue
> > > *rxq);
> > >
> > > +int ixgbe_tx_done_cleanup(void *txq, uint32_t free_cnt);
> > > +
> > >  extern const uint32_t ptype_table[IXGBE_PACKET_TYPE_MAX];
> > >  extern const uint32_t ptype_table_tn[IXGBE_PACKET_TYPE_TN_MAX];
> > >
> > > --
> > > 2.17.1
> >
> 


  reply	other threads:[~2020-01-05 23:36 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-03  5:51 [dpdk-dev] [PATCH 0/4] drivers/net: cleanup Tx buffers Chenxu Di
2019-12-03  5:51 ` [dpdk-dev] [PATCH 1/4] net/fm10k: " Chenxu Di
2019-12-03  5:51 ` [dpdk-dev] [PATCH 2/4] net/i40e: " Chenxu Di
2019-12-03  5:51 ` [dpdk-dev] [PATCH 3/4] net/ice: " Chenxu Di
2019-12-03  5:51 ` [dpdk-dev] [PATCH 4/4] net/ixgbe: " Chenxu Di
2019-12-20  3:02 ` [dpdk-dev] [PATCH v2 0/5] drivers/net: " Chenxu Di
2019-12-20  3:02   ` [dpdk-dev] [PATCH v2 1/5] net/fm10k: " Chenxu Di
2019-12-20  3:02   ` [dpdk-dev] [PATCH v2 2/5] net/i40e: " Chenxu Di
2019-12-20  3:02   ` [dpdk-dev] [PATCH v2 3/5] net/ice: " Chenxu Di
2019-12-20  3:02   ` [dpdk-dev] [PATCH v2 4/5] net/ixgbe: " Chenxu Di
2019-12-20  3:02   ` [dpdk-dev] [PATCH v2 5/5] net/e1000: " Chenxu Di
2019-12-20  3:15 ` [dpdk-dev] [PATCH v3 0/5] drivers/net: " Chenxu Di
2019-12-20  3:15   ` [dpdk-dev] [PATCH v3 1/5] net/fm10k: " Chenxu Di
2019-12-20  3:15   ` [dpdk-dev] [PATCH v3 2/5] net/i40e: " Chenxu Di
2019-12-20  3:15   ` [dpdk-dev] [PATCH v3 3/5] net/ice: " Chenxu Di
2019-12-20  3:15   ` [dpdk-dev] [PATCH v3 4/5] net/ixgbe: " Chenxu Di
2019-12-20  3:15   ` [dpdk-dev] [PATCH v3 5/5] net/e1000: " Chenxu Di
2019-12-24  2:39 ` [dpdk-dev] [PATCH v4 0/5] drivers/net: " Chenxu Di
2019-12-24  2:39   ` [dpdk-dev] [PATCH v4 1/5] net/fm10k: " Chenxu Di
2019-12-24  2:39   ` [dpdk-dev] [PATCH v4 2/5] net/i40e: " Chenxu Di
2019-12-26  8:24     ` Xing, Beilei
2019-12-24  2:39   ` [dpdk-dev] [PATCH v4 3/5] net/ice: " Chenxu Di
2019-12-24  2:39   ` [dpdk-dev] [PATCH v4 4/5] net/ixgbe: " Chenxu Di
2019-12-24  2:39   ` [dpdk-dev] [PATCH v4 5/5] net/e1000: " Chenxu Di
2019-12-30  9:38 ` [dpdk-dev] [PATCH v6 0/4] drivers/net: " Chenxu Di
2019-12-30  9:38   ` [dpdk-dev] [PATCH v6 1/4] net/i40e: " Chenxu Di
2019-12-30 13:01     ` Ananyev, Konstantin
2019-12-30  9:38   ` [dpdk-dev] [PATCH v6 2/4] net/ice: " Chenxu Di
2019-12-30  9:38   ` [dpdk-dev] [PATCH v6 3/4] net/ixgbe: " Chenxu Di
2019-12-30 12:53     ` Ananyev, Konstantin
2020-01-03  9:01       ` Di, ChenxuX
2020-01-05 23:36         ` Ananyev, Konstantin [this message]
2020-01-06  9:03           ` Di, ChenxuX
2020-01-06 13:26             ` Ananyev, Konstantin
2020-01-07 10:46               ` Di, ChenxuX
2020-01-07 14:09                 ` Ananyev, Konstantin
2020-01-08 10:15                   ` Di, ChenxuX
2020-01-08 15:12                     ` Ananyev, Konstantin
2019-12-30  9:38   ` [dpdk-dev] [PATCH v6 4/4] net/e1000: " Chenxu Di
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 0/4] drivers/net: " Chenxu Di
2020-01-09 10:38   ` [dpdk-dev] [PATCH v7 1/4] net/i40e: " Chenxu Di
2020-01-09 10:38   ` [dpdk-dev] [PATCH v7 2/4] net/ice: " Chenxu Di
2020-01-09 10:38   ` [dpdk-dev] [PATCH v7 3/4] net/ixgbe: " Chenxu Di
2020-01-09 14:01     ` Ananyev, Konstantin
2020-01-10 10:08       ` Di, ChenxuX
2020-01-10 12:46         ` Ananyev, Konstantin
2020-01-09 10:38   ` [dpdk-dev] [PATCH v7 4/4] net/e1000: " Chenxu Di
2020-01-10  9:58 ` [dpdk-dev] [PATCH v8 0/4] drivers/net: " Chenxu Di
2020-01-10  9:58   ` [dpdk-dev] [PATCH v8 1/4] net/i40e: " Chenxu Di
2020-01-10  9:58   ` [dpdk-dev] [PATCH v8 2/4] net/ice: " Chenxu Di
2020-01-10  9:58   ` [dpdk-dev] [PATCH v8 3/4] net/ixgbe: " Chenxu Di
2020-01-10 13:49     ` Ananyev, Konstantin
2020-01-10  9:59   ` [dpdk-dev] [PATCH v8 4/4] net/e1000: " Chenxu Di
2020-01-13  9:57 ` [dpdk-dev] [PATCH v9 0/4] drivers/net: " Chenxu Di
2020-01-13  9:57   ` [dpdk-dev] [PATCH v9 1/4] net/i40e: " Chenxu Di
2020-01-13 11:08     ` Ananyev, Konstantin
2020-01-13  9:57   ` [dpdk-dev] [PATCH v9 2/4] net/ice: " Chenxu Di
2020-01-14  1:55     ` Yang, Qiming
2020-01-14 12:40     ` Ferruh Yigit
2020-01-15 14:34       ` Ferruh Yigit
2020-01-16  1:40         ` Di, ChenxuX
2020-01-16  7:09           ` [dpdk-dev] [PATCH] net/ice: cleanup for vec path check Xiaolong Ye
2020-01-16 10:19             ` Ferruh Yigit
2020-01-17  2:21             ` Yang, Qiming
2020-01-16  8:43     ` [dpdk-dev] [PATCH v9 2/4] net/ice: cleanup Tx buffers Ferruh Yigit
2020-01-13  9:57   ` [dpdk-dev] [PATCH v9 3/4] net/ixgbe: " Chenxu Di
2020-01-13 11:07     ` Ananyev, Konstantin
2020-01-16  8:44     ` Ferruh Yigit
2020-01-16 14:47     ` Ferruh Yigit
2020-01-16 15:23       ` Ferruh Yigit
2020-01-13  9:57   ` [dpdk-dev] [PATCH v9 4/4] net/e1000: " Chenxu Di
2020-01-13 11:08     ` Ananyev, Konstantin
2020-01-14  2:49     ` Ye Xiaolong
2020-01-14  2:22 ` [dpdk-dev] [PATCH 0/4] drivers/net: " Ye Xiaolong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SN6PR11MB255840C8D1C11AD9BAABA8949A3D0@SN6PR11MB2558.namprd11.prod.outlook.com \
    --to=konstantin.ananyev@intel.com \
    --cc=chenxux.di@intel.com \
    --cc=dev@dpdk.org \
    --cc=qiming.yang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.