From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Dumitrescu, Cristian"
Subject: Re: [PATCH 4/4] port: fix ethdev writer burst too big
Date: Thu, 31 Mar 2016 13:22:47 +0000
Message-ID: <3EB4FA525960D640B5BDFFD6A3D8912647974F2E@IRSMSX108.ger.corp.intel.com>
References: <1459198297-49854-1-git-send-email-rsanford@akamai.com>
 <1459198297-49854-5-git-send-email-rsanford@akamai.com>
In-Reply-To: <1459198297-49854-5-git-send-email-rsanford@akamai.com>
To: Robert Sanford , "dev@dpdk.org"
Cc: "Liang, Cunming"
List-Id: patches and discussions about DPDK

> -----Original Message-----
> From: Robert Sanford [mailto:rsanford2@gmail.com]
> Sent: Monday, March 28, 2016 9:52 PM
> To: dev@dpdk.org; Dumitrescu, Cristian
> Subject: [PATCH 4/4] port: fix ethdev writer burst too big
>
> For f_tx_bulk functions in rte_port_ethdev.c, we may unintentionally
> send bursts larger than tx_burst_sz to the underlying ethdev.
> Some PMDs (e.g., ixgbe) may truncate this request to their maximum
> burst size, resulting in unnecessary enqueuing failures or ethdev
> writer retries.

Sending bursts larger than tx_burst_sz is actually intentional. The assumption is that NIC performance benefits from a larger burst size, so tx_burst_sz is used as a minimal burst size requirement, not as a maximal or fixed burst size requirement.
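To make the "minimal, not maximal" semantics concrete, here is a small self-contained sketch (not the actual DPDK code; `writer_model`, `flush`, and `tx_bulk_model` are hypothetical names) of the current writer's buffering policy: packets accumulate in the buffer, and once the count reaches tx_burst_sz the *entire* buffered run is flushed in one call, so the flushed burst can exceed tx_burst_sz.

```c
#include <assert.h>
#include <stdint.h>

#define BURST_SIZE_MAX 64

/* Hypothetical, simplified model of the ethdev writer's buffering policy.
 * tx_burst_sz is a minimum threshold, not a cap: the flush sends whatever
 * has accumulated, which may be more than tx_burst_sz packets. */
struct writer_model {
    uint32_t tx_buf[2 * BURST_SIZE_MAX]; /* stands in for struct rte_mbuf * */
    uint32_t tx_buf_count;
    uint32_t tx_burst_sz;
    uint32_t last_flush_sz; /* size of the most recent flush, for observation */
};

static void flush(struct writer_model *w)
{
    /* The whole buffer goes to the PMD in one shot. */
    w->last_flush_sz = w->tx_buf_count;
    w->tx_buf_count = 0;
}

/* Enqueue n packets, then apply the "flush after the loop" policy:
 * one threshold test per bulk call, not per packet. */
static void tx_bulk_model(struct writer_model *w, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++)
        w->tx_buf[w->tx_buf_count++] = i;
    if (w->tx_buf_count >= w->tx_burst_sz)
        flush(w);
}
```

With tx_burst_sz set to 32 and a 48-packet bulk, this model flushes all 48 packets at once, which is exactly the behavior the patch description calls a bug and this reply calls intentional.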
I agree with you that a while ago the vector version of the IXGBE driver used to work the way you describe, but I don't think this is the case anymore. As an example, if the TX burst size is set to 32 and 48 packets are transmitted, then the PMD will TX all 48 packets (internally it can work in batches of 4, 8, 32, etc., which should not matter) rather than TXing just 32 packets out of 48 and leaving the user to either discard or retry with the remaining 16 packets. I am CC-ing Steve Liang to confirm this.

Is there any PMD that people can name that currently behaves the opposite way, i.e. given a burst of 48 pkts for TX, accepts 32 pkts and discards the other 16?

>
> We propose to fix this by moving the tx buffer flushing logic from
> *after* the loop that puts all packets into the tx buffer, to *inside*
> the loop, testing for a full burst when adding each packet.
>

The issue I have with this approach is the introduction of a branch that has to be tested on each iteration of the loop, rather than once for the entire loop.

The code branch where you add this is actually the slow(er) code path (where the local variable expr != 0), which is used for non-contiguous bursts or bursts smaller than tx_burst_sz. Is there a particular reason you are only interested in enabling this strategy (of using tx_burst_sz as a fixed burst size requirement) on this code path? The reason I am asking is that the other, fast(er) code path (where expr == 0) also uses tx_burst_sz as a minimal requirement, and therefore it can also send bursts bigger than tx_burst_sz.
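For readers following along, here is a sketch of the fast-path/slow-path selection as I read it in rte_port_ethdev_writer_tx_bulk (simplified; `is_fast_path` is my name, not DPDK's). bsz_mask has a single bit set at position tx_burst_sz - 1, and expr == 0 means pkts_mask is a contiguous run of low bits that includes that position, i.e. a contiguous burst of at least tx_burst_sz packets:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the expr test that picks the fast path. expr == 0 requires:
 *   - pkts_mask & (pkts_mask + 1) == 0: pkts_mask is of the form 2^n - 1,
 *     i.e. a contiguous run of bits starting at bit 0;
 *   - bit (tx_burst_sz - 1) is set: the run covers at least tx_burst_sz
 *     packets (it may cover more, so the fast path can TX bursts bigger
 *     than tx_burst_sz).
 * Anything else (sparse mask, or too few packets) takes the slower,
 * per-packet buffering path. */
static int is_fast_path(uint64_t pkts_mask, uint32_t tx_burst_sz)
{
    uint64_t bsz_mask = 1ULL << (tx_burst_sz - 1);
    uint64_t expr = (pkts_mask & (pkts_mask + 1)) |
            ((pkts_mask & bsz_mask) ^ bsz_mask);

    return expr == 0;
}
```

Note the second test case below: a contiguous 48-packet mask with tx_burst_sz = 32 still takes the fast path, which is the point made above about the fast path also sending bursts bigger than tx_burst_sz.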
> Signed-off-by: Robert Sanford
> ---
>  lib/librte_port/rte_port_ethdev.c |   20 ++++++++++----------
>  1 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/lib/librte_port/rte_port_ethdev.c
> b/lib/librte_port/rte_port_ethdev.c
> index 3fb4947..1283338 100644
> --- a/lib/librte_port/rte_port_ethdev.c
> +++ b/lib/librte_port/rte_port_ethdev.c
> @@ -151,7 +151,7 @@ static int rte_port_ethdev_reader_stats_read(void
> *port,
>  struct rte_port_ethdev_writer {
>  	struct rte_port_out_stats stats;
>
> -	struct rte_mbuf *tx_buf[2 * RTE_PORT_IN_BURST_SIZE_MAX];
> +	struct rte_mbuf *tx_buf[RTE_PORT_IN_BURST_SIZE_MAX];
>  	uint32_t tx_burst_sz;
>  	uint16_t tx_buf_count;
>  	uint64_t bsz_mask;
> @@ -257,11 +257,11 @@ rte_port_ethdev_writer_tx_bulk(void *port,
>  			p->tx_buf[tx_buf_count++] = pkt;
>  			RTE_PORT_ETHDEV_WRITER_STATS_PKTS_IN_ADD(p, 1);
>  			pkts_mask &= ~pkt_mask;
> -		}
>
> -	p->tx_buf_count = tx_buf_count;
> -	if (tx_buf_count >= p->tx_burst_sz)
> -		send_burst(p);
> +			p->tx_buf_count = tx_buf_count;
> +			if (tx_buf_count >= p->tx_burst_sz)
> +				send_burst(p);
> +		}
>  	}

One observation here: if we go with this proposal (which I have an issue with, due to executing the branch per loop iteration rather than once per entire loop), it also eliminates the buffer overflow issue flagged by you in the other email :), so there is no need to e.g. double the size of the port's internal buffer (tx_buf).
>
> 	return 0;
> @@ -328,7 +328,7 @@ static int rte_port_ethdev_writer_stats_read(void
> *port,
>  struct rte_port_ethdev_writer_nodrop {
>  	struct rte_port_out_stats stats;
>
> -	struct rte_mbuf *tx_buf[2 * RTE_PORT_IN_BURST_SIZE_MAX];
> +	struct rte_mbuf *tx_buf[RTE_PORT_IN_BURST_SIZE_MAX];
>  	uint32_t tx_burst_sz;
>  	uint16_t tx_buf_count;
>  	uint64_t bsz_mask;
> @@ -466,11 +466,11 @@ rte_port_ethdev_writer_nodrop_tx_bulk(void
> *port,
>  			p->tx_buf[tx_buf_count++] = pkt;
>  			RTE_PORT_ETHDEV_WRITER_NODROP_STATS_PKTS_IN_ADD(p, 1);
>  			pkts_mask &= ~pkt_mask;
> -		}
>
> -	p->tx_buf_count = tx_buf_count;
> -	if (tx_buf_count >= p->tx_burst_sz)
> -		send_burst_nodrop(p);
> +			p->tx_buf_count = tx_buf_count;
> +			if (tx_buf_count >= p->tx_burst_sz)
> +				send_burst_nodrop(p);
> +		}
> 	}
>
> 	return 0;
> --
> 1.7.1
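To illustrate the trade-off the patch makes, here is a hypothetical model (my own names, not the DPDK code) of the proposed in-loop flush. The threshold branch now runs once per packet, but the buffer count can never exceed tx_burst_sz, which is why the patch can also shrink tx_buf from 2 * RTE_PORT_IN_BURST_SIZE_MAX to RTE_PORT_IN_BURST_SIZE_MAX:

```c
#include <assert.h>
#include <stdint.h>

#define BURST_SZ 32

/* Hypothetical model of the patch's proposal: test the flush threshold
 * inside the enqueue loop instead of after it. */
struct inloop_writer {
    uint32_t tx_buf[BURST_SZ]; /* buffer no longer needs 2x headroom */
    uint32_t tx_buf_count;
    uint32_t flushes;          /* number of flush calls, for observation */
    uint32_t max_flush_sz;     /* largest single flush seen */
};

static void tx_bulk_inloop(struct inloop_writer *w, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        w->tx_buf[w->tx_buf_count++] = i;
        /* branch executed per packet, not once per bulk call */
        if (w->tx_buf_count >= BURST_SZ) {
            if (w->tx_buf_count > w->max_flush_sz)
                w->max_flush_sz = w->tx_buf_count;
            w->tx_buf_count = 0;
            w->flushes++;
        }
    }
}
```

A 48-packet bulk now produces one flush of exactly 32 packets with 16 left buffered, instead of one flush of 48: the overflow risk is gone, but so is the larger-than-tx_burst_sz burst the reply above argues is beneficial.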