All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] net/bonding: bonding and LACP fixes
@ 2016-08-01 20:42 Robert Sanford
  2016-08-01 20:42 ` [PATCH 1/4] testpmd: fix LACP ports to work with idle links Robert Sanford
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Robert Sanford @ 2016-08-01 20:42 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty, pablo.de.lara.guarch, olivier.matz

In this patch series, we fix two bonding driver bugs and
enhance testpmd so that bonding mode 4 (LACP) ports remain
operational even when idle.

1. Problem: testpmd does not call bonding mode 4 (LACP) ports' tx
   burst function at least every 100 ms, as mandated.
   Solution: Enhance testpmd's packet forwarding loop to infrequently
   invoke the tx burst API for bonding ports in mode 4, to transmit
   LACPDUs to the partner in a timely manner.
2. Problem: Bonding driver (item 3 below) needs to know how many
   objects may become cached in a memory pool. Solution: Rename
   macros that calculate a mempool cache flush threshold, and move
   them from rte_mempool.c to rte_mempool.h.
3. Problem: With little or no tx traffic, LACP tx machine may run out
   of mbufs. Solution: When calculating the minimum number of mbufs
   required in a bonding mode 4 slave's private (tx LACPDU) pool,
   include the maximum number of mbufs that may be cached in the
   pool's per-core caches.
4. Problem: When configuring a bonding device, we don't properly
   propagate most of the settings from the master to the slaves.
   Solution: Fix slave_configure() to correctly pass configuration
   data to rte_eth_dev_configure() on behalf of the slaves.

Notes for configuring and running testpmd: We specify four ethernet
devices in the arguments, because testpmd expects an even number.
We configure two devices to be slaves under one bonded device, one
device to be the other side of the forwarding bridge, and we ignore
the fourth eth dev.

 +-------------+    +-------+     +--------+
 |client A     |<==>|DPDK   |     |        |
 |bonded device|    |testpmd|<===>|client B|
 |with 2 slaves|<==>|       |     |        |
 +-------------+    +-------+     +--------+

To reproduce the out of buffers problem (#3), apply patch 1/4, run
testpmd (with example args and commands shown below), and run ping
from client A, like this: "ping -i18 -c10 clientB". After about five
minutes, one of the slaves will run out of LACPDU mbufs.

Example testpmd args:

  ./testpmd -c 0x00000555 -n 2 \
    --log-level 7 \
    --pci-whitelist "01:00.0" \
    --pci-whitelist "01:00.1" \
    --pci-whitelist "05:00.0" \
    --pci-whitelist "84:00.0" \
    --master-lcore 0 -- \
    --interactive --portmask=0xf --numa --socket-num=0 --auto-start \
    --coremask=0x00000554 --rxd=512 --txd=256 \
    --burst=32 --mbcache=64 \
    --nb-cores=2 --rxq=1 --txq=1

Example testpmd commands to reconfigure into bonding mode 4:

  stop
  port stop all
  create bonded device 4 0
  add bonding slave 2 4
  add bonding slave 3 4
  port start 0
  port start 1
  port start 4
  set portlist 4,0
  start

Robert Sanford (4):
  testpmd: fix LACP ports to work with idle links
  mempool: make cache flush threshold macro public
  net/bonding: another fix to LACP mempool size
  net/bonding: fix configuration of LACP slaves

 app/test-pmd/cmdline.c                    |    9 +++++++
 app/test-pmd/testpmd.c                    |   37 +++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h                    |    4 +++
 drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++--
 drivers/net/bonding/rte_eth_bond_pmd.c    |   28 +++++----------------
 lib/librte_mempool/rte_mempool.c          |    8 +----
 lib/librte_mempool/rte_mempool.h          |    7 +++++
 7 files changed, 73 insertions(+), 30 deletions(-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] testpmd: fix LACP ports to work with idle links
  2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
@ 2016-08-01 20:42 ` Robert Sanford
  2017-06-22  1:25   ` Wu, Jingjing
  2016-08-01 20:42 ` [PATCH 2/4] mempool: make cache flush threshold macro public Robert Sanford
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Robert Sanford @ 2016-08-01 20:42 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty, pablo.de.lara.guarch, olivier.matz

Problem: When there is little or no TX traffic on an LACP port
(bonding mode 4), we don't call its tx burst function in a timely
manner, and thus we don't transmit LACPDUs when we should.

Solution: Add and maintain an "lacp_master" flag in rte_port struct.
In the main packet forwarding loop, if port is an LACP master, in 1
out of N loops force an empty call to the tx burst API.

Signed-off-by: Robert Sanford <rsanford@akamai.com>
---
 app/test-pmd/cmdline.c |    9 +++++++++
 app/test-pmd/testpmd.c |   37 +++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.h |    4 ++++
 3 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f90befc..2a629ee 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -3975,6 +3975,10 @@ static void cmd_set_bonding_mode_parsed(void *parsed_result,
 	/* Set the bonding mode for the relevant port. */
 	if (0 != rte_eth_bond_mode_set(port_id, res->value))
 		printf("\t Failed to set bonding mode for port = %d.\n", port_id);
+	else if (res->value == BONDING_MODE_8023AD)
+		set_port_lacp_master(port_id);
+	else
+		clear_port_lacp_master(port_id);
 }
 
 cmdline_parse_token_string_t cmd_setbonding_mode_set =
@@ -4408,6 +4412,11 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 		ports[port_id].enabled = 1;
+
+		if (res->mode == BONDING_MODE_8023AD)
+			set_port_lacp_master(port_id);
+		else
+			clear_port_lacp_master(port_id);
 	}
 
 }
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 1428974..806667e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -920,12 +920,28 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc, packet_fwd_t pkt_fwd)
 	struct fwd_stream **fsm;
 	streamid_t nb_fs;
 	streamid_t sm_id;
+	unsigned int loop_count = 0;
 
 	fsm = &fwd_streams[fc->stream_idx];
 	nb_fs = fc->stream_nb;
 	do {
 		for (sm_id = 0; sm_id < nb_fs; sm_id++)
 			(*pkt_fwd)(fsm[sm_id]);
+
+		/*
+		 * Per rte_eth_bond.h, we must invoke LACP master's tx
+		 * burst function at least once every 100 ms.
+		 */
+		loop_count++;
+		if (likely(loop_count % 1024 != 0))
+			continue;
+		for (sm_id = 0; sm_id < nb_fs; sm_id++) {
+			struct fwd_stream *fs = fsm[sm_id];
+
+			if (port_is_lacp_master(fs->tx_port))
+				rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+						 NULL, 0);
+		}
 	} while (! fc->stopped);
 }
 
@@ -1881,6 +1897,27 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	return port->slave_flag;
 }
 
+void set_port_lacp_master(portid_t pid)
+{
+	struct rte_port *port = &ports[pid];
+
+	port->lacp_master = 1;
+}
+
+void clear_port_lacp_master(portid_t pid)
+{
+	struct rte_port *port = &ports[pid];
+
+	port->lacp_master = 0;
+}
+
+uint8_t port_is_lacp_master(portid_t pid)
+{
+	struct rte_port *port = &ports[pid];
+
+	return port->lacp_master;
+}
+
 const uint16_t vlan_tags[] = {
 		0,  1,  2,  3,  4,  5,  6,  7,
 		8,  9, 10, 11,  12, 13, 14, 15,
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 2b281cc..0898194 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -170,6 +170,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	uint8_t                 lacp_master;/**< bonding LACP master */
 };
 
 extern portid_t __rte_unused
@@ -542,6 +543,9 @@ void init_port_config(void);
 void set_port_slave_flag(portid_t slave_pid);
 void clear_port_slave_flag(portid_t slave_pid);
 uint8_t port_is_bonding_slave(portid_t slave_pid);
+void set_port_lacp_master(portid_t pid);
+void clear_port_lacp_master(portid_t pid);
+uint8_t port_is_lacp_master(portid_t pid);
 
 int init_port_dcb_config(portid_t pid, enum dcb_mode_enable dcb_mode,
 		     enum rte_eth_nb_tcs num_tcs,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] mempool: make cache flush threshold macro public
  2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
  2016-08-01 20:42 ` [PATCH 1/4] testpmd: fix LACP ports to work with idle links Robert Sanford
@ 2016-08-01 20:42 ` Robert Sanford
  2016-08-23 15:09   ` Olivier MATZ
  2016-08-01 20:42 ` [PATCH 3/4] net/bonding: another fix to LACP mempool size Robert Sanford
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Robert Sanford @ 2016-08-01 20:42 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty, pablo.de.lara.guarch, olivier.matz

Rename macros that calculate a mempool cache flush threshold, and
move them from rte_mempool.c to rte_mempool.h, so that the bonding
driver can accurately calculate its mbuf requirements.

Signed-off-by: Robert Sanford <rsanford@akamai.com>
---
 lib/librte_mempool/rte_mempool.c |    8 ++------
 lib/librte_mempool/rte_mempool.h |    7 +++++++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2e28e2e..cca4843 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -69,10 +69,6 @@ static struct rte_tailq_elem rte_mempool_tailq = {
 };
 EAL_REGISTER_TAILQ(rte_mempool_tailq)
 
-#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
-#define CALC_CACHE_FLUSHTHRESH(c)	\
-	((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
-
 /*
  * return the greatest common divisor between a and b (fast algorithm)
  *
@@ -686,7 +682,7 @@ static void
 mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size)
 {
 	cache->size = size;
-	cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size);
+	cache->flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(size);
 	cache->len = 0;
 }
 
@@ -762,7 +758,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 
 	/* asked cache too big */
 	if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
-	    CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
+	    RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 059ad9e..4323c1b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -263,6 +263,13 @@ struct rte_mempool {
 #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
 
 /**
+ * Calculate the threshold before we flush excess elements.
+ */
+#define RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER 1.5
+#define RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(c)	\
+	((typeof(c))((c) * RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER))
+
+/**
  * @internal When debug is enabled, store some statistics.
  *
  * @param mp
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] net/bonding: another fix to LACP mempool size
  2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
  2016-08-01 20:42 ` [PATCH 1/4] testpmd: fix LACP ports to work with idle links Robert Sanford
  2016-08-01 20:42 ` [PATCH 2/4] mempool: make cache flush threshold macro public Robert Sanford
@ 2016-08-01 20:42 ` Robert Sanford
  2016-08-23 15:09   ` Olivier MATZ
  2016-11-07 16:02   ` Kulasek, TomaszX
  2016-08-01 20:42 ` [PATCH 4/4] net/bonding: fix configuration of LACP slaves Robert Sanford
  2017-02-08 17:14 ` [PATCH 0/4] net/bonding: bonding and LACP fixes Thomas Monjalon
  4 siblings, 2 replies; 19+ messages in thread
From: Robert Sanford @ 2016-08-01 20:42 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty, pablo.de.lara.guarch, olivier.matz

The following log message may appear after a slave is idle (or nearly
idle) for a few minutes: "PMD: Failed to allocate LACP packet from
pool".

Problem: All mbufs from a slave's private pool (used exclusively for
transmitting LACPDUs) have been allocated and are still sitting in
the device's tx descriptor ring and other cores' mempool caches.

Solution: Ensure that each slaves' tx (LACPDU) mempool owns more than
n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
mbufs.

Note that the LACP tx machine function is the only code that allocates
from a slave's private pool. It runs in the context of the interrupt
thread, and thus it has no mempool cache of its own.

Signed-off-by: Robert Sanford <rsanford@akamai.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 2f7ae70..1207896 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
 	char mem_name[RTE_ETH_NAME_MAX_LEN];
 	int socket_id;
 	unsigned element_size;
+	unsigned cache_size;
+	unsigned cache_flushthresh;
 	uint32_t total_tx_desc;
 	struct bond_tx_queue *bd_tx_q;
 	uint16_t q_id;
@@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
 
 	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct rte_mbuf)
 				+ RTE_PKTMBUF_HEADROOM;
+	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
+		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
+	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
 
 	/* The size of the mempool should be at least:
 	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
 	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
 	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
 		bd_tx_q = (struct bond_tx_queue*)bond_dev->data->tx_queues[q_id];
-		total_tx_desc += bd_tx_q->nb_tx_desc;
+		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
 	}
 
 	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
 	port->mbuf_pool = rte_mempool_create(mem_name,
-		total_tx_desc, element_size,
-		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
+		total_tx_desc, element_size, cache_size,
 		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
 		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] net/bonding: fix configuration of LACP slaves
  2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
                   ` (2 preceding siblings ...)
  2016-08-01 20:42 ` [PATCH 3/4] net/bonding: another fix to LACP mempool size Robert Sanford
@ 2016-08-01 20:42 ` Robert Sanford
  2016-11-07 16:03   ` Kulasek, TomaszX
  2017-02-08 17:14 ` [PATCH 0/4] net/bonding: bonding and LACP fixes Thomas Monjalon
  4 siblings, 1 reply; 19+ messages in thread
From: Robert Sanford @ 2016-08-01 20:42 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty, pablo.de.lara.guarch, olivier.matz

Problem: When adding a slave or starting a bond device, the bond
device configures slave devices via function slave_configure().
However, settings configured in the bond device's rte_eth_conf are
not propagated to the slaves. For example, VLAN and CRC stripping
are not working as expected.

The problem is that we pass the wrong argument when we invoke
rte_eth_dev_configure(). We pass the slave's currently configured
rte_eth_conf (as a source arg!), when we should pass a copy of the
(master) bond device's rte_eth_conf.

Solution: Make a local copy of the bond device's rte_eth_conf, adjust
the LSC flag based on the slave, and then pass that rte_eth_conf to
rte_eth_dev_configure().

Also, remove code that directly pokes RSS data into the slave's
rte_eth_conf, as that is also contained in the proper rte_eth_conf
that we will pass to rte_eth_dev_configure().

Signed-off-by: Robert Sanford <rsanford@akamai.com>
---
 drivers/net/bonding/rte_eth_bond_pmd.c |   28 +++++++---------------------
 1 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index b20a272..486582f 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1302,6 +1302,7 @@ int
 slave_configure(struct rte_eth_dev *bonded_eth_dev,
 		struct rte_eth_dev *slave_eth_dev)
 {
+	struct rte_eth_conf slave_eth_conf;
 	struct bond_rx_queue *bd_rx_q;
 	struct bond_tx_queue *bd_tx_q;
 
@@ -1313,33 +1314,18 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
 	/* Stop slave */
 	rte_eth_dev_stop(slave_eth_dev->data->port_id);
 
-	/* Enable interrupts on slave device if supported */
-	if (slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
-		slave_eth_dev->data->dev_conf.intr_conf.lsc = 1;
-
-	/* If RSS is enabled for bonding, try to enable it for slaves  */
-	if (bonded_eth_dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG) {
-		if (bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len
-				!= 0) {
-			slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len =
-					bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len;
-			slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key =
-					bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key;
-		} else {
-			slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key = NULL;
-		}
+	/* Build slave rte_eth_conf, starting from bonded's conf */
+	slave_eth_conf = bonded_eth_dev->data->dev_conf;
 
-		slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf =
-				bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf;
-		slave_eth_dev->data->dev_conf.rxmode.mq_mode =
-				bonded_eth_dev->data->dev_conf.rxmode.mq_mode;
-	}
+	/* Enable interrupts on slave device if supported */
+	slave_eth_conf.intr_conf.lsc =
+		!!(slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC);
 
 	/* Configure device */
 	errval = rte_eth_dev_configure(slave_eth_dev->data->port_id,
 			bonded_eth_dev->data->nb_rx_queues,
 			bonded_eth_dev->data->nb_tx_queues,
-			&(slave_eth_dev->data->dev_conf));
+			&slave_eth_conf);
 	if (errval != 0) {
 		RTE_BOND_LOG(ERR, "Cannot configure slave device: port %u , err (%d)",
 				slave_eth_dev->data->port_id, errval);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] mempool: make cache flush threshold macro public
  2016-08-01 20:42 ` [PATCH 2/4] mempool: make cache flush threshold macro public Robert Sanford
@ 2016-08-23 15:09   ` Olivier MATZ
  2016-08-23 16:07     ` Sanford, Robert
  0 siblings, 1 reply; 19+ messages in thread
From: Olivier MATZ @ 2016-08-23 15:09 UTC (permalink / raw)
  To: Robert Sanford, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Robert,

On 08/01/2016 10:42 PM, Robert Sanford wrote:
> Rename macros that calculate a mempool cache flush threshold, and
> move them from rte_mempool.c to rte_mempool.h, so that the bonding
> driver can accurately calculate its mbuf requirements.
>
> Signed-off-by: Robert Sanford <rsanford@akamai.com>
> ---
>   lib/librte_mempool/rte_mempool.c |    8 ++------
>   lib/librte_mempool/rte_mempool.h |    7 +++++++
>   2 files changed, 9 insertions(+), 6 deletions(-)
>
> [...]
>
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -263,6 +263,13 @@ struct rte_mempool {
>   #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
>
>   /**
> + * Calculate the threshold before we flush excess elements.
> + */
> +#define RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER 1.5
> +#define RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(c)	\
> +	((typeof(c))((c) * RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER))
> +
> +/**
>    * @internal When debug is enabled, store some statistics.
>    *
>    * @param mp
>

What do you think of using a static inline function instead of a macro ?


Regards,
Olivier

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] net/bonding: another fix to LACP mempool size
  2016-08-01 20:42 ` [PATCH 3/4] net/bonding: another fix to LACP mempool size Robert Sanford
@ 2016-08-23 15:09   ` Olivier MATZ
  2016-08-23 20:01     ` Sanford, Robert
  2016-11-07 16:02   ` Kulasek, TomaszX
  1 sibling, 1 reply; 19+ messages in thread
From: Olivier MATZ @ 2016-08-23 15:09 UTC (permalink / raw)
  To: Robert Sanford, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Robert,

On 08/01/2016 10:42 PM, Robert Sanford wrote:
> The following log message may appear after a slave is idle (or nearly
> idle) for a few minutes: "PMD: Failed to allocate LACP packet from
> pool".
>
> Problem: All mbufs from a slave's private pool (used exclusively for
> transmitting LACPDUs) have been allocated and are still sitting in
> the device's tx descriptor ring and other cores' mempool caches.
>
> Solution: Ensure that each slaves' tx (LACPDU) mempool owns more than
> n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
> mbufs.
>
> Note that the LACP tx machine function is the only code that allocates
> from a slave's private pool. It runs in the context of the interrupt
> thread, and thus it has no mempool cache of its own.
>
> Signed-off-by: Robert Sanford <rsanford@akamai.com>
> ---
>   drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++++---
>   1 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 2f7ae70..1207896 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
>   	char mem_name[RTE_ETH_NAME_MAX_LEN];
>   	int socket_id;
>   	unsigned element_size;
> +	unsigned cache_size;
> +	unsigned cache_flushthresh;
>   	uint32_t total_tx_desc;
>   	struct bond_tx_queue *bd_tx_q;
>   	uint16_t q_id;
> @@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
>
>   	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct rte_mbuf)
>   				+ RTE_PKTMBUF_HEADROOM;
> +	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
> +		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
> +	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
>
>   	/* The size of the mempool should be at least:
>   	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
>   	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
>   	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
>   		bd_tx_q = (struct bond_tx_queue*)bond_dev->data->tx_queues[q_id];
> -		total_tx_desc += bd_tx_q->nb_tx_desc;
> +		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
>   	}
>
>   	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
>   	port->mbuf_pool = rte_mempool_create(mem_name,
> -		total_tx_desc, element_size,
> -		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
> +		total_tx_desc, element_size, cache_size,
>   		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
>   		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
>
>

I'm not very familiar with bonding code, so maybe your patch is correct. 
I think the size of the mempool should be:

   BOND_MODE_8023AX_SLAVE_TX_PKTS +
     n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)

With n_cores = number of cores that can dequeue from the mempool.

The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't 
know if bond_dev->data->nb_tx_queue corresponds to this definition, if 
yes you can ignore my comment ;)


Regards,
Olivier

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] mempool: make cache flush threshold macro public
  2016-08-23 15:09   ` Olivier MATZ
@ 2016-08-23 16:07     ` Sanford, Robert
  2016-08-24 16:15       ` Olivier MATZ
  0 siblings, 1 reply; 19+ messages in thread
From: Sanford, Robert @ 2016-08-23 16:07 UTC (permalink / raw)
  To: Olivier MATZ, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Olivier,

On 8/23/16, 11:09 AM, "Olivier MATZ" <olivier.matz@6wind.com> wrote:

    Hi Robert,
    
    On 08/01/2016 10:42 PM, Robert Sanford wrote:
    > Rename macros that calculate a mempool cache flush threshold, and
    > move them from rte_mempool.c to rte_mempool.h, so that the bonding
    > driver can accurately calculate its mbuf requirements.
    >
    > Signed-off-by: Robert Sanford <rsanford@akamai.com>
    > ---
    >   lib/librte_mempool/rte_mempool.c |    8 ++------
    >   lib/librte_mempool/rte_mempool.h |    7 +++++++
    >   2 files changed, 9 insertions(+), 6 deletions(-)
    >
    > [...]
    >
    > --- a/lib/librte_mempool/rte_mempool.h
    > +++ b/lib/librte_mempool/rte_mempool.h
    > @@ -263,6 +263,13 @@ struct rte_mempool {
    >   #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
    >
    >   /**
    > + * Calculate the threshold before we flush excess elements.
    > + */
    > +#define RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER 1.5
    > +#define RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(c)	\
    > +	((typeof(c))((c) * RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER))
    > +
    > +/**
    >    * @internal When debug is enabled, store some statistics.
    >    *
    >    * @param mp
    >
    
    What do you think of using a static inline function instead of a macro ?
    
    
    Regards,
    Olivier
--

Yes, an inline function is better than a macro. We still need to move the #define MULTIPLIER from rte_mempool.c to rte_mempool.h.

How is this for the prototype?
static inline unsigned rte_mempool_calc_cache_flushthresh(unsigned cache_size)

Where in the .h should we place the function, right below the MULTIPLIER definition?

Regards,
Robert



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] net/bonding: another fix to LACP mempool size
  2016-08-23 15:09   ` Olivier MATZ
@ 2016-08-23 20:01     ` Sanford, Robert
  2016-08-24 16:14       ` Olivier MATZ
  0 siblings, 1 reply; 19+ messages in thread
From: Sanford, Robert @ 2016-08-23 20:01 UTC (permalink / raw)
  To: Olivier MATZ, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Olivier,

On 8/23/16, 11:09 AM, "Olivier MATZ" <olivier.matz@6wind.com> wrote:

    Hi Robert,
    
    On 08/01/2016 10:42 PM, Robert Sanford wrote:
    > The following log message may appear after a slave is idle (or nearly
    > idle) for a few minutes: "PMD: Failed to allocate LACP packet from
    > pool".
    >
    > Problem: All mbufs from a slave's private pool (used exclusively for
    > transmitting LACPDUs) have been allocated and are still sitting in
    > the device's tx descriptor ring and other cores' mempool caches.
    >
    > Solution: Ensure that each slaves' tx (LACPDU) mempool owns more than
    > n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
    > mbufs.
    >
    > Note that the LACP tx machine function is the only code that allocates
    > from a slave's private pool. It runs in the context of the interrupt
    > thread, and thus it has no mempool cache of its own.
    >
    > Signed-off-by: Robert Sanford <rsanford@akamai.com>
    > ---
    >   drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++++---
    >   1 files changed, 7 insertions(+), 3 deletions(-)
    >
    > diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
    > index 2f7ae70..1207896 100644
    > --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
    > +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
    > @@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
    >   	char mem_name[RTE_ETH_NAME_MAX_LEN];
    >   	int socket_id;
    >   	unsigned element_size;
    > +	unsigned cache_size;
    > +	unsigned cache_flushthresh;
    >   	uint32_t total_tx_desc;
    >   	struct bond_tx_queue *bd_tx_q;
    >   	uint16_t q_id;
    > @@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
    >
    >   	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct rte_mbuf)
    >   				+ RTE_PKTMBUF_HEADROOM;
    > +	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
    > +		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
    > +	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
    >
    >   	/* The size of the mempool should be at least:
    >   	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
    >   	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
    >   	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
    >   		bd_tx_q = (struct bond_tx_queue*)bond_dev->data->tx_queues[q_id];
    > -		total_tx_desc += bd_tx_q->nb_tx_desc;
    > +		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
    >   	}
    >
    >   	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
    >   	port->mbuf_pool = rte_mempool_create(mem_name,
    > -		total_tx_desc, element_size,
    > -		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
    > +		total_tx_desc, element_size, cache_size,
    >   		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
    >   		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
    >
    >
    
    I'm not very familiar with bonding code, so maybe your patch is correct. 
    I think the size of the mempool should be:
    
       BOND_MODE_8023AX_SLAVE_TX_PKTS +
         n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)
    
    With n_cores = number of cores that can dequeue from the mempool.
    
    The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't 
    know if bond_dev->data->nb_tx_queue corresponds to this definition, if 
    yes you can ignore my comment ;)
    
    
    Regards,
    Olivier

--

Only one, non-EAL thread dequeues from a per-slave, private (LACPDU) mempool. The thread calls mbuf-alloc in the context of an rte_eal_alarm_set( ) callback function, and enqueues the mbuf to a multi-consumer ring.

The consumer(s) of the mbufs can be any core(s) that invoke the bond PMD’s tx-burst function. As we know, mbufs can sit around on TX descriptor rings long after we transmit them, often until we need to reuse the descriptors for subsequent TX packets. When the TX cores finally free some of the mbufs, some of the mbufs get left in a pool’s per-core cache. So, the maximum number of mbufs that may be lying around doing nothing is: the “done” mbufs in the TX desc ring and in the per-core cache, times the number of TX cores, or approximately nb_tx_queues * (nb_tx_desc + CACHE_FLUSHTHRESH(cache_size)). (I probably should also update the comment above the for-loop that calculates total_tx_desc.)

When we include normal TX mbufs and private LACPDU mbufs displacing each other in the TX desc ring, there are many (if not infinite) possibilities. I wrote a program to simulate all this interaction, trying different values for cache_size, nb_tx_desc, etc., because it took a long time to run out of LACPDU mbufs (or it never happened) just using long-interval pings.

Anyway, I hope that this is a nice, long way of saying that I will ignore your comment. ;-)


Another idea, related to mempools (but not this LACP patch) ...
Back when I first investigated the LACP pool problem, I got an idea related to the mempool per-core caches. It seems like a waste that in some usage patterns (like above), a core may “put” objects to a pool, but rarely/never “get” objects from the pool, the result being that it rarely/never reuses objects stuck in the per-core cache. We could recognize this kind of behavior by adding a small counter to the rte_mempool_cache struct. During a put, we increment the counter, and during a get, we set the counter to zero. If, during a put that is going to flush some of the objects to the global part of the pool, the counter is at or above some threshold (8, 16, 32, or whatever), then we flush ALL of the objects, since this core hasn’t been using them. What do you think?


Regards,
Robert



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] net/bonding: another fix to LACP mempool size
  2016-08-23 20:01     ` Sanford, Robert
@ 2016-08-24 16:14       ` Olivier MATZ
  0 siblings, 0 replies; 19+ messages in thread
From: Olivier MATZ @ 2016-08-24 16:14 UTC (permalink / raw)
  To: Sanford, Robert, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Robert,

On 08/23/2016 10:01 PM, Sanford, Robert wrote:
> Hi Olivier,
>
> On 8/23/16, 11:09 AM, "Olivier MATZ" <olivier.matz@6wind.com> wrote:
>
>      Hi Robert,
>
>      On 08/01/2016 10:42 PM, Robert Sanford wrote:
>      > The following log message may appear after a slave is idle (or nearly
>      > idle) for a few minutes: "PMD: Failed to allocate LACP packet from
>      > pool".
>      >
>      > Problem: All mbufs from a slave's private pool (used exclusively for
>      > transmitting LACPDUs) have been allocated and are still sitting in
>      > the device's tx descriptor ring and other cores' mempool caches.
>      >
>      > Solution: Ensure that each slaves' tx (LACPDU) mempool owns more than
>      > n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
>      > mbufs.
>      >
>      > Note that the LACP tx machine function is the only code that allocates
>      > from a slave's private pool. It runs in the context of the interrupt
>      > thread, and thus it has no mempool cache of its own.
>      >
>      > Signed-off-by: Robert Sanford <rsanford@akamai.com>
>      > ---
>      >   drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++++---
>      >   1 files changed, 7 insertions(+), 3 deletions(-)
>      >
>      > diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
>      > index 2f7ae70..1207896 100644
>      > --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
>      > +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
>      > @@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
>      >   	char mem_name[RTE_ETH_NAME_MAX_LEN];
>      >   	int socket_id;
>      >   	unsigned element_size;
>      > +	unsigned cache_size;
>      > +	unsigned cache_flushthresh;
>      >   	uint32_t total_tx_desc;
>      >   	struct bond_tx_queue *bd_tx_q;
>      >   	uint16_t q_id;
>      > @@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
>      >
>      >   	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct rte_mbuf)
>      >   				+ RTE_PKTMBUF_HEADROOM;
>      > +	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
>      > +		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
>      > +	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
>      >
>      >   	/* The size of the mempool should be at least:
>      >   	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
>      >   	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
>      >   	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
>      >   		bd_tx_q = (struct bond_tx_queue*)bond_dev->data->tx_queues[q_id];
>      > -		total_tx_desc += bd_tx_q->nb_tx_desc;
>      > +		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
>      >   	}
>      >
>      >   	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
>      >   	port->mbuf_pool = rte_mempool_create(mem_name,
>      > -		total_tx_desc, element_size,
>      > -		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
>      > +		total_tx_desc, element_size, cache_size,
>      >   		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
>      >   		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
>      >
>      >
>
>      I'm not very familiar with bonding code, so maybe your patch is correct.
>      I think the size of the mempool should be:
>
>         BOND_MODE_8023AX_SLAVE_TX_PKTS +
>           n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)
>
>      With n_cores = number of cores that can dequeue from the mempool.
>
>      The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't
>      know if bond_dev->data->nb_tx_queue corresponds to this definition, if
>      yes you can ignore my comment ;)
>
>
>      Regards,
>      Olivier
>
> --
>
> Only one, non-EAL thread dequeues from a per-slave, private (LACPDU) mempool. The thread calls mbuf-alloc in the context of an rte_eal_alarm_set( ) callback function, and enqueues the mbuf to a multi-consumer ring.
>
> The consumer(s) of the mbufs can be any core(s) that invoke the bond PMD’s tx-burst function. As we know, mbufs can sit around on TX descriptor rings long after we transmit them, often until we need to reuse the descriptors for subsequent TX packets. When the TX cores finally free some of the mbufs, some of the mbufs get left in a pool’s per-core cache. So, the maximum number of mbufs that may be lying around doing nothing is: the “done” mbufs in the TX desc ring and in the per-core cache, times the number of TX cores, or approximately nb_tx_queues * (nb_tx_desc + CACHE_FLUSHTHRESH(cache_size)). (I probably should also update the comment above the for-loop that calculates total_tx_desc.)
>
> When we include normal TX mbufs and private LACPDU mbufs displacing each other in the TX desc ring, there are many (if not infinite) possibilities. I wrote a program to simulate all this interaction, trying different values for cache_size, nb_tx_desc, etc., because it took a long time to run out of LACPDU mbufs (or it never happened) just using long-interval pings.
>
> Anyway, I hope that this is a nice, long way of saying that I will ignore your comment. ;-)

OK, thanks for the detailed explanation.

> Another idea, related to mempools (but not this LACP patch) ...
> Back when I first investigated the LACP pool problem, I got an idea related to the mempool per-core caches. It seems like a waste that in some usage patterns (like above), a core may “put” objects to a pool, but rarely/never “get” objects from the pool, the result being that it rarely/never reuses objects stuck in the per-core cache. We could recognize this kind of behavior by adding a small counter to the rte_mempool_cache struct. During a put, we increment the counter, and during a get, we set the counter to zero. If, during a put that is going to flush some of the objects to the global part of the pool, the counter is at or above some threshold (8, 16, 32, or whatever), then we flush ALL of the objects, since this core hasn’t been using them. What do you think?

While I understand the need, it looks a bit complicated to me.

When I read your first patch, I was thinking about automatically 
increase the size of the pool in mempool_create(), according to the 
number of cores and size of the cache. But it would have too many drawbacks:
  - waste of memory in many cases, especially for large objects
  - would not take the external caches (new 16.07 feature) in account
  - behavior change

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] mempool: make cache flush threshold macro public
  2016-08-23 16:07     ` Sanford, Robert
@ 2016-08-24 16:15       ` Olivier MATZ
  0 siblings, 0 replies; 19+ messages in thread
From: Olivier MATZ @ 2016-08-24 16:15 UTC (permalink / raw)
  To: Sanford, Robert, dev; +Cc: declan.doherty, pablo.de.lara.guarch

Hi Robert,

On 08/23/2016 06:07 PM, Sanford, Robert wrote:
> Hi Olivier,
>
> On 8/23/16, 11:09 AM, "Olivier MATZ" <olivier.matz@6wind.com> wrote:
>
>      Hi Robert,
>
>      On 08/01/2016 10:42 PM, Robert Sanford wrote:
>      > Rename macros that calculate a mempool cache flush threshold, and
>      > move them from rte_mempool.c to rte_mempool.h, so that the bonding
>      > driver can accurately calculate its mbuf requirements.
>      >
>      > Signed-off-by: Robert Sanford <rsanford@akamai.com>
>      > ---
>      >   lib/librte_mempool/rte_mempool.c |    8 ++------
>      >   lib/librte_mempool/rte_mempool.h |    7 +++++++
>      >   2 files changed, 9 insertions(+), 6 deletions(-)
>      >
>      > [...]
>      >
>      > --- a/lib/librte_mempool/rte_mempool.h
>      > +++ b/lib/librte_mempool/rte_mempool.h
>      > @@ -263,6 +263,13 @@ struct rte_mempool {
>      >   #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
>      >
>      >   /**
>      > + * Calculate the threshold before we flush excess elements.
>      > + */
>      > +#define RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER 1.5
>      > +#define RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(c)	\
>      > +	((typeof(c))((c) * RTE_MEMPOOL_CACHE_FLUSHTHRESH_MULTIPLIER))
>      > +
>      > +/**
>      >    * @internal When debug is enabled, store some statistics.
>      >    *
>      >    * @param mp
>      >
>
>      What do you think of using a static inline function instead of a macro ?
>
>
>      Regards,
>      Olivier
> --
>
> Yes, an inline function is better than a macro. We still need to move the #define MULTIPLIER from rte_mempool.c to rte_mempool.h.
>
> How is this for the prototype?
> static inline unsigned rte_mempool_calc_cache_flushthresh(unsigned cache_size)
>
> Where in the .h should we place the function, right below the MULTIPLIER definition?

Yep, looks good, thanks.

Maybe "unsigned" -> "unsigned int", because checkpatch may complain...


Regards,
Olivier

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] net/bonding: another fix to LACP mempool size
  2016-08-01 20:42 ` [PATCH 3/4] net/bonding: another fix to LACP mempool size Robert Sanford
  2016-08-23 15:09   ` Olivier MATZ
@ 2016-11-07 16:02   ` Kulasek, TomaszX
  1 sibling, 0 replies; 19+ messages in thread
From: Kulasek, TomaszX @ 2016-11-07 16:02 UTC (permalink / raw)
  To: Robert Sanford, dev; +Cc: Doherty, Declan, De Lara Guarch, Pablo, olivier.matz



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Robert Sanford
> Sent: Monday, August 1, 2016 22:43
> To: dev@dpdk.org
> Cc: Doherty, Declan <declan.doherty@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; olivier.matz@6wind.com
> Subject: [dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool
> size
> 
> The following log message may appear after a slave is idle (or nearly
> idle) for a few minutes: "PMD: Failed to allocate LACP packet from pool".
> 
> Problem: All mbufs from a slave's private pool (used exclusively for
> transmitting LACPDUs) have been allocated and are still sitting in the
> device's tx descriptor ring and other cores' mempool caches.
> 
> Solution: Ensure that each slaves' tx (LACPDU) mempool owns more than n-
> tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold) mbufs.
> 
> Note that the LACP tx machine function is the only code that allocates
> from a slave's private pool. It runs in the context of the interrupt
> thread, and thus it has no mempool cache of its own.
> 
> Signed-off-by: Robert Sanford <rsanford@akamai.com>
> ---
>  drivers/net/bonding/rte_eth_bond_8023ad.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c
> b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 2f7ae70..1207896 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev
> *bond_dev, uint8_t slave_id)
>  	char mem_name[RTE_ETH_NAME_MAX_LEN];
>  	int socket_id;
>  	unsigned element_size;
> +	unsigned cache_size;
> +	unsigned cache_flushthresh;
>  	uint32_t total_tx_desc;
>  	struct bond_tx_queue *bd_tx_q;
>  	uint16_t q_id;
> @@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev
> *bond_dev, uint8_t slave_id)
> 
>  	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct
> rte_mbuf)
>  				+ RTE_PKTMBUF_HEADROOM;
> +	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
> +		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
> +	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
> 
>  	/* The size of the mempool should be at least:
>  	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
>  	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
>  	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
>  		bd_tx_q = (struct bond_tx_queue*)bond_dev->data-
> >tx_queues[q_id];
> -		total_tx_desc += bd_tx_q->nb_tx_desc;
> +		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
>  	}
> 
>  	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool",
> slave_id);
>  	port->mbuf_pool = rte_mempool_create(mem_name,
> -		total_tx_desc, element_size,
> -		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 :
> RTE_MEMPOOL_CACHE_MAX_SIZE,
> +		total_tx_desc, element_size, cache_size,
>  		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
>  		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
> 
> --
> 1.7.1

Reviewed-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] net/bonding: fix configuration of LACP slaves
  2016-08-01 20:42 ` [PATCH 4/4] net/bonding: fix configuration of LACP slaves Robert Sanford
@ 2016-11-07 16:03   ` Kulasek, TomaszX
  0 siblings, 0 replies; 19+ messages in thread
From: Kulasek, TomaszX @ 2016-11-07 16:03 UTC (permalink / raw)
  To: Robert Sanford, dev; +Cc: Doherty, Declan, De Lara Guarch, Pablo, olivier.matz



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Robert Sanford
> Sent: Monday, August 1, 2016 22:43
> To: dev@dpdk.org
> Cc: Doherty, Declan <declan.doherty@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; olivier.matz@6wind.com
> Subject: [dpdk-dev] [PATCH 4/4] net/bonding: fix configuration of LACP
> slaves
> 
> Problem: When adding a slave or starting a bond device, the bond device
> configures slave devices via function slave_configure().
> However, settings configured in the bond device's rte_eth_conf are not
> propagated to the slaves. For example, VLAN and CRC stripping are not
> working as expected.
> 
> The problem is that we pass the wrong argument when we invoke
> rte_eth_dev_configure(). We pass the slave's currently configured
> rte_eth_conf (as a source arg!), when we should pass a copy of the
> (master) bond device's rte_eth_conf.
> 
> Solution: Make a local copy of the bond device's rte_eth_conf, adjust the
> LSC flag based on the slave, and then pass that rte_eth_conf to
> rte_eth_dev_configure().
> 
> Also, remove code that directly pokes RSS data into the slave's
> rte_eth_conf, as that is also contained in the proper rte_eth_conf that we
> will pass to rte_eth_dev_configure().
> 
> Signed-off-by: Robert Sanford <rsanford@akamai.com>
> ---
>  drivers/net/bonding/rte_eth_bond_pmd.c |   28 +++++++--------------------
> -
>  1 files changed, 7 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> b/drivers/net/bonding/rte_eth_bond_pmd.c
> index b20a272..486582f 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -1302,6 +1302,7 @@ int
>  slave_configure(struct rte_eth_dev *bonded_eth_dev,
>  		struct rte_eth_dev *slave_eth_dev)
>  {
> +	struct rte_eth_conf slave_eth_conf;
>  	struct bond_rx_queue *bd_rx_q;
>  	struct bond_tx_queue *bd_tx_q;
> 
> @@ -1313,33 +1314,18 @@ slave_configure(struct rte_eth_dev
> *bonded_eth_dev,
>  	/* Stop slave */
>  	rte_eth_dev_stop(slave_eth_dev->data->port_id);
> 
> -	/* Enable interrupts on slave device if supported */
> -	if (slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
> -		slave_eth_dev->data->dev_conf.intr_conf.lsc = 1;
> -
> -	/* If RSS is enabled for bonding, try to enable it for slaves  */
> -	if (bonded_eth_dev->data->dev_conf.rxmode.mq_mode &
> ETH_MQ_RX_RSS_FLAG) {
> -		if (bonded_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key_len
> -				!= 0) {
> -			slave_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key_len =
> -					bonded_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key_len;
> -			slave_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key =
> -					bonded_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key;
> -		} else {
> -			slave_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_key = NULL;
> -		}
> +	/* Build slave rte_eth_conf, starting from bonded's conf */
> +	slave_eth_conf = bonded_eth_dev->data->dev_conf;
> 
> -		slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf =
> -				bonded_eth_dev->data-
> >dev_conf.rx_adv_conf.rss_conf.rss_hf;
> -		slave_eth_dev->data->dev_conf.rxmode.mq_mode =
> -				bonded_eth_dev->data->dev_conf.rxmode.mq_mode;
> -	}
> +	/* Enable interrupts on slave device if supported */
> +	slave_eth_conf.intr_conf.lsc =
> +		!!(slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC);
> 
>  	/* Configure device */
>  	errval = rte_eth_dev_configure(slave_eth_dev->data->port_id,
>  			bonded_eth_dev->data->nb_rx_queues,
>  			bonded_eth_dev->data->nb_tx_queues,
> -			&(slave_eth_dev->data->dev_conf));
> +			&slave_eth_conf);
>  	if (errval != 0) {
>  		RTE_BOND_LOG(ERR, "Cannot configure slave device: port %u , err
> (%d)",
>  				slave_eth_dev->data->port_id, errval);
> --
> 1.7.1

Reviewed-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] net/bonding: bonding and LACP fixes
  2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
                   ` (3 preceding siblings ...)
  2016-08-01 20:42 ` [PATCH 4/4] net/bonding: fix configuration of LACP slaves Robert Sanford
@ 2017-02-08 17:14 ` Thomas Monjalon
  2017-03-09 13:19   ` Thomas Monjalon
  4 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-02-08 17:14 UTC (permalink / raw)
  To: Robert Sanford, declan.doherty; +Cc: dev, pablo.de.lara.guarch, olivier.matz

2016-08-01 16:42, Robert Sanford:
> In this patch series, we fix two bonding driver bugs and
> enhance testpmd so that bonding mode 4 (LACP) ports remain
> operational even when idle.

These patches are blocked because we are waiting a v2.

Waiting so long for a fix without having any feedback makes me think
that either the bug is not so important or the PMD is not used.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] net/bonding: bonding and LACP fixes
  2017-02-08 17:14 ` [PATCH 0/4] net/bonding: bonding and LACP fixes Thomas Monjalon
@ 2017-03-09 13:19   ` Thomas Monjalon
  2017-03-09 16:57     ` Declan Doherty
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2017-03-09 13:19 UTC (permalink / raw)
  To: Robert Sanford; +Cc: declan.doherty, dev, pablo.de.lara.guarch, olivier.matz

2017-02-08 18:14, Thomas Monjalon:
> 2016-08-01 16:42, Robert Sanford:
> > In this patch series, we fix two bonding driver bugs and
> > enhance testpmd so that bonding mode 4 (LACP) ports remain
> > operational even when idle.
> 
> These patches are blocked because we are waiting a v2.
> 
> Waiting so long for a fix without having any feedback makes me think
> that either the bug is not so important or the PMD is not used.

Last call: anyone interested in this series?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] net/bonding: bonding and LACP fixes
  2017-03-09 13:19   ` Thomas Monjalon
@ 2017-03-09 16:57     ` Declan Doherty
  0 siblings, 0 replies; 19+ messages in thread
From: Declan Doherty @ 2017-03-09 16:57 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, pablo.de.lara.guarch, olivier.matz

On 09/03/2017 1:19 PM, Thomas Monjalon wrote:
> 2017-02-08 18:14, Thomas Monjalon:
>> 2016-08-01 16:42, Robert Sanford:
>>> In this patch series, we fix two bonding driver bugs and
>>> enhance testpmd so that bonding mode 4 (LACP) ports remain
>>> operational even when idle.
>>
>> These patches are blocked because we are waiting a v2.
>>
>> Waiting so long for a fix without having any feedback makes me think
>> that either the bug is not so important or the PMD is not used.
>
> Last call: anyone interested in this series?
>

Hey Thomas, Robert doesn't seem to be active anymore on the list, so 
I'll set aside some time over the next couple of days and put together a 
v2 of these patches.

Declan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] testpmd: fix LACP ports to work with idle links
  2016-08-01 20:42 ` [PATCH 1/4] testpmd: fix LACP ports to work with idle links Robert Sanford
@ 2017-06-22  1:25   ` Wu, Jingjing
  2017-10-31  1:07     ` Ferruh Yigit
  0 siblings, 1 reply; 19+ messages in thread
From: Wu, Jingjing @ 2017-06-22  1:25 UTC (permalink / raw)
  To: rsanford; +Cc: dev



> -----Original Message-----
> From: rsanford2 at gmail.com
> Sent: Tuesday, August 2, 2016 4:43 AM
> Subject: [dpdk-dev] [PATCH 1/4] testpmd: fix LACP ports to work with idle links
> 
> Problem: When there is little or no TX traffic on an LACP port (bonding mode 4),
> we don't call its tx burst function in a timely manner, and thus we don't
> transmit LACPDUs when we should.
> 
> Solution: Add and maintain an "lacp_master" flag in rte_port struct.
> In the main packet forwarding loop, if port is an LACP master, in 1 out of N
> loops force an empty call to the tx burst API.
> 
> Signed-off-by: Robert Sanford <rsanford at akamai.com>
> ---
>  app/test-pmd/cmdline.c |    9 +++++++++
>  app/test-pmd/testpmd.c |   37 +++++++++++++++++++++++++++++++++++++
>  app/test-pmd/testpmd.h |    4 ++++
>  3 files changed, 50 insertions(+), 0 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> f90befc..2a629ee 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -3975,6 +3975,10 @@ static void cmd_set_bonding_mode_parsed(void
> *parsed_result,
>  	/* Set the bonding mode for the relevant port. */
>  	if (0 != rte_eth_bond_mode_set(port_id, res->value))
>  		printf("\t Failed to set bonding mode for port = %d.\n", port_id);
> +	else if (res->value == BONDING_MODE_8023AD)
> +		set_port_lacp_master(port_id);
> +	else
> +		clear_port_lacp_master(port_id);
>  }
> 
>  cmdline_parse_token_string_t cmd_setbonding_mode_set = @@ -4408,6
> +4412,11 @@ static void cmd_create_bonded_device_parsed(void
> *parsed_result,
>  		reconfig(port_id, res->socket);
>  		rte_eth_promiscuous_enable(port_id);
>  		ports[port_id].enabled = 1;
> +
> +		if (res->mode == BONDING_MODE_8023AD)
> +			set_port_lacp_master(port_id);
> +		else
> +			clear_port_lacp_master(port_id);
>  	}
> 
>  }
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> 1428974..806667e 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -920,12 +920,28 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
> packet_fwd_t pkt_fwd)
>  	struct fwd_stream **fsm;
>  	streamid_t nb_fs;
>  	streamid_t sm_id;
> +	unsigned int loop_count = 0;
> 
>  	fsm = &fwd_streams[fc->stream_idx];
>  	nb_fs = fc->stream_nb;
>  	do {
>  		for (sm_id = 0; sm_id < nb_fs; sm_id++)
>  			(*pkt_fwd)(fsm[sm_id]);
> +
> +		/*
> +		 * Per rte_eth_bond.h, we must invoke LACP master's tx
> +		 * burst function at least once every 100 ms.
> +		 */
> +		loop_count++;
> +		if (likely(loop_count % 1024 != 0))
> +			continue;
> +		for (sm_id = 0; sm_id < nb_fs; sm_id++) {
> +			struct fwd_stream *fs = fsm[sm_id];
> +
> +			if (port_is_lacp_master(fs->tx_port))
> +				rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> +						 NULL, 0);
> +		}
By add tx_burst here will change default logic. Some fwd engine are usually used to measure
Performance . Additional cycle in fast path may impact that. It would be better to move additional
sending to your fwd engine.
>  	} while (! fc->stopped);
>  }
> 
> @@ -1881,6 +1897,27 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
>  	return port->slave_flag;
>  }
> 
> +void set_port_lacp_master(portid_t pid) {
> +	struct rte_port *port = &ports[pid];
> +
You may need to check the pid is valid or not.
And the function is simple, could you just assign the lacp_master
in the caller instead of defining a new one?
> +	port->lacp_master = 1;
> +}
> +
> +void clear_port_lacp_master(portid_t pid) {
> +	struct rte_port *port = &ports[pid];
> +
Same comment as above.
> +	port->lacp_master = 0;
> +}
> +
> +uint8_t port_is_lacp_master(portid_t pid) {
> +	struct rte_port *port = &ports[pid];
> +
> +	return port->lacp_master;
> +}
> +
>  const uint16_t vlan_tags[] = {
>  		0,  1,  2,  3,  4,  5,  6,  7,
>  		8,  9, 10, 11,  12, 13, 14, 15,
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index
> 2b281cc..0898194 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -170,6 +170,7 @@ struct rte_port {
>  	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
>  	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
>  	uint8_t                 slave_flag; /**< bonding slave port */
> +	uint8_t                 lacp_master;/**< bonding LACP master */
>  };
> 
>  extern portid_t __rte_unused
> @@ -542,6 +543,9 @@ void init_port_config(void);  void
> set_port_slave_flag(portid_t slave_pid);  void clear_port_slave_flag(portid_t
> slave_pid);  uint8_t port_is_bonding_slave(portid_t slave_pid);
> +void set_port_lacp_master(portid_t pid); void
> +clear_port_lacp_master(portid_t pid); uint8_t
> +port_is_lacp_master(portid_t pid);
> 
>  int init_port_dcb_config(portid_t pid, enum dcb_mode_enable dcb_mode,
>  		     enum rte_eth_nb_tcs num_tcs,
> --
> 1.7.1
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] testpmd: fix LACP ports to work with idle links
  2017-06-22  1:25   ` Wu, Jingjing
@ 2017-10-31  1:07     ` Ferruh Yigit
  2017-11-01 20:06       ` Ferruh Yigit
  0 siblings, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2017-10-31  1:07 UTC (permalink / raw)
  To: Robert Sanford
  Cc: Jingjing Wu, declan.doherty, pablo.de.lara.guarch, olivier.matz, DPDK

On 6/21/2017 6:25 PM, jingjing.wu at intel.com (Wu, Jingjing) wrote:
>> -----Original Message-----
>> From: rsanford2 at gmail.com
>> Sent: Tuesday, August 2, 2016 4:43 AM
>> Subject: [dpdk-dev] [PATCH 1/4] testpmd: fix LACP ports to work with idle links
>>
>> Problem: When there is little or no TX traffic on an LACP port (bonding mode 4),
>> we don't call its tx burst function in a timely manner, and thus we don't
>> transmit LACPDUs when we should.
>>
>> Solution: Add and maintain an "lacp_master" flag in rte_port struct.
>> In the main packet forwarding loop, if port is an LACP master, in 1 out of N
>> loops force an empty call to the tx burst API.
>>
>> Signed-off-by: Robert Sanford <rsanford at akamai.com>
>> ---
>>  app/test-pmd/cmdline.c |    9 +++++++++
>>  app/test-pmd/testpmd.c |   37 +++++++++++++++++++++++++++++++++++++
>>  app/test-pmd/testpmd.h |    4 ++++
>>  3 files changed, 50 insertions(+), 0 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
>> f90befc..2a629ee 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -3975,6 +3975,10 @@ static void cmd_set_bonding_mode_parsed(void
>> *parsed_result,
>>  	/* Set the bonding mode for the relevant port. */
>>  	if (0 != rte_eth_bond_mode_set(port_id, res->value))
>>  		printf("\t Failed to set bonding mode for port = %d.\n", port_id);
>> +	else if (res->value == BONDING_MODE_8023AD)
>> +		set_port_lacp_master(port_id);
>> +	else
>> +		clear_port_lacp_master(port_id);
>>  }
>>
>>  cmdline_parse_token_string_t cmd_setbonding_mode_set = @@ -4408,6
>> +4412,11 @@ static void cmd_create_bonded_device_parsed(void
>> *parsed_result,
>>  		reconfig(port_id, res->socket);
>>  		rte_eth_promiscuous_enable(port_id);
>>  		ports[port_id].enabled = 1;
>> +
>> +		if (res->mode == BONDING_MODE_8023AD)
>> +			set_port_lacp_master(port_id);
>> +		else
>> +			clear_port_lacp_master(port_id);
>>  	}
>>
>>  }
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>> 1428974..806667e 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -920,12 +920,28 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
>> packet_fwd_t pkt_fwd)
>>  	struct fwd_stream **fsm;
>>  	streamid_t nb_fs;
>>  	streamid_t sm_id;
>> +	unsigned int loop_count = 0;
>>
>>  	fsm = &fwd_streams[fc->stream_idx];
>>  	nb_fs = fc->stream_nb;
>>  	do {
>>  		for (sm_id = 0; sm_id < nb_fs; sm_id++)
>>  			(*pkt_fwd)(fsm[sm_id]);
>> +
>> +		/*
>> +		 * Per rte_eth_bond.h, we must invoke LACP master's tx
>> +		 * burst function at least once every 100 ms.
>> +		 */
>> +		loop_count++;
>> +		if (likely(loop_count % 1024 != 0))
>> +			continue;
>> +		for (sm_id = 0; sm_id < nb_fs; sm_id++) {
>> +			struct fwd_stream *fs = fsm[sm_id];
>> +
>> +			if (port_is_lacp_master(fs->tx_port))
>> +				rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>> +						 NULL, 0);
>> +		}
> By add tx_burst here will change default logic. Some fwd engine are usually used to measure
> Performance . Additional cycle in fast path may impact that. It would be better to move additional
> sending to your fwd engine.

Hi Robert,

This patchset is older than a year and not responded back to latest comments.

Is this still a valid patch?
Do you have any objection to remove this from patchwork?

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] testpmd: fix LACP ports to work with idle links
  2017-10-31  1:07     ` Ferruh Yigit
@ 2017-11-01 20:06       ` Ferruh Yigit
  0 siblings, 0 replies; 19+ messages in thread
From: Ferruh Yigit @ 2017-11-01 20:06 UTC (permalink / raw)
  To: Robert Sanford
  Cc: Jingjing Wu, declan.doherty, pablo.de.lara.guarch, olivier.matz, DPDK

On 10/30/2017 6:07 PM, Ferruh Yigit wrote:
> On 6/21/2017 6:25 PM, jingjing.wu at intel.com (Wu, Jingjing) wrote:
>>> -----Original Message-----
>>> From: rsanford2 at gmail.com
>>> Sent: Tuesday, August 2, 2016 4:43 AM
>>> Subject: [dpdk-dev] [PATCH 1/4] testpmd: fix LACP ports to work with idle links
>>>
>>> Problem: When there is little or no TX traffic on an LACP port (bonding mode 4),
>>> we don't call its tx burst function in a timely manner, and thus we don't
>>> transmit LACPDUs when we should.
>>>
>>> Solution: Add and maintain an "lacp_master" flag in rte_port struct.
>>> In the main packet forwarding loop, if port is an LACP master, in 1 out of N
>>> loops force an empty call to the tx burst API.
>>>
>>> Signed-off-by: Robert Sanford <rsanford at akamai.com>
>>> ---
>>>  app/test-pmd/cmdline.c |    9 +++++++++
>>>  app/test-pmd/testpmd.c |   37 +++++++++++++++++++++++++++++++++++++
>>>  app/test-pmd/testpmd.h |    4 ++++
>>>  3 files changed, 50 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
>>> f90befc..2a629ee 100644
>>> --- a/app/test-pmd/cmdline.c
>>> +++ b/app/test-pmd/cmdline.c
>>> @@ -3975,6 +3975,10 @@ static void cmd_set_bonding_mode_parsed(void
>>> *parsed_result,
>>>  	/* Set the bonding mode for the relevant port. */
>>>  	if (0 != rte_eth_bond_mode_set(port_id, res->value))
>>>  		printf("\t Failed to set bonding mode for port = %d.\n", port_id);
>>> +	else if (res->value == BONDING_MODE_8023AD)
>>> +		set_port_lacp_master(port_id);
>>> +	else
>>> +		clear_port_lacp_master(port_id);
>>>  }
>>>
>>>  cmdline_parse_token_string_t cmd_setbonding_mode_set = @@ -4408,6
>>> +4412,11 @@ static void cmd_create_bonded_device_parsed(void
>>> *parsed_result,
>>>  		reconfig(port_id, res->socket);
>>>  		rte_eth_promiscuous_enable(port_id);
>>>  		ports[port_id].enabled = 1;
>>> +
>>> +		if (res->mode == BONDING_MODE_8023AD)
>>> +			set_port_lacp_master(port_id);
>>> +		else
>>> +			clear_port_lacp_master(port_id);
>>>  	}
>>>
>>>  }
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>>> 1428974..806667e 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -920,12 +920,28 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
>>> packet_fwd_t pkt_fwd)
>>>  	struct fwd_stream **fsm;
>>>  	streamid_t nb_fs;
>>>  	streamid_t sm_id;
>>> +	unsigned int loop_count = 0;
>>>
>>>  	fsm = &fwd_streams[fc->stream_idx];
>>>  	nb_fs = fc->stream_nb;
>>>  	do {
>>>  		for (sm_id = 0; sm_id < nb_fs; sm_id++)
>>>  			(*pkt_fwd)(fsm[sm_id]);
>>> +
>>> +		/*
>>> +		 * Per rte_eth_bond.h, we must invoke LACP master's tx
>>> +		 * burst function at least once every 100 ms.
>>> +		 */
>>> +		loop_count++;
>>> +		if (likely(loop_count % 1024 != 0))
>>> +			continue;
>>> +		for (sm_id = 0; sm_id < nb_fs; sm_id++) {
>>> +			struct fwd_stream *fs = fsm[sm_id];
>>> +
>>> +			if (port_is_lacp_master(fs->tx_port))
>>> +				rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>>> +						 NULL, 0);
>>> +		}
>> By add tx_burst here will change default logic. Some fwd engine are usually used to measure
>> Performance . Additional cycle in fast path may impact that. It would be better to move additional
>> sending to your fwd engine.
> 
> Hi Robert,
> 
> This patchset is older than a year and not responded back to latest comments.
> 
> Is this still a valid patch?
> Do you have any objection to remove this from patchwork?

Patches are updated as rejected.

> 
> Thanks,
> ferruh
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-11-01 20:06 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-01 20:42 [PATCH 0/4] net/bonding: bonding and LACP fixes Robert Sanford
2016-08-01 20:42 ` [PATCH 1/4] testpmd: fix LACP ports to work with idle links Robert Sanford
2017-06-22  1:25   ` Wu, Jingjing
2017-10-31  1:07     ` Ferruh Yigit
2017-11-01 20:06       ` Ferruh Yigit
2016-08-01 20:42 ` [PATCH 2/4] mempool: make cache flush threshold macro public Robert Sanford
2016-08-23 15:09   ` Olivier MATZ
2016-08-23 16:07     ` Sanford, Robert
2016-08-24 16:15       ` Olivier MATZ
2016-08-01 20:42 ` [PATCH 3/4] net/bonding: another fix to LACP mempool size Robert Sanford
2016-08-23 15:09   ` Olivier MATZ
2016-08-23 20:01     ` Sanford, Robert
2016-08-24 16:14       ` Olivier MATZ
2016-11-07 16:02   ` Kulasek, TomaszX
2016-08-01 20:42 ` [PATCH 4/4] net/bonding: fix configuration of LACP slaves Robert Sanford
2016-11-07 16:03   ` Kulasek, TomaszX
2017-02-08 17:14 ` [PATCH 0/4] net/bonding: bonding and LACP fixes Thomas Monjalon
2017-03-09 13:19   ` Thomas Monjalon
2017-03-09 16:57     ` Declan Doherty

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.