linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/11] staging: octeon: multi rx group (queue) support
@ 2016-08-31 20:57 Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 01/11] staging: octeon: disable rx interrupts in oct_rx_shutdown Aaro Koskinen
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Hi,

This series implements support for multiple RX groups, which should
improve networking performance on multi-core OCTEONs. We register an
IRQ and a NAPI instance for each group, and ask the HW to select the
group for each incoming packet based on a hash.

Tested on EdgeRouter Lite with a simple forwarding test using two flows
and 16 RX groups distributed between two cores - the routing throughput
is roughly doubled.

Also tested with EBH5600 (8 cores) and EBB6800 (16 cores) by sending
and receiving traffic in both directions using SGMII interfaces.

A.

	v2:
		- Fix build failure with CONFIG_NET_POLL_CONTROLLER.
		- Disable the extended group tag mask bits on CN68XX.
		- Set up PKND for all interfaces.
		- Don't allow poll before RX init is completed.

	v1:
		http://marc.info/?t=147258299700005&r=1&w=2

Aaro Koskinen (11):
  staging: octeon: disable rx interrupts in oct_rx_shutdown
  staging: octeon: use passed interrupt number in the handler
  staging: octeon: pass the NAPI instance reference to irq handler
  staging: octeon: move common poll code into a separate function
  staging: octeon: create a struct for rx group specific data
  staging: octeon: move irq into rx group specific data
  staging: octeon: move group number into rx group data
  staging: octeon: support enabling multiple rx groups
  staging: octeon: enable taking multiple rx groups into use
  staging: octeon: set up pknd for all interfaces
  staging: octeon: prevent poll during rx init

 drivers/staging/octeon/ethernet-rx.c     | 184 ++++++++++++++++++++-----------
 drivers/staging/octeon/ethernet.c        |  64 +++++++++--
 drivers/staging/octeon/octeon-ethernet.h |   2 +-
 3 files changed, 174 insertions(+), 76 deletions(-)

-- 
2.9.2


* [PATCH v2 01/11] staging: octeon: disable rx interrupts in oct_rx_shutdown
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 02/11] staging: octeon: use passed interrupt number in the handler Aaro Koskinen
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Disable RX interrupts in cvm_oct_rx_shutdown(). This way we don't need
to expose the RX IRQ numbers outside the RX module.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 9 +++++++++
 drivers/staging/octeon/ethernet.c    | 9 ---------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index a10fe3a..5b26f2a 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -495,5 +495,14 @@ void cvm_oct_rx_initialize(void)
 
 void cvm_oct_rx_shutdown(void)
 {
+	/* Disable POW interrupt */
+	if (OCTEON_IS_MODEL(OCTEON_CN68XX))
+		cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(pow_receive_group), 0);
+	else
+		cvmx_write_csr(CVMX_POW_WQ_INT_THRX(pow_receive_group), 0);
+
+	/* Free the interrupt handler */
+	free_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group, cvm_oct_device);
+
 	netif_napi_del(&cvm_oct_napi);
 }
diff --git a/drivers/staging/octeon/ethernet.c b/drivers/staging/octeon/ethernet.c
index 073a1e3..1e2e1ef 100644
--- a/drivers/staging/octeon/ethernet.c
+++ b/drivers/staging/octeon/ethernet.c
@@ -853,17 +853,8 @@ static int cvm_oct_remove(struct platform_device *pdev)
 {
 	int port;
 
-	/* Disable POW interrupt */
-	if (OCTEON_IS_MODEL(OCTEON_CN68XX))
-		cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(pow_receive_group), 0);
-	else
-		cvmx_write_csr(CVMX_POW_WQ_INT_THRX(pow_receive_group), 0);
-
 	cvmx_ipd_disable();
 
-	/* Free the interrupt handler */
-	free_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group, cvm_oct_device);
-
 	atomic_inc_return(&cvm_oct_poll_queue_stopping);
 	cancel_delayed_work_sync(&cvm_oct_rx_refill_work);
 
-- 
2.9.2


* [PATCH v2 02/11] staging: octeon: use passed interrupt number in the handler
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 01/11] staging: octeon: disable rx interrupts in oct_rx_shutdown Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 03/11] staging: octeon: pass the NAPI instance reference to irq handler Aaro Koskinen
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Use the interrupt number passed to the handler, so we can avoid using
the global variable.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 5b26f2a..808c415 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -47,16 +47,16 @@ static struct napi_struct cvm_oct_napi;
 
 /**
  * cvm_oct_do_interrupt - interrupt handler.
- * @cpl: Interrupt number. Unused
+ * @irq: Interrupt number.
  * @dev_id: Cookie to identify the device. Unused
  *
  * The interrupt occurs whenever the POW has packets in our group.
  *
  */
-static irqreturn_t cvm_oct_do_interrupt(int cpl, void *dev_id)
+static irqreturn_t cvm_oct_do_interrupt(int irq, void *dev_id)
 {
 	/* Disable the IRQ and start napi_poll. */
-	disable_irq_nosync(OCTEON_IRQ_WORKQ0 + pow_receive_group);
+	disable_irq_nosync(irq);
 	napi_schedule(&cvm_oct_napi);
 
 	return IRQ_HANDLED;
-- 
2.9.2


* [PATCH v2 03/11] staging: octeon: pass the NAPI instance reference to irq handler
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 01/11] staging: octeon: disable rx interrupts in oct_rx_shutdown Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 02/11] staging: octeon: use passed interrupt number in the handler Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 04/11] staging: octeon: move common poll code into a separate function Aaro Koskinen
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Pass the NAPI instance reference to the interrupt handler.
This is preparation for having multiple NAPI instances.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 808c415..27e3459 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -48,16 +48,16 @@ static struct napi_struct cvm_oct_napi;
 /**
  * cvm_oct_do_interrupt - interrupt handler.
  * @irq: Interrupt number.
- * @dev_id: Cookie to identify the device. Unused
+ * @napi_id: Cookie to identify the NAPI instance.
  *
  * The interrupt occurs whenever the POW has packets in our group.
  *
  */
-static irqreturn_t cvm_oct_do_interrupt(int irq, void *dev_id)
+static irqreturn_t cvm_oct_do_interrupt(int irq, void *napi_id)
 {
 	/* Disable the IRQ and start napi_poll. */
 	disable_irq_nosync(irq);
-	napi_schedule(&cvm_oct_napi);
+	napi_schedule(napi_id);
 
 	return IRQ_HANDLED;
 }
@@ -452,7 +452,7 @@ void cvm_oct_rx_initialize(void)
 
 	/* Register an IRQ handler to receive POW interrupts */
 	i = request_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group,
-			cvm_oct_do_interrupt, 0, "Ethernet", cvm_oct_device);
+			cvm_oct_do_interrupt, 0, "Ethernet", &cvm_oct_napi);
 
 	if (i)
 		panic("Could not acquire Ethernet IRQ %d\n",
-- 
2.9.2
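
The cookie-passing idea in this patch can be sketched in user space as
follows. This is only a model, not kernel code: request_irq_model(),
free_irq_model(), raise_irq_model() and struct napi_model are
hypothetical stand-ins invented for illustration. The model also
mirrors the convention that the cookie handed to the free call should
be the one used at registration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a NAPI instance; not the kernel struct. */
struct napi_model {
	int scheduled;
};

typedef int (*irq_handler_model_t)(int irq, void *cookie);

struct irq_slot {
	irq_handler_model_t handler;
	void *cookie;
};

#define NR_IRQS_MODEL 32
static struct irq_slot irq_table[NR_IRQS_MODEL];

/* Register a handler together with the cookie it will later receive. */
static int request_irq_model(int irq, irq_handler_model_t h, void *cookie)
{
	if (irq < 0 || irq >= NR_IRQS_MODEL || irq_table[irq].handler)
		return -1;
	irq_table[irq].handler = h;
	irq_table[irq].cookie = cookie;
	return 0;
}

/* Release the slot only if the caller presents the registered cookie. */
static int free_irq_model(int irq, void *cookie)
{
	if (irq < 0 || irq >= NR_IRQS_MODEL)
		return -1;
	if (!irq_table[irq].handler || irq_table[irq].cookie != cookie)
		return -1;
	irq_table[irq].handler = NULL;
	irq_table[irq].cookie = NULL;
	return 0;
}

/* Deliver an interrupt: the handler gets the irq number and its cookie. */
static int raise_irq_model(int irq)
{
	if (irq < 0 || irq >= NR_IRQS_MODEL || !irq_table[irq].handler)
		return 0;
	return irq_table[irq].handler(irq, irq_table[irq].cookie);
}

/* The handler needs no globals: everything arrives as arguments. */
static int rx_interrupt_model(int irq, void *napi_id)
{
	struct napi_model *napi = napi_id;

	assert(napi != NULL);
	(void)irq;		/* the real handler disables this irq */
	napi->scheduled++;	/* stands in for napi_schedule() */
	return 1;
}
```

Because the handler only touches what it is handed, registering a
second instance with a different cookie needs no change to the handler
itself.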


* [PATCH v2 04/11] staging: octeon: move common poll code into a separate function
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (2 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 03/11] staging: octeon: pass the NAPI instance reference to irq handler Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 05/11] staging: octeon: create a struct for rx group specific data Aaro Koskinen
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Move common poll code into a separate function.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 27e3459..140e8af 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -143,14 +143,7 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work)
 	return 0;
 }
 
-/**
- * cvm_oct_napi_poll - the NAPI poll function.
- * @napi: The NAPI instance, or null if called from cvm_oct_poll_controller
- * @budget: Maximum number of packets to receive.
- *
- * Returns the number of packets processed.
- */
-static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
+static int cvm_oct_poll(int budget)
 {
 	const int	coreid = cvmx_get_core_num();
 	u64	old_group_mask;
@@ -410,7 +403,23 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
 	}
 	cvm_oct_rx_refill_pool(0);
 
-	if (rx_count < budget && napi) {
+	return rx_count;
+}
+
+/**
+ * cvm_oct_napi_poll - the NAPI poll function.
+ * @napi: The NAPI instance.
+ * @budget: Maximum number of packets to receive.
+ *
+ * Returns the number of packets processed.
+ */
+static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
+{
+	int rx_count;
+
+	rx_count = cvm_oct_poll(budget);
+
+	if (rx_count < budget) {
 		/* No more work */
 		napi_complete(napi);
 		enable_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group);
@@ -427,7 +436,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
  */
 void cvm_oct_poll_controller(struct net_device *dev)
 {
-	cvm_oct_napi_poll(NULL, 16);
+	cvm_oct_poll(16);
 }
 #endif
 
-- 
2.9.2
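
The split made by this patch can be illustrated with a small
user-space model. It is a sketch under stated assumptions, not the
driver code: struct rx_model and the *_model functions are
hypothetical, and "pending" simply counts packets waiting to be
received.

```c
#include <assert.h>

/* Hypothetical receive state; not a kernel structure. */
struct rx_model {
	int pending;		/* packets waiting in the group */
	int irq_enabled;	/* set when the interrupt is re-enabled */
	int completed;		/* set when NAPI polling is completed */
};

/* Common part: receive up to 'budget' packets, nothing else. */
static int poll_common_model(struct rx_model *rx, int budget)
{
	int rx_count = 0;

	assert(budget >= 0);
	while (rx_count < budget && rx->pending > 0) {
		rx->pending--;
		rx_count++;
	}
	return rx_count;
}

/* NAPI wrapper: finish polling and re-enable the IRQ when under budget. */
static int napi_poll_model(struct rx_model *rx, int budget)
{
	int rx_count = poll_common_model(rx, budget);

	if (rx_count < budget) {
		rx->completed = 1;	/* stands in for napi_complete() */
		rx->irq_enabled = 1;	/* stands in for enable_irq() */
	}
	return rx_count;
}

/* Poll-controller path: same receive loop, no NAPI bookkeeping. */
static void poll_controller_model(struct rx_model *rx)
{
	(void)poll_common_model(rx, 16);
}
```

With the bookkeeping confined to the wrapper, the poll-controller path
no longer needs to pass a NULL NAPI pointer to signal "skip the
completion step".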


* [PATCH v2 05/11] staging: octeon: create a struct for rx group specific data
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (3 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 04/11] staging: octeon: move common poll code into a separate function Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 06/11] staging: octeon: move irq into " Aaro Koskinen
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Create a struct for RX group specific data.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 140e8af..65f6013 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -43,7 +43,9 @@
 
 #include <asm/octeon/cvmx-gmxx-defs.h>
 
-static struct napi_struct cvm_oct_napi;
+static struct oct_rx_group {
+	struct napi_struct napi;
+} oct_rx_group;
 
 /**
  * cvm_oct_do_interrupt - interrupt handler.
@@ -455,13 +457,14 @@ void cvm_oct_rx_initialize(void)
 	if (!dev_for_napi)
 		panic("No net_devices were allocated.");
 
-	netif_napi_add(dev_for_napi, &cvm_oct_napi, cvm_oct_napi_poll,
+	netif_napi_add(dev_for_napi, &oct_rx_group.napi, cvm_oct_napi_poll,
 		       rx_napi_weight);
-	napi_enable(&cvm_oct_napi);
+	napi_enable(&oct_rx_group.napi);
 
 	/* Register an IRQ handler to receive POW interrupts */
 	i = request_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group,
-			cvm_oct_do_interrupt, 0, "Ethernet", &cvm_oct_napi);
+			cvm_oct_do_interrupt, 0, "Ethernet",
+			&oct_rx_group.napi);
 
 	if (i)
 		panic("Could not acquire Ethernet IRQ %d\n",
@@ -499,7 +502,7 @@ void cvm_oct_rx_initialize(void)
 	}
 
 	/* Schedule NAPI now. This will indirectly enable the interrupt. */
-	napi_schedule(&cvm_oct_napi);
+	napi_schedule(&oct_rx_group.napi);
 }
 
 void cvm_oct_rx_shutdown(void)
@@ -513,5 +516,5 @@ void cvm_oct_rx_shutdown(void)
 	/* Free the interrupt handler */
 	free_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group, cvm_oct_device);
 
-	netif_napi_del(&cvm_oct_napi);
+	netif_napi_del(&oct_rx_group.napi);
 }
-- 
2.9.2


* [PATCH v2 06/11] staging: octeon: move irq into rx group specific data
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (4 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 05/11] staging: octeon: create a struct for rx group specific data Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 07/11] staging: octeon: move group number into rx group data Aaro Koskinen
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Move IRQ number into RX group specific data.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 65f6013..776003c 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -44,6 +44,7 @@
 #include <asm/octeon/cvmx-gmxx-defs.h>
 
 static struct oct_rx_group {
+	int irq;
 	struct napi_struct napi;
 } oct_rx_group;
 
@@ -417,6 +418,8 @@ static int cvm_oct_poll(int budget)
  */
 static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
 {
+	struct oct_rx_group *rx_group = container_of(napi, struct oct_rx_group,
+						     napi);
 	int rx_count;
 
 	rx_count = cvm_oct_poll(budget);
@@ -424,7 +427,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
 	if (rx_count < budget) {
 		/* No more work */
 		napi_complete(napi);
-		enable_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group);
+		enable_irq(rx_group->irq);
 	}
 	return rx_count;
 }
@@ -461,16 +464,16 @@ void cvm_oct_rx_initialize(void)
 		       rx_napi_weight);
 	napi_enable(&oct_rx_group.napi);
 
+	oct_rx_group.irq = OCTEON_IRQ_WORKQ0 + pow_receive_group;
+
 	/* Register an IRQ handler to receive POW interrupts */
-	i = request_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group,
-			cvm_oct_do_interrupt, 0, "Ethernet",
+	i = request_irq(oct_rx_group.irq, cvm_oct_do_interrupt, 0, "Ethernet",
 			&oct_rx_group.napi);
 
 	if (i)
-		panic("Could not acquire Ethernet IRQ %d\n",
-		      OCTEON_IRQ_WORKQ0 + pow_receive_group);
+		panic("Could not acquire Ethernet IRQ %d\n", oct_rx_group.irq);
 
-	disable_irq_nosync(OCTEON_IRQ_WORKQ0 + pow_receive_group);
+	disable_irq_nosync(oct_rx_group.irq);
 
 	/* Enable POW interrupt when our port has at least one packet */
 	if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
@@ -514,7 +517,7 @@ void cvm_oct_rx_shutdown(void)
 		cvmx_write_csr(CVMX_POW_WQ_INT_THRX(pow_receive_group), 0);
 
 	/* Free the interrupt handler */
-	free_irq(OCTEON_IRQ_WORKQ0 + pow_receive_group, cvm_oct_device);
+	free_irq(oct_rx_group.irq, cvm_oct_device);
 
 	netif_napi_del(&oct_rx_group.napi);
 }
-- 
2.9.2
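
The poll function in this patch recovers the enclosing RX group from
the embedded NAPI member via container_of(). A minimal user-space
sketch of that pointer arithmetic follows; container_of_model and the
two structs are hypothetical names for illustration, and the macro
omits the type checking that the kernel's container_of() performs.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space equivalent of container_of(); illustration only. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the driver structures. */
struct napi_model {
	int weight;
};

struct rx_group_model {
	int irq;
	struct napi_model napi;
};

/* Given only the embedded member, find the group and read its irq. */
static int irq_from_napi(struct napi_model *napi)
{
	struct rx_group_model *grp;

	assert(napi != NULL);
	grp = container_of_model(napi, struct rx_group_model, napi);
	return grp->irq;
}
```

This only works for a pointer that really is the napi member of a
struct rx_group_model, which is the case for every instance the poll
callback is registered with.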


* [PATCH v2 07/11] staging: octeon: move group number into rx group data
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (5 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 06/11] staging: octeon: move irq into " Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 08/11] staging: octeon: support enabling multiple rx groups Aaro Koskinen
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Move group number into RX group data.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 776003c..80d5f24 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -45,6 +45,7 @@
 
 static struct oct_rx_group {
 	int irq;
+	int group;
 	struct napi_struct napi;
 } oct_rx_group;
 
@@ -146,7 +147,7 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work)
 	return 0;
 }
 
-static int cvm_oct_poll(int budget)
+static int cvm_oct_poll(struct oct_rx_group *rx_group, int budget)
 {
 	const int	coreid = cvmx_get_core_num();
 	u64	old_group_mask;
@@ -168,13 +169,13 @@ static int cvm_oct_poll(int budget)
 	if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
 		old_group_mask = cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid));
 		cvmx_write_csr(CVMX_SSO_PPX_GRP_MSK(coreid),
-			       1ull << pow_receive_group);
+			       BIT(rx_group->group));
 		cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid)); /* Flush */
 	} else {
 		old_group_mask = cvmx_read_csr(CVMX_POW_PP_GRP_MSKX(coreid));
 		cvmx_write_csr(CVMX_POW_PP_GRP_MSKX(coreid),
 			       (old_group_mask & ~0xFFFFull) |
-			       1 << pow_receive_group);
+			       BIT(rx_group->group));
 	}
 
 	if (USE_ASYNC_IOBDMA) {
@@ -199,15 +200,15 @@ static int cvm_oct_poll(int budget)
 		if (!work) {
 			if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
 				cvmx_write_csr(CVMX_SSO_WQ_IQ_DIS,
-					       1ull << pow_receive_group);
+					       BIT(rx_group->group));
 				cvmx_write_csr(CVMX_SSO_WQ_INT,
-					       1ull << pow_receive_group);
+					       BIT(rx_group->group));
 			} else {
 				union cvmx_pow_wq_int wq_int;
 
 				wq_int.u64 = 0;
-				wq_int.s.iq_dis = 1 << pow_receive_group;
-				wq_int.s.wq_int = 1 << pow_receive_group;
+				wq_int.s.iq_dis = BIT(rx_group->group);
+				wq_int.s.wq_int = BIT(rx_group->group);
 				cvmx_write_csr(CVMX_POW_WQ_INT, wq_int.u64);
 			}
 			break;
@@ -422,7 +423,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
 						     napi);
 	int rx_count;
 
-	rx_count = cvm_oct_poll(budget);
+	rx_count = cvm_oct_poll(rx_group, budget);
 
 	if (rx_count < budget) {
 		/* No more work */
@@ -441,7 +442,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
  */
 void cvm_oct_poll_controller(struct net_device *dev)
 {
-	cvm_oct_poll(16);
+	cvm_oct_poll(&oct_rx_group, 16);
 }
 #endif
 
@@ -465,6 +466,7 @@ void cvm_oct_rx_initialize(void)
 	napi_enable(&oct_rx_group.napi);
 
 	oct_rx_group.irq = OCTEON_IRQ_WORKQ0 + pow_receive_group;
+	oct_rx_group.group = pow_receive_group;
 
 	/* Register an IRQ handler to receive POW interrupts */
 	i = request_irq(oct_rx_group.irq, cvm_oct_do_interrupt, 0, "Ethernet",
-- 
2.9.2


* [PATCH v2 08/11] staging: octeon: support enabling multiple rx groups
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (6 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 07/11] staging: octeon: move group number into rx group data Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 09/11] staging: octeon: enable taking multiple rx groups into use Aaro Koskinen
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Support enabling multiple RX groups.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c     | 126 ++++++++++++++++++-------------
 drivers/staging/octeon/ethernet.c        |   6 +-
 drivers/staging/octeon/octeon-ethernet.h |   2 +-
 3 files changed, 81 insertions(+), 53 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 80d5f24..4f32fa3 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -47,7 +47,7 @@ static struct oct_rx_group {
 	int irq;
 	int group;
 	struct napi_struct napi;
-} oct_rx_group;
+} oct_rx_group[16];
 
 /**
  * cvm_oct_do_interrupt - interrupt handler.
@@ -442,7 +442,16 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
  */
 void cvm_oct_poll_controller(struct net_device *dev)
 {
-	cvm_oct_poll(&oct_rx_group, 16);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(oct_rx_group); i++) {
+
+		if (!(pow_receive_groups & BIT(i)))
+			continue;
+
+		cvm_oct_poll(&oct_rx_group[i], 16);
+
+	}
 }
 #endif
 
@@ -461,65 +470,80 @@ void cvm_oct_rx_initialize(void)
 	if (!dev_for_napi)
 		panic("No net_devices were allocated.");
 
-	netif_napi_add(dev_for_napi, &oct_rx_group.napi, cvm_oct_napi_poll,
-		       rx_napi_weight);
-	napi_enable(&oct_rx_group.napi);
+	for (i = 0; i < ARRAY_SIZE(oct_rx_group); i++) {
+		int ret;
 
-	oct_rx_group.irq = OCTEON_IRQ_WORKQ0 + pow_receive_group;
-	oct_rx_group.group = pow_receive_group;
+		if (!(pow_receive_groups & BIT(i)))
+			continue;
 
-	/* Register an IRQ handler to receive POW interrupts */
-	i = request_irq(oct_rx_group.irq, cvm_oct_do_interrupt, 0, "Ethernet",
-			&oct_rx_group.napi);
+		netif_napi_add(dev_for_napi, &oct_rx_group[i].napi,
+			       cvm_oct_napi_poll, rx_napi_weight);
+		napi_enable(&oct_rx_group[i].napi);
 
-	if (i)
-		panic("Could not acquire Ethernet IRQ %d\n", oct_rx_group.irq);
+		oct_rx_group[i].irq = OCTEON_IRQ_WORKQ0 + i;
+		oct_rx_group[i].group = i;
 
-	disable_irq_nosync(oct_rx_group.irq);
+		/* Register an IRQ handler to receive POW interrupts */
+		ret = request_irq(oct_rx_group[i].irq, cvm_oct_do_interrupt, 0,
+				  "Ethernet", &oct_rx_group[i].napi);
+		if (ret)
+			panic("Could not acquire Ethernet IRQ %d\n",
+			      oct_rx_group[i].irq);
 
-	/* Enable POW interrupt when our port has at least one packet */
-	if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
-		union cvmx_sso_wq_int_thrx int_thr;
-		union cvmx_pow_wq_int_pc int_pc;
-
-		int_thr.u64 = 0;
-		int_thr.s.tc_en = 1;
-		int_thr.s.tc_thr = 1;
-		cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(pow_receive_group),
-			       int_thr.u64);
-
-		int_pc.u64 = 0;
-		int_pc.s.pc_thr = 5;
-		cvmx_write_csr(CVMX_SSO_WQ_INT_PC, int_pc.u64);
-	} else {
-		union cvmx_pow_wq_int_thrx int_thr;
-		union cvmx_pow_wq_int_pc int_pc;
-
-		int_thr.u64 = 0;
-		int_thr.s.tc_en = 1;
-		int_thr.s.tc_thr = 1;
-		cvmx_write_csr(CVMX_POW_WQ_INT_THRX(pow_receive_group),
-			       int_thr.u64);
-
-		int_pc.u64 = 0;
-		int_pc.s.pc_thr = 5;
-		cvmx_write_csr(CVMX_POW_WQ_INT_PC, int_pc.u64);
-	}
+		disable_irq_nosync(oct_rx_group[i].irq);
+
+		/* Enable POW interrupt when our port has at least one packet */
+		if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
+			union cvmx_sso_wq_int_thrx int_thr;
+			union cvmx_pow_wq_int_pc int_pc;
+
+			int_thr.u64 = 0;
+			int_thr.s.tc_en = 1;
+			int_thr.s.tc_thr = 1;
+			cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(i), int_thr.u64);
+
+			int_pc.u64 = 0;
+			int_pc.s.pc_thr = 5;
+			cvmx_write_csr(CVMX_SSO_WQ_INT_PC, int_pc.u64);
+		} else {
+			union cvmx_pow_wq_int_thrx int_thr;
+			union cvmx_pow_wq_int_pc int_pc;
 
-	/* Schedule NAPI now. This will indirectly enable the interrupt. */
-	napi_schedule(&oct_rx_group.napi);
+			int_thr.u64 = 0;
+			int_thr.s.tc_en = 1;
+			int_thr.s.tc_thr = 1;
+			cvmx_write_csr(CVMX_POW_WQ_INT_THRX(i), int_thr.u64);
+
+			int_pc.u64 = 0;
+			int_pc.s.pc_thr = 5;
+			cvmx_write_csr(CVMX_POW_WQ_INT_PC, int_pc.u64);
+		}
+
+		/* Schedule NAPI now. This will indirectly enable the
+		 * interrupt.
+		 */
+		napi_schedule(&oct_rx_group[i].napi);
+	}
 }
 
 void cvm_oct_rx_shutdown(void)
 {
-	/* Disable POW interrupt */
-	if (OCTEON_IS_MODEL(OCTEON_CN68XX))
-		cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(pow_receive_group), 0);
-	else
-		cvmx_write_csr(CVMX_POW_WQ_INT_THRX(pow_receive_group), 0);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(oct_rx_group); i++) {
+
+		if (!(pow_receive_groups & BIT(i)))
+			continue;
 
-	/* Free the interrupt handler */
-	free_irq(oct_rx_group.irq, cvm_oct_device);
+		/* Disable POW interrupt */
+		if (OCTEON_IS_MODEL(OCTEON_CN68XX))
+			cvmx_write_csr(CVMX_SSO_WQ_INT_THRX(i), 0);
+		else
+			cvmx_write_csr(CVMX_POW_WQ_INT_THRX(i), 0);
+
+		/* Free the interrupt handler */
+		free_irq(oct_rx_group[i].irq, cvm_oct_device);
 
-	netif_napi_del(&oct_rx_group.napi);
+		netif_napi_del(&oct_rx_group[i].napi);
+	}
 }
diff --git a/drivers/staging/octeon/ethernet.c b/drivers/staging/octeon/ethernet.c
index 1e2e1ef..7d48745 100644
--- a/drivers/staging/octeon/ethernet.c
+++ b/drivers/staging/octeon/ethernet.c
@@ -45,7 +45,7 @@ MODULE_PARM_DESC(num_packet_buffers, "\n"
 	"\tNumber of packet buffers to allocate and store in the\n"
 	"\tFPA. By default, 1024 packet buffers are used.\n");
 
-int pow_receive_group = 15;
+static int pow_receive_group = 15;
 module_param(pow_receive_group, int, 0444);
 MODULE_PARM_DESC(pow_receive_group, "\n"
 	"\tPOW group to receive packets from. All ethernet hardware\n"
@@ -86,6 +86,8 @@ int rx_napi_weight = 32;
 module_param(rx_napi_weight, int, 0444);
 MODULE_PARM_DESC(rx_napi_weight, "The NAPI WEIGHT parameter.");
 
+/* Mask indicating which receive groups are in use. */
+int pow_receive_groups;
 
 /*
  * cvm_oct_poll_queue_stopping - flag to indicate polling should stop.
@@ -678,6 +680,8 @@ static int cvm_oct_probe(struct platform_device *pdev)
 
 	cvmx_helper_initialize_packet_io_global();
 
+	pow_receive_groups = BIT(pow_receive_group);
+
 	/* Change the input group for all ports before input is enabled */
 	num_interfaces = cvmx_helper_get_number_of_interfaces();
 	for (interface = 0; interface < num_interfaces; interface++) {
diff --git a/drivers/staging/octeon/octeon-ethernet.h b/drivers/staging/octeon/octeon-ethernet.h
index d533aef..9c6852d 100644
--- a/drivers/staging/octeon/octeon-ethernet.h
+++ b/drivers/staging/octeon/octeon-ethernet.h
@@ -72,7 +72,7 @@ void cvm_oct_link_poll(struct net_device *dev);
 
 extern int always_use_pow;
 extern int pow_send_group;
-extern int pow_receive_group;
+extern int pow_receive_groups;
 extern char pow_send_list[];
 extern struct net_device *cvm_oct_device[];
 extern atomic_t cvm_oct_poll_queue_stopping;
-- 
2.9.2
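
Each loop added by this patch walks the 16 possible groups and skips
those not set in the pow_receive_groups mask. The same pattern, as a
user-space sketch (NUM_GROUPS_MODEL, BIT_MODEL and
collect_enabled_groups() are hypothetical names, not part of the
driver):

```c
#include <assert.h>

#define NUM_GROUPS_MODEL 16
#define BIT_MODEL(n) (1u << (n))

/*
 * Store the indices of the groups enabled in 'groups_mask' into 'out'
 * and return how many were found. Bits above group 15 are ignored.
 */
static int collect_enabled_groups(unsigned int groups_mask,
				  int out[NUM_GROUPS_MODEL])
{
	int i, n = 0;

	assert(out != 0);
	for (i = 0; i < NUM_GROUPS_MODEL; i++) {
		if (!(groups_mask & BIT_MODEL(i)))
			continue;
		out[n++] = i;
	}
	return n;
}
```

With the mask set to BIT(pow_receive_group), as in this patch, the
loop visits exactly one group, so behavior is unchanged until a wider
mask is configured.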


* [PATCH v2 09/11] staging: octeon: enable taking multiple rx groups into use
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (7 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 08/11] staging: octeon: support enabling multiple rx groups Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 10/11] staging: octeon: set up pknd for all interfaces Aaro Koskinen
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Enable taking multiple RX groups into use.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet.c | 49 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/octeon/ethernet.c b/drivers/staging/octeon/ethernet.c
index 7d48745..8d51f05 100644
--- a/drivers/staging/octeon/ethernet.c
+++ b/drivers/staging/octeon/ethernet.c
@@ -53,6 +53,15 @@ MODULE_PARM_DESC(pow_receive_group, "\n"
 	"\tgroup. Also any other software can submit packets to this\n"
 	"\tgroup for the kernel to process.");
 
+static int receive_group_order;
+module_param(receive_group_order, int, 0444);
+MODULE_PARM_DESC(receive_group_order, "\n"
+	"\tOrder (0..4) of receive groups to take into use. Ethernet hardware\n"
+	"\twill be configured to send incoming packets to multiple POW\n"
+	"\tgroups. pow_receive_group parameter is ignored when multiple\n"
+	"\tgroups are taken into use and groups are allocated starting\n"
+	"\tfrom 0. By default, a single group is used.\n");
+
 int pow_send_group = -1;
 module_param(pow_send_group, int, 0644);
 MODULE_PARM_DESC(pow_send_group, "\n"
@@ -680,7 +689,13 @@ static int cvm_oct_probe(struct platform_device *pdev)
 
 	cvmx_helper_initialize_packet_io_global();
 
-	pow_receive_groups = BIT(pow_receive_group);
+	if (receive_group_order) {
+		if (receive_group_order > 4)
+			receive_group_order = 4;
+		pow_receive_groups = (1 << (1 << receive_group_order)) - 1;
+	} else {
+		pow_receive_groups = BIT(pow_receive_group);
+	}
 
 	/* Change the input group for all ports before input is enabled */
 	num_interfaces = cvmx_helper_get_number_of_interfaces();
@@ -695,7 +710,37 @@ static int cvm_oct_probe(struct platform_device *pdev)
 
 			pip_prt_tagx.u64 =
 			    cvmx_read_csr(CVMX_PIP_PRT_TAGX(port));
-			pip_prt_tagx.s.grp = pow_receive_group;
+
+			if (receive_group_order) {
+				int tag_mask;
+
+				/* We support only 16 groups at the moment, so
+				 * always disable the two additional "hidden"
+				 * tag_mask bits on CN68XX.
+				 */
+				if (OCTEON_IS_MODEL(OCTEON_CN68XX))
+					pip_prt_tagx.u64 |= 0x3ull << 44;
+
+				tag_mask = ~((1 << receive_group_order) - 1);
+				pip_prt_tagx.s.grptagbase	= 0;
+				pip_prt_tagx.s.grptagmask	= tag_mask;
+				pip_prt_tagx.s.grptag		= 1;
+				pip_prt_tagx.s.tag_mode		= 0;
+				pip_prt_tagx.s.inc_prt_flag	= 1;
+				pip_prt_tagx.s.ip6_dprt_flag	= 1;
+				pip_prt_tagx.s.ip4_dprt_flag	= 1;
+				pip_prt_tagx.s.ip6_sprt_flag	= 1;
+				pip_prt_tagx.s.ip4_sprt_flag	= 1;
+				pip_prt_tagx.s.ip6_dst_flag	= 1;
+				pip_prt_tagx.s.ip4_dst_flag	= 1;
+				pip_prt_tagx.s.ip6_src_flag	= 1;
+				pip_prt_tagx.s.ip4_src_flag	= 1;
+				pip_prt_tagx.s.grp		= 0;
+			} else {
+				pip_prt_tagx.s.grptag	= 0;
+				pip_prt_tagx.s.grp	= pow_receive_group;
+			}
+
 			cvmx_write_csr(CVMX_PIP_PRT_TAGX(port),
 				       pip_prt_tagx.u64);
 		}
-- 
2.9.2
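
The arithmetic behind receive_group_order can be worked through in
user space. The first two helpers below follow the expressions in the
patch. The third, group_from_tag(), rests on an assumption about the
hardware that the patch does not spell out: that the group is derived
as (low tag bits & ~grptagmask) + grptagbase, with a 4-bit mask field.
All function names are hypothetical, and unlike the patch this sketch
treats a negative order as "single group".

```c
#include <assert.h>

/* Mask of groups in use: 2^order groups starting from group 0. */
static unsigned int groups_mask_from_order(int order, int single_group)
{
	assert(single_group >= 0 && single_group < 16);
	if (order <= 0)		/* assumption: negative treated like 0 */
		return 1u << single_group;
	if (order > 4)
		order = 4;
	return (1u << (1u << order)) - 1u;
}

/* Tag mask as computed in the patch, truncated to an assumed 4-bit field. */
static unsigned int tag_mask_from_order(int order)
{
	if (order < 0)
		order = 0;
	if (order > 4)
		order = 4;
	return ~((1u << order) - 1u) & 0xFu;
}

/*
 * Assumed hardware rule (not stated in the patch): keep the low tag
 * bits that are NOT masked, then add the base group.
 */
static unsigned int group_from_tag(unsigned int tag, unsigned int tag_mask,
				   unsigned int base)
{
	return ((tag & ~tag_mask) & 0xFu) + base;
}
```

For example, order 2 gives a group mask of 0xF and a tag mask of 0xC,
so under the assumed rule a packet whose hash tag ends in 0xF would
land in group 3.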


* [PATCH v2 10/11] staging: octeon: set up pknd for all interfaces
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (8 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 09/11] staging: octeon: enable taking multiple rx groups into use Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-08-31 20:57 ` [PATCH v2 11/11] staging: octeon: prevent poll during rx init Aaro Koskinen
  2016-09-01  2:09 ` [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Ed Swierk
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

The RX path uses pknd to find the correct device, and we maintain a 1:1
port-to-pknd mapping. However, the mapping is currently set only for XAUI
interfaces (in the arch code); it should be set for all interface types.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/octeon/ethernet.c b/drivers/staging/octeon/ethernet.c
index 8d51f05..5497fac 100644
--- a/drivers/staging/octeon/ethernet.c
+++ b/drivers/staging/octeon/ethernet.c
@@ -488,6 +488,8 @@ int cvm_oct_common_open(struct net_device *dev,
 
 	gmx_cfg.u64 = cvmx_read_csr(CVMX_GMXX_PRTX_CFG(index, interface));
 	gmx_cfg.s.en = 1;
+	if (octeon_has_feature(OCTEON_FEATURE_PKND))
+		gmx_cfg.s.pknd = priv->port;
 	cvmx_write_csr(CVMX_GMXX_PRTX_CFG(index, interface), gmx_cfg.u64);
 
 	if (octeon_is_simulation())
-- 
2.9.2
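
To illustrate why the 1:1 mapping described in the commit message matters,
here is a small toy model. Everything in it is hypothetical (struct toy_dev,
dev_for_port, open_port, rx_lookup, MAX_PORTS), and it assumes that an
unprogrammed pknd field reads back as 0, which the patch does not state. It
is a sketch of the lookup idea, not the driver's RX path.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 64	/* hypothetical table size */

struct toy_dev {
	int port;
};

/* Device table indexed by port number. */
static struct toy_dev *dev_for_port[MAX_PORTS];

/*
 * Model of "open": register the device and return the pknd value the
 * hardware would report for its packets. If set_pknd is zero, the field
 * is left at its reset value (ASSUMED to be 0 in this model).
 */
static int open_port(struct toy_dev *dev, int set_pknd)
{
	assert(dev->port >= 0 && dev->port < MAX_PORTS);
	dev_for_port[dev->port] = dev;
	return set_pknd ? dev->port : 0;
}

/* Model of the RX lookup: with a 1:1 mapping, pknd indexes the table. */
static struct toy_dev *rx_lookup(int pknd)
{
	if (pknd < 0 || pknd >= MAX_PORTS)
		return NULL;
	return dev_for_port[pknd];
}
```

In this model a device on port 16 is found only when its pknd was
programmed to 16 at open time; otherwise the lookup lands on slot 0 and
misses the device.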


* [PATCH v2 11/11] staging: octeon: prevent poll during rx init
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (9 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 10/11] staging: octeon: set up pknd for all interfaces Aaro Koskinen
@ 2016-08-31 20:57 ` Aaro Koskinen
  2016-09-01  2:09 ` [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Ed Swierk
  11 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-08-31 20:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Daney, Ed Swierk, devel
  Cc: linux-kernel, Aaro Koskinen

Prevent poll before the RX init has been completed.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
---
 drivers/staging/octeon/ethernet-rx.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 4f32fa3..ce1e2a3 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -43,6 +43,8 @@
 
 #include <asm/octeon/cvmx-gmxx-defs.h>
 
+static atomic_t oct_rx_ready = ATOMIC_INIT(0);
+
 static struct oct_rx_group {
 	int irq;
 	int group;
@@ -444,6 +446,9 @@ void cvm_oct_poll_controller(struct net_device *dev)
 {
 	int i;
 
+	if (!atomic_read(&oct_rx_ready))
+		return;
+
 	for (i = 0; i < ARRAY_SIZE(oct_rx_group); i++) {
 
 		if (!(pow_receive_groups & BIT(i)))
@@ -524,6 +529,7 @@ void cvm_oct_rx_initialize(void)
 		 */
 		napi_schedule(&oct_rx_group[i].napi);
 	}
+	atomic_inc(&oct_rx_ready);
 }
 
 void cvm_oct_rx_shutdown(void)
-- 
2.9.2
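
The guard added above follows a common "ready flag" pattern: the poll entry
point returns early until initialization has published that it is done. A
minimal user-space sketch of the same idea, using C11 atomics rather than
the kernel's atomic_t, might look like this. The names rx_ready,
group_state, polls_done, rx_initialize and poll_controller are
hypothetical, and the release/acquire ordering is a choice made for this
sketch, not something the patch specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_GROUPS 16

static atomic_int rx_ready;		/* 0 until initialization completes */
static int group_state[NUM_GROUPS];	/* hypothetical per-group state */
static int polls_done;			/* counts groups polled so far */

static void rx_initialize(void)
{
	for (int i = 0; i < NUM_GROUPS; i++)
		group_state[i] = 1;

	/* Publish readiness only after every group has been set up. */
	atomic_store_explicit(&rx_ready, 1, memory_order_release);
}

/* Returns true if the groups were polled, false if it bailed out early. */
static bool poll_controller(void)
{
	if (!atomic_load_explicit(&rx_ready, memory_order_acquire))
		return false;

	for (int i = 0; i < NUM_GROUPS; i++) {
		/* Once the flag is visible, every group must be initialized. */
		assert(group_state[i] == 1);
		polls_done++;
	}
	return true;
}
```

A call to poll_controller() before rx_initialize() does nothing and returns
false; after rx_initialize() it walks all 16 groups.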


* Re: [PATCH v2 00/11] staging: octeon: multi rx group (queue) support
  2016-08-31 20:57 [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Aaro Koskinen
                   ` (10 preceding siblings ...)
  2016-08-31 20:57 ` [PATCH v2 11/11] staging: octeon: prevent poll during rx init Aaro Koskinen
@ 2016-09-01  2:09 ` Ed Swierk
  2016-09-01 18:12   ` Aaro Koskinen
  11 siblings, 1 reply; 14+ messages in thread
From: Ed Swierk @ 2016-09-01  2:09 UTC (permalink / raw)
  To: Aaro Koskinen, Greg Kroah-Hartman, David Daney, devel; +Cc: linux-kernel

On 8/31/16 13:57, Aaro Koskinen wrote:
> This series implements multiple RX group support that should improve
> the networking performance on multi-core OCTEONs. Basically we register
> IRQ and NAPI for each group, and ask the HW to select the group for
> the incoming packets based on hash.
> 
> Tested on EdgeRouter Lite with a simple forwarding test using two flows
> and 16 RX groups distributed between two cores - the routing throughput
> is roughly doubled.
> 
> Also tested with EBH5600 (8 cores) and EBB6800 (16 cores) by sending
> and receiving traffic in both directions using SGMII interfaces.

With this series on 4.4.19, rx works with receive_group_order > 0.
Setting receive_group_order=4, I do see 16 Ethernet interrupts. I tried
fiddling with various smp_affinity values (e.g. setting them all to
ffffffff, or assigning a different one to each interrupt, or giving a
few to some and a few to others), as well as different values for
rps_cpus. 10-thread parallel iperf performance varies between 0.5 and 1.5
Gbit/sec total depending on the particular settings.

With the SDK kernel I get over 8 Gbit/sec. It seems to be achieving that
using just one interrupt (not even a separate one for tx, as far as I can
tell) pegged to CPU 0 (the default smp_affinity). I must be missing some
other major configuration tweak, perhaps specific to 10G.

Can you run a test on the EBB6800 with the interfaces in 10G mode?

--Ed


* Re: [PATCH v2 00/11] staging: octeon: multi rx group (queue) support
  2016-09-01  2:09 ` [PATCH v2 00/11] staging: octeon: multi rx group (queue) support Ed Swierk
@ 2016-09-01 18:12   ` Aaro Koskinen
  0 siblings, 0 replies; 14+ messages in thread
From: Aaro Koskinen @ 2016-09-01 18:12 UTC (permalink / raw)
  To: Ed Swierk; +Cc: Greg Kroah-Hartman, David Daney, devel, linux-kernel

Hi,

On Wed, Aug 31, 2016 at 07:09:13PM -0700, Ed Swierk wrote:
> On 8/31/16 13:57, Aaro Koskinen wrote:
> > This series implements multiple RX group support that should improve
> > the networking performance on multi-core OCTEONs. Basically we register
> > IRQ and NAPI for each group, and ask the HW to select the group for
> > the incoming packets based on hash.
> > 
> > Tested on EdgeRouter Lite with a simple forwarding test using two flows
> > and 16 RX groups distributed between two cores - the routing throughput
> > is roughly doubled.
> > 
> > Also tested with EBH5600 (8 cores) and EBB6800 (16 cores) by sending
> > and receiving traffic in both directions using SGMII interfaces.
> 
> With this series on 4.4.19, rx works with receive_group_order > 0.

Good.

> Setting receive_group_order=4, I do see 16 Ethernet interrupts. I tried
> fiddling with various smp_affinity values (e.g. setting them all to
> ffffffff, or assigning a different one to each interrupt, or giving a
> few to some and a few to others), as well as different values for
> rps_cpus. 10-thread parallel iperf performance varies between 0.5 and 1.5
> Gbit/sec total depending on the particular settings.
> 
> With the SDK kernel I get over 8 Gbit/sec. It seems to be achieving that
> using just one interrupt (not even a separate one for tx, as far as I can
> tell) pegged to CPU 0 (the default smp_affinity). I must be missing some
> other major configuration tweak, perhaps specific to 10G.
> 
> Can you run a test on the EBB6800 with the interfaces in 10G mode?

Yes, I attached two EBB6800s with XAUI and ran iperf -P 10.

With a single group it gives 2.9 Gbit/s, and with 16 groups (on 16 cores)
4.3 Gbit/s. In the 16-group case none of the CPUs is even close to 100%,
so the bottleneck is somewhere else. I guess implementing the proper
SSO init should increase the throughput.

A.


