DPDK-dev Archive on lore.kernel.org
* [dpdk-dev] [PATCH 00/27] sched: feature enhancements
@ 2019-05-28 12:05 Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 01/27] sched: update macros for flexible config Lukasz Krakowiak
                   ` (26 more replies)
  0 siblings, 27 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Lukasz Krakowiak

This patchset refactors the DPDK QoS sched library to add the
following features, which enhance the scheduler functionality.

1. Flexible configuration of the pipe traffic classes and queues;

   Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
   strict priority, and each TC has exactly 4 queues that are scheduled
   with Weighted Fair Queuing (WFQ).

   Instead of hardwiring queues to traffic classes within a pipe, the new
   implementation allows a more flexible, configurable split of pipe
   queues between strict priority (SP) and best-effort (BE) traffic
   classes, along with support for more traffic classes (up to 16).
   
   All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
   the lowest priority BE TC has 1, 4 or 8 queues. This is justified by
   the fact that all the high priority TCs are fully provisioned (small to
   medium traffic rates), while most of the traffic fits into the BE class,
   which is typically oversubscribed.

   Furthermore, this change allows fewer than 16 queues per pipe to be
   used when not all 16 queues are needed, so no memory is allocated for
   the queues that are not needed.

2. Subport level configuration of pipe nodes;

   Currently, all parameters for the pipe nodes (subscribers) configuration
   are part of the port level structure, which forces all groups of
   subscribers (i.e. pipes) in different subports to have similar
   configurations in terms of their number, queue sizes, traffic classes,
   etc.

   The new implementation moves the pipe node configuration parameters
   from the port level to the subport level structure. Therefore, different
   subports of the same port can have different configurations for the
   pipe nodes (subscribers), for example the number of pipes, queue sizes,
   queue to traffic-class mapping, etc.
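
The two features above can be illustrated with a minimal, standalone
sketch. It is not the library code: the three macros copy the values
introduced in patch 01/27, while the struct and helper names prefixed with
example_ are hypothetical stand-ins that model only the per-subport
n_subport_pipes and qsize[] fields added in patch 02/27. The sketch shows
how the traffic class count follows from the queue macros, and how
zero-sized queues reduce the number of packet slots a subport needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as introduced by patch 01/27 of this series. */
#define RTE_SCHED_QUEUES_PER_PIPE        16
#define RTE_SCHED_WRR_QUEUES_PER_PIPE    8
#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE \
	(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_WRR_QUEUES_PER_PIPE + 1)

/*
 * Hypothetical, trimmed-down stand-in for the subport parameters.
 * Only the two fields needed for the arithmetic are modelled.
 */
struct example_subport_params {
	uint32_t n_subport_pipes;                  /* pipes in this subport */
	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE]; /* 0 means queue unused */
};

/* Hypothetical helper: packet slots needed by a single pipe. */
static uint32_t
example_pipe_qsize_sum(const struct example_subport_params *p)
{
	uint32_t sum = 0;
	size_t i;

	/* Queues with qsize == 0 contribute nothing to the total. */
	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
		sum += p->qsize[i];

	return sum;
}

/* Hypothetical helper: packet slots needed by the whole subport. */
static uint32_t
example_subport_slots(const struct example_subport_params *p)
{
	return p->n_subport_pipes * example_pipe_qsize_sum(p);
}
```

With these macro values there are 16 - 8 + 1 = 9 traffic classes per pipe.
Two subports of the same port can then be described independently, for
example one with 4096 pipes and all 16 queues sized 64, and another with
1024 pipes that sizes only three queues and leaves the rest at zero. Which
queue index belongs to which traffic class is left open here, since that
mapping is defined by later patches in the series.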

Jasvinder Singh (27):
  sched: update macros for flexible config
  sched: update subport and pipe data structures
  sched: update internal data structures
  sched: update port config api
  sched: update port free api
  sched: update subport config api
  sched: update pipe profile add api
  sched: update pipe config api
  sched: update pkt read and write api
  sched: update subport and tc queue stats
  sched: update port memory footprint api
  sched: update packet enqueue api
  sched: update grinder pipe and tc cache
  sched: update grinder next pipe and tc functions
  sched: update pipe and tc queues prefetch
  sched: update grinder wrr compute function
  sched: modify credits update function
  sched: update mbuf prefetch function
  sched: update grinder schedule function
  sched: update grinder handle function
  sched: update packet dequeue api
  sched: update sched queue stats api
  test/sched: update unit test
  net/softnic: update softnic tm function
  examples/qos_sched: update qos sched sample app
  examples/ip_pipeline: update ip pipeline sample app
  sched: code cleanup

 app/test/test_sched.c                         |   39 +-
 doc/guides/rel_notes/deprecation.rst          |    6 -
 doc/guides/rel_notes/release_19_08.rst        |    7 +-
 drivers/net/softnic/rte_eth_softnic.c         |  131 ++
 drivers/net/softnic/rte_eth_softnic_cli.c     |  285 ++-
 .../net/softnic/rte_eth_softnic_internals.h   |    4 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
 examples/ip_pipeline/cli.c                    |   85 +-
 examples/ip_pipeline/tmgr.c                   |   21 +-
 examples/ip_pipeline/tmgr.h                   |    3 -
 examples/qos_sched/app_thread.c               |   11 +-
 examples/qos_sched/cfg_file.c                 |  282 ++-
 examples/qos_sched/init.c                     |  110 +-
 examples/qos_sched/main.h                     |    6 +-
 examples/qos_sched/profile.cfg                |   59 +-
 examples/qos_sched/profile_ov.cfg             |   47 +-
 examples/qos_sched/stats.c                    |  152 +-
 lib/librte_pipeline/rte_table_action.c        |    1 -
 lib/librte_pipeline/rte_table_action.h        |    4 +-
 lib/librte_sched/Makefile                     |    2 +-
 lib/librte_sched/meson.build                  |    2 +-
 lib/librte_sched/rte_sched.c                  | 2015 ++++++++++-------
 lib/librte_sched/rte_sched.h                  |  147 +-
 lib/librte_sched/rte_sched_common.h           |   33 +
 24 files changed, 2314 insertions(+), 1227 deletions(-)

-- 
2.20.1



* [dpdk-dev] [PATCH 01/27] sched: update macros for flexible config
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 02/27] sched: update subport and pipe data structures Lukasz Krakowiak
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update existing macros and add a new one for best-effort traffic class
queues to allow configuration flexibility for pipe traffic classes and
queues, and subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 doc/guides/rel_notes/deprecation.rst   |  6 -----
 doc/guides/rel_notes/release_19_08.rst |  7 ++++-
 lib/librte_sched/Makefile              |  2 +-
 lib/librte_sched/meson.build           |  2 +-
 lib/librte_sched/rte_sched.h           | 37 +++++++++++++++++++-------
 5 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 098d24381..a408270f5 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -99,12 +99,6 @@ Deprecation Notices
   to one it means it represents IV, when is set to zero it means J0 is used
   directly, in this case 16 bytes of J0 need to be passed.
 
-* sched: To allow more traffic classes, flexible mapping of pipe queues to
-  traffic classes, and subport level configuration of pipes and queues
-  changes will be made to macros, data structures and API functions defined
-  in "rte_sched.h". These changes are aligned to improvements suggested in the
-  RFC https://mails.dpdk.org/archives/dev/2018-November/120035.html.
-
 * metrics: The function ``rte_metrics_init`` will have a non-void return
   in order to notify errors instead of calling ``rte_exit``.
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index b9510f93a..210f32e7f 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -83,6 +83,11 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* sched: To allow more traffic classes, flexible mapping of pipe queues to
+  traffic classes, and subport level configuration of pipes and queues
+  changes are made to public macros, data structures and API functions defined
+  in "rte_sched.h".
+
 
 ABI Changes
 -----------
@@ -170,7 +175,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_rcu.so.1
      librte_reorder.so.1
      librte_ring.so.2
-     librte_sched.so.2
+   + librte_sched.so.3
      librte_security.so.2
      librte_stack.so.1
      librte_table.so.3
diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile
index 644fd9d15..3d7f410e1 100644
--- a/lib/librte_sched/Makefile
+++ b/lib/librte_sched/Makefile
@@ -18,7 +18,7 @@ LDLIBS += -lrte_timer
 
 EXPORT_MAP := rte_sched_version.map
 
-LIBABIVER := 2
+LIBABIVER := 3
 
 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_sched/meson.build b/lib/librte_sched/meson.build
index 8e989e5f6..59d43c6d8 100644
--- a/lib/librte_sched/meson.build
+++ b/lib/librte_sched/meson.build
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-version = 2
+version = 3
 sources = files('rte_sched.c', 'rte_red.c', 'rte_approx.c')
 headers = files('rte_sched.h', 'rte_sched_common.h',
 		'rte_red.h', 'rte_approx.h')
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 9c55a787d..cf7695f27 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -52,7 +52,7 @@ extern "C" {
  *	    multiple connections of same traffic class belonging to
  *	    the same user;
  *           - Weighted Round Robin (WRR) is used to service the
- *	    queues within same pipe traffic class.
+ *	    queues within same pipe lowest priority (best-effort) traffic class.
  *
  */
 
@@ -66,26 +66,43 @@ extern "C" {
 #include "rte_red.h"
 #endif
 
-/** Number of traffic classes per pipe (as well as subport).
- * Cannot be changed.
+/** Maximum number of queues per pipe.
+ * Note that the multiple queues (power of 2) can only be assigned to
+ * lowest priority (best-effort) traffic class. Other higher priority traffic
+ * classes can only have one queue.
+ *
+ * Can not change.
+ */
+#define RTE_SCHED_QUEUES_PER_PIPE    16
+
+/** Number of WRR queues for lowest priority (best-effort) traffic class per
+ * pipe.
  */
-#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    4
+#define RTE_SCHED_WRR_QUEUES_PER_PIPE    8
 
-/** Number of queues per pipe traffic class. Cannot be changed. */
+/** Number of traffic classes per pipe (as well as subport). */
 #define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
+#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
+(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_WRR_QUEUES_PER_PIPE + 1)
 
-/** Number of queues per pipe. */
-#define RTE_SCHED_QUEUES_PER_PIPE             \
-	(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *     \
-	RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+/** Maximum number of subports that can be defined per port.
+ * Compile-time configurable.
+ */
+#ifndef RTE_SCHED_SUBPORTS_PER_PORT
+#define RTE_SCHED_SUBPORTS_PER_PORT      256
+#endif
 
-/** Maximum number of pipe profiles that can be defined per port.
+/** Maximum number of pipe profiles that can be defined per subport.
  * Compile-time configurable.
  */
 #ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT
 #define RTE_SCHED_PIPE_PROFILES_PER_PORT      256
 #endif
 
+#ifndef RTE_SCHED_PIPE_PROFILES_PER_SUBPORT
+#define RTE_SCHED_PIPE_PROFILES_PER_SUBPORT      256
+#endif
+
 /*
  * Ethernet framing overhead. Overhead fields per Ethernet frame:
  * 1. Preamble:                             7 bytes;
-- 
2.20.1



* [dpdk-dev] [PATCH 02/27] sched: update subport and pipe data structures
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 01/27] sched: update macros for flexible config Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 03/27] sched: update internal " Lukasz Krakowiak
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update public data structures for subport and pipe to allow configuration
flexibility for pipe traffic classes and queues, and subport level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c        |  2 +-
 examples/qos_sched/init.c    |  2 +-
 lib/librte_sched/rte_sched.h | 76 ++++++++++++++++++++++--------------
 3 files changed, 49 insertions(+), 31 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index 4eed8dbde..460eb53ec 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
 		.tc_rate = {305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index 37c2b95fd..f44a07cd6 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -186,7 +186,7 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
 		.tc_ov_weight = 1,
 #endif
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index cf7695f27..71728f725 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -117,6 +117,33 @@ extern "C" {
 #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
 #endif
 
+/*
+ * Pipe configuration parameters. The period and credits_per_period
+ * parameters are measured in bytes, with one byte meaning the time
+ * duration associated with the transmission of one byte on the
+ * physical medium of the output port, with pipe or pipe traffic class
+ * rate (measured as percentage of output port rate) determined as
+ * credits_per_period divided by period. One credit represents one
+ * byte.
+ */
+struct rte_sched_pipe_params {
+	/* Pipe token bucket */
+	uint32_t tb_rate; /**< Rate (measured in bytes per second) */
+	uint32_t tb_size; /**< Size (measured in credits) */
+
+	/* Pipe traffic classes */
+	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	/**< Traffic class rates (measured in bytes per second) */
+	uint32_t tc_period;
+	/**< Enforcement period (measured in milliseconds) */
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+	uint8_t tc_ov_weight; /**< Weight Traffic class 3 oversubscription */
+#endif
+
+	/* Pipe queues */
+	uint8_t  wrr_weights[RTE_SCHED_WRR_QUEUES_PER_PIPE]; /**< WRR weights */
+};
+
 /*
  * Subport configuration parameters. The period and credits_per_period
  * parameters are measured in bytes, with one byte meaning the time
@@ -128,14 +155,32 @@ extern "C" {
  */
 struct rte_sched_subport_params {
 	/* Subport token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
+	uint32_t tb_rate; /**< Rate (measured in bytes per second) */
+	uint32_t tb_size; /**< Size (measured in credits) */
 
 	/* Subport traffic classes */
 	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	/**< Traffic class rates (measured in bytes per second) */
 	uint32_t tc_period;
 	/**< Enforcement period for rates (measured in milliseconds) */
+
+	uint32_t n_subport_pipes;    /**< Number of subport_pipes */
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
+	/**< Packet queue size for each traffic class.
+	 * All queues which are not needed,  have zero size. All the pipes
+	 * within the same subport share the similar configuration for the
+	 * queues.
+	 */
+	struct rte_sched_pipe_params *pipe_profiles;
+	/**< Pipe profile table.
+	 * Every pipe is configured using one of the profiles from this table.
+	 */
+	uint32_t n_pipe_profiles; /**< Profiles in the pipe profile table */
+#ifdef RTE_SCHED_RED
+	struct rte_red_params
+		red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+		/**< RED parameters */
+#endif
 };
 
 /** Subport statistics */
@@ -158,33 +203,6 @@ struct rte_sched_subport_stats {
 #endif
 };
 
-/*
- * Pipe configuration parameters. The period and credits_per_period
- * parameters are measured in bytes, with one byte meaning the time
- * duration associated with the transmission of one byte on the
- * physical medium of the output port, with pipe or pipe traffic class
- * rate (measured as percentage of output port rate) determined as
- * credits_per_period divided by period. One credit represents one
- * byte.
- */
-struct rte_sched_pipe_params {
-	/* Pipe token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
-
-	/* Pipe traffic classes */
-	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Traffic class rates (measured in bytes per second) */
-	uint32_t tc_period;
-	/**< Enforcement period (measured in milliseconds) */
-#ifdef RTE_SCHED_SUBPORT_TC_OV
-	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3 oversubscription */
-#endif
-
-	/* Pipe queues */
-	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR weights */
-};
-
 /** Queue statistics */
 struct rte_sched_queue_stats {
 	/* Packets */
-- 
2.20.1



* [dpdk-dev] [PATCH 03/27] sched: update internal data structures
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 01/27] sched: update macros for flexible config Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 02/27] sched: update subport and pipe data structures Lukasz Krakowiak
@ 2019-05-28 12:05 ` " Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 04/27] sched: update port config api Lukasz Krakowiak
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update internal data structures of the scheduler to allow configuration
flexibility for pipe traffic classes and queues, and subport level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 182 +++++++++++++++++++++++++----------
 1 file changed, 130 insertions(+), 52 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index a60ddf97e..8256ac407 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -37,6 +37,7 @@
 
 #define RTE_SCHED_TB_RATE_CONFIG_ERR          (1e-7)
 #define RTE_SCHED_WRR_SHIFT                   3
+#define RTE_SCHED_TRAFFIC_CLASS_BE            (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
 #define RTE_SCHED_GRINDER_PCACHE_SIZE         (64 / RTE_SCHED_QUEUES_PER_PIPE)
 #define RTE_SCHED_PIPE_INVALID                UINT32_MAX
 #define RTE_SCHED_BMP_POS_INVALID             UINT32_MAX
@@ -46,6 +47,73 @@
  */
 #define RTE_SCHED_TIME_SHIFT		      8
 
+struct rte_sched_strict_priority_class {
+	struct rte_sched_queue *queue;
+	struct rte_mbuf **qbase;
+	uint32_t qindex;
+	uint16_t qsize;
+};
+
+struct rte_sched_best_effort_class {
+	struct rte_sched_queue *queue[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	struct rte_mbuf **qbase[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	uint32_t qindex[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	uint16_t qsize[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	uint32_t qmask;
+	uint32_t qpos;
+
+	/* WRR */
+	uint16_t wrr_tokens[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	uint16_t wrr_mask[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+	uint8_t wrr_cost[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+};
+
+enum grinder_state {
+	e_GRINDER_PREFETCH_PIPE = 0,
+	e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS,
+	e_GRINDER_PREFETCH_MBUF,
+	e_GRINDER_READ_MBUF
+};
+
+struct rte_sched_grinder {
+	/* Pipe cache */
+	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
+	uint32_t pcache_qindex[RTE_SCHED_GRINDER_PCACHE_SIZE];
+	uint32_t pcache_w;
+	uint32_t pcache_r;
+
+	/* Current pipe */
+	enum grinder_state state;
+	uint32_t productive;
+	uint32_t pindex;
+	struct rte_sched_subport *subport;
+	struct rte_sched_pipe *pipe;
+	struct rte_sched_pipe_profile *pipe_params;
+
+	/* TC cache */
+	uint8_t tccache_qmask[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t tccache_qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t tccache_w;
+	uint32_t tccache_r;
+
+	/* Current TC */
+	uint32_t tc_index;
+	struct rte_sched_strict_priority_class sp;
+	struct rte_sched_best_effort_class be;
+	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint16_t qsize;
+	uint32_t qmask;
+	uint32_t qpos;
+	struct rte_mbuf *pkt;
+
+	/* WRR */
+	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+};
+
 struct rte_sched_subport {
 	/* Token bucket (TB) */
 	uint64_t tb_time; /* time of last update */
@@ -71,7 +139,41 @@ struct rte_sched_subport {
 
 	/* Statistics */
 	struct rte_sched_subport_stats stats;
-};
+
+	/* Subport Pipes*/
+	uint32_t n_subport_pipes;
+
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
+	uint32_t n_pipe_profiles;
+	uint32_t pipe_tc_be_rate_max;
+#ifdef RTE_SCHED_RED
+	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+#endif
+
+	/* Scheduling loop detection */
+	uint32_t pipe_loop;
+	uint32_t pipe_exhaustion;
+
+	/* Bitmap */
+	struct rte_bitmap *bmp;
+	uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS] __rte_aligned_16;
+
+	/* Grinders */
+	struct rte_sched_grinder grinder[RTE_SCHED_PORT_N_GRINDERS];
+	uint32_t busy_grinders;
+
+	/* Queue base calculation */
+	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
+	uint32_t qsize_sum;
+
+	struct rte_sched_pipe *pipe;
+	struct rte_sched_queue *queue;
+	struct rte_sched_queue_extra *queue_extra;
+	struct rte_sched_pipe_profile *pipe_profiles;
+	uint8_t *bmp_array;
+	struct rte_mbuf **queue_array;
+	uint8_t memory[0] __rte_cache_aligned;
+} __rte_cache_aligned;
 
 struct rte_sched_pipe_profile {
 	/* Token bucket (TB) */
@@ -84,8 +186,12 @@ struct rte_sched_pipe_profile {
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint8_t tc_ov_weight;
 
+	/* Strict priority and best effort traffic class queues */
+	uint8_t n_sp_queues;
+	uint8_t n_be_queues;
+
 	/* Pipe queues */
-	uint8_t  wrr_cost[RTE_SCHED_QUEUES_PER_PIPE];
+	uint8_t  wrr_cost[RTE_SCHED_WRR_QUEUES_PER_PIPE];
 };
 
 struct rte_sched_pipe {
@@ -100,8 +206,11 @@ struct rte_sched_pipe {
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 
+	uint8_t n_sp_queues; /* Strict priority traffic class queues */
+	uint8_t n_be_queues; /* Best effort traffic class queues */
+
 	/* Weighted Round Robin (WRR) */
-	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
+	uint8_t wrr_tokens[RTE_SCHED_WRR_QUEUES_PER_PIPE];
 
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
@@ -121,55 +230,12 @@ struct rte_sched_queue_extra {
 #endif
 };
 
-enum grinder_state {
-	e_GRINDER_PREFETCH_PIPE = 0,
-	e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS,
-	e_GRINDER_PREFETCH_MBUF,
-	e_GRINDER_READ_MBUF
-};
-
-struct rte_sched_grinder {
-	/* Pipe cache */
-	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
-	uint32_t pcache_qindex[RTE_SCHED_GRINDER_PCACHE_SIZE];
-	uint32_t pcache_w;
-	uint32_t pcache_r;
-
-	/* Current pipe */
-	enum grinder_state state;
-	uint32_t productive;
-	uint32_t pindex;
-	struct rte_sched_subport *subport;
-	struct rte_sched_pipe *pipe;
-	struct rte_sched_pipe_profile *pipe_params;
-
-	/* TC cache */
-	uint8_t tccache_qmask[4];
-	uint32_t tccache_qindex[4];
-	uint32_t tccache_w;
-	uint32_t tccache_r;
-
-	/* Current TC */
-	uint32_t tc_index;
-	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint16_t qsize;
-	uint32_t qmask;
-	uint32_t qpos;
-	struct rte_mbuf *pkt;
-
-	/* WRR */
-	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-};
-
 struct rte_sched_port {
 	/* User parameters */
 	uint32_t n_subports_per_port;
 	uint32_t n_pipes_per_subport;
 	uint32_t n_pipes_per_subport_log2;
+	int socket;
 	uint32_t rate;
 	uint32_t mtu;
 	uint32_t frame_overhead;
@@ -199,6 +265,9 @@ struct rte_sched_port {
 	uint32_t busy_grinders;
 	struct rte_mbuf **pkts_out;
 	uint32_t n_pkts_out;
+	uint32_t subport_id;
+
+	uint32_t n_max_subport_pipes_log2;   /* Max number of subport pipes */
 
 	/* Queue base calculation */
 	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
@@ -212,6 +281,7 @@ struct rte_sched_port {
 	struct rte_sched_pipe_profile *pipe_profiles;
 	uint8_t *bmp_array;
 	struct rte_mbuf **queue_array;
+	struct rte_sched_subport *subports[RTE_SCHED_SUBPORTS_PER_PORT];
 	uint8_t memory[0] __rte_cache_aligned;
 } __rte_cache_aligned;
 
@@ -226,6 +296,16 @@ enum rte_sched_port_array {
 	e_RTE_SCHED_PORT_ARRAY_TOTAL,
 };
 
+enum rte_sched_subport_array {
+	e_RTE_SCHED_SUBPORT_ARRAY_PIPE = 0,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA,
+	e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES,
+	e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY,
+	e_RTE_SCHED_SUBPORT_ARRAY_TOTAL,
+};
+
 #ifdef RTE_SCHED_COLLECT_STATS
 
 static inline uint32_t
@@ -483,7 +563,7 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
 		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
 		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu]\n",
+		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu],\n",
 		i,
 
 		/* Token bucket */
@@ -502,10 +582,8 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_ov_weight,
 
 		/* WRR */
-		p->wrr_cost[ 0], p->wrr_cost[ 1], p->wrr_cost[ 2], p->wrr_cost[ 3],
-		p->wrr_cost[ 4], p->wrr_cost[ 5], p->wrr_cost[ 6], p->wrr_cost[ 7],
-		p->wrr_cost[ 8], p->wrr_cost[ 9], p->wrr_cost[10], p->wrr_cost[11],
-		p->wrr_cost[12], p->wrr_cost[13], p->wrr_cost[14], p->wrr_cost[15]);
+		p->wrr_cost[0], p->wrr_cost[1], p->wrr_cost[2], p->wrr_cost[3],
+		p->wrr_cost[4], p->wrr_cost[5], p->wrr_cost[6], p->wrr_cost[7]);
 }
 
 static inline uint64_t
-- 
2.20.1



* [dpdk-dev] [PATCH 04/27] sched: update port config api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (2 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 03/27] sched: update internal " Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 05/27] sched: update port free api Lukasz Krakowiak
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the scheduler port configuration API implementation to allow
configuration flexibility for pipe traffic classes and queues, and
subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 189 +++--------------------------------
 1 file changed, 15 insertions(+), 174 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 8256ac407..8f11b4597 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -388,8 +388,6 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 static int
 rte_sched_port_check_params(struct rte_sched_port_params *params)
 {
-	uint32_t i;
-
 	if (params == NULL)
 		return -1;
 
@@ -407,40 +405,10 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 
 	/* n_subports_per_port: non-zero, limited to 16 bits, power of 2 */
 	if (params->n_subports_per_port == 0 ||
-	    params->n_subports_per_port > 1u << 16 ||
+	    params->n_subports_per_port > RTE_SCHED_SUBPORTS_PER_PORT ||
 	    !rte_is_power_of_2(params->n_subports_per_port))
 		return -6;
 
-	/* n_pipes_per_subport: non-zero, power of 2 */
-	if (params->n_pipes_per_subport == 0 ||
-	    !rte_is_power_of_2(params->n_pipes_per_subport))
-		return -7;
-
-	/* qsize: non-zero, power of 2,
-	 * no bigger than 32K (due to 16-bit read/write pointers)
-	 */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint16_t qsize = params->qsize[i];
-
-		if (qsize == 0 || !rte_is_power_of_2(qsize))
-			return -8;
-	}
-
-	/* pipe_profiles and n_pipe_profiles */
-	if (params->pipe_profiles == NULL ||
-	    params->n_pipe_profiles == 0 ||
-	    params->n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_PORT)
-		return -9;
-
-	for (i = 0; i < params->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
-		int status;
-
-		status = pipe_profile_check(p, params->rate);
-		if (status != 0)
-			return status;
-	}
-
 	return 0;
 }
 
@@ -524,36 +492,6 @@ rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
 	return size0 + size1;
 }
 
-static void
-rte_sched_port_config_qsize(struct rte_sched_port *port)
-{
-	/* TC 0 */
-	port->qsize_add[0] = 0;
-	port->qsize_add[1] = port->qsize_add[0] + port->qsize[0];
-	port->qsize_add[2] = port->qsize_add[1] + port->qsize[0];
-	port->qsize_add[3] = port->qsize_add[2] + port->qsize[0];
-
-	/* TC 1 */
-	port->qsize_add[4] = port->qsize_add[3] + port->qsize[0];
-	port->qsize_add[5] = port->qsize_add[4] + port->qsize[1];
-	port->qsize_add[6] = port->qsize_add[5] + port->qsize[1];
-	port->qsize_add[7] = port->qsize_add[6] + port->qsize[1];
-
-	/* TC 2 */
-	port->qsize_add[8] = port->qsize_add[7] + port->qsize[1];
-	port->qsize_add[9] = port->qsize_add[8] + port->qsize[2];
-	port->qsize_add[10] = port->qsize_add[9] + port->qsize[2];
-	port->qsize_add[11] = port->qsize_add[10] + port->qsize[2];
-
-	/* TC 3 */
-	port->qsize_add[12] = port->qsize_add[11] + port->qsize[2];
-	port->qsize_add[13] = port->qsize_add[12] + port->qsize[3];
-	port->qsize_add[14] = port->qsize_add[13] + port->qsize[3];
-	port->qsize_add[15] = port->qsize_add[14] + port->qsize[3];
-
-	port->qsize_sum = port->qsize_add[15] + port->qsize[3];
-}
-
 static void
 rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 {
@@ -660,84 +598,33 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 	}
 }
 
-static void
-rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
-	struct rte_sched_port_params *params)
-{
-	uint32_t i;
-
-	for (i = 0; i < port->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		struct rte_sched_pipe_profile *dst = port->pipe_profiles + i;
-
-		rte_sched_pipe_profile_convert(src, dst, params->rate);
-		rte_sched_port_log_pipe_profile(port, i);
-	}
-
-	port->pipe_tc3_rate_max = 0;
-	for (i = 0; i < port->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		uint32_t pipe_tc3_rate = src->tc_rate[3];
-
-		if (port->pipe_tc3_rate_max < pipe_tc3_rate)
-			port->pipe_tc3_rate_max = pipe_tc3_rate;
-	}
-}
-
 struct rte_sched_port *
 rte_sched_port_config(struct rte_sched_port_params *params)
 {
 	struct rte_sched_port *port = NULL;
-	uint32_t mem_size, bmp_mem_size, n_queues_per_port, i, cycles_per_byte;
+	uint32_t cycles_per_byte;
+	int status;
 
-	/* Check user parameters. Determine the amount of memory to allocate */
-	mem_size = rte_sched_port_get_memory_footprint(params);
-	if (mem_size == 0)
-		return NULL;
+	status = rte_sched_port_check_params(params);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Port scheduler params check failed (%d)\n", status);
+
+		return NULL;
+	}
 
 	/* Allocate memory to store the data structures */
-	port = rte_zmalloc_socket("qos_params", mem_size, RTE_CACHE_LINE_SIZE,
-		params->socket);
+	port = rte_zmalloc_socket("qos_params", sizeof(struct rte_sched_port),
+		RTE_CACHE_LINE_SIZE, params->socket);
 	if (port == NULL)
 		return NULL;
 
-	/* compile time checks */
-	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS == 0);
-	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS & (RTE_SCHED_PORT_N_GRINDERS - 1));
-
 	/* User parameters */
 	port->n_subports_per_port = params->n_subports_per_port;
-	port->n_pipes_per_subport = params->n_pipes_per_subport;
-	port->n_pipes_per_subport_log2 =
-			__builtin_ctz(params->n_pipes_per_subport);
+	port->socket = params->socket;
 	port->rate = params->rate;
 	port->mtu = params->mtu + params->frame_overhead;
 	port->frame_overhead = params->frame_overhead;
-	memcpy(port->qsize, params->qsize, sizeof(params->qsize));
-	port->n_pipe_profiles = params->n_pipe_profiles;
-
-#ifdef RTE_SCHED_RED
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint32_t j;
-
-		for (j = 0; j < RTE_COLORS; j++) {
-			/* if min/max are both zero, then RED is disabled */
-			if ((params->red_params[i][j].min_th |
-			     params->red_params[i][j].max_th) == 0) {
-				continue;
-			}
-
-			if (rte_red_config_init(&port->red_config[i][j],
-				params->red_params[i][j].wq_log2,
-				params->red_params[i][j].min_th,
-				params->red_params[i][j].max_th,
-				params->red_params[i][j].maxp_inv) != 0) {
-				rte_free(port);
-				return NULL;
-			}
-		}
-	}
-#endif
 
 	/* Timing */
 	port->time_cpu_cycles = rte_get_tsc_cycles();
@@ -748,57 +635,11 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 		/ params->rate;
 	port->inv_cycles_per_byte = rte_reciprocal_value(cycles_per_byte);
 
-	/* Scheduling loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
-	port->pipe_exhaustion = 0;
-
-	/* Grinders */
-	port->busy_grinders = 0;
 	port->pkts_out = NULL;
 	port->n_pkts_out = 0;
+	port->subport_id = 0;
 
-	/* Queue base calculation */
-	rte_sched_port_config_qsize(port);
-
-	/* Large data structures */
-	port->subport = (struct rte_sched_subport *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_SUBPORT));
-	port->pipe = (struct rte_sched_pipe *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_PIPE));
-	port->queue = (struct rte_sched_queue *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE));
-	port->queue_extra = (struct rte_sched_queue_extra *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA));
-	port->pipe_profiles = (struct rte_sched_pipe_profile *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES));
-	port->bmp_array =  port->memory
-		+ rte_sched_port_get_array_base(params, e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY);
-	port->queue_array = (struct rte_mbuf **)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY));
-
-	/* Pipe profile table */
-	rte_sched_port_config_pipe_profile_table(port, params);
-
-	/* Bitmap */
-	n_queues_per_port = rte_sched_port_queues_per_port(port);
-	bmp_mem_size = rte_bitmap_get_memory_footprint(n_queues_per_port);
-	port->bmp = rte_bitmap_init(n_queues_per_port, port->bmp_array,
-				    bmp_mem_size);
-	if (port->bmp == NULL) {
-		RTE_LOG(ERR, SCHED, "Bitmap init error\n");
-		rte_free(port);
-		return NULL;
-	}
-
-	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++)
-		port->grinder_base_bmp_pos[i] = RTE_SCHED_PIPE_INVALID;
-
+	port->n_max_subport_pipes_log2 = 0;
 
 	return port;
 }
-- 
2.20.1



* [dpdk-dev] [PATCH 05/27] sched: update port free api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (3 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 04/27] sched: update port config api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 06/27] sched: update subport config api Lukasz Krakowiak
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the scheduler port free API implementation to allow configuration
flexibility for pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 61 ++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 13 deletions(-)
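
The free path in this patch walks every queue of every configured subport and releases whatever still sits between the read and write pointers. The sketch below is a standalone illustration of the two pieces of index arithmetic involved, assuming 16 queues per pipe as in the diff. The struct subport_layout type and the helper names are hypothetical stand-ins, not the library's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_PIPE 16 /* stand-in for RTE_SCHED_QUEUES_PER_PIPE */

/* Hypothetical, trimmed-down view of the per-subport queue layout. */
struct subport_layout {
	uint16_t qsize[QUEUES_PER_PIPE];     /* slots per queue; 0 = unused */
	uint32_t qsize_add[QUEUES_PER_PIPE]; /* offset of each queue in a pipe */
	uint32_t qsize_sum;                  /* slots used by one whole pipe */
};

/* Offset, in slots, of queue qindex from the start of the queue array. */
static uint32_t
queue_base_offset(const struct subport_layout *s, uint32_t qindex)
{
	uint32_t pindex = qindex >> 4; /* pipe within the subport */
	uint32_t qpos = qindex & 0xF;  /* queue within the pipe */

	return pindex * s->qsize_sum + s->qsize_add[qpos];
}

/* Number of slots a drain loop visits between read and write pointers. */
static uint32_t
queue_pending(uint16_t qr, uint16_t qw, uint16_t qsize)
{
	uint32_t n = 0;

	/* The mask arithmetic only works for a non-zero power of two. */
	assert(qsize != 0 && (qsize & (qsize - 1)) == 0);

	qr &= qsize - 1;
	qw &= qsize - 1;
	for (; qr != qw; qr = (qr + 1) & (qsize - 1))
		n++;

	return n;
}
```

The assertion in queue_pending() mirrors the reason the patch guards the drain loop with a qsize != 0 test: with unused queues now allowed, a zero size would make the mask wrap to 0xFFFF and index past the queue.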

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 8f11b4597..39a6165e3 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -316,6 +316,29 @@ rte_sched_port_queues_per_subport(struct rte_sched_port *port)
 
 #endif
 
+static inline uint32_t
+rte_sched_subport_queues(struct rte_sched_subport *subport)
+{
+	return RTE_SCHED_QUEUES_PER_PIPE * subport->n_subport_pipes;
+}
+
+static inline struct rte_mbuf **
+rte_sched_subport_qbase(struct rte_sched_subport *subport, uint32_t qindex)
+{
+	uint32_t pindex = qindex >> 4;
+	uint32_t qpos = qindex & 0xF;
+
+	return (subport->queue_array + pindex *
+		subport->qsize_sum + subport->qsize_add[qpos]);
+}
+
+static inline uint16_t
+rte_sched_subport_qsize(struct rte_sched_subport *subport, uint32_t qindex)
+{
+	uint32_t qpos = qindex & 0xF;
+	return subport->qsize[qpos];
+}
+
 static inline uint32_t
 rte_sched_port_queues_per_port(struct rte_sched_port *port)
 {
@@ -647,28 +670,40 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 void
 rte_sched_port_free(struct rte_sched_port *port)
 {
-	uint32_t qindex;
-	uint32_t n_queues_per_port;
+	uint32_t n_subport_queues;
+	uint32_t qindex, i;
 
 	/* Check user parameters */
 	if (port == NULL)
 		return;
 
-	n_queues_per_port = rte_sched_port_queues_per_port(port);
+	for (i = 0; i < port->n_subports_per_port; i++) {
+		struct rte_sched_subport *s = port->subports[i];
 
-	/* Free enqueued mbufs */
-	for (qindex = 0; qindex < n_queues_per_port; qindex++) {
-		struct rte_mbuf **mbufs = rte_sched_port_qbase(port, qindex);
-		uint16_t qsize = rte_sched_port_qsize(port, qindex);
-		struct rte_sched_queue *queue = port->queue + qindex;
-		uint16_t qr = queue->qr & (qsize - 1);
-		uint16_t qw = queue->qw & (qsize - 1);
+		if (s == NULL)
+			continue;
+
+		n_subport_queues = rte_sched_subport_queues(s);
+
+		/* Free enqueued mbufs */
+		for (qindex = 0; qindex < n_subport_queues; qindex++) {
+			struct rte_mbuf **mbufs =
+				rte_sched_subport_qbase(s, qindex);
+			uint16_t qsize = rte_sched_subport_qsize(s, qindex);
+			if (qsize != 0) {
+				struct rte_sched_queue *queue =
+					s->queue + qindex;
+				uint16_t qr = queue->qr & (qsize - 1);
+				uint16_t qw = queue->qw & (qsize - 1);
+
+				for (; qr != qw; qr = (qr + 1) & (qsize - 1))
+					rte_pktmbuf_free(mbufs[qr]);
+			}
+		}
 
-		for (; qr != qw; qr = (qr + 1) & (qsize - 1))
-			rte_pktmbuf_free(mbufs[qr]);
+		rte_bitmap_free(s->bmp);
 	}
 
-	rte_bitmap_free(port->bmp);
 	rte_free(port);
 }
 
-- 
2.20.1



* [dpdk-dev] [PATCH 06/27] sched: update subport config api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (4 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 05/27] sched: update port free api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 07/27] sched: update pipe profile add api Lukasz Krakowiak
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the subport configuration API implementation of the scheduler to
allow configuration flexibility for pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 325 ++++++++++++++++++++++++++++++-----
 1 file changed, 283 insertions(+), 42 deletions(-)
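
The queue base calculation in this patch is a running sum over the per-queue sizes, and each large array is placed at a cache-line-rounded offset. A minimal standalone sketch of both steps follows. The 64-byte cache line, the constant names and the helper names are assumptions made for illustration only.

```c
#include <stdint.h>

#define QUEUES_PER_PIPE 16 /* stand-in for RTE_SCHED_QUEUES_PER_PIPE */
#define CACHE_LINE 64      /* assumed cache line size in bytes */

/* Round a byte count up to the next multiple of the cache line size. */
static uint32_t
cache_line_roundup(uint32_t size)
{
	return (size + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/*
 * Fill qsize_add[] with the slot offset of each queue inside one pipe
 * and return the total number of slots one pipe needs. Queues with
 * size 0 contribute nothing, so unused queues cost no memory.
 */
static uint32_t
config_qsize(const uint16_t qsize[QUEUES_PER_PIPE],
	uint32_t qsize_add[QUEUES_PER_PIPE])
{
	uint32_t i;

	qsize_add[0] = 0;
	for (i = 1; i < QUEUES_PER_PIPE; i++)
		qsize_add[i] = qsize_add[i - 1] + qsize[i - 1];

	return qsize_add[QUEUES_PER_PIPE - 1] + qsize[QUEUES_PER_PIPE - 1];
}

/* Bytes reserved for the mbuf pointer array of a whole subport. */
static uint32_t
queue_array_bytes(uint32_t n_pipes, uint32_t slots_per_pipe)
{
	return cache_line_roundup(n_pipes * slots_per_pipe *
		(uint32_t)sizeof(void *));
}
```

With only three queues enabled (sizes 64, 64 and 128), config_qsize() reports 256 slots per pipe instead of the 16 full-size queues the old fixed layout would reserve.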

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 39a6165e3..020c028fd 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -495,24 +495,72 @@ rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sch
 	return base;
 }
 
-uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+static uint32_t
+rte_sched_subport_get_array_base(struct rte_sched_subport_params *params,
+	enum rte_sched_subport_array array)
 {
-	uint32_t size0, size1;
-	int status;
+	uint32_t n_subport_pipes = params->n_subport_pipes;
+	uint32_t n_subport_queues = RTE_SCHED_QUEUES_PER_PIPE * n_subport_pipes;
 
-	status = rte_sched_port_check_params(params);
-	if (status != 0) {
-		RTE_LOG(NOTICE, SCHED,
-			"Port scheduler params check failed (%d)\n", status);
+	uint32_t size_pipe = n_subport_pipes * sizeof(struct rte_sched_pipe);
+	uint32_t size_queue = n_subport_queues * sizeof(struct rte_sched_queue);
+	uint32_t size_queue_extra
+		= n_subport_queues * sizeof(struct rte_sched_queue_extra);
+	uint32_t size_pipe_profiles = RTE_SCHED_PIPE_PROFILES_PER_SUBPORT *
+		sizeof(struct rte_sched_pipe_profile);
+	uint32_t size_bmp_array =
+		rte_bitmap_get_memory_footprint(n_subport_queues);
+	uint32_t size_per_pipe_queue_array, size_queue_array;
 
-		return 0;
+	uint32_t base, i;
+
+	size_per_pipe_queue_array = 0;
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		size_per_pipe_queue_array += params->qsize[i] * sizeof(struct rte_mbuf *);
 	}
+	size_queue_array = n_subport_pipes * size_per_pipe_queue_array;
 
-	size0 = sizeof(struct rte_sched_port);
-	size1 = rte_sched_port_get_array_base(params, e_RTE_SCHED_PORT_ARRAY_TOTAL);
+	base = 0;
 
-	return size0 + size1;
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_PIPE)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_pipe);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue_extra);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_pipe_profiles);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_bmp_array);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue_array);
+
+	return base;
+}
+
+static void
+rte_sched_subport_config_qsize(struct rte_sched_subport *subport)
+{
+	uint32_t i;
+
+	subport->qsize_add[0] = 0;
+
+	for (i = 1; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+		subport->qsize_add[i] =
+			subport->qsize_add[i-1] + subport->qsize[i-1];
+
+	subport->qsize_sum = subport->qsize_add[15] + subport->qsize[15];
 }
 
 static void
@@ -621,6 +669,120 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 	}
 }
 
+static int
+rte_sched_subport_check_params(struct rte_sched_subport_params *params,
+	uint32_t rate)
+{
+	uint32_t i, j;
+
+	/* Check user parameters */
+	if (params == NULL)
+		return -1;
+
+	if (params->tb_rate == 0 || params->tb_rate > rate)
+		return -2;
+
+	if (params->tb_size == 0)
+		return -3;
+
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+		if (params->tc_rate[i] > params->tb_rate) {
+			printf("traffic class %u, tc_rate %u, tb_rate %u\n", i,
+				params->tc_rate[i], params->tb_rate);
+			return -4;
+		}
+	if (params->tc_period == 0)
+		return -6;
+
+	/* n_subport_pipes: non-zero, power of 2 */
+	if (params->n_subport_pipes == 0 ||
+	    !rte_is_power_of_2(params->n_subport_pipes))
+		return -7;
+
+	/* qsize: power of 2, if non-zero
+	 * no bigger than 32K (due to 16-bit read/write pointers)
+	 */
+	for (i = 0, j = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		uint32_t tc_rate = params->tc_rate[j];
+		uint16_t qsize = params->qsize[i];
+
+		if (((qsize == 0) &&
+			((tc_rate != 0) &&
+			(j != RTE_SCHED_TRAFFIC_CLASS_BE))) ||
+			((qsize != 0) && !rte_is_power_of_2(qsize)))
+			return -8;
+
+		if (j < RTE_SCHED_TRAFFIC_CLASS_BE)
+			j++;
+	}
+
+	/* WRR queues: 1, 2, 4 or 8 */
+	uint32_t wrr_queues = 0;
+	for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++) {
+		if (params->qsize[RTE_SCHED_TRAFFIC_CLASS_BE + i])
+			wrr_queues++;
+	}
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] &&
+		(wrr_queues != 1 && wrr_queues != 2 &&
+		wrr_queues != 4 && wrr_queues != 8))
+		return -9;
+
+	/* pipe_profiles and n_pipe_profiles */
+	if (params->pipe_profiles == NULL ||
+	    params->n_pipe_profiles == 0 ||
+	    params->n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_SUBPORT)
+		return -10;
+
+	return 0;
+}
+
+static uint32_t
+rte_sched_subport_get_memory_footprint(struct rte_sched_port *port,
+	uint32_t subport_id, struct rte_sched_subport_params *params)
+{
+	uint32_t size0, size1;
+	int status;
+
+	if (port == NULL ||
+	    subport_id >= port->n_subports_per_port)
+		return 0;
+
+	status = rte_sched_subport_check_params(params, port->rate);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Subport scheduler params check failed (%d)\n", status);
+
+		return 0;
+	}
+
+	size0 = sizeof(struct rte_sched_subport);
+	size1 = rte_sched_subport_get_array_base(params,
+			e_RTE_SCHED_SUBPORT_ARRAY_TOTAL);
+
+	return size0 + size1;
+}
+
+uint32_t
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+{
+	uint32_t size0, size1;
+	int status;
+
+	status = rte_sched_port_check_params(params);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Port scheduler params check failed (%d)\n", status);
+
+		return 0;
+	}
+
+	size0 = sizeof(struct rte_sched_port);
+	size1 = rte_sched_port_get_array_base(params,
+			e_RTE_SCHED_PORT_ARRAY_TOTAL);
+
+	return size0 + size1;
+}
+
 struct rte_sched_port *
 rte_sched_port_config(struct rte_sched_port_params *params)
 {
@@ -710,12 +872,12 @@ rte_sched_port_free(struct rte_sched_port *port)
 static void
 rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 {
-	struct rte_sched_subport *s = port->subport + i;
+	struct rte_sched_subport *s = port->subports[i];
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for subport %u:\n"
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: wm min = %u, wm max = %u\n",
+		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"    Traffic class BE oversubscription: wm min = %u, wm max = %u\n",
 		i,
 
 		/* Token bucket */
@@ -729,8 +891,13 @@ rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 		s->tc_credits_per_period[1],
 		s->tc_credits_per_period[2],
 		s->tc_credits_per_period[3],
+		s->tc_credits_per_period[4],
+		s->tc_credits_per_period[5],
+		s->tc_credits_per_period[6],
+		s->tc_credits_per_period[7],
+		s->tc_credits_per_period[8],
 
-		/* Traffic class 3 oversubscription */
+		/* Traffic class BE oversubscription */
 		s->tc_ov_wm_min,
 		s->tc_ov_wm_max);
 }
@@ -740,32 +907,21 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	uint32_t subport_id,
 	struct rte_sched_subport_params *params)
 {
-	struct rte_sched_subport *s;
-	uint32_t i;
+	struct rte_sched_subport *s = NULL;
+	uint32_t mem_size, bmp_mem_size, n_subport_queues, n_subport_pipes_log2, i;
 
-	/* Check user parameters */
-	if (port == NULL ||
-	    subport_id >= port->n_subports_per_port ||
-	    params == NULL)
+	/* Check user parameters. Determine the amount of memory to allocate */
+	mem_size = rte_sched_subport_get_memory_footprint(port,
+		subport_id, params);
+	if (mem_size == 0)
 		return -1;
 
-	if (params->tb_rate == 0 || params->tb_rate > port->rate)
+	/* Allocate memory to store the data structures */
+	s = rte_zmalloc_socket("subport_params", mem_size, RTE_CACHE_LINE_SIZE,
+		port->socket);
+	if (s == NULL)
 		return -2;
 
-	if (params->tb_size == 0)
-		return -3;
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-		    params->tc_rate[i] > params->tb_rate)
-			return -4;
-	}
-
-	if (params->tc_period == 0)
-		return -5;
-
-	s = port->subport + subport_id;
-
 	/* Token Bucket (TB) */
 	if (params->tb_rate == port->rate) {
 		s->tb_credits_per_period = 1;
@@ -784,19 +940,104 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	/* Traffic Classes (TCs) */
 	s->tc_period = rte_sched_time_ms_to_bytes(params->tc_period, port->rate);
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		s->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(params->tc_period,
-						     params->tc_rate[i]);
+		if (params->qsize[i])
+			s->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(params->tc_period,
+					params->tc_rate[i]);
 	}
 	s->tc_time = port->time + s->tc_period;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		s->tc_credits[i] = s->tc_credits_per_period[i];
+		if (params->qsize[i])
+			s->tc_credits[i] = s->tc_credits_per_period[i];
+
+	/* compile time checks */
+	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS == 0);
+	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS &
+		(RTE_SCHED_PORT_N_GRINDERS - 1));
+
+	/* User parameters */
+	s->n_subport_pipes = params->n_subport_pipes;
+	n_subport_pipes_log2 = __builtin_ctz(params->n_subport_pipes);
+	memcpy(s->qsize, params->qsize, sizeof(params->qsize));
+	s->n_pipe_profiles = params->n_pipe_profiles;
+
+#ifdef RTE_SCHED_RED
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+		uint32_t j;
+
+		for (j = 0; j < RTE_COLORS; j++) {
+			/* if min/max are both zero, then RED is disabled */
+			if ((params->red_params[i][j].min_th |
+			     params->red_params[i][j].max_th) == 0) {
+				continue;
+			}
+
+			if (rte_red_config_init(&s->red_config[i][j],
+				params->red_params[i][j].wq_log2,
+				params->red_params[i][j].min_th,
+				params->red_params[i][j].max_th,
+				params->red_params[i][j].maxp_inv) != 0) {
+				rte_free(s);
+				return -3;
+			}
+		}
+	}
+#endif
+
+	/* Scheduling loop detection */
+	s->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	s->pipe_exhaustion = 0;
+
+	/* Grinders */
+	s->busy_grinders = 0;
+
+	/* Queue base calculation */
+	rte_sched_subport_config_qsize(s);
+
+	/* Large data structures */
+	s->pipe = (struct rte_sched_pipe *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_PIPE));
+	s->queue = (struct rte_sched_queue *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE));
+	s->queue_extra = (struct rte_sched_queue_extra *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA));
+	s->pipe_profiles = (struct rte_sched_pipe_profile *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES));
+	s->bmp_array =  s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY);
+	s->queue_array = (struct rte_mbuf **)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY));
+
+	/* Bitmap */
+	n_subport_queues = rte_sched_subport_queues(s);
+	bmp_mem_size = rte_bitmap_get_memory_footprint(n_subport_queues);
+	s->bmp = rte_bitmap_init(n_subport_queues, s->bmp_array,
+				bmp_mem_size);
+	if (s->bmp == NULL) {
+		RTE_LOG(ERR, SCHED, "Subport bitmap init error\n");
+		rte_free(s);
+		return -4;
+	}
+
+	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++)
+		s->grinder_base_bmp_pos[i] = RTE_SCHED_PIPE_INVALID;
+
+	/* Port */
+	port->subports[subport_id] = s;
+
+	if (n_subport_pipes_log2 > port->n_max_subport_pipes_log2)
+		port->n_max_subport_pipes_log2 = n_subport_pipes_log2;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC oversubscription */
 	s->tc_ov_wm_min = port->mtu;
 	s->tc_ov_wm_max = rte_sched_time_ms_to_bytes(params->tc_period,
-						     port->pipe_tc3_rate_max);
+						     s->pipe_tc_be_rate_max);
 	s->tc_ov_wm = s->tc_ov_wm_max;
 	s->tc_ov_period_id = 0;
 	s->tc_ov = 0;
-- 
2.20.1



* [dpdk-dev] [PATCH 07/27] sched: update pipe profile add api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (5 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 06/27] sched: update subport config api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 14:06   ` Stephen Hemminger
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 08/27] sched: update pipe config api Lukasz Krakowiak
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the pipe profile add API implementation of the scheduler to allow
configuration flexibility for pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 257 +++++++++++++++++++++++++++--------
 lib/librte_sched/rte_sched.h |   3 +
 2 files changed, 205 insertions(+), 55 deletions(-)
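
The WRR setup in this patch turns each best-effort queue weight into a cost equal to the least common multiple of all active weights divided by that queue's weight, so a heavier queue is charged less per byte. The sketch below shows that rule as a loop over n active queues. It is an illustration only: gcd_u32(), lcm_u32() and wrr_costs() are hypothetical helpers, whereas the patch unrolls the 2, 4 and 8 queue cases around rte_get_lcd() and, with a single queue, stores the weight itself rather than 1.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
static uint32_t
gcd_u32(uint32_t a, uint32_t b)
{
	while (b != 0) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}

	return a;
}

/* Least common multiple; both inputs must be non-zero. */
static uint32_t
lcm_u32(uint32_t a, uint32_t b)
{
	return a / gcd_u32(a, b) * b;
}

/*
 * cost[i] = lcm(all weights) / weights[i].
 * Assumes every weight is non-zero and small enough that the LCM fits
 * in 32 bits and each cost fits in 8 bits; larger values truncate.
 */
static void
wrr_costs(const uint32_t *weights, uint32_t n, uint8_t *cost)
{
	uint32_t lcm = 1;
	uint32_t i;

	for (i = 0; i < n; i++) {
		assert(weights[i] != 0);
		lcm = lcm_u32(lcm, weights[i]);
	}

	for (i = 0; i < n; i++)
		cost[i] = (uint8_t)(lcm / weights[i]);
}
```

For weights 1, 2, 4 and 8 the LCM is 8, so the costs come out as 8, 4, 2 and 1: the queue with the largest weight pays the smallest cost and is served most often.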

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 020c028fd..c1079cdaa 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -365,44 +365,49 @@ rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
 
 static int
 pipe_profile_check(struct rte_sched_pipe_params *params,
-	uint32_t rate)
+	uint32_t rate, uint16_t *qsize)
 {
 	uint32_t i;
 
 	/* Pipe parameters */
 	if (params == NULL)
-		return -10;
+		return -11;
 
 	/* TB rate: non-zero, not greater than port rate */
 	if (params->tb_rate == 0 ||
 		params->tb_rate > rate)
-		return -11;
+		return -12;
 
 	/* TB size: non-zero */
 	if (params->tb_size == 0)
-		return -12;
+		return -13;
 
 	/* TC rate: non-zero, less than pipe rate */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-			params->tc_rate[i] > params->tb_rate)
-			return -13;
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		if ((qsize[i] == 0 && params->tc_rate[i] != 0) ||
+			(qsize[i] != 0 && (params->tc_rate[i] == 0 ||
+			params->tc_rate[i] > params->tb_rate)))
+			return -14;
 	}
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
+		return -15;
 
 	/* TC period: non-zero */
 	if (params->tc_period == 0)
-		return -14;
+		return -16;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC3 oversubscription weight: non-zero */
 	if (params->tc_ov_weight == 0)
-		return -15;
+		return -17;
 #endif
 
 	/* Queue WRR weights: non-zero */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
-		if (params->wrr_weights[i] == 0)
-			return -16;
+	for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++) {
+		uint32_t qindex = RTE_SCHED_TRAFFIC_CLASS_BE + i;
+		if ((qsize[qindex] != 0 && params->wrr_weights[i] == 0) ||
+			(qsize[qindex] == 0 && params->wrr_weights[i] != 0))
+			return -18;
 	}
 
 	return 0;
@@ -549,6 +554,120 @@ rte_sched_subport_get_array_base(struct rte_sched_subport_params *params,
 	return base;
 }
 
+static void
+rte_sched_pipe_queues_config(struct rte_sched_subport *subport,
+	struct rte_sched_pipe_profile *dst)
+{
+	uint32_t i;
+
+	/* Queues: strict priority */
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+		if (subport->qsize[i])
+			dst->n_sp_queues++;
+
+	/* Queues: best effort */
+	for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++)
+		if (subport->qsize[RTE_SCHED_TRAFFIC_CLASS_BE + i])
+			dst->n_be_queues++;
+}
+
+static void
+rte_sched_pipe_wrr_queues_config(struct rte_sched_pipe_params *src,
+	struct rte_sched_pipe_profile *dst)
+{
+	uint32_t wrr_cost[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+
+	if (dst->n_be_queues == 1) {
+		dst->wrr_cost[0] = (uint8_t) src->wrr_weights[0];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 2) {
+		uint32_t lcd;
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+
+		lcd = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+
+		wrr_cost[0] = lcd / wrr_cost[0];
+		wrr_cost[1] = lcd / wrr_cost[1];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 4) {
+		uint32_t lcd, lcd1, lcd2;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+		wrr_cost[2] = src->wrr_weights[2];
+		wrr_cost[3] = src->wrr_weights[3];
+
+		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
+		lcd = rte_get_lcd(lcd1, lcd2);
+
+		wrr_cost[0] = lcd / wrr_cost[0];
+		wrr_cost[1] = lcd / wrr_cost[1];
+		wrr_cost[2] = lcd / wrr_cost[2];
+		wrr_cost[3] = lcd / wrr_cost[3];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+		dst->wrr_cost[2] = (uint8_t) wrr_cost[2];
+		dst->wrr_cost[3] = (uint8_t) wrr_cost[3];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 8) {
+		uint32_t lcd1, lcd2, lcd3, lcd4, lcd5, lcd6, lcd7;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+		wrr_cost[2] = src->wrr_weights[2];
+		wrr_cost[3] = src->wrr_weights[3];
+		wrr_cost[4] = src->wrr_weights[4];
+		wrr_cost[5] = src->wrr_weights[5];
+		wrr_cost[6] = src->wrr_weights[6];
+		wrr_cost[7] = src->wrr_weights[7];
+
+		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
+		lcd3 = rte_get_lcd(wrr_cost[4], wrr_cost[5]);
+		lcd4 = rte_get_lcd(wrr_cost[6], wrr_cost[7]);
+
+		lcd5 = rte_get_lcd(lcd1, lcd2);
+		lcd6 = rte_get_lcd(lcd3, lcd4);
+
+		lcd7 = rte_get_lcd(lcd5, lcd6);
+
+		wrr_cost[0] = lcd7 / wrr_cost[0];
+		wrr_cost[1] = lcd7 / wrr_cost[1];
+		wrr_cost[2] = lcd7 / wrr_cost[2];
+		wrr_cost[3] = lcd7 / wrr_cost[3];
+		wrr_cost[4] = lcd7 / wrr_cost[4];
+		wrr_cost[5] = lcd7 / wrr_cost[5];
+		wrr_cost[6] = lcd7 / wrr_cost[6];
+		wrr_cost[7] = lcd7 / wrr_cost[7];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+		dst->wrr_cost[2] = (uint8_t) wrr_cost[2];
+		dst->wrr_cost[3] = (uint8_t) wrr_cost[3];
+		dst->wrr_cost[4] = (uint8_t) wrr_cost[4];
+		dst->wrr_cost[5] = (uint8_t) wrr_cost[5];
+		dst->wrr_cost[6] = (uint8_t) wrr_cost[6];
+		dst->wrr_cost[7] = (uint8_t) wrr_cost[7];
+
+		return;
+	}
+}
+
 static void
 rte_sched_subport_config_qsize(struct rte_sched_subport *subport)
 {
@@ -564,15 +683,15 @@ rte_sched_subport_config_qsize(struct rte_sched_subport *subport)
 }
 
 static void
-rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
+rte_sched_port_log_pipe_profile(struct rte_sched_subport *subport, uint32_t i)
 {
-	struct rte_sched_pipe_profile *p = port->pipe_profiles + i;
+	struct rte_sched_pipe_profile *p = subport->pipe_profiles + i;
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for pipe profile %u:\n"
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu],\n",
+		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"    Traffic class BE oversubscription: weight = %hhu\n"
+		"    WRR cost: [%hhu, %hhu, %hhu, %hhu, %hhu, %hhu, %hhu, %hhu]\n",
 		i,
 
 		/* Token bucket */
@@ -586,6 +705,11 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_credits_per_period[1],
 		p->tc_credits_per_period[2],
 		p->tc_credits_per_period[3],
+		p->tc_credits_per_period[4],
+		p->tc_credits_per_period[5],
+		p->tc_credits_per_period[6],
+		p->tc_credits_per_period[7],
+		p->tc_credits_per_period[8],
 
 		/* Traffic class 3 oversubscription */
 		p->tc_ov_weight,
@@ -606,7 +730,8 @@ rte_sched_time_ms_to_bytes(uint32_t time_ms, uint32_t rate)
 }
 
 static void
-rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
+rte_sched_pipe_profile_convert(struct rte_sched_subport *subport,
+	struct rte_sched_pipe_params *src,
 	struct rte_sched_pipe_profile *dst,
 	uint32_t rate)
 {
@@ -632,40 +757,42 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 						rate);
 
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		dst->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(src->tc_period,
-				src->tc_rate[i]);
+		if (subport->qsize[i])
+			dst->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(src->tc_period,
+					src->tc_rate[i]);
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	dst->tc_ov_weight = src->tc_ov_weight;
 #endif
 
+	rte_sched_pipe_queues_config(subport, dst);
+
 	/* WRR */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint32_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-		uint32_t lcd, lcd1, lcd2;
-		uint32_t qindex;
+	rte_sched_pipe_wrr_queues_config(src, dst);
+}
 
-		qindex = i * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+static void
+rte_sched_subport_config_pipe_profile_table(struct rte_sched_subport *subport,
+	struct rte_sched_subport_params *params, uint32_t rate)
+{
+	uint32_t i;
 
-		wrr_cost[0] = src->wrr_weights[qindex];
-		wrr_cost[1] = src->wrr_weights[qindex + 1];
-		wrr_cost[2] = src->wrr_weights[qindex + 2];
-		wrr_cost[3] = src->wrr_weights[qindex + 3];
+	for (i = 0; i < subport->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
+		struct rte_sched_pipe_profile *dst = subport->pipe_profiles + i;
 
-		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
-		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
-		lcd = rte_get_lcd(lcd1, lcd2);
+		rte_sched_pipe_profile_convert(subport, src, dst, rate);
+		rte_sched_port_log_pipe_profile(subport, i);
+	}
 
-		wrr_cost[0] = lcd / wrr_cost[0];
-		wrr_cost[1] = lcd / wrr_cost[1];
-		wrr_cost[2] = lcd / wrr_cost[2];
-		wrr_cost[3] = lcd / wrr_cost[3];
+	subport->pipe_tc_be_rate_max = 0;
+	for (i = 0; i < subport->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
+		uint32_t pipe_tc_be_rate = src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-		dst->wrr_cost[qindex] = (uint8_t) wrr_cost[0];
-		dst->wrr_cost[qindex + 1] = (uint8_t) wrr_cost[1];
-		dst->wrr_cost[qindex + 2] = (uint8_t) wrr_cost[2];
-		dst->wrr_cost[qindex + 3] = (uint8_t) wrr_cost[3];
+		if (subport->pipe_tc_be_rate_max < pipe_tc_be_rate)
+			subport->pipe_tc_be_rate_max = pipe_tc_be_rate;
 	}
 }
 
@@ -733,6 +860,15 @@ rte_sched_subport_check_params(struct rte_sched_subport_params *params,
 	    params->n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_SUBPORT)
 		return -10;
 
+	for (i = 0; i < params->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
+		int status;
+
+		status = pipe_profile_check(p, rate, &params->qsize[0]);
+		if (status != 0)
+			return status;
+	}
+
 	return 0;
 }
 
@@ -1013,6 +1149,9 @@ rte_sched_subport_config(struct rte_sched_port *port,
 		(s->memory + rte_sched_subport_get_array_base(params,
 						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY));
 
+	/* Pipe profile table */
+	rte_sched_subport_config_pipe_profile_table(s, params, port->rate);
+
 	/* Bitmap */
 	n_subport_queues = rte_sched_subport_queues(s);
 	bmp_mem_size = rte_bitmap_get_memory_footprint(n_subport_queues);
@@ -1150,10 +1289,12 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 int __rte_experimental
 rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
+	uint32_t subport_id,
 	struct rte_sched_pipe_params *params,
 	uint32_t *pipe_profile_id)
 {
 	struct rte_sched_pipe_profile *pp;
+	struct rte_sched_subport *s;
 	uint32_t i;
 	int status;
 
@@ -1161,31 +1302,37 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	if (port == NULL)
 		return -1;
 
-	/* Pipe profiles not exceeds the max limit */
-	if (port->n_pipe_profiles >= RTE_SCHED_PIPE_PROFILES_PER_PORT)
+	/* Subport id not exceeds the max limit */
+	if (subport_id >= port->n_subports_per_port)
 		return -2;
 
+	s = port->subports[subport_id];
+
+	/* Pipe profiles not exceeds the max limit */
+	if (s->n_pipe_profiles >= RTE_SCHED_PIPE_PROFILES_PER_SUBPORT)
+		return -3;
+
 	/* Pipe params */
-	status = pipe_profile_check(params, port->rate);
+	status = pipe_profile_check(params, port->rate, &s->qsize[0]);
 	if (status != 0)
 		return status;
 
-	pp = &port->pipe_profiles[port->n_pipe_profiles];
-	rte_sched_pipe_profile_convert(params, pp, port->rate);
+	pp = &s->pipe_profiles[s->n_pipe_profiles];
+	rte_sched_pipe_profile_convert(s, params, pp, port->rate);
 
 	/* Pipe profile not exists */
-	for (i = 0; i < port->n_pipe_profiles; i++)
-		if (memcmp(port->pipe_profiles + i, pp, sizeof(*pp)) == 0)
-			return -3;
+	for (i = 0; i < s->n_pipe_profiles; i++)
+		if (memcmp(s->pipe_profiles + i, pp, sizeof(*pp)) == 0)
+			return -4;
 
 	/* Pipe profile commit */
-	*pipe_profile_id = port->n_pipe_profiles;
-	port->n_pipe_profiles++;
+	*pipe_profile_id = s->n_pipe_profiles;
+	s->n_pipe_profiles++;
 
-	if (port->pipe_tc3_rate_max < params->tc_rate[3])
-		port->pipe_tc3_rate_max = params->tc_rate[3];
+	if (s->pipe_tc_be_rate_max < params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE])
+		s->pipe_tc_be_rate_max = params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-	rte_sched_port_log_pipe_profile(port, *pipe_profile_id);
+	rte_sched_port_log_pipe_profile(s, *pipe_profile_id);
 
 	return 0;
 }
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 71728f725..51f801098 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -277,6 +277,8 @@ rte_sched_port_free(struct rte_sched_port *port);
  *
  * @param port
  *   Handle to port scheduler instance
+ * @param subport_id
+ *   Subport ID
  * @param params
  *   Pipe profile parameters
  * @param pipe_profile_id
@@ -286,6 +288,7 @@ rte_sched_port_free(struct rte_sched_port *port);
  */
 int __rte_experimental
 rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
+	uint32_t subport_id,
 	struct rte_sched_pipe_params *params,
 	uint32_t *pipe_profile_id);
 
-- 
2.20.1
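
Note on the WRR hunk in this patch: the loop removed from
rte_sched_pipe_profile_convert() turned each queue weight into a cost by
dividing the least common multiple of the four weights by the weight
itself, so a heavier queue is charged less per byte sent. The helper that
replaces it, rte_sched_pipe_wrr_queues_config(), is not shown in this
message, so the following is only a minimal standalone sketch of the
removed arithmetic, not the new helper. The names wrr_gcd, wrr_lcm and
wrr_costs_from_weights are hypothetical, and the weights are assumed to
be non-zero 8-bit values.

```c
#include <assert.h>
#include <stdint.h>

#define WRR_QUEUES 4 /* assumption: four WRR queues, as in the removed loop */

/* Greatest common divisor (Euclid). */
static uint32_t
wrr_gcd(uint32_t a, uint32_t b)
{
	while (b != 0) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Least common multiple of two non-zero values. */
static uint32_t
wrr_lcm(uint32_t a, uint32_t b)
{
	return (a / wrr_gcd(a, b)) * b;
}

/* cost[i] = lcm(all weights) / weight[i], truncated to 8 bits. */
static void
wrr_costs_from_weights(const uint8_t weights[WRR_QUEUES],
	uint8_t costs[WRR_QUEUES])
{
	uint32_t lcm = 1;
	uint32_t i;

	for (i = 0; i < WRR_QUEUES; i++) {
		assert(weights[i] != 0); /* a zero weight has no defined cost */
		lcm = wrr_lcm(lcm, weights[i]);
	}

	for (i = 0; i < WRR_QUEUES; i++)
		costs[i] = (uint8_t)(lcm / weights[i]);
}
```

With weights {1, 2, 4, 4} the least common multiple is 4, so the costs
come out as {4, 2, 1, 1}: the two heaviest queues pay the least.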



* [dpdk-dev] [PATCH 08/27] sched: update pipe config api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (6 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 07/27] sched: update pipe profile add api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 09/27] sched: update pkt read and write api Lukasz Krakowiak
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the pipe configuration API implementation of the scheduler to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 58 ++++++++++++++++++++++--------------
 lib/librte_sched/rte_sched.h |  2 +-
 2 files changed, 36 insertions(+), 24 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index c1079cdaa..4b1959bb4 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1206,38 +1206,43 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 	if (port == NULL ||
 	    subport_id >= port->n_subports_per_port ||
-	    pipe_id >= port->n_pipes_per_subport ||
-	    (!deactivate && profile >= port->n_pipe_profiles))
+	    pipe_id >= port->subports[subport_id]->n_subport_pipes ||
+	    (!deactivate && profile >=
+	    port->subports[subport_id]->n_pipe_profiles))
 		return -1;
 
 
 	/* Check that subport configuration is valid */
-	s = port->subport + subport_id;
+	s = port->subports[subport_id];
 	if (s->tb_period == 0)
 		return -2;
 
-	p = port->pipe + (subport_id * port->n_pipes_per_subport + pipe_id);
+	p = s->pipe + pipe_id;
 
 	/* Handle the case when pipe already has a valid configuration */
 	if (p->tb_time) {
-		params = port->pipe_profiles + p->profile;
+		params = s->pipe_profiles + p->profile;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc_be_rate =
+			(double) s->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc_be_rate =
+			(double) params->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		/* Unplug pipe from its subport */
 		s->tc_ov_n -= params->tc_ov_weight;
-		s->tc_ov_rate -= pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate -= pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is OFF (%.4lf >= %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is OFF (%.4lf >= %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 #endif
 
@@ -1250,34 +1255,41 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 	/* Apply the new pipe configuration */
 	p->profile = profile;
-	params = port->pipe_profiles + p->profile;
+	params = s->pipe_profiles + p->profile;
 
 	/* Token Bucket (TB) */
 	p->tb_time = port->time;
 	p->tb_credits = params->tb_size / 2;
 
 	/* Traffic Classes (TCs) */
+	p->n_sp_queues = params->n_sp_queues;
+	p->n_be_queues = params->n_be_queues;
 	p->tc_time = port->time + params->tc_period;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		p->tc_credits[i] = params->tc_credits_per_period[i];
+		if (s->qsize[i])
+			p->tc_credits[i] = params->tc_credits_per_period[i];
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	{
 		/* Subport TC3 oversubscription */
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc_be_rate =
+			(double) s->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc_be_rate =
+			(double) params->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		s->tc_ov_n += params->tc_ov_weight;
-		s->tc_ov_rate += pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate += pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is ON (%.4lf < %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is ON (%.4lf < %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 		p->tc_ov_period_id = s->tc_ov_period_id;
 		p->tc_ov_credits = s->tc_ov_wm;
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 51f801098..da5790fc4 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -319,7 +319,7 @@ rte_sched_subport_config(struct rte_sched_port *port,
  * @param pipe_id
  *   Pipe ID within subport
  * @param pipe_profile
- *   ID of port-level pre-configured pipe profile
+ *   ID of subport-level pre-configured pipe profile
  * @return
  *   0 upon success, error code otherwise
  */
-- 
2.20.1



* [dpdk-dev] [PATCH 09/27] sched: update pkt read and write api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (7 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 08/27] sched: update pipe config api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 10/27] sched: update subport and tc queue stats Lukasz Krakowiak
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the run-time packet read and write API implementation of the
scheduler to allow flexible configuration of pipe traffic classes and
queues, and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 32 +++++++++++++++++---------------
 lib/librte_sched/rte_sched.h |  8 ++++----
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 4b1959bb4..d28d2d203 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1351,17 +1351,15 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 
 static inline uint32_t
 rte_sched_port_qindex(struct rte_sched_port *port,
+	struct rte_sched_subport *s,
 	uint32_t subport,
 	uint32_t pipe,
-	uint32_t traffic_class,
 	uint32_t queue)
 {
 	return ((subport & (port->n_subports_per_port - 1)) <<
-			(port->n_pipes_per_subport_log2 + 4)) |
-			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
-			((traffic_class &
-			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) << 2) |
-			(queue & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
+			(port->n_max_subport_pipes_log2 + 4)) |
+			((pipe & (s->n_subport_pipes - 1)) << 4) |
+			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
 }
 
 void
@@ -1371,9 +1369,9 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
 			 uint32_t traffic_class,
 			 uint32_t queue, enum rte_color color)
 {
-	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
-			traffic_class, queue);
-	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
+	struct rte_sched_subport *s = port->subports[subport];
+	uint32_t qindex = rte_sched_port_qindex(port, s, subport, pipe, queue);
+	rte_mbuf_sched_set(pkt, qindex, traffic_class, (uint8_t)color);
 }
 
 void
@@ -1382,13 +1380,17 @@ rte_sched_port_pkt_read_tree_path(struct rte_sched_port *port,
 				  uint32_t *subport, uint32_t *pipe,
 				  uint32_t *traffic_class, uint32_t *queue)
 {
-	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
+	struct rte_sched_subport *s;
+	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
+
+	*subport = (qindex >> (port->n_max_subport_pipes_log2 + 4)) &
+		(port->n_subports_per_port - 1);
 
-	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
-	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
-	*traffic_class = (queue_id >> 2) &
-				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1);
-	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1);
+	s = port->subports[*subport];
+	*pipe = (qindex >> 4) & (s->n_subport_pipes - 1);
+	*traffic_class = tc_id;
+	*queue = qindex & (RTE_SCHED_QUEUES_PER_PIPE - 1);
 }
 
 enum rte_color
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index da5790fc4..635b59550 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -402,9 +402,9 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 8)
  * @param queue
- *   Queue ID within pipe traffic class (0 .. 3)
+ *   Queue ID within pipe traffic class (0 .. 15)
  * @param color
  *   Packet color set
  */
@@ -429,9 +429,9 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 8)
  * @param queue
- *   Queue ID within pipe traffic class (0 .. 3)
+ *   Queue ID within pipe traffic class (0 .. 15)
  *
  */
 void
-- 
2.20.1



* [dpdk-dev] [PATCH 10/27] sched: update subport and tc queue stats
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (8 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 09/27] sched: update pkt read and write api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 11/27] sched: update port memory footprint api Lukasz Krakowiak
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the subport and TC queue stats implementation of the scheduler to
allow flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 46 +++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index d28d2d203..86f2bdf51 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1412,7 +1412,7 @@ rte_sched_subport_read_stats(struct rte_sched_port *port,
 	    stats == NULL || tc_ov == NULL)
 		return -1;
 
-	s = port->subport + subport_id;
+	s = port->subports[subport_id];
 
 	/* Copy subport stats and clear */
 	memcpy(stats, &s->stats, sizeof(struct rte_sched_subport_stats));
@@ -1468,10 +1468,10 @@ rte_sched_port_queue_is_empty(struct rte_sched_port *port, uint32_t qindex)
 #ifdef RTE_SCHED_COLLECT_STATS
 
 static inline void
-rte_sched_port_update_subport_stats(struct rte_sched_port *port, uint32_t qindex, struct rte_mbuf *pkt)
+rte_sched_port_update_subport_stats(struct rte_sched_subport *s,
+	struct rte_mbuf *pkt)
 {
-	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
 	s->stats.n_pkts_tc[tc_index] += 1;
@@ -1480,31 +1480,31 @@ rte_sched_port_update_subport_stats(struct rte_sched_port *port, uint32_t qindex
 
 #ifdef RTE_SCHED_RED
 static inline void
-rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port *port,
-						uint32_t qindex,
-						struct rte_mbuf *pkt, uint32_t red)
+rte_sched_port_update_subport_stats_on_drop(struct rte_sched_subport *subport,
+						struct rte_mbuf *pkt,
+						uint32_t red)
 #else
 static inline void
-rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port *port,
-						uint32_t qindex,
-						struct rte_mbuf *pkt, __rte_unused uint32_t red)
+rte_sched_port_update_subport_stats_on_drop(struct rte_sched_subport *subport,
+						struct rte_mbuf *pkt,
+						__rte_unused uint32_t red)
 #endif
 {
-	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
-	s->stats.n_pkts_tc_dropped[tc_index] += 1;
-	s->stats.n_bytes_tc_dropped[tc_index] += pkt_len;
+	subport->stats.n_pkts_tc_dropped[tc_index] += 1;
+	subport->stats.n_bytes_tc_dropped[tc_index] += pkt_len;
 #ifdef RTE_SCHED_RED
-	s->stats.n_pkts_red_dropped[tc_index] += red;
+	subport->stats.n_pkts_red_dropped[tc_index] += red;
 #endif
 }
 
 static inline void
-rte_sched_port_update_queue_stats(struct rte_sched_port *port, uint32_t qindex, struct rte_mbuf *pkt)
+rte_sched_port_update_queue_stats(struct rte_sched_subport *subport,
+	uint32_t qindex, struct rte_mbuf *pkt)
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	uint32_t pkt_len = pkt->pkt_len;
 
 	qe->stats.n_pkts += 1;
@@ -1513,17 +1513,19 @@ rte_sched_port_update_queue_stats(struct rte_sched_port *port, uint32_t qindex,
 
 #ifdef RTE_SCHED_RED
 static inline void
-rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port *port,
+rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 						uint32_t qindex,
-						struct rte_mbuf *pkt, uint32_t red)
+						struct rte_mbuf *pkt,
+						uint32_t red)
 #else
 static inline void
-rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port *port,
+rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 						uint32_t qindex,
-						struct rte_mbuf *pkt, __rte_unused uint32_t red)
+						struct rte_mbuf *pkt,
+						__rte_unused uint32_t red)
 #endif
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	uint32_t pkt_len = pkt->pkt_len;
 
 	qe->stats.n_pkts_dropped += 1;
-- 
2.20.1



* [dpdk-dev] [PATCH 11/27] sched: update port memory footprint api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (9 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 10/27] sched: update subport and tc queue stats Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 12/27] sched: update packet enqueue api Lukasz Krakowiak
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the port memory footprint API implementation of the scheduler to
allow flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 86 +++++++++---------------------------
 lib/librte_sched/rte_sched.h |  7 ++-
 2 files changed, 25 insertions(+), 68 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 86f2bdf51..e2a49633d 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -440,66 +440,6 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 	return 0;
 }
 
-static uint32_t
-rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sched_port_array array)
-{
-	uint32_t n_subports_per_port = params->n_subports_per_port;
-	uint32_t n_pipes_per_subport = params->n_pipes_per_subport;
-	uint32_t n_pipes_per_port = n_pipes_per_subport * n_subports_per_port;
-	uint32_t n_queues_per_port = RTE_SCHED_QUEUES_PER_PIPE * n_pipes_per_subport * n_subports_per_port;
-
-	uint32_t size_subport = n_subports_per_port * sizeof(struct rte_sched_subport);
-	uint32_t size_pipe = n_pipes_per_port * sizeof(struct rte_sched_pipe);
-	uint32_t size_queue = n_queues_per_port * sizeof(struct rte_sched_queue);
-	uint32_t size_queue_extra
-		= n_queues_per_port * sizeof(struct rte_sched_queue_extra);
-	uint32_t size_pipe_profiles
-		= RTE_SCHED_PIPE_PROFILES_PER_PORT * sizeof(struct rte_sched_pipe_profile);
-	uint32_t size_bmp_array = rte_bitmap_get_memory_footprint(n_queues_per_port);
-	uint32_t size_per_pipe_queue_array, size_queue_array;
-
-	uint32_t base, i;
-
-	size_per_pipe_queue_array = 0;
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		size_per_pipe_queue_array += RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS
-			* params->qsize[i] * sizeof(struct rte_mbuf *);
-	}
-	size_queue_array = n_pipes_per_port * size_per_pipe_queue_array;
-
-	base = 0;
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_SUBPORT)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_subport);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_PIPE)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_pipe);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue_extra);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_pipe_profiles);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_bmp_array);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue_array);
-
-	return base;
-}
-
 static uint32_t
 rte_sched_subport_get_array_base(struct rte_sched_subport_params *params,
 	enum rte_sched_subport_array array)
@@ -899,22 +839,36 @@ rte_sched_subport_get_memory_footprint(struct rte_sched_port *port,
 }
 
 uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *port_params,
+	struct rte_sched_subport_params *subport_params)
 {
-	uint32_t size0, size1;
+	uint32_t size0 = 0, size1 = 0, i;
 	int status;
 
-	status = rte_sched_port_check_params(params);
+	status = rte_sched_port_check_params(port_params);
 	if (status != 0) {
 		RTE_LOG(NOTICE, SCHED,
-			"Port scheduler params check failed (%d)\n", status);
+			"Port scheduler port params check failed (%d)\n", status);
+
+		return 0;
+	}
+
+	status = rte_sched_subport_check_params(subport_params,
+				port_params->rate);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Port scheduler subport params check failed (%d)\n", status);
 
 		return 0;
 	}
 
 	size0 = sizeof(struct rte_sched_port);
-	size1 = rte_sched_port_get_array_base(params,
-			e_RTE_SCHED_PORT_ARRAY_TOTAL);
+
+	for (i = 0; i < port_params->n_subports_per_port; i++) {
+		struct rte_sched_subport_params *sp = &subport_params[i];
+		size1 += rte_sched_subport_get_array_base(sp,
+			e_RTE_SCHED_SUBPORT_ARRAY_TOTAL);
+	}
 
 	return size0 + size1;
 }
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 635b59550..28b589309 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -332,13 +332,16 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 /**
  * Hierarchical scheduler memory footprint size per port
  *
- * @param params
+ * @param port_params
  *   Port scheduler configuration parameter structure
+ * @param subport_params
+ *   Subport configuration parameter structure
  * @return
  *   Memory footprint size in bytes upon success, 0 otherwise
  */
 uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params);
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *port_params,
+	struct rte_sched_subport_params *subport_params);
 
 /*
  * Statistics
-- 
2.20.1



* [dpdk-dev] [PATCH 12/27] sched: update packet enqueue api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (10 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 11/27] sched: update port memory footprint api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 13/27] sched: update grinder pipe and tc cache Lukasz Krakowiak
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the packet enqueue API implementation of the scheduler to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 203 +++++++++++++++++++++++------------
 1 file changed, 133 insertions(+), 70 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index e2a49633d..6cfa73761 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1494,7 +1494,11 @@ rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 #ifdef RTE_SCHED_RED
 
 static inline int
-rte_sched_port_red_drop(struct rte_sched_port *port, struct rte_mbuf *pkt, uint32_t qindex, uint16_t qlen)
+rte_sched_port_red_drop(struct rte_sched_subport *subport,
+	struct rte_mbuf *pkt,
+	uint32_t qindex,
+	uint16_t qlen,
+	uint64_t time)
 {
 	struct rte_sched_queue_extra *qe;
 	struct rte_red_config *red_cfg;
@@ -1502,23 +1506,24 @@ rte_sched_port_red_drop(struct rte_sched_port *port, struct rte_mbuf *pkt, uint3
 	uint32_t tc_index;
 	enum rte_color color;
 
-	tc_index = (qindex >> 2) & 0x3;
+	tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	color = rte_sched_port_pkt_read_color(pkt);
-	red_cfg = &port->red_config[tc_index][color];
+	red_cfg = &subport->red_config[tc_index][color];
 
 	if ((red_cfg->min_th | red_cfg->max_th) == 0)
 		return 0;
 
-	qe = port->queue_extra + qindex;
+	qe = subport->queue_extra + qindex;
 	red = &qe->red;
 
-	return rte_red_enqueue(red_cfg, red, qlen, port->time);
+	return rte_red_enqueue(red_cfg, red, qlen, time);
 }
 
 static inline void
-rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port, uint32_t qindex)
+rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t qindex)
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	struct rte_red *red = &qe->red;
 
 	rte_red_mark_queue_empty(red, port->time);
@@ -1526,10 +1531,23 @@ rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port, uint32_t q
 
 #else
 
-#define rte_sched_port_red_drop(port, pkt, qindex, qlen)             0
-
-#define rte_sched_port_set_queue_empty_timestamp(port, qindex)
+static inline int rte_sched_port_red_drop(
+	struct rte_sched_subport *subport __rte_unused,
+	struct rte_mbuf *pkt __rte_unused,
+	uint32_t qindex __rte_unused,
+	uint16_t qlen __rte_unused,
+	uint64_t time __rte_unused)
+{
+	return 0;
+}
 
+static inline void
+rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port __rte_unused,
+	struct rte_sched_subport *subport __rte_unused,
+	uint32_t qindex __rte_unused)
+{
+	return;
+}
 #endif /* RTE_SCHED_RED */
 
 #ifdef RTE_SCHED_DEBUG
@@ -1561,6 +1579,17 @@ debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
 
 #endif /* RTE_SCHED_DEBUG */
 
+static inline struct rte_sched_subport *
+rte_sched_port_subport_prefetch0(struct rte_sched_port *port,
+				       struct rte_mbuf *pkt)
+{
+	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+	uint32_t subport_id = (qindex >> (port->n_max_subport_pipes_log2 + 4)) &
+		(port->n_subports_per_port - 1);
+
+	return port->subports[subport_id];
+}
+
 static inline uint32_t
 rte_sched_port_enqueue_qptrs_prefetch0(struct rte_sched_port *port,
 				       struct rte_mbuf *pkt)
@@ -1571,53 +1600,55 @@ rte_sched_port_enqueue_qptrs_prefetch0(struct rte_sched_port *port,
 #endif
 	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
 
-	q = port->queue + qindex;
+	uint32_t subport_id = (qindex >> (port->n_max_subport_pipes_log2 + 4)) &
+						(port->n_subports_per_port - 1);
+	struct rte_sched_subport *s = port->subports[subport_id];
+	uint32_t queue_id = ((1 << (port->n_max_subport_pipes_log2 + 4)) - 1) &
+						qindex;
+
+	q = s->queue + queue_id;
 	rte_prefetch0(q);
 #ifdef RTE_SCHED_COLLECT_STATS
-	qe = port->queue_extra + qindex;
+	qe = s->queue_extra + queue_id;
 	rte_prefetch0(qe);
 #endif
 
-	return qindex;
+	return queue_id;
 }
 
 static inline void
-rte_sched_port_enqueue_qwa_prefetch0(struct rte_sched_port *port,
+rte_sched_port_enqueue_qwa_prefetch0(struct rte_sched_subport *subport,
 				     uint32_t qindex, struct rte_mbuf **qbase)
 {
 	struct rte_sched_queue *q;
 	struct rte_mbuf **q_qw;
 	uint16_t qsize;
 
-	q = port->queue + qindex;
-	qsize = rte_sched_port_qsize(port, qindex);
+	q = subport->queue + qindex;
+	qsize = rte_sched_subport_qsize(subport, qindex);
 	q_qw = qbase + (q->qw & (qsize - 1));
 
 	rte_prefetch0(q_qw);
-	rte_bitmap_prefetch0(port->bmp, qindex);
+	rte_bitmap_prefetch0(subport->bmp, qindex);
 }
 
 static inline int
-rte_sched_port_enqueue_qwa(struct rte_sched_port *port, uint32_t qindex,
-			   struct rte_mbuf **qbase, struct rte_mbuf *pkt)
+rte_sched_port_enqueue_qwa(struct rte_sched_subport *subport, uint32_t qindex,
+		struct rte_mbuf **qbase, struct rte_mbuf *pkt, uint64_t time)
 {
-	struct rte_sched_queue *q;
-	uint16_t qsize;
-	uint16_t qlen;
-
-	q = port->queue + qindex;
-	qsize = rte_sched_port_qsize(port, qindex);
-	qlen = q->qw - q->qr;
+	struct rte_sched_queue *q = subport->queue + qindex;
+	uint16_t qsize = rte_sched_subport_qsize(subport, qindex);
+	uint16_t qlen = q->qw - q->qr;
 
 	/* Drop the packet (and update drop stats) when queue is full */
-	if (unlikely(rte_sched_port_red_drop(port, pkt, qindex, qlen) ||
-		     (qlen >= qsize))) {
+	if (unlikely(rte_sched_port_red_drop(subport, pkt, qindex, qlen, time)
+		|| (qlen >= qsize))) {
 		rte_pktmbuf_free(pkt);
 #ifdef RTE_SCHED_COLLECT_STATS
-		rte_sched_port_update_subport_stats_on_drop(port, qindex, pkt,
-							    qlen < qsize);
-		rte_sched_port_update_queue_stats_on_drop(port, qindex, pkt,
-							  qlen < qsize);
+		rte_sched_port_update_subport_stats_on_drop(subport, pkt,
+			qlen < qsize);
+		rte_sched_port_update_queue_stats_on_drop(subport, qindex, pkt,
+			qlen < qsize);
 #endif
 		return 0;
 	}
@@ -1626,13 +1657,13 @@ rte_sched_port_enqueue_qwa(struct rte_sched_port *port, uint32_t qindex,
 	qbase[q->qw & (qsize - 1)] = pkt;
 	q->qw++;
 
-	/* Activate queue in the port bitmap */
-	rte_bitmap_set(port->bmp, qindex);
+	/* Activate queue in the subport bitmap */
+	rte_bitmap_set(subport->bmp, qindex);
 
 	/* Statistics */
 #ifdef RTE_SCHED_COLLECT_STATS
-	rte_sched_port_update_subport_stats(port, qindex, pkt);
-	rte_sched_port_update_queue_stats(port, qindex, pkt);
+	rte_sched_port_update_subport_stats(subport, pkt);
+	rte_sched_port_update_queue_stats(subport, qindex, pkt);
 #endif
 
 	return 1;
@@ -1660,6 +1691,8 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		*pkt30, *pkt31, *pkt_last;
 	struct rte_mbuf **q00_base, **q01_base, **q10_base, **q11_base,
 		**q20_base, **q21_base, **q30_base, **q31_base, **q_last_base;
+	struct rte_sched_subport *subport00, *subport01, *subport10, *subport11,
+		*subport20, *subport21, *subport30, *subport31, *subport_last;
 	uint32_t q00, q01, q10, q11, q20, q21, q30, q31, q_last;
 	uint32_t r00, r01, r10, r11, r20, r21, r30, r31, r_last;
 	uint32_t result, i;
@@ -1671,6 +1704,7 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	 * feed the pipeline
 	 */
 	if (unlikely(n_pkts < 6)) {
+		struct rte_sched_subport *subports[5];
 		struct rte_mbuf **q_base[5];
 		uint32_t q[5];
 
@@ -1678,22 +1712,27 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		for (i = 0; i < n_pkts; i++)
 			rte_prefetch0(pkts[i]);
 
+		/* Prefetch the subport structure for each packet */
+		for (i = 0; i < n_pkts; i++)
+			subports[i] =
+				rte_sched_port_subport_prefetch0(port, pkts[i]);
+
 		/* Prefetch the queue structure for each queue */
 		for (i = 0; i < n_pkts; i++)
 			q[i] = rte_sched_port_enqueue_qptrs_prefetch0(port,
-								      pkts[i]);
+				pkts[i]);
 
 		/* Prefetch the write pointer location of each queue */
 		for (i = 0; i < n_pkts; i++) {
-			q_base[i] = rte_sched_port_qbase(port, q[i]);
-			rte_sched_port_enqueue_qwa_prefetch0(port, q[i],
+			q_base[i] = rte_sched_subport_qbase(subports[i], q[i]);
+			rte_sched_port_enqueue_qwa_prefetch0(subports[i], q[i],
 							     q_base[i]);
 		}
 
 		/* Write each packet to its queue */
 		for (i = 0; i < n_pkts; i++)
-			result += rte_sched_port_enqueue_qwa(port, q[i],
-							     q_base[i], pkts[i]);
+			result += rte_sched_port_enqueue_qwa(subports[i], q[i],
+					q_base[i], pkts[i], port->time);
 
 		return result;
 	}
@@ -1709,6 +1748,8 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	rte_prefetch0(pkt10);
 	rte_prefetch0(pkt11);
 
+	subport20 = rte_sched_port_subport_prefetch0(port, pkt20);
+	subport21 = rte_sched_port_subport_prefetch0(port, pkt21);
 	q20 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt20);
 	q21 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt21);
 
@@ -1717,13 +1758,15 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	rte_prefetch0(pkt00);
 	rte_prefetch0(pkt01);
 
+	subport10 = rte_sched_port_subport_prefetch0(port, pkt10);
+	subport11 = rte_sched_port_subport_prefetch0(port, pkt11);
 	q10 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt10);
 	q11 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt11);
 
-	q20_base = rte_sched_port_qbase(port, q20);
-	q21_base = rte_sched_port_qbase(port, q21);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q20, q20_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q21, q21_base);
+	q20_base = rte_sched_subport_qbase(subport20, q20);
+	q21_base = rte_sched_subport_qbase(subport21, q21);
+	rte_sched_port_enqueue_qwa_prefetch0(subport20, q20, q20_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport21, q21, q21_base);
 
 	/* Run the pipeline */
 	for (i = 6; i < (n_pkts & (~1)); i += 2) {
@@ -1738,6 +1781,10 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		q31 = q21;
 		q20 = q10;
 		q21 = q11;
+		subport30 = subport20;
+		subport31 = subport21;
+		subport20 = subport10;
+		subport21 = subport11;
 		q30_base = q20_base;
 		q31_base = q21_base;
 
@@ -1747,19 +1794,25 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		rte_prefetch0(pkt00);
 		rte_prefetch0(pkt01);
 
-		/* Stage 1: Prefetch queue structure storing queue pointers */
+		/* Stage 1: Prefetch subport and queue structure storing queue
+		 *  pointers
+		 */
+		subport10 = rte_sched_port_subport_prefetch0(port, pkt10);
+		subport11 = rte_sched_port_subport_prefetch0(port, pkt11);
 		q10 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt10);
 		q11 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt11);
 
 		/* Stage 2: Prefetch queue write location */
-		q20_base = rte_sched_port_qbase(port, q20);
-		q21_base = rte_sched_port_qbase(port, q21);
-		rte_sched_port_enqueue_qwa_prefetch0(port, q20, q20_base);
-		rte_sched_port_enqueue_qwa_prefetch0(port, q21, q21_base);
+		q20_base = rte_sched_subport_qbase(subport20, q20);
+		q21_base = rte_sched_subport_qbase(subport21, q21);
+		rte_sched_port_enqueue_qwa_prefetch0(subport20, q20, q20_base);
+		rte_sched_port_enqueue_qwa_prefetch0(subport21, q21, q21_base);
 
 		/* Stage 3: Write packet to queue and activate queue */
-		r30 = rte_sched_port_enqueue_qwa(port, q30, q30_base, pkt30);
-		r31 = rte_sched_port_enqueue_qwa(port, q31, q31_base, pkt31);
+		r30 = rte_sched_port_enqueue_qwa(subport30, q30, q30_base,
+			pkt30, port->time);
+		r31 = rte_sched_port_enqueue_qwa(subport31, q31, q31_base,
+			pkt31, port->time);
 		result += r30 + r31;
 	}
 
@@ -1771,38 +1824,48 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	pkt_last = pkts[n_pkts - 1];
 	rte_prefetch0(pkt_last);
 
+	subport00 = rte_sched_port_subport_prefetch0(port, pkt00);
+	subport01 = rte_sched_port_subport_prefetch0(port, pkt01);
 	q00 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt00);
 	q01 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt01);
 
-	q10_base = rte_sched_port_qbase(port, q10);
-	q11_base = rte_sched_port_qbase(port, q11);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q10, q10_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q11, q11_base);
+	q10_base = rte_sched_subport_qbase(subport10, q10);
+	q11_base = rte_sched_subport_qbase(subport11, q11);
+	rte_sched_port_enqueue_qwa_prefetch0(subport10, q10, q10_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport11, q11, q11_base);
 
-	r20 = rte_sched_port_enqueue_qwa(port, q20, q20_base, pkt20);
-	r21 = rte_sched_port_enqueue_qwa(port, q21, q21_base, pkt21);
+	r20 = rte_sched_port_enqueue_qwa(subport20, q20, q20_base, pkt20,
+		port->time);
+	r21 = rte_sched_port_enqueue_qwa(subport21, q21, q21_base, pkt21,
+		port->time);
 	result += r20 + r21;
 
+	subport_last = rte_sched_port_subport_prefetch0(port, pkt_last);
 	q_last = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt_last);
 
-	q00_base = rte_sched_port_qbase(port, q00);
-	q01_base = rte_sched_port_qbase(port, q01);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q00, q00_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q01, q01_base);
+	q00_base = rte_sched_subport_qbase(subport00, q00);
+	q01_base = rte_sched_subport_qbase(subport01, q01);
+	rte_sched_port_enqueue_qwa_prefetch0(subport00, q00, q00_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport01, q01, q01_base);
 
-	r10 = rte_sched_port_enqueue_qwa(port, q10, q10_base, pkt10);
-	r11 = rte_sched_port_enqueue_qwa(port, q11, q11_base, pkt11);
+	r10 = rte_sched_port_enqueue_qwa(subport10, q10, q10_base, pkt10,
+		port->time);
+	r11 = rte_sched_port_enqueue_qwa(subport11, q11, q11_base, pkt11,
+		port->time);
 	result += r10 + r11;
 
-	q_last_base = rte_sched_port_qbase(port, q_last);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q_last, q_last_base);
+	q_last_base = rte_sched_subport_qbase(subport_last, q_last);
+	rte_sched_port_enqueue_qwa_prefetch0(subport_last, q_last, q_last_base);
 
-	r00 = rte_sched_port_enqueue_qwa(port, q00, q00_base, pkt00);
-	r01 = rte_sched_port_enqueue_qwa(port, q01, q01_base, pkt01);
+	r00 = rte_sched_port_enqueue_qwa(subport00, q00, q00_base, pkt00,
+		port->time);
+	r01 = rte_sched_port_enqueue_qwa(subport01, q01, q01_base, pkt01,
+		port->time);
 	result += r00 + r01;
 
 	if (n_pkts & 1) {
-		r_last = rte_sched_port_enqueue_qwa(port, q_last, q_last_base, pkt_last);
+		r_last = rte_sched_port_enqueue_qwa(subport_last, q_last,
+				q_last_base, pkt_last, port->time);
 		result += r_last;
 	}
 
@@ -2044,7 +2107,7 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 		rte_bitmap_clear(port->bmp, qindex);
 		grinder->qmask &= ~(1 << grinder->qpos);
 		grinder->wrr_mask[grinder->qpos] = 0;
-		rte_sched_port_set_queue_empty_timestamp(port, qindex);
+		rte_sched_port_set_queue_empty_timestamp(port, port->subport, qindex);
 	}
 
 	/* Reset pipe loop detection */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 163+ messages in thread
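
For readers following the enqueue changes in the patch above: the port-wide queue index carried in the mbuf is split into a subport id (upper bits) and a queue index local to that subport (lower bits). The following is a minimal sketch of that bit arithmetic only. The struct and function names are hypothetical, and it assumes 16 queues per pipe (the "+ 4" in the patch) and a power-of-two number of subports, as the mask in the patch implies.

```c
#include <stdint.h>

/* Hypothetical container for the two halves of a port-wide queue index. */
struct qindex_parts {
	uint32_t subport_id; /* which subport the queue belongs to */
	uint32_t queue_id;   /* queue index inside that subport */
};

/*
 * Split a port-wide queue index.
 * n_pipes_log2: log2 of the maximum number of pipes per subport.
 * n_subports:   number of subports per port (assumed power of two).
 */
static inline struct qindex_parts
qindex_split(uint32_t qindex, uint32_t n_pipes_log2, uint32_t n_subports)
{
	/* 4 extra bits select one of the 16 queues inside a pipe */
	uint32_t shift = n_pipes_log2 + 4;
	struct qindex_parts p;

	p.subport_id = (qindex >> shift) & (n_subports - 1);
	p.queue_id = qindex & ((1u << shift) - 1);
	return p;
}
```

With 4096 pipes per subport (n_pipes_log2 = 12), queue 7 of pipe 5 in subport 3 has a local queue_id of (5 << 4) | 7 = 87.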

* [dpdk-dev] [PATCH 13/27] sched: update grinder pipe and tc cache
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (11 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 12/27] sched: update packet enqueue api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 14/27] sched: update grinder next pipe and tc functions Lukasz Krakowiak
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the grinder pipe and tc cache population of the scheduler to allow
configuration flexibility for pipe traffic classes and queues, and
subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 47 ++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 6cfa73761..32a5121cd 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2174,9 +2174,10 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #endif /* RTE_SCHED_OPTIMIZATIONS */
 
 static inline void
-grinder_pcache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t bmp_pos, uint64_t bmp_slab)
+grinder_pcache_populate(struct rte_sched_subport *subport, uint32_t pos,
+	uint32_t bmp_pos, uint64_t bmp_slab)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	uint16_t w[4];
 
 	grinder->pcache_w = 0;
@@ -2205,34 +2206,28 @@ grinder_pcache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t bmp_
 }
 
 static inline void
-grinder_tccache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t qindex, uint16_t qmask)
+grinder_tccache_populate(struct rte_sched_subport *subport, uint32_t pos,
+	uint32_t qindex, uint16_t qmask)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint8_t b[4];
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	uint32_t i;
+	uint8_t b;
 
 	grinder->tccache_w = 0;
 	grinder->tccache_r = 0;
 
-	b[0] = (uint8_t) (qmask & 0xF);
-	b[1] = (uint8_t) ((qmask >> 4) & 0xF);
-	b[2] = (uint8_t) ((qmask >> 8) & 0xF);
-	b[3] = (uint8_t) ((qmask >> 12) & 0xF);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[0];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex;
-	grinder->tccache_w += (b[0] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[1];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 4;
-	grinder->tccache_w += (b[1] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[2];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 8;
-	grinder->tccache_w += (b[2] != 0);
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		b = (uint8_t) ((qmask >> i) & 0x1);
+		grinder->tccache_qmask[grinder->tccache_w] = b;
+		grinder->tccache_qindex[grinder->tccache_w] = qindex + i;
+		grinder->tccache_w += (b != 0);
+	}
 
-	grinder->tccache_qmask[grinder->tccache_w] = b[3];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 12;
-	grinder->tccache_w += (b[3] != 0);
+	b = (uint8_t) (qmask >> (RTE_SCHED_TRAFFIC_CLASS_BE));
+	grinder->tccache_qmask[grinder->tccache_w] = b;
+	grinder->tccache_qindex[grinder->tccache_w] = qindex +
+		RTE_SCHED_TRAFFIC_CLASS_BE;
+	grinder->tccache_w += (b != 0);
 }
 
 static inline int
@@ -2304,7 +2299,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 		port->grinder_base_bmp_pos[pos] = bmp_pos;
 
 		/* Install new pipe group into grinder's pipe cache */
-		grinder_pcache_populate(port, pos, bmp_pos, bmp_slab);
+		grinder_pcache_populate(port->subport, pos, bmp_pos, bmp_slab);
 
 		pipe_qmask = grinder->pcache_qmask[0];
 		pipe_qindex = grinder->pcache_qindex[0];
@@ -2318,7 +2313,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 	grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
 	grinder->productive = 0;
 
-	grinder_tccache_populate(port, pos, pipe_qindex, pipe_qmask);
+	grinder_tccache_populate(port->subport, pos, pipe_qindex, pipe_qmask);
 	grinder_next_tc(port, pos);
 
 	/* Check for pipe exhaustion */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 163+ messages in thread
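
The reworked grinder_tccache_populate() in the patch above treats the low bits of the pipe queue mask as one single-queue strict priority TC each, and the remaining high bits as the mask of the best-effort TC. Below is a minimal standalone sketch of that pattern. The struct, the function name and the SKETCH_TC_BE value of 12 are assumptions made for illustration; the real value of RTE_SCHED_TRAFFIC_CLASS_BE is defined elsewhere in the series.

```c
#include <stdint.h>

#define SKETCH_TC_BE  12u                 /* assumed index of the BE TC */
#define SKETCH_TC_MAX (SKETCH_TC_BE + 1u) /* SP TCs plus one BE TC */

/* Hypothetical stand-in for the grinder's TC cache fields. */
struct tc_cache {
	uint32_t w;                     /* number of valid entries */
	uint8_t qmask[SKETCH_TC_MAX];   /* active-queue mask per entry */
	uint32_t qindex[SKETCH_TC_MAX]; /* first queue index per entry */
};

static inline void
tc_cache_fill(struct tc_cache *c, uint32_t qindex, uint16_t qmask)
{
	uint32_t i;
	uint8_t b;

	c->w = 0;

	/* One bit per strict priority TC: write the entry unconditionally,
	 * then advance the write index only if the queue is active. */
	for (i = 0; i < SKETCH_TC_BE; i++) {
		b = (uint8_t)((qmask >> i) & 0x1);
		c->qmask[c->w] = b;
		c->qindex[c->w] = qindex + i;
		c->w += (b != 0);
	}

	/* Remaining high bits are the active mask of the BE queues. */
	b = (uint8_t)(qmask >> SKETCH_TC_BE);
	c->qmask[c->w] = b;
	c->qindex[c->w] = qindex + SKETCH_TC_BE;
	c->w += (b != 0);
}
```

The write-then-conditionally-advance form keeps the loop free of branches; an inactive entry is simply overwritten by the next one.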

* [dpdk-dev] [PATCH 14/27] sched: update grinder next pipe and tc functions
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (12 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 13/27] sched: update grinder pipe and tc cache Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 15/27] sched: update pipe and tc queues prefetch Lukasz Krakowiak
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the implementation of the grinder next pipe and tc functions to allow
configuration flexibility for pipe traffic classes and queues, and subport
level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 115 ++++++++++++++++-------------------
 1 file changed, 53 insertions(+), 62 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 32a5121cd..74a8e0a71 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -345,24 +345,6 @@ rte_sched_port_queues_per_port(struct rte_sched_port *port)
 	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
 }
 
-static inline struct rte_mbuf **
-rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t pindex = qindex >> 4;
-	uint32_t qpos = qindex & 0xF;
-
-	return (port->queue_array + pindex *
-		port->qsize_sum + port->qsize_add[qpos]);
-}
-
-static inline uint16_t
-rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t tc = (qindex >> 2) & 0x3;
-
-	return port->qsize[tc];
-}
-
 static int
 pipe_profile_check(struct rte_sched_pipe_params *params,
 	uint32_t rate, uint16_t *qsize)
@@ -2120,13 +2102,14 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 #ifdef SCHED_VECTOR_SSE4
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	__m128i index = _mm_set1_epi32(base_pipe);
-	__m128i pipes = _mm_load_si128((__m128i *)port->grinder_base_bmp_pos);
+	__m128i pipes =
+		_mm_load_si128((__m128i *)subport->grinder_base_bmp_pos);
 	__m128i res = _mm_cmpeq_epi32(pipes, index);
 
-	pipes = _mm_load_si128((__m128i *)(port->grinder_base_bmp_pos + 4));
+	pipes = _mm_load_si128((__m128i *)(subport->grinder_base_bmp_pos + 4));
 	pipes = _mm_cmpeq_epi32(pipes, index);
 	res = _mm_or_si128(res, pipes);
 
@@ -2139,10 +2122,10 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #elif defined(SCHED_VECTOR_NEON)
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	uint32x4_t index, pipes;
-	uint32_t *pos = (uint32_t *)port->grinder_base_bmp_pos;
+	uint32_t *pos = (uint32_t *)subport->grinder_base_bmp_pos;
 
 	index = vmovq_n_u32(base_pipe);
 	pipes = vld1q_u32(pos);
@@ -2159,12 +2142,12 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #else
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	uint32_t i;
 
 	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++) {
-		if (port->grinder_base_bmp_pos[i] == base_pipe)
+		if (subport->grinder_base_bmp_pos[i] == base_pipe)
 			return 1;
 	}
 
@@ -2231,47 +2214,54 @@ grinder_tccache_populate(struct rte_sched_subport *subport, uint32_t pos,
 }
 
 static inline int
-grinder_next_tc(struct rte_sched_port *port, uint32_t pos)
+grinder_next_tc(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf **qbase;
 	uint32_t qindex;
 	uint16_t qsize;
+	uint32_t i;
 
 	if (grinder->tccache_r == grinder->tccache_w)
 		return 0;
 
 	qindex = grinder->tccache_qindex[grinder->tccache_r];
-	qbase = rte_sched_port_qbase(port, qindex);
-	qsize = rte_sched_port_qsize(port, qindex);
+	grinder->tc_index =
+		(qindex < RTE_SCHED_TRAFFIC_CLASS_BE) ?
+		qindex : RTE_SCHED_TRAFFIC_CLASS_BE;
 
-	grinder->tc_index = (qindex >> 2) & 0x3;
-	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
-	grinder->qsize = qsize;
+	qbase = rte_sched_subport_qbase(subport, qindex);
+	if (grinder->tc_index < pipe->n_sp_queues) {
+		qsize = rte_sched_subport_qsize(subport, qindex);
 
-	grinder->qindex[0] = qindex;
-	grinder->qindex[1] = qindex + 1;
-	grinder->qindex[2] = qindex + 2;
-	grinder->qindex[3] = qindex + 3;
+		grinder->sp.qindex = qindex;
+		grinder->sp.queue = subport->queue + qindex;
+		grinder->sp.qbase = qbase;
+		grinder->sp.qsize = qsize;
 
-	grinder->queue[0] = port->queue + qindex;
-	grinder->queue[1] = port->queue + qindex + 1;
-	grinder->queue[2] = port->queue + qindex + 2;
-	grinder->queue[3] = port->queue + qindex + 3;
+		grinder->tccache_r++;
+		return 1;
+	}
+
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		qsize = rte_sched_subport_qsize(subport, qindex + i);
 
-	grinder->qbase[0] = qbase;
-	grinder->qbase[1] = qbase + qsize;
-	grinder->qbase[2] = qbase + 2 * qsize;
-	grinder->qbase[3] = qbase + 3 * qsize;
+		grinder->be.qindex[i] = qindex + i;
+		grinder->be.queue[i] = subport->queue + qindex + i;
+		grinder->be.qbase[i] = qbase + i * qsize;
+		grinder->be.qsize[i] = qsize;
+	}
+	grinder->be.qmask = grinder->tccache_qmask[grinder->tccache_r];
 
 	grinder->tccache_r++;
 	return 1;
 }
 
 static inline int
-grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
+grinder_next_pipe(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	uint32_t pipe_qindex;
 	uint16_t pipe_qmask;
 
@@ -2284,22 +2274,23 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 		uint32_t bmp_pos = 0;
 
 		/* Get another non-empty pipe group */
-		if (unlikely(rte_bitmap_scan(port->bmp, &bmp_pos, &bmp_slab) <= 0))
+		if (unlikely(rte_bitmap_scan(subport->bmp, &bmp_pos, &bmp_slab)
+			<= 0))
 			return 0;
 
 #ifdef RTE_SCHED_DEBUG
-		debug_check_queue_slab(port, bmp_pos, bmp_slab);
+		debug_check_queue_slab(subport, bmp_pos, bmp_slab);
 #endif
 
 		/* Return if pipe group already in one of the other grinders */
-		port->grinder_base_bmp_pos[pos] = RTE_SCHED_BMP_POS_INVALID;
-		if (unlikely(grinder_pipe_exists(port, bmp_pos)))
+		subport->grinder_base_bmp_pos[pos] = RTE_SCHED_BMP_POS_INVALID;
+		if (unlikely(grinder_pipe_exists(subport, bmp_pos)))
 			return 0;
 
-		port->grinder_base_bmp_pos[pos] = bmp_pos;
+		subport->grinder_base_bmp_pos[pos] = bmp_pos;
 
 		/* Install new pipe group into grinder's pipe cache */
-		grinder_pcache_populate(port->subport, pos, bmp_pos, bmp_slab);
+		grinder_pcache_populate(subport, pos, bmp_pos, bmp_slab);
 
 		pipe_qmask = grinder->pcache_qmask[0];
 		pipe_qindex = grinder->pcache_qindex[0];
@@ -2308,18 +2299,18 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 
 	/* Install new pipe in the grinder */
 	grinder->pindex = pipe_qindex >> 4;
-	grinder->subport = port->subport + (grinder->pindex / port->n_pipes_per_subport);
-	grinder->pipe = port->pipe + grinder->pindex;
+	grinder->subport = subport;
+	grinder->pipe = subport->pipe + grinder->pindex;
 	grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
 	grinder->productive = 0;
 
-	grinder_tccache_populate(port->subport, pos, pipe_qindex, pipe_qmask);
-	grinder_next_tc(port, pos);
+	grinder_tccache_populate(subport, pos, pipe_qindex, pipe_qmask);
+	grinder_next_tc(subport, pos);
 
 	/* Check for pipe exhaustion */
-	if (grinder->pindex == port->pipe_loop) {
-		port->pipe_exhaustion = 1;
-		port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	if (grinder->pindex == subport->pipe_loop) {
+		subport->pipe_exhaustion = 1;
+		subport->pipe_loop = RTE_SCHED_PIPE_INVALID;
 	}
 
 	return 1;
@@ -2455,7 +2446,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	switch (grinder->state) {
 	case e_GRINDER_PREFETCH_PIPE:
 	{
-		if (grinder_next_pipe(port, pos)) {
+		if (grinder_next_pipe(port->subport, pos)) {
 			grinder_prefetch_pipe(port, pos);
 			port->busy_grinders++;
 
@@ -2502,7 +2493,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		grinder_wrr_store(port, pos);
 
 		/* Look for another active TC within same pipe */
-		if (grinder_next_tc(port, pos)) {
+		if (grinder_next_tc(port->subport, pos)) {
 			grinder_prefetch_tc_queue_arrays(port, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_MBUF;
@@ -2516,7 +2507,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		grinder_evict(port, pos);
 
 		/* Look for another active pipe */
-		if (grinder_next_pipe(port, pos)) {
+		if (grinder_next_pipe(port->subport, pos)) {
 			grinder_prefetch_pipe(port, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 163+ messages in thread
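
In grinder_next_pipe() above, a grinder first invalidates its own slot in grinder_base_bmp_pos, then checks whether any grinder already holds the candidate pipe group, and only then records the new position. A minimal scalar sketch of that claim sequence follows. The names are hypothetical, the grinder count of 8 is an assumption taken from the two 4-lane loads in the vector variants, and UINT32_MAX as the invalid marker is likewise assumed.

```c
#include <stdint.h>

#define SKETCH_N_GRINDERS  8u         /* assumed number of grinders */
#define SKETCH_POS_INVALID UINT32_MAX /* assumed "no pipe group" marker */

/* Return 1 if any grinder slot already holds bmp_pos, 0 otherwise. */
static inline int
pos_in_use(const uint32_t *base_pos, uint32_t bmp_pos)
{
	uint32_t i;

	for (i = 0; i < SKETCH_N_GRINDERS; i++) {
		if (base_pos[i] == bmp_pos)
			return 1;
	}
	return 0;
}

/*
 * Try to claim bmp_pos for grinder `slot`.
 * The slot is invalidated first so a grinder never collides with its
 * own previous position. Returns 1 on success, 0 if another grinder
 * already owns that pipe group.
 */
static inline int
pos_try_claim(uint32_t *base_pos, uint32_t slot, uint32_t bmp_pos)
{
	base_pos[slot] = SKETCH_POS_INVALID;
	if (pos_in_use(base_pos, bmp_pos))
		return 0;

	base_pos[slot] = bmp_pos;
	return 1;
}
```

On a failed claim the slot stays invalid, which matches the early return in the patch.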

* [dpdk-dev] [PATCH 15/27] sched: update pipe and tc queues prefetch
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (13 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 14/27] sched: update grinder next pipe and tc functions Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 16/27] sched: update grinder wrr compute function Lukasz Krakowiak
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the pipe and tc queue prefetch functions of the scheduler to allow
configuration flexibility for pipe traffic classes and queues, and subport
level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 42 +++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 74a8e0a71..07939c04f 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2389,34 +2389,42 @@ grinder_wrr(struct rte_sched_port *port, uint32_t pos)
 #define grinder_evict(port, pos)
 
 static inline void
-grinder_prefetch_pipe(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	rte_prefetch0(grinder->pipe);
-	rte_prefetch0(grinder->queue[0]);
+	rte_prefetch0(grinder->sp.queue);
 }
 
 static inline void
 grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint16_t qsize, qr[4];
+	struct rte_sched_grinder *grinder = port->subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	struct rte_sched_queue *queue;
+	uint32_t tc_index = grinder->tc_index, i;
+	uint16_t qsize, qr[RTE_SCHED_WRR_QUEUES_PER_PIPE];
+
+	if (tc_index < pipe->n_sp_queues) {
+		queue = grinder->sp.queue;
+		qsize = grinder->sp.qsize;
+		qr[0] = queue->qr & (qsize - 1);
 
-	qsize = grinder->qsize;
-	qr[0] = grinder->queue[0]->qr & (qsize - 1);
-	qr[1] = grinder->queue[1]->qr & (qsize - 1);
-	qr[2] = grinder->queue[2]->qr & (qsize - 1);
-	qr[3] = grinder->queue[3]->qr & (qsize - 1);
+		rte_prefetch0(grinder->sp.qbase + qr[0]);
+		return;
+	}
+
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		queue = grinder->be.queue[i];
+		qsize = grinder->be.qsize[i];
+		qr[i] = queue->qr & (qsize - 1);
 
-	rte_prefetch0(grinder->qbase[0] + qr[0]);
-	rte_prefetch0(grinder->qbase[1] + qr[1]);
+		rte_prefetch0(grinder->be.qbase[i] + qr[i]);
+	}
 
 	grinder_wrr_load(port, pos);
 	grinder_wrr(port, pos);
-
-	rte_prefetch0(grinder->qbase[2] + qr[2]);
-	rte_prefetch0(grinder->qbase[3] + qr[3]);
 }
 
 static inline void
@@ -2447,7 +2455,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	case e_GRINDER_PREFETCH_PIPE:
 	{
 		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port, pos);
+			grinder_prefetch_pipe(port->subport, pos);
 			port->busy_grinders++;
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
@@ -2508,7 +2516,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		/* Look for another active pipe */
 		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port, pos);
+			grinder_prefetch_pipe(port->subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return result;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 163+ messages in thread
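
The prefetch code above computes the next read slot as "queue->qr & (qsize - 1)", and the enqueue path earlier in the series uses the same masking for the write slot together with "qw - qr" for the queue length. The sketch below isolates that ring arithmetic. The names are hypothetical, and it assumes that qsize is a power of two and that qr and qw are free-running 16-bit counters, which is what the masking in the patches suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue read/write counters. */
struct ring_ctr {
	uint16_t qr; /* free-running read counter */
	uint16_t qw; /* free-running write counter */
};

/* Number of occupied slots; stays correct across 16-bit wrap-around. */
static inline uint16_t
ring_len(const struct ring_ctr *q)
{
	return (uint16_t)(q->qw - q->qr);
}

/* Map a free-running counter to a slot; qsize must be a power of two. */
static inline uint16_t
ring_slot(uint16_t counter, uint16_t qsize)
{
	assert(qsize != 0 && (qsize & (qsize - 1)) == 0);
	return (uint16_t)(counter & (qsize - 1));
}

/* Advance qw and return 1 if there is room, return 0 if the queue is full. */
static inline int
ring_try_enqueue(struct ring_ctr *q, uint16_t qsize)
{
	if (ring_len(q) >= qsize)
		return 0;

	q->qw++;
	return 1;
}
```

Because the counters are never reduced modulo qsize, a full queue (length == qsize) and an empty queue (length == 0) remain distinguishable without a spare slot.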

* [dpdk-dev] [PATCH 16/27] sched: update grinder wrr compute function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (14 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 15/27] sched: update pipe and tc queues prefetch Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 17/27] sched: modify credits update function Lukasz Krakowiak
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the weighted round robin function for the best-effort traffic class
queues of the scheduler to allow configuration flexibility for pipe traffic
classes and queues, and subport level configuration of the pipe parameters.
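
As a rough illustration of the selection step being generalized here: inactive queues are forced to the maximum token value, the queue with the fewest tokens is picked, and all token counts are rebased by that minimum. The sketch below uses a plain loop for any queue count instead of the unrolled per-count cases and the rte_min_pos_*_u16 helpers; the function name is hypothetical and the first-minimum tie-break is an assumption.

```c
#include <stdint.h>

/*
 * Pick the active queue with the fewest WRR tokens.
 * tokens: per-queue token counts, updated in place.
 * mask:   0xFFFF for an active queue, 0 for an inactive one.
 * n:      number of best-effort queues (must be at least 1).
 * Returns the position of the selected queue.
 */
static inline uint32_t
wrr_select(uint16_t *tokens, const uint16_t *mask, uint32_t n)
{
	uint32_t i, pos = 0;
	uint16_t min;

	/* Inactive queues become 0xFFFF so they never win the minimum */
	for (i = 0; i < n; i++)
		tokens[i] |= (uint16_t)~mask[i];

	/* First position holding the smallest token count */
	for (i = 1; i < n; i++) {
		if (tokens[i] < tokens[pos])
			pos = i;
	}

	/* Rebase so the selected queue starts from zero tokens */
	min = tokens[pos];
	for (i = 0; i < n; i++)
		tokens[i] -= min;

	return pos;
}
```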

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c        | 135 +++++++++++++++++-----------
 lib/librte_sched/rte_sched_common.h |  41 +++++++++
 2 files changed, 125 insertions(+), 51 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 07939c04f..a9b5f7bf8 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2316,73 +2316,106 @@ grinder_next_pipe(struct rte_sched_subport *subport, uint32_t pos)
 	return 1;
 }
 
-
 static inline void
-grinder_wrr_load(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr_load(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *pipe_params = grinder->pipe_params;
-	uint32_t tc_index = grinder->tc_index;
-	uint32_t qmask = grinder->qmask;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
-
-	grinder->wrr_tokens[0] = ((uint16_t) pipe->wrr_tokens[qindex]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[1] = ((uint16_t) pipe->wrr_tokens[qindex + 1]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[2] = ((uint16_t) pipe->wrr_tokens[qindex + 2]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[3] = ((uint16_t) pipe->wrr_tokens[qindex + 3]) << RTE_SCHED_WRR_SHIFT;
-
-	grinder->wrr_mask[0] = (qmask & 0x1) * 0xFFFF;
-	grinder->wrr_mask[1] = ((qmask >> 1) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[2] = ((qmask >> 2) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[3] = ((qmask >> 3) & 0x1) * 0xFFFF;
+	uint32_t qmask = grinder->be.qmask;
+	uint32_t qindex = grinder->be.qindex[0];
+	uint32_t i;
 
-	grinder->wrr_cost[0] = pipe_params->wrr_cost[qindex];
-	grinder->wrr_cost[1] = pipe_params->wrr_cost[qindex + 1];
-	grinder->wrr_cost[2] = pipe_params->wrr_cost[qindex + 2];
-	grinder->wrr_cost[3] = pipe_params->wrr_cost[qindex + 3];
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		grinder->be.wrr_tokens[i] =
+			((uint16_t) pipe->wrr_tokens[qindex + i]) << RTE_SCHED_WRR_SHIFT;
+		grinder->be.wrr_mask[i] = ((qmask >> i) & 0x1) * 0xFFFF;
+		grinder->be.wrr_cost[i] = pipe_params->wrr_cost[qindex + i];
+	}
 }
 
 static inline void
-grinder_wrr_store(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr_store(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	uint32_t tc_index = grinder->tc_index;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
+	uint32_t i;
 
-	pipe->wrr_tokens[qindex] = (grinder->wrr_tokens[0] & grinder->wrr_mask[0])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 1] = (grinder->wrr_tokens[1] & grinder->wrr_mask[1])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 2] = (grinder->wrr_tokens[2] & grinder->wrr_mask[2])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 3] = (grinder->wrr_tokens[3] & grinder->wrr_mask[3])
-		>> RTE_SCHED_WRR_SHIFT;
+	if (tc_index == RTE_SCHED_TRAFFIC_CLASS_BE)
+		for (i = 0; i < pipe->n_be_queues; i++)
+			pipe->wrr_tokens[i] =
+				(grinder->be.wrr_tokens[i] & grinder->be.wrr_mask[i]) >>
+				RTE_SCHED_WRR_SHIFT;
 }
 
 static inline void
-grinder_wrr(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	uint32_t n_be_queues = pipe->n_be_queues;
 	uint16_t wrr_tokens_min;
 
-	grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
-	grinder->wrr_tokens[1] |= ~grinder->wrr_mask[1];
-	grinder->wrr_tokens[2] |= ~grinder->wrr_mask[2];
-	grinder->wrr_tokens[3] |= ~grinder->wrr_mask[3];
+	if (n_be_queues == 1) {
+		grinder->be.wrr_tokens[0] |= ~grinder->be.wrr_mask[0];
+		grinder->be.qpos = 0;
+		wrr_tokens_min = grinder->be.wrr_tokens[0];
+		grinder->be.wrr_tokens[0] -= wrr_tokens_min;
+		return;
+	}
+
+	if (n_be_queues == 2) {
+		grinder->be.wrr_tokens[0] |= ~grinder->be.wrr_mask[0];
+		grinder->be.wrr_tokens[1] |= ~grinder->be.wrr_mask[1];
+
+		grinder->be.qpos = rte_min_pos_2_u16(grinder->be.wrr_tokens);
+		wrr_tokens_min = grinder->be.wrr_tokens[grinder->be.qpos];
+
+		grinder->be.wrr_tokens[0] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[1] -= wrr_tokens_min;
+		return;
+	}
+
+	if (n_be_queues == 4) {
+		grinder->be.wrr_tokens[0] |= ~grinder->be.wrr_mask[0];
+		grinder->be.wrr_tokens[1] |= ~grinder->be.wrr_mask[1];
+		grinder->be.wrr_tokens[2] |= ~grinder->be.wrr_mask[2];
+		grinder->be.wrr_tokens[3] |= ~grinder->be.wrr_mask[3];
+
+		grinder->be.qpos = rte_min_pos_4_u16(grinder->be.wrr_tokens);
+		wrr_tokens_min = grinder->be.wrr_tokens[grinder->be.qpos];
 
-	grinder->qpos = rte_min_pos_4_u16(grinder->wrr_tokens);
-	wrr_tokens_min = grinder->wrr_tokens[grinder->qpos];
+		grinder->be.wrr_tokens[0] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[1] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[2] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[3] -= wrr_tokens_min;
+		return;
+	}
 
-	grinder->wrr_tokens[0] -= wrr_tokens_min;
-	grinder->wrr_tokens[1] -= wrr_tokens_min;
-	grinder->wrr_tokens[2] -= wrr_tokens_min;
-	grinder->wrr_tokens[3] -= wrr_tokens_min;
+	if (n_be_queues == 8) {
+		grinder->be.wrr_tokens[0] |= ~grinder->be.wrr_mask[0];
+		grinder->be.wrr_tokens[1] |= ~grinder->be.wrr_mask[1];
+		grinder->be.wrr_tokens[2] |= ~grinder->be.wrr_mask[2];
+		grinder->be.wrr_tokens[3] |= ~grinder->be.wrr_mask[3];
+		grinder->be.wrr_tokens[4] |= ~grinder->be.wrr_mask[4];
+		grinder->be.wrr_tokens[5] |= ~grinder->be.wrr_mask[5];
+		grinder->be.wrr_tokens[6] |= ~grinder->be.wrr_mask[6];
+		grinder->be.wrr_tokens[7] |= ~grinder->be.wrr_mask[7];
+
+		grinder->be.qpos = rte_min_pos_8_u16(grinder->be.wrr_tokens);
+		wrr_tokens_min = grinder->be.wrr_tokens[grinder->be.qpos];
+
+		grinder->be.wrr_tokens[0] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[1] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[2] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[3] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[4] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[5] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[6] -= wrr_tokens_min;
+		grinder->be.wrr_tokens[7] -= wrr_tokens_min;
+		return;
+	}
 }
 
 
@@ -2423,8 +2456,8 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 		rte_prefetch0(grinder->be.qbase[i] + qr[i]);
 	}
 
-	grinder_wrr_load(port, pos);
-	grinder_wrr(port, pos);
+	grinder_wrr_load(port->subport, pos);
+	grinder_wrr(port->subport, pos);
 }
 
 static inline void
@@ -2493,12 +2526,12 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-			grinder_wrr(port, pos);
+			grinder_wrr(port->subport, pos);
 			grinder_prefetch_mbuf(port, pos);
 
 			return 1;
 		}
-		grinder_wrr_store(port, pos);
+		grinder_wrr_store(port->subport, pos);
 
 		/* Look for another active TC within same pipe */
 		if (grinder_next_tc(port->subport, pos)) {
diff --git a/lib/librte_sched/rte_sched_common.h b/lib/librte_sched/rte_sched_common.h
index 8c191a9b8..bb3595f26 100644
--- a/lib/librte_sched/rte_sched_common.h
+++ b/lib/librte_sched/rte_sched_common.h
@@ -20,6 +20,18 @@ rte_sched_min_val_2_u32(uint32_t x, uint32_t y)
 	return (x < y)? x : y;
 }
 
+/* Simplified version to remove branches with CMOV instruction */
+static inline uint32_t
+rte_min_pos_2_u16(uint16_t *x)
+{
+	uint32_t pos0 = 0;
+
+	if (x[1] <= x[0])
+		pos0 = 1;
+
+	return pos0;
+}
+
 #if 0
 static inline uint32_t
 rte_min_pos_4_u16(uint16_t *x)
@@ -50,6 +62,35 @@ rte_min_pos_4_u16(uint16_t *x)
 
 #endif
 
+/* Simplified version to remove branches with CMOV instruction */
+static inline uint32_t
+rte_min_pos_8_u16(uint16_t *x)
+{
+	uint32_t pos0 = 0;
+	uint32_t pos1 = 2;
+	uint32_t pos2 = 4;
+	uint32_t pos3 = 6;
+
+	if (x[1] <= x[0])
+		pos0 = 1;
+	if (x[3] <= x[2])
+		pos1 = 3;
+	if (x[5] <= x[4])
+		pos2 = 5;
+	if (x[7] <= x[6])
+		pos3 = 7;
+
+	if (x[pos1] <= x[pos0])
+		pos0 = pos1;
+	if (x[pos3] <= x[pos2])
+		pos2 = pos3;
+
+	if (x[pos2] <= x[pos0])
+		pos0 = pos2;
+
+	return pos0;
+}
+
 /*
  * Compute the Greatest Common Divisor (GCD) of two numbers.
  * This implementation uses Euclid's algorithm:
-- 
2.20.1
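
Reviewer note on grinder_wrr() in the patch above: each unrolled branch
(1, 2, 4 or 8 best-effort queues) performs the same three steps. It
forces the tokens of inactive queues to all-ones, picks the position
with the fewest tokens, and subtracts that minimum from every entry.
The loop-form sketch below only illustrates that arithmetic. The name
wrr_select() is hypothetical and not part of librte_sched, and the
tie-breaking rule (highest index wins) is chosen to match the "<="
comparisons in rte_min_pos_8_u16().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Loop-form sketch of one WRR step (hypothetical helper, not DPDK API).
 * tokens[i]: current token count of queue i
 * mask[i]:   0xFFFF if queue i is active, 0 if it is empty
 * Returns the position of the selected queue.
 */
static inline uint32_t
wrr_select(uint16_t *tokens, const uint16_t *mask, uint32_t n)
{
	uint32_t i, pos = 0;
	uint16_t min;

	assert(n > 0);

	/* Inactive queues get all-ones tokens so they lose to any
	 * active queue with fewer tokens. */
	for (i = 0; i < n; i++)
		tokens[i] |= (uint16_t)~mask[i];

	/* "<=" lets the highest index win ties. */
	for (i = 1; i < n; i++)
		if (tokens[i] <= tokens[pos])
			pos = i;

	/* Rebase all counters so the winner sits at zero. */
	min = tokens[pos];
	for (i = 0; i < n; i++)
		tokens[i] -= min;

	return pos;
}
```

The unrolled form in the patch avoids the loop overhead on the fast
path; the sketch trades that for brevity.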



* [dpdk-dev] [PATCH 17/27] sched: modify credits update function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (15 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 16/27] sched: update grinder wrr compute function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 18/27] sched: update mbuf prefetch function Lukasz Krakowiak
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Modify the credits update function of the scheduler grinder to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
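
Note: the refill logic in this patch replaces four hardwired
assignments with a loop over the strict-priority classes followed by a
separate assignment for the best-effort class. A minimal sketch of that
pattern, under stated assumptions: struct tc_state, N_TC, TC_BE and
tc_refill() are hypothetical names for illustration and do not exist in
librte_sched, and N_TC is fixed at 4 only to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

#define N_TC   4           /* hypothetical: traffic classes per pipe */
#define TC_BE  (N_TC - 1)  /* hypothetical: index of the best-effort class */

struct tc_state {
	uint64_t tc_time;       /* time of the next refill, in bytes */
	uint64_t tc_period;     /* refill period, in bytes */
	uint32_t tc_credits[N_TC];
	uint32_t tc_credits_per_period[N_TC];
};

/*
 * Refill the first n_sp strict-priority classes and the best-effort
 * class once "now" reaches tc_time. Classes between n_sp and TC_BE are
 * left untouched. Returns 1 if a refill happened, 0 otherwise.
 */
static inline int
tc_refill(struct tc_state *s, uint64_t now, uint32_t n_sp)
{
	uint32_t i;

	assert(n_sp <= TC_BE);

	if (now < s->tc_time)
		return 0;

	for (i = 0; i < n_sp; i++)
		s->tc_credits[i] = s->tc_credits_per_period[i];

	s->tc_credits[TC_BE] = s->tc_credits_per_period[TC_BE];
	s->tc_time = now + s->tc_period;
	return 1;
}
```
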
 lib/librte_sched/rte_sched.c | 87 +++++++++++++++++++++---------------
 1 file changed, 52 insertions(+), 35 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index a9b5f7bf8..ba344f0a1 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1857,13 +1857,15 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 #ifndef RTE_SCHED_SUBPORT_TC_OV
 
 static inline void
-grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
+	uint32_t n_sp_queues = pipe->n_sp_queues;
 	uint64_t n_periods;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1879,19 +1881,23 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		for (i = 0; i < n_sp_queues; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
+
+		subport->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE] =
+			subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE];
+
 		subport->tc_time = port->time + subport->tc_period;
 	}
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i < n_sp_queues; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
+
+		pipe->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE] =
+			params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE];
+
 		pipe->tc_time = port->time + params->tc_period;
 	}
 }
@@ -1899,26 +1905,34 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 #else
 
 static inline uint32_t
-grinder_tc_ov_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_tc_ov_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
 	uint32_t tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_ov_consumption_max;
+	uint32_t tc_consumption = 0, tc_ov_consumption_max;
 	uint32_t tc_ov_wm = subport->tc_ov_wm;
+	uint32_t n_sp_queues = pipe->n_sp_queues;
+	uint32_t i;
 
 	if (subport->tc_ov == 0)
 		return subport->tc_ov_wm_max;
+	for (i = 0; i < n_sp_queues; i++) {
+		tc_ov_consumption[i] =
+			subport->tc_credits_per_period[i] - subport->tc_credits[i];
+		tc_consumption += tc_ov_consumption[i];
+	}
 
-	tc_ov_consumption[0] = subport->tc_credits_per_period[0] - subport->tc_credits[0];
-	tc_ov_consumption[1] = subport->tc_credits_per_period[1] - subport->tc_credits[1];
-	tc_ov_consumption[2] = subport->tc_credits_per_period[2] - subport->tc_credits[2];
-	tc_ov_consumption[3] = subport->tc_credits_per_period[3] - subport->tc_credits[3];
+	tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] -
+		subport->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-	tc_ov_consumption_max = subport->tc_credits_per_period[3] -
-		(tc_ov_consumption[0] + tc_ov_consumption[1] + tc_ov_consumption[2]);
+	tc_ov_consumption_max =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] - tc_consumption;
 
-	if (tc_ov_consumption[3] > (tc_ov_consumption_max - port->mtu)) {
+	if (tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] >
+		(tc_ov_consumption_max - port->mtu)) {
 		tc_ov_wm  -= tc_ov_wm >> 7;
 		if (tc_ov_wm < subport->tc_ov_wm_min)
 			tc_ov_wm = subport->tc_ov_wm_min;
@@ -1934,13 +1948,15 @@ grinder_tc_ov_credits_update(struct rte_sched_port *port, uint32_t pos)
 }
 
 static inline void
-grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t n_sp_queues = pipe->n_sp_queues;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1956,12 +1972,12 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_ov_wm = grinder_tc_ov_credits_update(port, pos);
+		subport->tc_ov_wm = grinder_tc_ov_credits_update(port, subport, pos);
+		for (i = 0; i < n_sp_queues; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
 
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		subport->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE] =
+			subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE];
 
 		subport->tc_time = port->time + subport->tc_period;
 		subport->tc_ov_period_id++;
@@ -1969,10 +1985,11 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i < n_sp_queues; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
+
+		pipe->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE] =
+			params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE];
 		pipe->tc_time = port->time + params->tc_period;
 	}
 
@@ -2504,7 +2521,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		grinder->pipe_params = port->pipe_profiles + pipe->profile;
 		grinder_prefetch_tc_queue_arrays(port, pos);
-		grinder_credits_update(port, pos);
+		grinder_credits_update(port, port->subport, pos);
 
 		grinder->state = e_GRINDER_PREFETCH_MBUF;
 		return 0;
-- 
2.20.1



* [dpdk-dev] [PATCH 18/27] sched: update mbuf prefetch function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (16 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 17/27] sched: modify credits update function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 19/27] sched: update grinder schedule function Lukasz Krakowiak
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the mbuf prefetch function of the scheduler grinder to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
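
Note: both branches of grinder_prefetch_mbuf() rely on the same two bit
tricks. The read counter is reduced to a ring slot with
"qr & (qsize - 1)", which is only valid when qsize is a power of two,
and the test "(qr & 0x7) == 7" fires on the last slot of each group of
eight, so the following slots can be prefetched before they are needed.
The sketch below isolates these two steps. ring_slot() and
slot_ends_group_of_8() are hypothetical helper names, and reading the
group of eight as one 64-byte cache line assumes 8-byte pointers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a free-running read counter to a slot in a ring of qsize entries.
 * qsize must be a non-zero power of two for the mask to be valid.
 */
static inline uint16_t
ring_slot(uint16_t qr, uint16_t qsize)
{
	assert(qsize != 0 && (qsize & (qsize - 1)) == 0);
	return (uint16_t)(qr & (qsize - 1));
}

/*
 * Return 1 when slot is the last of a group of 8 consecutive slots,
 * i.e. the next read lands in the following group.
 */
static inline int
slot_ends_group_of_8(uint16_t slot)
{
	return (slot & 0x7) == 7;
}
```
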
 lib/librte_sched/rte_sched.c | 42 ++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index ba344f0a1..06d89e3fd 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2478,19 +2478,43 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 }
 
 static inline void
-grinder_prefetch_mbuf(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint32_t qpos = grinder->qpos;
-	struct rte_mbuf **qbase = grinder->qbase[qpos];
-	uint16_t qsize = grinder->qsize;
-	uint16_t qr = grinder->queue[qpos]->qr & (qsize - 1);
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_mbuf **qbase;
+	uint32_t tc_index = grinder->tc_index;
+	uint32_t qpos;
+	uint16_t qsize, qr;
+
+	if (tc_index < RTE_SCHED_TRAFFIC_CLASS_BE) {
+		qbase = grinder->sp.qbase;
+		qsize = grinder->sp.qsize;
+		qr = grinder->sp.queue->qr & (qsize - 1);
+
+		grinder->pkt = qbase[qr];
+		rte_prefetch0(grinder->pkt);
+
+		if (unlikely((qr & 0x7) == 7)) {
+			uint16_t qr_next =
+				(grinder->sp.queue->qr + 1) & (qsize - 1);
+
+			rte_prefetch0(qbase + qr_next);
+		}
+
+		return;
+	}
+
+	qpos = grinder->be.qpos;
+	qbase = grinder->be.qbase[qpos];
+	qsize = grinder->be.qsize[qpos];
+	qr = grinder->be.queue[qpos]->qr & (qsize - 1);
 
 	grinder->pkt = qbase[qr];
 	rte_prefetch0(grinder->pkt);
 
 	if (unlikely((qr & 0x7) == 7)) {
-		uint16_t qr_next = (grinder->queue[qpos]->qr + 1) & (qsize - 1);
+		uint16_t qr_next =
+			(grinder->be.queue[qpos]->qr + 1) & (qsize - 1);
 
 		rte_prefetch0(qbase + qr_next);
 	}
@@ -2529,7 +2553,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_PREFETCH_MBUF:
 	{
-		grinder_prefetch_mbuf(port, pos);
+		grinder_prefetch_mbuf(port->subport, pos);
 
 		grinder->state = e_GRINDER_READ_MBUF;
 		return 0;
@@ -2544,7 +2568,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
 			grinder_wrr(port->subport, pos);
-			grinder_prefetch_mbuf(port, pos);
+			grinder_prefetch_mbuf(port->subport, pos);
 
 			return 1;
 		}
-- 
2.20.1



* [dpdk-dev] [PATCH 19/27] sched: update grinder schedule function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (17 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 18/27] sched: update mbuf prefetch function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 20/27] sched: update grinder handle function Lukasz Krakowiak
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the grinder schedule function of the scheduler to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
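
Note: in the RTE_SCHED_SUBPORT_TC_OV variant of
grinder_credits_check(), two lookup arrays indexed by traffic class
avoid a branch on "is this the best-effort class?". The first array
supplies the effective limit (unlimited for strict-priority classes,
tc_ov_credits for best effort), and the second selects whether the
packet length is charged against tc_ov_credits. A sketch of the idea
follows. tc_ov_check(), N_TC and TC_BE are hypothetical names, and the
charging step is an assumption, since it is not visible in the hunks
quoted in this mail. Both arrays are sized N_TC so that index TC_BE is
in bounds.

```c
#include <assert.h>
#include <stdint.h>

#define N_TC   4           /* hypothetical: traffic classes per pipe */
#define TC_BE  (N_TC - 1)  /* hypothetical: index of the best-effort class */

/*
 * Returns 1 and charges *tc_ov_credits when the packet fits, else 0.
 * Only best-effort traffic is limited by, and charged against,
 * the oversubscription credits.
 */
static inline int
tc_ov_check(uint32_t *tc_ov_credits, uint32_t tc_index, uint32_t pkt_len)
{
	uint32_t limit[N_TC];         /* role of pipe_tc_ov_mask1 */
	uint32_t charge[N_TC] = {0};  /* role of pipe_tc_ov_mask2 */
	uint32_t i;

	assert(tc_index < N_TC);

	for (i = 0; i < N_TC; i++)
		limit[i] = UINT32_MAX;
	limit[TC_BE] = *tc_ov_credits;
	charge[TC_BE] = UINT32_MAX;

	if (pkt_len > limit[tc_index])
		return 0;

	/* The mask is 0 for strict-priority classes, so nothing is charged. */
	*tc_ov_credits -= pkt_len & charge[tc_index];
	return 1;
}
```
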
 lib/librte_sched/rte_sched.c | 120 ++++++++++++++++++++++++++---------
 1 file changed, 90 insertions(+), 30 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 06d89e3fd..34b85af20 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2007,10 +2007,10 @@ grinder_credits_update(struct rte_sched_port *port,
 #ifndef RTE_SCHED_SUBPORT_TC_OV
 
 static inline int
-grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_check(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t tc_index = grinder->tc_index;
@@ -2030,7 +2030,7 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	if (!enough_credits)
 		return 0;
 
-	/* Update port credits */
+	/* Update subport credits */
 	subport->tb_credits -= pkt_len;
 	subport->tc_credits[tc_index] -= pkt_len;
 	pipe->tb_credits -= pkt_len;
@@ -2042,10 +2042,10 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 #else
 
 static inline int
-grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_check(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t tc_index = grinder->tc_index;
@@ -2054,11 +2054,18 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
 	uint32_t pipe_tb_credits = pipe->tb_credits;
 	uint32_t pipe_tc_credits = pipe->tc_credits[tc_index];
-	uint32_t pipe_tc_ov_mask1[] = {UINT32_MAX, UINT32_MAX, UINT32_MAX, pipe->tc_ov_credits};
-	uint32_t pipe_tc_ov_mask2[] = {0, 0, 0, UINT32_MAX};
-	uint32_t pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+	uint32_t pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASS_BE + 1];
+	uint32_t pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE + 1] = {0};
+	uint32_t pipe_tc_ov_credits, i;
 	int enough_credits;
 
+	for (i = 0; i < RTE_DIM(pipe_tc_ov_mask1); i++)
+		pipe_tc_ov_mask1[i] = UINT32_MAX;
+
+	pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASS_BE] = pipe->tc_ov_credits;
+	pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE] = UINT32_MAX;
+	pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+
 	/* Check pipe and subport credits */
 	enough_credits = (pkt_len <= subport_tb_credits) &&
 		(pkt_len <= subport_tc_credits) &&
@@ -2081,17 +2088,47 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 
 #endif /* RTE_SCHED_SUBPORT_TC_OV */
 
-
-static inline int
-grinder_schedule(struct rte_sched_port *port, uint32_t pos)
+static inline void
+grinder_schedule_sp(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_queue *queue = grinder->queue[grinder->qpos];
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_queue *queue = grinder->sp.queue;
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
 
-	if (!grinder_credits_check(port, pos))
-		return 0;
+	/* Advance port time */
+	port->time += pkt_len;
+
+	/* Send packet */
+	port->pkts_out[port->n_pkts_out++] = pkt;
+	queue->qr++;
+
+	if (queue->qr == queue->qw) {
+		uint32_t qindex = grinder->sp.qindex;
+
+		rte_bitmap_clear(subport->bmp, qindex);
+		rte_sched_port_set_queue_empty_timestamp(port, subport, qindex);
+	}
+
+	/* Reset pipe loop detection */
+	subport->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	grinder->productive = 1;
+}
+
+static inline void
+grinder_schedule_be(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
+{
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_mbuf *pkt = grinder->pkt;
+	struct rte_sched_queue *queue;
+	uint32_t qpos, pkt_len;
+
+	/* Best effort TC */
+	pkt_len = pkt->pkt_len + port->frame_overhead;
+	qpos = grinder->be.qpos;
+	queue = grinder->be.queue[qpos];
 
 	/* Advance port time */
 	port->time += pkt_len;
@@ -2099,19 +2136,41 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 	/* Send packet */
 	port->pkts_out[port->n_pkts_out++] = pkt;
 	queue->qr++;
-	grinder->wrr_tokens[grinder->qpos] += pkt_len * grinder->wrr_cost[grinder->qpos];
+
+	grinder->be.wrr_tokens[qpos] += pkt_len * grinder->be.wrr_cost[qpos];
 	if (queue->qr == queue->qw) {
-		uint32_t qindex = grinder->qindex[grinder->qpos];
+		uint32_t qindex = grinder->be.qindex[qpos];
 
-		rte_bitmap_clear(port->bmp, qindex);
-		grinder->qmask &= ~(1 << grinder->qpos);
-		grinder->wrr_mask[grinder->qpos] = 0;
-		rte_sched_port_set_queue_empty_timestamp(port, port->subport, qindex);
+		rte_bitmap_clear(subport->bmp, qindex);
+		grinder->be.qmask &= ~(1 << qpos);
+		grinder->be.wrr_mask[qpos] = 0;
+		rte_sched_port_set_queue_empty_timestamp(port, subport, qindex);
 	}
 
 	/* Reset pipe loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	subport->pipe_loop = RTE_SCHED_PIPE_INVALID;
 	grinder->productive = 1;
+}
+
+static inline int
+grinder_schedule(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
+{
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	uint32_t tc_index = grinder->tc_index;
+
+	if (!grinder_credits_check(port, subport, pos))
+		return 0;
+
+	/* Strict priority TC */
+	if (tc_index < pipe->n_sp_queues) {
+		grinder_schedule_sp(port, subport, pos);
+		return 1;
+	}
+
+	/* Best Effort TC */
+	grinder_schedule_be(port, subport, pos);
 
 	return 1;
 }
@@ -2523,14 +2582,15 @@ grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 static inline uint32_t
 grinder_handle(struct rte_sched_port *port, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_subport *subport = port->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	switch (grinder->state) {
 	case e_GRINDER_PREFETCH_PIPE:
 	{
-		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port->subport, pos);
-			port->busy_grinders++;
+		if (grinder_next_pipe(subport, pos)) {
+			grinder_prefetch_pipe(subport, pos);
+			subport->busy_grinders++;
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return 0;
@@ -2553,7 +2613,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_PREFETCH_MBUF:
 	{
-		grinder_prefetch_mbuf(port->subport, pos);
+		grinder_prefetch_mbuf(subport, pos);
 
 		grinder->state = e_GRINDER_READ_MBUF;
 		return 0;
@@ -2563,7 +2623,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	{
 		uint32_t result = 0;
 
-		result = grinder_schedule(port, pos);
+		result = grinder_schedule(port, subport, pos);
 
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-- 
2.20.1



* [dpdk-dev] [PATCH 20/27] sched: update grinder handle function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (18 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 19/27] sched: update grinder schedule function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 21/27] sched: update packet dequeue api Lukasz Krakowiak
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the grinder handle function of the scheduler to allow flexible
configuration of pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
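
Note: grinder_handle() is a four-state machine, and this patch changes
which structure the states read from, not the transitions themselves.
For orientation, the transitions visible in the diff context can be
summarised as the sketch below. The enum, struct events and
next_state() are hypothetical and simplified: they ignore credits and
prefetching, and the "stay in PREFETCH_PIPE when no pipe is found"
branch is an assumption because that code is outside the quoted hunks.

```c
#include <assert.h>
#include <stddef.h>

enum state {
	S_PREFETCH_PIPE,
	S_PREFETCH_TC_QUEUE_ARRAYS,
	S_PREFETCH_MBUF,
	S_READ_MBUF
};

/* Outcomes of the lookups performed in a state (hypothetical). */
struct events {
	int more_in_tc; /* packet sent, BE class, queues still active */
	int next_tc;    /* another active TC exists in the same pipe */
	int next_pipe;  /* another active pipe exists */
};

static inline enum state
next_state(enum state s, const struct events *ev)
{
	assert(ev != NULL);

	switch (s) {
	case S_PREFETCH_PIPE:
		return ev->next_pipe ?
			S_PREFETCH_TC_QUEUE_ARRAYS : S_PREFETCH_PIPE;
	case S_PREFETCH_TC_QUEUE_ARRAYS:
		return S_PREFETCH_MBUF;
	case S_PREFETCH_MBUF:
		return S_READ_MBUF;
	case S_READ_MBUF:
		if (ev->more_in_tc)
			return S_READ_MBUF;
		if (ev->next_tc)
			return S_PREFETCH_MBUF;
		if (ev->next_pipe)
			return S_PREFETCH_TC_QUEUE_ARRAYS;
		return S_PREFETCH_PIPE;
	}

	return S_PREFETCH_PIPE;
}
```
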
 lib/librte_sched/rte_sched.c | 49 +++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 34b85af20..0c5479426 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2495,7 +2495,7 @@ grinder_wrr(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 
-#define grinder_evict(port, pos)
+#define grinder_evict(subport, pos)
 
 static inline void
 grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
@@ -2507,9 +2507,9 @@ grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 static inline void
-grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_tc_queue_arrays(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->subport->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_queue *queue;
 	uint32_t tc_index = grinder->tc_index, i;
@@ -2532,8 +2532,8 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 		rte_prefetch0(grinder->be.qbase[i] + qr[i]);
 	}
 
-	grinder_wrr_load(port->subport, pos);
-	grinder_wrr(port->subport, pos);
+	grinder_wrr_load(subport, pos);
+	grinder_wrr(subport, pos);
 }
 
 static inline void
@@ -2580,9 +2580,9 @@ grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 static inline uint32_t
-grinder_handle(struct rte_sched_port *port, uint32_t pos)
+grinder_handle(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_subport *subport = port->subport;
 	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	switch (grinder->state) {
@@ -2603,9 +2603,9 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	{
 		struct rte_sched_pipe *pipe = grinder->pipe;
 
-		grinder->pipe_params = port->pipe_profiles + pipe->profile;
-		grinder_prefetch_tc_queue_arrays(port, pos);
-		grinder_credits_update(port, port->subport, pos);
+		grinder->pipe_params = subport->pipe_profiles + pipe->profile;
+		grinder_prefetch_tc_queue_arrays(subport, pos);
+		grinder_credits_update(port, subport, pos);
 
 		grinder->state = e_GRINDER_PREFETCH_MBUF;
 		return 0;
@@ -2626,38 +2626,40 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		result = grinder_schedule(port, subport, pos);
 
 		/* Look for next packet within the same TC */
-		if (result && grinder->qmask) {
-			grinder_wrr(port->subport, pos);
-			grinder_prefetch_mbuf(port->subport, pos);
+		if (result &&
+			(grinder->tc_index == RTE_SCHED_TRAFFIC_CLASS_BE) &&
+			(grinder->be.qmask)) {
+			grinder_wrr(subport, pos);
+			grinder_prefetch_mbuf(subport, pos);
 
 			return 1;
 		}
-		grinder_wrr_store(port->subport, pos);
+		grinder_wrr_store(subport, pos);
 
 		/* Look for another active TC within same pipe */
-		if (grinder_next_tc(port->subport, pos)) {
-			grinder_prefetch_tc_queue_arrays(port, pos);
+		if (grinder_next_tc(subport, pos)) {
+			grinder_prefetch_tc_queue_arrays(subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_MBUF;
 			return result;
 		}
 
 		if (grinder->productive == 0 &&
-		    port->pipe_loop == RTE_SCHED_PIPE_INVALID)
-			port->pipe_loop = grinder->pindex;
+		    subport->pipe_loop == RTE_SCHED_PIPE_INVALID)
+			subport->pipe_loop = grinder->pindex;
 
-		grinder_evict(port, pos);
+		grinder_evict(subport, pos);
 
 		/* Look for another active pipe */
-		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port->subport, pos);
+		if (grinder_next_pipe(subport, pos)) {
+			grinder_prefetch_pipe(subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return result;
 		}
 
 		/* No active pipe found */
-		port->busy_grinders--;
+		subport->busy_grinders--;
 
 		grinder->state = e_GRINDER_PREFETCH_PIPE;
 		return result;
@@ -2717,7 +2719,8 @@ rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint
 
 	/* Take each queue in the grinder one step further */
 	for (i = 0, count = 0; ; i++)  {
-		count += grinder_handle(port, i & (RTE_SCHED_PORT_N_GRINDERS - 1));
+		count += grinder_handle(port, port->subport,
+			i & (RTE_SCHED_PORT_N_GRINDERS - 1));
 		if ((count == n_pkts) ||
 		    rte_sched_port_exceptions(port, i >= RTE_SCHED_PORT_N_GRINDERS)) {
 			break;
-- 
2.20.1



* [dpdk-dev] [PATCH 21/27] sched: update packet dequeue api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (19 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 20/27] sched: update grinder handle function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 22/27] sched: update sched queue stats api Lukasz Krakowiak
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the packet dequeue API implementation of the scheduler to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 45 ++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 10 deletions(-)
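
Illustration only, not part of the patch: the dequeue loop below serves one
subport at a time, moves to the next subport when the current one raises an
exception, and stores in port->subport_id where the next call should start.
The following standalone sketch shows that rotation under a much simpler
model than the library. Each subport is just a packet counter, sketch_step()
is a hypothetical stand-in for one grinder_handle() call, and every sketch_*
name is invented for this note.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_SUBPORTS 8

/* Hypothetical, simplified port: every subport is only a packet backlog. */
struct sketch_port {
	uint32_t n_subports;                   /* number of subports in use */
	uint32_t subport_id;                   /* subport the next call starts at */
	uint32_t backlog[SKETCH_MAX_SUBPORTS]; /* packets waiting per subport */
};

/* Stand-in for one scheduling step: returns 1 if a packet was sent. */
static uint32_t
sketch_step(struct sketch_port *port, uint32_t subport_id)
{
	if (port->backlog[subport_id] == 0)
		return 0;

	port->backlog[subport_id]--;
	return 1;
}

/* Dequeue up to n_pkts packets, rotating over the subports. */
static uint32_t
sketch_dequeue(struct sketch_port *port, uint32_t n_pkts)
{
	uint32_t subport_id, n_visited = 0, count = 0;

	assert(port->n_subports > 0 && port->n_subports <= SKETCH_MAX_SUBPORTS);
	assert(port->subport_id < port->n_subports);

	if (n_pkts == 0)
		return 0;

	subport_id = port->subport_id;
	for (;;) {
		count += sketch_step(port, subport_id);

		/* Burst is full: the next call starts at the following subport. */
		if (count == n_pkts) {
			subport_id = (subport_id + 1) % port->n_subports;
			break;
		}

		/* Current subport is empty: move on, stop after a full round. */
		if (port->backlog[subport_id] == 0) {
			subport_id = (subport_id + 1) % port->n_subports;
			if (++n_visited == port->n_subports)
				break;
		}
	}

	port->subport_id = subport_id;
	return count;
}
```

In this model a call returns early either because the burst is full or
because every subport has been visited once without filling it, which is the
same pair of exit conditions the patched loop uses.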

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 0c5479426..116e6a627 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2677,6 +2677,7 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 	uint64_t cycles = rte_get_tsc_cycles();
 	uint64_t cycles_diff = cycles - port->time_cpu_cycles;
 	uint64_t bytes_diff;
+	uint32_t i;
 
 	/* Compute elapsed time in bytes */
 	bytes_diff = rte_reciprocal_divide(cycles_diff << RTE_SCHED_TIME_SHIFT,
@@ -2689,20 +2690,21 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 		port->time = port->time_cpu_bytes;
 
 	/* Reset pipe loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	for (i = 0; i < port->n_subports_per_port; i++)
+		port->subports[i]->pipe_loop = RTE_SCHED_PIPE_INVALID;
 }
 
 static inline int
-rte_sched_port_exceptions(struct rte_sched_port *port, int second_pass)
+rte_sched_port_exceptions(struct rte_sched_subport *subport, int second_pass)
 {
 	int exceptions;
 
 	/* Check if any exception flag is set */
-	exceptions = (second_pass && port->busy_grinders == 0) ||
-		(port->pipe_exhaustion == 1);
+	exceptions = (second_pass && subport->busy_grinders == 0) ||
+		(subport->pipe_exhaustion == 1);
 
 	/* Clear exception flags */
-	port->pipe_exhaustion = 0;
+	subport->pipe_exhaustion = 0;
 
 	return exceptions;
 }
@@ -2710,7 +2712,9 @@ rte_sched_port_exceptions(struct rte_sched_port *port, int second_pass)
 int
 rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 {
-	uint32_t i, count;
+	struct rte_sched_subport *subport;
+	uint32_t subport_id = port->subport_id;
+	uint32_t i, n_subports = 0, count;
 
 	port->pkts_out = pkts;
 	port->n_pkts_out = 0;
@@ -2719,10 +2723,31 @@ rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint
 
 	/* Take each queue in the grinder one step further */
 	for (i = 0, count = 0; ; i++)  {
-		count += grinder_handle(port, port->subport,
-			i & (RTE_SCHED_PORT_N_GRINDERS - 1));
-		if ((count == n_pkts) ||
-		    rte_sched_port_exceptions(port, i >= RTE_SCHED_PORT_N_GRINDERS)) {
+		subport = port->subports[subport_id];
+
+		count += grinder_handle(port, subport, i &
+				(RTE_SCHED_PORT_N_GRINDERS - 1));
+		if (count == n_pkts) {
+			subport_id++;
+
+			if (subport_id == port->n_subports_per_port)
+				subport_id = 0;
+
+			port->subport_id = subport_id;
+			break;
+		}
+
+		if (rte_sched_port_exceptions(subport, i >= RTE_SCHED_PORT_N_GRINDERS)) {
+			i = 0;
+			subport_id++;
+			n_subports++;
+		}
+
+		if (subport_id == port->n_subports_per_port)
+			subport_id = 0;
+
+		if (n_subports == port->n_subports_per_port) {
+			port->subport_id = subport_id;
 			break;
 		}
 	}
-- 
2.20.1



* [dpdk-dev] [PATCH 22/27] sched: update sched queue stats api
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (20 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 21/27] sched: update packet dequeue api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 23/27] test/sched: update unit test Lukasz Krakowiak
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the scheduler's queue stats read API implementation to allow
flexible configuration of pipe traffic classes and queues, and
subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 41 +++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 19 deletions(-)
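
Illustration only, not part of the patch: rte_sched_queue_read_stats() now
splits the port-wide queue_id into a subport index (upper bits) and a queue
index inside that subport (lower bits). The sketch below reproduces that
arithmetic in isolation. It assumes n_subports_per_port is a power of two
and that the "+ 4" in the patch is log2 of 16 queues per pipe. The struct
and all sketch_* names are hypothetical stand-ins for the real port fields.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 queues per pipe, i.e. the "+ 4" used in the patch. */
#define SKETCH_QUEUES_PER_PIPE_LOG2 4

/* Hypothetical stand-in for the two port fields the lookup reads. */
struct sketch_port {
	uint32_t n_subports_per_port;      /* assumed to be a power of two */
	uint32_t n_max_subport_pipes_log2; /* log2 of the largest pipe count */
};

/* Number of low bits of queue_id that address a queue within a subport. */
static uint32_t
sketch_subport_shift(const struct sketch_port *port)
{
	uint32_t shift = port->n_max_subport_pipes_log2 +
		SKETCH_QUEUES_PER_PIPE_LOG2;

	assert(shift < 32);
	return shift;
}

/* Upper bits of queue_id select the subport. */
static uint32_t
sketch_subport_id(const struct sketch_port *port, uint32_t queue_id)
{
	return (queue_id >> sketch_subport_shift(port)) &
		(port->n_subports_per_port - 1);
}

/* Lower bits of queue_id select the queue inside that subport. */
static uint32_t
sketch_qindex(const struct sketch_port *port, uint32_t queue_id)
{
	return queue_id & ((UINT32_C(1) << sketch_subport_shift(port)) - 1);
}
```

With 4 subports and up to 1024 pipes per subport the shift is 14, so
queue_id (2 << 14) | 37 maps to subport 2, queue index 37.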

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 116e6a627..563161713 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -306,16 +306,6 @@ enum rte_sched_subport_array {
 	e_RTE_SCHED_SUBPORT_ARRAY_TOTAL,
 };
 
-#ifdef RTE_SCHED_COLLECT_STATS
-
-static inline uint32_t
-rte_sched_port_queues_per_subport(struct rte_sched_port *port)
-{
-	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport;
-}
-
-#endif
-
 static inline uint32_t
 rte_sched_subport_queues(struct rte_sched_subport *subport)
 {
@@ -340,9 +330,14 @@ rte_sched_subport_qsize(struct rte_sched_subport *subport, uint32_t qindex)
 }
 
 static inline uint32_t
-rte_sched_port_queues_per_port(struct rte_sched_port *port)
+rte_sched_port_queues(struct rte_sched_port *port)
 {
-	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
+	uint32_t n_queues = 0, i;
+
+	for (i = 0; i < port->n_subports_per_port; i++)
+		n_queues += rte_sched_subport_queues(port->subports[i]);
+
+	return n_queues;
 }
 
 static int
@@ -1366,18 +1361,25 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 	struct rte_sched_queue_stats *stats,
 	uint16_t *qlen)
 {
+	uint32_t subport_id, qindex;
+	struct rte_sched_subport *s;
 	struct rte_sched_queue *q;
 	struct rte_sched_queue_extra *qe;
 
 	/* Check user parameters */
 	if ((port == NULL) ||
-	    (queue_id >= rte_sched_port_queues_per_port(port)) ||
+	    (queue_id >= rte_sched_port_queues(port)) ||
 		(stats == NULL) ||
 		(qlen == NULL)) {
 		return -1;
 	}
-	q = port->queue + queue_id;
-	qe = port->queue_extra + queue_id;
+
+	subport_id = (queue_id >> (port->n_max_subport_pipes_log2 + 4)) &
+					(port->n_subports_per_port - 1);
+	s = port->subports[subport_id];
+	qindex = ((1 << (port->n_max_subport_pipes_log2 + 4)) - 1) & queue_id;
+	q = s->queue + qindex;
+	qe = s->queue_extra + qindex;
 
 	/* Copy queue stats and clear */
 	memcpy(stats, &qe->stats, sizeof(struct rte_sched_queue_stats));
@@ -1392,9 +1394,10 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 #ifdef RTE_SCHED_DEBUG
 
 static inline int
-rte_sched_port_queue_is_empty(struct rte_sched_port *port, uint32_t qindex)
+rte_sched_port_queue_is_empty(struct rte_sched_subport *subport,
+	uint32_t qindex)
 {
-	struct rte_sched_queue *queue = port->queue + qindex;
+	struct rte_sched_queue *queue = subport->queue + qindex;
 
 	return queue->qr == queue->qw;
 }
@@ -1535,7 +1538,7 @@ rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port __rte_unuse
 #ifdef RTE_SCHED_DEBUG
 
 static inline void
-debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
+debug_check_queue_slab(struct rte_sched_subport *subport, uint32_t bmp_pos,
 		       uint64_t bmp_slab)
 {
 	uint64_t mask;
@@ -1547,7 +1550,7 @@ debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
 	panic = 0;
 	for (i = 0, mask = 1; i < 64; i++, mask <<= 1) {
 		if (mask & bmp_slab) {
-			if (rte_sched_port_queue_is_empty(port, bmp_pos + i)) {
+			if (rte_sched_port_queue_is_empty(subport, bmp_pos + i)) {
 				printf("Queue %u (slab offset %u) is empty\n", bmp_pos + i, i);
 				panic = 1;
 			}
-- 
2.20.1



* [dpdk-dev] [PATCH 23/27] test/sched: update unit test
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (21 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 22/27] sched: update sched queue stats api Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 24/27] net/softnic: update softnic tm function Lukasz Krakowiak
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the unit test to allow flexible configuration of pipe traffic
classes and queues, and subport-level configuration of the pipe
parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index 460eb53ec..ec94861da 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -20,40 +20,43 @@
 #define SUBPORT         0
 #define PIPE            1
 #define TC              2
-#define QUEUE           3
-
-static struct rte_sched_subport_params subport_param[] = {
-	{
-		.tb_rate = 1250000000,
-		.tb_size = 1000000,
-
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
-		.tc_period = 10,
-	},
-};
+#define QUEUE           2
 
 static struct rte_sched_pipe_params pipe_profile[] = {
 	{ /* Profile #0 */
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175,
+				305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
 		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
+static struct rte_sched_subport_params subport_param[] = {
+	{
+		.tb_rate = 1250000000,
+		.tb_size = 1000000,
+
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_period = 10,
+		.n_subport_pipes = 1024,
+		.qsize = {32, 32, 32, 32, 32, 32, 32, 32,
+			32, 32, 32, 32, 32, 32, 32, 32},
+		.pipe_profiles = pipe_profile,
+		.n_pipe_profiles = 1,
+	},
+};
+
 static struct rte_sched_port_params port_param = {
 	.socket = 0, /* computed */
 	.rate = 0, /* computed */
 	.mtu = 1522,
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
-	.n_pipes_per_subport = 1024,
-	.qsize = {32, 32, 32, 32},
-	.pipe_profiles = pipe_profile,
-	.n_pipe_profiles = 1,
 };
 
 #define NB_MBUF          32
@@ -131,7 +134,7 @@ test_sched(void)
 	err = rte_sched_subport_config(port, SUBPORT, subport_param);
 	TEST_ASSERT_SUCCESS(err, "Error config sched, err=%d\n", err);
 
-	for (pipe = 0; pipe < port_param.n_pipes_per_subport; pipe ++) {
+	for (pipe = 0; pipe < subport_param[0].n_subport_pipes; pipe++) {
 		err = rte_sched_pipe_config(port, SUBPORT, pipe, 0);
 		TEST_ASSERT_SUCCESS(err, "Error config sched pipe %u, err=%d\n", pipe, err);
 	}
-- 
2.20.1



* [dpdk-dev] [PATCH 24/27] net/softnic: update softnic tm function
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (22 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 23/27] test/sched: update unit test Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 25/27] examples/qos_sched: update qos sched sample app Lukasz Krakowiak
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the softnic TM functions to allow flexible configuration of
pipe traffic classes and queues, and subport-level configuration
of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 drivers/net/softnic/rte_eth_softnic.c         | 131 ++++++++
 drivers/net/softnic/rte_eth_softnic_cli.c     | 286 ++++++++++++++++--
 .../net/softnic/rte_eth_softnic_internals.h   |   4 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |  89 +++---
 4 files changed, 442 insertions(+), 68 deletions(-)
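
Illustration only, not part of the patch: after this change the
"weight queue" CLI arguments carry 8 values instead of 16. The first 8
queues of a pipe get a fixed weight of 1 and only the remaining 8 are read
from the command line. The sketch below shows that split on its own. It
assumes RTE_SCHED_WRR_QUEUES_PER_PIPE is 8, which matches the token count
check of 62 in the patch. sketch_read_uint32() is a hypothetical stand-in
for softnic_parser_read_uint32(), and all sketch_* names are invented here.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_QUEUES_PER_PIPE 16
#define SKETCH_BE_QUEUES 8 /* assumed value of RTE_SCHED_WRR_QUEUES_PER_PIPE */
#define SKETCH_SP_QUEUES (SKETCH_QUEUES_PER_PIPE - SKETCH_BE_QUEUES)

/* Hypothetical parser: accepts an unsigned decimal that fits in 32 bits. */
static int
sketch_read_uint32(uint32_t *value, const char *s)
{
	char *end;
	unsigned long v;

	if (!isdigit((unsigned char)s[0]))
		return -1;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno != 0 || *end != '\0' || v > UINT32_MAX)
		return -1;

	*value = (uint32_t)v;
	return 0;
}

/* Fill all 16 weights: fixed 1 for the first 8, parsed for the last 8. */
static int
sketch_parse_weights(uint32_t weights[SKETCH_QUEUES_PER_PIPE],
	const char *const *be_tokens)
{
	uint32_t i;

	assert(weights != NULL && be_tokens != NULL);

	for (i = 0; i < SKETCH_SP_QUEUES; i++)
		weights[i] = 1;

	for (i = 0; i < SKETCH_BE_QUEUES; i++) {
		if (sketch_read_uint32(&weights[SKETCH_SP_QUEUES + i],
				be_tokens[i]) != 0)
			return -1;
	}

	return 0;
}
```

Two separate loops avoid the second counter (j) that the patch keeps
alongside the queue index; the resulting weights are the same.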

diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 32b001fd3..9d0168549 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -28,6 +28,19 @@
 #define PMD_PARAM_TM_QSIZE1                                "tm_qsize1"
 #define PMD_PARAM_TM_QSIZE2                                "tm_qsize2"
 #define PMD_PARAM_TM_QSIZE3                                "tm_qsize3"
+#define PMD_PARAM_TM_QSIZE4                                "tm_qsize4"
+#define PMD_PARAM_TM_QSIZE5                                "tm_qsize5"
+#define PMD_PARAM_TM_QSIZE6                                "tm_qsize6"
+#define PMD_PARAM_TM_QSIZE7                                "tm_qsize7"
+#define PMD_PARAM_TM_QSIZE8                                "tm_qsize8"
+#define PMD_PARAM_TM_QSIZE9                                "tm_qsize9"
+#define PMD_PARAM_TM_QSIZE10                               "tm_qsize10"
+#define PMD_PARAM_TM_QSIZE11                               "tm_qsize11"
+#define PMD_PARAM_TM_QSIZE12                               "tm_qsize12"
+#define PMD_PARAM_TM_QSIZE13                               "tm_qsize13"
+#define PMD_PARAM_TM_QSIZE14                               "tm_qsize14"
+#define PMD_PARAM_TM_QSIZE15                               "tm_qsize15"
+
 
 static const char * const pmd_valid_args[] = {
 	PMD_PARAM_FIRMWARE,
@@ -39,6 +52,18 @@ static const char * const pmd_valid_args[] = {
 	PMD_PARAM_TM_QSIZE1,
 	PMD_PARAM_TM_QSIZE2,
 	PMD_PARAM_TM_QSIZE3,
+	PMD_PARAM_TM_QSIZE4,
+	PMD_PARAM_TM_QSIZE5,
+	PMD_PARAM_TM_QSIZE6,
+	PMD_PARAM_TM_QSIZE7,
+	PMD_PARAM_TM_QSIZE8,
+	PMD_PARAM_TM_QSIZE9,
+	PMD_PARAM_TM_QSIZE10,
+	PMD_PARAM_TM_QSIZE11,
+	PMD_PARAM_TM_QSIZE12,
+	PMD_PARAM_TM_QSIZE13,
+	PMD_PARAM_TM_QSIZE14,
+	PMD_PARAM_TM_QSIZE15,
 	NULL
 };
 
@@ -434,6 +459,18 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 	p->tm.qsize[1] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[2] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[3] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[4] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[5] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[6] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[7] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[8] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[9] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[10] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[11] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[12] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[13] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[14] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[15] = SOFTNIC_TM_QUEUE_SIZE;
 
 	/* Firmware script (optional) */
 	if (rte_kvargs_count(kvlist, PMD_PARAM_FIRMWARE) == 1) {
@@ -504,6 +541,88 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 			goto out_free;
 	}
 
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE4) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE4,
+			&get_uint32, &p->tm.qsize[4]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE5) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE5,
+			&get_uint32, &p->tm.qsize[5]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE6) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE6,
+			&get_uint32, &p->tm.qsize[6]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE7) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE7,
+			&get_uint32, &p->tm.qsize[7]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE8) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE8,
+			&get_uint32, &p->tm.qsize[8]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE9) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE9,
+			&get_uint32, &p->tm.qsize[9]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE10) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE10,
+			&get_uint32, &p->tm.qsize[10]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE11) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE11,
+			&get_uint32, &p->tm.qsize[11]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE12) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE12,
+			&get_uint32, &p->tm.qsize[12]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE13) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE13,
+			&get_uint32, &p->tm.qsize[13]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE14) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE14,
+			&get_uint32, &p->tm.qsize[14]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE15) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE15,
+			&get_uint32, &p->tm.qsize[15]);
+		if (ret < 0)
+			goto out_free;
+	}
+
 out_free:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -588,6 +707,18 @@ RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
 	PMD_PARAM_TM_QSIZE1 "=<uint32> "
 	PMD_PARAM_TM_QSIZE2 "=<uint32> "
 	PMD_PARAM_TM_QSIZE3 "=<uint32>"
+	PMD_PARAM_TM_QSIZE4 "=<uint32> "
+	PMD_PARAM_TM_QSIZE5 "=<uint32> "
+	PMD_PARAM_TM_QSIZE6 "=<uint32> "
+	PMD_PARAM_TM_QSIZE7 "=<uint32> "
+	PMD_PARAM_TM_QSIZE8 "=<uint32> "
+	PMD_PARAM_TM_QSIZE9 "=<uint32> "
+	PMD_PARAM_TM_QSIZE10 "=<uint32> "
+	PMD_PARAM_TM_QSIZE11 "=<uint32> "
+	PMD_PARAM_TM_QSIZE12 "=<uint32> "
+	PMD_PARAM_TM_QSIZE13 "=<uint32> "
+	PMD_PARAM_TM_QSIZE14 "=<uint32> "
+	PMD_PARAM_TM_QSIZE15 "=<uint32>"
 );
 
 
diff --git a/drivers/net/softnic/rte_eth_softnic_cli.c b/drivers/net/softnic/rte_eth_softnic_cli.c
index 56fc92ba2..2b25225cf 100644
--- a/drivers/net/softnic/rte_eth_softnic_cli.c
+++ b/drivers/net/softnic/rte_eth_softnic_cli.c
@@ -566,9 +566,13 @@ queue_node_id(uint32_t n_spp __rte_unused,
 	uint32_t tc_id,
 	uint32_t queue_id)
 {
-	return queue_id +
-		tc_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE +
-		(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
+	if (tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return queue_id + tc_id +
+			(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
+	else
+		return queue_id +
+			tc_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE +
+			(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
 }
 
 struct tmgr_hierarchy_default_params {
@@ -617,10 +621,19 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 		},
 	};
 
+	uint32_t *shared_shaper_id =
+		(uint32_t *) calloc(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+		sizeof(uint32_t));
+	if (shared_shaper_id == NULL)
+		return -1;
+
+	memcpy(shared_shaper_id, params->shared_shaper_id.tc,
+		sizeof(params->shared_shaper_id.tc));
+
 	struct rte_tm_node_params tc_node_params[] = {
 		[0] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[0],
-			.shared_shaper_id = &params->shared_shaper_id.tc[0],
+			.shared_shaper_id = &shared_shaper_id[0],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[0]) ? 1 : 0,
 			.nonleaf = {
@@ -630,7 +643,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[1] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[1],
-			.shared_shaper_id = &params->shared_shaper_id.tc[1],
+			.shared_shaper_id = &shared_shaper_id[1],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[1]) ? 1 : 0,
 			.nonleaf = {
@@ -640,7 +653,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[2] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[2],
-			.shared_shaper_id = &params->shared_shaper_id.tc[2],
+			.shared_shaper_id = &shared_shaper_id[2],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[2]) ? 1 : 0,
 			.nonleaf = {
@@ -650,13 +663,63 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[3] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[3],
-			.shared_shaper_id = &params->shared_shaper_id.tc[3],
+			.shared_shaper_id = &shared_shaper_id[3],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[3]) ? 1 : 0,
 			.nonleaf = {
 				.n_sp_priorities = 1,
 			},
 		},
+
+		[4] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[4],
+			.shared_shaper_id = &shared_shaper_id[4],
+			.n_shared_shapers =
+				params->shared_shaper_id.tc_valid[4] ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[5] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[5],
+			.shared_shaper_id = &shared_shaper_id[5],
+			.n_shared_shapers =
+				params->shared_shaper_id.tc_valid[5] ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[6] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[6],
+			.shared_shaper_id = &shared_shaper_id[6],
+			.n_shared_shapers =
+				params->shared_shaper_id.tc_valid[6] ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[7] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[7],
+			.shared_shaper_id = &shared_shaper_id[7],
+			.n_shared_shapers =
+				params->shared_shaper_id.tc_valid[7] ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[8] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[8],
+			.shared_shaper_id = &shared_shaper_id[8],
+			.n_shared_shapers =
+				params->shared_shaper_id.tc_valid[8] ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
 	};
 
 	struct rte_tm_node_params queue_node_params = {
@@ -730,7 +793,21 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 					return -1;
 
 				/* Hierarchy level 4: Queue nodes */
-				for (q = 0; q < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; q++) {
+				if (t == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) { /*BE Traffic Class*/
+					for (q = 0; q < RTE_SCHED_WRR_QUEUES_PER_PIPE; q++) {
+						status = rte_tm_node_add(port_id,
+							queue_node_id(n_spp, n_pps, s, p, t, q),
+							tc_node_id(n_spp, n_pps, s, p, t),
+							0,
+							params->weight.queue[q],
+							RTE_TM_NODE_LEVEL_ID_ANY,
+							&queue_node_params,
+							&error);
+						if (status)
+							return -1;
+					} /* Queues (BE Traffic Class) */
+				} else { /* SP Traffic Class */
+					q = 0;
 					status = rte_tm_node_add(port_id,
 						queue_node_id(n_spp, n_pps, s, p, t, q),
 						tc_node_id(n_spp, n_pps, s, p, t),
@@ -741,7 +818,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 						&error);
 					if (status)
 						return -1;
-				} /* Queue */
+				} /* Queue (SP Traffic Class) */
 			} /* TC */
 		} /* Pipe */
 	} /* Subport */
@@ -762,13 +839,23 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
  *   tc1 <profile_id>
  *   tc2 <profile_id>
  *   tc3 <profile_id>
+ *   tc4 <profile_id>
+ *   tc5 <profile_id>
+ *   tc6 <profile_id>
+ *   tc7 <profile_id>
+ *   tc8 <profile_id>
  *  shared shaper
  *   tc0 <id | none>
  *   tc1 <id | none>
  *   tc2 <id | none>
  *   tc3 <id | none>
+ *   tc4 <id | none>
+ *   tc5 <id | none>
+ *   tc6 <id | none>
+ *   tc7 <id | none>
+ *   tc8 <id | none>
  *  weight
- *   queue  <q0> ... <q15>
+ *   queue  <q8> ... <q15>
  */
 static void
 cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
@@ -778,11 +865,11 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 	size_t out_size)
 {
 	struct tmgr_hierarchy_default_params p;
-	int i, status;
+	int i, j, status;
 
 	memset(&p, 0, sizeof(p));
 
-	if (n_tokens != 50) {
+	if (n_tokens != 62) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -894,27 +981,77 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		return;
 	}
 
+	if (strcmp(tokens[22], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[4], tokens[23]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc4 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[24], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[5], tokens[25]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc5 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[26], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[6], tokens[27]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc6 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[28], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[7], tokens[29]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc7 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[30], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[8], tokens[31]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc8 profile id");
+		return;
+	}
+
 	/* Shared shaper */
 
-	if (strcmp(tokens[22], "shared") != 0) {
+	if (strcmp(tokens[32], "shared") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shared");
 		return;
 	}
 
-	if (strcmp(tokens[23], "shaper") != 0) {
+	if (strcmp(tokens[33], "shaper") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shaper");
 		return;
 	}
 
-	if (strcmp(tokens[24], "tc0") != 0) {
+	if (strcmp(tokens[34], "tc0") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc0");
 		return;
 	}
 
-	if (strcmp(tokens[25], "none") == 0)
+	if (strcmp(tokens[35], "none") == 0)
 		p.shared_shaper_id.tc_valid[0] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[25]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[35]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc0");
 			return;
 		}
@@ -922,15 +1059,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[0] = 1;
 	}
 
-	if (strcmp(tokens[26], "tc1") != 0) {
+	if (strcmp(tokens[36], "tc1") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc1");
 		return;
 	}
 
-	if (strcmp(tokens[27], "none") == 0)
+	if (strcmp(tokens[37], "none") == 0)
 		p.shared_shaper_id.tc_valid[1] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[27]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[37]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc1");
 			return;
 		}
@@ -938,15 +1075,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[1] = 1;
 	}
 
-	if (strcmp(tokens[28], "tc2") != 0) {
+	if (strcmp(tokens[38], "tc2") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc2");
 		return;
 	}
 
-	if (strcmp(tokens[29], "none") == 0)
+	if (strcmp(tokens[39], "none") == 0)
 		p.shared_shaper_id.tc_valid[2] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[29]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[39]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc2");
 			return;
 		}
@@ -954,15 +1091,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[2] = 1;
 	}
 
-	if (strcmp(tokens[30], "tc3") != 0) {
+	if (strcmp(tokens[40], "tc3") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc3");
 		return;
 	}
 
-	if (strcmp(tokens[31], "none") == 0)
+	if (strcmp(tokens[41], "none") == 0)
 		p.shared_shaper_id.tc_valid[3] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[31]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[41]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc3");
 			return;
 		}
@@ -970,22 +1107,107 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[3] = 1;
 	}
 
+	if (strcmp(tokens[42], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (strcmp(tokens[43], "none") == 0)
+		p.shared_shaper_id.tc_valid[4] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[4], tokens[43]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc4");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[4] = 1;
+	}
+
+	if (strcmp(tokens[44], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (strcmp(tokens[45], "none") == 0)
+		p.shared_shaper_id.tc_valid[5] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[5], tokens[45]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc5");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[5] = 1;
+	}
+
+	if (strcmp(tokens[46], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (strcmp(tokens[47], "none") == 0)
+		p.shared_shaper_id.tc_valid[6] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[6], tokens[47]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc6");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[6] = 1;
+	}
+
+	if (strcmp(tokens[48], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (strcmp(tokens[49], "none") == 0)
+		p.shared_shaper_id.tc_valid[7] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[7], tokens[49]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc7");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[7] = 1;
+	}
+
+	if (strcmp(tokens[50], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (strcmp(tokens[51], "none") == 0)
+		p.shared_shaper_id.tc_valid[8] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[8], tokens[51]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc8");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[8] = 1;
+	}
+
 	/* Weight */
 
-	if (strcmp(tokens[32], "weight") != 0) {
+	if (strcmp(tokens[52], "weight") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "weight");
 		return;
 	}
 
-	if (strcmp(tokens[33], "queue") != 0) {
+	if (strcmp(tokens[53], "queue") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "queue");
 		return;
 	}
 
-	for (i = 0; i < 16; i++) {
-		if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[34 + i]) != 0) {
-			snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
-			return;
+	for (i = 0, j = 0; i < 16; i++) {
+		if (i < RTE_SCHED_WRR_QUEUES_PER_PIPE) {
+			p.weight.queue[i] = 1;
+		} else {
+			if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[54 + j]) != 0) {
+				snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
+				return;
+			}
+			j++;
 		}
 	}
 
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 415434d0d..52531f19d 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -43,7 +43,7 @@ struct pmd_params {
 	/** Traffic Management (TM) */
 	struct {
 		uint32_t n_queues; /**< Number of queues */
-		uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+		uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	} tm;
 };
 
@@ -167,7 +167,7 @@ struct tm_params {
 	struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
 
 	struct rte_sched_pipe_params
-		pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+		pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_SUBPORT];
 	uint32_t n_pipe_profiles;
 	uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
 };
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 58744a9eb..b826b5cb7 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -85,7 +85,8 @@ softnic_tmgr_port_create(struct pmd_internals *p,
 	/* Subport */
 	n_subports = t->port_params.n_subports_per_port;
 	for (subport_id = 0; subport_id < n_subports; subport_id++) {
-		uint32_t n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+		uint32_t n_pipes_per_subport =
+			t->subport_params[subport_id].n_subport_pipes;
 		uint32_t pipe_id;
 		int status;
 
@@ -367,7 +368,8 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
 {
 	struct pmd_internals *p = dev->data->dev_private;
 	uint32_t n_queues_max = p->params.tm.n_queues;
-	uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+	uint32_t n_tc_max =
+		(n_queues_max * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE) / RTE_SCHED_QUEUES_PER_PIPE;
 	uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
 	uint32_t n_subports_max = n_pipes_max;
 	uint32_t n_root_max = 1;
@@ -625,10 +627,10 @@ static const struct rte_tm_level_capabilities tm_level_cap[] = {
 			.shaper_shared_n_max = 1,
 
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_WRR_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_WRR_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 
@@ -793,10 +795,10 @@ static const struct rte_tm_node_capabilities tm_node_cap[] = {
 
 		{.nonleaf = {
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_WRR_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_WRR_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 		} },
@@ -2043,15 +2045,13 @@ pipe_profile_build(struct rte_eth_dev *dev,
 
 		/* Queue */
 		TAILQ_FOREACH(nq, nl, node) {
-			uint32_t pipe_queue_id;
 
 			if (nq->level != TM_NODE_LEVEL_QUEUE ||
 				nq->parent_node_id != nt->node_id)
 				continue;
 
-			pipe_queue_id = nt->priority *
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
-			pp->wrr_weights[pipe_queue_id] = nq->weight;
+			if (nt->priority == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+				pp->wrr_weights[queue_id] = nq->weight;
 
 			queue_id++;
 		}
@@ -2065,7 +2065,8 @@ pipe_profile_free_exists(struct rte_eth_dev *dev,
 	struct pmd_internals *p = dev->data->dev_private;
 	struct tm_params *t = &p->soft.tm.params;
 
-	if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+	if (t->n_pipe_profiles <
+		RTE_SCHED_PIPE_PROFILES_PER_SUBPORT * RTE_SCHED_SUBPORTS_PER_PORT) {
 		*pipe_profile_id = t->n_pipe_profiles;
 		return 1;
 	}
@@ -2213,10 +2214,11 @@ tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
 #ifdef RTE_SCHED_RED
 
 static void
-wred_profiles_set(struct rte_eth_dev *dev)
+wred_profiles_set(struct rte_eth_dev *dev, uint32_t subport_id)
 {
 	struct pmd_internals *p = dev->data->dev_private;
-	struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+	struct rte_sched_subport_params *pp =
+		&p->soft.tm.params.subport_params[subport_id];
 	uint32_t tc_id;
 	enum rte_color color;
 
@@ -2235,7 +2237,7 @@ wred_profiles_set(struct rte_eth_dev *dev)
 
 #else
 
-#define wred_profiles_set(dev)
+#define wred_profiles_set(dev, subport_id)
 
 #endif
 
@@ -2332,7 +2334,7 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 				rte_strerror(EINVAL));
 	}
 
-	/* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+	/* Each pipe has exactly 9 TCs, with exactly one TC for each priority */
 	TAILQ_FOREACH(np, nl, node) {
 		uint32_t mask = 0, mask_expected =
 			RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
@@ -2369,7 +2371,7 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 		if (nt->level != TM_NODE_LEVEL_TC)
 			continue;
 
-		if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+		if (nt->n_children != 1 && nt->n_children != RTE_SCHED_WRR_QUEUES_PER_PIPE)
 			return -rte_tm_error_set(error,
 				EINVAL,
 				RTE_TM_ERROR_TYPE_UNSPECIFIED,
@@ -2525,19 +2527,8 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 		.frame_overhead =
 			root->shaper_profile->params.pkt_length_adjust,
 		.n_subports_per_port = root->n_children,
-		.n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
-			h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
-		.qsize = {p->params.tm.qsize[0],
-			p->params.tm.qsize[1],
-			p->params.tm.qsize[2],
-			p->params.tm.qsize[3],
-		},
-		.pipe_profiles = t->pipe_profiles,
-		.n_pipe_profiles = t->n_pipe_profiles,
 	};
 
-	wred_profiles_set(dev);
-
 	subport_id = 0;
 	TAILQ_FOREACH(n, nl, node) {
 		uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
@@ -2566,10 +2557,40 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 					tc_rate[1],
 					tc_rate[2],
 					tc_rate[3],
-			},
-			.tc_period = SUBPORT_TC_PERIOD,
+					tc_rate[4],
+					tc_rate[5],
+					tc_rate[6],
+					tc_rate[7],
+					tc_rate[8],
+				},
+				.tc_period = SUBPORT_TC_PERIOD,
+
+				.n_subport_pipes = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+					h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+
+				.qsize = {p->params.tm.qsize[0],
+					p->params.tm.qsize[1],
+					p->params.tm.qsize[2],
+					p->params.tm.qsize[3],
+					p->params.tm.qsize[4],
+					p->params.tm.qsize[5],
+					p->params.tm.qsize[6],
+					p->params.tm.qsize[7],
+					p->params.tm.qsize[8],
+					p->params.tm.qsize[9],
+					p->params.tm.qsize[10],
+					p->params.tm.qsize[11],
+					p->params.tm.qsize[12],
+					p->params.tm.qsize[13],
+					p->params.tm.qsize[14],
+					p->params.tm.qsize[15],
+				},
+
+				.pipe_profiles = t->pipe_profiles,
+				.n_pipe_profiles = t->n_pipe_profiles,
 		};
 
+		wred_profiles_set(dev, subport_id);
 		subport_id++;
 	}
 }
@@ -2666,7 +2687,7 @@ update_queue_weight(struct rte_eth_dev *dev,
 	uint32_t subport_id = tm_node_subport_id(dev, ns);
 
 	uint32_t pipe_queue_id =
-		tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+		tc_id * RTE_SCHED_QUEUES_PER_PIPE + queue_id;
 
 	struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
 	struct rte_sched_pipe_params profile1;
@@ -3023,7 +3044,7 @@ tm_port_queue_id(struct rte_eth_dev *dev,
 	uint32_t port_tc_id =
 		port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
 	uint32_t port_queue_id =
-		port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+		port_tc_id * RTE_SCHED_QUEUES_PER_PIPE + tc_queue_id;
 
 	return port_queue_id;
 }
@@ -3149,8 +3170,8 @@ read_pipe_stats(struct rte_eth_dev *dev,
 		uint32_t qid = tm_port_queue_id(dev,
 			subport_id,
 			pipe_id,
-			i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
-			i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+			i / RTE_SCHED_QUEUES_PER_PIPE,
+			i % RTE_SCHED_QUEUES_PER_PIPE);
 
 		int status = rte_sched_queue_read_stats(SCHED(p),
 			qid,
@@ -3202,7 +3223,7 @@ read_tc_stats(struct rte_eth_dev *dev,
 	uint32_t i;
 
 	/* Stats read */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		struct rte_sched_queue_stats s;
 		uint16_t qlen;
 
-- 
2.20.1



* [dpdk-dev] [PATCH 25/27] examples/qos_sched: update qos sched sample app
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (23 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 24/27] net/softnic: update softnic tm function Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 26/27] examples/ip_pipeline: update ip pipeline " Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 27/27] sched: code cleanup Lukasz Krakowiak
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update the qos_sched sample application to support flexible
configuration of pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 examples/qos_sched/app_thread.c   |  11 +-
 examples/qos_sched/cfg_file.c     | 282 +++++++++++++++++-------------
 examples/qos_sched/init.c         | 108 +++++++-----
 examples/qos_sched/main.h         |   6 +-
 examples/qos_sched/profile.cfg    |  59 +++++--
 examples/qos_sched/profile_ov.cfg |  47 ++++-
 examples/qos_sched/stats.c        | 175 +++++++++++-------
 7 files changed, 432 insertions(+), 256 deletions(-)
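
Note (illustrative only, not part of the patch): the sketch below restates
how the updated get_pkt_sched() appears to pick a queue and a traffic class.
It assumes the default layout used in this series, 9 traffic classes and
16 queues per pipe. The names collect_active_queues(), select_queue() and
queue_to_tc() and the two macros are hypothetical stand-ins, and the assert()
guards are additions that the patch itself does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, inferred from the 9-TC / 16-queue defaults in this series. */
#define TCS_PER_PIPE    9   /* stands in for RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE */
#define QUEUES_PER_PIPE 16  /* stands in for RTE_SCHED_QUEUES_PER_PIPE */

/* Record the index of every queue with a non-zero size; return the count. */
uint32_t
collect_active_queues(const uint16_t qsize[QUEUES_PER_PIPE],
		uint32_t active[QUEUES_PER_PIPE])
{
	uint32_t i, n = 0;

	assert(qsize != NULL && active != NULL);
	for (i = 0; i < QUEUES_PER_PIPE; i++)
		if (qsize[i] != 0)
			active[n++] = i;
	return n;
}

/* Pick one of the active queues from a packet-derived selector value. */
uint32_t
select_queue(const uint32_t active[], uint32_t n_active, uint32_t selector)
{
	assert(n_active > 0); /* guard against modulo by zero */
	return active[selector % n_active];
}

/* Queues 0..7 map to TC 0..7; queues 8..15 all share the last TC (8). */
uint32_t
queue_to_tc(uint32_t queue)
{
	return queue > TCS_PER_PIPE - 1 ? TCS_PER_PIPE - 1 : queue;
}
```

With a size of 64 for queues 0, 2, 8 and 9 and zero elsewhere, four queues
are active; a selector of 6 picks the third of them (queue 8), which falls
into the best-effort class, TC 8.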

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index e14b275e3..25a8d42a0 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -20,13 +20,11 @@
  * QoS parameters are encoded as follows:
  *		Outer VLAN ID defines subport
  *		Inner VLAN ID defines pipe
- *		Destination IP 0.0.XXX.0 defines traffic class
  *		Destination IP host (0.0.0.XXX) defines queue
  * Values below define offset to each field from start of frame
  */
 #define SUBPORT_OFFSET	7
 #define PIPE_OFFSET		9
-#define TC_OFFSET		20
 #define QUEUE_OFFSET	20
 #define COLOR_OFFSET	19
 
@@ -39,11 +37,10 @@ get_pkt_sched(struct rte_mbuf *m, uint32_t *subport, uint32_t *pipe,
 	*subport = (rte_be_to_cpu_16(pdata[SUBPORT_OFFSET]) & 0x0FFF) &
 			(port_params.n_subports_per_port - 1); /* Outer VLAN ID*/
 	*pipe = (rte_be_to_cpu_16(pdata[PIPE_OFFSET]) & 0x0FFF) &
-			(port_params.n_pipes_per_subport - 1); /* Inner VLAN ID */
-	*traffic_class = (pdata[QUEUE_OFFSET] & 0x0F) &
-			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1); /* Destination IP */
-	*queue = ((pdata[QUEUE_OFFSET] >> 8) & 0x0F) &
-			(RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1) ; /* Destination IP */
+			(subport_params[*subport].n_subport_pipes - 1); /* Inner VLAN ID */
+	*queue = active_queues[(pdata[QUEUE_OFFSET] >> 8) % n_active_queues];
+	*traffic_class = (*queue > (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) ?
+			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) : *queue); /* Destination IP */
 	*color = pdata[COLOR_OFFSET] & 0x03; 	/* Destination IP */
 
 	return 0;
diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index 76ffffc4b..6d3674e53 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -24,7 +24,6 @@ int
 cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params)
 {
 	const char *entry;
-	int j;
 
 	if (!cfg || !port_params)
 		return -1;
@@ -37,93 +36,6 @@ cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params
 	if (entry)
 		port_params->n_subports_per_port = (uint32_t)atoi(entry);
 
-	entry = rte_cfgfile_get_entry(cfg, "port", "number of pipes per subport");
-	if (entry)
-		port_params->n_pipes_per_subport = (uint32_t)atoi(entry);
-
-	entry = rte_cfgfile_get_entry(cfg, "port", "queue sizes");
-	if (entry) {
-		char *next;
-
-		for(j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
-			port_params->qsize[j] = (uint16_t)strtol(entry, &next, 10);
-			if (next == NULL)
-				break;
-			entry = next;
-		}
-	}
-
-#ifdef RTE_SCHED_RED
-	for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
-		char str[32];
-
-		/* Parse WRED min thresholds */
-		snprintf(str, sizeof(str), "tc %d wred min", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].min_th
-					= (uint16_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED max thresholds */
-		snprintf(str, sizeof(str), "tc %d wred max", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].max_th
-					= (uint16_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED inverse mark probabilities */
-		snprintf(str, sizeof(str), "tc %d wred inv prob", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].maxp_inv
-					= (uint8_t)strtol(entry, &next, 10);
-
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED EWMA filter weights */
-		snprintf(str, sizeof(str), "tc %d wred weight", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].wq_log2
-					= (uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-	}
-#endif /* RTE_SCHED_RED */
-
 	return 0;
 }
 
@@ -139,7 +51,7 @@ cfg_load_pipe(struct rte_cfgfile *cfg, struct rte_sched_pipe_params *pipe_params
 		return -1;
 
 	profiles = rte_cfgfile_num_sections(cfg, "pipe profile", sizeof("pipe profile") - 1);
-	port_params.n_pipe_profiles = profiles;
+	subport_params[0].n_pipe_profiles = profiles;
 
 	for (j = 0; j < profiles; j++) {
 		char pipe_name[32];
@@ -173,46 +85,36 @@ cfg_load_pipe(struct rte_cfgfile *cfg, struct rte_sched_pipe_params *pipe_params
 		if (entry)
 			pipe_params[j].tc_rate[3] = (uint32_t)atoi(entry);
 
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 4 rate");
+		if (entry)
+			pipe_params[j].tc_rate[4] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 5 rate");
+		if (entry)
+			pipe_params[j].tc_rate[5] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 6 rate");
+		if (entry)
+			pipe_params[j].tc_rate[6] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 7 rate");
+		if (entry)
+			pipe_params[j].tc_rate[7] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 rate");
+		if (entry)
+			pipe_params[j].tc_rate[8] = (uint32_t)atoi(entry);
+
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 oversubscription weight");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 oversubscription weight");
 		if (entry)
 			pipe_params[j].tc_ov_weight = (uint8_t)atoi(entry);
 #endif
 
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 0 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*0 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 1 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*1 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 2 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*2 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 wrr weights");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 wrr weights");
 		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*3 + i] =
+			for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++) {
+				pipe_params[j].wrr_weights[i] =
 					(uint8_t)strtol(entry, &next, 10);
 				if (next == NULL)
 					break;
@@ -233,12 +135,111 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 		return -1;
 
 	memset(app_pipe_to_profile, -1, sizeof(app_pipe_to_profile));
+	memset(active_queues, 0, sizeof(active_queues));
+	n_active_queues = 0;
+
+#ifdef RTE_SCHED_RED
+	char sec_name[CFG_NAME_LEN];
+	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+
+	snprintf(sec_name, sizeof(sec_name), "red");
+
+	if (rte_cfgfile_has_section(cfg, sec_name)) {
+		for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+			char str[32];
+
+			/* Parse WRED min thresholds */
+			snprintf(str, sizeof(str), "tc %d wred min", i);
+			entry = rte_cfgfile_get_entry(cfg, sec_name, str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].min_th
+						= (uint16_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED max thresholds */
+			snprintf(str, sizeof(str), "tc %d wred max", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].max_th
+						= (uint16_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED inverse mark probabilities */
+			snprintf(str, sizeof(str), "tc %d wred inv prob", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].maxp_inv
+						= (uint8_t)strtol(entry, &next, 10);
+
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED EWMA filter weights */
+			snprintf(str, sizeof(str), "tc %d wred weight", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].wq_log2
+						= (uint8_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+		}
+	}
+#endif /* RTE_SCHED_RED */
 
 	for (i = 0; i < MAX_SCHED_SUBPORTS; i++) {
 		char sec_name[CFG_NAME_LEN];
 		snprintf(sec_name, sizeof(sec_name), "subport %d", i);
 
 		if (rte_cfgfile_has_section(cfg, sec_name)) {
+			entry = rte_cfgfile_get_entry(cfg, sec_name,
+				"number of pipes per subport");
+			if (entry)
+				subport_params[i].n_subport_pipes = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "queue sizes");
+			if (entry) {
+				char *next;
+
+				for (j = 0; j < RTE_SCHED_QUEUES_PER_PIPE; j++) {
+					subport_params[i].qsize[j] =
+						(uint16_t)strtol(entry, &next, 10);
+					if (subport_params[i].qsize[j] != 0) {
+						active_queues[n_active_queues] = j;
+						n_active_queues++;
+					}
+
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
 			entry = rte_cfgfile_get_entry(cfg, sec_name, "tb rate");
 			if (entry)
 				subport_params[i].tb_rate = (uint32_t)atoi(entry);
@@ -267,6 +268,26 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 			if (entry)
 				subport_params[i].tc_rate[3] = (uint32_t)atoi(entry);
 
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 4 rate");
+			if (entry)
+				subport_params[i].tc_rate[4] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 5 rate");
+			if (entry)
+				subport_params[i].tc_rate[5] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 6 rate");
+			if (entry)
+				subport_params[i].tc_rate[6] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 7 rate");
+			if (entry)
+				subport_params[i].tc_rate[7] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 8 rate");
+			if (entry)
+				subport_params[i].tc_rate[8] = (uint32_t)atoi(entry);
+
 			int n_entries = rte_cfgfile_section_num_entries(cfg, sec_name);
 			struct rte_cfgfile_entry entries[n_entries];
 
@@ -306,6 +327,21 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 					}
 				}
 			}
+
+#ifdef RTE_SCHED_RED
+			for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
+				for (k = 0; k < RTE_COLORS; k++) {
+					subport_params[i].red_params[j][k].min_th =
+						red_params[j][k].min_th;
+					subport_params[i].red_params[j][k].max_th =
+						red_params[j][k].max_th;
+					subport_params[i].red_params[j][k].maxp_inv =
+						red_params[j][k].maxp_inv;
+					subport_params[i].red_params[j][k].wq_log2 =
+						red_params[j][k].wq_log2;
+				}
+			}
+#endif
 		}
 	}
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index f44a07cd6..fce90de24 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -165,22 +165,12 @@ app_init_port(uint16_t portid, struct rte_mempool *mp)
 	return 0;
 }
 
-static struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS] = {
-	{
-		.tb_rate = 1250000000,
-		.tb_size = 1000000,
-
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
-		.tc_period = 10,
-	},
-};
-
-static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT] = {
+static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_SUBPORT] = {
 	{ /* Profile #0 */
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175, 305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 		.tc_ov_weight = 1,
@@ -190,6 +180,69 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
 	},
 };
 
+struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS] = {
+	{
+		.tb_rate = 1250000000,
+		.tb_size = 1000000,
+
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_period = 10,
+		.n_subport_pipes = 4096,
+		.qsize = {64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64},
+		.pipe_profiles = pipe_profiles,
+		.n_pipe_profiles = sizeof(pipe_profiles) / sizeof(struct rte_sched_pipe_params),
+
+#ifdef RTE_SCHED_RED
+		.red_params = {
+			/* Traffic Class 0 Colors Green / Yellow / Red */
+			[0][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[0][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[0][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 1 - Colors Green / Yellow / Red */
+			[1][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[1][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[1][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 2 - Colors Green / Yellow / Red */
+			[2][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[2][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[2][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 3 - Colors Green / Yellow / Red */
+			[3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 4 - Colors Green / Yellow / Red */
+			[4][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[4][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[4][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 5 - Colors Green / Yellow / Red */
+			[5][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[5][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[5][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 6 - Colors Green / Yellow / Red */
+			[6][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[6][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[6][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 7 - Colors Green / Yellow / Red */
+			[7][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[7][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[7][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 8 - Colors Green / Yellow / Red */
+			[8][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[8][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[8][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9}
+		},
+#endif /* RTE_SCHED_RED */
+	},
+};
+
 struct rte_sched_port_params port_params = {
 	.name = "port_scheduler_0",
 	.socket = 0, /* computed */
@@ -197,34 +250,6 @@ struct rte_sched_port_params port_params = {
 	.mtu = 6 + 6 + 4 + 4 + 2 + 1500,
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
-	.n_pipes_per_subport = 4096,
-	.qsize = {64, 64, 64, 64},
-	.pipe_profiles = pipe_profiles,
-	.n_pipe_profiles = sizeof(pipe_profiles) / sizeof(struct rte_sched_pipe_params),
-
-#ifdef RTE_SCHED_RED
-	.red_params = {
-		/* Traffic Class 0 Colors Green / Yellow / Red */
-		[0][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[0][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[0][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 1 - Colors Green / Yellow / Red */
-		[1][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[1][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[1][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 2 - Colors Green / Yellow / Red */
-		[2][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[2][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[2][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 3 - Colors Green / Yellow / Red */
-		[3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9}
-	}
-#endif /* RTE_SCHED_RED */
 };
 
 static struct rte_sched_port *
@@ -255,7 +280,8 @@ app_init_sched_port(uint32_t portid, uint32_t socketid)
 					subport, err);
 		}
 
-		for (pipe = 0; pipe < port_params.n_pipes_per_subport; pipe ++) {
+		uint32_t n_subport_pipes = subport_params[subport].n_subport_pipes;
+		for (pipe = 0; pipe < n_subport_pipes; pipe++) {
 			if (app_pipe_to_profile[subport][pipe] != -1) {
 				err = rte_sched_pipe_config(port, subport, pipe,
 						app_pipe_to_profile[subport][pipe]);
diff --git a/examples/qos_sched/main.h b/examples/qos_sched/main.h
index 8a2741c58..22a9bcb57 100644
--- a/examples/qos_sched/main.h
+++ b/examples/qos_sched/main.h
@@ -26,7 +26,7 @@ extern "C" {
 
 #define MAX_PKT_RX_BURST 64
 #define PKT_ENQUEUE 64
-#define PKT_DEQUEUE 32
+#define PKT_DEQUEUE 60
 #define MAX_PKT_TX_BURST 64
 
 #define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */
@@ -147,7 +147,11 @@ extern struct burst_conf burst_conf;
 extern struct ring_thresh rx_thresh;
 extern struct ring_thresh tx_thresh;
 
+uint32_t active_queues[RTE_SCHED_QUEUES_PER_PIPE];
+uint32_t n_active_queues;
+
 extern struct rte_sched_port_params port_params;
+extern struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS];
 
 int app_parse_args(int argc, char **argv);
 int app_init(void);
diff --git a/examples/qos_sched/profile.cfg b/examples/qos_sched/profile.cfg
index f5b704cc6..02fd8a00e 100644
--- a/examples/qos_sched/profile.cfg
+++ b/examples/qos_sched/profile.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -33,12 +33,12 @@
 ; 10GbE output port:
 ;	* Single subport (subport 0):
 ;		- Subport rate set to 100% of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of port rate
+;		- Each of the 9 traffic classes has rate set to 100% of port rate
 ;	* 4K pipes per subport 0 (pipes 0 .. 4095) with identical configuration:
 ;		- Pipe rate set to 1/4K of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of pipe rate
-;		- Within each traffic class, the byte-level WRR weights for the 4 queues
-;         are set to 1:1:1:1
+;		- Each of the 9 traffic classes has rate set to 100% of pipe rate
+;		- Within lowest priority traffic class (best-effort), the byte-level
+;		  WRR weights for the 8 queues are set to 1:1:1:1:1:1:1:1
 ;
 ; For more details, please refer to chapter "Quality of Service (QoS) Framework"
 ; of Data Plane Development Kit (DPDK) Programmer's Guide.
@@ -47,11 +47,12 @@
 [port]
 frame overhead = 24
 number of subports per port = 1
-number of pipes per subport = 4096
-queue sizes = 64 64 64 64
 
 ; Subport configuration
 [subport 0]
+number of pipes per subport = 4096
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
+
 tb rate = 1250000000           ; Bytes per second
 tb size = 1000000              ; Bytes
 
@@ -59,6 +60,11 @@ tc 0 rate = 1250000000         ; Bytes per second
 tc 1 rate = 1250000000         ; Bytes per second
 tc 2 rate = 1250000000         ; Bytes per second
 tc 3 rate = 1250000000         ; Bytes per second
+tc 4 rate = 1250000000         ; Bytes per second
+tc 5 rate = 1250000000         ; Bytes per second
+tc 6 rate = 1250000000         ; Bytes per second
+tc 7 rate = 1250000000         ; Bytes per second
+tc 8 rate = 1250000000         ; Bytes per second
 tc period = 10                 ; Milliseconds
 
 pipe 0-4095 = 0                ; These pipes are configured with pipe profile 0
@@ -72,14 +78,16 @@ tc 0 rate = 305175             ; Bytes per second
 tc 1 rate = 305175             ; Bytes per second
 tc 2 rate = 305175             ; Bytes per second
 tc 3 rate = 305175             ; Bytes per second
-tc period = 40                 ; Milliseconds
+tc 4 rate = 305175             ; Bytes per second
+tc 5 rate = 305175             ; Bytes per second
+tc 6 rate = 305175             ; Bytes per second
+tc 7 rate = 305175             ; Bytes per second
+tc 8 rate = 305175             ; Bytes per second
+tc period = 160                ; Milliseconds
 
-tc 3 oversubscription weight = 1
+tc 8 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 8 wrr weights = 1 1 1 1 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -102,3 +110,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/profile_ov.cfg b/examples/qos_sched/profile_ov.cfg
index 33000df9e..450001d2b 100644
--- a/examples/qos_sched/profile_ov.cfg
+++ b/examples/qos_sched/profile_ov.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -33,11 +33,12 @@
 [port]
 frame overhead = 24
 number of subports per port = 1
-number of pipes per subport = 32
-queue sizes = 64 64 64 64
 
 ; Subport configuration
 [subport 0]
+number of pipes per subport = 32
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
+
 tb rate = 8400000           ; Bytes per second
 tb size = 100000            ; Bytes
 
@@ -45,6 +46,11 @@ tc 0 rate = 8400000         ; Bytes per second
 tc 1 rate = 8400000         ; Bytes per second
 tc 2 rate = 8400000         ; Bytes per second
 tc 3 rate = 8400000         ; Bytes per second
+tc 4 rate = 8400000         ; Bytes per second
+tc 5 rate = 8400000         ; Bytes per second
+tc 6 rate = 8400000         ; Bytes per second
+tc 7 rate = 8400000         ; Bytes per second
+tc 8 rate = 8400000         ; Bytes per second
 tc period = 10              ; Milliseconds
 
 pipe 0-31 = 0               ; These pipes are configured with pipe profile 0
@@ -58,14 +64,16 @@ tc 0 rate = 16800000           ; Bytes per second
 tc 1 rate = 16800000           ; Bytes per second
 tc 2 rate = 16800000           ; Bytes per second
 tc 3 rate = 16800000           ; Bytes per second
+tc 4 rate = 16800000           ; Bytes per second
+tc 5 rate = 16800000           ; Bytes per second
+tc 6 rate = 16800000           ; Bytes per second
+tc 7 rate = 16800000           ; Bytes per second
+tc 8 rate = 16800000           ; Bytes per second
 tc period = 28                 ; Milliseconds
 
 tc 3 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 8 wrr weights = 1 1 1 1 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -88,3 +96,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/stats.c b/examples/qos_sched/stats.c
index 8193d964c..d5e6e65b0 100644
--- a/examples/qos_sched/stats.c
+++ b/examples/qos_sched/stats.c
@@ -14,24 +14,30 @@ qavg_q(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id, uint8_t tc,
         struct rte_sched_queue_stats stats;
         struct rte_sched_port *port;
         uint16_t qlen;
-        uint32_t queue_id, count, i;
+	uint32_t count, i, queue_id = 0;
         uint32_t average;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE || q >= RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
-                return -1;
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes  ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE ||
+		q >= RTE_SCHED_WRR_QUEUES_PER_PIPE ||
+		(tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1 && q > 0))
+		return -1;
 
         port = qos_conf[i].sched_port;
-
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
-        queue_id = queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + q);
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes *
+				RTE_SCHED_QUEUES_PER_PIPE;
+	if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
+	else
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc + q;
 
         average = 0;
-
         for (count = 0; count < qavg_ntimes; count++) {
                 rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
                 average += qlen;
@@ -52,32 +58,42 @@ qavg_tcpipe(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id,
         struct rte_sched_queue_stats stats;
         struct rte_sched_port *port;
         uint16_t qlen;
-        uint32_t queue_id, count, i;
+	uint32_t count, i, queue_id = 0;
         uint32_t average, part_average;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
-                return -1;
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return -1;
 
         port = qos_conf[i].sched_port;
 
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
 
         average = 0;
 
         for (count = 0; count < qavg_ntimes; count++) {
                 part_average = 0;
-                for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-                        rte_sched_queue_read_stats(port, queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + i), &stats, &qlen);
-                        part_average += qlen;
-                }
-                average += part_average / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
-                usleep(qavg_period);
-        }
+
+		if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+			average += qlen;
+		} else {
+			for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++) {
+				rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+				part_average += qlen;
+			}
+			average += part_average / RTE_SCHED_WRR_QUEUES_PER_PIPE;
+		}
+		usleep(qavg_period);
+	}
 
         average /= qavg_ntimes;
 
@@ -92,30 +108,36 @@ qavg_pipe(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id)
         struct rte_sched_queue_stats stats;
         struct rte_sched_port *port;
         uint16_t qlen;
-        uint32_t queue_id, count, i;
+	uint32_t count, i, queue_id = 0;
         uint32_t average, part_average;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport)
-                return -1;
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes)
+		return -1;
 
         port = qos_conf[i].sched_port;
 
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes *
+				RTE_SCHED_QUEUES_PER_PIPE;
+
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE;
 
         average = 0;
 
         for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-                        rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
-                        part_average += qlen;
-                }
-                average += part_average / (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
-                usleep(qavg_period);
+		part_average = 0;
+		for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+			rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+			part_average += qlen;
+		}
+		average += part_average / RTE_SCHED_QUEUES_PER_PIPE;
+		usleep(qavg_period);
         }
 
         average /= qavg_ntimes;
@@ -131,32 +153,47 @@ qavg_tcsubport(uint16_t port_id, uint32_t subport_id, uint8_t tc)
         struct rte_sched_queue_stats stats;
         struct rte_sched_port *port;
         uint16_t qlen;
-        uint32_t queue_id, count, i, j;
+	uint32_t queue_id, count, i, j, subport_queue_id = 0;
         uint32_t average, part_average;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
-                return -1;
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return -1;
 
         port = qos_conf[i].sched_port;
 
+	for (i = 0; i < subport_id; i++)
+		subport_queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+
         average = 0;
 
         for (count = 0; count < qavg_ntimes; count++) {
                 part_average = 0;
-                for (i = 0; i < port_params.n_pipes_per_subport; i++) {
-                        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + i);
+		for (i = 0; i < subport_params[subport_id].n_subport_pipes; i++) {
+			if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+				queue_id = subport_queue_id + i * RTE_SCHED_QUEUES_PER_PIPE + tc;
+				rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+				part_average += qlen;
+			} else {
+				for (j = 0; j < RTE_SCHED_WRR_QUEUES_PER_PIPE; j++) {
+					queue_id = subport_queue_id +
+							i * RTE_SCHED_QUEUES_PER_PIPE + tc + j;
+					rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+					part_average += qlen;
+				}
+			}
+		}
+
+		if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+			average += part_average / (subport_params[subport_id].n_subport_pipes);
+		else
+			average += part_average / (subport_params[subport_id].n_subport_pipes * RTE_SCHED_WRR_QUEUES_PER_PIPE);
 
-                        for (j = 0; j < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
-                                rte_sched_queue_read_stats(port, queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j), &stats, &qlen);
-                                part_average += qlen;
-                        }
-                }
-
-                average += part_average / (port_params.n_pipes_per_subport * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
                 usleep(qavg_period);
         }
 
@@ -173,32 +210,36 @@ qavg_subport(uint16_t port_id, uint32_t subport_id)
         struct rte_sched_queue_stats stats;
         struct rte_sched_port *port;
         uint16_t qlen;
-        uint32_t queue_id, count, i, j;
+	uint32_t queue_id, count, i, j, subport_queue_id = 0;
         uint32_t average, part_average;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port)
-                return -1;
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port)
+		return -1;
 
         port = qos_conf[i].sched_port;
 
+	for (i = 0; i < subport_id; i++)
+		subport_queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+
         average = 0;
 
         for (count = 0; count < qavg_ntimes; count++) {
                 part_average = 0;
-                for (i = 0; i < port_params.n_pipes_per_subport; i++) {
-                        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + i);
+		for (i = 0; i < subport_params[subport_id].n_subport_pipes; i++) {
+			queue_id = subport_queue_id + i * RTE_SCHED_QUEUES_PER_PIPE;
 
-                        for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
+			for (j = 0; j < RTE_SCHED_QUEUES_PER_PIPE; j++) {
                                 rte_sched_queue_read_stats(port, queue_id + j, &stats, &qlen);
                                 part_average += qlen;
                         }
                 }
 
-                average += part_average / (port_params.n_pipes_per_subport * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+		average += part_average / (subport_params[subport_id].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE);
                 usleep(qavg_period);
         }
 
@@ -252,35 +293,41 @@ pipe_stat(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id)
         struct rte_sched_port *port;
         uint16_t qlen;
         uint8_t i, j;
-        uint32_t queue_id;
+	uint32_t queue_id = 0;
 
         for (i = 0; i < nb_pfc; i++) {
                 if (qos_conf[i].tx_port == port_id)
                         break;
         }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport)
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes)
                 return -1;
 
         port = qos_conf[i].sched_port;
-
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE;
 
         printf("\n");
         printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
         printf("| TC | Queue |   Pkts OK   |Pkts Dropped |  Bytes OK   |Bytes Dropped|    Length   |\n");
         printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
 
-        for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-                for (j = 0; j < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
-
-                        rte_sched_queue_read_stats(port, queue_id + (i * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j), &stats, &qlen);
-
-                        printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, j,
-                                        stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
-                        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+		if (i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+			printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, 0,
+				stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
+			printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+		} else {
+			for (j = 0; j < RTE_SCHED_WRR_QUEUES_PER_PIPE; j++) {
+				rte_sched_queue_read_stats(port, queue_id + i + j, &stats, &qlen);
+				printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, j,
+					stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
+				printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+			}
                 }
-                if (i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
-                        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
         }
         printf("\n");
 
-- 
2.20.1
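
Note on the queue index arithmetic used throughout stats.c above: since
each subport now carries its own pipe count, the flat queue index can no
longer be computed with a single multiplication. The following is a
minimal standalone sketch of that calculation, not code from the patch.
The helper name queue_index() and the n_pipes[] array are hypothetical;
the constants 16 and 8 mirror RTE_SCHED_QUEUES_PER_PIPE and
RTE_SCHED_WRR_QUEUES_PER_PIPE as used in this series. Unlike qavg_q(),
the sketch does not reject q > 0 for a strict priority class, it simply
ignores q there.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the series: 16 queues per pipe, the last 8 of which
 * belong to the best-effort (BE) traffic class. */
#define QUEUES_PER_PIPE      16
#define WRR_QUEUES_PER_PIPE  8
#define TCS_PER_PIPE         (QUEUES_PER_PIPE - WRR_QUEUES_PER_PIPE + 1)

/* Hypothetical helper: n_pipes[i] is the number of pipes in subport i. */
static inline uint32_t
queue_index(const uint32_t *n_pipes, uint32_t subport_id,
	uint32_t pipe_id, uint32_t tc, uint32_t q)
{
	uint32_t i, queue_id = 0;

	/* Skip every queue owned by the preceding subports. */
	for (i = 0; i < subport_id; i++)
		queue_id += n_pipes[i] * QUEUES_PER_PIPE;

	/* Skip the preceding pipes of this subport. */
	queue_id += pipe_id * QUEUES_PER_PIPE;

	/* Each strict priority TC owns one queue, so tc is the offset. */
	queue_id += tc;

	/* Only the last (BE) TC has several queues selected by q. */
	if (tc == TCS_PER_PIPE - 1)
		queue_id += q;

	return queue_id;
}
```

With two subports of 32 and 64 pipes, queue 3 of the BE class (tc 8) in
pipe 2 of subport 1 lands at 32 * 16 + 2 * 16 + 8 + 3.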



* [dpdk-dev] [PATCH 26/27] examples/ip_pipeline: update ip pipeline sample app
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (24 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 25/27] examples/qos_sched: update qos sched sample app Lukasz Krakowiak
@ 2019-05-28 12:05 ` " Lukasz Krakowiak
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 27/27] sched: code cleanup Lukasz Krakowiak
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Update ip pipeline sample app to allow configuration flexibility
for pipe traffic classes and queues, and subport level configuration
of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 examples/ip_pipeline/cli.c             | 85 +++++++++++++-------------
 examples/ip_pipeline/tmgr.c            | 21 +++----
 examples/ip_pipeline/tmgr.h            |  3 -
 lib/librte_pipeline/rte_table_action.c |  1 -
 lib/librte_pipeline/rte_table_action.h |  4 +-
 5 files changed, 53 insertions(+), 61 deletions(-)

diff --git a/examples/ip_pipeline/cli.c b/examples/ip_pipeline/cli.c
index bcf62fbf5..2f0ee2ec1 100644
--- a/examples/ip_pipeline/cli.c
+++ b/examples/ip_pipeline/cli.c
@@ -377,8 +377,11 @@ cmd_swq(char **tokens,
 static const char cmd_tmgr_subport_profile_help[] =
 "tmgr subport profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
-"   <tc_period>\n";
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>"
+"        <tc4_rate> <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>\n"
+"   <tc_period>\n"
+"   pps <n_pipes_per_subport>\n"
+"   qsize <qsize_q0..15>\n";
 
 static void
 cmd_tmgr_subport_profile(char **tokens,
@@ -389,7 +392,7 @@ cmd_tmgr_subport_profile(char **tokens,
 	struct rte_sched_subport_params p;
 	int status, i;
 
-	if (n_tokens != 10) {
+	if (n_tokens != 34) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -410,11 +413,32 @@ cmd_tmgr_subport_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[14]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
 
+	if (strcmp(tokens[15], "pps") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "pps");
+		return;
+	}
+
+	if (parser_read_uint32(&p.n_subport_pipes, tokens[16]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "n_subport_pipes");
+		return;
+	}
+
+	if (strcmp(tokens[17], "qsize") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "qsize");
+		return;
+	}
+
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+		if (parser_read_uint16(&p.qsize[i], tokens[18 + i]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "qsize");
+			return;
+		}
+
 	status = tmgr_subport_profile_add(&p);
 	if (status != 0) {
 		snprintf(out, out_size, MSG_CMD_FAIL, tokens[0]);
@@ -425,10 +449,11 @@ cmd_tmgr_subport_profile(char **tokens,
 static const char cmd_tmgr_pipe_profile_help[] =
 "tmgr pipe profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>"
+"     <tc4_rate> <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>\n"
 "   <tc_period>\n"
 "   <tc_ov_weight>\n"
-"   <wrr_weight0..15>\n";
+"   <wrr_weight0..7>\n";
 
 static void
 cmd_tmgr_pipe_profile(char **tokens,
@@ -439,7 +464,7 @@ cmd_tmgr_pipe_profile(char **tokens,
 	struct rte_sched_pipe_params p;
 	int status, i;
 
-	if (n_tokens != 27) {
+	if (n_tokens != 24) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -460,20 +485,20 @@ cmd_tmgr_pipe_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[14]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-	if (parser_read_uint8(&p.tc_ov_weight, tokens[10]) != 0) {
+	if (parser_read_uint8(&p.tc_ov_weight, tokens[15]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_ov_weight");
 		return;
 	}
 #endif
 
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
-		if (parser_read_uint8(&p.wrr_weights[i], tokens[11 + i]) != 0) {
+	for (i = 0; i < RTE_SCHED_WRR_QUEUES_PER_PIPE; i++)
+		if (parser_read_uint8(&p.wrr_weights[i], tokens[16 + i]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "wrr_weights");
 			return;
 		}
@@ -489,8 +514,6 @@ static const char cmd_tmgr_help[] =
 "tmgr <tmgr_name>\n"
 "   rate <rate>\n"
 "   spp <n_subports_per_port>\n"
-"   pps <n_pipes_per_subport>\n"
-"   qsize <qsize_tc0> <qsize_tc1> <qsize_tc2> <qsize_tc3>\n"
 "   fo <frame_overhead>\n"
 "   mtu <mtu>\n"
 "   cpu <cpu_id>\n";
@@ -504,9 +527,8 @@ cmd_tmgr(char **tokens,
 	struct tmgr_port_params p;
 	char *name;
 	struct tmgr_port *tmgr_port;
-	int i;
 
-	if (n_tokens != 19) {
+	if (n_tokens != 12) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -533,53 +555,32 @@ cmd_tmgr(char **tokens,
 		return;
 	}
 
-	if (strcmp(tokens[6], "pps") != 0) {
-		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "pps");
-		return;
-	}
-
-	if (parser_read_uint32(&p.n_pipes_per_subport, tokens[7]) != 0) {
-		snprintf(out, out_size, MSG_ARG_INVALID, "n_pipes_per_subport");
-		return;
-	}
-
-	if (strcmp(tokens[8], "qsize") != 0) {
-		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "qsize");
-		return;
-	}
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		if (parser_read_uint16(&p.qsize[i], tokens[9 + i]) != 0) {
-			snprintf(out, out_size, MSG_ARG_INVALID, "qsize");
-			return;
-		}
-
-	if (strcmp(tokens[13], "fo") != 0) {
+	if (strcmp(tokens[6], "fo") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "fo");
 		return;
 	}
 
-	if (parser_read_uint32(&p.frame_overhead, tokens[14]) != 0) {
+	if (parser_read_uint32(&p.frame_overhead, tokens[7]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "frame_overhead");
 		return;
 	}
 
-	if (strcmp(tokens[15], "mtu") != 0) {
+	if (strcmp(tokens[8], "mtu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "mtu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.mtu, tokens[16]) != 0) {
+	if (parser_read_uint32(&p.mtu, tokens[9]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "mtu");
 		return;
 	}
 
-	if (strcmp(tokens[17], "cpu") != 0) {
+	if (strcmp(tokens[10], "cpu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "cpu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.cpu_id, tokens[18]) != 0) {
+	if (parser_read_uint32(&p.cpu_id, tokens[11]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "cpu_id");
 		return;
 	}
diff --git a/examples/ip_pipeline/tmgr.c b/examples/ip_pipeline/tmgr.c
index 40cbf1d0a..538364580 100644
--- a/examples/ip_pipeline/tmgr.c
+++ b/examples/ip_pipeline/tmgr.c
@@ -47,7 +47,8 @@ int
 tmgr_subport_profile_add(struct rte_sched_subport_params *p)
 {
 	/* Check input params */
-	if (p == NULL)
+	if (p == NULL ||
+		p->n_subport_pipes == 0)
 		return -1;
 
 	/* Save profile */
@@ -90,7 +91,6 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 		tmgr_port_find(name) ||
 		(params == NULL) ||
 		(params->n_subports_per_port == 0) ||
-		(params->n_pipes_per_subport == 0) ||
 		(params->cpu_id >= RTE_MAX_NUMA_NODES) ||
 		(n_subport_profiles == 0) ||
 		(n_pipe_profiles == 0))
@@ -103,18 +103,14 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 	p.mtu = params->mtu;
 	p.frame_overhead = params->frame_overhead;
 	p.n_subports_per_port = params->n_subports_per_port;
-	p.n_pipes_per_subport = params->n_pipes_per_subport;
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		p.qsize[i] = params->qsize[i];
-
-	p.pipe_profiles = pipe_profile;
-	p.n_pipe_profiles = n_pipe_profiles;
 
 	s = rte_sched_port_config(&p);
 	if (s == NULL)
 		return NULL;
 
+	subport_profile[0].pipe_profiles = pipe_profile;
+	subport_profile[0].n_pipe_profiles = n_pipe_profiles;
+
 	for (i = 0; i < params->n_subports_per_port; i++) {
 		int status;
 
@@ -128,7 +124,7 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 			return NULL;
 		}
 
-		for (j = 0; j < params->n_pipes_per_subport; j++) {
+		for (j = 0; j < subport_profile[0].n_subport_pipes; j++) {
 			status = rte_sched_pipe_config(
 				s,
 				i,
@@ -153,7 +149,6 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 	strlcpy(tmgr_port->name, name, sizeof(tmgr_port->name));
 	tmgr_port->s = s;
 	tmgr_port->n_subports_per_port = params->n_subports_per_port;
-	tmgr_port->n_pipes_per_subport = params->n_pipes_per_subport;
 
 	/* Node add to list */
 	TAILQ_INSERT_TAIL(&tmgr_port_list, tmgr_port, node);
@@ -205,8 +200,8 @@ tmgr_pipe_config(const char *port_name,
 	port = tmgr_port_find(port_name);
 	if ((port == NULL) ||
 		(subport_id >= port->n_subports_per_port) ||
-		(pipe_id_first >= port->n_pipes_per_subport) ||
-		(pipe_id_last >= port->n_pipes_per_subport) ||
+		(pipe_id_first >= subport_profile[0].n_subport_pipes) ||
+		(pipe_id_last >= subport_profile[0].n_subport_pipes) ||
 		(pipe_id_first > pipe_id_last) ||
 		(pipe_profile_id >= n_pipe_profiles))
 		return -1;
diff --git a/examples/ip_pipeline/tmgr.h b/examples/ip_pipeline/tmgr.h
index 0b497e795..3a958492c 100644
--- a/examples/ip_pipeline/tmgr.h
+++ b/examples/ip_pipeline/tmgr.h
@@ -25,7 +25,6 @@ struct tmgr_port {
 	char name[NAME_SIZE];
 	struct rte_sched_port *s;
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
 };
 
 TAILQ_HEAD(tmgr_port_list, tmgr_port);
@@ -39,8 +38,6 @@ tmgr_port_find(const char *name);
 struct tmgr_port_params {
 	uint32_t rate;
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t frame_overhead;
 	uint32_t mtu;
 	uint32_t cpu_id;
diff --git a/lib/librte_pipeline/rte_table_action.c b/lib/librte_pipeline/rte_table_action.c
index 9a65f3ded..c097ea681 100644
--- a/lib/librte_pipeline/rte_table_action.c
+++ b/lib/librte_pipeline/rte_table_action.c
@@ -401,7 +401,6 @@ pkt_work_tm(struct rte_mbuf *mbuf,
 {
 	struct dscp_table_entry_data *dscp_entry = &dscp_table->entry[dscp];
 	uint32_t queue_id = data->queue_id |
-				(dscp_entry->tc << 2) |
 				dscp_entry->tc_queue;
 	rte_mbuf_sched_set(mbuf, queue_id, dscp_entry->tc,
 				(uint8_t)dscp_entry->color);
diff --git a/lib/librte_pipeline/rte_table_action.h b/lib/librte_pipeline/rte_table_action.h
index cf6eeaa30..009e911cb 100644
--- a/lib/librte_pipeline/rte_table_action.h
+++ b/lib/librte_pipeline/rte_table_action.h
@@ -181,10 +181,10 @@ struct rte_table_action_lb_params {
  * RTE_TABLE_ACTION_MTR
  */
 /** Max number of traffic classes (TCs). */
-#define RTE_TABLE_ACTION_TC_MAX                                  4
+#define RTE_TABLE_ACTION_TC_MAX                                  16
 
 /** Max number of queues per traffic class. */
-#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            4
+#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            16
 
 /** Differentiated Services Code Point (DSCP) translation table entry. */
 struct rte_table_action_dscp_table_entry {
-- 
2.20.1
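
Note on the token counts checked in cli.c above: the new values 34 and
24 follow from the command layouts, and can be cross-checked with a
small standalone sketch. This is an illustration only. The enum names,
count_tokens() and the two example command lines (including their rate
and size values) are hypothetical; the offsets 14 to 18 are taken from
the parsers in the diff, while offsets 3 to 5 for tb_rate, tb_size and
the first TC rate are inferred from them.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUES_PER_PIPE      16
#define WRR_QUEUES_PER_PIPE  8
#define TCS_PER_PIPE         (QUEUES_PER_PIPE - WRR_QUEUES_PER_PIPE + 1)

/* "tmgr subport profile" takes tokens 0..2; parameters start at 3. */
enum {
	TOK_TB_RATE = 3,
	TOK_TB_SIZE = 4,
	TOK_TC_RATE0 = 5,
	TOK_TC_PERIOD = TOK_TC_RATE0 + TCS_PER_PIPE,           /* 14 */

	/* Subport profile: "pps" <n> "qsize" <16 sizes>. */
	SUBPORT_TOK_QSIZE0 = TOK_TC_PERIOD + 4,                /* 18 */
	SUBPORT_PROFILE_N_TOKENS = SUBPORT_TOK_QSIZE0 + QUEUES_PER_PIPE,

	/* Pipe profile: <tc_ov_weight> <8 WRR weights>. */
	PIPE_TOK_WRR0 = TOK_TC_PERIOD + 2,                     /* 16 */
	PIPE_PROFILE_N_TOKENS = PIPE_TOK_WRR0 + WRR_QUEUES_PER_PIPE,
};

/* Hypothetical example commands; all numeric values are placeholders. */
static const char subport_profile_example[] =
	"tmgr subport profile 1250000000 1000000 "
	"1250000000 1250000000 1250000000 1250000000 1250000000 "
	"1250000000 1250000000 1250000000 1250000000 "
	"10 pps 4096 qsize "
	"64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64";

static const char pipe_profile_example[] =
	"tmgr pipe profile 305175 1000000 "
	"305175 305175 305175 305175 305175 305175 305175 305175 305175 "
	"40 1 "
	"1 1 1 1 1 1 1 1";

/* Count whitespace-separated tokens in a command line. */
static inline size_t
count_tokens(const char *s)
{
	size_t n = 0;
	int in_token = 0;

	for (; *s != '\0'; s++) {
		if (*s == ' ' || *s == '\t' || *s == '\n') {
			in_token = 0;
		} else if (!in_token) {
			in_token = 1;
			n++;
		}
	}

	return n;
}
```

Both example lines tokenize to exactly the counts the parsers expect, so
a command with the old 4-TC layout would be rejected with
MSG_ARG_MISMATCH.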



* [dpdk-dev] [PATCH 27/27] sched: code cleanup
  2019-05-28 12:05 [dpdk-dev] [PATCH 00/27] sched: feature enhancements Lukasz Krakowiak
                   ` (25 preceding siblings ...)
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 26/27] examples/ip_pipeline: update ip pipeline " Lukasz Krakowiak
@ 2019-05-28 12:05 ` Lukasz Krakowiak
  26 siblings, 0 replies; 163+ messages in thread
From: Lukasz Krakowiak @ 2019-05-28 12:05 UTC (permalink / raw)
  To: cristian.dumitrescu; +Cc: dev, Jasvinder Singh, Abraham Tovar, Lukasz Krakowiak

From: Jasvinder Singh <jasvinder.singh@intel.com>

Remove redundant macros and fields from the data structures.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 53 ------------------------------------
 lib/librte_sched/rte_sched.h | 18 ------------
 2 files changed, 71 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 563161713..79731af8e 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -100,18 +100,7 @@ struct rte_sched_grinder {
 	uint32_t tc_index;
 	struct rte_sched_strict_priority_class sp;
 	struct rte_sched_best_effort_class be;
-	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint16_t qsize;
-	uint32_t qmask;
-	uint32_t qpos;
 	struct rte_mbuf *pkt;
-
-	/* WRR */
-	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
 };
 
 struct rte_sched_subport {
@@ -215,7 +204,6 @@ struct rte_sched_pipe {
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
 	uint8_t tc_ov_period_id;
-	uint8_t reserved[3];
 } __rte_cache_aligned;
 
 struct rte_sched_queue {
@@ -233,18 +221,10 @@ struct rte_sched_queue_extra {
 struct rte_sched_port {
 	/* User parameters */
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
-	uint32_t n_pipes_per_subport_log2;
 	int socket;
 	uint32_t rate;
 	uint32_t mtu;
 	uint32_t frame_overhead;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t n_pipe_profiles;
-	uint32_t pipe_tc3_rate_max;
-#ifdef RTE_SCHED_RED
-	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
-#endif
 
 	/* Timing */
 	uint64_t time_cpu_cycles;     /* Current CPU time measured in CPU cyles */
@@ -252,50 +232,17 @@ struct rte_sched_port {
 	uint64_t time;                /* Current NIC TX time measured in bytes */
 	struct rte_reciprocal inv_cycles_per_byte; /* CPU cycles per byte */
 
-	/* Scheduling loop detection */
-	uint32_t pipe_loop;
-	uint32_t pipe_exhaustion;
-
-	/* Bitmap */
-	struct rte_bitmap *bmp;
-	uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS] __rte_aligned_16;
-
 	/* Grinders */
-	struct rte_sched_grinder grinder[RTE_SCHED_PORT_N_GRINDERS];
-	uint32_t busy_grinders;
 	struct rte_mbuf **pkts_out;
 	uint32_t n_pkts_out;
 	uint32_t subport_id;
 
 	uint32_t n_max_subport_pipes_log2;   /* Max number of subport pipes */
 
-	/* Queue base calculation */
-	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
-	uint32_t qsize_sum;
-
 	/* Large data structures */
-	struct rte_sched_subport *subport;
-	struct rte_sched_pipe *pipe;
-	struct rte_sched_queue *queue;
-	struct rte_sched_queue_extra *queue_extra;
-	struct rte_sched_pipe_profile *pipe_profiles;
-	uint8_t *bmp_array;
-	struct rte_mbuf **queue_array;
 	struct rte_sched_subport *subports[RTE_SCHED_SUBPORTS_PER_PORT];
-	uint8_t memory[0] __rte_cache_aligned;
 } __rte_cache_aligned;
 
-enum rte_sched_port_array {
-	e_RTE_SCHED_PORT_ARRAY_SUBPORT = 0,
-	e_RTE_SCHED_PORT_ARRAY_PIPE,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA,
-	e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES,
-	e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY,
-	e_RTE_SCHED_PORT_ARRAY_TOTAL,
-};
-
 enum rte_sched_subport_array {
 	e_RTE_SCHED_SUBPORT_ARRAY_PIPE = 0,
 	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE,
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 28b589309..5d6828e2e 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -81,7 +81,6 @@ extern "C" {
 #define RTE_SCHED_WRR_QUEUES_PER_PIPE    8
 
 /** Number of traffic classes per pipe (as well as subport). */
-#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
 #define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
 (RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_WRR_QUEUES_PER_PIPE + 1)
 
@@ -95,10 +94,6 @@ extern "C" {
 /** Maximum number of pipe profiles that can be defined per subport.
  * Compile-time configurable.
  */
-#ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT
-#define RTE_SCHED_PIPE_PROFILES_PER_PORT      256
-#endif
-
 #ifndef RTE_SCHED_PIPE_PROFILES_PER_SUBPORT
 #define RTE_SCHED_PIPE_PROFILES_PER_SUBPORT      256
 #endif
@@ -229,19 +224,6 @@ struct rte_sched_port_params {
 	uint32_t frame_overhead;         /**< Framing overhead per packet
 					  * (measured in bytes) */
 	uint32_t n_subports_per_port;    /**< Number of subports */
-	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Packet queue size for each traffic class.
-	 * All queues within the same pipe traffic class have the same
-	 * size. Queues from different pipes serving the same traffic
-	 * class have the same size. */
-	struct rte_sched_pipe_params *pipe_profiles;
-	/**< Pipe profile table.
-	 * Every pipe is configured using one of the profiles from this table. */
-	uint32_t n_pipe_profiles;        /**< Profiles in the pipe profile table */
-#ifdef RTE_SCHED_RED
-	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS]; /**< RED parameters */
-#endif
 };
 
 /*
-- 
2.20.1



* Re: [dpdk-dev] [PATCH 07/27] sched: update pipe profile add api
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 07/27] sched: update pipe profile add api Lukasz Krakowiak
@ 2019-05-28 14:06   ` Stephen Hemminger
  0 siblings, 0 replies; 163+ messages in thread
From: Stephen Hemminger @ 2019-05-28 14:06 UTC (permalink / raw)
  To: Lukasz Krakowiak; +Cc: cristian.dumitrescu, dev, Jasvinder Singh, Abraham Tovar

On Tue, 28 May 2019 14:05:33 +0200
Lukasz Krakowiak <lukaszx.krakowiak@intel.com> wrote:

>  
>  static int
>  pipe_profile_check(struct rte_sched_pipe_params *params,
> -	uint32_t rate)
> +	uint32_t rate, uint16_t *qsize)
>  {
>  	uint32_t i;
>  
>  	/* Pipe parameters */
>  	if (params == NULL)
> -		return -10;
> +		return -11;

I have used this before and suffered from the error handling.
Please change this to do proper logging and use a normal convention
for error numbers. This is not in the fast path, so it should be more
user friendly.
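
The convention requested above can be sketched as follows. This is an
illustration only, not code from any revision of the patch: the struct
pipe_params_sketch and the function pipe_profile_check_sketch are
hypothetical stand-ins, the individual range checks are assumptions, and
fprintf(stderr, ...) stands in for the library's logging macro so that
the snippet builds without DPDK headers.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for struct rte_sched_pipe_params. */
struct pipe_params_sketch {
	uint32_t tb_rate; /* token bucket rate, bytes per second */
	uint32_t tb_size; /* token bucket size, credits */
};

/*
 * Returns 0 on success or a negative errno value on failure, and logs
 * which parameter was rejected instead of returning an opaque number.
 */
static int
pipe_profile_check_sketch(const struct pipe_params_sketch *params,
	uint32_t rate)
{
	if (params == NULL) {
		fprintf(stderr, "%s: params is NULL\n", __func__);
		return -EINVAL;
	}

	/* Assumed rule: pipe rate must be non-zero and fit the port rate. */
	if (params->tb_rate == 0 || params->tb_rate > rate) {
		fprintf(stderr, "%s: tb_rate %" PRIu32 " not in 1..%" PRIu32 "\n",
			__func__, params->tb_rate, rate);
		return -EINVAL;
	}

	/* Assumed rule: token bucket size must be non-zero. */
	if (params->tb_size == 0) {
		fprintf(stderr, "%s: tb_size must be non-zero\n", __func__);
		return -EINVAL;
	}

	return 0;
}
```

A caller then only needs to test for a negative return value and can
read the log to find out which field was wrong.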


* [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-05-28 12:05 ` [dpdk-dev] [PATCH 01/27] sched: update macros for flexible config Lukasz Krakowiak
@ 2019-06-25 15:31   ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config Jasvinder Singh
                       ` (30 more replies)
  0 siblings, 31 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu

This patchset refactors the DPDK QoS sched library to add the
following features, which enhance the scheduler functionality.

1. Flexible configuration of the pipe traffic classes and queues;

   Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
   strict priority, and each TC has exactly 4 queues that are
   scheduled with Weighted Round Robin (WRR).

   Instead of hardwiring queues to traffic classes within a pipe,
   the new implementation allows a more flexible, configurable split of pipe
   queues between strict priority (SP) and best-effort (BE) traffic classes,
   and supports a larger number of traffic classes, i.e. max 16.
   
   All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
   the lowest priority BE TC has 1, 4 or 8 queues. This is justified by
   the fact that all the high priority TCs are fully provisioned (small to
   medium traffic rates), while most of the traffic fits into the BE class,
   which is typically oversubscribed.

   Furthermore, this change allows fewer than 16 queues per pipe to be used
   when not all 16 queues are needed. Therefore, no memory is allocated
   for the queues that are not needed.

2. Subport level configuration of pipe nodes;

   Currently, all parameters for the pipe nodes (subscribers) configuration
   are part of the port level structure which forces all groups of
   subscribers (i.e. pipes) in different subports to have similar
   configurations in terms of their number, queue sizes, traffic-classes,
   etc.

   The new implementation moves pipe nodes configuration parameters from
   port level to subport level structure. Therefore, different subports of
   the same port can have different configurations for the pipe nodes
   (subscribers), for example, number of pipes, queue sizes, queue to
   traffic-class mapping, etc.

v2:
- fix bug in subport parameters check
- remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
- fix bug in grinder_scheduler function
- improve doxygen comments 
- add error log information

Jasvinder Singh (27):
  sched: update macros for flexible config
  sched: update subport and pipe data structures
  sched: update internal data structures
  sched: update port config API
  sched: update port free API
  sched: update subport config API
  sched: update pipe profile add API
  sched: update pipe config API
  sched: update pkt read and write API
  sched: update subport and tc queue stats
  sched: update port memory footprint API
  sched: update packet enqueue API
  sched: update grinder pipe and tc cache
  sched: update grinder next pipe and tc functions
  sched: update pipe and tc queues prefetch
  sched: update grinder wrr compute function
  sched: modify credits update function
  sched: update mbuf prefetch function
  sched: update grinder schedule function
  sched: update grinder handle function
  sched: update packet dequeue API
  sched: update sched queue stats API
  test/sched: update unit test
  net/softnic: update softnic tm function
  examples/qos_sched: update qos sched sample app
  examples/ip_pipeline: update ip pipeline sample app
  sched: code cleanup

Lukasz Krakowiak (1):
  sched: add release note

 app/test/test_sched.c                         |   39 +-
 doc/guides/rel_notes/deprecation.rst          |    6 -
 doc/guides/rel_notes/release_19_08.rst        |    7 +-
 drivers/net/softnic/rte_eth_softnic.c         |  131 +
 drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
 .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
 examples/ip_pipeline/cli.c                    |   85 +-
 examples/ip_pipeline/tmgr.c                   |   22 +-
 examples/ip_pipeline/tmgr.h                   |    3 -
 examples/qos_sched/app_thread.c               |   11 +-
 examples/qos_sched/cfg_file.c                 |  283 ++-
 examples/qos_sched/init.c                     |  111 +-
 examples/qos_sched/main.h                     |    7 +-
 examples/qos_sched/profile.cfg                |   59 +-
 examples/qos_sched/profile_ov.cfg             |   47 +-
 examples/qos_sched/stats.c                    |  483 ++--
 lib/librte_pipeline/rte_table_action.c        |    1 -
 lib/librte_pipeline/rte_table_action.h        |    4 +-
 lib/librte_sched/Makefile                     |    2 +-
 lib/librte_sched/meson.build                  |    2 +-
 lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
 lib/librte_sched/rte_sched.h                  |  229 +-
 lib/librte_sched/rte_sched_common.h           |   41 +
 24 files changed, 2634 insertions(+), 1455 deletions(-)

-- 
2.21.0



* [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-07-01 19:04       ` Dumitrescu, Cristian
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures Jasvinder Singh
                       ` (29 subsequent siblings)
  30 siblings, 2 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update macros to allow configuration flexibility for pipe traffic
classes and queues, and subport level configuration of the pipe
parameters.
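
The arithmetic behind the updated macros can be checked with a small
sketch. The SKETCH_ macros copy the values from this patch; the helper
queue_to_tc() is hypothetical and assumes that the single-queue traffic
classes occupy the lowest queue indices and that the best-effort class
takes the remaining ones.

```c
#include <assert.h>

#define SKETCH_QUEUES_PER_PIPE     16
#define SKETCH_BE_QUEUES_PER_PIPE  8
#define SKETCH_TRAFFIC_CLASSES_PER_PIPE \
	(SKETCH_QUEUES_PER_PIPE - SKETCH_BE_QUEUES_PER_PIPE + 1)

/* 8 single-queue traffic classes plus 1 best-effort class. */
static_assert(SKETCH_TRAFFIC_CLASSES_PER_PIPE == 9,
	"16 queues with 8 best-effort queues give 9 traffic classes");

/*
 * Map a queue index within a pipe to its traffic class index.
 * Assumption: queues 0..7 map one-to-one to TC 0..7, and queues 8..15
 * all belong to the best-effort TC, which has the highest index.
 */
static unsigned int
queue_to_tc(unsigned int queue)
{
	const unsigned int be_tc = SKETCH_TRAFFIC_CLASSES_PER_PIPE - 1;

	return queue < be_tc ? queue : be_tc;
}
```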

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.h | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 9c55a787d..470a0036a 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -52,7 +52,7 @@ extern "C" {
  *	    multiple connections of same traffic class belonging to
  *	    the same user;
  *           - Weighted Round Robin (WRR) is used to service the
- *	    queues within same pipe traffic class.
+ *	    queues within same pipe lowest priority traffic class (best-effort).
  *
  */
 
@@ -66,20 +66,32 @@ extern "C" {
 #include "rte_red.h"
 #endif
 
-/** Number of traffic classes per pipe (as well as subport).
- * Cannot be changed.
+/** Maximum number of queues per pipe.
+ * Note that the multiple queues (power of 2) can only be assigned to
+ * lowest priority (best-effort) traffic class. Other higher priority traffic
+ * classes can only have one queue.
+ * Can not change.
+ *
+ * @see struct rte_sched_subport_params
  */
-#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    4
+#define RTE_SCHED_QUEUES_PER_PIPE    16
 
-/** Number of queues per pipe traffic class. Cannot be changed. */
-#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
+/** Number of WRR queues for best-effort traffic class per pipe.
+ *
+ * @see struct rte_sched_pipe_params
+ */
+#define RTE_SCHED_BE_QUEUES_PER_PIPE    8
 
-/** Number of queues per pipe. */
-#define RTE_SCHED_QUEUES_PER_PIPE             \
-	(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *     \
-	RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
+/** Number of traffic classes per pipe (as well as subport).
+ *
+ * @see struct rte_sched_subport_params
+ * @see struct rte_sched_pipe_params
+ */
+#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
+(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
 
-/** Maximum number of pipe profiles that can be defined per port.
+/** Maximum number of pipe profiles that can be defined per subport.
  * Compile-time configurable.
  */
 #ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT
@@ -95,6 +107,8 @@ extern "C" {
  *
  * The FCS is considered overhead only if not included in the packet
  * length (field pkt_len of struct rte_mbuf).
+ *
+ * @see struct rte_sched_port_params
  */
 #ifndef RTE_SCHED_FRAME_OVERHEAD_DEFAULT
 #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
-- 
2.21.0



* [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-07-01 18:58       ` Dumitrescu, Cristian
  2019-07-01 19:12       ` Dumitrescu, Cristian
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 03/28] sched: update internal " Jasvinder Singh
                       ` (28 subsequent siblings)
  30 siblings, 2 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update subport and pipe data structures to allow configuration
flexibility for pipe traffic classes and queues, and subport level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c        |   2 +-
 examples/qos_sched/init.c    |   2 +-
 lib/librte_sched/rte_sched.h | 126 +++++++++++++++++++++++------------
 3 files changed, 85 insertions(+), 45 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index 49bb9ea6f..d6651d490 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
 		.tc_rate = {305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index 1209bd7ce..f6e9af16b 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -186,7 +186,7 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
 		.tc_ov_weight = 1,
 #endif
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 470a0036a..ebde07669 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -114,6 +114,35 @@ extern "C" {
 #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
 #endif
 
+/*
+ * Pipe configuration parameters. The period and credits_per_period
+ * parameters are measured in bytes, with one byte meaning the time
+ * duration associated with the transmission of one byte on the
+ * physical medium of the output port, with pipe or pipe traffic class
+ * rate (measured as percentage of output port rate) determined as
+ * credits_per_period divided by period. One credit represents one
+ * byte.
+ */
+struct rte_sched_pipe_params {
+	/** Token bucket rate (measured in bytes per second) */
+	uint32_t tb_rate;
+	/** Token bucket size (measured in credits) */
+	uint32_t tb_size;
+
+	/** Traffic class rates (measured in bytes per second) */
+	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+
+	/** Enforcement period (measured in milliseconds) */
+	uint32_t tc_period;
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+	/** Best-effort traffic class oversubscription weight */
+	uint8_t tc_ov_weight;
+#endif
+
+	/** WRR weights of best-effort traffic class queues */
+	uint8_t wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE];
+};
+
 /*
  * Subport configuration parameters. The period and credits_per_period
  * parameters are measured in bytes, with one byte meaning the time
@@ -124,15 +153,44 @@ extern "C" {
  * byte.
  */
 struct rte_sched_subport_params {
-	/* Subport token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
+	/** Token bucket rate (measured in bytes per second) */
+	uint32_t tb_rate;
+
+	/** Token bucket size (measured in credits) */
+	uint32_t tb_size;
 
-	/* Subport traffic classes */
+	/** Traffic class rates (measured in bytes per second) */
 	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Traffic class rates (measured in bytes per second) */
+
+	/** Enforcement period for rates (measured in milliseconds) */
 	uint32_t tc_period;
-	/**< Enforcement period for rates (measured in milliseconds) */
+
+	/** Number of subport_pipes */
+	uint32_t n_subport_pipes;
+
+	/** Packet queue size for each traffic class.
+	 * All the pipes within the same subport share the similar
+	 * configuration for the queues. Queues which are not needed, have
+	 * zero size.
+	 */
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
+
+	/** Pipe profile table.
+	 * Every pipe is configured using one of the profiles from this table.
+	 */
+	struct rte_sched_pipe_params *pipe_profiles;
+
+	/** Profiles in the pipe profile table */
+	uint32_t n_pipe_profiles;
+
+	/** Max profiles allowed in the pipe profile table */
+	uint32_t n_max_pipe_profiles;
+#ifdef RTE_SCHED_RED
+	/** RED parameters */
+	struct rte_red_params
+		red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+
+#endif
 };
 
 /** Subport statistics */
@@ -155,33 +213,6 @@ struct rte_sched_subport_stats {
 #endif
 };
 
-/*
- * Pipe configuration parameters. The period and credits_per_period
- * parameters are measured in bytes, with one byte meaning the time
- * duration associated with the transmission of one byte on the
- * physical medium of the output port, with pipe or pipe traffic class
- * rate (measured as percentage of output port rate) determined as
- * credits_per_period divided by period. One credit represents one
- * byte.
- */
-struct rte_sched_pipe_params {
-	/* Pipe token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
-
-	/* Pipe traffic classes */
-	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Traffic class rates (measured in bytes per second) */
-	uint32_t tc_period;
-	/**< Enforcement period (measured in milliseconds) */
-#ifdef RTE_SCHED_SUBPORT_TC_OV
-	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3 oversubscription */
-#endif
-
-	/* Pipe queues */
-	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR weights */
-};
-
 /** Queue statistics */
 struct rte_sched_queue_stats {
 	/* Packets */
@@ -198,16 +229,25 @@ struct rte_sched_queue_stats {
 
 /** Port configuration parameters. */
 struct rte_sched_port_params {
-	const char *name;                /**< String to be associated */
-	int socket;                      /**< CPU socket ID */
-	uint32_t rate;                   /**< Output port rate
-					  * (measured in bytes per second) */
-	uint32_t mtu;                    /**< Maximum Ethernet frame size
-					  * (measured in bytes).
-					  * Should not include the framing overhead. */
-	uint32_t frame_overhead;         /**< Framing overhead per packet
-					  * (measured in bytes) */
-	uint32_t n_subports_per_port;    /**< Number of subports */
+	/** Name of the port to be associated */
+	const char *name;
+
+	/** CPU socket ID */
+	int socket;
+
+	/** Output port rate (measured in bytes per second) */
+	uint32_t rate;
+
+	/** Maximum Ethernet frame size (measured in bytes).
+	 * Should not include the framing overhead.
+	 */
+	uint32_t mtu;
+
+	/** Framing overhead per packet (measured in bytes) */
+	uint32_t frame_overhead;
+
+	/** Number of subports */
+	uint32_t n_subports_per_port;
 	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
 	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	/**< Packet queue size for each traffic class.
-- 
2.21.0



* [dpdk-dev] [PATCH v2 03/28] sched: update internal data structures
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures Jasvinder Singh
@ 2019-06-25 15:31     ` " Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 04/28] sched: update port config API Jasvinder Singh
                       ` (27 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update internal data structures of the scheduler to allow configuration
flexibility for pipe traffic classes and queues, and subport level
configuration of the pipe parameters.
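
One of the fields that moves into the subport structure is the queue
base table (qsize_add[] and qsize_sum). A minimal sketch of how such a
table can be derived from per-queue sizes is given below. It is an
assumption about the general technique (a running sum), not the code
from this series; struct qbase_sketch and qbase_compute() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors RTE_SCHED_QUEUES_PER_PIPE from the patch. */
#define SKETCH_QUEUES_PER_PIPE 16

/* Hypothetical container for the two queue base fields. */
struct qbase_sketch {
	uint32_t qsize_add[SKETCH_QUEUES_PER_PIPE]; /* slot offset per queue */
	uint32_t qsize_sum;                         /* slots per pipe */
};

/*
 * Fill qsize_add[] with the running sum of the queue sizes, so that
 * qsize_add[i] is the offset of queue i inside the pipe's slot array.
 * A queue of size zero gets the same offset as its successor.
 */
static void
qbase_compute(const uint16_t qsize[SKETCH_QUEUES_PER_PIPE],
	struct qbase_sketch *out)
{
	uint32_t offset = 0;
	unsigned int i;

	assert(qsize != NULL && out != NULL);

	for (i = 0; i < SKETCH_QUEUES_PER_PIPE; i++) {
		out->qsize_add[i] = offset;
		offset += qsize[i];
	}

	out->qsize_sum = offset;
}
```

A loop of this form works for any mix of queue sizes, which is what a
per-queue qsize[] array with optional zero entries requires.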

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 162 +++++++++++++++++++++++------------
 1 file changed, 109 insertions(+), 53 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index a60ddf97e..c81d59947 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -37,6 +37,8 @@
 
 #define RTE_SCHED_TB_RATE_CONFIG_ERR          (1e-7)
 #define RTE_SCHED_WRR_SHIFT                   3
+#define RTE_SCHED_MAX_QUEUES_PER_TC           RTE_SCHED_BE_QUEUES_PER_PIPE
+#define RTE_SCHED_TRAFFIC_CLASS_BE            (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
 #define RTE_SCHED_GRINDER_PCACHE_SIZE         (64 / RTE_SCHED_QUEUES_PER_PIPE)
 #define RTE_SCHED_PIPE_INVALID                UINT32_MAX
 #define RTE_SCHED_BMP_POS_INVALID             UINT32_MAX
@@ -46,6 +48,52 @@
  */
 #define RTE_SCHED_TIME_SHIFT		      8
 
+enum grinder_state {
+	e_GRINDER_PREFETCH_PIPE = 0,
+	e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS,
+	e_GRINDER_PREFETCH_MBUF,
+	e_GRINDER_READ_MBUF
+};
+
+struct rte_sched_grinder {
+	/* Pipe cache */
+	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
+	uint32_t pcache_qindex[RTE_SCHED_GRINDER_PCACHE_SIZE];
+	uint32_t pcache_w;
+	uint32_t pcache_r;
+
+	/* Current pipe */
+	uint32_t pindex;
+	struct rte_sched_pipe *pipe;
+	struct rte_sched_pipe_profile *pipe_params;
+	struct rte_sched_subport *subport;
+
+	/* Grinder state*/
+	enum grinder_state state;
+	uint32_t productive;
+
+	/* TC cache */
+	uint8_t tccache_qmask[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t tccache_qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t tccache_w;
+	uint32_t tccache_r;
+
+	/* Current TC */
+	uint32_t tc_index;
+	uint32_t qpos;
+	struct rte_sched_queue *queue[RTE_SCHED_MAX_QUEUES_PER_TC];
+	struct rte_mbuf **qbase[RTE_SCHED_MAX_QUEUES_PER_TC];
+	uint32_t qindex[RTE_SCHED_MAX_QUEUES_PER_TC];
+	uint16_t qsize;
+	uint32_t qmask;
+	struct rte_mbuf *pkt;
+
+	/* WRR */
+	uint16_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
+	uint16_t wrr_mask[RTE_SCHED_BE_QUEUES_PER_PIPE];
+	uint8_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
+};
+
 struct rte_sched_subport {
 	/* Token bucket (TB) */
 	uint64_t tb_time; /* time of last update */
@@ -71,7 +119,42 @@ struct rte_sched_subport {
 
 	/* Statistics */
 	struct rte_sched_subport_stats stats;
-};
+
+	/* Subport Pipes*/
+	uint32_t n_subport_pipes;
+
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
+	uint32_t n_pipe_profiles;
+	uint32_t n_max_pipe_profiles;
+	uint32_t pipe_tc_be_rate_max;
+#ifdef RTE_SCHED_RED
+	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+#endif
+
+	/* Scheduling loop detection */
+	uint32_t pipe_loop;
+	uint32_t pipe_exhaustion;
+
+	/* Bitmap */
+	struct rte_bitmap *bmp;
+	uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS] __rte_aligned_16;
+
+	/* Grinders */
+	struct rte_sched_grinder grinder[RTE_SCHED_PORT_N_GRINDERS];
+	uint32_t busy_grinders;
+
+	/* Queue base calculation */
+	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
+	uint32_t qsize_sum;
+
+	struct rte_sched_pipe *pipe;
+	struct rte_sched_queue *queue;
+	struct rte_sched_queue_extra *queue_extra;
+	struct rte_sched_pipe_profile *pipe_profiles;
+	uint8_t *bmp_array;
+	struct rte_mbuf **queue_array;
+	uint8_t memory[0] __rte_cache_aligned;
+} __rte_cache_aligned;
 
 struct rte_sched_pipe_profile {
 	/* Token bucket (TB) */
@@ -84,8 +167,10 @@ struct rte_sched_pipe_profile {
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint8_t tc_ov_weight;
 
-	/* Pipe queues */
-	uint8_t  wrr_cost[RTE_SCHED_QUEUES_PER_PIPE];
+	/* Pipe best-effort traffic class queues */
+	uint8_t n_be_queues;
+
+	uint8_t  wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
 struct rte_sched_pipe {
@@ -100,8 +185,10 @@ struct rte_sched_pipe {
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 
+	uint8_t n_be_queues; /* Best effort traffic class queues */
+
 	/* Weighted Round Robin (WRR) */
-	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
+	uint8_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
 
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
@@ -121,55 +208,12 @@ struct rte_sched_queue_extra {
 #endif
 };
 
-enum grinder_state {
-	e_GRINDER_PREFETCH_PIPE = 0,
-	e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS,
-	e_GRINDER_PREFETCH_MBUF,
-	e_GRINDER_READ_MBUF
-};
-
-struct rte_sched_grinder {
-	/* Pipe cache */
-	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
-	uint32_t pcache_qindex[RTE_SCHED_GRINDER_PCACHE_SIZE];
-	uint32_t pcache_w;
-	uint32_t pcache_r;
-
-	/* Current pipe */
-	enum grinder_state state;
-	uint32_t productive;
-	uint32_t pindex;
-	struct rte_sched_subport *subport;
-	struct rte_sched_pipe *pipe;
-	struct rte_sched_pipe_profile *pipe_params;
-
-	/* TC cache */
-	uint8_t tccache_qmask[4];
-	uint32_t tccache_qindex[4];
-	uint32_t tccache_w;
-	uint32_t tccache_r;
-
-	/* Current TC */
-	uint32_t tc_index;
-	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint16_t qsize;
-	uint32_t qmask;
-	uint32_t qpos;
-	struct rte_mbuf *pkt;
-
-	/* WRR */
-	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-};
-
 struct rte_sched_port {
 	/* User parameters */
 	uint32_t n_subports_per_port;
 	uint32_t n_pipes_per_subport;
 	uint32_t n_pipes_per_subport_log2;
+	int socket;
 	uint32_t rate;
 	uint32_t mtu;
 	uint32_t frame_overhead;
@@ -199,6 +243,9 @@ struct rte_sched_port {
 	uint32_t busy_grinders;
 	struct rte_mbuf **pkts_out;
 	uint32_t n_pkts_out;
+	uint32_t subport_id;
+
+	uint32_t max_subport_pipes_log2;   /* Max number of subport pipes */
 
 	/* Queue base calculation */
 	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
@@ -212,6 +259,7 @@ struct rte_sched_port {
 	struct rte_sched_pipe_profile *pipe_profiles;
 	uint8_t *bmp_array;
 	struct rte_mbuf **queue_array;
+	struct rte_sched_subport *subports[0];
 	uint8_t memory[0] __rte_cache_aligned;
 } __rte_cache_aligned;
 
@@ -226,6 +274,16 @@ enum rte_sched_port_array {
 	e_RTE_SCHED_PORT_ARRAY_TOTAL,
 };
 
+enum rte_sched_subport_array {
+	e_RTE_SCHED_SUBPORT_ARRAY_PIPE = 0,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA,
+	e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES,
+	e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY,
+	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY,
+	e_RTE_SCHED_SUBPORT_ARRAY_TOTAL,
+};
+
 #ifdef RTE_SCHED_COLLECT_STATS
 
 static inline uint32_t
@@ -483,7 +541,7 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
 		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
 		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu]\n",
+		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu],\n",
 		i,
 
 		/* Token bucket */
@@ -502,10 +560,8 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_ov_weight,
 
 		/* WRR */
-		p->wrr_cost[ 0], p->wrr_cost[ 1], p->wrr_cost[ 2], p->wrr_cost[ 3],
-		p->wrr_cost[ 4], p->wrr_cost[ 5], p->wrr_cost[ 6], p->wrr_cost[ 7],
-		p->wrr_cost[ 8], p->wrr_cost[ 9], p->wrr_cost[10], p->wrr_cost[11],
-		p->wrr_cost[12], p->wrr_cost[13], p->wrr_cost[14], p->wrr_cost[15]);
+		p->wrr_cost[0], p->wrr_cost[1], p->wrr_cost[2], p->wrr_cost[3],
+		p->wrr_cost[4], p->wrr_cost[5], p->wrr_cost[6], p->wrr_cost[7]);
 }
 
 static inline uint64_t
-- 
2.21.0



* [dpdk-dev] [PATCH v2 04/28] sched: update port config API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (2 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 03/28] sched: update internal " Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 05/28] sched: update port free API Jasvinder Singh
                       ` (26 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update port configuration API implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 223 +++++++----------------------------
 1 file changed, 41 insertions(+), 182 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index c81d59947..aea938899 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -366,57 +366,39 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 static int
 rte_sched_port_check_params(struct rte_sched_port_params *params)
 {
-	uint32_t i;
-
-	if (params == NULL)
-		return -1;
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
 
 	/* socket */
-	if (params->socket < 0)
-		return -3;
+	if (params->socket < 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for socket id \n", __func__);
+		return -EINVAL;
+	}
 
 	/* rate */
-	if (params->rate == 0)
-		return -4;
+	if (params->rate == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for rate \n", __func__);
+		return -EINVAL;
+	}
 
 	/* mtu */
-	if (params->mtu == 0)
-		return -5;
+	if (params->mtu == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for mtu \n", __func__);
+		return -EINVAL;
+	}
 
 	/* n_subports_per_port: non-zero, limited to 16 bits, power of 2 */
 	if (params->n_subports_per_port == 0 ||
-	    params->n_subports_per_port > 1u << 16 ||
-	    !rte_is_power_of_2(params->n_subports_per_port))
-		return -6;
-
-	/* n_pipes_per_subport: non-zero, power of 2 */
-	if (params->n_pipes_per_subport == 0 ||
-	    !rte_is_power_of_2(params->n_pipes_per_subport))
-		return -7;
-
-	/* qsize: non-zero, power of 2,
-	 * no bigger than 32K (due to 16-bit read/write pointers)
-	 */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint16_t qsize = params->qsize[i];
-
-		if (qsize == 0 || !rte_is_power_of_2(qsize))
-			return -8;
-	}
-
-	/* pipe_profiles and n_pipe_profiles */
-	if (params->pipe_profiles == NULL ||
-	    params->n_pipe_profiles == 0 ||
-	    params->n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_PORT)
-		return -9;
-
-	for (i = 0; i < params->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
-		int status;
-
-		status = pipe_profile_check(p, params->rate);
-		if (status != 0)
-			return status;
+	    !rte_is_power_of_2(params->n_subports_per_port)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for number of subports \n", __func__);
+		return -EINVAL;
 	}
 
 	return 0;
@@ -502,36 +484,6 @@ rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
 	return size0 + size1;
 }
 
-static void
-rte_sched_port_config_qsize(struct rte_sched_port *port)
-{
-	/* TC 0 */
-	port->qsize_add[0] = 0;
-	port->qsize_add[1] = port->qsize_add[0] + port->qsize[0];
-	port->qsize_add[2] = port->qsize_add[1] + port->qsize[0];
-	port->qsize_add[3] = port->qsize_add[2] + port->qsize[0];
-
-	/* TC 1 */
-	port->qsize_add[4] = port->qsize_add[3] + port->qsize[0];
-	port->qsize_add[5] = port->qsize_add[4] + port->qsize[1];
-	port->qsize_add[6] = port->qsize_add[5] + port->qsize[1];
-	port->qsize_add[7] = port->qsize_add[6] + port->qsize[1];
-
-	/* TC 2 */
-	port->qsize_add[8] = port->qsize_add[7] + port->qsize[1];
-	port->qsize_add[9] = port->qsize_add[8] + port->qsize[2];
-	port->qsize_add[10] = port->qsize_add[9] + port->qsize[2];
-	port->qsize_add[11] = port->qsize_add[10] + port->qsize[2];
-
-	/* TC 3 */
-	port->qsize_add[12] = port->qsize_add[11] + port->qsize[2];
-	port->qsize_add[13] = port->qsize_add[12] + port->qsize[3];
-	port->qsize_add[14] = port->qsize_add[13] + port->qsize[3];
-	port->qsize_add[15] = port->qsize_add[14] + port->qsize[3];
-
-	port->qsize_sum = port->qsize_add[15] + port->qsize[3];
-}
-
 static void
 rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 {
@@ -638,84 +590,37 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 	}
 }
 
-static void
-rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
-	struct rte_sched_port_params *params)
-{
-	uint32_t i;
-
-	for (i = 0; i < port->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		struct rte_sched_pipe_profile *dst = port->pipe_profiles + i;
-
-		rte_sched_pipe_profile_convert(src, dst, params->rate);
-		rte_sched_port_log_pipe_profile(port, i);
-	}
-
-	port->pipe_tc3_rate_max = 0;
-	for (i = 0; i < port->n_pipe_profiles; i++) {
-		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		uint32_t pipe_tc3_rate = src->tc_rate[3];
-
-		if (port->pipe_tc3_rate_max < pipe_tc3_rate)
-			port->pipe_tc3_rate_max = pipe_tc3_rate;
-	}
-}
-
 struct rte_sched_port *
 rte_sched_port_config(struct rte_sched_port_params *params)
 {
 	struct rte_sched_port *port = NULL;
-	uint32_t mem_size, bmp_mem_size, n_queues_per_port, i, cycles_per_byte;
+	uint32_t size0, size1;
+	uint32_t cycles_per_byte;
+	int status;
 
-	/* Check user parameters. Determine the amount of memory to allocate */
-	mem_size = rte_sched_port_get_memory_footprint(params);
-	if (mem_size == 0)
+	status = rte_sched_port_check_params(params);
+	if (status != 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Port scheduler params check failed (%d)\n",
+			__func__, status);
 		return NULL;
+	}
+
+	size0 = sizeof(struct rte_sched_port);
+	size1 = params->n_subports_per_port * sizeof(struct rte_sched_subport *);
 
 	/* Allocate memory to store the data structures */
-	port = rte_zmalloc_socket("qos_params", mem_size, RTE_CACHE_LINE_SIZE,
-		params->socket);
+	port = rte_zmalloc_socket("qos_params", size0 + size1,
+		RTE_CACHE_LINE_SIZE, params->socket);
 	if (port == NULL)
 		return NULL;
 
-	/* compile time checks */
-	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS == 0);
-	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS & (RTE_SCHED_PORT_N_GRINDERS - 1));
-
 	/* User parameters */
 	port->n_subports_per_port = params->n_subports_per_port;
-	port->n_pipes_per_subport = params->n_pipes_per_subport;
-	port->n_pipes_per_subport_log2 =
-			__builtin_ctz(params->n_pipes_per_subport);
+	port->socket = params->socket;
 	port->rate = params->rate;
 	port->mtu = params->mtu + params->frame_overhead;
 	port->frame_overhead = params->frame_overhead;
-	memcpy(port->qsize, params->qsize, sizeof(params->qsize));
-	port->n_pipe_profiles = params->n_pipe_profiles;
-
-#ifdef RTE_SCHED_RED
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint32_t j;
-
-		for (j = 0; j < RTE_COLORS; j++) {
-			/* if min/max are both zero, then RED is disabled */
-			if ((params->red_params[i][j].min_th |
-			     params->red_params[i][j].max_th) == 0) {
-				continue;
-			}
-
-			if (rte_red_config_init(&port->red_config[i][j],
-				params->red_params[i][j].wq_log2,
-				params->red_params[i][j].min_th,
-				params->red_params[i][j].max_th,
-				params->red_params[i][j].maxp_inv) != 0) {
-				rte_free(port);
-				return NULL;
-			}
-		}
-	}
-#endif
 
 	/* Timing */
 	port->time_cpu_cycles = rte_get_tsc_cycles();
@@ -726,57 +631,11 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 		/ params->rate;
 	port->inv_cycles_per_byte = rte_reciprocal_value(cycles_per_byte);
 
-	/* Scheduling loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
-	port->pipe_exhaustion = 0;
-
-	/* Grinders */
-	port->busy_grinders = 0;
 	port->pkts_out = NULL;
 	port->n_pkts_out = 0;
+	port->subport_id = 0;
 
-	/* Queue base calculation */
-	rte_sched_port_config_qsize(port);
-
-	/* Large data structures */
-	port->subport = (struct rte_sched_subport *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_SUBPORT));
-	port->pipe = (struct rte_sched_pipe *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_PIPE));
-	port->queue = (struct rte_sched_queue *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE));
-	port->queue_extra = (struct rte_sched_queue_extra *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA));
-	port->pipe_profiles = (struct rte_sched_pipe_profile *)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES));
-	port->bmp_array =  port->memory
-		+ rte_sched_port_get_array_base(params, e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY);
-	port->queue_array = (struct rte_mbuf **)
-		(port->memory + rte_sched_port_get_array_base(params,
-							      e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY));
-
-	/* Pipe profile table */
-	rte_sched_port_config_pipe_profile_table(port, params);
-
-	/* Bitmap */
-	n_queues_per_port = rte_sched_port_queues_per_port(port);
-	bmp_mem_size = rte_bitmap_get_memory_footprint(n_queues_per_port);
-	port->bmp = rte_bitmap_init(n_queues_per_port, port->bmp_array,
-				    bmp_mem_size);
-	if (port->bmp == NULL) {
-		RTE_LOG(ERR, SCHED, "Bitmap init error\n");
-		rte_free(port);
-		return NULL;
-	}
-
-	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++)
-		port->grinder_base_bmp_pos[i] = RTE_SCHED_PIPE_INVALID;
-
+	port->max_subport_pipes_log2 = 0;
 
 	return port;
 }
-- 
2.21.0
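
Note on the allocation in rte_sched_port_config() above: after this change the
port object only carries its own header plus one pointer slot per subport
(size0 + size1), and the per-subport state is allocated later. The following is
a minimal sketch of that layout, not the library code. The demo_* names, the
flexible array member and the use of calloc() in place of rte_zmalloc_socket()
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real structures live in rte_sched.c. */
struct demo_subport {
	uint32_t n_pipes;
};

struct demo_port {
	uint32_t n_subports;
	struct demo_subport *subports[]; /* one slot per subport, filled later */
};

/* Non-zero power of two, the property the params check requires. */
static int
demo_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Allocate the port header plus a zeroed table of subport pointers. */
static struct demo_port *
demo_port_alloc(uint32_t n_subports)
{
	size_t size0 = sizeof(struct demo_port);
	size_t size1 = (size_t)n_subports * sizeof(struct demo_subport *);
	struct demo_port *port;

	if (!demo_is_power_of_2(n_subports))
		return NULL;

	/* calloc() stands in for rte_zmalloc_socket(): memory comes back zeroed. */
	port = calloc(1, size0 + size1);
	if (port == NULL)
		return NULL;

	port->n_subports = n_subports;
	return port;
}
```

Because the table starts out zeroed, a free routine can walk all slots and skip
subports that were never configured, provided it checks each entry for NULL.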



* [dpdk-dev] [PATCH v2 05/28] sched: update port free API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (3 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 04/28] sched: update port config API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 06/28] sched: update subport config API Jasvinder Singh
                       ` (25 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the port free API implementation to allow configuration
flexibility for pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 71 ++++++++++++++++++++++++++++--------
 1 file changed, 55 insertions(+), 16 deletions(-)
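
Note on queue addressing: rte_sched_subport_qbase() in this patch splits a
queue index into a pipe index (qindex >> 4) and a position within the pipe
(qindex & 0xF), then uses per-subport running offsets to find the queue's
first slot. The sketch below shows that arithmetic on slot indices rather than
mbuf pointers. The demo_* names are hypothetical, and the offset table is built
the same way the later subport config patch builds qsize_add.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_QUEUES_PER_PIPE 16

/* Hypothetical subset of the per-subport state used for addressing. */
struct demo_layout {
	uint16_t qsize[DEMO_QUEUES_PER_PIPE];     /* slots per queue, 0 = unused */
	uint32_t qsize_add[DEMO_QUEUES_PER_PIPE]; /* offset of each queue in a pipe */
	uint32_t qsize_sum;                       /* slots used by one whole pipe */
};

/* Build the running offsets: qsize_add[i] = qsize[0] + ... + qsize[i-1]. */
static void
demo_layout_init(struct demo_layout *l)
{
	uint32_t i;

	l->qsize_add[0] = 0;
	for (i = 1; i < DEMO_QUEUES_PER_PIPE; i++)
		l->qsize_add[i] = l->qsize_add[i - 1] + l->qsize[i - 1];

	l->qsize_sum = l->qsize_add[DEMO_QUEUES_PER_PIPE - 1] +
		l->qsize[DEMO_QUEUES_PER_PIPE - 1];
}

/* Slot index of the first entry of queue qindex in the flat array. */
static uint32_t
demo_queue_base(const struct demo_layout *l, uint32_t qindex)
{
	uint32_t pindex = qindex / DEMO_QUEUES_PER_PIPE; /* same as qindex >> 4 */
	uint32_t qpos = qindex % DEMO_QUEUES_PER_PIPE;   /* same as qindex & 0xF */

	return pindex * l->qsize_sum + l->qsize_add[qpos];
}
```

A queue with qsize 0 occupies no slots, so its offset equals that of the next
queue; this is how unused queues avoid consuming memory.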

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index aea938899..22db212ed 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -294,6 +294,29 @@ rte_sched_port_queues_per_subport(struct rte_sched_port *port)
 
 #endif
 
+static inline uint32_t
+rte_sched_subport_queues(struct rte_sched_subport *subport)
+{
+	return RTE_SCHED_QUEUES_PER_PIPE * subport->n_subport_pipes;
+}
+
+static inline struct rte_mbuf **
+rte_sched_subport_qbase(struct rte_sched_subport *subport, uint32_t qindex)
+{
+	uint32_t pindex = qindex >> 4;
+	uint32_t qpos = qindex & 0xF;
+
+	return (subport->queue_array + pindex *
+		subport->qsize_sum + subport->qsize_add[qpos]);
+}
+
+static inline uint16_t
+rte_sched_subport_qsize(struct rte_sched_subport *subport, uint32_t qindex)
+{
+	uint32_t qpos = qindex & 0xF;
+	return subport->qsize[qpos];
+}
+
 static inline uint32_t
 rte_sched_port_queues_per_port(struct rte_sched_port *port)
 {
@@ -640,31 +663,47 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 	return port;
 }
 
-void
-rte_sched_port_free(struct rte_sched_port *port)
+static inline void
+rte_sched_subport_free(struct rte_sched_subport *subport)
 {
+	uint32_t n_subport_queues;
 	uint32_t qindex;
-	uint32_t n_queues_per_port;
 
-	/* Check user parameters */
-	if (port == NULL)
+	if (subport == NULL)
 		return;
 
-	n_queues_per_port = rte_sched_port_queues_per_port(port);
+	n_subport_queues = rte_sched_subport_queues(subport);
 
 	/* Free enqueued mbufs */
-	for (qindex = 0; qindex < n_queues_per_port; qindex++) {
-		struct rte_mbuf **mbufs = rte_sched_port_qbase(port, qindex);
-		uint16_t qsize = rte_sched_port_qsize(port, qindex);
-		struct rte_sched_queue *queue = port->queue + qindex;
-		uint16_t qr = queue->qr & (qsize - 1);
-		uint16_t qw = queue->qw & (qsize - 1);
-
-		for (; qr != qw; qr = (qr + 1) & (qsize - 1))
-			rte_pktmbuf_free(mbufs[qr]);
+	for (qindex = 0; qindex < n_subport_queues; qindex++) {
+		struct rte_mbuf **mbufs =
+			rte_sched_subport_qbase(subport, qindex);
+		uint16_t qsize = rte_sched_subport_qsize(subport, qindex);
+		if (qsize != 0) {
+			struct rte_sched_queue *queue = subport->queue + qindex;
+			uint16_t qr = queue->qr & (qsize - 1);
+			uint16_t qw = queue->qw & (qsize - 1);
+
+			for (; qr != qw; qr = (qr + 1) & (qsize - 1))
+				rte_pktmbuf_free(mbufs[qr]);
+		}
 	}
 
-	rte_bitmap_free(port->bmp);
+	rte_bitmap_free(subport->bmp);
+}
+
+void
+rte_sched_port_free(struct rte_sched_port *port)
+{
+	uint32_t i;
+
+	/* Check user parameters */
+	if (port == NULL)
+		return;
+
+	for (i = 0; i < port->n_subports_per_port; i++)
+		rte_sched_subport_free(port->subports[i]);
+
 	rte_free(port);
 }
 
-- 
2.21.0
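
Note on the drain loop in rte_sched_subport_free() above: each queue is a
power-of-two ring, so the read and write counters are masked with (qsize - 1)
and the loop walks from the read position to the write position, wrapping as
needed. The sketch below reproduces that loop shape with a visitor callback
instead of rte_pktmbuf_free(). All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Example visitor state: records the order in which slots are seen. */
struct demo_trace {
	uint16_t slots[16];
	uint32_t count;
};

static void
demo_trace_visit(uint16_t slot, void *arg)
{
	struct demo_trace *t = arg;

	if (t->count < 16)
		t->slots[t->count++] = slot;
}

/*
 * Visit every pending slot of a power-of-two ring, oldest first.
 * qr and qw are free-running read/write counters; qsize must be zero
 * (unused queue) or a power of two. Returns the number of slots visited.
 */
static uint32_t
demo_ring_drain(uint16_t qr, uint16_t qw, uint16_t qsize,
	void (*visit)(uint16_t slot, void *arg), void *arg)
{
	uint16_t mask;
	uint32_t n = 0;

	if (qsize == 0)
		return 0; /* unused queue: nothing was ever stored */

	mask = qsize - 1;
	qr &= mask;
	qw &= mask;

	for (; qr != qw; qr = (qr + 1) & mask) {
		visit(qr, arg);
		n++;
	}
	return n;
}
```

With this loop shape, a ring whose masked read and write positions coincide is
treated as empty. Whether a queue can ever hold exactly qsize entries at free
time is not visible from this patch, so that case is only flagged here.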



* [dpdk-dev] [PATCH v2 06/28] sched: update subport config API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (4 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 05/28] sched: update port free API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 07/28] sched: update pipe profile add API Jasvinder Singh
                       ` (24 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the subport configuration API implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 603 +++++++++++++++++++++++++++++------
 lib/librte_sched/rte_sched.h |   3 +
 2 files changed, 507 insertions(+), 99 deletions(-)
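
Note on the WRR costs: rte_sched_pipe_wrr_queues_config() in this patch
derives each best-effort queue's cost as the least common multiple of the
active weights divided by that queue's weight, with one unrolled branch per
queue count. The sketch below expresses the same arithmetic as a loop over n
weights. It is an illustration only: the demo_* names are hypothetical, the
uint8_t weight type is an assumption, and local gcd/lcm helpers are used
instead of rte_get_lcd().

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
demo_gcd(uint32_t a, uint32_t b)
{
	while (b != 0) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

static uint32_t
demo_lcm(uint32_t a, uint32_t b)
{
	return a / demo_gcd(a, b) * b;
}

/*
 * Turn n non-zero WRR weights into per-queue costs:
 * cost[i] = L / weight[i], where L is the least common multiple of all
 * n weights, so a larger weight gets a smaller cost.
 * Assumes every weight is non-zero and that L fits in 32 bits.
 */
static void
demo_wrr_costs(const uint8_t *weights, uint8_t *costs, uint32_t n)
{
	uint32_t lcm = 1;
	uint32_t i;

	for (i = 0; i < n; i++)
		lcm = demo_lcm(lcm, weights[i]);

	/* The cast truncates to 8 bits, as the casts in the patch do. */
	for (i = 0; i < n; i++)
		costs[i] = (uint8_t)(lcm / weights[i]);
}
```

One difference from the patch: for a single queue this sketch yields a cost of
1, whereas the patch copies the weight unchanged in that case.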

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 22db212ed..46a98ad63 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -343,44 +343,49 @@ rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
 
 static int
 pipe_profile_check(struct rte_sched_pipe_params *params,
-	uint32_t rate)
+	uint32_t rate, uint16_t *qsize)
 {
 	uint32_t i;
 
 	/* Pipe parameters */
 	if (params == NULL)
-		return -10;
+		return -11;
 
 	/* TB rate: non-zero, not greater than port rate */
 	if (params->tb_rate == 0 ||
 		params->tb_rate > rate)
-		return -11;
+		return -12;
 
 	/* TB size: non-zero */
 	if (params->tb_size == 0)
-		return -12;
+		return -13;
 
 	/* TC rate: non-zero, less than pipe rate */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-			params->tc_rate[i] > params->tb_rate)
-			return -13;
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		if ((qsize[i] == 0 && params->tc_rate[i] != 0) ||
+			(qsize[i] != 0 && (params->tc_rate[i] == 0 ||
+			params->tc_rate[i] > params->tb_rate)))
+			return -14;
 	}
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
+		return -15;
 
 	/* TC period: non-zero */
 	if (params->tc_period == 0)
-		return -14;
+		return -16;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC3 oversubscription weight: non-zero */
 	if (params->tc_ov_weight == 0)
-		return -15;
+		return -17;
 #endif
 
 	/* Queue WRR weights: non-zero */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
-		if (params->wrr_weights[i] == 0)
-			return -16;
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+		uint32_t qindex = RTE_SCHED_TRAFFIC_CLASS_BE + i;
+		if ((qsize[qindex] != 0 && params->wrr_weights[i] == 0) ||
+			(qsize[qindex] == 0 && params->wrr_weights[i] != 0))
+			return -18;
 	}
 
 	return 0;
@@ -487,36 +492,181 @@ rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sch
 	return base;
 }
 
-uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+static uint32_t
+rte_sched_subport_get_array_base(struct rte_sched_subport_params *params,
+	enum rte_sched_subport_array array)
 {
-	uint32_t size0, size1;
-	int status;
+	uint32_t n_subport_pipes = params->n_subport_pipes;
+	uint32_t n_subport_queues = RTE_SCHED_QUEUES_PER_PIPE * n_subport_pipes;
 
-	status = rte_sched_port_check_params(params);
-	if (status != 0) {
-		RTE_LOG(NOTICE, SCHED,
-			"Port scheduler params check failed (%d)\n", status);
+	uint32_t size_pipe = n_subport_pipes * sizeof(struct rte_sched_pipe);
+	uint32_t size_queue = n_subport_queues * sizeof(struct rte_sched_queue);
+	uint32_t size_queue_extra
+		= n_subport_queues * sizeof(struct rte_sched_queue_extra);
+	uint32_t size_pipe_profiles = params->n_max_pipe_profiles *
+		sizeof(struct rte_sched_pipe_profile);
+	uint32_t size_bmp_array =
+		rte_bitmap_get_memory_footprint(n_subport_queues);
+	uint32_t size_per_pipe_queue_array, size_queue_array;
 
-		return 0;
+	uint32_t base, i;
+
+	size_per_pipe_queue_array = 0;
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		size_per_pipe_queue_array += params->qsize[i] * sizeof(struct rte_mbuf *);
 	}
+	size_queue_array = n_subport_pipes * size_per_pipe_queue_array;
 
-	size0 = sizeof(struct rte_sched_port);
-	size1 = rte_sched_port_get_array_base(params, e_RTE_SCHED_PORT_ARRAY_TOTAL);
+	base = 0;
 
-	return size0 + size1;
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_PIPE)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_pipe);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue_extra);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_pipe_profiles);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_bmp_array);
+
+	if (array == e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY)
+		return base;
+	base += RTE_CACHE_LINE_ROUNDUP(size_queue_array);
+
+	return base;
+}
+
+static void
+rte_sched_pipe_wrr_queues_config(struct rte_sched_pipe_params *src,
+	struct rte_sched_pipe_profile *dst)
+{
+	uint32_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
+
+	if (dst->n_be_queues == 1) {
+		dst->wrr_cost[0] = (uint8_t) src->wrr_weights[0];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 2) {
+		uint32_t lcd;
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+
+		lcd = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+
+		wrr_cost[0] = lcd / wrr_cost[0];
+		wrr_cost[1] = lcd / wrr_cost[1];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 4) {
+		uint32_t lcd, lcd1, lcd2;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+		wrr_cost[2] = src->wrr_weights[2];
+		wrr_cost[3] = src->wrr_weights[3];
+
+		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
+		lcd = rte_get_lcd(lcd1, lcd2);
+
+		wrr_cost[0] = lcd / wrr_cost[0];
+		wrr_cost[1] = lcd / wrr_cost[1];
+		wrr_cost[2] = lcd / wrr_cost[2];
+		wrr_cost[3] = lcd / wrr_cost[3];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+		dst->wrr_cost[2] = (uint8_t) wrr_cost[2];
+		dst->wrr_cost[3] = (uint8_t) wrr_cost[3];
+
+		return;
+	}
+
+	if (dst->n_be_queues == 8) {
+		uint32_t lcd1, lcd2, lcd3, lcd4, lcd5, lcd6, lcd7;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+		wrr_cost[2] = src->wrr_weights[2];
+		wrr_cost[3] = src->wrr_weights[3];
+		wrr_cost[4] = src->wrr_weights[4];
+		wrr_cost[5] = src->wrr_weights[5];
+		wrr_cost[6] = src->wrr_weights[6];
+		wrr_cost[7] = src->wrr_weights[7];
+
+		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
+		lcd3 = rte_get_lcd(wrr_cost[4], wrr_cost[5]);
+		lcd4 = rte_get_lcd(wrr_cost[6], wrr_cost[7]);
+
+		lcd5 = rte_get_lcd(lcd1, lcd2);
+		lcd6 = rte_get_lcd(lcd3, lcd4);
+
+		lcd7 = rte_get_lcd(lcd5, lcd6);
+
+		wrr_cost[0] = lcd7 / wrr_cost[0];
+		wrr_cost[1] = lcd7 / wrr_cost[1];
+		wrr_cost[2] = lcd7 / wrr_cost[2];
+		wrr_cost[3] = lcd7 / wrr_cost[3];
+		wrr_cost[4] = lcd7 / wrr_cost[4];
+		wrr_cost[5] = lcd7 / wrr_cost[5];
+		wrr_cost[6] = lcd7 / wrr_cost[6];
+		wrr_cost[7] = lcd7 / wrr_cost[7];
+
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+		dst->wrr_cost[2] = (uint8_t) wrr_cost[2];
+		dst->wrr_cost[3] = (uint8_t) wrr_cost[3];
+		dst->wrr_cost[4] = (uint8_t) wrr_cost[4];
+		dst->wrr_cost[5] = (uint8_t) wrr_cost[5];
+		dst->wrr_cost[6] = (uint8_t) wrr_cost[6];
+		dst->wrr_cost[7] = (uint8_t) wrr_cost[7];
+
+		return;
+	}
+}
+
+static void
+rte_sched_subport_config_qsize(struct rte_sched_subport *subport)
+{
+	uint32_t i;
+
+	subport->qsize_add[0] = 0;
+
+	for (i = 1; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+		subport->qsize_add[i] =
+			subport->qsize_add[i-1] + subport->qsize[i-1];
+
+	subport->qsize_sum = subport->qsize_add[15] + subport->qsize[15];
 }
 
 static void
-rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
+rte_sched_port_log_pipe_profile(struct rte_sched_subport *subport, uint32_t i)
 {
-	struct rte_sched_pipe_profile *p = port->pipe_profiles + i;
+	struct rte_sched_pipe_profile *p = subport->pipe_profiles + i;
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for pipe profile %u:\n"
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu],\n",
+		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"    Traffic class BE oversubscription: weight = %hhu\n"
+		"    WRR cost: [%hhu, %hhu, %hhu, %hhu, %hhu, %hhu, %hhu, %hhu]\n",
 		i,
 
 		/* Token bucket */
@@ -530,6 +680,11 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_credits_per_period[1],
 		p->tc_credits_per_period[2],
 		p->tc_credits_per_period[3],
+		p->tc_credits_per_period[4],
+		p->tc_credits_per_period[5],
+		p->tc_credits_per_period[6],
+		p->tc_credits_per_period[7],
+		p->tc_credits_per_period[8],
 
 		/* Traffic class 3 oversubscription */
 		p->tc_ov_weight,
@@ -550,7 +705,8 @@ rte_sched_time_ms_to_bytes(uint32_t time_ms, uint32_t rate)
 }
 
 static void
-rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
+rte_sched_pipe_profile_convert(struct rte_sched_subport *subport,
+	struct rte_sched_pipe_params *src,
 	struct rte_sched_pipe_profile *dst,
 	uint32_t rate)
 {
@@ -576,43 +732,193 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 						rate);
 
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		dst->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(src->tc_period,
-				src->tc_rate[i]);
+		if (subport->qsize[i])
+			dst->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(src->tc_period,
+					src->tc_rate[i]);
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	dst->tc_ov_weight = src->tc_ov_weight;
 #endif
 
-	/* WRR */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint32_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-		uint32_t lcd, lcd1, lcd2;
-		uint32_t qindex;
+	/* WRR queues */
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++)
+		if (subport->qsize[RTE_SCHED_TRAFFIC_CLASS_BE + i])
+			dst->n_be_queues++;
 
-		qindex = i * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+	rte_sched_pipe_wrr_queues_config(src, dst);
+}
 
-		wrr_cost[0] = src->wrr_weights[qindex];
-		wrr_cost[1] = src->wrr_weights[qindex + 1];
-		wrr_cost[2] = src->wrr_weights[qindex + 2];
-		wrr_cost[3] = src->wrr_weights[qindex + 3];
+static void
+rte_sched_subport_config_pipe_profile_table(struct rte_sched_subport *subport,
+	struct rte_sched_subport_params *params, uint32_t rate)
+{
+	uint32_t i;
 
-		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
-		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
-		lcd = rte_get_lcd(lcd1, lcd2);
+	for (i = 0; i < subport->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
+		struct rte_sched_pipe_profile *dst = subport->pipe_profiles + i;
 
-		wrr_cost[0] = lcd / wrr_cost[0];
-		wrr_cost[1] = lcd / wrr_cost[1];
-		wrr_cost[2] = lcd / wrr_cost[2];
-		wrr_cost[3] = lcd / wrr_cost[3];
+		rte_sched_pipe_profile_convert(subport, src, dst, rate);
+		rte_sched_port_log_pipe_profile(subport, i);
+	}
+
+	subport->pipe_tc_be_rate_max = 0;
+	for (i = 0; i < subport->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
+		uint32_t pipe_tc_be_rate = src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-		dst->wrr_cost[qindex] = (uint8_t) wrr_cost[0];
-		dst->wrr_cost[qindex + 1] = (uint8_t) wrr_cost[1];
-		dst->wrr_cost[qindex + 2] = (uint8_t) wrr_cost[2];
-		dst->wrr_cost[qindex + 3] = (uint8_t) wrr_cost[3];
+		if (subport->pipe_tc_be_rate_max < pipe_tc_be_rate)
+			subport->pipe_tc_be_rate_max = pipe_tc_be_rate;
 	}
 }
 
+static int
+rte_sched_subport_check_params(struct rte_sched_subport_params *params,
+	uint32_t rate)
+{
+	uint32_t i, j;
+
+	/* Check user parameters */
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
+
+	if (params->tb_rate == 0 || params->tb_rate > rate) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb rate \n", __func__);
+		return -EINVAL;
+	}
+
+	if (params->tb_size == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb size \n", __func__);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+		if (params->tc_rate[i] > params->tb_rate) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for tc rate \n", __func__);
+			return -EINVAL;
+		}
+
+	if (params->tc_period == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc period \n", __func__);
+		return -EINVAL;
+	}
+
+	/* n_subport_pipes: non-zero, power of 2 */
+	if (params->n_subport_pipes == 0 ||
+	    !rte_is_power_of_2(params->n_subport_pipes)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for pipes number \n", __func__);
+		return -EINVAL;
+	}
+
+	/* qsize: power of 2, if non-zero
+	 * no bigger than 32K (due to 16-bit read/write pointers)
+	 */
+	for (i = 0, j = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		uint32_t tc_rate = params->tc_rate[j];
+		uint16_t qsize = params->qsize[i];
+
+		if (((qsize == 0) &&
+			((tc_rate != 0) && (j != RTE_SCHED_TRAFFIC_CLASS_BE))) ||
+			((qsize != 0) &&
+			((tc_rate == 0) || !rte_is_power_of_2(qsize)))) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for tc rate \n", __func__);
+			return -EINVAL;
+		}
+		if (j < RTE_SCHED_TRAFFIC_CLASS_BE)
+			j++;
+	}
+
+	/* WRR queues: 1, 2, 4 or 8 */
+	uint32_t wrr_queues = 0;
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+		if (params->qsize[RTE_SCHED_TRAFFIC_CLASS_BE + i])
+			wrr_queues++;
+	}
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] &&
+		(wrr_queues != 1 && wrr_queues != 2 &&
+		wrr_queues != 4 && wrr_queues != 8)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for wrr weights \n", __func__);
+		return -EINVAL;
+	}
+
+	/* pipe_profiles and n_pipe_profiles */
+	if (params->pipe_profiles == NULL ||
+	    params->n_pipe_profiles == 0 ||
+		 params->n_pipe_profiles > params->n_max_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for number of pipe profiles \n", __func__);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static uint32_t
+rte_sched_subport_get_memory_footprint(struct rte_sched_port *port,
+	uint32_t subport_id, struct rte_sched_subport_params *params)
+{
+	uint32_t size0, size1;
+	int status;
+
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return 0;
+	}
+
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return 0;
+	}
+
+	status = rte_sched_subport_check_params(params, port->rate);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Subport scheduler params check failed (%d)\n", status);
+
+		return 0;
+	}
+
+	size0 = sizeof(struct rte_sched_subport);
+	size1 = rte_sched_subport_get_array_base(params,
+			e_RTE_SCHED_SUBPORT_ARRAY_TOTAL);
+
+	return size0 + size1;
+}
+
+uint32_t
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+{
+	uint32_t size0, size1;
+	int status;
+
+	status = rte_sched_port_check_params(params);
+	if (status != 0) {
+		RTE_LOG(NOTICE, SCHED,
+			"Port scheduler params check failed (%d)\n", status);
+
+		return 0;
+	}
+
+	size0 = sizeof(struct rte_sched_port);
+	size1 = rte_sched_port_get_array_base(params,
+			e_RTE_SCHED_PORT_ARRAY_TOTAL);
+
+	return size0 + size1;
+}
+
 struct rte_sched_port *
 rte_sched_port_config(struct rte_sched_port_params *params)
 {
@@ -710,12 +1016,12 @@ rte_sched_port_free(struct rte_sched_port *port)
 static void
 rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 {
-	struct rte_sched_subport *s = port->subport + i;
+	struct rte_sched_subport *s = port->subports[i];
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for subport %u:\n"
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: wm min = %u, wm max = %u\n",
+		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"    Traffic class BE oversubscription: wm min = %u, wm max = %u\n",
 		i,
 
 		/* Token bucket */
@@ -729,8 +1035,13 @@ rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 		s->tc_credits_per_period[1],
 		s->tc_credits_per_period[2],
 		s->tc_credits_per_period[3],
+		s->tc_credits_per_period[4],
+		s->tc_credits_per_period[5],
+		s->tc_credits_per_period[6],
+		s->tc_credits_per_period[7],
+		s->tc_credits_per_period[8],
 
-		/* Traffic class 3 oversubscription */
+		/* Traffic class BE oversubscription */
 		s->tc_ov_wm_min,
 		s->tc_ov_wm_max);
 }
@@ -740,31 +1051,26 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	uint32_t subport_id,
 	struct rte_sched_subport_params *params)
 {
-	struct rte_sched_subport *s;
-	uint32_t i;
+	struct rte_sched_subport *s = NULL;
+	uint32_t mem_size, bmp_mem_size, n_subport_queues, n_subport_pipes_log2, i;
 
-	/* Check user parameters */
-	if (port == NULL ||
-	    subport_id >= port->n_subports_per_port ||
-	    params == NULL)
-		return -1;
-
-	if (params->tb_rate == 0 || params->tb_rate > port->rate)
-		return -2;
-
-	if (params->tb_size == 0)
-		return -3;
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-		    params->tc_rate[i] > params->tb_rate)
-			return -4;
+	/* Check user parameters. Determine the amount of memory to allocate */
+	mem_size = rte_sched_subport_get_memory_footprint(port,
+		subport_id, params);
+	if (mem_size == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Fail to determine the memory to allocate \n", __func__);
+		return -EINVAL;
 	}
 
-	if (params->tc_period == 0)
-		return -5;
-
-	s = port->subport + subport_id;
+	/* Allocate memory to store the data structures */
+	s = rte_zmalloc_socket("subport_params", mem_size, RTE_CACHE_LINE_SIZE,
+		port->socket);
+	if (s == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Memory allocation fails \n", __func__);
+		return -ENOMEM;
+	}
 
 	/* Token Bucket (TB) */
 	if (params->tb_rate == port->rate) {
@@ -784,19 +1090,110 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	/* Traffic Classes (TCs) */
 	s->tc_period = rte_sched_time_ms_to_bytes(params->tc_period, port->rate);
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		s->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(params->tc_period,
-						     params->tc_rate[i]);
+		if (params->qsize[i])
+			s->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(params->tc_period,
+					params->tc_rate[i]);
 	}
 	s->tc_time = port->time + s->tc_period;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		s->tc_credits[i] = s->tc_credits_per_period[i];
+		if (params->qsize[i])
+			s->tc_credits[i] = s->tc_credits_per_period[i];
+
+	/* compile time checks */
+	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS == 0);
+	RTE_BUILD_BUG_ON(RTE_SCHED_PORT_N_GRINDERS &
+		(RTE_SCHED_PORT_N_GRINDERS - 1));
+
+	/* User parameters */
+	s->n_subport_pipes = params->n_subport_pipes;
+	n_subport_pipes_log2 = __builtin_ctz(params->n_subport_pipes);
+	memcpy(s->qsize, params->qsize, sizeof(params->qsize));
+	s->n_pipe_profiles = params->n_pipe_profiles;
+	s->n_max_pipe_profiles = params->n_max_pipe_profiles;
+
+#ifdef RTE_SCHED_RED
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+		uint32_t j;
+
+		for (j = 0; j < RTE_COLORS; j++) {
+			/* if min/max are both zero, then RED is disabled */
+			if ((params->red_params[i][j].min_th |
+			     params->red_params[i][j].max_th) == 0) {
+				continue;
+			}
+
+			if (rte_red_config_init(&s->red_config[i][j],
+				params->red_params[i][j].wq_log2,
+				params->red_params[i][j].min_th,
+				params->red_params[i][j].max_th,
+				params->red_params[i][j].maxp_inv) != 0) {
+				rte_free(s);
+				return -3;
+			}
+		}
+	}
+#endif
+
+	/* Scheduling loop detection */
+	s->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	s->pipe_exhaustion = 0;
+
+	/* Grinders */
+	s->busy_grinders = 0;
+
+	/* Queue base calculation */
+	rte_sched_subport_config_qsize(s);
+
+	/* Large data structures */
+	s->pipe = (struct rte_sched_pipe *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_PIPE));
+	s->queue = (struct rte_sched_queue *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE));
+	s->queue_extra = (struct rte_sched_queue_extra *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_EXTRA));
+	s->pipe_profiles = (struct rte_sched_pipe_profile *)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_PIPE_PROFILES));
+	s->bmp_array =  s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_BMP_ARRAY);
+	s->queue_array = (struct rte_mbuf **)
+		(s->memory + rte_sched_subport_get_array_base(params,
+						e_RTE_SCHED_SUBPORT_ARRAY_QUEUE_ARRAY));
+
+	/* Pipe profile table */
+	rte_sched_subport_config_pipe_profile_table(s, params, port->rate);
+
+	/* Bitmap */
+	n_subport_queues = rte_sched_subport_queues(s);
+	bmp_mem_size = rte_bitmap_get_memory_footprint(n_subport_queues);
+	s->bmp = rte_bitmap_init(n_subport_queues, s->bmp_array,
+				bmp_mem_size);
+	if (s->bmp == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Subport bitmap init error \n", __func__);
+
+		rte_free(port);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++)
+		s->grinder_base_bmp_pos[i] = RTE_SCHED_PIPE_INVALID;
+
+	/* Port */
+	port->subports[subport_id] = s;
+
+	if (n_subport_pipes_log2 > port->max_subport_pipes_log2)
+		port->max_subport_pipes_log2 = n_subport_pipes_log2;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC oversubscription */
 	s->tc_ov_wm_min = port->mtu;
 	s->tc_ov_wm_max = rte_sched_time_ms_to_bytes(params->tc_period,
-						     port->pipe_tc3_rate_max);
+						     s->pipe_tc_be_rate_max);
 	s->tc_ov_wm = s->tc_ov_wm_max;
 	s->tc_ov_period_id = 0;
 	s->tc_ov = 0;
@@ -909,10 +1306,12 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 int __rte_experimental
 rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
+	uint32_t subport_id,
 	struct rte_sched_pipe_params *params,
 	uint32_t *pipe_profile_id)
 {
 	struct rte_sched_pipe_profile *pp;
+	struct rte_sched_subport *s;
 	uint32_t i;
 	int status;
 
@@ -920,31 +1319,37 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	if (port == NULL)
 		return -1;
 
-	/* Pipe profiles not exceeds the max limit */
-	if (port->n_pipe_profiles >= RTE_SCHED_PIPE_PROFILES_PER_PORT)
+	/* Subport id not exceeds the max limit */
+	if (subport_id >= port->n_subports_per_port)
 		return -2;
 
+	s = port->subports[subport_id];
+
+	/* Pipe profiles not exceeds the max limit */
+	if (s->n_pipe_profiles >= s->n_max_pipe_profiles)
+		return -3;
+
 	/* Pipe params */
-	status = pipe_profile_check(params, port->rate);
+	status = pipe_profile_check(params, port->rate, &s->qsize[0]);
 	if (status != 0)
 		return status;
 
-	pp = &port->pipe_profiles[port->n_pipe_profiles];
-	rte_sched_pipe_profile_convert(params, pp, port->rate);
+	pp = &s->pipe_profiles[s->n_pipe_profiles];
+	rte_sched_pipe_profile_convert(s, params, pp, port->rate);
 
 	/* Pipe profile not exists */
-	for (i = 0; i < port->n_pipe_profiles; i++)
-		if (memcmp(port->pipe_profiles + i, pp, sizeof(*pp)) == 0)
-			return -3;
+	for (i = 0; i < s->n_pipe_profiles; i++)
+		if (memcmp(s->pipe_profiles + i, pp, sizeof(*pp)) == 0)
+			return -4;
 
 	/* Pipe profile commit */
-	*pipe_profile_id = port->n_pipe_profiles;
-	port->n_pipe_profiles++;
+	*pipe_profile_id = s->n_pipe_profiles;
+	s->n_pipe_profiles++;
 
-	if (port->pipe_tc3_rate_max < params->tc_rate[3])
-		port->pipe_tc3_rate_max = params->tc_rate[3];
+	if (s->pipe_tc_be_rate_max < params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE])
+		s->pipe_tc_be_rate_max = params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-	rte_sched_port_log_pipe_profile(port, *pipe_profile_id);
+	rte_sched_port_log_pipe_profile(s, *pipe_profile_id);
 
 	return 0;
 }
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index ebde07669..6ebc592ff 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -296,6 +296,8 @@ rte_sched_port_free(struct rte_sched_port *port);
  *
  * @param port
  *   Handle to port scheduler instance
+ * @param subport_id
+ *   Subport ID
  * @param params
  *   Pipe profile parameters
  * @param pipe_profile_id
@@ -305,6 +307,7 @@ rte_sched_port_free(struct rte_sched_port *port);
  */
 int __rte_experimental
 rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
+	uint32_t subport_id,
 	struct rte_sched_pipe_params *params,
 	uint32_t *pipe_profile_id);
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
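
The profile-add path in the patch above (limit check, duplicate check with memcmp(), then commit) can be sketched in isolation as follows. This is a minimal illustration, not the library code: `demo_profile`, `demo_subport` and `demo_profile_add` are hypothetical names, the profile struct is reduced to three fields, and the return codes are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified profile. The real struct rte_sched_pipe_profile
 * has more fields. Only uint32_t members are used here, so the struct has
 * no padding and memcmp() compares exactly the stored values. */
struct demo_profile {
	uint32_t tb_rate;
	uint32_t tb_size;
	uint32_t tc_period;
};

/* Hypothetical per-subport profile table (n_max_profiles must be <= 8). */
struct demo_subport {
	uint32_t n_profiles;
	uint32_t n_max_profiles;
	struct demo_profile profiles[8];
};

/* Returns 0 and stores the new index in *id on success, -1 if the table
 * is full, -2 if an identical profile is already stored. */
static int
demo_profile_add(struct demo_subport *s, const struct demo_profile *p,
	uint32_t *id)
{
	uint32_t i;

	/* Table must have room for one more profile */
	if (s->n_profiles >= s->n_max_profiles)
		return -1;

	/* Reject a profile identical to one already in the table */
	for (i = 0; i < s->n_profiles; i++)
		if (memcmp(&s->profiles[i], p, sizeof(*p)) == 0)
			return -2;

	/* Commit */
	s->profiles[s->n_profiles] = *p;
	*id = s->n_profiles++;
	return 0;
}
```

Keeping the table inside the subport, rather than the port, is what lets two subports hold different profile sets; the duplicate check then only needs to scan the profiles of that one subport.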

* [dpdk-dev] [PATCH v2 07/28] sched: update pipe profile add API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (5 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 06/28] sched: update subport config API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 08/28] sched: update pipe config API Jasvinder Singh
                       ` (23 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the pipe profile add API implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 104 ++++++++++++++++++++++++++---------
 1 file changed, 78 insertions(+), 26 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 46a98ad63..619b4763f 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -348,44 +348,69 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 	uint32_t i;
 
 	/* Pipe parameters */
-	if (params == NULL)
-		return -11;
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TB rate: non-zero, not greater than port rate */
 	if (params->tb_rate == 0 ||
-		params->tb_rate > rate)
-		return -12;
+		params->tb_rate > rate) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb rate \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TB size: non-zero */
-	if (params->tb_size == 0)
-		return -13;
+	if (params->tb_size == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb size \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TC rate: non-zero, less than pipe rate */
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
 		if ((qsize[i] == 0 && params->tc_rate[i] != 0) ||
 			(qsize[i] != 0 && (params->tc_rate[i] == 0 ||
-			params->tc_rate[i] > params->tb_rate)))
-			return -14;
+			params->tc_rate[i] > params->tb_rate))) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for qsize or tc_rate \n", __func__);
+			return -EINVAL;
+		}
+	}
+
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for be traffic class rate \n", __func__);
+		return -EINVAL;
 	}
-	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
-		return -15;
 
 	/* TC period: non-zero */
-	if (params->tc_period == 0)
-		return -16;
+	if (params->tc_period == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc period \n", __func__);
+		return -EINVAL;
+	}
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC3 oversubscription weight: non-zero */
-	if (params->tc_ov_weight == 0)
-		return -17;
+	if (params->tc_ov_weight == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc ov weight \n", __func__);
+		return -EINVAL;
+	}
 #endif
 
 	/* Queue WRR weights: non-zero */
 	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
 		uint32_t qindex = RTE_SCHED_TRAFFIC_CLASS_BE + i;
 		if ((qsize[qindex] != 0 && params->wrr_weights[i] == 0) ||
-			(qsize[qindex] == 0 && params->wrr_weights[i] != 0))
-			return -18;
+			(qsize[qindex] == 0 && params->wrr_weights[i] != 0)) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for qsize or wrr weight \n", __func__);
+			return -EINVAL;
+		}
 	}
 
 	return 0;
@@ -861,6 +886,18 @@ rte_sched_subport_check_params(struct rte_sched_subport_params *params,
 		return -EINVAL;
 	}
 
+	for (i = 0; i < params->n_pipe_profiles; i++) {
+		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
+		int status;
+
+		status = pipe_profile_check(p, rate, &params->qsize[0]);
+		if (status != 0) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Pipe profile check failed(%d) \n", __func__, status);
+			return -EINVAL;
+		}
+	}
+
 	return 0;
 }
 
@@ -1316,31 +1353,46 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	int status;
 
 	/* Port */
-	if (port == NULL)
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
 
 	/* Subport id not exceeds the max limit */
-	if (subport_id > port->n_subports_per_port)
-		return -2;
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return -EINVAL;
+	}
 
 	s = port->subports[subport_id];
 
 	/* Pipe profiles not exceeds the max limit */
-	if (s->n_pipe_profiles >= s->n_max_pipe_profiles)
-		return -3;
+	if (s->n_pipe_profiles >= s->n_max_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Number of pipe profiles exceeds the max limit \n", __func__);
+		return -EINVAL;
+	}
 
 	/* Pipe params */
 	status = pipe_profile_check(params, port->rate, &s->qsize[0]);
-	if (status != 0)
-		return status;
+	if (status != 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Pipe profile check failed(%d) \n", __func__, status);
+		return -EINVAL;
+	}
 
 	pp = &s->pipe_profiles[s->n_pipe_profiles];
 	rte_sched_pipe_profile_convert(s, params, pp, port->rate);
 
 	/* Pipe profile not exists */
 	for (i = 0; i < s->n_pipe_profiles; i++)
-		if (memcmp(s->pipe_profiles + i, pp, sizeof(*pp)) == 0)
-			return -4;
+		if (memcmp(s->pipe_profiles + i, pp, sizeof(*pp)) == 0) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Pipe profile already exists \n", __func__);
+			return -EINVAL;
+		}
 
 	/* Pipe profile commit */
 	*pipe_profile_id = s->n_pipe_profiles;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
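
The strict-priority traffic class rule enforced by pipe_profile_check() in the patch above (a class with no queue must have a zero rate; a class with a queue must have a non-zero rate no greater than the pipe token bucket rate) can be sketched on its own. This is an illustrative reduction: `DEMO_N_TC` and `demo_tc_rate_check` are hypothetical names, and the class count of 4 is an assumption chosen for the example.

```c
#include <stdint.h>

#define DEMO_N_TC 4 /* hypothetical number of strict-priority classes */

/* Returns 0 when every class rate is consistent with its queue size,
 * -1 otherwise. */
static int
demo_tc_rate_check(const uint16_t qsize[DEMO_N_TC],
	const uint32_t tc_rate[DEMO_N_TC], uint32_t tb_rate)
{
	uint32_t i;

	for (i = 0; i < DEMO_N_TC; i++) {
		if (qsize[i] == 0) {
			/* Disabled class: no rate may be assigned */
			if (tc_rate[i] != 0)
				return -1;
		} else if (tc_rate[i] == 0 || tc_rate[i] > tb_rate) {
			/* Enabled class: rate must be in (0, tb_rate] */
			return -1;
		}
	}

	return 0;
}
```

For example, with queue sizes {64, 0, 64, 64} and a token bucket rate of 1000, the rates {500, 0, 1000, 1} pass, while assigning any non-zero rate to the second class fails.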

* [dpdk-dev] [PATCH v2 08/28] sched: update pipe config API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (6 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 07/28] sched: update pipe profile add API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API Jasvinder Singh
                       ` (22 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the pipe configuration API implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 83 +++++++++++++++++++++++-------------
 lib/librte_sched/rte_sched.h |  2 +-
 2 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 619b4763f..1999bbfa3 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1258,40 +1258,59 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 	profile = (uint32_t) pipe_profile;
 	deactivate = (pipe_profile < 0);
 
-	if (port == NULL ||
-	    subport_id >= port->n_subports_per_port ||
-	    pipe_id >= port->n_pipes_per_subport ||
-	    (!deactivate && profile >= port->n_pipe_profiles))
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return -EINVAL;
+	}
 
+	if (pipe_id >= port->subports[subport_id]->n_subport_pipes) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for pipe id \n", __func__);
+		return -EINVAL;
+	}
 
-	/* Check that subport configuration is valid */
-	s = port->subport + subport_id;
-	if (s->tb_period == 0)
-		return -2;
+	if (!deactivate &&
+		profile >= port->subports[subport_id]->n_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for pipe profile \n", __func__);
+		return -EINVAL;
+	}
 
-	p = port->pipe + (subport_id * port->n_pipes_per_subport + pipe_id);
+	/* Check that subport configuration is valid */
+	s = port->subports[subport_id];
+	p = s->pipe + pipe_id;
 
 	/* Handle the case when pipe already has a valid configuration */
 	if (p->tb_time) {
-		params = port->pipe_profiles + p->profile;
+		params = s->pipe_profiles + p->profile;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc_be_rate =
+			(double) s->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc_be_rate =
+			(double) params->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		/* Unplug pipe from its subport */
 		s->tc_ov_n -= params->tc_ov_weight;
-		s->tc_ov_rate -= pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate -= pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is OFF (%.4lf >= %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is OFF (%.4lf >= %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 #endif
 
@@ -1304,34 +1323,40 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 	/* Apply the new pipe configuration */
 	p->profile = profile;
-	params = port->pipe_profiles + p->profile;
+	params = s->pipe_profiles + p->profile;
 
 	/* Token Bucket (TB) */
 	p->tb_time = port->time;
 	p->tb_credits = params->tb_size / 2;
 
 	/* Traffic Classes (TCs) */
+	p->n_be_queues = params->n_be_queues;
 	p->tc_time = port->time + params->tc_period;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		p->tc_credits[i] = params->tc_credits_per_period[i];
+		if (s->qsize[i])
+			p->tc_credits[i] = params->tc_credits_per_period[i];
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	{
 		/* Subport TC3 oversubscription */
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc_be_rate =
+			(double) s->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc_be_rate =
+			(double) params->tc_credits_per_period[
+					RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		s->tc_ov_n += params->tc_ov_weight;
-		s->tc_ov_rate += pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate += pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is ON (%.4lf < %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is ON (%.4lf < %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 		p->tc_ov_period_id = s->tc_ov_period_id;
 		p->tc_ov_credits = s->tc_ov_wm;
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 6ebc592ff..121e1f669 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -338,7 +338,7 @@ rte_sched_subport_config(struct rte_sched_port *port,
  * @param pipe_id
  *   Pipe ID within subport
  * @param pipe_profile
- *   ID of port-level pre-configured pipe profile
+ *   ID of subport-level pre-configured pipe profile
  * @return
  *   0 upon success, error code otherwise
  */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
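
The best-effort oversubscription bookkeeping in rte_sched_pipe_config() above (subtract the old pipe's BE rate when it is unplugged, add the new one when it is plugged, and compare the running sum against the subport BE rate) can be sketched as below. The struct and function names are hypothetical, and the rates are plain doubles rather than values derived from credits and periods.

```c
#include <stdint.h>

/* Hypothetical reduction of the subport oversubscription state. */
struct demo_subport_ov {
	double be_rate;    /* subport BE rate (credits per period / period) */
	double tc_ov_rate; /* sum of the BE rates of all plugged pipes */
	uint32_t tc_ov_n;  /* sum of the oversubscription weights */
	uint32_t tc_ov;    /* 1 while the BE class is oversubscribed */
};

/* Plug a pipe in. Returns 1 if the oversubscription state changed. */
static int
demo_pipe_plug(struct demo_subport_ov *s, double pipe_be_rate,
	uint32_t weight)
{
	uint32_t prev = s->tc_ov;

	s->tc_ov_n += weight;
	s->tc_ov_rate += pipe_be_rate;
	s->tc_ov = s->tc_ov_rate > s->be_rate;

	return s->tc_ov != prev;
}

/* Unplug a pipe. Returns 1 if the oversubscription state changed. */
static int
demo_pipe_unplug(struct demo_subport_ov *s, double pipe_be_rate,
	uint32_t weight)
{
	uint32_t prev = s->tc_ov;

	s->tc_ov_n -= weight;
	s->tc_ov_rate -= pipe_be_rate;
	s->tc_ov = s->tc_ov_rate > s->be_rate;

	return s->tc_ov != prev;
}
```

With a subport BE rate of 100, plugging one pipe at rate 60 leaves the subport undersubscribed; plugging a second pipe at rate 60 turns oversubscription on, and unplugging either pipe turns it off again. The state-change return value corresponds to the point where the patch emits its DEBUG log.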

* [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (7 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 08/28] sched: update pipe config API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-07-01 23:25       ` Dumitrescu, Cristian
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 10/28] sched: update subport and tc queue stats Jasvinder Singh
                       ` (21 subsequent siblings)
  30 siblings, 1 reply; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the run-time packet read and write API implementation
to allow configuration flexibility for pipe traffic classes
and queues, and subport-level configuration of the pipe
parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 32 +++++++++++++++++---------------
 lib/librte_sched/rte_sched.h |  8 ++++----
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 1999bbfa3..cd82fd918 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1433,17 +1433,15 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 
 static inline uint32_t
 rte_sched_port_qindex(struct rte_sched_port *port,
+	struct rte_sched_subport *s,
 	uint32_t subport,
 	uint32_t pipe,
-	uint32_t traffic_class,
 	uint32_t queue)
 {
 	return ((subport & (port->n_subports_per_port - 1)) <<
-			(port->n_pipes_per_subport_log2 + 4)) |
-			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
-			((traffic_class &
-			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) << 2) |
-			(queue & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
+			(port->max_subport_pipes_log2 + 4)) |
+			((pipe & (s->n_subport_pipes - 1)) << 4) |
+			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
 }
 
 void
@@ -1453,9 +1451,9 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
 			 uint32_t traffic_class,
 			 uint32_t queue, enum rte_color color)
 {
-	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
-			traffic_class, queue);
-	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
+	struct rte_sched_subport *s = port->subports[subport];
+	uint32_t qindex = rte_sched_port_qindex(port, s, subport, pipe, queue);
+	rte_mbuf_sched_set(pkt, qindex, traffic_class, (uint8_t)color);
 }
 
 void
@@ -1464,13 +1462,17 @@ rte_sched_port_pkt_read_tree_path(struct rte_sched_port *port,
 				  uint32_t *subport, uint32_t *pipe,
 				  uint32_t *traffic_class, uint32_t *queue)
 {
-	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
+	struct rte_sched_subport *s;
+	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
+
+	*subport = (qindex >> (port->max_subport_pipes_log2 + 4)) &
+		(port->n_subports_per_port - 1);
 
-	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
-	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
-	*traffic_class = (queue_id >> 2) &
-				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1);
-	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1);
+	s = port->subports[*subport];
+	*pipe = (qindex >> 4) & (s->n_subport_pipes - 1);
+	*traffic_class = tc_id;
+	*queue = qindex & (RTE_SCHED_QUEUES_PER_PIPE - 1);
 }
 
 enum rte_color
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 121e1f669..6a6ea84aa 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -421,9 +421,9 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 8)
  * @param queue
- *   Queue ID within pipe traffic class (0 .. 3)
+ *   Queue ID within pipe traffic class (0 .. 15)
  * @param color
  *   Packet color set
  */
@@ -448,9 +448,9 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 8)
  * @param queue
- *   Queue ID within pipe traffic class (0 .. 3)
+ *   Queue ID within pipe traffic class (0 .. 15)
  *
  */
 void
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
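
The queue index layout used by rte_sched_port_qindex() and rte_sched_port_pkt_read_tree_path() in the patch above packs the subport above `max_subport_pipes_log2 + 4` bits, the pipe above 4 bits, and the queue in the low 4 bits. A minimal round-trip sketch follows; `demo_port`, `demo_qindex` and `demo_qindex_decode` are hypothetical names, and the sketch assumes power-of-two subport and pipe counts and 16 queues per pipe, as the masks in the patch imply.

```c
#include <stdint.h>

/* Hypothetical reduction of the port fields used for index packing. */
struct demo_port {
	uint32_t n_subports;             /* power of two */
	uint32_t max_subport_pipes_log2; /* log2 of the largest pipe count */
};

/* Pack (subport, pipe, queue) into one queue index. n_subport_pipes is
 * the pipe count of this subport and must be a power of two. */
static uint32_t
demo_qindex(const struct demo_port *port, uint32_t n_subport_pipes,
	uint32_t subport, uint32_t pipe, uint32_t queue)
{
	return ((subport & (port->n_subports - 1)) <<
			(port->max_subport_pipes_log2 + 4)) |
		((pipe & (n_subport_pipes - 1)) << 4) |
		(queue & 15);
}

/* Inverse of demo_qindex() for the same port and subport pipe count. */
static void
demo_qindex_decode(const struct demo_port *port, uint32_t n_subport_pipes,
	uint32_t qindex, uint32_t *subport, uint32_t *pipe, uint32_t *queue)
{
	*subport = (qindex >> (port->max_subport_pipes_log2 + 4)) &
		(port->n_subports - 1);
	*pipe = (qindex >> 4) & (n_subport_pipes - 1);
	*queue = qindex & 15;
}
```

Shifting the subport by the port-wide maximum, rather than by each subport's own pipe count, keeps the subport field at a fixed bit position, so the subport can be decoded before its pipe count is known. The traffic class is no longer part of the index; in the patch it is carried separately in the mbuf.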

* [dpdk-dev] [PATCH v2 10/28] sched: update subport and tc queue stats
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (8 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API Jasvinder Singh
@ 2019-06-25 15:31     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 11/28] sched: update port memory footprint API Jasvinder Singh
                       ` (20 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:31 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the subport and TC queue stats API implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 74 +++++++++++++++++++++++-------------
 lib/librte_sched/rte_sched.h | 34 ++++++++++-------
 2 files changed, 68 insertions(+), 40 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index cd82fd918..7a4c7cf12 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1490,11 +1490,31 @@ rte_sched_subport_read_stats(struct rte_sched_port *port,
 	struct rte_sched_subport *s;
 
 	/* Check user parameters */
-	if (port == NULL || subport_id >= port->n_subports_per_port ||
-	    stats == NULL || tc_ov == NULL)
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
 
-	s = port->subport + subport_id;
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return -EINVAL;
+	}
+
+	if (stats == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter stats \n", __func__);
+		return -EINVAL;
+	}
+
+	if (tc_ov == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc_ov \n", __func__);
+		return -EINVAL;
+	}
+
+	s = port->subports[subport_id];
 
 	/* Copy subport stats and clear */
 	memcpy(stats, &s->stats, sizeof(struct rte_sched_subport_stats));
@@ -1550,10 +1570,10 @@ rte_sched_port_queue_is_empty(struct rte_sched_port *port, uint32_t qindex)
 #ifdef RTE_SCHED_COLLECT_STATS
 
 static inline void
-rte_sched_port_update_subport_stats(struct rte_sched_port *port, uint32_t qindex, struct rte_mbuf *pkt)
+rte_sched_port_update_subport_stats(struct rte_sched_subport *s,
+	struct rte_mbuf *pkt)
 {
-	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
 	s->stats.n_pkts_tc[tc_index] += 1;
@@ -1562,31 +1582,31 @@ rte_sched_port_update_subport_stats(struct rte_sched_port *port, uint32_t qindex
 
 #ifdef RTE_SCHED_RED
 static inline void
-rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port *port,
-						uint32_t qindex,
-						struct rte_mbuf *pkt, uint32_t red)
+rte_sched_port_update_subport_stats_on_drop(struct rte_sched_subport *subport,
+						struct rte_mbuf *pkt,
+						uint32_t red)
 #else
 static inline void
-rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port *port,
-						uint32_t qindex,
-						struct rte_mbuf *pkt, __rte_unused uint32_t red)
+rte_sched_port_update_subport_stats_on_drop(struct rte_sched_subport *subport,
+						struct rte_mbuf *pkt,
+						__rte_unused uint32_t red)
 #endif
 {
-	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
-	s->stats.n_pkts_tc_dropped[tc_index] += 1;
-	s->stats.n_bytes_tc_dropped[tc_index] += pkt_len;
+	subport->stats.n_pkts_tc_dropped[tc_index] += 1;
+	subport->stats.n_bytes_tc_dropped[tc_index] += pkt_len;
 #ifdef RTE_SCHED_RED
-	s->stats.n_pkts_red_dropped[tc_index] += red;
+	subport->stats.n_pkts_red_dropped[tc_index] += red;
 #endif
 }
 
 static inline void
-rte_sched_port_update_queue_stats(struct rte_sched_port *port, uint32_t qindex, struct rte_mbuf *pkt)
+rte_sched_port_update_queue_stats(struct rte_sched_subport *subport,
+	uint32_t qindex, struct rte_mbuf *pkt)
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	uint32_t pkt_len = pkt->pkt_len;
 
 	qe->stats.n_pkts += 1;
@@ -1595,17 +1615,19 @@ rte_sched_port_update_queue_stats(struct rte_sched_port *port, uint32_t qindex,
 
 #ifdef RTE_SCHED_RED
 static inline void
-rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port *port,
+rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 						uint32_t qindex,
-						struct rte_mbuf *pkt, uint32_t red)
+						struct rte_mbuf *pkt,
+						uint32_t red)
 #else
 static inline void
-rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port *port,
+rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 						uint32_t qindex,
-						struct rte_mbuf *pkt, __rte_unused uint32_t red)
+						struct rte_mbuf *pkt,
+						__rte_unused uint32_t red)
 #endif
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	uint32_t pkt_len = pkt->pkt_len;
 
 	qe->stats.n_pkts_dropped += 1;
@@ -1626,7 +1648,7 @@ rte_sched_port_red_drop(struct rte_sched_port *port, struct rte_mbuf *pkt, uint3
 	struct rte_red_config *red_cfg;
 	struct rte_red *red;
 	uint32_t tc_index;
-	enum rte_color color;
+	enum rte_meter_color color;
 
 	tc_index = (qindex >> 2) & 0x3;
 	color = rte_sched_port_pkt_read_color(pkt);
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 6a6ea84aa..05f518457 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -195,36 +195,42 @@ struct rte_sched_subport_params {
 
 /** Subport statistics */
 struct rte_sched_subport_stats {
-	/* Packets */
+	/** Number of packets successfully written */
 	uint32_t n_pkts_tc[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets successfully written */
+
+	/** Number of packets dropped */
 	uint32_t n_pkts_tc_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets dropped */
 
-	/* Bytes */
+	/** Number of bytes successfully written for each traffic class */
 	uint32_t n_bytes_tc[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of bytes successfully written for each traffic class */
+
+	/** Number of bytes dropped for each traffic class */
 	uint32_t n_bytes_tc_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of bytes dropped for each traffic class */
 
 #ifdef RTE_SCHED_RED
+	/** Number of packets dropped by red */
 	uint32_t n_pkts_red_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets dropped by red */
 #endif
 };
 
 /** Queue statistics */
 struct rte_sched_queue_stats {
-	/* Packets */
-	uint32_t n_pkts;                 /**< Packets successfully written */
-	uint32_t n_pkts_dropped;         /**< Packets dropped */
+	/** Packets successfully written */
+	uint32_t n_pkts;
+
+	/** Packets dropped */
+	uint32_t n_pkts_dropped;
+
 #ifdef RTE_SCHED_RED
-	uint32_t n_pkts_red_dropped;	 /**< Packets dropped by RED */
+	/** Packets dropped by RED */
+	uint32_t n_pkts_red_dropped;
 #endif
 
-	/* Bytes */
-	uint32_t n_bytes;                /**< Bytes successfully written */
-	uint32_t n_bytes_dropped;        /**< Bytes dropped */
+	/** Bytes successfully written */
+	uint32_t n_bytes;
+
+	/** Bytes dropped */
+	uint32_t n_bytes_dropped;
 };
 
 /** Port configuration parameters. */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* [dpdk-dev] [PATCH v2 11/28] sched: update port memory footprint API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (9 preceding siblings ...)
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 10/28] sched: update subport and tc queue stats Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 12/28] sched: update packet enqueue API Jasvinder Singh
                       ` (19 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the port memory footprint API implementation to allow
configuration flexibility for pipe traffic classes and
queues, and subport-level configuration of the pipe
parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 90 +++++++++---------------------------
 lib/librte_sched/rte_sched.h |  7 ++-
 2 files changed, 28 insertions(+), 69 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 7a4c7cf12..65c645df7 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -457,66 +457,6 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 	return 0;
 }
 
-static uint32_t
-rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sched_port_array array)
-{
-	uint32_t n_subports_per_port = params->n_subports_per_port;
-	uint32_t n_pipes_per_subport = params->n_pipes_per_subport;
-	uint32_t n_pipes_per_port = n_pipes_per_subport * n_subports_per_port;
-	uint32_t n_queues_per_port = RTE_SCHED_QUEUES_PER_PIPE * n_pipes_per_subport * n_subports_per_port;
-
-	uint32_t size_subport = n_subports_per_port * sizeof(struct rte_sched_subport);
-	uint32_t size_pipe = n_pipes_per_port * sizeof(struct rte_sched_pipe);
-	uint32_t size_queue = n_queues_per_port * sizeof(struct rte_sched_queue);
-	uint32_t size_queue_extra
-		= n_queues_per_port * sizeof(struct rte_sched_queue_extra);
-	uint32_t size_pipe_profiles
-		= RTE_SCHED_PIPE_PROFILES_PER_PORT * sizeof(struct rte_sched_pipe_profile);
-	uint32_t size_bmp_array = rte_bitmap_get_memory_footprint(n_queues_per_port);
-	uint32_t size_per_pipe_queue_array, size_queue_array;
-
-	uint32_t base, i;
-
-	size_per_pipe_queue_array = 0;
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		size_per_pipe_queue_array += RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS
-			* params->qsize[i] * sizeof(struct rte_mbuf *);
-	}
-	size_queue_array = n_pipes_per_port * size_per_pipe_queue_array;
-
-	base = 0;
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_SUBPORT)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_subport);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_PIPE)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_pipe);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue_extra);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_pipe_profiles);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_bmp_array);
-
-	if (array == e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY)
-		return base;
-	base += RTE_CACHE_LINE_ROUNDUP(size_queue_array);
-
-	return base;
-}
-
 static uint32_t
 rte_sched_subport_get_array_base(struct rte_sched_subport_params *params,
 	enum rte_sched_subport_array array)
@@ -936,22 +876,38 @@ rte_sched_subport_get_memory_footprint(struct rte_sched_port *port,
 }
 
 uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *port_params,
+	struct rte_sched_subport_params *subport_params)
 {
-	uint32_t size0, size1;
+	uint32_t size0 = 0, size1 = 0, i;
 	int status;
 
-	status = rte_sched_port_check_params(params);
+	status = rte_sched_port_check_params(port_params);
 	if (status != 0) {
-		RTE_LOG(NOTICE, SCHED,
-			"Port scheduler params check failed (%d)\n", status);
+		RTE_LOG(ERR, SCHED,
+			"%s: Port scheduler port params check failed (%d)\n",
+			__func__, status);
+
+		return 0;
+	}
+
+	status = rte_sched_subport_check_params(subport_params,
+				port_params->rate);
+	if (status != 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Port scheduler subport params check failed (%d)\n",
+			__func__, status);
 
 		return 0;
 	}
 
 	size0 = sizeof(struct rte_sched_port);
-	size1 = rte_sched_port_get_array_base(params,
-			e_RTE_SCHED_PORT_ARRAY_TOTAL);
+
+	for (i = 0; i < port_params->n_subports_per_port; i++) {
+		struct rte_sched_subport_params *sp = &subport_params[i];
+		size1 += rte_sched_subport_get_array_base(sp,
+			e_RTE_SCHED_SUBPORT_ARRAY_TOTAL);
+	}
 
 	return size0 + size1;
 }
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 05f518457..1f690036d 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -357,13 +357,16 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 /**
  * Hierarchical scheduler memory footprint size per port
  *
- * @param params
+ * @param port_params
  *   Port scheduler configuration parameter structure
+ * @param subport_params
+ *   Subport configuration parameter structure
  * @return
  *   Memory footprint size in bytes upon success, 0 otherwise
  */
 uint32_t
-rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params);
+rte_sched_port_get_memory_footprint(struct rte_sched_port_params *port_params,
+	struct rte_sched_subport_params *subport_params);
 
 /*
  * Statistics
-- 
2.21.0


* [dpdk-dev] [PATCH v2 12/28] sched: update packet enqueue API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (10 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 11/28] sched: update port memory footprint API Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 13/28] sched: update grinder pipe and tc cache Jasvinder Singh
                       ` (18 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the packet enqueue API implementation to allow configuration
flexibility for pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
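Note on the queue index split: the enqueue path in this patch derives both the
subport and the subport-relative queue from the single queue index stored in
the mbuf. A minimal sketch of that bit split follows, assuming 16 queues per
pipe (the "+ 4" in the diff) and a power-of-two subport count. The names
sketch_subport_id and sketch_queue_id are hypothetical and not part of
librte_sched.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the diff: 16 queues per pipe, i.e. 4 low bits. */
#define SKETCH_QUEUES_PER_PIPE_LOG2 4u

/* Subport index: the bits above the pipe and queue fields. */
static inline uint32_t
sketch_subport_id(uint32_t qindex, uint32_t pipes_log2, uint32_t n_subports)
{
	/* The mask only works when the subport count is a power of two. */
	assert(n_subports != 0 && (n_subports & (n_subports - 1)) == 0);

	return (qindex >> (pipes_log2 + SKETCH_QUEUES_PER_PIPE_LOG2)) &
		(n_subports - 1);
}

/* Queue index relative to the subport: pipe and queue fields only. */
static inline uint32_t
sketch_queue_id(uint32_t qindex, uint32_t pipes_log2)
{
	uint32_t width = pipes_log2 + SKETCH_QUEUES_PER_PIPE_LOG2;

	assert(width < 32);
	return qindex & ((UINT32_C(1) << width) - 1);
}
```

With 4096 pipes per subport (pipes_log2 = 12), a queue index built as
(3 << 16) | (5 << 4) | 9 splits into subport 3 and subport-relative queue
(5 << 4) | 9, that is, queue 9 of pipe 5.
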
 lib/librte_sched/rte_sched.c | 228 ++++++++++++++++++++++-------------
 1 file changed, 144 insertions(+), 84 deletions(-)
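
Note on the queue write: rte_sched_port_enqueue_qwa() relies on free-running
16-bit read/write counters and a power-of-two queue size, so the occupancy is
a plain unsigned subtraction and the slot is a mask of the write counter. A
minimal sketch of only the tail-drop part (RED and statistics left out)
follows. struct sketch_queue and sketch_enqueue are hypothetical names, not
the library's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring with free-running 16-bit read/write counters. */
struct sketch_queue {
	uint16_t qr; /* read counter */
	uint16_t qw; /* write counter */
};

/* Returns 1 if the item was stored, 0 if the queue was full (tail drop). */
static int
sketch_enqueue(struct sketch_queue *q, void **qbase, uint16_t qsize,
	void *item)
{
	/* Unsigned 16-bit subtraction stays correct across counter wrap. */
	uint16_t qlen = (uint16_t)(q->qw - q->qr);

	/* Masking requires a power-of-two queue size. */
	assert(qsize != 0 && (qsize & (qsize - 1)) == 0);

	if (qlen >= qsize)
		return 0;

	qbase[q->qw & (qsize - 1)] = item;
	q->qw++;
	return 1;
}
```

The counters are never reset; only their difference and their low bits are
used, which is why the wrap from 65535 to 0 needs no special case.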

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 65c645df7..cb96e0613 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1598,31 +1598,36 @@ rte_sched_port_update_queue_stats_on_drop(struct rte_sched_subport *subport,
 #ifdef RTE_SCHED_RED
 
 static inline int
-rte_sched_port_red_drop(struct rte_sched_port *port, struct rte_mbuf *pkt, uint32_t qindex, uint16_t qlen)
+rte_sched_port_red_drop(struct rte_sched_subport *subport,
+	struct rte_mbuf *pkt,
+	uint32_t qindex,
+	uint16_t qlen,
+	uint64_t time)
 {
 	struct rte_sched_queue_extra *qe;
 	struct rte_red_config *red_cfg;
 	struct rte_red *red;
 	uint32_t tc_index;
-	enum rte_meter_color color;
+	enum rte_color color;
 
-	tc_index = (qindex >> 2) & 0x3;
+	tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	color = rte_sched_port_pkt_read_color(pkt);
-	red_cfg = &port->red_config[tc_index][color];
+	red_cfg = &subport->red_config[tc_index][color];
 
 	if ((red_cfg->min_th | red_cfg->max_th) == 0)
 		return 0;
 
-	qe = port->queue_extra + qindex;
+	qe = subport->queue_extra + qindex;
 	red = &qe->red;
 
-	return rte_red_enqueue(red_cfg, red, qlen, port->time);
+	return rte_red_enqueue(red_cfg, red, qlen, time);
 }
 
 static inline void
-rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port, uint32_t qindex)
+rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t qindex)
 {
-	struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
+	struct rte_sched_queue_extra *qe = subport->queue_extra + qindex;
 	struct rte_red *red = &qe->red;
 
 	rte_red_mark_queue_empty(red, port->time);
@@ -1630,10 +1635,23 @@ rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port, uint32_t q
 
 #else
 
-#define rte_sched_port_red_drop(port, pkt, qindex, qlen)             0
-
-#define rte_sched_port_set_queue_empty_timestamp(port, qindex)
+static inline int rte_sched_port_red_drop(
+	struct rte_sched_subport *subport __rte_unused,
+	struct rte_mbuf *pkt __rte_unused,
+	uint32_t qindex __rte_unused,
+	uint16_t qlen __rte_unused,
+	uint64_t time __rte_unused)
+{
+	return 0;
+}
 
+static inline void
+rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port __rte_unused,
+	struct rte_sched_subport *subport __rte_unused,
+	uint32_t qindex __rte_unused)
+{
+	return;
+}
 #endif /* RTE_SCHED_RED */
 
 #ifdef RTE_SCHED_DEBUG
@@ -1665,63 +1683,71 @@ debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
 
 #endif /* RTE_SCHED_DEBUG */
 
+static inline struct rte_sched_subport *
+rte_sched_port_get_subport(struct rte_sched_port *port,
+	struct rte_mbuf *pkt)
+{
+	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+	uint32_t subport_id = (qindex >> (port->max_subport_pipes_log2 + 4)) &
+		(port->n_subports_per_port - 1);
+
+	return port->subports[subport_id];
+}
+
 static inline uint32_t
-rte_sched_port_enqueue_qptrs_prefetch0(struct rte_sched_port *port,
-				       struct rte_mbuf *pkt)
+rte_sched_port_enqueue_qptrs_prefetch0(struct rte_sched_subport *subport,
+	struct rte_mbuf *pkt, uint32_t bitwidth)
 {
 	struct rte_sched_queue *q;
 #ifdef RTE_SCHED_COLLECT_STATS
 	struct rte_sched_queue_extra *qe;
 #endif
 	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+	uint32_t queue_id = ((1 << (bitwidth + 4)) - 1) & qindex;
 
-	q = port->queue + qindex;
+	q = subport->queue + queue_id;
 	rte_prefetch0(q);
 #ifdef RTE_SCHED_COLLECT_STATS
-	qe = port->queue_extra + qindex;
+	qe = subport->queue_extra + queue_id;
 	rte_prefetch0(qe);
 #endif
 
-	return qindex;
+	return queue_id;
 }
 
 static inline void
-rte_sched_port_enqueue_qwa_prefetch0(struct rte_sched_port *port,
+rte_sched_port_enqueue_qwa_prefetch0(struct rte_sched_subport *subport,
 				     uint32_t qindex, struct rte_mbuf **qbase)
 {
 	struct rte_sched_queue *q;
 	struct rte_mbuf **q_qw;
 	uint16_t qsize;
 
-	q = port->queue + qindex;
-	qsize = rte_sched_port_qsize(port, qindex);
+	q = subport->queue + qindex;
+	qsize = rte_sched_subport_qsize(subport, qindex);
 	q_qw = qbase + (q->qw & (qsize - 1));
 
 	rte_prefetch0(q_qw);
-	rte_bitmap_prefetch0(port->bmp, qindex);
+	rte_bitmap_prefetch0(subport->bmp, qindex);
 }
 
 static inline int
-rte_sched_port_enqueue_qwa(struct rte_sched_port *port, uint32_t qindex,
-			   struct rte_mbuf **qbase, struct rte_mbuf *pkt)
+rte_sched_port_enqueue_qwa(struct rte_sched_subport *subport, uint32_t qindex,
+		struct rte_mbuf **qbase, struct rte_mbuf *pkt, uint64_t time)
 {
-	struct rte_sched_queue *q;
-	uint16_t qsize;
-	uint16_t qlen;
-
-	q = port->queue + qindex;
-	qsize = rte_sched_port_qsize(port, qindex);
-	qlen = q->qw - q->qr;
+	struct rte_sched_queue *q = subport->queue + qindex;
+	uint16_t qsize = rte_sched_subport_qsize(subport, qindex);
+	uint16_t qlen = q->qw - q->qr;
 
 	/* Drop the packet (and update drop stats) when queue is full */
-	if (unlikely(rte_sched_port_red_drop(port, pkt, qindex, qlen) ||
-		     (qlen >= qsize))) {
+	if (unlikely(rte_sched_port_red_drop(subport, pkt, qindex, qlen, time)
+		|| (qlen >= qsize))) {
 		rte_pktmbuf_free(pkt);
 #ifdef RTE_SCHED_COLLECT_STATS
-		rte_sched_port_update_subport_stats_on_drop(port, qindex, pkt,
-							    qlen < qsize);
-		rte_sched_port_update_queue_stats_on_drop(port, qindex, pkt,
-							  qlen < qsize);
+		rte_sched_port_update_subport_stats_on_drop(subport, pkt,
+			qlen < qsize);
+		rte_sched_port_update_queue_stats_on_drop(subport, qindex, pkt,
+			qlen < qsize);
 #endif
 		return 0;
 	}
@@ -1730,13 +1756,13 @@ rte_sched_port_enqueue_qwa(struct rte_sched_port *port, uint32_t qindex,
 	qbase[q->qw & (qsize - 1)] = pkt;
 	q->qw++;
 
-	/* Activate queue in the port bitmap */
-	rte_bitmap_set(port->bmp, qindex);
+	/* Activate queue in the subport bitmap */
+	rte_bitmap_set(subport->bmp, qindex);
 
 	/* Statistics */
 #ifdef RTE_SCHED_COLLECT_STATS
-	rte_sched_port_update_subport_stats(port, qindex, pkt);
-	rte_sched_port_update_queue_stats(port, qindex, pkt);
+	rte_sched_port_update_subport_stats(subport, pkt);
+	rte_sched_port_update_queue_stats(subport, qindex, pkt);
 #endif
 
 	return 1;
@@ -1764,17 +1790,21 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		*pkt30, *pkt31, *pkt_last;
 	struct rte_mbuf **q00_base, **q01_base, **q10_base, **q11_base,
 		**q20_base, **q21_base, **q30_base, **q31_base, **q_last_base;
+	struct rte_sched_subport *subport00, *subport01, *subport10, *subport11,
+		*subport20, *subport21, *subport30, *subport31, *subport_last;
 	uint32_t q00, q01, q10, q11, q20, q21, q30, q31, q_last;
 	uint32_t r00, r01, r10, r11, r20, r21, r30, r31, r_last;
-	uint32_t result, i;
+	uint32_t result, bitwidth, i;
 
 	result = 0;
+	bitwidth = port->max_subport_pipes_log2;
 
 	/*
 	 * Less then 6 input packets available, which is not enough to
 	 * feed the pipeline
 	 */
 	if (unlikely(n_pkts < 6)) {
+		struct rte_sched_subport *subports[5];
 		struct rte_mbuf **q_base[5];
 		uint32_t q[5];
 
@@ -1782,22 +1812,27 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		for (i = 0; i < n_pkts; i++)
 			rte_prefetch0(pkts[i]);
 
+		/* Prefetch the subport structure for each packet */
+		for (i = 0; i < n_pkts; i++)
+			subports[i] =
+				rte_sched_port_get_subport(port, pkts[i]);
+
 		/* Prefetch the queue structure for each queue */
 		for (i = 0; i < n_pkts; i++)
-			q[i] = rte_sched_port_enqueue_qptrs_prefetch0(port,
-								      pkts[i]);
+			q[i] = rte_sched_port_enqueue_qptrs_prefetch0(subports[i],
+					pkts[i], bitwidth);
 
 		/* Prefetch the write pointer location of each queue */
 		for (i = 0; i < n_pkts; i++) {
-			q_base[i] = rte_sched_port_qbase(port, q[i]);
-			rte_sched_port_enqueue_qwa_prefetch0(port, q[i],
+			q_base[i] = rte_sched_subport_qbase(subports[i], q[i]);
+			rte_sched_port_enqueue_qwa_prefetch0(subports[i], q[i],
 							     q_base[i]);
 		}
 
 		/* Write each packet to its queue */
 		for (i = 0; i < n_pkts; i++)
-			result += rte_sched_port_enqueue_qwa(port, q[i],
-							     q_base[i], pkts[i]);
+			result += rte_sched_port_enqueue_qwa(subports[i], q[i],
+					q_base[i], pkts[i], port->time);
 
 		return result;
 	}
@@ -1813,21 +1848,25 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	rte_prefetch0(pkt10);
 	rte_prefetch0(pkt11);
 
-	q20 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt20);
-	q21 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt21);
+	subport20 = rte_sched_port_get_subport(port, pkt20);
+	subport21 = rte_sched_port_get_subport(port, pkt21);
+	q20 = rte_sched_port_enqueue_qptrs_prefetch0(subport20, pkt20, bitwidth);
+	q21 = rte_sched_port_enqueue_qptrs_prefetch0(subport21, pkt21, bitwidth);
 
 	pkt00 = pkts[4];
 	pkt01 = pkts[5];
 	rte_prefetch0(pkt00);
 	rte_prefetch0(pkt01);
 
-	q10 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt10);
-	q11 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt11);
+	subport10 = rte_sched_port_get_subport(port, pkt10);
+	subport11 = rte_sched_port_get_subport(port, pkt11);
+	q10 = rte_sched_port_enqueue_qptrs_prefetch0(subport10, pkt10, bitwidth);
+	q11 = rte_sched_port_enqueue_qptrs_prefetch0(subport11, pkt11, bitwidth);
 
-	q20_base = rte_sched_port_qbase(port, q20);
-	q21_base = rte_sched_port_qbase(port, q21);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q20, q20_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q21, q21_base);
+	q20_base = rte_sched_subport_qbase(subport20, q20);
+	q21_base = rte_sched_subport_qbase(subport21, q21);
+	rte_sched_port_enqueue_qwa_prefetch0(subport20, q20, q20_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport21, q21, q21_base);
 
 	/* Run the pipeline */
 	for (i = 6; i < (n_pkts & (~1)); i += 2) {
@@ -1842,6 +1881,10 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		q31 = q21;
 		q20 = q10;
 		q21 = q11;
+		subport30 = subport20;
+		subport31 = subport21;
+		subport20 = subport10;
+		subport21 = subport11;
 		q30_base = q20_base;
 		q31_base = q21_base;
 
@@ -1851,19 +1894,25 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 		rte_prefetch0(pkt00);
 		rte_prefetch0(pkt01);
 
-		/* Stage 1: Prefetch queue structure storing queue pointers */
-		q10 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt10);
-		q11 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt11);
+		/* Stage 1: Prefetch subport and queue structure storing queue
+		 *  pointers
+		 */
+		subport10 = rte_sched_port_get_subport(port, pkt10);
+		subport11 = rte_sched_port_get_subport(port, pkt11);
+		q10 = rte_sched_port_enqueue_qptrs_prefetch0(subport10, pkt10, bitwidth);
+		q11 = rte_sched_port_enqueue_qptrs_prefetch0(subport11, pkt11, bitwidth);
 
 		/* Stage 2: Prefetch queue write location */
-		q20_base = rte_sched_port_qbase(port, q20);
-		q21_base = rte_sched_port_qbase(port, q21);
-		rte_sched_port_enqueue_qwa_prefetch0(port, q20, q20_base);
-		rte_sched_port_enqueue_qwa_prefetch0(port, q21, q21_base);
+		q20_base = rte_sched_subport_qbase(subport20, q20);
+		q21_base = rte_sched_subport_qbase(subport21, q21);
+		rte_sched_port_enqueue_qwa_prefetch0(subport20, q20, q20_base);
+		rte_sched_port_enqueue_qwa_prefetch0(subport21, q21, q21_base);
 
 		/* Stage 3: Write packet to queue and activate queue */
-		r30 = rte_sched_port_enqueue_qwa(port, q30, q30_base, pkt30);
-		r31 = rte_sched_port_enqueue_qwa(port, q31, q31_base, pkt31);
+		r30 = rte_sched_port_enqueue_qwa(subport30, q30, q30_base,
+			pkt30, port->time);
+		r31 = rte_sched_port_enqueue_qwa(subport31, q31, q31_base,
+			pkt31, port->time);
 		result += r30 + r31;
 	}
 
@@ -1875,38 +1924,49 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 	pkt_last = pkts[n_pkts - 1];
 	rte_prefetch0(pkt_last);
 
-	q00 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt00);
-	q01 = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt01);
+	subport00 = rte_sched_port_get_subport(port, pkt00);
+	subport01 = rte_sched_port_get_subport(port, pkt01);
+	q00 = rte_sched_port_enqueue_qptrs_prefetch0(subport00, pkt00, bitwidth);
+	q01 = rte_sched_port_enqueue_qptrs_prefetch0(subport01, pkt01, bitwidth);
 
-	q10_base = rte_sched_port_qbase(port, q10);
-	q11_base = rte_sched_port_qbase(port, q11);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q10, q10_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q11, q11_base);
+	q10_base = rte_sched_subport_qbase(subport10, q10);
+	q11_base = rte_sched_subport_qbase(subport11, q11);
+	rte_sched_port_enqueue_qwa_prefetch0(subport10, q10, q10_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport11, q11, q11_base);
 
-	r20 = rte_sched_port_enqueue_qwa(port, q20, q20_base, pkt20);
-	r21 = rte_sched_port_enqueue_qwa(port, q21, q21_base, pkt21);
+	r20 = rte_sched_port_enqueue_qwa(subport20, q20, q20_base, pkt20,
+		port->time);
+	r21 = rte_sched_port_enqueue_qwa(subport21, q21, q21_base, pkt21,
+		port->time);
 	result += r20 + r21;
 
-	q_last = rte_sched_port_enqueue_qptrs_prefetch0(port, pkt_last);
+	subport_last = rte_sched_port_get_subport(port, pkt_last);
+	q_last = rte_sched_port_enqueue_qptrs_prefetch0(subport_last,
+				pkt_last, bitwidth);
 
-	q00_base = rte_sched_port_qbase(port, q00);
-	q01_base = rte_sched_port_qbase(port, q01);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q00, q00_base);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q01, q01_base);
+	q00_base = rte_sched_subport_qbase(subport00, q00);
+	q01_base = rte_sched_subport_qbase(subport01, q01);
+	rte_sched_port_enqueue_qwa_prefetch0(subport00, q00, q00_base);
+	rte_sched_port_enqueue_qwa_prefetch0(subport01, q01, q01_base);
 
-	r10 = rte_sched_port_enqueue_qwa(port, q10, q10_base, pkt10);
-	r11 = rte_sched_port_enqueue_qwa(port, q11, q11_base, pkt11);
+	r10 = rte_sched_port_enqueue_qwa(subport10, q10, q10_base, pkt10,
+		port->time);
+	r11 = rte_sched_port_enqueue_qwa(subport11, q11, q11_base, pkt11,
+		port->time);
 	result += r10 + r11;
 
-	q_last_base = rte_sched_port_qbase(port, q_last);
-	rte_sched_port_enqueue_qwa_prefetch0(port, q_last, q_last_base);
+	q_last_base = rte_sched_subport_qbase(subport_last, q_last);
+	rte_sched_port_enqueue_qwa_prefetch0(subport_last, q_last, q_last_base);
 
-	r00 = rte_sched_port_enqueue_qwa(port, q00, q00_base, pkt00);
-	r01 = rte_sched_port_enqueue_qwa(port, q01, q01_base, pkt01);
+	r00 = rte_sched_port_enqueue_qwa(subport00, q00, q00_base, pkt00,
+		port->time);
+	r01 = rte_sched_port_enqueue_qwa(subport01, q01, q01_base, pkt01,
+		port->time);
 	result += r00 + r01;
 
 	if (n_pkts & 1) {
-		r_last = rte_sched_port_enqueue_qwa(port, q_last, q_last_base, pkt_last);
+		r_last = rte_sched_port_enqueue_qwa(subport_last, q_last,
+				q_last_base, pkt_last, port->time);
 		result += r_last;
 	}
 
@@ -2148,7 +2208,7 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 		rte_bitmap_clear(port->bmp, qindex);
 		grinder->qmask &= ~(1 << grinder->qpos);
 		grinder->wrr_mask[grinder->qpos] = 0;
-		rte_sched_port_set_queue_empty_timestamp(port, qindex);
+		rte_sched_port_set_queue_empty_timestamp(port, port->subport, qindex);
 	}
 
 	/* Reset pipe loop detection */
-- 
2.21.0


* [dpdk-dev] [PATCH v2 13/28] sched: update grinder pipe and tc cache
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (11 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 12/28] sched: update packet enqueue API Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 14/28] sched: update grinder next pipe and tc functions Jasvinder Singh
                       ` (17 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update grinder pipe and tc cache population to allow
configuration flexibility for pipe traffic classes and
queues, and subport-level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
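Note on the tc cache layout: the new grinder_tccache_populate() treats each
low bit of the 16-bit pipe queue mask as one strict-priority traffic class
with a single queue, and all remaining high bits as the queue mask of the
best-effort class. A minimal sketch follows. The value of
RTE_SCHED_TRAFFIC_CLASS_BE is not visible in this patch, so it is passed in as
the parameter tc_be. struct sketch_tccache and sketch_tccache_populate are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_TC 16u

/* Hypothetical cache of the active traffic classes of one pipe. */
struct sketch_tccache {
	uint8_t qmask[SKETCH_MAX_TC];
	uint32_t qindex[SKETCH_MAX_TC];
	uint32_t w; /* number of valid entries */
};

static void
sketch_tccache_populate(struct sketch_tccache *c, uint32_t qindex,
	uint16_t qmask, uint32_t tc_be)
{
	uint32_t i;
	uint8_t b;

	assert(tc_be < SKETCH_MAX_TC);
	c->w = 0;

	/* Strict-priority classes: one queue, hence one mask bit, each. */
	for (i = 0; i < tc_be; i++) {
		b = (uint8_t)((qmask >> i) & 0x1);
		c->qmask[c->w] = b;
		c->qindex[c->w] = qindex + i;
		c->w += (b != 0); /* keep the slot only if the queue is active */
	}

	/* Best-effort class: the remaining high bits form its queue mask. */
	b = (uint8_t)(qmask >> tc_be);
	c->qmask[c->w] = b;
	c->qindex[c->w] = qindex + tc_be;
	c->w += (b != 0);
}
```

Each slot is written unconditionally and the write index advances only when
the mask is non-zero, so inactive classes are overwritten by the next
candidate without a branch.
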
 lib/librte_sched/rte_sched.c | 46 ++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 26 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index cb96e0613..5f725bd03 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2275,9 +2275,10 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #endif /* RTE_SCHED_OPTIMIZATIONS */
 
 static inline void
-grinder_pcache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t bmp_pos, uint64_t bmp_slab)
+grinder_pcache_populate(struct rte_sched_subport *subport, uint32_t pos,
+	uint32_t bmp_pos, uint64_t bmp_slab)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	uint16_t w[4];
 
 	grinder->pcache_w = 0;
@@ -2306,34 +2307,27 @@ grinder_pcache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t bmp_
 }
 
 static inline void
-grinder_tccache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t qindex, uint16_t qmask)
+grinder_tccache_populate(struct rte_sched_subport *subport, uint32_t pos,
+	uint32_t qindex, uint16_t qmask)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint8_t b[4];
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	uint8_t b, i;
 
 	grinder->tccache_w = 0;
 	grinder->tccache_r = 0;
 
-	b[0] = (uint8_t) (qmask & 0xF);
-	b[1] = (uint8_t) ((qmask >> 4) & 0xF);
-	b[2] = (uint8_t) ((qmask >> 8) & 0xF);
-	b[3] = (uint8_t) ((qmask >> 12) & 0xF);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[0];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex;
-	grinder->tccache_w += (b[0] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[1];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 4;
-	grinder->tccache_w += (b[1] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[2];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 8;
-	grinder->tccache_w += (b[2] != 0);
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		b = (uint8_t) ((qmask >> i) & 0x1);
+		grinder->tccache_qmask[grinder->tccache_w] = b;
+		grinder->tccache_qindex[grinder->tccache_w] = qindex + i;
+		grinder->tccache_w += (b != 0);
+	}
 
-	grinder->tccache_qmask[grinder->tccache_w] = b[3];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 12;
-	grinder->tccache_w += (b[3] != 0);
+	b = (uint8_t) (qmask >> (RTE_SCHED_TRAFFIC_CLASS_BE));
+	grinder->tccache_qmask[grinder->tccache_w] = b;
+	grinder->tccache_qindex[grinder->tccache_w] = qindex +
+		RTE_SCHED_TRAFFIC_CLASS_BE;
+	grinder->tccache_w += (b != 0);
 }
 
 static inline int
@@ -2405,7 +2399,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 		port->grinder_base_bmp_pos[pos] = bmp_pos;
 
 		/* Install new pipe group into grinder's pipe cache */
-		grinder_pcache_populate(port, pos, bmp_pos, bmp_slab);
+		grinder_pcache_populate(port->subport, pos, bmp_pos, bmp_slab);
 
 		pipe_qmask = grinder->pcache_qmask[0];
 		pipe_qindex = grinder->pcache_qindex[0];
@@ -2419,7 +2413,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 	grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
 	grinder->productive = 0;
 
-	grinder_tccache_populate(port, pos, pipe_qindex, pipe_qmask);
+	grinder_tccache_populate(port->subport, pos, pipe_qindex, pipe_qmask);
 	grinder_next_tc(port, pos);
 
 	/* Check for pipe exhaustion */
-- 
2.21.0


* [dpdk-dev] [PATCH v2 14/28] sched: update grinder next pipe and tc functions
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (12 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 13/28] sched: update grinder pipe and tc cache Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 15/28] sched: update pipe and tc queues prefetch Jasvinder Singh
                       ` (16 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update grinder next pipe and tc functions to allow configuration
flexibility for pipe traffic classes and queues, and subport-level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
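Note on the per-class queue view: the reworked grinder_next_tc() installs one
queue for a strict-priority class and n_be_queues consecutive queues for the
best-effort class, with each best-effort queue's storage placed qpos * qsize
slots after the class base. A minimal sketch follows, assuming all
best-effort queues share one size (the diff looks the size up per queue) and
that the low 4 bits of the queue index give the position within the pipe.
struct sketch_tc_view, sketch_select_tc and the tc_be parameter are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BE_QUEUES 8u

/* Hypothetical per-grinder view of the queues of the selected class. */
struct sketch_tc_view {
	uint32_t qindex[SKETCH_MAX_BE_QUEUES];
	size_t qoffset[SKETCH_MAX_BE_QUEUES]; /* slot offset from class base */
	uint16_t qsize[SKETCH_MAX_BE_QUEUES];
	uint32_t n_queues;
};

static void
sketch_select_tc(struct sketch_tc_view *v, uint32_t qindex, uint32_t tc_be,
	uint32_t n_be_queues, uint16_t qsize)
{
	uint32_t tc_index = qindex & 0xf; /* queue position within the pipe */
	uint32_t qpos;

	assert(n_be_queues >= 1 && n_be_queues <= SKETCH_MAX_BE_QUEUES);

	/* Strict-priority class: exactly one queue; best effort: several. */
	v->n_queues = (tc_index < tc_be) ? 1 : n_be_queues;

	for (qpos = 0; qpos < v->n_queues; qpos++) {
		v->qindex[qpos] = qindex + qpos;
		v->qoffset[qpos] = (size_t)qpos * qsize;
		v->qsize[qpos] = qsize;
	}
}
```

Storing a size per queue position, as the patch does with qsize[], is what
lets the later prefetch code index by grinder->qpos instead of assuming one
size for the whole class.
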
 lib/librte_sched/rte_sched.c | 123 ++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 67 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 5f725bd03..382d9d929 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -84,7 +84,7 @@ struct rte_sched_grinder {
 	struct rte_sched_queue *queue[RTE_SCHED_MAX_QUEUES_PER_TC];
 	struct rte_mbuf **qbase[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint32_t qindex[RTE_SCHED_MAX_QUEUES_PER_TC];
-	uint16_t qsize;
+	uint16_t qsize[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint32_t qmask;
 	struct rte_mbuf *pkt;
 
@@ -323,24 +323,6 @@ rte_sched_port_queues_per_port(struct rte_sched_port *port)
 	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
 }
 
-static inline struct rte_mbuf **
-rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t pindex = qindex >> 4;
-	uint32_t qpos = qindex & 0xF;
-
-	return (port->queue_array + pindex *
-		port->qsize_sum + port->qsize_add[qpos]);
-}
-
-static inline uint16_t
-rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t tc = (qindex >> 2) & 0x3;
-
-	return port->qsize[tc];
-}
-
 static int
 pipe_profile_check(struct rte_sched_pipe_params *params,
 	uint32_t rate, uint16_t *qsize)
@@ -2221,13 +2203,14 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 #ifdef SCHED_VECTOR_SSE4
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	__m128i index = _mm_set1_epi32(base_pipe);
-	__m128i pipes = _mm_load_si128((__m128i *)port->grinder_base_bmp_pos);
+	__m128i pipes =
+		_mm_load_si128((__m128i *)subport->grinder_base_bmp_pos);
 	__m128i res = _mm_cmpeq_epi32(pipes, index);
 
-	pipes = _mm_load_si128((__m128i *)(port->grinder_base_bmp_pos + 4));
+	pipes = _mm_load_si128((__m128i *)(subport->grinder_base_bmp_pos + 4));
 	pipes = _mm_cmpeq_epi32(pipes, index);
 	res = _mm_or_si128(res, pipes);
 
@@ -2240,10 +2223,10 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #elif defined(SCHED_VECTOR_NEON)
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	uint32x4_t index, pipes;
-	uint32_t *pos = (uint32_t *)port->grinder_base_bmp_pos;
+	uint32_t *pos = (uint32_t *)subport->grinder_base_bmp_pos;
 
 	index = vmovq_n_u32(base_pipe);
 	pipes = vld1q_u32(pos);
@@ -2260,12 +2243,12 @@ grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
 #else
 
 static inline int
-grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
+grinder_pipe_exists(struct rte_sched_subport *subport, uint32_t base_pipe)
 {
 	uint32_t i;
 
 	for (i = 0; i < RTE_SCHED_PORT_N_GRINDERS; i++) {
-		if (port->grinder_base_bmp_pos[i] == base_pipe)
+		if (subport->grinder_base_bmp_pos[i] == base_pipe)
 			return 1;
 	}
 
@@ -2331,47 +2314,52 @@ grinder_tccache_populate(struct rte_sched_subport *subport, uint32_t pos,
 }
 
 static inline int
-grinder_next_tc(struct rte_sched_port *port, uint32_t pos)
+grinder_next_tc(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf **qbase;
-	uint32_t qindex;
+	uint32_t qindex, qpos = 0;
 	uint16_t qsize;
 
 	if (grinder->tccache_r == grinder->tccache_w)
 		return 0;
 
 	qindex = grinder->tccache_qindex[grinder->tccache_r];
-	qbase = rte_sched_port_qbase(port, qindex);
-	qsize = rte_sched_port_qsize(port, qindex);
+	grinder->tc_index = qindex & 0xf;
+	qbase = rte_sched_subport_qbase(subport, qindex);
 
-	grinder->tc_index = (qindex >> 2) & 0x3;
-	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
-	grinder->qsize = qsize;
-
-	grinder->qindex[0] = qindex;
-	grinder->qindex[1] = qindex + 1;
-	grinder->qindex[2] = qindex + 2;
-	grinder->qindex[3] = qindex + 3;
+	if (grinder->tc_index < RTE_SCHED_TRAFFIC_CLASS_BE) {
+		qsize = rte_sched_subport_qsize(subport, qindex);
 
-	grinder->queue[0] = port->queue + qindex;
-	grinder->queue[1] = port->queue + qindex + 1;
-	grinder->queue[2] = port->queue + qindex + 2;
-	grinder->queue[3] = port->queue + qindex + 3;
+		grinder->queue[qpos] = subport->queue + qindex;
+		grinder->qbase[qpos] = qbase;
+		grinder->qindex[qpos] = qindex;
+		grinder->qsize[qpos] = qsize;
+		grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
+		grinder->tccache_r++;
 
-	grinder->qbase[0] = qbase;
-	grinder->qbase[1] = qbase + qsize;
-	grinder->qbase[2] = qbase + 2 * qsize;
-	grinder->qbase[3] = qbase + 3 * qsize;
+		return 1;
+	}
 
+	for ( ; qpos < pipe->n_be_queues; qpos++) {
+		qsize = rte_sched_subport_qsize(subport, qindex + qpos);
+		grinder->queue[qpos] = subport->queue + qindex + qpos;
+		grinder->qbase[qpos] = qbase + qpos * qsize;
+		grinder->qindex[qpos] = qindex + qpos;
+		grinder->qsize[qpos] = qsize;
+	}
+	grinder->tc_index = RTE_SCHED_TRAFFIC_CLASS_BE;
+	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
 	grinder->tccache_r++;
+
 	return 1;
 }
 
 static inline int
-grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
+grinder_next_pipe(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	uint32_t pipe_qindex;
 	uint16_t pipe_qmask;
 
@@ -2384,22 +2372,23 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 		uint32_t bmp_pos = 0;
 
 		/* Get another non-empty pipe group */
-		if (unlikely(rte_bitmap_scan(port->bmp, &bmp_pos, &bmp_slab) <= 0))
+		if (unlikely(rte_bitmap_scan(subport->bmp, &bmp_pos, &bmp_slab)
+			<= 0))
 			return 0;
 
 #ifdef RTE_SCHED_DEBUG
-		debug_check_queue_slab(port, bmp_pos, bmp_slab);
+		debug_check_queue_slab(subport, bmp_pos, bmp_slab);
 #endif
 
 		/* Return if pipe group already in one of the other grinders */
-		port->grinder_base_bmp_pos[pos] = RTE_SCHED_BMP_POS_INVALID;
-		if (unlikely(grinder_pipe_exists(port, bmp_pos)))
+		subport->grinder_base_bmp_pos[pos] = RTE_SCHED_BMP_POS_INVALID;
+		if (unlikely(grinder_pipe_exists(subport, bmp_pos)))
 			return 0;
 
-		port->grinder_base_bmp_pos[pos] = bmp_pos;
+		subport->grinder_base_bmp_pos[pos] = bmp_pos;
 
 		/* Install new pipe group into grinder's pipe cache */
-		grinder_pcache_populate(port->subport, pos, bmp_pos, bmp_slab);
+		grinder_pcache_populate(subport, pos, bmp_pos, bmp_slab);
 
 		pipe_qmask = grinder->pcache_qmask[0];
 		pipe_qindex = grinder->pcache_qindex[0];
@@ -2408,18 +2397,18 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
 
 	/* Install new pipe in the grinder */
 	grinder->pindex = pipe_qindex >> 4;
-	grinder->subport = port->subport + (grinder->pindex / port->n_pipes_per_subport);
-	grinder->pipe = port->pipe + grinder->pindex;
+	grinder->subport = subport;
+	grinder->pipe = subport->pipe + grinder->pindex;
 	grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
 	grinder->productive = 0;
 
-	grinder_tccache_populate(port->subport, pos, pipe_qindex, pipe_qmask);
-	grinder_next_tc(port, pos);
+	grinder_tccache_populate(subport, pos, pipe_qindex, pipe_qmask);
+	grinder_next_tc(subport, pos);
 
 	/* Check for pipe exhaustion */
-	if (grinder->pindex == port->pipe_loop) {
-		port->pipe_exhaustion = 1;
-		port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	if (grinder->pindex == subport->pipe_loop) {
+		subport->pipe_exhaustion = 1;
+		subport->pipe_loop = RTE_SCHED_PIPE_INVALID;
 	}
 
 	return 1;
@@ -2512,7 +2501,7 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	uint16_t qsize, qr[4];
 
-	qsize = grinder->qsize;
+	qsize = grinder->qsize[0];
 	qr[0] = grinder->queue[0]->qr & (qsize - 1);
 	qr[1] = grinder->queue[1]->qr & (qsize - 1);
 	qr[2] = grinder->queue[2]->qr & (qsize - 1);
@@ -2534,7 +2523,7 @@ grinder_prefetch_mbuf(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	uint32_t qpos = grinder->qpos;
 	struct rte_mbuf **qbase = grinder->qbase[qpos];
-	uint16_t qsize = grinder->qsize;
+	uint16_t qsize = grinder->qsize[qpos];
 	uint16_t qr = grinder->queue[qpos]->qr & (qsize - 1);
 
 	grinder->pkt = qbase[qr];
@@ -2555,7 +2544,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	switch (grinder->state) {
 	case e_GRINDER_PREFETCH_PIPE:
 	{
-		if (grinder_next_pipe(port, pos)) {
+		if (grinder_next_pipe(port->subport, pos)) {
 			grinder_prefetch_pipe(port, pos);
 			port->busy_grinders++;
 
@@ -2602,7 +2591,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		grinder_wrr_store(port, pos);
 
 		/* Look for another active TC within same pipe */
-		if (grinder_next_tc(port, pos)) {
+		if (grinder_next_tc(port->subport, pos)) {
 			grinder_prefetch_tc_queue_arrays(port, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_MBUF;
@@ -2616,7 +2605,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		grinder_evict(port, pos);
 
 		/* Look for another active pipe */
-		if (grinder_next_pipe(port, pos)) {
+		if (grinder_next_pipe(port->subport, pos)) {
 			grinder_prefetch_pipe(port, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
-- 
2.21.0



* [dpdk-dev] [PATCH v2 15/28] sched: update pipe and tc queues prefetch
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (13 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 14/28] sched: update grinder next pipe and tc functions Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 16/28] sched: update grinder wrr compute function Jasvinder Singh
                       ` (15 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update pipe and tc queues prefetch functions to allow
configuration flexibility for pipe traffic classes and
queues, and subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 41 ++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 382d9d929..00ee9e7a2 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2487,9 +2487,9 @@ grinder_wrr(struct rte_sched_port *port, uint32_t pos)
 #define grinder_evict(port, pos)
 
 static inline void
-grinder_prefetch_pipe(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	rte_prefetch0(grinder->pipe);
 	rte_prefetch0(grinder->queue[0]);
@@ -2498,23 +2498,32 @@ grinder_prefetch_pipe(struct rte_sched_port *port, uint32_t pos)
 static inline void
 grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint16_t qsize, qr[4];
+	struct rte_sched_grinder *grinder = port->subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	struct rte_sched_queue *queue;
+	uint32_t i;
+	uint16_t qsize, qr[RTE_SCHED_MAX_QUEUES_PER_TC];
 
-	qsize = grinder->qsize[0];
-	qr[0] = grinder->queue[0]->qr & (qsize - 1);
-	qr[1] = grinder->queue[1]->qr & (qsize - 1);
-	qr[2] = grinder->queue[2]->qr & (qsize - 1);
-	qr[3] = grinder->queue[3]->qr & (qsize - 1);
+	grinder->qpos = 0;
+	if (grinder->tc_index < RTE_SCHED_TRAFFIC_CLASS_BE) {
+		queue = grinder->queue[0];
+		qsize = grinder->qsize[0];
+		qr[0] = queue->qr & (qsize - 1);
 
-	rte_prefetch0(grinder->qbase[0] + qr[0]);
-	rte_prefetch0(grinder->qbase[1] + qr[1]);
+		rte_prefetch0(grinder->qbase[0] + qr[0]);
+		return;
+	}
+
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		queue = grinder->queue[i];
+		qsize = grinder->qsize[i];
+		qr[i] = queue->qr & (qsize - 1);
+
+		rte_prefetch0(grinder->qbase[i] + qr[i]);
+	}
 
 	grinder_wrr_load(port, pos);
 	grinder_wrr(port, pos);
-
-	rte_prefetch0(grinder->qbase[2] + qr[2]);
-	rte_prefetch0(grinder->qbase[3] + qr[3]);
 }
 
 static inline void
@@ -2545,7 +2554,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	case e_GRINDER_PREFETCH_PIPE:
 	{
 		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port, pos);
+			grinder_prefetch_pipe(port->subport, pos);
 			port->busy_grinders++;
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
@@ -2606,7 +2615,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		/* Look for another active pipe */
 		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port, pos);
+			grinder_prefetch_pipe(port->subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return result;
-- 
2.21.0
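
Note on the index arithmetic in the prefetch functions above: the read
slot is computed as "queue->qr & (qsize - 1)", which treats qr as a
free-running counter and relies on the queue size being a power of two.
A minimal sketch of that idiom follows. The helper name ring_slot() is
hypothetical and is not part of librte_sched; the power-of-two
requirement is an assumption inferred from the mask expression.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a librte_sched function): map a free-running
 * read counter onto a slot of a ring whose size is a power of two.
 */
static inline uint16_t
ring_slot(uint16_t qr, uint16_t qsize)
{
	/* The mask trick is only valid for non-zero power-of-two sizes. */
	assert(qsize != 0 && (qsize & (qsize - 1)) == 0);

	/* Equivalent to qr % qsize, but without a division. */
	return qr & (qsize - 1);
}
```

Because each queue may now have its own size, the mask has to be
derived from the per-queue qsize[] entry rather than from one shared
value, which is what the loop over the best-effort queues does.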



* [dpdk-dev] [PATCH v2 16/28] sched: update grinder wrr compute function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (14 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 15/28] sched: update pipe and tc queues prefetch Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 17/28] sched: modify credits update function Jasvinder Singh
                       ` (14 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update weighted round robin function for best-effort traffic
class queues to allow configuration flexibility for pipe traffic
classes and queues, and subport level configuration of the pipe
parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c        | 111 ++++++++++++++++++----------
 lib/librte_sched/rte_sched_common.h |  41 ++++++++++
 2 files changed, 111 insertions(+), 41 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 00ee9e7a2..90c41e549 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2416,71 +2416,100 @@ grinder_next_pipe(struct rte_sched_subport *subport, uint32_t pos)
 
 
 static inline void
-grinder_wrr_load(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr_load(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *pipe_params = grinder->pipe_params;
-	uint32_t tc_index = grinder->tc_index;
 	uint32_t qmask = grinder->qmask;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
-
-	grinder->wrr_tokens[0] = ((uint16_t) pipe->wrr_tokens[qindex]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[1] = ((uint16_t) pipe->wrr_tokens[qindex + 1]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[2] = ((uint16_t) pipe->wrr_tokens[qindex + 2]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[3] = ((uint16_t) pipe->wrr_tokens[qindex + 3]) << RTE_SCHED_WRR_SHIFT;
-
-	grinder->wrr_mask[0] = (qmask & 0x1) * 0xFFFF;
-	grinder->wrr_mask[1] = ((qmask >> 1) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[2] = ((qmask >> 2) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[3] = ((qmask >> 3) & 0x1) * 0xFFFF;
+	uint32_t qindex = grinder->qindex[0];
+	uint32_t i;
 
-	grinder->wrr_cost[0] = pipe_params->wrr_cost[qindex];
-	grinder->wrr_cost[1] = pipe_params->wrr_cost[qindex + 1];
-	grinder->wrr_cost[2] = pipe_params->wrr_cost[qindex + 2];
-	grinder->wrr_cost[3] = pipe_params->wrr_cost[qindex + 3];
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		grinder->wrr_tokens[i] =
+			((uint16_t) pipe->wrr_tokens[qindex + i]) << RTE_SCHED_WRR_SHIFT;
+		grinder->wrr_mask[i] = ((qmask >> i) & 0x1) * 0xFFFF;
+		grinder->wrr_cost[i] = pipe_params->wrr_cost[qindex + i];
+	}
 }
 
 static inline void
-grinder_wrr_store(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr_store(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
-	uint32_t tc_index = grinder->tc_index;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
+	uint32_t i;
 
-	pipe->wrr_tokens[qindex] = (grinder->wrr_tokens[0] & grinder->wrr_mask[0])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 1] = (grinder->wrr_tokens[1] & grinder->wrr_mask[1])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 2] = (grinder->wrr_tokens[2] & grinder->wrr_mask[2])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 3] = (grinder->wrr_tokens[3] & grinder->wrr_mask[3])
-		>> RTE_SCHED_WRR_SHIFT;
+	for (i = 0; i < pipe->n_be_queues; i++)
+		pipe->wrr_tokens[i] =
+			(grinder->wrr_tokens[i] & grinder->wrr_mask[i]) >>
+				RTE_SCHED_WRR_SHIFT;
 }
 
 static inline void
-grinder_wrr(struct rte_sched_port *port, uint32_t pos)
+grinder_wrr(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	uint32_t n_be_queues = pipe->n_be_queues;
 	uint16_t wrr_tokens_min;
 
+	if (n_be_queues == 1) {
+		grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
+		grinder->qpos = 0;
+		wrr_tokens_min = grinder->wrr_tokens[0];
+		grinder->wrr_tokens[0] -= wrr_tokens_min;
+		return;
+	}
+
+	if (n_be_queues == 2) {
+		grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
+		grinder->wrr_tokens[1] |= ~grinder->wrr_mask[1];
+
+		grinder->qpos = rte_min_pos_2_u16(grinder->wrr_tokens);
+		wrr_tokens_min = grinder->wrr_tokens[grinder->qpos];
+
+		grinder->wrr_tokens[0] -= wrr_tokens_min;
+		grinder->wrr_tokens[1] -= wrr_tokens_min;
+		return;
+	}
+
+	if (n_be_queues == 4) {
+		grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
+		grinder->wrr_tokens[1] |= ~grinder->wrr_mask[1];
+		grinder->wrr_tokens[2] |= ~grinder->wrr_mask[2];
+		grinder->wrr_tokens[3] |= ~grinder->wrr_mask[3];
+
+		grinder->qpos = rte_min_pos_4_u16(grinder->wrr_tokens);
+		wrr_tokens_min = grinder->wrr_tokens[grinder->qpos];
+
+		grinder->wrr_tokens[0] -= wrr_tokens_min;
+		grinder->wrr_tokens[1] -= wrr_tokens_min;
+		grinder->wrr_tokens[2] -= wrr_tokens_min;
+		grinder->wrr_tokens[3] -= wrr_tokens_min;
+		return;
+	}
+
 	grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
 	grinder->wrr_tokens[1] |= ~grinder->wrr_mask[1];
 	grinder->wrr_tokens[2] |= ~grinder->wrr_mask[2];
 	grinder->wrr_tokens[3] |= ~grinder->wrr_mask[3];
+	grinder->wrr_tokens[4] |= ~grinder->wrr_mask[4];
+	grinder->wrr_tokens[5] |= ~grinder->wrr_mask[5];
+	grinder->wrr_tokens[6] |= ~grinder->wrr_mask[6];
+	grinder->wrr_tokens[7] |= ~grinder->wrr_mask[7];
 
-	grinder->qpos = rte_min_pos_4_u16(grinder->wrr_tokens);
+	grinder->qpos = rte_min_pos_8_u16(grinder->wrr_tokens);
 	wrr_tokens_min = grinder->wrr_tokens[grinder->qpos];
 
 	grinder->wrr_tokens[0] -= wrr_tokens_min;
 	grinder->wrr_tokens[1] -= wrr_tokens_min;
 	grinder->wrr_tokens[2] -= wrr_tokens_min;
 	grinder->wrr_tokens[3] -= wrr_tokens_min;
+	grinder->wrr_tokens[4] -= wrr_tokens_min;
+	grinder->wrr_tokens[5] -= wrr_tokens_min;
+	grinder->wrr_tokens[6] -= wrr_tokens_min;
+	grinder->wrr_tokens[7] -= wrr_tokens_min;
 }
 
 
@@ -2522,8 +2551,8 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 		rte_prefetch0(grinder->qbase[i] + qr[i]);
 	}
 
-	grinder_wrr_load(port, pos);
-	grinder_wrr(port, pos);
+	grinder_wrr_load(port->subport, pos);
+	grinder_wrr(port->subport, pos);
 }
 
 static inline void
@@ -2592,12 +2621,12 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-			grinder_wrr(port, pos);
+			grinder_wrr(port->subport, pos);
 			grinder_prefetch_mbuf(port, pos);
 
 			return 1;
 		}
-		grinder_wrr_store(port, pos);
+		grinder_wrr_store(port->subport, pos);
 
 		/* Look for another active TC within same pipe */
 		if (grinder_next_tc(port->subport, pos)) {
diff --git a/lib/librte_sched/rte_sched_common.h b/lib/librte_sched/rte_sched_common.h
index 8c191a9b8..bb3595f26 100644
--- a/lib/librte_sched/rte_sched_common.h
+++ b/lib/librte_sched/rte_sched_common.h
@@ -20,6 +20,18 @@ rte_sched_min_val_2_u32(uint32_t x, uint32_t y)
 	return (x < y)? x : y;
 }
 
+/* Simplified version to remove branches with CMOV instruction */
+static inline uint32_t
+rte_min_pos_2_u16(uint16_t *x)
+{
+	uint32_t pos0 = 0;
+
+	if (x[1] <= x[0])
+		pos0 = 1;
+
+	return pos0;
+}
+
 #if 0
 static inline uint32_t
 rte_min_pos_4_u16(uint16_t *x)
@@ -50,6 +62,35 @@ rte_min_pos_4_u16(uint16_t *x)
 
 #endif
 
+/* Simplified version to remove branches with CMOV instruction */
+static inline uint32_t
+rte_min_pos_8_u16(uint16_t *x)
+{
+	uint32_t pos0 = 0;
+	uint32_t pos1 = 2;
+	uint32_t pos2 = 4;
+	uint32_t pos3 = 6;
+
+	if (x[1] <= x[0])
+		pos0 = 1;
+	if (x[3] <= x[2])
+		pos1 = 3;
+	if (x[5] <= x[4])
+		pos2 = 5;
+	if (x[7] <= x[6])
+		pos3 = 7;
+
+	if (x[pos1] <= x[pos0])
+		pos0 = pos1;
+	if (x[pos3] <= x[pos2])
+		pos2 = pos3;
+
+	if (x[pos2] <= x[pos0])
+		pos0 = pos2;
+
+	return pos0;
+}
+
 /*
  * Compute the Greatest Common Divisor (GCD) of two numbers.
  * This implementation uses Euclid's algorithm:
-- 
2.21.0
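
Note on the WRR step in grinder_wrr() above: each pass forces the
tokens of inactive queues to 0xFFFF, picks the position of the smallest
token count, and subtracts that minimum from every entry. The following
is a minimal standalone sketch of the four-queue case. The names
min_pos_4_u16() and wrr_select() and the constant N_QUEUES are
illustrative assumptions, and the loops replace the unrolled statements
of the patch for brevity.

```c
#include <stdint.h>

#define N_QUEUES 4 /* assumption: four best-effort queues */

/*
 * Position of the smallest element. Ties go to the higher index,
 * mirroring the "<=" comparisons used in the patch.
 */
static inline uint32_t
min_pos_4_u16(const uint16_t *x)
{
	uint32_t pos0 = 0;
	uint32_t pos1 = 2;

	if (x[1] <= x[0])
		pos0 = 1;
	if (x[3] <= x[2])
		pos1 = 3;
	if (x[pos1] <= x[pos0])
		pos0 = pos1;

	return pos0;
}

/*
 * One WRR selection pass. mask[i] is 0xFFFF for an active queue and 0
 * for an inactive one. Returns the selected queue position and
 * rebases all token counts so that the selected queue ends up at 0.
 */
static inline uint32_t
wrr_select(uint16_t tokens[N_QUEUES], const uint16_t mask[N_QUEUES])
{
	uint16_t min;
	uint32_t i, qpos;

	/* Inactive queues get the maximum token value, so they lose. */
	for (i = 0; i < N_QUEUES; i++)
		tokens[i] |= (uint16_t)~mask[i];

	qpos = min_pos_4_u16(tokens);
	min = tokens[qpos];

	for (i = 0; i < N_QUEUES; i++)
		tokens[i] -= min;

	return qpos;
}
```

With tokens {300, 100, 200, 50} and the fourth queue inactive, the
sketch selects position 1 and leaves the active queues at {200, 0, 100}.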



* [dpdk-dev] [PATCH v2 17/28] sched: modify credits update function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (15 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 16/28] sched: update grinder wrr compute function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 18/28] sched: update mbuf prefetch function Jasvinder Singh
                       ` (13 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Modify credits update function of scheduler grinder to allow
configuration flexibility for pipe traffic classes and queues, and
subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 75 +++++++++++++++++++-----------------
 1 file changed, 40 insertions(+), 35 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 90c41e549..8b440637d 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1958,13 +1958,14 @@ rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
 #ifndef RTE_SCHED_SUBPORT_TC_OV
 
 static inline void
-grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1980,19 +1981,17 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
+
 		subport->tc_time = port->time + subport->tc_period;
 	}
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
+
 		pipe->tc_time = port->time + params->tc_period;
 	}
 }
@@ -2000,26 +1999,34 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 #else
 
 static inline uint32_t
-grinder_tc_ov_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_tc_ov_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
 	uint32_t tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_ov_consumption_max;
+	uint32_t tc_consumption = 0, tc_ov_consumption_max;
 	uint32_t tc_ov_wm = subport->tc_ov_wm;
+	uint32_t i;
 
 	if (subport->tc_ov == 0)
 		return subport->tc_ov_wm_max;
 
-	tc_ov_consumption[0] = subport->tc_credits_per_period[0] - subport->tc_credits[0];
-	tc_ov_consumption[1] = subport->tc_credits_per_period[1] - subport->tc_credits[1];
-	tc_ov_consumption[2] = subport->tc_credits_per_period[2] - subport->tc_credits[2];
-	tc_ov_consumption[3] = subport->tc_credits_per_period[3] - subport->tc_credits[3];
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		tc_ov_consumption[i] =
+			subport->tc_credits_per_period[i] - subport->tc_credits[i];
+		tc_consumption += tc_ov_consumption[i];
+	}
 
-	tc_ov_consumption_max = subport->tc_credits_per_period[3] -
-		(tc_ov_consumption[0] + tc_ov_consumption[1] + tc_ov_consumption[2]);
+	tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] -
+		subport->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-	if (tc_ov_consumption[3] > (tc_ov_consumption_max - port->mtu)) {
+	tc_ov_consumption_max =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] - tc_consumption;
+
+	if (tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] >
+		(tc_ov_consumption_max - port->mtu)) {
 		tc_ov_wm  -= tc_ov_wm >> 7;
 		if (tc_ov_wm < subport->tc_ov_wm_min)
 			tc_ov_wm = subport->tc_ov_wm_min;
@@ -2035,13 +2042,14 @@ grinder_tc_ov_credits_update(struct rte_sched_port *port, uint32_t pos)
 }
 
 static inline void
-grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_update(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -2057,12 +2065,10 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_ov_wm = grinder_tc_ov_credits_update(port, pos);
+		subport->tc_ov_wm = grinder_tc_ov_credits_update(port, subport, pos);
 
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
 
 		subport->tc_time = port->time + subport->tc_period;
 		subport->tc_ov_period_id++;
@@ -2070,10 +2076,9 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
+
 		pipe->tc_time = port->time + params->tc_period;
 	}
 
@@ -2599,7 +2604,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 		grinder->pipe_params = port->pipe_profiles + pipe->profile;
 		grinder_prefetch_tc_queue_arrays(port, pos);
-		grinder_credits_update(port, pos);
+		grinder_credits_update(port, port->subport, pos);
 
 		grinder->state = e_GRINDER_PREFETCH_MBUF;
 		return 0;
-- 
2.21.0
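
Note on grinder_tc_ov_credits_update() above: the oversubscription test
compares what the best-effort (BE) class consumed in the last period
with what the higher-priority classes left over, minus one MTU. A
minimal sketch of that test and of the watermark decrease step shown in
the hunk follows. The names N_TC, TC_BE, be_oversubscribed() and
wm_decrease() are hypothetical, and the sketch assumes four traffic
classes with BE as the last one. The branch that raises the watermark
lies outside the hunk and is deliberately not reproduced here.

```c
#include <stdint.h>

#define N_TC 4           /* assumption: 3 strict-priority classes + BE */
#define TC_BE (N_TC - 1) /* BE is the last, lowest-priority class */

/*
 * Returns 1 when BE consumption in the last period came within one MTU
 * of what the higher-priority classes left over. Like the patch, this
 * uses unsigned arithmetic and assumes be_max >= mtu.
 */
static int
be_oversubscribed(const uint32_t per_period[N_TC],
	const uint32_t credits[N_TC], uint32_t mtu)
{
	uint32_t sp_consumption = 0, be_consumption, be_max, i;

	/* Credits used by the higher-priority classes. */
	for (i = 0; i < TC_BE; i++)
		sp_consumption += per_period[i] - credits[i];

	be_consumption = per_period[TC_BE] - credits[TC_BE];
	be_max = per_period[TC_BE] - sp_consumption;

	return be_consumption > (be_max - mtu);
}

/* Lower the watermark by 1/128 of its value, clamped at wm_min. */
static uint32_t
wm_decrease(uint32_t wm, uint32_t wm_min)
{
	wm -= wm >> 7;
	return (wm < wm_min) ? wm_min : wm;
}
```

Summing the higher-priority consumption in a loop is what lets the same
code serve any number of traffic classes, instead of the three
hard-coded terms it replaces.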



* [dpdk-dev] [PATCH v2 18/28] sched: update mbuf prefetch function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (16 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 17/28] sched: modify credits update function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 19/28] sched: update grinder schedule function Jasvinder Singh
                       ` (12 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update mbuf prefetch function of the scheduler grinder to allow
configuration flexibility for pipe traffic classes and queues, and
subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 8b440637d..607fe6c18 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2561,13 +2561,16 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 }
 
 static inline void
-grinder_prefetch_mbuf(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
+	struct rte_mbuf **qbase;
 	uint32_t qpos = grinder->qpos;
-	struct rte_mbuf **qbase = grinder->qbase[qpos];
-	uint16_t qsize = grinder->qsize[qpos];
-	uint16_t qr = grinder->queue[qpos]->qr & (qsize - 1);
+	uint16_t qsize, qr;
+
+	qbase = grinder->qbase[qpos];
+	qsize = grinder->qsize[qpos];
+	qr = grinder->queue[qpos]->qr & (qsize - 1);
 
 	grinder->pkt = qbase[qr];
 	rte_prefetch0(grinder->pkt);
@@ -2612,7 +2615,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_PREFETCH_MBUF:
 	{
-		grinder_prefetch_mbuf(port, pos);
+		grinder_prefetch_mbuf(port->subport, pos);
 
 		grinder->state = e_GRINDER_READ_MBUF;
 		return 0;
@@ -2627,7 +2630,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
 			grinder_wrr(port->subport, pos);
-			grinder_prefetch_mbuf(port, pos);
+			grinder_prefetch_mbuf(port->subport, pos);
 
 			return 1;
 		}
-- 
2.21.0



* [dpdk-dev] [PATCH v2 19/28] sched: update grinder schedule function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (17 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 18/28] sched: update mbuf prefetch function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 20/28] sched: update grinder handle function Jasvinder Singh
                       ` (11 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update grinder schedule function to allow configuration
flexibility for pipe traffic classes and queues, and subport
level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 82 ++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 31 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 607fe6c18..f468827f4 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2096,14 +2096,14 @@ grinder_credits_update(struct rte_sched_port *port,
 #ifndef RTE_SCHED_SUBPORT_TC_OV
 
 static inline int
-grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_check(struct rte_sched_subport *subport,
+	uint32_t pos, uint32_t frame_overhead)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t tc_index = grinder->tc_index;
-	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
+	uint32_t pkt_len = pkt->pkt_len + frame_overhead;
 	uint32_t subport_tb_credits = subport->tb_credits;
 	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
 	uint32_t pipe_tb_credits = pipe->tb_credits;
@@ -2119,7 +2119,7 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	if (!enough_credits)
 		return 0;
 
-	/* Update port credits */
+	/* Update subport credits */
 	subport->tb_credits -= pkt_len;
 	subport->tc_credits[tc_index] -= pkt_len;
 	pipe->tb_credits -= pkt_len;
@@ -2131,23 +2131,30 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 #else
 
 static inline int
-grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
+grinder_credits_check(struct rte_sched_subport *subport,
+	uint32_t pos, uint32_t frame_overhead)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_subport *subport = grinder->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t tc_index = grinder->tc_index;
-	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
+	uint32_t pkt_len = pkt->pkt_len + frame_overhead;
 	uint32_t subport_tb_credits = subport->tb_credits;
 	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
 	uint32_t pipe_tb_credits = pipe->tb_credits;
 	uint32_t pipe_tc_credits = pipe->tc_credits[tc_index];
-	uint32_t pipe_tc_ov_mask1[] = {UINT32_MAX, UINT32_MAX, UINT32_MAX, pipe->tc_ov_credits};
-	uint32_t pipe_tc_ov_mask2[] = {0, 0, 0, UINT32_MAX};
-	uint32_t pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+	uint32_t pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE] = {0};
+	uint32_t pipe_tc_ov_credits, i;
 	int enough_credits;
 
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+		pipe_tc_ov_mask1[i] = UINT32_MAX;
+
+	pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASS_BE] = pipe->tc_ov_credits;
+	pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE] = UINT32_MAX;
+	pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+
 	/* Check pipe and subport credits */
 	enough_credits = (pkt_len <= subport_tb_credits) &&
 		(pkt_len <= subport_tc_credits) &&
@@ -2170,36 +2177,48 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 
 #endif /* RTE_SCHED_SUBPORT_TC_OV */
 
-
 static inline int
-grinder_schedule(struct rte_sched_port *port, uint32_t pos)
+grinder_schedule(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
-	struct rte_sched_queue *queue = grinder->queue[grinder->qpos];
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_mbuf *pkt = grinder->pkt;
-	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
+	struct rte_sched_queue *queue;
+	uint32_t frame_overhead = port->frame_overhead;
+	uint32_t qpos, pkt_len;
+	int be_tc_active;
 
-	if (!grinder_credits_check(port, pos))
+	if (!grinder_credits_check(subport, pos, frame_overhead))
 		return 0;
 
+	pkt_len = pkt->pkt_len + frame_overhead;
+	qpos = grinder->qpos;
+	queue = grinder->queue[qpos];
+
 	/* Advance port time */
 	port->time += pkt_len;
 
 	/* Send packet */
 	port->pkts_out[port->n_pkts_out++] = pkt;
 	queue->qr++;
-	grinder->wrr_tokens[grinder->qpos] += pkt_len * grinder->wrr_cost[grinder->qpos];
+
+	be_tc_active = (grinder->tc_index == RTE_SCHED_TRAFFIC_CLASS_BE);
+	grinder->wrr_tokens[qpos] +=
+		pkt_len * grinder->wrr_cost[qpos] * be_tc_active;
+
 	if (queue->qr == queue->qw) {
-		uint32_t qindex = grinder->qindex[grinder->qpos];
+		uint32_t qindex = grinder->qindex[qpos];
+
+		rte_bitmap_clear(subport->bmp, qindex);
+		grinder->qmask &= ~(1 << qpos);
+		if (be_tc_active)
+			grinder->wrr_mask[qpos] = 0;
 
-		rte_bitmap_clear(port->bmp, qindex);
-		grinder->qmask &= ~(1 << grinder->qpos);
-		grinder->wrr_mask[grinder->qpos] = 0;
-		rte_sched_port_set_queue_empty_timestamp(port, port->subport, qindex);
+		rte_sched_port_set_queue_empty_timestamp(port, subport, qindex);
 	}
 
 	/* Reset pipe loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	subport->pipe_loop = RTE_SCHED_PIPE_INVALID;
 	grinder->productive = 1;
 
 	return 1;
@@ -2585,14 +2604,15 @@ grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 static inline uint32_t
 grinder_handle(struct rte_sched_port *port, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_subport *subport = port->subport;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	switch (grinder->state) {
 	case e_GRINDER_PREFETCH_PIPE:
 	{
-		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port->subport, pos);
-			port->busy_grinders++;
+		if (grinder_next_pipe(subport, pos)) {
+			grinder_prefetch_pipe(subport, pos);
+			subport->busy_grinders++;
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return 0;
@@ -2615,7 +2635,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_PREFETCH_MBUF:
 	{
-		grinder_prefetch_mbuf(port->subport, pos);
+		grinder_prefetch_mbuf(subport, pos);
 
 		grinder->state = e_GRINDER_READ_MBUF;
 		return 0;
@@ -2625,7 +2645,7 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	{
 		uint32_t result = 0;
 
-		result = grinder_schedule(port, pos);
+		result = grinder_schedule(port, subport, pos);
 
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-- 
2.21.0



* [dpdk-dev] [PATCH v2 20/28] sched: update grinder handle function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (18 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 19/28] sched: update grinder schedule function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 21/28] sched: update packet dequeue API Jasvinder Singh
                       ` (10 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the grinder handle function implementation to allow
configuration flexibility for pipe traffic classes and queues,
and subport level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
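Note: grinder_handle() now runs the WRR steps only when the grinder is
serving the best-effort TC, and grinder_schedule() uses the same flag as
a multiplier so the token update needs no branch. A minimal sketch of
that multiplier idea follows. TOY_TC_BE, toy_charge_wrr() and the 32-bit
array types are assumptions made for illustration, not the library's
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed index of the best-effort traffic class (illustration only). */
#define TOY_TC_BE 8

/*
 * Charge WRR tokens for a packet just sent from queue position qpos.
 * The flag is 1 for the best-effort TC and 0 otherwise, so strict
 * priority TCs add zero tokens without taking a branch.
 */
static inline void
toy_charge_wrr(uint32_t *wrr_tokens, const uint32_t *wrr_cost,
	uint32_t qpos, uint32_t pkt_len, uint32_t tc_index)
{
	uint32_t be_tc_active = (tc_index == TOY_TC_BE);

	wrr_tokens[qpos] += pkt_len * wrr_cost[qpos] * be_tc_active;
}
```

With costs {2, 3}, a 100-byte packet on position 1 of the best-effort TC
adds 300 tokens, while the same packet on any other TC leaves the tokens
untouched.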
 lib/librte_sched/rte_sched.c | 52 ++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index f468827f4..6bdf1b831 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -2537,7 +2537,7 @@ grinder_wrr(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 
-#define grinder_evict(port, pos)
+#define grinder_evict(subport, pos)
 
 static inline void
 grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
@@ -2549,9 +2549,9 @@ grinder_prefetch_pipe(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 static inline void
-grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
+grinder_prefetch_tc_queue_arrays(struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_grinder *grinder = port->subport->grinder + pos;
+	struct rte_sched_grinder *grinder = subport->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_queue *queue;
 	uint32_t i;
@@ -2575,8 +2575,8 @@ grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 		rte_prefetch0(grinder->qbase[i] + qr[i]);
 	}
 
-	grinder_wrr_load(port->subport, pos);
-	grinder_wrr(port->subport, pos);
+	grinder_wrr_load(subport, pos);
+	grinder_wrr(subport, pos);
 }
 
 static inline void
@@ -2602,9 +2602,9 @@ grinder_prefetch_mbuf(struct rte_sched_subport *subport, uint32_t pos)
 }
 
 static inline uint32_t
-grinder_handle(struct rte_sched_port *port, uint32_t pos)
+grinder_handle(struct rte_sched_port *port,
+	struct rte_sched_subport *subport, uint32_t pos)
 {
-	struct rte_sched_subport *subport = port->subport;
 	struct rte_sched_grinder *grinder = subport->grinder + pos;
 
 	switch (grinder->state) {
@@ -2625,9 +2625,9 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 	{
 		struct rte_sched_pipe *pipe = grinder->pipe;
 
-		grinder->pipe_params = port->pipe_profiles + pipe->profile;
-		grinder_prefetch_tc_queue_arrays(port, pos);
-		grinder_credits_update(port, port->subport, pos);
+		grinder->pipe_params = subport->pipe_profiles + pipe->profile;
+		grinder_prefetch_tc_queue_arrays(subport, pos);
+		grinder_credits_update(port, subport, pos);
 
 		grinder->state = e_GRINDER_PREFETCH_MBUF;
 		return 0;
@@ -2643,43 +2643,48 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_READ_MBUF:
 	{
-		uint32_t result = 0;
+		uint32_t wrr_active, result = 0;
 
 		result = grinder_schedule(port, subport, pos);
+		wrr_active = (grinder->tc_index == RTE_SCHED_TRAFFIC_CLASS_BE);
 
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-			grinder_wrr(port->subport, pos);
-			grinder_prefetch_mbuf(port->subport, pos);
+			if (wrr_active)
+				grinder_wrr(subport, pos);
+
+			grinder_prefetch_mbuf(subport, pos);
 
 			return 1;
 		}
-		grinder_wrr_store(port->subport, pos);
+
+		if (wrr_active)
+			grinder_wrr_store(subport, pos);
 
 		/* Look for another active TC within same pipe */
-		if (grinder_next_tc(port->subport, pos)) {
-			grinder_prefetch_tc_queue_arrays(port, pos);
+		if (grinder_next_tc(subport, pos)) {
+			grinder_prefetch_tc_queue_arrays(subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_MBUF;
 			return result;
 		}
 
 		if (grinder->productive == 0 &&
-		    port->pipe_loop == RTE_SCHED_PIPE_INVALID)
-			port->pipe_loop = grinder->pindex;
+		    subport->pipe_loop == RTE_SCHED_PIPE_INVALID)
+			subport->pipe_loop = grinder->pindex;
 
-		grinder_evict(port, pos);
+		grinder_evict(subport, pos);
 
 		/* Look for another active pipe */
-		if (grinder_next_pipe(port->subport, pos)) {
-			grinder_prefetch_pipe(port->subport, pos);
+		if (grinder_next_pipe(subport, pos)) {
+			grinder_prefetch_pipe(subport, pos);
 
 			grinder->state = e_GRINDER_PREFETCH_TC_QUEUE_ARRAYS;
 			return result;
 		}
 
 		/* No active pipe found */
-		port->busy_grinders--;
+		subport->busy_grinders--;
 
 		grinder->state = e_GRINDER_PREFETCH_PIPE;
 		return result;
@@ -2739,7 +2744,8 @@ rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint
 
 	/* Take each queue in the grinder one step further */
 	for (i = 0, count = 0; ; i++)  {
-		count += grinder_handle(port, i & (RTE_SCHED_PORT_N_GRINDERS - 1));
+		count += grinder_handle(port, port->subport,
+				i & (RTE_SCHED_PORT_N_GRINDERS - 1));
 		if ((count == n_pkts) ||
 		    rte_sched_port_exceptions(port, i >= RTE_SCHED_PORT_N_GRINDERS)) {
 			break;
-- 
2.21.0



* [dpdk-dev] [PATCH v2 21/28] sched: update packet dequeue API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (19 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 20/28] sched: update grinder handle function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 22/28] sched: update sched queue stats API Jasvinder Singh
                       ` (9 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the packet dequeue API implementation to allow configuration
flexibility for pipe traffic classes and queues, and subport
level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
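Note: the dequeue loop now walks the subports in round-robin order and
remembers where to resume on the next call. The sketch below is a
simplified model of that rotation, assuming each subport is reduced to a
count of pending packets. struct toy_port and toy_port_dequeue() are
hypothetical names, and the grinder state machine is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a port is a set of subports, each with a packet backlog. */
struct toy_port {
	uint32_t n_subports;
	uint32_t subport_id; /* subport where the next dequeue call starts */
	uint32_t *pending;   /* pending[i]: packets waiting in subport i */
};

/* Dequeue up to n_pkts packets; return how many were taken. */
static uint32_t
toy_port_dequeue(struct toy_port *port, uint32_t n_pkts)
{
	uint32_t subport_id = port->subport_id;
	uint32_t n_visited = 0, count = 0;

	assert(port->n_subports > 0 && subport_id < port->n_subports);
	if (n_pkts == 0)
		return 0;

	while (n_visited < port->n_subports) {
		if (port->pending[subport_id] > 0) {
			port->pending[subport_id]--;
			count++;
			if (count == n_pkts) {
				/* Burst full: start at the next subport next time. */
				subport_id = (subport_id + 1) % port->n_subports;
				break;
			}
			continue;
		}

		/* Current subport is drained: move on and count the visit. */
		subport_id = (subport_id + 1) % port->n_subports;
		n_visited++;
	}

	port->subport_id = subport_id;
	return count;
}
```

With two subports holding 5 packets each, two calls with n_pkts = 3 take
3 packets from subport 0 and then 3 from subport 1, so one busy subport
does not starve the other.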
 lib/librte_sched/rte_sched.c | 54 ++++++++++++++++++++++++++----------
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 6bdf1b831..63073aad0 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1496,9 +1496,10 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 #ifdef RTE_SCHED_DEBUG
 
 static inline int
-rte_sched_port_queue_is_empty(struct rte_sched_port *port, uint32_t qindex)
+rte_sched_port_queue_is_empty(struct rte_sched_subport *subport,
+	uint32_t qindex)
 {
-	struct rte_sched_queue *queue = port->queue + qindex;
+	struct rte_sched_queue *queue = subport->queue + qindex;
 
 	return queue->qr == queue->qw;
 }
@@ -1639,7 +1640,7 @@ rte_sched_port_set_queue_empty_timestamp(struct rte_sched_port *port __rte_unuse
 #ifdef RTE_SCHED_DEBUG
 
 static inline void
-debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
+debug_check_queue_slab(struct rte_sched_subport *subport, uint32_t bmp_pos,
 		       uint64_t bmp_slab)
 {
 	uint64_t mask;
@@ -1651,7 +1652,7 @@ debug_check_queue_slab(struct rte_sched_port *port, uint32_t bmp_pos,
 	panic = 0;
 	for (i = 0, mask = 1; i < 64; i++, mask <<= 1) {
 		if (mask & bmp_slab) {
-			if (rte_sched_port_queue_is_empty(port, bmp_pos + i)) {
+			if (rte_sched_port_queue_is_empty(subport, bmp_pos + i)) {
 				printf("Queue %u (slab offset %u) is empty\n", bmp_pos + i, i);
 				panic = 1;
 			}
@@ -2702,6 +2703,7 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 	uint64_t cycles = rte_get_tsc_cycles();
 	uint64_t cycles_diff = cycles - port->time_cpu_cycles;
 	uint64_t bytes_diff;
+	uint32_t i;
 
 	/* Compute elapsed time in bytes */
 	bytes_diff = rte_reciprocal_divide(cycles_diff << RTE_SCHED_TIME_SHIFT,
@@ -2714,20 +2716,21 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 		port->time = port->time_cpu_bytes;
 
 	/* Reset pipe loop detection */
-	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
+	for (i = 0; i < port->n_subports_per_port; i++)
+		port->subports[i]->pipe_loop = RTE_SCHED_PIPE_INVALID;
 }
 
 static inline int
-rte_sched_port_exceptions(struct rte_sched_port *port, int second_pass)
+rte_sched_port_exceptions(struct rte_sched_subport *subport, int second_pass)
 {
 	int exceptions;
 
 	/* Check if any exception flag is set */
-	exceptions = (second_pass && port->busy_grinders == 0) ||
-		(port->pipe_exhaustion == 1);
+	exceptions = (second_pass && subport->busy_grinders == 0) ||
+		(subport->pipe_exhaustion == 1);
 
 	/* Clear exception flags */
-	port->pipe_exhaustion = 0;
+	subport->pipe_exhaustion = 0;
 
 	return exceptions;
 }
@@ -2735,7 +2738,9 @@ rte_sched_port_exceptions(struct rte_sched_port *port, int second_pass)
 int
 rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 {
-	uint32_t i, count;
+	struct rte_sched_subport *subport;
+	uint32_t subport_id = port->subport_id;
+	uint32_t i, n_subports = 0, count;
 
 	port->pkts_out = pkts;
 	port->n_pkts_out = 0;
@@ -2744,10 +2749,31 @@ rte_sched_port_dequeue(struct rte_sched_port *port, struct rte_mbuf **pkts, uint
 
 	/* Take each queue in the grinder one step further */
 	for (i = 0, count = 0; ; i++)  {
-		count += grinder_handle(port, port->subport,
-				i & (RTE_SCHED_PORT_N_GRINDERS - 1));
-		if ((count == n_pkts) ||
-		    rte_sched_port_exceptions(port, i >= RTE_SCHED_PORT_N_GRINDERS)) {
+		subport = port->subports[subport_id];
+
+		count += grinder_handle(port, subport, i &
+				(RTE_SCHED_PORT_N_GRINDERS - 1));
+		if (count == n_pkts) {
+			subport_id++;
+
+			if (subport_id == port->n_subports_per_port)
+				subport_id = 0;
+
+			port->subport_id = subport_id;
+			break;
+		}
+
+		if (rte_sched_port_exceptions(subport, i >= RTE_SCHED_PORT_N_GRINDERS)) {
+			i = 0;
+			subport_id++;
+			n_subports++;
+		}
+
+		if (subport_id == port->n_subports_per_port)
+			subport_id = 0;
+
+		if (n_subports == port->n_subports_per_port) {
+			port->subport_id = subport_id;
 			break;
 		}
 	}
-- 
2.21.0



* [dpdk-dev] [PATCH v2 22/28] sched: update sched queue stats API
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (20 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 21/28] sched: update packet dequeue API Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 23/28] test/sched: update unit test Jasvinder Singh
                       ` (8 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the queue stats read API implementation of the scheduler to allow
configuration flexibility for pipe traffic classes and queues, and subport
level configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
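Note: rte_sched_queue_read_stats() now splits the port-wide queue_id
into a subport index and a queue index within that subport. The sketch
below restates that bit arithmetic on its own, assuming 16 queues per
pipe (hence the shift by 4) and a power-of-two number of subports.
struct queue_location and decode_queue_id() are hypothetical helpers,
not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: log2 of the 16 queues each pipe can hold. */
#define QUEUES_PER_PIPE_LOG2 4

struct queue_location {
	uint32_t subport_id; /* subport that owns the queue */
	uint32_t qindex;     /* queue index inside that subport */
};

static inline struct queue_location
decode_queue_id(uint32_t queue_id, uint32_t max_subport_pipes_log2,
	uint32_t n_subports_per_port)
{
	uint32_t shift = max_subport_pipes_log2 + QUEUES_PER_PIPE_LOG2;
	struct queue_location loc;

	/* The mask below only works for a power-of-two subport count. */
	assert(n_subports_per_port != 0 &&
		(n_subports_per_port & (n_subports_per_port - 1)) == 0);
	assert(shift < 32);

	/* High bits select the subport, low bits the queue within it. */
	loc.subport_id = (queue_id >> shift) & (n_subports_per_port - 1);
	loc.qindex = queue_id & ((UINT32_C(1) << shift) - 1);

	return loc;
}
```

For example, with 1024 pipes per subport (log2 = 10) and 2 subports,
queue_id (1 << 14) + 37 decodes to subport 1, queue index 37.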
 lib/librte_sched/rte_sched.c | 59 ++++++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 20 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 63073aad0..cc1dcf7ab 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -284,16 +284,6 @@ enum rte_sched_subport_array {
 	e_RTE_SCHED_SUBPORT_ARRAY_TOTAL,
 };
 
-#ifdef RTE_SCHED_COLLECT_STATS
-
-static inline uint32_t
-rte_sched_port_queues_per_subport(struct rte_sched_port *port)
-{
-	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport;
-}
-
-#endif
-
 static inline uint32_t
 rte_sched_subport_queues(struct rte_sched_subport *subport)
 {
@@ -318,9 +308,14 @@ rte_sched_subport_qsize(struct rte_sched_subport *subport, uint32_t qindex)
 }
 
 static inline uint32_t
-rte_sched_port_queues_per_port(struct rte_sched_port *port)
+rte_sched_port_queues(struct rte_sched_port *port)
 {
-	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
+	uint32_t n_queues = 0, i;
+
+	for (i = 0; i < port->n_subports_per_port; i++)
+		n_queues += rte_sched_subport_queues(port->subports[i]);
+
+	return n_queues;
 }
 
 static int
@@ -1470,18 +1465,42 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 	struct rte_sched_queue_stats *stats,
 	uint16_t *qlen)
 {
+	uint32_t subport_id, qindex;
+	struct rte_sched_subport *s;
 	struct rte_sched_queue *q;
 	struct rte_sched_queue_extra *qe;
 
 	/* Check user parameters */
-	if ((port == NULL) ||
-	    (queue_id >= rte_sched_port_queues_per_port(port)) ||
-		(stats == NULL) ||
-		(qlen == NULL)) {
-		return -1;
-	}
-	q = port->queue + queue_id;
-	qe = port->queue_extra + queue_id;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (queue_id >= rte_sched_port_queues(port)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for queue id \n", __func__);
+		return -EINVAL;
+	}
+
+	if (stats == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter stats \n", __func__);
+		return -EINVAL;
+	}
+
+	if (qlen == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter qlen \n", __func__);
+		return -EINVAL;
+	}
+
+	subport_id = (queue_id >> (port->max_subport_pipes_log2 + 4)) &
+					(port->n_subports_per_port - 1);
+	s = port->subports[subport_id];
+	qindex = ((1 << (port->max_subport_pipes_log2 + 4)) - 1) & queue_id;
+	q = s->queue + qindex;
+	qe = s->queue_extra + qindex;
 
 	/* Copy queue stats and clear */
 	memcpy(stats, &qe->stats, sizeof(struct rte_sched_queue_stats));
-- 
2.21.0



* [dpdk-dev] [PATCH v2 23/28] test/sched: update unit test
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (21 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 22/28] sched: update sched queue stats API Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 24/28] net/softnic: update softnic tm function Jasvinder Singh
                       ` (7 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the unit test to allow configuration flexibility for
pipe traffic classes and queues, and subport level
configuration of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index d6651d490..929496ce8 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -20,40 +20,43 @@
 #define SUBPORT         0
 #define PIPE            1
 #define TC              2
-#define QUEUE           3
-
-static struct rte_sched_subport_params subport_param[] = {
-	{
-		.tb_rate = 1250000000,
-		.tb_size = 1000000,
-
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
-		.tc_period = 10,
-	},
-};
+#define QUEUE           2
 
 static struct rte_sched_pipe_params pipe_profile[] = {
 	{ /* Profile #0 */
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175,
+				305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
 		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
 	},
 };
 
+static struct rte_sched_subport_params subport_param[] = {
+	{
+		.tb_rate = 1250000000,
+		.tb_size = 1000000,
+
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_period = 10,
+		.n_subport_pipes = 1024,
+		.qsize = {32, 32, 32, 32, 32, 32, 32, 32,
+			32, 32, 32, 32, 32, 32, 32, 32},
+		.pipe_profiles = pipe_profile,
+		.n_pipe_profiles = 1,
+	},
+};
+
 static struct rte_sched_port_params port_param = {
 	.socket = 0, /* computed */
 	.rate = 0, /* computed */
 	.mtu = 1522,
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
-	.n_pipes_per_subport = 1024,
-	.qsize = {32, 32, 32, 32},
-	.pipe_profiles = pipe_profile,
-	.n_pipe_profiles = 1,
 };
 
 #define NB_MBUF          32
@@ -135,7 +138,7 @@ test_sched(void)
 	err = rte_sched_subport_config(port, SUBPORT, subport_param);
 	TEST_ASSERT_SUCCESS(err, "Error config sched, err=%d\n", err);
 
-	for (pipe = 0; pipe < port_param.n_pipes_per_subport; pipe ++) {
+	for (pipe = 0; pipe < subport_param[0].n_subport_pipes; pipe++) {
 		err = rte_sched_pipe_config(port, SUBPORT, pipe, 0);
 		TEST_ASSERT_SUCCESS(err, "Error config sched pipe %u, err=%d\n", pipe, err);
 	}
-- 
2.21.0



* [dpdk-dev] [PATCH v2 24/28] net/softnic: update softnic tm function
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (22 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 23/28] test/sched: update unit test Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 25/28] examples/qos_sched: update qos sched sample app Jasvinder Singh
                       ` (6 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the softnic tm function to allow configuration flexibility for
pipe traffic classes and queues, and subport level configuration
of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
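Note: as the queue_node_id() change reads, each strict priority TC owns
a single queue whose offset within the pipe equals the TC index, and the
best-effort TC owns the remaining queues. The sketch below spells out
that mapping under assumed sizes (9 TCs, 8 best-effort queues, 16 queues
per pipe). The TOY_* constants and toy_queue_node_id() are illustrative
names only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: TC0..TC7 strict priority, TC8 best effort. */
#define TOY_TCS_PER_PIPE    9
#define TOY_BE_QUEUES       8
#define TOY_QUEUES_PER_PIPE 16

static inline uint32_t
toy_queue_node_id(uint32_t n_pps, uint32_t subport_id, uint32_t pipe_id,
	uint32_t tc_id, uint32_t queue_id)
{
	/* First queue of this pipe in the port-wide numbering. */
	uint32_t pipe_base =
		(pipe_id + subport_id * n_pps) * TOY_QUEUES_PER_PIPE;

	assert(tc_id < TOY_TCS_PER_PIPE);
	/* Only the best-effort TC has more than one queue. */
	assert(tc_id == TOY_TCS_PER_PIPE - 1 ?
		queue_id < TOY_BE_QUEUES : queue_id == 0);

	/* SP TC t maps to offset t; BE queue q maps to offset 8 + q. */
	return pipe_base + tc_id + queue_id;
}
```

With 4 pipes per subport, the last best-effort queue of pipe 2 in
subport 1 gets node id (2 + 1 * 4) * 16 + 8 + 7 = 111.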
 drivers/net/softnic/rte_eth_softnic.c         | 131 ++++++++
 drivers/net/softnic/rte_eth_softnic_cli.c     | 286 ++++++++++++++++--
 .../net/softnic/rte_eth_softnic_internals.h   |   8 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |  89 +++---
 4 files changed, 445 insertions(+), 69 deletions(-)

diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 4bda2f2b0..50a48e90b 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -28,6 +28,19 @@
 #define PMD_PARAM_TM_QSIZE1                                "tm_qsize1"
 #define PMD_PARAM_TM_QSIZE2                                "tm_qsize2"
 #define PMD_PARAM_TM_QSIZE3                                "tm_qsize3"
+#define PMD_PARAM_TM_QSIZE4                                "tm_qsize4"
+#define PMD_PARAM_TM_QSIZE5                                "tm_qsize5"
+#define PMD_PARAM_TM_QSIZE6                                "tm_qsize6"
+#define PMD_PARAM_TM_QSIZE7                                "tm_qsize7"
+#define PMD_PARAM_TM_QSIZE8                                "tm_qsize8"
+#define PMD_PARAM_TM_QSIZE9                                "tm_qsize9"
+#define PMD_PARAM_TM_QSIZE10                               "tm_qsize10"
+#define PMD_PARAM_TM_QSIZE11                               "tm_qsize11"
+#define PMD_PARAM_TM_QSIZE12                               "tm_qsize12"
+#define PMD_PARAM_TM_QSIZE13                               "tm_qsize13"
+#define PMD_PARAM_TM_QSIZE14                               "tm_qsize14"
+#define PMD_PARAM_TM_QSIZE15                               "tm_qsize15"
+
 
 static const char * const pmd_valid_args[] = {
 	PMD_PARAM_FIRMWARE,
@@ -39,6 +52,18 @@ static const char * const pmd_valid_args[] = {
 	PMD_PARAM_TM_QSIZE1,
 	PMD_PARAM_TM_QSIZE2,
 	PMD_PARAM_TM_QSIZE3,
+	PMD_PARAM_TM_QSIZE4,
+	PMD_PARAM_TM_QSIZE5,
+	PMD_PARAM_TM_QSIZE6,
+	PMD_PARAM_TM_QSIZE7,
+	PMD_PARAM_TM_QSIZE8,
+	PMD_PARAM_TM_QSIZE9,
+	PMD_PARAM_TM_QSIZE10,
+	PMD_PARAM_TM_QSIZE11,
+	PMD_PARAM_TM_QSIZE12,
+	PMD_PARAM_TM_QSIZE13,
+	PMD_PARAM_TM_QSIZE14,
+	PMD_PARAM_TM_QSIZE15,
 	NULL
 };
 
@@ -434,6 +459,18 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 	p->tm.qsize[1] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[2] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[3] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[4] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[5] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[6] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[7] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[8] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[9] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[10] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[11] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[12] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[13] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[14] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[15] = SOFTNIC_TM_QUEUE_SIZE;
 
 	/* Firmware script (optional) */
 	if (rte_kvargs_count(kvlist, PMD_PARAM_FIRMWARE) == 1) {
@@ -504,6 +541,88 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 			goto out_free;
 	}
 
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE4) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE4,
+			&get_uint32, &p->tm.qsize[4]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE5) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE5,
+			&get_uint32, &p->tm.qsize[5]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE6) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE6,
+			&get_uint32, &p->tm.qsize[6]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE7) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE7,
+			&get_uint32, &p->tm.qsize[7]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE8) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE8,
+			&get_uint32, &p->tm.qsize[8]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE9) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE9,
+			&get_uint32, &p->tm.qsize[9]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE10) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE10,
+			&get_uint32, &p->tm.qsize[10]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE11) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE11,
+			&get_uint32, &p->tm.qsize[11]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE12) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE12,
+			&get_uint32, &p->tm.qsize[12]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE13) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE13,
+			&get_uint32, &p->tm.qsize[13]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE14) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE14,
+			&get_uint32, &p->tm.qsize[14]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE15) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE15,
+			&get_uint32, &p->tm.qsize[15]);
+		if (ret < 0)
+			goto out_free;
+	}
+
 out_free:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -588,6 +707,18 @@ RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
 	PMD_PARAM_TM_QSIZE1 "=<uint32> "
 	PMD_PARAM_TM_QSIZE2 "=<uint32> "
 	PMD_PARAM_TM_QSIZE3 "=<uint32>"
+	PMD_PARAM_TM_QSIZE4 "=<uint32> "
+	PMD_PARAM_TM_QSIZE5 "=<uint32> "
+	PMD_PARAM_TM_QSIZE6 "=<uint32> "
+	PMD_PARAM_TM_QSIZE7 "=<uint32> "
+	PMD_PARAM_TM_QSIZE8 "=<uint32> "
+	PMD_PARAM_TM_QSIZE9 "=<uint32> "
+	PMD_PARAM_TM_QSIZE10 "=<uint32> "
+	PMD_PARAM_TM_QSIZE11 "=<uint32> "
+	PMD_PARAM_TM_QSIZE12 "=<uint32> "
+	PMD_PARAM_TM_QSIZE13 "=<uint32> "
+	PMD_PARAM_TM_QSIZE14 "=<uint32> "
+	PMD_PARAM_TM_QSIZE15 "=<uint32>"
 );
 
 
diff --git a/drivers/net/softnic/rte_eth_softnic_cli.c b/drivers/net/softnic/rte_eth_softnic_cli.c
index 56fc92ba2..63325623f 100644
--- a/drivers/net/softnic/rte_eth_softnic_cli.c
+++ b/drivers/net/softnic/rte_eth_softnic_cli.c
@@ -566,9 +566,13 @@ queue_node_id(uint32_t n_spp __rte_unused,
 	uint32_t tc_id,
 	uint32_t queue_id)
 {
-	return queue_id +
-		tc_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE +
-		(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
+	if (tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return queue_id + tc_id +
+			(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
+	else
+		return queue_id +
+			tc_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE +
+			(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
 }
 
 struct tmgr_hierarchy_default_params {
@@ -617,10 +621,19 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 		},
 	};
 
+	uint32_t *shared_shaper_id =
+		(uint32_t *) calloc(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+		sizeof(uint32_t));
+	if (shared_shaper_id == NULL)
+		return -1;
+
+	memcpy(shared_shaper_id, params->shared_shaper_id.tc,
+		sizeof(params->shared_shaper_id.tc));
+
 	struct rte_tm_node_params tc_node_params[] = {
 		[0] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[0],
-			.shared_shaper_id = &params->shared_shaper_id.tc[0],
+			.shared_shaper_id = &shared_shaper_id[0],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[0]) ? 1 : 0,
 			.nonleaf = {
@@ -630,7 +643,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[1] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[1],
-			.shared_shaper_id = &params->shared_shaper_id.tc[1],
+			.shared_shaper_id = &shared_shaper_id[1],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[1]) ? 1 : 0,
 			.nonleaf = {
@@ -640,7 +653,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[2] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[2],
-			.shared_shaper_id = &params->shared_shaper_id.tc[2],
+			.shared_shaper_id = &shared_shaper_id[2],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[2]) ? 1 : 0,
 			.nonleaf = {
@@ -650,13 +663,63 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[3] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[3],
-			.shared_shaper_id = &params->shared_shaper_id.tc[3],
+			.shared_shaper_id = &shared_shaper_id[3],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[3]) ? 1 : 0,
 			.nonleaf = {
 				.n_sp_priorities = 1,
 			},
 		},
+
+		[4] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[4],
+			.shared_shaper_id = &shared_shaper_id[4],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[4]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[5] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[5],
+			.shared_shaper_id = &shared_shaper_id[5],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[5]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[6] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[6],
+			.shared_shaper_id = &shared_shaper_id[6],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[6]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[7] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[7],
+			.shared_shaper_id = &shared_shaper_id[7],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[7]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[8] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[8],
+			.shared_shaper_id = &shared_shaper_id[8],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[8]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
 	};
 
 	struct rte_tm_node_params queue_node_params = {
@@ -730,7 +793,21 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 					return -1;
 
 				/* Hierarchy level 4: Queue nodes */
-				for (q = 0; q < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; q++) {
+				if (t == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) { /*BE Traffic Class*/
+					for (q = 0; q < RTE_SCHED_BE_QUEUES_PER_PIPE; q++) {
+						status = rte_tm_node_add(port_id,
+							queue_node_id(n_spp, n_pps, s, p, t, q),
+							tc_node_id(n_spp, n_pps, s, p, t),
+							0,
+							params->weight.queue[q],
+							RTE_TM_NODE_LEVEL_ID_ANY,
+							&queue_node_params,
+							&error);
+						if (status)
+							return -1;
+					} /* Queues (BE Traffic Class) */
+				} else { /* SP Traffic Class */
+					q = 0;
 					status = rte_tm_node_add(port_id,
 						queue_node_id(n_spp, n_pps, s, p, t, q),
 						tc_node_id(n_spp, n_pps, s, p, t),
@@ -741,7 +818,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 						&error);
 					if (status)
 						return -1;
-				} /* Queue */
+				} /* Queue (SP Traffic Class) */
 			} /* TC */
 		} /* Pipe */
 	} /* Subport */
@@ -762,13 +839,23 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
  *   tc1 <profile_id>
  *   tc2 <profile_id>
  *   tc3 <profile_id>
+ *   tc4 <profile_id>
+ *   tc5 <profile_id>
+ *   tc6 <profile_id>
+ *   tc7 <profile_id>
+ *   tc8 <profile_id>
  *  shared shaper
  *   tc0 <id | none>
  *   tc1 <id | none>
  *   tc2 <id | none>
  *   tc3 <id | none>
+ *   tc4 <id | none>
+ *   tc5 <id | none>
+ *   tc6 <id | none>
+ *   tc7 <id | none>
+ *   tc8 <id | none>
  *  weight
- *   queue  <q0> ... <q15>
+ *   queue  <q8> ... <q15>
  */
 static void
 cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
@@ -778,11 +865,11 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 	size_t out_size)
 {
 	struct tmgr_hierarchy_default_params p;
-	int i, status;
+	int i, j, status;
 
 	memset(&p, 0, sizeof(p));
 
-	if (n_tokens != 50) {
+	if (n_tokens != 62) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -894,27 +981,77 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		return;
 	}
 
+	if (strcmp(tokens[22], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[4], tokens[23]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc4 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[24], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[5], tokens[25]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc5 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[26], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[6], tokens[27]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc6 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[28], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[7], tokens[29]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc7 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[30], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[8], tokens[31]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc8 profile id");
+		return;
+	}
+
 	/* Shared shaper */
 
-	if (strcmp(tokens[22], "shared") != 0) {
+	if (strcmp(tokens[32], "shared") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shared");
 		return;
 	}
 
-	if (strcmp(tokens[23], "shaper") != 0) {
+	if (strcmp(tokens[33], "shaper") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shaper");
 		return;
 	}
 
-	if (strcmp(tokens[24], "tc0") != 0) {
+	if (strcmp(tokens[34], "tc0") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc0");
 		return;
 	}
 
-	if (strcmp(tokens[25], "none") == 0)
+	if (strcmp(tokens[35], "none") == 0)
 		p.shared_shaper_id.tc_valid[0] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[25]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[35]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc0");
 			return;
 		}
@@ -922,15 +1059,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[0] = 1;
 	}
 
-	if (strcmp(tokens[26], "tc1") != 0) {
+	if (strcmp(tokens[36], "tc1") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc1");
 		return;
 	}
 
-	if (strcmp(tokens[27], "none") == 0)
+	if (strcmp(tokens[37], "none") == 0)
 		p.shared_shaper_id.tc_valid[1] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[27]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[37]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc1");
 			return;
 		}
@@ -938,15 +1075,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[1] = 1;
 	}
 
-	if (strcmp(tokens[28], "tc2") != 0) {
+	if (strcmp(tokens[38], "tc2") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc2");
 		return;
 	}
 
-	if (strcmp(tokens[29], "none") == 0)
+	if (strcmp(tokens[39], "none") == 0)
 		p.shared_shaper_id.tc_valid[2] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[29]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[39]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc2");
 			return;
 		}
@@ -954,15 +1091,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[2] = 1;
 	}
 
-	if (strcmp(tokens[30], "tc3") != 0) {
+	if (strcmp(tokens[40], "tc3") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc3");
 		return;
 	}
 
-	if (strcmp(tokens[31], "none") == 0)
+	if (strcmp(tokens[41], "none") == 0)
 		p.shared_shaper_id.tc_valid[3] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[31]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[41]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc3");
 			return;
 		}
@@ -970,22 +1107,107 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[3] = 1;
 	}
 
+	if (strcmp(tokens[42], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (strcmp(tokens[43], "none") == 0)
+		p.shared_shaper_id.tc_valid[4] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[4], tokens[43]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc4");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[4] = 1;
+	}
+
+	if (strcmp(tokens[44], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (strcmp(tokens[45], "none") == 0)
+		p.shared_shaper_id.tc_valid[5] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[5], tokens[45]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc5");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[5] = 1;
+	}
+
+	if (strcmp(tokens[46], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (strcmp(tokens[47], "none") == 0)
+		p.shared_shaper_id.tc_valid[6] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[6], tokens[47]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc6");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[6] = 1;
+	}
+
+	if (strcmp(tokens[48], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (strcmp(tokens[49], "none") == 0)
+		p.shared_shaper_id.tc_valid[7] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[7], tokens[49]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc7");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[7] = 1;
+	}
+
+	if (strcmp(tokens[50], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (strcmp(tokens[51], "none") == 0)
+		p.shared_shaper_id.tc_valid[8] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[8], tokens[51]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc8");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[8] = 1;
+	}
+
 	/* Weight */
 
-	if (strcmp(tokens[32], "weight") != 0) {
+	if (strcmp(tokens[52], "weight") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "weight");
 		return;
 	}
 
-	if (strcmp(tokens[33], "queue") != 0) {
+	if (strcmp(tokens[53], "queue") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "queue");
 		return;
 	}
 
-	for (i = 0; i < 16; i++) {
-		if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[34 + i]) != 0) {
-			snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
-			return;
+	for (i = 0, j = 0; i < 16; i++) {
+		if (i < RTE_SCHED_BE_QUEUES_PER_PIPE) {
+			p.weight.queue[i] = 1;
+		} else {
+			if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[54 + j]) != 0) {
+				snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
+				return;
+			}
+			j++;
 		}
 	}
 
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 415434d0d..5525dff98 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -43,7 +43,7 @@ struct pmd_params {
 	/** Traffic Management (TM) */
 	struct {
 		uint32_t n_queues; /**< Number of queues */
-		uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+		uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	} tm;
 };
 
@@ -161,13 +161,15 @@ TAILQ_HEAD(softnic_link_list, softnic_link);
 #define TM_MAX_PIPES_PER_SUBPORT			4096
 #endif
 
+#ifndef TM_MAX_PIPE_PROFILE
+#define TM_MAX_PIPE_PROFILE				256
+#endif
 struct tm_params {
 	struct rte_sched_port_params port_params;
 
 	struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
 
-	struct rte_sched_pipe_params
-		pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+	struct rte_sched_pipe_params pipe_profiles[TM_MAX_PIPE_PROFILE];
 	uint32_t n_pipe_profiles;
 	uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
 };
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 58744a9eb..6ba993147 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -85,7 +85,8 @@ softnic_tmgr_port_create(struct pmd_internals *p,
 	/* Subport */
 	n_subports = t->port_params.n_subports_per_port;
 	for (subport_id = 0; subport_id < n_subports; subport_id++) {
-		uint32_t n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+		uint32_t n_pipes_per_subport =
+			t->subport_params[subport_id].n_subport_pipes;
 		uint32_t pipe_id;
 		int status;
 
@@ -367,7 +368,8 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
 {
 	struct pmd_internals *p = dev->data->dev_private;
 	uint32_t n_queues_max = p->params.tm.n_queues;
-	uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+	uint32_t n_tc_max =
+		(n_queues_max * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE) / RTE_SCHED_QUEUES_PER_PIPE;
 	uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
 	uint32_t n_subports_max = n_pipes_max;
 	uint32_t n_root_max = 1;
@@ -625,10 +627,10 @@ static const struct rte_tm_level_capabilities tm_level_cap[] = {
 			.shaper_shared_n_max = 1,
 
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 
@@ -793,10 +795,10 @@ static const struct rte_tm_node_capabilities tm_node_cap[] = {
 
 		{.nonleaf = {
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 		} },
@@ -2043,15 +2045,13 @@ pipe_profile_build(struct rte_eth_dev *dev,
 
 		/* Queue */
 		TAILQ_FOREACH(nq, nl, node) {
-			uint32_t pipe_queue_id;
 
 			if (nq->level != TM_NODE_LEVEL_QUEUE ||
 				nq->parent_node_id != nt->node_id)
 				continue;
 
-			pipe_queue_id = nt->priority *
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
-			pp->wrr_weights[pipe_queue_id] = nq->weight;
+			if (nt->priority == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+				pp->wrr_weights[queue_id] = nq->weight;
 
 			queue_id++;
 		}
@@ -2065,7 +2065,7 @@ pipe_profile_free_exists(struct rte_eth_dev *dev,
 	struct pmd_internals *p = dev->data->dev_private;
 	struct tm_params *t = &p->soft.tm.params;
 
-	if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+	if (t->n_pipe_profiles < TM_MAX_PIPE_PROFILE) {
 		*pipe_profile_id = t->n_pipe_profiles;
 		return 1;
 	}
@@ -2213,10 +2213,11 @@ tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
 #ifdef RTE_SCHED_RED
 
 static void
-wred_profiles_set(struct rte_eth_dev *dev)
+wred_profiles_set(struct rte_eth_dev *dev, uint32_t subport_id)
 {
 	struct pmd_internals *p = dev->data->dev_private;
-	struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+	struct rte_sched_subport_params *pp =
+		&p->soft.tm.params.subport_params[subport_id];
 	uint32_t tc_id;
 	enum rte_color color;
 
@@ -2235,7 +2236,7 @@ wred_profiles_set(struct rte_eth_dev *dev)
 
 #else
 
-#define wred_profiles_set(dev)
+#define wred_profiles_set(dev, subport_id)
 
 #endif
 
@@ -2332,7 +2333,7 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 				rte_strerror(EINVAL));
 	}
 
-	/* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+	/* Each pipe has exactly 9 TCs, with exactly one TC for each priority */
 	TAILQ_FOREACH(np, nl, node) {
 		uint32_t mask = 0, mask_expected =
 			RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
@@ -2369,7 +2370,7 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 		if (nt->level != TM_NODE_LEVEL_TC)
 			continue;
 
-		if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+		if (nt->n_children != 1 && nt->n_children != RTE_SCHED_BE_QUEUES_PER_PIPE)
 			return -rte_tm_error_set(error,
 				EINVAL,
 				RTE_TM_ERROR_TYPE_UNSPECIFIED,
@@ -2525,19 +2526,8 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 		.frame_overhead =
 			root->shaper_profile->params.pkt_length_adjust,
 		.n_subports_per_port = root->n_children,
-		.n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
-			h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
-		.qsize = {p->params.tm.qsize[0],
-			p->params.tm.qsize[1],
-			p->params.tm.qsize[2],
-			p->params.tm.qsize[3],
-		},
-		.pipe_profiles = t->pipe_profiles,
-		.n_pipe_profiles = t->n_pipe_profiles,
 	};
 
-	wred_profiles_set(dev);
-
 	subport_id = 0;
 	TAILQ_FOREACH(n, nl, node) {
 		uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
@@ -2566,10 +2556,41 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 					tc_rate[1],
 					tc_rate[2],
 					tc_rate[3],
-			},
-			.tc_period = SUBPORT_TC_PERIOD,
+					tc_rate[4],
+					tc_rate[5],
+					tc_rate[6],
+					tc_rate[7],
+					tc_rate[8],
+				},
+				.tc_period = SUBPORT_TC_PERIOD,
+
+				.n_subport_pipes = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+					h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+
+				.qsize = {p->params.tm.qsize[0],
+					p->params.tm.qsize[1],
+					p->params.tm.qsize[2],
+					p->params.tm.qsize[3],
+					p->params.tm.qsize[4],
+					p->params.tm.qsize[5],
+					p->params.tm.qsize[6],
+					p->params.tm.qsize[7],
+					p->params.tm.qsize[8],
+					p->params.tm.qsize[9],
+					p->params.tm.qsize[10],
+					p->params.tm.qsize[11],
+					p->params.tm.qsize[12],
+					p->params.tm.qsize[13],
+					p->params.tm.qsize[14],
+					p->params.tm.qsize[15],
+				},
+
+				.pipe_profiles = t->pipe_profiles,
+				.n_pipe_profiles = t->n_pipe_profiles,
+				.n_max_pipe_profiles = TM_MAX_PIPE_PROFILE,
 		};
 
+		wred_profiles_set(dev, subport_id);
 		subport_id++;
 	}
 }
@@ -2666,7 +2687,7 @@ update_queue_weight(struct rte_eth_dev *dev,
 	uint32_t subport_id = tm_node_subport_id(dev, ns);
 
 	uint32_t pipe_queue_id =
-		tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+		tc_id * RTE_SCHED_QUEUES_PER_PIPE + queue_id;
 
 	struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
 	struct rte_sched_pipe_params profile1;
@@ -3023,7 +3044,7 @@ tm_port_queue_id(struct rte_eth_dev *dev,
 	uint32_t port_tc_id =
 		port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
 	uint32_t port_queue_id =
-		port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+		port_tc_id * RTE_SCHED_QUEUES_PER_PIPE + tc_queue_id;
 
 	return port_queue_id;
 }
@@ -3149,8 +3170,8 @@ read_pipe_stats(struct rte_eth_dev *dev,
 		uint32_t qid = tm_port_queue_id(dev,
 			subport_id,
 			pipe_id,
-			i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
-			i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+			i / RTE_SCHED_QUEUES_PER_PIPE,
+			i % RTE_SCHED_QUEUES_PER_PIPE);
 
 		int status = rte_sched_queue_read_stats(SCHED(p),
 			qid,
@@ -3202,7 +3223,7 @@ read_tc_stats(struct rte_eth_dev *dev,
 	uint32_t i;
 
 	/* Stats read */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		struct rte_sched_queue_stats s;
 		uint16_t qlen;
 
-- 
2.21.0
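
Note on the token layout parsed by cmd_tmgr_hierarchy_default() in the
patch above: the hard-coded indices (tokens[32] for "shared", tokens[52]
for "weight", n_tokens == 62) all follow from the number of traffic
classes and the number of weights read from the command line. The sketch
below reproduces that arithmetic. It is illustrative only: the helper
names are hypothetical and not part of the softnic driver, and the
starting index 14 is inferred from the patch, where "tc4" is checked at
tokens[22].

```c
#include <assert.h>

/*
 * Index of the first "tc0" keyword in the shaper profile list.
 * Inferred from the patch ("tc4" is matched at tokens[22]); not a
 * constant defined by the driver.
 */
enum { FIRST_TC_TOKEN = 14 };

/* "shared" follows n_tc pairs of "tcN <profile_id>". */
static inline unsigned int
shared_keyword_idx(unsigned int n_tc)
{
	return FIRST_TC_TOKEN + 2 * n_tc;
}

/* "tcN" inside the shared shaper list: skip "shared shaper", then N pairs. */
static inline unsigned int
shared_tc_idx(unsigned int n_tc, unsigned int tc)
{
	return shared_keyword_idx(n_tc) + 2 + 2 * tc;
}

/* "weight" follows n_tc pairs of "tcN <id | none>". */
static inline unsigned int
weight_keyword_idx(unsigned int n_tc)
{
	return shared_keyword_idx(n_tc) + 2 + 2 * n_tc;
}

/* Total token count: skip "weight queue", then n_weights values. */
static inline unsigned int
n_tokens_expected(unsigned int n_tc, unsigned int n_weights)
{
	return weight_keyword_idx(n_tc) + 2 + n_weights;
}

/* Cross-check against the indices that appear in the patch. */
static inline void
token_layout_selfcheck(void)
{
	assert(shared_keyword_idx(9) == 32);
	assert(shared_tc_idx(9, 0) == 34);
	assert(weight_keyword_idx(9) == 52);
	assert(n_tokens_expected(9, 8) == 62);  /* new layout */
	assert(n_tokens_expected(4, 16) == 50); /* old layout */
}
```

With 9 traffic classes and 8 weights this gives the new count of 62;
with 4 traffic classes and 16 weights it gives the previous count of 50.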



* [dpdk-dev] [PATCH v2 25/28] examples/qos_sched: update qos sched sample app
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (23 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 24/28] net/softnic: update softnic tm function Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 26/28] examples/ip_pipeline: update ip pipeline " Jasvinder Singh
                       ` (5 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the qos_sched sample app to support flexible configuration of
pipe traffic classes and queues, and subport-level configuration of
the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
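A minimal sketch of the queue selection and queue-to-traffic-class
mapping that get_pkt_sched() and cfg_load_subport() perform in this
patch, for readers who want to check the arithmetic in isolation. It
assumes 9 traffic classes and 16 queues per pipe (matching the 9-entry
tc_rate and 16-entry qsize initializers below). The sketch_* names and
SKETCH_* constants are hypothetical stand-ins, not part of the sample
app or of librte_sched.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this series; stand-ins for the RTE_SCHED_* macros. */
#define SKETCH_TCS_PER_PIPE    9
#define SKETCH_QUEUES_PER_PIPE 16

/*
 * Collect the indices of queues with a non-zero size, as the
 * "queue sizes" parser does. Returns the number of active queues.
 */
static inline uint32_t
sketch_active_queues(const uint16_t qsize[SKETCH_QUEUES_PER_PIPE],
	uint32_t active[SKETCH_QUEUES_PER_PIPE])
{
	uint32_t n = 0, q;

	for (q = 0; q < SKETCH_QUEUES_PER_PIPE; q++)
		if (qsize[q] != 0)
			active[n++] = q;

	return n;
}

/*
 * Pick an active queue from the upper byte of a 16-bit packet field.
 * Precondition: n_active > 0, otherwise the modulo divides by zero.
 */
static inline uint32_t
sketch_pick_queue(const uint32_t *active, uint32_t n_active, uint16_t field)
{
	return active[(uint32_t)(field >> 8) % n_active];
}

/*
 * Queues below the last traffic class map one-to-one to a strict
 * priority TC; all remaining queues belong to the best-effort TC.
 */
static inline uint32_t
sketch_queue_to_tc(uint32_t queue)
{
	return queue > SKETCH_TCS_PER_PIPE - 1 ?
		SKETCH_TCS_PER_PIPE - 1 : queue;
}
```

Under these assumptions queues 0..7 map to TC0..TC7 and queues 8..15 all
map to TC8, the best-effort class. Note that the modulo in the packet
path requires at least one non-zero entry in "queue sizes".
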
 examples/qos_sched/app_thread.c   |  11 +-
 examples/qos_sched/cfg_file.c     | 283 +++++++++--------
 examples/qos_sched/init.c         | 109 ++++---
 examples/qos_sched/main.h         |   7 +-
 examples/qos_sched/profile.cfg    |  59 +++-
 examples/qos_sched/profile_ov.cfg |  47 ++-
 examples/qos_sched/stats.c        | 483 +++++++++++++++++-------------
 7 files changed, 593 insertions(+), 406 deletions(-)

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index e14b275e3..25a8d42a0 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -20,13 +20,11 @@
  * QoS parameters are encoded as follows:
  *		Outer VLAN ID defines subport
  *		Inner VLAN ID defines pipe
- *		Destination IP 0.0.XXX.0 defines traffic class
  *		Destination IP host (0.0.0.XXX) defines queue
  * Values below define offset to each field from start of frame
  */
 #define SUBPORT_OFFSET	7
 #define PIPE_OFFSET		9
-#define TC_OFFSET		20
 #define QUEUE_OFFSET	20
 #define COLOR_OFFSET	19
 
@@ -39,11 +37,10 @@ get_pkt_sched(struct rte_mbuf *m, uint32_t *subport, uint32_t *pipe,
 	*subport = (rte_be_to_cpu_16(pdata[SUBPORT_OFFSET]) & 0x0FFF) &
 			(port_params.n_subports_per_port - 1); /* Outer VLAN ID*/
 	*pipe = (rte_be_to_cpu_16(pdata[PIPE_OFFSET]) & 0x0FFF) &
-			(port_params.n_pipes_per_subport - 1); /* Inner VLAN ID */
-	*traffic_class = (pdata[QUEUE_OFFSET] & 0x0F) &
-			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1); /* Destination IP */
-	*queue = ((pdata[QUEUE_OFFSET] >> 8) & 0x0F) &
-			(RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1) ; /* Destination IP */
+			(subport_params[*subport].n_subport_pipes - 1); /* Inner VLAN ID */
+	*queue = active_queues[(pdata[QUEUE_OFFSET] >> 8) % n_active_queues];
+	*traffic_class = (*queue > (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) ?
+			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) : *queue); /* Destination IP */
 	*color = pdata[COLOR_OFFSET] & 0x03; 	/* Destination IP */
 
 	return 0;
diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index 76ffffc4b..7f54bfe22 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -24,7 +24,6 @@ int
 cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params)
 {
 	const char *entry;
-	int j;
 
 	if (!cfg || !port_params)
 		return -1;
@@ -37,93 +36,6 @@ cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params
 	if (entry)
 		port_params->n_subports_per_port = (uint32_t)atoi(entry);
 
-	entry = rte_cfgfile_get_entry(cfg, "port", "number of pipes per subport");
-	if (entry)
-		port_params->n_pipes_per_subport = (uint32_t)atoi(entry);
-
-	entry = rte_cfgfile_get_entry(cfg, "port", "queue sizes");
-	if (entry) {
-		char *next;
-
-		for(j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
-			port_params->qsize[j] = (uint16_t)strtol(entry, &next, 10);
-			if (next == NULL)
-				break;
-			entry = next;
-		}
-	}
-
-#ifdef RTE_SCHED_RED
-	for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
-		char str[32];
-
-		/* Parse WRED min thresholds */
-		snprintf(str, sizeof(str), "tc %d wred min", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].min_th
-					= (uint16_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED max thresholds */
-		snprintf(str, sizeof(str), "tc %d wred max", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].max_th
-					= (uint16_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED inverse mark probabilities */
-		snprintf(str, sizeof(str), "tc %d wred inv prob", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].maxp_inv
-					= (uint8_t)strtol(entry, &next, 10);
-
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-
-		/* Parse WRED EWMA filter weights */
-		snprintf(str, sizeof(str), "tc %d wred weight", j);
-		entry = rte_cfgfile_get_entry(cfg, "red", str);
-		if (entry) {
-			char *next;
-			int k;
-			/* for each packet colour (green, yellow, red) */
-			for (k = 0; k < RTE_COLORS; k++) {
-				port_params->red_params[j][k].wq_log2
-					= (uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-	}
-#endif /* RTE_SCHED_RED */
-
 	return 0;
 }
 
@@ -139,7 +51,7 @@ cfg_load_pipe(struct rte_cfgfile *cfg, struct rte_sched_pipe_params *pipe_params
 		return -1;
 
 	profiles = rte_cfgfile_num_sections(cfg, "pipe profile", sizeof("pipe profile") - 1);
-	port_params.n_pipe_profiles = profiles;
+	subport_params[0].n_pipe_profiles = profiles;
 
 	for (j = 0; j < profiles; j++) {
 		char pipe_name[32];
@@ -173,46 +85,36 @@ cfg_load_pipe(struct rte_cfgfile *cfg, struct rte_sched_pipe_params *pipe_params
 		if (entry)
 			pipe_params[j].tc_rate[3] = (uint32_t)atoi(entry);
 
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 4 rate");
+		if (entry)
+			pipe_params[j].tc_rate[4] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 5 rate");
+		if (entry)
+			pipe_params[j].tc_rate[5] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 6 rate");
+		if (entry)
+			pipe_params[j].tc_rate[6] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 7 rate");
+		if (entry)
+			pipe_params[j].tc_rate[7] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 rate");
+		if (entry)
+			pipe_params[j].tc_rate[8] = (uint32_t)atoi(entry);
+
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 oversubscription weight");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 oversubscription weight");
 		if (entry)
 			pipe_params[j].tc_ov_weight = (uint8_t)atoi(entry);
 #endif
 
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 0 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*0 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 1 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*1 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 2 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*2 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 wrr weights");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 wrr weights");
 		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*3 + i] =
+			for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+				pipe_params[j].wrr_weights[i] =
 					(uint8_t)strtol(entry, &next, 10);
 				if (next == NULL)
 					break;
@@ -233,12 +135,112 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 		return -1;
 
 	memset(app_pipe_to_profile, -1, sizeof(app_pipe_to_profile));
+	memset(active_queues, 0, sizeof(active_queues));
+	n_active_queues = 0;
+
+#ifdef RTE_SCHED_RED
+	char sec_name[CFG_NAME_LEN];
+	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+
+	snprintf(sec_name, sizeof(sec_name), "red");
+
+	if (rte_cfgfile_has_section(cfg, sec_name)) {
+
+		for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+			char str[32];
+
+			/* Parse WRED min thresholds */
+			snprintf(str, sizeof(str), "tc %d wred min", i);
+			entry = rte_cfgfile_get_entry(cfg, sec_name, str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].min_th
+						= (uint16_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED max thresholds */
+			snprintf(str, sizeof(str), "tc %d wred max", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].max_th
+						= (uint16_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED inverse mark probabilities */
+			snprintf(str, sizeof(str), "tc %d wred inv prob", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].maxp_inv
+						= (uint8_t)strtol(entry, &next, 10);
+
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
+			/* Parse WRED EWMA filter weights */
+			snprintf(str, sizeof(str), "tc %d wred weight", i);
+			entry = rte_cfgfile_get_entry(cfg, "red", str);
+			if (entry) {
+				char *next;
+				/* for each packet colour (green, yellow, red) */
+				for (j = 0; j < RTE_COLORS; j++) {
+					red_params[i][j].wq_log2
+						= (uint8_t)strtol(entry, &next, 10);
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+		}
+	}
+#endif /* RTE_SCHED_RED */
 
 	for (i = 0; i < MAX_SCHED_SUBPORTS; i++) {
 		char sec_name[CFG_NAME_LEN];
 		snprintf(sec_name, sizeof(sec_name), "subport %d", i);
 
 		if (rte_cfgfile_has_section(cfg, sec_name)) {
+			entry = rte_cfgfile_get_entry(cfg, sec_name,
+				"number of pipes per subport");
+			if (entry)
+				subport_params[i].n_subport_pipes = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "queue sizes");
+			if (entry) {
+				char *next;
+
+				for (j = 0; j < RTE_SCHED_QUEUES_PER_PIPE; j++) {
+					subport_params[i].qsize[j] =
+						(uint16_t)strtol(entry, &next, 10);
+					if (subport_params[i].qsize[j] != 0) {
+						active_queues[n_active_queues] = j;
+						n_active_queues++;
+					}
+
+					if (next == NULL)
+						break;
+					entry = next;
+				}
+			}
+
 			entry = rte_cfgfile_get_entry(cfg, sec_name, "tb rate");
 			if (entry)
 				subport_params[i].tb_rate = (uint32_t)atoi(entry);
@@ -267,6 +269,26 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 			if (entry)
 				subport_params[i].tc_rate[3] = (uint32_t)atoi(entry);
 
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 4 rate");
+			if (entry)
+				subport_params[i].tc_rate[4] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 5 rate");
+			if (entry)
+				subport_params[i].tc_rate[5] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 6 rate");
+			if (entry)
+				subport_params[i].tc_rate[6] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 7 rate");
+			if (entry)
+				subport_params[i].tc_rate[7] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 8 rate");
+			if (entry)
+				subport_params[i].tc_rate[8] = (uint32_t)atoi(entry);
+
 			int n_entries = rte_cfgfile_section_num_entries(cfg, sec_name);
 			struct rte_cfgfile_entry entries[n_entries];
 
@@ -306,6 +328,21 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 					}
 				}
 			}
+
+#ifdef RTE_SCHED_RED
+			for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
+				for (k = 0; k < RTE_COLORS; k++) {
+					subport_params[i].red_params[j][k].min_th =
+						red_params[j][k].min_th;
+					subport_params[i].red_params[j][k].max_th =
+						red_params[j][k].max_th;
+					subport_params[i].red_params[j][k].maxp_inv =
+						red_params[j][k].maxp_inv;
+					subport_params[i].red_params[j][k].wq_log2 =
+						red_params[j][k].wq_log2;
+				}
+			}
+#endif
 		}
 	}
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index f6e9af16b..55000b9ae 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -165,22 +165,12 @@ app_init_port(uint16_t portid, struct rte_mempool *mp)
 	return 0;
 }
 
-static struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS] = {
-	{
-		.tb_rate = 1250000000,
-		.tb_size = 1000000,
-
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
-		.tc_period = 10,
-	},
-};
-
-static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT] = {
+static struct rte_sched_pipe_params pipe_profiles[MAX_SCHED_PIPE_PROFILES] = {
 	{ /* Profile #0 */
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175, 305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 		.tc_ov_weight = 1,
@@ -190,6 +180,70 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
 	},
 };
 
+struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS] = {
+	{
+		.tb_rate = 1250000000,
+		.tb_size = 1000000,
+
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_period = 10,
+		.n_subport_pipes = 4096,
+		.qsize = {64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64},
+		.pipe_profiles = pipe_profiles,
+		.n_pipe_profiles = sizeof(pipe_profiles) / sizeof(struct rte_sched_pipe_params),
+		.n_max_pipe_profiles = MAX_SCHED_PIPE_PROFILES,
+
+#ifdef RTE_SCHED_RED
+		.red_params = {
+			/* Traffic Class 0 Colors Green / Yellow / Red */
+			[0][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[0][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[0][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 1 - Colors Green / Yellow / Red */
+			[1][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[1][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[1][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 2 - Colors Green / Yellow / Red */
+			[2][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[2][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[2][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 3 - Colors Green / Yellow / Red */
+			[3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 4 - Colors Green / Yellow / Red */
+			[4][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[4][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[4][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 5 - Colors Green / Yellow / Red */
+			[5][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[5][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[5][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 6 - Colors Green / Yellow / Red */
+			[6][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[6][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[6][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 7 - Colors Green / Yellow / Red */
+			[7][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[7][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[7][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+			/* Traffic Class 8 - Colors Green / Yellow / Red */
+			[8][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[8][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+			[8][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		},
+#endif /* RTE_SCHED_RED */
+	},
+};
+
 struct rte_sched_port_params port_params = {
 	.name = "port_scheduler_0",
 	.socket = 0, /* computed */
@@ -197,34 +251,6 @@ struct rte_sched_port_params port_params = {
 	.mtu = 6 + 6 + 4 + 4 + 2 + 1500,
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
-	.n_pipes_per_subport = 4096,
-	.qsize = {64, 64, 64, 64},
-	.pipe_profiles = pipe_profiles,
-	.n_pipe_profiles = sizeof(pipe_profiles) / sizeof(struct rte_sched_pipe_params),
-
-#ifdef RTE_SCHED_RED
-	.red_params = {
-		/* Traffic Class 0 Colors Green / Yellow / Red */
-		[0][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[0][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[0][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 1 - Colors Green / Yellow / Red */
-		[1][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[1][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[1][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 2 - Colors Green / Yellow / Red */
-		[2][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[2][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[2][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-
-		/* Traffic Class 3 - Colors Green / Yellow / Red */
-		[3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9}
-	}
-#endif /* RTE_SCHED_RED */
 };
 
 static struct rte_sched_port *
@@ -255,7 +281,8 @@ app_init_sched_port(uint32_t portid, uint32_t socketid)
 					subport, err);
 		}
 
-		for (pipe = 0; pipe < port_params.n_pipes_per_subport; pipe ++) {
+		uint32_t n_subport_pipes = subport_params[subport].n_subport_pipes;
+		for (pipe = 0; pipe < n_subport_pipes; pipe++) {
 			if (app_pipe_to_profile[subport][pipe] != -1) {
 				err = rte_sched_pipe_config(port, subport, pipe,
 						app_pipe_to_profile[subport][pipe]);
diff --git a/examples/qos_sched/main.h b/examples/qos_sched/main.h
index 8a2741c58..219aa9a95 100644
--- a/examples/qos_sched/main.h
+++ b/examples/qos_sched/main.h
@@ -26,7 +26,7 @@ extern "C" {
 
 #define MAX_PKT_RX_BURST 64
 #define PKT_ENQUEUE 64
-#define PKT_DEQUEUE 32
+#define PKT_DEQUEUE 60
 #define MAX_PKT_TX_BURST 64
 
 #define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */
@@ -50,6 +50,7 @@ extern "C" {
 #define MAX_DATA_STREAMS (APP_MAX_LCORE/2)
 #define MAX_SCHED_SUBPORTS		8
 #define MAX_SCHED_PIPES		4096
+#define MAX_SCHED_PIPE_PROFILES		256
 
 #ifndef APP_COLLECT_STAT
 #define APP_COLLECT_STAT		1
@@ -147,7 +148,11 @@ extern struct burst_conf burst_conf;
 extern struct ring_thresh rx_thresh;
 extern struct ring_thresh tx_thresh;
 
+uint32_t active_queues[RTE_SCHED_QUEUES_PER_PIPE];
+uint32_t n_active_queues;
+
 extern struct rte_sched_port_params port_params;
+extern struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS];
 
 int app_parse_args(int argc, char **argv);
 int app_init(void);
diff --git a/examples/qos_sched/profile.cfg b/examples/qos_sched/profile.cfg
index f5b704cc6..02fd8a00e 100644
--- a/examples/qos_sched/profile.cfg
+++ b/examples/qos_sched/profile.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -33,12 +33,12 @@
 ; 10GbE output port:
 ;	* Single subport (subport 0):
 ;		- Subport rate set to 100% of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of port rate
+;		- Each of the 9 traffic classes has rate set to 100% of port rate
 ;	* 4K pipes per subport 0 (pipes 0 .. 4095) with identical configuration:
 ;		- Pipe rate set to 1/4K of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of pipe rate
-;		- Within each traffic class, the byte-level WRR weights for the 4 queues
-;         are set to 1:1:1:1
+;		- Each of the 9 traffic classes has rate set to 100% of pipe rate
+;		- Within the lowest priority traffic class (best-effort), the
+;		  byte-level WRR weights for the 8 queues are set to 1:1:1:1:1:1:1:1
 ;
 ; For more details, please refer to chapter "Quality of Service (QoS) Framework"
 ; of Data Plane Development Kit (DPDK) Programmer's Guide.
@@ -47,11 +47,12 @@
 [port]
 frame overhead = 24
 number of subports per port = 1
-number of pipes per subport = 4096
-queue sizes = 64 64 64 64
 
 ; Subport configuration
 [subport 0]
+number of pipes per subport = 4096
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
+
 tb rate = 1250000000           ; Bytes per second
 tb size = 1000000              ; Bytes
 
@@ -59,6 +60,11 @@ tc 0 rate = 1250000000         ; Bytes per second
 tc 1 rate = 1250000000         ; Bytes per second
 tc 2 rate = 1250000000         ; Bytes per second
 tc 3 rate = 1250000000         ; Bytes per second
+tc 4 rate = 1250000000         ; Bytes per second
+tc 5 rate = 1250000000         ; Bytes per second
+tc 6 rate = 1250000000         ; Bytes per second
+tc 7 rate = 1250000000         ; Bytes per second
+tc 8 rate = 1250000000         ; Bytes per second
 tc period = 10                 ; Milliseconds
 
 pipe 0-4095 = 0                ; These pipes are configured with pipe profile 0
@@ -72,14 +78,16 @@ tc 0 rate = 305175             ; Bytes per second
 tc 1 rate = 305175             ; Bytes per second
 tc 2 rate = 305175             ; Bytes per second
 tc 3 rate = 305175             ; Bytes per second
-tc period = 40                 ; Milliseconds
+tc 4 rate = 305175             ; Bytes per second
+tc 5 rate = 305175             ; Bytes per second
+tc 6 rate = 305175             ; Bytes per second
+tc 7 rate = 305175             ; Bytes per second
+tc 8 rate = 305175             ; Bytes per second
+tc period = 160                ; Milliseconds
 
-tc 3 oversubscription weight = 1
+tc 8 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 8 wrr weights = 1 1 1 1 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -102,3 +110,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/profile_ov.cfg b/examples/qos_sched/profile_ov.cfg
index 33000df9e..450001d2b 100644
--- a/examples/qos_sched/profile_ov.cfg
+++ b/examples/qos_sched/profile_ov.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -33,11 +33,12 @@
 [port]
 frame overhead = 24
 number of subports per port = 1
-number of pipes per subport = 32
-queue sizes = 64 64 64 64
 
 ; Subport configuration
 [subport 0]
+number of pipes per subport = 32
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
+
 tb rate = 8400000           ; Bytes per second
 tb size = 100000            ; Bytes
 
@@ -45,6 +46,11 @@ tc 0 rate = 8400000         ; Bytes per second
 tc 1 rate = 8400000         ; Bytes per second
 tc 2 rate = 8400000         ; Bytes per second
 tc 3 rate = 8400000         ; Bytes per second
+tc 4 rate = 8400000         ; Bytes per second
+tc 5 rate = 8400000         ; Bytes per second
+tc 6 rate = 8400000         ; Bytes per second
+tc 7 rate = 8400000         ; Bytes per second
+tc 8 rate = 8400000         ; Bytes per second
 tc period = 10              ; Milliseconds
 
 pipe 0-31 = 0               ; These pipes are configured with pipe profile 0
@@ -58,14 +64,16 @@ tc 0 rate = 16800000           ; Bytes per second
 tc 1 rate = 16800000           ; Bytes per second
 tc 2 rate = 16800000           ; Bytes per second
 tc 3 rate = 16800000           ; Bytes per second
+tc 4 rate = 16800000           ; Bytes per second
+tc 5 rate = 16800000           ; Bytes per second
+tc 6 rate = 16800000           ; Bytes per second
+tc 7 rate = 16800000           ; Bytes per second
+tc 8 rate = 16800000           ; Bytes per second
 tc period = 28                 ; Milliseconds
 
 tc 3 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 8 wrr weights = 1 1 1 1 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -88,3 +96,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/stats.c b/examples/qos_sched/stats.c
index 8193d964c..f69c5afb0 100644
--- a/examples/qos_sched/stats.c
+++ b/examples/qos_sched/stats.c
@@ -11,278 +11,333 @@ int
 qavg_q(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id, uint8_t tc,
 		uint8_t q)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i;
-        uint32_t average;
-
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE || q >= RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
-                return -1;
-
-        port = qos_conf[i].sched_port;
-
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
-        queue_id = queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + q);
-
-        average = 0;
-
-        for (count = 0; count < qavg_ntimes; count++) {
-                rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
-                average += qlen;
-                usleep(qavg_period);
-        }
-
-        average /= qavg_ntimes;
-
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
-
-        return 0;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t count, i, queue_id = 0;
+	uint32_t average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE ||
+		q >= RTE_SCHED_BE_QUEUES_PER_PIPE ||
+		(tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1 && q > 0))
+		return -1;
+
+	port = qos_conf[i].sched_port;
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes *
+				RTE_SCHED_QUEUES_PER_PIPE;
+	if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
+	else
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc + q;
+
+	average = 0;
+	for (count = 0; count < qavg_ntimes; count++) {
+		rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+		average += qlen;
+		usleep(qavg_period);
+	}
+
+	average /= qavg_ntimes;
+
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+
+	return 0;
 }
 
 int
 qavg_tcpipe(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id,
-	     uint8_t tc)
+		uint8_t tc)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i;
-        uint32_t average, part_average;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t count, i, queue_id = 0;
+	uint32_t average, part_average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return -1;
+
+	port = qos_conf[i].sched_port;
 
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
-                return -1;
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
 
-        port = qos_conf[i].sched_port;
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
 
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	average = 0;
 
-        average = 0;
+	for (count = 0; count < qavg_ntimes; count++) {
+		part_average = 0;
 
-        for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-                        rte_sched_queue_read_stats(port, queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + i), &stats, &qlen);
-                        part_average += qlen;
-                }
-                average += part_average / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
-                usleep(qavg_period);
-        }
+		if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+			average += qlen;
+		} else {
+			for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+				rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+				part_average += qlen;
+			}
+			average += part_average / RTE_SCHED_BE_QUEUES_PER_PIPE;
+		}
+		usleep(qavg_period);
+	}
 
-        average /= qavg_ntimes;
+	average /= qavg_ntimes;
 
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
 
-        return 0;
+	return 0;
 }
 
 int
 qavg_pipe(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i;
-        uint32_t average, part_average;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t count, i, queue_id = 0;
+	uint32_t average, part_average;
 
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport)
-                return -1;
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
 
-        port = qos_conf[i].sched_port;
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes)
+		return -1;
 
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	port = qos_conf[i].sched_port;
 
-        average = 0;
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes *
+				RTE_SCHED_QUEUES_PER_PIPE;
 
-        for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-                        rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
-                        part_average += qlen;
-                }
-                average += part_average / (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
-                usleep(qavg_period);
-        }
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE;
 
-        average /= qavg_ntimes;
+	average = 0;
 
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+	for (count = 0; count < qavg_ntimes; count++) {
+		part_average = 0;
+		for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+			rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+			part_average += qlen;
+		}
+		average += part_average / RTE_SCHED_QUEUES_PER_PIPE;
+		usleep(qavg_period);
+	}
 
-        return 0;
+	average /= qavg_ntimes;
+
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+
+	return 0;
 }
 
 int
 qavg_tcsubport(uint16_t port_id, uint32_t subport_id, uint8_t tc)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i, j;
-        uint32_t average, part_average;
-
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
-                return -1;
-
-        port = qos_conf[i].sched_port;
-
-        average = 0;
-
-        for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < port_params.n_pipes_per_subport; i++) {
-                        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + i);
-
-                        for (j = 0; j < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
-                                rte_sched_queue_read_stats(port, queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j), &stats, &qlen);
-                                part_average += qlen;
-                        }
-                }
-
-                average += part_average / (port_params.n_pipes_per_subport * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
-                usleep(qavg_period);
-        }
-
-        average /= qavg_ntimes;
-
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
-
-        return 0;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t queue_id, count, i, j, subport_queue_id = 0;
+	uint32_t average, part_average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return -1;
+
+	port = qos_conf[i].sched_port;
+
+	for (i = 0; i < subport_id; i++)
+		subport_queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+
+	average = 0;
+
+	for (count = 0; count < qavg_ntimes; count++) {
+		part_average = 0;
+		for (i = 0; i < subport_params[subport_id].n_subport_pipes; i++) {
+			if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+				queue_id = subport_queue_id + i * RTE_SCHED_QUEUES_PER_PIPE + tc;
+				rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+				part_average += qlen;
+			} else {
+				for (j = 0; j < RTE_SCHED_BE_QUEUES_PER_PIPE; j++) {
+					queue_id = subport_queue_id +
+							i * RTE_SCHED_QUEUES_PER_PIPE + tc + j;
+					rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+					part_average += qlen;
+				}
+			}
+		}
+
+		if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+			average += part_average / (subport_params[subport_id].n_subport_pipes);
+		else
+			average += part_average / (subport_params[subport_id].n_subport_pipes * RTE_SCHED_BE_QUEUES_PER_PIPE);
+
+		usleep(qavg_period);
+	}
+
+	average /= qavg_ntimes;
+
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+
+	return 0;
 }
 
 int
 qavg_subport(uint16_t port_id, uint32_t subport_id)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i, j;
-        uint32_t average, part_average;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t queue_id, count, i, j, subport_queue_id = 0;
+	uint32_t average, part_average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port)
+		return -1;
 
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port)
-                return -1;
+	port = qos_conf[i].sched_port;
 
-        port = qos_conf[i].sched_port;
+	for (i = 0; i < subport_id; i++)
+		subport_queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
 
-        average = 0;
+	average = 0;
 
-        for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < port_params.n_pipes_per_subport; i++) {
-                        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + i);
+	for (count = 0; count < qavg_ntimes; count++) {
+		part_average = 0;
+		for (i = 0; i < subport_params[subport_id].n_subport_pipes; i++) {
+			queue_id = subport_queue_id + i * RTE_SCHED_QUEUES_PER_PIPE;
 
-                        for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
-                                rte_sched_queue_read_stats(port, queue_id + j, &stats, &qlen);
-                                part_average += qlen;
-                        }
-                }
+			for (j = 0; j < RTE_SCHED_QUEUES_PER_PIPE; j++) {
+				rte_sched_queue_read_stats(port, queue_id + j, &stats, &qlen);
+				part_average += qlen;
+			}
+		}
 
-                average += part_average / (port_params.n_pipes_per_subport * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
-                usleep(qavg_period);
-        }
+		average += part_average / (subport_params[subport_id].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE);
+		usleep(qavg_period);
+	}
 
-        average /= qavg_ntimes;
+	average /= qavg_ntimes;
 
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
 
-        return 0;
+	return 0;
 }
 
 int
 subport_stat(uint16_t port_id, uint32_t subport_id)
 {
-        struct rte_sched_subport_stats stats;
-        struct rte_sched_port *port;
-        uint32_t tc_ov[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-        uint8_t i;
-
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port)
-                return -1;
-
-        port = qos_conf[i].sched_port;
+	struct rte_sched_subport_stats stats;
+	struct rte_sched_port *port;
+	uint32_t tc_ov[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint8_t i;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port)
+		return -1;
+
+	port = qos_conf[i].sched_port;
 	memset (tc_ov, 0, sizeof(tc_ov));
 
-        rte_sched_subport_read_stats(port, subport_id, &stats, tc_ov);
+	rte_sched_subport_read_stats(port, subport_id, &stats, tc_ov);
 
-        printf("\n");
-        printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
-        printf("| TC |   Pkts OK   |Pkts Dropped |  Bytes OK   |Bytes Dropped|  OV Status  |\n");
-        printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
+	printf("\n");
+	printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
+	printf("| TC |   Pkts OK   |Pkts Dropped |  Bytes OK   |Bytes Dropped|  OV Status  |\n");
+	printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
 
-        for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-                printf("|  %d | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " |\n", i,
-                                stats.n_pkts_tc[i], stats.n_pkts_tc_dropped[i],
-                                stats.n_bytes_tc[i], stats.n_bytes_tc_dropped[i], tc_ov[i]);
-                printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
-        }
-        printf("\n");
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+		printf("|  %d | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " |\n", i,
+			stats.n_pkts_tc[i], stats.n_pkts_tc_dropped[i],
+			stats.n_bytes_tc[i], stats.n_bytes_tc_dropped[i], tc_ov[i]);
+		printf("+----+-------------+-------------+-------------+-------------+-------------+\n");
+	}
+	printf("\n");
 
-        return 0;
+	return 0;
 }
 
 int
 pipe_stat(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint8_t i, j;
-        uint32_t queue_id;
-
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport)
-                return -1;
-
-        port = qos_conf[i].sched_port;
-
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
-
-        printf("\n");
-        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
-        printf("| TC | Queue |   Pkts OK   |Pkts Dropped |  Bytes OK   |Bytes Dropped|    Length   |\n");
-        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
-
-        for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-                for (j = 0; j < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; j++) {
-
-                        rte_sched_queue_read_stats(port, queue_id + (i * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j), &stats, &qlen);
-
-                        printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, j,
-                                        stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
-                        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
-                }
-                if (i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
-                        printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
-        }
-        printf("\n");
-
-        return 0;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint8_t i, j;
+	uint32_t queue_id = 0;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc ||
+		subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= subport_params[subport_id].n_subport_pipes)
+		return -1;
+
+	port = qos_conf[i].sched_port;
+	for (i = 0; i < subport_id; i++)
+		queue_id += subport_params[i].n_subport_pipes * RTE_SCHED_QUEUES_PER_PIPE;
+
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE;
+
+	printf("\n");
+	printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+	printf("| TC | Queue |   Pkts OK   |Pkts Dropped |  Bytes OK   |Bytes Dropped|    Length   |\n");
+	printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+		if (i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+			printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, 0,
+				stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
+			printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+		} else {
+			for (j = 0; j < RTE_SCHED_BE_QUEUES_PER_PIPE; j++) {
+				rte_sched_queue_read_stats(port, queue_id + i + j, &stats, &qlen);
+				printf("|  %d |   %d   | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11" PRIu32 " | %11i |\n", i, j,
+					stats.n_pkts, stats.n_pkts_dropped, stats.n_bytes, stats.n_bytes_dropped, qlen);
+				printf("+----+-------+-------------+-------------+-------------+-------------+-------------+\n");
+			}
+		}
+	}
+	printf("\n");
+
+	return 0;
 }
-- 
2.21.0
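
For reference, the queue index arithmetic used by the stats code above can
be sketched in isolation. This is only a minimal sketch, not the sample
app's code: pipe_first_queue() and the pipes_per_subport array are
hypothetical names standing in for subport_params[i].n_subport_pipes, and
QUEUES_PER_PIPE is assumed to equal RTE_SCHED_QUEUES_PER_PIPE (16).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors RTE_SCHED_QUEUES_PER_PIPE from rte_sched.h. */
#define QUEUES_PER_PIPE 16u

/*
 * Hypothetical helper: index of the first queue of a pipe when each
 * subport may hold a different number of pipes.
 */
static uint32_t
pipe_first_queue(const uint32_t *pipes_per_subport, uint32_t n_subports,
		uint32_t subport_id, uint32_t pipe_id)
{
	uint32_t queue_id = 0;
	uint32_t i;

	assert(subport_id < n_subports);
	assert(pipe_id < pipes_per_subport[subport_id]);

	/* Skip every queue owned by the subports that come first. */
	for (i = 0; i < subport_id; i++)
		queue_id += pipes_per_subport[i] * QUEUES_PER_PIPE;

	/* Then skip the earlier pipes inside the target subport. */
	return queue_id + pipe_id * QUEUES_PER_PIPE;
}
```

The sketch uses a 32-bit loop counter, so it does not depend on the number
of subports fitting in a uint8_t.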



* [dpdk-dev] [PATCH v2 26/28] examples/ip_pipeline: update ip pipeline sample app
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (24 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 25/28] examples/qos_sched: update qos sched sample app Jasvinder Singh
@ 2019-06-25 15:32     ` " Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 27/28] sched: code cleanup Jasvinder Singh
                       ` (4 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the ip_pipeline sample app to allow flexible configuration
of pipe traffic classes and queues, and subport-level configuration
of the pipe parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 examples/ip_pipeline/cli.c             | 85 +++++++++++++-------------
 examples/ip_pipeline/tmgr.c            | 22 +++----
 examples/ip_pipeline/tmgr.h            |  3 -
 lib/librte_pipeline/rte_table_action.c |  1 -
 lib/librte_pipeline/rte_table_action.h |  4 +-
 5 files changed, 54 insertions(+), 61 deletions(-)

diff --git a/examples/ip_pipeline/cli.c b/examples/ip_pipeline/cli.c
index 309b2936e..1c19d0e21 100644
--- a/examples/ip_pipeline/cli.c
+++ b/examples/ip_pipeline/cli.c
@@ -377,8 +377,11 @@ cmd_swq(char **tokens,
 static const char cmd_tmgr_subport_profile_help[] =
 "tmgr subport profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
-"   <tc_period>\n";
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>"
+"        <tc4_rate> <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>\n"
+"   <tc_period>\n"
+"   pps <n_pipes_per_subport>\n"
+"   qsize <qsize_q0..15>\n";
 
 static void
 cmd_tmgr_subport_profile(char **tokens,
@@ -389,7 +392,7 @@ cmd_tmgr_subport_profile(char **tokens,
 	struct rte_sched_subport_params p;
 	int status, i;
 
-	if (n_tokens != 10) {
+	if (n_tokens != 34) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -410,11 +413,32 @@ cmd_tmgr_subport_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[14]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
 
+	if (strcmp(tokens[15], "pps") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "pps");
+		return;
+	}
+
+	if (parser_read_uint32(&p.n_subport_pipes, tokens[16]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "n_subport_pipes");
+		return;
+	}
+
+	if (strcmp(tokens[17], "qsize") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "qsize");
+		return;
+	}
+
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+		if (parser_read_uint16(&p.qsize[i], tokens[18 + i]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "qsize");
+			return;
+		}
+
 	status = tmgr_subport_profile_add(&p);
 	if (status != 0) {
 		snprintf(out, out_size, MSG_CMD_FAIL, tokens[0]);
@@ -425,10 +449,11 @@ cmd_tmgr_subport_profile(char **tokens,
 static const char cmd_tmgr_pipe_profile_help[] =
 "tmgr pipe profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>"
+"     <tc4_rate> <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>\n"
 "   <tc_period>\n"
 "   <tc_ov_weight>\n"
-"   <wrr_weight0..15>\n";
+"   <wrr_weight0..7>\n";
 
 static void
 cmd_tmgr_pipe_profile(char **tokens,
@@ -439,7 +464,7 @@ cmd_tmgr_pipe_profile(char **tokens,
 	struct rte_sched_pipe_params p;
 	int status, i;
 
-	if (n_tokens != 27) {
+	if (n_tokens != 24) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -460,20 +485,20 @@ cmd_tmgr_pipe_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[14]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-	if (parser_read_uint8(&p.tc_ov_weight, tokens[10]) != 0) {
+	if (parser_read_uint8(&p.tc_ov_weight, tokens[15]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_ov_weight");
 		return;
 	}
 #endif
 
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
-		if (parser_read_uint8(&p.wrr_weights[i], tokens[11 + i]) != 0) {
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++)
+		if (parser_read_uint8(&p.wrr_weights[i], tokens[16 + i]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "wrr_weights");
 			return;
 		}
@@ -489,8 +514,6 @@ static const char cmd_tmgr_help[] =
 "tmgr <tmgr_name>\n"
 "   rate <rate>\n"
 "   spp <n_subports_per_port>\n"
-"   pps <n_pipes_per_subport>\n"
-"   qsize <qsize_tc0> <qsize_tc1> <qsize_tc2> <qsize_tc3>\n"
 "   fo <frame_overhead>\n"
 "   mtu <mtu>\n"
 "   cpu <cpu_id>\n";
@@ -504,9 +527,8 @@ cmd_tmgr(char **tokens,
 	struct tmgr_port_params p;
 	char *name;
 	struct tmgr_port *tmgr_port;
-	int i;
 
-	if (n_tokens != 19) {
+	if (n_tokens != 12) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -533,53 +555,32 @@ cmd_tmgr(char **tokens,
 		return;
 	}
 
-	if (strcmp(tokens[6], "pps") != 0) {
-		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "pps");
-		return;
-	}
-
-	if (parser_read_uint32(&p.n_pipes_per_subport, tokens[7]) != 0) {
-		snprintf(out, out_size, MSG_ARG_INVALID, "n_pipes_per_subport");
-		return;
-	}
-
-	if (strcmp(tokens[8], "qsize") != 0) {
-		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "qsize");
-		return;
-	}
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		if (parser_read_uint16(&p.qsize[i], tokens[9 + i]) != 0) {
-			snprintf(out, out_size, MSG_ARG_INVALID, "qsize");
-			return;
-		}
-
-	if (strcmp(tokens[13], "fo") != 0) {
+	if (strcmp(tokens[6], "fo") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "fo");
 		return;
 	}
 
-	if (parser_read_uint32(&p.frame_overhead, tokens[14]) != 0) {
+	if (parser_read_uint32(&p.frame_overhead, tokens[7]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "frame_overhead");
 		return;
 	}
 
-	if (strcmp(tokens[15], "mtu") != 0) {
+	if (strcmp(tokens[8], "mtu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "mtu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.mtu, tokens[16]) != 0) {
+	if (parser_read_uint32(&p.mtu, tokens[9]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "mtu");
 		return;
 	}
 
-	if (strcmp(tokens[17], "cpu") != 0) {
+	if (strcmp(tokens[10], "cpu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "cpu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.cpu_id, tokens[18]) != 0) {
+	if (parser_read_uint32(&p.cpu_id, tokens[11]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "cpu_id");
 		return;
 	}
diff --git a/examples/ip_pipeline/tmgr.c b/examples/ip_pipeline/tmgr.c
index 40cbf1d0a..5e55e8ef1 100644
--- a/examples/ip_pipeline/tmgr.c
+++ b/examples/ip_pipeline/tmgr.c
@@ -47,7 +47,8 @@ int
 tmgr_subport_profile_add(struct rte_sched_subport_params *p)
 {
 	/* Check input params */
-	if (p == NULL)
+	if (p == NULL ||
+		p->n_subport_pipes == 0)
 		return -1;
 
 	/* Save profile */
@@ -90,7 +91,6 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 		tmgr_port_find(name) ||
 		(params == NULL) ||
 		(params->n_subports_per_port == 0) ||
-		(params->n_pipes_per_subport == 0) ||
 		(params->cpu_id >= RTE_MAX_NUMA_NODES) ||
 		(n_subport_profiles == 0) ||
 		(n_pipe_profiles == 0))
@@ -103,18 +103,15 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 	p.mtu = params->mtu;
 	p.frame_overhead = params->frame_overhead;
 	p.n_subports_per_port = params->n_subports_per_port;
-	p.n_pipes_per_subport = params->n_pipes_per_subport;
-
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		p.qsize[i] = params->qsize[i];
-
-	p.pipe_profiles = pipe_profile;
-	p.n_pipe_profiles = n_pipe_profiles;
 
 	s = rte_sched_port_config(&p);
 	if (s == NULL)
 		return NULL;
 
+	subport_profile[0].pipe_profiles = pipe_profile;
+	subport_profile[0].n_pipe_profiles = n_pipe_profiles;
+	subport_profile[0].n_max_pipe_profiles = TMGR_PIPE_PROFILE_MAX;
+
 	for (i = 0; i < params->n_subports_per_port; i++) {
 		int status;
 
@@ -128,7 +125,7 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 			return NULL;
 		}
 
-		for (j = 0; j < params->n_pipes_per_subport; j++) {
+		for (j = 0; j < subport_profile[0].n_subport_pipes; j++) {
 			status = rte_sched_pipe_config(
 				s,
 				i,
@@ -153,7 +150,6 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 	strlcpy(tmgr_port->name, name, sizeof(tmgr_port->name));
 	tmgr_port->s = s;
 	tmgr_port->n_subports_per_port = params->n_subports_per_port;
-	tmgr_port->n_pipes_per_subport = params->n_pipes_per_subport;
 
 	/* Node add to list */
 	TAILQ_INSERT_TAIL(&tmgr_port_list, tmgr_port, node);
@@ -205,8 +201,8 @@ tmgr_pipe_config(const char *port_name,
 	port = tmgr_port_find(port_name);
 	if ((port == NULL) ||
 		(subport_id >= port->n_subports_per_port) ||
-		(pipe_id_first >= port->n_pipes_per_subport) ||
-		(pipe_id_last >= port->n_pipes_per_subport) ||
+		(pipe_id_first >= subport_profile[0].n_subport_pipes) ||
+		(pipe_id_last >= subport_profile[0].n_subport_pipes) ||
 		(pipe_id_first > pipe_id_last) ||
 		(pipe_profile_id >= n_pipe_profiles))
 		return -1;
diff --git a/examples/ip_pipeline/tmgr.h b/examples/ip_pipeline/tmgr.h
index 0b497e795..3a958492c 100644
--- a/examples/ip_pipeline/tmgr.h
+++ b/examples/ip_pipeline/tmgr.h
@@ -25,7 +25,6 @@ struct tmgr_port {
 	char name[NAME_SIZE];
 	struct rte_sched_port *s;
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
 };
 
 TAILQ_HEAD(tmgr_port_list, tmgr_port);
@@ -39,8 +38,6 @@ tmgr_port_find(const char *name);
 struct tmgr_port_params {
 	uint32_t rate;
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t frame_overhead;
 	uint32_t mtu;
 	uint32_t cpu_id;
diff --git a/lib/librte_pipeline/rte_table_action.c b/lib/librte_pipeline/rte_table_action.c
index a54ec46bc..47d7efbc1 100644
--- a/lib/librte_pipeline/rte_table_action.c
+++ b/lib/librte_pipeline/rte_table_action.c
@@ -401,7 +401,6 @@ pkt_work_tm(struct rte_mbuf *mbuf,
 {
 	struct dscp_table_entry_data *dscp_entry = &dscp_table->entry[dscp];
 	uint32_t queue_id = data->queue_id |
-				(dscp_entry->tc << 2) |
 				dscp_entry->tc_queue;
 	rte_mbuf_sched_set(mbuf, queue_id, dscp_entry->tc,
 				(uint8_t)dscp_entry->color);
diff --git a/lib/librte_pipeline/rte_table_action.h b/lib/librte_pipeline/rte_table_action.h
index ef45a3023..4a68deb2e 100644
--- a/lib/librte_pipeline/rte_table_action.h
+++ b/lib/librte_pipeline/rte_table_action.h
@@ -181,10 +181,10 @@ struct rte_table_action_lb_params {
  * RTE_TABLE_ACTION_MTR
  */
 /** Max number of traffic classes (TCs). */
-#define RTE_TABLE_ACTION_TC_MAX                                  4
+#define RTE_TABLE_ACTION_TC_MAX                                  16
 
 /** Max number of queues per traffic class. */
-#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            4
+#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            16
 
 /** Differentiated Services Code Point (DSCP) translation table entry. */
 struct rte_table_action_dscp_table_entry {
-- 
2.21.0
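
The new n_tokens values checked in cli.c (34 and 24) and the shifted
tokens[] indices follow from the number of traffic classes and queues. A
minimal sketch of that arithmetic is below. The macro and function names
are hypothetical; the values are assumed to match RTE_SCHED_QUEUES_PER_PIPE
(16) and RTE_SCHED_BE_QUEUES_PER_PIPE (8) as used elsewhere in this series.

```c
#include <assert.h>

/* Assumptions: these mirror the rte_sched.h values after this series. */
#define QUEUES_PER_PIPE    16
#define BE_QUEUES_PER_PIPE  8
#define TCS_PER_PIPE (QUEUES_PER_PIPE - BE_QUEUES_PER_PIPE + 1)

#define N_CMD_WORDS 3	/* "tmgr" "subport"|"pipe" "profile" */
#define N_TB_ARGS   2	/* <tb_rate> <tb_size> */

/* Index of <tc_period>: it follows the per-TC rates in both commands. */
static int
tc_period_token(void)
{
	return N_CMD_WORDS + N_TB_ARGS + TCS_PER_PIPE;
}

/* Total number of tokens of "tmgr subport profile". */
static int
subport_profile_n_tokens(void)
{
	return tc_period_token() + 1	/* <tc_period> */
		+ 2			/* pps <n_pipes_per_subport> */
		+ 1			/* qsize keyword */
		+ QUEUES_PER_PIPE;	/* one size per queue */
}

/* Total number of tokens of "tmgr pipe profile". */
static int
pipe_profile_n_tokens(void)
{
	return tc_period_token() + 1	/* <tc_period> */
		+ 1			/* <tc_ov_weight> */
		+ BE_QUEUES_PER_PIPE;	/* one WRR weight per BE queue */
}

/* Token index of the WRR weight of best-effort queue i. */
static int
wrr_weight_token(int i)
{
	assert(i >= 0 && i < BE_QUEUES_PER_PIPE);
	return tc_period_token() + 2 + i;
}
```

With 9 traffic classes, <tc_period> lands at tokens[14] in both commands,
which matches the indices used in the patch.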



* [dpdk-dev] [PATCH v2 27/28] sched: code cleanup
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (25 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 26/28] examples/ip_pipeline: update ip pipeline " Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 28/28] sched: add release note Jasvinder Singh
                       ` (3 subsequent siblings)
  30 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Remove redundant macros and fields from the data structures.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 43 ------------------------------------
 lib/librte_sched/rte_sched.h | 25 +++------------------
 2 files changed, 3 insertions(+), 65 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index cc1dcf7ab..b214e4283 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -193,7 +193,6 @@ struct rte_sched_pipe {
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
 	uint8_t tc_ov_period_id;
-	uint8_t reserved[3];
 } __rte_cache_aligned;
 
 struct rte_sched_queue {
@@ -211,18 +210,10 @@ struct rte_sched_queue_extra {
 struct rte_sched_port {
 	/* User parameters */
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
-	uint32_t n_pipes_per_subport_log2;
 	int socket;
 	uint32_t rate;
 	uint32_t mtu;
 	uint32_t frame_overhead;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t n_pipe_profiles;
-	uint32_t pipe_tc3_rate_max;
-#ifdef RTE_SCHED_RED
-	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
-#endif
 
 	/* Timing */
 	uint64_t time_cpu_cycles;     /* Current CPU time measured in CPU cyles */
@@ -230,50 +221,17 @@ struct rte_sched_port {
 	uint64_t time;                /* Current NIC TX time measured in bytes */
 	struct rte_reciprocal inv_cycles_per_byte; /* CPU cycles per byte */
 
-	/* Scheduling loop detection */
-	uint32_t pipe_loop;
-	uint32_t pipe_exhaustion;
-
-	/* Bitmap */
-	struct rte_bitmap *bmp;
-	uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS] __rte_aligned_16;
-
 	/* Grinders */
-	struct rte_sched_grinder grinder[RTE_SCHED_PORT_N_GRINDERS];
-	uint32_t busy_grinders;
 	struct rte_mbuf **pkts_out;
 	uint32_t n_pkts_out;
 	uint32_t subport_id;
 
 	uint32_t max_subport_pipes_log2;   /* Max number of subport pipes */
 
-	/* Queue base calculation */
-	uint32_t qsize_add[RTE_SCHED_QUEUES_PER_PIPE];
-	uint32_t qsize_sum;
-
 	/* Large data structures */
-	struct rte_sched_subport *subport;
-	struct rte_sched_pipe *pipe;
-	struct rte_sched_queue *queue;
-	struct rte_sched_queue_extra *queue_extra;
-	struct rte_sched_pipe_profile *pipe_profiles;
-	uint8_t *bmp_array;
-	struct rte_mbuf **queue_array;
 	struct rte_sched_subport *subports[0];
-	uint8_t memory[0] __rte_cache_aligned;
 } __rte_cache_aligned;
 
-enum rte_sched_port_array {
-	e_RTE_SCHED_PORT_ARRAY_SUBPORT = 0,
-	e_RTE_SCHED_PORT_ARRAY_PIPE,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE_EXTRA,
-	e_RTE_SCHED_PORT_ARRAY_PIPE_PROFILES,
-	e_RTE_SCHED_PORT_ARRAY_BMP_ARRAY,
-	e_RTE_SCHED_PORT_ARRAY_QUEUE_ARRAY,
-	e_RTE_SCHED_PORT_ARRAY_TOTAL,
-};
-
 enum rte_sched_subport_array {
 	e_RTE_SCHED_SUBPORT_ARRAY_PIPE = 0,
 	e_RTE_SCHED_SUBPORT_ARRAY_QUEUE,
@@ -2458,7 +2416,6 @@ grinder_next_pipe(struct rte_sched_subport *subport, uint32_t pos)
 	return 1;
 }
 
-
 static inline void
 grinder_wrr_load(struct rte_sched_subport *subport, uint32_t pos)
 {
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 1f690036d..8fe6ea904 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -82,7 +82,6 @@ extern "C" {
  */
 #define RTE_SCHED_BE_QUEUES_PER_PIPE    8
 
-#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
 /** Number of traffic classes per pipe (as well as subport).
  *
  * @see struct rte_sched_subport_params
@@ -91,13 +90,6 @@ extern "C" {
 #define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
 (RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
 
-/** Maximum number of pipe profiles that can be defined per subport.
- * Compile-time configurable.
- */
-#ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT
-#define RTE_SCHED_PIPE_PROFILES_PER_PORT      256
-#endif
-
 /*
  * Ethernet framing overhead. Overhead fields per Ethernet frame:
  * 1. Preamble:                             7 bytes;
@@ -126,6 +118,7 @@ extern "C" {
 struct rte_sched_pipe_params {
 	/** Token bucket rate (measured in bytes per second) */
 	uint32_t tb_rate;
+
 	/** Token bucket size (measured in credits) */
 	uint32_t tb_size;
 
@@ -134,6 +127,7 @@ struct rte_sched_pipe_params {
 
 	/** Enforcement period (measured in milliseconds) */
 	uint32_t tc_period;
+
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/** Best-effort traffic class oversubscription weight */
 	uint8_t tc_ov_weight;
@@ -185,11 +179,11 @@ struct rte_sched_subport_params {
 
 	/** Max profiles allowed in the pipe profile table */
 	uint32_t n_max_pipe_profiles;
+
 #ifdef RTE_SCHED_RED
 	/** RED parameters */
 	struct rte_red_params
 		red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
-
 #endif
 };
 
@@ -254,19 +248,6 @@ struct rte_sched_port_params {
 
 	/** Number of subports */
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Packet queue size for each traffic class.
-	 * All queues within the same pipe traffic class have the same
-	 * size. Queues from different pipes serving the same traffic
-	 * class have the same size. */
-	struct rte_sched_pipe_params *pipe_profiles;
-	/**< Pipe profile table.
-	 * Every pipe is configured using one of the profiles from this table. */
-	uint32_t n_pipe_profiles;        /**< Profiles in the pipe profile table */
-#ifdef RTE_SCHED_RED
-	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS]; /**< RED parameters */
-#endif
 };
 
 /*
-- 
2.21.0
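
With RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS gone, the number of traffic classes
is derived from the two remaining macros: 16 - 8 + 1 = 9. The sketch below
shows the pipe queue to traffic class mapping that this implies, as it is
used by the qos_sched stats code earlier in the series (one queue per
strict priority TC, all remaining queues in the best-effort TC). The helper
names are hypothetical and this is not the library's internal
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: these mirror the rte_sched.h values after this series. */
#define QUEUES_PER_PIPE    16u
#define BE_QUEUES_PER_PIPE  8u
#define TCS_PER_PIPE (QUEUES_PER_PIPE - BE_QUEUES_PER_PIPE + 1u)

/* Index of the best-effort TC, which is the lowest priority one. */
#define TC_BE (TCS_PER_PIPE - 1u)

/* Hypothetical helper: traffic class that owns pipe queue q. */
static uint32_t
pipe_queue_to_tc(uint32_t q)
{
	assert(q < QUEUES_PER_PIPE);
	/* Queues 0..TC_BE-1 are the single queues of the SP classes. */
	return q < TC_BE ? q : TC_BE;
}

/* Hypothetical helper: position of pipe queue q inside its TC. */
static uint32_t
pipe_queue_to_tc_queue(uint32_t q)
{
	assert(q < QUEUES_PER_PIPE);
	/* SP classes have exactly one queue; BE queues count from 0. */
	return q < TC_BE ? 0 : q - TC_BE;
}
```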



* [dpdk-dev] [PATCH v2 28/28] sched: add release note
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (26 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 27/28] sched: code cleanup Jasvinder Singh
@ 2019-06-25 15:32     ` Jasvinder Singh
  2019-06-26 21:31       ` Thomas Monjalon
  2019-06-26 21:33     ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Thomas Monjalon
                       ` (2 subsequent siblings)
  30 siblings, 1 reply; 163+ messages in thread
From: Jasvinder Singh @ 2019-06-25 15:32 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Lukasz Krakowiak, Abraham Tovar

From: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>

Add release notes and remove deprecation note.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 doc/guides/rel_notes/deprecation.rst   | 6 ------
 doc/guides/rel_notes/release_19_08.rst | 7 ++++++-
 lib/librte_sched/Makefile              | 2 +-
 lib/librte_sched/meson.build           | 2 +-
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index e2721fad6..4810989da 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -86,12 +86,6 @@ Deprecation Notices
   to one it means it represents IV, when is set to zero it means J0 is used
   directly, in this case 16 bytes of J0 need to be passed.
 
-* sched: To allow more traffic classes, flexible mapping of pipe queues to
-  traffic classes, and subport level configuration of pipes and queues
-  changes will be made to macros, data structures and API functions defined
-  in "rte_sched.h". These changes are aligned to improvements suggested in the
-  RFC https://mails.dpdk.org/archives/dev/2018-November/120035.html.
-
 * metrics: The function ``rte_metrics_init`` will have a non-void return
   in order to notify errors instead of calling ``rte_exit``.
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index 3da266705..8fe08424a 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -135,6 +135,11 @@ API Changes
 * The network structures, definitions and functions have
   been prefixed by ``rte_`` to resolve conflicts with libc headers.
 
+* sched: To allow more traffic classes, flexible mapping of pipe queues to
+  traffic classes, and subport-level configuration of pipes and queues,
+  changes were made to the public macros, data structures and API functions
+  defined in ``rte_sched.h``.
+
 
 ABI Changes
 -----------
@@ -222,7 +227,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_rcu.so.1
      librte_reorder.so.1
      librte_ring.so.2
-     librte_sched.so.2
+   + librte_sched.so.3
      librte_security.so.2
      librte_stack.so.1
      librte_table.so.3
diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile
index 644fd9d15..3d7f410e1 100644
--- a/lib/librte_sched/Makefile
+++ b/lib/librte_sched/Makefile
@@ -18,7 +18,7 @@ LDLIBS += -lrte_timer
 
 EXPORT_MAP := rte_sched_version.map
 
-LIBABIVER := 2
+LIBABIVER := 3
 
 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_sched/meson.build b/lib/librte_sched/meson.build
index 8e989e5f6..59d43c6d8 100644
--- a/lib/librte_sched/meson.build
+++ b/lib/librte_sched/meson.build
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-version = 2
+version = 3
 sources = files('rte_sched.c', 'rte_red.c', 'rte_approx.c')
 headers = files('rte_sched.h', 'rte_sched_common.h',
 		'rte_red.h', 'rte_approx.h')
-- 
2.21.0



* Re: [dpdk-dev] [PATCH v2 28/28] sched: add release note
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 28/28] sched: add release note Jasvinder Singh
@ 2019-06-26 21:31       ` Thomas Monjalon
  2019-06-27 10:50         ` Singh, Jasvinder
  0 siblings, 1 reply; 163+ messages in thread
From: Thomas Monjalon @ 2019-06-26 21:31 UTC (permalink / raw)
  To: Jasvinder Singh; +Cc: dev, cristian.dumitrescu, Lukasz Krakowiak, Abraham Tovar

25/06/2019 17:32, Jasvinder Singh:
> --- a/doc/guides/rel_notes/release_19_08.rst
> +++ b/doc/guides/rel_notes/release_19_08.rst
> @@ -135,6 +135,11 @@ API Changes
> +* sched: To allow more traffic classes, flexible mapping of pipe queues to
> +  traffic classes, and subport-level configuration of pipes and queues,
> +  changes were made to the public macros, data structures and API functions
> +  defined in ``rte_sched.h``.

Would it make sense to merge this text into a code patch?

> --- a/lib/librte_sched/Makefile
> +++ b/lib/librte_sched/Makefile
> -LIBABIVER := 2
> +LIBABIVER := 3

Please merge this change into the first patch that breaks the ABI.





* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (27 preceding siblings ...)
  2019-06-25 15:32     ` [dpdk-dev] [PATCH v2 28/28] sched: add release note Jasvinder Singh
@ 2019-06-26 21:33     ` Thomas Monjalon
  2019-06-27 10:52       ` Singh, Jasvinder
  2019-06-27  0:04     ` Stephen Hemminger
  2019-07-01 18:51     ` Dumitrescu, Cristian
  30 siblings, 1 reply; 163+ messages in thread
From: Thomas Monjalon @ 2019-06-26 21:33 UTC (permalink / raw)
  To: Jasvinder Singh; +Cc: dev, cristian.dumitrescu

25/06/2019 17:31, Jasvinder Singh:
> Jasvinder Singh (27):
>   sched: update macros for flexible config
>   sched: update subport and pipe data structures
>   sched: update internal data structures
>   sched: update port config API
>   sched: update port free API
>   sched: update subport config API
>   sched: update pipe profile add API
>   sched: update pipe config API
>   sched: update pkt read and write API
>   sched: update subport and tc queue stats
>   sched: update port memory footprint API
>   sched: update packet enqueue API
>   sched: update grinder pipe and tc cache
>   sched: update grinder next pipe and tc functions
>   sched: update pipe and tc queues prefetch
>   sched: update grinder wrr compute function
>   sched: modify credits update function
>   sched: update mbuf prefetch function
>   sched: update grinder schedule function
>   sched: update grinder handle function
>   sched: update packet dequeue API
>   sched: update sched queue stats API
>   test/sched: update unit test
>   net/softnic: update softnic tm function
>   examples/qos_sched: update qos sched sample app
>   examples/ip_pipeline: update ip pipeline sample app
>   sched: code cleanup

I feel that titles of the form "update some API" would be more meaningful
if they gave the intent (flexibility) or more context (what is updated).
I am not sure, though, because I don't know this library well enough.




* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (28 preceding siblings ...)
  2019-06-26 21:33     ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Thomas Monjalon
@ 2019-06-27  0:04     ` Stephen Hemminger
  2019-06-27 10:49       ` Singh, Jasvinder
  2019-07-01 18:51     ` Dumitrescu, Cristian
  30 siblings, 1 reply; 163+ messages in thread
From: Stephen Hemminger @ 2019-06-27  0:04 UTC (permalink / raw)
  To: Jasvinder Singh; +Cc: dev, cristian.dumitrescu

On Tue, 25 Jun 2019 16:31:49 +0100
Jasvinder Singh <jasvinder.singh@intel.com> wrote:

> This patchset refactors the DPDK QoS sched library to add the
> following features, which enhance the scheduler functionality.
> 
> 1. Flexible configuration of the pipe traffic classes and queues;
> 
>    Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
>    strict priority, and each TC has exactly 4 queues that are
>    scheduled with Weighted Fair Queuing (WFQ).
> 
>    Instead of hardwiring queues to traffic classes within a pipe,
>    the new implementation allows a more flexible, configurable split of pipe
>    queues between strict priority (SP) and best-effort (BE) traffic classes,
>    along with support for a larger number of traffic classes, i.e. max 16.
>
>    All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
>    the lowest priority BE TC has 1, 4 or 8 queues. This is justified by
>    the fact that all the high priority TCs are fully provisioned (small to
>    medium traffic rates), while most of the traffic fits into the BE class,
>    which is typically oversubscribed.
> 
>    Furthermore, this change allows fewer than 16 queues per pipe to be used
>    when not all 16 queues are needed. Therefore, no memory will be allocated
>    to the queues that are not needed.
> 
> 2. Subport level configuration of pipe nodes;
> 
>    Currently, all parameters for the pipe nodes (subscribers) configuration
>    are part of the port level structure, which forces all groups of
>    subscribers (i.e. pipes) in different subports to have similar
>    configurations in terms of their number, queue sizes, traffic classes,
>    etc.
> 
>    The new implementation moves the pipe node configuration parameters from
>    the port level to the subport level structure. Therefore, different
>    subports of the same port can have different configurations for the pipe
>    nodes (subscribers), for example: number of pipes, queue sizes, queue to
>    traffic class mapping, etc.
> 
> v2:
> - fix bug in subport parameters check
> - remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
> - fix bug in grinder_scheduler function
> - improve doxygen comments 
> - add error log information
> 
> Jasvinder Singh (27):
>   sched: update macros for flexible config
>   sched: update subport and pipe data structures
>   sched: update internal data structures
>   sched: update port config API
>   sched: update port free API
>   sched: update subport config API
>   sched: update pipe profile add API
>   sched: update pipe config API
>   sched: update pkt read and write API
>   sched: update subport and tc queue stats
>   sched: update port memory footprint API
>   sched: update packet enqueue API
>   sched: update grinder pipe and tc cache
>   sched: update grinder next pipe and tc functions
>   sched: update pipe and tc queues prefetch
>   sched: update grinder wrr compute function
>   sched: modify credits update function
>   sched: update mbuf prefetch function
>   sched: update grinder schedule function
>   sched: update grinder handle function
>   sched: update packet dequeue API
>   sched: update sched queue stats API
>   test/sched: update unit test
>   net/softnic: update softnic tm function
>   examples/qos_sched: update qos sched sample app
>   examples/ip_pipeline: update ip pipeline sample app
>   sched: code cleanup
> 
> Lukasz Krakowiak (1):
>   sched: add release note
> 
>  app/test/test_sched.c                         |   39 +-
>  doc/guides/rel_notes/deprecation.rst          |    6 -
>  doc/guides/rel_notes/release_19_08.rst        |    7 +-
>  drivers/net/softnic/rte_eth_softnic.c         |  131 +
>  drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
>  .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
>  drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
>  examples/ip_pipeline/cli.c                    |   85 +-
>  examples/ip_pipeline/tmgr.c                   |   22 +-
>  examples/ip_pipeline/tmgr.h                   |    3 -
>  examples/qos_sched/app_thread.c               |   11 +-
>  examples/qos_sched/cfg_file.c                 |  283 ++-
>  examples/qos_sched/init.c                     |  111 +-
>  examples/qos_sched/main.h                     |    7 +-
>  examples/qos_sched/profile.cfg                |   59 +-
>  examples/qos_sched/profile_ov.cfg             |   47 +-
>  examples/qos_sched/stats.c                    |  483 ++--
>  lib/librte_pipeline/rte_table_action.c        |    1 -
>  lib/librte_pipeline/rte_table_action.h        |    4 +-
>  lib/librte_sched/Makefile                     |    2 +-
>  lib/librte_sched/meson.build                  |    2 +-
>  lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
>  lib/librte_sched/rte_sched.h                  |  229 +-
>  lib/librte_sched/rte_sched_common.h           |   41 +
>  24 files changed, 2634 insertions(+), 1455 deletions(-)
> 

Glad to see the QoS get more flexible.

  1. Is this patch series bisectable? I.e., does each step build?
  2. What about the QoS part of the programmer's guide? Doesn't it need to be updated?
    guides/prog_guide/qos_framework.rst


* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-06-27  0:04     ` Stephen Hemminger
@ 2019-06-27 10:49       ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-06-27 10:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Dumitrescu, Cristian


<snip>

> >
> >  app/test/test_sched.c                         |   39 +-
> >  doc/guides/rel_notes/deprecation.rst          |    6 -
> >  doc/guides/rel_notes/release_19_08.rst        |    7 +-
> >  drivers/net/softnic/rte_eth_softnic.c         |  131 +
> >  drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
> >  .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
> >  drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
> >  examples/ip_pipeline/cli.c                    |   85 +-
> >  examples/ip_pipeline/tmgr.c                   |   22 +-
> >  examples/ip_pipeline/tmgr.h                   |    3 -
> >  examples/qos_sched/app_thread.c               |   11 +-
> >  examples/qos_sched/cfg_file.c                 |  283 ++-
> >  examples/qos_sched/init.c                     |  111 +-
> >  examples/qos_sched/main.h                     |    7 +-
> >  examples/qos_sched/profile.cfg                |   59 +-
> >  examples/qos_sched/profile_ov.cfg             |   47 +-
> >  examples/qos_sched/stats.c                    |  483 ++--
> >  lib/librte_pipeline/rte_table_action.c        |    1 -
> >  lib/librte_pipeline/rte_table_action.h        |    4 +-
> >  lib/librte_sched/Makefile                     |    2 +-
> >  lib/librte_sched/meson.build                  |    2 +-
> >  lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
> >  lib/librte_sched/rte_sched.h                  |  229 +-
> >  lib/librte_sched/rte_sched_common.h           |   41 +
> >  24 files changed, 2634 insertions(+), 1455 deletions(-)
> >
> 
> Glad to see the QoS get more flexible.
> 
>   1. Is this patch series bisectable? I.e does each step build?

Yes, each patch of the series builds independently.

>   2. What about the QoS part of the program guide. Doesn't it need to be
> updated?
>     guides/prog_guide/qos_framework.rst

We will update the documentation soon.

Thanks,
Jasvinder



* Re: [dpdk-dev] [PATCH v2 28/28] sched: add release note
  2019-06-26 21:31       ` Thomas Monjalon
@ 2019-06-27 10:50         ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-06-27 10:50 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Dumitrescu, Cristian, Krakowiak, LukaszX, Tovar, AbrahamX



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, June 26, 2019 10:31 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>
> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> Krakowiak, LukaszX <lukaszx.krakowiak@intel.com>; Tovar, AbrahamX
> <abrahamx.tovar@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 28/28] sched: add release note
> 
> 25/06/2019 17:32, Jasvinder Singh:
> > --- a/doc/guides/rel_notes/release_19_08.rst
> > +++ b/doc/guides/rel_notes/release_19_08.rst
> > @@ -135,6 +135,11 @@ API Changes
> > +* sched: To allow more traffic classes, flexible mapping of pipe queues to
> > +  traffic classes, and subport level configuration of pipes and queues
> > +  changes are made to public macros, data structures and API functions defined
> > +  in "rte_sched.h".
> 
> Does it make sense to merge this text in a code patch?

Will merge this text into the code patch.

> > --- a/lib/librte_sched/Makefile
> > +++ b/lib/librte_sched/Makefile
> > -LIBABIVER := 2
> > +LIBABIVER := 3
> 
> Please merge this change in the first patch breaking the ABI.
> 
Will do that.

Thanks,
Jasvinder



* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-06-26 21:33     ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Thomas Monjalon
@ 2019-06-27 10:52       ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-06-27 10:52 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Dumitrescu, Cristian



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, June 26, 2019 10:34 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>
> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
> 
> 25/06/2019 17:31, Jasvinder Singh:
> > Jasvinder Singh (27):
> >   sched: update macros for flexible config
> >   sched: update subport and pipe data structures
> >   sched: update internal data structures
> >   sched: update port config API
> >   sched: update port free API
> >   sched: update subport config API
> >   sched: update pipe profile add API
> >   sched: update pipe config API
> >   sched: update pkt read and write API
> >   sched: update subport and tc queue stats
> >   sched: update port memory footprint API
> >   sched: update packet enqueue API
> >   sched: update grinder pipe and tc cache
> >   sched: update grinder next pipe and tc functions
> >   sched: update pipe and tc queues prefetch
> >   sched: update grinder wrr compute function
> >   sched: modify credits update function
> >   sched: update mbuf prefetch function
> >   sched: update grinder schedule function
> >   sched: update grinder handle function
> >   sched: update packet dequeue API
> >   sched: update sched queue stats API
> >   test/sched: update unit test
> >   net/softnic: update softnic tm function
> >   examples/qos_sched: update qos sched sample app
> >   examples/ip_pipeline: update ip pipeline sample app
> >   sched: code cleanup
> 
> I feel the titles of type "update some API" might be more meaningful if giving
> the intent (flexibility) or more context (what is updated).
> Not sure because I don't know this library enough.

We will figure out better names in next version :)

Thanks,
Jasvinder



* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-06-25 15:31   ` [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements Jasvinder Singh
                       ` (29 preceding siblings ...)
  2019-06-27  0:04     ` Stephen Hemminger
@ 2019-07-01 18:51     ` Dumitrescu, Cristian
  2019-07-02  9:32       ` Singh, Jasvinder
  30 siblings, 1 reply; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-01 18:51 UTC (permalink / raw)
  To: Singh, Jasvinder, dev

Hi Jasvinder,

Thanks for doing this work! Finally a man brave enough to make substantial changes to this library!

> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Subject: [PATCH v2 00/28] sched: feature enhancements
> 
> This patchset refactors the dpdk qos sched library to add
> following features to enhance the scheduler functionality.
> 
> 1. flexibile configuration of the pipe traffic classes and queues;
> 
>    Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
>    strict priority, and each TC has exactly with 4 queues that are
>    scheduled with Weighted Fair Queuing (WFQ).
> 
>    Instead of hardwiring queues to traffic class within the specific pipe,
>    the new implementation allows more flexible/configurable split of pipe
>    queues between strict priority (SP) and best-effort (BE) traffic classes
>    along with the support of more number of traffic classes i.e. max 16.
> 
>    All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
>    the lowest priority BE TC, has 1, 4 or 8 queues. This is justified by
>    the fact that all the high priority TCs are fully provisioned (small to
>    medium traffic rates), while most of the traffic fits into the BE class,
>    which is typically oversubscribed.
> 
>    Furthermore, this change allows to use less than 16 queues per pipe when
>    not all the 16 queues are needed. Therefore, no memory will be allocated
>    to the queues that are not needed.
> 
> 2. Subport level configuration of pipe nodes;
> 
>    Currently, all parameters for the pipe nodes (subscribers) configuration
>    are part of the port level structure which forces all groups of
>    subscribers (i.e. pipes) in different subports to have similar
>    configurations in terms of their number, queue sizes, traffic-classes,
>    etc.
> 
>    The new implementation moves pipe nodes configuration parameters from
>    port level to subport level structure. Therefore, different subports of
>    the same port can have different configuration for the pipe nodes
>    (subscribers), for examples- number of pipes, queue sizes, queues to
>    traffic-class mapping, etc.
> 
> v2:
> - fix bug in subport parameters check
> - remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
> - fix bug in grinder_scheduler function
> - improve doxygen comments
> - add error log information
> 
> Jasvinder Singh (27):
>   sched: update macros for flexible config
>   sched: update subport and pipe data structures
>   sched: update internal data structures
>   sched: update port config API
>   sched: update port free API
>   sched: update subport config API
>   sched: update pipe profile add API
>   sched: update pipe config API
>   sched: update pkt read and write API
>   sched: update subport and tc queue stats
>   sched: update port memory footprint API
>   sched: update packet enqueue API
>   sched: update grinder pipe and tc cache
>   sched: update grinder next pipe and tc functions
>   sched: update pipe and tc queues prefetch
>   sched: update grinder wrr compute function
>   sched: modify credits update function
>   sched: update mbuf prefetch function
>   sched: update grinder schedule function
>   sched: update grinder handle function
>   sched: update packet dequeue API
>   sched: update sched queue stats API
>   test/sched: update unit test
>   net/softnic: update softnic tm function
>   examples/qos_sched: update qos sched sample app
>   examples/ip_pipeline: update ip pipeline sample app
>   sched: code cleanup
> 
> Lukasz Krakowiak (1):
>   sched: add release note
> 
>  app/test/test_sched.c                         |   39 +-
>  doc/guides/rel_notes/deprecation.rst          |    6 -
>  doc/guides/rel_notes/release_19_08.rst        |    7 +-
>  drivers/net/softnic/rte_eth_softnic.c         |  131 +
>  drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
>  .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
>  drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
>  examples/ip_pipeline/cli.c                    |   85 +-
>  examples/ip_pipeline/tmgr.c                   |   22 +-
>  examples/ip_pipeline/tmgr.h                   |    3 -
>  examples/qos_sched/app_thread.c               |   11 +-
>  examples/qos_sched/cfg_file.c                 |  283 ++-
>  examples/qos_sched/init.c                     |  111 +-
>  examples/qos_sched/main.h                     |    7 +-
>  examples/qos_sched/profile.cfg                |   59 +-
>  examples/qos_sched/profile_ov.cfg             |   47 +-
>  examples/qos_sched/stats.c                    |  483 ++--
>  lib/librte_pipeline/rte_table_action.c        |    1 -
>  lib/librte_pipeline/rte_table_action.h        |    4 +-
>  lib/librte_sched/Makefile                     |    2 +-
>  lib/librte_sched/meson.build                  |    2 +-
>  lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
>  lib/librte_sched/rte_sched.h                  |  229 +-
>  lib/librte_sched/rte_sched_common.h           |   41 +
>  24 files changed, 2634 insertions(+), 1455 deletions(-)
> 
> --
> 2.21.0

This library is tricky, as validating its functional correctness usually requires more than just a binary pass/fail result, like your usual PMD (are packets coming out? YES/NO). It requires accuracy testing: for example, how close the actual pipe/shaper output rate is to the expected rate, how accurate the strict priority or WFQ scheduling is, etc. Therefore, here are a few accuracy tests that we need to perform on this library to make sure we did not break any functionality. Feel free to reach out to me if more details are needed:

1. Subport shaper accuracy: Inject line rate into a subport, limit subport rate to X% of line rate (X = 5, 10, 15, ..., 90). Check that subport output rate matches the rate limit with a tolerance of 1% of the expected rate.
2. Subport traffic class rate limiting accuracy: same 1% tolerance
3. Pipe shaper accuracy: same 1% tolerance
4. Pipe traffic class rate limiting accuracy: same 1% tolerance
5. Traffic class strict priority scheduling
6. WFQ for best effort traffic class
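
As a rough illustration of the 1% tolerance criterion in tests 1 to 4 above, here is a minimal sketch. The helper names (expected_rate, rate_within_tolerance) are hypothetical and not part of rte_sched; the measured rate is assumed to come from whatever traffic analyzer or stats counters the test setup provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: rate limit expressed as x_percent of the line rate. */
static uint64_t
expected_rate(uint64_t line_rate_bps, uint32_t x_percent)
{
	return line_rate_bps * x_percent / 100;
}

/*
 * Hypothetical helper: true when the measured rate deviates from the
 * expected rate by at most tol_percent of the expected rate.
 * Integer arithmetic only: |measured - expected| / expected <= tol / 100.
 */
static bool
rate_within_tolerance(uint64_t measured_bps, uint64_t expected_bps,
		      uint32_t tol_percent)
{
	uint64_t diff = measured_bps > expected_bps ?
		measured_bps - expected_bps : expected_bps - measured_bps;

	return diff * 100 <= expected_bps * tol_percent;
}
```

A test loop would then call rate_within_tolerance(measured, expected_rate(line_rate, X), 1) for X = 5, 10, 15, ..., 90 and fail on the first false result.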

On the performance side, we need to make sure we don't get a massive performance degradation. We need proof points for:
1. 8x traffic classes, 4x best effort queues
2. 8x traffic classes, 1x best effort queue
3. 4x traffic classes, 4x best effort queues
4. 4x traffic classes, 1x best effort queue

On the unit test side, a few tests to highlight:
1. Packet for queue X is enqueued to queue X and dequeued from queue X.
a) Typically tested by sending a single packet and tracing it through the queues.
b) Should be done for different queue IDs, different number of queues per subport and different number of subports.

I am sending some initial comments to the V2 patches now, more to come during the next few days.

Regards,
Cristian



* Re: [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures Jasvinder Singh
@ 2019-07-01 18:58       ` Dumitrescu, Cristian
  2019-07-02 13:20         ` Singh, Jasvinder
  2019-07-01 19:12       ` Dumitrescu, Cristian
  1 sibling, 1 reply; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-01 18:58 UTC (permalink / raw)
  To: Singh, Jasvinder, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar, AbrahamX
> <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: [PATCH v2 02/28] sched: update subport and pipe data structures
> 
> Update subport and pipe data structures to allow configuration
> flexiblity for pipe traffic classes and queues, and subport level
> configuration of the pipe parameters.
> 
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> ---
>  app/test/test_sched.c        |   2 +-
>  examples/qos_sched/init.c    |   2 +-
>  lib/librte_sched/rte_sched.h | 126 +++++++++++++++++++++++------------
>  3 files changed, 85 insertions(+), 45 deletions(-)
> 
> diff --git a/app/test/test_sched.c b/app/test/test_sched.c
> index 49bb9ea6f..d6651d490 100644
> --- a/app/test/test_sched.c
> +++ b/app/test/test_sched.c
> @@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
>  		.tc_rate = {305175, 305175, 305175, 305175},
>  		.tc_period = 40,
> 
> -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
>  	},
>  };
> 
> diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
> index 1209bd7ce..f6e9af16b 100644
> --- a/examples/qos_sched/init.c
> +++ b/examples/qos_sched/init.c
> @@ -186,7 +186,7 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
>  		.tc_ov_weight = 1,
>  #endif
> 
> -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
>  	},
>  };
> 
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index 470a0036a..ebde07669 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -114,6 +114,35 @@ extern "C" {
>  #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
>  #endif
> 
> +/*
> + * Pipe configuration parameters. The period and credits_per_period
> + * parameters are measured in bytes, with one byte meaning the time
> + * duration associated with the transmission of one byte on the
> + * physical medium of the output port, with pipe or pipe traffic class
> + * rate (measured as percentage of output port rate) determined as
> + * credits_per_period divided by period. One credit represents one
> + * byte.
> + */
> +struct rte_sched_pipe_params {
> +	/** Token bucket rate (measured in bytes per second) */
> +	uint32_t tb_rate;
> +	/** Token bucket size (measured in credits) */
> +	uint32_t tb_size;
> +
> +	/** Traffic class rates (measured in bytes per second) */
> +	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +
> +	/** Enforcement period (measured in milliseconds) */
> +	uint32_t tc_period;
> +#ifdef RTE_SCHED_SUBPORT_TC_OV
> +	/** Best-effort traffic class oversubscription weight */
> +	uint8_t tc_ov_weight;
> +#endif

We should always enable the Best Effort traffic class oversubscription feature on the API side, at least. In case this feature is disabled through the build time option (RTE_SCHED_SUBPORT_TC_OV), the values for these params can be ignored.

We should also consider always enabling the run-time part of this feature, as oversubscription is the typical configuration used by service providers. Do you see a significant performance drop when this feature is enabled?
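
To illustrate the first point, a minimal sketch, assuming trimmed-down hypothetical structures (pipe_params_sketch, pipe_state_sketch) rather than the real rte_sched ones: the weight is always part of the public parameters, and only the configuration code depends on the build-time option.

```c
#include <stdint.h>

/* Hypothetical public parameters: tc_ov_weight is always present. */
struct pipe_params_sketch {
	uint32_t tb_rate;
	uint8_t tc_ov_weight;
};

/* Hypothetical internal run-time state of a pipe. */
struct pipe_state_sketch {
	uint32_t tb_rate;
	uint8_t tc_ov_weight;
};

static void
pipe_config_sketch(struct pipe_state_sketch *p,
		   const struct pipe_params_sketch *params)
{
	p->tb_rate = params->tb_rate;
#ifdef RTE_SCHED_SUBPORT_TC_OV
	/* Feature built in: honor the value supplied by the application. */
	p->tc_ov_weight = params->tc_ov_weight;
#else
	/* Feature built out: the supplied value is ignored. */
	p->tc_ov_weight = 0;
#endif
}
```

With this layout the application code compiles unchanged whether or not RTE_SCHED_SUBPORT_TC_OV is defined.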

> +
> +	/** WRR weights of best-effort traffic class queues */
> +	uint8_t wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE];
> +};
> +
>  /*
>   * Subport configuration parameters. The period and credits_per_period
>   * parameters are measured in bytes, with one byte meaning the time
> @@ -124,15 +153,44 @@ extern "C" {
>   * byte.
>   */
>  struct rte_sched_subport_params {
> -	/* Subport token bucket */
> -	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
> -	uint32_t tb_size;                /**< Size (measured in credits) */
> +	/** Token bucket rate (measured in bytes per second) */
> +	uint32_t tb_rate;
> +
> +	/** Token bucket size (measured in credits) */
> +	uint32_t tb_size;
> 
> -	/* Subport traffic classes */
> +	/** Traffic class rates (measured in bytes per second) */
>  	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	/**< Traffic class rates (measured in bytes per second) */
> +
> +	/** Enforcement period for rates (measured in milliseconds) */
>  	uint32_t tc_period;
> -	/**< Enforcement period for rates (measured in milliseconds) */
> +
> +	/** Number of subport_pipes */
> +	uint32_t n_subport_pipes;

Minor issue: Any reason for not keeping the initial name of n_pipes_per_subport? The initial name looks more intuitive to me, so I vote to keep it; it is also in line with other naming conventions in this library.

> +
> +	/** Packet queue size for each traffic class.
> +	 * All the pipes within the same subport share the similar
> +	 * configuration for the queues. Queues which are not needed, have
> +	 * zero size.
> +	 */
> +	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
> +
> +	/** Pipe profile table.
> +	 * Every pipe is configured using one of the profiles from this table.
> +	 */
> +	struct rte_sched_pipe_params *pipe_profiles;
> +
> +	/** Profiles in the pipe profile table */
> +	uint32_t n_pipe_profiles;
> +
> +	/** Max profiles allowed in the pipe profile table */
> +	uint32_t n_max_pipe_profiles;
> +#ifdef RTE_SCHED_RED
> +	/** RED parameters */
> +	struct rte_red_params
> +		red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
> +
> +#endif
>  };
> 
>  /** Subport statistics */
> @@ -155,33 +213,6 @@ struct rte_sched_subport_stats {
>  #endif
>  };
> 
> -/*
> - * Pipe configuration parameters. The period and credits_per_period
> - * parameters are measured in bytes, with one byte meaning the time
> - * duration associated with the transmission of one byte on the
> - * physical medium of the output port, with pipe or pipe traffic class
> - * rate (measured as percentage of output port rate) determined as
> - * credits_per_period divided by period. One credit represents one
> - * byte.
> - */
> -struct rte_sched_pipe_params {
> -	/* Pipe token bucket */
> -	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
> -	uint32_t tb_size;                /**< Size (measured in credits) */
> -
> -	/* Pipe traffic classes */
> -	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	/**< Traffic class rates (measured in bytes per second) */
> -	uint32_t tc_period;
> -	/**< Enforcement period (measured in milliseconds) */
> -#ifdef RTE_SCHED_SUBPORT_TC_OV
> -	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3 oversubscription */
> -#endif
> -
> -	/* Pipe queues */
> -	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR weights */
> -};
> -
>  /** Queue statistics */
>  struct rte_sched_queue_stats {
>  	/* Packets */
> @@ -198,16 +229,25 @@ struct rte_sched_queue_stats {
> 
>  /** Port configuration parameters. */
>  struct rte_sched_port_params {
> -	const char *name;                /**< String to be associated */
> -	int socket;                      /**< CPU socket ID */
> -	uint32_t rate;                   /**< Output port rate
> -					  * (measured in bytes per second) */
> -	uint32_t mtu;                    /**< Maximum Ethernet frame size
> -					  * (measured in bytes).
> -					  * Should not include the framing overhead. */
> -	uint32_t frame_overhead;         /**< Framing overhead per packet
> -					  * (measured in bytes) */
> -	uint32_t n_subports_per_port;    /**< Number of subports */
> +	/** Name of the port to be associated */
> +	const char *name;
> +
> +	/** CPU socket ID */
> +	int socket;
> +
> +	/** Output port rate (measured in bytes per second) */
> +	uint32_t rate;
> +
> +	/** Maximum Ethernet frame size (measured in bytes).
> +	 * Should not include the framing overhead.
> +	 */
> +	uint32_t mtu;
> +
> +	/** Framing overhead per packet (measured in bytes) */
> +	uint32_t frame_overhead;
> +
> +	/** Number of subports */
> +	uint32_t n_subports_per_port;
>  	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
>  	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>  	/**< Packet queue size for each traffic class.
> --
> 2.21.0



* Re: [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config Jasvinder Singh
@ 2019-07-01 19:04       ` Dumitrescu, Cristian
  2019-07-02 13:26         ` Singh, Jasvinder
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
  1 sibling, 1 reply; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-01 19:04 UTC (permalink / raw)
  To: Singh, Jasvinder, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar, AbrahamX
> <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: [PATCH v2 01/28] sched: update macros for flexible config
> 
> Update macros to allow configuration flexiblity for pipe traffic
> classes and queues, and subport level configuration of the pipe
> parameters.
> 
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> ---
>  lib/librte_sched/rte_sched.h | 36 +++++++++++++++++++++++++-----------
>  1 file changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index 9c55a787d..470a0036a 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -52,7 +52,7 @@ extern "C" {
>   *	    multiple connections of same traffic class belonging to
>   *	    the same user;
>   *           - Weighted Round Robin (WRR) is used to service the
> - *	    queues within same pipe traffic class.
> + *	    queues within same pipe lowest priority traffic class (best-effort).
>   *
>   */
> 
> @@ -66,20 +66,32 @@ extern "C" {
>  #include "rte_red.h"
>  #endif
> 
> -/** Number of traffic classes per pipe (as well as subport).
> - * Cannot be changed.
> +/** Maximum number of queues per pipe.
> + * Note that the multiple queues (power of 2) can only be assigned to
> + * lowest priority (best-effort) traffic class. Other higher priority traffic
> + * classes can only have one queue.
> + * Can not change.
> + *
> + * @see struct rte_sched_subport_params
>   */
> -#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    4
> +#define RTE_SCHED_QUEUES_PER_PIPE    16
> 
> -/** Number of queues per pipe traffic class. Cannot be changed. */
> -#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
> +/** Number of WRR queues for best-effort traffic class per pipe.
> + *
> + * @see struct rte_sched_pipe_params
> + */
> +#define RTE_SCHED_BE_QUEUES_PER_PIPE    8

Should we have this as 8 or 4? I think we should limit this to 4, as 4 allows quick vectorization, while 8 is problematic.

We should also not have a run-time parameter for number of best effort queues, as this can be detected by checking the size of all best effort queues against 0. Of course, we should mandate that the enabled queues (with non-zero size) are contiguous.
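
A minimal sketch of that detection, assuming a hypothetical BE_QUEUES_MAX of 4 (the limit suggested above) and assuming the best-effort queue sizes are available as their own array; neither name is taken from the patch:

```c
#include <stdint.h>

/* Hypothetical upper bound on best-effort queues per pipe. */
#define BE_QUEUES_MAX 4

/*
 * Hypothetical helper: derive the number of enabled best-effort queues
 * from their sizes. Returns the count of leading non-zero entries, or
 * -1 when an enabled queue follows a disabled one (not contiguous).
 */
static int
be_queues_count(const uint16_t qsize[BE_QUEUES_MAX])
{
	int n = 0;
	int i;

	/* Count the leading run of enabled (non-zero size) queues. */
	while (n < BE_QUEUES_MAX && qsize[n] != 0)
		n++;

	/* Any enabled queue after the first gap breaks contiguity. */
	for (i = n; i < BE_QUEUES_MAX; i++)
		if (qsize[i] != 0)
			return -1;

	return n;
}
```

The subport parameter check could reject the configuration on a negative return value, so no separate run-time parameter for the number of best effort queues is needed.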

> 
> -/** Number of queues per pipe. */
> -#define RTE_SCHED_QUEUES_PER_PIPE             \
> -	(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *     \
> -	RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
> +#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
> +/** Number of traffic classes per pipe (as well as subport).
> + *
> + * @see struct rte_sched_subport_params
> + * @see struct rte_sched_pipe_params
> + */
> +#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
> +(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
> 
> -/** Maximum number of pipe profiles that can be defined per port.
> +/** Maximum number of pipe profiles that can be defined per subport.
>   * Compile-time configurable.
>   */
>  #ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT
> @@ -95,6 +107,8 @@ extern "C" {
>   *
>   * The FCS is considered overhead only if not included in the packet
>   * length (field pkt_len of struct rte_mbuf).
> + *
> + * @see struct rte_sched_port_params
>   */
>  #ifndef RTE_SCHED_FRAME_OVERHEAD_DEFAULT
>  #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
> --
> 2.21.0



* Re: [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures Jasvinder Singh
  2019-07-01 18:58       ` Dumitrescu, Cristian
@ 2019-07-01 19:12       ` Dumitrescu, Cristian
  1 sibling, 0 replies; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-01 19:12 UTC (permalink / raw)
  To: Singh, Jasvinder, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar, AbrahamX
> <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: [PATCH v2 02/28] sched: update subport and pipe data structures
> 
> Update subport and pipe data structures to allow configuration
> flexiblity for pipe traffic classes and queues, and subport level
> configuration of the pipe parameters.
> 
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> ---
>  app/test/test_sched.c        |   2 +-
>  examples/qos_sched/init.c    |   2 +-
>  lib/librte_sched/rte_sched.h | 126 +++++++++++++++++++++++------------
>  3 files changed, 85 insertions(+), 45 deletions(-)
> 
> diff --git a/app/test/test_sched.c b/app/test/test_sched.c
> index 49bb9ea6f..d6651d490 100644
> --- a/app/test/test_sched.c
> +++ b/app/test/test_sched.c
> @@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
>  		.tc_rate = {305175, 305175, 305175, 305175},
>  		.tc_period = 40,
> 
> -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
>  	},
>  };
> 
> diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
> index 1209bd7ce..f6e9af16b 100644
> --- a/examples/qos_sched/init.c
> +++ b/examples/qos_sched/init.c
> @@ -186,7 +186,7 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
>  		.tc_ov_weight = 1,
>  #endif
> 
> -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
>  	},
>  };
> 
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index 470a0036a..ebde07669 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -114,6 +114,35 @@ extern "C" {
>  #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
>  #endif
> 
> +/*
> + * Pipe configuration parameters. The period and credits_per_period
> + * parameters are measured in bytes, with one byte meaning the time
> + * duration associated with the transmission of one byte on the
> + * physical medium of the output port, with pipe or pipe traffic class
> + * rate (measured as percentage of output port rate) determined as
> + * credits_per_period divided by period. One credit represents one
> + * byte.
> + */
> +struct rte_sched_pipe_params {
> +	/** Token bucket rate (measured in bytes per second) */
> +	uint32_t tb_rate;
> +	/** Token bucket size (measured in credits) */
> +	uint32_t tb_size;
> +
> +	/** Traffic class rates (measured in bytes per second) */
> +	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +

We should have all these parameters as 64-bit values. Internally, we can decide to use 32-bit or 64-bit counters, depending on performance impact. But as we are changing the API now (under the existing notice), we should not have to change the API again later.

Any idea on the performance impact of updating all the internal counters to 64-bit?

For the purpose of being able to quickly check this, I suggest we use a define internally within the rte_sched.c file (while we always use uint64_t in the rte_sched.h file):
#ifdef COUNTER_SIZE_64
typedef uint64_t sched_counter_t;
#else
typedef uint32_t sched_counter_t;
#endif

This would also require rte_sched.c to use explicit conversions on assignment to avoid compiler warnings: p->counter = (sched_counter_t) counter;

> +	/** Enforcement period (measured in milliseconds) */
> +	uint32_t tc_period;
> +#ifdef RTE_SCHED_SUBPORT_TC_OV
> +	/** Best-effort traffic class oversubscription weight */
> +	uint8_t tc_ov_weight;
> +#endif
> +
> +	/** WRR weights of best-effort traffic class queues */
> +	uint8_t wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE];
> +};
> +
>  /*
>   * Subport configuration parameters. The period and credits_per_period
>   * parameters are measured in bytes, with one byte meaning the time
> @@ -124,15 +153,44 @@ extern "C" {
>   * byte.
>   */
>  struct rte_sched_subport_params {
> -	/* Subport token bucket */
> -	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
> -	uint32_t tb_size;                /**< Size (measured in credits) */
> +	/** Token bucket rate (measured in bytes per second) */
> +	uint32_t tb_rate;
> +
> +	/** Token bucket size (measured in credits) */
> +	uint32_t tb_size;

Same 64-bit comment here.

> 
> -	/* Subport traffic classes */
> +	/** Traffic class rates (measured in bytes per second) */
>  	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	/**< Traffic class rates (measured in bytes per second) */
> +
> +	/** Enforcement period for rates (measured in milliseconds) */
>  	uint32_t tc_period;
> -	/**< Enforcement period for rates (measured in milliseconds) */
> +
> +	/** Number of subport_pipes */
> +	uint32_t n_subport_pipes;
> +
> +	/** Packet queue size for each traffic class.
> +	 * All the pipes within the same subport share the similar
> +	 * configuration for the queues. Queues which are not needed, have
> +	 * zero size.
> +	 */
> +	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
> +
> +	/** Pipe profile table.
> +	 * Every pipe is configured using one of the profiles from this table.
> +	 */
> +	struct rte_sched_pipe_params *pipe_profiles;
> +
> +	/** Profiles in the pipe profile table */
> +	uint32_t n_pipe_profiles;
> +
> +	/** Max profiles allowed in the pipe profile table */
> +	uint32_t n_max_pipe_profiles;
> +#ifdef RTE_SCHED_RED
> +	/** RED parameters */
> +	struct rte_red_params
> +
> 	red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS
> ];
> +
> +#endif
>  };
> 
>  /** Subport statistics */
> @@ -155,33 +213,6 @@ struct rte_sched_subport_stats {
>  #endif
>  };
> 
> -/*
> - * Pipe configuration parameters. The period and credits_per_period
> - * parameters are measured in bytes, with one byte meaning the time
> - * duration associated with the transmission of one byte on the
> - * physical medium of the output port, with pipe or pipe traffic class
> - * rate (measured as percentage of output port rate) determined as
> - * credits_per_period divided by period. One credit represents one
> - * byte.
> - */
> -struct rte_sched_pipe_params {
> -	/* Pipe token bucket */
> -	uint32_t tb_rate;                /**< Rate (measured in bytes per second)
> */
> -	uint32_t tb_size;                /**< Size (measured in credits) */
> -
> -	/* Pipe traffic classes */
> -	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	/**< Traffic class rates (measured in bytes per second) */
> -	uint32_t tc_period;
> -	/**< Enforcement period (measured in milliseconds) */
> -#ifdef RTE_SCHED_SUBPORT_TC_OV
> -	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3
> oversubscription */
> -#endif
> -
> -	/* Pipe queues */
> -	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR
> weights */
> -};
> -
>  /** Queue statistics */
>  struct rte_sched_queue_stats {
>  	/* Packets */
> @@ -198,16 +229,25 @@ struct rte_sched_queue_stats {
> 
>  /** Port configuration parameters. */
>  struct rte_sched_port_params {
> -	const char *name;                /**< String to be associated */
> -	int socket;                      /**< CPU socket ID */
> -	uint32_t rate;                   /**< Output port rate
> -					  * (measured in bytes per second) */
> -	uint32_t mtu;                    /**< Maximum Ethernet frame size
> -					  * (measured in bytes).
> -					  * Should not include the framing
> overhead. */
> -	uint32_t frame_overhead;         /**< Framing overhead per packet
> -					  * (measured in bytes) */
> -	uint32_t n_subports_per_port;    /**< Number of subports */
> +	/** Name of the port to be associated */
> +	const char *name;
> +
> +	/** CPU socket ID */
> +	int socket;
> +
> +	/** Output port rate (measured in bytes per second) */
> +	uint32_t rate;

Same 64-bit comment here.

> +
> +	/** Maximum Ethernet frame size (measured in bytes).
> +	 * Should not include the framing overhead.
> +	 */
> +	uint32_t mtu;
> +
> +	/** Framing overhead per packet (measured in bytes) */
> +	uint32_t frame_overhead;
> +
> +	/** Number of subports */
> +	uint32_t n_subports_per_port;
>  	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport
> */
>  	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>  	/**< Packet queue size for each traffic class.
> --
> 2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* Re: [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API Jasvinder Singh
@ 2019-07-01 23:25       ` Dumitrescu, Cristian
  2019-07-02 21:05         ` Singh, Jasvinder
  0 siblings, 1 reply; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-01 23:25 UTC (permalink / raw)
  To: Singh, Jasvinder, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar, AbrahamX
> <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: [PATCH v2 09/28] sched: update pkt read and write API
> 
> Update run time packet read and write api implementation
> to allow configuration flexiblity for pipe traffic classes
> and queues, and subport level configuration of the pipe
> parameters.
> 
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> ---
>  lib/librte_sched/rte_sched.c | 32 +++++++++++++++++---------------
>  lib/librte_sched/rte_sched.h |  8 ++++----
>  2 files changed, 21 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 1999bbfa3..cd82fd918 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -1433,17 +1433,15 @@ rte_sched_port_pipe_profile_add(struct
> rte_sched_port *port,
> 
>  static inline uint32_t
>  rte_sched_port_qindex(struct rte_sched_port *port,
> +	struct rte_sched_subport *s,
>  	uint32_t subport,
>  	uint32_t pipe,
> -	uint32_t traffic_class,
>  	uint32_t queue)
>  {
>  	return ((subport & (port->n_subports_per_port - 1)) <<
> -			(port->n_pipes_per_subport_log2 + 4)) |
> -			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
> -			((traffic_class &
> -			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) <<
> 2) |
> -			(queue &
> (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
> +			(port->max_subport_pipes_log2 + 4)) |
> +			((pipe & (s->n_subport_pipes - 1)) << 4) |
> +			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
>  }
> 

This function contains a critical bug: this patchset proposes that the number of pipes per subport is configurable independently for each subport; in other words, each subport can be configured with a different number of pipes. Therefore, the above logic is broken, as it assumes all subports have the same number of pipes. It is no longer possible to compute port->max_subport_pipes_log2. Correct?

We might need to rethink the design solution for the per-subport independent configuration.

We also need to make sure we test this library with multiple subports per port, with each subport having a different number of pipes. We need to do the basic unit test proposed earlier to trace the packet through the scheduler hierarchy up to the packet queue.
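
To make the test case concrete, here is a rough sketch of a queue index round trip with subports of different sizes. It assumes one possible layout, in which the subport field is shifted by the log2 of the largest per-subport pipe count on the port; whether that layout is acceptable is exactly the design question above. The demo_port structure and the demo_* functions are hypothetical and only mirror the shape of the quoted code:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_QUEUES_PER_PIPE_LOG2 4 /* 16 queues per pipe, as in the patch */

/* Hypothetical port descriptor holding only what the index math needs. */
struct demo_port {
	uint32_t n_subports;     /* power of 2 */
	uint32_t max_pipes_log2; /* log2 of the largest per-subport pipe count */
	const uint32_t *n_pipes; /* per-subport pipe count, each a power of 2 */
};

/* Pack (subport, pipe, queue) into one index using a port-wide stride. */
static inline uint32_t
demo_qindex(const struct demo_port *port, uint32_t subport,
	uint32_t pipe, uint32_t queue)
{
	assert(subport < port->n_subports);
	assert(pipe < port->n_pipes[subport]);
	assert(queue < (1u << DEMO_QUEUES_PER_PIPE_LOG2));

	return (subport << (port->max_pipes_log2 + DEMO_QUEUES_PER_PIPE_LOG2)) |
		(pipe << DEMO_QUEUES_PER_PIPE_LOG2) | queue;
}

/* Unpack an index produced by demo_qindex(). */
static inline void
demo_qindex_decode(const struct demo_port *port, uint32_t qindex,
	uint32_t *subport, uint32_t *pipe, uint32_t *queue)
{
	*subport = (qindex >> (port->max_pipes_log2 + DEMO_QUEUES_PER_PIPE_LOG2)) &
		(port->n_subports - 1);
	*pipe = (qindex >> DEMO_QUEUES_PER_PIPE_LOG2) &
		((1u << port->max_pipes_log2) - 1);
	*queue = qindex & ((1u << DEMO_QUEUES_PER_PIPE_LOG2) - 1);
}
```

Note that with this layout the index space is sparse: a subport with fewer pipes than the port-wide maximum leaves part of its index range unused, so the index can no longer be used directly as an offset into a densely packed queue array.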

>  void
> @@ -1453,9 +1451,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> *port,
>  			 uint32_t traffic_class,
>  			 uint32_t queue, enum rte_color color)
>  {
> -	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
> -			traffic_class, queue);
> -	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
> +	struct rte_sched_subport *s = port->subports[subport];
> +	uint32_t qindex = rte_sched_port_qindex(port, s, subport, pipe,
> queue);
> +	rte_mbuf_sched_set(pkt, qindex, traffic_class, (uint8_t)color);
>  }
> 

Same comment here.

>  void
> @@ -1464,13 +1462,17 @@ rte_sched_port_pkt_read_tree_path(struct
> rte_sched_port *port,
>  				  uint32_t *subport, uint32_t *pipe,
>  				  uint32_t *traffic_class, uint32_t *queue)
>  {
> -	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
> +	struct rte_sched_subport *s;
> +	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
> +	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
> +
> +	*subport = (qindex >> (port->max_subport_pipes_log2 + 4)) &
> +		(port->n_subports_per_port - 1);
> 
> -	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
> -	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
> -	*traffic_class = (queue_id >> 2) &
> -				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE -
> 1);
> -	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS -
> 1);
> +	s = port->subports[*subport];
> +	*pipe = (qindex >> 4) & (s->n_subport_pipes - 1);
> +	*traffic_class = tc_id;
> +	*queue = qindex & (RTE_SCHED_QUEUES_PER_PIPE - 1);
>  }
> 

Same comment here.

>  enum rte_color
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index 121e1f669..6a6ea84aa 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -421,9 +421,9 @@ rte_sched_queue_read_stats(struct rte_sched_port
> *port,
>   * @param pipe
>   *   Pipe ID within subport
>   * @param traffic_class
> - *   Traffic class ID within pipe (0 .. 3)
> + *   Traffic class ID within pipe (0 .. 8)
>   * @param queue
> - *   Queue ID within pipe traffic class (0 .. 3)
> + *   Queue ID within pipe traffic class (0 .. 15)
>   * @param color
>   *   Packet color set
>   */
> @@ -448,9 +448,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> *port,
>   * @param pipe
>   *   Pipe ID within subport
>   * @param traffic_class
> - *   Traffic class ID within pipe (0 .. 3)
> + *   Traffic class ID within pipe (0 .. 8)
>   * @param queue
> - *   Queue ID within pipe traffic class (0 .. 3)
> + *   Queue ID within pipe traffic class (0 .. 15)
>   *
>   */
>  void
> --
> 2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* Re: [dpdk-dev] [PATCH v2 00/28] sched: feature enhancements
  2019-07-01 18:51     ` Dumitrescu, Cristian
@ 2019-07-02  9:32       ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-07-02  9:32 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: Dumitrescu, Cristian
> Sent: Monday, July 1, 2019 7:51 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Subject: RE: [PATCH v2 00/28] sched: feature enhancements
> 
> Hi Jasvinder,
> 
> Thanks for doing this work! Finally a man brave enough to do substantial
> changes to this library!

Thanks for the kind words, Cristian! I hope these additions increase the utility of the library.

> 
> > -----Original Message-----
> > From: Singh, Jasvinder
> > Sent: Tuesday, June 25, 2019 4:32 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > Subject: [PATCH v2 00/28] sched: feature enhancements
> >
> > This patchset refactors the dpdk qos sched library to add following
> > features to enhance the scheduler functionality.
> >
> > 1. flexibile configuration of the pipe traffic classes and queues;
> >
> >    Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
> >    strict priority, and each TC has exactly with 4 queues that are
> >    scheduled with Weighted Fair Queuing (WFQ).
> >
> >    Instead of hardwiring queues to traffic class within the specific pipe,
> >    the new implementation allows more flexible/configurable split of pipe
> >    queues between strict priority (SP) and best-effort (BE) traffic classes
> >    along with the support of more number of traffic classes i.e. max 16.
> >
> >    All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
> >    the lowest priority BE TC, has 1, 4 or 8 queues. This is justified by
> >    the fact that all the high priority TCs are fully provisioned (small to
> >    medium traffic rates), while most of the traffic fits into the BE class,
> >    which is typically oversubscribed.
> >
> >    Furthermore, this change allows to use less than 16 queues per pipe when
> >    not all the 16 queues are needed. Therefore, no memory will be allocated
> >    to the queues that are not needed.
> >
> > 2. Subport level configuration of pipe nodes;
> >
> >    Currently, all parameters for the pipe nodes (subscribers) configuration
> >    are part of the port level structure which forces all groups of
> >    subscribers (i.e. pipes) in different subports to have similar
> >    configurations in terms of their number, queue sizes, traffic-classes,
> >    etc.
> >
> >    The new implementation moves pipe nodes configuration parameters from
> >    port level to subport level structure. Therefore, different subports of
> >    the same port can have different configuration for the pipe nodes
> >    (subscribers), for examples- number of pipes, queue sizes, queues to
> >    traffic-class mapping, etc.
> >
> > v2:
> > - fix bug in subport parameters check
> > - remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
> > - fix bug in grinder_scheduler function
> > - improve doxygen comments
> > - add error log information
> >
> > Jasvinder Singh (27):
> >   sched: update macros for flexible config
> >   sched: update subport and pipe data structures
> >   sched: update internal data structures
> >   sched: update port config API
> >   sched: update port free API
> >   sched: update subport config API
> >   sched: update pipe profile add API
> >   sched: update pipe config API
> >   sched: update pkt read and write API
> >   sched: update subport and tc queue stats
> >   sched: update port memory footprint API
> >   sched: update packet enqueue API
> >   sched: update grinder pipe and tc cache
> >   sched: update grinder next pipe and tc functions
> >   sched: update pipe and tc queues prefetch
> >   sched: update grinder wrr compute function
> >   sched: modify credits update function
> >   sched: update mbuf prefetch function
> >   sched: update grinder schedule function
> >   sched: update grinder handle function
> >   sched: update packet dequeue API
> >   sched: update sched queue stats API
> >   test/sched: update unit test
> >   net/softnic: update softnic tm function
> >   examples/qos_sched: update qos sched sample app
> >   examples/ip_pipeline: update ip pipeline sample app
> >   sched: code cleanup
> >
> > Lukasz Krakowiak (1):
> >   sched: add release note
> >
> >  app/test/test_sched.c                         |   39 +-
> >  doc/guides/rel_notes/deprecation.rst          |    6 -
> >  doc/guides/rel_notes/release_19_08.rst        |    7 +-
> >  drivers/net/softnic/rte_eth_softnic.c         |  131 +
> >  drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
> >  .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
> >  drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
> >  examples/ip_pipeline/cli.c                    |   85 +-
> >  examples/ip_pipeline/tmgr.c                   |   22 +-
> >  examples/ip_pipeline/tmgr.h                   |    3 -
> >  examples/qos_sched/app_thread.c               |   11 +-
> >  examples/qos_sched/cfg_file.c                 |  283 ++-
> >  examples/qos_sched/init.c                     |  111 +-
> >  examples/qos_sched/main.h                     |    7 +-
> >  examples/qos_sched/profile.cfg                |   59 +-
> >  examples/qos_sched/profile_ov.cfg             |   47 +-
> >  examples/qos_sched/stats.c                    |  483 ++--
> >  lib/librte_pipeline/rte_table_action.c        |    1 -
> >  lib/librte_pipeline/rte_table_action.h        |    4 +-
> >  lib/librte_sched/Makefile                     |    2 +-
> >  lib/librte_sched/meson.build                  |    2 +-
> >  lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
> >  lib/librte_sched/rte_sched.h                  |  229 +-
> >  lib/librte_sched/rte_sched_common.h           |   41 +
> >  24 files changed, 2634 insertions(+), 1455 deletions(-)
> >
> > --
> > 2.21.0
> 
> This library is tricky, as validating the functional correctness usually requires
> more than just a binary pass/fail result, like your usual PMD (are packets
> coming out? YES/NO). It requires an accuracy testing, for example how close is
> the actual pipe/shaper output rate from the expected rate, how accurate is the
> strict priority of WFQ scheduling, etc. Therefore, here are a few accuracy tests
> that we need to perform on this library to make sure we did not break any
> functionality, feel free to reach out to me if more details needed:

I completely agree that we need to validate the accuracy by performing
the different tests suggested below. We have performed a few of them during development and
found the results to be within tolerance.


> 1. Subport shaper accuracy: Inject line rate into a subport, limit subport rate to
> X% of line rate (X = 5, 10, 15, ..., 90). Check that subport output rate matches
> the rate limit with a tolerance of 1% of the expected rate.
> 2. Subport traffic class rate limiting accuracy: same 1% tolerance
> 3. Pipe shaper accuracy: same 1% tolerance

All three tests above have been conducted successfully.

> 4. Pipe traffic class rate limiting accuracy: same 1% tolerance
> 5. Traffic class strict priority scheduling
> 6. WFQ for best effort traffic class

These tests are in progress.
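
For reference, the 1% tolerance criterion used in the accuracy tests above can be written down as a small helper. This is only a sketch of the pass/fail check, not of the test harness itself; demo_expected_rate() and demo_rate_within_tolerance() are hypothetical names, and rates are assumed to be in bytes per second:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Expected output rate when a node is limited to x_pct percent of line rate. */
static inline uint64_t
demo_expected_rate(uint64_t line_rate, uint32_t x_pct)
{
	assert(x_pct <= 100);
	return line_rate * x_pct / 100;
}

/*
 * True when the measured rate deviates from the expected rate by at most
 * tolerance_pct percent of the expected rate. Integer arithmetic only:
 * |measured - expected| / expected <= tolerance_pct / 100.
 */
static inline bool
demo_rate_within_tolerance(uint64_t expected, uint64_t measured,
	uint32_t tolerance_pct)
{
	uint64_t diff = expected > measured ?
		expected - measured : measured - expected;

	return diff * 100 <= expected * (uint64_t) tolerance_pct;
}
```

For example, with a 10GbE line rate of 1250000000 bytes per second and a 10% limit, the expected rate is 125000000 bytes per second, so any measured rate between 123750000 and 126250000 passes the 1% check.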

 
> On performance side, we need to make sure we don't get a massive
> performance degradation. We need proof points for:
> 1. 8x traffic classes, 4x best effort queues
> 2. 8x traffic classes, 1x best effort queue
> 3. 4x traffic classes, 4x best effort queues
> 4. 4x traffic classes, 1x best effort queue

We have measured performance for all of the above cases, and it was found to be slightly lower (by 0.65% to 2.6%) than with the existing version.


> On unit test side, a few tests to highlight:
> 1. Packet for queue X is enqueued to queue X and dequeued from queue X.

This is done.

> a) Typically tested by sending a single packet and tracing it through the queues.
Yes, done.
> b) Should be done for different queue IDs, different number of queues per
> subport and different number of subports.

This needs to be included; we will add it to the unit tests.

> I am sending some initial comments to the V2 patches now, more to come
> during the next few days.
> 
> Regards,
> Cristian


^ permalink raw reply	[flat|nested] 163+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/28] sched: update subport and pipe data structures
  2019-07-01 18:58       ` Dumitrescu, Cristian
@ 2019-07-02 13:20         ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-07-02 13:20 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Dumitrescu, Cristian
> Sent: Monday, July 1, 2019 7:59 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Tovar, AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: RE: [PATCH v2 02/28] sched: update subport and pipe data structures
> 
> 
> 
> > -----Original Message-----
> > From: Singh, Jasvinder
> > Sent: Tuesday, June 25, 2019 4:32 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar,
> > AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> > <lukaszx.krakowiak@intel.com>
> > Subject: [PATCH v2 02/28] sched: update subport and pipe data
> > structures
> >
> > Update subport and pipe data structures to allow configuration
> > flexiblity for pipe traffic classes and queues, and subport level
> > configuration of the pipe parameters.
> >
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> > Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> > ---
> >  app/test/test_sched.c        |   2 +-
> >  examples/qos_sched/init.c    |   2 +-
> >  lib/librte_sched/rte_sched.h | 126
> > +++++++++++++++++++++++------------
> >  3 files changed, 85 insertions(+), 45 deletions(-)
> >
> > diff --git a/app/test/test_sched.c b/app/test/test_sched.c index
> > 49bb9ea6f..d6651d490 100644
> > --- a/app/test/test_sched.c
> > +++ b/app/test/test_sched.c
> > @@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
> >  		.tc_rate = {305175, 305175, 305175, 305175},
> >  		.tc_period = 40,
> >
> > -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> > +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
> >  	},
> >  };
> >
> > diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
> > index 1209bd7ce..f6e9af16b 100644
> > --- a/examples/qos_sched/init.c
> > +++ b/examples/qos_sched/init.c
> > @@ -186,7 +186,7 @@ static struct rte_sched_pipe_params
> > pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
> >  		.tc_ov_weight = 1,
> >  #endif
> >
> > -		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
> > +		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1},
> >  	},
> >  };
> >
> > diff --git a/lib/librte_sched/rte_sched.h
> > b/lib/librte_sched/rte_sched.h index 470a0036a..ebde07669 100644
> > --- a/lib/librte_sched/rte_sched.h
> > +++ b/lib/librte_sched/rte_sched.h
> > @@ -114,6 +114,35 @@ extern "C" {
> >  #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
> >  #endif
> >
> > +/*
> > + * Pipe configuration parameters. The period and credits_per_period
> > + * parameters are measured in bytes, with one byte meaning the time
> > + * duration associated with the transmission of one byte on the
> > + * physical medium of the output port, with pipe or pipe traffic
> > +class
> > + * rate (measured as percentage of output port rate) determined as
> > + * credits_per_period divided by period. One credit represents one
> > + * byte.
> > + */
> > +struct rte_sched_pipe_params {
> > +	/** Token bucket rate (measured in bytes per second) */
> > +	uint32_t tb_rate;
> > +	/** Token bucket size (measured in credits) */
> > +	uint32_t tb_size;
> > +
> > +	/** Traffic class rates (measured in bytes per second) */
> > +	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> > +
> > +	/** Enforcement period (measured in milliseconds) */
> > +	uint32_t tc_period;
> > +#ifdef RTE_SCHED_SUBPORT_TC_OV
> > +	/** Best-effort traffic class oversubscription weight */
> > +	uint8_t tc_ov_weight;
> > +#endif
> 
> We should always enable the Best Effort traffic class oversubscription feature
> on the API side, at least. In case this feature is disabled through the build time
> option (RTE_SCHED_SUBPORT_TC_OV), the values for these params can be
> ignored.
> 
> We should also consider always enabling the run-time part for this feature, as
> the oversubscription is the typical configuration used by the service providers.

Yes, will do in next version.

> Do you see significant performance drop when this feature is enabled?

There is a drop of ~2.5 MPPS with the oversubscription flag on, due to the extra cycles consumed in the credits computation.
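
A rough sketch of what "always present in the API, ignored when compiled out" could look like. The structures below are trimmed-down, hypothetical stand-ins for the real ones, and DEMO_SUBPORT_TC_OV stands in for the build-time option:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical public parameters: the field exists regardless of the build. */
struct demo_pipe_params {
	uint32_t tc_period;
	uint8_t tc_ov_weight; /* ignored unless DEMO_SUBPORT_TC_OV is defined */
};

/* Hypothetical internal run-time profile. */
struct demo_pipe_profile {
	uint32_t tc_period;
	uint8_t tc_ov_weight;
};

/* Copy the public parameters into the run-time profile. */
static inline void
demo_pipe_profile_set(struct demo_pipe_profile *dst,
	const struct demo_pipe_params *src)
{
	assert(dst != NULL && src != NULL);

	dst->tc_period = src->tc_period;
#ifdef DEMO_SUBPORT_TC_OV
	dst->tc_ov_weight = src->tc_ov_weight;
#else
	dst->tc_ov_weight = 0; /* feature compiled out: caller's value is ignored */
#endif
}
```

Keeping the field unconditional means the public structure layout does not change with the build option, so applications do not need to be rebuilt with matching flags.
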

> > +
> > +	/** WRR weights of best-effort traffic class queues */
> > +	uint8_t wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE];
> > +};
> > +
> >  /*
> >   * Subport configuration parameters. The period and credits_per_period
> >   * parameters are measured in bytes, with one byte meaning the time
> > @@ -124,15 +153,44 @@ extern "C" {
> >   * byte.
> >   */
> >  struct rte_sched_subport_params {
> > -	/* Subport token bucket */
> > -	uint32_t tb_rate;                /**< Rate (measured in bytes per second)
> > */
> > -	uint32_t tb_size;                /**< Size (measured in credits) */
> > +	/** Token bucket rate (measured in bytes per second) */
> > +	uint32_t tb_rate;
> > +
> > +	/** Token bucket size (measured in credits) */
> > +	uint32_t tb_size;
> >
> > -	/* Subport traffic classes */
> > +	/** Traffic class rates (measured in bytes per second) */
> >  	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> > -	/**< Traffic class rates (measured in bytes per second) */
> > +
> > +	/** Enforcement period for rates (measured in milliseconds) */
> >  	uint32_t tc_period;
> > -	/**< Enforcement period for rates (measured in milliseconds) */
> > +
> > +	/** Number of subport_pipes */
> > +	uint32_t n_subport_pipes;
> 
> Minor issue: Any reason why not keeping the initial name of
> n_pipes_per_subport? The initial name looks more intuitive to me, I vote to
> keep it; it is also inline with other naming conventions in this library.

Will change in next version.

> > +
> > +	/** Packet queue size for each traffic class.
> > +	 * All the pipes within the same subport share the similar
> > +	 * configuration for the queues. Queues which are not needed, have
> > +	 * zero size.
> > +	 */
> > +	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
> > +
> > +	/** Pipe profile table.
> > +	 * Every pipe is configured using one of the profiles from this table.
> > +	 */
> > +	struct rte_sched_pipe_params *pipe_profiles;
> > +
> > +	/** Profiles in the pipe profile table */
> > +	uint32_t n_pipe_profiles;
> > +
> > +	/** Max profiles allowed in the pipe profile table */
> > +	uint32_t n_max_pipe_profiles;
> > +#ifdef RTE_SCHED_RED
> > +	/** RED parameters */
> > +	struct rte_red_params
> > +
> > 	red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS
> > ];
> > +
> > +#endif
> >  };
> >
> >  /** Subport statistics */
> > @@ -155,33 +213,6 @@ struct rte_sched_subport_stats {  #endif  };
> >
> > -/*
> > - * Pipe configuration parameters. The period and credits_per_period
> > - * parameters are measured in bytes, with one byte meaning the time
> > - * duration associated with the transmission of one byte on the
> > - * physical medium of the output port, with pipe or pipe traffic
> > class
> > - * rate (measured as percentage of output port rate) determined as
> > - * credits_per_period divided by period. One credit represents one
> > - * byte.
> > - */
> > -struct rte_sched_pipe_params {
> > -	/* Pipe token bucket */
> > -	uint32_t tb_rate;                /**< Rate (measured in bytes per second)
> > */
> > -	uint32_t tb_size;                /**< Size (measured in credits) */
> > -
> > -	/* Pipe traffic classes */
> > -	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> > -	/**< Traffic class rates (measured in bytes per second) */
> > -	uint32_t tc_period;
> > -	/**< Enforcement period (measured in milliseconds) */
> > -#ifdef RTE_SCHED_SUBPORT_TC_OV
> > -	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3
> > oversubscription */
> > -#endif
> > -
> > -	/* Pipe queues */
> > -	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR
> > weights */
> > -};
> > -
> >  /** Queue statistics */
> >  struct rte_sched_queue_stats {
> >  	/* Packets */
> > @@ -198,16 +229,25 @@ struct rte_sched_queue_stats {
> >
> >  /** Port configuration parameters. */  struct rte_sched_port_params {
> > -	const char *name;                /**< String to be associated */
> > -	int socket;                      /**< CPU socket ID */
> > -	uint32_t rate;                   /**< Output port rate
> > -					  * (measured in bytes per second) */
> > -	uint32_t mtu;                    /**< Maximum Ethernet frame size
> > -					  * (measured in bytes).
> > -					  * Should not include the framing
> > overhead. */
> > -	uint32_t frame_overhead;         /**< Framing overhead per packet
> > -					  * (measured in bytes) */
> > -	uint32_t n_subports_per_port;    /**< Number of subports */
> > +	/** Name of the port to be associated */
> > +	const char *name;
> > +
> > +	/** CPU socket ID */
> > +	int socket;
> > +
> > +	/** Output port rate (measured in bytes per second) */
> > +	uint32_t rate;
> > +
> > +	/** Maximum Ethernet frame size (measured in bytes).
> > +	 * Should not include the framing overhead.
> > +	 */
> > +	uint32_t mtu;
> > +
> > +	/** Framing overhead per packet (measured in bytes) */
> > +	uint32_t frame_overhead;
> > +
> > +	/** Number of subports */
> > +	uint32_t n_subports_per_port;
> >  	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport
> > */
> >  	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> >  	/**< Packet queue size for each traffic class.
> > --
> > 2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config
  2019-07-01 19:04       ` Dumitrescu, Cristian
@ 2019-07-02 13:26         ` Singh, Jasvinder
  0 siblings, 0 replies; 163+ messages in thread
From: Singh, Jasvinder @ 2019-07-02 13:26 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Dumitrescu, Cristian
> Sent: Monday, July 1, 2019 8:05 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Tovar, AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: RE: [PATCH v2 01/28] sched: update macros for flexible config
> 
> 
> 
> > -----Original Message-----
> > From: Singh, Jasvinder
> > Sent: Tuesday, June 25, 2019 4:32 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar,
> > AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> > <lukaszx.krakowiak@intel.com>
> > Subject: [PATCH v2 01/28] sched: update macros for flexible config
> >
> > Update macros to allow configuration flexiblity for pipe traffic
> > classes and queues, and subport level configuration of the pipe
> > parameters.
> >
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> > Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> > ---
> >  lib/librte_sched/rte_sched.h | 36
> > +++++++++++++++++++++++++-----------
> >  1 file changed, 25 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/librte_sched/rte_sched.h
> > b/lib/librte_sched/rte_sched.h index 9c55a787d..470a0036a 100644
> > --- a/lib/librte_sched/rte_sched.h
> > +++ b/lib/librte_sched/rte_sched.h
> > @@ -52,7 +52,7 @@ extern "C" {
> >   *	    multiple connections of same traffic class belonging to
> >   *	    the same user;
> >   *           - Weighted Round Robin (WRR) is used to service the
> > - *	    queues within same pipe traffic class.
> > + *	    queues within same pipe lowest priority traffic class (best-effort).
> >   *
> >   */
> >
> > @@ -66,20 +66,32 @@ extern "C" {
> >  #include "rte_red.h"
> >  #endif
> >
> > -/** Number of traffic classes per pipe (as well as subport).
> > - * Cannot be changed.
> > +/** Maximum number of queues per pipe.
> > + * Note that the multiple queues (power of 2) can only be assigned to
> > + * lowest priority (best-effort) traffic class. Other higher priority
> > +traffic
> > + * classes can only have one queue.
> > + * Can not change.
> > + *
> > + * @see struct rte_sched_subport_params
> >   */
> > -#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    4
> > +#define RTE_SCHED_QUEUES_PER_PIPE    16
> >
> > -/** Number of queues per pipe traffic class. Cannot be changed. */
> > -#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
> > +/** Number of WRR queues for best-effort traffic class per pipe.
> > + *
> > + * @see struct rte_sched_pipe_params
> > + */
> > +#define RTE_SCHED_BE_QUEUES_PER_PIPE    8
> 
> Should we have this as 8 or 4? I think we should limit this to 4, as 4 allows quick
> vectorization, while 8 is problematic.

The only reason to keep 8 queues for the best-effort TC is the flexibility to reduce to 4 if needed; moreover, it has very little impact on performance in our tests.

> We should also not have a run-time parameter for number of best effort
> queues, as this can be detected by checking the size of all best effort queues
> against 0. Of course, we should mandate that the enabled queues (with non-
> zero size) are contiguous.

Yes, it is done that way, without any additional parameter in the public structure.
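
For illustration, the detection described above can be sketched as follows. This is not the code from the patch; demo_count_enabled_queues() is a hypothetical helper that takes the best-effort queue sizes, counts the enabled (non-zero) ones and rejects configurations where the enabled queues are not contiguous:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the number of enabled queues (leading non-zero sizes), or -1 when
 * a non-zero size follows a zero size, i.e. the enabled queues have a gap.
 */
static inline int
demo_count_enabled_queues(const uint16_t *qsize, uint32_t n)
{
	uint32_t i, count = 0;
	int seen_zero = 0;

	assert(qsize != NULL);

	for (i = 0; i < n; i++) {
		if (qsize[i] == 0) {
			seen_zero = 1;
			continue;
		}
		if (seen_zero)
			return -1; /* enabled queue after a disabled one */
		count++;
	}

	return (int) count;
}
```

A configuration check could additionally require the returned count to be one of the supported values (for example 1, 4 or 8, as listed in the cover letter).
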
> >
> > -/** Number of queues per pipe. */
> > -#define RTE_SCHED_QUEUES_PER_PIPE             \
> > -	(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *     \
> > -	RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
> > +#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
> > +/** Number of traffic classes per pipe (as well as subport).
> > + *
> > + * @see struct rte_sched_subport_params
> > + * @see struct rte_sched_pipe_params
> > + */
> > +#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
> > +(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
> >
> > -/** Maximum number of pipe profiles that can be defined per port.
> > +/** Maximum number of pipe profiles that can be defined per subport.
> >   * Compile-time configurable.
> >   */
> >  #ifndef RTE_SCHED_PIPE_PROFILES_PER_PORT @@ -95,6 +107,8 @@
> extern
> > "C" {
> >   *
> >   * The FCS is considered overhead only if not included in the packet
> >   * length (field pkt_len of struct rte_mbuf).
> > + *
> > + * @see struct rte_sched_port_params
> >   */
> >  #ifndef RTE_SCHED_FRAME_OVERHEAD_DEFAULT
> >  #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
> > --
> > 2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* Re: [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API
  2019-07-01 23:25       ` Dumitrescu, Cristian
@ 2019-07-02 21:05         ` Singh, Jasvinder
  2019-07-03 13:40           ` Dumitrescu, Cristian
  0 siblings, 1 reply; 163+ messages in thread
From: Singh, Jasvinder @ 2019-07-02 21:05 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Dumitrescu, Cristian
> Sent: Tuesday, July 2, 2019 12:25 AM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Tovar, AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: RE: [PATCH v2 09/28] sched: update pkt read and write API
> 
> 
> 
> > -----Original Message-----
> > From: Singh, Jasvinder
> > Sent: Tuesday, June 25, 2019 4:32 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar,
> > AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> > <lukaszx.krakowiak@intel.com>
> > Subject: [PATCH v2 09/28] sched: update pkt read and write API
> >
> > Update run time packet read and write api implementation to allow
> > configuration flexiblity for pipe traffic classes and queues, and
> > subport level configuration of the pipe parameters.
> >
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> > Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> > ---
> >  lib/librte_sched/rte_sched.c | 32 +++++++++++++++++---------------
> > lib/librte_sched/rte_sched.h |  8 ++++----
> >  2 files changed, 21 insertions(+), 19 deletions(-)
> >
> > diff --git a/lib/librte_sched/rte_sched.c
> > b/lib/librte_sched/rte_sched.c index 1999bbfa3..cd82fd918 100644
> > --- a/lib/librte_sched/rte_sched.c
> > +++ b/lib/librte_sched/rte_sched.c
> > @@ -1433,17 +1433,15 @@ rte_sched_port_pipe_profile_add(struct
> > rte_sched_port *port,
> >
> >  static inline uint32_t
> >  rte_sched_port_qindex(struct rte_sched_port *port,
> > +	struct rte_sched_subport *s,
> >  	uint32_t subport,
> >  	uint32_t pipe,
> > -	uint32_t traffic_class,
> >  	uint32_t queue)
> >  {
> >  	return ((subport & (port->n_subports_per_port - 1)) <<
> > -			(port->n_pipes_per_subport_log2 + 4)) |
> > -			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
> > -			((traffic_class &
> > -			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) <<
> > 2) |
> > -			(queue &
> > (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
> > +			(port->max_subport_pipes_log2 + 4)) |
> > +			((pipe & (s->n_subport_pipes - 1)) << 4) |
> > +			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
> >  }
> >
> 
> This function contains a critical bug: this patchset proposes that the number of
> pipes per subport is configurable independently for each subport; in other
> words, each subport can be configured with a different number of pipes.
> Therefore, the above logic is broken, as it assumes all subports have the same
> number of pipes. It is no longer possible to compute
> port->max_subport_pipes_log2. Correct?

Yes, you are right. I didn't realize this issue.
 
> 
> We might need to rethink the design solution for the per-subport independent
> configuration.

One option to get around this is to compute a start_queue_offset for each subport and store it in the rte_sched_subport data structure during subport configuration.

At run time, when writing the packet metadata, add that offset when calculating the queue index:
	subport->start_queue_offset + (((pipe & (s->n_pipes_per_subport - 1)) << 4) | (queue & (RTE_SCHED_QUEUES_PER_PIPE - 1)));
At the same time, the subport ID can be written to the reserved field of mbuf->hash.sched.

During packet read, the offset is looked up from the subport ID (taken from the reserved field). Subtracting the offset from the qindex
leaves a value from which the pipe, TC and queue IDs can be determined.

This keeps the queue IDs contiguous at the port level.
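
A rough sketch of the offset scheme described above, under the assumption that each subport has a power-of-two number of pipes and 16 queues per pipe (the "<< 4" in the patch). The struct and function names are hypothetical, not the librte_sched API:

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_PIPE 16 /* assumed, matches the "<< 4" shift */

/* Hypothetical per-subport state; not the real rte_sched_subport. */
struct subport_sketch {
	uint32_t n_pipes;            /* must be a power of two */
	uint32_t start_queue_offset; /* first port-level queue index */
};

/* Lay the subports out back to back; returns the total queue count. */
static uint32_t
subports_assign_offsets(struct subport_sketch *s, uint32_t n_subports)
{
	uint32_t offset = 0;
	uint32_t i;

	for (i = 0; i < n_subports; i++) {
		assert(s[i].n_pipes != 0 &&
			(s[i].n_pipes & (s[i].n_pipes - 1)) == 0);
		s[i].start_queue_offset = offset;
		offset += s[i].n_pipes * QUEUES_PER_PIPE;
	}

	return offset;
}

/* Packet write: port-level queue index for (pipe, queue) in subport s. */
static uint32_t
qindex_write(const struct subport_sketch *s, uint32_t pipe, uint32_t queue)
{
	return s->start_queue_offset +
		(((pipe & (s->n_pipes - 1)) << 4) |
		(queue & (QUEUES_PER_PIPE - 1)));
}

/* Packet read: recover pipe and queue once the subport is known. */
static void
qindex_read(const struct subport_sketch *s, uint32_t qindex,
	uint32_t *pipe, uint32_t *queue)
{
	uint32_t local = qindex - s->start_queue_offset;

	*pipe = (local >> 4) & (s->n_pipes - 1);
	*queue = local & (QUEUES_PER_PIPE - 1);
}
```

The read side only works if the subport is already known, which is why the proposal also needs the subport ID to travel with the packet.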

> We also need to make sure we test this library with multiple subports per port,
> with each subport having a different number of pipes. We need to do the basic unit
> test proposed earlier to trace the packet through the scheduler hierarchy up to
> the packet queue.

Yes, I will add a unit test for this case.
> 
> >  void
> > @@ -1453,9 +1451,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> > *port,
> >  			 uint32_t traffic_class,
> >  			 uint32_t queue, enum rte_color color)  {
> > -	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
> > -			traffic_class, queue);
> > -	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
> > +	struct rte_sched_subport *s = port->subports[subport];
> > +	uint32_t qindex = rte_sched_port_qindex(port, s, subport, pipe,
> > queue);
> > +	rte_mbuf_sched_set(pkt, qindex, traffic_class, (uint8_t)color);
> >  }
> >
> 
> Same comment here.
> 
> >  void
> > @@ -1464,13 +1462,17 @@ rte_sched_port_pkt_read_tree_path(struct
> > rte_sched_port *port,
> >  				  uint32_t *subport, uint32_t *pipe,
> >  				  uint32_t *traffic_class, uint32_t *queue)  {
> > -	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
> > +	struct rte_sched_subport *s;
> > +	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
> > +	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
> > +
> > +	*subport = (qindex >> (port->max_subport_pipes_log2 + 4)) &
> > +		(port->n_subports_per_port - 1);
> >
> > -	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
> > -	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
> > -	*traffic_class = (queue_id >> 2) &
> > -				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE -
> > 1);
> > -	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS -
> > 1);
> > +	s = port->subports[*subport];
> > +	*pipe = (qindex >> 4) & (s->n_subport_pipes - 1);
> > +	*traffic_class = tc_id;
> > +	*queue = qindex & (RTE_SCHED_QUEUES_PER_PIPE - 1);
> >  }
> >
> 
> Same comment here.
> 
> >  enum rte_color
> > diff --git a/lib/librte_sched/rte_sched.h
> > b/lib/librte_sched/rte_sched.h index 121e1f669..6a6ea84aa 100644
> > --- a/lib/librte_sched/rte_sched.h
> > +++ b/lib/librte_sched/rte_sched.h
> > @@ -421,9 +421,9 @@ rte_sched_queue_read_stats(struct rte_sched_port
> > *port,
> >   * @param pipe
> >   *   Pipe ID within subport
> >   * @param traffic_class
> > - *   Traffic class ID within pipe (0 .. 3)
> > + *   Traffic class ID within pipe (0 .. 8)
> >   * @param queue
> > - *   Queue ID within pipe traffic class (0 .. 3)
> > + *   Queue ID within pipe traffic class (0 .. 15)
> >   * @param color
> >   *   Packet color set
> >   */
> > @@ -448,9 +448,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> > *port,
> >   * @param pipe
> >   *   Pipe ID within subport
> >   * @param traffic_class
> > - *   Traffic class ID within pipe (0 .. 3)
> > + *   Traffic class ID within pipe (0 .. 8)
> >   * @param queue
> > - *   Queue ID within pipe traffic class (0 .. 3)
> > + *   Queue ID within pipe traffic class (0 .. 15)
> >   *
> >   */
> >  void
> > --
> > 2.21.0



* Re: [dpdk-dev] [PATCH v2 09/28] sched: update pkt read and write API
  2019-07-02 21:05         ` Singh, Jasvinder
@ 2019-07-03 13:40           ` Dumitrescu, Cristian
  0 siblings, 0 replies; 163+ messages in thread
From: Dumitrescu, Cristian @ 2019-07-03 13:40 UTC (permalink / raw)
  To: Singh, Jasvinder, dev; +Cc: Tovar, AbrahamX, Krakowiak, LukaszX



> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, July 2, 2019 10:05 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; dev@dpdk.org
> Cc: Tovar, AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> <lukaszx.krakowiak@intel.com>
> Subject: RE: [PATCH v2 09/28] sched: update pkt read and write API
> 
> 
> 
> > -----Original Message-----
> > From: Dumitrescu, Cristian
> > Sent: Tuesday, July 2, 2019 12:25 AM
> > To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> > Cc: Tovar, AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> > <lukaszx.krakowiak@intel.com>
> > Subject: RE: [PATCH v2 09/28] sched: update pkt read and write API
> >
> >
> >
> > > -----Original Message-----
> > > From: Singh, Jasvinder
> > > Sent: Tuesday, June 25, 2019 4:32 PM
> > > To: dev@dpdk.org
> > > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Tovar,
> > > AbrahamX <abrahamx.tovar@intel.com>; Krakowiak, LukaszX
> > > <lukaszx.krakowiak@intel.com>
> > > Subject: [PATCH v2 09/28] sched: update pkt read and write API
> > >
> > > Update run time packet read and write api implementation to allow
> > > configuration flexiblity for pipe traffic classes and queues, and
> > > subport level configuration of the pipe parameters.
> > >
> > > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > > Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
> > > Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> > > ---
> > >  lib/librte_sched/rte_sched.c | 32 +++++++++++++++++---------------
> > > lib/librte_sched/rte_sched.h |  8 ++++----
> > >  2 files changed, 21 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/lib/librte_sched/rte_sched.c
> > > b/lib/librte_sched/rte_sched.c index 1999bbfa3..cd82fd918 100644
> > > --- a/lib/librte_sched/rte_sched.c
> > > +++ b/lib/librte_sched/rte_sched.c
> > > @@ -1433,17 +1433,15 @@ rte_sched_port_pipe_profile_add(struct
> > > rte_sched_port *port,
> > >
> > >  static inline uint32_t
> > >  rte_sched_port_qindex(struct rte_sched_port *port,
> > > +	struct rte_sched_subport *s,
> > >  	uint32_t subport,
> > >  	uint32_t pipe,
> > > -	uint32_t traffic_class,
> > >  	uint32_t queue)
> > >  {
> > >  	return ((subport & (port->n_subports_per_port - 1)) <<
> > > -			(port->n_pipes_per_subport_log2 + 4)) |
> > > -			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
> > > -			((traffic_class &
> > > -			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) <<
> > > 2) |
> > > -			(queue &
> > > (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
> > > +			(port->max_subport_pipes_log2 + 4)) |
> > > +			((pipe & (s->n_subport_pipes - 1)) << 4) |
> > > +			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
> > >  }
> > >
> >
> > This function contains a critical bug: this patchset proposes that the
> > number of pipes per subport is configurable independently for each
> > subport; in other words, each subport can be configured with a different
> > number of pipes. Therefore, the above logic is broken, as it assumes all
> > subports have the same number of pipes. It is no longer possible to
> > compute port->max_subport_pipes_log2. Correct?
> 
> Yes, you are right. I didn't realize this issue.
> 
> >
> > We might need to rethink the design solution for the per-subport
> > independent configuration.
> 
> One option to get around this is by computing  start_queue_offset for each
> subport and store that in rte_sched_subport data structure during subport
> configuration.
> 

Yes, it can be done, and this is probably the only way to do it, but it would severely impact performance: in order to determine the subport ID, you would have to iterate through the list of subports to find where this queue ID fits.
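
To make the cost concrete, a minimal sketch of the lookup this would require on the read path when the subport ID is not carried with the packet. The names are hypothetical and the linear scan is only one possible search:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a subport's contiguous queue range. */
struct subport_range {
	uint32_t start_queue_offset; /* first port-level queue index */
	uint32_t n_queues;           /* queues owned by this subport */
};

/*
 * Find the subport that owns qindex by scanning the ranges.
 * Returns UINT32_MAX when no subport owns the index.
 * Cost is O(n_subports) per packet.
 */
static uint32_t
subport_find(const struct subport_range *r, uint32_t n_subports,
	uint32_t qindex)
{
	uint32_t i;

	assert(r != NULL);

	for (i = 0; i < n_subports; i++) {
		/*
		 * Unsigned subtraction wraps to a large value when
		 * qindex is below the range start, so one compare
		 * covers both bounds.
		 */
		if (qindex - r[i].start_queue_offset < r[i].n_queues)
			return i;
	}

	return UINT32_MAX;
}
```

A binary search over the sorted offsets would reduce this to O(log n), but it is still a data-dependent search where the current code uses a single shift and mask.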

> During run time, when writing pkt metadata;  add that offset to calculate
> queue index ;
> 	subport->start_queue_offset+((pipe & (s->n_pipes_per_subport -
> 1)) << 4) | (queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));

> At the same time,  subport id can be written to reserved field of the
> mbuf->hash.sched

I am afraid we cannot write the subport ID into mbuf->sched, as the subport ID is not generic; it is specific to the librte_sched feature set. This would pollute the generic mbuf->sched with librte_sched implementation details.

> 
> During packet read, offset value is retrieved from subport_id (from reserved
> field). By subtracting offset from the qindex,
> pipe, tc and queue id can be determined from the remaining value.
> 
> This will allow contiguous value of the queue id at the port level.
> 
> > We also need to make sure we test this library with multiple subports per
> > port, with each subport having a different number of pipes. We need to do the
> > basic unit test proposed earlier to trace the packet through the scheduler
> > hierarchy up to the packet queue.
> 
> Yes, will add unit test for this case.
> >
> > >  void
> > > @@ -1453,9 +1451,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> > > *port,
> > >  			 uint32_t traffic_class,
> > >  			 uint32_t queue, enum rte_color color)  {
> > > -	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
> > > -			traffic_class, queue);
> > > -	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
> > > +	struct rte_sched_subport *s = port->subports[subport];
> > > +	uint32_t qindex = rte_sched_port_qindex(port, s, subport, pipe,
> > > queue);
> > > +	rte_mbuf_sched_set(pkt, qindex, traffic_class, (uint8_t)color);
> > >  }
> > >
> >
> > Same comment here.
> >
> > >  void
> > > @@ -1464,13 +1462,17 @@ rte_sched_port_pkt_read_tree_path(struct
> > > rte_sched_port *port,
> > >  				  uint32_t *subport, uint32_t *pipe,
> > >  				  uint32_t *traffic_class, uint32_t *queue)  {
> > > -	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
> > > +	struct rte_sched_subport *s;
> > > +	uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
> > > +	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
> > > +
> > > +	*subport = (qindex >> (port->max_subport_pipes_log2 + 4)) &
> > > +		(port->n_subports_per_port - 1);
> > >
> > > -	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
> > > -	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
> > > -	*traffic_class = (queue_id >> 2) &
> > > -				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE -
> > > 1);
> > > -	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS -
> > > 1);
> > > +	s = port->subports[*subport];
> > > +	*pipe = (qindex >> 4) & (s->n_subport_pipes - 1);
> > > +	*traffic_class = tc_id;
> > > +	*queue = qindex & (RTE_SCHED_QUEUES_PER_PIPE - 1);
> > >  }
> > >
> >
> > Same comment here.
> >
> > >  enum rte_color
> > > diff --git a/lib/librte_sched/rte_sched.h
> > > b/lib/librte_sched/rte_sched.h index 121e1f669..6a6ea84aa 100644
> > > --- a/lib/librte_sched/rte_sched.h
> > > +++ b/lib/librte_sched/rte_sched.h
> > > @@ -421,9 +421,9 @@ rte_sched_queue_read_stats(struct rte_sched_port
> > > *port,
> > >   * @param pipe
> > >   *   Pipe ID within subport
> > >   * @param traffic_class
> > > - *   Traffic class ID within pipe (0 .. 3)
> > > + *   Traffic class ID within pipe (0 .. 8)
> > >   * @param queue
> > > - *   Queue ID within pipe traffic class (0 .. 3)
> > > + *   Queue ID within pipe traffic class (0 .. 15)
> > >   * @param color
> > >   *   Packet color set
> > >   */
> > > @@ -448,9 +448,9 @@ rte_sched_port_pkt_write(struct rte_sched_port
> > > *port,
> > >   * @param pipe
> > >   *   Pipe ID within subport
> > >   * @param traffic_class
> > > - *   Traffic class ID within pipe (0 .. 3)
> > > + *   Traffic class ID within pipe (0 .. 8)
> > >   * @param queue
> > > - *   Queue ID within pipe traffic class (0 .. 3)
> > > + *   Queue ID within pipe traffic class (0 .. 15)
> > >   *
> > >   */
> > >  void
> > > --
> > > 2.21.0



* [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements
  2019-06-25 15:31     ` [dpdk-dev] [PATCH v2 01/28] sched: update macros for flexible config Jasvinder Singh
  2019-07-01 19:04       ` Dumitrescu, Cristian
@ 2019-07-11 10:26       ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 01/11] sched: remove wrr from strict priority tc queues Jasvinder Singh
                           ` (10 more replies)
  1 sibling, 11 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu

This patchset refactors the DPDK QoS sched library to allow flexible
configuration of the pipe traffic classes and queue sizes.

Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
strict priority, and each TC has exactly 4 queues that are
scheduled with Weighted Fair Queuing (WFQ).

Instead of hardwiring queues to traffic class within the specific pipe,
the new implementation allows more flexible/configurable split of pipe
queues between strict priority (SP) and best-effort (BE) traffic classes
along with support for a larger number of traffic classes (up to 16).

All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
the lowest priority BE TC has 1, 4 or 8 queues. This is justified by
the fact that all the high priority TCs are fully provisioned (small to
medium traffic rates), while most of the traffic fits into the BE class,
which is typically oversubscribed.

Furthermore, this change allows using fewer than 16 queues per pipe when
not all the 16 queues are needed. Therefore, no memory will be allocated
to the queues that are not needed.
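
As an illustration of the split described above, a minimal sketch of the
queue-to-traffic-class mapping. It assumes a flat layout in which the
single-queue strict priority TCs come first and the best-effort TC takes
the remaining queues, with the TC count following the
RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE formula discussed in the v2 review.
The helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_PIPE 16 /* stand-in for RTE_SCHED_QUEUES_PER_PIPE */

/* One TC per strict priority queue, plus one best-effort TC. */
static uint32_t
n_traffic_classes(uint32_t n_be_queues)
{
	assert(n_be_queues >= 1 && n_be_queues <= QUEUES_PER_PIPE);
	return QUEUES_PER_PIPE - n_be_queues + 1;
}

/*
 * Map a queue index within the pipe to its traffic class.
 * Queues below the best-effort TC index are their own TC; all
 * remaining queues belong to the best-effort TC.
 */
static uint32_t
queue_to_tc(uint32_t queue, uint32_t n_be_queues)
{
	uint32_t tc_be = n_traffic_classes(n_be_queues) - 1;

	assert(queue < QUEUES_PER_PIPE);
	return queue < tc_be ? queue : tc_be;
}
```

With 4 best-effort queues this gives 13 traffic classes, with queues
12..15 all mapping to the best-effort TC (index 12).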

v3:
- remove code related to subport level configuration of the pipe 
- remove tc oversubscription flag from struct rte_sched_pipe_params
- replace RTE_SCHED_PIPE_PROFILES_PER_PORT with port param field

v2:
- fix bug in subport parameters check
- remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
- fix bug in grinder_scheduler function
- improve doxygen comments 
- add error log information

Jasvinder Singh (11):
  sched: remove wrr from strict priority tc queues
  sched: add config flexibility to tc queue sizes
  sched: add max pipe profiles config in run time
  sched: rename tc3 params to best-effort tc
  sched: improve error log messages
  sched: improve doxygen comments
  net/softnic: add config flexibility to softnic tm
  test_sched: modify tests for config flexibility
  examples/ip_pipeline: add config flexibility to tm function
  examples/qos_sched: add tc and queue config flexibility
  sched: remove redundant macros

 app/test/test_sched.c                         |  12 +-
 doc/guides/rel_notes/release_19_08.rst        |  10 +-
 drivers/net/softnic/rte_eth_softnic.c         | 131 +++
 drivers/net/softnic/rte_eth_softnic_cli.c     | 433 ++++++++-
 .../net/softnic/rte_eth_softnic_internals.h   |   8 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |  64 +-
 examples/ip_pipeline/cli.c                    |  45 +-
 examples/ip_pipeline/tmgr.c                   |   2 +-
 examples/ip_pipeline/tmgr.h                   |   4 +-
 examples/qos_sched/app_thread.c               |   9 +-
 examples/qos_sched/cfg_file.c                 | 119 ++-
 examples/qos_sched/init.c                     |  65 +-
 examples/qos_sched/main.h                     |   4 +
 examples/qos_sched/profile.cfg                |  66 +-
 examples/qos_sched/profile_ov.cfg             |  54 +-
 examples/qos_sched/stats.c                    | 483 +++++-----
 lib/librte_pipeline/rte_table_action.c        |   1 -
 lib/librte_pipeline/rte_table_action.h        |   4 +-
 lib/librte_sched/Makefile                     |   2 +-
 lib/librte_sched/meson.build                  |   2 +-
 lib/librte_sched/rte_sched.c                  | 842 +++++++++++-------
 lib/librte_sched/rte_sched.h                  | 182 ++--
 22 files changed, 1786 insertions(+), 756 deletions(-)

-- 
2.21.0



* [dpdk-dev] [PATCH v3 01/11] sched: remove wrr from strict priority tc queues
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-12  9:57           ` [dpdk-dev] [PATCH v4 00/11] sched: feature enhancements Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 02/11] sched: add config flexibility to tc queue sizes Jasvinder Singh
                           ` (9 subsequent siblings)
  10 siblings, 1 reply; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

All higher priority traffic classes contain only one queue, so the
WRR function is removed for them. The lowest priority best-effort
traffic class continues to have multiple queues, and packets are
scheduled from its queues using the WRR function.
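
For reference, a minimal sketch of how per-queue WRR costs can be derived
from the configured weights for a 4-queue best-effort TC: each cost is the
least common multiple of the weights divided by the queue's weight, so a
higher weight means a lower cost per byte. The helpers gcd_u32, lcm_u32
and wrr_costs are hypothetical stand-ins written for this example and are
not the library's own functions:

```c
#include <assert.h>
#include <stdint.h>

#define BE_QUEUES 4 /* assumed number of best-effort queues */

static uint32_t
gcd_u32(uint32_t a, uint32_t b)
{
	while (b != 0) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

static uint32_t
lcm_u32(uint32_t a, uint32_t b)
{
	return (a / gcd_u32(a, b)) * b;
}

/*
 * Convert WRR weights into costs. Weights must be non-zero.
 * The result is narrowed to 8 bits as in the patch, so weight
 * sets whose LCM-based costs exceed 255 would be truncated.
 */
static void
wrr_costs(const uint8_t weights[BE_QUEUES], uint8_t cost[BE_QUEUES])
{
	uint32_t lcm = 1;
	uint32_t i;

	for (i = 0; i < BE_QUEUES; i++) {
		assert(weights[i] != 0);
		lcm = lcm_u32(lcm, weights[i]);
	}

	for (i = 0; i < BE_QUEUES; i++)
		cost[i] = (uint8_t)(lcm / weights[i]);
}
```

For weights {1, 2, 4, 8} the costs come out as {8, 4, 2, 1}; equal
weights give equal costs of 1.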

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c        |   2 +-
 examples/qos_sched/init.c    |   2 +-
 lib/librte_sched/Makefile    |   2 +-
 lib/librte_sched/meson.build |   2 +-
 lib/librte_sched/rte_sched.c | 183 ++++++++++++++++++++---------------
 lib/librte_sched/rte_sched.h |  23 +++--
 6 files changed, 125 insertions(+), 89 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index 49bb9ea6f..36fa2d425 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -40,7 +40,7 @@ static struct rte_sched_pipe_params pipe_profile[] = {
 		.tc_rate = {305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1},
 	},
 };
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index 1209bd7ce..6b63d4e0e 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -186,7 +186,7 @@ static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PO
 		.tc_ov_weight = 1,
 #endif
 
-		.wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
+		.wrr_weights = {1, 1, 1, 1},
 	},
 };
 
diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile
index 644fd9d15..3d7f410e1 100644
--- a/lib/librte_sched/Makefile
+++ b/lib/librte_sched/Makefile
@@ -18,7 +18,7 @@ LDLIBS += -lrte_timer
 
 EXPORT_MAP := rte_sched_version.map
 
-LIBABIVER := 2
+LIBABIVER := 3
 
 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_sched/meson.build b/lib/librte_sched/meson.build
index 8e989e5f6..59d43c6d8 100644
--- a/lib/librte_sched/meson.build
+++ b/lib/librte_sched/meson.build
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-version = 2
+version = 3
 sources = files('rte_sched.c', 'rte_red.c', 'rte_approx.c')
 headers = files('rte_sched.h', 'rte_sched_common.h',
 		'rte_red.h', 'rte_approx.h')
diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index bc06bc3f4..eac995680 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -37,6 +37,8 @@
 
 #define RTE_SCHED_TB_RATE_CONFIG_ERR          (1e-7)
 #define RTE_SCHED_WRR_SHIFT                   3
+#define RTE_SCHED_TRAFFIC_CLASS_BE            (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+#define RTE_SCHED_MAX_QUEUES_PER_TC           RTE_SCHED_BE_QUEUES_PER_PIPE
 #define RTE_SCHED_GRINDER_PCACHE_SIZE         (64 / RTE_SCHED_QUEUES_PER_PIPE)
 #define RTE_SCHED_PIPE_INVALID                UINT32_MAX
 #define RTE_SCHED_BMP_POS_INVALID             UINT32_MAX
@@ -84,8 +86,9 @@ struct rte_sched_pipe_profile {
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint8_t tc_ov_weight;
 
-	/* Pipe queues */
-	uint8_t  wrr_cost[RTE_SCHED_QUEUES_PER_PIPE];
+	/* Pipe best-effort traffic class queues */
+	uint8_t n_be_queues;
+	uint8_t  wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
 struct rte_sched_pipe {
@@ -100,8 +103,10 @@ struct rte_sched_pipe {
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 
+	uint8_t n_be_queues; /* Best effort traffic class queues */
+
 	/* Weighted Round Robin (WRR) */
-	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
+	uint8_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
 
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
@@ -153,16 +158,16 @@ struct rte_sched_grinder {
 	uint32_t tc_index;
 	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint16_t qsize;
+	uint32_t qindex[RTE_SCHED_MAX_QUEUES_PER_TC];
+	uint16_t qsize[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint32_t qmask;
 	uint32_t qpos;
 	struct rte_mbuf *pkt;
 
 	/* WRR */
-	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+	uint16_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
+	uint16_t wrr_mask[RTE_SCHED_BE_QUEUES_PER_PIPE];
+	uint8_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
 struct rte_sched_port {
@@ -301,7 +306,6 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 		if (params->wrr_weights[i] == 0)
 			return -16;
 	}
-
 	return 0;
 }
 
@@ -483,7 +487,7 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
 		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
 		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu]\n",
+		"    WRR cost: [%hhu, %hhu, %hhu, %hhu]\n",
 		i,
 
 		/* Token bucket */
@@ -502,10 +506,7 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_ov_weight,
 
 		/* WRR */
-		p->wrr_cost[ 0], p->wrr_cost[ 1], p->wrr_cost[ 2], p->wrr_cost[ 3],
-		p->wrr_cost[ 4], p->wrr_cost[ 5], p->wrr_cost[ 6], p->wrr_cost[ 7],
-		p->wrr_cost[ 8], p->wrr_cost[ 9], p->wrr_cost[10], p->wrr_cost[11],
-		p->wrr_cost[12], p->wrr_cost[13], p->wrr_cost[14], p->wrr_cost[15]);
+		p->wrr_cost[0], p->wrr_cost[1], p->wrr_cost[2], p->wrr_cost[3]);
 }
 
 static inline uint64_t
@@ -519,10 +520,12 @@ rte_sched_time_ms_to_bytes(uint32_t time_ms, uint32_t rate)
 }
 
 static void
-rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
+rte_sched_pipe_profile_convert(struct rte_sched_port *port,
+	struct rte_sched_pipe_params *src,
 	struct rte_sched_pipe_profile *dst,
 	uint32_t rate)
 {
+	uint32_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 	uint32_t i;
 
 	/* Token Bucket */
@@ -553,18 +556,36 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 	dst->tc_ov_weight = src->tc_ov_weight;
 #endif
 
-	/* WRR */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		uint32_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-		uint32_t lcd, lcd1, lcd2;
-		uint32_t qindex;
+	/* WRR queues */
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++)
+		if (port->qsize[i])
+			dst->n_be_queues++;
+
+	if (dst->n_be_queues == 1)
+		dst->wrr_cost[0] = src->wrr_weights[0];
+
+	if (dst->n_be_queues == 2) {
+		uint32_t lcd;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+
+		lcd = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
+
+		wrr_cost[0] = lcd / wrr_cost[0];
+		wrr_cost[1] = lcd / wrr_cost[1];
 
-		qindex = i * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+	}
 
-		wrr_cost[0] = src->wrr_weights[qindex];
-		wrr_cost[1] = src->wrr_weights[qindex + 1];
-		wrr_cost[2] = src->wrr_weights[qindex + 2];
-		wrr_cost[3] = src->wrr_weights[qindex + 3];
+	if (dst->n_be_queues == 4) {
+		uint32_t lcd1, lcd2, lcd;
+
+		wrr_cost[0] = src->wrr_weights[0];
+		wrr_cost[1] = src->wrr_weights[1];
+		wrr_cost[2] = src->wrr_weights[2];
+		wrr_cost[3] = src->wrr_weights[3];
 
 		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
 		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
@@ -575,10 +596,10 @@ rte_sched_pipe_profile_convert(struct rte_sched_pipe_params *src,
 		wrr_cost[2] = lcd / wrr_cost[2];
 		wrr_cost[3] = lcd / wrr_cost[3];
 
-		dst->wrr_cost[qindex] = (uint8_t) wrr_cost[0];
-		dst->wrr_cost[qindex + 1] = (uint8_t) wrr_cost[1];
-		dst->wrr_cost[qindex + 2] = (uint8_t) wrr_cost[2];
-		dst->wrr_cost[qindex + 3] = (uint8_t) wrr_cost[3];
+		dst->wrr_cost[0] = (uint8_t) wrr_cost[0];
+		dst->wrr_cost[1] = (uint8_t) wrr_cost[1];
+		dst->wrr_cost[2] = (uint8_t) wrr_cost[2];
+		dst->wrr_cost[3] = (uint8_t) wrr_cost[3];
 	}
 }
 
@@ -592,7 +613,7 @@ rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
 		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
 		struct rte_sched_pipe_profile *dst = port->pipe_profiles + i;
 
-		rte_sched_pipe_profile_convert(src, dst, params->rate);
+		rte_sched_pipe_profile_convert(port, src, dst, params->rate);
 		rte_sched_port_log_pipe_profile(port, i);
 	}
 
@@ -976,7 +997,7 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 		return status;
 
 	pp = &port->pipe_profiles[port->n_pipe_profiles];
-	rte_sched_pipe_profile_convert(params, pp, port->rate);
+	rte_sched_pipe_profile_convert(port, params, pp, port->rate);
 
 	/* Pipe profile not exists */
 	for (i = 0; i < port->n_pipe_profiles; i++)
@@ -1715,6 +1736,7 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_queue *queue = grinder->queue[grinder->qpos];
 	struct rte_mbuf *pkt = grinder->pkt;
 	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
+	int be_tc_active;
 
 	if (!grinder_credits_check(port, pos))
 		return 0;
@@ -1725,13 +1747,18 @@ grinder_schedule(struct rte_sched_port *port, uint32_t pos)
 	/* Send packet */
 	port->pkts_out[port->n_pkts_out++] = pkt;
 	queue->qr++;
-	grinder->wrr_tokens[grinder->qpos] += pkt_len * grinder->wrr_cost[grinder->qpos];
+
+	be_tc_active = (grinder->tc_index == RTE_SCHED_TRAFFIC_CLASS_BE);
+	grinder->wrr_tokens[grinder->qpos] +=
+		pkt_len * grinder->wrr_cost[grinder->qpos] * be_tc_active;
+
 	if (queue->qr == queue->qw) {
 		uint32_t qindex = grinder->qindex[grinder->qpos];
 
 		rte_bitmap_clear(port->bmp, qindex);
 		grinder->qmask &= ~(1 << grinder->qpos);
-		grinder->wrr_mask[grinder->qpos] = 0;
+		if (be_tc_active)
+			grinder->wrr_mask[grinder->qpos] = 0;
 		rte_sched_port_set_queue_empty_timestamp(port, qindex);
 	}
 
@@ -1877,7 +1904,7 @@ grinder_next_tc(struct rte_sched_port *port, uint32_t pos)
 
 	grinder->tc_index = (qindex >> 2) & 0x3;
 	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
-	grinder->qsize = qsize;
+	grinder->qsize[grinder->tc_index] = qsize;
 
 	grinder->qindex[0] = qindex;
 	grinder->qindex[1] = qindex + 1;
@@ -1962,26 +1989,16 @@ grinder_wrr_load(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *pipe_params = grinder->pipe_params;
-	uint32_t tc_index = grinder->tc_index;
 	uint32_t qmask = grinder->qmask;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
-
-	grinder->wrr_tokens[0] = ((uint16_t) pipe->wrr_tokens[qindex]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[1] = ((uint16_t) pipe->wrr_tokens[qindex + 1]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[2] = ((uint16_t) pipe->wrr_tokens[qindex + 2]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[3] = ((uint16_t) pipe->wrr_tokens[qindex + 3]) << RTE_SCHED_WRR_SHIFT;
-
-	grinder->wrr_mask[0] = (qmask & 0x1) * 0xFFFF;
-	grinder->wrr_mask[1] = ((qmask >> 1) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[2] = ((qmask >> 2) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[3] = ((qmask >> 3) & 0x1) * 0xFFFF;
+	uint32_t qindex = grinder->qindex[0];
+	uint32_t i;
 
-	grinder->wrr_cost[0] = pipe_params->wrr_cost[qindex];
-	grinder->wrr_cost[1] = pipe_params->wrr_cost[qindex + 1];
-	grinder->wrr_cost[2] = pipe_params->wrr_cost[qindex + 2];
-	grinder->wrr_cost[3] = pipe_params->wrr_cost[qindex + 3];
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		grinder->wrr_tokens[i] =
+			((uint16_t) pipe->wrr_tokens[qindex + i]) << RTE_SCHED_WRR_SHIFT;
+		grinder->wrr_mask[i] = ((qmask >> i) & 0x1) * 0xFFFF;
+		grinder->wrr_cost[i] = pipe_params->wrr_cost[qindex + i];
+	}
 }
 
 static inline void
@@ -1989,19 +2006,12 @@ grinder_wrr_store(struct rte_sched_port *port, uint32_t pos)
 {
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	struct rte_sched_pipe *pipe = grinder->pipe;
-	uint32_t tc_index = grinder->tc_index;
-	uint32_t qindex;
-
-	qindex = tc_index * 4;
+	uint32_t i;
 
-	pipe->wrr_tokens[qindex] = (grinder->wrr_tokens[0] & grinder->wrr_mask[0])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 1] = (grinder->wrr_tokens[1] & grinder->wrr_mask[1])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 2] = (grinder->wrr_tokens[2] & grinder->wrr_mask[2])
-		>> RTE_SCHED_WRR_SHIFT;
-	pipe->wrr_tokens[qindex + 3] = (grinder->wrr_tokens[3] & grinder->wrr_mask[3])
-		>> RTE_SCHED_WRR_SHIFT;
+	for (i = 0; i < pipe->n_be_queues; i++)
+		pipe->wrr_tokens[i] =
+			(grinder->wrr_tokens[i] & grinder->wrr_mask[i]) >>
+				RTE_SCHED_WRR_SHIFT;
 }
 
 static inline void
@@ -2040,22 +2050,31 @@ static inline void
 grinder_prefetch_tc_queue_arrays(struct rte_sched_port *port, uint32_t pos)
 {
 	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint16_t qsize, qr[4];
+	struct rte_sched_pipe *pipe = grinder->pipe;
+	struct rte_sched_queue *queue;
+	uint32_t i;
+	uint16_t qsize, qr[RTE_SCHED_MAX_QUEUES_PER_TC];
 
-	qsize = grinder->qsize;
-	qr[0] = grinder->queue[0]->qr & (qsize - 1);
-	qr[1] = grinder->queue[1]->qr & (qsize - 1);
-	qr[2] = grinder->queue[2]->qr & (qsize - 1);
-	qr[3] = grinder->queue[3]->qr & (qsize - 1);
+	grinder->qpos = 0;
+	if (grinder->tc_index < RTE_SCHED_TRAFFIC_CLASS_BE) {
+		queue = grinder->queue[0];
+		qsize = grinder->qsize[0];
+		qr[0] = queue->qr & (qsize - 1);
 
-	rte_prefetch0(grinder->qbase[0] + qr[0]);
-	rte_prefetch0(grinder->qbase[1] + qr[1]);
+		rte_prefetch0(grinder->qbase[0] + qr[0]);
+		return;
+	}
+
+	for (i = 0; i < pipe->n_be_queues; i++) {
+		queue = grinder->queue[i];
+		qsize = grinder->qsize[i];
+		qr[i] = queue->qr & (qsize - 1);
+
+		rte_prefetch0(grinder->qbase[i] + qr[i]);
+	}
 
 	grinder_wrr_load(port, pos);
 	grinder_wrr(port, pos);
-
-	rte_prefetch0(grinder->qbase[2] + qr[2]);
-	rte_prefetch0(grinder->qbase[3] + qr[3]);
 }
 
 static inline void
@@ -2064,7 +2083,7 @@ grinder_prefetch_mbuf(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	uint32_t qpos = grinder->qpos;
 	struct rte_mbuf **qbase = grinder->qbase[qpos];
-	uint16_t qsize = grinder->qsize;
+	uint16_t qsize = grinder->qsize[qpos];
 	uint16_t qr = grinder->queue[qpos]->qr & (qsize - 1);
 
 	grinder->pkt = qbase[qr];
@@ -2118,18 +2137,24 @@ grinder_handle(struct rte_sched_port *port, uint32_t pos)
 
 	case e_GRINDER_READ_MBUF:
 	{
-		uint32_t result = 0;
+		uint32_t wrr_active, result = 0;
 
 		result = grinder_schedule(port, pos);
 
+		wrr_active = (grinder->tc_index == RTE_SCHED_TRAFFIC_CLASS_BE);
+
 		/* Look for next packet within the same TC */
 		if (result && grinder->qmask) {
-			grinder_wrr(port, pos);
+			if (wrr_active)
+				grinder_wrr(port, pos);
+
 			grinder_prefetch_mbuf(port, pos);
 
 			return 1;
 		}
-		grinder_wrr_store(port, pos);
+
+		if (wrr_active)
+			grinder_wrr_store(port, pos);
 
 		/* Look for another active TC within same pipe */
 		if (grinder_next_tc(port, pos)) {
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index d61dda9f5..2a935998a 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -66,6 +66,22 @@ extern "C" {
 #include "rte_red.h"
 #endif
 
+/** Maximum number of queues per pipe.
+ * Note that multiple queues (a power of 2) can only be assigned to the
+ * lowest priority (best-effort) traffic class. Each higher priority traffic
+ * class can only have one queue.
+ * Cannot be changed.
+ *
+ * @see struct rte_sched_port_params
+ */
+#define RTE_SCHED_QUEUES_PER_PIPE    16
+
+/** Number of WRR queues for best-effort traffic class per pipe.
+ *
+ * @see struct rte_sched_pipe_params
+ */
+#define RTE_SCHED_BE_QUEUES_PER_PIPE    4
+
 /** Number of traffic classes per pipe (as well as subport).
  * Cannot be changed.
  */
@@ -74,11 +90,6 @@ extern "C" {
 /** Number of queues per pipe traffic class. Cannot be changed. */
 #define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
 
-/** Number of queues per pipe. */
-#define RTE_SCHED_QUEUES_PER_PIPE             \
-	(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *     \
-	RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
-
 /** Maximum number of pipe profiles that can be defined per port.
  * Compile-time configurable.
  */
@@ -165,7 +176,7 @@ struct rte_sched_pipe_params {
 #endif
 
 	/* Pipe queues */
-	uint8_t  wrr_weights[RTE_SCHED_QUEUES_PER_PIPE]; /**< WRR weights */
+	uint8_t  wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE]; /**< WRR weights */
 };
 
 /** Queue statistics */
-- 
2.21.0



* [dpdk-dev] [PATCH v3 02/11] sched: add config flexibility to tc queue sizes
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 01/11] sched: remove wrr from strict priority tc queues Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 03/11] sched: add max pipe profiles config in run time Jasvinder Singh
                           ` (8 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Add support for zero-size traffic class queues. Queues that are not
used can be configured with size zero, which reduces the memory
footprint of the hierarchical scheduler.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 354 +++++++++++++++++++----------------
 lib/librte_sched/rte_sched.h |  12 +-
 2 files changed, 198 insertions(+), 168 deletions(-)
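
Note for readers: the following is a minimal standalone sketch, not part of
the patch, of how zero-size queues interact with the per-queue size check and
the per-pipe offset table (qsize_add) introduced below. The SKETCH_* constants
and sketch_* helpers are hypothetical names used only for illustration. The
best-effort TC index of 12 is an assumption inferred from 16 queues per pipe
minus 4 best-effort queues, and the local power-of-2 helper stands in for
rte_is_power_of_2().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the patch; SKETCH_TC_BE = 12 is an assumption (see note). */
#define SKETCH_QUEUES_PER_PIPE    16
#define SKETCH_BE_QUEUES_PER_PIPE 4
#define SKETCH_TC_BE (SKETCH_QUEUES_PER_PIPE - SKETCH_BE_QUEUES_PER_PIPE)

/* Local stand-in for rte_is_power_of_2(). */
static int
sketch_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Per-queue size check: zero is accepted for any queue except the first
 * best-effort queue; every non-zero size must be a power of 2.
 */
static int
sketch_check_qsize(const uint16_t *qsize)
{
	uint32_t i;

	assert(qsize != NULL);
	for (i = 0; i < SKETCH_QUEUES_PER_PIPE; i++) {
		if ((qsize[i] != 0 && !sketch_is_power_of_2(qsize[i])) ||
			(i == SKETCH_TC_BE && qsize[i] == 0))
			return -8;
	}
	return 0;
}

/*
 * Prefix sum of queue sizes: qsize_add[i] is the offset (in mbuf pointers)
 * of queue i inside the per-pipe queue array. A zero-size queue adds
 * nothing to the offsets that follow it. Returns the per-pipe total.
 */
static uint32_t
sketch_qsize_offsets(const uint16_t *qsize, uint32_t *qsize_add)
{
	uint32_t i;

	qsize_add[0] = 0;
	for (i = 1; i < SKETCH_QUEUES_PER_PIPE; i++)
		qsize_add[i] = qsize_add[i - 1] + qsize[i - 1];

	return qsize_add[SKETCH_QUEUES_PER_PIPE - 1] +
		qsize[SKETCH_QUEUES_PER_PIPE - 1];
}
```

With only queues 0, 1, 12 and 13 given non-zero sizes, the unused queues in
between share the same offset and contribute no entries to the per-pipe total.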

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index eac995680..34f96a46c 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -149,15 +149,15 @@ struct rte_sched_grinder {
 	struct rte_sched_pipe_profile *pipe_params;
 
 	/* TC cache */
-	uint8_t tccache_qmask[4];
-	uint32_t tccache_qindex[4];
+	uint8_t tccache_qmask[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t tccache_qindex[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t tccache_w;
 	uint32_t tccache_r;
 
 	/* Current TC */
 	uint32_t tc_index;
-	struct rte_sched_queue *queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	struct rte_mbuf **qbase[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	struct rte_sched_queue *queue[RTE_SCHED_MAX_QUEUES_PER_TC];
+	struct rte_mbuf **qbase[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint32_t qindex[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint16_t qsize[RTE_SCHED_MAX_QUEUES_PER_TC];
 	uint32_t qmask;
@@ -178,7 +178,7 @@ struct rte_sched_port {
 	uint32_t rate;
 	uint32_t mtu;
 	uint32_t frame_overhead;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	uint32_t n_pipe_profiles;
 	uint32_t pipe_tc3_rate_max;
 #ifdef RTE_SCHED_RED
@@ -260,14 +260,13 @@ rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
 static inline uint16_t
 rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
 {
-	uint32_t tc = (qindex >> 2) & 0x3;
-
-	return port->qsize[tc];
+	uint32_t qpos = qindex & 0xF;
+	return port->qsize[qpos];
 }
 
 static int
 pipe_profile_check(struct rte_sched_pipe_params *params,
-	uint32_t rate)
+	uint32_t rate, uint16_t *qsize)
 {
 	uint32_t i;
 
@@ -285,12 +284,17 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 		return -12;
 
 	/* TC rate: non-zero, less than pipe rate */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-			params->tc_rate[i] > params->tb_rate)
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		if ((qsize[i] == 0 && params->tc_rate[i] != 0) ||
+			(qsize[i] != 0 && (params->tc_rate[i] == 0 ||
+			params->tc_rate[i] > params->tb_rate)))
 			return -13;
+
 	}
 
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
+		return -13;
+
 	/* TC period: non-zero */
 	if (params->tc_period == 0)
 		return -14;
@@ -302,8 +306,10 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 #endif
 
 	/* Queue WRR weights: non-zero */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
-		if (params->wrr_weights[i] == 0)
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+		uint32_t qindex = RTE_SCHED_TRAFFIC_CLASS_BE + i;
+		if ((qsize[qindex] != 0 && params->wrr_weights[i] == 0) ||
+			(qsize[qindex] == 0 && params->wrr_weights[i] != 0))
 			return -16;
 	}
 	return 0;
@@ -343,10 +349,10 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 	/* qsize: non-zero, power of 2,
 	 * no bigger than 32K (due to 16-bit read/write pointers)
 	 */
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		uint16_t qsize = params->qsize[i];
-
-		if (qsize == 0 || !rte_is_power_of_2(qsize))
+		if ((qsize != 0 && !rte_is_power_of_2(qsize)) ||
+			((i == RTE_SCHED_TRAFFIC_CLASS_BE) && (qsize == 0)))
 			return -8;
 	}
 
@@ -360,7 +366,7 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
 		int status;
 
-		status = pipe_profile_check(p, params->rate);
+		status = pipe_profile_check(p, params->rate, &params->qsize[0]);
 		if (status != 0)
 			return status;
 	}
@@ -389,9 +395,9 @@ rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sch
 	uint32_t base, i;
 
 	size_per_pipe_queue_array = 0;
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		size_per_pipe_queue_array += RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS
-			* params->qsize[i] * sizeof(struct rte_mbuf *);
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		size_per_pipe_queue_array +=
+			params->qsize[i] * sizeof(struct rte_mbuf *);
 	}
 	size_queue_array = n_pipes_per_port * size_per_pipe_queue_array;
 
@@ -451,31 +457,12 @@ rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params)
 static void
 rte_sched_port_config_qsize(struct rte_sched_port *port)
 {
-	/* TC 0 */
+	uint32_t i;
 	port->qsize_add[0] = 0;
-	port->qsize_add[1] = port->qsize_add[0] + port->qsize[0];
-	port->qsize_add[2] = port->qsize_add[1] + port->qsize[0];
-	port->qsize_add[3] = port->qsize_add[2] + port->qsize[0];
-
-	/* TC 1 */
-	port->qsize_add[4] = port->qsize_add[3] + port->qsize[0];
-	port->qsize_add[5] = port->qsize_add[4] + port->qsize[1];
-	port->qsize_add[6] = port->qsize_add[5] + port->qsize[1];
-	port->qsize_add[7] = port->qsize_add[6] + port->qsize[1];
-
-	/* TC 2 */
-	port->qsize_add[8] = port->qsize_add[7] + port->qsize[1];
-	port->qsize_add[9] = port->qsize_add[8] + port->qsize[2];
-	port->qsize_add[10] = port->qsize_add[9] + port->qsize[2];
-	port->qsize_add[11] = port->qsize_add[10] + port->qsize[2];
-
-	/* TC 3 */
-	port->qsize_add[12] = port->qsize_add[11] + port->qsize[2];
-	port->qsize_add[13] = port->qsize_add[12] + port->qsize[3];
-	port->qsize_add[14] = port->qsize_add[13] + port->qsize[3];
-	port->qsize_add[15] = port->qsize_add[14] + port->qsize[3];
-
-	port->qsize_sum = port->qsize_add[15] + port->qsize[3];
+	for (i = 1; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+		port->qsize_add[i] = port->qsize_add[i-1] + port->qsize[i-1];
+
+	port->qsize_sum = port->qsize_add[15] + port->qsize[15];
 }
 
 static void
@@ -484,10 +471,11 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 	struct rte_sched_pipe_profile *p = port->pipe_profiles + i;
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for pipe profile %u:\n"
-		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: weight = %hhu\n"
-		"    WRR cost: [%hhu, %hhu, %hhu, %hhu]\n",
+		"	Token bucket: period = %u, credits per period = %u, size = %u\n"
+		"	Traffic classes: period = %u,\n"
+		"	credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"	Best-effort traffic class oversubscription: weight = %hhu\n"
+		"	WRR cost: [%hhu, %hhu, %hhu, %hhu]\n",
 		i,
 
 		/* Token bucket */
@@ -501,8 +489,17 @@ rte_sched_port_log_pipe_profile(struct rte_sched_port *port, uint32_t i)
 		p->tc_credits_per_period[1],
 		p->tc_credits_per_period[2],
 		p->tc_credits_per_period[3],
-
-		/* Traffic class 3 oversubscription */
+		p->tc_credits_per_period[4],
+		p->tc_credits_per_period[5],
+		p->tc_credits_per_period[6],
+		p->tc_credits_per_period[7],
+		p->tc_credits_per_period[8],
+		p->tc_credits_per_period[9],
+		p->tc_credits_per_period[10],
+		p->tc_credits_per_period[11],
+		p->tc_credits_per_period[12],
+
+		/* Best-effort traffic class oversubscription */
 		p->tc_ov_weight,
 
 		/* WRR */
@@ -548,9 +545,10 @@ rte_sched_pipe_profile_convert(struct rte_sched_port *port,
 						rate);
 
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		dst->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(src->tc_period,
-				src->tc_rate[i]);
+		if (port->qsize[i])
+			dst->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(src->tc_period,
+					src->tc_rate[i]);
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	dst->tc_ov_weight = src->tc_ov_weight;
@@ -558,7 +556,7 @@ rte_sched_pipe_profile_convert(struct rte_sched_port *port,
 
 	/* WRR queues */
 	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++)
-		if (port->qsize[i])
+		if (port->qsize[RTE_SCHED_TRAFFIC_CLASS_BE + i])
 			dst->n_be_queues++;
 
 	if (dst->n_be_queues == 1)
@@ -620,7 +618,7 @@ rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
 	port->pipe_tc3_rate_max = 0;
 	for (i = 0; i < port->n_pipe_profiles; i++) {
 		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		uint32_t pipe_tc3_rate = src->tc_rate[3];
+		uint32_t pipe_tc3_rate = src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
 		if (port->pipe_tc3_rate_max < pipe_tc3_rate)
 			port->pipe_tc3_rate_max = pipe_tc3_rate;
@@ -762,12 +760,14 @@ rte_sched_port_free(struct rte_sched_port *port)
 	for (qindex = 0; qindex < n_queues_per_port; qindex++) {
 		struct rte_mbuf **mbufs = rte_sched_port_qbase(port, qindex);
 		uint16_t qsize = rte_sched_port_qsize(port, qindex);
-		struct rte_sched_queue *queue = port->queue + qindex;
-		uint16_t qr = queue->qr & (qsize - 1);
-		uint16_t qw = queue->qw & (qsize - 1);
+		if (qsize != 0) {
+			struct rte_sched_queue *queue = port->queue + qindex;
+			uint16_t qr = queue->qr & (qsize - 1);
+			uint16_t qw = queue->qw & (qsize - 1);
 
-		for (; qr != qw; qr = (qr + 1) & (qsize - 1))
-			rte_pktmbuf_free(mbufs[qr]);
+			for (; qr != qw; qr = (qr + 1) & (qsize - 1))
+				rte_pktmbuf_free(mbufs[qr]);
+		}
 	}
 
 	rte_bitmap_free(port->bmp);
@@ -780,9 +780,10 @@ rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 	struct rte_sched_subport *s = port->subport + i;
 
 	RTE_LOG(DEBUG, SCHED, "Low level config for subport %u:\n"
-		"    Token bucket: period = %u, credits per period = %u, size = %u\n"
-		"    Traffic classes: period = %u, credits per period = [%u, %u, %u, %u]\n"
-		"    Traffic class 3 oversubscription: wm min = %u, wm max = %u\n",
+		"	Token bucket: period = %u, credits per period = %u, size = %u\n"
+		"	Traffic classes: period = %u\n"
+		"	credits per period = [%u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u, %u]\n"
+		"	Best effort traffic class oversubscription: wm min = %u, wm max = %u\n",
 		i,
 
 		/* Token bucket */
@@ -796,8 +797,17 @@ rte_sched_port_log_subport_config(struct rte_sched_port *port, uint32_t i)
 		s->tc_credits_per_period[1],
 		s->tc_credits_per_period[2],
 		s->tc_credits_per_period[3],
-
-		/* Traffic class 3 oversubscription */
+		s->tc_credits_per_period[4],
+		s->tc_credits_per_period[5],
+		s->tc_credits_per_period[6],
+		s->tc_credits_per_period[7],
+		s->tc_credits_per_period[8],
+		s->tc_credits_per_period[9],
+		s->tc_credits_per_period[10],
+		s->tc_credits_per_period[11],
+		s->tc_credits_per_period[12],
+
+		/* Best effort traffic class oversubscription */
 		s->tc_ov_wm_min,
 		s->tc_ov_wm_max);
 }
@@ -808,7 +818,7 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	struct rte_sched_subport_params *params)
 {
 	struct rte_sched_subport *s;
-	uint32_t i;
+	uint32_t i, j;
 
 	/* Check user parameters */
 	if (port == NULL ||
@@ -822,12 +832,24 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	if (params->tb_size == 0)
 		return -3;
 
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		if (params->tc_rate[i] == 0 ||
-		    params->tc_rate[i] > params->tb_rate)
-			return -4;
+	for (i = 0, j = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+		uint32_t tc_rate = params->tc_rate[j];
+		uint16_t qsize = port->qsize[i];
+
+		if (((qsize == 0) &&
+			((tc_rate != 0) && (j != RTE_SCHED_TRAFFIC_CLASS_BE))) ||
+			((qsize != 0) && (tc_rate == 0)) ||
+			(tc_rate > params->tb_rate))
+			return -3;
+
+		if (j < RTE_SCHED_TRAFFIC_CLASS_BE)
+			j++;
 	}
 
+	if (port->qsize[RTE_SCHED_TRAFFIC_CLASS_BE] == 0 ||
+		params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
+		return -3;
+
 	if (params->tc_period == 0)
 		return -5;
 
@@ -851,13 +873,16 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	/* Traffic Classes (TCs) */
 	s->tc_period = rte_sched_time_ms_to_bytes(params->tc_period, port->rate);
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
-		s->tc_credits_per_period[i]
-			= rte_sched_time_ms_to_bytes(params->tc_period,
-						     params->tc_rate[i]);
+		if (port->qsize[i])
+			s->tc_credits_per_period[i]
+				= rte_sched_time_ms_to_bytes(params->tc_period,
+								 params->tc_rate[i]);
+
 	}
 	s->tc_time = port->time + s->tc_period;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		s->tc_credits[i] = s->tc_credits_per_period[i];
+		if (port->qsize[i])
+			s->tc_credits[i] = s->tc_credits_per_period[i];
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC oversubscription */
@@ -910,9 +935,9 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 		params = port->pipe_profiles + p->profile;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc3_rate = (double) s->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc3_rate = (double) params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
 		uint32_t tc3_ov = s->tc_ov;
 
@@ -945,15 +970,19 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 
 	/* Traffic Classes (TCs) */
 	p->tc_time = port->time + params->tc_period;
+	p->n_be_queues = params->n_be_queues;
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
-		p->tc_credits[i] = params->tc_credits_per_period[i];
+		if (port->qsize[i])
+			p->tc_credits[i] = params->tc_credits_per_period[i];
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	{
 		/* Subport TC3 oversubscription */
-		double subport_tc3_rate = (double) s->tc_credits_per_period[3]
+		double subport_tc3_rate =
+			(double) s->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[3]
+		double pipe_tc3_rate =
+			(double) params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
 		uint32_t tc3_ov = s->tc_ov;
 
@@ -992,7 +1021,7 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 		return -2;
 
 	/* Pipe params */
-	status = pipe_profile_check(params, port->rate);
+	status = pipe_profile_check(params, port->rate, &port->qsize[0]);
 	if (status != 0)
 		return status;
 
@@ -1008,8 +1037,8 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	*pipe_profile_id = port->n_pipe_profiles;
 	port->n_pipe_profiles++;
 
-	if (port->pipe_tc3_rate_max < params->tc_rate[3])
-		port->pipe_tc3_rate_max = params->tc_rate[3];
+	if (port->pipe_tc3_rate_max < params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE])
+		port->pipe_tc3_rate_max = params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
 	rte_sched_port_log_pipe_profile(port, *pipe_profile_id);
 
@@ -1020,15 +1049,12 @@ static inline uint32_t
 rte_sched_port_qindex(struct rte_sched_port *port,
 	uint32_t subport,
 	uint32_t pipe,
-	uint32_t traffic_class,
 	uint32_t queue)
 {
 	return ((subport & (port->n_subports_per_port - 1)) <<
 			(port->n_pipes_per_subport_log2 + 4)) |
 			((pipe & (port->n_pipes_per_subport - 1)) << 4) |
-			((traffic_class &
-			    (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)) << 2) |
-			(queue & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1));
+			(queue & (RTE_SCHED_QUEUES_PER_PIPE - 1));
 }
 
 void
@@ -1038,8 +1064,8 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
 			 uint32_t traffic_class,
 			 uint32_t queue, enum rte_color color)
 {
-	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe,
-			traffic_class, queue);
+	uint32_t queue_id = rte_sched_port_qindex(port, subport, pipe, queue);
+
 	rte_mbuf_sched_set(pkt, queue_id, traffic_class, (uint8_t)color);
 }
 
@@ -1050,12 +1076,12 @@ rte_sched_port_pkt_read_tree_path(struct rte_sched_port *port,
 				  uint32_t *traffic_class, uint32_t *queue)
 {
 	uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
+	uint32_t tc_id = rte_mbuf_sched_traffic_class_get(pkt);
 
 	*subport = queue_id >> (port->n_pipes_per_subport_log2 + 4);
 	*pipe = (queue_id >> 4) & (port->n_pipes_per_subport - 1);
-	*traffic_class = (queue_id >> 2) &
-				(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1);
-	*queue = queue_id & (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1);
+	*traffic_class = tc_id;
+	*queue = queue_id & (RTE_SCHED_QUEUES_PER_PIPE - 1);
 }
 
 enum rte_color
@@ -1136,7 +1162,7 @@ static inline void
 rte_sched_port_update_subport_stats(struct rte_sched_port *port, uint32_t qindex, struct rte_mbuf *pkt)
 {
 	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
 	s->stats.n_pkts_tc[tc_index] += 1;
@@ -1156,7 +1182,7 @@ rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port *port,
 #endif
 {
 	struct rte_sched_subport *s = port->subport + (qindex / rte_sched_port_queues_per_subport(port));
-	uint32_t tc_index = (qindex >> 2) & 0x3;
+	uint32_t tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	uint32_t pkt_len = pkt->pkt_len;
 
 	s->stats.n_pkts_tc_dropped[tc_index] += 1;
@@ -1211,7 +1237,7 @@ rte_sched_port_red_drop(struct rte_sched_port *port, struct rte_mbuf *pkt, uint3
 	uint32_t tc_index;
 	enum rte_color color;
 
-	tc_index = (qindex >> 2) & 0x3;
+	tc_index = rte_mbuf_sched_traffic_class_get(pkt);
 	color = rte_sched_port_pkt_read_color(pkt);
 	red_cfg = &port->red_config[tc_index][color];
 
@@ -1528,6 +1554,7 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1543,19 +1570,17 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
+
 		subport->tc_time = port->time + subport->tc_period;
 	}
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
+
 		pipe->tc_time = port->time + params->tc_period;
 	}
 }
@@ -1568,21 +1593,29 @@ grinder_tc_ov_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_grinder *grinder = port->grinder + pos;
 	struct rte_sched_subport *subport = grinder->subport;
 	uint32_t tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_ov_consumption_max;
+	uint32_t tc_consumption = 0, tc_ov_consumption_max;
 	uint32_t tc_ov_wm = subport->tc_ov_wm;
+	uint32_t i;
 
 	if (subport->tc_ov == 0)
 		return subport->tc_ov_wm_max;
 
-	tc_ov_consumption[0] = subport->tc_credits_per_period[0] - subport->tc_credits[0];
-	tc_ov_consumption[1] = subport->tc_credits_per_period[1] - subport->tc_credits[1];
-	tc_ov_consumption[2] = subport->tc_credits_per_period[2] - subport->tc_credits[2];
-	tc_ov_consumption[3] = subport->tc_credits_per_period[3] - subport->tc_credits[3];
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		tc_ov_consumption[i] =
+			subport->tc_credits_per_period[i] - subport->tc_credits[i];
+		tc_consumption += tc_ov_consumption[i];
+	}
+
+	tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] -
+		subport->tc_credits[RTE_SCHED_TRAFFIC_CLASS_BE];
+
 
-	tc_ov_consumption_max = subport->tc_credits_per_period[3] -
-		(tc_ov_consumption[0] + tc_ov_consumption[1] + tc_ov_consumption[2]);
+	tc_ov_consumption_max =
+		subport->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE] - tc_consumption;
 
-	if (tc_ov_consumption[3] > (tc_ov_consumption_max - port->mtu)) {
+	if (tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASS_BE] >
+		(tc_ov_consumption_max - port->mtu)) {
 		tc_ov_wm  -= tc_ov_wm >> 7;
 		if (tc_ov_wm < subport->tc_ov_wm_min)
 			tc_ov_wm = subport->tc_ov_wm_min;
@@ -1605,6 +1638,7 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t i;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1622,10 +1656,8 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	if (unlikely(port->time >= subport->tc_time)) {
 		subport->tc_ov_wm = grinder_tc_ov_credits_update(port, pos);
 
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			subport->tc_credits[i] = subport->tc_credits_per_period[i];
 
 		subport->tc_time = port->time + subport->tc_period;
 		subport->tc_ov_period_id++;
@@ -1633,10 +1665,8 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
+		for (i = 0; i <= RTE_SCHED_TRAFFIC_CLASS_BE; i++)
+			pipe->tc_credits[i] = params->tc_credits_per_period[i];
 		pipe->tc_time = port->time + params->tc_period;
 	}
 
@@ -1701,11 +1731,18 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
 	uint32_t pipe_tb_credits = pipe->tb_credits;
 	uint32_t pipe_tc_credits = pipe->tc_credits[tc_index];
-	uint32_t pipe_tc_ov_mask1[] = {UINT32_MAX, UINT32_MAX, UINT32_MAX, pipe->tc_ov_credits};
-	uint32_t pipe_tc_ov_mask2[] = {0, 0, 0, UINT32_MAX};
-	uint32_t pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+	uint32_t pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint32_t pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE] = {0};
+	uint32_t pipe_tc_ov_credits, i;
 	int enough_credits;
 
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+		pipe_tc_ov_mask1[i] = UINT32_MAX;
+
+	pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASS_BE] = pipe->tc_ov_credits;
+	pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE] = UINT32_MAX;
+	pipe_tc_ov_credits = pipe_tc_ov_mask1[tc_index];
+
 	/* Check pipe and subport credits */
 	enough_credits = (pkt_len <= subport_tb_credits) &&
 		(pkt_len <= subport_tc_credits) &&
@@ -1860,68 +1897,65 @@ static inline void
 grinder_tccache_populate(struct rte_sched_port *port, uint32_t pos, uint32_t qindex, uint16_t qmask)
 {
 	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint8_t b[4];
+	uint8_t b, i;
 
 	grinder->tccache_w = 0;
 	grinder->tccache_r = 0;
 
-	b[0] = (uint8_t) (qmask & 0xF);
-	b[1] = (uint8_t) ((qmask >> 4) & 0xF);
-	b[2] = (uint8_t) ((qmask >> 8) & 0xF);
-	b[3] = (uint8_t) ((qmask >> 12) & 0xF);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[0];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex;
-	grinder->tccache_w += (b[0] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[1];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 4;
-	grinder->tccache_w += (b[1] != 0);
-
-	grinder->tccache_qmask[grinder->tccache_w] = b[2];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 8;
-	grinder->tccache_w += (b[2] != 0);
+	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
+		b = (uint8_t) ((qmask >> i) & 0x1);
+		grinder->tccache_qmask[grinder->tccache_w] = b;
+		grinder->tccache_qindex[grinder->tccache_w] = qindex + i;
+		grinder->tccache_w += (b != 0);
+	}
 
-	grinder->tccache_qmask[grinder->tccache_w] = b[3];
-	grinder->tccache_qindex[grinder->tccache_w] = qindex + 12;
-	grinder->tccache_w += (b[3] != 0);
+	b = (uint8_t) (qmask >> (RTE_SCHED_TRAFFIC_CLASS_BE));
+	grinder->tccache_qmask[grinder->tccache_w] = b;
+	grinder->tccache_qindex[grinder->tccache_w] = qindex +
+		RTE_SCHED_TRAFFIC_CLASS_BE;
+	grinder->tccache_w += (b != 0);
 }
 
 static inline int
 grinder_next_tc(struct rte_sched_port *port, uint32_t pos)
 {
 	struct rte_sched_grinder *grinder = port->grinder + pos;
+	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_mbuf **qbase;
-	uint32_t qindex;
+	uint32_t qindex, qpos = 0;
 	uint16_t qsize;
 
 	if (grinder->tccache_r == grinder->tccache_w)
 		return 0;
 
 	qindex = grinder->tccache_qindex[grinder->tccache_r];
+	grinder->tc_index = qindex & 0xf;
 	qbase = rte_sched_port_qbase(port, qindex);
-	qsize = rte_sched_port_qsize(port, qindex);
-
-	grinder->tc_index = (qindex >> 2) & 0x3;
-	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
-	grinder->qsize[grinder->tc_index] = qsize;
 
-	grinder->qindex[0] = qindex;
-	grinder->qindex[1] = qindex + 1;
-	grinder->qindex[2] = qindex + 2;
-	grinder->qindex[3] = qindex + 3;
+	if (grinder->tc_index < RTE_SCHED_TRAFFIC_CLASS_BE) {
+		qsize = rte_sched_port_qsize(port, qindex);
 
-	grinder->queue[0] = port->queue + qindex;
-	grinder->queue[1] = port->queue + qindex + 1;
-	grinder->queue[2] = port->queue + qindex + 2;
-	grinder->queue[3] = port->queue + qindex + 3;
+		grinder->queue[qpos] = port->queue + qindex;
+		grinder->qbase[qpos] = qbase;
+		grinder->qindex[qpos] = qindex;
+		grinder->qsize[qpos] = qsize;
+		grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
+		grinder->tccache_r++;
 
-	grinder->qbase[0] = qbase;
-	grinder->qbase[1] = qbase + qsize;
-	grinder->qbase[2] = qbase + 2 * qsize;
-	grinder->qbase[3] = qbase + 3 * qsize;
+		return 1;
+	}
 
+	for ( ; qpos < pipe->n_be_queues; qpos++) {
+		qsize = rte_sched_port_qsize(port, qindex + qpos);
+		grinder->queue[qpos] = port->queue + qindex + qpos;
+		grinder->qbase[qpos] = qbase + qpos * qsize;
+		grinder->qindex[qpos] = qindex + qpos;
+		grinder->qsize[qpos] = qsize;
+	}
+	grinder->tc_index = RTE_SCHED_TRAFFIC_CLASS_BE;
+	grinder->qmask = grinder->tccache_qmask[grinder->tccache_r];
 	grinder->tccache_r++;
+
 	return 1;
 }
 
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 2a935998a..ae4dfb311 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -83,9 +83,9 @@ extern "C" {
 #define RTE_SCHED_BE_QUEUES_PER_PIPE    4
 
 /** Number of traffic classes per pipe (as well as subport).
- * Cannot be changed.
  */
-#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    4
+#define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
+(RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
 
 /** Number of queues per pipe traffic class. Cannot be changed. */
 #define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS    4
@@ -171,9 +171,7 @@ struct rte_sched_pipe_params {
 	/**< Traffic class rates (measured in bytes per second) */
 	uint32_t tc_period;
 	/**< Enforcement period (measured in milliseconds) */
-#ifdef RTE_SCHED_SUBPORT_TC_OV
 	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3 oversubscription */
-#endif
 
 	/* Pipe queues */
 	uint8_t  wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE]; /**< WRR weights */
@@ -206,11 +204,9 @@ struct rte_sched_port_params {
 					  * (measured in bytes) */
 	uint32_t n_subports_per_port;    /**< Number of subports */
 	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	/**< Packet queue size for each traffic class.
-	 * All queues within the same pipe traffic class have the same
-	 * size. Queues from different pipes serving the same traffic
-	 * class have the same size. */
+	 * Queues which are not needed are allowed to have zero size. */
 	struct rte_sched_pipe_params *pipe_profiles;
 	/**< Pipe profile table.
 	 * Every pipe is configured using one of the profiles from this table. */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* [dpdk-dev] [PATCH v3 03/11] sched: add max pipe profiles config in run time
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 01/11] sched: remove wrr from strict priority tc queues Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 02/11] sched: add config flexibility to tc queue sizes Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 04/11] sched: rename tc3 params to best-effort tc Jasvinder Singh
                           ` (7 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Allow the maximum number of pipe profiles to be set at run time.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 8 +++++---
 lib/librte_sched/rte_sched.h | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)
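
Illustrative note (not part of the patch): the sketch below shows how the
profile-count check and the profile-table sizing depend on the new run-time
limit instead of the compile-time RTE_SCHED_PIPE_PROFILES_PER_PORT. The
struct and function names prefixed with demo_ are hypothetical stand-ins,
not library API; only the fields touched by this patch are modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_sched_port_params. */
struct demo_port_params {
	const void *pipe_profiles;    /* profile table supplied by the caller */
	uint32_t n_pipe_profiles;     /* profiles present at configuration time */
	uint32_t n_max_pipe_profiles; /* run-time upper bound added by this patch */
};

/* Same condition as the patched check in rte_sched_port_check_params():
 * returns 0 when the profile count is within the run-time limit, -9 otherwise.
 */
static int
demo_check_pipe_profiles(const struct demo_port_params *params)
{
	if (params->pipe_profiles == NULL ||
	    params->n_pipe_profiles == 0 ||
	    params->n_pipe_profiles > params->n_max_pipe_profiles)
		return -9;

	return 0;
}

/* The profile table is sized from the run-time maximum, so room is
 * reserved for profiles added later, up to that maximum.
 */
static size_t
demo_profile_table_size(const struct demo_port_params *params,
	size_t profile_size)
{
	return (size_t)params->n_max_pipe_profiles * profile_size;
}
```

With n_pipe_profiles = 2 and n_max_pipe_profiles = 4 the check passes and
the table is sized for 4 entries, leaving room for two more profiles to be
added later.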

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 34f96a46c..537ef2cbb 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -180,6 +180,7 @@ struct rte_sched_port {
 	uint32_t frame_overhead;
 	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	uint32_t n_pipe_profiles;
+	uint32_t n_max_pipe_profiles;
 	uint32_t pipe_tc3_rate_max;
 #ifdef RTE_SCHED_RED
 	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
@@ -359,7 +360,7 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 	/* pipe_profiles and n_pipe_profiles */
 	if (params->pipe_profiles == NULL ||
 	    params->n_pipe_profiles == 0 ||
-	    params->n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_PORT)
+	    params->n_pipe_profiles > params->n_max_pipe_profiles)
 		return -9;
 
 	for (i = 0; i < params->n_pipe_profiles; i++) {
@@ -388,7 +389,7 @@ rte_sched_port_get_array_base(struct rte_sched_port_params *params, enum rte_sch
 	uint32_t size_queue_extra
 		= n_queues_per_port * sizeof(struct rte_sched_queue_extra);
 	uint32_t size_pipe_profiles
-		= RTE_SCHED_PIPE_PROFILES_PER_PORT * sizeof(struct rte_sched_pipe_profile);
+		= params->n_max_pipe_profiles * sizeof(struct rte_sched_pipe_profile);
 	uint32_t size_bmp_array = rte_bitmap_get_memory_footprint(n_queues_per_port);
 	uint32_t size_per_pipe_queue_array, size_queue_array;
 
@@ -656,6 +657,7 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 	port->frame_overhead = params->frame_overhead;
 	memcpy(port->qsize, params->qsize, sizeof(params->qsize));
 	port->n_pipe_profiles = params->n_pipe_profiles;
+	port->n_max_pipe_profiles = params->n_max_pipe_profiles;
 
 #ifdef RTE_SCHED_RED
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
@@ -1017,7 +1019,7 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 		return -1;
 
 	/* Pipe profiles not exceeds the max limit */
-	if (port->n_pipe_profiles >= RTE_SCHED_PIPE_PROFILES_PER_PORT)
+	if (port->n_pipe_profiles >= port->n_max_pipe_profiles)
 		return -2;
 
 	/* Pipe params */
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index ae4dfb311..9cccdda41 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -211,6 +211,8 @@ struct rte_sched_port_params {
 	/**< Pipe profile table.
 	 * Every pipe is configured using one of the profiles from this table. */
 	uint32_t n_pipe_profiles;        /**< Profiles in the pipe profile table */
+	uint32_t n_max_pipe_profiles;
+	/**< Max profiles allowed in the pipe profile table */
 #ifdef RTE_SCHED_RED
 	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS]; /**< RED parameters */
 #endif
-- 
2.21.0



* [dpdk-dev] [PATCH v3 04/11] sched: rename tc3 params to best-effort tc
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (2 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 03/11] sched: add max pipe profiles config in run time Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 05/11] sched: improve error log messages Jasvinder Singh
                           ` (6 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Rename the traffic class 3 (tc3) parameters to refer to the
best-effort (be) traffic class.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 48 ++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 24 deletions(-)
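
Illustrative note (not part of the patch): the renamed variables all feed
the best-effort oversubscription test, which compares the sum of the pipe
best-effort rates against the subport best-effort rate. The sketch below
restates that bookkeeping under simplifying assumptions; the demo_ names
are hypothetical and the rates are passed in directly rather than derived
from credits per period.

```c
#include <stdint.h>

/* Hypothetical, reduced view of the subport oversubscription state. */
struct demo_subport {
	double tc_be_rate;  /* subport best-effort rate */
	double tc_ov_rate;  /* sum of best-effort rates of plugged pipes */
	uint32_t tc_ov_n;   /* sum of tc_ov_weight of plugged pipes */
	int tc_ov;          /* 1 when best-effort TC is oversubscribed */
};

/* Plug a pipe into the subport: add its rate and weight, then re-evaluate. */
static void
demo_pipe_plug(struct demo_subport *s, double pipe_tc_be_rate, uint8_t weight)
{
	s->tc_ov_n += weight;
	s->tc_ov_rate += pipe_tc_be_rate;
	s->tc_ov = s->tc_ov_rate > s->tc_be_rate;
}

/* Unplug a pipe from the subport: the inverse of demo_pipe_plug(). */
static void
demo_pipe_unplug(struct demo_subport *s, double pipe_tc_be_rate, uint8_t weight)
{
	s->tc_ov_n -= weight;
	s->tc_ov_rate -= pipe_tc_be_rate;
	s->tc_ov = s->tc_ov_rate > s->tc_be_rate;
}
```

For a subport best-effort rate of 100, plugging two pipes of rate 60 each
turns oversubscription on (120 > 100); unplugging one turns it off again.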

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 537ef2cbb..0eb25f517 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -181,7 +181,7 @@ struct rte_sched_port {
 	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	uint32_t n_pipe_profiles;
 	uint32_t n_max_pipe_profiles;
-	uint32_t pipe_tc3_rate_max;
+	uint32_t pipe_tc_be_rate_max;
 #ifdef RTE_SCHED_RED
 	struct rte_red_config red_config[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
 #endif
@@ -616,13 +616,13 @@ rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
 		rte_sched_port_log_pipe_profile(port, i);
 	}
 
-	port->pipe_tc3_rate_max = 0;
+	port->pipe_tc_be_rate_max = 0;
 	for (i = 0; i < port->n_pipe_profiles; i++) {
 		struct rte_sched_pipe_params *src = params->pipe_profiles + i;
-		uint32_t pipe_tc3_rate = src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
+		uint32_t pipe_tc_be_rate = src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
-		if (port->pipe_tc3_rate_max < pipe_tc3_rate)
-			port->pipe_tc3_rate_max = pipe_tc3_rate;
+		if (port->pipe_tc_be_rate_max < pipe_tc_be_rate)
+			port->pipe_tc_be_rate_max = pipe_tc_be_rate;
 	}
 }
 
@@ -890,7 +890,7 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	/* TC oversubscription */
 	s->tc_ov_wm_min = port->mtu;
 	s->tc_ov_wm_max = rte_sched_time_ms_to_bytes(params->tc_period,
-						     port->pipe_tc3_rate_max);
+						     port->pipe_tc_be_rate_max);
 	s->tc_ov_wm = s->tc_ov_wm_max;
 	s->tc_ov_period_id = 0;
 	s->tc_ov = 0;
@@ -937,21 +937,21 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 		params = port->pipe_profiles + p->profile;
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		double subport_tc3_rate = (double) s->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
+		double subport_tc_be_rate = (double) s->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate = (double) params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
+		double pipe_tc_be_rate = (double) params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		/* Unplug pipe from its subport */
 		s->tc_ov_n -= params->tc_ov_weight;
-		s->tc_ov_rate -= pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate -= pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is OFF (%.4lf >= %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is OFF (%.4lf >= %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 #endif
 
@@ -980,22 +980,22 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	{
 		/* Subport TC3 oversubscription */
-		double subport_tc3_rate =
+		double subport_tc_be_rate =
 			(double) s->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) s->tc_period;
-		double pipe_tc3_rate =
+		double pipe_tc_be_rate =
 			(double) params->tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASS_BE]
 			/ (double) params->tc_period;
-		uint32_t tc3_ov = s->tc_ov;
+		uint32_t tc_be_ov = s->tc_ov;
 
 		s->tc_ov_n += params->tc_ov_weight;
-		s->tc_ov_rate += pipe_tc3_rate;
-		s->tc_ov = s->tc_ov_rate > subport_tc3_rate;
+		s->tc_ov_rate += pipe_tc_be_rate;
+		s->tc_ov = s->tc_ov_rate > subport_tc_be_rate;
 
-		if (s->tc_ov != tc3_ov) {
+		if (s->tc_ov != tc_be_ov) {
 			RTE_LOG(DEBUG, SCHED,
-				"Subport %u TC3 oversubscription is ON (%.4lf < %.4lf)\n",
-				subport_id, subport_tc3_rate, s->tc_ov_rate);
+				"Subport %u Best effort TC oversubscription is ON (%.4lf < %.4lf)\n",
+				subport_id, subport_tc_be_rate, s->tc_ov_rate);
 		}
 		p->tc_ov_period_id = s->tc_ov_period_id;
 		p->tc_ov_credits = s->tc_ov_wm;
@@ -1039,8 +1039,8 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	*pipe_profile_id = port->n_pipe_profiles;
 	port->n_pipe_profiles++;
 
-	if (port->pipe_tc3_rate_max < params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE])
-		port->pipe_tc3_rate_max = params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
+	if (port->pipe_tc_be_rate_max < params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE])
+		port->pipe_tc_be_rate_max = params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
 
 	rte_sched_port_log_pipe_profile(port, *pipe_profile_id);
 
-- 
2.21.0



* [dpdk-dev] [PATCH v3 05/11] sched: improve error log messages
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (3 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 04/11] sched: rename tc3 params to best-effort tc Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 06/11] sched: improve doxygen comments Jasvinder Singh
                           ` (5 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Replace the hard-coded numeric error codes with -EINVAL and log a
descriptive error message at each failure point.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.c | 295 ++++++++++++++++++++++++++---------
 1 file changed, 221 insertions(+), 74 deletions(-)
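
Illustrative note (not part of the patch): every check in this patch follows
the same shape, which is to test one condition, log which parameter was
wrong together with the function name, and return -EINVAL. The sketch below
shows that shape in isolation. DEMO_LOG_ERR is a hypothetical stand-in for
RTE_LOG(ERR, SCHED, ...) that writes to stderr, and
demo_check_token_bucket() is not a library function.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for RTE_LOG(ERR, SCHED, ...). */
#define DEMO_LOG_ERR(...) fprintf(stderr, __VA_ARGS__)

/* One condition per check, one message per condition, -EINVAL on failure. */
static int
demo_check_token_bucket(uint32_t tb_rate, uint32_t tb_size, uint32_t port_rate)
{
	/* TB rate: non-zero, not greater than port rate */
	if (tb_rate == 0 || tb_rate > port_rate) {
		DEMO_LOG_ERR("%s: Incorrect value for tb rate\n", __func__);
		return -EINVAL;
	}

	/* TB size: non-zero */
	if (tb_size == 0) {
		DEMO_LOG_ERR("%s: Incorrect value for tb size\n", __func__);
		return -EINVAL;
	}

	return 0;
}
```

One consequence of this design is that callers can no longer tell the
failures apart by return value; the distinction moves into the log.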

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 0eb25f517..04c6b3f6a 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -272,46 +272,70 @@ pipe_profile_check(struct rte_sched_pipe_params *params,
 	uint32_t i;
 
 	/* Pipe parameters */
-	if (params == NULL)
-		return -10;
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TB rate: non-zero, not greater than port rate */
 	if (params->tb_rate == 0 ||
-		params->tb_rate > rate)
-		return -11;
+		params->tb_rate > rate) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb rate \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TB size: non-zero */
-	if (params->tb_size == 0)
-		return -12;
+	if (params->tb_size == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb size \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TC rate: non-zero, less than pipe rate */
 	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASS_BE; i++) {
 		if ((qsize[i] == 0 && params->tc_rate[i] != 0) ||
 			(qsize[i] != 0 && (params->tc_rate[i] == 0 ||
-			params->tc_rate[i] > params->tb_rate)))
-			return -13;
-
+			params->tc_rate[i] > params->tb_rate))) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for qsize or tc_rate \n", __func__);
+			return -EINVAL;
+		}
 	}
 
-	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
-		return -13;
+	if (params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for be traffic class rate \n", __func__);
+		return -EINVAL;
+	}
 
 	/* TC period: non-zero */
-	if (params->tc_period == 0)
-		return -14;
+	if (params->tc_period == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc period \n", __func__);
+		return -EINVAL;
+	}
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 	/* TC3 oversubscription weight: non-zero */
-	if (params->tc_ov_weight == 0)
-		return -15;
+	if (params->tc_ov_weight == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc ov weight \n", __func__);
+		return -EINVAL;
+	}
 #endif
 
 	/* Queue WRR weights: non-zero */
 	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
 		uint32_t qindex = RTE_SCHED_TRAFFIC_CLASS_BE + i;
 		if ((qsize[qindex] != 0 && params->wrr_weights[i] == 0) ||
-			(qsize[qindex] == 0 && params->wrr_weights[i] != 0))
-			return -16;
+			(qsize[qindex] == 0 && params->wrr_weights[i] != 0)) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for qsize or wrr weight (qindex %u, qsize %u, wrr weight %u) \n",
+				__func__, qindex, qsize[qindex], params->wrr_weights[i]);
+			return -EINVAL;
+		}
 	}
 	return 0;
 }
@@ -321,55 +345,82 @@ rte_sched_port_check_params(struct rte_sched_port_params *params)
 {
 	uint32_t i;
 
-	if (params == NULL)
-		return -1;
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
 
 	/* socket */
-	if (params->socket < 0)
-		return -3;
+	if (params->socket < 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for socket id \n", __func__);
+		return -EINVAL;
+	}
 
 	/* rate */
-	if (params->rate == 0)
-		return -4;
+	if (params->rate == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for rate \n", __func__);
+		return -EINVAL;
+	}
 
 	/* mtu */
-	if (params->mtu == 0)
-		return -5;
+	if (params->mtu == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for mtu \n", __func__);
+		return -EINVAL;
+	}
 
 	/* n_subports_per_port: non-zero, limited to 16 bits, power of 2 */
 	if (params->n_subports_per_port == 0 ||
 	    params->n_subports_per_port > 1u << 16 ||
-	    !rte_is_power_of_2(params->n_subports_per_port))
-		return -6;
+	    !rte_is_power_of_2(params->n_subports_per_port)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for number of subports \n", __func__);
+		return -EINVAL;
+	}
 
 	/* n_pipes_per_subport: non-zero, power of 2 */
 	if (params->n_pipes_per_subport == 0 ||
-	    !rte_is_power_of_2(params->n_pipes_per_subport))
-		return -7;
+	    !rte_is_power_of_2(params->n_pipes_per_subport)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for pipes number \n", __func__);
+		return -EINVAL;
+	}
 
-	/* qsize: non-zero, power of 2,
+	/* qsize: if non-zero, power of 2,
 	 * no bigger than 32K (due to 16-bit read/write pointers)
 	 */
 	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		uint16_t qsize = params->qsize[i];
 		if ((qsize != 0 && !rte_is_power_of_2(qsize)) ||
-			((i == RTE_SCHED_TRAFFIC_CLASS_BE) && (qsize == 0)))
-			return -8;
+			((i == RTE_SCHED_TRAFFIC_CLASS_BE) && (qsize == 0))) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for qsize \n", __func__);
+			return -EINVAL;
+		}
 	}
 
 	/* pipe_profiles and n_pipe_profiles */
 	if (params->pipe_profiles == NULL ||
 	    params->n_pipe_profiles == 0 ||
-	    params->n_pipe_profiles > params->n_max_pipe_profiles)
-		return -9;
+	    params->n_pipe_profiles > params->n_max_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for number of pipe profiles \n", __func__);
+		return -EINVAL;
+	}
 
 	for (i = 0; i < params->n_pipe_profiles; i++) {
 		struct rte_sched_pipe_params *p = params->pipe_profiles + i;
 		int status;
 
 		status = pipe_profile_check(p, params->rate, &params->qsize[0]);
-		if (status != 0)
-			return status;
+		if (status != 0) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Pipe profile check failed(%d) \n", __func__, status);
+			return -EINVAL;
+		}
 	}
 
 	return 0;
@@ -823,16 +874,35 @@ rte_sched_subport_config(struct rte_sched_port *port,
 	uint32_t i, j;
 
 	/* Check user parameters */
-	if (port == NULL ||
-	    subport_id >= port->n_subports_per_port ||
-	    params == NULL)
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return -EINVAL;
+	}
 
-	if (params->tb_rate == 0 || params->tb_rate > port->rate)
-		return -2;
+	if (params == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter params \n", __func__);
+		return -EINVAL;
+	}
+
+	if (params->tb_rate == 0 || params->tb_rate > port->rate) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb rate \n", __func__);
+		return -EINVAL;
+	}
 
-	if (params->tb_size == 0)
-		return -3;
+	if (params->tb_size == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tb size \n", __func__);
+		return -EINVAL;
+	}
 
 	for (i = 0, j = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		uint32_t tc_rate = params->tc_rate[j];
@@ -841,19 +911,27 @@ rte_sched_subport_config(struct rte_sched_port *port,
 		if (((qsize == 0) &&
 			((tc_rate != 0) && (j != RTE_SCHED_TRAFFIC_CLASS_BE))) ||
 			((qsize != 0) && (tc_rate == 0)) ||
-			(tc_rate > params->tb_rate))
-			return -3;
-
+			(tc_rate > params->tb_rate)) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Incorrect value for tc rate \n", __func__);
+			return -EINVAL;
+		}
 		if (j < RTE_SCHED_TRAFFIC_CLASS_BE)
 			j++;
 	}
 
 	if (port->qsize[RTE_SCHED_TRAFFIC_CLASS_BE] == 0 ||
-		params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0)
-		return -3;
+		params->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE] == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc rate(best effort) \n", __func__);
+		return -EINVAL;
+	}
 
-	if (params->tc_period == 0)
-		return -5;
+	if (params->tc_period == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc period \n", __func__);
+		return -EINVAL;
+	}
 
 	s = port->subport + subport_id;
 
@@ -918,17 +996,37 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 	profile = (uint32_t) pipe_profile;
 	deactivate = (pipe_profile < 0);
 
-	if (port == NULL ||
-	    subport_id >= port->n_subports_per_port ||
-	    pipe_id >= port->n_pipes_per_subport ||
-	    (!deactivate && profile >= port->n_pipe_profiles))
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter subport id \n", __func__);
+		return -EINVAL;
+	}
 
+	if (pipe_id >= port->n_pipes_per_subport) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter pipe id \n", __func__);
+		return -EINVAL;
+	}
+
+	if (!deactivate && profile >= port->n_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter pipe profile \n", __func__);
+		return -EINVAL;
+	}
 
 	/* Check that subport configuration is valid */
 	s = port->subport + subport_id;
-	if (s->tb_period == 0)
-		return -2;
+	if (s->tb_period == 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Subport configuration invalid \n", __func__);
+		return -EINVAL;
+	}
 
 	p = port->pipe + (subport_id * port->n_pipes_per_subport + pipe_id);
 
@@ -1015,25 +1113,37 @@ rte_sched_port_pipe_profile_add(struct rte_sched_port *port,
 	int status;
 
 	/* Port */
-	if (port == NULL)
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
 
 	/* Pipe profiles not exceeds the max limit */
-	if (port->n_pipe_profiles >= port->n_max_pipe_profiles)
-		return -2;
+	if (port->n_pipe_profiles >= port->n_max_pipe_profiles) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Number of pipe profiles exceeds the max limit \n", __func__);
+		return -EINVAL;
+	}
 
 	/* Pipe params */
 	status = pipe_profile_check(params, port->rate, &port->qsize[0]);
-	if (status != 0)
-		return status;
+	if (status != 0) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Pipe profile check failed(%d) \n", __func__, status);
+		return -EINVAL;
+	}
 
 	pp = &port->pipe_profiles[port->n_pipe_profiles];
 	rte_sched_pipe_profile_convert(port, params, pp, port->rate);
 
 	/* Pipe profile not exists */
 	for (i = 0; i < port->n_pipe_profiles; i++)
-		if (memcmp(port->pipe_profiles + i, pp, sizeof(*pp)) == 0)
-			return -3;
+		if (memcmp(port->pipe_profiles + i, pp, sizeof(*pp)) == 0) {
+			RTE_LOG(ERR, SCHED,
+				"%s: Pipe profile already exists \n", __func__);
+			return -EINVAL;
+		}
 
 	/* Pipe profile commit */
 	*pipe_profile_id = port->n_pipe_profiles;
@@ -1101,9 +1211,29 @@ rte_sched_subport_read_stats(struct rte_sched_port *port,
 	struct rte_sched_subport *s;
 
 	/* Check user parameters */
-	if (port == NULL || subport_id >= port->n_subports_per_port ||
-	    stats == NULL || tc_ov == NULL)
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (subport_id >= port->n_subports_per_port) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for subport id \n", __func__);
+		return -EINVAL;
+	}
+
+	if (stats == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter stats \n", __func__);
+		return -EINVAL;
+	}
+
+	if (tc_ov == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for tc_ov \n", __func__);
+		return -EINVAL;
+	}
 
 	s = port->subport + subport_id;
 
@@ -1127,11 +1257,28 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 	struct rte_sched_queue_extra *qe;
 
 	/* Check user parameters */
-	if ((port == NULL) ||
-	    (queue_id >= rte_sched_port_queues_per_port(port)) ||
-		(stats == NULL) ||
-		(qlen == NULL)) {
-		return -1;
+	if (port == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter port \n", __func__);
+		return -EINVAL;
+	}
+
+	if (queue_id >= rte_sched_port_queues_per_port(port)) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for queue id \n", __func__);
+		return -EINVAL;
+	}
+
+	if (stats == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter stats \n", __func__);
+		return -EINVAL;
+	}
+
+	if (qlen == NULL) {
+		RTE_LOG(ERR, SCHED,
+			"%s: Incorrect value for parameter qlen \n", __func__);
+		return -EINVAL;
 	}
 	q = port->queue + queue_id;
 	qe = port->queue_extra + queue_id;
-- 
2.21.0



* [dpdk-dev] [PATCH v3 06/11] sched: improve doxygen comments
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (4 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 05/11] sched: improve error log messages Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 07/11] net/softnic: add config flexibility to softnic tm Jasvinder Singh
                           ` (4 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Improve doxygen comments.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 lib/librte_sched/rte_sched.h | 145 ++++++++++++++++++++++-------------
 1 file changed, 93 insertions(+), 52 deletions(-)
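
Illustrative note (not part of the patch): the "13-entry array" and the
"0 .. 12" traffic class range in the updated comments follow from the macro
arithmetic 16 - 4 + 1 = 13. The sketch below reproduces that arithmetic and
one plausible mapping from a queue index within a pipe to its traffic class.
The DEMO_ names are hypothetical, and the value of the best-effort class
index (last class, 12) is an assumption that is not shown in this thread
excerpt.

```c
#include <stdint.h>

#define DEMO_QUEUES_PER_PIPE     16
#define DEMO_BE_QUEUES_PER_PIPE  4

/* Same formula as RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE: 16 - 4 + 1 = 13. */
#define DEMO_TRAFFIC_CLASSES_PER_PIPE \
	(DEMO_QUEUES_PER_PIPE - DEMO_BE_QUEUES_PER_PIPE + 1)

/* Assumption: best-effort is the last (lowest priority) traffic class. */
#define DEMO_TRAFFIC_CLASS_BE    (DEMO_TRAFFIC_CLASSES_PER_PIPE - 1)

/* Strict-priority classes own one queue each, so queue index and class
 * index coincide; every remaining queue belongs to the best-effort class.
 */
static uint32_t
demo_queue_to_tc(uint32_t pipe_queue)
{
	return pipe_queue < DEMO_TRAFFIC_CLASS_BE ?
		pipe_queue : DEMO_TRAFFIC_CLASS_BE;
}
```

Under these assumptions queues 0 to 11 map to traffic classes 0 to 11, and
queues 12 to 15 all map to traffic class 12.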

diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 9cccdda41..e8aa63301 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -52,7 +52,7 @@ extern "C" {
  *	    multiple connections of same traffic class belonging to
  *	    the same user;
  *           - Weighted Round Robin (WRR) is used to service the
- *	    queues within same pipe traffic class.
+ *	    queues within the lowest priority (best-effort) pipe traffic class.
  *
  */
 
@@ -83,6 +83,8 @@ extern "C" {
 #define RTE_SCHED_BE_QUEUES_PER_PIPE    4
 
 /** Number of traffic classes per pipe (as well as subport).
+ * @see struct rte_sched_subport_params
+ * @see struct rte_sched_pipe_params
  */
 #define RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE    \
 (RTE_SCHED_QUEUES_PER_PIPE - RTE_SCHED_BE_QUEUES_PER_PIPE + 1)
@@ -106,6 +108,8 @@ extern "C" {
  *
  * The FCS is considered overhead only if not included in the packet
  * length (field pkt_len of struct rte_mbuf).
+ *
+ * @see struct rte_sched_port_params
  */
 #ifndef RTE_SCHED_FRAME_OVERHEAD_DEFAULT
 #define RTE_SCHED_FRAME_OVERHEAD_DEFAULT      24
@@ -121,34 +125,39 @@ extern "C" {
  * byte.
  */
 struct rte_sched_subport_params {
-	/* Subport token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
+	/** Token bucket rate (measured in bytes per second) */
+	uint32_t tb_rate;
+
+	/** Token bucket size (measured in credits) */
+	uint32_t tb_size;
 
-	/* Subport traffic classes */
+	/** Traffic class rates (measured in bytes per second) */
 	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Traffic class rates (measured in bytes per second) */
+
+	/** Enforcement period for rates (measured in milliseconds) */
 	uint32_t tc_period;
-	/**< Enforcement period for rates (measured in milliseconds) */
+
+	/** Number of pipes per subport */
+	uint32_t n_pipes_per_subport;
 };
 
 /** Subport statistics */
 struct rte_sched_subport_stats {
-	/* Packets */
+	/** Number of packets successfully written */
 	uint32_t n_pkts_tc[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets successfully written */
+
+	/** Number of packets dropped */
 	uint32_t n_pkts_tc_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets dropped */
 
-	/* Bytes */
+	/** Number of bytes successfully written for each traffic class */
 	uint32_t n_bytes_tc[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of bytes successfully written for each traffic class */
+
+	/** Number of bytes dropped for each traffic class */
 	uint32_t n_bytes_tc_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of bytes dropped for each traffic class */
 
 #ifdef RTE_SCHED_RED
+	/** Number of packets dropped by red */
 	uint32_t n_pkts_red_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Number of packets dropped by red */
 #endif
 };
 
@@ -162,59 +171,91 @@ struct rte_sched_subport_stats {
  * byte.
  */
 struct rte_sched_pipe_params {
-	/* Pipe token bucket */
-	uint32_t tb_rate;                /**< Rate (measured in bytes per second) */
-	uint32_t tb_size;                /**< Size (measured in credits) */
+	/** Token bucket rate (measured in bytes per second) */
+	uint32_t tb_rate;
+
+	/** Token bucket size (measured in credits) */
+	uint32_t tb_size;
 
-	/* Pipe traffic classes */
+	/** Traffic class rates (measured in bytes per second) */
 	uint32_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	/**< Traffic class rates (measured in bytes per second) */
+
+	/** Enforcement period (measured in milliseconds) */
 	uint32_t tc_period;
-	/**< Enforcement period (measured in milliseconds) */
-	uint8_t tc_ov_weight;		 /**< Weight Traffic class 3 oversubscription */
 
-	/* Pipe queues */
-	uint8_t  wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE]; /**< WRR weights */
+	/** Best-effort traffic class oversubscription weight */
+	uint8_t tc_ov_weight;
+
+	/** WRR weights of best-effort traffic class queues */
+	uint8_t wrr_weights[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
 /** Queue statistics */
 struct rte_sched_queue_stats {
-	/* Packets */
-	uint32_t n_pkts;                 /**< Packets successfully written */
-	uint32_t n_pkts_dropped;         /**< Packets dropped */
+	/** Packets successfully written */
+	uint32_t n_pkts;
+
+	/** Packets dropped */
+	uint32_t n_pkts_dropped;
+
 #ifdef RTE_SCHED_RED
-	uint32_t n_pkts_red_dropped;	 /**< Packets dropped by RED */
+	/** Packets dropped by RED */
+	uint32_t n_pkts_red_dropped;
 #endif
 
-	/* Bytes */
-	uint32_t n_bytes;                /**< Bytes successfully written */
-	uint32_t n_bytes_dropped;        /**< Bytes dropped */
+	/** Bytes successfully written */
+	uint32_t n_bytes;
+
+	/** Bytes dropped */
+	uint32_t n_bytes_dropped;
 };
 
 /** Port configuration parameters. */
 struct rte_sched_port_params {
-	const char *name;                /**< String to be associated */
-	int socket;                      /**< CPU socket ID */
-	uint32_t rate;                   /**< Output port rate
-					  * (measured in bytes per second) */
-	uint32_t mtu;                    /**< Maximum Ethernet frame size
-					  * (measured in bytes).
-					  * Should not include the framing overhead. */
-	uint32_t frame_overhead;         /**< Framing overhead per packet
-					  * (measured in bytes) */
-	uint32_t n_subports_per_port;    /**< Number of subports */
-	uint32_t n_pipes_per_subport;    /**< Number of pipes per subport */
+	/** Name of the port to be associated */
+	const char *name;
+
+	/** CPU socket ID */
+	int socket;
+
+	/** Output port rate (measured in bytes per second) */
+	uint32_t rate;
+
+	/** Maximum Ethernet frame size (measured in bytes).
+	 * Should not include the framing overhead.
+	 */
+	uint32_t mtu;
+
+	/** Framing overhead per packet (measured in bytes) */
+	uint32_t frame_overhead;
+
+	/** Number of subports */
+	uint32_t n_subports_per_port;
+
+	/** Number of pipes per subport */
+	uint32_t n_pipes_per_subport;
+
+	/** Packet queue size for each traffic class.
+	 * All the pipes within the same subport share the same
+	 * configuration for the queues. Queues that are not needed have
+	 * zero size.
+	 */
 	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
-	/**< Packet queue size for each traffic class.
-	 * Queues which are not needed are allowed to have zero size. */
+
+	/** Pipe profile table.
+	 * Every pipe is configured using one of the profiles from this table.
+	 */
 	struct rte_sched_pipe_params *pipe_profiles;
-	/**< Pipe profile table.
-	 * Every pipe is configured using one of the profiles from this table. */
-	uint32_t n_pipe_profiles;        /**< Profiles in the pipe profile table */
+
+	/** Profiles in the pipe profile table */
+	uint32_t n_pipe_profiles;
+
+	/** Max profiles allowed in the pipe profile table */
 	uint32_t n_max_pipe_profiles;
-	/**< Max profiles allowed in the pipe profile table */
+
 #ifdef RTE_SCHED_RED
-	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS]; /**< RED parameters */
+	/** RED parameters */
+	struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
 #endif
 };
 
@@ -328,8 +369,8 @@ rte_sched_port_get_memory_footprint(struct rte_sched_port_params *params);
  *   Pointer to pre-allocated subport statistics structure where the statistics
  *   counters should be stored
  * @param tc_ov
- *   Pointer to pre-allocated 4-entry array where the oversubscription status for
- *   each of the 4 subport traffic classes should be stored.
+ *   Pointer to pre-allocated 13-entry array where the oversubscription status for
+ *   each of the subport traffic classes should be stored.
  * @return
  *   0 upon success, error code otherwise
  */
@@ -374,7 +415,7 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 12)
  * @param queue
  *   Queue ID within pipe traffic class (0 .. 3)
  * @param color
@@ -401,7 +442,7 @@ rte_sched_port_pkt_write(struct rte_sched_port *port,
  * @param pipe
  *   Pipe ID within subport
  * @param traffic_class
- *   Traffic class ID within pipe (0 .. 3)
+ *   Traffic class ID within pipe (0 .. 12)
  * @param queue
  *   Queue ID within pipe traffic class (0 .. 3)
  *
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread

* [dpdk-dev] [PATCH v3 07/11] net/softnic: add config flexibility to softnic tm
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (5 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 06/11] sched: improve doxygen comments Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 08/11] test_sched: modify tests for config flexibility Jasvinder Singh
                           ` (3 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update the softnic TM function to support flexible configuration of
pipe traffic classes and queue sizes.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 drivers/net/softnic/rte_eth_softnic.c         | 131 ++++++
 drivers/net/softnic/rte_eth_softnic_cli.c     | 433 ++++++++++++++++--
 .../net/softnic/rte_eth_softnic_internals.h   |   8 +-
 drivers/net/softnic/rte_eth_softnic_tm.c      |  64 ++-
 4 files changed, 582 insertions(+), 54 deletions(-)
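
Reviewer note (not part of the commit): the sketch below illustrates the
queue layout this patch appears to assume, with 12 strict-priority TCs
owning one queue each and the last (best-effort) TC owning 4 queues. All
`EX_`/`ex_` names are hypothetical, and the constant values are inferred
from the ranges shown in this patch (TC 0 .. 12, weights for q12 .. q15);
they are not copied from rte_sched.h. It mirrors the `queue_id + tc_id`
term of queue_node_id() and the weight loop in
cmd_tmgr_hierarchy_default(), which reads only 4 weights from the command
line (hence 70 fixed tokens + 4 weights = 74 tokens).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from this patch; not taken from rte_sched.h. */
#define EX_TCS_PER_PIPE        13u
#define EX_BE_QUEUES_PER_PIPE   4u
#define EX_QUEUES_PER_PIPE     16u

/*
 * Pipe-relative queue index for (tc, q).
 * Strict-priority TCs (0 .. 11) own a single queue, so q must be 0.
 * The best-effort TC (12) owns queues 12 .. 15.
 */
static uint32_t
ex_pipe_queue_index(uint32_t tc, uint32_t q)
{
	assert(tc < EX_TCS_PER_PIPE);
	if (tc == EX_TCS_PER_PIPE - 1)
		assert(q < EX_BE_QUEUES_PER_PIPE);
	else
		assert(q == 0);

	return tc + q;
}

/*
 * Fill all 16 per-queue weights: strict-priority queues get a fixed
 * weight of 1, best-effort queues take the caller-supplied values.
 */
static void
ex_fill_weights(uint32_t weight[EX_QUEUES_PER_PIPE],
	const uint32_t be_weight[EX_BE_QUEUES_PER_PIPE])
{
	uint32_t i, j = 0;

	for (i = 0; i < EX_QUEUES_PER_PIPE; i++) {
		if (i < EX_TCS_PER_PIPE - 1)
			weight[i] = 1;
		else
			weight[i] = be_weight[j++];
	}
}
```

Under these assumptions, every (tc, q) pair maps to a distinct index in
0 .. 15, which is why the TC offset no longer needs to be scaled by a
queues-per-TC constant.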

diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 4bda2f2b0..50a48e90b 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -28,6 +28,19 @@
 #define PMD_PARAM_TM_QSIZE1                                "tm_qsize1"
 #define PMD_PARAM_TM_QSIZE2                                "tm_qsize2"
 #define PMD_PARAM_TM_QSIZE3                                "tm_qsize3"
+#define PMD_PARAM_TM_QSIZE4                                "tm_qsize4"
+#define PMD_PARAM_TM_QSIZE5                                "tm_qsize5"
+#define PMD_PARAM_TM_QSIZE6                                "tm_qsize6"
+#define PMD_PARAM_TM_QSIZE7                                "tm_qsize7"
+#define PMD_PARAM_TM_QSIZE8                                "tm_qsize8"
+#define PMD_PARAM_TM_QSIZE9                                "tm_qsize9"
+#define PMD_PARAM_TM_QSIZE10                               "tm_qsize10"
+#define PMD_PARAM_TM_QSIZE11                               "tm_qsize11"
+#define PMD_PARAM_TM_QSIZE12                               "tm_qsize12"
+#define PMD_PARAM_TM_QSIZE13                               "tm_qsize13"
+#define PMD_PARAM_TM_QSIZE14                               "tm_qsize14"
+#define PMD_PARAM_TM_QSIZE15                               "tm_qsize15"
+
 
 static const char * const pmd_valid_args[] = {
 	PMD_PARAM_FIRMWARE,
@@ -39,6 +52,18 @@ static const char * const pmd_valid_args[] = {
 	PMD_PARAM_TM_QSIZE1,
 	PMD_PARAM_TM_QSIZE2,
 	PMD_PARAM_TM_QSIZE3,
+	PMD_PARAM_TM_QSIZE4,
+	PMD_PARAM_TM_QSIZE5,
+	PMD_PARAM_TM_QSIZE6,
+	PMD_PARAM_TM_QSIZE7,
+	PMD_PARAM_TM_QSIZE8,
+	PMD_PARAM_TM_QSIZE9,
+	PMD_PARAM_TM_QSIZE10,
+	PMD_PARAM_TM_QSIZE11,
+	PMD_PARAM_TM_QSIZE12,
+	PMD_PARAM_TM_QSIZE13,
+	PMD_PARAM_TM_QSIZE14,
+	PMD_PARAM_TM_QSIZE15,
 	NULL
 };
 
@@ -434,6 +459,18 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 	p->tm.qsize[1] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[2] = SOFTNIC_TM_QUEUE_SIZE;
 	p->tm.qsize[3] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[4] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[5] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[6] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[7] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[8] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[9] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[10] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[11] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[12] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[13] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[14] = SOFTNIC_TM_QUEUE_SIZE;
+	p->tm.qsize[15] = SOFTNIC_TM_QUEUE_SIZE;
 
 	/* Firmware script (optional) */
 	if (rte_kvargs_count(kvlist, PMD_PARAM_FIRMWARE) == 1) {
@@ -504,6 +541,88 @@ pmd_parse_args(struct pmd_params *p, const char *params)
 			goto out_free;
 	}
 
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE4) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE4,
+			&get_uint32, &p->tm.qsize[4]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE5) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE5,
+			&get_uint32, &p->tm.qsize[5]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE6) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE6,
+			&get_uint32, &p->tm.qsize[6]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE7) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE7,
+			&get_uint32, &p->tm.qsize[7]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE8) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE8,
+			&get_uint32, &p->tm.qsize[8]);
+		if (ret < 0)
+			goto out_free;
+	}
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE9) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE9,
+			&get_uint32, &p->tm.qsize[9]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE10) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE10,
+			&get_uint32, &p->tm.qsize[10]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE11) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE11,
+			&get_uint32, &p->tm.qsize[11]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE12) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE12,
+			&get_uint32, &p->tm.qsize[12]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE13) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE13,
+			&get_uint32, &p->tm.qsize[13]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE14) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE14,
+			&get_uint32, &p->tm.qsize[14]);
+		if (ret < 0)
+			goto out_free;
+	}
+
+	if (rte_kvargs_count(kvlist, PMD_PARAM_TM_QSIZE15) == 1) {
+		ret = rte_kvargs_process(kvlist, PMD_PARAM_TM_QSIZE15,
+			&get_uint32, &p->tm.qsize[15]);
+		if (ret < 0)
+			goto out_free;
+	}
+
 out_free:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -588,6 +707,18 @@ RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
 	PMD_PARAM_TM_QSIZE1 "=<uint32> "
 	PMD_PARAM_TM_QSIZE2 "=<uint32> "
 	PMD_PARAM_TM_QSIZE3 "=<uint32>"
+	PMD_PARAM_TM_QSIZE4 "=<uint32> "
+	PMD_PARAM_TM_QSIZE5 "=<uint32> "
+	PMD_PARAM_TM_QSIZE6 "=<uint32> "
+	PMD_PARAM_TM_QSIZE7 "=<uint32> "
+	PMD_PARAM_TM_QSIZE8 "=<uint32> "
+	PMD_PARAM_TM_QSIZE9 "=<uint32> "
+	PMD_PARAM_TM_QSIZE10 "=<uint32> "
+	PMD_PARAM_TM_QSIZE11 "=<uint32> "
+	PMD_PARAM_TM_QSIZE12 "=<uint32> "
+	PMD_PARAM_TM_QSIZE13 "=<uint32> "
+	PMD_PARAM_TM_QSIZE14 "=<uint32> "
+	PMD_PARAM_TM_QSIZE15 "=<uint32>"
 );
 
 
diff --git a/drivers/net/softnic/rte_eth_softnic_cli.c b/drivers/net/softnic/rte_eth_softnic_cli.c
index 56fc92ba2..7db77a33a 100644
--- a/drivers/net/softnic/rte_eth_softnic_cli.c
+++ b/drivers/net/softnic/rte_eth_softnic_cli.c
@@ -566,8 +566,7 @@ queue_node_id(uint32_t n_spp __rte_unused,
 	uint32_t tc_id,
 	uint32_t queue_id)
 {
-	return queue_id +
-		tc_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE +
+	return queue_id + tc_id +
 		(pipe_id + subport_id * n_pps) * RTE_SCHED_QUEUES_PER_PIPE;
 }
 
@@ -617,10 +616,19 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 		},
 	};
 
+	uint32_t *shared_shaper_id =
+		(uint32_t *) calloc(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+		sizeof(uint32_t));
+	if (shared_shaper_id == NULL)
+		return -1;
+
+	memcpy(shared_shaper_id, params->shared_shaper_id.tc,
+		sizeof(params->shared_shaper_id.tc));
+
 	struct rte_tm_node_params tc_node_params[] = {
 		[0] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[0],
-			.shared_shaper_id = &params->shared_shaper_id.tc[0],
+			.shared_shaper_id = &shared_shaper_id[0],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[0]) ? 1 : 0,
 			.nonleaf = {
@@ -630,7 +638,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[1] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[1],
-			.shared_shaper_id = &params->shared_shaper_id.tc[1],
+			.shared_shaper_id = &shared_shaper_id[1],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[1]) ? 1 : 0,
 			.nonleaf = {
@@ -640,7 +648,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[2] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[2],
-			.shared_shaper_id = &params->shared_shaper_id.tc[2],
+			.shared_shaper_id = &shared_shaper_id[2],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[2]) ? 1 : 0,
 			.nonleaf = {
@@ -650,13 +658,103 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 
 		[3] = {
 			.shaper_profile_id = params->shaper_profile_id.tc[3],
-			.shared_shaper_id = &params->shared_shaper_id.tc[3],
+			.shared_shaper_id = &shared_shaper_id[3],
 			.n_shared_shapers =
 				(&params->shared_shaper_id.tc_valid[3]) ? 1 : 0,
 			.nonleaf = {
 				.n_sp_priorities = 1,
 			},
 		},
+
+		[4] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[4],
+			.shared_shaper_id = &shared_shaper_id[4],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[4]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[5] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[5],
+			.shared_shaper_id = &shared_shaper_id[5],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[5]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[6] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[6],
+			.shared_shaper_id = &shared_shaper_id[6],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[6]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[7] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[7],
+			.shared_shaper_id = &shared_shaper_id[7],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[7]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[8] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[8],
+			.shared_shaper_id = &shared_shaper_id[8],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[8]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[9] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[9],
+			.shared_shaper_id = &shared_shaper_id[9],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[9]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[10] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[10],
+			.shared_shaper_id = &shared_shaper_id[10],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[10]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[11] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[11],
+			.shared_shaper_id = &shared_shaper_id[11],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[11]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
+
+		[12] = {
+			.shaper_profile_id = params->shaper_profile_id.tc[12],
+			.shared_shaper_id = &shared_shaper_id[12],
+			.n_shared_shapers =
+				(&params->shared_shaper_id.tc_valid[12]) ? 1 : 0,
+			.nonleaf = {
+				.n_sp_priorities = 1,
+			},
+		},
 	};
 
 	struct rte_tm_node_params queue_node_params = {
@@ -730,7 +828,23 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 					return -1;
 
 				/* Hierarchy level 4: Queue nodes */
-				for (q = 0; q < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; q++) {
+				if (t == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+					/* Best-effort traffic class queues */
+					for (q = 0; q < RTE_SCHED_BE_QUEUES_PER_PIPE; q++) {
+						status = rte_tm_node_add(port_id,
+							queue_node_id(n_spp, n_pps, s, p, t, q),
+							tc_node_id(n_spp, n_pps, s, p, t),
+							0,
+							params->weight.queue[q],
+							RTE_TM_NODE_LEVEL_ID_ANY,
+							&queue_node_params,
+							&error);
+						if (status)
+							return -1;
+					}
+				} else {
+					/* Strict-priority traffic class queues */
+					q = 0;
 					status = rte_tm_node_add(port_id,
 						queue_node_id(n_spp, n_pps, s, p, t, q),
 						tc_node_id(n_spp, n_pps, s, p, t),
@@ -741,7 +855,7 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
 						&error);
 					if (status)
 						return -1;
-				} /* Queue */
+				}
 			} /* TC */
 		} /* Pipe */
 	} /* Subport */
@@ -762,13 +876,31 @@ tmgr_hierarchy_default(struct pmd_internals *softnic,
  *   tc1 <profile_id>
  *   tc2 <profile_id>
  *   tc3 <profile_id>
+ *   tc4 <profile_id>
+ *   tc5 <profile_id>
+ *   tc6 <profile_id>
+ *   tc7 <profile_id>
+ *   tc8 <profile_id>
+ *   tc9 <profile_id>
+ *   tc10 <profile_id>
+ *   tc11 <profile_id>
+ *   tc12 <profile_id>
  *  shared shaper
  *   tc0 <id | none>
  *   tc1 <id | none>
  *   tc2 <id | none>
  *   tc3 <id | none>
+ *   tc4 <id | none>
+ *   tc5 <id | none>
+ *   tc6 <id | none>
+ *   tc7 <id | none>
+ *   tc8 <id | none>
+ *   tc9 <id | none>
+ *   tc10 <id | none>
+ *   tc11 <id | none>
+ *   tc12 <id | none>
  *  weight
- *   queue  <q0> ... <q15>
+ *   queue  <q12> ... <q15>
  */
 static void
 cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
@@ -778,11 +910,11 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 	size_t out_size)
 {
 	struct tmgr_hierarchy_default_params p;
-	int i, status;
+	int i, j, status;
 
 	memset(&p, 0, sizeof(p));
 
-	if (n_tokens != 50) {
+	if (n_tokens != 74) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -894,27 +1026,117 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		return;
 	}
 
+	if (strcmp(tokens[22], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[4], tokens[23]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc4 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[24], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[5], tokens[25]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc5 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[26], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[6], tokens[27]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc6 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[28], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[7], tokens[29]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc7 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[30], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[8], tokens[31]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc8 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[32], "tc9") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc9");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[9], tokens[33]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc9 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[34], "tc10") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc10");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[10], tokens[35]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc10 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[36], "tc11") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc11");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[11], tokens[37]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc11 profile id");
+		return;
+	}
+
+	if (strcmp(tokens[38], "tc12") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc12");
+		return;
+	}
+
+	if (softnic_parser_read_uint32(&p.shaper_profile_id.tc[12], tokens[39]) != 0) {
+		snprintf(out, out_size, MSG_ARG_INVALID, "tc12 profile id");
+		return;
+	}
+
 	/* Shared shaper */
 
-	if (strcmp(tokens[22], "shared") != 0) {
+	if (strcmp(tokens[40], "shared") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shared");
 		return;
 	}
 
-	if (strcmp(tokens[23], "shaper") != 0) {
+	if (strcmp(tokens[41], "shaper") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "shaper");
 		return;
 	}
 
-	if (strcmp(tokens[24], "tc0") != 0) {
+	if (strcmp(tokens[42], "tc0") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc0");
 		return;
 	}
 
-	if (strcmp(tokens[25], "none") == 0)
+	if (strcmp(tokens[43], "none") == 0)
 		p.shared_shaper_id.tc_valid[0] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[25]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[0], tokens[43]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc0");
 			return;
 		}
@@ -922,15 +1144,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[0] = 1;
 	}
 
-	if (strcmp(tokens[26], "tc1") != 0) {
+	if (strcmp(tokens[44], "tc1") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc1");
 		return;
 	}
 
-	if (strcmp(tokens[27], "none") == 0)
+	if (strcmp(tokens[45], "none") == 0)
 		p.shared_shaper_id.tc_valid[1] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[27]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[1], tokens[45]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc1");
 			return;
 		}
@@ -938,15 +1160,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[1] = 1;
 	}
 
-	if (strcmp(tokens[28], "tc2") != 0) {
+	if (strcmp(tokens[46], "tc2") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc2");
 		return;
 	}
 
-	if (strcmp(tokens[29], "none") == 0)
+	if (strcmp(tokens[47], "none") == 0)
 		p.shared_shaper_id.tc_valid[2] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[29]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[2], tokens[47]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc2");
 			return;
 		}
@@ -954,15 +1176,15 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[2] = 1;
 	}
 
-	if (strcmp(tokens[30], "tc3") != 0) {
+	if (strcmp(tokens[48], "tc3") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc3");
 		return;
 	}
 
-	if (strcmp(tokens[31], "none") == 0)
+	if (strcmp(tokens[49], "none") == 0)
 		p.shared_shaper_id.tc_valid[3] = 0;
 	else {
-		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[31]) != 0) {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[3], tokens[49]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc3");
 			return;
 		}
@@ -970,22 +1192,171 @@ cmd_tmgr_hierarchy_default(struct pmd_internals *softnic,
 		p.shared_shaper_id.tc_valid[3] = 1;
 	}
 
+	if (strcmp(tokens[50], "tc4") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc4");
+		return;
+	}
+
+	if (strcmp(tokens[51], "none") == 0)
+		p.shared_shaper_id.tc_valid[4] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[4], tokens[51]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc4");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[4] = 1;
+	}
+
+	if (strcmp(tokens[52], "tc5") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc5");
+		return;
+	}
+
+	if (strcmp(tokens[53], "none") == 0)
+		p.shared_shaper_id.tc_valid[5] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[5], tokens[53]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc5");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[5] = 1;
+	}
+
+	if (strcmp(tokens[54], "tc6") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc6");
+		return;
+	}
+
+	if (strcmp(tokens[55], "none") == 0)
+		p.shared_shaper_id.tc_valid[6] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[6], tokens[55]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc6");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[6] = 1;
+	}
+
+	if (strcmp(tokens[56], "tc7") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc7");
+		return;
+	}
+
+	if (strcmp(tokens[57], "none") == 0)
+		p.shared_shaper_id.tc_valid[7] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[7], tokens[57]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc7");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[7] = 1;
+	}
+
+	if (strcmp(tokens[58], "tc8") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc8");
+		return;
+	}
+
+	if (strcmp(tokens[59], "none") == 0)
+		p.shared_shaper_id.tc_valid[8] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[8], tokens[59]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc8");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[8] = 1;
+	}
+
+	if (strcmp(tokens[60], "tc9") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc9");
+		return;
+	}
+
+	if (strcmp(tokens[61], "none") == 0)
+		p.shared_shaper_id.tc_valid[9] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[9], tokens[61]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc9");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[9] = 1;
+	}
+
+	if (strcmp(tokens[62], "tc10") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc10");
+		return;
+	}
+
+	if (strcmp(tokens[63], "none") == 0)
+		p.shared_shaper_id.tc_valid[10] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[10], tokens[63]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc10");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[10] = 1;
+	}
+
+	if (strcmp(tokens[64], "tc11") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc11");
+		return;
+	}
+
+	if (strcmp(tokens[65], "none") == 0)
+		p.shared_shaper_id.tc_valid[11] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[11], tokens[65]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc11");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[11] = 1;
+	}
+
+	if (strcmp(tokens[66], "tc12") != 0) {
+		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "tc12");
+		return;
+	}
+
+	if (strcmp(tokens[67], "none") == 0)
+		p.shared_shaper_id.tc_valid[12] = 0;
+	else {
+		if (softnic_parser_read_uint32(&p.shared_shaper_id.tc[12], tokens[67]) != 0) {
+			snprintf(out, out_size, MSG_ARG_INVALID, "shared shaper tc12");
+			return;
+		}
+
+		p.shared_shaper_id.tc_valid[12] = 1;
+	}
+
 	/* Weight */
 
-	if (strcmp(tokens[32], "weight") != 0) {
+	if (strcmp(tokens[68], "weight") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "weight");
 		return;
 	}
 
-	if (strcmp(tokens[33], "queue") != 0) {
+	if (strcmp(tokens[69], "queue") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "queue");
 		return;
 	}
 
-	for (i = 0; i < 16; i++) {
-		if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[34 + i]) != 0) {
-			snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
-			return;
+	for (i = 0, j = 0; i < 16; i++) {
+		if (i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			p.weight.queue[i] = 1;
+		} else {
+			if (softnic_parser_read_uint32(&p.weight.queue[i], tokens[70 + j]) != 0) {
+				snprintf(out, out_size, MSG_ARG_INVALID, "weight queue");
+				return;
+			}
+			j++;
 		}
 	}
 
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 415434d0d..5525dff98 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -43,7 +43,7 @@ struct pmd_params {
 	/** Traffic Management (TM) */
 	struct {
 		uint32_t n_queues; /**< Number of queues */
-		uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+		uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 	} tm;
 };
 
@@ -161,13 +161,15 @@ TAILQ_HEAD(softnic_link_list, softnic_link);
 #define TM_MAX_PIPES_PER_SUBPORT			4096
 #endif
 
+#ifndef TM_MAX_PIPE_PROFILE
+#define TM_MAX_PIPE_PROFILE				256
+#endif
 struct tm_params {
 	struct rte_sched_port_params port_params;
 
 	struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
 
-	struct rte_sched_pipe_params
-		pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+	struct rte_sched_pipe_params pipe_profiles[TM_MAX_PIPE_PROFILE];
 	uint32_t n_pipe_profiles;
 	uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
 };
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 58744a9eb..c7a74836b 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -367,7 +367,8 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
 {
 	struct pmd_internals *p = dev->data->dev_private;
 	uint32_t n_queues_max = p->params.tm.n_queues;
-	uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+	uint32_t n_tc_max =
+		(n_queues_max * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE) / RTE_SCHED_QUEUES_PER_PIPE;
 	uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
 	uint32_t n_subports_max = n_pipes_max;
 	uint32_t n_root_max = 1;
@@ -625,10 +626,10 @@ static const struct rte_tm_level_capabilities tm_level_cap[] = {
 			.shaper_shared_n_max = 1,
 
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 
@@ -793,10 +794,10 @@ static const struct rte_tm_node_capabilities tm_node_cap[] = {
 
 		{.nonleaf = {
 			.sched_n_children_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_sp_n_priorities_max = 1,
 			.sched_wfq_n_children_per_group_max =
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+				RTE_SCHED_BE_QUEUES_PER_PIPE,
 			.sched_wfq_n_groups_max = 1,
 			.sched_wfq_weight_max = UINT32_MAX,
 		} },
@@ -2043,15 +2044,13 @@ pipe_profile_build(struct rte_eth_dev *dev,
 
 		/* Queue */
 		TAILQ_FOREACH(nq, nl, node) {
-			uint32_t pipe_queue_id;
 
 			if (nq->level != TM_NODE_LEVEL_QUEUE ||
 				nq->parent_node_id != nt->node_id)
 				continue;
 
-			pipe_queue_id = nt->priority *
-				RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
-			pp->wrr_weights[pipe_queue_id] = nq->weight;
+			if (nt->priority == RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+				pp->wrr_weights[queue_id] = nq->weight;
 
 			queue_id++;
 		}
@@ -2065,7 +2064,7 @@ pipe_profile_free_exists(struct rte_eth_dev *dev,
 	struct pmd_internals *p = dev->data->dev_private;
 	struct tm_params *t = &p->soft.tm.params;
 
-	if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+	if (t->n_pipe_profiles < TM_MAX_PIPE_PROFILE) {
 		*pipe_profile_id = t->n_pipe_profiles;
 		return 1;
 	}
@@ -2217,6 +2216,7 @@ wred_profiles_set(struct rte_eth_dev *dev)
 {
 	struct pmd_internals *p = dev->data->dev_private;
 	struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+
 	uint32_t tc_id;
 	enum rte_color color;
 
@@ -2332,7 +2332,7 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 				rte_strerror(EINVAL));
 	}
 
-	/* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+	/* Each pipe has exactly 13 TCs, with exactly one TC for each priority */
 	TAILQ_FOREACH(np, nl, node) {
 		uint32_t mask = 0, mask_expected =
 			RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
@@ -2364,12 +2364,14 @@ hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
 				rte_strerror(EINVAL));
 	}
 
-	/* Each TC has exactly 4 packet queues. */
+	/* Each strict-priority TC has exactly 1 packet queue, while the
+	 * lowest-priority TC (best-effort) has 4 queues.
+	 */
 	TAILQ_FOREACH(nt, nl, node) {
 		if (nt->level != TM_NODE_LEVEL_TC)
 			continue;
 
-		if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+		if (nt->n_children != 1 && nt->n_children != RTE_SCHED_BE_QUEUES_PER_PIPE)
 			return -rte_tm_error_set(error,
 				EINVAL,
 				RTE_TM_ERROR_TYPE_UNSPECIFIED,
@@ -2531,9 +2533,22 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 			p->params.tm.qsize[1],
 			p->params.tm.qsize[2],
 			p->params.tm.qsize[3],
+			p->params.tm.qsize[4],
+			p->params.tm.qsize[5],
+			p->params.tm.qsize[6],
+			p->params.tm.qsize[7],
+			p->params.tm.qsize[8],
+			p->params.tm.qsize[9],
+			p->params.tm.qsize[10],
+			p->params.tm.qsize[11],
+			p->params.tm.qsize[12],
+			p->params.tm.qsize[13],
+			p->params.tm.qsize[14],
+			p->params.tm.qsize[15],
 		},
 		.pipe_profiles = t->pipe_profiles,
 		.n_pipe_profiles = t->n_pipe_profiles,
+		.n_max_pipe_profiles = TM_MAX_PIPE_PROFILE,
 	};
 
 	wred_profiles_set(dev);
@@ -2566,8 +2581,17 @@ hierarchy_blueprints_create(struct rte_eth_dev *dev)
 					tc_rate[1],
 					tc_rate[2],
 					tc_rate[3],
-			},
-			.tc_period = SUBPORT_TC_PERIOD,
+					tc_rate[4],
+					tc_rate[5],
+					tc_rate[6],
+					tc_rate[7],
+					tc_rate[8],
+					tc_rate[9],
+					tc_rate[10],
+					tc_rate[11],
+					tc_rate[12],
+				},
+				.tc_period = SUBPORT_TC_PERIOD,
 		};
 
 		subport_id++;
@@ -2666,7 +2690,7 @@ update_queue_weight(struct rte_eth_dev *dev,
 	uint32_t subport_id = tm_node_subport_id(dev, ns);
 
 	uint32_t pipe_queue_id =
-		tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+		tc_id * RTE_SCHED_QUEUES_PER_PIPE + queue_id;
 
 	struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
 	struct rte_sched_pipe_params profile1;
@@ -3023,7 +3047,7 @@ tm_port_queue_id(struct rte_eth_dev *dev,
 	uint32_t port_tc_id =
 		port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
 	uint32_t port_queue_id =
-		port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+		port_tc_id * RTE_SCHED_QUEUES_PER_PIPE + tc_queue_id;
 
 	return port_queue_id;
 }
@@ -3149,8 +3173,8 @@ read_pipe_stats(struct rte_eth_dev *dev,
 		uint32_t qid = tm_port_queue_id(dev,
 			subport_id,
 			pipe_id,
-			i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
-			i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+			i / RTE_SCHED_QUEUES_PER_PIPE,
+			i % RTE_SCHED_QUEUES_PER_PIPE);
 
 		int status = rte_sched_queue_read_stats(SCHED(p),
 			qid,
@@ -3202,7 +3226,7 @@ read_tc_stats(struct rte_eth_dev *dev,
 	uint32_t i;
 
 	/* Stats read */
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
 		struct rte_sched_queue_stats s;
 		uint16_t qlen;
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
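
The relaxed child-count rule in hierarchy_commit_check() above can be read as a small predicate. The following is only a minimal sketch under stated assumptions: BE_QUEUES_PER_PIPE is a hypothetical stand-in for RTE_SCHED_BE_QUEUES_PER_PIPE (its value of 4 is taken from the patch comment), and tc_children_valid() is an illustrative helper, not a function from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_SCHED_BE_QUEUES_PER_PIPE; the value 4
 * is assumed from the comment in the patch, not read from rte_sched.h.
 */
#define BE_QUEUES_PER_PIPE 4

/* Mirrors the condition in the patch: a TC node is accepted when it has
 * either exactly one child queue (strict priority TC) or exactly
 * BE_QUEUES_PER_PIPE child queues (best-effort TC).
 */
static bool
tc_children_valid(uint32_t n_children)
{
	return n_children == 1 || n_children == BE_QUEUES_PER_PIPE;
}
```

As written in the patch, the condition looks only at the child count; in this excerpt it does not verify that the 4-queue TC is the lowest priority one.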

* [dpdk-dev] [PATCH v3 08/11] test_sched: modify tests for config flexibility
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (6 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 07/11] net/softnic: add config flexibility to softnic tm Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 09/11] examples/ip_pipeline: add config flexibility to tm function Jasvinder Singh
                           ` (2 subsequent siblings)
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update unit tests for configuration flexibility of pipe traffic
classes and queue sizes.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 app/test/test_sched.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/app/test/test_sched.c b/app/test/test_sched.c
index 36fa2d425..797d8ac34 100644
--- a/app/test/test_sched.c
+++ b/app/test/test_sched.c
@@ -27,7 +27,9 @@ static struct rte_sched_subport_params subport_param[] = {
 		.tb_rate = 1250000000,
 		.tb_size = 1000000,
 
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000},
 		.tc_period = 10,
 	},
 };
@@ -37,7 +39,8 @@ static struct rte_sched_pipe_params pipe_profile[] = {
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175, 305175, 305175,
+			305175, 305175, 305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 
 		.wrr_weights = {1, 1, 1, 1},
@@ -51,9 +54,10 @@ static struct rte_sched_port_params port_param = {
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
 	.n_pipes_per_subport = 1024,
-	.qsize = {32, 32, 32, 32},
+	.qsize = {32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32},
 	.pipe_profiles = pipe_profile,
 	.n_pipe_profiles = 1,
+	.n_max_pipe_profiles = 1,
 };
 
 #define NB_MBUF          32
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
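
The initializers in the test above carry 13 tc_rate entries and 16 qsize entries, which is consistent with 12 single-queue strict priority TCs plus one best-effort TC holding 4 queues. The sketch below shows that arithmetic and a queue-to-TC mapping of the same shape as the one the qos_sched sample uses in get_pkt_sched(). The constant names and queue_to_tc() are hypothetical, and the values are assumptions inferred from the initializer lengths rather than read from rte_sched.h.

```c
#include <stdint.h>

/* Assumed values, inferred from the 13-entry tc_rate and 16-entry qsize
 * initializers in this patch series.
 */
enum {
	N_TC = 13,		/* traffic classes per pipe */
	N_BE_QUEUES = 4,	/* queues in the best-effort TC */
	N_QUEUES = (N_TC - 1) + N_BE_QUEUES	/* 12 + 4 = 16 */
};

/* Map a pipe-level queue index (0..15) to its traffic class:
 * queues 0..11 each belong to the TC with the same index, and
 * queues 12..15 all belong to the best-effort TC (index 12).
 */
static uint32_t
queue_to_tc(uint32_t queue)
{
	return queue > (N_TC - 1) ? (uint32_t)(N_TC - 1) : queue;
}
```

With these assumed values, queue 3 maps to TC 3 and queue 14 maps to TC 12.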

* [dpdk-dev] [PATCH v3 09/11] examples/ip_pipeline: add config flexibility to tm function
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (7 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 08/11] test_sched: modify tests for config flexibility Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 10/11] examples/qos_sched: add tc and queue config flexibility Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 11/11] sched: remove redundant macros Jasvinder Singh
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update ip pipeline sample app for configuration flexibility of
pipe traffic classes and queues.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 examples/ip_pipeline/cli.c             | 45 +++++++++++++++-----------
 examples/ip_pipeline/tmgr.c            |  2 +-
 examples/ip_pipeline/tmgr.h            |  4 +--
 lib/librte_pipeline/rte_table_action.c |  1 -
 lib/librte_pipeline/rte_table_action.h |  4 +--
 5 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/examples/ip_pipeline/cli.c b/examples/ip_pipeline/cli.c
index 309b2936e..b5e8b9bcf 100644
--- a/examples/ip_pipeline/cli.c
+++ b/examples/ip_pipeline/cli.c
@@ -377,7 +377,9 @@ cmd_swq(char **tokens,
 static const char cmd_tmgr_subport_profile_help[] =
 "tmgr subport profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate> <tc4_rate>"
+"        <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>"
+"        <tc9_rate> <tc10_rate> <tc11_rate> <tc12_rate>\n"
 "   <tc_period>\n";
 
 static void
@@ -389,7 +391,7 @@ cmd_tmgr_subport_profile(char **tokens,
 	struct rte_sched_subport_params p;
 	int status, i;
 
-	if (n_tokens != 10) {
+	if (n_tokens != 19) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -410,7 +412,7 @@ cmd_tmgr_subport_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[18]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
@@ -425,10 +427,12 @@ cmd_tmgr_subport_profile(char **tokens,
 static const char cmd_tmgr_pipe_profile_help[] =
 "tmgr pipe profile\n"
 "   <tb_rate> <tb_size>\n"
-"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate>\n"
+"   <tc0_rate> <tc1_rate> <tc2_rate> <tc3_rate> <tc4_rate>"
+"     <tc5_rate> <tc6_rate> <tc7_rate> <tc8_rate>"
+"     <tc9_rate> <tc10_rate> <tc11_rate> <tc12_rate>\n"
 "   <tc_period>\n"
 "   <tc_ov_weight>\n"
-"   <wrr_weight0..15>\n";
+"   <wrr_weight0..3>\n";
 
 static void
 cmd_tmgr_pipe_profile(char **tokens,
@@ -439,7 +443,7 @@ cmd_tmgr_pipe_profile(char **tokens,
 	struct rte_sched_pipe_params p;
 	int status, i;
 
-	if (n_tokens != 27) {
+	if (n_tokens != 24) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -460,20 +464,20 @@ cmd_tmgr_pipe_profile(char **tokens,
 			return;
 		}
 
-	if (parser_read_uint32(&p.tc_period, tokens[9]) != 0) {
+	if (parser_read_uint32(&p.tc_period, tokens[18]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_period");
 		return;
 	}
 
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-	if (parser_read_uint8(&p.tc_ov_weight, tokens[10]) != 0) {
+	if (parser_read_uint8(&p.tc_ov_weight, tokens[19]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "tc_ov_weight");
 		return;
 	}
 #endif
 
-	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
-		if (parser_read_uint8(&p.wrr_weights[i], tokens[11 + i]) != 0) {
+	for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++)
+		if (parser_read_uint8(&p.wrr_weights[i], tokens[20 + i]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "wrr_weights");
 			return;
 		}
@@ -490,7 +494,10 @@ static const char cmd_tmgr_help[] =
 "   rate <rate>\n"
 "   spp <n_subports_per_port>\n"
 "   pps <n_pipes_per_subport>\n"
-"   qsize <qsize_tc0> <qsize_tc1> <qsize_tc2> <qsize_tc3>\n"
+"   qsize <qsize_tc0> <qsize_tc1> <qsize_tc2> <qsize_tc3>"
+"   <qsize_tc4> <qsize_tc5> <qsize_tc6> <qsize_tc7>"
+"   <qsize_tc8> <qsize_tc9> <qsize_tc10> <qsize_tc11>"
+"   <qsize_tc12> <qsize_tc13> <qsize_tc14> <qsize_tc15>\n"
 "   fo <frame_overhead>\n"
 "   mtu <mtu>\n"
 "   cpu <cpu_id>\n";
@@ -506,7 +513,7 @@ cmd_tmgr(char **tokens,
 	struct tmgr_port *tmgr_port;
 	int i;
 
-	if (n_tokens != 19) {
+	if (n_tokens != 31) {
 		snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
 		return;
 	}
@@ -548,38 +555,38 @@ cmd_tmgr(char **tokens,
 		return;
 	}
 
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
 		if (parser_read_uint16(&p.qsize[i], tokens[9 + i]) != 0) {
 			snprintf(out, out_size, MSG_ARG_INVALID, "qsize");
 			return;
 		}
 
-	if (strcmp(tokens[13], "fo") != 0) {
+	if (strcmp(tokens[25], "fo") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "fo");
 		return;
 	}
 
-	if (parser_read_uint32(&p.frame_overhead, tokens[14]) != 0) {
+	if (parser_read_uint32(&p.frame_overhead, tokens[26]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "frame_overhead");
 		return;
 	}
 
-	if (strcmp(tokens[15], "mtu") != 0) {
+	if (strcmp(tokens[27], "mtu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "mtu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.mtu, tokens[16]) != 0) {
+	if (parser_read_uint32(&p.mtu, tokens[28]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "mtu");
 		return;
 	}
 
-	if (strcmp(tokens[17], "cpu") != 0) {
+	if (strcmp(tokens[29], "cpu") != 0) {
 		snprintf(out, out_size, MSG_ARG_NOT_FOUND, "cpu");
 		return;
 	}
 
-	if (parser_read_uint32(&p.cpu_id, tokens[18]) != 0) {
+	if (parser_read_uint32(&p.cpu_id, tokens[30]) != 0) {
 		snprintf(out, out_size, MSG_ARG_INVALID, "cpu_id");
 		return;
 	}
diff --git a/examples/ip_pipeline/tmgr.c b/examples/ip_pipeline/tmgr.c
index 40cbf1d0a..0a04ca4a6 100644
--- a/examples/ip_pipeline/tmgr.c
+++ b/examples/ip_pipeline/tmgr.c
@@ -105,7 +105,7 @@ tmgr_port_create(const char *name, struct tmgr_port_params *params)
 	p.n_subports_per_port = params->n_subports_per_port;
 	p.n_pipes_per_subport = params->n_pipes_per_subport;
 
-	for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+	for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
 		p.qsize[i] = params->qsize[i];
 
 	p.pipe_profiles = pipe_profile;
diff --git a/examples/ip_pipeline/tmgr.h b/examples/ip_pipeline/tmgr.h
index 0b497e795..aad96097d 100644
--- a/examples/ip_pipeline/tmgr.h
+++ b/examples/ip_pipeline/tmgr.h
@@ -39,11 +39,11 @@ tmgr_port_find(const char *name);
 struct tmgr_port_params {
 	uint32_t rate;
 	uint32_t n_subports_per_port;
-	uint32_t n_pipes_per_subport;
-	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t frame_overhead;
 	uint32_t mtu;
 	uint32_t cpu_id;
+	uint32_t n_pipes_per_subport;
+	uint16_t qsize[RTE_SCHED_QUEUES_PER_PIPE];
 };
 
 int
diff --git a/lib/librte_pipeline/rte_table_action.c b/lib/librte_pipeline/rte_table_action.c
index a54ec46bc..47d7efbc1 100644
--- a/lib/librte_pipeline/rte_table_action.c
+++ b/lib/librte_pipeline/rte_table_action.c
@@ -401,7 +401,6 @@ pkt_work_tm(struct rte_mbuf *mbuf,
 {
 	struct dscp_table_entry_data *dscp_entry = &dscp_table->entry[dscp];
 	uint32_t queue_id = data->queue_id |
-				(dscp_entry->tc << 2) |
 				dscp_entry->tc_queue;
 	rte_mbuf_sched_set(mbuf, queue_id, dscp_entry->tc,
 				(uint8_t)dscp_entry->color);
diff --git a/lib/librte_pipeline/rte_table_action.h b/lib/librte_pipeline/rte_table_action.h
index 44041b5c9..82bc9d9ac 100644
--- a/lib/librte_pipeline/rte_table_action.h
+++ b/lib/librte_pipeline/rte_table_action.h
@@ -181,10 +181,10 @@ struct rte_table_action_lb_params {
  * RTE_TABLE_ACTION_MTR
  */
 /** Max number of traffic classes (TCs). */
-#define RTE_TABLE_ACTION_TC_MAX                                  4
+#define RTE_TABLE_ACTION_TC_MAX                                  16
 
 /** Max number of queues per traffic class. */
-#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            4
+#define RTE_TABLE_ACTION_TC_QUEUE_MAX                            16
 
 /** Differentiated Services Code Point (DSCP) translation table entry. */
 struct rte_table_action_dscp_table_entry {
-- 
2.21.0


^ permalink raw reply	[flat|nested] 163+ messages in thread
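
The new token counts checked in cli.c (19, 24 and 31) follow from the number of TC rates, best-effort WRR weights and queue sizes on each command line. The sketch below reproduces that arithmetic. The enum names and helper functions are hypothetical, and the values 13, 4 and 16 are assumptions inferred from the help strings and token indices in the patch.

```c
#include <stdint.h>

/* Assumed values, inferred from the help strings in the patch. */
enum {
	N_TC = 13,		/* <tc0_rate> .. <tc12_rate> */
	N_BE_QUEUES = 4,	/* <wrr_weight0..3> */
	N_QUEUES = 16		/* 16 qsize values */
};

/* "tmgr subport profile" (3) + tb_rate, tb_size (2) + TC rates + tc_period */
static uint32_t
subport_profile_tokens(void)
{
	return 3 + 2 + N_TC + 1;
}

/* "tmgr pipe profile" (3) + tb_rate, tb_size (2) + TC rates + tc_period
 * + tc_ov_weight + best-effort WRR weights
 */
static uint32_t
pipe_profile_tokens(void)
{
	return 3 + 2 + N_TC + 1 + 1 + N_BE_QUEUES;
}

/* "tmgr <name> rate <r> spp <n> pps <n> qsize" (9) + queue sizes
 * + "fo <v> mtu <v> cpu <v>" (6)
 */
static uint32_t
tmgr_tokens(void)
{
	return 9 + N_QUEUES + 6;
}
```

Under the same reading, the first token after the queue sizes sits at index 9 + 16 = 25, which matches the position where the patch expects "fo".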

* [dpdk-dev] [PATCH v3 10/11] examples/qos_sched: add tc and queue config flexibility
  2019-07-11 10:26       ` [dpdk-dev] [PATCH v3 00/11] sched: feature enhancements Jasvinder Singh
                           ` (8 preceding siblings ...)
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 09/11] examples/ip_pipeline: add config flexibility to tm function Jasvinder Singh
@ 2019-07-11 10:26         ` Jasvinder Singh
  2019-07-11 10:26         ` [dpdk-dev] [PATCH v3 11/11] sched: remove redundant macros Jasvinder Singh
  10 siblings, 0 replies; 163+ messages in thread
From: Jasvinder Singh @ 2019-07-11 10:26 UTC (permalink / raw)
  To: dev; +Cc: cristian.dumitrescu, Abraham Tovar, Lukasz Krakowiak

Update qos sched sample app for configuration flexibility of
pipe traffic classes and queues.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
---
 examples/qos_sched/app_thread.c   |   9 +-
 examples/qos_sched/cfg_file.c     | 119 +++++---
 examples/qos_sched/init.c         |  63 +++-
 examples/qos_sched/main.h         |   4 +
 examples/qos_sched/profile.cfg    |  66 +++-
 examples/qos_sched/profile_ov.cfg |  54 +++-
 examples/qos_sched/stats.c        | 483 +++++++++++++++++-------------
 7 files changed, 517 insertions(+), 281 deletions(-)

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index e14b275e3..1ce3639ee 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -20,13 +20,11 @@
  * QoS parameters are encoded as follows:
  *		Outer VLAN ID defines subport
  *		Inner VLAN ID defines pipe
- *		Destination IP 0.0.XXX.0 defines traffic class
  *		Destination IP host (0.0.0.XXX) defines queue
  * Values below define offset to each field from start of frame
  */
 #define SUBPORT_OFFSET	7
 #define PIPE_OFFSET		9
-#define TC_OFFSET		20
 #define QUEUE_OFFSET	20
 #define COLOR_OFFSET	19
 
@@ -40,10 +38,9 @@ get_pkt_sched(struct rte_mbuf *m, uint32_t *subport, uint32_t *pipe,
 			(port_params.n_subports_per_port - 1); /* Outer VLAN ID*/
 	*pipe = (rte_be_to_cpu_16(pdata[PIPE_OFFSET]) & 0x0FFF) &
 			(port_params.n_pipes_per_subport - 1); /* Inner VLAN ID */
-	*traffic_class = (pdata[QUEUE_OFFSET] & 0x0F) &
-			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1); /* Destination IP */
-	*queue = ((pdata[QUEUE_OFFSET] >> 8) & 0x0F) &
-			(RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1) ; /* Destination IP */
+	*queue = active_queues[(pdata[QUEUE_OFFSET] >> 8) % n_active_queues];
+	*traffic_class = (*queue > (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) ?
+			(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) : *queue); /* Destination IP */
 	*color = pdata[COLOR_OFFSET] & 0x03; 	/* Destination IP */
 
 	return 0;
diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index 76ffffc4b..522de1aea 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -29,6 +29,9 @@ cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params
 	if (!cfg || !port_params)
 		return -1;
 
+	memset(active_queues, 0, sizeof(active_queues));
+	n_active_queues = 0;
+
 	entry = rte_cfgfile_get_entry(cfg, "port", "frame overhead");
 	if (entry)
 		port_params->frame_overhead = (uint32_t)atoi(entry);
@@ -45,8 +48,12 @@ cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params *port_params
 	if (entry) {
 		char *next;
 
-		for(j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
+		for (j = 0; j < RTE_SCHED_QUEUES_PER_PIPE; j++) {
 			port_params->qsize[j] = (uint16_t)strtol(entry, &next, 10);
+			if (port_params->qsize[j] != 0) {
+				active_queues[n_active_queues] = j;
+				n_active_queues++;
+			}
 			if (next == NULL)
 				break;
 			entry = next;
@@ -173,46 +180,52 @@ cfg_load_pipe(struct rte_cfgfile *cfg, struct rte_sched_pipe_params *pipe_params
 		if (entry)
 			pipe_params[j].tc_rate[3] = (uint32_t)atoi(entry);
 
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 4 rate");
+		if (entry)
+			pipe_params[j].tc_rate[4] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 5 rate");
+		if (entry)
+			pipe_params[j].tc_rate[5] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 6 rate");
+		if (entry)
+			pipe_params[j].tc_rate[6] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 7 rate");
+		if (entry)
+			pipe_params[j].tc_rate[7] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 8 rate");
+		if (entry)
+			pipe_params[j].tc_rate[8] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 9 rate");
+		if (entry)
+			pipe_params[j].tc_rate[9] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 10 rate");
+		if (entry)
+			pipe_params[j].tc_rate[10] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 11 rate");
+		if (entry)
+			pipe_params[j].tc_rate[11] = (uint32_t)atoi(entry);
+
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 12 rate");
+		if (entry)
+			pipe_params[j].tc_rate[12] = (uint32_t)atoi(entry);
+
 #ifdef RTE_SCHED_SUBPORT_TC_OV
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 oversubscription weight");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 12 oversubscription weight");
 		if (entry)
 			pipe_params[j].tc_ov_weight = (uint8_t)atoi(entry);
 #endif
 
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 0 wrr weights");
+		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 12 wrr weights");
 		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*0 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 1 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*1 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 2 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*2 + i] =
-					(uint8_t)strtol(entry, &next, 10);
-				if (next == NULL)
-					break;
-				entry = next;
-			}
-		}
-		entry = rte_cfgfile_get_entry(cfg, pipe_name, "tc 3 wrr weights");
-		if (entry) {
-			for(i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-				pipe_params[j].wrr_weights[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE*3 + i] =
+			for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+				pipe_params[j].wrr_weights[i] =
 					(uint8_t)strtol(entry, &next, 10);
 				if (next == NULL)
 					break;
@@ -267,6 +280,42 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
 			if (entry)
 				subport_params[i].tc_rate[3] = (uint32_t)atoi(entry);
 
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 4 rate");
+			if (entry)
+				subport_params[i].tc_rate[4] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 5 rate");
+			if (entry)
+				subport_params[i].tc_rate[5] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 6 rate");
+			if (entry)
+				subport_params[i].tc_rate[6] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 7 rate");
+			if (entry)
+				subport_params[i].tc_rate[7] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 8 rate");
+			if (entry)
+				subport_params[i].tc_rate[8] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 9 rate");
+			if (entry)
+				subport_params[i].tc_rate[9] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 10 rate");
+			if (entry)
+				subport_params[i].tc_rate[10] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 11 rate");
+			if (entry)
+				subport_params[i].tc_rate[11] = (uint32_t)atoi(entry);
+
+			entry = rte_cfgfile_get_entry(cfg, sec_name, "tc 12 rate");
+			if (entry)
+				subport_params[i].tc_rate[12] = (uint32_t)atoi(entry);
+
 			int n_entries = rte_cfgfile_section_num_entries(cfg, sec_name);
 			struct rte_cfgfile_entry entries[n_entries];
 
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index 6b63d4e0e..5fd2a38e4 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -170,17 +170,20 @@ static struct rte_sched_subport_params subport_params[MAX_SCHED_SUBPORTS] = {
 		.tb_rate = 1250000000,
 		.tb_size = 1000000,
 
-		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000},
+		.tc_rate = {1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000, 1250000000,
+			1250000000, 1250000000, 1250000000, 1250000000},
 		.tc_period = 10,
 	},
 };
 
-static struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT] = {
+static struct rte_sched_pipe_params pipe_profiles[MAX_SCHED_PIPE_PROFILES] = {
 	{ /* Profile #0 */
 		.tb_rate = 305175,
 		.tb_size = 1000000,
 
-		.tc_rate = {305175, 305175, 305175, 305175},
+		.tc_rate = {305175, 305175, 305175, 305175, 305175, 305175,
+			305175, 305175, 305175, 305175, 305175, 305175, 305175},
 		.tc_period = 40,
 #ifdef RTE_SCHED_SUBPORT_TC_OV
 		.tc_ov_weight = 1,
@@ -198,9 +201,10 @@ struct rte_sched_port_params port_params = {
 	.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
 	.n_subports_per_port = 1,
 	.n_pipes_per_subport = 4096,
-	.qsize = {64, 64, 64, 64},
+	.qsize = {64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64},
 	.pipe_profiles = pipe_profiles,
 	.n_pipe_profiles = sizeof(pipe_profiles) / sizeof(struct rte_sched_pipe_params),
+	.n_max_pipe_profiles = MAX_SCHED_PIPE_PROFILES,
 
 #ifdef RTE_SCHED_RED
 	.red_params = {
@@ -222,8 +226,53 @@ struct rte_sched_port_params port_params = {
 		/* Traffic Class 3 - Colors Green / Yellow / Red */
 		[3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
 		[3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
-		[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9}
-	}
+		[3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 4 - Colors Green / Yellow / Red */
+		[4][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[4][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[4][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 5 - Colors Green / Yellow / Red */
+		[5][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[5][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[5][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 6 - Colors Green / Yellow / Red */
+		[6][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[6][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[6][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 7 - Colors Green / Yellow / Red */
+		[7][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[7][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[7][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 8 - Colors Green / Yellow / Red */
+		[8][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[8][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[8][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 9 - Colors Green / Yellow / Red */
+		[9][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[9][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[9][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 10 - Colors Green / Yellow / Red */
+		[10][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[10][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[10][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 11 - Colors Green / Yellow / Red */
+		[11][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[11][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[11][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+		/* Traffic Class 12 - Colors Green / Yellow / Red */
+		[12][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[12][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+		[12][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+	},
 #endif /* RTE_SCHED_RED */
 };
 
@@ -255,7 +304,7 @@ app_init_sched_port(uint32_t portid, uint32_t socketid)
 					subport, err);
 		}
 
-		for (pipe = 0; pipe < port_params.n_pipes_per_subport; pipe ++) {
+		for (pipe = 0; pipe < port_params.n_pipes_per_subport; pipe++) {
 			if (app_pipe_to_profile[subport][pipe] != -1) {
 				err = rte_sched_pipe_config(port, subport, pipe,
 						app_pipe_to_profile[subport][pipe]);
diff --git a/examples/qos_sched/main.h b/examples/qos_sched/main.h
index 8a2741c58..d8f890b64 100644
--- a/examples/qos_sched/main.h
+++ b/examples/qos_sched/main.h
@@ -50,6 +50,7 @@ extern "C" {
 #define MAX_DATA_STREAMS (APP_MAX_LCORE/2)
 #define MAX_SCHED_SUBPORTS		8
 #define MAX_SCHED_PIPES		4096
+#define MAX_SCHED_PIPE_PROFILES		256
 
 #ifndef APP_COLLECT_STAT
 #define APP_COLLECT_STAT		1
@@ -147,6 +148,9 @@ extern struct burst_conf burst_conf;
 extern struct ring_thresh rx_thresh;
 extern struct ring_thresh tx_thresh;
 
+uint32_t active_queues[RTE_SCHED_QUEUES_PER_PIPE];
+uint32_t n_active_queues;
+
 extern struct rte_sched_port_params port_params;
 
 int app_parse_args(int argc, char **argv);
diff --git a/examples/qos_sched/profile.cfg b/examples/qos_sched/profile.cfg
index f5b704cc6..55fd7d1e0 100644
--- a/examples/qos_sched/profile.cfg
+++ b/examples/qos_sched/profile.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -33,12 +33,12 @@
 ; 10GbE output port:
 ;	* Single subport (subport 0):
 ;		- Subport rate set to 100% of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of port rate
+;		- Each of the 13 traffic classes has rate set to 100% of port rate
 ;	* 4K pipes per subport 0 (pipes 0 .. 4095) with identical configuration:
 ;		- Pipe rate set to 1/4K of port rate
-;		- Each of the 4 traffic classes has rate set to 100% of pipe rate
-;		- Within each traffic class, the byte-level WRR weights for the 4 queues
-;         are set to 1:1:1:1
+;		- Each of the 13 traffic classes has rate set to 100% of pipe rate
+;		- Within the lowest priority traffic class (best-effort), the byte-level
+;		  WRR weights for the 4 queues are set to 1:1:1:1
 ;
 ; For more details, please refer to chapter "Quality of Service (QoS) Framework"
 ; of Data Plane Development Kit (DPDK) Programmer's Guide.
@@ -48,7 +48,7 @@
 frame overhead = 24
 number of subports per port = 1
 number of pipes per subport = 4096
-queue sizes = 64 64 64 64
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
 
 ; Subport configuration
 [subport 0]
@@ -59,6 +59,16 @@ tc 0 rate = 1250000000         ; Bytes per second
 tc 1 rate = 1250000000         ; Bytes per second
 tc 2 rate = 1250000000         ; Bytes per second
 tc 3 rate = 1250000000         ; Bytes per second
+tc 4 rate = 1250000000         ; Bytes per second
+tc 5 rate = 1250000000         ; Bytes per second
+tc 6 rate = 1250000000         ; Bytes per second
+tc 7 rate = 1250000000         ; Bytes per second
+tc 8 rate = 1250000000         ; Bytes per second
+tc 9 rate = 1250000000         ; Bytes per second
+tc 10 rate = 1250000000        ; Bytes per second
+tc 11 rate = 1250000000        ; Bytes per second
+tc 12 rate = 1250000000        ; Bytes per second
+
 tc period = 10                 ; Milliseconds
 
 pipe 0-4095 = 0                ; These pipes are configured with pipe profile 0
@@ -72,14 +82,21 @@ tc 0 rate = 305175             ; Bytes per second
 tc 1 rate = 305175             ; Bytes per second
 tc 2 rate = 305175             ; Bytes per second
 tc 3 rate = 305175             ; Bytes per second
-tc period = 40                 ; Milliseconds
+tc 4 rate = 305175             ; Bytes per second
+tc 5 rate = 305175             ; Bytes per second
+tc 6 rate = 305175             ; Bytes per second
+tc 7 rate = 305175             ; Bytes per second
+tc 8 rate = 305175             ; Bytes per second
+tc 9 rate = 305175             ; Bytes per second
+tc 10 rate = 305175            ; Bytes per second
+tc 11 rate = 305175            ; Bytes per second
+tc 12 rate = 305175            ; Bytes per second
+
+tc period = 40                 ; Milliseconds
 
-tc 3 oversubscription weight = 1
+tc 12 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 12 wrr weights = 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -102,3 +119,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/profile_ov.cfg b/examples/qos_sched/profile_ov.cfg
index 33000df9e..d5d9b321e 100644
--- a/examples/qos_sched/profile_ov.cfg
+++ b/examples/qos_sched/profile_ov.cfg
@@ -1,6 +1,6 @@
 ;   BSD LICENSE
 ;
-;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   Copyright(c) 2010-2019 Intel Corporation. All rights reserved.
 ;   All rights reserved.
 ;
 ;   Redistribution and use in source and binary forms, with or without
@@ -34,7 +34,7 @@
 frame overhead = 24
 number of subports per port = 1
 number of pipes per subport = 32
-queue sizes = 64 64 64 64
+queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64
 
 ; Subport configuration
 [subport 0]
@@ -45,6 +45,15 @@ tc 0 rate = 8400000         ; Bytes per second
 tc 1 rate = 8400000         ; Bytes per second
 tc 2 rate = 8400000         ; Bytes per second
 tc 3 rate = 8400000         ; Bytes per second
+tc 4 rate = 8400000         ; Bytes per second
+tc 5 rate = 8400000         ; Bytes per second
+tc 6 rate = 8400000         ; Bytes per second
+tc 7 rate = 8400000         ; Bytes per second
+tc 8 rate = 8400000         ; Bytes per second
+tc 9 rate = 8400000         ; Bytes per second
+tc 10 rate = 8400000        ; Bytes per second
+tc 11 rate = 8400000        ; Bytes per second
+tc 12 rate = 8400000        ; Bytes per second
 tc period = 10              ; Milliseconds
 
 pipe 0-31 = 0               ; These pipes are configured with pipe profile 0
@@ -58,14 +67,20 @@ tc 0 rate = 16800000           ; Bytes per second
 tc 1 rate = 16800000           ; Bytes per second
 tc 2 rate = 16800000           ; Bytes per second
 tc 3 rate = 16800000           ; Bytes per second
+tc 4 rate = 16800000           ; Bytes per second
+tc 5 rate = 16800000           ; Bytes per second
+tc 6 rate = 16800000           ; Bytes per second
+tc 7 rate = 16800000           ; Bytes per second
+tc 8 rate = 16800000           ; Bytes per second
+tc 9 rate = 16800000           ; Bytes per second
+tc 10 rate = 16800000          ; Bytes per second
+tc 11 rate = 16800000          ; Bytes per second
+tc 12 rate = 16800000          ; Bytes per second
 tc period = 28                 ; Milliseconds
 
-tc 3 oversubscription weight = 1
+tc 12 oversubscription weight = 1
 
-tc 0 wrr weights = 1 1 1 1
-tc 1 wrr weights = 1 1 1 1
-tc 2 wrr weights = 1 1 1 1
-tc 3 wrr weights = 1 1 1 1
+tc 12 wrr weights = 1 1 1 1
 
 ; RED params per traffic class and color (Green / Yellow / Red)
 [red]
@@ -88,3 +103,28 @@ tc 3 wred min = 48 40 32
 tc 3 wred max = 64 64 64
 tc 3 wred inv prob = 10 10 10
 tc 3 wred weight = 9 9 9
+
+tc 4 wred min = 48 40 32
+tc 4 wred max = 64 64 64
+tc 4 wred inv prob = 10 10 10
+tc 4 wred weight = 9 9 9
+
+tc 5 wred min = 48 40 32
+tc 5 wred max = 64 64 64
+tc 5 wred inv prob = 10 10 10
+tc 5 wred weight = 9 9 9
+
+tc 6 wred min = 48 40 32
+tc 6 wred max = 64 64 64
+tc 6 wred inv prob = 10 10 10
+tc 6 wred weight = 9 9 9
+
+tc 7 wred min = 48 40 32
+tc 7 wred max = 64 64 64
+tc 7 wred inv prob = 10 10 10
+tc 7 wred weight = 9 9 9
+
+tc 8 wred min = 48 40 32
+tc 8 wred max = 64 64 64
+tc 8 wred inv prob = 10 10 10
+tc 8 wred weight = 9 9 9
diff --git a/examples/qos_sched/stats.c b/examples/qos_sched/stats.c
index 8193d964c..4f5fdda47 100644
--- a/examples/qos_sched/stats.c
+++ b/examples/qos_sched/stats.c
@@ -11,278 +11,333 @@ int
 qavg_q(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id, uint8_t tc,
 		uint8_t q)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i;
-        uint32_t average;
-
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE || q >= RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
-                return -1;
-
-        port = qos_conf[i].sched_port;
-
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
-        queue_id = queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + q);
-
-        average = 0;
-
-        for (count = 0; count < qavg_ntimes; count++) {
-                rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
-                average += qlen;
-                usleep(qavg_period);
-        }
-
-        average /= qavg_ntimes;
-
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
-
-        return 0;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t count, i, queue_id = 0;
+	uint32_t average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= port_params.n_pipes_per_subport  ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE ||
+		q >= RTE_SCHED_BE_QUEUES_PER_PIPE ||
+		(tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1 && q > 0))
+		return -1;
+
+	port = qos_conf[i].sched_port;
+	for (i = 0; i < subport_id; i++)
+		queue_id += port_params.n_pipes_per_subport *
+				RTE_SCHED_QUEUES_PER_PIPE;
+	if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1)
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
+	else
+		queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc + q;
+
+	average = 0;
+	for (count = 0; count < qavg_ntimes; count++) {
+		rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+		average += qlen;
+		usleep(qavg_period);
+	}
+
+	average /= qavg_ntimes;
+
+	printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+
+	return 0;
 }
 
 int
 qavg_tcpipe(uint16_t port_id, uint32_t subport_id, uint32_t pipe_id,
-	     uint8_t tc)
+		uint8_t tc)
 {
-        struct rte_sched_queue_stats stats;
-        struct rte_sched_port *port;
-        uint16_t qlen;
-        uint32_t queue_id, count, i;
-        uint32_t average, part_average;
+	struct rte_sched_queue_stats stats;
+	struct rte_sched_port *port;
+	uint16_t qlen;
+	uint32_t count, i, queue_id = 0;
+	uint32_t average, part_average;
+
+	for (i = 0; i < nb_pfc; i++) {
+		if (qos_conf[i].tx_port == port_id)
+			break;
+	}
+
+	if (i == nb_pfc || subport_id >= port_params.n_subports_per_port ||
+		pipe_id >= port_params.n_pipes_per_subport ||
+		tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+		return -1;
+
+	port = qos_conf[i].sched_port;
 
-        for (i = 0; i < nb_pfc; i++) {
-                if (qos_conf[i].tx_port == port_id)
-                        break;
-        }
-        if (i == nb_pfc || subport_id >= port_params.n_subports_per_port || pipe_id >= port_params.n_pipes_per_subport
-                        || tc >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
-                return -1;
+	for (i = 0; i < subport_id; i++)
+		queue_id += port_params.n_pipes_per_subport * RTE_SCHED_QUEUES_PER_PIPE;
 
-        port = qos_conf[i].sched_port;
+	queue_id += pipe_id * RTE_SCHED_QUEUES_PER_PIPE + tc;
 
-        queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id * port_params.n_pipes_per_subport + pipe_id);
+	average = 0;
 
-        average = 0;
+	for (count = 0; count < qavg_ntimes; count++) {
+		part_average = 0;
 
-        for (count = 0; count < qavg_ntimes; count++) {
-                part_average = 0;
-                for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
-                        rte_sched_queue_read_stats(port, queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + i), &stats, &qlen);
-                        part_average += qlen;
-                }
-                average += part_average / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
-                usleep(qavg_period);
-        }
+		if (tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1) {
+			rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
+			average += qlen;
+		} else {
+			for (i = 0; i < RTE_SCHED_BE_QUEUES_PER_PIPE; i++) {
+				rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
+				part_average += qlen;
+			}
+			average += part_average / RTE_SCHED_BE_QUEUES_PER_PIPE;
+		}
+		usleep(qavg_period);
+	}
 
-        average /= qavg_ntimes;
+	average /= qavg_ntimes;
 
-        printf("\nAverage queue size: %" PRIu32 " bytes.\n\n", average);
+	printf("\nAverage queue size: