* [PATCH v3 00/14] mlx5 Rx tunnel offloading
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

v3:
- Refactor 16 Verbs priority detection.
- Other updates according to ML discussion.
v2:
- Split into 2 series: public API and mlx5; this one is the second.
- Rebased on Adrien's rte flow overhaul:
  http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
v1:
- Support new tunnel types MPLS-in-GRE and MPLS-in-UDP
- Remove deprecation notes of RSS level

This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner packet type offloading for the following tunnel types (a sample testpmd rule is shown after the list):
- Standard VXLAN
- L3 VXLAN (no inner ethernet header)
- VXLAN-GPE
- MPLS-in-GRE
- MPLS-in-UDP
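
For instance, once the series is applied, a VXLAN flow with inner RSS
could be created through testpmd as follows (queue numbers are
illustrative):

  flow create 0 ingress pattern eth / ipv4 / udp / vxlan / end
actions rss queues 1 2 end level 1 / end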

Xueming Li (14):
  net/mlx5: support 16 hardware priorities
  net/mlx5: support GRE tunnel flow
  net/mlx5: support L3 VXLAN flow
  net/mlx5: support Rx tunnel type identification
  net/mlx5: cleanup tunnel checksum offloads
  net/mlx5: split flow RSS handling logic
  net/mlx5: support tunnel RSS level
  net/mlx5: add hardware flow debug dump
  net/mlx5: introduce VXLAN-GPE tunnel type
  net/mlx5: allow flow tunnel ID 0 with outer pattern
  net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  doc: update mlx5 guide on tunnel offloading
  net/mlx5: fix invalid flow item check
  net/mlx5: support RSS configuration in isolated mode

 doc/guides/nics/mlx5.rst              |   4 +-
 drivers/net/mlx5/Makefile             |   7 +-
 drivers/net/mlx5/mlx5.c               |  37 ++
 drivers/net/mlx5/mlx5.h               |   6 +
 drivers/net/mlx5/mlx5_flow.c          | 943 ++++++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_glue.c          |  16 +
 drivers/net/mlx5/mlx5_glue.h          |   8 +
 drivers/net/mlx5/mlx5_rxq.c           |  90 +++-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
 drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
 drivers/net/mlx5/mlx5_trigger.c       |   8 -
 drivers/net/mlx5/mlx5_utils.h         |   6 +
 14 files changed, 983 insertions(+), 224 deletions(-)

-- 
2.13.3

* [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports the new 16 Verbs flow priorities by trying to create
a simple flow with priority 15. If 16 priorities are not available, the
driver falls back to the traditional 8 priorities.

Verbs priority mapping:
			8 priorities	>=16 priorities
Control flow:		4-7		8-15
User normal flow:	1-3		4-7
User tunnel flow:	0-2		0-3
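
The resulting mapping can be sketched in C as follows (illustrative
only, simplified from mlx5_flow_update_priority() below; the matching
hash Rx queue's flow_priority is then added on top):

  uint16_t
  verbs_prio(uint16_t attr_prio, int tunnel, unsigned int max_prio)
  {
          uint16_t prio = attr_prio * 8; /* MLX5_VERBS_FLOW_PRIO_8 */

          if (max_prio == 8)
                  prio /= 2;
          if (!tunnel) /* Non-tunnel flows match after tunnel flows. */
                  prio += (max_prio == 8) ? 1 : 8 / 2;
          return prio;
  }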

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |  18 +++++++
 drivers/net/mlx5/mlx5.h         |   5 ++
 drivers/net/mlx5/mlx5_flow.c    | 112 +++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_trigger.c |   8 ---
 4 files changed, 115 insertions(+), 28 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index cfab55897..38118e524 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		priv->txqs_n = 0;
 		priv->txqs = NULL;
 	}
+	mlx5_flow_delete_drop_queue(dev);
 	if (priv->pd != NULL) {
 		assert(priv->ctx != NULL);
 		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
@@ -612,6 +613,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	unsigned int mps;
 	unsigned int cqe_comp;
 	unsigned int tunnel_en = 0;
+	unsigned int verb_priorities = 0;
 	int idx;
 	int i;
 	struct mlx5dv_context attrs_out = {0};
@@ -993,6 +995,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		mlx5_set_link_up(eth_dev);
 		/* Store device configuration on private structure. */
 		priv->config = config;
+		/* Create drop queue. */
+		err = mlx5_flow_create_drop_queue(eth_dev);
+		if (err) {
+			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
+				eth_dev->data->port_id, strerror(rte_errno));
+			goto port_error;
+		}
+		/* Supported Verbs flow priority number detection. */
+		if (verb_priorities == 0)
+			verb_priorities = priv_get_max_verbs_prio(eth_dev);
+		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
+			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
+				eth_dev->data->port_id, verb_priorities);
+			goto port_error;
+		}
+		priv->config.max_verb_prio = verb_priorities;
 		continue;
 port_error:
 		if (priv)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 63b24e6bb..6e4613fe0 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -89,6 +89,7 @@ struct mlx5_dev_config {
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
+	unsigned int max_verb_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
 	int txq_inline; /* Maximum packet size for inlining. */
@@ -105,6 +106,9 @@ enum mlx5_verbs_alloc_type {
 	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
 };
 
+/* 8 Verbs priorities. */
+#define MLX5_VERBS_FLOW_PRIO_8 8
+
 /**
  * Verbs allocator needs a context to know in the callback which kind of
  * resources it is allocating.
@@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
 
 /* mlx5_flow.c */
 
+unsigned int priv_get_max_verbs_prio(struct rte_eth_dev *dev);
 int mlx5_flow_validate(struct rte_eth_dev *dev,
 		       const struct rte_flow_attr *attr,
 		       const struct rte_flow_item items[],
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 288610620..5c4f0b586 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -32,8 +32,8 @@
 #include "mlx5_prm.h"
 #include "mlx5_glue.h"
 
-/* Define minimal priority for control plane flows. */
-#define MLX5_CTRL_FLOW_PRIORITY 4
+/* Flow priority for control plane flows. */
+#define MLX5_CTRL_FLOW_PRIORITY 1
 
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
@@ -129,7 +129,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_UDPV4] = {
@@ -138,7 +138,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_IPV4] = {
@@ -146,7 +146,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV4),
 		.dpdk_rss_hf = (ETH_RSS_IPV4 |
 				ETH_RSS_FRAG_IPV4),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_TCPV6] = {
@@ -155,7 +155,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_UDPV6] = {
@@ -164,7 +164,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_IPV6] = {
@@ -172,13 +172,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV6),
 		.dpdk_rss_hf = (ETH_RSS_IPV6 |
 				ETH_RSS_FRAG_IPV6),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_ETH] = {
 		.hash_fields = 0,
 		.dpdk_rss_hf = 0,
-		.flow_priority = 3,
+		.flow_priority = 2,
 	},
 };
 
@@ -900,30 +900,50 @@ mlx5_flow_convert_allocate(unsigned int size, struct rte_flow_error *error)
  * Make inner packet matching with an higher priority from the non Inner
  * matching.
  *
+ * @param dev
+ *   Pointer to Ethernet device.
  * @param[in, out] parser
  *   Internal parser structure.
  * @param attr
  *   User flow attribute.
  */
 static void
-mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
+mlx5_flow_update_priority(struct rte_eth_dev *dev,
+			  struct mlx5_flow_parse *parser,
 			  const struct rte_flow_attr *attr)
 {
+	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
+	uint16_t priority;
 
+	/*			8 priorities	>= 16 priorities
+	 * Control flow:	4-7		8-15
+	 * User normal flow:	1-3		4-7
+	 * User tunnel flow:	0-2		0-3
+	 */
+	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
+	if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
+		priority /= 2;
+	/*
+	 * Lower the priority of non-tunnel flows by 1 if only 8 Verbs
+	 * priorities are supported, by 4 otherwise.
+	 */
+	if (!parser->inner) {
+		if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
+			priority += 1;
+		else
+			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
+	}
 	if (parser->drop) {
-		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
-			attr->priority +
-			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
+		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
+				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
 		return;
 	}
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		if (parser->queue[i].ibv_attr) {
-			parser->queue[i].ibv_attr->priority =
-				attr->priority +
-				hash_rxq_init[i].flow_priority -
-				(parser->inner ? 1 : 0);
-		}
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		parser->queue[i].ibv_attr->priority = priority +
+				hash_rxq_init[i].flow_priority;
 	}
 }
 
@@ -1158,7 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	 */
 	if (!parser->drop)
 		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(parser, attr);
+	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
@@ -3161,3 +3181,55 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 	}
 	return 0;
 }
+
+/**
+ * Detect number of Verbs flow priorities supported.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   Number of supported Verbs flow priorities.
+ */
+unsigned int
+priv_get_max_verbs_prio(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
+	struct {
+		struct ibv_flow_attr attr;
+		struct ibv_flow_spec_eth eth;
+		struct ibv_flow_spec_action_drop drop;
+	} flow_attr = {
+		.attr = {
+			.num_of_specs = 2,
+		},
+		.eth = {
+			.type = IBV_FLOW_SPEC_ETH,
+			.size = sizeof(struct ibv_flow_spec_eth),
+		},
+		.drop = {
+			.size = sizeof(struct ibv_flow_spec_action_drop),
+			.type = IBV_FLOW_SPEC_ACTION_DROP,
+		},
+	};
+	struct ibv_flow *flow;
+
+	do {
+		flow_attr.attr.priority = verb_priorities - 1;
+		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
+					      &flow_attr.attr);
+		if (flow) {
+			claim_zero(mlx5_glue->destroy_flow(flow));
+			/* Try more priorities. */
+			verb_priorities *= 2;
+		} else {
+			/* Failed, fall back to the last working value. */
+			verb_priorities /= 2;
+			break;
+		}
+	} while (1);
+	DRV_LOG(INFO, "port %u Verbs flow priorities: %d",
+		dev->data->port_id, verb_priorities);
+	return verb_priorities;
+}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 6bb4ffb14..d80a2e688 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	int ret;
 
 	dev->data->dev_started = 1;
-	ret = mlx5_flow_create_drop_queue(dev);
-	if (ret) {
-		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
 	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
 		dev->data->port_id);
 	rte_mempool_walk(mlx5_mp2mr_iter, priv);
@@ -202,7 +196,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	mlx5_traffic_disable(dev);
 	mlx5_txq_stop(dev);
 	mlx5_rxq_stop(dev);
-	mlx5_flow_delete_drop_queue(dev);
 	rte_errno = ret; /* Restore rte_errno. */
 	return -rte_errno;
 }
@@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	mlx5_rxq_stop(dev);
 	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
 		mlx5_mr_release(mr);
-	mlx5_flow_delete_drop_queue(dev);
 }
 
 /**
-- 
2.13.3

* [PATCH v3 02/14] net/mlx5: support GRE tunnel flow
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Add support for flows of the GRE tunnel type.
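
For example, such a flow could then be created through testpmd (queue
number illustrative):

  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions queue index 1 / end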

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 69 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 62 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 5c4f0b586..2aae988f2 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -90,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
 
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item,
+		       const void *default_mask,
+		       struct mlx5_flow_data *data);
+
 struct mlx5_flow_parse;
 
 static void
@@ -232,6 +237,10 @@ struct rte_flow {
 		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
 	}
 
+#define IS_TUNNEL(type) ( \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_GRE)
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -285,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
 static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
-			       RTE_FLOW_ITEM_TYPE_VXLAN),
+			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
@@ -317,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV4] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv4){
 			.hdr = {
@@ -334,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV6] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv6){
 			.hdr = {
@@ -387,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_tcp,
 		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
 	},
+	[RTE_FLOW_ITEM_TYPE_GRE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_gre){
+			.protocol = -1,
+		},
+		.default_mask = &rte_flow_item_gre_mask,
+		.mask_sz = sizeof(struct rte_flow_item_gre),
+		.convert = mlx5_flow_create_gre,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
 		.actions = valid_actions,
@@ -402,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 
 /** Structure to pass to the conversion function. */
 struct mlx5_flow_parse {
-	uint32_t inner; /**< Set once VXLAN is encountered. */
+	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
 	uint32_t create:1;
 	/**< Whether resources should remain after a validate. */
 	uint32_t drop:1; /**< Target is a drop queue. */
@@ -830,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 					      cur_item->mask_sz);
 		if (ret)
 			goto exit_item_not_supported;
-		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+		if (IS_TUNNEL(items->type)) {
 			if (parser->inner) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
-						   "cannot recognize multiple"
-						   " VXLAN encapsulations");
+						   "Cannot recognize multiple"
+						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
@@ -1644,6 +1669,36 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert GRE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
+		     const void *default_mask __rte_unused,
+		     struct mlx5_flow_data *data)
+{
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel tunnel = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	mlx5_flow_create_copy(parser, &tunnel, size);
+	return 0;
+}
+
+/**
  * Convert mark/flag action to Verbs specification.
  *
  * @param parser
-- 
2.13.3

* [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports L3 VXLAN, which carries no inner L2 header compared
to the standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP
destination port to discriminate it from standard VXLAN, and the
firmware has to be configured to support it:
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
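
Assuming the firmware was configured with IP_OVER_VXLAN_PORT=4790
(value illustrative), an L3 VXLAN flow could then be created through
testpmd:

  flow create 0 ingress pattern eth / ipv4 / udp dst is 4790 / vxlan / ipv4 / end
actions queue index 1 / end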

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 2aae988f2..644f26a95 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4, /* L3 VXLAN. */
+			       RTE_FLOW_ITEM_TYPE_IPV6), /* L3 VXLAN. */
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_vxlan){
 			.vni = "\xff\xff\xff",
-- 
2.13.3

* [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch introduces tunnel type identification based on flow rules.
If flows of multiple tunnel types are built on the same queue,
RTE_PTYPE_TUNNEL_MASK is returned as the packet type, and the user
application can use bits in the flow mark as a tunnel type identifier.
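
A receive-side sketch (illustrative only; pkt is the received mbuf):

  uint32_t tun = pkt->packet_type & RTE_PTYPE_TUNNEL_MASK;

  if (tun == RTE_PTYPE_TUNNEL_MASK) {
          /* Rules of several tunnel types share this queue: recover
           * the exact type from the flow mark set by the rule. */
  } else if (tun == RTE_PTYPE_TUNNEL_VXLAN) {
          /* VXLAN-encapsulated packet. */
  } else if (tun == RTE_PTYPE_TUNNEL_GRE) {
          /* GRE-encapsulated packet. */
  }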

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          | 127 +++++++++++++++++++++++++++++-----
 drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
 drivers/net/mlx5/mlx5_rxtx.c          |  12 ++--
 drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +++--
 6 files changed, 159 insertions(+), 38 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 644f26a95..7d04b4d65 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -225,6 +225,7 @@ struct rte_flow {
 	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
 	uint16_t (*queues)[]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
 	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats. */
 	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
@@ -241,6 +242,19 @@ struct rte_flow {
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
+const uint32_t flow_ptype[] = {
+	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
+};
+
+#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
+
+const uint32_t ptype_ext[] = {
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
+					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
+};
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -440,6 +454,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
 		struct ibv_flow_attr *ibv_attr;
@@ -858,7 +873,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 		if (ret)
 			goto exit_item_not_supported;
 		if (IS_TUNNEL(items->type)) {
-			if (parser->inner) {
+			if (parser->tunnel) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
@@ -867,6 +882,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
+			parser->tunnel = flow_ptype[items->type];
 		}
 		if (parser->drop) {
 			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
@@ -1175,6 +1191,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	}
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
+	parser->tunnel = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.parser = parser,
@@ -1643,6 +1660,7 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1696,6 +1714,7 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 	};
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
 	mlx5_flow_create_copy(parser, &tunnel, size);
 	return 0;
 }
@@ -1874,7 +1893,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
@@ -1883,7 +1903,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -1895,6 +1916,40 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 }
 
 /**
+ * RXQ update after flow rule creation.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the flow rule.
+ */
+static void
+mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+
+	if (!dev->data->dev_started)
+		return;
+	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		rxq_data->mark |= flow->mark;
+		if (!tunnel)
+			continue;
+		rxq_ctrl->tunnel_types[tunnel] += 1;
+		if (rxq_data->tunnel != flow->tunnel)
+			rxq_data->tunnel = rxq_data->tunnel ?
+					   RTE_PTYPE_TUNNEL_MASK :
+					   flow->tunnel;
+	}
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -1954,12 +2009,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 				   NULL, "internal error in flow creation");
 		goto error;
 	}
-	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
-		struct mlx5_rxq_data *q =
-			(*priv->rxqs)[parser->rss_conf.queue[i]];
-
-		q->mark |= parser->mark;
-	}
+	mlx5_flow_create_update_rxqs(dev, flow);
 	return 0;
 error:
 	ret = rte_errno; /* Save rte_errno before cleanup. */
@@ -2032,6 +2082,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
 	}
 	/* Copy configuration. */
 	flow->queues = (uint16_t (*)[])(flow + 1);
+	flow->tunnel = parser.tunnel;
 	flow->rss_conf = (struct rte_flow_action_rss){
 		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
 		.level = 0,
@@ -2123,9 +2174,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
 	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
 
-	if (flow->drop || !flow->mark)
+	if (flow->drop || !dev->data->dev_started)
 		goto free;
-	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
+		/* Update queue tunnel type. */
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
+		rxq_ctrl->tunnel_types[tunnel] -= 1;
+		if (!rxq_ctrl->tunnel_types[tunnel]) {
+			/* Update tunnel type. */
+			uint8_t j;
+			uint8_t types = 0;
+			uint8_t last;
+
+			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
+				if (rxq_ctrl->tunnel_types[j]) {
+					types += 1;
+					last = j;
+				}
+			/* Keep the value if more than one tunnel type is left. */
+			if (types == 1)
+				rxq_data->tunnel = ptype_ext[last];
+			else if (types == 0)
+				/* No tunnel type left. */
+				rxq_data->tunnel = 0;
+		}
+	}
+	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
 		struct rte_flow *tmp;
 		int mark = 0;
 
@@ -2344,9 +2424,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct rte_flow *flow;
+	unsigned int i;
 
 	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
-		unsigned int i;
 		struct mlx5_ind_table_ibv *ind_tbl = NULL;
 
 		if (flow->drop) {
@@ -2392,6 +2472,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
 			(void *)flow);
 	}
+	/* Cleanup Rx queue tunnel info. */
+	for (i = 0; i != priv->rxqs_n; ++i) {
+		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(q, struct mlx5_rxq_ctrl, rxq);
+
+		if (!q)
+			continue;
+		memset((void *)rxq_ctrl->tunnel_types, 0,
+		       sizeof(rxq_ctrl->tunnel_types));
+		q->tunnel = 0;
+	}
 }
 
 /**
@@ -2439,7 +2531,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
@@ -2447,7 +2540,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
@@ -2469,10 +2563,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			DRV_LOG(DEBUG, "port %u flow %p applied",
 				dev->data->port_id, (void *)flow);
 		}
-		if (!flow->mark)
-			continue;
-		for (i = 0; i != flow->rss_conf.queue_num; ++i)
-			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
+		mlx5_flow_create_update_rxqs(dev, flow);
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1e4354ab3..351acfc0f 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->qp = qp;
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
+	hrxq->tunnel = tunnel;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->hash_fields != hash_fields)
 			continue;
+		if (hrxq->tunnel != tunnel)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 1f422c70b..d061dfc8a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -34,7 +34,7 @@
 #include "mlx5_prm.h"
 
 static __rte_always_inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
 
 static __rte_always_inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
@@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
 	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_L4_UDP;
 	/* Tunneled - L3 */
+	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
 	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
 	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
+	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
 	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
@@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 /**
  * Translate RX completion flags to packet type.
  *
+ * @param[in] rxq
+ *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   Packet type for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 {
 	uint8_t idx;
 	uint8_t pinfo = cqe->pkt_info;
@@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
 	 * bit[7] = outer_l3_type
 	 */
 	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
-	return mlx5_ptype_table[idx];
+	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
 }
 
 /**
@@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
 			/* Update packet information. */
-			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
+			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
 			pkt->ol_flags = 0;
 			if (rss_hash_res && rxq->rss_hash) {
 				pkt->hash.rss = rss_hash_res;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index a702cb603..6866f6818 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -104,6 +104,7 @@ struct mlx5_rxq_data {
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
 	uint8_t cq_arm_sn; /* CQ arm seq number. */
+	uint32_t tunnel; /* Tunnel information. */
 } __rte_cache_aligned;
 
 /* Verbs Rx queue elements. */
@@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
 	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
 	struct mlx5_rxq_data rxq; /* Data path structure. */
 	unsigned int socket; /* CPU socket ID for allocations. */
+	uint32_t tunnel_types[16]; /* Tunnel type counter. */
 	unsigned int irq:1; /* Whether IRQ is enabled. */
 	uint16_t idx; /* Queue index. */
 };
@@ -145,6 +147,7 @@ struct mlx5_hrxq {
 	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
+	uint32_t tunnel; /* Tunnel type. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
 struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index bbe1818ef..9f9136108 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
 	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	if (rxq->mark) {
 		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
@@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	ptype = vshrn_n_u32(ptype_info, 10);
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	ptype = vorr_u16(ptype, op_err);
-	pkts[0]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
-	pkts[1]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
-	pkts[2]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
-	pkts[3]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
+	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
+	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
+	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
+	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
 	pinfo = vreinterpretq_u32_u8(
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index c088bcb51..d2492481d 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	const __m128i mbuf_init =
 		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	/* Extract pkt_info field. */
 	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
@@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	op_err = _mm_srli_epi16(op_err, 8);
 	ptype = _mm_or_si128(ptype, op_err);
-	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
-	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
-	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
-	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
+	pt_idx0 = _mm_extract_epi8(ptype, 0);
+	pt_idx1 = _mm_extract_epi8(ptype, 2);
+	pt_idx2 = _mm_extract_epi8(ptype, 4);
+	pt_idx3 = _mm_extract_epi8(ptype, 6);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
 	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
-- 
2.13.3

* [PATCH v3 05/14] net/mlx5: cleanup tunnel checksum offloads
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch cleans up the tunnel checksum offloads.

Once a tunnel packet type (RTE_PTYPE_TUNNEL_xxx) is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum
result of the inner headers; the outer L3 and L4 header checksums are
always valid as soon as a tunnel is identified. If no tunnel is
identified, PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the
checksum result of the outer L3 and L4 headers.
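
Under these semantics, an application check could look as follows
(illustrative sketch; pkt is the received mbuf):

  if (pkt->packet_type & RTE_PTYPE_TUNNEL_MASK) {
          /* Outer L3/L4 are known-good once a tunnel is identified;
           * the flags below refer to the inner headers. */
          if ((pkt->ol_flags & PKT_RX_IP_CKSUM_GOOD) &&
              (pkt->ol_flags & PKT_RX_L4_CKSUM_GOOD)) {
                  /* Inner checksums validated. */
          }
  }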

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 --
 drivers/net/mlx5/mlx5_rxtx.c | 18 ++++--------------
 drivers/net/mlx5/mlx5_rxtx.h |  1 -
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 351acfc0f..073732e16 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1045,8 +1045,6 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	}
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM);
-	tmpl->rxq.csum_l2tun = (!!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM) &&
-				priv->config.tunnel_en);
 	tmpl->rxq.hw_timestamp = !!(conf->offloads & DEV_RX_OFFLOAD_TIMESTAMP);
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(conf->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index d061dfc8a..285b2dbf0 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -41,7 +41,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 		 uint16_t cqe_cnt, uint32_t *rss_hash);
 
 static __rte_always_inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
 	[0xff] = RTE_PTYPE_ALL_MASK, /* Last entry for errored packet. */
@@ -1728,8 +1728,6 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 /**
  * Translate RX completion flags to offload flags.
  *
- * @param[in] rxq
- *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1737,7 +1735,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
  *   Offload flags (ol_flags) for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe)
 {
 	uint32_t ol_flags = 0;
 	uint16_t flags = rte_be_to_cpu_16(cqe->hdr_type_etc);
@@ -1749,14 +1747,6 @@ rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 		TRANSPOSE(flags,
 			  MLX5_CQE_RX_L4_HDR_VALID,
 			  PKT_RX_L4_CKSUM_GOOD);
-	if ((cqe->pkt_info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
-		ol_flags |=
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L3_HDR_VALID,
-				  PKT_RX_IP_CKSUM_GOOD) |
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L4_HDR_VALID,
-				  PKT_RX_L4_CKSUM_GOOD);
 	return ol_flags;
 }
 
@@ -1855,8 +1845,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						mlx5_flow_mark_get(mark);
 				}
 			}
-			if (rxq->csum | rxq->csum_l2tun)
-				pkt->ol_flags |= rxq_cq_to_ol_flags(rxq, cqe);
+			if (rxq->csum)
+				pkt->ol_flags |= rxq_cq_to_ol_flags(cqe);
 			if (rxq->vlan_strip &&
 			    (cqe->hdr_type_etc &
 			     rte_cpu_to_be_16(MLX5_CQE_VLAN_STRIPPED))) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6866f6818..d35605b55 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -77,7 +77,6 @@ struct rxq_zip {
 /* RX queue descriptor. */
 struct mlx5_rxq_data {
 	unsigned int csum:1; /* Enable checksum offloading. */
-	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_timestamp:1; /* Enable HW timestamp. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	unsigned int crc_present:1; /* CRC must be subtracted. */
-- 
2.13.3

* [PATCH v3 06/14] net/mlx5: split flow RSS handling logic
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch splits the flow RSS hash field handling logic out into a
dedicated function.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 117 ++++++++++++++++++++++++-------------
 1 file changed, 65 insertions(+), 52 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 7d04b4d65..dd099f328 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -999,59 +999,8 @@ mlx5_flow_update_priority(struct rte_eth_dev *dev,
 static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
 
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-		return;
-	}
-	if (parser->layer == HASH_RXQ_ETH) {
-		goto fill;
-	} else {
-		/*
-		 * This layer becomes useless as the pattern define under
-		 * layers.
-		 */
-		rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-		parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	}
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if ((i == parser->layer) ||
-			     (!parser->queue[i].ibv_attr))
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-	} else  if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
-		return;
-	}
-fill:
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1114,6 +1063,66 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 }
 
 /**
+ * Update flows according to pattern and RSS hash fields.
+ *
+ * @param[in, out] parser
+ *   Internal parser structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
+{
+	const unsigned int ipv4 =
+		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
+	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
+	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
+	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
+	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	unsigned int i;
+
+	/* Remove any other flow not matching the pattern. */
+	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (i == HASH_RXQ_ETH)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+		return 0;
+	}
+	if (parser->layer == HASH_RXQ_ETH)
+		return 0;
+	/* This layer becomes useless as the pattern define under layers. */
+	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
+	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
+	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
+	for (i = ohmin; i != (ohmax + 1); ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		rte_free(parser->queue[i].ibv_attr);
+		parser->queue[i].ibv_attr = NULL;
+	}
+	/* Remove impossible flow according to the RSS configuration. */
+	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
+	    parser->rss_conf.types) {
+		/* Remove any other flow. */
+		for (i = hmin; i != (hmax + 1); ++i) {
+			if (i == parser->layer || !parser->queue[i].ibv_attr)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+	} else if (!parser->queue[ip].ibv_attr) {
+		/* no RSS possible with the current configuration. */
+		parser->rss_conf.queue_num = 1;
+	}
+	return 0;
+}
+
+/**
  * Validate and convert a flow supported by the NIC.
  *
  * @param dev
@@ -1221,6 +1230,10 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	 * configuration.
 	 */
-	if (!parser->drop)
+	if (!parser->drop) {
+		ret = mlx5_flow_convert_rss(parser);
+		if (ret)
+			goto exit_free;
 		mlx5_flow_convert_finalise(parser);
+	}
 	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
-- 
2.13.3

* [PATCH v3 07/14] net/mlx5: support tunnel RSS level
From: Xueming Li @ 2018-04-13 11:20 UTC
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

The tunnel RSS level of the flow RSS action offers the user a choice to
do the RSS hash calculation on inner or outer RSS fields. Testpmd flow
command examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 0 / end

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/Makefile    |   2 +-
 drivers/net/mlx5/mlx5_flow.c | 247 +++++++++++++++++++++++++++++--------------
 drivers/net/mlx5/mlx5_glue.c |  16 +++
 drivers/net/mlx5/mlx5_glue.h |   8 ++
 drivers/net/mlx5/mlx5_rxq.c  |  56 +++++++++-
 drivers/net/mlx5/mlx5_rxtx.h |   5 +-
 6 files changed, 248 insertions(+), 86 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index ae118ad33..f9a6c460b 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_pmd_mlx5.a
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
-LIB_GLUE_VERSION = 18.02.0
+LIB_GLUE_VERSION = 18.05.0
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index dd099f328..a22554706 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -116,6 +116,7 @@ enum hash_rxq_type {
 	HASH_RXQ_UDPV6,
 	HASH_RXQ_IPV6,
 	HASH_RXQ_ETH,
+	HASH_RXQ_TUNNEL,
 };
 
 /* Initialization data for hash RX queue. */
@@ -454,6 +455,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	enum hash_rxq_type out_layer; /**< Last outer pattern layer detected. */
 	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
@@ -461,6 +463,7 @@ struct mlx5_flow_parse {
 		/**< Pointer to Verbs attributes. */
 		unsigned int offset;
 		/**< Current position or total size of the attribute. */
+		uint64_t hash_fields; /**< Verbs hash fields. */
 	} queue[RTE_DIM(hash_rxq_init)];
 };
 
@@ -696,7 +699,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " function is Toeplitz");
 				return -rte_errno;
 			}
-			if (rss->level) {
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+			if (rss->level > 0) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
 						   actions,
@@ -704,6 +708,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " level is not supported");
 				return -rte_errno;
 			}
+#endif
+			if (rss->level > 1) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION,
+						   actions,
+						   "RSS encapsulation level"
+						   " > 1 is not supported");
+				return -rte_errno;
+			}
 			if (rss->types & MLX5_RSS_HF_MASK) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
@@ -754,7 +767,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 			}
 			parser->rss_conf = (struct rte_flow_action_rss){
 				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
-				.level = 0,
+				.level = rss->level,
 				.types = rss->types,
 				.key_len = rss_key_len,
 				.queue_num = rss->queue_num,
@@ -838,10 +851,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
+mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
+				 const struct rte_flow_item items[],
 				 struct rte_flow_error *error,
 				 struct mlx5_flow_parse *parser)
 {
+	struct priv *priv = dev->data->dev_private;
 	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
 	unsigned int i;
 	int ret = 0;
@@ -881,6 +896,14 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
+			if (!priv->config.tunnel_en &&
+			    parser->rss_conf.level) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ITEM,
+					items,
+					"Tunnel offloading not enabled");
+				return -rte_errno;
+			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
 			parser->tunnel = flow_ptype[items->type];
 		}
@@ -1000,7 +1023,11 @@ static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
 	unsigned int i;
+	uint32_t inner = parser->inner;
 
+	/* Don't create extra flows for outer RSS. */
+	if (parser->tunnel && !parser->rss_conf.level)
+		return;
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1011,23 +1038,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 			struct ibv_flow_spec_ipv4_ext ipv4;
 			struct ibv_flow_spec_ipv6 ipv6;
 			struct ibv_flow_spec_tcp_udp udp_tcp;
+			struct ibv_flow_spec_eth eth;
 		} specs;
 		void *dst;
 		uint16_t size;
 
 		if (i == parser->layer)
 			continue;
-		if (parser->layer == HASH_RXQ_ETH) {
+		if (parser->layer == HASH_RXQ_ETH ||
+		    parser->layer == HASH_RXQ_TUNNEL) {
 			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
 				size = sizeof(struct ibv_flow_spec_ipv4_ext);
 				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
-					.type = IBV_FLOW_SPEC_IPV4_EXT,
+					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
 					.size = size,
 				};
 			} else {
 				size = sizeof(struct ibv_flow_spec_ipv6);
 				specs.ipv6 = (struct ibv_flow_spec_ipv6){
-					.type = IBV_FLOW_SPEC_IPV6,
+					.type = inner | IBV_FLOW_SPEC_IPV6,
 					.size = size,
 				};
 			}
@@ -1044,7 +1073,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
 			size = sizeof(struct ibv_flow_spec_tcp_udp);
 			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
-				.type = ((i == HASH_RXQ_UDPV4 ||
+				.type = inner | ((i == HASH_RXQ_UDPV4 ||
 					  i == HASH_RXQ_UDPV6) ?
 					 IBV_FLOW_SPEC_UDP :
 					 IBV_FLOW_SPEC_TCP),
@@ -1065,6 +1094,8 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 /**
  * Update flows according to pattern and RSS hash fields.
  *
+ * @param dev
+ *   Pointer to Ethernet device.
  * @param[in, out] parser
  *   Internal parser structure.
  *
@@ -1072,16 +1103,17 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
+mlx5_flow_convert_rss(struct rte_eth_dev *dev, struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
+	unsigned int ipv4 =
 		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
 	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
 	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
 	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
+	int found = 0;
 
 	/* Remove any other flow not matching the pattern. */
 	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
@@ -1093,9 +1125,51 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
 		}
 		return 0;
 	}
-	if (parser->layer == HASH_RXQ_ETH)
+	/*
+	 * Outer RSS.
+	 * HASH_RXQ_ETH is the only rule, since a tunnel packet matching
+	 * this rule must match the outer pattern.
+	 */
+	if (parser->tunnel && !parser->rss_conf.level) {
+		/* Remove flows other than default. */
+		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+		ipv4 = hash_rxq_init[parser->out_layer].ip_version == MLX5_IPV4;
+		ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+		if (hash_rxq_init[parser->out_layer].dpdk_rss_hf &
+		    parser->rss_conf.types) {
+			parser->queue[HASH_RXQ_ETH].hash_fields =
+				hash_rxq_init[parser->out_layer].hash_fields;
+		} else if (ip && (hash_rxq_init[ip].dpdk_rss_hf &
+		    parser->rss_conf.types)) {
+			parser->queue[HASH_RXQ_ETH].hash_fields =
+				hash_rxq_init[ip].hash_fields;
+		} else if (parser->rss_conf.types) {
+			DRV_LOG(WARNING,
+				"port %u rss outer hash function doesn't match"
+				" pattern", dev->data->port_id);
+		}
+		return 0;
+	}
+	if (parser->layer == HASH_RXQ_ETH || parser->layer == HASH_RXQ_TUNNEL) {
+		/* Remove unused flows according to hash function. */
+		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
+			if (!parser->queue[i].ibv_attr)
+				continue;
+			if (hash_rxq_init[i].dpdk_rss_hf &
+			    parser->rss_conf.types) {
+				parser->queue[i].hash_fields =
+					hash_rxq_init[i].hash_fields;
+				continue;
+			}
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
 		return 0;
-	/* This layer becomes useless as the pattern define under layers. */
+	}
+	/* Remove ETH layer flow. */
 	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
 	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
 	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
@@ -1105,20 +1179,50 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
 		rte_free(parser->queue[i].ibv_attr);
 		parser->queue[i].ibv_attr = NULL;
 	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
+	/*
+	 * Keep L4 flows, as the IP pattern has to support L4 RSS.
+	 * Otherwise, only keep the flow that match the pattern.
+	 */
+	if (parser->layer != ip) {
+		/* Only keep the flow that match the pattern. */
 		for (i = hmin; i != (hmax + 1); ++i) {
-			if (i == parser->layer || !parser->queue[i].ibv_attr)
+			if (i == parser->layer)
 				continue;
 			rte_free(parser->queue[i].ibv_attr);
 			parser->queue[i].ibv_attr = NULL;
 		}
-	} else if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
 	}
+	/* Remove impossible flow according to the RSS configuration. */
+	for (i = hmin; i != (hmax + 1); ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		if (parser->rss_conf.types &
+		    hash_rxq_init[i].dpdk_rss_hf) {
+			parser->queue[i].hash_fields =
+				hash_rxq_init[i].hash_fields;
+			found = 1;
+			continue;
+		}
+		/* L4 flow could be used for L3 RSS. */
+		if (i == parser->layer && i < ip &&
+		    (hash_rxq_init[ip].dpdk_rss_hf &
+		     parser->rss_conf.types)) {
+			parser->queue[i].hash_fields =
+				hash_rxq_init[ip].hash_fields;
+			found = 1;
+			continue;
+		}
+		/* L3 flow and L4 hash: non-rss L3 flow. */
+		if (i == parser->layer && i == ip && found)
+			/* IP pattern and L4 HF. */
+			continue;
+		rte_free(parser->queue[i].ibv_attr);
+		parser->queue[i].ibv_attr = NULL;
+	}
+	if (!found)
+		DRV_LOG(WARNING,
+			"port %u rss hash function doesn't match "
+			"pattern", dev->data->port_id);
 	return 0;
 }
 
@@ -1165,7 +1269,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
 	if (ret)
 		return ret;
-	ret = mlx5_flow_convert_items_validate(items, error, parser);
+	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
 	if (ret)
 		return ret;
 	mlx5_flow_convert_finalise(parser);
@@ -1186,10 +1290,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		for (i = 0; i != hash_rxq_init_n; ++i) {
 			unsigned int offset;
 
-			if (!(parser->rss_conf.types &
-			      hash_rxq_init[i].dpdk_rss_hf) &&
-			    (i != HASH_RXQ_ETH))
-				continue;
 			offset = parser->queue[i].offset;
 			parser->queue[i].ibv_attr =
 				mlx5_flow_convert_allocate(offset, error);
@@ -1201,6 +1301,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
 	parser->tunnel = 0;
+	parser->layer = HASH_RXQ_ETH;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.parser = parser,
@@ -1218,23 +1319,23 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (ret)
 			goto exit_free;
 	}
-	if (parser->mark)
-		mlx5_flow_create_flag_mark(parser, parser->mark_id);
-	if (parser->count && parser->create) {
-		mlx5_flow_create_count(dev, parser);
-		if (!parser->cs)
-			goto exit_count_error;
-	}
 	/*
 	 * Last step. Complete missing specification to reach the RSS
 	 * configuration.
 	 */
 	if (!parser->drop) {
-		ret = mlx5_flow_convert_rss(parser);
+		ret = mlx5_flow_convert_rss(dev, parser);
 		if (ret)
 			goto exit_free;
 		mlx5_flow_convert_finalise(parser);
 	}
 	mlx5_flow_update_priority(dev, parser, attr);
+	if (parser->mark)
+		mlx5_flow_create_flag_mark(parser, parser->mark_id);
+	if (parser->count && parser->create) {
+		mlx5_flow_create_count(dev, parser);
+		if (!parser->cs)
+			goto exit_count_error;
+	}
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
@@ -1282,17 +1383,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse *parser, void *src,
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
 			continue;
-		/* Specification must be the same l3 type or none. */
-		if (parser->layer == HASH_RXQ_ETH ||
-		    (hash_rxq_init[parser->layer].ip_version ==
-		     hash_rxq_init[i].ip_version) ||
-		    (hash_rxq_init[i].ip_version == 0)) {
-			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
-					parser->queue[i].offset);
-			memcpy(dst, src, size);
-			++parser->queue[i].ibv_attr->num_of_specs;
-			parser->queue[i].offset += size;
-		}
+		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset);
+		memcpy(dst, src, size);
+		++parser->queue[i].ibv_attr->num_of_specs;
+		parser->queue[i].offset += size;
 	}
 }
 
@@ -1323,9 +1418,7 @@ mlx5_flow_create_eth(const struct rte_flow_item *item,
 		.size = eth_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_ETH;
+	parser->layer = HASH_RXQ_ETH;
 	if (spec) {
 		unsigned int i;
 
@@ -1438,9 +1531,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		.size = ipv4_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV4;
+	parser->layer = HASH_RXQ_IPV4;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1493,9 +1584,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		.size = ipv6_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV6;
+	parser->layer = HASH_RXQ_IPV6;
 	if (spec) {
 		unsigned int i;
 		uint32_t vtc_flow_val;
@@ -1568,13 +1657,10 @@ mlx5_flow_create_udp(const struct rte_flow_item *item,
 		.size = udp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_UDPV4;
-		else
-			parser->layer = HASH_RXQ_UDPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_UDPV4;
+	else
+		parser->layer = HASH_RXQ_UDPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1617,13 +1703,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item *item,
 		.size = tcp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_TCPV4;
-		else
-			parser->layer = HASH_RXQ_TCPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_TCPV4;
+	else
+		parser->layer = HASH_RXQ_TCPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1673,6 +1756,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1727,6 +1812,8 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
 	mlx5_flow_create_copy(parser, &tunnel, size);
 	return 0;
 }
@@ -1890,33 +1977,33 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 	unsigned int i;
 
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		uint64_t hash_fields;
-
 		if (!parser->queue[i].ibv_attr)
 			continue;
 		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
 		parser->queue[i].ibv_attr = NULL;
-		hash_fields = hash_rxq_init[i].hash_fields;
+		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
 		if (!priv->dev->data->dev_started)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_get(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_new(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2013,7 +2100,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
 			dev->data->port_id,
 			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq,
+			(void *)flow->frxq[i].hrxq->qp,
 			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
@@ -2541,19 +2628,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_get(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_new(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index be684d378..6874aa32a 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 	return mlx5dv_init_obj(obj, obj_type);
 }
 
+static struct ibv_qp *
+mlx5_glue_dv_create_qp(struct ibv_context *context,
+		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
+{
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
+#else
+	(void)context;
+	(void)qp_init_attr_ex;
+	(void)dv_qp_init_attr;
+	return NULL;
+#endif
+}
+
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
@@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.dv_query_device = mlx5_glue_dv_query_device,
 	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
 	.dv_init_obj = mlx5_glue_dv_init_obj,
+	.dv_create_qp = mlx5_glue_dv_create_qp,
 };
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index b5efee3b6..841363872 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+struct mlx5dv_qp_init_attr;
+#endif
+
 /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
 	const char *version;
@@ -106,6 +110,10 @@ struct mlx5_glue {
 				   enum mlx5dv_set_ctx_attr_type type,
 				   void *attr);
 	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
+	struct ibv_qp *(*dv_create_qp)
+		(struct ibv_context *context,
+		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
 };
 
 const struct mlx5_glue *mlx5_glue;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 073732e16..1997609ec 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  *   Number of queues.
  * @param tunnel
  *   Tunnel type.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
 	struct mlx5_ind_table_ibv *ind_tbl;
 	struct ibv_qp *qp;
 	int err;
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	struct mlx5dv_qp_init_attr qp_init_attr = {0};
+#endif
 
 	queues_n = hash_fields ? queues_n : 1;
 	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
@@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		return NULL;
 	}
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	if (tunnel) {
+		qp_init_attr.comp_mask =
+				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
+		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
+	}
+	qp = mlx5_glue->dv_create_qp(
+		priv->ctx,
+		&(struct ibv_qp_init_attr_ex){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_QP_INIT_ATTR_PD |
+				IBV_QP_INIT_ATTR_IND_TABLE |
+				IBV_QP_INIT_ATTR_RX_HASH,
+			.rx_hash_conf = (struct ibv_rx_hash_conf){
+				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
+				.rx_hash_fields_mask = hash_fields |
+					(tunnel && rss_level ?
+					(uint32_t)IBV_RX_HASH_INNER : 0),
+			},
+			.rwq_ind_tbl = ind_tbl->ind_table,
+			.pd = priv->pd,
+		},
+		&qp_init_attr);
+#else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
 		 &(struct ibv_qp_init_attr_ex){
@@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 				IBV_QP_INIT_ATTR_RX_HASH,
 			.rx_hash_conf = (struct ibv_rx_hash_conf){
 				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
-				.rx_hash_key_len = rss_key_len,
-				.rx_hash_key = (void *)(uintptr_t)rss_key,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
 				.rx_hash_fields_mask = hash_fields,
 			},
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+#endif
 	if (!qp) {
 		rte_errno = errno;
 		goto error;
@@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
 	hrxq->tunnel = tunnel;
+	hrxq->rss_level = rss_level;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1448,6 +1489,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	return hrxq;
 error:
 	err = rte_errno; /* Save rte_errno before cleanup. */
+	DRV_LOG(ERR, "port %u: Error creating Hash Rx queue",
+		dev->data->port_id);
 	mlx5_ind_table_ibv_release(dev, ind_tbl);
 	if (qp)
 		claim_zero(mlx5_glue->destroy_qp(qp));
@@ -1469,6 +1512,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  *   Number of queues.
  * @param tunnel
  *   Tunnel type.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1477,7 +1522,8 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1494,6 +1540,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->tunnel != tunnel)
 			continue;
+		if (hrxq->rss_level != rss_level)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index d35605b55..62cf55109 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -147,6 +147,7 @@ struct mlx5_hrxq {
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
 	uint32_t tunnel; /* Tunnel type. */
+	uint32_t rss_level; /* RSS on tunnel level. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 08/14] net/mlx5: add hardware flow debug dump
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (7 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 07/14] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:29   ` Nélio Laranjeiro
  2018-04-13 11:20 ` [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Dump Verbs flow details, including the flow spec type and size of each
specification, for debugging purposes.
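
As an illustration, each flow produces one debug line per hash Rx
queue; the sample below is hypothetical (all pointers and values are
made up, the layout follows the format string introduced by this
patch, wrapped here for readability):

  port 0 Verbs flow 0x17ce970 type 1: hrxq:0x17cd4d0 qp:0x17cdc60
  ind:0x17cd920, hash:30d4/8 specs:4(120), priority:2, type:0,
  flags:0, comp_mask:0 specs: 20(40) 32(32) 41(16) 50(16)

Each "type(size)" pair in the trailing list is one Verbs flow spec
header, e.g. 0x20 for IBV_FLOW_SPEC_ETH.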

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_rxq.c   | 25 +++++++++++++---
 drivers/net/mlx5/mlx5_utils.h |  6 ++++
 3 files changed, 85 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a22554706..c99722770 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -2049,6 +2049,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
 }
 
 /**
+ * Dump flow hash RX queue detail.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the rte_flow.
+ * @param i
+ *   Hash RX queue index.
+ */
+static void
+mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
+	       struct rte_flow *flow __rte_unused,
+	       unsigned int i __rte_unused)
+{
+#ifndef NDEBUG
+	uintptr_t spec_ptr;
+	uint16_t j;
+	char buf[256];
+	uint8_t off;
+
+	spec_ptr = (uintptr_t)(flow->frxq[i].ibv_attr + 1);
+	for (j = 0, off = 0; j < flow->frxq[i].ibv_attr->num_of_specs;
+	     j++) {
+		struct ibv_flow_spec *spec = (void *)spec_ptr;
+		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
+			       spec->hdr.size);
+		spec_ptr += spec->hdr.size;
+	}
+	DRV_LOG(DEBUG,
+		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p, hash:%lx/%u"
+		" specs:%hhu(%hu), priority:%hu, type:%d, flags:%x,"
+		" comp_mask:%x specs:%s",
+		dev->data->port_id, (void *)flow, i,
+		(void *)flow->frxq[i].hrxq,
+		(void *)flow->frxq[i].hrxq->qp,
+		(void *)flow->frxq[i].hrxq->ind_table,
+		flow->frxq[i].hash_fields |
+		(flow->tunnel &&
+		 flow->rss_conf.level ? (uint32_t)IBV_RX_HASH_INNER : 0),
+		flow->rss_conf.queue_num,
+		flow->frxq[i].ibv_attr->num_of_specs,
+		flow->frxq[i].ibv_attr->size,
+		flow->frxq[i].ibv_attr->priority,
+		flow->frxq[i].ibv_attr->type,
+		flow->frxq[i].ibv_attr->flags,
+		flow->frxq[i].ibv_attr->comp_mask,
+		buf);
+#endif
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2090,6 +2141,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		flow->frxq[i].ibv_flow =
 			mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 					       flow->frxq[i].ibv_attr);
+		mlx5_flow_dump(dev, flow, i);
 		if (!flow->frxq[i].ibv_flow) {
 			rte_flow_error_set(error, ENOMEM,
 					   RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2097,11 +2149,6 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 			goto error;
 		}
 		++flows_n;
-		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
-			dev->data->port_id,
-			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq->qp,
-			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
 		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2645,24 +2692,25 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
+					"port %u flow %p cannot create hash"
+					" rxq",
 					dev->data->port_id, (void *)flow);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
 flow_create:
+			mlx5_flow_dump(dev, flow, i);
 			flow->frxq[i].ibv_flow =
 				mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 						       flow->frxq[i].ibv_attr);
 			if (!flow->frxq[i].ibv_flow) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
-					dev->data->port_id, (void *)flow);
+					"port %u flow %p type %u cannot be"
+					" applied",
+					dev->data->port_id, (void *)flow, i);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
-			DRV_LOG(DEBUG, "port %u flow %p applied",
-				dev->data->port_id, (void *)flow);
 		}
 		mlx5_flow_create_update_rxqs(dev, flow);
 	}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1997609ec..f55980836 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1259,9 +1259,9 @@ mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues,
 	}
 	rte_atomic32_inc(&ind_tbl->refcnt);
 	LIST_INSERT_HEAD(&priv->ind_tbls, ind_tbl, next);
-	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
-		dev->data->port_id, (void *)ind_tbl,
-		rte_atomic32_read(&ind_tbl->refcnt));
+	DEBUG("port %u new indirection table %p: queues:%u refcnt:%d",
+	      dev->data->port_id, (void *)ind_tbl, 1 << wq_n,
+	      rte_atomic32_read(&ind_tbl->refcnt));
 	return ind_tbl;
 error:
 	rte_free(ind_tbl);
@@ -1330,9 +1330,12 @@ mlx5_ind_table_ibv_release(struct rte_eth_dev *dev,
 	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
 		((struct priv *)dev->data->dev_private)->port,
 		(void *)ind_tbl, rte_atomic32_read(&ind_tbl->refcnt));
-	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt))
+	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt)) {
 		claim_zero(mlx5_glue->destroy_rwq_ind_table
 			   (ind_tbl->ind_table));
+		DEBUG("port %u delete indirection table %p: queues: %u",
+		      dev->data->port_id, (void *)ind_tbl, ind_tbl->queues_n);
+	}
 	for (i = 0; i != ind_tbl->queues_n; ++i)
 		claim_nonzero(mlx5_rxq_release(dev, ind_tbl->queues[i]));
 	if (!rte_atomic32_read(&ind_tbl->refcnt)) {
@@ -1445,6 +1448,12 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.pd = priv->pd,
 		},
 		&qp_init_attr);
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
+	      " level:%hhu dv_attr:comp_mask:0x%lx create_flags:0x%x",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      (tunnel && rss_level ? (uint32_t)IBV_RX_HASH_INNER : 0) |
+	      hash_fields, tunnel, rss_level,
+	      qp_init_attr.comp_mask, qp_init_attr.create_flags);
 #else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
@@ -1466,6 +1475,10 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
+	      " level:%hhu",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      hash_fields, tunnel, rss_level);
 #endif
 	if (!qp) {
 		rte_errno = errno;
@@ -1577,6 +1590,10 @@ mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
 		(void *)hrxq, rte_atomic32_read(&hrxq->refcnt));
 	if (rte_atomic32_dec_and_test(&hrxq->refcnt)) {
 		claim_zero(mlx5_glue->destroy_qp(hrxq->qp));
+		DEBUG("port %u delete QP %p: hash: 0x%lx, tunnel:"
+		      " 0x%x, level: %hhu",
+		      dev->data->port_id, (void *)hrxq, hrxq->hash_fields,
+		      hrxq->tunnel, hrxq->rss_level);
 		mlx5_ind_table_ibv_release(dev, hrxq->ind_table);
 		LIST_REMOVE(hrxq, next);
 		rte_free(hrxq);
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index 85d2aae2b..9a3181b1f 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -103,16 +103,22 @@ extern int mlx5_logtype;
 /* claim_zero() does not perform any check when debugging is disabled. */
 #ifndef NDEBUG
 
+#define DEBUG(...) DRV_LOG(DEBUG, __VA_ARGS__)
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 
 #else /* NDEBUG */
 
+#define DEBUG(...) (void)0
 #define claim_zero(...) (__VA_ARGS__)
 #define claim_nonzero(...) (__VA_ARGS__)
 
 #endif /* NDEBUG */
 
+#define INFO(...) DRV_LOG(INFO, __VA_ARGS__)
+#define WARN(...) DRV_LOG(WARNING, __VA_ARGS__)
+#define ERROR(...) DRV_LOG(ERR, __VA_ARGS__)
+
 /* Convenience macros for accessing mbuf fields. */
 #define NEXT(m) ((m)->next)
 #define DATA_LEN(m) ((m)->data_len)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (8 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 08/14] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:32   ` Nélio Laranjeiro
  2018-04-13 11:20 ` [PATCH v3 10/14] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Add VXLAN-GPE tunnel support to rte_flow.
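
For instance, with testpmd the new item can be matched as follows
(a hypothetical command, assuming the standard testpmd flow syntax
and the IANA VXLAN-GPE UDP port 4790; wrapped for readability):

  testpmd> flow create 0 ingress pattern eth / ipv4 / udp dst is 4790
           / vxlan-gpe vni is 100 / end actions queue index 0 / end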

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 95 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_rxtx.c |  3 +-
 2 files changed, 95 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c99722770..19973b13c 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -91,6 +91,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       struct mlx5_flow_data *data);
 
 static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data);
+
+static int
 mlx5_flow_create_gre(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
@@ -241,10 +246,12 @@ struct rte_flow {
 
 #define IS_TUNNEL(type) ( \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
 const uint32_t flow_ptype[] = {
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
 	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -253,6 +260,8 @@ const uint32_t flow_ptype[] = {
 const uint32_t ptype_ext[] = {
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
 					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
+						  RTE_PTYPE_L4_UDP,
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -310,6 +319,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
 			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
 			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
@@ -388,7 +398,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_ipv6),
 	},
 	[RTE_FLOW_ITEM_TYPE_UDP] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_udp){
 			.hdr = {
@@ -440,6 +451,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_vxlan,
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_vxlan_gpe){
+			.vni = "\xff\xff\xff",
+		},
+		.default_mask = &rte_flow_item_vxlan_gpe_mask,
+		.mask_sz = sizeof(struct rte_flow_item_vxlan_gpe),
+		.convert = mlx5_flow_create_vxlan_gpe,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 };
 
 /** Structure to pass to the conversion function. */
@@ -1786,6 +1810,75 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert VXLAN-GPE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data)
+{
+	const struct rte_flow_item_vxlan_gpe *spec = item->spec;
+	const struct rte_flow_item_vxlan_gpe *mask = item->mask;
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel vxlan = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	union vni {
+		uint32_t vlan_id;
+		uint8_t vni[4];
+	} id;
+
+	id.vni[0] = 0;
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		memcpy(&id.vni[1], spec->vni, 3);
+		vxlan.val.tunnel_id = id.vlan_id;
+		memcpy(&id.vni[1], mask->vni, 3);
+		vxlan.mask.tunnel_id = id.vlan_id;
+		if (spec->protocol)
+			return rte_flow_error_set(data->error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ITEM,
+						  item,
+						  "VxLAN-GPE protocol not"
+						  " supported");
+		/* Remove unwanted bits from values. */
+		vxlan.val.tunnel_id &= vxlan.mask.tunnel_id;
+	}
+	/*
+	 * Tunnel id 0 is equivalent to not adding a VXLAN-GPE layer, if only this
+	 * layer is defined in the Verbs specification it is interpreted as
+	 * wildcard and all packets will match this rule, if it follows a full
+	 * stack layer (ex: eth / ipv4 / udp), all packets matching the layers
+	 * before will also match this rule.
+	 * To avoid such situation, VNI 0 is currently refused.
+	 */
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "VxLAN-GPE vni cannot be 0");
+	mlx5_flow_create_copy(parser, &vxlan, size);
+	return 0;
+}
+
+/**
  * Convert GRE item to Verbs specification.
  *
  * @param item[in]
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 285b2dbf0..c9342d659 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -466,8 +466,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			uint8_t vlan_sz =
 				(buf->ol_flags & PKT_TX_VLAN_PKT) ? 4 : 0;
 			const uint64_t is_tunneled =
-				buf->ol_flags & (PKT_TX_TUNNEL_GRE |
-						 PKT_TX_TUNNEL_VXLAN);
+				buf->ol_flags & (PKT_TX_TUNNEL_MASK);
 
 			tso_header_sz = buf->l2_len + vlan_sz +
 					buf->l3_len + buf->l4_len;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 10/14] net/mlx5: allow flow tunnel ID 0 with outer pattern
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (9 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 11:20 ` [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP Xueming Li
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

A tunnel pattern without a tunnel ID could match any non-tunneled
packet. This patch allows such a pattern once a proper outer spec has
been supplied.
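
For example, with testpmd (hypothetical commands, assuming the
standard flow syntax; wrapped for readability), the first rule below
is now accepted because the outer IPv4/UDP spec restricts the match,
while the second is still refused since it would match any packet:

  testpmd> flow create 0 ingress pattern eth / ipv4 / udp / vxlan /
           end actions queue index 0 / end
  testpmd> flow create 0 ingress pattern vxlan / end
           actions queue index 0 / end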

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 19973b13c..0fccd39b3 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1800,7 +1800,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	 * before will also match this rule.
 	 * To avoid such situation, VNI 0 is currently refused.
 	 */
-	if (!vxlan.val.tunnel_id)
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
 		return rte_flow_error_set(data->error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM,
 					  item,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (10 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 10/14] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:37   ` Nélio Laranjeiro
  2018-04-13 11:20 ` [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading Xueming Li
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports the new tunnel types MPLS-in-GRE and MPLS-in-UDP.
Flow pattern examples:
  ipv4 proto is 47 / gre proto is 0x8847 / mpls
  ipv4 / udp dst is 6635 / mpls / end
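
In testpmd those patterns map to commands such as (hypothetical,
assuming the standard flow syntax; wrapped for readability):

  testpmd> flow create 0 ingress pattern eth / ipv4 /
           gre protocol is 0x8847 / mpls label is 3 / end
           actions queue index 0 / end
  testpmd> flow create 0 ingress pattern eth / ipv4 / udp dst is 6635
           / mpls label is 3 / end actions queue index 0 / end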

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/Makefile    |   5 ++
 drivers/net/mlx5/mlx5.c      |  15 +++++
 drivers/net/mlx5/mlx5.h      |   1 +
 drivers/net/mlx5/mlx5_flow.c | 148 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 166 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index f9a6c460b..33553483e 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -131,6 +131,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		enum MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS \
 		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
+		HAVE_IBV_DEVICE_MPLS_SUPPORT \
+		infiniband/verbs.h \
+		enum IBV_FLOW_SPEC_MPLS \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
 		HAVE_IBV_WQ_FLAG_RX_END_PADDING \
 		infiniband/verbs.h \
 		enum IBV_WQ_FLAG_RX_END_PADDING \
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 38118e524..89b683d6e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -614,6 +614,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	unsigned int cqe_comp;
 	unsigned int tunnel_en = 0;
 	unsigned int verb_priorities = 0;
+	unsigned int mpls_en = 0;
 	int idx;
 	int i;
 	struct mlx5dv_context attrs_out = {0};
@@ -720,12 +721,25 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN) &&
 			     (attrs_out.tunnel_offloads_caps &
 			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE));
+#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
+		mpls_en = ((attrs_out.tunnel_offloads_caps &
+			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_GRE) &&
+			   (attrs_out.tunnel_offloads_caps &
+			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_UDP) &&
+			   (attrs_out.tunnel_offloads_caps &
+			  MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CTRL_DW_MPLS));
+#endif
 	}
 	DRV_LOG(DEBUG, "tunnel offloading is %ssupported",
 		tunnel_en ? "" : "not ");
+	DRV_LOG(DEBUG, "MPLS over GRE/UDP offloading is %ssupported",
+		mpls_en ? "" : "not ");
 #else
 	DRV_LOG(WARNING,
 		"tunnel offloading disabled due to old OFED/rdma-core version");
+	DRV_LOG(WARNING,
+		"MPLS over GRE/UDP offloading disabled due to old"
+		" OFED/rdma-core version or firmware configuration");
 #endif
 	if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr)) {
 		err = errno;
@@ -749,6 +763,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			.cqe_comp = cqe_comp,
 			.mps = mps,
 			.tunnel_en = tunnel_en,
+			.mpls_en = mpls_en,
 			.tx_vec_en = 1,
 			.rx_vec_en = 1,
 			.mpw_hdr_dseg = 0,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6e4613fe0..efbcb2156 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -81,6 +81,7 @@ struct mlx5_dev_config {
 	unsigned int vf:1; /* This is a VF. */
 	unsigned int mps:2; /* Multi-packet send supported mode. */
 	unsigned int tunnel_en:1;
+	unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
 	/* Whether tunnel stateless offloads are supported. */
 	unsigned int flow_counter_en:1; /* Whether flow counter is supported. */
 	unsigned int cqe_comp:1; /* CQE compression is enabled. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 0fccd39b3..98edf1882 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -100,6 +100,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
 
+static int
+mlx5_flow_create_mpls(const struct rte_flow_item *item,
+		      const void *default_mask,
+		      struct mlx5_flow_data *data);
+
 struct mlx5_flow_parse;
 
 static void
@@ -247,12 +252,14 @@ struct rte_flow {
 #define IS_TUNNEL(type) ( \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
+	(type) == RTE_FLOW_ITEM_TYPE_MPLS || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
 const uint32_t flow_ptype[] = {
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
 	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
 	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
+	[RTE_FLOW_ITEM_TYPE_MPLS] = RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
 };
 
 #define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
@@ -263,6 +270,10 @@ const uint32_t ptype_ext[] = {
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
 						  RTE_PTYPE_L4_UDP,
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)] =
+		RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)] =
+		RTE_PTYPE_TUNNEL_MPLS_IN_GRE | RTE_PTYPE_L4_UDP,
 };
 
 /** Structure to generate a simple graph of layers supported by the NIC. */
@@ -399,7 +410,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_UDP] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
-			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
+			       RTE_FLOW_ITEM_TYPE_MPLS),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_udp){
 			.hdr = {
@@ -428,7 +440,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_GRE] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
 			       RTE_FLOW_ITEM_TYPE_IPV4,
-			       RTE_FLOW_ITEM_TYPE_IPV6),
+			       RTE_FLOW_ITEM_TYPE_IPV6,
+			       RTE_FLOW_ITEM_TYPE_MPLS),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_gre){
 			.protocol = -1,
@@ -436,7 +449,11 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.default_mask = &rte_flow_item_gre_mask,
 		.mask_sz = sizeof(struct rte_flow_item_gre),
 		.convert = mlx5_flow_create_gre,
+#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
+		.dst_sz = sizeof(struct ibv_flow_spec_gre),
+#else
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+#endif
 	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
@@ -464,6 +481,21 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_vxlan_gpe,
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
+	[RTE_FLOW_ITEM_TYPE_MPLS] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_mpls){
+			.label_tc_s = "\xff\xff\xf0",
+		},
+		.default_mask = &rte_flow_item_mpls_mask,
+		.mask_sz = sizeof(struct rte_flow_item_mpls),
+		.convert = mlx5_flow_create_mpls,
+#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
+		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
+#endif
+	},
 };
 
 /** Structure to pass to the conversion function. */
@@ -912,7 +944,9 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
 		if (ret)
 			goto exit_item_not_supported;
 		if (IS_TUNNEL(items->type)) {
-			if (parser->tunnel) {
+			if (parser->tunnel &&
+			   !(parser->tunnel == RTE_PTYPE_TUNNEL_GRE &&
+			     items->type == RTE_FLOW_ITEM_TYPE_MPLS)) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
@@ -920,6 +954,16 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
 						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
+			if (items->type == RTE_FLOW_ITEM_TYPE_MPLS &&
+			    !priv->config.mpls_en) {
+				rte_flow_error_set(error, ENOTSUP,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   items,
+						   "MPLS not supported or"
+						   " disabled in firmware"
+						   " configuration.");
+				return -rte_errno;
+			}
 			if (!priv->config.tunnel_en &&
 			    parser->rss_conf.level) {
 				rte_flow_error_set(error, ENOTSUP,
@@ -1880,6 +1924,80 @@ mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert MPLS item to Verbs specification.
+ * Tunnel types currently supported are MPLS-in-GRE and MPLS-in-UDP.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_mpls(const struct rte_flow_item *item __rte_unused,
+		      const void *default_mask __rte_unused,
+		      struct mlx5_flow_data *data __rte_unused)
+{
+#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
+	return rte_flow_error_set(data->error, EINVAL,
+				  RTE_FLOW_ERROR_TYPE_ITEM,
+				  item,
+				  "MPLS not supported by driver");
+#else
+	unsigned int i;
+	const struct rte_flow_item_mpls *spec = item->spec;
+	const struct rte_flow_item_mpls *mask = item->mask;
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_mpls);
+	struct ibv_flow_spec_mpls mpls = {
+		.type = IBV_FLOW_SPEC_MPLS,
+		.size = size,
+	};
+	union tag {
+		uint32_t tag;
+		uint8_t label[4];
+	} id;
+
+	id.tag = 0;
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	if (parser->layer == HASH_RXQ_UDPV4 ||
+	    parser->layer == HASH_RXQ_UDPV6) {
+		parser->tunnel =
+			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)];
+		parser->out_layer = parser->layer;
+	} else {
+		parser->tunnel =
+			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)];
+	}
+	parser->layer = HASH_RXQ_TUNNEL;
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		memcpy(&id.label[1], spec->label_tc_s, 3);
+		id.label[0] = spec->ttl;
+		mpls.val.tag = id.tag;
+		memcpy(&id.label[1], mask->label_tc_s, 3);
+		id.label[0] = mask->ttl;
+		mpls.mask.tag = id.tag;
+		/* Remove unwanted bits from values. */
+		mpls.val.tag &= mpls.mask.tag;
+	}
+	mlx5_flow_create_copy(parser, &mpls, size);
+	for (i = 0; i != hash_rxq_init_n; ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		parser->queue[i].ibv_attr->flags |=
+			IBV_FLOW_ATTR_FLAGS_ORDERED_SPEC_LIST;
+	}
+	return 0;
+#endif
+}
+
+/**
  * Convert GRE item to Verbs specification.
  *
  * @param item[in]
@@ -1898,16 +2016,40 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 		     struct mlx5_flow_data *data)
 {
 	struct mlx5_flow_parse *parser = data->parser;
+#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
 	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
 	struct ibv_flow_spec_tunnel tunnel = {
 		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
 		.size = size,
 	};
+#else
+	const struct rte_flow_item_gre *spec = item->spec;
+	const struct rte_flow_item_gre *mask = item->mask;
+	unsigned int size = sizeof(struct ibv_flow_spec_gre);
+	struct ibv_flow_spec_gre tunnel = {
+		.type = parser->inner | IBV_FLOW_SPEC_GRE,
+		.size = size,
+	};
+#endif
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
 	parser->out_layer = parser->layer;
 	parser->layer = HASH_RXQ_TUNNEL;
+#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		tunnel.val.c_ks_res0_ver = spec->c_rsvd0_ver;
+		tunnel.val.protocol = spec->protocol;
+		tunnel.mask.c_ks_res0_ver = mask->c_rsvd0_ver;
+		tunnel.mask.protocol = mask->protocol;
+		/* Remove unwanted bits from values. */
+		tunnel.val.c_ks_res0_ver &= tunnel.mask.c_ks_res0_ver;
+		tunnel.val.protocol &= tunnel.mask.protocol;
+		tunnel.val.key &= tunnel.mask.key;
+	}
+#endif
 	mlx5_flow_create_copy(parser, &tunnel, size);
 	return 0;
 }
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (11 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:38   ` Nélio Laranjeiro
  2018-04-13 11:20 ` [PATCH v3 13/14] net/mlx5: fix invalid flow item check Xueming Li
  2018-04-13 11:20 ` [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode Xueming Li
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Remove outdated tunnel limitations and document the new hardware
tunnel offload features.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/mlx5.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b1bab2ce2..c256f85f3 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -100,12 +100,12 @@ Features
 - RX interrupts.
 - Statistics query including Basic, Extended and per queue.
 - Rx HW timestamp.
+- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
+- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
 
 Limitations
 -----------
 
-- Inner RSS for VXLAN frames is not supported yet.
-- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
 - For secondary process:
 
   - Forked secondary process not supported.
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 13/14] net/mlx5: fix invalid flow item check
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (12 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:40   ` Nélio Laranjeiro
  2018-04-13 11:20 ` [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode Xueming Li
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch fixes the invalid flow item check: when an unsupported item
was encountered, the function jumped to the error path without setting
a proper negative error code.
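
A minimal sketch of how the corrected error code surfaces to an
application (check_flow() is a hypothetical helper; assuming
<rte_flow.h> and <stdio.h> are included):

	static int
	check_flow(uint16_t port_id, const struct rte_flow_attr *attr,
		   const struct rte_flow_item pattern[],
		   const struct rte_flow_action actions[])
	{
		struct rte_flow_error error = { .message = NULL };
		int ret = rte_flow_validate(port_id, attr, pattern,
					    actions, &error);

		if (ret)
			/* An unsupported item now yields -ENOTSUP
			 * together with a meaningful error message. */
			printf("flow not supported: %s (%d)\n",
			       error.message ? error.message :
			       "unspecified", ret);
		return ret;
	}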

Fixes: 4f1a88e3f9b0 ("net/mlx5: standardize on negative errno values")
Cc: nelio.laranjeiro@6wind.com

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 98edf1882..d36b6ed8a 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -935,8 +935,10 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
 				break;
 			}
 		}
-		if (!token)
+		if (!token) {
+			ret = -ENOTSUP;
 			goto exit_item_not_supported;
+		}
 		cur_item = token;
 		ret = mlx5_flow_item_validate(items,
 					      (const uint8_t *)cur_item->mask,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode
       [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
                   ` (13 preceding siblings ...)
  2018-04-13 11:20 ` [PATCH v3 13/14] net/mlx5: fix invalid flow item check Xueming Li
@ 2018-04-13 11:20 ` Xueming Li
  2018-04-13 13:43   ` Nélio Laranjeiro
  14 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-13 11:20 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Enable RSS-related configuration (RETA update/query and RSS hash
update/get) in isolated mode.
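
With this change an application can, for instance, reprogram the RETA
while the port stays isolated. A minimal sketch (setup_reta() is a
hypothetical helper; assuming <rte_ethdev.h>, <string.h>, <stdint.h>
and <errno.h> are included and reta_size fits the local array):

	static int
	setup_reta(uint16_t port_id, uint16_t nb_queues)
	{
		struct rte_eth_dev_info dev_info;
		struct rte_eth_rss_reta_entry64 reta_conf[8];
		uint16_t i;

		rte_eth_dev_info_get(port_id, &dev_info);
		if (dev_info.reta_size > 8 * RTE_RETA_GROUP_SIZE)
			return -EINVAL;
		memset(reta_conf, 0, sizeof(reta_conf));
		/* Spread entries round-robin over nb_queues queues. */
		for (i = 0; i != dev_info.reta_size; ++i) {
			reta_conf[i / RTE_RETA_GROUP_SIZE].mask =
				UINT64_MAX;
			reta_conf[i / RTE_RETA_GROUP_SIZE].reta
				[i % RTE_RETA_GROUP_SIZE] = i % nb_queues;
		}
		return rte_eth_dev_rss_reta_update(port_id, reta_conf,
						   dev_info.reta_size);
	}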

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 89b683d6e..521f60c18 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -333,6 +333,10 @@ const struct eth_dev_ops mlx5_dev_ops_isolate = {
 	.mtu_set = mlx5_dev_set_mtu,
 	.vlan_strip_queue_set = mlx5_vlan_strip_queue_set,
 	.vlan_offload_set = mlx5_vlan_offload_set,
+	.reta_update = mlx5_dev_rss_reta_update,
+	.reta_query = mlx5_dev_rss_reta_query,
+	.rss_hash_update = mlx5_rss_hash_update,
+	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
 	.filter_ctrl = mlx5_dev_filter_ctrl,
 	.rx_descriptor_status = mlx5_rx_descriptor_status,
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
  2018-04-13 11:20 ` [PATCH v3 01/14] net/mlx5: support 16 hardware priorities Xueming Li
@ 2018-04-13 11:58   ` Nélio Laranjeiro
  2018-04-13 13:10     ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 11:58 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

Hi Xueming,

Small nits and documentation issues,

On Fri, Apr 13, 2018 at 07:20:10PM +0800, Xueming Li wrote:
> This patch supports new 16 Verbs flow priorities by trying to create a
> simple flow of priority 15. If 16 priorities not available, fallback to
> traditional 8 priorities.
> 
> Verb priority mapping:
> 			8 priorities	>=16 priorities
> Control flow:		4-7		8-15
> User normal flow:	1-3		4-7
> User tunnel flow:	0-2		0-3

There is an overlap between tunnel and normal flows, is it expected?

> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c         |  18 +++++++
>  drivers/net/mlx5/mlx5.h         |   5 ++
>  drivers/net/mlx5/mlx5_flow.c    | 112 +++++++++++++++++++++++++++++++++-------
>  drivers/net/mlx5/mlx5_trigger.c |   8 ---
>  4 files changed, 115 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index cfab55897..38118e524 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
>  		priv->txqs_n = 0;
>  		priv->txqs = NULL;
>  	}
> +	mlx5_flow_delete_drop_queue(dev);
>  	if (priv->pd != NULL) {
>  		assert(priv->ctx != NULL);
>  		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
> @@ -612,6 +613,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  	unsigned int mps;
>  	unsigned int cqe_comp;
>  	unsigned int tunnel_en = 0;
> +	unsigned int verb_priorities = 0;
>  	int idx;
>  	int i;
>  	struct mlx5dv_context attrs_out = {0};
> @@ -993,6 +995,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  		mlx5_set_link_up(eth_dev);
>  		/* Store device configuration on private structure. */
>  		priv->config = config;
> +		/* Create drop queue. */
> +		err = mlx5_flow_create_drop_queue(eth_dev);
> +		if (err) {
> +			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> +				eth_dev->data->port_id, strerror(rte_errno));
> +			goto port_error;
> +		}
> +		/* Supported Verbs flow priority number detection. */
> +		if (verb_priorities == 0)
> +			verb_priorities = priv_get_max_verbs_prio(eth_dev);

No more priv*(), rename it to mlx5_get_max_verbs_prio().

> +		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
> +			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
> +				eth_dev->data->port_id, verb_priorities);
> +			goto port_error;
> +		}
> +		priv->config.max_verb_prio = verb_priorities;

s/verb/verbs/

>  		continue;
>  port_error:
>  		if (priv)
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 63b24e6bb..6e4613fe0 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -89,6 +89,7 @@ struct mlx5_dev_config {
>  	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
>  	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
>  	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
> +	unsigned int max_verb_prio; /* Number of Verb flow priorities. */
>  	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
>  	unsigned int ind_table_max_size; /* Maximum indirection table size. */
>  	int txq_inline; /* Maximum packet size for inlining. */
> @@ -105,6 +106,9 @@ enum mlx5_verbs_alloc_type {
>  	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
>  };
>  
> +/* 8 Verbs priorities. */
> +#define MLX5_VERBS_FLOW_PRIO_8 8
> +
>  /**
>   * Verbs allocator needs a context to know in the callback which kind of
>   * resources it is allocating.
> @@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
>  
>  /* mlx5_flow.c */
>  
> +unsigned int priv_get_max_verbs_prio(struct rte_eth_dev *dev);
>  int mlx5_flow_validate(struct rte_eth_dev *dev,
>  		       const struct rte_flow_attr *attr,
>  		       const struct rte_flow_item items[],
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 288610620..5c4f0b586 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -32,8 +32,8 @@
>  #include "mlx5_prm.h"
>  #include "mlx5_glue.h"
>  
> -/* Define minimal priority for control plane flows. */
> -#define MLX5_CTRL_FLOW_PRIORITY 4
> +/* Flow priority for control plane flows. */
> +#define MLX5_CTRL_FLOW_PRIORITY 1
>  
>  /* Internet Protocol versions. */
>  #define MLX5_IPV4 4
> @@ -129,7 +129,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_SRC_PORT_TCP |
>  				IBV_RX_HASH_DST_PORT_TCP),
>  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
> -		.flow_priority = 1,
> +		.flow_priority = 0,
>  		.ip_version = MLX5_IPV4,
>  	},
>  	[HASH_RXQ_UDPV4] = {
> @@ -138,7 +138,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_SRC_PORT_UDP |
>  				IBV_RX_HASH_DST_PORT_UDP),
>  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
> -		.flow_priority = 1,
> +		.flow_priority = 0,
>  		.ip_version = MLX5_IPV4,
>  	},
>  	[HASH_RXQ_IPV4] = {
> @@ -146,7 +146,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_DST_IPV4),
>  		.dpdk_rss_hf = (ETH_RSS_IPV4 |
>  				ETH_RSS_FRAG_IPV4),
> -		.flow_priority = 2,
> +		.flow_priority = 1,
>  		.ip_version = MLX5_IPV4,
>  	},
>  	[HASH_RXQ_TCPV6] = {
> @@ -155,7 +155,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_SRC_PORT_TCP |
>  				IBV_RX_HASH_DST_PORT_TCP),
>  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
> -		.flow_priority = 1,
> +		.flow_priority = 0,
>  		.ip_version = MLX5_IPV6,
>  	},
>  	[HASH_RXQ_UDPV6] = {
> @@ -164,7 +164,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_SRC_PORT_UDP |
>  				IBV_RX_HASH_DST_PORT_UDP),
>  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
> -		.flow_priority = 1,
> +		.flow_priority = 0,
>  		.ip_version = MLX5_IPV6,
>  	},
>  	[HASH_RXQ_IPV6] = {
> @@ -172,13 +172,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
>  				IBV_RX_HASH_DST_IPV6),
>  		.dpdk_rss_hf = (ETH_RSS_IPV6 |
>  				ETH_RSS_FRAG_IPV6),
> -		.flow_priority = 2,
> +		.flow_priority = 1,
>  		.ip_version = MLX5_IPV6,
>  	},
>  	[HASH_RXQ_ETH] = {
>  		.hash_fields = 0,
>  		.dpdk_rss_hf = 0,
> -		.flow_priority = 3,
> +		.flow_priority = 2,
>  	},
>  };
>  
> @@ -900,30 +900,50 @@ mlx5_flow_convert_allocate(unsigned int size, struct rte_flow_error *error)
>   * Make inner packet matching with an higher priority from the non Inner
>   * matching.
>   *
> + * @param dev
> + *   Pointer to Ethernet device.
>   * @param[in, out] parser
>   *   Internal parser structure.
>   * @param attr
>   *   User flow attribute.
>   */
>  static void
> -mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
> +mlx5_flow_update_priority(struct rte_eth_dev *dev,
> +			  struct mlx5_flow_parse *parser,
>  			  const struct rte_flow_attr *attr)
>  {
> +	struct priv *priv = dev->data->dev_private;
>  	unsigned int i;
> +	uint16_t priority;
>  
> +	/*			8 priorities	>= 16 priorities
> +	 * Control flow:	4-7		8-15
> +	 * User normal flow:	1-3		4-7
> +	 * User tunnel flow:	0-2		0-3
> +	 */

Same comment here, the tunnel flows overlap with normal flows when there
are only 8 priorities.

> +	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
> +	if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> +		priority /= 2;
> +	/*
> +	 * Lower non-tunnel flow Verbs priority 1 if only support 8 Verbs
> +	 * priorities, lower 4 otherwise.
> +	 */
> +	if (!parser->inner) {
> +		if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> +			priority += 1;
> +		else
> +			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
> +	}
>  	if (parser->drop) {
> -		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
> -			attr->priority +
> -			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
> +		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
> +				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
>  		return;
>  	}
>  	for (i = 0; i != hash_rxq_init_n; ++i) {
> -		if (parser->queue[i].ibv_attr) {
> -			parser->queue[i].ibv_attr->priority =
> -				attr->priority +
> -				hash_rxq_init[i].flow_priority -
> -				(parser->inner ? 1 : 0);
> -		}
> +		if (!parser->queue[i].ibv_attr)
> +			continue;
> +		parser->queue[i].ibv_attr->priority = priority +
> +				hash_rxq_init[i].flow_priority;
>  	}
>  }
>  
> @@ -1158,7 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  	 */
>  	if (!parser->drop)
>  		mlx5_flow_convert_finalise(parser);
> -	mlx5_flow_update_priority(parser, attr);
> +	mlx5_flow_update_priority(dev, parser, attr);
>  exit_free:
>  	/* Only verification is expected, all resources should be released. */
>  	if (!parser->create) {
> @@ -3161,3 +3181,55 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
>  	}
>  	return 0;
>  }
> +
> +/**
> + * Detect number of Verbs flow priorities supported.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + *
> + * @return
> + *   number of supported Verbs flow priority.
> + */
> +unsigned int
> +priv_get_max_verbs_prio(struct rte_eth_dev *dev)
> +{
> +	struct priv *priv = dev->data->dev_private;
> +	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
> +	struct {
> +		struct ibv_flow_attr attr;
> +		struct ibv_flow_spec_eth eth;
> +		struct ibv_flow_spec_action_drop drop;
> +	} flow_attr = {
> +		.attr = {
> +			.num_of_specs = 2,
> +		},
> +		.eth = {
> +			.type = IBV_FLOW_SPEC_ETH,
> +			.size = sizeof(struct ibv_flow_spec_eth),
> +		},
> +		.drop = {
> +			.size = sizeof(struct ibv_flow_spec_action_drop),
> +			.type = IBV_FLOW_SPEC_ACTION_DROP,
> +		},
> +	};
> +	struct ibv_flow *flow;
> +
> +	do {
> +		flow_attr.attr.priority = verb_priorities - 1;
> +		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
> +					      &flow_attr.attr);
> +		if (flow) {
> +			claim_zero(mlx5_glue->destroy_flow(flow));
> +			/* Try more priorities. */
> +			verb_priorities *= 2;
> +		} else {
> +			/* Failed, restore last right number. */
> +			verb_priorities /= 2;
> +			break;
> +		}
> +	} while (1);
> +	DRV_LOG(INFO, "port %u Verbs flow priorities: %d",
> +		dev->data->port_id, verb_priorities);

Please remove this developer log, it will confuse users into believing
they have N priorities, which is absolutely not the case.

> +	return verb_priorities;
> +}
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index 6bb4ffb14..d80a2e688 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>  	int ret;
>  
>  	dev->data->dev_started = 1;
> -	ret = mlx5_flow_create_drop_queue(dev);
> -	if (ret) {
> -		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> -			dev->data->port_id, strerror(rte_errno));
> -		goto error;
> -	}
>  	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
>  		dev->data->port_id);
>  	rte_mempool_walk(mlx5_mp2mr_iter, priv);
> @@ -202,7 +196,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>  	mlx5_traffic_disable(dev);
>  	mlx5_txq_stop(dev);
>  	mlx5_rxq_stop(dev);
> -	mlx5_flow_delete_drop_queue(dev);
>  	rte_errno = ret; /* Restore rte_errno. */
>  	return -rte_errno;
>  }
> @@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
>  	mlx5_rxq_stop(dev);
>  	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
>  		mlx5_mr_release(mr);
> -	mlx5_flow_delete_drop_queue(dev);
>  }
>  
>  /**
> -- 
> 2.13.3

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 02/14] net/mlx5: support GRE tunnel flow
  2018-04-13 11:20 ` [PATCH v3 02/14] net/mlx5: support GRE tunnel flow Xueming Li
@ 2018-04-13 12:02   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 12:02 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

Some nits,

On Fri, Apr 13, 2018 at 07:20:11PM +0800, Xueming Li wrote:
> Support GRE tunnel type flow.

Not sure it is necessary to copy/paste the commit title in the body.

> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 69 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 62 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 5c4f0b586..2aae988f2 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -90,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  		       const void *default_mask,
>  		       struct mlx5_flow_data *data);
>  
> +static int
> +mlx5_flow_create_gre(const struct rte_flow_item *item,
> +		       const void *default_mask,
> +		       struct mlx5_flow_data *data);
> +

Isn't there an indentation issue here?

>  struct mlx5_flow_parse;
>  
>  static void
> @@ -232,6 +237,10 @@ struct rte_flow {
>  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
>  	}
>  
> +#define IS_TUNNEL(type) ( \
> +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> +
>  /** Structure to generate a simple graph of layers supported by the NIC. */
>  struct mlx5_flow_items {
>  	/** List of possible actions for these items. */
> @@ -285,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
>  static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	[RTE_FLOW_ITEM_TYPE_END] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_ETH] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
> @@ -317,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	},
>  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> -			       RTE_FLOW_ITEM_TYPE_TCP),
> +			       RTE_FLOW_ITEM_TYPE_TCP,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_ipv4){
>  			.hdr = {
> @@ -334,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	},
>  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> -			       RTE_FLOW_ITEM_TYPE_TCP),
> +			       RTE_FLOW_ITEM_TYPE_TCP,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_ipv6){
>  			.hdr = {
> @@ -387,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.convert = mlx5_flow_create_tcp,
>  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
>  	},
> +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4,
> +			       RTE_FLOW_ITEM_TYPE_IPV6),
> +		.actions = valid_actions,
> +		.mask = &(const struct rte_flow_item_gre){
> +			.protocol = -1,
> +		},
> +		.default_mask = &rte_flow_item_gre_mask,
> +		.mask_sz = sizeof(struct rte_flow_item_gre),
> +		.convert = mlx5_flow_create_gre,
> +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> +	},
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
>  		.actions = valid_actions,
> @@ -402,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  
>  /** Structure to pass to the conversion function. */
>  struct mlx5_flow_parse {
> -	uint32_t inner; /**< Set once VXLAN is encountered. */
> +	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
>  	uint32_t create:1;
>  	/**< Whether resources should remain after a validate. */
>  	uint32_t drop:1; /**< Target is a drop queue. */
> @@ -830,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
>  					      cur_item->mask_sz);
>  		if (ret)
>  			goto exit_item_not_supported;
> -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> +		if (IS_TUNNEL(items->type)) {
>  			if (parser->inner) {
>  				rte_flow_error_set(error, ENOTSUP,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   items,
> -						   "cannot recognize multiple"
> -						   " VXLAN encapsulations");
> +						   "Cannot recognize multiple"
> +						   " tunnel encapsulations.");
>  				return -rte_errno;
>  			}
>  			parser->inner = IBV_FLOW_SPEC_INNER;
> @@ -1644,6 +1669,36 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  }
>  
>  /**
> + * Convert GRE item to Verbs specification.
> + *
> + * @param item[in]
> + *   Item specification.
> + * @param default_mask[in]
> + *   Default bit-masks to use when item->mask is not provided.
> + * @param data[in, out]
> + *   User structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> +		     const void *default_mask __rte_unused,
> +		     struct mlx5_flow_data *data)
> +{
> +	struct mlx5_flow_parse *parser = data->parser;
> +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> +	struct ibv_flow_spec_tunnel tunnel = {
> +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> +		.size = size,
> +	};
> +
> +	parser->inner = IBV_FLOW_SPEC_INNER;
> +	mlx5_flow_create_copy(parser, &tunnel, size);
> +	return 0;
> +}
> +
> +/**
>   * Convert mark/flag action to Verbs specification.
>   *
>   * @param parser
> -- 
> 2.13.3

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow
  2018-04-13 11:20 ` [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow Xueming Li
@ 2018-04-13 12:13   ` Nélio Laranjeiro
  2018-04-13 13:51     ` Xueming(Steven) Li
  2018-04-13 14:04     ` Xueming(Steven) Li
  0 siblings, 2 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 12:13 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 07:20:12PM +0800, Xueming Li wrote:
> This patch support L3 VXLAN, no inner L2 header comparing to standard
> VXLAN protocol. L3 VXLAN using specific overlay UDP destination port to
> discriminate against standard VXLAN, FW has to be configured to support
> it:
>   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
>   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>

This fully deserves an update to the MLX5 guide with such information;
users already don't read the guide, so don't expect them to read commit
logs.

> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 2aae988f2..644f26a95 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4, /* L3 VXLAN. */
> +			       RTE_FLOW_ITEM_TYPE_IPV6), /* L3 VXLAN. */

s/L3/For L3/

>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_vxlan){
>  			.vni = "\xff\xff\xff",
> -- 
> 2.13.3

There is an important question about this support as the firmware needs
to be configured for it.

1. Is such a rule accepted by the kernel modules if the support is not
enabled in the firmware?

2. Is it possible from the PMD to query such information?

If both answers are no, such features should be enabled through a device
parameter to let the PMD refuse such unsupported flow requests.
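
Something along these lines would fit the existing devargs pattern (a
hypothetical sketch only -- "l3_vxlan_en" and the config bit are made-up
names for illustration, not a proposal of the final knob):

    /* In mlx5_args_check(), next to the existing knobs: */
    } else if (strcmp("l3_vxlan_en", key) == 0) {
            config->l3_vxlan_en = !!tmp;

    /* ...and in the flow item validation path: */
    if (!priv->config.l3_vxlan_en)
            return rte_flow_error_set(error, ENOTSUP,
                                      RTE_FLOW_ERROR_TYPE_ITEM, items,
                                      "L3 VXLAN not enabled by device"
                                      " parameter");

The device would then be probed with e.g. -w <pci>,l3_vxlan_en=1.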

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-13 11:20 ` [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-13 13:02   ` Nélio Laranjeiro
  2018-04-14 12:57     ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:02 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev, Olivier Matz

+Olivier,

On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> This patch introduced tunnel type identification based on flow rules.
> If flows of multiple tunnel types built on same queue,
> RTE_PTYPE_TUNNEL_MASK will be returned, user application could use bits
> in flow mark as tunnel type identifier.

For an application it will mean the packet embeds all tunnel types defined
in DPDK; to do such a thing you need a RTE_PTYPE_TUNNEL_UNKNOWN, which does
not exist currently.
Even with it, the application still needs to parse the packet to
discover which tunnel the packet embeds, so is there any benefit in
having such a bit?  Not so sure.
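
For reference, the application-side logic being discussed would look
roughly like this (a sketch; app_mark_to_tunnel() is a hypothetical
application helper mapping its own flow marks to tunnel types):

    static uint32_t
    pkt_tunnel_type(const struct rte_mbuf *m)
    {
            uint32_t tun = m->packet_type & RTE_PTYPE_TUNNEL_MASK;

            if (tun != RTE_PTYPE_TUNNEL_MASK)
                    return tun; /* single tunnel type on this queue */
            /*
             * Several tunnel types share the queue: fall back to the
             * mark the application set in its rte_flow rules.
             */
            if (m->ol_flags & PKT_RX_FDIR_ID)
                    return app_mark_to_tunnel(m->hash.fdir.hi);
            return 0;
    }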

Thanks,

> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c          | 127 +++++++++++++++++++++++++++++-----
>  drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
>  drivers/net/mlx5/mlx5_rxtx.c          |  12 ++--
>  drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +++---
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +++--
>  6 files changed, 159 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 644f26a95..7d04b4d65 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -225,6 +225,7 @@ struct rte_flow {
>  	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
>  	uint16_t (*queues)[]; /**< Queues indexes to use. */
>  	uint8_t rss_key[40]; /**< copy of the RSS key. */
> +	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
>  	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
>  	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats. */
>  	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
> @@ -241,6 +242,19 @@ struct rte_flow {
>  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
>  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
>  
> +const uint32_t flow_ptype[] = {
> +	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> +	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> +};
> +
> +#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
> +
> +const uint32_t ptype_ext[] = {
> +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
> +					      RTE_PTYPE_L4_UDP,
> +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> +};
> +
>  /** Structure to generate a simple graph of layers supported by the NIC. */
>  struct mlx5_flow_items {
>  	/** List of possible actions for these items. */
> @@ -440,6 +454,7 @@ struct mlx5_flow_parse {
>  	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
>  	uint8_t rss_key[40]; /**< copy of the RSS key. */
>  	enum hash_rxq_type layer; /**< Last pattern layer detected. */
> +	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
>  	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
>  	struct {
>  		struct ibv_flow_attr *ibv_attr;
> @@ -858,7 +873,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
>  		if (ret)
>  			goto exit_item_not_supported;
>  		if (IS_TUNNEL(items->type)) {
> -			if (parser->inner) {
> +			if (parser->tunnel) {
>  				rte_flow_error_set(error, ENOTSUP,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   items,
> @@ -867,6 +882,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
>  				return -rte_errno;
>  			}
>  			parser->inner = IBV_FLOW_SPEC_INNER;
> +			parser->tunnel = flow_ptype[items->type];
>  		}
>  		if (parser->drop) {
>  			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
> @@ -1175,6 +1191,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  	}
>  	/* Third step. Conversion parse, fill the specifications. */
>  	parser->inner = 0;
> +	parser->tunnel = 0;
>  	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
>  		struct mlx5_flow_data data = {
>  			.parser = parser,
> @@ -1643,6 +1660,7 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  
>  	id.vni[0] = 0;
>  	parser->inner = IBV_FLOW_SPEC_INNER;
> +	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
>  	if (spec) {
>  		if (!mask)
>  			mask = default_mask;
> @@ -1696,6 +1714,7 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
>  	};
>  
>  	parser->inner = IBV_FLOW_SPEC_INNER;
> +	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
>  	mlx5_flow_create_copy(parser, &tunnel, size);
>  	return 0;
>  }
> @@ -1874,7 +1893,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
>  				      parser->rss_conf.key_len,
>  				      hash_fields,
>  				      parser->rss_conf.queue,
> -				      parser->rss_conf.queue_num);
> +				      parser->rss_conf.queue_num,
> +				      parser->tunnel);
>  		if (flow->frxq[i].hrxq)
>  			continue;
>  		flow->frxq[i].hrxq =
> @@ -1883,7 +1903,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
>  				      parser->rss_conf.key_len,
>  				      hash_fields,
>  				      parser->rss_conf.queue,
> -				      parser->rss_conf.queue_num);
> +				      parser->rss_conf.queue_num,
> +				      parser->tunnel);
>  		if (!flow->frxq[i].hrxq) {
>  			return rte_flow_error_set(error, ENOMEM,
>  						  RTE_FLOW_ERROR_TYPE_HANDLE,
> @@ -1895,6 +1916,40 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
>  }
>  
>  /**
> + * RXQ update after flow rule creation.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param flow
> + *   Pointer to the flow rule.
> + */
> +static void
> +mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
> +{
> +	struct priv *priv = dev->data->dev_private;
> +	unsigned int i;
> +
> +	if (!dev->data->dev_started)
> +		return;
> +	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
> +		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
> +						 [(*flow->queues)[i]];
> +		struct mlx5_rxq_ctrl *rxq_ctrl =
> +			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
> +		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
> +
> +		rxq_data->mark |= flow->mark;
> +		if (!tunnel)
> +			continue;
> +		rxq_ctrl->tunnel_types[tunnel] += 1;
> +		if (rxq_data->tunnel != flow->tunnel)
> +			rxq_data->tunnel = rxq_data->tunnel ?
> +					   RTE_PTYPE_TUNNEL_MASK :
> +					   flow->tunnel;
> +	}
> +}
> +
> +/**
>   * Complete flow rule creation.
>   *
>   * @param dev
> @@ -1954,12 +2009,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
>  				   NULL, "internal error in flow creation");
>  		goto error;
>  	}
> -	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
> -		struct mlx5_rxq_data *q =
> -			(*priv->rxqs)[parser->rss_conf.queue[i]];
> -
> -		q->mark |= parser->mark;
> -	}
> +	mlx5_flow_create_update_rxqs(dev, flow);
>  	return 0;
>  error:
>  	ret = rte_errno; /* Save rte_errno before cleanup. */
> @@ -2032,6 +2082,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
>  	}
>  	/* Copy configuration. */
>  	flow->queues = (uint16_t (*)[])(flow + 1);
> +	flow->tunnel = parser.tunnel;
>  	flow->rss_conf = (struct rte_flow_action_rss){
>  		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
>  		.level = 0,
> @@ -2123,9 +2174,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
>  	struct priv *priv = dev->data->dev_private;
>  	unsigned int i;
>  
> -	if (flow->drop || !flow->mark)
> +	if (flow->drop || !dev->data->dev_started)
>  		goto free;
> -	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
> +	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
> +		/* Update queue tunnel type. */
> +		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
> +						 [(*flow->queues)[i]];
> +		struct mlx5_rxq_ctrl *rxq_ctrl =
> +			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
> +		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
> +
> +		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
> +		rxq_ctrl->tunnel_types[tunnel] -= 1;
> +		if (!rxq_ctrl->tunnel_types[tunnel]) {
> +			/* Update tunnel type. */
> +			uint8_t j;
> +			uint8_t types = 0;
> +			uint8_t last;
> +
> +			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
> +				if (rxq_ctrl->tunnel_types[j]) {
> +					types += 1;
> +					last = j;
> +				}
> +			/* Keep same if more than one tunnel types left. */
> +			if (types == 1)
> +				rxq_data->tunnel = ptype_ext[last];
> +			else if (types == 0)
> +				/* No tunnel type left. */
> +				rxq_data->tunnel = 0;
> +		}
> +	}
> +	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
>  		struct rte_flow *tmp;
>  		int mark = 0;
>  
> @@ -2344,9 +2424,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	struct rte_flow *flow;
> +	unsigned int i;
>  
>  	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
> -		unsigned int i;
>  		struct mlx5_ind_table_ibv *ind_tbl = NULL;
>  
>  		if (flow->drop) {
> @@ -2392,6 +2472,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
>  			(void *)flow);
>  	}
> +	/* Cleanup Rx queue tunnel info. */
> +	for (i = 0; i != priv->rxqs_n; ++i) {
> +		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
> +		struct mlx5_rxq_ctrl *rxq_ctrl =
> +			container_of(q, struct mlx5_rxq_ctrl, rxq);
> +
> +		if (!q)
> +			continue;
> +		memset((void *)rxq_ctrl->tunnel_types, 0,
> +		       sizeof(rxq_ctrl->tunnel_types));
> +		q->tunnel = 0;
> +	}
>  }
>  
>  /**
> @@ -2439,7 +2531,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  					      flow->rss_conf.key_len,
>  					      hash_rxq_init[i].hash_fields,
>  					      flow->rss_conf.queue,
> -					      flow->rss_conf.queue_num);
> +					      flow->rss_conf.queue_num,
> +					      flow->tunnel);
>  			if (flow->frxq[i].hrxq)
>  				goto flow_create;
>  			flow->frxq[i].hrxq =
> @@ -2447,7 +2540,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  					      flow->rss_conf.key_len,
>  					      hash_rxq_init[i].hash_fields,
>  					      flow->rss_conf.queue,
> -					      flow->rss_conf.queue_num);
> +					      flow->rss_conf.queue_num,
> +					      flow->tunnel);
>  			if (!flow->frxq[i].hrxq) {
>  				DRV_LOG(DEBUG,
>  					"port %u flow %p cannot be applied",
> @@ -2469,10 +2563,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  			DRV_LOG(DEBUG, "port %u flow %p applied",
>  				dev->data->port_id, (void *)flow);
>  		}
> -		if (!flow->mark)
> -			continue;
> -		for (i = 0; i != flow->rss_conf.queue_num; ++i)
> -			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
> +		mlx5_flow_create_update_rxqs(dev, flow);
>  	}
>  	return 0;
>  }
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index 1e4354ab3..351acfc0f 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
>   *   first queue index will be taken for the indirection table.
>   * @param queues_n
>   *   Number of queues.
> + * @param tunnel
> + *   Tunnel type.
>   *
>   * @return
>   *   The Verbs object initialised, NULL otherwise and rte_errno is set.
> @@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
>  mlx5_hrxq_new(struct rte_eth_dev *dev,
>  	      const uint8_t *rss_key, uint32_t rss_key_len,
>  	      uint64_t hash_fields,
> -	      const uint16_t *queues, uint32_t queues_n)
> +	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_hrxq *hrxq;
> @@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  	hrxq->qp = qp;
>  	hrxq->rss_key_len = rss_key_len;
>  	hrxq->hash_fields = hash_fields;
> +	hrxq->tunnel = tunnel;
>  	memcpy(hrxq->rss_key, rss_key, rss_key_len);
>  	rte_atomic32_inc(&hrxq->refcnt);
>  	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
> @@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>   *   first queue index will be taken for the indirection table.
>   * @param queues_n
>   *   Number of queues.
> + * @param tunnel
> + *   Tunnel type.
>   *
>   * @return
>   *   An hash Rx queue on success.
> @@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
>  mlx5_hrxq_get(struct rte_eth_dev *dev,
>  	      const uint8_t *rss_key, uint32_t rss_key_len,
>  	      uint64_t hash_fields,
> -	      const uint16_t *queues, uint32_t queues_n)
> +	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_hrxq *hrxq;
> @@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
>  			continue;
>  		if (hrxq->hash_fields != hash_fields)
>  			continue;
> +		if (hrxq->tunnel != tunnel)
> +			continue;
>  		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
>  		if (!ind_tbl)
>  			continue;
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 1f422c70b..d061dfc8a 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -34,7 +34,7 @@
>  #include "mlx5_prm.h"
>  
>  static __rte_always_inline uint32_t
> -rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
> +rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
>  
>  static __rte_always_inline int
>  mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
> @@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
>  	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
>  		     RTE_PTYPE_L4_UDP;
>  	/* Tunneled - L3 */
> +	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
>  	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L4_NONFRAG;
>  	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L4_NONFRAG;
> +	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
>  	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
>  		     RTE_PTYPE_INNER_L4_NONFRAG;
> @@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  /**
>   * Translate RX completion flags to packet type.
>   *
> + * @param[in] rxq
> + *   Pointer to RX queue structure.
>   * @param[in] cqe
>   *   Pointer to CQE.
>   *
> @@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
>   *   Packet type for struct rte_mbuf.
>   */
>  static inline uint32_t
> -rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
> +rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
>  {
>  	uint8_t idx;
>  	uint8_t pinfo = cqe->pkt_info;
> @@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
>  	 * bit[7] = outer_l3_type
>  	 */
>  	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
> -	return mlx5_ptype_table[idx];
> +	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
>  }
>  
>  /**
> @@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  			pkt = seg;
>  			assert(len >= (rxq->crc_present << 2));
>  			/* Update packet information. */
> -			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
> +			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
>  			pkt->ol_flags = 0;
>  			if (rss_hash_res && rxq->rss_hash) {
>  				pkt->hash.rss = rss_hash_res;
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index a702cb603..6866f6818 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -104,6 +104,7 @@ struct mlx5_rxq_data {
>  	void *cq_uar; /* CQ user access region. */
>  	uint32_t cqn; /* CQ number. */
>  	uint8_t cq_arm_sn; /* CQ arm seq number. */
> +	uint32_t tunnel; /* Tunnel information. */
>  } __rte_cache_aligned;
>  
>  /* Verbs Rx queue elements. */
> @@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
>  	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
>  	struct mlx5_rxq_data rxq; /* Data path structure. */
>  	unsigned int socket; /* CPU socket ID for allocations. */
> +	uint32_t tunnel_types[16]; /* Tunnel type counter. */
>  	unsigned int irq:1; /* Whether IRQ is enabled. */
>  	uint16_t idx; /* Queue index. */
>  };
> @@ -145,6 +147,7 @@ struct mlx5_hrxq {
>  	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
>  	struct ibv_qp *qp; /* Verbs queue pair. */
>  	uint64_t hash_fields; /* Verbs Hash fields. */
> +	uint32_t tunnel; /* Tunnel type. */
>  	uint32_t rss_key_len; /* Hash key length in bytes. */
>  	uint8_t rss_key[]; /* Hash key. */
>  };
> @@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
>  struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
>  				const uint8_t *rss_key, uint32_t rss_key_len,
>  				uint64_t hash_fields,
> -				const uint16_t *queues, uint32_t queues_n);
> +				const uint16_t *queues, uint32_t queues_n,
> +				uint32_t tunnel);
>  struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
>  				const uint8_t *rss_key, uint32_t rss_key_len,
>  				uint64_t hash_fields,
> -				const uint16_t *queues, uint32_t queues_n);
> +				const uint16_t *queues, uint32_t queues_n,
> +				uint32_t tunnel);
>  int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
>  int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
>  uint64_t mlx5_get_rx_port_offloads(void);
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index bbe1818ef..9f9136108 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
>  	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
>  	uint64x2_t rearm0, rearm1, rearm2, rearm3;
> +	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
>  
>  	if (rxq->mark) {
>  		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
> @@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	ptype = vshrn_n_u32(ptype_info, 10);
>  	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
>  	ptype = vorr_u16(ptype, op_err);
> -	pkts[0]->packet_type =
> -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
> -	pkts[1]->packet_type =
> -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
> -	pkts[2]->packet_type =
> -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
> -	pkts[3]->packet_type =
> -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
> +	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
> +	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
> +	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
> +	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
> +	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
> +			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
> +	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
> +			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
> +	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
> +			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
> +	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
> +			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
>  	/* Fill flags for checksum and VLAN. */
>  	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
>  	pinfo = vreinterpretq_u32_u8(
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index c088bcb51..d2492481d 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
>  	const __m128i mbuf_init =
>  		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
>  	__m128i rearm0, rearm1, rearm2, rearm3;
> +	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
>  
>  	/* Extract pkt_info field. */
>  	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
> @@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
>  	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
>  	op_err = _mm_srli_epi16(op_err, 8);
>  	ptype = _mm_or_si128(ptype, op_err);
> -	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
> -	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
> -	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
> -	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
> +	pt_idx0 = _mm_extract_epi8(ptype, 0);
> +	pt_idx1 = _mm_extract_epi8(ptype, 2);
> +	pt_idx2 = _mm_extract_epi8(ptype, 4);
> +	pt_idx3 = _mm_extract_epi8(ptype, 6);
> +	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
> +			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
> +	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
> +			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
> +	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
> +			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
> +	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
> +			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
>  	/* Fill flags for checksum and VLAN. */
>  	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
>  	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
> -- 
> 2.13.3
 

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
  2018-04-13 11:58   ` Nélio Laranjeiro
@ 2018-04-13 13:10     ` Xueming(Steven) Li
  2018-04-13 13:46       ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-13 13:10 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 7:59 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
> 
> Hi Xueming,
> 
> Small nits and documentation issues,
> 
> On Fri, Apr 13, 2018 at 07:20:10PM +0800, Xueming Li wrote:
> > This patch supports new 16 Verbs flow priorities by trying to create a
> > simple flow of priority 15. If 16 priorities not available, fallback
> > to traditional 8 priorities.
> >
> > Verb priority mapping:
> > 			8 priorities	>=16 priorities
> > Control flow:		4-7		8-15
> > User normal flow:	1-3		4-7
> > User tunnel flow:	0-2		0-3
> 
> There is an overlap between tunnel and normal flows, is it expected?

For 8 priorities, (4-7), (1-3) and (0-2) are today's behavior, with a
1-priority Verbs shift to make tunnel flows higher priority; please
refer to commit #74936571
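
To make it concrete (my reading of the formula in this patch, with
attr->priority == 0 for user flows and MLX5_CTRL_FLOW_PRIORITY == 1 for
control flows):

16 priorities: base = attr->priority * 8 and non-tunnel flows add 4, so
a user tunnel flow lands at 0 + flow_priority (0..2) = 0-2 inside the
0-3 bucket, a user normal flow at 4 + (0..2) = 4-6 inside 4-7, and
control flows at 12 + (0..2) inside 8-15.

8 priorities: the base is halved and non-tunnel flows add only 1, giving
0-2 for tunnel, 1-3 for normal and 5-7 (inside 4-7) for control, hence
the overlap: tunnel flows sit only one Verbs priority above normal ones.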

> 
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5.c         |  18 +++++++
> >  drivers/net/mlx5/mlx5.h         |   5 ++
> >  drivers/net/mlx5/mlx5_flow.c    | 112
> +++++++++++++++++++++++++++++++++-------
> >  drivers/net/mlx5/mlx5_trigger.c |   8 ---
> >  4 files changed, 115 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > cfab55897..38118e524 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> >  		priv->txqs_n = 0;
> >  		priv->txqs = NULL;
> >  	}
> > +	mlx5_flow_delete_drop_queue(dev);
> >  	if (priv->pd != NULL) {
> >  		assert(priv->ctx != NULL);
> >  		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
> > @@ -612,6 +613,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> >  	unsigned int mps;
> >  	unsigned int cqe_comp;
> >  	unsigned int tunnel_en = 0;
> > +	unsigned int verb_priorities = 0;
> >  	int idx;
> >  	int i;
> >  	struct mlx5dv_context attrs_out = {0}; @@ -993,6 +995,22 @@
> > mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> >  		mlx5_set_link_up(eth_dev);
> >  		/* Store device configuration on private structure. */
> >  		priv->config = config;
> > +		/* Create drop queue. */
> > +		err = mlx5_flow_create_drop_queue(eth_dev);
> > +		if (err) {
> > +			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> > +				eth_dev->data->port_id, strerror(rte_errno));
> > +			goto port_error;
> > +		}
> > +		/* Supported Verbs flow priority number detection. */
> > +		if (verb_priorities == 0)
> > +			verb_priorities = priv_get_max_verbs_prio(eth_dev);
> 
> No more priv*(), rename it to mlx5_get_max_verbs_prio().
> 
> > +		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
> > +			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
> > +				eth_dev->data->port_id, verb_priorities);
> > +			goto port_error;
> > +		}
> > +		priv->config.max_verb_prio = verb_priorities;
> 
> s/verb/verbs/
> 
> >  		continue;
> >  port_error:
> >  		if (priv)
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > 63b24e6bb..6e4613fe0 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -89,6 +89,7 @@ struct mlx5_dev_config {
> >  	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
> >  	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
> >  	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
> > +	unsigned int max_verb_prio; /* Number of Verb flow priorities. */
> >  	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
> >  	unsigned int ind_table_max_size; /* Maximum indirection table size.
> */
> >  	int txq_inline; /* Maximum packet size for inlining. */ @@ -105,6
> > +106,9 @@ enum mlx5_verbs_alloc_type {
> >  	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
> >  };
> >
> > +/* 8 Verbs priorities. */
> > +#define MLX5_VERBS_FLOW_PRIO_8 8
> > +
> >  /**
> >   * Verbs allocator needs a context to know in the callback which kind
> of
> >   * resources it is allocating.
> > @@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
> >
> >  /* mlx5_flow.c */
> >
> > +unsigned int priv_get_max_verbs_prio(struct rte_eth_dev *dev);
> >  int mlx5_flow_validate(struct rte_eth_dev *dev,
> >  		       const struct rte_flow_attr *attr,
> >  		       const struct rte_flow_item items[], diff --git
> > a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index
> > 288610620..5c4f0b586 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -32,8 +32,8 @@
> >  #include "mlx5_prm.h"
> >  #include "mlx5_glue.h"
> >
> > -/* Define minimal priority for control plane flows. */ -#define
> > MLX5_CTRL_FLOW_PRIORITY 4
> > +/* Flow priority for control plane flows. */ #define
> > +MLX5_CTRL_FLOW_PRIORITY 1
> >
> >  /* Internet Protocol versions. */
> >  #define MLX5_IPV4 4
> > @@ -129,7 +129,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_SRC_PORT_TCP |
> >  				IBV_RX_HASH_DST_PORT_TCP),
> >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
> > -		.flow_priority = 1,
> > +		.flow_priority = 0,
> >  		.ip_version = MLX5_IPV4,
> >  	},
> >  	[HASH_RXQ_UDPV4] = {
> > @@ -138,7 +138,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_SRC_PORT_UDP |
> >  				IBV_RX_HASH_DST_PORT_UDP),
> >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
> > -		.flow_priority = 1,
> > +		.flow_priority = 0,
> >  		.ip_version = MLX5_IPV4,
> >  	},
> >  	[HASH_RXQ_IPV4] = {
> > @@ -146,7 +146,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_DST_IPV4),
> >  		.dpdk_rss_hf = (ETH_RSS_IPV4 |
> >  				ETH_RSS_FRAG_IPV4),
> > -		.flow_priority = 2,
> > +		.flow_priority = 1,
> >  		.ip_version = MLX5_IPV4,
> >  	},
> >  	[HASH_RXQ_TCPV6] = {
> > @@ -155,7 +155,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_SRC_PORT_TCP |
> >  				IBV_RX_HASH_DST_PORT_TCP),
> >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
> > -		.flow_priority = 1,
> > +		.flow_priority = 0,
> >  		.ip_version = MLX5_IPV6,
> >  	},
> >  	[HASH_RXQ_UDPV6] = {
> > @@ -164,7 +164,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_SRC_PORT_UDP |
> >  				IBV_RX_HASH_DST_PORT_UDP),
> >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
> > -		.flow_priority = 1,
> > +		.flow_priority = 0,
> >  		.ip_version = MLX5_IPV6,
> >  	},
> >  	[HASH_RXQ_IPV6] = {
> > @@ -172,13 +172,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
> >  				IBV_RX_HASH_DST_IPV6),
> >  		.dpdk_rss_hf = (ETH_RSS_IPV6 |
> >  				ETH_RSS_FRAG_IPV6),
> > -		.flow_priority = 2,
> > +		.flow_priority = 1,
> >  		.ip_version = MLX5_IPV6,
> >  	},
> >  	[HASH_RXQ_ETH] = {
> >  		.hash_fields = 0,
> >  		.dpdk_rss_hf = 0,
> > -		.flow_priority = 3,
> > +		.flow_priority = 2,
> >  	},
> >  };
> >
> > @@ -900,30 +900,50 @@ mlx5_flow_convert_allocate(unsigned int size,
> struct rte_flow_error *error)
> >   * Make inner packet matching with an higher priority from the non
> Inner
> >   * matching.
> >   *
> > + * @param dev
> > + *   Pointer to Ethernet device.
> >   * @param[in, out] parser
> >   *   Internal parser structure.
> >   * @param attr
> >   *   User flow attribute.
> >   */
> >  static void
> > -mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
> > +mlx5_flow_update_priority(struct rte_eth_dev *dev,
> > +			  struct mlx5_flow_parse *parser,
> >  			  const struct rte_flow_attr *attr)  {
> > +	struct priv *priv = dev->data->dev_private;
> >  	unsigned int i;
> > +	uint16_t priority;
> >
> > +	/*			8 priorities	>= 16 priorities
> > +	 * Control flow:	4-7		8-15
> > +	 * User normal flow:	1-3		4-7
> > +	 * User tunnel flow:	0-2		0-3
> > +	 */
> 
> Same comment here, the tunnel flows overlap with normal flows when
> there are only 8 priorities.
> 
> > +	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
> > +	if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> > +		priority /= 2;
> > +	/*
> > +	 * Lower non-tunnel flow Verbs priority 1 if only support 8 Verbs
> > +	 * priorities, lower 4 otherwise.
> > +	 */
> > +	if (!parser->inner) {
> > +		if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> > +			priority += 1;
> > +		else
> > +			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
> > +	}
> >  	if (parser->drop) {
> > -		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
> > -			attr->priority +
> > -			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
> > +		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
> > +				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
> >  		return;
> >  	}
> >  	for (i = 0; i != hash_rxq_init_n; ++i) {
> > -		if (parser->queue[i].ibv_attr) {
> > -			parser->queue[i].ibv_attr->priority =
> > -				attr->priority +
> > -				hash_rxq_init[i].flow_priority -
> > -				(parser->inner ? 1 : 0);
> > -		}
> > +		if (!parser->queue[i].ibv_attr)
> > +			continue;
> > +		parser->queue[i].ibv_attr->priority = priority +
> > +				hash_rxq_init[i].flow_priority;
> >  	}
> >  }
> >
> > @@ -1158,7 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  	 */
> >  	if (!parser->drop)
> >  		mlx5_flow_convert_finalise(parser);
> > -	mlx5_flow_update_priority(parser, attr);
> > +	mlx5_flow_update_priority(dev, parser, attr);
> >  exit_free:
> >  	/* Only verification is expected, all resources should be released.
> */
> >  	if (!parser->create) {
> > @@ -3161,3 +3181,55 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
> >  	}
> >  	return 0;
> >  }
> > +
> > +/**
> > + * Detect number of Verbs flow priorities supported.
> > + *
> > + * @param dev
> > + *   Pointer to Ethernet device.
> > + *
> > + * @return
> > + *   number of supported Verbs flow priority.
> > + */
> > +unsigned int
> > +priv_get_max_verbs_prio(struct rte_eth_dev *dev) {
> > +	struct priv *priv = dev->data->dev_private;
> > +	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
> > +	struct {
> > +		struct ibv_flow_attr attr;
> > +		struct ibv_flow_spec_eth eth;
> > +		struct ibv_flow_spec_action_drop drop;
> > +	} flow_attr = {
> > +		.attr = {
> > +			.num_of_specs = 2,
> > +		},
> > +		.eth = {
> > +			.type = IBV_FLOW_SPEC_ETH,
> > +			.size = sizeof(struct ibv_flow_spec_eth),
> > +		},
> > +		.drop = {
> > +			.size = sizeof(struct ibv_flow_spec_action_drop),
> > +			.type = IBV_FLOW_SPEC_ACTION_DROP,
> > +		},
> > +	};
> > +	struct ibv_flow *flow;
> > +
> > +	do {
> > +		flow_attr.attr.priority = verb_priorities - 1;
> > +		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
> > +					      &flow_attr.attr);
> > +		if (flow) {
> > +			claim_zero(mlx5_glue->destroy_flow(flow));
> > +			/* Try more priorities. */
> > +			verb_priorities *= 2;
> > +		} else {
> > +			/* Failed, restore last right number. */
> > +			verb_priorities /= 2;
> > +			break;
> > +		}
> > +	} while (1);
> > +	DRV_LOG(INFO, "port %u Verbs flow priorities: %d",
> > +		dev->data->port_id, verb_priorities);
> 
> Please remove this developer log, it will confuse users into believing
> they have N priorities, which is absolutely not the case.

How about changing it to DEBUG level? This should be very useful for
troubleshooting in real deployments.

I could append something like "user flow priorities: %d" to avoid confusion.

> 
> > +	return verb_priorities;
> > +}
> > diff --git a/drivers/net/mlx5/mlx5_trigger.c
> > b/drivers/net/mlx5/mlx5_trigger.c index 6bb4ffb14..d80a2e688 100644
> > --- a/drivers/net/mlx5/mlx5_trigger.c
> > +++ b/drivers/net/mlx5/mlx5_trigger.c
> > @@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
> >  	int ret;
> >
> >  	dev->data->dev_started = 1;
> > -	ret = mlx5_flow_create_drop_queue(dev);
> > -	if (ret) {
> > -		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> > -			dev->data->port_id, strerror(rte_errno));
> > -		goto error;
> > -	}
> >  	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
> >  		dev->data->port_id);
> >  	rte_mempool_walk(mlx5_mp2mr_iter, priv); @@ -202,7 +196,6 @@
> > mlx5_dev_start(struct rte_eth_dev *dev)
> >  	mlx5_traffic_disable(dev);
> >  	mlx5_txq_stop(dev);
> >  	mlx5_rxq_stop(dev);
> > -	mlx5_flow_delete_drop_queue(dev);
> >  	rte_errno = ret; /* Restore rte_errno. */
> >  	return -rte_errno;
> >  }
> > @@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
> >  	mlx5_rxq_stop(dev);
> >  	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
> >  		mlx5_mr_release(mr);
> > -	mlx5_flow_delete_drop_queue(dev);
> >  }
> >
> >  /**
> > --
> > 2.13.3
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 07/14] net/mlx5: support tunnel RSS level
  2018-04-13 11:20 ` [PATCH v3 07/14] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-13 13:27   ` Nélio Laranjeiro
  2018-04-14 10:12     ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:27 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

It seems you did not read my comments on this patch; I have exactly the
same ones.

On Fri, Apr 13, 2018 at 07:20:16PM +0800, Xueming Li wrote:
> Tunnel RSS level of flow RSS action offers user a choice to do RSS hash
> calculation on inner or outer RSS fields. Testpmd flow command examples:
> 
> GRE flow inner RSS:
>   flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
> actions rss queues 1 2 end level 1 / end
> 
> GRE tunnel flow outer RSS:
>   flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
> actions rss queues 1 2 end level 0 / end
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/Makefile    |   2 +-
>  drivers/net/mlx5/mlx5_flow.c | 247 +++++++++++++++++++++++++++++--------------
>  drivers/net/mlx5/mlx5_glue.c |  16 +++
>  drivers/net/mlx5/mlx5_glue.h |   8 ++
>  drivers/net/mlx5/mlx5_rxq.c  |  56 +++++++++-
>  drivers/net/mlx5/mlx5_rxtx.h |   5 +-
>  6 files changed, 248 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> index ae118ad33..f9a6c460b 100644
> --- a/drivers/net/mlx5/Makefile
> +++ b/drivers/net/mlx5/Makefile
> @@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  LIB = librte_pmd_mlx5.a
>  LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
>  LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
> -LIB_GLUE_VERSION = 18.02.0
> +LIB_GLUE_VERSION = 18.05.0
>  
>  # Sources.
>  SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index dd099f328..a22554706 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -116,6 +116,7 @@ enum hash_rxq_type {
>  	HASH_RXQ_UDPV6,
>  	HASH_RXQ_IPV6,
>  	HASH_RXQ_ETH,
> +	HASH_RXQ_TUNNEL,
>  };
>  
>  /* Initialization data for hash RX queue. */
> @@ -454,6 +455,7 @@ struct mlx5_flow_parse {
>  	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
>  	uint8_t rss_key[40]; /**< copy of the RSS key. */
>  	enum hash_rxq_type layer; /**< Last pattern layer detected. */
> +	enum hash_rxq_type out_layer; /**< Last outer pattern layer detected. */
>  	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
>  	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
>  	struct {
> @@ -461,6 +463,7 @@ struct mlx5_flow_parse {
>  		/**< Pointer to Verbs attributes. */
>  		unsigned int offset;
>  		/**< Current position or total size of the attribute. */
> +		uint64_t hash_fields; /**< Verbs hash fields. */
>  	} queue[RTE_DIM(hash_rxq_init)];
>  };
>  
> @@ -696,7 +699,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
>  						   " function is Toeplitz");
>  				return -rte_errno;
>  			}
> -			if (rss->level) {
> +#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> +			if (parser->rss_conf.level > 0) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ACTION,
>  						   actions,
> @@ -704,6 +708,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
>  						   " level is not supported");
>  				return -rte_errno;
>  			}
> +#endif
> +			if (parser->rss_conf.level > 1) {
> +				rte_flow_error_set(error, EINVAL,
> +						   RTE_FLOW_ERROR_TYPE_ACTION,
> +						   actions,
> +						   "RSS encapsulation level"
> +						   " > 1 is not supported");
> +				return -rte_errno;
> +			}
>  			if (rss->types & MLX5_RSS_HF_MASK) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ACTION,

Same comment as in the previous review: the levels do not match the
proposed API.
Level 0 = unspecified, 1 = outermost, 2 = next outermost, 3 = next-next
outermost, and so on.
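
With those semantics the guard would look something like this (a
minimal sketch of the idea only, reusing the error convention of this
very hunk; it is not the actual fix):

	/*
	 * Proposed semantics: 0 = unspecified (PMD chooses),
	 * 1 = outermost, 2 = the innermost level this device can
	 * hash on; anything deeper is out of reach for the hardware.
	 */
	if (rss->level > 2) {
		rte_flow_error_set(error, ENOTSUP,
				   RTE_FLOW_ERROR_TYPE_ACTION,
				   actions,
				   "RSS encapsulation level"
				   " > 2 is not supported");
		return -rte_errno;
	}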

> @@ -754,7 +767,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
>  			}
>  			parser->rss_conf = (struct rte_flow_action_rss){
>  				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
> -				.level = 0,
> +				.level = rss->level,
>  				.types = rss->types,
>  				.key_len = rss_key_len,
>  				.queue_num = rss->queue_num,
> @@ -838,10 +851,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
>  static int
> -mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> +mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> +				 const struct rte_flow_item items[],
>  				 struct rte_flow_error *error,
>  				 struct mlx5_flow_parse *parser)
>  {
> +	struct priv *priv = dev->data->dev_private;
>  	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
>  	unsigned int i;
>  	int ret = 0;
> @@ -881,6 +896,14 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
>  						   " tunnel encapsulations.");
>  				return -rte_errno;
>  			}
> +			if (!priv->config.tunnel_en &&
> +			    parser->rss_conf.level) {
> +				rte_flow_error_set(error, ENOTSUP,
> +					RTE_FLOW_ERROR_TYPE_ITEM,
> +					items,
> +					"Tunnel offloading not enabled");
> +				return -rte_errno;
> +			}
>  			parser->inner = IBV_FLOW_SPEC_INNER;
>  			parser->tunnel = flow_ptype[items->type];
>  		}
> @@ -1000,7 +1023,11 @@ static void
>  mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
>  {
>  	unsigned int i;
> +	uint32_t inner = parser->inner;
>  
> +	/* Don't create extra flows for outer RSS. */
> +	if (parser->tunnel && !parser->rss_conf.level)
> +		return;
>  	/*
>  	 * Fill missing layers in verbs specifications, or compute the correct
>  	 * offset to allocate the memory space for the attributes and
> @@ -1011,23 +1038,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
>  			struct ibv_flow_spec_ipv4_ext ipv4;
>  			struct ibv_flow_spec_ipv6 ipv6;
>  			struct ibv_flow_spec_tcp_udp udp_tcp;
> +			struct ibv_flow_spec_eth eth;
>  		} specs;
>  		void *dst;
>  		uint16_t size;
>  
>  		if (i == parser->layer)
>  			continue;
> -		if (parser->layer == HASH_RXQ_ETH) {
> +		if (parser->layer == HASH_RXQ_ETH ||
> +		    parser->layer == HASH_RXQ_TUNNEL) {
>  			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
>  				size = sizeof(struct ibv_flow_spec_ipv4_ext);
>  				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
> -					.type = IBV_FLOW_SPEC_IPV4_EXT,
> +					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
>  					.size = size,
>  				};
>  			} else {
>  				size = sizeof(struct ibv_flow_spec_ipv6);
>  				specs.ipv6 = (struct ibv_flow_spec_ipv6){
> -					.type = IBV_FLOW_SPEC_IPV6,
> +					.type = inner | IBV_FLOW_SPEC_IPV6,
>  					.size = size,
>  				};
>  			}
> @@ -1044,7 +1073,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
>  		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
>  			size = sizeof(struct ibv_flow_spec_tcp_udp);
>  			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
> -				.type = ((i == HASH_RXQ_UDPV4 ||
> +				.type = inner | ((i == HASH_RXQ_UDPV4 ||
>  					  i == HASH_RXQ_UDPV6) ?
>  					 IBV_FLOW_SPEC_UDP :
>  					 IBV_FLOW_SPEC_TCP),
> @@ -1065,6 +1094,8 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
>  /**
>   * Update flows according to pattern and RSS hash fields.
>   *
> + * @param dev
> + *   Pointer to Ethernet device.
>   * @param[in, out] parser
>   *   Internal parser structure.
>   *
> @@ -1072,16 +1103,17 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
>  static int
> -mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
> +mlx5_flow_convert_rss(struct rte_eth_dev *dev, struct mlx5_flow_parse *parser)
>  {
> -	const unsigned int ipv4 =
> +	unsigned int ipv4 =
>  		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
>  	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
>  	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
>  	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
>  	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
> -	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> +	enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
>  	unsigned int i;
> +	int found = 0;
>  
>  	/* Remove any other flow not matching the pattern. */
>  	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
> @@ -1093,9 +1125,51 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
>  		}
>  		return 0;
>  	}
> -	if (parser->layer == HASH_RXQ_ETH)
> +	/*
> +	 * Outer RSS.
> +	 * HASH_RXQ_ETH is the only rule since tunnel packet match this
> +	 * rule must match outer pattern.
> +	 */
> +	if (parser->tunnel && !parser->rss_conf.level) {
> +		/* Remove flows other than default. */
> +		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
> +			rte_free(parser->queue[i].ibv_attr);
> +			parser->queue[i].ibv_attr = NULL;
> +		}
> +		ipv4 = hash_rxq_init[parser->out_layer].ip_version == MLX5_IPV4;
> +		ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> +		if (hash_rxq_init[parser->out_layer].dpdk_rss_hf &
> +		    parser->rss_conf.types) {
> +			parser->queue[HASH_RXQ_ETH].hash_fields =
> +				hash_rxq_init[parser->out_layer].hash_fields;
> +		} else if (ip && (hash_rxq_init[ip].dpdk_rss_hf &
> +		    parser->rss_conf.types)) {
> +			parser->queue[HASH_RXQ_ETH].hash_fields =
> +				hash_rxq_init[ip].hash_fields;
> +		} else if (parser->rss_conf.types) {
> +			DRV_LOG(WARNING,
> +				"port %u rss outer hash function doesn't match"
> +				" pattern", dev->data->port_id);
> +		}
> +		return 0;
> +	}
> +	if (parser->layer == HASH_RXQ_ETH || parser->layer == HASH_RXQ_TUNNEL) {
> +		/* Remove unused flows according to hash function. */
> +		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
> +			if (!parser->queue[i].ibv_attr)
> +				continue;
> +			if (hash_rxq_init[i].dpdk_rss_hf &
> +			    parser->rss_conf.types) {
> +				parser->queue[i].hash_fields =
> +					hash_rxq_init[i].hash_fields;
> +				continue;
> +			}
> +			rte_free(parser->queue[i].ibv_attr);
> +			parser->queue[i].ibv_attr = NULL;
> +		}
>  		return 0;
> -	/* This layer becomes useless as the pattern define under layers. */
> +	}
> +	/* Remove ETH layer flow. */
>  	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
>  	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
>  	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
> @@ -1105,20 +1179,50 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
>  		rte_free(parser->queue[i].ibv_attr);
>  		parser->queue[i].ibv_attr = NULL;
>  	}
> -	/* Remove impossible flow according to the RSS configuration. */
> -	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
> -	    parser->rss_conf.types) {
> -		/* Remove any other flow. */
> +	/*
> +	 * Keep L4 flows as IP pattern has to support L4 RSS.
> +	 * Otherwise, only keep the flow that match the pattern.
> +	 */
> +	if (parser->layer != ip) {
> +		/* Only keep the flow that match the pattern. */
>  		for (i = hmin; i != (hmax + 1); ++i) {
> -			if (i == parser->layer || !parser->queue[i].ibv_attr)
> +			if (i == parser->layer)
>  				continue;
>  			rte_free(parser->queue[i].ibv_attr);
>  			parser->queue[i].ibv_attr = NULL;
>  		}
> -	} else if (!parser->queue[ip].ibv_attr) {
> -		/* no RSS possible with the current configuration. */
> -		parser->rss_conf.queue_num = 1;
>  	}
> +	/* Remove impossible flow according to the RSS configuration. */
> +	for (i = hmin; i != (hmax + 1); ++i) {
> +		if (!parser->queue[i].ibv_attr)
> +			continue;
> +		if (parser->rss_conf.types &
> +		    hash_rxq_init[i].dpdk_rss_hf) {
> +			parser->queue[i].hash_fields =
> +				hash_rxq_init[i].hash_fields;
> +			found = 1;
> +			continue;
> +		}
> +		/* L4 flow could be used for L3 RSS. */
> +		if (i == parser->layer && i < ip &&
> +		    (hash_rxq_init[ip].dpdk_rss_hf &
> +		     parser->rss_conf.types)) {
> +			parser->queue[i].hash_fields =
> +				hash_rxq_init[ip].hash_fields;
> +			found = 1;
> +			continue;
> +		}
> +		/* L3 flow and L4 hash: non-rss L3 flow. */
> +		if (i == parser->layer && i == ip && found)
> +			/* IP pattern and L4 HF. */
> +			continue;
> +		rte_free(parser->queue[i].ibv_attr);
> +		parser->queue[i].ibv_attr = NULL;
> +	}
> +	if (!found)
> +		DRV_LOG(WARNING,
> +			"port %u rss hash function doesn't match "
> +			"pattern", dev->data->port_id);

The hash function (Toeplitz, XOR) is not applied to the pattern; it is
used to compute a hash result from selected fields of the packet.  This
comment is wrong.

Another point: such a log will trigger for an application using the
MLX5 PMD but not for one using the MLX4 PMD, purely because of how the
NICs behind the MLX5 PMD work internally (MLX4 can use a single hash Rx
queue whereas MLX5 needs one hash Rx queue per kind of protocol).
Since the behavior is exactly the same either way, I *strongly* suggest
removing such an annoying warning.
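
To illustrate the point, here is a plain-software sketch of what
Toeplitz actually does (purely illustrative, not code from the patch):
the key and the packet tuple are the only inputs, the flow pattern
never enters the computation.

	#include <stdint.h>
	#include <stddef.h>

	/*
	 * Software Toeplitz hash: 'key' is the RSS key (at least
	 * len + 4 bytes long, e.g. the usual 40-byte key), 'input'
	 * the concatenated packet fields selected by the hash type
	 * (e.g. IPv4 src/dst plus L4 ports), in network byte order.
	 */
	static uint32_t
	toeplitz_hash(const uint8_t *key, const uint8_t *input, size_t len)
	{
		/* Sliding window over the leftmost 32 bits of the key. */
		uint32_t window = ((uint32_t)key[0] << 24) |
				  ((uint32_t)key[1] << 16) |
				  ((uint32_t)key[2] << 8) | key[3];
		uint32_t hash = 0;
		size_t i;
		int bit;

		for (i = 0; i < len; ++i)
			for (bit = 7; bit >= 0; --bit) {
				/* XOR the window in for each set input bit. */
				if (input[i] & (1 << bit))
					hash ^= window;
				/* Slide the key window left by one bit. */
				window = (window << 1) |
					 ((key[i + 4] >> bit) & 1);
			}
		return hash;
	}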

>  	return 0;
>  }
>  
> @@ -1165,7 +1269,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
>  	if (ret)
>  		return ret;
> -	ret = mlx5_flow_convert_items_validate(items, error, parser);
> +	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
>  	if (ret)
>  		return ret;
>  	mlx5_flow_convert_finalise(parser);
> @@ -1186,10 +1290,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  		for (i = 0; i != hash_rxq_init_n; ++i) {
>  			unsigned int offset;
>  
> -			if (!(parser->rss_conf.types &
> -			      hash_rxq_init[i].dpdk_rss_hf) &&
> -			    (i != HASH_RXQ_ETH))
> -				continue;
>  			offset = parser->queue[i].offset;
>  			parser->queue[i].ibv_attr =
>  				mlx5_flow_convert_allocate(offset, error);
> @@ -1201,6 +1301,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  	/* Third step. Conversion parse, fill the specifications. */
>  	parser->inner = 0;
>  	parser->tunnel = 0;
> +	parser->layer = HASH_RXQ_ETH;
>  	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
>  		struct mlx5_flow_data data = {
>  			.parser = parser,
> @@ -1218,23 +1319,23 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
>  		if (ret)
>  			goto exit_free;
>  	}
> -	if (parser->mark)
> -		mlx5_flow_create_flag_mark(parser, parser->mark_id);
> -	if (parser->count && parser->create) {
> -		mlx5_flow_create_count(dev, parser);
> -		if (!parser->cs)
> -			goto exit_count_error;
> -	}
>  	/*
>  	 * Last step. Complete missing specification to reach the RSS
>  	 * configuration.
>  	 */
>  	if (!parser->drop)
> -		ret = mlx5_flow_convert_rss(parser);
> +		ret = mlx5_flow_convert_rss(dev, parser);
>  		if (ret)
>  			goto exit_free;
>  		mlx5_flow_convert_finalise(parser);
>  	mlx5_flow_update_priority(dev, parser, attr);
> +	if (parser->mark)
> +		mlx5_flow_create_flag_mark(parser, parser->mark_id);
> +	if (parser->count && parser->create) {
> +		mlx5_flow_create_count(dev, parser);
> +		if (!parser->cs)
> +			goto exit_count_error;
> +	}
>  exit_free:
>  	/* Only verification is expected, all resources should be released. */
>  	if (!parser->create) {
> @@ -1282,17 +1383,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse *parser, void *src,
>  	for (i = 0; i != hash_rxq_init_n; ++i) {
>  		if (!parser->queue[i].ibv_attr)
>  			continue;
> -		/* Specification must be the same l3 type or none. */
> -		if (parser->layer == HASH_RXQ_ETH ||
> -		    (hash_rxq_init[parser->layer].ip_version ==
> -		     hash_rxq_init[i].ip_version) ||
> -		    (hash_rxq_init[i].ip_version == 0)) {
> -			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> -					parser->queue[i].offset);
> -			memcpy(dst, src, size);
> -			++parser->queue[i].ibv_attr->num_of_specs;
> -			parser->queue[i].offset += size;
> -		}
> +		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> +				parser->queue[i].offset);
> +		memcpy(dst, src, size);
> +		++parser->queue[i].ibv_attr->num_of_specs;
> +		parser->queue[i].offset += size;
>  	}
>  }
>  
> @@ -1323,9 +1418,7 @@ mlx5_flow_create_eth(const struct rte_flow_item *item,
>  		.size = eth_size,
>  	};
>  
> -	/* Don't update layer for the inner pattern. */
> -	if (!parser->inner)
> -		parser->layer = HASH_RXQ_ETH;
> +	parser->layer = HASH_RXQ_ETH;
>  	if (spec) {
>  		unsigned int i;
>  
> @@ -1438,9 +1531,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
>  		.size = ipv4_size,
>  	};
>  
> -	/* Don't update layer for the inner pattern. */
> -	if (!parser->inner)
> -		parser->layer = HASH_RXQ_IPV4;
> +	parser->layer = HASH_RXQ_IPV4;
>  	if (spec) {
>  		if (!mask)
>  			mask = default_mask;
> @@ -1493,9 +1584,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
>  		.size = ipv6_size,
>  	};
>  
> -	/* Don't update layer for the inner pattern. */
> -	if (!parser->inner)
> -		parser->layer = HASH_RXQ_IPV6;
> +	parser->layer = HASH_RXQ_IPV6;
>  	if (spec) {
>  		unsigned int i;
>  		uint32_t vtc_flow_val;
> @@ -1568,13 +1657,10 @@ mlx5_flow_create_udp(const struct rte_flow_item *item,
>  		.size = udp_size,
>  	};
>  
> -	/* Don't update layer for the inner pattern. */
> -	if (!parser->inner) {
> -		if (parser->layer == HASH_RXQ_IPV4)
> -			parser->layer = HASH_RXQ_UDPV4;
> -		else
> -			parser->layer = HASH_RXQ_UDPV6;
> -	}
> +	if (parser->layer == HASH_RXQ_IPV4)
> +		parser->layer = HASH_RXQ_UDPV4;
> +	else
> +		parser->layer = HASH_RXQ_UDPV6;
>  	if (spec) {
>  		if (!mask)
>  			mask = default_mask;
> @@ -1617,13 +1703,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item *item,
>  		.size = tcp_size,
>  	};
>  
> -	/* Don't update layer for the inner pattern. */
> -	if (!parser->inner) {
> -		if (parser->layer == HASH_RXQ_IPV4)
> -			parser->layer = HASH_RXQ_TCPV4;
> -		else
> -			parser->layer = HASH_RXQ_TCPV6;
> -	}
> +	if (parser->layer == HASH_RXQ_IPV4)
> +		parser->layer = HASH_RXQ_TCPV4;
> +	else
> +		parser->layer = HASH_RXQ_TCPV6;
>  	if (spec) {
>  		if (!mask)
>  			mask = default_mask;
> @@ -1673,6 +1756,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  	id.vni[0] = 0;
>  	parser->inner = IBV_FLOW_SPEC_INNER;
>  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
> +	parser->out_layer = parser->layer;
> +	parser->layer = HASH_RXQ_TUNNEL;
>  	if (spec) {
>  		if (!mask)
>  			mask = default_mask;
> @@ -1727,6 +1812,8 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
>  
>  	parser->inner = IBV_FLOW_SPEC_INNER;
>  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> +	parser->out_layer = parser->layer;
> +	parser->layer = HASH_RXQ_TUNNEL;
>  	mlx5_flow_create_copy(parser, &tunnel, size);
>  	return 0;
>  }
> @@ -1890,33 +1977,33 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
>  	unsigned int i;
>  
>  	for (i = 0; i != hash_rxq_init_n; ++i) {
> -		uint64_t hash_fields;
> -
>  		if (!parser->queue[i].ibv_attr)
>  			continue;
>  		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
>  		parser->queue[i].ibv_attr = NULL;
> -		hash_fields = hash_rxq_init[i].hash_fields;
> +		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
>  		if (!priv->dev->data->dev_started)
>  			continue;
>  		flow->frxq[i].hrxq =
>  			mlx5_hrxq_get(dev,
>  				      parser->rss_conf.key,
>  				      parser->rss_conf.key_len,
> -				      hash_fields,
> +				      flow->frxq[i].hash_fields,
>  				      parser->rss_conf.queue,
>  				      parser->rss_conf.queue_num,
> -				      parser->tunnel);
> +				      parser->tunnel,
> +				      parser->rss_conf.level);
>  		if (flow->frxq[i].hrxq)
>  			continue;
>  		flow->frxq[i].hrxq =
>  			mlx5_hrxq_new(dev,
>  				      parser->rss_conf.key,
>  				      parser->rss_conf.key_len,
> -				      hash_fields,
> +				      flow->frxq[i].hash_fields,
>  				      parser->rss_conf.queue,
>  				      parser->rss_conf.queue_num,
> -				      parser->tunnel);
> +				      parser->tunnel,
> +				      parser->rss_conf.level);
>  		if (!flow->frxq[i].hrxq) {
>  			return rte_flow_error_set(error, ENOMEM,
>  						  RTE_FLOW_ERROR_TYPE_HANDLE,
> @@ -2013,7 +2100,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
>  		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
>  			dev->data->port_id,
>  			(void *)flow, i,
> -			(void *)flow->frxq[i].hrxq,
> +			(void *)flow->frxq[i].hrxq->qp,
>  			(void *)flow->frxq[i].ibv_flow);
>  	}
>  	if (!flows_n) {
> @@ -2541,19 +2628,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  			flow->frxq[i].hrxq =
>  				mlx5_hrxq_get(dev, flow->rss_conf.key,
>  					      flow->rss_conf.key_len,
> -					      hash_rxq_init[i].hash_fields,
> +					      flow->frxq[i].hash_fields,
>  					      flow->rss_conf.queue,
>  					      flow->rss_conf.queue_num,
> -					      flow->tunnel);
> +					      flow->tunnel,
> +					      flow->rss_conf.level);
>  			if (flow->frxq[i].hrxq)
>  				goto flow_create;
>  			flow->frxq[i].hrxq =
>  				mlx5_hrxq_new(dev, flow->rss_conf.key,
>  					      flow->rss_conf.key_len,
> -					      hash_rxq_init[i].hash_fields,
> +					      flow->frxq[i].hash_fields,
>  					      flow->rss_conf.queue,
>  					      flow->rss_conf.queue_num,
> -					      flow->tunnel);
> +					      flow->tunnel,
> +					      flow->rss_conf.level);
>  			if (!flow->frxq[i].hrxq) {
>  				DRV_LOG(DEBUG,
>  					"port %u flow %p cannot be applied",
> diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
> index be684d378..6874aa32a 100644
> --- a/drivers/net/mlx5/mlx5_glue.c
> +++ b/drivers/net/mlx5/mlx5_glue.c
> @@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
>  	return mlx5dv_init_obj(obj, obj_type);
>  }
>  
> +static struct ibv_qp *
> +mlx5_glue_dv_create_qp(struct ibv_context *context,
> +		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
> +		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
> +{
> +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> +	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
> +#else
> +	(void)context;
> +	(void)qp_init_attr_ex;
> +	(void)dv_qp_init_attr;
> +	return NULL;
> +#endif
> +}
> +
>  const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
>  	.version = MLX5_GLUE_VERSION,
>  	.fork_init = mlx5_glue_fork_init,
> @@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
>  	.dv_query_device = mlx5_glue_dv_query_device,
>  	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
>  	.dv_init_obj = mlx5_glue_dv_init_obj,
> +	.dv_create_qp = mlx5_glue_dv_create_qp,
>  };
> diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
> index b5efee3b6..841363872 100644
> --- a/drivers/net/mlx5/mlx5_glue.h
> +++ b/drivers/net/mlx5/mlx5_glue.h
> @@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
>  struct ibv_query_counter_set_attr;
>  #endif
>  
> +#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> +struct mlx5dv_qp_init_attr;
> +#endif
> +
>  /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
>  struct mlx5_glue {
>  	const char *version;
> @@ -106,6 +110,10 @@ struct mlx5_glue {
>  				   enum mlx5dv_set_ctx_attr_type type,
>  				   void *attr);
>  	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
> +	struct ibv_qp *(*dv_create_qp)
> +		(struct ibv_context *context,
> +		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
> +		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
>  };
>  
>  const struct mlx5_glue *mlx5_glue;
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index 073732e16..1997609ec 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
>   *   Number of queues.
>   * @param tunnel
>   *   Tunnel type.
> + * @param rss_level
> + *   RSS hash on tunnel level.
>   *
>   * @return
>   *   The Verbs object initialised, NULL otherwise and rte_errno is set.
> @@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
>  mlx5_hrxq_new(struct rte_eth_dev *dev,
>  	      const uint8_t *rss_key, uint32_t rss_key_len,
>  	      uint64_t hash_fields,
> -	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> +	      const uint16_t *queues, uint32_t queues_n,
> +	      uint32_t tunnel, uint32_t rss_level)
>  {
>  	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_hrxq *hrxq;
>  	struct mlx5_ind_table_ibv *ind_tbl;
>  	struct ibv_qp *qp;
>  	int err;
> +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> +	struct mlx5dv_qp_init_attr qp_init_attr = {0};
> +#endif
>  
>  	queues_n = hash_fields ? queues_n : 1;
>  	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
> @@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  		rte_errno = ENOMEM;
>  		return NULL;
>  	}
> +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> +	if (tunnel) {
> +		qp_init_attr.comp_mask =
> +				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
> +		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
> +	}
> +	qp = mlx5_glue->dv_create_qp(
> +		priv->ctx,
> +		&(struct ibv_qp_init_attr_ex){
> +			.qp_type = IBV_QPT_RAW_PACKET,
> +			.comp_mask =
> +				IBV_QP_INIT_ATTR_PD |
> +				IBV_QP_INIT_ATTR_IND_TABLE |
> +				IBV_QP_INIT_ATTR_RX_HASH,
> +			.rx_hash_conf = (struct ibv_rx_hash_conf){
> +				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
> +				.rx_hash_key_len = rss_key_len ? rss_key_len :
> +						   rss_hash_default_key_len,
> +				.rx_hash_key = rss_key ?
> +					       (void *)(uintptr_t)rss_key :
> +					       rss_hash_default_key,
> +				.rx_hash_fields_mask = hash_fields |
> +					(tunnel && rss_level ?
> +					(uint32_t)IBV_RX_HASH_INNER : 0),
> +			},
> +			.rwq_ind_tbl = ind_tbl->ind_table,
> +			.pd = priv->pd,
> +		},
> +		&qp_init_attr);
> +#else
>  	qp = mlx5_glue->create_qp_ex
>  		(priv->ctx,
>  		 &(struct ibv_qp_init_attr_ex){
> @@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  				IBV_QP_INIT_ATTR_RX_HASH,
>  			.rx_hash_conf = (struct ibv_rx_hash_conf){
>  				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
> -				.rx_hash_key_len = rss_key_len,
> -				.rx_hash_key = (void *)(uintptr_t)rss_key,
> +				.rx_hash_key_len = rss_key_len ? rss_key_len :
> +						   rss_hash_default_key_len,
> +				.rx_hash_key = rss_key ?
> +					       (void *)(uintptr_t)rss_key :
> +					       rss_hash_default_key,
>  				.rx_hash_fields_mask = hash_fields,
>  			},
>  			.rwq_ind_tbl = ind_tbl->ind_table,
>  			.pd = priv->pd,
>  		 });
> +#endif
>  	if (!qp) {
>  		rte_errno = errno;
>  		goto error;
> @@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  	hrxq->rss_key_len = rss_key_len;
>  	hrxq->hash_fields = hash_fields;
>  	hrxq->tunnel = tunnel;
> +	hrxq->rss_level = rss_level;
>  	memcpy(hrxq->rss_key, rss_key, rss_key_len);
>  	rte_atomic32_inc(&hrxq->refcnt);
>  	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
> @@ -1448,6 +1489,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  	return hrxq;
>  error:
>  	err = rte_errno; /* Save rte_errno before cleanup. */
> +	DRV_LOG(ERR, "port %u: Error creating Hash Rx queue",
> +		dev->data->port_id);

An internal developer log should not remain in the code.  The user
already gets a flow creation failure; there is no need to annoy them
with messages they cannot understand.

>  	mlx5_ind_table_ibv_release(dev, ind_tbl);
>  	if (qp)
>  		claim_zero(mlx5_glue->destroy_qp(qp));
> @@ -1469,6 +1512,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>   *   Number of queues.
>   * @param tunnel
>   *   Tunnel type.
> + * @param rss_level
> + *   RSS hash on tunnel level
>   *
>   * @return
>   *   An hash Rx queue on success.
> @@ -1477,7 +1522,8 @@ struct mlx5_hrxq *
>  mlx5_hrxq_get(struct rte_eth_dev *dev,
>  	      const uint8_t *rss_key, uint32_t rss_key_len,
>  	      uint64_t hash_fields,
> -	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> +	      const uint16_t *queues, uint32_t queues_n,
> +	      uint32_t tunnel, uint32_t rss_level)

rss_level > 1 already means tunnel, so there is no need to carry
redundant information.
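
In other words, the QP attributes could derive it on the spot, e.g.
(fragment of the QP attributes, sketch only):

	.rx_hash_fields_mask = hash_fields |
		(rss_level > 1 ? (uint32_t)IBV_RX_HASH_INNER : 0),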

>  {
>  	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_hrxq *hrxq;
> @@ -1494,6 +1540,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
>  			continue;
>  		if (hrxq->tunnel != tunnel)
>  			continue;
> +		if (hrxq->rss_level != rss_level)
> +			continue;
>  		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
>  		if (!ind_tbl)
>  			continue;
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index d35605b55..62cf55109 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -147,6 +147,7 @@ struct mlx5_hrxq {
>  	struct ibv_qp *qp; /* Verbs queue pair. */
>  	uint64_t hash_fields; /* Verbs Hash fields. */
>  	uint32_t tunnel; /* Tunnel type. */
> +	uint32_t rss_level; /* RSS on tunnel level. */
>  	uint32_t rss_key_len; /* Hash key length in bytes. */
>  	uint8_t rss_key[]; /* Hash key. */
>  };
> @@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
>  				const uint8_t *rss_key, uint32_t rss_key_len,
>  				uint64_t hash_fields,
>  				const uint16_t *queues, uint32_t queues_n,
> -				uint32_t tunnel);
> +				uint32_t tunnel, uint32_t rss_level);
>  struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
>  				const uint8_t *rss_key, uint32_t rss_key_len,
>  				uint64_t hash_fields,
>  				const uint16_t *queues, uint32_t queues_n,
> -				uint32_t tunnel);
> +				uint32_t tunnel, uint32_t rss_level);
>  int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
>  int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
>  uint64_t mlx5_get_rx_port_offloads(void);
> -- 
> 2.13.3
> 

Thanks,
-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 08/14] net/mlx5: add hardware flow debug dump
  2018-04-13 11:20 ` [PATCH v3 08/14] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-13 13:29   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:29 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 07:20:17PM +0800, Xueming Li wrote:
> Dump verb flow detail including flow spec type and size for debugging
> purpose.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
>  drivers/net/mlx5/mlx5_rxq.c   | 25 +++++++++++++---
>  drivers/net/mlx5/mlx5_utils.h |  6 ++++
>  3 files changed, 85 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index a22554706..c99722770 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -2049,6 +2049,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
>  }
>  
>  /**
> + * Dump flow hash RX queue detail.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param flow
> + *   Pointer to the rte_flow.
> + * @param i
> + *   Hash RX queue index.
> + */
> +static void
> +mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
> +	       struct rte_flow *flow __rte_unused,
> +	       unsigned int i __rte_unused)
> +{
> +#ifndef NDEBUG
> +	uintptr_t spec_ptr;
> +	uint16_t j;
> +	char buf[256];
> +	uint8_t off;
> +
> +	spec_ptr = (uintptr_t)(flow->frxq[i].ibv_attr + 1);
> +	for (j = 0, off = 0; j < flow->frxq[i].ibv_attr->num_of_specs;
> +	     j++) {
> +		struct ibv_flow_spec *spec = (void *)spec_ptr;
> +		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
> +			       spec->hdr.size);
> +		spec_ptr += spec->hdr.size;
> +	}
> +	DRV_LOG(DEBUG,
> +		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p, hash:%lx/%u"
> +		" specs:%hhu(%hu), priority:%hu, type:%d, flags:%x,"
> +		" comp_mask:%x specs:%s",
> +		dev->data->port_id, (void *)flow, i,
> +		(void *)flow->frxq[i].hrxq,
> +		(void *)flow->frxq[i].hrxq->qp,
> +		(void *)flow->frxq[i].hrxq->ind_table,
> +		flow->frxq[i].hash_fields |
> +		(flow->tunnel &&
> +		 flow->rss_conf.level ? (uint32_t)IBV_RX_HASH_INNER : 0),
> +		flow->rss_conf.queue_num,
> +		flow->frxq[i].ibv_attr->num_of_specs,
> +		flow->frxq[i].ibv_attr->size,
> +		flow->frxq[i].ibv_attr->priority,
> +		flow->frxq[i].ibv_attr->type,
> +		flow->frxq[i].ibv_attr->flags,
> +		flow->frxq[i].ibv_attr->comp_mask,
> +		buf);
> +#endif
> +}
> +
> +/**
>   * Complete flow rule creation.
>   *
>   * @param dev
> @@ -2090,6 +2141,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
>  		flow->frxq[i].ibv_flow =
>  			mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
>  					       flow->frxq[i].ibv_attr);
> +		mlx5_flow_dump(dev, flow, i);
>  		if (!flow->frxq[i].ibv_flow) {
>  			rte_flow_error_set(error, ENOMEM,
>  					   RTE_FLOW_ERROR_TYPE_HANDLE,
> @@ -2097,11 +2149,6 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
>  			goto error;
>  		}
>  		++flows_n;
> -		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
> -			dev->data->port_id,
> -			(void *)flow, i,
> -			(void *)flow->frxq[i].hrxq->qp,
> -			(void *)flow->frxq[i].ibv_flow);
>  	}
>  	if (!flows_n) {
>  		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE,
> @@ -2645,24 +2692,25 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
>  					      flow->rss_conf.level);
>  			if (!flow->frxq[i].hrxq) {
>  				DRV_LOG(DEBUG,
> -					"port %u flow %p cannot be applied",
> +					"port %u flow %p cannot create hash"
> +					" rxq",
>  					dev->data->port_id, (void *)flow);
>  				rte_errno = EINVAL;
>  				return -rte_errno;
>  			}
>  flow_create:
> +			mlx5_flow_dump(dev, flow, i);
>  			flow->frxq[i].ibv_flow =
>  				mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
>  						       flow->frxq[i].ibv_attr);
>  			if (!flow->frxq[i].ibv_flow) {
>  				DRV_LOG(DEBUG,
> -					"port %u flow %p cannot be applied",
> -					dev->data->port_id, (void *)flow);
> +					"port %u flow %p type %u cannot be"
> +					" applied",
> +					dev->data->port_id, (void *)flow, i);
>  				rte_errno = EINVAL;
>  				return -rte_errno;
>  			}
> -			DRV_LOG(DEBUG, "port %u flow %p applied",
> -				dev->data->port_id, (void *)flow);
>  		}
>  		mlx5_flow_create_update_rxqs(dev, flow);
>  	}
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index 1997609ec..f55980836 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1259,9 +1259,9 @@ mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues,
>  	}
>  	rte_atomic32_inc(&ind_tbl->refcnt);
>  	LIST_INSERT_HEAD(&priv->ind_tbls, ind_tbl, next);
> -	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
> -		dev->data->port_id, (void *)ind_tbl,
> -		rte_atomic32_read(&ind_tbl->refcnt));
> +	DEBUG("port %u new indirection table %p: queues:%u refcnt:%d",
> +	      dev->data->port_id, (void *)ind_tbl, 1 << wq_n,
> +	      rte_atomic32_read(&ind_tbl->refcnt));
>  	return ind_tbl;
>  error:
>  	rte_free(ind_tbl);
> @@ -1330,9 +1330,12 @@ mlx5_ind_table_ibv_release(struct rte_eth_dev *dev,
>  	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
>  		((struct priv *)dev->data->dev_private)->port,
>  		(void *)ind_tbl, rte_atomic32_read(&ind_tbl->refcnt));
> -	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt))
> +	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt)) {
>  		claim_zero(mlx5_glue->destroy_rwq_ind_table
>  			   (ind_tbl->ind_table));
> +		DEBUG("port %u delete indirection table %p: queues: %u",
> +		      dev->data->port_id, (void *)ind_tbl, ind_tbl->queues_n);
> +	}
>  	for (i = 0; i != ind_tbl->queues_n; ++i)
>  		claim_nonzero(mlx5_rxq_release(dev, ind_tbl->queues[i]));
>  	if (!rte_atomic32_read(&ind_tbl->refcnt)) {
> @@ -1445,6 +1448,12 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  			.pd = priv->pd,
>  		},
>  		&qp_init_attr);
> +	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
> +	      " level:%hhu dv_attr:comp_mask:0x%lx create_flags:0x%x",
> +	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
> +	      (tunnel && rss_level ? (uint32_t)IBV_RX_HASH_INNER : 0) |
> +	      hash_fields, tunnel, rss_level,
> +	      qp_init_attr.comp_mask, qp_init_attr.create_flags);
>  #else
>  	qp = mlx5_glue->create_qp_ex
>  		(priv->ctx,
> @@ -1466,6 +1475,10 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
>  			.rwq_ind_tbl = ind_tbl->ind_table,
>  			.pd = priv->pd,
>  		 });
> +	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
> +	      " level:%hhu",
> +	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
> +	      hash_fields, tunnel, rss_level);
>  #endif
>  	if (!qp) {
>  		rte_errno = errno;
> @@ -1577,6 +1590,10 @@ mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
>  		(void *)hrxq, rte_atomic32_read(&hrxq->refcnt));
>  	if (rte_atomic32_dec_and_test(&hrxq->refcnt)) {
>  		claim_zero(mlx5_glue->destroy_qp(hrxq->qp));
> +		DEBUG("port %u delete QP %p: hash: 0x%lx, tunnel:"
> +		      " 0x%x, level: %hhu",
> +		      dev->data->port_id, (void *)hrxq, hrxq->hash_fields,
> +		      hrxq->tunnel, hrxq->rss_level);
>  		mlx5_ind_table_ibv_release(dev, hrxq->ind_table);
>  		LIST_REMOVE(hrxq, next);
>  		rte_free(hrxq);
> diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
> index 85d2aae2b..9a3181b1f 100644
> --- a/drivers/net/mlx5/mlx5_utils.h
> +++ b/drivers/net/mlx5/mlx5_utils.h
> @@ -103,16 +103,22 @@ extern int mlx5_logtype;
>  /* claim_zero() does not perform any check when debugging is disabled. */
>  #ifndef NDEBUG
>  
> +#define DEBUG(...) DRV_LOG(DEBUG, __VA_ARGS__)
>  #define claim_zero(...) assert((__VA_ARGS__) == 0)
>  #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
>  
>  #else /* NDEBUG */
>  
> +#define DEBUG(...) (void)0
>  #define claim_zero(...) (__VA_ARGS__)
>  #define claim_nonzero(...) (__VA_ARGS__)
>  
>  #endif /* NDEBUG */
>  
> +#define INFO(...) DRV_LOG(INFO, __VA_ARGS__)
> +#define WARN(...) DRV_LOG(WARNING, __VA_ARGS__)
> +#define ERROR(...) DRV_LOG(ERR, __VA_ARGS__)
> +
>  /* Convenience macros for accessing mbuf fields. */
>  #define NEXT(m) ((m)->next)
>  #define DATA_LEN(m) ((m)->data_len)
> -- 
> 2.13.3

This is a really good first step, even if the use of the DEBUG macro
could also be spread over all the Verbs object creations; they are
purely developer needs.
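
For instance (hypothetical placement and variable names, just reusing
the DEBUG macro this patch introduces around one more Verbs object
creation):

	struct ibv_cq *cq;

	cq = mlx5_glue->create_cq(priv->ctx, cqe_n, NULL, NULL, 0);
	DEBUG("port %u new CQ %p: cqe_n:%u",
	      dev->data->port_id, (void *)cq, cqe_n);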

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type
  2018-04-13 11:20 ` [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-13 13:32   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:32 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

Small nit,

On Fri, Apr 13, 2018 at 07:20:18PM +0800, Xueming Li wrote:
> Add VXLAN-GPE support to rte flow.

Duplicating the title in the body does not seem useful ;)

> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 95 +++++++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_rxtx.c |  3 +-
>  2 files changed, 95 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index c99722770..19973b13c 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -91,6 +91,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  		       struct mlx5_flow_data *data);
>  
>  static int
> +mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
> +			   const void *default_mask,
> +			   struct mlx5_flow_data *data);
> +
> +static int
>  mlx5_flow_create_gre(const struct rte_flow_item *item,
>  		       const void *default_mask,
>  		       struct mlx5_flow_data *data);
> @@ -241,10 +246,12 @@ struct rte_flow {
>  
>  #define IS_TUNNEL(type) ( \
>  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
>  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
>  
>  const uint32_t flow_ptype[] = {
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> +	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
>  	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
>  };
>  
> @@ -253,6 +260,8 @@ const uint32_t flow_ptype[] = {
>  const uint32_t ptype_ext[] = {
>  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
>  					      RTE_PTYPE_L4_UDP,
> +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
> +						  RTE_PTYPE_L4_UDP,
>  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
>  };
>  
> @@ -310,6 +319,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	[RTE_FLOW_ITEM_TYPE_END] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
>  			       RTE_FLOW_ITEM_TYPE_VXLAN,
> +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
>  			       RTE_FLOW_ITEM_TYPE_GRE),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_ETH] = {
> @@ -388,7 +398,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.dst_sz = sizeof(struct ibv_flow_spec_ipv6),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_UDP] = {
> -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN),
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
> +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_udp){
>  			.hdr = {
> @@ -440,6 +451,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.convert = mlx5_flow_create_vxlan,
>  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
>  	},
> +	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = {
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4,
> +			       RTE_FLOW_ITEM_TYPE_IPV6),
> +		.actions = valid_actions,
> +		.mask = &(const struct rte_flow_item_vxlan_gpe){
> +			.vni = "\xff\xff\xff",
> +		},
> +		.default_mask = &rte_flow_item_vxlan_gpe_mask,
> +		.mask_sz = sizeof(struct rte_flow_item_vxlan_gpe),
> +		.convert = mlx5_flow_create_vxlan_gpe,
> +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> +	},
>  };
>  
>  /** Structure to pass to the conversion function. */
> @@ -1786,6 +1810,75 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  }
>  
>  /**
> + * Convert VXLAN-GPE item to Verbs specification.
> + *
> + * @param item[in]
> + *   Item specification.
> + * @param default_mask[in]
> + *   Default bit-masks to use when item->mask is not provided.
> + * @param data[in, out]
> + *   User structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
> +			   const void *default_mask,
> +			   struct mlx5_flow_data *data)
> +{
> +	const struct rte_flow_item_vxlan_gpe *spec = item->spec;
> +	const struct rte_flow_item_vxlan_gpe *mask = item->mask;
> +	struct mlx5_flow_parse *parser = data->parser;
> +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> +	struct ibv_flow_spec_tunnel vxlan = {
> +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> +		.size = size,
> +	};
> +	union vni {
> +		uint32_t vlan_id;
> +		uint8_t vni[4];
> +	} id;
> +	int r;
> +
> +	id.vni[0] = 0;
> +	parser->inner = IBV_FLOW_SPEC_INNER;
> +	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)];
> +	parser->out_layer = parser->layer;
> +	parser->layer = HASH_RXQ_TUNNEL;
> +	if (spec) {
> +		if (!mask)
> +			mask = default_mask;
> +		memcpy(&id.vni[1], spec->vni, 3);
> +		vxlan.val.tunnel_id = id.vlan_id;
> +		memcpy(&id.vni[1], mask->vni, 3);
> +		vxlan.mask.tunnel_id = id.vlan_id;
> +		if (spec->protocol) {
> +			r = EINVAL;
> +			return r;

rte_errno is not set and the returned value is positive.
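
Following the convention used a few lines below for the VNI check,
e.g.:

	if (spec->protocol)
		return rte_flow_error_set(data->error, EINVAL,
					  RTE_FLOW_ERROR_TYPE_ITEM,
					  item,
					  "VxLAN-GPE protocol not"
					  " supported");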

> +		}
> +		/* Remove unwanted bits from values. */
> +		vxlan.val.tunnel_id &= vxlan.mask.tunnel_id;
> +	}
> +	/*
> +	 * Tunnel id 0 is equivalent as not adding a VXLAN layer, if only this
> +	 * layer is defined in the Verbs specification it is interpreted as
> +	 * wildcard and all packets will match this rule, if it follows a full
> +	 * stack layer (ex: eth / ipv4 / udp), all packets matching the layers
> +	 * before will also match this rule.
> +	 * To avoid such situation, VNI 0 is currently refused.
> +	 */
> +	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
> +	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
> +		return rte_flow_error_set(data->error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  item,
> +					  "VxLAN-GPE vni cannot be 0");
> +	mlx5_flow_create_copy(parser, &vxlan, size);
> +	return 0;
> +}
> +
> +/**
>   * Convert GRE item to Verbs specification.
>   *
>   * @param item[in]
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 285b2dbf0..c9342d659 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -466,8 +466,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  			uint8_t vlan_sz =
>  				(buf->ol_flags & PKT_TX_VLAN_PKT) ? 4 : 0;
>  			const uint64_t is_tunneled =
> -				buf->ol_flags & (PKT_TX_TUNNEL_GRE |
> -						 PKT_TX_TUNNEL_VXLAN);
> +				buf->ol_flags & (PKT_TX_TUNNEL_MASK);
>  
>  			tso_header_sz = buf->l2_len + vlan_sz +
>  					buf->l3_len + buf->l4_len;
> -- 
> 2.13.3
> 

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  2018-04-13 11:20 ` [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP Xueming Li
@ 2018-04-13 13:37   ` Nélio Laranjeiro
  2018-04-13 14:48     ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:37 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

Some nits,

On Fri, Apr 13, 2018 at 07:20:20PM +0800, Xueming Li wrote:
> This patch supports new tunnel type MPLS-in-GRE and MPLS-in-UDP.
> Flow pattern example:
>   ipv4 proto is 47 / gre proto is 0x8847 / mpls
>   ipv4 / udp dst is 6635 / mpls / end
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/Makefile    |   5 ++
>  drivers/net/mlx5/mlx5.c      |  15 +++++
>  drivers/net/mlx5/mlx5.h      |   1 +
>  drivers/net/mlx5/mlx5_flow.c | 148 ++++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 166 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> index f9a6c460b..33553483e 100644
> --- a/drivers/net/mlx5/Makefile
> +++ b/drivers/net/mlx5/Makefile
> @@ -131,6 +131,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
>  		enum MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS \
>  		$(AUTOCONF_OUTPUT)
>  	$Q sh -- '$<' '$@' \
> +		HAVE_IBV_DEVICE_MPLS_SUPPORT \
> +		infiniband/verbs.h \
> +		enum IBV_FLOW_SPEC_MPLS \
> +		$(AUTOCONF_OUTPUT)
> +	$Q sh -- '$<' '$@' \
>  		HAVE_IBV_WQ_FLAG_RX_END_PADDING \
>  		infiniband/verbs.h \
>  		enum IBV_WQ_FLAG_RX_END_PADDING \
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 38118e524..89b683d6e 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -614,6 +614,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  	unsigned int cqe_comp;
>  	unsigned int tunnel_en = 0;
>  	unsigned int verb_priorities = 0;
> +	unsigned int mpls_en = 0;
>  	int idx;
>  	int i;
>  	struct mlx5dv_context attrs_out = {0};
> @@ -720,12 +721,25 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN) &&
>  			     (attrs_out.tunnel_offloads_caps &
>  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE));
> +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> +		mpls_en = ((attrs_out.tunnel_offloads_caps &
> +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_GRE) &&
> +			   (attrs_out.tunnel_offloads_caps &
> +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_UDP) &&
> +			   (attrs_out.tunnel_offloads_caps &
> +			  MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CTRL_DW_MPLS));
> +#endif
>  	}
>  	DRV_LOG(DEBUG, "tunnel offloading is %ssupported",
>  		tunnel_en ? "" : "not ");
> +	DRV_LOG(DEBUG, "MPLS over GRE/UDP offloading is %ssupported",
> +		mpls_en ? "" : "not ");
>  #else
>  	DRV_LOG(WARNING,
>  		"tunnel offloading disabled due to old OFED/rdma-core version");
> +	DRV_LOG(WARNING,
> +		"MPLS over GRE/UDP offloading disabled due to old"
> +		" OFED/rdma-core version or firmware configuration");
>  #endif
>  	if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr)) {
>  		err = errno;
> @@ -749,6 +763,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  			.cqe_comp = cqe_comp,
>  			.mps = mps,
>  			.tunnel_en = tunnel_en,
> +			.mpls_en = mpls_en,
>  			.tx_vec_en = 1,
>  			.rx_vec_en = 1,
>  			.mpw_hdr_dseg = 0,
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 6e4613fe0..efbcb2156 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -81,6 +81,7 @@ struct mlx5_dev_config {
>  	unsigned int vf:1; /* This is a VF. */
>  	unsigned int mps:2; /* Multi-packet send supported mode. */
>  	unsigned int tunnel_en:1;
> +	unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
>  	/* Whether tunnel stateless offloads are supported. */
>  	unsigned int flow_counter_en:1; /* Whether flow counter is supported. */
>  	unsigned int cqe_comp:1; /* CQE compression is enabled. */
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 0fccd39b3..98edf1882 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -100,6 +100,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item,
>  		       const void *default_mask,
>  		       struct mlx5_flow_data *data);
>  
> +static int
> +mlx5_flow_create_mpls(const struct rte_flow_item *item,
> +		      const void *default_mask,
> +		      struct mlx5_flow_data *data);
> +
>  struct mlx5_flow_parse;
>  
>  static void
> @@ -247,12 +252,14 @@ struct rte_flow {
>  #define IS_TUNNEL(type) ( \
>  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
>  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
> +	(type) == RTE_FLOW_ITEM_TYPE_MPLS || \
>  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
>  
>  const uint32_t flow_ptype[] = {
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
>  	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
>  	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> +	[RTE_FLOW_ITEM_TYPE_MPLS] = RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
>  };
>  
>  #define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
> @@ -263,6 +270,10 @@ const uint32_t ptype_ext[] = {
>  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
>  						  RTE_PTYPE_L4_UDP,
>  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)] =
> +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)] =
> +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE | RTE_PTYPE_L4_UDP,
>  };
>  
>  /** Structure to generate a simple graph of layers supported by the NIC. */
> @@ -399,7 +410,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	},
>  	[RTE_FLOW_ITEM_TYPE_UDP] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
> -			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
> +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
> +			       RTE_FLOW_ITEM_TYPE_MPLS),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_udp){
>  			.hdr = {
> @@ -428,7 +440,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	[RTE_FLOW_ITEM_TYPE_GRE] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
>  			       RTE_FLOW_ITEM_TYPE_IPV4,
> -			       RTE_FLOW_ITEM_TYPE_IPV6),
> +			       RTE_FLOW_ITEM_TYPE_IPV6,
> +			       RTE_FLOW_ITEM_TYPE_MPLS),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_gre){
>  			.protocol = -1,
> @@ -436,7 +449,11 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.default_mask = &rte_flow_item_gre_mask,
>  		.mask_sz = sizeof(struct rte_flow_item_gre),
>  		.convert = mlx5_flow_create_gre,
> +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> +		.dst_sz = sizeof(struct ibv_flow_spec_gre),
> +#else
>  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> +#endif
>  	},
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> @@ -464,6 +481,21 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.convert = mlx5_flow_create_vxlan_gpe,
>  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
>  	},
> +	[RTE_FLOW_ITEM_TYPE_MPLS] = {
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4,
> +			       RTE_FLOW_ITEM_TYPE_IPV6),
> +		.actions = valid_actions,
> +		.mask = &(const struct rte_flow_item_mpls){
> +			.label_tc_s = "\xff\xff\xf0",
> +		},
> +		.default_mask = &rte_flow_item_mpls_mask,
> +		.mask_sz = sizeof(struct rte_flow_item_mpls),
> +		.convert = mlx5_flow_create_mpls,
> +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> +		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
> +#endif
> +	},

Why is the whole item not under the #ifdef?
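
i.e. something like (sketch only):

	#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
		[RTE_FLOW_ITEM_TYPE_MPLS] = {
			.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
				       RTE_FLOW_ITEM_TYPE_IPV4,
				       RTE_FLOW_ITEM_TYPE_IPV6),
			.actions = valid_actions,
			.mask = &(const struct rte_flow_item_mpls){
				.label_tc_s = "\xff\xff\xf0",
			},
			.default_mask = &rte_flow_item_mpls_mask,
			.mask_sz = sizeof(struct rte_flow_item_mpls),
			.convert = mlx5_flow_create_mpls,
			.dst_sz = sizeof(struct ibv_flow_spec_mpls),
		},
	#endif /* HAVE_IBV_DEVICE_MPLS_SUPPORT */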

>  };
>  
>  /** Structure to pass to the conversion function. */
> @@ -912,7 +944,9 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
>  		if (ret)
>  			goto exit_item_not_supported;
>  		if (IS_TUNNEL(items->type)) {
> -			if (parser->tunnel) {
> +			if (parser->tunnel &&
> +			   !(parser->tunnel == RTE_PTYPE_TUNNEL_GRE &&
> +			     items->type == RTE_FLOW_ITEM_TYPE_MPLS)) {
>  				rte_flow_error_set(error, ENOTSUP,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   items,
> @@ -920,6 +954,16 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
>  						   " tunnel encapsulations.");
>  				return -rte_errno;
>  			}
> +			if (items->type == RTE_FLOW_ITEM_TYPE_MPLS &&
> +			    !priv->config.mpls_en) {
> +				rte_flow_error_set(error, ENOTSUP,
> +						   RTE_FLOW_ERROR_TYPE_ITEM,
> +						   items,
> +						   "MPLS not supported or"
> +						   " disabled in firmware"
> +						   " configuration.");
> +				return -rte_errno;
> +			}
>  			if (!priv->config.tunnel_en &&
>  			    parser->rss_conf.level) {
>  				rte_flow_error_set(error, ENOTSUP,
> @@ -1880,6 +1924,80 @@ mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
>  }
>  
>  /**
> + * Convert MPLS item to Verbs specification.
> + * Tunnel types currently supported are MPLS-in-GRE and MPLS-in-UDP.
> + *
> + * @param item[in]
> + *   Item specification.
> + * @param default_mask[in]
> + *   Default bit-masks to use when item->mask is not provided.
> + * @param data[in, out]
> + *   User structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_create_mpls(const struct rte_flow_item *item __rte_unused,
> +		      const void *default_mask __rte_unused,
> +		      struct mlx5_flow_data *data __rte_unused)
> +{
> +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> +	return rte_flow_error_set(data->error, EINVAL,

ENOTSUP is more accurate and keeps the errors consistent.

> +				  RTE_FLOW_ERROR_TYPE_ITEM,
> +				  item,
> +				  "MPLS not supported by driver");
> +#else
> +	unsigned int i;
> +	const struct rte_flow_item_mpls *spec = item->spec;
> +	const struct rte_flow_item_mpls *mask = item->mask;
> +	struct mlx5_flow_parse *parser = data->parser;
> +	unsigned int size = sizeof(struct ibv_flow_spec_mpls);
> +	struct ibv_flow_spec_mpls mpls = {
> +		.type = IBV_FLOW_SPEC_MPLS,
> +		.size = size,
> +	};
> +	union tag {
> +		uint32_t tag;
> +		uint8_t label[4];
> +	} id;
> +
> +	id.tag = 0;
> +	parser->inner = IBV_FLOW_SPEC_INNER;
> +	if (parser->layer == HASH_RXQ_UDPV4 ||
> +	    parser->layer == HASH_RXQ_UDPV6) {
> +		parser->tunnel =
> +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)];
> +		parser->out_layer = parser->layer;
> +	} else {
> +		parser->tunnel =
> +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)];
> +	}
> +	parser->layer = HASH_RXQ_TUNNEL;
> +	if (spec) {
> +		if (!mask)
> +			mask = default_mask;
> +		memcpy(&id.label[1], spec->label_tc_s, 3);
> +		id.label[0] = spec->ttl;
> +		mpls.val.tag = id.tag;
> +		memcpy(&id.label[1], mask->label_tc_s, 3);
> +		id.label[0] = mask->ttl;
> +		mpls.mask.tag = id.tag;
> +		/* Remove unwanted bits from values. */
> +		mpls.val.tag &= mpls.mask.tag;
> +	}
> +	mlx5_flow_create_copy(parser, &mpls, size);
> +	for (i = 0; i != hash_rxq_init_n; ++i) {
> +		if (!parser->queue[i].ibv_attr)
> +			continue;
> +		parser->queue[i].ibv_attr->flags |=
> +			IBV_FLOW_ATTR_FLAGS_ORDERED_SPEC_LIST;
> +	}
> +	return 0;
> +#endif
> +}
> +
> +/**
>   * Convert GRE item to Verbs specification.
>   *
>   * @param item[in]
> @@ -1898,16 +2016,40 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
>  		     struct mlx5_flow_data *data)
>  {
>  	struct mlx5_flow_parse *parser = data->parser;
> +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
>  	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
>  	struct ibv_flow_spec_tunnel tunnel = {
>  		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
>  		.size = size,
>  	};
> +#else
> +	const struct rte_flow_item_gre *spec = item->spec;
> +	const struct rte_flow_item_gre *mask = item->mask;
> +	unsigned int size = sizeof(struct ibv_flow_spec_gre);
> +	struct ibv_flow_spec_gre tunnel = {
> +		.type = parser->inner | IBV_FLOW_SPEC_GRE,
> +		.size = size,
> +	};
> +#endif
>  
>  	parser->inner = IBV_FLOW_SPEC_INNER;
>  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
>  	parser->out_layer = parser->layer;
>  	parser->layer = HASH_RXQ_TUNNEL;
> +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> +	if (spec) {
> +		if (!mask)
> +			mask = default_mask;
> +		tunnel.val.c_ks_res0_ver = spec->c_rsvd0_ver;
> +		tunnel.val.protocol = spec->protocol;
> +		tunnel.val.c_ks_res0_ver = mask->c_rsvd0_ver;
> +		tunnel.val.protocol = mask->protocol;
> +		/* Remove unwanted bits from values. */
> +		tunnel.val.c_ks_res0_ver &= tunnel.mask.c_ks_res0_ver;
> +		tunnel.val.protocol &= tunnel.mask.protocol;
> +		tunnel.val.key &= tunnel.mask.key;
> +	}
> +#endif
>  	mlx5_flow_create_copy(parser, &tunnel, size);
>  	return 0;
>  }
> -- 
> 2.13.3
> 

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading
  2018-04-13 11:20 ` [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading Xueming Li
@ 2018-04-13 13:38   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:38 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 07:20:21PM +0800, Xueming Li wrote:
> Remove tunnel limitations, add new hardware tunnel offload features.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  doc/guides/nics/mlx5.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index b1bab2ce2..c256f85f3 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -100,12 +100,12 @@ Features
>  - RX interrupts.
>  - Statistics query including Basic, Extended and per queue.
>  - Rx HW timestamp.
> +- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
> +- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
>  
>  Limitations
>  -----------
>  
> -- Inner RSS for VXLAN frames is not supported yet.
> -- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
>  - For secondary process:
>  
>    - Forked secondary process not supported.
> -- 
> 2.13.3
 
As discussed in [1], also add a new entry in the features list,
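
For example (assuming the "Inner RSS" feature name from that thread and
the usual per-PMD matrix file), the entry would be a one-liner in
doc/guides/nics/features/mlx5.ini:

   Inner RSS            = Y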

Thanks,

[1] https://dpdk.org/ml/archives/dev/2018-April/096736.html

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 13/14] net/mlx5: fix invalid flow item check
  2018-04-13 11:20 ` [PATCH v3 13/14] net/mlx5: fix invalid flow item check Xueming Li
@ 2018-04-13 13:40   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:40 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 07:20:22PM +0800, Xueming Li wrote:
> This patch fixed invalid flow item check.
> 
> Fixes: 4f1a88e3f9b0 ("net/mlx5: standardize on negative errno values")
> Cc: nelio.laranjeiro@6wind.com
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

> ---
>  drivers/net/mlx5/mlx5_flow.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 98edf1882..d36b6ed8a 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -935,8 +935,10 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
>  				break;
>  			}
>  		}
> -		if (!token)
> +		if (!token) {
> +			ret = -ENOTSUP;
>  			goto exit_item_not_supported;
> +		}
>  		cur_item = token;
>  		ret = mlx5_flow_item_validate(items,
>  					      (const uint8_t *)cur_item->mask,
> -- 
> 2.13.3
 
Should be proposed outside of this series.

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode
  2018-04-13 11:20 ` [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode Xueming Li
@ 2018-04-13 13:43   ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:43 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 07:20:23PM +0800, Xueming Li wrote:
> Enable RSS related configuration in isolated mode.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 89b683d6e..521f60c18 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -333,6 +333,10 @@ const struct eth_dev_ops mlx5_dev_ops_isolate = {
>  	.mtu_set = mlx5_dev_set_mtu,
>  	.vlan_strip_queue_set = mlx5_vlan_strip_queue_set,
>  	.vlan_offload_set = mlx5_vlan_offload_set,
> +	.reta_update = mlx5_dev_rss_reta_update,
> +	.reta_query = mlx5_dev_rss_reta_query,
> +	.rss_hash_update = mlx5_rss_hash_update,
> +	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
>  	.filter_ctrl = mlx5_dev_filter_ctrl,
>  	.rx_descriptor_status = mlx5_rx_descriptor_status,
>  	.tx_descriptor_status = mlx5_tx_descriptor_status,
> -- 
> 2.13.3
 
These APIs only modify the behavior of control plane flows, i.e.
unicast, promiscuous and all multicast; in isolated mode such flows are
never created, to comply with the isolated mode.
There is no need to enable such APIs.

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
  2018-04-13 13:10     ` Xueming(Steven) Li
@ 2018-04-13 13:46       ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 13:46 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 01:10:07PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Friday, April 13, 2018 7:59 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v3 01/14] net/mlx5: support 16 hardware priorities
> > 
> > Hi Xueming,
> > 
> > Small nits and documentation issues,
> > 
> > On Fri, Apr 13, 2018 at 07:20:10PM +0800, Xueming Li wrote:
> > > This patch supports the new 16 Verbs flow priorities by trying to create a
> > > simple flow of priority 15. If 16 priorities are not available, fall back
> > > to the traditional 8 priorities.
> > >
> > > Verb priority mapping:
> > > 			8 priorities	>=16 priorities
> > > Control flow:		4-7		8-15
> > > User normal flow:	1-3		4-7
> > > User tunnel flow:	0-2		0-3
> > 
> > There is an overlap between tunnel and normal flows, is it expected?
> 
> For 8 priorities, (4-7), (1-3) and (0-2) match today's behavior, with a
> shift of 1 Verbs priority to make tunnel flows higher priority; please
> refer to commit #74936571

This is a little confusing written like this: a tunnel flow takes the
normal priority minus 1 according to the inner pattern.
Documented like this, it seems that in such a situation it will overlap
with the normal rule.
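
To make the mapping concrete, here is a minimal standalone sketch of the
remapping done in mlx5_flow_update_priority() below, before the
per-hash-Rx-queue flow_priority offset is added (the helper and the
printed cases are illustrative only; names follow the patch):

 #include <stdio.h>

 #define MLX5_VERBS_FLOW_PRIO_8 8

 static unsigned int
 verbs_priority(unsigned int user_prio, unsigned int max_verbs_prio,
 	       int tunnel)
 {
 	unsigned int priority = user_prio * MLX5_VERBS_FLOW_PRIO_8;

 	if (max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
 		priority /= 2;
 	/* Non-tunnel flows sit 1 (or 4) Verbs priorities lower. */
 	if (!tunnel)
 		priority += (max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8) ?
 			    1 : MLX5_VERBS_FLOW_PRIO_8 / 2;
 	return priority;
 }

 int
 main(void)
 {
 	printf("8 prios, user 0, tunnel:      %u\n", verbs_priority(0, 8, 1));
 	printf("8 prios, user 0, non-tunnel:  %u\n", verbs_priority(0, 8, 0));
 	printf("16 prios, user 0, non-tunnel: %u\n", verbs_priority(0, 16, 0));
 	return 0;
 }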

> > >
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > >  drivers/net/mlx5/mlx5.c         |  18 +++++++
> > >  drivers/net/mlx5/mlx5.h         |   5 ++
> > >  drivers/net/mlx5/mlx5_flow.c    | 112
> > +++++++++++++++++++++++++++++++++-------
> > >  drivers/net/mlx5/mlx5_trigger.c |   8 ---
> > >  4 files changed, 115 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > > cfab55897..38118e524 100644
> > > --- a/drivers/net/mlx5/mlx5.c
> > > +++ b/drivers/net/mlx5/mlx5.c
> > > @@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> > >  		priv->txqs_n = 0;
> > >  		priv->txqs = NULL;
> > >  	}
> > > +	mlx5_flow_delete_drop_queue(dev);
> > >  	if (priv->pd != NULL) {
> > >  		assert(priv->ctx != NULL);
> > >  		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
> > > @@ -612,6 +613,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> > __rte_unused,
> > >  	unsigned int mps;
> > >  	unsigned int cqe_comp;
> > >  	unsigned int tunnel_en = 0;
> > > +	unsigned int verb_priorities = 0;
> > >  	int idx;
> > >  	int i;
> > >  	struct mlx5dv_context attrs_out = {0}; @@ -993,6 +995,22 @@
> > > mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> > >  		mlx5_set_link_up(eth_dev);
> > >  		/* Store device configuration on private structure. */
> > >  		priv->config = config;
> > > +		/* Create drop queue. */
> > > +		err = mlx5_flow_create_drop_queue(eth_dev);
> > > +		if (err) {
> > > +			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> > > +				eth_dev->data->port_id, strerror(rte_errno));
> > > +			goto port_error;
> > > +		}
> > > +		/* Supported Verbs flow priority number detection. */
> > > +		if (verb_priorities == 0)
> > > +			verb_priorities = priv_get_max_verbs_prio(eth_dev);
> > 
> > No more priv*(); rename it to mlx5_get_max_verbs_prio().
> > 
> > > +		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
> > > +			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
> > > +				eth_dev->data->port_id, verb_priorities);
> > > +			goto port_error;
> > > +		}
> > > +		priv->config.max_verb_prio = verb_priorities;
> > 
> > s/verb/verbs/
> > 
> > >  		continue;
> > >  port_error:
> > >  		if (priv)
> > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > > 63b24e6bb..6e4613fe0 100644
> > > --- a/drivers/net/mlx5/mlx5.h
> > > +++ b/drivers/net/mlx5/mlx5.h
> > > @@ -89,6 +89,7 @@ struct mlx5_dev_config {
> > >  	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
> > >  	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
> > >  	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
> > > +	unsigned int max_verb_prio; /* Number of Verb flow priorities. */
> > >  	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
> > >  	unsigned int ind_table_max_size; /* Maximum indirection table size.
> > */
> > >  	int txq_inline; /* Maximum packet size for inlining. */ @@ -105,6
> > > +106,9 @@ enum mlx5_verbs_alloc_type {
> > >  	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
> > >  };
> > >
> > > +/* 8 Verbs priorities. */
> > > +#define MLX5_VERBS_FLOW_PRIO_8 8
> > > +
> > >  /**
> > >   * Verbs allocator needs a context to know in the callback which kind
> > of
> > >   * resources it is allocating.
> > > @@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
> > >
> > >  /* mlx5_flow.c */
> > >
> > > +unsigned int priv_get_max_verbs_prio(struct rte_eth_dev *dev);
> > >  int mlx5_flow_validate(struct rte_eth_dev *dev,
> > >  		       const struct rte_flow_attr *attr,
> > >  		       const struct rte_flow_item items[], diff --git
> > > a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index
> > > 288610620..5c4f0b586 100644
> > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > @@ -32,8 +32,8 @@
> > >  #include "mlx5_prm.h"
> > >  #include "mlx5_glue.h"
> > >
> > > -/* Define minimal priority for control plane flows. */
> > > -#define MLX5_CTRL_FLOW_PRIORITY 4
> > > +/* Flow priority for control plane flows. */
> > > +#define MLX5_CTRL_FLOW_PRIORITY 1
> > >
> > >  /* Internet Protocol versions. */
> > >  #define MLX5_IPV4 4
> > > @@ -129,7 +129,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_SRC_PORT_TCP |
> > >  				IBV_RX_HASH_DST_PORT_TCP),
> > >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
> > > -		.flow_priority = 1,
> > > +		.flow_priority = 0,
> > >  		.ip_version = MLX5_IPV4,
> > >  	},
> > >  	[HASH_RXQ_UDPV4] = {
> > > @@ -138,7 +138,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_SRC_PORT_UDP |
> > >  				IBV_RX_HASH_DST_PORT_UDP),
> > >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
> > > -		.flow_priority = 1,
> > > +		.flow_priority = 0,
> > >  		.ip_version = MLX5_IPV4,
> > >  	},
> > >  	[HASH_RXQ_IPV4] = {
> > > @@ -146,7 +146,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_DST_IPV4),
> > >  		.dpdk_rss_hf = (ETH_RSS_IPV4 |
> > >  				ETH_RSS_FRAG_IPV4),
> > > -		.flow_priority = 2,
> > > +		.flow_priority = 1,
> > >  		.ip_version = MLX5_IPV4,
> > >  	},
> > >  	[HASH_RXQ_TCPV6] = {
> > > @@ -155,7 +155,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_SRC_PORT_TCP |
> > >  				IBV_RX_HASH_DST_PORT_TCP),
> > >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
> > > -		.flow_priority = 1,
> > > +		.flow_priority = 0,
> > >  		.ip_version = MLX5_IPV6,
> > >  	},
> > >  	[HASH_RXQ_UDPV6] = {
> > > @@ -164,7 +164,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_SRC_PORT_UDP |
> > >  				IBV_RX_HASH_DST_PORT_UDP),
> > >  		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
> > > -		.flow_priority = 1,
> > > +		.flow_priority = 0,
> > >  		.ip_version = MLX5_IPV6,
> > >  	},
> > >  	[HASH_RXQ_IPV6] = {
> > > @@ -172,13 +172,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
> > >  				IBV_RX_HASH_DST_IPV6),
> > >  		.dpdk_rss_hf = (ETH_RSS_IPV6 |
> > >  				ETH_RSS_FRAG_IPV6),
> > > -		.flow_priority = 2,
> > > +		.flow_priority = 1,
> > >  		.ip_version = MLX5_IPV6,
> > >  	},
> > >  	[HASH_RXQ_ETH] = {
> > >  		.hash_fields = 0,
> > >  		.dpdk_rss_hf = 0,
> > > -		.flow_priority = 3,
> > > +		.flow_priority = 2,
> > >  	},
> > >  };
> > >
> > > @@ -900,30 +900,50 @@ mlx5_flow_convert_allocate(unsigned int size,
> > struct rte_flow_error *error)
> > >   * Make inner packet matching with an higher priority from the non
> > Inner
> > >   * matching.
> > >   *
> > > + * @param dev
> > > + *   Pointer to Ethernet device.
> > >   * @param[in, out] parser
> > >   *   Internal parser structure.
> > >   * @param attr
> > >   *   User flow attribute.
> > >   */
> > >  static void
> > > -mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
> > > +mlx5_flow_update_priority(struct rte_eth_dev *dev,
> > > +			  struct mlx5_flow_parse *parser,
> > >  			  const struct rte_flow_attr *attr)  {
> > > +	struct priv *priv = dev->data->dev_private;
> > >  	unsigned int i;
> > > +	uint16_t priority;
> > >
> > > +	/*			8 priorities	>= 16 priorities
> > > +	 * Control flow:	4-7		8-15
> > > +	 * User normal flow:	1-3		4-7
> > > +	 * User tunnel flow:	0-2		0-3
> > > +	 */
> > 
> > Same comment here, the tunnel flow overlap when there are only 8
> > priorities.
> > 
> > > +	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
> > > +	if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> > > +		priority /= 2;
> > > +	/*
> > > +	 * Lower non-tunnel flow Verbs priority by 1 if only 8 Verbs
> > > +	 * priorities are supported, by 4 otherwise.
> > > +	 */
> > > +	if (!parser->inner) {
> > > +		if (priv->config.max_verb_prio == MLX5_VERBS_FLOW_PRIO_8)
> > > +			priority += 1;
> > > +		else
> > > +			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
> > > +	}
> > >  	if (parser->drop) {
> > > -		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
> > > -			attr->priority +
> > > -			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
> > > +		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
> > > +				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
> > >  		return;
> > >  	}
> > >  	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > -		if (parser->queue[i].ibv_attr) {
> > > -			parser->queue[i].ibv_attr->priority =
> > > -				attr->priority +
> > > -				hash_rxq_init[i].flow_priority -
> > > -				(parser->inner ? 1 : 0);
> > > -		}
> > > +		if (!parser->queue[i].ibv_attr)
> > > +			continue;
> > > +		parser->queue[i].ibv_attr->priority = priority +
> > > +				hash_rxq_init[i].flow_priority;
> > >  	}
> > >  }
> > >
> > > @@ -1158,7 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> > >  	 */
> > >  	if (!parser->drop)
> > >  		mlx5_flow_convert_finalise(parser);
> > > -	mlx5_flow_update_priority(parser, attr);
> > > +	mlx5_flow_update_priority(dev, parser, attr);
> > >  exit_free:
> > >  	/* Only verification is expected, all resources should be released.
> > */
> > >  	if (!parser->create) {
> > > @@ -3161,3 +3181,55 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
> > >  	}
> > >  	return 0;
> > >  }
> > > +
> > > +/**
> > > + * Detect number of Verbs flow priorities supported.
> > > + *
> > > + * @param dev
> > > + *   Pointer to Ethernet device.
> > > + *
> > > + * @return
> > > + *   Number of supported Verbs flow priorities.
> > > + */
> > > +unsigned int
> > > +priv_get_max_verbs_prio(struct rte_eth_dev *dev)
> > > +{
> > > +	struct priv *priv = dev->data->dev_private;
> > > +	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
> > > +	struct {
> > > +		struct ibv_flow_attr attr;
> > > +		struct ibv_flow_spec_eth eth;
> > > +		struct ibv_flow_spec_action_drop drop;
> > > +	} flow_attr = {
> > > +		.attr = {
> > > +			.num_of_specs = 2,
> > > +		},
> > > +		.eth = {
> > > +			.type = IBV_FLOW_SPEC_ETH,
> > > +			.size = sizeof(struct ibv_flow_spec_eth),
> > > +		},
> > > +		.drop = {
> > > +			.size = sizeof(struct ibv_flow_spec_action_drop),
> > > +			.type = IBV_FLOW_SPEC_ACTION_DROP,
> > > +		},
> > > +	};
> > > +	struct ibv_flow *flow;
> > > +
> > > +	do {
> > > +		flow_attr.attr.priority = verb_priorities - 1;
> > > +		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
> > > +					      &flow_attr.attr);
> > > +		if (flow) {
> > > +			claim_zero(mlx5_glue->destroy_flow(flow));
> > > +			/* Try more priorities. */
> > > +			verb_priorities *= 2;
> > > +		} else {
> > > +			/* Failed, restore last right number. */
> > > +			verb_priorities /= 2;
> > > +			break;
> > > +		}
> > > +	} while (1);
> > > +	DRV_LOG(INFO, "port %u Verbs flow priorities: %d",
> > > +		dev->data->port_id, verb_priorities);
> > 
> > Please remove this developer log, it will confuse the user, who will
> > believe he has N priorities, which is absolutely not the case.
> 
> How about changing it to DEBUG level? This should be very useful for
> troubleshooting in real deployments.

I agree with using DEBUG() instead.

> I could append something like "user flow priorities: %d" to avoid confusion.
> 
> > 
> > > +	return verb_priorities;
> > > +}
> > > diff --git a/drivers/net/mlx5/mlx5_trigger.c
> > > b/drivers/net/mlx5/mlx5_trigger.c index 6bb4ffb14..d80a2e688 100644
> > > --- a/drivers/net/mlx5/mlx5_trigger.c
> > > +++ b/drivers/net/mlx5/mlx5_trigger.c
> > > @@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
> > >  	int ret;
> > >
> > >  	dev->data->dev_started = 1;
> > > -	ret = mlx5_flow_create_drop_queue(dev);
> > > -	if (ret) {
> > > -		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> > > -			dev->data->port_id, strerror(rte_errno));
> > > -		goto error;
> > > -	}
> > >  	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
> > >  		dev->data->port_id);
> > >  	rte_mempool_walk(mlx5_mp2mr_iter, priv); @@ -202,7 +196,6 @@
> > > mlx5_dev_start(struct rte_eth_dev *dev)
> > >  	mlx5_traffic_disable(dev);
> > >  	mlx5_txq_stop(dev);
> > >  	mlx5_rxq_stop(dev);
> > > -	mlx5_flow_delete_drop_queue(dev);
> > >  	rte_errno = ret; /* Restore rte_errno. */
> > >  	return -rte_errno;
> > >  }
> > > @@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
> > >  	mlx5_rxq_stop(dev);
> > >  	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
> > >  		mlx5_mr_release(mr);
> > > -	mlx5_flow_delete_drop_queue(dev);
> > >  }
> > >
> > >  /**
> > > --
> > > 2.13.3
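
As an aside, the doubling probe in priv_get_max_verbs_prio() above can be
illustrated standalone like this (a minimal sketch; try_priority() is a
hypothetical stand-in for creating a drop flow at the given Verbs
priority):

 #include <stdio.h>

 /* Hypothetical stand-in for the Verbs flow creation call: pretend
  * the device accepts priorities 0..15, i.e. 16 priorities. */
 static int
 try_priority(unsigned int prio)
 {
 	return prio < 16;
 }

 int
 main(void)
 {
 	unsigned int n = 8;

 	/* Try the highest priority of the current guess; on success
 	 * double the guess, on failure halve it back and stop. */
 	for (;;) {
 		if (try_priority(n - 1)) {
 			n *= 2;
 		} else {
 			n /= 2;
 			break;
 		}
 	}
 	printf("detected %u Verbs flow priorities\n", n);
 	return 0;
 }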

Thanks,
-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow
  2018-04-13 12:13   ` Nélio Laranjeiro
@ 2018-04-13 13:51     ` Xueming(Steven) Li
  0 siblings, 0 replies; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-13 13:51 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 8:14 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow
> 
> On Fri, Apr 13, 2018 at 07:20:12PM +0800, Xueming Li wrote:
> > This patch supports L3 VXLAN, which has no inner L2 header compared to the
> > standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP destination
> > port to discriminate against standard VXLAN; FW has to be configured to
> > support it:
> >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> 
> This fully deserves an update to the MLX5 guide with such information; users
> already do not read it, don't expect them to read commit logs.

Okay, I'll add them to the document update commit.
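
For reference, such an L3 VXLAN rule could look like this in testpmd flow
syntax (the UDP destination port has to match the IP_OVER_VXLAN_PORT value
configured in firmware; 4790 below is only an assumed example):

  flow create 0 ingress pattern eth / ipv4 / udp dst is 4790 / vxlan / ipv4 / end actions queue index 0 / end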

> 
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 2aae988f2..644f26a95 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[]
> = {
> >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* L3 VXLAN. */
> > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* L3 VXLAN. */
> 
> s/L3/For L3/
> 
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_vxlan){
> >  			.vni = "\xff\xff\xff",
> > --
> > 2.13.3
> 
> There is an important question about this support as the firmware needs to
> be configured for it.
> 
> 1. Is such a rule accepted by the kernel modules if the support is not
> enabled in the firmware?

Yes.

> 
> 2. Is it possible from the PMD to query such information?

Unfortunately, no.

> 
> If both answers are no, such features should be enabled through a device
> parameter to let the PMD refuse such unsupported flow requests.
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  2018-04-13 13:37   ` Nélio Laranjeiro
@ 2018-04-13 14:48     ` Xueming(Steven) Li
  2018-04-13 14:55       ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-13 14:48 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 9:37 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-
> UDP
> 
> Some nits,
> 
> On Fri, Apr 13, 2018 at 07:20:20PM +0800, Xueming Li wrote:
> > This patch supports the new tunnel types MPLS-in-GRE and MPLS-in-UDP.
> > Flow pattern examples:
> >   ipv4 proto is 47 / gre proto is 0x8847 / mpls
> >   ipv4 / udp dst is 6635 / mpls / end
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/Makefile    |   5 ++
> >  drivers/net/mlx5/mlx5.c      |  15 +++++
> >  drivers/net/mlx5/mlx5.h      |   1 +
> >  drivers/net/mlx5/mlx5_flow.c | 148
> > ++++++++++++++++++++++++++++++++++++++++++-
> >  4 files changed, 166 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> > index f9a6c460b..33553483e 100644
> > --- a/drivers/net/mlx5/Makefile
> > +++ b/drivers/net/mlx5/Makefile
> > @@ -131,6 +131,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-
> config-h.sh
> >  		enum MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS \
> >  		$(AUTOCONF_OUTPUT)
> >  	$Q sh -- '$<' '$@' \
> > +		HAVE_IBV_DEVICE_MPLS_SUPPORT \
> > +		infiniband/verbs.h \
> > +		enum IBV_FLOW_SPEC_MPLS \
> > +		$(AUTOCONF_OUTPUT)
> > +	$Q sh -- '$<' '$@' \
> >  		HAVE_IBV_WQ_FLAG_RX_END_PADDING \
> >  		infiniband/verbs.h \
> >  		enum IBV_WQ_FLAG_RX_END_PADDING \
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > 38118e524..89b683d6e 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -614,6 +614,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> >  	unsigned int cqe_comp;
> >  	unsigned int tunnel_en = 0;
> >  	unsigned int verb_priorities = 0;
> > +	unsigned int mpls_en = 0;
> >  	int idx;
> >  	int i;
> >  	struct mlx5dv_context attrs_out = {0}; @@ -720,12 +721,25 @@
> > mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN) &&
> >  			     (attrs_out.tunnel_offloads_caps &
> >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE));
> > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > +		mpls_en = ((attrs_out.tunnel_offloads_caps &
> > +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_GRE) &&
> > +			   (attrs_out.tunnel_offloads_caps &
> > +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_UDP) &&
> > +			   (attrs_out.tunnel_offloads_caps &
> > +			  MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CTRL_DW_MPLS));
> > +#endif
> >  	}
> >  	DRV_LOG(DEBUG, "tunnel offloading is %ssupported",
> >  		tunnel_en ? "" : "not ");
> > +	DRV_LOG(DEBUG, "MPLS over GRE/UDP offloading is %ssupported",
> > +		mpls_en ? "" : "not ");
> >  #else
> >  	DRV_LOG(WARNING,
> >  		"tunnel offloading disabled due to old OFED/rdma-core
> version");
> > +	DRV_LOG(WARNING,
> > +		"MPLS over GRE/UDP offloading disabled due to old"
> > +		" OFED/rdma-core version or firmware configuration");
> >  #endif
> >  	if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr)) {
> >  		err = errno;
> > @@ -749,6 +763,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> >  			.cqe_comp = cqe_comp,
> >  			.mps = mps,
> >  			.tunnel_en = tunnel_en,
> > +			.mpls_en = mpls_en,
> >  			.tx_vec_en = 1,
> >  			.rx_vec_en = 1,
> >  			.mpw_hdr_dseg = 0,
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > 6e4613fe0..efbcb2156 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -81,6 +81,7 @@ struct mlx5_dev_config {
> >  	unsigned int vf:1; /* This is a VF. */
> >  	unsigned int mps:2; /* Multi-packet send supported mode. */
> >  	unsigned int tunnel_en:1;
> > +	unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
> >  	/* Whether tunnel stateless offloads are supported. */
> >  	unsigned int flow_counter_en:1; /* Whether flow counter is supported.
> */
> >  	unsigned int cqe_comp:1; /* CQE compression is enabled. */ diff
> > --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > index 0fccd39b3..98edf1882 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -100,6 +100,11 @@ mlx5_flow_create_gre(const struct rte_flow_item
> *item,
> >  		       const void *default_mask,
> >  		       struct mlx5_flow_data *data);
> >
> > +static int
> > +mlx5_flow_create_mpls(const struct rte_flow_item *item,
> > +		      const void *default_mask,
> > +		      struct mlx5_flow_data *data);
> > +
> >  struct mlx5_flow_parse;
> >
> >  static void
> > @@ -247,12 +252,14 @@ struct rte_flow {  #define IS_TUNNEL(type) ( \
> >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
> > +	(type) == RTE_FLOW_ITEM_TYPE_MPLS || \
> >  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> >
> >  const uint32_t flow_ptype[] = {
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
> >  	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> > +	[RTE_FLOW_ITEM_TYPE_MPLS] = RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> >  };
> >
> >  #define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12) @@ -263,6
> > +270,10 @@ const uint32_t ptype_ext[] = {
> >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	=
> RTE_PTYPE_TUNNEL_VXLAN_GPE |
> >  						  RTE_PTYPE_L4_UDP,
> >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)] =
> > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)] =
> > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE | RTE_PTYPE_L4_UDP,
> >  };
> >
> >  /** Structure to generate a simple graph of layers supported by the
> > NIC. */ @@ -399,7 +410,8 @@ static const struct mlx5_flow_items
> mlx5_flow_items[] = {
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_UDP] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
> > -			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
> > +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
> > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_udp){
> >  			.hdr = {
> > @@ -428,7 +440,8 @@ static const struct mlx5_flow_items mlx5_flow_items[]
> = {
> >  	[RTE_FLOW_ITEM_TYPE_GRE] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> >  			       RTE_FLOW_ITEM_TYPE_IPV4,
> > -			       RTE_FLOW_ITEM_TYPE_IPV6),
> > +			       RTE_FLOW_ITEM_TYPE_IPV6,
> > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_gre){
> >  			.protocol = -1,
> > @@ -436,7 +449,11 @@ static const struct mlx5_flow_items
> mlx5_flow_items[] = {
> >  		.default_mask = &rte_flow_item_gre_mask,
> >  		.mask_sz = sizeof(struct rte_flow_item_gre),
> >  		.convert = mlx5_flow_create_gre,
> > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > +		.dst_sz = sizeof(struct ibv_flow_spec_gre),
> > +#else
> >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > +#endif
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH, @@ -464,6 +481,21 @@
> static
> > const struct mlx5_flow_items mlx5_flow_items[] = {
> >  		.convert = mlx5_flow_create_vxlan_gpe,
> >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> >  	},
> > +	[RTE_FLOW_ITEM_TYPE_MPLS] = {
> > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > +		.actions = valid_actions,
> > +		.mask = &(const struct rte_flow_item_mpls){
> > +			.label_tc_s = "\xff\xff\xf0",
> > +		},
> > +		.default_mask = &rte_flow_item_mpls_mask,
> > +		.mask_sz = sizeof(struct rte_flow_item_mpls),
> > +		.convert = mlx5_flow_create_mpls,
> > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > +		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
> > +#endif
> > +	},
> 
> Why the whole item is not under ifdef?

If the macro is applied to the whole item, there will be a null pointer when
creating an MPLS flow.
There is a macro in function mlx5_flow_create_mpls() to avoid using this
invalid data.
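
To illustrate the pattern (a minimal sketch with simplified stand-in
types, not the real driver code): the table entry stays unconditional and
the conversion callback rejects the item at runtime when the macro is not
defined.

 #include <stdio.h>

 struct item; /* opaque stand-in for struct rte_flow_item */

 static int
 create_mpls(const struct item *it)
 {
 #ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
 	/* Built without MPLS support: refuse the item at runtime. */
 	(void)it;
 	fprintf(stderr, "MPLS not supported by driver\n");
 	return -1;
 #else
 	/* ... build the Verbs MPLS spec from *it ... */
 	(void)it;
 	return 0;
 #endif
 }

 int
 main(void)
 {
 	return create_mpls(NULL) ? 1 : 0;
 }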

> 
> >  };
> >
> >  /** Structure to pass to the conversion function. */ @@ -912,7 +944,9
> > @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> >  		if (ret)
> >  			goto exit_item_not_supported;
> >  		if (IS_TUNNEL(items->type)) {
> > -			if (parser->tunnel) {
> > +			if (parser->tunnel &&
> > +			   !(parser->tunnel == RTE_PTYPE_TUNNEL_GRE &&
> > +			     items->type == RTE_FLOW_ITEM_TYPE_MPLS)) {
> >  				rte_flow_error_set(error, ENOTSUP,
> >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> >  						   items,
> > @@ -920,6 +954,16 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev
> *dev,
> >  						   " tunnel encapsulations.");
> >  				return -rte_errno;
> >  			}
> > +			if (items->type == RTE_FLOW_ITEM_TYPE_MPLS &&
> > +			    !priv->config.mpls_en) {
> > +				rte_flow_error_set(error, ENOTSUP,
> > +						   RTE_FLOW_ERROR_TYPE_ITEM,
> > +						   items,
> > +						   "MPLS not supported or"
> > +						   " disabled in firmware"
> > +						   " configuration.");
> > +				return -rte_errno;
> > +			}
> >  			if (!priv->config.tunnel_en &&
> >  			    parser->rss_conf.level) {
> >  				rte_flow_error_set(error, ENOTSUP, @@ -1880,6
> +1924,80 @@
> > mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,  }
> >
> >  /**
> > + * Convert MPLS item to Verbs specification.
> > + * Tunnel types currently supported are MPLS-in-GRE and MPLS-in-UDP.
> > + *
> > + * @param item[in]
> > + *   Item specification.
> > + * @param default_mask[in]
> > + *   Default bit-masks to use when item->mask is not provided.
> > + * @param data[in, out]
> > + *   User structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is
> set.
> > + */
> > +static int
> > +mlx5_flow_create_mpls(const struct rte_flow_item *item __rte_unused,
> > +		      const void *default_mask __rte_unused,
> > +		      struct mlx5_flow_data *data __rte_unused)
> > +{
> > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > +	return rte_flow_error_set(data->error, EINVAL,
> 
> ENOTSUP is more accurate, to keep consistency among the errors.
> 
> > +				  RTE_FLOW_ERROR_TYPE_ITEM,
> > +				  item,
> > +				  "MPLS not supported by driver");
> > +#else
> > +	unsigned int i;
> > +	const struct rte_flow_item_mpls *spec = item->spec;
> > +	const struct rte_flow_item_mpls *mask = item->mask;
> > +	struct mlx5_flow_parse *parser = data->parser;
> > +	unsigned int size = sizeof(struct ibv_flow_spec_mpls);
> > +	struct ibv_flow_spec_mpls mpls = {
> > +		.type = IBV_FLOW_SPEC_MPLS,
> > +		.size = size,
> > +	};
> > +	union tag {
> > +		uint32_t tag;
> > +		uint8_t label[4];
> > +	} id;
> > +
> > +	id.tag = 0;
> > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > +	if (parser->layer == HASH_RXQ_UDPV4 ||
> > +	    parser->layer == HASH_RXQ_UDPV6) {
> > +		parser->tunnel =
> > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)];
> > +		parser->out_layer = parser->layer;
> > +	} else {
> > +		parser->tunnel =
> > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)];
> > +	}
> > +	parser->layer = HASH_RXQ_TUNNEL;
> > +	if (spec) {
> > +		if (!mask)
> > +			mask = default_mask;
> > +		memcpy(&id.label[1], spec->label_tc_s, 3);
> > +		id.label[0] = spec->ttl;
> > +		mpls.val.tag = id.tag;
> > +		memcpy(&id.label[1], mask->label_tc_s, 3);
> > +		id.label[0] = mask->ttl;
> > +		mpls.mask.tag = id.tag;
> > +		/* Remove unwanted bits from values. */
> > +		mpls.val.tag &= mpls.mask.tag;
> > +	}
> > +	mlx5_flow_create_copy(parser, &mpls, size);
> > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > +		if (!parser->queue[i].ibv_attr)
> > +			continue;
> > +		parser->queue[i].ibv_attr->flags |=
> > +			IBV_FLOW_ATTR_FLAGS_ORDERED_SPEC_LIST;
> > +	}
> > +	return 0;
> > +#endif
> > +}
> > +
> > +/**
> >   * Convert GRE item to Verbs specification.
> >   *
> >   * @param item[in]
> > @@ -1898,16 +2016,40 @@ mlx5_flow_create_gre(const struct rte_flow_item
> *item __rte_unused,
> >  		     struct mlx5_flow_data *data)
> >  {
> >  	struct mlx5_flow_parse *parser = data->parser;
> > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> >  	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> >  	struct ibv_flow_spec_tunnel tunnel = {
> >  		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> >  		.size = size,
> >  	};
> > +#else
> > +	const struct rte_flow_item_gre *spec = item->spec;
> > +	const struct rte_flow_item_gre *mask = item->mask;
> > +	unsigned int size = sizeof(struct ibv_flow_spec_gre);
> > +	struct ibv_flow_spec_gre tunnel = {
> > +		.type = parser->inner | IBV_FLOW_SPEC_GRE,
> > +		.size = size,
> > +	};
> > +#endif
> >
> >  	parser->inner = IBV_FLOW_SPEC_INNER;
> >  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> >  	parser->out_layer = parser->layer;
> >  	parser->layer = HASH_RXQ_TUNNEL;
> > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > +	if (spec) {
> > +		if (!mask)
> > +			mask = default_mask;
> > +		tunnel.val.c_ks_res0_ver = spec->c_rsvd0_ver;
> > +		tunnel.val.protocol = spec->protocol;
> > +		tunnel.mask.c_ks_res0_ver = mask->c_rsvd0_ver;
> > +		tunnel.mask.protocol = mask->protocol;
> > +		/* Remove unwanted bits from values. */
> > +		tunnel.val.c_ks_res0_ver &= tunnel.mask.c_ks_res0_ver;
> > +		tunnel.val.protocol &= tunnel.mask.protocol;
> > +		tunnel.val.key &= tunnel.mask.key;
> > +	}
> > +#endif
> >  	mlx5_flow_create_copy(parser, &tunnel, size);
> >  	return 0;
> >  }
> > --
> > 2.13.3
> >
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  2018-04-13 14:48     ` Xueming(Steven) Li
@ 2018-04-13 14:55       ` Nélio Laranjeiro
  2018-04-13 15:22         ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-13 14:55 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 02:48:17PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Friday, April 13, 2018 9:37 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-
> > UDP
> > 
> > Some nits,
> > 
> > On Fri, Apr 13, 2018 at 07:20:20PM +0800, Xueming Li wrote:
> > > This patch supports the new tunnel types MPLS-in-GRE and MPLS-in-UDP.
> > > Flow pattern examples:
> > >   ipv4 proto is 47 / gre proto is 0x8847 / mpls
> > >   ipv4 / udp dst is 6635 / mpls / end
> > >
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > >  drivers/net/mlx5/Makefile    |   5 ++
> > >  drivers/net/mlx5/mlx5.c      |  15 +++++
> > >  drivers/net/mlx5/mlx5.h      |   1 +
> > >  drivers/net/mlx5/mlx5_flow.c | 148
> > > ++++++++++++++++++++++++++++++++++++++++++-
> > >  4 files changed, 166 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> > > index f9a6c460b..33553483e 100644
> > > --- a/drivers/net/mlx5/Makefile
> > > +++ b/drivers/net/mlx5/Makefile
> > > @@ -131,6 +131,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-
> > config-h.sh
> > >  		enum MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS \
> > >  		$(AUTOCONF_OUTPUT)
> > >  	$Q sh -- '$<' '$@' \
> > > +		HAVE_IBV_DEVICE_MPLS_SUPPORT \
> > > +		infiniband/verbs.h \
> > > +		enum IBV_FLOW_SPEC_MPLS \
> > > +		$(AUTOCONF_OUTPUT)
> > > +	$Q sh -- '$<' '$@' \
> > >  		HAVE_IBV_WQ_FLAG_RX_END_PADDING \
> > >  		infiniband/verbs.h \
> > >  		enum IBV_WQ_FLAG_RX_END_PADDING \
> > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > > 38118e524..89b683d6e 100644
> > > --- a/drivers/net/mlx5/mlx5.c
> > > +++ b/drivers/net/mlx5/mlx5.c
> > > @@ -614,6 +614,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> > __rte_unused,
> > >  	unsigned int cqe_comp;
> > >  	unsigned int tunnel_en = 0;
> > >  	unsigned int verb_priorities = 0;
> > > +	unsigned int mpls_en = 0;
> > >  	int idx;
> > >  	int i;
> > >  	struct mlx5dv_context attrs_out = {0}; @@ -720,12 +721,25 @@
> > > mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> > >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN) &&
> > >  			     (attrs_out.tunnel_offloads_caps &
> > >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE));
> > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > +		mpls_en = ((attrs_out.tunnel_offloads_caps &
> > > +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_GRE) &&
> > > +			   (attrs_out.tunnel_offloads_caps &
> > > +			    MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_UDP) &&
> > > +			   (attrs_out.tunnel_offloads_caps &
> > > +			  MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CTRL_DW_MPLS));
> > > +#endif
> > >  	}
> > >  	DRV_LOG(DEBUG, "tunnel offloading is %ssupported",
> > >  		tunnel_en ? "" : "not ");
> > > +	DRV_LOG(DEBUG, "MPLS over GRE/UDP offloading is %ssupported",
> > > +		mpls_en ? "" : "not ");
> > >  #else
> > >  	DRV_LOG(WARNING,
> > >  		"tunnel offloading disabled due to old OFED/rdma-core
> > version");
> > > +	DRV_LOG(WARNING,
> > > +		"MPLS over GRE/UDP offloading disabled due to old"
> > > +		" OFED/rdma-core version or firmware configuration");
> > >  #endif
> > >  	if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr)) {
> > >  		err = errno;
> > > @@ -749,6 +763,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> > __rte_unused,
> > >  			.cqe_comp = cqe_comp,
> > >  			.mps = mps,
> > >  			.tunnel_en = tunnel_en,
> > > +			.mpls_en = mpls_en,
> > >  			.tx_vec_en = 1,
> > >  			.rx_vec_en = 1,
> > >  			.mpw_hdr_dseg = 0,
> > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > > 6e4613fe0..efbcb2156 100644
> > > --- a/drivers/net/mlx5/mlx5.h
> > > +++ b/drivers/net/mlx5/mlx5.h
> > > @@ -81,6 +81,7 @@ struct mlx5_dev_config {
> > >  	unsigned int vf:1; /* This is a VF. */
> > >  	unsigned int mps:2; /* Multi-packet send supported mode. */
> > >  	unsigned int tunnel_en:1;
> > > +	unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
> > >  	/* Whether tunnel stateless offloads are supported. */
> > >  	unsigned int flow_counter_en:1; /* Whether flow counter is supported.
> > */
> > >  	unsigned int cqe_comp:1; /* CQE compression is enabled. */ diff
> > > --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > > index 0fccd39b3..98edf1882 100644
> > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > @@ -100,6 +100,11 @@ mlx5_flow_create_gre(const struct rte_flow_item
> > *item,
> > >  		       const void *default_mask,
> > >  		       struct mlx5_flow_data *data);
> > >
> > > +static int
> > > +mlx5_flow_create_mpls(const struct rte_flow_item *item,
> > > +		      const void *default_mask,
> > > +		      struct mlx5_flow_data *data);
> > > +
> > >  struct mlx5_flow_parse;
> > >
> > >  static void
> > > @@ -247,12 +252,14 @@ struct rte_flow {  #define IS_TUNNEL(type) ( \
> > >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
> > > +	(type) == RTE_FLOW_ITEM_TYPE_MPLS || \
> > >  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > >
> > >  const uint32_t flow_ptype[] = {
> > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> > >  	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
> > >  	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> > > +	[RTE_FLOW_ITEM_TYPE_MPLS] = RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> > >  };
> > >
> > >  #define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12) @@ -263,6
> > > +270,10 @@ const uint32_t ptype_ext[] = {
> > >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	=
> > RTE_PTYPE_TUNNEL_VXLAN_GPE |
> > >  						  RTE_PTYPE_L4_UDP,
> > >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> > > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)] =
> > > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> > > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)] =
> > > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE | RTE_PTYPE_L4_UDP,
> > >  };
> > >
> > >  /** Structure to generate a simple graph of layers supported by the
> > > NIC. */ @@ -399,7 +410,8 @@ static const struct mlx5_flow_items
> > mlx5_flow_items[] = {
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_UDP] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
> > > -			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
> > > +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
> > > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> > >  		.actions = valid_actions,
> > >  		.mask = &(const struct rte_flow_item_udp){
> > >  			.hdr = {
> > > @@ -428,7 +440,8 @@ static const struct mlx5_flow_items mlx5_flow_items[]
> > = {
> > >  	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > >  			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > -			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > +			       RTE_FLOW_ITEM_TYPE_IPV6,
> > > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> > >  		.actions = valid_actions,
> > >  		.mask = &(const struct rte_flow_item_gre){
> > >  			.protocol = -1,
> > > @@ -436,7 +449,11 @@ static const struct mlx5_flow_items
> > mlx5_flow_items[] = {
> > >  		.default_mask = &rte_flow_item_gre_mask,
> > >  		.mask_sz = sizeof(struct rte_flow_item_gre),
> > >  		.convert = mlx5_flow_create_gre,
> > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > +		.dst_sz = sizeof(struct ibv_flow_spec_gre),
> > > +#else
> > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > +#endif
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH, @@ -464,6 +481,21 @@
> > static
> > > const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  		.convert = mlx5_flow_create_vxlan_gpe,
> > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > >  	},
> > > +	[RTE_FLOW_ITEM_TYPE_MPLS] = {
> > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > +		.actions = valid_actions,
> > > +		.mask = &(const struct rte_flow_item_mpls){
> > > +			.label_tc_s = "\xff\xff\xf0",
> > > +		},
> > > +		.default_mask = &rte_flow_item_mpls_mask,
> > > +		.mask_sz = sizeof(struct rte_flow_item_mpls),
> > > +		.convert = mlx5_flow_create_mpls,
> > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > +		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
> > > +#endif
> > > +	},
> > 
> > Why the whole item is not under ifdef?
> 
> If the macro is applied to the whole item, there will be a null pointer when
> creating an MPLS flow.
> There is a macro in function mlx5_flow_create_mpls() to avoid using this
> invalid data.

I think there is some kind of confusion here; what I mean is moving the
#ifdef to embrace the whole item, i.e.:

 #ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
 [RTE_FLOW_ITEM_TYPE_MPLS] = {
  .items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
  	       RTE_FLOW_ITEM_TYPE_IPV4,
  	       RTE_FLOW_ITEM_TYPE_IPV6),
  .actions = valid_actions,
  .mask = &(const struct rte_flow_item_mpls){
  	.label_tc_s = "\xff\xff\xf0",
  },
  .default_mask = &rte_flow_item_mpls_mask,
  .mask_sz = sizeof(struct rte_flow_item_mpls),
  .convert = mlx5_flow_create_mpls,
  .dst_sz = sizeof(struct ibv_flow_spec_mpls),
 },
 #endif

Not having this item in this static array ends up not supporting it at
all; this is what I mean.

> > >  };
> > >
> > >  /** Structure to pass to the conversion function. */ @@ -912,7 +944,9
> > > @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> > >  		if (ret)
> > >  			goto exit_item_not_supported;
> > >  		if (IS_TUNNEL(items->type)) {
> > > -			if (parser->tunnel) {
> > > +			if (parser->tunnel &&
> > > +			   !(parser->tunnel == RTE_PTYPE_TUNNEL_GRE &&
> > > +			     items->type == RTE_FLOW_ITEM_TYPE_MPLS)) {
> > >  				rte_flow_error_set(error, ENOTSUP,
> > >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> > >  						   items,
> > > @@ -920,6 +954,16 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev
> > *dev,
> > >  						   " tunnel encapsulations.");
> > >  				return -rte_errno;
> > >  			}
> > > +			if (items->type == RTE_FLOW_ITEM_TYPE_MPLS &&
> > > +			    !priv->config.mpls_en) {
> > > +				rte_flow_error_set(error, ENOTSUP,
> > > +						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > +						   items,
> > > +						   "MPLS not supported or"
> > > +						   " disabled in firmware"
> > > +						   " configuration.");
> > > +				return -rte_errno;
> > > +			}
> > >  			if (!priv->config.tunnel_en &&
> > >  			    parser->rss_conf.level) {
> > >  				rte_flow_error_set(error, ENOTSUP, @@ -1880,6
> > +1924,80 @@
> > > mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,  }
> > >
> > >  /**
> > > + * Convert MPLS item to Verbs specification.
> > > + * Tunnel types currently supported are MPLS-in-GRE and MPLS-in-UDP.
> > > + *
> > > + * @param item[in]
> > > + *   Item specification.
> > > + * @param default_mask[in]
> > > + *   Default bit-masks to use when item->mask is not provided.
> > > + * @param data[in, out]
> > > + *   User structure.
> > > + *
> > > + * @return
> > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > set.
> > > + */
> > > +static int
> > > +mlx5_flow_create_mpls(const struct rte_flow_item *item __rte_unused,
> > > +		      const void *default_mask __rte_unused,
> > > +		      struct mlx5_flow_data *data __rte_unused)
> > > +{
> > > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > +	return rte_flow_error_set(data->error, EINVAL,
> > 
> > ENOTSUP is more accurate, to keep consistency among the errors.
> > 
> > > +				  RTE_FLOW_ERROR_TYPE_ITEM,
> > > +				  item,
> > > +				  "MPLS not supported by driver");
> > > +#else
> > > +	unsigned int i;
> > > +	const struct rte_flow_item_mpls *spec = item->spec;
> > > +	const struct rte_flow_item_mpls *mask = item->mask;
> > > +	struct mlx5_flow_parse *parser = data->parser;
> > > +	unsigned int size = sizeof(struct ibv_flow_spec_mpls);
> > > +	struct ibv_flow_spec_mpls mpls = {
> > > +		.type = IBV_FLOW_SPEC_MPLS,
> > > +		.size = size,
> > > +	};
> > > +	union tag {
> > > +		uint32_t tag;
> > > +		uint8_t label[4];
> > > +	} id;
> > > +
> > > +	id.tag = 0;
> > > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > > +	if (parser->layer == HASH_RXQ_UDPV4 ||
> > > +	    parser->layer == HASH_RXQ_UDPV6) {
> > > +		parser->tunnel =
> > > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)];
> > > +		parser->out_layer = parser->layer;
> > > +	} else {
> > > +		parser->tunnel =
> > > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)];
> > > +	}
> > > +	parser->layer = HASH_RXQ_TUNNEL;
> > > +	if (spec) {
> > > +		if (!mask)
> > > +			mask = default_mask;
> > > +		memcpy(&id.label[1], spec->label_tc_s, 3);
> > > +		id.label[0] = spec->ttl;
> > > +		mpls.val.tag = id.tag;
> > > +		memcpy(&id.label[1], mask->label_tc_s, 3);
> > > +		id.label[0] = mask->ttl;
> > > +		mpls.mask.tag = id.tag;
> > > +		/* Remove unwanted bits from values. */
> > > +		mpls.val.tag &= mpls.mask.tag;
> > > +	}
> > > +	mlx5_flow_create_copy(parser, &mpls, size);
> > > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > +		if (!parser->queue[i].ibv_attr)
> > > +			continue;
> > > +		parser->queue[i].ibv_attr->flags |=
> > > +			IBV_FLOW_ATTR_FLAGS_ORDERED_SPEC_LIST;
> > > +	}
> > > +	return 0;
> > > +#endif
> > > +}
> > > +
> > > +/**
> > >   * Convert GRE item to Verbs specification.
> > >   *
> > >   * @param item[in]
> > > @@ -1898,16 +2016,40 @@ mlx5_flow_create_gre(const struct rte_flow_item
> > *item __rte_unused,
> > >  		     struct mlx5_flow_data *data)
> > >  {
> > >  	struct mlx5_flow_parse *parser = data->parser;
> > > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > >  	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > >  	struct ibv_flow_spec_tunnel tunnel = {
> > >  		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > >  		.size = size,
> > >  	};
> > > +#else
> > > +	const struct rte_flow_item_gre *spec = item->spec;
> > > +	const struct rte_flow_item_gre *mask = item->mask;
> > > +	unsigned int size = sizeof(struct ibv_flow_spec_gre);
> > > +	struct ibv_flow_spec_gre tunnel = {
> > > +		.type = parser->inner | IBV_FLOW_SPEC_GRE,
> > > +		.size = size,
> > > +	};
> > > +#endif
> > >
> > >  	parser->inner = IBV_FLOW_SPEC_INNER;
> > >  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> > >  	parser->out_layer = parser->layer;
> > >  	parser->layer = HASH_RXQ_TUNNEL;
> > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > +	if (spec) {
> > > +		if (!mask)
> > > +			mask = default_mask;
> > > +		tunnel.val.c_ks_res0_ver = spec->c_rsvd0_ver;
> > > +		tunnel.val.protocol = spec->protocol;
> > > +		tunnel.mask.c_ks_res0_ver = mask->c_rsvd0_ver;
> > > +		tunnel.mask.protocol = mask->protocol;
> > > +		/* Remove unwanted bits from values. */
> > > +		tunnel.val.c_ks_res0_ver &= tunnel.mask.c_ks_res0_ver;
> > > +		tunnel.val.protocol &= tunnel.mask.protocol;
> > > +		tunnel.val.key &= tunnel.mask.key;
> > > +	}
> > > +#endif
> > >  	mlx5_flow_create_copy(parser, &tunnel, size);
> > >  	return 0;
> > >  }
> > > --
> > > 2.13.3
> > >
> > 
> > Thanks,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  2018-04-13 14:55       ` Nélio Laranjeiro
@ 2018-04-13 15:22         ` Xueming(Steven) Li
  2018-04-16  8:14           ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-13 15:22 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 10:56 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-
> UDP
> 
> On Fri, Apr 13, 2018 at 02:48:17PM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Friday, April 13, 2018 9:37 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and
> > > MPLS-in- UDP
> > >
> > > Some nits,
> > >
> > > On Fri, Apr 13, 2018 at 07:20:20PM +0800, Xueming Li wrote:
> > > > This patch supports the new tunnel types MPLS-in-GRE and MPLS-in-UDP.
> > > > Flow pattern examples:
> > > >   ipv4 proto is 47 / gre proto is 0x8847 / mpls
> > > >   ipv4 / udp dst is 6635 / mpls / end
> > > >
> > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > ---
> > > >  drivers/net/mlx5/Makefile    |   5 ++
> > > >  drivers/net/mlx5/mlx5.c      |  15 +++++
> > > >  drivers/net/mlx5/mlx5.h      |   1 +
> > > >  drivers/net/mlx5/mlx5_flow.c | 148
> > > > ++++++++++++++++++++++++++++++++++++++++++-
> > > >  4 files changed, 166 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> > > > index f9a6c460b..33553483e 100644
> > > > --- a/drivers/net/mlx5/Makefile
> > > > +++ b/drivers/net/mlx5/Makefile
> > > > @@ -131,6 +131,11 @@ mlx5_autoconf.h.new:
> > > > $(RTE_SDK)/buildtools/auto-
> > > config-h.sh
> > > >  		enum MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS \
> > > >  		$(AUTOCONF_OUTPUT)
> > > >  	$Q sh -- '$<' '$@' \
> > > > +		HAVE_IBV_DEVICE_MPLS_SUPPORT \
> > > > +		infiniband/verbs.h \
> > > > +		enum IBV_FLOW_SPEC_MPLS \
> > > > +		$(AUTOCONF_OUTPUT)
> > > > +	$Q sh -- '$<' '$@' \
> > > >  		HAVE_IBV_WQ_FLAG_RX_END_PADDING \
> > > >  		infiniband/verbs.h \
> > > >  		enum IBV_WQ_FLAG_RX_END_PADDING \ diff --git
> > > > a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > > > 38118e524..89b683d6e 100644
> > > > --- a/drivers/net/mlx5/mlx5.c
> > > > +++ b/drivers/net/mlx5/mlx5.c
> > > > @@ -614,6 +614,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> > > __rte_unused,
> > > >  	unsigned int cqe_comp;
> > > >  	unsigned int tunnel_en = 0;
> > > >  	unsigned int verb_priorities = 0;
> > > > +	unsigned int mpls_en = 0;
> > > >  	int idx;
> > > >  	int i;
> > > >  	struct mlx5dv_context attrs_out = {0}; @@ -720,12 +721,25 @@
> > > > mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> > > >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN)
> &&
> > > >  			     (attrs_out.tunnel_offloads_caps &
> > > >  			      MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE));
> > > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > +		mpls_en = ((attrs_out.tunnel_offloads_caps &
> > > > +
> MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_GRE) &&
> > > > +			   (attrs_out.tunnel_offloads_caps &
> > > > +
> MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_MPLS_UDP) &&
> > > > +			   (attrs_out.tunnel_offloads_caps &
> > > > +
> MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CTRL_DW_MPLS));
> > > > +#endif
> > > >  	}
> > > >  	DRV_LOG(DEBUG, "tunnel offloading is %ssupported",
> > > >  		tunnel_en ? "" : "not ");
> > > > +	DRV_LOG(DEBUG, "MPLS over GRE/UDP offloading is %ssupported",
> > > > +		mpls_en ? "" : "not ");
> > > >  #else
> > > >  	DRV_LOG(WARNING,
> > > >  		"tunnel offloading disabled due to old OFED/rdma-core
> > > version");
> > > > +	DRV_LOG(WARNING,
> > > > +		"MPLS over GRE/UDP offloading disabled due to old"
> > > > +		" OFED/rdma-core version or firmware configuration");
> > > >  #endif
> > > >  	if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr))
> {
> > > >  		err = errno;
> > > > @@ -749,6 +763,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv
> > > __rte_unused,
> > > >  			.cqe_comp = cqe_comp,
> > > >  			.mps = mps,
> > > >  			.tunnel_en = tunnel_en,
> > > > +			.mpls_en = mpls_en,
> > > >  			.tx_vec_en = 1,
> > > >  			.rx_vec_en = 1,
> > > >  			.mpw_hdr_dseg = 0,
> > > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> > > > index
> > > > 6e4613fe0..efbcb2156 100644
> > > > --- a/drivers/net/mlx5/mlx5.h
> > > > +++ b/drivers/net/mlx5/mlx5.h
> > > > @@ -81,6 +81,7 @@ struct mlx5_dev_config {
> > > >  	unsigned int vf:1; /* This is a VF. */
> > > >  	unsigned int mps:2; /* Multi-packet send supported mode. */
> > > >  	unsigned int tunnel_en:1;
> > > > +	unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
> > > >  	/* Whether tunnel stateless offloads are supported. */
> > > >  	unsigned int flow_counter_en:1; /* Whether flow counter is
> supported.
> > > */
> > > >  	unsigned int cqe_comp:1; /* CQE compression is enabled. */
> diff
> > > > --git a/drivers/net/mlx5/mlx5_flow.c
> > > > b/drivers/net/mlx5/mlx5_flow.c index 0fccd39b3..98edf1882 100644
> > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > @@ -100,6 +100,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item,
> > > >  		       const void *default_mask,
> > > >  		       struct mlx5_flow_data *data);
> > > >
> > > > +static int
> > > > +mlx5_flow_create_mpls(const struct rte_flow_item *item,
> > > > +		      const void *default_mask,
> > > > +		      struct mlx5_flow_data *data);
> > > > +
> > > >  struct mlx5_flow_parse;
> > > >
> > > >  static void
> > > > @@ -247,12 +252,14 @@ struct rte_flow {
> > > >  #define IS_TUNNEL(type) ( \
> > > >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > > >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
> > > > +	(type) == RTE_FLOW_ITEM_TYPE_MPLS || \
> > > >  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > > >
> > > >  const uint32_t flow_ptype[] = {
> > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
> > > >  	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> > > > +	[RTE_FLOW_ITEM_TYPE_MPLS] = RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> > > >  };
> > > >
> > > >  #define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
> > > > @@ -263,6 +270,10 @@ const uint32_t ptype_ext[] = {
> > > >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	=
> > > RTE_PTYPE_TUNNEL_VXLAN_GPE |
> > > >  						  RTE_PTYPE_L4_UDP,
> > > >  	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> > > > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)] =
> > > > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE,
> > > > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)] =
> > > > +		RTE_PTYPE_TUNNEL_MPLS_IN_GRE | RTE_PTYPE_L4_UDP,
> > > >  };
> > > >
> > > >  /** Structure to generate a simple graph of layers supported by the NIC. */
> > > > @@ -399,7 +410,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_UDP] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
> > > > -			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
> > > > +			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
> > > > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> > > >  		.actions = valid_actions,
> > > >  		.mask = &(const struct rte_flow_item_udp){
> > > >  			.hdr = {
> > > > @@ -428,7 +440,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > >  			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > > -			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV6,
> > > > +			       RTE_FLOW_ITEM_TYPE_MPLS),
> > > >  		.actions = valid_actions,
> > > >  		.mask = &(const struct rte_flow_item_gre){
> > > >  			.protocol = -1,
> > > > @@ -436,7 +449,11 @@ static const struct mlx5_flow_items
> > > mlx5_flow_items[] = {
> > > >  		.default_mask = &rte_flow_item_gre_mask,
> > > >  		.mask_sz = sizeof(struct rte_flow_item_gre),
> > > >  		.convert = mlx5_flow_create_gre,
> > > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > +		.dst_sz = sizeof(struct ibv_flow_spec_gre),
> > > > +#else
> > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > +#endif
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > @@ -464,6 +481,21 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  		.convert = mlx5_flow_create_vxlan_gpe,
> > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > >  	},
> > > > +	[RTE_FLOW_ITEM_TYPE_MPLS] = {
> > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > > +		.actions = valid_actions,
> > > > +		.mask = &(const struct rte_flow_item_mpls){
> > > > +			.label_tc_s = "\xff\xff\xf0",
> > > > +		},
> > > > +		.default_mask = &rte_flow_item_mpls_mask,
> > > > +		.mask_sz = sizeof(struct rte_flow_item_mpls),
> > > > +		.convert = mlx5_flow_create_mpls,
> > > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > +		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
> > > > +#endif
> > > > +	},
> > >
> > > Why is the whole item not under the #ifdef?
> >
> > If the macro wraps the whole item, creating an MPLS flow ends up
> > dereferencing a NULL pointer. A macro inside mlx5_flow_create_mpls()
> > already keeps this invalid data from being used.
> 
> I think there is some confusion here; what I mean is moving the
> #ifdef so it embraces the whole item, i.e.:
> 
>  #ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
>  [RTE_FLOW_ITEM_TYPE_MPLS] = {
>   .items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
>   	       RTE_FLOW_ITEM_TYPE_IPV4,
>   	       RTE_FLOW_ITEM_TYPE_IPV6),
>   .actions = valid_actions,
>   .mask = &(const struct rte_flow_item_mpls){
>   	.label_tc_s = "\xff\xff\xf0",
>   },
>   .default_mask = &rte_flow_item_mpls_mask,
>   .mask_sz = sizeof(struct rte_flow_item_mpls),
>   .convert = mlx5_flow_create_mpls,
>   .dst_sz = sizeof(struct ibv_flow_spec_mpls),
>  },
>  #endif
> 
> Not having this item in this static array means it is simply not
> supported; that is my point.

Yes, I know. There is code using this array without a NULL check:
		cur_item = &mlx5_flow_items[items->type];
		ret = cur_item->convert(items,
					(cur_item->default_mask ?
					 cur_item->default_mask :
					 cur_item->mask),
					 &data);
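
For illustration only, wrapping the whole item in the #ifdef would then
need a guard like this before the call (hypothetical sketch, not part of
the patch):

		cur_item = &mlx5_flow_items[items->type];
		if (!cur_item->convert)
			return rte_flow_error_set(error, ENOTSUP,
						  RTE_FLOW_ERROR_TYPE_ITEM,
						  items,
						  "item not supported");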


> 
> > > >  };
> > > >
> > > >  /** Structure to pass to the conversion function. */
> > > > @@ -912,7 +944,9 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> > > >  		if (ret)
> > > >  			goto exit_item_not_supported;
> > > >  		if (IS_TUNNEL(items->type)) {
> > > > -			if (parser->tunnel) {
> > > > +			if (parser->tunnel &&
> > > > +			   !(parser->tunnel == RTE_PTYPE_TUNNEL_GRE &&
> > > > +			     items->type == RTE_FLOW_ITEM_TYPE_MPLS)) {
> > > >  				rte_flow_error_set(error, ENOTSUP,
> > > >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > >  						   items,
> > > > @@ -920,6 +954,16 @@ mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> > > >  						   " tunnel encapsulations.");
> > > >  				return -rte_errno;
> > > >  			}
> > > > +			if (items->type == RTE_FLOW_ITEM_TYPE_MPLS &&
> > > > +			    !priv->config.mpls_en) {
> > > > +				rte_flow_error_set(error, ENOTSUP,
> > > > +						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > > +						   items,
> > > > +						   "MPLS not supported or"
> > > > +						   " disabled in firmware"
> > > > +						   " configuration.");
> > > > +				return -rte_errno;
> > > > +			}
> > > >  			if (!priv->config.tunnel_en &&
> > > >  			    parser->rss_conf.level) {
> > > >  				rte_flow_error_set(error, ENOTSUP,
> > > > @@ -1880,6 +1924,80 @@ mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
> > > >  }
> > > >
> > > >  /**
> > > > + * Convert MPLS item to Verbs specification.
> > > > + * Tunnel types currently supported are MPLS-in-GRE and MPLS-in-UDP.
> > > > + *
> > > > + * @param item[in]
> > > > + *   Item specification.
> > > > + * @param default_mask[in]
> > > > + *   Default bit-masks to use when item->mask is not provided.
> > > > + * @param data[in, out]
> > > > + *   User structure.
> > > > + *
> > > > + * @return
> > > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > > + */
> > > > +static int
> > > > +mlx5_flow_create_mpls(const struct rte_flow_item *item __rte_unused,
> > > > +		      const void *default_mask __rte_unused,
> > > > +		      struct mlx5_flow_data *data __rte_unused)
> > > > +{
> > > > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > +	return rte_flow_error_set(data->error, EINVAL,
> > >
> > > ENOTSUP is more accurate and keeps consistency among the errors.
> > >
> > > > +				  RTE_FLOW_ERROR_TYPE_ITEM,
> > > > +				  item,
> > > > +				  "MPLS not supported by driver");
> > > > +#else
> > > > +	unsigned int i;
> > > > +	const struct rte_flow_item_mpls *spec = item->spec;
> > > > +	const struct rte_flow_item_mpls *mask = item->mask;
> > > > +	struct mlx5_flow_parse *parser = data->parser;
> > > > +	unsigned int size = sizeof(struct ibv_flow_spec_mpls);
> > > > +	struct ibv_flow_spec_mpls mpls = {
> > > > +		.type = IBV_FLOW_SPEC_MPLS,
> > > > +		.size = size,
> > > > +	};
> > > > +	union tag {
> > > > +		uint32_t tag;
> > > > +		uint8_t label[4];
> > > > +	} id;
> > > > +
> > > > +	id.tag = 0;
> > > > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > > > +	if (parser->layer == HASH_RXQ_UDPV4 ||
> > > > +	    parser->layer == HASH_RXQ_UDPV6) {
> > > > +		parser->tunnel =
> > > > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_UDP)];
> > > > +		parser->out_layer = parser->layer;
> > > > +	} else {
> > > > +		parser->tunnel =
> > > > +			ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_MPLS_IN_GRE)];
> > > > +	}
> > > > +	parser->layer = HASH_RXQ_TUNNEL;
> > > > +	if (spec) {
> > > > +		if (!mask)
> > > > +			mask = default_mask;
> > > > +		memcpy(&id.label[1], spec->label_tc_s, 3);
> > > > +		id.label[0] = spec->ttl;
> > > > +		mpls.val.tag = id.tag;
> > > > +		memcpy(&id.label[1], mask->label_tc_s, 3);
> > > > +		id.label[0] = mask->ttl;
> > > > +		mpls.mask.tag = id.tag;
> > > > +		/* Remove unwanted bits from values. */
> > > > +		mpls.val.tag &= mpls.mask.tag;
> > > > +	}
> > > > +	mlx5_flow_create_copy(parser, &mpls, size);
> > > > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > > +		if (!parser->queue[i].ibv_attr)
> > > > +			continue;
> > > > +		parser->queue[i].ibv_attr->flags |=
> > > > +			IBV_FLOW_ATTR_FLAGS_ORDERED_SPEC_LIST;
> > > > +	}
> > > > +	return 0;
> > > > +#endif
> > > > +}
> > > > +
> > > > +/**
> > > >   * Convert GRE item to Verbs specification.
> > > >   *
> > > >   * @param item[in]
> > > > @@ -1898,16 +2016,40 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> > > >  		     struct mlx5_flow_data *data)
> > > >  {
> > > >  	struct mlx5_flow_parse *parser = data->parser;
> > > > +#ifndef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > >  	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > > >  	struct ibv_flow_spec_tunnel tunnel = {
> > > >  		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > > >  		.size = size,
> > > >  	};
> > > > +#else
> > > > +	const struct rte_flow_item_gre *spec = item->spec;
> > > > +	const struct rte_flow_item_gre *mask = item->mask;
> > > > +	unsigned int size = sizeof(struct ibv_flow_spec_gre);
> > > > +	struct ibv_flow_spec_gre tunnel = {
> > > > +		.type = parser->inner | IBV_FLOW_SPEC_GRE,
> > > > +		.size = size,
> > > > +	};
> > > > +#endif
> > > >
> > > >  	parser->inner = IBV_FLOW_SPEC_INNER;
> > > >  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> > > >  	parser->out_layer = parser->layer;
> > > >  	parser->layer = HASH_RXQ_TUNNEL;
> > > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > +	if (spec) {
> > > > +		if (!mask)
> > > > +			mask = default_mask;
> > > > +		tunnel.val.c_ks_res0_ver = spec->c_rsvd0_ver;
> > > > +		tunnel.val.protocol = spec->protocol;
> > > > +		tunnel.mask.c_ks_res0_ver = mask->c_rsvd0_ver;
> > > > +		tunnel.mask.protocol = mask->protocol;
> > > > +		/* Remove unwanted bits from values. */
> > > > +		tunnel.val.c_ks_res0_ver &= tunnel.mask.c_ks_res0_ver;
> > > > +		tunnel.val.protocol &= tunnel.mask.protocol;
> > > > +		tunnel.val.key &= tunnel.mask.key;
> > > > +	}
> > > > +#endif
> > > >  	mlx5_flow_create_copy(parser, &tunnel, size);
> > > >  	return 0;
> > > >  }
> > > > --
> > > > 2.13.3
> > > >
> > >
> > > Thanks,
> > >
> > > --
> > > Nélio Laranjeiro
> > > 6WIND
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 07/14] net/mlx5: support tunnel RSS level
  2018-04-13 13:27   ` Nélio Laranjeiro
@ 2018-04-14 10:12     ` Xueming(Steven) Li
  2018-04-16 12:25       ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-14 10:12 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev

Hi Nelio,

> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 9:27 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v3 07/14] net/mlx5: support tunnel RSS level
> 
> It seems you did not read my comments on this patch; I have exactly
> the same ones.

I finally found your previous comments in my trash box; not sure why
Outlook keeps delivering important mail there. Anyway, I updated my rules.

> 
> On Fri, Apr 13, 2018 at 07:20:16PM +0800, Xueming Li wrote:
> > Tunnel RSS level of flow RSS action offers user a choice to do RSS
> > hash calculation on inner or outer RSS fields. Testpmd flow command
> examples:
> >
> > GRE flow inner RSS:
> >   flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
> > actions rss queues 1 2 end level 1 / end
> >
> > GRE tunnel flow outer RSS:
> >   flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
> > actions rss queues 1 2 end level 0 / end
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/Makefile    |   2 +-
> >  drivers/net/mlx5/mlx5_flow.c | 247
> > +++++++++++++++++++++++++++++--------------
> >  drivers/net/mlx5/mlx5_glue.c |  16 +++
> >  drivers/net/mlx5/mlx5_glue.h |   8 ++
> >  drivers/net/mlx5/mlx5_rxq.c  |  56 +++++++++-
> >  drivers/net/mlx5/mlx5_rxtx.h |   5 +-
> >  6 files changed, 248 insertions(+), 86 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> > index ae118ad33..f9a6c460b 100644
> > --- a/drivers/net/mlx5/Makefile
> > +++ b/drivers/net/mlx5/Makefile
> > @@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
> >  LIB = librte_pmd_mlx5.a
> >  LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> >  LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
> > -LIB_GLUE_VERSION = 18.02.0
> > +LIB_GLUE_VERSION = 18.05.0
> >
> >  # Sources.
> >  SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
> > diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > index dd099f328..a22554706 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -116,6 +116,7 @@ enum hash_rxq_type {
> >  	HASH_RXQ_UDPV6,
> >  	HASH_RXQ_IPV6,
> >  	HASH_RXQ_ETH,
> > +	HASH_RXQ_TUNNEL,
> >  };
> >
> >  /* Initialization data for hash RX queue. */
> > @@ -454,6 +455,7 @@ struct mlx5_flow_parse {
> >  	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use.
> */
> >  	uint8_t rss_key[40]; /**< copy of the RSS key. */
> >  	enum hash_rxq_type layer; /**< Last pattern layer detected. */
> > +	enum hash_rxq_type out_layer; /**< Last outer pattern layer
> > +detected. */
> >  	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
> >  	struct ibv_counter_set *cs; /**< Holds the counter set for the rule
> */
> >  	struct {
> > @@ -461,6 +463,7 @@ struct mlx5_flow_parse {
> >  		/**< Pointer to Verbs attributes. */
> >  		unsigned int offset;
> >  		/**< Current position or total size of the attribute. */
> > +		uint64_t hash_fields; /**< Verbs hash fields. */
> >  	} queue[RTE_DIM(hash_rxq_init)];
> >  };
> >
> > @@ -696,7 +699,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
> >  						   " function is Toeplitz");
> >  				return -rte_errno;
> >  			}
> > -			if (rss->level) {
> > +#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> > +			if (parser->rss_conf.level > 0) {
> >  				rte_flow_error_set(error, EINVAL,
> >  						   RTE_FLOW_ERROR_TYPE_ACTION,
> >  						   actions,
> > @@ -704,6 +708,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
> >  						   " level is not supported");
> >  				return -rte_errno;
> >  			}
> > +#endif
> > +			if (parser->rss_conf.level > 1) {
> > +				rte_flow_error_set(error, EINVAL,
> > +						   RTE_FLOW_ERROR_TYPE_ACTION,
> > +						   actions,
> > +						   "RSS encapsulation level"
> > +						   " > 1 is not supported");
> > +				return -rte_errno;
> > +			}
> >  			if (rss->types & MLX5_RSS_HF_MASK) {
> >  				rte_flow_error_set(error, EINVAL,
> >  						   RTE_FLOW_ERROR_TYPE_ACTION,
> 
> Same comment as in previous review.
> The levels do not match the proposed API:
> level 0 = unspecified, 1 = outermost, 2 = next outermost, 3 = next
> next, ...

Thanks, a big miss in the rebase.
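
For the record, with that numbering the check would presumably become
something like (sketch only, assuming level 2 is the deepest the device
can hash on):

			if (parser->rss_conf.level > 2) {
				rte_flow_error_set(error, EINVAL,
						   RTE_FLOW_ERROR_TYPE_ACTION,
						   actions,
						   "RSS encapsulation level"
						   " > 1 is not supported");
				return -rte_errno;
			}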

> 
> > @@ -754,7 +767,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
> >  			}
> >  			parser->rss_conf = (struct rte_flow_action_rss){
> >  				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
> > -				.level = 0,
> > +				.level = rss->level,
> >  				.types = rss->types,
> >  				.key_len = rss_key_len,
> >  				.queue_num = rss->queue_num,
> > @@ -838,10 +851,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
> >   *   0 on success, a negative errno value otherwise and rte_errno is set.
> >   */
> >  static int
> > -mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> > +mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
> > +				 const struct rte_flow_item items[],
> >  				 struct rte_flow_error *error,
> >  				 struct mlx5_flow_parse *parser)
> >  {
> > +	struct priv *priv = dev->data->dev_private;
> >  	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
> >  	unsigned int i;
> >  	int ret = 0;
> > @@ -881,6 +896,14 @@ mlx5_flow_convert_items_validate(const struct
> rte_flow_item items[],
> >  						   " tunnel encapsulations.");
> >  				return -rte_errno;
> >  			}
> > +			if (!priv->config.tunnel_en &&
> > +			    parser->rss_conf.level) {
> > +				rte_flow_error_set(error, ENOTSUP,
> > +					RTE_FLOW_ERROR_TYPE_ITEM,
> > +					items,
> > +					"Tunnel offloading not enabled");
> > +				return -rte_errno;
> > +			}
> >  			parser->inner = IBV_FLOW_SPEC_INNER;
> >  			parser->tunnel = flow_ptype[items->type];
> >  		}
> > @@ -1000,7 +1023,11 @@ static void
> >  mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
> >  {
> >  	unsigned int i;
> > +	uint32_t inner = parser->inner;
> >
> > +	/* Don't create extra flows for outer RSS. */
> > +	if (parser->tunnel && !parser->rss_conf.level)
> > +		return;
> >  	/*
> >  	 * Fill missing layers in verbs specifications, or compute the
> correct
> >  	 * offset to allocate the memory space for the attributes and @@
> > -1011,23 +1038,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse
> *parser)
> >  			struct ibv_flow_spec_ipv4_ext ipv4;
> >  			struct ibv_flow_spec_ipv6 ipv6;
> >  			struct ibv_flow_spec_tcp_udp udp_tcp;
> > +			struct ibv_flow_spec_eth eth;
> >  		} specs;
> >  		void *dst;
> >  		uint16_t size;
> >
> >  		if (i == parser->layer)
> >  			continue;
> > -		if (parser->layer == HASH_RXQ_ETH) {
> > +		if (parser->layer == HASH_RXQ_ETH ||
> > +		    parser->layer == HASH_RXQ_TUNNEL) {
> >  			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
> >  				size = sizeof(struct ibv_flow_spec_ipv4_ext);
> >  				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
> > -					.type = IBV_FLOW_SPEC_IPV4_EXT,
> > +					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
> >  					.size = size,
> >  				};
> >  			} else {
> >  				size = sizeof(struct ibv_flow_spec_ipv6);
> >  				specs.ipv6 = (struct ibv_flow_spec_ipv6){
> > -					.type = IBV_FLOW_SPEC_IPV6,
> > +					.type = inner | IBV_FLOW_SPEC_IPV6,
> >  					.size = size,
> >  				};
> >  			}
> > @@ -1044,7 +1073,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse
> *parser)
> >  		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
> >  			size = sizeof(struct ibv_flow_spec_tcp_udp);
> >  			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
> > -				.type = ((i == HASH_RXQ_UDPV4 ||
> > +				.type = inner | ((i == HASH_RXQ_UDPV4 ||
> >  					  i == HASH_RXQ_UDPV6) ?
> >  					 IBV_FLOW_SPEC_UDP :
> >  					 IBV_FLOW_SPEC_TCP),
> > @@ -1065,6 +1094,8 @@ mlx5_flow_convert_finalise(struct
> > mlx5_flow_parse *parser)
> >  /**
> >   * Update flows according to pattern and RSS hash fields.
> >   *
> > + * @param dev
> > + *   Pointer to Ethernet device.
> >   * @param[in, out] parser
> >   *   Internal parser structure.
> >   *
> > @@ -1072,16 +1103,17 @@ mlx5_flow_convert_finalise(struct
> mlx5_flow_parse *parser)
> >   *   0 on success, a negative errno value otherwise and rte_errno is
> set.
> >   */
> >  static int
> > -mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
> > +mlx5_flow_convert_rss(struct rte_eth_dev *dev, struct mlx5_flow_parse
> > +*parser)
> >  {
> > -	const unsigned int ipv4 =
> > +	unsigned int ipv4 =
> >  		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
> >  	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 :
> HASH_RXQ_TCPV6;
> >  	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> >  	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 :
> HASH_RXQ_TCPV4;
> >  	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 :
> HASH_RXQ_IPV4;
> > -	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> > +	enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> >  	unsigned int i;
> > +	int found = 0;
> >
> >  	/* Remove any other flow not matching the pattern. */
> >  	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
> > @@ -1093,9 +1125,51 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
> >  		}
> >  		return 0;
> >  	}
> > -	if (parser->layer == HASH_RXQ_ETH)
> > +	/*
> > +	 * Outer RSS.
> > +	 * HASH_RXQ_ETH is the only rule since tunnel packet match this
> > +	 * rule must match outer pattern.
> > +	 */
> > +	if (parser->tunnel && !parser->rss_conf.level) {
> > +		/* Remove flows other than default. */
> > +		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
> > +			rte_free(parser->queue[i].ibv_attr);
> > +			parser->queue[i].ibv_attr = NULL;
> > +		}
> > +		ipv4 = hash_rxq_init[parser->out_layer].ip_version ==
> MLX5_IPV4;
> > +		ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
> > +		if (hash_rxq_init[parser->out_layer].dpdk_rss_hf &
> > +		    parser->rss_conf.types) {
> > +			parser->queue[HASH_RXQ_ETH].hash_fields =
> > +				hash_rxq_init[parser->out_layer].hash_fields;
> > +		} else if (ip && (hash_rxq_init[ip].dpdk_rss_hf &
> > +		    parser->rss_conf.types)) {
> > +			parser->queue[HASH_RXQ_ETH].hash_fields =
> > +				hash_rxq_init[ip].hash_fields;
> > +		} else if (parser->rss_conf.types) {
> > +			DRV_LOG(WARNING,
> > +				"port %u rss outer hash function doesn't match"
> > +				" pattern", dev->data->port_id);
> > +		}
> > +		return 0;
> > +	}
> > +	if (parser->layer == HASH_RXQ_ETH || parser->layer ==
> HASH_RXQ_TUNNEL) {
> > +		/* Remove unused flows according to hash function. */
> > +		for (i = 0; i != hash_rxq_init_n - 1; ++i) {
> > +			if (!parser->queue[i].ibv_attr)
> > +				continue;
> > +			if (hash_rxq_init[i].dpdk_rss_hf &
> > +			    parser->rss_conf.types) {
> > +				parser->queue[i].hash_fields =
> > +					hash_rxq_init[i].hash_fields;
> > +				continue;
> > +			}
> > +			rte_free(parser->queue[i].ibv_attr);
> > +			parser->queue[i].ibv_attr = NULL;
> > +		}
> >  		return 0;
> > -	/* This layer becomes useless as the pattern define under layers. */
> > +	}
> > +	/* Remove ETH layer flow. */
> >  	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
> >  	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
> >  	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
> > @@ -1105,20 +1179,50 @@ mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
> >  		rte_free(parser->queue[i].ibv_attr);
> >  		parser->queue[i].ibv_attr = NULL;
> >  	}
> > -	/* Remove impossible flow according to the RSS configuration. */
> > -	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
> > -	    parser->rss_conf.types) {
> > -		/* Remove any other flow. */
> > +	/*
> > +	 * Keep L4 flows as IP pattern has to support L4 RSS.
> > +	 * Otherwise, only keep the flow that match the pattern.
> > +	 */
> > +	if (parser->layer != ip) {
> > +		/* Only keep the flow that match the pattern. */
> >  		for (i = hmin; i != (hmax + 1); ++i) {
> > -			if (i == parser->layer || !parser->queue[i].ibv_attr)
> > +			if (i == parser->layer)
> >  				continue;
> >  			rte_free(parser->queue[i].ibv_attr);
> >  			parser->queue[i].ibv_attr = NULL;
> >  		}
> > -	} else if (!parser->queue[ip].ibv_attr) {
> > -		/* no RSS possible with the current configuration. */
> > -		parser->rss_conf.queue_num = 1;
> >  	}
> > +	/* Remove impossible flow according to the RSS configuration. */
> > +	for (i = hmin; i != (hmax + 1); ++i) {
> > +		if (!parser->queue[i].ibv_attr)
> > +			continue;
> > +		if (parser->rss_conf.types &
> > +		    hash_rxq_init[i].dpdk_rss_hf) {
> > +			parser->queue[i].hash_fields =
> > +				hash_rxq_init[i].hash_fields;
> > +			found = 1;
> > +			continue;
> > +		}
> > +		/* L4 flow could be used for L3 RSS. */
> > +		if (i == parser->layer && i < ip &&
> > +		    (hash_rxq_init[ip].dpdk_rss_hf &
> > +		     parser->rss_conf.types)) {
> > +			parser->queue[i].hash_fields =
> > +				hash_rxq_init[ip].hash_fields;
> > +			found = 1;
> > +			continue;
> > +		}
> > +		/* L3 flow and L4 hash: non-rss L3 flow. */
> > +		if (i == parser->layer && i == ip && found)
> > +			/* IP pattern and L4 HF. */
> > +			continue;
> > +		rte_free(parser->queue[i].ibv_attr);
> > +		parser->queue[i].ibv_attr = NULL;
> > +	}
> > +	if (!found)
> > +		DRV_LOG(WARNING,
> > +			"port %u rss hash function doesn't match "
> > +			"pattern", dev->data->port_id);
> 
> The hash function is Toeplitz or XOR; it is not applied to the pattern
> but used to compute a hash result from some information in the packet.
> This comment is totally wrong.

Thanks, I'll replace "hash function" with "hash fields".
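
In Verbs terms the two are indeed separate knobs, e.g. (illustrative
snippet, not taken from the patch):

	.rx_hash_conf = (struct ibv_rx_hash_conf){
		.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ, /* how to hash */
		.rx_hash_fields_mask = IBV_RX_HASH_SRC_IPV4 |  /* what to hash */
				       IBV_RX_HASH_DST_IPV4,
	},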

> 
> Another point: such a log will trigger for an application using the
> MLX5 PMD but not for one using the MLX4 PMD, specifically because of
> how NICs driven by the MLX5 PMD work internally (MLX4 can use a single
> Hash RX queue whereas MLX5 needs a Hash Rx queue per kind of protocol).
> Since the behavior ends up exactly the same, I *strongly* suggest
> removing such an annoying warning.

After some testing on the current mlx5 code, the behavior of the previous
code doesn't seem to be consistent; not sure whether it is the same in the
mlx4 PMD:
- Pattern: eth/ipv4/tcp, RSS: UDP, creation succeeds.
- Pattern: eth/ipv4, RSS: IPv6, creation fails.

This patch supports the 2nd case without hashing and warns on the first
case. Take the first case as an example: a packet that matches the pattern
must be TCP, so there is no reason to hash it as UDP; same for the 2nd
case. Both are plainly wrong configurations, but to stay robust a warning
is used here, and through this warning message users learn that there is
NO hash result because the hash-fields configuration does not match the
pattern.
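
For example, the first case in testpmd syntax would be (illustrative
only, exact token names may differ):

  flow create 0 ingress pattern eth / ipv4 / tcp / end
    actions rss types ipv4-udp end queues 1 2 end / end

The rule is accepted, but no packet matching eth/ipv4/tcp can be hashed
on UDP ports, hence the warning.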

Please note that below cases are valid and no warning:
- Pattern: eth/ipv4, RSS: UDP
- Pattern: eth/ipv4/udp, RSS: IPv4

> 
> >  	return 0;
> >  }
> >
> > @@ -1165,7 +1269,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
> >  	if (ret)
> >  		return ret;
> > -	ret = mlx5_flow_convert_items_validate(items, error, parser);
> > +	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
> >  	if (ret)
> >  		return ret;
> >  	mlx5_flow_convert_finalise(parser);
> > @@ -1186,10 +1290,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  		for (i = 0; i != hash_rxq_init_n; ++i) {
> >  			unsigned int offset;
> >
> > -			if (!(parser->rss_conf.types &
> > -			      hash_rxq_init[i].dpdk_rss_hf) &&
> > -			    (i != HASH_RXQ_ETH))
> > -				continue;
> >  			offset = parser->queue[i].offset;
> >  			parser->queue[i].ibv_attr =
> >  				mlx5_flow_convert_allocate(offset, error);
> > @@ -1201,6 +1301,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  	/* Third step. Conversion parse, fill the specifications. */
> >  	parser->inner = 0;
> >  	parser->tunnel = 0;
> > +	parser->layer = HASH_RXQ_ETH;
> >  	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
> >  		struct mlx5_flow_data data = {
> >  			.parser = parser,
> > @@ -1218,23 +1319,23 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  		if (ret)
> >  			goto exit_free;
> >  	}
> > -	if (parser->mark)
> > -		mlx5_flow_create_flag_mark(parser, parser->mark_id);
> > -	if (parser->count && parser->create) {
> > -		mlx5_flow_create_count(dev, parser);
> > -		if (!parser->cs)
> > -			goto exit_count_error;
> > -	}
> >  	/*
> >  	 * Last step. Complete missing specification to reach the RSS
> >  	 * configuration.
> >  	 */
> >  	if (!parser->drop)
> > -		ret = mlx5_flow_convert_rss(parser);
> > +		ret = mlx5_flow_convert_rss(dev, parser);
> >  		if (ret)
> >  			goto exit_free;
> >  		mlx5_flow_convert_finalise(parser);
> >  	mlx5_flow_update_priority(dev, parser, attr);
> > +	if (parser->mark)
> > +		mlx5_flow_create_flag_mark(parser, parser->mark_id);
> > +	if (parser->count && parser->create) {
> > +		mlx5_flow_create_count(dev, parser);
> > +		if (!parser->cs)
> > +			goto exit_count_error;
> > +	}
> >  exit_free:
> >  	/* Only verification is expected, all resources should be released.
> */
> >  	if (!parser->create) {
> > @@ -1282,17 +1383,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse
> *parser, void *src,
> >  	for (i = 0; i != hash_rxq_init_n; ++i) {
> >  		if (!parser->queue[i].ibv_attr)
> >  			continue;
> > -		/* Specification must be the same l3 type or none. */
> > -		if (parser->layer == HASH_RXQ_ETH ||
> > -		    (hash_rxq_init[parser->layer].ip_version ==
> > -		     hash_rxq_init[i].ip_version) ||
> > -		    (hash_rxq_init[i].ip_version == 0)) {
> > -			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > -					parser->queue[i].offset);
> > -			memcpy(dst, src, size);
> > -			++parser->queue[i].ibv_attr->num_of_specs;
> > -			parser->queue[i].offset += size;
> > -		}
> > +		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > +				parser->queue[i].offset);
> > +		memcpy(dst, src, size);
> > +		++parser->queue[i].ibv_attr->num_of_specs;
> > +		parser->queue[i].offset += size;
> >  	}
> >  }
> >
> > @@ -1323,9 +1418,7 @@ mlx5_flow_create_eth(const struct rte_flow_item
> *item,
> >  		.size = eth_size,
> >  	};
> >
> > -	/* Don't update layer for the inner pattern. */
> > -	if (!parser->inner)
> > -		parser->layer = HASH_RXQ_ETH;
> > +	parser->layer = HASH_RXQ_ETH;
> >  	if (spec) {
> >  		unsigned int i;
> >
> > @@ -1438,9 +1531,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item
> *item,
> >  		.size = ipv4_size,
> >  	};
> >
> > -	/* Don't update layer for the inner pattern. */
> > -	if (!parser->inner)
> > -		parser->layer = HASH_RXQ_IPV4;
> > +	parser->layer = HASH_RXQ_IPV4;
> >  	if (spec) {
> >  		if (!mask)
> >  			mask = default_mask;
> > @@ -1493,9 +1584,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item
> *item,
> >  		.size = ipv6_size,
> >  	};
> >
> > -	/* Don't update layer for the inner pattern. */
> > -	if (!parser->inner)
> > -		parser->layer = HASH_RXQ_IPV6;
> > +	parser->layer = HASH_RXQ_IPV6;
> >  	if (spec) {
> >  		unsigned int i;
> >  		uint32_t vtc_flow_val;
> > @@ -1568,13 +1657,10 @@ mlx5_flow_create_udp(const struct rte_flow_item
> *item,
> >  		.size = udp_size,
> >  	};
> >
> > -	/* Don't update layer for the inner pattern. */
> > -	if (!parser->inner) {
> > -		if (parser->layer == HASH_RXQ_IPV4)
> > -			parser->layer = HASH_RXQ_UDPV4;
> > -		else
> > -			parser->layer = HASH_RXQ_UDPV6;
> > -	}
> > +	if (parser->layer == HASH_RXQ_IPV4)
> > +		parser->layer = HASH_RXQ_UDPV4;
> > +	else
> > +		parser->layer = HASH_RXQ_UDPV6;
> >  	if (spec) {
> >  		if (!mask)
> >  			mask = default_mask;
> > @@ -1617,13 +1703,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item
> *item,
> >  		.size = tcp_size,
> >  	};
> >
> > -	/* Don't update layer for the inner pattern. */
> > -	if (!parser->inner) {
> > -		if (parser->layer == HASH_RXQ_IPV4)
> > -			parser->layer = HASH_RXQ_TCPV4;
> > -		else
> > -			parser->layer = HASH_RXQ_TCPV6;
> > -	}
> > +	if (parser->layer == HASH_RXQ_IPV4)
> > +		parser->layer = HASH_RXQ_TCPV4;
> > +	else
> > +		parser->layer = HASH_RXQ_TCPV6;
> >  	if (spec) {
> >  		if (!mask)
> >  			mask = default_mask;
> > @@ -1673,6 +1756,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item
> *item,
> >  	id.vni[0] = 0;
> >  	parser->inner = IBV_FLOW_SPEC_INNER;
> >  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
> > +	parser->out_layer = parser->layer;
> > +	parser->layer = HASH_RXQ_TUNNEL;
> >  	if (spec) {
> >  		if (!mask)
> >  			mask = default_mask;
> > @@ -1727,6 +1812,8 @@ mlx5_flow_create_gre(const struct rte_flow_item
> > *item __rte_unused,
> >
> >  	parser->inner = IBV_FLOW_SPEC_INNER;
> >  	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> > +	parser->out_layer = parser->layer;
> > +	parser->layer = HASH_RXQ_TUNNEL;
> >  	mlx5_flow_create_copy(parser, &tunnel, size);
> >  	return 0;
> >  }
> > @@ -1890,33 +1977,33 @@ mlx5_flow_create_action_queue_rss(struct
> rte_eth_dev *dev,
> >  	unsigned int i;
> >
> >  	for (i = 0; i != hash_rxq_init_n; ++i) {
> > -		uint64_t hash_fields;
> > -
> >  		if (!parser->queue[i].ibv_attr)
> >  			continue;
> >  		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
> >  		parser->queue[i].ibv_attr = NULL;
> > -		hash_fields = hash_rxq_init[i].hash_fields;
> > +		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
> >  		if (!priv->dev->data->dev_started)
> >  			continue;
> >  		flow->frxq[i].hrxq =
> >  			mlx5_hrxq_get(dev,
> >  				      parser->rss_conf.key,
> >  				      parser->rss_conf.key_len,
> > -				      hash_fields,
> > +				      flow->frxq[i].hash_fields,
> >  				      parser->rss_conf.queue,
> >  				      parser->rss_conf.queue_num,
> > -				      parser->tunnel);
> > +				      parser->tunnel,
> > +				      parser->rss_conf.level);
> >  		if (flow->frxq[i].hrxq)
> >  			continue;
> >  		flow->frxq[i].hrxq =
> >  			mlx5_hrxq_new(dev,
> >  				      parser->rss_conf.key,
> >  				      parser->rss_conf.key_len,
> > -				      hash_fields,
> > +				      flow->frxq[i].hash_fields,
> >  				      parser->rss_conf.queue,
> >  				      parser->rss_conf.queue_num,
> > -				      parser->tunnel);
> > +				      parser->tunnel,
> > +				      parser->rss_conf.level);
> >  		if (!flow->frxq[i].hrxq) {
> >  			return rte_flow_error_set(error, ENOMEM,
> >  						  RTE_FLOW_ERROR_TYPE_HANDLE,
> > @@ -2013,7 +2100,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev
> *dev,
> >  		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
> >  			dev->data->port_id,
> >  			(void *)flow, i,
> > -			(void *)flow->frxq[i].hrxq,
> > +			(void *)flow->frxq[i].hrxq->qp,
> >  			(void *)flow->frxq[i].ibv_flow);
> >  	}
> >  	if (!flows_n) {
> > @@ -2541,19 +2628,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct
> mlx5_flows *list)
> >  			flow->frxq[i].hrxq =
> >  				mlx5_hrxq_get(dev, flow->rss_conf.key,
> >  					      flow->rss_conf.key_len,
> > -					      hash_rxq_init[i].hash_fields,
> > +					      flow->frxq[i].hash_fields,
> >  					      flow->rss_conf.queue,
> >  					      flow->rss_conf.queue_num,
> > -					      flow->tunnel);
> > +					      flow->tunnel,
> > +					      flow->rss_conf.level);
> >  			if (flow->frxq[i].hrxq)
> >  				goto flow_create;
> >  			flow->frxq[i].hrxq =
> >  				mlx5_hrxq_new(dev, flow->rss_conf.key,
> >  					      flow->rss_conf.key_len,
> > -					      hash_rxq_init[i].hash_fields,
> > +					      flow->frxq[i].hash_fields,
> >  					      flow->rss_conf.queue,
> >  					      flow->rss_conf.queue_num,
> > -					      flow->tunnel);
> > +					      flow->tunnel,
> > +					      flow->rss_conf.level);
> >  			if (!flow->frxq[i].hrxq) {
> >  				DRV_LOG(DEBUG,
> >  					"port %u flow %p cannot be applied", diff --
> git
> > a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c index
> > be684d378..6874aa32a 100644
> > --- a/drivers/net/mlx5/mlx5_glue.c
> > +++ b/drivers/net/mlx5/mlx5_glue.c
> > @@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj,
> uint64_t obj_type)
> >  	return mlx5dv_init_obj(obj, obj_type);
> >  }
> >
> > +static struct ibv_qp *
> > +mlx5_glue_dv_create_qp(struct ibv_context *context,
> > +		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
> > +		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
> > +{
> > +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> > +	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
> > +#else
> > +	(void)context;
> > +	(void)qp_init_attr_ex;
> > +	(void)dv_qp_init_attr;
> > +	return NULL;
> > +#endif
> > +}
> > +
> >  const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
> >  	.version = MLX5_GLUE_VERSION,
> >  	.fork_init = mlx5_glue_fork_init,
> > @@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct
> mlx5_glue){
> >  	.dv_query_device = mlx5_glue_dv_query_device,
> >  	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
> >  	.dv_init_obj = mlx5_glue_dv_init_obj,
> > +	.dv_create_qp = mlx5_glue_dv_create_qp,
> >  };
> > diff --git a/drivers/net/mlx5/mlx5_glue.h
> > b/drivers/net/mlx5/mlx5_glue.h index b5efee3b6..841363872 100644
> > --- a/drivers/net/mlx5/mlx5_glue.h
> > +++ b/drivers/net/mlx5/mlx5_glue.h
> > @@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
> >  struct ibv_query_counter_set_attr;
> >  #endif
> >
> > +#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> > +struct mlx5dv_qp_init_attr;
> > +#endif
> > +
> >  /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
> >  struct mlx5_glue {
> >  	const char *version;
> > @@ -106,6 +110,10 @@ struct mlx5_glue {
> >  				   enum mlx5dv_set_ctx_attr_type type,
> >  				   void *attr);
> >  	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
> > +	struct ibv_qp *(*dv_create_qp)
> > +		(struct ibv_context *context,
> > +		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
> > +		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
> >  };
> >
> >  const struct mlx5_glue *mlx5_glue;
> > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> > index 073732e16..1997609ec 100644
> > --- a/drivers/net/mlx5/mlx5_rxq.c
> > +++ b/drivers/net/mlx5/mlx5_rxq.c
> > @@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
> >   *   Number of queues.
> >   * @param tunnel
> >   *   Tunnel type.
> > + * @param rss_level
> > + *   RSS hash on tunnel level.
> >   *
> >   * @return
> >   *   The Verbs object initialised, NULL otherwise and rte_errno is set.
> > @@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
> >  mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  	      const uint8_t *rss_key, uint32_t rss_key_len,
> >  	      uint64_t hash_fields,
> > -	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> > +	      const uint16_t *queues, uint32_t queues_n,
> > +	      uint32_t tunnel, uint32_t rss_level)
> >  {
> >  	struct priv *priv = dev->data->dev_private;
> >  	struct mlx5_hrxq *hrxq;
> >  	struct mlx5_ind_table_ibv *ind_tbl;
> >  	struct ibv_qp *qp;
> >  	int err;
> > +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> > +	struct mlx5dv_qp_init_attr qp_init_attr = {0};
> > +#endif
> >
> >  	queues_n = hash_fields ? queues_n : 1;
> >  	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
> > @@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  		rte_errno = ENOMEM;
> >  		return NULL;
> >  	}
> > +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
> > +	if (tunnel) {
> > +		qp_init_attr.comp_mask =
> > +				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
> > +		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
> > +	}
> > +	qp = mlx5_glue->dv_create_qp(
> > +		priv->ctx,
> > +		&(struct ibv_qp_init_attr_ex){
> > +			.qp_type = IBV_QPT_RAW_PACKET,
> > +			.comp_mask =
> > +				IBV_QP_INIT_ATTR_PD |
> > +				IBV_QP_INIT_ATTR_IND_TABLE |
> > +				IBV_QP_INIT_ATTR_RX_HASH,
> > +			.rx_hash_conf = (struct ibv_rx_hash_conf){
> > +				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
> > +				.rx_hash_key_len = rss_key_len ? rss_key_len :
> > +						   rss_hash_default_key_len,
> > +				.rx_hash_key = rss_key ?
> > +					       (void *)(uintptr_t)rss_key :
> > +					       rss_hash_default_key,
> > +				.rx_hash_fields_mask = hash_fields |
> > +					(tunnel && rss_level ?
> > +					(uint32_t)IBV_RX_HASH_INNER : 0),
> > +			},
> > +			.rwq_ind_tbl = ind_tbl->ind_table,
> > +			.pd = priv->pd,
> > +		},
> > +		&qp_init_attr);
> > +#else
> >  	qp = mlx5_glue->create_qp_ex
> >  		(priv->ctx,
> >  		 &(struct ibv_qp_init_attr_ex){
> > @@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  				IBV_QP_INIT_ATTR_RX_HASH,
> >  			.rx_hash_conf = (struct ibv_rx_hash_conf){
> >  				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
> > -				.rx_hash_key_len = rss_key_len,
> > -				.rx_hash_key = (void *)(uintptr_t)rss_key,
> > +				.rx_hash_key_len = rss_key_len ? rss_key_len :
> > +						   rss_hash_default_key_len,
> > +				.rx_hash_key = rss_key ?
> > +					       (void *)(uintptr_t)rss_key :
> > +					       rss_hash_default_key,
> >  				.rx_hash_fields_mask = hash_fields,
> >  			},
> >  			.rwq_ind_tbl = ind_tbl->ind_table,
> >  			.pd = priv->pd,
> >  		 });
> > +#endif
> >  	if (!qp) {
> >  		rte_errno = errno;
> >  		goto error;
> > @@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  	hrxq->rss_key_len = rss_key_len;
> >  	hrxq->hash_fields = hash_fields;
> >  	hrxq->tunnel = tunnel;
> > +	hrxq->rss_level = rss_level;
> >  	memcpy(hrxq->rss_key, rss_key, rss_key_len);
> >  	rte_atomic32_inc(&hrxq->refcnt);
> >  	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
> > @@ -1448,6 +1489,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  	return hrxq;
> >  error:
> >  	err = rte_errno; /* Save rte_errno before cleanup. */
> > +	DRV_LOG(ERR, "port %u: Error creating Hash Rx queue",
> > +		dev->data->port_id);
> 
> Internal developer logs should not remain in the code. The user will
> already see a flow creation failure; there is no need to annoy him with
> messages he cannot understand.
> 
> >  	mlx5_ind_table_ibv_release(dev, ind_tbl);
> >  	if (qp)
> >  		claim_zero(mlx5_glue->destroy_qp(qp));
> > @@ -1469,6 +1512,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >   *   Number of queues.
> >   * @param tunnel
> >   *   Tunnel type.
> > + * @param rss_level
> > + *   RSS hash on tunnel level
> >   *
> >   * @return
> >   *   An hash Rx queue on success.
> > @@ -1477,7 +1522,8 @@ struct mlx5_hrxq *
> >  mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  	      const uint8_t *rss_key, uint32_t rss_key_len,
> >  	      uint64_t hash_fields,
> > -	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> > +	      const uint16_t *queues, uint32_t queues_n,
> > +	      uint32_t tunnel, uint32_t rss_level)
> 
> rss_level > 1 means tunnel; there is no need for redundant
> information.
> 
> >  {
> >  	struct priv *priv = dev->data->dev_private;
> >  	struct mlx5_hrxq *hrxq;
> > @@ -1494,6 +1540,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  			continue;
> >  		if (hrxq->tunnel != tunnel)
> >  			continue;
> > +		if (hrxq->rss_level != rss_level)
> > +			continue;
> >  		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
> >  		if (!ind_tbl)
> >  			continue;
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h
> > b/drivers/net/mlx5/mlx5_rxtx.h index d35605b55..62cf55109 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -147,6 +147,7 @@ struct mlx5_hrxq {
> >  	struct ibv_qp *qp; /* Verbs queue pair. */
> >  	uint64_t hash_fields; /* Verbs Hash fields. */
> >  	uint32_t tunnel; /* Tunnel type. */
> > +	uint32_t rss_level; /* RSS on tunnel level. */
> >  	uint32_t rss_key_len; /* Hash key length in bytes. */
> >  	uint8_t rss_key[]; /* Hash key. */
> >  };
> > @@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev
> *dev,
> >  				const uint8_t *rss_key, uint32_t rss_key_len,
> >  				uint64_t hash_fields,
> >  				const uint16_t *queues, uint32_t queues_n,
> > -				uint32_t tunnel);
> > +				uint32_t tunnel, uint32_t rss_level);
> >  struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  				const uint8_t *rss_key, uint32_t rss_key_len,
> >  				uint64_t hash_fields,
> >  				const uint16_t *queues, uint32_t queues_n,
> > -				uint32_t tunnel);
> > +				uint32_t tunnel, uint32_t rss_level);
> >  int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
> >  int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
> >  uint64_t mlx5_get_rx_port_offloads(void);
> > --
> > 2.13.3
> >
> 
> Thanks,
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-13 13:02   ` Nélio Laranjeiro
@ 2018-04-14 12:57     ` Xueming(Steven) Li
  2018-04-16  7:28       ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-14 12:57 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev, Olivier Matz, Adrien Mazarguil

+Adrien

> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Friday, April 13, 2018 9:03 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> <olivier.matz@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> identification
> 
> +Olivier,
> 
> On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > This patch introduces tunnel type identification based on flow rules.
> > If flows of multiple tunnel types are built on the same queue,
> > RTE_PTYPE_TUNNEL_MASK will be returned, and the user application could
> > use bits in the flow mark as a tunnel type identifier.
> 
> For an application it will mean the packet embeds all tunnel types
> defined in DPDK; to express such a thing you need an
> RTE_PTYPE_TUNNEL_UNKNOWN, which does not currently exist.

There was an RTE_PTYPE_TUNNEL_UNKNOWN definition, but it was removed after
discussion. So I think it would be good to add it back in the patchset
reviewed by Adrien.

> Even with it, the application still needs to parse the packet to
> discover which tunnel the packet embeds; is there any benefit to having
> such a bit? Not so sure.

With a tunnel flag, the checksum status represents the inner checksum.
Setting a different flow mark per tunnel type could save the time of
parsing the tunnel header.
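
E.g. on the application side (hypothetical sketch; mark_to_tunnel is an
application-defined table, not part of the patch):

	uint32_t ptype = m->packet_type & RTE_PTYPE_TUNNEL_MASK;
	uint32_t tunnel;

	if (ptype == RTE_PTYPE_TUNNEL_MASK && (m->ol_flags & PKT_RX_FDIR_ID))
		/* Several tunnel types share this queue: use the flow mark. */
		tunnel = mark_to_tunnel[m->hash.fdir.hi];
	else
		tunnel = ptype;
	/* With any tunnel bit set, checksum flags describe the inner headers. */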

> 
> Thanks,
> 
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c          | 127
> +++++++++++++++++++++++++++++-----
> >  drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
> >  drivers/net/mlx5/mlx5_rxtx.c          |  12 ++--
> >  drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +++---
> > drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +++--
> >  6 files changed, 159 insertions(+), 38 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 644f26a95..7d04b4d65 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -225,6 +225,7 @@ struct rte_flow {
> >  	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
> >  	uint16_t (*queues)[]; /**< Queues indexes to use. */
> >  	uint8_t rss_key[40]; /**< copy of the RSS key. */
> > +	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
> >  	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
> >  	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats.
> */
> >  	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
> > @@ -241,6 +242,19 @@ struct rte_flow {
> >  	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> >  	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> >
> > +const uint32_t flow_ptype[] = {
> > +	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
> > +	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
> > +};
> > +
> > +#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
> > +
> > +const uint32_t ptype_ext[] = {
> > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
> > +					      RTE_PTYPE_L4_UDP,
> > +	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
> > +};
> > +
> >  /** Structure to generate a simple graph of layers supported by the NIC. */
> >  struct mlx5_flow_items {
> >  	/** List of possible actions for these items. */
> > @@ -440,6 +454,7 @@ struct mlx5_flow_parse {
> >  	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use.
> */
> >  	uint8_t rss_key[40]; /**< copy of the RSS key. */
> >  	enum hash_rxq_type layer; /**< Last pattern layer detected. */
> > +	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
> >  	struct ibv_counter_set *cs; /**< Holds the counter set for the rule
> */
> >  	struct {
> >  		struct ibv_flow_attr *ibv_attr;
> > @@ -858,7 +873,7 @@ mlx5_flow_convert_items_validate(const struct
> rte_flow_item items[],
> >  		if (ret)
> >  			goto exit_item_not_supported;
> >  		if (IS_TUNNEL(items->type)) {
> > -			if (parser->inner) {
> > +			if (parser->tunnel) {
> >  				rte_flow_error_set(error, ENOTSUP,
> >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> >  						   items,
> > @@ -867,6 +882,7 @@ mlx5_flow_convert_items_validate(const struct
> rte_flow_item items[],
> >  				return -rte_errno;
> >  			}
> >  			parser->inner = IBV_FLOW_SPEC_INNER;
> > +			parser->tunnel = flow_ptype[items->type];
> >  		}
> >  		if (parser->drop) {
> >  			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
> @@ -1175,6
> > +1191,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
> >  	}
> >  	/* Third step. Conversion parse, fill the specifications. */
> >  	parser->inner = 0;
> > +	parser->tunnel = 0;
> >  	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
> >  		struct mlx5_flow_data data = {
> >  			.parser = parser,
> > @@ -1643,6 +1660,7 @@ mlx5_flow_create_vxlan(const struct
> > rte_flow_item *item,
> >
> >  	id.vni[0] = 0;
> >  	parser->inner = IBV_FLOW_SPEC_INNER;
> > +	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
> >  	if (spec) {
> >  		if (!mask)
> >  			mask = default_mask;
> > @@ -1696,6 +1714,7 @@ mlx5_flow_create_gre(const struct rte_flow_item
> *item __rte_unused,
> >  	};
> >
> >  	parser->inner = IBV_FLOW_SPEC_INNER;
> > +	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
> >  	mlx5_flow_create_copy(parser, &tunnel, size);
> >  	return 0;
> >  }
> > @@ -1874,7 +1893,8 @@ mlx5_flow_create_action_queue_rss(struct
> rte_eth_dev *dev,
> >  				      parser->rss_conf.key_len,
> >  				      hash_fields,
> >  				      parser->rss_conf.queue,
> > -				      parser->rss_conf.queue_num);
> > +				      parser->rss_conf.queue_num,
> > +				      parser->tunnel);
> >  		if (flow->frxq[i].hrxq)
> >  			continue;
> >  		flow->frxq[i].hrxq =
> > @@ -1883,7 +1903,8 @@ mlx5_flow_create_action_queue_rss(struct
> rte_eth_dev *dev,
> >  				      parser->rss_conf.key_len,
> >  				      hash_fields,
> >  				      parser->rss_conf.queue,
> > -				      parser->rss_conf.queue_num);
> > +				      parser->rss_conf.queue_num,
> > +				      parser->tunnel);
> >  		if (!flow->frxq[i].hrxq) {
> >  			return rte_flow_error_set(error, ENOMEM,
> >  						  RTE_FLOW_ERROR_TYPE_HANDLE,
> > @@ -1895,6 +1916,40 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
> >  }
> >
> >  /**
> > + * RXQ update after flow rule creation.
> > + *
> > + * @param dev
> > + *   Pointer to Ethernet device.
> > + * @param flow
> > + *   Pointer to the flow rule.
> > + */
> > +static void
> > +mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
> > +{
> > +	struct priv *priv = dev->data->dev_private;
> > +	unsigned int i;
> > +
> > +	if (!dev->data->dev_started)
> > +		return;
> > +	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
> > +		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
> > +						 [(*flow->queues)[i]];
> > +		struct mlx5_rxq_ctrl *rxq_ctrl =
> > +			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
> > +		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
> > +
> > +		rxq_data->mark |= flow->mark;
> > +		if (!tunnel)
> > +			continue;
> > +		rxq_ctrl->tunnel_types[tunnel] += 1;
> > +		if (rxq_data->tunnel != flow->tunnel)
> > +			rxq_data->tunnel = rxq_data->tunnel ?
> > +					   RTE_PTYPE_TUNNEL_MASK :
> > +					   flow->tunnel;
> > +	}
> > +}
> > +
> > +/**
> >   * Complete flow rule creation.
> >   *
> >   * @param dev
> > @@ -1954,12 +2009,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev
> *dev,
> >  				   NULL, "internal error in flow creation");
> >  		goto error;
> >  	}
> > -	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
> > -		struct mlx5_rxq_data *q =
> > -			(*priv->rxqs)[parser->rss_conf.queue[i]];
> > -
> > -		q->mark |= parser->mark;
> > -	}
> > +	mlx5_flow_create_update_rxqs(dev, flow);
> >  	return 0;
> >  error:
> >  	ret = rte_errno; /* Save rte_errno before cleanup. */
> > @@ -2032,6 +2082,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
> >  	}
> >  	/* Copy configuration. */
> >  	flow->queues = (uint16_t (*)[])(flow + 1);
> > +	flow->tunnel = parser.tunnel;
> >  	flow->rss_conf = (struct rte_flow_action_rss){
> >  		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
> >  		.level = 0,
> > @@ -2123,9 +2174,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev,
> struct mlx5_flows *list,
> >  	struct priv *priv = dev->data->dev_private;
> >  	unsigned int i;
> >
> > -	if (flow->drop || !flow->mark)
> > +	if (flow->drop || !dev->data->dev_started)
> >  		goto free;
> > -	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
> > +	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
> > +		/* Update queue tunnel type. */
> > +		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
> > +						 [(*flow->queues)[i]];
> > +		struct mlx5_rxq_ctrl *rxq_ctrl =
> > +			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
> > +		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
> > +
> > +		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
> > +		rxq_ctrl->tunnel_types[tunnel] -= 1;
> > +		if (!rxq_ctrl->tunnel_types[tunnel]) {
> > +			/* Update tunnel type. */
> > +			uint8_t j;
> > +			uint8_t types = 0;
> > +			uint8_t last;
> > +
> > +			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
> > +				if (rxq_ctrl->tunnel_types[j]) {
> > +					types += 1;
> > +					last = j;
> > +				}
> > +			/* Keep same if more than one tunnel types left. */
> > +			if (types == 1)
> > +				rxq_data->tunnel = ptype_ext[last];
> > +			else if (types == 0)
> > +				/* No tunnel type left. */
> > +				rxq_data->tunnel = 0;
> > +		}
> > +	}
> > +	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
> >  		struct rte_flow *tmp;
> >  		int mark = 0;
> >
> > @@ -2344,9 +2424,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
> >  {
> >  	struct priv *priv = dev->data->dev_private;
> >  	struct rte_flow *flow;
> > +	unsigned int i;
> >
> >  	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
> > -		unsigned int i;
> >  		struct mlx5_ind_table_ibv *ind_tbl = NULL;
> >
> >  		if (flow->drop) {
> > @@ -2392,6 +2472,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct
> mlx5_flows *list)
> >  		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
> >  			(void *)flow);
> >  	}
> > +	/* Cleanup Rx queue tunnel info. */
> > +	for (i = 0; i != priv->rxqs_n; ++i) {
> > +		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
> > +		struct mlx5_rxq_ctrl *rxq_ctrl =
> > +			container_of(q, struct mlx5_rxq_ctrl, rxq);
> > +
> > +		if (!q)
> > +			continue;
> > +		memset((void *)rxq_ctrl->tunnel_types, 0,
> > +		       sizeof(rxq_ctrl->tunnel_types));
> > +		q->tunnel = 0;
> > +	}
> >  }
> >
> >  /**
> > @@ -2439,7 +2531,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct
> mlx5_flows *list)
> >  					      flow->rss_conf.key_len,
> >  					      hash_rxq_init[i].hash_fields,
> >  					      flow->rss_conf.queue,
> > -					      flow->rss_conf.queue_num);
> > +					      flow->rss_conf.queue_num,
> > +					      flow->tunnel);
> >  			if (flow->frxq[i].hrxq)
> >  				goto flow_create;
> >  			flow->frxq[i].hrxq =
> > @@ -2447,7 +2540,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct
> mlx5_flows *list)
> >  					      flow->rss_conf.key_len,
> >  					      hash_rxq_init[i].hash_fields,
> >  					      flow->rss_conf.queue,
> > -					      flow->rss_conf.queue_num);
> > +					      flow->rss_conf.queue_num,
> > +					      flow->tunnel);
> >  			if (!flow->frxq[i].hrxq) {
> >  				DRV_LOG(DEBUG,
> >  					"port %u flow %p cannot be applied", @@ -
> 2469,10 +2563,7 @@
> > mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
> >  			DRV_LOG(DEBUG, "port %u flow %p applied",
> >  				dev->data->port_id, (void *)flow);
> >  		}
> > -		if (!flow->mark)
> > -			continue;
> > -		for (i = 0; i != flow->rss_conf.queue_num; ++i)
> > -			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
> > +		mlx5_flow_create_update_rxqs(dev, flow);
> >  	}
> >  	return 0;
> >  }
> > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> > index 1e4354ab3..351acfc0f 100644
> > --- a/drivers/net/mlx5/mlx5_rxq.c
> > +++ b/drivers/net/mlx5/mlx5_rxq.c
> > @@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
> >   *   first queue index will be taken for the indirection table.
> >   * @param queues_n
> >   *   Number of queues.
> > + * @param tunnel
> > + *   Tunnel type.
> >   *
> >   * @return
> >   *   The Verbs object initialised, NULL otherwise and rte_errno is set.
> > @@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
> >  mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  	      const uint8_t *rss_key, uint32_t rss_key_len,
> >  	      uint64_t hash_fields,
> > -	      const uint16_t *queues, uint32_t queues_n)
> > +	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> >  {
> >  	struct priv *priv = dev->data->dev_private;
> >  	struct mlx5_hrxq *hrxq;
> > @@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  	hrxq->qp = qp;
> >  	hrxq->rss_key_len = rss_key_len;
> >  	hrxq->hash_fields = hash_fields;
> > +	hrxq->tunnel = tunnel;
> >  	memcpy(hrxq->rss_key, rss_key, rss_key_len);
> >  	rte_atomic32_inc(&hrxq->refcnt);
> >  	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
> > @@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
> >   *   first queue index will be taken for the indirection table.
> >   * @param queues_n
> >   *   Number of queues.
> > + * @param tunnel
> > + *   Tunnel type.
> >   *
> >   * @return
> >   *   An hash Rx queue on success.
> > @@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
> >  mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  	      const uint8_t *rss_key, uint32_t rss_key_len,
> >  	      uint64_t hash_fields,
> > -	      const uint16_t *queues, uint32_t queues_n)
> > +	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
> >  {
> >  	struct priv *priv = dev->data->dev_private;
> >  	struct mlx5_hrxq *hrxq;
> > @@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  			continue;
> >  		if (hrxq->hash_fields != hash_fields)
> >  			continue;
> > +		if (hrxq->tunnel != tunnel)
> > +			continue;
> >  		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
> >  		if (!ind_tbl)
> >  			continue;
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > index 1f422c70b..d061dfc8a 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -34,7 +34,7 @@
> >  #include "mlx5_prm.h"
> >
> >  static __rte_always_inline uint32_t
> > -rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
> > +rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
> >
> >  static __rte_always_inline int
> >  mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
> > @@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
> >  	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
> >  		     RTE_PTYPE_L4_UDP;
> >  	/* Tunneled - L3 */
> > +	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
> >  	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L4_NONFRAG;
> >  	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L4_NONFRAG;
> > +	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
> >  	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
> >  		     RTE_PTYPE_INNER_L4_NONFRAG;
> > @@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
> >  /**
> >   * Translate RX completion flags to packet type.
> >   *
> > + * @param[in] rxq
> > + *   Pointer to RX queue structure.
> >   * @param[in] cqe
> >   *   Pointer to CQE.
> >   *
> > @@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
> >   *   Packet type for struct rte_mbuf.
> >   */
> >  static inline uint32_t
> > -rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
> > +rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
> >  {
> >  	uint8_t idx;
> >  	uint8_t pinfo = cqe->pkt_info;
> > @@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
> >  	 * bit[7] = outer_l3_type
> >  	 */
> >  	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
> > -	return mlx5_ptype_table[idx];
> > +	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
> >  }
> >
> >  /**
> > @@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> >  			pkt = seg;
> >  			assert(len >= (rxq->crc_present << 2));
> >  			/* Update packet information. */
> > -			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
> > +			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
> >  			pkt->ol_flags = 0;
> >  			if (rss_hash_res && rxq->rss_hash) {
> >  				pkt->hash.rss = rss_hash_res;
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > index a702cb603..6866f6818 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -104,6 +104,7 @@ struct mlx5_rxq_data {
> >  	void *cq_uar; /* CQ user access region. */
> >  	uint32_t cqn; /* CQ number. */
> >  	uint8_t cq_arm_sn; /* CQ arm seq number. */
> > +	uint32_t tunnel; /* Tunnel information. */
> >  } __rte_cache_aligned;
> >
> >  /* Verbs Rx queue elements. */
> > @@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
> >  	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
> >  	struct mlx5_rxq_data rxq; /* Data path structure. */
> >  	unsigned int socket; /* CPU socket ID for allocations. */
> > +	uint32_t tunnel_types[16]; /* Tunnel type counter. */
> >  	unsigned int irq:1; /* Whether IRQ is enabled. */
> >  	uint16_t idx; /* Queue index. */
> >  };
> > @@ -145,6 +147,7 @@ struct mlx5_hrxq {
> >  	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
> >  	struct ibv_qp *qp; /* Verbs queue pair. */
> >  	uint64_t hash_fields; /* Verbs Hash fields. */
> > +	uint32_t tunnel; /* Tunnel type. */
> >  	uint32_t rss_key_len; /* Hash key length in bytes. */
> >  	uint8_t rss_key[]; /* Hash key. */
> >  };
> > @@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
> >  struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
> >  				const uint8_t *rss_key, uint32_t rss_key_len,
> >  				uint64_t hash_fields,
> > -				const uint16_t *queues, uint32_t queues_n);
> > +				const uint16_t *queues, uint32_t queues_n,
> > +				uint32_t tunnel);
> >  struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
> >  				const uint8_t *rss_key, uint32_t rss_key_len,
> >  				uint64_t hash_fields,
> > -				const uint16_t *queues, uint32_t queues_n);
> > +				const uint16_t *queues, uint32_t queues_n,
> > +				uint32_t tunnel);
> >  int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
> >  int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
> >  uint64_t mlx5_get_rx_port_offloads(void);
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > index bbe1818ef..9f9136108 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > @@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
> >  	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
> >  	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
> >  	uint64x2_t rearm0, rearm1, rearm2, rearm3;
> > +	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
> >
> >  	if (rxq->mark) {
> >  		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
> > @@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
> >  	ptype = vshrn_n_u32(ptype_info, 10);
> >  	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
> >  	ptype = vorr_u16(ptype, op_err);
> > -	pkts[0]->packet_type =
> > -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
> > -	pkts[1]->packet_type =
> > -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
> > -	pkts[2]->packet_type =
> > -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
> > -	pkts[3]->packet_type =
> > -		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
> > +	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
> > +	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
> > +	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
> > +	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
> > +	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
> > +			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
> > +	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
> > +			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
> > +	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
> > +			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
> > +	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
> > +			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
> >  	/* Fill flags for checksum and VLAN. */
> >  	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
> >  	pinfo = vreinterpretq_u32_u8(
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > index c088bcb51..d2492481d 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> > @@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
> >  	const __m128i mbuf_init =
> >  		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
> >  	__m128i rearm0, rearm1, rearm2, rearm3;
> > +	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
> >
> >  	/* Extract pkt_info field. */
> >  	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
> > @@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
> >  	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
> >  	op_err = _mm_srli_epi16(op_err, 8);
> >  	ptype = _mm_or_si128(ptype, op_err);
> > -	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
> > -	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
> > -	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
> > -	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
> > +	pt_idx0 = _mm_extract_epi8(ptype, 0);
> > +	pt_idx1 = _mm_extract_epi8(ptype, 2);
> > +	pt_idx2 = _mm_extract_epi8(ptype, 4);
> > +	pt_idx3 = _mm_extract_epi8(ptype, 6);
> > +	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
> > +			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
> > +	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
> > +			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
> > +	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
> > +			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
> > +	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
> > +			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
> >  	/* Fill flags for checksum and VLAN. */
> >  	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
> >  	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
> > --
> > 2.13.3
> 
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-14 12:57     ` Xueming(Steven) Li
@ 2018-04-16  7:28       ` Nélio Laranjeiro
  2018-04-16  8:05         ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-16  7:28 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev, Olivier Matz, Adrien Mazarguil

On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> +Adrien
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Friday, April 13, 2018 9:03 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > <olivier.matz@6wind.com>
> > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > identification
> > 
> > +Olivier,
> > 
> > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > This patch introduced tunnel type identification based on flow rules.
> > > If flows of multiple tunnel types built on same queue,
> > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could use
> > > bits in flow mark as tunnel type identifier.
> > 
> > For an application it will mean the packet embed all tunnel types defined
> > in DPDK, to make such thing you need a RTE_PTYPE_TUNNEL_UNKNOWN which does
> > not exists currently.
> 
> There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed due to discussion.
> So I think it good to add it in the patchset of reviewed by Adrien.

Agreed,

> 
> > Even with it, the application still needs to parse the packet to discover
> > which tunnel the packet embed, is there any benefit having such bit?  Not
> > so sure.
> 
> With a tunnel flag, checksum status represent inner checksum.

Not sure this is generic enough; MLX5 behaves like this, but how do other
NICs behave?  There should be specific bits for the inner checksum if all
NICs don't have the same behavior.

> Setting flow mark for different flow type could save time of parsing tunnel.

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16  7:28       ` Nélio Laranjeiro
@ 2018-04-16  8:05         ` Xueming(Steven) Li
  2018-04-16  9:28           ` Adrien Mazarguil
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-16  8:05 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev, Olivier Matz, Adrien Mazarguil



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Monday, April 16, 2018 3:29 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> <olivier.matz@6wind.com>; Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> identification
> 
> On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > +Adrien
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Friday, April 13, 2018 9:03 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > <olivier.matz@6wind.com>
> > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > identification
> > >
> > > +Olivier,
> > >
> > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > This patch introduced tunnel type identification based on flow rules.
> > > > If flows of multiple tunnel types built on same queue,
> > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could use
> > > > bits in flow mark as tunnel type identifier.
> > >
> > > For an application it will mean the packet embed all tunnel types
> > > defined in DPDK, to make such thing you need a
> > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> >
> > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed due to
> discussion.
> > So I think it good to add it in the patchset of reviewed by Adrien.
> 
> Agreed,
> 
> >
> > > Even with it, the application still needs to parse the packet to
> > > discover which tunnel the packet embed, is there any benefit having
> > > such bit?  Not so sure.
> >
> > With a tunnel flag, checksum status represent inner checksum.
> 
> Not sure this is generic enough, MLX5 behaves as this, but how behaves
> other NICs?  It should have specific bits for inner checksum if all NIC
> don't have the same behavior.

From my understanding, if the outer checksum is invalid, the packet can't be
received as a tunneled packet, only as a normal one; thus the checksum flags
always reflect the inner headers for a valid tunneled packet.
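
For illustration, a minimal sketch (mine, not part of the patch, assuming
only the standard mbuf flags from <rte_mbuf.h>; the helper name is made up)
of how an application would read the status on such a packet:

	/* Once a tunnel type is reported in packet_type, the checksum
	 * flags refer to the innermost recognized layers. */
	static inline int
	tunneled_pkt_inner_csum_ok(const struct rte_mbuf *m)
	{
		if (!(m->packet_type & RTE_PTYPE_TUNNEL_MASK))
			return 0; /* not recognized as tunneled */
		return (m->ol_flags & PKT_RX_IP_CKSUM_MASK) ==
			PKT_RX_IP_CKSUM_GOOD &&
		       (m->ol_flags & PKT_RX_L4_CKSUM_MASK) ==
			PKT_RX_L4_CKSUM_GOOD;
	}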

> 
> > Setting flow mark for different flow type could save time of parsing
> tunnel.
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
  2018-04-13 15:22         ` Xueming(Steven) Li
@ 2018-04-16  8:14           ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-16  8:14 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Fri, Apr 13, 2018 at 03:22:50PM +0000, Xueming(Steven) Li wrote:
>[...] 
> > @@
> > > > static
> > > > > const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  		.convert = mlx5_flow_create_vxlan_gpe,
> > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > >  	},
> > > > > +	[RTE_FLOW_ITEM_TYPE_MPLS] = {
> > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > > > +		.actions = valid_actions,
> > > > > +		.mask = &(const struct rte_flow_item_mpls){
> > > > > +			.label_tc_s = "\xff\xff\xf0",
> > > > > +		},
> > > > > +		.default_mask = &rte_flow_item_mpls_mask,
> > > > > +		.mask_sz = sizeof(struct rte_flow_item_mpls),
> > > > > +		.convert = mlx5_flow_create_mpls,
> > > > > +#ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> > > > > +		.dst_sz = sizeof(struct ibv_flow_spec_mpls),
> > > > > +#endif
> > > > > +	},
> > > >
> > > > Why the whole item is not under ifdef?
> > >
> > > If apply macro to whole item, there will be a null pointer if create
> > mpls flow.
> > > There is a macro in function mlx5_flow_create_mpls() to avoid using this
> > invalid data.
> > 
> > I think there is some kind of confusion here, what I mean is moving the
> > #ifdef to embrace the whole stuff i.e.:
> > 
> >  #ifdef HAVE_IBV_DEVICE_MPLS_SUPPORT
> >  [RTE_FLOW_ITEM_TYPE_MPLS] = {
> >   .items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> >   	       RTE_FLOW_ITEM_TYPE_IPV4,
> >   	       RTE_FLOW_ITEM_TYPE_IPV6),
> >   .actions = valid_actions,
> >   .mask = &(const struct rte_flow_item_mpls){
> >   	.label_tc_s = "\xff\xff\xf0",
> >   },
> >   .default_mask = &rte_flow_item_mpls_mask,
> >   .mask_sz = sizeof(struct rte_flow_item_mpls),
> >   .convert = mlx5_flow_create_mpls,
> >   .dst_sz = sizeof(struct ibv_flow_spec_mpls),
> >  },
> >  #endif
> > 
> > Not having this item in this static array ends by not supporting it, this
> > is what I mean.
> 
> Yes, I know. There is a code using this array w/o NULL check:
> 		cur_item = &mlx5_flow_items[items->type];
> 		ret = cur_item->convert(items,
> 					(cur_item->default_mask ?
> 					 cur_item->default_mask :
> 					 cur_item->mask),
> 					 &data);
> 
> 

This code is after the mlx5_flow_convert_items_validate() which refuses
unknown items; if you see an unknown item reaching the code above, there is
a bug somewhere and it should be fixed.  Unsupported items should not be in
the static array.
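
To illustrate, a rough sketch only (not the actual mlx5 code; the rejection
criterion shown here is hypothetical):

	/* With unsupported items absent from mlx5_flow_items[], validation
	 * fails before any convert() callback can ever be reached. */
	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items)
		if (!mlx5_flow_items[items->type].mask_sz)
			return rte_flow_error_set(error, ENOTSUP,
						  RTE_FLOW_ERROR_TYPE_ITEM,
						  items,
						  "item not supported");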

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16  8:05         ` Xueming(Steven) Li
@ 2018-04-16  9:28           ` Adrien Mazarguil
  2018-04-16 13:32             ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Adrien Mazarguil @ 2018-04-16  9:28 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz

On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Monday, April 16, 2018 3:29 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > <olivier.matz@6wind.com>; Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > identification
> > 
> > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > +Adrien
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > <olivier.matz@6wind.com>
> > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > identification
> > > >
> > > > +Olivier,
> > > >
> > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > If flows of multiple tunnel types built on same queue,
> > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could use
> > > > > bits in flow mark as tunnel type identifier.
> > > >
> > > > For an application it will mean the packet embed all tunnel types
> > > > defined in DPDK, to make such thing you need a
> > > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > >
> > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed due to
> > discussion.
> > > So I think it good to add it in the patchset of reviewed by Adrien.
> > 
> > Agreed,
> > 
> > >
> > > > Even with it, the application still needs to parse the packet to
> > > > discover which tunnel the packet embed, is there any benefit having
> > > > such bit?  Not so sure.
> > >
> > > With a tunnel flag, checksum status represent inner checksum.
> > 
> > Not sure this is generic enough, MLX5 behaves as this, but how behaves
> > other NICs?  It should have specific bits for inner checksum if all NIC
> > don't have the same behavior.
> 
> From my understanding, if outer checksum invalid, the packet can't be received 
> as a tunneled packet, but a normal packet, thus checksum flags always result 
> of inner for a valid tunneled packet.

Yes, since checksum validation information covers all layers at once
(outermost to the innermost recognized), the presence of an "unknown tunnel"
bit implicitly means outer headers are OK.

Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main issue I see
is that it's implicit: getting 0 after and'ing packet types with
RTE_PTYPE_TUNNEL_MASK means either no tunnel present or an unknown type.

How about not setting any tunnel bit and letting applications rely on the
presence of RTE_PTYPE_INNER_* to determine that there is a tunnel of unknown
type? The rationale being that a tunneled packet without an inner payload is
kind of pointless anyway.
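
A minimal sketch of that convention from the application side (the helper
name is made up; the masks come from rte_mbuf_ptype.h):

	/* A packet is tunneled if a tunnel type is reported, or if inner
	 * layers were recognized without one (tunnel of unknown type). */
	static inline int
	pkt_is_tunneled(uint32_t ptype)
	{
		return (ptype & RTE_PTYPE_TUNNEL_MASK) ||
		       (ptype & (RTE_PTYPE_INNER_L2_MASK |
				 RTE_PTYPE_INNER_L3_MASK |
				 RTE_PTYPE_INNER_L4_MASK));
	}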

> > > Setting flow mark for different flow type could save time of parsing
> > tunnel.
> > 
> > Thanks,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 07/14] net/mlx5: support tunnel RSS level
  2018-04-14 10:12     ` Xueming(Steven) Li
@ 2018-04-16 12:25       ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-16 12:25 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Sat, Apr 14, 2018 at 10:12:58AM +0000, Xueming(Steven) Li wrote:
> Hi Nelio,
>[...]
> > > +	if (!found)
> > > +		DRV_LOG(WARNING,
> > > +			"port %u rss hash function doesn't match "
> > > +			"pattern", dev->data->port_id);
> > 
> > The hash function is toeplitz, xor, it is not applied on the pattern but
> > used to compute an hash result using some information from the packet.
> > This comment is totally wrong.
> 
> Thanks, I'll replace "hash function" to "hash fields".
> 
> > 
> > Another point, such log will trigger on an application using MLX5 PMD but
> > not on MLX4 PMD and this specifically because on how the NIC using the
> > MLX5 PMD are made internally (MLX4 can use a single Hash RX queue whereas
> > MLX5 needs an Hash Rx queue per kind of protocol).
> > The fact being it will have the exact same behavior I'll *strongly*
> > suggest to remove such annoying warning.
> 
> After some test on mlx5 current code, the behavior in previous code doesn't
> seem to be consistent, not sure whether it same in mlx4 PMD:
> - Pattern: eth/ipv4/tcp, RSS: UDP, creation success.
> - Pattern: eth/ipv4,RSS: IPv6, creation failed.

Seems there is a bug.

> This patch support the 2nd case w/o hash, and warn upon the first case.
> Take example of first case, a packet that matches the pattern must be TCP,
> no reason to hash it as TCP, same to the 2nd case. They are totally
> wrong configuration, but to be robust, warning is used here, and users 
> have to learn that NO hash result because HF configuration mismatch through 
> this warning message.
> 
> Please note that below cases are valid and no warning:
> - Pattern: eth/ipv4, RSS: UDP
> - Pattern: eth/ipv4/udp, RSS: IPv4

This log will not be raised for non-IP protocols defined by the user, and it
will be raised when the user already expects no RSS to happen.
It will be more annoying than helpful.

Example: 

 flow create 0 ingress eth ethertype is 0x0806 / end actions rss ....

won't raise such a log, whereas ARP is not an IP protocol and thus cannot be
RSS'ed.

 flow create 0 ingress eth / ipv4 / end actions rss type ipv6...

will raise the log, but it is obvious to the user that there will be no RSS.

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16  9:28           ` Adrien Mazarguil
@ 2018-04-16 13:32             ` Xueming(Steven) Li
  2018-04-16 13:47               ` Adrien Mazarguil
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-16 13:32 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz


> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Sent: Monday, April 16, 2018 5:28 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> 
> On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Monday, April 16, 2018 3:29 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > <olivier.matz@6wind.com>; Adrien Mazarguil
> > > <adrien.mazarguil@6wind.com>
> > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > identification
> > >
> > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > +Adrien
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier
> > > > > Matz <olivier.matz@6wind.com>
> > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > identification
> > > > >
> > > > > +Olivier,
> > > > >
> > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could
> > > > > > use bits in flow mark as tunnel type identifier.
> > > > >
> > > > > For an application it will mean the packet embed all tunnel
> > > > > types defined in DPDK, to make such thing you need a
> > > > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > >
> > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed due
> > > > to
> > > discussion.
> > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > >
> > > Agreed,
> > >
> > > >
> > > > > Even with it, the application still needs to parse the packet to
> > > > > discover which tunnel the packet embed, is there any benefit
> > > > > having such bit?  Not so sure.
> > > >
> > > > With a tunnel flag, checksum status represent inner checksum.
> > >
> > > Not sure this is generic enough, MLX5 behaves as this, but how
> > > behaves other NICs?  It should have specific bits for inner checksum
> > > if all NIC don't have the same behavior.
> >
> > From my understanding, if outer checksum invalid, the packet can't be
> > received as a tunneled packet, but a normal packet, thus checksum
> > flags always result of inner for a valid tunneled packet.
> 
> Yes, since checksum validation information covers all layers at once (outermost to the innermost
> recognized), the presence of an "unknown tunnel"
> bit implicitly means outer headers are OK.
> 
> Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main issue I see is that it's implicit, as
> in getting 0 after and'ing packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown
> type.

How about defining RTE_PTYPE_TUNNEL_UNKNOWN the same as
RTE_PTYPE_TUNNEL_MASK?  And'ing packet types would then always return a
non-zero value.
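
A sketch of the idea (a proposal only, not an existing DPDK definition):

	/* Alias the whole tunnel nibble so that
	 * (ptype & RTE_PTYPE_TUNNEL_MASK) != 0 even for unknown tunnels. */
	#define RTE_PTYPE_TUNNEL_UNKNOWN RTE_PTYPE_TUNNEL_MASK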

> 
> How about not setting any tunnel bit and let applications rely on the presence of RTE_PTYPE_INNER_* to
> determine that there is a tunnel of unknown type? The rationale being that a tunneled packet without
> an inner payload is kind of pointless anyway.

An unknown type doesn't break anything, nor does it waste enum bits; straightforward IMHO.

> 
> > > > Setting flow mark for different flow type could save time of
> > > > parsing
> > > tunnel.
> > >
> > > Thanks,
> > >
> > > --
> > > Nélio Laranjeiro
> > > 6WIND
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16 13:32             ` Xueming(Steven) Li
@ 2018-04-16 13:47               ` Adrien Mazarguil
  2018-04-16 15:27                 ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Adrien Mazarguil @ 2018-04-16 13:47 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz

On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> 
> > -----Original Message-----
> > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Sent: Monday, April 16, 2018 5:28 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > Olivier Matz <olivier.matz@6wind.com>
> > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> > 
> > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > <adrien.mazarguil@6wind.com>
> > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > identification
> > > >
> > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > +Adrien
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier
> > > > > > Matz <olivier.matz@6wind.com>
> > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > > identification
> > > > > >
> > > > > > +Olivier,
> > > > > >
> > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could
> > > > > > > use bits in flow mark as tunnel type identifier.
> > > > > >
> > > > > > For an application it will mean the packet embed all tunnel
> > > > > > types defined in DPDK, to make such thing you need a
> > > > > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > >
> > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed due
> > > > > to
> > > > discussion.
> > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > >
> > > > Agreed,
> > > >
> > > > >
> > > > > > Even with it, the application still needs to parse the packet to
> > > > > > discover which tunnel the packet embed, is there any benefit
> > > > > > having such bit?  Not so sure.
> > > > >
> > > > > With a tunnel flag, checksum status represent inner checksum.
> > > >
> > > > Not sure this is generic enough, MLX5 behaves as this, but how
> > > > behaves other NICs?  It should have specific bits for inner checksum
> > > > if all NIC don't have the same behavior.
> > >
> > > From my understanding, if outer checksum invalid, the packet can't be
> > > received as a tunneled packet, but a normal packet, thus checksum
> > > flags always result of inner for a valid tunneled packet.
> > 
> > Yes, since checksum validation information covers all layers at once (outermost to the innermost
> > recognized), the presence of an "unknown tunnel"
> > bit implicitly means outer headers are OK.
> > 
> > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main issue I see is that it's implicit, as
> > in getting 0 after and'ing packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown
> > type.
> 
> How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask RTE_PTYPE_TUNNEL_MASK? And'ding packet types always 
> return a non-zero value.

I mean the value already exists, it's implicitly 0. Adding one with the same
value as RTE_PTYPE_TUNNEL_MASK could be seen as a waste of a value otherwise
usable for an actual tunnel type (there are only 4 bits).
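
For reference, a sketch of the field being discussed (the mask value comes
from rte_mbuf_ptype.h):

	/* The tunnel type is a single 4-bit nibble of the packet type
	 * (RTE_PTYPE_TUNNEL_MASK == 0x0000f000), i.e. at most 15 non-zero
	 * encodings for actual tunnel protocols, 0 meaning none/unknown. */
	uint32_t tun = m->packet_type & RTE_PTYPE_TUNNEL_MASK;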

> > How about not setting any tunnel bit and let applications rely on the presence of RTE_PTYPE_INNER_* to
> > determine that there is a tunnel of unknown type? The rationale being that a tunneled packet without
> > an inner payload is kind of pointless anyway.
> 
> An unknown type doesn't break anything, neither enum bits, straightforward IMHO.

Keep in mind that mbuf packet types report what is identified. All the
definitions in this file name a specific protocol. For instance there is no
such definition as "L3 present" or "L4 present". "Tunnel present" doesn't
make a lot of sense on its own either.

Don't you agree that reporting at least one inner ptype while leaving tunnel
ptype to 0 automatically addresses this issue?

> > > > > Setting flow mark for different flow type could save time of
> > > > > parsing
> > > > tunnel.
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Nélio Laranjeiro
> > > > 6WIND
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16 13:47               ` Adrien Mazarguil
@ 2018-04-16 15:27                 ` Xueming(Steven) Li
  2018-04-16 16:02                   ` Adrien Mazarguil
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-16 15:27 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz



> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Sent: Monday, April 16, 2018 9:48 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> 
> On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> >
> > > -----Original Message-----
> > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Sent: Monday, April 16, 2018 5:28 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > <olivier.matz@6wind.com>
> > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > identification
> > >
> > > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier
> > > > > Matz <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > > <adrien.mazarguil@6wind.com>
> > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > identification
> > > > >
> > > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > > +Adrien
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > Olivier Matz <olivier.matz@6wind.com>
> > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel
> > > > > > > type identification
> > > > > > >
> > > > > > > +Olivier,
> > > > > > >
> > > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application
> > > > > > > > could use bits in flow mark as tunnel type identifier.
> > > > > > >
> > > > > > > For an application it will mean the packet embed all tunnel
> > > > > > > types defined in DPDK, to make such thing you need a
> > > > > > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > > >
> > > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed
> > > > > > due to
> > > > > discussion.
> > > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > > >
> > > > > Agreed,
> > > > >
> > > > > >
> > > > > > > Even with it, the application still needs to parse the
> > > > > > > packet to discover which tunnel the packet embed, is there
> > > > > > > any benefit having such bit?  Not so sure.
> > > > > >
> > > > > > With a tunnel flag, checksum status represent inner checksum.
> > > > >
> > > > > Not sure this is generic enough, MLX5 behaves as this, but how
> > > > > behaves other NICs?  It should have specific bits for inner
> > > > > checksum if all NIC don't have the same behavior.
> > > >
> > > > From my understanding, if outer checksum invalid, the packet can't
> > > > be received as a tunneled packet, but a normal packet, thus
> > > > checksum flags always result of inner for a valid tunneled packet.
> > >
> > > Yes, since checksum validation information covers all layers at once
> > > (outermost to the innermost recognized), the presence of an "unknown tunnel"
> > > bit implicitly means outer headers are OK.
> > >
> > > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main
> > > issue I see is that it's implicit, as in getting 0 after and'ing
> > > packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown type.
> >
> > How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask
> > RTE_PTYPE_TUNNEL_MASK? And'ding packet types always return a non-zero value.
> 
> I mean the value already exists, it's implicitly 0. Adding one with the same value as
> RTE_PTYPE_TUNNEL_MASK could be seen as a waste of a value otherwise usable for an actual tunnel type
> (there are only 4 bits).
> 
> > > How about not setting any tunnel bit and let applications rely on
> > > the presence of RTE_PTYPE_INNER_* to determine that there is a
> > > tunnel of unknown type? The rationale being that a tunneled packet without an inner payload is
> kind of pointless anyway.
> >
> > An unknown type doesn't break anything, neither enum bits, straightforward IMHO.
> 
> Keep in mind that mbuf packet types report what is identified. All the definitions in this file name a
> specific protocol. For instance there is no such definition as "L3 present" or "L4 present". "Tunnel
> present" doesn't make a lot of sense on its own either.
> 
> Don't you agree that reporting at least one inner ptype while leaving tunnel ptype to 0 automatically
> addresses this issue?

Currently there is no inner L2 ptype, so a packet with only an inner L2 header will be recognized as a non-tunneled packet.

> 
> > > > > > Setting flow mark for different flow type could save time of
> > > > > > parsing
> > > > > tunnel.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > --
> > > > > Nélio Laranjeiro
> > > > > 6WIND
> > >
> > > --
> > > Adrien Mazarguil
> > > 6WIND
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16 15:27                 ` Xueming(Steven) Li
@ 2018-04-16 16:02                   ` Adrien Mazarguil
  2018-04-17  4:53                     ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Adrien Mazarguil @ 2018-04-16 16:02 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz

On Mon, Apr 16, 2018 at 03:27:37PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Sent: Monday, April 16, 2018 9:48 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > Olivier Matz <olivier.matz@6wind.com>
> > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> > 
> > On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> > >
> > > > -----Original Message-----
> > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > Sent: Monday, April 16, 2018 5:28 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > <olivier.matz@6wind.com>
> > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > identification
> > > >
> > > > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier
> > > > > > Matz <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > > > <adrien.mazarguil@6wind.com>
> > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > > identification
> > > > > >
> > > > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > > > +Adrien
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > > Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel
> > > > > > > > type identification
> > > > > > > >
> > > > > > > > +Olivier,
> > > > > > > >
> > > > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user application
> > > > > > > > > could use bits in flow mark as tunnel type identifier.
> > > > > > > >
> > > > > > > > For an application it will mean the packet embed all tunnel
> > > > > > > > types defined in DPDK, to make such thing you need a
> > > > > > > > RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > > > >
> > > > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but removed
> > > > > > > due to
> > > > > > discussion.
> > > > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > > > >
> > > > > > Agreed,
> > > > > >
> > > > > > >
> > > > > > > > Even with it, the application still needs to parse the
> > > > > > > > packet to discover which tunnel the packet embed, is there
> > > > > > > > any benefit having such bit?  Not so sure.
> > > > > > >
> > > > > > > With a tunnel flag, checksum status represent inner checksum.
> > > > > >
> > > > > > Not sure this is generic enough, MLX5 behaves as this, but how
> > > > > > behaves other NICs?  It should have specific bits for inner
> > > > > > checksum if all NIC don't have the same behavior.
> > > > >
> > > > > From my understanding, if outer checksum invalid, the packet can't
> > > > > be received as a tunneled packet, but a normal packet, thus
> > > > > checksum flags always result of inner for a valid tunneled packet.
> > > >
> > > > Yes, since checksum validation information covers all layers at once
> > > > (outermost to the innermost recognized), the presence of an "unknown tunnel"
> > > > bit implicitly means outer headers are OK.
> > > >
> > > > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main
> > > > issue I see is that it's implicit, as in getting 0 after and'ing
> > > > packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown type.
> > >
> > > How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask
> > > RTE_PTYPE_TUNNEL_MASK? And'ding packet types always return a non-zero value.
> > 
> > I mean the value already exists, it's implicitly 0. Adding one with the same value as
> > RTE_PTYPE_TUNNEL_MASK could be seen as a waste of a value otherwise usable for an actual tunnel type
> > (there are only 4 bits).
> > 
> > > > How about not setting any tunnel bit and let applications rely on
> > > > the presence of RTE_PTYPE_INNER_* to determine that there is a
> > > > tunnel of unknown type? The rationale being that a tunneled packet without an inner payload is
> > kind of pointless anyway.
> > >
> > > An unknown type doesn't break anything, neither enum bits, straightforward IMHO.
> > 
> > Keep in mind that mbuf packet types report what is identified. All the definitions in this file name a
> > specific protocol. For instance there is no such definition as "L3 present" or "L4 present". "Tunnel
> > present" doesn't make a lot of sense on its own either.
> > 
> > Don't you agree that reporting at least one inner ptype while leaving tunnel ptype to 0 automatically
> > addresses this issue?
> 
> Currently, no inner L2 ptype, so for packet with only L2, it will be recognized as non-tunnel packet.

Applications can live with it. Don't bother with a ptype API change at this
point, it raises more issues than it solves.

Given the size of the series, let's deal with that later through a separate
task and according to user feedback.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-16 16:02                   ` Adrien Mazarguil
@ 2018-04-17  4:53                     ` Xueming(Steven) Li
  2018-04-17  7:20                       ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-17  4:53 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Nélio Laranjeiro, Shahaf Shuler, dev, Olivier Matz



> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Sent: Tuesday, April 17, 2018 12:03 AM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> 
> On Mon, Apr 16, 2018 at 03:27:37PM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Sent: Monday, April 16, 2018 9:48 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > <olivier.matz@6wind.com>
> > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > identification
> > >
> > > On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Sent: Monday, April 16, 2018 5:28 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > > <olivier.matz@6wind.com>
> > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > identification
> > > > >
> > > > > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > Olivier Matz <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > > > > <adrien.mazarguil@6wind.com>
> > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel
> > > > > > > type identification
> > > > > > >
> > > > > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > > > > +Adrien
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > > > Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx
> > > > > > > > > tunnel type identification
> > > > > > > > >
> > > > > > > > > +Olivier,
> > > > > > > > >
> > > > > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user
> > > > > > > > > > application could use bits in flow mark as tunnel type identifier.
> > > > > > > > >
> > > > > > > > > For an application it will mean the packet embed all
> > > > > > > > > tunnel types defined in DPDK, to make such thing you
> > > > > > > > > need a RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > > > > >
> > > > > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but
> > > > > > > > removed due to
> > > > > > > discussion.
> > > > > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > > > > >
> > > > > > > Agreed,
> > > > > > >
> > > > > > > >
> > > > > > > > > Even with it, the application still needs to parse the
> > > > > > > > > packet to discover which tunnel the packet embed, is
> > > > > > > > > there any benefit having such bit?  Not so sure.
> > > > > > > >
> > > > > > > > With a tunnel flag, checksum status represent inner checksum.
> > > > > > >
> > > > > > > Not sure this is generic enough, MLX5 behaves as this, but
> > > > > > > how behaves other NICs?  It should have specific bits for
> > > > > > > inner checksum if all NIC don't have the same behavior.
> > > > > >
> > > > > > From my understanding, if outer checksum invalid, the packet
> > > > > > can't be received as a tunneled packet, but a normal packet,
> > > > > > thus checksum flags always result of inner for a valid tunneled packet.
> > > > >
> > > > > Yes, since checksum validation information covers all layers at
> > > > > once (outermost to the innermost recognized), the presence of an "unknown tunnel"
> > > > > bit implicitly means outer headers are OK.
> > > > >
> > > > > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main
> > > > > issue I see is that it's implicit, as in getting 0 after and'ing
> > > > > packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown type.
> > > >
> > > > How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask
> > > > RTE_PTYPE_TUNNEL_MASK? And'ding packet types always return a non-zero value.
> > >
> > > I mean the value already exists, it's implicitly 0. Adding one with
> > > the same value as RTE_PTYPE_TUNNEL_MASK could be seen as a waste of
> > > a value otherwise usable for an actual tunnel type (there are only 4 bits).
> > >
> > > > > How about not setting any tunnel bit and let applications rely
> > > > > on the presence of RTE_PTYPE_INNER_* to determine that there is
> > > > > a tunnel of unknown type? The rationale being that a tunneled
> > > > > packet without an inner payload is
> > > kind of pointless anyway.
> > > >
> > > > An unknown type doesn't break anything, neither enum bits, straightforward IMHO.
> > >
> > > Keep in mind that mbuf packet types report what is identified. All
> > > the definitions in this file name a specific protocol. For instance
> > > there is no such definition as "L3 present" or "L4 present". "Tunnel present" doesn't make a lot
> of sense on its own either.
> > >
> > > Don't you agree that reporting at least one inner ptype while
> > > leaving tunnel ptype to 0 automatically addresses this issue?
> >
> > Currently, no inner L2 ptype, so for packet with only L2, it will be recognized as non-tunnel packet.
> 
> Applications can live with it. Don't bother with a ptype API change at this point, it raises more
> issues than it solves.
> 
> Given the size of the series, let's deal with that later through a separate task and according to user
> feedback.

Nelio, so I'll leave it as it is, are you okay with it?

> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-17  4:53                     ` Xueming(Steven) Li
@ 2018-04-17  7:20                       ` Nélio Laranjeiro
  2018-04-17 11:50                         ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-17  7:20 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Adrien Mazarguil, Shahaf Shuler, dev, Olivier Matz

On Tue, Apr 17, 2018 at 04:53:15AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Sent: Tuesday, April 17, 2018 12:03 AM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > Olivier Matz <olivier.matz@6wind.com>
> > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> > 
> > On Mon, Apr 16, 2018 at 03:27:37PM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > Sent: Monday, April 16, 2018 9:48 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > <olivier.matz@6wind.com>
> > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > identification
> > > >
> > > > On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > Sent: Monday, April 16, 2018 5:28 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > > > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > > > <olivier.matz@6wind.com>
> > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > > identification
> > > > > >
> > > > > > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > > Olivier Matz <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > > > > > <adrien.mazarguil@6wind.com>
> > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel
> > > > > > > > type identification
> > > > > > > >
> > > > > > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > > > > > +Adrien
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > > > > Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx
> > > > > > > > > > tunnel type identification
> > > > > > > > > >
> > > > > > > > > > +Olivier,
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > > > > > If flows of multiple tunnel types built on same queue,
> > > > > > > > > > > RTE_PTYPE_TUNNEL_MASK will be returned, user
> > > > > > > > > > > application could use bits in flow mark as tunnel type identifier.
> > > > > > > > > >
> > > > > > > > > > For an application it will mean the packet embed all
> > > > > > > > > > tunnel types defined in DPDK, to make such thing you
> > > > > > > > > > need a RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > > > > > >
> > > > > > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but
> > > > > > > > > removed due to
> > > > > > > > discussion.
> > > > > > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > > > > > >
> > > > > > > > Agreed,
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Even with it, the application still needs to parse the
> > > > > > > > > > packet to discover which tunnel the packet embed, is
> > > > > > > > > > there any benefit having such bit?  Not so sure.
> > > > > > > > >
> > > > > > > > > With a tunnel flag, checksum status represent inner checksum.
> > > > > > > >
> > > > > > > > Not sure this is generic enough, MLX5 behaves as this, but
> > > > > > > > how behaves other NICs?  It should have specific bits for
> > > > > > > > inner checksum if all NIC don't have the same behavior.
> > > > > > >
> > > > > > > From my understanding, if outer checksum invalid, the packet
> > > > > > > can't be received as a tunneled packet, but a normal packet,
> > > > > > > thus checksum flags always result of inner for a valid tunneled packet.
> > > > > >
> > > > > > Yes, since checksum validation information covers all layers at
> > > > > > once (outermost to the innermost recognized), the presence of an "unknown tunnel"
> > > > > > bit implicitly means outer headers are OK.
> > > > > >
> > > > > > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the main
> > > > > > issue I see is that it's implicit, as in getting 0 after and'ing
> > > > > > packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown type.
> > > > >
> > > > > How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask
> > > > > RTE_PTYPE_TUNNEL_MASK? And'ding packet types always return a non-zero value.
> > > >
> > > > I mean the value already exists, it's implicitly 0. Adding one with
> > > > the same value as RTE_PTYPE_TUNNEL_MASK could be seen as a waste of
> > > > a value otherwise usable for an actual tunnel type (there are only 4 bits).
> > > >
> > > > > > How about not setting any tunnel bit and let applications rely
> > > > > > on the presence of RTE_PTYPE_INNER_* to determine that there is
> > > > > > a tunnel of unknown type? The rationale being that a tunneled
> > > > > > packet without an inner payload is
> > > > kind of pointless anyway.
> > > > >
> > > > > An unknown type doesn't break anything, neither enum bits, straightforward IMHO.
> > > >
> > > > Keep in mind that mbuf packet types report what is identified. All
> > > > the definitions in this file name a specific protocol. For instance
> > > > there is no such definition as "L3 present" or "L4 present". "Tunnel present" doesn't make a lot
> > of sense on its own either.
> > > >
> > > > Don't you agree that reporting at least one inner ptype while
> > > > leaving tunnel ptype to 0 automatically addresses this issue?
> > >
> > > Currently, no inner L2 ptype, so for packet with only L2, it will be recognized as non-tunnel packet.
> > 
> > Applications can live with it. Don't bother with a ptype API change at this point, it raises more
> > issues than it solves.
> > 
> > Given the size of the series, let's deal with that later through a separate task and according to user
> > feedback.
> 
> Nelio, so I'll leave it as it is, are you okay with it?

I agree with Adrien: if you are not able to say which kind of tunnel it
is, don't set it in the mbuf.

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
  2018-04-17  7:20                       ` Nélio Laranjeiro
@ 2018-04-17 11:50                         ` Xueming(Steven) Li
  0 siblings, 0 replies; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-17 11:50 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Adrien Mazarguil, Shahaf Shuler, dev, Olivier Matz



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Tuesday, April 17, 2018 3:20 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification
> 
> On Tue, Apr 17, 2018 at 04:53:15AM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Sent: Tuesday, April 17, 2018 12:03 AM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > <olivier.matz@6wind.com>
> > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > identification
> > >
> > > On Mon, Apr 16, 2018 at 03:27:37PM +0000, Xueming(Steven) Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Sent: Monday, April 16, 2018 9:48 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > > <olivier.matz@6wind.com>
> > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel type
> > > > > identification
> > > > >
> > > > > On Mon, Apr 16, 2018 at 01:32:49PM +0000, Xueming(Steven) Li wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > Sent: Monday, April 16, 2018 5:28 PM
> > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > Cc: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf
> > > > > > > Shuler <shahafs@mellanox.com>; dev@dpdk.org; Olivier Matz
> > > > > > > <olivier.matz@6wind.com>
> > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx tunnel
> > > > > > > type identification
> > > > > > >
> > > > > > > On Mon, Apr 16, 2018 at 08:05:13AM +0000, Xueming(Steven) Li wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > Sent: Monday, April 16, 2018 3:29 PM
> > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org;
> > > > > > > > > Olivier Matz <olivier.matz@6wind.com>; Adrien Mazarguil
> > > > > > > > > <adrien.mazarguil@6wind.com>
> > > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx
> > > > > > > > > tunnel type identification
> > > > > > > > >
> > > > > > > > > On Sat, Apr 14, 2018 at 12:57:58PM +0000, Xueming(Steven) Li wrote:
> > > > > > > > > > +Adrien
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > > > Sent: Friday, April 13, 2018 9:03 PM
> > > > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>;
> > > > > > > > > > > dev@dpdk.org; Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > > > Subject: Re: [PATCH v3 04/14] net/mlx5: support Rx
> > > > > > > > > > > tunnel type identification
> > > > > > > > > > >
> > > > > > > > > > > +Olivier,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 13, 2018 at 07:20:13PM +0800, Xueming Li wrote:
> > > > > > > > > > > > This patch introduced tunnel type identification based on flow rules.
> > > > > > > > > > > > If flows of multiple tunnel types built on same
> > > > > > > > > > > > queue, RTE_PTYPE_TUNNEL_MASK will be returned,
> > > > > > > > > > > > user application could use bits in flow mark as tunnel type identifier.
> > > > > > > > > > >
> > > > > > > > > > > For an application it will mean the packet embed all
> > > > > > > > > > > tunnel types defined in DPDK, to make such thing you
> > > > > > > > > > > need a RTE_PTYPE_TUNNEL_UNKNOWN which does not exists currently.
> > > > > > > > > >
> > > > > > > > > > There was a RTE_PTYPE_TUNNEL_UNKNOWN definition, but
> > > > > > > > > > removed due to
> > > > > > > > > discussion.
> > > > > > > > > > So I think it good to add it in the patchset of reviewed by Adrien.
> > > > > > > > >
> > > > > > > > > Agreed,
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Even with it, the application still needs to parse
> > > > > > > > > > > the packet to discover which tunnel the packet
> > > > > > > > > > > embed, is there any benefit having such bit?  Not so sure.
> > > > > > > > > >
> > > > > > > > > > With a tunnel flag, checksum status represent inner checksum.
> > > > > > > > >
> > > > > > > > > Not sure this is generic enough, MLX5 behaves as this,
> > > > > > > > > but how behaves other NICs?  It should have specific
> > > > > > > > > bits for inner checksum if all NIC don't have the same behavior.
> > > > > > > >
> > > > > > > > From my understanding, if outer checksum invalid, the
> > > > > > > > packet can't be received as a tunneled packet, but a
> > > > > > > > normal packet, thus checksum flags always result of inner for a valid tunneled packet.
> > > > > > >
> > > > > > > Yes, since checksum validation information covers all layers
> > > > > > > at once (outermost to the innermost recognized), the presence of an "unknown tunnel"
> > > > > > > bit implicitly means outer headers are OK.
> > > > > > >
> > > > > > > Now regarding the addition of RTE_PTYPE_TUNNEL_UNKNOWN, the
> > > > > > > main issue I see is that it's implicit, as in getting 0
> > > > > > > after and'ing packet types with RTE_PTYPE_TUNNEL_MASK means either not present or unknown
> type.
> > > > > >
> > > > > > How about define RTE_PTYPE_TUNNEL_UNKNOWN same ask
> > > > > > RTE_PTYPE_TUNNEL_MASK? And'ding packet types always return a non-zero value.
> > > > >
> > > > > I mean the value already exists, it's implicitly 0. Adding one
> > > > > with the same value as RTE_PTYPE_TUNNEL_MASK could be seen as a
> > > > > waste of a value otherwise usable for an actual tunnel type (there are only 4 bits).
> > > > >
> > > > > > > How about not setting any tunnel bit and let applications
> > > > > > > rely on the presence of RTE_PTYPE_INNER_* to determine that
> > > > > > > there is a tunnel of unknown type? The rationale being that
> > > > > > > a tunneled packet without an inner payload is
> > > > > kind of pointless anyway.
> > > > > >
> > > > > > An unknown type doesn't break anything, neither enum bits, straightforward IMHO.
> > > > >
> > > > > Keep in mind that mbuf packet types report what is identified.
> > > > > All the definitions in this file name a specific protocol. For
> > > > > instance there is no such definition as "L3 present" or "L4
> > > > > present". "Tunnel present" doesn't make a lot
> > > of sense on its own either.
> > > > >
> > > > > Don't you agree that reporting at least one inner ptype while
> > > > > leaving tunnel ptype to 0 automatically addresses this issue?
> > > >
> > > > Currently, no inner L2 ptype, so for packet with only L2, it will be recognized as non-tunnel
> packet.
> > >
> > > Applications can live with it. Don't bother with a ptype API change
> > > at this point, it raises more issues than it solves.
> > >
> > > Given the size of the series, let's deal with that later through a
> > > separate task and according to user feedback.
> >
> > Nelio, so I'll leave it as it is; are you okay with it?
> 
> I agree with Adrien: if you are not able to say which kind of tunnel it is, don't set it in the mbuf.

It's useful to have a tunnel flag; otherwise the code has to be changed
to iterate tunnel_types, which would slow down flow creation:
		rxq_ctrl->tunnel_types[tunnel] += 1;
		if (rxq_data->tunnel != flow->tunnel)
			rxq_data->tunnel = rxq_data->tunnel ?
					   RTE_PTYPE_TUNNEL_MASK :
					   flow->tunnel;
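
Without the flag, the queue ptype would have to be recomputed by
scanning the counters on every flow change, roughly like this (a sketch
reusing the tunnel_types[] counters and ptype_ext[] table from patch 04;
the helper name is hypothetical):

static void
rxq_update_tunnel_ptype(struct mlx5_rxq_ctrl *rxq_ctrl)
{
	struct mlx5_rxq_data *rxq_data = &rxq_ctrl->rxq;
	unsigned int types = 0;
	unsigned int last = 0;
	unsigned int i;

	/* Count how many distinct tunnel types still reference this Rx
	 * queue and remember the last one seen. */
	for (i = 0; i < RTE_DIM(rxq_ctrl->tunnel_types); ++i) {
		if (rxq_ctrl->tunnel_types[i]) {
			types += 1;
			last = i;
		}
	}
	if (types == 0)
		rxq_data->tunnel = 0;
	else if (types == 1)
		rxq_data->tunnel = ptype_ext[last];
	else
		rxq_data->tunnel = RTE_PTYPE_TUNNEL_MASK;
}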

> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v4 00/11] mlx5 Rx tunnel offloading
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                       ` (11 more replies)
  2018-04-17 15:14   ` [PATCH v4 01/11] net/mlx5: support 16 hardware priorities Xueming Li
                     ` (10 subsequent siblings)
  11 siblings, 12 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

v4:
- Fix RSS level according to the value definition
- Add "Inner RSS" column to the NIC feature doc
- Fix flow creation error in case of IPv4 RSS on an IPv6 pattern
- Add new patch: enforce the IP protocol of GRE to be 47
- Remove MPLS-in-UDP and MPLS-in-GRE related patches
- Remove invalid RSS type check
v3:
- Refactor 16 Verbs priority detection.
- Other updates according to ML discussion.
v2:
- Split into 2 series: public api and mlx5, this one is the second.
- Rebased on Adrien's rte flow overhaul:
  http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
v1:
- Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
- Remove deprecation notes of rss level

This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
- Standard VXLAN
- L3 VXLAN (no inner ethernet header)
- VXLAN-GPE

Xueming Li (11):
  net/mlx5: support 16 hardware priorities
  net/mlx5: support GRE tunnel flow
  net/mlx5: support L3 VXLAN flow
  net/mlx5: support Rx tunnel type identification
  net/mlx5: cleanup tunnel checksum offloads
  net/mlx5: split flow RSS handling logic
  net/mlx5: support tunnel RSS level
  net/mlx5: add hardware flow debug dump
  net/mlx5: introduce VXLAN-GPE tunnel type
  net/mlx5: allow flow tunnel ID 0 with outer pattern
  doc: update mlx5 guide on tunnel offloading

 doc/guides/nics/features/default.ini  |   1 +
 doc/guides/nics/features/mlx5.ini     |   3 +
 doc/guides/nics/mlx5.rst              |  22 +-
 drivers/net/mlx5/Makefile             |   2 +-
 drivers/net/mlx5/mlx5.c               |  18 +
 drivers/net/mlx5/mlx5.h               |   5 +
 drivers/net/mlx5/mlx5_flow.c          | 806 +++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_glue.c          |  16 +
 drivers/net/mlx5/mlx5_glue.h          |   8 +
 drivers/net/mlx5/mlx5_rxq.c           |  88 +++-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
 drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
 drivers/net/mlx5/mlx5_trigger.c       |   8 -
 drivers/net/mlx5/mlx5_utils.h         |   6 +
 16 files changed, 842 insertions(+), 223 deletions(-)

-- 
2.13.3

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v4 01/11] net/mlx5: support 16 hardware priorities
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-17 15:14   ` [PATCH v4 02/11] net/mlx5: support GRE tunnel flow Xueming Li
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports the new 16 Verbs flow priorities by trying to create
a simple flow of priority 15. If 16 priorities are not available, fall
back to the traditional 8 priorities.

Verbs priority mapping:
			8 priorities	>=16 priorities
Control flow:		4-7		8-15
User normal flow:	1-3		4-7
User tunnel flow:	0-2		0-3
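
Read as code, the base priority works out as below (illustration only;
verbs_base_priority() is a made-up name paraphrasing the
mlx5_flow_update_priority() logic in the diff, before the per-hash-queue
flow_priority is added):

static unsigned int
verbs_base_priority(unsigned int attr_priority,
		    unsigned int max_verbs_prio, int tunnel)
{
	unsigned int priority = attr_priority * MLX5_VERBS_FLOW_PRIO_8;

	/* With only 8 Verbs priorities each group is half as wide. */
	if (max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
		priority /= 2;
	/* Non-tunnel flows sit below tunnel flows. */
	if (!tunnel)
		priority += max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8 ?
			    1 : MLX5_VERBS_FLOW_PRIO_8 / 2;
	return priority;
}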

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |  18 +++++++
 drivers/net/mlx5/mlx5.h         |   5 ++
 drivers/net/mlx5/mlx5_flow.c    | 113 +++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_trigger.c |   8 ---
 4 files changed, 116 insertions(+), 28 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 68783c3ac..5a0b8de85 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		priv->txqs_n = 0;
 		priv->txqs = NULL;
 	}
+	mlx5_flow_delete_drop_queue(dev);
 	if (priv->pd != NULL) {
 		assert(priv->ctx != NULL);
 		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
@@ -619,6 +620,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	unsigned int mps;
 	unsigned int cqe_comp;
 	unsigned int tunnel_en = 0;
+	unsigned int verb_priorities = 0;
 	int idx;
 	int i;
 	struct mlx5dv_context attrs_out = {0};
@@ -1006,6 +1008,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		mlx5_link_update(eth_dev, 0);
 		/* Store device configuration on private structure. */
 		priv->config = config;
+		/* Create drop queue. */
+		err = mlx5_flow_create_drop_queue(eth_dev);
+		if (err) {
+			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
+				eth_dev->data->port_id, strerror(rte_errno));
+			goto port_error;
+		}
+		/* Supported Verbs flow priority number detection. */
+		if (verb_priorities == 0)
+			verb_priorities = mlx5_get_max_verbs_prio(eth_dev);
+		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
+			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
+				eth_dev->data->port_id, verb_priorities);
+			goto port_error;
+		}
+		priv->config.max_verbs_prio = verb_priorities;
 		continue;
 port_error:
 		if (priv)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6ad41390a..670f6860f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -89,6 +89,7 @@ struct mlx5_dev_config {
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
+	unsigned int max_verbs_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
 	int txq_inline; /* Maximum packet size for inlining. */
@@ -105,6 +106,9 @@ enum mlx5_verbs_alloc_type {
 	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
 };
 
+/* 8 Verbs priorities. */
+#define MLX5_VERBS_FLOW_PRIO_8 8
+
 /**
  * Verbs allocator needs a context to know in the callback which kind of
  * resources it is allocating.
@@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
 
 /* mlx5_flow.c */
 
+unsigned int mlx5_get_max_verbs_prio(struct rte_eth_dev *dev);
 int mlx5_flow_validate(struct rte_eth_dev *dev,
 		       const struct rte_flow_attr *attr,
 		       const struct rte_flow_item items[],
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 968bef746..39c19abce 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -31,8 +31,8 @@
 #include "mlx5_prm.h"
 #include "mlx5_glue.h"
 
-/* Define minimal priority for control plane flows. */
-#define MLX5_CTRL_FLOW_PRIORITY 4
+/* Flow priority for control plane flows. */
+#define MLX5_CTRL_FLOW_PRIORITY 1
 
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
@@ -128,7 +128,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_UDPV4] = {
@@ -137,7 +137,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_IPV4] = {
@@ -145,7 +145,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV4),
 		.dpdk_rss_hf = (ETH_RSS_IPV4 |
 				ETH_RSS_FRAG_IPV4),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_TCPV6] = {
@@ -154,7 +154,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_UDPV6] = {
@@ -163,7 +163,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_IPV6] = {
@@ -171,13 +171,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV6),
 		.dpdk_rss_hf = (ETH_RSS_IPV6 |
 				ETH_RSS_FRAG_IPV6),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_ETH] = {
 		.hash_fields = 0,
 		.dpdk_rss_hf = 0,
-		.flow_priority = 3,
+		.flow_priority = 2,
 	},
 };
 
@@ -899,30 +899,50 @@ mlx5_flow_convert_allocate(unsigned int size, struct rte_flow_error *error)
  * Make inner packet matching with an higher priority from the non Inner
  * matching.
  *
+ * @param dev
+ *   Pointer to Ethernet device.
  * @param[in, out] parser
  *   Internal parser structure.
  * @param attr
  *   User flow attribute.
  */
 static void
-mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
+mlx5_flow_update_priority(struct rte_eth_dev *dev,
+			  struct mlx5_flow_parse *parser,
 			  const struct rte_flow_attr *attr)
 {
+	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
+	uint16_t priority;
 
+	/*			8 priorities	>= 16 priorities
+	 * Control flow:	4-7		8-15
+	 * User normal flow:	1-3		4-7
+	 * User tunnel flow:	0-2		0-3
+	 */
+	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
+	if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+		priority /= 2;
+	/*
+	 * Lower non-tunnel flow Verbs priority by 1 if only 8 Verbs
+	 * priorities are supported, by 4 otherwise.
+	 */
+	if (!parser->inner) {
+		if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+			priority += 1;
+		else
+			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
+	}
 	if (parser->drop) {
-		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
-			attr->priority +
-			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
+		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
+				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
 		return;
 	}
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		if (parser->queue[i].ibv_attr) {
-			parser->queue[i].ibv_attr->priority =
-				attr->priority +
-				hash_rxq_init[i].flow_priority -
-				(parser->inner ? 1 : 0);
-		}
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		parser->queue[i].ibv_attr->priority = priority +
+				hash_rxq_init[i].flow_priority;
 	}
 }
 
@@ -1157,7 +1177,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	 */
 	if (!parser->drop)
 		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(parser, attr);
+	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
@@ -3158,3 +3178,56 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 	}
 	return 0;
 }
+
+/**
+ * Detect number of Verbs flow priorities supported.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   Number of supported Verbs flow priorities.
+ */
+unsigned int
+mlx5_get_max_verbs_prio(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
+	struct {
+		struct ibv_flow_attr attr;
+		struct ibv_flow_spec_eth eth;
+		struct ibv_flow_spec_action_drop drop;
+	} flow_attr = {
+		.attr = {
+			.num_of_specs = 2,
+		},
+		.eth = {
+			.type = IBV_FLOW_SPEC_ETH,
+			.size = sizeof(struct ibv_flow_spec_eth),
+		},
+		.drop = {
+			.size = sizeof(struct ibv_flow_spec_action_drop),
+			.type = IBV_FLOW_SPEC_ACTION_DROP,
+		},
+	};
+	struct ibv_flow *flow;
+
+	do {
+		flow_attr.attr.priority = verb_priorities - 1;
+		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
+					      &flow_attr.attr);
+		if (flow) {
+			claim_zero(mlx5_glue->destroy_flow(flow));
+			/* Try more priorities. */
+			verb_priorities *= 2;
+		} else {
+			/* Failed, restore the last working number. */
+			verb_priorities /= 2;
+			break;
+		}
+	} while (1);
+	DRV_LOG(DEBUG, "port %u Verbs flow priorities: %d,"
+		" user flow priorities: %d",
+		dev->data->port_id, verb_priorities, MLX5_CTRL_FLOW_PRIORITY);
+	return verb_priorities;
+}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ee08c5677..fc56d1ee8 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	int ret;
 
 	dev->data->dev_started = 1;
-	ret = mlx5_flow_create_drop_queue(dev);
-	if (ret) {
-		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
 	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
 		dev->data->port_id);
 	rte_mempool_walk(mlx5_mp2mr_iter, priv);
@@ -202,7 +196,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	mlx5_traffic_disable(dev);
 	mlx5_txq_stop(dev);
 	mlx5_rxq_stop(dev);
-	mlx5_flow_delete_drop_queue(dev);
 	rte_errno = ret; /* Restore rte_errno. */
 	return -rte_errno;
 }
@@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	mlx5_rxq_stop(dev);
 	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
 		mlx5_mr_release(mr);
-	mlx5_flow_delete_drop_queue(dev);
 }
 
 /**
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 02/11] net/mlx5: support GRE tunnel flow
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
  2018-04-17 15:14   ` [PATCH v4 01/11] net/mlx5: support 16 hardware priorities Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-17 15:14   ` [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev
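
For context, a rule exercising the new GRE item could be built along
these lines (a sketch against the generic rte_flow API, not code from
this patch; port_id and the queue index are placeholders):

#include <rte_flow.h>

/* Match GRE-encapsulated inner IPv4 and steer it to queue 0. */
struct rte_flow_attr attr = { .ingress = 1 };
struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 }, /* outer, proto forced to 47 */
	{ .type = RTE_FLOW_ITEM_TYPE_GRE },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 }, /* inner */
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};
struct rte_flow_action_queue queue = { .index = 0 };
struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
struct rte_flow_error flow_err;
struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern,
					actions, &flow_err);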

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 101 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 94 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 39c19abce..771d5f14d 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -37,6 +37,7 @@
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
 #define MLX5_IPV6 6
+#define MLX5_GRE 47
 
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_flow_spec_counter_action {
@@ -89,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
 
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item,
+		     const void *default_mask,
+		     struct mlx5_flow_data *data);
+
 struct mlx5_flow_parse;
 
 static void
@@ -231,6 +237,10 @@ struct rte_flow {
 		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
 	}
 
+#define IS_TUNNEL(type) ( \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_GRE)
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -284,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
 static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
-			       RTE_FLOW_ITEM_TYPE_VXLAN),
+			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
@@ -316,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV4] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv4){
 			.hdr = {
@@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV6] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv6){
 			.hdr = {
@@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_tcp,
 		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
 	},
+	[RTE_FLOW_ITEM_TYPE_GRE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_gre){
+			.protocol = -1,
+		},
+		.default_mask = &rte_flow_item_gre_mask,
+		.mask_sz = sizeof(struct rte_flow_item_gre),
+		.convert = mlx5_flow_create_gre,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
 		.actions = valid_actions,
@@ -401,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 
 /** Structure to pass to the conversion function. */
 struct mlx5_flow_parse {
-	uint32_t inner; /**< Set once VXLAN is encountered. */
+	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
 	uint32_t create:1;
 	/**< Whether resources should remain after a validate. */
 	uint32_t drop:1; /**< Target is a drop queue. */
@@ -829,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 					      cur_item->mask_sz);
 		if (ret)
 			goto exit_item_not_supported;
-		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+		if (IS_TUNNEL(items->type)) {
 			if (parser->inner) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
-						   "cannot recognize multiple"
-						   " VXLAN encapsulations");
+						   "Cannot recognize multiple"
+						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
@@ -1641,6 +1667,67 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert GRE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
+		     const void *default_mask __rte_unused,
+		     struct mlx5_flow_data *data)
+{
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel tunnel = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	struct ibv_flow_spec_ipv4_ext *ipv4;
+	struct ibv_flow_spec_ipv6 *ipv6;
+	unsigned int i;
+
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	/* Update encapsulation IP layer protocol. */
+	for (i = 0; i != hash_rxq_init_n; ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		if (parser->out_layer == HASH_RXQ_IPV4) {
+			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv4_ext));
+			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
+				break;
+			ipv4->val.proto = MLX5_GRE;
+			ipv4->mask.proto = 0xff;
+		} else if (parser->out_layer == HASH_RXQ_IPV6) {
+			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv6));
+			if (ipv6->mask.next_hdr &&
+			    ipv6->val.next_hdr != MLX5_GRE)
+				break;
+			ipv6->val.next_hdr = MLX5_GRE;
+			ipv6->mask.next_hdr = 0xff;
+		}
+	}
+	if (i != hash_rxq_init_n)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "IP protocol of GRE must be 47");
+	mlx5_flow_create_copy(parser, &tunnel, size);
+	return 0;
+}
+
+/**
  * Convert mark/flag action to Verbs specification.
  *
  * @param parser
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (2 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 02/11] net/mlx5: support GRE tunnel flow Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  6:48     ` Nélio Laranjeiro
  2018-04-17 15:14   ` [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports L3 VXLAN, which carries no inner L2 header compared
to the standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP
destination port to discriminate against standard VXLAN; the firmware
has to be configured to support it:
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
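
Once the knobs above are set, a matching rule simply omits the inner
Ethernet item (a sketch; UDP destination port 4790 is a placeholder and
must equal the configured IP_OVER_VXLAN_PORT):

/* L3 VXLAN: inner IPv4 follows the VXLAN item directly, no inner ETH. */
struct rte_flow_item_udp udp_spec = {
	.hdr = { .dst_port = RTE_BE16(4790) },
};
struct rte_flow_item_udp udp_mask = {
	.hdr = { .dst_port = RTE_BE16(0xffff) },
};
struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP,
	  .spec = &udp_spec, .mask = &udp_mask },
	{ .type = RTE_FLOW_ITEM_TYPE_VXLAN },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 }, /* inner L3, no inner L2 */
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};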

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 771d5f14d..d7a921dff 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
+			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_vxlan){
 			.vni = "\xff\xff\xff",
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (3 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  6:50     ` Nélio Laranjeiro
  2018-04-17 15:14   ` [PATCH v4 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch introduces tunnel type identification based on flow rules.
If flows of multiple tunnel types are built on the same queue,
RTE_PTYPE_TUNNEL_MASK will be returned; the user application can use
bits in the flow mark as a tunnel type identifier.
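
On the receive side an application can then classify packets as below
(a sketch; m is the received struct rte_mbuf *, and the handle_*()
callbacks and mark-bit layout are application-defined):

uint32_t tun = m->packet_type & RTE_PTYPE_TUNNEL_MASK;

if (tun == RTE_PTYPE_TUNNEL_VXLAN) {
	handle_vxlan(m);
} else if (tun == RTE_PTYPE_TUNNEL_GRE) {
	handle_gre(m);
} else if (tun == RTE_PTYPE_TUNNEL_MASK) {
	/* Several tunnel types share this Rx queue: fall back to the
	 * flow mark set by the matching rule. */
	if (m->ol_flags & PKT_RX_FDIR_ID)
		handle_by_mark(m, m->hash.fdir.hi);
}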

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          | 127 +++++++++++++++++++++++++++++-----
 drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
 drivers/net/mlx5/mlx5_rxtx.c          |  12 ++--
 drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +++---
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +++--
 6 files changed, 159 insertions(+), 38 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index d7a921dff..59eb80655 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -225,6 +225,7 @@ struct rte_flow {
 	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
 	uint16_t (*queues)[]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
 	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats. */
 	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
@@ -241,6 +242,19 @@ struct rte_flow {
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
+const uint32_t flow_ptype[] = {
+	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
+};
+
+#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
+
+const uint32_t ptype_ext[] = {
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
+					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
+};
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -440,6 +454,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
 		struct ibv_flow_attr *ibv_attr;
@@ -858,7 +873,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 		if (ret)
 			goto exit_item_not_supported;
 		if (IS_TUNNEL(items->type)) {
-			if (parser->inner) {
+			if (parser->tunnel) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
@@ -867,6 +882,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
+			parser->tunnel = flow_ptype[items->type];
 		}
 		if (parser->drop) {
 			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
@@ -1175,6 +1191,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	}
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
+	parser->tunnel = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.parser = parser,
@@ -1641,6 +1658,7 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1697,6 +1715,7 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 	unsigned int i;
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1903,7 +1922,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
@@ -1912,7 +1932,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -1924,6 +1945,40 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 }
 
 /**
+ * RXQ update after flow rule creation.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the flow rule.
+ */
+static void
+mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+
+	if (!dev->data->dev_started)
+		return;
+	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		rxq_data->mark |= flow->mark;
+		if (!tunnel)
+			continue;
+		rxq_ctrl->tunnel_types[tunnel] += 1;
+		if (rxq_data->tunnel != flow->tunnel)
+			rxq_data->tunnel = rxq_data->tunnel ?
+					   RTE_PTYPE_TUNNEL_MASK :
+					   flow->tunnel;
+	}
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -1983,12 +2038,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 				   NULL, "internal error in flow creation");
 		goto error;
 	}
-	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
-		struct mlx5_rxq_data *q =
-			(*priv->rxqs)[parser->rss_conf.queue[i]];
-
-		q->mark |= parser->mark;
-	}
+	mlx5_flow_create_update_rxqs(dev, flow);
 	return 0;
 error:
 	ret = rte_errno; /* Save rte_errno before cleanup. */
@@ -2061,6 +2111,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
 	}
 	/* Copy configuration. */
 	flow->queues = (uint16_t (*)[])(flow + 1);
+	flow->tunnel = parser.tunnel;
 	flow->rss_conf = (struct rte_flow_action_rss){
 		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
 		.level = 0,
@@ -2152,9 +2203,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
 	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
 
-	if (flow->drop || !flow->mark)
+	if (flow->drop || !dev->data->dev_started)
 		goto free;
-	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
+		/* Update queue tunnel type. */
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
+		rxq_ctrl->tunnel_types[tunnel] -= 1;
+		if (!rxq_ctrl->tunnel_types[tunnel]) {
+			/* Update tunnel type. */
+			uint8_t j;
+			uint8_t types = 0;
+			uint8_t last;
+
+			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
+				if (rxq_ctrl->tunnel_types[j]) {
+					types += 1;
+					last = j;
+				}
+			/* Keep as is if more than one tunnel type left. */
+			if (types == 1)
+				rxq_data->tunnel = ptype_ext[last];
+			else if (types == 0)
+				/* No tunnel type left. */
+				rxq_data->tunnel = 0;
+		}
+	}
+	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
 		struct rte_flow *tmp;
 		int mark = 0;
 
@@ -2373,9 +2453,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct rte_flow *flow;
+	unsigned int i;
 
 	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
-		unsigned int i;
 		struct mlx5_ind_table_ibv *ind_tbl = NULL;
 
 		if (flow->drop) {
@@ -2421,6 +2501,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
 			(void *)flow);
 	}
+	/* Cleanup Rx queue tunnel info. */
+	for (i = 0; i != priv->rxqs_n; ++i) {
+		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(q, struct mlx5_rxq_ctrl, rxq);
+
+		if (!q)
+			continue;
+		memset((void *)rxq_ctrl->tunnel_types, 0,
+		       sizeof(rxq_ctrl->tunnel_types));
+		q->tunnel = 0;
+	}
 }
 
 /**
@@ -2468,7 +2560,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
@@ -2476,7 +2569,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
@@ -2498,10 +2592,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			DRV_LOG(DEBUG, "port %u flow %p applied",
 				dev->data->port_id, (void *)flow);
 		}
-		if (!flow->mark)
-			continue;
-		for (i = 0; i != flow->rss_conf.queue_num; ++i)
-			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
+		mlx5_flow_create_update_rxqs(dev, flow);
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 18ad40813..1fbd02aa0 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->qp = qp;
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
+	hrxq->tunnel = tunnel;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->hash_fields != hash_fields)
 			continue;
+		if (hrxq->tunnel != tunnel)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 05fe10918..fafac514b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -34,7 +34,7 @@
 #include "mlx5_prm.h"
 
 static __rte_always_inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
 
 static __rte_always_inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
@@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
 	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_L4_UDP;
 	/* Tunneled - L3 */
+	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
 	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
 	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
+	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
 	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
@@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 /**
  * Translate RX completion flags to packet type.
  *
+ * @param[in] rxq
+ *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   Packet type for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 {
 	uint8_t idx;
 	uint8_t pinfo = cqe->pkt_info;
@@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
 	 * bit[7] = outer_l3_type
 	 */
 	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
-	return mlx5_ptype_table[idx];
+	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
 }
 
 /**
@@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
 			/* Update packet information. */
-			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
+			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
 			pkt->ol_flags = 0;
 			if (rss_hash_res && rxq->rss_hash) {
 				pkt->hash.rss = rss_hash_res;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index ee534c340..676ad6a9a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -104,6 +104,7 @@ struct mlx5_rxq_data {
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
 	uint8_t cq_arm_sn; /* CQ arm seq number. */
+	uint32_t tunnel; /* Tunnel information. */
 } __rte_cache_aligned;
 
 /* Verbs Rx queue elements. */
@@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
 	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
 	struct mlx5_rxq_data rxq; /* Data path structure. */
 	unsigned int socket; /* CPU socket ID for allocations. */
+	uint32_t tunnel_types[16]; /* Tunnel type counter. */
 	unsigned int irq:1; /* Whether IRQ is enabled. */
 	uint16_t idx; /* Queue index. */
 };
@@ -145,6 +147,7 @@ struct mlx5_hrxq {
 	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
+	uint32_t tunnel; /* Tunnel type. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
 struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 84817e7ad..d21e99f68 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
 	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	if (rxq->mark) {
 		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
@@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	ptype = vshrn_n_u32(ptype_info, 10);
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	ptype = vorr_u16(ptype, op_err);
-	pkts[0]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
-	pkts[1]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
-	pkts[2]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
-	pkts[3]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
+	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
+	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
+	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
+	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
 	pinfo = vreinterpretq_u32_u8(
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 83d6e431f..4a6789a78 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	const __m128i mbuf_init =
 		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	/* Extract pkt_info field. */
 	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
@@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	op_err = _mm_srli_epi16(op_err, 8);
 	ptype = _mm_or_si128(ptype, op_err);
-	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
-	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
-	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
-	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
+	pt_idx0 = _mm_extract_epi8(ptype, 0);
+	pt_idx1 = _mm_extract_epi8(ptype, 2);
+	pt_idx2 = _mm_extract_epi8(ptype, 4);
+	pt_idx3 = _mm_extract_epi8(ptype, 6);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
 	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 05/11] net/mlx5: cleanup tunnel checksum offloads
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (4 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-17 15:14   ` [PATCH v4 06/11] net/mlx5: split flow RSS handling logic Xueming Li
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch cleans up the tunnel checksum offloads.

Once a tunnel packet type (RTE_PTYPE_TUNNEL_xxx) is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum
result of the inner headers; the outer L3 and L4 header checksums are
always valid as soon as a tunnel is identified. If no tunnel is
identified, PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the
checksum result of the outer L3 and L4 headers.
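
For clarity, a minimal sketch of how an application consumes these
semantics (standard mbuf API; the helper name is illustrative):

	#include <rte_mbuf.h>

	/*
	 * Illustrative helper: when a tunnel ptype is reported, the
	 * PKT_RX_*_CKSUM_* flags cover the inner headers and the outer
	 * L3/L4 checksums are already validated; otherwise the same
	 * flags cover the outer headers.
	 */
	static inline int
	rx_csum_flags_are_inner(const struct rte_mbuf *m)
	{
		return (m->packet_type & RTE_PTYPE_TUNNEL_MASK) != 0;
	}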

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 --
 drivers/net/mlx5/mlx5_rxtx.c | 18 ++++--------------
 drivers/net/mlx5/mlx5_rxtx.h |  1 -
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1fbd02aa0..6756f25fa 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1045,8 +1045,6 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	}
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM);
-	tmpl->rxq.csum_l2tun = (!!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM) &&
-				priv->config.tunnel_en);
 	tmpl->rxq.hw_timestamp = !!(conf->offloads & DEV_RX_OFFLOAD_TIMESTAMP);
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(conf->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fafac514b..060ff0e85 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -41,7 +41,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 		 uint16_t cqe_cnt, uint32_t *rss_hash);
 
 static __rte_always_inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
 	[0xff] = RTE_PTYPE_ALL_MASK, /* Last entry for errored packet. */
@@ -1728,8 +1728,6 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 /**
  * Translate RX completion flags to offload flags.
  *
- * @param[in] rxq
- *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1737,7 +1735,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
  *   Offload flags (ol_flags) for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe)
 {
 	uint32_t ol_flags = 0;
 	uint16_t flags = rte_be_to_cpu_16(cqe->hdr_type_etc);
@@ -1749,14 +1747,6 @@ rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 		TRANSPOSE(flags,
 			  MLX5_CQE_RX_L4_HDR_VALID,
 			  PKT_RX_L4_CKSUM_GOOD);
-	if ((cqe->pkt_info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
-		ol_flags |=
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L3_HDR_VALID,
-				  PKT_RX_IP_CKSUM_GOOD) |
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L4_HDR_VALID,
-				  PKT_RX_L4_CKSUM_GOOD);
 	return ol_flags;
 }
 
@@ -1855,8 +1845,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						mlx5_flow_mark_get(mark);
 				}
 			}
-			if (rxq->csum | rxq->csum_l2tun)
-				pkt->ol_flags |= rxq_cq_to_ol_flags(rxq, cqe);
+			if (rxq->csum)
+				pkt->ol_flags |= rxq_cq_to_ol_flags(cqe);
 			if (rxq->vlan_strip &&
 			    (cqe->hdr_type_etc &
 			     rte_cpu_to_be_16(MLX5_CQE_VLAN_STRIPPED))) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 676ad6a9a..188fd65c5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -77,7 +77,6 @@ struct rxq_zip {
 /* RX queue descriptor. */
 struct mlx5_rxq_data {
 	unsigned int csum:1; /* Enable checksum offloading. */
-	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_timestamp:1; /* Enable HW timestamp. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	unsigned int crc_present:1; /* CRC must be subtracted. */
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 06/11] net/mlx5: split flow RSS handling logic
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (5 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-17 15:14   ` [PATCH v4 07/11] net/mlx5: support tunnel RSS level Xueming Li
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch splits the flow RSS hash field handling logic out into a
dedicated function.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 126 +++++++++++++++++++++++--------------------
 1 file changed, 68 insertions(+), 58 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 59eb80655..c9cda86a5 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -999,59 +999,8 @@ mlx5_flow_update_priority(struct rte_eth_dev *dev,
 static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
 
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-		return;
-	}
-	if (parser->layer == HASH_RXQ_ETH) {
-		goto fill;
-	} else {
-		/*
-		 * This layer becomes useless as the pattern define under
-		 * layers.
-		 */
-		rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-		parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	}
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if ((i == parser->layer) ||
-			     (!parser->queue[i].ibv_attr))
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-	} else  if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
-		return;
-	}
-fill:
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1114,6 +1063,66 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 }
 
 /**
+ * Update flows according to pattern and RSS hash fields.
+ *
+ * @param[in, out] parser
+ *   Internal parser structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
+{
+	const unsigned int ipv4 =
+		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
+	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
+	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
+	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
+	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	unsigned int i;
+
+	/* Remove any other flow not matching the pattern. */
+	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (i == HASH_RXQ_ETH)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+		return 0;
+	}
+	if (parser->layer == HASH_RXQ_ETH)
+		return 0;
+	/* This layer becomes useless as the pattern define under layers. */
+	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
+	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
+	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
+	for (i = ohmin; i != (ohmax + 1); ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		rte_free(parser->queue[i].ibv_attr);
+		parser->queue[i].ibv_attr = NULL;
+	}
+	/* Remove impossible flow according to the RSS configuration. */
+	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
+	    parser->rss_conf.types) {
+		/* Remove any other flow. */
+		for (i = hmin; i != (hmax + 1); ++i) {
+			if (i == parser->layer || !parser->queue[i].ibv_attr)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+	} else if (!parser->queue[ip].ibv_attr) {
+		/* no RSS possible with the current configuration. */
+		parser->rss_conf.queue_num = 1;
+	}
+	return 0;
+}
+
+/**
  * Validate and convert a flow supported by the NIC.
  *
  * @param dev
@@ -1209,6 +1218,14 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (ret)
 			goto exit_free;
 	}
+	if (!parser->drop) {
+		/* RSS check, remove unused hash types. */
+		ret = mlx5_flow_convert_rss(parser);
+		if (ret)
+			goto exit_free;
+		/* Complete missing specification. */
+		mlx5_flow_convert_finalise(parser);
+	}
+	mlx5_flow_update_priority(dev, parser, attr);
 	if (parser->mark)
 		mlx5_flow_create_flag_mark(parser, parser->mark_id);
 	if (parser->count && parser->create) {
@@ -1216,13 +1233,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (!parser->cs)
 			goto exit_count_error;
 	}
-	/*
-	 * Last step. Complete missing specification to reach the RSS
-	 * configuration.
-	 */
-	if (!parser->drop)
-		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 07/11] net/mlx5: support tunnel RSS level
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (6 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 06/11] net/mlx5: split flow RSS handling logic Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  6:55     ` Nélio Laranjeiro
  2018-04-17 15:14   ` [PATCH v4 08/11] net/mlx5: add hardware flow debug dump Xueming Li
                     ` (3 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

The tunnel RSS level of the flow RSS action offers the user a choice of
doing the RSS hash calculation on the inner or the outer fields. Testpmd
flow command examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 0 / end
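
In rte_flow API terms the same choice is carried by the level field of
struct rte_flow_action_rss; the implementation below treats a level
greater than 1 as inner RSS. A minimal sketch (queue numbers are
illustrative):

	uint16_t queues[] = { 1, 2 };
	struct rte_flow_action_rss rss = {
		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
		.level = 2,	/* > 1 selects the inner headers */
		.types = ETH_RSS_IP,
		.key_len = 0,	/* fall back to the default RSS key */
		.queue_num = 2,
		.queue = queues,
	};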

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/Makefile    |   2 +-
 drivers/net/mlx5/mlx5_flow.c | 257 +++++++++++++++++++++++++++----------------
 drivers/net/mlx5/mlx5_glue.c |  16 +++
 drivers/net/mlx5/mlx5_glue.h |   8 ++
 drivers/net/mlx5/mlx5_rxq.c  |  58 +++++++++-
 drivers/net/mlx5/mlx5_rxtx.h |   5 +-
 6 files changed, 240 insertions(+), 106 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index b710a10f5..d9447ace9 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -8,7 +8,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_pmd_mlx5.a
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
-LIB_GLUE_VERSION = 18.02.0
+LIB_GLUE_VERSION = 18.05.0
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c9cda86a5..a6791c525 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -116,6 +116,7 @@ enum hash_rxq_type {
 	HASH_RXQ_UDPV6,
 	HASH_RXQ_IPV6,
 	HASH_RXQ_ETH,
+	HASH_RXQ_TUNNEL,
 };
 
 /* Initialization data for hash RX queue. */
@@ -454,6 +455,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	enum hash_rxq_type out_layer; /**< Last outer pattern layer detected. */
 	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
@@ -461,6 +463,7 @@ struct mlx5_flow_parse {
 		/**< Pointer to Verbs attributes. */
 		unsigned int offset;
 		/**< Current position or total size of the attribute. */
+		uint64_t hash_fields; /**< Verbs hash fields. */
 	} queue[RTE_DIM(hash_rxq_init)];
 };
 
@@ -696,7 +699,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " function is Toeplitz");
 				return -rte_errno;
 			}
-			if (rss->level) {
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+			if (parser->rss_conf.level > 1) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
 						   actions,
@@ -704,6 +708,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " level is not supported");
 				return -rte_errno;
 			}
+#endif
+			if (parser->rss_conf.level > 2) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION,
+						   actions,
+						   "RSS encapsulation level"
+						   " > 1 is not supported");
+				return -rte_errno;
+			}
 			if (rss->types & MLX5_RSS_HF_MASK) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
@@ -754,7 +767,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 			}
 			parser->rss_conf = (struct rte_flow_action_rss){
 				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
-				.level = 0,
+				.level = rss->level,
 				.types = rss->types,
 				.key_len = rss_key_len,
 				.queue_num = rss->queue_num,
@@ -838,10 +851,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
+mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
+				 const struct rte_flow_item items[],
 				 struct rte_flow_error *error,
 				 struct mlx5_flow_parse *parser)
 {
+	struct priv *priv = dev->data->dev_private;
 	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
 	unsigned int i;
 	int ret = 0;
@@ -881,6 +896,14 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
+			if (!priv->config.tunnel_en &&
+			    parser->rss_conf.level > 1) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ITEM,
+					items,
+					"RSS on tunnel is not supported");
+				return -rte_errno;
+			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
 			parser->tunnel = flow_ptype[items->type];
 		}
@@ -1000,7 +1023,11 @@ static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
 	unsigned int i;
+	uint32_t inner = parser->inner;
 
+	/* Don't create extra flows for outer RSS. */
+	if (parser->tunnel && parser->rss_conf.level < 2)
+		return;
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1011,23 +1038,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 			struct ibv_flow_spec_ipv4_ext ipv4;
 			struct ibv_flow_spec_ipv6 ipv6;
 			struct ibv_flow_spec_tcp_udp udp_tcp;
+			struct ibv_flow_spec_eth eth;
 		} specs;
 		void *dst;
 		uint16_t size;
 
 		if (i == parser->layer)
 			continue;
-		if (parser->layer == HASH_RXQ_ETH) {
+		if (parser->layer == HASH_RXQ_ETH ||
+		    parser->layer == HASH_RXQ_TUNNEL) {
 			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
 				size = sizeof(struct ibv_flow_spec_ipv4_ext);
 				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
-					.type = IBV_FLOW_SPEC_IPV4_EXT,
+					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
 					.size = size,
 				};
 			} else {
 				size = sizeof(struct ibv_flow_spec_ipv6);
 				specs.ipv6 = (struct ibv_flow_spec_ipv6){
-					.type = IBV_FLOW_SPEC_IPV6,
+					.type = inner | IBV_FLOW_SPEC_IPV6,
 					.size = size,
 				};
 			}
@@ -1044,7 +1073,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
 			size = sizeof(struct ibv_flow_spec_tcp_udp);
 			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
-				.type = ((i == HASH_RXQ_UDPV4 ||
+				.type = inner | ((i == HASH_RXQ_UDPV4 ||
 					  i == HASH_RXQ_UDPV6) ?
 					 IBV_FLOW_SPEC_UDP :
 					 IBV_FLOW_SPEC_TCP),
@@ -1074,50 +1103,93 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 static int
 mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
-
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
+	enum hash_rxq_type start;
+	enum hash_rxq_type layer;
+	int outer = parser->tunnel && parser->rss_conf.level < 2;
+	uint64_t rss = parser->rss_conf.types;
+
+	/* Default to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	layer = outer ? parser->out_layer : parser->layer;
+	if (layer == HASH_RXQ_TUNNEL)
+		layer = HASH_RXQ_ETH;
+	if (outer) {
+		/* Only one hash type for outer RSS. */
+		if (rss && layer == HASH_RXQ_ETH) {
+			start = HASH_RXQ_TCPV4;
+		} else if (rss && layer != HASH_RXQ_ETH &&
+			   !(rss & hash_rxq_init[layer].dpdk_rss_hf)) {
+			/* If RSS not match L4 pattern, try L3 RSS. */
+			if (layer < HASH_RXQ_IPV4)
+				layer = HASH_RXQ_IPV4;
+			else if (layer > HASH_RXQ_IPV4 && layer < HASH_RXQ_IPV6)
+				layer = HASH_RXQ_IPV6;
+			start = layer;
+		} else {
+			start = layer;
+		}
+		/* Scan first valid hash type. */
+		for (i = start; rss && i <= layer; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss)
+				break;
 		}
-		return 0;
-	}
-	if (parser->layer == HASH_RXQ_ETH)
-		return 0;
-	/* This layer becomes useless as the pattern define under layers. */
-	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if (i == parser->layer || !parser->queue[i].ibv_attr)
+		if (rss && i <= layer)
+			parser->queue[layer].hash_fields =
+					hash_rxq_init[i].hash_fields;
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (parser->queue[i].ibv_attr && i != layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			}
+		}
+	} else {
+		/* Expand for inner or normal RSS. */
+		if (rss && (layer == HASH_RXQ_ETH || layer == HASH_RXQ_IPV4))
+			start = HASH_RXQ_TCPV4;
+		else if (rss && layer == HASH_RXQ_IPV6)
+			start = HASH_RXQ_TCPV6;
+		else
+			start = layer;
+		/* For L4 pattern, try L3 RSS if no L4 RSS. */
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (i < start || i > layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+				continue;
+			}
+			if (!rss)
+				continue;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss) {
+				parser->queue[i].hash_fields =
+						hash_rxq_init[i].hash_fields;
+			} else if (i != layer) {
+				/* Remove unused RSS expansion. */
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			} else if (layer < HASH_RXQ_IPV4 &&
+				   (hash_rxq_init[HASH_RXQ_IPV4].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv4 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV4]
+						.hash_fields;
+			} else if (i > HASH_RXQ_IPV4 && i < HASH_RXQ_IPV6 &&
+				   (hash_rxq_init[HASH_RXQ_IPV6].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv6 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV6]
+						.hash_fields;
+			}
 		}
-	} else if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
 	}
 	return 0;
 }
@@ -1165,7 +1237,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
 	if (ret)
 		return ret;
-	ret = mlx5_flow_convert_items_validate(items, error, parser);
+	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
 	if (ret)
 		return ret;
 	mlx5_flow_convert_finalise(parser);
@@ -1186,10 +1258,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		for (i = 0; i != hash_rxq_init_n; ++i) {
 			unsigned int offset;
 
-			if (!(parser->rss_conf.types &
-			      hash_rxq_init[i].dpdk_rss_hf) &&
-			    (i != HASH_RXQ_ETH))
-				continue;
 			offset = parser->queue[i].offset;
 			parser->queue[i].ibv_attr =
 				mlx5_flow_convert_allocate(offset, error);
@@ -1201,6 +1269,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
 	parser->tunnel = 0;
+	parser->layer = HASH_RXQ_ETH;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.parser = parser,
@@ -1280,17 +1349,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse *parser, void *src,
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
 			continue;
-		/* Specification must be the same l3 type or none. */
-		if (parser->layer == HASH_RXQ_ETH ||
-		    (hash_rxq_init[parser->layer].ip_version ==
-		     hash_rxq_init[i].ip_version) ||
-		    (hash_rxq_init[i].ip_version == 0)) {
-			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
-					parser->queue[i].offset);
-			memcpy(dst, src, size);
-			++parser->queue[i].ibv_attr->num_of_specs;
-			parser->queue[i].offset += size;
-		}
+		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset);
+		memcpy(dst, src, size);
+		++parser->queue[i].ibv_attr->num_of_specs;
+		parser->queue[i].offset += size;
 	}
 }
 
@@ -1321,9 +1384,7 @@ mlx5_flow_create_eth(const struct rte_flow_item *item,
 		.size = eth_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_ETH;
+	parser->layer = HASH_RXQ_ETH;
 	if (spec) {
 		unsigned int i;
 
@@ -1434,9 +1495,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		.size = ipv4_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV4;
+	parser->layer = HASH_RXQ_IPV4;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1489,9 +1548,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		.size = ipv6_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV6;
+	parser->layer = HASH_RXQ_IPV6;
 	if (spec) {
 		unsigned int i;
 		uint32_t vtc_flow_val;
@@ -1564,13 +1621,10 @@ mlx5_flow_create_udp(const struct rte_flow_item *item,
 		.size = udp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_UDPV4;
-		else
-			parser->layer = HASH_RXQ_UDPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_UDPV4;
+	else
+		parser->layer = HASH_RXQ_UDPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1613,13 +1667,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item *item,
 		.size = tcp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_TCPV4;
-		else
-			parser->layer = HASH_RXQ_TCPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_TCPV4;
+	else
+		parser->layer = HASH_RXQ_TCPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1669,6 +1720,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1726,6 +1782,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default GRE to inner RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 2;
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1917,33 +1978,33 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 	unsigned int i;
 
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		uint64_t hash_fields;
-
 		if (!parser->queue[i].ibv_attr)
 			continue;
 		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
 		parser->queue[i].ibv_attr = NULL;
-		hash_fields = hash_rxq_init[i].hash_fields;
+		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
 		if (!priv->dev->data->dev_started)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_get(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_new(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2040,7 +2101,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
 			dev->data->port_id,
 			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq,
+			(void *)flow->frxq[i].hrxq->qp,
 			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
@@ -2568,19 +2629,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_get(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_new(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index a771ac4c7..cd2716352 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 	return mlx5dv_init_obj(obj, obj_type);
 }
 
+static struct ibv_qp *
+mlx5_glue_dv_create_qp(struct ibv_context *context,
+		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
+{
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
+#else
+	(void)context;
+	(void)qp_init_attr_ex;
+	(void)dv_qp_init_attr;
+	return NULL;
+#endif
+}
+
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
@@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.dv_query_device = mlx5_glue_dv_query_device,
 	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
 	.dv_init_obj = mlx5_glue_dv_init_obj,
+	.dv_create_qp = mlx5_glue_dv_create_qp,
 };
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 33385d226..9f36af81a 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+struct mlx5dv_qp_init_attr;
+#endif
+
 /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
 	const char *version;
@@ -106,6 +110,10 @@ struct mlx5_glue {
 				   enum mlx5dv_set_ctx_attr_type type,
 				   void *attr);
 	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
+	struct ibv_qp *(*dv_create_qp)
+		(struct ibv_context *context,
+		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
 };
 
 const struct mlx5_glue *mlx5_glue;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 6756f25fa..58403b5b6 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1385,7 +1385,9 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type, implies tunnel offloading like inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
 	struct mlx5_ind_table_ibv *ind_tbl;
 	struct ibv_qp *qp;
 	int err;
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	struct mlx5dv_qp_init_attr qp_init_attr = {0};
+#endif
 
 	queues_n = hash_fields ? queues_n : 1;
 	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
@@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		return NULL;
 	}
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	if (tunnel) {
+		qp_init_attr.comp_mask =
+				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
+		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
+	}
+	qp = mlx5_glue->dv_create_qp(
+		priv->ctx,
+		&(struct ibv_qp_init_attr_ex){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_QP_INIT_ATTR_PD |
+				IBV_QP_INIT_ATTR_IND_TABLE |
+				IBV_QP_INIT_ATTR_RX_HASH,
+			.rx_hash_conf = (struct ibv_rx_hash_conf){
+				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
+				.rx_hash_fields_mask = hash_fields |
+					(tunnel && rss_level > 1 ?
+					(uint32_t)IBV_RX_HASH_INNER : 0),
+			},
+			.rwq_ind_tbl = ind_tbl->ind_table,
+			.pd = priv->pd,
+		},
+		&qp_init_attr);
+#else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
 		 &(struct ibv_qp_init_attr_ex){
@@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 				IBV_QP_INIT_ATTR_RX_HASH,
 			.rx_hash_conf = (struct ibv_rx_hash_conf){
 				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
-				.rx_hash_key_len = rss_key_len,
-				.rx_hash_key = (void *)(uintptr_t)rss_key,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
 				.rx_hash_fields_mask = hash_fields,
 			},
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+#endif
 	if (!qp) {
 		rte_errno = errno;
 		goto error;
@@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
 	hrxq->tunnel = tunnel;
+	hrxq->rss_level = rss_level;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1468,7 +1509,9 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type, implies tunnel offloading like inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level
  *
  * @return
  *   An hash Rx queue on success.
@@ -1477,7 +1520,8 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1494,6 +1538,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->tunnel != tunnel)
 			continue;
+		if (hrxq->rss_level != rss_level)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 188fd65c5..07b3adfae 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -147,6 +147,7 @@ struct mlx5_hrxq {
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
 	uint32_t tunnel; /* Tunnel type. */
+	uint32_t rss_level; /* RSS on tunnel level. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 08/11] net/mlx5: add hardware flow debug dump
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (7 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 07/11] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  6:57     ` Nélio Laranjeiro
  2018-04-17 15:14   ` [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Dump Verbs flow details, including the flow spec type and size, for
debugging purposes.
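
Note the dump is compiled in only when NDEBUG is unset, i.e. with the
PMD debug option enabled in the build configuration:

  CONFIG_RTE_LIBRTE_MLX5_DEBUG=y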

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_rxq.c   | 25 +++++++++++++---
 drivers/net/mlx5/mlx5_utils.h |  6 ++++
 3 files changed, 85 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a6791c525..371d029c8 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -2050,6 +2050,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
 }
 
 /**
+ * Dump flow hash RX queue detail.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the rte_flow.
+ * @param i
+ *   Hash RX queue index.
+ */
+static void
+mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
+	       struct rte_flow *flow __rte_unused,
+	       unsigned int i __rte_unused)
+{
+#ifndef NDEBUG
+	uintptr_t spec_ptr;
+	uint16_t j;
+	char buf[256];
+	uint8_t off;
+
+	spec_ptr = (uintptr_t)(flow->frxq[i].ibv_attr + 1);
+	for (j = 0, off = 0; j < flow->frxq[i].ibv_attr->num_of_specs;
+	     j++) {
+		struct ibv_flow_spec *spec = (void *)spec_ptr;
+		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
+			       spec->hdr.size);
+		spec_ptr += spec->hdr.size;
+	}
+	DRV_LOG(DEBUG,
+		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p, hash:%lx/%u"
+		" specs:%hhu(%hu), priority:%hu, type:%d, flags:%x,"
+		" comp_mask:%x specs:%s",
+		dev->data->port_id, (void *)flow, i,
+		(void *)flow->frxq[i].hrxq,
+		(void *)flow->frxq[i].hrxq->qp,
+		(void *)flow->frxq[i].hrxq->ind_table,
+		flow->frxq[i].hash_fields |
+		(flow->tunnel &&
+		 flow->rss_conf.level > 1 ? (uint32_t)IBV_RX_HASH_INNER : 0),
+		flow->rss_conf.queue_num,
+		flow->frxq[i].ibv_attr->num_of_specs,
+		flow->frxq[i].ibv_attr->size,
+		flow->frxq[i].ibv_attr->priority,
+		flow->frxq[i].ibv_attr->type,
+		flow->frxq[i].ibv_attr->flags,
+		flow->frxq[i].ibv_attr->comp_mask,
+		buf);
+#endif
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2091,6 +2142,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		flow->frxq[i].ibv_flow =
 			mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 					       flow->frxq[i].ibv_attr);
+		mlx5_flow_dump(dev, flow, i);
 		if (!flow->frxq[i].ibv_flow) {
 			rte_flow_error_set(error, ENOMEM,
 					   RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2098,11 +2150,6 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 			goto error;
 		}
 		++flows_n;
-		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
-			dev->data->port_id,
-			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq->qp,
-			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
 		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2646,24 +2693,25 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
+					"port %u flow %p cannot create hash"
+					" rxq",
 					dev->data->port_id, (void *)flow);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
 flow_create:
+			mlx5_flow_dump(dev, flow, i);
 			flow->frxq[i].ibv_flow =
 				mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 						       flow->frxq[i].ibv_attr);
 			if (!flow->frxq[i].ibv_flow) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
-					dev->data->port_id, (void *)flow);
+					"port %u flow %p type %u cannot be"
+					" applied",
+					dev->data->port_id, (void *)flow, i);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
-			DRV_LOG(DEBUG, "port %u flow %p applied",
-				dev->data->port_id, (void *)flow);
 		}
 		mlx5_flow_create_update_rxqs(dev, flow);
 	}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 58403b5b6..eb844a890 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1259,9 +1259,9 @@ mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues,
 	}
 	rte_atomic32_inc(&ind_tbl->refcnt);
 	LIST_INSERT_HEAD(&priv->ind_tbls, ind_tbl, next);
-	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
-		dev->data->port_id, (void *)ind_tbl,
-		rte_atomic32_read(&ind_tbl->refcnt));
+	DEBUG("port %u new indirection table %p: queues:%u refcnt:%d",
+	      dev->data->port_id, (void *)ind_tbl, 1 << wq_n,
+	      rte_atomic32_read(&ind_tbl->refcnt));
 	return ind_tbl;
 error:
 	rte_free(ind_tbl);
@@ -1330,9 +1330,12 @@ mlx5_ind_table_ibv_release(struct rte_eth_dev *dev,
 	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
 		((struct priv *)dev->data->dev_private)->port,
 		(void *)ind_tbl, rte_atomic32_read(&ind_tbl->refcnt));
-	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt))
+	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt)) {
 		claim_zero(mlx5_glue->destroy_rwq_ind_table
 			   (ind_tbl->ind_table));
+		DEBUG("port %u delete indirection table %p: queues: %u",
+		      dev->data->port_id, (void *)ind_tbl, ind_tbl->queues_n);
+	}
 	for (i = 0; i != ind_tbl->queues_n; ++i)
 		claim_nonzero(mlx5_rxq_release(dev, ind_tbl->queues[i]));
 	if (!rte_atomic32_read(&ind_tbl->refcnt)) {
@@ -1445,6 +1448,12 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.pd = priv->pd,
 		},
 		&qp_init_attr);
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
+	      " level:%hhu dv_attr:comp_mask:0x%lx create_flags:0x%x",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      (tunnel && rss_level == 2 ? (uint32_t)IBV_RX_HASH_INNER : 0) |
+	      hash_fields, tunnel, rss_level,
+	      qp_init_attr.comp_mask, qp_init_attr.create_flags);
 #else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
@@ -1466,6 +1475,10 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%lx tunnel:0x%x"
+	      " level:%hhu",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      hash_fields, tunnel, rss_level);
 #endif
 	if (!qp) {
 		rte_errno = errno;
@@ -1575,6 +1588,10 @@ mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
 		(void *)hrxq, rte_atomic32_read(&hrxq->refcnt));
 	if (rte_atomic32_dec_and_test(&hrxq->refcnt)) {
 		claim_zero(mlx5_glue->destroy_qp(hrxq->qp));
+		DEBUG("port %u delete QP %p: hash: 0x%lx, tunnel:"
+		      " 0x%x, level: %hhu",
+		      dev->data->port_id, (void *)hrxq, hrxq->hash_fields,
+		      hrxq->tunnel, hrxq->rss_level);
 		mlx5_ind_table_ibv_release(dev, hrxq->ind_table);
 		LIST_REMOVE(hrxq, next);
 		rte_free(hrxq);
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index e8f980ff7..886f60e61 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -103,16 +103,22 @@ extern int mlx5_logtype;
 /* claim_zero() does not perform any check when debugging is disabled. */
 #ifndef NDEBUG
 
+#define DEBUG(...) DRV_LOG(DEBUG, __VA_ARGS__)
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 
 #else /* NDEBUG */
 
+#define DEBUG(...) (void)0
 #define claim_zero(...) (__VA_ARGS__)
 #define claim_nonzero(...) (__VA_ARGS__)
 
 #endif /* NDEBUG */
 
+#define INFO(...) DRV_LOG(INFO, __VA_ARGS__)
+#define WARN(...) DRV_LOG(WARNING, __VA_ARGS__)
+#define ERROR(...) DRV_LOG(ERR, __VA_ARGS__)
+
 /* Convenience macros for accessing mbuf fields. */
 #define NEXT(m) ((m)->next)
 #define DATA_LEN(m) ((m)->data_len)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (8 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 08/11] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  6:58     ` Nélio Laranjeiro
  2018-04-17 15:14   ` [PATCH v4 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
  2018-04-17 15:14   ` [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

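Introduce the VXLAN-GPE tunnel type: pattern translation to a Verbs
tunnel spec, Rx packet type reporting and the Tx tunneled-packet mask.
An illustrative testpmd rule matching VXLAN-GPE with inner RSS
(assuming the vxlan-gpe pattern item from the companion ethdev series;
the UDP port follows the IP_OVER_VXLAN_PORT firmware setting, 4790 is
used here as an example):

  flow create 0 ingress pattern eth / ipv4 / udp dst is 4790 /
vxlan-gpe vni is 100 / end actions rss queues 1 2 end level 2 / end
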
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 99 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_rxtx.c |  3 +-
 2 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 371d029c8..1a7601cd9 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -91,6 +91,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       struct mlx5_flow_data *data);
 
 static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data);
+
+static int
 mlx5_flow_create_gre(const struct rte_flow_item *item,
 		     const void *default_mask,
 		     struct mlx5_flow_data *data);
@@ -241,10 +246,12 @@ struct rte_flow {
 
 #define IS_TUNNEL(type) ( \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
 const uint32_t flow_ptype[] = {
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
 	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -253,6 +260,8 @@ const uint32_t flow_ptype[] = {
 const uint32_t ptype_ext[] = {
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
 					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
+						  RTE_PTYPE_L4_UDP,
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -310,6 +319,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
 			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
 			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
@@ -388,7 +398,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_ipv6),
 	},
 	[RTE_FLOW_ITEM_TYPE_UDP] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_udp){
 			.hdr = {
@@ -440,6 +451,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_vxlan,
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_vxlan_gpe){
+			.vni = "\xff\xff\xff",
+		},
+		.default_mask = &rte_flow_item_vxlan_gpe_mask,
+		.mask_sz = sizeof(struct rte_flow_item_vxlan_gpe),
+		.convert = mlx5_flow_create_vxlan_gpe,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 };
 
 /** Structure to pass to the conversion function. */
@@ -1753,6 +1777,79 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert VXLAN-GPE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data)
+{
+	const struct rte_flow_item_vxlan_gpe *spec = item->spec;
+	const struct rte_flow_item_vxlan_gpe *mask = item->mask;
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel vxlan = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	union vni {
+		uint32_t vlan_id;
+		uint8_t vni[4];
+	} id;
+
+	id.vni[0] = 0;
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN-GPE to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		memcpy(&id.vni[1], spec->vni, 3);
+		vxlan.val.tunnel_id = id.vlan_id;
+		memcpy(&id.vni[1], mask->vni, 3);
+		vxlan.mask.tunnel_id = id.vlan_id;
+		if (spec->protocol)
+			return rte_flow_error_set(data->error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ITEM,
+						  item,
+						  "VxLAN-GPE protocol not"
+						  " supported");
+		/* Remove unwanted bits from values. */
+		vxlan.val.tunnel_id &= vxlan.mask.tunnel_id;
+	}
+	/*
+	 * Tunnel id 0 is equivalent as not adding a VXLAN layer, if only this
+	 * layer is defined in the Verbs specification it is interpreted as
+	 * wildcard and all packets will match this rule, if it follows a full
+	 * stack layer (ex: eth / ipv4 / udp), all packets matching the layers
+	 * before will also match this rule.
+	 * To avoid such situation, VNI 0 is currently refused.
+	 */
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "VxLAN-GPE vni cannot be 0");
+	mlx5_flow_create_copy(parser, &vxlan, size);
+	return 0;
+}
+
+/**
  * Convert GRE item to Verbs specification.
  *
  * @param item[in]
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 060ff0e85..f10ea13c1 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -466,8 +466,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			uint8_t vlan_sz =
 				(buf->ol_flags & PKT_TX_VLAN_PKT) ? 4 : 0;
 			const uint64_t is_tunneled =
-				buf->ol_flags & (PKT_TX_TUNNEL_GRE |
-						 PKT_TX_TUNNEL_VXLAN);
+				buf->ol_flags & (PKT_TX_TUNNEL_MASK);
 
 			tso_header_sz = buf->l2_len + vlan_sz +
 					buf->l3_len + buf->l4_len;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (9 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-17 15:14   ` [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

A tunnel pattern without a tunnel ID could match any non-tunneled
packet; this patch allows such a pattern once a proper outer spec is
given.
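
For instance, a VXLAN rule leaving the VNI unspecified is now accepted
because the outer stack is fully specified (illustrative testpmd
command; 4789 is the standard VXLAN port):

  flow create 0 ingress pattern eth / ipv4 / udp dst is 4789 / vxlan /
end actions queue index 1 / end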

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 1a7601cd9..c30a6d1b1 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1767,7 +1767,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	 * before will also match this rule.
 	 * To avoid such situation, VNI 0 is currently refused.
 	 */
-	if (!vxlan.val.tunnel_id)
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
 		return rte_flow_error_set(data->error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM,
 					  item,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
                     ` (10 preceding siblings ...)
  2018-04-17 15:14   ` [PATCH v4 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
@ 2018-04-17 15:14   ` Xueming Li
  2018-04-18  7:00     ` Nélio Laranjeiro
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-17 15:14 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Remove the tunnel limitations and document the new hardware tunnel
offload features.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/features/default.ini |  1 +
 doc/guides/nics/features/mlx5.ini    |  3 +++
 doc/guides/nics/mlx5.rst             | 22 ++++++++++++++++++++--
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index dae2ad776..49be81450 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -29,6 +29,7 @@ Multicast MAC filter =
 RSS hash             =
 RSS key update       =
 RSS reta update      =
+Inner RSS            =
 VMDq                 =
 SR-IOV               =
 DCB                  =
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index f8ce08770..e75b14bdc 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -21,6 +21,7 @@ Multicast MAC filter = Y
 RSS hash             = Y
 RSS key update       = Y
 RSS reta update      = Y
+Inner RSS            = Y
 SR-IOV               = Y
 VLAN filter          = Y
 Flow director        = Y
@@ -30,6 +31,8 @@ VLAN offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Timestamp offload    = Y
+Inner L3 checksum    = Y
+Inner L4 checksum    = Y
 Packet type parsing  = Y
 Rx descriptor status = Y
 Tx descriptor status = Y
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c28c83278..51590b0a3 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -74,12 +74,12 @@ Features
 - RX interrupts.
 - Statistics query including Basic, Extended and per queue.
 - Rx HW timestamp.
+- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
+- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
 
 Limitations
 -----------
 
-- Inner RSS for VXLAN frames is not supported yet.
-- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
 - For secondary process:
 
   - Forked secondary process not supported.
@@ -327,6 +327,24 @@ Run-time configuration
 
   Enabled by default, valid only on VF devices ignored otherwise.
 
+Firmware configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- L3 VXLAN and VXLAN-GPE destination UDP port
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_EN=1
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_PORT=<udp dport>
+
+  Verify configurations are set:
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> query | grep IP_OVER_VXLAN
+     IP_OVER_VXLAN_EN                    True(1)
+     IP_OVER_VXLAN_PORT                  4790
+
 Prerequisites
 -------------
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-17 15:14   ` [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
@ 2018-04-18  6:48     ` Nélio Laranjeiro
  2018-04-18 14:43       ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  6:48 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> This patch support L3 VXLAN, no inner L2 header comparing to standard
> VXLAN protocol. L3 VXLAN using specific overlay UDP destination port to
> discriminate against standard VXLAN, FW has to be configured to support
> it:
>   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
>   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 771d5f14d..d7a921dff 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_vxlan){
>  			.vni = "\xff\xff\xff",
> -- 
> 2.13.3

Such support must be under a device parameter, as it depends on the
configuration of the firmware.  If the firmware is not correctly
configured, the PMD must refuse such a rule.

Thanks,

-- 
Nélio Laranjeiro
6WIND

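For reference, the gating asked for here could look like the sketch below;
the l3_vxlan_en parameter name is an assumption for illustration (this series
does not define it at this point), while rte_flow_error_set() and the item
types are the real rte_flow API:

  #include <errno.h>
  #include <rte_flow.h>

  /* Hypothetical check: with the (assumed) l3_vxlan_en device parameter
   * off, refuse VXLAN patterns followed by a bare L3 item, since the
   * firmware would not parse such packets unless configured. */
  static int
  check_l3_vxlan(int l3_vxlan_en, enum rte_flow_item_type item_after_vxlan,
                 struct rte_flow_error *error)
  {
          if (!l3_vxlan_en &&
              (item_after_vxlan == RTE_FLOW_ITEM_TYPE_IPV4 ||
               item_after_vxlan == RTE_FLOW_ITEM_TYPE_IPV6))
                  return rte_flow_error_set(error, ENOTSUP,
                                            RTE_FLOW_ERROR_TYPE_ITEM, NULL,
                                            "L3 VXLAN needs firmware and"
                                            " device parameter setup");
          return 0;
  }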

* Re: [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-17 15:14   ` [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-18  6:50     ` Nélio Laranjeiro
  2018-04-18 14:33       ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  6:50 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:29PM +0800, Xueming Li wrote:
> This patch introduced tunnel type identification based on flow rules.
> If flows of multiple tunnel types built on same queue,
> RTE_PTYPE_TUNNEL_MASK will be returned, user application could use bits
> in flow mark as tunnel type identifier.

As discussed in the previous thread, you cannot set all tunnel bits in
the mbuf; it will break existing applications.

This is an unannounced API breakage.

Thanks,

-- 
Nélio Laranjeiro
6WIND

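To see why this breaks applications, consider the usual application-side
pattern of switching on the masked tunnel ptype; a value with every tunnel
bit set matches none of the specific cases. A minimal sketch (generic DPDK
code, nothing mlx5 specific):

  #include <rte_mbuf.h>
  #include <rte_mbuf_ptype.h>

  /* Existing code classifies tunnels by exact ptype value; an mbuf whose
   * tunnel bits are all set (RTE_PTYPE_TUNNEL_MASK) falls through to the
   * default case and is misclassified. */
  static const char *
  tunnel_ptype_name(const struct rte_mbuf *m)
  {
          switch (m->packet_type & RTE_PTYPE_TUNNEL_MASK) {
          case RTE_PTYPE_TUNNEL_VXLAN:
                  return "vxlan";
          case RTE_PTYPE_TUNNEL_GRE:
                  return "gre";
          case 0:
                  return "no tunnel";
          default:
                  return "unknown tunnel";
          }
  }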

* Re: [PATCH v4 07/11] net/mlx5: support tunnel RSS level
  2018-04-17 15:14   ` [PATCH v4 07/11] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-18  6:55     ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  6:55 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:32PM +0800, Xueming Li wrote:
> Tunnel RSS level of the flow RSS action offers the user a choice to do the
> RSS hash calculation on inner or outer RSS fields. Testpmd flow command
> examples:
> 
> GRE flow inner RSS:
>   flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
> actions rss queues 1 2 end level 1 / end
> 
> GRE tunnel flow outer RSS:
>   flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
> actions rss queues 1 2 end level 0 / end
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

-- 
Nélio Laranjeiro
6WIND

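As a companion to the testpmd commands in the acked patch, the same
inner-RSS request could be expressed through the C API; this sketch assumes
the rte_flow_action_rss layout from the 18.05 flow API rework this series is
rebased on, and the level value simply mirrors the "level 1" inner-RSS
example above:

  #include <rte_ethdev.h>
  #include <rte_flow.h>

  /* RSS action hashing on the level-1 (inner) headers and spreading
   * packets over queues 1 and 2, as in the first testpmd command. */
  static const uint16_t rss_queues[] = { 1, 2 };
  static const struct rte_flow_action_rss rss_conf = {
          .level = 1,             /* 0 would select the outer headers */
          .types = ETH_RSS_IP,
          .queue_num = 2,
          .queue = rss_queues,
  };
  static const struct rte_flow_action rss_actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss_conf },
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };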

* Re: [PATCH v4 08/11] net/mlx5: add hardware flow debug dump
  2018-04-17 15:14   ` [PATCH v4 08/11] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-18  6:57     ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  6:57 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:33PM +0800, Xueming Li wrote:
> Dump Verbs flow details, including flow spec type and size, for
> debugging purposes.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
>  drivers/net/mlx5/mlx5_rxq.c   | 25 +++++++++++++---
>  drivers/net/mlx5/mlx5_utils.h |  6 ++++
>  3 files changed, 85 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index a6791c525..371d029c8 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -2050,6 +2050,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
>  }
>  
>  /**
> + * Dump flow hash RX queue detail.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param flow
> + *   Pointer to the rte_flow.
> + * @param i
> + *   Hash RX queue index.
> + */
> +static void
> +mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
> +	       struct rte_flow *flow __rte_unused,
> +	       unsigned int i __rte_unused)

Can this "i" be renamed to hrxq_idx to have something more
understandable across the code?

> +{
> +#ifndef NDEBUG
> +	uintptr_t spec_ptr;
> +	uint16_t j;
> +	char buf[256];
> +	uint8_t off;
> +
> +	spec_ptr = (uintptr_t)(flow->frxq[i].ibv_attr + 1);
> +	for (j = 0, off = 0; j < flow->frxq[i].ibv_attr->num_of_specs;
> +	     j++) {
> +		struct ibv_flow_spec *spec = (void *)spec_ptr;
> +		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
> +			       spec->hdr.size);
> +		spec_ptr += spec->hdr.size;
> +	}
> +	DRV_LOG(DEBUG,
> +		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p, hash:%lx/%u"
> +		" specs:%hhu(%hu), priority:%hu, type:%d, flags:%x,"
> +		" comp_mask:%x specs:%s",
> +		dev->data->port_id, (void *)flow, i,
> +		(void *)flow->frxq[i].hrxq,
> +		(void *)flow->frxq[i].hrxq->qp,
> +		(void *)flow->frxq[i].hrxq->ind_table,
> +		flow->frxq[i].hash_fields |
> +		(flow->tunnel &&
> +		 flow->rss_conf.level > 1 ? (uint32_t)IBV_RX_HASH_INNER : 0),
> +		flow->rss_conf.queue_num,
> +		flow->frxq[i].ibv_attr->num_of_specs,
> +		flow->frxq[i].ibv_attr->size,
> +		flow->frxq[i].ibv_attr->priority,
> +		flow->frxq[i].ibv_attr->type,
> +		flow->frxq[i].ibv_attr->flags,
> +		flow->frxq[i].ibv_attr->comp_mask,
> +		buf);
> +#endif
>[...]

Thanks,

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type
  2018-04-17 15:14   ` [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-18  6:58     ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  6:58 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:34PM +0800, Xueming Li wrote:
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

> ---
>  drivers/net/mlx5/mlx5_flow.c | 99 +++++++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_rxtx.c |  3 +-
>  2 files changed, 99 insertions(+), 3 deletions(-)
>[...]

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-17 15:14   ` [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
@ 2018-04-18  7:00     ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18  7:00 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 17, 2018 at 11:14:36PM +0800, Xueming Li wrote:
> Remove tunnel limitations, add new hardware tunnel offload features.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  doc/guides/nics/features/default.ini |  1 +
>  doc/guides/nics/features/mlx5.ini    |  3 +++
>  doc/guides/nics/mlx5.rst             | 22 ++++++++++++++++++++--
>  3 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index dae2ad776..49be81450 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -29,6 +29,7 @@ Multicast MAC filter =
>  RSS hash             =
>  RSS key update       =
>  RSS reta update      =
> +Inner RSS            =
>  VMDq                 =
>  SR-IOV               =
>  DCB                  =
> diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
> index f8ce08770..e75b14bdc 100644
> --- a/doc/guides/nics/features/mlx5.ini
> +++ b/doc/guides/nics/features/mlx5.ini
> @@ -21,6 +21,7 @@ Multicast MAC filter = Y
>  RSS hash             = Y
>  RSS key update       = Y
>  RSS reta update      = Y
> +Inner RSS            = Y
>  SR-IOV               = Y
>  VLAN filter          = Y
>  Flow director        = Y
> @@ -30,6 +31,8 @@ VLAN offload         = Y
>  L3 checksum offload  = Y
>  L4 checksum offload  = Y
>  Timestamp offload    = Y
> +Inner L3 checksum    = Y
> +Inner L4 checksum    = Y
>  Packet type parsing  = Y
>  Rx descriptor status = Y
>  Tx descriptor status = Y
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index c28c83278..51590b0a3 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -74,12 +74,12 @@ Features
>  - RX interrupts.
>  - Statistics query including Basic, Extended and per queue.
>  - Rx HW timestamp.
> +- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
> +- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
>  
>  Limitations
>  -----------
>  
> -- Inner RSS for VXLAN frames is not supported yet.
> -- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
>  - For secondary process:
>  
>    - Forked secondary process not supported.
> @@ -327,6 +327,24 @@ Run-time configuration
>  
>    Enabled by default, valid only on VF devices ignored otherwise.
>  
> +Firmware configuration
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +- L3 VXLAN and VXLAN-GPE destination UDP port
> +
> +   .. code-block:: console
> +
> +     mlxconfig -d <mst device> set IP_OVER_VXLAN_EN=1
> +     mlxconfig -d <mst device> set IP_OVER_VXLAN_PORT=<udp dport>
> +
> +  Verify configurations are set:
> +
> +   .. code-block:: console
> +
> +     mlxconfig -d <mst device> query | grep IP_OVER_VXLAN
> +     IP_OVER_VXLAN_EN                    True(1)
> +     IP_OVER_VXLAN_PORT                  4790
> +
>  Prerequisites
>  -------------
>  
> -- 
> 2.13.3

The documentation modification related to L3 VXLAN should be in the
same patch as the code change in mlx5.

Thanks,

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-18  6:50     ` Nélio Laranjeiro
@ 2018-04-18 14:33       ` Xueming(Steven) Li
  2018-04-18 15:06         ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-18 14:33 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Wednesday, April 18, 2018 2:51 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
> 
> On Tue, Apr 17, 2018 at 11:14:29PM +0800, Xueming Li wrote:
> > This patch introduced tunnel type identification based on flow rules.
> > If flows of multiple tunnel types built on same queue,
> > RTE_PTYPE_TUNNEL_MASK will be returned, user application could use
> > bits in flow mark as tunnel type identifier.
> 
> As discussed in the previous thread, you cannot set all tunnel bits in the mbuf, it will break
> existing applications.
> 
> This is an non announce API breakage.
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

There was another issue in the code; please check the comments below:
http://www.dpdk.org/ml/archives/dev/2018-April/096991.html

Thanks,
Xueming


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-18  6:48     ` Nélio Laranjeiro
@ 2018-04-18 14:43       ` Xueming(Steven) Li
  2018-04-18 15:08         ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-18 14:43 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Wednesday, April 18, 2018 2:49 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> 
> On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > This patch support L3 VXLAN, no inner L2 header comparing to standard
> > VXLAN protocol. L3 VXLAN using specific overlay UDP destination port
> > to discriminate against standard VXLAN, FW has to be configured to
> > support
> > it:
> >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_vxlan){
> >  			.vni = "\xff\xff\xff",
> > --
> > 2.13.3
> 
> Such support must be under device parameter has it depends on the configuration of the firmware.  If
> the firmware is not correctly configured the PMD must refuse such rule.
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

Are you suggesting a Verbs parameter? I'm afraid we can't have it in a short
time; it needs a new patch in a later release when Verbs is ready.


* Re: [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-18 14:33       ` Xueming(Steven) Li
@ 2018-04-18 15:06         ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18 15:06 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Wed, Apr 18, 2018 at 02:33:01PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Wednesday, April 18, 2018 2:51 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification
> > 
> > On Tue, Apr 17, 2018 at 11:14:29PM +0800, Xueming Li wrote:
> > > This patch introduced tunnel type identification based on flow rules.
> > > If flows of multiple tunnel types built on same queue,
> > > RTE_PTYPE_TUNNEL_MASK will be returned, user application could use
> > > bits in flow mark as tunnel type identifier.
> > 
> > As discussed in the previous thread, you cannot set all tunnel bits in the mbuf, it will break
> > existing applications.
> > 
> > This is an non announce API breakage.
> > 
> > Thanks,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND
> 
> There was another issue in code, please check below comments:
> http://www.dpdk.org/ml/archives/dev/2018-April/096991.html

I've already read it, and I don't see anything related to changing the
meaning of the mbuf flags.

Such a change is not announced, thus it cannot be accepted; you must not
set any flag in the mbuf if the PMD cannot clearly identify what is in
the packet.

Regards,

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-18 14:43       ` Xueming(Steven) Li
@ 2018-04-18 15:08         ` Nélio Laranjeiro
  2018-04-19  6:20           ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-18 15:08 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Wednesday, April 18, 2018 2:49 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > 
> > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > This patch support L3 VXLAN, no inner L2 header comparing to standard
> > > VXLAN protocol. L3 VXLAN using specific overlay UDP destination port
> > > to discriminate against standard VXLAN, FW has to be configured to
> > > support
> > > it:
> > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> > >
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff 100644
> > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > >  		.actions = valid_actions,
> > >  		.mask = &(const struct rte_flow_item_vxlan){
> > >  			.vni = "\xff\xff\xff",
> > > --
> > > 2.13.3
> > 
> > Such support must be under device parameter has it depends on the configuration of the firmware.  If
> > the firmware is not correctly configured the PMD must refuse such rule.
> > 
> > Thanks,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND
> 
> Are you suggesting Verbs parameter? I'm afraid we can't have it in
> short time, need new patch in later release when Verbs ready.

Take a look at [1], this is what I mean.

Regards,

[1] https://dpdk.org/doc/guides/nics/mlx5.html#run-time-configuration

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-18 15:08         ` Nélio Laranjeiro
@ 2018-04-19  6:20           ` Xueming(Steven) Li
  2018-04-19  6:55             ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-19  6:20 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Wednesday, April 18, 2018 11:09 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> 
> On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > >
> > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > This patch support L3 VXLAN, no inner L2 header comparing to
> > > > standard VXLAN protocol. L3 VXLAN using specific overlay UDP
> > > > destination port to discriminate against standard VXLAN, FW has to
> > > > be configured to support
> > > > it:
> > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> > > >
> > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > ---
> > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff 100644
> > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > >  		.actions = valid_actions,
> > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > >  			.vni = "\xff\xff\xff",
> > > > --
> > > > 2.13.3
> > >
> > > Such support must be under device parameter has it depends on the
> > > configuration of the firmware.  If the firmware is not correctly configured the PMD must refuse
> such rule.
> > >
> > > Thanks,
> > >
> > > --
> > > Nélio Laranjeiro
> > > 6WIND
> >
> > Are you suggesting Verbs parameter? I'm afraid we can't have it in
> > short time, need new patch in later release when Verbs ready.
> 
> Take a look at [1], this is what I mean.

Enabling a new device parameter can't make L3 VXLAN packets get received if
the FW configuration is not set. On the other hand, if the FW configuration
is enabled and the device parameter is not set, packets could be received
but rule creation would fail. I'm afraid that a device parameter will
introduce complexity in using this feature w/o real benefits.

> 
> Regards,
> 
> [1] https://dpdk.org/doc/guides/nics/mlx5.html#run-time-configuration
> 
> --
> Nélio Laranjeiro
> 6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19  6:20           ` Xueming(Steven) Li
@ 2018-04-19  6:55             ` Nélio Laranjeiro
  2018-04-19 10:21               ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-19  6:55 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Wednesday, April 18, 2018 11:09 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > 
> > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > >
> > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > This patch support L3 VXLAN, no inner L2 header comparing to
> > > > > standard VXLAN protocol. L3 VXLAN using specific overlay UDP
> > > > > destination port to discriminate against standard VXLAN, FW has to
> > > > > be configured to support
> > > > > it:
> > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> > > > >
> > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > ---
> > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff 100644
> > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > >  	},
> > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > > >  		.actions = valid_actions,
> > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > >  			.vni = "\xff\xff\xff",
> > > > > --
> > > > > 2.13.3
> > > >
> > > > Such support must be under device parameter has it depends on the
> > > > configuration of the firmware.  If the firmware is not correctly configured the PMD must refuse
> > such rule.
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Nélio Laranjeiro
> > > > 6WIND
> > >
> > > Are you suggesting Verbs parameter? I'm afraid we can't have it in
> > > short time, need new patch in later release when Verbs ready.
> > 
> > Take a look at [1], this is what I mean.
> 
> Enabling a new device parameter can't make L3 VXLAN packet get
> received if fw configuration not set.

So you expect that the user will enable a feature without reading the
PMD documentation?
If that is the case, the answer is pretty simple; it is the same as
above: read the PMD documentation.

> On the other hand, if fw continuation enabled and device parameter not
> set, packet could be received but failed to create rule.

Again, a user of a NIC should read the documentation.

> I'm afraid that a device parameter will introduce complexity of using 
> this feature w/o real benefits.

Add this missing device parameter and update the documentation
accordingly, or wait for Verbs to add the missing query feature.

If the firmware is not configured, this rule must be refused; as there is
no way for the PMD to know whether the firmware is configured, it must
rely on a device parameter.

Regards,

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19  6:55             ` Nélio Laranjeiro
@ 2018-04-19 10:21               ` Xueming(Steven) Li
  2018-04-19 11:15                 ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-19 10:21 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Thursday, April 19, 2018 2:56 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> 
> On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > >
> > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > >
> > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > This patch support L3 VXLAN, no inner L2 header comparing to
> > > > > > standard VXLAN protocol. L3 VXLAN using specific overlay UDP
> > > > > > destination port to discriminate against standard VXLAN, FW
> > > > > > has to be configured to support
> > > > > > it:
> > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> > > > > >
> > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > ---
> > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff
> > > > > > 100644
> > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > >  	},
> > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > > > >  		.actions = valid_actions,
> > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > >  			.vni = "\xff\xff\xff",
> > > > > > --
> > > > > > 2.13.3
> > > > >
> > > > > Such support must be under device parameter has it depends on
> > > > > the configuration of the firmware.  If the firmware is not
> > > > > correctly configured the PMD must refuse
> > > such rule.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > --
> > > > > Nélio Laranjeiro
> > > > > 6WIND
> > > >
> > > > Are you suggesting Verbs parameter? I'm afraid we can't have it in
> > > > short time, need new patch in later release when Verbs ready.
> > >
> > > Take a look at [1], this is what I mean.
> >
> > Enabling a new device parameter can't make L3 VXLAN packet get
> > received if fw configuration not set.
> 
> So you expect than the user will enable a feature without reading the PMD documentation?
> If it is the case, the answer it pretty simple, it is the same as above, read the PMD documentation.
> 
> > On the other hand, if fw continuation enabled and device parameter not
> > set, packet could be received but failed to create rule.
> 
> Again a user using a NIC should read the documentation.

If a user reads the document, the FW should be configured correctly to enable this feature.

> 
> > I'm afraid that a device parameter will introduce complexity of using
> > this feature w/o real benefits.
> 
> Add this missing device parameter and update accordingly the documentation, or wait for Verbs to add
> the missing query feature.
> 
> If the firmware it not configured this rule must be refused, as there is no way in the PMD to know if
> the firmware is configured, it must rely on a device parameter.

Let's keep the design simple; users know exactly what they are doing and
should not expect such a flow to work without reading the document.

> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19 10:21               ` Xueming(Steven) Li
@ 2018-04-19 11:15                 ` Nélio Laranjeiro
  2018-04-19 11:53                   ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-19 11:15 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Thu, Apr 19, 2018 at 10:21:26AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Thursday, April 19, 2018 2:56 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > 
> > On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > >
> > > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > > >
> > > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > > This patch support L3 VXLAN, no inner L2 header comparing to
> > > > > > > standard VXLAN protocol. L3 VXLAN using specific overlay UDP
> > > > > > > destination port to discriminate against standard VXLAN, FW
> > > > > > > has to be configured to support
> > > > > > > it:
> > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
> > > > > > >
> > > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > > ---
> > > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff
> > > > > > > 100644
> > > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > > >  	},
> > > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > > > > >  		.actions = valid_actions,
> > > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > > >  			.vni = "\xff\xff\xff",
> > > > > > > --
> > > > > > > 2.13.3
> > > > > >
> > > > > > Such support must be under device parameter has it depends on
> > > > > > the configuration of the firmware.  If the firmware is not
> > > > > > correctly configured the PMD must refuse
> > > > such rule.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --
> > > > > > Nélio Laranjeiro
> > > > > > 6WIND
> > > > >
> > > > > Are you suggesting Verbs parameter? I'm afraid we can't have it in
> > > > > short time, need new patch in later release when Verbs ready.
> > > >
> > > > Take a look at [1], this is what I mean.
> > >
> > > Enabling a new device parameter can't make L3 VXLAN packet get
> > > received if fw configuration not set.
> > 
> > So you expect than the user will enable a feature without reading the PMD documentation?
> > If it is the case, the answer it pretty simple, it is the same as above, read the PMD documentation.
> > 
> > > On the other hand, if fw continuation enabled and device parameter not
> > > set, packet could be received but failed to create rule.
> > 
> > Again a user using a NIC should read the documentation.
> 
> If a user read the document, fw should be configured correctly to enable this feature.

And a user who does not read this document must not be able to create
rules that the NIC cannot handle because the firmware is not configured.

> > > I'm afraid that a device parameter will introduce complexity of using
> > > this feature w/o real benefits.
> > 
> > Add this missing device parameter and update accordingly the documentation, or wait for Verbs to add
> > the missing query feature.
> > 
> > If the firmware it not configured this rule must be refused, as there is no way in the PMD to know if
> > the firmware is configured, it must rely on a device parameter.
> 
> Let's keep the design simple, users know exactly what they are doing and should not expecting 
> such flow working by reading document.

This is exactly the opposite: users never read documentation; even
today I have already spotted a new user needing such documentation [1].

For this same reason, a functionality not enabled by default in the
firmware must not be used by the PMD.  No device parameter, no feature.

Add the device parameter and the corresponding documentation.

Regards,

[1] https://dpdk.org/ml/archives/users/2018-April/003020.html

-- 
Nélio Laranjeiro
6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19 11:15                 ` Nélio Laranjeiro
@ 2018-04-19 11:53                   ` Xueming(Steven) Li
  2018-04-19 12:18                     ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-19 11:53 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Thursday, April 19, 2018 7:15 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> 
> On Thu, Apr 19, 2018 at 10:21:26AM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Thursday, April 19, 2018 2:56 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > >
> > > On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > >
> > > > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > flow
> > > > > > >
> > > > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > > > This patch support L3 VXLAN, no inner L2 header comparing
> > > > > > > > to standard VXLAN protocol. L3 VXLAN using specific
> > > > > > > > overlay UDP destination port to discriminate against
> > > > > > > > standard VXLAN, FW has to be configured to support
> > > > > > > > it:
> > > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > > > >   sudo mlxconfig -d <device> -y s
> > > > > > > > IP_OVER_VXLAN_PORT=<port>
> > > > > > > >
> > > > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > > > ---
> > > > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff
> > > > > > > > 100644
> > > > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > > > >  	},
> > > > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > > > > > >  		.actions = valid_actions,
> > > > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > > > >  			.vni = "\xff\xff\xff",
> > > > > > > > --
> > > > > > > > 2.13.3
> > > > > > >
> > > > > > > Such support must be under device parameter has it depends
> > > > > > > on the configuration of the firmware.  If the firmware is
> > > > > > > not correctly configured the PMD must refuse
> > > > > such rule.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > --
> > > > > > > Nélio Laranjeiro
> > > > > > > 6WIND
> > > > > >
> > > > > > Are you suggesting Verbs parameter? I'm afraid we can't have
> > > > > > it in short time, need new patch in later release when Verbs ready.
> > > > >
> > > > > Take a look at [1], this is what I mean.
> > > >
> > > > Enabling a new device parameter can't make L3 VXLAN packet get
> > > > received if fw configuration not set.
> > >
> > > So you expect than the user will enable a feature without reading the PMD documentation?
> > > If it is the case, the answer it pretty simple, it is the same as above, read the PMD
> documentation.
> > >
> > > > On the other hand, if fw continuation enabled and device parameter
> > > > not set, packet could be received but failed to create rule.
> > >
> > > Again a user using a NIC should read the documentation.
> >
> > If a user read the document, fw should be configured correctly to enable this feature.
> 
> And a user which does not read this document must not be able to create rules the NIC cannot handle
> because the firmware is not configured.
> 
> > > > I'm afraid that a device parameter will introduce complexity of
> > > > using this feature w/o real benefits.
> > >
> > > Add this missing device parameter and update accordingly the
> > > documentation, or wait for Verbs to add the missing query feature.
> > >
> > > If the firmware it not configured this rule must be refused, as
> > > there is no way in the PMD to know if the firmware is configured, it must rely on a device
> parameter.
> >
> > Let's keep the design simple, users know exactly what they are doing
> > and should not expecting such flow working by reading document.
> 
> This is exactly the opposite, users never read documentation even today I've already spotted a new
> user to such documentation [1].

  "So you expect than the user will enable a feature without reading the PMD documentation?
   If it is the case, the answer it pretty simple, it is the same as above, read the PMD documentation.
   Again a user using a NIC should read the documentation."

> 
> For this same reason a functionality not enabled by default in the firmware must not be used by the
> PMD.  No device parameter no feature.

Unlike other functionality, this feature is about supporting a new tunnel
type; w/o the FW configuration, an L3 VXLAN packet will simply be treated as
a normal packet, so it doesn't hurt. What do you think?

> 
> Add the device parameter and the according documentation.
> 
> Regards,
> 
> [1] https://dpdk.org/ml/archives/users/2018-April/003020.html
> 
> --
> Nélio Laranjeiro
> 6WIND


* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19 11:53                   ` Xueming(Steven) Li
@ 2018-04-19 12:18                     ` Nélio Laranjeiro
  2018-04-19 12:49                       ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-19 12:18 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Thu, Apr 19, 2018 at 11:53:05AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Thursday, April 19, 2018 7:15 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > 
> > On Thu, Apr 19, 2018 at 10:21:26AM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Thursday, April 19, 2018 2:56 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > >
> > > > On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > > >
> > > > > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > > flow
> > > > > > > >
> > > > > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > > > > This patch support L3 VXLAN, no inner L2 header comparing
> > > > > > > > > to standard VXLAN protocol. L3 VXLAN using specific
> > > > > > > > > overlay UDP destination port to discriminate against
> > > > > > > > > standard VXLAN, FW has to be configured to support
> > > > > > > > > it:
> > > > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > > > > >   sudo mlxconfig -d <device> -y s
> > > > > > > > > IP_OVER_VXLAN_PORT=<port>
> > > > > > > > >
> > > > > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > > > > ---
> > > > > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > b/drivers/net/mlx5/mlx5_flow.c index 771d5f14d..d7a921dff
> > > > > > > > > 100644
> > > > > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > > > > >  	},
> > > > > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
> > > > > > > > >  		.actions = valid_actions,
> > > > > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > > > > >  			.vni = "\xff\xff\xff",
> > > > > > > > > --
> > > > > > > > > 2.13.3
> > > > > > > >
> > > > > > > > Such support must be under device parameter has it depends
> > > > > > > > on the configuration of the firmware.  If the firmware is
> > > > > > > > not correctly configured the PMD must refuse
> > > > > > such rule.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > --
> > > > > > > > Nélio Laranjeiro
> > > > > > > > 6WIND
> > > > > > >
> > > > > > > Are you suggesting Verbs parameter? I'm afraid we can't have
> > > > > > > it in short time, need new patch in later release when Verbs ready.
> > > > > >
> > > > > > Take a look at [1], this is what I mean.
> > > > >
> > > > > Enabling a new device parameter can't make L3 VXLAN packet get
> > > > > received if fw configuration not set.
> > > >
> > > > So you expect than the user will enable a feature without reading the PMD documentation?
> > > > If it is the case, the answer it pretty simple, it is the same as above, read the PMD
> > documentation.
> > > >
> > > > > On the other hand, if fw continuation enabled and device parameter
> > > > > not set, packet could be received but failed to create rule.
> > > >
> > > > Again a user using a NIC should read the documentation.
> > >
> > > If a user read the document, fw should be configured correctly to enable this feature.
> > 
> > And a user which does not read this document must not be able to create rules the NIC cannot handle
> > because the firmware is not configured.
> > 
> > > > > I'm afraid that a device parameter will introduce complexity of
> > > > > using this feature w/o real benefits.
> > > >
> > > > Add this missing device parameter and update accordingly the
> > > > documentation, or wait for Verbs to add the missing query feature.
> > > >
> > > > If the firmware it not configured this rule must be refused, as
> > > > there is no way in the PMD to know if the firmware is configured, it must rely on a device
> > parameter.
> > >
> > > Let's keep the design simple, users know exactly what they are doing
> > > and should not expecting such flow working by reading document.
> > 
> > This is exactly the opposite, users never read documentation even today I've already spotted a new
> > user to such documentation [1].
> 
>   "So you expect than the user will enable a feature without reading the PMD documentation?
>    If it is the case, the answer it pretty simple, it is the same as above, read the PMD documentation.
>    Again a user using a NIC should read the documentation."
> 
> > 
> > For this same reason a functionality not enabled by default in the firmware must not be used by the
> > PMD.  No device parameter no feature.
> 
> Unlike other functionality, this feature related to supporting a new tunnel type, w/o fw configuration, 
> L3 VXLAN packet certainly be treated as normal packet, it doesn't hurt. How do you think?

 flow create 0 ingress pattern eth / ipv4 / end actions queue index 3 / end

but the packet ends up in queue 0; will it hurt?

Any rule *accepted* by the PMD *must* follow the user request, otherwise
it is a bug.

Add the device parameter and the corresponding documentation.

Regards,

-- 
Nélio Laranjeiro
6WIND

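The contract stated here can be made explicit on the application side:
rte_flow_validate() is precisely the hook that lets a PMD refuse what it
cannot honor, and anything accepted afterwards is expected to behave as
requested. A generic sketch (no mlx5 specifics assumed):

  #include <rte_flow.h>

  /* Validate first, create only on success; a PMD that cannot honor the
   * rule (e.g. firmware not configured) must fail validation instead of
   * silently delivering packets to the wrong queue. */
  static struct rte_flow *
  create_checked(uint16_t port_id,
                 const struct rte_flow_attr *attr,
                 const struct rte_flow_item pattern[],
                 const struct rte_flow_action actions[])
  {
          struct rte_flow_error err;

          if (rte_flow_validate(port_id, attr, pattern, actions, &err))
                  return NULL; /* refusal is the correct outcome here */
          return rte_flow_create(port_id, attr, pattern, actions, &err);
  }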

* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19 12:18                     ` Nélio Laranjeiro
@ 2018-04-19 12:49                       ` Xueming(Steven) Li
  2018-04-19 13:40                         ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-19 12:49 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Thursday, April 19, 2018 8:19 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> 
> On Thu, Apr 19, 2018 at 11:53:05AM +0000, Xueming(Steven) Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Thursday, April 19, 2018 7:15 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > >
> > > On Thu, Apr 19, 2018 at 10:21:26AM +0000, Xueming(Steven) Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Sent: Thursday, April 19, 2018 2:56 PM
> > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > >
> > > > > On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > flow
> > > > > > >
> > > > > > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > > > flow
> > > > > > > > >
> > > > > > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > > > > > This patch support L3 VXLAN, no inner L2 header
> > > > > > > > > > comparing to standard VXLAN protocol. L3 VXLAN using
> > > > > > > > > > specific overlay UDP destination port to discriminate
> > > > > > > > > > against standard VXLAN, FW has to be configured to
> > > > > > > > > > support
> > > > > > > > > > it:
> > > > > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > > > > > >   sudo mlxconfig -d <device> -y s
> > > > > > > > > > IP_OVER_VXLAN_PORT=<port>
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > > > > > ---
> > > > > > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > b/drivers/net/mlx5/mlx5_flow.c index
> > > > > > > > > > 771d5f14d..d7a921dff
> > > > > > > > > > 100644
> > > > > > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > > > > > >  	},
> > > > > > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN.
> > > > > > > > > > +*/
> > > > > > > > > >  		.actions = valid_actions,
> > > > > > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > > > > > >  			.vni = "\xff\xff\xff",
> > > > > > > > > > --
> > > > > > > > > > 2.13.3
> > > > > > > > >
> > > > > > > > > Such support must be under device parameter has it
> > > > > > > > > depends on the configuration of the firmware.  If the
> > > > > > > > > firmware is not correctly configured the PMD must refuse
> > > > > > > such rule.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Nélio Laranjeiro
> > > > > > > > > 6WIND
> > > > > > > >
> > > > > > > > Are you suggesting Verbs parameter? I'm afraid we can't
> > > > > > > > have it in short time, need new patch in later release when Verbs ready.
> > > > > > >
> > > > > > > Take a look at [1], this is what I mean.
> > > > > >
> > > > > > Enabling a new device parameter can't make L3 VXLAN packet get
> > > > > > received if fw configuration not set.
> > > > >
> > > > > So you expect than the user will enable a feature without reading the PMD documentation?
> > > > > If it is the case, the answer it pretty simple, it is the same
> > > > > as above, read the PMD
> > > documentation.
> > > > >
> > > > > > On the other hand, if fw continuation enabled and device
> > > > > > parameter not set, packet could be received but failed to create rule.
> > > > >
> > > > > Again a user using a NIC should read the documentation.
> > > >
> > > > If a user read the document, fw should be configured correctly to enable this feature.
> > >
> > > And a user which does not read this document must not be able to
> > > create rules the NIC cannot handle because the firmware is not configured.
> > >
> > > > > > I'm afraid that a device parameter will introduce complexity
> > > > > > of using this feature w/o real benefits.
> > > > >
> > > > > Add this missing device parameter and update accordingly the
> > > > > documentation, or wait for Verbs to add the missing query feature.
> > > > >
> > > > > If the firmware it not configured this rule must be refused, as
> > > > > there is no way in the PMD to know if the firmware is
> > > > > configured, it must rely on a device
> > > parameter.
> > > >
> > > > Let's keep the design simple, users know exactly what they are
> > > > doing and should not expecting such flow working by reading document.
> > >
> > > This is exactly the opposite, users never read documentation even
> > > today I've already spotted a new user to such documentation [1].
> >
> >   "So you expect than the user will enable a feature without reading the PMD documentation?
> >    If it is the case, the answer it pretty simple, it is the same as above, read the PMD
> documentation.
> >    Again a user using a NIC should read the documentation."
> >
> > >
> > > For this same reason a functionality not enabled by default in the
> > > firmware must not be used by the PMD.  No device parameter no feature.
> >
> > Unlike other functionality, this feature related to supporting a new
> > tunnel type, w/o fw configuration,
> > L3 VXLAN packet certainly be treated as normal packet, it doesn't hurt. How do you think?
> 
>  flow create 0 ingress eth / ipv4 / end action queue index 3 end
> 
> but the packet ends in queue 0, will it hurt?

This is the correct example: 

flow create 0 ingress pattern eth / ipv4 / udp dst is 4789 / vxlan / ipv4 / end actions rss queues 1 2  end / end

Users should never create such a rule and expect it to work, because it doesn't meet the VXLAN RFC.
If users want to match L3 VXLAN, they should read the documentation and configure the FW to get the correct result.
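
For contrast, a standard VXLAN rule matches an inner Ethernet header right
after the vxlan item (a testpmd sketch, same queues assumed):

flow create 0 ingress pattern eth / ipv4 / udp dst is 4789 / vxlan / eth / ipv4 / end actions rss queues 1 2 end / end

Only the L3 variant above, where an IP item directly follows vxlan, depends
on the IP_OVER_VXLAN firmware configuration.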

> 
> Any rule *accepted* by the PMD *must* follow the user request, otherwise it is a bug.

I'd beg you to consider this from the user's perspective: the motivation of this design is to promote
rte flow by replacing device parameters, yet now we are making flow usage awkward.

> 
> Add the device parameter and the according documentation.
> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-19 12:49                       ` Xueming(Steven) Li
@ 2018-04-19 13:40                         ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-19 13:40 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Thu, Apr 19, 2018 at 12:49:41PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Thursday, April 19, 2018 8:19 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > 
> > On Thu, Apr 19, 2018 at 11:53:05AM +0000, Xueming(Steven) Li wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Thursday, April 19, 2018 7:15 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > >
> > > > On Thu, Apr 19, 2018 at 10:21:26AM +0000, Xueming(Steven) Li wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Sent: Thursday, April 19, 2018 2:56 PM
> > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow
> > > > > >
> > > > > > On Thu, Apr 19, 2018 at 06:20:50AM +0000, Xueming(Steven) Li wrote:
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Sent: Wednesday, April 18, 2018 11:09 PM
> > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > > flow
> > > > > > > >
> > > > > > > > On Wed, Apr 18, 2018 at 02:43:30PM +0000, Xueming(Steven) Li wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > > > Sent: Wednesday, April 18, 2018 2:49 PM
> > > > > > > > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > > > > > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > > > > > > > Subject: Re: [PATCH v4 03/11] net/mlx5: support L3 VXLAN
> > > > > > > > > > flow
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 17, 2018 at 11:14:28PM +0800, Xueming Li wrote:
> > > > > > > > > > > This patch support L3 VXLAN, no inner L2 header
> > > > > > > > > > > comparing to standard VXLAN protocol. L3 VXLAN using
> > > > > > > > > > > specific overlay UDP destination port to discriminate
> > > > > > > > > > > against standard VXLAN, FW has to be configured to
> > > > > > > > > > > support
> > > > > > > > > > > it:
> > > > > > > > > > >   sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
> > > > > > > > > > >   sudo mlxconfig -d <device> -y s
> > > > > > > > > > > IP_OVER_VXLAN_PORT=<port>
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  drivers/net/mlx5/mlx5_flow.c | 4 +++-
> > > > > > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > > b/drivers/net/mlx5/mlx5_flow.c index
> > > > > > > > > > > 771d5f14d..d7a921dff
> > > > > > > > > > > 100644
> > > > > > > > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > > > > > > > @@ -413,7 +413,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > > > > > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > > > > > > >  	},
> > > > > > > > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > > > > > > > > -		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > > > > > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
> > > > > > > > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN.
> > > > > > > > > > > +*/
> > > > > > > > > > >  		.actions = valid_actions,
> > > > > > > > > > >  		.mask = &(const struct rte_flow_item_vxlan){
> > > > > > > > > > >  			.vni = "\xff\xff\xff",
> > > > > > > > > > > --
> > > > > > > > > > > 2.13.3
> > > > > > > > > >
> > > > > > > > > > Such support must be under device parameter has it
> > > > > > > > > > depends on the configuration of the firmware.  If the
> > > > > > > > > > firmware is not correctly configured the PMD must refuse
> > > > > > > > such rule.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Nélio Laranjeiro
> > > > > > > > > > 6WIND
> > > > > > > > >
> > > > > > > > > Are you suggesting Verbs parameter? I'm afraid we can't
> > > > > > > > > have it in short time, need new patch in later release when Verbs ready.
> > > > > > > >
> > > > > > > > Take a look at [1], this is what I mean.
> > > > > > >
> > > > > > > Enabling a new device parameter can't make L3 VXLAN packet get
> > > > > > > received if fw configuration not set.
> > > > > >
> > > > > > So you expect than the user will enable a feature without reading the PMD documentation?
> > > > > > If it is the case, the answer it pretty simple, it is the same
> > > > > > as above, read the PMD
> > > > documentation.
> > > > > >
> > > > > > > On the other hand, if fw continuation enabled and device
> > > > > > > parameter not set, packet could be received but failed to create rule.
> > > > > >
> > > > > > Again a user using a NIC should read the documentation.
> > > > >
> > > > > If a user read the document, fw should be configured correctly to enable this feature.
> > > >
> > > > And a user which does not read this document must not be able to
> > > > create rules the NIC cannot handle because the firmware is not configured.
> > > >
> > > > > > > I'm afraid that a device parameter will introduce complexity
> > > > > > > of using this feature w/o real benefits.
> > > > > >
> > > > > > Add this missing device parameter and update accordingly the
> > > > > > documentation, or wait for Verbs to add the missing query feature.
> > > > > >
> > > > > > If the firmware it not configured this rule must be refused, as
> > > > > > there is no way in the PMD to know if the firmware is
> > > > > > configured, it must rely on a device
> > > > parameter.
> > > > >
> > > > > Let's keep the design simple, users know exactly what they are
> > > > > doing and should not expecting such flow working by reading document.
> > > >
> > > > This is exactly the opposite, users never read documentation even
> > > > today I've already spotted a new user to such documentation [1].
> > >
> > >   "So you expect than the user will enable a feature without reading the PMD documentation?
> > >    If it is the case, the answer it pretty simple, it is the same as above, read the PMD
> > documentation.
> > >    Again a user using a NIC should read the documentation."
> > >
> > > >
> > > > For this same reason a functionality not enabled by default in the
> > > > firmware must not be used by the PMD.  No device parameter no feature.
> > >
> > > Unlike other functionality, this feature related to supporting a new
> > > tunnel type, w/o fw configuration,
> > > L3 VXLAN packet certainly be treated as normal packet, it doesn't hurt. How do you think?
> > 
> >  flow create 0 ingress eth / ipv4 / end action queue index 3 end
> > 
> > but the packet ends in queue 0, will it hurt?
> 
> This is the correct example: 
> 
> flow create 0 ingress pattern eth / ipv4 / udp dst is 4789 / vxlan / ipv4 / end actions rss queues 1 2  end / end
> 
> Users should never create such a rule and expect it to work, because it doesn't meet the VXLAN RFC.
> If users want to match L3 VXLAN, they should read the documentation and configure the FW to get the correct result.
>
> > Any rule *accepted* by the PMD *must* follow the user request, otherwise it is a bug.
>
> I'd beg you to consider this from the user's perspective: the motivation of this design is to promote
> rte flow by replacing device parameters, yet now we are making flow usage awkward.

DPDK applications are generic: they can still create device-specific
rules, but as they have no knowledge of the underlying NIC, they must
query it through the flow API, and devices not supporting a flow must
answer with an error.

"Users should never" in reality the flow API is present for the exact
opposite.
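
A minimal sketch of that query path, assuming only the public rte_flow
API (pattern and actions built beforehand by the application):

#include <rte_flow.h>

/* Ask the PMD whether it can handle a rule before creating it. */
static int
rule_supported(uint16_t port_id,
	       const struct rte_flow_attr *attr,
	       const struct rte_flow_item pattern[],
	       const struct rte_flow_action actions[])
{
	struct rte_flow_error error;

	/* 0 when the device supports the rule, a negative errno
	 * otherwise; 'error' then points at the offending part. */
	return rte_flow_validate(port_id, attr, pattern, actions, &error);
}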

Coming back to this specific patch, you are only giving more arguments
to refuse it, and this for the sake of the "user".

I cannot accept this patch, as in some situations the user's request
cannot be offloaded by the hardware.

There are only two possibilities for me to accept such a feature in the
PMD:

1. the PMD can query the firmware and know whether such a feature is
   enabled (as it does for the (E)MPS and others) and thus refuse the
   flow if not.
2. a device parameter to enable such a feature from the PMD perspective
   (which does not mean it will be available from the hardware side).

I would prefer #1, but as you have mentioned, there is no possibility
for the Verbs team to provide such a solution so quickly.

That leaves solution #2.  Add it and I can accept the patch.
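
Assuming the usual mlx5 devargs syntax, such a parameter would then be
enabled at probe time, e.g. (placeholder PCI address; l3_vxlan_en is the
name eventually used in v5 of this series):

  testpmd -w 0000:05:00.0,l3_vxlan_en=1 -- -i

The firmware still has to be configured separately with mlxconfig.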

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v5 00/11] mlx5 Rx tunnel offloading
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-23 12:32       ` [PATCH v6 " Xueming Li
                         ` (11 more replies)
  2018-04-20 12:23     ` [PATCH v5 01/11] net/mlx5: support 16 hardware priorities Xueming Li
                       ` (10 subsequent siblings)
  11 siblings, 12 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Important note:
        this patchset relies on Adrien's flow API overhaul patchset:
        http://www.dpdk.org/ml/archives/dev/2018-April/098047.html

v5:
- Removed %lx prints
- Per review request, clear mbuf tunnel type in case of multiple tunnel types.
- Rebased on Adrien's flow API overhaul patchset
- Split feature requirement document into patches of L3 VXLAN and VXLAN-GPE
- Per review request, add device parameter to enable L3 VXLAN and VXLAN-GPE
v4:
- Fix RSS level according to value definition
- Add "Inner RSS" column to NIC feature doc
- Fixed flow creation error in case of ipv4 rss on ipv6 pattern
- New patch: enforce IP protocol of GRE to be 47.
- Removed MPLS-in-UDP and MPLS-in-GRE related patches
- Removed invalid RSS type check
v3:
- Refactor 16 Verbs priority detection.
- Other updates according to ML discussion.
v2:
- Split into 2 series: public api and mlx5, this one is the second.
- Rebased on Adrien's rte flow overhaul:
  http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
v1:
- Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
- Remove deprecation notes of rss level

This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
- Standard VXLAN
- L3 VXLAN (no inner ethernet header)
- VXLAN-GPE

Xueming Li (11):
  net/mlx5: support 16 hardware priorities
  net/mlx5: support GRE tunnel flow
  net/mlx5: support L3 VXLAN flow
  net/mlx5: support Rx tunnel type identification
  net/mlx5: cleanup tunnel checksum offloads
  net/mlx5: split flow RSS handling logic
  net/mlx5: support tunnel RSS level
  net/mlx5: add hardware flow debug dump
  net/mlx5: introduce VXLAN-GPE tunnel type
  net/mlx5: allow flow tunnel ID 0 with outer pattern
  doc: update mlx5 guide on tunnel offloading

 doc/guides/nics/features/default.ini  |   1 +
 doc/guides/nics/features/mlx5.ini     |   3 +
 doc/guides/nics/mlx5.rst              |  30 +-
 drivers/net/mlx5/Makefile             |   2 +-
 drivers/net/mlx5/mlx5.c               |  24 +
 drivers/net/mlx5/mlx5.h               |   6 +
 drivers/net/mlx5/mlx5_flow.c          | 844 +++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_glue.c          |  16 +
 drivers/net/mlx5/mlx5_glue.h          |   8 +
 drivers/net/mlx5/mlx5_rxq.c           |  89 +++-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
 drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
 drivers/net/mlx5/mlx5_trigger.c       |   8 -
 drivers/net/mlx5/mlx5_utils.h         |   6 +
 16 files changed, 896 insertions(+), 223 deletions(-)

-- 
2.13.3

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v5 01/11] net/mlx5: support 16 hardware priorities
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 02/11] net/mlx5: support GRE tunnel flow Xueming Li
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports the new 16 Verbs flow priorities by trying to create
a simple flow of priority 15. If 16 priorities are not available, it
falls back to the traditional 8 priorities.

Verbs priority mapping:
			8 priorities	>=16 priorities
Control flow:		4-7		8-15
User normal flow:	1-3		4-7
User tunnel flow:	0-2		0-3
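
As a worked example of this mapping (matching mlx5_flow_update_priority()
below): with 16 Verbs priorities, a control flow of attribute priority 1
starts at 1 * 8 = 8, the non-tunnel offset of 8 / 2 = 4 brings it to 12,
and each hash Rx queue type adds its own 0-2 offset, staying inside the
8-15 band; with 8 priorities the same flow starts at 1 * 8 / 2 = 4, gets
a non-tunnel offset of 1, and lands in 5-7 inside the 4-7 band.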

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |  18 +++++++
 drivers/net/mlx5/mlx5.h         |   5 ++
 drivers/net/mlx5/mlx5_flow.c    | 113 +++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_trigger.c |   8 ---
 4 files changed, 116 insertions(+), 28 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 68783c3ac..5a0b8de85 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		priv->txqs_n = 0;
 		priv->txqs = NULL;
 	}
+	mlx5_flow_delete_drop_queue(dev);
 	if (priv->pd != NULL) {
 		assert(priv->ctx != NULL);
 		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
@@ -619,6 +620,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	unsigned int mps;
 	unsigned int cqe_comp;
 	unsigned int tunnel_en = 0;
+	unsigned int verb_priorities = 0;
 	int idx;
 	int i;
 	struct mlx5dv_context attrs_out = {0};
@@ -1006,6 +1008,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		mlx5_link_update(eth_dev, 0);
 		/* Store device configuration on private structure. */
 		priv->config = config;
+		/* Create drop queue. */
+		err = mlx5_flow_create_drop_queue(eth_dev);
+		if (err) {
+			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
+				eth_dev->data->port_id, strerror(rte_errno));
+			goto port_error;
+		}
+		/* Supported Verbs flow priority number detection. */
+		if (verb_priorities == 0)
+			verb_priorities = mlx5_get_max_verbs_prio(eth_dev);
+		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
+			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
+				eth_dev->data->port_id, verb_priorities);
+			goto port_error;
+		}
+		priv->config.max_verbs_prio = verb_priorities;
 		continue;
 port_error:
 		if (priv)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6ad41390a..670f6860f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -89,6 +89,7 @@ struct mlx5_dev_config {
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
+	unsigned int max_verbs_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
 	int txq_inline; /* Maximum packet size for inlining. */
@@ -105,6 +106,9 @@ enum mlx5_verbs_alloc_type {
 	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
 };
 
+/* 8 Verbs priorities. */
+#define MLX5_VERBS_FLOW_PRIO_8 8
+
 /**
  * Verbs allocator needs a context to know in the callback which kind of
  * resources it is allocating.
@@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
 
 /* mlx5_flow.c */
 
+unsigned int mlx5_get_max_verbs_prio(struct rte_eth_dev *dev);
 int mlx5_flow_validate(struct rte_eth_dev *dev,
 		       const struct rte_flow_attr *attr,
 		       const struct rte_flow_item items[],
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index e6c8b3df8..5402cb148 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -31,8 +31,8 @@
 #include "mlx5_prm.h"
 #include "mlx5_glue.h"
 
-/* Define minimal priority for control plane flows. */
-#define MLX5_CTRL_FLOW_PRIORITY 4
+/* Flow priority for control plane flows. */
+#define MLX5_CTRL_FLOW_PRIORITY 1
 
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
@@ -128,7 +128,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_UDPV4] = {
@@ -137,7 +137,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_IPV4] = {
@@ -145,7 +145,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV4),
 		.dpdk_rss_hf = (ETH_RSS_IPV4 |
 				ETH_RSS_FRAG_IPV4),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_TCPV6] = {
@@ -154,7 +154,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_UDPV6] = {
@@ -163,7 +163,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_IPV6] = {
@@ -171,13 +171,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV6),
 		.dpdk_rss_hf = (ETH_RSS_IPV6 |
 				ETH_RSS_FRAG_IPV6),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_ETH] = {
 		.hash_fields = 0,
 		.dpdk_rss_hf = 0,
-		.flow_priority = 3,
+		.flow_priority = 2,
 	},
 };
 
@@ -899,30 +899,50 @@ mlx5_flow_convert_allocate(unsigned int size, struct rte_flow_error *error)
  * Make inner packet matching with an higher priority from the non Inner
  * matching.
  *
+ * @param dev
+ *   Pointer to Ethernet device.
  * @param[in, out] parser
  *   Internal parser structure.
  * @param attr
  *   User flow attribute.
  */
 static void
-mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
+mlx5_flow_update_priority(struct rte_eth_dev *dev,
+			  struct mlx5_flow_parse *parser,
 			  const struct rte_flow_attr *attr)
 {
+	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
+	uint16_t priority;
 
+	/*			8 priorities	>= 16 priorities
+	 * Control flow:	4-7		8-15
+	 * User normal flow:	1-3		4-7
+	 * User tunnel flow:	0-2		0-3
+	 */
+	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
+	if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+		priority /= 2;
+	/*
+	 * Lower non-tunnel flow Verbs priority 1 if only support 8 Verbs
+	 * priorities, lower 4 otherwise.
+	 */
+	if (!parser->inner) {
+		if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+			priority += 1;
+		else
+			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
+	}
 	if (parser->drop) {
-		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
-			attr->priority +
-			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
+		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
+				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
 		return;
 	}
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		if (parser->queue[i].ibv_attr) {
-			parser->queue[i].ibv_attr->priority =
-				attr->priority +
-				hash_rxq_init[i].flow_priority -
-				(parser->inner ? 1 : 0);
-		}
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		parser->queue[i].ibv_attr->priority = priority +
+				hash_rxq_init[i].flow_priority;
 	}
 }
 
@@ -1157,7 +1177,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	 */
 	if (!parser->drop)
 		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(parser, attr);
+	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
@@ -3158,3 +3178,56 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 	}
 	return 0;
 }
+
+/**
+ * Detect number of Verbs flow priorities supported.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   number of supported Verbs flow priority.
+ */
+unsigned int
+mlx5_get_max_verbs_prio(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
+	struct {
+		struct ibv_flow_attr attr;
+		struct ibv_flow_spec_eth eth;
+		struct ibv_flow_spec_action_drop drop;
+	} flow_attr = {
+		.attr = {
+			.num_of_specs = 2,
+		},
+		.eth = {
+			.type = IBV_FLOW_SPEC_ETH,
+			.size = sizeof(struct ibv_flow_spec_eth),
+		},
+		.drop = {
+			.size = sizeof(struct ibv_flow_spec_action_drop),
+			.type = IBV_FLOW_SPEC_ACTION_DROP,
+		},
+	};
+	struct ibv_flow *flow;
+
+	do {
+		flow_attr.attr.priority = verb_priorities - 1;
+		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
+					      &flow_attr.attr);
+		if (flow) {
+			claim_zero(mlx5_glue->destroy_flow(flow));
+			/* Try more priorities. */
+			verb_priorities *= 2;
+		} else {
+			/* Failed, restore last right number. */
+			verb_priorities /= 2;
+			break;
+		}
+	} while (1);
+	DRV_LOG(DEBUG, "port %u Verbs flow priorities: %d,"
+		" user flow priorities: %d",
+		dev->data->port_id, verb_priorities, MLX5_CTRL_FLOW_PRIORITY);
+	return verb_priorities;
+}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ee08c5677..fc56d1ee8 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	int ret;
 
 	dev->data->dev_started = 1;
-	ret = mlx5_flow_create_drop_queue(dev);
-	if (ret) {
-		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
 	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
 		dev->data->port_id);
 	rte_mempool_walk(mlx5_mp2mr_iter, priv);
@@ -202,7 +196,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	mlx5_traffic_disable(dev);
 	mlx5_txq_stop(dev);
 	mlx5_rxq_stop(dev);
-	mlx5_flow_delete_drop_queue(dev);
 	rte_errno = ret; /* Restore rte_errno. */
 	return -rte_errno;
 }
@@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	mlx5_rxq_stop(dev);
 	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
 		mlx5_mr_release(mr);
-	mlx5_flow_delete_drop_queue(dev);
 }
 
 /**
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 02/11] net/mlx5: support GRE tunnel flow
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
  2018-04-20 12:23     ` [PATCH v5 01/11] net/mlx5: support 16 hardware priorities Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 101 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 94 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 5402cb148..b365f9868 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -37,6 +37,7 @@
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
 #define MLX5_IPV6 6
+#define MLX5_GRE 47
 
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_flow_spec_counter_action {
@@ -89,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
 
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item,
+		     const void *default_mask,
+		     struct mlx5_flow_data *data);
+
 struct mlx5_flow_parse;
 
 static void
@@ -231,6 +237,10 @@ struct rte_flow {
 		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
 	}
 
+#define IS_TUNNEL(type) ( \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_GRE)
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -284,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
 static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
-			       RTE_FLOW_ITEM_TYPE_VXLAN),
+			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
@@ -316,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV4] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv4){
 			.hdr = {
@@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV6] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv6){
 			.hdr = {
@@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_tcp,
 		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
 	},
+	[RTE_FLOW_ITEM_TYPE_GRE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_gre){
+			.protocol = -1,
+		},
+		.default_mask = &rte_flow_item_gre_mask,
+		.mask_sz = sizeof(struct rte_flow_item_gre),
+		.convert = mlx5_flow_create_gre,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
 		.actions = valid_actions,
@@ -401,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 
 /** Structure to pass to the conversion function. */
 struct mlx5_flow_parse {
-	uint32_t inner; /**< Set once VXLAN is encountered. */
+	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
 	uint32_t create:1;
 	/**< Whether resources should remain after a validate. */
 	uint32_t drop:1; /**< Target is a drop queue. */
@@ -829,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 					      cur_item->mask_sz);
 		if (ret)
 			goto exit_item_not_supported;
-		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+		if (IS_TUNNEL(items->type)) {
 			if (parser->inner) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
-						   "cannot recognize multiple"
-						   " VXLAN encapsulations");
+						   "Cannot recognize multiple"
+						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
@@ -1641,6 +1667,67 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert GRE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
+		     const void *default_mask __rte_unused,
+		     struct mlx5_flow_data *data)
+{
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel tunnel = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	struct ibv_flow_spec_ipv4_ext *ipv4;
+	struct ibv_flow_spec_ipv6 *ipv6;
+	unsigned int i;
+
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	/* Update encapsulation IP layer protocol. */
+	for (i = 0; i != hash_rxq_init_n; ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		if (parser->out_layer == HASH_RXQ_IPV4) {
+			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv4_ext));
+			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
+				break;
+			ipv4->val.proto = MLX5_GRE;
+			ipv4->mask.proto = 0xff;
+		} else if (parser->out_layer == HASH_RXQ_IPV6) {
+			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv6));
+			if (ipv6->mask.next_hdr &&
+			    ipv6->val.next_hdr != MLX5_GRE)
+				break;
+			ipv6->val.next_hdr = MLX5_GRE;
+			ipv6->mask.next_hdr = 0xff;
+		}
+	}
+	if (i != hash_rxq_init_n)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "IP protocol of GRE must be 47");
+	mlx5_flow_create_copy(parser, &tunnel, size);
+	return 0;
+}
+
+/**
  * Convert mark/flag action to Verbs specification.
  *
  * @param parser
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (2 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 02/11] net/mlx5: support GRE tunnel flow Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports L3 VXLAN, which carries no inner L2 header compared
to the standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP
destination port to discriminate against standard VXLAN; a device
parameter and the FW have to be configured to support it:
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/mlx5.rst     | 26 ++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.c      |  6 ++++++
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_flow.c | 26 +++++++++++++++++++++++++-
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c28c83278..421274729 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -327,6 +327,32 @@ Run-time configuration
 
   Enabled by default, valid only on VF devices ignored otherwise.
 
+- ``l3_vxlan_en`` parameter [int]
+
+  A nonzero value allows L3 VXLAN flow creation. To enable L3 VXLAN, users
+  have to configure the firmware and enable this parameter. This is a
+  prerequisite to receive this kind of traffic.
+
+  Disabled by default.
+
+Firmware configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- L3 VXLAN destination UDP port
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_EN=1
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_PORT=<udp dport>
+
+  Verify configurations are set:
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> query | grep IP_OVER_VXLAN
+     IP_OVER_VXLAN_EN                    True(1)
+     IP_OVER_VXLAN_PORT                  <udp dport>
+
 Prerequisites
 -------------
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5a0b8de85..d19981c60 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
 /* Device parameter to enable hardware Rx vector. */
 #define MLX5_RX_VEC_EN "rx_vec_en"
 
+/* Allow L3 VXLAN flow creation. */
+#define MLX5_L3_VXLAN_EN "l3_vxlan_en"
+
 /* Activate Netlink support in VF mode. */
 #define MLX5_VF_NL_EN "vf_nl_en"
 
@@ -416,6 +419,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->tx_vec_en = !!tmp;
 	} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
 		config->rx_vec_en = !!tmp;
+	} else if (strcmp(MLX5_L3_VXLAN_EN, key) == 0) {
+		config->l3_vxlan_en = !!tmp;
 	} else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
 		config->vf_nl_en = !!tmp;
 	} else {
@@ -449,6 +454,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_TXQ_MAX_INLINE_LEN,
 		MLX5_TX_VEC_EN,
 		MLX5_RX_VEC_EN,
+		MLX5_L3_VXLAN_EN,
 		MLX5_VF_NL_EN,
 		NULL,
 	};
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 670f6860f..c1a65257e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
 	unsigned int tx_vec_en:1; /* Tx vector is enabled. */
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+	unsigned int l3_vxlan_en:1; /* Enable L3 VXLAN flow creation. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
 	unsigned int max_verbs_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index b365f9868..20111383e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -51,6 +51,7 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate;
 
 /** Structure give to the conversion functions. */
 struct mlx5_flow_data {
+	struct rte_eth_dev *dev; /** Ethernet device. */
 	struct mlx5_flow_parse *parser; /** Parser context. */
 	struct rte_flow_error *error; /** Error context. */
 };
@@ -413,7 +414,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
+			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_vxlan){
 			.vni = "\xff\xff\xff",
@@ -1175,6 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	parser->inner = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
+			.dev = dev,
 			.parser = parser,
 			.error = error,
 		};
@@ -1396,6 +1400,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		      const void *default_mask,
 		      struct mlx5_flow_data *data)
 {
+	struct priv *priv = data->dev->data->dev_private;
 	const struct rte_flow_item_ipv4 *spec = item->spec;
 	const struct rte_flow_item_ipv4 *mask = item->mask;
 	struct mlx5_flow_parse *parser = data->parser;
@@ -1405,6 +1410,15 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		.size = ipv4_size,
 	};
 
+	if (parser->layer == HASH_RXQ_TUNNEL &&
+	    parser->tunnel == ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] &&
+	    !priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
 	/* Don't update layer for the inner pattern. */
 	if (!parser->inner)
 		parser->layer = HASH_RXQ_IPV4;
@@ -1451,6 +1465,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		      const void *default_mask,
 		      struct mlx5_flow_data *data)
 {
+	struct priv *priv = data->dev->data->dev_private;
 	const struct rte_flow_item_ipv6 *spec = item->spec;
 	const struct rte_flow_item_ipv6 *mask = item->mask;
 	struct mlx5_flow_parse *parser = data->parser;
@@ -1460,6 +1475,15 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		.size = ipv6_size,
 	};
 
+	if (parser->layer == HASH_RXQ_TUNNEL &&
+	    parser->tunnel == ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] &&
+	    !priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
 	/* Don't update layer for the inner pattern. */
 	if (!parser->inner)
 		parser->layer = HASH_RXQ_IPV6;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (3 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-23  7:40       ` Nélio Laranjeiro
  2018-04-20 12:23     ` [PATCH v5 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
                       ` (6 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch introduces tunnel type identification based on flow rules.
If flows of multiple tunnel types are built on the same queue,
RTE_PTYPE_TUNNEL_MASK will be returned; the user application can then
use bits in the flow mark as a tunnel type identifier.
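
On the application side, a minimal sketch of consuming this, assuming
the rules also carry a MARK action whose encoding is chosen by the
application:

#include <rte_mbuf.h>

/* Recover the tunnel type of a received packet.  When rules of
 * several tunnel types share a queue the PMD reports
 * RTE_PTYPE_TUNNEL_MASK, and the application falls back to the flow
 * mark it attached itself. */
static uint32_t
tunnel_type_of(const struct rte_mbuf *m)
{
	uint32_t tun = m->packet_type & RTE_PTYPE_TUNNEL_MASK;

	if (tun != RTE_PTYPE_TUNNEL_MASK)
		return tun; /* Single tunnel type, reported directly. */
	if (m->ol_flags & PKT_RX_FDIR_ID)
		return m->hash.fdir.hi; /* Application-defined mark. */
	return 0; /* Unknown. */
}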

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          | 135 +++++++++++++++++++++++++++++-----
 drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
 drivers/net/mlx5/mlx5_rxtx.c          |  12 ++-
 drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 ++++--
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 ++++-
 6 files changed, 167 insertions(+), 38 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 20111383e..fa1487d29 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -226,6 +226,7 @@ struct rte_flow {
 	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
 	uint16_t (*queues)[]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
 	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats. */
 	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
@@ -242,6 +243,19 @@ struct rte_flow {
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
+const uint32_t flow_ptype[] = {
+	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
+};
+
+#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
+
+const uint32_t ptype_ext[] = {
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
+					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
+};
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -441,6 +455,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
 		struct ibv_flow_attr *ibv_attr;
@@ -859,7 +874,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 		if (ret)
 			goto exit_item_not_supported;
 		if (IS_TUNNEL(items->type)) {
-			if (parser->inner) {
+			if (parser->tunnel) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
@@ -868,6 +883,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
+			parser->tunnel = flow_ptype[items->type];
 		}
 		if (parser->drop) {
 			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
@@ -1176,6 +1192,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	}
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
+	parser->tunnel = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.dev = dev,
@@ -1663,6 +1680,7 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1719,6 +1737,7 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 	unsigned int i;
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1925,7 +1944,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
@@ -1934,7 +1954,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -1946,6 +1967,48 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 }
 
 /**
+ * RXQ update after flow rule creation.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the flow rule.
+ */
+static void
+mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+	unsigned int j;
+
+	if (!dev->data->dev_started)
+		return;
+	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		rxq_data->mark |= flow->mark;
+		if (!tunnel)
+			continue;
+		rxq_ctrl->tunnel_types[tunnel] += 1;
+		/* Clear tunnel type if more than one tunnel types set. */
+		for (j = 0; j != RTE_DIM(rxq_ctrl->tunnel_types); ++j) {
+			if (j == tunnel)
+				continue;
+			if (rxq_ctrl->tunnel_types[j] > 0) {
+				rxq_data->tunnel = 0;
+				break;
+			}
+		}
+		if (j == RTE_DIM(rxq_ctrl->tunnel_types))
+			rxq_data->tunnel = flow->tunnel;
+	}
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2005,12 +2068,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 				   NULL, "internal error in flow creation");
 		goto error;
 	}
-	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
-		struct mlx5_rxq_data *q =
-			(*priv->rxqs)[parser->rss_conf.queue[i]];
-
-		q->mark |= parser->mark;
-	}
+	mlx5_flow_create_update_rxqs(dev, flow);
 	return 0;
 error:
 	ret = rte_errno; /* Save rte_errno before cleanup. */
@@ -2083,6 +2141,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
 	}
 	/* Copy configuration. */
 	flow->queues = (uint16_t (*)[])(flow + 1);
+	flow->tunnel = parser.tunnel;
 	flow->rss_conf = (struct rte_flow_action_rss){
 		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
 		.level = 0,
@@ -2174,9 +2233,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
 	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
 
-	if (flow->drop || !flow->mark)
+	if (flow->drop || !dev->data->dev_started)
 		goto free;
-	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
+		/* Update queue tunnel type. */
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
+		rxq_ctrl->tunnel_types[tunnel] -= 1;
+		if (!rxq_ctrl->tunnel_types[tunnel]) {
+			/* Update tunnel type. */
+			uint8_t j;
+			uint8_t types = 0;
+			uint8_t last;
+
+			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
+				if (rxq_ctrl->tunnel_types[j]) {
+					types += 1;
+					last = j;
+				}
+			/* Keep same if more than one tunnel types left. */
+			if (types == 1)
+				rxq_data->tunnel = ptype_ext[last];
+			else if (types == 0)
+				/* No tunnel type left. */
+				rxq_data->tunnel = 0;
+		}
+	}
+	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
 		struct rte_flow *tmp;
 		int mark = 0;
 
@@ -2395,9 +2483,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct rte_flow *flow;
+	unsigned int i;
 
 	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
-		unsigned int i;
 		struct mlx5_ind_table_ibv *ind_tbl = NULL;
 
 		if (flow->drop) {
@@ -2443,6 +2531,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
 			(void *)flow);
 	}
+	/* Cleanup Rx queue tunnel info. */
+	for (i = 0; i != priv->rxqs_n; ++i) {
+		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(q, struct mlx5_rxq_ctrl, rxq);
+
+		if (!q)
+			continue;
+		memset((void *)rxq_ctrl->tunnel_types, 0,
+		       sizeof(rxq_ctrl->tunnel_types));
+		q->tunnel = 0;
+	}
 }
 
 /**
@@ -2490,7 +2590,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
@@ -2498,7 +2599,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
@@ -2520,10 +2622,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			DRV_LOG(DEBUG, "port %u flow %p applied",
 				dev->data->port_id, (void *)flow);
 		}
-		if (!flow->mark)
-			continue;
-		for (i = 0; i != flow->rss_conf.queue_num; ++i)
-			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
+		mlx5_flow_create_update_rxqs(dev, flow);
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 18ad40813..1fbd02aa0 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->qp = qp;
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
+	hrxq->tunnel = tunnel;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->hash_fields != hash_fields)
 			continue;
+		if (hrxq->tunnel != tunnel)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 05fe10918..fafac514b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -34,7 +34,7 @@
 #include "mlx5_prm.h"
 
 static __rte_always_inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
 
 static __rte_always_inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
@@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
 	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_L4_UDP;
 	/* Tunneled - L3 */
+	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
 	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
 	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
+	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
 	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
@@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 /**
  * Translate RX completion flags to packet type.
  *
+ * @param[in] rxq
+ *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   Packet type for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 {
 	uint8_t idx;
 	uint8_t pinfo = cqe->pkt_info;
@@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
 	 * bit[7] = outer_l3_type
 	 */
 	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
-	return mlx5_ptype_table[idx];
+	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
 }
 
 /**
@@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
 			/* Update packet information. */
-			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
+			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
 			pkt->ol_flags = 0;
 			if (rss_hash_res && rxq->rss_hash) {
 				pkt->hash.rss = rss_hash_res;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index ee534c340..676ad6a9a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -104,6 +104,7 @@ struct mlx5_rxq_data {
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
 	uint8_t cq_arm_sn; /* CQ arm seq number. */
+	uint32_t tunnel; /* Tunnel information. */
 } __rte_cache_aligned;
 
 /* Verbs Rx queue elements. */
@@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
 	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
 	struct mlx5_rxq_data rxq; /* Data path structure. */
 	unsigned int socket; /* CPU socket ID for allocations. */
+	uint32_t tunnel_types[16]; /* Tunnel type counter. */
 	unsigned int irq:1; /* Whether IRQ is enabled. */
 	uint16_t idx; /* Queue index. */
 };
@@ -145,6 +147,7 @@ struct mlx5_hrxq {
 	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
+	uint32_t tunnel; /* Tunnel type. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
 struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 84817e7ad..d21e99f68 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
 	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	if (rxq->mark) {
 		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
@@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	ptype = vshrn_n_u32(ptype_info, 10);
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	ptype = vorr_u16(ptype, op_err);
-	pkts[0]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
-	pkts[1]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
-	pkts[2]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
-	pkts[3]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
+	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
+	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
+	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
+	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
 	pinfo = vreinterpretq_u32_u8(
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 83d6e431f..4a6789a78 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	const __m128i mbuf_init =
 		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	/* Extract pkt_info field. */
 	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
@@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	op_err = _mm_srli_epi16(op_err, 8);
 	ptype = _mm_or_si128(ptype, op_err);
-	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
-	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
-	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
-	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
+	pt_idx0 = _mm_extract_epi8(ptype, 0);
+	pt_idx1 = _mm_extract_epi8(ptype, 2);
+	pt_idx2 = _mm_extract_epi8(ptype, 4);
+	pt_idx3 = _mm_extract_epi8(ptype, 6);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
 	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 05/11] net/mlx5: cleanup tunnel checksum offloads
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (4 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 06/11] net/mlx5: split flow RSS handling logic Xueming Li
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch cleans up the tunnel checksum offloads.

Once a tunnel packet type (RTE_PTYPE_TUNNEL_xxx) is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum
result of the inner headers; the outer L3 and L4 header checksums are
always valid as soon as a tunnel is identified. If no tunnel is
identified, PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the
checksum result of the outer L3 and L4 headers.
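
As an illustration (a minimal C fragment, not part of this patch;
handle_inner()/handle_outer() are hypothetical application helpers), an
application could consume these flags as follows:

	/* pkt: struct rte_mbuf * returned by rte_eth_rx_burst(). */
	int ip_csum_ok = !!(pkt->ol_flags & PKT_RX_IP_CKSUM_GOOD);

	if (RTE_ETH_IS_TUNNEL_PKT(pkt->packet_type))
		/* Flags describe the inner headers; the outer L3/L4
		 * checksums are implicitly valid. */
		handle_inner(pkt, ip_csum_ok);
	else
		/* Flags describe the outer (only) headers. */
		handle_outer(pkt, ip_csum_ok);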

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 --
 drivers/net/mlx5/mlx5_rxtx.c | 18 ++++--------------
 drivers/net/mlx5/mlx5_rxtx.h |  1 -
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1fbd02aa0..6756f25fa 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1045,8 +1045,6 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	}
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM);
-	tmpl->rxq.csum_l2tun = (!!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM) &&
-				priv->config.tunnel_en);
 	tmpl->rxq.hw_timestamp = !!(conf->offloads & DEV_RX_OFFLOAD_TIMESTAMP);
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(conf->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fafac514b..060ff0e85 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -41,7 +41,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 		 uint16_t cqe_cnt, uint32_t *rss_hash);
 
 static __rte_always_inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
 	[0xff] = RTE_PTYPE_ALL_MASK, /* Last entry for errored packet. */
@@ -1728,8 +1728,6 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 /**
  * Translate RX completion flags to offload flags.
  *
- * @param[in] rxq
- *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1737,7 +1735,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
  *   Offload flags (ol_flags) for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe)
 {
 	uint32_t ol_flags = 0;
 	uint16_t flags = rte_be_to_cpu_16(cqe->hdr_type_etc);
@@ -1749,14 +1747,6 @@ rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 		TRANSPOSE(flags,
 			  MLX5_CQE_RX_L4_HDR_VALID,
 			  PKT_RX_L4_CKSUM_GOOD);
-	if ((cqe->pkt_info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
-		ol_flags |=
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L3_HDR_VALID,
-				  PKT_RX_IP_CKSUM_GOOD) |
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L4_HDR_VALID,
-				  PKT_RX_L4_CKSUM_GOOD);
 	return ol_flags;
 }
 
@@ -1855,8 +1845,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						mlx5_flow_mark_get(mark);
 				}
 			}
-			if (rxq->csum | rxq->csum_l2tun)
-				pkt->ol_flags |= rxq_cq_to_ol_flags(rxq, cqe);
+			if (rxq->csum)
+				pkt->ol_flags |= rxq_cq_to_ol_flags(cqe);
 			if (rxq->vlan_strip &&
 			    (cqe->hdr_type_etc &
 			     rte_cpu_to_be_16(MLX5_CQE_VLAN_STRIPPED))) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 676ad6a9a..188fd65c5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -77,7 +77,6 @@ struct rxq_zip {
 /* RX queue descriptor. */
 struct mlx5_rxq_data {
 	unsigned int csum:1; /* Enable checksum offloading. */
-	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_timestamp:1; /* Enable HW timestamp. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	unsigned int crc_present:1; /* CRC must be subtracted. */
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 06/11] net/mlx5: split flow RSS handling logic
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (5 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 07/11] net/mlx5: support tunnel RSS level Xueming Li
                       ` (4 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch splits the flow RSS hash field handling logic out into a
dedicated function.
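
After the split, the non-drop path of mlx5_flow_convert() boils down to
the following (simplified sketch of the resulting control flow):

	/* RSS check, remove unused hash types. */
	ret = mlx5_flow_convert_rss(parser);
	if (ret)
		goto exit_free;
	/* Complete missing specification. */
	mlx5_flow_convert_finalise(parser);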

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 127 +++++++++++++++++++++++--------------------
 1 file changed, 69 insertions(+), 58 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index fa1487d29..c2e57094e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1000,59 +1000,8 @@ mlx5_flow_update_priority(struct rte_eth_dev *dev,
 static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
 
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-		return;
-	}
-	if (parser->layer == HASH_RXQ_ETH) {
-		goto fill;
-	} else {
-		/*
-		 * This layer becomes useless as the pattern define under
-		 * layers.
-		 */
-		rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-		parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	}
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if ((i == parser->layer) ||
-			     (!parser->queue[i].ibv_attr))
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-	} else  if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
-		return;
-	}
-fill:
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1115,6 +1064,66 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 }
 
 /**
+ * Update flows according to pattern and RSS hash fields.
+ *
+ * @param[in, out] parser
+ *   Internal parser structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
+{
+	const unsigned int ipv4 =
+		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
+	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
+	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
+	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
+	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	unsigned int i;
+
+	/* Remove any other flow not matching the pattern. */
+	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (i == HASH_RXQ_ETH)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+		return 0;
+	}
+	if (parser->layer == HASH_RXQ_ETH)
+		return 0;
+	/* This layer becomes useless as the pattern defines lower layers. */
+	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
+	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
+	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
+	for (i = ohmin; i != (ohmax + 1); ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		rte_free(parser->queue[i].ibv_attr);
+		parser->queue[i].ibv_attr = NULL;
+	}
+	/* Remove impossible flow according to the RSS configuration. */
+	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
+	    parser->rss_conf.types) {
+		/* Remove any other flow. */
+		for (i = hmin; i != (hmax + 1); ++i) {
+			if (i == parser->layer || !parser->queue[i].ibv_attr)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+	} else if (!parser->queue[ip].ibv_attr) {
+		/* no RSS possible with the current configuration. */
+		parser->rss_conf.queue_num = 1;
+	}
+	return 0;
+}
+
+/**
  * Validate and convert a flow supported by the NIC.
  *
  * @param dev
@@ -1211,6 +1220,15 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (ret)
 			goto exit_free;
 	}
+	if (!parser->drop) {
+		/* RSS check, remove unused hash types. */
+		ret = mlx5_flow_convert_rss(parser);
+		if (ret)
+			goto exit_free;
+		/* Complete missing specification. */
+		mlx5_flow_convert_finalise(parser);
+	}
+	mlx5_flow_update_priority(dev, parser, attr);
 	if (parser->mark)
 		mlx5_flow_create_flag_mark(parser, parser->mark_id);
 	if (parser->count && parser->create) {
@@ -1218,13 +1235,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (!parser->cs)
 			goto exit_count_error;
 	}
-	/*
-	 * Last step. Complete missing specification to reach the RSS
-	 * configuration.
-	 */
-	if (!parser->drop)
-		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 07/11] net/mlx5: support tunnel RSS level
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (6 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 06/11] net/mlx5: split flow RSS handling logic Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 08/11] net/mlx5: add hardware flow debug dump Xueming Li
                       ` (3 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Tunnel RSS level of the flow RSS action offers users a choice to do the
RSS hash calculation on inner or outer fields. Testpmd flow command
examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 2 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end
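
The same inner-RSS request expressed through the rte_flow C API (a
minimal fragment; the queue list and RSS type selection are
illustrative only, not mandated by this patch):

	uint16_t queues[] = { 1, 2 };
	struct rte_flow_action_rss rss = {
		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
		.level = 2, /* 0 = PMD default, 1 = outer, 2 = inner. */
		.types = ETH_RSS_IP,
		.queue_num = 2,
		.queue = queues,
	};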

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/Makefile    |   2 +-
 drivers/net/mlx5/mlx5_flow.c | 257 +++++++++++++++++++++++++++----------------
 drivers/net/mlx5/mlx5_glue.c |  16 +++
 drivers/net/mlx5/mlx5_glue.h |   8 ++
 drivers/net/mlx5/mlx5_rxq.c  |  58 +++++++++-
 drivers/net/mlx5/mlx5_rxtx.h |   5 +-
 6 files changed, 240 insertions(+), 106 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index b710a10f5..d9447ace9 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -8,7 +8,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_pmd_mlx5.a
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
-LIB_GLUE_VERSION = 18.02.0
+LIB_GLUE_VERSION = 18.05.0
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c2e57094e..174f2ba6e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -117,6 +117,7 @@ enum hash_rxq_type {
 	HASH_RXQ_UDPV6,
 	HASH_RXQ_IPV6,
 	HASH_RXQ_ETH,
+	HASH_RXQ_TUNNEL,
 };
 
 /* Initialization data for hash RX queue. */
@@ -455,6 +456,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	enum hash_rxq_type out_layer; /**< Last outer pattern layer detected. */
 	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
@@ -462,6 +464,7 @@ struct mlx5_flow_parse {
 		/**< Pointer to Verbs attributes. */
 		unsigned int offset;
 		/**< Current position or total size of the attribute. */
+		uint64_t hash_fields; /**< Verbs hash fields. */
 	} queue[RTE_DIM(hash_rxq_init)];
 };
 
@@ -697,7 +700,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " function is Toeplitz");
 				return -rte_errno;
 			}
-			if (rss->level) {
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+			if (parser->rss_conf.level > 1) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
 						   actions,
@@ -705,6 +709,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " level is not supported");
 				return -rte_errno;
 			}
+#endif
+			if (parser->rss_conf.level > 2) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION,
+						   actions,
+						   "RSS encapsulation level"
+						   " > 1 is not supported");
+				return -rte_errno;
+			}
 			if (rss->types & MLX5_RSS_HF_MASK) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
@@ -755,7 +768,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 			}
 			parser->rss_conf = (struct rte_flow_action_rss){
 				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
-				.level = 0,
+				.level = rss->level,
 				.types = rss->types,
 				.key_len = rss_key_len,
 				.queue_num = rss->queue_num,
@@ -839,10 +852,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
+mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
+				 const struct rte_flow_item items[],
 				 struct rte_flow_error *error,
 				 struct mlx5_flow_parse *parser)
 {
+	struct priv *priv = dev->data->dev_private;
 	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
 	unsigned int i;
 	int ret = 0;
@@ -882,6 +897,14 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
+			if (!priv->config.tunnel_en &&
+			    parser->rss_conf.level > 1) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ITEM,
+					items,
+					"RSS on tunnel is not supported");
+				return -rte_errno;
+			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
 			parser->tunnel = flow_ptype[items->type];
 		}
@@ -1001,7 +1024,11 @@ static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
 	unsigned int i;
+	uint32_t inner = parser->inner;
 
+	/* Don't create extra flows for outer RSS. */
+	if (parser->tunnel && parser->rss_conf.level < 2)
+		return;
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1012,23 +1039,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 			struct ibv_flow_spec_ipv4_ext ipv4;
 			struct ibv_flow_spec_ipv6 ipv6;
 			struct ibv_flow_spec_tcp_udp udp_tcp;
+			struct ibv_flow_spec_eth eth;
 		} specs;
 		void *dst;
 		uint16_t size;
 
 		if (i == parser->layer)
 			continue;
-		if (parser->layer == HASH_RXQ_ETH) {
+		if (parser->layer == HASH_RXQ_ETH ||
+		    parser->layer == HASH_RXQ_TUNNEL) {
 			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
 				size = sizeof(struct ibv_flow_spec_ipv4_ext);
 				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
-					.type = IBV_FLOW_SPEC_IPV4_EXT,
+					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
 					.size = size,
 				};
 			} else {
 				size = sizeof(struct ibv_flow_spec_ipv6);
 				specs.ipv6 = (struct ibv_flow_spec_ipv6){
-					.type = IBV_FLOW_SPEC_IPV6,
+					.type = inner | IBV_FLOW_SPEC_IPV6,
 					.size = size,
 				};
 			}
@@ -1045,7 +1074,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
 			size = sizeof(struct ibv_flow_spec_tcp_udp);
 			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
-				.type = ((i == HASH_RXQ_UDPV4 ||
+				.type = inner | ((i == HASH_RXQ_UDPV4 ||
 					  i == HASH_RXQ_UDPV6) ?
 					 IBV_FLOW_SPEC_UDP :
 					 IBV_FLOW_SPEC_TCP),
@@ -1075,50 +1104,93 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 static int
 mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
-
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
+	enum hash_rxq_type start;
+	enum hash_rxq_type layer;
+	int outer = parser->tunnel && parser->rss_conf.level < 2;
+	uint64_t rss = parser->rss_conf.types;
+
+	/* Default to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	layer = outer ? parser->out_layer : parser->layer;
+	if (layer == HASH_RXQ_TUNNEL)
+		layer = HASH_RXQ_ETH;
+	if (outer) {
+		/* Only one hash type for outer RSS. */
+		if (rss && layer == HASH_RXQ_ETH) {
+			start = HASH_RXQ_TCPV4;
+		} else if (rss && layer != HASH_RXQ_ETH &&
+			   !(rss & hash_rxq_init[layer].dpdk_rss_hf)) {
+			/* If RSS does not match L4 pattern, try L3 RSS. */
+			if (layer < HASH_RXQ_IPV4)
+				layer = HASH_RXQ_IPV4;
+			else if (layer > HASH_RXQ_IPV4 && layer < HASH_RXQ_IPV6)
+				layer = HASH_RXQ_IPV6;
+			start = layer;
+		} else {
+			start = layer;
+		}
+		/* Scan first valid hash type. */
+		for (i = start; rss && i <= layer; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss)
+				break;
 		}
-		return 0;
-	}
-	if (parser->layer == HASH_RXQ_ETH)
-		return 0;
-	/* This layer becomes useless as the pattern defines lower layers. */
-	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if (i == parser->layer || !parser->queue[i].ibv_attr)
+		if (rss && i <= layer)
+			parser->queue[layer].hash_fields =
+					hash_rxq_init[i].hash_fields;
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (parser->queue[i].ibv_attr && i != layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			}
+		}
+	} else {
+		/* Expand for inner or normal RSS. */
+		if (rss && (layer == HASH_RXQ_ETH || layer == HASH_RXQ_IPV4))
+			start = HASH_RXQ_TCPV4;
+		else if (rss && layer == HASH_RXQ_IPV6)
+			start = HASH_RXQ_TCPV6;
+		else
+			start = layer;
+		/* For L4 pattern, try L3 RSS if no L4 RSS. */
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (i < start || i > layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+				continue;
+			}
+			if (!rss)
+				continue;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss) {
+				parser->queue[i].hash_fields =
+						hash_rxq_init[i].hash_fields;
+			} else if (i != layer) {
+				/* Remove unused RSS expansion. */
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			} else if (layer < HASH_RXQ_IPV4 &&
+				   (hash_rxq_init[HASH_RXQ_IPV4].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv4 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV4]
+						.hash_fields;
+			} else if (i > HASH_RXQ_IPV4 && i < HASH_RXQ_IPV6 &&
+				   (hash_rxq_init[HASH_RXQ_IPV6].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv6 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV6]
+						.hash_fields;
+			}
 		}
-	} else if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
 	}
 	return 0;
 }
@@ -1166,7 +1238,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
 	if (ret)
 		return ret;
-	ret = mlx5_flow_convert_items_validate(items, error, parser);
+	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
 	if (ret)
 		return ret;
 	mlx5_flow_convert_finalise(parser);
@@ -1187,10 +1259,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		for (i = 0; i != hash_rxq_init_n; ++i) {
 			unsigned int offset;
 
-			if (!(parser->rss_conf.types &
-			      hash_rxq_init[i].dpdk_rss_hf) &&
-			    (i != HASH_RXQ_ETH))
-				continue;
 			offset = parser->queue[i].offset;
 			parser->queue[i].ibv_attr =
 				mlx5_flow_convert_allocate(offset, error);
@@ -1202,6 +1270,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
 	parser->tunnel = 0;
+	parser->layer = HASH_RXQ_ETH;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.dev = dev,
@@ -1282,17 +1351,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse *parser, void *src,
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
 			continue;
-		/* Specification must be the same l3 type or none. */
-		if (parser->layer == HASH_RXQ_ETH ||
-		    (hash_rxq_init[parser->layer].ip_version ==
-		     hash_rxq_init[i].ip_version) ||
-		    (hash_rxq_init[i].ip_version == 0)) {
-			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
-					parser->queue[i].offset);
-			memcpy(dst, src, size);
-			++parser->queue[i].ibv_attr->num_of_specs;
-			parser->queue[i].offset += size;
-		}
+		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset);
+		memcpy(dst, src, size);
+		++parser->queue[i].ibv_attr->num_of_specs;
+		parser->queue[i].offset += size;
 	}
 }
 
@@ -1323,9 +1386,7 @@ mlx5_flow_create_eth(const struct rte_flow_item *item,
 		.size = eth_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_ETH;
+	parser->layer = HASH_RXQ_ETH;
 	if (spec) {
 		unsigned int i;
 
@@ -1446,9 +1507,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 					  "L3 VXLAN not enabled by device"
 					  " parameter and/or not configured"
 					  " in firmware");
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV4;
+	parser->layer = HASH_RXQ_IPV4;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1511,9 +1570,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 					  "L3 VXLAN not enabled by device"
 					  " parameter and/or not configured"
 					  " in firmware");
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV6;
+	parser->layer = HASH_RXQ_IPV6;
 	if (spec) {
 		unsigned int i;
 		uint32_t vtc_flow_val;
@@ -1586,13 +1643,10 @@ mlx5_flow_create_udp(const struct rte_flow_item *item,
 		.size = udp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_UDPV4;
-		else
-			parser->layer = HASH_RXQ_UDPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_UDPV4;
+	else
+		parser->layer = HASH_RXQ_UDPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1635,13 +1689,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item *item,
 		.size = tcp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_TCPV4;
-		else
-			parser->layer = HASH_RXQ_TCPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_TCPV4;
+	else
+		parser->layer = HASH_RXQ_TCPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1691,6 +1742,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1748,6 +1804,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default GRE to inner RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 2;
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1939,33 +2000,33 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 	unsigned int i;
 
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		uint64_t hash_fields;
-
 		if (!parser->queue[i].ibv_attr)
 			continue;
 		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
 		parser->queue[i].ibv_attr = NULL;
-		hash_fields = hash_rxq_init[i].hash_fields;
+		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
 		if (!priv->dev->data->dev_started)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_get(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_new(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2070,7 +2131,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
 			dev->data->port_id,
 			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq,
+			(void *)flow->frxq[i].hrxq->qp,
 			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
@@ -2598,19 +2659,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_get(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_new(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index a771ac4c7..cd2716352 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 	return mlx5dv_init_obj(obj, obj_type);
 }
 
+static struct ibv_qp *
+mlx5_glue_dv_create_qp(struct ibv_context *context,
+		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
+{
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
+#else
+	(void)context;
+	(void)qp_init_attr_ex;
+	(void)dv_qp_init_attr;
+	return NULL;
+#endif
+}
+
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
@@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.dv_query_device = mlx5_glue_dv_query_device,
 	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
 	.dv_init_obj = mlx5_glue_dv_init_obj,
+	.dv_create_qp = mlx5_glue_dv_create_qp,
 };
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 33385d226..9f36af81a 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+struct mlx5dv_qp_init_attr;
+#endif
+
 /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
 	const char *version;
@@ -106,6 +110,10 @@ struct mlx5_glue {
 				   enum mlx5dv_set_ctx_attr_type type,
 				   void *attr);
 	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
+	struct ibv_qp *(*dv_create_qp)
+		(struct ibv_context *context,
+		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
 };
 
 const struct mlx5_glue *mlx5_glue;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 6756f25fa..58403b5b6 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1385,7 +1385,9 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type, implies tunnel offloading like inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
 	struct mlx5_ind_table_ibv *ind_tbl;
 	struct ibv_qp *qp;
 	int err;
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	struct mlx5dv_qp_init_attr qp_init_attr = {0};
+#endif
 
 	queues_n = hash_fields ? queues_n : 1;
 	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
@@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		return NULL;
 	}
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	if (tunnel) {
+		qp_init_attr.comp_mask =
+				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
+		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
+	}
+	qp = mlx5_glue->dv_create_qp(
+		priv->ctx,
+		&(struct ibv_qp_init_attr_ex){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_QP_INIT_ATTR_PD |
+				IBV_QP_INIT_ATTR_IND_TABLE |
+				IBV_QP_INIT_ATTR_RX_HASH,
+			.rx_hash_conf = (struct ibv_rx_hash_conf){
+				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
+				.rx_hash_fields_mask = hash_fields |
+					(tunnel && rss_level > 1 ?
+					(uint32_t)IBV_RX_HASH_INNER : 0),
+			},
+			.rwq_ind_tbl = ind_tbl->ind_table,
+			.pd = priv->pd,
+		},
+		&qp_init_attr);
+#else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
 		 &(struct ibv_qp_init_attr_ex){
@@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 				IBV_QP_INIT_ATTR_RX_HASH,
 			.rx_hash_conf = (struct ibv_rx_hash_conf){
 				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
-				.rx_hash_key_len = rss_key_len,
-				.rx_hash_key = (void *)(uintptr_t)rss_key,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
 				.rx_hash_fields_mask = hash_fields,
 			},
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+#endif
 	if (!qp) {
 		rte_errno = errno;
 		goto error;
@@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
 	hrxq->tunnel = tunnel;
+	hrxq->rss_level = rss_level;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1468,7 +1509,9 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type, implies tunnel offloading like inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1477,7 +1520,8 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1494,6 +1538,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->tunnel != tunnel)
 			continue;
+		if (hrxq->rss_level != rss_level)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 188fd65c5..07b3adfae 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -147,6 +147,7 @@ struct mlx5_hrxq {
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
 	uint32_t tunnel; /* Tunnel type. */
+	uint32_t rss_level; /* RSS on tunnel level. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 08/11] net/mlx5: add hardware flow debug dump
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (7 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 07/11] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Dump Verbs flow details, including flow spec type and size, for
debugging purposes.
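
The dump is only compiled in when NDEBUG is undefined. A dumped line is
roughly of the following form (illustrative example, all values made
up):

  port 0 Verbs flow 0x7f5b2c0008c0 type 0: hrxq:0x7f5b2c001a40
  qp:0x7f5b2c002e80 ind:0x7f5b2c000f20, hash:33/2 specs:3(176),
  priority:1, type:0, flags:0, comp_mask:0 specs: 20(40) 30(40) 40(40)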

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_rxq.c   | 26 ++++++++++++++---
 drivers/net/mlx5/mlx5_utils.h |  6 ++++
 3 files changed, 86 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 174f2ba6e..593c960f8 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -2080,6 +2080,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
 }
 
 /**
+ * Dump flow hash RX queue detail.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the rte_flow.
+ * @param hrxq_idx
+ *   Hash RX queue index.
+ */
+static void
+mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
+	       struct rte_flow *flow __rte_unused,
+	       unsigned int hrxq_idx __rte_unused)
+{
+#ifndef NDEBUG
+	uintptr_t spec_ptr;
+	uint16_t j;
+	char buf[256];
+	uint8_t off;
+
+	spec_ptr = (uintptr_t)(flow->frxq[hrxq_idx].ibv_attr + 1);
+	for (j = 0, off = 0; j < flow->frxq[hrxq_idx].ibv_attr->num_of_specs;
+	     j++) {
+		struct ibv_flow_spec *spec = (void *)spec_ptr;
+		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
+			       spec->hdr.size);
+		spec_ptr += spec->hdr.size;
+	}
+	DRV_LOG(DEBUG,
+		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p,"
+		" hash:%" PRIx64 "/%u specs:%hhu(%hu), priority:%hu, type:%d,"
+		" flags:%x, comp_mask:%x specs:%s",
+		dev->data->port_id, (void *)flow, hrxq_idx,
+		(void *)flow->frxq[hrxq_idx].hrxq,
+		(void *)flow->frxq[hrxq_idx].hrxq->qp,
+		(void *)flow->frxq[hrxq_idx].hrxq->ind_table,
+		flow->frxq[hrxq_idx].hash_fields |
+		(flow->tunnel &&
+		 flow->rss_conf.level > 1 ? (uint32_t)IBV_RX_HASH_INNER : 0),
+		flow->rss_conf.queue_num,
+		flow->frxq[hrxq_idx].ibv_attr->num_of_specs,
+		flow->frxq[hrxq_idx].ibv_attr->size,
+		flow->frxq[hrxq_idx].ibv_attr->priority,
+		flow->frxq[hrxq_idx].ibv_attr->type,
+		flow->frxq[hrxq_idx].ibv_attr->flags,
+		flow->frxq[hrxq_idx].ibv_attr->comp_mask,
+		buf);
+#endif
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2121,6 +2172,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		flow->frxq[i].ibv_flow =
 			mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 					       flow->frxq[i].ibv_attr);
+		mlx5_flow_dump(dev, flow, i);
 		if (!flow->frxq[i].ibv_flow) {
 			rte_flow_error_set(error, ENOMEM,
 					   RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2128,11 +2180,6 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 			goto error;
 		}
 		++flows_n;
-		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
-			dev->data->port_id,
-			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq->qp,
-			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
 		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2676,24 +2723,25 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
+					"port %u flow %p cannot create hash"
+					" rxq",
 					dev->data->port_id, (void *)flow);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
 flow_create:
+			mlx5_flow_dump(dev, flow, i);
 			flow->frxq[i].ibv_flow =
 				mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 						       flow->frxq[i].ibv_attr);
 			if (!flow->frxq[i].ibv_flow) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
-					dev->data->port_id, (void *)flow);
+					"port %u flow %p type %u cannot be"
+					" applied",
+					dev->data->port_id, (void *)flow, i);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
-			DRV_LOG(DEBUG, "port %u flow %p applied",
-				dev->data->port_id, (void *)flow);
 		}
 		mlx5_flow_create_update_rxqs(dev, flow);
 	}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 58403b5b6..2957e7c86 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1259,9 +1259,9 @@ mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues,
 	}
 	rte_atomic32_inc(&ind_tbl->refcnt);
 	LIST_INSERT_HEAD(&priv->ind_tbls, ind_tbl, next);
-	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
-		dev->data->port_id, (void *)ind_tbl,
-		rte_atomic32_read(&ind_tbl->refcnt));
+	DEBUG("port %u new indirection table %p: queues:%u refcnt:%d",
+	      dev->data->port_id, (void *)ind_tbl, 1 << wq_n,
+	      rte_atomic32_read(&ind_tbl->refcnt));
 	return ind_tbl;
 error:
 	rte_free(ind_tbl);
@@ -1330,9 +1330,12 @@ mlx5_ind_table_ibv_release(struct rte_eth_dev *dev,
 	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
 		((struct priv *)dev->data->dev_private)->port,
 		(void *)ind_tbl, rte_atomic32_read(&ind_tbl->refcnt));
-	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt))
+	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt)) {
 		claim_zero(mlx5_glue->destroy_rwq_ind_table
 			   (ind_tbl->ind_table));
+		DEBUG("port %u delete indirection table %p: queues: %u",
+		      dev->data->port_id, (void *)ind_tbl, ind_tbl->queues_n);
+	}
 	for (i = 0; i != ind_tbl->queues_n; ++i)
 		claim_nonzero(mlx5_rxq_release(dev, ind_tbl->queues[i]));
 	if (!rte_atomic32_read(&ind_tbl->refcnt)) {
@@ -1445,6 +1448,13 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.pd = priv->pd,
 		},
 		&qp_init_attr);
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%" PRIx64
+	      " tunnel:0x%x level:%hhu dv_attr:comp_mask:0x%" PRIx64
+	      " create_flags:0x%x",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      (tunnel && rss_level == 2 ? (uint32_t)IBV_RX_HASH_INNER : 0) |
+	      hash_fields, tunnel, rss_level,
+	      qp_init_attr.comp_mask, qp_init_attr.create_flags);
 #else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
@@ -1466,6 +1476,10 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%" PRIx64
+	      " tunnel:0x%x level:%hhu",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      hash_fields, tunnel, rss_level);
 #endif
 	if (!qp) {
 		rte_errno = errno;
@@ -1575,6 +1589,10 @@ mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
 		(void *)hrxq, rte_atomic32_read(&hrxq->refcnt));
 	if (rte_atomic32_dec_and_test(&hrxq->refcnt)) {
 		claim_zero(mlx5_glue->destroy_qp(hrxq->qp));
+		DEBUG("port %u delete QP %p: hash: 0x%" PRIx64 ", tunnel:"
+		      " 0x%x, level: %hhu",
+		      dev->data->port_id, (void *)hrxq, hrxq->hash_fields,
+		      hrxq->tunnel, hrxq->rss_level);
 		mlx5_ind_table_ibv_release(dev, hrxq->ind_table);
 		LIST_REMOVE(hrxq, next);
 		rte_free(hrxq);
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index e8f980ff7..886f60e61 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -103,16 +103,22 @@ extern int mlx5_logtype;
 /* claim_zero() does not perform any check when debugging is disabled. */
 #ifndef NDEBUG
 
+#define DEBUG(...) DRV_LOG(DEBUG, __VA_ARGS__)
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 
 #else /* NDEBUG */
 
+#define DEBUG(...) (void)0
 #define claim_zero(...) (__VA_ARGS__)
 #define claim_nonzero(...) (__VA_ARGS__)
 
 #endif /* NDEBUG */
 
+#define INFO(...) DRV_LOG(INFO, __VA_ARGS__)
+#define WARN(...) DRV_LOG(WARNING, __VA_ARGS__)
+#define ERROR(...) DRV_LOG(ERR, __VA_ARGS__)
+
 /* Convenience macros for accessing mbuf fields. */
 #define NEXT(m) ((m)->next)
 #define DATA_LEN(m) ((m)->data_len)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 09/11] net/mlx5: introduce VXLAN-GPE tunnel type
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (8 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 08/11] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
  2018-04-20 12:23     ` [PATCH v5 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 doc/guides/nics/mlx5.rst     |   8 ++--
 drivers/net/mlx5/mlx5_flow.c | 107 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_rxtx.c |   3 +-
 3 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 421274729..6b83759c8 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -329,16 +329,16 @@ Run-time configuration
 
 - ``l3_vxlan_en`` parameter [int]
 
-  A nonzero value allows L3 VXLAN flow creation. To enable L3 VXLAN, users
-  has to configure firemware and enable this prameter. This is a prerequisite
-  to receive this kind of traffic.
+  A nonzero value allows L3 VXLAN and VXLAN-GPE flow creation. To enable
+  L3 VXLAN or VXLAN-GPE, users have to configure firmware and enable this
+  parameter. This is a prerequisite to receive this kind of traffic.
 
   Disabled by default.
 
 Firmware configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
-- L3 VXLAN destination UDP port
+- L3 VXLAN and VXLAN-GPE destination UDP port
 
    .. code-block:: console
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 593c960f8..a55644cd0 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -92,6 +92,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       struct mlx5_flow_data *data);
 
 static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data);
+
+static int
 mlx5_flow_create_gre(const struct rte_flow_item *item,
 		     const void *default_mask,
 		     struct mlx5_flow_data *data);
@@ -242,10 +247,12 @@ struct rte_flow {
 
 #define IS_TUNNEL(type) ( \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
 const uint32_t flow_ptype[] = {
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
 	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -254,6 +261,8 @@ const uint32_t flow_ptype[] = {
 const uint32_t ptype_ext[] = {
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
 					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
+						  RTE_PTYPE_L4_UDP,
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -311,6 +320,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
 			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
 			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
@@ -389,7 +399,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_ipv6),
 	},
 	[RTE_FLOW_ITEM_TYPE_UDP] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_udp){
 			.hdr = {
@@ -441,6 +452,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_vxlan,
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_vxlan_gpe){
+			.vni = "\xff\xff\xff",
+		},
+		.default_mask = &rte_flow_item_vxlan_gpe_mask,
+		.mask_sz = sizeof(struct rte_flow_item_vxlan_gpe),
+		.convert = mlx5_flow_create_vxlan_gpe,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 };
 
 /** Structure to pass to the conversion function. */
@@ -1775,6 +1799,87 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert VXLAN-GPE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data)
+{
+	struct priv *priv = data->dev->data->dev_private;
+	const struct rte_flow_item_vxlan_gpe *spec = item->spec;
+	const struct rte_flow_item_vxlan_gpe *mask = item->mask;
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel vxlan = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	union vni {
+		uint32_t vlan_id;
+		uint8_t vni[4];
+	} id;
+
+	if (!priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
+	id.vni[0] = 0;
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN-GPE to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		memcpy(&id.vni[1], spec->vni, 3);
+		vxlan.val.tunnel_id = id.vlan_id;
+		memcpy(&id.vni[1], mask->vni, 3);
+		vxlan.mask.tunnel_id = id.vlan_id;
+		if (spec->protocol)
+			return rte_flow_error_set(data->error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ITEM,
+						  item,
+						  "VxLAN-GPE protocol not"
+						  " supported");
+		/* Remove unwanted bits from values. */
+		vxlan.val.tunnel_id &= vxlan.mask.tunnel_id;
+	}
+	/*
+	 * Tunnel id 0 is equivalent to not adding a VXLAN layer: if only this
+	 * layer is defined in the Verbs specification, it is interpreted as a
+	 * wildcard and all packets will match this rule; if it follows a full
+	 * stack layer (ex: eth / ipv4 / udp), all packets matching the layers
+	 * before will also match this rule.
+	 * To avoid such a situation, a VNI of 0 without a proper outer spec is refused.
+	 */
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "VxLAN-GPE vni cannot be 0");
+	mlx5_flow_create_copy(parser, &vxlan, size);
+	return 0;
+}
+
+/**
  * Convert GRE item to Verbs specification.
  *
  * @param item[in]
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 060ff0e85..f10ea13c1 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -466,8 +466,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			uint8_t vlan_sz =
 				(buf->ol_flags & PKT_TX_VLAN_PKT) ? 4 : 0;
 			const uint64_t is_tunneled =
-				buf->ol_flags & (PKT_TX_TUNNEL_GRE |
-						 PKT_TX_TUNNEL_VXLAN);
+				buf->ol_flags & (PKT_TX_TUNNEL_MASK);
 
 			tso_header_sz = buf->l2_len + vlan_sz +
 					buf->l3_len + buf->l4_len;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (9 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  2018-04-20 12:23     ` [PATCH v5 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

A tunnel item without a tunnel id pattern could match any non-tunneled
packet; this patch allows such a pattern once a proper outer spec precedes it.
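
For instance, a rule of the following shape, previously rejected because the
VNI is left unspecified, is now accepted since the outer UDP spec
disambiguates it (hypothetical testpmd session; port and queue numbers are
placeholders):

  testpmd> flow create 0 ingress pattern eth / ipv4 / udp dst is 4789 / vxlan / end actions queue index 0 / end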

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a55644cd0..06ed58ef5 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1789,7 +1789,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	 * before will also match this rule.
 	 * To avoid such situation, VNI 0 is currently refused.
 	 */
-	if (!vxlan.val.tunnel_id)
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
 		return rte_flow_error_set(data->error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM,
 					  item,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v5 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
                       ` (10 preceding siblings ...)
  2018-04-20 12:23     ` [PATCH v5 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
@ 2018-04-20 12:23     ` Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Iremonger Bernard, Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Remove tunnel limitations, add new hardware tunnel offload features.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/features/default.ini | 1 +
 doc/guides/nics/features/mlx5.ini    | 3 +++
 doc/guides/nics/mlx5.rst             | 4 ++--
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index dae2ad776..49be81450 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -29,6 +29,7 @@ Multicast MAC filter =
 RSS hash             =
 RSS key update       =
 RSS reta update      =
+Inner RSS            =
 VMDq                 =
 SR-IOV               =
 DCB                  =
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index f8ce08770..e75b14bdc 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -21,6 +21,7 @@ Multicast MAC filter = Y
 RSS hash             = Y
 RSS key update       = Y
 RSS reta update      = Y
+Inner RSS            = Y
 SR-IOV               = Y
 VLAN filter          = Y
 Flow director        = Y
@@ -30,6 +31,8 @@ VLAN offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Timestamp offload    = Y
+Inner L3 checksum    = Y
+Inner L4 checksum    = Y
 Packet type parsing  = Y
 Rx descriptor status = Y
 Tx descriptor status = Y
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 6b83759c8..ef1c7da45 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -74,12 +74,12 @@ Features
 - RX interrupts.
 - Statistics query including Basic, Extended and per queue.
 - Rx HW timestamp.
+- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
+- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
 
 Limitations
 -----------
 
-- Inner RSS for VXLAN frames is not supported yet.
-- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
 - For secondary process:
 
   - Forked secondary process not supported.
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-20 12:23     ` [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-23  7:40       ` Nélio Laranjeiro
  2018-04-23  7:56         ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-23  7:40 UTC (permalink / raw)
  To: Xueming Li; +Cc: Iremonger Bernard, Shahaf Shuler, dev

On Fri, Apr 20, 2018 at 08:23:33PM +0800, Xueming Li wrote:
> This patch introduced tunnel type identification based on flow rules.
> If flows of multiple tunnel types built on same queue,
> RTE_PTYPE_TUNNEL_MASK will be returned, user application could use bits
> in flow mark as tunnel type identifier.
>[...]

There is still the issue of returning these wrong bits in the mbuf.

Bits in the mbuf ptype must only reflect what is present in the mbuf;
using RTE_PTYPE_TUNNEL_MASK means all tunnels are present in the packet,
which is absolutely wrong.

This behavior was not announced and breaks the API/ABI.  It cannot be
accepted yet.

I suggest adding a new RTE_PTYPE_TUNNEL_UNKNOWN, which does not break the
ABI, or not adding such bits in the mbuf at all.

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-23  7:40       ` Nélio Laranjeiro
@ 2018-04-23  7:56         ` Xueming(Steven) Li
  0 siblings, 0 replies; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-23  7:56 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Iremonger Bernard, Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Monday, April 23, 2018 3:41 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Iremonger Bernard <bernard.iremonger@intel.com>; Shahaf Shuler <shahafs@mellanox.com>;
> dev@dpdk.org
> Subject: Re: [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification
> 
> On Fri, Apr 20, 2018 at 08:23:33PM +0800, Xueming Li wrote:
> > This patch introduced tunnel type identification based on flow rules.
> > If flows of multiple tunnel types built on same queue,
> >RTE_PTYPE_TUNNEL_MASK will be returned, user application could use bits
> >in flow mark as tunnel type identifier.
> >[...]
> 
> There is still the issue about returning this wrong bits in the mbuf.
> 
> Bit in the mbuf ptypes must only reflect what is present in the mbuf, using RTE_PTYPE_TUNNEL_MASK
> means all tunnels are present in the packet which is absolutely wrong.
> 
> This behavior was not announce and breaks API/ABI.  It cannot be accepted yet.
> 
> I'll suggest to add a new RTE_PTYPE_TUNNEL_UNKNOWN which does not break the ABI or don't add such bits
> in the mbuf.
> 

It seems I forgot to update the commit message. RTE_PTYPE_TUNNEL_UNKNOWN has been removed according to the discussion.


> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v6 00/11] mlx5 Rx tunnel offloading
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
@ 2018-04-23 12:32       ` Xueming Li
  2018-04-24  8:24         ` Nélio Laranjeiro
  2018-04-23 12:33       ` [PATCH v6 01/11] net/mlx5: support 16 hardware priorities Xueming Li
                         ` (10 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:32 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev


Important note:
        please note that this patchset relies on Adrien's flow API overhaul
        patchset: http://www.dpdk.org/dev/patchwork/patch/38508/
v6:
- Fixed commit log of tunnel type identification
v5:
- Removed %lx prints
- Per review request, clear mbuf tunnel type in case of multiple tunnel types.
- Rebase on Adrien's flow API overhaul patchset
- Split feature requirement document into patches of L3 VXLAN and VXLAN-GPE
- Per review request, add device parameter to enable L3 VXLAN and VXLAN-GPE
v4:
- Fix RSS level according to value definition
- Add "Inner RSS" column to NIC feature doc
- Fixed flow creation error in case of IPv4 RSS on IPv6 pattern
- New patch: enforce IP protocol of GRE to be 47.
- Removed MPLS-in-UDP and MPLS-in-GRE related patches
- Removed invalid RSS type check
v3:
- Refactor 16 Verbs priority detection.
- Other updates according to ML discussion.
v2:
- Split into 2 series: public api and mlx5, this one is the second.
- Rebased on Adrien's rte flow overhaul:
  http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
v1:
- Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
- Remove deprecation notes of rss level

This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
- Standard VXLAN
- L3 VXLAN (no inner ethernet header)
- VXLAN-GPE

Xueming Li (11):
  net/mlx5: support 16 hardware priorities
  net/mlx5: support GRE tunnel flow
  net/mlx5: support L3 VXLAN flow
  net/mlx5: support Rx tunnel type identification
  net/mlx5: cleanup tunnel checksum offloads
  net/mlx5: split flow RSS handling logic
  net/mlx5: support tunnel RSS level
  net/mlx5: add hardware flow debug dump
  net/mlx5: introduce VXLAN-GPE tunnel type
  net/mlx5: allow flow tunnel ID 0 with outer pattern
  doc: update mlx5 guide on tunnel offloading

 doc/guides/nics/features/default.ini  |   1 +
 doc/guides/nics/features/mlx5.ini     |   3 +
 doc/guides/nics/mlx5.rst              |  30 +-
 drivers/net/mlx5/Makefile             |   2 +-
 drivers/net/mlx5/mlx5.c               |  24 +
 drivers/net/mlx5/mlx5.h               |   6 +
 drivers/net/mlx5/mlx5_flow.c          | 844 +++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_glue.c          |  16 +
 drivers/net/mlx5/mlx5_glue.h          |   8 +
 drivers/net/mlx5/mlx5_rxq.c           |  89 +++-
 drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
 drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
 drivers/net/mlx5/mlx5_trigger.c       |   8 -
 drivers/net/mlx5/mlx5_utils.h         |   6 +
 16 files changed, 896 insertions(+), 223 deletions(-)

-- 
2.13.3

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v6 01/11] net/mlx5: support 16 hardware priorities
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
  2018-04-23 12:32       ` [PATCH v6 " Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 02/11] net/mlx5: support GRE tunnel flow Xueming Li
                         ` (9 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports the new 16 Verbs flow priorities by trying to create a
simple flow of priority 15. If 16 priorities are not available, it falls
back to the traditional 8 priorities.

Verbs priority mapping:
			8 priorities	>=16 priorities
Control flow:		4-7		8-15
User normal flow:	1-3		4-7
User tunnel flow:	0-2		0-3
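
A minimal C sketch of the mapping above (illustration only; it mirrors the
arithmetic of mlx5_flow_update_priority() in the diff below, and the helper
name is hypothetical):

	static unsigned int
	verbs_flow_priority(unsigned int attr_prio,
			    unsigned int max_verbs_prio, int tunnel)
	{
		unsigned int prio = attr_prio * 8; /* 8 levels per user priority */

		if (max_verbs_prio == 8)
			prio /= 2; /* only 4 levels per user priority available */
		if (!tunnel) /* non-tunnel flows sit below tunnel flows */
			prio += (max_verbs_prio == 8) ? 1 : 4;
		/* hash_rxq_init[].flow_priority is then added on top. */
		return prio;
	}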

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |  18 +++++++
 drivers/net/mlx5/mlx5.h         |   5 ++
 drivers/net/mlx5/mlx5_flow.c    | 113 +++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_trigger.c |   8 ---
 4 files changed, 116 insertions(+), 28 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 68783c3ac..5a0b8de85 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -197,6 +197,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		priv->txqs_n = 0;
 		priv->txqs = NULL;
 	}
+	mlx5_flow_delete_drop_queue(dev);
 	if (priv->pd != NULL) {
 		assert(priv->ctx != NULL);
 		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
@@ -619,6 +620,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	unsigned int mps;
 	unsigned int cqe_comp;
 	unsigned int tunnel_en = 0;
+	unsigned int verb_priorities = 0;
 	int idx;
 	int i;
 	struct mlx5dv_context attrs_out = {0};
@@ -1006,6 +1008,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		mlx5_link_update(eth_dev, 0);
 		/* Store device configuration on private structure. */
 		priv->config = config;
+		/* Create drop queue. */
+		err = mlx5_flow_create_drop_queue(eth_dev);
+		if (err) {
+			DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
+				eth_dev->data->port_id, strerror(rte_errno));
+			goto port_error;
+		}
+		/* Supported Verbs flow priority number detection. */
+		if (verb_priorities == 0)
+			verb_priorities = mlx5_get_max_verbs_prio(eth_dev);
+		if (verb_priorities < MLX5_VERBS_FLOW_PRIO_8) {
+			DRV_LOG(ERR, "port %u wrong Verbs flow priorities: %u",
+				eth_dev->data->port_id, verb_priorities);
+			goto port_error;
+		}
+		priv->config.max_verbs_prio = verb_priorities;
 		continue;
 port_error:
 		if (priv)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6ad41390a..670f6860f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -89,6 +89,7 @@ struct mlx5_dev_config {
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
+	unsigned int max_verbs_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
 	int txq_inline; /* Maximum packet size for inlining. */
@@ -105,6 +106,9 @@ enum mlx5_verbs_alloc_type {
 	MLX5_VERBS_ALLOC_TYPE_RX_QUEUE,
 };
 
+/* 8 Verbs priorities. */
+#define MLX5_VERBS_FLOW_PRIO_8 8
+
 /**
  * Verbs allocator needs a context to know in the callback which kind of
  * resources it is allocating.
@@ -253,6 +257,7 @@ int mlx5_traffic_restart(struct rte_eth_dev *dev);
 
 /* mlx5_flow.c */
 
+unsigned int mlx5_get_max_verbs_prio(struct rte_eth_dev *dev);
 int mlx5_flow_validate(struct rte_eth_dev *dev,
 		       const struct rte_flow_attr *attr,
 		       const struct rte_flow_item items[],
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index e6c8b3df8..5402cb148 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -31,8 +31,8 @@
 #include "mlx5_prm.h"
 #include "mlx5_glue.h"
 
-/* Define minimal priority for control plane flows. */
-#define MLX5_CTRL_FLOW_PRIORITY 4
+/* Flow priority for control plane flows. */
+#define MLX5_CTRL_FLOW_PRIORITY 1
 
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
@@ -128,7 +128,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_UDPV4] = {
@@ -137,7 +137,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV4_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_IPV4] = {
@@ -145,7 +145,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV4),
 		.dpdk_rss_hf = (ETH_RSS_IPV4 |
 				ETH_RSS_FRAG_IPV4),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV4,
 	},
 	[HASH_RXQ_TCPV6] = {
@@ -154,7 +154,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_TCP |
 				IBV_RX_HASH_DST_PORT_TCP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_TCP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_UDPV6] = {
@@ -163,7 +163,7 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_SRC_PORT_UDP |
 				IBV_RX_HASH_DST_PORT_UDP),
 		.dpdk_rss_hf = ETH_RSS_NONFRAG_IPV6_UDP,
-		.flow_priority = 1,
+		.flow_priority = 0,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_IPV6] = {
@@ -171,13 +171,13 @@ const struct hash_rxq_init hash_rxq_init[] = {
 				IBV_RX_HASH_DST_IPV6),
 		.dpdk_rss_hf = (ETH_RSS_IPV6 |
 				ETH_RSS_FRAG_IPV6),
-		.flow_priority = 2,
+		.flow_priority = 1,
 		.ip_version = MLX5_IPV6,
 	},
 	[HASH_RXQ_ETH] = {
 		.hash_fields = 0,
 		.dpdk_rss_hf = 0,
-		.flow_priority = 3,
+		.flow_priority = 2,
 	},
 };
 
@@ -899,30 +899,50 @@ mlx5_flow_convert_allocate(unsigned int size, struct rte_flow_error *error)
  * Make inner packet matching with a higher priority than the non-inner
  * matching.
  *
+ * @param dev
+ *   Pointer to Ethernet device.
  * @param[in, out] parser
  *   Internal parser structure.
  * @param attr
  *   User flow attribute.
  */
 static void
-mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
+mlx5_flow_update_priority(struct rte_eth_dev *dev,
+			  struct mlx5_flow_parse *parser,
 			  const struct rte_flow_attr *attr)
 {
+	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
+	uint16_t priority;
 
+	/*			8 priorities	>= 16 priorities
+	 * Control flow:	4-7		8-15
+	 * User normal flow:	1-3		4-7
+	 * User tunnel flow:	0-2		0-3
+	 */
+	priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
+	if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+		priority /= 2;
+	/*
+	 * Lower non-tunnel flow Verbs priority by 1 if only 8 Verbs
+	 * priorities are supported, by 4 otherwise.
+	 */
+	if (!parser->inner) {
+		if (priv->config.max_verbs_prio == MLX5_VERBS_FLOW_PRIO_8)
+			priority += 1;
+		else
+			priority += MLX5_VERBS_FLOW_PRIO_8 / 2;
+	}
 	if (parser->drop) {
-		parser->queue[HASH_RXQ_ETH].ibv_attr->priority =
-			attr->priority +
-			hash_rxq_init[HASH_RXQ_ETH].flow_priority;
+		parser->queue[HASH_RXQ_ETH].ibv_attr->priority = priority +
+				hash_rxq_init[HASH_RXQ_ETH].flow_priority;
 		return;
 	}
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		if (parser->queue[i].ibv_attr) {
-			parser->queue[i].ibv_attr->priority =
-				attr->priority +
-				hash_rxq_init[i].flow_priority -
-				(parser->inner ? 1 : 0);
-		}
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		parser->queue[i].ibv_attr->priority = priority +
+				hash_rxq_init[i].flow_priority;
 	}
 }
 
@@ -1157,7 +1177,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	 */
 	if (!parser->drop)
 		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(parser, attr);
+	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
@@ -3158,3 +3178,56 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 	}
 	return 0;
 }
+
+/**
+ * Detect number of Verbs flow priorities supported.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   Number of supported Verbs flow priorities.
+ */
+unsigned int
+mlx5_get_max_verbs_prio(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int verb_priorities = MLX5_VERBS_FLOW_PRIO_8;
+	struct {
+		struct ibv_flow_attr attr;
+		struct ibv_flow_spec_eth eth;
+		struct ibv_flow_spec_action_drop drop;
+	} flow_attr = {
+		.attr = {
+			.num_of_specs = 2,
+		},
+		.eth = {
+			.type = IBV_FLOW_SPEC_ETH,
+			.size = sizeof(struct ibv_flow_spec_eth),
+		},
+		.drop = {
+			.size = sizeof(struct ibv_flow_spec_action_drop),
+			.type = IBV_FLOW_SPEC_ACTION_DROP,
+		},
+	};
+	struct ibv_flow *flow;
+
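+	/*
+	 * Editorial note: the loop below probes by doubling -- it tries the
+	 * top priority of the current guess (7, then 15, ...); on the first
+	 * failure it halves back and stops, keeping the last count that
+	 * worked.
+	 */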
+	do {
+		flow_attr.attr.priority = verb_priorities - 1;
+		flow = mlx5_glue->create_flow(priv->flow_drop_queue->qp,
+					      &flow_attr.attr);
+		if (flow) {
+			claim_zero(mlx5_glue->destroy_flow(flow));
+			/* Try more priorities. */
+			verb_priorities *= 2;
+		} else {
+			/* Failed, restore the last working number. */
+			verb_priorities /= 2;
+			break;
+		}
+	} while (1);
+	DRV_LOG(DEBUG, "port %u Verbs flow priorities: %d,"
+		" user flow priorities: %d",
+		dev->data->port_id, verb_priorities, MLX5_CTRL_FLOW_PRIORITY);
+	return verb_priorities;
+}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ee08c5677..fc56d1ee8 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	int ret;
 
 	dev->data->dev_started = 1;
-	ret = mlx5_flow_create_drop_queue(dev);
-	if (ret) {
-		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
 	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx queues",
 		dev->data->port_id);
 	rte_mempool_walk(mlx5_mp2mr_iter, priv);
@@ -202,7 +196,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	mlx5_traffic_disable(dev);
 	mlx5_txq_stop(dev);
 	mlx5_rxq_stop(dev);
-	mlx5_flow_delete_drop_queue(dev);
 	rte_errno = ret; /* Restore rte_errno. */
 	return -rte_errno;
 }
@@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	mlx5_rxq_stop(dev);
 	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv->mr))
 		mlx5_mr_release(mr);
-	mlx5_flow_delete_drop_queue(dev);
 }
 
 /**
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
  2018-04-23 12:32       ` [PATCH v6 " Xueming Li
  2018-04-23 12:33       ` [PATCH v6 01/11] net/mlx5: support 16 hardware priorities Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:55         ` Nélio Laranjeiro
  2018-04-23 12:33       ` [PATCH v6 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
                         ` (8 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 101 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 94 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 5402cb148..b365f9868 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -37,6 +37,7 @@
 /* Internet Protocol versions. */
 #define MLX5_IPV4 4
 #define MLX5_IPV6 6
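+/* IP protocol number of GRE. */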
+#define MLX5_GRE 47
 
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_flow_spec_counter_action {
@@ -89,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       const void *default_mask,
 		       struct mlx5_flow_data *data);
 
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item,
+		     const void *default_mask,
+		     struct mlx5_flow_data *data);
+
 struct mlx5_flow_parse;
 
 static void
@@ -231,6 +237,10 @@ struct rte_flow {
 		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
 	}
 
+#define IS_TUNNEL(type) ( \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_GRE)
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -284,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
 static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
-			       RTE_FLOW_ITEM_TYPE_VXLAN),
+			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
@@ -316,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV4] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv4){
 			.hdr = {
@@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	},
 	[RTE_FLOW_ITEM_TYPE_IPV6] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
-			       RTE_FLOW_ITEM_TYPE_TCP),
+			       RTE_FLOW_ITEM_TYPE_TCP,
+			       RTE_FLOW_ITEM_TYPE_GRE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_ipv6){
 			.hdr = {
@@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_tcp,
 		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
 	},
+	[RTE_FLOW_ITEM_TYPE_GRE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_gre){
+			.protocol = -1,
+		},
+		.default_mask = &rte_flow_item_gre_mask,
+		.mask_sz = sizeof(struct rte_flow_item_gre),
+		.convert = mlx5_flow_create_gre,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
 		.actions = valid_actions,
@@ -401,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 
 /** Structure to pass to the conversion function. */
 struct mlx5_flow_parse {
-	uint32_t inner; /**< Set once VXLAN is encountered. */
+	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
 	uint32_t create:1;
 	/**< Whether resources should remain after a validate. */
 	uint32_t drop:1; /**< Target is a drop queue. */
@@ -829,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 					      cur_item->mask_sz);
 		if (ret)
 			goto exit_item_not_supported;
-		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+		if (IS_TUNNEL(items->type)) {
 			if (parser->inner) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
-						   "cannot recognize multiple"
-						   " VXLAN encapsulations");
+						   "Cannot recognize multiple"
+						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
@@ -1641,6 +1667,67 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert GRE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
+		     const void *default_mask __rte_unused,
+		     struct mlx5_flow_data *data)
+{
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel tunnel = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	struct ibv_flow_spec_ipv4_ext *ipv4;
+	struct ibv_flow_spec_ipv6 *ipv6;
+	unsigned int i;
+
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	/* Update encapsulation IP layer protocol. */
+	for (i = 0; i != hash_rxq_init_n; ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		if (parser->out_layer == HASH_RXQ_IPV4) {
+			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv4_ext));
+			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
+				break;
+			ipv4->val.proto = MLX5_GRE;
+			ipv4->mask.proto = 0xff;
+		} else if (parser->out_layer == HASH_RXQ_IPV6) {
+			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset -
+				sizeof(struct ibv_flow_spec_ipv6));
+			if (ipv6->mask.next_hdr &&
+			    ipv6->val.next_hdr != MLX5_GRE)
+				break;
+			ipv6->val.next_hdr = MLX5_GRE;
+			ipv6->mask.next_hdr = 0xff;
+		}
+	}
+	if (i != hash_rxq_init_n)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "IP protocol of GRE must be 47");
+	mlx5_flow_create_copy(parser, &tunnel, size);
+	return 0;
+}
+
+/**
  * Convert mark/flag action to Verbs specification.
  *
  * @param parser
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 03/11] net/mlx5: support L3 VXLAN flow
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (2 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 02/11] net/mlx5: support GRE tunnel flow Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch supports L3 VXLAN, which has no inner L2 header compared to the
standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP destination
port to discriminate against standard VXLAN; a device parameter and the
firmware have to be configured to support it:
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
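
With the firmware port set and the l3_vxlan_en=1 device parameter, a rule
matching IPv4 directly after the VXLAN header (no inner Ethernet item)
becomes valid, e.g. in a hypothetical testpmd session with placeholder port
and queue numbers:

  testpmd> flow create 0 ingress pattern eth / ipv4 / udp / vxlan / ipv4 / udp / end actions queue index 0 / end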

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/mlx5.rst     | 26 ++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.c      |  6 ++++++
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_flow.c | 26 +++++++++++++++++++++++++-
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c28c83278..421274729 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -327,6 +327,32 @@ Run-time configuration
 
   Enabled by default, valid only on VF devices ignored otherwise.
 
+- ``l3_vxlan_en`` parameter [int]
+
+  A nonzero value allows L3 VXLAN flow creation. To enable L3 VXLAN, the user
+  has to configure the firmware and enable this parameter. This is a
+  prerequisite to receive this kind of traffic.
+
+  Disabled by default.
+
+Firmware configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- L3 VXLAN destination UDP port
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_EN=1
+     mlxconfig -d <mst device> set IP_OVER_VXLAN_PORT=<udp dport>
+
+  Verify configurations are set:
+
+   .. code-block:: console
+
+     mlxconfig -d <mst device> query | grep IP_OVER_VXLAN
+     IP_OVER_VXLAN_EN                    True(1)
+     IP_OVER_VXLAN_PORT                  <udp dport>
+
 Prerequisites
 -------------
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5a0b8de85..d19981c60 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
 /* Device parameter to enable hardware Rx vector. */
 #define MLX5_RX_VEC_EN "rx_vec_en"
 
+/* Allow L3 VXLAN flow creation. */
+#define MLX5_L3_VXLAN_EN "l3_vxlan_en"
+
 /* Activate Netlink support in VF mode. */
 #define MLX5_VF_NL_EN "vf_nl_en"
 
@@ -416,6 +419,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->tx_vec_en = !!tmp;
 	} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
 		config->rx_vec_en = !!tmp;
+	} else if (strcmp(MLX5_L3_VXLAN_EN, key) == 0) {
+		config->l3_vxlan_en = !!tmp;
 	} else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
 		config->vf_nl_en = !!tmp;
 	} else {
@@ -449,6 +454,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_TXQ_MAX_INLINE_LEN,
 		MLX5_TX_VEC_EN,
 		MLX5_RX_VEC_EN,
+		MLX5_L3_VXLAN_EN,
 		MLX5_VF_NL_EN,
 		NULL,
 	};
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 670f6860f..c1a65257e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
 	unsigned int tx_vec_en:1; /* Tx vector is enabled. */
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+	unsigned int l3_vxlan_en:1; /* Enable L3 VXLAN flow creation. */
 	unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
 	unsigned int max_verbs_prio; /* Number of Verb flow priorities. */
 	unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index b365f9868..20111383e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -51,6 +51,7 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate;
 
 /** Structure give to the conversion functions. */
 struct mlx5_flow_data {
+	struct rte_eth_dev *dev; /** Ethernet device. */
 	struct mlx5_flow_parse *parser; /** Parser context. */
 	struct rte_flow_error *error; /** Error context. */
 };
@@ -413,7 +414,9 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4, /* For L3 VXLAN. */
+			       RTE_FLOW_ITEM_TYPE_IPV6), /* For L3 VXLAN. */
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_vxlan){
 			.vni = "\xff\xff\xff",
@@ -1175,6 +1178,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	parser->inner = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
+			.dev = dev,
 			.parser = parser,
 			.error = error,
 		};
@@ -1396,6 +1400,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		      const void *default_mask,
 		      struct mlx5_flow_data *data)
 {
+	struct priv *priv = data->dev->data->dev_private;
 	const struct rte_flow_item_ipv4 *spec = item->spec;
 	const struct rte_flow_item_ipv4 *mask = item->mask;
 	struct mlx5_flow_parse *parser = data->parser;
@@ -1405,6 +1410,15 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 		.size = ipv4_size,
 	};
 
+	if (parser->layer == HASH_RXQ_TUNNEL &&
+	    parser->tunnel == ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] &&
+	    !priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
 	/* Don't update layer for the inner pattern. */
 	if (!parser->inner)
 		parser->layer = HASH_RXQ_IPV4;
@@ -1451,6 +1465,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		      const void *default_mask,
 		      struct mlx5_flow_data *data)
 {
+	struct priv *priv = data->dev->data->dev_private;
 	const struct rte_flow_item_ipv6 *spec = item->spec;
 	const struct rte_flow_item_ipv6 *mask = item->mask;
 	struct mlx5_flow_parse *parser = data->parser;
@@ -1460,6 +1475,15 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 		.size = ipv6_size,
 	};
 
+	if (parser->layer == HASH_RXQ_TUNNEL &&
+	    parser->tunnel == ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] &&
+	    !priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
 	/* Don't update layer for the inner pattern. */
 	if (!parser->inner)
 		parser->layer = HASH_RXQ_IPV6;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 04/11] net/mlx5: support Rx tunnel type identification
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (3 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch introduces tunnel type identification based on flow rules.
If flows of multiple tunnel types are built on the same queue, no tunnel
type will be returned; the user application can instead use bits in the
flow mark as a tunnel type identifier.
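
For example, when VXLAN and GRE rules share a queue, the mark can carry the
tunnel type instead (hypothetical testpmd session; the mark ids are
arbitrary):

  testpmd> flow create 0 ingress pattern eth / ipv4 / udp / vxlan / end actions mark id 1 / queue index 0 / end
  testpmd> flow create 0 ingress pattern eth / ipv4 / gre / end actions mark id 2 / queue index 0 / end

On Rx, the application reads the id back from mbuf->hash.fdir.hi when
PKT_RX_FDIR_ID is set in ol_flags.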

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c          | 135 +++++++++++++++++++++++++++++-----
 drivers/net/mlx5/mlx5_rxq.c           |  11 ++-
 drivers/net/mlx5/mlx5_rxtx.c          |  12 ++-
 drivers/net/mlx5/mlx5_rxtx.h          |   9 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 ++++--
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 ++++-
 6 files changed, 167 insertions(+), 38 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 20111383e..fa1487d29 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -226,6 +226,7 @@ struct rte_flow {
 	struct rte_flow_action_rss rss_conf; /**< RSS configuration */
 	uint16_t (*queues)[]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counters for the rule. */
 	struct mlx5_flow_counter_stats counter_stats;/**<The counter stats. */
 	struct mlx5_flow frxq[RTE_DIM(hash_rxq_init)];
@@ -242,6 +243,19 @@ struct rte_flow {
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
+const uint32_t flow_ptype[] = {
+	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
+};
+
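+/*
+ * Editorial note: RTE_PTYPE_TUNNEL_* values occupy bits 12-15 of the packet
+ * type, so this yields an index in [0, 15], matching the tunnel_types[16]
+ * counters introduced below.
+ */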
+#define PTYPE_IDX(t) ((RTE_PTYPE_TUNNEL_MASK & (t)) >> 12)
+
+const uint32_t ptype_ext[] = {
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
+					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
+};
+
 /** Structure to generate a simple graph of layers supported by the NIC. */
 struct mlx5_flow_items {
 	/** List of possible actions for these items. */
@@ -441,6 +455,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
 		struct ibv_flow_attr *ibv_attr;
@@ -859,7 +874,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 		if (ret)
 			goto exit_item_not_supported;
 		if (IS_TUNNEL(items->type)) {
-			if (parser->inner) {
+			if (parser->tunnel) {
 				rte_flow_error_set(error, ENOTSUP,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   items,
@@ -868,6 +883,7 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 				return -rte_errno;
 			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
+			parser->tunnel = flow_ptype[items->type];
 		}
 		if (parser->drop) {
 			parser->queue[HASH_RXQ_ETH].offset += cur_item->dst_sz;
@@ -1176,6 +1192,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	}
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
+	parser->tunnel = 0;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.dev = dev,
@@ -1663,6 +1680,7 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1719,6 +1737,7 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 	unsigned int i;
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1925,7 +1944,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
@@ -1934,7 +1954,8 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 				      parser->rss_conf.key_len,
 				      hash_fields,
 				      parser->rss_conf.queue,
-				      parser->rss_conf.queue_num);
+				      parser->rss_conf.queue_num,
+				      parser->tunnel);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -1946,6 +1967,48 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 }
 
 /**
+ * RXQ update after flow rule creation.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the flow rule.
+ */
+static void
+mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+	unsigned int j;
+
+	if (!dev->data->dev_started)
+		return;
+	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		rxq_data->mark |= flow->mark;
+		if (!tunnel)
+			continue;
+		rxq_ctrl->tunnel_types[tunnel] += 1;
+		/* Clear tunnel type if more than one tunnel type is set. */
+		for (j = 0; j != RTE_DIM(rxq_ctrl->tunnel_types); ++j) {
+			if (j == tunnel)
+				continue;
+			if (rxq_ctrl->tunnel_types[j] > 0) {
+				rxq_data->tunnel = 0;
+				break;
+			}
+		}
+		if (j == RTE_DIM(rxq_ctrl->tunnel_types))
+			rxq_data->tunnel = flow->tunnel;
+	}
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2005,12 +2068,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 				   NULL, "internal error in flow creation");
 		goto error;
 	}
-	for (i = 0; i != parser->rss_conf.queue_num; ++i) {
-		struct mlx5_rxq_data *q =
-			(*priv->rxqs)[parser->rss_conf.queue[i]];
-
-		q->mark |= parser->mark;
-	}
+	mlx5_flow_create_update_rxqs(dev, flow);
 	return 0;
 error:
 	ret = rte_errno; /* Save rte_errno before cleanup. */
@@ -2083,6 +2141,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
 	}
 	/* Copy configuration. */
 	flow->queues = (uint16_t (*)[])(flow + 1);
+	flow->tunnel = parser.tunnel;
 	flow->rss_conf = (struct rte_flow_action_rss){
 		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
 		.level = 0,
@@ -2174,9 +2233,38 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
 	struct priv *priv = dev->data->dev_private;
 	unsigned int i;
 
-	if (flow->drop || !flow->mark)
+	if (flow->drop || !dev->data->dev_started)
 		goto free;
-	for (i = 0; i != flow->rss_conf.queue_num; ++i) {
+	for (i = 0; flow->tunnel && i != flow->rss_conf.queue_num; ++i) {
+		/* Update queue tunnel type. */
+		struct mlx5_rxq_data *rxq_data = (*priv->rxqs)
+						 [(*flow->queues)[i]];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+		uint8_t tunnel = PTYPE_IDX(flow->tunnel);
+
+		assert(rxq_ctrl->tunnel_types[tunnel] > 0);
+		rxq_ctrl->tunnel_types[tunnel] -= 1;
+		if (!rxq_ctrl->tunnel_types[tunnel]) {
+			/* Update tunnel type. */
+			uint8_t j;
+			uint8_t types = 0;
+			uint8_t last;
+
+			for (j = 0; j < RTE_DIM(rxq_ctrl->tunnel_types); j++)
+				if (rxq_ctrl->tunnel_types[j]) {
+					types += 1;
+					last = j;
+				}
+			/* Keep it if more than one tunnel type is left. */
+			if (types == 1)
+				rxq_data->tunnel = ptype_ext[last];
+			else if (types == 0)
+				/* No tunnel type left. */
+				rxq_data->tunnel = 0;
+		}
+	}
+	for (i = 0; flow->mark && i != flow->rss_conf.queue_num; ++i) {
 		struct rte_flow *tmp;
 		int mark = 0;
 
@@ -2395,9 +2483,9 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct rte_flow *flow;
+	unsigned int i;
 
 	TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) {
-		unsigned int i;
 		struct mlx5_ind_table_ibv *ind_tbl = NULL;
 
 		if (flow->drop) {
@@ -2443,6 +2531,18 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list)
 		DRV_LOG(DEBUG, "port %u flow %p removed", dev->data->port_id,
 			(void *)flow);
 	}
+	/* Cleanup Rx queue tunnel info. */
+	for (i = 0; i != priv->rxqs_n; ++i) {
+		struct mlx5_rxq_data *q = (*priv->rxqs)[i];
+		struct mlx5_rxq_ctrl *rxq_ctrl =
+			container_of(q, struct mlx5_rxq_ctrl, rxq);
+
+		if (!q)
+			continue;
+		memset((void *)rxq_ctrl->tunnel_types, 0,
+		       sizeof(rxq_ctrl->tunnel_types));
+		q->tunnel = 0;
+	}
 }
 
 /**
@@ -2490,7 +2590,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
@@ -2498,7 +2599,8 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.key_len,
 					      hash_rxq_init[i].hash_fields,
 					      flow->rss_conf.queue,
-					      flow->rss_conf.queue_num);
+					      flow->rss_conf.queue_num,
+					      flow->tunnel);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
@@ -2520,10 +2622,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			DRV_LOG(DEBUG, "port %u flow %p applied",
 				dev->data->port_id, (void *)flow);
 		}
-		if (!flow->mark)
-			continue;
-		for (i = 0; i != flow->rss_conf.queue_num; ++i)
-			(*priv->rxqs)[flow->rss_conf.queue[i]]->mark = 1;
+		mlx5_flow_create_update_rxqs(dev, flow);
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 18ad40813..1fbd02aa0 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1386,6 +1386,8 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,7 +1396,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1438,6 +1440,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->qp = qp;
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
+	hrxq->tunnel = tunnel;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1466,6 +1469,8 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  *   first queue index will be taken for the indirection table.
  * @param queues_n
  *   Number of queues.
+ * @param tunnel
+ *   Tunnel type.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1474,7 +1479,7 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n)
+	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1489,6 +1494,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->hash_fields != hash_fields)
 			continue;
+		if (hrxq->tunnel != tunnel)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 05fe10918..fafac514b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -34,7 +34,7 @@
 #include "mlx5_prm.h"
 
 static __rte_always_inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe);
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
 
 static __rte_always_inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
@@ -125,12 +125,14 @@ mlx5_set_ptype_table(void)
 	(*p)[0x8a] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_L4_UDP;
 	/* Tunneled - L3 */
+	(*p)[0x40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
 	(*p)[0x41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
 	(*p)[0x42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
+	(*p)[0xc0] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
 	(*p)[0xc1] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
 		     RTE_PTYPE_INNER_L4_NONFRAG;
@@ -1577,6 +1579,8 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 /**
  * Translate RX completion flags to packet type.
  *
+ * @param[in] rxq
+ *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1586,7 +1590,7 @@ mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   Packet type for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
+rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 {
 	uint8_t idx;
 	uint8_t pinfo = cqe->pkt_info;
@@ -1601,7 +1605,7 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
 	 * bit[7] = outer_l3_type
 	 */
 	idx = ((pinfo & 0x3) << 6) | ((ptype & 0xfc00) >> 10);
-	return mlx5_ptype_table[idx];
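+	/*
+	 * Editorial note: bit 6 of idx flags a tunneled completion, so
+	 * multiplying by !!(idx & (1 << 6)) ORs rxq->tunnel in only for
+	 * tunneled packets, without a branch.
+	 */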
+	return mlx5_ptype_table[idx] | rxq->tunnel * !!(idx & (1 << 6));
 }
 
 /**
@@ -1833,7 +1837,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			pkt = seg;
 			assert(len >= (rxq->crc_present << 2));
 			/* Update packet information. */
-			pkt->packet_type = rxq_cq_to_pkt_type(cqe);
+			pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe);
 			pkt->ol_flags = 0;
 			if (rss_hash_res && rxq->rss_hash) {
 				pkt->hash.rss = rss_hash_res;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index ee534c340..676ad6a9a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -104,6 +104,7 @@ struct mlx5_rxq_data {
 	void *cq_uar; /* CQ user access region. */
 	uint32_t cqn; /* CQ number. */
 	uint8_t cq_arm_sn; /* CQ arm seq number. */
+	uint32_t tunnel; /* Tunnel information. */
 } __rte_cache_aligned;
 
 /* Verbs Rx queue elements. */
@@ -125,6 +126,7 @@ struct mlx5_rxq_ctrl {
 	struct mlx5_rxq_ibv *ibv; /* Verbs elements. */
 	struct mlx5_rxq_data rxq; /* Data path structure. */
 	unsigned int socket; /* CPU socket ID for allocations. */
+	uint32_t tunnel_types[16]; /* Tunnel type counter. */
 	unsigned int irq:1; /* Whether IRQ is enabled. */
 	uint16_t idx; /* Queue index. */
 };
@@ -145,6 +147,7 @@ struct mlx5_hrxq {
 	struct mlx5_ind_table_ibv *ind_table; /* Indirection table. */
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
+	uint32_t tunnel; /* Tunnel type. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -248,11 +251,13 @@ int mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev);
 struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
-				const uint16_t *queues, uint32_t queues_n);
+				const uint16_t *queues, uint32_t queues_n,
+				uint32_t tunnel);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 84817e7ad..d21e99f68 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -551,6 +551,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	const uint64x1_t mbuf_init = vld1_u64(&rxq->mbuf_initializer);
 	const uint64x1_t r32_mask = vcreate_u64(0xffffffff);
 	uint64x2_t rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	if (rxq->mark) {
 		const uint32x4_t ft_def = vdupq_n_u32(MLX5_FLOW_MARK_DEFAULT);
@@ -583,14 +584,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	ptype = vshrn_n_u32(ptype_info, 10);
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	ptype = vorr_u16(ptype, op_err);
-	pkts[0]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 6)];
-	pkts[1]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 4)];
-	pkts[2]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 2)];
-	pkts[3]->packet_type =
-		mlx5_ptype_table[vget_lane_u8(vreinterpret_u8_u16(ptype), 0)];
+	pt_idx0 = vget_lane_u8(vreinterpret_u8_u16(ptype), 6);
+	pt_idx1 = vget_lane_u8(vreinterpret_u8_u16(ptype), 4);
+	pt_idx2 = vget_lane_u8(vreinterpret_u8_u16(ptype), 2);
+	pt_idx3 = vget_lane_u8(vreinterpret_u8_u16(ptype), 0);
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = vandq_u32(ptype_info, ptype_ol_mask);
 	pinfo = vreinterpretq_u32_u8(
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 83d6e431f..4a6789a78 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -542,6 +542,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	const __m128i mbuf_init =
 		_mm_loadl_epi64((__m128i *)&rxq->mbuf_initializer);
 	__m128i rearm0, rearm1, rearm2, rearm3;
+	uint8_t pt_idx0, pt_idx1, pt_idx2, pt_idx3;
 
 	/* Extract pkt_info field. */
 	pinfo0 = _mm_unpacklo_epi32(cqes[0], cqes[1]);
@@ -595,10 +596,18 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	/* Errored packets will have RTE_PTYPE_ALL_MASK. */
 	op_err = _mm_srli_epi16(op_err, 8);
 	ptype = _mm_or_si128(ptype, op_err);
-	pkts[0]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 0)];
-	pkts[1]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 2)];
-	pkts[2]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 4)];
-	pkts[3]->packet_type = mlx5_ptype_table[_mm_extract_epi8(ptype, 6)];
+	pt_idx0 = _mm_extract_epi8(ptype, 0);
+	pt_idx1 = _mm_extract_epi8(ptype, 2);
+	pt_idx2 = _mm_extract_epi8(ptype, 4);
+	pt_idx3 = _mm_extract_epi8(ptype, 6);
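+	/* Bit 6 of the ptype index flags a tunneled packet; multiplying
+	 * it by rxq->tunnel ORs in the tunnel ptype without a branch. */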
+	pkts[0]->packet_type = mlx5_ptype_table[pt_idx0] |
+			       !!(pt_idx0 & (1 << 6)) * rxq->tunnel;
+	pkts[1]->packet_type = mlx5_ptype_table[pt_idx1] |
+			       !!(pt_idx1 & (1 << 6)) * rxq->tunnel;
+	pkts[2]->packet_type = mlx5_ptype_table[pt_idx2] |
+			       !!(pt_idx2 & (1 << 6)) * rxq->tunnel;
+	pkts[3]->packet_type = mlx5_ptype_table[pt_idx3] |
+			       !!(pt_idx3 & (1 << 6)) * rxq->tunnel;
 	/* Fill flags for checksum and VLAN. */
 	pinfo = _mm_and_si128(pinfo, ptype_ol_mask);
 	pinfo = _mm_shuffle_epi8(cv_flag_sel, pinfo);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 05/11] net/mlx5: cleanup tunnel checksum offloads
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (4 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 06/11] net/mlx5: split flow RSS handling logic Xueming Li
                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch cleans up the tunnel checksum offloads.

Once a tunnel packet type (RTE_PTYPE_TUNNEL_xxx) is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum
result of the inner headers; the outer L3 and L4 header checksums are
always valid as soon as a tunnel is identified. If no tunnel is
identified, PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the
checksum result of the outer L3 and L4 headers.

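As an illustration, a minimal Rx-side sketch of how an application
could interpret these flags under the new semantics (the helper name
is hypothetical, not part of this series):

  #include <stdio.h>
  #include <rte_mbuf.h>

  /* Report which headers the Rx checksum flags cover for an mbuf. */
  static void
  describe_rx_csum(const struct rte_mbuf *m)
  {
  	int tunneled = !!(m->packet_type & RTE_PTYPE_TUNNEL_MASK);
  	int ip_ok = (m->ol_flags & PKT_RX_IP_CKSUM_MASK) ==
  		    PKT_RX_IP_CKSUM_GOOD;

  	if (tunneled)
  		/* Flags describe the inner headers; the outer L3/L4
  		 * checksums are implicitly valid. */
  		printf("inner IP csum %s\n", ip_ok ? "good" : "not good");
  	else
  		/* No tunnel identified: flags describe the outer
  		 * L3/L4 headers. */
  		printf("IP csum %s\n", ip_ok ? "good" : "not good");
  }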
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 --
 drivers/net/mlx5/mlx5_rxtx.c | 18 ++++--------------
 drivers/net/mlx5/mlx5_rxtx.h |  1 -
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1fbd02aa0..6756f25fa 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1045,8 +1045,6 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	}
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM);
-	tmpl->rxq.csum_l2tun = (!!(conf->offloads & DEV_RX_OFFLOAD_CHECKSUM) &&
-				priv->config.tunnel_en);
 	tmpl->rxq.hw_timestamp = !!(conf->offloads & DEV_RX_OFFLOAD_TIMESTAMP);
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(conf->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fafac514b..060ff0e85 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -41,7 +41,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 		 uint16_t cqe_cnt, uint32_t *rss_hash);
 
 static __rte_always_inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe);
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
 	[0xff] = RTE_PTYPE_ALL_MASK, /* Last entry for errored packet. */
@@ -1728,8 +1728,6 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 /**
  * Translate RX completion flags to offload flags.
  *
- * @param[in] rxq
- *   Pointer to RX queue structure.
  * @param[in] cqe
  *   Pointer to CQE.
  *
@@ -1737,7 +1735,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
  *   Offload flags (ol_flags) for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
+rxq_cq_to_ol_flags(volatile struct mlx5_cqe *cqe)
 {
 	uint32_t ol_flags = 0;
 	uint16_t flags = rte_be_to_cpu_16(cqe->hdr_type_etc);
@@ -1749,14 +1747,6 @@ rxq_cq_to_ol_flags(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe)
 		TRANSPOSE(flags,
 			  MLX5_CQE_RX_L4_HDR_VALID,
 			  PKT_RX_L4_CKSUM_GOOD);
-	if ((cqe->pkt_info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
-		ol_flags |=
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L3_HDR_VALID,
-				  PKT_RX_IP_CKSUM_GOOD) |
-			TRANSPOSE(flags,
-				  MLX5_CQE_RX_L4_HDR_VALID,
-				  PKT_RX_L4_CKSUM_GOOD);
 	return ol_flags;
 }
 
@@ -1855,8 +1845,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						mlx5_flow_mark_get(mark);
 				}
 			}
-			if (rxq->csum | rxq->csum_l2tun)
-				pkt->ol_flags |= rxq_cq_to_ol_flags(rxq, cqe);
+			if (rxq->csum)
+				pkt->ol_flags |= rxq_cq_to_ol_flags(cqe);
 			if (rxq->vlan_strip &&
 			    (cqe->hdr_type_etc &
 			     rte_cpu_to_be_16(MLX5_CQE_VLAN_STRIPPED))) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 676ad6a9a..188fd65c5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -77,7 +77,6 @@ struct rxq_zip {
 /* RX queue descriptor. */
 struct mlx5_rxq_data {
 	unsigned int csum:1; /* Enable checksum offloading. */
-	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_timestamp:1; /* Enable HW timestamp. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	unsigned int crc_present:1; /* CRC must be subtracted. */
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 06/11] net/mlx5: split flow RSS handling logic
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (5 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 07/11] net/mlx5: support tunnel RSS level Xueming Li
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

This patch splits the flow RSS hash field handling logic out into a
dedicated function.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 126 +++++++++++++++++++++++--------------------
 1 file changed, 68 insertions(+), 58 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index fa1487d29..c2e57094e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1000,59 +1000,8 @@ mlx5_flow_update_priority(struct rte_eth_dev *dev,
 static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
 
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-		return;
-	}
-	if (parser->layer == HASH_RXQ_ETH) {
-		goto fill;
-	} else {
-		/*
-		 * This layer becomes useless as the pattern define under
-		 * layers.
-		 */
-		rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-		parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	}
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if ((i == parser->layer) ||
-			     (!parser->queue[i].ibv_attr))
-				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
-		}
-	} else  if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
-		return;
-	}
-fill:
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1115,6 +1064,66 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 }
 
 /**
+ * Update flows according to pattern and RSS hash fields.
+ *
+ * @param[in, out] parser
+ *   Internal parser structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
+{
+	const unsigned int ipv4 =
+		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
+	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
+	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
+	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
+	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
+	unsigned int i;
+
+	/* Remove any other flow not matching the pattern. */
+	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (i == HASH_RXQ_ETH)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+		return 0;
+	}
+	if (parser->layer == HASH_RXQ_ETH)
+		return 0;
+	/* This layer becomes useless as the pattern define under layers. */
+	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
+	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
+	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
+	for (i = ohmin; i != (ohmax + 1); ++i) {
+		if (!parser->queue[i].ibv_attr)
+			continue;
+		rte_free(parser->queue[i].ibv_attr);
+		parser->queue[i].ibv_attr = NULL;
+	}
+	/* Remove impossible flow according to the RSS configuration. */
+	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
+	    parser->rss_conf.types) {
+		/* Remove any other flow. */
+		for (i = hmin; i != (hmax + 1); ++i) {
+			if (i == parser->layer || !parser->queue[i].ibv_attr)
+				continue;
+			rte_free(parser->queue[i].ibv_attr);
+			parser->queue[i].ibv_attr = NULL;
+		}
+	} else if (!parser->queue[ip].ibv_attr) {
+		/* no RSS possible with the current configuration. */
+		parser->rss_conf.queue_num = 1;
+	}
+	return 0;
+}
+
+/**
  * Validate and convert a flow supported by the NIC.
  *
  * @param dev
@@ -1211,6 +1220,14 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (ret)
 			goto exit_free;
 	}
+	if (!parser->drop) {
+		/* RSS check, remove unused hash types. */
+		ret = mlx5_flow_convert_rss(parser);
+		if (ret)
+			goto exit_free;
+		/* Complete missing specification. */
+		mlx5_flow_convert_finalise(parser);
+	}
+	mlx5_flow_update_priority(dev, parser, attr);
 	if (parser->mark)
 		mlx5_flow_create_flag_mark(parser, parser->mark_id);
 	if (parser->count && parser->create) {
@@ -1218,13 +1235,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		if (!parser->cs)
 			goto exit_count_error;
 	}
-	/*
-	 * Last step. Complete missing specification to reach the RSS
-	 * configuration.
-	 */
-	if (!parser->drop)
-		mlx5_flow_convert_finalise(parser);
-	mlx5_flow_update_priority(dev, parser, attr);
 exit_free:
 	/* Only verification is expected, all resources should be released. */
 	if (!parser->create) {
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 07/11] net/mlx5: support tunnel RSS level
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (6 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 06/11] net/mlx5: split flow RSS handling logic Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 08/11] net/mlx5: add hardware flow debug dump Xueming Li
                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

The tunnel RSS level of the flow RSS action lets the user choose whether
the RSS hash is calculated on the inner or the outer packet fields.
Testpmd flow command examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 2 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end

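The same rules can be built through the rte_flow C API. A minimal
sketch (the function name is hypothetical, the pattern items are
assumed to be built by the caller, and error handling is omitted):

  #include <rte_ethdev.h>
  #include <rte_flow.h>

  /* Create an ingress flow hashing on the inner headers (level 2). */
  static struct rte_flow *
  create_inner_rss_flow(uint16_t port, const struct rte_flow_item *pattern)
  {
  	static const uint16_t queues[] = { 1, 2 };
  	const struct rte_flow_attr attr = { .ingress = 1 };
  	const struct rte_flow_action_rss rss = {
  		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
  		.level = 2, /* 0 = PMD default, 1 = outer, 2 = inner */
  		.types = ETH_RSS_IP,
  		.queue_num = 2,
  		.queue = queues,
  	};
  	const struct rte_flow_action actions[] = {
  		{ .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
  		{ .type = RTE_FLOW_ACTION_TYPE_END },
  	};
  	struct rte_flow_error err;

  	return rte_flow_create(port, &attr, pattern, actions, &err);
  }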
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/Makefile    |   2 +-
 drivers/net/mlx5/mlx5_flow.c | 257 +++++++++++++++++++++++++++----------------
 drivers/net/mlx5/mlx5_glue.c |  16 +++
 drivers/net/mlx5/mlx5_glue.h |   8 ++
 drivers/net/mlx5/mlx5_rxq.c  |  58 +++++++++-
 drivers/net/mlx5/mlx5_rxtx.h |   5 +-
 6 files changed, 240 insertions(+), 106 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index b710a10f5..d9447ace9 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -8,7 +8,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_pmd_mlx5.a
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
-LIB_GLUE_VERSION = 18.02.0
+LIB_GLUE_VERSION = 18.05.0
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c2e57094e..174f2ba6e 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -117,6 +117,7 @@ enum hash_rxq_type {
 	HASH_RXQ_UDPV6,
 	HASH_RXQ_IPV6,
 	HASH_RXQ_ETH,
+	HASH_RXQ_TUNNEL,
 };
 
 /* Initialization data for hash RX queue. */
@@ -455,6 +456,7 @@ struct mlx5_flow_parse {
 	uint16_t queues[RTE_MAX_QUEUES_PER_PORT]; /**< Queues indexes to use. */
 	uint8_t rss_key[40]; /**< copy of the RSS key. */
 	enum hash_rxq_type layer; /**< Last pattern layer detected. */
+	enum hash_rxq_type out_layer; /**< Last outer pattern layer detected. */
 	uint32_t tunnel; /**< Tunnel type of RTE_PTYPE_TUNNEL_XXX. */
 	struct ibv_counter_set *cs; /**< Holds the counter set for the rule */
 	struct {
@@ -462,6 +464,7 @@ struct mlx5_flow_parse {
 		/**< Pointer to Verbs attributes. */
 		unsigned int offset;
 		/**< Current position or total size of the attribute. */
+		uint64_t hash_fields; /**< Verbs hash fields. */
 	} queue[RTE_DIM(hash_rxq_init)];
 };
 
@@ -697,7 +700,8 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " function is Toeplitz");
 				return -rte_errno;
 			}
-			if (rss->level) {
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+			if (rss->level > 1) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
 						   actions,
@@ -705,6 +709,15 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 						   " level is not supported");
 				return -rte_errno;
 			}
+#endif
+			if (rss->level > 2) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION,
+						   actions,
+						   "RSS encapsulation level"
+						   " > 1 is not supported");
+				return -rte_errno;
+			}
 			if (rss->types & MLX5_RSS_HF_MASK) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ACTION,
@@ -755,7 +768,7 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
 			}
 			parser->rss_conf = (struct rte_flow_action_rss){
 				.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
-				.level = 0,
+				.level = rss->level,
 				.types = rss->types,
 				.key_len = rss_key_len,
 				.queue_num = rss->queue_num,
@@ -839,10 +852,12 @@ mlx5_flow_convert_actions(struct rte_eth_dev *dev,
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
+mlx5_flow_convert_items_validate(struct rte_eth_dev *dev,
+				 const struct rte_flow_item items[],
 				 struct rte_flow_error *error,
 				 struct mlx5_flow_parse *parser)
 {
+	struct priv *priv = dev->data->dev_private;
 	const struct mlx5_flow_items *cur_item = mlx5_flow_items;
 	unsigned int i;
 	int ret = 0;
@@ -882,6 +897,14 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
 						   " tunnel encapsulations.");
 				return -rte_errno;
 			}
+			if (!priv->config.tunnel_en &&
+			    parser->rss_conf.level > 1) {
+				rte_flow_error_set(error, ENOTSUP,
+					RTE_FLOW_ERROR_TYPE_ITEM,
+					items,
+					"RSS on tunnel is not supported");
+				return -rte_errno;
+			}
 			parser->inner = IBV_FLOW_SPEC_INNER;
 			parser->tunnel = flow_ptype[items->type];
 		}
@@ -1001,7 +1024,11 @@ static void
 mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 {
 	unsigned int i;
+	uint32_t inner = parser->inner;
 
+	/* Don't create extra flows for outer RSS. */
+	if (parser->tunnel && parser->rss_conf.level < 2)
+		return;
 	/*
 	 * Fill missing layers in verbs specifications, or compute the correct
 	 * offset to allocate the memory space for the attributes and
@@ -1012,23 +1039,25 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 			struct ibv_flow_spec_ipv4_ext ipv4;
 			struct ibv_flow_spec_ipv6 ipv6;
 			struct ibv_flow_spec_tcp_udp udp_tcp;
+			struct ibv_flow_spec_eth eth;
 		} specs;
 		void *dst;
 		uint16_t size;
 
 		if (i == parser->layer)
 			continue;
-		if (parser->layer == HASH_RXQ_ETH) {
+		if (parser->layer == HASH_RXQ_ETH ||
+		    parser->layer == HASH_RXQ_TUNNEL) {
 			if (hash_rxq_init[i].ip_version == MLX5_IPV4) {
 				size = sizeof(struct ibv_flow_spec_ipv4_ext);
 				specs.ipv4 = (struct ibv_flow_spec_ipv4_ext){
-					.type = IBV_FLOW_SPEC_IPV4_EXT,
+					.type = inner | IBV_FLOW_SPEC_IPV4_EXT,
 					.size = size,
 				};
 			} else {
 				size = sizeof(struct ibv_flow_spec_ipv6);
 				specs.ipv6 = (struct ibv_flow_spec_ipv6){
-					.type = IBV_FLOW_SPEC_IPV6,
+					.type = inner | IBV_FLOW_SPEC_IPV6,
 					.size = size,
 				};
 			}
@@ -1045,7 +1074,7 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 		    (i == HASH_RXQ_UDPV6) || (i == HASH_RXQ_TCPV6)) {
 			size = sizeof(struct ibv_flow_spec_tcp_udp);
 			specs.udp_tcp = (struct ibv_flow_spec_tcp_udp) {
-				.type = ((i == HASH_RXQ_UDPV4 ||
+				.type = inner | ((i == HASH_RXQ_UDPV4 ||
 					  i == HASH_RXQ_UDPV6) ?
 					 IBV_FLOW_SPEC_UDP :
 					 IBV_FLOW_SPEC_TCP),
@@ -1075,50 +1104,93 @@ mlx5_flow_convert_finalise(struct mlx5_flow_parse *parser)
 static int
 mlx5_flow_convert_rss(struct mlx5_flow_parse *parser)
 {
-	const unsigned int ipv4 =
-		hash_rxq_init[parser->layer].ip_version == MLX5_IPV4;
-	const enum hash_rxq_type hmin = ipv4 ? HASH_RXQ_TCPV4 : HASH_RXQ_TCPV6;
-	const enum hash_rxq_type hmax = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
-	const enum hash_rxq_type ohmin = ipv4 ? HASH_RXQ_TCPV6 : HASH_RXQ_TCPV4;
-	const enum hash_rxq_type ohmax = ipv4 ? HASH_RXQ_IPV6 : HASH_RXQ_IPV4;
-	const enum hash_rxq_type ip = ipv4 ? HASH_RXQ_IPV4 : HASH_RXQ_IPV6;
 	unsigned int i;
-
-	/* Remove any other flow not matching the pattern. */
-	if (parser->rss_conf.queue_num == 1 && !parser->rss_conf.types) {
-		for (i = 0; i != hash_rxq_init_n; ++i) {
-			if (i == HASH_RXQ_ETH)
+	enum hash_rxq_type start;
+	enum hash_rxq_type layer;
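+	/* rss_conf.level below 2 selects the outer headers of a tunneled
+	 * packet; level 2 selects the inner headers. */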
+	int outer = parser->tunnel && parser->rss_conf.level < 2;
+	uint64_t rss = parser->rss_conf.types;
+
+	/* Default to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	layer = outer ? parser->out_layer : parser->layer;
+	if (layer == HASH_RXQ_TUNNEL)
+		layer = HASH_RXQ_ETH;
+	if (outer) {
+		/* Only one hash type for outer RSS. */
+		if (rss && layer == HASH_RXQ_ETH) {
+			start = HASH_RXQ_TCPV4;
+		} else if (rss && layer != HASH_RXQ_ETH &&
+			   !(rss & hash_rxq_init[layer].dpdk_rss_hf)) {
+			/* If RSS does not match the L4 pattern, try L3 RSS. */
+			if (layer < HASH_RXQ_IPV4)
+				layer = HASH_RXQ_IPV4;
+			else if (layer > HASH_RXQ_IPV4 && layer < HASH_RXQ_IPV6)
+				layer = HASH_RXQ_IPV6;
+			start = layer;
+		} else {
+			start = layer;
+		}
+		/* Scan first valid hash type. */
+		for (i = start; rss && i <= layer; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss)
+				break;
 		}
-		return 0;
-	}
-	if (parser->layer == HASH_RXQ_ETH)
-		return 0;
-	/* This layer becomes useless as the pattern define under layers. */
-	rte_free(parser->queue[HASH_RXQ_ETH].ibv_attr);
-	parser->queue[HASH_RXQ_ETH].ibv_attr = NULL;
-	/* Remove opposite kind of layer e.g. IPv6 if the pattern is IPv4. */
-	for (i = ohmin; i != (ohmax + 1); ++i) {
-		if (!parser->queue[i].ibv_attr)
-			continue;
-		rte_free(parser->queue[i].ibv_attr);
-		parser->queue[i].ibv_attr = NULL;
-	}
-	/* Remove impossible flow according to the RSS configuration. */
-	if (hash_rxq_init[parser->layer].dpdk_rss_hf &
-	    parser->rss_conf.types) {
-		/* Remove any other flow. */
-		for (i = hmin; i != (hmax + 1); ++i) {
-			if (i == parser->layer || !parser->queue[i].ibv_attr)
+		if (rss && i <= layer)
+			parser->queue[layer].hash_fields =
+					hash_rxq_init[i].hash_fields;
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (parser->queue[i].ibv_attr && i != layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			}
+		}
+	} else {
+		/* Expand for inner or normal RSS. */
+		if (rss && (layer == HASH_RXQ_ETH || layer == HASH_RXQ_IPV4))
+			start = HASH_RXQ_TCPV4;
+		else if (rss && layer == HASH_RXQ_IPV6)
+			start = HASH_RXQ_TCPV6;
+		else
+			start = layer;
+		/* For L4 pattern, try L3 RSS if no L4 RSS. */
+		/* Trim unused hash types. */
+		for (i = 0; i != hash_rxq_init_n; ++i) {
+			if (!parser->queue[i].ibv_attr)
 				continue;
-			rte_free(parser->queue[i].ibv_attr);
-			parser->queue[i].ibv_attr = NULL;
+			if (i < start || i > layer) {
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+				continue;
+			}
+			if (!rss)
+				continue;
+			if (hash_rxq_init[i].dpdk_rss_hf & rss) {
+				parser->queue[i].hash_fields =
+						hash_rxq_init[i].hash_fields;
+			} else if (i != layer) {
+				/* Remove unused RSS expansion. */
+				rte_free(parser->queue[i].ibv_attr);
+				parser->queue[i].ibv_attr = NULL;
+			} else if (layer < HASH_RXQ_IPV4 &&
+				   (hash_rxq_init[HASH_RXQ_IPV4].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv4 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV4]
+						.hash_fields;
+			} else if (i > HASH_RXQ_IPV4 && i < HASH_RXQ_IPV6 &&
+				   (hash_rxq_init[HASH_RXQ_IPV6].dpdk_rss_hf &
+				    rss)) {
+				/* Allow IPv6 RSS on L4 pattern. */
+				parser->queue[i].hash_fields =
+					hash_rxq_init[HASH_RXQ_IPV6]
+						.hash_fields;
+			}
 		}
-	} else if (!parser->queue[ip].ibv_attr) {
-		/* no RSS possible with the current configuration. */
-		parser->rss_conf.queue_num = 1;
 	}
 	return 0;
 }
@@ -1166,7 +1238,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	ret = mlx5_flow_convert_actions(dev, actions, error, parser);
 	if (ret)
 		return ret;
-	ret = mlx5_flow_convert_items_validate(items, error, parser);
+	ret = mlx5_flow_convert_items_validate(dev, items, error, parser);
 	if (ret)
 		return ret;
 	mlx5_flow_convert_finalise(parser);
@@ -1187,10 +1259,6 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 		for (i = 0; i != hash_rxq_init_n; ++i) {
 			unsigned int offset;
 
-			if (!(parser->rss_conf.types &
-			      hash_rxq_init[i].dpdk_rss_hf) &&
-			    (i != HASH_RXQ_ETH))
-				continue;
 			offset = parser->queue[i].offset;
 			parser->queue[i].ibv_attr =
 				mlx5_flow_convert_allocate(offset, error);
@@ -1202,6 +1270,7 @@ mlx5_flow_convert(struct rte_eth_dev *dev,
 	/* Third step. Conversion parse, fill the specifications. */
 	parser->inner = 0;
 	parser->tunnel = 0;
+	parser->layer = HASH_RXQ_ETH;
 	for (; items->type != RTE_FLOW_ITEM_TYPE_END; ++items) {
 		struct mlx5_flow_data data = {
 			.dev = dev,
@@ -1282,17 +1351,11 @@ mlx5_flow_create_copy(struct mlx5_flow_parse *parser, void *src,
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
 			continue;
-		/* Specification must be the same l3 type or none. */
-		if (parser->layer == HASH_RXQ_ETH ||
-		    (hash_rxq_init[parser->layer].ip_version ==
-		     hash_rxq_init[i].ip_version) ||
-		    (hash_rxq_init[i].ip_version == 0)) {
-			dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
-					parser->queue[i].offset);
-			memcpy(dst, src, size);
-			++parser->queue[i].ibv_attr->num_of_specs;
-			parser->queue[i].offset += size;
-		}
+		dst = (void *)((uintptr_t)parser->queue[i].ibv_attr +
+				parser->queue[i].offset);
+		memcpy(dst, src, size);
+		++parser->queue[i].ibv_attr->num_of_specs;
+		parser->queue[i].offset += size;
 	}
 }
 
@@ -1323,9 +1386,7 @@ mlx5_flow_create_eth(const struct rte_flow_item *item,
 		.size = eth_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_ETH;
+	parser->layer = HASH_RXQ_ETH;
 	if (spec) {
 		unsigned int i;
 
@@ -1446,9 +1507,7 @@ mlx5_flow_create_ipv4(const struct rte_flow_item *item,
 					  "L3 VXLAN not enabled by device"
 					  " parameter and/or not configured"
 					  " in firmware");
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV4;
+	parser->layer = HASH_RXQ_IPV4;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1511,9 +1570,7 @@ mlx5_flow_create_ipv6(const struct rte_flow_item *item,
 					  "L3 VXLAN not enabled by device"
 					  " parameter and/or not configured"
 					  " in firmware");
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner)
-		parser->layer = HASH_RXQ_IPV6;
+	parser->layer = HASH_RXQ_IPV6;
 	if (spec) {
 		unsigned int i;
 		uint32_t vtc_flow_val;
@@ -1586,13 +1643,10 @@ mlx5_flow_create_udp(const struct rte_flow_item *item,
 		.size = udp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_UDPV4;
-		else
-			parser->layer = HASH_RXQ_UDPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_UDPV4;
+	else
+		parser->layer = HASH_RXQ_UDPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1635,13 +1689,10 @@ mlx5_flow_create_tcp(const struct rte_flow_item *item,
 		.size = tcp_size,
 	};
 
-	/* Don't update layer for the inner pattern. */
-	if (!parser->inner) {
-		if (parser->layer == HASH_RXQ_IPV4)
-			parser->layer = HASH_RXQ_TCPV4;
-		else
-			parser->layer = HASH_RXQ_TCPV6;
-	}
+	if (parser->layer == HASH_RXQ_IPV4)
+		parser->layer = HASH_RXQ_TCPV4;
+	else
+		parser->layer = HASH_RXQ_TCPV6;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1691,6 +1742,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	id.vni[0] = 0;
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
 	if (spec) {
 		if (!mask)
 			mask = default_mask;
@@ -1748,6 +1804,11 @@ mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
 
 	parser->inner = IBV_FLOW_SPEC_INNER;
 	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default GRE to inner RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 2;
 	/* Update encapsulation IP layer protocol. */
 	for (i = 0; i != hash_rxq_init_n; ++i) {
 		if (!parser->queue[i].ibv_attr)
@@ -1939,33 +2000,33 @@ mlx5_flow_create_action_queue_rss(struct rte_eth_dev *dev,
 	unsigned int i;
 
 	for (i = 0; i != hash_rxq_init_n; ++i) {
-		uint64_t hash_fields;
-
 		if (!parser->queue[i].ibv_attr)
 			continue;
 		flow->frxq[i].ibv_attr = parser->queue[i].ibv_attr;
 		parser->queue[i].ibv_attr = NULL;
-		hash_fields = hash_rxq_init[i].hash_fields;
+		flow->frxq[i].hash_fields = parser->queue[i].hash_fields;
 		if (!priv->dev->data->dev_started)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_get(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (flow->frxq[i].hrxq)
 			continue;
 		flow->frxq[i].hrxq =
 			mlx5_hrxq_new(dev,
 				      parser->rss_conf.key,
 				      parser->rss_conf.key_len,
-				      hash_fields,
+				      flow->frxq[i].hash_fields,
 				      parser->rss_conf.queue,
 				      parser->rss_conf.queue_num,
-				      parser->tunnel);
+				      parser->tunnel,
+				      parser->rss_conf.level);
 		if (!flow->frxq[i].hrxq) {
 			return rte_flow_error_set(error, ENOMEM,
 						  RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2070,7 +2131,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
 			dev->data->port_id,
 			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq,
+			(void *)flow->frxq[i].hrxq->qp,
 			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
@@ -2598,19 +2659,21 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_get(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (flow->frxq[i].hrxq)
 				goto flow_create;
 			flow->frxq[i].hrxq =
 				mlx5_hrxq_new(dev, flow->rss_conf.key,
 					      flow->rss_conf.key_len,
-					      hash_rxq_init[i].hash_fields,
+					      flow->frxq[i].hash_fields,
 					      flow->rss_conf.queue,
 					      flow->rss_conf.queue_num,
-					      flow->tunnel);
+					      flow->tunnel,
+					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
 					"port %u flow %p cannot be applied",
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index a771ac4c7..cd2716352 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -313,6 +313,21 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 	return mlx5dv_init_obj(obj, obj_type);
 }
 
+static struct ibv_qp *
+mlx5_glue_dv_create_qp(struct ibv_context *context,
+		       struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		       struct mlx5dv_qp_init_attr *dv_qp_init_attr)
+{
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	return mlx5dv_create_qp(context, qp_init_attr_ex, dv_qp_init_attr);
+#else
+	(void)context;
+	(void)qp_init_attr_ex;
+	(void)dv_qp_init_attr;
+	return NULL;
+#endif
+}
+
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
@@ -356,4 +371,5 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.dv_query_device = mlx5_glue_dv_query_device,
 	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
 	.dv_init_obj = mlx5_glue_dv_init_obj,
+	.dv_create_qp = mlx5_glue_dv_create_qp,
 };
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 33385d226..9f36af81a 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -31,6 +31,10 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+#ifndef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+struct mlx5dv_qp_init_attr;
+#endif
+
 /* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
 	const char *version;
@@ -106,6 +110,10 @@ struct mlx5_glue {
 				   enum mlx5dv_set_ctx_attr_type type,
 				   void *attr);
 	int (*dv_init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type);
+	struct ibv_qp *(*dv_create_qp)
+		(struct ibv_context *context,
+		 struct ibv_qp_init_attr_ex *qp_init_attr_ex,
+		 struct mlx5dv_qp_init_attr *dv_qp_init_attr);
 };
 
 const struct mlx5_glue *mlx5_glue;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 6756f25fa..58403b5b6 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1385,7 +1385,9 @@ mlx5_ind_table_ibv_verify(struct rte_eth_dev *dev)
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type; implies tunnel offloading such as inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   The Verbs object initialised, NULL otherwise and rte_errno is set.
@@ -1394,13 +1396,17 @@ struct mlx5_hrxq *
 mlx5_hrxq_new(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
 	struct mlx5_ind_table_ibv *ind_tbl;
 	struct ibv_qp *qp;
 	int err;
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
+	struct mlx5dv_qp_init_attr qp_init_attr = {0};
+#endif
 
 	queues_n = hash_fields ? queues_n : 1;
 	ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
@@ -1410,6 +1416,36 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 		rte_errno = ENOMEM;
 		return NULL;
 	}
+#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT
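+	/* Tunnel offloads (e.g. inner checksum) require the QP to be
+	 * created through the mlx5dv API with the tunnel offloads create
+	 * flag; inner RSS additionally sets IBV_RX_HASH_INNER in the
+	 * hash fields mask. */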
+	if (tunnel) {
+		qp_init_attr.comp_mask =
+				MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS;
+		qp_init_attr.create_flags = MLX5DV_QP_CREATE_TUNNEL_OFFLOADS;
+	}
+	qp = mlx5_glue->dv_create_qp(
+		priv->ctx,
+		&(struct ibv_qp_init_attr_ex){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_QP_INIT_ATTR_PD |
+				IBV_QP_INIT_ATTR_IND_TABLE |
+				IBV_QP_INIT_ATTR_RX_HASH,
+			.rx_hash_conf = (struct ibv_rx_hash_conf){
+				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
+				.rx_hash_fields_mask = hash_fields |
+					(tunnel && rss_level > 1 ?
+					(uint32_t)IBV_RX_HASH_INNER : 0),
+			},
+			.rwq_ind_tbl = ind_tbl->ind_table,
+			.pd = priv->pd,
+		},
+		&qp_init_attr);
+#else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
 		 &(struct ibv_qp_init_attr_ex){
@@ -1420,13 +1456,17 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 				IBV_QP_INIT_ATTR_RX_HASH,
 			.rx_hash_conf = (struct ibv_rx_hash_conf){
 				.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
-				.rx_hash_key_len = rss_key_len,
-				.rx_hash_key = (void *)(uintptr_t)rss_key,
+				.rx_hash_key_len = rss_key_len ? rss_key_len :
+						   rss_hash_default_key_len,
+				.rx_hash_key = rss_key ?
+					       (void *)(uintptr_t)rss_key :
+					       rss_hash_default_key,
 				.rx_hash_fields_mask = hash_fields,
 			},
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+#endif
 	if (!qp) {
 		rte_errno = errno;
 		goto error;
@@ -1439,6 +1479,7 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 	hrxq->rss_key_len = rss_key_len;
 	hrxq->hash_fields = hash_fields;
 	hrxq->tunnel = tunnel;
+	hrxq->rss_level = rss_level;
 	memcpy(hrxq->rss_key, rss_key, rss_key_len);
 	rte_atomic32_inc(&hrxq->refcnt);
 	LIST_INSERT_HEAD(&priv->hrxqs, hrxq, next);
@@ -1468,7 +1509,9 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
  * @param queues_n
  *   Number of queues.
  * @param tunnel
- *   Tunnel type.
+ *   Tunnel type; implies tunnel offloading such as inner checksum if available.
+ * @param rss_level
+ *   RSS hash on tunnel level.
  *
  * @return
  *   An hash Rx queue on success.
@@ -1477,7 +1520,8 @@ struct mlx5_hrxq *
 mlx5_hrxq_get(struct rte_eth_dev *dev,
 	      const uint8_t *rss_key, uint32_t rss_key_len,
 	      uint64_t hash_fields,
-	      const uint16_t *queues, uint32_t queues_n, uint32_t tunnel)
+	      const uint16_t *queues, uint32_t queues_n,
+	      uint32_t tunnel, uint32_t rss_level)
 {
 	struct priv *priv = dev->data->dev_private;
 	struct mlx5_hrxq *hrxq;
@@ -1494,6 +1538,8 @@ mlx5_hrxq_get(struct rte_eth_dev *dev,
 			continue;
 		if (hrxq->tunnel != tunnel)
 			continue;
+		if (hrxq->rss_level != rss_level)
+			continue;
 		ind_tbl = mlx5_ind_table_ibv_get(dev, queues, queues_n);
 		if (!ind_tbl)
 			continue;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 188fd65c5..07b3adfae 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -147,6 +147,7 @@ struct mlx5_hrxq {
 	struct ibv_qp *qp; /* Verbs queue pair. */
 	uint64_t hash_fields; /* Verbs Hash fields. */
 	uint32_t tunnel; /* Tunnel type. */
+	uint32_t rss_level; /* RSS on tunnel level. */
 	uint32_t rss_key_len; /* Hash key length in bytes. */
 	uint8_t rss_key[]; /* Hash key. */
 };
@@ -251,12 +252,12 @@ struct mlx5_hrxq *mlx5_hrxq_new(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 struct mlx5_hrxq *mlx5_hrxq_get(struct rte_eth_dev *dev,
 				const uint8_t *rss_key, uint32_t rss_key_len,
 				uint64_t hash_fields,
 				const uint16_t *queues, uint32_t queues_n,
-				uint32_t tunnel);
+				uint32_t tunnel, uint32_t rss_level);
 int mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hxrq);
 int mlx5_hrxq_ibv_verify(struct rte_eth_dev *dev);
 uint64_t mlx5_get_rx_port_offloads(void);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 08/11] net/mlx5: add hardware flow debug dump
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (7 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 07/11] net/mlx5: support tunnel RSS level Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-26 10:09         ` Ferruh Yigit
  2018-04-23 12:33       ` [PATCH v6 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
                         ` (2 subsequent siblings)
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Dump Verbs flow details, including flow spec type and size, for
debugging purposes.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
 drivers/net/mlx5/mlx5_rxq.c   | 26 ++++++++++++++---
 drivers/net/mlx5/mlx5_utils.h |  6 ++++
 3 files changed, 86 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 174f2ba6e..593c960f8 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -2080,6 +2080,57 @@ mlx5_flow_create_update_rxqs(struct rte_eth_dev *dev, struct rte_flow *flow)
 }
 
 /**
+ * Dump flow hash RX queue detail.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param flow
+ *   Pointer to the rte_flow.
+ * @param hrxq_idx
+ *   Hash RX queue index.
+ */
+static void
+mlx5_flow_dump(struct rte_eth_dev *dev __rte_unused,
+	       struct rte_flow *flow __rte_unused,
+	       unsigned int hrxq_idx __rte_unused)
+{
+#ifndef NDEBUG
+	uintptr_t spec_ptr;
+	uint16_t j;
+	char buf[256];
+	uint8_t off;
+
+	spec_ptr = (uintptr_t)(flow->frxq[hrxq_idx].ibv_attr + 1);
+	for (j = 0, off = 0; j < flow->frxq[hrxq_idx].ibv_attr->num_of_specs;
+	     j++) {
+		struct ibv_flow_spec *spec = (void *)spec_ptr;
+		off += sprintf(buf + off, " %x(%hu)", spec->hdr.type,
+			       spec->hdr.size);
+		spec_ptr += spec->hdr.size;
+	}
+	DRV_LOG(DEBUG,
+		"port %u Verbs flow %p type %u: hrxq:%p qp:%p ind:%p,"
+		" hash:%" PRIx64 "/%u specs:%hhu(%hu), priority:%hu, type:%d,"
+		" flags:%x, comp_mask:%x specs:%s",
+		dev->data->port_id, (void *)flow, hrxq_idx,
+		(void *)flow->frxq[hrxq_idx].hrxq,
+		(void *)flow->frxq[hrxq_idx].hrxq->qp,
+		(void *)flow->frxq[hrxq_idx].hrxq->ind_table,
+		flow->frxq[hrxq_idx].hash_fields |
+		(flow->tunnel &&
+		 flow->rss_conf.level > 1 ? (uint32_t)IBV_RX_HASH_INNER : 0),
+		flow->rss_conf.queue_num,
+		flow->frxq[hrxq_idx].ibv_attr->num_of_specs,
+		flow->frxq[hrxq_idx].ibv_attr->size,
+		flow->frxq[hrxq_idx].ibv_attr->priority,
+		flow->frxq[hrxq_idx].ibv_attr->type,
+		flow->frxq[hrxq_idx].ibv_attr->flags,
+		flow->frxq[hrxq_idx].ibv_attr->comp_mask,
+		buf);
+#endif
+}
+
+/**
  * Complete flow rule creation.
  *
  * @param dev
@@ -2121,6 +2172,7 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 		flow->frxq[i].ibv_flow =
 			mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 					       flow->frxq[i].ibv_attr);
+		mlx5_flow_dump(dev, flow, i);
 		if (!flow->frxq[i].ibv_flow) {
 			rte_flow_error_set(error, ENOMEM,
 					   RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2128,11 +2180,6 @@ mlx5_flow_create_action_queue(struct rte_eth_dev *dev,
 			goto error;
 		}
 		++flows_n;
-		DRV_LOG(DEBUG, "port %u %p type %d QP %p ibv_flow %p",
-			dev->data->port_id,
-			(void *)flow, i,
-			(void *)flow->frxq[i].hrxq->qp,
-			(void *)flow->frxq[i].ibv_flow);
 	}
 	if (!flows_n) {
 		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE,
@@ -2676,24 +2723,25 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list)
 					      flow->rss_conf.level);
 			if (!flow->frxq[i].hrxq) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
+					"port %u flow %p cannot create hash"
+					" rxq",
 					dev->data->port_id, (void *)flow);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
 flow_create:
+			mlx5_flow_dump(dev, flow, i);
 			flow->frxq[i].ibv_flow =
 				mlx5_glue->create_flow(flow->frxq[i].hrxq->qp,
 						       flow->frxq[i].ibv_attr);
 			if (!flow->frxq[i].ibv_flow) {
 				DRV_LOG(DEBUG,
-					"port %u flow %p cannot be applied",
-					dev->data->port_id, (void *)flow);
+					"port %u flow %p type %u cannot be"
+					" applied",
+					dev->data->port_id, (void *)flow, i);
 				rte_errno = EINVAL;
 				return -rte_errno;
 			}
-			DRV_LOG(DEBUG, "port %u flow %p applied",
-				dev->data->port_id, (void *)flow);
 		}
 		mlx5_flow_create_update_rxqs(dev, flow);
 	}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 58403b5b6..2957e7c86 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1259,9 +1259,9 @@ mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues,
 	}
 	rte_atomic32_inc(&ind_tbl->refcnt);
 	LIST_INSERT_HEAD(&priv->ind_tbls, ind_tbl, next);
-	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
-		dev->data->port_id, (void *)ind_tbl,
-		rte_atomic32_read(&ind_tbl->refcnt));
+	DEBUG("port %u new indirection table %p: queues:%u refcnt:%d",
+	      dev->data->port_id, (void *)ind_tbl, 1 << wq_n,
+	      rte_atomic32_read(&ind_tbl->refcnt));
 	return ind_tbl;
 error:
 	rte_free(ind_tbl);
@@ -1330,9 +1330,12 @@ mlx5_ind_table_ibv_release(struct rte_eth_dev *dev,
 	DRV_LOG(DEBUG, "port %u indirection table %p: refcnt %d",
 		((struct priv *)dev->data->dev_private)->port,
 		(void *)ind_tbl, rte_atomic32_read(&ind_tbl->refcnt));
-	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt))
+	if (rte_atomic32_dec_and_test(&ind_tbl->refcnt)) {
 		claim_zero(mlx5_glue->destroy_rwq_ind_table
 			   (ind_tbl->ind_table));
+		DEBUG("port %u delete indirection table %p: queues: %u",
+		      dev->data->port_id, (void *)ind_tbl, ind_tbl->queues_n);
+	}
 	for (i = 0; i != ind_tbl->queues_n; ++i)
 		claim_nonzero(mlx5_rxq_release(dev, ind_tbl->queues[i]));
 	if (!rte_atomic32_read(&ind_tbl->refcnt)) {
@@ -1445,6 +1448,13 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.pd = priv->pd,
 		},
 		&qp_init_attr);
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%" PRIx64
+	      " tunnel:0x%x level:%u dv_attr:comp_mask:0x%" PRIx64
+	      " create_flags:0x%x",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      (tunnel && rss_level == 2 ? (uint32_t)IBV_RX_HASH_INNER : 0) |
+	      hash_fields, tunnel, rss_level,
+	      qp_init_attr.comp_mask, qp_init_attr.create_flags);
 #else
 	qp = mlx5_glue->create_qp_ex
 		(priv->ctx,
@@ -1466,6 +1476,10 @@ mlx5_hrxq_new(struct rte_eth_dev *dev,
 			.rwq_ind_tbl = ind_tbl->ind_table,
 			.pd = priv->pd,
 		 });
+	DEBUG("port %u new QP:%p ind_tbl:%p hash_fields:0x%" PRIx64
+	      " tunnel:0x%x level:%u",
+	      dev->data->port_id, (void *)qp, (void *)ind_tbl,
+	      hash_fields, tunnel, rss_level);
 #endif
 	if (!qp) {
 		rte_errno = errno;
@@ -1575,6 +1589,10 @@ mlx5_hrxq_release(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
 		(void *)hrxq, rte_atomic32_read(&hrxq->refcnt));
 	if (rte_atomic32_dec_and_test(&hrxq->refcnt)) {
 		claim_zero(mlx5_glue->destroy_qp(hrxq->qp));
+		DEBUG("port %u delete QP %p: hash: 0x%" PRIx64 ", tunnel:"
+		      " 0x%x, level: %u",
+		      dev->data->port_id, (void *)hrxq, hrxq->hash_fields,
+		      hrxq->tunnel, hrxq->rss_level);
 		mlx5_ind_table_ibv_release(dev, hrxq->ind_table);
 		LIST_REMOVE(hrxq, next);
 		rte_free(hrxq);
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index e8f980ff7..886f60e61 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -103,16 +103,22 @@ extern int mlx5_logtype;
 /* claim_zero() does not perform any check when debugging is disabled. */
 #ifndef NDEBUG
 
+#define DEBUG(...) DRV_LOG(DEBUG, __VA_ARGS__)
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 
 #else /* NDEBUG */
 
+#define DEBUG(...) (void)0
 #define claim_zero(...) (__VA_ARGS__)
 #define claim_nonzero(...) (__VA_ARGS__)
 
 #endif /* NDEBUG */
 
+#define INFO(...) DRV_LOG(INFO, __VA_ARGS__)
+#define WARN(...) DRV_LOG(WARNING, __VA_ARGS__)
+#define ERROR(...) DRV_LOG(ERR, __VA_ARGS__)
+
 /* Convenience macros for accessing mbuf fields. */
 #define NEXT(m) ((m)->next)
 #define DATA_LEN(m) ((m)->data_len)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 09/11] net/mlx5: introduce VXLAN-GPE tunnel type
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (8 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 08/11] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
  2018-04-23 12:33       ` [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 doc/guides/nics/mlx5.rst     |   8 ++--
 drivers/net/mlx5/mlx5_flow.c | 107 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_rxtx.c |   3 +-
 3 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 421274729..6b83759c8 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -329,16 +329,16 @@ Run-time configuration
 
 - ``l3_vxlan_en`` parameter [int]
 
-  A nonzero value allows L3 VXLAN flow creation. To enable L3 VXLAN, users
-  has to configure firemware and enable this prameter. This is a prerequisite
-  to receive this kind of traffic.
+  A nonzero value allows L3 VXLAN and VXLAN-GPE flow creation. To enable
+  L3 VXLAN or VXLAN-GPE, users have to configure firmware and enable this
+  parameter. This is a prerequisite to receive this kind of traffic.
 
   Disabled by default.
 
 Firmware configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
-- L3 VXLAN destination UDP port
+- L3 VXLAN and VXLAN-GPE destination UDP port
 
    .. code-block:: console
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 593c960f8..a55644cd0 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -92,6 +92,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 		       struct mlx5_flow_data *data);
 
 static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data);
+
+static int
 mlx5_flow_create_gre(const struct rte_flow_item *item,
 		     const void *default_mask,
 		     struct mlx5_flow_data *data);
@@ -242,10 +247,12 @@ struct rte_flow {
 
 #define IS_TUNNEL(type) ( \
 	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
+	(type) == RTE_FLOW_ITEM_TYPE_VXLAN_GPE || \
 	(type) == RTE_FLOW_ITEM_TYPE_GRE)
 
 const uint32_t flow_ptype[] = {
 	[RTE_FLOW_ITEM_TYPE_VXLAN] = RTE_PTYPE_TUNNEL_VXLAN,
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = RTE_PTYPE_TUNNEL_VXLAN_GPE,
 	[RTE_FLOW_ITEM_TYPE_GRE] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -254,6 +261,8 @@ const uint32_t flow_ptype[] = {
 const uint32_t ptype_ext[] = {
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN)] = RTE_PTYPE_TUNNEL_VXLAN |
 					      RTE_PTYPE_L4_UDP,
+	[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)]	= RTE_PTYPE_TUNNEL_VXLAN_GPE |
+						  RTE_PTYPE_L4_UDP,
 	[PTYPE_IDX(RTE_PTYPE_TUNNEL_GRE)] = RTE_PTYPE_TUNNEL_GRE,
 };
 
@@ -311,6 +320,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 	[RTE_FLOW_ITEM_TYPE_END] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
 			       RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
 			       RTE_FLOW_ITEM_TYPE_GRE),
 	},
 	[RTE_FLOW_ITEM_TYPE_ETH] = {
@@ -389,7 +399,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.dst_sz = sizeof(struct ibv_flow_spec_ipv6),
 	},
 	[RTE_FLOW_ITEM_TYPE_UDP] = {
-		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN),
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VXLAN,
+			       RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_udp){
 			.hdr = {
@@ -441,6 +452,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.convert = mlx5_flow_create_vxlan,
 		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
 	},
+	[RTE_FLOW_ITEM_TYPE_VXLAN_GPE] = {
+		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
+			       RTE_FLOW_ITEM_TYPE_IPV4,
+			       RTE_FLOW_ITEM_TYPE_IPV6),
+		.actions = valid_actions,
+		.mask = &(const struct rte_flow_item_vxlan_gpe){
+			.vni = "\xff\xff\xff",
+		},
+		.default_mask = &rte_flow_item_vxlan_gpe_mask,
+		.mask_sz = sizeof(struct rte_flow_item_vxlan_gpe),
+		.convert = mlx5_flow_create_vxlan_gpe,
+		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
+	},
 };
 
 /** Structure to pass to the conversion function. */
@@ -1775,6 +1799,87 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert VXLAN-GPE item to Verbs specification.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param default_mask[in]
+ *   Default bit-masks to use when item->mask is not provided.
+ * @param data[in, out]
+ *   User structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_flow_create_vxlan_gpe(const struct rte_flow_item *item,
+			   const void *default_mask,
+			   struct mlx5_flow_data *data)
+{
+	struct priv *priv = data->dev->data->dev_private;
+	const struct rte_flow_item_vxlan_gpe *spec = item->spec;
+	const struct rte_flow_item_vxlan_gpe *mask = item->mask;
+	struct mlx5_flow_parse *parser = data->parser;
+	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
+	struct ibv_flow_spec_tunnel vxlan = {
+		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
+		.size = size,
+	};
+	union vni {
+		uint32_t vlan_id;
+		uint8_t vni[4];
+	} id;
+
+	if (!priv->config.l3_vxlan_en)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "L3 VXLAN not enabled by device"
+					  " parameter and/or not configured"
+					  " in firmware");
+	id.vni[0] = 0;
+	parser->inner = IBV_FLOW_SPEC_INNER;
+	parser->tunnel = ptype_ext[PTYPE_IDX(RTE_PTYPE_TUNNEL_VXLAN_GPE)];
+	parser->out_layer = parser->layer;
+	parser->layer = HASH_RXQ_TUNNEL;
+	/* Default VXLAN-GPE to outer RSS. */
+	if (!parser->rss_conf.level)
+		parser->rss_conf.level = 1;
+	if (spec) {
+		if (!mask)
+			mask = default_mask;
+		memcpy(&id.vni[1], spec->vni, 3);
+		vxlan.val.tunnel_id = id.vlan_id;
+		memcpy(&id.vni[1], mask->vni, 3);
+		vxlan.mask.tunnel_id = id.vlan_id;
+		if (spec->protocol)
+			return rte_flow_error_set(data->error, EINVAL,
+						  RTE_FLOW_ERROR_TYPE_ITEM,
+						  item,
+						  "VxLAN-GPE protocol not"
+						  " supported");
+		/* Remove unwanted bits from values. */
+		vxlan.val.tunnel_id &= vxlan.mask.tunnel_id;
+	}
+	/*
+	 * Tunnel id 0 is equivalent to not adding a VXLAN layer; if only this
+	 * layer is defined in the Verbs specification it is interpreted as
+	 * wildcard and all packets will match this rule, if it follows a full
+	 * stack layer (ex: eth / ipv4 / udp), all packets matching the layers
+	 * before will also match this rule.
+	 * To avoid such situation, VNI 0 is currently refused.
+	 */
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
+		return rte_flow_error_set(data->error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM,
+					  item,
+					  "VxLAN-GPE vni cannot be 0");
+	mlx5_flow_create_copy(parser, &vxlan, size);
+	return 0;
+}
+
+/**
  * Convert GRE item to Verbs specification.
  *
  * @param item[in]
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 060ff0e85..f10ea13c1 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -466,8 +466,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			uint8_t vlan_sz =
 				(buf->ol_flags & PKT_TX_VLAN_PKT) ? 4 : 0;
 			const uint64_t is_tunneled =
-				buf->ol_flags & (PKT_TX_TUNNEL_GRE |
-						 PKT_TX_TUNNEL_VXLAN);
+				buf->ol_flags & (PKT_TX_TUNNEL_MASK);
 
 			tso_header_sz = buf->l2_len + vlan_sz +
 					buf->l3_len + buf->l4_len;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (9 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-23 12:33       ` [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
  11 siblings, 0 replies; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

A tunnel pattern without a tunnel ID could match any non-tunneled
packet. This patch allows such a pattern once a proper outer
specification precedes it.

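For example, a VXLAN rule without a VNI is now accepted when it follows
a full outer specification (testpmd syntax, illustrative only):

  flow create 0 ingress pattern eth / ipv4 / udp / vxlan / end
actions queue index 1 / end

whereas a pattern starting directly at the tunnel item still requires a
non-zero VNI, as it would otherwise match any packet.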
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_flow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a55644cd0..06ed58ef5 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1789,7 +1789,8 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
 	 * before will also match this rule.
 	 * To avoid such situation, VNI 0 is currently refused.
 	 */
-	if (!vxlan.val.tunnel_id)
+	/* Only allow tunnel w/o tunnel id pattern after proper outer spec. */
+	if (parser->out_layer == HASH_RXQ_ETH && !vxlan.val.tunnel_id)
 		return rte_flow_error_set(data->error, EINVAL,
 					  RTE_FLOW_ERROR_TYPE_ITEM,
 					  item,
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-20 12:23     ` [PATCH v5 " Xueming Li
                         ` (10 preceding siblings ...)
  2018-04-23 12:33       ` [PATCH v6 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
@ 2018-04-23 12:33       ` Xueming Li
  2018-04-26 11:00         ` Ferruh Yigit
  11 siblings, 1 reply; 115+ messages in thread
From: Xueming Li @ 2018-04-23 12:33 UTC (permalink / raw)
  To: Nelio Laranjeiro, Shahaf Shuler; +Cc: Xueming Li, dev

Remove tunnel limitations, add new hardware tunnel offload features.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/nics/features/default.ini | 1 +
 doc/guides/nics/features/mlx5.ini    | 3 +++
 doc/guides/nics/mlx5.rst             | 4 ++--
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index dae2ad776..49be81450 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -29,6 +29,7 @@ Multicast MAC filter =
 RSS hash             =
 RSS key update       =
 RSS reta update      =
+Inner RSS            =
 VMDq                 =
 SR-IOV               =
 DCB                  =
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index f8ce08770..e75b14bdc 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -21,6 +21,7 @@ Multicast MAC filter = Y
 RSS hash             = Y
 RSS key update       = Y
 RSS reta update      = Y
+Inner RSS            = Y
 SR-IOV               = Y
 VLAN filter          = Y
 Flow director        = Y
@@ -30,6 +31,8 @@ VLAN offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Timestamp offload    = Y
+Inner L3 checksum    = Y
+Inner L4 checksum    = Y
 Packet type parsing  = Y
 Rx descriptor status = Y
 Tx descriptor status = Y
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 6b83759c8..ef1c7da45 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -74,12 +74,12 @@ Features
 - RX interrupts.
 - Statistics query including Basic, Extended and per queue.
 - Rx HW timestamp.
+- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLS-in-GRE, MPLS-in-UDP.
+- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification.
 
 Limitations
 -----------
 
-- Inner RSS for VXLAN frames is not supported yet.
-- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
 - For secondary process:
 
   - Forked secondary process not supported.
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-23 12:33       ` [PATCH v6 02/11] net/mlx5: support GRE tunnel flow Xueming Li
@ 2018-04-23 12:55         ` Nélio Laranjeiro
  2018-04-23 13:32           ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-23 12:55 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Mon, Apr 23, 2018 at 08:33:01PM +0800, Xueming Li wrote:
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 101 ++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 94 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 5402cb148..b365f9868 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -37,6 +37,7 @@
>  /* Internet Protocol versions. */
>  #define MLX5_IPV4 4
>  #define MLX5_IPV6 6
> +#define MLX5_GRE 47
>  
>  #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
>  struct ibv_flow_spec_counter_action {
> @@ -89,6 +90,11 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  		       const void *default_mask,
>  		       struct mlx5_flow_data *data);
>  
> +static int
> +mlx5_flow_create_gre(const struct rte_flow_item *item,
> +		     const void *default_mask,
> +		     struct mlx5_flow_data *data);
> +
>  struct mlx5_flow_parse;
>  
>  static void
> @@ -231,6 +237,10 @@ struct rte_flow {
>  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
>  	}
>  
> +#define IS_TUNNEL(type) ( \
> +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> +
>  /** Structure to generate a simple graph of layers supported by the NIC. */
>  struct mlx5_flow_items {
>  	/** List of possible actions for these items. */
> @@ -284,7 +294,8 @@ static const enum rte_flow_action_type valid_actions[] = {
>  static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	[RTE_FLOW_ITEM_TYPE_END] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  	},
>  	[RTE_FLOW_ITEM_TYPE_ETH] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN,
> @@ -316,7 +327,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	},
>  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> -			       RTE_FLOW_ITEM_TYPE_TCP),
> +			       RTE_FLOW_ITEM_TYPE_TCP,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_ipv4){
>  			.hdr = {
> @@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  	},
>  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> -			       RTE_FLOW_ITEM_TYPE_TCP),
> +			       RTE_FLOW_ITEM_TYPE_TCP,
> +			       RTE_FLOW_ITEM_TYPE_GRE),
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_ipv6){
>  			.hdr = {
> @@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.convert = mlx5_flow_create_tcp,
>  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
>  	},
> +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> +			       RTE_FLOW_ITEM_TYPE_IPV4,
> +			       RTE_FLOW_ITEM_TYPE_IPV6),
> +		.actions = valid_actions,
> +		.mask = &(const struct rte_flow_item_gre){
> +			.protocol = -1,
> +		},
> +		.default_mask = &rte_flow_item_gre_mask,
> +		.mask_sz = sizeof(struct rte_flow_item_gre),
> +		.convert = mlx5_flow_create_gre,
> +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> +	},
>  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
>  		.actions = valid_actions,
> @@ -401,7 +427,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  
>  /** Structure to pass to the conversion function. */
>  struct mlx5_flow_parse {
> -	uint32_t inner; /**< Set once VXLAN is encountered. */
> +	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
>  	uint32_t create:1;
>  	/**< Whether resources should remain after a validate. */
>  	uint32_t drop:1; /**< Target is a drop queue. */
> @@ -829,13 +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
>  					      cur_item->mask_sz);
>  		if (ret)
>  			goto exit_item_not_supported;
> -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> +		if (IS_TUNNEL(items->type)) {
>  			if (parser->inner) {
>  				rte_flow_error_set(error, ENOTSUP,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   items,
> -						   "cannot recognize multiple"
> -						   " VXLAN encapsulations");
> +						   "Cannot recognize multiple"
> +						   " tunnel encapsulations.");
>  				return -rte_errno;
>  			}
>  			parser->inner = IBV_FLOW_SPEC_INNER;
> @@ -1641,6 +1667,67 @@ mlx5_flow_create_vxlan(const struct rte_flow_item *item,
>  }
>  
>  /**
> + * Convert GRE item to Verbs specification.
> + *
> + * @param item[in]
> + *   Item specification.
> + * @param default_mask[in]
> + *   Default bit-masks to use when item->mask is not provided.
> + * @param data[in, out]
> + *   User structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> +		     const void *default_mask __rte_unused,
> +		     struct mlx5_flow_data *data)
> +{
> +	struct mlx5_flow_parse *parser = data->parser;
> +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> +	struct ibv_flow_spec_tunnel tunnel = {
> +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> +		.size = size,
> +	};
> +	struct ibv_flow_spec_ipv4_ext *ipv4;
> +	struct ibv_flow_spec_ipv6 *ipv6;
> +	unsigned int i;
> +
> +	parser->inner = IBV_FLOW_SPEC_INNER;
> +	/* Update encapsulation IP layer protocol. */
> +	for (i = 0; i != hash_rxq_init_n; ++i) {
> +		if (!parser->queue[i].ibv_attr)
> +			continue;
> +		if (parser->out_layer == HASH_RXQ_IPV4) {
> +			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> +				parser->queue[i].offset -
> +				sizeof(struct ibv_flow_spec_ipv4_ext));
> +			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
> +				break;
> +			ipv4->val.proto = MLX5_GRE;
> +			ipv4->mask.proto = 0xff;
> +		} else if (parser->out_layer == HASH_RXQ_IPV6) {
> +			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> +				parser->queue[i].offset -
> +				sizeof(struct ibv_flow_spec_ipv6));
> +			if (ipv6->mask.next_hdr &&
> +			    ipv6->val.next_hdr != MLX5_GRE)
> +				break;
> +			ipv6->val.next_hdr = MLX5_GRE;
> +			ipv6->mask.next_hdr = 0xff;
> +		}
> +	}
> +	if (i != hash_rxq_init_n)
> +		return rte_flow_error_set(data->error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  item,
> +					  "IP protocol of GRE must be 47");
> +	mlx5_flow_create_copy(parser, &tunnel, size);
> +	return 0;
> +}

There is something strange: item is not unused, since it is at least used
in the rte_flow_error_set() call.

In the other series you are pushing, there is no new RTE_FLOW_ITEM_GRE
and in the current code there is also no RTE_FLOW_ITEM_GRE.

I don't see how this code can match the missing item, what am I missing?

> +/**
>   * Convert mark/flag action to Verbs specification.
>   *
>   * @param parser
> -- 
> 2.13.3

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread
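
As background for the protocol check discussed above: a GRE item forces the
outer IP protocol to 47, so a pattern may leave the outer protocol/next
header unspecified (it is then filled in) or set it to 47 explicitly; any
other explicit value fails with "IP protocol of GRE must be 47". A
hypothetical application-side sketch of both cases (validation call
omitted):

	#include <rte_flow.h>

	/* Accepted: outer IPv4 protocol left unspecified. */
	static const struct rte_flow_item gre_ok[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_GRE },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	/* Rejected: explicit outer protocol 17 (UDP) conflicts with GRE. */
	static const struct rte_flow_item_ipv4 ipv4_spec = {
		.hdr = { .next_proto_id = 17 },
	};
	static const struct rte_flow_item_ipv4 ipv4_mask = {
		.hdr = { .next_proto_id = 0xff },
	};
	static const struct rte_flow_item gre_bad[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4,
		  .spec = &ipv4_spec, .mask = &ipv4_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_GRE },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};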

* Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-23 12:55         ` Nélio Laranjeiro
@ 2018-04-23 13:32           ` Xueming(Steven) Li
  2018-04-23 13:46             ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-23 13:32 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev

Hi Nelio,

> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Monday, April 23, 2018 8:56 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> 
> On Mon, Apr 23, 2018 at 08:33:01PM +0800, Xueming Li wrote:
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c | 101
> > ++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 94 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 5402cb148..b365f9868 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -37,6 +37,7 @@
> >  /* Internet Protocol versions. */
> >  #define MLX5_IPV4 4
> >  #define MLX5_IPV6 6
> > +#define MLX5_GRE 47
> >
> >  #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
> >  struct ibv_flow_spec_counter_action { @@ -89,6 +90,11 @@
> > mlx5_flow_create_vxlan(const struct rte_flow_item *item,
> >  		       const void *default_mask,
> >  		       struct mlx5_flow_data *data);
> >
> > +static int
> > +mlx5_flow_create_gre(const struct rte_flow_item *item,
> > +		     const void *default_mask,
> > +		     struct mlx5_flow_data *data);
> > +
> >  struct mlx5_flow_parse;
> >
> >  static void
> > @@ -231,6 +237,10 @@ struct rte_flow {
> >  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
> >  	}
> >
> > +#define IS_TUNNEL(type) ( \
> > +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > +
> >  /** Structure to generate a simple graph of layers supported by the
> > NIC. */  struct mlx5_flow_items {
> >  	/** List of possible actions for these items. */ @@ -284,7 +294,8 @@
> > static const enum rte_flow_action_type valid_actions[] = {  static
> > const struct mlx5_flow_items mlx5_flow_items[] = {
> >  	[RTE_FLOW_ITEM_TYPE_END] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> > +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> > +			       RTE_FLOW_ITEM_TYPE_GRE),
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_ETH] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN, @@ -316,7 +327,8 @@ static
> > const struct mlx5_flow_items mlx5_flow_items[] = {
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > +			       RTE_FLOW_ITEM_TYPE_GRE),
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_ipv4){
> >  			.hdr = {
> > @@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> >  	},
> >  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > +			       RTE_FLOW_ITEM_TYPE_GRE),
> >  		.actions = valid_actions,
> >  		.mask = &(const struct rte_flow_item_ipv6){
> >  			.hdr = {
> > @@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> >  		.convert = mlx5_flow_create_tcp,
> >  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
> >  	},
> > +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > +		.actions = valid_actions,
> > +		.mask = &(const struct rte_flow_item_gre){
> > +			.protocol = -1,
> > +		},
> > +		.default_mask = &rte_flow_item_gre_mask,
> > +		.mask_sz = sizeof(struct rte_flow_item_gre),
> > +		.convert = mlx5_flow_create_gre,
> > +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > +	},
> >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> >  		.actions = valid_actions,
> > @@ -401,7 +427,7 @@ static const struct mlx5_flow_items
> > mlx5_flow_items[] = {
> >
> >  /** Structure to pass to the conversion function. */  struct
> > mlx5_flow_parse {
> > -	uint32_t inner; /**< Set once VXLAN is encountered. */
> > +	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
> >  	uint32_t create:1;
> >  	/**< Whether resources should remain after a validate. */
> >  	uint32_t drop:1; /**< Target is a drop queue. */ @@ -829,13 +855,13
> > @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> >  					      cur_item->mask_sz);
> >  		if (ret)
> >  			goto exit_item_not_supported;
> > -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> > +		if (IS_TUNNEL(items->type)) {
> >  			if (parser->inner) {
> >  				rte_flow_error_set(error, ENOTSUP,
> >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> >  						   items,
> > -						   "cannot recognize multiple"
> > -						   " VXLAN encapsulations");
> > +						   "Cannot recognize multiple"
> > +						   " tunnel encapsulations.");
> >  				return -rte_errno;
> >  			}
> >  			parser->inner = IBV_FLOW_SPEC_INNER; @@ -1641,6 +1667,67 @@
> > mlx5_flow_create_vxlan(const struct rte_flow_item *item,  }
> >
> >  /**
> > + * Convert GRE item to Verbs specification.
> > + *
> > + * @param item[in]
> > + *   Item specification.
> > + * @param default_mask[in]
> > + *   Default bit-masks to use when item->mask is not provided.
> > + * @param data[in, out]
> > + *   User structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> > +		     const void *default_mask __rte_unused,
> > +		     struct mlx5_flow_data *data)
> > +{
> > +	struct mlx5_flow_parse *parser = data->parser;
> > +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > +	struct ibv_flow_spec_tunnel tunnel = {
> > +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > +		.size = size,
> > +	};
> > +	struct ibv_flow_spec_ipv4_ext *ipv4;
> > +	struct ibv_flow_spec_ipv6 *ipv6;
> > +	unsigned int i;
> > +
> > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > +	/* Update encapsulation IP layer protocol. */
> > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > +		if (!parser->queue[i].ibv_attr)
> > +			continue;
> > +		if (parser->out_layer == HASH_RXQ_IPV4) {
> > +			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > +				parser->queue[i].offset -
> > +				sizeof(struct ibv_flow_spec_ipv4_ext));
> > +			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
> > +				break;
> > +			ipv4->val.proto = MLX5_GRE;
> > +			ipv4->mask.proto = 0xff;
> > +		} else if (parser->out_layer == HASH_RXQ_IPV6) {
> > +			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > +				parser->queue[i].offset -
> > +				sizeof(struct ibv_flow_spec_ipv6));
> > +			if (ipv6->mask.next_hdr &&
> > +			    ipv6->val.next_hdr != MLX5_GRE)
> > +				break;
> > +			ipv6->val.next_hdr = MLX5_GRE;
> > +			ipv6->mask.next_hdr = 0xff;
> > +		}
> > +	}
> > +	if (i != hash_rxq_init_n)
> > +		return rte_flow_error_set(data->error, EINVAL,
> > +					  RTE_FLOW_ERROR_TYPE_ITEM,
> > +					  item,
> > +					  "IP protocol of GRE must be 47");
> > +	mlx5_flow_create_copy(parser, &tunnel, size);
> > +	return 0;
> > +}
> 
> There is something strange: item is not unused, since it is at least used in the rte_flow_error_set() call.

This is a new issue introduced when adding the GRE protocol check.
Once you have finished reviewing this patchset, I'll upload a new version to remove it.

> 
> In the other series you are pushing, there is no new RTE_FLOW_ITEM_GRE and in the current code there
> is also no RTE_FLOW_ITEM_GRE.
> 
> I don't see how this code can match the missing item, what am I missing?

Are you looking for RTE_FLOW_ITEM_TYPE_GRE?

> 
> > +/**
> >   * Convert mark/flag action to Verbs specification.
> >   *
> >   * @param parser
> > --
> > 2.13.3
> 
> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-23 13:32           ` Xueming(Steven) Li
@ 2018-04-23 13:46             ` Nélio Laranjeiro
  2018-04-24  7:40               ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-23 13:46 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Mon, Apr 23, 2018 at 01:32:23PM +0000, Xueming(Steven) Li wrote:
> Hi Nelio,
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Monday, April 23, 2018 8:56 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> > 
> > On Mon, Apr 23, 2018 at 08:33:01PM +0800, Xueming Li wrote:
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > >  drivers/net/mlx5/mlx5_flow.c | 101
> > > ++++++++++++++++++++++++++++++++++++++++---
> > >  1 file changed, 94 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > b/drivers/net/mlx5/mlx5_flow.c index 5402cb148..b365f9868 100644
> > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > @@ -37,6 +37,7 @@
> > >  /* Internet Protocol versions. */
> > >  #define MLX5_IPV4 4
> > >  #define MLX5_IPV6 6
> > > +#define MLX5_GRE 47
> > >
> > >  #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
> > >  struct ibv_flow_spec_counter_action { @@ -89,6 +90,11 @@
> > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,
> > >  		       const void *default_mask,
> > >  		       struct mlx5_flow_data *data);
> > >
> > > +static int
> > > +mlx5_flow_create_gre(const struct rte_flow_item *item,
> > > +		     const void *default_mask,
> > > +		     struct mlx5_flow_data *data);
> > > +
> > >  struct mlx5_flow_parse;
> > >
> > >  static void
> > > @@ -231,6 +237,10 @@ struct rte_flow {
> > >  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
> > >  	}
> > >
> > > +#define IS_TUNNEL(type) ( \
> > > +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > > +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > > +
> > >  /** Structure to generate a simple graph of layers supported by the
> > > NIC. */  struct mlx5_flow_items {
> > >  	/** List of possible actions for these items. */ @@ -284,7 +294,8 @@
> > > static const enum rte_flow_action_type valid_actions[] = {  static
> > > const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  	[RTE_FLOW_ITEM_TYPE_END] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> > > +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_ETH] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN, @@ -316,7 +327,8 @@ static
> > > const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > >  		.actions = valid_actions,
> > >  		.mask = &(const struct rte_flow_item_ipv4){
> > >  			.hdr = {
> > > @@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  	},
> > >  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > >  		.actions = valid_actions,
> > >  		.mask = &(const struct rte_flow_item_ipv6){
> > >  			.hdr = {
> > > @@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > >  		.convert = mlx5_flow_create_tcp,
> > >  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
> > >  	},
> > > +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > +		.actions = valid_actions,
> > > +		.mask = &(const struct rte_flow_item_gre){
> > > +			.protocol = -1,
> > > +		},
> > > +		.default_mask = &rte_flow_item_gre_mask,
> > > +		.mask_sz = sizeof(struct rte_flow_item_gre),
> > > +		.convert = mlx5_flow_create_gre,
> > > +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > +	},
> > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > >  		.actions = valid_actions,
> > > @@ -401,7 +427,7 @@ static const struct mlx5_flow_items
> > > mlx5_flow_items[] = {
> > >
> > >  /** Structure to pass to the conversion function. */  struct
> > > mlx5_flow_parse {
> > > -	uint32_t inner; /**< Set once VXLAN is encountered. */
> > > +	uint32_t inner; /**< Verbs value, set once tunnel is encountered. */
> > >  	uint32_t create:1;
> > >  	/**< Whether resources should remain after a validate. */
> > >  	uint32_t drop:1; /**< Target is a drop queue. */ @@ -829,13 +855,13
> > > @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> > >  					      cur_item->mask_sz);
> > >  		if (ret)
> > >  			goto exit_item_not_supported;
> > > -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> > > +		if (IS_TUNNEL(items->type)) {
> > >  			if (parser->inner) {
> > >  				rte_flow_error_set(error, ENOTSUP,
> > >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> > >  						   items,
> > > -						   "cannot recognize multiple"
> > > -						   " VXLAN encapsulations");
> > > +						   "Cannot recognize multiple"
> > > +						   " tunnel encapsulations.");
> > >  				return -rte_errno;
> > >  			}
> > >  			parser->inner = IBV_FLOW_SPEC_INNER; @@ -1641,6 +1667,67 @@
> > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,  }
> > >
> > >  /**
> > > + * Convert GRE item to Verbs specification.
> > > + *
> > > + * @param item[in]
> > > + *   Item specification.
> > > + * @param default_mask[in]
> > > + *   Default bit-masks to use when item->mask is not provided.
> > > + * @param data[in, out]
> > > + *   User structure.
> > > + *
> > > + * @return
> > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > + */
> > > +static int
> > > +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> > > +		     const void *default_mask __rte_unused,
> > > +		     struct mlx5_flow_data *data)
> > > +{
> > > +	struct mlx5_flow_parse *parser = data->parser;
> > > +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > > +	struct ibv_flow_spec_tunnel tunnel = {
> > > +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > > +		.size = size,
> > > +	};
> > > +	struct ibv_flow_spec_ipv4_ext *ipv4;
> > > +	struct ibv_flow_spec_ipv6 *ipv6;
> > > +	unsigned int i;
> > > +
> > > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > > +	/* Update encapsulation IP layer protocol. */
> > > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > +		if (!parser->queue[i].ibv_attr)
> > > +			continue;
> > > +		if (parser->out_layer == HASH_RXQ_IPV4) {
> > > +			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > +				parser->queue[i].offset -
> > > +				sizeof(struct ibv_flow_spec_ipv4_ext));
> > > +			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
> > > +				break;
> > > +			ipv4->val.proto = MLX5_GRE;
> > > +			ipv4->mask.proto = 0xff;
> > > +		} else if (parser->out_layer == HASH_RXQ_IPV6) {
> > > +			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > +				parser->queue[i].offset -
> > > +				sizeof(struct ibv_flow_spec_ipv6));
> > > +			if (ipv6->mask.next_hdr &&
> > > +			    ipv6->val.next_hdr != MLX5_GRE)
> > > +				break;
> > > +			ipv6->val.next_hdr = MLX5_GRE;
> > > +			ipv6->mask.next_hdr = 0xff;
> > > +		}
> > > +	}
> > > +	if (i != hash_rxq_init_n)
> > > +		return rte_flow_error_set(data->error, EINVAL,
> > > +					  RTE_FLOW_ERROR_TYPE_ITEM,
> > > +					  item,
> > > +					  "IP protocol of GRE must be 47");
> > > +	mlx5_flow_create_copy(parser, &tunnel, size);
> > > +	return 0;
> > > +}
> > 
> > There is something strange: item is not unused, since it is at least used in the rte_flow_error_set() call.
> 
> This is a new issue introduced when adding the GRE protocol check.
> Once you have finished reviewing this patchset, I'll upload a new version to remove it.
> 
> > 
> > In the other series you are pushing, there is no new RTE_FLOW_ITEM_GRE and in the current code there
> > is also no RTE_FLOW_ITEM_GRE.
> > 
> > I don't see how this code can match the missing item, what am I missing?
> 
> Are you looking for RTE_FLOW_ITEM_TYPE_GRE?

Yes

> > 
> > > +/**
> > >   * Convert mark/flag action to Verbs specification.
> > >   *
> > >   * @param parser
> > > --
> > > 2.13.3
> > 
> > Thanks,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-23 13:46             ` Nélio Laranjeiro
@ 2018-04-24  7:40               ` Xueming(Steven) Li
  2018-04-24  8:21                 ` Nélio Laranjeiro
  0 siblings, 1 reply; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-24  7:40 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Monday, April 23, 2018 9:46 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> 
> On Mon, Apr 23, 2018 at 01:32:23PM +0000, Xueming(Steven) Li wrote:
> > Hi Nelio,
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Sent: Monday, April 23, 2018 8:56 PM
> > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> > >
> > > On Mon, Apr 23, 2018 at 08:33:01PM +0800, Xueming Li wrote:
> > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > ---
> > > >  drivers/net/mlx5/mlx5_flow.c | 101
> > > > ++++++++++++++++++++++++++++++++++++++++---
> > > >  1 file changed, 94 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > b/drivers/net/mlx5/mlx5_flow.c index 5402cb148..b365f9868 100644
> > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > @@ -37,6 +37,7 @@
> > > >  /* Internet Protocol versions. */  #define MLX5_IPV4 4  #define
> > > > MLX5_IPV6 6
> > > > +#define MLX5_GRE 47
> > > >
> > > >  #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
> > > >  struct ibv_flow_spec_counter_action { @@ -89,6 +90,11 @@
> > > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,
> > > >  		       const void *default_mask,
> > > >  		       struct mlx5_flow_data *data);
> > > >
> > > > +static int
> > > > +mlx5_flow_create_gre(const struct rte_flow_item *item,
> > > > +		     const void *default_mask,
> > > > +		     struct mlx5_flow_data *data);
> > > > +
> > > >  struct mlx5_flow_parse;
> > > >
> > > >  static void
> > > > @@ -231,6 +237,10 @@ struct rte_flow {
> > > >  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
> > > >  	}
> > > >
> > > > +#define IS_TUNNEL(type) ( \
> > > > +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > > > +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > > > +
> > > >  /** Structure to generate a simple graph of layers supported by
> > > > the NIC. */  struct mlx5_flow_items {
> > > >  	/** List of possible actions for these items. */ @@ -284,7
> > > > +294,8 @@ static const enum rte_flow_action_type valid_actions[] =
> > > > {  static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  	[RTE_FLOW_ITEM_TYPE_END] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> > > > +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_ETH] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN, @@ -316,7 +327,8 @@
> > > > static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > >  		.actions = valid_actions,
> > > >  		.mask = &(const struct rte_flow_item_ipv4){
> > > >  			.hdr = {
> > > > @@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  	},
> > > >  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > >  		.actions = valid_actions,
> > > >  		.mask = &(const struct rte_flow_item_ipv6){
> > > >  			.hdr = {
> > > > @@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > >  		.convert = mlx5_flow_create_tcp,
> > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
> > > >  	},
> > > > +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > > +		.actions = valid_actions,
> > > > +		.mask = &(const struct rte_flow_item_gre){
> > > > +			.protocol = -1,
> > > > +		},
> > > > +		.default_mask = &rte_flow_item_gre_mask,
> > > > +		.mask_sz = sizeof(struct rte_flow_item_gre),
> > > > +		.convert = mlx5_flow_create_gre,
> > > > +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > +	},
> > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > >  		.actions = valid_actions,
> > > > @@ -401,7 +427,7 @@ static const struct mlx5_flow_items
> > > > mlx5_flow_items[] = {
> > > >
> > > >  /** Structure to pass to the conversion function. */  struct
> > > > mlx5_flow_parse {
> > > > -	uint32_t inner; /**< Set once VXLAN is encountered. */
> > > > +	uint32_t inner; /**< Verbs value, set once tunnel is
> > > > +encountered. */
> > > >  	uint32_t create:1;
> > > >  	/**< Whether resources should remain after a validate. */
> > > >  	uint32_t drop:1; /**< Target is a drop queue. */ @@ -829,13
> > > > +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> > > >  					      cur_item->mask_sz);
> > > >  		if (ret)
> > > >  			goto exit_item_not_supported;
> > > > -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> > > > +		if (IS_TUNNEL(items->type)) {
> > > >  			if (parser->inner) {
> > > >  				rte_flow_error_set(error, ENOTSUP,
> > > >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > >  						   items,
> > > > -						   "cannot recognize multiple"
> > > > -						   " VXLAN encapsulations");
> > > > +						   "Cannot recognize multiple"
> > > > +						   " tunnel encapsulations.");
> > > >  				return -rte_errno;
> > > >  			}
> > > >  			parser->inner = IBV_FLOW_SPEC_INNER; @@ -1641,6 +1667,67 @@
> > > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,  }
> > > >
> > > >  /**
> > > > + * Convert GRE item to Verbs specification.
> > > > + *
> > > > + * @param item[in]
> > > > + *   Item specification.
> > > > + * @param default_mask[in]
> > > > + *   Default bit-masks to use when item->mask is not provided.
> > > > + * @param data[in, out]
> > > > + *   User structure.
> > > > + *
> > > > + * @return
> > > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > > + */
> > > > +static int
> > > > +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> > > > +		     const void *default_mask __rte_unused,
> > > > +		     struct mlx5_flow_data *data) {
> > > > +	struct mlx5_flow_parse *parser = data->parser;
> > > > +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > > > +	struct ibv_flow_spec_tunnel tunnel = {
> > > > +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > > > +		.size = size,
> > > > +	};
> > > > +	struct ibv_flow_spec_ipv4_ext *ipv4;
> > > > +	struct ibv_flow_spec_ipv6 *ipv6;
> > > > +	unsigned int i;
> > > > +
> > > > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > > > +	/* Update encapsulation IP layer protocol. */
> > > > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > > +		if (!parser->queue[i].ibv_attr)
> > > > +			continue;
> > > > +		if (parser->out_layer == HASH_RXQ_IPV4) {
> > > > +			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > > +				parser->queue[i].offset -
> > > > +				sizeof(struct ibv_flow_spec_ipv4_ext));
> > > > +			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
> > > > +				break;
> > > > +			ipv4->val.proto = MLX5_GRE;
> > > > +			ipv4->mask.proto = 0xff;
> > > > +		} else if (parser->out_layer == HASH_RXQ_IPV6) {
> > > > +			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > > +				parser->queue[i].offset -
> > > > +				sizeof(struct ibv_flow_spec_ipv6));
> > > > +			if (ipv6->mask.next_hdr &&
> > > > +			    ipv6->val.next_hdr != MLX5_GRE)
> > > > +				break;
> > > > +			ipv6->val.next_hdr = MLX5_GRE;
> > > > +			ipv6->mask.next_hdr = 0xff;
> > > > +		}
> > > > +	}
> > > > +	if (i != hash_rxq_init_n)
> > > > +		return rte_flow_error_set(data->error, EINVAL,
> > > > +					  RTE_FLOW_ERROR_TYPE_ITEM,
> > > > +					  item,
> > > > +					  "IP protocol of GRE must be 47");
> > > > +	mlx5_flow_create_copy(parser, &tunnel, size);
> > > > +	return 0;
> > > > +}
> > >
> > > There is something strange: item is not unused, since it is at least used in the rte_flow_error_set() call.
> >
> > This is a new issue introduced when adding the GRE protocol check.
> > Once you have finished reviewing this patchset, I'll upload a new version to remove it.
> >
> > >
> > > In the other series you are pushing, there is no new
> > > RTE_FLOW_ITEM_GRE and in the current code there is also no RTE_FLOW_ITEM_GRE.
> > >
> > > I don't see how this code can match the missing item, what am I missing?
> >
> > Are you looking for RTE_FLOW_ITEM_TYPE_GRE?
> 
> Yes

RTE_FLOW_ITEM_TYPE_GRE has been defined in rte_flow.h, please check.

> 
> > >
> > > > +/**
> > > >   * Convert mark/flag action to Verbs specification.
> > > >   *
> > > >   * @param parser
> > > > --
> > > > 2.13.3
> > >
> > > Thanks,
> > >
> > > --
> > > Nélio Laranjeiro
> > > 6WIND
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
  2018-04-24  7:40               ` Xueming(Steven) Li
@ 2018-04-24  8:21                 ` Nélio Laranjeiro
  0 siblings, 0 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-24  8:21 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Tue, Apr 24, 2018 at 07:40:24AM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Sent: Monday, April 23, 2018 9:46 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> > 
> > On Mon, Apr 23, 2018 at 01:32:23PM +0000, Xueming(Steven) Li wrote:
> > > Hi Nelio,
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Sent: Monday, April 23, 2018 8:56 PM
> > > > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [PATCH v6 02/11] net/mlx5: support GRE tunnel flow
> > > >
> > > > On Mon, Apr 23, 2018 at 08:33:01PM +0800, Xueming Li wrote:
> > > > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > > > ---
> > > > >  drivers/net/mlx5/mlx5_flow.c | 101
> > > > > ++++++++++++++++++++++++++++++++++++++++---
> > > > >  1 file changed, 94 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > > > b/drivers/net/mlx5/mlx5_flow.c index 5402cb148..b365f9868 100644
> > > > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > > > @@ -37,6 +37,7 @@
> > > > >  /* Internet Protocol versions. */  #define MLX5_IPV4 4  #define
> > > > > MLX5_IPV6 6
> > > > > +#define MLX5_GRE 47
> > > > >
> > > > >  #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
> > > > >  struct ibv_flow_spec_counter_action { @@ -89,6 +90,11 @@
> > > > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,
> > > > >  		       const void *default_mask,
> > > > >  		       struct mlx5_flow_data *data);
> > > > >
> > > > > +static int
> > > > > +mlx5_flow_create_gre(const struct rte_flow_item *item,
> > > > > +		     const void *default_mask,
> > > > > +		     struct mlx5_flow_data *data);
> > > > > +
> > > > >  struct mlx5_flow_parse;
> > > > >
> > > > >  static void
> > > > > @@ -231,6 +237,10 @@ struct rte_flow {
> > > > >  		__VA_ARGS__, RTE_FLOW_ITEM_TYPE_END, \
> > > > >  	}
> > > > >
> > > > > +#define IS_TUNNEL(type) ( \
> > > > > +	(type) == RTE_FLOW_ITEM_TYPE_VXLAN || \
> > > > > +	(type) == RTE_FLOW_ITEM_TYPE_GRE)
> > > > > +
> > > > >  /** Structure to generate a simple graph of layers supported by
> > > > > the NIC. */  struct mlx5_flow_items {
> > > > >  	/** List of possible actions for these items. */ @@ -284,7
> > > > > +294,8 @@ static const enum rte_flow_action_type valid_actions[] =
> > > > > {  static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  	[RTE_FLOW_ITEM_TYPE_END] = {
> > > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > -			       RTE_FLOW_ITEM_TYPE_VXLAN),
> > > > > +			       RTE_FLOW_ITEM_TYPE_VXLAN,
> > > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > > >  	},
> > > > >  	[RTE_FLOW_ITEM_TYPE_ETH] = {
> > > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_VLAN, @@ -316,7 +327,8 @@
> > > > > static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  	},
> > > > >  	[RTE_FLOW_ITEM_TYPE_IPV4] = {
> > > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > > >  		.actions = valid_actions,
> > > > >  		.mask = &(const struct rte_flow_item_ipv4){
> > > > >  			.hdr = {
> > > > > @@ -333,7 +345,8 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  	},
> > > > >  	[RTE_FLOW_ITEM_TYPE_IPV6] = {
> > > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_UDP,
> > > > > -			       RTE_FLOW_ITEM_TYPE_TCP),
> > > > > +			       RTE_FLOW_ITEM_TYPE_TCP,
> > > > > +			       RTE_FLOW_ITEM_TYPE_GRE),
> > > > >  		.actions = valid_actions,
> > > > >  		.mask = &(const struct rte_flow_item_ipv6){
> > > > >  			.hdr = {
> > > > > @@ -386,6 +399,19 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
> > > > >  		.convert = mlx5_flow_create_tcp,
> > > > >  		.dst_sz = sizeof(struct ibv_flow_spec_tcp_udp),
> > > > >  	},
> > > > > +	[RTE_FLOW_ITEM_TYPE_GRE] = {
> > > > > +		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH,
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV4,
> > > > > +			       RTE_FLOW_ITEM_TYPE_IPV6),
> > > > > +		.actions = valid_actions,
> > > > > +		.mask = &(const struct rte_flow_item_gre){
> > > > > +			.protocol = -1,
> > > > > +		},
> > > > > +		.default_mask = &rte_flow_item_gre_mask,
> > > > > +		.mask_sz = sizeof(struct rte_flow_item_gre),
> > > > > +		.convert = mlx5_flow_create_gre,
> > > > > +		.dst_sz = sizeof(struct ibv_flow_spec_tunnel),
> > > > > +	},
> > > > >  	[RTE_FLOW_ITEM_TYPE_VXLAN] = {
> > > > >  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_ETH),
> > > > >  		.actions = valid_actions,
> > > > > @@ -401,7 +427,7 @@ static const struct mlx5_flow_items
> > > > > mlx5_flow_items[] = {
> > > > >
> > > > >  /** Structure to pass to the conversion function. */  struct
> > > > > mlx5_flow_parse {
> > > > > -	uint32_t inner; /**< Set once VXLAN is encountered. */
> > > > > +	uint32_t inner; /**< Verbs value, set once tunnel is
> > > > > +encountered. */
> > > > >  	uint32_t create:1;
> > > > >  	/**< Whether resources should remain after a validate. */
> > > > >  	uint32_t drop:1; /**< Target is a drop queue. */ @@ -829,13
> > > > > +855,13 @@ mlx5_flow_convert_items_validate(const struct rte_flow_item items[],
> > > > >  					      cur_item->mask_sz);
> > > > >  		if (ret)
> > > > >  			goto exit_item_not_supported;
> > > > > -		if (items->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
> > > > > +		if (IS_TUNNEL(items->type)) {
> > > > >  			if (parser->inner) {
> > > > >  				rte_flow_error_set(error, ENOTSUP,
> > > > >  						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > > >  						   items,
> > > > > -						   "cannot recognize multiple"
> > > > > -						   " VXLAN encapsulations");
> > > > > +						   "Cannot recognize multiple"
> > > > > +						   " tunnel encapsulations.");
> > > > >  				return -rte_errno;
> > > > >  			}
> > > > >  			parser->inner = IBV_FLOW_SPEC_INNER; @@ -1641,6 +1667,67 @@
> > > > > mlx5_flow_create_vxlan(const struct rte_flow_item *item,  }
> > > > >
> > > > >  /**
> > > > > + * Convert GRE item to Verbs specification.
> > > > > + *
> > > > > + * @param item[in]
> > > > > + *   Item specification.
> > > > > + * @param default_mask[in]
> > > > > + *   Default bit-masks to use when item->mask is not provided.
> > > > > + * @param data[in, out]
> > > > > + *   User structure.
> > > > > + *
> > > > > + * @return
> > > > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > > > + */
> > > > > +static int
> > > > > +mlx5_flow_create_gre(const struct rte_flow_item *item __rte_unused,
> > > > > +		     const void *default_mask __rte_unused,
> > > > > +		     struct mlx5_flow_data *data) {
> > > > > +	struct mlx5_flow_parse *parser = data->parser;
> > > > > +	unsigned int size = sizeof(struct ibv_flow_spec_tunnel);
> > > > > +	struct ibv_flow_spec_tunnel tunnel = {
> > > > > +		.type = parser->inner | IBV_FLOW_SPEC_VXLAN_TUNNEL,
> > > > > +		.size = size,
> > > > > +	};
> > > > > +	struct ibv_flow_spec_ipv4_ext *ipv4;
> > > > > +	struct ibv_flow_spec_ipv6 *ipv6;
> > > > > +	unsigned int i;
> > > > > +
> > > > > +	parser->inner = IBV_FLOW_SPEC_INNER;
> > > > > +	/* Update encapsulation IP layer protocol. */
> > > > > +	for (i = 0; i != hash_rxq_init_n; ++i) {
> > > > > +		if (!parser->queue[i].ibv_attr)
> > > > > +			continue;
> > > > > +		if (parser->out_layer == HASH_RXQ_IPV4) {
> > > > > +			ipv4 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > > > +				parser->queue[i].offset -
> > > > > +				sizeof(struct ibv_flow_spec_ipv4_ext));
> > > > > +			if (ipv4->mask.proto && ipv4->val.proto != MLX5_GRE)
> > > > > +				break;
> > > > > +			ipv4->val.proto = MLX5_GRE;
> > > > > +			ipv4->mask.proto = 0xff;
> > > > > +		} else if (parser->out_layer == HASH_RXQ_IPV6) {
> > > > > +			ipv6 = (void *)((uintptr_t)parser->queue[i].ibv_attr +
> > > > > +				parser->queue[i].offset -
> > > > > +				sizeof(struct ibv_flow_spec_ipv6));
> > > > > +			if (ipv6->mask.next_hdr &&
> > > > > +			    ipv6->val.next_hdr != MLX5_GRE)
> > > > > +				break;
> > > > > +			ipv6->val.next_hdr = MLX5_GRE;
> > > > > +			ipv6->mask.next_hdr = 0xff;
> > > > > +		}
> > > > > +	}
> > > > > +	if (i != hash_rxq_init_n)
> > > > > +		return rte_flow_error_set(data->error, EINVAL,
> > > > > +					  RTE_FLOW_ERROR_TYPE_ITEM,
> > > > > +					  item,
> > > > > +					  "IP protocol of GRE must be 47");
> > > > > +	mlx5_flow_create_copy(parser, &tunnel, size);
> > > > > +	return 0;
> > > > > +}
> > > >
> > > > There is something strange: item is not unused, since it is at least used in the rte_flow_error_set() call.
> > >
> > > This is a new issue introduced when adding the GRE protocol check.
> > > Once you have finished reviewing this patchset, I'll upload a new version to remove it.
> > >
> > > >
> > > > In the other series you are pushing, there is no new
> > > > RTE_FLOW_ITEM_GRE and in the current code there is also no RTE_FLOW_ITEM_GRE.
> > > >
> > > > I don't see how this code can match the missing item, what am I missing?
> > >
> > > Are you looking for RTE_FLOW_ITEM_TYPE_GRE?
> > 
> > Yes
> 
> RTE_FLOW_ITEM_TYPE_GRE has been defined in rte_flow.h, please check.

Ok I've just missed it.

> 
> > 
> > > >
> > > > > +/**
> > > > >   * Convert mark/flag action to Verbs specification.
> > > > >   *
> > > > >   * @param parser
> > > > > --
> > > > > 2.13.3
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Nélio Laranjeiro
> > > > 6WIND
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 00/11] mlx5 Rx tunnel offloading
  2018-04-23 12:32       ` [PATCH v6 " Xueming Li
@ 2018-04-24  8:24         ` Nélio Laranjeiro
  2018-04-24  8:25           ` Xueming(Steven) Li
  2018-04-26  6:23           ` Shahaf Shuler
  0 siblings, 2 replies; 115+ messages in thread
From: Nélio Laranjeiro @ 2018-04-24  8:24 UTC (permalink / raw)
  To: Xueming Li; +Cc: Shahaf Shuler, dev

On Mon, Apr 23, 2018 at 08:32:59PM +0800, Xueming Li wrote:
> 
> Important note:
>         please note that this patchset relies on Adrien's patchset of flow API
>         overhaul: http://www.dpdk.org/dev/patchwork/patch/38508/
> v6:
> - Fixed commit log of tunnel type identification
> v5:
> - Removed %lx prints
> - Per review request, clear mbuf tunnel type in case of multiple tunnel types.
> - Rebase on Adrien's flow API overhaul patchset
> - Split feature requirement document into patches of L3 VXLAN and VXLAN-GPE
> - Per review request, add device parameter to enable L3 VXLAN and VXLAN-GPE
> v4:
> - Fix RSS level according to value definition
> - Add "Inner RSS" column to NIC feature doc
> - Fixed flow creation error in case of ipv4 rss on ipv6 pattern
> - new patch: enforce IP protocol of GRE to be 47.
> - Removed MPLS-in-UDP and MPLS-in-GRE related patchset
> - Removed invalid RSS type check
> v3:
> - Refactor 16 Verbs priority detection.
> - Other updates according to ML discussion.
> v2:
> - Split into 2 series: public api and mlx5, this one is the second.
> - Rebased on Adrien's rte flow overhaul:
>   http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
> v1:
> - Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
> - Remove deprecation notes of rss level
> 
> This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
> - Standard VXLAN
> - L3 VXLAN (no inner ethernet header)
> - VXLAN-GPE
> 
> Xueming Li (11):
>   net/mlx5: support 16 hardware priorities
>   net/mlx5: support GRE tunnel flow
>   net/mlx5: support L3 VXLAN flow
>   net/mlx5: support Rx tunnel type identification
>   net/mlx5: cleanup tunnel checksum offloads
>   net/mlx5: split flow RSS handling logic
>   net/mlx5: support tunnel RSS level
>   net/mlx5: add hardware flow debug dump
>   net/mlx5: introduce VXLAN-GPE tunnel type
>   net/mlx5: allow flow tunnel ID 0 with outer pattern
>   doc: update mlx5 guide on tunnel offloading
> 
>  doc/guides/nics/features/default.ini  |   1 +
>  doc/guides/nics/features/mlx5.ini     |   3 +
>  doc/guides/nics/mlx5.rst              |  30 +-
>  drivers/net/mlx5/Makefile             |   2 +-
>  drivers/net/mlx5/mlx5.c               |  24 +
>  drivers/net/mlx5/mlx5.h               |   6 +
>  drivers/net/mlx5/mlx5_flow.c          | 844 +++++++++++++++++++++++++++-------
>  drivers/net/mlx5/mlx5_glue.c          |  16 +
>  drivers/net/mlx5/mlx5_glue.h          |   8 +
>  drivers/net/mlx5/mlx5_rxq.c           |  89 +++-
>  drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
>  drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
>  drivers/net/mlx5/mlx5_trigger.c       |   8 -
>  drivers/net/mlx5/mlx5_utils.h         |   6 +
>  16 files changed, 896 insertions(+), 223 deletions(-)
> 
> -- 
> 2.13.3
> 

I think we have caught almost all the issues; if something remains, fixes
can be added.

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 00/11] mlx5 Rx tunnel offloading
  2018-04-24  8:24         ` Nélio Laranjeiro
@ 2018-04-24  8:25           ` Xueming(Steven) Li
  2018-04-26  6:23           ` Shahaf Shuler
  1 sibling, 0 replies; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-24  8:25 UTC (permalink / raw)
  To: Nélio Laranjeiro; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>
> Sent: Tuesday, April 24, 2018 4:25 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH v6 00/11] mlx5 Rx tunnel offloading
> 
> On Mon, Apr 23, 2018 at 08:32:59PM +0800, Xueming Li wrote:
> >
> > Important note:
> >         please note that this patchset relies on Adrien's patchset of flow API
> >         overhaul:
> > http://www.dpdk.org/dev/patchwork/patch/38508/
> > v6:
> > - Fixed commit log of tunnel type identification
> > v5:
> > - Removed %lx prints
> > - Per review request, clear mbuf tunnel type in case of multiple tunnel types.
> > - Rebase on Adrien's flow API overhaul patchset
> > - Split feature requirement document into patches of L3 VXLAN and
> > VXLAN-GPE
> > - Per review request, add device parameter to enable L3 VXLAN and
> > VXLAN-GPE
> > v4:
> > - Fix RSS level according to value definition
> > - Add "Inner RSS" column to NIC feature doc
> > - Fixed flow creation error in case of ipv4 rss on ipv6 pattern
> > - new patch: enforce IP protocol of GRE to be 47.
> > - Removed MPLS-in-UDP and MPLS-in-GRE related patchset
> > - Removed invalid RSS type check
> > v3:
> > - Refactor 16 Verbs priority detection.
> > - Other updates according to ML discussion.
> > v2:
> > - Split into 2 series: public api and mlx5, this one is the second.
> > - Rebased on Adrien's rte flow overhaul:
> >
> >   http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
> > v1:
> > - Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
> > - Remove deprecation notes of rss level
> >
> > This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
> > - Standard VXLAN
> > - L3 VXLAN (no inner ethernet header)
> > - VXLAN-GPE
> >
> > Xueming Li (11):
> >   net/mlx5: support 16 hardware priorities
> >   net/mlx5: support GRE tunnel flow
> >   net/mlx5: support L3 VXLAN flow
> >   net/mlx5: support Rx tunnel type identification
> >   net/mlx5: cleanup tunnel checksum offloads
> >   net/mlx5: split flow RSS handling logic
> >   net/mlx5: support tunnel RSS level
> >   net/mlx5: add hardware flow debug dump
> >   net/mlx5: introduce VXLAN-GPE tunnel type
> >   net/mlx5: allow flow tunnel ID 0 with outer pattern
> >   doc: update mlx5 guide on tunnel offloading
> >
> >  doc/guides/nics/features/default.ini  |   1 +
> >  doc/guides/nics/features/mlx5.ini     |   3 +
> >  doc/guides/nics/mlx5.rst              |  30 +-
> >  drivers/net/mlx5/Makefile             |   2 +-
> >  drivers/net/mlx5/mlx5.c               |  24 +
> >  drivers/net/mlx5/mlx5.h               |   6 +
> >  drivers/net/mlx5/mlx5_flow.c          | 844 +++++++++++++++++++++++++++-------
> >  drivers/net/mlx5/mlx5_glue.c          |  16 +
> >  drivers/net/mlx5/mlx5_glue.h          |   8 +
> >  drivers/net/mlx5/mlx5_rxq.c           |  89 +++-
> >  drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
> >  drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
> > drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
> >  drivers/net/mlx5/mlx5_trigger.c       |   8 -
> >  drivers/net/mlx5/mlx5_utils.h         |   6 +
> >  16 files changed, 896 insertions(+), 223 deletions(-)
> >
> > --
> > 2.13.3
> >
> 
> I think we have caught almost all the issues; if something remains, fixes
> can be added.

Many thanks for the long patch set review 😊

> 
> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 00/11] mlx5 Rx tunnel offloading
  2018-04-24  8:24         ` Nélio Laranjeiro
  2018-04-24  8:25           ` Xueming(Steven) Li
@ 2018-04-26  6:23           ` Shahaf Shuler
  1 sibling, 0 replies; 115+ messages in thread
From: Shahaf Shuler @ 2018-04-26  6:23 UTC (permalink / raw)
  To: Nélio Laranjeiro, Xueming(Steven) Li; +Cc: dev

Tuesday, April 24, 2018 11:25 AM, Nélio Laranjeiro:
> Subject: Re: [PATCH v6 00/11] mlx5 Rx tunnel offloading
> 
> On Mon, Apr 23, 2018 at 08:32:59PM +0800, Xueming Li wrote:
> >
> > Important note:
> >         please note that this patchset relies on Adrien's patchset of flow API
> >         overhaul:
> >
> > http://www.dpdk.org/dev/patchwork/patch/38508/
> > v6:
> > - Fixed commit log of tunnel type identification
> > v5:
> > - Removed %lx prints
> > - Per review request, clear mbuf tunnel type in case of multiple tunnel
> types.
> > - Rebase on Adrien's flow API overhaul patchset
> > - Split feature requirement document into patches of L3 VXLAN and
> > VXLAN-GPE
> > - Per review request, add device parameter to enable L3 VXLAN and
> > VXLAN-GPE
> > v4:
> > - Fix RSS level according to value definition
> > - Add "Inner RSS" column to NIC feature doc
> > - Fixed flow creation error in case of ipv4 rss on ipv6 pattern
> > - new patch: enforce IP protocol of GRE to be 47.
> > - Removed MPLS-in-UDP and MPLS-in-GRE related patchset
> > - Removed invalid RSS type check
> > v3:
> > - Refactor 16 Verbs priority detection.
> > - Other updates according to ML discussion.
> > v2:
> > - Split into 2 series: public api and mlx5, this one is the second.
> > - Rebased on Adrien's rte flow overhaul:
> >
> >
> >   http://www.dpdk.org/ml/archives/dev/2018-April/095774.html
> > v1:
> > - Support new tunnel type MPLS-in-GRE and MPLS-in-UDP
> > - Remove deprecation notes of rss level
> >
> > This patchset supports MLX5 Rx tunnel checksum, inner RSS, and inner ptype offloading for the following tunnel types:
> > - Standard VXLAN
> > - L3 VXLAN (no inner ethernet header)
> > - VXLAN-GPE
> >
> > Xueming Li (11):
> >   net/mlx5: support 16 hardware priorities
> >   net/mlx5: support GRE tunnel flow
> >   net/mlx5: support L3 VXLAN flow
> >   net/mlx5: support Rx tunnel type identification
> >   net/mlx5: cleanup tunnel checksum offloads
> >   net/mlx5: split flow RSS handling logic
> >   net/mlx5: support tunnel RSS level
> >   net/mlx5: add hardware flow debug dump
> >   net/mlx5: introduce VXLAN-GPE tunnel type
> >   net/mlx5: allow flow tunnel ID 0 with outer pattern
> >   doc: update mlx5 guide on tunnel offloading
> >
> >  doc/guides/nics/features/default.ini  |   1 +
> >  doc/guides/nics/features/mlx5.ini     |   3 +
> >  doc/guides/nics/mlx5.rst              |  30 +-
> >  drivers/net/mlx5/Makefile             |   2 +-
> >  drivers/net/mlx5/mlx5.c               |  24 +
> >  drivers/net/mlx5/mlx5.h               |   6 +
> >  drivers/net/mlx5/mlx5_flow.c          | 844 +++++++++++++++++++++++++++-------
> >  drivers/net/mlx5/mlx5_glue.c          |  16 +
> >  drivers/net/mlx5/mlx5_glue.h          |   8 +
> >  drivers/net/mlx5/mlx5_rxq.c           |  89 +++-
> >  drivers/net/mlx5/mlx5_rxtx.c          |  33 +-
> >  drivers/net/mlx5/mlx5_rxtx.h          |  11 +-
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  21 +-
> >  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  17 +-
> >  drivers/net/mlx5/mlx5_trigger.c       |   8 -
> >  drivers/net/mlx5/mlx5_utils.h         |   6 +
> >  16 files changed, 896 insertions(+), 223 deletions(-)
> >
> > --
> > 2.13.3
> >
> 
> I think we have caught almost all issues; if something remains, fixes can
> be added.
> 
> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

Series applied to next-net-mlx, thanks. 

> 
> --
> Nélio Laranjeiro
> 6WIND

^ permalink raw reply	[flat|nested] 115+ messages in thread
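
For reference, a minimal sketch of how an application could request the inner
RSS this series enables, written against the 18.05 rte_flow API the cover
letter says it is rebased on. The VXLAN pattern, RSS types, and helper name
are illustrative, not taken from the series:

    #include <rte_ethdev.h>
    #include <rte_flow.h>

    /* Minimal sketch: steer VXLAN traffic and hash on the inner headers. */
    static struct rte_flow *
    inner_rss_flow(uint16_t port_id, const uint16_t *queues, uint32_t n_queues)
    {
            struct rte_flow_attr attr = { .ingress = 1 };
            struct rte_flow_item pattern[] = {
                    { .type = RTE_FLOW_ITEM_TYPE_ETH },
                    { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
                    { .type = RTE_FLOW_ITEM_TYPE_UDP },
                    { .type = RTE_FLOW_ITEM_TYPE_VXLAN },
                    { .type = RTE_FLOW_ITEM_TYPE_END },
            };
            struct rte_flow_action_rss rss = {
                    .level = 2,             /* 2+ selects inner encapsulation levels */
                    .types = ETH_RSS_IP | ETH_RSS_UDP,
                    .queue_num = n_queues,
                    .queue = queues,
            };
            struct rte_flow_action actions[] = {
                    { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
                    { .type = RTE_FLOW_ACTION_TYPE_END },
            };
            struct rte_flow_error error;

            return rte_flow_create(port_id, &attr, pattern, actions, &error);
    }

In this API, level 0 leaves the encapsulation level to the PMD default,
level 1 hashes on the outermost headers, and 2 and above address inner
levels, which is the "RSS level" the series maps onto the mlx5 hardware.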

* Re: [PATCH v6 08/11] net/mlx5: add hardware flow debug dump
  2018-04-23 12:33       ` [PATCH v6 08/11] net/mlx5: add hardware flow debug dump Xueming Li
@ 2018-04-26 10:09         ` Ferruh Yigit
  2018-04-26 10:48           ` Shahaf Shuler
  0 siblings, 1 reply; 115+ messages in thread
From: Ferruh Yigit @ 2018-04-26 10:09 UTC (permalink / raw)
  To: Xueming Li, Nelio Laranjeiro, Shahaf Shuler; +Cc: dev

On 4/23/2018 1:33 PM, Xueming Li wrote:
> Dump verbs flow details, including flow spec type and size, for
> debugging purposes.

This patch is causing build errors [1]; please test the build with debug enabled.

Also, the set is already in next-net-mlx; the fixed version needs to be updated there.

Thanks,
ferruh

[1]
...dpdk/drivers/net/mlx5/mlx5_rxq.c:1460:29: error: format specifies type
'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int')
[-Werror,-Wformat]
              hash_fields, tunnel, rss_level,
                                   ^~~~~~~~~

...dpdk/drivers/net/mlx5/mlx5_rxq.c:1599:23: error: format specifies type
'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int')
[-Werror,-Wformat]
                      hrxq->tunnel, hrxq->rss_level);
                                    ^~~~~~~~~~~~~~~
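
Both warnings come from handing a uint32_t to a conversion that expects
unsigned char. A minimal sketch of the portable fix, using the <inttypes.h>
format macros; the log macro and variable types are assumed from the error
output above rather than copied from the tree:

    #include <inttypes.h>

    /* tunnel and rss_level are uint32_t, so format them with the
     * PRIx32/PRIu32 macros instead of a narrower conversion. */
    DRV_LOG(DEBUG, "tunnel 0x%" PRIx32 " level %" PRIu32,
            tunnel, rss_level);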

> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
>  drivers/net/mlx5/mlx5_rxq.c   | 26 ++++++++++++++---
>  drivers/net/mlx5/mlx5_utils.h |  6 ++++
>  3 files changed, 86 insertions(+), 14 deletions(-)

<...>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 08/11] net/mlx5: add hardware flow debug dump
  2018-04-26 10:09         ` Ferruh Yigit
@ 2018-04-26 10:48           ` Shahaf Shuler
  0 siblings, 0 replies; 115+ messages in thread
From: Shahaf Shuler @ 2018-04-26 10:48 UTC (permalink / raw)
  To: Ferruh Yigit, Xueming(Steven) Li, Nélio Laranjeiro; +Cc: dev

Thursday, April 26, 2018 1:10 PM, Ferruh Yigit:
> Subject: Re: [dpdk-dev] [PATCH v6 08/11] net/mlx5: add hardware flow
> debug dump
> 
> On 4/23/2018 1:33 PM, Xueming Li wrote:
> > Dump verbs flow details, including flow spec type and size, for
> > debugging purposes.
> 
> This patch is causing build errors [1]; please test the build with debug enabled.
> 
> Also, the set is already in next-net-mlx; the fixed version needs to be updated there.
> 
> Thanks,
> ferruh
> 
> [1]
> ...dpdk/drivers/net/mlx5/mlx5_rxq.c:1460:29: error: format specifies type
> 'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int') [-
> Werror,-Wformat]
>               hash_fields, tunnel, rss_level,
>                                    ^~~~~~~~~
> 
> ...dpdk/drivers/net/mlx5/mlx5_rxq.c:1599:23: error: format specifies type
> 'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int') [-
> Werror,-Wformat]
>                       hrxq->tunnel, hrxq->rss_level);
>                                     ^~~~~~~~~~~~~~~

Fixed locally on next-net-mlx, no need to send patch. 

> 
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_flow.c  | 68 ++++++++++++++++++++++++++++++++++++-------
> >  drivers/net/mlx5/mlx5_rxq.c   | 26 ++++++++++++++---
> >  drivers/net/mlx5/mlx5_utils.h |  6 ++++
> >  3 files changed, 86 insertions(+), 14 deletions(-)
> 
> <...>

^ permalink raw reply	[flat|nested] 115+ messages in thread
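
For context on what the dump in this patch prints: Verbs flow specs are laid
out back to back in memory after struct ibv_flow_attr, each beginning with
its type and size. A hedged sketch of such a walk, where spec_header is an
illustrative stand-in for the PMD's internal equivalent:

    #include <stdio.h>
    #include <infiniband/verbs.h>

    /* Assumed common prefix of every Verbs flow spec. */
    struct spec_header {
            enum ibv_flow_spec_type type;
            uint16_t size;
    };

    static void
    dump_flow_specs(const struct ibv_flow_attr *attr)
    {
            const uint8_t *p = (const uint8_t *)(attr + 1); /* specs follow attr */
            unsigned int i;

            for (i = 0; i != attr->num_of_specs; ++i) {
                    const struct spec_header *h = (const void *)p;

                    printf("spec[%u]: type %d size %u\n",
                           i, (int)h->type, (unsigned int)h->size);
                    p += h->size;
            }
    }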

* Re: [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-23 12:33       ` [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
@ 2018-04-26 11:00         ` Ferruh Yigit
  2018-04-26 14:03           ` Xueming(Steven) Li
  0 siblings, 1 reply; 115+ messages in thread
From: Ferruh Yigit @ 2018-04-26 11:00 UTC (permalink / raw)
  To: Xueming Li, Nelio Laranjeiro, Shahaf Shuler; +Cc: dev

On 4/23/2018 1:33 PM, Xueming Li wrote:
> Remove tunnel limitations, add new hardware tunnel offload features.
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  doc/guides/nics/features/default.ini | 1 +
>  doc/guides/nics/features/mlx5.ini    | 3 +++
>  doc/guides/nics/mlx5.rst             | 4 ++--
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index dae2ad776..49be81450 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -29,6 +29,7 @@ Multicast MAC filter =
>  RSS hash             =
>  RSS key update       =
>  RSS reta update      =
> +Inner RSS            =
>  VMDq                 =
>  SR-IOV               =
>  DCB                  =

When a new feature is added, it needs to be documented in doc/guides/nics/features.rst.

To not block this set, can you please send an incremental patch for this?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading
  2018-04-26 11:00         ` Ferruh Yigit
@ 2018-04-26 14:03           ` Xueming(Steven) Li
  0 siblings, 0 replies; 115+ messages in thread
From: Xueming(Steven) Li @ 2018-04-26 14:03 UTC (permalink / raw)
  To: Ferruh Yigit, Nélio Laranjeiro, Shahaf Shuler; +Cc: dev

Hi Ferruh,

Thanks for the reminder, new patch sent:
http://www.dpdk.org/dev/patchwork/patch/39026/

Best Regards,
Xueming

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Thursday, April 26, 2018 7:00 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>; Nélio Laranjeiro <nelio.laranjeiro@6wind.com>; Shahaf
> Shuler <shahafs@mellanox.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading
> 
> On 4/23/2018 1:33 PM, Xueming Li wrote:
> > Remove tunnel limitations, add new hardware tunnel offload features.
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> >  doc/guides/nics/features/default.ini | 1 +
> >  doc/guides/nics/features/mlx5.ini    | 3 +++
> >  doc/guides/nics/mlx5.rst             | 4 ++--
> >  3 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> > index dae2ad776..49be81450 100644
> > --- a/doc/guides/nics/features/default.ini
> > +++ b/doc/guides/nics/features/default.ini
> > @@ -29,6 +29,7 @@ Multicast MAC filter =
> >  RSS hash             =
> >  RSS key update       =
> >  RSS reta update      =
> > +Inner RSS            =
> >  VMDq                 =
> >  SR-IOV               =
> >  DCB                  =
> 
> When a new feature is added, it needs to be documented in doc/guides/nics/features.rst.
> 
> To not block this set, can you please send an incremental patch for this?

^ permalink raw reply	[flat|nested] 115+ messages in thread
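
The incremental patch requested above adds a matching entry to
doc/guides/nics/features.rst. An entry there conventionally looks like the
sketch below; the wording is illustrative, not the exact patch:

    .. _nic_features_inner_rss:

    Inner RSS
    ---------

    Supports RX RSS hashing on inner headers.

    * **[uses]     rte_flow action**: ``rss``.
    * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_RSS_HASH``, ``mbuf.hash.rss``.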

end of thread, other threads:[~2018-04-26 14:03 UTC | newest]

Thread overview: 115+ messages
     [not found] <20180410133415.189905-1-xuemingl%40mellanox.com>
2018-04-13 11:20 ` [PATCH v3 00/14] mlx5 Rx tunnel offloading Xueming Li
2018-04-17 15:14   ` [PATCH v4 00/11] " Xueming Li
2018-04-20 12:23     ` [PATCH v5 " Xueming Li
2018-04-23 12:32       ` [PATCH v6 " Xueming Li
2018-04-24  8:24         ` Nélio Laranjeiro
2018-04-24  8:25           ` Xueming(Steven) Li
2018-04-26  6:23           ` Shahaf Shuler
2018-04-23 12:33       ` [PATCH v6 01/11] net/mlx5: support 16 hardware priorities Xueming Li
2018-04-23 12:33       ` [PATCH v6 02/11] net/mlx5: support GRE tunnel flow Xueming Li
2018-04-23 12:55         ` Nélio Laranjeiro
2018-04-23 13:32           ` Xueming(Steven) Li
2018-04-23 13:46             ` Nélio Laranjeiro
2018-04-24  7:40               ` Xueming(Steven) Li
2018-04-24  8:21                 ` Nélio Laranjeiro
2018-04-23 12:33       ` [PATCH v6 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
2018-04-23 12:33       ` [PATCH v6 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
2018-04-23 12:33       ` [PATCH v6 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
2018-04-23 12:33       ` [PATCH v6 06/11] net/mlx5: split flow RSS handling logic Xueming Li
2018-04-23 12:33       ` [PATCH v6 07/11] net/mlx5: support tunnel RSS level Xueming Li
2018-04-23 12:33       ` [PATCH v6 08/11] net/mlx5: add hardware flow debug dump Xueming Li
2018-04-26 10:09         ` Ferruh Yigit
2018-04-26 10:48           ` Shahaf Shuler
2018-04-23 12:33       ` [PATCH v6 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
2018-04-23 12:33       ` [PATCH v6 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
2018-04-23 12:33       ` [PATCH v6 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
2018-04-26 11:00         ` Ferruh Yigit
2018-04-26 14:03           ` Xueming(Steven) Li
2018-04-20 12:23     ` [PATCH v5 01/11] net/mlx5: support 16 hardware priorities Xueming Li
2018-04-20 12:23     ` [PATCH v5 02/11] net/mlx5: support GRE tunnel flow Xueming Li
2018-04-20 12:23     ` [PATCH v5 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
2018-04-20 12:23     ` [PATCH v5 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
2018-04-23  7:40       ` Nélio Laranjeiro
2018-04-23  7:56         ` Xueming(Steven) Li
2018-04-20 12:23     ` [PATCH v5 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
2018-04-20 12:23     ` [PATCH v5 06/11] net/mlx5: split flow RSS handling logic Xueming Li
2018-04-20 12:23     ` [PATCH v5 07/11] net/mlx5: support tunnel RSS level Xueming Li
2018-04-20 12:23     ` [PATCH v5 08/11] net/mlx5: add hardware flow debug dump Xueming Li
2018-04-20 12:23     ` [PATCH v5 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
2018-04-20 12:23     ` [PATCH v5 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
2018-04-20 12:23     ` [PATCH v5 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
2018-04-17 15:14   ` [PATCH v4 01/11] net/mlx5: support 16 hardware priorities Xueming Li
2018-04-17 15:14   ` [PATCH v4 02/11] net/mlx5: support GRE tunnel flow Xueming Li
2018-04-17 15:14   ` [PATCH v4 03/11] net/mlx5: support L3 VXLAN flow Xueming Li
2018-04-18  6:48     ` Nélio Laranjeiro
2018-04-18 14:43       ` Xueming(Steven) Li
2018-04-18 15:08         ` Nélio Laranjeiro
2018-04-19  6:20           ` Xueming(Steven) Li
2018-04-19  6:55             ` Nélio Laranjeiro
2018-04-19 10:21               ` Xueming(Steven) Li
2018-04-19 11:15                 ` Nélio Laranjeiro
2018-04-19 11:53                   ` Xueming(Steven) Li
2018-04-19 12:18                     ` Nélio Laranjeiro
2018-04-19 12:49                       ` Xueming(Steven) Li
2018-04-19 13:40                         ` Nélio Laranjeiro
2018-04-17 15:14   ` [PATCH v4 04/11] net/mlx5: support Rx tunnel type identification Xueming Li
2018-04-18  6:50     ` Nélio Laranjeiro
2018-04-18 14:33       ` Xueming(Steven) Li
2018-04-18 15:06         ` Nélio Laranjeiro
2018-04-17 15:14   ` [PATCH v4 05/11] net/mlx5: cleanup tunnel checksum offloads Xueming Li
2018-04-17 15:14   ` [PATCH v4 06/11] net/mlx5: split flow RSS handling logic Xueming Li
2018-04-17 15:14   ` [PATCH v4 07/11] net/mlx5: support tunnel RSS level Xueming Li
2018-04-18  6:55     ` Nélio Laranjeiro
2018-04-17 15:14   ` [PATCH v4 08/11] net/mlx5: add hardware flow debug dump Xueming Li
2018-04-18  6:57     ` Nélio Laranjeiro
2018-04-17 15:14   ` [PATCH v4 09/11] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
2018-04-18  6:58     ` Nélio Laranjeiro
2018-04-17 15:14   ` [PATCH v4 10/11] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
2018-04-17 15:14   ` [PATCH v4 11/11] doc: update mlx5 guide on tunnel offloading Xueming Li
2018-04-18  7:00     ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 01/14] net/mlx5: support 16 hardware priorities Xueming Li
2018-04-13 11:58   ` Nélio Laranjeiro
2018-04-13 13:10     ` Xueming(Steven) Li
2018-04-13 13:46       ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 02/14] net/mlx5: support GRE tunnel flow Xueming Li
2018-04-13 12:02   ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 03/14] net/mlx5: support L3 VXLAN flow Xueming Li
2018-04-13 12:13   ` Nélio Laranjeiro
2018-04-13 13:51     ` Xueming(Steven) Li
2018-04-13 14:04     ` Xueming(Steven) Li
2018-04-13 11:20 ` [PATCH v3 04/14] net/mlx5: support Rx tunnel type identification Xueming Li
2018-04-13 13:02   ` Nélio Laranjeiro
2018-04-14 12:57     ` Xueming(Steven) Li
2018-04-16  7:28       ` Nélio Laranjeiro
2018-04-16  8:05         ` Xueming(Steven) Li
2018-04-16  9:28           ` Adrien Mazarguil
2018-04-16 13:32             ` Xueming(Steven) Li
2018-04-16 13:47               ` Adrien Mazarguil
2018-04-16 15:27                 ` Xueming(Steven) Li
2018-04-16 16:02                   ` Adrien Mazarguil
2018-04-17  4:53                     ` Xueming(Steven) Li
2018-04-17  7:20                       ` Nélio Laranjeiro
2018-04-17 11:50                         ` Xueming(Steven) Li
2018-04-13 11:20 ` [PATCH v3 05/14] net/mlx5: cleanup tunnel checksum offloads Xueming Li
2018-04-13 11:20 ` [PATCH v3 06/14] net/mlx5: split flow RSS handling logic Xueming Li
2018-04-13 11:20 ` [PATCH v3 07/14] net/mlx5: support tunnel RSS level Xueming Li
2018-04-13 13:27   ` Nélio Laranjeiro
2018-04-14 10:12     ` Xueming(Steven) Li
2018-04-16 12:25       ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 08/14] net/mlx5: add hardware flow debug dump Xueming Li
2018-04-13 13:29   ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 09/14] net/mlx5: introduce VXLAN-GPE tunnel type Xueming Li
2018-04-13 13:32   ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 10/14] net/mlx5: allow flow tunnel ID 0 with outer pattern Xueming Li
2018-04-13 11:20 ` [PATCH v3 11/14] net/mlx5: support MPLS-in-GRE and MPLS-in-UDP Xueming Li
2018-04-13 13:37   ` Nélio Laranjeiro
2018-04-13 14:48     ` Xueming(Steven) Li
2018-04-13 14:55       ` Nélio Laranjeiro
2018-04-13 15:22         ` Xueming(Steven) Li
2018-04-16  8:14           ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 12/14] doc: update mlx5 guide on tunnel offloading Xueming Li
2018-04-13 13:38   ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 13/14] net/mlx5: fix invalid flow item check Xueming Li
2018-04-13 13:40   ` Nélio Laranjeiro
2018-04-13 11:20 ` [PATCH v3 14/14] net/mlx5: support RSS configuration in isolated mode Xueming Li
2018-04-13 13:43   ` Nélio Laranjeiro
