DPDK-dev Archive on lore.kernel.org
* [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
@ 2019-06-03 21:32 Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

Currently, metadata can be set on the egress path via the mbuf tx_metadata field
with the PKT_TX_METADATA flag, and RTE_FLOW_ITEM_TYPE_META matches the metadata.

This patch extends the usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also a need to support metadata on packet Rx. Metadata can be set by
the SET_META action and matched by the META item, as on Tx. The final value set
by the action will be delivered to the application via the mbuf metadata field
with the PKT_RX_METADATA ol_flag.

For this purpose, mbuf->tx_metadata is moved to a separate new field and
renamed to 'metadata' to support both Rx and Tx metadata.

For loopback/hairpin packets, metadata set on Rx/Tx may or may not be
propagated to the other path depending on HW capability.
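
As an illustration (not part of the patch), an ingress rule using the new
action and the corresponding Rx-side check could look roughly as follows,
assuming the definitions proposed here (struct rte_flow_action_set_meta,
PKT_RX_METADATA, mbuf->metadata); port_id, mbuf and handle_metadata() are
placeholders and error handling is omitted:

  struct rte_flow_attr attr = { .ingress = 1 };
  struct rte_flow_item pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_ETH },      /* any Ethernet frame */
          { .type = RTE_FLOW_ITEM_TYPE_END },
  };
  struct rte_flow_action_set_meta set_meta = {
          .data = rte_cpu_to_be_32(0xcafe),
          .mask = rte_cpu_to_be_32(UINT32_MAX),
  };
  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_SET_META, .conf = &set_meta },
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };
  struct rte_flow_error error;
  struct rte_flow *flow =
          rte_flow_create(port_id, &attr, pattern, actions, &error);

  /* Later, on the Rx path: */
  if (mbuf->ol_flags & PKT_RX_METADATA)
          handle_metadata(mbuf->metadata);         /* application-defined */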

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/util.c                   |  2 +-
 doc/guides/prog_guide/rte_flow.rst    | 73 +++++++++++++++++++--------
 drivers/net/mlx5/mlx5_rxtx.c          | 12 ++---
 drivers/net/mlx5/mlx5_rxtx_vec.c      |  4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  2 +-
 lib/librte_ethdev/rte_flow.h          | 43 ++++++++++++++--
 lib/librte_mbuf/rte_mbuf.h            | 21 ++++----
 8 files changed, 111 insertions(+), 48 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index a1164b7053..6ecc97351f 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -182,7 +182,7 @@ tx_pkt_set_md(uint16_t port_id, __rte_unused uint16_t queue,
 	 * and set ol_flags accordingly.
 	 */
 	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
+		pkts[i]->metadata = ports[port_id].tx_metadata;
 		pkts[i]->ol_flags |= PKT_TX_METADATA;
 	}
 	return nb_pkts;
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index a34d012e55..016cd90e52 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches a 32 bit metadata value.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META`` action
+sets metadata for a packet and the metadata will be reported via mbuf metadata
+field with PKT_RX_METADATA flag.
+
+- Default ``mask`` matches the specified metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1189,27 +1215,6 @@ Normally preceded by any of:
 - `Item: ICMP6_ND_NS`_
 - `Item: ICMP6_ND_OPT`_
 
-Item: ``META``
-^^^^^^^^^^^^^^
-
-Matches an application specific 32 bit metadata item.
-
-- Default ``mask`` matches the specified metadata value.
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+----------+---------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Actions
 ~~~~~~~
 
@@ -2345,6 +2350,32 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
    | ``mac_addr`` | MAC address   |
    +--------------+---------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by mbuf
+metadata field with PKT_RX_METADATA flag if set.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 7174ffc91c..19b4a2567b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -626,8 +626,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -1029,8 +1028,7 @@ mlx5_tx_burst_mpw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -1264,8 +1262,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
@@ -1547,8 +1544,7 @@ txq_burst_empw(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 9a3a5ae437..9f99c8cb03 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -71,7 +71,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 	if (!pkts_n)
 		return 0;
 	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
-			pkts[0]->tx_metadata : 0;
+		      pkts[0]->metadata : 0;
 	/* Count the number of packets having same offload parameters. */
 	for (pos = 1; pos < pkts_n; ++pos) {
 		/* Check if packet has same checksum flags. */
@@ -81,7 +81,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 		/* Check if packet has same metadata. */
 		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
 			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
-					pkts[pos]->tx_metadata : 0;
+				      pkts[pos]->metadata : 0;
 			if (pn_metadata != p0_metadata)
 				break;
 		}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index b2cc710887..c54914e97a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -131,7 +131,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		uint8x16_t ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index dce3ee4b40..3de640a2fd 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -129,7 +129,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		__m128i ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f3a8fb103f..cda8628183 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -417,7 +417,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1164,9 +1165,16 @@ rte_flow_item_icmp6_nd_opt_tla_eth_mask = {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * field with PKT_RX_METADATA flag.
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1650,6 +1658,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_mac.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_MAC_DST,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2131,6 +2146,28 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf metadata field with PKT_TX_METADATA flag
+ * on egress will be overridden by this action. On ingress, the metadata will be
+ * carried by mbuf metadata field with PKT_RX_METADATA flag if set.
+ *
+ * Altering partial bits is supported with mask. For bits which have never been
+ * set, unpredictable value will be seen depending on driver implementation. For
+ * loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated
+ * to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
 /*
  * Definition of a single action.
  *
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index e4c2da6ee6..60f2b553e6 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -200,6 +200,11 @@ extern "C" {
 
 /* add new RX flags here */
 
+/**
+ * Indicate that mbuf has metadata from device.
+ */
+#define PKT_RX_METADATA	(1ULL << 23)
+
 /* add new TX flags here */
 
 /**
@@ -648,17 +653,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
@@ -725,6 +719,11 @@ struct rte_mbuf {
 	 */
 	struct rte_mbuf_ext_shared_info *shinfo;
 
+	/** Application specific metadata value for flow rule match.
+	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
+	 */
+	uint32_t metadata;
+
 } __rte_cache_aligned;
 
 /**
-- 
2.21.0



* [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
@ 2019-06-03 21:32 ` Yongseok Koh
  2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

Mark ID can be modified when supporting multiple tables. Partial bit
alteration is supported to preserve some bit-fields set by a previous match.
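
For illustration (not part of the patch), with multiple tables a first rule
could set a mark and a later rule could rewrite only some of its bits, e.g.
as in the sketch below, using the struct proposed here; attr/pattern and
rte_flow_create() boilerplate are omitted:

  /* Group 0: set the initial mark and send the packet to group 1. */
  struct rte_flow_action_mark mark = { .id = 0xab0000 };
  struct rte_flow_action_jump jump = { .group = 1 };
  struct rte_flow_action group0_actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
          { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };

  /* Group 1: rewrite only the low byte, preserving the bits set above. */
  struct rte_flow_action_modify_mark modify = {
          .id = 0x000042,          /* new value for the selected bits */
          .mask = 0x0000ff,        /* only the low byte is altered */
  };
  struct rte_flow_action group1_actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_MODIFY_MARK, .conf = &modify },
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };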

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
 lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 016cd90e52..2907edfff4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is returned in the
    | ``id`` | integer value to return with packets |
    +--------+--------------------------------------+
 
+Action: ``MODIFY_MARK``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Alter partial bits of mark ID set by ``MARK`` action.
+
+``mask`` indicates which bits are modified. For bits which have never been set
+by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be seen depending on
+driver implementation.
+
+.. _table_rte_flow_action_modify_mark:
+
+.. table:: MODIFY_MARK
+
+   +----------+--------------------------------------+
+   | Field    | Value                                |
+   +==========+======================================+
+   | ``id``   | integer value to return with packets |
+   +----------+--------------------------------------+
+   | ``mask`` | bit-mask applies to "id"             |
+   +----------+--------------------------------------+
+
 Action: ``FLAG``
 ^^^^^^^^^^^^^^^^
 
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index cda8628183..d811f8a06e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
 	 */
 	RTE_FLOW_ACTION_TYPE_MARK,
 
+	/**
+	 * Alter partial bits of mark ID set by RTE_FLOW_ACTION_TYPE_MARK.
+	 *
+	 * See struct rte_flow_action_modify_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
+
 	/**
 	 * Flags packets. Similar to MARK without a specific value; only
 	 * sets the PKT_RX_FDIR mbuf flag.
@@ -1681,6 +1688,23 @@ struct rte_flow_action_mark {
 	uint32_t id; /**< Integer value to return with packets. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_MODIFY_MARK
+ *
+ * Alter partial bits of mark ID set by RTE_FLOW_ACTION_TYPE_MARK.
+ *
+ * Provided mask indicates which bits are modified. For bits which have never
+ * been set by mark action or modify_mark action, unpredictable value will be
+ * seen depending on driver implementation.
+ */
+struct rte_flow_action_modify_mark {
+	uint32_t id; /**< Integer value to return with packets. */
+	uint32_t mask; /**< Mask of bits to modify. */
+};
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
-- 
2.21.0



* [dpdk-dev] [RFC 3/3] ethdev: add flow tag
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
@ 2019-06-03 21:32 ` Yongseok Koh
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  3 siblings, 1 reply; 34+ messages in thread
From: Yongseok Koh @ 2019-06-03 21:32 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again on the next table. Even if the outer header is
decapsulated by the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they keep
the status of flow matching. Multiple tags are supported by specifying an
index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end
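
A rough C equivalent of the first and third commands above (a sketch only,
using the structures proposed in this patch; attributes, the remaining
pattern items and the rte_flow_create() calls are omitted):

  /* Ingress rule: set two tags, decap and jump to group 1. */
  struct rte_flow_action_set_tag set_tag2 = {
          .index = 2, .data = 0xaa00bb, .mask = 0xffff00ff,
  };
  struct rte_flow_action_set_tag set_tag3 = {
          .index = 3, .data = 0x123456, .mask = 0xffffff,
  };
  struct rte_flow_action_jump jump = { .group = 1 };
  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &set_tag2 },
          { .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &set_tag3 },
          { .type = RTE_FLOW_ACTION_TYPE_VXLAN_DECAP },
          { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };

  /* Group 1 rule: match tag index 2 set above. */
  struct rte_flow_item_tag tag_spec = { .data = 0xaa00bb, .index = 2 };
  struct rte_flow_item_tag tag_mask = { .data = 0xffff00ff, .index = 0xff };
  struct rte_flow_item pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_TAG,
            .spec = &tag_spec, .mask = &tag_mask },
          /* ... eth and further items ... */
          { .type = RTE_FLOW_ITEM_TYPE_END },
  };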

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 doc/guides/prog_guide/rte_flow.rst | 50 +++++++++++++++++++++++++++
 lib/librte_ethdev/rte_flow.h       | 54 ++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2907edfff4..f6ef4305b4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ field with PKT_RX_METADATA flag.
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2397,6 +2425,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+Tag is transient data used during flow matching. It is not delivered to the
+application. Multiple tags are supported by specifying an index.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index d811f8a06e..5ee2bc95c6 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -422,6 +422,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1187,6 +1196,27 @@ static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1672,6 +1702,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2192,6 +2231,21 @@ struct rte_flow_action_set_meta {
 	rte_be32_t mask;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.21.0



* Re: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
@ 2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
  2019-06-06 18:33     ` Yongseok Koh
  0 siblings, 1 reply; 34+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-06-06 10:35 UTC (permalink / raw)
  To: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, olivier.matz
  Cc: dev

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Yongseok Koh
> Sent: Tuesday, June 4, 2019 3:03 AM
> To: shahafs@mellanox.com; thomas@monjalon.net; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; adrien.mazarguil@6wind.com;
> olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
> 
> Mark ID can be modified when supporting multiple tables. Partial bit
> alteration is supported to preserve some bit-fields set by previous match.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
>  lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> index 016cd90e52..2907edfff4 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is
> returned in the
>     | ``id`` | integer value to return with packets |
>     +--------+--------------------------------------+
> 
> +Action: ``MODIFY_MARK``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Alter partial bits of mark ID set by ``MARK`` action.
> +
> +``mask`` indicates which bits are modified. For bits which have never
> +been set by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be
> +seen depending on driver implementation.
> +
> +.. _table_rte_flow_action_modify_mark:
> +
> +.. table:: MODIFY_MARK
> +
> +   +----------+--------------------------------------+
> +   | Field    | Value                                |
> +   +==========+======================================+
> +   | ``id``   | integer value to return with packets |
> +   +----------+--------------------------------------+
> +   | ``mask`` | bit-mask applies to "id"             |
> +   +----------+--------------------------------------+
> +
>  Action: ``FLAG``
>  ^^^^^^^^^^^^^^^^
> 
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h index
> cda8628183..d811f8a06e 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MARK,
> 
> +	/**
> +	 * Alter partial bits of mark ID set by
> RTE_FLOW_ACTION_TYPE_MARK.
> +	 *
> +	 * See struct rte_flow_action_modify_mark.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
> +

I think we need to define the case where an application calls MODIFY_MARK on a given pattern before MARK.
I think we can either:
# Introduce an error number for that?
# Treat the first MODIFY_MARK as MARK

Just to understand: in the absence of this new action, an application needs
to destroy the given pattern with the associated existing MARK action and
add the same pattern with an updated value as the MARK action? Right?
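
For reference, without the new action that workflow would look roughly like
this (a sketch only; old_flow, attr, pattern, actions, mark and new_mark_value
are assumed to be the application's existing flow handle and rule description):

  struct rte_flow_error error;

  /* Drop the old rule carrying the previous MARK value. */
  rte_flow_destroy(port_id, old_flow, &error);
  /* 'mark' is the struct rte_flow_action_mark referenced from 'actions'. */
  mark.id = new_mark_value;            /* full 32-bit replacement */
  /* Re-create the same pattern with the updated MARK action. */
  old_flow = rte_flow_create(port_id, &attr, pattern, actions, &error);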




* Re: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
  2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
@ 2019-06-06 18:33     ` Yongseok Koh
  0 siblings, 0 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-06-06 18:33 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran
  Cc: Shahaf Shuler, Thomas Monjalon, ferruh.yigit, arybchenko,
	Adrien Mazarguil, olivier.matz, dev


> On Jun 6, 2019, at 3:35 AM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
> 
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Yongseok Koh
>> Sent: Tuesday, June 4, 2019 3:03 AM
>> To: shahafs@mellanox.com; thomas@monjalon.net; ferruh.yigit@intel.com;
>> arybchenko@solarflare.com; adrien.mazarguil@6wind.com;
>> olivier.matz@6wind.com
>> Cc: dev@dpdk.org
>> Subject: [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action
>> 
>> Mark ID can be modified when supporting multiple tables. Partial bit
>> alteration is supported to preserve some bit-fields set by previous match.
>> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> ---
>> doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
>> lib/librte_ethdev/rte_flow.h       | 24 ++++++++++++++++++++++++
>> 2 files changed, 45 insertions(+)
>> 
>> diff --git a/doc/guides/prog_guide/rte_flow.rst
>> b/doc/guides/prog_guide/rte_flow.rst
>> index 016cd90e52..2907edfff4 100644
>> --- a/doc/guides/prog_guide/rte_flow.rst
>> +++ b/doc/guides/prog_guide/rte_flow.rst
>> @@ -1463,6 +1463,27 @@ depends on the underlying implementation. It is
>> returned in the
>>    | ``id`` | integer value to return with packets |
>>    +--------+--------------------------------------+
>> 
>> +Action: ``MODIFY_MARK``
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +Alter partial bits of mark ID set by ``MARK`` action.
>> +
>> +``mask`` indicates which bits are modified. For bits which have never
>> +been set by ``MARK`` or ``MODIFY_MARK``, unpredictable value will be
>> +seen depending on driver implementation.
>> +
>> +.. _table_rte_flow_action_modify_mark:
>> +
>> +.. table:: MODIFY_MARK
>> +
>> +   +----------+--------------------------------------+
>> +   | Field    | Value                                |
>> +   +==========+======================================+
>> +   | ``id``   | integer value to return with packets |
>> +   +----------+--------------------------------------+
>> +   | ``mask`` | bit-mask applies to "id"             |
>> +   +----------+--------------------------------------+
>> +
>> Action: ``FLAG``
>> ^^^^^^^^^^^^^^^^
>> 
>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h index
>> cda8628183..d811f8a06e 100644
>> --- a/lib/librte_ethdev/rte_flow.h
>> +++ b/lib/librte_ethdev/rte_flow.h
>> @@ -1316,6 +1316,13 @@ enum rte_flow_action_type {
>> 	 */
>> 	RTE_FLOW_ACTION_TYPE_MARK,
>> 
>> +	/**
>> +	 * Alter partial bits of mark ID set by
>> RTE_FLOW_ACTION_TYPE_MARK.
>> +	 *
>> +	 * See struct rte_flow_action_modify_mark.
>> +	 */
>> +	RTE_FLOW_ACTION_TYPE_MODIFY_MARK,
>> +
> 
> I think, we need to define the case where application calls MODIFY_MARK first on given pattern before MARK

Good input. 

> I think, either we can 
> # Introduce an error number for that?

Practically, it would be impossible to keep track of the MARK action to check whether it is set prior to MODIFY_MARK.
When creating flows across multiple tables, we can't tell whether a flow having a MODIFY_MARK action has a prior MARK action or not.

> # Treat first MODIFY_MARK as MARK

So, I took a similar approach.
In the documentation above, unset bits would have an arbitrary value depending on the driver/device implementation.
Users can't assume the mark ID is initially zeroed but rather need to check with the vendor.

> Just to understand, in this absence of this new action, an application needs
> to destroy the given pattern with associated  existing MARK action and
> add the same pattern with updated value as MARK action? Right?

The application would have to override it with a second MARK action.
But it has to be validated by the user anyway to check whether the device allows the override.
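
For instance, such a check could be done with rte_flow_validate() before
creating the overriding rule (a sketch only; attr, pattern and actions are
the application's rule description and port_id is a placeholder):

  struct rte_flow_error error;
  struct rte_flow *flow = NULL;

  /* Probe whether the device accepts the rule before creating it;
   * rte_flow_validate() returns 0 on success, a negative errno otherwise,
   * with details in 'error'.
   */
  if (rte_flow_validate(port_id, &attr, pattern, actions, &error) == 0)
          flow = rte_flow_create(port_id, &attr, pattern, actions, &error);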

Thanks,
Yongseok




* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
@ 2019-06-09 14:23 ` Andrew Rybchenko
  2019-06-10  3:19   ` Wang, Haiyue
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  3 siblings, 1 reply; 34+ messages in thread
From: Andrew Rybchenko @ 2019-06-09 14:23 UTC (permalink / raw)
  To: Yongseok Koh, shahafs, thomas, ferruh.yigit, adrien.mazarguil,
	olivier.matz
  Cc: dev

On 6/4/19 12:32 AM, Yongseok Koh wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>
> This patch extends the usability.
>
> 1) RTE_FLOW_ACTION_TYPE_SET_META
>
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
>
> 2) Metadata on ingress
>
> There's also need to support metadata on packet Rx. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via mbuf metadata field with
> PKT_RX_METADATA ol_flag.
>
> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> renamed to 'metadata' to support both Rx and Tx metadata.
>
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on HW capability.
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

There is a mark on Rx which is delivered to the application in hash.fdir.hi.
Why do we need one more 32-bit value set by the NIC and delivered to the
application?
What is the difference between MARK and META on Rx?
When should the application use MARK and when META?
Are there cases when both could be necessary?

Moreover, the third patch adds 32-bit tags which are not delivered to the
application. Maybe META/MARK should simply be a kind of TAG (e.g. with
index 0 or marked using an additional attribute) which is delivered to the
application?

(It is either API breakage (if tx_metadata is removed) or ABI breakage
if metadata and tx_metadata will share new location after shinfo).

Andrew.



* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
@ 2019-06-10  3:19   ` Wang, Haiyue
  2019-06-10  7:20     ` Andrew Rybchenko
  0 siblings, 1 reply; 34+ messages in thread
From: Wang, Haiyue @ 2019-06-10  3:19 UTC (permalink / raw)
  To: Andrew Rybchenko, Yongseok Koh, shahafs, thomas, Yigit, Ferruh,
	adrien.mazarguil, olivier.matz
  Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> Sent: Sunday, June 9, 2019 22:24
> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
> 
> On 6/4/19 12:32 AM, Yongseok Koh wrote:
> > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> >
> > This patch extends the usability.
> >
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> >
> > When supporting multiple tables, Tx metadata can also be set by a rule and
> > matched by another rule. This new action allows metadata to be set as a
> > result of flow match.
> >
> > 2) Metadata on ingress
> >
> > There's also need to support metadata on packet Rx. Metadata can be set by
> > SET_META action and matched by META item like Tx. The final value set by
> > the action will be delivered to application via mbuf metadata field with
> > PKT_RX_METADATA ol_flag.
> >
> > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > renamed to 'metadata' to support both Rx and Tx metadata.
> >
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on HW capability.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> There is a mark on Rx which is delivered to application in hash.fdir.hi.
> Why do we need one more 32-bit value set by NIC and delivered to
> application?
> What is the difference between MARK and META on Rx?
> When application should use MARK and when META?
> Is there cases when both could be necessary?
> 
In my understanding, MARK is an FDIR-related thing, while META seems to be NIC
specific. And we also need this kind of specific data field to export the
NIC's data to the application.

> Moreover, the third patch adds 32-bit tags which are not delivered to
> application. May be META/MARK should be simply a kind of TAG (e.g. with
> index 0 or marked using additional attribute) which is delivered to
> application?
> 
> (It is either API breakage (if tx_metadata is removed) or ABI breakage
> if metadata and tx_metadata will share new location after shinfo).
> 
Make use of udata64 to export NIC metadata to application ?
	RTE_STD_C11
	union {
		void *userdata;   /**< Can be used for external metadata */
		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
		uint64_t rx_metadata;
	};
> Andrew.



* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-10  3:19   ` Wang, Haiyue
@ 2019-06-10  7:20     ` Andrew Rybchenko
  2019-06-11  0:06       ` Yongseok Koh
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Rybchenko @ 2019-06-10  7:20 UTC (permalink / raw)
  To: Wang, Haiyue, Yongseok Koh, shahafs, thomas, Yigit, Ferruh,
	adrien.mazarguil, olivier.matz
  Cc: dev, Ananyev, Konstantin

On 6/10/19 6:19 AM, Wang, Haiyue wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
>> Sent: Sunday, June 9, 2019 22:24
>> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
>>
>> On 6/4/19 12:32 AM, Yongseok Koh wrote:
>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>>>
>>> This patch extends the usability.
>>>
>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>
>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>> matched by another rule. This new action allows metadata to be set as a
>>> result of flow match.
>>>
>>> 2) Metadata on ingress
>>>
>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>> SET_META action and matched by META item like Tx. The final value set by
>>> the action will be delivered to application via mbuf metadata field with
>>> PKT_RX_METADATA ol_flag.
>>>
>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>
>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>> propagated to the other path depending on HW capability.
>>>
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> There is a mark on Rx which is delivered to application in hash.fdir.hi.
>> Why do we need one more 32-bit value set by NIC and delivered to
>> application?
>> What is the difference between MARK and META on Rx?
>> When application should use MARK and when META?
>> Is there cases when both could be necessary?
>>
> In my understanding, MARK is FDIR related thing, META seems to be NIC
> specific. And we also need this kind of specific data field to export
> NIC's data to application.

I think it is better to avoid NIC vendor-specifics in the motivation. I
understand that it exists for you, but I think it is better to look at it
from the RTE flow API definition point of view: both are 32-bit (except for
endianness, and I'm not sure that I understand why meta is defined as
big-endian since it is not a value coming from or going to the network in a
packet, I'm sorry that I missed it on review at that time), both may be set
using an action on Rx, and both may be matched using a pattern item.

>> Moreover, the third patch adds 32-bit tags which are not delivered to
>> application. May be META/MARK should be simply a kind of TAG (e.g. with
>> index 0 or marked using additional attribute) which is delivered to
>> application?
>>
>> (It is either API breakage (if tx_metadata is removed) or ABI breakage
>> if metadata and tx_metadata will share new location after shinfo).
>>
> Make use of udata64 to export NIC metadata to application ?
> 	RTE_STD_C11
> 	union {
> 		void *userdata;   /**< Can be used for external metadata */
> 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
> 		uint64_t rx_metadata;
> 	};

As I understand it does not work for Tx and I'm not sure that it is
a good idea to have different locations for Tx and Rx.

RFC adds it at the end of mbuf, but it was rejected before since
it eats space in mbuf structure (CC Konstantin).

There is a long discussion on the topic before [1], [2], [3] and [4].

Andrew.

[1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
[2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
[3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
[4] http://mails.dpdk.org/archives/dev/2018-October/115469.html


* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-10  7:20     ` Andrew Rybchenko
@ 2019-06-11  0:06       ` Yongseok Koh
  2019-06-19  9:05         ` Andrew Rybchenko
  0 siblings, 1 reply; 34+ messages in thread
From: Yongseok Koh @ 2019-06-11  0:06 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Wang, Haiyue, Shahaf Shuler, Thomas Monjalon, Yigit, Ferruh,
	Adrien Mazarguil, olivier.matz, dev, Ananyev, Konstantin

On Mon, Jun 10, 2019 at 10:20:28AM +0300, Andrew Rybchenko wrote:
> On 6/10/19 6:19 AM, Wang, Haiyue wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> > > Sent: Sunday, June 9, 2019 22:24
> > > To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
> > > <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
> > > 
> > > On 6/4/19 12:32 AM, Yongseok Koh wrote:
> > > > Currently, metadata can be set on egress path via mbuf tx_metadata field
> > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > 
> > > > This patch extends the usability.
> > > > 
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > 
> > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > matched by another rule. This new action allows metadata to be set as a
> > > > result of flow match.
> > > > 
> > > > 2) Metadata on ingress
> > > > 
> > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > SET_META action and matched by META item like Tx. The final value set by
> > > > the action will be delivered to application via mbuf metadata field with
> > > > PKT_RX_METADATA ol_flag.
> > > > 
> > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > 
> > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > propagated to the other path depending on HW capability.
> > > > 
> > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > There is a mark on Rx which is delivered to application in hash.fdir.hi.
> > > Why do we need one more 32-bit value set by NIC and delivered to
> > > application?
> > > What is the difference between MARK and META on Rx?
> > > When application should use MARK and when META?
> > > Is there cases when both could be necessary?
> > > 
> > In my understanding, MARK is FDIR related thing, META seems to be NIC
> > specific. And we also need this kind of specific data field to export
> > NIC's data to application.
> 
> I think it is better to avoid NIC vendor-specifics in motivation. I
> understand
> that it exists for you, but I think it is better to look at it from RTE flow
> API
> definition point of view: both are 32-bit (except endianess and I'm not sure
> that I understand why meta is defined as big-endian since it is not a value
> coming from or going to network in a packet, I'm sorry that I've missed it
> on review that time), both may be set using action on Rx, both may be
> matched using pattern item.

Yes, MARK and META have the same characteristics on the Rx path. Let me clarify
why I picked this way.

What if device has more bits to deliver to host? Currently, only 32-bit data can
be delivered to user via MARK ID. Now we have more requests from users (OVS
connection tracking) that want to see more information generated during flow
match from the device. Let's say it is 64 bits and it may contain intermediate
match results to keep track of multi-table match, to keep address of callback
function to call, or so. I thought about extending the current MARK to 64-bit
but I knew that we couldn't make more room in the first cacheline of mbuf where
every vendor has their critical interest. And the FDIR has been there for a long
time and has lots of use-cases in DPDK (not easy to break). This is why I'm
suggesting to obtain another 32 bits in the second cacheline of the structure.

Also, I thought about another scenario. Even though the MARK item was
introduced lately, it isn't used by any PMD at all for now, meaning it might not
be match-able on a certain device. What if there are two types of registers on
Rx and one is match-able and the other isn't? A PMD can use META for the
match-able register while MARK is used for the non-match-able register without
supporting item match. If MARK simply becomes 64-bit just because it has the
same characteristic in terms of rte_flow, only one of such registers can be
used, as we can't say that only part of the bits is match-able on the item.
Instead of extending MARK to 64 bits, I thought it would be better to give more
flexibility by bundling it with Tx metadata, which can be set via the mbuf.

The actual issue we have may be how we can make it scalable. What if there's a
need to carry even more data from the device? Well, IIRC, Olivier once
suggested putting in a pointer (like mbuf->userdata) to extend the mbuf struct
beyond two cachelines. But we still have some space left at the end.

> > > Moreover, the third patch adds 32-bit tags which are not delivered to
> > > application. May be META/MARK should be simply a kind of TAG (e.g. with
> > > index 0 or marked using additional attribute) which is delivered to
> > > application?

Yes, TAG is a kind of transient device-internal data which isn't delivered to
host. It would be a design choice. I could define all these kinds as an array of
MARK IDs having different attributes - some are exportable/match-able and others
are not, which sounds quite complex. As rte_flow doesn't have a direct way to
check device capability (user has to call a series of validate functions
instead), I thought defining TAG would be better.

> > > (It is either API breakage (if tx_metadata is removed) or ABI breakage
> > > if metadata and tx_metadata will share new location after shinfo).

Fortunately, mlx5 is the only entity which uses tx_metadata so far.

> > Make use of udata64 to export NIC metadata to application ?
> > 	RTE_STD_C11
> > 	union {
> > 		void *userdata;   /**< Can be used for external metadata */
> > 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
> > 		uint64_t rx_metadata;
> > 	};
> 
> As I understand it does not work for Tx and I'm not sure that it is
> a good idea to have different locations for Tx and Rx.
> 
> RFC adds it at the end of mbuf, but it was rejected before since
> it eats space in mbuf structure (CC Konstantin).

Yep, I was in the discussion. IIRC, the reason wasn't because it ate space but
because it could recycle unused space on Tx path. We still have 16B after shinfo
and I'm not sure how many bytes we should reserve. I think reserving space for
one pointer would be fine.

Thanks,
Yongseok

> There is a long discussion on the topic before [1], [2], [3] and [4].
> 
> Andrew.
> 
> [1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
> [2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
> [3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
> [4] http://mails.dpdk.org/archives/dev/2018-October/115469.html


* Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
  2019-06-11  0:06       ` Yongseok Koh
@ 2019-06-19  9:05         ` Andrew Rybchenko
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Rybchenko @ 2019-06-19  9:05 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Wang, Haiyue, Shahaf Shuler, Thomas Monjalon, Yigit, Ferruh,
	Adrien Mazarguil, olivier.matz, dev, Ananyev, Konstantin

On 11.06.2019 3:06, Yongseok Koh wrote:
> On Mon, Jun 10, 2019 at 10:20:28AM +0300, Andrew Rybchenko wrote:
>> On 6/10/19 6:19 AM, Wang, Haiyue wrote:
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
>>>> Sent: Sunday, June 9, 2019 22:24
>>>> To: Yongseok Koh <yskoh@mellanox.com>; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh
>>>> <ferruh.yigit@intel.com>; adrien.mazarguil@6wind.com; olivier.matz@6wind.com
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
>>>>
>>>> On 6/4/19 12:32 AM, Yongseok Koh wrote:
>>>>> Currently, metadata can be set on egress path via mbuf tx_metadata field
>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
>>>>>
>>>>> This patch extends the usability.
>>>>>
>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>
>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>> result of flow match.
>>>>>
>>>>> 2) Metadata on ingress
>>>>>
>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>> the action will be delivered to application via mbuf metadata field with
>>>>> PKT_RX_METADATA ol_flag.
>>>>>
>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>
>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>> propagated to the other path depending on HW capability.
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>> There is a mark on Rx which is delivered to application in hash.fdir.hi.
>>>> Why do we need one more 32-bit value set by NIC and delivered to
>>>> application?
>>>> What is the difference between MARK and META on Rx?
>>>> When application should use MARK and when META?
>>>> Is there cases when both could be necessary?
>>>>
>>> In my understanding, MARK is FDIR related thing, META seems to be NIC
>>> specific. And we also need this kind of specific data field to export
>>> NIC's data to application.
>> I think it is better to avoid NIC vendor-specifics in motivation. I
>> understand
>> that it exists for you, but I think it is better to look at it from RTE flow
>> API
>> definition point of view: both are 32-bit (except endianess and I'm not sure
>> that I understand why meta is defined as big-endian since it is not a value
>> coming from or going to network in a packet, I'm sorry that I've missed it
>> on review that time), both may be set using action on Rx, both may be
>> matched using pattern item.
> Yes, MARK and META has the same characteristic on Rx path. Let me clarify why I
> picked this way.
>
> What if device has more bits to deliver to host? Currently, only 32-bit data can
> be delivered to user via MARK ID. Now we have more requests from users (OVS
> connection tracking) that want to see more information generated during flow
> match from the device. Let's say it is 64 bits and it may contain intermediate
> match results to keep track of multi-table match, to keep address of callback
> function to call, or so. I thought about extending the current MARK to 64-bit
> but I knew that we couldn't make more room in the first cacheline of mbuf where
> every vendor has their critical interest. And the FDIR has been there for a long
> time and has lots of use-cases in DPDK (not easy to break). This is why I'm
> suggesting to obtain another 32 bits in the second cacheline of the structure.
>
> Also, I thought about other scenario as well. Even though we have MARK item
> introduced lately, it isn't used by any PMD at all for now, meaning it might not
> be match-able on a certain device. What if there are two types registers on Rx
> and one is match-able and the other isn't? PMD can use META for match-able
> register while MARK is used for non-match-able register without supporting
> item match. If MARK simply becomes 64-bit just because it has the same
> characteristic in terms of rte_flow, only one of such registers can be used as
> we can't say only part of bits are match-able on the item. Instead of extending
> the MARK to 64 bits, I thought it would be better to give more flexibility by
> bundling it with Tx metadata, which can set by mbuf.

Thanks a lot for the explanations. If the approach is finally approved, the
priority among META and MARK should be defined. I.e. if only one is supported
or only one may be matched, it must be MARK. Otherwise, it will be too
complicated for applications to find out which one to use.
Are there any limitations on the usage of MARK or META in transfer rules?
There is a lot of work on documentation in this area to make it usable.


> The actual issue we have may be how we can make it scalable? What if there's
> more need to carry more data from device? Well, IIRC, Olivier once suggested to
> put a pointer (like mbuf->userdata) to extend mbuf struct beyond two cachelines.
> But we still have some space left at the end.
>
>>>> Moreover, the third patch adds 32-bit tags which are not delivered to
>>>> application. May be META/MARK should be simply a kind of TAG (e.g. with
>>>> index 0 or marked using additional attribute) which is delivered to
>>>> application?
> Yes, TAG is a kind of transient device-internal data which isn't delivered to
> host. It would be a design choice. I could define all these kinds as an array of
> MARK IDs having different attributes - some are exportable/match-able and others
> are not, which sounds quite complex. As rte_flow doesn't have a direct way to
> check device capability (user has to call a series of validate functions
> instead), I thought defining TAG would be better.
>
>>>> (It is either API breakage (if tx_metadata is removed) or ABI breakage
>>>> if metadata and tx_metadata will share new location after shinfo).
> Fortunately, mlx5 is the only entity which uses tx_metadata so far.

As I understand it, this is still a breakage.

>>> Make use of udata64 to export NIC metadata to application ?
>>> 	RTE_STD_C11
>>> 	union {
>>> 		void *userdata;   /**< Can be used for external metadata */
>>> 		uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
>>> 		uint64_t rx_metadata;
>>> 	};
>> As I understand it does not work for Tx and I'm not sure that it is
>> a good idea to have different locations for Tx and Rx.
>>
>> RFC adds it at the end of mbuf, but it was rejected before since
>> it eats space in mbuf structure (CC Konstantin).
> Yep, I was in the discussion. IIRC, the reason wasn't because it ate space but
> because it could recycle unused space on Tx path. We still have 16B after shinfo
> and I'm not sure how many bytes we should reserve. I think reserving space for
> one pointer would be fine.

I have no strong opinion.

Thanks,
Andrew.

> Thanks,
> Yongseok
>
>> There is a long discussion on the topic before [1], [2], [3] and [4].
>>
>> Andrew.
>>
>> [1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
>> [2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
>> [3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
>> [4] http://mails.dpdk.org/archives/dev/2018-October/115469.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
                   ` (2 preceding siblings ...)
  2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
@ 2019-07-04 23:21 ` " Yongseok Koh
  2019-07-10  9:31   ` Olivier Matz
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  3 siblings, 2 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-07-04 23:21 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev, viacheslavo

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.

This patch extends the usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also a need to support metadata on packet Rx. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via mbuf metadata field with
PKT_RX_METADATA ol_flag.

For this purpose, mbuf->tx_metadata is moved as a separate new field and
renamed to 'metadata' to support both Rx and Tx metadata.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on HW capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/cmdline_flow.c            | 35 ++++++++++++
 app/test-pmd/util.c                    |  2 +-
 doc/guides/prog_guide/rte_flow.rst     | 73 ++++++++++++++++++--------
 doc/guides/rel_notes/release_19_08.rst | 10 ++++
 drivers/net/mlx5/mlx5_rxtx.c           | 12 ++---
 drivers/net/mlx5/mlx5_rxtx_vec.c       |  4 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h  |  2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h   |  2 +-
 lib/librte_ethdev/rte_ethdev.h         |  5 ++
 lib/librte_ethdev/rte_flow.h           | 43 +++++++++++++--
 lib/librte_mbuf/rte_mbuf.h             | 21 ++++----
 11 files changed, 161 insertions(+), 48 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 201bd9de56..eda5c5491f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -272,6 +272,9 @@ enum index {
 	ACTION_SET_MAC_SRC_MAC_SRC,
 	ACTION_SET_MAC_DST,
 	ACTION_SET_MAC_DST_MAC_DST,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -885,6 +888,7 @@ static const enum index next_action[] = {
 	ACTION_SET_TTL,
 	ACTION_SET_MAC_SRC,
 	ACTION_SET_MAC_DST,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1047,6 +1051,13 @@ static const enum index action_set_mac_dst[] = {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -2854,6 +2865,30 @@ static const struct token token_list[] = {
 			     (struct rte_flow_action_set_mac, mac_addr)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index a1164b7053..6ecc97351f 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -182,7 +182,7 @@ tx_pkt_set_md(uint16_t port_id, __rte_unused uint16_t queue,
 	 * and set ol_flags accordingly.
 	 */
 	for (i = 0; i < nb_pkts; i++) {
-		pkts[i]->tx_metadata = ports[port_id].tx_metadata;
+		pkts[i]->metadata = ports[port_id].tx_metadata;
 		pkts[i]->ol_flags |= PKT_TX_METADATA;
 	}
 	return nb_pkts;
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index a34d012e55..5092f0074e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches a 32 bit metadata value set on a packet.
+
+On egress, metadata can be set either via the mbuf metadata field with the
+PKT_TX_METADATA flag or via the ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via the
+``metadata`` field of ``rte_mbuf`` with the PKT_RX_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -1189,27 +1215,6 @@ Normally preceded by any of:
 - `Item: ICMP6_ND_NS`_
 - `Item: ICMP6_ND_OPT`_
 
-Item: ``META``
-^^^^^^^^^^^^^^
-
-Matches an application specific 32 bit metadata item.
-
-- Default ``mask`` matches the specified metadata value.
-
-.. _table_rte_flow_item_meta:
-
-.. table:: META
-
-   +----------+----------+---------------------------------------+
-   | Field    | Subfield | Value                                 |
-   +==========+==========+=======================================+
-   | ``spec`` | ``data`` | 32 bit metadata value                 |
-   +----------+--------------------------------------------------+
-   | ``last`` | ``data`` | upper range value                     |
-   +----------+----------+---------------------------------------+
-   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
-   +----------+----------+---------------------------------------+
-
 Actions
 ~~~~~~~
 
@@ -2345,6 +2350,32 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
    | ``mac_addr`` | MAC address   |
    +--------------+---------------+
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by mbuf
+metadata field with PKT_RX_METADATA flag if set.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, an unpredictable value will be seen depending on the driver implementation.
+For loopback/hairpin packets, metadata set on Rx/Tx may or may not be propagated
+to the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index 223479c6d4..e087266da0 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -68,6 +68,16 @@ New Features
   rte_rand_max() which supplies unbiased, bounded pseudo-random
   numbers.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * The ``tx_metadata`` field of ``rte_mbuf`` has been moved out of the union
+    into an independent field and renamed to ``metadata``.
+  * Tx metadata can also be set by the SET_META action of rte_flow.
+  * Rx metadata is delivered to the host via the ``metadata`` field of
+    ``rte_mbuf`` with PKT_RX_METADATA.
+
 * **Updated the bnxt PMD.**
 
   Updated the bnxt PMD. The major enhancements include:
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index c1dc8c4e17..4b23a0176d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -784,8 +784,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
 		raw = ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Replace the Ethernet type by the VLAN if necessary. */
 		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 			uint32_t vlan = rte_cpu_to_be_32(0x81000000 |
@@ -1193,8 +1192,7 @@ mlx5_tx_burst_mpw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		assert(length);
@@ -1430,8 +1428,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if packet differs. */
@@ -1715,8 +1712,7 @@ txq_burst_empw(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 			break;
 		cs_flags = txq_ol_cksum_to_cs(buf);
 		/* Copy metadata from mbuf if valid */
-		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->tx_metadata :
-							     0;
+		metadata = buf->ol_flags & PKT_TX_METADATA ? buf->metadata : 0;
 		/* Retrieve packet information. */
 		length = PKT_LEN(buf);
 		/* Start new session if:
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 073044f6d1..b8e042c5d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -71,7 +71,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 	if (!pkts_n)
 		return 0;
 	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
-			pkts[0]->tx_metadata : 0;
+		      pkts[0]->metadata : 0;
 	/* Count the number of packets having same offload parameters. */
 	for (pos = 1; pos < pkts_n; ++pos) {
 		/* Check if packet has same checksum flags. */
@@ -81,7 +81,7 @@ txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
 		/* Check if packet has same metadata. */
 		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
 			pn_metadata = pkts[pos]->ol_flags & PKT_TX_METADATA ?
-					pkts[pos]->tx_metadata : 0;
+				      pkts[pos]->metadata : 0;
 			if (pn_metadata != p0_metadata)
 				break;
 		}
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 1c7e3b444a..900cd9db43 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -131,7 +131,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		uint8x16_t ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 503ca0f6ad..df7e22b9b9 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -129,7 +129,7 @@ txq_scatter_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts,
 		__m128i ctrl;
 		rte_be32_t metadata =
 			metadata_ol && (buf->ol_flags & PKT_TX_METADATA) ?
-			buf->tx_metadata : 0;
+			buf->metadata : 0;
 
 		assert(segs_n);
 		max_elts = elts_n - (elts_head - txq->elts_tail);
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index c85212649c..ee0707e2d8 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1011,6 +1011,11 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
+/**
+ * Device supports match on metadata Rx offload.
+ * Driver sets PKT_RX_METADATA and mbuf metadata field.
+ */
+#define DEV_RX_OFFLOAD_MATCH_METADATA   0x00080000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f3a8fb103f..cda8628183 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -417,7 +417,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1164,9 +1165,16 @@ rte_flow_item_icmp6_nd_opt_tla_eth_mask = {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * field with PKT_RX_METADATA flag.
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1650,6 +1658,13 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_mac.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_MAC_DST,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2131,6 +2146,28 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf metadata field with PKT_TX_METADATA flag
+ * on egress will be overridden by this action. On ingress, the metadata will be
+ * carried by mbuf metadata field with PKT_RX_METADATA flag if set.
+ *
+ * Altering partial bits is supported with mask. For bits which have never been
+ * set, an unpredictable value will be seen depending on the driver
+ * implementation. For loopback/hairpin packets, metadata set on Rx/Tx may or
+ * may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
 /*
  * Definition of a single action.
  *
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 9542488554..ba2da874f5 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -200,6 +200,11 @@ extern "C" {
 
 /* add new RX flags here */
 
+/**
+ * Indicate that mbuf has metadata from device.
+ */
+#define PKT_RX_METADATA	(1ULL << 23)
+
 /* add new TX flags here */
 
 /**
@@ -648,17 +653,6 @@ struct rte_mbuf {
 			/**< User defined tags. See rte_distributor_process() */
 			uint32_t usr;
 		} hash;                   /**< hash information */
-		struct {
-			/**
-			 * Application specific metadata value
-			 * for egress flow rule match.
-			 * Valid if PKT_TX_METADATA is set.
-			 * Located here to allow conjunct use
-			 * with hash.sched.hi.
-			 */
-			uint32_t tx_metadata;
-			uint32_t reserved;
-		};
 	};
 
 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
@@ -727,6 +721,11 @@ struct rte_mbuf {
 	 */
 	struct rte_mbuf_ext_shared_info *shinfo;
 
+	/** Application specific metadata value for flow rule match.
+	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
+	 */
+	uint32_t metadata;
+
 } __rte_cache_aligned;
 
 /**
-- 
2.21.0
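
As a rough usage sketch of what this patch enables (hedged: it assumes the
SET_META action, META item, PKT_RX_METADATA flag and mbuf->metadata field
introduced above; error handling and a realistic pattern/fate action are
omitted):

#include <stdint.h>
#include <rte_byteorder.h>
#include <rte_flow.h>
#include <rte_mbuf.h>

static struct rte_flow *
set_meta_rule(uint16_t port_id, struct rte_flow_error *err)
{
	static const struct rte_flow_attr attr = { .ingress = 1 };
	/* Set metadata 0x1234 on every packet hitting this rule. */
	struct rte_flow_action_set_meta set_meta = {
		.data = rte_cpu_to_be_32(0x1234),
		.mask = rte_cpu_to_be_32(UINT32_MAX),
	};
	const struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	const struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_META, .conf = &set_meta },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}

/* On Rx, the value set above is visible to the application in the mbuf. */
static inline uint32_t
pkt_meta(const struct rte_mbuf *m)
{
	return (m->ol_flags & PKT_RX_METADATA) ? m->metadata : 0;
}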


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
@ 2019-07-04 23:23   ` " Yongseok Koh
  2019-07-05 13:54     ` Adrien Mazarguil
  2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  0 siblings, 2 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-07-04 23:23 UTC (permalink / raw)
  To: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil,
	olivier.matz
  Cc: dev, viacheslavo

A tag is transient data which can be used during flow matching. It can be
used to store the match result from a previous table so that the same pattern
need not be matched again on the next table. Even if the outer header is
decapsulated by the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they
keep the status of flow matching. Multiple tags are supported by specifying
an index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++
 doc/guides/rel_notes/release_19_08.rst |  4 ++
 lib/librte_ethdev/rte_flow.h           | 54 +++++++++++++++++++
 4 files changed, 183 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index eda5c5491f..37a692db54 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -181,6 +181,9 @@ enum index {
 	ITEM_ICMP6_ND_OPT_TLA_ETH_TLA,
 	ITEM_META,
 	ITEM_META_DATA,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -275,6 +278,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -613,6 +620,7 @@ static const enum index next_item[] = {
 	ITEM_ICMP6_ND_OPT_SLA_ETH,
 	ITEM_ICMP6_ND_OPT_TLA_ETH,
 	ITEM_META,
+	ITEM_TAG,
 	ZERO,
 };
 
@@ -839,6 +847,13 @@ static const enum index item_meta[] = {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -889,6 +904,7 @@ static const enum index next_action[] = {
 	ACTION_SET_MAC_SRC,
 	ACTION_SET_MAC_DST,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1058,6 +1074,14 @@ static const enum index action_set_meta[] = {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -2161,6 +2185,26 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_meta,
 						  data, "\xff\xff\xff\xff")),
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 
 	/* Validate/create actions. */
 	[ACTIONS] = {
@@ -2889,6 +2933,37 @@ static const struct token token_list[] = {
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 5092f0074e..7880425e9e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches a tag value set by other flows. Multiple tags are supported by
+specifying ``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+-----------+---------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2376,6 +2404,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+A tag is transient data used during flow matching; it is not delivered to the
+application. Multiple tags are supported by specifying ``index``.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index e087266da0..7c9da3fdae 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -78,6 +78,10 @@ New Features
   * Rx metadata is delivered to host via ``metadata`` field of ``rte_mbuf``
     with PKT_RX_METADATA.
 
+* **Added flow tag in rte_flow.**
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 * **Updated the bnxt PMD.**
 
   Updated the bnxt PMD. The major enhancements include:
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index cda8628183..ffb8a13098 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -422,6 +422,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1187,6 +1196,27 @@ static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1665,6 +1695,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2168,6 +2207,21 @@ struct rte_flow_action_set_meta {
 	rte_be32_t mask;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag, which is transient data used during flow matching. It is not
+ * delivered to the application. Multiple tags are supported via an index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.21.0
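
A hedged sketch of the multi-table usage from the commit message, written
against the struct rte_flow_action_set_tag / rte_flow_item_tag definitions
added above (the group numbers and values mirror the testpmd examples; the
remaining pattern items and fate actions are placeholders):

#include <stdint.h>
#include <rte_flow.h>

/* Group 0: store the match result in tag[2], then jump to group 1. */
static const struct rte_flow_action_set_tag set_tag = {
	.index = 2,
	.data = 0xaa00bb,
	.mask = 0xffff00ff,
};
static const struct rte_flow_action_jump jump = { .group = 1 };
static const struct rte_flow_action group0_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &set_tag },
	{ .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};

/* Group 1: match the stored result instead of re-matching the outer headers. */
static const struct rte_flow_item_tag tag_spec = { .data = 0xaa00bb, .index = 2 };
static const struct rte_flow_item_tag tag_mask = { .data = 0xffff00ff, .index = 0xff };
static const struct rte_flow_item group1_pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_TAG, .spec = &tag_spec, .mask = &tag_mask },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};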


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
@ 2019-07-05 13:54     ` Adrien Mazarguil
  2019-07-05 18:05       ` Yongseok Koh
  2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 34+ messages in thread
From: Adrien Mazarguil @ 2019-07-05 13:54 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: shahafs, thomas, ferruh.yigit, arybchenko, olivier.matz, dev,
	viacheslavo

On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> A tag is a transient data which can be used during flow match. This can be
> used to store match result from a previous table so that the same pattern
> need not be matched again on the next table. Even if outer header is
> decapsulated on the previous match, the match result can be kept.
> 
> Some device expose internal registers of its flow processing pipeline and
> those registers are quite useful for stateful connection tracking as it
> keeps status of flow matching. Multiple tags are supported by specifying
> index.
> 
> Example testpmd commands are:
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>             eth ... / end
>     actions ... jump group 2 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
>   flow create 0 ingress group 2
>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Hi Yongseok,

Only high level questions for now, while it unquestionably looks useful,
from a user standpoint exposing the separate index seems redundant and not
necessarily convenient. Using the following example to illustrate:

 actions set_tag index 3 value 0x123456 mask 0xfffff

 pattern tag index is 3 value spec 0x123456 value mask 0xffffff

I might be missing something, but why isn't this enough:

 pattern tag index is 3 # match whatever is stored at index 3

Assuming it can work, then why bother with providing value spec/mask on
set_tag? A flow rule pattern matches something, sets some arbitrary tag to
be matched by a subsequent flow rule and that's it. It even seems like
relying on the index only on both occasions is enough for identification.

Same question for the opposite approach; relying on the value, never
mentioning the index.

I'm under the impression that the index is a hardware-specific constraint
that shouldn't be exposed (especially since it's an 8-bit field). If so, a
PMD could keep track of used indices without having them exposed through the
public API.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 13:54     ` Adrien Mazarguil
@ 2019-07-05 18:05       ` Yongseok Koh
  2019-07-08 23:32         ` Yongseok Koh
  2019-07-09  8:38         ` Adrien Mazarguil
  0 siblings, 2 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-07-05 18:05 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko



> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> 
> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>> A tag is a transient data which can be used during flow match. This can be
>> used to store match result from a previous table so that the same pattern
>> need not be matched again on the next table. Even if outer header is
>> decapsulated on the previous match, the match result can be kept.
>> 
>> Some device expose internal registers of its flow processing pipeline and
>> those registers are quite useful for stateful connection tracking as it
>> keeps status of flow matching. Multiple tags are supported by specifying
>> index.
>> 
>> Example testpmd commands are:
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>            eth ... / end
>>    actions ... jump group 2 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>>  flow create 0 ingress group 2
>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> Hi Yongseok,
> 
> Only high level questions for now, while it unquestionably looks useful,
> from a user standpoint exposing the separate index seems redundant and not
> necessarily convenient. Using the following example to illustrate:
> 
> actions set_tag index 3 value 0x123456 mask 0xfffff
> 
> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> 
> I might be missing something, but why isn't this enough:
> 
> pattern tag index is 3 # match whatever is stored at index 3
> 
> Assuming it can work, then why bother with providing value spec/mask on
> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> be matched by a subsequent flow rule and that's it. It even seems like
> relying on the index only on both occasions is enough for identification.
> 
> Same question for the opposite approach; relying on the value, never
> mentioning the index.
> 
> I'm under the impression that the index is a hardware-specific constraint
> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> PMD could keep track of used indices without having them exposed through the
> public API.


Thank you for review, Adrien.
Hope you are doing well. It's been a while since we last talked. :-)

Your approach will work too in general, but we have a request from a customer who
wants to partition this limited tag storage. Assume that HW exposes 32-bit tags
(those are 'registers' in the HW pipeline of mlx5 HW). Customers then want to
store multiple pieces of data even in a single 32-bit storage - for example, a
16-bit VLAN tag, an 8-bit table id and an 8-bit flow id. As they want to split one
32-bit storage, I thought it is better to provide a mask when setting/matching
the value. Some customers even want to store multiple flags bit by bit, like
ol_flags, and alter only partial bits.

And as for the index, it is used to reference an entry of the tag array, as HW
can provide more than a single 32-bit register. For example, mlx5 HW would
provide 4 32-bit storages which users can use for their own sake:
	tag[0], tag[1], tag[2], tag[3]
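
As a rough sketch of that partitioning (using the struct
rte_flow_action_set_tag proposed in this patch; the 16/8/8 split is just the
example above):

#include <rte_flow.h>

/* Example split of one 32-bit tag: [31:16] VLAN, [15:8] table id, [7:0] flow id. */
#define TAG_VLAN_MASK 0xffff0000u
#define TAG_TBL_MASK  0x0000ff00u
#define TAG_FLOW_MASK 0x000000ffu

/* Update only the table-id byte of tag[0]; the VLAN and flow-id bits are untouched. */
static const struct rte_flow_action_set_tag set_tbl_id = {
	.index = 0,
	.data = 5u << 8,      /* table id 5 in bits 15:8 */
	.mask = TAG_TBL_MASK, /* only these bits are altered */
};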


Thanks,
Yongseok


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 18:05       ` Yongseok Koh
@ 2019-07-08 23:32         ` Yongseok Koh
  2019-07-09  8:38         ` Adrien Mazarguil
  1 sibling, 0 replies; 34+ messages in thread
From: Yongseok Koh @ 2019-07-08 23:32 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko


> On Jul 5, 2019, at 11:05 AM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> 
> 
>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>> 
>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>> A tag is a transient data which can be used during flow match. This can be
>>> used to store match result from a previous table so that the same pattern
>>> need not be matched again on the next table. Even if outer header is
>>> decapsulated on the previous match, the match result can be kept.
>>> 
>>> Some device expose internal registers of its flow processing pipeline and
>>> those registers are quite useful for stateful connection tracking as it
>>> keeps status of flow matching. Multiple tags are supported by specifying
>>> index.
>>> 
>>> Example testpmd commands are:
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>           eth ... / end
>>>   actions ... jump group 2 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>           tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> flow create 0 ingress group 2
>>>   pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> 
>> Hi Yongseok,
>> 
>> Only high level questions for now, while it unquestionably looks useful,
>> from a user standpoint exposing the separate index seems redundant and not
>> necessarily convenient. Using the following example to illustrate:
>> 
>> actions set_tag index 3 value 0x123456 mask 0xfffff
>> 
>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>> 
>> I might be missing something, but why isn't this enough:
>> 
>> pattern tag index is 3 # match whatever is stored at index 3
>> 
>> Assuming it can work, then why bother with providing value spec/mask on
>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>> be matched by a subsequent flow rule and that's it. It even seems like
>> relying on the index only on both occasions is enough for identification.
>> 
>> Same question for the opposite approach; relying on the value, never
>> mentioning the index.
>> 
>> I'm under the impression that the index is a hardware-specific constraint
>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>> PMD could keep track of used indices without having them exposed through the
>> public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)
> 
> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

Adrien,

Are you okay with this?

Thanks,
Yongseok



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-05 18:05       ` Yongseok Koh
  2019-07-08 23:32         ` Yongseok Koh
@ 2019-07-09  8:38         ` Adrien Mazarguil
  2019-07-11  1:59           ` Yongseok Koh
  1 sibling, 1 reply; 34+ messages in thread
From: Adrien Mazarguil @ 2019-07-09  8:38 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > 
> > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >> A tag is a transient data which can be used during flow match. This can be
> >> used to store match result from a previous table so that the same pattern
> >> need not be matched again on the next table. Even if outer header is
> >> decapsulated on the previous match, the match result can be kept.
> >> 
> >> Some device expose internal registers of its flow processing pipeline and
> >> those registers are quite useful for stateful connection tracking as it
> >> keeps status of flow matching. Multiple tags are supported by specifying
> >> index.
> >> 
> >> Example testpmd commands are:
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>            eth ... / end
> >>    actions ... jump group 2 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >>  flow create 0 ingress group 2
> >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > 
> > Hi Yongseok,
> > 
> > Only high level questions for now, while it unquestionably looks useful,
> > from a user standpoint exposing the separate index seems redundant and not
> > necessarily convenient. Using the following example to illustrate:
> > 
> > actions set_tag index 3 value 0x123456 mask 0xfffff
> > 
> > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > 
> > I might be missing something, but why isn't this enough:
> > 
> > pattern tag index is 3 # match whatever is stored at index 3
> > 
> > Assuming it can work, then why bother with providing value spec/mask on
> > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > be matched by a subsequent flow rule and that's it. It even seems like
> > relying on the index only on both occasions is enough for identification.
> > 
> > Same question for the opposite approach; relying on the value, never
> > mentioning the index.
> > 
> > I'm under the impression that the index is a hardware-specific constraint
> > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > PMD could keep track of used indices without having them exposed through the
> > public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)

Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
answer these days...

 <dev@dpdk.org> hey!
 <dev@dpdk.org> no private talks!

Back to the topic:

> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

OK, looks like I missed the point then. I initially took it for a funky
alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
(ingress extended [1]) but while it could be used like that, it's more of a
way to temporarily store and retrieve a small amount of data, correct?

Out of curiosity, are these registers independent from META and other
items/actions in mlx5, otherwise what happens if they are combined?

Are there other uses for these registers? Say, referencing their contents
from other places in a flow rule so they don't have to be hard-coded?

Right now I'm still uncomfortable with such a feature in the public API
because compared to META [1], this approach looks very hardware-specific and
seemingly difficult to map on different HW architectures.

However, the main problem is that as described, its end purpose seems
redundant with META, which I think can cover the use cases you gave. So what
can an application do with this that couldn't be done in a more generic
fashion through META?

I may still be missing something and I'm open to ideas, but assuming it
doesn't make it into the public rte_flow API, it remains an interesting
feature on its own merit which could be added to DPDK as PMD-specific
pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
for PMDs to expose a public header that dedicated applications can include
to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
No problem with that.

[1] "[PATCH] ethdev: extend flow metadata"
    https://mails.dpdk.org/archives/dev/2019-July/137305.html

[2] "Negative types"
    https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
@ 2019-07-10  9:31   ` Olivier Matz
  2019-07-10  9:55     ` Bruce Richardson
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
  1 sibling, 1 reply; 34+ messages in thread
From: Olivier Matz @ 2019-07-10  9:31 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: shahafs, thomas, ferruh.yigit, arybchenko, adrien.mazarguil, dev,
	viacheslavo

Hi Yongseok,

On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> Currently, metadata can be set on egress path via mbuf tx_meatadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> 
> This patch extends the usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on packet Rx. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via mbuf metadata field with
> PKT_RX_METADATA ol_flag.
> 
> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> renamed to 'metadata' to support both Rx and Tx metadata.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on HW capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

(...)

> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -200,6 +200,11 @@ extern "C" {
>  
>  /* add new RX flags here */
>  
> +/**
> + * Indicate that mbuf has metadata from device.
> + */
> +#define PKT_RX_METADATA	(1ULL << 23)
> +
>  /* add new TX flags here */
>  
>  /**
> @@ -648,17 +653,6 @@ struct rte_mbuf {
>  			/**< User defined tags. See rte_distributor_process() */
>  			uint32_t usr;
>  		} hash;                   /**< hash information */
> -		struct {
> -			/**
> -			 * Application specific metadata value
> -			 * for egress flow rule match.
> -			 * Valid if PKT_TX_METADATA is set.
> -			 * Located here to allow conjunct use
> -			 * with hash.sched.hi.
> -			 */
> -			uint32_t tx_metadata;
> -			uint32_t reserved;
> -		};
>  	};
>  
>  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> @@ -727,6 +721,11 @@ struct rte_mbuf {
>  	 */
>  	struct rte_mbuf_ext_shared_info *shinfo;
>  
> +	/** Application specific metadata value for flow rule match.
> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> +	 */
> +	uint32_t metadata;
> +
>  } __rte_cache_aligned;

This will break the ABI, so we cannot put it in 19.08, and we need a
deprecation notice.

That said, it shows again that we need something to make the addition of
fields in mbufs more flexible. Knowing that Thomas will present
something about it at the next userspace event [1], I drafted something in
an RFC [2]. It is simpler than I expected, and I think your commit could be
a good candidate for a first user. Do you mind having a try? Feedback
and comments are of course welcome.

If it fits your needs, we can target its introduction for 19.11. Having
a user for this new feature would be a plus for its introduction :)

Thanks,
Olivier

[1] https://dpdkbordeaux2019.sched.com/
[2] http://mails.dpdk.org/archives/dev/2019-July/137772.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10  9:31   ` Olivier Matz
@ 2019-07-10  9:55     ` Bruce Richardson
  2019-07-10 10:07       ` Olivier Matz
  0 siblings, 1 reply; 34+ messages in thread
From: Bruce Richardson @ 2019-07-10  9:55 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> Hi Yongseok,
> 
> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > 
> > This patch extends the usability.
> > 
> > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > 
> > When supporting multiple tables, Tx metadata can also be set by a rule and
> > matched by another rule. This new action allows metadata to be set as a
> > result of flow match.
> > 
> > 2) Metadata on ingress
> > 
> > There's also need to support metadata on packet Rx. Metadata can be set by
> > SET_META action and matched by META item like Tx. The final value set by
> > the action will be delivered to application via mbuf metadata field with
> > PKT_RX_METADATA ol_flag.
> > 
> > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > renamed to 'metadata' to support both Rx and Tx metadata.
> > 
> > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > propagated to the other path depending on HW capability.
> > 
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> (...)
> 
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -200,6 +200,11 @@ extern "C" {
> >  
> >  /* add new RX flags here */
> >  
> > +/**
> > + * Indicate that mbuf has metadata from device.
> > + */
> > +#define PKT_RX_METADATA	(1ULL << 23)
> > +
> >  /* add new TX flags here */
> >  
> >  /**
> > @@ -648,17 +653,6 @@ struct rte_mbuf {
> >  			/**< User defined tags. See rte_distributor_process() */
> >  			uint32_t usr;
> >  		} hash;                   /**< hash information */
> > -		struct {
> > -			/**
> > -			 * Application specific metadata value
> > -			 * for egress flow rule match.
> > -			 * Valid if PKT_TX_METADATA is set.
> > -			 * Located here to allow conjunct use
> > -			 * with hash.sched.hi.
> > -			 */
> > -			uint32_t tx_metadata;
> > -			uint32_t reserved;
> > -		};
> >  	};
> >  
> >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > @@ -727,6 +721,11 @@ struct rte_mbuf {
> >  	 */
> >  	struct rte_mbuf_ext_shared_info *shinfo;
> >  
> > +	/** Application specific metadata value for flow rule match.
> > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > +	 */
> > +	uint32_t metadata;
> > +
> >  } __rte_cache_aligned;
> 
> This will break the ABI, so we cannot put it in 19.08, and we need a
> deprecation notice.
> 
Does it actually break the ABI? Adding a new field to the mbuf should only
break the ABI if it either causes existing fields to move or changes the
structure size. Since this is at the end, it's not going to move any older
fields, and since everything is cache-aligned I don't think the structure
size changes either.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10  9:55     ` Bruce Richardson
@ 2019-07-10 10:07       ` Olivier Matz
  2019-07-10 12:01         ` Bruce Richardson
  0 siblings, 1 reply; 34+ messages in thread
From: Olivier Matz @ 2019-07-10 10:07 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > Hi Yongseok,
> > 
> > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > 
> > > This patch extends the usability.
> > > 
> > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > 
> > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > matched by another rule. This new action allows metadata to be set as a
> > > result of flow match.
> > > 
> > > 2) Metadata on ingress
> > > 
> > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > SET_META action and matched by META item like Tx. The final value set by
> > > the action will be delivered to application via mbuf metadata field with
> > > PKT_RX_METADATA ol_flag.
> > > 
> > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > 
> > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > propagated to the other path depending on HW capability.
> > > 
> > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > 
> > (...)
> > 
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -200,6 +200,11 @@ extern "C" {
> > >  
> > >  /* add new RX flags here */
> > >  
> > > +/**
> > > + * Indicate that mbuf has metadata from device.
> > > + */
> > > +#define PKT_RX_METADATA	(1ULL << 23)
> > > +
> > >  /* add new TX flags here */
> > >  
> > >  /**
> > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > >  			/**< User defined tags. See rte_distributor_process() */
> > >  			uint32_t usr;
> > >  		} hash;                   /**< hash information */
> > > -		struct {
> > > -			/**
> > > -			 * Application specific metadata value
> > > -			 * for egress flow rule match.
> > > -			 * Valid if PKT_TX_METADATA is set.
> > > -			 * Located here to allow conjunct use
> > > -			 * with hash.sched.hi.
> > > -			 */
> > > -			uint32_t tx_metadata;
> > > -			uint32_t reserved;
> > > -		};
> > >  	};
> > >  
> > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > >  	 */
> > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > >  
> > > +	/** Application specific metadata value for flow rule match.
> > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > +	 */
> > > +	uint32_t metadata;
> > > +
> > >  } __rte_cache_aligned;
> > 
> > This will break the ABI, so we cannot put it in 19.08, and we need a
> > deprecation notice.
> > 
> Does it actually break the ABI? Adding a new field to the mbuf should only
> break the ABI if it either causes new fields to move or changes the
> structure size. Since this is at the end, it's not going to move any older
> fields, and since everything is cache-aligned I don't think the structure
> size changes either.

I think it does break the ABI: in previous version, when the PKT_TX_METADATA
flag is set, the associated value is put in m->tx_metadata (offset 44 on
x86-64), and in the next version, it will be in m->metadata (offset 112). So,
these 2 versions are not binary compatible.

Anyway, at least it breaks the API.
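
For illustration, a minimal standalone sketch (not DPDK code; the 44 and 112
offsets are the x86-64 figures mentioned above) of why the two layouts are not
binary compatible even though the structure size is unchanged:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the two mbuf layouts; only the metadata offsets matter. */
struct mbuf_v1 { char pad[44];  uint32_t tx_metadata; }; /* old: in the hash union */
struct mbuf_v2 { char pad[112]; uint32_t metadata;    }; /* new: after shinfo */

int main(void)
{
	/* An application built against the old header writes the value at
	 * offset 44; a PMD built against the new header reads it at offset
	 * 112. Same PKT_TX_METADATA flag, different memory, hence the ABI
	 * break even with an unchanged structure size. */
	printf("old offset: %zu, new offset: %zu\n",
	       offsetof(struct mbuf_v1, tx_metadata),
	       offsetof(struct mbuf_v2, metadata));
	return 0;
}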

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 10:07       ` Olivier Matz
@ 2019-07-10 12:01         ` Bruce Richardson
  2019-07-10 12:26           ` Thomas Monjalon
  0 siblings, 1 reply; 34+ messages in thread
From: Bruce Richardson @ 2019-07-10 12:01 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Yongseok Koh, shahafs, thomas, ferruh.yigit, arybchenko,
	adrien.mazarguil, dev, viacheslavo

On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > Hi Yongseok,
> > > 
> > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > 
> > > > This patch extends the usability.
> > > > 
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > 
> > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > matched by another rule. This new action allows metadata to be set as a
> > > > result of flow match.
> > > > 
> > > > 2) Metadata on ingress
> > > > 
> > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > SET_META action and matched by META item like Tx. The final value set by
> > > > the action will be delivered to application via mbuf metadata field with
> > > > PKT_RX_METADATA ol_flag.
> > > > 
> > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > 
> > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > propagated to the other path depending on HW capability.
> > > > 
> > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > 
> > > (...)
> > > 
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -200,6 +200,11 @@ extern "C" {
> > > >  
> > > >  /* add new RX flags here */
> > > >  
> > > > +/**
> > > > + * Indicate that mbuf has metadata from device.
> > > > + */
> > > > +#define PKT_RX_METADATA	(1ULL << 23)
> > > > +
> > > >  /* add new TX flags here */
> > > >  
> > > >  /**
> > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > >  			/**< User defined tags. See rte_distributor_process() */
> > > >  			uint32_t usr;
> > > >  		} hash;                   /**< hash information */
> > > > -		struct {
> > > > -			/**
> > > > -			 * Application specific metadata value
> > > > -			 * for egress flow rule match.
> > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > -			 * Located here to allow conjunct use
> > > > -			 * with hash.sched.hi.
> > > > -			 */
> > > > -			uint32_t tx_metadata;
> > > > -			uint32_t reserved;
> > > > -		};
> > > >  	};
> > > >  
> > > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > >  	 */
> > > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > > >  
> > > > +	/** Application specific metadata value for flow rule match.
> > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > +	 */
> > > > +	uint32_t metadata;
> > > > +
> > > >  } __rte_cache_aligned;
> > > 
> > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > deprecation notice.
> > > 
> > Does it actually break the ABI? Adding a new field to the mbuf should only
> > break the ABI if it either causes new fields to move or changes the
> > structure size. Since this is at the end, it's not going to move any older
> > fields, and since everything is cache-aligned I don't think the structure
> > size changes either.
> 
> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> flag is set, the associated value is put in m->tx_metadata (offset 44 on
> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> these 2 versions are not binary compatible.
> 
> Anyway, at least it breaks the API.

Ok, I misunderstood. I thought it was the structure change itself you were
saying broke the ABI. Yes, putting the data in a different place is indeed
an ABI break.

/Bruce

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 12:01         ` Bruce Richardson
@ 2019-07-10 12:26           ` Thomas Monjalon
  2019-07-10 16:37             ` Yongseok Koh
  0 siblings, 1 reply; 34+ messages in thread
From: Thomas Monjalon @ 2019-07-10 12:26 UTC (permalink / raw)
  To: Bruce Richardson, Olivier Matz, arybchenko, adrien.mazarguil
  Cc: Yongseok Koh, shahafs, ferruh.yigit, dev, viacheslavo

10/07/2019 14:01, Bruce Richardson:
> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> > On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > > 
> > > > > This patch extends the usability.
> > > > > 
> > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > > 
> > > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > > matched by another rule. This new action allows metadata to be set as a
> > > > > result of flow match.
> > > > > 
> > > > > 2) Metadata on ingress
> > > > > 
> > > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > > SET_META action and matched by META item like Tx. The final value set by
> > > > > the action will be delivered to application via mbuf metadata field with
> > > > > PKT_RX_METADATA ol_flag.
> > > > > 
> > > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > > 
> > > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > > propagated to the other path depending on HW capability.
> > > > > 
> > > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > > 
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > > >  			/**< User defined tags. See rte_distributor_process() */
> > > > >  			uint32_t usr;
> > > > >  		} hash;                   /**< hash information */
> > > > > -		struct {
> > > > > -			/**
> > > > > -			 * Application specific metadata value
> > > > > -			 * for egress flow rule match.
> > > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > > -			 * Located here to allow conjunct use
> > > > > -			 * with hash.sched.hi.
> > > > > -			 */
> > > > > -			uint32_t tx_metadata;
> > > > > -			uint32_t reserved;
> > > > > -		};
> > > > >  	};
> > > > >  
> > > > >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > > >  	 */
> > > > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > > > >  
> > > > > +	/** Application specific metadata value for flow rule match.
> > > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > > +	 */
> > > > > +	uint32_t metadata;
> > > > > +
> > > > >  } __rte_cache_aligned;
> > > > 
> > > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > > deprecation notice.
> > > > 
> > > Does it actually break the ABI? Adding a new field to the mbuf should only
> > > break the ABI if it either causes new fields to move or changes the
> > > structure size. Since this is at the end, it's not going to move any older
> > > fields, and since everything is cache-aligned I don't think the structure
> > > size changes either.
> > 
> > I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> > flag is set, the associated value is put in m->tx_metadata (offset 44 on
> > x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> > these 2 versions are not binary compatible.
> > 
> > Anyway, at least it breaks the API.
> 
> Ok, I misunderstood. I thought it was the structure change itself you were
> saying broke the ABI. Yes, putting the data in a different place is indeed
> an ABI break.

We could add the new field and keep the old one unused,
so it does not break the ABI.
However I suppose everybody will prefer a version using dynamic fields.
Is someone against using dynamic field for such usage?
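
For reference, a minimal sketch of what carrying this metadata in a dynamic
field could look like, assuming the dynamic mbuf field API from the pending RFC
(rte_mbuf_dynfield_register() / RTE_MBUF_DYNFIELD() as later merged in
rte_mbuf_dyn.h); the field name and helpers below are hypothetical:

#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

/* Offset of the metadata field inside rte_mbuf, resolved at runtime. */
static int flow_metadata_offset = -1;

/* Register a 32-bit per-packet field; "example_flow_metadata" is a
 * made-up name used only for this sketch. */
static int
flow_metadata_register(void)
{
	static const struct rte_mbuf_dynfield desc = {
		.name = "example_flow_metadata",
		.size = sizeof(uint32_t),
		.align = __alignof__(uint32_t),
	};

	flow_metadata_offset = rte_mbuf_dynfield_register(&desc);
	return flow_metadata_offset < 0 ? -1 : 0;
}

/* Accessor, valid only after successful registration. */
static inline uint32_t *
flow_metadata(struct rte_mbuf *m)
{
	return RTE_MBUF_DYNFIELD(m, flow_metadata_offset, uint32_t *);
}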



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 12:26           ` Thomas Monjalon
@ 2019-07-10 16:37             ` Yongseok Koh
  2019-07-11  7:44               ` Adrien Mazarguil
  0 siblings, 1 reply; 34+ messages in thread
From: Yongseok Koh @ 2019-07-10 16:37 UTC (permalink / raw)
  To: Thomas Monjalon, Olivier Matz
  Cc: Bruce Richardson, Andrew Rybchenko, Adrien Mazarguil,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko


> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> 10/07/2019 14:01, Bruce Richardson:
>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>> 
>>>>>> This patch extends the usability.
>>>>>> 
>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>> 
>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>> result of flow match.
>>>>>> 
>>>>>> 2) Metadata on ingress
>>>>>> 
>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>> PKT_RX_METADATA ol_flag.
>>>>>> 
>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>> 
>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>> propagated to the other path depending on HW capability.
>>>>>> 
>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>> 
>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>> 			uint32_t usr;
>>>>>> 		} hash;                   /**< hash information */
>>>>>> -		struct {
>>>>>> -			/**
>>>>>> -			 * Application specific metadata value
>>>>>> -			 * for egress flow rule match.
>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>> -			 * Located here to allow conjunct use
>>>>>> -			 * with hash.sched.hi.
>>>>>> -			 */
>>>>>> -			uint32_t tx_metadata;
>>>>>> -			uint32_t reserved;
>>>>>> -		};
>>>>>> 	};
>>>>>> 
>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>> 	 */
>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>> 
>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>> +	 */
>>>>>> +	uint32_t metadata;
>>>>>> +
>>>>>> } __rte_cache_aligned;
>>>>> 
>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>> deprecation notice.
>>>>> 
>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>> break the ABI if it either causes new fields to move or changes the
>>>> structure size. Since this is at the end, it's not going to move any older
>>>> fields, and since everything is cache-aligned I don't think the structure
>>>> size changes either.
>>> 
>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>> these 2 versions are not binary compatible.
>>> 
>>> Anyway, at least it breaks the API.
>> 
>> Ok, I misunderstood. I thought it was the structure change itself you were
>> saying broke the ABI. Yes, putting the data in a different place is indeed
>> an ABI break.
> 
> We could add the new field and keep the old one unused,
> so it does not break the ABI.

Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
the new one at the end and make it used with the new PKT_RX_METADATA.

> However I suppose everybody will prefer a version using dynamic fields.
> Is someone against using dynamic field for such usage?

However, given that the amazing dynamic fields is coming soon (thanks for your
effort, Olivier and Thomas!), I'd be honored to be the first user of it.

Olivier, I'll take a look at your RFC.


Thanks,
Yongseok


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-09  8:38         ` Adrien Mazarguil
@ 2019-07-11  1:59           ` Yongseok Koh
  2019-10-08 12:57             ` Yigit, Ferruh
  0 siblings, 1 reply; 34+ messages in thread
From: Yongseok Koh @ 2019-07-11  1:59 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > > 
> > > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> > >> A tag is a transient data which can be used during flow match. This can be
> > >> used to store match result from a previous table so that the same pattern
> > >> need not be matched again on the next table. Even if outer header is
> > >> decapsulated on the previous match, the match result can be kept.
> > >> 
> > >> Some device expose internal registers of its flow processing pipeline and
> > >> those registers are quite useful for stateful connection tracking as it
> > >> keeps status of flow matching. Multiple tags are supported by specifying
> > >> index.
> > >> 
> > >> Example testpmd commands are:
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> > >>            eth ... / end
> > >>    actions ... jump group 2 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> > >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >>  flow create 0 ingress group 2
> > >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > 
> > > Hi Yongseok,
> > > 
> > > Only high level questions for now, while it unquestionably looks useful,
> > > from a user standpoint exposing the separate index seems redundant and not
> > > necessarily convenient. Using the following example to illustrate:
> > > 
> > > actions set_tag index 3 value 0x123456 mask 0xfffff
> > > 
> > > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > > 
> > > I might be missing something, but why isn't this enough:
> > > 
> > > pattern tag index is 3 # match whatever is stored at index 3
> > > 
> > > Assuming it can work, then why bother with providing value spec/mask on
> > > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > > be matched by a subsequent flow rule and that's it. It even seems like
> > > relying on the index only on both occasions is enough for identification.
> > > 
> > > Same question for the opposite approach; relying on the value, never
> > > mentioning the index.
> > > 
> > > I'm under the impression that the index is a hardware-specific constraint
> > > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > > PMD could keep track of used indices without having them exposed through the
> > > public API.
> > 
> > 
> > Thank you for review, Adrien.
> > Hope you are doing well. It's been long since we talked each other. :-)
> 
> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
> answer these days...
> 
>  <dev@dpdk.org> hey!
>  <dev@dpdk.org> no private talks!
> 
> Back to the topic:
> 
> > Your approach will work too in general but we have a request from customer that
> > they want to partition this limited tag storage. Assuming that HW exposes 32bit
> > tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> > store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> > table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> > is better to provide mask when setting/matching the value. Even some customer
> > wants to store multiple flags bit by bit like ol_flags. They do want to alter
> > only partial bits.
> > 
> > And for the index, it is to reference an entry of tags array as HW can provide
> > larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> > storage which users can use for their own sake.
> > 	tag[0], tag[1], tag[2], tag[3]
> 
> OK, looks like I missed the point then. I initially took it for a funky
> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
> (ingress extended [1]) but while it could be used like that, it's more of a
> way to temporarily store and retrieve a small amount of data, correct?

Correct.

> Out of curiosity, are these registers independent from META and other
> items/actions in mlx5, otherwise what happens if they are combined?

I thought about combining them but I chose this way, because the tag is
transient. META can be set by the packet descriptor on Tx and can be delivered
to the host via mbuf on Rx, but this TAG item can't. If I combined them, users
would have to query this capability for each 32b storage. And there should also
be a way to request data from such storage (i.e. a new action, e.g. copy_meta).
Let's say there are 4x32b storages - meta[4]. If the user wants to get one 32b
value (meta[i]) out of them into mbuf->metadata, it would be something like,
	ingress / pattern .. /
	actions ... set_meta index i data x / copy_meta_to_rx index i
And if the user wants to set meta[i] via mbuf on Tx,
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

Of course, the user is also responsible for querying these capabilities for
each meta[] storage.

As copy_meta_to_tx/rx isn't a real action, this example would confuse users.
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

Users might misunderstand the order of the two things - the meta item and the
copy_meta action. I also thought about having capability bits for each meta[]
storage but that also looked complex.

I do think rte_flow item/action is better to be simple, atomic and intuitive.
That's why I made this choice.
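
For reference, a rough sketch of the multi-table tag usage from the commit log
above through the C API; struct and enum names follow what was eventually
merged as RTE_FLOW_ITEM_TYPE_TAG / RTE_FLOW_ACTION_TYPE_SET_TAG, and the real
match items, fate actions and error handling are omitted:

#include <rte_flow.h>

static void
tag_example(uint16_t port_id)
{
	struct rte_flow_error err;

	/* Group 0: ... match ... / set_tag index 2 / jump group 1 */
	struct rte_flow_attr attr0 = { .group = 0, .ingress = 1 };
	struct rte_flow_action_set_tag set_tag = {
		.index = 2, .data = 0xaa00bb, .mask = 0xffff00ff,
	};
	struct rte_flow_action_jump jump = { .group = 1 };
	struct rte_flow_item pattern0[] = {
		/* ... application-specific match items ... */
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions0[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &set_tag },
		{ .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr0, pattern0, actions0, &err);

	/* Group 1: match what group 0 stored in tag[2] instead of
	 * re-matching the (possibly decapsulated) outer headers. */
	struct rte_flow_attr attr1 = { .group = 1, .ingress = 1 };
	struct rte_flow_item_tag tag_spec = { .index = 2, .data = 0xaa00bb };
	struct rte_flow_item_tag tag_mask = { .index = 0xff, .data = 0xffff00ff };
	struct rte_flow_item pattern1[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_TAG,
		  .spec = &tag_spec, .mask = &tag_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions1[] = {
		/* ... queue/rss/etc. ... */
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr1, pattern1, actions1, &err);
}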

> Are there other uses for these registers? Say, referencing their contents
> from other places in a flow rule so they don't have to be hard-coded?

Possible.
Actually, this feature is needed by connection tracking of OVS-DPDK.

> Right now I'm still uncomfortable with such a feature in the public API
> because compared to META [1], this approach looks very hardware-specific and
> seemingly difficult to map on different HW architectures.

I wouldn't say it is HW-specific. Like I explained above, I just defined this
new item/action to make things easy to use and intuitive.

> However, the main problem is that as described, its end purpose seems
> redundant with META, which I think can cover the use cases you gave. So what
> can an application do with this that couldn't be done in a more generic
> fashion through META?
> 
> I may still be missing something and I'm open to ideas, but assuming it
> doesn't make it into the public rte_flow API, it remains an interesting
> feature on its own merit which could be added to DPDK as PMD-specific
> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
> for PMDs to expose a public header that dedicated applications can include
> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> No problem with that.

That's good info, thanks. But still, considering connection-tracking-like
use-cases, this transient storage in a multi-table flow pipeline is quite
useful.


thanks,
Yongseok

> [1] "[PATCH] ethdev: extend flow metadata"
>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
> 
> [2] "Negative types"
>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-10 16:37             ` Yongseok Koh
@ 2019-07-11  7:44               ` Adrien Mazarguil
  2019-07-14 11:46                 ` Andrew Rybchenko
  0 siblings, 1 reply; 34+ messages in thread
From: Adrien Mazarguil @ 2019-07-11  7:44 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Andrew Rybchenko, Shahaf Shuler, Ferruh Yigit, dev,
	Slava Ovsiienko

On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> 
> > On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > 10/07/2019 14:01, Bruce Richardson:
> >> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> >>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> >>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> >>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> >>>>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
> >>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> >>>>>> 
> >>>>>> This patch extends the usability.
> >>>>>> 
> >>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>> 
> >>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
> >>>>>> matched by another rule. This new action allows metadata to be set as a
> >>>>>> result of flow match.
> >>>>>> 
> >>>>>> 2) Metadata on ingress
> >>>>>> 
> >>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
> >>>>>> SET_META action and matched by META item like Tx. The final value set by
> >>>>>> the action will be delivered to application via mbuf metadata field with
> >>>>>> PKT_RX_METADATA ol_flag.
> >>>>>> 
> >>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
> >>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
> >>>>>> 
> >>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> >>>>>> propagated to the other path depending on HW capability.
> >>>>>> 
> >>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>> 
> >>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
> >>>>>> 			/**< User defined tags. See rte_distributor_process() */
> >>>>>> 			uint32_t usr;
> >>>>>> 		} hash;                   /**< hash information */
> >>>>>> -		struct {
> >>>>>> -			/**
> >>>>>> -			 * Application specific metadata value
> >>>>>> -			 * for egress flow rule match.
> >>>>>> -			 * Valid if PKT_TX_METADATA is set.
> >>>>>> -			 * Located here to allow conjunct use
> >>>>>> -			 * with hash.sched.hi.
> >>>>>> -			 */
> >>>>>> -			uint32_t tx_metadata;
> >>>>>> -			uint32_t reserved;
> >>>>>> -		};
> >>>>>> 	};
> >>>>>> 
> >>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> >>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
> >>>>>> 	 */
> >>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>>>> 
> >>>>>> +	/** Application specific metadata value for flow rule match.
> >>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> >>>>>> +	 */
> >>>>>> +	uint32_t metadata;
> >>>>>> +
> >>>>>> } __rte_cache_aligned;
> >>>>> 
> >>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
> >>>>> deprecation notice.
> >>>>> 
> >>>> Does it actually break the ABI? Adding a new field to the mbuf should only
> >>>> break the ABI if it either causes new fields to move or changes the
> >>>> structure size. Since this is at the end, it's not going to move any older
> >>>> fields, and since everything is cache-aligned I don't think the structure
> >>>> size changes either.
> >>> 
> >>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> >>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
> >>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> >>> these 2 versions are not binary compatible.
> >>> 
> >>> Anyway, at least it breaks the API.
> >> 
> >> Ok, I misunderstood. I thought it was the structure change itself you were
> >> saying broke the ABI. Yes, putting the data in a different place is indeed
> >> an ABI break.
> > 
> > We could add the new field and keep the old one unused,
> > so it does not break the ABI.
> 
> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
> the new one at the end and make it used with the new PKT_RX_METADATA.
> 
> > However I suppose everybody will prefer a version using dynamic fields.
> > Is someone against using dynamic field for such usage?
> 
> However, given that the amazing dynamic fields is coming soon (thanks for your
> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
> 
> Olivier, I'll take a look at your RFC.

Just got a crazy idea while reading this thread... How about repurposing
that "reserved" field as "rx_metadata" in the meantime?

I know reserved fields are cursed and no one's ever supposed to touch them
but this risk is mitigated by having the end user explicitly request its
use, so the patch author (and his relatives) should be safe from the
resulting bad juju.

Joke aside, while I like the idea of Tx/Rx META, I think the similarities
with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
comment when META was originally added to the Tx path, but there's a lot of
overlap between these items/actions, without anything explaining to the end
user how and why they should pick one over the other, if they can be
combined at all and what happens in that case.

All this must be documented, then we should think about unifying their
respective features and deprecate the less capable items/actions. In my
opinion, users need exactly one method to mark/match some mark while
processing Rx/Tx traffic and *optionally* have that mark read from/written
to the mbuf, which may or may not be possible depending on HW features.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-11  7:44               ` Adrien Mazarguil
@ 2019-07-14 11:46                 ` Andrew Rybchenko
  2019-07-29 15:06                   ` Adrien Mazarguil
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Rybchenko @ 2019-07-14 11:46 UTC (permalink / raw)
  To: Adrien Mazarguil, Yongseok Koh
  Cc: Thomas Monjalon, Olivier Matz, Bruce Richardson, Shahaf Shuler,
	Ferruh Yigit, dev, Slava Ovsiienko

On 11.07.2019 10:44, Adrien Mazarguil wrote:
> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
>>>
>>> 10/07/2019 14:01, Bruce Richardson:
>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
>>>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>>>>
>>>>>>>> This patch extends the usability.
>>>>>>>>
>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>>>>
>>>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>>>> result of flow match.
>>>>>>>>
>>>>>>>> 2) Metadata on ingress
>>>>>>>>
>>>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>>>> PKT_RX_METADATA ol_flag.
>>>>>>>>
>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>>>>
>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>>>> propagated to the other path depending on HW capability.
>>>>>>>>
>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>>>> 			uint32_t usr;
>>>>>>>> 		} hash;                   /**< hash information */
>>>>>>>> -		struct {
>>>>>>>> -			/**
>>>>>>>> -			 * Application specific metadata value
>>>>>>>> -			 * for egress flow rule match.
>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>>>> -			 * Located here to allow conjunct use
>>>>>>>> -			 * with hash.sched.hi.
>>>>>>>> -			 */
>>>>>>>> -			uint32_t tx_metadata;
>>>>>>>> -			uint32_t reserved;
>>>>>>>> -		};
>>>>>>>> 	};
>>>>>>>>
>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>>>> 	 */
>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>>>>
>>>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>>>> +	 */
>>>>>>>> +	uint32_t metadata;
>>>>>>>> +
>>>>>>>> } __rte_cache_aligned;
>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>>>> deprecation notice.
>>>>>>>
>>>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>>>> break the ABI if it either causes new fields to move or changes the
>>>>>> structure size. Since this is at the end, it's not going to move any older
>>>>>> fields, and since everything is cache-aligned I don't think the structure
>>>>>> size changes either.
>>>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>>>> these 2 versions are not binary compatible.
>>>>>
>>>>> Anyway, at least it breaks the API.
>>>> Ok, I misunderstood. I thought it was the structure change itself you were
>>>> saying broke the ABI. Yes, putting the data in a different place is indeed
>>>> an ABI break.
>>> We could add the new field and keep the old one unused,
>>> so it does not break the ABI.
>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
>> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
>> the new one at the end and make it used with the new PKT_RX_METADATA.
>>
>>> However I suppose everybody will prefer a version using dynamic fields.
>>> Is someone against using dynamic field for such usage?
>> However, given that the amazing dynamic fields is coming soon (thanks for your
>> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
>>
>> Olivier, I'll take a look at your RFC.
> Just got a crazy idea while reading this thread... How about repurposing
> that "reserved" field as "rx_metadata" in the meantime?

It overlaps with hash.fdir.hi which has RSS hash.

> I know reserved fields are cursed and no one's ever supposed to touch them
> but this risk is mitigated by having the end user explicitly request its
> use, so the patch author (and his relatives) should be safe from the
> resulting bad juju.
>
> Joke aside, while I like the idea of Tx/Rx META, I think the similarities
> with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
> comment when META was originally added to the Tx path, but there's a lot of
> overlap between these items/actions, without anything explaining to the end
> user how and why they should pick one over the other, if they can be
> combined at all and what happens in that case.
>
> All this must be documented, then we should think about unifying their
> respective features and deprecate the less capable items/actions. In my
> opinion, users need exactly one method to mark/match some mark while
> processing Rx/Tx traffic and *optionally* have that mark read from/written
> to the mbuf, which may or may not be possible depending on HW features.
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-14 11:46                 ` Andrew Rybchenko
@ 2019-07-29 15:06                   ` Adrien Mazarguil
  2019-10-08 12:51                     ` Yigit, Ferruh
  0 siblings, 1 reply; 34+ messages in thread
From: Adrien Mazarguil @ 2019-07-29 15:06 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko

On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
> On 11.07.2019 10:44, Adrien Mazarguil wrote:
> > On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> > > > On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > 10/07/2019 14:01, Bruce Richardson:
> > > > > On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> > > > > > On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
> > > > > > > On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> > > > > > > > On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> > > > > > > > > Currently, metadata can be set on egress path via mbuf tx_meatadata field
> > > > > > > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > > > > > > > 
> > > > > > > > > This patch extends the usability.
> > > > > > > > > 
> > > > > > > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > > > > > > > 
> > > > > > > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > > > > > > matched by another rule. This new action allows metadata to be set as a
> > > > > > > > > result of flow match.
> > > > > > > > > 
> > > > > > > > > 2) Metadata on ingress
> > > > > > > > > 
> > > > > > > > > There's also need to support metadata on packet Rx. Metadata can be set by
> > > > > > > > > SET_META action and matched by META item like Tx. The final value set by
> > > > > > > > > the action will be delivered to application via mbuf metadata field with
> > > > > > > > > PKT_RX_METADATA ol_flag.
> > > > > > > > > 
> > > > > > > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > > > > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > > > > > > > 
> > > > > > > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > > > > > > propagated to the other path depending on HW capability.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > > > > @@ -648,17 +653,6 @@ struct rte_mbuf {
> > > > > > > > > 			/**< User defined tags. See rte_distributor_process() */
> > > > > > > > > 			uint32_t usr;
> > > > > > > > > 		} hash;                   /**< hash information */
> > > > > > > > > -		struct {
> > > > > > > > > -			/**
> > > > > > > > > -			 * Application specific metadata value
> > > > > > > > > -			 * for egress flow rule match.
> > > > > > > > > -			 * Valid if PKT_TX_METADATA is set.
> > > > > > > > > -			 * Located here to allow conjunct use
> > > > > > > > > -			 * with hash.sched.hi.
> > > > > > > > > -			 */
> > > > > > > > > -			uint32_t tx_metadata;
> > > > > > > > > -			uint32_t reserved;
> > > > > > > > > -		};
> > > > > > > > > 	};
> > > > > > > > > 
> > > > > > > > > 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
> > > > > > > > > @@ -727,6 +721,11 @@ struct rte_mbuf {
> > > > > > > > > 	 */
> > > > > > > > > 	struct rte_mbuf_ext_shared_info *shinfo;
> > > > > > > > > 
> > > > > > > > > +	/** Application specific metadata value for flow rule match.
> > > > > > > > > +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> > > > > > > > > +	 */
> > > > > > > > > +	uint32_t metadata;
> > > > > > > > > +
> > > > > > > > > } __rte_cache_aligned;
> > > > > > > > This will break the ABI, so we cannot put it in 19.08, and we need a
> > > > > > > > deprecation notice.
> > > > > > > > 
> > > > > > > Does it actually break the ABI? Adding a new field to the mbuf should only
> > > > > > > break the ABI if it either causes new fields to move or changes the
> > > > > > > structure size. Since this is at the end, it's not going to move any older
> > > > > > > fields, and since everything is cache-aligned I don't think the structure
> > > > > > > size changes either.
> > > > > > I think it does break the ABI: in previous version, when the PKT_TX_METADATA
> > > > > > flag is set, the associated value is put in m->tx_metadata (offset 44 on
> > > > > > x86-64), and in the next version, it will be in m->metadata (offset 112). So,
> > > > > > these 2 versions are not binary compatible.
> > > > > > 
> > > > > > Anyway, at least it breaks the API.
> > > > > Ok, I misunderstood. I thought it was the structure change itself you were
> > > > > saying broke the ABI. Yes, putting the data in a different place is indeed
> > > > > an ABI break.
> > > > We could add the new field and keep the old one unused,
> > > > so it does not break the ABI.
> > > Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
> > > keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
> > > the new one at the end and make it used with the new PKT_RX_METADATA.
> > > 
> > > > However I suppose everybody will prefer a version using dynamic fields.
> > > > Is someone against using dynamic field for such usage?
> > > However, given that the amazing dynamic fields is coming soon (thanks for your
> > > effort, Olivier and Thomas!), I'd be honored to be the first user of it.
> > > 
> > > Olivier, I'll take a look at your RFC.
> > Just got a crazy idea while reading this thread... How about repurposing
> > that "reserved" field as "rx_metadata" in the meantime?
> 
> It overlaps with hash.fdir.hi which has RSS hash.

While it does overlap with hash.fdir.hi, isn't the RSS hash stored in the
"rss" field overlapping with hash.fdir.lo? (see struct rte_flow_action_rss)

hash.fdir.hi was originally used by FDIR and later repurposed by rte_flow
for its MARK action, which neatly qualifies as Rx metadata so renaming
"reserved" as "rx_metadata" could already make sense.

That is, assuming users do not need two different kinds of Rx metadata
returned simultaneously with their packets. I think it's safe.
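
For reference, a trimmed stand-in for the union in question (a sketch
simplified from the pre-patch rte_mbuf.h quoted in this thread; the sched and
txadapter members are omitted):

#include <stdint.h>

/* "rss" shares its 4 bytes with fdir.lo and tx_metadata, while fdir.hi
 * shares its 4 bytes with "reserved" and already carries the FDIR ID /
 * rte_flow MARK value. */
struct mbuf_hash_area {
	union {
		union {
			uint32_t rss;           /* RSS hash, overlaps fdir.lo */
			struct {
				uint32_t lo;    /* second 4 flexible FDIR bytes */
				uint32_t hi;    /* FDIR ID / MARK value */
			} fdir;
			uint32_t usr;           /* rte_distributor user tag */
		} hash;
		struct {
			uint32_t tx_metadata;   /* valid with PKT_TX_METADATA */
			uint32_t reserved;      /* the candidate "rx_metadata" */
		};
	};
};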

> > I know reserved fields are cursed and no one's ever supposed to touch them
> > but this risk is mitigated by having the end user explicitly request its
> > use, so the patch author (and his relatives) should be safe from the
> > resulting bad juju.
> > 
> > Joke aside, while I like the idea of Tx/Rx META, I think the similarities
> > with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
> > comment when META was originally added to the Tx path, but there's a lot of
> > overlap between these items/actions, without anything explaining to the end
> > user how and why they should pick one over the other, if they can be
> > combined at all and what happens in that case.
> > 
> > All this must be documented, then we should think about unifying their
> > respective features and deprecate the less capable items/actions. In my
> > opinion, users need exactly one method to mark/match some mark while
> > processing Rx/Tx traffic and *optionally* have that mark read from/written
> > to the mbuf, which may or may not be possible depending on HW features.

Thoughts regarding this suggestion? From a user perspective I think all
these actions should be unified but maybe there are good reasons to keep
them separate?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-07-29 15:06                   ` Adrien Mazarguil
@ 2019-10-08 12:51                     ` Yigit, Ferruh
  2019-10-08 13:17                       ` Slava Ovsiienko
  0 siblings, 1 reply; 34+ messages in thread
From: Yigit, Ferruh @ 2019-10-08 12:51 UTC (permalink / raw)
  To: Adrien Mazarguil, Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev, Slava Ovsiienko

On 7/29/2019 4:06 PM, Adrien Mazarguil wrote:
> On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
>> On 11.07.2019 10:44, Adrien Mazarguil wrote:
>>> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
>>>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>>
>>>>> 10/07/2019 14:01, Bruce Richardson:
>>>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
>>>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson wrote:
>>>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
>>>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
>>>>>>>>>> Currently, metadata can be set on egress path via mbuf tx_meatadata field
>>>>>>>>>> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
>>>>>>>>>>
>>>>>>>>>> This patch extends the usability.
>>>>>>>>>>
>>>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
>>>>>>>>>>
>>>>>>>>>> When supporting multiple tables, Tx metadata can also be set by a rule and
>>>>>>>>>> matched by another rule. This new action allows metadata to be set as a
>>>>>>>>>> result of flow match.
>>>>>>>>>>
>>>>>>>>>> 2) Metadata on ingress
>>>>>>>>>>
>>>>>>>>>> There's also need to support metadata on packet Rx. Metadata can be set by
>>>>>>>>>> SET_META action and matched by META item like Tx. The final value set by
>>>>>>>>>> the action will be delivered to application via mbuf metadata field with
>>>>>>>>>> PKT_RX_METADATA ol_flag.
>>>>>>>>>>
>>>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate new field and
>>>>>>>>>> renamed to 'metadata' to support both Rx and Tx metadata.
>>>>>>>>>>
>>>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
>>>>>>>>>> propagated to the other path depending on HW capability.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
>>>>>>>>>> 			/**< User defined tags. See rte_distributor_process() */
>>>>>>>>>> 			uint32_t usr;
>>>>>>>>>> 		} hash;                   /**< hash information */
>>>>>>>>>> -		struct {
>>>>>>>>>> -			/**
>>>>>>>>>> -			 * Application specific metadata value
>>>>>>>>>> -			 * for egress flow rule match.
>>>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
>>>>>>>>>> -			 * Located here to allow conjunct use
>>>>>>>>>> -			 * with hash.sched.hi.
>>>>>>>>>> -			 */
>>>>>>>>>> -			uint32_t tx_metadata;
>>>>>>>>>> -			uint32_t reserved;
>>>>>>>>>> -		};
>>>>>>>>>> 	};
>>>>>>>>>>
>>>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set. */
>>>>>>>>>> @@ -727,6 +721,11 @@ struct rte_mbuf {
>>>>>>>>>> 	 */
>>>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
>>>>>>>>>>
>>>>>>>>>> +	/** Application specific metadata value for flow rule match.
>>>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
>>>>>>>>>> +	 */
>>>>>>>>>> +	uint32_t metadata;
>>>>>>>>>> +
>>>>>>>>>> } __rte_cache_aligned;
>>>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we need a
>>>>>>>>> deprecation notice.
>>>>>>>>>
>>>>>>>> Does it actually break the ABI? Adding a new field to the mbuf should only
>>>>>>>> break the ABI if it either causes new fields to move or changes the
>>>>>>>> structure size. Since this is at the end, it's not going to move any older
>>>>>>>> fields, and since everything is cache-aligned I don't think the structure
>>>>>>>> size changes either.
>>>>>>> I think it does break the ABI: in previous version, when the PKT_TX_METADATA
>>>>>>> flag is set, the associated value is put in m->tx_metadata (offset 44 on
>>>>>>> x86-64), and in the next version, it will be in m->metadata (offset 112). So,
>>>>>>> these 2 versions are not binary compatible.
>>>>>>>
>>>>>>> Anyway, at least it breaks the API.
>>>>>> Ok, I misunderstood. I thought it was the structure change itself you were
>>>>>> saying broke the ABI. Yes, putting the data in a different place is indeed
>>>>>> an ABI break.
>>>>> We could add the new field and keep the old one unused,
>>>>> so it does not break the ABI.
>>>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to break it, I can
>>>> keep the current union'd field (tx_metadata) as is with PKT_TX_METADATA, add
>>>> the new one at the end and make it used with the new PKT_RX_METADATA.
>>>>
>>>>> However I suppose everybody will prefer a version using dynamic fields.
>>>>> Is someone against using dynamic field for such usage?
>>>> However, given that the amazing dynamic fields is coming soon (thanks for your
>>>> effort, Olivier and Thomas!), I'd be honored to be the first user of it.
>>>>
>>>> Olivier, I'll take a look at your RFC.
>>> Just got a crazy idea while reading this thread... How about repurposing
>>> that "reserved" field as "rx_metadata" in the meantime?
>>
>> It overlaps with hash.fdir.hi which has RSS hash.
> 
> While it does overlap with hash.fdir.hi, isn't the RSS hash stored in the
> "rss" field overlapping with hash.fdir.lo? (see struct rte_flow_action_rss)
> 
> hash.fdir.hi was originally used by FDIR and later repurposed by rte_flow
> for its MARK action, which neatly qualifies as Rx metadata so renaming
> "reserved" as "rx_metadata" could already make sense.
> 
> That is, assuming users do not need two different kinds of Rx metadata
> returned simultaneously with their packets. I think it's safe.
> 
>>> I know reserved fields are cursed and no one's ever supposed to touch them
>>> but this risk is mitigated by having the end user explicitly request its
>>> use, so the patch author (and his relatives) should be safe from the
>>> resulting bad juju.
>>>
>>> Joke aside, while I like the idea of Tx/Rx META, I think the similarities
>>> with MARK (and TAG eventually) is a problem. I wasn't available and couldn't
>>> comment when META was originally added to the Tx path, but there's a lot of
>>> overlap between these items/actions, without anything explaining to the end
>>> user how and why they should pick one over the other, if they can be
>>> combined at all and what happens in that case.
>>>
>>> All this must be documented, then we should think about unifying their
>>> respective features and deprecate the less capable items/actions. In my
>>> opinion, users need exactly one method to mark/match some mark while
>>> processing Rx/Tx traffic and *optionally* have that mark read from/written
>>> to the mbuf, which may or may not be possible depending on HW features.
> 
> Thoughts regarding this suggestion? From a user perspective I think all
> these actions should be unified but maybe there are good reasons to keep
> them separate?
> 

I think the more recent plan is to introduce dynamic fields for the remaining
16 bytes in the second cacheline.

I will update the patch as rejected; is there any objection?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-07-11  1:59           ` Yongseok Koh
@ 2019-10-08 12:57             ` Yigit, Ferruh
  2019-10-08 13:18               ` Slava Ovsiienko
  0 siblings, 1 reply; 34+ messages in thread
From: Yigit, Ferruh @ 2019-10-08 12:57 UTC (permalink / raw)
  To: Yongseok Koh, Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev, Slava Ovsiienko

On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
>> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
>>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>>>>
>>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>>>> A tag is a transient data which can be used during flow match. This can be
>>>>> used to store match result from a previous table so that the same pattern
>>>>> need not be matched again on the next table. Even if outer header is
>>>>> decapsulated on the previous match, the match result can be kept.
>>>>>
>>>>> Some device expose internal registers of its flow processing pipeline and
>>>>> those registers are quite useful for stateful connection tracking as it
>>>>> keeps status of flow matching. Multiple tags are supported by specifying
>>>>> index.
>>>>>
>>>>> Example testpmd commands are:
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>>>            eth ... / end
>>>>>    actions ... jump group 2 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>>  flow create 0 ingress group 2
>>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>
>>>> Hi Yongseok,
>>>>
>>>> Only high level questions for now, while it unquestionably looks useful,
>>>> from a user standpoint exposing the separate index seems redundant and not
>>>> necessarily convenient. Using the following example to illustrate:
>>>>
>>>> actions set_tag index 3 value 0x123456 mask 0xfffff
>>>>
>>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>>>>
>>>> I might be missing something, but why isn't this enough:
>>>>
>>>> pattern tag index is 3 # match whatever is stored at index 3
>>>>
>>>> Assuming it can work, then why bother with providing value spec/mask on
>>>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>>>> be matched by a subsequent flow rule and that's it. It even seems like
>>>> relying on the index only on both occasions is enough for identification.
>>>>
>>>> Same question for the opposite approach; relying on the value, never
>>>> mentioning the index.
>>>>
>>>> I'm under the impression that the index is a hardware-specific constraint
>>>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>>>> PMD could keep track of used indices without having them exposed through the
>>>> public API.
>>>
>>>
>>> Thank you for review, Adrien.
>>> Hope you are doing well. It's been long since we talked each other. :-)
>>
>> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
>> answer these days...
>>
>>  <dev@dpdk.org> hey!
>>  <dev@dpdk.org> no private talks!
>>
>> Back to the topic:
>>
>>> Your approach will work too in general but we have a request from customer that
>>> they want to partition this limited tag storage. Assuming that HW exposes 32bit
>>> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
>>> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
>>> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
>>> is better to provide mask when setting/matching the value. Even some customer
>>> wants to store multiple flags bit by bit like ol_flags. They do want to alter
>>> only partial bits.
>>>
>>> And for the index, it is to reference an entry of tags array as HW can provide
>>> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
>>> storage which users can use for their own sake.
>>> 	tag[0], tag[1], tag[2], tag[3]
>>
>> OK, looks like I missed the point then. I initially took it for a funky
>> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
>> (ingress extended [1]) but while it could be used like that, it's more of a
>> way to temporarily store and retrieve a small amount of data, correct?
> 
> Correct.
> 
>> Out of curiosity, are these registers independent from META and other
>> items/actions in mlx5, otherwise what happens if they are combined?
> 
> I thought about combining it but I chose this way. Because it is transient. META
> can be set by packet descriptor on Tx and can be delivered to host via mbuf on
> Rx, but this TAG item can't. If I combine it, users have to query this
> capability for each 32b storage. And also, there should be a way to request data
> from such storages (i.e. new action , e.g. copy_meta). Let's say there are 4x32b
> storages - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> mbuf->metadata, it should be something like,
> 	ingress / pattern .. /
> 	actions ... set_meta index i data x / copy_meta_to_rx index i
> And if user wants to set meta[i] via mbuf on Tx,
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> For sure, user is also responsible for querying these capabilities per each
> meta[] storage.
> 
> As copy_meta_to_tx/rx isn't a real action, this example would confuse user.
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> User might misunderstand the order of two things - item meta and copy_meta
> action. I also thought about having capability bits per each meta[] storage but
> it also looked complex.
> 
> I do think rte_flow item/action is better to be simple, atomic and intuitive.
> That's why I made this choice.
> 
>> Are there other uses for these registers? Say, referencing their contents
>> from other places in a flow rule so they don't have to be hard-coded?
> 
> Possible.
> Actually, this feature is needed by connection tracking of OVS-DPDK.
> 
>> Right now I'm still uncomfortable with such a feature in the public API
>> because compared to META [1], this approach looks very hardware-specific and
>> seemingly difficult to map on different HW architectures.
> 
> I wouldn't say it is HW-specific. Like I explained above, I just define this new
> item/action to make things easy-to-use and intuitive.
> 
>> However, the main problem is that as described, its end purpose seems
>> redundant with META, which I think can cover the use cases you gave. So what
>> can an application do with this that couldn't be done in a more generic
>> fashion through META?
>>
>> I may still be missing something and I'm open to ideas, but assuming it
>> doesn't make it into the public rte_flow API, it remains an interesting
>> feature on its own merit which could be added to DPDK as PMD-specific
>> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
>> for PMDs to expose a public header that dedicated applications can include
>> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
>> No problem with that.
> 
> That's good info. Thanks. But still considering connection-tracking-like
> use-cases, this transient storage on multi-table flow pipeline is quite useful.
> 
> 
> thanks,
> Yongseok
> 
>> [1] "[PATCH] ethdev: extend flow metadata"
>>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
>>
>> [2] "Negative types"
>>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

Is this RFC still valid, will there be any follow up?
If not, I will mark it as rejected in the next few days.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
  2019-10-08 12:51                     ` Yigit, Ferruh
@ 2019-10-08 13:17                       ` Slava Ovsiienko
  0 siblings, 0 replies; 34+ messages in thread
From: Slava Ovsiienko @ 2019-10-08 13:17 UTC (permalink / raw)
  To: Yigit, Ferruh, Adrien Mazarguil, Andrew Rybchenko
  Cc: Yongseok Koh, Thomas Monjalon, Olivier Matz, Bruce Richardson,
	Shahaf Shuler, Ferruh Yigit, dev

> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@linux.intel.com>
> Sent: Tuesday, October 8, 2019 15:51
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>
> Cc: Yongseok Koh <yskoh@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Olivier Matz <olivier.matz@6wind.com>; Bruce
> Richardson <bruce.richardson@intel.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Ferruh Yigit <ferruh.yigit@intel.com>; dev
> <dev@dpdk.org>; Slava Ovsiienko <viacheslavo@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH] ethdev: extend flow metadata
> 
> On 7/29/2019 4:06 PM, Adrien Mazarguil wrote:
> > On Sun, Jul 14, 2019 at 02:46:58PM +0300, Andrew Rybchenko wrote:
> >> On 11.07.2019 10:44, Adrien Mazarguil wrote:
> >>> On Wed, Jul 10, 2019 at 04:37:46PM +0000, Yongseok Koh wrote:
> >>>>> On Jul 10, 2019, at 5:26 AM, Thomas Monjalon
> <thomas@monjalon.net> wrote:
> >>>>>
> >>>>> 10/07/2019 14:01, Bruce Richardson:
> >>>>>> On Wed, Jul 10, 2019 at 12:07:43PM +0200, Olivier Matz wrote:
> >>>>>>> On Wed, Jul 10, 2019 at 10:55:34AM +0100, Bruce Richardson
> wrote:
> >>>>>>>> On Wed, Jul 10, 2019 at 11:31:56AM +0200, Olivier Matz wrote:
> >>>>>>>>> On Thu, Jul 04, 2019 at 04:21:22PM -0700, Yongseok Koh wrote:
> >>>>>>>>>> Currently, metadata can be set on egress path via mbuf
> >>>>>>>>>> tx_metadata field with PKT_TX_METADATA flag and
> RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> >>>>>>>>>>
> >>>>>>>>>> This patch extends the usability.
> >>>>>>>>>>
> >>>>>>>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>>>>>>
> >>>>>>>>>> When supporting multiple tables, Tx metadata can also be set
> >>>>>>>>>> by a rule and matched by another rule. This new action allows
> >>>>>>>>>> metadata to be set as a result of flow match.
> >>>>>>>>>>
> >>>>>>>>>> 2) Metadata on ingress
> >>>>>>>>>>
> >>>>>>>>>> There's also a need to support metadata on packet Rx. Metadata
> >>>>>>>>>> can be set by SET_META action and matched by META item like
> >>>>>>>>>> Tx. The final value set by the action will be delivered to
> >>>>>>>>>> application via mbuf metadata field with PKT_RX_METADATA
> ol_flag.
> >>>>>>>>>>
> >>>>>>>>>> For this purpose, mbuf->tx_metadata is moved as a separate
> >>>>>>>>>> new field and renamed to 'metadata' to support both Rx and Tx
> metadata.
> >>>>>>>>>>
> >>>>>>>>>> For loopback/hairpin packet, metadata set on Rx/Tx may or
> may
> >>>>>>>>>> not be propagated to the other path depending on HW
> capability.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>>> @@ -648,17 +653,6 @@ struct rte_mbuf {
> >>>>>>>>>> 			/**< User defined tags. See
> rte_distributor_process() */
> >>>>>>>>>> 			uint32_t usr;
> >>>>>>>>>> 		} hash;                   /**< hash information */
> >>>>>>>>>> -		struct {
> >>>>>>>>>> -			/**
> >>>>>>>>>> -			 * Application specific metadata value
> >>>>>>>>>> -			 * for egress flow rule match.
> >>>>>>>>>> -			 * Valid if PKT_TX_METADATA is set.
> >>>>>>>>>> -			 * Located here to allow conjunct use
> >>>>>>>>>> -			 * with hash.sched.hi.
> >>>>>>>>>> -			 */
> >>>>>>>>>> -			uint32_t tx_metadata;
> >>>>>>>>>> -			uint32_t reserved;
> >>>>>>>>>> -		};
> >>>>>>>>>> 	};
> >>>>>>>>>>
> >>>>>>>>>> 	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ is set.
> >>>>>>>>>> */ @@ -727,6 +721,11 @@ struct rte_mbuf {
> >>>>>>>>>> 	 */
> >>>>>>>>>> 	struct rte_mbuf_ext_shared_info *shinfo;
> >>>>>>>>>>
> >>>>>>>>>> +	/** Application specific metadata value for flow rule match.
> >>>>>>>>>> +	 * Valid if PKT_RX_METADATA or PKT_TX_METADATA is set.
> >>>>>>>>>> +	 */
> >>>>>>>>>> +	uint32_t metadata;
> >>>>>>>>>> +
> >>>>>>>>>> } __rte_cache_aligned;
> >>>>>>>>> This will break the ABI, so we cannot put it in 19.08, and we
> >>>>>>>>> need a deprecation notice.
> >>>>>>>>>
> >>>>>>>> Does it actually break the ABI? Adding a new field to the mbuf
> >>>>>>>> should only break the ABI if it either causes new fields to
> >>>>>>>> move or changes the structure size. Since this is at the end,
> >>>>>>>> it's not going to move any older fields, and since everything
> >>>>>>>> is cache-aligned I don't think the structure size changes either.
> >>>>>>> I think it does break the ABI: in previous version, when the
> >>>>>>> PKT_TX_METADATA flag is set, the associated value is put in
> >>>>>>> m->tx_metadata (offset 44 on x86-64), and in the next version,
> >>>>>>> it will be in m->metadata (offset 112). So, these 2 versions are not
> binary compatible.
> >>>>>>>
> >>>>>>> Anyway, at least it breaks the API.
> >>>>>> Ok, I misunderstood. I thought it was the structure change itself
> >>>>>> you were saying broke the ABI. Yes, putting the data in a
> >>>>>> different place is indeed an ABI break.
> >>>>> We could add the new field and keep the old one unused, so it does
> >>>>> not break the ABI.
> >>>> Still breaks ABI if PKT_TX_METADATA is set. :-) In order not to
> >>>> break it, I can keep the current union'd field (tx_metadata) as is
> >>>> with PKT_TX_METADATA, add the new one at the end and make it used
> with the new PKT_RX_METADATA.
> >>>>
> >>>>> However I suppose everybody will prefer a version using dynamic
> fields.
> >>>>> Is someone against using dynamic field for such usage?
> >>>> However, given that the amazing dynamic fields is coming soon
> >>>> (thanks for your effort, Olivier and Thomas!), I'd be honored to be the
> first user of it.
> >>>>
> >>>> Olivier, I'll take a look at your RFC.
> >>> Just got a crazy idea while reading this thread... How about
> >>> repurposing that "reserved" field as "rx_metadata" in the meantime?
> >>
> >> It overlaps with hash.fdir.hi which has RSS hash.
> >
> > While it does overlap with hash.fdir.hi, isn't the RSS hash stored in
> > the "rss" field overlapping with hash.fdir.lo? (see struct
> > rte_flow_action_rss)
> >
> > hash.fdir.hi was originally used by FDIR and later repurposed by
> > rte_flow for its MARK action, which neatly qualifies as Rx metadata so
> > renaming "reserved" as "rx_metadata" could already make sense.
> >
> > That is, assuming users do not need two different kinds of Rx metadata
> > returned simultaneously with their packets. I think it's safe.
> >
> >>> I know reserved fields are cursed and no one's ever supposed to
> >>> touch them but this risk is mitigated by having the end user
> >>> explicitly request its use, so the patch author (and his relatives)
> >>> should be safe from the resulting bad juju.
> >>>
> >>> Joke aside, while I like the idea of Tx/Rx META, I think the
> >>> similarities with MARK (and TAG eventually) is a problem. I wasn't
> >>> available and couldn't comment when META was originally added to the
> >>> Tx path, but there's a lot of overlap between these items/actions,
> >>> without anything explaining to the end user how and why they should
> >>> pick one over the other, if they can be combined at all and what happens
> in that case.
> >>>
> >>> All this must be documented, then we should think about unifying
> >>> their respective features and deprecate the less capable
> >>> items/actions. In my opinion, users need exactly one method to
> >>> mark/match some mark while processing Rx/Tx traffic and *optionally*
> >>> have that mark read from/written to the mbuf, which may or may not be
> possible depending on HW features.
> >
> > Thoughts regarding this suggestion? From a user perspective I think
> > all these actions should be unified but maybe there are good reasons
> > to keep them separate?
> >
> 
> I think more recent plan is introducing dynamic fields for the remaining 16
> bytes in the second cacheline.
> 
> I will update the patch as rejected, is there any objection?

v2 is coming; it will be based on dynamic mbuf fields.
I think Superseded / Changes Requested is more relevant.

WBR, Slava

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH] ethdev: add flow tag
  2019-10-08 12:57             ` Yigit, Ferruh
@ 2019-10-08 13:18               ` Slava Ovsiienko
  0 siblings, 0 replies; 34+ messages in thread
From: Slava Ovsiienko @ 2019-10-08 13:18 UTC (permalink / raw)
  To: Yigit, Ferruh, Yongseok Koh, Adrien Mazarguil
  Cc: Shahaf Shuler, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko,
	Olivier Matz, dev

> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@linux.intel.com>
> Sent: Tuesday, October 8, 2019 15:57
> To: Yongseok Koh <yskoh@mellanox.com>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; Olivier Matz
> <olivier.matz@6wind.com>; dev <dev@dpdk.org>; Slava Ovsiienko
> <viacheslavo@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH] ethdev: add flow tag
> 
> On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> > On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> >> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> >>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil
> <adrien.mazarguil@6wind.com> wrote:
> >>>>
> >>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >>>>> A tag is transient data which can be used during flow matching.
> >>>>> This can be used to store the match result from a previous table so
> >>>>> that the same pattern need not be matched again in the next table.
> >>>>> Even if the outer header is decapsulated by the previous match, the
> >>>>> match result can be kept.
> >>>>>
> >>>>> Some devices expose internal registers of their flow processing
> >>>>> pipeline, and those registers are quite useful for stateful
> >>>>> connection tracking as they keep the status of flow matching.
> >>>>> Multiple tags are supported by specifying an index.
> >>>>>
> >>>>> Example testpmd commands are:
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>>>>            eth ... / end
> >>>>>    actions ... jump group 2 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>>  flow create 0 ingress group 2
> >>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>
> >>>> Hi Yongseok,
> >>>>
> >>>> Only high level questions for now, while it unquestionably looks
> >>>> useful, from a user standpoint exposing the separate index seems
> >>>> redundant and not necessarily convenient. Using the following example
> to illustrate:
> >>>>
> >>>> actions set_tag index 3 value 0x123456 mask 0xfffff
> >>>>
> >>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> >>>>
> >>>> I might be missing something, but why isn't this enough:
> >>>>
> >>>> pattern tag index is 3 # match whatever is stored at index 3
> >>>>
> >>>> Assuming it can work, then why bother with providing value
> >>>> spec/mask on set_tag? A flow rule pattern matches something, sets
> >>>> some arbitrary tag to be matched by a subsequent flow rule and
> >>>> that's it. It even seems like relying on the index only on both occasions
> is enough for identification.
> >>>>
> >>>> Same question for the opposite approach; relying on the value,
> >>>> never mentioning the index.
> >>>>
> >>>> I'm under the impression that the index is a hardware-specific
> >>>> constraint that shouldn't be exposed (especially since it's an
> >>>> 8-bit field). If so, a PMD could keep track of used indices without
> >>>> having them exposed through the public API.
> >>>
> >>>
> >>> Thank you for review, Adrien.
> >>> Hope you are doing well. It's been long since we talked each other.
> >>> :-)
> >>
> >> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence
> >> slow to answer these days...
> >>
> >>  <dev@dpdk.org> hey!
> >>  <dev@dpdk.org> no private talks!
> >>
> >> Back to the topic:
> >>
> >>> Your approach will work too in general but we have a request from
> >>> customer that they want to partition this limited tag storage.
> >>> Assuming that HW exposes 32bit tags (those are 'registers' in HW
> >>> pipeline in mlx5 HW). Then, customers want to store multiple data
> >>> even in a 32-bit storage. For example, 16bit vlan tag, 8bit table id
> >>> and 8bit flow id. As they want to split one 32bit storage, I thought
> >>> it is better to provide mask when setting/matching the value. Even
> >>> some customer wants to store multiple flags bit by bit like ol_flags. They
> do want to alter only partial bits.
> >>>
> >>> And for the index, it is to reference an entry of tags array as HW
> >>> can provide larger registers than 32-bit. For example, mlx5 HW would
> >>> provide 4 of 32b storage which users can use for their own sake.
> >>> 	tag[0], tag[1], tag[2], tag[3]
> >>
> >> OK, looks like I missed the point then. I initially took it for a
> >> funky alternative to RTE_FLOW_ITEM_TYPE_META &
> >> RTE_FLOW_ACTION_TYPE_SET_META (ingress extended [1]) but while it
> >> could be used like that, it's more of a way to temporarily store and
> retrieve a small amount of data, correct?
> >
> > Correct.
> >
> >> Out of curiosity, are these registers independent from META and other
> >> items/actions in mlx5, otherwise what happens if they are combined?
> >
> > I thought about combining it but I chose this way. Because it is
> > transient. META can be set by packet descriptor on Tx and can be
> > delivered to host via mbuf on Rx, but this TAG item can't. If I
> > combine it, users have to query this capability for each 32b storage.
> > And also, there should be a way to request data from such storages
> > (i.e. new action , e.g. copy_meta). Let's say there are 4x32b storages
> > - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> > mbuf->metadata, it should be something like,
> > 	ingress / pattern .. /
> > 	actions ... set_meta index i data x / copy_meta_to_rx index i And if
> > user wants to set meta[i] via mbuf on Tx,
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > For sure, user is also responsible for querying these capabilities per
> > each meta[] storage.
> >
> > As copy_meta_to_tx/rx isn't a real action, this example would confuse
> user.
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > User might misunderstand the order of two things - item meta and
> > copy_meta action. I also thought about having capability bits per each
> > meta[] storage but it also looked complex.
> >
> > I do think rte_flow item/action is better to be simple, atomic and intuitive.
> > That's why I made this choice.
> >
> >> Are there other uses for these registers? Say, referencing their
> >> contents from other places in a flow rule so they don't have to be hard-
> coded?
> >
> > Possible.
> > Actually, this feature is needed by connection tracking of OVS-DPDK.
> >
> >> Right now I'm still uncomfortable with such a feature in the public
> >> API because compared to META [1], this approach looks very
> >> hardware-specific and seemingly difficult to map on different HW
> architectures.
> >
> > I wouldn't say it is HW-specific. Like I explained above, I just
> > define this new item/action to make things easy-to-use and intuitive.
> >
> >> However, the main problem is that as described, its end purpose seems
> >> redundant with META, which I think can cover the use cases you gave.
> >> So what can an application do with this that couldn't be done in a
> >> more generic fashion through META?
> >>
> >> I may still be missing something and I'm open to ideas, but assuming
> >> it doesn't make it into the public rte_flow API, it remains an
> >> interesting feature on its own merit which could be added to DPDK as
> >> PMD-specific pattern items/actions [2]. mlx5 doesn't have any yet,
> >> but it's pretty common for PMDs to expose a public header that
> >> dedicated applications can include to use this kind of features (look for
> rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> >> No problem with that.
> >
> > That's good info. Thanks. But still considering
> > connection-tracking-like use-cases, this transient storage on multi-table
> flow pipeline is quite useful.
> >
> >
> > thanks,
> > Yongseok
> >
> >> [1] "[PATCH] ethdev: extend flow metadata"
> >>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
> >>
> >> [2] "Negative types"
> >>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types
> 
> Is this RFC still valid, will there be any follow up?
> If not, I will mark it as rejected in the next few days.

Yes, the RFC is valid; v2 and support in mlx5 are coming.

WBR, Slava

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-07-10  9:31   ` Olivier Matz
@ 2019-10-10 16:02   ` " Viacheslav Ovsiienko
  2019-10-18  9:22     ` Olivier Matz
  1 sibling, 1 reply; 34+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-10 16:02 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, olivier.matz, Yongseok Koh

Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also a need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to using the SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.
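
As a minimal usage sketch (not part of the patch itself; includes and error
handling are trimmed, and 'm' stands for an mbuf returned by
rte_eth_rx_burst()):

  /* Once at initialization, before the first SET_META rule is created. */
  if (!rte_flow_dynf_metadata_avail() &&
      rte_flow_dynf_metadata_register() < 0)
          rte_panic("cannot register the metadata dynamic field\n");

  /* Per received packet: read back the value set by the SET_META action. */
  if (m->ol_flags & PKT_RX_DYNF_METADATA)
          printf("Rx metadata: 0x%08x\n", *RTE_FLOW_DYNF_METADATA(m));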

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

  This patch uses dynamic mbuf field and must be applied after:
  http://patches.dpdk.org/patch/59343/
  "mbuf: support dynamic fields and flags"

  v2: - rebased
      - relies on dynamic mbuf field feature

  v1: http://patches.dpdk.org/patch/56103/

  rfc: http://patches.dpdk.org/patch/54270/

 app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst   |  8 +++
 lib/librte_ethdev/rte_ethdev.h           |  1 -
 lib/librte_ethdev/rte_ethdev_version.map |  6 +++
 lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
 lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
 9 files changed, 265 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b26b8bf..078f256 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -305,6 +305,9 @@ enum index {
 	ACTION_DEC_TCP_ACK_VALUE,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -994,6 +997,7 @@ struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1180,6 +1184,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1238,6 +1249,10 @@ static int parse_vc_action_raw_encap(struct context *,
 static int parse_vc_action_raw_decap(struct context *,
 				     const struct token *, const char *,
 				     unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3222,7 +3237,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.help = "set raw decap data",
 		.next = NEXT(next_item),
 		.call = parse_set_raw_encap_decap,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4592,6 +4631,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 1570270..39ff07b 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -81,6 +81,11 @@
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index ff6fb11..45fc041 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2415,6 +2441,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 8921cfd..904746e 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -95,6 +95,14 @@ New Features
   for specific offload features, where adding a static field or flag
   in the mbuf is not justified.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index d937fb4..9a6432c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1013,7 +1013,6 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
-
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
 				 DEV_RX_OFFLOAD_TCP_CKSUM)
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 6df42a4..3d9cafc 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -283,4 +283,10 @@ EXPERIMENTAL {
 
 	# added in 19.08
 	rte_eth_read_clock;
+
+	# added in 19.11
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_flag;
+	rte_flow_dynf_metadata_avail;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index cc03b15..9cbda75 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field name for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag bit number for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -153,8 +161,41 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = MBUF_DYNF_METADATA_NAME,
+		.size = MBUF_DYNF_METADATA_SIZE,
+		.align = MBUF_DYNF_METADATA_ALIGN,
+		.flags = MBUF_DYNF_METADATA_FLAGS,
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = MBUF_DYNF_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 391a44a..a27e619 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -27,6 +27,8 @@
 #include <rte_udp.h>
 #include <rte_byteorder.h>
 #include <rte_esp.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -417,7 +419,8 @@ enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
 /*
  * Definition of a single action.
  *
@@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 6e2c816..4ff33ac 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
  */
 #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
 
+/**
+ * Flow metadata dynamic field definitions.
+ */
+#define MBUF_DYNF_METADATA_NAME "flow-metadata"
+#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
+#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
+#define MBUF_DYNF_METADATA_FLAGS 0
+
 #endif
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dpdk-dev] [PATCH v2] ethdev: add flow tag
  2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
  2019-07-05 13:54     ` Adrien Mazarguil
@ 2019-10-10 16:09     ` " Viacheslav Ovsiienko
  1 sibling, 0 replies; 34+ messages in thread
From: Viacheslav Ovsiienko @ 2019-10-10 16:09 UTC (permalink / raw)
  To: dev; +Cc: matan, rasland, thomas, Yongseok Koh

A tag is transient data which can be used during flow matching. This can be
used to store the match result from a previous table so that the same pattern
need not be matched again in the next table. Even if the outer header is
decapsulated by the previous match, the match result can be kept.

Some devices expose internal registers of their flow processing pipeline, and
those registers are quite useful for stateful connection tracking as they keep
the status of flow matching. Multiple tags are supported by specifying an
index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end
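
The first rule above, built through the C API, would carry two SET_TAG actions
roughly as in the following sketch (pattern and the remaining actions are
omitted; it relies on struct rte_flow_action_set_tag introduced below):

  struct rte_flow_action_set_tag tag2 = {
          .data = 0xaa00bb, .mask = 0xffff00ff, .index = 2,
  };
  struct rte_flow_action_set_tag tag3 = {
          .data = 0x123456, .mask = 0xffffff, .index = 3,
  };
  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &tag2 },
          { .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &tag3 },
          /* ... VXLAN decap and jump to group 1 would follow here ... */
          { .type = RTE_FLOW_ACTION_TYPE_END },
  };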

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

v2: rebased
v1: http://patches.dpdk.org/patch/56104/
rfc: http://patches.dpdk.org/patch/54271/


 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst |  5 +++
 lib/librte_ethdev/rte_flow.c           |  2 +
 lib/librte_ethdev/rte_flow.h           | 54 ++++++++++++++++++++++++
 5 files changed, 186 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 078f256..667cb80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -203,6 +203,9 @@ enum index {
 	ITEM_PPPOED,
 	ITEM_PPPOE_SEID,
 	ITEM_PPPOE_PROTO_ID,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -308,6 +311,10 @@ enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -678,6 +685,7 @@ struct parse_action_priv {
 	ITEM_PPPOES,
 	ITEM_PPPOED,
 	ITEM_PPPOE_PROTO_ID,
+	ITEM_TAG,
 	END_SET,
 	ZERO,
 };
@@ -942,6 +950,13 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -998,6 +1013,7 @@ struct parse_action_priv {
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1191,6 +1207,14 @@ struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -2434,6 +2458,26 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.next = NEXT(item_pppoe_proto_id),
 		.call = parse_vc,
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -3262,6 +3306,37 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 45fc041..290646f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@ action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+----------+----------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2472,6 +2500,28 @@ the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+Tag is a transient data used during flow matching. This is not delivered to
+application. Multiple tags are supported by specifying index.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 904746e..9077f2f 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -103,6 +103,11 @@ New Features
   * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
     PKT_RX_DYNF_METADATA.
 
+* **Added flow tag in rte_flow.**
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 9cbda75..dcbae99 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -82,6 +82,7 @@ struct rte_flow_desc_data {
 		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
 	MK_FLOW_ITEM(MARK, sizeof(struct rte_flow_item_mark)),
 	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
+	MK_FLOW_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
 	MK_FLOW_ITEM(GRE_KEY, sizeof(rte_be32_t)),
 	MK_FLOW_ITEM(GTP_PSC, sizeof(struct rte_flow_item_gtp_psc)),
 	MK_FLOW_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)),
@@ -162,6 +163,7 @@ struct rte_flow_desc_data {
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
+	MK_FLOW_ACTION(SET_TAG, sizeof(struct rte_flow_action_set_tag)),
 };
 
 int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index a27e619..f3a5166 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -473,6 +473,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_pppoe_proto_id.
 	 */
 	RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1300,6 +1309,27 @@ struct rte_flow_item_pppoe_proto_id {
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_tag_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
  * RTE_FLOW_ITEM_TYPE_MARK
  *
  * Matches an arbitrary integer value which was set using the ``MARK`` action
@@ -1831,6 +1861,15 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2355,6 +2394,21 @@ struct rte_flow_action_set_meta {
 /* Mbuf dynamic flag for metadata. */
 #define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
@ 2019-10-18  9:22     ` Olivier Matz
  2019-10-19 19:47       ` Slava Ovsiienko
  0 siblings, 1 reply; 34+ messages in thread
From: Olivier Matz @ 2019-10-18  9:22 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, thomas, Yongseok Koh

Hi Viacheslav,

A few comments on the dynamic mbuf part below.

On Thu, Oct 10, 2019 at 04:02:39PM +0000, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also a need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to using the SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> 
>   This patch uses dynamic mbuf field and must be applied after:
>   http://patches.dpdk.org/patch/59343/
>   "mbuf: support dynamic fields and flags"
> 
>   v2: - rebased
>       - relies on dynamic mbuf field feature
> 
>   v1: http://patches.dpdk.org/patch/56103/
> 
>   rfc: http://patches.dpdk.org/patch/54270/
> 
>  app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
>  app/test-pmd/util.c                      |  5 ++
>  doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst   |  8 +++
>  lib/librte_ethdev/rte_ethdev.h           |  1 -
>  lib/librte_ethdev/rte_ethdev_version.map |  6 +++
>  lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
>  lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
>  9 files changed, 265 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index b26b8bf..078f256 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -305,6 +305,9 @@ enum index {
>  	ACTION_DEC_TCP_ACK_VALUE,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
>  };
>  
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -994,6 +997,7 @@ struct parse_action_priv {
>  	ACTION_DEC_TCP_ACK,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
>  	ZERO,
>  };
>  
> @@ -1180,6 +1184,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
>  
> +static const enum index action_set_meta[] = {
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -1238,6 +1249,10 @@ static int parse_vc_action_raw_encap(struct context *,
>  static int parse_vc_action_raw_decap(struct context *,
>  				     const struct token *, const char *,
>  				     unsigned int, void *, unsigned int);
> +static int parse_vc_action_set_meta(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
> +				    unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -3222,7 +3237,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  		.help = "set raw decap data",
>  		.next = NEXT(next_item),
>  		.call = parse_set_raw_encap_decap,
> -	}
> +	},
> +	[ACTION_SET_META] = {
> +		.name = "set_meta",
> +		.help = "set metadata",
> +		.priv = PRIV_ACTION(SET_META,
> +			sizeof(struct rte_flow_action_set_meta)),
> +		.next = NEXT(action_set_meta),
> +		.call = parse_vc_action_set_meta,
> +	},
> +	[ACTION_SET_META_DATA] = {
> +		.name = "data",
> +		.help = "metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_META_MASK] = {
> +		.name = "mask",
> +		.help = "mask for metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
>  
>  /** Remove and return last entry from argument stack. */
> @@ -4592,6 +4631,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  	return ret;
>  }
>  
> +static int
> +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	ret = rte_flow_dynf_metadata_register();
> +	if (ret < 0)
> +		return -1;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 1570270..39ff07b 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -81,6 +81,11 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> +		if (ol_flags & PKT_TX_METADATA)
> +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_RX_DYNF_METADATA)
> +			printf(" - Rx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (mb->packet_type) {
>  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
>  			printf(" - hw ptype: %s", buf);
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index ff6fb11..45fc041 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>     | ``mask`` | ``id``   | zeroed to match any value |
>     +----------+----------+---------------------------+
>  
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |
> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -2415,6 +2441,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
>  
>  Using this action on non-matching traffic will result in undefined behavior.
>  
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>  
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 8921cfd..904746e 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -95,6 +95,14 @@ New Features
>    for specific offload features, where adding a static field or flag
>    in the mbuf is not justified.
>  
> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +
>  Removed Items
>  -------------
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index d937fb4..9a6432c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1013,7 +1013,6 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 6df42a4..3d9cafc 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -283,4 +283,10 @@ EXPERIMENTAL {
>  
>  	# added in 19.08
>  	rte_eth_read_clock;
> +
> +	# added in 19.11
> +	rte_flow_dynf_metadata_offs;
> +	rte_flow_dynf_metadata_flag;
> +	rte_flow_dynf_metadata_avail;
> +	rte_flow_dynf_metadata_register;
>  };
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index cc03b15..9cbda75 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>  #include <rte_errno.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include "rte_ethdev.h"
>  #include "rte_flow_driver.h"
>  #include "rte_flow.h"
>  
> +/* Mbuf dynamic field name for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag bit number for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>  /**
>   * Flow elements description tables.
>   */
> @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>  };
>  
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +		.size = MBUF_DYNF_METADATA_SIZE,
> +		.align = MBUF_DYNF_METADATA_ALIGN,
> +		.flags = MBUF_DYNF_METADATA_FLAGS,
> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +	};

I don't think we need #defines.
You can directly use the name, sizeof() and __alignof__() here.
If the information is used externally, the structure should
be made global (non-static).
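
For instance, something along these lines (only a sketch of what is meant;
the "rte_flow_metadata" name below is a suggestion, not taken from the patch):

	/* inside rte_flow_dynf_metadata_register(), no #defines needed */
	static const struct rte_mbuf_dynfield desc_offs = {
		.name = "rte_flow_metadata",	/* suggested name */
		.size = sizeof(uint32_t),
		.align = __alignof__(uint32_t),
		.flags = 0,
	};
	static const struct rte_mbuf_dynflag desc_flag = {
		.name = "rte_flow_metadata",
	};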


> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:
> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>  static int
>  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>  {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 391a44a..a27e619 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -27,6 +27,8 @@
>  #include <rte_udp.h>
>  #include <rte_byteorder.h>
>  #include <rte_esp.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -417,7 +419,8 @@ enum rte_flow_item_type {
>  	/**
>  	 * [META]
>  	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>  	 * See struct rte_flow_item_meta.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_META,
> @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  #endif
>  
>  /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> - * Matches a specified metadata value.
> + * RTE_FLOW_ITEM_TYPE_META
> + *
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
>  	rte_be32_t data;
> @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
>  	 * undefined behavior.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> +
> +	/**
> +	 * Set metadata on ingress or egress path.
> +	 *
> +	 * See struct rte_flow_action_set_meta.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_META,
>  };
>  
>  /**
> @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>  };
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	rte_be32_t data;
> +	rte_be32_t mask;
> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +

I wonder if helpers like this wouldn't be better, because
they combine the flag and the field:

/**
 * Set metadata dynamic field and flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
                                       uint32_t metadata)
{
       *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                       uint32_t *) = metadata;
       m->ol_flags |= rte_flow_dynf_metadata_mask;
}

/**
 * Get metadata dynamic field value in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
                                       uint32_t *metadata)
{
       if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
               return -1;
       *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                               uint32_t *);
       return 0;
}

/**
 * Delete the metadata dynamic flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m)
{
       m->ol_flags &= ~rte_flow_dynf_metadata_mask;
}
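
For illustration, an Rx path could then use them like this (hypothetical
usage of the proposed helpers above, assuming the usual burst variables and
that rte_flow_dynf_metadata_register() already succeeded at init time):

	uint32_t md;
	uint16_t i;

	for (i = 0; i < nb_rx; i++) {
		/* producer side: attach metadata and set the flag */
		rte_mbuf_dyn_metadata_set(pkts[i], 0xcafe);
		/* consumer side: read it back only if present */
		if (rte_mbuf_dyn_metadata_get(pkts[i], &md) == 0)
			printf("metadata: 0x%x\n", md);
	}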


>  /*
>   * Definition of a single action.
>   *
> @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
>  };
>  
>  /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {
> +	return !!rte_flow_dynf_metadata_mask;
> +}

_registered() instead of _avail() ?

> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>   * Check whether a flow rule can be created on a given port.
>   *
>   * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 6e2c816..4ff33ac 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
>   */
>  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
>  
> +/**
> + * Flow metadata dynamic field definitions.
> + */
> +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> +#define MBUF_DYNF_METADATA_FLAGS 0

If this flag is only to be used in rte_flow, it can stay in rte_flow.
The name should follow the function name conventions; I suggest
"rte_flow_metadata".

If the flag is going to be used in several places in DPDK (rte_flow,
PMDs, applications, ...), I wonder if it shouldn't be defined in
rte_mbuf_dyn.c instead. I mean:

====
/* rte_mbuf_dyn.c */
const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
   ...
};
int rte_mbuf_dynfield_flow_metadata_offset = -1;
const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
   ...
};
int rte_mbuf_dynflag_flow_metadata_bitnum = -1;

int rte_mbuf_dyn_flow_metadata_register(void)
{
...
}

/* rte_mbuf_dyn.h */
extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
extern int rte_mbuf_dynfield_flow_metadata_offset;
extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
extern int rte_mbuf_dynflag_flow_metadata_bitnum;

...helpers to set/get metadata...
===

Centralizing the definitions of non-private dynamic fields/flags in
rte_mbuf_dyn may help other people to reuse a field that is well
described if it matches their use case.

In your case, what is carried by metadata? Could it be reused by
others? I think some more description is needed.


Regards,
Olivier


* Re: [dpdk-dev] [PATCH v2] ethdev: extend flow metadata
  2019-10-18  9:22     ` Olivier Matz
@ 2019-10-19 19:47       ` Slava Ovsiienko
  0 siblings, 0 replies; 34+ messages in thread
From: Slava Ovsiienko @ 2019-10-19 19:47 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Matan Azrad, Raslan Darawsheh, Thomas Monjalon, Yongseok Koh

Hi, Olivier

Thank you for your comments (and for the dynamic mbuf patch, by the way). Please see below.

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Friday, October 18, 2019 12:22
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v2] ethdev: extend flow metadata
> 
> Hi Viacheslav,
> 
> Few comments on the dynamic mbuf part below.
> 
[snip]

> > @@ -12,10 +12,18 @@
> >  #include <rte_errno.h>
> >  #include <rte_branch_prediction.h>
> >  #include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >  #include "rte_ethdev.h"
> >  #include "rte_flow_driver.h"
> >  #include "rte_flow.h"
> >
> > +/* Mbuf dynamic field name for metadata. */ int
> > +rte_flow_dynf_metadata_offs = -1;
> > +
> > +/* Mbuf dynamic field flag bit number for metadata. */ uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> >  /**
> >   * Flow elements description tables.
> >   */
> > @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
> >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > +	MK_FLOW_ACTION(SET_META, sizeof(struct
> rte_flow_action_set_meta)),
> >  };
> >
> > +int
> > +rte_flow_dynf_metadata_register(void)
> > +{
> > +	int offset;
> > +	int flag;
> > +
> > +	static const struct rte_mbuf_dynfield desc_offs = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +		.size = MBUF_DYNF_METADATA_SIZE,
> > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > +	};
> > +	static const struct rte_mbuf_dynflag desc_flag = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +	};
> 
> I don't think we need #defines.
> You can directly use the name, sizeof() and __alignof__() here.
> If the information is used externally, the structure should be made global
> (non-static).

The intention was to gather all dynamic field definitions in one place
(in rte_mbuf_dyn.h). It would be easy to see all fields at a glance (which
ones might be shared, which ones are mutually exclusive, how much mbuf space
is required by various features, etc.). So we can't just fill the structure
fields with plain sizeof() and alignof() instead of definitions (the field
parameters must be defined once).

I do not see a reason to make the table global; I would prefer the definitions:
- the definitions are processed at compile time (table fields are runtime),
  which allows code optimization and better performance.

> > +
> > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > +	if (offset < 0)
> > +		goto error;
> > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > +	if (flag < 0)
> > +		goto error;
> > +	rte_flow_dynf_metadata_offs = offset;
> > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > +	return 0;
> > +
> > +error:
> > +	rte_flow_dynf_metadata_offs = -1;
> > +	rte_flow_dynf_metadata_mask = 0ULL;
> > +	return -rte_errno;
> > +}
> > +
> >  static int
> >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)  {
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -27,6 +27,8 @@
> >  #include <rte_udp.h>
> >  #include <rte_byteorder.h>
> >  #include <rte_esp.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> >  	/**
> >  	 * [META]
> >  	 *
> > -	 * Matches a metadata value specified in mbuf metadata field.
> > +	 * Matches a metadata value.
> > +	 *
> >  	 * See struct rte_flow_item_meta.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_META,
> > @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> > #endif
> >
> >  /**
> > - * RTE_FLOW_ITEM_TYPE_META.
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> >   *
> > - * Matches a specified metadata value.
> > + * RTE_FLOW_ITEM_TYPE_META
> > + *
> > + * Matches a specified metadata value. On egress, metadata can be set
> > + either by
> > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > + * metadata for a packet and the metadata will be reported via mbuf
> > + metadata
> > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> > + field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> >   */
> >  struct rte_flow_item_meta {
> >  	rte_be32_t data;
> > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> >  	 * undefined behavior.
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > +
> > +	/**
> > +	 * Set metadata on ingress or egress path.
> > +	 *
> > +	 * See struct rte_flow_action_set_meta.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_SET_META,
> >  };
> >
> >  /**
> > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SET_META
> > + *
> > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > +On
> > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > +field
> > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > +must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> > + *
> > + * Altering partial bits is supported with mask. For bits which have
> > +never
> > + * been set, unpredictable value will be seen depending on driver
> > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > +may
> > + * or may not be propagated to the other path depending on HW
> capability.
> > + *
> > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > + */
> > +struct rte_flow_action_set_meta {
> > +	rte_be32_t data;
> > +	rte_be32_t mask;
> > +};
> > +
> > +/* Mbuf dynamic field offset for metadata. */ extern int
> > +rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */ #define
> > +RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> *)
> > +
> > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > +(rte_flow_dynf_metadata_mask)
> > +
> 
> I wonder if helpers like this wouldn't be better, because they combine the
> flag and the field:
> 
> /**
>  * Set metadata dynamic field and flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
>                                        uint32_t metadata) {
>        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                        uint32_t *) = metadata;
>        m->ol_flags |= rte_flow_dynf_metadata_mask; }
Setting the flag here looks redundant.
What if the driver just replaces the metadata and the flag is already set?
As another option, the flags (for a set of fields) might be set in combination.
The mbuf field is supposed to be engaged in the datapath, where performance is
very critical, so adding one more abstraction layer does not seem relevant.
Also, metadata is not a feature of mbuf; the helpers should have an rte_flow prefix.

> /**
>  * Get metadata dynamic field value in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
>                                        uint32_t *metadata) {
>        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
>                return -1;
What if the metadata value is 0xFFFFFFFF?
The availability check might cover a larger code block,
so this might not be the best place to check availability.
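
For example (a sketch only, assuming the usual burst-processing variables),
the check can be hoisted so the whole burst is handled without per-packet tests:

	if (rte_flow_dynf_metadata_avail()) {
		for (i = 0; i < nb_rx; i++) {
			*RTE_FLOW_DYNF_METADATA(pkts[i]) = meta;
			pkts[i]->ol_flags |= PKT_RX_DYNF_METADATA;
		}
	}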

>        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                                uint32_t *);
>        return 0;
> }
> 
> /**
>  * Delete the metadata dynamic flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
>        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> 
Sorry, I do not see a practical use case for these helpers; in my opinion they are just a kind of obfuscation.
They replace very simple code and introduce some risk of performance impact.
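
For reference, the direct access the patch enables looks like this (sketch):

	/* set metadata and mark it as present */
	*RTE_FLOW_DYNF_METADATA(m) = metadata;
	m->ol_flags |= PKT_RX_DYNF_METADATA;

and, on the reading side:

	if (m->ol_flags & PKT_RX_DYNF_METADATA)
		metadata = *RTE_FLOW_DYNF_METADATA(m);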

> 
> >  /*
> >   * Definition of a single action.
> >   *
> > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> >
> >  /**
> > + * Check if mbuf dynamic field for metadata is registered.
> > + *
> > + * @return
> > + *   True if registered, false otherwise.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> > +	return !!rte_flow_dynf_metadata_mask; }
> 
> _registered() instead of _avail() ?
Accepted, sounds better.

> 
> > +
> > +/**
> > + * Register mbuf dynamic field and flag for metadata.
> > + *
> > + * This function must be called prior to use SET_META action in order
> > +to
> > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > +delivered to
> > + * application.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_dynf_metadata_register(void);
> > +
> > +/**
> >   * Check whether a flow rule can be created on a given port.
> >   *
> >   * The flow rule is validated for correctness and whether it could be
> > accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >   */
> >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) +
> > (offset)))
> >
> > +/**
> > + * Flow metadata dynamic field definitions.
> > + */
> > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > +MBUF_DYNF_METADATA_FLAGS 0
> 
> If this flag is only to be used in rte_flow, it can stay in rte_flow.
> The name should follow the function name conventions, I suggest
> "rte_flow_metadata".

The definitions:
MBUF_DYNF_METADATA_NAME,
MBUF_DYNF_METADATA_SIZE,
MBUF_DYNF_METADATA_ALIGN
are global. rte_flow proposes only a minimal set to check and access
the metadata. By knowing the field names, applications would have more
flexibility in processing the fields; for example, it allows optimizing
the handling of multiple dynamic fields. The definition of the metadata size
allows generating optimized code:

#if MBUF_DYNF_METADATA_SIZE == sizeof(uint32_t)
	*RTE_FLOW_DYNF_METADATA(m) = get_metadata_32bit();
#else
	*RTE_FLOW_DYNF_METADATA(m) = get_metadata_64bit();
#endif

The MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow;
it is related exclusively to the dynamic mbuf API ("Reserved for future use, must be 0").
Would you like to drop this definition?

> 
> If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I mean:
> 
> ====
> /* rte_mbuf_dyn.c */
> const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
>    ...
> };
In this case we would make this descriptor global.
It is not needed, because no usage is supposed other than by
rte_flow_dynf_metadata_register().

> int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
>    ...
> };
> int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> 
> int rte_mbuf_dyn_flow_metadata_register(void)
> {
> ...
> }
> 
> /* rte_mbuf_dyn.h */
> extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
> extern int rte_mbuf_dynfield_flow_metadata_offset;
> extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> 
> ...helpers to set/get metadata...
> ===
> 
> Centralizing the definitions of non-private dynamic fields/flags in
> rte_mbuf_dyn may help other people to reuse a field that is well described if
> it match their use-case.

Yes, centralizing is important; that's why MBUF_DYNF_METADATA_xxx is placed
in rte_mbuf_dyn.h. Do you think we should share the descriptors as well?
I see no reason why anyone other than rte_flow_dynf_metadata_register() would
register the metadata field directly.

> 
> In your case, what is carried by metadata? Could it be reused by others? I
> think some more description is needed.
In my case, metadata is just an opaque rte_flow-related 32-bit unsigned value provided by
mlx5 hardware in the Rx datapath. I cannot guess whether anyone else would wish to reuse it.

Brief summary of your comments (just to make sure I understood your proposal correctly):
1. drop all MBUF_DYNF_METADATA_xxx definitions, leave MBUF_DYNF_METADATA_NAME only
2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
3. provide helpers to access metadata

[1] and [2] look OK in general, although I think they make the code less flexible and restrict the potential compile-time options.
For now it is rather a theoretical question; if you insist on your approach, please let me know, and I'll address [1] and [2]
and update my patch.

As for [3] - IMHO, the extra abstraction layer is not useful and might even be harmful.
I would prefer not to complicate the code, at least for now.

With best regards,
Slava
 
> Regards,
> Olivier



Thread overview: 34+ messages
2019-06-03 21:32 [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
2019-06-06 18:33     ` Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-05 13:54     ` Adrien Mazarguil
2019-07-05 18:05       ` Yongseok Koh
2019-07-08 23:32         ` Yongseok Koh
2019-07-09  8:38         ` Adrien Mazarguil
2019-07-11  1:59           ` Yongseok Koh
2019-10-08 12:57             ` Yigit, Ferruh
2019-10-08 13:18               ` Slava Ovsiienko
2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
2019-06-10  3:19   ` Wang, Haiyue
2019-06-10  7:20     ` Andrew Rybchenko
2019-06-11  0:06       ` Yongseok Koh
2019-06-19  9:05         ` Andrew Rybchenko
2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-10  9:31   ` Olivier Matz
2019-07-10  9:55     ` Bruce Richardson
2019-07-10 10:07       ` Olivier Matz
2019-07-10 12:01         ` Bruce Richardson
2019-07-10 12:26           ` Thomas Monjalon
2019-07-10 16:37             ` Yongseok Koh
2019-07-11  7:44               ` Adrien Mazarguil
2019-07-14 11:46                 ` Andrew Rybchenko
2019-07-29 15:06                   ` Adrien Mazarguil
2019-10-08 12:51                     ` Yigit, Ferruh
2019-10-08 13:17                       ` Slava Ovsiienko
2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-10-18  9:22     ` Olivier Matz
2019-10-19 19:47       ` Slava Ovsiienko
