All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v2 0/5] rte_flow extension for vSwitch acceleration
@ 2017-12-21  2:35 Qi Zhang
  2017-12-21  2:35 ` [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain Qi Zhang
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

This patch extend rte_flow API.
The purpose is to provide comfortable programming interface for virtual switch
software (such as OVS) to take advantage of incoming device's vSwitch acceleration
capability when using DPDK as data plane.

Below is summary of changes:

1. Support to specify flow's destination as an ethdev interface.

Add action RTE_FLOW_ACTION_TYPE_ETHDEV_PORT, use port_id as the identification
of the destitation. A typical use case is, with a smart NIC used for vSwitch
acceleration, flow is defined to redirect packet between switch port that is
managed by a Port Representor.
See patch for Port Representor: http://dpdk.org/dev/patchwork/patch/31458/

2. Enhanced flow statistics query.
Enhanced action RTE_FLOW_ACTION_COUNT by adding last hit timestamp tracking which is
the requirement from OVS.

3. Add flow timeout support as the requirement from OVS
Application is able to
a) Setup the time duration of a flow, the flow is expected to be deleted automatically
when timeout.
b) Ping a flow to check if it is active or not.
c) Register a callback function when a flow is deleted due to timeout.

4. Add protocol headers which will be supported by incoming device.

New protocal headers include IPV4 ARP, IPV6 ICMP , IPV6 extent header.

5. Add packet modification actions which will be supported by incoming device.

Add new actions that be used to modify packet content with generic semantic:

RTE_FLOW_ACTION_TYPE_FIELD_UPDATE: update specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_INCREMENT: increament specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_DECREMENT: decreament specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_COPY: copy data from one field to another in packet.

All action use struct rte_flow_item parameter to match the pattern that going
to be modified, if no pattern match, the action just be skipped.
These action are non-terminating action. they will not impact the fate of the
packets, since pattern match is expected to be performed before packet be modified.

Note:
The RFC patch is based on v17.11.
Testpmd command line support is not included.

v2:
- v1 is incompelete due to some misoperation.

Qi Zhang (5):
  ether: add flow action to redirect packet in a switch domain
  ether: add flow last hit query support
  ether: Add flow timeout support
  ether: add more protocol support in rte_flow
  ether: add packet modification aciton in rte_flow

 doc/guides/prog_guide/rte_flow.rst | 148 ++++++++++++++++++
 lib/librte_ether/rte_flow.c        |  38 +++++
 lib/librte_ether/rte_flow.h        | 305 ++++++++++++++++++++++++++++++++++++-
 lib/librte_ether/rte_flow_driver.h |  12 ++
 4 files changed, 502 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
@ 2017-12-21  2:35 ` Qi Zhang
  2017-12-21 12:37   ` Alex Rosenbaum
  2017-12-21  2:35 ` [RFC v2 2/5] ether: add flow last hit query support Qi Zhang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

Add action RTE_FLOW_ACTION_TYPE_SWITCH_PORT, it can be used to redirect
a packet to a network interface that connect to the same switch domain,
rte_ethdev's port_id is used as an identification of the destination.
A typical use case is: with a smart NIC for vSwitch acceleration, flow
is defined to forward packets between the switch port that is managed by
Port Representor.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/prog_guide/rte_flow.rst | 22 ++++++++++++++++++++++
 lib/librte_ether/rte_flow.h        | 19 ++++++++++++++++++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index d158be5..dcea2f6 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1474,6 +1474,28 @@ fields in the pattern items.
    | 1     | END      |
    +-------+----------+
 
+Action: ``PORT``
+^^^^^^^^^^^^^^^^
+
+Redirect packets to an interface that connect to the same switch domain.
+
+The desitnation should be managed by a rte_ethdev instance, port_id is
+the identification of the destination. A typical use case is to define
+a flow that redirect packet to a interface that mananged by a Port
+Representor.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_port:
+
+.. table:: PORT
+
+   +--------------+-----------------------------------+
+   | Field        | Value                             |
+   +==============+===================================+
+   | ``port_id``  | identification of the destination |
+   +--------------+-----------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 47c88ea..91706e2 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1008,7 +1008,14 @@ enum rte_flow_action_type {
 	 *
 	 * See struct rte_flow_action_security.
 	 */
-	RTE_FLOW_ACTION_TYPE_SECURITY
+	RTE_FLOW_ACTION_TYPE_SECURITY,
+
+	/**
+	 * Redirect packets to a network interface in the same switch domain.
+	 *
+	 * See struct rte_flow_action_port.
+	 */
+	RTE_FLOW_ACTION_TYPE_PORT,
 };
 
 /**
@@ -1146,6 +1153,16 @@ struct rte_flow_action_security {
 	void *security_session; /**< Pointer to security session structure. */
 };
 
+/** RTE_FLOW_ACTION_TYPE_PORT
+ *
+ * Redirect packets to a network interface in the same switch domain.
+ *
+ * Terminateing by default.
+ */
+struct rte_flow_action_port {
+       uint16_t port_id; /**< identification of the forward destination. */
+};
+
 /**
  * Definition of a single action.
  *
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC v2 2/5] ether: add flow last hit query support
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
  2017-12-21  2:35 ` [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain Qi Zhang
@ 2017-12-21  2:35 ` Qi Zhang
  2017-12-21  2:35 ` [RFC v2 3/5] ether: Add flow timeout support Qi Zhang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

Enhanced the action RTE_FLOW_TYPE_ACTION_COUNT, number of milliseconds since
last hit can be queried.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/librte_ether/rte_flow.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 91706e2..8e902f0 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1052,9 +1052,11 @@ struct rte_flow_query_count {
 	uint32_t reset:1; /**< Reset counters after query [in]. */
 	uint32_t hits_set:1; /**< hits field is set [out]. */
 	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t last_hit_set:1; /**< last_hit field is set [out]. */
 	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
 	uint64_t hits; /**< Number of hits for this rule [out]. */
 	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+	uint64_t last_hit; /**< Number of milliseconds since last hit [out]. */
 };
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC v2 3/5] ether: Add flow timeout support
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
  2017-12-21  2:35 ` [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain Qi Zhang
  2017-12-21  2:35 ` [RFC v2 2/5] ether: add flow last hit query support Qi Zhang
@ 2017-12-21  2:35 ` Qi Zhang
  2017-12-21 13:59   ` Alex Rosenbaum
  2017-12-21  2:35 ` [RFC v2 4/5] ether: add more protocol support in rte_flow Qi Zhang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

Add new APIs to support flow timeout, application is able to
1. Setup the time duration of a flow, the flow is expected to be deleted
automatically when timeout.
2. Ping a flow to check if it is active or not.
3. Register a callback function when a flow is deleted due to timeout.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/prog_guide/rte_flow.rst | 37 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow.c        | 38 +++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow.h        | 49 ++++++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h | 12 ++++++++++
 4 files changed, 136 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index dcea2f6..1a242fc 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -181,6 +181,14 @@ directions. At least one direction must be specified.
 Specifying both directions at once for a given rule is not recommended but
 may be valid in a few cases (e.g. shared counters).
 
+Attribute: Timeout
+^^^^^^^^^^^^^^^^^^
+
+Two kinds of timeout can be assigned to a flow rule. For "hard timeout", flow
+rule will be deleted when specific time duration  passed since it's creation.
+For "idle timeout", flow rule will be deleted when no packet hit in the given
+time duration.
+
 Pattern item
 ~~~~~~~~~~~~
 
@@ -1695,6 +1703,35 @@ definition.
                   void *data,
                   struct rte_flow_error *error);
 
+Is Active
+~~~~~~~~~
+
+Check if a flow is still active or not.
+
+It is possible a flow is be deleted automatically due to timeout, this function
+help to check if a flow is still exist.
+
+.. code-block:: c
+
+   int
+   rte_flow_is_active(uint8_t port_id,
+                      struct rte_flow *flow,
+                      uint8_t *active,
+                      struct rte_flow_error* error);
+
+Auto Delete Callback Set
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Register a callback function when flow is automatically deleted due to timeout.
+
+.. code-block:: c
+
+   int
+   rte_flow_auto_delete_callback_set(uint8_t port_id,
+                                     struct rte_flow *flow,
+                                     void (*callback)(struct rte_flow *),
+                                     struct rte_flow_error *error);
+
 Arguments:
 
 - ``port_id``: port identifier of Ethernet device.
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
index 6659063..650c5a5 100644
--- a/lib/librte_ether/rte_flow.c
+++ b/lib/librte_ether/rte_flow.c
@@ -425,3 +425,41 @@ rte_flow_copy(struct rte_flow_desc *desc, size_t len,
 	}
 	return 0;
 }
+
+/** Check if a flow is stiall active or not. */
+int
+rte_flow_is_active(uint8_t port_id,
+		   struct rte_flow* flow,
+		   uint8_t *active,
+		   struct rte_flow_error* error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->is_active))
+		return ops->is_active(dev, flow, active, error);
+	return -rte_flow_error_set(error, ENOSYS,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOSYS));
+}
+
+/** Register a callback function when flow is automatically deleted. */
+int
+rte_flow_auto_delete_callback_set(uint8_t port_id,
+				  struct rte_flow* flow,
+				  void (*callback)(struct rte_flow *),
+				  struct rte_flow_error* error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->auto_delete_callback_set))
+		return ops->auto_delete_callback_set(dev, flow, callback, error);
+	return -rte_flow_error_set(error, ENOSYS,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOSYS));
+}
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 8e902f0..e09e07f 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -97,6 +97,10 @@ struct rte_flow_attr {
 	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
 	uint32_t egress:1; /**< Rule applies to egress traffic. */
 	uint32_t reserved:30; /**< Reserved, must be zero. */
+	uint32_t hard_timeout;
+	/**< If !0, flow will be deleted after given number of seconds. */
+	uint32_t idle_timeout;
+	/**< If !0, flow will be deleted if no packet hit in given seconds. */
 };
 
 /**
@@ -1491,6 +1495,51 @@ rte_flow_copy(struct rte_flow_desc *fd, size_t len,
 	      const struct rte_flow_item *items,
 	      const struct rte_flow_action *actions);
 
+/**
+ * Check if a flow is still active or not.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule to check.
+ * @param[out] active
+ *   0 for not active, 1 for active.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_is_active(uint8_t port_id,
+		   struct rte_flow *flow,
+		   uint8_t *active,
+		   struct rte_flow_error *error);
+
+/**
+ * Register a callback function when flow is automatically deleted
+ * due to timeout
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule to track.
+ * @param callback
+ *   The callback function.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_auto_delete_callback_set(uint8_t port_id,
+				  struct rte_flow *flow,
+				  void (*callback)(struct rte_flow *),
+				  struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
index 254d1cb..862d8ab 100644
--- a/lib/librte_ether/rte_flow_driver.h
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -124,6 +124,18 @@ struct rte_flow_ops {
 		(struct rte_eth_dev *,
 		 int,
 		 struct rte_flow_error *);
+	/** See rte_flow_ping(). */
+	int (*is_active)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 uint8_t *,
+		 struct rte_flow_error *);
+	/** See rte_flow_delete(). */
+	int (*auto_delete_callback_set)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 void (*)(struct rte_flow *),
+		 struct rte_flow_error *);
 };
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC v2 4/5] ether: add more protocol support in rte_flow
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
                   ` (2 preceding siblings ...)
  2017-12-21  2:35 ` [RFC v2 3/5] ether: Add flow timeout support Qi Zhang
@ 2017-12-21  2:35 ` Qi Zhang
  2017-12-21  2:35 ` [RFC v2 5/5] ether: add packet modification aciton " Qi Zhang
  2017-12-21 13:01 ` [RFC v2 0/5] rte_flow extension for vSwitch acceleration Alex Rosenbaum
  5 siblings, 0 replies; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

Add new protocol header match support as below

RTE_FLOW_ITEM_TYPE_ARP - match IPv4 ARP header.
RTE_FLOW_ITEM_TYPE_EXT_HDR_ANY - match any IPv6 extension header.
RTE_FLOW_ITEM_TYPE_ICMPV6 - match IPv6 ICMP header.
RTE_FLOW_ITEM_TYPE_ICMPV6_TGT_ADDR - match IPv6 ICMP Target address
RTE_FLOW_ITEM_TYPE_ICMPV6_SSL - match IPv6 ICMP Source Link-layer address
RTE_FLOW_ITEM_TYPE_ICMPV6_TTL - match IPv6 ICMP Target Link-layer address

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/librte_ether/rte_flow.h | 156 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)

diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index e09e07f..4ee2b62 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -348,6 +348,49 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_esp.
 	 */
 	RTE_FLOW_ITEM_TYPE_ESP,
+
+	/**
+	 * Matches ARP IPv4 header.
+	 *
+	 * See struct rte_flow_item_arp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ARP,
+
+	/**
+	 * Matches any IPv6 Extension header.
+	 *
+	 * See struct rte_flow_item_ipv6_ext_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6_EXT_HDR_ANY,
+
+	/**
+	 * Matches ICMPv6 header.
+	 *
+	 * See struct rte_flow_item_icmpv6
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMPV6,
+
+	/**
+	 * Match ICMPv6 target address.
+	 *
+	 * See struct rte_flow_item_icmpv6_tgt_addr.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMPV6_TGT_ADDR,
+
+	/**
+	 * Match ICMPv6 Source Link-Layer Address.
+	 *
+	 * See struct rte_flow_item_icmpv6_sll.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMPV6_SLL,
+
+	/**
+	 * Match ICMPv6 Source Link-Layer Address.
+	 *
+	 * See struct rte_flow_item_icmpv6_tll.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMPV6_TLL,
+
 };
 
 /**
@@ -817,6 +860,119 @@ static const struct rte_flow_item_esp rte_flow_item_esp_mask = {
 #endif
 
 /**
+ * RTE_FLOW_ITEM_TYPE_ARP
+ *
+ * Matches IPv4 ARP packet header
+ */
+struct rte_flow_item_arp {
+    struct arp_hdr hdr;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_ARP. */
+#ifndef __cplusplus
+static const struct rte_flow_item_arp rte_flow_item_arp_mask = {
+	.hdr = {
+		.arp_data = {
+			.arp_sha = {
+			.addr_bytes = "\xff\xff\xff\xff\xff\xff",
+		},
+		.arp_sip = RTE_BE32(0xffffffff),
+		.arp_tha = {
+			.addr_bytes = "\xff\xff\xff\xff\xff\xff",
+		},
+			.arp_tip = RTE_BE32(0xffffffff),
+		},
+	},
+};
+#endif
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6_EXT_HDR_ANY
+ *
+ * Matches any IPv6 extension header.
+ */
+struct rte_flow_item_ipv6_ext_hdr_any {
+	uint8_t next_hdr;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_IPV6_EXT_HDR_ANY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_ipv6_ext_hdr_any rte_flow_item_ipv6_ext_any_mask = {
+	.next_hdr = 0xff,
+};
+#endif
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMPV6
+ *
+ * Matches ICMPv6 header.
+ */
+struct rte_flow_item_icmpv6 {
+	uint8_t type;
+	uint8_t code;
+	uint16_t checksum;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_ICMPV6 */
+#ifndef __cplusplus
+static const struct rte_flow_item_icmpv6 rte_flow_item_icmpv6_mask = {
+	.type = 0xff,
+	.code = 0xff,
+	.checksum = RTE_BE16(0xffff),
+};
+#endif
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMPV6_TGT_ADDR
+ *
+ * Matches ICMPv6's Target Address.
+ */
+struct rte_flow_item_icmpv6_tgt_addr {
+	uint8_t addr[16];
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_ICMPV6_TGT_ADDR */
+#ifndef __cplusplus
+static const struct rte_flow_item_icmpv6_tgt_addr rte_flow_item_icmpv6_tgt_addr_mask = {
+	.addr = "\xff\xff\xff\xff",
+};
+#endif
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICPMV6_SLL.
+ *
+ * Matches ICMPv6 Source Link-Layer address.
+ */
+struct rte_flow_item_icmpv6_sll {
+	struct ether_addr addr;
+};
+/** Default mask for RTE_FLOW_ITEM_TYPE_ICMPV6_SLL */
+#ifndef __cplusplus
+static const struct rte_flow_item_icmpv6_sll rte_flow_item_icmpv6_sll_mask = {
+	.addr = {
+		.addr_bytes = "\xff\xff\xff\xff\xff\xff",
+	}
+};
+#endif
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMPV6_TLL.
+ *
+ * Matches ICMPv6 Target Link-Layer address.
+ */
+struct rte_flow_item_icmpv6_tll {
+	struct ether_addr addr;
+};
+/** Default mask for RTE_FLOW_ITEM_TYPE_ICMPV6_TLL */
+#ifndef __cplusplus
+static const struct rte_flow_item_icmpv6_tll rte_flow_item_icmpv6_tll_mask = {
+	.addr = {
+		.addr_bytes = "\xff\xff\xff\xff\xff\xff",
+	}
+};
+#endif
+
+/**
  * Matching pattern item definition.
  *
  * A pattern is formed by stacking items starting from the lowest protocol
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC v2 5/5] ether: add packet modification aciton in rte_flow
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
                   ` (3 preceding siblings ...)
  2017-12-21  2:35 ` [RFC v2 4/5] ether: add more protocol support in rte_flow Qi Zhang
@ 2017-12-21  2:35 ` Qi Zhang
  2017-12-21 13:01 ` [RFC v2 0/5] rte_flow extension for vSwitch acceleration Alex Rosenbaum
  5 siblings, 0 replies; 17+ messages in thread
From: Qi Zhang @ 2017-12-21  2:35 UTC (permalink / raw)
  To: adrien.mazarguil; +Cc: dev, declan.doherty, Qi Zhang

Add new actions that be used to modify packet content with generic semantic:

RTE_FLOW_ACTION_TYPE_FIELD_UPDATE: update specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_INCREMENT: increament specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_DECREMENT: decreament specific field of packet
RTE_FLWO_ACTION_TYPE_FIELD_COPY: copy data from one field to another in packet.

All action use struct rte_flow_item parameter to match the pattern that going
to be modified, if no pattern match, the action just be skipped.

These action are non-terminating action. they will not impact the fate of the
packets.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 doc/guides/prog_guide/rte_flow.rst | 89 ++++++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow.h        | 79 +++++++++++++++++++++++++++++++++
 2 files changed, 168 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 1a242fc..b706a78 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1504,6 +1504,95 @@ Representor.
    | ``port_id``  | identification of the destination |
    +--------------+-----------------------------------+
 
+Action: ``FILED_UPDATE``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Update specific field of the packet.
+
+- Non-terminating by default.
+
+.. _table_rte_flow_action_field_update:
+
+.. table:: FIELD_UPDATE
+
+   +-----------+---------------------------------------------------------+
+   | Field     | Value                                                   |
+   +===========+=========================================================+
+   | ``item``  | item->type: specify the pattern to modify               |
+   |           | item->spec: specify the new value to update             |
+   |           | item->mask: specify which part of the pattern to update |
+   |           | item->last: ignored                                     |
+   +-----------+---------------------------------------------------------+
+   | ``layer`` | 0 means outermost matched pattern,                      |
+   |           | 1 means next-to-outermost and so on ...                 |
+   +-----------+---------------------------------------------------------+
+
+Action: ``FILED_INCREMENT``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Increment 1 on specific field of the packet.
+
+- Non-terminating by default.
+
+.. _table_rte_flow_action_field_increment:
+
+.. table:: FIELD_INCREMENT
+
+   +-----------+---------------------------------------------------------+
+   | Field     | Value                                                   |
+   +===========+=========================================================+
+   | ``item``  | item->type: specify the pattern to modify               |
+   |           | item->spec: ignored                                     |
+   |           | item->mask: specify which part of the pattern to update |
+   |           | item->last: ignored                                     |
+   +-----------+---------------------------------------------------------+
+   | ``layer`` | 0 means outermost matched pattern,                      |
+   |           | 1 means next-to-outermost and so on ...                 |
+   +-----------+---------------------------------------------------------+
+
+Action: ``FIELD_DECREMENT``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Decrement 1 on specific field of the packet.
+
+Paramenter is same is FIELD_INCREMENT.
+
+-Non-terminating by default.
+
+ACTION: ``FIELD_COPY``
+^^^^^^^^^^^^^^^^^^^^^^
+
+Copy data of one field to another of the packet.
+
+-Non-terminating by default.
+
+.. _table_rte_flow_action_field_increment:
+
+.. table:: FIELD_COPY
+
+   +-----------------+-----------------------------------------------------------------+
+   | Field           | Value                                                           |
+   +=================+=================================================================+
+   | ``src_item``    | src_item->type: match the pattern that data will be copy from   |
+   |                 | src_item->spec: ignored                                         |
+   |                 | src_item->mask: specify which part of the pattern to copy       |
+   |                 | src_item->last: ignored                                         |
+   +-----------------+-----------------------------------------------------------------+
+   | ``src_layer``   | layer of src_item                                               |
+   |                 | 0 means outermost matched pattern,                              |
+   |                 | 1 means next-to-outermost and so on ...                         |
+   +-----------------+-----------------------------------------------------------------+
+   | ``dst_item``    | dst_item->type: match the pattern that data will be copy to     |
+   |                 | dst_item->spec: ignored                                         |
+   |                 | dst_item->mask: specify which part of the pattern to be update  |
+   |                 |                 it must match src_item->mask.                   |
+   |                 | dst_item->last: ignored                                         |
+   +-----------------+-----------------------------------------------------------------+
+   | ``dst_layer``   | layer of dst_item                                               |
+   |                 | 0 means outermost matched pattern,                              |
+   |                 | 1 means next-to-outermost and so on ...                         |
+   +-----------------+-----------------------------------------------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 4ee2b62..12f194f 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -1176,6 +1176,34 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_port.
 	 */
 	RTE_FLOW_ACTION_TYPE_PORT,
+
+	/**
+	 * Update specific field of the packet.
+	 *
+	 * See struct rte_flow_item_type_field_update.
+	 */
+	RTE_FLOW_ACTION_TYPE_FILED_UPDATE,
+
+	/**
+	 * Increment specific field of the packet.
+	 *
+	 * See struct rte_flow_item_type_field_increment.
+	 */
+	RTE_FLOW_ACTION_TYPE_FIELD_INCREMENT,
+
+	/**
+	 * Decrement specific field of the packet.
+	 *
+	 * See struct rte_flow_item_type_field_decrement.
+	 */
+	RTE_FLOW_ACTION_TYPE_FIELD_DECREMENT,
+
+	/**
+	 * Copy data of one field to another of the packet.
+	 *
+	 * See struct rte_flow_item_type_field_copy.
+	 */
+	RTE_FLOW_ACTION_TYPE_FIELD_COPY,
 };
 
 /**
@@ -1325,6 +1353,57 @@ struct rte_flow_action_port {
        uint16_t port_id; /**< identification of the forward destination. */
 };
 
+/** RTE_FLOW_ACTION_TYPE_FIELD_UPDATE
+ *
+ * Update specific field of the packet.
+ *
+ * Typical usage: update mac/ip address.
+ */
+struct rte_flow_action_field_update {
+	const struct rte_flow_item *item; /**< specify the data to modify. */
+	uint8_t layer;
+	/**< 0 means outermost matched pattern, 1 means next-to-outermost... */
+};
+
+/** RTE_FLOW_ACTION_TYPE_FIELD_INCREMENT
+ *
+ * Increment 1 on specific field of the packet.
+ *
+ * Typical usage: increase TTL
+ */
+struct rte_flow_action_field_increment {
+	const struct rte_flow_item *item; /**< specify the data to modify. */
+	uint8_t layer;
+	/**< 0 means outermost matched pattern, 1 means next-to-outermost... */
+};
+
+/** RTE_FLOW_ACTION_TYPE_FIELD_DECREMENT
+ *
+ * Decrement 1 on specific field of the packet.
+ *
+ * Typical usage: Decrease TTL
+ */
+struct rte_flow_action_field_decrement {
+	const struct rte_flow_item *item; /**< Specify the data to modify. */
+	uint8_t layer;
+	/**< 0 means outermost matched pattern, 1 means next-to-outermost... */
+};
+
+/** RTE_FLOW_ACTION_TYPE_FIELD_COPY
+ *
+ * Copy data from one field to another of the packet.
+ *
+ * Typical usage: TTL copy-in / copy-out
+ */
+struct rte_flow_action_field_copy {
+	const struct rte_flow_item *src_item; /**< Specify the data copy from */
+	uint8_t src_layer;
+	/**< 0 means outermost matched pattern, 1 means next-to-outermost... */
+	const struct rte_flow_item *dst_item; /**< Specify the data copy to */
+	uint8_t dst_layer;
+	/**< 0 means outermost matched pattern, 1 means next-to-outermost... */
+};
+
 /**
  * Definition of a single action.
  *
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain
  2017-12-21  2:35 ` [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain Qi Zhang
@ 2017-12-21 12:37   ` Alex Rosenbaum
  2017-12-22  8:20     ` Zhang, Qi Z
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-21 12:37 UTC (permalink / raw)
  To: Qi Zhang; +Cc: adrien.mazarguil, DPDK, Declan Doherty

On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> Add action RTE_FLOW_ACTION_TYPE_SWITCH_PORT, it can be used to redirect

I guess the word "SWITCH" should be remove from commit message. you
don't use it later in the patch.


>
> +Action: ``PORT``
> +^^^^^^^^^^^^^^^^
> +
> +Redirect packets to an interface that connect to the same switch domain.
> +
> +The desitnation should be managed by a rte_ethdev instance, port_id is
> +the identification of the destination. A typical use case is to define
> +a flow that redirect packet to a interface that mananged by a Port
> +Representor.


A verbs would be better suited for an ACTION_TYPE. while
".._TYPE_PORT" is a nous.
Probably ".._TYPE_REDIRECT" would better fit here.
See man tc-mirred as referance:
http://man7.org/linux/man-pages/man8/tc-mirred.8.html

Do we want to distinguish between different destination type?
The target might be a port (port_id) or potencial other destinations/queue.
So maybe use ".._TYPE_REDIRECT_TO_PORT"?

Anyway, I think you should remove the "same switch domain" from docs
since there is no switch domain yet in DPDK.
Lets let the PMD decided if this sucessed or fails, based on the
target type and other HW limitations. Not just based on switch domain.

PS: I agree switch domain needs to be introduced. I don't think port
representor is the correct direction.

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 0/5] rte_flow extension for vSwitch acceleration
  2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
                   ` (4 preceding siblings ...)
  2017-12-21  2:35 ` [RFC v2 5/5] ether: add packet modification aciton " Qi Zhang
@ 2017-12-21 13:01 ` Alex Rosenbaum
  5 siblings, 0 replies; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-21 13:01 UTC (permalink / raw)
  To: Qi Zhang; +Cc: adrien.mazarguil, DPDK, Declan Doherty

On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> This patch extend rte_flow API.
> The purpose is to provide comfortable programming interface for virtual switch
> software (such as OVS) to take advantage of incoming device's vSwitch acceleration
> capability when using DPDK as data plane.
>
> Below is summary of changes:
>
> 1. Support to specify flow's destination as an ethdev interface.
>
> Add action RTE_FLOW_ACTION_TYPE_ETHDEV_PORT, use port_id as the identification
> of the destitation. A typical use case is, with a smart NIC used for vSwitch
> acceleration, flow is defined to redirect packet between switch port that is
> managed by a Port Representor.
> See patch for Port Representor: http://dpdk.org/dev/patchwork/patch/31458/

If we are going to have a representation of a switch in DPDK then I we
need real PMD's for the switch ports, and not virtual PMD's as in the
Port Representor model.
These real ports will probably require specific switch_ops, which are
different than the NIC (eth_dev_ops). At least few additional on top
of NIC ops.
We will need to support Tx and Rx burst on these switch ports and
configure ACL's, flooding mode and more.

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-21  2:35 ` [RFC v2 3/5] ether: Add flow timeout support Qi Zhang
@ 2017-12-21 13:59   ` Alex Rosenbaum
  2017-12-22  9:03     ` Zhang, Qi Z
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-21 13:59 UTC (permalink / raw)
  To: Qi Zhang; +Cc: adrien.mazarguil, DPDK, Declan Doherty

On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> Add new APIs to support flow timeout, application is able to
> 1. Setup the time duration of a flow, the flow is expected to be deleted
> automatically when timeout.

Can you explain how the application (OVS) is expected to use this API?
It will help to better understand the motivation here...

Are you trying to move the aging timer from application code into the
PMD? or can your HW remove/disable/inactivate a flow at certain time
semantics without software context?

I would prefer to have the aging timer logic in a centralized
location, leek the application itself or some DPDK library. instead of
having each PMD implement its own software timers.


> 3. Register a callback function when a flow is deleted due to timeout.

Is the application 'struct rte_flow*' handle really deleted? or the
flow was removed from HW, just in-active at this time?

Can a flow be re-activated? or does this require a call to
rte_flow_destory() and ret_flow_create()?

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain
  2017-12-21 12:37   ` Alex Rosenbaum
@ 2017-12-22  8:20     ` Zhang, Qi Z
  2017-12-22 22:10       ` Alex Rosenbaum
  0 siblings, 1 reply; 17+ messages in thread
From: Zhang, Qi Z @ 2017-12-22  8:20 UTC (permalink / raw)
  To: Alex Rosenbaum; +Cc: adrien.mazarguil, DPDK, Doherty, Declan

Hi Alex

> -----Original Message-----
> From: Alex Rosenbaum [mailto:rosenbaumalex@gmail.com]
> Sent: Thursday, December 21, 2017 8:37 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
> <declan.doherty@intel.com>
> Subject: Re: [dpdk-dev] [RFC v2 1/5] ether: add flow action to redirect packet
> in a switch domain
> 
> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> > Add action RTE_FLOW_ACTION_TYPE_SWITCH_PORT, it can be used to
> > redirect
> 
> I guess the word "SWITCH" should be remove from commit message. you
> don't use it later in the patch.

Yes, it should be corrected.
> 
> 
> >
> > +Action: ``PORT``
> > +^^^^^^^^^^^^^^^^
> > +
> > +Redirect packets to an interface that connect to the same switch domain.
> > +
> > +The destination should be managed by a rte_ethdev instance, port_id
> > +is the identification of the destination. A typical use case is to
> > +define a flow that redirect packet to an interface that managed by a
> > +Port Representor.
> 
> 
> A verbs would be better suited for an ACTION_TYPE. while ".._TYPE_PORT" is
> a nous.
> Probably ".._TYPE_REDIRECT" would better fit here.
> See man tc-mirred as referance:
> http://man7.org/linux/man-pages/man8/tc-mirred.8.html

I agree it will be better to use verbs for action, so we can have TYPE_REDIRECT_TO_PORT/VF/PF...,
But since we already have ACTION_TYPE_VF, ACTION_TYPE_PF ...
Maybe it's better just to follow the same pattern?

> 
> Do we want to distinguish between different destination type?
> The target might be a port (port_id) or potencial other destinations/queue.
> So maybe use ".._TYPE_REDIRECT_TO_PORT"?
> 
> Anyway, I think you should remove the "same switch domain" from docs
> since there is no switch domain yet in DPDK.
> Lets let the PMD decided if this sucessed or fails, based on the target type
> and other HW limitations. Not just based on switch domain.

Yes, it's not necessary to be specific here, the new action is just add the semantic
to support packet redirect between port that managed by etherdevs, device driver
can figure out the way or just reject it.
I will capture this in v3.
> 
> PS: I agree switch domain needs to be introduced. I don't think port
> representor is the correct direction.

OK, thanks for your sharing, I think this can be discussed more on Port Representor mail list

> 
> Alex

Thanks
Qi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-21 13:59   ` Alex Rosenbaum
@ 2017-12-22  9:03     ` Zhang, Qi Z
  2017-12-22 14:06       ` Wiles, Keith
  2017-12-22 22:26       ` Alex Rosenbaum
  0 siblings, 2 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2017-12-22  9:03 UTC (permalink / raw)
  To: Alex Rosenbaum; +Cc: adrien.mazarguil, DPDK, Doherty, Declan

Alex:

> -----Original Message-----
> From: Alex Rosenbaum [mailto:rosenbaumalex@gmail.com]
> Sent: Thursday, December 21, 2017 9:59 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
> <declan.doherty@intel.com>
> Subject: Re: [dpdk-dev] [RFC v2 3/5] ether: Add flow timeout support
> 
> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> > Add new APIs to support flow timeout, application is able to 1. Setup
> > the time duration of a flow, the flow is expected to be deleted
> > automatically when timeout.
> 
> Can you explain how the application (OVS) is expected to use this API?
> It will help to better understand the motivation here...

I think the purpose of the APIs is to expose the hardware feature that support
flow auto delete with a timeout.
As I know, for OVS, every flow in flow table will have time duration
A flow be offloaded to hardware is still required to be deleted in specific time, 
I think these APIs help OVS to take advantage HW feature and simplify the flow
aging management

> 
> Are you trying to move the aging timer from application code into the PMD?
> or can your HW remove/disable/inactivate a flow at certain time semantics
> without software context?

Yes, it for hardware feature.

> 
> I would prefer to have the aging timer logic in a centralized location, leek the
> application itself or some DPDK library. instead of having each PMD
> implement its own software timers.
> 
> 
> > 3. Register a callback function when a flow is deleted due to timeout.
> 
> Is the application 'struct rte_flow*' handle really deleted? or the flow was
> removed from HW, just in-active at this time?

Here the flow is deleted, same thing happen as rte_flow_destroy and we need to call
rte_flow_create to re-enable the flow. 
I will add more explanation to avoid confusion in next release.

> 
> Can a flow be re-activated? or does this require a call to
> rte_flow_destory() and ret_flow_create()?
> 
> Alex

Thanks
Qi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-22  9:03     ` Zhang, Qi Z
@ 2017-12-22 14:06       ` Wiles, Keith
  2018-01-14  2:03         ` Zhang, Qi Z
  2017-12-22 22:26       ` Alex Rosenbaum
  1 sibling, 1 reply; 17+ messages in thread
From: Wiles, Keith @ 2017-12-22 14:06 UTC (permalink / raw)
  To: Zhang, Qi Z; +Cc: Alex Rosenbaum, adrien.mazarguil, DPDK, Doherty, Declan



> On Dec 22, 2017, at 3:03 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
> 
> Alex:
> 
>> -----Original Message-----
>> From: Alex Rosenbaum [mailto:rosenbaumalex@gmail.com]
>> Sent: Thursday, December 21, 2017 9:59 PM
>> To: Zhang, Qi Z <qi.z.zhang@intel.com>
>> Cc: adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
>> <declan.doherty@intel.com>
>> Subject: Re: [dpdk-dev] [RFC v2 3/5] ether: Add flow timeout support
>> 
>> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
>>> Add new APIs to support flow timeout, application is able to 1. Setup
>>> the time duration of a flow, the flow is expected to be deleted
>>> automatically when timeout.
>> 
>> Can you explain how the application (OVS) is expected to use this API?
>> It will help to better understand the motivation here...
> 
> I think the purpose of the APIs is to expose the hardware feature that support
> flow auto delete with a timeout.
> As I know, for OVS, every flow in flow table will have time duration
> A flow be offloaded to hardware is still required to be deleted in specific time, 
> I think these APIs help OVS to take advantage HW feature and simplify the flow
> aging management
> 
>> 
>> Are you trying to move the aging timer from application code into the PMD?
>> or can your HW remove/disable/inactivate a flow at certain time semantics
>> without software context?
> 
> Yes, it for hardware feature.

We also need to support a software timeout feature here and not just a hardware one. The reason is to make the APIs consistent across all hardware. If you are going to include hardware timeout then we need to add software supported timeout at the same time IMO.

> 
>> 
>> I would prefer to have the aging timer logic in a centralized location, leek the
>> application itself or some DPDK library. instead of having each PMD
>> implement its own software timers.
>> 
>> 
>>> 3. Register a callback function when a flow is deleted due to timeout.
>> 
>> Is the application 'struct rte_flow*' handle really deleted? or the flow was
>> removed from HW, just in-active at this time?
> 
> Here the flow is deleted, same thing happen as rte_flow_destroy and we need to call
> rte_flow_create to re-enable the flow. 
> I will add more explanation to avoid confusion in next release.

Sorry, I little late into this thread, but we can not have 1000 callbacks for each timeout and we need make sure we bunch up a number of timeouts at a time to make the feature more performant IMO. Maybe that discussed or address in the code.

> 
>> 
>> Can a flow be re-activated? or does this require a call to
>> rte_flow_destory() and ret_flow_create()?
>> 
>> Alex
> 
> Thanks
> Qi

Regards,
Keith

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain
  2017-12-22  8:20     ` Zhang, Qi Z
@ 2017-12-22 22:10       ` Alex Rosenbaum
  0 siblings, 0 replies; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-22 22:10 UTC (permalink / raw)
  To: Zhang, Qi Z, Adrien Mazarguil; +Cc: DPDK, Doherty, Declan

+Adrien

On Fri, Dec 22, 2017 at 10:20 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
>> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
>> > Add action RTE_FLOW_ACTION_TYPE_SWITCH_PORT, it can be used to
>> > redirect

>> A verbs would be better suited for an ACTION_TYPE. while ".._TYPE_PORT" is
>> a nous.
>> Probably ".._TYPE_REDIRECT" would better fit here.
>
> I agree it will be better to use verbs for action, so we can have TYPE_REDIRECT_TO_PORT/VF/PF...,
> But since we already have ACTION_TYPE_VF, ACTION_TYPE_PF ...
> Maybe it's better just to follow the same pattern?

hemmm, missed these ACTION_TYPE_VF/PF...
Adrien, what do you think about these naming conventions?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-22  9:03     ` Zhang, Qi Z
  2017-12-22 14:06       ` Wiles, Keith
@ 2017-12-22 22:26       ` Alex Rosenbaum
  2017-12-26  3:28         ` Zhang, Qi Z
  1 sibling, 1 reply; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-22 22:26 UTC (permalink / raw)
  To: Zhang, Qi Z; +Cc: adrien.mazarguil, DPDK, Doherty, Declan

On Fri, Dec 22, 2017 at 11:03 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
>> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
>> > Add new APIs to support flow timeout, application is able to 1. Setup
>> > the time duration of a flow, the flow is expected to be deleted
>> > automatically when timeout.
>>
>> Can you explain how the application (OVS) is expected to use this API?
>> It will help to better understand the motivation here...
>
> I think the purpose of the APIs is to expose the hardware feature that support
> flow auto delete with a timeout.
> As I know, for OVS, every flow in flow table will have time duration
> A flow be offloaded to hardware is still required to be deleted in specific time,
> I think these APIs help OVS to take advantage HW feature and simplify the flow
> aging management

Are you sure this will allow OVS to 'fire-and-forget' about the rule removal?
or will OVS anyway do rule cleanup from application tables?

Do you know if OVS flow timers are (or can be) re-armed in different
use cases? e.g. extending the timeout duration if traffic is still
flowing?



>> Are you trying to move the aging timer from application code into the PMD?
>> or can your HW remove/disable/inactivate a flow at certain time semantics
>> without software context?
>
> Yes, it for hardware feature.

So if the hardware auto removes the hardware steering entry, what
software part deletes the rte_flow handle?
What software part triggers the application callback? from what
context? will locks be required?

How do you prevent races between application thread and the context
deleting/accessing the rte_flow handle?
I mean in cases that application wants to delete the flow before the
timeout expires, but actually it is same time hardware deletes it.

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-22 22:26       ` Alex Rosenbaum
@ 2017-12-26  3:28         ` Zhang, Qi Z
  2017-12-26  7:44           ` Alex Rosenbaum
  0 siblings, 1 reply; 17+ messages in thread
From: Zhang, Qi Z @ 2017-12-26  3:28 UTC (permalink / raw)
  To: Alex Rosenbaum; +Cc: adrien.mazarguil, DPDK, Doherty, Declan, Chandran, Sugesh

Hi Alex:

> -----Original Message-----
> From: Alex Rosenbaum [mailto:rosenbaumalex@gmail.com]
> Sent: Saturday, December 23, 2017 6:27 AM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
> <declan.doherty@intel.com>
> Subject: Re: [dpdk-dev] [RFC v2 3/5] ether: Add flow timeout support
> 
> On Fri, Dec 22, 2017 at 11:03 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
> >> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> >> > Add new APIs to support flow timeout, application is able to 1.
> >> > Setup the time duration of a flow, the flow is expected to be
> >> > deleted automatically when timeout.
> >>
> >> Can you explain how the application (OVS) is expected to use this API?
> >> It will help to better understand the motivation here...
> >
> > I think the purpose of the APIs is to expose the hardware feature that
> > support flow auto delete with a timeout.
> > As I know, for OVS, every flow in flow table will have time duration A
> > flow be offloaded to hardware is still required to be deleted in
> > specific time, I think these APIs help OVS to take advantage HW
> > feature and simplify the flow aging management
> 
> Are you sure this will allow OVS to 'fire-and-forget' about the rule removal?
> or will OVS anyway do rule cleanup from application tables?

There is some framework design about offload flow management on OVS side. 
Since I'm not a OVS guy, I can't answer OVS specific question precisely right now,
but the feedback I got is, it will be nice if rte_flow could support flow timeout 
I may check with some OVS expert to give further explanation.
BTW, I think there is no harmful to add these APIs into rte_flow, since a flow timeout is quite 
generic feature to me. it may be useful even for non-OVS case in future.

> 
> Do you know if OVS flow timers are (or can be) re-armed in different use
> cases? e.g. extending the timeout duration if traffic is still flowing?

As I know, for OVS every flow just has a fixed time duration, so "hard_timeout"
is going for this requirement, but by following OpenFlow spec, idle_timeout is paired 
with hard_timeout so I just add it since its generic and maybe useful for future.
> 
> 
> >> Are you trying to move the aging timer from application code into the
> PMD?
> >> or can your HW remove/disable/inactivate a flow at certain time
> >> semantics without software context?
> >
> > Yes, it for hardware feature.
> 
> So if the hardware auto removes the hardware steering entry, what software
> part deletes the rte_flow handle?
> What software part triggers the application callback? from what context? will
> locks be required? 
> How do you prevent races between application thread and the context
> deleting/accessing the rte_flow handle?
> I mean in cases that application wants to delete the flow before the timeout
> expires, but actually it is same time hardware deletes it.

Usually the flow auto delete is running on a separate background thread
(an interrupt handler or a watchdog thread base on hardware capability)
The low level driver is responsible to take care of the race condition between background and foreground flow deleting.
For application, it should be aware that the callback function is running on a separate thread, so it is also required to 
take care of race condition if it will access some data that is shared by foreground thread.

> 
> Alex

Regards
Qi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-26  3:28         ` Zhang, Qi Z
@ 2017-12-26  7:44           ` Alex Rosenbaum
  0 siblings, 0 replies; 17+ messages in thread
From: Alex Rosenbaum @ 2017-12-26  7:44 UTC (permalink / raw)
  To: Zhang, Qi Z; +Cc: adrien.mazarguil, DPDK, Doherty, Declan, Chandran, Sugesh

On Tue, Dec 26, 2017 at 5:28 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
>> On Fri, Dec 22, 2017 at 11:03 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
>> >> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
>> >> > Add new APIs to support flow timeout, application is able to 1.
>> >> > Setup the time duration of a flow, the flow is expected to be
>> >> > deleted automatically when timeout.
>> >>
>> >> Can you explain how the application (OVS) is expected to use this API?
>> >> It will help to better understand the motivation here...
>> >
>> > I think the purpose of the APIs is to expose the hardware feature that
>> > support flow auto delete with a timeout.
>> > As I know, for OVS, every flow in flow table will have time duration A
>> > flow be offloaded to hardware is still required to be deleted in
>> > specific time, I think these APIs help OVS to take advantage HW
>> > feature and simplify the flow aging management
>>
>> Are you sure this will allow OVS to 'fire-and-forget' about the rule removal?
>> or will OVS anyway do rule cleanup from application tables?
>
> There is some framework design about offload flow management on OVS side.
> Since I'm not a OVS guy, I can't answer OVS specific question precisely right now,
> but the feedback I got is, it will be nice if rte_flow could support flow timeout
> I may check with some OVS expert to give further explanation.
> BTW, I think there is no harmful to add these APIs into rte_flow, since a flow timeout is quite
> generic feature to me. it may be useful even for non-OVS case in future.

I'm not a core OVS guy ether :) but adding a feature to DPDK because
it "might be nice" and not a "real benefit" does not sound like a
right approach to me. Each new feature will add extra
work/support/bugs for something which might not really used. I think
it is critical that OVS guys provide you a clear evidence how this
will help.
And we need to try and make this generic so other then OVS application
can use this. e.g. adding a re-arm to the timeout.


>> Do you know if OVS flow timers are (or can be) re-armed in different use
>> cases? e.g. extending the timeout duration if traffic is still flowing?
>
> As I know, for OVS every flow just has a fixed time duration, so "hard_timeout"
> is going for this requirement, but by following OpenFlow spec, idle_timeout is paired
> with hard_timeout so I just add it since its generic and maybe useful for future.

Yes, I also heard OF does issue a hard timeout value.
But I also understood from OVS guys that OVS will manage the timeout
internally. OVS will do traffic sampling for activity before deleting
the flow. so that the timeout value needs to be updated constantly
depending on connection state information (re you other patch exposing
the last_seen_timeout).
So the current suggestion, based on hard timeout, will not fit OVS. At
least as I understood from other OVS guys.
You can see that Kernel OVS guys did not add this to TC flower offload
support ether. I am sure they would have if it would have improved
performance.
That will be a fame if we miss the target of the feature.


>> >> Are you trying to move the aging timer from application code into the
>> PMD?
>> >> or can your HW remove/disable/inactivate a flow at certain time
>> >> semantics without software context?
>> >
>> > Yes, it for hardware feature.
>>
>> So if the hardware auto removes the hardware steering entry, what software
>> part deletes the rte_flow handle?
>> What software part triggers the application callback? from what context? will
>> locks be required?
>> How do you prevent races between application thread and the context
>> deleting/accessing the rte_flow handle?
>> I mean in cases that application wants to delete the flow before the timeout
>> expires, but actually it is same time hardware deletes it.
>
> Usually the flow auto delete is running on a separate background thread
> (an interrupt handler or a watchdog thread base on hardware capability)
> The low level driver is responsible to take care of the race condition between background and foreground flow deleting.

Please explain in details how the race is solved. Maybe a patch will
make this clearer? This is one of the main objection for this feature.
e.g.: If the application holds a rte_flow handle, and the timeout
expired, is not handled yet. Now the application will try to use the
rte_flow which state changed underneath, or even got deleted.

Besides you now require each and every low level driver to add this
same timeout expiration logic, alarm registration and race protection.
This should be done once for all PMD's.
PS: Will this callback thread clear the HW rule as well?

> For application, it should be aware

Need to make this clear in API definitions

> that the callback function is running on a separate thread, so it is also required to
> take care of race condition if it will access some data that is shared by foreground thread.

This means locks in application? or some lock-less async model?
Isn't it simpler to let the application (OVS) delete the rte_flow from
it's own thread, save locks, allow timeout updates according traffic
counter.

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC v2 3/5] ether: Add flow timeout support
  2017-12-22 14:06       ` Wiles, Keith
@ 2018-01-14  2:03         ` Zhang, Qi Z
  0 siblings, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2018-01-14  2:03 UTC (permalink / raw)
  To: Wiles, Keith, Alex Rosenbaum; +Cc: adrien.mazarguil, DPDK, Doherty, Declan

Hi Alex & Keith
	
	Base on my further understanding about OVS requirement and the new device's capability.
	I realize there is no strong point to have the timeout APIs from this patch, I'd like to withdraw it.
	Thanks for all your comments that help me to think it over.

Regards
Qi

> -----Original Message-----
> From: Wiles, Keith
> Sent: Friday, December 22, 2017 10:06 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: Alex Rosenbaum <rosenbaumalex@gmail.com>;
> adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
> <declan.doherty@intel.com>
> Subject: Re: [dpdk-dev] [RFC v2 3/5] ether: Add flow timeout support
> 
> 
> 
> > On Dec 22, 2017, at 3:03 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:
> >
> > Alex:
> >
> >> -----Original Message-----
> >> From: Alex Rosenbaum [mailto:rosenbaumalex@gmail.com]
> >> Sent: Thursday, December 21, 2017 9:59 PM
> >> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> >> Cc: adrien.mazarguil@6wind.com; DPDK <dev@dpdk.org>; Doherty, Declan
> >> <declan.doherty@intel.com>
> >> Subject: Re: [dpdk-dev] [RFC v2 3/5] ether: Add flow timeout support
> >>
> >> On Thu, Dec 21, 2017 at 4:35 AM, Qi Zhang <qi.z.zhang@intel.com> wrote:
> >>> Add new APIs to support flow timeout, application is able to 1.
> >>> Setup the time duration of a flow, the flow is expected to be
> >>> deleted automatically when timeout.
> >>
> >> Can you explain how the application (OVS) is expected to use this API?
> >> It will help to better understand the motivation here...
> >
> > I think the purpose of the APIs is to expose the hardware feature that
> > support flow auto delete with a timeout.
> > As I know, for OVS, every flow in flow table will have time duration A
> > flow be offloaded to hardware is still required to be deleted in
> > specific time, I think these APIs help OVS to take advantage HW
> > feature and simplify the flow aging management
> >
> >>
> >> Are you trying to move the aging timer from application code into the PMD?
> >> or can your HW remove/disable/inactivate a flow at certain time
> >> semantics without software context?
> >
> > Yes, it for hardware feature.
> 
> We also need to support a software timeout feature here and not just a
> hardware one. The reason is to make the APIs consistent across all hardware. If
> you are going to include hardware timeout then we need to add software
> supported timeout at the same time IMO.
> 
> >
> >>
> >> I would prefer to have the aging timer logic in a centralized
> >> location, leek the application itself or some DPDK library. instead
> >> of having each PMD implement its own software timers.
> >>
> >>
> >>> 3. Register a callback function when a flow is deleted due to timeout.
> >>
> >> Is the application 'struct rte_flow*' handle really deleted? or the
> >> flow was removed from HW, just in-active at this time?
> >
> > Here the flow is deleted, same thing happen as rte_flow_destroy and we
> > need to call rte_flow_create to re-enable the flow.
> > I will add more explanation to avoid confusion in next release.
> 
> Sorry, I little late into this thread, but we can not have 1000 callbacks for each
> timeout and we need make sure we bunch up a number of timeouts at a time to
> make the feature more performant IMO. Maybe that discussed or address in
> the code.
> 
> >
> >>
> >> Can a flow be re-activated? or does this require a call to
> >> rte_flow_destory() and ret_flow_create()?
> >>
> >> Alex
> >
> > Thanks
> > Qi
> 
> Regards,
> Keith

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-01-14  2:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-21  2:35 [RFC v2 0/5] rte_flow extension for vSwitch acceleration Qi Zhang
2017-12-21  2:35 ` [RFC v2 1/5] ether: add flow action to redirect packet in a switch domain Qi Zhang
2017-12-21 12:37   ` Alex Rosenbaum
2017-12-22  8:20     ` Zhang, Qi Z
2017-12-22 22:10       ` Alex Rosenbaum
2017-12-21  2:35 ` [RFC v2 2/5] ether: add flow last hit query support Qi Zhang
2017-12-21  2:35 ` [RFC v2 3/5] ether: Add flow timeout support Qi Zhang
2017-12-21 13:59   ` Alex Rosenbaum
2017-12-22  9:03     ` Zhang, Qi Z
2017-12-22 14:06       ` Wiles, Keith
2018-01-14  2:03         ` Zhang, Qi Z
2017-12-22 22:26       ` Alex Rosenbaum
2017-12-26  3:28         ` Zhang, Qi Z
2017-12-26  7:44           ` Alex Rosenbaum
2017-12-21  2:35 ` [RFC v2 4/5] ether: add more protocol support in rte_flow Qi Zhang
2017-12-21  2:35 ` [RFC v2 5/5] ether: add packet modification aciton " Qi Zhang
2017-12-21 13:01 ` [RFC v2 0/5] rte_flow extension for vSwitch acceleration Alex Rosenbaum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.