* [PATCH 00/10] cxgbe: Add flow director support
@ 2016-02-03  8:32 Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
                   ` (11 more replies)
  0 siblings, 12 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

This series of patches extends the flow director filter and adds support
for Chelsio T5 hardware filtering capabilities.

Chelsio T5 can filter packets in hardware and supports 3 actions on a
packet that hits a filter, viz.

1. Action Pass - Packets hitting a filter rule can be directed to a
   particular RXQ.

2. Action Drop - Packets hitting a filter rule are dropped in h/w.

3. Action Switch - Packets hitting a filter rule can be switched in h/w
   from one port to another, without involvement of the host.  The
   Switch action also supports rewriting src-mac/dst-mac headers and
   vlan headers.  It can rewrite IP headers as well, and thereby
   supports NAT (Network Address Translation) in h/w.

Each filter rule can also optionally specify a mask value, i.e. it's
possible to create a filter rule for an entire subnet of IP addresses
or a range of tcp/udp ports, etc.
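
To make the mask semantics concrete, here is a minimal sketch of the
comparison a masked rule implies: only the bits set in the mask take part
in the match.  The helper names (ipv4, prefix_mask, masked_match) are
hypothetical and are not part of this series or of the cxgbe driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a dotted quad into host byte order. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Hypothetical helper: netmask for a prefix length between 0 and 32. */
static uint32_t prefix_mask(unsigned int prefix_len)
{
	assert(prefix_len <= 32);
	if (prefix_len == 0)
		return 0;
	return (uint32_t)(UINT32_MAX << (32 - prefix_len));
}

/* A field matches a rule when both agree on every bit set in the mask. */
static bool masked_match(uint32_t field, uint32_t value, uint32_t mask)
{
	return (field & mask) == (value & mask);
}
```

With value 102.1.2.0 and prefix_mask(24), any address in 102.1.2.0/24
matches; a mask of 0 matches every packet.  Note that a single
value/mask pair can only express ranges aligned to a power of two, so
an arbitrary port range may need several rules.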

Patch 1 does the following:
- Adds a new flow RTE_ETH_FLOW_RAW_PKT to allow specifying a generic
  flow.
- Adds an additional generic array to rte_eth_fdir_flow to allow
  specifying generic flow input.
- Adds an additional mask for the flow input to allow a range of values
  to be matched in the flow input.
- Adds a new behavior 'switch'.
- Adds a generic array to hold behavior arguments that can be passed
  when a particular behavior is taken. For example, in case of action
  'switch', pass an additional 4-tuple to allow rewriting src/dst ip and
  port addresses to support NAT'ing.
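
As a rough illustration of the generic flow input and its mask, the
sketch below fills a raw-packet flow and a matching mask.  It uses
stand-in definitions (the demo_* and DEMO_* names are hypothetical)
that only mirror the fields and constants added by patch 1; the real
structure is struct rte_eth_fdir_input, and the byte layout a driver
expects inside the raw array is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FLOW_RAW_PKT         18  /* mirrors RTE_ETH_FLOW_RAW_PKT */
#define DEMO_RAW_PKT_FLOW_MAX_LEN 256 /* mirrors RTE_ETH_RAW_PKT_FLOW_MAX_LEN */

/* Stand-in for the parts of struct rte_eth_fdir_input used below. */
struct demo_fdir_input {
	uint16_t flow_type;
	uint8_t flow[DEMO_RAW_PKT_FLOW_MAX_LEN];      /* bytes to match */
	uint8_t flow_mask[DEMO_RAW_PKT_FLOW_MAX_LEN]; /* bits that matter */
};

/*
 * Hypothetical helper: copy match bytes and their mask into the raw
 * flow.  Returns 0 on success, -1 if the input does not fit.
 */
static int demo_set_raw_flow(struct demo_fdir_input *in,
			     const uint8_t *spec, const uint8_t *mask,
			     size_t len)
{
	assert(in != NULL && spec != NULL && mask != NULL);
	if (len > DEMO_RAW_PKT_FLOW_MAX_LEN)
		return -1;

	memset(in, 0, sizeof(*in));
	in->flow_type = DEMO_FLOW_RAW_PKT;
	memcpy(in->flow, spec, len);
	memcpy(in->flow_mask, mask, len);
	return 0;
}
```

Bytes past len stay zero in both arrays, so under the mask semantics
described above they do not take part in the match.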

RFC series of patches and discussion involving these enhancements to the
flow director are available at [1].

Patch 2 adds a command line example app to test cxgbe flow director. It
also adds documentation for the example app.

Patch 3 updates the cxgbe base to add support for packet filtering.

Patch 4 adds control txq for communicating filter info to the firmware.

Patches 5-7 add compressed local ip (CLIP) table, layer 2 table (L2T),
and source mac table (SMT) definitions required for holding info
for matching and executing various operations on matched filters.

Patch 8 adds the LE-TCAM (maskfull) filter support.

Patch 9 adds the HASH (maskless) filter support.

Patch 10 adds and implements the flow director filter operations. It
also adds the documentation.


[1] http://comments.gmane.org/gmane.comp.networking.dpdk.devel/29986

Rahul Lakkireddy (10):
  ethdev: add a generic flow and new behavior switch to fdir
  examples/test-cxgbe-filters: add example to test cxgbe fdir support
  cxgbe: add skeleton to add support for T5 hardware filtering
  cxgbe: add control txq for communicating filtering info
  cxgbe: add compressed local IP table for matching IPv6 addresses
  cxgbe: add layer 2 table for switch action filter
  cxgbe: add source mac table for switch action filter
  cxgbe: add LE-TCAM filtering support
  cxgbe: add HASH filtering support
  cxgbe: add flow director support and update documentation

 MAINTAINERS                                        |    2 +
 doc/guides/nics/cxgbe.rst                          |  166 ++
 doc/guides/rel_notes/release_2_3.rst               |   10 +
 doc/guides/sample_app_ug/index.rst                 |    1 +
 doc/guides/sample_app_ug/test_cxgbe_filters.rst    |  694 +++++++++
 drivers/net/cxgbe/Makefile                         |    6 +
 drivers/net/cxgbe/base/adapter.h                   |  110 ++
 drivers/net/cxgbe/base/common.h                    |   11 +
 drivers/net/cxgbe/base/t4_hw.c                     |   28 +
 drivers/net/cxgbe/base/t4_msg.h                    |  324 ++++
 drivers/net/cxgbe/base/t4_regs.h                   |    9 +
 drivers/net/cxgbe/base/t4_regs_values.h            |   25 +
 drivers/net/cxgbe/base/t4_tcb.h                    |   95 ++
 drivers/net/cxgbe/base/t4fw_interface.h            |  272 ++++
 drivers/net/cxgbe/clip_tbl.c                       |  220 +++
 drivers/net/cxgbe/clip_tbl.h                       |   59 +
 drivers/net/cxgbe/cxgbe.h                          |    4 +
 drivers/net/cxgbe/cxgbe_compat.h                   |   12 +
 drivers/net/cxgbe/cxgbe_ethdev.c                   |   21 +
 drivers/net/cxgbe/cxgbe_fdir.c                     |  715 +++++++++
 drivers/net/cxgbe/cxgbe_fdir.h                     |  108 ++
 drivers/net/cxgbe/cxgbe_filter.c                   | 1614 ++++++++++++++++++++
 drivers/net/cxgbe/cxgbe_filter.h                   |  260 ++++
 drivers/net/cxgbe/cxgbe_main.c                     |  395 ++++-
 drivers/net/cxgbe/cxgbe_ofld.h                     |  126 ++
 drivers/net/cxgbe/l2t.c                            |  261 ++++
 drivers/net/cxgbe/l2t.h                            |   87 ++
 drivers/net/cxgbe/sge.c                            |  202 ++-
 drivers/net/cxgbe/smt.c                            |  275 ++++
 drivers/net/cxgbe/smt.h                            |   76 +
 examples/Makefile                                  |    1 +
 examples/test-cxgbe-filters/Makefile               |   63 +
 examples/test-cxgbe-filters/commands.c             |  429 ++++++
 examples/test-cxgbe-filters/commands.h             |   40 +
 examples/test-cxgbe-filters/config.c               |   79 +
 examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c |  554 +++++++
 examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h     |   79 +
 examples/test-cxgbe-filters/init.c                 |  201 +++
 examples/test-cxgbe-filters/main.c                 |   79 +
 examples/test-cxgbe-filters/main.h                 |   77 +
 examples/test-cxgbe-filters/runtime.c              |   74 +
 lib/librte_ether/rte_eth_ctrl.h                    |   15 +-
 42 files changed, 7874 insertions(+), 5 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/test_cxgbe_filters.rst
 create mode 100644 drivers/net/cxgbe/base/t4_tcb.h
 create mode 100644 drivers/net/cxgbe/clip_tbl.c
 create mode 100644 drivers/net/cxgbe/clip_tbl.h
 create mode 100644 drivers/net/cxgbe/cxgbe_fdir.c
 create mode 100644 drivers/net/cxgbe/cxgbe_fdir.h
 create mode 100644 drivers/net/cxgbe/cxgbe_filter.c
 create mode 100644 drivers/net/cxgbe/cxgbe_filter.h
 create mode 100644 drivers/net/cxgbe/cxgbe_ofld.h
 create mode 100644 drivers/net/cxgbe/l2t.c
 create mode 100644 drivers/net/cxgbe/l2t.h
 create mode 100644 drivers/net/cxgbe/smt.c
 create mode 100644 drivers/net/cxgbe/smt.h
 create mode 100644 examples/test-cxgbe-filters/Makefile
 create mode 100644 examples/test-cxgbe-filters/commands.c
 create mode 100644 examples/test-cxgbe-filters/commands.h
 create mode 100644 examples/test-cxgbe-filters/config.c
 create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c
 create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h
 create mode 100644 examples/test-cxgbe-filters/init.c
 create mode 100644 examples/test-cxgbe-filters/main.c
 create mode 100644 examples/test-cxgbe-filters/main.h
 create mode 100644 examples/test-cxgbe-filters/runtime.c

-- 
2.5.3

* [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-24 14:43   ` Bruce Richardson
  2016-02-25  3:26   ` Wu, Jingjing
  2016-02-03  8:32 ` [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support Rahul Lakkireddy
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a new raw packet flow that allows specifying generic flow input.

Add the ability to provide masks for fields in the flow to allow
matching a range of values.

Add a new behavior switch.

Add the ability to provide behavior arguments to allow rewriting matched
fields with new values. For example, new ip and port addresses can be
provided to rewrite those fields in packets matching a filter rule, to
support NAT'ing.
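
One possible way to carry such arguments is sketched below: a NAT
4-tuple is copied into the behavior argument array together with the
new switch behavior.  The struct layout and the demo_* names are
hypothetical; the patch itself only adds an opaque byte array, so how a
driver interprets its contents is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BEHAVIOR_ARG_MAX_LEN 256 /* mirrors RTE_ETH_BEHAVIOR_ARG_MAX_LEN */
#define DEMO_FDIR_SWITCH          3   /* value RTE_ETH_FDIR_SWITCH takes in the patched enum */

/* Hypothetical argument layout: replacement addresses and ports. */
struct demo_nat_args {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Stand-in for the parts of struct rte_eth_fdir_action used below. */
struct demo_fdir_action {
	int behavior;
	uint8_t behavior_arg[DEMO_BEHAVIOR_ARG_MAX_LEN];
};

/* Select the switch behavior and attach the NAT rewrite values. */
static void demo_set_switch_nat(struct demo_fdir_action *act,
				const struct demo_nat_args *nat)
{
	assert(sizeof(*nat) <= sizeof(act->behavior_arg));
	memset(act, 0, sizeof(*act));
	act->behavior = DEMO_FDIR_SWITCH;
	memcpy(act->behavior_arg, nat, sizeof(*nat));
}
```

The consumer would copy the same struct back out of behavior_arg
before using it, which avoids alignment problems when reading
multi-byte fields from a byte array.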

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 doc/guides/rel_notes/release_2_3.rst |  3 +++
 lib/librte_ether/rte_eth_ctrl.h      | 15 ++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..19ce954 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -39,6 +39,9 @@ API Changes
 ABI Changes
 -----------
 
+* New flow type ``RTE_ETH_FLOW_RAW_PKT`` has been introduced and hence
+  ``RTE_ETH_FLOW_MAX`` has been increased to 19.
+
 
 Shared Library Versions
 -----------------------
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ce224ad..1bc0d03 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -74,7 +74,8 @@ extern "C" {
 #define RTE_ETH_FLOW_IPV6_EX            15
 #define RTE_ETH_FLOW_IPV6_TCP_EX        16
 #define RTE_ETH_FLOW_IPV6_UDP_EX        17
-#define RTE_ETH_FLOW_MAX                18
+#define RTE_ETH_FLOW_RAW_PKT            18
+#define RTE_ETH_FLOW_MAX                19
 
 /**
  * Feature filter types
@@ -499,6 +500,9 @@ struct rte_eth_tunnel_flow {
 	struct ether_addr mac_addr;                /**< Mac address to match. */
 };
 
+/** Max length of raw packet in bytes. */
+#define RTE_ETH_RAW_PKT_FLOW_MAX_LEN 256
+
 /**
  * An union contains the inputs for all types of flow
  */
@@ -514,6 +518,7 @@ union rte_eth_fdir_flow {
 	struct rte_eth_ipv6_flow   ipv6_flow;
 	struct rte_eth_mac_vlan_flow mac_vlan_flow;
 	struct rte_eth_tunnel_flow   tunnel_flow;
+	uint8_t raw_pkt_flow[RTE_ETH_RAW_PKT_FLOW_MAX_LEN];
 };
 
 /**
@@ -534,6 +539,8 @@ struct rte_eth_fdir_input {
 	uint16_t flow_type;
 	union rte_eth_fdir_flow flow;
 	/**< Flow fields to match, dependent on flow_type */
+	union rte_eth_fdir_flow flow_mask;
+	/**< Mask for the fields matched, dependent on flow */
 	struct rte_eth_fdir_flow_ext flow_ext;
 	/**< Additional fields to match */
 };
@@ -545,6 +552,7 @@ enum rte_eth_fdir_behavior {
 	RTE_ETH_FDIR_ACCEPT = 0,
 	RTE_ETH_FDIR_REJECT,
 	RTE_ETH_FDIR_PASSTHRU,
+	RTE_ETH_FDIR_SWITCH,
 };
 
 /**
@@ -558,6 +566,9 @@ enum rte_eth_fdir_status {
 	RTE_ETH_FDIR_REPORT_FLEX_8,        /**< Report 8 flex bytes. */
 };
 
+/** Max length of behavior arguments in bytes. */
+#define RTE_ETH_BEHAVIOR_ARG_MAX_LEN 256
+
 /**
  * A structure used to define an action when match FDIR packet filter.
  */
@@ -569,6 +580,8 @@ struct rte_eth_fdir_action {
 	/**< If report_status is RTE_ETH_FDIR_REPORT_ID_FLEX_4 or
 	     RTE_ETH_FDIR_REPORT_FLEX_8, flex_off specifies where the reported
 	     flex bytes start from in flexible payload. */
+	uint8_t behavior_arg[RTE_ETH_BEHAVIOR_ARG_MAX_LEN];
+	/**< Extra arguments for behavior taken */
 };
 
 /**
-- 
2.5.3

* [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-24 14:40   ` Bruce Richardson
  2016-02-03  8:32 ` [PATCH 03/10] cxgbe: add skeleton to add support for T5 hardware filtering Rahul Lakkireddy
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a new test_cxgbe_filters command line example to test support for
Chelsio T5 hardware filtering. It shows how to pass the Chelsio input
flow and input masks, and how to pass extra behavior arguments to
rewrite fields in packets that match a filter rule.

Also add documentation and update MAINTAINERS.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 MAINTAINERS                                        |   2 +
 doc/guides/sample_app_ug/index.rst                 |   1 +
 doc/guides/sample_app_ug/test_cxgbe_filters.rst    | 694 +++++++++++++++++++++
 examples/Makefile                                  |   1 +
 examples/test-cxgbe-filters/Makefile               |  63 ++
 examples/test-cxgbe-filters/commands.c             | 429 +++++++++++++
 examples/test-cxgbe-filters/commands.h             |  40 ++
 examples/test-cxgbe-filters/config.c               |  79 +++
 examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c | 554 ++++++++++++++++
 examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h     |  79 +++
 examples/test-cxgbe-filters/init.c                 | 201 ++++++
 examples/test-cxgbe-filters/main.c                 |  79 +++
 examples/test-cxgbe-filters/main.h                 |  77 +++
 examples/test-cxgbe-filters/runtime.c              |  74 +++
 14 files changed, 2373 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/test_cxgbe_filters.rst
 create mode 100644 examples/test-cxgbe-filters/Makefile
 create mode 100644 examples/test-cxgbe-filters/commands.c
 create mode 100644 examples/test-cxgbe-filters/commands.h
 create mode 100644 examples/test-cxgbe-filters/config.c
 create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c
 create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h
 create mode 100644 examples/test-cxgbe-filters/init.c
 create mode 100644 examples/test-cxgbe-filters/main.c
 create mode 100644 examples/test-cxgbe-filters/main.h
 create mode 100644 examples/test-cxgbe-filters/runtime.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b90aeea..1785b02 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -261,6 +261,8 @@ Chelsio cxgbe
 M: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
 F: drivers/net/cxgbe/
 F: doc/guides/nics/cxgbe.rst
+F: examples/test-cxgbe-filters/
+F: doc/guides/sample_app_ug/test_cxgbe_filters.rst
 
 Cisco enic
 M: John Daley <johndale@cisco.com>
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 8a646dd..fdc0340 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -73,6 +73,7 @@ Sample Applications User Guide
     proc_info
     ptpclient
     performance_thread
+    test_cxgbe_filters
 
 **Figures**
 
diff --git a/doc/guides/sample_app_ug/test_cxgbe_filters.rst b/doc/guides/sample_app_ug/test_cxgbe_filters.rst
new file mode 100644
index 0000000..9012a58
--- /dev/null
+++ b/doc/guides/sample_app_ug/test_cxgbe_filters.rst
@@ -0,0 +1,694 @@
+..  BSD LICENSE
+    Copyright 2015-2016 Chelsio Communications.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Chelsio Communications nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Test CXGBE Filters Application
+==============================
+
+The test cxgbe filters application provides a command line interface to
+test Chelsio NIC packet classification and filtering features available
+in hardware.
+
+Overview
+--------
+
+Chelsio T5 NICs support packet classification and filtering in hardware.
+This feature can be used in the ingress path to:
+
+- Steer ingress packets that meet ACL (Access Control List) accept criteria
+  to a particular receive queue.
+
+- Switch (proxy) ingress packets that meet ACL accept criteria to an output
+  port, with optional header rewrite.
+
+- Drop ingress packets that fail ACL accept criteria.
+
+There are two types of filters that can be set, namely LE-TCAM (Maskfull)
+filters and HASH (Maskless) filters.  LE-TCAM filters accept a mask for each
+field in the accept criteria, so a rule can match a range of values;
+HASH filters ignore masks and hence enforce stricter, exact-match accept
+criteria.
+
+The fields that can be specified for the accept criteria are based on the
+filter selection combination set in the firmware configuration (t5-config.txt)
+file flashed. Please see *CXGBE Poll Mode Driver NIC Guide* for instructions
+on how to flash the firmware configuration file onto Chelsio NICs.
+
+By default, the selection combination automatically includes source/
+destination IPV4/IPV6 address, and source/destination layer 4 port
+addresses.  In addition to the above, more combinations can be added by
+modifying the t5-config.txt firmware configuration file.
+
+For example, consider the following combination that has been set in
+t5-config.txt:
+
+.. code-block:: console
+
+   filterMode = ethertype, protocol, tos, vlan, port
+   filterMask = ethertype, protocol, tos, vlan, port
+
+In the above example, in addition to source/destination IPV4/IPV6
+addresses and layer 4 source/destination port addresses, a packet can also
+be matched against ethertype field set in the ethernet header, IP protocol
+and tos field set in the IP header, inner VLAN tag, and physical ingress
+port number, respectively.
+
+You can create 496 LE-TCAM filters and ~0.5 million HASH filter rules.
+For more information, please visit `Chelsio Communications Official Website
+<http://www.chelsio.com>`_.
+
+Compiling the Application
+-------------------------
+
+To compile the application:
+
+#. Turn on command line library in the corresponding config/common_*
+   configuration file. For example, for x86_64-native-linuxapp-gcc target,
+   enable ``CONFIG_RTE_LIBRTE_CMDLINE`` in config/common_linuxapp as follows:
+
+   .. code-block:: console
+
+      CONFIG_RTE_LIBRTE_CMDLINE=y
+
+#. Go to the sample application directory:
+
+   .. code-block:: console
+
+      export RTE_SDK=/path/to/rte_sdk
+      cd ${RTE_SDK}/examples/test-cxgbe-filters
+
+#. Set the target (a default target is used if not specified). For example:
+
+   .. code-block:: console
+
+      export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+   See the *DPDK Getting Started* Guide for possible ``RTE_TARGET`` values.
+
+#. Build the application as follows:
+
+   .. code-block:: console
+
+      make
+
+Running the Application
+-----------------------
+
+Ensure that the Chelsio NICs are bound to DPDK, and run the application
+from the build directory as follows:
+
+.. code-block:: console
+
+   ./build/test_cxgbe_filters
+
+When successful, a command line prompt appears as shown below:
+
+.. code-block:: console
+
+   [...]
+   PMD: rte_cxgbe_pmd:  0000:04:00.4 Chelsio rev 0 1000/10GBASE-SFP
+   PMD: rte_cxgbe_pmd:  0000:04:00.4 Chelsio rev 0 1000/10GBASE-SFP
+   USER1: Initializing NIC port 0 ...
+   Port 0: 00:07:43:2F:08:60
+   USER1: Initializing NIC port 1 ...
+   Port 1: 00:07:43:2F:08:68
+   PMD: rte_cxgbe_pmd: Port0: passive DA port module inserted
+   PMD: rte_cxgbe_pmd: Port1: passive DA port module inserted
+   USER1: Port 0 (10 Gbps) UP
+   USER1: Port 1 (10 Gbps) UP
+   USER1: Core 1 is doing RX for port 0
+   USER1: Core 2 is doing RX for port 1
+   cxgbe>
+
+There are a number of commands available.  Run **help** to get the command list.
+
+.. code-block:: console
+
+   cxgbe> help
+
+   Help:
+   -----
+
+   show (stats|port|fdir) (port_id)
+       Display information for port_id.
+
+   show_all (stats|port)
+       Display information for all ports.
+
+   clear stats (port_id)
+       Clear information for port_id.
+
+   clear_all stats
+       Clear information for all ports.
+
+   quit
+       Quit to prompt.
+
+   filter (port_id) (add|del) (ipv4|ipv6)
+    mode (maskfull|maskless) (no-prio|prio)
+    ingress-port (iport) (iport_mask)
+    ether (ether_type) (ether_type_mask)
+    vlan (inner_vlan) (inner_vlan_mask) (outer_vlan) (outer_vlan_mask)
+    ip (tos) (tos_mask) (proto) (proto_mask)
+    (src_ip_address) (src_ip_mask) (dst_ip_address) (dst_ip_mask)
+    (src_port) (src_port_mask) (dst_port) (dst_port_mask)
+    (drop|fwd|switch) queue (queue_id)
+    (port-none|port-redirect) (egress_port)
+    (ether-none|mac-rewrite|mac-swap) (src_mac) (dst_mac)
+    (vlan-none|vlan-rewrite|vlan-delete) (new_vlan)
+    (nat-none|nat-rewrite) (nat_src_ip) (nat_dst_ip)
+    (nat_src_port) (nat_dst_port)
+    fd_id (fd_id_value)
+       Add/Del a cxgbe flow director filter.
+
+Add/Delete Filters
+------------------
+
+The command line to add/delete filters is given below. Note that the
+command is too long to fit on one line and hence is shown wrapped
+at "\\" for display purposes.  At the actual prompt, the command should
+be entered on a single line without the "\\".
+
+  .. code-block:: console
+
+     cxgbe> filter (port_id) (add|del) (ipv4|ipv6) \
+            mode (maskfull|maskless) (no-prio|prio) \
+            ingress-port (iport) (iport_mask) \
+            ether (ether_type) (ether_type_mask) \
+            vlan (inner_vlan) (inner_vlan_mask) \
+            (outer_vlan) (outer_vlan_mask) \
+            ip (tos) (tos_mask) (proto) (proto_mask) \
+            (src_ip_address) (src_ip_mask) \
+            (dst_ip_address) (dst_ip_mask) \
+            (src_port) (src_port_mask) (dst_port) (dst_port_mask) \
+            (drop|fwd|switch) queue (queue_id) \
+            (port-none|port-redirect) (egress_port) \
+            (ether-none|mac-rewrite|mac-swap) (src_mac) (dst_mac) \
+            (vlan-none|vlan-rewrite|vlan-delete) (new_vlan) \
+            (nat-none|nat-rewrite) (nat_src_ip) (nat_dst_ip) \
+            (nat_src_port) (nat_dst_port) \
+            fd_id (fd_id_value)
+
+LE-TCAM (Maskfull) Filters
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For maskfull filters, if a match field and its corresponding mask in
+the accept criteria are both set to 0, then that field is not considered
+when matching the packet.
+
+The **fd_id** value specified for a maskfull filter sets its priority,
+with value 0 having the highest priority.  If a packet matches a filter
+rule with a lower **fd_id**, then its action is executed immediately and
+the remaining filter rules are ignored.
+
+Maskfull IPv6 filters occupy 4 **fd_id** slots and hence must start on a
+4-slot boundary.  IPv4 filters on the other hand occupy only 1 slot.  Thus,
+if a slot is being occupied by an IPv6 filter rule, then an IPv4 filter rule
+cannot be set on the occupied slot.
+
+By default, maskless filter rules have higher priority than maskfull
+filter rules.  If a packet could match both a maskfull and a maskless
+filter rule, the **prio** value can be specified to give the maskfull
+filter rule higher priority than the maskless filter rule.
+
+DROP Filter Example
+^^^^^^^^^^^^^^^^^^^
+
+An example to set a drop maskfull filter is given below:
+
+#. Generate some traffic destined for 102.1.2.0/24 network.  The app's rxq
+   should have successfully received the traffic as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show stats 0
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 1000000
+      tx_good_packets: 0
+      rx_good_bytes: 64000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. To drop all traffic coming for 102.1.2.0/24 network, add a maskfull
+   filter as follows:
+
+   .. code-block:: console
+
+      cxgbe> filter 0 add ipv4 mode maskfull \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 0.0.0.0 0.0.0.0 102.1.2.0 255.255.255.0 0 0 0 0 \
+             drop queue 0 port-none 0 \
+             ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+             fd_id 0
+
+#. Generate the same traffic destined for 102.1.2.0/24 network again.  The
+   traffic would have arrived at the port, but the app's rxq should not
+   have received the traffic; i.e. the traffic has been successfully dropped
+   by the hardware, as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show stats 0
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 2000000
+      tx_good_packets: 0
+      rx_good_bytes: 128000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. Delete the maskfull filter as follows:
+
+   .. code-block:: console
+
+      cxgbe> filter 0 del ipv4 mode maskfull \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0 0 0 0 \
+             drop queue 0 port-none 0 \
+             ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+             fd_id 0
+
+#. Generate the same traffic destined for 102.1.2.0/24 network again.
+   The app's rxq should have successfully received the traffic again
+   as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show stats 0
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 2000000
+
+      Port Extended Stats
+      rx_good_packets: 3000000
+      tx_good_packets: 0
+      rx_good_bytes: 192000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 2000000
+      rx_q0_bytes: 128000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+STEER Filter Example
+^^^^^^^^^^^^^^^^^^^^
+
+An example to set a steering maskfull filter to steer traffic from port
+0 to port 1's rxq is given below:
+
+#. Generate some traffic destined for 102.1.2.0/24 network to port 0.
+   The port 0's rxq should have successfully received the traffic,
+   as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 1000000
+      tx_good_packets: 0
+      rx_good_bytes: 64000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 0
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 0
+      rx_good_bytes: 0
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 0
+      rx_q0_bytes: 0
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. To steer all traffic coming for 102.1.2.0/24 network from port 0 to
+   port 1's rxq, add a maskfull filter as follows:
+
+   .. code-block:: console
+
+      cxgbe> filter 1 add ipv4 mode maskfull \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 0.0.0.0 0.0.0.0 102.1.2.0 255.255.255.0 0 0 0 0 \
+             fwd queue 0 port-none 0 \
+             ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+             fd_id 0
+
+#. Generate the same traffic destined for 102.1.2.0/24 network to port 0
+   again.  The traffic would have arrived at port 0, but it should be
+   steered to port 1's rxq, as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 2000000
+      tx_good_packets: 0
+      rx_good_bytes: 128000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 0
+      rx_good_bytes: 0
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+HASH (Maskless) Filters
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Maskless filters match packets based on the hash generated from the
+fields selected in the accept criteria.  Maskless filters ignore masks
+if specified.  Both IPv4 and IPv6 filter rules occupy only 1 **fd_id**
+slot for maskless filters.
+
+Maskless filters require a special firmware configuration file. Please
+see *CXGBE Poll Mode Driver NIC Guide* for instructions on how to flash
+the firmware configuration file onto Chelsio NICs.
+
+SWITCH Filter Example
+^^^^^^^^^^^^^^^^^^^^^
+
+An example to set a switch maskless filter with source and destination
+mac addresses swapped is given below:
+
+#. Generate some TCP traffic destined for 102.1.2.2 with destination port
+   12865 coming from 102.1.2.1 with source port 12000.  The app's port 0
+   rxq should have successfully received the traffic as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 1000000
+      tx_good_packets: 0
+      rx_good_bytes: 64000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 0
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 0
+      rx_good_bytes: 0
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 0
+      rx_q0_bytes: 0
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. To switch the traffic from port 0 to port 1 and swap the source and
+   destination MAC addresses, add a maskless filter as follows:
+
+   .. code-block:: console
+
+      cxgbe> filter 0 add ipv4 mode maskless \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 102.1.2.1 0.0.0.0 102.1.2.2 0.0.0.0 12000 0 12865 0 \
+             switch queue 0 port-redirect 1 \
+             mac-swap 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+             fd_id 0
+
+#. Generate the same traffic again.  The traffic from port 0 is now
+   switched to port 1 without reaching the app's rx queues, as shown below.
+   If the traffic is captured on the wire, the source and destination MAC
+   addresses should appear swapped.
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 2000000
+      tx_good_packets: 0
+      rx_good_bytes: 128000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 0
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 1000000
+      rx_good_bytes: 0
+      tx_good_bytes: 64000000
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 0
+      rx_q0_bytes: 0
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. To delete the maskless filter, run:
+
+   .. code-block:: console
+
+      cxgbe> filter 0 del ipv4 mode maskless \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0 0 0 0 \
+             fwd queue 0 port-none 0 \
+             ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+             fd_id 0
+
+NAT Offload Filter Example
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following example performs NAT with a maskless switch filter that
+rewrites the destination IP address and port:
+
+#. Generate some TCP traffic destined for 102.1.2.2 with destination port
+   12865 coming from 102.1.2.1 with source port 12000.  The app's port 0
+   rxq should receive the traffic as shown below:
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 1000000
+      tx_good_packets: 0
+      rx_good_bytes: 64000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 0
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 0
+      rx_good_bytes: 0
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 0
+      rx_q0_bytes: 0
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+
+#. To perform NAT and switch the traffic from port 0 to port 1 with the
+   destination IP address and port rewritten to 10.1.1.1 and 14000
+   respectively, add a maskless filter as follows:
+
+   .. code-block:: console
+
+      cxgbe> filter 0 add ipv4 mode maskless \
+             no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+             ip 0 0 0 0 102.1.2.1 0.0.0.0 102.1.2.2 0.0.0.0 12000 0 12865 0 \
+             switch queue 0 port-redirect 1 \
+             ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+             vlan-none 0 nat-rewrite 102.1.2.1 10.1.1.1 12000 14000 \
+             fd_id 0
+
+#. Generate the same traffic again.  The traffic from port 0 is now
+   switched to port 1 without reaching the app's rx queues, as shown below.
+   If the traffic is captured on the wire, the destination IP address and
+   port should appear rewritten.
+
+   .. code-block:: console
+
+      cxgbe> show_all stats
+      ################################################################
+      App Stats for port: 0
+      # of Received Packets: 1000000
+
+      Port Extended Stats
+      rx_good_packets: 2000000
+      tx_good_packets: 0
+      rx_good_bytes: 128000000
+      tx_good_bytes: 0
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 1000000
+      rx_q0_bytes: 64000000
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
+      ################################################################
+      App Stats for port: 1
+      # of Received Packets: 0
+
+      Port Extended Stats
+      rx_good_packets: 0
+      tx_good_packets: 1000000
+      rx_good_bytes: 0
+      tx_good_bytes: 64000000
+      rx_errors: 0
+      tx_errors: 0
+      rx_mbuf_allocation_errors: 0
+      rx_q0_packets: 0
+      rx_q0_bytes: 0
+      rx_q0_errors: 0
+      tx_q0_packets: 0
+      tx_q0_bytes: 0
+      ################################################################
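The guide above says that maskless filters ignore any masks given, while maskfull filters compare only the bits selected by a mask. A minimal sketch of that difference, assuming a maskless (hash) filter behaves like an exact match on the selected fields. The `sketch_*` names and the 4-tuple layout are hypothetical and are not part of the cxgbe driver or of this patch; the hardware hash itself is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tuple for illustration only; not a cxgbe structure. */
struct sketch_tuple {
	uint32_t src_ip;	/* host byte order */
	uint32_t dst_ip;	/* host byte order */
	uint16_t src_port;
	uint16_t dst_port;
};

/* Build a host-order IPv4 address from its four dotted-quad octets. */
uint32_t
sketch_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Maskfull-style match: only the bits set in the mask are compared. */
bool
sketch_match_masked(const struct sketch_tuple *pkt,
		    const struct sketch_tuple *val,
		    const struct sketch_tuple *mask)
{
	assert(pkt != NULL && val != NULL && mask != NULL);
	return ((pkt->src_ip ^ val->src_ip) & mask->src_ip) == 0 &&
	       ((pkt->dst_ip ^ val->dst_ip) & mask->dst_ip) == 0 &&
	       ((pkt->src_port ^ val->src_port) & mask->src_port) == 0 &&
	       ((pkt->dst_port ^ val->dst_port) & mask->dst_port) == 0;
}

/* Maskless-style match (assumed semantics): every field must be equal. */
bool
sketch_match_exact(const struct sketch_tuple *pkt,
		   const struct sketch_tuple *val)
{
	assert(pkt != NULL && val != NULL);
	return pkt->src_ip == val->src_ip &&
	       pkt->dst_ip == val->dst_ip &&
	       pkt->src_port == val->src_port &&
	       pkt->dst_port == val->dst_port;
}
```

With the rule from the examples (102.1.2.1:12000 to 102.1.2.2:12865), a packet whose source port is 12001 fails `sketch_match_exact()` but still passes `sketch_match_masked()` when the source port mask is 0.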
diff --git a/examples/Makefile b/examples/Makefile
index 1cb4785..ed7ad49 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -71,6 +71,7 @@ DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination
+DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += test-cxgbe-filters
 DIRS-$(CONFIG_RTE_LIBRTE_TIMER) += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
diff --git a/examples/test-cxgbe-filters/Makefile b/examples/test-cxgbe-filters/Makefile
new file mode 100644
index 0000000..9e9883b
--- /dev/null
+++ b/examples/test-cxgbe-filters/Makefile
@@ -0,0 +1,63 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015-2016 Chelsio Communications.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Chelsio Communications nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifeq ($(CONFIG_RTE_LIBRTE_CMDLINE), y)
+
+# binary name
+APP = test_cxgbe_filters
+
+VPATH += $(SRCDIR)/cxgbe
+
+INC += $(wildcard *.h) $(wildcard cxgbe/*.h)
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c
+SRCS-y += config.c
+SRCS-y += init.c
+SRCS-y += runtime.c
+SRCS-y += commands.c
+SRCS-y += cxgbe_commands.c
+
+CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/cxgbe
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
+endif
diff --git a/examples/test-cxgbe-filters/commands.c b/examples/test-cxgbe-filters/commands.c
new file mode 100644
index 0000000..09dee94
--- /dev/null
+++ b/examples/test-cxgbe-filters/commands.c
@@ -0,0 +1,429 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_parse_string.h>
+#include <cmdline.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+
+#include "main.h"
+#include "commands.h"
+
+static int
+port_id_is_invalid(uint8_t port_id)
+{
+	if (port_id < app.n_ports)
+		return 0;
+
+	printf("Invalid port %d. Must be < %d\n", port_id, app.n_ports);
+	return 1;
+}
+
+/****************/
+
+struct cmd_help_result {
+	cmdline_fixed_string_t help;
+};
+
+static void
+cmd_help_parsed(__rte_unused void *parsed_result,
+		struct cmdline *cl,
+		__rte_unused void *data)
+{
+	cmdline_printf(cl,
+		       "\n"
+		       "Help:\n"
+		       "-----\n\n"
+
+		       "show (stats|port|fdir) (port_id)\n"
+		       "    Display information for port_id.\n\n"
+
+		       "show_all (stats|port)\n"
+		       "    Display information for all ports.\n\n"
+
+		       "clear stats (port_id)\n"
+		       "    Clear information for port_id.\n\n"
+
+		       "clear_all stats\n"
+		       "    Clear information for all ports.\n\n"
+
+		       "quit\n"
+		       "    Quit to prompt.\n\n"
+
+		       "filter (port_id) (add|del) (ipv4|ipv6)"
+		       " mode (maskfull|maskless) (no-prio|prio)"
+		       " ingress-port (iport) (iport_mask)"
+		       " ether (ether_type) (ether_type_mask)"
+		       " vlan (inner_vlan) (inner_vlan_mask)"
+		       " (outer_vlan) (outer_vlan_mask)"
+		       " ip (tos) (tos_mask) (proto) (proto_mask)"
+		       " (src_ip_address) (src_ip_mask)"
+		       " (dst_ip_address) (dst_ip_mask)"
+		       " (src_port) (src_port_mask) (dst_port) (dst_port_mask)"
+		       " (drop|fwd|switch) queue (queue_id)"
+		       " (port-none|port-redirect) (egress_port)"
+		       " (ether-none|mac-rewrite|mac-swap) (src_mac) (dst_mac)"
+		       " (vlan-none|vlan-rewrite|vlan-delete) (new_vlan)"
+		       " (nat-none|nat-rewrite) (nat_src_ip) (nat_dst_ip)"
+		       " (nat_src_port) (nat_dst_port)"
+		       " fd_id (fd_id_value)\n"
+		       "    Add/Del a cxgbe flow director filter.\n\n"
+		      );
+}
+
+cmdline_parse_token_string_t cmd_help_help =
+	TOKEN_STRING_INITIALIZER(struct cmd_help_result, help,
+				 "help");
+
+cmdline_parse_inst_t cmd_help = {
+	.f = cmd_help_parsed,
+	.data = NULL,
+	.help_str = "print help",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_help_help,
+		NULL,
+	},
+};
+
+/****************/
+
+struct cmd_show_result {
+	cmdline_fixed_string_t show;
+	cmdline_fixed_string_t op;
+	uint8_t port_id;
+};
+
+static void
+print_xstats(uint8_t port_id)
+{
+	struct rte_eth_xstats *xstats;
+	int len, ret, i;
+
+	if (port_id_is_invalid(port_id))
+		return;
+
+	printf("App Stats for port: %d\n", port_id);
+	printf("# of Received Packets: %" PRIu64 "\n",
+	       app.mbuf_rx[port_id].pkts);
+
+	printf("\nPort Extended Stats\n");
+	len = rte_eth_xstats_get(port_id, NULL, 0);
+	if (len < 0) {
+		printf("Cannot get extended stats\n");
+		return;
+	}
+
+	xstats = malloc(sizeof(xstats[0]) * len);
+	if (!xstats) {
+		printf("Cannot allocate memory for xstats\n");
+		return;
+	}
+
+	ret = rte_eth_xstats_get(port_id, xstats, len);
+	if (ret < 0 || ret > len) {
+		printf("Cannot get xstats\n");
+		free(xstats);
+		return;
+	}
+
+	for (i = 0; i < len; i++)
+		printf("%s: %" PRIu64 "\n", xstats[i].name, xstats[i].value);
+
+	free(xstats);
+}
+
+static void
+print_port(uint8_t port_id)
+{
+	struct ether_addr mac_addr;
+	struct rte_eth_link link;
+	char buf[ETHER_ADDR_FMT_SIZE];
+
+	if (port_id_is_invalid(port_id))
+		return;
+
+	rte_eth_link_get_nowait(port_id, &link);
+	printf("\nInfos for port %d\n", port_id);
+
+	rte_eth_macaddr_get(port_id, &mac_addr);
+	ether_format_addr(buf, ETHER_ADDR_FMT_SIZE, &mac_addr);
+	printf("MAC address: %s", buf);
+
+	printf("\nLink status: %s\n", (link.link_status) ? ("up") : ("down"));
+	printf("Link speed: %u Mbps\n", (unsigned)link.link_speed);
+	printf("Link duplex: %s\n", (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+				    ("full-duplex") : ("half-duplex"));
+	printf("Promiscuous mode: %s\n",
+	       rte_eth_promiscuous_get(port_id) ? "enabled" : "disabled");
+	printf("Allmulticast mode: %s\n",
+	       rte_eth_allmulticast_get(port_id) ? "enabled" : "disabled");
+}
+
+static void
+print_fdir(uint8_t port_id)
+{
+	struct rte_eth_fdir_stats fdir_stat;
+	int ret;
+
+	if (port_id_is_invalid(port_id))
+		return;
+
+	ret = rte_eth_dev_filter_supported(port_id, RTE_ETH_FILTER_FDIR);
+	if (ret < 0) {
+		printf("FDIR is not supported on port %d\n", port_id);
+		return;
+	}
+
+	memset(&fdir_stat, 0, sizeof(fdir_stat));
+	rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_FDIR,
+				RTE_ETH_FILTER_STATS, &fdir_stat);
+	printf("FDIR stats for port %d\n", port_id);
+	printf("  free:          %" PRIu32 "\n"
+	       "  add:           %-10" PRIu64 "  remove:        %" PRIu64 "\n"
+	       "  f_add:         %-10" PRIu64 "  f_remove:      %" PRIu64 "\n",
+	       fdir_stat.free,
+	       fdir_stat.add, fdir_stat.remove,
+	       fdir_stat.f_add, fdir_stat.f_remove);
+}
+
+static void
+cmd_show_parsed(void *parsed_result,
+		__rte_unused struct cmdline *cl,
+		__rte_unused void *data)
+{
+	struct cmd_show_result *res = parsed_result;
+	const char *border = "#######################################";
+
+	printf("%s%s\n", border, border);
+	if (!strcmp(res->op, "stats"))
+		print_xstats(res->port_id);
+	else if (!strcmp(res->op, "port"))
+		print_port(res->port_id);
+	else if (!strcmp(res->op, "fdir"))
+		print_fdir(res->port_id);
+	printf("%s%s\n", border, border);
+}
+
+cmdline_parse_token_string_t cmd_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_result, show, "show");
+cmdline_parse_token_string_t cmd_show_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_result, op, "stats#port#fdir");
+cmdline_parse_token_num_t cmd_show_port_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_show_result, port_id, UINT8);
+
+cmdline_parse_inst_t cmd_show = {
+	.f = cmd_show_parsed,
+	.data = NULL,
+	.help_str = "show information",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_show_show,
+		(void *)&cmd_show_op,
+		(void *)&cmd_show_port_id,
+		NULL,
+	},
+};
+
+/****************/
+
+struct cmd_show_all_result {
+	cmdline_fixed_string_t show_all;
+	cmdline_fixed_string_t op;
+};
+
+static void
+cmd_show_all_parsed(void *parsed_result,
+		    __rte_unused struct cmdline *cl,
+		    __rte_unused void *data)
+{
+	struct cmd_show_all_result *res = parsed_result;
+	const char *border = "#######################################";
+	unsigned int i;
+
+	for (i = 0; i < app.n_ports; i++) {
+		printf("%s%s\n", border, border);
+		if (!strcmp(res->op, "stats"))
+			print_xstats(i);
+		else if (!strcmp(res->op, "port"))
+			print_port(i);
+		printf("%s%s\n", border, border);
+	}
+}
+
+cmdline_parse_token_string_t cmd_show_all_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_all_result, show_all,
+				 "show_all");
+cmdline_parse_token_string_t cmd_show_all_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_show_all_result, op, "stats#port");
+
+cmdline_parse_inst_t cmd_show_all = {
+	.f = cmd_show_all_parsed,
+	.data = NULL,
+	.help_str = "show information for all ports",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_show_all_show,
+		(void *)&cmd_show_all_op,
+		NULL,
+	},
+};
+
+/****************/
+
+struct cmd_clear_result {
+	cmdline_fixed_string_t clear;
+	cmdline_fixed_string_t op;
+	uint8_t port_id;
+};
+
+static void
+cmd_clear_parsed(void *parsed_result,
+		 __rte_unused struct cmdline *cl,
+		 __rte_unused void *data)
+{
+	struct cmd_clear_result *res = parsed_result;
+
+	if (port_id_is_invalid(res->port_id))
+		return;
+
+	if (!strcmp(res->op, "stats")) {
+		app.mbuf_rx[res->port_id].pkts = 0;
+		rte_eth_xstats_reset(res->port_id);
+	}
+}
+
+cmdline_parse_token_string_t cmd_clear_clear =
+	TOKEN_STRING_INITIALIZER(struct cmd_clear_result, clear, "clear");
+cmdline_parse_token_string_t cmd_clear_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_clear_result, op, "stats");
+cmdline_parse_token_num_t cmd_clear_port_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_clear_result, port_id, UINT8);
+
+cmdline_parse_inst_t cmd_clear = {
+	.f = cmd_clear_parsed,
+	.data = NULL,
+	.help_str = "clear information",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_clear_clear,
+		(void *)&cmd_clear_op,
+		(void *)&cmd_clear_port_id,
+		NULL,
+	},
+};
+
+/****************/
+
+struct cmd_clear_all_result {
+	cmdline_fixed_string_t clear_all;
+	cmdline_fixed_string_t op;
+};
+
+static void
+cmd_clear_all_parsed(void *parsed_result,
+		     __rte_unused struct cmdline *cl,
+		     __rte_unused void *data)
+{
+	struct cmd_clear_all_result *res = parsed_result;
+	unsigned int i;
+
+	for (i = 0; i < app.n_ports; i++) {
+		if (!strcmp(res->op, "stats")) {
+			app.mbuf_rx[i].pkts = 0;
+			rte_eth_xstats_reset(i);
+		}
+	}
+}
+
+cmdline_parse_token_string_t cmd_clear_all_clear =
+	TOKEN_STRING_INITIALIZER(struct cmd_clear_all_result, clear_all,
+				 "clear_all");
+cmdline_parse_token_string_t cmd_clear_all_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_clear_all_result, op, "stats");
+
+cmdline_parse_inst_t cmd_clear_all = {
+	.f = cmd_clear_all_parsed,
+	.data = NULL,
+	.help_str = "clear information for all ports",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_clear_all_clear,
+		(void *)&cmd_clear_all_op,
+		NULL,
+	},
+};
+
+/****************/
+
+struct cmd_quit_result {
+	cmdline_fixed_string_t quit;
+};
+
+static void
+cmd_quit_parsed(__rte_unused void *parsed_result,
+		struct cmdline *cl,
+		__rte_unused void *data)
+{
+	cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+	TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit,
+				 "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+	.f = cmd_quit_parsed,
+	.data = NULL,
+	.help_str = "exit application",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_quit_quit,
+		NULL,
+	},
+};
+
+/****************/
+
+cmdline_parse_ctx_t main_ctx[] = {
+	(cmdline_parse_inst_t *)&cmd_help,
+	(cmdline_parse_inst_t *)&cmd_show,
+	(cmdline_parse_inst_t *)&cmd_show_all,
+	(cmdline_parse_inst_t *)&cmd_clear,
+	(cmdline_parse_inst_t *)&cmd_clear_all,
+	(cmdline_parse_inst_t *)&cmd_quit,
+	(cmdline_parse_inst_t *)&cmd_add_del_cxgbe_flow_director,
+	NULL,
+};
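print_xstats() in commands.c above uses the usual two-call pattern: ask for the number of entries with a NULL table, allocate, then fetch. A standalone sketch of that pattern follows, assuming a mock getter in place of rte_eth_xstats_get(). `sketch_xstat`, `sketch_xstats_get()` and `sketch_xstats_snapshot()` are hypothetical names introduced only for this illustration, and the two sample counters are taken from the console output in the guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the xstats entry type. */
struct sketch_xstat {
	char name[32];
	uint64_t value;
};

/*
 * Mock getter: returns the number of entries available and fills the
 * table only when it is large enough to hold all of them.
 */
int
sketch_xstats_get(struct sketch_xstat *tbl, unsigned int n)
{
	static const struct sketch_xstat src[] = {
		{ "rx_good_packets", 1000000 },
		{ "rx_good_bytes", 64000000 },
	};
	const unsigned int avail = sizeof(src) / sizeof(src[0]);

	if (tbl != NULL && n >= avail)
		memcpy(tbl, src, sizeof(src));
	return (int)avail;
}

/*
 * Query the size, allocate, then fetch. On success returns the entry
 * count and stores the table in *out (the caller frees it); returns -1
 * on any failure.
 */
int
sketch_xstats_snapshot(struct sketch_xstat **out)
{
	struct sketch_xstat *tbl;
	int len, ret;

	assert(out != NULL);

	len = sketch_xstats_get(NULL, 0);
	if (len < 0)
		return -1;

	tbl = malloc(sizeof(tbl[0]) * (size_t)len);
	if (tbl == NULL)
		return -1;

	ret = sketch_xstats_get(tbl, (unsigned int)len);
	if (ret < 0 || ret > len) {
		/* Table too small or getter failed: do not leak it. */
		free(tbl);
		return -1;
	}

	*out = tbl;
	return ret;
}
```

Checking `ret > len` as well as `ret < 0` covers the case where the number of statistics grows between the two calls.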
diff --git a/examples/test-cxgbe-filters/commands.h b/examples/test-cxgbe-filters/commands.h
new file mode 100644
index 0000000..7503c92
--- /dev/null
+++ b/examples/test-cxgbe-filters/commands.h
@@ -0,0 +1,40 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _COMMANDS_H_
+#define _COMMANDS_H_
+
+#include <cmdline_parse.h>
+
+extern cmdline_parse_inst_t cmd_add_del_cxgbe_flow_director;
+#endif /* _COMMANDS_H_ */
diff --git a/examples/test-cxgbe-filters/config.c b/examples/test-cxgbe-filters/config.c
new file mode 100644
index 0000000..d70eeba
--- /dev/null
+++ b/examples/test-cxgbe-filters/config.c
@@ -0,0 +1,79 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
+#include "main.h"
+
+struct app_params app;
+
+int
+app_init_config(void)
+{
+	uint32_t n_lcores, lcore_id;
+
+	app.n_ports = rte_eth_dev_count();
+	if (!app.n_ports) {
+		RTE_LOG(ERR, USER1, "No probed ethernet devices\n");
+		return -1;
+	}
+
+	n_lcores = 0;
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		if (!rte_lcore_is_enabled(lcore_id))
+			continue;
+
+		if (lcore_id == rte_get_master_lcore())
+			continue;
+
+		if (n_lcores >= APP_MAX_PORTS || n_lcores >= app.n_ports)
+			break;
+
+		app.core_rx[n_lcores] = lcore_id;
+		n_lcores++;
+	}
+
+	if (n_lcores < 2) {
+		RTE_LOG(ERR, USER1, "Number of cores must be at least 3\n");
+		return -1;
+	} else if (n_lcores < app.n_ports) {
+		RTE_LOG(ERR, USER1, "Not enough cores for all ports\n");
+		return -1;
+	}
+
+	return 0;
+}
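The worker-core selection in app_init_config() above can be tried out without DPDK. The sketch below mirrors the loop and the final checks, with the EAL calls replaced by plain arguments. `SKETCH_MAX_PORTS` stands in for APP_MAX_PORTS, whose value is not shown in this part of the patch, and all `sketch_*` names are hypothetical. As read from the code, the loop stops once one worker per port has been picked, so a single-port setup ends with one worker and is rejected by the `n_lcores < 2` check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for APP_MAX_PORTS; the real value is defined in main.h. */
#define SKETCH_MAX_PORTS 4

/*
 * Mirror of the selection loop: skip disabled lcores and the master,
 * stop once there is one RX core per port. Returns the number of RX
 * cores written to core_rx[].
 */
uint32_t
sketch_assign_rx_cores(const int *enabled, uint32_t max_lcore,
		       uint32_t master, uint32_t n_ports, uint32_t *core_rx)
{
	uint32_t n_lcores = 0, lcore_id;

	assert(enabled != NULL && core_rx != NULL);

	for (lcore_id = 0; lcore_id < max_lcore; lcore_id++) {
		if (!enabled[lcore_id])
			continue;
		if (lcore_id == master)
			continue;
		if (n_lcores >= SKETCH_MAX_PORTS || n_lcores >= n_ports)
			break;
		core_rx[n_lcores++] = lcore_id;
	}

	return n_lcores;
}

/* Same acceptance checks as the patch: 0 on success, -1 on failure. */
int
sketch_check_cores(uint32_t n_lcores, uint32_t n_ports)
{
	if (n_lcores < 2)
		return -1;
	if (n_lcores < n_ports)
		return -1;
	return 0;
}
```

With four enabled lcores and lcore 0 as master, two ports get workers 1 and 2 and pass the check, while one port gets only worker 1 and fails it.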
diff --git a/examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c b/examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c
new file mode 100644
index 0000000..f576589
--- /dev/null
+++ b/examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c
@@ -0,0 +1,554 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#ifndef __linux__
+#ifndef __FreeBSD__
+#include <net/socket.h>
+#else
+#include <sys/socket.h>
+#endif
+#endif
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_ipaddr.h>
+#include <cmdline_parse_etheraddr.h>
+#include <cmdline.h>
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+
+#include "main.h"
+#include "commands.h"
+#include "cxgbe_fdir.h"
+
+/****************/
+
+static void
+ipv4_addr_to_uint(cmdline_ipaddr_t ip_addr, uint8_t *ip)
+{
+	do {
+		if (ip_addr.family == AF_INET) {
+			rte_memcpy(ip, &ip_addr.addr.ipv4.s_addr,
+				   sizeof(struct in_addr));
+		} else {
+			printf("invalid parameter.\n");
+			return;
+		}
+	} while (0);
+}
+
+static void
+ipv6_addr_to_array(cmdline_ipaddr_t ip_addr, uint8_t *ip)
+{
+	do {
+		if (ip_addr.family == AF_INET6) {
+			rte_memcpy(ip, &ip_addr.addr.ipv6,
+				   sizeof(struct in6_addr));
+		} else {
+			printf("invalid parameter.\n");
+			return;
+		}
+	} while (0);
+}
+
+/* *** cxgbe flow director *** */
+struct cmd_cxgbe_flow_director_result {
+	cmdline_fixed_string_t cxgbe_flow_director;
+	uint8_t port_id;
+	cmdline_fixed_string_t ops;
+
+	/* Match fields without mask */
+	cmdline_fixed_string_t flow_type;
+	cmdline_fixed_string_t flow_mode;
+	cmdline_fixed_string_t flow_mode_value;
+	cmdline_fixed_string_t flow_prio;
+
+	/* Match fields with mask */
+	cmdline_fixed_string_t iport;
+	uint8_t iport_value;
+	uint8_t iport_mask;
+	cmdline_fixed_string_t ether;
+	uint16_t ether_type;
+	uint16_t ether_type_mask;
+	cmdline_fixed_string_t vlan;
+	uint16_t ivlan;
+	uint16_t ivlan_mask;
+	uint16_t ovlan;
+	uint16_t ovlan_mask;
+	cmdline_fixed_string_t ip;
+	uint8_t tos;
+	uint8_t tos_mask;
+	uint8_t proto;
+	uint8_t proto_mask;
+	cmdline_ipaddr_t src_ip;
+	cmdline_ipaddr_t src_ip_mask;
+	cmdline_ipaddr_t dst_ip;
+	cmdline_ipaddr_t dst_ip_mask;
+	uint16_t src_port;
+	uint16_t src_port_mask;
+	uint16_t dst_port;
+	uint16_t dst_port_mask;
+
+	/* Action */
+	cmdline_fixed_string_t action;
+	cmdline_fixed_string_t queue;
+	uint16_t queue_id;
+
+	/* Action arguments */
+	cmdline_fixed_string_t port_op;
+	uint8_t eport;
+	cmdline_fixed_string_t ether_op;
+	struct ether_addr smac;
+	struct ether_addr dmac;
+	cmdline_fixed_string_t vlan_op;
+	uint16_t new_vlan;
+	cmdline_fixed_string_t nat_op;
+	cmdline_ipaddr_t nat_src_ip;
+	cmdline_ipaddr_t nat_dst_ip;
+	uint16_t nat_src_port;
+	uint16_t nat_dst_port;
+
+	cmdline_fixed_string_t fd_id;
+	uint32_t fd_id_value;
+};
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_cxgbe =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 cxgbe_flow_director,
+				 "filter");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_port_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      port_id, UINT8);
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_ops =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 ops, "add#del");
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_flow_type =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 flow_type, "ipv4#ipv6");
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_flow_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 flow_mode, "mode");
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_flow_mode_value =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 flow_mode_value, "maskfull#maskless");
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_flow_prio =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 flow_prio, "no-prio#prio");
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_iport =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 iport, "ingress-port");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_iport_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      iport_value, UINT8);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_iport_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      iport_mask, UINT8);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_ether =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 ether, "ether");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ether_type =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ether_type, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ether_type_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ether_type_mask, UINT16);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_vlan =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 vlan, "vlan");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ivlan =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ivlan, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ivlan_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ivlan_mask, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ovlan =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ovlan, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_ovlan_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      ovlan_mask, UINT16);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_ip =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 ip, "ip");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_tos =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      tos, UINT8);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_tos_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      tos_mask, UINT8);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_proto =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      proto, UINT8);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_proto_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      proto_mask, UINT8);
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_src_ip =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 src_ip);
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_src_ip_mask =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 src_ip_mask);
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_dst_ip =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 dst_ip);
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_dst_ip_mask =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 dst_ip_mask);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_src_port =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      src_port, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_src_port_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      src_port_mask, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_dst_port =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      dst_port, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_dst_port_mask =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      dst_port_mask, UINT16);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_action =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 action, "drop#fwd#switch");
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_queue =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 queue, "queue");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_queue_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      queue_id, UINT16);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_port_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 port_op, "port-none#port-redirect");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_eport =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      eport, UINT8);
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_ether_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 ether_op, "ether-none#mac-rewrite#mac-swap");
+cmdline_parse_token_etheraddr_t cmd_cxgbe_flow_director_smac =
+	TOKEN_ETHERADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				    smac);
+cmdline_parse_token_etheraddr_t cmd_cxgbe_flow_director_dmac =
+	TOKEN_ETHERADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				    dmac);
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_vlan_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 vlan_op, "vlan-none#vlan-rewrite#vlan-delete");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_new_vlan =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      new_vlan, UINT16);
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_nat_op =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 nat_op, "nat-none#nat-rewrite");
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_nat_src_ip =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 nat_src_ip);
+cmdline_parse_token_ipaddr_t cmd_cxgbe_flow_director_nat_dst_ip =
+	TOKEN_IPADDR_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 nat_dst_ip);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_nat_src_port =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      nat_src_port, UINT16);
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_nat_dst_port =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      nat_dst_port, UINT16);
+
+cmdline_parse_token_string_t cmd_cxgbe_flow_director_fd_id =
+	TOKEN_STRING_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+				 fd_id, "fd_id");
+cmdline_parse_token_num_t cmd_cxgbe_flow_director_fd_id_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_cxgbe_flow_director_result,
+			      fd_id_value, UINT32);
+
+static void
+cxgbe_fill_in_match_fields(struct rte_eth_fdir_input *input,
+			   struct cmd_cxgbe_flow_director_result *res)
+{
+	struct cxgbe_fdir_input_admin admin;
+	struct cxgbe_fdir_input_flow val, mask;
+	uint8_t *raw_pkt_flow;
+
+	memset(&admin, 0, sizeof(admin));
+	memset(&val, 0, sizeof(val));
+	memset(&mask, 0, sizeof(mask));
+
+	/* Fill in match admin fields that don't have masks */
+	if (!strcmp(res->flow_type, "ipv4"))
+		admin.type = 0;
+	else
+		admin.type = 1;
+
+	if (!strcmp(res->flow_mode_value, "maskfull"))
+		admin.cap = 0;
+	else
+		admin.cap = 1;
+
+	/*
+	 * If a packet can match both a maskfull and maskless filter,
+	 * enable prio bit to allow maskfull filter to have priority
+	 * over the maskless filter.
+	 */
+	if (!strcmp(res->flow_prio, "prio"))
+		admin.prio = 1;
+	else
+		admin.prio = 0;
+
+	/* Fill in match fields that have masks */
+	val.iport   = res->iport_value;
+	if (val.iport > 7) {
+		printf("iport must be < 8. Using 0\n");
+		val.iport = 0;
+	}
+	val.ethtype = rte_cpu_to_be_16(res->ether_type);
+
+	val.ivlan = rte_cpu_to_be_16(res->ivlan);
+	val.ovlan = rte_cpu_to_be_16(res->ovlan);
+
+	val.tos   = res->tos;
+	val.proto = res->proto;
+	if (!admin.type) {
+		ipv4_addr_to_uint(res->dst_ip, &val.lip[0]);
+		ipv4_addr_to_uint(res->src_ip, &val.fip[0]);
+	} else {
+		ipv6_addr_to_array(res->dst_ip, &val.lip[0]);
+		ipv6_addr_to_array(res->src_ip, &val.fip[0]);
+	}
+
+	val.lport = rte_cpu_to_be_16(res->dst_port);
+	val.fport = rte_cpu_to_be_16(res->src_port);
+
+	/* Fill in match mask fields */
+	mask.iport   = res->iport_mask;
+	if (mask.iport > 7) {
+		printf("iport_mask must be < 8. Using 0\n");
+		mask.iport = 0;
+	}
+	mask.ethtype = rte_cpu_to_be_16(res->ether_type_mask);
+
+	mask.ivlan = rte_cpu_to_be_16(res->ivlan_mask);
+	mask.ovlan = rte_cpu_to_be_16(res->ovlan_mask);
+
+	mask.tos   = res->tos_mask;
+	mask.proto = res->proto_mask;
+	if (!admin.type) {
+		ipv4_addr_to_uint(res->dst_ip_mask, &mask.lip[0]);
+		ipv4_addr_to_uint(res->src_ip_mask, &mask.fip[0]);
+	} else {
+		ipv6_addr_to_array(res->dst_ip_mask, &mask.lip[0]);
+		ipv6_addr_to_array(res->src_ip_mask, &mask.fip[0]);
+	}
+
+	mask.lport = rte_cpu_to_be_16(res->dst_port_mask);
+	mask.fport = rte_cpu_to_be_16(res->src_port_mask);
+
+	/* Fill in the above data to raw_pkt_flow array */
+	raw_pkt_flow = &input->flow.raw_pkt_flow[0];
+	rte_memcpy(raw_pkt_flow, &admin, sizeof(admin));
+	raw_pkt_flow += sizeof(admin);
+	rte_memcpy(raw_pkt_flow, &val, sizeof(val));
+	rte_memcpy(&input->flow_mask.raw_pkt_flow[0], &mask, sizeof(mask));
+}
+
+static void
+cxgbe_fill_in_action_fields(struct rte_eth_fdir_action *behavior,
+			    struct cmd_cxgbe_flow_director_result *res)
+{
+	struct cxgbe_fdir_action action;
+
+	memset(&action, 0, sizeof(action));
+
+	/* Fill in port behavior arguments */
+	if (!strcmp(res->port_op, "port-redirect")) {
+		action.eport = res->eport;
+		if (action.eport > 3) {
+			printf("eport must be < 4. Using 0\n");
+			action.eport = 0;
+		}
+	}
+
+	/* Fill in ether behavior arguments */
+	if (!strcmp(res->ether_op, "mac-rewrite")) {
+		rte_memcpy(&action.dmac, &res->dmac, sizeof(struct ether_addr));
+		rte_memcpy(&action.smac, &res->smac, sizeof(struct ether_addr));
+		action.newdmac = 1;
+		action.newsmac = 1;
+	} else if (!strcmp(res->ether_op, "mac-swap")) {
+		action.swapmac = 1;
+	}
+
+	/* Fill in vlan behavior arguments */
+	if (!strcmp(res->vlan_op, "vlan-delete")) {
+		action.vlan = rte_cpu_to_be_16(res->new_vlan);
+		action.newvlan = 1; /* VLAN delete */
+	} else if (!strcmp(res->vlan_op, "vlan-rewrite")) {
+		action.vlan = rte_cpu_to_be_16(res->new_vlan);
+		action.newvlan = 3; /* VLAN rewrite */
+	}
+
+	/* Fill in nat behavior arguments */
+	if (!strcmp(res->nat_op, "nat-rewrite")) {
+		action.nat_mode = 7; /* Rewrite IP and port */
+		if (!strcmp(res->flow_type, "ipv4")) {
+			ipv4_addr_to_uint(res->nat_dst_ip, &action.nat_lip[0]);
+			ipv4_addr_to_uint(res->nat_src_ip, &action.nat_fip[0]);
+		} else {
+			ipv6_addr_to_array(res->nat_dst_ip, &action.nat_lip[0]);
+			ipv6_addr_to_array(res->nat_src_ip, &action.nat_fip[0]);
+		}
+		action.nat_lport = rte_cpu_to_be_16(res->nat_dst_port);
+		action.nat_fport = rte_cpu_to_be_16(res->nat_src_port);
+	}
+
+	/* Fill in the above data to behavior_arg array */
+	rte_memcpy(&behavior->behavior_arg[0], &action, sizeof(action));
+}
+
+static void
+cmd_cxgbe_flow_director_parsed(void *parsed_result,
+			       __attribute__((unused)) struct cmdline *cl,
+			       __attribute__((unused)) void *data)
+{
+	struct cmd_cxgbe_flow_director_result *res = parsed_result;
+	struct rte_eth_fdir_filter entry;
+	struct rte_eth_dev_info dev_info;
+	int ret;
+
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+	if (strcmp(dev_info.driver_name, "rte_cxgbe_pmd")) {
+		printf("Not a Chelsio Device\n");
+		return;
+	}
+
+	ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_FDIR);
+	if (ret < 0) {
+		printf("flow director is not supported on port %u.\n",
+		       res->port_id);
+		return;
+	}
+
+	memset(&entry, 0, sizeof(entry));
+	entry.input.flow_type = RTE_ETH_FLOW_RAW_PKT;
+
+	cxgbe_fill_in_match_fields(&entry.input, res);
+
+	if (!strcmp(res->action, "drop"))
+		entry.action.behavior = RTE_ETH_FDIR_REJECT;
+	else if (!strcmp(res->action, "fwd"))
+		entry.action.behavior = RTE_ETH_FDIR_ACCEPT;
+	else
+		entry.action.behavior = RTE_ETH_FDIR_SWITCH;
+
+	cxgbe_fill_in_action_fields(&entry.action, res);
+
+	entry.action.rx_queue = res->queue_id;
+	entry.soft_id = res->fd_id_value;
+	if (!strcmp(res->ops, "add"))
+		ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_FDIR,
+					      RTE_ETH_FILTER_ADD, &entry);
+	else
+		ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_FDIR,
+					      RTE_ETH_FILTER_DELETE, &entry);
+	if (ret < 0)
+		printf("flow director programming error: (%s)\n",
+		       strerror(-ret));
+}
+
+cmdline_parse_inst_t cmd_add_del_cxgbe_flow_director = {
+	.f = cmd_cxgbe_flow_director_parsed,
+	.data = NULL,
+	.help_str = "add or delete fdir entry on a cxgbe device",
+	.tokens = {
+		(void *)&cmd_cxgbe_flow_director_cxgbe,
+		(void *)&cmd_cxgbe_flow_director_port_id,
+		(void *)&cmd_cxgbe_flow_director_ops,
+		(void *)&cmd_cxgbe_flow_director_flow_type,
+		(void *)&cmd_cxgbe_flow_director_flow_mode,
+		(void *)&cmd_cxgbe_flow_director_flow_mode_value,
+		(void *)&cmd_cxgbe_flow_director_flow_prio,
+		(void *)&cmd_cxgbe_flow_director_iport,
+		(void *)&cmd_cxgbe_flow_director_iport_value,
+		(void *)&cmd_cxgbe_flow_director_iport_mask,
+		(void *)&cmd_cxgbe_flow_director_ether,
+		(void *)&cmd_cxgbe_flow_director_ether_type,
+		(void *)&cmd_cxgbe_flow_director_ether_type_mask,
+		(void *)&cmd_cxgbe_flow_director_vlan,
+		(void *)&cmd_cxgbe_flow_director_ivlan,
+		(void *)&cmd_cxgbe_flow_director_ivlan_mask,
+		(void *)&cmd_cxgbe_flow_director_ovlan,
+		(void *)&cmd_cxgbe_flow_director_ovlan_mask,
+		(void *)&cmd_cxgbe_flow_director_ip,
+		(void *)&cmd_cxgbe_flow_director_tos,
+		(void *)&cmd_cxgbe_flow_director_tos_mask,
+		(void *)&cmd_cxgbe_flow_director_proto,
+		(void *)&cmd_cxgbe_flow_director_proto_mask,
+		(void *)&cmd_cxgbe_flow_director_src_ip,
+		(void *)&cmd_cxgbe_flow_director_src_ip_mask,
+		(void *)&cmd_cxgbe_flow_director_dst_ip,
+		(void *)&cmd_cxgbe_flow_director_dst_ip_mask,
+		(void *)&cmd_cxgbe_flow_director_src_port,
+		(void *)&cmd_cxgbe_flow_director_src_port_mask,
+		(void *)&cmd_cxgbe_flow_director_dst_port,
+		(void *)&cmd_cxgbe_flow_director_dst_port_mask,
+		(void *)&cmd_cxgbe_flow_director_action,
+		(void *)&cmd_cxgbe_flow_director_queue,
+		(void *)&cmd_cxgbe_flow_director_queue_id,
+		(void *)&cmd_cxgbe_flow_director_port_op,
+		(void *)&cmd_cxgbe_flow_director_eport,
+		(void *)&cmd_cxgbe_flow_director_ether_op,
+		(void *)&cmd_cxgbe_flow_director_smac,
+		(void *)&cmd_cxgbe_flow_director_dmac,
+		(void *)&cmd_cxgbe_flow_director_vlan_op,
+		(void *)&cmd_cxgbe_flow_director_new_vlan,
+		(void *)&cmd_cxgbe_flow_director_nat_op,
+		(void *)&cmd_cxgbe_flow_director_nat_src_ip,
+		(void *)&cmd_cxgbe_flow_director_nat_dst_ip,
+		(void *)&cmd_cxgbe_flow_director_nat_src_port,
+		(void *)&cmd_cxgbe_flow_director_nat_dst_port,
+		(void *)&cmd_cxgbe_flow_director_fd_id,
+		(void *)&cmd_cxgbe_flow_director_fd_id_value,
+		NULL,
+	},
+};
diff --git a/examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h b/examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h
new file mode 100644
index 0000000..60219d8
--- /dev/null
+++ b/examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h
@@ -0,0 +1,79 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _APP_CXGBE_FDIR_H_
+#define _APP_CXGBE_FDIR_H_
+
+/*
+ * RTE_ETH_FLOW_RAW_PKT representation.
+ * Must be kept in sync with the representation in
+ * drivers/net/cxgbe/cxgbe_fdir.h.
+ */
+struct cxgbe_fdir_input_admin {
+	uint8_t prio;        /* filter has priority over maskless */
+	uint8_t type;        /* 0 => IPv4, 1 => IPv6 */
+	uint8_t cap;         /* 0 => Maskfull, 1 => Maskless */
+};
+
+struct cxgbe_fdir_input_flow {
+	uint16_t ethtype;    /* ethernet type */
+	uint8_t iport;       /* ingress port */
+	uint8_t proto;       /* protocol type */
+	uint8_t tos;         /* TOS/Traffic Type */
+	uint16_t ivlan;      /* inner VLAN */
+	uint16_t ovlan;      /* outer VLAN */
+
+	uint8_t lip[16];     /* local IP address (IPv4 in [3:0]) */
+	uint8_t fip[16];     /* foreign IP address (IPv4 in [3:0]) */
+	uint16_t lport;      /* local port */
+	uint16_t fport;      /* foreign port */
+};
+
+struct cxgbe_fdir_action {
+	uint8_t eport;       /* egress port to switch packet out */
+	uint8_t newdmac;     /* rewrite destination MAC address */
+	uint8_t newsmac;     /* rewrite source MAC address */
+	uint8_t swapmac;     /* swap SMAC/DMAC for loopback packet */
+	uint8_t newvlan;     /* rewrite VLAN Tag */
+	uint8_t nat_mode;    /* specify NAT operation mode */
+
+	uint8_t dmac[ETHER_ADDR_LEN];  /* new destination MAC address */
+	uint8_t smac[ETHER_ADDR_LEN];  /* new source MAC address */
+	uint16_t vlan;       /* VLAN Tag to insert */
+
+	uint8_t nat_lip[16]; /* local IP to use after NAT'ing */
+	uint8_t nat_fip[16]; /* foreign IP to use after NAT'ing */
+	uint16_t nat_lport;  /* local port to use after NAT'ing */
+	uint16_t nat_fport;  /* foreign port to use after NAT'ing */
+};
+#endif /* _APP_CXGBE_FDIR_H_ */
diff --git a/examples/test-cxgbe-filters/init.c b/examples/test-cxgbe-filters/init.c
new file mode 100644
index 0000000..4452075
--- /dev/null
+++ b/examples/test-cxgbe-filters/init.c
@@ -0,0 +1,201 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+
+#include <rte_log.h>
+#include <rte_mempool.h>
+#include <rte_ring.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+
+#include "main.h"
+
+struct app_params app = {
+	/* Ports */
+	.n_ports = APP_MAX_PORTS,
+	.port_rx_ring_size = 128,
+	.port_tx_ring_size = 512,
+
+	/* Buffer pool */
+	.pool_buffer_size = RTE_MBUF_DEFAULT_BUF_SIZE,
+	.pool_size = 32 * 1024,
+	.pool_cache_size = 256,
+
+	/* Burst sizes */
+	.burst_size_rx_read = 64,
+};
+
+static struct rte_eth_conf port_conf = {
+	.rxmode = {
+		.max_rx_pkt_len = ETHER_MAX_LEN,
+		.split_hdr_size = 0,
+		.header_split   = 0, /* Header Split disabled */
+		.hw_ip_checksum = 1, /* IP checksum offload enabled */
+		.hw_vlan_filter = 0, /* VLAN filtering disabled */
+		.jumbo_frame    = 0, /* Jumbo Frame Support disabled */
+		.hw_strip_crc   = 0, /* CRC stripping by hardware disabled */
+	},
+	.fdir_conf = {
+		.mode = RTE_FDIR_MODE_NONE,
+		.pballoc = RTE_FDIR_PBALLOC_64K,
+		.status = RTE_FDIR_REPORT_STATUS,
+		.mask = {
+			.vlan_tci_mask = 0x0,
+			.ipv4_mask     = {
+				.src_ip = 0xFFFFFFFF,
+				.dst_ip = 0xFFFFFFFF,
+			},
+			.ipv6_mask     = {
+				.src_ip = {0xFFFFFFFF, 0xFFFFFFFF,
+					   0xFFFFFFFF, 0xFFFFFFFF},
+				.dst_ip = {0xFFFFFFFF, 0xFFFFFFFF,
+					   0xFFFFFFFF, 0xFFFFFFFF},
+			},
+			.src_port_mask = 0xFFFF,
+			.dst_port_mask = 0xFFFF,
+			.mac_addr_byte_mask = 0xFF,
+			.tunnel_type_mask = 1,
+			.tunnel_id_mask = 0xFFFFFFFF,
+		},
+		.drop_queue = 127,
+	},
+};
+
+static void
+app_init_mbuf_pools(void)
+{
+	/* Init the buffer pool */
+	app.pool = rte_pktmbuf_pool_create("mempool", app.pool_size,
+					   app.pool_cache_size, 0,
+					   app.pool_buffer_size,
+					   rte_socket_id());
+	if (!app.pool)
+		rte_panic("Cannot create mbuf pool\n");
+}
+
+static void
+app_ports_check_link(void)
+{
+	uint32_t all_ports_up, port, count, print_flag = 0;
+
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+
+	for (count = 0; count < MAX_CHECK_TIME; count++) {
+		all_ports_up = 1;
+		for (port = 0; port < app.n_ports; port++) {
+			struct rte_eth_link link;
+
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(port, &link);
+			if (print_flag == 1) {
+				RTE_LOG(INFO, USER1, "Port %u (%u Gbps) %s\n",
+					port, link.link_speed / 1000,
+					link.link_status ? "UP" : "DOWN");
+			}
+
+			if (!link.link_status) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+
+		if (print_flag == 1)
+			break;
+
+		if (!all_ports_up) {
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		if (all_ports_up || count == (MAX_CHECK_TIME - 1))
+			print_flag = 1;
+	}
+}
+
+static void
+app_init_ports(void)
+{
+	struct ether_addr mac_addr;
+	unsigned int port;
+	int ret;
+
+	/* Init NIC ports, then start the ports */
+	for (port = 0; port < app.n_ports; port++) {
+		RTE_LOG(INFO, USER1, "Initializing NIC port %u ...\n", port);
+
+		/* Init port */
+		ret = rte_eth_dev_configure(port, 1, 1, &port_conf);
+		if (ret < 0)
+			rte_panic("Cannot init NIC port %u (%d)\n", port, ret);
+
+		/* Init RX queues */
+		ret = rte_eth_rx_queue_setup(port, 0, app.port_rx_ring_size,
+					     rte_eth_dev_socket_id(port), NULL,
+					     app.pool);
+		if (ret < 0)
+			rte_panic("Cannot init RX for port %u (%d)\n",
+				  port, ret);
+
+		/* Init TX queues */
+		ret = rte_eth_tx_queue_setup(port, 0, app.port_tx_ring_size,
+					     rte_eth_dev_socket_id(port), NULL);
+		if (ret < 0)
+			rte_panic("Cannot init TX for port %u (%d)\n",
+				  port, ret);
+
+		/* Start port */
+		ret = rte_eth_dev_start(port);
+		if (ret < 0)
+			rte_panic("Cannot start port %u (%d)\n", port, ret);
+
+		rte_eth_macaddr_get(port, &mac_addr);
+		printf("Port %u: %02X:%02X:%02X:%02X:%02X:%02X\n", port,
+		       mac_addr.addr_bytes[0], mac_addr.addr_bytes[1],
+		       mac_addr.addr_bytes[2], mac_addr.addr_bytes[3],
+		       mac_addr.addr_bytes[4], mac_addr.addr_bytes[5]);
+
+		rte_eth_promiscuous_enable(port);
+	}
+
+	app_ports_check_link();
+}
+
+void
+app_init(void)
+{
+	app_init_mbuf_pools();
+	app_init_ports();
+}
diff --git a/examples/test-cxgbe-filters/main.c b/examples/test-cxgbe-filters/main.c
new file mode 100644
index 0000000..c753b32
--- /dev/null
+++ b/examples/test-cxgbe-filters/main.c
@@ -0,0 +1,79 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+
+#include <rte_eal.h>
+#include <rte_common.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+
+#include <cmdline_rdline.h>
+#include <cmdline_parse.h>
+#include <cmdline_socket.h>
+#include <cmdline.h>
+
+#include "main.h"
+
+int
+main(int argc, char **argv)
+{
+	struct cmdline *cl;
+	unsigned int i;
+	int ret;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		return -1;
+
+	if (app_init_config() < 0)
+		return -1;
+
+	app_init();
+
+	for (i = 0; i < app.n_ports; i++) {
+		/* Launch the rx main loop on the lcore assigned to each port */
+		rte_eal_remote_launch(app_main_loop_rx, NULL, app.core_rx[i]);
+	}
+
+	cl = cmdline_stdin_new(main_ctx, "cxgbe> ");
+	if (!cl) {
+		printf("cmdline init failed\n");
+		return -1;
+	}
+
+	cmdline_interact(cl);
+	cmdline_stdin_exit(cl);
+
+	return 0;
+}
diff --git a/examples/test-cxgbe-filters/main.h b/examples/test-cxgbe-filters/main.h
new file mode 100644
index 0000000..4252052
--- /dev/null
+++ b/examples/test-cxgbe-filters/main.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_APP_MAIN_H_
+#define _CXGBE_APP_MAIN_H_
+
+#include <cmdline_parse.h>
+
+#define APP_MBUF_ARRAY_SIZE 256
+#define APP_MAX_PORTS 4
+
+struct app_mbuf_array {
+	struct rte_mbuf *array[APP_MBUF_ARRAY_SIZE];
+	uint64_t pkts; /* Number of received packets */
+};
+
+struct app_params {
+	/* CPU cores */
+	uint32_t core_rx[APP_MAX_PORTS];
+
+	/* Ports */
+	uint32_t n_ports;
+	uint32_t port_rx_ring_size;
+	uint32_t port_tx_ring_size;
+
+	/* Internal buffers */
+	struct app_mbuf_array mbuf_rx[APP_MAX_PORTS];
+
+	/* Buffer pool */
+	struct rte_mempool *pool;
+	uint32_t pool_buffer_size;
+	uint32_t pool_size;
+	uint32_t pool_cache_size;
+
+	/* Burst sizes */
+	uint32_t burst_size_rx_read;
+} __rte_cache_aligned;
+
+extern struct app_params app;
+
+extern cmdline_parse_ctx_t main_ctx[];
+
+int app_init_config(void);
+void app_init(void);
+
+int app_main_loop_rx(void *arg);
+#endif /* _CXGBE_APP_MAIN_H_ */
diff --git a/examples/test-cxgbe-filters/runtime.c b/examples/test-cxgbe-filters/runtime.c
new file mode 100644
index 0000000..043b5a4
--- /dev/null
+++ b/examples/test-cxgbe-filters/runtime.c
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
+#include "main.h"
+
+int
+app_main_loop_rx(void *arg)
+{
+	uint32_t lcore_id;
+	uint16_t i;
+	uint8_t port_id = 0;
+
+	RTE_SET_USED(arg);
+
+	lcore_id = rte_lcore_id();
+
+	for (i = 0; i < app.n_ports; i++)
+		if (app.core_rx[i] == lcore_id)
+			port_id = i;
+
+	app.mbuf_rx[port_id].pkts = 0;
+
+	RTE_LOG(INFO, USER1, "Core %u is doing RX for port %d\n",
+		lcore_id, port_id);
+
+	while (1) {
+		uint16_t n_mbufs;
+
+		n_mbufs = rte_eth_rx_burst(port_id, 0,
+					   app.mbuf_rx[port_id].array,
+					   app.burst_size_rx_read);
+
+		if (!n_mbufs)
+			continue;
+
+		app.mbuf_rx[port_id].pkts += n_mbufs;
+		for (i = 0; i < n_mbufs; i++)
+			rte_pktmbuf_free(app.mbuf_rx[port_id].array[i]);
+	}
+	return 0;
+}
-- 
2.5.3


* [PATCH 03/10] cxgbe: add skeleton to add support for T5 hardware filtering
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 04/10] cxgbe: add control txq for communicating filtering info Rahul Lakkireddy
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Update the base code to query the firmware for the supported filtering
features.

Add a TID table that holds filter state, and add a filter specification
that provides a skeleton for adding the LE-TCAM (Maskfull) and Hash
(Maskless) T5 filtering capabilities.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
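Reviewer note: the (value, mask) matching rule documented in
cxgbe_filter.h can be sketched in plain C as below. This is only an
illustration of the semantics, not driver code: struct demo_tuple and
all function names here are hypothetical, and the real matching is done
by the T5 hardware on struct ch_filter_tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct ch_filter_tuple. */
struct demo_tuple {
	uint32_t lip;   /* local IPv4 address, host byte order */
	uint16_t lport; /* local port */
	uint8_t proto;  /* IP protocol number */
};

/* One field matches when (field & mask) == value. */
static bool field_matches(uint32_t field, uint32_t value, uint32_t mask)
{
	return (field & mask) == value;
}

/* A rule matches only when every individual field rule matches. */
static bool rule_matches(const struct demo_tuple *pkt,
			 const struct demo_tuple *val,
			 const struct demo_tuple *mask)
{
	return field_matches(pkt->lip, val->lip, mask->lip) &&
	       field_matches(pkt->lport, val->lport, mask->lport) &&
	       field_matches(pkt->proto, val->proto, mask->proto);
}

/* Usage: TCP (proto 6) to any port inside 192.168.1.0/24. */
void demo_subnet_rule_check(void)
{
	const struct demo_tuple val = { 0xC0A80100u, 0, 6 };
	const struct demo_tuple mask = { 0xFFFFFF00u, 0, 0xFF };
	const struct demo_tuple in_subnet = { 0xC0A80142u, 80, 6 };
	const struct demo_tuple other_subnet = { 0xC0A80242u, 80, 6 };
	const struct demo_tuple udp_pkt = { 0xC0A80142u, 80, 17 };

	assert(rule_matches(&in_subnet, &val, &mask));
	assert(!rule_matches(&other_subnet, &val, &mask));
	assert(!rule_matches(&udp_pkt, &val, &mask));
}
```

A (0, 0) tuple, as used for lport above, acts as a wildcard because
(field & 0) is always 0.
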
 drivers/net/cxgbe/Makefile              |   1 +
 drivers/net/cxgbe/base/adapter.h        |   7 +
 drivers/net/cxgbe/base/t4fw_interface.h |  10 ++
 drivers/net/cxgbe/cxgbe_filter.h        | 231 ++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/cxgbe_main.c          | 138 +++++++++++++++++++
 drivers/net/cxgbe/cxgbe_ofld.h          |  86 ++++++++++++
 6 files changed, 473 insertions(+)
 create mode 100644 drivers/net/cxgbe/cxgbe_filter.h
 create mode 100644 drivers/net/cxgbe/cxgbe_ofld.h
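
Reviewer note: tid_init() in this patch only builds the ATID free list;
the allocate and release helpers are not part of it. The sketch below
shows how such an intrusive free list over union aopen_entry might be
used. The struct atid_pool wrapper and the atid_pool_*() names are
hypothetical, calloc() stands in for t4_os_alloc(), and the atid_lock
locking that the driver would need is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same layout as in cxgbe_ofld.h: a free slot stores the next pointer. */
union aopen_entry {
	void *data;
	union aopen_entry *next;
};

/* Hypothetical wrapper holding only the ATID part of struct tid_info. */
struct atid_pool {
	union aopen_entry *tab;
	union aopen_entry *afree; /* head of the free list */
	unsigned int natids;
	unsigned int in_use;
};

static int atid_pool_init(struct atid_pool *p, unsigned int natids)
{
	if (!natids)
		return -1;
	p->tab = calloc(natids, sizeof(*p->tab));
	if (!p->tab)
		return -1;
	p->natids = natids;
	p->in_use = 0;
	/* Link entry i-1 to entry i; the last entry keeps next == NULL. */
	while (--natids)
		p->tab[natids - 1].next = &p->tab[natids];
	p->afree = p->tab;
	return 0;
}

/* Pop the head of the free list; returns the ATID or -1 when empty. */
static int atid_pool_alloc(struct atid_pool *p, void *data)
{
	union aopen_entry *e = p->afree;

	if (!e)
		return -1;
	p->afree = e->next;
	e->data = data;
	p->in_use++;
	return (int)(e - p->tab);
}

/* Push an entry back onto the free list. */
static void atid_pool_release(struct atid_pool *p, unsigned int atid)
{
	union aopen_entry *e = &p->tab[atid];

	e->next = p->afree;
	p->afree = e;
	p->in_use--;
}

void atid_pool_selfcheck(void)
{
	struct atid_pool p;
	int dummy = 0;

	assert(atid_pool_init(&p, 3) == 0);
	assert(atid_pool_alloc(&p, &dummy) == 0);
	assert(atid_pool_alloc(&p, &dummy) == 1);
	assert(atid_pool_alloc(&p, &dummy) == 2);
	assert(atid_pool_alloc(&p, &dummy) == -1); /* pool exhausted */
	atid_pool_release(&p, 1);
	assert(atid_pool_alloc(&p, &dummy) == 1);  /* slot is reused */
	assert(p.in_use == 3);
	free(p.tab);
}
```

Because free slots hold the link themselves, the list needs no memory
beyond the table that tid_init() already allocates.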

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 0711976..895c767 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -83,5 +83,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_mempool lib/librte_mbuf
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_net lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_sched
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index a5225c0..51896b1 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -40,6 +40,7 @@
 
 #include "cxgbe_compat.h"
 #include "t4_regs_values.h"
+#include "cxgbe_ofld.h"
 
 enum {
 	MAX_ETH_QSETS = 64,           /* # of Ethernet Tx/Rx queue sets */
@@ -319,6 +320,12 @@ struct adapter {
 	unsigned int pf;       /* associated physical function id */
 
 	int use_unpacked_mode; /* unpacked rx mode state */
+
+	unsigned int clipt_start; /* CLIP table start */
+	unsigned int clipt_end;   /* CLIP table end */
+	unsigned int l2t_start;   /* Layer 2 table start */
+	unsigned int l2t_end;     /* Layer 2 table end */
+	struct tid_info tids;     /* Info used to access TID related tables */
 };
 
 #define CXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index 74f19fe..8286ca1 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -394,6 +394,10 @@ enum fw_params_mnem {
 enum fw_params_param_dev {
 	FW_PARAMS_PARAM_DEV_CCLK	= 0x00, /* chip core clock in khz */
 	FW_PARAMS_PARAM_DEV_PORTVEC	= 0x01, /* the port vector */
+	FW_PARAMS_PARAM_DEV_NTID	= 0x02, /* reads the number of TIDs
+						 * allocated by the device's
+						 * Lookup Engine
+						 */
 	FW_PARAMS_PARAM_DEV_ULPTX_MEMWRITE_DSGL = 0x17,
 };
 
@@ -401,6 +405,12 @@ enum fw_params_param_dev {
  * physical and virtual function parameters
  */
 enum fw_params_param_pfvf {
+	FW_PARAMS_PARAM_PFVF_CLIP_START = 0x03,
+	FW_PARAMS_PARAM_PFVF_CLIP_END = 0x04,
+	FW_PARAMS_PARAM_PFVF_FILTER_START = 0x05,
+	FW_PARAMS_PARAM_PFVF_FILTER_END = 0x06,
+	FW_PARAMS_PARAM_PFVF_L2T_START = 0x13,
+	FW_PARAMS_PARAM_PFVF_L2T_END = 0x14,
 	FW_PARAMS_PARAM_PFVF_CPLFW4MSG_ENCAP = 0x31
 };
 
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
new file mode 100644
index 0000000..a746d13
--- /dev/null
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -0,0 +1,231 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_FILTER_H_
+#define _CXGBE_FILTER_H_
+
+/*
+ * Bit widths of the user-definable filter tuple fields
+ */
+#define ETHTYPE_BITWIDTH 16
+#define FRAG_BITWIDTH 1
+#define MACIDX_BITWIDTH 9
+#define FCOE_BITWIDTH 1
+#define IPORT_BITWIDTH 3
+#define MATCHTYPE_BITWIDTH 3
+#define PROTO_BITWIDTH 8
+#define TOS_BITWIDTH 8
+#define PF_BITWIDTH 8
+#define VF_BITWIDTH 8
+#define IVLAN_BITWIDTH 16
+#define OVLAN_BITWIDTH 16
+
+/*
+ * Filter matching rules.  These consist of a set of ingress packet field
+ * (value, mask) tuples.  The associated ingress packet field matches the
+ * tuple when ((field & mask) == value).  (Thus a wildcard "don't care" field
+ * rule can be constructed by specifying a tuple of (0, 0).)  A filter rule
+ * matches an ingress packet when all of the individual field
+ * matching rules are true.
+ *
+ * Partial field masks are always valid, however, while it may be easy to
+ * understand their meanings for some fields (e.g. IP address to match a
+ * subnet), for others making sensible partial masks is less intuitive (e.g.
+ * MPS match type) ...
+ */
+struct ch_filter_tuple {
+	/*
+	 * Compressed header matching field rules.  The TP_VLAN_PRI_MAP
+	 * register selects which of these fields will participate in the
+	 * filter match rules -- up to a maximum of 36 bits.  Because
+	 * TP_VLAN_PRI_MAP is a global register, all filters must use the same
+	 * set of fields.
+	 */
+	uint32_t ethtype:ETHTYPE_BITWIDTH;	/* Ethernet type */
+	uint32_t frag:FRAG_BITWIDTH;		/* IP fragmentation header */
+	uint32_t ivlan_vld:1;			/* inner VLAN valid */
+	uint32_t ovlan_vld:1;			/* outer VLAN valid */
+	uint32_t pfvf_vld:1;			/* PF/VF valid */
+	uint32_t macidx:MACIDX_BITWIDTH;	/* exact match MAC index */
+	uint32_t fcoe:FCOE_BITWIDTH;		/* FCoE packet */
+	uint32_t iport:IPORT_BITWIDTH;		/* ingress port */
+	uint32_t matchtype:MATCHTYPE_BITWIDTH;	/* MPS match type */
+	uint32_t proto:PROTO_BITWIDTH;		/* protocol type */
+	uint32_t tos:TOS_BITWIDTH;		/* TOS/Traffic Type */
+	uint32_t pf:PF_BITWIDTH;		/* PCI-E PF ID */
+	uint32_t vf:VF_BITWIDTH;		/* PCI-E VF ID */
+	uint32_t ivlan:IVLAN_BITWIDTH;		/* inner VLAN */
+	uint32_t ovlan:OVLAN_BITWIDTH;		/* outer VLAN */
+
+	/*
+	 * Uncompressed header matching field rules.  These are always
+	 * available for field rules.
+	 */
+	uint8_t lip[16];	/* local IP address (IPv4 in [3:0]) */
+	uint8_t fip[16];	/* foreign IP address (IPv4 in [3:0]) */
+	uint16_t lport;		/* local port */
+	uint16_t fport;		/* foreign port */
+
+	/* reservations for future additions */
+	uint8_t rsvd[12];
+};
+
+/*
+ * Filter specification
+ */
+struct ch_filter_specification {
+	/* Administrative fields for filter. */
+	uint32_t hitcnts:1;	/* count filter hits in TCB */
+	uint32_t prio:1;	/* filter has priority over active/server */
+
+	/*
+	 * Fundamental filter typing.  This is the one element of filter
+	 * matching that doesn't exist as a (value, mask) tuple.
+	 */
+	uint32_t type:1;	/* 0 => IPv4, 1 => IPv6 */
+	uint32_t cap:1;		/* 0 => LE-TCAM, 1 => Hash */
+
+	/*
+	 * Packet dispatch information.  Ingress packets which match the
+	 * filter rules will be dropped, passed to the host or switched back
+	 * out as egress packets.
+	 */
+	uint32_t action:2;	/* drop, pass, switch */
+
+	uint32_t rpttid:1;	/* report TID in RSS hash field */
+
+	uint32_t dirsteer:1;	/* 0 => RSS, 1 => steer to iq */
+	uint32_t iq:10;		/* ingress queue */
+
+	uint32_t maskhash:1;	/* dirsteer=0: store RSS hash in TCB */
+	uint32_t dirsteerhash:1;/* dirsteer=1: 0 => TCB contains RSS hash */
+				/*             1 => TCB contains IQ ID */
+
+	/*
+	 * Switch proxy/rewrite fields.  An ingress packet which matches a
+	 * filter with "switch" set will be looped back out as an egress
+	 * packet -- potentially with some Ethernet header rewriting.
+	 */
+	uint32_t eport:2;	/* egress port to switch packet out */
+	uint32_t newdmac:1;	/* rewrite destination MAC address */
+	uint32_t newsmac:1;	/* rewrite source MAC address */
+	uint32_t swapmac:1;     /* swap SMAC/DMAC for loopback packet */
+	uint32_t newvlan:2;	/* rewrite VLAN Tag */
+	uint32_t nat_mode:3;	/* specify NAT operation mode */
+	uint32_t nat_flag_chk:1;/* check TCP flags before NAT'ing */
+	uint32_t nat_seq_chk;	/* sequence value to use for NAT check */
+	uint8_t dmac[ETHER_ADDR_LEN];	/* new destination MAC address */
+	uint8_t smac[ETHER_ADDR_LEN];	/* new source MAC address */
+	uint16_t vlan;		/* VLAN Tag to insert */
+
+	uint8_t nat_lip[16];	/* local IP to use after NAT'ing */
+	uint8_t nat_fip[16];	/* foreign IP to use after NAT'ing */
+	uint16_t nat_lport;	/* local port to use after NAT'ing */
+	uint16_t nat_fport;	/* foreign port to use after NAT'ing */
+
+	/* reservation for future additions */
+	uint8_t rsvd[6];
+
+	/* Filter rule value/mask pairs. */
+	struct ch_filter_tuple val;
+	struct ch_filter_tuple mask;
+};
+
+enum {
+	FILTER_PASS = 0,	/* default */
+	FILTER_DROP,
+	FILTER_SWITCH
+};
+
+enum {
+	VLAN_NOCHANGE = 0,	/* default */
+	VLAN_REMOVE,
+	VLAN_INSERT,
+	VLAN_REWRITE
+};
+
+enum {
+	NAT_MODE_NONE = 0,	/* No NAT performed */
+	NAT_MODE_DIP,		/* NAT on Dst IP */
+	NAT_MODE_DIP_DP,	/* NAT on Dst IP, Dst Port */
+	NAT_MODE_DIP_DP_SIP,	/* NAT on Dst IP, Dst Port and Src IP */
+	NAT_MODE_DIP_DP_SP,	/* NAT on Dst IP, Dst Port and Src Port */
+	NAT_MODE_SIP_SP,	/* NAT on Src IP and Src Port */
+	NAT_MODE_DIP_SIP_SP,	/* NAT on Dst IP, Src IP and Src Port */
+	NAT_MODE_ALL		/* NAT on entire 4-tuple */
+};
+
+enum filter_type {
+	FILTER_TYPE_IPV4 = 0,
+	FILTER_TYPE_IPV6,
+};
+
+struct t4_completion {
+	unsigned int done;       /* completion done (0 - No, 1 - Yes) */
+	rte_spinlock_t lock;     /* completion lock */
+};
+
+/*
+ * Filter operation context to allow callers of cxgbe_set_filter() and
+ * cxgbe_del_filter() to wait for an asynchronous completion.
+ */
+struct filter_ctx {
+	struct t4_completion completion; /* completion rendezvous */
+	int result;                      /* result of operation */
+	u32 tid;                         /* to store tid of hash filter */
+};
+
+/*
+ * Host shadow copy of ingress filter entry.  This is in host native format
+ * and doesn't match the ordering or bit order, etc. of the hardware or the
+ * firmware command.
+ */
+struct filter_entry {
+	/*
+	 * Administrative fields for filter.
+	 */
+	u32 valid:1;                /* filter allocated and valid */
+	u32 locked:1;               /* filter is administratively locked */
+	u32 pending:1;              /* filter action is pending FW reply */
+	struct filter_ctx *ctx;     /* caller's completion hook */
+	struct rte_eth_dev *dev;    /* Port's rte eth device */
+
+	/* This will store the actual tid */
+	u32 tid;
+
+	/*
+	 * The filter itself.
+	 */
+	struct ch_filter_specification fs;
+};
+#endif /* _CXGBE_FILTER_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index aff23d0..b116a40 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -67,6 +67,22 @@
 #include "t4_msg.h"
 #include "cxgbe.h"
 
+/**
+ * Allocate a chunk of memory. The allocated memory is cleared.
+ */
+void *t4_alloc_mem(size_t size)
+{
+	return rte_zmalloc(NULL, size, 0);
+}
+
+/**
+ * Free memory allocated through t4_alloc_mem().
+ */
+void t4_free_mem(void *addr)
+{
+	rte_free(addr);
+}
+
 /*
  * Response queue handler for the FW event queue.
  */
@@ -198,6 +214,78 @@ int cxgb4_set_rspq_intr_params(struct sge_rspq *q, unsigned int us,
 	return 0;
 }
 
+/**
+ * Free TID tables.
+ */
+static void tid_free(struct tid_info *t)
+{
+	if (t->tid_tab) {
+		if (t->ftid_bmap)
+			rte_bitmap_free(t->ftid_bmap);
+
+		if (t->ftid_bmap_array)
+			t4_os_free(t->ftid_bmap_array);
+
+		t4_os_free(t->tid_tab);
+	}
+
+	memset(t, 0, sizeof(struct tid_info));
+}
+
+/**
+ * Allocate and initialize the TID tables.  Returns 0 on success.
+ */
+static int tid_init(struct tid_info *t)
+{
+	size_t size;
+	unsigned int ftid_bmap_size;
+	unsigned int natids = t->natids;
+	unsigned int max_ftids = t->nftids;
+
+	ftid_bmap_size = rte_bitmap_get_memory_footprint(t->nftids);
+	size = t->ntids * sizeof(*t->tid_tab) +
+	       natids * sizeof(*t->atid_tab) +
+	       max_ftids * sizeof(*t->ftid_tab);
+
+	t->tid_tab = t4_os_alloc(size);
+	if (!t->tid_tab)
+		return -ENOMEM;
+
+	t->atid_tab = (union aopen_entry *)&t->tid_tab[t->ntids];
+	t->ftid_tab = (struct filter_entry *)&t->atid_tab[natids];
+	t->ftid_bmap_array = t4_os_alloc(ftid_bmap_size);
+	if (!t->ftid_bmap_array) {
+		tid_free(t);
+		return -ENOMEM;
+	}
+
+	t4_os_lock_init(&t->atid_lock);
+	t4_os_lock_init(&t->ftid_lock);
+
+	t->afree = NULL;
+	t->atids_in_use = 0;
+	rte_atomic32_init(&t->tids_in_use);
+	rte_atomic32_set(&t->tids_in_use, 0);
+	rte_atomic32_init(&t->conns_in_use);
+	rte_atomic32_set(&t->conns_in_use, 0);
+
+	/* Set up the free list for atid_tab. */
+	if (natids) {
+		while (--natids)
+			t->atid_tab[natids - 1].next = &t->atid_tab[natids];
+		t->afree = t->atid_tab;
+	}
+
+	t->ftid_bmap = rte_bitmap_init(t->nftids, t->ftid_bmap_array,
+				       ftid_bmap_size);
+	if (!t->ftid_bmap) {
+		tid_free(t);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
 static inline bool is_x_1g_port(const struct link_config *lc)
 {
 	return ((lc->supported & FW_PORT_CAP_SPEED_1G) != 0);
@@ -598,6 +686,7 @@ bye:
 
 static int adap_init0(struct adapter *adap)
 {
+	struct fw_caps_config_cmd caps_cmd;
 	int ret = 0;
 	u32 v, port_vec;
 	enum dev_state state;
@@ -729,6 +818,48 @@ static int adap_init0(struct adapter *adap)
 	 V_FW_PARAMS_PARAM_Y(0) | \
 	 V_FW_PARAMS_PARAM_Z(0))
 
+	params[0] = FW_PARAM_PFVF(L2T_START);
+	params[1] = FW_PARAM_PFVF(L2T_END);
+	params[2] = FW_PARAM_PFVF(FILTER_START);
+	params[3] = FW_PARAM_PFVF(FILTER_END);
+	ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 4, params, val);
+	if (ret < 0)
+		goto bye;
+	adap->l2t_start = val[0];
+	adap->l2t_end = val[1];
+	adap->tids.ftid_base = val[2];
+	adap->tids.nftids = val[3] - val[2] + 1;
+
+	params[0] = FW_PARAM_PFVF(CLIP_START);
+	params[1] = FW_PARAM_PFVF(CLIP_END);
+	ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 2, params, val);
+	if (ret < 0)
+		goto bye;
+	adap->clipt_start = val[0];
+	adap->clipt_end = val[1];
+
+	/*
+	 * Get device capabilities so we can determine what resources we need
+	 * to manage.
+	 */
+	memset(&caps_cmd, 0, sizeof(caps_cmd));
+	caps_cmd.op_to_write = htonl(V_FW_CMD_OP(FW_CAPS_CONFIG_CMD) |
+				     F_FW_CMD_REQUEST | F_FW_CMD_READ);
+	caps_cmd.cfvalid_to_len16 = htonl(FW_LEN16(caps_cmd));
+	ret = t4_wr_mbox(adap, adap->mbox, &caps_cmd, sizeof(caps_cmd),
+			 &caps_cmd);
+	if (ret < 0)
+		goto bye;
+
+	/* query tid-related parameters */
+	params[0] = FW_PARAM_DEV(NTID);
+	ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 1,
+			      params, val);
+	if (ret < 0)
+		goto bye;
+	adap->tids.ntids = val[0];
+	adap->tids.natids = min(adap->tids.ntids / 2, MAX_ATIDS);
+
 	/* If we're running on newer firmware, let it know that we're
 	 * prepared to deal with encapsulated CPL messages.  Older
 	 * firmware won't understand this and we'll just get
@@ -1049,6 +1180,7 @@ void cxgbe_close(struct adapter *adapter)
 	int i;
 
 	if (adapter->flags & FULL_INIT_DONE) {
+		tid_free(&adapter->tids);
 		t4_intr_disable(adapter);
 		t4_sge_tx_monitor_stop(adapter);
 		t4_free_sge_resources(adapter);
@@ -1191,6 +1323,12 @@ allocate_mac:
 
 	print_port_info(adapter);
 
+	if (tid_init(&adapter->tids) < 0) {
+		/* Disable filtering support */
+		dev_warn(adapter, "could not allocate TID table, "
+			 "filter support disabled. Continuing\n");
+	}
+
 	err = init_rss(adapter);
 	if (err)
 		goto out_free;
diff --git a/drivers/net/cxgbe/cxgbe_ofld.h b/drivers/net/cxgbe/cxgbe_ofld.h
new file mode 100644
index 0000000..3bcb648
--- /dev/null
+++ b/drivers/net/cxgbe/cxgbe_ofld.h
@@ -0,0 +1,86 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_OFLD_H_
+#define _CXGBE_OFLD_H_
+
+#include <rte_bitmap.h>
+
+#include "cxgbe_filter.h"
+
+/*
+ * Max # of ATIDs.  The absolute HW max is 16K but we keep it lower.
+ */
+#define MAX_ATIDS 8192U
+
+union aopen_entry {
+	void *data;
+	union aopen_entry *next;
+};
+
+/*
+ * Holds the size, base address, free list start, etc of the TID, filter TID
+ * and active-open TID tables.  The tables themselves are allocated dynamically.
+ */
+struct tid_info {
+	void **tid_tab;
+	unsigned int ntids;
+
+	unsigned int hash_base;
+
+	union aopen_entry *atid_tab;
+	unsigned int natids;
+
+	struct filter_entry *ftid_tab;	/* Normal filters */
+	struct rte_bitmap *ftid_bmap;
+	uint8_t *ftid_bmap_array;
+	unsigned int nftids;
+	unsigned int ftid_base;
+
+	/*
+	 * The following members are accessed R/W so we put them in their own
+	 * cache line.
+	 */
+	rte_spinlock_t atid_lock __rte_cache_aligned;
+	union aopen_entry *afree;
+	unsigned int atids_in_use;
+
+	/* TIDs in the TCAM */
+	rte_atomic32_t tids_in_use;
+	/* TIDs in the HASH */
+	rte_atomic32_t hash_tids_in_use;
+
+	rte_atomic32_t conns_in_use;
+	rte_spinlock_t ftid_lock;
+};
+#endif /* _CXGBE_OFLD_H_ */
-- 
2.5.3


* [PATCH 04/10] cxgbe: add control txq for communicating filtering info
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (2 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 03/10] cxgbe: add skeleton to add support for T5 hardware filtering Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 05/10] cxgbe: add compressed local IP table for matching IPv6 addresses Rahul Lakkireddy
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a control Tx queue (txq) with its own mempool, used to build the
control packets that carry filtering information to the firmware.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
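Reviewer note: the S_/M_/V_/G_ macros added to t4fw_interface.h follow
the usual shift/mask/value/get pattern. The sketch below packs and
unpacks the cmpliqid_eqid word of struct fw_eq_ctrl_cmd. The EQID and
CMPLIQID shift macros are copied from this patch; M_DEMO_CMPLIQID,
G_DEMO_CMPLIQID and pack_cmpliqid_eqid() are hypothetical, with the
12-bit width assumed from the bits left above the shift of 20. The
cpu_to_be32() conversion the driver applies before sending the command
is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the patch. */
#define S_FW_EQ_CTRL_CMD_CMPLIQID	20
#define V_FW_EQ_CTRL_CMD_CMPLIQID(x)	((x) << S_FW_EQ_CTRL_CMD_CMPLIQID)

#define S_FW_EQ_CTRL_CMD_EQID		0
#define M_FW_EQ_CTRL_CMD_EQID		0xfffff
#define V_FW_EQ_CTRL_CMD_EQID(x)	((x) << S_FW_EQ_CTRL_CMD_EQID)
#define G_FW_EQ_CTRL_CMD_EQID(x)	\
	(((x) >> S_FW_EQ_CTRL_CMD_EQID) & M_FW_EQ_CTRL_CMD_EQID)

/* Assumption, not in the patch: a 12-bit field above bit 20. */
#define M_DEMO_CMPLIQID			0xfff
#define G_DEMO_CMPLIQID(x)		\
	(((x) >> S_FW_EQ_CTRL_CMD_CMPLIQID) & M_DEMO_CMPLIQID)

/* Build the host-order cmpliqid_eqid word from its two fields. */
static uint32_t pack_cmpliqid_eqid(uint32_t cmpliqid, uint32_t eqid)
{
	return V_FW_EQ_CTRL_CMD_CMPLIQID(cmpliqid & M_DEMO_CMPLIQID) |
	       V_FW_EQ_CTRL_CMD_EQID(eqid & M_FW_EQ_CTRL_CMD_EQID);
}

void demo_pack_check(void)
{
	uint32_t w = pack_cmpliqid_eqid(5, 0x1234);

	assert(w == 0x00501234u);
	assert(G_FW_EQ_CTRL_CMD_EQID(w) == 0x1234u);
	assert(G_DEMO_CMPLIQID(w) == 5u);
}
```

Masking each value before shifting keeps an out-of-range argument from
spilling into the neighbouring field.
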
 drivers/net/cxgbe/base/adapter.h        |  15 +++
 drivers/net/cxgbe/base/common.h         |   2 +
 drivers/net/cxgbe/base/t4_hw.c          |  25 ++++
 drivers/net/cxgbe/base/t4fw_interface.h |  64 ++++++++++
 drivers/net/cxgbe/cxgbe.h               |   1 +
 drivers/net/cxgbe/cxgbe_ethdev.c        |   3 +
 drivers/net/cxgbe/cxgbe_main.c          |  41 +++++++
 drivers/net/cxgbe/sge.c                 | 199 +++++++++++++++++++++++++++++++-
 8 files changed, 349 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 51896b1..ff8000f 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -44,6 +44,7 @@
 
 enum {
 	MAX_ETH_QSETS = 64,           /* # of Ethernet Tx/Rx queue sets */
+	MAX_CTRL_QUEUES = NCHAN,      /* # of control Tx queues */
 };
 
 struct adapter;
@@ -271,10 +272,20 @@ struct sge_eth_txq {                   /* state for an SGE Ethernet Tx queue */
 	unsigned int flags;            /* flags for state of the queue */
 } __rte_cache_aligned;
 
+struct sge_ctrl_txq {                /* State for an SGE control Tx queue */
+	struct sge_txq q;            /* txq */
+	struct adapter *adapter;     /* adapter associated with this queue */
+	rte_spinlock_t ctrlq_lock;   /* control queue lock */
+	u8 full;                     /* the Tx ring is full */
+	u64 txp;                     /* number of transmits */
+	struct rte_mempool *mb_pool; /* mempool to generate ctrl pkts */
+} __rte_cache_aligned;
+
 struct sge {
 	struct sge_eth_txq ethtxq[MAX_ETH_QSETS];
 	struct sge_eth_rxq ethrxq[MAX_ETH_QSETS];
 	struct sge_rspq fw_evtq __rte_cache_aligned;
+	struct sge_ctrl_txq ctrlq[MAX_CTRL_QUEUES];
 
 	u16 max_ethqsets;           /* # of available Ethernet queue sets */
 	u32 stat_len;               /* length of status page at ring end */
@@ -556,12 +567,16 @@ void t4_free_sge_resources(struct adapter *adap);
 void t4_sge_tx_monitor_start(struct adapter *adap);
 void t4_sge_tx_monitor_stop(struct adapter *adap);
 int t4_eth_xmit(struct sge_eth_txq *txq, struct rte_mbuf *mbuf);
+int t4_mgmt_tx(struct sge_ctrl_txq *txq, struct rte_mbuf *mbuf);
 int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 		     const struct pkt_gl *gl);
 int t4_sge_init(struct adapter *adap);
 int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
 			 struct rte_eth_dev *eth_dev, uint16_t queue_id,
 			 unsigned int iqid, int socket_id);
+int t4_sge_alloc_ctrl_txq(struct adapter *adap, struct sge_ctrl_txq *txq,
+			  struct rte_eth_dev *eth_dev, uint16_t queue_id,
+			  unsigned int iqid, int socket_id);
 int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *rspq, bool fwevtq,
 		     struct rte_eth_dev *eth_dev, int intr_idx,
 		     struct sge_fl *fl, rspq_handler_t handler,
diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h
index cf2e82d..2b39c10 100644
--- a/drivers/net/cxgbe/base/common.h
+++ b/drivers/net/cxgbe/base/common.h
@@ -311,6 +311,8 @@ int t4_iq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 	       unsigned int fl0id, unsigned int fl1id);
 int t4_eth_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 		   unsigned int vf, unsigned int eqid);
+int t4_ctrl_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
+		    unsigned int vf, unsigned int eqid);
 
 static inline unsigned int core_ticks_per_usec(const struct adapter *adap)
 {
diff --git a/drivers/net/cxgbe/base/t4_hw.c b/drivers/net/cxgbe/base/t4_hw.c
index 884d2cf..de2e6b7 100644
--- a/drivers/net/cxgbe/base/t4_hw.c
+++ b/drivers/net/cxgbe/base/t4_hw.c
@@ -2124,6 +2124,31 @@ int t4_eth_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 }
 
 /**
+ * t4_ctrl_eq_free - free a control egress queue
+ * @adap: the adapter
+ * @mbox: mailbox to use for the FW command
+ * @pf: the PF owning the queue
+ * @vf: the VF owning the queue
+ * @eqid: egress queue id
+ *
+ * Frees a control egress queue.
+ */
+int t4_ctrl_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
+		    unsigned int vf, unsigned int eqid)
+{
+	struct fw_eq_ctrl_cmd c;
+
+	memset(&c, 0, sizeof(c));
+	c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_EQ_CTRL_CMD) |
+				  F_FW_CMD_REQUEST | F_FW_CMD_EXEC |
+				  V_FW_EQ_CTRL_CMD_PFN(pf) |
+				  V_FW_EQ_CTRL_CMD_VFN(vf));
+	c.alloc_to_len16 = cpu_to_be32(F_FW_EQ_CTRL_CMD_FREE | FW_LEN16(c));
+	c.cmpliqid_eqid = cpu_to_be32(V_FW_EQ_CTRL_CMD_EQID(eqid));
+	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
+}
+
+/**
  * t4_handle_fw_rpl - process a FW reply message
  * @adap: the adapter
  * @rpl: start of the FW message
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index 8286ca1..ddbff89 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -178,6 +178,7 @@ enum fw_cmd_opcodes {
 	FW_PARAMS_CMD                  = 0x08,
 	FW_IQ_CMD                      = 0x10,
 	FW_EQ_ETH_CMD                  = 0x12,
+	FW_EQ_CTRL_CMD                 = 0x13,
 	FW_VI_CMD                      = 0x14,
 	FW_VI_MAC_CMD                  = 0x15,
 	FW_VI_RXMODE_CMD               = 0x16,
@@ -788,6 +789,69 @@ struct fw_eq_eth_cmd {
 #define G_FW_EQ_ETH_CMD_VIID(x)	\
 	(((x) >> S_FW_EQ_ETH_CMD_VIID) & M_FW_EQ_ETH_CMD_VIID)
 
+struct fw_eq_ctrl_cmd {
+	__be32 op_to_vfn;
+	__be32 alloc_to_len16;
+	__be32 cmpliqid_eqid;
+	__be32 physeqid_pkd;
+	__be32 fetchszm_to_iqid;
+	__be32 dcaen_to_eqsize;
+	__be64 eqaddr;
+};
+
+#define S_FW_EQ_CTRL_CMD_PFN		8
+#define V_FW_EQ_CTRL_CMD_PFN(x)		((x) << S_FW_EQ_CTRL_CMD_PFN)
+
+#define S_FW_EQ_CTRL_CMD_VFN		0
+#define V_FW_EQ_CTRL_CMD_VFN(x)		((x) << S_FW_EQ_CTRL_CMD_VFN)
+
+#define S_FW_EQ_CTRL_CMD_ALLOC		31
+#define V_FW_EQ_CTRL_CMD_ALLOC(x)	((x) << S_FW_EQ_CTRL_CMD_ALLOC)
+#define F_FW_EQ_CTRL_CMD_ALLOC		V_FW_EQ_CTRL_CMD_ALLOC(1U)
+
+#define S_FW_EQ_CTRL_CMD_FREE		30
+#define V_FW_EQ_CTRL_CMD_FREE(x)	((x) << S_FW_EQ_CTRL_CMD_FREE)
+#define F_FW_EQ_CTRL_CMD_FREE		V_FW_EQ_CTRL_CMD_FREE(1U)
+
+#define S_FW_EQ_CTRL_CMD_EQSTART	28
+#define V_FW_EQ_CTRL_CMD_EQSTART(x)	((x) << S_FW_EQ_CTRL_CMD_EQSTART)
+#define F_FW_EQ_CTRL_CMD_EQSTART	V_FW_EQ_CTRL_CMD_EQSTART(1U)
+
+#define S_FW_EQ_CTRL_CMD_CMPLIQID	20
+#define V_FW_EQ_CTRL_CMD_CMPLIQID(x)	((x) << S_FW_EQ_CTRL_CMD_CMPLIQID)
+
+#define S_FW_EQ_CTRL_CMD_EQID		0
+#define M_FW_EQ_CTRL_CMD_EQID		0xfffff
+#define V_FW_EQ_CTRL_CMD_EQID(x)	((x) << S_FW_EQ_CTRL_CMD_EQID)
+#define G_FW_EQ_CTRL_CMD_EQID(x)	\
+	(((x) >> S_FW_EQ_CTRL_CMD_EQID) & M_FW_EQ_CTRL_CMD_EQID)
+
+#define S_FW_EQ_CTRL_CMD_FETCHRO	22
+#define V_FW_EQ_CTRL_CMD_FETCHRO(x)	((x) << S_FW_EQ_CTRL_CMD_FETCHRO)
+#define F_FW_EQ_CTRL_CMD_FETCHRO	V_FW_EQ_CTRL_CMD_FETCHRO(1U)
+
+#define S_FW_EQ_CTRL_CMD_HOSTFCMODE	20
+#define M_FW_EQ_CTRL_CMD_HOSTFCMODE	0x3
+#define V_FW_EQ_CTRL_CMD_HOSTFCMODE(x)	((x) << S_FW_EQ_CTRL_CMD_HOSTFCMODE)
+
+#define S_FW_EQ_CTRL_CMD_PCIECHN	16
+#define V_FW_EQ_CTRL_CMD_PCIECHN(x)	((x) << S_FW_EQ_CTRL_CMD_PCIECHN)
+
+#define S_FW_EQ_CTRL_CMD_IQID		0
+#define V_FW_EQ_CTRL_CMD_IQID(x)	((x) << S_FW_EQ_CTRL_CMD_IQID)
+
+#define S_FW_EQ_CTRL_CMD_FBMIN		23
+#define V_FW_EQ_CTRL_CMD_FBMIN(x)	((x) << S_FW_EQ_CTRL_CMD_FBMIN)
+
+#define S_FW_EQ_CTRL_CMD_FBMAX		20
+#define V_FW_EQ_CTRL_CMD_FBMAX(x)	((x) << S_FW_EQ_CTRL_CMD_FBMAX)
+
+#define S_FW_EQ_CTRL_CMD_CIDXFTHRESH	16
+#define V_FW_EQ_CTRL_CMD_CIDXFTHRESH(x)	((x) << S_FW_EQ_CTRL_CMD_CIDXFTHRESH)
+
+#define S_FW_EQ_CTRL_CMD_EQSIZE		0
+#define V_FW_EQ_CTRL_CMD_EQSIZE(x)	((x) << S_FW_EQ_CTRL_CMD_EQSIZE)
+
 enum fw_vi_func {
 	FW_VI_FUNC_ETH,
 };
diff --git a/drivers/net/cxgbe/cxgbe.h b/drivers/net/cxgbe/cxgbe.h
index 0201c99..9ca4388 100644
--- a/drivers/net/cxgbe/cxgbe.h
+++ b/drivers/net/cxgbe/cxgbe.h
@@ -56,6 +56,7 @@ int link_start(struct port_info *pi);
 void init_rspq(struct adapter *adap, struct sge_rspq *q, unsigned int us,
 	       unsigned int cnt, unsigned int size, unsigned int iqe_size);
 int setup_sge_fwevtq(struct adapter *adapter);
+int setup_sge_ctrl_txq(struct adapter *adapter);
 void cfg_queues(struct rte_eth_dev *eth_dev);
 int cfg_queue_count(struct rte_eth_dev *eth_dev);
 int setup_rss(struct port_info *pi);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 97ef152..2701bb6 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -393,6 +393,9 @@ static int cxgbe_dev_configure(struct rte_eth_dev *eth_dev)
 		if (err)
 			return err;
 		adapter->flags |= FW_QUEUE_BOUND;
+		err = setup_sge_ctrl_txq(adapter);
+		if (err)
+			return err;
 	}
 
 	err = cfg_queue_count(eth_dev);
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index b116a40..7a1cc13 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -123,6 +123,47 @@ out:
 	return 0;
 }
 
+/**
+ * Set up SGE control queues to pass control information.
+ */
+int setup_sge_ctrl_txq(struct adapter *adapter)
+{
+	struct sge *s = &adapter->sge;
+	int err = 0, i = 0;
+
+	for_each_port(adapter, i) {
+		char name[RTE_ETH_NAME_MAX_LEN];
+		struct sge_ctrl_txq *q = &s->ctrlq[i];
+
+		q->q.size = 1024;
+		err = t4_sge_alloc_ctrl_txq(adapter, q,
+					    adapter->eth_dev,  i,
+					    s->fw_evtq.cntxt_id,
+					    rte_socket_id());
+		if (err) {
+			dev_err(adapter, "Failed to alloc ctrl txq. Err: %d",
+				err);
+			goto out;
+		}
+		snprintf(name, sizeof(name), "cxgbe_ctrl_pool_%d", i);
+		q->mb_pool = rte_pktmbuf_pool_create(name, s->ctrlq[i].q.size,
+						     RTE_CACHE_LINE_SIZE,
+						     RTE_MBUF_PRIV_ALIGN,
+						     RTE_MBUF_DEFAULT_BUF_SIZE,
+						     SOCKET_ID_ANY);
+		if (!q->mb_pool) {
+			dev_err(adapter, "Can't create ctrl pool for port: %d",
+				i);
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+	return 0;
+out:
+	t4_free_sge_resources(adapter);
+	return err;
+}
+
 int setup_sge_fwevtq(struct adapter *adapter)
 {
 	struct sge *s = &adapter->sge;
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 3c62d03..bd4b381 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -84,6 +84,11 @@ static inline void ship_tx_pkt_coalesce_wr(struct adapter *adap,
 #define MAX_IMM_TX_PKT_LEN 256
 
 /*
+ * Max size of a WR sent through a control Tx queue.
+ */
+#define MAX_CTRL_WR_LEN SGE_MAX_WR_LEN
+
+/*
  * Rx buffer sizes for "usembufs" Free List buffers (one ingress packet
  * per mbuf buffer).  We currently only support two sizes for 1500- and
  * 9000-byte MTUs. We could easily support more but there doesn't seem to be
@@ -1234,6 +1239,129 @@ out_free:
 }
 
 /**
+ * reclaim_completed_tx_imm - reclaim completed control-queue Tx descs
+ * @q: the SGE control Tx queue
+ *
+ * This is a variant of reclaim_completed_tx() that is used for Tx queues
+ * that send only immediate data (presently just the control queues) and
+ * thus do not have any mbufs to release.
+ */
+static inline void reclaim_completed_tx_imm(struct sge_txq *q)
+{
+	int hw_cidx = ntohs(q->stat->cidx);
+	int reclaim = hw_cidx - q->cidx;
+
+	if (reclaim < 0)
+		reclaim += q->size;
+
+	q->in_use -= reclaim;
+	q->cidx = hw_cidx;
+}
+
+/**
+ * is_imm - check whether a packet can be sent as immediate data
+ * @mbuf: the packet
+ *
+ * Returns true if a packet can be sent as a WR with immediate data.
+ */
+static inline int is_imm(const struct rte_mbuf *mbuf)
+{
+	return mbuf->pkt_len <= MAX_CTRL_WR_LEN;
+}
+
+/**
+ * inline_tx_mbuf - inline a packet's data into TX descriptors
+ * @q: the TX queue where the packet will be inlined
+ * @from: pointer to data portion of packet
+ * @to: pointer after cpl where data has to be inlined
+ * @len: length of data to inline
+ *
+ * Inline a packet's contents directly to TX descriptors, starting at
+ * the given position within the TX DMA ring.
+ * Most of the complexity of this operation is dealing with wrap arounds
+ * in the middle of the packet we want to inline.
+ */
+static void inline_tx_mbuf(const struct sge_txq *q, caddr_t from, caddr_t *to,
+			   int len)
+{
+	int left = RTE_PTR_DIFF(q->stat, *to);
+
+	if (likely((uintptr_t)*to + len <= (uintptr_t)q->stat)) {
+		rte_memcpy(*to, from, len);
+		*to = RTE_PTR_ADD(*to, len);
+	} else {
+		rte_memcpy(*to, from, left);
+		from = RTE_PTR_ADD(from, left);
+		left = len - left;
+		rte_memcpy((void *)q->desc, from, left);
+		*to = RTE_PTR_ADD((void *)q->desc, left);
+	}
+}
+
+/**
+ * ctrl_xmit - send a packet through an SGE control Tx queue
+ * @q: the control queue
+ * @mbuf: the packet
+ *
+ * Send a packet through an SGE control Tx queue.  Packets sent through
+ * a control queue must fit entirely as immediate data.
+ */
+static int ctrl_xmit(struct sge_ctrl_txq *q, struct rte_mbuf *mbuf)
+{
+	unsigned int ndesc;
+	struct fw_wr_hdr *wr;
+	caddr_t dst;
+
+	if (unlikely(!is_imm(mbuf))) {
+		WARN_ON(1);
+		rte_pktmbuf_free(mbuf);
+		return -1;
+	}
+
+	reclaim_completed_tx_imm(&q->q);
+	ndesc = DIV_ROUND_UP(mbuf->pkt_len, sizeof(struct tx_desc));
+	t4_os_lock(&q->ctrlq_lock);
+
+	q->full = txq_avail(&q->q) < ndesc ? 1 : 0;
+	if (unlikely(q->full)) {
+		t4_os_unlock(&q->ctrlq_lock);
+		return -1;
+	}
+
+	wr = (struct fw_wr_hdr *)&q->q.desc[q->q.pidx];
+	dst = (void *)wr;
+	inline_tx_mbuf(&q->q, rte_pktmbuf_mtod(mbuf, caddr_t),
+		       &dst, mbuf->data_len);
+
+	txq_advance(&q->q, ndesc);
+	if (unlikely(txq_avail(&q->q) < 64))
+		wr->lo |= htonl(F_FW_WR_EQUEQ);
+
+	q->txp++;
+
+	ring_tx_db(q->adapter, &q->q);
+	t4_os_unlock(&q->ctrlq_lock);
+
+	rte_pktmbuf_free(mbuf);
+	return 0;
+}
+
+/**
+ * t4_mgmt_tx - send a management message
+ * @q: the control queue
+ * @mbuf: the packet containing the management message
+ *
+ * Send a management message through control queue.
+ */
+int t4_mgmt_tx(struct sge_ctrl_txq *q, struct rte_mbuf *mbuf)
+{
+	int ret;
+
+	ret = ctrl_xmit(q, mbuf);
+	return ret;
+}
+
+/**
  * alloc_ring - allocate resources for an SGE descriptor ring
  * @dev: the PCI device's core device
  * @nelem: the number of descriptors
@@ -1949,6 +2077,63 @@ int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
 	return 0;
 }
 
+int t4_sge_alloc_ctrl_txq(struct adapter *adap, struct sge_ctrl_txq *txq,
+			  struct rte_eth_dev *eth_dev, uint16_t queue_id,
+			  unsigned int iqid, int socket_id)
+{
+	int ret, nentries;
+	struct fw_eq_ctrl_cmd c;
+	struct sge *s = &adap->sge;
+	struct port_info *pi = (struct port_info *)(eth_dev->data->dev_private);
+	char z_name[RTE_MEMZONE_NAMESIZE];
+	char z_name_sw[RTE_MEMZONE_NAMESIZE];
+
+	/* Add status entries */
+	nentries = txq->q.size + s->stat_len / sizeof(struct tx_desc);
+
+	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+		 eth_dev->driver->pci_drv.name, "ctrl_tx_ring",
+		 eth_dev->data->port_id, queue_id);
+	snprintf(z_name_sw, sizeof(z_name_sw), "%s_sw_ring", z_name);
+
+	txq->q.desc = alloc_ring(txq->q.size, sizeof(struct tx_desc),
+				 0, &txq->q.phys_addr,
+				 NULL, 0, queue_id,
+				 socket_id, z_name, z_name_sw);
+	if (!txq->q.desc)
+		return -ENOMEM;
+
+	memset(&c, 0, sizeof(c));
+	c.op_to_vfn = htonl(V_FW_CMD_OP(FW_EQ_CTRL_CMD) | F_FW_CMD_REQUEST |
+			    F_FW_CMD_WRITE | F_FW_CMD_EXEC |
+			    V_FW_EQ_CTRL_CMD_PFN(adap->pf) |
+			    V_FW_EQ_CTRL_CMD_VFN(0));
+	c.alloc_to_len16 = htonl(F_FW_EQ_CTRL_CMD_ALLOC |
+				 F_FW_EQ_CTRL_CMD_EQSTART | (sizeof(c) / 16));
+	c.cmpliqid_eqid = htonl(V_FW_EQ_CTRL_CMD_CMPLIQID(0));
+	c.physeqid_pkd = htonl(0);
+	c.fetchszm_to_iqid =
+		htonl(V_FW_EQ_CTRL_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) |
+		      V_FW_EQ_CTRL_CMD_PCIECHN(pi->tx_chan) |
+		      F_FW_EQ_CTRL_CMD_FETCHRO | V_FW_EQ_CTRL_CMD_IQID(iqid));
+	c.dcaen_to_eqsize =
+		htonl(V_FW_EQ_CTRL_CMD_FBMIN(X_FETCHBURSTMIN_64B) |
+		      V_FW_EQ_CTRL_CMD_FBMAX(X_FETCHBURSTMAX_512B) |
+		      V_FW_EQ_CTRL_CMD_EQSIZE(nentries));
+	c.eqaddr = cpu_to_be64(txq->q.phys_addr);
+
+	ret = t4_wr_mbox(adap, adap->mbox, &c, sizeof(c), &c);
+	if (ret) {
+		txq->q.desc = NULL;
+		return ret;
+	}
+
+	init_txq(adap, &txq->q, G_FW_EQ_CTRL_CMD_EQID(ntohl(c.cmpliqid_eqid)));
+	txq->adapter = adap;
+	txq->full = 0;
+	return 0;
+}
+
 static void free_txq(struct sge_txq *q)
 {
 	q->cntxt_id = 0;
@@ -2043,7 +2228,7 @@ void t4_sge_tx_monitor_stop(struct adapter *adap)
  */
 void t4_free_sge_resources(struct adapter *adap)
 {
-	int i;
+	unsigned int i;
 	struct sge_eth_rxq *rxq = &adap->sge.ethrxq[0];
 	struct sge_eth_txq *txq = &adap->sge.ethtxq[0];
 
@@ -2060,6 +2245,18 @@ void t4_free_sge_resources(struct adapter *adap)
 		}
 	}
 
+	/* clean up control Tx queues */
+	for (i = 0; i < ARRAY_SIZE(adap->sge.ctrlq); i++) {
+		struct sge_ctrl_txq *cq = &adap->sge.ctrlq[i];
+
+		if (cq->q.desc) {
+			reclaim_completed_tx_imm(&cq->q);
+			t4_ctrl_eq_free(adap, adap->mbox, adap->pf, 0,
+					cq->q.cntxt_id);
+			free_txq(&cq->q);
+		}
+	}
+
 	if (adap->sge.fw_evtq.desc)
 		free_rspq_fl(adap, &adap->sge.fw_evtq, NULL);
 }
-- 
2.5.3
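
For readers following the control-queue code in this patch, here is a
minimal standalone sketch of the two ring idioms it relies on: copying
data that may wrap past the end of the ring (as inline_tx_mbuf() does)
and counting completed descriptors across a wrapped consumer index (as
reclaim_completed_tx_imm() does).  The demo_* names and the byte-offset
interface are assumptions made for illustration only; this is not the
driver code and it does no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a descriptor ring; not struct sge_txq. */
struct demo_ring {
	uint8_t *base;	/* first byte of the ring */
	size_t len;	/* ring length in bytes */
};

/*
 * Copy len bytes into the ring starting at byte offset pos, wrapping to
 * the start of the ring when the end is reached.  Returns the offset
 * just past the copied data.  Requires pos < r->len and len <= r->len.
 */
static size_t demo_ring_copy(struct demo_ring *r, size_t pos,
			     const void *src, size_t len)
{
	size_t left = r->len - pos;	/* bytes before the ring end */

	assert(pos < r->len && len <= r->len);
	if (len <= left) {
		/* common case: the data fits without wrapping */
		memcpy(r->base + pos, src, len);
		return (pos + len) % r->len;
	}
	/* split the copy: tail of the ring first, then the start */
	memcpy(r->base + pos, src, left);
	memcpy(r->base, (const uint8_t *)src + left, len - left);
	return len - left;
}

/* Descriptors completed by hardware since the last software reclaim. */
static unsigned int demo_reclaimable(unsigned int hw_cidx,
				     unsigned int sw_cidx,
				     unsigned int size)
{
	int n = (int)hw_cidx - (int)sw_cidx;

	if (n < 0)
		n += (int)size;	/* hardware index has wrapped */
	return (unsigned int)n;
}
```

Unlike the driver, the sketch normalizes an offset that lands exactly on
the ring end back to zero; the driver leaves the pointer at the status
page and relies on txq_advance() for the index.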

* [PATCH 05/10] cxgbe: add compressed local IP table for matching IPv6 addresses
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (3 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 04/10] cxgbe: add control txq for communicating filtering info Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 06/10] cxgbe: add layer 2 table for switch action filter Rahul Lakkireddy
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a Compressed Local IP (CLIP) table that holds the IPv6 addresses
to be matched against packets.
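
As a rough illustration of the bookkeeping behind such a table, the
sketch below shows a reference-counted find-or-allocate lookup: an
address already present gets its count bumped, otherwise the first free
slot is claimed.  It is a simplified, single-threaded model under
assumed names (demo_clip_entry, demo_clip_get, demo_clip_put and
DEMO_CLIP_SIZE are hypothetical); the patch itself uses a table rwlock,
per-entry spinlocks, rte_atomic32_t counters and a firmware mailbox
command, none of which are modelled here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CLIP_SIZE 4	/* hypothetical table size */

struct demo_clip_entry {
	uint32_t addr[4];	/* IPv6 address as four 32-bit words */
	unsigned int refcnt;	/* 0 means the slot is free */
};

/*
 * Return the entry holding lip, taking a reference on it.  If the
 * address is not present, claim the first free slot.  Returns NULL
 * when the address is absent and the table is full.
 */
static struct demo_clip_entry *demo_clip_get(struct demo_clip_entry *tbl,
					     const uint32_t *lip)
{
	struct demo_clip_entry *first_free = NULL;
	int i;

	for (i = 0; i < DEMO_CLIP_SIZE; i++) {
		struct demo_clip_entry *e = &tbl[i];

		if (e->refcnt == 0) {
			if (!first_free)
				first_free = e;
		} else if (memcmp(e->addr, lip, sizeof(e->addr)) == 0) {
			e->refcnt++;	/* share the existing entry */
			return e;
		}
	}

	if (!first_free)
		return NULL;

	memcpy(first_free->addr, lip, sizeof(first_free->addr));
	first_free->refcnt = 1;
	return first_free;
}

/* Drop one reference; the slot becomes reusable at zero. */
static void demo_clip_put(struct demo_clip_entry *e)
{
	assert(e->refcnt > 0);
	e->refcnt--;
}
```

Sharing one entry between filters that match the same address keeps the
hardware table small, which is why the count rather than the filter
owns the slot.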

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 drivers/net/cxgbe/Makefile              |   1 +
 drivers/net/cxgbe/base/adapter.h        |  52 ++++++++
 drivers/net/cxgbe/base/t4fw_interface.h |  23 ++++
 drivers/net/cxgbe/clip_tbl.c            | 220 ++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/clip_tbl.h            |  59 +++++++++
 drivers/net/cxgbe/cxgbe_filter.h        |   1 +
 drivers/net/cxgbe/cxgbe_main.c          |  11 ++
 7 files changed, 367 insertions(+)
 create mode 100644 drivers/net/cxgbe/clip_tbl.c
 create mode 100644 drivers/net/cxgbe/clip_tbl.h

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 895c767..71b654a 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -78,6 +78,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_main.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += sge.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index ff8000f..b0234e9 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -37,6 +37,8 @@
 #define __T4_ADAPTER_H__
 
 #include <rte_mbuf.h>
+#include <rte_rwlock.h>
+#include <rte_ethdev.h>
 
 #include "cxgbe_compat.h"
 #include "t4_regs_values.h"
@@ -334,6 +336,7 @@ struct adapter {
 
 	unsigned int clipt_start; /* CLIP table start */
 	unsigned int clipt_end;   /* CLIP table end */
+	struct clip_tbl *clipt;   /* CLIP table */
 	unsigned int l2t_start;   /* Layer 2 table start */
 	unsigned int l2t_end;     /* Layer 2 table end */
 	struct tid_info tids;     /* Info used to access TID related tables */
@@ -543,6 +546,44 @@ static inline void t4_os_atomic_list_del(struct mbox_entry *entry,
 }
 
 /**
+ * t4_os_rwlock_init - initialize rwlock
+ * @lock: the rwlock
+ */
+static inline void t4_os_rwlock_init(rte_rwlock_t *lock)
+{
+	rte_rwlock_init(lock);
+}
+
+/**
+ * t4_os_write_lock - get a write lock
+ * @lock: the rwlock
+ */
+static inline void t4_os_write_lock(rte_rwlock_t *lock)
+{
+	rte_rwlock_write_lock(lock);
+}
+
+/**
+ * t4_os_write_unlock - unlock a write lock
+ * @lock: the rwlock
+ */
+static inline void t4_os_write_unlock(rte_rwlock_t *lock)
+{
+	rte_rwlock_write_unlock(lock);
+}
+
+/**
+ * ethdev2pinfo - return the port_info structure associated with a rte_eth_dev
+ * @dev: the rte_eth_dev
+ *
+ * Return the struct port_info associated with a rte_eth_dev
+ */
+static inline struct port_info *ethdev2pinfo(const struct rte_eth_dev *dev)
+{
+	return (struct port_info *)dev->data->dev_private;
+}
+
+/**
  * adap2pinfo - return the port_info of a port
  * @adap: the adapter
  * @idx: the port index
@@ -554,6 +595,17 @@ static inline struct port_info *adap2pinfo(struct adapter *adap, int idx)
 	return &adap->port[idx];
 }
 
+/**
+ * ethdev2adap - return the adapter structure associated with a rte_eth_dev
+ * @dev: the rte_eth_dev
+ *
+ * Return the struct adapter associated with a rte_eth_dev
+ */
+static inline struct adapter *ethdev2adap(const struct rte_eth_dev *dev)
+{
+	return ethdev2pinfo(dev)->adapter;
+}
+
 void *t4_alloc_mem(size_t size);
 void t4_free_mem(void *addr);
 #define t4_os_alloc(_size)     t4_alloc_mem((_size))
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index ddbff89..cbf9f32 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -186,6 +186,7 @@ enum fw_cmd_opcodes {
 	FW_PORT_CMD                    = 0x1b,
 	FW_RSS_IND_TBL_CMD             = 0x20,
 	FW_RSS_VI_CONFIG_CMD           = 0x23,
+	FW_CLIP_CMD                    = 0x28,
 	FW_DEBUG_CMD                   = 0x81,
 };
 
@@ -1666,6 +1667,28 @@ struct fw_rss_vi_config_cmd {
 	(((x) >> S_FW_RSS_VI_CONFIG_CMD_UDPEN) & M_FW_RSS_VI_CONFIG_CMD_UDPEN)
 #define F_FW_RSS_VI_CONFIG_CMD_UDPEN	V_FW_RSS_VI_CONFIG_CMD_UDPEN(1U)
 
+struct fw_clip_cmd {
+	__be32 op_to_write;
+	__be32 alloc_to_len16;
+	__be64 ip_hi;
+	__be64 ip_lo;
+	__be32 r4[2];
+};
+
+#define S_FW_CLIP_CMD_ALLOC		31
+#define M_FW_CLIP_CMD_ALLOC		0x1
+#define V_FW_CLIP_CMD_ALLOC(x)		((x) << S_FW_CLIP_CMD_ALLOC)
+#define G_FW_CLIP_CMD_ALLOC(x)		\
+	(((x) >> S_FW_CLIP_CMD_ALLOC) & M_FW_CLIP_CMD_ALLOC)
+#define F_FW_CLIP_CMD_ALLOC		V_FW_CLIP_CMD_ALLOC(1U)
+
+#define S_FW_CLIP_CMD_FREE		30
+#define M_FW_CLIP_CMD_FREE		0x1
+#define V_FW_CLIP_CMD_FREE(x)		((x) << S_FW_CLIP_CMD_FREE)
+#define G_FW_CLIP_CMD_FREE(x)		\
+	(((x) >> S_FW_CLIP_CMD_FREE) & M_FW_CLIP_CMD_FREE)
+#define F_FW_CLIP_CMD_FREE		V_FW_CLIP_CMD_FREE(1U)
+
 /******************************************************************************
  *   D E B U G   C O M M A N D s
  ******************************************************/
diff --git a/drivers/net/cxgbe/clip_tbl.c b/drivers/net/cxgbe/clip_tbl.c
new file mode 100644
index 0000000..ddcafb5
--- /dev/null
+++ b/drivers/net/cxgbe/clip_tbl.c
@@ -0,0 +1,220 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "common.h"
+#include "clip_tbl.h"
+
+/**
+ * Allocate a CLIP entry in HW for the associated IPv4/IPv6 address
+ */
+static int clip6_get_mbox(const struct rte_eth_dev *dev, const u32 *lip)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct fw_clip_cmd c;
+	u64 hi = ((u64)lip[1]) << 32 | lip[0];
+	u64 lo = ((u64)lip[3]) << 32 | lip[2];
+
+	memset(&c, 0, sizeof(c));
+	c.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_CLIP_CMD) |
+				    F_FW_CMD_REQUEST | F_FW_CMD_WRITE);
+	c.alloc_to_len16 = cpu_to_be32(F_FW_CLIP_CMD_ALLOC | FW_LEN16(c));
+	c.ip_hi = hi;
+	c.ip_lo = lo;
+	return t4_wr_mbox_meat(adap, adap->mbox, &c, sizeof(c), &c, false);
+}
+
+/**
+ * Delete the CLIP entry in HW that holds the associated IPv4/IPv6 address
+ */
+static int clip6_release_mbox(const struct rte_eth_dev *dev, const u32 *lip)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct fw_clip_cmd c;
+	u64 hi = ((u64)lip[1]) << 32 | lip[0];
+	u64 lo = ((u64)lip[3]) << 32 | lip[2];
+
+	memset(&c, 0, sizeof(c));
+	c.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_CLIP_CMD) |
+				    F_FW_CMD_REQUEST | F_FW_CMD_READ);
+	c.alloc_to_len16 = cpu_to_be32(F_FW_CLIP_CMD_FREE | FW_LEN16(c));
+	c.ip_hi = hi;
+	c.ip_lo = lo;
+	return t4_wr_mbox_meat(adap, adap->mbox, &c, sizeof(c), &c, false);
+}
+
+/**
+ * cxgbe_clip_release - Release associated CLIP entry
+ * @ce: clip entry to release
+ *
+ * Drops a reference and frees the CLIP table entry when the count hits zero
+ */
+void cxgbe_clip_release(struct rte_eth_dev *dev, struct clip_entry *ce)
+{
+	int ret;
+
+	t4_os_lock(&ce->lock);
+	if (rte_atomic32_dec_and_test(&ce->refcnt)) {
+		ret = clip6_release_mbox(dev, ce->addr);
+		if (ret)
+			dev_debug(adap, "CLIP FW DEL CMD failed: %d", ret);
+	}
+	t4_os_unlock(&ce->lock);
+}
+
+/**
+ * find_or_alloc_clipe - Find/Allocate a free CLIP entry
+ * @c: CLIP table
+ * @lip: IPv4/IPv6 address to compare/add
+ * Returns pointer to the IPv4/IPv6 entry found/created
+ *
+ * Finds/Allocates a CLIP entry to be used for a filter rule.
+ */
+static struct clip_entry *find_or_alloc_clipe(struct clip_tbl *c,
+					      const u32 *lip)
+{
+	struct clip_entry *end, *e;
+	struct clip_entry *first_free = NULL;
+	unsigned int clipt_size = c->clipt_size;
+
+	for (e = &c->cl_list[0], end = &c->cl_list[clipt_size]; e != end; ++e) {
+		if (rte_atomic32_read(&e->refcnt) == 0) {
+			if (!first_free)
+				first_free = e;
+		} else {
+			if (memcmp(lip, e->addr, sizeof(e->addr)) == 0)
+				goto exists;
+		}
+	}
+
+	if (first_free) {
+		e = first_free;
+		goto exists;
+	}
+
+	return NULL;
+
+exists:
+	return e;
+}
+
+static struct clip_entry *t4_clip_alloc(struct rte_eth_dev *dev,
+					u32 *lip, u8 v6)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct clip_tbl *ctbl = adap->clipt;
+	struct clip_entry *ce;
+	int ret = 0;
+
+	t4_os_write_lock(&ctbl->lock);
+	ce = find_or_alloc_clipe(ctbl, lip);
+	if (ce) {
+		t4_os_lock(&ce->lock);
+		if (!rte_atomic32_read(&ce->refcnt)) {
+			rte_memcpy(ce->addr, lip, sizeof(ce->addr));
+			if (v6) {
+				ce->type = FILTER_TYPE_IPV6;
+				rte_atomic32_set(&ce->refcnt, 1);
+				ret = clip6_get_mbox(dev, lip);
+				if (ret) {
+					dev_debug(adap,
+						  "CLIP FW ADD CMD failed: %d",
+						  ret);
+					rte_atomic32_set(&ce->refcnt, 0);
+				}
+			} else {
+				ce->type = FILTER_TYPE_IPV4;
+			}
+		} else {
+			rte_atomic32_inc(&ce->refcnt);
+		}
+		t4_os_unlock(&ce->lock);
+	}
+	t4_os_write_unlock(&ctbl->lock);
+
+	return ret ? NULL : ce;
+}
+
+/**
+ * cxgbe_clip_alloc - Allocate an IPv6 CLIP entry
+ * @dev: rte_eth_dev pointer
+ * @lip: IPv6 address to add
+ * Returns a pointer to the CLIP entry, or NULL on failure
+ *
+ * Allocates an IPv6 CLIP entry to be used for a filter rule.
+ */
+struct clip_entry *cxgbe_clip_alloc(struct rte_eth_dev *dev, u32 *lip)
+{
+	return t4_clip_alloc(dev, lip, FILTER_TYPE_IPV6);
+}
+
+/**
+ * Initialize CLIP Table
+ */
+struct clip_tbl *t4_init_clip_tbl(unsigned int clipt_start,
+				  unsigned int clipt_end)
+{
+	unsigned int clipt_size;
+	struct clip_tbl *ctbl;
+	unsigned int i;
+
+	if (clipt_start >= clipt_end)
+		return NULL;
+
+	clipt_size = clipt_end - clipt_start + 1;
+
+	ctbl = t4_os_alloc(sizeof(*ctbl) +
+			   clipt_size * sizeof(struct clip_entry));
+	if (!ctbl)
+		return NULL;
+
+	ctbl->clipt_start = clipt_start;
+	ctbl->clipt_size = clipt_size;
+
+	t4_os_rwlock_init(&ctbl->lock);
+
+	for (i = 0; i < ctbl->clipt_size; i++) {
+		t4_os_lock_init(&ctbl->cl_list[i].lock);
+		rte_atomic32_set(&ctbl->cl_list[i].refcnt, 0);
+	}
+
+	return ctbl;
+}
+
+/**
+ * Cleanup CLIP Table
+ */
+void t4_cleanup_clip_tbl(struct adapter *adap)
+{
+	if (adap->clipt)
+		t4_os_free(adap->clipt);
+}
diff --git a/drivers/net/cxgbe/clip_tbl.h b/drivers/net/cxgbe/clip_tbl.h
new file mode 100644
index 0000000..04f3305
--- /dev/null
+++ b/drivers/net/cxgbe/clip_tbl.h
@@ -0,0 +1,59 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_CLIP_H_
+#define _CXGBE_CLIP_H_
+
+/*
+ * State for the corresponding entry of the HW CLIP table.
+ */
+struct clip_entry {
+	enum filter_type type;       /* entry type */
+	u32 addr[4];                 /* IPV4 or IPV6 address */
+	rte_spinlock_t lock;         /* entry lock */
+	rte_atomic32_t refcnt;       /* entry reference count */
+};
+
+struct clip_tbl {
+	unsigned int clipt_start;     /* start index of CLIP table */
+	unsigned int clipt_size;      /* size of CLIP table */
+	rte_rwlock_t lock;            /* table rw lock */
+	struct clip_entry cl_list[0]; /* MUST BE LAST */
+};
+
+struct clip_tbl *t4_init_clip_tbl(unsigned int clipt_start,
+				  unsigned int clipt_end);
+void t4_cleanup_clip_tbl(struct adapter *adap);
+struct clip_entry *cxgbe_clip_alloc(struct rte_eth_dev *dev, u32 *lip);
+void cxgbe_clip_release(struct rte_eth_dev *dev, struct clip_entry *ce);
+#endif /* _CXGBE_CLIP_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index a746d13..bb4b367 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -218,6 +218,7 @@ struct filter_entry {
 	u32 locked:1;               /* filter is administratively locked */
 	u32 pending:1;              /* filter action is pending FW reply */
 	struct filter_ctx *ctx;     /* caller's completion hook */
+	struct clip_entry *clipt;   /* CLIP Table entry for IPv6 */
 	struct rte_eth_dev *dev;    /* Port's rte eth device */
 
 	/* This will store the actual tid */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 7a1cc13..5960d9a 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -66,6 +66,7 @@
 #include "t4_regs.h"
 #include "t4_msg.h"
 #include "cxgbe.h"
+#include "clip_tbl.h"
 
 /**
  * Allocate a chunk of memory. The allocated memory is cleared.
@@ -1222,6 +1223,7 @@ void cxgbe_close(struct adapter *adapter)
 
 	if (adapter->flags & FULL_INIT_DONE) {
 		tid_free(&adapter->tids);
+		t4_cleanup_clip_tbl(adapter);
 		t4_intr_disable(adapter);
 		t4_sge_tx_monitor_stop(adapter);
 		t4_free_sge_resources(adapter);
@@ -1364,6 +1366,15 @@ allocate_mac:
 
 	print_port_info(adapter);
 
+	adapter->clipt = t4_init_clip_tbl(adapter->clipt_start,
+					  adapter->clipt_end);
+	if (!adapter->clipt) {
+		/* We tolerate a lack of clip_table, giving up some
+		 * functionality
+		 */
+		dev_warn(adapter, "could not allocate CLIP. Continuing\n");
+	}
+
 	if (tid_init(&adapter->tids) < 0) {
 		/* Disable filtering support */
 		dev_warn(adapter, "could not allocate TID table, "
-- 
2.5.3

* [PATCH 06/10] cxgbe: add layer 2 table for switch action filter
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (4 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 05/10] cxgbe: add compressed local IP table for matching IPv6 addresses Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 07/10] cxgbe: add source mac " Rahul Lakkireddy
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a Layer 2 Table (L2T) that holds the destination MAC addresses and
VLAN IDs to be rewritten on packets that match a filter whose action
is set to 'switch'.
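
The t4_msg.h hunk in this patch adds MK_OPCODE_TID, G_TID and the
TID_TID/TID_QID field macros used when building CPL messages such as
the L2T write request.  The sketch below mirrors their shifts and masks
to show how an opcode, a table index and a reply queue id share one
32-bit word.  The demo_* helpers are hypothetical, values are kept in
host byte order (the driver converts with cpu_to_be32/be32_to_cpu), and
combining an L2T index with a queue id in this way is an assumption
about usage, not something shown in this part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror S_CPL_OPCODE, G_TID, S_TID_QID and M_TID_TID from the patch. */
#define DEMO_S_CPL_OPCODE 24
#define DEMO_M_TID        0xFFFFFFu
#define DEMO_S_TID_QID    14
#define DEMO_M_TID_TID    0x3fffu

/* Build a 24-bit TID field: index in bits 0-13, queue id from bit 14. */
static uint32_t demo_mk_tid(uint32_t idx, uint32_t qid)
{
	assert(idx <= DEMO_M_TID_TID);
	return (qid << DEMO_S_TID_QID) | idx;
}

/* Combine an 8-bit opcode with a 24-bit TID, as MK_OPCODE_TID does. */
static uint32_t demo_mk_opcode_tid(uint8_t opcode, uint32_t tid)
{
	return ((uint32_t)opcode << DEMO_S_CPL_OPCODE) | (tid & DEMO_M_TID);
}

/* Extract the 24-bit TID, as G_TID does. */
static uint32_t demo_get_tid(uint32_t ot)
{
	return ot & DEMO_M_TID;
}

/* Extract the opcode from the top byte. */
static uint8_t demo_get_opcode(uint32_t ot)
{
	return (uint8_t)(ot >> DEMO_S_CPL_OPCODE);
}
```

Note that the sketch masks the TID to 24 bits, whereas MK_OPCODE_TID in
the patch assumes the caller already passes a 24-bit value.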

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 drivers/net/cxgbe/Makefile              |   1 +
 drivers/net/cxgbe/base/adapter.h        |   1 +
 drivers/net/cxgbe/base/t4_msg.h         |  58 +++++++
 drivers/net/cxgbe/base/t4fw_interface.h |  11 ++
 drivers/net/cxgbe/cxgbe_filter.h        |   1 +
 drivers/net/cxgbe/cxgbe_main.c          |  12 ++
 drivers/net/cxgbe/cxgbe_ofld.h          |   9 ++
 drivers/net/cxgbe/l2t.c                 | 261 ++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/l2t.h                 |  87 +++++++++++
 9 files changed, 441 insertions(+)
 create mode 100644 drivers/net/cxgbe/l2t.c
 create mode 100644 drivers/net/cxgbe/l2t.h

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 71b654a..e98e93c 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -79,6 +79,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_main.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += sge.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index b0234e9..c29bb98 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -339,6 +339,7 @@ struct adapter {
 	struct clip_tbl *clipt;   /* CLIP table */
 	unsigned int l2t_start;   /* Layer 2 table start */
 	unsigned int l2t_end;     /* Layer 2 table end */
+	struct l2t_data *l2t;     /* Layer 2 table */
 	struct tid_info tids;     /* Info used to access TID related tables */
 };
 
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index 4b04cd0..d51edd7 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -35,6 +35,8 @@
 #define T4_MSG_H
 
 enum {
+	CPL_L2T_WRITE_REQ     = 0x12,
+	CPL_L2T_WRITE_RPL     = 0x23,
 	CPL_SGE_EGR_UPDATE    = 0xA5,
 	CPL_FW4_MSG           = 0xC0,
 	CPL_FW6_MSG           = 0xE0,
@@ -42,6 +44,11 @@ enum {
 	CPL_TX_PKT_XT         = 0xEE,
 };
 
+enum CPL_error {
+	CPL_ERR_NONE               = 0,
+	CPL_ERR_TCAM_FULL          = 3,
+};
+
 enum {                     /* TX_PKT_XT checksum types */
 	TX_CSUM_TCPIP  = 8,
 	TX_CSUM_UDPIP  = 9,
@@ -53,6 +60,28 @@ union opcode_tid {
 	__u8 opcode;
 };
 
+#define S_CPL_OPCODE    24
+#define V_CPL_OPCODE(x) ((x) << S_CPL_OPCODE)
+
+#define G_TID(x)    ((x) & 0xFFFFFF)
+
+/* tid is assumed to be 24 bits */
+#define MK_OPCODE_TID(opcode, tid) (V_CPL_OPCODE(opcode) | (tid))
+
+#define OPCODE_TID(cmd) ((cmd)->ot.opcode_tid)
+
+/* extract the TID from a CPL command */
+#define GET_TID(cmd) (G_TID(be32_to_cpu(OPCODE_TID(cmd))))
+
+/* partitioning of TID fields that also carry a queue id */
+#define S_TID_TID    0
+#define M_TID_TID    0x3fff
+#define V_TID_TID(x) ((x) << S_TID_TID)
+#define G_TID_TID(x) (((x) >> S_TID_TID) & M_TID_TID)
+
+#define S_TID_QID    14
+#define V_TID_QID(x) ((x) << S_TID_QID)
+
 struct rss_header {
 	__u8 opcode;
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
@@ -245,6 +274,35 @@ struct cpl_rx_pkt {
 	__be16 err_vec;
 };
 
+struct cpl_l2t_write_req {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 params;
+	__be16 l2t_idx;
+	__be16 vlan;
+	__u8   dst_mac[6];
+};
+
+/* cpl_l2t_write_req.params fields */
+#define S_L2T_W_PORT    8
+#define V_L2T_W_PORT(x) ((x) << S_L2T_W_PORT)
+
+#define S_L2T_W_LPBK    10
+#define V_L2T_W_LPBK(x) ((x) << S_L2T_W_LPBK)
+
+#define S_L2T_W_ARPMISS         11
+#define V_L2T_W_ARPMISS(x)      ((x) << S_L2T_W_ARPMISS)
+
+#define S_L2T_W_NOREPLY    15
+#define V_L2T_W_NOREPLY(x) ((x) << S_L2T_W_NOREPLY)
+
+struct cpl_l2t_write_rpl {
+	RSS_HDR
+	union opcode_tid ot;
+	__u8 status;
+	__u8 rsvd[3];
+};
+
 /* rx_pkt.l2info fields */
 #define S_RXF_UDP    22
 #define V_RXF_UDP(x) ((x) << S_RXF_UDP)
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index cbf9f32..8a8652a 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -82,6 +82,7 @@ enum fw_memtype {
  ********************************/
 
 enum fw_wr_opcodes {
+	FW_TP_WR		= 0x05,
 	FW_ETH_TX_PKT_WR	= 0x08,
 	FW_ETH_TX_PKTS_WR	= 0x09,
 };
@@ -101,6 +102,11 @@ struct fw_wr_hdr {
 #define V_FW_WR_OP(x)		((x) << S_FW_WR_OP)
 #define G_FW_WR_OP(x)		(((x) >> S_FW_WR_OP) & M_FW_WR_OP)
 
+/* atomic flag (hi) - firmware encapsulates CPLs in CPL_BARRIER
+ */
+#define S_FW_WR_ATOMIC		23
+#define V_FW_WR_ATOMIC(x)	((x) << S_FW_WR_ATOMIC)
+
 /* work request immediate data length (hi)
  */
 #define S_FW_WR_IMMDLEN	0
@@ -117,6 +123,11 @@ struct fw_wr_hdr {
 #define G_FW_WR_EQUEQ(x)	(((x) >> S_FW_WR_EQUEQ) & M_FW_WR_EQUEQ)
 #define F_FW_WR_EQUEQ		V_FW_WR_EQUEQ(1U)
 
+/* flow context identifier (lo)
+ */
+#define S_FW_WR_FLOWID		8
+#define V_FW_WR_FLOWID(x)	((x) << S_FW_WR_FLOWID)
+
 /* length in units of 16-bytes (lo)
  */
 #define S_FW_WR_LEN16		0
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index bb4b367..933496a 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -219,6 +219,7 @@ struct filter_entry {
 	u32 pending:1;              /* filter action is pending FW reply */
 	struct filter_ctx *ctx;     /* caller's completion hook */
 	struct clip_entry *clipt;   /* CLIP Table entry for IPv6 */
+	struct l2t_entry *l2t;      /* Layer Two Table entry for dmac */
 	struct rte_eth_dev *dev;    /* Port's rte eth device */
 
 	/* This will store the actual tid */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 5960d9a..63c6318 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -67,6 +67,7 @@
 #include "t4_msg.h"
 #include "cxgbe.h"
 #include "clip_tbl.h"
+#include "l2t.h"
 
 /**
  * Allocate a chunk of memory. The allocated memory is cleared.
@@ -116,6 +117,10 @@ static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
 		const struct cpl_fw6_msg *msg = (const void *)rsp;
 
 		t4_handle_fw_rpl(q->adapter, msg->data);
+	} else if (opcode == CPL_L2T_WRITE_RPL) {
+		const struct cpl_l2t_write_rpl *p = (const void *)rsp;
+
+		do_l2t_write_rpl(q->adapter, p);
 	} else {
 		dev_err(adapter, "unexpected CPL %#x on FW event queue\n",
 			opcode);
@@ -1224,6 +1229,7 @@ void cxgbe_close(struct adapter *adapter)
 	if (adapter->flags & FULL_INIT_DONE) {
 		tid_free(&adapter->tids);
 		t4_cleanup_clip_tbl(adapter);
+		t4_cleanup_l2t(adapter);
 		t4_intr_disable(adapter);
 		t4_sge_tx_monitor_stop(adapter);
 		t4_free_sge_resources(adapter);
@@ -1375,6 +1381,12 @@ allocate_mac:
 		dev_warn(adapter, "could not allocate CLIP. Continuing\n");
 	}
 
+	adapter->l2t = t4_init_l2t(adapter->l2t_start, adapter->l2t_end);
+	if (!adapter->l2t) {
+		/* We tolerate a lack of L2T, giving up some functionality */
+		dev_warn(adapter, "could not allocate L2T. Continuing\n");
+	}
+
 	if (tid_init(&adapter->tids) < 0) {
 		/* Disable filtering support */
 		dev_warn(adapter, "could not allocate TID table, "
diff --git a/drivers/net/cxgbe/cxgbe_ofld.h b/drivers/net/cxgbe/cxgbe_ofld.h
index 3bcb648..19971e7 100644
--- a/drivers/net/cxgbe/cxgbe_ofld.h
+++ b/drivers/net/cxgbe/cxgbe_ofld.h
@@ -38,6 +38,15 @@
 
 #include "cxgbe_filter.h"
 
+#define INIT_TP_WR(w, tid) do { \
+	(w)->wr.wr_hi = cpu_to_be32(V_FW_WR_OP(FW_TP_WR) | \
+				V_FW_WR_IMMDLEN(sizeof(*w) - sizeof(w->wr))); \
+	(w)->wr.wr_mid = cpu_to_be32( \
+				V_FW_WR_LEN16(DIV_ROUND_UP(sizeof(*w), 16)) | \
+				V_FW_WR_FLOWID(tid)); \
+	(w)->wr.wr_lo = cpu_to_be64(0); \
+} while (0)
+
 /*
  * Max # of ATIDs.  The absolute HW max is 16K but we keep it lower.
  */
diff --git a/drivers/net/cxgbe/l2t.c b/drivers/net/cxgbe/l2t.c
new file mode 100644
index 0000000..cf36150
--- /dev/null
+++ b/drivers/net/cxgbe/l2t.c
@@ -0,0 +1,261 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "common.h"
+#include "l2t.h"
+
+/**
+ * cxgbe_l2t_release - Release associated L2T entry
+ * @e: L2T entry to release
+ *
+ * Releases ref count and frees up an L2T entry from L2T table
+ */
+void cxgbe_l2t_release(struct l2t_entry *e)
+{
+	if (rte_atomic32_read(&e->refcnt) != 0)
+		rte_atomic32_dec(&e->refcnt);
+}
+
+/**
+ * Process a CPL_L2T_WRITE_RPL. Note that the TID in the reply is really
+ * the L2T index it refers to.
+ */
+void do_l2t_write_rpl(struct adapter *adap, const struct cpl_l2t_write_rpl *rpl)
+{
+	struct l2t_data *d = adap->l2t;
+	unsigned int tid = GET_TID(rpl);
+	unsigned int l2t_idx = tid % L2T_SIZE;
+
+	if (unlikely(rpl->status != CPL_ERR_NONE)) {
+		dev_err(adap,
+			"Unexpected L2T_WRITE_RPL status %u for entry %u\n",
+			rpl->status, l2t_idx);
+		return;
+	}
+
+	if (tid & F_SYNC_WR) {
+		struct l2t_entry *e = &d->l2tab[l2t_idx - d->l2t_start];
+
+		t4_os_lock(&e->lock);
+		if (e->state != L2T_STATE_SWITCHING)
+			e->state = L2T_STATE_VALID;
+		t4_os_unlock(&e->lock);
+	}
+}
+
+/**
+ * Write an L2T entry.  Must be called with the entry locked.
+ * The write may be synchronous or asynchronous.
+ */
+static int write_l2e(struct rte_eth_dev *dev, struct l2t_entry *e, int sync,
+		     bool loopback, bool arpmiss)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct l2t_data *d = adap->l2t;
+	struct rte_mbuf *mbuf;
+	struct cpl_l2t_write_req *req;
+	struct sge_ctrl_txq *ctrlq;
+	unsigned int l2t_idx = e->idx + d->l2t_start;
+	unsigned int port_id = ethdev2pinfo(dev)->port_id;
+	int i;
+
+	ctrlq = &adap->sge.ctrlq[port_id];
+	mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+	if (!mbuf)
+		return -ENOMEM;
+
+	mbuf->data_len = sizeof(*req);
+	mbuf->pkt_len = mbuf->data_len;
+
+	req = rte_pktmbuf_mtod(mbuf, struct cpl_l2t_write_req *);
+	INIT_TP_WR(req, 0);
+
+	OPCODE_TID(req) =
+		cpu_to_be32(MK_OPCODE_TID(CPL_L2T_WRITE_REQ,
+					  l2t_idx | V_SYNC_WR(sync) |
+					  V_TID_QID(adap->sge.fw_evtq.abs_id)));
+	req->params = cpu_to_be16(V_L2T_W_PORT(e->lport) |
+				  V_L2T_W_LPBK(loopback) |
+				  V_L2T_W_ARPMISS(arpmiss) |
+				  V_L2T_W_NOREPLY(!sync));
+	req->l2t_idx = cpu_to_be16(l2t_idx);
+	req->vlan = cpu_to_be16(e->vlan);
+	rte_memcpy(req->dst_mac, e->dmac, ETHER_ADDR_LEN);
+
+	if (loopback) {
+		for (i = 0; i < ETHER_ADDR_LEN; i++)
+			req->dst_mac[i] = 0;
+	}
+
+	t4_mgmt_tx(ctrlq, mbuf);
+
+	if (sync && e->state != L2T_STATE_SWITCHING)
+		e->state = L2T_STATE_SYNC_WRITE;
+
+	return 0;
+}
+
+/**
+ * find_or_alloc_l2e - Find/Allocate a free L2T entry
+ * @d: L2T table
+ * @vlan: VLAN id to compare/add
+ * @port: port id to compare/add
+ * @dmac: Destination MAC address to compare/add
+ * Returns pointer to the L2T entry found/created
+ *
+ * Finds/Allocates an L2T entry to be used by switching rule of a filter.
+ */
+static struct l2t_entry *find_or_alloc_l2e(struct l2t_data *d, u16 vlan,
+					   u8 port, u8 *dmac)
+{
+	struct l2t_entry *end, *e;
+	struct l2t_entry *first_free = NULL;
+
+	for (e = &d->l2tab[0], end = &d->l2tab[d->l2t_size]; e != end; ++e) {
+		if (rte_atomic32_read(&e->refcnt) == 0) {
+			if (!first_free)
+				first_free = e;
+		} else {
+			if (e->state == L2T_STATE_SWITCHING) {
+				if ((!memcmp(e->dmac, dmac, ETHER_ADDR_LEN)) &&
+				    (e->vlan == vlan) && (e->lport == port))
+					goto exists;
+			}
+		}
+	}
+
+	if (first_free) {
+		e = first_free;
+		goto found;
+	}
+
+	return NULL;
+
+found:
+	e->state = L2T_STATE_UNUSED;
+
+exists:
+	return e;
+}
+
+static struct l2t_entry *t4_l2t_alloc_switching(struct rte_eth_dev *dev,
+						u16 vlan, u8 port,
+						u8 *eth_addr)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct l2t_data *d = adap->l2t;
+	struct l2t_entry *e;
+	int ret;
+
+	t4_os_write_lock(&d->lock);
+	e = find_or_alloc_l2e(d, vlan, port, eth_addr);
+	if (e) {
+		t4_os_lock(&e->lock);
+		if (!rte_atomic32_read(&e->refcnt)) {
+			e->state = L2T_STATE_SWITCHING;
+			e->vlan = vlan;
+			e->lport = port;
+			rte_memcpy(e->dmac, eth_addr, ETHER_ADDR_LEN);
+			rte_atomic32_set(&e->refcnt, 1);
+			ret = write_l2e(dev, e, 0, !L2T_LPBK, !L2T_ARPMISS);
+			if (ret < 0) {
+				dev_debug(adap, "Failed to write L2T entry: %d",
+					  ret);
+				e = NULL;
+			}
+		} else {
+			rte_atomic32_inc(&e->refcnt);
+		}
+		t4_os_unlock(&e->lock);
+	}
+	t4_os_write_unlock(&d->lock);
+
+	return e;
+}
+
+/**
+ * cxgbe_l2t_alloc_switching - Allocate an L2T entry for a switching rule
+ * @dev: rte_eth_dev pointer
+ * @vlan: VLAN Id
+ * @port: Associated port
+ * @dmac: Destination MAC address to add to L2T
+ * Returns pointer to the allocated L2T entry
+ *
+ * Allocates an L2T entry for use by the switching rule of a filter.
+ */
+struct l2t_entry *cxgbe_l2t_alloc_switching(struct rte_eth_dev *dev, u16 vlan,
+					    u8 port, u8 *dmac)
+{
+	return t4_l2t_alloc_switching(dev, vlan, port, dmac);
+}
+
+/**
+ * Initialize L2 Table
+ */
+struct l2t_data *t4_init_l2t(unsigned int l2t_start, unsigned int l2t_end)
+{
+	unsigned int l2t_size;
+	unsigned int i;
+	struct l2t_data *d;
+
+	if (l2t_start >= l2t_end || l2t_end >= L2T_SIZE)
+		return NULL;
+	l2t_size = l2t_end - l2t_start + 1;
+
+	d = t4_os_alloc(sizeof(*d) + l2t_size * sizeof(struct l2t_entry));
+	if (!d)
+		return NULL;
+
+	d->l2t_start = l2t_start;
+	d->l2t_size = l2t_size;
+
+	t4_os_rwlock_init(&d->lock);
+
+	for (i = 0; i < d->l2t_size; ++i) {
+		d->l2tab[i].idx = i;
+		d->l2tab[i].state = L2T_STATE_UNUSED;
+		t4_os_lock_init(&d->l2tab[i].lock);
+		rte_atomic32_set(&d->l2tab[i].refcnt, 0);
+	}
+
+	return d;
+}
+
+/**
+ * Cleanup L2 Table
+ */
+void t4_cleanup_l2t(struct adapter *adap)
+{
+	if (adap->l2t)
+		t4_os_free(adap->l2t);
+}
diff --git a/drivers/net/cxgbe/l2t.h b/drivers/net/cxgbe/l2t.h
new file mode 100644
index 0000000..a6bb9f7
--- /dev/null
+++ b/drivers/net/cxgbe/l2t.h
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_L2T_H_
+#define _CXGBE_L2T_H_
+
+#include "t4_msg.h"
+
+enum {
+	L2T_SIZE = 4096       /* # of L2T entries */
+};
+
+enum {
+	L2T_STATE_VALID,      /* entry is up to date */
+	L2T_STATE_SYNC_WRITE, /* synchronous write of entry underway */
+
+	/* when state is one of the below the entry is not hashed */
+	L2T_STATE_SWITCHING,  /* entry is being used by a switching filter */
+	L2T_STATE_UNUSED      /* entry not in use */
+};
+
+/*
+ * State for the corresponding entry of the HW L2 table.
+ */
+struct l2t_entry {
+	u16 state;                  /* entry state */
+	u16 idx;                    /* entry index within in-memory table */
+	u32 addr[4];                /* next hop IP or IPv6 address */
+	u16 vlan;                   /* VLAN TCI (id: bits 0-11, prio: bits 13-15) */
+	u8  lport;                  /* destination port */
+	u8  dmac[ETHER_ADDR_LEN];   /* destination MAC address */
+	rte_spinlock_t lock;        /* entry lock */
+	rte_atomic32_t refcnt;      /* entry reference count */
+};
+
+struct l2t_data {
+	unsigned int l2t_start;     /* start index of our piece of the L2T */
+	unsigned int l2t_size;      /* number of entries in l2tab */
+	rte_rwlock_t lock;          /* table rw lock */
+	struct l2t_entry l2tab[0];  /* MUST BE LAST */
+};
+
+#define L2T_LPBK	true
+#define L2T_ARPMISS	true
+
+/* identifies sync vs async L2T_WRITE_REQs */
+#define S_SYNC_WR    12
+#define V_SYNC_WR(x) ((x) << S_SYNC_WR)
+#define F_SYNC_WR    V_SYNC_WR(1)
+
+struct l2t_data *t4_init_l2t(unsigned int l2t_start, unsigned int l2t_end);
+void t4_cleanup_l2t(struct adapter *adap);
+struct l2t_entry *cxgbe_l2t_alloc_switching(struct rte_eth_dev *dev, u16 vlan,
+					    u8 port, u8 *dmac);
+void cxgbe_l2t_release(struct l2t_entry *e);
+void do_l2t_write_rpl(struct adapter *p, const struct cpl_l2t_write_rpl *rpl);
+#endif /* _CXGBE_L2T_H_ */
-- 
2.5.3

* [PATCH 07/10] cxgbe: add source mac table for switch action filter
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (5 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 06/10] cxgbe: add layer 2 table for switch action filter Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 08/10] cxgbe: add LE-TCAM filtering support Rahul Lakkireddy
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add a Source MAC Table (SMT) that holds the source MAC addresses to be
written into packets that match a corresponding 'switch' action
filter.
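
For illustration only, a minimal C sketch of the find-or-allocate scheme
and the row/slot mapping that smt.c applies to the 256-entry table (128
rows of two MAC addresses each). The names demo_smt, demo_smt_get,
demo_smt_put, demo_smt_row and demo_smt_slot are hypothetical and not
part of the driver; locking and the firmware write request are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SMT_SIZE 256 /* same value as SMT_SIZE in this patch */
#define DEMO_MAC_LEN  6

/* Hypothetical stand-in for struct smt_entry, without locks or state. */
struct demo_smt_entry {
	unsigned int refcnt;
	uint8_t mac[DEMO_MAC_LEN];
};

struct demo_smt {
	struct demo_smt_entry tab[DEMO_SMT_SIZE];
};

/* Hardware row that holds entry idx; two entries share each row. */
static unsigned int demo_smt_row(unsigned int idx)
{
	assert(idx < DEMO_SMT_SIZE);
	return idx >> 1;
}

/* Slot within the row: 0 selects src_mac0, 1 selects src_mac1. */
static unsigned int demo_smt_slot(unsigned int idx)
{
	return idx & 1;
}

/*
 * Return the index of an in-use entry holding the same MAC (taking one
 * more reference), else claim the first free entry.
 * Returns -1 when every entry is in use by a different MAC.
 */
static int demo_smt_get(struct demo_smt *s, const uint8_t *mac)
{
	int first_free = -1;
	int i;

	for (i = 0; i < DEMO_SMT_SIZE; i++) {
		struct demo_smt_entry *e = &s->tab[i];

		if (e->refcnt == 0) {
			if (first_free < 0)
				first_free = i;
		} else if (!memcmp(e->mac, mac, DEMO_MAC_LEN)) {
			e->refcnt++;
			return i;
		}
	}

	if (first_free < 0)
		return -1;

	memcpy(s->tab[first_free].mac, mac, DEMO_MAC_LEN);
	s->tab[first_free].refcnt = 1;
	return first_free;
}

/* Drop one reference; the entry is reusable once the count is zero. */
static void demo_smt_put(struct demo_smt *s, int idx)
{
	assert(idx >= 0 && idx < DEMO_SMT_SIZE && s->tab[idx].refcnt > 0);
	s->tab[idx].refcnt--;
}
```

In this sketch two filters that rewrite to the same source MAC share one
entry, and entries 2n and 2n+1 land in the same row n, which is why the
driver's write request carries both MAC addresses of a row.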

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 drivers/net/cxgbe/Makefile       |   1 +
 drivers/net/cxgbe/base/adapter.h |   1 +
 drivers/net/cxgbe/base/t4_msg.h  |  29 +++++
 drivers/net/cxgbe/cxgbe_filter.h |   2 +
 drivers/net/cxgbe/cxgbe_main.c   |  12 ++
 drivers/net/cxgbe/smt.c          | 275 +++++++++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/smt.h          |  76 +++++++++++
 7 files changed, 396 insertions(+)
 create mode 100644 drivers/net/cxgbe/smt.c
 create mode 100644 drivers/net/cxgbe/smt.h

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index e98e93c..f5f5828 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -80,6 +80,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += sge.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += smt.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index c29bb98..6af5c8e 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -340,6 +340,7 @@ struct adapter {
 	unsigned int l2t_start;   /* Layer 2 table start */
 	unsigned int l2t_end;     /* Layer 2 table end */
 	struct l2t_data *l2t;     /* Layer 2 table */
+	struct smt_data *smt;     /* Source MAC table */
 	struct tid_info tids;     /* Info used to access TID related tables */
 };
 
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index d51edd7..6dc255b 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -36,7 +36,9 @@
 
 enum {
 	CPL_L2T_WRITE_REQ     = 0x12,
+	CPL_SMT_WRITE_REQ     = 0x14,
 	CPL_L2T_WRITE_RPL     = 0x23,
+	CPL_SMT_WRITE_RPL     = 0x2E,
 	CPL_SGE_EGR_UPDATE    = 0xA5,
 	CPL_FW4_MSG           = 0xC0,
 	CPL_FW6_MSG           = 0xE0,
@@ -320,6 +322,33 @@ struct cpl_l2t_write_rpl {
 #define V_RXF_IP6(x) ((x) << S_RXF_IP6)
 #define F_RXF_IP6    V_RXF_IP6(1U)
 
+struct cpl_smt_write_req {
+	WR_HDR;
+	union opcode_tid ot;
+	__be32 params;
+	__be16 pfvf1;
+	__u8   src_mac1[6];
+	__be16 pfvf0;
+	__u8   src_mac0[6];
+};
+
+struct cpl_smt_write_rpl {
+	RSS_HDR
+	union opcode_tid ot;
+	__u8 status;
+	__u8 rsvd[3];
+};
+
+/* cpl_smt_{read,write}_req.params fields */
+#define S_SMTW_OVLAN_IDX    16
+#define V_SMTW_OVLAN_IDX(x) ((x) << S_SMTW_OVLAN_IDX)
+
+#define S_SMTW_IDX    20
+#define V_SMTW_IDX(x) ((x) << S_SMTW_IDX)
+
+#define S_SMTW_NORPL    31
+#define V_SMTW_NORPL(x) ((x) << S_SMTW_NORPL)
+
 /* cpl_fw*.type values */
 enum {
 	FW_TYPE_RSSCPL = 4,
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index 933496a..b03ccca 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -217,9 +217,11 @@ struct filter_entry {
 	u32 valid:1;                /* filter allocated and valid */
 	u32 locked:1;               /* filter is administratively locked */
 	u32 pending:1;              /* filter action is pending FW reply */
+	u32 smtidx:8;               /* Source MAC Table index for smac */
 	struct filter_ctx *ctx;     /* caller's completion hook */
 	struct clip_entry *clipt;   /* CLIP Table entry for IPv6 */
 	struct l2t_entry *l2t;      /* Layer Two Table entry for dmac */
+	struct smt_entry *smt;      /* Source Mac Table entry for smac */
 	struct rte_eth_dev *dev;    /* Port's rte eth device */
 
 	/* This will store the actual tid */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 63c6318..e7d017e 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -68,6 +68,7 @@
 #include "cxgbe.h"
 #include "clip_tbl.h"
 #include "l2t.h"
+#include "smt.h"
 
 /**
  * Allocate a chunk of memory. The allocated memory is cleared.
@@ -117,6 +118,10 @@ static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
 		const struct cpl_fw6_msg *msg = (const void *)rsp;
 
 		t4_handle_fw_rpl(q->adapter, msg->data);
+	} else if (opcode == CPL_SMT_WRITE_RPL) {
+		const struct cpl_smt_write_rpl *p = (const void *)rsp;
+
+		do_smt_write_rpl(q->adapter, p);
 	} else if (opcode == CPL_L2T_WRITE_RPL) {
 		const struct cpl_l2t_write_rpl *p = (const void *)rsp;
 
@@ -1230,6 +1235,7 @@ void cxgbe_close(struct adapter *adapter)
 		tid_free(&adapter->tids);
 		t4_cleanup_clip_tbl(adapter);
 		t4_cleanup_l2t(adapter);
+		t4_cleanup_smt(adapter);
 		t4_intr_disable(adapter);
 		t4_sge_tx_monitor_stop(adapter);
 		t4_free_sge_resources(adapter);
@@ -1387,6 +1393,12 @@ allocate_mac:
 		dev_warn(adapter, "could not allocate L2T. Continuing\n");
 	}
 
+	adapter->smt = t4_init_smt();
+	if (!adapter->smt) {
+		/* We tolerate a lack of SMT, giving up some functionality */
+		dev_warn(adapter, "could not allocate SMT. Continuing\n");
+	}
+
 	if (tid_init(&adapter->tids) < 0) {
 		/* Disable filtering support */
 		dev_warn(adapter, "could not allocate TID table, "
diff --git a/drivers/net/cxgbe/smt.c b/drivers/net/cxgbe/smt.c
new file mode 100644
index 0000000..639c0cf
--- /dev/null
+++ b/drivers/net/cxgbe/smt.c
@@ -0,0 +1,275 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "common.h"
+#include "smt.h"
+
+static void t4_smte_free(struct smt_entry *e)
+{
+	t4_os_lock(&e->lock);
+	if (rte_atomic32_read(&e->refcnt) == 0)  /* hasn't been recycled */
+		e->state = SMT_STATE_UNUSED;
+	t4_os_unlock(&e->lock);
+}
+
+/**
+ * cxgbe_smt_release - Release associated SMT entry
+ * @e: smt entry to release
+ *
+ * Releases ref count and frees up an smt entry from SMT table
+ */
+void cxgbe_smt_release(struct smt_entry *e)
+{
+	if (rte_atomic32_dec_and_test(&e->refcnt))
+		t4_smte_free(e);
+}
+
+/**
+ * do_smt_write_rpl - Process the SMT_WRITE_RPL message from FW
+ * @adapter: associated adapter that the port belongs to
+ * @rpl: SMT_WRITE_RPL message to parse and process
+ *
+ * Process the SMT_WRITE_RPL message received from the FW
+ */
+void do_smt_write_rpl(struct adapter *adapter,
+		      const struct cpl_smt_write_rpl *rpl)
+{
+	struct smt_data *s = adapter->smt;
+	unsigned int smtidx = G_TID_TID(GET_TID(rpl));
+
+	if (unlikely(rpl->status != CPL_ERR_NONE)) {
+		struct smt_entry *e = &s->smtab[smtidx];
+
+		dev_err(adapter,
+			"Unexpected SMT_WRITE_RPL status %u for entry %u\n",
+			rpl->status, smtidx);
+		t4_os_lock(&e->lock);
+		e->state = SMT_STATE_ERROR;
+		t4_os_unlock(&e->lock);
+		return;
+	}
+}
+
+/**
+ * write_smt_entry - Send the SMT_WRITE_REQ message to FW
+ * @dev: associated rte_eth_dev
+ * @e: SMT entry to write
+ *
+ * Send the SMT_WRITE_REQ message to the FW to add an SMT entry. Must be
+ * called with entry lock held.
+ */
+static int write_smt_entry(struct rte_eth_dev *dev, struct smt_entry *e)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct smt_data *s = adapter->smt;
+	struct sge_ctrl_txq *ctrlq;
+	struct cpl_smt_write_req *req;
+	struct rte_mbuf *mbuf;
+	unsigned int port_id = ethdev2pinfo(dev)->port_id;
+	u8 row;
+
+	ctrlq = &adapter->sge.ctrlq[port_id];
+	mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+	if (!mbuf)
+		return -ENOMEM;
+
+	mbuf->data_len = sizeof(*req);
+	mbuf->pkt_len = mbuf->data_len;
+
+	/* Source MAC Table (SMT) contains 256 SMAC entries
+	 * organized in 128 rows of 2 entries each.
+	 */
+	req = rte_pktmbuf_mtod(mbuf, struct cpl_smt_write_req *);
+	INIT_TP_WR(req, 0);
+
+	/* Each row contains an SMAC pair.
+	 * LSB selects the SMAC entry within a row
+	 */
+	row = (e->idx >> 1);
+	if (e->idx & 1) {
+		req->pfvf1 = 0x0;
+		rte_memcpy(req->src_mac1, e->src_mac, ETHER_ADDR_LEN);
+
+		/* fill pfvf0/src_mac0 with entry
+		 * at prev index from smt-tab.
+		 */
+		req->pfvf0 = 0x0;
+		rte_memcpy(req->src_mac0, &s->smtab[e->idx - 1].src_mac,
+			   ETHER_ADDR_LEN);
+	} else {
+		req->pfvf0 = 0x0;
+		rte_memcpy(req->src_mac0, e->src_mac, ETHER_ADDR_LEN);
+
+		/* fill pfvf1/src_mac1 with entry
+		 * at next index from smt-tab
+		 */
+		req->pfvf1 = 0x0;
+		rte_memcpy(req->src_mac1, s->smtab[e->idx + 1].src_mac,
+			   ETHER_ADDR_LEN);
+	}
+
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_SMT_WRITE_REQ, e->idx |
+				      V_TID_QID(adapter->sge.fw_evtq.abs_id)));
+	req->params = cpu_to_be32(V_SMTW_NORPL(0) | V_SMTW_IDX(row) |
+				  V_SMTW_OVLAN_IDX(0));
+
+	t4_mgmt_tx(ctrlq, mbuf);
+
+	return 0;
+}
+
+/**
+ * find_or_alloc_smte - Find/Allocate a free SMT entry
+ * @s: SMT table
+ * @smac: MAC address to compare/add
+ * Returns pointer to the SMT entry found/created
+ *
+ * Finds/Allocates an SMT entry to be used by switching rule of a filter.
+ */
+static struct smt_entry *find_or_alloc_smte(struct smt_data *s, u8 *smac)
+{
+	struct smt_entry *e, *end;
+	struct smt_entry *first_free = NULL;
+
+	for (e = &s->smtab[0], end = &s->smtab[s->smt_size]; e != end; ++e) {
+		if (rte_atomic32_read(&e->refcnt) == 0) {
+			if (!first_free)
+				first_free = e;
+		} else {
+			if (e->state == SMT_STATE_SWITCHING) {
+				/*
+				 * This entry is in use. Check whether it can
+				 * be re-used.
+				 */
+				if (!memcmp(e->src_mac, smac, ETHER_ADDR_LEN))
+					goto found_reuse;
+			}
+		}
+	}
+
+	if (first_free) {
+		e = first_free;
+		goto found;
+	}
+
+	return NULL;
+
+found:
+	e->state = SMT_STATE_UNUSED;
+
+found_reuse:
+	return e;
+}
+
+static struct smt_entry *t4_smt_alloc_switching(struct rte_eth_dev *dev,
+						u16 pfvf, u8 *smac)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct smt_data *s = adap->smt;
+	struct smt_entry *e;
+	int ret;
+
+	t4_os_write_lock(&s->lock);
+	e = find_or_alloc_smte(s, smac);
+	if (e) {
+		t4_os_lock(&e->lock);
+		if (!rte_atomic32_read(&e->refcnt)) {
+			e->state = SMT_STATE_SWITCHING;
+			e->pfvf = pfvf;
+			rte_memcpy(e->src_mac, smac, ETHER_ADDR_LEN);
+			rte_atomic32_set(&e->refcnt, 1);
+			ret = write_smt_entry(dev, e);
+			if (ret < 0) {
+				dev_debug(adap, "Failed to write smt entry: %d",
+					  ret);
+				e = NULL;
+			}
+		} else {
+			rte_atomic32_inc(&e->refcnt);
+		}
+		t4_os_unlock(&e->lock);
+	}
+	t4_os_write_unlock(&s->lock);
+	return e;
+}
+
+/**
+ * cxgbe_smt_alloc_switching - Allocate an SMT entry for a switching rule
+ * @dev: rte_eth_dev pointer
+ * @smac: MAC address to add to SMT
+ * Returns pointer to the SMT entry created
+ *
+ * Allocates an SMT entry to be used by the switching rule of a filter.
+ */
+struct smt_entry *cxgbe_smt_alloc_switching(struct rte_eth_dev *dev, u8 *smac)
+{
+	return t4_smt_alloc_switching(dev, 0x0, smac);
+}
+
+/**
+ * Initialize Source MAC Table
+ */
+struct smt_data *t4_init_smt(void)
+{
+	unsigned int smt_size;
+	unsigned int i;
+	struct smt_data *s;
+
+	smt_size = SMT_SIZE;
+	s = t4_os_alloc(sizeof(*s) + smt_size * sizeof(struct smt_entry));
+	if (!s)
+		return NULL;
+
+	s->smt_size = smt_size;
+	t4_os_rwlock_init(&s->lock);
+
+	for (i = 0; i < s->smt_size; ++i) {
+		s->smtab[i].idx = i;
+		s->smtab[i].state = SMT_STATE_UNUSED;
+		memset(&s->smtab[i].src_mac, 0, ETHER_ADDR_LEN);
+		t4_os_lock_init(&s->smtab[i].lock);
+		rte_atomic32_init(&s->smtab[i].refcnt);
+		rte_atomic32_set(&s->smtab[i].refcnt, 0);
+	}
+
+	return s;
+}
+
+/**
+ * Cleanup Source MAC Table
+ */
+void t4_cleanup_smt(struct adapter *adap)
+{
+	if (adap->smt)
+		t4_os_free(adap->smt);
+}
diff --git a/drivers/net/cxgbe/smt.h b/drivers/net/cxgbe/smt.h
new file mode 100644
index 0000000..e8eed31
--- /dev/null
+++ b/drivers/net/cxgbe/smt.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_SMT_H_
+#define _CXGBE_SMT_H_
+
+#include "t4_msg.h"
+
+/*
+ * SMT related handling.
+ */
+enum {
+	SMT_STATE_SWITCHING,  /* entry is being used by a switching filter */
+	SMT_STATE_UNUSED,     /* entry not in use */
+	SMT_STATE_ERROR       /* got error from FW */
+};
+
+enum {
+	SMT_SIZE = 256        /* # of SMT entries */
+};
+
+/*
+ * State for the corresponding entry of the HW Source MAC table.
+ */
+struct smt_entry {
+	u16 state;                   /* entry state */
+	u16 idx;                     /* entry index within in-memory table */
+	u16 pfvf;                    /* associated pfvf index */
+	u8 src_mac[ETHER_ADDR_LEN];  /* source MAC address */
+	rte_atomic32_t refcnt;       /* entry reference count */
+	rte_spinlock_t lock;         /* entry lock */
+};
+
+struct smt_data {
+	unsigned int smt_size;       /* size of SMT */
+	rte_rwlock_t lock;           /* table rw lock */
+	struct smt_entry smtab[0];   /* MUST BE LAST */
+};
+
+struct smt_data *t4_init_smt(void);
+void t4_cleanup_smt(struct adapter *adap);
+struct smt_entry *cxgbe_smt_alloc_switching(struct rte_eth_dev *dev, u8 *smac);
+void cxgbe_smt_release(struct smt_entry *e);
+void do_smt_write_rpl(struct adapter *adapter,
+		      const struct cpl_smt_write_rpl *rpl);
+#endif /* _CXGBE_SMT_H_ */
-- 
2.5.3

* [PATCH 08/10] cxgbe: add LE-TCAM filtering support
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (6 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 07/10] cxgbe: add source mac " Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 09/10] cxgbe: add HASH " Rahul Lakkireddy
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add support for setting LE-TCAM (maskfull) filters.  An IPv4 filter
occupies one index, whereas an IPv6 filter occupies 4 indices and must
start at an index aligned to a multiple of 4.  Filters at lower indices
take priority over filters at higher indices: once a filter is hit, its
action is taken immediately and the filters at higher indices are
ignored.
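
The index rules above can be sketched in C as follows. This is an
illustrative model only, not the driver's code: demo_tcam,
demo_filter_idx_ok, demo_filter_set and demo_first_hit are hypothetical
names, and the table size DEMO_NFTIDS is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NFTIDS 16 /* arbitrary table size chosen for the sketch */

enum demo_family { DEMO_IPV4, DEMO_IPV6 };

/* owner[i] is 0 when index i is free, else the id of the filter there. */
struct demo_tcam {
	int owner[DEMO_NFTIDS];
};

/* Number of indices a filter of the given family occupies. */
static unsigned int demo_filter_slots(enum demo_family fam)
{
	return fam == DEMO_IPV6 ? 4 : 1;
}

/*
 * An index is usable when it is aligned for the family and every index
 * the filter needs is inside the table and currently free.
 */
static bool demo_filter_idx_ok(const struct demo_tcam *t, unsigned int idx,
			       enum demo_family fam)
{
	unsigned int n = demo_filter_slots(fam);
	unsigned int i;

	if (idx >= DEMO_NFTIDS || idx % n != 0 || idx + n > DEMO_NFTIDS)
		return false;
	for (i = 0; i < n; i++)
		if (t->owner[idx + i] != 0)
			return false;
	return true;
}

/* Claim the indices for filter id (non-zero). Returns 0 or -1. */
static int demo_filter_set(struct demo_tcam *t, unsigned int idx,
			   enum demo_family fam, int id)
{
	unsigned int n = demo_filter_slots(fam);
	unsigned int i;

	assert(id != 0);
	if (!demo_filter_idx_ok(t, idx, fam))
		return -1;
	for (i = 0; i < n; i++)
		t->owner[idx + i] = id;
	return 0;
}

/*
 * matches[i] tells whether the packet matches the filter at index i.
 * The lowest matching index wins and higher indices are not consulted.
 * Returns the winning filter id, or 0 when nothing matches.
 */
static int demo_first_hit(const struct demo_tcam *t, const bool *matches)
{
	unsigned int i;

	for (i = 0; i < DEMO_NFTIDS; i++)
		if (t->owner[i] != 0 && matches[i])
			return t->owner[i];
	return 0;
}
```

Under this model a caller that wants one rule to override another places
the overriding rule at the lower index, and an IPv6 rule can only start
at indices 0, 4, 8 and so on.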

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 drivers/net/cxgbe/Makefile              |   1 +
 drivers/net/cxgbe/base/adapter.h        |  21 +
 drivers/net/cxgbe/base/common.h         |   2 +
 drivers/net/cxgbe/base/t4_hw.c          |   3 +
 drivers/net/cxgbe/base/t4_msg.h         |  39 ++
 drivers/net/cxgbe/base/t4_tcb.h         |  74 +++
 drivers/net/cxgbe/base/t4fw_interface.h | 145 ++++++
 drivers/net/cxgbe/cxgbe_filter.c        | 802 ++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/cxgbe_filter.h        |  18 +
 drivers/net/cxgbe/cxgbe_main.c          |   6 +
 drivers/net/cxgbe/cxgbe_ofld.h          |   5 +
 11 files changed, 1116 insertions(+)
 create mode 100644 drivers/net/cxgbe/base/t4_tcb.h
 create mode 100644 drivers/net/cxgbe/cxgbe_filter.c

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index f5f5828..3201aff 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -81,6 +81,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += smt.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_filter.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 6af5c8e..a866993 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -575,6 +575,27 @@ static inline void t4_os_write_unlock(rte_rwlock_t *lock)
 }
 
 /**
+ * t4_init_completion - initialize completion
+ * @c: the completion context
+ */
+static inline void t4_init_completion(struct t4_completion *c)
+{
+	c->done = 0;
+	t4_os_lock_init(&c->lock);
+}
+
+/**
+ * t4_complete - set completion as done
+ * @c: the completion context
+ */
+static inline void t4_complete(struct t4_completion *c)
+{
+	t4_os_lock(&c->lock);
+	c->done = 1;
+	t4_os_unlock(&c->lock);
+}
+
+/**
  * ethdev2pinfo - return the port_info structure associated with a rte_eth_dev
  * @dev: the rte_eth_dev
  *
diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h
index 2b39c10..21dca32 100644
--- a/drivers/net/cxgbe/base/common.h
+++ b/drivers/net/cxgbe/base/common.h
@@ -162,7 +162,9 @@ struct tp_params {
 	int vlan_shift;
 	int vnic_shift;
 	int port_shift;
+	int tos_shift;
 	int protocol_shift;
+	int ethertype_shift;
 };
 
 struct vpd_params {
diff --git a/drivers/net/cxgbe/base/t4_hw.c b/drivers/net/cxgbe/base/t4_hw.c
index de2e6b7..b35876c 100644
--- a/drivers/net/cxgbe/base/t4_hw.c
+++ b/drivers/net/cxgbe/base/t4_hw.c
@@ -2574,8 +2574,11 @@ int t4_init_tp_params(struct adapter *adap)
 	adap->params.tp.vlan_shift = t4_filter_field_shift(adap, F_VLAN);
 	adap->params.tp.vnic_shift = t4_filter_field_shift(adap, F_VNIC_ID);
 	adap->params.tp.port_shift = t4_filter_field_shift(adap, F_PORT);
+	adap->params.tp.tos_shift = t4_filter_field_shift(adap, F_TOS);
 	adap->params.tp.protocol_shift = t4_filter_field_shift(adap,
 							       F_PROTOCOL);
+	adap->params.tp.ethertype_shift = t4_filter_field_shift(adap,
+								F_ETHERTYPE);
 
 	/*
 	 * If TP_INGRESS_CONFIG.VNID == 0, then TP_VLAN_PRI_MAP.VNIC_ID
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index 6dc255b..57534f0 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -35,10 +35,12 @@
 #define T4_MSG_H
 
 enum {
+	CPL_SET_TCB_FIELD     = 0x5,
 	CPL_L2T_WRITE_REQ     = 0x12,
 	CPL_SMT_WRITE_REQ     = 0x14,
 	CPL_L2T_WRITE_RPL     = 0x23,
 	CPL_SMT_WRITE_RPL     = 0x2E,
+	CPL_SET_TCB_RPL       = 0x3A,
 	CPL_SGE_EGR_UPDATE    = 0xA5,
 	CPL_FW4_MSG           = 0xC0,
 	CPL_FW6_MSG           = 0xE0,
@@ -125,6 +127,43 @@ struct work_request_hdr {
 #define WR_HDR_SIZE 0
 #endif
 
+/* cpl_get_tcb.reply_ctrl fields */
+#define S_QUEUENO    0
+#define V_QUEUENO(x) ((x) << S_QUEUENO)
+
+#define S_REPLY_CHAN    14
+#define V_REPLY_CHAN(x) ((x) << S_REPLY_CHAN)
+
+#define S_NO_REPLY    15
+#define V_NO_REPLY(x) ((x) << S_NO_REPLY)
+
+struct cpl_set_tcb_field {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 reply_ctrl;
+	__be16 word_cookie;
+	__be64 mask;
+	__be64 val;
+};
+
+/* cpl_set_tcb_field.word_cookie fields */
+#define S_WORD    0
+#define V_WORD(x) ((x) << S_WORD)
+
+#define S_COOKIE    5
+#define M_COOKIE    0x7
+#define V_COOKIE(x) ((x) << S_COOKIE)
+#define G_COOKIE(x) (((x) >> S_COOKIE) & M_COOKIE)
+
+struct cpl_set_tcb_rpl {
+	RSS_HDR
+	union opcode_tid ot;
+	__be16 rsvd;
+	__u8   cookie;
+	__u8   status;
+	__be64 oldval;
+};
+
 struct cpl_tx_data {
 	union opcode_tid ot;
 	__be32 len;
diff --git a/drivers/net/cxgbe/base/t4_tcb.h b/drivers/net/cxgbe/base/t4_tcb.h
new file mode 100644
index 0000000..36afd56
--- /dev/null
+++ b/drivers/net/cxgbe/base/t4_tcb.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _T4_TCB_DEFS_H
+#define _T4_TCB_DEFS_H
+
+/* 31:24 */
+#define W_TCB_SMAC_SEL    0
+#define S_TCB_SMAC_SEL    24
+#define M_TCB_SMAC_SEL    0xffULL
+#define V_TCB_SMAC_SEL(x) ((x) << S_TCB_SMAC_SEL)
+
+/* 95:32 */
+#define W_TCB_T_FLAGS    1
+
+/* 105:96 */
+#define W_TCB_RSS_INFO    3
+#define S_TCB_RSS_INFO    0
+#define M_TCB_RSS_INFO    0x3ffULL
+#define V_TCB_RSS_INFO(x) ((x) << S_TCB_RSS_INFO)
+
+/* 191:160 */
+#define W_TCB_TIMESTAMP    5
+#define S_TCB_TIMESTAMP    0
+#define M_TCB_TIMESTAMP    0xffffffffULL
+#define V_TCB_TIMESTAMP(x) ((x) << S_TCB_TIMESTAMP)
+
+/* 223:192 */
+#define S_TCB_T_RTT_TS_RECENT_AGE    0
+#define M_TCB_T_RTT_TS_RECENT_AGE    0xffffffffULL
+#define V_TCB_T_RTT_TS_RECENT_AGE(x) ((x) << S_TCB_T_RTT_TS_RECENT_AGE)
+
+#define S_TF_MIGRATING    0
+#define V_TF_MIGRATING(x) ((x) << S_TF_MIGRATING)
+
+#define S_TF_NON_OFFLOAD    1
+#define V_TF_NON_OFFLOAD(x) ((x) << S_TF_NON_OFFLOAD)
+
+#define S_TF_CCTRL_ECE    60
+
+#define S_TF_CCTRL_CWR    61
+
+#define S_TF_CCTRL_RFR    62
+#endif /* _T4_TCB_DEFS_H */
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index 8a8652a..d3e4de5 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -82,6 +82,7 @@ enum fw_memtype {
  ********************************/
 
 enum fw_wr_opcodes {
+	FW_FILTER_WR		= 0x02,
 	FW_TP_WR		= 0x05,
 	FW_ETH_TX_PKT_WR	= 0x08,
 	FW_ETH_TX_PKTS_WR	= 0x09,
@@ -156,6 +157,150 @@ struct fw_eth_tx_pkts_wr {
 	__u8   type;
 };
 
+/* filter wr reply code in cookie in CPL_SET_TCB_RPL */
+enum fw_filter_wr_cookie {
+	FW_FILTER_WR_SUCCESS,
+	FW_FILTER_WR_FLT_ADDED,
+	FW_FILTER_WR_FLT_DELETED,
+	FW_FILTER_WR_SMT_TBL_FULL,
+	FW_FILTER_WR_EINVAL,
+};
+
+struct fw_filter_wr {
+	__be32 op_pkd;
+	__be32 len16_pkd;
+	__be64 r3;
+	__be32 tid_to_iq;
+	__be32 del_filter_to_l2tix;
+	__be16 ethtype;
+	__be16 ethtypem;
+	__u8   frag_to_ovlan_vldm;
+	__u8   smac_sel;
+	__be16 rx_chan_rx_rpl_iq;
+	__be32 maci_to_matchtypem;
+	__u8   ptcl;
+	__u8   ptclm;
+	__u8   ttyp;
+	__u8   ttypm;
+	__be16 ivlan;
+	__be16 ivlanm;
+	__be16 ovlan;
+	__be16 ovlanm;
+	__u8   lip[16];
+	__u8   lipm[16];
+	__u8   fip[16];
+	__u8   fipm[16];
+	__be16 lp;
+	__be16 lpm;
+	__be16 fp;
+	__be16 fpm;
+	__be16 r7;
+	__u8   sma[6];
+};
+
+#define S_FW_FILTER_WR_TID	12
+#define V_FW_FILTER_WR_TID(x)	((x) << S_FW_FILTER_WR_TID)
+
+#define S_FW_FILTER_WR_RQTYPE		11
+#define V_FW_FILTER_WR_RQTYPE(x)	((x) << S_FW_FILTER_WR_RQTYPE)
+
+#define S_FW_FILTER_WR_NOREPLY		10
+#define V_FW_FILTER_WR_NOREPLY(x)	((x) << S_FW_FILTER_WR_NOREPLY)
+
+#define S_FW_FILTER_WR_IQ	0
+#define V_FW_FILTER_WR_IQ(x)	((x) << S_FW_FILTER_WR_IQ)
+
+#define S_FW_FILTER_WR_DEL_FILTER	31
+#define V_FW_FILTER_WR_DEL_FILTER(x)	((x) << S_FW_FILTER_WR_DEL_FILTER)
+#define F_FW_FILTER_WR_DEL_FILTER	V_FW_FILTER_WR_DEL_FILTER(1U)
+
+#define S_FW_FILTER_WR_RPTTID		25
+#define V_FW_FILTER_WR_RPTTID(x)	((x) << S_FW_FILTER_WR_RPTTID)
+
+#define S_FW_FILTER_WR_DROP	24
+#define V_FW_FILTER_WR_DROP(x)	((x) << S_FW_FILTER_WR_DROP)
+
+#define S_FW_FILTER_WR_DIRSTEER		23
+#define V_FW_FILTER_WR_DIRSTEER(x)	((x) << S_FW_FILTER_WR_DIRSTEER)
+
+#define S_FW_FILTER_WR_MASKHASH		22
+#define V_FW_FILTER_WR_MASKHASH(x)	((x) << S_FW_FILTER_WR_MASKHASH)
+
+#define S_FW_FILTER_WR_DIRSTEERHASH	21
+#define V_FW_FILTER_WR_DIRSTEERHASH(x)	((x) << S_FW_FILTER_WR_DIRSTEERHASH)
+
+#define S_FW_FILTER_WR_LPBK	20
+#define V_FW_FILTER_WR_LPBK(x)	((x) << S_FW_FILTER_WR_LPBK)
+
+#define S_FW_FILTER_WR_DMAC	19
+#define V_FW_FILTER_WR_DMAC(x)	((x) << S_FW_FILTER_WR_DMAC)
+
+#define S_FW_FILTER_WR_INSVLAN		17
+#define V_FW_FILTER_WR_INSVLAN(x)	((x) << S_FW_FILTER_WR_INSVLAN)
+
+#define S_FW_FILTER_WR_RMVLAN		16
+#define V_FW_FILTER_WR_RMVLAN(x)	((x) << S_FW_FILTER_WR_RMVLAN)
+
+#define S_FW_FILTER_WR_HITCNTS		15
+#define V_FW_FILTER_WR_HITCNTS(x)	((x) << S_FW_FILTER_WR_HITCNTS)
+
+#define S_FW_FILTER_WR_TXCHAN		13
+#define V_FW_FILTER_WR_TXCHAN(x)	((x) << S_FW_FILTER_WR_TXCHAN)
+
+#define S_FW_FILTER_WR_PRIO	12
+#define V_FW_FILTER_WR_PRIO(x)	((x) << S_FW_FILTER_WR_PRIO)
+
+#define S_FW_FILTER_WR_L2TIX	0
+#define V_FW_FILTER_WR_L2TIX(x)	((x) << S_FW_FILTER_WR_L2TIX)
+
+#define S_FW_FILTER_WR_FRAG	7
+#define V_FW_FILTER_WR_FRAG(x)	((x) << S_FW_FILTER_WR_FRAG)
+
+#define S_FW_FILTER_WR_FRAGM	6
+#define V_FW_FILTER_WR_FRAGM(x)	((x) << S_FW_FILTER_WR_FRAGM)
+
+#define S_FW_FILTER_WR_IVLAN_VLD	5
+#define V_FW_FILTER_WR_IVLAN_VLD(x)	((x) << S_FW_FILTER_WR_IVLAN_VLD)
+
+#define S_FW_FILTER_WR_OVLAN_VLD	4
+#define V_FW_FILTER_WR_OVLAN_VLD(x)	((x) << S_FW_FILTER_WR_OVLAN_VLD)
+
+#define S_FW_FILTER_WR_IVLAN_VLDM	3
+#define V_FW_FILTER_WR_IVLAN_VLDM(x)	((x) << S_FW_FILTER_WR_IVLAN_VLDM)
+
+#define S_FW_FILTER_WR_OVLAN_VLDM	2
+#define V_FW_FILTER_WR_OVLAN_VLDM(x)	((x) << S_FW_FILTER_WR_OVLAN_VLDM)
+
+#define S_FW_FILTER_WR_RX_CHAN		15
+#define V_FW_FILTER_WR_RX_CHAN(x)	((x) << S_FW_FILTER_WR_RX_CHAN)
+
+#define S_FW_FILTER_WR_RX_RPL_IQ	0
+#define V_FW_FILTER_WR_RX_RPL_IQ(x)	((x) << S_FW_FILTER_WR_RX_RPL_IQ)
+
+#define S_FW_FILTER_WR_MACI	23
+#define V_FW_FILTER_WR_MACI(x)	((x) << S_FW_FILTER_WR_MACI)
+
+#define S_FW_FILTER_WR_MACIM	14
+#define V_FW_FILTER_WR_MACIM(x)	((x) << S_FW_FILTER_WR_MACIM)
+
+#define S_FW_FILTER_WR_FCOE	13
+#define V_FW_FILTER_WR_FCOE(x)	((x) << S_FW_FILTER_WR_FCOE)
+
+#define S_FW_FILTER_WR_FCOEM	12
+#define V_FW_FILTER_WR_FCOEM(x)	((x) << S_FW_FILTER_WR_FCOEM)
+
+#define S_FW_FILTER_WR_PORT	9
+#define V_FW_FILTER_WR_PORT(x)	((x) << S_FW_FILTER_WR_PORT)
+
+#define S_FW_FILTER_WR_PORTM	6
+#define V_FW_FILTER_WR_PORTM(x)	((x) << S_FW_FILTER_WR_PORTM)
+
+#define S_FW_FILTER_WR_MATCHTYPE	3
+#define V_FW_FILTER_WR_MATCHTYPE(x)	((x) << S_FW_FILTER_WR_MATCHTYPE)
+
+#define S_FW_FILTER_WR_MATCHTYPEM	0
+#define V_FW_FILTER_WR_MATCHTYPEM(x)	((x) << S_FW_FILTER_WR_MATCHTYPEM)
+
 /******************************************************************************
  *  C O M M A N D s
  *********************/
diff --git a/drivers/net/cxgbe/cxgbe_filter.c b/drivers/net/cxgbe/cxgbe_filter.c
new file mode 100644
index 0000000..d4e32b1
--- /dev/null
+++ b/drivers/net/cxgbe/cxgbe_filter.c
@@ -0,0 +1,802 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "common.h"
+#include "t4_regs.h"
+#include "t4_tcb.h"
+#include "t4fw_interface.h"
+#include "l2t.h"
+#include "smt.h"
+#include "clip_tbl.h"
+#include "cxgbe_filter.h"
+
+/**
+ * Validate if the requested filter specification can be set by checking
+ * if the requested features have been enabled
+ */
+static int validate_filter(struct adapter *adapter,
+			   struct ch_filter_specification *fs)
+{
+	u32 fconf, iconf;
+
+	/*
+	 * Check for unconfigured fields being used.
+	 */
+	fconf = adapter->params.tp.vlan_pri_map;
+	iconf = adapter->params.tp.ingress_config;
+
+#define S(_field) \
+	(fs->val._field || fs->mask._field)
+#define U(_mask, _field) \
+	(!(fconf & (_mask)) && S(_field))
+
+	if (U(F_FCOE, fcoe) || U(F_PORT, iport) || U(F_TOS, tos) ||
+	    U(F_ETHERTYPE, ethtype) || U(F_MACMATCH, macidx) ||
+	    U(F_MPSHITTYPE, matchtype) || U(F_FRAGMENTATION, frag) ||
+	    U(F_PROTOCOL, proto) ||
+	    U(F_VNIC_ID, pfvf_vld) ||
+	    U(F_VNIC_ID, ovlan_vld) ||
+	    U(F_VLAN, ivlan_vld))
+		return -EOPNOTSUPP;
+
+	/*
+	 * We need to translate any PF/VF specification into that
+	 * internal format below.
+	 */
+	if (S(pfvf_vld) && S(ovlan_vld))
+		return -EOPNOTSUPP;
+	if ((S(pfvf_vld) && !(iconf & F_VNIC)) ||
+	    (S(ovlan_vld) && (iconf & F_VNIC)))
+		return -EOPNOTSUPP;
+	if (fs->val.pf > 0x7 || fs->val.vf > 0x7f)
+		return -ERANGE;
+	fs->mask.pf &= 0x7;
+	fs->mask.vf &= 0x7f;
+
+#undef S
+#undef U
+
+	/*
+	 * If the user is requesting that the filter action loop
+	 * matching packets back out one of our ports, make sure that
+	 * the egress port is in range.
+	 */
+	if (fs->action == FILTER_SWITCH &&
+	    fs->eport >= adapter->params.nports)
+		return -ERANGE;
+
+	/*
+	 * Don't allow various trivially obvious bogus out-of-range
+	 * values ...
+	 */
+	if (fs->val.iport >= adapter->params.nports)
+		return -ERANGE;
+
+	return 0;
+}
+
+/**
+ * Get the queue to which the traffic must be steered.
+ */
+static unsigned int get_filter_steerq(struct rte_eth_dev *dev,
+				      struct ch_filter_specification *fs)
+{
+	struct port_info *pi = ethdev2pinfo(dev);
+	struct adapter *adapter = pi->adapter;
+	unsigned int iq;
+
+	/*
+	 * If the user has requested steering matching Ingress Packets
+	 * to a specific Queue Set, we need to make sure it's in range
+	 * for the port and map that into the Absolute Queue ID of the
+	 * Queue Set's Response Queue.
+	 */
+	if (!fs->dirsteer) {
+		iq = 0;
+	} else {
+		/*
+		 * If the iq id is not below the number of qsets,
+		 * then assume it is an absolute qid.
+		 */
+		if (fs->iq < pi->n_rx_qsets)
+			iq = adapter->sge.ethrxq[pi->first_qset +
+						 fs->iq].rspq.abs_id;
+		else
+			iq = fs->iq;
+	}
+
+	return iq;
+}
+
+/* Return an error number if the indicated filter isn't writable ... */
+int writable_filter(struct filter_entry *f)
+{
+	if (f->locked)
+		return -EPERM;
+	if (f->pending)
+		return -EBUSY;
+
+	return 0;
+}
+
+/**
+ * Send CPL_SET_TCB_FIELD message
+ */
+static void set_tcb_field(struct adapter *adapter, unsigned int ftid,
+			  u16 word, u64 mask, u64 val, int no_reply)
+{
+	struct rte_mbuf *mbuf;
+	struct cpl_set_tcb_field *req;
+	struct sge_ctrl_txq *ctrlq;
+
+	ctrlq = &adapter->sge.ctrlq[0];
+	mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+	BUG_ON(!mbuf);
+
+	mbuf->data_len = sizeof(*req);
+	mbuf->pkt_len = mbuf->data_len;
+
+	req = rte_pktmbuf_mtod(mbuf, struct cpl_set_tcb_field *);
+	memset(req, 0, sizeof(*req));
+	INIT_TP_WR_MIT_CPL(req, CPL_SET_TCB_FIELD, ftid);
+	req->reply_ctrl = cpu_to_be16(V_REPLY_CHAN(0) |
+				      V_QUEUENO(adapter->sge.fw_evtq.abs_id) |
+				      V_NO_REPLY(no_reply));
+	req->word_cookie = cpu_to_be16(V_WORD(word) | V_COOKIE(ftid));
+	req->mask = cpu_to_be64(mask);
+	req->val = cpu_to_be64(val);
+
+	t4_mgmt_tx(ctrlq, mbuf);
+}
+
+/**
+ * Set one of the t_flags bits in the TCB.
+ */
+static void set_tcb_tflag(struct adapter *adap, unsigned int ftid,
+			  unsigned int bit_pos, unsigned int val, int no_reply)
+{
+	set_tcb_field(adap, ftid,  W_TCB_T_FLAGS, 1ULL << bit_pos,
+		      (unsigned long long)val << bit_pos, no_reply);
+}
+
+/**
+ * Clear a filter and release any of its resources that we own.  This also
+ * clears the filter's "pending" status.
+ */
+void clear_filter(struct filter_entry *f)
+{
+	/*
+	 * If the filter has loopback rewriting rules then we'll need to free
+	 * any existing Layer Two Table (L2T) entries of the filter rule.  The
+	 * firmware will handle freeing up any Source MAC Table (SMT) entries
+	 * used for rewriting Source MAC Addresses in loopback rules.
+	 */
+	if (f->l2t)
+		cxgbe_l2t_release(f->l2t);
+
+	if (f->smt)
+		cxgbe_smt_release(f->smt);
+
+	/*
+	 * The zeroing of the filter rule below clears the filter valid,
+	 * pending, locked flags, l2t pointer, etc. so it's all we need for
+	 * this operation.
+	 */
+	memset(f, 0, sizeof(*f));
+}
+
+/**
+ * Clear all set filters
+ */
+void cxgbe_clear_all_filters(struct adapter *adapter)
+{
+	unsigned int i;
+
+	if (adapter->tids.ftid_tab) {
+		struct filter_entry *f = &adapter->tids.ftid_tab[0];
+
+		for (i = 0; i < adapter->tids.nftids; i++, f++)
+			if (f->valid || f->pending)
+				clear_filter(f);
+	}
+}
+
+/**
+ * Check if entry already filled.
+ */
+static bool is_filter_set(struct tid_info *t, int fidx, int family)
+{
+	bool result = FALSE;
+	int i, max;
+
+	/* IPv6 requires four slots and IPv4 requires only 1 slot.
+	 * Ensure there are enough slots available.
+	 */
+	max = family == FILTER_TYPE_IPV6 ? fidx + 3 : fidx;
+
+	t4_os_lock(&t->ftid_lock);
+	for (i = fidx; i <= max; i++) {
+		if (rte_bitmap_get(t->ftid_bmap, i)) {
+			result = TRUE;
+			break;
+		}
+	}
+	t4_os_unlock(&t->ftid_lock);
+	return result;
+}
+
+/**
+ * Set the corresponding entry in the bitmap. 4 slots are
+ * marked for IPv6, whereas only 1 slot is marked for IPv4.
+ */
+static int cxgbe_set_ftid(struct tid_info *t, int fidx, int family)
+{
+	t4_os_lock(&t->ftid_lock);
+	if (rte_bitmap_get(t->ftid_bmap, fidx)) {
+		t4_os_unlock(&t->ftid_lock);
+		return -EBUSY;
+	}
+
+	if (family == FILTER_TYPE_IPV4) {
+		rte_bitmap_set(t->ftid_bmap, fidx);
+	} else {
+		rte_bitmap_set(t->ftid_bmap, fidx);
+		rte_bitmap_set(t->ftid_bmap, fidx + 1);
+		rte_bitmap_set(t->ftid_bmap, fidx + 2);
+		rte_bitmap_set(t->ftid_bmap, fidx + 3);
+	}
+	t4_os_unlock(&t->ftid_lock);
+	return 0;
+}
+
+/**
+ * Clear the corresponding entry in the bitmap. 4 slots are
+ * cleared for IPv6, whereas only 1 slot is cleared for IPv4.
+ */
+static void cxgbe_clear_ftid(struct tid_info *t, int fidx, int family)
+{
+	t4_os_lock(&t->ftid_lock);
+	if (family == FILTER_TYPE_IPV4) {
+		rte_bitmap_clear(t->ftid_bmap, fidx);
+	} else {
+		rte_bitmap_clear(t->ftid_bmap, fidx);
+		rte_bitmap_clear(t->ftid_bmap, fidx + 1);
+		rte_bitmap_clear(t->ftid_bmap, fidx + 2);
+		rte_bitmap_clear(t->ftid_bmap, fidx + 3);
+	}
+	t4_os_unlock(&t->ftid_lock);
+}
+
+/**
+ * t4_mk_filtdelwr - create a delete filter WR
+ * @ftid: the filter ID
+ * @wr: the filter work request to populate
+ * @qid: ingress queue to receive the delete notification
+ *
+ * Creates a filter work request to delete the supplied filter.  If @qid is
+ * negative the delete notification is suppressed.
+ */
+static void t4_mk_filtdelwr(unsigned int ftid, struct fw_filter_wr *wr, int qid)
+{
+	memset(wr, 0, sizeof(*wr));
+	wr->op_pkd = cpu_to_be32(V_FW_WR_OP(FW_FILTER_WR));
+	wr->len16_pkd = cpu_to_be32(V_FW_WR_LEN16(sizeof(*wr) / 16));
+	wr->tid_to_iq = cpu_to_be32(V_FW_FILTER_WR_TID(ftid) |
+				    V_FW_FILTER_WR_NOREPLY(qid < 0));
+	wr->del_filter_to_l2tix = cpu_to_be32(F_FW_FILTER_WR_DEL_FILTER);
+	if (qid >= 0)
+		wr->rx_chan_rx_rpl_iq =
+				cpu_to_be16(V_FW_FILTER_WR_RX_RPL_IQ(qid));
+}
+
+/**
+ * Create FW work request to delete the filter at a specified index
+ */
+static int del_filter_wr(struct rte_eth_dev *dev, unsigned int fidx)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+	struct rte_mbuf *mbuf;
+	struct fw_filter_wr *fwr;
+	struct sge_ctrl_txq *ctrlq;
+	unsigned int port_id = ethdev2pinfo(dev)->port_id;
+
+	ctrlq = &adapter->sge.ctrlq[port_id];
+	mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+	if (!mbuf)
+		return -ENOMEM;
+
+	mbuf->data_len = sizeof(*fwr);
+	mbuf->pkt_len = mbuf->data_len;
+
+	fwr = rte_pktmbuf_mtod(mbuf, struct fw_filter_wr *);
+	t4_mk_filtdelwr(f->tid, fwr, adapter->sge.fw_evtq.abs_id);
+
+	/*
+	 * Mark the filter as "pending" and ship off the Filter Work Request.
+	 * When we get the Work Request Reply we'll clear the pending status.
+	 */
+	f->pending = 1;
+	t4_mgmt_tx(ctrlq, mbuf);
+	return 0;
+}
+
+/**
+ * Delete the filter at the specified index (if valid).  This checks for all
+ * the common problems with doing this, like the filter being locked, currently
+ * pending in another operation, etc.
+ */
+int delete_filter(struct rte_eth_dev *dev, unsigned int fidx)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct filter_entry *f;
+	int ret;
+	unsigned int max_fidx;
+
+	max_fidx = adapter->tids.nftids;
+	if (fidx >= max_fidx)
+		return -ERANGE;
+
+	f = &adapter->tids.ftid_tab[fidx];
+	ret = writable_filter(f);
+	if (ret)
+		return ret;
+	if (f->valid)
+		return del_filter_wr(dev, fidx);
+
+	return 0;
+}
+
+/**
+ * Send a Work Request to write the filter at a specified index.  We construct
+ * a Firmware Filter Work Request to have the work done and put the indicated
+ * filter into "pending" mode which will prevent any further actions against
+ * it till we get a reply from the firmware on the completion status of the
+ * request.
+ */
+int set_filter_wr(struct rte_eth_dev *dev, unsigned int fidx)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+	struct rte_mbuf *mbuf;
+	struct fw_filter_wr *fwr;
+	struct sge_ctrl_txq *ctrlq;
+	unsigned int port_id = ethdev2pinfo(dev)->port_id;
+	int ret;
+
+	ctrlq = &adapter->sge.ctrlq[port_id];
+	mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+	if (!mbuf) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	mbuf->data_len = sizeof(*fwr);
+	mbuf->pkt_len = mbuf->data_len;
+
+	fwr = rte_pktmbuf_mtod(mbuf, struct fw_filter_wr *);
+	memset(fwr, 0, sizeof(*fwr));
+
+	/*
+	 * If the new filter requires loopback Destination MAC and/or VLAN
+	 * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
+	 * the filter.
+	 */
+	if (f->fs.newdmac || f->fs.newvlan) {
+		/* allocate L2T entry for new filter */
+		f->l2t = cxgbe_l2t_alloc_switching(f->dev, f->fs.vlan,
+						   f->fs.eport, f->fs.dmac);
+		if (!f->l2t) {
+			ret = -ENOMEM;
+			goto error;
+		}
+	}
+
+	/*
+	 * If the new filter requires loopback Source MAC rewriting then
+	 * we need to allocate a SMT entry for the filter.
+	 */
+	if (f->fs.newsmac) {
+		f->smt = cxgbe_smt_alloc_switching(f->dev, f->fs.smac);
+		if (!f->smt) {
+			if (f->l2t) {
+				cxgbe_l2t_release(f->l2t);
+				f->l2t = NULL;
+			}
+			ret = -ENOMEM;
+			goto error;
+		}
+		f->smtidx = f->smt->idx;
+	}
+
+	/*
+	 * Construct the work request to set the filter.
+	 */
+	fwr->op_pkd = cpu_to_be32(V_FW_WR_OP(FW_FILTER_WR));
+	fwr->len16_pkd = cpu_to_be32(V_FW_WR_LEN16(sizeof(*fwr) / 16));
+	fwr->tid_to_iq =
+		cpu_to_be32(V_FW_FILTER_WR_TID(f->tid) |
+			    V_FW_FILTER_WR_RQTYPE(f->fs.type) |
+			    V_FW_FILTER_WR_NOREPLY(0) |
+			    V_FW_FILTER_WR_IQ(f->fs.iq));
+	fwr->del_filter_to_l2tix =
+		cpu_to_be32(V_FW_FILTER_WR_RPTTID(f->fs.rpttid) |
+			    V_FW_FILTER_WR_DROP(f->fs.action == FILTER_DROP) |
+			    V_FW_FILTER_WR_DIRSTEER(f->fs.dirsteer) |
+			    V_FW_FILTER_WR_MASKHASH(f->fs.maskhash) |
+			    V_FW_FILTER_WR_DIRSTEERHASH(f->fs.dirsteerhash) |
+			    V_FW_FILTER_WR_LPBK(f->fs.action == FILTER_SWITCH) |
+			    V_FW_FILTER_WR_DMAC(f->fs.newdmac) |
+			    V_FW_FILTER_WR_INSVLAN(
+				    f->fs.newvlan == VLAN_INSERT ||
+				    f->fs.newvlan == VLAN_REWRITE) |
+			    V_FW_FILTER_WR_RMVLAN(
+				    f->fs.newvlan == VLAN_REMOVE ||
+				    f->fs.newvlan == VLAN_REWRITE) |
+			    V_FW_FILTER_WR_HITCNTS(f->fs.hitcnts) |
+			    V_FW_FILTER_WR_TXCHAN(f->fs.eport) |
+			    V_FW_FILTER_WR_PRIO(f->fs.prio) |
+			    V_FW_FILTER_WR_L2TIX(f->l2t ? f->l2t->idx : 0));
+	fwr->ethtype = cpu_to_be16(f->fs.val.ethtype);
+	fwr->ethtypem = cpu_to_be16(f->fs.mask.ethtype);
+	fwr->frag_to_ovlan_vldm =
+		     (V_FW_FILTER_WR_FRAG(f->fs.val.frag) |
+		      V_FW_FILTER_WR_FRAGM(f->fs.mask.frag) |
+		      V_FW_FILTER_WR_IVLAN_VLD(f->fs.val.ivlan_vld) |
+		      V_FW_FILTER_WR_OVLAN_VLD(f->fs.val.ovlan_vld) |
+		      V_FW_FILTER_WR_IVLAN_VLDM(f->fs.mask.ivlan_vld) |
+		      V_FW_FILTER_WR_OVLAN_VLDM(f->fs.mask.ovlan_vld));
+	fwr->smac_sel = 0;
+	fwr->rx_chan_rx_rpl_iq =
+		cpu_to_be16(V_FW_FILTER_WR_RX_CHAN(0) |
+			    V_FW_FILTER_WR_RX_RPL_IQ(
+				    adapter->sge.fw_evtq.abs_id));
+	fwr->maci_to_matchtypem =
+		cpu_to_be32(V_FW_FILTER_WR_MACI(f->fs.val.macidx) |
+			    V_FW_FILTER_WR_MACIM(f->fs.mask.macidx) |
+			    V_FW_FILTER_WR_FCOE(f->fs.val.fcoe) |
+			    V_FW_FILTER_WR_FCOEM(f->fs.mask.fcoe) |
+			    V_FW_FILTER_WR_PORT(f->fs.val.iport) |
+			    V_FW_FILTER_WR_PORTM(f->fs.mask.iport) |
+			    V_FW_FILTER_WR_MATCHTYPE(f->fs.val.matchtype) |
+			    V_FW_FILTER_WR_MATCHTYPEM(f->fs.mask.matchtype));
+	fwr->ptcl = f->fs.val.proto;
+	fwr->ptclm = f->fs.mask.proto;
+	fwr->ttyp = f->fs.val.tos;
+	fwr->ttypm = f->fs.mask.tos;
+	fwr->ivlan = cpu_to_be16(f->fs.val.ivlan);
+	fwr->ivlanm = cpu_to_be16(f->fs.mask.ivlan);
+	fwr->ovlan = cpu_to_be16(f->fs.val.ovlan);
+	fwr->ovlanm = cpu_to_be16(f->fs.mask.ovlan);
+	rte_memcpy(fwr->lip, f->fs.val.lip, sizeof(fwr->lip));
+	rte_memcpy(fwr->lipm, f->fs.mask.lip, sizeof(fwr->lipm));
+	rte_memcpy(fwr->fip, f->fs.val.fip, sizeof(fwr->fip));
+	rte_memcpy(fwr->fipm, f->fs.mask.fip, sizeof(fwr->fipm));
+	fwr->lp = cpu_to_be16(f->fs.val.lport);
+	fwr->lpm = cpu_to_be16(f->fs.mask.lport);
+	fwr->fp = cpu_to_be16(f->fs.val.fport);
+	fwr->fpm = cpu_to_be16(f->fs.mask.fport);
+
+	/*
+	 * Mark the filter as "pending" and ship off the Filter Work Request.
+	 * When we get the Work Request Reply we'll clear the pending status.
+	 */
+	f->pending = 1;
+	t4_mgmt_tx(ctrlq, mbuf);
+	return 0;
+
+error:
+	rte_pktmbuf_free(mbuf);
+out:
+	return ret;
+}
+
+/**
+ * Check a delete filter request for validity and send it to the hardware.
+ * Return 0 on success, an error number otherwise.  We attach any provided
+ * filter operation context to the internal filter specification in order to
+ * facilitate signaling completion of the operation.
+ */
+int cxgbe_del_filter(struct rte_eth_dev *dev, unsigned int filter_id,
+		     struct ch_filter_specification *fs,
+		     struct filter_ctx *ctx)
+{
+	struct port_info *pi = (struct port_info *)(dev->data->dev_private);
+	struct adapter *adapter = pi->adapter;
+	struct filter_entry *f;
+	int ret;
+
+	if (filter_id >= adapter->tids.nftids)
+		return -ERANGE;
+
+	ret = is_filter_set(&adapter->tids, filter_id, fs->type);
+	if (!ret) {
+		dev_warn(adapter, "%s: could not find filter entry: %u\n",
+			 __func__, filter_id);
+		return -EINVAL;
+	}
+
+	f = &adapter->tids.ftid_tab[filter_id];
+	ret = writable_filter(f);
+	if (ret)
+		return ret;
+
+	if (f->valid) {
+		f->ctx = ctx;
+		cxgbe_clear_ftid(&adapter->tids,
+				 f->tid - adapter->tids.ftid_base,
+				 f->fs.type ? FILTER_TYPE_IPV6 :
+					      FILTER_TYPE_IPV4);
+		return del_filter_wr(dev, filter_id);
+	}
+
+	/*
+	 * If the caller has passed in a Completion Context then we need to
+	 * mark it as a successful completion so they don't stall waiting
+	 * for it.
+	 */
+	if (ctx) {
+		ctx->result = 0;
+		t4_complete(&ctx->completion);
+	}
+
+	return 0;
+}
+
+/**
+ * Check a Chelsio Filter Request for validity, convert it into our internal
+ * format and send it to the hardware.  Return 0 on success, an error number
+ * otherwise.  We attach any provided filter operation context to the internal
+ * filter specification in order to facilitate signaling completion of the
+ * operation.
+ */
+int cxgbe_set_filter(struct rte_eth_dev *dev, unsigned int filter_id,
+		     struct ch_filter_specification *fs,
+		     struct filter_ctx *ctx)
+{
+	struct port_info *pi = ethdev2pinfo(dev);
+	struct adapter *adapter = pi->adapter;
+	u32 iconf;
+	unsigned int fidx, iq, fid_bit = 0;
+	struct filter_entry *f;
+	int ret;
+
+	if (filter_id >= adapter->tids.nftids)
+		return -ERANGE;
+
+	ret = validate_filter(adapter, fs);
+	if (ret)
+		return ret;
+
+	ret = is_filter_set(&adapter->tids, filter_id, fs->type);
+	if (ret)
+		return -EBUSY;
+
+	iq = get_filter_steerq(dev, fs);
+
+	/*
+	 * IPv6 filters occupy four slots and must be aligned on
+	 * four-slot boundaries.  IPv4 filters only occupy a single
+	 * slot and have no alignment requirements but writing a new
+	 * IPv4 filter into the middle of an existing IPv6 filter
+	 * requires clearing the old IPv6 filter.
+	 */
+	if (fs->type == FILTER_TYPE_IPV4) { /* IPv4 */
+		/*
+		 * If our IPv4 filter isn't being written to a
+		 * multiple of four filter index and there's an IPv6
+		 * filter at the multiple of 4 base slot, then we need
+		 * to delete that IPv6 filter ...
+		 */
+		fidx = filter_id & ~0x3;
+		if (fidx != filter_id && adapter->tids.ftid_tab[fidx].fs.type) {
+			f = &adapter->tids.ftid_tab[fidx];
+			ret = delete_filter(dev, fidx);
+			if (ret)
+				return ret;
+			if (f->valid) {
+				fid_bit = f->tid;
+				fid_bit -= adapter->tids.ftid_base;
+				cxgbe_clear_ftid(&adapter->tids,
+						 fid_bit, FILTER_TYPE_IPV6);
+			}
+		}
+	} else { /* IPv6 */
+		/*
+		 * Ensure that the IPv6 filter is aligned on a
+		 * multiple of 4 boundary.
+		 */
+		if (filter_id & 0x3)
+			return -EINVAL;
+
+		/*
+		 * Check all except the base overlapping IPv4 filter
+		 * slots.
+		 */
+		for (fidx = filter_id + 1; fidx < filter_id + 4; fidx++) {
+			f = &adapter->tids.ftid_tab[fidx];
+			ret = delete_filter(dev, fidx);
+			if (ret)
+				return ret;
+			if (f->valid) {
+				fid_bit = f->tid;
+				fid_bit -=  adapter->tids.ftid_base;
+				cxgbe_clear_ftid(&adapter->tids,
+						 fid_bit, FILTER_TYPE_IPV4);
+			}
+		}
+	}
+
+	/*
+	 * Check to make sure that provided filter index is not
+	 * already in use by someone else
+	 */
+	f = &adapter->tids.ftid_tab[filter_id];
+	if (f->valid)
+		return -EBUSY;
+
+	fidx = adapter->tids.ftid_base + filter_id;
+	fid_bit = filter_id;
+	ret = cxgbe_set_ftid(&adapter->tids, fid_bit,
+			     fs->type ? FILTER_TYPE_IPV6 : FILTER_TYPE_IPV4);
+	if (ret)
+		return ret;
+
+	/*
+	 * Check to make sure the filter requested is writable ...
+	 */
+	ret = writable_filter(f);
+	if (ret) {
+		/* Clear the bits we have set above */
+		cxgbe_clear_ftid(&adapter->tids, fid_bit,
+				 fs->type ? FILTER_TYPE_IPV6 :
+					    FILTER_TYPE_IPV4);
+		return ret;
+	}
+
+	/*
+	 * Clear out any old resources being used by the filter before
+	 * we start constructing the new filter.
+	 */
+	if (f->valid)
+		clear_filter(f);
+
+	/*
+	 * Convert the filter specification into our internal format.
+	 * We copy the PF/VF specification into the Outer VLAN field
+	 * here so the rest of the code -- including the interface to
+	 * the firmware -- doesn't have to constantly do these checks.
+	 */
+	f->fs = *fs;
+	f->fs.iq = iq;
+	f->dev = dev;
+
+	iconf = adapter->params.tp.ingress_config;
+	if (iconf & F_VNIC) {
+		f->fs.val.ovlan = (fs->val.pf << 13) | fs->val.vf;
+		f->fs.mask.ovlan = (fs->mask.pf << 13) | fs->mask.vf;
+		f->fs.val.ovlan_vld = fs->val.pfvf_vld;
+		f->fs.mask.ovlan_vld = fs->mask.pfvf_vld;
+	}
+
+	/*
+	 * Attempt to set the filter.  If we don't succeed, we clear
+	 * it and return the failure.
+	 */
+	f->ctx = ctx;
+	f->tid = fidx; /* Save the actual tid */
+	ret = set_filter_wr(dev, filter_id);
+	if (ret) {
+		fid_bit = f->tid - adapter->tids.ftid_base;
+		cxgbe_clear_ftid(&adapter->tids, fid_bit,
+				 fs->type ? FILTER_TYPE_IPV6 :
+					    FILTER_TYPE_IPV4);
+		clear_filter(f);
+	}
+
+	return ret;
+}
+
+/**
+ * Handle an LE-TCAM filter write/deletion reply.
+ */
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
+{
+	struct filter_entry *f = NULL;
+	unsigned int tid = GET_TID(rpl);
+	int idx, max_fidx = adap->tids.nftids;
+
+	/* Get the corresponding filter entry for this tid */
+	if (adap->tids.ftid_tab) {
+		/* Check this in normal filter region */
+		idx = tid - adap->tids.ftid_base;
+		if (idx >= max_fidx)
+			return;
+
+		f = &adap->tids.ftid_tab[idx];
+		if (f->tid != tid)
+			return;
+	}
+
+	/* We found the filter entry for this tid */
+	if (f) {
+		unsigned int ret = G_COOKIE(rpl->cookie);
+		struct filter_ctx *ctx;
+
+		/*
+		 * Pull off any filter operation context attached to the
+		 * filter.
+		 */
+		ctx = f->ctx;
+		f->ctx = NULL;
+
+		if (ret == FW_FILTER_WR_FLT_DELETED) {
+			/*
+			 * Clear the filter when we get confirmation from the
+			 * hardware that the filter has been deleted.
+			 */
+			clear_filter(f);
+			if (ctx)
+				ctx->result = 0;
+		} else if (ret == FW_FILTER_WR_FLT_ADDED) {
+			f->pending = 0;  /* asynchronous setup completed */
+			f->valid = 1;
+			if (ctx) {
+				ctx->tid = f->tid;
+				ctx->result = 0;
+			}
+
+			if (f->fs.newsmac) {
+				/* do a set-tcb for smac-sel and CWR bit.. */
+				set_tcb_tflag(adap, f->tid, S_TF_CCTRL_CWR,
+					      1, 1);
+				set_tcb_field(adap, f->tid, W_TCB_SMAC_SEL,
+					      V_TCB_SMAC_SEL(M_TCB_SMAC_SEL),
+					      V_TCB_SMAC_SEL(f->smtidx), 1);
+			}
+		} else {
+			/*
+			 * Something went wrong.  Issue a warning about the
+			 * problem and clear everything out.
+			 */
+			dev_warn(adap, "filter %d setup failed with error %u\n",
+				 idx, ret);
+			clear_filter(f);
+			if (ctx)
+				ctx->result = -EINVAL;
+		}
+
+		if (ctx)
+			t4_complete(&ctx->completion);
+	}
+}
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index b03ccca..96c15d2 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -34,6 +34,8 @@
 #ifndef _CXGBE_FILTER_H_
 #define _CXGBE_FILTER_H_
 
+#include "t4_msg.h"
+
 /*
  * Defined bit width of user definable filter tuples
  */
@@ -232,4 +234,20 @@ struct filter_entry {
 	 */
 	struct ch_filter_specification fs;
 };
+
+struct adapter;
+
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl);
+void clear_filter(struct filter_entry *f);
+int set_filter_wr(struct rte_eth_dev *dev, unsigned int fidx);
+int delete_filter(struct rte_eth_dev *dev, unsigned int fidx);
+int writable_filter(struct filter_entry *f);
+int cxgbe_set_filter(struct rte_eth_dev *dev, unsigned int filter_id,
+		     struct ch_filter_specification *fs,
+		     struct filter_ctx *ctx);
+int cxgbe_del_filter(struct rte_eth_dev *dev, unsigned int filter_id,
+		     struct ch_filter_specification *fs,
+		     struct filter_ctx *ctx);
+
+void cxgbe_clear_all_filters(struct adapter *adapter);
 #endif /* _CXGBE_FILTER_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index e7d017e..dfb6567 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -69,6 +69,7 @@
 #include "clip_tbl.h"
 #include "l2t.h"
 #include "smt.h"
+#include "cxgbe_filter.h"
 
 /**
  * Allocate a chunk of memory. The allocated memory is cleared.
@@ -118,6 +119,10 @@ static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
 		const struct cpl_fw6_msg *msg = (const void *)rsp;
 
 		t4_handle_fw_rpl(q->adapter, msg->data);
+	} else if (opcode == CPL_SET_TCB_RPL) {
+		const struct cpl_set_tcb_rpl *p = (const void *)rsp;
+
+		filter_rpl(q->adapter, p);
 	} else if (opcode == CPL_SMT_WRITE_RPL) {
 		const struct cpl_smt_write_rpl *p = (const void *)rsp;
 
@@ -1232,6 +1237,7 @@ void cxgbe_close(struct adapter *adapter)
 	int i;
 
 	if (adapter->flags & FULL_INIT_DONE) {
+		cxgbe_clear_all_filters(adapter);
 		tid_free(&adapter->tids);
 		t4_cleanup_clip_tbl(adapter);
 		t4_cleanup_l2t(adapter);
diff --git a/drivers/net/cxgbe/cxgbe_ofld.h b/drivers/net/cxgbe/cxgbe_ofld.h
index 19971e7..115472e 100644
--- a/drivers/net/cxgbe/cxgbe_ofld.h
+++ b/drivers/net/cxgbe/cxgbe_ofld.h
@@ -47,6 +47,11 @@
 	(w)->wr.wr_lo = cpu_to_be64(0); \
 } while (0)
 
+#define INIT_TP_WR_MIT_CPL(w, cpl, tid) do { \
+	INIT_TP_WR(w, tid); \
+	OPCODE_TID(w) = cpu_to_be32(MK_OPCODE_TID(cpl, tid)); \
+} while (0)
+
 /*
  * Max # of ATIDs.  The absolute HW max is 16K but we keep it lower.
  */
-- 
2.5.3

* [PATCH 09/10] cxgbe: add HASH filtering support
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (7 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 08/10] cxgbe: add LE-TCAM filtering support Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-03  8:32 ` [PATCH 10/10] cxgbe: add flow director support and update documentation Rahul Lakkireddy
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add support for setting HASH (maskless) filters.  Both IPv4 and IPv6
filters occupy only one index each.  The index returned is a hash
computed by the hardware from the values of the fields being matched.
During matching, the hardware computes a hash of the relevant fields
in the incoming packet and compares it against the hashed indices.
If a match is found, the filter's action is taken immediately.
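
The lookup described above can be pictured with a small software
model.  This is only a sketch under stated assumptions: the struct
layout, the table size, the FNV-1a hash and every function name
below are hypothetical stand-ins, and the explicit key comparison is
an assumption about how a hashed slot is confirmed.  None of it is
taken from the hardware or from this driver.

```c
#include <stdint.h>
#include <stddef.h>

#define TABLE_SIZE 64u	/* hypothetical table size */

struct flow_key {	/* hypothetical set of matched fields */
	uint32_t lip, fip;
	uint16_t lport, fport;
};

struct hash_slot {
	int valid;
	struct flow_key key;
	int action;
};

static struct hash_slot table[TABLE_SIZE];

/* FNV-1a stands in for the hardware hash, which is not described here. */
static uint32_t key_hash(const struct flow_key *k)
{
	uint32_t words[3] = {
		k->lip, k->fip, ((uint32_t)k->lport << 16) | k->fport
	};
	const unsigned char *p = (const unsigned char *)words;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < sizeof(words); i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h % TABLE_SIZE;
}

static int keys_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->lip == b->lip && a->fip == b->fip &&
	       a->lport == b->lport && a->fport == b->fport;
}

/* Returns the index chosen by the hash, or -1 if that slot is taken. */
static int hash_filter_add(const struct flow_key *k, int action)
{
	uint32_t idx = key_hash(k);

	if (table[idx].valid)
		return -1;
	table[idx].valid = 1;
	table[idx].key = *k;
	table[idx].action = action;
	return (int)idx;
}

/* Returns the stored action on an exact match, or -1 on a miss. */
static int hash_filter_lookup(const struct flow_key *k)
{
	uint32_t idx = key_hash(k);

	if (table[idx].valid && keys_equal(&table[idx].key, k))
		return table[idx].action;
	return -1;
}
```

Because the filter is maskless, the caller does not choose the index;
it is whatever the hash of the fields yields, which is why the index
is reported back rather than supplied.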

HASH filters take priority over LE-TCAM filters when a packet
matches rules in both the HASH filter region and the LE-TCAM filter
region.  This can be changed by setting the prio bit in the filter
specification.
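
The precedence rule can be written down as a tiny decision function.
This is a sketch, not driver code: the enum, the function name and
its parameters are hypothetical, and it assumes that the prio bit is
carried by the LE-TCAM rule and lets that rule win when both regions
match.

```c
#include <stdbool.h>

enum region { REGION_NONE, REGION_HASH, REGION_TCAM };

/*
 * Decide which region's action applies to a packet.
 * tcam_prio models the prio bit (assumed to belong to the TCAM rule).
 */
static enum region pick_region(bool hash_hit, bool tcam_hit, bool tcam_prio)
{
	if (hash_hit && tcam_hit)
		return tcam_prio ? REGION_TCAM : REGION_HASH;
	if (hash_hit)
		return REGION_HASH;
	if (tcam_hit)
		return REGION_TCAM;
	return REGION_NONE;
}
```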

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 drivers/net/cxgbe/base/adapter.h        |  11 +
 drivers/net/cxgbe/base/common.h         |   7 +
 drivers/net/cxgbe/base/t4_msg.h         | 198 ++++++++
 drivers/net/cxgbe/base/t4_regs.h        |   9 +
 drivers/net/cxgbe/base/t4_regs_values.h |  25 +
 drivers/net/cxgbe/base/t4_tcb.h         |  21 +
 drivers/net/cxgbe/base/t4fw_interface.h |  19 +
 drivers/net/cxgbe/cxgbe_compat.h        |  12 +
 drivers/net/cxgbe/cxgbe_filter.c        | 812 ++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/cxgbe_filter.h        |   7 +
 drivers/net/cxgbe/cxgbe_main.c          | 135 +++++-
 drivers/net/cxgbe/cxgbe_ofld.h          |  26 +
 12 files changed, 1280 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index a866993..a64571d 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -629,6 +629,17 @@ static inline struct adapter *ethdev2adap(const struct rte_eth_dev *dev)
 	return ethdev2pinfo(dev)->adapter;
 }
 
+/**
+ * cxgbe_port_viid - get the VI id of a port
+ * @dev: the device for the port
+ *
+ * Return the VI id of the given port.
+ */
+static inline unsigned int cxgbe_port_viid(const struct rte_eth_dev *dev)
+{
+	return ethdev2pinfo(dev)->viid;
+}
+
 void *t4_alloc_mem(size_t size);
 void t4_free_mem(void *addr);
 #define t4_os_alloc(_size)     t4_alloc_mem((_size))
diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h
index 21dca32..bbeca75 100644
--- a/drivers/net/cxgbe/base/common.h
+++ b/drivers/net/cxgbe/base/common.h
@@ -220,6 +220,8 @@ struct adapter_params {
 	unsigned char nports;             /* # of ethernet ports */
 	unsigned char portvec;
 
+	unsigned char hash_filter;
+
 	enum chip_type chip;              /* chip code */
 	struct arch_specific_params arch; /* chip specific params */
 
@@ -255,6 +257,11 @@ static inline int t4_wait_op_done(struct adapter *adapter, int reg, u32 mask,
 #define for_each_port(adapter, iter) \
 	for (iter = 0; iter < (adapter)->params.nports; ++iter)
 
+static inline int is_hashfilter(const struct adapter *adap)
+{
+	return adap->params.hash_filter;
+}
+
 void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
 void t4_tp_wr_bits_indirect(struct adapter *adap, unsigned int addr,
 			    unsigned int mask, unsigned int val);
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index 57534f0..bffea8b 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -35,12 +35,19 @@
 #define T4_MSG_H
 
 enum {
+	CPL_ACT_OPEN_REQ      = 0x3,
 	CPL_SET_TCB_FIELD     = 0x5,
+	CPL_ABORT_REQ         = 0xA,
+	CPL_ABORT_RPL         = 0xB,
 	CPL_L2T_WRITE_REQ     = 0x12,
 	CPL_SMT_WRITE_REQ     = 0x14,
+	CPL_TID_RELEASE       = 0x1A,
 	CPL_L2T_WRITE_RPL     = 0x23,
+	CPL_ACT_OPEN_RPL      = 0x25,
+	CPL_ABORT_RPL_RSS     = 0x2D,
 	CPL_SMT_WRITE_RPL     = 0x2E,
 	CPL_SET_TCB_RPL       = 0x3A,
+	CPL_ACT_OPEN_REQ6     = 0x83,
 	CPL_SGE_EGR_UPDATE    = 0xA5,
 	CPL_FW4_MSG           = 0xC0,
 	CPL_FW6_MSG           = 0xE0,
@@ -53,6 +60,16 @@ enum CPL_error {
 	CPL_ERR_TCAM_FULL          = 3,
 };
 
+enum {
+	ULP_MODE_NONE          = 0,
+	ULP_MODE_TCPDDP        = 5,
+};
+
+enum {
+	CPL_ABORT_SEND_RST = 0,
+	CPL_ABORT_NO_RST,
+};
+
 enum {                     /* TX_PKT_XT checksum types */
 	TX_CSUM_TCPIP  = 8,
 	TX_CSUM_UDPIP  = 9,
@@ -127,6 +144,148 @@ struct work_request_hdr {
 #define WR_HDR_SIZE 0
 #endif
 
+/* option 0 fields */
+#define S_TX_CHAN    2
+#define V_TX_CHAN(x) ((x) << S_TX_CHAN)
+
+#define S_NO_CONG    4
+#define V_NO_CONG(x) ((x) << S_NO_CONG)
+
+#define S_DELACK    5
+#define V_DELACK(x) ((x) << S_DELACK)
+
+#define S_NON_OFFLOAD    7
+#define V_NON_OFFLOAD(x) ((x) << S_NON_OFFLOAD)
+#define F_NON_OFFLOAD    V_NON_OFFLOAD(1U)
+
+#define S_ULP_MODE    8
+#define V_ULP_MODE(x) ((x) << S_ULP_MODE)
+
+#define S_SMAC_SEL    28
+#define V_SMAC_SEL(x) ((__u64)(x) << S_SMAC_SEL)
+
+#define S_L2T_IDX    36
+#define V_L2T_IDX(x) ((__u64)(x) << S_L2T_IDX)
+
+#define S_TCAM_BYPASS    48
+#define V_TCAM_BYPASS(x) ((__u64)(x) << S_TCAM_BYPASS)
+#define F_TCAM_BYPASS    V_TCAM_BYPASS(1ULL)
+
+#define S_NAGLE    49
+#define V_NAGLE(x) ((__u64)(x) << S_NAGLE)
+
+/* option 2 fields */
+#define S_RSS_QUEUE    0
+#define V_RSS_QUEUE(x) ((x) << S_RSS_QUEUE)
+
+#define S_RSS_QUEUE_VALID    10
+#define V_RSS_QUEUE_VALID(x) ((x) << S_RSS_QUEUE_VALID)
+#define F_RSS_QUEUE_VALID    V_RSS_QUEUE_VALID(1U)
+
+#define S_CONG_CNTRL    14
+#define V_CONG_CNTRL(x) ((x) << S_CONG_CNTRL)
+
+#define S_PACE    16
+#define V_PACE(x) ((x) << S_PACE)
+
+#define S_RX_FC_DISABLE    20
+#define V_RX_FC_DISABLE(x) ((x) << S_RX_FC_DISABLE)
+
+#define S_TX_QUEUE    23
+#define V_TX_QUEUE(x) ((x) << S_TX_QUEUE)
+
+#define S_RX_CHANNEL    26
+#define V_RX_CHANNEL(x) ((x) << S_RX_CHANNEL)
+#define F_RX_CHANNEL    V_RX_CHANNEL(1U)
+
+#define S_CCTRL_ECN    27
+#define V_CCTRL_ECN(x) ((x) << S_CCTRL_ECN)
+
+#define S_WND_SCALE_EN    28
+#define V_WND_SCALE_EN(x) ((x) << S_WND_SCALE_EN)
+
+#define S_SACK_EN    30
+#define V_SACK_EN(x) ((x) << S_SACK_EN)
+
+#define S_T5_OPT_2_VALID    31
+#define V_T5_OPT_2_VALID(x) ((x) << S_T5_OPT_2_VALID)
+#define F_T5_OPT_2_VALID    V_T5_OPT_2_VALID(1U)
+
+struct cpl_act_open_req {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 local_port;
+	__be16 peer_port;
+	__be32 local_ip;
+	__be32 peer_ip;
+	__be64 opt0;
+	__be32 params;
+	__be32 opt2;
+};
+
+/* cpl_{t5,t6}_act_open_req.params field */
+#define S_FILTER_TUPLE	24
+#define V_FILTER_TUPLE(x) ((x) << S_FILTER_TUPLE)
+
+struct cpl_t5_act_open_req {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 local_port;
+	__be16 peer_port;
+	__be32 local_ip;
+	__be32 peer_ip;
+	__be64 opt0;
+	__be32 rsvd;
+	__be32 opt2;
+	__be64 params;
+};
+
+struct cpl_act_open_req6 {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 local_port;
+	__be16 peer_port;
+	__be64 local_ip_hi;
+	__be64 local_ip_lo;
+	__be64 peer_ip_hi;
+	__be64 peer_ip_lo;
+	__be64 opt0;
+	__be32 params;
+	__be32 opt2;
+};
+
+struct cpl_t5_act_open_req6 {
+	WR_HDR;
+	union opcode_tid ot;
+	__be16 local_port;
+	__be16 peer_port;
+	__be64 local_ip_hi;
+	__be64 local_ip_lo;
+	__be64 peer_ip_hi;
+	__be64 peer_ip_lo;
+	__be64 opt0;
+	__be32 rsvd;
+	__be32 opt2;
+	__be64 params;
+};
+
+struct cpl_act_open_rpl {
+	RSS_HDR
+	union opcode_tid ot;
+	__be32 atid_status;
+};
+
+/* cpl_act_open_rpl.atid_status fields */
+#define S_AOPEN_STATUS    0
+#define M_AOPEN_STATUS    0xFF
+#define V_AOPEN_STATUS(x) ((x) << S_AOPEN_STATUS)
+#define G_AOPEN_STATUS(x) (((x) >> S_AOPEN_STATUS) & M_AOPEN_STATUS)
+
+#define S_AOPEN_ATID    8
+#define M_AOPEN_ATID    0xFFFFFF
+#define V_AOPEN_ATID(x) ((x) << S_AOPEN_ATID)
+#define G_AOPEN_ATID(x) (((x) >> S_AOPEN_ATID) & M_AOPEN_ATID)
+
 /* cpl_get_tcb.reply_ctrl fields */
 #define S_QUEUENO    0
 #define V_QUEUENO(x) ((x) << S_QUEUENO)
@@ -164,6 +323,39 @@ struct cpl_set_tcb_rpl {
 	__be64 oldval;
 };
 
+/* cpl_abort_req status command code
+ */
+struct cpl_abort_req {
+	WR_HDR;
+	union opcode_tid ot;
+	__be32 rsvd0;
+	__u8  rsvd1;
+	__u8  cmd;
+	__u8  rsvd2[6];
+};
+
+struct cpl_abort_rpl_rss {
+	RSS_HDR
+	union opcode_tid ot;
+	__u8  rsvd[3];
+	__u8  status;
+};
+
+struct cpl_abort_rpl {
+	WR_HDR;
+	union opcode_tid ot;
+	__be32 rsvd0;
+	__u8  rsvd1;
+	__u8  cmd;
+	__u8  rsvd2[6];
+};
+
+struct cpl_tid_release {
+	WR_HDR;
+	union opcode_tid ot;
+	__be32 rsvd;
+};
+
 struct cpl_tx_data {
 	union opcode_tid ot;
 	__be32 len;
@@ -411,7 +603,13 @@ struct cpl_fw6_msg {
 	__be64 data[4];
 };
 
+/* ULP_TX opcodes */
+enum {
+	ULP_TX_PKT = 4
+};
+
 enum {
+	ULP_TX_SC_NOOP = 0x80,
 	ULP_TX_SC_IMM  = 0x81,
 	ULP_TX_SC_DSGL = 0x82,
 	ULP_TX_SC_ISGL = 0x83
diff --git a/drivers/net/cxgbe/base/t4_regs.h b/drivers/net/cxgbe/base/t4_regs.h
index 9057e40..1a7b6df 100644
--- a/drivers/net/cxgbe/base/t4_regs.h
+++ b/drivers/net/cxgbe/base/t4_regs.h
@@ -793,3 +793,12 @@
 #define M_REV    0xfU
 #define V_REV(x) ((x) << S_REV)
 #define G_REV(x) (((x) >> S_REV) & M_REV)
+
+/* registers for module LE */
+#define A_LE_DB_CONFIG 0x19c04
+
+#define S_HASHEN    20
+#define V_HASHEN(x) ((x) << S_HASHEN)
+#define F_HASHEN    V_HASHEN(1U)
+
+#define A_LE_DB_TID_HASHBASE 0x19df8
diff --git a/drivers/net/cxgbe/base/t4_regs_values.h b/drivers/net/cxgbe/base/t4_regs_values.h
index d7d3144..8e1f6f3 100644
--- a/drivers/net/cxgbe/base/t4_regs_values.h
+++ b/drivers/net/cxgbe/base/t4_regs_values.h
@@ -155,6 +155,9 @@
  * selects for a particular field being present.  These fields, when present
  * in the Compressed Filter Tuple, have the following widths in bits.
  */
+#define S_FT_FIRST			S_FCOE
+#define S_FT_LAST			S_FRAGMENTATION
+
 #define W_FT_FCOE			1
 #define W_FT_PORT			3
 #define W_FT_VNIC_ID			17
@@ -166,4 +169,26 @@
 #define W_FT_MPSHITTYPE			3
 #define W_FT_FRAGMENTATION		1
 
+/*
+ * Some of the Compressed Filter Tuple fields have internal structure.  These
+ * bit shifts/masks describe those structures.  All shifts are relative to the
+ * base position of the fields within the Compressed Filter Tuple
+ */
+#define S_FT_VLAN_VLD		16
+#define V_FT_VLAN_VLD(x)	((x) << S_FT_VLAN_VLD)
+#define F_FT_VLAN_VLD		V_FT_VLAN_VLD(1U)
+
+#define S_FT_VNID_ID_VF		0
+#define M_FT_VNID_ID_VF		0x7fU
+#define V_FT_VNID_ID_VF(x)	((x) << S_FT_VNID_ID_VF)
+#define G_FT_VNID_ID_VF(x)	(((x) >> S_FT_VNID_ID_VF) & M_FT_VNID_ID_VF)
+
+#define S_FT_VNID_ID_PF		7
+#define M_FT_VNID_ID_PF		0x7U
+#define V_FT_VNID_ID_PF(x)	((x) << S_FT_VNID_ID_PF)
+#define G_FT_VNID_ID_PF(x)	(((x) >> S_FT_VNID_ID_PF) & M_FT_VNID_ID_PF)
+
+#define S_FT_VNID_ID_VLD	16
+#define V_FT_VNID_ID_VLD(x)	((x) << S_FT_VNID_ID_VLD)
+#define F_FT_VNID_ID_VLD	V_FT_VNID_ID_VLD(1U)
 #endif /* __T4_REGS_VALUES_H__ */
diff --git a/drivers/net/cxgbe/base/t4_tcb.h b/drivers/net/cxgbe/base/t4_tcb.h
index 36afd56..1a076f3 100644
--- a/drivers/net/cxgbe/base/t4_tcb.h
+++ b/drivers/net/cxgbe/base/t4_tcb.h
@@ -60,6 +60,27 @@
 #define M_TCB_T_RTT_TS_RECENT_AGE    0xffffffffULL
 #define V_TCB_T_RTT_TS_RECENT_AGE(x) ((x) << S_TCB_T_RTT_TS_RECENT_AGE)
 
+/* 347:320 */
+#define W_TCB_SND_UNA_RAW    10
+
+/* 553:522 */
+#define W_TCB_RCV_NXT    16
+#define S_TCB_RCV_NXT    10
+#define M_TCB_RCV_NXT    0xffffffffULL
+#define V_TCB_RCV_NXT(x) ((__u64)(x) << S_TCB_RCV_NXT)
+
+/* 891:875 */
+#define W_TCB_RX_FRAG2_PTR_RAW    27
+
+/* 964:937 */
+#define W_TCB_RX_FRAG3_LEN_RAW    29
+
+/* 992:965 */
+#define W_TCB_RX_FRAG3_START_IDX_OFFSET_RAW    30
+
+/* 1000:993 */
+#define W_TCB_PDU_HDR_LEN    31
+
 #define S_TF_MIGRATING    0
 #define V_TF_MIGRATING(x) ((x) << S_TF_MIGRATING)
 
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index d3e4de5..d9278ff 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -83,6 +83,7 @@ enum fw_memtype {
 
 enum fw_wr_opcodes {
 	FW_FILTER_WR		= 0x02,
+	FW_ULPTX_WR		= 0x04,
 	FW_TP_WR		= 0x05,
 	FW_ETH_TX_PKT_WR	= 0x08,
 	FW_ETH_TX_PKTS_WR	= 0x09,
@@ -1009,6 +1010,24 @@ struct fw_eq_ctrl_cmd {
 #define S_FW_EQ_CTRL_CMD_EQSIZE		0
 #define V_FW_EQ_CTRL_CMD_EQSIZE(x)	((x) << S_FW_EQ_CTRL_CMD_EQSIZE)
 
+/* Macros for VIID parsing:
+ * VIID - [10:8] PFN, [7] VI Valid, [6:0] VI number
+ */
+#define S_FW_VIID_PFN		8
+#define M_FW_VIID_PFN		0x7
+#define V_FW_VIID_PFN(x)	((x) << S_FW_VIID_PFN)
+#define G_FW_VIID_PFN(x)	(((x) >> S_FW_VIID_PFN) & M_FW_VIID_PFN)
+
+#define S_FW_VIID_VIVLD		7
+#define M_FW_VIID_VIVLD		0x1
+#define V_FW_VIID_VIVLD(x)	((x) << S_FW_VIID_VIVLD)
+#define G_FW_VIID_VIVLD(x)	(((x) >> S_FW_VIID_VIVLD) & M_FW_VIID_VIVLD)
+
+#define S_FW_VIID_VIN		0
+#define M_FW_VIID_VIN		0x7F
+#define V_FW_VIID_VIN(x)	((x) << S_FW_VIID_VIN)
+#define G_FW_VIID_VIN(x)	(((x) >> S_FW_VIID_VIN) & M_FW_VIID_VIN)
+
 enum fw_vi_func {
 	FW_VI_FUNC_ETH,
 };
diff --git a/drivers/net/cxgbe/cxgbe_compat.h b/drivers/net/cxgbe/cxgbe_compat.h
index e68f8f5..6649afc 100644
--- a/drivers/net/cxgbe/cxgbe_compat.h
+++ b/drivers/net/cxgbe/cxgbe_compat.h
@@ -263,4 +263,16 @@ static inline void writeq(u64 val, volatile void __iomem *addr)
 	writel(val >> 32, (void *)((uintptr_t)addr + 4));
 }
 
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)(                     \
+{                                                       \
+	typeof(x) quot = (x) / (denom);                 \
+	typeof(x) rem  = (x) % (denom);                 \
+	(quot * (numer)) + ((rem * (numer)) / (denom)); \
+}                                                       \
+)
+
 #endif /* _CXGBE_COMPAT_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_filter.c b/drivers/net/cxgbe/cxgbe_filter.c
index d4e32b1..285381b 100644
--- a/drivers/net/cxgbe/cxgbe_filter.c
+++ b/drivers/net/cxgbe/cxgbe_filter.c
@@ -41,6 +41,44 @@
 #include "cxgbe_filter.h"
 
 /**
+ * Initialize Hash Filters
+ */
+int init_hash_filter(struct adapter *adap)
+{
+	unsigned int n_user_filters;
+	unsigned int user_filter_perc;
+	int ret;
+	u32 params[7], val[7];
+
+#define FW_PARAM_DEV(param) \
+	(V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | \
+	V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_##param))
+
+#define FW_PARAM_PFVF(param) \
+	(V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_PFVF) | \
+	V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_PFVF_##param) |  \
+	V_FW_PARAMS_PARAM_Y(0) | \
+	V_FW_PARAMS_PARAM_Z(0))
+
+	params[0] = FW_PARAM_DEV(NTID);
+	ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 1,
+			      params, val);
+	if (ret < 0)
+		return ret;
+	adap->tids.ntids = val[0];
+	adap->tids.natids = min(adap->tids.ntids / 2, MAX_ATIDS);
+
+	user_filter_perc = 100;
+	n_user_filters = mult_frac(adap->tids.nftids,
+				   user_filter_perc,
+				   100);
+
+	adap->tids.nftids = n_user_filters;
+	adap->params.hash_filter = 1;
+	return 0;
+}
+
+/**
  * Validate if the requested filter specification can be set by checking
  * if the requested features have been enabled
  */
@@ -190,6 +228,556 @@ static void set_tcb_tflag(struct adapter *adap, unsigned int ftid,
 }
 
 /**
+ * Build a CPL_SET_TCB_FIELD message as payload of a ULP_TX_PKT command.
+ */
+static inline void mk_set_tcb_field_ulp(struct filter_entry *f,
+					struct cpl_set_tcb_field *req,
+					unsigned int word,
+					u64 mask, u64 val, u8 cookie,
+					int no_reply)
+{
+	struct ulp_txpkt *txpkt = (struct ulp_txpkt *)req;
+	struct ulptx_idata *sc = (struct ulptx_idata *)(txpkt + 1);
+
+	txpkt->cmd_dest = cpu_to_be32(V_ULPTX_CMD(ULP_TX_PKT) |
+				      V_ULP_TXPKT_DEST(0));
+	txpkt->len = cpu_to_be32(DIV_ROUND_UP(sizeof(*req), 16));
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_IMM));
+	sc->len = cpu_to_be32(sizeof(*req) - sizeof(struct work_request_hdr));
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_SET_TCB_FIELD, f->tid));
+	req->reply_ctrl = cpu_to_be16(V_NO_REPLY(no_reply) | V_REPLY_CHAN(0) |
+				      V_QUEUENO(0));
+	req->word_cookie = cpu_to_be16(V_WORD(word) | V_COOKIE(cookie));
+	req->mask = cpu_to_be64(mask);
+	req->val = cpu_to_be64(val);
+	sc = (struct ulptx_idata *)(req + 1);
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_NOOP));
+	sc->len = cpu_to_be32(0);
+}
+
+/**
+ * Set NAT parameters
+ */
+static void set_nat_params(struct adapter *adap, struct filter_entry *f,
+			   unsigned int tid, bool dip, bool sip,
+			   bool dp, bool sp)
+{
+	if (dip) {
+		if (f->fs.type) {
+			set_tcb_field(adap, tid, W_TCB_SND_UNA_RAW,
+				      WORD_MASK, f->fs.nat_lip[15] |
+				      f->fs.nat_lip[14] << 8 |
+				      f->fs.nat_lip[13] << 16 |
+				      f->fs.nat_lip[12] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_SND_UNA_RAW + 1,
+				      WORD_MASK, f->fs.nat_lip[11] |
+				      f->fs.nat_lip[10] << 8 |
+				      f->fs.nat_lip[9] << 16 |
+				      f->fs.nat_lip[8] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_SND_UNA_RAW + 2,
+				      WORD_MASK, f->fs.nat_lip[7] |
+				      f->fs.nat_lip[6] << 8 |
+				      f->fs.nat_lip[5] << 16 |
+				      f->fs.nat_lip[4] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_SND_UNA_RAW + 3,
+				      WORD_MASK, f->fs.nat_lip[3] |
+				      f->fs.nat_lip[2] << 8 |
+				      f->fs.nat_lip[1] << 16 |
+				      f->fs.nat_lip[0] << 24, 1);
+		} else {
+			set_tcb_field(adap, tid, W_TCB_RX_FRAG3_LEN_RAW,
+				      WORD_MASK, f->fs.nat_lip[3] |
+				      f->fs.nat_lip[2] << 8 |
+				      f->fs.nat_lip[1] << 16 |
+				      f->fs.nat_lip[0] << 24, 1);
+		}
+	}
+
+	if (sip) {
+		if (f->fs.type) {
+			set_tcb_field(adap, tid, W_TCB_RX_FRAG2_PTR_RAW,
+				      WORD_MASK, f->fs.nat_fip[15] |
+				      f->fs.nat_fip[14] << 8 |
+				      f->fs.nat_fip[13] << 16 |
+				      f->fs.nat_fip[12] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_RX_FRAG2_PTR_RAW + 1,
+				      WORD_MASK, f->fs.nat_fip[11] |
+				      f->fs.nat_fip[10] << 8 |
+				      f->fs.nat_fip[9] << 16 |
+				      f->fs.nat_fip[8] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_RX_FRAG2_PTR_RAW + 2,
+				      WORD_MASK, f->fs.nat_fip[7] |
+				      f->fs.nat_fip[6] << 8 |
+				      f->fs.nat_fip[5] << 16 |
+				      f->fs.nat_fip[4] << 24, 1);
+
+			set_tcb_field(adap, tid, W_TCB_RX_FRAG2_PTR_RAW + 3,
+				      WORD_MASK, f->fs.nat_fip[3] |
+				      f->fs.nat_fip[2] << 8 |
+				      f->fs.nat_fip[1] << 16 |
+				      f->fs.nat_fip[0] << 24, 1);
+
+		} else {
+			set_tcb_field(adap, tid,
+				      W_TCB_RX_FRAG3_START_IDX_OFFSET_RAW,
+				      WORD_MASK, f->fs.nat_fip[3] |
+				      f->fs.nat_fip[2] << 8 |
+				      f->fs.nat_fip[1] << 16 |
+				      f->fs.nat_fip[0] << 24, 1);
+		}
+	}
+
+	set_tcb_field(adap, tid, W_TCB_PDU_HDR_LEN, WORD_MASK,
+		      (dp ? f->fs.nat_lport : 0) |
+		      (sp ? f->fs.nat_fport << 16 : 0), 1);
+}
+
+/**
+ * Build a CPL_ABORT_REQ message as payload of a ULP_TX_PKT command.
+ */
+static void mk_abort_req_ulp(struct cpl_abort_req *abort_req,
+			     unsigned int tid)
+{
+	struct ulp_txpkt *txpkt = (struct ulp_txpkt *)abort_req;
+	struct ulptx_idata *sc = (struct ulptx_idata *)(txpkt + 1);
+
+	txpkt->cmd_dest = cpu_to_be32(V_ULPTX_CMD(ULP_TX_PKT) |
+				      V_ULP_TXPKT_DEST(0));
+	txpkt->len = cpu_to_be32(DIV_ROUND_UP(sizeof(*abort_req), 16));
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_IMM));
+	sc->len = cpu_to_be32(sizeof(*abort_req) -
+			      sizeof(struct work_request_hdr));
+	OPCODE_TID(abort_req) = cpu_to_be32(MK_OPCODE_TID(CPL_ABORT_REQ, tid));
+	abort_req->rsvd0 = cpu_to_be32(0);
+	abort_req->rsvd1 = 0;
+	abort_req->cmd = CPL_ABORT_NO_RST;
+	sc = (struct ulptx_idata *)(abort_req + 1);
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_NOOP));
+	sc->len = cpu_to_be32(0);
+}
+
+/**
+ * Build a CPL_ABORT_RPL message as payload of a ULP_TX_PKT command.
+ */
+static void mk_abort_rpl_ulp(struct cpl_abort_rpl *abort_rpl,
+			     unsigned int tid)
+{
+	struct ulp_txpkt *txpkt = (struct ulp_txpkt *)abort_rpl;
+	struct ulptx_idata *sc = (struct ulptx_idata *)(txpkt + 1);
+
+	txpkt->cmd_dest = cpu_to_be32(V_ULPTX_CMD(ULP_TX_PKT) |
+				      V_ULP_TXPKT_DEST(0));
+	txpkt->len = cpu_to_be32(DIV_ROUND_UP(sizeof(*abort_rpl), 16));
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_IMM));
+	sc->len = cpu_to_be32(sizeof(*abort_rpl) -
+			      sizeof(struct work_request_hdr));
+	OPCODE_TID(abort_rpl) = cpu_to_be32(MK_OPCODE_TID(CPL_ABORT_RPL, tid));
+	abort_rpl->rsvd0 = cpu_to_be32(0);
+	abort_rpl->rsvd1 = 0;
+	abort_rpl->cmd = CPL_ABORT_NO_RST;
+	sc = (struct ulptx_idata *)(abort_rpl + 1);
+	sc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_NOOP));
+	sc->len = cpu_to_be32(0);
+}
+
+/**
+ * Delete the specified hash filter.
+ */
+static int cxgbe_del_hash_filter(struct rte_eth_dev *dev,
+				 unsigned int filter_id,
+				 struct filter_ctx *ctx)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct tid_info *t = &adapter->tids;
+	struct filter_entry *f;
+	struct sge_ctrl_txq *ctrlq;
+	unsigned int port_id = ethdev2pinfo(dev)->port_id;
+	int ret;
+
+	if (filter_id > adapter->tids.ntids)
+		return -E2BIG;
+
+	f = lookup_tid(t, filter_id);
+	if (!f) {
+		dev_err(adapter, "%s: no filter entry for filter_id = %u\n",
+			__func__, filter_id);
+		return -EINVAL;
+	}
+
+	ret = writable_filter(f);
+	if (ret)
+		return ret;
+
+	if (f->valid) {
+		unsigned int wrlen;
+		struct rte_mbuf *mbuf;
+		struct work_request_hdr *wr;
+		struct ulptx_idata *aligner;
+		struct cpl_set_tcb_field *req;
+		struct cpl_abort_req *abort_req;
+		struct cpl_abort_rpl *abort_rpl;
+
+		f->ctx = ctx;
+		f->pending = 1;
+
+		wrlen = cxgbe_roundup(sizeof(*wr) +
+				      (sizeof(*req) + sizeof(*aligner)) +
+				      sizeof(*abort_req) + sizeof(*abort_rpl),
+				      16);
+
+		ctrlq = &adapter->sge.ctrlq[port_id];
+		mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+		if (!mbuf) {
+			dev_err(adapter, "%s: could not allocate mbuf ..\n",
+				__func__);
+			goto out_err;
+		}
+
+		mbuf->data_len = wrlen;
+		mbuf->pkt_len = mbuf->data_len;
+
+		req = rte_pktmbuf_mtod(mbuf, struct cpl_set_tcb_field *);
+		INIT_ULPTX_WR(req, wrlen, 0, 0);
+		wr = (struct work_request_hdr *)req;
+		wr++;
+		req = (struct cpl_set_tcb_field *)wr;
+		mk_set_tcb_field_ulp(f, req, W_TCB_RSS_INFO,
+				     V_TCB_RSS_INFO(M_TCB_RSS_INFO),
+				     V_TCB_RSS_INFO(
+					adapter->sge.fw_evtq.abs_id),
+				     0, 1);
+		aligner = (struct ulptx_idata *)(req + 1);
+		abort_req = (struct cpl_abort_req *)(aligner + 1);
+		mk_abort_req_ulp(abort_req, f->tid);
+		abort_rpl = (struct cpl_abort_rpl *)(abort_req + 1);
+		mk_abort_rpl_ulp(abort_rpl, f->tid);
+		t4_mgmt_tx(ctrlq, mbuf);
+	}
+	return 0;
+
+out_err:
+	return -ENOMEM;
+}
+
+/**
+ * Construct hash filter ntuple.
+ */
+static u64 hash_filter_ntuple(const struct filter_entry *f)
+{
+	struct adapter *adap = ethdev2adap(f->dev);
+	struct tp_params *tp = &adap->params.tp;
+	u64 ntuple = 0;
+	u16 tcp_proto = 6; /* TCP Protocol Number */
+
+	/*
+	 * Initialize each of the fields which we care about which are present
+	 * in the Compressed Filter Tuple.
+	 */
+	if (tp->vlan_shift >= 0 && f->fs.mask.ivlan)
+		ntuple |= (F_FT_VLAN_VLD | f->fs.val.ivlan) << tp->vlan_shift;
+
+	if (tp->port_shift >= 0 && f->fs.mask.iport)
+		ntuple |= (u64)f->fs.val.iport << tp->port_shift;
+
+	if (tp->protocol_shift >= 0) {
+		if (!f->fs.val.proto)
+			ntuple |= (u64)tcp_proto << tp->protocol_shift;
+		else
+			ntuple |= (u64)f->fs.val.proto << tp->protocol_shift;
+	}
+
+	if (tp->tos_shift >= 0 && f->fs.mask.tos)
+		ntuple |= (u64)(f->fs.val.tos) << tp->tos_shift;
+
+	if (tp->vnic_shift >= 0 &&
+	    (f->fs.mask.ovlan || f->fs.mask.pf || f->fs.mask.vf)) {
+		u32 viid = cxgbe_port_viid(f->dev);
+		u32 vf = G_FW_VIID_VIN(viid);
+		u32 pf = G_FW_VIID_PFN(viid);
+		u32 vld = G_FW_VIID_VIVLD(viid);
+
+		ntuple |= (u64)(V_FT_VNID_ID_VF(vf) |
+				V_FT_VNID_ID_PF(pf) |
+				V_FT_VNID_ID_VLD(vld)) << tp->vnic_shift;
+	}
+
+	if (tp->ethertype_shift >= 0 && f->fs.mask.ethtype)
+		ntuple |= (u64)(f->fs.val.ethtype) << tp->ethertype_shift;
+
+	return ntuple;
+}
+
+/**
+ * Build an ACT_OPEN_REQ6 message for setting IPv6 hash filter.
+ */
+static void mk_act_open_req6(struct filter_entry *f, struct rte_mbuf *mbuf,
+			     unsigned int qid_filterid, struct adapter *adap)
+{
+	struct cpl_act_open_req6 *req = NULL;
+	struct cpl_t5_act_open_req6 *t5req = NULL;
+	u64 local_lo, local_hi, peer_lo, peer_hi;
+	u32 *lip = (u32 *)f->fs.val.lip;
+	u32 *fip = (u32 *)f->fs.val.fip;
+
+	switch (CHELSIO_CHIP_VERSION(adap->params.chip)) {
+	case CHELSIO_T5:
+		t5req = rte_pktmbuf_mtod(mbuf, struct cpl_t5_act_open_req6 *);
+
+		INIT_TP_WR(t5req, 0);
+		req = (struct cpl_act_open_req6 *)t5req;
+		break;
+	default:
+		dev_err(adap, "%s: unsupported chip type!\n", __func__);
+		return;
+	}
+
+	local_hi = ((u64)lip[1]) << 32 | lip[0];
+	local_lo = ((u64)lip[3]) << 32 | lip[2];
+	peer_hi = ((u64)fip[1]) << 32 | fip[0];
+	peer_lo = ((u64)fip[3]) << 32 | fip[2];
+
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_ACT_OPEN_REQ6,
+						    qid_filterid));
+	req->local_port = cpu_to_be16(f->fs.val.lport);
+	req->peer_port = cpu_to_be16(f->fs.val.fport);
+	req->local_ip_hi = local_hi;
+	req->local_ip_lo = local_lo;
+	req->peer_ip_hi = peer_hi;
+	req->peer_ip_lo = peer_lo;
+	req->opt0 = cpu_to_be64(V_NAGLE(f->fs.newvlan == VLAN_REMOVE ||
+					f->fs.newvlan == VLAN_REWRITE) |
+				V_DELACK(f->fs.hitcnts) |
+				V_L2T_IDX(f->l2t ? f->l2t->idx : 0) |
+				V_SMAC_SEL(
+					(cxgbe_port_viid(f->dev) & 0x7F) << 1) |
+				V_TX_CHAN(f->fs.eport) |
+				V_NO_CONG(f->fs.rpttid) |
+				V_ULP_MODE(f->fs.nat_mode ?
+					   ULP_MODE_TCPDDP : ULP_MODE_NONE) |
+				F_TCAM_BYPASS | F_NON_OFFLOAD);
+
+	if (is_t5(adap->params.chip)) {
+		t5req->params = cpu_to_be64(
+				V_FILTER_TUPLE(hash_filter_ntuple(f)));
+		t5req->opt2 = cpu_to_be32(F_RSS_QUEUE_VALID |
+					  V_RSS_QUEUE(f->fs.iq) |
+					  V_TX_QUEUE(f->fs.nat_mode) |
+					  V_WND_SCALE_EN(f->fs.nat_flag_chk) |
+					  V_RX_FC_DISABLE(
+						  f->fs.nat_seq_chk ? 1 : 0) |
+					  F_T5_OPT_2_VALID |
+					  F_RX_CHANNEL |
+					  V_SACK_EN(f->fs.swapmac) |
+					  V_CONG_CNTRL(
+						(f->fs.action == FILTER_DROP) |
+						(f->fs.dirsteer << 1)) |
+					  V_PACE((f->fs.maskhash) |
+						 (f->fs.dirsteerhash << 1)) |
+					  V_CCTRL_ECN(
+						f->fs.action == FILTER_SWITCH));
+	}
+}
+
+/**
+ * Build an ACT_OPEN_REQ message for setting an IPv4 hash filter.
+ */
+static void mk_act_open_req(struct filter_entry *f, struct rte_mbuf *mbuf,
+			    unsigned int qid_filterid, struct adapter *adap)
+{
+	struct cpl_act_open_req *req = NULL;
+	struct cpl_t5_act_open_req *t5req = NULL;
+
+	switch (CHELSIO_CHIP_VERSION(adap->params.chip)) {
+	case CHELSIO_T5:
+		t5req = rte_pktmbuf_mtod(mbuf, struct cpl_t5_act_open_req *);
+
+		INIT_TP_WR(t5req, 0);
+		req = (struct cpl_act_open_req *)t5req;
+		break;
+	default:
+		dev_err(adap, "%s: unsupported chip type!\n", __func__);
+		return;
+	}
+
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_ACT_OPEN_REQ,
+						    qid_filterid));
+	req->local_port = cpu_to_be16(f->fs.val.lport);
+	req->peer_port = cpu_to_be16(f->fs.val.fport);
+	req->local_ip = f->fs.val.lip[0] | f->fs.val.lip[1] << 8 |
+			f->fs.val.lip[2] << 16 | f->fs.val.lip[3] << 24;
+	req->peer_ip = f->fs.val.fip[0] | f->fs.val.fip[1] << 8 |
+			f->fs.val.fip[2] << 16 | f->fs.val.fip[3] << 24;
+	req->opt0 = cpu_to_be64(V_NAGLE(f->fs.newvlan == VLAN_REMOVE ||
+					f->fs.newvlan == VLAN_REWRITE) |
+				V_DELACK(f->fs.hitcnts) |
+				V_L2T_IDX(f->l2t ? f->l2t->idx : 0) |
+				V_SMAC_SEL(
+					(cxgbe_port_viid(f->dev) & 0x7F) << 1) |
+				V_TX_CHAN(f->fs.eport) |
+				V_NO_CONG(f->fs.rpttid) |
+				V_ULP_MODE(f->fs.nat_mode ?
+					   ULP_MODE_TCPDDP : ULP_MODE_NONE) |
+				F_TCAM_BYPASS | F_NON_OFFLOAD);
+
+	if (is_t5(adap->params.chip)) {
+		t5req->params = cpu_to_be64(
+				V_FILTER_TUPLE(hash_filter_ntuple(f)));
+		t5req->opt2 = cpu_to_be32(F_RSS_QUEUE_VALID |
+					  V_RSS_QUEUE(f->fs.iq) |
+					  V_TX_QUEUE(f->fs.nat_mode) |
+					  V_WND_SCALE_EN(f->fs.nat_flag_chk) |
+					  V_RX_FC_DISABLE(
+						  f->fs.nat_seq_chk ? 1 : 0) |
+					  F_T5_OPT_2_VALID |
+					  F_RX_CHANNEL |
+					  V_SACK_EN(f->fs.swapmac) |
+					  V_CONG_CNTRL(
+						(f->fs.action == FILTER_DROP) |
+						(f->fs.dirsteer << 1)) |
+					  V_PACE((f->fs.maskhash) |
+						 (f->fs.dirsteerhash << 1)) |
+					  V_CCTRL_ECN(
+						f->fs.action == FILTER_SWITCH));
+	}
+}
+
+/**
+ * Set the specified hash filter.
+ */
+static int cxgbe_set_hash_filter(struct rte_eth_dev *dev,
+				 struct ch_filter_specification *fs,
+				 struct filter_ctx *ctx)
+{
+	struct port_info *pi = ethdev2pinfo(dev);
+	struct adapter *adapter = pi->adapter;
+	struct tid_info *t = &adapter->tids;
+	struct filter_entry *f;
+	struct rte_mbuf *mbuf;
+	struct sge_ctrl_txq *ctrlq;
+	unsigned int iq;
+	int atid, size;
+	int ret = 0;
+
+	ret = validate_filter(adapter, fs);
+	if (ret)
+		return ret;
+
+	iq = get_filter_steerq(dev, fs);
+
+	ctrlq = &adapter->sge.ctrlq[pi->port_id];
+
+	f = t4_os_alloc(sizeof(*f));
+	if (!f)
+		return -ENOMEM;
+
+	f->fs = *fs;
+	f->ctx = ctx;
+	f->dev = dev;
+	f->fs.iq = iq;
+
+	/*
+	 * If the new filter requires loopback Destination MAC and/or VLAN
+	 * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
+	 * the filter.
+	 */
+	if (f->fs.newdmac ||
+	    ((f->fs.newvlan == VLAN_INSERT) ||
+	     (f->fs.newvlan == VLAN_REWRITE))) {
+		/* allocate L2T entry for new filter */
+		f->l2t = cxgbe_l2t_alloc_switching(dev, f->fs.vlan,
+						   f->fs.eport, f->fs.dmac);
+		if (!f->l2t) {
+			ret = -ENOMEM;
+			goto out_err;
+		}
+	}
+
+	/*
+	 * If the new filter requires loopback Source MAC rewriting then
+	 * we need to allocate a SMT entry for the filter.
+	 */
+	if (f->fs.newsmac) {
+		f->smt = cxgbe_smt_alloc_switching(dev, f->fs.smac);
+		if (!f->smt) {
+			ret = -EAGAIN;
+			goto free_l2t;
+		}
+		f->smtidx = f->smt->idx;
+	}
+
+	atid = cxgbe_alloc_atid(t, f);
+	if (atid < 0)
+		goto free_smt;
+
+	if (f->fs.type) {
+		/* IPv6 hash filter */
+		f->clipt = cxgbe_clip_alloc(f->dev, (u32 *)&f->fs.val.lip);
+		if (!f->clipt)
+			goto free_atid;
+
+		size = sizeof(struct cpl_t5_act_open_req6);
+		mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+		if (!mbuf) {
+			ret = -ENOMEM;
+			goto free_clip;
+		}
+
+		mbuf->data_len = size;
+		mbuf->pkt_len = mbuf->data_len;
+
+		mk_act_open_req6(f, mbuf,
+				 ((adapter->sge.fw_evtq.abs_id << 14) | atid),
+				 adapter);
+	} else {
+		/* IPv4 hash filter */
+		size = sizeof(struct cpl_t5_act_open_req);
+		mbuf = rte_pktmbuf_alloc(ctrlq->mb_pool);
+		if (!mbuf) {
+			ret = -ENOMEM;
+			goto free_atid;
+		}
+
+		mbuf->data_len = size;
+		mbuf->pkt_len = mbuf->data_len;
+
+		mk_act_open_req(f, mbuf,
+				((adapter->sge.fw_evtq.abs_id << 14) | atid),
+				adapter);
+	}
+
+	f->pending = 1;
+	t4_mgmt_tx(ctrlq, mbuf);
+	return 0;
+
+free_clip:
+	cxgbe_clip_release(f->dev, f->clipt);
+
+free_atid:
+	cxgbe_free_atid(t, atid);
+
+free_smt:
+	if (f->smt) {
+		cxgbe_smt_release(f->smt);
+		f->smt = NULL;
+	}
+
+free_l2t:
+	if (f->l2t) {
+		cxgbe_l2t_release(f->l2t);
+		f->l2t = NULL;
+	}
+
+out_err:
+	t4_os_free(f);
+	return ret;
+}
+
+/**
  * Clear a filter and release any of its resources that we own.  This also
  * clears the filter's "pending" status.
  */
@@ -229,6 +817,18 @@ void cxgbe_clear_all_filters(struct adapter *adapter)
 			if (f->valid || f->pending)
 				clear_filter(f);
 	}
+
+	if (is_hashfilter(adapter) && adapter->tids.tid_tab) {
+		struct filter_entry *f;
+
+		for (i = adapter->tids.hash_base; i <= adapter->tids.ntids;
+		     i++) {
+			f = (struct filter_entry *)adapter->tids.tid_tab[i];
+
+			if (f && (f->valid || f->pending))
+				t4_os_free(f);
+		}
+	}
 }
 
 /**
@@ -536,6 +1136,9 @@ int cxgbe_del_filter(struct rte_eth_dev *dev, unsigned int filter_id,
 	struct filter_entry *f;
 	int ret;
 
+	if (is_hashfilter(adapter) && fs->cap)
+		return cxgbe_del_hash_filter(dev, filter_id, ctx);
+
 	if (filter_id >= adapter->tids.nftids)
 		return -ERANGE;
 
@@ -591,6 +1194,9 @@ int cxgbe_set_filter(struct rte_eth_dev *dev, unsigned int filter_id,
 	struct filter_entry *f;
 	int ret;
 
+	if (is_hashfilter(adapter) && fs->cap)
+		return cxgbe_set_hash_filter(dev, fs, ctx);
+
 	if (filter_id >= adapter->tids.nftids)
 		return -ERANGE;
 
@@ -800,3 +1406,209 @@ void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
 			t4_complete(&ctx->completion);
 	}
 }
+
+/**
+ * Handle a Hash filter write reply.
+ */
+void hash_filter_rpl(struct adapter *adap, const struct cpl_act_open_rpl *rpl)
+{
+	struct tid_info *t = &adap->tids;
+	struct filter_entry *f;
+	struct filter_ctx *ctx = NULL;
+	unsigned int tid = GET_TID(rpl);
+	unsigned int ftid = G_TID_TID(
+			    G_AOPEN_ATID(be32_to_cpu(rpl->atid_status)));
+	unsigned int status  = G_AOPEN_STATUS(be32_to_cpu(rpl->atid_status));
+
+	f = lookup_atid(t, ftid);
+	if (!f) {
+		dev_warn(adap, "%s: could not find filter entry: %u\n",
+			 __func__, ftid);
+		return;
+	}
+
+	ctx = f->ctx;
+	f->ctx = NULL;
+
+	switch (status) {
+	case CPL_ERR_NONE: {
+		f->tid = tid;
+		f->pending = 0;  /* asynchronous setup completed */
+		f->valid = 1;
+
+		cxgbe_insert_tid(t, f, f->tid, 0);
+		cxgbe_free_atid(t, ftid);
+		if (ctx) {
+			ctx->tid = f->tid;
+			ctx->result = 0;
+		}
+
+		if (f->fs.hitcnts)
+			set_tcb_field(adap, tid,
+				      W_TCB_TIMESTAMP,
+				      V_TCB_TIMESTAMP(M_TCB_TIMESTAMP) |
+				      V_TCB_T_RTT_TS_RECENT_AGE(
+					      M_TCB_T_RTT_TS_RECENT_AGE),
+				      V_TCB_TIMESTAMP(0ULL) |
+				      V_TCB_T_RTT_TS_RECENT_AGE(0ULL),
+				      1);
+
+		if (f->fs.newdmac)
+			set_tcb_tflag(adap, tid, S_TF_CCTRL_ECE, 1, 1);
+
+		if ((f->fs.newvlan == VLAN_INSERT) ||
+		    (f->fs.newvlan == VLAN_REWRITE))
+			set_tcb_tflag(adap, tid, S_TF_CCTRL_RFR, 1, 1);
+
+		if (f->fs.newsmac) {
+			set_tcb_tflag(adap, tid, S_TF_CCTRL_CWR, 1, 1);
+			set_tcb_field(adap, tid, W_TCB_SMAC_SEL,
+				      V_TCB_SMAC_SEL(M_TCB_SMAC_SEL),
+				      V_TCB_SMAC_SEL(f->smtidx), 1);
+		}
+
+		if (f->fs.nat_mode) {
+			switch (f->fs.nat_mode) {
+			case NAT_MODE_DIP:
+				set_nat_params(adap, f, tid, true,
+					       false, false, false);
+				break;
+
+			case NAT_MODE_DIP_DP:
+				set_nat_params(adap, f, tid, true,
+					       false, true, false);
+				break;
+
+			case NAT_MODE_DIP_DP_SIP:
+				set_nat_params(adap, f, tid, true,
+					       true, true, false);
+				break;
+
+			case NAT_MODE_DIP_DP_SP:
+				set_nat_params(adap, f, tid, true,
+					       false, true, true);
+				break;
+
+			case NAT_MODE_SIP_SP:
+				set_nat_params(adap, f, tid, false,
+					       true, false, true);
+				break;
+
+			case NAT_MODE_DIP_SIP_SP:
+				set_nat_params(adap, f, tid, true,
+					       true, false, true);
+				break;
+
+			case NAT_MODE_ALL:
+				set_nat_params(adap, f, tid, true,
+					       true, true, true);
+				break;
+
+			default:
+				dev_err(adap, "%s: Invalid NAT mode: %d\n",
+					__func__, f->fs.nat_mode);
+
+				if (f->l2t)
+					cxgbe_l2t_release(f->l2t);
+
+				if (f->smt)
+					cxgbe_smt_release(f->smt);
+
+				t4_os_free(f);
+
+				if (ctx) {
+					ctx->result = -EINVAL;
+					t4_complete(&ctx->completion);
+				}
+				return;
+			}
+		}
+
+		if (f->fs.nat_seq_chk) {
+			set_tcb_field(adap, tid, W_TCB_RCV_NXT,
+				      V_TCB_RCV_NXT(M_TCB_RCV_NXT),
+				      V_TCB_RCV_NXT(f->fs.nat_seq_chk), 1);
+		}
+
+		if (is_t5(adap->params.chip)) {
+			if (f->fs.action == FILTER_DROP) {
+				/*
+				 * Set Migrating bit to 1, and
+				 * set Non-offload bit to 0 - to achieve
+				 * Drop action with Hash filters
+				 */
+				set_tcb_field(adap, tid,
+					      W_TCB_T_FLAGS,
+					      V_TF_NON_OFFLOAD(1) |
+					      V_TF_MIGRATING(1),
+					      V_TF_MIGRATING(1), 1);
+			}
+		}
+
+		break;
+	}
+	default:
+		dev_warn(adap, "%s: filter creation failed with status = %u\n",
+			 __func__, status);
+
+		if (ctx) {
+			if (status == CPL_ERR_TCAM_FULL)
+				ctx->result = -EAGAIN;
+			else
+				ctx->result = -EINVAL;
+		}
+
+		if (f->l2t)
+			cxgbe_l2t_release(f->l2t);
+
+		if (f->smt)
+			cxgbe_smt_release(f->smt);
+
+		cxgbe_free_atid(t, ftid);
+		t4_os_free(f);
+	}
+
+	if (ctx)
+		t4_complete(&ctx->completion);
+}
+
+/**
+ * Handle a Hash filter delete reply.
+ */
+void hash_del_filter_rpl(struct adapter *adap,
+			 const struct cpl_abort_rpl_rss *rpl)
+{
+	struct tid_info *t = &adap->tids;
+	struct filter_entry *f;
+	struct filter_ctx *ctx = NULL;
+	unsigned int tid = GET_TID(rpl);
+
+	f = lookup_tid(t, tid);
+	if (!f) {
+		dev_warn(adap, "%s: could not find filter entry: %u\n",
+			 __func__, tid);
+		return;
+	}
+
+	ctx = f->ctx;
+	f->ctx = NULL;
+
+	f->valid = 0;
+
+	if (f->clipt)
+		cxgbe_clip_release(f->dev, f->clipt);
+
+	if (f->l2t)
+		cxgbe_l2t_release(f->l2t);
+
+	if (f->smt)
+		cxgbe_smt_release(f->smt);
+
+	cxgbe_remove_tid(t, 0, tid, 0);
+	t4_os_free(f);
+
+	if (ctx) {
+		ctx->result = 0;
+		t4_complete(&ctx->completion);
+	}
+}
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index 96c15d2..cde74fc 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -235,6 +235,8 @@ struct filter_entry {
 	struct ch_filter_specification fs;
 };
 
+#define WORD_MASK       0xffffffff
+
 struct adapter;
 
 void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl);
@@ -249,5 +251,10 @@ int cxgbe_del_filter(struct rte_eth_dev *dev, unsigned int filter_id,
 		     struct ch_filter_specification *fs,
 		     struct filter_ctx *ctx);
 
+int init_hash_filter(struct adapter *adap);
+void hash_filter_rpl(struct adapter *adap, const struct cpl_act_open_rpl *rpl);
+void hash_del_filter_rpl(struct adapter *adap,
+			 const struct cpl_abort_rpl_rss *rpl);
+
 void cxgbe_clear_all_filters(struct adapter *adapter);
 #endif /* _CXGBE_FILTER_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index dfb6567..1f79ba3 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -123,6 +123,14 @@ static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
 		const struct cpl_set_tcb_rpl *p = (const void *)rsp;
 
 		filter_rpl(q->adapter, p);
+	} else if (opcode == CPL_ACT_OPEN_RPL) {
+		const struct cpl_act_open_rpl *p = (const void *)rsp;
+
+		hash_filter_rpl(q->adapter, p);
+	} else if (opcode == CPL_ABORT_RPL_RSS) {
+		const struct cpl_abort_rpl_rss *p = (const void *)rsp;
+
+		hash_del_filter_rpl(q->adapter, p);
 	} else if (opcode == CPL_SMT_WRITE_RPL) {
 		const struct cpl_smt_write_rpl *p = (const void *)rsp;
 
@@ -272,6 +280,110 @@ int cxgb4_set_rspq_intr_params(struct sge_rspq *q, unsigned int us,
 }
 
 /**
+ * Allocate an active-open TID and set it to the supplied value.
+ */
+int cxgbe_alloc_atid(struct tid_info *t, void *data)
+{
+	int atid = -1;
+
+	t4_os_lock(&t->atid_lock);
+	if (t->afree) {
+		union aopen_entry *p = t->afree;
+
+		atid = p - t->atid_tab;
+		t->afree = p->next;
+		p->data = data;
+		t->atids_in_use++;
+	}
+	t4_os_unlock(&t->atid_lock);
+	return atid;
+}
+
+/**
+ * Release an active-open TID.
+ */
+void cxgbe_free_atid(struct tid_info *t, unsigned int atid)
+{
+	union aopen_entry *p = &t->atid_tab[atid];
+
+	t4_os_lock(&t->atid_lock);
+	p->next = t->afree;
+	t->afree = p;
+	t->atids_in_use--;
+	t4_os_unlock(&t->atid_lock);
+}
+
+/**
+ * Populate a TID_RELEASE WR.  Caller must properly size the mbuf.
+ */
+static void mk_tid_release(struct rte_mbuf *mbuf, unsigned int tid)
+{
+	struct cpl_tid_release *req;
+
+	req = rte_pktmbuf_mtod(mbuf, struct cpl_tid_release *);
+	INIT_TP_WR_MIT_CPL(req, CPL_TID_RELEASE, tid);
+}
+
+/**
+ * Release a TID and inform HW.  If we are unable to allocate the release
+ * message, the TID is only released in software and HW is not informed.
+ */
+void cxgbe_remove_tid(struct tid_info *t, unsigned int chan, unsigned int tid,
+		      unsigned short family)
+{
+	struct rte_mbuf *mbuf;
+	struct adapter *adap = container_of(t, struct adapter, tids);
+
+	WARN_ON(tid >= t->ntids);
+
+	if (t->tid_tab[tid]) {
+		t->tid_tab[tid] = NULL;
+		rte_atomic32_dec(&t->conns_in_use);
+		if (t->hash_base && (tid >= t->hash_base)) {
+			if (family == FILTER_TYPE_IPV6)
+				rte_atomic32_sub(&t->hash_tids_in_use, 2);
+			else
+				rte_atomic32_dec(&t->hash_tids_in_use);
+		} else {
+			if (family == FILTER_TYPE_IPV6)
+				rte_atomic32_sub(&t->tids_in_use, 2);
+			else
+				rte_atomic32_dec(&t->tids_in_use);
+		}
+	}
+
+	mbuf = rte_pktmbuf_alloc((&adap->sge.ctrlq[chan])->mb_pool);
+	if (mbuf) {
+		mbuf->data_len = sizeof(struct cpl_tid_release);
+		mbuf->pkt_len = mbuf->data_len;
+		mk_tid_release(mbuf, tid);
+		t4_mgmt_tx(&adap->sge.ctrlq[chan], mbuf);
+	}
+}
+
+/**
+ * Insert a TID.
+ */
+void cxgbe_insert_tid(struct tid_info *t, void *data, unsigned int tid,
+		      unsigned short family)
+{
+	t->tid_tab[tid] = data;
+	if (t->hash_base && (tid >= t->hash_base)) {
+		if (family == FILTER_TYPE_IPV6)
+			rte_atomic32_add(&t->hash_tids_in_use, 2);
+		else
+			rte_atomic32_inc(&t->hash_tids_in_use);
+	} else {
+		if (family == FILTER_TYPE_IPV6)
+			rte_atomic32_add(&t->tids_in_use, 2);
+		else
+			rte_atomic32_inc(&t->tids_in_use);
+	}
+
+	rte_atomic32_inc(&t->conns_in_use);
+}
+
+/**
  * Free TID tables.
  */
 static void tid_free(struct tid_info *t)
@@ -675,8 +787,7 @@ static int adap_init0_config(struct adapter *adapter, int reset)
 	 * This will allow the firmware to optimize aspects of the hardware
 	 * configuration which will result in improved performance.
 	 */
-	caps_cmd.niccaps &= cpu_to_be16(~(FW_CAPS_CONFIG_NIC_HASHFILTER |
-					  FW_CAPS_CONFIG_NIC_ETHOFLD));
+	caps_cmd.niccaps &= cpu_to_be16(~FW_CAPS_CONFIG_NIC_ETHOFLD);
 	caps_cmd.toecaps = 0;
 	caps_cmd.iscsicaps = 0;
 	caps_cmd.rdmacaps = 0;
@@ -908,6 +1019,12 @@ static int adap_init0(struct adapter *adap)
 	if (ret < 0)
 		goto bye;
 
+	if ((caps_cmd.niccaps & cpu_to_be16(FW_CAPS_CONFIG_NIC_HASHFILTER)) &&
+	    is_t5(adap->params.chip)) {
+		if (init_hash_filter(adap) < 0)
+			goto bye;
+	}
+
 	/* query tid-related parameters */
 	params[0] = FW_PARAM_DEV(NTID);
 	ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 1,
@@ -1411,6 +1528,20 @@ allocate_mac:
 			 "filter support disabled. Continuing\n");
 	}
 
+	if (is_hashfilter(adapter)) {
+		if (t4_read_reg(adapter, A_LE_DB_CONFIG) & F_HASHEN) {
+			u32 hash_base, hash_reg;
+
+			hash_reg = A_LE_DB_TID_HASHBASE;
+			hash_base = t4_read_reg(adapter, hash_reg);
+			adapter->tids.hash_base = hash_base / 4;
+		}
+	} else {
+		/* Disable hash filtering support */
+		dev_warn(adapter,
+			 "Maskless filter support disabled. Continuing\n");
+	}
+
 	err = init_rss(adapter);
 	if (err)
 		goto out_free;
diff --git a/drivers/net/cxgbe/cxgbe_ofld.h b/drivers/net/cxgbe/cxgbe_ofld.h
index 115472e..0cddf8d 100644
--- a/drivers/net/cxgbe/cxgbe_ofld.h
+++ b/drivers/net/cxgbe/cxgbe_ofld.h
@@ -52,6 +52,14 @@
 	OPCODE_TID(w) = cpu_to_be32(MK_OPCODE_TID(cpl, tid)); \
 } while (0)
 
+#define INIT_ULPTX_WR(w, wrlen, atomic, tid) do { \
+	(w)->wr.wr_hi = cpu_to_be32(V_FW_WR_OP(FW_ULPTX_WR) | \
+				    V_FW_WR_ATOMIC(atomic)); \
+	(w)->wr.wr_mid = cpu_to_be32(V_FW_WR_LEN16(DIV_ROUND_UP(wrlen, 16)) | \
+				     V_FW_WR_FLOWID(tid)); \
+	(w)->wr.wr_lo = cpu_to_be64(0); \
+} while (0)
+
 /*
  * Max # of ATIDs.  The absolute HW max is 16K but we keep it lower.
  */
@@ -97,4 +105,22 @@ struct tid_info {
 	rte_atomic32_t conns_in_use;
 	rte_spinlock_t ftid_lock;
 };
+
+static inline void *lookup_tid(const struct tid_info *t, unsigned int tid)
+{
+	return tid < t->ntids ? t->tid_tab[tid] : NULL;
+}
+
+static inline void *lookup_atid(const struct tid_info *t, unsigned int atid)
+{
+	return atid < t->natids ? t->atid_tab[atid].data : NULL;
+}
+
+int cxgbe_alloc_atid(struct tid_info *t, void *data);
+void cxgbe_free_atid(struct tid_info *t, unsigned int atid);
+
+void cxgbe_remove_tid(struct tid_info *t, unsigned int qid, unsigned int tid,
+		      unsigned short family);
+void cxgbe_insert_tid(struct tid_info *t, void *data, unsigned int tid,
+		      unsigned short family);
 #endif /* _CXGBE_OFLD_H_ */
-- 
2.5.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/10] cxgbe: add flow director support and update documentation
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (8 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 09/10] cxgbe: add HASH " Rahul Lakkireddy
@ 2016-02-03  8:32 ` Rahul Lakkireddy
  2016-02-22 10:39 ` [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
  2016-03-22 13:43 ` Bruce Richardson
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-03  8:32 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Add flow director support for setting/deleting LE-TCAM (maskfull)
and HASH (maskless) filters.
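
As a rough illustration of the maskfull/maskless distinction (this is not
code from the series), here is a minimal sketch in C.  It assumes a
hypothetical single-field rule, struct demo_rule, holding only a
destination IPv4 address and its mask; all demo_* names are invented for
the example and do not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified rule: one destination IPv4 address and mask. */
struct demo_rule {
	uint32_t dst_ip;   /* value to match, host byte order */
	uint32_t dst_mask; /* bits that must match (maskfull rules only) */
};

/* Maskfull (LE-TCAM style): compare only the bits selected by the mask. */
static bool demo_match_maskfull(const struct demo_rule *r, uint32_t dst_ip)
{
	return (dst_ip & r->dst_mask) == (r->dst_ip & r->dst_mask);
}

/* Maskless (HASH style): the mask is ignored, every bit must match. */
static bool demo_match_maskless(const struct demo_rule *r, uint32_t dst_ip)
{
	return dst_ip == r->dst_ip;
}
```

With a rule for 102.1.2.0 and mask 255.255.255.0, a packet for 102.1.2.7
matches under the maskfull comparison but not under the maskless one,
which only accepts 102.1.2.0 itself.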

Wait and poll firmware event queue for replies about the filter status.
Also, the firmware event queue doesn't have any freelists.  So, there is
no need to refill any.
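
The wait-and-poll pattern can be sketched roughly as below.  This is only
an assumption-laden outline, not the body of cxgbe_poll_for_completion()
declared by this patch: struct demo_completion, demo_poll_fn and
demo_poll_for_completion() are hypothetical stand-ins, and the caller
supplies the function that drains the reply queue.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's completion object. */
struct demo_completion {
	bool done;  /* set by the reply handler once the reply is seen */
	int result; /* status reported by the reply handler */
};

/* Hypothetical callback that processes up to 'budget' pending replies. */
typedef void (*demo_poll_fn)(void *queue, unsigned int budget);

/*
 * Poll the queue up to 'max_tries' times.  Return the reply status once
 * the completion is signalled, or -ETIMEDOUT if no reply arrives.
 */
static int demo_poll_for_completion(void *queue, demo_poll_fn poll,
				    unsigned int max_tries,
				    struct demo_completion *c)
{
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		poll(queue, 1); /* the reply handler may mark 'c' done */
		if (c->done)
			return c->result;
		/* A real driver would delay here between attempts. */
	}
	return -ETIMEDOUT;
}

/* Fake queue for illustration: the reply "arrives" on the third poll. */
struct demo_queue {
	unsigned int polls;
	struct demo_completion *c;
};

static void demo_fake_poll(void *queue, unsigned int budget)
{
	struct demo_queue *q = queue;

	(void)budget;
	if (++q->polls == 3) {
		q->c->result = 0;
		q->c->done = true;
	}
}
```

Bounding the number of attempts keeps a lost reply from hanging the
caller; whether a timeout should be treated as a failed filter operation
is a policy choice left to the caller.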

Provide stats showing the number of remaining free entries, and
number of successful and failed added/deleted filters.

Add documentation explaining the usage of CXGBE Flow Director support.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
---
 doc/guides/nics/cxgbe.rst            | 166 ++++++++
 doc/guides/rel_notes/release_2_3.rst |   7 +
 drivers/net/cxgbe/Makefile           |   1 +
 drivers/net/cxgbe/base/adapter.h     |   2 +
 drivers/net/cxgbe/cxgbe.h            |   3 +
 drivers/net/cxgbe/cxgbe_ethdev.c     |  18 +
 drivers/net/cxgbe/cxgbe_fdir.c       | 715 +++++++++++++++++++++++++++++++++++
 drivers/net/cxgbe/cxgbe_fdir.h       | 108 ++++++
 drivers/net/cxgbe/cxgbe_main.c       |  40 ++
 drivers/net/cxgbe/sge.c              |   3 +-
 10 files changed, 1062 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/cxgbe/cxgbe_fdir.c
 create mode 100644 drivers/net/cxgbe/cxgbe_fdir.h

diff --git a/doc/guides/nics/cxgbe.rst b/doc/guides/nics/cxgbe.rst
index d718f19..d2a0d74 100644
--- a/doc/guides/nics/cxgbe.rst
+++ b/doc/guides/nics/cxgbe.rst
@@ -51,6 +51,7 @@ CXGBE PMD has support for:
 - All multicast mode
 - Port hardware statistics
 - Jumbo frames
+- Packet classification and filtering
 
 Limitations
 -----------
@@ -187,6 +188,13 @@ Unified Wire package for Linux operating system are as follows:
 
       cxgbtool p1p1 loadcfg <path_to_uwire>/src/network/firmware/t5-config.txt
 
+   .. note::
+
+      To enable HASH filters, a special firmware configuration file is needed.
+      The file is located under the following directory:
+
+      <path_to_uwire>/src/network/firmware/hash_filter_config/t5-config.txt
+
 #. Use cxgbtool to load the firmware image onto the card:
 
    .. code-block:: console
@@ -541,6 +549,66 @@ devices managed by librte_pmd_cxgbe in FreeBSD operating system.
    Flow control pause TX/RX is disabled by default and can be enabled via
    testpmd. Refer section :ref:`flow-control` for more details.
 
+
+.. _filtering:
+
+Packet Classification and Filtering
+-----------------------------------
+
+Chelsio T5 NICs support packet classification and filtering in hardware.
+This feature can be used in the ingress path to:
+
+- Steer ingress packets that meet ACL (Access Control List) accept criteria
+  to a particular receive queue.
+
+- Switch (proxy) ingress packets that meet ACL accept criteria to an output
+  port, with optional header rewrite.
+
+- Drop ingress packets that fail ACL accept criteria.
+
+There are two types of filters that can be set, namely LE-TCAM (Maskfull)
+filters and HASH (Maskless) filters.  LE-TCAM filters accept a mask for each
+field of the accept criteria, so a single rule can match a range of values,
+whereas HASH filters ignore masks and hence enforce stricter, exact-match
+accept criteria.
+
+By default, only LE-TCAM filter rules can be created.  Creating HASH filters
+requires a special firmware configuration file.  Instructions on how to
+manually flash the firmware configuration file are given in section
+:ref:`linux-installation`.
+
+The fields that can be specified for the accept criteria are based on the
+filter selection combination set in the firmware configuration
+(t5-config.txt) file flashed in section :ref:`linux-installation`.
+
+By default, the selection combination automatically includes the source and
+destination IPv4/IPv6 addresses and the source and destination layer 4
+ports.  In addition to the above, more fields can be added by
+modifying the t5-config.txt firmware configuration file.
+
+For example, consider the following combination that has been set in
+t5-config.txt:
+
+.. code-block:: console
+
+   filterMode = ethertype, protocol, tos, vlan, port
+   filterMask = ethertype, protocol, tos, vlan, port
+
+In the above example, in addition to the source/destination IPv4/IPv6
+addresses and the layer 4 source/destination ports, a packet can also
+be matched against the ethertype field in the Ethernet header, the protocol
+and TOS fields in the IP header, the inner VLAN tag, and the physical
+ingress port number, respectively.
+
+You can create 496 LE-TCAM filters and ~0.5 million HASH filter rules.
+For more information, please visit `Chelsio Communications Official Website
+<http://www.chelsio.com>`_.
+
+To test packet classification and filtering on a Chelsio NIC, an
+example app is provided in **examples/test-cxgbe-filters/** directory.
+Please see :doc:`Test CXGBE Filters Application Guide
+</sample_app_ug/test_cxgbe_filters>` to compile and run the example app.
+
 Sample Application Notes
 ------------------------
 
@@ -587,3 +655,101 @@ to configure the mtu of all the ports with a single command.
 
      testpmd> port stop all
      testpmd> port config all max-pkt-len 9000
+
+Add/Delete Filters
+~~~~~~~~~~~~~~~~~~
+
+To test packet classification and filtering on a Chelsio NIC, an
+example app is provided in **examples/test-cxgbe-filters/** directory.
+Please see :doc:`Test CXGBE Filters Application Guide
+</sample_app_ug/test_cxgbe_filters>` to compile and run the app.
+The examples below have to be run on the **test_cxgbe_filters** app.
+
+The command line to add/delete filters is given below. Note that the
+command is too long to fit on one line and hence is shown wrapped
+at "\\" for display purposes.  At the actual prompt, the command must
+be entered on a single line without the "\\".
+
+  .. code-block:: console
+
+     cxgbe> filter (port_id) (add|del) (ipv4|ipv6) \
+            mode (maskfull|maskless) (no-prio|prio) \
+            ingress-port (iport) (iport_mask) \
+            ether (ether_type) (ether_type_mask) \
+            vlan (inner_vlan) (inner_vlan_mask) \
+            (outer_vlan) (outer_vlan_mask) \
+            ip (tos) (tos_mask) (proto) (proto_mask) \
+            (src_ip_address) (src_ip_mask) \
+            (dst_ip_address) (dst_ip_mask) \
+            (src_port) (src_port_mask) (dst_port) (dst_port_mask) \
+            (drop|fwd|switch) queue (queue_id) \
+            (port-none|port-redirect) (egress_port) \
+            (ether-none|mac-rewrite|mac-swap) (src_mac) (dst_mac) \
+            (vlan-none|vlan-rewrite|vlan-delete) (new_vlan) \
+            (nat-none|nat-rewrite) (nat_src_ip) (nat_dst_ip) \
+            (nat_src_port) (nat_dst_port) \
+            fd_id (fd_id_value)
+
+LE-TCAM (Maskfull) Filters
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- To drop all traffic destined for the 102.1.2.0/24 network, add a maskfull
+  filter as follows:
+
+  .. code-block:: console
+
+     cxgbe> filter 0 add ipv4 mode maskfull \
+            no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+            ip 0 0 0 0 0.0.0.0 0.0.0.0 102.1.2.0 255.255.255.0 0 0 0 0 \
+            drop queue 0 port-none 0 \
+            ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+            vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+            fd_id 0
+
+- To switch all traffic destined for the 102.1.2.0/24 network out via port 1
+  with source and destination MAC addresses rewritten, add a maskfull
+  filter as follows:
+
+  .. code-block:: console
+
+     cxgbe> filter 0 add ipv4 mode maskfull \
+            no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+            ip 0 0 0 0 0.0.0.0 0.0.0.0 102.1.2.0 255.255.255.0 0 0 0 0 \
+            switch queue 0 port-redirect 1 \
+            mac-rewrite 00:07:43:04:96:48 00:07:43:12:D4:88 \
+            vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+            fd_id 0
+
+HASH (Maskless) Filters
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Maskless filters require a special firmware configuration file. Please see
+section :ref:`filtering` for more information.
+
+- To steer all traffic coming for 102.1.2.2 with destination port 12865
+  from 102.1.2.1 with source port 12000 to port 1's rx queue, add a
+  maskless filter as follows:
+
+  .. code-block:: console
+
+     cxgbe> filter 1 add ipv4 mode maskless \
+            no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+            ip 0 0 0 0 102.1.2.1 0.0.0.0 102.1.2.2 0.0.0.0 12000 0 12865 0 \
+            fwd queue 0 port-none 0 \
+            ether-none 00:00:00:00:00:00 00:00:00:00:00:00 \
+            vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+            fd_id 0
+
+- To swap the source and destination MAC addresses of all traffic destined
+  for 102.1.2.2 with destination port 12865 from 102.1.2.1 with source
+  port 12000, add a maskless filter as follows:
+
+  .. code-block:: console
+
+     cxgbe> filter 0 add ipv4 mode maskless \
+            no-prio ingress-port 0 0 ether 0 0 vlan 0 0 0 0 \
+            ip 0 0 0 0 102.1.2.1 0.0.0.0 102.1.2.2 0.0.0.0 12000 0 12865 0 \
+            switch queue 0 port-redirect 1 \
+            mac-swap 00:00:00:00:00:00 00:00:00:00:00:00 \
+            vlan-none 0 nat-none 0.0.0.0 0.0.0.0 0 0 \
+            fd_id 0
diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 19ce954..2953f52 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,13 @@ DPDK Release 2.3
 New Features
 ------------
 
+* **Added flow director support for Chelsio CXGBE driver.**
+
+  * Added flow director support to enable Chelsio T5 NIC hardware filtering
+    features.
+  * Added an example app under ``examples/test-cxgbe-filters`` directory
+    to test Chelsio T5 NIC hardware filtering features.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 3201aff..0d52cb1 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -82,6 +82,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += smt.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_filter.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_fdir.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index a64571d..a03048d 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -342,6 +342,8 @@ struct adapter {
 	struct l2t_data *l2t;     /* Layer 2 table */
 	struct smt_data *smt;     /* Source MAC table */
 	struct tid_info tids;     /* Info used to access TID related tables */
+
+	struct cxgbe_fdir_map *fdir;  /* Flow Director */
 };
 
 #define CXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
diff --git a/drivers/net/cxgbe/cxgbe.h b/drivers/net/cxgbe/cxgbe.h
index 9ca4388..f984103 100644
--- a/drivers/net/cxgbe/cxgbe.h
+++ b/drivers/net/cxgbe/cxgbe.h
@@ -36,6 +36,7 @@
 
 #include "common.h"
 #include "t4_regs.h"
+#include "cxgbe_fdir.h"
 
 #define CXGBE_MIN_RING_DESC_SIZE      128  /* Min TX/RX descriptor ring size */
 #define CXGBE_MAX_RING_DESC_SIZE      4096 /* Max TX/RX descriptor ring size */
@@ -52,6 +53,8 @@ int cxgbe_down(struct port_info *pi);
 void cxgbe_close(struct adapter *adapter);
 void cxgbe_stats_get(struct port_info *pi, struct port_stats *stats);
 void cxgbe_stats_reset(struct port_info *pi);
+int cxgbe_poll_for_completion(struct sge_rspq *q, unsigned int us,
+			      unsigned int cnt, struct t4_completion *c);
 int link_start(struct port_info *pi);
 void init_rspq(struct adapter *adap, struct sge_rspq *q, unsigned int us,
 	       unsigned int cnt, unsigned int size, unsigned int iqe_size);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 2701bb6..7016026 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -770,6 +770,23 @@ static int cxgbe_flow_ctrl_set(struct rte_eth_dev *eth_dev,
 			     &pi->link_cfg);
 }
 
+static int cxgbe_dev_filter_ctrl(struct rte_eth_dev *dev,
+				 enum rte_filter_type filter_type,
+				 enum rte_filter_op filter_op, void *arg)
+{
+	int ret;
+
+	switch (filter_type) {
+	case RTE_ETH_FILTER_FDIR:
+		ret = cxgbe_fdir_ctrl_func(dev, filter_op, arg);
+		break;
+	default:
+		ret = -ENOTSUP;
+		break;
+	}
+	return ret;
+}
+
 static struct eth_dev_ops cxgbe_eth_dev_ops = {
 	.dev_start		= cxgbe_dev_start,
 	.dev_stop		= cxgbe_dev_stop,
@@ -794,6 +811,7 @@ static struct eth_dev_ops cxgbe_eth_dev_ops = {
 	.stats_reset		= cxgbe_dev_stats_reset,
 	.flow_ctrl_get		= cxgbe_flow_ctrl_get,
 	.flow_ctrl_set		= cxgbe_flow_ctrl_set,
+	.filter_ctrl            = cxgbe_dev_filter_ctrl,
 };
 
 /*
diff --git a/drivers/net/cxgbe/cxgbe_fdir.c b/drivers/net/cxgbe/cxgbe_fdir.c
new file mode 100644
index 0000000..1c15e34
--- /dev/null
+++ b/drivers/net/cxgbe/cxgbe_fdir.c
@@ -0,0 +1,715 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+
+#include "cxgbe.h"
+#include "common.h"
+#include "cxgbe_filter.h"
+#include "smt.h"
+#include "clip_tbl.h"
+#include "cxgbe_fdir.h"
+
+/**
+ * Check if the specified entry is set or not
+ */
+static bool is_fdir_map_set(struct cxgbe_fdir_map *map,
+			    unsigned int cap,
+			    unsigned int idx)
+{
+	bool result = FALSE;
+
+	t4_os_lock(&map->lock);
+	if (!cap) {
+		if (rte_bitmap_get(map->mfull_bmap, idx))
+			result = TRUE;
+	} else {
+		if (rte_bitmap_get(map->mless_bmap, idx))
+			result = TRUE;
+	}
+	t4_os_unlock(&map->lock);
+
+	return result;
+}
+
+/**
+ * Set/Clear bitmap entry
+ */
+static int cxgbe_fdir_add_del_map_entry(struct cxgbe_fdir_map *map,
+					unsigned int cap, unsigned int idx,
+					bool del)
+{
+	struct cxgbe_fdir_map_entry *e;
+
+	if (cap && !map->maskless_size)
+		return 0;
+
+	if (cap && idx >= map->maskless_size)
+		return -ERANGE;
+
+	if (!cap && idx >= map->maskfull_size)
+		return -ERANGE;
+
+	t4_os_lock(&map->lock);
+	if (!cap) {
+		e = &map->mfull_entry[idx];
+		/*
+		 * IPv6 maskfull filters occupy 4 slots and IPv4 maskfull
+		 * filters take up one slot. Set the map accordingly.
+		 */
+		if (e->fs.type == FILTER_TYPE_IPV4) {
+			if (del)
+				rte_bitmap_clear(map->mfull_bmap, idx);
+			else
+				rte_bitmap_set(map->mfull_bmap, idx);
+		} else {
+			if (del) {
+				rte_bitmap_clear(map->mfull_bmap, idx);
+				rte_bitmap_clear(map->mfull_bmap, idx + 1);
+				rte_bitmap_clear(map->mfull_bmap, idx + 2);
+				rte_bitmap_clear(map->mfull_bmap, idx + 3);
+			} else {
+				rte_bitmap_set(map->mfull_bmap, idx);
+				rte_bitmap_set(map->mfull_bmap, idx + 1);
+				rte_bitmap_set(map->mfull_bmap, idx + 2);
+				rte_bitmap_set(map->mfull_bmap, idx + 3);
+			}
+		}
+	} else {
+		e = &map->mless_entry[idx];
+		/*
+		 * Maskless filters take up only one slot for both
+		 * IPv4 and IPv6
+		 */
+		if (del)
+			rte_bitmap_clear(map->mless_bmap, idx);
+		else
+			rte_bitmap_set(map->mless_bmap, idx);
+	}
+	t4_os_unlock(&map->lock);
+
+	return 0;
+}
+
+/**
+ * Fill up default masks
+ */
+static void fill_ch_spec_def_mask(struct ch_filter_specification *fs)
+{
+	unsigned int i;
+	unsigned int lip = 0, lip_mask = 0;
+	unsigned int fip = 0, fip_mask = 0;
+	unsigned int cap = fs->cap;
+
+	if (fs->val.iport && (!fs->mask.iport || cap))
+		fs->mask.iport |= ~0;
+	if (fs->val.ethtype && (!fs->mask.ethtype || cap))
+		fs->mask.ethtype |= ~0;
+	if (fs->val.ivlan && (!fs->mask.ivlan || cap))
+		fs->mask.ivlan |= ~0;
+	if (fs->val.ovlan && (!fs->mask.ovlan || cap))
+		fs->mask.ovlan |= ~0;
+	if (fs->val.tos && (!fs->mask.tos || cap))
+		fs->mask.tos |= ~0;
+	if (fs->val.proto && (!fs->mask.proto || cap))
+		fs->mask.proto |= ~0;
+
+	for (i = 0; i < ARRAY_SIZE(fs->val.lip); i++) {
+		lip |= fs->val.lip[i];
+		lip_mask |= fs->mask.lip[i];
+		fip |= fs->val.fip[i];
+		fip_mask |= fs->mask.fip[i];
+	}
+
+	if (lip && (!lip_mask || cap))
+		memset(fs->mask.lip, ~0, sizeof(fs->mask.lip));
+
+	if (fip && (!fip_mask || cap))
+		memset(fs->mask.fip, ~0, sizeof(fs->mask.fip));
+
+	if (fs->val.lport && (!fs->mask.lport || cap))
+		fs->mask.lport = ~0;
+	if (fs->val.fport && (!fs->mask.fport || cap))
+		fs->mask.fport = ~0;
+}
+
+/**
+ * Translate match fields to Chelsio Filter Specification
+ */
+static int fill_ch_spec_match(const struct rte_eth_fdir_filter *fdir_filter,
+			      struct ch_filter_specification *fs)
+{
+	struct cxgbe_fdir_input_admin admin;
+	struct cxgbe_fdir_input_flow val;
+	struct cxgbe_fdir_input_flow mask;
+	const uint8_t *raw_pkt_admin, *raw_pkt_match, *raw_pkt_mask;
+
+	raw_pkt_admin = &fdir_filter->input.flow.raw_pkt_flow[0];
+	raw_pkt_match = raw_pkt_admin + sizeof(admin);
+	raw_pkt_mask = &fdir_filter->input.flow_mask.raw_pkt_flow[0];
+
+	/* Match arguments without masks */
+	rte_memcpy(&admin, raw_pkt_admin, sizeof(admin));
+	fs->prio = admin.prio ? 1 : 0;
+	fs->type = admin.type ? 1 : 0;
+	fs->cap  = admin.cap ? 1 : 0;
+
+	/* Match arguments with masks */
+	rte_memcpy(&val, raw_pkt_match, sizeof(val));
+	fs->val.ethtype = be16_to_cpu(val.ethtype);
+	if (val.iport > 7)
+		return -ERANGE;
+	fs->val.iport   = val.iport;
+
+	fs->val.ivlan   = be16_to_cpu(val.ivlan);
+	fs->val.ovlan   = be16_to_cpu(val.ovlan);
+
+	fs->val.proto   = val.proto;
+	fs->val.tos     = val.tos;
+	rte_memcpy(&fs->val.lip[0], &val.lip[0], sizeof(val.lip));
+	rte_memcpy(&fs->val.fip[0], &val.fip[0], sizeof(val.fip));
+
+	fs->val.lport   = be16_to_cpu(val.lport);
+	fs->val.fport   = be16_to_cpu(val.fport);
+
+	/* Masks for matched arguments */
+	rte_memcpy(&mask, raw_pkt_mask, sizeof(mask));
+	fs->mask.ethtype = be16_to_cpu(mask.ethtype);
+	if (mask.iport > 7)
+		return -ERANGE;
+	fs->mask.iport   = mask.iport;
+
+	fs->mask.ivlan   = be16_to_cpu(mask.ivlan);
+	fs->mask.ovlan   = be16_to_cpu(mask.ovlan);
+
+	fs->mask.proto   = mask.proto;
+	fs->mask.tos     = mask.tos;
+	rte_memcpy(&fs->mask.lip[0], &mask.lip[0], sizeof(mask.lip));
+	rte_memcpy(&fs->mask.fip[0], &mask.fip[0], sizeof(mask.fip));
+
+	fs->mask.lport   = be16_to_cpu(mask.lport);
+	fs->mask.fport   = be16_to_cpu(mask.fport);
+
+	/* Fill up matched field masks with defaults if not specified */
+	fill_ch_spec_def_mask(fs);
+
+	if (fs->val.ivlan) {
+		fs->val.ivlan_vld = 1;
+		fs->mask.ivlan_vld = 1;
+	}
+
+	if (fs->val.ovlan) {
+		fs->val.ovlan_vld = 1;
+		fs->mask.ovlan_vld = 1;
+	}
+
+	/* Disable filter hit counting for Maskless filters */
+	if (fs->cap)
+		fs->hitcnts = 0;
+	else
+		fs->hitcnts = 1;
+
+	return 0;
+}
+
+/**
+ * Translate action fields to Chelsio Filter Specification
+ */
+static int fill_ch_spec_action(struct rte_eth_dev *dev,
+			       const struct rte_eth_fdir_filter *fdir_filter,
+			       struct ch_filter_specification *fs)
+{
+	struct port_info *pi = ethdev2pinfo(dev);
+	struct cxgbe_fdir_action action;
+	int err = 0;
+	unsigned int drop_queue = dev->data->dev_conf.fdir_conf.drop_queue;
+	const uint8_t *action_arg;
+
+	if (fdir_filter->action.rx_queue > MAX_ETH_QSETS) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Action Arguments */
+	switch (fdir_filter->action.behavior) {
+	case RTE_ETH_FDIR_ACCEPT:
+		if (fdir_filter->action.rx_queue < pi->n_rx_qsets) {
+			fs->dirsteer = 1;
+			fs->iq = fdir_filter->action.rx_queue;
+		}
+		fs->action = FILTER_PASS;
+		break;
+	case RTE_ETH_FDIR_REJECT:
+		if (fdir_filter->action.rx_queue == drop_queue) {
+			if (drop_queue < pi->n_rx_qsets) {
+				/* Send to drop queue */
+				fs->dirsteer = 1;
+				fs->iq = drop_queue;
+				fs->action = FILTER_PASS;
+				err = 0;
+				goto out;
+			}
+		}
+		/* Drop in hardware */
+		fs->action = FILTER_DROP;
+		break;
+	case RTE_ETH_FDIR_SWITCH:
+		action_arg = &fdir_filter->action.behavior_arg[0];
+
+		rte_memcpy(&action, action_arg, sizeof(action));
+		if (action.eport > 4) {
+			err = -ERANGE;
+			break;
+		}
+		fs->eport = action.eport;
+
+		fs->newdmac = action.newdmac;
+		fs->newsmac = action.newsmac;
+		fs->swapmac = action.swapmac;
+		rte_memcpy(&fs->dmac[0], &action.dmac[0], ETHER_ADDR_LEN);
+		rte_memcpy(&fs->smac[0], &action.smac[0], ETHER_ADDR_LEN);
+
+		if (action.newvlan > VLAN_REWRITE) {
+			err = -ERANGE;
+			break;
+		}
+		fs->newvlan = action.newvlan;
+		fs->vlan = be16_to_cpu(action.vlan);
+
+		if (action.nat_mode && action.nat_mode != NAT_MODE_ALL) {
+			err = -ENOTSUP;
+			break;
+		}
+		fs->nat_mode = action.nat_mode;
+		rte_memcpy(&fs->nat_lip[0], &action.nat_lip[0],
+			   sizeof(action.nat_lip));
+		rte_memcpy(&fs->nat_fip[0], &action.nat_fip[0],
+			   sizeof(action.nat_fip));
+		fs->nat_lport = be16_to_cpu(action.nat_lport);
+		fs->nat_fport = be16_to_cpu(action.nat_fport);
+
+		fs->action = FILTER_SWITCH;
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	}
+
+out:
+	return err;
+}
+
+/**
+ * cxgbe_add_del_fdir_filter - add or remove a flow director filter.
+ * @dev: pointer to the structure rte_eth_dev
+ * @filter: fdir filter entry
+ * @fd_id: fdir index to insert/delete
+ * @del: 1 - delete, 0 - add
+ */
+static int cxgbe_add_del_fdir_filter(struct rte_eth_dev *dev,
+				     const struct rte_eth_fdir_filter *filter,
+				     unsigned int fd_id, bool del)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	struct cxgbe_fdir_map *map = adapter->fdir;
+	struct cxgbe_fdir_map_entry *entry;
+	struct ch_filter_specification fs;
+	struct filter_ctx ctx;
+	unsigned int filter_id;
+	bool map_set;
+	int err = 0;
+
+	if (filter->input.flow_type != RTE_ETH_FLOW_RAW_PKT)
+		return -ENOTSUP;
+
+	if (!(adapter->flags & FULL_INIT_DONE))
+		return -EAGAIN;  /* can still change nfilters */
+
+	t4_init_completion(&ctx.completion);
+
+	memset(&fs, 0, sizeof(fs));
+
+	/* Fill in the Match arguments to create the filter */
+	err = fill_ch_spec_match(filter, &fs);
+	if (err) {
+		dev_err(adapter, "FDIR filter invalid match argument\n");
+		goto out;
+	}
+
+	/* Sanity Check Filter ID */
+	if (fs.cap) {
+		if (!map->maskless_size) {
+			dev_err(adapter,
+				"Maskless Filters have been disabled\n");
+			return -ENOTSUP;
+		}
+
+		if (fd_id >= map->maskless_size) {
+			dev_err(adapter,
+				"Maskless Filters fd_id range is 0 to %d\n",
+				map->maskless_size - 1);
+			return -ERANGE;
+		}
+		entry = &map->mless_entry[fd_id];
+	} else {
+		if (!map->maskfull_size) {
+			dev_err(adapter,
+				"Maskfull Filters have been disabled\n");
+			return -ENOTSUP;
+		}
+
+		if (fd_id >= map->maskfull_size) {
+			dev_err(adapter,
+				"Maskfull Filters fd_id range is 0 to %d\n",
+				map->maskfull_size - 1);
+			return -ERANGE;
+		}
+		entry = &map->mfull_entry[fd_id];
+	}
+
+	/*
+	 * We are not bothered about action arguments to delete the filter, but
+	 * we need to know if it is a maskfull or maskless filter that is
+	 * requested to be deleted.
+	 */
+	map_set = is_fdir_map_set(map, fs.cap, fd_id);
+	if (del) {
+		if (!map_set) {
+			dev_err(adap, "No entry with fd_id %d found\n", fd_id);
+			return -EINVAL;
+		}
+	} else {
+		if (map_set) {
+			dev_err(adap, "Entry with fd_id %d occupied\n", fd_id);
+			return -EINVAL;
+		}
+	}
+
+	filter_id = fs.cap ? entry->tid : fd_id;
+	if (del) {
+		err = cxgbe_del_filter(dev, filter_id, &fs, &ctx);
+		if (!err) {
+			/* Poll the FW for reply */
+			err = cxgbe_poll_for_completion(&adapter->sge.fw_evtq,
+							CXGBE_FDIR_POLL_US,
+							CXGBE_FDIR_POLL_CNT,
+							&ctx.completion);
+			if (err) {
+				dev_err(adapter,
+					"FDIR filter delete timeout\n");
+				goto out;
+			} else {
+				err = ctx.result; /* Async Completion Done */
+				if (err)
+					goto out;
+
+				cxgbe_fdir_add_del_map_entry(map, fs.cap,
+							     fd_id, TRUE);
+				memset(entry, 0, sizeof(*entry));
+			}
+		} else {
+			dev_err(adapter, "Fail to delete FDIR filter!\n");
+			goto out;
+		}
+		return 0;
+	}
+
+	/* Fill in the Action arguments to create the filter */
+	err = fill_ch_spec_action(dev, filter, &fs);
+	if (err) {
+		dev_err(adapter, "FDIR filter invalid action argument\n");
+		goto out;
+	}
+
+	/* NAT not supported for LE-TCAM */
+	if (!fs.cap && fs.nat_mode) {
+		dev_err(adapter, "Maskfull NAT is not supported\n");
+		return -ENOTSUP;
+	}
+
+	/* Create the filter */
+	err = cxgbe_set_filter(dev, filter_id, &fs, &ctx);
+	if (!err) {
+		/* Poll the FW for reply */
+		err = cxgbe_poll_for_completion(&adapter->sge.fw_evtq,
+						CXGBE_FDIR_POLL_US,
+						CXGBE_FDIR_POLL_CNT,
+						&ctx.completion);
+		if (err) {
+			dev_err(adapter, "FDIR filter add timeout\n");
+			goto out;
+		} else {
+			err = ctx.result; /* Asynchronous Completion Done */
+			if (err)
+				goto out;
+
+			entry->tid = ctx.tid;
+			rte_memcpy(&entry->fs, &fs, sizeof(fs));
+			dev_debug(adapter, "FDIR inserted at tid: %d\n",
+				  ctx.tid);
+			cxgbe_fdir_add_del_map_entry(map, fs.cap, fd_id, FALSE);
+		}
+	} else {
+		dev_err(adapter, "Fail to add FDIR filter!\n");
+		goto out;
+	}
+
+	return 0;
+
+out:
+	return err;
+}
+
+/**
+ * Process the supported filter operations
+ */
+static int cxgbe_fdir_filter_op(struct rte_eth_dev *dev,
+				enum rte_filter_op filter_op,
+				struct rte_eth_fdir_filter *fdir)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct cxgbe_fdir_map *map = adap->fdir;
+	struct rte_eth_fdir_stats *stats = &map->stats;
+	unsigned int fd_id = fdir->soft_id;
+	int ret;
+
+	if (fd_id >= (map->maskfull_size + map->maskless_size)) {
+		dev_err(adap, "FD_ID must be < %d\n",
+			map->maskfull_size + map->maskless_size);
+		return -ERANGE;
+	}
+
+	switch (filter_op) {
+	case RTE_ETH_FILTER_ADD:
+		ret = cxgbe_add_del_fdir_filter(dev, fdir, fd_id, FALSE);
+		if (ret) {
+			stats->f_add++;
+		} else {
+			stats->free--;
+			stats->add++;
+		}
+		break;
+	case RTE_ETH_FILTER_DELETE:
+		ret = cxgbe_add_del_fdir_filter(dev, fdir, fd_id, TRUE);
+		if (ret) {
+			stats->f_remove++;
+		} else {
+			stats->free++;
+			stats->remove++;
+		}
+		break;
+	case RTE_ETH_FILTER_UPDATE:
+	case RTE_ETH_FILTER_FLUSH:
+	case RTE_ETH_FILTER_GET:
+	case RTE_ETH_FILTER_SET:
+		ret = -ENOTSUP;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
+/**
+ * Fill FDIR stats
+ */
+static void cxgbe_fdir_get_stats(struct rte_eth_dev *dev,
+				 struct rte_eth_fdir_stats *fdir_stats)
+{
+	struct adapter *adap = ethdev2adap(dev);
+	struct rte_eth_fdir_stats *stats = &adap->fdir->stats;
+
+	rte_memcpy(fdir_stats, stats, sizeof(*stats));
+}
+
+/**
+ * cxgbe_fdir_ctrl_func - deal with all operations on flow director.
+ * @dev: pointer to the structure rte_eth_dev
+ * @filter_op: operation to be performed
+ * @arg: a pointer to specific structure corresponding to the filter_op
+ */
+int cxgbe_fdir_ctrl_func(struct rte_eth_dev *dev, enum rte_filter_op filter_op,
+			 void *arg)
+{
+	struct adapter *adapter = ethdev2adap(dev);
+	int ret = 0;
+
+	if (!adapter->fdir)
+		return -ENOTSUP;
+
+	if (filter_op == RTE_ETH_FILTER_NOP)
+		return 0;
+
+	if (!arg)
+		return -EINVAL;
+
+	switch (filter_op) {
+	case RTE_ETH_FILTER_ADD:
+	case RTE_ETH_FILTER_DELETE:
+	case RTE_ETH_FILTER_UPDATE:
+	case RTE_ETH_FILTER_FLUSH:
+	case RTE_ETH_FILTER_GET:
+	case RTE_ETH_FILTER_SET:
+		ret = cxgbe_fdir_filter_op(dev, filter_op,
+					   (struct rte_eth_fdir_filter *)arg);
+		break;
+	case RTE_ETH_FILTER_INFO:
+		ret = -ENOTSUP;
+		break;
+	case RTE_ETH_FILTER_STATS:
+		cxgbe_fdir_get_stats(dev, (struct rte_eth_fdir_stats *)arg);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
+/**
+ * Initialize flow director
+ */
+struct cxgbe_fdir_map *cxgbe_init_fdir(struct adapter *adap)
+{
+	struct tid_info *t = &adap->tids;
+	struct cxgbe_fdir_map *map;
+	unsigned int mfull_size, mless_size;
+	unsigned int mfull_bmap_size, mless_bmap_size;
+
+	if (!t->tid_tab)
+		return NULL;
+
+	mfull_size = t->nftids;
+	if (is_hashfilter(adap))
+		mless_size = t->ntids - t->hash_base;
+	else
+		mless_size = 0;
+	mfull_bmap_size = rte_bitmap_get_memory_footprint(mfull_size);
+	mless_bmap_size = rte_bitmap_get_memory_footprint(mless_size);
+
+	if ((mfull_size + mless_size) < 1)
+		return NULL;
+
+	map = t4_os_alloc(sizeof(*map));
+	if (!map)
+		return NULL;
+
+	map->maskfull_size = mfull_size;
+	map->maskless_size = mless_size;
+
+	/* Allocate Maskfull Entries */
+	map->mfull_bmap_array = t4_os_alloc(mfull_bmap_size);
+	if (!map->mfull_bmap_array)
+		goto free_map;
+	map->mfull_bmap = rte_bitmap_init(mfull_size, map->mfull_bmap_array,
+					  mfull_bmap_size);
+	if (!map->mfull_bmap)
+		goto free_mfull;
+
+	map->mfull_entry = t4_os_alloc(mfull_size *
+				       sizeof(struct cxgbe_fdir_map_entry));
+	if (!map->mfull_entry)
+		goto free_mfull;
+
+	/* Allocate Maskless Entries */
+	if (mless_size) {
+		map->mless_bmap_array = t4_os_alloc(mless_bmap_size);
+		if (!map->mless_bmap_array)
+			goto free_mfull;
+		map->mless_bmap = rte_bitmap_init(mless_size,
+						  map->mless_bmap_array,
+						  mless_bmap_size);
+		if (!map->mless_bmap)
+			goto free_mless;
+
+		map->mless_entry = t4_os_alloc(mless_size *
+					sizeof(struct cxgbe_fdir_map_entry));
+		if (!map->mless_entry)
+			goto free_mless;
+	}
+
+	t4_os_lock_init(&map->lock);
+
+	map->stats.free = mfull_size + mless_size;
+	return map;
+
+free_mless:
+	if (map->mless_bmap)
+		rte_bitmap_free(map->mless_bmap);
+
+	if (map->mless_bmap_array)
+		t4_os_free(map->mless_bmap_array);
+
+free_mfull:
+	if (map->mfull_entry)
+		t4_os_free(map->mfull_entry);
+
+	if (map->mfull_bmap)
+		rte_bitmap_free(map->mfull_bmap);
+
+	if (map->mfull_bmap_array)
+		t4_os_free(map->mfull_bmap_array);
+
+free_map:
+	t4_os_free(map);
+	return NULL;
+}
+
+/**
+ * Cleanup flow director
+ */
+void cxgbe_cleanup_fdir(struct adapter *adap)
+{
+	struct cxgbe_fdir_map *map = adap->fdir;
+
+	if (map) {
+		if (map->mfull_bmap) {
+			rte_bitmap_free(map->mfull_bmap);
+			t4_os_free(map->mfull_bmap_array);
+		}
+
+		if (map->mless_bmap) {
+			rte_bitmap_free(map->mless_bmap);
+			t4_os_free(map->mless_bmap_array);
+		}
+
+		if (map->mfull_entry)
+			t4_os_free(map->mfull_entry);
+		if (map->mless_entry)
+			t4_os_free(map->mless_entry);
+
+		t4_os_free(map);
+	}
+}
diff --git a/drivers/net/cxgbe/cxgbe_fdir.h b/drivers/net/cxgbe/cxgbe_fdir.h
new file mode 100644
index 0000000..d4ae183
--- /dev/null
+++ b/drivers/net/cxgbe/cxgbe_fdir.h
@@ -0,0 +1,108 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Chelsio Communications.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Chelsio Communications nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CXGBE_FDIR_H_
+#define _CXGBE_FDIR_H_
+
+#define CXGBE_FDIR_POLL_US  10
+#define CXGBE_FDIR_POLL_CNT 10
+
+/* RTE_ETH_FLOW_RAW_PKT representation. */
+struct cxgbe_fdir_input_admin {
+	uint8_t prio;
+	uint8_t type;
+	uint8_t cap;
+};
+
+struct cxgbe_fdir_input_flow {
+	uint16_t ethtype;
+	uint8_t iport;
+	uint8_t proto;
+	uint8_t tos;
+	uint16_t ivlan;
+	uint16_t ovlan;
+
+	uint8_t lip[16];
+	uint8_t fip[16];
+	uint16_t lport;
+	uint16_t fport;
+};
+
+struct cxgbe_fdir_action {
+	uint8_t eport;
+	uint8_t newdmac;
+	uint8_t newsmac;
+	uint8_t swapmac;
+	uint8_t newvlan;
+	uint8_t nat_mode;
+	uint8_t dmac[ETHER_ADDR_LEN];
+	uint8_t smac[ETHER_ADDR_LEN];
+	uint16_t vlan;
+
+	uint8_t nat_lip[16];
+	uint8_t nat_fip[16];
+	uint16_t nat_lport;
+	uint16_t nat_fport;
+};
+
+/* The cxgbe_fdir_map is a mapping between the DPDK filter stack and the
+ * CXGBE filtering support. Its main intention is to translate info between
+ * the two.
+ */
+struct cxgbe_fdir_map_entry {
+	u32 tid;                           /* Filter index */
+	struct ch_filter_specification fs; /* Filter specification */
+};
+
+struct cxgbe_fdir_map {
+	/* DPDK related info */
+	struct rte_eth_fdir_stats stats;
+
+	/* CXGBE related info */
+	unsigned int maskfull_size;    /* Size of Maskfull region */
+	unsigned int maskless_size;    /* Size of Maskless region */
+	rte_spinlock_t lock;           /* Lock to access an entry */
+
+	uint8_t *mfull_bmap_array;     /* Bitmap array for maskfull entries */
+	struct rte_bitmap *mfull_bmap; /* Bitmap for maskfull entries */
+	uint8_t *mless_bmap_array;     /* Bitmap array for maskless entries */
+	struct rte_bitmap *mless_bmap; /* Bitmap for maskless entries */
+	struct cxgbe_fdir_map_entry *mfull_entry; /* Maskfull fdir entries */
+	struct cxgbe_fdir_map_entry *mless_entry; /* Maskless fdir entries */
+};
+
+struct cxgbe_fdir_map *cxgbe_init_fdir(struct adapter *adap);
+void cxgbe_cleanup_fdir(struct adapter *adap);
+int cxgbe_fdir_ctrl_func(struct rte_eth_dev *dev,
+			 enum rte_filter_op filter_op, void *arg);
+#endif /* _CXGBE_FDIR_H_ */
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 1f79ba3..f8168c6 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -148,6 +148,38 @@ out:
 }
 
 /**
+ * cxgbe_poll_for_completion: Poll rxq for completion
+ * @q: rxq to poll
+ * @us: microseconds to delay
+ * @cnt: number of times to poll
+ * @c: completion to check for 'done' status
+ *
+ * Polls the rxq for replies until completion is done or the count
+ * expires.
+ */
+int cxgbe_poll_for_completion(struct sge_rspq *q, unsigned int us,
+			      unsigned int cnt, struct t4_completion *c)
+{
+	unsigned int i;
+	unsigned int work_done, budget = 4;
+
+	if (!c)
+		return -EINVAL;
+
+	for (i = 0; i < cnt; i++) {
+		cxgbe_poll(q, NULL, budget, &work_done);
+		t4_os_lock(&c->lock);
+		if (c->done) {
+			t4_os_unlock(&c->lock);
+			return 0;
+		}
+		t4_os_unlock(&c->lock);
+		udelay(us);
+	}
+	return -ETIMEDOUT;
+}
+
+/**
  * Setup sge control queues to pass control information.
  */
 int setup_sge_ctrl_txq(struct adapter *adapter)
@@ -1355,6 +1387,7 @@ void cxgbe_close(struct adapter *adapter)
 
 	if (adapter->flags & FULL_INIT_DONE) {
 		cxgbe_clear_all_filters(adapter);
+		cxgbe_cleanup_fdir(adapter);
 		tid_free(&adapter->tids);
 		t4_cleanup_clip_tbl(adapter);
 		t4_cleanup_l2t(adapter);
@@ -1542,6 +1575,13 @@ allocate_mac:
 			 "Maskless filter support disabled. Continuing\n");
 	}
 
+	adapter->fdir = cxgbe_init_fdir(adapter);
+	if (!adapter->fdir) {
+		/* Disable Flow Director */
+		dev_warn(adapter, "could not allocate FDIR. "
+			 "Flow director support disabled. Continuing\n");
+	}
+
 	err = init_rss(adapter);
 	if (err)
 		goto out_free;
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index bd4b381..83e833c 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1664,7 +1664,8 @@ static int process_responses(struct sge_rspq *q, int budget,
 			unsigned int params;
 			u32 val;
 
-			if (fl_cap(&rxq->fl) - rxq->fl.avail >= 64)
+			if (q->offset >= 0 &&
+			    fl_cap(&rxq->fl) - rxq->fl.avail >= 64)
 				__refill_fl(q->adapter, &rxq->fl);
 			params = V_QINTR_TIMER_IDX(X_TIMERREG_UPDATE_CIDX);
 			q->next_intr_params = params;
-- 
2.5.3
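
For readers following fill_ch_spec_match() above, here is a rough sketch of
how an application might lay out the two raw-packet buffers the driver
unpacks: the admin header followed by the match values in the flow buffer,
and the masks alone (no admin header) in the mask buffer. The two struct
layouts are copied from cxgbe_fdir.h in this patch. Everything prefixed
sketch_ or SKETCH_ is hypothetical, including the buffer length, which
stands in for the raw_pkt_flow array size defined in patch 1 and not shown
here. Treating type 0 as IPv4 is also an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* htons() */

/* Layouts copied from cxgbe_fdir.h in this patch. */
struct cxgbe_fdir_input_admin {
	uint8_t prio;
	uint8_t type;
	uint8_t cap;
};

struct cxgbe_fdir_input_flow {
	uint16_t ethtype;
	uint8_t iport;
	uint8_t proto;
	uint8_t tos;
	uint16_t ivlan;
	uint16_t ovlan;
	uint8_t lip[16];
	uint8_t fip[16];
	uint16_t lport;
	uint16_t fport;
};

/* ASSUMPTION: stand-in for the raw_pkt_flow array size from patch 1. */
#define SKETCH_RAW_PKT_LEN 256

/*
 * Pack a maskfull TCP rule that matches local port 'lport' exactly.
 * flow_buf receives the admin header followed by the match values;
 * mask_buf receives only the masks. Returns 0 on success, -1 if the
 * buffers are too small.
 */
static int sketch_pack_tcp_lport(uint8_t *flow_buf, uint8_t *mask_buf,
				 size_t len, uint16_t lport)
{
	struct cxgbe_fdir_input_admin admin;
	struct cxgbe_fdir_input_flow val, mask;

	assert(flow_buf && mask_buf);
	if (len < sizeof(admin) + sizeof(val))
		return -1;

	memset(&admin, 0, sizeof(admin)); /* prio 0, type 0, cap 0 (maskfull) */
	memset(&val, 0, sizeof(val));
	memset(&mask, 0, sizeof(mask));

	val.proto = 6;			/* TCP */
	mask.proto = 0xff;
	val.lport = htons(lport);	/* driver applies be16_to_cpu() */
	mask.lport = htons(0xffff);

	memset(flow_buf, 0, len);
	memset(mask_buf, 0, len);
	memcpy(flow_buf, &admin, sizeof(admin));
	memcpy(flow_buf + sizeof(admin), &val, sizeof(val));
	memcpy(mask_buf, &mask, sizeof(mask));
	return 0;
}
```

A caller would pass two SKETCH_RAW_PKT_LEN-byte arrays and then copy them
into the filter's flow and flow_mask raw-packet fields. Because both sides
memcpy the same struct definitions, compiler padding inside
cxgbe_fdir_input_flow is harmless as long as the application and the driver
are built with the same ABI.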


* Re: [PATCH 00/10] cxgbe: Add flow director support
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (9 preceding siblings ...)
  2016-02-03  8:32 ` [PATCH 10/10] cxgbe: add flow director support and update documentation Rahul Lakkireddy
@ 2016-02-22 10:39 ` Rahul Lakkireddy
  2016-03-22 13:43 ` Bruce Richardson
  11 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-22 10:39 UTC (permalink / raw)
  To: dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan

Hi All,

On Wednesday, February 03, 2016 at 14:02:21 +0530, Rahul Lakkireddy wrote:
> This series of patches extend the flow director filter and add support
> for Chelsio T5 hardware filtering capabilities.
> 
> Chelsio T5 supports carrying out filtering in hardware which supports 3
> actions to carry out on a packet which hit a filter viz.
> 
> 1. Action Pass - Packets hitting a filter rule can be directed to a
>    particular RXQ.
> 
> 2. Action Drop - Packets hitting a filter rule are dropped in h/w.
> 
> 3. Action Switch - Packets hitting a filter rule can be switched in h/w
>    from one port to another, without involvement of host.  Also, the
>    action Switch also supports rewrite of src-mac/dst-mac headers as
>    well as rewrite of vlan headers.  It also supports rewrite of IP
>    headers and thereby, supports NAT (Network Address Translation)
>    in h/w.
> 
> Also, each filter rule can optionally support specifying a mask value
> i.e. it's possible to create a filter rule for an entire subnet of IP
> addresses or a range of tcp/udp ports, etc.
> 
> Patch 1 does the following:
> - Adds a new flow RTE_ETH_FLOW_RAW_PKT to allow specifying a generic
>   flow.
> - Adds an additional generic array to rte_eth_fdir_flow to allow
>   specifying generic flow input.
> - Adds an additional mask for the flow input to allow range of values
>   to be matched in the flow input.
> - Adds a new behavior 'switch'.
> - Adds a generic array to hold behavior arguments that can be passed
>   when a particular behavior is taken. For ex: in case of action
>   'switch', pass additional 4-tuple to allow rewriting src/dst ip and
>   port addresses to support NAT'ing.
> 
> RFC series of patches and discussion involving these enhancements to the
> flow director are available at [1].
> 
> Patch 2 adds command line example app to test cxgbe flow director. Also
> add documentation for the example app.
> 
> Patch 3 updates the cxgbe base to add support for packet filtering.
> 
> Patch 4 adds control txq for communicating filter info to the firmware.
> 
> Patches 5-7 add compressed local ip (CLIP) table, layer 2 table (L2T),
> and source mac table (SMT) definitions required for holding info
> for matching and executing various operations on matched filters.
> 
> Patch 8 adds the LE-TCAM (maskfull) filter support.
> 
> Patch 9 adds the HASH (maskless) filter support.
> 
> Patch 10 adds and implements the flow director filter operations. Also
> add the documentation.
> 
> 
> [1] http://comments.gmane.org/gmane.comp.networking.dpdk.devel/29986
> 
> Rahul Lakkireddy (10):
>   ethdev: add a generic flow and new behavior switch to fdir
>   examples/test-cxgbe-filters: add example to test cxgbe fdir support
>   cxgbe: add skeleton to add support for T5 hardware filtering
>   cxgbe: add control txq for communicating filtering info
>   cxgbe: add compressed local IP table for matching IPv6 addresses
>   cxgbe: add layer 2 table for switch action filter
>   cxgbe: add source mac table for switch action filter
>   cxgbe: add LE-TCAM filtering support
>   cxgbe: add HASH filtering support
>   cxgbe: add flow director support and update documentation
> 
>  MAINTAINERS                                        |    2 +
>  doc/guides/nics/cxgbe.rst                          |  166 ++
>  doc/guides/rel_notes/release_2_3.rst               |   10 +
>  doc/guides/sample_app_ug/index.rst                 |    1 +
>  doc/guides/sample_app_ug/test_cxgbe_filters.rst    |  694 +++++++++
>  drivers/net/cxgbe/Makefile                         |    6 +
>  drivers/net/cxgbe/base/adapter.h                   |  110 ++
>  drivers/net/cxgbe/base/common.h                    |   11 +
>  drivers/net/cxgbe/base/t4_hw.c                     |   28 +
>  drivers/net/cxgbe/base/t4_msg.h                    |  324 ++++
>  drivers/net/cxgbe/base/t4_regs.h                   |    9 +
>  drivers/net/cxgbe/base/t4_regs_values.h            |   25 +
>  drivers/net/cxgbe/base/t4_tcb.h                    |   95 ++
>  drivers/net/cxgbe/base/t4fw_interface.h            |  272 ++++
>  drivers/net/cxgbe/clip_tbl.c                       |  220 +++
>  drivers/net/cxgbe/clip_tbl.h                       |   59 +
>  drivers/net/cxgbe/cxgbe.h                          |    4 +
>  drivers/net/cxgbe/cxgbe_compat.h                   |   12 +
>  drivers/net/cxgbe/cxgbe_ethdev.c                   |   21 +
>  drivers/net/cxgbe/cxgbe_fdir.c                     |  715 +++++++++
>  drivers/net/cxgbe/cxgbe_fdir.h                     |  108 ++
>  drivers/net/cxgbe/cxgbe_filter.c                   | 1614 ++++++++++++++++++++
>  drivers/net/cxgbe/cxgbe_filter.h                   |  260 ++++
>  drivers/net/cxgbe/cxgbe_main.c                     |  395 ++++-
>  drivers/net/cxgbe/cxgbe_ofld.h                     |  126 ++
>  drivers/net/cxgbe/l2t.c                            |  261 ++++
>  drivers/net/cxgbe/l2t.h                            |   87 ++
>  drivers/net/cxgbe/sge.c                            |  202 ++-
>  drivers/net/cxgbe/smt.c                            |  275 ++++
>  drivers/net/cxgbe/smt.h                            |   76 +
>  examples/Makefile                                  |    1 +
>  examples/test-cxgbe-filters/Makefile               |   63 +
>  examples/test-cxgbe-filters/commands.c             |  429 ++++++
>  examples/test-cxgbe-filters/commands.h             |   40 +
>  examples/test-cxgbe-filters/config.c               |   79 +
>  examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c |  554 +++++++
>  examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h     |   79 +
>  examples/test-cxgbe-filters/init.c                 |  201 +++
>  examples/test-cxgbe-filters/main.c                 |   79 +
>  examples/test-cxgbe-filters/main.h                 |   77 +
>  examples/test-cxgbe-filters/runtime.c              |   74 +
>  lib/librte_ether/rte_eth_ctrl.h                    |   15 +-
>  42 files changed, 7874 insertions(+), 5 deletions(-)
>  create mode 100644 doc/guides/sample_app_ug/test_cxgbe_filters.rst
>  create mode 100644 drivers/net/cxgbe/base/t4_tcb.h
>  create mode 100644 drivers/net/cxgbe/clip_tbl.c
>  create mode 100644 drivers/net/cxgbe/clip_tbl.h
>  create mode 100644 drivers/net/cxgbe/cxgbe_fdir.c
>  create mode 100644 drivers/net/cxgbe/cxgbe_fdir.h
>  create mode 100644 drivers/net/cxgbe/cxgbe_filter.c
>  create mode 100644 drivers/net/cxgbe/cxgbe_filter.h
>  create mode 100644 drivers/net/cxgbe/cxgbe_ofld.h
>  create mode 100644 drivers/net/cxgbe/l2t.c
>  create mode 100644 drivers/net/cxgbe/l2t.h
>  create mode 100644 drivers/net/cxgbe/smt.c
>  create mode 100644 drivers/net/cxgbe/smt.h
>  create mode 100644 examples/test-cxgbe-filters/Makefile
>  create mode 100644 examples/test-cxgbe-filters/commands.c
>  create mode 100644 examples/test-cxgbe-filters/commands.h
>  create mode 100644 examples/test-cxgbe-filters/config.c
>  create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_commands.c
>  create mode 100644 examples/test-cxgbe-filters/cxgbe/cxgbe_fdir.h
>  create mode 100644 examples/test-cxgbe-filters/init.c
>  create mode 100644 examples/test-cxgbe-filters/main.c
>  create mode 100644 examples/test-cxgbe-filters/main.h
>  create mode 100644 examples/test-cxgbe-filters/runtime.c
> 
> -- 
> 2.5.3
> 

Just a gentle reminder: are there any review comments on this series?

Thanks,
Rahul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support
  2016-02-03  8:32 ` [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support Rahul Lakkireddy
@ 2016-02-24 14:40   ` Bruce Richardson
  2016-02-24 18:35     ` Rahul Lakkireddy
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2016-02-24 14:40 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar Sanghvi, Nirranjan Kirubaharan

On Wed, Feb 03, 2016 at 02:02:23PM +0530, Rahul Lakkireddy wrote:
> Add a new test_cxgbe_filters command line example to test support for
> Chelsio T5 hardware filtering. Shows how to pass the Chelsio input flow
> and input masks. Also, shows how to pass extra behavior arguments to
> rewrite fields in matched filter rules.
> 
> Also add documentation and update MAINTAINERS.
> 
> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>

Hi,

for testing NIC functionality, the "testpmd" app is what is used, and it already
contains support for existing flow director functionality. Should the testing
functionality not be included there?

Note: that's not to say we don't need a simple example app as well, for 
demonstrating how to use flow director, but at minimum for nic features we
generally need to have testpmd support.

Can this patchset perhaps be changed to include some testpmd support, and maybe
have any example apps as a separate set?

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
@ 2016-02-24 14:43   ` Bruce Richardson
  2016-02-24 15:02     ` Thomas Monjalon
  2016-02-25  3:26   ` Wu, Jingjing
  1 sibling, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2016-02-24 14:43 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar Sanghvi, Nirranjan Kirubaharan

On Wed, Feb 03, 2016 at 02:02:22PM +0530, Rahul Lakkireddy wrote:
> Add a new raw packet flow that allows specifying generic flow input.
> 
> Add the ability to provide masks for fields in flow to allow range of
> values.
> 
> Add a new behavior switch.
> 
> Add the ability to provide behavior arguments to allow rewriting matched
> fields with new values. Ex: allows to provide new ip and port addresses
> to rewrite the fields of packets matching a filter rule before NAT'ing.
> 
> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> ---
>  doc/guides/rel_notes/release_2_3.rst |  3 +++
>  lib/librte_ether/rte_eth_ctrl.h      | 15 ++++++++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
Thomas, any comments as ethdev maintainer?

Jingjing, you have been doing some work on flow director for other NICs. Can
you perhaps review this patch as well.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-24 14:43   ` Bruce Richardson
@ 2016-02-24 15:02     ` Thomas Monjalon
  2016-02-24 18:40       ` Rahul Lakkireddy
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2016-02-24 15:02 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar Sanghvi, Nirranjan Kirubaharan

2016-02-24 14:43, Bruce Richardson:
> On Wed, Feb 03, 2016 at 02:02:22PM +0530, Rahul Lakkireddy wrote:
> > Add a new raw packet flow that allows specifying generic flow input.
> > 
> > Add the ability to provide masks for fields in flow to allow range of
> > values.
> > 
> > Add a new behavior switch.
> > 
> > Add the ability to provide behavior arguments to allow rewriting matched
> > fields with new values. Ex: allows to provide new ip and port addresses
> > to rewrite the fields of packets matching a filter rule before NAT'ing.
> > 
> Thomas, any comments as ethdev maintainer?

Yes, some comments.
First, there are several different changes in the same patch. It must be split.
Then, I don't understand the raw flow filter at all. What is a raw flow?
How must behavior_arg be used?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support
  2016-02-24 14:40   ` Bruce Richardson
@ 2016-02-24 18:35     ` Rahul Lakkireddy
  2016-02-25 13:48       ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-24 18:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

Hi Bruce,

On Wednesday, February 02/24/16, 2016 at 06:40:56 -0800, Bruce Richardson wrote:
> On Wed, Feb 03, 2016 at 02:02:23PM +0530, Rahul Lakkireddy wrote:
> > Add a new test_cxgbe_filters command line example to test support for
> > Chelsio T5 hardware filtering. Shows how to pass the Chelsio input flow
> > and input masks. Also, shows how to pass extra behavior arguments to
> > rewrite fields in matched filter rules.
> > 
> > Also add documentation and update MAINTAINERS.
> > 
> > Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> > Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> 
> Hi,
> 
> for testing NIC functionality, the "testpmd" app is what is used, and it already
> contains support for existing flow director functionality. Should the testing
> functionality not be included there?
> 

We initially thought of adding an example by extending flow director in
testpmd itself.  However, based on the discussion at [1], we created
a separate sample app instead, on the assumption that each vendor would
come up with their own sample app.

[1] http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31471

> Note: that's not to say we don't need a simple example app as well, for 
> demonstrating how to use flow director, but at minimum for nic features we
> generally need to have testpmd support.
> 
> Can this patchset perhaps be changed to include some testpmd support, and maybe
> have any example apps as a separate set?
> 

If adding the example to the existing testpmd sounds more appropriate,
then I will re-submit with testpmd updated instead.

Thanks,
Rahul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-24 15:02     ` Thomas Monjalon
@ 2016-02-24 18:40       ` Rahul Lakkireddy
  2016-02-24 22:17         ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-24 18:40 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

Hi Thomas,

On Wednesday, February 02/24/16, 2016 at 07:02:42 -0800, Thomas Monjalon wrote:
> 2016-02-24 14:43, Bruce Richardson:
> > On Wed, Feb 03, 2016 at 02:02:22PM +0530, Rahul Lakkireddy wrote:
> > > Add a new raw packet flow that allows specifying generic flow input.
> > > 
> > > Add the ability to provide masks for fields in flow to allow range of
> > > values.
> > > 
> > > Add a new behavior switch.
> > > 
> > > Add the ability to provide behavior arguments to allow rewriting matched
> > > fields with new values. Ex: allows to provide new ip and port addresses
> > > to rewrite the fields of packets matching a filter rule before NAT'ing.
> > > 
> > Thomas, any comments as ethdev maintainer?
> 
> Yes, some comments.
> First, there are several different changes in the same patch. It must be split.

Should each structure change be split into a separate patch?

> Then I don't understand at all the raw flow filter. What is a raw flow?
> How behavior_arg must be used?
> 

This was discussed with Jingjing at

http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31471

A raw flow provides a generic way for vendors to add their vendor
specific input flow.  In our case, it is possible to match several flows
in a single rule.  For example, it's possible to set an ethernet, vlan,
ip and tcp/udp flows all in a single rule.  We can specify all of these
flows in a single raw input flow, which can then be passed to cxgbe flow
director to set the corresponding filter.
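
A minimal C sketch of the idea in the paragraph above, assuming a raw flow
is simply a byte pattern with a same-sized mask. The names raw_flow_rule,
raw_flow_add and raw_flow_match are hypothetical and exist in neither DPDK
nor cxgbe, and the byte layout is for illustration only, not the layout the
cxgbe driver expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size; mirrors RTE_ETH_RAW_PKT_FLOW_MAX_LEN from the patch. */
#define RAW_FLOW_MAX_LEN 256

/* Hypothetical container: a raw byte pattern plus a same-sized mask. */
struct raw_flow_rule {
	uint8_t value[RAW_FLOW_MAX_LEN];
	uint8_t mask[RAW_FLOW_MAX_LEN];
	size_t len; /* number of meaningful bytes */
};

/* Append one field (already in network byte order) and its mask. */
static void raw_flow_add(struct raw_flow_rule *r, const void *val,
			 const void *mask, size_t n)
{
	assert(r->len + n <= RAW_FLOW_MAX_LEN);
	memcpy(r->value + r->len, val, n);
	memcpy(r->mask + r->len, mask, n);
	r->len += n;
}

/* A key matches when every bit selected by the mask equals the rule's bit. */
static int raw_flow_match(const struct raw_flow_rule *r,
			  const uint8_t *key, size_t key_len)
{
	size_t i;

	if (key_len < r->len)
		return 0;
	for (i = 0; i < r->len; i++)
		if ((key[i] ^ r->value[i]) & r->mask[i])
			return 0;
	return 1;
}
```

Starting from a zeroed rule, appending an IPv4 address with mask
255.255.255.0 followed by a fully masked TCP port gives one rule that
covers a whole /24 subnet on a single port.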

On similar lines, behavior_arg provides a generic way to pass extra
action arguments for matched flows.  For example, in our case, to
perform NAT, the new src/dst ip and src/dst port addresses to be
re-written for a matched rule can be passed in behavior_arg.
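
As a rough illustration of the behavior_arg usage described above, the
sketch below packs a NAT rewrite 4-tuple into an opaque byte array. The
struct nat_rewrite_args and both helpers are hypothetical; the patch only
defines the byte array itself, and the real layout is whatever the driver
documents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed size; mirrors RTE_ETH_BEHAVIOR_ARG_MAX_LEN from the patch. */
#define BEHAVIOR_ARG_MAX_LEN 256

/* Hypothetical NAT rewrite arguments; all fields in network byte order. */
struct nat_rewrite_args {
	uint8_t new_src_ip[4];
	uint8_t new_dst_ip[4];
	uint8_t new_src_port[2];
	uint8_t new_dst_port[2];
};

/* Serialize the arguments at the start of the opaque array, zero the rest. */
static void nat_args_pack(uint8_t arg[BEHAVIOR_ARG_MAX_LEN],
			  const struct nat_rewrite_args *a)
{
	assert(sizeof(*a) <= BEHAVIOR_ARG_MAX_LEN);
	memset(arg, 0, BEHAVIOR_ARG_MAX_LEN);
	memcpy(arg, a, sizeof(*a));
}

/* Driver-side counterpart: recover the arguments from the opaque array. */
static void nat_args_unpack(struct nat_rewrite_args *a,
			    const uint8_t arg[BEHAVIOR_ARG_MAX_LEN])
{
	memcpy(a, arg, sizeof(*a));
}
```

Both sides must agree on the struct layout out of band, since nothing in
the array itself describes its contents.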

Thanks,
Rahul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-24 18:40       ` Rahul Lakkireddy
@ 2016-02-24 22:17         ` Thomas Monjalon
  2016-02-25  9:33           ` Rahul Lakkireddy
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2016-02-24 22:17 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

Caution: I truly respect the work done by Chelsio on DPDK.
And I'm sure you can help to build a good filtering API, which
was mainly designed with Intel needs in mind because it was
difficult to have opinions of other vendors some time ago.
That's why it's a chance to have new needs and it would be a shame
to let it go through a vendor specific backdoor.

2016-02-25 00:10, Rahul Lakkireddy:
> Hi Thomas,
> 
> On Wednesday, February 02/24/16, 2016 at 07:02:42 -0800, Thomas Monjalon wrote:
> > 2016-02-24 14:43, Bruce Richardson:
> > > On Wed, Feb 03, 2016 at 02:02:22PM +0530, Rahul Lakkireddy wrote:
> > > > Add a new raw packet flow that allows specifying generic flow input.
> > > > 
> > > > Add the ability to provide masks for fields in flow to allow range of
> > > > values.
> > > > 
> > > > Add a new behavior switch.
> > > > 
> > > > Add the ability to provide behavior arguments to allow rewriting matched
> > > > fields with new values. Ex: allows to provide new ip and port addresses
> > > > to rewrite the fields of packets matching a filter rule before NAT'ing.
> > > > 
> > > Thomas, any comments as ethdev maintainer?
> > 
> > Yes, some comments.
> > First, there are several different changes in the same patch. It must be split.
> 
> Should each structure change be split into a separate patch?

A patch = a feature.
The switch action and the flow rule are different things.

> > Then I don't understand at all the raw flow filter. What is a raw flow?
> > How behavior_arg must be used?
> 
> This was discussed with Jingjing at
> 
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31471

Thanks, I missed it.

> A raw flow provides a generic way for vendors to add their vendor
> specific input flow.

Please, "generic" and "vendor specific" in the same sentence.
It's obviously wrong.

> In our case, it is possible to match several flows
> in a single rule.  For example, it's possible to set an ethernet, vlan,
> ip and tcp/udp flows all in a single rule.  We can specify all of these
> flows in a single raw input flow, which can then be passed to cxgbe flow
> director to set the corresponding filter.

I feel we need to define what an API is.
If the application wants to call something specific to the NIC, why use
the ethdev API? You just have to include cxgbe.h.

> On similar lines, behavior_arg provides a generic way to pass extra
> action arguments for matched flows.  For example, in our case, to
> perform NAT, the new src/dst ip and src/dst port addresses to be
> re-written for a matched rule can be passed in behavior_arg.

Yes a kind of void* to give what you want to the driver without the
convenience of a documented function.

I know the support of filters among NICs is really heterogeneous.
And the DPDK API are not yet generic enough. But please do not give up!
If the filtering API can be improved to support your cases, please do it.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
  2016-02-24 14:43   ` Bruce Richardson
@ 2016-02-25  3:26   ` Wu, Jingjing
  2016-02-25  9:11     ` Rahul Lakkireddy
  1 sibling, 1 reply; 27+ messages in thread
From: Wu, Jingjing @ 2016-02-25  3:26 UTC (permalink / raw)
  To: Rahul Lakkireddy, dev; +Cc: Kumar Sanghvi, Nirranjan Kirubaharan



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Rahul Lakkireddy
> Sent: Wednesday, February 03, 2016 4:32 PM
> To: dev@dpdk.org
> Cc: Kumar Sanghvi; Nirranjan Kirubaharan
> Subject: [dpdk-dev] [PATCH 01/10] ethdev: add a generic flow and new
> behavior switch to fdir
> 
> Add a new raw packet flow that allows specifying generic flow input.
> 
> Add the ability to provide masks for fields in flow to allow range of values.
> 
> Add a new behavior switch.
> 
> Add the ability to provide behavior arguments to allow rewriting matched
> fields with new values. Ex: allows to provide new ip and port addresses to
> rewrite the fields of packets matching a filter rule before NAT'ing.
> 
> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> ---
>  doc/guides/rel_notes/release_2_3.rst |  3 +++
>  lib/librte_ether/rte_eth_ctrl.h      | 15 ++++++++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
> index 99de186..19ce954 100644
> --- a/doc/guides/rel_notes/release_2_3.rst
> +++ b/doc/guides/rel_notes/release_2_3.rst
> @@ -39,6 +39,9 @@ API Changes
>  ABI Changes
>  -----------
> 
> +* New flow type ``RTE_ETH_FLOW_RAW_PKT`` had been introduced and hence
> +  ``RTE_ETH_FLOW_MAX`` had been increased to 19.
> +
> 
Great to see a raw_pkt_flow type.
However, there is already a flow type "RTE_ETH_FLOW_RAW", so it's not necessary to add a new one.

>  Shared Library Versions
>  -----------------------
> diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
> index ce224ad..1bc0d03 100644
> --- a/lib/librte_ether/rte_eth_ctrl.h
> +++ b/lib/librte_ether/rte_eth_ctrl.h
> @@ -74,7 +74,8 @@ extern "C" {
>  #define RTE_ETH_FLOW_IPV6_EX            15
>  #define RTE_ETH_FLOW_IPV6_TCP_EX        16
>  #define RTE_ETH_FLOW_IPV6_UDP_EX        17
> -#define RTE_ETH_FLOW_MAX                18
> +#define RTE_ETH_FLOW_RAW_PKT            18
> +#define RTE_ETH_FLOW_MAX                19
> 
>  /**
>   * Feature filter types
> @@ -499,6 +500,9 @@ struct rte_eth_tunnel_flow {
>  	struct ether_addr mac_addr;                /**< Mac address to match. */
>  };
> 
> +/**< Max length of raw packet in bytes. */
> +#define RTE_ETH_RAW_PKT_FLOW_MAX_LEN 256
> +
>  /**
>   * An union contains the inputs for all types of flow
>   */
> @@ -514,6 +518,7 @@ union rte_eth_fdir_flow {
>  	struct rte_eth_ipv6_flow   ipv6_flow;
>  	struct rte_eth_mac_vlan_flow mac_vlan_flow;
>  	struct rte_eth_tunnel_flow   tunnel_flow;
> +	uint8_t raw_pkt_flow[RTE_ETH_RAW_PKT_FLOW_MAX_LEN];
>  };
> 
>  /**
> @@ -534,6 +539,8 @@ struct rte_eth_fdir_input {
>  	uint16_t flow_type;
>  	union rte_eth_fdir_flow flow;
>  	/**< Flow fields to match, dependent on flow_type */
> +	union rte_eth_fdir_flow flow_mask;
> +	/**< Mask for the fields matched, dependent on flow */
>  	struct rte_eth_fdir_flow_ext flow_ext;
>  	/**< Additional fields to match */
>  };
> @@ -545,6 +552,7 @@ enum rte_eth_fdir_behavior {
>  	RTE_ETH_FDIR_ACCEPT = 0,
>  	RTE_ETH_FDIR_REJECT,
>  	RTE_ETH_FDIR_PASSTHRU,
> +	RTE_ETH_FDIR_SWITCH,
>  };
> 
>  /**
> @@ -558,6 +566,9 @@ enum rte_eth_fdir_status {
>  	RTE_ETH_FDIR_REPORT_FLEX_8,        /**< Report 8 flex bytes. */
>  };
> 
> +/**< Max # of behavior arguments */
> +#define RTE_ETH_BEHAVIOR_ARG_MAX_LEN 256
> +
>  /**
>   * A structure used to define an action when match FDIR packet filter.
>   */
> @@ -569,6 +580,8 @@ struct rte_eth_fdir_action {
>  	/**< If report_status is RTE_ETH_FDIR_REPORT_ID_FLEX_4 or
>  	     RTE_ETH_FDIR_REPORT_FLEX_8, flex_off specifies where the reported
>  	     flex bytes start from in flexible payload. */
> +	uint8_t behavior_arg[RTE_ETH_BEHAVIOR_ARG_MAX_LEN];
> +	/**< Extra arguments for behavior taken */
>  };
> 
>  /**
> --
> 2.5.3

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-25  3:26   ` Wu, Jingjing
@ 2016-02-25  9:11     ` Rahul Lakkireddy
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-25  9:11 UTC (permalink / raw)
  To: Wu, Jingjing; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

Hi Jingjing,

On Wednesday, February 02/24/16, 2016 at 19:26:54 -0800, Wu, Jingjing wrote:
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Rahul Lakkireddy
> > Sent: Wednesday, February 03, 2016 4:32 PM
> > To: dev@dpdk.org
> > Cc: Kumar Sanghvi; Nirranjan Kirubaharan
> > Subject: [dpdk-dev] [PATCH 01/10] ethdev: add a generic flow and new
> > behavior switch to fdir
> > 
> > Add a new raw packet flow that allows specifying generic flow input.
> > 
> > Add the ability to provide masks for fields in flow to allow range of values.
> > 
> > Add a new behavior switch.
> > 
> > Add the ability to provide behavior arguments to allow rewriting matched
> > fields with new values. Ex: allows to provide new ip and port addresses to
> > rewrite the fields of packets matching a filter rule before NAT'ing.
> > 
> > Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> > Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> > ---
> >  doc/guides/rel_notes/release_2_3.rst |  3 +++
> >  lib/librte_ether/rte_eth_ctrl.h      | 15 ++++++++++++++-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
> > 
> > diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
> > index 99de186..19ce954 100644
> > --- a/doc/guides/rel_notes/release_2_3.rst
> > +++ b/doc/guides/rel_notes/release_2_3.rst
> > @@ -39,6 +39,9 @@ API Changes
> >  ABI Changes
> >  -----------
> > 
> > +* New flow type ``RTE_ETH_FLOW_RAW_PKT`` had been introduced and hence
> > +  ``RTE_ETH_FLOW_MAX`` had been increased to 19.
> > +
> > 
> Great to see a raw_pkt_flow type.
> And there is already a flow type "RTE_ETH_FLOW_RAW", it's not necessary to add a new one.

I added the new type only because of your earlier feedback:
http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31386

So, do you want me to revert this change and use RTE_ETH_FLOW_RAW
instead?


Thanks,
Rahul.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-24 22:17         ` Thomas Monjalon
@ 2016-02-25  9:33           ` Rahul Lakkireddy
  2016-02-25 18:24             ` Thomas Monjalon
  0 siblings, 1 reply; 27+ messages in thread
From: Rahul Lakkireddy @ 2016-02-25  9:33 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

Hi Thomas,

On Wednesday, February 02/24/16, 2016 at 14:17:58 -0800, Thomas Monjalon wrote:
> Caution: I truly respect the work done by Chelsio on DPDK.

Thank you. And Chelsio will continue to support DPDK community and will
continue to contribute.

> And I'm sure you can help to build a good filtering API, which
> was mainly designed with Intel needs in mind because it was
> difficult to have opinions of other vendors some time ago.
> That's why it's a chance to have new needs and it would be a shame
> to let it go through a vendor specific backdoor.

I agree that new needs should be raised.

RFC v1 was submitted at:
http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/29986

RFC v2 was submitted at:
http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/30732

I tried to accommodate as many fields as possible to make this as generic as
possible. Also, I followed the review comments given by Jingjing.
I also waited for more review comments before posting this series to
see if there were any objections with the approach.

I have been trying to make this generic for all vendors and not favour
any one over the other.

> 
> 2016-02-25 00:10, Rahul Lakkireddy:
> > Hi Thomas,
> > 
> > On Wednesday, February 02/24/16, 2016 at 07:02:42 -0800, Thomas Monjalon wrote:
> > > 2016-02-24 14:43, Bruce Richardson:
> > > > On Wed, Feb 03, 2016 at 02:02:22PM +0530, Rahul Lakkireddy wrote:
> > > > > Add a new raw packet flow that allows specifying generic flow input.
> > > > > 
> > > > > Add the ability to provide masks for fields in flow to allow range of
> > > > > values.
> > > > > 
> > > > > Add a new behavior switch.
> > > > > 
> > > > > Add the ability to provide behavior arguments to allow rewriting matched
> > > > > fields with new values. Ex: allows to provide new ip and port addresses
> > > > > to rewrite the fields of packets matching a filter rule before NAT'ing.
> > > > > 
> > > > Thomas, any comments as ethdev maintainer?
> > > 
> > > Yes, some comments.
> > > First, there are several different changes in the same patch. It must be split.
> > 
> > Should each structure change be split into a separate patch?
> 
> A patch = a feature.
> The switch action and the flow rule are different things.

Ok. I will split this into separate patches.

> 
> > > Then I don't understand at all the raw flow filter. What is a raw flow?
> > > How behavior_arg must be used?
> > 
> > This was discussed with Jingjing at
> > 
> > http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31471
> 
> Thanks, I missed it.
> 
> > A raw flow provides a generic way for vendors to add their vendor
> > specific input flow.
> 
> Please, "generic" and "vendor specific" in the same sentence.
> It's obviously wrong.

I think this sentence is being misinterpreted.
What I intended to say is: the fields are generic so that any vendor can
hook in. The fields themselves are not vendor specific.


> 
> > In our case, it is possible to match several flows
> > in a single rule.  For example, it's possible to set an ethernet, vlan,
> > ip and tcp/udp flows all in a single rule.  We can specify all of these
> > flows in a single raw input flow, which can then be passed to cxgbe flow
> > director to set the corresponding filter.
> 
> I feel we need to define what is an API.
> If the application wants to call something specific to the NIC, why using
> the ethdev API? You just have to include cxgbe.h.

Well, in that sense, flow director is also very Intel-specific, no?
What we are trying to do is make flow director generic, and we have been
following the review comments on this. If there are better ideas on how
to achieve this, we are open to suggestions/comments and are ready to
re-do and re-submit the series.


> 
> > On similar lines, behavior_arg provides a generic way to pass extra
> > action arguments for matched flows.  For example, in our case, to
> > perform NAT, the new src/dst ip and src/dst port addresses to be
> > re-written for a matched rule can be passed in behavior_arg.
> 
> Yes a kind of void* to give what you want to the driver without the
> convenience of a documented function.

void* approach was taken based on review comments from Jingjing.
And we didn't receive any further comments/objections on that thread.

> 
> I know the support of filters among NICs is really heterogeneous.
> And the DPDK API are not yet generic enough. But please do not give up!
> If the filtering API can be improved to support your cases, please do it.

I am not giving up. If there are better suggestions, then I am willing
to re-do and re-submit the series.
If the approach taken in the RFC v1 series looks more promising, then I can
resurrect that too. However, I will need some direction here so
that it becomes generic and doesn't remain Intel-specific as it is now.

Thanks,
Rahul.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support
  2016-02-24 18:35     ` Rahul Lakkireddy
@ 2016-02-25 13:48       ` Bruce Richardson
  0 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2016-02-25 13:48 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

On Thu, Feb 25, 2016 at 12:05:34AM +0530, Rahul Lakkireddy wrote:
> Hi Bruce,
> 
> On Wednesday, February 02/24/16, 2016 at 06:40:56 -0800, Bruce Richardson wrote:
> > On Wed, Feb 03, 2016 at 02:02:23PM +0530, Rahul Lakkireddy wrote:
> > > Add a new test_cxgbe_filters command line example to test support for
> > > Chelsio T5 hardware filtering. Shows how to pass the Chelsio input flow
> > > and input masks. Also, shows how to pass extra behavior arguments to
> > > rewrite fields in matched filter rules.
> > > 
> > > Also add documentation and update MAINTAINERS.
> > > 
> > > Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> > > Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> > 
> > Hi,
> > 
> > for testing NIC functionality, the "testpmd" app is what is used, and it already
> > contains support for existing flow director functionality. Should the testing
> > functionality not be included there?
> > 
> 
> We initially thought of adding example by extending flow director in
> testpmd itself.  However, based on discussion at [1], we then created
> a separate sample app for this considering each vendor would come up
> with their own sample app.
> 
> [1] http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/31471
>

Thanks for the old conversation link. I don't like the idea of having vendor
specific sample apps, I think the samples should be as generic as possible. If
there is a particular feature that you want to demonstrate in a sample app, then
that is worthy of inclusion - assuming it's possible that in future more than
one vendor's hardware will support the feature.

Within the documentation for the particular NIC or driver, feel free to call out
what features the NIC supports and how those should be configured.

I hope this makes sense. We probably need a documented policy on how to go
about adding new NIC feature support to ensure everyone is in agreement on how
things should be done so as to allow vendors to expose the features of their
hardware, while at the same time allowing users to write applications that can
be generic across the various hardware platforms.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-25  9:33           ` Rahul Lakkireddy
@ 2016-02-25 18:24             ` Thomas Monjalon
  2016-02-26  1:17               ` Wu, Jingjing
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Monjalon @ 2016-02-25 18:24 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Kumar A S, Nirranjan Kirubaharan

2016-02-25 15:03, Rahul Lakkireddy:
> On Wednesday, February 02/24/16, 2016 at 14:17:58 -0800, Thomas Monjalon wrote:
> > > A raw flow provides a generic way for vendors to add their vendor
> > > specific input flow.
> > 
> > Please, "generic" and "vendor specific" in the same sentence.
> > It's obviously wrong.
> 
> I think this sentence is being mis-interpreted.
> What I intended to say is: the fields are generic so that any vendor can
> hook-in. The fields themselves are not vendor specific.

We are trying to push some features into fields of an API instead of
thinking how to make it simple.

> > > In our case, it is possible to match several flows
> > > in a single rule.  For example, it's possible to set an ethernet, vlan,
> > > ip and tcp/udp flows all in a single rule.  We can specify all of these
> > > flows in a single raw input flow, which can then be passed to cxgbe flow
> > > director to set the corresponding filter.
> > 
> > I feel we need to define what is an API.
> > If the application wants to call something specific to the NIC, why using
> > the ethdev API? You just have to include cxgbe.h.
> 
> Well, in that sense, flow-director is also very intel specific, no ?

Yes. I think the term "flow director" comes from Intel.

> What we are trying to do is make flow-director generic

So let's stop calling it flow director.
We are talking about filtering, right?

> and, we have been
> following the review comments on this. If there are better ideas on how
> to achieve this, we are open to suggestions/comments and are ready to
> re-do the series and re-submit also.

My first question: are you happy with the current API?
Do you understand the difference between RTE_ETH_FILTER_ETHERTYPE and
RTE_ETH_FILTER_FDIR with RTE_ETH_FLOW_L2_PAYLOAD?
Do you understand this structure?
enum rte_eth_fdir_status {
    RTE_ETH_FDIR_NO_REPORT_STATUS = 0, /**< Report nothing. */
    RTE_ETH_FDIR_REPORT_ID,            /**< Only report FD ID. */
    RTE_ETH_FDIR_REPORT_ID_FLEX_4,     /**< Report FD ID and 4 flex bytes. */
    RTE_ETH_FDIR_REPORT_FLEX_8,        /**< Report 8 flex bytes. */
};
These values?
enum rte_fdir_mode {
    RTE_FDIR_MODE_NONE      = 0, /**< Disable FDIR support. */
    RTE_FDIR_MODE_SIGNATURE,     /**< Enable FDIR signature filter mode. */
    RTE_FDIR_MODE_PERFECT,       /**< Enable FDIR perfect filter mode. */
    RTE_FDIR_MODE_PERFECT_MAC_VLAN, /**< Enable FDIR filter mode - MAC VLAN. */
    RTE_FDIR_MODE_PERFECT_TUNNEL,   /**< Enable FDIR filter mode - tunnel. */
};

From my point of view, it is insane.
We have put the hardware complexity in the API.
And now you want to put some vendor specific data in some fields
like some black magic recipes.

Why is it so complex? We are talking about packet filtering, not rocket science!

> > I know the support of filters among NICs is really heterogeneous.
> > And the DPDK API are not yet generic enough. But please do not give up!
> > If the filtering API can be improved to support your cases, please do it.
> 
> I am not giving up. If there are better suggestions then, I am willing
> to re-do and re-submit the series.
> If the approach taken in RFC v1 series looks more promising then, I can
> re-surrect that also. However, I will need some direction over here so
> that it becomes generic and doesn't remain intel specific as it is now.

Yes the approach in the RFC was better in the sense it was describing the
fields. But honestly, I'd prefer thinking of filtering from scratch.

What is a hardware filter? (note there is no such doc yet)
It matches a packet against some criteria and takes an action on it.
Simple.
Now the details (it can take weeks or months to list every detail).

Any given hardware implements only a subset of the infinite possible
capabilities. So the API must provide a flag to check a rule/action
capability without really configuring it.

A matching rule must match either every criterion or only some of them (AND/OR operators).
An action is associated with a matching rule.
There can be several matching rules on the same port (Chelsio case?).
Any packet field can be matched (we currently define some of them).

An action can be of different types:
- drop
- switch
- accept in any queue
- accept in a specific queue

Most of the rules give some values to match the fields.
Hash filtering (RSS) specifies only some fields and a key to direct
packets to different queues.
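
A minimal C sketch of the filter model outlined above: criteria combined
with AND/OR, one action per rule, and a capability check that validates a
rule without configuring it. Every name here (filter_rule, filter_caps,
filter_rule_check, filter_rule_match, the enums) is hypothetical; none of
them is an existing DPDK API, and the field list is deliberately tiny.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum filter_field {
	FIELD_ETH_TYPE, FIELD_VLAN_ID, FIELD_IPV4_SRC, FIELD_IPV4_DST,
	FIELD_L4_SRC_PORT, FIELD_L4_DST_PORT, FIELD_MAX
};
enum filter_match_mode { MATCH_ALL, MATCH_ANY }; /* AND / OR */
enum filter_action_type {
	ACTION_DROP, ACTION_SWITCH, ACTION_ACCEPT_ANY_QUEUE,
	ACTION_ACCEPT_QUEUE
};

struct filter_criterion {
	enum filter_field field;
	uint32_t value;
	uint32_t mask; /* only bits set here are compared */
};

struct filter_rule {
	const struct filter_criterion *criteria;
	size_t nb_criteria;
	enum filter_match_mode mode;
	enum filter_action_type action;
	uint16_t queue; /* used only by ACTION_ACCEPT_QUEUE */
};

/* What one device supports, as bitmaps indexed by the enums above. */
struct filter_caps {
	uint32_t fields;
	uint32_t actions;
	uint32_t modes;
};

/* Capability check: 0 if the rule is supported, -1 otherwise.
 * Reads only its arguments, so no device state is touched. */
static int filter_rule_check(const struct filter_caps *caps,
			     const struct filter_rule *rule)
{
	size_t i;

	assert(caps != NULL && rule != NULL);
	if (!(caps->modes & (1u << rule->mode)))
		return -1;
	if (!(caps->actions & (1u << rule->action)))
		return -1;
	for (i = 0; i < rule->nb_criteria; i++)
		if (!(caps->fields & (1u << rule->criteria[i].field)))
			return -1;
	return 0;
}

/* Software model of the match step over already-parsed packet fields. */
static int filter_rule_match(const struct filter_rule *rule,
			     const uint32_t pkt_fields[FIELD_MAX])
{
	size_t i, hits = 0;

	for (i = 0; i < rule->nb_criteria; i++) {
		const struct filter_criterion *c = &rule->criteria[i];

		if (((pkt_fields[c->field] ^ c->value) & c->mask) == 0)
			hits++;
	}
	return rule->mode == MATCH_ALL ? hits == rule->nb_criteria : hits > 0;
}
```

Keeping the check separate from the match lets an application probe what a
port supports before committing any rule to hardware.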

Please, Intel, Chelsio and other vendors, tell us what is wrong above
and let's start a sane discussion on hardware filtering.
More background:
The current API was designed by Intel when they were the only NIC vendor.
Now that there are 8 vendors with different capabilities and that FPGA should
bring even more capabilities, we should be able to build something more
generic while being usable.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-25 18:24             ` Thomas Monjalon
@ 2016-02-26  1:17               ` Wu, Jingjing
  2016-03-03 15:03                 ` Olga Shern
  2016-07-20 10:45                 ` Thomas Monjalon
  0 siblings, 2 replies; 27+ messages in thread
From: Wu, Jingjing @ 2016-02-26  1:17 UTC (permalink / raw)
  To: Thomas Monjalon, Rahul Lakkireddy; +Cc: dev, Kumar A S, Nirranjan Kirubaharan



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Friday, February 26, 2016 2:25 AM
> To: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org; Kumar A S
> <kumaras@chelsio.com>; Nirranjan Kirubaharan <nirranjan@chelsio.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 01/10] ethdev: add a generic flow and new behavior switch
> to fdir
> 
> 2016-02-25 15:03, Rahul Lakkireddy:
> > On Wednesday, February 02/24/16, 2016 at 14:17:58 -0800, Thomas Monjalon wrote:
> > > > A raw flow provides a generic way for vendors to add their vendor
> > > > specific input flow.
> > >
> > > Please, "generic" and "vendor specific" in the same sentence.
> > > It's obviously wrong.
> >
> > I think this sentence is being mis-interpreted.
> > What I intended to say is: the fields are generic so that any vendor can
> > hook-in. The fields themselves are not vendor specific.
> 
> We are trying to push some features into fields of an API instead of
> thinking how to make it simple.
> 
> > > > In our case, it is possible to match several flows
> > > > in a single rule.  For example, it's possible to set an ethernet, vlan,
> > > > ip and tcp/udp flows all in a single rule.  We can specify all of these
> > > > flows in a single raw input flow, which can then be passed to cxgbe flow
> > > > director to set the corresponding filter.
> > >
> > > I feel we need to define what is an API.
> > > If the application wants to call something specific to the NIC, why using
> > > the ethdev API? You just have to include cxgbe.h.
> >
> > Well, in that sense, flow-director is also very intel specific, no ?
> 
> Yes. I think the term "flow director" comes from Intel.
> 
> > What we are trying to do is make flow-director generic
> 
> So let's stop calling it flow director.
> We are talking about filtering, right?
> 
Hi Thomas

Are you suggesting that Chelsio define a new filter type?

> Why is it so complex? We are talking about packet filtering, not rocket science!
>
The complexity is due to different NICs behaving differently :-)
As far as I know, it is common to use user-defined data to pass specific info to the driver.

Even though flow director is a concept from Intel's NICs, I think it is the most generic one compared
with other kinds of filters. So I think that's why Rahul chose it to add their kind of filters.
As far as I know, the enic driver also uses the flow director API to support its filters.

No matter whether the Chelsio NIC filter uses the flow director API or defines a new filter type, I vote for
the change made in struct rte_eth_fdir_input: it provides a RAW flow type,
and there is also a mask field for it, which gives the user a flexible way to configure filters.
Drivers can then parse the raw input to define the filter fields.
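
As a rough sketch of what matching a raw flow against a mask means, the
helper below compares only the bits selected by the mask. The function
name and signature are assumptions made for illustration; they are not
taken from the patch or from any driver. Note that a bit mask expresses
prefixes such as an IP subnet, not arbitrary numeric ranges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when every bit selected by mask[]
 * is identical in pkt[] and value[]. Bits cleared in the mask are
 * "don't care".
 */
static inline bool
raw_flow_match(const uint8_t *pkt, const uint8_t *value,
	       const uint8_t *mask, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if ((pkt[i] ^ value[i]) & mask[i])
			return false;
	}
	return true;
}
```

With value 10.1.0.0 and mask 255.255.0.0, any address in 10.1.0.0/16
matches, while 10.2.x.x does not.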

But for the change made in struct rte_eth_fdir_action, only the SWITCH type is added.
Where to switch? Everything is in behavior_arg[RTE_ETH_BEHAVIOR_ARG_MAX_LEN],
which is opaque to the user. Maybe your previous definition in the RFC makes more sense. It's better to add
a user-defined field, but not for all args.
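
To make the contrast with an opaque byte array concrete, here is a
minimal sketch of switch arguments described as named fields, which an
application can read and a driver can validate. The struct, the flags
and the helper are all hypothetical and are not the layout used in the
RFC or in the patch; the fields simply mirror the rewrites listed in the
cover letter (MAC, VLAN, and the NAT 4-tuple).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags selecting which rewrites the switch action applies. */
enum sw_rewrite {
	SW_REWRITE_SRC_MAC = 1u << 0,
	SW_REWRITE_DST_MAC = 1u << 1,
	SW_REWRITE_VLAN    = 1u << 2,
	SW_REWRITE_NAT     = 1u << 3,
};

#define SW_REWRITE_ALL ((uint32_t)(SW_REWRITE_SRC_MAC | SW_REWRITE_DST_MAC | \
				   SW_REWRITE_VLAN | SW_REWRITE_NAT))

/* Hypothetical typed arguments for a "switch" action. */
struct sw_action_args {
	uint8_t  out_port;      /* port the packet is switched to */
	uint32_t rewrite;       /* bitmap of enum sw_rewrite */
	uint8_t  src_mac[6];
	uint8_t  dst_mac[6];
	uint16_t vlan_id;
	uint32_t nat_src_ip;    /* NAT 4-tuple */
	uint32_t nat_dst_ip;
	uint16_t nat_src_port;
	uint16_t nat_dst_port;
};

/* Reject arguments that no device with nb_ports ports could honour. */
static inline bool
sw_action_args_valid(const struct sw_action_args *a, uint8_t nb_ports)
{
	if (a->out_port >= nb_ports)
		return false;
	if (a->rewrite & ~SW_REWRITE_ALL)
		return false;
	if ((a->rewrite & SW_REWRITE_VLAN) && a->vlan_id > 4095)
		return false;
	return true;
}
```

With named fields, the question "where to switch?" is answered by
out_port, and generic code can check the arguments before they reach a
driver.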

Any better suggestion?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-26  1:17               ` Wu, Jingjing
@ 2016-03-03 15:03                 ` Olga Shern
  2016-07-20 10:45                 ` Thomas Monjalon
  1 sibling, 0 replies; 27+ messages in thread
From: Olga Shern @ 2016-03-03 15:03 UTC (permalink / raw)
  To: Wu, Jingjing, Thomas Monjalon, Rahul Lakkireddy
  Cc: dev, Kumar A S, Nirranjan Kirubaharan

I think what Thomas meant is that we should redesign the Flow Director feature and call it something else. Mellanox calls it "Flow Steering". I agree that "Filtering" may be a more generic name.
We have implemented the Flow Director API in the Mellanox ConnectX-4 PMD (part of the DPDK 16.04 patches), but we did it in a very awkward way to fit the current API, and some Mellanox features are missing from the current Flow Director API.
Therefore I disagree with Jingjing's statement that this API is generic.
Frankly, it is very hard to understand, as Thomas mentioned. I am not sure how DPDK users understand what each function/field means.

Best Regards,
Olga

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 00/10] cxgbe: Add flow director support
  2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
                   ` (10 preceding siblings ...)
  2016-02-22 10:39 ` [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
@ 2016-03-22 13:43 ` Bruce Richardson
  11 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2016-03-22 13:43 UTC (permalink / raw)
  To: Rahul Lakkireddy
  Cc: dev, Kumar Sanghvi, Nirranjan Kirubaharan, thomas.monjalon

On Wed, Feb 03, 2016 at 02:02:21PM +0530, Rahul Lakkireddy wrote:
> This series of patches extend the flow director filter and add support
> for Chelsio T5 hardware filtering capabilities.
> 
> Chelsio T5 supports carrying out filtering in hardware which supports 3
> actions to carry out on a packet which hit a filter viz.
> 
> 1. Action Pass - Packets hitting a filter rule can be directed to a
>    particular RXQ.
> 
> 2. Action Drop - Packets hitting a filter rule are dropped in h/w.
> 
> 3. Action Switch - Packets hitting a filter rule can be switched in h/w
>    from one port to another, without involvement of host.  Also, the
>    action Switch also supports rewrite of src-mac/dst-mac headers as
>    well as rewrite of vlan headers.  It also supports rewrite of IP
>    headers and thereby, supports NAT (Network Address Translation)
>    in h/w.
> 
> Also, each filter rule can optionally support specifying a mask value
> i.e. it's possible to create a filter rule for an entire subnet of IP
> addresses or a range of tcp/udp ports, etc.
> 
> Patch 1 does the following:
> - Adds a new flow RTE_ETH_FLOW_RAW_PKT to allow specifying a generic
>   flow.
> - Adds an additional generic array to rte_eth_fdir_flow to allow
>   specifying generic flow input.
> - Adds an additional mask for the flow input to allow range of values
>   to be matched in the flow input.
> - Adds a new behavior 'switch'.
> - Adds a generic array to hold behavior arguments that can be passed
>   when a particular behavior is taken. For ex: in case of action
>   'switch', pass additional 4-tuple to allow rewriting src/dst ip and
>   port addresses to support NAT'ing.
> 
Patch 1 of this set is not mergeable for 16.04 because there is no agreement on
the way forward for filtering APIs, and because the changes proposed are
not acceptable to the maintainers as-is.
Therefore the whole patchset must be deferred for a later release. I'm updating
the status in patchwork to "Changes Requested".

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir
  2016-02-26  1:17               ` Wu, Jingjing
  2016-03-03 15:03                 ` Olga Shern
@ 2016-07-20 10:45                 ` Thomas Monjalon
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Monjalon @ 2016-07-20 10:45 UTC (permalink / raw)
  To: Wu, Jingjing, Rahul Lakkireddy
  Cc: Richardson, Bruce, dev, Kumar A S, Nirranjan Kirubaharan,
	adrien.mazarguil

2016-02-26 01:17, Wu, Jingjing:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-02-25 15:03, Rahul Lakkireddy:
> > > On Wednesday, February 02/24/16, 2016 at 14:17:58 -0800, Thomas Monjalon wrote:
> > > > > In our case, it is possible to match several flows
> > > > > in a single rule.  For example, it's possible to set an ethernet, vlan,
> > > > > ip and tcp/udp flows all in a single rule.  We can specify all of these
> > > > > flows in a single raw input flow, which can then be passed to cxgbe flow
> > > > > director to set the corresponding filter.
> > > >
> > > > I feel we need to define what is an API.
> > > > If the application wants to call something specific to the NIC, why using
> > > > the ethdev API? You just have to include cxgbe.h.
> > >
> > > Well, in that sense, flow-director is also very intel specific, no ?
> > 
> > Yes. I think the term "flow director" comes from Intel.
> > 
> > > What we are trying to do is make flow-director generic
> > 
> > So let's stop calling it flow director.
> > We are talking about filtering, right?
> 
> Are you suggesting chelsio to define a new filter type?
> 
> > Why is it so complex? We are talking about packet filtering, not rocket science!
> >
> The complex is due to different NICs different behavior :-)
> As I know, it is a common way to use used-define data pass specific infor to driver.
> 
> Even flow director is concept from Intel's NIC, but I think it is the generic one comparing
> with other kinds of filters. So I think that's why Rahul choose it to add their kind of filters.
> As I know enic driver also uses flow director API to support their filters.
[...]
> Any better suggestion?

We have a more generic proposal now:
	http://dpdk.org/ml/archives/dev/2016-July/043365.html
Rahul, does it fit your needs?

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-07-20 10:45 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-03  8:32 [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 01/10] ethdev: add a generic flow and new behavior switch to fdir Rahul Lakkireddy
2016-02-24 14:43   ` Bruce Richardson
2016-02-24 15:02     ` Thomas Monjalon
2016-02-24 18:40       ` Rahul Lakkireddy
2016-02-24 22:17         ` Thomas Monjalon
2016-02-25  9:33           ` Rahul Lakkireddy
2016-02-25 18:24             ` Thomas Monjalon
2016-02-26  1:17               ` Wu, Jingjing
2016-03-03 15:03                 ` Olga Shern
2016-07-20 10:45                 ` Thomas Monjalon
2016-02-25  3:26   ` Wu, Jingjing
2016-02-25  9:11     ` Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 02/10] examples/test-cxgbe-filters: add example to test cxgbe fdir support Rahul Lakkireddy
2016-02-24 14:40   ` Bruce Richardson
2016-02-24 18:35     ` Rahul Lakkireddy
2016-02-25 13:48       ` Bruce Richardson
2016-02-03  8:32 ` [PATCH 03/10] cxgbe: add skeleton to add support for T5 hardware filtering Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 04/10] cxgbe: add control txq for communicating filtering info Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 05/10] cxgbe: add compressed local IP table for matching IPv6 addresses Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 06/10] cxgbe: add layer 2 table for switch action filter Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 07/10] cxgbe: add source mac " Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 08/10] cxgbe: add LE-TCAM filtering support Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 09/10] cxgbe: add HASH " Rahul Lakkireddy
2016-02-03  8:32 ` [PATCH 10/10] cxgbe: add flow director support and update documentation Rahul Lakkireddy
2016-02-22 10:39 ` [PATCH 00/10] cxgbe: Add flow director support Rahul Lakkireddy
2016-03-22 13:43 ` Bruce Richardson
