All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@idosch.org>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, jiri@mellanox.com, mlxsw@mellanox.com,
	dsahern@gmail.com, roopa@cumulusnetworks.com,
	nikolay@cumulusnetworks.com, andy@greyhouse.net,
	pablo@netfilter.org, jakub.kicinski@netronome.com,
	pieter.jansenvanvuuren@netronome.com, andrew@lunn.ch,
	f.fainelli@gmail.com, vivien.didelot@gmail.com,
	Ido Schimmel <idosch@mellanox.com>
Subject: [PATCH net-next 04/11] Documentation: Add devlink-trap documentation
Date: Sun,  7 Jul 2019 10:58:21 +0300	[thread overview]
Message-ID: <20190707075828.3315-5-idosch@idosch.org> (raw)
In-Reply-To: <20190707075828.3315-1-idosch@idosch.org>

From: Ido Schimmel <idosch@mellanox.com>

Add initial documentation of the devlink-trap mechanism, explaining the
background, motivation and the semantics of the interface.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 Documentation/networking/devlink-trap.rst | 179 ++++++++++++++++++++++
 Documentation/networking/index.rst        |   1 +
 include/net/devlink.h                     |   6 +
 3 files changed, 186 insertions(+)
 create mode 100644 Documentation/networking/devlink-trap.rst

diff --git a/Documentation/networking/devlink-trap.rst b/Documentation/networking/devlink-trap.rst
new file mode 100644
index 000000000000..4a4fd42f5372
--- /dev/null
+++ b/Documentation/networking/devlink-trap.rst
@@ -0,0 +1,179 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Devlink Trap
+============
+
+Background
+==========
+
+Devices capable of offloading the kernel's datapath and perform functions such
+as bridging and routing must also be able to send specific packets to the
+kernel (i.e., the CPU) for processing.
+
+For example, a device acting as a multicast-aware bridge must be able to send
+IGMP membership reports to the kernel for processing by the bridge module.
+Without processing such packets, the bridge module could never populate its
+MDB.
+
+As another example, consider a device acting as router which has received an IP
+packet with a TTL of 1. Upon routing the packet the device must send it to the
+kernel so that it will route it as well and generate an ICMP Time Exceeded
+error datagram. Without letting the kernel route such packets itself, utilities
+such as ``traceroute`` could never work.
+
+The fundamental ability of sending certain packets to the kernel for processing
+is called "packet trapping".
+
+Overview
+========
+
+The ``devlink-trap`` mechanism allows capable device drivers to register their
+supported packet traps with ``devlink`` and report trapped packets to
+``devlink`` for further analysis.
+
+Upon receiving trapped packets, ``devlink`` will perform a per-trap packets and
+bytes accounting and potentially report the packet to user space via a netlink
+event along with all the provided metadata (e.g., trap reason, timestamp, input
+port). This is especially useful for drop traps (see :ref:`Trap-Types`)
+as it allows users to obtain further visibility into packet drops that would
+otherwise be invisible.
+
+The following diagram provides a general overview of ``devlink-trap``::
+
+                                    Netlink event: Packet w/ metadata
+                                  ^
+                                  |
+         Userspace                |
+	+---------------------------------------------------+
+	 Kernel                   |
+				  |
+			     +----+----+
+			     |         |      Kernel's Rx path
+			     | devlink |      (non-drop traps)
+			     |         |
+			     +----^----+      ^
+				  |           |
+				  +-----------+
+				  |
+			  +-------+-------+
+			  |               |
+			  | Device driver |
+			  |               |
+			  +-------^-------+
+	 Kernel                   |
+	+---------------------------------------------------+
+	 Hardware                 |
+				  | Trapped packet
+				  |
+			       +--+---+
+			       |      |
+			       | ASIC |
+			       |      |
+			       +------+
+
+.. _Trap-Types:
+
+Trap Types
+==========
+
+The ``devlink-trap`` mechanism supports the following packet trap types:
+
+  * ``drop``: Trapped packets were dropped by the underlying device. Packets
+    are only processed by ``devlink`` and not injected to the kernel's Rx path.
+    The trap action (see :ref:`Trap-Actions`) can be changed.
+  * ``exception``: Trapped packets were not forwarded as intended by the
+    underlying device due to an exception (e.g., TTL error, missing neighbour
+    entry) and trapped to the control plane for resolution. Packets are
+    processed by ``devlink`` and injected to the kernel's Rx path. Changing the
+    action of such traps is not allowed, as it can easily break the control
+    plane.
+
+.. _Trap-Actions:
+
+Trap Actions
+============
+
+The ``devlink-trap`` mechanism supports the following packet trap actions:
+
+  * ``trap``: The sole copy of the packet is sent to the CPU.
+  * ``drop``: The packet is dropped by the underlying device and a copy is not
+    sent to the CPU.
+
+Generic Packet Traps
+====================
+
+Generic packet traps are used to describe traps that trap well-defined packets
+or packets that are trapped due to well-defined conditions (e.g., TTL error).
+Such traps can be shared by multiple device drivers and their description must
+be added to the following table:
+
+.. list-table:: List of Generic Packet Traps
+   :widths: 5 5 90
+
+   * - Name
+     - Type
+     - Description
+   * - ``source_mac_is_multicast``
+     - ``drop``
+     - Traps incoming packets that the device decided to drop because of a
+       multicast source MAC
+   * - ``vlan_tag_mismatch``
+     - ``drop``
+     - Traps incoming packets that the device decided to drop in case of VLAN
+       tag mismatch: The ingress bridge port is not configured with a PVID and
+       the packet is untagged or prio-tagged
+   * - ``ingress_vlan_filter``
+     - ``drop``
+     - Traps incoming packets that the device decided to drop in case they are
+       tagged with a VLAN that is not configured on the ingress bridge port
+   * - ``ingress_spanning_tree_filter``
+     - ``drop``
+     - Traps incoming packets that the device decided to drop in case the STP
+       state of the ingress bridge port is not "forwarding"
+   * - ``port_list_is_empty``
+     - ``drop``
+     - Traps packets that the device decided to drop in case they need to be
+       flooded and the flood list is empty
+   * - ``port_loopback_filter``
+     - ``drop``
+     - Traps packets that the device decided to drop in case after layer 2
+       forwarding the only port from which they should be transmitted through
+       is the port from which they were received
+   * - ``blackhole_route``
+     - ``drop``
+     - Traps packets that the device decided to drop in case they hit a
+       blackhole route
+   * - ``ttl_value_is_too_small``
+     - ``exception``
+     - Traps unicast packets that should be forwarded by the device whose TTL
+       was decremented to 0 or less
+   * - ``tail_drop``
+     - ``drop``
+     - Traps packets that the device decided to drop because they could not be
+       enqueued to a transmission queue which is full
+
+Generic Packet Trap Groups
+==========================
+
+Generic packet trap groups are used to aggregate logically related packet
+traps. These groups allow the user to batch operations such as setting the
+trap action and report state of all member traps. In addition, ``devlink-trap``
+can report aggregated per-group packets and bytes statistics, in case per-trap
+statistics are too narrow. The description of these groups must be added to the
+following table:
+
+.. list-table:: List of Generic Packet Trap Groups
+   :widths: 10 90
+
+   * - Name
+     - Description
+   * - ``l2_drops``
+     - Contains packet traps for packets that were dropped by the device during
+       layer 2 forwarding (i.e., bridge)
+   * - ``l3_drops``
+     - Contains packet traps for packets that were dropped by the device or hit
+       an exception (e.g., TTL error) during layer 3 forwarding
+   * - ``buffer_drops``
+     - Contains packet traps for packets that were dropped by the device due to
+       an enqueue decision
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index a46fca264bee..86a814e4d450 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -14,6 +14,7 @@ Contents:
    device_drivers/index
    dsa/index
    devlink-info-versions
+   devlink-trap
    ieee802154
    kapi
    z8530book
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 9121eb45b1c0..5a578750f6b4 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -520,6 +520,9 @@ struct devlink_trap {
 	u32 metadata_cap;
 };
 
+/* All traps must be documented in
+ * Documentation/networking/devlink-trap.rst
+ */
 enum devlink_trap_generic_id {
 	DEVLINK_TRAP_GENERIC_ID_SMAC_MC,
 	DEVLINK_TRAP_GENERIC_ID_VLAN_TAG_MISMATCH,
@@ -536,6 +539,9 @@ enum devlink_trap_generic_id {
 	DEVLINK_TRAP_GENERIC_ID_MAX = __DEVLINK_TRAP_GENERIC_ID_MAX - 1,
 };
 
+/* All trap groups must be documented in
+ * Documentation/networking/devlink-trap.rst
+ */
 enum devlink_trap_group_generic_id {
 	DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS,
 	DEVLINK_TRAP_GROUP_GENERIC_ID_L3_DROPS,
-- 
2.20.1


  parent reply	other threads:[~2019-07-07  8:00 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-07  7:58 [PATCH net-next 00/11] Add drop monitor for offloaded data paths Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 01/11] devlink: Create helper to fill port type information Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 02/11] devlink: Add packet trap infrastructure Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 03/11] devlink: Add generic packet traps and groups Ido Schimmel
2019-07-07  7:58 ` Ido Schimmel [this message]
2019-07-07  7:58 ` [PATCH net-next 05/11] netdevsim: Add devlink-trap support Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 06/11] Documentation: Add description of netdevsim traps Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 07/11] mlxsw: core: Add API to set trap action Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 08/11] mlxsw: reg: Add new " Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 09/11] mlxsw: Add layer 2 discard trap IDs Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 10/11] mlxsw: Add trap group for layer 2 discards Ido Schimmel
2019-07-07  7:58 ` [PATCH net-next 11/11] mlxsw: spectrum: Add devlink-trap support Ido Schimmel
2019-07-07  8:01 ` [PATCH iproute2-next 0/7] " Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 1/7] devlink: Increase number of supported options Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 2/7] devlink: Add devlink trap set and show commands Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 3/7] devlink: Add devlink trap group " Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 4/7] devlink: Add devlink trap monitor support Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 5/7] devlink: Set NETLINK_NO_ENOBUFS when monitoring events Ido Schimmel
2019-07-07  8:01   ` [PATCH iproute2-next 6/7] devlink: Add fflush() to print functions Ido Schimmel
2019-07-07  8:02   ` [PATCH iproute2-next 7/7] devlink: Add man page for devlink-trap Ido Schimmel
2019-07-07  8:03 ` [RFC PATCH net-next 0/5] selftests: Add devlink-trap selftests Ido Schimmel
2019-07-07  8:03   ` [RFC PATCH net-next 1/5] selftests: devlink_trap: Add test cases for devlink-trap Ido Schimmel
2019-07-07  8:03   ` [RFC PATCH net-next 2/5] Documentation: Add a section for devlink-trap testing Ido Schimmel
2019-07-07  8:03   ` [RFC PATCH net-next 3/5] selftests: forwarding: devlink_lib: Add devlink-trap helpers Ido Schimmel
2019-07-07  8:03   ` [RFC PATCH net-next 4/5] selftests: mlxsw: Add test cases for devlink-trap L2 drops Ido Schimmel
2019-07-07  8:03   ` [RFC PATCH net-next 5/5] selftests: mlxsw: Add a test case for devlink-trap Ido Schimmel
2019-07-07  8:15 ` [PATCH net-next 00/11] Add drop monitor for offloaded data paths Ido Schimmel
2019-07-07 19:45 ` David Miller
2019-07-08 13:19   ` Ido Schimmel
2019-07-08 22:51     ` Jakub Kicinski
2019-07-09 12:38       ` Ido Schimmel
2019-07-09 22:34         ` Jakub Kicinski
2019-07-10 11:20           ` Ido Schimmel
2019-07-10 11:39             ` Toke Høiland-Jørgensen
2019-07-11 12:39   ` Ido Schimmel
2019-07-11 19:02     ` David Miller
2019-07-11 23:53     ` Neil Horman
2019-07-12  3:40       ` Florian Fainelli
2019-07-12 12:05         ` Neil Horman
2019-07-12  9:27       ` Toke Høiland-Jørgensen
2019-07-12 12:18         ` Neil Horman
2019-07-12 12:33           ` Toke Høiland-Jørgensen
2019-07-13  0:40             ` Neil Horman
2019-07-13  8:07               ` Toke Høiland-Jørgensen
2019-07-12 13:52       ` Ido Schimmel
2019-07-14 11:29         ` Neil Horman
2019-07-14 12:43           ` Ido Schimmel
2019-07-14  2:38     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190707075828.3315-5-idosch@idosch.org \
    --to=idosch@idosch.org \
    --cc=andrew@lunn.ch \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=f.fainelli@gmail.com \
    --cc=idosch@mellanox.com \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@mellanox.com \
    --cc=mlxsw@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=pablo@netfilter.org \
    --cc=pieter.jansenvanvuuren@netronome.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.