All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-dev] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-23  7:35 ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-23  7:35 UTC (permalink / raw)
  To: virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Parav Pandit, Jason Wang, Yuri Benditovich,
	Xuan Zhuo

1. Currently, a received encapsulated packet has an outer and an inner header, but
the virtio device is unable to calculate the hash for the inner header. The same
flow can traverse through different tunnels, resulting in the encapsulated
packets being spread across multiple receive queues (refer to the figure below).
However, in certain scenarios, we may need to direct these encapsulated packets of
the same flow to a single receive queue. This facilitates the processing
of the flow by the same CPU to improve performance (warm caches, less locking, etc.).

               client1                    client2
                  |        +-------+         |
                  +------->|tunnels|<--------+
                           +-------+
                              |  |
                              v  v
                      +-----------------+
                      | monitoring host |
                      +-----------------+

To achieve this, the device can calculate a symmetric hash based on the inner headers
of the same flow.

2. For legacy systems, they may lack entropy fields which modern protocols have in
the outer header, resulting in multiple flows with the same outer header but
different inner headers being directed to the same receive queue. This results in
poor receive performance.

To address this limitation, inner header hash can be used to enable the device to advertise
the capability to calculate the hash for the inner packet, regaining better receive performance.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v12->v13:
	1. Add a GET command for hash_tunnel_types. @Parav Pandit
	2. Add tunneling protocol explanation. @Jason Wang
	3. Add comments on some usage scenarios for inner hash.

v11->v12:
	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
	2. Refine the commit log. @Michael S . Tsirkin
	3. Add some tunnel types.

v10->v11:
	1. Revise commit log for clarity for readers.
	2. Some modifications to avoid undefined terms. @Parav Pandit
	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
	4. Add the normative statements. @Parav Pandit

v9->v10:
	1. Removed hash_report_tunnel related information. @Parav Pandit
	2. Re-describe the limitations of QoS for tunneling.
	3. Some clarification.

v8->v9:
	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
	2. Add tunnel security section. @Michael S . Tsirkin
	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
	4. Fix some typos.
	5. Add more tunnel types. @Michael S . Tsirkin

v7->v8:
	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
	3. Removed re-definition for inner packet hashing. @Parav Pandit
	4. Fix some typos. @Michael S . Tsirkin
	5. Clarify some sentences. @Michael S . Tsirkin

v6->v7:
	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
	2. Fix some syntax issues. @Michael S. Tsirkin

v5->v6:
	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
	3. Move the links to introduction section. @Michael S. Tsirkin
	4. Clarify some sentences. @Michael S. Tsirkin

v4->v5:
	1. Clarify some paragraphs. @Cornelia Huck
	2. Fix the u8 type. @Cornelia Huck

v3->v4:
	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin

v2->v3:
	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin

v1->v2:
	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
	2. Clarify some paragraphs. @Jason Wang
	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich

 device-types/net/description.tex        | 159 ++++++++++++++++++++++++
 device-types/net/device-conformance.tex |   1 +
 device-types/net/driver-conformance.tex |   1 +
 introduction.tex                        |  44 +++++++
 4 files changed, 205 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 0500bb6..48e41f1 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
+    for tunnel-encapsulated packets.
+
 \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
 
 \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
@@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
 \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
 \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
 \end{description}
 
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
@@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
         u8 rss_max_key_size;
         le16 rss_max_indirection_table_length;
         le32 supported_hash_types;
+        le32 supported_tunnel_hash_types;
 };
 \end{lstlisting}
 The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
@@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
 Field \field{supported_hash_types} contains the bitmask of supported hash types.
 See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
 
+Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
+
+Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
+
 \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
 
 The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
@@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
+\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
 \end{itemize}
@@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was not negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
+\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
 \end{itemize}
@@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 
 \subparagraph{Supported/enabled hash types}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
+This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
 Hash types applicable for IPv4 packets:
 \begin{lstlisting}
 #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
@@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
 \end{itemize}
 
+\paragraph{Inner Header Hash}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
+
+If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
+commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
+
+struct virtio_net_hash_tunnel_config {
+    le32 hash_tunnel_types;
+};
+
+#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
+ #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
+ #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
+
+Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
+defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
+
+The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
+\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
+\end{itemize}
+
+For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
+For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
+
+\subparagraph{Tunnel/Encapsulated packet}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
+
+A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
+encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
+the device calculates the hash over either the inner header or the outer header.
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
+configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
+
+Supported encapsulated packet types:
+\begin{itemize}
+\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
+\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
+\end{itemize}
+
+If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
+the hash of the outer header is calculated for the received encapsulated packet.
+
+The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
+
+\subparagraph{Supported/enabled encapsulation hash types}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
+
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
+\end{lstlisting}
+
+Supported encapsulation hash types:
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
+\end{lstlisting}
+
+\subparagraph{Advice}
+Usage scenarios of inner header hash (but not limited to):
+\begin{itemize}
+\item Legacy tunneling protocols that lack entropy in the outer header use inner header hash to hash flows
+      with the same outer header but different inner headers to different queues for better receiving performance.
+\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
+      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
+\item For modern tunneling protocols, in some scenarios, the entropy in the outer header is insufficient, for example,
+      the external destination port number is fixed. Inner header hash can be used to increase the entropy to
+      regain receiving performance.
+\end{itemize}
+
+For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
+A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
+in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
+might be violated to a certain extent, depending on how the hash is used. When the hash use is limited to RSS queue
+selection, inner header hash may have quality of service (QoS) limitations.
+
+Possible mitigations:
+\begin{itemize}
+\item Use a tool with good forwarding performance to keep the receive queue from filling up.
+\item If the QoS is unavailable, the driver can set \field{hash_tunnel_types} to VIRTIO_NET_HASH_TUNNEL_TYPE_NONE
+      to disable inner header hash for encapsulated packets.
+\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
+\end{itemize}
+
+\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The device MUST calculate the hash on the outer header if the type of the received encapsulated packet does not match any value of the \field{hash_tunnel_types}.
+
+The device MUST respond to the VIRTIO_NET_CTRL_TUNNEL_HASH_SET command with VIRTIO_NET_ERR if the device received an unrecognized or unsupported VIRTIO_NET_HASH_TUNNEL_TYPE_ flag.
+
+Upon reset, the device MUST initialize \field{hash_tunnel_type} to 0.
+
+\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET.
+
+The driver MUST ignore the values of \field{hash_tunnel_types} received from the VIRTIO_NET_CTRL_TUNNEL_HASH_GET command if the device responds with VIRTIO_NET_ERR.
+
+The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
+
 \paragraph{Hash reporting for incoming packets}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
 
diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
index 54f6783..f88f48b 100644
--- a/device-types/net/device-conformance.tex
+++ b/device-types/net/device-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
index 97d0cc1..9d853d9 100644
--- a/device-types/net/driver-conformance.tex
+++ b/device-types/net/driver-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/introduction.tex b/introduction.tex
index 287c5fc..36b620f 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -99,6 +99,50 @@ \section{Normative References}\label{sec:Normative References}
     Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
 	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
 
+	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
+    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
+	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
+	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
+    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
+    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
+	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
+    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
+    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
+	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
+    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
+    This GRE-in-UDP encapsulation allows the UDP source port field to be used as an entropy field. This protocol is specified for IPv4 and IPv6,
+    and used as either the payload or delivery protocol.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
+	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
+    Virtual eXtensible Local Area Network.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
+	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
+    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
+	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
+	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
+    Generic Network Virtualization Encapsulation.
+	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
+	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
+    IP Encapsulation within IP.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
+	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
+    NVGRE: Network Virtualization Using Generic Routing Encapsulation
+	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
+	\phantomsection\label{intro:stt}\textbf{[STT]} &
+    Stateless Transport Tunneling. STT is particularly useful when some tunnel endpoints are in end-systems, as it utilizes the capabilities
+    of the network interface card to improve performance.
+	\newline\url{https://www.ietf.org/archive/id/draft-davie-stt-08.txt}\\
+	\phantomsection\label{intro:IP}\textbf{[IP]} &
+    INTERNET PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
+	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
+    User Datagram Protocol
+	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
+	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
+    TRANSMISSION CONTROL PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
 \end{longtable}
 
 \section{Non-Normative References}
-- 
2.19.1.6.gb485710b


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-23  7:35 ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-23  7:35 UTC (permalink / raw)
  To: virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Parav Pandit, Jason Wang, Yuri Benditovich,
	Xuan Zhuo

1. Currently, a received encapsulated packet has an outer and an inner header, but
the virtio device is unable to calculate the hash for the inner header. The same
flow can traverse through different tunnels, resulting in the encapsulated
packets being spread across multiple receive queues (refer to the figure below).
However, in certain scenarios, we may need to direct these encapsulated packets of
the same flow to a single receive queue. This facilitates the processing
of the flow by the same CPU to improve performance (warm caches, less locking, etc.).

               client1                    client2
                  |        +-------+         |
                  +------->|tunnels|<--------+
                           +-------+
                              |  |
                              v  v
                      +-----------------+
                      | monitoring host |
                      +-----------------+

To achieve this, the device can calculate a symmetric hash based on the inner headers
of the same flow.

2. For legacy systems, they may lack entropy fields which modern protocols have in
the outer header, resulting in multiple flows with the same outer header but
different inner headers being directed to the same receive queue. This results in
poor receive performance.

To address this limitation, inner header hash can be used to enable the device to advertise
the capability to calculate the hash for the inner packet, regaining better receive performance.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v12->v13:
	1. Add a GET command for hash_tunnel_types. @Parav Pandit
	2. Add tunneling protocol explanation. @Jason Wang
	3. Add comments on some usage scenarios for inner hash.

v11->v12:
	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
	2. Refine the commit log. @Michael S . Tsirkin
	3. Add some tunnel types.

v10->v11:
	1. Revise commit log for clarity for readers.
	2. Some modifications to avoid undefined terms. @Parav Pandit
	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
	4. Add the normative statements. @Parav Pandit

v9->v10:
	1. Removed hash_report_tunnel related information. @Parav Pandit
	2. Re-describe the limitations of QoS for tunneling.
	3. Some clarification.

v8->v9:
	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
	2. Add tunnel security section. @Michael S . Tsirkin
	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
	4. Fix some typos.
	5. Add more tunnel types. @Michael S . Tsirkin

v7->v8:
	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
	3. Removed re-definition for inner packet hashing. @Parav Pandit
	4. Fix some typos. @Michael S . Tsirkin
	5. Clarify some sentences. @Michael S . Tsirkin

v6->v7:
	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
	2. Fix some syntax issues. @Michael S. Tsirkin

v5->v6:
	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
	3. Move the links to introduction section. @Michael S. Tsirkin
	4. Clarify some sentences. @Michael S. Tsirkin

v4->v5:
	1. Clarify some paragraphs. @Cornelia Huck
	2. Fix the u8 type. @Cornelia Huck

v3->v4:
	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin

v2->v3:
	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin

v1->v2:
	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
	2. Clarify some paragraphs. @Jason Wang
	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich

 device-types/net/description.tex        | 159 ++++++++++++++++++++++++
 device-types/net/device-conformance.tex |   1 +
 device-types/net/driver-conformance.tex |   1 +
 introduction.tex                        |  44 +++++++
 4 files changed, 205 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 0500bb6..48e41f1 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
+    for tunnel-encapsulated packets.
+
 \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
 
 \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
@@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
 \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
 \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
 \end{description}
 
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
@@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
         u8 rss_max_key_size;
         le16 rss_max_indirection_table_length;
         le32 supported_hash_types;
+        le32 supported_tunnel_hash_types;
 };
 \end{lstlisting}
 The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
@@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
 Field \field{supported_hash_types} contains the bitmask of supported hash types.
 See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
 
+Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
+
+Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
+
 \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
 
 The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
@@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
+\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
 \end{itemize}
@@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was not negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
+\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
 \end{itemize}
@@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 
 \subparagraph{Supported/enabled hash types}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
+This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
 Hash types applicable for IPv4 packets:
 \begin{lstlisting}
 #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
@@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
 \end{itemize}
 
+\paragraph{Inner Header Hash}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
+
+If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
+commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
+
+struct virtio_net_hash_tunnel_config {
+    le32 hash_tunnel_types;
+};
+
+#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
+ #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
+ #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
+
+Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
+defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
+
+The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
+\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
+\end{itemize}
+
+For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
+For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
+
+\subparagraph{Tunnel/Encapsulated packet}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
+
+A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
+encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
+the device calculates the hash over either the inner header or the outer header.
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
+configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
+
+Supported encapsulated packet types:
+\begin{itemize}
+\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
+\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
+\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
+\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
+\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
+\end{itemize}
+
+If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
+the hash of the outer header is calculated for the received encapsulated packet.
+
+The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
+
+\subparagraph{Supported/enabled encapsulation hash types}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
+
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
+\end{lstlisting}
+
+Supported encapsulation hash types:
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
+\end{lstlisting}
+Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
+\end{lstlisting}
+
+\subparagraph{Advice}
+Usage scenarios of inner header hash (but not limited to):
+\begin{itemize}
+\item Legacy tunneling protocols that lack entropy in the outer header use inner header hash to hash flows
+      with the same outer header but different inner headers to different queues for better receiving performance.
+\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
+      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
+\item For modern tunneling protocols, in some scenarios, the entropy in the outer header is insufficient, for example,
+      the external destination port number is fixed. Inner header hash can be used to increase the entropy to
+      regain receiving performance.
+\end{itemize}
+
+For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
+A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
+in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
+might be violated to a certain extent, depending on how the hash is used. When the hash use is limited to RSS queue
+selection, inner header hash may have quality of service (QoS) limitations.
+
+Possible mitigations:
+\begin{itemize}
+\item Use a tool with good forwarding performance to keep the receive queue from filling up.
+\item If the QoS is unavailable, the driver can set \field{hash_tunnel_types} to VIRTIO_NET_HASH_TUNNEL_TYPE_NONE
+      to disable inner header hash for encapsulated packets.
+\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
+\end{itemize}
+
+\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The device MUST calculate the hash on the outer header if the type of the received encapsulated packet does not match any value of the \field{hash_tunnel_types}.
+
+The device MUST respond to the VIRTIO_NET_CTRL_TUNNEL_HASH_SET command with VIRTIO_NET_ERR if the device received an unrecognized or unsupported VIRTIO_NET_HASH_TUNNEL_TYPE_ flag.
+
+Upon reset, the device MUST initialize \field{hash_tunnel_type} to 0.
+
+\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET.
+
+The driver MUST ignore the values of \field{hash_tunnel_types} received from the VIRTIO_NET_CTRL_TUNNEL_HASH_GET command if the device responds with VIRTIO_NET_ERR.
+
+The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
+
 \paragraph{Hash reporting for incoming packets}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
 
diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
index 54f6783..f88f48b 100644
--- a/device-types/net/device-conformance.tex
+++ b/device-types/net/device-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
index 97d0cc1..9d853d9 100644
--- a/device-types/net/driver-conformance.tex
+++ b/device-types/net/driver-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/introduction.tex b/introduction.tex
index 287c5fc..36b620f 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -99,6 +99,50 @@ \section{Normative References}\label{sec:Normative References}
     Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
 	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
 
+	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
+    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
+	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
+	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
+    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
+    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
+	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
+    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
+    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
+	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
+    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
+    This GRE-in-UDP encapsulation allows the UDP source port field to be used as an entropy field. This protocol is specified for IPv4 and IPv6,
+    and used as either the payload or delivery protocol.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
+	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
+    Virtual eXtensible Local Area Network.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
+	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
+    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
+	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
+	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
+    Generic Network Virtualization Encapsulation.
+	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
+	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
+    IP Encapsulation within IP.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
+	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
+    NVGRE: Network Virtualization Using Generic Routing Encapsulation
+	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
+	\phantomsection\label{intro:stt}\textbf{[STT]} &
+    Stateless Transport Tunneling. STT is particularly useful when some tunnel endpoints are in end-systems, as it utilizes the capabilities
+    of the network interface card to improve performance.
+	\newline\url{https://www.ietf.org/archive/id/draft-davie-stt-08.txt}\\
+	\phantomsection\label{intro:IP}\textbf{[IP]} &
+    INTERNET PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
+	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
+    User Datagram Protocol
+	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
+	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
+    TRANSMISSION CONTROL PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
 \end{longtable}
 
 \section{Non-Normative References}
-- 
2.19.1.6.gb485710b


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-23  7:35 ` [virtio-comment] " Heng Qi
@ 2023-04-25 20:28   ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-25 20:28 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



On 4/23/2023 3:35 AM, Heng Qi wrote:
>   
>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>           u8 rss_max_key_size;
>           le16 rss_max_indirection_table_length;
>           le32 supported_hash_types;
> +        le32 supported_tunnel_hash_types;
>   };
In v12 I was asking this to move to above field from the config area to 
the GET command in comment [1] as,

"With that no need to define two fields at two different places in 
config area and also in cvq."

I am sorry if that was not clear enough.

[1] 
https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/

>   \subparagraph{Supported/enabled hash types}
>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>   Hash types applicable for IPv4 packets:
>   \begin{lstlisting}
>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>   \end{itemize}
>   
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> +
> +struct virtio_net_hash_tunnel_config {
Please move field from the config struct to here. Both are RO fields.

le32 supported_hash_tunnel_types;
> +    le32 hash_tunnel_types;
> +};
> +
> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> +
> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> +
> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> +\end{itemize}
> +
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> +
You need to split the structures to two, one for get and one for set in 
above description as get and set contains different fields.
> +
> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> +the hash of the outer header is calculated for the received encapsulated packet.
> +
> +
> +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
You wanted to say inner here like rest of the places.

s/internal header/inner header

> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
Multiple flags so,

s/flags that are/flags which are/

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-25 20:28   ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-25 20:28 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



On 4/23/2023 3:35 AM, Heng Qi wrote:
>   
>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>           u8 rss_max_key_size;
>           le16 rss_max_indirection_table_length;
>           le32 supported_hash_types;
> +        le32 supported_tunnel_hash_types;
>   };
In v12 I was asking this to move to above field from the config area to 
the GET command in comment [1] as,

"With that no need to define two fields at two different places in 
config area and also in cvq."

I am sorry if that was not clear enough.

[1] 
https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/

>   \subparagraph{Supported/enabled hash types}
>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>   Hash types applicable for IPv4 packets:
>   \begin{lstlisting}
>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>   \end{itemize}
>   
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> +
> +struct virtio_net_hash_tunnel_config {
Please move field from the config struct to here. Both are RO fields.

le32 supported_hash_tunnel_types;
> +    le32 hash_tunnel_types;
> +};
> +
> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> +
> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> +
> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> +\end{itemize}
> +
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> +
You need to split the structures to two, one for get and one for set in 
above description as get and set contains different fields.
> +
> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> +the hash of the outer header is calculated for the received encapsulated packet.
> +
> +
> +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
You wanted to say inner here like rest of the places.

s/internal header/inner header

> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
Multiple flags so,

s/flags that are/flags which are/

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-23  7:35 ` [virtio-comment] " Heng Qi
@ 2023-04-25 21:03   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-25 21:03 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Sun, Apr 23, 2023 at 03:35:32PM +0800, Heng Qi wrote:
> 1. Currently, a received encapsulated packet has an outer and an inner header, but
> the virtio device is unable to calculate the hash for the inner header. The same
> flow can traverse through different tunnels, resulting in the encapsulated
> packets being spread across multiple receive queues (refer to the figure below).
> However, in certain scenarios, we may need to direct these encapsulated packets of
> the same flow to a single receive queue. This facilitates the processing
> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> 
>                client1                    client2
>                   |        +-------+         |
>                   +------->|tunnels|<--------+
>                            +-------+
>                               |  |
>                               v  v
>                       +-----------------+
>                       | monitoring host |
>                       +-----------------+
> 
> To achieve this, the device can calculate a symmetric hash based on the inner headers
> of the same flow.
> 
> 2. For legacy systems, they may lack entropy fields which modern protocols have in
> the outer header, resulting in multiple flows with the same outer header but
> different inner headers being directed to the same receive queue. This results in
> poor receive performance.
> 
> To address this limitation, inner header hash can be used to enable the device to advertise
> the capability to calculate the hash for the inner packet, regaining better receive performance.
> 
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

grammar in new text is still pretty bad, lots of typos too.
Don't have time to fix it for you right now sorry, it's
a holiday here.

> ---
> v12->v13:
> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> 	2. Add tunneling protocol explanation. @Jason Wang
> 	3. Add comments on some usage scenarios for inner hash.
> 
> v11->v12:
> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> 	2. Refine the commit log. @Michael S . Tsirkin
> 	3. Add some tunnel types.
> 
> v10->v11:
> 	1. Revise commit log for clarity for readers.
> 	2. Some modifications to avoid undefined terms. @Parav Pandit
> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> 	4. Add the normative statements. @Parav Pandit
> 
> v9->v10:
> 	1. Removed hash_report_tunnel related information. @Parav Pandit
> 	2. Re-describe the limitations of QoS for tunneling.
> 	3. Some clarification.
> 
> v8->v9:
> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> 	2. Add tunnel security section. @Michael S . Tsirkin
> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> 	4. Fix some typos.
> 	5. Add more tunnel types. @Michael S . Tsirkin
> 
> v7->v8:
> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> 	4. Fix some typos. @Michael S . Tsirkin
> 	5. Clarify some sentences. @Michael S . Tsirkin
> 
> v6->v7:
> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> 	2. Fix some syntax issues. @Michael S. Tsirkin
> 
> v5->v6:
> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
> 	3. Move the links to introduction section. @Michael S. Tsirkin
> 	4. Clarify some sentences. @Michael S. Tsirkin
> 
> v4->v5:
> 	1. Clarify some paragraphs. @Cornelia Huck
> 	2. Fix the u8 type. @Cornelia Huck
> 
> v3->v4:
> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> 
> v2->v3:
> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> 
> v1->v2:
> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> 	2. Clarify some paragraphs. @Jason Wang
> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> 
>  device-types/net/description.tex        | 159 ++++++++++++++++++++++++
>  device-types/net/device-conformance.tex |   1 +
>  device-types/net/driver-conformance.tex |   1 +
>  introduction.tex                        |  44 +++++++
>  4 files changed, 205 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 0500bb6..48e41f1 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
> +    for tunnel-encapsulated packets.
> +
>  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>  
>  \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
> @@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>  \end{description}
>  
>  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>          u8 rss_max_key_size;
>          le16 rss_max_indirection_table_length;
>          le32 supported_hash_types;
> +        le32 supported_tunnel_hash_types;
>  };
>  \end{lstlisting}
>  The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
> @@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>  Field \field{supported_hash_types} contains the bitmask of supported hash types.
>  See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
>  
> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
> +
> +Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
> +
>  \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
>  
>  The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
> @@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>  \end{itemize}
> @@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was not negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>  \end{itemize}
> @@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  
>  \subparagraph{Supported/enabled hash types}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>  Hash types applicable for IPv4 packets:
>  \begin{lstlisting}
>  #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>  \end{itemize}
>  
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> +
> +struct virtio_net_hash_tunnel_config {
> +    le32 hash_tunnel_types;
> +};
> +
> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> +
> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> +
> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> +\end{itemize}
> +
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> +
> +\subparagraph{Tunnel/Encapsulated packet}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
> +
> +A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
> +encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
> +the device calculates the hash over either the inner header or the outer header.
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> +configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
> +
> +Supported encapsulated packet types:
> +\begin{itemize}
> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
> +\end{itemize}
> +
> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> +the hash of the outer header is calculated for the received encapsulated packet.
> +
> +The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
> +
> +\subparagraph{Supported/enabled encapsulation hash types}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
> +
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
> +\end{lstlisting}
> +
> +Supported encapsulation hash types:
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
> +\end{lstlisting}

Too many protocols to support. Can we start with just one or two?

> +\subparagraph{Advice}
> +Usage scenarios of inner header hash (but not limited to):
> +\begin{itemize}
> +\item Legacy tunneling protocols that lack entropy in the outer header use inner header hash to hash flows
> +      with the same outer header but different inner headers to different queues for better receiving performance.

That's the only one that sounds convincing to me.
How about we start with just legacy? What are these? legacy GRE and
maybe NVGRE?

> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> +\item For modern tunneling protocols, in some scenarios, the entropy in the outer header is insufficient, for example,
> +      the external destination port number is fixed. Inner header hash can be used to increase the entropy to
> +      regain receiving performance.

Too vague. If the problem existed someone would have tried to solve it
at the protocol level by now.


> +\end{itemize}
> +
> +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
> +might be violated to a certain extent, depending on how the hash is used. When the hash use is limited to RSS queue
> +selection, inner header hash may have quality of service (QoS) limitations.
> +
> +Possible mitigations:
> +\begin{itemize}
> +\item Use a tool with good forwarding performance to keep the receive queue from filling up.
> +\item If the QoS is unavailable, the driver can set \field{hash_tunnel_types} to VIRTIO_NET_HASH_TUNNEL_TYPE_NONE
> +      to disable inner header hash for encapsulated packets.
> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> +\end{itemize}
> +
> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The device MUST calculate the hash on the outer header if the type of the received encapsulated packet does not match any value of the \field{hash_tunnel_types}.
> +
> +The device MUST respond to the VIRTIO_NET_CTRL_TUNNEL_HASH_SET command with VIRTIO_NET_ERR if the device received an unrecognized or unsupported VIRTIO_NET_HASH_TUNNEL_TYPE_ flag.
> +
> +Upon reset, the device MUST initialize \field{hash_tunnel_type} to 0.
> +
> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET.
> +
> +The driver MUST ignore the values of \field{hash_tunnel_types} received from the VIRTIO_NET_CTRL_TUNNEL_HASH_GET command if the device responds with VIRTIO_NET_ERR.
> +
> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
> +
>  \paragraph{Hash reporting for incoming packets}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>  
> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> index 54f6783..f88f48b 100644
> --- a/device-types/net/device-conformance.tex
> +++ b/device-types/net/device-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> index 97d0cc1..9d853d9 100644
> --- a/device-types/net/driver-conformance.tex
> +++ b/device-types/net/driver-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/introduction.tex b/introduction.tex
> index 287c5fc..36b620f 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -99,6 +99,50 @@ \section{Normative References}\label{sec:Normative References}
>      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>  
> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> +    This GRE-in-UDP encapsulation allows the UDP source port field to be used as an entropy field. This protocol is specified for IPv4 and IPv6,
> +    and used as either the payload or delivery protocol.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> +    Virtual eXtensible Local Area Network.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> +    Generic Network Virtualization Encapsulation.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> +    IP Encapsulation within IP.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> +	\phantomsection\label{intro:stt}\textbf{[STT]} &
> +    Stateless Transport Tunneling. STT is particularly useful when some tunnel endpoints are in end-systems, as it utilizes the capabilities
> +    of the network interface card to improve performance.
> +	\newline\url{https://www.ietf.org/archive/id/draft-davie-stt-08.txt}\\
> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> +    INTERNET PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> +    User Datagram Protocol
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> +    TRANSMISSION CONTROL PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>  \end{longtable}
>  
>  \section{Non-Normative References}
> -- 
> 2.19.1.6.gb485710b
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-25 21:03   ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-25 21:03 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Sun, Apr 23, 2023 at 03:35:32PM +0800, Heng Qi wrote:
> 1. Currently, a received encapsulated packet has an outer and an inner header, but
> the virtio device is unable to calculate the hash for the inner header. The same
> flow can traverse through different tunnels, resulting in the encapsulated
> packets being spread across multiple receive queues (refer to the figure below).
> However, in certain scenarios, we may need to direct these encapsulated packets of
> the same flow to a single receive queue. This facilitates the processing
> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> 
>                client1                    client2
>                   |        +-------+         |
>                   +------->|tunnels|<--------+
>                            +-------+
>                               |  |
>                               v  v
>                       +-----------------+
>                       | monitoring host |
>                       +-----------------+
> 
> To achieve this, the device can calculate a symmetric hash based on the inner headers
> of the same flow.
> 
> 2. For legacy systems, they may lack entropy fields which modern protocols have in
> the outer header, resulting in multiple flows with the same outer header but
> different inner headers being directed to the same receive queue. This results in
> poor receive performance.
> 
> To address this limitation, inner header hash can be used to enable the device to advertise
> the capability to calculate the hash for the inner packet, regaining better receive performance.
> 
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

grammar in new text is still pretty bad, lots of typos too.
Don't have time to fix it for you right now sorry, it's
a holiday here.

> ---
> v12->v13:
> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> 	2. Add tunneling protocol explanation. @Jason Wang
> 	3. Add comments on some usage scenarios for inner hash.
> 
> v11->v12:
> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> 	2. Refine the commit log. @Michael S . Tsirkin
> 	3. Add some tunnel types.
> 
> v10->v11:
> 	1. Revise commit log for clarity for readers.
> 	2. Some modifications to avoid undefined terms. @Parav Pandit
> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> 	4. Add the normative statements. @Parav Pandit
> 
> v9->v10:
> 	1. Removed hash_report_tunnel related information. @Parav Pandit
> 	2. Re-describe the limitations of QoS for tunneling.
> 	3. Some clarification.
> 
> v8->v9:
> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> 	2. Add tunnel security section. @Michael S . Tsirkin
> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> 	4. Fix some typos.
> 	5. Add more tunnel types. @Michael S . Tsirkin
> 
> v7->v8:
> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> 	4. Fix some typos. @Michael S . Tsirkin
> 	5. Clarify some sentences. @Michael S . Tsirkin
> 
> v6->v7:
> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> 	2. Fix some syntax issues. @Michael S. Tsirkin
> 
> v5->v6:
> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
> 	3. Move the links to introduction section. @Michael S. Tsirkin
> 	4. Clarify some sentences. @Michael S. Tsirkin
> 
> v4->v5:
> 	1. Clarify some paragraphs. @Cornelia Huck
> 	2. Fix the u8 type. @Cornelia Huck
> 
> v3->v4:
> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> 
> v2->v3:
> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> 
> v1->v2:
> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> 	2. Clarify some paragraphs. @Jason Wang
> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> 
>  device-types/net/description.tex        | 159 ++++++++++++++++++++++++
>  device-types/net/device-conformance.tex |   1 +
>  device-types/net/driver-conformance.tex |   1 +
>  introduction.tex                        |  44 +++++++
>  4 files changed, 205 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 0500bb6..48e41f1 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
> +    for tunnel-encapsulated packets.
> +
>  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>  
>  \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
> @@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>  \end{description}
>  
>  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>          u8 rss_max_key_size;
>          le16 rss_max_indirection_table_length;
>          le32 supported_hash_types;
> +        le32 supported_tunnel_hash_types;
>  };
>  \end{lstlisting}
>  The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
> @@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>  Field \field{supported_hash_types} contains the bitmask of supported hash types.
>  See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
>  
> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
> +
> +Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
> +
>  \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
>  
>  The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
> @@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>  \end{itemize}
> @@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was not negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>  \end{itemize}
> @@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  
>  \subparagraph{Supported/enabled hash types}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>  Hash types applicable for IPv4 packets:
>  \begin{lstlisting}
>  #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>  \end{itemize}
>  
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> +
> +struct virtio_net_hash_tunnel_config {
> +    le32 hash_tunnel_types;
> +};
> +
> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> +
> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> +
> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> +\end{itemize}
> +
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> +
> +\subparagraph{Tunnel/Encapsulated packet}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
> +
> +A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
> +encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
> +the device calculates the hash over either the inner header or the outer header.
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> +configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
> +
> +Supported encapsulated packet types:
> +\begin{itemize}
> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
> +\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
> +\end{itemize}
> +
> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> +the hash of the outer header is calculated for the received encapsulated packet.
> +
> +The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
> +
> +\subparagraph{Supported/enabled encapsulation hash types}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
> +
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
> +\end{lstlisting}
> +
> +Supported encapsulation hash types:
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
> +\end{lstlisting}
> +Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
> +\end{lstlisting}

Too many protocols to support. Can we start with just one or two?

> +\subparagraph{Advice}
> +Usage scenarios of inner header hash (but not limited to):
> +\begin{itemize}
> +\item Legacy tunneling protocols that lack entropy in the outer header use inner header hash to hash flows
> +      with the same outer header but different inner headers to different queues for better receiving performance.

That's the only one that sounds convincing to me.
How about we start with just legacy? What are these? legacy GRE and
maybe NVGRE?

> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> +\item For modern tunneling protocols, in some scenarios, the entropy in the outer header is insufficient, for example,
> +      the external destination port number is fixed. Inner header hash can be used to increase the entropy to
> +      regain receiving performance.

Too vague. If the problem existed someone would have tried to solve it
at the protocol level by now.


> +\end{itemize}
> +
> +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
> +might be violated to a certain extent, depending on how the hash is used. When the hash use is limited to RSS queue
> +selection, inner header hash may have quality of service (QoS) limitations.
> +
> +Possible mitigations:
> +\begin{itemize}
> +\item Use a tool with good forwarding performance to keep the receive queue from filling up.
> +\item If the QoS is unavailable, the driver can set \field{hash_tunnel_types} to VIRTIO_NET_HASH_TUNNEL_TYPE_NONE
> +      to disable inner header hash for encapsulated packets.
> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> +\end{itemize}
> +
> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The device MUST calculate the hash on the outer header if the type of the received encapsulated packet does not match any value of the \field{hash_tunnel_types}.
> +
> +The device MUST respond to the VIRTIO_NET_CTRL_TUNNEL_HASH_SET command with VIRTIO_NET_ERR if the device received an unrecognized or unsupported VIRTIO_NET_HASH_TUNNEL_TYPE_ flag.
> +
> +Upon reset, the device MUST initialize \field{hash_tunnel_type} to 0.
> +
> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET.
> +
> +The driver MUST ignore the values of \field{hash_tunnel_types} received from the VIRTIO_NET_CTRL_TUNNEL_HASH_GET command if the device responds with VIRTIO_NET_ERR.
> +
> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
> +
>  \paragraph{Hash reporting for incoming packets}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>  
> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> index 54f6783..f88f48b 100644
> --- a/device-types/net/device-conformance.tex
> +++ b/device-types/net/device-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> index 97d0cc1..9d853d9 100644
> --- a/device-types/net/driver-conformance.tex
> +++ b/device-types/net/driver-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/introduction.tex b/introduction.tex
> index 287c5fc..36b620f 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -99,6 +99,50 @@ \section{Normative References}\label{sec:Normative References}
>      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>  
> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> +    This GRE-in-UDP encapsulation allows the UDP source port field to be used as an entropy field. This protocol is specified for IPv4 and IPv6,
> +    and used as either the payload or delivery protocol.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> +    Virtual eXtensible Local Area Network.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> +    Generic Network Virtualization Encapsulation.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> +    IP Encapsulation within IP.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> +	\phantomsection\label{intro:stt}\textbf{[STT]} &
> +    Stateless Transport Tunneling. STT is particularly useful when some tunnel endpoints are in end-systems, as it utilizes the capabilities
> +    of the network interface card to improve performance.
> +	\newline\url{https://www.ietf.org/archive/id/draft-davie-stt-08.txt}\\
> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> +    INTERNET PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> +    User Datagram Protocol
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> +    TRANSMISSION CONTROL PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>  \end{longtable}
>  
>  \section{Non-Normative References}
> -- 
> 2.19.1.6.gb485710b
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-25 20:28   ` [virtio-comment] " Parav Pandit
@ 2023-04-25 21:06     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-25 21:06 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, Apr 25, 2023 at 04:28:33PM -0400, Parav Pandit wrote:
> 
> 
> On 4/23/2023 3:35 AM, Heng Qi wrote:
> >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
> >           u8 rss_max_key_size;
> >           le16 rss_max_indirection_table_length;
> >           le32 supported_hash_types;
> > +        le32 supported_tunnel_hash_types;

this needs a comment explaining it only exists with some
feature bits.

> >   };
> In v12 I was asking this to move to above field from the config area to the
> GET command in comment [1] as,
> 
> "With that no need to define two fields at two different places in config
> area and also in cvq."

I think I disagree.
the proposed design is consistent with regular tunneling.

however I feel we should limit this to 1-2 legacy protocols.
with that, we do not really need a new field at all,
they can fit in supported_hash_types.




> I am sorry if that was not clear enough.
> 
> [1] https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/
> 
> >   \subparagraph{Supported/enabled hash types}
> >   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> > +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> >   Hash types applicable for IPv4 packets:
> >   \begin{lstlisting}
> >   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> > @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> >   \end{itemize}
> > +\paragraph{Inner Header Hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> > +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> > +
> > +struct virtio_net_hash_tunnel_config {
> Please move field from the config struct to here. Both are RO fields.
> 
> le32 supported_hash_tunnel_types;
> > +    le32 hash_tunnel_types;
> > +};
> > +
> > +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> > + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> > + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> > +
> > +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> > +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> > +
> > +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> > +\begin{itemize}
> > +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> > +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> > +\end{itemize}
> > +
> > +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> > +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> > +
> You need to split the structures to two, one for get and one for set in
> above description as get and set contains different fields.
> > +
> > +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> > +the hash of the outer header is calculated for the received encapsulated packet.
> > +
> > +
> > +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> > +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> > +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
> You wanted to say inner here like rest of the places.
> 
> s/internal header/inner header
> 
> > +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
> Multiple flags so,
> 
> s/flags that are/flags which are/
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-25 21:06     ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-25 21:06 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, Apr 25, 2023 at 04:28:33PM -0400, Parav Pandit wrote:
> 
> 
> On 4/23/2023 3:35 AM, Heng Qi wrote:
> >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
> >           u8 rss_max_key_size;
> >           le16 rss_max_indirection_table_length;
> >           le32 supported_hash_types;
> > +        le32 supported_tunnel_hash_types;

this needs a comment explaining it only exists with some
feature bits.

> >   };
> In v12 I was asking this to move to above field from the config area to the
> GET command in comment [1] as,
> 
> "With that no need to define two fields at two different places in config
> area and also in cvq."

I think I disagree.
the proposed design is consistent with regular tunneling.

however I feel we should limit this to 1-2 legacy protocols.
with that, we do not really need a new field at all,
they can fit in supported_hash_types.




> I am sorry if that was not clear enough.
> 
> [1] https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/
> 
> >   \subparagraph{Supported/enabled hash types}
> >   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> > +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> >   Hash types applicable for IPv4 packets:
> >   \begin{lstlisting}
> >   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
> > @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> >   \end{itemize}
> > +\paragraph{Inner Header Hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
> > +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
> > +
> > +struct virtio_net_hash_tunnel_config {
> Please move field from the config struct to here. Both are RO fields.
> 
> le32 supported_hash_tunnel_types;
> > +    le32 hash_tunnel_types;
> > +};
> > +
> > +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
> > + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
> > + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
> > +
> > +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
> > +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
> > +
> > +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
> > +\begin{itemize}
> > +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
> > +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
> > +\end{itemize}
> > +
> > +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
> > +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
> > +
> You need to split the structures to two, one for get and one for set in
> above description as get and set contains different fields.
> > +
> > +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
> > +the hash of the outer header is calculated for the received encapsulated packet.
> > +
> > +
> > +For scenarios with sufficient external entropy or no internal hashing requirements, inner header hash may not be needed:
> > +A tunnel is often expected to isolate the external network from the internal one. By completely ignoring entropy
> > +in the external header and replacing it with entropy from the internal header, for hash calculations, this expectation
> You wanted to say inner here like rest of the places.
> 
> s/internal header/inner header
> 
> > +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that are not supported by the device.
> Multiple flags so,
> 
> s/flags that are/flags which are/
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-25 21:06     ` Michael S. Tsirkin
@ 2023-04-25 21:39       ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-25 21:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 25, 2023 5:06 PM
> > On 4/23/2023 3:35 AM, Heng Qi wrote:
> > >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device
> > > Types / Network Device / Feature bits / Legacy Interface: Feature bits} @@
> -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device
> Types / Network Device
> > >           u8 rss_max_key_size;
> > >           le16 rss_max_indirection_table_length;
> > >           le32 supported_hash_types;
> > > +        le32 supported_tunnel_hash_types;
> 
> this needs a comment explaining it only exists with some feature bits.
> 
Yes, it is already there.
+Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
+
I think it should be changed from "device supports" to "driver negotiated".

> > >   };
> > In v12 I was asking this to move to above field from the config area
> > to the GET command in comment [1] as,
> >
> > "With that no need to define two fields at two different places in
> > config area and also in cvq."
> 
> I think I disagree.
> the proposed design is consistent with regular tunneling.
> 
Sure.
I understand how config space has evolved from 0.9.5 to know without much attention, but really expanding this way is not helpful.
It requires building more and more RAM based devices even for PCI PFs, which is sub optimal.
CVQ already exists and provides the SET command. There is no reason to do GET in some other way.
Device has single channel to provide a value and hence it doesn't need any internal synchronization between two paths.

So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an improvement.
But it still comes with a cost which device cannot mitigate.
The cost is, 
1. a driver may not negotiate such feature bit, and device ends up burning memory.
1.b Device provisioning becomes a factor of what some guests may use/not use/already using on the live system.

2. Every device needs AQ even when the CVQ exists.

Hence, better to avoid expanding current structure for any new fields, specially which has the SET equivalent.

But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
More things can evolve for generic things like config space over AQ.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-25 21:39       ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-25 21:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 25, 2023 5:06 PM
> > On 4/23/2023 3:35 AM, Heng Qi wrote:
> > >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device
> > > Types / Network Device / Feature bits / Legacy Interface: Feature bits} @@
> -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device
> Types / Network Device
> > >           u8 rss_max_key_size;
> > >           le16 rss_max_indirection_table_length;
> > >           le32 supported_hash_types;
> > > +        le32 supported_tunnel_hash_types;
> 
> this needs a comment explaining it only exists with some feature bits.
> 
Yes, it is already there.
+Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
+
I think it should be changed from "device supports" to "driver negotiated".

> > >   };
> > In v12 I was asking this to move to above field from the config area
> > to the GET command in comment [1] as,
> >
> > "With that no need to define two fields at two different places in
> > config area and also in cvq."
> 
> I think I disagree.
> the proposed design is consistent with regular tunneling.
> 
Sure.
I understand how config space has evolved from 0.9.5 to know without much attention, but really expanding this way is not helpful.
It requires building more and more RAM based devices even for PCI PFs, which is sub optimal.
CVQ already exists and provides the SET command. There is no reason to do GET in some other way.
Device has single channel to provide a value and hence it doesn't need any internal synchronization between two paths.

So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an improvement.
But it still comes with a cost which device cannot mitigate.
The cost is, 
1. a driver may not negotiate such feature bit, and device ends up burning memory.
1.b Device provisioning becomes a factor of what some guests may use/not use/already using on the live system.

2. Every device needs AQ even when the CVQ exists.

Hence, better to avoid expanding current structure for any new fields, specially which has the SET equivalent.

But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
More things can evolve for generic things like config space over AQ.

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-25 21:39       ` Parav Pandit
@ 2023-04-26  4:12         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26  4:12 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, Apr 25, 2023 at 09:39:28PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, April 25, 2023 5:06 PM
> > > On 4/23/2023 3:35 AM, Heng Qi wrote:
> > > >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device
> > > > Types / Network Device / Feature bits / Legacy Interface: Feature bits} @@
> > -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device
> > Types / Network Device
> > > >           u8 rss_max_key_size;
> > > >           le16 rss_max_indirection_table_length;
> > > >           le32 supported_hash_types;
> > > > +        le32 supported_tunnel_hash_types;
> > 
> > this needs a comment explaining it only exists with some feature bits.
> > 
> Yes, it is already there.
> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
> +
> I think it should be changed from "device supports" to "driver negotiated".
> 
> > > >   };
> > > In v12 I was asking this to move to above field from the config area
> > > to the GET command in comment [1] as,
> > >
> > > "With that no need to define two fields at two different places in
> > > config area and also in cvq."
> > 
> > I think I disagree.
> > the proposed design is consistent with regular tunneling.
> > 
> Sure.
> I understand how config space has evolved from 0.9.5 to know without much attention, but really expanding this way is not helpful.
> It requires building more and more RAM based devices even for PCI PFs, which is sub optimal.

No, this is ROM, not RAM.

> CVQ already exists and provides the SET command. There is no reason to do GET in some other way.

Spoken looking at just hardware cost :)
The reason is that this is device specific. Maintainance overhead and
system RAM cost for the code to support this should not be ignored.

> Device has single channel to provide a value and hence it doesn't need any internal synchronization between two paths.
> 
> So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an improvement.
> But it still comes with a cost which device cannot mitigate.
> The cost is, 
> 1. a driver may not negotiate such feature bit, and device ends up burning memory.
> 1.b Device provisioning becomes a factor of what some guests may use/not use/already using on the live system.
> 
> 2. Every device needs AQ even when the CVQ exists.
> 
> Hence, better to avoid expanding current structure for any new fields, specially which has the SET equivalent.
> 
> But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> More things can evolve for generic things like config space over AQ.

I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
these costs.  What I had in mind is extending proposed vq transport to
support sriov. I don't see why we can not have exactly 0 bytes of memory
per VF.

And if we care about single bytes of PF memory (do we? there's only one
PF per SRIOV device ...), what we should do is a variant of pci
transport that aggressively saves memory.


-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26  4:12         ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26  4:12 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, Apr 25, 2023 at 09:39:28PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, April 25, 2023 5:06 PM
> > > On 4/23/2023 3:35 AM, Heng Qi wrote:
> > > >   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device
> > > > Types / Network Device / Feature bits / Legacy Interface: Feature bits} @@
> > -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device
> > Types / Network Device
> > > >           u8 rss_max_key_size;
> > > >           le16 rss_max_indirection_table_length;
> > > >           le32 supported_hash_types;
> > > > +        le32 supported_tunnel_hash_types;
> > 
> > this needs a comment explaining it only exists with some feature bits.
> > 
> Yes, it is already there.
> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
> +
> I think it should be changed from "device supports" to "driver negotiated".
> 
> > > >   };
> > > In v12 I was asking this to move to above field from the config area
> > > to the GET command in comment [1] as,
> > >
> > > "With that no need to define two fields at two different places in
> > > config area and also in cvq."
> > 
> > I think I disagree.
> > the proposed design is consistent with regular tunneling.
> > 
> Sure.
> I understand how config space has evolved from 0.9.5 to know without much attention, but really expanding this way is not helpful.
> It requires building more and more RAM based devices even for PCI PFs, which is sub optimal.

No, this is ROM, not RAM.

> CVQ already exists and provides the SET command. There is no reason to do GET in some other way.

Spoken looking at just hardware cost :)
The reason is that this is device specific. Maintainance overhead and
system RAM cost for the code to support this should not be ignored.

> Device has single channel to provide a value and hence it doesn't need any internal synchronization between two paths.
> 
> So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an improvement.
> But it still comes with a cost which device cannot mitigate.
> The cost is, 
> 1. a driver may not negotiate such feature bit, and device ends up burning memory.
> 1.b Device provisioning becomes a factor of what some guests may use/not use/already using on the live system.
> 
> 2. Every device needs AQ even when the CVQ exists.
> 
> Hence, better to avoid expanding current structure for any new fields, specially which has the SET equivalent.
> 
> But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> More things can evolve for generic things like config space over AQ.

I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
these costs.  What I had in mind is extending proposed vq transport to
support sriov. I don't see why we can not have exactly 0 bytes of memory
per VF.

And if we care about single bytes of PF memory (do we? there's only one
PF per SRIOV device ...), what we should do is a variant of pci
transport that aggressively saves memory.


-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-26  4:12         ` Michael S. Tsirkin
@ 2023-04-26  4:27           ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26  4:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 26, 2023 12:12 AM
> 
> 
> No, this is ROM, not RAM.
> 
For VFs it is not ROM because one might configure VF with different feature bits.

> > CVQ already exists and provides the SET command. There is no reason to do
> GET in some other way.
> 
> Spoken looking at just hardware cost :)
Not really. The CVQ is already there, no extra overheads.

> The reason is that this is device specific. Maintainance overhead and system
> RAM cost for the code to support this should not be ignored.
Sure. As above its not a new and such query occurs only once or extremely rare.
> 
> > Device has single channel to provide a value and hence it doesn't need any
> internal synchronization between two paths.
> >
> > So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an
> improvement.
> > But it still comes with a cost which device cannot mitigate.
> > The cost is,
> > 1. a driver may not negotiate such feature bit, and device ends up burning
> memory.
> > 1.b Device provisioning becomes a factor of what some guests may use/not
> use/already using on the live system.
> >
> > 2. Every device needs AQ even when the CVQ exists.
> >
> > Hence, better to avoid expanding current structure for any new fields,
> specially which has the SET equivalent.
> >
> > But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> > More things can evolve for generic things like config space over AQ.
> 
> I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
> these costs.  
What I had in mind is all the rarely used and/or one time used config registers are queries over Q interface.
A device exposes a feature bit indicating that it offers it via a q interface instead of MMIO.
A net device already has a CVQ and its almost always there, so utilizing an existing object to query makes perfect sense.
It can be argued other way that, hey other devices cannot benefit for it because they missed the CVQ.
So may be a AQ can service for all the device types.

> What I had in mind is extending proposed vq transport to support
> sriov. I don't see why we can not have exactly 0 bytes of memory per VF.
> 
VFs other than legacy purposes do not undergo mediation via PF.
Legacy is the only exception; newer things is direct communication for a PCI device PF, SRIOV VFs and SIOV devices.

> And if we care about single bytes of PF memory (do we? there's only one PF per
> SRIOV device ...), what we should do is a variant of pci transport that
> aggressively saves memory.
Users already deploy 16 to 30 PFs coming from single physical card today in single system.
Building things uniformly for PF, VF, SIOV devices is coming for free.

It is not only this byte, but we also see that device needs to offer many capabilities more than boolean flag, so putting non latency sensitive item on the MMIO area hurts overall.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26  4:27           ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26  4:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 26, 2023 12:12 AM
> 
> 
> No, this is ROM, not RAM.
> 
For VFs it is not ROM because one might configure VF with different feature bits.

> > CVQ already exists and provides the SET command. There is no reason to do
> GET in some other way.
> 
> Spoken looking at just hardware cost :)
Not really. The CVQ is already there, no extra overheads.

> The reason is that this is device specific. Maintainance overhead and system
> RAM cost for the code to support this should not be ignored.
Sure. As above its not a new and such query occurs only once or extremely rare.
> 
> > Device has single channel to provide a value and hence it doesn't need any
> internal synchronization between two paths.
> >
> > So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an
> improvement.
> > But it still comes with a cost which device cannot mitigate.
> > The cost is,
> > 1. a driver may not negotiate such feature bit, and device ends up burning
> memory.
> > 1.b Device provisioning becomes a factor of what some guests may use/not
> use/already using on the live system.
> >
> > 2. Every device needs AQ even when the CVQ exists.
> >
> > Hence, better to avoid expanding current structure for any new fields,
> specially which has the SET equivalent.
> >
> > But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> > More things can evolve for generic things like config space over AQ.
> 
> I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
> these costs.  
What I had in mind is all the rarely used and/or one time used config registers are queries over Q interface.
A device exposes a feature bit indicating that it offers it via a q interface instead of MMIO.
A net device already has a CVQ and its almost always there, so utilizing an existing object to query makes perfect sense.
It can be argued other way that, hey other devices cannot benefit for it because they missed the CVQ.
So may be a AQ can service for all the device types.

> What I had in mind is extending proposed vq transport to support
> sriov. I don't see why we can not have exactly 0 bytes of memory per VF.
> 
VFs other than legacy purposes do not undergo mediation via PF.
Legacy is the only exception; newer things is direct communication for a PCI device PF, SRIOV VFs and SIOV devices.

> And if we care about single bytes of PF memory (do we? there's only one PF per
> SRIOV device ...), what we should do is a variant of pci transport that
> aggressively saves memory.
Users already deploy 16 to 30 PFs coming from single physical card today in single system.
Building things uniformly for PF, VF, SIOV devices is coming for free.

It is not only this byte, but we also see that device needs to offer many capabilities more than boolean flag, so putting non latency sensitive item on the MMIO area hurts overall.

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-26  4:27           ` Parav Pandit
@ 2023-04-26  5:02             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26  5:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 04:27:34AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 26, 2023 12:12 AM
> > 
> > 
> > No, this is ROM, not RAM.
> > 
> For VFs it is not ROM because one might configure VF with different feature bits.

Hmm interesting. If you want to block a specific feature from VF you
probably want to do it for things like LM compat, and I feel the way to
do it is in software not in hardware because you want e.g.  cross-vendor
compatibility, which specific vendors rarely care about.  No? For
example, we have all the crazy amount of flexibility with layout of pci
config space (not virtio config) that we have zero control over unless
we want to go and mandate all that in virtio - which would put us on a
treadmill of chasing each new pci capability added.

BTW I generally wonder about why this aggressive fight against memory
mapped registers does not seem to be reflected in the pci spec at all,
which seems to keep happily adding more memory mapped stuff for control
plane things in each revision. If anyone, I would expect these guys to
know a thing or two about pci hardware.


> > > CVQ already exists and provides the SET command. There is no reason to do
> > GET in some other way.
> > 
> > Spoken looking at just hardware cost :)
> Not really. The CVQ is already there, no extra overheads.
> > The reason is that this is device specific. Maintainance overhead and system
> > RAM cost for the code to support this should not be ignored.
> Sure. As above its not a new and such query occurs only once or extremely rare.

It will have to be done each time user runs ethtool -k, no?
Unless you cache this in software then it's extra RAM :)


> > 
> > > Device has single channel to provide a value and hence it doesn't need any
> > internal synchronization between two paths.
> > >
> > > So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an
> > improvement.
> > > But it still comes with a cost which device cannot mitigate.
> > > The cost is,
> > > 1. a driver may not negotiate such feature bit, and device ends up burning
> > memory.
> > > 1.b Device provisioning becomes a factor of what some guests may use/not
> > use/already using on the live system.
> > >
> > > 2. Every device needs AQ even when the CVQ exists.
> > >
> > > Hence, better to avoid expanding current structure for any new fields,
> > specially which has the SET equivalent.
> > >
> > > But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> > > More things can evolve for generic things like config space over AQ.
> > 
> > I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
> > these costs.  
> What I had in mind is all the rarely used and/or one time used config registers are queries over Q interface.
> A device exposes a feature bit indicating that it offers it via a q interface instead of MMIO.
> A net device already has a CVQ and its almost always there, so utilizing an existing object to query makes perfect sense.
> It can be argued other way that, hey other devices cannot benefit for it because they missed the CVQ.
> So may be a AQ can service for all the device types.

Exactly.

> > What I had in mind is extending proposed vq transport to support
> > sriov. I don't see why we can not have exactly 0 bytes of memory per VF.
> > 
> VFs other than legacy purposes do not undergo mediation via PF.
> Legacy is the only exception; newer things is direct communication for a PCI device PF, SRIOV VFs and SIOV devices.
> 
> > And if we care about single bytes of PF memory (do we? there's only one PF per
> > SRIOV device ...), what we should do is a variant of pci transport that
> > aggressively saves memory.
> Users already deploy 16 to 30 PFs coming from single physical card today in single system.
> Building things uniformly for PF, VF, SIOV devices is coming for free.
> 
> It is not only this byte, but we also see that device needs to offer many capabilities more than boolean flag, so putting non latency sensitive item on the MMIO area hurts overall.


-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26  5:02             ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26  5:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 04:27:34AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 26, 2023 12:12 AM
> > 
> > 
> > No, this is ROM, not RAM.
> > 
> For VFs it is not ROM because one might configure VF with different feature bits.

Hmm interesting. If you want to block a specific feature from VF you
probably want to do it for things like LM compat, and I feel the way to
do it is in software not in hardware because you want e.g.  cross-vendor
compatibility, which specific vendors rarely care about.  No? For
example, we have all the crazy amount of flexibility with layout of pci
config space (not virtio config) that we have zero control over unless
we want to go and mandate all that in virtio - which would put us on a
treadmill of chasing each new pci capability added.

BTW I generally wonder about why this aggressive fight against memory
mapped registers does not seem to be reflected in the pci spec at all,
which seems to keep happily adding more memory mapped stuff for control
plane things in each revision. If anyone, I would expect these guys to
know a thing or two about pci hardware.


> > > CVQ already exists and provides the SET command. There is no reason to do
> > GET in some other way.
> > 
> > Spoken looking at just hardware cost :)
> Not really. The CVQ is already there, no extra overheads.
> > The reason is that this is device specific. Maintainance overhead and system
> > RAM cost for the code to support this should not be ignored.
> Sure. As above its not a new and such query occurs only once or extremely rare.

It will have to be done each time user runs ethtool -k, no?
Unless you cache this in software then it's extra RAM :)


> > 
> > > Device has single channel to provide a value and hence it doesn't need any
> > internal synchronization between two paths.
> > >
> > > So, if we add a new feature bit as VIRTIO_F_CFG_SPACE_OVER_AQ it is an
> > improvement.
> > > But it still comes with a cost which device cannot mitigate.
> > > The cost is,
> > > 1. a driver may not negotiate such feature bit, and device ends up burning
> > memory.
> > > 1.b Device provisioning becomes a factor of what some guests may use/not
> > use/already using on the live system.
> > >
> > > 2. Every device needs AQ even when the CVQ exists.
> > >
> > > Hence, better to avoid expanding current structure for any new fields,
> > specially which has the SET equivalent.
> > >
> > > But if we want to with the path of VIRTIO_F_CFG_SPACE_OVER_AQ, it is fine.
> > > More things can evolve for generic things like config space over AQ.
> > 
> > I am not sure what does VIRTIO_F_CFG_SPACE_OVER_AQ mean, and what are
> > these costs.  
> What I had in mind is all the rarely used and/or one time used config registers are queries over Q interface.
> A device exposes a feature bit indicating that it offers it via a q interface instead of MMIO.
> A net device already has a CVQ and its almost always there, so utilizing an existing object to query makes perfect sense.
> It can be argued other way that, hey other devices cannot benefit for it because they missed the CVQ.
> So may be a AQ can service for all the device types.

Exactly.

> > What I had in mind is extending proposed vq transport to support
> > sriov. I don't see why we can not have exactly 0 bytes of memory per VF.
> > 
> VFs other than legacy purposes do not undergo mediation via PF.
> Legacy is the only exception; newer things is direct communication for a PCI device PF, SRIOV VFs and SIOV devices.
> 
> > And if we care about single bytes of PF memory (do we? there's only one PF per
> > SRIOV device ...), what we should do is a variant of pci transport that
> > aggressively saves memory.
> Users already deploy 16 to 30 PFs coming from single physical card today in single system.
> Building things uniformly for PF, VF, SIOV devices is coming for free.
> 
> It is not only this byte, but we also see that device needs to offer many capabilities more than boolean flag, so putting non latency sensitive item on the MMIO area hurts overall.


-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v13] virtio-net: support inner header hash
  2023-04-25 20:28   ` [virtio-comment] " Parav Pandit
@ 2023-04-26 13:42     ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 13:42 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 上午4:28, Parav Pandit 写道:
>
>
> On 4/23/2023 3:35 AM, Heng Qi wrote:
>>     \subsubsection{Legacy Interface: Feature bits}\label{sec:Device 
>> Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -198,6 +202,7 @@ \subsection{Device configuration 
>> layout}\label{sec:Device Types / Network Device
>>           u8 rss_max_key_size;
>>           le16 rss_max_indirection_table_length;
>>           le32 supported_hash_types;
>> +        le32 supported_tunnel_hash_types;
>>   };
> In v12 I was asking this to move to above field from the config area 
> to the GET command in comment [1] as,
>
> "With that no need to define two fields at two different places in 
> config area and also in cvq."

I'm not sure if this is sufficiently motivated, RSS also has 
supports_hash_types in config space.

We don't actually need cvq and config to sync on 
supported_tunnel_hash_types, since it doesn't need to change (meaning 
supported_tunnel_hash_types doesn't send configuration change 
notifications).

>
> I am sorry if that was not clear enough.
>
> [1] 
> https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/
>
>>   \subparagraph{Supported/enabled hash types}
>>   \label{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Hash calculation for incoming 
>> packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, 
>> \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>   Hash types applicable for IPv4 packets:
>>   \begin{lstlisting}
>>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
>> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming 
>> Packets}\label{sec:Device Types / Network
>>   (see \ref{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Hash calculation for incoming 
>> packets / IPv6 packets without extension header}).
>>   \end{itemize}
>>   +\paragraph{Inner Header Hash}
>> +\label{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Inner Header Hash}
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports 
>> inner header hash and the driver can send
>> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and 
>> VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
>> +
>> +struct virtio_net_hash_tunnel_config {
> Please move field from the config struct to here. Both are RO fields.
>
> le32 supported_hash_tunnel_types;
>> +    le32 hash_tunnel_types;
>> +};
>> +
>> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
>> +
>> +Filed \field{hash_tunnel_types} contains a bitmask of configured 
>> hash tunnel types as
>> +defined in \ref{sec:Device Types / Network Device / Device Operation 
>> / Processing of Incoming Packets / Hash calculation for incoming 
>> packets / Supported/enabled hash tunnel types}.
>> +
>> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
>> +\begin{itemize}
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the 
>> \field{hash_tunnel_types} to configure the inner header hash 
>> calculation for the device.
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the 
>> \field{hash_tunnel_types} from the device.
>> +\end{itemize}
>> +
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure 
>> virtio_net_hash_tunnel_config is write-only for the driver.
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure 
>> virtio_net_hash_tunnel_config is read-only for the driver.
>> +
> You need to split the structures to two, one for get and one for set 
> in above description as get and set contains different fields.
>> +
>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type 
>> is not included in \field{hash_tunnel_types},
>> +the hash of the outer header is calculated for the received 
>> encapsulated packet.
>> +
>> +
>> +For scenarios with sufficient external entropy or no internal 
>> hashing requirements, inner header hash may not be needed:
>> +A tunnel is often expected to isolate the external network from the 
>> internal one. By completely ignoring entropy
>> +in the external header and replacing it with entropy from the 
>> internal header, for hash calculations, this expectation
> You wanted to say inner here like rest of the places.
>
> s/internal header/inner header

I want to make the 'external' and 'internal' correspond, but avoid the 
internal header, and use a unified 'inner header' is also reasonable.:)

>
>
>> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that 
>> are not supported by the device.
> Multiple flags so,
>
> s/flags that are/flags which are/

Will fix.

Thanks!


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 13:42     ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 13:42 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 上午4:28, Parav Pandit 写道:
>
>
> On 4/23/2023 3:35 AM, Heng Qi wrote:
>>     \subsubsection{Legacy Interface: Feature bits}\label{sec:Device 
>> Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -198,6 +202,7 @@ \subsection{Device configuration 
>> layout}\label{sec:Device Types / Network Device
>>           u8 rss_max_key_size;
>>           le16 rss_max_indirection_table_length;
>>           le32 supported_hash_types;
>> +        le32 supported_tunnel_hash_types;
>>   };
> In v12 I was asking this to move to above field from the config area 
> to the GET command in comment [1] as,
>
> "With that no need to define two fields at two different places in 
> config area and also in cvq."

I'm not sure if this is sufficiently motivated, RSS also has 
supports_hash_types in config space.

We don't actually need cvq and config to sync on 
supported_tunnel_hash_types, since it doesn't need to change (meaning 
supported_tunnel_hash_types doesn't send configuration change 
notifications).

>
> I am sorry if that was not clear enough.
>
> [1] 
> https://lore.kernel.org/virtio-dev/569cbaf9-f1fb-0e1f-a2ef-b1d7cd7dbb1f@nvidia.com/
>
>>   \subparagraph{Supported/enabled hash types}
>>   \label{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Hash calculation for incoming 
>> packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, 
>> \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>   Hash types applicable for IPv4 packets:
>>   \begin{lstlisting}
>>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
>> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming 
>> Packets}\label{sec:Device Types / Network
>>   (see \ref{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Hash calculation for incoming 
>> packets / IPv6 packets without extension header}).
>>   \end{itemize}
>>   +\paragraph{Inner Header Hash}
>> +\label{sec:Device Types / Network Device / Device Operation / 
>> Processing of Incoming Packets / Inner Header Hash}
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports 
>> inner header hash and the driver can send
>> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and 
>> VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
>> +
>> +struct virtio_net_hash_tunnel_config {
> Please move field from the config struct to here. Both are RO fields.
>
> le32 supported_hash_tunnel_types;
>> +    le32 hash_tunnel_types;
>> +};
>> +
>> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
>> +
>> +Filed \field{hash_tunnel_types} contains a bitmask of configured 
>> hash tunnel types as
>> +defined in \ref{sec:Device Types / Network Device / Device Operation 
>> / Processing of Incoming Packets / Hash calculation for incoming 
>> packets / Supported/enabled hash tunnel types}.
>> +
>> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
>> +\begin{itemize}
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the 
>> \field{hash_tunnel_types} to configure the inner header hash 
>> calculation for the device.
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the 
>> \field{hash_tunnel_types} from the device.
>> +\end{itemize}
>> +
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure 
>> virtio_net_hash_tunnel_config is write-only for the driver.
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure 
>> virtio_net_hash_tunnel_config is read-only for the driver.
>> +
> You need to split the structures to two, one for get and one for set 
> in above description as get and set contains different fields.
>> +
>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type 
>> is not included in \field{hash_tunnel_types},
>> +the hash of the outer header is calculated for the received 
>> encapsulated packet.
>> +
>> +
>> +For scenarios with sufficient external entropy or no internal 
>> hashing requirements, inner header hash may not be needed:
>> +A tunnel is often expected to isolate the external network from the 
>> internal one. By completely ignoring entropy
>> +in the external header and replacing it with entropy from the 
>> internal header, for hash calculations, this expectation
> You wanted to say inner here like rest of the places.
>
> s/internal header/inner header

I want to make the 'external' and 'internal' correspond, but avoid the 
internal header, and use a unified 'inner header' is also reasonable.:)

>
>
>> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags that 
>> are not supported by the device.
> Multiple flags so,
>
> s/flags that are/flags which are/

Will fix.

Thanks!


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 13:42     ` [virtio-comment] " Heng Qi
@ 2023-04-26 13:47       ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 13:47 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, April 26, 2023 9:43 AM

> > "With that no need to define two fields at two different places in
> > config area and also in cvq."
> 
> I'm not sure if this is sufficiently motivated, RSS also has supports_hash_types
> in config space.
>
There are many fields grown to MMIO config space because 0.9.5 has config space there.
These are usually one time read or rarely used; hence it doesn’t need to come from this area of the device to be _always_ available.
 
> We don't actually need cvq and config to sync on
> supported_tunnel_hash_types, since it doesn't need to change (meaning
> supported_tunnel_hash_types doesn't send configuration change
> notifications).
> 
It doesn’t but device is expected to publish the value at two places.

> >> +For scenarios with sufficient external entropy or no internal
> >> hashing requirements, inner header hash may not be needed:
> >> +A tunnel is often expected to isolate the external network from the
> >> internal one. By completely ignoring entropy
> >> +in the external header and replacing it with entropy from the
> >> internal header, for hash calculations, this expectation
> > You wanted to say inner here like rest of the places.
> >
> > s/internal header/inner header
> 
> I want to make the 'external' and 'internal' correspond, but avoid the internal
> header, and use a unified 'inner header' is also reasonable.:)
>
Since in rest of other description, we refer to it as "outer", it is better to keep it consistent as outer instead of external here.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 13:47       ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 13:47 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, April 26, 2023 9:43 AM

> > "With that no need to define two fields at two different places in
> > config area and also in cvq."
> 
> I'm not sure if this is sufficiently motivated, RSS also has supports_hash_types
> in config space.
>
There are many fields grown to MMIO config space because 0.9.5 has config space there.
These are usually one time read or rarely used; hence it doesn’t need to come from this area of the device to be _always_ available.
 
> We don't actually need cvq and config to sync on
> supported_tunnel_hash_types, since it doesn't need to change (meaning
> supported_tunnel_hash_types doesn't send configuration change
> notifications).
> 
It doesn’t but device is expected to publish the value at two places.

> >> +For scenarios with sufficient external entropy or no internal
> >> hashing requirements, inner header hash may not be needed:
> >> +A tunnel is often expected to isolate the external network from the
> >> internal one. By completely ignoring entropy
> >> +in the external header and replacing it with entropy from the
> >> internal header, for hash calculations, this expectation
> > You wanted to say inner here like rest of the places.
> >
> > s/internal header/inner header
> 
> I want to make the 'external' and 'internal' correspond, but avoid the internal
> header, and use a unified 'inner header' is also reasonable.:)
>
Since in rest of other description, we refer to it as "outer", it is better to keep it consistent as outer instead of external here.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 13:47       ` [virtio-comment] " Parav Pandit
@ 2023-04-26 14:03         ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 14:03 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午9:47, Parav Pandit 写道:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, April 26, 2023 9:43 AM
>>> "With that no need to define two fields at two different places in
>>> config area and also in cvq."
>> I'm not sure if this is sufficiently motivated, RSS also has supports_hash_types
>> in config space.
>>
> There are many fields grown to MMIO config space because 0.9.5 has config space there.
> These are usually one time read or rarely used; hence it doesn’t need to come from this area of the device to be _always_ available.

Yes.

supported_tunnel_hash_types If in cvq, each GET command needs to return 
supported_tunnel_hash_types and hash_tunnel_types,
but users may only need the hash_tunnel_types "SET" by themselves.

>   
>> We don't actually need cvq and config to sync on
>> supported_tunnel_hash_types, since it doesn't need to change (meaning
>> supported_tunnel_hash_types doesn't send configuration change
>> notifications).
>>
> It doesn’t but device is expected to publish the value at two places.

Yes, but that seems like a tiny cost, and the cvq command-related 
structure is much simpler.

>
>>>> +For scenarios with sufficient external entropy or no internal
>>>> hashing requirements, inner header hash may not be needed:
>>>> +A tunnel is often expected to isolate the external network from the
>>>> internal one. By completely ignoring entropy
>>>> +in the external header and replacing it with entropy from the
>>>> internal header, for hash calculations, this expectation
>>> You wanted to say inner here like rest of the places.
>>>
>>> s/internal header/inner header
>> I want to make the 'external' and 'internal' correspond, but avoid the internal
>> header, and use a unified 'inner header' is also reasonable.:)
>>
> Since in rest of other description, we refer to it as "outer", it is better to keep it consistent as outer instead of external here.

Totally agree.

Thanks.



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 14:03         ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 14:03 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午9:47, Parav Pandit 写道:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, April 26, 2023 9:43 AM
>>> "With that no need to define two fields at two different places in
>>> config area and also in cvq."
>> I'm not sure if this is sufficiently motivated, RSS also has supports_hash_types
>> in config space.
>>
> There are many fields grown to MMIO config space because 0.9.5 has config space there.
> These are usually one time read or rarely used; hence it doesn’t need to come from this area of the device to be _always_ available.

Yes.

supported_tunnel_hash_types If in cvq, each GET command needs to return 
supported_tunnel_hash_types and hash_tunnel_types,
but users may only need the hash_tunnel_types "SET" by themselves.

>   
>> We don't actually need cvq and config to sync on
>> supported_tunnel_hash_types, since it doesn't need to change (meaning
>> supported_tunnel_hash_types doesn't send configuration change
>> notifications).
>>
> It doesn’t but device is expected to publish the value at two places.

Yes, but that seems like a tiny cost, and the cvq command-related 
structure is much simpler.

>
>>>> +For scenarios with sufficient external entropy or no internal
>>>> hashing requirements, inner header hash may not be needed:
>>>> +A tunnel is often expected to isolate the external network from the
>>>> internal one. By completely ignoring entropy
>>>> +in the external header and replacing it with entropy from the
>>>> internal header, for hash calculations, this expectation
>>> You wanted to say inner here like rest of the places.
>>>
>>> s/internal header/inner header
>> I want to make the 'external' and 'internal' correspond, but avoid the internal
>> header, and use a unified 'inner header' is also reasonable.:)
>>
> Since in rest of other description, we refer to it as "outer", it is better to keep it consistent as outer instead of external here.

Totally agree.

Thanks.



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-25 21:03   ` Michael S. Tsirkin
@ 2023-04-26 14:14     ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 14:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/4/26 上午5:03, Michael S. Tsirkin 写道:
> On Sun, Apr 23, 2023 at 03:35:32PM +0800, Heng Qi wrote:
>> 1. Currently, a received encapsulated packet has an outer and an inner header, but
>> the virtio device is unable to calculate the hash for the inner header. The same
>> flow can traverse through different tunnels, resulting in the encapsulated
>> packets being spread across multiple receive queues (refer to the figure below).
>> However, in certain scenarios, we may need to direct these encapsulated packets of
>> the same flow to a single receive queue. This facilitates the processing
>> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
>>
>>                 client1                    client2
>>                    |        +-------+         |
>>                    +------->|tunnels|<--------+
>>                             +-------+
>>                                |  |
>>                                v  v
>>                        +-----------------+
>>                        | monitoring host |
>>                        +-----------------+
>>
>> To achieve this, the device can calculate a symmetric hash based on the inner headers
>> of the same flow.
>>
>> 2. For legacy systems, they may lack entropy fields which modern protocols have in
>> the outer header, resulting in multiple flows with the same outer header but
>> different inner headers being directed to the same receive queue. This results in
>> poor receive performance.
>>
>> To address this limitation, inner header hash can be used to enable the device to advertise
>> the capability to calculate the hash for the inner packet, regaining better receive performance.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> grammar in new text is still pretty bad, lots of typos too.
> Don't have time to fix it for you right now sorry, it's
> a holiday here.

I used a grammar checker and it doesn't seem to be doing a good job. 
I'll do a more granular check.:)

>
>> ---
>> v12->v13:
>> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
>> 	2. Add tunneling protocol explanation. @Jason Wang
>> 	3. Add comments on some usage scenarios for inner hash.
>>
>> v11->v12:
>> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
>> 	2. Refine the commit log. @Michael S . Tsirkin
>> 	3. Add some tunnel types.
>>
>> v10->v11:
>> 	1. Revise commit log for clarity for readers.
>> 	2. Some modifications to avoid undefined terms. @Parav Pandit
>> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
>> 	4. Add the normative statements. @Parav Pandit
>>
>> v9->v10:
>> 	1. Removed hash_report_tunnel related information. @Parav Pandit
>> 	2. Re-describe the limitations of QoS for tunneling.
>> 	3. Some clarification.
>>
>> v8->v9:
>> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
>> 	2. Add tunnel security section. @Michael S . Tsirkin
>> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
>> 	4. Fix some typos.
>> 	5. Add more tunnel types. @Michael S . Tsirkin
>>
>> v7->v8:
>> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
>> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
>> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
>> 	4. Fix some typos. @Michael S . Tsirkin
>> 	5. Clarify some sentences. @Michael S . Tsirkin
>>
>> v6->v7:
>> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>> 	2. Fix some syntax issues. @Michael S. Tsirkin
>>
>> v5->v6:
>> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>> 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
>> 	3. Move the links to introduction section. @Michael S. Tsirkin
>> 	4. Clarify some sentences. @Michael S. Tsirkin
>>
>> v4->v5:
>> 	1. Clarify some paragraphs. @Cornelia Huck
>> 	2. Fix the u8 type. @Cornelia Huck
>>
>> v3->v4:
>> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>
>> v2->v3:
>> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>> 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>
>> v1->v2:
>> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>> 	2. Clarify some paragraphs. @Jason Wang
>> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>
>>   device-types/net/description.tex        | 159 ++++++++++++++++++++++++
>>   device-types/net/device-conformance.tex |   1 +
>>   device-types/net/driver-conformance.tex |   1 +
>>   introduction.tex                        |  44 +++++++
>>   4 files changed, 205 insertions(+)
>>
>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>> index 0500bb6..48e41f1 100644
>> --- a/device-types/net/description.tex
>> +++ b/device-types/net/description.tex
>> @@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>>   \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>>       channel.
>>   
>> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
>> +    for tunnel-encapsulated packets.
>> +
>>   \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>   
>>   \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
>> @@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>>   \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>>   \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>   \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>>   \end{description}
>>   
>>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>>           u8 rss_max_key_size;
>>           le16 rss_max_indirection_table_length;
>>           le32 supported_hash_types;
>> +        le32 supported_tunnel_hash_types;
>>   };
>>   \end{lstlisting}
>>   The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
>> @@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>>   Field \field{supported_hash_types} contains the bitmask of supported hash types.
>>   See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
>>   
>> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
>> +
>> +Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
>> +
>>   \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
>>   
>>   The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
>> @@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   If the feature VIRTIO_NET_F_RSS was negotiated:
>>   \begin{itemize}
>>   \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>>   \end{itemize}
>> @@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   If the feature VIRTIO_NET_F_RSS was not negotiated:
>>   \begin{itemize}
>>   \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>>   \end{itemize}
>> @@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   
>>   \subparagraph{Supported/enabled hash types}
>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>   Hash types applicable for IPv4 packets:
>>   \begin{lstlisting}
>>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
>> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>>   \end{itemize}
>>   
>> +\paragraph{Inner Header Hash}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
>> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
>> +
>> +struct virtio_net_hash_tunnel_config {
>> +    le32 hash_tunnel_types;
>> +};
>> +
>> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
>> +
>> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
>> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
>> +
>> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
>> +\begin{itemize}
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
>> +\end{itemize}
>> +
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
>> +
>> +\subparagraph{Tunnel/Encapsulated packet}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
>> +
>> +A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
>> +encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
>> +the device calculates the hash over either the inner header or the outer header.
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
>> +configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
>> +
>> +Supported encapsulated packet types:
>> +\begin{itemize}
>> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
>> +\end{itemize}
>> +
>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
>> +the hash of the outer header is calculated for the received encapsulated packet.
>> +
>> +The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
>> +
>> +\subparagraph{Supported/enabled encapsulation hash types}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
>> +
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
>> +\end{lstlisting}
>> +
>> +Supported encapsulation hash types:
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
>> +\end{lstlisting}
> Too many protocols to support. Can we start with just one or two?

This does not mean that every device needs to implement and support all 
of these, they can choose to support some protocols they want.

I add these because we have scale application scenarios for modern 
protocols VXLAN-GPE/GENEVE:

+\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
+      warm caches, lessing locking, etc. are optimized to obtain receiving performance.


Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 14:14     ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-26 14:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/4/26 上午5:03, Michael S. Tsirkin 写道:
> On Sun, Apr 23, 2023 at 03:35:32PM +0800, Heng Qi wrote:
>> 1. Currently, a received encapsulated packet has an outer and an inner header, but
>> the virtio device is unable to calculate the hash for the inner header. The same
>> flow can traverse through different tunnels, resulting in the encapsulated
>> packets being spread across multiple receive queues (refer to the figure below).
>> However, in certain scenarios, we may need to direct these encapsulated packets of
>> the same flow to a single receive queue. This facilitates the processing
>> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
>>
>>                 client1                    client2
>>                    |        +-------+         |
>>                    +------->|tunnels|<--------+
>>                             +-------+
>>                                |  |
>>                                v  v
>>                        +-----------------+
>>                        | monitoring host |
>>                        +-----------------+
>>
>> To achieve this, the device can calculate a symmetric hash based on the inner headers
>> of the same flow.
>>
>> 2. For legacy systems, they may lack entropy fields which modern protocols have in
>> the outer header, resulting in multiple flows with the same outer header but
>> different inner headers being directed to the same receive queue. This results in
>> poor receive performance.
>>
>> To address this limitation, inner header hash can be used to enable the device to advertise
>> the capability to calculate the hash for the inner packet, regaining better receive performance.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> grammar in new text is still pretty bad, lots of typos too.
> Don't have time to fix it for you right now sorry, it's
> a holiday here.

I used a grammar checker and it doesn't seem to be doing a good job. 
I'll do a more granular check.:)

>
>> ---
>> v12->v13:
>> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
>> 	2. Add tunneling protocol explanation. @Jason Wang
>> 	3. Add comments on some usage scenarios for inner hash.
>>
>> v11->v12:
>> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
>> 	2. Refine the commit log. @Michael S . Tsirkin
>> 	3. Add some tunnel types.
>>
>> v10->v11:
>> 	1. Revise commit log for clarity for readers.
>> 	2. Some modifications to avoid undefined terms. @Parav Pandit
>> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
>> 	4. Add the normative statements. @Parav Pandit
>>
>> v9->v10:
>> 	1. Removed hash_report_tunnel related information. @Parav Pandit
>> 	2. Re-describe the limitations of QoS for tunneling.
>> 	3. Some clarification.
>>
>> v8->v9:
>> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
>> 	2. Add tunnel security section. @Michael S . Tsirkin
>> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
>> 	4. Fix some typos.
>> 	5. Add more tunnel types. @Michael S . Tsirkin
>>
>> v7->v8:
>> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
>> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
>> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
>> 	4. Fix some typos. @Michael S . Tsirkin
>> 	5. Clarify some sentences. @Michael S . Tsirkin
>>
>> v6->v7:
>> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>> 	2. Fix some syntax issues. @Michael S. Tsirkin
>>
>> v5->v6:
>> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>> 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
>> 	3. Move the links to introduction section. @Michael S. Tsirkin
>> 	4. Clarify some sentences. @Michael S. Tsirkin
>>
>> v4->v5:
>> 	1. Clarify some paragraphs. @Cornelia Huck
>> 	2. Fix the u8 type. @Cornelia Huck
>>
>> v3->v4:
>> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>
>> v2->v3:
>> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>> 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>
>> v1->v2:
>> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>> 	2. Clarify some paragraphs. @Jason Wang
>> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>
>>   device-types/net/description.tex        | 159 ++++++++++++++++++++++++
>>   device-types/net/device-conformance.tex |   1 +
>>   device-types/net/driver-conformance.tex |   1 +
>>   introduction.tex                        |  44 +++++++
>>   4 files changed, 205 insertions(+)
>>
>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>> index 0500bb6..48e41f1 100644
>> --- a/device-types/net/description.tex
>> +++ b/device-types/net/description.tex
>> @@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>>   \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>>       channel.
>>   
>> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
>> +    for tunnel-encapsulated packets.
>> +
>>   \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>   
>>   \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
>> @@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>>   \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>>   \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>   \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>>   \end{description}
>>   
>>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>>           u8 rss_max_key_size;
>>           le16 rss_max_indirection_table_length;
>>           le32 supported_hash_types;
>> +        le32 supported_tunnel_hash_types;
>>   };
>>   \end{lstlisting}
>>   The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
>> @@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>>   Field \field{supported_hash_types} contains the bitmask of supported hash types.
>>   See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
>>   
>> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
>> +
>> +Filed \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
>> +
>>   \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
>>   
>>   The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
>> @@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   If the feature VIRTIO_NET_F_RSS was negotiated:
>>   \begin{itemize}
>>   \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>>   \end{itemize}
>> @@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   If the feature VIRTIO_NET_F_RSS was not negotiated:
>>   \begin{itemize}
>>   \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>>   \end{itemize}
>> @@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   
>>   \subparagraph{Supported/enabled hash types}
>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>   Hash types applicable for IPv4 packets:
>>   \begin{lstlisting}
>>   #define VIRTIO_NET_HASH_TYPE_IPv4              (1 << 0)
>> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>>   \end{itemize}
>>   
>> +\paragraph{Inner Header Hash}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
>> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
>> +
>> +struct virtio_net_hash_tunnel_config {
>> +    le32 hash_tunnel_types;
>> +};
>> +
>> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
>> +
>> +Filed \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
>> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash tunnel types}.
>> +
>> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
>> +\begin{itemize}
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
>> +\end{itemize}
>> +
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
>> +
>> +\subparagraph{Tunnel/Encapsulated packet}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
>> +
>> +A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
>> +encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
>> +the device calculates the hash over either the inner header or the outer header.
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
>> +configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
>> +
>> +Supported encapsulated packet types:
>> +\begin{itemize}
>> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:sit]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses TCP-like as the transport protocol.
>> +\end{itemize}
>> +
>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
>> +the hash of the outer header is calculated for the received encapsulated packet.
>> +
>> +The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
>> +
>> +\subparagraph{Supported/enabled encapsulation hash types}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
>> +
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
>> +\end{lstlisting}
>> +
>> +Supported encapsulation hash types:
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT         (1 << 10)
>> +\end{lstlisting}
> Too many protocols to support. Can we start with just one or two?

This does not mean that every device needs to implement and support all 
of these, they can choose to support some protocols they want.

I add these because we have scale application scenarios for modern 
protocols VXLAN-GPE/GENEVE:

+\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
+      warm caches, lessing locking, etc. are optimized to obtain receiving performance.


Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.

Thanks.

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:03         ` [virtio-dev] " Heng Qi
@ 2023-04-26 14:24           ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 14:24 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, April 26, 2023 10:04 AM

> Yes, but that seems like a tiny cost, and the cvq command-related structure is
> much simpler.
Current structure size is 24 bytes.
This size becomes multiplier with device count scale to be always available and rarely changes.

As we add new features such device capabilities grow making the multiplier bigger.
For example 
a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
b. hds capabilities
c. counter capabilities (histogram based, which error counters supported, etc) 
d. which new type of tx vq improvements supported.
e. hw gro context count supported

May be more..

Depending on the container/VM size certain capabilities may change from device to device.
Hence it is hard to deduplicate them at device level.

Therefore, ability to query them over a non_always_available transport is preferred choice from the device.

A driver may choose to cache it if its being frequently accessed or ask device when needed.
Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 14:24           ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 14:24 UTC (permalink / raw)
  To: Heng Qi, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, April 26, 2023 10:04 AM

> Yes, but that seems like a tiny cost, and the cvq command-related structure is
> much simpler.
Current structure size is 24 bytes.
This size becomes multiplier with device count scale to be always available and rarely changes.

As we add new features such device capabilities grow making the multiplier bigger.
For example 
a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
b. hds capabilities
c. counter capabilities (histogram based, which error counters supported, etc) 
d. which new type of tx vq improvements supported.
e. hw gro context count supported

May be more..

Depending on the container/VM size certain capabilities may change from device to device.
Hence it is hard to deduplicate them at device level.

Therefore, ability to query them over a non_always_available transport is preferred choice from the device.

A driver may choose to cache it if its being frequently accessed or ask device when needed.
Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:14     ` Heng Qi
@ 2023-04-26 14:48       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26 14:48 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> This does not mean that every device needs to implement and support all of
> these, they can choose to support some protocols they want.
> 
> I add these because we have scale application scenarios for modern protocols
> VXLAN-GPE/GENEVE:
> 
> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> 
> 
> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> 
> Thanks.

But VXLAN-GPE/GENEVE can use source port for entropy.

	It is recommended that the UDP source port number
	 be calculated using a hash of fields from the inner packet

That is best because
it allows end to end control and is protocol agnostic.
All that is missing is symmetric Toepliz and all is well?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 14:48       ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26 14:48 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> This does not mean that every device needs to implement and support all of
> these, they can choose to support some protocols they want.
> 
> I add these because we have scale application scenarios for modern protocols
> VXLAN-GPE/GENEVE:
> 
> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> 
> 
> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> 
> Thanks.

But VXLAN-GPE/GENEVE can use source port for entropy.

	It is recommended that the UDP source port number
	 be calculated using a hash of fields from the inner packet

That is best because
it allows end to end control and is protocol agnostic.
All that is missing is symmetric Toepliz and all is well?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:24           ` Parav Pandit
@ 2023-04-26 14:57             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26 14:57 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 02:24:48PM +0000, Parav Pandit wrote:
> 
> 
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, April 26, 2023 10:04 AM
> 
> > Yes, but that seems like a tiny cost, and the cvq command-related structure is
> > much simpler.
> Current structure size is 24 bytes.
> This size becomes multiplier with device count scale to be always available and rarely changes.
> 
> As we add new features such device capabilities grow making the multiplier bigger.
> For example 
> a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
> b. hds capabilities
> c. counter capabilities (histogram based, which error counters supported, etc) 
> d. which new type of tx vq improvements supported.
> e. hw gro context count supported
> 
> May be more..

All these are ROM though. Can be shared between functions on a single
device, be it VFs or multifunction PF. Yea I heard you about
maybe making them programmable. For some cases where there is a
hardware resource associated with it, it makes sense.
Not in this case though, it's just another way to calculate hash.


> Depending on the container/VM size certain capabilities may change from device to device.
> Hence it is hard to deduplicate them at device level.
> 
> Therefore, ability to query them over a non_always_available transport is preferred choice from the device.
> 
> A driver may choose to cache it if its being frequently accessed or ask device when needed.
> Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.

well caching by driver is using up same amount of RAM, only with no
chance to 

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 14:57             ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-26 14:57 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, Apr 26, 2023 at 02:24:48PM +0000, Parav Pandit wrote:
> 
> 
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, April 26, 2023 10:04 AM
> 
> > Yes, but that seems like a tiny cost, and the cvq command-related structure is
> > much simpler.
> Current structure size is 24 bytes.
> This size becomes multiplier with device count scale to be always available and rarely changes.
> 
> As we add new features such device capabilities grow making the multiplier bigger.
> For example 
> a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
> b. hds capabilities
> c. counter capabilities (histogram based, which error counters supported, etc) 
> d. which new type of tx vq improvements supported.
> e. hw gro context count supported
> 
> May be more..

All these are ROM though. Can be shared between functions on a single
device, be it VFs or multifunction PF. Yea I heard you about
maybe making them programmable. For some cases where there is a
hardware resource associated with it, it makes sense.
Not in this case though, it's just another way to calculate hash.


> Depending on the container/VM size certain capabilities may change from device to device.
> Hence it is hard to deduplicate them at device level.
> 
> Therefore, ability to query them over a non_always_available transport is preferred choice from the device.
> 
> A driver may choose to cache it if its being frequently accessed or ask device when needed.
> Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.

well caching by driver is using up same amount of RAM, only with no
chance to 

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:57             ` Michael S. Tsirkin
@ 2023-04-26 15:20               ` Parav Pandit
  -1 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 15:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 26, 2023 10:58 AM

> All these are ROM though. Can be shared between functions on a single device,
> be it VFs or multifunction PF. Yea I heard you about maybe making them
> programmable. For some cases where there is a hardware resource associated
> with it, it makes sense.
Right. Some are, some are not ROM.
And it demands special circuitry for non-critical path.

> Not in this case though, it's just another way to calculate hash.
> 
Sure. Instead of doing field by field bi-section this way requires constant discussion like this.
Instead, placement of time critical fields in MMIO is better approach as it is generic guideline vs ROM/RAM fields.
 
> 
> > Depending on the container/VM size certain capabilities may change from
> device to device.
> > Hence it is hard to deduplicate them at device level.
> >
> > Therefore, ability to query them over a non_always_available transport is
> preferred choice from the device.
> >
> > A driver may choose to cache it if its being frequently accessed or ask device
> when needed.
> > Even when it's cached by driver, it is coming from the component that
> doesn’t have transport level timeouts associated with it.
> 
> well caching by driver is using up same amount of RAM, only with no chance to
> 
No chance to fault. And better utilizing the device.
And it is not always the pinned RAM, future it can be paged out too. 
Many fields won't be even cached.
Some may even get set in the net_device or other kernel level device structures.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-26 15:20               ` Parav Pandit
  0 siblings, 0 replies; 60+ messages in thread
From: Parav Pandit @ 2023-04-26 15:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-dev, virtio-comment, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 26, 2023 10:58 AM

> All these are ROM though. Can be shared between functions on a single device,
> be it VFs or multifunction PF. Yea I heard you about maybe making them
> programmable. For some cases where there is a hardware resource associated
> with it, it makes sense.
Right. Some are, some are not ROM.
And it demands special circuitry for non-critical path.

> Not in this case though, it's just another way to calculate hash.
> 
Sure. Instead of doing field by field bi-section this way requires constant discussion like this.
Instead, placement of time critical fields in MMIO is better approach as it is generic guideline vs ROM/RAM fields.
 
> 
> > Depending on the container/VM size certain capabilities may change from
> device to device.
> > Hence it is hard to deduplicate them at device level.
> >
> > Therefore, ability to query them over a non_always_available transport is
> preferred choice from the device.
> >
> > A driver may choose to cache it if its being frequently accessed or ask device
> when needed.
> > Even when it's cached by driver, it is coming from the component that
> doesn’t have transport level timeouts associated with it.
> 
> well caching by driver is using up same amount of RAM, only with no chance to
> 
No chance to fault. And better utilizing the device.
And it is not always the pinned RAM, future it can be paged out too. 
Many fields won't be even cached.
Some may even get set in the net_device or other kernel level device structures.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:24           ` Parav Pandit
@ 2023-04-27  2:19             ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-27  2:19 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午10:24, Parav Pandit 写道:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, April 26, 2023 10:04 AM
>> Yes, but that seems like a tiny cost, and the cvq command-related structure is
>> much simpler.
> Current structure size is 24 bytes.
> This size becomes multiplier with device count scale to be always available and rarely changes.
>
> As we add new features such device capabilities grow making the multiplier bigger.
> For example
> a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
> b. hds capabilities
> c. counter capabilities (histogram based, which error counters supported, etc)
> d. which new type of tx vq improvements supported.
> e. hw gro context count supported
>
> May be more..
>
> Depending on the container/VM size certain capabilities may change from device to device.
> Hence it is hard to deduplicate them at device level.

This makes sense. In general, we should be careful about adding things 
to the device space unless the benefit is non-trivial.

Thanks.

>
> Therefore, ability to query them over a non_always_available transport is preferred choice from the device.
>
> A driver may choose to cache it if its being frequently accessed or ask device when needed.
> Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [virtio-dev] RE: [virtio-comment] RE: [PATCH v13] virtio-net: support inner header hash
@ 2023-04-27  2:19             ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-27  2:19 UTC (permalink / raw)
  To: Parav Pandit, virtio-dev, virtio-comment
  Cc: Michael S . Tsirkin, Jason Wang, Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午10:24, Parav Pandit 写道:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, April 26, 2023 10:04 AM
>> Yes, but that seems like a tiny cost, and the cvq command-related structure is
>> much simpler.
> Current structure size is 24 bytes.
> This size becomes multiplier with device count scale to be always available and rarely changes.
>
> As we add new features such device capabilities grow making the multiplier bigger.
> For example
> a. flow steering capabilities (how many flows, what mask, supported protocols, generic options)
> b. hds capabilities
> c. counter capabilities (histogram based, which error counters supported, etc)
> d. which new type of tx vq improvements supported.
> e. hw gro context count supported
>
> May be more..
>
> Depending on the container/VM size certain capabilities may change from device to device.
> Hence it is hard to deduplicate them at device level.

This makes sense. In general, we should be careful about adding things 
to the device space unless the benefit is non-trivial.

Thanks.

>
> Therefore, ability to query them over a non_always_available transport is preferred choice from the device.
>
> A driver may choose to cache it if its being frequently accessed or ask device when needed.
> Even when it's cached by driver, it is coming from the component that doesn’t have transport level timeouts associated with it.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-26 14:48       ` Michael S. Tsirkin
@ 2023-04-27  2:28         ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-27  2:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>> This does not mean that every device needs to implement and support all of
>> these, they can choose to support some protocols they want.
>>
>> I add these because we have scale application scenarios for modern protocols
>> VXLAN-GPE/GENEVE:
>>
>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>
>>
>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>
>> Thanks.
> But VXLAN-GPE/GENEVE can use source port for entropy.
>
> 	It is recommended that the UDP source port number
> 	 be calculated using a hash of fields from the inner packet
>
> That is best because
> it allows end to end control and is protocol agnostic.

Yes. I agree with this, I don't think we have an argument on this point 
right now.:)

For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to 
deal with
scenarios where the same flow passes through different tunnels.

Having them hashed to the same rx queue, is hard to do via outer headers.

> All that is missing is symmetric Toepliz and all is well?

The scenarios above or in the commit log also require inner headers.


Thanks.

>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-27  2:28         ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-04-27  2:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>> This does not mean that every device needs to implement and support all of
>> these, they can choose to support some protocols they want.
>>
>> I add these because we have scale application scenarios for modern protocols
>> VXLAN-GPE/GENEVE:
>>
>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>
>>
>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>
>> Thanks.
> But VXLAN-GPE/GENEVE can use source port for entropy.
>
> 	It is recommended that the UDP source port number
> 	 be calculated using a hash of fields from the inner packet
>
> That is best because
> it allows end to end control and is protocol agnostic.

Yes. I agree with this, I don't think we have an argument on this point 
right now.:)

For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to 
deal with
scenarios where the same flow passes through different tunnels.

Having them hashed to the same rx queue, is hard to do via outer headers.

> All that is missing is symmetric Toepliz and all is well?

The scenarios above or in the commit log also require inner headers.


Thanks.

>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-27  2:28         ` [virtio-comment] " Heng Qi
@ 2023-04-27 17:13           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-27 17:13 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> 
> 
> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > This does not mean that every device needs to implement and support all of
> > > these, they can choose to support some protocols they want.
> > > 
> > > I add these because we have scale application scenarios for modern protocols
> > > VXLAN-GPE/GENEVE:
> > > 
> > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > 
> > > 
> > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > 
> > > Thanks.
> > But VXLAN-GPE/GENEVE can use source port for entropy.
> > 
> > 	It is recommended that the UDP source port number
> > 	 be calculated using a hash of fields from the inner packet
> > 
> > That is best because
> > it allows end to end control and is protocol agnostic.
> 
> Yes. I agree with this, I don't think we have an argument on this point
> right now.:)
> 
> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> with
> scenarios where the same flow passes through different tunnels.
> 
> Having them hashed to the same rx queue, is hard to do via outer headers.
> > All that is missing is symmetric Toepliz and all is well?
> 
> The scenarios above or in the commit log also require inner headers.

Hmm I am not sure I get it 100%.
Could you show an example with inner header hash in the port #,
hash is symmetric, and you still have trouble?


It kinds of sounds like not enough entropy is not the problem
at this point. You now want to drop everything from the header
except the UDP source port. Is that a fair summary?

> 
> 
> Thanks.
> 
> > 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-04-27 17:13           ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-04-27 17:13 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> 
> 
> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > This does not mean that every device needs to implement and support all of
> > > these, they can choose to support some protocols they want.
> > > 
> > > I add these because we have scale application scenarios for modern protocols
> > > VXLAN-GPE/GENEVE:
> > > 
> > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > 
> > > 
> > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > 
> > > Thanks.
> > But VXLAN-GPE/GENEVE can use source port for entropy.
> > 
> > 	It is recommended that the UDP source port number
> > 	 be calculated using a hash of fields from the inner packet
> > 
> > That is best because
> > it allows end to end control and is protocol agnostic.
> 
> Yes. I agree with this, I don't think we have an argument on this point
> right now.:)
> 
> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> with
> scenarios where the same flow passes through different tunnels.
> 
> Having them hashed to the same rx queue, is hard to do via outer headers.
> > All that is missing is symmetric Toepliz and all is well?
> 
> The scenarios above or in the commit log also require inner headers.

Hmm I am not sure I get it 100%.
Could you show an example with inner header hash in the port #,
hash is symmetric, and you still have trouble?


It kinds of sounds like not enough entropy is not the problem
at this point. You now want to drop everything from the header
except the UDP source port. Is that a fair summary?

> 
> 
> Thanks.
> 
> > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-04-27 17:13           ` [virtio-comment] " Michael S. Tsirkin
@ 2023-05-05 13:51             ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-05 13:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > 
> > 
> > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > This does not mean that every device needs to implement and support all of
> > > > these, they can choose to support some protocols they want.
> > > > 
> > > > I add these because we have scale application scenarios for modern protocols
> > > > VXLAN-GPE/GENEVE:
> > > > 
> > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > 
> > > > 
> > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > 
> > > > Thanks.
> > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > 
> > > 	It is recommended that the UDP source port number
> > > 	 be calculated using a hash of fields from the inner packet
> > > 
> > > That is best because
> > > it allows end to end control and is protocol agnostic.
> > 
> > Yes. I agree with this, I don't think we have an argument on this point
> > right now.:)
> > 
> > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > with
> > scenarios where the same flow passes through different tunnels.
> > 
> > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > All that is missing is symmetric Toepliz and all is well?
> > 
> > The scenarios above or in the commit log also require inner headers.
> 
> Hmm I am not sure I get it 100%.
> Could you show an example with inner header hash in the port #,
> hash is symmetric, and you still have trouble?
> 
> 
> It kinds of sounds like not enough entropy is not the problem
> at this point.

Sorry for the late reply. :)

For modern tunneling protocols, yes.

> You now want to drop everything from the header
> except the UDP source port. Is that a fair summary?
> 

For example, for the same flow passing through different VXLAN tunnels,
packets in this flow have the same inner header and different outer
headers. Sometimes these packets of the flow need to be hashed to the
same rxq, then we can use the inner header as the hash input.

Thanks!

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-05 13:51             ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-05 13:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > 
> > 
> > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > This does not mean that every device needs to implement and support all of
> > > > these, they can choose to support some protocols they want.
> > > > 
> > > > I add these because we have scale application scenarios for modern protocols
> > > > VXLAN-GPE/GENEVE:
> > > > 
> > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > 
> > > > 
> > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > 
> > > > Thanks.
> > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > 
> > > 	It is recommended that the UDP source port number
> > > 	 be calculated using a hash of fields from the inner packet
> > > 
> > > That is best because
> > > it allows end to end control and is protocol agnostic.
> > 
> > Yes. I agree with this, I don't think we have an argument on this point
> > right now.:)
> > 
> > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > with
> > scenarios where the same flow passes through different tunnels.
> > 
> > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > All that is missing is symmetric Toepliz and all is well?
> > 
> > The scenarios above or in the commit log also require inner headers.
> 
> Hmm I am not sure I get it 100%.
> Could you show an example with inner header hash in the port #,
> hash is symmetric, and you still have trouble?
> 
> 
> It kinds of sounds like not enough entropy is not the problem
> at this point.

Sorry for the late reply. :)

For modern tunneling protocols, yes.

> You now want to drop everything from the header
> except the UDP source port. Is that a fair summary?
> 

For example, for the same flow passing through different VXLAN tunnels,
packets in this flow have the same inner header and different outer
headers. Sometimes these packets of the flow need to be hashed to the
same rxq, then we can use the inner header as the hash input.

Thanks!

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-05 13:51             ` Heng Qi
@ 2023-05-05 14:56               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-05 14:56 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > 
> > > 
> > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > This does not mean that every device needs to implement and support all of
> > > > > these, they can choose to support some protocols they want.
> > > > > 
> > > > > I add these because we have scale application scenarios for modern protocols
> > > > > VXLAN-GPE/GENEVE:
> > > > > 
> > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > 
> > > > > 
> > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > 
> > > > > Thanks.
> > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > 
> > > > 	It is recommended that the UDP source port number
> > > > 	 be calculated using a hash of fields from the inner packet
> > > > 
> > > > That is best because
> > > > it allows end to end control and is protocol agnostic.
> > > 
> > > Yes. I agree with this, I don't think we have an argument on this point
> > > right now.:)
> > > 
> > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > with
> > > scenarios where the same flow passes through different tunnels.
> > > 
> > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > All that is missing is symmetric Toepliz and all is well?
> > > 
> > > The scenarios above or in the commit log also require inner headers.
> > 
> > Hmm I am not sure I get it 100%.
> > Could you show an example with inner header hash in the port #,
> > hash is symmetric, and you still have trouble?
> > 
> > 
> > It kinds of sounds like not enough entropy is not the problem
> > at this point.
> 
> Sorry for the late reply. :)
> 
> For modern tunneling protocols, yes.
> 
> > You now want to drop everything from the header
> > except the UDP source port. Is that a fair summary?
> > 
> 
> For example, for the same flow passing through different VXLAN tunnels,
> packets in this flow have the same inner header and different outer
> headers. Sometimes these packets of the flow need to be hashed to the
> same rxq, then we can use the inner header as the hash input.
> 
> Thanks!

So, they will have the same source port yes? Any way to use that
so we don't depend on a specific protocol?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-05 14:56               ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-05 14:56 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > 
> > > 
> > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > This does not mean that every device needs to implement and support all of
> > > > > these, they can choose to support some protocols they want.
> > > > > 
> > > > > I add these because we have scale application scenarios for modern protocols
> > > > > VXLAN-GPE/GENEVE:
> > > > > 
> > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > 
> > > > > 
> > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > 
> > > > > Thanks.
> > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > 
> > > > 	It is recommended that the UDP source port number
> > > > 	 be calculated using a hash of fields from the inner packet
> > > > 
> > > > That is best because
> > > > it allows end to end control and is protocol agnostic.
> > > 
> > > Yes. I agree with this, I don't think we have an argument on this point
> > > right now.:)
> > > 
> > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > with
> > > scenarios where the same flow passes through different tunnels.
> > > 
> > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > All that is missing is symmetric Toepliz and all is well?
> > > 
> > > The scenarios above or in the commit log also require inner headers.
> > 
> > Hmm I am not sure I get it 100%.
> > Could you show an example with inner header hash in the port #,
> > hash is symmetric, and you still have trouble?
> > 
> > 
> > It kinds of sounds like not enough entropy is not the problem
> > at this point.
> 
> Sorry for the late reply. :)
> 
> For modern tunneling protocols, yes.
> 
> > You now want to drop everything from the header
> > except the UDP source port. Is that a fair summary?
> > 
> 
> For example, for the same flow passing through different VXLAN tunnels,
> packets in this flow have the same inner header and different outer
> headers. Sometimes these packets of the flow need to be hashed to the
> same rxq, then we can use the inner header as the hash input.
> 
> Thanks!

So, they will have the same source port yes? Any way to use that
so we don't depend on a specific protocol?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-05 14:56               ` Michael S. Tsirkin
@ 2023-05-09 14:22                 ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-09 14:22 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>
>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>> This does not mean that every device needs to implement and support all of
>>>>>> these, they can choose to support some protocols they want.
>>>>>>
>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>> VXLAN-GPE/GENEVE:
>>>>>>
>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>
>>>>>>
>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>
>>>>>> Thanks.
>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>
>>>>> 	It is recommended that the UDP source port number
>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>
>>>>> That is best because
>>>>> it allows end to end control and is protocol agnostic.
>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>> right now.:)
>>>>
>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>> with
>>>> scenarios where the same flow passes through different tunnels.
>>>>
>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>> All that is missing is symmetric Toepliz and all is well?
>>>> The scenarios above or in the commit log also require inner headers.
>>> Hmm I am not sure I get it 100%.
>>> Could you show an example with inner header hash in the port #,
>>> hash is symmetric, and you still have trouble?
>>>
>>>
>>> It kinds of sounds like not enough entropy is not the problem
>>> at this point.
>> Sorry for the late reply. :)
>>
>> For modern tunneling protocols, yes.
>>
>>> You now want to drop everything from the header
>>> except the UDP source port. Is that a fair summary?
>>>
>> For example, for the same flow passing through different VXLAN tunnels,
>> packets in this flow have the same inner header and different outer
>> headers. Sometimes these packets of the flow need to be hashed to the
>> same rxq, then we can use the inner header as the hash input.
>>
>> Thanks!
> So, they will have the same source port yes?

Yes. The outer source port can be calculated using the 5-tuple of the 
original packet,
and the outer ports are the same but the outer IPs are different after 
different directions of the same flow pass through different tunnels.

> Any way to use that

We use it in monitoring, firewall and other scenarios.

> so we don't depend on a specific protocol?

Yes, selected tunneling protocols can be used in this scenario like this.

Thanks.



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-09 14:22                 ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-09 14:22 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>
>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>> This does not mean that every device needs to implement and support all of
>>>>>> these, they can choose to support some protocols they want.
>>>>>>
>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>> VXLAN-GPE/GENEVE:
>>>>>>
>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>
>>>>>>
>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>
>>>>>> Thanks.
>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>
>>>>> 	It is recommended that the UDP source port number
>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>
>>>>> That is best because
>>>>> it allows end to end control and is protocol agnostic.
>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>> right now.:)
>>>>
>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>> with
>>>> scenarios where the same flow passes through different tunnels.
>>>>
>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>> All that is missing is symmetric Toepliz and all is well?
>>>> The scenarios above or in the commit log also require inner headers.
>>> Hmm I am not sure I get it 100%.
>>> Could you show an example with inner header hash in the port #,
>>> hash is symmetric, and you still have trouble?
>>>
>>>
>>> It kinds of sounds like not enough entropy is not the problem
>>> at this point.
>> Sorry for the late reply. :)
>>
>> For modern tunneling protocols, yes.
>>
>>> You now want to drop everything from the header
>>> except the UDP source port. Is that a fair summary?
>>>
>> For example, for the same flow passing through different VXLAN tunnels,
>> packets in this flow have the same inner header and different outer
>> headers. Sometimes these packets of the flow need to be hashed to the
>> same rxq, then we can use the inner header as the hash input.
>>
>> Thanks!
> So, they will have the same source port yes?

Yes. The outer source port can be calculated using the 5-tuple of the 
original packet,
and the outer ports are the same but the outer IPs are different after 
different directions of the same flow pass through different tunnels.

> Any way to use that

We use it in monitoring, firewall and other scenarios.

> so we don't depend on a specific protocol?

Yes, selected tunneling protocols can be used in this scenario like this.

Thanks.



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-09 14:22                 ` [virtio-comment] " Heng Qi
@ 2023-05-09 15:15                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-09 15:15 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> 
> 
> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > 
> > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > these, they can choose to support some protocols they want.
> > > > > > > 
> > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > 
> > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > 
> > > > > > > 
> > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > 
> > > > > > > Thanks.
> > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > 
> > > > > > 	It is recommended that the UDP source port number
> > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > 
> > > > > > That is best because
> > > > > > it allows end to end control and is protocol agnostic.
> > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > right now.:)
> > > > > 
> > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > with
> > > > > scenarios where the same flow passes through different tunnels.
> > > > > 
> > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > The scenarios above or in the commit log also require inner headers.
> > > > Hmm I am not sure I get it 100%.
> > > > Could you show an example with inner header hash in the port #,
> > > > hash is symmetric, and you still have trouble?
> > > > 
> > > > 
> > > > It kinds of sounds like not enough entropy is not the problem
> > > > at this point.
> > > Sorry for the late reply. :)
> > > 
> > > For modern tunneling protocols, yes.
> > > 
> > > > You now want to drop everything from the header
> > > > except the UDP source port. Is that a fair summary?
> > > > 
> > > For example, for the same flow passing through different VXLAN tunnels,
> > > packets in this flow have the same inner header and different outer
> > > headers. Sometimes these packets of the flow need to be hashed to the
> > > same rxq, then we can use the inner header as the hash input.
> > > 
> > > Thanks!
> > So, they will have the same source port yes?
> 
> Yes. The outer source port can be calculated using the 5-tuple of the
> original packet,
> and the outer ports are the same but the outer IPs are different after
> different directions of the same flow pass through different tunnels.
> > Any way to use that
> 
> We use it in monitoring, firewall and other scenarios.
> 
> > so we don't depend on a specific protocol?
> 
> Yes, selected tunneling protocols can be used in this scenario like this.
> 
> Thanks.
> 

No, the question was - can we generalize this somehow then?
For example, a flag to ignore source IP when hashing?
Or maybe just for UDP packets?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-09 15:15                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-09 15:15 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> 
> 
> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > 
> > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > these, they can choose to support some protocols they want.
> > > > > > > 
> > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > 
> > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > 
> > > > > > > 
> > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > 
> > > > > > > Thanks.
> > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > 
> > > > > > 	It is recommended that the UDP source port number
> > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > 
> > > > > > That is best because
> > > > > > it allows end to end control and is protocol agnostic.
> > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > right now.:)
> > > > > 
> > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > with
> > > > > scenarios where the same flow passes through different tunnels.
> > > > > 
> > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > The scenarios above or in the commit log also require inner headers.
> > > > Hmm I am not sure I get it 100%.
> > > > Could you show an example with inner header hash in the port #,
> > > > hash is symmetric, and you still have trouble?
> > > > 
> > > > 
> > > > It kinds of sounds like not enough entropy is not the problem
> > > > at this point.
> > > Sorry for the late reply. :)
> > > 
> > > For modern tunneling protocols, yes.
> > > 
> > > > You now want to drop everything from the header
> > > > except the UDP source port. Is that a fair summary?
> > > > 
> > > For example, for the same flow passing through different VXLAN tunnels,
> > > packets in this flow have the same inner header and different outer
> > > headers. Sometimes these packets of the flow need to be hashed to the
> > > same rxq, then we can use the inner header as the hash input.
> > > 
> > > Thanks!
> > So, they will have the same source port yes?
> 
> Yes. The outer source port can be calculated using the 5-tuple of the
> original packet,
> and the outer ports are the same but the outer IPs are different after
> different directions of the same flow pass through different tunnels.
> > Any way to use that
> 
> We use it in monitoring, firewall and other scenarios.
> 
> > so we don't depend on a specific protocol?
> 
> Yes, selected tunneling protocols can be used in this scenario like this.
> 
> Thanks.
> 

No, the question was - can we generalize this somehow then?
For example, a flag to ignore source IP when hashing?
Or maybe just for UDP packets?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-09 15:15                   ` [virtio-comment] " Michael S. Tsirkin
@ 2023-05-10  9:15                     ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-10  9:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
>>
>> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
>>> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>>>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>>>> This does not mean that every device needs to implement and support all of
>>>>>>>> these, they can choose to support some protocols they want.
>>>>>>>>
>>>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>>>> VXLAN-GPE/GENEVE:
>>>>>>>>
>>>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>>>
>>>>>>>>
>>>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>>>
>>>>>>> 	It is recommended that the UDP source port number
>>>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>>>
>>>>>>> That is best because
>>>>>>> it allows end to end control and is protocol agnostic.
>>>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>>>> right now.:)
>>>>>>
>>>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>>>> with
>>>>>> scenarios where the same flow passes through different tunnels.
>>>>>>
>>>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>>>> All that is missing is symmetric Toepliz and all is well?
>>>>>> The scenarios above or in the commit log also require inner headers.
>>>>> Hmm I am not sure I get it 100%.
>>>>> Could you show an example with inner header hash in the port #,
>>>>> hash is symmetric, and you still have trouble?
>>>>>
>>>>>
>>>>> It kinds of sounds like not enough entropy is not the problem
>>>>> at this point.
>>>> Sorry for the late reply. :)
>>>>
>>>> For modern tunneling protocols, yes.
>>>>
>>>>> You now want to drop everything from the header
>>>>> except the UDP source port. Is that a fair summary?
>>>>>
>>>> For example, for the same flow passing through different VXLAN tunnels,
>>>> packets in this flow have the same inner header and different outer
>>>> headers. Sometimes these packets of the flow need to be hashed to the
>>>> same rxq, then we can use the inner header as the hash input.
>>>>
>>>> Thanks!
>>> So, they will have the same source port yes?
>> Yes. The outer source port can be calculated using the 5-tuple of the
>> original packet,
>> and the outer ports are the same but the outer IPs are different after
>> different directions of the same flow pass through different tunnels.
>>> Any way to use that
>> We use it in monitoring, firewall and other scenarios.
>>
>>> so we don't depend on a specific protocol?
>> Yes, selected tunneling protocols can be used in this scenario like this.
>>
>> Thanks.
>>
> No, the question was - can we generalize this somehow then?
> For example, a flag to ignore source IP when hashing?
> Or maybe just for UDP packets?

1. I think the common solution is based on the inner header, so that 
GRE/IPIP tunnels can also enjoy inner symmetric hashing.

2. The VXLAN spec does not show that the outer source port in both 
directions of the same flow must be the same [1]
(although the outer source port is calculated based on the consistent 
hash in the kernel. The consistent hash will sort the five-tuple before 
calculating hashing),
but it is best not to assume that consistent hashing is used in all 
VXLAN implementations. The GENEVE spec uses "SHOUlD"[2].

3. How should we generalize? The device uses a feature to advertise all 
the tunnel types it supports, and hashes these tunnel types using the 
outer source port,
and then we still have to give the specific tunneling protocols 
supported by the device, just like we do now.

[1] "Source Port: It is recommended that the UDP source port number be 
calculated using a hash of fields from the inner packet -- one example
being a hash of the inner Ethernet frame's headers. This is to enable a 
level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
the VXLAN overlay. When calculating the UDP source port number in this 
manner, it is RECOMMENDED that the value be in the dynamic/private
port range 49152-65535 [RFC6335] "

[2] "Source Port: A source port selected by the originating tunnel 
endpoint. This source port SHOULD be the same for all packets belonging to a
single encapsulated flow to prevent reordering due to the use of 
different paths. To encourage an even distribution of flows across 
multiple links,
the source port SHOULD be calculated using a hash of the encapsulated 
packet headers using, for example, a traditional 5-tuple. Since the port
represents a flow identifier rather than a true UDP connection, the 
entire 16-bit range MAY be used to maximize entropy. In addition to 
setting the
source port, for IPv6, the flow label MAY also be used for providing 
entropy. For an example of using the IPv6 flow label for tunnel use 
cases, see [RFC6438]."

Thanks.

>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-10  9:15                     ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-10  9:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
>>
>> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
>>> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>>>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>>>> This does not mean that every device needs to implement and support all of
>>>>>>>> these, they can choose to support some protocols they want.
>>>>>>>>
>>>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>>>> VXLAN-GPE/GENEVE:
>>>>>>>>
>>>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>>>
>>>>>>>>
>>>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>>>
>>>>>>> 	It is recommended that the UDP source port number
>>>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>>>
>>>>>>> That is best because
>>>>>>> it allows end to end control and is protocol agnostic.
>>>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>>>> right now.:)
>>>>>>
>>>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>>>> with
>>>>>> scenarios where the same flow passes through different tunnels.
>>>>>>
>>>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>>>> All that is missing is symmetric Toepliz and all is well?
>>>>>> The scenarios above or in the commit log also require inner headers.
>>>>> Hmm I am not sure I get it 100%.
>>>>> Could you show an example with inner header hash in the port #,
>>>>> hash is symmetric, and you still have trouble?
>>>>>
>>>>>
>>>>> It kinds of sounds like not enough entropy is not the problem
>>>>> at this point.
>>>> Sorry for the late reply. :)
>>>>
>>>> For modern tunneling protocols, yes.
>>>>
>>>>> You now want to drop everything from the header
>>>>> except the UDP source port. Is that a fair summary?
>>>>>
>>>> For example, for the same flow passing through different VXLAN tunnels,
>>>> packets in this flow have the same inner header and different outer
>>>> headers. Sometimes these packets of the flow need to be hashed to the
>>>> same rxq, then we can use the inner header as the hash input.
>>>>
>>>> Thanks!
>>> So, they will have the same source port yes?
>> Yes. The outer source port can be calculated using the 5-tuple of the
>> original packet,
>> and the outer ports are the same but the outer IPs are different after
>> different directions of the same flow pass through different tunnels.
>>> Any way to use that
>> We use it in monitoring, firewall and other scenarios.
>>
>>> so we don't depend on a specific protocol?
>> Yes, selected tunneling protocols can be used in this scenario like this.
>>
>> Thanks.
>>
> No, the question was - can we generalize this somehow then?
> For example, a flag to ignore source IP when hashing?
> Or maybe just for UDP packets?

1. I think the common solution is based on the inner header, so that 
GRE/IPIP tunnels can also enjoy inner symmetric hashing.

2. The VXLAN spec does not show that the outer source port in both 
directions of the same flow must be the same [1]
(although the outer source port is calculated based on the consistent 
hash in the kernel. The consistent hash will sort the five-tuple before 
calculating hashing),
but it is best not to assume that consistent hashing is used in all 
VXLAN implementations. The GENEVE spec uses "SHOUlD"[2].

3. How should we generalize? The device uses a feature to advertise all 
the tunnel types it supports, and hashes these tunnel types using the 
outer source port,
and then we still have to give the specific tunneling protocols 
supported by the device, just like we do now.

[1] "Source Port: It is recommended that the UDP source port number be 
calculated using a hash of fields from the inner packet -- one example
being a hash of the inner Ethernet frame's headers. This is to enable a 
level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
the VXLAN overlay. When calculating the UDP source port number in this 
manner, it is RECOMMENDED that the value be in the dynamic/private
port range 49152-65535 [RFC6335] "

[2] "Source Port: A source port selected by the originating tunnel 
endpoint. This source port SHOULD be the same for all packets belonging to a
single encapsulated flow to prevent reordering due to the use of 
different paths. To encourage an even distribution of flows across 
multiple links,
the source port SHOULD be calculated using a hash of the encapsulated 
packet headers using, for example, a traditional 5-tuple. Since the port
represents a flow identifier rather than a true UDP connection, the 
entire 16-bit range MAY be used to maximize entropy. In addition to 
setting the
source port, for IPv6, the flow label MAY also be used for providing 
entropy. For an example of using the IPv6 flow label for tunnel use 
cases, see [RFC6438]."

Thanks.

>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-10  9:15                     ` Heng Qi
@ 2023-05-11  6:22                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-11  6:22 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> 
> 
> 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > 
> > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > 
> > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > 
> > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > 
> > > > > > > > > Thanks.
> > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > 
> > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > 
> > > > > > > > That is best because
> > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > right now.:)
> > > > > > > 
> > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > with
> > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > 
> > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > Hmm I am not sure I get it 100%.
> > > > > > Could you show an example with inner header hash in the port #,
> > > > > > hash is symmetric, and you still have trouble?
> > > > > > 
> > > > > > 
> > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > at this point.
> > > > > Sorry for the late reply. :)
> > > > > 
> > > > > For modern tunneling protocols, yes.
> > > > > 
> > > > > > You now want to drop everything from the header
> > > > > > except the UDP source port. Is that a fair summary?
> > > > > > 
> > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > packets in this flow have the same inner header and different outer
> > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > same rxq, then we can use the inner header as the hash input.
> > > > > 
> > > > > Thanks!
> > > > So, they will have the same source port yes?
> > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > original packet,
> > > and the outer ports are the same but the outer IPs are different after
> > > different directions of the same flow pass through different tunnels.
> > > > Any way to use that
> > > We use it in monitoring, firewall and other scenarios.
> > > 
> > > > so we don't depend on a specific protocol?
> > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > 
> > > Thanks.
> > > 
> > No, the question was - can we generalize this somehow then?
> > For example, a flag to ignore source IP when hashing?
> > Or maybe just for UDP packets?
> 
> 1. I think the common solution is based on the inner header, so that
> GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> 
> 2. The VXLAN spec does not show that the outer source port in both
> directions of the same flow must be the same [1]
> (although the outer source port is calculated based on the consistent hash
> in the kernel. The consistent hash will sort the five-tuple before
> calculating hashing),
> but it is best not to assume that consistent hashing is used in all VXLAN
> implementations.

I agree, best not to assume if it's not in the spec.
The requirement to hash two sides to same queue might
not be necessary for everyone though, right?

> The GENEVE spec uses "SHOUlD"[2].

What about other tunnels? Could you summarize please?
SHOULD means "if you ignore this
things will work but not well".
You mentioned concerns such as worse performance,
this is fine with SHOULD. Is inner hashing important for
correctness sometimes?

> 3. How should we generalize? The device uses a feature to advertise all the
> tunnel types it supports, and hashes these tunnel types using the outer
> source port,
> and then we still have to give the specific tunneling protocols supported by
> the device, just like we do now.

Is it problematic to do this for all UDP packets?

> [1] "Source Port: It is recommended that the UDP source port number be
> calculated using a hash of fields from the inner packet -- one example
> being a hash of the inner Ethernet frame's headers. This is to enable a
> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> the VXLAN overlay. When calculating the UDP source port number in this
> manner, it is RECOMMENDED that the value be in the dynamic/private
> port range 49152-65535 [RFC6335] "
> 
> [2] "Source Port: A source port selected by the originating tunnel endpoint.
> This source port SHOULD be the same for all packets belonging to a
> single encapsulated flow to prevent reordering due to the use of different
> paths. To encourage an even distribution of flows across multiple links,
> the source port SHOULD be calculated using a hash of the encapsulated packet
> headers using, for example, a traditional 5-tuple. Since the port
> represents a flow identifier rather than a true UDP connection, the entire
> 16-bit range MAY be used to maximize entropy. In addition to setting the
> source port, for IPv6, the flow label MAY also be used for providing
> entropy. For an example of using the IPv6 flow label for tunnel use cases,
> see [RFC6438]."
> 
> Thanks.
> 
> > 
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-11  6:22                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-11  6:22 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> 
> 
> 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > 
> > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > 
> > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > 
> > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > 
> > > > > > > > > Thanks.
> > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > 
> > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > 
> > > > > > > > That is best because
> > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > right now.:)
> > > > > > > 
> > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > with
> > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > 
> > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > Hmm I am not sure I get it 100%.
> > > > > > Could you show an example with inner header hash in the port #,
> > > > > > hash is symmetric, and you still have trouble?
> > > > > > 
> > > > > > 
> > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > at this point.
> > > > > Sorry for the late reply. :)
> > > > > 
> > > > > For modern tunneling protocols, yes.
> > > > > 
> > > > > > You now want to drop everything from the header
> > > > > > except the UDP source port. Is that a fair summary?
> > > > > > 
> > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > packets in this flow have the same inner header and different outer
> > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > same rxq, then we can use the inner header as the hash input.
> > > > > 
> > > > > Thanks!
> > > > So, they will have the same source port yes?
> > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > original packet,
> > > and the outer ports are the same but the outer IPs are different after
> > > different directions of the same flow pass through different tunnels.
> > > > Any way to use that
> > > We use it in monitoring, firewall and other scenarios.
> > > 
> > > > so we don't depend on a specific protocol?
> > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > 
> > > Thanks.
> > > 
> > No, the question was - can we generalize this somehow then?
> > For example, a flag to ignore source IP when hashing?
> > Or maybe just for UDP packets?
> 
> 1. I think the common solution is based on the inner header, so that
> GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> 
> 2. The VXLAN spec does not show that the outer source port in both
> directions of the same flow must be the same [1]
> (although the outer source port is calculated based on the consistent hash
> in the kernel. The consistent hash will sort the five-tuple before
> calculating hashing),
> but it is best not to assume that consistent hashing is used in all VXLAN
> implementations.

I agree, best not to assume if it's not in the spec.
The requirement to hash two sides to same queue might
not be necessary for everyone though, right?

> The GENEVE spec uses "SHOUlD"[2].

What about other tunnels? Could you summarize please?
SHOULD means "if you ignore this
things will work but not well".
You mentioned concerns such as worse performance,
this is fine with SHOULD. Is inner hashing important for
correctness sometimes?

> 3. How should we generalize? The device uses a feature to advertise all the
> tunnel types it supports, and hashes these tunnel types using the outer
> source port,
> and then we still have to give the specific tunneling protocols supported by
> the device, just like we do now.

Is it problematic to do this for all UDP packets?

> [1] "Source Port: It is recommended that the UDP source port number be
> calculated using a hash of fields from the inner packet -- one example
> being a hash of the inner Ethernet frame's headers. This is to enable a
> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> the VXLAN overlay. When calculating the UDP source port number in this
> manner, it is RECOMMENDED that the value be in the dynamic/private
> port range 49152-65535 [RFC6335] "
> 
> [2] "Source Port: A source port selected by the originating tunnel endpoint.
> This source port SHOULD be the same for all packets belonging to a
> single encapsulated flow to prevent reordering due to the use of different
> paths. To encourage an even distribution of flows across multiple links,
> the source port SHOULD be calculated using a hash of the encapsulated packet
> headers using, for example, a traditional 5-tuple. Since the port
> represents a flow identifier rather than a true UDP connection, the entire
> 16-bit range MAY be used to maximize entropy. In addition to setting the
> source port, for IPv6, the flow label MAY also be used for providing
> entropy. For an example of using the IPv6 flow label for tunnel use cases,
> see [RFC6438]."
> 
> Thanks.
> 
> > 
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-11  6:22                       ` Michael S. Tsirkin
@ 2023-05-12  6:00                         ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-12  6:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > 
> > 
> > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > 
> > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > 
> > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > 
> > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > 
> > > > > > > > > > Thanks.
> > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > 
> > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > 
> > > > > > > > > That is best because
> > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > right now.:)
> > > > > > > > 
> > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > with
> > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > 
> > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > 
> > > > > > > 
> > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > at this point.
> > > > > > Sorry for the late reply. :)
> > > > > > 
> > > > > > For modern tunneling protocols, yes.
> > > > > > 
> > > > > > > You now want to drop everything from the header
> > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > 
> > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > packets in this flow have the same inner header and different outer
> > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > 
> > > > > > Thanks!
> > > > > So, they will have the same source port yes?
> > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > original packet,
> > > > and the outer ports are the same but the outer IPs are different after
> > > > different directions of the same flow pass through different tunnels.
> > > > > Any way to use that
> > > > We use it in monitoring, firewall and other scenarios.
> > > > 
> > > > > so we don't depend on a specific protocol?
> > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > 
> > > > Thanks.
> > > > 
> > > No, the question was - can we generalize this somehow then?
> > > For example, a flag to ignore source IP when hashing?
> > > Or maybe just for UDP packets?
> > 
> > 1. I think the common solution is based on the inner header, so that
> > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > 
> > 2. The VXLAN spec does not show that the outer source port in both
> > directions of the same flow must be the same [1]
> > (although the outer source port is calculated based on the consistent hash
> > in the kernel. The consistent hash will sort the five-tuple before
> > calculating hashing),
> > but it is best not to assume that consistent hashing is used in all VXLAN
> > implementations.
> 
> I agree, best not to assume if it's not in the spec.
> The requirement to hash two sides to same queue might
> not be necessary for everyone though, right?

The outer source port is also not reliable when it needs to be hashed to
the same queue, but the inner header identifies a flow reliably and
universally.

> 
> > The GENEVE spec uses "SHOUlD"[2].
> 
> What about other tunnels? Could you summarize please?

Sure.

The VXLAN spec[1] does not show that the outer source port in both
directions of the same flow must be the same.

VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
recommend that the outer source port of the same flow be calculated
based on the inner header hash and set to the same.

But the udp source port of GRE-in-UDP may be used in a scenario similar
to NAPT [4.2], where the udp source port is no longer used for entropy,
but for identifying different internal hosts. So using udp source port
does not identify the same stream. This is why using the inner header is
more general, since information about the original stream can reliably
identify a flow.

[1] "Source Port: It is recommended that the UDP source port number be
calculated using a hash of fields from the inner packet -- one example
being a hash of the inner Ethernet frame's headers. This is to enable a
level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
across the VXLAN overlay. When calculating the UDP source port number in
this manner, it is RECOMMENDED that the value be in the dynamic/private
port range 49152-65535 [RFC6335]"

[2] "Source UDP Port: The source UDP port is used as entropy for devices
forwarding encapsulated packets across the underlay (ECMP for IP routers,
or load splitting for link aggregation by bridges). Tenant traffic flows
should all use the same source UDP port to lower the chances of packet
reordering by the underlay for a given flow. It is recommended for VTEPs
to generate this port number using a hash of the inner packet headers.
Implementations MAY use the entire 16 bit source UDP port for entropy."

[3] "Source Port: A source port selected by the originating tunnel
endpoint. This source port SHOULD be the same for all packets belonging
to a single encapsulated flow to prevent reordering due to the use of
different paths. To encourage an even distribution of flows across
multiple links, the source port SHOULD be calculated using a hash of the
encapsulated packet headers using, for example, a traditional 5-tuple.
Since the port represents a flow identifier rather than a true UDP
connection, the entire 16-bit range MAY be used to maximize entropy."

[4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
an entropy value. The UDP source port contains a 16-bit entropy value
that is generated by the encapsulator to identify a flow for the
encapsulated packet. The port value SHOULD be within the ephemeral port
range, i.e., 49152 to 65535, where the high-order two bits of the port
are set to one. This provides fourteen bits of entropy for the inner
flow identifier. In the case that an encapsulator is unable to derive
flow entropy from the payload header or the entropy usage has to be
disabled to meet operational requirements (see Section 7), to avoid
reordering with a packet flow, the encapsulator SHOULD use the same UDP
source port value for all packets assigned to a flow, e.g., the result
of an algorithm that performs a hash of the tunnel ingress and egress IP
address."

[4.2] "use of the UDP source port for entropy may impact middleboxes'
behavior. If a GRE-in-UDP tunnel is expected to be used on a path
with a middlebox, the tunnel can be configured either to disable use
of the UDP source port for entropy or to enable middleboxes to pass
packets with UDP source port entropy."

[5] "STT achieves the first goal by ensuring that the source and
destination ports and addresses in the outer header are all the same for
a single flow.  The second goal is achieved by generating the source
port using a random hash of fields in the headers of the inner packets,
e.g. the ports and addresses of the virtual flow's packets."

> SHOULD means "if you ignore this
> things will work but not well".
> You mentioned concerns such as worse performance,
> this is fine with SHOULD.

That's it.

> Is inner hashing important for
> correctness sometimes?

I'm sorry I didn't understand this, can you explain it in more detail?

> 
> > 3. How should we generalize? The device uses a feature to advertise all the
> > tunnel types it supports, and hashes these tunnel types using the outer
> > source port,
> > and then we still have to give the specific tunneling protocols supported by
> > the device, just like we do now.
> 
> Is it problematic to do this for all UDP packets?

I think there will be problems. While devices support configuring this,
drivers sometimes don't want devices to do special handling for certain
tunneling protocols.

Thanks.

> 
> > [1] "Source Port: It is recommended that the UDP source port number be
> > calculated using a hash of fields from the inner packet -- one example
> > being a hash of the inner Ethernet frame's headers. This is to enable a
> > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > the VXLAN overlay. When calculating the UDP source port number in this
> > manner, it is RECOMMENDED that the value be in the dynamic/private
> > port range 49152-65535 [RFC6335] "
> > 
> > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > This source port SHOULD be the same for all packets belonging to a
> > single encapsulated flow to prevent reordering due to the use of different
> > paths. To encourage an even distribution of flows across multiple links,
> > the source port SHOULD be calculated using a hash of the encapsulated packet
> > headers using, for example, a traditional 5-tuple. Since the port
> > represents a flow identifier rather than a true UDP connection, the entire
> > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > source port, for IPv6, the flow label MAY also be used for providing
> > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > see [RFC6438]."
> > 
> > Thanks.
> > 
> > > 
> > 
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
> > 

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-12  6:00                         ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-12  6:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > 
> > 
> > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > 
> > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > 
> > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > 
> > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > 
> > > > > > > > > > Thanks.
> > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > 
> > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > 
> > > > > > > > > That is best because
> > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > right now.:)
> > > > > > > > 
> > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > with
> > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > 
> > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > 
> > > > > > > 
> > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > at this point.
> > > > > > Sorry for the late reply. :)
> > > > > > 
> > > > > > For modern tunneling protocols, yes.
> > > > > > 
> > > > > > > You now want to drop everything from the header
> > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > 
> > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > packets in this flow have the same inner header and different outer
> > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > 
> > > > > > Thanks!
> > > > > So, they will have the same source port yes?
> > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > original packet,
> > > > and the outer ports are the same but the outer IPs are different after
> > > > different directions of the same flow pass through different tunnels.
> > > > > Any way to use that
> > > > We use it in monitoring, firewall and other scenarios.
> > > > 
> > > > > so we don't depend on a specific protocol?
> > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > 
> > > > Thanks.
> > > > 
> > > No, the question was - can we generalize this somehow then?
> > > For example, a flag to ignore source IP when hashing?
> > > Or maybe just for UDP packets?
> > 
> > 1. I think the common solution is based on the inner header, so that
> > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > 
> > 2. The VXLAN spec does not show that the outer source port in both
> > directions of the same flow must be the same [1]
> > (although the outer source port is calculated based on the consistent hash
> > in the kernel. The consistent hash will sort the five-tuple before
> > calculating hashing),
> > but it is best not to assume that consistent hashing is used in all VXLAN
> > implementations.
> 
> I agree, best not to assume if it's not in the spec.
> The requirement to hash two sides to same queue might
> not be necessary for everyone though, right?

The outer source port is also not reliable when it needs to be hashed to
the same queue, but the inner header identifies a flow reliably and
universally.

> 
> > The GENEVE spec uses "SHOUlD"[2].
> 
> What about other tunnels? Could you summarize please?

Sure.

The VXLAN spec[1] does not show that the outer source port in both
directions of the same flow must be the same.

VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
recommend that the outer source port of the same flow be calculated
based on the inner header hash and set to the same.

But the udp source port of GRE-in-UDP may be used in a scenario similar
to NAPT [4.2], where the udp source port is no longer used for entropy,
but for identifying different internal hosts. So using udp source port
does not identify the same stream. This is why using the inner header is
more general, since information about the original stream can reliably
identify a flow.

[1] "Source Port: It is recommended that the UDP source port number be
calculated using a hash of fields from the inner packet -- one example
being a hash of the inner Ethernet frame's headers. This is to enable a
level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
across the VXLAN overlay. When calculating the UDP source port number in
this manner, it is RECOMMENDED that the value be in the dynamic/private
port range 49152-65535 [RFC6335]"

[2] "Source UDP Port: The source UDP port is used as entropy for devices
forwarding encapsulated packets across the underlay (ECMP for IP routers,
or load splitting for link aggregation by bridges). Tenant traffic flows
should all use the same source UDP port to lower the chances of packet
reordering by the underlay for a given flow. It is recommended for VTEPs
to generate this port number using a hash of the inner packet headers.
Implementations MAY use the entire 16 bit source UDP port for entropy."

[3] "Source Port: A source port selected by the originating tunnel
endpoint. This source port SHOULD be the same for all packets belonging
to a single encapsulated flow to prevent reordering due to the use of
different paths. To encourage an even distribution of flows across
multiple links, the source port SHOULD be calculated using a hash of the
encapsulated packet headers using, for example, a traditional 5-tuple.
Since the port represents a flow identifier rather than a true UDP
connection, the entire 16-bit range MAY be used to maximize entropy."

[4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
an entropy value. The UDP source port contains a 16-bit entropy value
that is generated by the encapsulator to identify a flow for the
encapsulated packet. The port value SHOULD be within the ephemeral port
range, i.e., 49152 to 65535, where the high-order two bits of the port
are set to one. This provides fourteen bits of entropy for the inner
flow identifier. In the case that an encapsulator is unable to derive
flow entropy from the payload header or the entropy usage has to be
disabled to meet operational requirements (see Section 7), to avoid
reordering with a packet flow, the encapsulator SHOULD use the same UDP
source port value for all packets assigned to a flow, e.g., the result
of an algorithm that performs a hash of the tunnel ingress and egress IP
address."

[4.2] "use of the UDP source port for entropy may impact middleboxes'
behavior. If a GRE-in-UDP tunnel is expected to be used on a path
with a middlebox, the tunnel can be configured either to disable use
of the UDP source port for entropy or to enable middleboxes to pass
packets with UDP source port entropy."

[5] "STT achieves the first goal by ensuring that the source and
destination ports and addresses in the outer header are all the same for
a single flow.  The second goal is achieved by generating the source
port using a random hash of fields in the headers of the inner packets,
e.g. the ports and addresses of the virtual flow's packets."

> SHOULD means "if you ignore this
> things will work but not well".
> You mentioned concerns such as worse performance,
> this is fine with SHOULD.

That's it.

> Is inner hashing important for
> correctness sometimes?

I'm sorry I didn't understand this, can you explain it in more detail?

> 
> > 3. How should we generalize? The device uses a feature to advertise all the
> > tunnel types it supports, and hashes these tunnel types using the outer
> > source port,
> > and then we still have to give the specific tunneling protocols supported by
> > the device, just like we do now.
> 
> Is it problematic to do this for all UDP packets?

I think there will be problems. While devices support configuring this,
drivers sometimes don't want devices to do special handling for certain
tunneling protocols.

Thanks.

> 
> > [1] "Source Port: It is recommended that the UDP source port number be
> > calculated using a hash of fields from the inner packet -- one example
> > being a hash of the inner Ethernet frame's headers. This is to enable a
> > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > the VXLAN overlay. When calculating the UDP source port number in this
> > manner, it is RECOMMENDED that the value be in the dynamic/private
> > port range 49152-65535 [RFC6335] "
> > 
> > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > This source port SHOULD be the same for all packets belonging to a
> > single encapsulated flow to prevent reordering due to the use of different
> > paths. To encourage an even distribution of flows across multiple links,
> > the source port SHOULD be calculated using a hash of the encapsulated packet
> > headers using, for example, a traditional 5-tuple. Since the port
> > represents a flow identifier rather than a true UDP connection, the entire
> > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > source port, for IPv6, the flow label MAY also be used for providing
> > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > see [RFC6438]."
> > 
> > Thanks.
> > 
> > > 
> > 
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
> > 

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-12  6:00                         ` Heng Qi
@ 2023-05-12  6:54                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-12  6:54 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > 
> > > 
> > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > 
> > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > 
> > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > 
> > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > 
> > > > > > > > > > > Thanks.
> > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > 
> > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > 
> > > > > > > > > > That is best because
> > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > right now.:)
> > > > > > > > > 
> > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > with
> > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > 
> > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > at this point.
> > > > > > > Sorry for the late reply. :)
> > > > > > > 
> > > > > > > For modern tunneling protocols, yes.
> > > > > > > 
> > > > > > > > You now want to drop everything from the header
> > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > 
> > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > 
> > > > > > > Thanks!
> > > > > > So, they will have the same source port yes?
> > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > original packet,
> > > > > and the outer ports are the same but the outer IPs are different after
> > > > > different directions of the same flow pass through different tunnels.
> > > > > > Any way to use that
> > > > > We use it in monitoring, firewall and other scenarios.
> > > > > 
> > > > > > so we don't depend on a specific protocol?
> > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > No, the question was - can we generalize this somehow then?
> > > > For example, a flag to ignore source IP when hashing?
> > > > Or maybe just for UDP packets?
> > > 
> > > 1. I think the common solution is based on the inner header, so that
> > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > 
> > > 2. The VXLAN spec does not show that the outer source port in both
> > > directions of the same flow must be the same [1]
> > > (although the outer source port is calculated based on the consistent hash
> > > in the kernel. The consistent hash will sort the five-tuple before
> > > calculating hashing),
> > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > implementations.
> > 
> > I agree, best not to assume if it's not in the spec.
> > The requirement to hash two sides to same queue might
> > not be necessary for everyone though, right?
> 
> The outer source port is also not reliable when it needs to be hashed to
> the same queue, but the inner header identifies a flow reliably and
> universally.
> 
> > 
> > > The GENEVE spec uses "SHOUlD"[2].
> > 
> > What about other tunnels? Could you summarize please?
> 
> Sure.
> 
> The VXLAN spec[1] does not show that the outer source port in both
> directions of the same flow must be the same.
> 
> VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> recommend that the outer source port of the same flow be calculated
> based on the inner header hash and set to the same.
> 
> But the udp source port of GRE-in-UDP may be used in a scenario similar
> to NAPT [4.2], where the udp source port is no longer used for entropy,
> but for identifying different internal hosts. So using udp source port
> does not identify the same stream. This is why using the inner header is
> more general, since information about the original stream can reliably
> identify a flow.
> 
> [1] "Source Port: It is recommended that the UDP source port number be
> calculated using a hash of fields from the inner packet -- one example
> being a hash of the inner Ethernet frame's headers. This is to enable a
> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> across the VXLAN overlay. When calculating the UDP source port number in
> this manner, it is RECOMMENDED that the value be in the dynamic/private
> port range 49152-65535 [RFC6335]"
> 
> [2] "Source UDP Port: The source UDP port is used as entropy for devices
> forwarding encapsulated packets across the underlay (ECMP for IP routers,
> or load splitting for link aggregation by bridges). Tenant traffic flows
> should all use the same source UDP port to lower the chances of packet
> reordering by the underlay for a given flow. It is recommended for VTEPs
> to generate this port number using a hash of the inner packet headers.
> Implementations MAY use the entire 16 bit source UDP port for entropy."
> 
> [3] "Source Port: A source port selected by the originating tunnel
> endpoint. This source port SHOULD be the same for all packets belonging
> to a single encapsulated flow to prevent reordering due to the use of
> different paths. To encourage an even distribution of flows across
> multiple links, the source port SHOULD be calculated using a hash of the
> encapsulated packet headers using, for example, a traditional 5-tuple.
> Since the port represents a flow identifier rather than a true UDP
> connection, the entire 16-bit range MAY be used to maximize entropy."
> 
> [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> an entropy value. The UDP source port contains a 16-bit entropy value
> that is generated by the encapsulator to identify a flow for the
> encapsulated packet. The port value SHOULD be within the ephemeral port
> range, i.e., 49152 to 65535, where the high-order two bits of the port
> are set to one. This provides fourteen bits of entropy for the inner
> flow identifier. In the case that an encapsulator is unable to derive
> flow entropy from the payload header or the entropy usage has to be
> disabled to meet operational requirements (see Section 7), to avoid
> reordering with a packet flow, the encapsulator SHOULD use the same UDP
> source port value for all packets assigned to a flow, e.g., the result
> of an algorithm that performs a hash of the tunnel ingress and egress IP
> address."
> 
> [4.2] "use of the UDP source port for entropy may impact middleboxes'
> behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> with a middlebox, the tunnel can be configured either to disable use
> of the UDP source port for entropy or to enable middleboxes to pass
> packets with UDP source port entropy."
> 
> [5] "STT achieves the first goal by ensuring that the source and
> destination ports and addresses in the outer header are all the same for
> a single flow.  The second goal is achieved by generating the source
> port using a random hash of fields in the headers of the inner packets,
> e.g. the ports and addresses of the virtual flow's packets."



> > SHOULD means "if you ignore this
> > things will work but not well".
> > You mentioned concerns such as worse performance,
> > this is fine with SHOULD.
> 
> That's it.
> 
> > Is inner hashing important for
> > correctness sometimes?
> 
> I'm sorry I didn't understand this, can you explain it in more detail?

Do things actually break if inner hash is not enabled or is this
a performance optimization?

> > 
> > > 3. How should we generalize? The device uses a feature to advertise all the
> > > tunnel types it supports, and hashes these tunnel types using the outer
> > > source port,
> > > and then we still have to give the specific tunneling protocols supported by
> > > the device, just like we do now.
> > 
> > Is it problematic to do this for all UDP packets?
> 
> I think there will be problems. While devices support configuring this,
> drivers sometimes don't want devices to do special handling for certain
> tunneling protocols.
> 
> Thanks.

I guess we can at least add a flag to do this (ignore IP addresses,
just hash the port numbers) for all UDP packets?
Or maybe UDP4/UDP6 separately.
Hopefully this will be enough to prevent getting requests
to add more offloads in the future.


> > 
> > > [1] "Source Port: It is recommended that the UDP source port number be
> > > calculated using a hash of fields from the inner packet -- one example
> > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > the VXLAN overlay. When calculating the UDP source port number in this
> > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > port range 49152-65535 [RFC6335] "
> > > 
> > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > This source port SHOULD be the same for all packets belonging to a
> > > single encapsulated flow to prevent reordering due to the use of different
> > > paths. To encourage an even distribution of flows across multiple links,
> > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > headers using, for example, a traditional 5-tuple. Since the port
> > > represents a flow identifier rather than a true UDP connection, the entire
> > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > source port, for IPv6, the flow label MAY also be used for providing
> > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > see [RFC6438]."
> > > 
> > > Thanks.
> > > 
> > > > 
> > > 
> > > 
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > 
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > > 
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> > > 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-12  6:54                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-12  6:54 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > 
> > > 
> > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > 
> > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > 
> > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > 
> > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > 
> > > > > > > > > > > Thanks.
> > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > 
> > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > 
> > > > > > > > > > That is best because
> > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > right now.:)
> > > > > > > > > 
> > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > with
> > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > 
> > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > at this point.
> > > > > > > Sorry for the late reply. :)
> > > > > > > 
> > > > > > > For modern tunneling protocols, yes.
> > > > > > > 
> > > > > > > > You now want to drop everything from the header
> > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > 
> > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > 
> > > > > > > Thanks!
> > > > > > So, they will have the same source port yes?
> > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > original packet,
> > > > > and the outer ports are the same but the outer IPs are different after
> > > > > different directions of the same flow pass through different tunnels.
> > > > > > Any way to use that
> > > > > We use it in monitoring, firewall and other scenarios.
> > > > > 
> > > > > > so we don't depend on a specific protocol?
> > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > No, the question was - can we generalize this somehow then?
> > > > For example, a flag to ignore source IP when hashing?
> > > > Or maybe just for UDP packets?
> > > 
> > > 1. I think the common solution is based on the inner header, so that
> > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > 
> > > 2. The VXLAN spec does not show that the outer source port in both
> > > directions of the same flow must be the same [1]
> > > (although the outer source port is calculated based on the consistent hash
> > > in the kernel. The consistent hash will sort the five-tuple before
> > > calculating hashing),
> > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > implementations.
> > 
> > I agree, best not to assume if it's not in the spec.
> > The requirement to hash two sides to same queue might
> > not be necessary for everyone though, right?
> 
> The outer source port is also not reliable when it needs to be hashed to
> the same queue, but the inner header identifies a flow reliably and
> universally.
> 
> > 
> > > The GENEVE spec uses "SHOUlD"[2].
> > 
> > What about other tunnels? Could you summarize please?
> 
> Sure.
> 
> The VXLAN spec[1] does not show that the outer source port in both
> directions of the same flow must be the same.
> 
> VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> recommend that the outer source port of the same flow be calculated
> based on the inner header hash and set to the same.
> 
> But the udp source port of GRE-in-UDP may be used in a scenario similar
> to NAPT [4.2], where the udp source port is no longer used for entropy,
> but for identifying different internal hosts. So using udp source port
> does not identify the same stream. This is why using the inner header is
> more general, since information about the original stream can reliably
> identify a flow.
> 
> [1] "Source Port: It is recommended that the UDP source port number be
> calculated using a hash of fields from the inner packet -- one example
> being a hash of the inner Ethernet frame's headers. This is to enable a
> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> across the VXLAN overlay. When calculating the UDP source port number in
> this manner, it is RECOMMENDED that the value be in the dynamic/private
> port range 49152-65535 [RFC6335]"
> 
> [2] "Source UDP Port: The source UDP port is used as entropy for devices
> forwarding encapsulated packets across the underlay (ECMP for IP routers,
> or load splitting for link aggregation by bridges). Tenant traffic flows
> should all use the same source UDP port to lower the chances of packet
> reordering by the underlay for a given flow. It is recommended for VTEPs
> to generate this port number using a hash of the inner packet headers.
> Implementations MAY use the entire 16 bit source UDP port for entropy."
> 
> [3] "Source Port: A source port selected by the originating tunnel
> endpoint. This source port SHOULD be the same for all packets belonging
> to a single encapsulated flow to prevent reordering due to the use of
> different paths. To encourage an even distribution of flows across
> multiple links, the source port SHOULD be calculated using a hash of the
> encapsulated packet headers using, for example, a traditional 5-tuple.
> Since the port represents a flow identifier rather than a true UDP
> connection, the entire 16-bit range MAY be used to maximize entropy."
> 
> [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> an entropy value. The UDP source port contains a 16-bit entropy value
> that is generated by the encapsulator to identify a flow for the
> encapsulated packet. The port value SHOULD be within the ephemeral port
> range, i.e., 49152 to 65535, where the high-order two bits of the port
> are set to one. This provides fourteen bits of entropy for the inner
> flow identifier. In the case that an encapsulator is unable to derive
> flow entropy from the payload header or the entropy usage has to be
> disabled to meet operational requirements (see Section 7), to avoid
> reordering with a packet flow, the encapsulator SHOULD use the same UDP
> source port value for all packets assigned to a flow, e.g., the result
> of an algorithm that performs a hash of the tunnel ingress and egress IP
> address."
> 
> [4.2] "use of the UDP source port for entropy may impact middleboxes'
> behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> with a middlebox, the tunnel can be configured either to disable use
> of the UDP source port for entropy or to enable middleboxes to pass
> packets with UDP source port entropy."
> 
> [5] "STT achieves the first goal by ensuring that the source and
> destination ports and addresses in the outer header are all the same for
> a single flow.  The second goal is achieved by generating the source
> port using a random hash of fields in the headers of the inner packets,
> e.g. the ports and addresses of the virtual flow's packets."



> > SHOULD means "if you ignore this
> > things will work but not well".
> > You mentioned concerns such as worse performance,
> > this is fine with SHOULD.
> 
> That's it.
> 
> > Is inner hashing important for
> > correctness sometimes?
> 
> I'm sorry I didn't understand this, can you explain it in more detail?

Do things actually break if inner hash is not enabled or is this
a performance optimization?

> > 
> > > 3. How should we generalize? The device uses a feature to advertise all the
> > > tunnel types it supports, and hashes these tunnel types using the outer
> > > source port,
> > > and then we still have to give the specific tunneling protocols supported by
> > > the device, just like we do now.
> > 
> > Is it problematic to do this for all UDP packets?
> 
> I think there will be problems. While devices support configuring this,
> drivers sometimes don't want devices to do special handling for certain
> tunneling protocols.
> 
> Thanks.

I guess we can at least add a flag to do this (ignore IP addresses,
just hash the port numbers) for all UDP packets?
Or maybe UDP4/UDP6 separately.
Hopefully this will be enough to prevent getting requests
to add more offloads in the future.


> > 
> > > [1] "Source Port: It is recommended that the UDP source port number be
> > > calculated using a hash of fields from the inner packet -- one example
> > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > the VXLAN overlay. When calculating the UDP source port number in this
> > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > port range 49152-65535 [RFC6335] "
> > > 
> > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > This source port SHOULD be the same for all packets belonging to a
> > > single encapsulated flow to prevent reordering due to the use of different
> > > paths. To encourage an even distribution of flows across multiple links,
> > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > headers using, for example, a traditional 5-tuple. Since the port
> > > represents a flow identifier rather than a true UDP connection, the entire
> > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > source port, for IPv6, the flow label MAY also be used for providing
> > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > see [RFC6438]."
> > > 
> > > Thanks.
> > > 
> > > > 
> > > 
> > > 
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > 
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > > 
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> > > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-12  6:54                           ` Michael S. Tsirkin
@ 2023-05-12  7:23                             ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-12  7:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
> On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> > On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > > 
> > > > 
> > > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > > 
> > > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > > 
> > > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > > 
> > > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks.
> > > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > > 
> > > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > > 
> > > > > > > > > > > That is best because
> > > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > > right now.:)
> > > > > > > > > > 
> > > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > > with
> > > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > > 
> > > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > > at this point.
> > > > > > > > Sorry for the late reply. :)
> > > > > > > > 
> > > > > > > > For modern tunneling protocols, yes.
> > > > > > > > 
> > > > > > > > > You now want to drop everything from the header
> > > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > > 
> > > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > > 
> > > > > > > > Thanks!
> > > > > > > So, they will have the same source port yes?
> > > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > > original packet,
> > > > > > and the outer ports are the same but the outer IPs are different after
> > > > > > different directions of the same flow pass through different tunnels.
> > > > > > > Any way to use that
> > > > > > We use it in monitoring, firewall and other scenarios.
> > > > > > 
> > > > > > > so we don't depend on a specific protocol?
> > > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > > 
> > > > > > Thanks.
> > > > > > 
> > > > > No, the question was - can we generalize this somehow then?
> > > > > For example, a flag to ignore source IP when hashing?
> > > > > Or maybe just for UDP packets?
> > > > 
> > > > 1. I think the common solution is based on the inner header, so that
> > > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > > 
> > > > 2. The VXLAN spec does not show that the outer source port in both
> > > > directions of the same flow must be the same [1]
> > > > (although the outer source port is calculated based on the consistent hash
> > > > in the kernel. The consistent hash will sort the five-tuple before
> > > > calculating hashing),
> > > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > > implementations.
> > > 
> > > I agree, best not to assume if it's not in the spec.
> > > The requirement to hash two sides to same queue might
> > > not be necessary for everyone though, right?
> > 
> > The outer source port is also not reliable when it needs to be hashed to
> > the same queue, but the inner header identifies a flow reliably and
> > universally.
> > 
> > > 
> > > > The GENEVE spec uses "SHOUlD"[2].
> > > 
> > > What about other tunnels? Could you summarize please?
> > 
> > Sure.
> > 
> > The VXLAN spec[1] does not show that the outer source port in both
> > directions of the same flow must be the same.
> > 
> > VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> > recommend that the outer source port of the same flow be calculated
> > based on the inner header hash and set to the same.
> > 
> > But the udp source port of GRE-in-UDP may be used in a scenario similar
> > to NAPT [4.2], where the udp source port is no longer used for entropy,
> > but for identifying different internal hosts. So using udp source port
> > does not identify the same stream. This is why using the inner header is
> > more general, since information about the original stream can reliably
> > identify a flow.
> > 
> > [1] "Source Port: It is recommended that the UDP source port number be
> > calculated using a hash of fields from the inner packet -- one example
> > being a hash of the inner Ethernet frame's headers. This is to enable a
> > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> > across the VXLAN overlay. When calculating the UDP source port number in
> > this manner, it is RECOMMENDED that the value be in the dynamic/private
> > port range 49152-65535 [RFC6335]"
> > 
> > [2] "Source UDP Port: The source UDP port is used as entropy for devices
> > forwarding encapsulated packets across the underlay (ECMP for IP routers,
> > or load splitting for link aggregation by bridges). Tenant traffic flows
> > should all use the same source UDP port to lower the chances of packet
> > reordering by the underlay for a given flow. It is recommended for VTEPs
> > to generate this port number using a hash of the inner packet headers.
> > Implementations MAY use the entire 16 bit source UDP port for entropy."
> > 
> > [3] "Source Port: A source port selected by the originating tunnel
> > endpoint. This source port SHOULD be the same for all packets belonging
> > to a single encapsulated flow to prevent reordering due to the use of
> > different paths. To encourage an even distribution of flows across
> > multiple links, the source port SHOULD be calculated using a hash of the
> > encapsulated packet headers using, for example, a traditional 5-tuple.
> > Since the port represents a flow identifier rather than a true UDP
> > connection, the entire 16-bit range MAY be used to maximize entropy."
> > 
> > [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> > an entropy value. The UDP source port contains a 16-bit entropy value
> > that is generated by the encapsulator to identify a flow for the
> > encapsulated packet. The port value SHOULD be within the ephemeral port
> > range, i.e., 49152 to 65535, where the high-order two bits of the port
> > are set to one. This provides fourteen bits of entropy for the inner
> > flow identifier. In the case that an encapsulator is unable to derive
> > flow entropy from the payload header or the entropy usage has to be
> > disabled to meet operational requirements (see Section 7), to avoid
> > reordering with a packet flow, the encapsulator SHOULD use the same UDP
> > source port value for all packets assigned to a flow, e.g., the result
> > of an algorithm that performs a hash of the tunnel ingress and egress IP
> > address."
> > 
> > [4.2] "use of the UDP source port for entropy may impact middleboxes'
> > behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> > with a middlebox, the tunnel can be configured either to disable use
> > of the UDP source port for entropy or to enable middleboxes to pass
> > packets with UDP source port entropy."
> > 
> > [5] "STT achieves the first goal by ensuring that the source and
> > destination ports and addresses in the outer header are all the same for
> > a single flow.  The second goal is achieved by generating the source
> > port using a random hash of fields in the headers of the inner packets,
> > e.g. the ports and addresses of the virtual flow's packets."
> 
> 
> 
> > > SHOULD means "if you ignore this
> > > things will work but not well".
> > > You mentioned concerns such as worse performance,
> > > this is fine with SHOULD.
> > 
> > That's it.
> > 
> > > Is inner hashing important for
> > > correctness sometimes?
> > 
> > I'm sorry I didn't understand this, can you explain it in more detail?
> 
> Do things actually break if inner hash is not enabled or is this
> a performance optimization?

Yes, the internal hash comes from our real internal needs, and the
application scenarios have a large scale. When the data traffic and
scale increase, this is very beneficial to our production efficiency and
cost. Performance optimization is not only an important direction of the
network, but also a manifestation of complete functionality. Based on
this, we have reason to believe that internal hashing will play a role
in future developments.

> 
> > > 
> > > > 3. How should we generalize? The device uses a feature to advertise all the
> > > > tunnel types it supports, and hashes these tunnel types using the outer
> > > > source port,
> > > > and then we still have to give the specific tunneling protocols supported by
> > > > the device, just like we do now.
> > > 
> > > Is it problematic to do this for all UDP packets?
> > 
> > I think there will be problems. While devices support configuring this,
> > drivers sometimes don't want devices to do special handling for certain
> > tunneling protocols.
> > 
> > Thanks.
> 
> I guess we can at least add a flag to do this (ignore IP addresses,
> just hash the port numbers) for all UDP packets?

Yes, I think this can also be used as a worker thread.

> Or maybe UDP4/UDP6 separately.
> Hopefully this will be enough to prevent getting requests
> to add more offloads in the future.

Agreed, and understand your concerns about this.

Thanks.
> 
> 
> > > 
> > > > [1] "Source Port: It is recommended that the UDP source port number be
> > > > calculated using a hash of fields from the inner packet -- one example
> > > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > > the VXLAN overlay. When calculating the UDP source port number in this
> > > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > > port range 49152-65535 [RFC6335] "
> > > > 
> > > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > > This source port SHOULD be the same for all packets belonging to a
> > > > single encapsulated flow to prevent reordering due to the use of different
> > > > paths. To encourage an even distribution of flows across multiple links,
> > > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > > headers using, for example, a traditional 5-tuple. Since the port
> > > > represents a flow identifier rather than a true UDP connection, the entire
> > > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > > source port, for IPv6, the flow label MAY also be used for providing
> > > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > > see [RFC6438]."
> > > > 
> > > > Thanks.
> > > > 
> > > > > 
> > > > 
> > > > 
> > > > This publicly archived list offers a means to provide input to the
> > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > 
> > > > In order to verify user consent to the Feedback License terms and
> > > > to minimize spam in the list archive, subscription is required
> > > > before posting.
> > > > 
> > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > Join OASIS: https://www.oasis-open.org/join/
> > > > 

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-12  7:23                             ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-12  7:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
> On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> > On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > > 
> > > > 
> > > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > > 
> > > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > > 
> > > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > > 
> > > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks.
> > > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > > 
> > > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > > 
> > > > > > > > > > > That is best because
> > > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > > right now.:)
> > > > > > > > > > 
> > > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > > with
> > > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > > 
> > > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > > at this point.
> > > > > > > > Sorry for the late reply. :)
> > > > > > > > 
> > > > > > > > For modern tunneling protocols, yes.
> > > > > > > > 
> > > > > > > > > You now want to drop everything from the header
> > > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > > 
> > > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > > 
> > > > > > > > Thanks!
> > > > > > > So, they will have the same source port yes?
> > > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > > original packet,
> > > > > > and the outer ports are the same but the outer IPs are different after
> > > > > > different directions of the same flow pass through different tunnels.
> > > > > > > Any way to use that
> > > > > > We use it in monitoring, firewall and other scenarios.
> > > > > > 
> > > > > > > so we don't depend on a specific protocol?
> > > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > > 
> > > > > > Thanks.
> > > > > > 
> > > > > No, the question was - can we generalize this somehow then?
> > > > > For example, a flag to ignore source IP when hashing?
> > > > > Or maybe just for UDP packets?
> > > > 
> > > > 1. I think the common solution is based on the inner header, so that
> > > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > > 
> > > > 2. The VXLAN spec does not show that the outer source port in both
> > > > directions of the same flow must be the same [1]
> > > > (although the outer source port is calculated based on the consistent hash
> > > > in the kernel. The consistent hash will sort the five-tuple before
> > > > calculating hashing),
> > > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > > implementations.
> > > 
> > > I agree, best not to assume if it's not in the spec.
> > > The requirement to hash two sides to same queue might
> > > not be necessary for everyone though, right?
> > 
> > The outer source port is also not reliable when it needs to be hashed to
> > the same queue, but the inner header identifies a flow reliably and
> > universally.
> > 
> > > 
> > > > The GENEVE spec uses "SHOUlD"[2].
> > > 
> > > What about other tunnels? Could you summarize please?
> > 
> > Sure.
> > 
> > The VXLAN spec[1] does not show that the outer source port in both
> > directions of the same flow must be the same.
> > 
> > VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> > recommend that the outer source port of the same flow be calculated
> > based on the inner header hash and set to the same.
> > 
> > But the udp source port of GRE-in-UDP may be used in a scenario similar
> > to NAPT [4.2], where the udp source port is no longer used for entropy,
> > but for identifying different internal hosts. So using udp source port
> > does not identify the same stream. This is why using the inner header is
> > more general, since information about the original stream can reliably
> > identify a flow.
> > 
> > [1] "Source Port: It is recommended that the UDP source port number be
> > calculated using a hash of fields from the inner packet -- one example
> > being a hash of the inner Ethernet frame's headers. This is to enable a
> > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> > across the VXLAN overlay. When calculating the UDP source port number in
> > this manner, it is RECOMMENDED that the value be in the dynamic/private
> > port range 49152-65535 [RFC6335]"
> > 
> > [2] "Source UDP Port: The source UDP port is used as entropy for devices
> > forwarding encapsulated packets across the underlay (ECMP for IP routers,
> > or load splitting for link aggregation by bridges). Tenant traffic flows
> > should all use the same source UDP port to lower the chances of packet
> > reordering by the underlay for a given flow. It is recommended for VTEPs
> > to generate this port number using a hash of the inner packet headers.
> > Implementations MAY use the entire 16 bit source UDP port for entropy."
> > 
> > [3] "Source Port: A source port selected by the originating tunnel
> > endpoint. This source port SHOULD be the same for all packets belonging
> > to a single encapsulated flow to prevent reordering due to the use of
> > different paths. To encourage an even distribution of flows across
> > multiple links, the source port SHOULD be calculated using a hash of the
> > encapsulated packet headers using, for example, a traditional 5-tuple.
> > Since the port represents a flow identifier rather than a true UDP
> > connection, the entire 16-bit range MAY be used to maximize entropy."
> > 
> > [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> > an entropy value. The UDP source port contains a 16-bit entropy value
> > that is generated by the encapsulator to identify a flow for the
> > encapsulated packet. The port value SHOULD be within the ephemeral port
> > range, i.e., 49152 to 65535, where the high-order two bits of the port
> > are set to one. This provides fourteen bits of entropy for the inner
> > flow identifier. In the case that an encapsulator is unable to derive
> > flow entropy from the payload header or the entropy usage has to be
> > disabled to meet operational requirements (see Section 7), to avoid
> > reordering with a packet flow, the encapsulator SHOULD use the same UDP
> > source port value for all packets assigned to a flow, e.g., the result
> > of an algorithm that performs a hash of the tunnel ingress and egress IP
> > address."
> > 
> > [4.2] "use of the UDP source port for entropy may impact middleboxes'
> > behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> > with a middlebox, the tunnel can be configured either to disable use
> > of the UDP source port for entropy or to enable middleboxes to pass
> > packets with UDP source port entropy."
> > 
> > [5] "STT achieves the first goal by ensuring that the source and
> > destination ports and addresses in the outer header are all the same for
> > a single flow.  The second goal is achieved by generating the source
> > port using a random hash of fields in the headers of the inner packets,
> > e.g. the ports and addresses of the virtual flow's packets."
> 
> 
> 
> > > SHOULD means "if you ignore this
> > > things will work but not well".
> > > You mentioned concerns such as worse performance,
> > > this is fine with SHOULD.
> > 
> > That's it.
> > 
> > > Is inner hashing important for
> > > correctness sometimes?
> > 
> > I'm sorry I didn't understand this, can you explain it in more detail?
> 
> Do things actually break if inner hash is not enabled or is this
> a performance optimization?

Yes, the internal hash comes from our real internal needs, and the
application scenarios have a large scale. When the data traffic and
scale increase, this is very beneficial to our production efficiency and
cost. Performance optimization is not only an important direction of the
network, but also a manifestation of complete functionality. Based on
this, we have reason to believe that internal hashing will play a role
in future developments.

> 
> > > 
> > > > 3. How should we generalize? The device uses a feature to advertise all the
> > > > tunnel types it supports, and hashes these tunnel types using the outer
> > > > source port,
> > > > and then we still have to give the specific tunneling protocols supported by
> > > > the device, just like we do now.
> > > 
> > > Is it problematic to do this for all UDP packets?
> > 
> > I think there will be problems. While devices support configuring this,
> > drivers sometimes don't want devices to do special handling for certain
> > tunneling protocols.
> > 
> > Thanks.
> 
> I guess we can at least add a flag to do this (ignore IP addresses,
> just hash the port numbers) for all UDP packets?

Yes, I think this can also be used as a worker thread.

> Or maybe UDP4/UDP6 separately.
> Hopefully this will be enough to prevent getting requests
> to add more offloads in the future.

Agreed, and understand your concerns about this.

Thanks.
> 
> 
> > > 
> > > > [1] "Source Port: It is recommended that the UDP source port number be
> > > > calculated using a hash of fields from the inner packet -- one example
> > > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > > the VXLAN overlay. When calculating the UDP source port number in this
> > > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > > port range 49152-65535 [RFC6335] "
> > > > 
> > > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > > This source port SHOULD be the same for all packets belonging to a
> > > > single encapsulated flow to prevent reordering due to the use of different
> > > > paths. To encourage an even distribution of flows across multiple links,
> > > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > > headers using, for example, a traditional 5-tuple. Since the port
> > > > represents a flow identifier rather than a true UDP connection, the entire
> > > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > > source port, for IPv6, the flow label MAY also be used for providing
> > > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > > see [RFC6438]."
> > > > 
> > > > Thanks.
> > > > 
> > > > > 
> > > > 
> > > > 
> > > > This publicly archived list offers a means to provide input to the
> > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > 
> > > > In order to verify user consent to the Feedback License terms and
> > > > to minimize spam in the list archive, subscription is required
> > > > before posting.
> > > > 
> > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > Join OASIS: https://www.oasis-open.org/join/
> > > > 

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-12  7:23                             ` Heng Qi
@ 2023-05-12 11:27                               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-12 11:27 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 03:23:46PM +0800, Heng Qi wrote:
> On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
> > On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> > > On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > > > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > > > 
> > > > > 
> > > > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > > > 
> > > > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Thanks.
> > > > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > > > 
> > > > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > > > 
> > > > > > > > > > > > That is best because
> > > > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > > > right now.:)
> > > > > > > > > > > 
> > > > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > > > with
> > > > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > > > 
> > > > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > > > at this point.
> > > > > > > > > Sorry for the late reply. :)
> > > > > > > > > 
> > > > > > > > > For modern tunneling protocols, yes.
> > > > > > > > > 
> > > > > > > > > > You now want to drop everything from the header
> > > > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > > > 
> > > > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > > > 
> > > > > > > > > Thanks!
> > > > > > > > So, they will have the same source port yes?
> > > > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > > > original packet,
> > > > > > > and the outer ports are the same but the outer IPs are different after
> > > > > > > different directions of the same flow pass through different tunnels.
> > > > > > > > Any way to use that
> > > > > > > We use it in monitoring, firewall and other scenarios.
> > > > > > > 
> > > > > > > > so we don't depend on a specific protocol?
> > > > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > > > 
> > > > > > > Thanks.
> > > > > > > 
> > > > > > No, the question was - can we generalize this somehow then?
> > > > > > For example, a flag to ignore source IP when hashing?
> > > > > > Or maybe just for UDP packets?
> > > > > 
> > > > > 1. I think the common solution is based on the inner header, so that
> > > > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > > > 
> > > > > 2. The VXLAN spec does not show that the outer source port in both
> > > > > directions of the same flow must be the same [1]
> > > > > (although the outer source port is calculated based on the consistent hash
> > > > > in the kernel. The consistent hash will sort the five-tuple before
> > > > > calculating hashing),
> > > > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > > > implementations.
> > > > 
> > > > I agree, best not to assume if it's not in the spec.
> > > > The requirement to hash two sides to same queue might
> > > > not be necessary for everyone though, right?
> > > 
> > > The outer source port is also not reliable when it needs to be hashed to
> > > the same queue, but the inner header identifies a flow reliably and
> > > universally.
> > > 
> > > > 
> > > > > The GENEVE spec uses "SHOUlD"[2].
> > > > 
> > > > What about other tunnels? Could you summarize please?
> > > 
> > > Sure.
> > > 
> > > The VXLAN spec[1] does not show that the outer source port in both
> > > directions of the same flow must be the same.
> > > 
> > > VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> > > recommend that the outer source port of the same flow be calculated
> > > based on the inner header hash and set to the same.
> > > 
> > > But the udp source port of GRE-in-UDP may be used in a scenario similar
> > > to NAPT [4.2], where the udp source port is no longer used for entropy,
> > > but for identifying different internal hosts. So using udp source port
> > > does not identify the same stream. This is why using the inner header is
> > > more general, since information about the original stream can reliably
> > > identify a flow.
> > > 
> > > [1] "Source Port: It is recommended that the UDP source port number be
> > > calculated using a hash of fields from the inner packet -- one example
> > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> > > across the VXLAN overlay. When calculating the UDP source port number in
> > > this manner, it is RECOMMENDED that the value be in the dynamic/private
> > > port range 49152-65535 [RFC6335]"
> > > 
> > > [2] "Source UDP Port: The source UDP port is used as entropy for devices
> > > forwarding encapsulated packets across the underlay (ECMP for IP routers,
> > > or load splitting for link aggregation by bridges). Tenant traffic flows
> > > should all use the same source UDP port to lower the chances of packet
> > > reordering by the underlay for a given flow. It is recommended for VTEPs
> > > to generate this port number using a hash of the inner packet headers.
> > > Implementations MAY use the entire 16 bit source UDP port for entropy."
> > > 
> > > [3] "Source Port: A source port selected by the originating tunnel
> > > endpoint. This source port SHOULD be the same for all packets belonging
> > > to a single encapsulated flow to prevent reordering due to the use of
> > > different paths. To encourage an even distribution of flows across
> > > multiple links, the source port SHOULD be calculated using a hash of the
> > > encapsulated packet headers using, for example, a traditional 5-tuple.
> > > Since the port represents a flow identifier rather than a true UDP
> > > connection, the entire 16-bit range MAY be used to maximize entropy."
> > > 
> > > [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> > > an entropy value. The UDP source port contains a 16-bit entropy value
> > > that is generated by the encapsulator to identify a flow for the
> > > encapsulated packet. The port value SHOULD be within the ephemeral port
> > > range, i.e., 49152 to 65535, where the high-order two bits of the port
> > > are set to one. This provides fourteen bits of entropy for the inner
> > > flow identifier. In the case that an encapsulator is unable to derive
> > > flow entropy from the payload header or the entropy usage has to be
> > > disabled to meet operational requirements (see Section 7), to avoid
> > > reordering with a packet flow, the encapsulator SHOULD use the same UDP
> > > source port value for all packets assigned to a flow, e.g., the result
> > > of an algorithm that performs a hash of the tunnel ingress and egress IP
> > > address."
> > > 
> > > [4.2] "use of the UDP source port for entropy may impact middleboxes'
> > > behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> > > with a middlebox, the tunnel can be configured either to disable use
> > > of the UDP source port for entropy or to enable middleboxes to pass
> > > packets with UDP source port entropy."
> > > 
> > > [5] "STT achieves the first goal by ensuring that the source and
> > > destination ports and addresses in the outer header are all the same for
> > > a single flow.  The second goal is achieved by generating the source
> > > port using a random hash of fields in the headers of the inner packets,
> > > e.g. the ports and addresses of the virtual flow's packets."
> > 
> > 
> > 
> > > > SHOULD means "if you ignore this
> > > > things will work but not well".
> > > > You mentioned concerns such as worse performance,
> > > > this is fine with SHOULD.
> > > 
> > > That's it.
> > > 
> > > > Is inner hashing important for
> > > > correctness sometimes?
> > > 
> > > I'm sorry I didn't understand this, can you explain it in more detail?
> > 
> > Do things actually break if inner hash is not enabled or is this
> > a performance optimization?
> 
> Yes, the internal hash comes from our real internal needs, and the
> application scenarios have a large scale. When the data traffic and
> scale increase, this is very beneficial to our production efficiency and
> cost. Performance optimization is not only an important direction of the
> network, but also a manifestation of complete functionality. Based on
> this, we have reason to believe that internal hashing will play a role
> in future developments.

I frankly hope we will support something programmable for this
down the road rather than hard-coding.


> > 
> > > > 
> > > > > 3. How should we generalize? The device uses a feature to advertise all the
> > > > > tunnel types it supports, and hashes these tunnel types using the outer
> > > > > source port,
> > > > > and then we still have to give the specific tunneling protocols supported by
> > > > > the device, just like we do now.
> > > > 
> > > > Is it problematic to do this for all UDP packets?
> > > 
> > > I think there will be problems. While devices support configuring this,
> > > drivers sometimes don't want devices to do special handling for certain
> > > tunneling protocols.
> > > 
> > > Thanks.
> > 
> > I guess we can at least add a flag to do this (ignore IP addresses,
> > just hash the port numbers) for all UDP packets?
> 
> Yes, I think this can also be used as a worker thread.


I don't know what that means.

> > Or maybe UDP4/UDP6 separately.
> > Hopefully this will be enough to prevent getting requests
> > to add more offloads in the future.
> 
> Agreed, and understand your concerns about this.
> 
> Thanks.


> > 
> > 
> > > > 
> > > > > [1] "Source Port: It is recommended that the UDP source port number be
> > > > > calculated using a hash of fields from the inner packet -- one example
> > > > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > > > the VXLAN overlay. When calculating the UDP source port number in this
> > > > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > > > port range 49152-65535 [RFC6335] "
> > > > > 
> > > > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > > > This source port SHOULD be the same for all packets belonging to a
> > > > > single encapsulated flow to prevent reordering due to the use of different
> > > > > paths. To encourage an even distribution of flows across multiple links,
> > > > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > > > headers using, for example, a traditional 5-tuple. Since the port
> > > > > represents a flow identifier rather than a true UDP connection, the entire
> > > > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > > > source port, for IPv6, the flow label MAY also be used for providing
> > > > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > > > see [RFC6438]."
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and
> > > > > to minimize spam in the list archive, subscription is required
> > > > > before posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > Join OASIS: https://www.oasis-open.org/join/
> > > > > 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-12 11:27                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2023-05-12 11:27 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo

On Fri, May 12, 2023 at 03:23:46PM +0800, Heng Qi wrote:
> On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
> > On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
> > > On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
> > > > On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> > > > > 
> > > > > 
> > > > > 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
> > > > > > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > > > > > 
> > > > > > > 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
> > > > > > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > > > > 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
> > > > > > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > > > > > This does not mean that every device needs to implement and support all of
> > > > > > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I add these because we have scale application scenarios for modern protocols
> > > > > > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
> > > > > > > > > > > > > +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Thanks.
> > > > > > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > > > > > 
> > > > > > > > > > > > 	It is recommended that the UDP source port number
> > > > > > > > > > > > 	 be calculated using a hash of fields from the inner packet
> > > > > > > > > > > > 
> > > > > > > > > > > > That is best because
> > > > > > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > > > > > Yes. I agree with this, I don't think we have an argument on this point
> > > > > > > > > > > right now.:)
> > > > > > > > > > > 
> > > > > > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
> > > > > > > > > > > with
> > > > > > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > > > > > 
> > > > > > > > > > > Having them hashed to the same rx queue, is hard to do via outer headers.
> > > > > > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > > > > > The scenarios above or in the commit log also require inner headers.
> > > > > > > > > > Hmm I am not sure I get it 100%.
> > > > > > > > > > Could you show an example with inner header hash in the port #,
> > > > > > > > > > hash is symmetric, and you still have trouble?
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > It kinds of sounds like not enough entropy is not the problem
> > > > > > > > > > at this point.
> > > > > > > > > Sorry for the late reply. :)
> > > > > > > > > 
> > > > > > > > > For modern tunneling protocols, yes.
> > > > > > > > > 
> > > > > > > > > > You now want to drop everything from the header
> > > > > > > > > > except the UDP source port. Is that a fair summary?
> > > > > > > > > > 
> > > > > > > > > For example, for the same flow passing through different VXLAN tunnels,
> > > > > > > > > packets in this flow have the same inner header and different outer
> > > > > > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > > > > > same rxq, then we can use the inner header as the hash input.
> > > > > > > > > 
> > > > > > > > > Thanks!
> > > > > > > > So, they will have the same source port yes?
> > > > > > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > > > > > original packet,
> > > > > > > and the outer ports are the same but the outer IPs are different after
> > > > > > > different directions of the same flow pass through different tunnels.
> > > > > > > > Any way to use that
> > > > > > > We use it in monitoring, firewall and other scenarios.
> > > > > > > 
> > > > > > > > so we don't depend on a specific protocol?
> > > > > > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > > > > > 
> > > > > > > Thanks.
> > > > > > > 
> > > > > > No, the question was - can we generalize this somehow then?
> > > > > > For example, a flag to ignore source IP when hashing?
> > > > > > Or maybe just for UDP packets?
> > > > > 
> > > > > 1. I think the common solution is based on the inner header, so that
> > > > > GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> > > > > 
> > > > > 2. The VXLAN spec does not show that the outer source port in both
> > > > > directions of the same flow must be the same [1]
> > > > > (although the outer source port is calculated based on the consistent hash
> > > > > in the kernel. The consistent hash will sort the five-tuple before
> > > > > calculating hashing),
> > > > > but it is best not to assume that consistent hashing is used in all VXLAN
> > > > > implementations.
> > > > 
> > > > I agree, best not to assume if it's not in the spec.
> > > > The requirement to hash two sides to same queue might
> > > > not be necessary for everyone though, right?
> > > 
> > > The outer source port is also not reliable when it needs to be hashed to
> > > the same queue, but the inner header identifies a flow reliably and
> > > universally.
> > > 
> > > > 
> > > > > The GENEVE spec uses "SHOUlD"[2].
> > > > 
> > > > What about other tunnels? Could you summarize please?
> > > 
> > > Sure.
> > > 
> > > The VXLAN spec[1] does not show that the outer source port in both
> > > directions of the same flow must be the same.
> > > 
> > > VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
> > > recommend that the outer source port of the same flow be calculated
> > > based on the inner header hash and set to the same.
> > > 
> > > But the udp source port of GRE-in-UDP may be used in a scenario similar
> > > to NAPT [4.2], where the udp source port is no longer used for entropy,
> > > but for identifying different internal hosts. So using udp source port
> > > does not identify the same stream. This is why using the inner header is
> > > more general, since information about the original stream can reliably
> > > identify a flow.
> > > 
> > > [1] "Source Port: It is recommended that the UDP source port number be
> > > calculated using a hash of fields from the inner packet -- one example
> > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
> > > across the VXLAN overlay. When calculating the UDP source port number in
> > > this manner, it is RECOMMENDED that the value be in the dynamic/private
> > > port range 49152-65535 [RFC6335]"
> > > 
> > > [2] "Source UDP Port: The source UDP port is used as entropy for devices
> > > forwarding encapsulated packets across the underlay (ECMP for IP routers,
> > > or load splitting for link aggregation by bridges). Tenant traffic flows
> > > should all use the same source UDP port to lower the chances of packet
> > > reordering by the underlay for a given flow. It is recommended for VTEPs
> > > to generate this port number using a hash of the inner packet headers.
> > > Implementations MAY use the entire 16 bit source UDP port for entropy."
> > > 
> > > [3] "Source Port: A source port selected by the originating tunnel
> > > endpoint. This source port SHOULD be the same for all packets belonging
> > > to a single encapsulated flow to prevent reordering due to the use of
> > > different paths. To encourage an even distribution of flows across
> > > multiple links, the source port SHOULD be calculated using a hash of the
> > > encapsulated packet headers using, for example, a traditional 5-tuple.
> > > Since the port represents a flow identifier rather than a true UDP
> > > connection, the entire 16-bit range MAY be used to maximize entropy."
> > > 
> > > [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
> > > an entropy value. The UDP source port contains a 16-bit entropy value
> > > that is generated by the encapsulator to identify a flow for the
> > > encapsulated packet. The port value SHOULD be within the ephemeral port
> > > range, i.e., 49152 to 65535, where the high-order two bits of the port
> > > are set to one. This provides fourteen bits of entropy for the inner
> > > flow identifier. In the case that an encapsulator is unable to derive
> > > flow entropy from the payload header or the entropy usage has to be
> > > disabled to meet operational requirements (see Section 7), to avoid
> > > reordering with a packet flow, the encapsulator SHOULD use the same UDP
> > > source port value for all packets assigned to a flow, e.g., the result
> > > of an algorithm that performs a hash of the tunnel ingress and egress IP
> > > address."
> > > 
> > > [4.2] "use of the UDP source port for entropy may impact middleboxes'
> > > behavior. If a GRE-in-UDP tunnel is expected to be used on a path
> > > with a middlebox, the tunnel can be configured either to disable use
> > > of the UDP source port for entropy or to enable middleboxes to pass
> > > packets with UDP source port entropy."
> > > 
> > > [5] "STT achieves the first goal by ensuring that the source and
> > > destination ports and addresses in the outer header are all the same for
> > > a single flow.  The second goal is achieved by generating the source
> > > port using a random hash of fields in the headers of the inner packets,
> > > e.g. the ports and addresses of the virtual flow's packets."
> > 
> > 
> > 
> > > > SHOULD means "if you ignore this
> > > > things will work but not well".
> > > > You mentioned concerns such as worse performance,
> > > > this is fine with SHOULD.
> > > 
> > > That's it.
> > > 
> > > > Is inner hashing important for
> > > > correctness sometimes?
> > > 
> > > I'm sorry I didn't understand this, can you explain it in more detail?
> > 
> > Do things actually break if inner hash is not enabled or is this
> > a performance optimization?
> 
> Yes, the internal hash comes from our real internal needs, and the
> application scenarios have a large scale. When the data traffic and
> scale increase, this is very beneficial to our production efficiency and
> cost. Performance optimization is not only an important direction of the
> network, but also a manifestation of complete functionality. Based on
> this, we have reason to believe that internal hashing will play a role
> in future developments.

I frankly hope we will support something programmable for this
down the road rather than hard-coding.


> > 
> > > > 
> > > > > 3. How should we generalize? The device uses a feature to advertise all the
> > > > > tunnel types it supports, and hashes these tunnel types using the outer
> > > > > source port,
> > > > > and then we still have to give the specific tunneling protocols supported by
> > > > > the device, just like we do now.
> > > > 
> > > > Is it problematic to do this for all UDP packets?
> > > 
> > > I think there will be problems. While devices support configuring this,
> > > drivers sometimes don't want devices to do special handling for certain
> > > tunneling protocols.
> > > 
> > > Thanks.
> > 
> > I guess we can at least add a flag to do this (ignore IP addresses,
> > just hash the port numbers) for all UDP packets?
> 
> Yes, I think this can also be used as a worker thread.


I don't know what that means.

> > Or maybe UDP4/UDP6 separately.
> > Hopefully this will be enough to prevent getting requests
> > to add more offloads in the future.
> 
> Agreed, and understand your concerns about this.
> 
> Thanks.


> > 
> > 
> > > > 
> > > > > [1] "Source Port: It is recommended that the UDP source port number be
> > > > > calculated using a hash of fields from the inner packet -- one example
> > > > > being a hash of the inner Ethernet frame's headers. This is to enable a
> > > > > level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
> > > > > the VXLAN overlay. When calculating the UDP source port number in this
> > > > > manner, it is RECOMMENDED that the value be in the dynamic/private
> > > > > port range 49152-65535 [RFC6335] "
> > > > > 
> > > > > [2] "Source Port: A source port selected by the originating tunnel endpoint.
> > > > > This source port SHOULD be the same for all packets belonging to a
> > > > > single encapsulated flow to prevent reordering due to the use of different
> > > > > paths. To encourage an even distribution of flows across multiple links,
> > > > > the source port SHOULD be calculated using a hash of the encapsulated packet
> > > > > headers using, for example, a traditional 5-tuple. Since the port
> > > > > represents a flow identifier rather than a true UDP connection, the entire
> > > > > 16-bit range MAY be used to maximize entropy. In addition to setting the
> > > > > source port, for IPv6, the flow label MAY also be used for providing
> > > > > entropy. For an example of using the IPv6 flow label for tunnel use cases,
> > > > > see [RFC6438]."
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and
> > > > > to minimize spam in the list archive, subscription is required
> > > > > before posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > Join OASIS: https://www.oasis-open.org/join/
> > > > > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
  2023-05-12 11:27                               ` Michael S. Tsirkin
@ 2023-05-15  6:51                                 ` Heng Qi
  -1 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-15  6:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/12 下午7:27, Michael S. Tsirkin 写道:
> On Fri, May 12, 2023 at 03:23:46PM +0800, Heng Qi wrote:
>> On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
>>> On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
>>>> On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
>>>>> On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
>>>>>>
>>>>>> 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
>>>>>>> On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
>>>>>>>> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
>>>>>>>>> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>>>>>>>>>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>>>>>>>>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>>>>>>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>>>>>>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>>>>>>>>>> This does not mean that every device needs to implement and support all of
>>>>>>>>>>>>>> these, they can choose to support some protocols they want.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>>>>>>>>>> VXLAN-GPE/GENEVE:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>>>>>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 	It is recommended that the UDP source port number
>>>>>>>>>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is best because
>>>>>>>>>>>>> it allows end to end control and is protocol agnostic.
>>>>>>>>>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>>>>>>>>>> right now.:)
>>>>>>>>>>>>
>>>>>>>>>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>>>>>>>>>> with
>>>>>>>>>>>> scenarios where the same flow passes through different tunnels.
>>>>>>>>>>>>
>>>>>>>>>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>>>>>>>>>> All that is missing is symmetric Toepliz and all is well?
>>>>>>>>>>>> The scenarios above or in the commit log also require inner headers.
>>>>>>>>>>> Hmm I am not sure I get it 100%.
>>>>>>>>>>> Could you show an example with inner header hash in the port #,
>>>>>>>>>>> hash is symmetric, and you still have trouble?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It kinds of sounds like not enough entropy is not the problem
>>>>>>>>>>> at this point.
>>>>>>>>>> Sorry for the late reply. :)
>>>>>>>>>>
>>>>>>>>>> For modern tunneling protocols, yes.
>>>>>>>>>>
>>>>>>>>>>> You now want to drop everything from the header
>>>>>>>>>>> except the UDP source port. Is that a fair summary?
>>>>>>>>>>>
>>>>>>>>>> For example, for the same flow passing through different VXLAN tunnels,
>>>>>>>>>> packets in this flow have the same inner header and different outer
>>>>>>>>>> headers. Sometimes these packets of the flow need to be hashed to the
>>>>>>>>>> same rxq, then we can use the inner header as the hash input.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>> So, they will have the same source port yes?
>>>>>>>> Yes. The outer source port can be calculated using the 5-tuple of the
>>>>>>>> original packet,
>>>>>>>> and the outer ports are the same but the outer IPs are different after
>>>>>>>> different directions of the same flow pass through different tunnels.
>>>>>>>>> Any way to use that
>>>>>>>> We use it in monitoring, firewall and other scenarios.
>>>>>>>>
>>>>>>>>> so we don't depend on a specific protocol?
>>>>>>>> Yes, selected tunneling protocols can be used in this scenario like this.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>> No, the question was - can we generalize this somehow then?
>>>>>>> For example, a flag to ignore source IP when hashing?
>>>>>>> Or maybe just for UDP packets?
>>>>>> 1. I think the common solution is based on the inner header, so that
>>>>>> GRE/IPIP tunnels can also enjoy inner symmetric hashing.
>>>>>>
>>>>>> 2. The VXLAN spec does not show that the outer source port in both
>>>>>> directions of the same flow must be the same [1]
>>>>>> (although the outer source port is calculated based on the consistent hash
>>>>>> in the kernel. The consistent hash will sort the five-tuple before
>>>>>> calculating hashing),
>>>>>> but it is best not to assume that consistent hashing is used in all VXLAN
>>>>>> implementations.
>>>>> I agree, best not to assume if it's not in the spec.
>>>>> The requirement to hash two sides to same queue might
>>>>> not be necessary for everyone though, right?
>>>> The outer source port is also not reliable when it needs to be hashed to
>>>> the same queue, but the inner header identifies a flow reliably and
>>>> universally.
>>>>
>>>>>> The GENEVE spec uses "SHOUlD"[2].
>>>>> What about other tunnels? Could you summarize please?
>>>> Sure.
>>>>
>>>> The VXLAN spec[1] does not show that the outer source port in both
>>>> directions of the same flow must be the same.
>>>>
>>>> VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
>>>> recommend that the outer source port of the same flow be calculated
>>>> based on the inner header hash and set to the same.
>>>>
>>>> But the udp source port of GRE-in-UDP may be used in a scenario similar
>>>> to NAPT [4.2], where the udp source port is no longer used for entropy,
>>>> but for identifying different internal hosts. So using udp source port
>>>> does not identify the same stream. This is why using the inner header is
>>>> more general, since information about the original stream can reliably
>>>> identify a flow.
>>>>
>>>> [1] "Source Port: It is recommended that the UDP source port number be
>>>> calculated using a hash of fields from the inner packet -- one example
>>>> being a hash of the inner Ethernet frame's headers. This is to enable a
>>>> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
>>>> across the VXLAN overlay. When calculating the UDP source port number in
>>>> this manner, it is RECOMMENDED that the value be in the dynamic/private
>>>> port range 49152-65535 [RFC6335]"
>>>>
>>>> [2] "Source UDP Port: The source UDP port is used as entropy for devices
>>>> forwarding encapsulated packets across the underlay (ECMP for IP routers,
>>>> or load splitting for link aggregation by bridges). Tenant traffic flows
>>>> should all use the same source UDP port to lower the chances of packet
>>>> reordering by the underlay for a given flow. It is recommended for VTEPs
>>>> to generate this port number using a hash of the inner packet headers.
>>>> Implementations MAY use the entire 16 bit source UDP port for entropy."
>>>>
>>>> [3] "Source Port: A source port selected by the originating tunnel
>>>> endpoint. This source port SHOULD be the same for all packets belonging
>>>> to a single encapsulated flow to prevent reordering due to the use of
>>>> different paths. To encourage an even distribution of flows across
>>>> multiple links, the source port SHOULD be calculated using a hash of the
>>>> encapsulated packet headers using, for example, a traditional 5-tuple.
>>>> Since the port represents a flow identifier rather than a true UDP
>>>> connection, the entire 16-bit range MAY be used to maximize entropy."
>>>>
>>>> [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
>>>> an entropy value. The UDP source port contains a 16-bit entropy value
>>>> that is generated by the encapsulator to identify a flow for the
>>>> encapsulated packet. The port value SHOULD be within the ephemeral port
>>>> range, i.e., 49152 to 65535, where the high-order two bits of the port
>>>> are set to one. This provides fourteen bits of entropy for the inner
>>>> flow identifier. In the case that an encapsulator is unable to derive
>>>> flow entropy from the payload header or the entropy usage has to be
>>>> disabled to meet operational requirements (see Section 7), to avoid
>>>> reordering with a packet flow, the encapsulator SHOULD use the same UDP
>>>> source port value for all packets assigned to a flow, e.g., the result
>>>> of an algorithm that performs a hash of the tunnel ingress and egress IP
>>>> address."
>>>>
>>>> [4.2] "use of the UDP source port for entropy may impact middleboxes'
>>>> behavior. If a GRE-in-UDP tunnel is expected to be used on a path
>>>> with a middlebox, the tunnel can be configured either to disable use
>>>> of the UDP source port for entropy or to enable middleboxes to pass
>>>> packets with UDP source port entropy."
>>>>
>>>> [5] "STT achieves the first goal by ensuring that the source and
>>>> destination ports and addresses in the outer header are all the same for
>>>> a single flow.  The second goal is achieved by generating the source
>>>> port using a random hash of fields in the headers of the inner packets,
>>>> e.g. the ports and addresses of the virtual flow's packets."
>>>
>>>
>>>>> SHOULD means "if you ignore this
>>>>> things will work but not well".
>>>>> You mentioned concerns such as worse performance,
>>>>> this is fine with SHOULD.
>>>> That's it.
>>>>
>>>>> Is inner hashing important for
>>>>> correctness sometimes?
>>>> I'm sorry I didn't understand this, can you explain it in more detail?
>>> Do things actually break if inner hash is not enabled or is this
>>> a performance optimization?
>> Yes, the internal hash comes from our real internal needs, and the
>> application scenarios have a large scale. When the data traffic and
>> scale increase, this is very beneficial to our production efficiency and
>> cost. Performance optimization is not only an important direction of the
>> network, but also a manifestation of complete functionality. Based on
>> this, we have reason to believe that internal hashing will play a role
>> in future developments.
> I frankly hope we will support something programmable for this
> down the road rather than hard-coding.

The inner header hash first requires the device to parse the specific 
tunnel protocol to do specific things,
so we need to hardcode some tunnel types. GRE/VXLAN/GENEVE/NVGRE/STT are 
mainstream
tunneling protocols included as much as possible. 
\field{supported_tunnel_hash_types} provides
the device with the ability to choose to support certain tunneling 
protocols for inner hashing, and
\field{tunnel_hash_types} further provides drivers with configuration 
capability. These add programmability
and flexibility to the inner header hash. Or do we have other ways to 
increase programmability?

>
>>>>>> 3. How should we generalize? The device uses a feature to advertise all the
>>>>>> tunnel types it supports, and hashes these tunnel types using the outer
>>>>>> source port,
>>>>>> and then we still have to give the specific tunneling protocols supported by
>>>>>> the device, just like we do now.
>>>>> Is it problematic to do this for all UDP packets?
>>>> I think there will be problems. While devices support configuring this,
>>>> drivers sometimes don't want devices to do special handling for certain
>>>> tunneling protocols.
>>>>
>>>> Thanks.
>>> I guess we can at least add a flag to do this (ignore IP addresses,
>>> just hash the port numbers) for all UDP packets?
>> Yes, I think this can also be used as a worker thread.
>
> I don't know what that means.

As we have discussed, symmetric hashing based on udp source port is 
unreliable, and it is not suitable for
protocols such as GRE/NVGRE/IPIP that do not have outer transport headers.

Thanks.

>
>>> Or maybe UDP4/UDP6 separately.
>>> Hopefully this will be enough to prevent getting requests
>>> to add more offloads in the future.
>> Agreed, and understand your concerns about this.
>>
>> Thanks.
>
>>>
>>>>>> [1] "Source Port: It is recommended that the UDP source port number be
>>>>>> calculated using a hash of fields from the inner packet -- one example
>>>>>> being a hash of the inner Ethernet frame's headers. This is to enable a
>>>>>> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
>>>>>> the VXLAN overlay. When calculating the UDP source port number in this
>>>>>> manner, it is RECOMMENDED that the value be in the dynamic/private
>>>>>> port range 49152-65535 [RFC6335] "
>>>>>>
>>>>>> [2] "Source Port: A source port selected by the originating tunnel endpoint.
>>>>>> This source port SHOULD be the same for all packets belonging to a
>>>>>> single encapsulated flow to prevent reordering due to the use of different
>>>>>> paths. To encourage an even distribution of flows across multiple links,
>>>>>> the source port SHOULD be calculated using a hash of the encapsulated packet
>>>>>> headers using, for example, a traditional 5-tuple. Since the port
>>>>>> represents a flow identifier rather than a true UDP connection, the entire
>>>>>> 16-bit range MAY be used to maximize entropy. In addition to setting the
>>>>>> source port, for IPv6, the flow label MAY also be used for providing
>>>>>> entropy. For an example of using the IPv6 flow label for tunnel use cases,
>>>>>> see [RFC6438]."
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> This publicly archived list offers a means to provide input to the
>>>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>>>
>>>>>> In order to verify user consent to the Feedback License terms and
>>>>>> to minimize spam in the list archive, subscription is required
>>>>>> before posting.
>>>>>>
>>>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
@ 2023-05-15  6:51                                 ` Heng Qi
  0 siblings, 0 replies; 60+ messages in thread
From: Heng Qi @ 2023-05-15  6:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, virtio-comment, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo



在 2023/5/12 下午7:27, Michael S. Tsirkin 写道:
> On Fri, May 12, 2023 at 03:23:46PM +0800, Heng Qi wrote:
>> On Fri, May 12, 2023 at 02:54:34AM -0400, Michael S. Tsirkin wrote:
>>> On Fri, May 12, 2023 at 02:00:19PM +0800, Heng Qi wrote:
>>>> On Thu, May 11, 2023 at 02:22:12AM -0400, Michael S. Tsirkin wrote:
>>>>> On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
>>>>>>
>>>>>> 在 2023/5/9 下午11:15, Michael S. Tsirkin 写道:
>>>>>>> On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
>>>>>>>> 在 2023/5/5 下午10:56, Michael S. Tsirkin 写道:
>>>>>>>>> On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
>>>>>>>>>> On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
>>>>>>>>>>> On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
>>>>>>>>>>>> 在 2023/4/26 下午10:48, Michael S. Tsirkin 写道:
>>>>>>>>>>>>> On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
>>>>>>>>>>>>>> This does not mean that every device needs to implement and support all of
>>>>>>>>>>>>>> these, they can choose to support some protocols they want.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I add these because we have scale application scenarios for modern protocols
>>>>>>>>>>>>>> VXLAN-GPE/GENEVE:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
>>>>>>>>>>>>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving performance.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>> But VXLAN-GPE/GENEVE can use source port for entropy.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 	It is recommended that the UDP source port number
>>>>>>>>>>>>> 	 be calculated using a hash of fields from the inner packet
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is best because
>>>>>>>>>>>>> it allows end to end control and is protocol agnostic.
>>>>>>>>>>>> Yes. I agree with this, I don't think we have an argument on this point
>>>>>>>>>>>> right now.:)
>>>>>>>>>>>>
>>>>>>>>>>>> For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
>>>>>>>>>>>> with
>>>>>>>>>>>> scenarios where the same flow passes through different tunnels.
>>>>>>>>>>>>
>>>>>>>>>>>> Having them hashed to the same rx queue, is hard to do via outer headers.
>>>>>>>>>>>>> All that is missing is symmetric Toepliz and all is well?
>>>>>>>>>>>> The scenarios above or in the commit log also require inner headers.
>>>>>>>>>>> Hmm I am not sure I get it 100%.
>>>>>>>>>>> Could you show an example with inner header hash in the port #,
>>>>>>>>>>> hash is symmetric, and you still have trouble?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It kinds of sounds like not enough entropy is not the problem
>>>>>>>>>>> at this point.
>>>>>>>>>> Sorry for the late reply. :)
>>>>>>>>>>
>>>>>>>>>> For modern tunneling protocols, yes.
>>>>>>>>>>
>>>>>>>>>>> You now want to drop everything from the header
>>>>>>>>>>> except the UDP source port. Is that a fair summary?
>>>>>>>>>>>
>>>>>>>>>> For example, for the same flow passing through different VXLAN tunnels,
>>>>>>>>>> packets in this flow have the same inner header and different outer
>>>>>>>>>> headers. Sometimes these packets of the flow need to be hashed to the
>>>>>>>>>> same rxq, then we can use the inner header as the hash input.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>> So, they will have the same source port yes?
>>>>>>>> Yes. The outer source port can be calculated using the 5-tuple of the
>>>>>>>> original packet,
>>>>>>>> and the outer ports are the same but the outer IPs are different after
>>>>>>>> different directions of the same flow pass through different tunnels.
>>>>>>>>> Any way to use that
>>>>>>>> We use it in monitoring, firewall and other scenarios.
>>>>>>>>
>>>>>>>>> so we don't depend on a specific protocol?
>>>>>>>> Yes, selected tunneling protocols can be used in this scenario like this.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>> No, the question was - can we generalize this somehow then?
>>>>>>> For example, a flag to ignore source IP when hashing?
>>>>>>> Or maybe just for UDP packets?
>>>>>> 1. I think the common solution is based on the inner header, so that
>>>>>> GRE/IPIP tunnels can also enjoy inner symmetric hashing.
>>>>>>
>>>>>> 2. The VXLAN spec does not show that the outer source port in both
>>>>>> directions of the same flow must be the same [1]
>>>>>> (although the outer source port is calculated based on the consistent hash
>>>>>> in the kernel. The consistent hash will sort the five-tuple before
>>>>>> calculating hashing),
>>>>>> but it is best not to assume that consistent hashing is used in all VXLAN
>>>>>> implementations.
>>>>> I agree, best not to assume if it's not in the spec.
>>>>> The requirement to hash two sides to same queue might
>>>>> not be necessary for everyone though, right?
>>>> The outer source port is also not reliable when it needs to be hashed to
>>>> the same queue, but the inner header identifies a flow reliably and
>>>> universally.
>>>>
>>>>>> The GENEVE spec uses "SHOUlD"[2].
>>>>> What about other tunnels? Could you summarize please?
>>>> Sure.
>>>>
>>>> The VXLAN spec[1] does not show that the outer source port in both
>>>> directions of the same flow must be the same.
>>>>
>>>> VXLAN-GPE[2]("SHOULD")/GENEVE[3]("SHOULD")/GRE-in-UDP[4.1]/STT[5]
>>>> recommend that the outer source port of the same flow be calculated
>>>> based on the inner header hash and set to the same.
>>>>
>>>> But the udp source port of GRE-in-UDP may be used in a scenario similar
>>>> to NAPT [4.2], where the udp source port is no longer used for entropy,
>>>> but for identifying different internal hosts. So using udp source port
>>>> does not identify the same stream. This is why using the inner header is
>>>> more general, since information about the original stream can reliably
>>>> identify a flow.
>>>>
>>>> [1] "Source Port: It is recommended that the UDP source port number be
>>>> calculated using a hash of fields from the inner packet -- one example
>>>> being a hash of the inner Ethernet frame's headers. This is to enable a
>>>> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic
>>>> across the VXLAN overlay. When calculating the UDP source port number in
>>>> this manner, it is RECOMMENDED that the value be in the dynamic/private
>>>> port range 49152-65535 [RFC6335]"
>>>>
>>>> [2] "Source UDP Port: The source UDP port is used as entropy for devices
>>>> forwarding encapsulated packets across the underlay (ECMP for IP routers,
>>>> or load splitting for link aggregation by bridges). Tenant traffic flows
>>>> should all use the same source UDP port to lower the chances of packet
>>>> reordering by the underlay for a given flow. It is recommended for VTEPs
>>>> to generate this port number using a hash of the inner packet headers.
>>>> Implementations MAY use the entire 16 bit source UDP port for entropy."
>>>>
>>>> [3] "Source Port: A source port selected by the originating tunnel
>>>> endpoint. This source port SHOULD be the same for all packets belonging
>>>> to a single encapsulated flow to prevent reordering due to the use of
>>>> different paths. To encourage an even distribution of flows across
>>>> multiple links, the source port SHOULD be calculated using a hash of the
>>>> encapsulated packet headers using, for example, a traditional 5-tuple.
>>>> Since the port represents a flow identifier rather than a true UDP
>>>> connection, the entire 16-bit range MAY be used to maximize entropy."
>>>>
>>>> [4.1] "GRE-in-UDP permits the UDP source port value to be used to encode
>>>> an entropy value. The UDP source port contains a 16-bit entropy value
>>>> that is generated by the encapsulator to identify a flow for the
>>>> encapsulated packet. The port value SHOULD be within the ephemeral port
>>>> range, i.e., 49152 to 65535, where the high-order two bits of the port
>>>> are set to one. This provides fourteen bits of entropy for the inner
>>>> flow identifier. In the case that an encapsulator is unable to derive
>>>> flow entropy from the payload header or the entropy usage has to be
>>>> disabled to meet operational requirements (see Section 7), to avoid
>>>> reordering with a packet flow, the encapsulator SHOULD use the same UDP
>>>> source port value for all packets assigned to a flow, e.g., the result
>>>> of an algorithm that performs a hash of the tunnel ingress and egress IP
>>>> address."
>>>>
>>>> [4.2] "use of the UDP source port for entropy may impact middleboxes'
>>>> behavior. If a GRE-in-UDP tunnel is expected to be used on a path
>>>> with a middlebox, the tunnel can be configured either to disable use
>>>> of the UDP source port for entropy or to enable middleboxes to pass
>>>> packets with UDP source port entropy."
>>>>
>>>> [5] "STT achieves the first goal by ensuring that the source and
>>>> destination ports and addresses in the outer header are all the same for
>>>> a single flow.  The second goal is achieved by generating the source
>>>> port using a random hash of fields in the headers of the inner packets,
>>>> e.g. the ports and addresses of the virtual flow's packets."
>>>
>>>
>>>>> SHOULD means "if you ignore this
>>>>> things will work but not well".
>>>>> You mentioned concerns such as worse performance,
>>>>> this is fine with SHOULD.
>>>> That's it.
>>>>
>>>>> Is inner hashing important for
>>>>> correctness sometimes?
>>>> I'm sorry I didn't understand this, can you explain it in more detail?
>>> Do things actually break if inner hash is not enabled or is this
>>> a performance optimization?
>> Yes, the internal hash comes from our real internal needs, and the
>> application scenarios have a large scale. When the data traffic and
>> scale increase, this is very beneficial to our production efficiency and
>> cost. Performance optimization is not only an important direction of the
>> network, but also a manifestation of complete functionality. Based on
>> this, we have reason to believe that internal hashing will play a role
>> in future developments.
> I frankly hope we will support something programmable for this
> down the road rather than hard-coding.

The inner header hash first requires the device to parse the specific 
tunnel protocol to do specific things,
so we need to hardcode some tunnel types. GRE/VXLAN/GENEVE/NVGRE/STT are 
mainstream
tunneling protocols included as much as possible. 
\field{supported_tunnel_hash_types} provides
the device with the ability to choose to support certain tunneling 
protocols for inner hashing, and
\field{tunnel_hash_types} further provides drivers with configuration 
capability. These add programmability
and flexibility to the inner header hash. Or do we have other ways to 
increase programmability?

>
>>>>>> 3. How should we generalize? The device uses a feature to advertise all the
>>>>>> tunnel types it supports, and hashes these tunnel types using the outer
>>>>>> source port,
>>>>>> and then we still have to give the specific tunneling protocols supported by
>>>>>> the device, just like we do now.
>>>>> Is it problematic to do this for all UDP packets?
>>>> I think there will be problems. While devices support configuring this,
>>>> drivers sometimes don't want devices to do special handling for certain
>>>> tunneling protocols.
>>>>
>>>> Thanks.
>>> I guess we can at least add a flag to do this (ignore IP addresses,
>>> just hash the port numbers) for all UDP packets?
>> Yes, I think this can also be used as a worker thread.
>
> I don't know what that means.

As we have discussed, symmetric hashing based on udp source port is 
unreliable, and it is not suitable for
protocols such as GRE/NVGRE/IPIP that do not have outer transport headers.

Thanks.

>
>>> Or maybe UDP4/UDP6 separately.
>>> Hopefully this will be enough to prevent getting requests
>>> to add more offloads in the future.
>> Agreed, and understand your concerns about this.
>>
>> Thanks.
>
>>>
>>>>>> [1] "Source Port: It is recommended that the UDP source port number be
>>>>>> calculated using a hash of fields from the inner packet -- one example
>>>>>> being a hash of the inner Ethernet frame's headers. This is to enable a
>>>>>> level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
>>>>>> the VXLAN overlay. When calculating the UDP source port number in this
>>>>>> manner, it is RECOMMENDED that the value be in the dynamic/private
>>>>>> port range 49152-65535 [RFC6335] "
>>>>>>
>>>>>> [2] "Source Port: A source port selected by the originating tunnel endpoint.
>>>>>> This source port SHOULD be the same for all packets belonging to a
>>>>>> single encapsulated flow to prevent reordering due to the use of different
>>>>>> paths. To encourage an even distribution of flows across multiple links,
>>>>>> the source port SHOULD be calculated using a hash of the encapsulated packet
>>>>>> headers using, for example, a traditional 5-tuple. Since the port
>>>>>> represents a flow identifier rather than a true UDP connection, the entire
>>>>>> 16-bit range MAY be used to maximize entropy. In addition to setting the
>>>>>> source port, for IPv6, the flow label MAY also be used for providing
>>>>>> entropy. For an example of using the IPv6 flow label for tunnel use cases,
>>>>>> see [RFC6438]."
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> This publicly archived list offers a means to provide input to the
>>>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>>>
>>>>>> In order to verify user consent to the Feedback License terms and
>>>>>> to minimize spam in the list archive, subscription is required
>>>>>> before posting.
>>>>>>
>>>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>>>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2023-05-15  6:51 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-23  7:35 [virtio-dev] [PATCH v13] virtio-net: support inner header hash Heng Qi
2023-04-23  7:35 ` [virtio-comment] " Heng Qi
2023-04-25 20:28 ` [virtio-dev] " Parav Pandit
2023-04-25 20:28   ` [virtio-comment] " Parav Pandit
2023-04-25 21:06   ` [virtio-dev] " Michael S. Tsirkin
2023-04-25 21:06     ` Michael S. Tsirkin
2023-04-25 21:39     ` [virtio-dev] " Parav Pandit
2023-04-25 21:39       ` Parav Pandit
2023-04-26  4:12       ` [virtio-dev] " Michael S. Tsirkin
2023-04-26  4:12         ` Michael S. Tsirkin
2023-04-26  4:27         ` [virtio-dev] " Parav Pandit
2023-04-26  4:27           ` Parav Pandit
2023-04-26  5:02           ` [virtio-dev] " Michael S. Tsirkin
2023-04-26  5:02             ` Michael S. Tsirkin
2023-04-26 13:42   ` [virtio-dev] " Heng Qi
2023-04-26 13:42     ` [virtio-comment] " Heng Qi
2023-04-26 13:47     ` [virtio-dev] " Parav Pandit
2023-04-26 13:47       ` [virtio-comment] " Parav Pandit
2023-04-26 14:03       ` Heng Qi
2023-04-26 14:03         ` [virtio-dev] " Heng Qi
2023-04-26 14:24         ` [virtio-dev] " Parav Pandit
2023-04-26 14:24           ` Parav Pandit
2023-04-26 14:57           ` [virtio-dev] " Michael S. Tsirkin
2023-04-26 14:57             ` Michael S. Tsirkin
2023-04-26 15:20             ` [virtio-dev] " Parav Pandit
2023-04-26 15:20               ` Parav Pandit
2023-04-27  2:19           ` [virtio-dev] " Heng Qi
2023-04-27  2:19             ` [virtio-comment] " Heng Qi
2023-04-25 21:03 ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
2023-04-25 21:03   ` Michael S. Tsirkin
2023-04-26 14:14   ` [virtio-dev] " Heng Qi
2023-04-26 14:14     ` Heng Qi
2023-04-26 14:48     ` [virtio-dev] " Michael S. Tsirkin
2023-04-26 14:48       ` Michael S. Tsirkin
2023-04-27  2:28       ` [virtio-dev] " Heng Qi
2023-04-27  2:28         ` [virtio-comment] " Heng Qi
2023-04-27 17:13         ` Michael S. Tsirkin
2023-04-27 17:13           ` [virtio-comment] " Michael S. Tsirkin
2023-05-05 13:51           ` [virtio-dev] " Heng Qi
2023-05-05 13:51             ` Heng Qi
2023-05-05 14:56             ` [virtio-dev] " Michael S. Tsirkin
2023-05-05 14:56               ` Michael S. Tsirkin
2023-05-09 14:22               ` [virtio-dev] " Heng Qi
2023-05-09 14:22                 ` [virtio-comment] " Heng Qi
2023-05-09 15:15                 ` Michael S. Tsirkin
2023-05-09 15:15                   ` [virtio-comment] " Michael S. Tsirkin
2023-05-10  9:15                   ` [virtio-dev] " Heng Qi
2023-05-10  9:15                     ` Heng Qi
2023-05-11  6:22                     ` [virtio-dev] " Michael S. Tsirkin
2023-05-11  6:22                       ` Michael S. Tsirkin
2023-05-12  6:00                       ` [virtio-dev] " Heng Qi
2023-05-12  6:00                         ` Heng Qi
2023-05-12  6:54                         ` [virtio-dev] " Michael S. Tsirkin
2023-05-12  6:54                           ` Michael S. Tsirkin
2023-05-12  7:23                           ` [virtio-dev] " Heng Qi
2023-05-12  7:23                             ` Heng Qi
2023-05-12 11:27                             ` [virtio-dev] " Michael S. Tsirkin
2023-05-12 11:27                               ` Michael S. Tsirkin
2023-05-15  6:51                               ` [virtio-dev] " Heng Qi
2023-05-15  6:51                                 ` Heng Qi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.