* [virtio-dev] [PATCH v18] virtio-net: support inner header hash
From: Heng Qi @ 2023-06-21 13:50 UTC
  To: virtio-comment, virtio-dev
  Cc: Michael S . Tsirkin, Parav Pandit, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck

1. Currently, a received encapsulated packet has an outer and an inner header, but
the virtio device is unable to calculate the hash for the inner header. The same
flow can traverse through different tunnels, resulting in the encapsulated
packets being spread across multiple receive queues (refer to the figure below).
However, in certain scenarios, we may need to direct these encapsulated packets of
the same flow to a single receive queue. This facilitates the processing
of the flow by the same CPU to improve performance (warm caches, less locking, etc.).

               client1                    client2
                  |        +-------+         |
                  +------->|tunnels|<--------+
                           +-------+
                              |  |
                              v  v
                      +-----------------+
                      | monitoring host |
                      +-----------------+

To achieve this, the device can calculate a symmetric hash based on the inner headers
of the same flow.

2. Legacy tunneling protocols may lack the entropy fields that modern protocols
carry in the outer header, so multiple flows with the same outer header but
different inner headers are directed to the same receive queue. This results in
poor receive performance.

To address this limitation, the device can advertise the capability to calculate the
hash over the inner header, which improves receive performance for such flows.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
---
v17->v18:
	1. Some rewording suggestions from Michael (Thanks!).
	2. Use 0 to disable inner header hash and remove
	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
v16->v17:
	1. Some small rewrites. @Parav Pandit
	2. Add Parav's Reviewed-by tag (Thanks!).

v15->v16:
	1. Remove the hash_option. In order to delimit the inner header hash and RSS
	   configuration, the ability to configure the outer src udp port hash is given
	   to RSS. This is orthogonal to inner header hash, which will be done in the
	   RSS capability extension topic (considered as an RSS extension together
	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
	2. Fix a 'field' typo. @Parav Pandit

v14->v15:
	1. Add tunnel hash option suggested by @Michael S . Tsirkin
	2. Adjust some descriptions.

v13->v14:
	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
	2. Rebase to master branch.
	3. Some minor modifications.

v12->v13:
	1. Add a GET command for hash_tunnel_types. @Parav Pandit
	2. Add tunneling protocol explanation. @Jason Wang
	3. Add comments on some usage scenarios for inner hash.

v11->v12:
	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
	2. Refine the commit log. @Michael S . Tsirkin
	3. Add some tunnel types.

v10->v11:
	1. Revise commit log for clarity for readers.
	2. Some modifications to avoid undefined terms. @Parav Pandit
	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
	4. Add the normative statements. @Parav Pandit

v9->v10:
	1. Removed hash_report_tunnel related information. @Parav Pandit
	2. Re-describe the limitations of QoS for tunneling.
	3. Some clarification.

v8->v9:
	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
	2. Add tunnel security section. @Michael S . Tsirkin
	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
	4. Fix some typos.
	5. Add more tunnel types. @Michael S . Tsirkin

v7->v8:
	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
	3. Removed re-definition for inner packet hashing. @Parav Pandit
	4. Fix some typos. @Michael S . Tsirkin
	5. Clarify some sentences. @Michael S . Tsirkin

v6->v7:
	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
	2. Fix some syntax issues. @Michael S. Tsirkin

v5->v6:
	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
	3. Move the links to introduction section. @Michael S. Tsirkin
	4. Clarify some sentences. @Michael S. Tsirkin

v4->v5:
	1. Clarify some paragraphs. @Cornelia Huck
	2. Fix the u8 type. @Cornelia Huck

v3->v4:
	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin

v2->v3:
	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin

v1->v2:
	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
	2. Clarify some paragraphs. @Jason Wang
	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
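
Note for reviewers (illustrative only, not part of the patch): the
header-selection rule described in the "Encapsulated packet" subparagraph of
this patch could be sketched in C roughly as follows. The bit definitions are
copied from the patch; the helper select_hash_header(), the enum hash_header
and the detected_type parameter are hypothetical names introduced here, and
outer header parsing is assumed to happen elsewhere in the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encapsulation type bits, copied from the patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)

/* Hypothetical: which header the device feeds into the hash calculation. */
enum hash_header {
	HASH_OUTER_HEADER,
	HASH_INNER_HEADER,
};

/*
 * Hypothetical device-side helper.
 *
 * hash_tunnel_negotiated: whether VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 * hash_tunnel_types:      bitmask last set by the driver (0 = all disabled).
 * detected_type:          the single VIRTIO_NET_HASH_TUNNEL_TYPE_* bit that
 *                         the outer header matched, or 0 if none matched.
 */
static enum hash_header select_hash_header(bool hash_tunnel_negotiated,
					   uint32_t hash_tunnel_types,
					   uint32_t detected_type)
{
	/* The parser is assumed to report at most one encapsulation type. */
	assert((detected_type & (detected_type - 1)) == 0);

	if (hash_tunnel_negotiated && (detected_type & hash_tunnel_types))
		return HASH_INNER_HEADER;

	/* No enabled type matched: hash on the outer header. */
	return HASH_OUTER_HEADER;
}
```

With hash_tunnel_types set to VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN, this sketch
selects the inner header for a VXLAN packet and the outer header for a GENEVE
packet; with hash_tunnel_types set to 0 it always selects the outer header.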

 device-types/net/description.tex        | 158 ++++++++++++++++++++++++
 device-types/net/device-conformance.tex |   1 +
 device-types/net/driver-conformance.tex |   1 +
 introduction.tex                        |  39 ++++++
 4 files changed, 199 insertions(+)
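
Note for reviewers (illustrative only, not part of the patch): the driver
requirement that \field{hash_tunnel_types} never contains a bit missing from
\field{supported_hash_tunnel_types} might be checked as in the sketch below.
The two structures follow the layout in the patch, with le32 approximated as
uint32_t and endianness conversion ignored; prepare_hash_tunnel_set() is a
hypothetical helper, and sending the command over the control virtqueue is
out of scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layouts from the patch; le32 is approximated as uint32_t here. */
struct virtnet_hash_tunnel_config_set {
	uint32_t hash_tunnel_types;
};

struct virtnet_hash_tunnel_config_get {
	uint32_t supported_hash_tunnel_types;
	uint32_t hash_tunnel_types;
};

/*
 * Hypothetical driver-side helper: fill the SET payload only if every
 * requested bit is advertised by the device. Returns false and leaves
 * *set untouched otherwise. A request of 0 disables inner header hash.
 */
static bool prepare_hash_tunnel_set(const struct virtnet_hash_tunnel_config_get *get,
				    uint32_t requested,
				    struct virtnet_hash_tunnel_config_set *set)
{
	assert(get != NULL && set != NULL);

	/* Reject any bit the device did not report as supported. */
	if (requested & ~get->supported_hash_tunnel_types)
		return false;

	set->hash_tunnel_types = requested;
	return true;
}
```

A driver following this sketch would first issue
VIRTIO_NET_CTRL_HASH_TUNNEL_GET, pass the returned structure to the helper,
and only then issue VIRTIO_NET_CTRL_HASH_TUNNEL_SET with the filled payload.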

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 3030222..9fdccfc 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
+
 \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
 
 \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
@@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
 \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
 \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
 \end{description}
 
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
@@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
+\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
+      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
 \end{itemize}
@@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the feature VIRTIO_NET_F_RSS was not negotiated:
 \begin{itemize}
 \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
+\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
+      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
 \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
 \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
 \end{itemize}
@@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
 \end{itemize}
 
+The per-packet hash calculation can depend on the IP packet type. See
+\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
 \subparagraph{Supported/enabled hash types}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
 Hash types applicable for IPv4 packets:
@@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
 \end{itemize}
 
+\paragraph{Inner Header Hash}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
+
+If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
+and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
+
+struct virtnet_hash_tunnel_config_set {
+    le32 hash_tunnel_types;
+};
+
+struct virtnet_hash_tunnel_config_get {
+    le32 supported_hash_tunnel_types;
+    le32 hash_tunnel_types;
+};
+
+#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
+ #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
+ #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
+
+
+Field \field{supported_hash_tunnel_types} is provided by the device and contains the bitmask of encapsulation types
+for which the device supports inner header hash.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
+Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
+
+Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
+Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
+
+The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
+      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
+\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
+      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
+\end{itemize}
+
+\subparagraph{Encapsulated packet}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
+
+Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
+The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
+hash over either the inner header or the outer header.
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
+encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
+calculations (only a single level of encapsulation is currently supported).
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
+in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
+
+Until the driver sends a VIRTIO_NET_CTRL_HASH_TUNNEL_SET command, all encapsulation types are disabled for
+inner header hash (the value of \field{hash_tunnel_types} is 0).
+
+Encapsulation types supported/enabled for inner header hash:
+\begin{itemize}
+    \item The outer header of the following encapsulation types does not contain the transport protocol:
+        \begin{enumerate}
+	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+        \end{enumerate}
+
+    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
+        \begin{enumerate}
+	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+        \end{enumerate}
+\end{itemize}
+
+\subparagraph{Encapsulation types supported/enabled for inner header hash}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
+Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
+
+Encapsulation types applicable for inner header hash:
+\begin{lstlisting}
+The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
+
+The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
+
+The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
+
+The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
+
+The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
+
+The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
+
+The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
+
+The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
+
+The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
+\end{lstlisting}
+
+\subparagraph{Advice}
+Example uses of inner header hash:
+\begin{itemize}
+\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
+      distribute flows with identical outer but different inner headers across various queues, improving performance.
+\item Identify an inner flow distributed across multiple outer tunnels.
+\end{itemize}
+
+As using the inner header hash completely discards the outer header entropy, care must be taken
+if the inner header is controlled by an adversary, as the adversary can then intentionally create
+configurations with insufficient entropy.
+
+Possible mitigations include:
+\begin{itemize}
+\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
+\item If QoS (quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
+      to disable inner header hash for all encapsulated packets.
+\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
+\end{itemize}
+
+\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+If the (outer) header of the received packet does not match any value enabled in \field{hash_tunnel_types},
+the device MUST calculate the hash on the outer header.
+
+If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
+the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
+
+If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
+
+If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
+
+\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
+commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
+
+The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
+
+The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
+
 \paragraph{Hash reporting for incoming packets}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
 
diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
index 54f6783..f88f48b 100644
--- a/device-types/net/device-conformance.tex
+++ b/device-types/net/device-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
index 97d0cc1..9d853d9 100644
--- a/device-types/net/driver-conformance.tex
+++ b/device-types/net/driver-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/introduction.tex b/introduction.tex
index b7155bf..81f07a4 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
     Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
 	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
 
+	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
+    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
+	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
+	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
+    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
+    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
+	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
+    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
+    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
+	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
+    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
+    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
+	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
+    Virtual eXtensible Local Area Network.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
+	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
+    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
+	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
+	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
+    Generic Network Virtualization Encapsulation.
+	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
+	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
+    IP Encapsulation within IP.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
+	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
+    NVGRE: Network Virtualization Using Generic Routing Encapsulation
+	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
+	\phantomsection\label{intro:IP}\textbf{[IP]} &
+    INTERNET PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
+	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
+    User Datagram Protocol
+	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
+	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
+    TRANSMISSION CONTROL PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
 \end{longtable}
 
 \section{Non-Normative References}
-- 
2.19.1.6.gb485710b



+
+Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
+Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
+
+The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
+      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
+\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
+      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
+\end{itemize}
+
+\subparagraph{Encapsulated packet}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
+
+Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
+The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
+hash over either the inner header or the outer header.
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
+encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
+calculations (only a single level of encapsulation is currently supported).
+
+If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
+in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
+
+Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
+before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command is sent by the driver.
+
+Encapsulation types supported/enabled for inner header hash:
+\begin{itemize}
+    \item The outer header of the following encapsulation types does not contain the transport protocol:
+        \begin{enumerate}
+	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
+	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+        \end{enumerate}
+
+    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
+        \begin{enumerate}
+	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
+        \end{enumerate}
+\end{itemize}
+
+\subparagraph{Encapsulation types supported/enabled for inner header hash}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
+Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
+
+Encapsulation types applicable for inner header hash:
+\begin{lstlisting}
+The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
+
+The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
+
+The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
+
+The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
+
+The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
+
+The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
+
+The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
+
+The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
+
+The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
+#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
+\end{lstlisting}
+
+\subparagraph{Advice}
+Example uses of inner header hash:
+\begin{itemize}
+\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
+      distribute flows with identical outer but different inner headers across various queues, improving performance.
+\item Identify an inner flow distributed across multiple outer tunnels.
+\end{itemize}
+
+As using the inner header hash completely discards the outer header entropy, care must be taken
+if the inner header is controlled by an adversary, as the adversary can then intentionally create
+configurations with insufficient entropy.
+
+Besides disabling inner header hash, mitigations would depend on:
+\begin{itemize}
+\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
+\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
+      to disable inner header hash for all encapsulated packets.
+\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
+\end{itemize}
+
+\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+If the (outer) header of the received packet does not match any value enabled in \field{hash_tunnel_types},
+the device MUST calculate the hash on the outer header.
+
+If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
+the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
+
+If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
+
+If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
+
+\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
+
+The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
+commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
+
+The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
+
+The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
+
 \paragraph{Hash reporting for incoming packets}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
 
diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
index 54f6783..f88f48b 100644
--- a/device-types/net/device-conformance.tex
+++ b/device-types/net/device-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
 \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
index 97d0cc1..9d853d9 100644
--- a/device-types/net/driver-conformance.tex
+++ b/device-types/net/driver-conformance.tex
@@ -14,4 +14,5 @@
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
 \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
 \end{itemize}
diff --git a/introduction.tex b/introduction.tex
index b7155bf..81f07a4 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
     Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
 	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
 
+	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
+    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
+	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
+	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
+    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
+    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
+	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
+    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
+    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
+	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
+    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
+    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
+	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
+    Virtual eXtensible Local Area Network.
+	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
+	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
+    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
+	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
+	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
+    Generic Network Virtualization Encapsulation.
+	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
+	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
+    IP Encapsulation within IP.
+	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
+	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
+    NVGRE: Network Virtualization Using Generic Routing Encapsulation
+	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
+	\phantomsection\label{intro:IP}\textbf{[IP]} &
+    INTERNET PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
+	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
+    User Datagram Protocol
+	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
+	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
+    TRANSMISSION CONTROL PROTOCOL
+	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
 \end{longtable}
 
 \section{Non-Normative References}
-- 
2.19.1.6.gb485710b


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/



* [virtio-dev] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 13:50 ` [virtio-comment] " Heng Qi
@ 2023-06-21 15:38   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 15:38 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> 1. Currently, a received encapsulated packet has an outer and an inner header, but
> the virtio device is unable to calculate the hash for the inner header. The same
> flow can traverse through different tunnels, resulting in the encapsulated
> packets being spread across multiple receive queues (refer to the figure below).
> However, in certain scenarios, we may need to direct these encapsulated packets of
> the same flow to a single receive queue. This facilitates the processing
> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> 
>                client1                    client2
>                   |        +-------+         |
>                   +------->|tunnels|<--------+
>                            +-------+
>                               |  |
>                               v  v
>                       +-----------------+
>                       | monitoring host |
>                       +-----------------+
> 
> To achieve this, the device can calculate a symmetric hash based on the inner headers
> of the same flow.
> 
> 2. For legacy systems, they may lack entropy fields which modern protocols have in
> the outer header, resulting in multiple flows with the same outer header but
> different inner headers being directed to the same receive queue. This results in
> poor receive performance.
> 
> To address this limitation, inner header hash can be used to enable the device to advertise
> the capability to calculate the hash for the inner packet, regaining better receive performance.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> 

don't put an empty line here

> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>


ok almost there. small corrections, and one enhancement suggestion.


> ---
> v17->v18:
> 	1. Some rewording suggestions from Michael (Thanks!).
> 	2. Use 0 to disable inner header hash and remove
> 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> v16->v17:
> 	1. Some small rewrites. @Parav Pandit
> 	2. Add Parav's Reviewed-by tag (Thanks!).
> 
> v15->v16:
> 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> 	   configuration, the ability to configure the outer src udp port hash is given
> 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> 	   RSS capability extension topic (considered as an RSS extension together
> 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> 	2. Fix a 'field' typo. @Parav Pandit
> 
> v14->v15:
> 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> 	2. Adjust some descriptions.
> 
> v13->v14:
> 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> 	2. Rebase to master branch.
> 	3. Some minor modifications.
> 
> v12->v13:
> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> 	2. Add tunneling protocol explanation. @Jason Wang
> 	3. Add comments on some usage scenarios for inner hash.
> 
> v11->v12:
> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> 	2. Refine the commit log. @Michael S . Tsirkin
> 	3. Add some tunnel types.
> 
> v10->v11:
> 	1. Revise commit log for clarity for readers.
> 	2. Some modifications to avoid undefined terms. @Parav Pandit
> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> 	4. Add the normative statements. @Parav Pandit
> 
> v9->v10:
> 	1. Removed hash_report_tunnel related information. @Parav Pandit
> 	2. Re-describe the limitations of QoS for tunneling.
> 	3. Some clarification.
> 
> v8->v9:
> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> 	2. Add tunnel security section. @Michael S . Tsirkin
> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> 	4. Fix some typos.
> 	5. Add more tunnel types. @Michael S . Tsirkin
> 
> v7->v8:
> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> 	4. Fix some typos. @Michael S . Tsirkin
> 	5. Clarify some sentences. @Michael S . Tsirkin
> 
> v6->v7:
> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> 	2. Fix some syntax issues. @Michael S. Tsirkin
> 
> v5->v6:
> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> 	3. Move the links to introduction section. @Michael S. Tsirkin
> 	4. Clarify some sentences. @Michael S. Tsirkin
> 
> v4->v5:
> 	1. Clarify some paragraphs. @Cornelia Huck
> 	2. Fix the u8 type. @Cornelia Huck
> 
> v3->v4:
> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> 
> v2->v3:
> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> 
> v1->v2:
> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> 	2. Clarify some paragraphs. @Jason Wang
> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> 
>  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
>  device-types/net/device-conformance.tex |   1 +
>  device-types/net/driver-conformance.tex |   1 +
>  introduction.tex                        |  39 ++++++
>  4 files changed, 199 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 3030222..9fdccfc 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> +
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
>  
>  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.

I think just "or" is enough.
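
For illustration only, the dependency above could be checked roughly like
this. A minimal sketch: bit 51 for VIRTIO_NET_F_HASH_TUNNEL is the value
proposed in this patch, and has_feature() and hash_tunnel_deps_ok() are
made-up helper names, not anything from the spec or an existing driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ     17
#define VIRTIO_NET_F_HASH_TUNNEL 51 /* value proposed by this patch */
#define VIRTIO_NET_F_HASH_REPORT 57
#define VIRTIO_NET_F_RSS         60

/* Hypothetical helper: test one bit of a 64-bit feature word. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return (features >> bit) & 1;
}

/*
 * Hypothetical check: HASH_TUNNEL needs CTRL_VQ plus at least one of
 * RSS or HASH_REPORT. A feature set without HASH_TUNNEL passes trivially.
 */
static bool hash_tunnel_deps_ok(uint64_t features)
{
    if (!has_feature(features, VIRTIO_NET_F_HASH_TUNNEL))
        return true;
    return has_feature(features, VIRTIO_NET_F_CTRL_VQ) &&
           (has_feature(features, VIRTIO_NET_F_RSS) ||
            has_feature(features, VIRTIO_NET_F_HASH_REPORT));
}
```

A plain "or" already covers the case where both RSS and HASH_REPORT are
offered, which is why "and/or" adds nothing.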

>  \end{description}
>  
>  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.

why get and not set? e.g. if I call get and then set, the field in the
set structure is the one that takes effect.


>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>  \end{itemize}
> @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was not negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.

same

>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>  \end{itemize}
> @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
>  \end{itemize}
>  
> +The per-packet hash calculation can depend on the IP packet type. See
> +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.

and end paragraph here.

>  \subparagraph{Supported/enabled hash types}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>  Hash types applicable for IPv4 packets:
> @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>  \end{itemize}
>  
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> +
> +struct virtnet_hash_tunnel_config_set {
> +    le32 hash_tunnel_types;
> +};
> +
> +struct virtnet_hash_tunnel_config_get {
> +    le32 supported_hash_tunnel_types;
> +    le32 hash_tunnel_types;
> +};
> +

It would be cleaner to have a single structure for both.

I also think hash_tunnel is unnecessarily verbose, and _config_ is also
pointless.

Returning supported_hash_tunnel_types back to device can also
be useful for debugging.

How about:


struct virtnet_hash_tunnel {
     le32 supported_tunnel_types;
     le32 enabled_tunnel_types;
};


For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
contains the bitmask of encapsulation types supported
by the device for inner header hash; \field{enabled_tunnel_types}
contains the value received in a previous successful
call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.

For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
contains the value returned by a previous
successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
\field{enabled_tunnel_types}
contains the bitmask of encapsulation types to enable
for inner header hash.

and add normative statements to this end.
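
To make the intended semantics concrete, here is a rough device-side
sketch using the single structure. It is only an illustration: le32 is
modelled as a host-endian uint32_t, and struct hash_tunnel_dev,
hash_tunnel_get() and hash_tunnel_set() are hypothetical names that do
not appear in the patch.

```c
#include <stdint.h>

/* Illustrative only: le32 is modelled as a host-endian uint32_t. */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Hypothetical device-side state; not part of the patch. */
struct hash_tunnel_dev {
    uint32_t supported; /* fixed by the device */
    uint32_t enabled;   /* 0 after reset */
};

/* GET: report the supported mask and the last successfully set mask. */
static uint8_t hash_tunnel_get(const struct hash_tunnel_dev *dev,
                               struct virtnet_hash_tunnel *out)
{
    out->supported_tunnel_types = dev->supported;
    out->enabled_tunnel_types = dev->enabled;
    return VIRTIO_NET_OK;
}

/* SET: reject any bit the device does not support, else latch the mask. */
static uint8_t hash_tunnel_set(struct hash_tunnel_dev *dev,
                               const struct virtnet_hash_tunnel *in)
{
    if (in->enabled_tunnel_types & ~dev->supported)
        return VIRTIO_NET_ERR;
    dev->enabled = in->enabled_tunnel_types;
    return VIRTIO_NET_OK;
}
```

In this sketch the device ignores supported_tunnel_types on SET; whether
it should validate or log that field is left to the normative text.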


> +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> +
> +
> +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.

We don't need both of these sentences. Just the second one will do.

> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> +
> +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.

They have different meanings for set and get though.


> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> +
> +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> +\end{itemize}
> +
> +\subparagraph{Encapsulated packet}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> +
> +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> +hash over either the inner header or the outer header.
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> +calculations (only a single level of encapsulation is currently supported).
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> +
> +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command is sent by the driver.

Initially (before the driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
all encapsulation types are disabled.
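
The rule in the quoted paragraphs boils down to a single mask test. A
minimal sketch, assuming the device's parser has already classified the
outer header into one TYPE_* bit (or 0 when the packet is not a
recognised encapsulation); use_inner_header() is a hypothetical helper,
and the two defines are copied from the patch with an unsigned suffix
added.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN  (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1u << 6)

/*
 * Hypothetical helper: tunnel_type is the single TYPE_* bit matched for
 * the outer header, or 0 if the packet is not a recognised encapsulation.
 * Returns true when the hash is taken over the inner header, false when
 * it is taken over the outer header.
 */
static bool use_inner_header(uint32_t enabled_tunnel_types, uint32_t tunnel_type)
{
    return (enabled_tunnel_types & tunnel_type) != 0;
}
```

With the initial mask of 0 this always returns false, so the device
hashes the outer header until the driver enables something.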

> +
> +Encapsulation types supported/enabled for inner header hash:
> +\begin{itemize}
> +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> +        \begin{enumerate}
> +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +        \end{enumerate}
> +
> +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> +        \begin{enumerate}
> +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +        \end{enumerate}
> +\end{itemize}
> +
> +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> +
> +Encapsulation types applicable for inner header hash:
> +\begin{lstlisting}
> +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> +
> +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> +
> +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> +
> +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> +
> +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> +
> +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> +
> +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> +
> +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> +
> +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> +\end{lstlisting}
> +
> +\subparagraph{Advice}
> +Example uses of inner header hash:
> +\begin{itemize}
> +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> +\item Identify an inner flow distributed across multiple outer tunnels.
> +\end{itemize}
> +
> +As using the inner header hash completely discards the outer header entropy, care must be taken
> +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> +configurations with insufficient entropy.
> +
> +Besides disabling inner header hash, mitigations would depend on:
> +\begin{itemize}
> +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.

this is quite vague

> +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> +      to disable inner header hash for all encapsulated packets.

this is precisely disabling

> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.

it is not at all clear how devices would do this.

> +\end{itemize}

Oh sorry I didn't complete the sentence :(
I suggest dropping above and having something like:

	Besides disabling inner header hash, mitigations would depend on how the
	hash is used, and the consequences of a successful attack.
	For example, if the attack causes packet drops, using a deeper queue
	might be able to mitigate it.



> +
> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +If the (outer) header of the received packet does not match any value

any encapsulation type

> enabled in \field{hash_tunnel_types},
> +the device MUST calculate the hash on the outer header.
> +
> +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.

let's be specific. if any bits in hash_tunnel_types are not set in
supported_hash_tunnel_types

> +
> +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.

what does this even mean?

> +
> +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> +
> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> +
> +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> +
> +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.

any bits. not any value

> +
>  \paragraph{Hash reporting for incoming packets}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>  
> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> index 54f6783..f88f48b 100644
> --- a/device-types/net/device-conformance.tex
> +++ b/device-types/net/device-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> index 97d0cc1..9d853d9 100644
> --- a/device-types/net/driver-conformance.tex
> +++ b/device-types/net/driver-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/introduction.tex b/introduction.tex
> index b7155bf..81f07a4 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
>      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>  
> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> +    Virtual eXtensible Local Area Network.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> +    Generic Network Virtualization Encapsulation.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> +    IP Encapsulation within IP.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> +    INTERNET PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> +    User Datagram Protocol
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> +    TRANSMISSION CONTROL PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>  \end{longtable}
>  
>  \section{Non-Normative References}
> -- 
> 2.19.1.6.gb485710b



* [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-21 15:38   ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 15:38 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> 1. Currently, a received encapsulated packet has an outer and an inner header, but
> the virtio device is unable to calculate the hash for the inner header. The same
> flow can traverse through different tunnels, resulting in the encapsulated
> packets being spread across multiple receive queues (refer to the figure below).
> However, in certain scenarios, we may need to direct these encapsulated packets of
> the same flow to a single receive queue. This facilitates the processing
> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> 
>                client1                    client2
>                   |        +-------+         |
>                   +------->|tunnels|<--------+
>                            +-------+
>                               |  |
>                               v  v
>                       +-----------------+
>                       | monitoring host |
>                       +-----------------+
> 
> To achieve this, the device can calculate a symmetric hash based on the inner headers
> of the same flow.
> 
> 2. For legacy systems, they may lack entropy fields which modern protocols have in
> the outer header, resulting in multiple flows with the same outer header but
> different inner headers being directed to the same receive queue. This results in
> poor receive performance.
> 
> To address this limitation, inner header hash can be used to enable the device to advertise
> the capability to calculate the hash for the inner packet, regaining better receive performance.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> 

don't put an empty line here

> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>


ok almost there. small corrections, and one enhancement suggestion.


> ---
> v17->v18:
> 	1. Some rewording suggestions from Michael (Thanks!).
> 	2. Use 0 to disable inner header hash and remove
> 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> v16->v17:
> 	1. Some small rewrites. @Parav Pandit
> 	2. Add Parav's Reviewed-by tag (Thanks!).
> 
> v15->v16:
> 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> 	   configuration, the ability to configure the outer src udp port hash is given
> 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> 	   RSS capability extension topic (considered as an RSS extension together
> 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> 	2. Fix a 'field' typo. @Parav Pandit
> 
> v14->v15:
> 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> 	2. Adjust some descriptions.
> 
> v13->v14:
> 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> 	2. Rebase to master branch.
> 	3. Some minor modifications.
> 
> v12->v13:
> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> 	2. Add tunneling protocol explanation. @Jason Wang
> 	3. Add comments on some usage scenarios for inner hash.
> 
> v11->v12:
> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> 	2. Refine the commit log. @Michael S . Tsirkin
> 	3. Add some tunnel types.
> 
> v10->v11:
> 	1. Revise commit log for clarity for readers.
> 	2. Some modifications to avoid undefined terms. @Parav Pandit
> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> 	4. Add the normative statements. @Parav Pandit
> 
> v9->v10:
> 	1. Removed hash_report_tunnel related information. @Parav Pandit
> 	2. Re-describe the limitations of QoS for tunneling.
> 	3. Some clarification.
> 
> v8->v9:
> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> 	2. Add tunnel security section. @Michael S . Tsirkin
> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> 	4. Fix some typos.
> 	5. Add more tunnel types. @Michael S . Tsirkin
> 
> v7->v8:
> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> 	4. Fix some typos. @Michael S . Tsirkin
> 	5. Clarify some sentences. @Michael S . Tsirkin
> 
> v6->v7:
> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> 	2. Fix some syntax issues. @Michael S. Tsirkin
> 
> v5->v6:
> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> 	3. Move the links to introduction section. @Michael S. Tsirkin
> 	4. Clarify some sentences. @Michael S. Tsirkin
> 
> v4->v5:
> 	1. Clarify some paragraphs. @Cornelia Huck
> 	2. Fix the u8 type. @Cornelia Huck
> 
> v3->v4:
> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> 
> v2->v3:
> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> 
> v1->v2:
> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> 	2. Clarify some paragraphs. @Jason Wang
> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> 
>  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
>  device-types/net/device-conformance.tex |   1 +
>  device-types/net/driver-conformance.tex |   1 +
>  introduction.tex                        |  39 ++++++
>  4 files changed, 199 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 3030222..9fdccfc 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> +
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
>  
>  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.

I think just "or" is enough.
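
A minimal sketch of the dependency as it reads with just "or". The
helper name and the FEATURE() macro are hypothetical; bits 17, 57 and 60
are the existing spec values and 51 is the bit proposed in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ      17
#define VIRTIO_NET_F_HASH_TUNNEL  51 /* proposed in this patch */
#define VIRTIO_NET_F_HASH_REPORT  57
#define VIRTIO_NET_F_RSS          60

/* Hypothetical convenience macro: feature bit number -> 64-bit mask. */
#define FEATURE(bit) (UINT64_C(1) << (bit))

/*
 * Hypothetical helper: returns true if a feature set satisfies
 * "HASH_TUNNEL requires CTRL_VQ along with RSS or HASH_REPORT".
 * "or" already covers the case where both are present.
 */
static bool hash_tunnel_deps_ok(uint64_t features)
{
    if (!(features & FEATURE(VIRTIO_NET_F_HASH_TUNNEL)))
        return true; /* feature not offered, nothing to check */
    if (!(features & FEATURE(VIRTIO_NET_F_CTRL_VQ)))
        return false;
    return (features & (FEATURE(VIRTIO_NET_F_RSS) |
                        FEATURE(VIRTIO_NET_F_HASH_REPORT))) != 0;
}
```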

>  \end{description}
>  
>  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.

why get and not set? e.g. if I call get and then set, it is the field
passed in set that takes effect.


>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>  \end{itemize}
> @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the feature VIRTIO_NET_F_RSS was not negotiated:
>  \begin{itemize}
>  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.

same

>  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>  \end{itemize}
> @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
>  \end{itemize}
>  
> +The per-packet hash calculation can depend on the IP packet type. See
> +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.

and end paragraph here.

>  \subparagraph{Supported/enabled hash types}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>  Hash types applicable for IPv4 packets:
> @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>  \end{itemize}
>  
> +\paragraph{Inner Header Hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> +
> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> +
> +struct virtnet_hash_tunnel_config_set {
> +    le32 hash_tunnel_types;
> +};
> +
> +struct virtnet_hash_tunnel_config_get {
> +    le32 supported_hash_tunnel_types;
> +    le32 hash_tunnel_types;
> +};
> +

It would be cleaner to have a single structure for both.

I also think hash_tunnel is unnecessarily verbose, and _config_ is also
pointless.

Returning supported_hash_tunnel_types back to device can also
be useful for debugging.

How about:


struct virtnet_hash_tunnel {
     le32 supported_tunnel_types;
     le32 enabled_tunnel_types;
};


For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
contains the bitmask of encapsulation types supported
by the device for inner header hash; \field{enabled_tunnel_types}
contains the value received in a previous successful
call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.

For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
contains the value returned by a previous
successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
\field{enabled_tunnel_types}
contains the bitmask of encapsulation types to enable
for inner header hash.

and add normative statements to this end.
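
To illustrate the intended semantics, here is a rough device-side
sketch. struct hash_tunnel_dev and the two handler names are
hypothetical, the fields are shown host-endian rather than le32, and the
sketch assumes the device ignores supported_tunnel_types on SET (it is
only echoed back for debugging).

```c
#include <stdint.h>

#define VIRTIO_NET_OK   0
#define VIRTIO_NET_ERR  1

/* le32 on the wire; host-endian uint32_t here for brevity. */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

/* Hypothetical device-side state. */
struct hash_tunnel_dev {
    uint32_t supported; /* fixed by the device implementation */
    uint32_t enabled;   /* 0 until the first successful SET */
};

/* GET: report the supported types and the last successfully SET value. */
static uint8_t hash_tunnel_get(const struct hash_tunnel_dev *dev,
                               struct virtnet_hash_tunnel *out)
{
    out->supported_tunnel_types = dev->supported;
    out->enabled_tunnel_types = dev->enabled;
    return VIRTIO_NET_OK;
}

/* SET: reject any bit the device does not support, otherwise store it. */
static uint8_t hash_tunnel_set(struct hash_tunnel_dev *dev,
                               const struct virtnet_hash_tunnel *in)
{
    if (in->enabled_tunnel_types & ~dev->supported)
        return VIRTIO_NET_ERR; /* state is left unchanged */
    dev->enabled = in->enabled_tunnel_types;
    return VIRTIO_NET_OK;
}
```

With one structure the driver can do GET, modify enabled_tunnel_types,
and pass the same buffer contents to SET.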


> +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> +
> +
> +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.

We don't need both of these sentences. Just the second one will do.

> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> +
> +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.

They have different meanings for set and get though.


> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> +
> +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> +\begin{itemize}
> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> +\end{itemize}
> +
> +\subparagraph{Encapsulated packet}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> +
> +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> +hash over either the inner header or the outer header.
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> +calculations (only a single level of encapsulation is currently supported).
> +
> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> +
> +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.

Initially (before the driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
all encapsulation types are disabled.
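
The selection rule in the quoted paragraphs, together with this initial
state, can be sketched as follows. The enum, the function name and the
pkt_tunnel_type parameter (the result of the device's own packet
parsing) are assumptions made for illustration; only the two type bits
come from the patch.

```c
#include <stdint.h>

/* Two of the encapsulation type bits from the patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 6)

enum hash_header { HASH_OUTER_HEADER, HASH_INNER_HEADER };

/*
 * enabled_types:   bitmask from the last successful SET, 0 initially.
 * pkt_tunnel_type: hypothetical parser result, a single
 *                  VIRTIO_NET_HASH_TUNNEL_TYPE_* bit, or 0 if the packet
 *                  is not a recognized encapsulated packet.
 */
static enum hash_header select_hash_header(uint32_t enabled_types,
                                           uint32_t pkt_tunnel_type)
{
    /* Inner header only if the packet's type is enabled. */
    return (pkt_tunnel_type & enabled_types) ? HASH_INNER_HEADER
                                             : HASH_OUTER_HEADER;
}
```

With enabled_types equal to 0, every packet falls back to the outer
header, which is the initial behaviour described above.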

> +
> +Encapsulation types supported/enabled for inner header hash:
> +\begin{itemize}
> +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> +        \begin{enumerate}
> +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +        \end{enumerate}
> +
> +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> +        \begin{enumerate}
> +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> +        \end{enumerate}
> +\end{itemize}
> +
> +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> +
> +Encapsulation types applicable for inner header hash:
> +\begin{lstlisting}
> +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> +
> +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> +
> +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> +
> +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> +
> +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> +
> +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> +
> +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> +
> +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> +
> +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> +\end{lstlisting}
> +
> +\subparagraph{Advice}
> +Example uses of inner header hash:
> +\begin{itemize}
> +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> +\item Identify an inner flow distributed across multiple outer tunnels.
> +\end{itemize}
> +
> +As using the inner header hash completely discards the outer header entropy, care must be taken
> +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> +configurations with insufficient entropy.
> +
> +Besides disabling inner header hash, mitigations would depend on:
> +\begin{itemize}
> +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.

this is quite vague

> +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> +      to disable inner header hash for all encapsulated packets.

this is precisely disabling

> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.

it is not at all clear how devices would do this.

> +\end{itemize}

Oh sorry I didn't complete the sentence :(
I suggest dropping above and having something like:

	Besides disabling inner header hash, mitigations would depend on how the
	hash is used, and the consequences of a successful attack.
	For example, if the attack causes packet drops, using a deeper queue
	might be able to mitigate it.



> +
> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +If the (outer) header of the received packet does not match any value

any encapsulation type

> enabled in \field{hash_tunnel_types},
> +the device MUST calculate the hash on the outer header.
> +
> +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.

let's be specific. if any bits in hash_tunnel_types are not set in
supported_hash_tunnel_types

> +
> +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.

what does this even mean?

> +
> +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> +
> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> +
> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> +
> +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> +
> +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.

any bits. not any value

> +
>  \paragraph{Hash reporting for incoming packets}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>  
> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> index 54f6783..f88f48b 100644
> --- a/device-types/net/device-conformance.tex
> +++ b/device-types/net/device-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> index 97d0cc1..9d853d9 100644
> --- a/device-types/net/driver-conformance.tex
> +++ b/device-types/net/driver-conformance.tex
> @@ -14,4 +14,5 @@
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>  \end{itemize}
> diff --git a/introduction.tex b/introduction.tex
> index b7155bf..81f07a4 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
>      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>  
> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> +    Virtual eXtensible Local Area Network.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> +    Generic Network Virtualization Encapsulation.
> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> +    IP Encapsulation within IP.
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> +    INTERNET PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> +    User Datagram Protocol
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> +    TRANSMISSION CONTROL PROTOCOL
> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>  \end{longtable}
>  
>  \section{Non-Normative References}
> -- 
> 2.19.1.6.gb485710b


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/



* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 15:38   ` [virtio-comment] " Michael S. Tsirkin
@ 2023-06-21 16:46     ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-21 16:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> > 1. Currently, a received encapsulated packet has an outer and an inner header, but
> > the virtio device is unable to calculate the hash for the inner header. The same
> > flow can traverse through different tunnels, resulting in the encapsulated
> > packets being spread across multiple receive queues (refer to the figure below).
> > However, in certain scenarios, we may need to direct these encapsulated packets of
> > the same flow to a single receive queue. This facilitates the processing
> > of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> > 
> >                client1                    client2
> >                   |        +-------+         |
> >                   +------->|tunnels|<--------+
> >                            +-------+
> >                               |  |
> >                               v  v
> >                       +-----------------+
> >                       | monitoring host |
> >                       +-----------------+
> > 
> > To achieve this, the device can calculate a symmetric hash based on the inner headers
> > of the same flow.
> > 
> > 2. For legacy systems, they may lack entropy fields which modern protocols have in
> > the outer header, resulting in multiple flows with the same outer header but
> > different inner headers being directed to the same receive queue. This results in
> > poor receive performance.
> > 
> > To address this limitation, inner header hash can be used to enable the device to advertise
> > the capability to calculate the hash for the inner packet, regaining better receive performance.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> > 
> 
> don't put an empty line here

Ok. Will remove it.

> 
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> 
> 
> ok almost there. small corrections, and one enhancement suggestion.
> 
> 
> > ---
> > v17->v18:
> > 	1. Some rewording suggestions from Michael (Thanks!).
> > 	2. Use 0 to disable inner header hash and remove
> > 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> > v16->v17:
> > 	1. Some small rewrites. @Parav Pandit
> > 	2. Add Parav's Reviewed-by tag (Thanks!).
> > 
> > v15->v16:
> > 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> > 	   configuration, the ability to configure the outer src udp port hash is given
> > 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> > 	   RSS capability extension topic (considered as an RSS extension together
> > 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> > 	2. Fix a 'field' typo. @Parav Pandit
> > 
> > v14->v15:
> > 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> > 	2. Adjust some descriptions.
> > 
> > v13->v14:
> > 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> > 	2. Rebase to master branch.
> > 	3. Some minor modifications.
> > 
> > v12->v13:
> > 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> > 	2. Add tunneling protocol explanation. @Jason Wang
> > 	3. Add comments on some usage scenarios for inner hash.
> > 
> > v11->v12:
> > 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> > 	2. Refine the commit log. @Michael S . Tsirkin
> > 	3. Add some tunnel types.
> > 
> > v10->v11:
> > 	1. Revise commit log for clarity for readers.
> > 	2. Some modifications to avoid undefined terms. @Parav Pandit
> > 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> > 	4. Add the normative statements. @Parav Pandit
> > 
> > v9->v10:
> > 	1. Removed hash_report_tunnel related information. @Parav Pandit
> > 	2. Re-describe the limitations of QoS for tunneling.
> > 	3. Some clarification.
> > 
> > v8->v9:
> > 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> > 	2. Add tunnel security section. @Michael S . Tsirkin
> > 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> > 	4. Fix some typos.
> > 	5. Add more tunnel types. @Michael S . Tsirkin
> > 
> > v7->v8:
> > 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> > 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> > 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> > 	4. Fix some typos. @Michael S . Tsirkin
> > 	5. Clarify some sentences. @Michael S . Tsirkin
> > 
> > v6->v7:
> > 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > 	2. Fix some syntax issues. @Michael S. Tsirkin
> > 
> > v5->v6:
> > 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
> > 	3. Move the links to introduction section. @Michael S. Tsirkin
> > 	4. Clarify some sentences. @Michael S. Tsirkin
> > 
> > v4->v5:
> > 	1. Clarify some paragraphs. @Cornelia Huck
> > 	2. Fix the u8 type. @Cornelia Huck
> > 
> > v3->v4:
> > 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > 
> > v2->v3:
> > 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > 
> > v1->v2:
> > 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > 	2. Clarify some paragraphs. @Jason Wang
> > 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > 
> >  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
> >  device-types/net/device-conformance.tex |   1 +
> >  device-types/net/driver-conformance.tex |   1 +
> >  introduction.tex                        |  39 ++++++
> >  4 files changed, 199 insertions(+)
> > 
> > diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> > index 3030222..9fdccfc 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> >  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> >      channel.
> >  
> > +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> > +
> >  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> >  
> >  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> > @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> >  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> >  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> >  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> > +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
> 
> I think just or is enough.

Sure. I agree.
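
With plain "or", the requirement could be checked roughly as in the sketch below. This is only an illustration: hash_tunnel_deps_ok() and has_feature() are hypothetical helper names, and bit 51 for VIRTIO_NET_F_HASH_TUNNEL is the value proposed in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers; 51 is the value proposed in this patch. */
#define VIRTIO_NET_F_CTRL_VQ      17
#define VIRTIO_NET_F_HASH_TUNNEL  51
#define VIRTIO_NET_F_HASH_REPORT  57
#define VIRTIO_NET_F_RSS          60

/* Hypothetical helper: test a single bit in a 64-bit feature set. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return ((features >> bit) & 1u) != 0;
}

/*
 * Hypothetical check: VIRTIO_NET_F_HASH_TUNNEL requires VIRTIO_NET_F_CTRL_VQ
 * along with VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT.
 */
static bool hash_tunnel_deps_ok(uint64_t features)
{
    if (!has_feature(features, VIRTIO_NET_F_HASH_TUNNEL))
        return true; /* nothing to check */

    return has_feature(features, VIRTIO_NET_F_CTRL_VQ) &&
           (has_feature(features, VIRTIO_NET_F_RSS) ||
            has_feature(features, VIRTIO_NET_F_HASH_REPORT));
}
```

Since "or" is inclusive, a feature set with both RSS and HASH_REPORT also passes, which is why "and/or" adds nothing.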

> 
> >  \end{description}
> >  
> >  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  If the feature VIRTIO_NET_F_RSS was negotiated:
> >  \begin{itemize}
> >  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> 
> why get and not set? e.g. if I call get then set then the field in set
> will have effect.

I had the following command sequence in mind:
1. The driver sets hash_tunnel_types using the SET command and saves it
somewhere, e.g. in virtnet_info->hash_tunnel_types_saved.
2. The driver fetches hash_tunnel_types using the GET command; the result
should equal virtnet_info->hash_tunnel_types_saved (or can even be returned
directly from virtnet_info->hash_tunnel_types_saved, as is done for RSS).
3. The driver sets hash_tunnel_types again using the SET command and saves
it in virtnet_info->hash_tunnel_types_saved.

But I think your enhanced proposal is also feasible. After all, features
such as RSS have worked well so far with only a SET command.
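
A minimal sketch of that sequence, assuming a driver that keeps a saved copy of the last value it set. struct fake_device, struct virtnet_info_sketch, tunnel_set() and tunnel_get() are hypothetical names used only for illustration; error handling and the actual control virtqueue exchange are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device: it only remembers the last value set. */
struct fake_device {
    uint32_t hash_tunnel_types;
};

/* Hypothetical driver state, mirroring virtnet_info->hash_tunnel_types_saved. */
struct virtnet_info_sketch {
    struct fake_device *dev;
    uint32_t hash_tunnel_types_saved;
};

/* SET: pass the value to the device and keep a copy in the driver. */
static void tunnel_set(struct virtnet_info_sketch *vi, uint32_t types)
{
    vi->dev->hash_tunnel_types = types;
    vi->hash_tunnel_types_saved = types;
}

/* GET: answered from the saved copy, which tracks the device state. */
static uint32_t tunnel_get(const struct virtnet_info_sketch *vi)
{
    return vi->hash_tunnel_types_saved;
}
```

As long as every SET goes through tunnel_set(), the value returned by tunnel_get() matches what the device holds.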

> 
> 
> >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
> >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
> >  \end{itemize}
> > @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  If the feature VIRTIO_NET_F_RSS was not negotiated:
> >  \begin{itemize}
> >  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> 
> same
> 
> >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
> >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
> >  \end{itemize}
> > @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
> >  \end{itemize}
> >  
> > +The per-packet hash calculation can depend on the IP packet type. See
> > +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> 
> and end paragraph here.

Do you mean adding a blank line here?

> 
> >  \subparagraph{Supported/enabled hash types}
> >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> >  Hash types applicable for IPv4 packets:
> > @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> >  \end{itemize}
> >  
> > +\paragraph{Inner Header Hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> > +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> > +
> > +struct virtnet_hash_tunnel_config_set {
> > +    le32 hash_tunnel_types;
> > +};
> > +
> > +struct virtnet_hash_tunnel_config_get {
> > +    le32 supported_hash_tunnel_types;
> > +    le32 hash_tunnel_types;
> > +};
> > +
> 
> It would be cleaner to have a single structure for both.
> 

With a single structure, SET carries an additional 32 bits of information.
But if you think this makes the overall structure more concise, I'm OK with it.

> I also think hash_tunnel is unnecessarily verbose, and _config_ is also
> pointless.
> 
> Returning supported_hash_tunnel_types back to device can also
> be useful for debugging.
> 
> How about:
> 
> 
> struct virtnet_hash_tunnel {
>      le32 supported_tunnel_types;
>      le32 enabled_tunnel_types;
> };
> 

It's OK.

And:
For the GET command, both fields are write-only (WO) for the device.
For the SET command, both \field{supported_tunnel_types} and
\field{enabled_tunnel_types} are read-only (RO) for the device, since the
driver supplies them.

> 
> For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> contains the bitmask of encapsulation types supported
> by the device for inner header hash; \field{enabled_tunnel_types}
> contains the value received in a previous successful
> call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> 
> For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> contains the value returned by a previous
> successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
> \field{enabled_tunnel_types}
> contains the bitmask of encapsulation types to enable
> for inner header hash.
> 
> and add normative statements to this end.
> 
> 

Ok.
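
To check my understanding of the single-structure proposal, here is a rough sketch. The le32 fields are modelled as plain uint32_t (a real driver would convert to little-endian), and fill_get_reply() and build_set_request() are hypothetical helper names. Masking the requested bits with the supported ones is just one possible driver-side choice, not something the proposal requires.

```c
#include <stdint.h>

/* Single structure for both commands; each field is le32 on the wire. */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

/* Device side of GET (hypothetical helper): the device fills in both fields. */
static void fill_get_reply(struct virtnet_hash_tunnel *reply,
                           uint32_t device_supported, uint32_t last_enabled)
{
    reply->supported_tunnel_types = device_supported;
    reply->enabled_tunnel_types = last_enabled;
}

/*
 * Driver side of SET (hypothetical helper): echo back the supported mask
 * from a previous successful GET and request the types to enable.
 * Assumption: the driver drops bits the device did not report as supported.
 */
static struct virtnet_hash_tunnel
build_set_request(const struct virtnet_hash_tunnel *get_reply, uint32_t wanted)
{
    struct virtnet_hash_tunnel req;

    req.supported_tunnel_types = get_reply->supported_tunnel_types;
    req.enabled_tunnel_types = wanted & get_reply->supported_tunnel_types;
    return req;
}
```

The structure is 8 bytes in both directions, so SET and GET can share the same buffer layout.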

> > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > +
> > +
> > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> 
> We don't need these two sentences. Just second one will do.
> 

Ok.

> > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > +
> > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
> 
> They have different meanings for set and get though.
> 
> 
> > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > +
> > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > +\begin{itemize}
> > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> > +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> > +\end{itemize}
> > +
> > +\subparagraph{Encapsulated packet}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> > +
> > +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> > +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> > +hash over either the inner header or the outer header.
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> > +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> > +calculations (only a single level of encapsulation is currently supported).
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> > +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> > +
> > +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> > +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> 
> Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
> all encapsulation types are disabled 
> 

Ok.
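
For reference, the inner/outer selection described in the quoted paragraphs could be sketched as below. select_hash_header() and the pkt_tunnel_type argument are hypothetical: the sketch assumes some parser has already classified the packet's outer header as one encapsulation type bit, or 0 for a packet that is not encapsulated. The two type bits are the values from this patch, written with an unsigned constant.

```c
#include <stdint.h>

/* Encapsulation type bits as proposed in this patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 6)

enum hash_header {
    HASH_OUTER_HEADER,
    HASH_INNER_HEADER
};

/*
 * Hypothetical helper: choose which header the hash is calculated over.
 * enabled_tunnel_types is 0 until the driver sends a SET command, so
 * initially every packet is hashed on its outer header.
 */
static enum hash_header select_hash_header(uint32_t enabled_tunnel_types,
                                           uint32_t pkt_tunnel_type)
{
    if (pkt_tunnel_type & enabled_tunnel_types)
        return HASH_INNER_HEADER;
    return HASH_OUTER_HEADER;
}
```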

> > +
> > +Encapsulation types supported/enabled for inner header hash:
> > +\begin{itemize}
> > +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> > +        \begin{enumerate}
> > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +        \end{enumerate}
> > +
> > +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> > +        \begin{enumerate}
> > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +        \end{enumerate}
> > +\end{itemize}
> > +
> > +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> > +
> > +Encapsulation types applicable for inner header hash:
> > +\begin{lstlisting}
> > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > +
> > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > +
> > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > +
> > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > +
> > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > +
> > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > +
> > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > +
> > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > +
> > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > +\end{lstlisting}
> > +
> > +\subparagraph{Advice}
> > +Example uses of inner header hash:
> > +\begin{itemize}
> > +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> > +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> > +\item Identify an inner flow distributed across multiple outer tunnels.
> > +\end{itemize}
> > +
> > +As using the inner header hash completely discards the outer header entropy, care must be taken
> > +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> > +configurations with insufficient entropy.
> > +
> > +Besides disabling inner header hash, mitigations would depend on:
> > +\begin{itemize}
> > +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
> 
> this is quite vague
> 

Giving more concrete advice would be too specific, as we discussed a
long time ago.

> > +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> > +      to disable inner header hash for all encapsulated packets.
> 
> this is precisely disabling

You mean setting \field{hash_tunnel_types} to 0 is the same as disabling inner header hash?

> 
> > +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> 
> it is not at all clear how would devices do this.
> 

We describe it broadly here because this is done by devices, which
usually know what to do; actions can also be taken off-device, for
example by an external firewall.

> > +\end{itemize}
> 
> Oh sorry I didn't complete the sentence :(
> I suggest dropping above and having something like:
> 
> 	Besides disabling inner header hash, mitigations would depend on how the
> 	hash is used, and the consequences of a successful attack.
> 	For example, if the attack causes packet drops, using a deeper queue
> 	might be able to mitigate it.

Ok, I got you now!

> 
> 
> 
> > +
> > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > +
> > +If the (outer) header of the received packet does not match any value
> 
> any encapsulation type

Ok. And 'any bits' instead of 'any value' I think.

> 
> > enabled in \field{hash_tunnel_types},
> > +the device MUST calculate the hash on the outer header.
> > +
> > +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> > +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> 
> let's be specific. if any bits in hash_tunnel_types are not set in
> supported_hash_tunnel_types

Ok.
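
The more specific rule could look like the sketch below on the device side. handle_tunnel_set() is a hypothetical name; VIRTIO_NET_OK and VIRTIO_NET_ERR use the existing control virtqueue ack values.

```c
#include <stdint.h>

/* Control virtqueue ack values. */
#define VIRTIO_NET_OK   0
#define VIRTIO_NET_ERR  1

/*
 * Hypothetical device-side handler for VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
 * If any requested bit is not set in the supported mask, the command
 * fails and the currently enabled types are left unchanged.
 */
static uint8_t handle_tunnel_set(uint32_t supported, uint32_t requested,
                                 uint32_t *enabled)
{
    if (requested & ~supported)
        return VIRTIO_NET_ERR;

    *enabled = requested; /* 0 disables inner header hash entirely */
    return VIRTIO_NET_OK;
}
```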

> 
> > +
> > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
> 
> what does this mean even?

It means that when the driver issues the GET command, the device should
return the corresponding value.

> 
> > +
> > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> > +
> > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > +
> > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> > +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > +
> > +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> > +
> > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
> 
> any bits. not any value

Got it.

Thanks.

> 
> > +
> >  \paragraph{Hash reporting for incoming packets}
> >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
> >  
> > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > index 54f6783..f88f48b 100644
> > --- a/device-types/net/device-conformance.tex
> > +++ b/device-types/net/device-conformance.tex
> > @@ -14,4 +14,5 @@
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> >  \end{itemize}
> > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > index 97d0cc1..9d853d9 100644
> > --- a/device-types/net/driver-conformance.tex
> > +++ b/device-types/net/driver-conformance.tex
> > @@ -14,4 +14,5 @@
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> >  \end{itemize}
> > diff --git a/introduction.tex b/introduction.tex
> > index b7155bf..81f07a4 100644
> > --- a/introduction.tex
> > +++ b/introduction.tex
> > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
> >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> >  
> > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> > +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> > +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> > +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > +    Virtual eXtensible Local Area Network.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > +    Generic Network Virtualization Encapsulation.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > +    IP Encapsulation within IP.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > +    INTERNET PROTOCOL
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > +    User Datagram Protocol
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > +    TRANSMISSION CONTROL PROTOCOL
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> >  \end{longtable}
> >  
> >  \section{Non-Normative References}
> > -- 
> > 2.19.1.6.gb485710b
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-21 16:46     ` Heng Qi
  0 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-21 16:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> > 1. Currently, a received encapsulated packet has an outer and an inner header, but
> > the virtio device is unable to calculate the hash for the inner header. The same
> > flow can traverse through different tunnels, resulting in the encapsulated
> > packets being spread across multiple receive queues (refer to the figure below).
> > However, in certain scenarios, we may need to direct these encapsulated packets of
> > the same flow to a single receive queue. This facilitates the processing
> > of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> > 
> >                client1                    client2
> >                   |        +-------+         |
> >                   +------->|tunnels|<--------+
> >                            +-------+
> >                               |  |
> >                               v  v
> >                       +-----------------+
> >                       | monitoring host |
> >                       +-----------------+
> > 
> > To achieve this, the device can calculate a symmetric hash based on the inner headers
> > of the same flow.
> > 
> > 2. For legacy systems, they may lack entropy fields which modern protocols have in
> > the outer header, resulting in multiple flows with the same outer header but
> > different inner headers being directed to the same receive queue. This results in
> > poor receive performance.
> > 
> > To address this limitation, inner header hash can be used to enable the device to advertise
> > the capability to calculate the hash for the inner packet, regaining better receive performance.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> > 
> 
> don't put an empty line here

Ok. Will remove it.

> 
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> 
> 
> ok almost there. small corrections, and one enhancement suggestion.
> 
> 
> > ---
> > v17->v18:
> > 	1. Some rewording suggestions from Michael (Thanks!).
> > 	2. Use 0 to disable inner header hash and remove
> > 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> > v16->v17:
> > 	1. Some small rewrites. @Parav Pandit
> > 	2. Add Parav's Reviewed-by tag (Thanks!).
> > 
> > v15->v16:
> > 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> > 	   configuration, the ability to configure the outer src udp port hash is given
> > 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> > 	   RSS capability extension topic (considered as an RSS extension together
> > 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> > 	2. Fix a 'field' typo. @Parav Pandit
> > 
> > v14->v15:
> > 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> > 	2. Adjust some descriptions.
> > 
> > v13->v14:
> > 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> > 	2. Rebase to master branch.
> > 	3. Some minor modifications.
> > 
> > v12->v13:
> > 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> > 	2. Add tunneling protocol explanation. @Jason Wang
> > 	3. Add comments on some usage scenarios for inner hash.
> > 
> > v11->v12:
> > 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> > 	2. Refine the commit log. @Michael S . Tsirkin
> > 	3. Add some tunnel types.
> > 
> > v10->v11:
> > 	1. Revise commit log for clarity for readers.
> > 	2. Some modifications to avoid undefined terms. @Parav Pandit
> > 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> > 	4. Add the normative statements. @Parav Pandit
> > 
> > v9->v10:
> > 	1. Removed hash_report_tunnel related information. @Parav Pandit
> > 	2. Re-describe the limitations of QoS for tunneling.
> > 	3. Some clarification.
> > 
> > v8->v9:
> > 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> > 	2. Add tunnel security section. @Michael S . Tsirkin
> > 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> > 	4. Fix some typos.
> > 	5. Add more tunnel types. @Michael S . Tsirkin
> > 
> > v7->v8:
> > 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> > 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> > 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> > 	4. Fix some typos. @Michael S . Tsirkin
> > 	5. Clarify some sentences. @Michael S . Tsirkin
> > 
> > v6->v7:
> > 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > 	2. Fix some syntax issues. @Michael S. Tsirkin
> > 
> > v5->v6:
> > 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> > 	3. Move the links to introduction section. @Michael S. Tsirkin
> > 	4. Clarify some sentences. @Michael S. Tsirkin
> > 
> > v4->v5:
> > 	1. Clarify some paragraphs. @Cornelia Huck
> > 	2. Fix the u8 type. @Cornelia Huck
> > 
> > v3->v4:
> > 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > 
> > v2->v3:
> > 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > 
> > v1->v2:
> > 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > 	2. Clarify some paragraphs. @Jason Wang
> > 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > 
> >  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
> >  device-types/net/device-conformance.tex |   1 +
> >  device-types/net/driver-conformance.tex |   1 +
> >  introduction.tex                        |  39 ++++++
> >  4 files changed, 199 insertions(+)
> > 
> > diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> > index 3030222..9fdccfc 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> >  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> >      channel.
> >  
> > +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> > +
> >  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> >  
> >  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> > @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> >  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> >  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> >  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> > +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
> 
> I think just or is enough.

Sure. I agree.

> 
> >  \end{description}
> >  
> >  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  If the feature VIRTIO_NET_F_RSS was negotiated:
> >  \begin{itemize}
> >  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> 
> why get and not set? e.g. if I call get then set then the field in set
> will have effect.

Consider the following command sequence:
1. The driver sets hash_tunnel_types using the SET command and saves it
somewhere, e.g. virtnet_info->hash_tunnel_types_saved.
2. The driver fetches hash_tunnel_types using the GET command, which should
be equal to virtnet_info->hash_tunnel_types_saved (or could even be returned
from virtnet_info->hash_tunnel_types_saved, like RSS).
3. The driver sets hash_tunnel_types again using the SET command and saves
it in virtnet_info->hash_tunnel_types_saved.

But I think your enhanced proposal is also feasible; after all, for RSS
etc. we previously had only a SET command, and that worked very well.

> 
> 
> >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
> >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
> >  \end{itemize}
> > @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  If the feature VIRTIO_NET_F_RSS was not negotiated:
> >  \begin{itemize}
> >  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> 
> same
> 
> >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
> >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
> >  \end{itemize}
> > @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
> >  \end{itemize}
> >  
> > +The per-packet hash calculation can depend on the IP packet type. See
> > +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> 
> and end paragraph here.

Do you mean adding a blank line here?

> 
> >  \subparagraph{Supported/enabled hash types}
> >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> >  Hash types applicable for IPv4 packets:
> > @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> >  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> >  \end{itemize}
> >  
> > +\paragraph{Inner Header Hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> > +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> > +
> > +struct virtnet_hash_tunnel_config_set {
> > +    le32 hash_tunnel_types;
> > +};
> > +
> > +struct virtnet_hash_tunnel_config_get {
> > +    le32 supported_hash_tunnel_types;
> > +    le32 hash_tunnel_types;
> > +};
> > +
> 
> It would be cleaner to have a single structure for both.
> 

SET carries an additional 32 bits of information. But if you think this
will make the overall structure more concise, I'm ok.

> I also think hash_tunnel is unnecessarily verbose, and _config_ is also
> pointless.
> 
> Returning supported_hash_tunnel_types back to device can also
> be useful for debugging.
> 
> How about:
> 
> 
> struct virtnet_hash_tunnel {
>      le32 supported_tunnel_types;
>      le32 enabled_tunnel_types;
> };
> 

It's OK.

And:
For the GET command, both fields are WO for the device.
For the SET command, \field{supported_tunnel_types} is RO for the device
and \field{enabled_tunnel_types} is WO for the device.

> 
> For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> contains the bitmask of encapsulation types supported
> by the device for inner header hash; \field{enabled_tunnel_types}
> contains the value received in a previous successful
> call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> 
> For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> contains the value returned by a previous
> successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
> \field{enabled_tunnel_types}
> contains the bitmask of encapsulation types to enable
> for inner header hash.
> 
> and add normative statements to this end.
> 
> 

Ok.
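
For reference, the GET/SET semantics proposed above could be modelled roughly
as below. This is only a sketch under assumptions: uint32_t stands in for le32
(byte-order conversion is omitted), struct hash_tunnel_dev_state and the two
helper functions are hypothetical names that do not appear in the patch, and
the supported_tunnel_types value echoed back on SET is treated as
informational only.

```c
#include <stdint.h>

/* Layout proposed in the review; uint32_t is a stand-in for le32. */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

/* Hypothetical device-side state, not defined by the patch. */
struct hash_tunnel_dev_state {
    uint32_t supported; /* fixed capability bitmask of the device */
    uint32_t enabled;   /* 0 until the first successful SET */
};

/* GET: the device writes both fields. */
static void hash_tunnel_get(const struct hash_tunnel_dev_state *dev,
                            struct virtnet_hash_tunnel *out)
{
    out->supported_tunnel_types = dev->supported;
    out->enabled_tunnel_types = dev->enabled;
}

/*
 * SET: the device reads enabled_tunnel_types. Returns 0 (VIRTIO_NET_OK) on
 * success and 1 (VIRTIO_NET_ERR) if any requested bit is not supported, in
 * which case the previously enabled types are left unchanged.
 */
static int hash_tunnel_set(struct hash_tunnel_dev_state *dev,
                           const struct virtnet_hash_tunnel *in)
{
    if (in->enabled_tunnel_types & ~dev->supported)
        return 1;
    dev->enabled = in->enabled_tunnel_types;
    return 0;
}
```

With this model, a GET issued after a successful SET reports the value that
SET carried, and a rejected SET does not disturb it.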

> > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > +
> > +
> > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> 
> We don't need these two sentences. Just second one will do.
> 

Ok.

> > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > +
> > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
> 
> They have different meanings for set and get though.
> 
> 
> > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > +
> > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > +\begin{itemize}
> > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> > +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> > +\end{itemize}
> > +
> > +\subparagraph{Encapsulated packet}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> > +
> > +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> > +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> > +hash over either the inner header or the outer header.
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> > +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> > +calculations (only a single level of encapsulation is currently supported).
> > +
> > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> > +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> > +
> > +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> > +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> 
> Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
> all encapsulation types are disabled 
> 

Ok.
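
To make sure we read the header-selection rule in the quoted text the same
way, here is a small sketch of it. The function name and the tunnel_type_bit
parameter are hypothetical; it assumes the device's parser has already
classified the outer header into a single encapsulation-type bit (or 0 if the
packet is not recognised as encapsulated). The two bit values are copied from
this v18 patch and may change in a later revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in the quoted v18 patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN  (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1u << 6)

/*
 * Returns true if the hash is calculated over the inner header, false if it
 * is calculated over the outer header.
 */
static bool use_inner_header(bool hash_tunnel_negotiated,
                             uint32_t enabled_tunnel_types,
                             uint32_t tunnel_type_bit)
{
    /* Without VIRTIO_NET_F_HASH_TUNNEL only the outer header is hashed. */
    if (!hash_tunnel_negotiated)
        return false;
    /* Inner header only if this encapsulation type is currently enabled. */
    return (enabled_tunnel_types & tunnel_type_bit) != 0;
}
```

Before the driver sends any SET command the enabled mask is 0, so this always
selects the outer header, which matches the "initially disabled" wording.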

> > +
> > +Encapsulation types supported/enabled for inner header hash:
> > +\begin{itemize}
> > +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> > +        \begin{enumerate}
> > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +        \end{enumerate}
> > +
> > +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> > +        \begin{enumerate}
> > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > +        \end{enumerate}
> > +\end{itemize}
> > +
> > +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> > +
> > +Encapsulation types applicable for inner header hash:
> > +\begin{lstlisting}
> > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > +
> > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > +
> > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > +
> > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > +
> > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > +
> > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > +
> > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > +
> > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > +
> > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > +\end{lstlisting}
> > +
> > +\subparagraph{Advice}
> > +Example uses of inner header hash:
> > +\begin{itemize}
> > +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> > +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> > +\item Identify an inner flow distributed across multiple outer tunnels.
> > +\end{itemize}
> > +
> > +As using the inner header hash completely discards the outer header entropy, care must be taken
> > +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> > +configurations with insufficient entropy.
> > +
> > +Besides disabling inner header hash, mitigations would depend on:
> > +\begin{itemize}
> > +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
> 
> this is quite vague
> 

Giving more specific advice would be too prescriptive, as we discussed a
long time ago.

> > +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> > +      to disable inner header hash for all encapsulated packets.
> 
> this is precisely disabling

You mean setting \field{hash_tunnel_types} to 0 is already disabling inner header hash?

> 
> > +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> 
> it is not at all clear how would devices do this.
> 

The reason we're describing it broadly here is that this is done by
devices, which usually know what to do, and can also take actions
off-device, such as firewalls off-device, etc.

> > +\end{itemize}
> 
> Oh sorry I didn't complete the sentence :(
> I suggest dropping above and having something like:
> 
> 	Besides disabling inner header hash, mitigations would depend on how the
> 	hash is used, and the consequences of a successful attack.
> 	For example, if the attack causes packet drops, using a deeper queue
> 	might be able to mitigate it.

Ok, I got you now!

> 
> 
> 
> > +
> > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > +
> > +If the (outer) header of the received packet does not match any value
> 
> any encapsulation type

Ok. And 'any bits' instead of 'any value' I think.

> 
> > enabled in \field{hash_tunnel_types},
> > +the device MUST calculate the hash on the outer header.
> > +
> > +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> > +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> 
> let's be specific. if any bits in hash_tunnel_types are not set in
> supported_hash_tunnel_types

Ok.

> 
> > +
> > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
> 
> what does this mean even?

When the driver uses the GET command, the device should preferably
return the corresponding value.

> 
> > +
> > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> > +
> > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > +
> > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> > +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > +
> > +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> > +
> > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
> 
> any bits. not any value

Got it.

Thanks.
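
On the driver side, the "any bits" wording amounts to a simple mask test
before issuing SET. A minimal sketch, assuming the driver has cached the
supported bitmask from an earlier successful GET (the helper name is
hypothetical and not part of the patch):

```c
#include <stdint.h>

/*
 * Returns the requested bits that the device does not advertise as
 * supported. A result of 0 means the request is acceptable; any non-zero
 * result means the driver must not send this value in a SET command.
 */
static uint32_t hash_tunnel_unsupported_bits(uint32_t requested,
                                             uint32_t supported)
{
    return requested & ~supported;
}
```

Requesting 0 always passes this test, which is consistent with 0 being the
way to disable inner header hash.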

> 
> > +
> >  \paragraph{Hash reporting for incoming packets}
> >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
> >  
> > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > index 54f6783..f88f48b 100644
> > --- a/device-types/net/device-conformance.tex
> > +++ b/device-types/net/device-conformance.tex
> > @@ -14,4 +14,5 @@
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> >  \end{itemize}
> > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > index 97d0cc1..9d853d9 100644
> > --- a/device-types/net/driver-conformance.tex
> > +++ b/device-types/net/driver-conformance.tex
> > @@ -14,4 +14,5 @@
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> >  \end{itemize}
> > diff --git a/introduction.tex b/introduction.tex
> > index b7155bf..81f07a4 100644
> > --- a/introduction.tex
> > +++ b/introduction.tex
> > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
> >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> >  
> > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> > +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> > +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> > +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > +    Virtual eXtensible Local Area Network.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > +    Generic Network Virtualization Encapsulation.
> > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > +    IP Encapsulation within IP.
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > +    INTERNET PROTOCOL
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > +    User Datagram Protocol
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > +    TRANSMISSION CONTROL PROTOCOL
> > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> >  \end{longtable}
> >  
> >  \section{Non-Normative References}
> > -- 
> > 2.19.1.6.gb485710b
> 
> 



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 16:46     ` Heng Qi
@ 2023-06-21 17:52       ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 17:52 UTC (permalink / raw)
  To: Heng Qi, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck


> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, June 21, 2023 12:46 PM


> SET carries an additional 32 bits of information. But if you think this will make
> the overall structure more concise, I'm ok.
>
If it is placed in a single structure, then the text needs to be reworded to remove the WO/RO notion.
It also requires additional software to mark the DMA attributes as RW when mapping this area,
and extra text stating that supported_hash_tunnel_types is ignored on the SET command.

Two structures are cleaner, and each serves its own purpose.
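
To make the trade-off concrete, here is a rough sketch of the two layouts under discussion. The structure names for the two-structure option come from the v18 text and the single structure is the one proposed in the quoted text below; the field order inside the GET structure and the use of uint32_t as a stand-in for le32 are assumptions for illustration only.

```c
#include <stdint.h>

/* Stand-in for the spec's little-endian 32-bit type (assumption). */
typedef uint32_t le32;

/* Option 1: one structure per command, as named in v18. */
struct virtnet_hash_tunnel_config_set {   /* read-only for the device */
    le32 hash_tunnel_types;
};

struct virtnet_hash_tunnel_config_get {   /* write-only for the device */
    le32 supported_hash_tunnel_types;     /* field order is an assumption */
    le32 hash_tunnel_types;
};

/* Option 2: a single structure shared by SET and GET. */
struct virtnet_hash_tunnel {
    le32 supported_tunnel_types;          /* would need to be ignored on SET */
    le32 enabled_tunnel_types;
};
```

With option 1 each buffer has one direction, so the driver can map the SET buffer as device-readable and the GET buffer as device-writable without any per-field rules.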
 
> > I also think hash_tunnel is unnecessarily verbose, and _config_ is
> > also pointless.
> >
> > Returning supported_hash_tunnel_types back to device can also be
> > useful for debugging.
> >
> > How about:
> >
> >
> > struct virtnet_hash_tunnel {
> >      le32 supported_tunnel_types;
> >      le32 enabled_tunnel_types;
> > };
> >
> 
> It's OK.
> 
> And:
> For the GET command, both fields are WO for the device.
> For the SET command, \field{supported_tunnel_types} is RO for the device and
> \field{enabled_tunnel_types} is WO for the device.
> 
> >
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > contains the bitmask of encapsulation types supported by the device
> > for inner header hash; \field{enabled_tunnel_types} contains the value
> > received in a previous successful call to
> > VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> >
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > contains the value returned by a previous successful call to
> > VIRTIO_NET_CTRL_HASH_TUNNEL_GET; \field{enabled_tunnel_types} contains
> > the bitmask of encapsulation types to enable for inner header hash.
> >
> > and add normative statements to this end.
> >
> >
> 
> Ok.
> 
> > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > +
> > > +
> > > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> >
> > We don't need these two sentences. Just second one will do.
> >
> 
> Ok.
> 
> > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation
> types enabled for inner header hash.
> >
> > They have different meanings for set and get though.
> >
> >
> > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > +\begin{itemize}
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set
> \field{hash_tunnel_types} for the device using the
> > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the
> device.
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get
> \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > +      from the device using the virtnet_hash_tunnel_config_get structure,
> which is write-only for the device.
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulated packet}
> > > +\label{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming
> > > +packets / Encapsulated packet}
> > > +
> > > +Multiple tunneling protocols allow encapsulating an inner, payload packet
> in an outer, encapsulated packet.
> > > +The encapsulated packet thus contains an outer header and an inner
> > > +header, and the device calculates the hash over either the inner header or
> the outer header.
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received
> > > +encapsulated packet's outer header matches one of the encapsulation
> > > +types enabled in \field{hash_tunnel_types}, then the device uses the inner
> header for hash calculations (only a single level of encapsulation is currently
> supported).
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's
> > > +(outer) header does not match any types enabled in
> \field{hash_tunnel_types}, then the device uses the outer header for hash
> calculations.
> > > +
> > > +Initially all encapsulation types are disabled (the value of
> > > +\field{hash_tunnel_types} is 0) for inner header hash before any
> VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> >
> > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET) all
> > encapsulation types are disabled
> >
> 
> Ok.
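
As a rough illustration of the inner/outer selection rule quoted above, a device-side sketch could look like the following. The function and enum names are hypothetical; `tunnel_type_bit` is assumed to be the VIRTIO_NET_HASH_TUNNEL_TYPE_* bit that the outer header was classified as, or 0 if the packet is not a recognized encapsulation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which header the hash is computed over (hypothetical names). */
enum hash_header { HASH_OUTER_HEADER, HASH_INNER_HEADER };

/*
 * enabled_types starts at 0 (all encapsulation types disabled) until the
 * driver sends a successful VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
 */
static enum hash_header select_hash_header(bool hash_tunnel_negotiated,
                                           uint32_t enabled_types,
                                           uint32_t tunnel_type_bit)
{
    /* Inner header only if the feature is negotiated and the type is enabled. */
    if (hash_tunnel_negotiated && (enabled_types & tunnel_type_bit))
        return HASH_INNER_HEADER;

    /* Everything else, including non-encapsulated packets, uses the outer header. */
    return HASH_OUTER_HEADER;
}
```

Only a single level of encapsulation is considered here, matching the restriction in the quoted text.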
> 
> > > +
> > > +Encapsulation types supported/enabled for inner header hash:
> > > +\begin{itemize}
> > > +    \item The outer header of the following encapsulation types does not
> contain the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and
> the inner header is over IPv4.
> > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header
> is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header
> is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header
> is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +
> > > +    \item The outer header of the following encapsulation types uses UDP as
> the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is
> over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer
> header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulation types supported/enabled for inner
> > > +header hash} \label{sec:Device Types / Network Device / Device
> > > +Operation / Processing of Incoming Packets / Hash calculation for
> > > +incoming packets / Encapsulation types supported/enabled for inner
> > > +header hash}
> > > +
> > > +Encapsulation types applicable for inner header hash:
> > > +\begin{lstlisting}
> > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > +
> > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > +
> > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > +
> > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation
> type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > +
> > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > +
> > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > +
> > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > +
> > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > +
> > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > +\end{lstlisting}
> > > +
> > > +\subparagraph{Advice}
> > > +Example uses of inner header hash:
> > > +\begin{itemize}
> > > +\item Legacy tunneling protocols, lacking outer header entropy, can use
> RSS with inner header hash to
> > > +      distribute flows with identical outer but different inner headers across
> various queues, improving performance.
> > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > +\end{itemize}
> > > +
> > > +As using the inner header hash completely discards the outer header
> > > +entropy, care must be taken if the inner header is controlled by an
> > > +adversary, as the adversary can then intentionally create configurations
> with insufficient entropy.
> > > +
> > > +Besides disabling inner header hash, mitigations would depend on:
> > > +\begin{itemize}
> > > +\item Use a tool with good forwarding performance to keep the receive
> queue from dropping packets.
> >
> > this is quite vague
> >
> 
> Giving overly specific advice would be too restrictive, which we discussed a long
> time ago.
> 
> > > +\item If the QoS (Quality of service) is unavailable, the driver can set
> \field{hash_tunnel_types} to 0
> > > +      to disable inner header hash for all encapsulated packets.
> >
> > this is precisely disabling
> 
> \field{hash_tunnel_types} to 0 disabling inner ?
> 
> >
> > > +\item Perform appropriate QoS before packets consume the receive
> buffers of the receive queues.
> >
> > it is not at all clear how would devices do this.
> >
> 
> The reason we're describing it broadly here is that this is done by devices, which
> usually know what to do, and can also take actions off-device, such as firewalls
> off-device, etc.
> 
> > > +\end{itemize}
> >
> > Oh sorry I didn't complete the sentence :( I suggest dropping above
> > and having something like:
> >
> > 	Besides disabling inner header hash, mitigations would depend on how the
> > 	hash is used, and the consequences of a successful attack.
> > 	For example, if the attack causes packet drops, using a deeper queue
> > 	might be able to mitigate it.
> 
> Ok, I got you now!
> 
> >
> >
> >
> > > +
> > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > +Header Hash}
> > > +
> > > +If the (outer) header of the received packet does not match any
> > > +value
> >
> > any encapsulation type
> 
> Ok. And 'any bits' instead of 'any value' I think.
> 
> >
> > > enabled in \field{hash_tunnel_types},
> > > +the device MUST calculate the hash on the outer header.
> > > +
> > > +If the device receives an unsupported or unrecognized value for
> > > +\field{hash_tunnel_types}, it MUST respond to the
> VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> >
> > let's be specific. if any bits in hash_tunnel_types are not set in
> > supported_hash_tunnel_types
> 
> Ok.
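
The "any bits not set in supported" wording maps directly onto a mask check. A minimal sketch of the device-side SET handling, assuming a hypothetical state structure and handler name (the VIRTIO_NET_OK/VIRTIO_NET_ERR values are the control virtqueue ack codes from the existing spec):

```c
#include <stdint.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Hypothetical per-device state for inner header hash. */
struct hash_tunnel_state {
    uint32_t supported;  /* bitmask reported by GET */
    uint32_t enabled;    /* bitmask accepted by the last successful SET */
};

/* Handle VIRTIO_NET_CTRL_HASH_TUNNEL_SET with the requested bitmask. */
static uint8_t hash_tunnel_set(struct hash_tunnel_state *st, uint32_t requested)
{
    /* Reject the command if any requested bit is not supported. */
    if (requested & ~st->supported)
        return VIRTIO_NET_ERR;

    /* A value of 0 disables inner header hash for all types. */
    st->enabled = requested;
    return VIRTIO_NET_OK;
}
```

On failure the previously enabled set is left untouched in this sketch; the quoted text does not say what the device does with its state in that case, so that part is an assumption.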
> 
> >
> > > +
> > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST
> provide the values for \field{supported_hash_tunnel_types}.
> >
> > what does this mean even?
> 
> When the driver uses the GET command, the device should return
> the corresponding value.
> 
> >
> > > +
> > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device
> MUST disable inner header hash for all encapsulation types.
> > > +
> > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > +Header Hash}
> > > +
> > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL
> > > +feature when issuing commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > +
> > > +The driver MUST ignore the values received from the
> VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with
> VIRTIO_NET_ERR.
> > > +
> > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is
> not set in \field{supported_hash_tunnel_types}.
> >
> > any bits. not any value
> 
> Get.
> 
> Thanks.
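
On the driver side, the "MUST NOT set any bits" rule can be met by masking the wanted types with what GET reported before issuing SET. A small sketch, with a hypothetical helper name and only two of the type bits from the quoted patch:

```c
#include <stdint.h>

/* Two of the encapsulation type bits from the quoted patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 6)

/*
 * Hypothetical helper: keep only the wanted types that the device
 * reported as supported in a successful GET.
 */
static uint32_t driver_pick_tunnel_types(uint32_t wanted, uint32_t supported)
{
    return wanted & supported;
}
```

For example, a driver that wants VXLAN and GENEVE on a device that only supports VXLAN would end up requesting VXLAN alone, and would never trigger VIRTIO_NET_ERR from the supported-bits check.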
> 
> >
> > > +
> > >  \paragraph{Hash reporting for incoming packets}  \label{sec:Device
> > > Types / Network Device / Device Operation / Processing of Incoming
> > > Packets / Hash reporting for incoming packets}
> > >
> > > diff --git a/device-types/net/device-conformance.tex
> > > b/device-types/net/device-conformance.tex
> > > index 54f6783..f88f48b 100644
> > > --- a/device-types/net/device-conformance.tex
> > > +++ b/device-types/net/device-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{devicenormative:Device Types / Network Device / Device
> > > Operation / Control Virtqueue / Automatic receive steering in
> > > multiqueue mode}  \item \ref{devicenormative:Device Types / Network
> > > Device / Device Operation / Control Virtqueue / Receive-side scaling
> > > (RSS) / RSS processing}  \item \ref{devicenormative:Device Types /
> > > Network Device / Device Operation / Control Virtqueue /
> > > Notifications Coalescing}
> > > +\item \ref{devicenormative:Device Types / Network Device / Device
> > > +Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/device-types/net/driver-conformance.tex
> > > b/device-types/net/driver-conformance.tex
> > > index 97d0cc1..9d853d9 100644
> > > --- a/device-types/net/driver-conformance.tex
> > > +++ b/device-types/net/driver-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{drivernormative:Device Types / Network Device / Device
> > > Operation / Control Virtqueue / Offloads State Configuration /
> > > Setting Offloads State}  \item \ref{drivernormative:Device Types /
> > > Network Device / Device Operation / Control Virtqueue / Receive-side
> > > scaling (RSS) }  \item \ref{drivernormative:Device Types / Network
> > > Device / Device Operation / Control Virtqueue / Notifications
> > > Coalescing}
> > > +\item \ref{drivernormative:Device Types / Network Device / Device
> > > +Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/introduction.tex b/introduction.tex index
> > > b7155bf..81f07a4 100644
> > > --- a/introduction.tex
> > > +++ b/introduction.tex
> > > @@ -102,6 +102,45 @@ \section{Normative
> References}\label{sec:Normative References}
> > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve
> Cryptography'', Version 1.0, September 2000.
> > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > >
> > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4
> and used as either the payload or delivery protocol.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}.
> This protocol describes extensions by which two fields, Key and
> > > +    Sequence Number, can be optionally carried in the GRE Header
> \ref{intro:gre_rfc2784}.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is
> specified for IPv6 and used as either the payload or
> > > +    delivery protocol. Note that this does not change the GRE header format
> or any behaviors specified by RFC 2784 or RFC 2890.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-
> UDP]} &
> > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating
> network protocol packets within GRE and UDP headers.
> > > +    This protocol is specified for IPv4 and IPv6, and used as either the
> payload or delivery protocol.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > +    Virtual eXtensible Local Area Network.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
> > > +    Generic Protocol Extension for VXLAN. This protocol describes extending
> Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN
> header.
> > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-
> 12.txt}\\
> > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > +    Generic Network Virtualization Encapsulation.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > +    IP Encapsulation within IP.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > +    INTERNET PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > +    User Datagram Protocol
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > +    TRANSMISSION CONTROL PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > >  \end{longtable}
> > >
> > >  \section{Non-Normative References}
> > > --
> > > 2.19.1.6.gb485710b
> >
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-21 17:52       ` Parav Pandit
  0 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 17:52 UTC (permalink / raw)
  To: Heng Qi, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck


> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, June 21, 2023 12:46 PM


> SET carries an additional 32 bits of information. But if you think this will make
> the overall structure more concise, I'm ok.
>
If it is placed in a single structure, then the text needs to be reworded to remove the WO/RO notion.
It also requires additional software to mark the DMA attributes as RW when mapping this area,
and extra text stating that supported_hash_tunnel_types is ignored on the SET command.

Two structures are cleaner, and each serves its own purpose.
 
> > I also think hash_tunnel is unnecessarily verbose, and _config_ is
> > also pointless.
> >
> > Returning supported_hash_tunnel_types back to device can also be
> > useful for debugging.
> >
> > How about:
> >
> >
> > struct virtnet_hash_tunnel {
> >      le32 supported_tunnel_types;
> >      le32 enabled_tunnel_types;
> > };
> >
> 
> It's OK.
> 
> And:
> For the GET command, both fields are WO for the device.
> For the SET command, \field{supported_tunnel_types} is RO for the device and
> \field{enabled_tunnel_types} is WO for the device.
> 
> >
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > contains the bitmask of encapsulation types supported by the device
> > for inner header hash; \field{enabled_tunnel_types} contains the value
> > received in a previous successful call to
> > VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> >
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > contains the value returned by a previous successful call to
> > VIRTIO_NET_CTRL_HASH_TUNNEL_GET; \field{enabled_tunnel_types} contains
> > the bitmask of encapsulation types to enable for inner header hash.
> >
> > and add normative statements to this end.
> >
> >
> 
> Ok.
> 
> > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > +
> > > +
> > > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> >
> > We don't need these two sentences. Just second one will do.
> >
> 
> Ok.
> 
> > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation
> types enabled for inner header hash.
> >
> > They have different meanings for set and get though.
> >
> >
> > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > +\begin{itemize}
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set
> \field{hash_tunnel_types} for the device using the
> > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the
> device.
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get
> \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > +      from the device using the virtnet_hash_tunnel_config_get structure,
> which is write-only for the device.
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulated packet}
> > > +\label{sec:Device Types / Network Device / Device Operation /
> > > +Processing of Incoming Packets / Hash calculation for incoming
> > > +packets / Encapsulated packet}
> > > +
> > > +Multiple tunneling protocols allow encapsulating an inner, payload packet
> in an outer, encapsulated packet.
> > > +The encapsulated packet thus contains an outer header and an inner
> > > +header, and the device calculates the hash over either the inner header or
> the outer header.
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received
> > > +encapsulated packet's outer header matches one of the encapsulation
> > > +types enabled in \field{hash_tunnel_types}, then the device uses the inner
> header for hash calculations (only a single level of encapsulation is currently
> supported).
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's
> > > +(outer) header does not match any types enabled in
> \field{hash_tunnel_types}, then the device uses the outer header for hash
> calculations.
> > > +
> > > +Initially all encapsulation types are disabled (the value of
> > > +\field{hash_tunnel_types} is 0) for inner header hash before any
> VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> >
> > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET) all
> > encapsulation types are disabled
> >
> 
> Ok.
> 
> > > +
> > > +Encapsulation types supported/enabled for inner header hash:
> > > +\begin{itemize}
> > > +    \item The outer header of the following encapsulation types does not
> contain the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and
> the inner header is over IPv4.
> > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header
> is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header
> is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header
> is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +
> > > +    \item The outer header of the following encapsulation types uses UDP as
> the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over
> IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is
> over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer
> header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulation types supported/enabled for inner
> > > +header hash} \label{sec:Device Types / Network Device / Device
> > > +Operation / Processing of Incoming Packets / Hash calculation for
> > > +incoming packets / Encapsulation types supported/enabled for inner
> > > +header hash}
> > > +
> > > +Encapsulation types applicable for inner header hash:
> > > +\begin{lstlisting}
> > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > +
> > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > +
> > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > +
> > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation
> type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > +
> > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > +
> > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > +
> > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > +
> > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > +
> > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > +\end{lstlisting}
> > > +
> > > +\subparagraph{Advice}
> > > +Example uses of inner header hash:
> > > +\begin{itemize}
> > > +\item Legacy tunneling protocols, lacking outer header entropy, can use
> RSS with inner header hash to
> > > +      distribute flows with identical outer but different inner headers across
> various queues, improving performance.
> > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > +\end{itemize}
> > > +
> > > +As using the inner header hash completely discards the outer header
> > > +entropy, care must be taken if the inner header is controlled by an
> > > +adversary, as the adversary can then intentionally create configurations
> with insufficient entropy.
> > > +
> > > +Besides disabling inner header hash, mitigations would depend on:
> > > +\begin{itemize}
> > > +\item Use a tool with good forwarding performance to keep the receive
> queue from dropping packets.
> >
> > this is quite vague
> >
> 
> Giving overly specific advice would be too restrictive, which we discussed a long
> time ago.
> 
> > > +\item If the QoS (Quality of service) is unavailable, the driver can set
> \field{hash_tunnel_types} to 0
> > > +      to disable inner header hash for all encapsulated packets.
> >
> > this is precisely disabling
> 
> \field{hash_tunnel_types} to 0 disabling inner ?
> 
> >
> > > +\item Perform appropriate QoS before packets consume the receive
> buffers of the receive queues.
> >
> > it is not at all clear how would devices do this.
> >
> 
> The reason we're describing it broadly here is that this is done by devices, which
> usually know what to do, and can also take actions off-device, such as firewalls
> off-device, etc.
> 
> > > +\end{itemize}
> >
> > Oh sorry I didn't complete the sentence :( I suggest dropping above
> > and having something like:
> >
> > 	Besides disabling inner header hash, mitigations would depend on how the
> > 	hash is used, and the consequences of a successful attack.
> > 	For example, if the attack causes packet drops, using a deeper queue
> > 	might be able to mitigate it.
> 
> Ok, I got you now!
> 
> >
> >
> >
> > > +
> > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > +Header Hash}
> > > +
> > > +If the (outer) header of the received packet does not match any
> > > +value
> >
> > any encapsulation type
> 
> Ok. And 'any bits' instead of 'any value' I think.
> 
> >
> > > enabled in \field{hash_tunnel_types},
> > > +the device MUST calculate the hash on the outer header.
> > > +
> > > +If the device receives an unsupported or unrecognized value for
> > > +\field{hash_tunnel_types}, it MUST respond to the
> VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> >
> > let's be specific. if any bits in hash_tunnel_types are not set in
> > supported_hash_tunnel_types
> 
> Ok.
> 
> >
> > > +
> > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST
> provide the values for \field{supported_hash_tunnel_types}.
> >
> > what does this mean even?
> 
> When the driver uses the GET command, the device should return
> the corresponding value.
> 
> >
> > > +
> > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device
> MUST disable inner header hash for all encapsulation types.
> > > +
> > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > +Header Hash}
> > > +
> > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL
> > > +feature when issuing commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > +
> > > +The driver MUST ignore the values received from the
> VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with
> VIRTIO_NET_ERR.
> > > +
> > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is
> not set in \field{supported_hash_tunnel_types}.
> >
> > any bits. not any value
> 
> Get.
> 
> Thanks.
> 
> >
> > > +
> > >  \paragraph{Hash reporting for incoming packets}  \label{sec:Device
> > > Types / Network Device / Device Operation / Processing of Incoming
> > > Packets / Hash reporting for incoming packets}
> > >
> > > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > > index 54f6783..f88f48b 100644
> > > --- a/device-types/net/device-conformance.tex
> > > +++ b/device-types/net/device-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{devicenormative:Device Types / Network Device / Device
> > > +Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > > index 97d0cc1..9d853d9 100644
> > > --- a/device-types/net/driver-conformance.tex
> > > +++ b/device-types/net/driver-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{drivernormative:Device Types / Network Device / Device
> > > +Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/introduction.tex b/introduction.tex
> > > index b7155bf..81f07a4 100644
> > > --- a/introduction.tex
> > > +++ b/introduction.tex
> > > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve
> Cryptography'', Version 1.0, September 2000.
> > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > >
> > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4
> and used as either the payload or delivery protocol.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}.
> This protocol describes extensions by which two fields, Key and
> > > +    Sequence Number, can be optionally carried in the GRE Header
> \ref{intro:gre_rfc2784}.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is
> specified for IPv6 and used as either the payload or
> > > +    delivery protocol. Note that this does not change the GRE header format
> or any behaviors specified by RFC 2784 or RFC 2890.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-
> UDP]} &
> > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating
> network protocol packets within GRE and UDP headers.
> > > +    This protocol is specified for IPv4 and IPv6, and used as either the
> payload or delivery protocol.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > +    Virtual eXtensible Local Area Network.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
> > > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > +    Generic Network Virtualization Encapsulation.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > +    IP Encapsulation within IP.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > +    INTERNET PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > +    User Datagram Protocol
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > +    TRANSMISSION CONTROL PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > >  \end{longtable}
> > >
> > >  \section{Non-Normative References}
> > > --
> > > 2.19.1.6.gb485710b
> >
> >
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> >
> > In order to verify user consent to the Feedback License terms and to
> > minimize spam in the list archive, subscription is required before
> > posting.
> >
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License:
> > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines:
> > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 17:52       ` Parav Pandit
@ 2023-06-21 19:25         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:25 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 05:52:28PM +0000, Parav Pandit wrote:
> 
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, June 21, 2023 12:46 PM
> 
> 
> > SET carries an additional 32 bits of information. But if you think this will make
> > the overall structure more concise, I'm ok.
> >
> If it is placed in a single structure, then it needs to be reworded to remove the WO or RO notion.

Not really. For any given command, all of the structure is either RO or WO.

> This also requires additional software to mark the DMA attributes as RW when mapping this area.
> And extra text to indicate that supported_hash_tunnel_types is to be ignored on the SET command.
> 
> Two structures are cleaner and serve the purpose better.

No, because all of the structure is either RO or WO.

> > > I also think hash_tunnel is unnecessarily verbose, and _config_ is
> > > also pointless.
> > >
> > > Returning supported_hash_tunnel_types back to device can also be
> > > useful for debugging.
> > >
> > > How about:
> > >
> > >
> > > struct virtnet_hash_tunnel {
> > >      le32 supported_tunnel_types;
> > >      le32 enabled_tunnel_types;
> > > };
> > >
> > 
> > It's OK.
> > 
> > And:
> > For the GET command, both fields are WO for the device.
> > For the SET command, \field{supported_tunnel_types} is RO for the device and
> > \field{enabled_tunnel_types} is WO for the device.
> > 
> > >
> > > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > > contains the bitmask of encapsulation types supported by the device
> > > for inner header hash; \field{enabled_tunnel_types} contains the value
> > > received in a previous successful call to
> > > VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> > >
> > > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > > contains the value returned by a previous successful call to
> > > VIRTIO_NET_CTRL_HASH_TUNNEL_GET; \field{enabled_tunnel_types}
> > contains
> > > the bitmask of encapsulation types to enable for inner header hash.
> > >
> > > and add normative statements to this end.
> > >
> > >
> > 
> > Ok.
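
To make the single-structure proposal above concrete, here is a small sketch of how one buffer layout could serve both commands: write-only for the device on GET, read-only for the device on SET. The le32 stand-in type, its two conversion helpers, struct tunnel_dev, and the tunnel_get()/tunnel_set() functions are all hypothetical. Whether the device should also inspect supported_tunnel_types on SET is left open in this thread, so the sketch simply does not use it.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the spec's le32 type (hypothetical helpers). */
typedef struct { uint8_t b[4]; } le32;

static uint32_t le32_to_cpu(le32 v)
{
    return (uint32_t)v.b[0] | ((uint32_t)v.b[1] << 8) |
           ((uint32_t)v.b[2] << 16) | ((uint32_t)v.b[3] << 24);
}

static le32 cpu_to_le32(uint32_t x)
{
    le32 v = { { (uint8_t)x, (uint8_t)(x >> 8),
                 (uint8_t)(x >> 16), (uint8_t)(x >> 24) } };
    return v;
}

/* Structure as proposed in this thread. */
struct virtnet_hash_tunnel {
    le32 supported_tunnel_types;
    le32 enabled_tunnel_types;
};
static_assert(sizeof(struct virtnet_hash_tunnel) == 8, "two le32 fields");

/* Hypothetical device state. */
struct tunnel_dev {
    uint32_t supported;
    uint32_t enabled;
};

/* GET: the buffer is write-only for the device; it fills both fields. */
static void tunnel_get(const struct tunnel_dev *dev,
                       struct virtnet_hash_tunnel *buf)
{
    buf->supported_tunnel_types = cpu_to_le32(dev->supported);
    buf->enabled_tunnel_types = cpu_to_le32(dev->enabled);
}

/* SET: the buffer is read-only for the device; only enabled_tunnel_types
 * changes device state. Returns 0 on success, 1 on error. */
static int tunnel_set(struct tunnel_dev *dev,
                      const struct virtnet_hash_tunnel *buf)
{
    uint32_t want = le32_to_cpu(buf->enabled_tunnel_types);

    if (want & ~dev->supported)
        return 1;
    dev->enabled = want;
    return 0;
}
```

With this shape a driver can issue GET, change only enabled_tunnel_types, and hand the same buffer back with SET.
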
> > 
> > > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > > +
> > > > +
> > > > +Field \field{supported_hash_tunnel_types} provided by the device
> > indicates that the device supports inner header hash for these encapsulation
> > types.
> > > > +Field \field{supported_hash_tunnel_types} contains the bitmask of
> > encapsulation types supported for inner header hash.
> > >
> > > We don't need these two sentences. Just second one will do.
> > >
> > 
> > Ok.
> > 
> > > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> > Encapsulation types supported/enabled for inner header hash}.
> > > > +
> > > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation
> > types enabled for inner header hash.
> > >
> > > They have different meanings for set and get though.
> > >
> > >
> > > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> > Encapsulation types supported/enabled for inner header hash}.
> > > > +
> > > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > > +\begin{itemize}
> > > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set
> > \field{hash_tunnel_types} for the device using the
> > > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the
> > device.
> > > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get
> > \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > > +      from the device using the virtnet_hash_tunnel_config_get structure,
> > which is write-only for the device.
> > > > +\end{itemize}
> > > > +
> > > > +\subparagraph{Encapsulated packet}
> > > > +\label{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming
> > > > +packets / Encapsulated packet}
> > > > +
> > > > +Multiple tunneling protocols allow encapsulating an inner, payload packet
> > in an outer, encapsulated packet.
> > > > +The encapsulated packet thus contains an outer header and an inner
> > > > +header, and the device calculates the hash over either the inner header or
> > the outer header.
> > > > +
> > > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received
> > > > +encapsulated packet's outer header matches one of the encapsulation
> > > > +types enabled in \field{hash_tunnel_types}, then the device uses the inner
> > header for hash calculations (only a single level of encapsulation is currently
> > supported).
> > > > +
> > > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's
> > > > +(outer) header does not match any types enabled in
> > \field{hash_tunnel_types}, then the device uses the outer header for hash
> > calculations.
> > > > +
> > > > +Initially all encapsulation types are disabled (the value of
> > > > +\field{hash_tunnel_types} is 0) for inner header hash before any
> > VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> > >
> > > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET) all
> > > encapsulation types are disabled
> > >
> > 
> > Ok.
> > 
> > > > +
> > > > +Encapsulation types supported/enabled for inner header hash:
> > > > +\begin{itemize}
> > > > +    \item The outer header of the following encapsulation types does not
> > contain the transport protocol:
> > > > +        \begin{enumerate}
> > > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and
> > the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header
> > is over IPv4 and the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header
> > is over IPv4 and the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header
> > is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +        \end{enumerate}
> > > > +
> > > > +    \item The outer header of the following encapsulation types uses UDP as
> > the transport protocol:
> > > > +        \begin{enumerate}
> > > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is
> > over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer
> > header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +        \end{enumerate}
> > > > +\end{itemize}
> > > > +
> > > > +\subparagraph{Encapsulation types supported/enabled for inner
> > > > +header hash} \label{sec:Device Types / Network Device / Device
> > > > +Operation / Processing of Incoming Packets / Hash calculation for
> > > > +incoming packets / Encapsulation types supported/enabled for inner
> > > > +header hash}
> > > > +
> > > > +Encapsulation types applicable for inner header hash:
> > > > +\begin{lstlisting}
> > > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > > +
> > > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > > +
> > > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > > +
> > > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation
> > type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > > +
> > > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > > +
> > > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > > +
> > > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > > +
> > > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > > +
> > > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > > +\end{lstlisting}
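
Putting the bit definitions above together with the inner/outer rule from the "Encapsulated packet" text, the per-packet decision could be sketched as below. This assumes a hypothetical parser that reports the recognized encapsulation as a single VIRTIO_NET_HASH_TUNNEL_TYPE_* bit (or 0 when the packet is not a recognized tunnel); use_inner_header() is likewise a made-up helper name and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as listed in the patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)

/*
 * Decide which header feeds the hash for one received packet.
 * pkt_tunnel_type is the (hypothetical) parser result: one TYPE_* bit,
 * or 0 if the outer header is not a recognized encapsulation.
 */
static bool use_inner_header(bool hash_tunnel_negotiated,
                             uint32_t hash_tunnel_types,
                             uint32_t pkt_tunnel_type)
{
    /* The parser is expected to report at most one encapsulation bit. */
    assert((pkt_tunnel_type & (pkt_tunnel_type - 1)) == 0);

    if (!hash_tunnel_negotiated)
        return false; /* feature not negotiated: hash the outer header */

    /* Inner header only if this encapsulation type is enabled. */
    return (hash_tunnel_types & pkt_tunnel_type) != 0;
}
```

With hash_tunnel_types at its initial value of 0, every packet falls back to the outer header, matching the "initially disabled" behaviour described in the patch.
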
> > > > +
> > > > +\subparagraph{Advice}
> > > > +Example uses of inner header hash:
> > > > +\begin{itemize}
> > > > +\item Legacy tunneling protocols, lacking outer header entropy, can use
> > RSS with inner header hash to
> > > > +      distribute flows with identical outer but different inner headers across
> > various queues, improving performance.
> > > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > > +\end{itemize}
> > > > +
> > > > +As using the inner header hash completely discards the outer header
> > > > +entropy, care must be taken if the inner header is controlled by an
> > > > +adversary, as the adversary can then intentionally create configurations
> > with insufficient entropy.
> > > > +
> > > > +Besides disabling inner header hash, mitigations would depend on:
> > > > +\begin{itemize}
> > > > +\item Use a tool with good forwarding performance to keep the receive
> > queue from dropping packets.
> > >
> > > this is quite vague
> > >
> > 
> > Overly specific advice would be too narrow, as we discussed a long time
> > ago.
> > 
> > > > +\item If the QoS (Quality of service) is unavailable, the driver can set
> > \field{hash_tunnel_types} to 0
> > > > +      to disable inner header hash for all encapsulated packets.
> > >
> > > this is precisely disabling
> > 
> > You mean that setting \field{hash_tunnel_types} to 0 is itself disabling inner header hash?
> > 
> > >
> > > > +\item Perform appropriate QoS before packets consume the receive
> > buffers of the receive queues.
> > >
> > > it is not at all clear how would devices do this.
> > >
> > 
> > The reason we're describing it broadly here is that this is done by devices, which
> > usually know what to do, and can also take actions off-device, such as firewalls
> > off-device, etc.
> > 
> > > > +\end{itemize}
> > >
> > > Oh sorry I didn't complete the sentence :( I suggest dropping above
> > > and having something like:
> > >
> > > 	Besides disabling inner header hash, mitigations would depend on how
> > the
> > > 	hash is used, and the consequences of a successful attack.
> > > 	For example, if the attack causes packet drops, using a deeper queue
> > > 	might be able to mitigate it.
> > 
> > Ok, I got you now!
> > 
> > >
> > >
> > >
> > > > +
> > > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > > +Header Hash}
> > > > +
> > > > +If the (outer) header of the received packet does not match any
> > > > +value
> > >
> > > any encapsulation type
> > 
> > Ok. And 'any bits' instead of 'any value' I think.
> > 
> > >
> > > > enabled in \field{hash_tunnel_types},
> > > > +the device MUST calculate the hash on the outer header.
> > > > +
> > > > +If the device receives an unsupported or unrecognized value for
> > > > +\field{hash_tunnel_types}, it MUST respond to the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> > >
> > > let's be specific. if any bits in hash_tunnel_types are not set in
> > > supported_hash_tunnel_types
> > 
> > Ok.
> > 
> > >
> > > > +
> > > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
> > >
> > > what does this mean even?
> > 
> > When the driver uses the GET command, the device should preferably return
> > the corresponding value.
> > 
> > >
> > > > +
> > > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> > > > +
> > > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > > +Header Hash}
> > > > +
> > > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL
> > > > +feature when issuing commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > > +
> > > > +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> > > > +
> > > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
> > >
> > > any bits. not any value
> > 
> > Got it.
> > 
> > Thanks.
> > 
> > >
> > > > +
> > > >  \paragraph{Hash reporting for incoming packets}  \label{sec:Device
> > > > Types / Network Device / Device Operation / Processing of Incoming
> > > > Packets / Hash reporting for incoming packets}
> > > >
> > > > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > > > index 54f6783..f88f48b 100644
> > > > --- a/device-types/net/device-conformance.tex
> > > > +++ b/device-types/net/device-conformance.tex
> > > > @@ -14,4 +14,5 @@
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > > +\item \ref{devicenormative:Device Types / Network Device / Device
> > > > +Operation / Control Virtqueue / Inner Header Hash}
> > > >  \end{itemize}
> > > > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > > > index 97d0cc1..9d853d9 100644
> > > > --- a/device-types/net/driver-conformance.tex
> > > > +++ b/device-types/net/driver-conformance.tex
> > > > @@ -14,4 +14,5 @@
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > > +\item \ref{drivernormative:Device Types / Network Device / Device
> > > > +Operation / Control Virtqueue / Inner Header Hash}
> > > >  \end{itemize}
> > > > diff --git a/introduction.tex b/introduction.tex
> > > > index b7155bf..81f07a4 100644
> > > > --- a/introduction.tex
> > > > +++ b/introduction.tex
> > > > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> > > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve
> > Cryptography'', Version 1.0, September 2000.
> > > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > > >
> > > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4
> > and used as either the payload or delivery protocol.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}.
> > This protocol describes extensions by which two fields, Key and
> > > > +    Sequence Number, can be optionally carried in the GRE Header
> > \ref{intro:gre_rfc2784}.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is
> > specified for IPv6 and used as either the payload or
> > > > +    delivery protocol. Note that this does not change the GRE header format
> > or any behaviors specified by RFC 2784 or RFC 2890.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-
> > UDP]} &
> > > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating
> > network protocol packets within GRE and UDP headers.
> > > > +    This protocol is specified for IPv4 and IPv6, and used as either the
> > payload or delivery protocol.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > > +    Virtual eXtensible Local Area Network.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > > +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
> > > > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > > +    Generic Network Virtualization Encapsulation.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > > +    IP Encapsulation within IP.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > > +    INTERNET PROTOCOL
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > > +    User Datagram Protocol
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > > +    TRANSMISSION CONTROL PROTOCOL
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > > >  \end{longtable}
> > > >
> > > >  \section{Non-Normative References}
> > > > --
> > > > 2.19.1.6.gb485710b
> > >
> > >
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-21 19:25         ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:25 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 05:52:28PM +0000, Parav Pandit wrote:
> 
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, June 21, 2023 12:46 PM
> 
> 
> > SET carries an additional 32 bits of information. But if you think this will make
> > the overall structure more concise, I'm ok.
> >
> If it is placed in a single structure, then it needs to be reworded to remove the WO or RO notion.

Not really. For any given command, all of the structure is either RO or WO.

> This also requires additional software to mark the DMA attributes as RW when mapping this area.
> And extra text to indicate that supported_hash_tunnel_types is to be ignored on the SET command.
> 
> Two structures are cleaner and serve the purpose better.

No, because all of the structure is either RO or WO.

> > > I also think hash_tunnel is unnecessarily verbose, and _config_ is
> > > also pointless.
> > >
> > > Returning supported_hash_tunnel_types back to device can also be
> > > useful for debugging.
> > >
> > > How about:
> > >
> > >
> > > struct virtnet_hash_tunnel {
> > >      le32 supported_tunnel_types;
> > >      le32 enabled_tunnel_types;
> > > };
> > >
> > 
> > It's OK.
> > 
> > And:
> > For the GET command, both fields are WO for the device.
> > For the SET command, \field{supported_tunnel_types} is RO for the device and
> > \field{enabled_tunnel_types} is WO for the device.
> > 
> > >
> > > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > > contains the bitmask of encapsulation types supported by the device
> > > for inner header hash; \field{enabled_tunnel_types} contains the value
> > > received in a previous successful call to
> > > VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> > >
> > > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > > contains the value returned by a previous successful call to
> > > VIRTIO_NET_CTRL_HASH_TUNNEL_GET; \field{enabled_tunnel_types}
> > contains
> > > the bitmask of encapsulation types to enable for inner header hash.
> > >
> > > and add normative statements to this end.
> > >
> > >
> > 
> > Ok.
> > 
> > > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > > +
> > > > +
> > > > +Field \field{supported_hash_tunnel_types} provided by the device
> > indicates that the device supports inner header hash for these encapsulation
> > types.
> > > > +Field \field{supported_hash_tunnel_types} contains the bitmask of
> > encapsulation types supported for inner header hash.
> > >
> > > We don't need these two sentences. Just second one will do.
> > >
> > 
> > Ok.
> > 
> > > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> > Encapsulation types supported/enabled for inner header hash}.
> > > > +
> > > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation
> > types enabled for inner header hash.
> > >
> > > They have different meanings for set and get though.
> > >
> > >
> > > > +See \ref{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming packets /
> > Encapsulation types supported/enabled for inner header hash}.
> > > > +
> > > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > > +\begin{itemize}
> > > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> > > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> > > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > > +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> > > > +\end{itemize}
> > > > +
> > > > +\subparagraph{Encapsulated packet}
> > > > +\label{sec:Device Types / Network Device / Device Operation /
> > > > +Processing of Incoming Packets / Hash calculation for incoming
> > > > +packets / Encapsulated packet}
> > > > +
> > > > +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> > > > +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the hash over either the inner header or the outer header.
> > > > +
> > > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash calculations (only a single level of encapsulation is currently supported).
> > > > +
> > > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> > > > +
> > > > +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> > >
> > > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET) all
> > > encapsulation types are disabled
> > >
> > 
> > Ok.
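
A minimal sketch of the inner/outer selection rule quoted above. It assumes the device's parser has already classified the outer header into a single VIRTIO_NET_HASH_TUNNEL_TYPE_* bit (or 0 when the packet is not a recognized encapsulation). The enum and the function name are hypothetical and not part of the patch; only the two bit values are copied from it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the patch's VIRTIO_NET_HASH_TUNNEL_TYPE_* list. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 6)

/* Hypothetical result type: which header the hash is computed over. */
enum hash_source { HASH_OUTER_HEADER, HASH_INNER_HEADER };

/*
 * hash_tunnel_negotiated: VIRTIO_NET_F_HASH_TUNNEL was negotiated.
 * hash_tunnel_types:      currently enabled bitmask (0 until the first SET).
 * pkt_tunnel_type:        assumed parser output, one type bit or 0.
 */
static enum hash_source select_hash_source(bool hash_tunnel_negotiated,
                                           uint32_t hash_tunnel_types,
                                           uint32_t pkt_tunnel_type)
{
    /* Inner header only when the packet's type is enabled. */
    if (hash_tunnel_negotiated && (pkt_tunnel_type & hash_tunnel_types))
        return HASH_INNER_HEADER;

    /* Everything else, including the initial all-disabled state. */
    return HASH_OUTER_HEADER;
}
```

With \field{hash_tunnel_types} still at its initial value of 0, every packet takes the outer-header path, which matches the "initially disabled" wording.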
> > 
> > > > +
> > > > +Encapsulation types supported/enabled for inner header hash:
> > > > +\begin{itemize}
> > > > +    \item The outer header of the following encapsulation types does not
> > contain the transport protocol:
> > > > +        \begin{enumerate}
> > > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and
> > the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header
> > is over IPv4 and the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header
> > is over IPv4 and the inner header is over IPv4.
> > > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header
> > is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +        \end{enumerate}
> > > > +
> > > > +    \item The outer header of the following encapsulation types uses UDP as
> > the transport protocol:
> > > > +        \begin{enumerate}
> > > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over
> > IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is
> > over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer
> > header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > > +        \end{enumerate}
> > > > +\end{itemize}
> > > > +
> > > > +\subparagraph{Encapsulation types supported/enabled for inner
> > > > +header hash} \label{sec:Device Types / Network Device / Device
> > > > +Operation / Processing of Incoming Packets / Hash calculation for
> > > > +incoming packets / Encapsulation types supported/enabled for inner
> > > > +header hash}
> > > > +
> > > > +Encapsulation types applicable for inner header hash:
> > > > +\begin{lstlisting}
> > > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > > +
> > > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > > +
> > > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > > +
> > > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > > +
> > > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > > +
> > > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > > +
> > > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > > +
> > > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > > +
> > > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > > +\end{lstlisting}
> > > > +
> > > > +\subparagraph{Advice}
> > > > +Example uses of inner header hash:
> > > > +\begin{itemize}
> > > > +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> > > > +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> > > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > > +\end{itemize}
> > > > +
> > > > +As using the inner header hash completely discards the outer header
> > > > +entropy, care must be taken if the inner header is controlled by an
> > > > +adversary, as the adversary can then intentionally create configurations with insufficient entropy.
> > > > +
> > > > +Besides disabling inner header hash, mitigations would depend on:
> > > > +\begin{itemize}
> > > > +\item Use a tool with good forwarding performance to keep the receive
> > queue from dropping packets.
> > >
> > > this is quite vague
> > >
> > 
> > Giving overly specific advice would be too restrictive, as we discussed a
> > long time ago.
> > 
> > > > +\item If the QoS (Quality of service) is unavailable, the driver can set
> > \field{hash_tunnel_types} to 0
> > > > +      to disable inner header hash for all encapsulated packets.
> > >
> > > this is precisely disabling
> > 
> > Do you mean that setting \field{hash_tunnel_types} to 0 is itself disabling inner header hash?
> > 
> > >
> > > > +\item Perform appropriate QoS before packets consume the receive
> > buffers of the receive queues.
> > >
> > > it is not at all clear how would devices do this.
> > >
> > 
> > We describe this broadly here because it is done by devices, which usually
> > know what to do and can also rely on off-device measures, such as
> > firewalls.
> > 
> > > > +\end{itemize}
> > >
> > > Oh sorry I didn't complete the sentence :( I suggest dropping above
> > > and having something like:
> > >
> > > 	Besides disabling inner header hash, mitigations would depend on how the
> > > 	hash is used, and the consequences of a successful attack.
> > > 	For example, if the attack causes packet drops, using a deeper queue
> > > 	might be able to mitigate it.
> > 
> > Ok, I got you now!
> > 
> > >
> > >
> > >
> > > > +
> > > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > > +Header Hash}
> > > > +
> > > > +If the (outer) header of the received packet does not match any
> > > > +value
> > >
> > > any encapsulation type
> > 
> > Ok. And 'any bits' instead of 'any value' I think.
> > 
> > >
> > > > enabled in \field{hash_tunnel_types},
> > > > +the device MUST calculate the hash on the outer header.
> > > > +
> > > > +If the device receives an unsupported or unrecognized value for
> > > > +\field{hash_tunnel_types}, it MUST respond to the
> > VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> > >
> > > let's be specific. if any bits in hash_tunnel_types are not set in
> > > supported_hash_tunnel_types
> > 
> > Ok.
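
To make the "any bits not set in supported_hash_tunnel_types" wording concrete, here is a rough device-side sketch of the SET check. The state structure and handler name are hypothetical; VIRTIO_NET_OK (0) and VIRTIO_NET_ERR (1) are the existing control-virtqueue ack values.

```c
#include <stdint.h>

#define VIRTIO_NET_OK   0
#define VIRTIO_NET_ERR  1

/* Hypothetical per-device state for inner header hash. */
struct hash_tunnel_state {
    uint32_t supported_hash_tunnel_types; /* fixed by the device */
    uint32_t hash_tunnel_types;           /* enabled types, 0 after reset */
};

/* Handle VIRTIO_NET_CTRL_HASH_TUNNEL_SET with the driver's requested mask. */
static uint8_t handle_hash_tunnel_set(struct hash_tunnel_state *st,
                                      uint32_t requested)
{
    /* Reject the command if any requested bit is not supported. */
    if (requested & ~st->supported_hash_tunnel_types)
        return VIRTIO_NET_ERR;

    /* A mask of 0 is always accepted and disables inner header hash. */
    st->hash_tunnel_types = requested;
    return VIRTIO_NET_OK;
}
```

In this sketch a rejected command leaves the previously enabled types untouched; the patch text quoted here does not say so explicitly, so treat that as an assumption.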
> > 
> > >
> > > > +
> > > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST
> > provide the values for \field{supported_hash_tunnel_types}.
> > >
> > > what does this mean even?
> > 
> > When the driver uses the GET command, the device should preferably return
> > the corresponding value.
> > 
> > >
> > > > +
> > > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device
> > MUST disable inner header hash for all encapsulation types.
> > > > +
> > > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types /
> > > > +Network Device / Device Operation / Control Virtqueue / Inner
> > > > +Header Hash}
> > > > +
> > > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL
> > > > +feature when issuing commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> > and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > > +
> > > > +The driver MUST ignore the values received from the
> > VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with
> > VIRTIO_NET_ERR.
> > > > +
> > > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is
> > not set in \field{supported_hash_tunnel_types}.
> > >
> > > any bits. not any value
> > 
> > Got it.
> > 
> > Thanks.
> > 
> > >
> > > > +
> > > >  \paragraph{Hash reporting for incoming packets}  \label{sec:Device
> > > > Types / Network Device / Device Operation / Processing of Incoming
> > > > Packets / Hash reporting for incoming packets}
> > > >
> > > > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > > > index 54f6783..f88f48b 100644
> > > > --- a/device-types/net/device-conformance.tex
> > > > +++ b/device-types/net/device-conformance.tex
> > > > @@ -14,4 +14,5 @@
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> > > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > > +\item \ref{devicenormative:Device Types / Network Device / Device
> > > > +Operation / Control Virtqueue / Inner Header Hash}
> > > >  \end{itemize}
> > > > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > > > index 97d0cc1..9d853d9 100644
> > > > --- a/device-types/net/driver-conformance.tex
> > > > +++ b/device-types/net/driver-conformance.tex
> > > > @@ -14,4 +14,5 @@
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> > > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > > +\item \ref{drivernormative:Device Types / Network Device / Device
> > > > +Operation / Control Virtqueue / Inner Header Hash}
> > > >  \end{itemize}
> > > > diff --git a/introduction.tex b/introduction.tex
> > > > index b7155bf..81f07a4 100644
> > > > --- a/introduction.tex
> > > > +++ b/introduction.tex
> > > > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> > > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve
> > Cryptography'', Version 1.0, September 2000.
> > > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > > >
> > > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4
> > and used as either the payload or delivery protocol.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}.
> > This protocol describes extensions by which two fields, Key and
> > > > +    Sequence Number, can be optionally carried in the GRE Header
> > \ref{intro:gre_rfc2784}.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is
> > specified for IPv6 and used as either the payload or
> > > > +    delivery protocol. Note that this does not change the GRE header format
> > or any behaviors specified by RFC 2784 or RFC 2890.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> > > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating
> > network protocol packets within GRE and UDP headers.
> > > > +    This protocol is specified for IPv4 and IPv6, and used as either the
> > payload or delivery protocol.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > > +    Virtual eXtensible Local Area Network.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > > +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
> > > > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > > +    Generic Network Virtualization Encapsulation.
> > > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > > +    IP Encapsulation within IP.
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > > +    INTERNET PROTOCOL
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > > +    User Datagram Protocol
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > > +    TRANSMISSION CONTROL PROTOCOL
> > > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > > >  \end{longtable}
> > > >
> > > >  \section{Non-Normative References}
> > > > --
> > > > 2.19.1.6.gb485710b
> > >
> > >
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > >
> > > In order to verify user consent to the Feedback License terms and to
> > > minimize spam in the list archive, subscription is required before
> > > posting.
> > >
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License:
> > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines:
> > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:25         ` Michael S. Tsirkin
@ 2023-06-21 19:28           ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 19:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 21, 2023 3:26 PM
> 
> On Wed, Jun 21, 2023 at 05:52:28PM +0000, Parav Pandit wrote:
> >
> > > From: Heng Qi <hengqi@linux.alibaba.com>
> > > Sent: Wednesday, June 21, 2023 12:46 PM
> >
> >
> > > SET carries an additional 32 bits of information. But if you think
> > > this will make the overall structure more concise, I'm ok.
> > >
> > If it is placed in a single structure then it needs to be reworded to remove
> > the WO or RO notion.
> 
> Not really. all of structure is RO and WO.
>
For the device, what is the meaning of the driver writing supported_tunnel_types in the SET command?
Should the device ignore it? It doesn't make sense to pass something only to have it ignored.
Two structures are cleaner.
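
For readers following the one-structure versus two-structure argument, a rough sketch of both layouts. The two structure names and the field names appear in the patch and in the review comments, but the exact field order and the single-structure name are assumptions made here for illustration; plain uint32_t stands in for le32 and byte-order handling is omitted.

```c
#include <stdint.h>

/* Two-structure variant, as in the patch under review (layout assumed). */
struct virtnet_hash_tunnel_config_set {
    uint32_t hash_tunnel_types;           /* driver -> device, read-only for the device */
};

struct virtnet_hash_tunnel_config_get {
    uint32_t supported_hash_tunnel_types; /* device -> driver, write-only for the device */
    uint32_t hash_tunnel_types;           /* device -> driver, currently enabled */
};

/*
 * Single-structure variant suggested in review (hypothetical name).
 * The whole buffer is device-readable for SET and device-writable for GET,
 * so SET carries supported_tunnel_types even though only the enabled mask
 * changes device state.
 */
struct virtnet_hash_tunnel_config {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};
```

Under these assumed layouts the single structure makes SET 32 bits larger than in the two-structure variant, which is the "additional 32 bits" mentioned earlier in the thread.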
 
> > This also requires additional sw to indicate dma attributes to be RW when
> > mapping this area.
> > And extra text to indicate that supported_hash_tunnel_types is to be ignored
> > on the set command.
> >
> > Two structures are cleaner and serve their purpose.
> 
> No because all of structure is RO or WO.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 16:46     ` Heng Qi
@ 2023-06-21 19:32       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:32 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 12:46:06AM +0800, Heng Qi wrote:
> On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> > > 1. Currently, a received encapsulated packet has an outer and an inner header, but
> > > the virtio device is unable to calculate the hash for the inner header. The same
> > > flow can traverse through different tunnels, resulting in the encapsulated
> > > packets being spread across multiple receive queues (refer to the figure below).
> > > However, in certain scenarios, we may need to direct these encapsulated packets of
> > > the same flow to a single receive queue. This facilitates the processing
> > > of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> > > 
> > >                client1                    client2
> > >                   |        +-------+         |
> > >                   +------->|tunnels|<--------+
> > >                            +-------+
> > >                               |  |
> > >                               v  v
> > >                       +-----------------+
> > >                       | monitoring host |
> > >                       +-----------------+
> > > 
> > > To achieve this, the device can calculate a symmetric hash based on the inner headers
> > > of the same flow.
> > > 
> > > 2. For legacy systems, they may lack entropy fields which modern protocols have in
> > > the outer header, resulting in multiple flows with the same outer header but
> > > different inner headers being directed to the same receive queue. This results in
> > > poor receive performance.
> > > 
> > > To address this limitation, inner header hash can be used to enable the device to advertise
> > > the capability to calculate the hash for the inner packet, regaining better receive performance.
> > > 
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> > > 
> > 
> > don't put an empty line here
> 
> Ok. Will remove it.
> 
> > 
> > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > 
> > 
> > ok almost there. small corrections, and one enhancement suggestion.
> > 
> > 
> > > ---
> > > v17->v18:
> > > 	1. Some rewording suggestions from Michael (Thanks!).
> > > 	2. Use 0 to disable inner header hash and remove
> > > 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> > > v16->v17:
> > > 	1. Some small rewrites. @Parav Pandit
> > > 	2. Add Parav's Reviewed-by tag (Thanks!).
> > > 
> > > v15->v16:
> > > 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> > > 	   configuration, the ability to configure the outer src udp port hash is given
> > > 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> > > 	   RSS capability extension topic (considered as an RSS extension together
> > > 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> > > 	2. Fix a 'field' typo. @Parav Pandit
> > > 
> > > v14->v15:
> > > 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> > > 	2. Adjust some descriptions.
> > > 
> > > v13->v14:
> > > 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> > > 	2. Rebase to master branch.
> > > 	3. Some minor modifications.
> > > 
> > > v12->v13:
> > > 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> > > 	2. Add tunneling protocol explanation. @Jason Wang
> > > 	3. Add comments on some usage scenarios for inner hash.
> > > 
> > > v11->v12:
> > > 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> > > 	2. Refine the commit log. @Michael S . Tsirkin
> > > 	3. Add some tunnel types.
> > > 
> > > v10->v11:
> > > 	1. Revise commit log for clarity for readers.
> > > 	2. Some modifications to avoid undefined terms. @Parav Pandit
> > > 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> > > 	4. Add the normative statements. @Parav Pandit
> > > 
> > > v9->v10:
> > > 	1. Removed hash_report_tunnel related information. @Parav Pandit
> > > 	2. Re-describe the limitations of QoS for tunneling.
> > > 	3. Some clarification.
> > > 
> > > v8->v9:
> > > 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> > > 	2. Add tunnel security section. @Michael S . Tsirkin
> > > 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> > > 	4. Fix some typos.
> > > 	5. Add more tunnel types. @Michael S . Tsirkin
> > > 
> > > v7->v8:
> > > 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> > > 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> > > 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> > > 	4. Fix some typos. @Michael S . Tsirkin
> > > 	5. Clarify some sentences. @Michael S . Tsirkin
> > > 
> > > v6->v7:
> > > 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > > 	2. Fix some syntax issues. @Michael S. Tsirkin
> > > 
> > > v5->v6:
> > > 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > > 	2. Use encapsulated/encaptulation uniformly. @Michael S. Tsirkin
> > > 	3. Move the links to introduction section. @Michael S. Tsirkin
> > > 	4. Clarify some sentences. @Michael S. Tsirkin
> > > 
> > > v4->v5:
> > > 	1. Clarify some paragraphs. @Cornelia Huck
> > > 	2. Fix the u8 type. @Cornelia Huck
> > > 
> > > v3->v4:
> > > 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > > 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > > 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > > 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > > 
> > > v2->v3:
> > > 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > > 	2. Chang \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > > 
> > > v1->v2:
> > > 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > > 	2. Clarify some paragraphs. @Jason Wang
> > > 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > > 
> > >  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
> > >  device-types/net/device-conformance.tex |   1 +
> > >  device-types/net/driver-conformance.tex |   1 +
> > >  introduction.tex                        |  39 ++++++
> > >  4 files changed, 199 insertions(+)
> > > 
> > > diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> > > index 3030222..9fdccfc 100644
> > > --- a/device-types/net/description.tex
> > > +++ b/device-types/net/description.tex
> > > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> > >  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> > >      channel.
> > >  
> > > +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> > > +
> > >  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> > >  
> > >  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> > > @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> > >  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> > >  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> > >  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> > > +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
> > 
> > I think just "or" is enough.
> 
> Sure. I agree.
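
A small sketch of the dependency as agreed here (VIRTIO_NET_F_CTRL_VQ plus VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT). Bit 51 is the number proposed by this patch; 17, 57 and 60 are the existing CTRL_VQ, HASH_REPORT and RSS bits. The helper name and the FEATURE_BIT macro are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ      17
#define VIRTIO_NET_F_HASH_TUNNEL  51  /* as proposed in this patch */
#define VIRTIO_NET_F_HASH_REPORT  57
#define VIRTIO_NET_F_RSS          60

#define FEATURE_BIT(n) (1ULL << (n))

/* Return true if the feature set satisfies the HASH_TUNNEL requirements. */
static bool hash_tunnel_deps_ok(uint64_t features)
{
    /* Nothing to check when HASH_TUNNEL is not offered or negotiated. */
    if (!(features & FEATURE_BIT(VIRTIO_NET_F_HASH_TUNNEL)))
        return true;

    /* The SET/GET commands travel over the control virtqueue. */
    if (!(features & FEATURE_BIT(VIRTIO_NET_F_CTRL_VQ)))
        return false;

    /* "or" already covers the case where both are present. */
    return (features & (FEATURE_BIT(VIRTIO_NET_F_RSS) |
                        FEATURE_BIT(VIRTIO_NET_F_HASH_REPORT))) != 0;
}
```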
> 
> > 
> > >  \end{description}
> > >  
> > >  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > > @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  If the feature VIRTIO_NET_F_RSS was negotiated:
> > >  \begin{itemize}
> > >  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> > > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> > 
> > why get and not set? e.g. if I call get then set then the field in set
> > will have effect.
> 
> If the following command sequence:
> 1. The driver sets hash_tunnel_types using the SET command and saves it
> somewhere e.g. virtnet_info->hash_tunnel_types_saved.

This clearly does not work: even with the current structure, you need to do a
GET first, otherwise you have no idea what is supported.

> 2. The driver fetches hash_tunnel_types using the GET command, which should
> be equal to virtnet_info->hash_tunnel_types_saved (or even returned from
> virtnet_info->hash_tunnel_types_saved, like RSS).

why? for debugging?

> 3. The driver sets hash_tunnel_types again using the SET command and saves
> it in virtnet_info->hash_tunnel_types_saved.

I don't get it, why set it again?

> But I think your enhanced proposal is also feasible; after all, features
> such as RSS have so far worked very well with only a SET command.

Well RSS is designed (imho) better since it keeps the supported types in
config space. Doing it here would have removed the need for GET command.
Yes I know Parav hates config space, no I don't think for a read-only
field like this one this hate is justified.
Did we discuss this and decide not to add it in config space
for some reason? I don't remember ...


> > 
> > 
> > >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
> > >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
> > >  \end{itemize}
> > > @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  If the feature VIRTIO_NET_F_RSS was not negotiated:
> > >  \begin{itemize}
> > >  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> > > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> > 
> > same
> > 
> > >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
> > >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
> > >  \end{itemize}
> > > @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
> > >  \end{itemize}
> > >  
> > > +The per-packet hash calculation can depend on the IP packet type. See
> > > +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> > 
> > and end paragraph here.
> 
> Is adding a blank line here? --!

yes

> > 
> > >  \subparagraph{Supported/enabled hash types}
> > >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> > >  Hash types applicable for IPv4 packets:
> > > @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> > >  \end{itemize}
> > >  
> > > +\paragraph{Inner Header Hash}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> > > +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> > > +
> > > +struct virtnet_hash_tunnel_config_set {
> > > +    le32 hash_tunnel_types;
> > > +};
> > > +
> > > +struct virtnet_hash_tunnel_config_get {
> > > +    le32 supported_hash_tunnel_types;
> > > +    le32 hash_tunnel_types;
> > > +};
> > > +
> > 
> > It would be cleaner to have a single structure for both.
> > 
> 
> SET carries an additional 32 bits of information. But if you think this
> will make the overall structure more concise, I'm ok.

I think so yes.

> > I also think hash_tunnel is unnecessarily verbose, and _config_ is also
> > pointless.
> > 
> > Returning supported_hash_tunnel_types back to device can also
> > be useful for debugging.
> > 
> > How about:
> > 
> > 
> > struct virtnet_hash_tunnel {
> >      le32 supported_tunnel_types;
> >      le32 enabled_tunnel_types;
> > };
> > 
> 
> It's OK.
> 
> And:
> For the GET command, both fields are WO for the device.
> For the SET command, \field{supported_tunnel_types} is RO for the device
> and \field{enabled_tunnel_types} is WO for the device.
> 
> > 
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > contains the bitmask of encapsulation types supported
> > by the device for inner header hash; \field{enabled_tunnel_types}
> > contains the value received in a previous successful
> > call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> > 
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > contains the value returned by a previous
> > successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
> > \field{enabled_tunnel_types}
> > contains the bitmask of encapsulation types to enable
> > for inner header hash.
> > 
> > and add normative statements to this end.
> > 
> > 
> 
> Ok.
> 
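
A rough driver-side sketch of the GET/SET semantics proposed above, to make
the intent concrete. It assumes the single struct virtnet_hash_tunnel layout
from this thread; hash_tunnel_build_set() is a hypothetical helper, and the
le32 fields are shown as plain uint32_t with endianness conversion left out.

```c
#include <stdint.h>

/* Layout proposed in this thread. The spec type is le32; plain uint32_t is
 * used here and endianness conversion is omitted (assumption for brevity). */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

/*
 * Hypothetical helper: build a SET request from the reply to a previous
 * successful GET. Returns 0 on success, or -1 if 'wanted' asks for a bit
 * the device did not report as supported (set_req is left untouched).
 */
static int hash_tunnel_build_set(const struct virtnet_hash_tunnel *get_reply,
                                 uint32_t wanted,
                                 struct virtnet_hash_tunnel *set_req)
{
    if (wanted & ~get_reply->supported_tunnel_types)
        return -1;

    /* Echo back the supported mask that GET returned, as suggested above. */
    set_req->supported_tunnel_types = get_reply->supported_tunnel_types;
    set_req->enabled_tunnel_types = wanted;
    return 0;
}
```

The point of the check is that the driver never sends a SET it already knows
the device has to reject.
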
> > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > +
> > > +
> > > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> > 
> > We don't need these two sentences. Just second one will do.
> > 
> 
> Ok.
> 
> > > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
> > 
> > They have different meanings for set and get though.
> > 
> > 
> > > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > +\begin{itemize}
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulated packet}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> > > +
> > > +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> > > +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> > > +hash over either the inner header or the outer header.
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> > > +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> > > +calculations (only a single level of encapsulation is currently supported).
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> > > +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> > > +
> > > +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> > > > +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command is sent by the driver.
> > 
> > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
> > all encapsulation types are disabled 
> > 
> 
> Ok.
> 
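
The inner/outer selection rule described in this subparagraph can be sketched
as below. This is only an illustration: select_hash_header() and the enum are
hypothetical names, and the packet classification that yields pkt_tunnel_type
is assumed to happen elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

enum hash_header { HASH_OUTER_HEADER, HASH_INNER_HEADER };

/*
 * Hypothetical helper. 'pkt_tunnel_type' is the single
 * VIRTIO_NET_HASH_TUNNEL_TYPE_* bit matching the packet's outer header,
 * or 0 if the packet was not recognized as encapsulated.
 */
static enum hash_header select_hash_header(bool hash_tunnel_negotiated,
                                           uint32_t enabled_types,
                                           uint32_t pkt_tunnel_type)
{
    /* Inner header only if the feature is negotiated and the type is enabled. */
    if (hash_tunnel_negotiated && (enabled_types & pkt_tunnel_type))
        return HASH_INNER_HEADER;

    /* Everything else, including the initial all-disabled state, uses outer. */
    return HASH_OUTER_HEADER;
}
```

With enabled_types == 0 (the initial state) every packet falls through to the
outer header, which matches the text above.
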
> > > +
> > > +Encapsulation types supported/enabled for inner header hash:
> > > +\begin{itemize}
> > > +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +
> > > +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> > > +
> > > +Encapsulation types applicable for inner header hash:
> > > +\begin{lstlisting}
> > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > +
> > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > +
> > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > +
> > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > +
> > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > +
> > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > +
> > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > +
> > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > +
> > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > +\end{lstlisting}
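
For reference, a small sketch of how these bits might be handled in code. The
#define values are copied from the listing above; HASH_TUNNEL_TYPE_ALL, the
name table and both helpers are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)

/* Hypothetical convenience mask covering the nine bits defined above. */
#define HASH_TUNNEL_TYPE_ALL ((1u << 9) - 1)

/* Hypothetical name table, indexed by bit position. */
static const char *const hash_tunnel_type_names[] = {
    "GRE_2784", "GRE_2890", "GRE_7676", "GRE_UDP", "VXLAN",
    "VXLAN_GPE", "GENEVE", "IPIP", "NVGRE",
};

/* Returns the name of a single type bit, or NULL if 'bit' is not exactly
 * one of the defined bits. */
static const char *hash_tunnel_type_name(uint32_t bit)
{
    size_t i;
    size_t n = sizeof(hash_tunnel_type_names) / sizeof(hash_tunnel_type_names[0]);

    for (i = 0; i < n; i++)
        if (bit == (1u << i))
            return hash_tunnel_type_names[i];
    return NULL;
}

/* Returns nonzero if 'types' contains only bits defined in this patch. */
static int hash_tunnel_types_known(uint32_t types)
{
    return (types & ~HASH_TUNNEL_TYPE_ALL) == 0;
}
```
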
> > > +
> > > +\subparagraph{Advice}
> > > +Example uses of inner header hash:
> > > +\begin{itemize}
> > > +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> > > +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > +\end{itemize}
> > > +
> > > +As using the inner header hash completely discards the outer header entropy, care must be taken
> > > +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> > > +configurations with insufficient entropy.
> > > +
> > > +Besides disabling inner header hash, mitigations would depend on:
> > > +\begin{itemize}
> > > +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
> > 
> > this is quite vague
> > 
> 
> Overly specific advice would be too restrictive, which we discussed a
> long time ago.
> 
> > > +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> > > +      to disable inner header hash for all encapsulated packets.
> > 
> > this is precisely disabling
> 
> \field{hash_tunnel_types} to 0 disabling inner ?
> 
> > 
> > > +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> > 
> > it is not at all clear how would devices do this.
> > 
> 
> The reason we're describing it broadly here is that this is done by
> devices, which usually know what to do, and can also take actions
> off-device, such as firewalls off-device, etc.
> 
> > > +\end{itemize}
> > 
> > Oh sorry I didn't complete the sentence :(
> > I suggest dropping above and having something like:
> > 
> > 	Besides disabling inner header hash, mitigations would depend on how the
> > 	hash is used, and the consequences of a successful attack.
> > 	For example, if the attack causes packet drops, using a deeper queue
> > 	might be able to mitigate it.
> 
> Ok, I got you now!
> 
> > 
> > 
> > 
> > > +
> > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > > +
> > > +If the (outer) header of the received packet does not match any value
> > 
> > any encapsulation type
> 
> Ok. And 'any bits' instead of 'any value' I think.
> 
> > 
> > > enabled in \field{hash_tunnel_types},
> > > +the device MUST calculate the hash on the outer header.
> > > +
> > > +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> > > +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> > 
> > let's be specific. if any bits in hash_tunnel_types are not set in
> > supported_hash_tunnel_types
> 
> Ok.
> 
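
With that wording, the device-side check becomes a plain mask test. A minimal
sketch, assuming a hypothetical struct hash_tunnel_state held by the device
and that a rejected SET leaves the previous setting in place (the patch does
not say this explicitly). VIRTIO_NET_OK and VIRTIO_NET_ERR are the existing
control virtqueue ack values.

```c
#include <stdint.h>

/* Control virtqueue ack values from the existing spec. */
#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Hypothetical device-side state; names are illustrative only. */
struct hash_tunnel_state {
    uint32_t supported_types; /* fixed by the device implementation */
    uint32_t enabled_types;   /* 0 until a successful SET */
};

/* Handle VIRTIO_NET_CTRL_HASH_TUNNEL_SET: reject any bit that is not set
 * in the supported mask, otherwise apply the requested mask. */
static uint8_t hash_tunnel_handle_set(struct hash_tunnel_state *st,
                                      uint32_t requested)
{
    if (requested & ~st->supported_types)
        return VIRTIO_NET_ERR; /* assumption: previous setting is kept */

    st->enabled_types = requested; /* 0 disables inner header hash */
    return VIRTIO_NET_OK;
}

/* On device reset, inner header hash is disabled for all types. */
static void hash_tunnel_reset(struct hash_tunnel_state *st)
{
    st->enabled_types = 0;
}
```
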
> > 
> > > +
> > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
> > 
> > what does this mean even?
> 
> When the driver uses the GET command, the device should preferably
> return the corresponding value.
> 
> > 
> > > +
> > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> > > +
> > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > > +
> > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> > > +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > +
> > > +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> > > +
> > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
> > 
> > any bits. not any value
> 
> Get.
> 
> Thanks.
> 
> > 
> > > +
> > >  \paragraph{Hash reporting for incoming packets}
> > >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
> > >  
> > > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > > index 54f6783..f88f48b 100644
> > > --- a/device-types/net/device-conformance.tex
> > > +++ b/device-types/net/device-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > > index 97d0cc1..9d853d9 100644
> > > --- a/device-types/net/driver-conformance.tex
> > > +++ b/device-types/net/driver-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/introduction.tex b/introduction.tex
> > > index b7155bf..81f07a4 100644
> > > --- a/introduction.tex
> > > +++ b/introduction.tex
> > > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
> > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > >  
> > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> > > +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> > > +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> > > +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > +    Virtual eXtensible Local Area Network.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > > +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
> > > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > +    Generic Network Virtualization Encapsulation.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > +    IP Encapsulation within IP.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > +    INTERNET PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > +    User Datagram Protocol
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > +    TRANSMISSION CONTROL PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > >  \end{longtable}
> > >  
> > >  \section{Non-Normative References}
> > > -- 
> > > 2.19.1.6.gb485710b
> > 
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org



* Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-21 19:32       ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:32 UTC (permalink / raw)
  To: Heng Qi
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 12:46:06AM +0800, Heng Qi wrote:
> On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
> > > 1. Currently, a received encapsulated packet has an outer and an inner header, but
> > > the virtio device is unable to calculate the hash for the inner header. The same
> > > flow can traverse through different tunnels, resulting in the encapsulated
> > > packets being spread across multiple receive queues (refer to the figure below).
> > > However, in certain scenarios, we may need to direct these encapsulated packets of
> > > the same flow to a single receive queue. This facilitates the processing
> > > of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
> > > 
> > >                client1                    client2
> > >                   |        +-------+         |
> > >                   +------->|tunnels|<--------+
> > >                            +-------+
> > >                               |  |
> > >                               v  v
> > >                       +-----------------+
> > >                       | monitoring host |
> > >                       +-----------------+
> > > 
> > > To achieve this, the device can calculate a symmetric hash based on the inner headers
> > > of the same flow.
> > > 
> > > > 2. Legacy systems may lack the entropy fields that modern protocols carry in
> > > > the outer header, so multiple flows with the same outer header but
> > > > different inner headers are directed to the same receive queue. This results in
> > > > poor receive performance.
> > > 
> > > To address this limitation, inner header hash can be used to enable the device to advertise
> > > the capability to calculate the hash for the inner packet, regaining better receive performance.
> > > 
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> > > 
> > 
> > don't put an empty line here
> 
> Ok. Will remove it.
> 
> > 
> > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > 
> > 
> > ok almost there. small corrections, and one enhancement suggestion.
> > 
> > 
> > > ---
> > > v17->v18:
> > > 	1. Some rewording suggestions from Michael (Thanks!).
> > > 	2. Use 0 to disable inner header hash and remove
> > > 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> > > v16->v17:
> > > 	1. Some small rewrites. @Parav Pandit
> > > 	2. Add Parav's Reviewed-by tag (Thanks!).
> > > 
> > > v15->v16:
> > > 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
> > > 	   configuration, the ability to configure the outer src udp port hash is given
> > > 	   to RSS. This is orthogonal to inner header hash, which will be done in the
> > > 	   RSS capability extension topic (considered as an RSS extension together
> > > 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
> > > 	2. Fix a 'field' typo. @Parav Pandit
> > > 
> > > v14->v15:
> > > 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
> > > 	2. Adjust some descriptions.
> > > 
> > > v13->v14:
> > > 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
> > > 	2. Rebase to master branch.
> > > 	3. Some minor modifications.
> > > 
> > > v12->v13:
> > > 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
> > > 	2. Add tunneling protocol explanation. @Jason Wang
> > > 	3. Add comments on some usage scenarios for inner hash.
> > > 
> > > v11->v12:
> > > 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
> > > 	2. Refine the commit log. @Michael S . Tsirkin
> > > 	3. Add some tunnel types.
> > > 
> > > v10->v11:
> > > 	1. Revise commit log for clarity for readers.
> > > 	2. Some modifications to avoid undefined terms. @Parav Pandit
> > > 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
> > > 	4. Add the normative statements. @Parav Pandit
> > > 
> > > v9->v10:
> > > 	1. Removed hash_report_tunnel related information. @Parav Pandit
> > > 	2. Re-describe the limitations of QoS for tunneling.
> > > 	3. Some clarification.
> > > 
> > > v8->v9:
> > > 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
> > > 	2. Add tunnel security section. @Michael S . Tsirkin
> > > 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
> > > 	4. Fix some typos.
> > > 	5. Add more tunnel types. @Michael S . Tsirkin
> > > 
> > > v7->v8:
> > > 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
> > > 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
> > > 	3. Removed re-definition for inner packet hashing. @Parav Pandit
> > > 	4. Fix some typos. @Michael S . Tsirkin
> > > 	5. Clarify some sentences. @Michael S . Tsirkin
> > > 
> > > v6->v7:
> > > 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > > 	2. Fix some syntax issues. @Michael S. Tsirkin
> > > 
> > > v5->v6:
> > > 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > > > 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> > > 	3. Move the links to introduction section. @Michael S. Tsirkin
> > > 	4. Clarify some sentences. @Michael S. Tsirkin
> > > 
> > > v4->v5:
> > > 	1. Clarify some paragraphs. @Cornelia Huck
> > > 	2. Fix the u8 type. @Cornelia Huck
> > > 
> > > v3->v4:
> > > 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > > 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > > 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > > 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > > 
> > > v2->v3:
> > > 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > > > 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > > 
> > > v1->v2:
> > > 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > > 	2. Clarify some paragraphs. @Jason Wang
> > > 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > > 
> > >  device-types/net/description.tex        | 158 ++++++++++++++++++++++++
> > >  device-types/net/device-conformance.tex |   1 +
> > >  device-types/net/driver-conformance.tex |   1 +
> > >  introduction.tex                        |  39 ++++++
> > >  4 files changed, 199 insertions(+)
> > > 
> > > diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> > > index 3030222..9fdccfc 100644
> > > --- a/device-types/net/description.tex
> > > +++ b/device-types/net/description.tex
> > > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> > >  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> > >      channel.
> > >  
> > > +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
> > > +
> > >  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> > >  
> > >  \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> > > @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> > >  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> > >  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
> > >  \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
> > > +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
> > 
> > I think just or is enough.
> 
> Sure. I agree.
> 
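
With "or", the requirement reads as the check below. This is just a sketch:
bit 51 is taken from this patch, the other bit numbers are the existing
virtio-net feature bits, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ      17
#define VIRTIO_NET_F_HASH_TUNNEL  51 /* proposed in this patch */
#define VIRTIO_NET_F_HASH_REPORT  57
#define VIRTIO_NET_F_RSS          60

/* Hypothetical helper: test a single bit in a 64-bit feature set. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return (features >> bit) & 1;
}

/* Hypothetical check: HASH_TUNNEL requires CTRL_VQ along with RSS or
 * HASH_REPORT. A feature set without HASH_TUNNEL trivially passes. */
static bool hash_tunnel_deps_ok(uint64_t features)
{
    if (!has_feature(features, VIRTIO_NET_F_HASH_TUNNEL))
        return true;

    return has_feature(features, VIRTIO_NET_F_CTRL_VQ) &&
           (has_feature(features, VIRTIO_NET_F_RSS) ||
            has_feature(features, VIRTIO_NET_F_HASH_REPORT));
}
```

Having both RSS and HASH_REPORT set also passes, which is why plain "or" is
enough in the spec text.
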
> > 
> > >  \end{description}
> > >  
> > >  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
> > > @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  If the feature VIRTIO_NET_F_RSS was negotiated:
> > >  \begin{itemize}
> > >  \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
> > > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> > 
> > why get and not set? e.g. if I call get then set then the field in set
> > will have effect.
> 
> Consider the following command sequence:
> 1. The driver sets hash_tunnel_types using the SET command and saves it
> somewhere e.g. virtnet_info->hash_tunnel_types_saved.

This clearly does not work even with the current structure: you need to
do a GET first, otherwise you have no idea what is supported.

> 2. The driver fetches hash_tunnel_types using the GET command, which should
> be equal to virtnet_info->hash_tunnel_types_saved (or even returned from
> virtnet_info->hash_tunnel_types_saved, like RSS).

why? for debugging?

> 3. The driver sets hash_tunnel_types again using the SET command and saves
> it in virtnet_info->hash_tunnel_types_saved.

I don't get it, why set it again?

> But I think your enhanced proposal is also feasible; after all, RSS and
> similar features have only a SET command and work very well.

Well RSS is designed (imho) better since it keeps the supported types in
config space. Doing it here would have removed the need for GET command.
Yes I know Parav hates config space, no I don't think for a read-only
field like this one this hate is justified.
Did we discuss this and decide not to add it to config space
for some reason? I don't remember ...


> > 
> > 
> > >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
> > >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
> > >  \end{itemize}
> > > @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  If the feature VIRTIO_NET_F_RSS was not negotiated:
> > >  \begin{itemize}
> > >  \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
> > > +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
> > > +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
> > 
> > same
> > 
> > >  \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
> > >  \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
> > >  \end{itemize}
> > > @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
> > >  \end{itemize}
> > >  
> > > +The per-packet hash calculation can depend on the IP packet type. See
> > > +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
> > 
> > and end paragraph here.
> 
> Do you mean adding a blank line here?

yes

> > 
> > >  \subparagraph{Supported/enabled hash types}
> > >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
> > >  Hash types applicable for IPv4 packets:
> > > @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
> > >  (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
> > >  \end{itemize}
> > >  
> > > +\paragraph{Inner Header Hash}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
> > > +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
> > > +
> > > +struct virtnet_hash_tunnel_config_set {
> > > +    le32 hash_tunnel_types;
> > > +};
> > > +
> > > +struct virtnet_hash_tunnel_config_get {
> > > +    le32 supported_hash_tunnel_types;
> > > +    le32 hash_tunnel_types;
> > > +};
> > > +
> > 
> > It would be cleaner to have a single structure for both.
> > 
> 
> SET carries an additional 32 bits of information. But if you think this
> will make the overall structure more concise, I'm ok.

I think so yes.

> > I also think hash_tunnel is unnecessarily verbose, and _config_ is also
> > pointless.
> > 
> > Returning supported_hash_tunnel_types back to device can also
> > be useful for debugging.
> > 
> > How about:
> > 
> > 
> > struct virtnet_hash_tunnel {
> >      le32 supported_tunnel_types;
> >      le32 enabled_tunnel_types;
> > };
> > 
> 
> It's OK.
> 
> And:
> For the GET command, both fields are WO for the device.
> For the SET command, \field{supported_tunnel_types} is RO for the device
> and \field{enabled_tunnel_types} is WO for the device.
> 
> > 
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
> > contains the bitmask of encapsulation types supported
> > by the device for inner header hash; \field{enabled_tunnel_types}
> > contains the value received in a previous successful
> > call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
> > 
> > For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
> > contains the value returned by a previous
> > successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
> > \field{enabled_tunnel_types}
> > contains the bitmask of encapsulation types to enable
> > for inner header hash.
> > 
> > and add normative statements to this end.
> > 
> > 
> 
> Ok.
> 
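
To make the field roles of the single structure concrete, here is a
hedged C sketch. The structure and field names are the ones proposed
above; struct tunnel_state, device_fill_get() and driver_build_set()
are hypothetical helpers introduced only for illustration, and le32
conversion is omitted.

```c
#include <stdint.h>

/* Single layout proposed in the review (endianness conversion omitted). */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types;
    uint32_t enabled_tunnel_types;
};

/* Hypothetical device-side state. */
struct tunnel_state {
    uint32_t supported;  /* encapsulation types the device can parse */
    uint32_t enabled;    /* value from the last successful SET, initially 0 */
};

/* GET: the whole structure is written by the device. */
static void device_fill_get(const struct tunnel_state *st,
                            struct virtnet_hash_tunnel *out)
{
    out->supported_tunnel_types = st->supported;
    out->enabled_tunnel_types = st->enabled;
}

/* SET: the whole structure is read by the device. The driver echoes the
 * supported mask from its last successful GET and supplies the new
 * enabled mask. */
static struct virtnet_hash_tunnel
driver_build_set(const struct virtnet_hash_tunnel *last_get, uint32_t enable)
{
    struct virtnet_hash_tunnel req;

    req.supported_tunnel_types = last_get->supported_tunnel_types;
    req.enabled_tunnel_types = enable;
    return req;
}
```

With this split, no buffer mixes device-readable and device-writable
fields within a single command.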
> > > +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
> > > + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
> > > +
> > > +
> > > +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
> > > +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
> > 
> > We don't need these two sentences. Just second one will do.
> > 
> 
> Ok.
> 
> > > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
> > 
> > They have different meanings for set and get though.
> > 
> > 
> > > +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
> > > +
> > > +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
> > > +\begin{itemize}
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
> > > +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
> > > +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
> > > +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulated packet}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
> > > +
> > > +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
> > > +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
> > > +hash over either the inner header or the outer header.
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
> > > +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
> > > +calculations (only a single level of encapsulation is currently supported).
> > > +
> > > +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
> > > +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
> > > +
> > > +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
> > > +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
> > 
> > Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
> > all encapsulation types are disabled 
> > 
> 
> Ok.
> 
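
The inner/outer selection rule in the quoted subparagraph can be
sketched as a single mask test. This is an illustrative sketch only:
hash_uses_inner_header() and the notion of an "outer type bit" produced
by the device's parser are assumptions, and the two bit values are
copied from the patch (written here with an unsigned literal).

```c
#include <stdbool.h>
#include <stdint.h>

/* Two of the bit values quoted from the patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN  (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1u << 6)

/*
 * outer_type_bit: hypothetical result of the device's outer-header parse,
 *                 0 if the packet is not a recognised encapsulation,
 *                 otherwise exactly one TYPE bit.
 * hash_tunnel_types: the mask currently enabled by the driver (0 before
 *                 any SET command).
 * Returns true if the hash is calculated over the inner header,
 * false if it is calculated over the outer header.
 */
static bool hash_uses_inner_header(uint32_t outer_type_bit,
                                   uint32_t hash_tunnel_types)
{
    return (outer_type_bit & hash_tunnel_types) != 0;
}
```

With the initial mask of 0 every packet, encapsulated or not, falls
back to the outer header.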
> > > +
> > > +Encapsulation types supported/enabled for inner header hash:
> > > +\begin{itemize}
> > > +    \item The outer header of the following encapsulation types does not contain the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
> > > +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +
> > > +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
> > > +        \begin{enumerate}
> > > +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
> > > +        \end{enumerate}
> > > +\end{itemize}
> > > +
> > > +\subparagraph{Encapsulation types supported/enabled for inner header hash}
> > > +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
> > > +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
> > > +
> > > +Encapsulation types applicable for inner header hash:
> > > +\begin{lstlisting}
> > > +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
> > > +
> > > +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
> > > +
> > > +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
> > > +
> > > +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
> > > +
> > > +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
> > > +
> > > +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
> > > +
> > > +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
> > > +
> > > +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
> > > +
> > > +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
> > > +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
> > > +\end{lstlisting}
> > > +
> > > +\subparagraph{Advice}
> > > +Example uses of inner header hash:
> > > +\begin{itemize}
> > > +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
> > > +      distribute flows with identical outer but different inner headers across various queues, improving performance.
> > > +\item Identify an inner flow distributed across multiple outer tunnels.
> > > +\end{itemize}
> > > +
> > > +As using the inner header hash completely discards the outer header entropy, care must be taken
> > > +if the inner header is controlled by an adversary, as the adversary can then intentionally create
> > > +configurations with insufficient entropy.
> > > +
> > > +Besides disabling inner header hash, mitigations would depend on:
> > > +\begin{itemize}
> > > +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
> > 
> > this is quite vague
> > 
> 
> Advice that is too specific would be too narrow, which we discussed a
> long time ago.
> 
> > > +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
> > > +      to disable inner header hash for all encapsulated packets.
> > 
> > this is precisely disabling
> 
> Setting \field{hash_tunnel_types} to 0 is the same as disabling inner header hash?
> 
> > 
> > > +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
> > 
> > it is not at all clear how would devices do this.
> > 
> 
> The reason we're describing it broadly here is that this is done by
> devices, which usually know what to do, and can also take actions
> off-device, such as firewalls off-device, etc.
> 
> > > +\end{itemize}
> > 
> > Oh sorry I didn't complete the sentence :(
> > I suggest dropping above and having something like:
> > 
> > 	Besides disabling inner header hash, mitigations would depend on how the
> > 	hash is used, and the consequences of a successful attack.
> > 	For example, if the attack causes packet drops, using a deeper queue
> > 	might be able to mitigate it.
> 
> Ok, I got you now!
> 
> > 
> > 
> > 
> > > +
> > > +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > > +
> > > +If the (outer) header of the received packet does not match any value
> > 
> > any encapsulation type
> 
> Ok. And 'any bits' instead of 'any value' I think.
> 
> > 
> > > enabled in \field{hash_tunnel_types},
> > > +the device MUST calculate the hash on the outer header.
> > > +
> > > +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
> > > +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
> > 
> > let's be specific. if any bits in hash_tunnel_types are not set in
> > supported_hash_tunnel_types
> 
> Ok.
> 
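
The more specific wording agreed above reduces to one bitwise check on
the device side. A minimal sketch, assuming a hypothetical handler name
device_handle_set() and assuming (the thread does not say) that a
rejected SET leaves the previously enabled mask unchanged:

```c
#include <stdint.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Hypothetical device-side handler for VIRTIO_NET_CTRL_HASH_TUNNEL_SET. */
static uint8_t device_handle_set(uint32_t supported, uint32_t requested,
                                 uint32_t *enabled)
{
    /* Any requested bit that is not set in the supported mask is an error. */
    if (requested & ~supported)
        return VIRTIO_NET_ERR;   /* assumption: *enabled stays as it was */

    *enabled = requested;        /* 0 disables inner header hash entirely */
    return VIRTIO_NET_OK;
}
```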
> > 
> > > +
> > > +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
> > 
> > what does this mean even?
> 
> When the driver uses the GET command, the device should preferably
> return the corresponding value.
> 
> > 
> > > +
> > > +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
> > > +
> > > +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > > +
> > > +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
> > > +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
> > > +
> > > +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
> > > +
> > > +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
> > 
> > any bits. not any value
> 
> Got it.
> 
> Thanks.
> 
> > 
> > > +
> > >  \paragraph{Hash reporting for incoming packets}
> > >  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
> > >  
> > > diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
> > > index 54f6783..f88f48b 100644
> > > --- a/device-types/net/device-conformance.tex
> > > +++ b/device-types/net/device-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> > >  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
> > > index 97d0cc1..9d853d9 100644
> > > --- a/device-types/net/driver-conformance.tex
> > > +++ b/device-types/net/driver-conformance.tex
> > > @@ -14,4 +14,5 @@
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> > >  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
> > > +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
> > >  \end{itemize}
> > > diff --git a/introduction.tex b/introduction.tex
> > > index b7155bf..81f07a4 100644
> > > --- a/introduction.tex
> > > +++ b/introduction.tex
> > > @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
> > >      Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
> > >  	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
> > >  
> > > +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
> > > +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
> > > +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
> > > +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
> > > +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
> > > +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
> > > +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
> > > +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
> > > +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
> > > +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
> > > +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
> > > +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
> > > +    Virtual eXtensible Local Area Network.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
> > > +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
> > > +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
> > > +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
> > > +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
> > > +    Generic Network Virtualization Encapsulation.
> > > +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
> > > +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
> > > +    IP Encapsulation within IP.
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
> > > +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
> > > +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
> > > +	\phantomsection\label{intro:IP}\textbf{[IP]} &
> > > +    INTERNET PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
> > > +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
> > > +    User Datagram Protocol
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
> > > +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
> > > +    TRANSMISSION CONTROL PROTOCOL
> > > +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
> > >  \end{longtable}
> > >  
> > >  \section{Non-Normative References}
> > > -- 
> > > 2.19.1.6.gb485710b
> > 
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
> 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:28           ` Parav Pandit
@ 2023-06-21 19:35             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:35 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 07:28:33PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 21, 2023 3:26 PM
> > 
> > On Wed, Jun 21, 2023 at 05:52:28PM +0000, Parav Pandit wrote:
> > >
> > > > From: Heng Qi <hengqi@linux.alibaba.com>
> > > > Sent: Wednesday, June 21, 2023 12:46 PM
> > >
> > >
> > > > SET carries an additional 32 bits of information. But if you think
> > > > this will make the overall structure more concise, I'm ok.
> > > >
> > > If it is placed in a single structure, then it needs to be reworded to remove the WO
> > > or RO notion.
> > 
> > Not really. All of the structure is RO or WO.
> >
> For the device, what is the meaning of driver writing supported_tunnel_types in SET command?
> Ignore? It doesn't make sense to pass something to ignore.

It can be helpful for debugging. E.g. if driver sets
a bit that device does not recognize, it can figure
out that driver has it set in supported mask.
Or if it seems like a waste we can just ask drivers to
put 0 there, works for me too.


> Two structures are cleaner.

Frankly what's cleaner is what RSS is doing, supported mask in config
space. Was this discussed in some version? I don't remember.

> > > This also requires additional software to mark the DMA attributes as RW when
> > > mapping this area.
> > > And it requires extra text stating that supported_hash_tunnel_types is to be ignored
> > > on the SET command.
> > >
> > > Two structures are cleaner and serve the purpose.
> > 
> > No because all of structure is RO or WO.
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:32       ` Michael S. Tsirkin
@ 2023-06-21 19:37         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 19:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Heng Qi
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 21, 2023 3:32 PM

> 
> Well RSS is designed (imho) better since it keeps the supported types in config
> space. Doing it here would have removed the need for GET command.
> Yes I know Parav hates config space, no I don't think for a read-only field like
> this one this hate is justified.
> Did we discuss this and decided not to add it in config space for some reason? I
> don't remember ...
> 
Yes, we discussed this in v12 or earlier, and settled on a symmetric interface with both get and set done via the cvq.
Should be there in the change log.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:35             ` Michael S. Tsirkin
@ 2023-06-21 19:39               ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 19:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 21, 2023 3:36 PM
> 
> It can be helpful for debugging. E.g. if driver sets a bit that device does not
> recognize, it can figure out that driver has it set in supported mask.
The device reports an error code anyway when it receives unexpected command contents.

> Or if it seems like a waste we can just ask drivers to put 0 there, works for me
> too.

I fail to see this being any better or equal to two structures.

> 
> 
> > Two structures are cleaner.
> 
> Frankly what's cleaner is what RSS is doing, supported mask in config space.
> Was this discussed in some version? I don't remember.
> 
Yes, we discussed. 




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:39               ` Parav Pandit
@ 2023-06-21 19:45                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 19:45 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 07:39:19PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 21, 2023 3:36 PM
> > 
> > It can be helpful for debugging. E.g. if driver sets a bit that device does not
> > recognize, it can figure out that driver has it set in supported mask.
> The device reports an error code anyway when it receives unexpected command contents.

To the driver, yes. But if you are debugging this from the software device
side, then getting info about driver state is helpful.
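
To make the debugging point concrete, here is a minimal sketch (Python, purely illustrative) of a software device handling a SET that also carries the driver's view of the supported mask. The function name, the argument names and the status strings are hypothetical and not taken from the patch.

```python
def handle_set(dev_supported, drv_supported, requested):
    """Hypothetical device-side check for a SET of tunnel hash types.

    All three arguments are 32-bit bitmasks. Returns (status, note), where
    note is a human-readable hint that a software device could log.
    """
    # Bits the driver asked for that the device does not support.
    unknown = requested & ~dev_supported & 0xFFFFFFFF
    if unknown == 0:
        return "OK", None
    # The driver gets an error either way; the note is only for whoever
    # is debugging the device side.
    if unknown & drv_supported:
        note = "driver believes bits 0x%x are supported" % (unknown & drv_supported)
    else:
        note = "driver set bits 0x%x outside its own supported mask" % unknown
    return "ERR", note
```

With only the error code, the device side knows that the command was bad; with the driver's mask echoed back, it can also tell why the driver thought the bit was valid.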

> > Or if it seems like a waste we can just ask drivers to put 0 there, works for me
> > too.
> 
> I fail to see this being any better or equal to two structures.

One structure is just simpler for software than two.

> > 
> > 
> > > Two structures are cleaner.
> > 
> > Frankly what's cleaner is what RSS is doing, supported mask in config space.
> > Was this discussed in some version? I don't remember.
> > 
> Yes, we discussed. 

I couldn't find it, and I don't want us to repeat the same arguments.
Which version had this field in config space?


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:37         ` Parav Pandit
@ 2023-06-21 20:16           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 20:16 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 07:37:00PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 21, 2023 3:32 PM
> 
> > 
> > Well RSS is designed (imho) better since it keeps the supported types in config
> > space. Doing it here would have removed the need for GET command.
> > Yes I know Parav hates config space, no I don't think for a read-only field like
> > this one this hate is justified.
> > Did we discuss this and decided not to add it in config space for some reason? I
> > don't remember ...
> > 
> Yes, we discussed this in v12 or before: the idea was a symmetric interface, with both get and set done via the cvq.
> Should be there in the change log.

Oh good point, it is in the commit log. Heng Qi, thanks for writing such a
nice detailed commit log!

Here is what you wrote:
	Given that a set command is added via cvq, it makes sense to also do
	symmetric work to get it via a cvq.

and actually I missed the fact that instead of making this
symmetric it forced separate structures for GET and SET.

If we move supported_tunnel_hash_types back to config space
then GET and SET just need the active bitmap. *that* seems
symmetric to me.

And the field is RO so no memory cost to exposing it in all VFs.
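
A rough sketch of the two layouts being compared (Python, illustrative only). Only the name supported_tunnel_hash_types comes from this thread; the le32 field widths, the field order and the helper names are assumptions made for the example.

```python
import struct

# Option A (separate structures, as currently proposed): GET returns the
# supported mask plus the enabled types, SET carries only the enabled types.
GET_FMT_A = "<II"  # assumed: supported_tunnel_hash_types, enabled types
SET_FMT_A = "<I"   # assumed: enabled types

# Option B (suggested here): the supported mask is a read-only config space
# field, so GET and SET share one payload with just the active bitmap.
CONFIG_FMT_B = "<I"  # assumed: supported_tunnel_hash_types (RO)
CMD_FMT_B = "<I"     # assumed: active bitmap, same for GET and SET


def pack_get_a(supported, enabled):
    """Payload a device would return for GET under option A."""
    return struct.pack(GET_FMT_A, supported, enabled)


def pack_set_a(enabled):
    """Payload a driver would send for SET under option A."""
    return struct.pack(SET_FMT_A, enabled)


def pack_cmd_b(active):
    """Single payload used for both GET and SET under option B."""
    return struct.pack(CMD_FMT_B, active)
```

Under option B the command payload is identical in both directions, which is the symmetry meant above; under option A the two payloads differ in size.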


-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 20:16           ` Michael S. Tsirkin
@ 2023-06-21 20:24             ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 20:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 21, 2023 4:17 PM

> 
> and actually I missed the fact that instead of making this symmetric it forced
> separate structures for GET and SET.
> 
> If we move supported_tunnel_hash_types back to config space then GET and
> SET just need the active bitmap. *that* seems symmetric to me.
> 
> And the field is RO so no memory cost to exposing it in all VFs.
Two structures do not bring the asymmetry.
Accessing the current and enabled fields via two different mechanisms is what brings the asymmetry.

So we prefer not to keep growing the config space anymore; hence GET is the right approach to me.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 20:24             ` Parav Pandit
@ 2023-06-21 20:37               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-21 20:37 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 08:24:57PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 21, 2023 4:17 PM
> 
> > 
> > and actually I missed the fact that instead of making this symmetric it forced
> > separate structures for GET and SET.
> > 
> > If we move supported_tunnel_hash_types back to config space then GET and
> > SET just need the active bitmap. *that* seems symmetric to me.
> > 
> > And the field is RO so no memory cost to exposing it in all VFs.
> Two structures do not bring the asymmetry.
> Accessing current and enabled fields via two different mechanism is bringing the asymmetry.

I guess it's a matter of taste, but it is clearly more consistent with
other hash things, to which it's very similar.


Nah, config space is too convenient when we can live with its
limitations. I don't think we prefer not to keep growing it.
For some things such as this one it's perfect.

For example, for migration a driver might want to validate
that two devices have the same capability. Doing it without
DMA is nicer.

Another example: a future admin transport will have the ability
to provision devices by supplying their config space.
This will include this capability automatically. If
instead we hide it in a command, we need to do extra
custom work.
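
The migration check mentioned above could look roughly like this (Python, illustrative only). The offset of the field inside the device config space and the helper names are assumptions; the point is only that a read-only config field can be compared with plain reads, without a command or a DMA buffer.

```python
import struct

# Hypothetical offset of supported_tunnel_hash_types in the device config
# space; the real offset would come from the spec.
SUPPORTED_TUNNEL_HASH_TYPES_OFF = 0x20


def read_le32(cfg, off):
    """Read a little-endian 32-bit field from a config space snapshot."""
    return struct.unpack_from("<I", cfg, off)[0]


def same_tunnel_hash_capability(src_cfg, dst_cfg):
    """True if source and destination advertise the same supported mask."""
    off = SUPPORTED_TUNNEL_HASH_TYPES_OFF
    return read_le32(src_cfg, off) == read_le32(dst_cfg, off)
```

Here src_cfg and dst_cfg stand for byte snapshots of the two devices' config spaces, however they were obtained.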

> So we do not prefer to keep growing the config space anymore,
> hence GET is the right approach to me.

Heh I know you hate config space. Let it go, stop wasting
time arguing about the same thing on every turn and instead
help define admin transport to solve it fully.

-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 20:37               ` Michael S. Tsirkin
@ 2023-06-21 20:52                 ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-21 20:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 21, 2023 4:38 PM

> > > And the field is RO so no memory cost to exposing it in all VFs.
> > Two structures do not bring the asymmetry.
> > Accessing current and enabled fields via two different mechanism is bringing
> the asymmetry.
> 
> I guess it's a matter of taste, but it is clearly more consistent with other hash
> things, to which it's very similar.
>
This is consistent with the new commands we define, including notification coalescing, whose GET does not come from config space.
 
> 
> Nah, config space is too convenient when we can live with its limitations. I don't
> think we prefer not to keep growing it.
> For some things such as this one it's perfect.
>
Fields are different between different devices.

> For example, for migration driver might want to validate that two devices have
> same capability. doing it without dma is nicer.
> 
A migration driver in a real-world scenario will almost certainly have to use DMA for the amount of data it needs to exchange.

> Another example, future admin transport will have ability to provision devices
> by supplying their config space.
> This will include this capability automatically, if instead we hide it in a command
> we need to do extra custom work.
> 
> > So we do not prefer to keep growing the config space anymore, hence
> > GET is the right approach to me.
> 
> Heh I know you hate config space. Let it go, stop wasting time arguing about the
> same thing on every turn and instead help define admin transport to solve it

This was discussed many times: the need for a driver to have a direct channel to the device (not intercepted by the owner device).
If by admin transport you mean this non-intercepted channel, fine.
If you mean that it is intercepted and goes over an admin cmd, then it is of no use for all future interfaces.

We discussed this in a thread with you and Jason.
I provided a concrete example with size and device provisioning math, and another example of a multi-physical-address VQ.
So transporting register by register over some admin transport is sub-optimal.






^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:32       ` Michael S. Tsirkin
@ 2023-06-22  0:41         ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22  0:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



On 2023/6/22 3:32 AM, Michael S. Tsirkin wrote:
> On Thu, Jun 22, 2023 at 12:46:06AM +0800, Heng Qi wrote:
>> On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
>>> On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
>>>> 1. Currently, a received encapsulated packet has an outer and an inner header, but
>>>> the virtio device is unable to calculate the hash for the inner header. The same
>>>> flow can traverse through different tunnels, resulting in the encapsulated
>>>> packets being spread across multiple receive queues (refer to the figure below).
>>>> However, in certain scenarios, we may need to direct these encapsulated packets of
>>>> the same flow to a single receive queue. This facilitates the processing
>>>> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
>>>>
>>>>                 client1                    client2
>>>>                    |        +-------+         |
>>>>                    +------->|tunnels|<--------+
>>>>                             +-------+
>>>>                                |  |
>>>>                                v  v
>>>>                        +-----------------+
>>>>                        | monitoring host |
>>>>                        +-----------------+
>>>>
>>>> To achieve this, the device can calculate a symmetric hash based on the inner headers
>>>> of the same flow.
>>>>
>>>> 2. For legacy systems, they may lack entropy fields which modern protocols have in
>>>> the outer header, resulting in multiple flows with the same outer header but
>>>> different inner headers being directed to the same receive queue. This results in
>>>> poor receive performance.
>>>>
>>>> To address this limitation, inner header hash can be used to enable the device to advertise
>>>> the capability to calculate the hash for the inner packet, regaining better receive performance.
>>>>
>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
>>>>
>>> don't put an empty line here
>> Ok. Will remove it.
>>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>
>>> ok almost there. small corrections, and one enhancement suggestion.
>>>
>>>
>>>> ---
>>>> v17->v18:
>>>> 	1. Some rewording suggestions from Michael (Thanks!).
>>>> 	2. Use 0 to disable inner header hash and remove
>>>> 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
>>>> v16->v17:
>>>> 	1. Some small rewrites. @Parav Pandit
>>>> 	2. Add Parav's Reviewed-by tag (Thanks!).
>>>>
>>>> v15->v16:
>>>> 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
>>>> 	   configuration, the ability to configure the outer src udp port hash is given
>>>> 	   to RSS. This is orthogonal to inner header hash, which will be done in the
>>>> 	   RSS capability extension topic (considered as an RSS extension together
>>>> 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
>>>> 	2. Fix a 'field' typo. @Parav Pandit
>>>>
>>>> v14->v15:
>>>> 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
>>>> 	2. Adjust some descriptions.
>>>>
>>>> v13->v14:
>>>> 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
>>>> 	2. Rebase to master branch.
>>>> 	3. Some minor modifications.
>>>>
>>>> v12->v13:
>>>> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
>>>> 	2. Add tunneling protocol explanation. @Jason Wang
>>>> 	3. Add comments on some usage scenarios for inner hash.
>>>>
>>>> v11->v12:
>>>> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
>>>> 	2. Refine the commit log. @Michael S . Tsirkin
>>>> 	3. Add some tunnel types.
>>>>
>>>> v10->v11:
>>>> 	1. Revise commit log for clarity for readers.
>>>> 	2. Some modifications to avoid undefined terms. @Parav Pandit
>>>> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
>>>> 	4. Add the normative statements. @Parav Pandit
>>>>
>>>> v9->v10:
>>>> 	1. Removed hash_report_tunnel related information. @Parav Pandit
>>>> 	2. Re-describe the limitations of QoS for tunneling.
>>>> 	3. Some clarification.
>>>>
>>>> v8->v9:
>>>> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
>>>> 	2. Add tunnel security section. @Michael S . Tsirkin
>>>> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
>>>> 	4. Fix some typos.
>>>> 	5. Add more tunnel types. @Michael S . Tsirkin
>>>>
>>>> v7->v8:
>>>> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
>>>> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
>>>> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
>>>> 	4. Fix some typos. @Michael S . Tsirkin
>>>> 	5. Clarify some sentences. @Michael S . Tsirkin
>>>>
>>>> v6->v7:
>>>> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>>>> 	2. Fix some syntax issues. @Michael S. Tsirkin
>>>>
>>>> v5->v6:
>>>> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>>>> 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
>>>> 	3. Move the links to introduction section. @Michael S. Tsirkin
>>>> 	4. Clarify some sentences. @Michael S. Tsirkin
>>>>
>>>> v4->v5:
>>>> 	1. Clarify some paragraphs. @Cornelia Huck
>>>> 	2. Fix the u8 type. @Cornelia Huck
>>>>
>>>> v3->v4:
>>>> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>>>> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>>>> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>>>> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>>>
>>>> v2->v3:
>>>> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>>>> 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>>>
>>>> v1->v2:
>>>> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>>>> 	2. Clarify some paragraphs. @Jason Wang
>>>> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>>>
>>>>   device-types/net/description.tex        | 158 ++++++++++++++++++++++++
>>>>   device-types/net/device-conformance.tex |   1 +
>>>>   device-types/net/driver-conformance.tex |   1 +
>>>>   introduction.tex                        |  39 ++++++
>>>>   4 files changed, 199 insertions(+)
>>>>
>>>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>>>> index 3030222..9fdccfc 100644
>>>> --- a/device-types/net/description.tex
>>>> +++ b/device-types/net/description.tex
>>>> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>>>>   \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>>>>       channel.
>>>>   
>>>> +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
>>>> +
>>>>   \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
>>>>   
>>>>   \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>>> @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>>>>   \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>>>   \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>>>>   \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>>>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>>> I think just or is enough.
>> Sure. I agree.
>>
>>>>   \end{description}
>>>>   
>>>>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>>>> @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   If the feature VIRTIO_NET_F_RSS was negotiated:
>>>>   \begin{itemize}
>>>>   \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
>>>> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
>>>> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
>>> why get and not set? e.g. if I call get then set then the field in set
>>> will have effect.
>> Consider the following command sequence:
>> 1. The driver sets hash_tunnel_types using the SET command and saves it
>> somewhere e.g. virtnet_info->hash_tunnel_types_saved.
> This clearly does not work: even with the current structure you need to
> do a GET first, otherwise you have no idea what is supported.

This sequence is just an example taken from the middle of operation; the GET
command is still needed before the first SET.

>
>> 2. The driver fetches hash_tunnel_types using the GET command, which should
>> be equal to virtnet_info->hash_tunnel_types_saved (or even returned from
>> virtnet_info->hash_tunnel_types_saved, like RSS).
> why? for debugging?

For convenience, the driver can update
virtnet_info->hash_tunnel_types_saved every time the SET command is issued.
But this is too specific and depends on the driver implementation, so I
think it is out of scope for the spec.

>
>> 3. The driver sets hash_tunnel_types again using the SET command and saves
>> it in virtnet_info->hash_tunnel_types_saved.
> I don't get it, why set it again?

It is just an example showing that the driver can save
virtnet_info->hash_tunnel_types_saved on every SET, to reduce the number of
GET commands or to stay synchronized with the device.
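
As a rough illustration of that caching idea (not part of the patch), the
sketch below assumes a hypothetical driver-side structure hash_tunnel_cache
and a hypothetical stub_send_set() that stands in for issuing
VIRTIO_NET_CTRL_HASH_TUNNEL_SET over the control virtqueue. The cache is only
updated when the device accepts the command, so a later read of the cached
value can replace a GET:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver-side state; all names here are illustrative only. */
struct hash_tunnel_cache {
    uint32_t supported; /* bitmask learned from the initial GET */
    uint32_t enabled;   /* last bitmask accepted by a SET */
    bool     valid;     /* true once the initial GET has succeeded */
};

/* Stand-in for the control virtqueue transport; returns 0 on success. */
typedef int (*send_set_fn)(uint32_t types);

/* Stub used for the sketch: pretends the device accepted the command. */
static int stub_send_set(uint32_t types)
{
    (void)types;
    return 0;
}

/* Issue a SET and refresh the cached copy only if the device accepted it. */
static int hash_tunnel_set(struct hash_tunnel_cache *c, uint32_t types,
                           send_set_fn send)
{
    if (!c->valid)
        return -1; /* a GET is needed before the first SET */
    if (types & ~c->supported)
        return -1; /* never request bits the device does not support */
    if (send(types) != 0)
        return -1; /* device rejected the command: keep the old cache */
    c->enabled = types;
    return 0;
}
```

Whether a driver keeps such a cache at all is an implementation choice, which
is why it is not described in the spec text.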

>
>> But I think your enhanced proposal is also feasible. After all, for RSS
>> and similar features we only have a SET command, and that works very well.
> Well RSS is designed (imho) better since it keeps the supported types in
> config space. Doing it here would have removed the need for GET command.
> Yes I know Parav hates config space, no I don't think for a read-only
> field like this one this hate is justified.
> Did we discuss this and decide not to add it in config space
> for some reason? I don't remember ...

Yes, during the v13 review.

Thanks.

>
>>>
>>>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>>>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>>>>   \end{itemize}
>>>> @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   If the feature VIRTIO_NET_F_RSS was not negotiated:
>>>>   \begin{itemize}
>>>>   \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
>>>> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
>>>> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
>>> same
>>>
>>>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>>>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>>>>   \end{itemize}
>>>> @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>    \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
>>>>   \end{itemize}
>>>>   
>>>> +The per-packet hash calculation can depend on the IP packet type. See
>>>> +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>> and end paragraph here.
>> Do you mean adding a blank line here?
> yes
>
>>>>   \subparagraph{Supported/enabled hash types}
>>>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>>>>   Hash types applicable for IPv4 packets:
>>>> @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>>>>   \end{itemize}
>>>>   
>>>> +\paragraph{Inner Header Hash}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
>>>> +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
>>>> +
>>>> +struct virtnet_hash_tunnel_config_set {
>>>> +    le32 hash_tunnel_types;
>>>> +};
>>>> +
>>>> +struct virtnet_hash_tunnel_config_get {
>>>> +    le32 supported_hash_tunnel_types;
>>>> +    le32 hash_tunnel_types;
>>>> +};
>>>> +
>>> It would be cleaner to have a single structure for both.
>>>
>> SET carries an additional 32 bits of information. But if you think this
>> will make the overall structure more concise, I'm ok.
> I think so yes.
>
>>> I also think hash_tunnel is unnecessarily verbose, and _config_ is also
>>> pointless.
>>>
>>> Returning supported_hash_tunnel_types back to device can also
>>> be useful for debugging.
>>>
>>> How about:
>>>
>>>
>>> struct virtnet_hash_tunnel {
>>>       le32 supported_tunnel_types;
>>>       le32 enabled_tunnel_types;
>>> };
>>>
>> It's OK.
>>
>> And:
>> For the GET command, both fields are WO for the device.
>> For the SET command, \field{supported_tunnel_types} is RO for the device
>> and \field{enabled_tunnel_types} is WO for the device.
>>
>>> For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
>>> contains the bitmask of encapsulation types supported
>>> by the device for inner header hash; \field{enabled_tunnel_types}
>>> contains the value received in a previous successful
>>> call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
>>>
>>> For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
>>> contains the value returned by a previous
>>> successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
>>> \field{enabled_tunnel_types}
>>> contains the bitmask of encapsulation types to enable
>>> for inner header hash.
>>>
>>> and add normative statements to this end.
>>>
>>>
>> Ok.
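
To make the proposed GET/SET semantics concrete, here is a minimal
device-side sketch using the single structure suggested above. It is only an
illustration under stated assumptions: dev_state and handle_hash_tunnel() are
hypothetical names, le32 fields are modelled as plain uint32_t (endianness
conversion is omitted), and the \field{supported_tunnel_types} value echoed
back in SET is not checked.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

#define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
#define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1

/* Single structure shared by both commands, as proposed in this thread. */
struct virtnet_hash_tunnel {
    uint32_t supported_tunnel_types; /* le32 in the spec */
    uint32_t enabled_tunnel_types;   /* le32 in the spec */
};

/* Hypothetical device-internal state. */
struct dev_state {
    uint32_t supported; /* fixed device capability */
    uint32_t enabled;   /* 0 after reset: inner header hash disabled */
};

/* Handle one VIRTIO_NET_CTRL_HASH_TUNNEL command and return the ack byte. */
static uint8_t handle_hash_tunnel(struct dev_state *dev, uint8_t cmd,
                                  struct virtnet_hash_tunnel *buf)
{
    switch (cmd) {
    case VIRTIO_NET_CTRL_HASH_TUNNEL_GET:
        /* Both fields are written by the device. */
        buf->supported_tunnel_types = dev->supported;
        buf->enabled_tunnel_types = dev->enabled;
        return VIRTIO_NET_OK;
    case VIRTIO_NET_CTRL_HASH_TUNNEL_SET:
        /* Reject the command if any requested bit is unsupported. */
        if (buf->enabled_tunnel_types & ~dev->supported)
            return VIRTIO_NET_ERR;
        dev->enabled = buf->enabled_tunnel_types;
        return VIRTIO_NET_OK;
    default:
        return VIRTIO_NET_ERR;
    }
}
```

A SET with \field{enabled_tunnel_types} equal to 0 passes the check and
disables inner header hash for all encapsulation types.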
>>
>>>> +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
>>>> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
>>>> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
>>>> +
>>>> +
>>>> +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
>>>> +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
>>> We don't need these two sentences. Just second one will do.
>>>
>> Ok.
>>
>>>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
>>>> +
>>>> +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
>>> They have different meanings for set and get though.
>>>
>>>
>>>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
>>>> +
>>>> +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
>>>> +\begin{itemize}
>>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
>>>> +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
>>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
>>>> +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
>>>> +\end{itemize}
>>>> +
>>>> +\subparagraph{Encapsulated packet}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
>>>> +
>>>> +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
>>>> +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
>>>> +hash over either the inner header or the outer header.
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
>>>> +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
>>>> +calculations (only a single level of encapsulation is currently supported).
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
>>>> +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
>>>> +
>>>> +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
>>>> +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
>>> Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
>>> all encapsulation types are disabled
>>>
>> Ok.
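
The inner/outer selection rule in the paragraphs above can be sketched as a
single mask test. This is only an illustration: select_hash_header() and the
detected_type argument (a single encapsulation-type bit produced by some
hypothetical packet parser, or 0 for a non-encapsulated packet) are
assumptions, and only two of the type bits from the patch are reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Two of the encapsulation type bits defined in the patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN  (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1u << 6)

enum hash_header {
    HASH_OUTER_HEADER,
    HASH_INNER_HEADER
};

/*
 * Pick the header the hash is calculated over.
 * enabled_types:  bitmask configured by the last successful SET (0 initially).
 * detected_type:  the bit matching the packet's outer header, 0 if none.
 */
static enum hash_header select_hash_header(uint32_t enabled_types,
                                           uint32_t detected_type)
{
    return (detected_type & enabled_types) ? HASH_INNER_HEADER
                                           : HASH_OUTER_HEADER;
}
```

With nothing enabled (the initial state) every packet falls back to the outer
header, which matches the behaviour without VIRTIO_NET_F_HASH_TUNNEL.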
>>
>>>> +
>>>> +Encapsulation types supported/enabled for inner header hash:
>>>> +\begin{itemize}
>>>> +    \item The outer header of the following encapsulation types does not contain the transport protocol:
>>>> +        \begin{enumerate}
>>>> +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +        \end{enumerate}
>>>> +
>>>> +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
>>>> +        \begin{enumerate}
>>>> +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +        \end{enumerate}
>>>> +\end{itemize}
>>>> +
>>>> +\subparagraph{Encapsulation types supported/enabled for inner header hash}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
>>>> +
>>>> +Encapsulation types applicable for inner header hash:
>>>> +\begin{lstlisting}
>>>> +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
>>>> +
>>>> +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
>>>> +
>>>> +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
>>>> +
>>>> +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
>>>> +
>>>> +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
>>>> +
>>>> +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
>>>> +
>>>> +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
>>>> +
>>>> +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
>>>> +
>>>> +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
>>>> +\end{lstlisting}
>>>> +
>>>> +\subparagraph{Advice}
>>>> +Example uses of inner header hash:
>>>> +\begin{itemize}
>>>> +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
>>>> +      distribute flows with identical outer but different inner headers across various queues, improving performance.
>>>> +\item Identify an inner flow distributed across multiple outer tunnels.
>>>> +\end{itemize}
>>>> +
>>>> +As using the inner header hash completely discards the outer header entropy, care must be taken
>>>> +if the inner header is controlled by an adversary, as the adversary can then intentionally create
>>>> +configurations with insufficient entropy.
>>>> +
>>>> +Besides disabling inner header hash, mitigations would depend on:
>>>> +\begin{itemize}
>>>> +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
>>> this is quite vague
>>>
>> Giving overly specific advice would be too restrictive, which we
>> discussed a long time ago.
>>
>>>> +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
>>>> +      to disable inner header hash for all encapsulated packets.
>>> this is precisely disabling
>> Do you mean that setting \field{hash_tunnel_types} to 0 is itself disabling inner header hash?
>>
>>>> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
>>> it is not at all clear how would devices do this.
>>>
>> The reason we're describing it broadly here is that this is done by
>> devices, which usually know what to do, and can also take actions
>> off-device, such as firewalls off-device, etc.
>>
>>>> +\end{itemize}
>>> Oh sorry I didn't complete the sentence :(
>>> I suggest dropping above and having something like:
>>>
>>> 	Besides disabling inner header hash, mitigations would depend on how the
>>> 	hash is used, and the consequences of a successful attack.
>>> 	For example, if the attack causes packet drops, using a deeper queue
>>> 	might be able to mitigate it.
>> Ok, I got you now!
>>
>>>
>>>
>>>> +
>>>> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>> +
>>>> +If the (outer) header of the received packet does not match any value
>>> any encapsulation type
>> Ok. And 'any bits' instead of 'any value' I think.
>>
>>>> enabled in \field{hash_tunnel_types},
>>>> +the device MUST calculate the hash on the outer header.
>>>> +
>>>> +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
>>>> +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
>>> let's be specific. if any bits in hash_tunnel_types are not set in
>>> supported_hash_tunnel_types
>> Ok.
>>
>>>> +
>>>> +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
>>> what does this mean even?
>> When the driver uses the GET command, the device should preferably
>> return the corresponding value.
>>
>>>> +
>>>> +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
>>>> +
>>>> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>> +
>>>> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
>>>> +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
>>>> +
>>>> +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
>>>> +
>>>> +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
>>> any bits. not any value
>> Got it.
>>
>> Thanks.
>>
>>>> +
>>>>   \paragraph{Hash reporting for incoming packets}
>>>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>>>>   
>>>> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
>>>> index 54f6783..f88f48b 100644
>>>> --- a/device-types/net/device-conformance.tex
>>>> +++ b/device-types/net/device-conformance.tex
>>>> @@ -14,4 +14,5 @@
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
>>>> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>>   \end{itemize}
>>>> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
>>>> index 97d0cc1..9d853d9 100644
>>>> --- a/device-types/net/driver-conformance.tex
>>>> +++ b/device-types/net/driver-conformance.tex
>>>> @@ -14,4 +14,5 @@
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
>>>> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>>   \end{itemize}
>>>> diff --git a/introduction.tex b/introduction.tex
>>>> index b7155bf..81f07a4 100644
>>>> --- a/introduction.tex
>>>> +++ b/introduction.tex
>>>> @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
>>>>       Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>>>>   	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>>>>   
>>>> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
>>>> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
>>>> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
>>>> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
>>>> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
>>>> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
>>>> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
>>>> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
>>>> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
>>>> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
>>>> +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
>>>> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
>>>> +    Virtual eXtensible Local Area Network.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
>>>> +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
>>>> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
>>>> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
>>>> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
>>>> +    Generic Network Virtualization Encapsulation.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
>>>> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
>>>> +    IP Encapsulation within IP.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
>>>> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
>>>> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
>>>> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
>>>> +    INTERNET PROTOCOL
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
>>>> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
>>>> +    User Datagram Protocol
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
>>>> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
>>>> +    TRANSMISSION CONTROL PROTOCOL
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>>>>   \end{longtable}
>>>>   
>>>>   \section{Non-Normative References}
>>>> -- 
>>>> 2.19.1.6.gb485710b
>>>
>>> This publicly archived list offers a means to provide input to the
>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>
>>> In order to verify user consent to the Feedback License terms and
>>> to minimize spam in the list archive, subscription is required
>>> before posting.
>>>
>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>> List help: virtio-comment-help@lists.oasis-open.org
>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>> Committee: https://www.oasis-open.org/committees/virtio/
>>> Join OASIS: https://www.oasis-open.org/join/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org




* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-22  0:41         ` Heng Qi
  0 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22  0:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Parav Pandit, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



On 2023/6/22 at 3:32 AM, Michael S. Tsirkin wrote:
> On Thu, Jun 22, 2023 at 12:46:06AM +0800, Heng Qi wrote:
>> On Wed, Jun 21, 2023 at 11:38:52AM -0400, Michael S. Tsirkin wrote:
>>> On Wed, Jun 21, 2023 at 09:50:52PM +0800, Heng Qi wrote:
>>>> 1. Currently, a received encapsulated packet has an outer and an inner header, but
>>>> the virtio device is unable to calculate the hash for the inner header. The same
>>>> flow can traverse through different tunnels, resulting in the encapsulated
>>>> packets being spread across multiple receive queues (refer to the figure below).
>>>> However, in certain scenarios, we may need to direct these encapsulated packets of
>>>> the same flow to a single receive queue. This facilitates the processing
>>>> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
>>>>
>>>>                 client1                    client2
>>>>                    |        +-------+         |
>>>>                    +------->|tunnels|<--------+
>>>>                             +-------+
>>>>                                |  |
>>>>                                v  v
>>>>                        +-----------------+
>>>>                        | monitoring host |
>>>>                        +-----------------+
>>>>
>>>> To achieve this, the device can calculate a symmetric hash based on the inner headers
>>>> of the same flow.
>>>>
>>>> 2. For legacy systems, they may lack entropy fields which modern protocols have in
>>>> the outer header, resulting in multiple flows with the same outer header but
>>>> different inner headers being directed to the same receive queue. This results in
>>>> poor receive performance.
>>>>
>>>> To address this limitation, inner header hash can be used to enable the device to advertise
>>>> the capability to calculate the hash for the inner packet, regaining better receive performance.
>>>>
>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
>>>>
>>> don't put an empty line here
>> Ok. Will remove it.
>>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>
>>> ok almost there. small corrections, and one enhancement suggestion.
>>>
>>>
>>>> ---
>>>> v17->v18:
>>>> 	1. Some rewording suggestions from Michael (Thanks!).
>>>> 	2. Use 0 to disable inner header hash and remove
>>>> 	   VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
>>>> v16->v17:
>>>> 	1. Some small rewrites. @Parav Pandit
>>>> 	2. Add Parav's Reviewed-by tag (Thanks!).
>>>>
>>>> v15->v16:
>>>> 	1. Remove the hash_option. In order to delimit the inner header hash and RSS
>>>> 	   configuration, the ability to configure the outer src udp port hash is given
>>>> 	   to RSS. This is orthogonal to inner header hash, which will be done in the
>>>> 	   RSS capability extension topic (considered as an RSS extension together
>>>> 	   with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S . Tsirkin
>>>> 	2. Fix a 'field' typo. @Parav Pandit
>>>>
>>>> v14->v15:
>>>> 	1. Add tunnel hash option suggested by @Michael S . Tsirkin
>>>> 	2. Adjust some descriptions.
>>>>
>>>> v13->v14:
>>>> 	1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit
>>>> 	2. Rebase to master branch.
>>>> 	3. Some minor modifications.
>>>>
>>>> v12->v13:
>>>> 	1. Add a GET command for hash_tunnel_types. @Parav Pandit
>>>> 	2. Add tunneling protocol explanation. @Jason Wang
>>>> 	3. Add comments on some usage scenarios for inner hash.
>>>>
>>>> v11->v12:
>>>> 	1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
>>>> 	2. Refine the commit log. @Michael S . Tsirkin
>>>> 	3. Add some tunnel types.
>>>>
>>>> v10->v11:
>>>> 	1. Revise commit log for clarity for readers.
>>>> 	2. Some modifications to avoid undefined terms. @Parav Pandit
>>>> 	3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
>>>> 	4. Add the normative statements. @Parav Pandit
>>>>
>>>> v9->v10:
>>>> 	1. Removed hash_report_tunnel related information. @Parav Pandit
>>>> 	2. Re-describe the limitations of QoS for tunneling.
>>>> 	3. Some clarification.
>>>>
>>>> v8->v9:
>>>> 	1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
>>>> 	2. Add tunnel security section. @Michael S . Tsirkin
>>>> 	3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
>>>> 	4. Fix some typos.
>>>> 	5. Add more tunnel types. @Michael S . Tsirkin
>>>>
>>>> v7->v8:
>>>> 	1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
>>>> 	2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
>>>> 	3. Removed re-definition for inner packet hashing. @Parav Pandit
>>>> 	4. Fix some typos. @Michael S . Tsirkin
>>>> 	5. Clarify some sentences. @Michael S . Tsirkin
>>>>
>>>> v6->v7:
>>>> 	1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>>>> 	2. Fix some syntax issues. @Michael S. Tsirkin
>>>>
>>>> v5->v6:
>>>> 	1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>>>> 	2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
>>>> 	3. Move the links to introduction section. @Michael S. Tsirkin
>>>> 	4. Clarify some sentences. @Michael S. Tsirkin
>>>>
>>>> v4->v5:
>>>> 	1. Clarify some paragraphs. @Cornelia Huck
>>>> 	2. Fix the u8 type. @Cornelia Huck
>>>>
>>>> v3->v4:
>>>> 	1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>>>> 	2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>>>> 	3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>>>> 	4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>>>
>>>> v2->v3:
>>>> 	1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>>>> 	2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>>>
>>>> v1->v2:
>>>> 	1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>>>> 	2. Clarify some paragraphs. @Jason Wang
>>>> 	3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>>>
>>>>   device-types/net/description.tex        | 158 ++++++++++++++++++++++++
>>>>   device-types/net/device-conformance.tex |   1 +
>>>>   device-types/net/driver-conformance.tex |   1 +
>>>>   introduction.tex                        |  39 ++++++
>>>>   4 files changed, 199 insertions(+)
>>>>
>>>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>>>> index 3030222..9fdccfc 100644
>>>> --- a/device-types/net/description.tex
>>>> +++ b/device-types/net/description.tex
>>>> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>>>>   \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>>>>       channel.
>>>>   
>>>> +\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
>>>> +
>>>>   \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
>>>>   
>>>>   \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>>> @@ -147,6 +149,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>>>>   \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>>>   \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>>>>   \item[VIRTIO_NET_F_VQ_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>>>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>>> I think just or is enough.
>> Sure. I agree.
>>
>>>>   \end{description}
>>>>   
>>>>   \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>>>> @@ -869,6 +872,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   If the feature VIRTIO_NET_F_RSS was negotiated:
>>>>   \begin{itemize}
>>>>   \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
>>>> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
>>>> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
>>> why get and not set? e.g. if I call get then set then the field in set
>>> will have effect.
>> If the following command sequence:
>> 1. The driver sets hash_tunnel_types using the SET command and saves it
>> somewhere e.g. virtnet_info->hash_tunnel_types_saved.
> This clearly does not work even with current structure you need to first
> do a GET otherwise you have no idea what is supported.

This sequence is only an example taken from the middle of operation; a GET
command is still needed before the first SET.

>
>> 2. The driver fetches hash_tunnel_types using the GET command, which should
>> be equal to virtnet_info->hash_tunnel_types_saved (or even returned from
>> virtnet_info->hash_tunnel_types_saved, like RSS).
> why? for debugging?

For convenience, the driver can update
virtnet_info->hash_tunnel_types_saved every time it issues the SET command,
but this depends on the driver implementation, so I think it is
out of the scope of the spec.

>
>> 3. The driver sets hash_tunnel_types again using the SET command and saves
>> it in virtnet_info->hash_tunnel_types_saved.
> I don't get it, why set it again?

It is just an example showing that the driver can update
virtnet_info->hash_tunnel_types_saved on every SET, to reduce the number of
GET commands or to keep its copy synchronized with the device.
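
A minimal sketch of that sequence in C, under stated assumptions: struct
mock_dev, dev_get(), dev_set(), drv_init() and drv_set_types() are
hypothetical names that stand in for the control virtqueue and for driver
code, and are not part of the spec or of any existing driver. The field names
follow the two structures in this patch; endianness (le32) and the real cvq
buffers are ignored.

```c
#include <stdint.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Bit values taken from the list in this patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN  (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1u << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP   (1u << 7)

/* Hypothetical stand-in for the device end of the control virtqueue. */
struct mock_dev {
    uint32_t supported_hash_tunnel_types;
    uint32_t hash_tunnel_types;          /* 0 until the first SET */
};

/* Hypothetical driver state, named after the example in this thread. */
struct virtnet_info {
    uint32_t supported_saved;
    uint32_t hash_tunnel_types_saved;
};

/* Models VIRTIO_NET_CTRL_HASH_TUNNEL_GET: the device fills both fields. */
int dev_get(const struct mock_dev *dev, uint32_t *supported, uint32_t *enabled)
{
    *supported = dev->supported_hash_tunnel_types;
    *enabled = dev->hash_tunnel_types;
    return VIRTIO_NET_OK;
}

/* Models VIRTIO_NET_CTRL_HASH_TUNNEL_SET: unsupported bits are rejected. */
int dev_set(struct mock_dev *dev, uint32_t types)
{
    if (types & ~dev->supported_hash_tunnel_types)
        return VIRTIO_NET_ERR;
    dev->hash_tunnel_types = types;
    return VIRTIO_NET_OK;
}

/* Step 0: one GET before the first SET, to learn what is supported. */
int drv_init(struct virtnet_info *vi, const struct mock_dev *dev)
{
    uint32_t supported, enabled;

    if (dev_get(dev, &supported, &enabled) != VIRTIO_NET_OK)
        return -1;                       /* ignore the values on error */
    vi->supported_saved = supported;
    vi->hash_tunnel_types_saved = enabled;
    return 0;
}

/* Every later SET refreshes the cached copy, so no further GET is needed. */
int drv_set_types(struct virtnet_info *vi, struct mock_dev *dev, uint32_t types)
{
    if (types & ~vi->supported_saved)
        return -1;                       /* never send unsupported bits */
    if (dev_set(dev, types) != VIRTIO_NET_OK)
        return -1;
    vi->hash_tunnel_types_saved = types;
    return 0;
}
```

With this layout a GET after the first one is only a cross-check: the cached
value changes exactly when a SET succeeds.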

>
>> But I think your enhanced proposal is also feasible, after all, before
>> for example RSS etc. we only have SET command can work very well.
> Well RSS is designed (imho) better since it keeps the supported types in
> config space. Doing it here would have removed the need for GET command.
> Yes I know Parav hates config space, no I don't think for a read-only
> field like this one this hate is justified.
> Did we discuss this and decided not to add it in config space
> for some reason? I don't remember ...

Yes, during the v13 review (see item 1 of the v13->v14 changelog above).

Thanks.

>
>>>
>>>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>>>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>>>>   \end{itemize}
>>>> @@ -876,6 +881,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   If the feature VIRTIO_NET_F_RSS was not negotiated:
>>>>   \begin{itemize}
>>>>   \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
>>>> +\item If additionally the feature VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device uses \field{hash_tunnel_types} of the
>>>> +      virtnet_hash_tunnel_config_get structure as 'Encapsulation types enabled for inner header hash' bitmask.
>>> same
>>>
>>>>   \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>>>>   \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>>>>   \end{itemize}
>>>> @@ -889,6 +896,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>    \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
>>>>   \end{itemize}
>>>>   
>>>> +The per-packet hash calculation can depend on the IP packet type. See
>>>> +\hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>>> and end paragraph here.
>> Is adding a blank line here? --!
> yes
>
>>>>   \subparagraph{Supported/enabled hash types}
>>>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>>>>   Hash types applicable for IPv4 packets:
>>>> @@ -1001,6 +1010,155 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>>>   (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>>>>   \end{itemize}
>>>>   
>>>> +\paragraph{Inner Header Hash}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the driver can send commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET
>>>> +and VIRTIO_NET_CTRL_HASH_TUNNEL_GET to configure the calculation of the inner header hash.
>>>> +
>>>> +struct virtnet_hash_tunnel_config_set {
>>>> +    le32 hash_tunnel_types;
>>>> +};
>>>> +
>>>> +struct virtnet_hash_tunnel_config_get {
>>>> +    le32 supported_hash_tunnel_types;
>>>> +    le32 hash_tunnel_types;
>>>> +};
>>>> +
>>> It would be cleaner to have a single structure for both.
>>>
>> SET carries an additional 32 bits of information. But if you think this
>> will make the overall structure more concise, I'm ok.
> I think so yes.
>
>>> I also think hash_tunnel is unnecessarily verbose, and _config_ is also
>>> pointless.
>>>
>>> Returning supported_hash_tunnel_types back to device can also
>>> be useful for debugging.
>>>
>>> How about:
>>>
>>>
>>> struct virtnet_hash_tunnel {
>>>       le32 supported_tunnel_types;
>>>       le32 enabled_tunnel_types;
>>> };
>>>
>> It's OK.
>>
>> And:
>> For the GET command, both fields are WO for the device.
>> For the SET command, \field{supported_tunnel_types} is RO for the device
>> and \field{enabled_tunnel_types} is WO for the device.
>>
>>> For VIRTIO_NET_CTRL_HASH_TUNNEL_GET, \field{supported_tunnel_types}
>>> contains the bitmask of encapsulation types supported
>>> by the device for inner header hash; \field{enabled_tunnel_types}
>>> contains the value received in a previous successful
>>> call to VIRTIO_NET_CTRL_HASH_TUNNEL_SET.
>>>
>>> For VIRTIO_NET_CTRL_HASH_TUNNEL_SET, \field{supported_tunnel_types}
>>> contains the value returned by a previous
>>> successful call to VIRTIO_NET_CTRL_HASH_TUNNEL_GET;
>>> \field{enabled_tunnel_types}
>>> contains the bitmask of encapsulation types to enable
>>> for inner header hash.
>>>
>>> and add normative statements to this end.
>>>
>>>
>> Ok.
>>
>>>> +#define VIRTIO_NET_CTRL_HASH_TUNNEL 7
>>>> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_SET 0
>>>> + #define VIRTIO_NET_CTRL_HASH_TUNNEL_GET 1
>>>> +
>>>> +
>>>> +Field \field{supported_hash_tunnel_types} provided by the device indicates that the device supports inner header hash for these encapsulation types.
>>>> +Field \field{supported_hash_tunnel_types} contains the bitmask of encapsulation types supported for inner header hash.
>>> We don't need these two sentences. Just second one will do.
>>>
>> Ok.
>>
>>>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
>>>> +
>>>> +Field \field{hash_tunnel_types} contains the bitmask of encapsulation types enabled for inner header hash.
>>> They have different meanings for set and get though.
>>>
>>>
>>>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}.
>>>> +
>>>> +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
>>>> +\begin{itemize}
>>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types} for the device using the
>>>> +      virtnet_hash_tunnel_config_set structure, which is read-only for the device.
>>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types} and \field{supported_hash_tunnel_types}
>>>> +      from the device using the virtnet_hash_tunnel_config_get structure, which is write-only for the device.
>>>> +\end{itemize}
>>>> +
>>>> +\subparagraph{Encapsulated packet}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Encapsulated packet}
>>>> +
>>>> +Multiple tunneling protocols allow encapsulating an inner, payload packet in an outer, encapsulated packet.
>>>> +The encapsulated packet thus contains an outer header and an inner header, and the device calculates the
>>>> +hash over either the inner header or the outer header.
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
>>>> +encapsulation types enabled in \field{hash_tunnel_types}, then the device uses the inner header for hash
>>>> +calculations (only a single level of encapsulation is currently supported).
>>>> +
>>>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received packet's (outer) header does not match any types enabled
>>>> +in \field{hash_tunnel_types}, then the device uses the outer header for hash calculations.
>>>> +
>>>> +Initially all encapsulation types are disabled (the value of \field{hash_tunnel_types} is 0) for inner header hash
>>>> +before any VIRTIO_NET_CTRL_HASH_TUNNEL_SET command are sent by the driver.
>>> Initially (before driver sends VIRTIO_NET_CTRL_HASH_TUNNEL_SET)
>>> all encapsulation types are disabled
>>>
>> Ok.
>>
>>>> +
>>>> +Encapsulation types supported/enabled for inner header hash:
>>>> +\begin{itemize}
>>>> +    \item The outer header of the following encapsulation types does not contain the transport protocol:
>>>> +        \begin{enumerate}
>>>> +	    \item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4.
>>>> +	    \item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +        \end{enumerate}
>>>> +
>>>> +    \item The outer header of the following encapsulation types uses UDP as the transport protocol:
>>>> +        \begin{enumerate}
>>>> +	    \item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +	    \item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6.
>>>> +        \end{enumerate}
>>>> +\end{itemize}
>>>> +
>>>> +\subparagraph{Encapsulation types supported/enabled for inner header hash}
>>>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets /
>>>> +Hash calculation for incoming packets / Encapsulation types supported/enabled for inner header hash}
>>>> +
>>>> +Encapsulation types applicable for inner header hash:
>>>> +\begin{lstlisting}
>>>> +The \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 0)
>>>> +
>>>> +The \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 1)
>>>> +
>>>> +The \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 2)
>>>> +
>>>> +The \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 3)
>>>> +
>>>> +The \hyperref[intro:vxlan]{[VXLAN]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 4)
>>>> +
>>>> +The \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 5)
>>>> +
>>>> +The \hyperref[intro:geneve]{[GENEVE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 6)
>>>> +
>>>> +The \hyperref[intro:ipip]{[IPIP]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 7)
>>>> +
>>>> +The \hyperref[intro:nvgre]{[NVGRE]} encapsulation type:
>>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 8)
>>>> +\end{lstlisting}
>>>> +
>>>> +\subparagraph{Advice}
>>>> +Example uses of inner header hash:
>>>> +\begin{itemize}
>>>> +\item Legacy tunneling protocols, lacking outer header entropy, can use RSS with inner header hash to
>>>> +      distribute flows with identical outer but different inner headers across various queues, improving performance.
>>>> +\item Identify an inner flow distributed across multiple outer tunnels.
>>>> +\end{itemize}
>>>> +
>>>> +As using the inner header hash completely discards the outer header entropy, care must be taken
>>>> +if the inner header is controlled by an adversary, as the adversary can then intentionally create
>>>> +configurations with insufficient entropy.
>>>> +
>>>> +Besides disabling inner header hash, mitigations would depend on:
>>>> +\begin{itemize}
>>>> +\item Use a tool with good forwarding performance to keep the receive queue from dropping packets.
>>> this is quite vague
>>>
>> Giving too specific advice would be too specific, which we discussed a
>> long time ago.
>>
>>>> +\item If the QoS (Quality of service) is unavailable, the driver can set \field{hash_tunnel_types} to 0
>>>> +      to disable inner header hash for all encapsulated packets.
>>> this is precisely disabling
>> \field{hash_tunnel_types} to 0 disabling inner ?
>>
>>>> +\item Perform appropriate QoS before packets consume the receive buffers of the receive queues.
>>> it is not at all clear how would devices do this.
>>>
>> The reason we're describing it broadly here is that this is done by
>> devices, which usually know what to do, and can also take actions
>> off-device, such as firewalls off-device, etc.
>>
>>>> +\end{itemize}
>>> Oh sorry I didn't complete the sentence :(
>>> I suggest dropping above and having something like:
>>>
>>> 	Besides disabling inner header hash, mitigations would depend on how the
>>> 	hash is used, and the consequences of a successful attack.
>>> 	For example, if the attack causes packet drops, using a deeper queue
>>> 	might be able to mitigate it.
>> Ok, I got you now!
>>
>>>
>>>
>>>> +
>>>> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>> +
>>>> +If the (outer) header of the received packet does not match any value
>>> any encapsulation type
>> Ok. And 'any bits' instead of 'any value' I think.
>>
>>>> enabled in \field{hash_tunnel_types},
>>>> +the device MUST calculate the hash on the outer header.
>>>> +
>>>> +If the device receives an unsupported or unrecognized value for \field{hash_tunnel_types}, it MUST respond to
>>>> +the VIRTIO_NET_CTRL_HASH_TUNNEL_SET command with VIRTIO_NET_ERR.
>>> let's be specific. if any bits in hash_tunnel_types are not set in
>>> supported_hash_tunnel_types
>> Ok.
>>
>>>> +
>>>> +If the device offers the VIRTIO_NET_F_HASH_TUNNEL feature, it MUST provide the values for \field{supported_hash_tunnel_types}.
>>> what does this mean even?
>> When the driver uses the GET command, the device should preferably
>> return the corresponding value..
>>
>>>> +
>>>> +If \field{hash_tunnel_types} is set to 0 or upon device reset, the device MUST disable inner header hash for all encapsulation types.
>>>> +
>>>> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>> +
>>>> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature when issuing
>>>> +commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
>>>> +
>>>> +The driver MUST ignore the values received from the VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with VIRTIO_NET_ERR.
>>>> +
>>>> +The driver MUST NOT set any value in \field{hash_tunnel_types} which is not set in \field{supported_hash_tunnel_types}.
>>> any bits. not any value
>> Get.
>>
>> Thanks.
>>
>>>> +
>>>>   \paragraph{Hash reporting for incoming packets}
>>>>   \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>>>>   
>>>> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
>>>> index 54f6783..f88f48b 100644
>>>> --- a/device-types/net/device-conformance.tex
>>>> +++ b/device-types/net/device-conformance.tex
>>>> @@ -14,4 +14,5 @@
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
>>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
>>>> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>>   \end{itemize}
>>>> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
>>>> index 97d0cc1..9d853d9 100644
>>>> --- a/device-types/net/driver-conformance.tex
>>>> +++ b/device-types/net/driver-conformance.tex
>>>> @@ -14,4 +14,5 @@
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
>>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
>>>> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Inner Header Hash}
>>>>   \end{itemize}
>>>> diff --git a/introduction.tex b/introduction.tex
>>>> index b7155bf..81f07a4 100644
>>>> --- a/introduction.tex
>>>> +++ b/introduction.tex
>>>> @@ -102,6 +102,45 @@ \section{Normative References}\label{sec:Normative References}
>>>>       Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve Cryptography'', Version 1.0, September 2000.
>>>>   	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>>>>   
>>>> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
>>>> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
>>>> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
>>>> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This protocol describes extensions by which two fields, Key and
>>>> +    Sequence Number, can be optionally carried in the GRE Header \ref{intro:gre_rfc2784}.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
>>>> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
>>>> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is specified for IPv6 and used as either the payload or
>>>> +    delivery protocol. Note that this does not change the GRE header format or any behaviors specified by RFC 2784 or RFC 2890.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
>>>> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]} &
>>>> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating network protocol packets within GRE and UDP headers.
>>>> +    This protocol is specified for IPv4 and IPv6, and used as either the payload or delivery protocol.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
>>>> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
>>>> +    Virtual eXtensible Local Area Network.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
>>>> +	\phantomsection\label{intro:vxlan_gpe}\textbf{[VXLAN-GPE]} &
>>>> +    Generic Protocol Extension for VXLAN. This protocol describes extending Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN header.
>>>> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
>>>> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
>>>> +    Generic Network Virtualization Encapsulation.
>>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
>>>> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
>>>> +    IP Encapsulation within IP.
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
>>>> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
>>>> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
>>>> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
>>>> +    INTERNET PROTOCOL
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
>>>> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
>>>> +    User Datagram Protocol
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
>>>> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
>>>> +    TRANSMISSION CONTROL PROTOCOL
>>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>>>>   \end{longtable}
>>>>   
>>>>   \section{Non-Normative References}
>>>> -- 
>>>> 2.19.1.6.gb485710b
>>>
>>> This publicly archived list offers a means to provide input to the
>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>
>>> In order to verify user consent to the Feedback License terms and
>>> to minimize spam in the list archive, subscription is required
>>> before posting.
>>>
>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>> List help: virtio-comment-help@lists.oasis-open.org
>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>> Committee: https://www.oasis-open.org/committees/virtio/
>>> Join OASIS: https://www.oasis-open.org/join/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 19:35             ` Michael S. Tsirkin
@ 2023-06-22  0:46               ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22  0:46 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



On 2023/6/22 3:35 AM, Michael S. Tsirkin wrote:
> On Wed, Jun 21, 2023 at 07:28:33PM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Wednesday, June 21, 2023 3:26 PM
>>>
>>> On Wed, Jun 21, 2023 at 05:52:28PM +0000, Parav Pandit wrote:
>>>>> From: Heng Qi <hengqi@linux.alibaba.com>
>>>>> Sent: Wednesday, June 21, 2023 12:46 PM
>>>>
>>>>> SET carries an additional 32 bits of information. But if you think
>>>>> this will make the overall structure more concise, I'm ok.
>>>>>
>>>> If it is placed in a single structure, then it needs to be reworded to remove the WO
>>>> or RO notion.
>>>
>>> Not really. all of structure is RO and WO.
>>>
>> For the device, what is the meaning of driver writing supported_tunnel_types in SET command?
>> Ignore? It doesn't make sense to pass something to ignore.
> It can be helpful for debugging. E.g. if driver sets
> a bit that device does not recognize, it can figure
> out that driver has it set in supported mask.
> Or if it seems like a waste we can just ask drivers to
> put 0 there, works for me too.

For the SET command, it is a bit strange to have the driver write
supported_hash_tunnel_types to the device.
'supported' describes the capability of the device, while 'enabled' is what
the driver selects.
'supported' should be read-only for the driver; the driver should not try
to write it.
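
A minimal device-side sketch in C of this split, under stated assumptions:
struct dev_hash_tunnel_state, dev_handle_set(), dev_reset() and
dev_hash_uses_inner() are hypothetical names, not spec text or existing
code. 'supported' is fixed by the device and never written by the driver;
the SET command carries only the bits to enable.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Bit values taken from the list in this patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN (1u << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP  (1u << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE (1u << 8)

/* Hypothetical per-device state for inner header hash. */
struct dev_hash_tunnel_state {
    uint32_t supported;   /* device capability, read-only for the driver */
    uint32_t enabled;     /* driver's selection, 0 after reset */
};

/* SET handler: fail if any requested bit is not set in 'supported'. */
uint8_t dev_handle_set(struct dev_hash_tunnel_state *st, uint32_t hash_tunnel_types)
{
    if (hash_tunnel_types & ~st->supported)
        return VIRTIO_NET_ERR;
    st->enabled = hash_tunnel_types;   /* 0 disables inner header hash */
    return VIRTIO_NET_OK;
}

/* Device reset disables inner header hash for all encapsulation types. */
void dev_reset(struct dev_hash_tunnel_state *st)
{
    st->enabled = 0;
}

/*
 * pkt_tunnel_type is the single bit for the encapsulation type that the
 * device's parser matched on the outer header, or 0 for a packet that is
 * not encapsulated. Returns true if the hash is calculated over the inner
 * header, false if it is calculated over the outer header.
 */
bool dev_hash_uses_inner(const struct dev_hash_tunnel_state *st, uint32_t pkt_tunnel_type)
{
    return (st->enabled & pkt_tunnel_type) != 0;
}
```

Because the driver never supplies 'supported', the device has nothing to
ignore or cross-check in the SET buffer.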

Thanks.

>
>
>> Two structures are cleaner.
> Frankly what's cleaner is what RSS is doing, supported mask in config
> space. Was this discussed in some version? I don't remember.
>
>>>> This also requires additional sw to indicate dma attributes to be RW when
>>>> mapping this area.
>>>> And extra text to indicate that supported_hash_tunnel_types is to be ignored
>>>> on the SET command.
>>>> Two structures are cleaner and serve their purpose.
>>> No because all of structure is RO or WO.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-comment] Re: [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 20:52                 ` Parav Pandit
@ 2023-06-22  0:59                   ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22  0:59 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



On 2023/6/22 at 4:52 AM, Parav Pandit wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Wednesday, June 21, 2023 4:38 PM
>>>> And the field is RO so no memory cost to exposing it in all VFs.
>>> Two structures do not bring the asymmetry.
>>> Accessing current and enabled fields via two different mechanism is bringing
>> the asymmetry.
>>
>> I guess it's a matter of taste, but it is clearly more consistent with other hash
>> things, to which it's very similar.
>>
> This is consistent with new commands we define including notification coalescing whose GET is not coming config space.

Yes.

>   
>> Nah, config space is too convenient when we can live with its limitations. I don't
>> think we prefer not to keep growing it.
>> For some things such as this one it's perfect.
>>
> Fields are different between different devices.
>
>> For example, for migration driver might want to validate that two devices have
>> same capability. doing it without dma is nicer.
>>
> A migration driver for real world scenario, will almost have to use the dma for amount of data it needs to exchange.
>
>> Another example, future admin transport will have ability to provision devices
>> by supplying their config space.
>> This will include this capability automatically, if instead we hide it in a command
>> we need to do extra custom work.
>>
>>> So we do not prefer to keep growing the config space anymore, hence
>>> GET is the right approach to me.
>> Heh I know you hate config space. Let it go, stop wasting time arguing about the
>> same thing on every turn and instead help define admin transport to solve it
> This was discussed many times, a driver to have a direct (non-intercepted by owner device) channel to device.
> If you mean this non-intercepted channel as admin transport, fine.
> If you mean this is intercepted and it is going over admin cmd, then it is of no use for all future interfaces.
>
> We discussed this in thread with you and Jason.
> I provided concrete example with size and device provisioning math too and other example of multi-physical address VQ.
> So transporting register by register over some admin transport is sub-optimal.

Parav, you prefer the version with two separate structures, which does
not let supported_hash_tunnel_types expand the configuration space. I
remember this.
I agree that we don't want to jump back and forth, especially as there
are practical reasons and it would take 5 version jumps to get
supported_hash_tunnel_types back into the config space.

The original intention of Michael's proposal to merge the structures in v18
was presumably that two separate structures would cause asynchrony.
I don't think they do: the driver can cache the enabled hash_tunnel_types
on every SET command. Alternatively, after the SET command the driver
*SHOULD* issue the GET command again, which is the workaround.

Thanks.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22  0:59                   ` Heng Qi
@ 2023-06-22  1:04                     ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22  1:04 UTC (permalink / raw)
  To: Heng Qi, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, June 21, 2023 8:59 PM

[..]
> > We discussed this in thread with you and Jason.
> > I provided concrete example with size and device provisioning math too and
> other example of multi-physical address VQ.
> > So transporting register by register over some admin transport is sub-optimal.
> 
> Parav, your implementation prefers two separate struct versions and doesn't let
> supported_hash_tunnel_types expand in configuration space. I remember this.
> I agree that we don't want to jump back and forth, especially as there are
> practical reasons and 5 version jumps to get supported_hash_tunnel_types
> back into the config space.
> 
Right.

> The original intention of Michael's proposal to merge structures in v18 should
> be that two separate structures will cause asynchrony.

> I don't think so, the driver can cache enabled hash_tunnel_types every SET
> command. Or after the SET command the driver *SHOULD* use the GET
> command again, which is the workaround.
>
There is no need to perform GET after SET.
The driver-device contract is: if the SET command returns success, the device has accepted the command and will apply the filter.
If the device fails it, an error is returned anyway.

Later, if the driver wants to modify the tunnel types (add/remove), it can either
a. use the previously read supported types (if cached in the driver), or
b. issue GET to learn the supported tunnel types (if not cached).
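
The flow above can be sketched roughly as follows. This is a toy model in
Python, not a real virtio API: FakeDevice, Driver and all method names are
hypothetical, and the device is assumed to reject a SET that contains
unsupported bits.

```python
class FakeDevice:
    """Stand-in for the device side (hypothetical, for illustration)."""

    def __init__(self, supported):
        self.supported = supported
        self.enabled = 0
        self.get_calls = 0

    def get(self):
        self.get_calls += 1
        return self.supported, self.enabled

    def set(self, enabled):
        if enabled & ~self.supported:
            return False  # error: unsupported tunnel type requested
        self.enabled = enabled
        return True  # success: the device will apply this mask


class Driver:
    def __init__(self, dev):
        self.dev = dev
        self.supported = None  # cached supported mask, filled by the first GET
        self.enabled = 0       # cached enabled mask, updated on successful SET

    def _supported(self):
        if self.supported is None:  # case (b): not cached yet, issue GET
            self.supported, self.enabled = self.dev.get()
        return self.supported       # case (a): reuse the cached value

    def update(self, add=0, remove=0):
        supported = self._supported()
        wanted = (self.enabled | add) & ~remove
        if wanted & ~supported:
            return False
        if not self.dev.set(wanted):
            return False
        self.enabled = wanted  # SET succeeded, so no GET is needed afterwards
        return True
```

In this model a driver that adds and later removes tunnel types issues GET
only once, on first use.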
 
> Thanks.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22  1:04                     ` [virtio-comment] " Parav Pandit
@ 2023-06-22  1:17                       ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22  1:17 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



On 2023/6/22 at 9:04 AM, Parav Pandit wrote:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, June 21, 2023 8:59 PM
> [..]
>>> We discussed this in thread with you and Jason.
>>> I provided concrete example with size and device provisioning math too and
>> other example of multi-physical address VQ.
>>> So transporting register by register over some admin transport is sub-optimal.
>> Parav, your implementation prefers two separate struct versions and doesn't let
>> supported_hash_tunnel_types expand in configuration space. I remember this.
>> I agree that we don't want to jump back and forth, especially as there are
>> practical reasons and 5 version jumps to get supported_hash_tunnel_types
>> back into the config space.
>>
> Right.
>
>> The original intention of Michael's proposal to merge structures in v18 should
>> be that two separate structures will cause asynchrony.
>> I don't think so, the driver can cache enabled hash_tunnel_types every SET
>> command. Or after the SET command the driver *SHOULD* use the GET
>> command again, which is the workaround.
>>
> There is no need to perform GET after SET.

Yes.

> A driver-device contract is, is SET command returns success, it means device accepted the command and will apply the filter.
> If device fails it, there is anyway error.
>
> Later when if driver wants to modify the tunnel type (add/remove),
> a. either it can used previously read supported type (if cached in driver)
> b. issue GET to know supported tunnel types (if not cached)

You are right. This is what I wanted to say.

Thanks.

>   
>> Thanks.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-21 20:52                 ` Parav Pandit
@ 2023-06-22  6:23                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22  6:23 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 21, 2023 at 08:52:04PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 21, 2023 4:38 PM
> 
> > > > And the field is RO so no memory cost to exposing it in all VFs.
> > > Two structures do not bring the asymmetry.
> > > Accessing current and enabled fields via two different mechanism is bringing
> > the asymmetry.
> > 
> > I guess it's a matter of taste, but it is clearly more consistent with other hash
> > things, to which it's very similar.
> >
> This is consistent with new commands we define including notification coalescing whose GET is not coming config space.

But there, GET just reports the current state, not a read-only
capability, so there would be a cost per VF to keep it in config space.
This one is RO, so there is no cost per VF. Let's make it convenient?

> > 
> > Nah, config space is too convenient when we can live with its limitations. I don't
> > think we prefer not to keep growing it.
> > For some things such as this one it's perfect.
> >
> Fields are different between different devices.

Not sure what's the implication?

> > For example, for migration driver might want to validate that two devices have
> > same capability. doing it without dma is nicer.
> > 
> A migration driver for real world scenario, will almost have to use the dma for amount of data it needs to exchange.

Not migration itself, provisioning.

> > Another example, future admin transport will have ability to provision devices
> > by supplying their config space.
> > This will include this capability automatically, if instead we hide it in a command
> > we need to do extra custom work.
> > 
> > > So we do not prefer to keep growing the config space anymore, hence
> > > GET is the right approach to me.
> > 
> > Heh I know you hate config space. Let it go, stop wasting time arguing about the
> > same thing on every turn and instead help define admin transport to solve it
> 
> This was discussed many times, a driver to have a direct (non-intercepted by owner device) channel to device.
> If you mean this non-intercepted channel as admin transport, fine.

we can do that, sure.

> If you mean this is intercepted and it is going over admin cmd, then it is of no use for all future interfaces.
> 
> We discussed this in thread with you and Jason.
> I provided concrete example with size and device provisioning math too and other example of multi-physical address VQ.
> So transporting register by register over some admin transport is sub-optimal.
> 

Not register by register, we can send all of config space as long as
it's RO. This field is.

-- 
MST


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22  6:23                   ` Michael S. Tsirkin
@ 2023-06-22 12:32                     ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 12:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 2:23 AM
> 
> On Wed, Jun 21, 2023 at 08:52:04PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, June 21, 2023 4:38 PM
> >
> > > > > And the field is RO so no memory cost to exposing it in all VFs.
> > > > Two structures do not bring the asymmetry.
> > > > Accessing current and enabled fields via two different mechanism
> > > > is bringing
> > > the asymmetry.
> > >
> > > I guess it's a matter of taste, but it is clearly more consistent
> > > with other hash things, to which it's very similar.
> > >
> > This is consistent with new commands we define including notification
> coalescing whose GET is not coming config space.
> 
> But there GET just reports the current state. Not the read only capability. So
> there would be cost per VF to keep it in config space.
> This one is RO no cost per VF. Let's make it convenient?
>
And each VF can have a different value, hence it requires per-VF storage in the device.
 
> > >
> > > Nah, config space is too convenient when we can live with its
> > > limitations. I don't think we prefer not to keep growing it.
> > > For some things such as this one it's perfect.
> > >
> > Fields are different between different devices.
> 
> Not sure what's the implication?
The implication is that the device needs to store this in always-available on-chip memory, which is not good.

> 
> > > For example, for migration driver might want to validate that two
> > > devices have same capability. doing it without dma is nicer.
> > >
> > A migration driver for real world scenario, will almost have to use the dma for
> amount of data it needs to exchange.
> 
> Not migration itself, provisioning.
>
A provisioning driver usually does not attach to the member device directly.
Doing so requires a device reset, followed by reaching the _DRIVER stage, querying features etc. and the config area,
and then unbinding it and a second reset by the member driver. Ugh.

A provisioning driver also needs to get the state or capabilities even when the member driver is already attached.
So config space is not much of a gain either.
 
> > > Another example, future admin transport will have ability to
> > > provision devices by supplying their config space.
> > > This will include this capability automatically, if instead we hide
> > > it in a command we need to do extra custom work.
> > >
> > > > So we do not prefer to keep growing the config space anymore,
> > > > hence GET is the right approach to me.
> > >
> > > Heh I know you hate config space. Let it go, stop wasting time
> > > arguing about the same thing on every turn and instead help define
> > > admin transport to solve it
> >
> > This was discussed many times, a driver to have a direct (non-intercepted by
> > owner device) channel to device.
> > If you mean this non-intercepted channel as admin transport, fine.
> 
> we can do that, sure.
> 
> > If you mean this is intercepted and it is going over admin cmd, then it is of no
> > use for all future interfaces.
> >
> > We discussed this in thread with you and Jason.
> > I provided concrete example with size and device provisioning math too and
> > other example of multi-physical address VQ.
> > So transporting register by register over some admin transport is sub-optimal.
> >
> 
> Not register by register, we can send all of config space as long as it's RO. This
> field is.
>
It is RO in the context of one member device, but every VF can have a different value.
The device can never know whether a driver will use the new cmdvq for access or whether some old driver will access it without the cmdvq.
Hence, the device always needs to provision it in on-chip memory for backward compatibility.

Instead of the decision point being RO vs RW,
putting any new fields behind the cmdvq while existing fields stay in cfg space gives predictable behavior for sizing the member devices in the system.
Once the cmdvq is available, we can get rid of the GET command used in this version for new future features.
Until that arrives, the GET command is the efficient way.
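
To make the sizing argument concrete, here is a rough sketch of the arithmetic.
It is not taken from the patch: the function names, the VF count and the field
width are all assumptions for illustration. The only point is that a field
placed in config space has to be backed per VF by always-available memory,
because each VF may report a different value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sizing helper (illustration only, not from the spec).
 * A field in config space can be read by any driver at any time, so the
 * device has to back it per VF with always-available memory.
 */
static uint64_t cfg_space_backing_bytes(uint32_t num_vfs,
                                        uint32_t cfg_bytes_per_vf)
{
    return (uint64_t)num_vfs * cfg_bytes_per_vf;
}

/* Extra backing bytes needed when one new field of `field_bytes` is added. */
static uint64_t cfg_space_growth_bytes(uint32_t num_vfs, uint32_t field_bytes)
{
    assert(field_bytes > 0);
    return cfg_space_backing_bytes(num_vfs, field_bytes);
}
```

With an assumed 1024 VFs, one additional 4-byte capability field costs
cfg_space_growth_bytes(1024, 4) = 4096 bytes of such memory, and every further
field adds to that.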
 
> --
> MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 12:32                     ` Parav Pandit
@ 2023-06-22 13:42                       ` Heng Qi
  -1 siblings, 0 replies; 106+ messages in thread
From: Heng Qi @ 2023-06-22 13:42 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



On 2023/6/22 8:32 PM, Parav Pandit wrote:
>
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Thursday, June 22, 2023 2:23 AM
>>
>> On Wed, Jun 21, 2023 at 08:52:04PM +0000, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Wednesday, June 21, 2023 4:38 PM
>>>>>> And the field is RO so no memory cost to exposing it in all VFs.
>>>>> Two structures do not bring the asymmetry.
>>>>> Accessing current and enabled fields via two different mechanism
>>>>> is bringing the asymmetry.
>>>>
>>>> I guess it's a matter of taste, but it is clearly more consistent
>>>> with other hash things, to which it's very similar.
>>>>
>>> This is consistent with new commands we define including notification
>>> coalescing whose GET is not coming config space.
>>
>> But there GET just reports the current state. Not the read only capability. So
>> there would be cost per VF to keep it in config space.
>> This one is RO no cost per VF. Let's make it convenient?
>>
> And each VF can have different value hence requires per VF storage in the device.
>   
>>>> Nah, config space is too convenient when we can live with its
>>>> limitations. I don't think we prefer not to keep growing it.
>>>> For some things such as this one it's perfect.
>>>>
>>> Fields are different between different devices.
>> Not sure what's the implication?
> Implication is device needs to store this in always available on-chip memory which is not good.
>
>>>> For example, for migration driver might want to validate that two
>>>> devices have same capability. doing it without dma is nicer.
>>>>
>>> A migration driver for real world scenario, will almost have to use the dma for
>>> amount of data it needs to exchange.
>>
>> Not migration itself, provisioning.
>>
> Provisioning driver usually do not attach to the member device directly.
> This requires device reset, followed by reaching _DRIVER stage, querying features etc and config area.
> And unbinding it and second reset by member driver. Ugh.
>
> Provisioning driver also needs to get the state or capabilities even when member driver is already attached.
> So config space is not much a gain either.
>   
>>>> Another example, future admin transport will have ability to
>>>> provision devices by supplying their config space.
>>>> This will include this capability automatically, if instead we hide
>>>> it in a command we need to do extra custom work.
>>>>
>>>>> So we do not prefer to keep growing the config space anymore,
>>>>> hence GET is the right approach to me.
>>>> Heh I know you hate config space. Let it go, stop wasting time
>>>> arguing about the same thing on every turn and instead help define
>>>> admin transport to solve it
>>> This was discussed many times, a driver to have a direct (non-intercepted by
>>> owner device) channel to device.
>>> If you mean this non-intercepted channel as admin transport, fine.
>> we can do that, sure.
>>
>>> If you mean this is intercepted and it is going over admin cmd, then it is of no
>>> use for all future interfaces.
>>> We discussed this in thread with you and Jason.
>>> I provided concrete example with size and device provisioning math too and
>>> other example of multi-physical address VQ.
>>> So transporting register by register over some admin transport is sub-optimal.
>>>
>> Not register by register, we can send all of config space as long as it's RO. This
>> field is.
>>
> It is RO in context of one member device, but every VF can have different value.
> The device will never know if one will use new cmdvq to access or some old driver will use without it.
> And hence, it always needs to provision it on onchip memory for backward compatibility.

Yes, I think we also have to consider the upcoming
     1. device counters (e.g. supported_device_counter),
     2. receive flow filters (e.g. supported_flow_types, supported_max_entries),
     3. header splits (e.g. supported_split_types), etc.
We need to be careful about continuously expanding the configuration space.

>
> Instead of decision point being RO vs RW,
> any new fields via cmdvq and existing fields stays in cfg space, give predictable behavior to size the member devices in the system.
> Once the cmdvq is available, we can get rid of GET command used in this version for new future features.
> Till that arrives, GET command is the efficient way.

Yes, I agree.

Thanks.




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 13:42                       ` Heng Qi
@ 2023-06-22 14:27                         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 14:27 UTC (permalink / raw)
  To: Heng Qi, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
	Xuan Zhuo, Cornelia Huck



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Thursday, June 22, 2023 9:43 AM
> 
> Yes, I think we also have to consider upcoming
>      1. device counters (e.g. supported_device_counter),
>      2. receive flow filters (e.g. supported_flow_types, supported_max_entries),
>      3. header splits (e.g. supported_split_types) etc.
> Continuous expansion of the configuration space needs to be careful.
> 
> >
> > Instead of decision point being RO vs RW, any new fields via cmdvq and
> > existing fields stays in cfg space, give predictable behavior to size the member
> > devices in the system.
> > Once the cmdvq is available, we can get rid of GET command used in this
> > version for new future features.
> > Till that arrives, GET command is the efficient way.
> 
> Yes, I agree.
> 
> Thanks.

Right. So, Michael's main concern was two structs vs. one.
And given that GET and SET work on different fields, having two structures is just fine.
The code is fairly small.
I don’t see any real issue here with v18.
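
As a rough illustration of why two small structures are workable, here is a
minimal sketch. The struct names, field widths and the subset rule are
assumptions for illustration and are not copied from v18; in particular, a
real layout would use little-endian fields rather than host-endian uint32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; names and widths are assumptions. */

/* Returned by the GET command: the device's read-only capability. */
struct example_hash_tunnel_get {
    uint32_t supported_tunnel_types;
};

/* Sent with the SET command: what the driver wants enabled. */
struct example_hash_tunnel_set {
    uint32_t enabled_tunnel_types;
};

/*
 * Assumed rule: a SET request is acceptable only if every enabled bit
 * is also present in the supported bitmap.
 */
static bool example_set_is_valid(const struct example_hash_tunnel_get *caps,
                                 const struct example_hash_tunnel_set *req)
{
    assert(caps != NULL && req != NULL);
    return (req->enabled_tunnel_types & ~caps->supported_tunnel_types) == 0;
}
```

GET only ever fills the first structure and SET only ever carries the second,
so neither command has to carry a field it does not use.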


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 12:32                     ` Parav Pandit
@ 2023-06-22 16:28                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 16:28 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 12:32:33PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 2:23 AM
> > 
> > On Wed, Jun 21, 2023 at 08:52:04PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Wednesday, June 21, 2023 4:38 PM
> > >
> > > > > > And the field is RO so no memory cost to exposing it in all VFs.
> > > > > Two structures do not bring the asymmetry.
> > > > > > Accessing current and enabled fields via two different mechanism
> > > > > > is bringing the asymmetry.
> > > >
> > > > I guess it's a matter of taste, but it is clearly more consistent
> > > > with other hash things, to which it's very similar.
> > > >
> > > > This is consistent with new commands we define including notification
> > > > coalescing whose GET is not coming config space.
> > 
> > But there GET just reports the current state. Not the read only capability. So
> > there would be cost per VF to keep it in config space.
> > This one is RO no cost per VF. Let's make it convenient?
> >
> And each VF can have different value hence requires per VF storage in the device.
>  
> > > >
> > > > Nah, config space is too convenient when we can live with its
> > > > > limitations. I don't think we prefer not to keep growing it.
> > > > For some things such as this one it's perfect.
> > > >
> > > Fields are different between different devices.
> > 
> > Not sure what's the implication?
> Implication is device needs to store this in always available on-chip memory which is not good.

Oh by devices you mean VFs. Now I get your motivation, at least. Thanks.

> > 
> > > > For example, for migration driver might want to validate that two
> > > > devices have same capability. doing it without dma is nicer.
> > > >
> > > > A migration driver for real world scenario, will almost have to use the dma for
> > > > amount of data it needs to exchange.
> > 
> > Not migration itself, provisioning.
> >
> Provisioning driver usually do not attach to the member device directly.
> This requires device reset, followed by reaching _DRIVER stage, querying features etc and config area.
> And unbinding it and second reset by member driver. Ugh.
> Provisioning driver also needs to get the state or capabilities even when member driver is already attached.
> So config space is not much a gain either.
>

Absolutely, that's why we have admin commands.  I was hoping for an
admin command that basically gets/sets RO fields of the config space.
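
A minimal sketch of what such a command handler could look like on the device
side, purely as an assumption for discussion: the struct, the function name
and the return convention are hypothetical, and no such command is defined in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of one member device's read-only config fields. */
struct example_ro_cfg {
    const uint8_t *bytes;
    size_t len;
};

/*
 * Hypothetical handler for a "get RO config" admin command: copy `len`
 * bytes starting at `offset` into `out`. Returns 0 on success, or -1 if
 * the requested range falls outside the snapshot.
 */
static int example_admin_get_ro_cfg(const struct example_ro_cfg *cfg,
                                    size_t offset, size_t len, uint8_t *out)
{
    assert(cfg != NULL && out != NULL);
    if (offset > cfg->len || len > cfg->len - offset)
        return -1;
    memcpy(out, cfg->bytes + offset, len);
    return 0;
}
```

Because the command takes an offset and a length, one request could fetch the
whole RO region rather than going register by register.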

> > > > Another example, future admin transport will have ability to
> > > > provision devices by supplying their config space.
> > > > This will include this capability automatically, if instead we hide
> > > > it in a command we need to do extra custom work.
> > > >
> > > > > So we do not prefer to keep growing the config space anymore,
> > > > > hence GET is the right approach to me.
> > > >
> > > > Heh I know you hate config space. Let it go, stop wasting time
> > > > arguing about the same thing on every turn and instead help define
> > > > admin transport to solve it
> > >
> > > > This was discussed many times, a driver to have a direct (non-intercepted by
> > > > owner device) channel to device.
> > > If you mean this non-intercepted channel as admin transport, fine.
> > 
> > we can do that, sure.
> > 
> > > > If you mean this is intercepted and it is going over admin cmd, then it is of no
> > > > use for all future interfaces.
> > >
> > > We discussed this in thread with you and Jason.
> > > > I provided concrete example with size and device provisioning math too and
> > > > other example of multi-physical address VQ.
> > > So transporting register by register over some admin transport is sub-optimal.
> > >
> > 
> > Not register by register, we can send all of config space as long as it's RO. This
> > field is.
> >
> It is RO in context of one member device, but every VF can have different value.
> The device will never know if one will use new cmdvq to access or some old driver will use without it.
> And hence, it always needs to provision it on onchip memory for backward compatibility.
> 
> Instead of decision point being RO vs RW, 
> any new fields via cmdvq and existing fields stays in cfg space, give predictable behavior to size the member devices in the system.
> Once the cmdvq is available, we can get rid of GET command used in this version for new future features.
> Till that arrives, GET command is the efficient way.

I understand.  I just don't much like these patchwork solutions though.
And I don't like that we will pay by not having a single coherent
way to provision and query capabilities through config space;
instead, just for this thing, we will have a special thing.

Why don't we focus on working on a full solution? Just don't implement
this thing in your devices meanwhile until we do.

> > --
> > MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 16:28                       ` Michael S. Tsirkin
@ 2023-06-22 16:42                         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 16:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 12:28 PM

> > > Not sure what's the implication?
> > Implication is device needs to store this in always available on-chip memory
> > which is not good.
> 
> Oh by devices you mean VFs. Now I get your motivation, at least. Thanks.
>
> > >
> > > > > For example, for migration driver might want to validate that two
> > > > > devices have same capability. doing it without dma is nicer.
> > > > >
> > > > A migration driver for real world scenario, will almost have to use the dma for
> > > > amount of data it needs to exchange.
> > >
> > > Not migration itself, provisioning.
> > >
> > Provisioning driver usually do not attach to the member device directly.
> > This requires device reset, followed by reaching _DRIVER stage, querying
> > features etc and config area.
> > And unbinding it and second reset by member driver. Ugh.
> > Provisioning driver also needs to get the state or capabilities even when
> > member driver is already attached.
> > So config space is not much a gain either.
> >
> 
> Absolutely, that's why we have admin commands.  I was hoping for an
> admin command that basically gets/sets RO fields of the config space.
>
Admin commands, as I recall, are not directly accessible by the member driver to the member device.
So a cmdq or cfgq is needed.
 
> > Instead of decision point being RO vs RW,
> > any new fields via cmdvq and existing fields stays in cfg space, give predictable
> > behavior to size the member devices in the system.
> > Once the cmdvq is available, we can get rid of GET command used in this
> > version for new future features.
> > Till that arrives, GET command is the efficient way.
> 
> I understand.  I just don't much like these patchwork solutions though.
> And I don't like that we will pay by not having a single coherent
> way to provision and query capabilities through config space,
> instead just for this thing we will have a special thing.
>
The single way for every device to query its capabilities is via a cfgvq for all new fields
(and optionally old fields), without extending the existing config space.
 
> Why don't we focus on a work on a full solution? Just don't implement
> this thing in your devices meanwhile until we do.
> 
Then Heng needs to wait for cfgvq to be defined and implemented first.
Doesn't look reasonable to me.
The current GET is coherent with the newly defined commands, such as notification coalescing.

As a community, we should work on defining the cfgvq; until then, we should have the optimal way to get the config, i.e. using the cvq.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 14:27                         ` Parav Pandit
@ 2023-06-22 16:46                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 16:46 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 02:27:31PM +0000, Parav Pandit wrote:
> 
> 
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Thursday, June 22, 2023 9:43 AM
> > 
> > Yes, I think we also have to consider upcoming
> >      1. device counters (e.g. supported_device_counter),
> >      2. receive flow filters (e.g. supported_flow_types, supported_max_entries),
> >      3. header splits (e.g. supported_split_types) etc.
> > Continuous expansion of the configuration space needs to be careful.
> > 
> > >
> > > Instead of decision point being RO vs RW, any new fields via cmdvq and
> > > existing fields stays in cfg space, give predictable behavior to size the member
> > > devices in the system.
> > > Once the cmdvq is available, we can get rid of GET command used in this
> > > version for new future features.
> > > Till that arrives, GET command is the efficient way.
> > 
> > Yes, I agree.
> > 
> > Thanks.
> 
> Right. So, Michael's main concern was two vs one struct.
> And given the GET and SET work on different fields, having two structures is just fine.
> The code is fairly small.
> I don’t see any real issue here with v18.
> 

The hardware footprint of keeping this in memory is also fairly small :)
I care about a messy interface because this mess builds up over time.

And I am worried about capabilities really. My bad that I missed this
change in v13. I only can say in my defence that I already had
to rewrite huge chunks of this proposal to make it readable
so one can't say I'm only delaying things, I also made an effort
to help this progress faster :)

I feel we need a single place where device capabilities can live. So far
they were in config space.  It's consistent, yes I get this has hardware
costs *if* there's a huge number of VFs and *if* there's a way to
provision each VF with a different configuration.  And yes querying VFs
over MMIO is kind of ugly. But it does at least work, and works fine
while VF is assigned.  So we can build migration around that *today*.
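
To make the two "ifs" in the cost argument above concrete, here is a minimal sketch of the sizing arithmetic. The function name and parameters are hypothetical and the model is an assumption for illustration only (it is not taken from the spec or from any device): it assumes that read-only fields which may differ per VF need one copy per VF in always-available memory, while fields that are identical across VFs can share a single copy.

```c
#include <stdint.h>

/*
 * Hypothetical model, for illustration only: bytes of always-available
 * device memory needed to back a set of read-only config fields.
 *
 * num_vfs        - number of member devices (VFs)
 * field_bytes    - total size of the fields in question, per device
 * per_vf_values  - nonzero if each VF can be provisioned with different values
 */
static uint64_t cfg_backing_bytes(uint64_t num_vfs, uint64_t field_bytes,
                                  int per_vf_values)
{
    /* Different values per VF: one copy each. Same values: one shared copy. */
    return per_vf_values ? num_vfs * field_bytes : field_bytes;
}
```

Under this assumed model the cost only grows with the VF count when both conditions hold, which is the case the hardware-cost objection is about.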

But querying over cvq while VF is assigned clearly *doesn't* work.

So what is the solution proposed for this?

Yes the current migration is broken in many ways but that's what we
have. Let's build something better sure but that is not 1.3 material.

-- 
MST



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 16:42                         ` Parav Pandit
@ 2023-06-22 16:54                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 16:54 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 04:42:40PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 12:28 PM
> 
> > > > Not sure what's the implication?
> > > Implication is device needs to store this in always available on-chip memory
> > > which is not good.
> > 
> > Oh by devices you mean VFs. Now I get your motivation, at least. Thanks.
> >
> > > >
> > > > > > For example, for migration driver might want to validate that two
> > > > > > devices have same capability. doing it without dma is nicer.
> > > > > >
> > > > > A migration driver for real world scenario, will almost have to use the dma for
> > > > > amount of data it needs to exchange.
> > > >
> > > > Not migration itself, provisioning.
> > > >
> > > Provisioning driver usually do not attach to the member device directly.
> > > This requires device reset, followed by reaching _DRIVER stage, querying
> > > features etc and config area.
> > > And unbinding it and second reset by member driver. Ugh.
> > > Provisioning driver also needs to get the state or capabilities even when
> > > member driver is already attached.
> > > So config space is not much a gain either.
> > >
> > 
> > Absolutely, that's why we have admin commands.  I was hoping for an
> > admin command that basically gets/sets RO fields of the config space.
> >
> Admin command as I recall are not accessible directly by the member driver to the member device.
> So a cmdq or cfgq is needed.

Possible, sure. Or we actually discussed a self group. I took it away until it had a user.


> > > Instead of decision point being RO vs RW,
> > > any new fields via cmdvq and existing fields stays in cfg space, give predictable
> > > behavior to size the member devices in the system.
> > > Once the cmdvq is available, we can get rid of GET command used in this
> > > version for new future features.
> > > Till that arrives, GET command is the efficient way.
> > 
> > I understand.  I just don't much like these patchwork solutions though.
> > And I don't like that we will pay by not having a single coherent
> > way to provision and query capabilities through config space,
> > instead just for this thing we will have a special thing.
> >
> The single way for every device to query their capabilities is via a cfgvq for all new fields without extending the existing config space.
> (and optionally old fields).

Or adminq with self group. I like this somewhat better because we need
exactly same query from owner.

> > Why don't we focus on a work on a full solution? Just don't implement
> > this thing in your devices meanwhile until we do.
> > 
> Then Heng needs to wait for cfgvq to be defined to be implemented first.
> Doesn't look reasonable to me.

And *everything* has to wait. No, not reasonable. We somehow managed
to release several spec versions and things did not grind to
a halt without cfgvq. Don't see a reason to do it right now,
what's special about now? I feel we should add to config space
and then solve it all.

> Current GET is coherent with the new commands defined such as notification coalescing.
> 
> As community, we should work on defining the cfgvq, till that time have the optimal way to get the config, i.e. using the cvq.

cvq doesn't really work for capabilities though.

-- 
MST



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 16:46                           ` Michael S. Tsirkin
@ 2023-06-22 16:54                             ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 16:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 12:47 PM

> 
> The hardware footprint of keeping this in memory is also fairly small :) I care
> about a messy interface because this mess builds up over time.
>
It is really a simple GET command.
It is actually messy for the device to implement functionality in two places in cfg space and cvq.
 
> And I am worried about capabilities really. My bad that I missed this change in
> v13. I only can say in my defence that I already had to rewrite huge chunks of
> this proposal to make it readable so one can't say I'm only delaying things, I also
> made an effort to help this progress faster :)
> 
> I feel we need a single place where device capabilities can live. So far they were
> in config space.  It's consistent, yes I get this has hardware costs *if* there's a
> huge number of VFs and *if* there's a way to provision each VF with a different
> configuration.  
All the ifs are valid today.

> And yes querying VFs over MMIO is kind of ugly. But it does at
> least work, and works fine while VF is assigned.  So we can build migration
> around that *today*.
>
Put another way, migration can be skipped for this feature bit, and it still works for the rest.
 
> But querying over cvq while VF is assigned clearly *doesn't* work.
> 
That is not the idea at all.
Querying VF capabilities is the role of the admin command for which we built it.

> So what is the solution proposed for this?
> 
1. Query member device capabilities via admin command

> Yes the current migration is broken in many ways but that's what we have. Let's
> build something better sure but that is not 1.3 material.

True, it is not 1.3 material, hence the proposal was to have the GET command.
Once/if we reach agreement that no new fields are to be added to config space starting with 1.4, and that they should be queried using a non-intercepted cfgvq, it makes sense to let this go in cfg space.
Else the GET command seems the elegant and right approach.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 16:54                             ` Parav Pandit
@ 2023-06-22 17:03                               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:03 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 04:54:48PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 12:47 PM
> 
> > 
> > The hardware footprint of keeping this in memory is also fairly small :) I care
> > about a messy interface because this mess builds up over time.
> >
> It is really a simple GET command.
> It is actually messy for the device to implement functionality in two places in cfg space and cvq.
>  
> > And I am worried about capabilities really. My bad that I missed this change in
> > v13. I only can say in my defence that I already had to rewrite huge chunks of
> > this proposal to make it readable so one can't say I'm only delaying things, I also
> > made an effort to help this progress faster :)
> > 
> > I feel we need a single place where device capabilities can live. So far they were
> > in config space.  It's consistent, yes I get this has hardware costs *if* there's a
> > huge number of VFs and *if* there's a way to provision each VF with a different
> > configuration.  
> All the ifs are valid today.
> 
> > And yes querying VFs over MMIO is kind of ugly. But it does at
> > least work, and works fine while VF is assigned.  So we can build migration
> > around that *today*.
> >
> Other way to say migration can be skipped for this feature bit, and it still works for rest.

If VF is assigned then we can't really control what the guest enables.

> > But querying over cvq while VF is assigned clearly *doesn't* work.
> > 
> That is not the idea at all.
> Querying VF capabilities is the role of the admin command for which we built it.

This GET is exactly that though.

> > So what is the solution proposed for this?
> > 
> 1. Query member device capabilities via admin command

But that's not 1.3 material.

> > Yes the current migration is broken in many ways but that's what we have. Let's
> > build something better sure but that is not 1.3 material.
> 
> True, it is not 1.3 material, hence the proposal was to have the GET command.
> Once/if we reach agreement that no new fields to be added to config space starting 1.4 and should be queried using non intercepted cfgvq, it makes sense to let this go in cfg space.
> Else GET command seems the elegant and right approach.

I expect no such agreement at all. Instead, I expect that we'll have an
alternative way to access config space. guest virtio core then needs to
learn both ways, and devices can support one or both.

A good implementation of virtio_cread can abstract that easily so we
don't need to change drivers.
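
As a rough illustration of the abstraction described above, here is a minimal sketch in plain C. All names (`cfg_access_ops`, `sketch_vdev`, `sketch_cread32`, the `window_*` and `cmd_*` backends, and the demo data) are hypothetical; this is not the Linux `virtio_cread` implementation and not anything defined by the spec. It assumes the core picks one access method per device and that drivers only ever call the single helper. Endianness handling is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: one way of fetching config bytes. Not a Linux or spec API. */
struct cfg_access_ops {
    int (*read)(void *ctx, size_t offset, void *buf, size_t len);
};

/* Hypothetical device handle: this is all a driver gets to see. */
struct sketch_vdev {
    const struct cfg_access_ops *ops;
    void *ctx;
};

/* Backend 1: config space as a memory window (stands in for MMIO). */
struct window_ctx {
    const uint8_t *base;
    size_t size;
};

static int window_read(void *ctx, size_t offset, void *buf, size_t len)
{
    const struct window_ctx *w = ctx;

    if (offset > w->size || len > w->size - offset)
        return -1; /* out of range */
    memcpy(buf, w->base + offset, len);
    return 0;
}

/* Backend 2: command-style access; fetch() stands in for a queue request. */
struct cmd_ctx {
    int (*fetch)(size_t offset, void *buf, size_t len);
};

static int cmd_read(void *ctx, size_t offset, void *buf, size_t len)
{
    const struct cmd_ctx *c = ctx;

    return c->fetch(offset, buf, len);
}

static const struct cfg_access_ops window_ops = { .read = window_read };
static const struct cfg_access_ops cmd_ops = { .read = cmd_read };

/* Driver-facing helper: the same call whichever backend the core selected. */
static int sketch_cread32(const struct sketch_vdev *vdev, size_t offset,
                          uint32_t *val)
{
    return vdev->ops->read(vdev->ctx, offset, val, sizeof(*val));
}

/* Demo data: both backends expose the same 8 config bytes. */
static const uint8_t demo_cfg[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

static int demo_fetch(size_t offset, void *buf, size_t len)
{
    if (offset > sizeof(demo_cfg) || len > sizeof(demo_cfg) - offset)
        return -1;
    memcpy(buf, demo_cfg + offset, len);
    return 0;
}

/* Reads one field through both backends; returns 0 if both reads succeed. */
static int demo_read_both(size_t offset, uint32_t *via_window,
                          uint32_t *via_cmd)
{
    struct window_ctx w = { demo_cfg, sizeof(demo_cfg) };
    struct cmd_ctx c = { demo_fetch };
    struct sketch_vdev a = { &window_ops, &w };
    struct sketch_vdev b = { &cmd_ops, &c };

    if (sketch_cread32(&a, offset, via_window) != 0)
        return -1;
    return sketch_cread32(&b, offset, via_cmd);
}
```

In this sketch the choice of backend lives entirely in the `ops` pointer, so code written against `sketch_cread32` would not need to change if a device offered a second access method.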

-- 
MST



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-22 17:03                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:03 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 04:54:48PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 12:47 PM
> 
> > 
> > The hardware footprint of keeping this in memory is also fairly small :) I care
> > about a messy interface because this mess builds up over time.
> >
> It is really a simple GET command.
> It is actually messy for the device to implement functionality in two places in cfg space and cvq.
>  
> > And I am worried about capabilities really. My bad that I missed this change in
> > v13. I only can say in my defence that I already had to rewrite huge chunks of
> > this proposal to make it readable so one can't say I'm only delaying things, I also
> > made an effort to help this progress faster :)
> > 
> > I feel we need a single place where device capabilities can live. So far they were
> > in config space.  It's consistent, yes I get this has hardware costs *if* there's a
> > huge number of VFs and *if* there's a way to provision each VF with a different
> > configuration.  
> All the ifs are valid today.
> 
> > And yes querying VFs over MMIO is kind of ugly. But it does at
> > least work, and works fine while VF is assigned.  So we can build migration
> > around that *today*.
> >
> Other way to say migration can be skipped for this feature bit, and it still works for rest.

If the VF is assigned then we can't really control what the guest enables.

> > But querying over cvq while VF is assigned clearly *doesn't* work.
> > 
> That is not the idea at all.
> Querying VF capabilities is the role of the admin command for which we built it.

This GET is exactly that though.

> > So what is the solution proposed for this?
> > 
> 1. Query member device capabilities via admin command

But that's not 1.3 material.

> > Yes the current migration is broken in many ways but that's what we have. Let's
> > build something better sure but that is not 1.3 material.
> 
> True, it is not 1.3 material, hence the proposal was to have the GET command.
> Once/if we reach agreement that no new fields to be added to config space starting 1.4 and should be queried using non intercepted cfgvq, it makes sense to let this go in cfg space.
> Else GET command seems the elegant and right approach.

I expect no such agreement at all. Instead, I expect that we'll have an
alternative way to access config space. guest virtio core then needs to
learn both ways, and devices can support one or both.

A good implementation of virtio_cread can abstract that easily so we
don't need to change drivers.
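
To illustrate the kind of abstraction meant here, below is a minimal
sketch in C. It is not code from this thread, from the spec, or from any
existing driver: struct cfg_ops, mmio_get, vq_get and cfg_read_u8 are
hypothetical names, and the "device" is just an in-memory array. The only
point is that the driver-facing accessor stays the same while the
transport underneath is either a direct config space read or a
queue-based query.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cfg_dev;

/* Hypothetical transport hook: copy len bytes of device config at offset. */
struct cfg_ops {
    void (*get)(struct cfg_dev *dev, size_t offset, void *buf, size_t len);
};

struct cfg_dev {
    const struct cfg_ops *ops;
    uint8_t space[16];     /* stand-in for the device's config contents */
    unsigned vq_queries;   /* counts reads served by the queue-based path */
};

/* Path 1: classic config space access (modelled as a plain memory copy). */
static void mmio_get(struct cfg_dev *dev, size_t off, void *buf, size_t len)
{
    assert(off + len <= sizeof(dev->space));
    memcpy(buf, dev->space + off, len);
}

/* Path 2: alternative access. A real device would post a request on a
 * queue and wait for completion; here that is only counted. */
static void vq_get(struct cfg_dev *dev, size_t off, void *buf, size_t len)
{
    assert(off + len <= sizeof(dev->space));
    dev->vq_queries++;
    memcpy(buf, dev->space + off, len);
}

static const struct cfg_ops mmio_ops = { .get = mmio_get };
static const struct cfg_ops vq_ops   = { .get = vq_get };

/* Driver-facing accessor: callers never see which path is in use. */
static uint8_t cfg_read_u8(struct cfg_dev *dev, size_t off)
{
    uint8_t v;

    dev->ops->get(dev, off, &v, sizeof(v));
    return v;
}
```

A driver written against cfg_read_u8() would not need to change when a
device switches from mmio_ops to vq_ops, which is the property argued for
above.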

-- 
MST


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 16:54                           ` Michael S. Tsirkin
@ 2023-06-22 17:04                             ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> Behalf Of Michael S. Tsirkin
> Sent: Thursday, June 22, 2023 12:54 PM

> > Admin command as I recall are not accessible directly by the member driver to
> the member device.
> > So a cmdq or cfgq is needed.
> 
> Possible, sure. Or we actually discussed a self group. I took it away until it had a
> user.
>
The problematic part of the AQ is that its index is placed in yet another on-chip die register, which does not scale because each member device has a different queue count.

When the admin queue was discussed, it was only for the group owner (you answered Jiri).
Hence the scale was relatively small, so it was acceptable.

Now having unique numbers for VFs is not good.
Max's proposal was the last index after the existing defined VQs of num_queues, which saves storage space on the device.

> > The single way for every device to query their capabilities is via a cfgvq for all
> new fields without extending the existing config space.
> > (and optionally old fields).
> 
> Or adminq with self group. I like this somewhat better because we need exactly
> same query from owner.
>
Yes, this is why I proposed naming it cmdvq, so that it can carry admin commands or others.
But fine, we had to make progress for the group owner.
 
> > > Why don't we focus on a work on a full solution? Just don't
> > > implement this thing in your devices meanwhile until we do.
> > >
> > Then Heng needs to wait for cfgvq to be defined to be implemented first.
> > Doesn't look reasonable to me.
> 
> And *everything* has to wait. No, not reasonable. We somehow managed to
> release several spec versions and things did not ground to a halt without cfgvq.
> Don't see a reason to do it right now, what's special about now? I feel we should
> add to config space and then solve it all.
>
Things didn't grind to a halt, but at the cost of devices continuing to increase their memory footprint.
The latest addition I remember is the queue_reset register.
It was a bit, but a purely control operation that got in there.
 
> > Current GET is coherent with the new commands defined such as notification
> coalescing.
> >
> > As community, we should work on defining the cfgvq, till that time have the
> optimal way to get the config, i.e. using the cvq.
> 
> cvq doesn't really work for capabilities though.

For the device itself it does, which is what is being done here.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 12:32                     ` Parav Pandit
@ 2023-06-22 17:11                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:11 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 12:32:33PM +0000, Parav Pandit wrote:
> Provisioning driver usually do not attach to the member device directly.
> This requires device reset, followed by reaching _DRIVER stage, querying features etc and config area.
> And unbinding it and second reset by member driver. Ugh.
> Provisioning driver also needs to get the state or capabilities even when member driver is already attached.
> So config space is not much a gain either.

Actually it's RO so you *can* read it without any issues:
- block guest access to status
- check DRIVER.
If set:
	- read features, config
If not set:
	- read features, config
	- reset
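
The steps above can be sketched in C as follows. This is only a model
under stated assumptions: struct member_dev, struct snapshot,
provision_read() and dev_reset() are hypothetical names, the device is a
plain struct rather than real hardware, and unblocking guest access at
the end is an assumption that the steps above do not spell out. The
DRIVER status bit value (2) is the one defined by the virtio spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DRIVER bit of the device status field, as defined by the virtio spec. */
#define VIRTIO_STATUS_DRIVER 0x2u

/* Hypothetical model of a member device as seen by a provisioning entity. */
struct member_dev {
    uint8_t  status;               /* device status field */
    uint64_t features;             /* read-only device feature bits */
    uint32_t config_word;          /* stand-in for the read-only config area */
    bool     guest_status_blocked; /* true while guest status access is held off */
    unsigned resets;               /* number of resets issued, for illustration */
};

struct snapshot {
    uint64_t features;
    uint32_t config_word;
};

static void dev_reset(struct member_dev *d)
{
    d->status = 0;
    d->resets++;
}

static struct snapshot provision_read(struct member_dev *d)
{
    struct snapshot s;
    bool driver_set;

    d->guest_status_blocked = true;   /* block guest access to status */
    driver_set = (d->status & VIRTIO_STATUS_DRIVER) != 0;  /* check DRIVER */

    /* Read features and config in both cases: they are read-only. */
    s.features = d->features;
    s.config_word = d->config_word;

    /* Only when no driver was attached is the device reset afterwards. */
    if (!driver_set)
        dev_reset(d);

    d->guest_status_blocked = false;  /* assumption: unblock when done */
    return s;
}
```

In this model a device with a driver attached is never reset by the
reader, which is why the read can happen while the VF is assigned.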

I am not saying it is elegant but then all of vdpa pile of
hacks is not elegant.

And I am all for building something better but we didn't
build it yet.

-- 
MST


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:03                               ` [virtio-dev] " Michael S. Tsirkin
@ 2023-06-22 17:11                                 ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin
> Sent: Thursday, June 22, 2023 1:04 PM
> To: Parav Pandit <parav@nvidia.com>
> 
> On Thu, Jun 22, 2023 at 04:54:48PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Thursday, June 22, 2023 12:47 PM
> >
> > >
> > > The hardware footprint of keeping this in memory is also fairly
> > > small :) I care about a messy interface because this mess builds up over time.
> > >
> > It is really a simple GET command.
> > It is actually messy for the device to implement functionality in two places in
> cfg space and cvq.
> >
> > > And I am worried about capabilities really. My bad that I missed
> > > this change in v13. I only can say in my defence that I already had
> > > to rewrite huge chunks of this proposal to make it readable so one
> > > can't say I'm only delaying things, I also made an effort to help
> > > this progress faster :)
> > >
> > > I feel we need a single place where device capabilities can live. So
> > > far they were in config space.  It's consistent, yes I get this has
> > > hardware costs *if* there's a huge number of VFs and *if* there's a
> > > way to provision each VF with a different configuration.
> > All the ifs are valid today.
> >
> > > And yes querying VFs over MMIO is kind of ugly. But it does at least
> > > work, and works fine while VF is assigned.  So we can build
> > > migration around that *today*.
> > >
> > Other way to say migration can be skipped for this feature bit, and it still
> works for rest.
> 
> If VF is assigned then we can't really control what does guest enable.
>
All parameters set over the CVQ need to be accessible to the migration entity,
including RSS and including this bit,
either by trapping the CVQ or by having an AQ command to query member device capabilities.

> > > But querying over cvq while VF is assigned clearly *doesn't* work.
> > >
> > That is not the idea at all.
> > Querying VF capabilities is the role of the admin command for which we built
> it.
> 
> This GET is exactly that though.
> 
Not exactly.
This GET command is needed for the member driver to know what is supported by the device it got.

> > > So what is the solution proposed for this?
> > >
> > 1. Query member device capabilities via admin command
> 
> But that's not 1.3 material.
> 
> > > Yes the current migration is broken in many ways but that's what we
> > > have. Let's build something better sure but that is not 1.3 material.
> >
> > True, it is not 1.3 material, hence the proposal was to have the GET command.
> > Once/if we reach agreement that no new fields to be added to config space
> starting 1.4 and should be queried using non intercepted cfgvq, it makes sense
> to let this go in cfg space.
> > Else GET command seems the elegant and right approach.
> 
> I expect no such agreement at all. Instead, I expect that we'll have an alternative
> way to access config space. guest virtio core then needs to learn both ways, and
> devices can support one or both.
> 
Yeah, we disagree.
The alternative way that you propose is not a predictable way to build the device efficiently.
It always needs to account for supporting old drivers.
This is clearly sub-optimal as the capabilities grow.

> A good implementation of virtio_cread can abstract that easily so we don't need
> to change drivers.

There is no backward compat issue for the GET command being new.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:04                             ` [virtio-comment] " Parav Pandit
@ 2023-06-22 17:14                               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:14 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:04:04PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 12:54 PM
> 
> > > Admin command as I recall are not accessible directly by the member driver to
> > the member device.
> > > So a cmdq or cfgq is needed.
> > 
> > Possible, sure. Or we actually discussed a self group. I took it away until it had a
> > user.
> >
> The problematic part of AQ is that its index is placed in the yet another onchip die register that does not scale as each member device has different queue count.
> When admin queue was discussed, it was only for group owner, (you answered to Jiri).
> Hence the scale is relatively less, so it was acceptable.
> 
> Now having unique numbers for VFs is not good.
> Max proposal was the last index after existing defined VQs of num_queues, that saves the storage space on device.

Surely, you can just have a very large index and be done with it?

> > > The single way for every device to query their capabilities is via a cfgvq for all
> > new fields without extending the existing config space.
> > > (and optionally old fields).
> > 
> > Or adminq with self group. I like this somewhat better because we need exactly
> > same query from owner.
> >
> Yes. this is why I proposed to name is cmdvq that can carry admin commands or other.
> But fine, we had to progress for group owner.
>  
> > > > Why don't we focus on a work on a full solution? Just don't
> > > > implement this thing in your devices meanwhile until we do.
> > > >
> > > Then Heng needs to wait for cfgvq to be defined to be implemented first.
> > > Doesn't look reasonable to me.
> > 
> > And *everything* has to wait. No, not reasonable. We somehow managed to
> > release several spec versions and things did not ground to a halt without cfgvq.
> > Don't see a reason to do it right now, what's special about now? I feel we should
> > add to config space and then solve it all.
> >
> Things didn't ground at cost of device keep increasing their memory footprint.
> The latest addition I remember is the queue_reset register.
> It was bit but a purely control operation that got in there.
>  
> > > Current GET is coherent with the new commands defined such as notification
> > coalescing.
> > >
> > > As community, we should work on defining the cfgvq, till that time have the
> > optimal way to get the config, i.e. using the cvq.
> > 
> > cvq doesn't really work for capabilities though.
> 
> For the device itself, it does which is what is being done here.

Yes but not for migration.

-- 
MST


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:11                       ` Michael S. Tsirkin
@ 2023-06-22 17:15                         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 1:11 PM

> > Provisioning driver usually do not attach to the member device directly.
> > This requires device reset, followed by reaching _DRIVER stage, querying
> features etc and config area.
> > And unbinding it and second reset by member driver. Ugh.
> > Provisioning driver also needs to get the state or capabilities even when
> member driver is already attached.
> > So config space is not much a gain either.
> 
> Actually it's RO so you *can* read it without any issues:

It is RO but not the same across all devices.

> - block guest access to status
> - check DRIVER.
> If set:
> 	- read features, config
> If not set:
> 	- read features, config
> 	- reset
> 
This is what I explained.
It is messier, if you consider the GET command a mess.

> I am not saying it is elegant but then all of vdpa pile of hacks is not elegant.
> 
I don't want to comment on vdpa. But it is not part of the spec...

> And I am all for building something better but we didn't build it yet.

The proposal for 1.4 is literally very simple, as below.
1. All existing fields of cfg space stay in cfg space.
2. Any new capabilities to be queried are queried using a vq (aq, cfgvq, whatevervq).
3. Optionally, existing fields can be queried over the vq of #2.

Once this arrives, there is no need for new GET commands.
Until then, don't keep growing the cfg space indefinitely.
Whoever proposes the next addition to cfg space should work on defining the cfgvq instead.
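
As a rough illustration of the three rules, here is a small C sketch of
how a driver might pick the access path for a field. Nothing here is
defined by the spec or by this thread: enum access_path, struct field and
pick_path() are hypothetical names, and "dev_has_query_vq" simply stands
for whichever vq (aq, cfgvq, ...) ends up carrying the queries.

```c
#include <assert.h>
#include <stdbool.h>

enum access_path { PATH_CFG_SPACE, PATH_VQ, PATH_UNAVAILABLE };

/* Hypothetical field descriptor: true if the field already exists in
 * cfg space today, false if it is a new capability. */
struct field {
    bool in_cfg_space;
};

static enum access_path pick_path(struct field f, bool dev_has_query_vq,
                                  bool prefer_vq)
{
    /* Rule 2: new capabilities are only reachable through the vq. */
    if (!f.in_cfg_space)
        return dev_has_query_vq ? PATH_VQ : PATH_UNAVAILABLE;

    /* Rule 3: existing fields may optionally be queried over the vq. */
    if (prefer_vq && dev_has_query_vq)
        return PATH_VQ;

    /* Rule 1: existing fields stay readable from cfg space. */
    return PATH_CFG_SPACE;
}
```

Under this model an old driver that never sets prefer_vq keeps working
unchanged for existing fields, while new capabilities never add to cfg
space.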


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:14                               ` [virtio-comment] " Michael S. Tsirkin
@ 2023-06-22 17:20                                 ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 1:15 PM
> 
> On Thu, Jun 22, 2023 at 05:04:04PM +0000, Parav Pandit wrote:
> >
> >
> > > From: virtio-dev@lists.oasis-open.org
> > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > > Sent: Thursday, June 22, 2023 12:54 PM
> >
> > > > Admin command as I recall are not accessible directly by the
> > > > member driver to the member device.
> > > > So a cmdq or cfgq is needed.
> > >
> > > Possible, sure. Or we actually discussed a self group. I took it
> > > away until it had a user.
> > >
> > The problematic part of AQ is that its index is placed in the yet another onchip
> > die register that does not scale as each member device has different queue
> > count.
> > When admin queue was discussed, it was only for group owner, (you
> > answered to Jiri).
> > Hence the scale is relatively less, so it was acceptable.
> >
> > Now having unique numbers for VFs is not good.
> > Max proposal was the last index after existing defined VQs of num_queues,
> > that saves the storage space on device.
> 
> Surely, you can just have a very large index and be done with it?
>
There is the count of AQs too.
For receive flow filters, one may want multiple flowfilter_vqs, as the perf requirement is high for some VMs.

And the device has to build non-linear PCI steering for driver notifications at such a very high queue count.
It is optimal to have a finite and linear max queue value.

> > > > The single way for every device to query their capabilities is via
> > > > a cfgvq for all new fields without extending the existing config space.
> > > > (and optionally old fields).
> > >
> > > Or adminq with self group. I like this somewhat better because we
> > > need exactly same query from owner.
> > >
> > Yes. this is why I proposed to name is cmdvq that can carry admin commands
> > or other.
> > But fine, we had to progress for group owner.
> >
> > > > > Why don't we focus on a work on a full solution? Just don't
> > > > > implement this thing in your devices meanwhile until we do.
> > > > >
> > > > Then Heng needs to wait for cfgvq to be defined to be implemented first.
> > > > Doesn't look reasonable to me.
> > >
> > > And *everything* has to wait. No, not reasonable. We somehow managed
> > > to release several spec versions and things did not grind to a halt without
> > > cfgvq.
> > > Don't see a reason to do it right now, what's special about now? I
> > > feel we should add to config space and then solve it all.
> > >
> > Things didn't ground at cost of device keep increasing their memory footprint.
> > The latest addition I remember is the queue_reset register.
> > It was bit but a purely control operation that got in there.
> >
> > > > Current GET is coherent with the new commands defined such as
> > > > notification coalescing.
> > > >
> > > > As community, we should work on defining the cfgvq, till that time
> > > > have the optimal way to get the config, i.e. using the cvq.
> > >
> > > cvq doesn't really work for capabilities though.
> >
> > For the device itself, it does which is what is being done here.
> 
> Yes but not for migration.

For migration, an admin command to query capabilities is needed.
This is already present in the transport vq proposal, which is to be rebased on top of the admin commands.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:11                                 ` Parav Pandit
@ 2023-06-22 17:28                                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:28 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:11:24PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-open.org>
> > On Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 1:04 PM
> > To: Parav Pandit <parav@nvidia.com>
> > 
> > On Thu, Jun 22, 2023 at 04:54:48PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Thursday, June 22, 2023 12:47 PM
> > >
> > > >
> > > > The hardware footprint of keeping this in memory is also fairly
> > > > small :) I care about a messy interface because this mess builds up over time.
> > > >
> > > It is really a simple GET command.
> > > It is actually messy for the device to implement functionality in two places in
> > > cfg space and cvq.
> > >
> > > > And I am worried about capabilities really. My bad that I missed
> > > > this change in v13. I only can say in my defence that I already had
> > > > to rewrite huge chunks of this proposal to make it readable so one
> > > > can't say I'm only delaying things, I also made an effort to help
> > > > this progress faster :)
> > > >
> > > > I feel we need a single place where device capabilities can live. So
> > > > far they were in config space.  It's consistent, yes I get this has
> > > > hardware costs *if* there's a huge number of VFs and *if* there's a
> > > > way to provision each VF with a different configuration.
> > > All the ifs are valid today.
> > >
> > > > And yes querying VFs over MMIO is kind of ugly. But it does at least
> > > > work, and works fine while VF is assigned.  So we can build
> > > > migration around that *today*.
> > > >
> > > Other way to say migration can be skipped for this feature bit, and it still
> > > works for rest.
> > 
> > If VF is assigned then we can't really control what does guest enable.
> >
> All parameters set over the CVQ needs to be accessible to the migration entity.
> Including RSS and including this bit.
> Either by trapping the CVQ or by having AQ command to query member dev capabilities.

Yes, parameters that are set need to be trapped, so they can be
replayed. This is a capability though. After the guest has
driven the CVQ for a while, submitting a command to it
to get a capability - I don't see how that works.
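
To make the distinction concrete, here is a toy Python model of trap and replay. CvqTrap, ModelDevice and the command name are hypothetical, not taken from the spec. It only shows that replay covers parameters the guest has written; a capability is never written by the guest, so nothing lands in the log for it.

```python
# Toy model of trapping CVQ "set" commands for later replay.
# All class and command names are hypothetical.


class ModelDevice:
    def __init__(self):
        self.state = {}  # command name -> last payload applied

    def apply(self, command, payload):
        self.state[command] = payload


class CvqTrap:
    """Sits between guest and device and remembers what the guest set."""

    def __init__(self, device):
        self.device = device
        self.log = {}  # command name -> last payload written by the guest

    def submit(self, command, payload):
        self.log[command] = payload          # trap
        self.device.apply(command, payload)  # forward to the real device

    def replay(self, destination):
        # Re-apply the latest value of every trapped parameter.
        for command, payload in self.log.items():
            destination.apply(command, payload)
```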

> > > > But querying over cvq while VF is assigned clearly *doesn't* work.
> > > >
> > > That is not the idea at all.
> > > Querying VF capabilities is the role of the admin command for which we built
> > > it.
> > 
> > This GET is exactly that though.
> > 
> Not exactly.
> This GET command is needed for the member driver to know what is supported for the device it got.

Yes, but how will the hypervisor get the capability?

> > > > So what is the solution proposed for this?
> > > >
> > > 1. Query member device capabilities via admin command
> > 
> > But that's not 1.3 material.
> > 
> > > > Yes the current migration is broken in many ways but that's what we
> > > > have. Let's build something better sure but that is not 1.3 material.
> > >
> > > True, it is not 1.3 material, hence the proposal was to have the GET command.
> > > Once/if we reach agreement that no new fields to be added to config space
> > > starting 1.4 and should be queried using non intercepted cfgvq, it makes sense
> > > to let this go in cfg space.
> > > Else GET command seems the elegant and right approach.
> > 
> > I expect no such agreement at all. Instead, I expect that we'll have an alternative
> > way to access config space. guest virtio core then needs to learn both ways, and
> > devices can support one or both.
> > 
> Yeah, we disagree.
> Because alternative way that you propose is not predictable way to build the device efficiently.
> It always needs to account for old driver to support.
> This is clearly sub-optimal as the capabilities grow.

So just quickly add the new capability to the spec, and then the number
of Linux releases that have the new feature but not the config command
(or whatever that ends up being) will be too small for vendors to care.

> 
> > A good implementation of virtio_cread can abstract that easily so we don't need
> > to change drivers.
> 
> There is no backward compat issue for the GET command being new.

It's just a shortcut replacing what we really want.  As long as a
shortcut is available people will keep using exactly that.  So I fully
expect more proposals for such GET commands on the pretext that one is
there so why not another one. Adding more tech debt for whoever
finally gets around to building a config space access gateway.

-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:15                         ` Parav Pandit
@ 2023-06-22 17:37                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:37 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:15:50PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 1:11 PM
> 
> > > Provisioning driver usually do not attach to the member device directly.
> > > This requires device reset, followed by reaching _DRIVER stage, querying
> > > features etc and config area.
> > > And unbinding it and second reset by member driver. Ugh.
> > > Provisioning driver also needs to get the state or capabilities even when
> > > member driver is already attached.
> > > So config space is not much a gain either.
> > 
> > Actually it's RO so you *can* read it without any issues:
> 
> It is RO but not same across all devices.

If you provision VFs differently. I got it.

> > - block guest access to status
> > - check DRIVER.
> > If set:
> > 	- read features, config
> > If not set:
> > 	- read features, config
> > 	- reset
> > 
> This is what I explained.
> It is more messy if you equate to GET command has mess.

At least it works.
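
The sequence quoted above could be sketched roughly as follows, as a toy Python model. ModelDevice and all of its method names are hypothetical stand-ins; only the DRIVER status bit value (2) and the order of steps come from the spec and the quoted list. Restoring guest access at the end is an assumption, and whatever is needed to make an idle device readable before the reset is left out, since the list does not spell it out.

```python
VIRTIO_CONFIG_S_DRIVER = 2  # DRIVER bit in the device status field


class ModelDevice:
    """Stand-in for a VF; every method name here is hypothetical."""

    def __init__(self, status, features, config):
        self.status, self.features, self.config = status, features, config
        self.guest_blocked = False
        self.resets = 0

    def block_guest_status_access(self):
        self.guest_blocked = True

    def unblock_guest_status_access(self):
        self.guest_blocked = False

    def read_status(self):
        return self.status

    def read_features(self):
        return self.features

    def read_config(self):
        return self.config

    def reset(self):
        self.status = 0
        self.resets += 1


def probe(dev):
    """Read features and config without disturbing an attached driver."""
    dev.block_guest_status_access()
    try:
        driver_set = bool(dev.read_status() & VIRTIO_CONFIG_S_DRIVER)
        features, config = dev.read_features(), dev.read_config()
        if not driver_set:
            dev.reset()  # no driver attached, so leave the device clean
        return features, config
    finally:
        # Assumption: access is restored afterwards; the list does not say so.
        dev.unblock_guest_status_access()
```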

> > I am not saying it is elegant but then all of vdpa pile of hacks is not elegant.
> > 
> I don't want to comment for vdpa. But it is not part of the spec...

Neither is QEMU.  It's one of the spec implementations. Yes, we care about
not adding blockers for features that, superficially, might make
sense for them.

> > And I am all for building something better but we didn't build it yet.
> 
> The proposal for 1.4 is literally very simple as below.
> 1. All existing fields of cfg space stays in cfg space
> 2. Any new capabilities to be queried, query using a vq (aq, cfgvq, whatevervq).
> 3. Optionally existing fields can be queries over vq of #2
> Once this arrive, no need for new GET commands.
> Till that time, don't keep infinitely grow the cfg space.
> Any next addition to cfg space, should work on defining the cfgvq.

Simple, but short-sighted. I know you guys don't support your
hardware for 10-20 years, but for software, people do.
And so "All existing fields of cfg space stays in cfg space" is a bad
idea, simply because it does not allow removing things from config
space: not in 10 years, not in 20 years, not ever.


Instead we need to allow two ways to access config space.  Teach drivers
about both, actually mandate supporting both.  And then devices will
make their own cost/benefit decision about which features they want to
support in MMIO.
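
As a toy Python illustration of a driver-side accessor that knows both ways: ModelDevice, mmio_len, command_read and cread are invented names, and using a length cutoff to decide what the device exposes over MMIO is purely an assumption made to keep the sketch short.

```python
# Toy model: one accessor, two access paths. All names are hypothetical.


class ModelDevice:
    def __init__(self, config: bytes, mmio_len: int):
        self._config = config
        self.mmio_len = mmio_len   # leading bytes the device chose to expose over MMIO
        self.commands_issued = 0   # counts reads that went through the command path

    def mmio_read(self, offset, size):
        if offset + size > self.mmio_len:
            raise ValueError("field not exposed over MMIO")
        return self._config[offset:offset + size]

    def command_read(self, offset, size):
        self.commands_issued += 1
        return self._config[offset:offset + size]


def cread(dev, offset, size):
    """Read a config field, hiding which path the device supports for it."""
    if offset + size <= dev.mmio_len:
        return dev.mmio_read(offset, size)
    return dev.command_read(offset, size)
```

Callers of cread stay the same whether a device keeps a field in MMIO or drops it, which is the point of teaching the core both paths.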


-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:20                                 ` [virtio-comment] " Parav Pandit
@ 2023-06-22 17:43                                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 17:43 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:20:23PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 1:15 PM
> > 
> > On Thu, Jun 22, 2023 at 05:04:04PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: virtio-dev@lists.oasis-open.org
> > > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > > > Sent: Thursday, June 22, 2023 12:54 PM
> > >
> > > > > Admin command as I recall are not accessible directly by the
> > > > > member driver to the member device.
> > > > > So a cmdq or cfgq is needed.
> > > >
> > > > Possible, sure. Or we actually discussed a self group. I took it
> > > > away until it had a user.
> > > >
> > > The problematic part of AQ is that its index is placed in the yet another onchip
> > > die register that does not scale as each member device has different queue
> > > count.
> > > When admin queue was discussed, it was only for group owner, (you
> > > answered to Jiri).
> > > Hence the scale is relatively less, so it was acceptable.
> > >
> > > Now having unique numbers for VFs is not good.
> > > Max proposal was the last index after existing defined VQs of num_queues,
> > > that saves the storage space on device.
> > 
> > Surely, you can just have a very large index and be done with it?
> >
> There is count of AQ too.

Make that the same across VFs?

> For receive flow filters one may want to have multiple flowfilter_vqs as the perf req is high for some vms.
> 
> And device to build non linear PCI steering on the driver notification for this very high q count.
> It is optimal to have finite and linear q max value.

What does this have to do with AQ? These are data vqs.


> > > > > The single way for every device to query their capabilities is via
> > > > > a cfgvq for all new fields without extending the existing config space.
> > > > > (and optionally old fields).
> > > >
> > > > Or adminq with self group. I like this somewhat better because we
> > > > need exactly same query from owner.
> > > >
> > > Yes. this is why I proposed to name is cmdvq that can carry admin commands
> > > or other.
> > > But fine, we had to progress for group owner.
> > >
> > > > > > Why don't we focus on a work on a full solution? Just don't
> > > > > > implement this thing in your devices meanwhile until we do.
> > > > > >
> > > > > Then Heng needs to wait for cfgvq to be defined to be implemented first.
> > > > > Doesn't look reasonable to me.
> > > >
> > > > And *everything* has to wait. No, not reasonable. We somehow managed
> > > > to release several spec versions and things did not grind to a halt without
> > > > cfgvq.
> > > > Don't see a reason to do it right now, what's special about now? I
> > > > feel we should add to config space and then solve it all.
> > > >
> > > Things didn't ground at cost of device keep increasing their memory footprint.
> > > The latest addition I remember is the queue_reset register.
> > > It was bit but a purely control operation that got in there.
> > >
> > > > > Current GET is coherent with the new commands defined such as
> > > > > notification coalescing.
> > > > >
> > > > > As community, we should work on defining the cfgvq, till that time
> > > > > have the optimal way to get the config, i.e. using the cvq.
> > > >
> > > > cvq doesn't really work for capabilities though.
> > >
> > > For the device itself, it does which is what is being done here.
> > 
> > Yes but not for migration.
> 
> For migration an admin command to query capabilities is needed.
> This is already present in the transport vq proposal, which is to be rebased on top of admin commands.

So 1.4 will maybe have new migration capabilities, and that is great.
But I do not like it that we are adding in 1.3 features that
can't be supported with current migration capabilities.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:37                           ` Michael S. Tsirkin
@ 2023-06-22 17:51                             ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck


> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> Behalf Of Michael S. Tsirkin
> Sent: Thursday, June 22, 2023 1:38 PM
> 
> On Thu, Jun 22, 2023 at 05:15:50PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Thursday, June 22, 2023 1:11 PM
> >
> > > > Provisioning driver usually do not attach to the member device directly.
> > > > This requires device reset, followed by reaching _DRIVER stage,
> > > > querying features etc and config area.
> > > > And unbinding it and second reset by member driver. Ugh.
> > > > Provisioning driver also needs to get the state or capabilities
> > > > even when member driver is already attached.
> > > > So config space is not much a gain either.
> > >
> > > Actually it's RO so you *can* read it without any issues:
> >
> > It is RO but not same across all devices.
> 
> If you provision VFs differently. I got it.
> 
> > > - block guest access to status
> > > - check DRIVER.
> > > If set:
> > > 	- read features, config
> > > If not set:
> > > 	- read features, config
> > > 	- reset
> > >
> > This is what I explained.
> > It is more messy if you equate to GET command has mess.
> 
> At least it works.
>
And the new GET command also works when the CVQ is trapped.
It also works when the AQ queries the capabilities of a member device using the WIP command.
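
For reference, the config-space procedure quoted above (block guest access to status, check DRIVER, read features and config, reset only if DRIVER was not set) can be sketched as follows. This is only a minimal model under stated assumptions: struct sim_dev, struct sim_caps, sim_reset() and hv_read_caps() are hypothetical names for a simulated device, not spec or kernel interfaces, and unblocking status access at the end is an assumption the quoted steps do not spell out. Only the DRIVER status bit value (2) is taken from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* DRIVER bit of the device status field, as defined by the virtio spec. */
#define VIRTIO_CONFIG_S_DRIVER 2

/* Hypothetical simulated member device; not a real kernel structure. */
struct sim_dev {
    uint8_t  status;               /* device status field */
    uint64_t features;             /* read-only feature bits */
    uint8_t  config[16];           /* read-only config area */
    bool     guest_status_blocked; /* true while the hypervisor holds off the guest */
    unsigned resets;               /* number of resets issued so far */
};

/* Hypothetical snapshot of what the hypervisor read. */
struct sim_caps {
    uint64_t features;
    uint8_t  config[16];
};

static void sim_reset(struct sim_dev *d)
{
    d->status = 0;
    d->resets++;
}

/* Follows the quoted steps; reset is issued only when no driver was attached. */
static void hv_read_caps(struct sim_dev *d, struct sim_caps *out)
{
    bool driver_attached;

    d->guest_status_blocked = true;                  /* block guest access to status */
    driver_attached = (d->status & VIRTIO_CONFIG_S_DRIVER) != 0;

    out->features = d->features;                     /* read features, config */
    memcpy(out->config, d->config, sizeof(out->config));

    if (!driver_attached)
        sim_reset(d);                                /* reset only in the "not set" case */

    d->guest_status_blocked = false;                 /* assumption: unblock when done */
}
```

In this model a device with a driver already attached is read without being disturbed, while an idle device is reset once after the read.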
 
> > > I am not saying it is elegant but then all of vdpa pile of hacks is not elegant.
> > >
> > I don't want to comment for vdpa. But it is not part of the spec...
> 
> Neither is QEMU.  It's one of spec implementations. Yes, we care about not
> adding blockers for features that, superficially, might make sense for them.
> 
> > > And I am all for building something better but we didn't build it yet.
> >
> > The proposal for 1.4 is literally very simple as below.
> > 1. All existing fields of cfg space stay in cfg space.
> > 2. Any new capabilities to be queried are queried using a vq (aq, cfgvq,
> >    whatevervq).
> > 3. Optionally, existing fields can be queried over the vq of #2.
> > Once this arrives, no need for new GET commands.
> > Till that time, don't keep infinitely growing the cfg space.
> > Any next addition to cfg space should work on defining the cfgvq.
> 
> Simple, but short sighted. I know you guys don't support your hardware for 10-
> 20 years but for software people do.
> And so "All existing fields of cfg space stays in cfg space" is a bad idea simply
> because this does not allow removing things from config space not in 10 not in
> 20 years not ever.
>
#1 is for backward compat with existing drivers.
You missed #3: existing cfg space fields can be queried using the cfgvq too.

> 
> Instead we need to allow two ways to access config space.  Teach drivers about
> both, actually mandate supporting both.  And then devices will make their own
> cost/benefit decision about which features they want to support in MMIO.

If both methods are mandated, I don't see any benefit in having two methods.
A VQ is a generic part of the spec for slow and fast operations, so it is not a cost at all for config reading.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:28                                   ` Michael S. Tsirkin
@ 2023-06-22 17:58                                     ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 17:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 1:28 PM

> > > If VF is assigned then we can't really control what does guest enable.
> > >
> > All parameters set over the CVQ need to be accessible to the migration entity.
> > Including RSS and including this bit.
> > Either by trapping the CVQ or by having an AQ command to query member dev
> > capabilities.
> 
> Yes, parameters set needs to be trapped, so they can be replayed. 
Only for a vdpa type of solution.

For a passthrough device there is no need to trap any parameters or capabilities.

> This is a
> capability though. After guest has driven the CVQ for a while submitting a
> command to it so we get capability - I don't see how that works.
> 
The group owner device gets to query the capabilities (and provision) using an admin command
(already part of the transport vq proposal).

> > > > > But querying over cvq while VF is assigned clearly *doesn't* work.
> > > > >
> > > > That is not the idea at all.
> > > > Querying VF capabilities is the role of the admin command for
> > > > which we built
> > > it.
> > >
> > > This GET is exactly that though.
> > >
> > Not exactly.
> > This GET command is needed for the member driver to know what is
> > supported for the device it got.
> 
> yes. but how will hypervisor get the capability?
>
By issuing an admin command on the group owner device.

> > > > > So what is the solution proposed for this?
> > > > >
> > > > 1. Query member device capabilities via admin command
> > >
> > > But that's not 1.3 material.
> > >
> > > > > Yes the current migration is broken in many ways but that's what
> > > > > we have. Let's build something better sure but that is not 1.3 material.
> > > >
> > > > True, it is not 1.3 material, hence the proposal was to have the GET
> > > > command.
> > > > Once/if we reach agreement that no new fields to be added to
> > > > config space starting 1.4 and should be queried using non intercepted
> > > > cfgvq, it makes sense to let this go in cfg space.
> > > > Else GET command seems the elegant and right approach.
> > >
> > > I expect no such agreement at all. Instead, I expect that we'll have
> > > an alternative way to access config space. guest virtio core then
> > > needs to learn both ways, and devices can support one or both.
> > >
> > Yeah, we disagree.
> > Because the alternative way that you propose is not a predictable way to build
> > the device efficiently.
> > It always needs to account for old driver to support.
> > This is clearly sub-optimal as the capabilities grow.
> 
> So just quickly add the new capability in the spec and then the number of linux
> releases that will have the new feature but not config command or whatever
> that is will be too small for vendors to care.
> 
I didn't follow this suggestion.

> >
> > > A good implementation of virtio_cread can abstract that easily so we
> > > don't need to change drivers.
> >
> > There is no backward compat issue for the GET command being new.
> 
> It's just a shortcut replacing what we really want.  As long as a shortcut is
> available people will keep using exactly that.  So I fully expect more proposals
> for such GET commands on the pretext that one is there so why not another
> one. Adding more tech debt for whoever finally gets around to building a config
> space access gateway.
>
Not really. As suggested, the first addition of a new field to the config space in the 1.4 time frame should add the cfgvq, and not follow the previous example.
Because this is being thought through now, it is not at all hard for any new additions to follow the guideline.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:51                             ` [virtio-comment] " Parav Pandit
@ 2023-06-22 18:11                               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 18:11 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:51:41PM +0000, Parav Pandit wrote:
> 
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 1:38 PM
> > 
> > On Thu, Jun 22, 2023 at 05:15:50PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Thursday, June 22, 2023 1:11 PM
> > >
> > > > > Provisioning driver usually do not attach to the member device directly.
> > > > > This requires device reset, followed by reaching _DRIVER stage,
> > > > > querying features etc and config area.
> > > > > And unbinding it and second reset by member driver. Ugh.
> > > > > Provisioning driver also needs to get the state or capabilities
> > > > > even when member driver is already attached.
> > > > > So config space is not much a gain either.
> > > >
> > > > Actually it's RO so you *can* read it without any issues:
> > >
> > > It is RO but not same across all devices.
> > 
> > If you provision VFs differently. I got it.
> > 
> > > > - block guest access to status
> > > > - check DRIVER.
> > > > If set:
> > > > 	- read features, config
> > > > If not set:
> > > > 	- read features, config
> > > > 	- reset
> > > >
> > > This is what I explained.
> > > It is more messy if you equate to GET command has mess.
> > 
> > At least it works.
> >
> And new GET command also works when CVQ is trapped.

Not just trapped. You need to issue these commands.
That will require
- driving cvq at guest boot, sending commands, then reset
- poking at guest memory to change CVQ contents on
  the fly to mask commands.

How bad or hard is that? I'll need to ponder this a bit.


> And also works when AQ is querying capabilities of a member device using WIP cmd.

Yes, that's the way forward in 1.4.

> > > > I am not saying it is elegant but then all of vdpa pile of hacks is not elegant.
> > > >
> > > I don't want to comment for vdpa. But it is not part of the spec...
> > 
> > Neither is QEMU.  It's one of spec implementations. Yes, we care about not
> > adding blockers for features that, superficially, might make sense for them.
> > 
> > > > And I am all for building something better but we didn't build it yet.
> > >
> > > The proposal for 1.4 is literally very simple as below.
> > > 1. All existing fields of cfg space stay in cfg space.
> > > 2. Any new capabilities to be queried are queried using a vq (aq, cfgvq,
> > >    whatevervq).
> > > 3. Optionally, existing fields can be queried over the vq of #2.
> > > Once this arrives, no need for new GET commands.
> > > Till that time, don't keep infinitely growing the cfg space.
> > > Any next addition to cfg space should work on defining the cfgvq.
> > 
> > Simple, but short sighted. I know you guys don't support your hardware for 10-
> > 20 years but for software people do.
> > And so "All existing fields of cfg space stays in cfg space" is a bad idea simply
> > because this does not allow removing things from config space not in 10 not in
> > 20 years not ever.
> >
> #1 is for backward compat for existing drivers.
>  You missed about #3. Existing cfg space fields can be queries using the cfgvq too.

Then #1 does not matter. We can give devices choice.

> > 
> > Instead we need to allow two ways to access config space.  Teach drivers about
> > both, actually mandate supporting both.  And then devices will make their own
> > cost/benefit decision about which features they want to support in MMIO.
> 
> If both method is mandated, I don't see benefit at all of two methods.

Mandated for drivers.
The benefit is for devices: they will have the choice of which drivers to
support. In 10-20 years all drivers will support the cfg command, and then
people can start shipping devices without MMIO access to any
registers.
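
A rough sketch of what "mandated for drivers, optional for devices" could look like on the driver side, in the spirit of a virtio_cread-style wrapper. Everything here is an assumption for illustration: struct cfg_dev, the has_mmio/has_cfg_cmd flags, and the cfg_read*() helpers are hypothetical names, and no cfg command is defined in the spec at this point.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device model: it may expose config via MMIO, via a command, or both. */
struct cfg_dev {
    uint8_t config[8];
    int has_mmio;     /* device chose to keep an MMIO-mapped config area */
    int has_cfg_cmd;  /* device supports the (hypothetical) cfg command */
};

/* Read len bytes at off through the MMIO window; returns 0 on success, -1 on error. */
static int cfg_read_mmio(const struct cfg_dev *d, size_t off, void *buf, size_t len)
{
    if (!d->has_mmio || off > sizeof(d->config) || len > sizeof(d->config) - off)
        return -1;
    memcpy(buf, d->config + off, len);
    return 0;
}

/* Same read, modeled as a command carried over a virtqueue. */
static int cfg_read_cmd(const struct cfg_dev *d, size_t off, void *buf, size_t len)
{
    if (!d->has_cfg_cmd || off > sizeof(d->config) || len > sizeof(d->config) - off)
        return -1;
    memcpy(buf, d->config + off, len);
    return 0;
}

/* The driver implements both paths and uses whichever one the device offers. */
static int cfg_read(const struct cfg_dev *d, size_t off, void *buf, size_t len)
{
    if (d->has_cfg_cmd)
        return cfg_read_cmd(d, off, buf, len);
    return cfg_read_mmio(d, off, buf, len);
}
```

Callers only ever use cfg_read(), so a device that drops MMIO config access later does not require changes in device-specific driver code.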

> VQ is generic part of the spec for slow and fast operation, so it is not at all a cost for config reading.

This will depend. E.g. if there's a single command to get all of config
in one go, then it actually can be a speedup, reducing the # of VM
exits.
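
To make the VM exit argument concrete, here is a toy model. It assumes that every trapped MMIO config read costs one exit and that a command costs one exit for the notification; both are simplifications, and struct exit_model, read_field_mmio() and read_all_cmd() are hypothetical names, not spec or kernel interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 16

/* Toy model of a device whose config accesses are trapped by the hypervisor. */
struct exit_model {
    uint8_t  config[CFG_SIZE];
    unsigned vm_exits;  /* exits counted under the assumptions above */
};

/* One trapped access per byte-wide field read; caller must pass off < CFG_SIZE. */
static uint8_t read_field_mmio(struct exit_model *m, size_t off)
{
    m->vm_exits++;
    return m->config[off];
}

/* One notification fetches the whole config area in a single command. */
static void read_all_cmd(struct exit_model *m, uint8_t out[CFG_SIZE])
{
    m->vm_exits++;
    memcpy(out, m->config, CFG_SIZE);
}
```

Under these assumptions, reading all 16 bytes field by field counts 16 exits, while the single command counts one.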

-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
@ 2023-06-22 18:11                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 18:11 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:51:41PM +0000, Parav Pandit wrote:
> 
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 1:38 PM
> > 
> > On Thu, Jun 22, 2023 at 05:15:50PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Thursday, June 22, 2023 1:11 PM
> > >
> > > > > Provisioning driver usually do not attach to the member device directly.
> > > > > This requires device reset, followed by reaching _DRIVER stage,
> > > > > querying
> > > > features etc and config area.
> > > > > And unbinding it and second reset by member driver. Ugh.
> > > > > Provisioning driver also needs to get the state or capabilities
> > > > > even when
> > > > member driver is already attached.
> > > > > So config space is not much a gain either.
> > > >
> > > > Actually it's RO so you *can* read it without any issues:
> > >
> > > It is RO but not same across all devices.
> > 
> > If you provision VFs differently. I got it.
> > 
> > > > - block guest access to status
> > > > - check DRIVER.
> > > > If set:
> > > > 	- read features, config
> > > > If not set:
> > > > 	- read features, config
> > > > 	- reset
> > > >
> > > This is what I explained.
> > > It is more messy if you equate to GET command has mess.
> > 
> > At least it works.
> >
> And new GET command also works when CVQ is trapped.

Not just trapped. You need to issue these commands.
That will require
- driving cvq at guest boot, sending commands, then reset
- poking at guest memory to change CVQ contents on
  the fly to mask commands.

How bad or hard is that? I'll need to ponder this a bit.


> And also works when AQ is querying capabilities of a member device using WIP cmd.

yes that's the way forward in 1.4.

> > > > I am not saying it is elegant but then all of vdpa pile of hacks is not elegant.
> > > >
> > > I don't want to comment for vdpa. But it is not part of the spec...
> > 
> > Neither is QEMU.  It's one of spec implementations. Yes, we care about not
> > adding blockers for features that, superficially, might make sense for them.
> > 
> > > > And I am all for building something better but we didn't build it yet.
> > >
> > > The proposal for 1.4 is literally very simple as below.
> > > 1. All existing fields of cfg space stays in cfg space 2. Any new
> > > capabilities to be queried, query using a vq (aq, cfgvq, whatevervq).
> > > 3. Optionally existing fields can be queries over vq of #2 Once this
> > > arrive, no need for new GET commands.
> > > Till that time, don't keep infinitely grow the cfg space.
> > > Any next addition to cfg space, should work on defining the cfgvq.
> > 
> > Simple, but short sighted. I know you guys don't support your hardware for 10-
> > 20 years but for software people do.
> > And so "All existing fields of cfg space stays in cfg space" is a bad idea simply
> > because this does not allow removing things from config space not in 10 not in
> > 20 years not ever.
> >
> #1 is for backward compat for existing drivers.
>  You missed about #3. Existing cfg space fields can be queries using the cfgvq too.

Then #1 does not matter. We can give devices choice.

> > 
> > Instead we need to allow two ways to access config space.  Teach drivers about
> > both, actually mandate supporting both.  And then devices will make their own
> > cost/benefit decision about which features they want to support in MMIO.
> 
> If both method is mandated, I don't see benefit at all of two methods.

Mandated for driver.
Benefit is for devices, they will have the choice which drivers to
support. In 10-20 years all drivers support cfg command and then
people can start shipping devices without MMIO access to any
registers.

> VQ is generic part of the spec for slow and fast operation, so it is not at all a cost for config reading.

This will depend. E.g. if there's a single command to get all of config
in one go, then it actually can be a speedup, reducing the # of VM
exits.
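
To put rough numbers on the point above, here is a minimal sketch in C.
It rests on two assumptions that are not taken from the spec: every MMIO
access to device config traps to the hypervisor, and one bulk command
costs a single trapped queue notification. Both function names are
hypothetical.

```c
#include <stddef.h>

/* Assumption: each per-field MMIO config read causes one VM exit. */
static unsigned exits_per_field_mmio(size_t nfields)
{
    return (unsigned)nfields;
}

/* Assumption: a single "get all config" command costs one trapped
 * queue notification, no matter how many fields it returns. */
static unsigned exits_bulk_command(size_t nfields)
{
    return nfields ? 1u : 0u;
}
```

Under these assumptions a driver reading 12 config fields pays 12 exits
over MMIO and 1 exit with the bulk command; the real cost depends on how
the hypervisor handles notifications and completions.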

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:43                                   ` [virtio-comment] " Michael S. Tsirkin
@ 2023-06-22 18:12                                     ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 18:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 1:44 PM
> 
> On Thu, Jun 22, 2023 at 05:20:23PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Thursday, June 22, 2023 1:15 PM
> > >
> > > On Thu, Jun 22, 2023 at 05:04:04PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: virtio-dev@lists.oasis-open.org
> > > > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S.
> > > > > Tsirkin
> > > > > Sent: Thursday, June 22, 2023 12:54 PM
> > > >
> > > > > > Admin command as I recall are not accessible directly by the
> > > > > > member driver to the member device.
> > > > > > So a cmdq or cfgq is needed.
> > > > >
> > > > > Possible, sure. Or we actually discussed a self group. I took it
> > > > > away until it had a user.
> > > > >
> > > > The problematic part of AQ is that its index is placed in the yet
> > > > another onchip die register that does not scale as each member
> > > > device has different queue count.
> > > > When admin queue was discussed, it was only for group owner, (you
> > > > answered to Jiri).
> > > > Hence the scale is relatively less, so it was acceptable.
> > > >
> > > > Now having unique numbers for VFs is not good.
> > > > Max proposal was the last index after existing defined VQs of
> > > > num_queues, that saves the storage space on device.
> > >
> > > Surely, you can just have a very large index and be done with it?
> > >
> > There is count of AQ too.
> 
> Make that same across VFs?
>
Queues are not infinite, so when one doesn't need it, better to not make same number.
 
> > For receive flow filters one may want to have multiple flowfilter_vqs as the
> > perf req is high for some vms.
> >
> > And device to build non linear PCI steering on the driver notification for this
> > very high q count.
> > It is optimal to have finite and linear q max value.
> 
> What does this have to do with AQ? These are data vqs.
> 

On further thought, ignore this comment.
We need a few bare-minimum fields to bootstrap the device,
and num_aq is one of them to absorb.
This is OK.

> So 1.4 will maybe have new migration capabilities, and that is great.
> But I do not like it that we are adding in 1.3 features that can't be supported
> with current migration capabilities.
For 1.3, vdpa-style solutions are trapping the CVQ anyway, so this is no problem for them either.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 18:11                               ` [virtio-comment] " Michael S. Tsirkin
@ 2023-06-22 18:17                                 ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 18:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> Behalf Of Michael S. Tsirkin
> Sent: Thursday, June 22, 2023 2:12 PM

[..]

> > > > The proposal for 1.4 is literally very simple as below.
> > > > 1. All existing fields of cfg space stays in cfg space
> > > > 2. Any new capabilities to be queried, query using a vq (aq, cfgvq, whatevervq).
> > > > 3. Optionally existing fields can be queries over vq of #2
> > > > Once this arrive, no need for new GET commands.
> > > > Till that time, don't keep infinitely grow the cfg space.
> > > > Any next addition to cfg space, should work on defining the cfgvq.
> > >
> > > Simple, but short sighted. I know you guys don't support your
> > > hardware for 10-
> > > 20 years but for software people do.
> > > And so "All existing fields of cfg space stays in cfg space" is a
> > > bad idea simply because this does not allow removing things from
> > > config space not in 10 not in
> > > 20 years not ever.
> > >
> > #1 is for backward compat for existing drivers.
> >  You missed about #3. Existing cfg space fields can be queries using the cfgvq
> too.
> 
> Then #1 does not matter. We can give devices choice.
> 
> > >
> > > Instead we need to allow two ways to access config space.  Teach
> > > drivers about both, actually mandate supporting both.  And then
> > > devices will make their own cost/benefit decision about which features they
> want to support in MMIO.
> >
> > If both method is mandated, I don't see benefit at all of two methods.
> 
> Mandated for driver.
> Benefit is for devices, they will have the choice which drivers to support. In 10-
> 20 years all drivers support cfg command and then people can start shipping
> devices without MMIO access to any registers.
>
Until that point, devices are forced to burn memory, which will not be needed at all in < 3 years.
Once cfgvq is present for new fields, from day 1 the device does not need to store any newly defined fields in MMIO.

And in 10 to 20 years, it can stop storing the existing fields too.

The clear benefit is that fields defined in 1.4 no longer need to be stored in MMIO, starting from 2023-24.
 
> > VQ is generic part of the spec for slow and fast operation, so it is not at all a
> cost for config reading.
> 
> This will depend. E.g. if there's a single command to get all of config in one go,
> then it actually can be a speedup, reducing the # of VM exits.

Yes, one command to get all of at least the device-specific config.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 18:12                                     ` [virtio-comment] " Parav Pandit
@ 2023-06-22 18:36                                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 18:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 06:12:09PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 1:44 PM
> > 
> > On Thu, Jun 22, 2023 at 05:20:23PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Thursday, June 22, 2023 1:15 PM
> > > >
> > > > On Thu, Jun 22, 2023 at 05:04:04PM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: virtio-dev@lists.oasis-open.org
> > > > > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S.
> > > > > > Tsirkin
> > > > > > Sent: Thursday, June 22, 2023 12:54 PM
> > > > >
> > > > > > > Admin command as I recall are not accessible directly by the
> > > > > > > member driver to the member device.
> > > > > > > So a cmdq or cfgq is needed.
> > > > > >
> > > > > > Possible, sure. Or we actually discussed a self group. I took it
> > > > > > away until it had a user.
> > > > > >
> > > > > The problematic part of AQ is that its index is placed in the yet
> > > > > another onchip die register that does not scale as each member
> > > > > device has different queue count.
> > > > > When admin queue was discussed, it was only for group owner, (you
> > > > > answered to Jiri).
> > > > > Hence the scale is relatively less, so it was acceptable.
> > > > >
> > > > > Now having unique numbers for VFs is not good.
> > > > > Max proposal was the last index after existing defined VQs of
> > > > > num_queues, that saves the storage space on device.
> > > >
> > > > Surely, you can just have a very large index and be done with it?
> > > >
> > > There is count of AQ too.
> > 
> > Make that same across VFs?
> >
> Queues are not infinite, so when one doesn't need it, better to not make same number.

I don't get it. Can you show an example configuration where there's
a problem? Which device type, # of data queues and admin queues desired, etc.

> > > For receive flow filters one may want to have multiple flowfilter_vqs as the
> > > perf req is high for some vms.
> > >
> > > And device to build non linear PCI steering on the driver notification for this
> > > very high q count.
> > > It is optimal to have finite and linear q max value.
> > 
> > What does this have to do with AQ? These are data vqs.
> > 
> 
> I think further. Ignore this comment.
> We need few bare minimum fields to bootstrap the device.
> So num_aq is one of them to absorb.
> This is ok.
> 
> > So 1.4 will maybe have new migration capabilities, and that is great.
> > But I do not like it that we are adding in 1.3 features that can't be supported
> > with current migration capabilities.
> For 1.3 vdpa style solution are anyway trapping the CVQ so no problem for it either.

OK I get the idea. True.
Trapping is not the same as driving or actually emulating though.
That part would be new. Certainly annoying.
Practical? Needs a bit of thought.


What's even more annoying is that provisioning would suffer too: instead
of treating all fields the same, this one will have to be treated
differently.

Can't say something can't be done, but it's unfortunate that
we are adding to technical debt.

It's late here, I think I will sleep on this.


-- 
MST



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 18:17                                 ` [virtio-comment] " Parav Pandit
@ 2023-06-22 18:40                                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 18:40 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 06:17:54PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 2:12 PM
> 
> [..]
> 
> > > > > The proposal for 1.4 is literally very simple as below.
> > > > > 1. All existing fields of cfg space stays in cfg space
> > > > > 2. Any new capabilities to be queried, query using a vq (aq, cfgvq, whatevervq).
> > > > > 3. Optionally existing fields can be queries over vq of #2
> > > > > Once this arrive, no need for new GET commands.
> > > > > Till that time, don't keep infinitely grow the cfg space.
> > > > > Any next addition to cfg space, should work on defining the cfgvq.
> > > >
> > > > Simple, but short sighted. I know you guys don't support your
> > > > hardware for 10-
> > > > 20 years but for software people do.
> > > > And so "All existing fields of cfg space stays in cfg space" is a
> > > > bad idea simply because this does not allow removing things from
> > > > config space not in 10 not in
> > > > 20 years not ever.
> > > >
> > > #1 is for backward compat for existing drivers.
> > >  You missed about #3. Existing cfg space fields can be queries using the cfgvq
> > too.
> > 
> > Then #1 does not matter. We can give devices choice.
> > 
> > > >
> > > > Instead we need to allow two ways to access config space.  Teach
> > > > drivers about both, actually mandate supporting both.  And then
> > > > devices will make their own cost/benefit decision about which features they
> > want to support in MMIO.
> > >
> > > If both method is mandated, I don't see benefit at all of two methods.
> > 
> > Mandated for driver.
> > Benefit is for devices, they will have the choice which drivers to support. In 10-
> > 20 years all drivers support cfg command and then people can start shipping
> > devices without MMIO access to any registers.
> >
> Until that point devices are forced to burn memory, which is not needed at all in < 3 years.

No, they are not. They can make their own decision about which fields to support in MMIO space.
If they want to cut at 1.3 time, they can, but they can also cut
at 1.2 time; this just means some features will not be accessible through
MMIO, only through the new commands.
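
As a rough illustration of that choice, the sketch below models a device
that exposes only the first mmio_len bytes of its config through MMIO,
with everything past that cut reachable only through a config command.
All names here (struct cfg_dev, cfg_read, mmio_len, the counters) are
hypothetical and not from the virtio spec, and the command path is
simulated by a plain memory copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not a virtio spec structure. */
struct cfg_dev {
    uint8_t  cfg[64];     /* full device config contents */
    size_t   cfg_len;     /* valid bytes in cfg, at most sizeof(cfg) */
    size_t   mmio_len;    /* bytes the device chose to expose via MMIO */
    unsigned mmio_reads;  /* counters kept only for illustration */
    unsigned cmd_reads;
};

/* Read len bytes at offset off. Use MMIO when the field lies entirely
 * below the device's MMIO cut; otherwise fall back to the (simulated)
 * config command. Returns 0 on success, -1 if the field is out of range. */
static int cfg_read(struct cfg_dev *d, size_t off, void *buf, size_t len)
{
    if (off > d->cfg_len || len > d->cfg_len - off)
        return -1;
    if (off + len <= d->mmio_len)
        d->mmio_reads++;   /* field is inside the MMIO window */
    else
        d->cmd_reads++;    /* field is only reachable by command */
    memcpy(buf, d->cfg + off, len);
    return 0;
}
```

A driver that implements both paths works unchanged whether the device
sets mmio_len to cover everything, only the older fields, or nothing at
all, which is the sense in which the cost/benefit decision moves to the
device.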

> Once cfgvq is present for new fields, from the day 1, device does not need to store any newly defined fields on MMIO.

Exactly.

> And 10 to 20 years, it can stop for existing fields..
> 
> The clear benefit is fields defined in 1.4 no longer needs to be stored starting from 2023-24.

And why is it a problem that someone can also build a 1.4 device with MMIO?

> > > VQ is generic part of the spec for slow and fast operation, so it is not at all a
> > cost for config reading.
> > 
> > This will depend. E.g. if there's a single command to get all of config in one go,
> > then it actually can be a speedup, reducing the # of VM exits.
> 
> Yes, one command to get all at least device specific config.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 18:40                                   ` [virtio-comment] " Michael S. Tsirkin
@ 2023-06-22 18:50                                     ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 18:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin
> Sent: Thursday, June 22, 2023 2:40 PM
> To: Parav Pandit <parav@nvidia.com>
> Cc: Heng Qi <hengqi@linux.alibaba.com>; virtio-comment@lists.oasis-open.org;
> virtio-dev@lists.oasis-open.org; Jason Wang <jasowang@redhat.com>; Yuri
> Benditovich <yuri.benditovich@daynix.com>; Xuan Zhuo
> <xuanzhuo@linux.alibaba.com>; Cornelia Huck <cohuck@redhat.com>
> Subject: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18]
> virtio-net: support inner header hash
> 
> On Thu, Jun 22, 2023 at 06:17:54PM +0000, Parav Pandit wrote:
> >
> >
> > > From: virtio-dev@lists.oasis-open.org
> > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > > Sent: Thursday, June 22, 2023 2:12 PM
> >
> > [..]
> >
> > > > > > The proposal for 1.4 is literally very simple as below.
> > > > > > 1. All existing fields of cfg space stays in cfg space
> > > > > > 2. Any new capabilities to be queried, query using a vq (aq, cfgvq,
> > > > > >    whatevervq).
> > > > > > 3. Optionally existing fields can be queries over vq of #2
> > > > > > Once this arrive, no need for new GET commands.
> > > > > > Till that time, don't keep infinitely grow the cfg space.
> > > > > > Any next addition to cfg space, should work on defining the cfgvq.
> > > > >
> > > > > Simple, but short sighted. I know you guys don't support your
> > > > > hardware for 10-20 years but for software people do.
> > > > > And so "All existing fields of cfg space stays in cfg space" is
> > > > > a bad idea simply because this does not allow removing things
> > > > > from config space not in 10 not in 20 years not ever.
> > > > >
> > > > #1 is for backward compat for existing drivers.
> > > > You missed about #3. Existing cfg space fields can be queries
> > > > using the cfgvq too.
> > >
> > > Then #1 does not matter. We can give devices choice.
> > >
> > > > >
> > > > > Instead we need to allow two ways to access config space.  Teach
> > > > > drivers about both, actually mandate supporting both.  And then
> > > > > devices will make their own cost/benefit decision about which
> > > > > features they want to support in MMIO.
> > > >
> > > > If both method is mandated, I don't see benefit at all of two methods.
> > >
> > > Mandated for driver.
> > > Benefit is for devices, they will have the choice which drivers to
> > > support. In 10-20 years all drivers support cfg command and then
> > > people can start shipping devices without MMIO access to any registers.
> > >
> > Until that point devices are forced to burn memory, which is not needed at all
> > in < 3 years.
> 
> No they are not. They can make their own decision which fields to support in
> MMIO space.
Existing fields used by existing drivers must be in MMIO.
The device does not have a choice.

> If they want to cut at 1.3 time, they can, but they also can cut it at 1.2 time, this
> means some features will not be accessible through MMIO only through the
> new commands.
>
New commands serve new fields to newer drivers.
A newer driver can call the new command to get both new and old fields.

So 1.3 and 1.4 devices cannot optimize for older drivers.

 
> > Once cfgvq is present for new fields, from the day 1, device does not need to
> > store any newly defined fields on MMIO.
> 
> Exactly.
> 
> > And 10 to 20 years, it can stop for existing fields..
> >
> > The clear benefit is fields defined in 1.4 no longer needs to be stored starting
> > from 2023-24.
> 
> And why is it a problem that someone can also build a 1.4 device with MMIO?
>
Why build a device with a large MMIO region when those fields are accessible via a VQ?
The VQ construct already exists, so it is already simple.

If the spec is defined such that new fields must be accessible via a VQ, and optionally via MMIO, then that is OK.
One can choose to build with MMIO.
So MMIO for new fields must not be mandatory.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 18:50                                     ` Parav Pandit
@ 2023-06-22 19:02                                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 19:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 06:50:16PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org
> > <virtio-comment@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > Sent: Thursday, June 22, 2023 2:40 PM
> > To: Parav Pandit <parav@nvidia.com>
> > Cc: Heng Qi <hengqi@linux.alibaba.com>; virtio-comment@lists.oasis-open.org;
> > virtio-dev@lists.oasis-open.org; Jason Wang <jasowang@redhat.com>; Yuri
> > Benditovich <yuri.benditovich@daynix.com>; Xuan Zhuo
> > <xuanzhuo@linux.alibaba.com>; Cornelia Huck <cohuck@redhat.com>
> > Subject: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18]
> > virtio-net: support inner header hash
> > 
> > On Thu, Jun 22, 2023 at 06:17:54PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: virtio-dev@lists.oasis-open.org
> > > > <virtio-dev@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > > > Sent: Thursday, June 22, 2023 2:12 PM
> > >
> > > [..]
> > >
> > > > > > > The proposal for 1.4 is literally very simple as below.
> > > > > > > 1. All existing fields of cfg space stays in cfg space
> > > > > > > 2. Any new capabilities to be queried, query using a vq (aq, cfgvq,
> > > > > > >    whatevervq).
> > > > > > > 3. Optionally existing fields can be queries over vq of #2
> > > > > > > Once this arrive, no need for new GET commands.
> > > > > > > Till that time, don't keep infinitely grow the cfg space.
> > > > > > > Any next addition to cfg space, should work on defining the cfgvq.
> > > > > >
> > > > > > Simple, but short sighted. I know you guys don't support your
> > > > > > hardware for 10-20 years but for software people do.
> > > > > > And so "All existing fields of cfg space stays in cfg space" is
> > > > > > a bad idea simply because this does not allow removing things
> > > > > > from config space not in 10 not in 20 years not ever.
> > > > > >
> > > > > #1 is for backward compat for existing drivers.
> > > > > You missed about #3. Existing cfg space fields can be queries
> > > > > using the cfgvq too.
> > > >
> > > > Then #1 does not matter. We can give devices choice.
> > > >
> > > > > >
> > > > > > Instead we need to allow two ways to access config space.  Teach
> > > > > > drivers about both, actually mandate supporting both.  And then
> > > > > > devices will make their own cost/benefit decision about which
> > > > > > features they want to support in MMIO.
> > > > >
> > > > > If both method is mandated, I don't see benefit at all of two methods.
> > > >
> > > > Mandated for driver.
> > > > Benefit is for devices, they will have the choice which drivers to
> > > > support. In 10-20 years all drivers support cfg command and then
> > > > people can start shipping devices without MMIO access to any registers.
> > > >
> > > Until that point devices are forced to burn memory, which is not needed at all
> > > in < 3 years.
> > 
> > No they are not. They can make their own decision which fields to support in
> > MMIO space.
> Existing fields used by existing drivers must be in MMIO.
> Device does not have a choice.

They can, if they like, mask some less important features to reduce the footprint.
This will be the cost/benefit analysis each vendor does separately.

Case in point: if inner hash support is in 1.3, and Linux drivers don't
really use it at all for a year or two because it's only used by
Alibaba in their stack, and by the time Linux starts using
it Linux also supports the new commands, then practically
devices do not care and can just mask the feature bit in MMIO.
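
To make the masking idea concrete, here is a minimal C sketch under stated
assumptions: the device keeps one full feature set and hides vendor-chosen
bits from the MMIO view only. The bit value FEATURE_INNER_HASH, struct
device_model and both helper functions are hypothetical names used for
illustration; none of them come from the virtio spec or from any driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only; not from the virtio spec. */
#define FEATURE_INNER_HASH (1ULL << 40)

/* The two (assumed) ways a driver can learn about device features. */
enum cfg_transport { CFG_MMIO, CFG_CMD };

struct device_model {
    uint64_t features;  /* everything the device implements */
    uint64_t mmio_mask; /* bits the vendor chose to hide from the MMIO view */
};

/* Features offered on a given transport: the MMIO view drops the masked bits. */
static uint64_t offered_features(const struct device_model *d,
                                 enum cfg_transport t)
{
    return t == CFG_MMIO ? (d->features & ~d->mmio_mask) : d->features;
}

/* True if the given feature bit is visible to a driver using transport t. */
static bool offers(const struct device_model *d, enum cfg_transport t,
                   uint64_t bit)
{
    return (offered_features(d, t) & bit) != 0;
}
```

In this model an MMIO-only driver never sees the masked bit and so never
negotiates the feature, while a driver that uses the command path still can.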


> > If they want to cut at 1.3 time, they can, but they also can cut it at 1.2 time, this
> > means some features will not be accessible through MMIO only through the
> > new commands.
> >
> New commands services new fields by newer driver.
> Newer driver can call new command to get new and old fields both.

yes.

> So 1.3 and 1.4 devices cannot optimize for older drivers.


I don't know what this means.

>  
> > > Once cfgvq is present for new fields, from the day 1, device does not need to
> > > store any newly defined fields on MMIO.
> > 
> > Exactly.
> > 
> > > And 10 to 20 years, it can stop for existing fields..
> > >
> > > The clear benefit is fields defined in 1.4 no longer needs to be stored starting
> > > from 2023-24.
> > 
> > And why is it a problem that someone can also build a 1.4 device with MMIO?
> >
> Why to build device using large MMIO when those fields are accessible via vq.

If you don't want to, don't. But it's simple and handy for software.
Not all devices are complex beasts like net.

Apropos, I am not yet sure the best way is simply adding an admin vq to
VFs, in that this only works for fields not needed before DRIVER_OK.
Ideally I think we should fix all of config space, including device and
common config. I have some ideas but let's not get ahead of ourselves.

> VQ construct exists so it is already simple.
> 
> If the spec is defined in a way, that new fields must be accessible via vq, and optionally via MMIO, than it is ok.
> One can choose to build via MMIO.
> So MMIO for new fields must not be mandatory.

For devices, yes. I am suggesting mandating it for drivers.
Or maybe it's enough to make it a SHOULD.

-- 
MST




^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 19:02                                       ` Michael S. Tsirkin
@ 2023-06-22 20:27                                         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-22 20:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, June 22, 2023 3:02 PM


> > Existing fields used by existing drivers must be in MMIO.
> > Device does not have a choice.
> 
> They can if they like mask some less important features reducing the footprint.
How to do that? How would the device know not to keep existing fields in MMIO?

> This will be the cost/benefit analysis each vendor does separately.
> 
> Case in point, if inner hash support is in 1.3, and linux drivers don't really use
> that at all for a year or two because it's only used by alibaba in their stack, and
> by the time linux starts using them it also supports the new commands, then
> practically devices do not care and can just mask the feature bit in MMIO.
> 
Once it lands in config space, it becomes an existing field.
More below.
> 

> > > If they want to cut at 1.3 time, they can, but they also can cut it
> > > at 1.2 time, this means some features will not be accessible through
> > > MMIO only through the new commands.
> > >
> > New commands services new fields by newer driver.
> > Newer driver can call new command to get new and old fields both.
> 
> yes.
> 
> > So 1.3 and 1.4 devices cannot optimize for older drivers.
> 
> 
> I don't know what this means.
> 
I guess the next question below answers it.
 
> >
> > > > Once cfgvq is present for new fields, from the day 1, device does
> > > > not need to store any newly defined fields on MMIO.
> > >
> > > Exactly.
> > >
> > > > And 10 to 20 years, it can stop for existing fields..
> > > >
> > > > The clear benefit is fields defined in 1.4 no longer needs to be
> > > > stored starting from 2023-24.
> > >
> > > And why is it a problem that someone can also build a 1.4 device with
> > > MMIO?
> > >
> > Why to build device using large MMIO when those fields are accessible via vq.
> 
> If you don't want to - don't. But it's simple and handy for software.
> Not all devices are complex beasts like net.

Every single device has a VQ, to my knowledge.
So a VQ is already a simple means of communication, even without a cvq.

> 
> Apropos, I am not yet sure the best way is simply adding admin vq to vfs, in that
> this only works for fields not needed before DRIVER_OK.
> Ideally I think we should fix all of config space, including device and common
> config. I have some ideas but let's not get ahead of ourselves.
> 
> > VQ construct exists so it is already simple.
> >
> > If the spec is defined in a way, that new fields must be accessible via vq, and
> > optionally via MMIO, than it is ok.
> > One can choose to build via MMIO.
> > So MMIO for new fields must not be mandatory.
> 
> For devices, yes. I am suggesting mandating it for drivers.
It must be mandatory for the driver to support the VQ; only then does it work.
If the driver does not implement it and the device only offers the fields via VQ, it does not work.
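
The compatibility argument above can be sketched as a small C check, assuming
each side is described by a bitmask of the access methods it implements. The
ACCESS_* flags and field_reachable() are hypothetical names made up for this
illustration; they are not spec or driver identifiers.

```c
#include <stdbool.h>

/* Hypothetical access-method flags, for illustration only. */
#define ACCESS_MMIO 0x1u
#define ACCESS_VQ   0x2u

/*
 * A field is usable only if the device offers it through at least one
 * access method that the driver also implements.
 */
static bool field_reachable(unsigned device_offers, unsigned driver_supports)
{
    return (device_offers & driver_supports) != 0;
}
```

With a VQ-only device, field_reachable(ACCESS_VQ, ACCESS_MMIO) is false for a
driver that only knows MMIO, which is the failure case described above. It
becomes true as soon as the driver also implements the VQ method.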

> Or maybe it's enough to make it a SHOULD.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 17:58                                     ` Parav Pandit
@ 2023-06-28 10:41                                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-28 10:41 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 05:58:29PM +0000, Parav Pandit wrote:
> > > > > > So what is the solution proposed for this?
> > > > > >
> > > > > 1. Query member device capabilities via admin command
> > > >
> > > > But that's not 1.3 material.
> > > >
> > > > > > Yes the current migration is broken in many ways but that's what
> > > > > > we have. Let's build something better sure but that is not 1.3 material.
> > > > >
> > > > > True, it is not 1.3 material, hence the proposal was to have the GET
> > > > > command.
> > > > > Once/if we reach agreement that no new fields to be added to
> > > > > config space starting 1.4 and should be queried using non
> > > > > intercepted cfgvq, it makes sense to let this go in cfg space.
> > > > > Else GET command seems the elegant and right approach.
> > > >
> > > > I expect no such agreement at all. Instead, I expect that we'll have
> > > > an alternative way to access config space. guest virtio core then
> > > > needs to learn both ways, and devices can support one or both.
> > > >
> > > Yeah, we disagree.
> > > Because alternative way that you propose is not predictable way to build the
> > > device efficiently.
> > > It always needs to account for old driver to support.
> > > This is clearly sub-optimal as the capabilities grow.
> > 
> > So just quickly add the new capability in the spec and then the number of linux
> > releases that will have the new feature but not config command or whatever
> > that is will be too small for vendors to care.
> > 
> I didn't follow this suggestion.

It is very simple though. 1.3 has the inner hash feature. Imagine that,
instead of endless flamewars rehashing the same arguments, immediately post
1.3 we vote on an interface to access config space through DMA (not sure
it's a VQ BTW, but that's a separate discussion).  This should include a
way to expose a subset of DMA features through MMIO, for compatibility.
If we are lucky, guest support can land in the same Linux release that
has inner hash.  Drivers do not need to care: they just do virtio_cread
and that is it.

So a vendor that builds a device with inner hash can expose inner hash
only through DMA and not through MMIO.

I conclude that there's no reason to block this feature just
because it uses config space.
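
As a rough sketch of why drivers would not need to change, the C fragment
below models a single config-read helper that picks the backing store itself.
This is not the Linux virtio_cread implementation: struct cfg_dev, cfg_read()
and the has_dma flag are assumptions made for this example, and the DMA path
is modelled as a plain buffer rather than a real transfer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device model: a small MMIO window plus an optional DMA view. */
struct cfg_dev {
    const uint8_t *mmio; /* fields the device chose to expose through MMIO */
    size_t mmio_len;
    const uint8_t *dma;  /* full config as reachable through DMA, if any */
    size_t dma_len;
    int has_dma;         /* nonzero if the DMA access method was negotiated */
};

/*
 * Copy len bytes of config at offset into buf. Callers use one entry point;
 * the helper prefers the DMA view when available and falls back to MMIO.
 * Returns 0 on success, -1 if the range is not exposed on the chosen path.
 */
static int cfg_read(const struct cfg_dev *dev, size_t offset, void *buf,
                    size_t len)
{
    const uint8_t *src = dev->has_dma ? dev->dma : dev->mmio;
    size_t src_len = dev->has_dma ? dev->dma_len : dev->mmio_len;

    if (offset > src_len || len > src_len - offset)
        return -1;
    memcpy(buf, src + offset, len);
    return 0;
}
```

A driver written against cfg_read() reads a field the same way on both kinds
of device. Only fields that live beyond the MMIO window fail on a device
without the DMA method.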





> > >
> > > > A good implementation of virtio_cread can abstract that easily so we
> > > > don't need to change drivers.
> > >
> > > There is no backward compat issue for the GET command being new.
> > 
> > It's just a shortcut replacing what we really want.  As long as a shortcut is
> > available people will keep using exactly that.  So I fully expect more proposals
> > for such GET commands on the pretext that one is there so why not another
> > one. Adding more tech debt for whoever finally gets around to building a config
> > space access gateway.
> >
> Not really. as suggested, the first addition of new field to the config space in 1.4-time frame, should add the cfgvq, and not follow the previous example.
> Because this is being thought through now, it is not at all hard for any new things to follow the guideline.

Really. Oh great, so there will be 3 ways to provision things!

I have not seen this patch yet.  And how long will this take to
materialize? I don't believe all TC work must be blocked until this
happens, or alternatively use ad hoc hacks.

I get it, you want to save on chip memory. So work on a consistent
solution for this.

All config space accesses should go through DMA.
Thus no special guidelines should be necessary, and drivers
can just keep doing virtio_cread like they always did.

To add to that, it will allow cheaper devices as
some existing config space will be able to move
quickly.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-22 20:27                                         ` Parav Pandit
@ 2023-06-28 10:47                                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-28 10:47 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Thu, Jun 22, 2023 at 08:27:48PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, June 22, 2023 3:02 PM
> 
> 
> > > Existing fields used by existing drivers must be in MMIO.
> > > Device does not have a choice.
> > 
> > They can if they like mask some less important features reducing the footprint.
> How to do that? How would device know to not have existing fields on MMIO?

As you said yourself, feature bits also have cost. They must be
accessed with DMA too, not through MMIO.
This set of features can have more bits than the one in MMIO.


> > This will be the cost/benefit analysis each vendor does separately.
> > 
> > Case in point, if inner hash support is in 1.3, and linux drivers don't really use
> > that at all for a year or two because it's only used by alibaba in their stack, and
> > by the time linux starts using them it also supports the new commands, then
> > practically devices do not care and can just mask the feature bit in MMIO.
> > 
> Once it lands to config space, it becomes existing fields.

What's in which version of the spec is immaterial.  If all guests using
it also support config space through DMA then it does not matter that
technically there was a spec version which had inner hash but not DMA.

> More below.
> > 
> 
> > > > If they want to cut at 1.3 time, they can, but they also can cut it
> > > > at 1.2 time, this means some features will not be accessible through
> > > > MMIO only through the new commands.
> > > >
> > > New commands services new fields by newer driver.
> > > Newer driver can call new command to get new and old fields both.
> > 
> > yes.
> > 
> > > So 1.3 and 1.4 devices cannot optimize for older drivers.
> > 
> > 
> > I don't know what this means.
> > 
> I guess next below question answers it.

I still don't know what you meant by "So 1.3 and 1.4 devices cannot
optimize for older drivers". If it's immaterial, fine.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-28 10:41                                       ` Michael S. Tsirkin
@ 2023-06-28 16:46                                         ` Parav Pandit
  -1 siblings, 0 replies; 106+ messages in thread
From: Parav Pandit @ 2023-06-28 16:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 28, 2023 6:41 AM

> > > So just quickly add the new capability in the spec and then the
> > > number of linux releases that will have the new feature but not
> > > config command or whatever that is will be too small for vendors to care.
> > >
> > I didn't follow this suggestion.
> 
> It is very simple though. 1.3 has inner hash feature. Imagine that instead of
> endless flamewars rehashing same arguments, immediately post

For sure if you call this discussion a war, I didn't start it and I didn't delay it to this point either :)

The patch added GET and SET commands; GET returns the device's static and dynamic values in a consistent way, without burdening the device.
Will moving to config space take away the GET command? No?

Then what does one gain other than the extra complexity of additional config space? Just more device memory burning.

> 1.3 we vote on an interface to access config space through DMA (not sure it's a
> VQ BTW but that's a separate discussion).  This should include a way to expose
> a subset of DMA features through MMIO, for compatibility.

I think you are missing the main point that I try to highlight but presumably it didn't come across. :(

Having an optional DMA interface does _not_ help device to optimize, because device must support old drivers.

The proposal I explained in previous email is:
a. All new fields via vq
This means it is _guaranteed_ to be offchip and zero reason for backward compatibility, because there is no backward compat of non existing fields.

b. all existing fields stay in cfg space

c. Optionally existing fields can also be queried via vq

> If we are lucky guest support can land in the same Linux release that has inner
> hash.  Drivers do not need to care: they just do virtio_cread and that is it.
>
Access to extended config space that is reachable only via DMA or VQ can surely be hidden inside virtio_cread(). Easy.
 
If we agree on this approach, great, one additional field like this can be added to config space.

> So a vendor that builds device with inner hash, can expose inner hash only
> through DMA and not through MMIO.
> 
> I conclude that there's no reason to block this feature just because it uses config
> space.
>
I must repeat that I am not blocking; I replied way back on both approaches, in v14 or even before that.
And also explained part of the proposal.

This is why one should discuss the design and not keep asking for patches.
 
> 
> > > >
> > > > > A good implementation of virtio_cread can abstract that easily
> > > > > so we don't need to change drivers.
> > > >
> > > > There is no backward compat issue for the GET command being new.
> > >
> > > It's just a shortcut replacing what we really want.  As long as a
> > > shortcut is available people will keep using exactly that.  So I
> > > fully expect more proposals for such GET commands on the pretext
> > > that one is there so why not another one. Adding more tech debt for
> > > whoever finally gets around to building a config space access gateway.
> > >
> > Not really. as suggested, the first addition of new field to the config space in
> 1.4-time frame, should add the cfgvq, and not follow the previous example.
> > Because this is being thought through now, it is not at all hard for any new
> things to follow the guideline.
> 
> Really. Oh great, so there will be 3 ways to provision things!
>
Not sure how you concluded 3 ways...

Provision is via single command via AQ by the owner device.
New config fields via VQ of member device itself, again single way.
Existing config fields access via config space, single way.

> I have not seen this patch yet.  And how long will this take to materialize? I don't
> believe all TC work must be blocked until this happens, or alternatively use ad hoc
> hacks.
It is surely not the right way to ask for the patch when doing the design discussion and claiming that it is not ready.

> 
> I get it, you want to save on chip memory. So work on a consistent solution for
> this.
> 
> All config space accesses should go through DMA.
This is not predictable due to backward compat need.
Hence the ask is all *new* config space access go through DMA.

> Thus no special guidelines should be necessary, and drivers can just keep doing
> virtio_cread like they always did.
>
This can be very easily achieved in sw by knowing extended config space offset to use DMA.
 
> To add to that, it will allow cheaper devices as some existing config space will be
> able to move quickly.
This is possible only in those configurations where the device side can know early enough, through a new feature bit and new registers for DMA, which will pay off for a larger extended config space region.


I want to summarize my humble input:

1. all new extended config space fields to be accessed only via DMA/VQ by member device itself.
2. all existing fields to stay in config space without any interface change
3. optionally existing fields can be accessed via DMA/VQ by member device itself.

4. If above proposal looks reasonable, sure, have the inner hash field of this patch as part of existing config space. I have no problem with it.

5. any new fields proposed in extended config space in 1.4-time frame must implement DMA/VQ interface as part of the proposal.

6. And if this needs more discussion, let's spend quality time on it and defer the release by a week and at least agree on the path forward.
No need to draft the spec normative language for design.
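
As a rough illustration of items 1 to 3 above, the following C sketch routes a config access by offset. It assumes, hypothetically, that existing fields occupy the first legacy_len bytes and that new (extended) fields follow them; struct cfg_layout, cfg_route and the prefer_dma policy flag are invented for this example and are not part of any patch or of the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cfg_path { CFG_PATH_MMIO, CFG_PATH_DMA, CFG_PATH_NONE };

/* Hypothetical layout description; names are illustrative only. */
struct cfg_layout {
    size_t legacy_len;  /* bytes of existing fields, kept in MMIO (item 2) */
    size_t total_len;   /* legacy_len plus the new extended fields */
    bool has_dma;       /* assumed: device offers the DMA/VQ interface */
    bool prefer_dma;    /* assumed driver policy: read existing fields via DMA too (item 3) */
};

/* Decide which path serves an access of len bytes at offset off. */
static enum cfg_path cfg_route(const struct cfg_layout *l, size_t off, size_t len)
{
    assert(l != NULL && l->legacy_len <= l->total_len);

    /* Reject empty or out-of-bounds accesses (overflow-safe). */
    if (len == 0 || off >= l->total_len || len > l->total_len - off)
        return CFG_PATH_NONE;

    /* Item 1: anything touching an extended byte is DMA/VQ only. */
    if (off + len > l->legacy_len)
        return l->has_dma ? CFG_PATH_DMA : CFG_PATH_NONE;

    /* Items 2 and 3: existing fields stay in MMIO, optionally also via DMA. */
    return (l->has_dma && l->prefer_dma) ? CFG_PATH_DMA : CFG_PATH_MMIO;
}
```

With this split the device never needs MMIO backing for bytes past legacy_len, since no older driver can know about those fields.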





^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
  2023-06-28 16:46                                         ` Parav Pandit
@ 2023-06-28 17:08                                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2023-06-28 17:08 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Heng Qi, virtio-comment, virtio-dev, Jason Wang,
	Yuri Benditovich, Xuan Zhuo, Cornelia Huck

On Wed, Jun 28, 2023 at 04:46:04PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 28, 2023 6:41 AM
> 
> > > > So just quickly add the new capability in the spec and then the
> > > > number of linux releases that will have the new feature but not
> > > > config command or whatever that is will be too small for vendors to care.
> > > >
> > > I didn't follow this suggestion.
> > 
> > It is very simple though. 1.3 has inner hash feature. Imagine that instead of
> > endless flamewars rehashing same arguments, immediately post
> 
> For sure if you call this discussion a war, I didn't start it and I didn't delay it to this point either :)

Yea sorry I'm being harsh, the threads here grow too much. We all need
to make an effort not to repeat ourselves.

> The patch added GET and SET commands; GET returns the device's static and dynamic values in a consistent way, without burdening the device.
> Will moving to config space take away the GET command? No?
> 
> Then what does one gain other than the extra complexity of additional config space? Just more device memory burning.

I don't think Linux will ever use GET. It's there for completeness, I'm
fine with getting rid of it too.


> > 1.3 we vote on an interface to access config space through DMA (not sure it's a
> > VQ BTW but that's a separate discussion).  This should include a way to expose
> > a subset of DMA features through MMIO, for compatibility.
> 
> I think you are missing the main point that I try to highlight but presumably it didn't come across. :(
> 
> Having an optional DMA interface does _not_ help device to optimize, because device must support old drivers.

Yea, I don't get it, sorry.
So where's this old driver that uses the inner hash feature?
It maybe will be released down the road right?
Why can't it at the same time support DMA for access to this field?






> The proposal I explained in previous email is:
> a. All new fields via vq
> This means it is _guaranteed_ to be offchip and zero reason for backward compatibility, because there is no backward compat of non existing fields.
> 
> b. all existing fields stay in cfg space
> 
> c. Optionally existing fields can also be queried via vq


And what I say is even simpler:
	all config space is accessible through DMA
	some of it optionally in MMIO

which fields to have in MMIO will be up to device.
Research what the drivers you care about use, and include that.






> > If we are lucky guest support can land in the same Linux release that has inner
> > hash.  Drivers do not need to care: they just do virtio_cread and that is it.
> >
> Access to extended config space that is reachable only via DMA or VQ can surely be hidden inside virtio_cread(). Easy.
>  
> If we agree on this approach, great, one additional field like this can be added to config space.

I think we are pretty close then, great!


> > So a vendor that builds device with inner hash, can expose inner hash only
> > through DMA and not through MMIO.
> > 
> > I conclude that there's no reason to block this feature just because it uses config
> > space.
> >
> I must repeat that I am not blocking; I replied way back on both approaches, in v14 or even before that.
> And also explained part of the proposal.
> 
> This is why one should discuss the design and not keep asking for patches.

I certainly don't prevent anyone from posting design sketches.  Problem
is this: even spec patches are often hard to understand, ungrammatical,
etc. Not sure what to do here, but it's understandable when people only
start reviewing in earnest when the text gets halfway readable.


> > 
> > > > >
> > > > > > A good implementation of virtio_cread can abstract that easily
> > > > > > so we don't need to change drivers.
> > > > >
> > > > > There is no backward compat issue for the GET command being new.
> > > >
> > > > It's just a shortcut replacing what we really want.  As long as a
> > > > shortcut is available people will keep using exactly that.  So I
> > > > fully expect more proposals for such GET commands on the pretext
> > > > that one is there so why not another one. Adding more tech debt for
> > > > whoever finally gets around to building a config space access gateway.
> > > >
> > > Not really. as suggested, the first addition of new field to the config space in
> > 1.4-time frame, should add the cfgvq, and not follow the previous example.
> > > Because this is being thought through now, it is not at all hard for any new
> > things to follow the guideline.
> > 
> > Really. Oh great, so there will be 3 ways to provision things!
> >
> Not sure how you concluded 3 ways...
> Provision is via single command via AQ by the owner device.

heh but what does this command get? if everything device specific is in
config space it can just get config space layout exactly and we have no
extra work. if there's stuff not in config space then it needs extra
layout for that.

Yes I understand there's transport specific stuff.


> New config fields via VQ of member device itself, again single way.
> Existing config fields access via config space, single way.

That's two ways. The 3rd way refers to using cvq for inner hash
and future cfgvq for other config space.


> > I have not seen this patch yet.  And how long will this take to materialize? I don't
> > believe all TC work must be blocked until this happens, or alternatively use ad hoc
> > hacks.
> It is surely not the right way to ask for the patch when doing the design discussion and claiming that it is not ready.
> > 
> > I get it, you want to save on chip memory. So work on a consistent solution for
> > this.
> > 
> > All config space accesses should go through DMA.
> This is not predictable due to backward compat need.
> Hence the ask is all *new* config space access go through DMA.

Let's make it all go through DMA. As a vendor, you decide whether
inner hash is worth supporting now with MMIO or later with DMA.
What is wrong with that?


> > Thus no special guidelines should be necessary, and drivers can just keep doing
> > virtio_cread like they always did.
> >
> This can be very easily achieved in sw by knowing extended config space offset to use DMA.

Yea, doable. But why do you want to make future devices more expensive,
forever? Then people will turn around and say "we want IDPF, it does
not have this baggage".
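
The offset-based dispatch discussed above (a config read that picks MMIO or
DMA depending on where the field lives) could look roughly like the sketch
below. Everything in it is hypothetical and for illustration only: the names
cfg_read, mmio_read, dma_read, the EXT_CFG_START boundary and the sizes do not
come from the spec or from any driver, and both transports are simulated with
a plain array. It also assumes the DMA path can serve any offset, so a read
that straddles the boundary is routed to DMA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: offsets below EXT_CFG_START are legacy fields
 * reachable through MMIO; offsets at or above it are "extended" fields
 * reachable only through DMA. */
#define EXT_CFG_START 64u
#define CFG_SIZE      128u

static uint8_t dev_cfg[CFG_SIZE];      /* stand-in for device config contents */
static unsigned mmio_reads, dma_reads; /* counters, only to observe routing */

/* Simulated MMIO access (a real driver would do register reads here). */
static void mmio_read(size_t off, void *buf, size_t len)
{
    memcpy(buf, dev_cfg + off, len);
    mmio_reads++;
}

/* Simulated DMA access (a real driver would post a request and wait). */
static void dma_read(size_t off, void *buf, size_t len)
{
    memcpy(buf, dev_cfg + off, len);
    dma_reads++;
}

/* Single entry point for callers: the transport choice is hidden here,
 * so code that reads config fields does not need to change.
 * Returns 0 on success, -1 if the range is outside the config space. */
static int cfg_read(size_t off, void *buf, size_t len)
{
    if (len > CFG_SIZE || off > CFG_SIZE - len)
        return -1;

    if (off + len <= EXT_CFG_START)
        mmio_read(off, buf, len);  /* entirely inside the legacy region */
    else
        dma_read(off, buf, len);   /* touches the extended region */

    return 0;
}
```

A caller would just do cfg_read(offset, &value, sizeof(value)) and never see
which path was taken; whether a device also exposes a given field in MMIO
would only change the routing test inside cfg_read.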

> > To add to that, it will allow cheaper devices as some existing config space will be
> > able to move quickly.
> This is possible only for configurations where the device side can know early enough, through a new feature bit and new registers for DMA, which will pay off for a larger extended config space region.
> 
> 
> I want to summarize my humble input:
> 
> 1. all new extended config space fields to be accessed only via DMA/VQ by member device itself.

I propose optionally allowing this in MMIO if the device wants to.

> 2. all existing fields to stay in config space without any interface change

you mean in MMIO. ok.

> 3. optionally existing fields can be accessed via DMA/VQ by member device itself.

I propose making this mandatory.

> 4. If above proposal looks reasonable, sure, have the inner hash field of this patch as part of existing config space. I have no problem with it.
> 
> 5. any new fields proposed in extended config space in 1.4-time frame must implement DMA/VQ interface as part of the proposal.

Instead, I just propose that DMA is always a superset of MMIO.
So using DMA you know you are not missing out on anything.


> 6. And if this needs more discussion, let's spend quality time on it and defer the release by a week and at least agree on the path forward.
> No need to draft the spec normative language for design.

Yes no problem to delay a bit.

What do you think of the changes I proposed above?


-- 
MST



^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2023-06-28 17:08 UTC | newest]

Thread overview: 106+ messages
2023-06-21 13:50 [virtio-dev] [PATCH v18] virtio-net: support inner header hash Heng Qi
2023-06-21 13:50 ` [virtio-comment] " Heng Qi
2023-06-21 15:38 ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 15:38   ` [virtio-comment] " Michael S. Tsirkin
2023-06-21 16:46   ` [virtio-dev] " Heng Qi
2023-06-21 16:46     ` Heng Qi
2023-06-21 17:52     ` [virtio-dev] " Parav Pandit
2023-06-21 17:52       ` Parav Pandit
2023-06-21 19:25       ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 19:25         ` Michael S. Tsirkin
2023-06-21 19:28         ` [virtio-dev] " Parav Pandit
2023-06-21 19:28           ` Parav Pandit
2023-06-21 19:35           ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 19:35             ` Michael S. Tsirkin
2023-06-21 19:39             ` [virtio-dev] " Parav Pandit
2023-06-21 19:39               ` Parav Pandit
2023-06-21 19:45               ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 19:45                 ` Michael S. Tsirkin
2023-06-22  0:46             ` [virtio-dev] " Heng Qi
2023-06-22  0:46               ` [virtio-comment] " Heng Qi
2023-06-21 19:32     ` Michael S. Tsirkin
2023-06-21 19:32       ` Michael S. Tsirkin
2023-06-21 19:37       ` [virtio-dev] " Parav Pandit
2023-06-21 19:37         ` Parav Pandit
2023-06-21 20:16         ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 20:16           ` Michael S. Tsirkin
2023-06-21 20:24           ` [virtio-dev] " Parav Pandit
2023-06-21 20:24             ` Parav Pandit
2023-06-21 20:37             ` [virtio-dev] " Michael S. Tsirkin
2023-06-21 20:37               ` Michael S. Tsirkin
2023-06-21 20:52               ` [virtio-dev] " Parav Pandit
2023-06-21 20:52                 ` Parav Pandit
2023-06-22  0:59                 ` [virtio-comment] Re: [virtio-dev] " Heng Qi
2023-06-22  0:59                   ` Heng Qi
2023-06-22  1:04                   ` Parav Pandit
2023-06-22  1:04                     ` [virtio-comment] " Parav Pandit
2023-06-22  1:17                     ` Heng Qi
2023-06-22  1:17                       ` [virtio-comment] " Heng Qi
2023-06-22  6:23                 ` [virtio-dev] " Michael S. Tsirkin
2023-06-22  6:23                   ` Michael S. Tsirkin
2023-06-22 12:32                   ` [virtio-dev] " Parav Pandit
2023-06-22 12:32                     ` Parav Pandit
2023-06-22 13:42                     ` [virtio-dev] " Heng Qi
2023-06-22 13:42                       ` Heng Qi
2023-06-22 14:27                       ` [virtio-dev] " Parav Pandit
2023-06-22 14:27                         ` Parav Pandit
2023-06-22 16:46                         ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 16:46                           ` Michael S. Tsirkin
2023-06-22 16:54                           ` [virtio-dev] " Parav Pandit
2023-06-22 16:54                             ` Parav Pandit
2023-06-22 17:03                             ` Michael S. Tsirkin
2023-06-22 17:03                               ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 17:11                               ` [virtio-dev] " Parav Pandit
2023-06-22 17:11                                 ` Parav Pandit
2023-06-22 17:28                                 ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 17:28                                   ` Michael S. Tsirkin
2023-06-22 17:58                                   ` [virtio-dev] " Parav Pandit
2023-06-22 17:58                                     ` Parav Pandit
2023-06-28 10:41                                     ` [virtio-dev] " Michael S. Tsirkin
2023-06-28 10:41                                       ` Michael S. Tsirkin
2023-06-28 16:46                                       ` [virtio-dev] " Parav Pandit
2023-06-28 16:46                                         ` Parav Pandit
2023-06-28 17:08                                         ` [virtio-dev] " Michael S. Tsirkin
2023-06-28 17:08                                           ` Michael S. Tsirkin
2023-06-22 16:28                     ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 16:28                       ` Michael S. Tsirkin
2023-06-22 16:42                       ` [virtio-dev] " Parav Pandit
2023-06-22 16:42                         ` Parav Pandit
2023-06-22 16:54                         ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 16:54                           ` Michael S. Tsirkin
2023-06-22 17:04                           ` [virtio-dev] " Parav Pandit
2023-06-22 17:04                             ` [virtio-comment] " Parav Pandit
2023-06-22 17:14                             ` Michael S. Tsirkin
2023-06-22 17:14                               ` [virtio-comment] " Michael S. Tsirkin
2023-06-22 17:20                               ` Parav Pandit
2023-06-22 17:20                                 ` [virtio-comment] " Parav Pandit
2023-06-22 17:43                                 ` Michael S. Tsirkin
2023-06-22 17:43                                   ` [virtio-comment] " Michael S. Tsirkin
2023-06-22 18:12                                   ` Parav Pandit
2023-06-22 18:12                                     ` [virtio-comment] " Parav Pandit
2023-06-22 18:36                                     ` Michael S. Tsirkin
2023-06-22 18:36                                       ` [virtio-comment] " Michael S. Tsirkin
2023-06-22 17:11                     ` Michael S. Tsirkin
2023-06-22 17:11                       ` Michael S. Tsirkin
2023-06-22 17:15                       ` [virtio-dev] " Parav Pandit
2023-06-22 17:15                         ` Parav Pandit
2023-06-22 17:37                         ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 17:37                           ` Michael S. Tsirkin
2023-06-22 17:51                           ` [virtio-dev] " Parav Pandit
2023-06-22 17:51                             ` [virtio-comment] " Parav Pandit
2023-06-22 18:11                             ` Michael S. Tsirkin
2023-06-22 18:11                               ` [virtio-comment] " Michael S. Tsirkin
2023-06-22 18:17                               ` Parav Pandit
2023-06-22 18:17                                 ` [virtio-comment] " Parav Pandit
2023-06-22 18:40                                 ` Michael S. Tsirkin
2023-06-22 18:40                                   ` [virtio-comment] " Michael S. Tsirkin
2023-06-22 18:50                                   ` [virtio-dev] " Parav Pandit
2023-06-22 18:50                                     ` Parav Pandit
2023-06-22 19:02                                     ` [virtio-dev] " Michael S. Tsirkin
2023-06-22 19:02                                       ` Michael S. Tsirkin
2023-06-22 20:27                                       ` [virtio-dev] " Parav Pandit
2023-06-22 20:27                                         ` Parav Pandit
2023-06-28 10:47                                         ` Michael S. Tsirkin
2023-06-28 10:47                                           ` [virtio-dev] " Michael S. Tsirkin
2023-06-22  0:41       ` Heng Qi
2023-06-22  0:41         ` Heng Qi
