* [PATCH net-next 0/8] xen-netback/core: packet hashing
@ 2015-10-21 10:36 Paul Durrant
  2015-10-21 10:36 ` [PATCH net-next 1/8] xen-netback: re-import canonical netif header Paul Durrant
                   ` (19 more replies)
  0 siblings, 20 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant

This series adds xen-netback support for hash negotiation with a frontend
driver, and an implementation of Toeplitz hashing as the initial negotiable
algorithm.

Patch #1 re-imports the canonical netif header from Xen, which contains
the necessary definitions and types required by subsequent patches.
(Note that this patch is not completely style-clean, since the header
includes typedefs.)

Patch #2 is some cleanup in xen-netback.

Patch #3 adds code to allow multiple extra_info segments to be passed from a
frontend to xen-netback.

Patch #4 adds code to allow xen-netback to accept new hash extra_info
segments from a frontend and set the skb hash information appropriately. 

Patch #5 makes a change to struct sk_buff: one extra bit is used to allow
full hash type information to be stored, rather than just the l4_hash
boolean value.

Patch #6 adds code to xen-netback to pass L3 or L4 skb hash values to
capable frontends.

Patch #7 adds code to xen-netback to provide a configurable (by the
frontend) mapping from hash values to queue numbers.

Patch #8 adds code to xen-netback to provide Toeplitz hashing of skbs; an
illustrative sketch of the algorithm follows below.
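
For reference, the Toeplitz algorithm that patch #8 makes negotiable XORs
together 32-bit windows of a key, one window per set bit of the input. A
minimal, illustrative C sketch (not the code added by this series) is:

#include <stdint.h>
#include <stddef.h>

/*
 * Toeplitz hash over 'len' octets of 'data'. The key must be at least
 * len + 4 octets long: for every input bit that is set, the 32-bit
 * window of the key starting at that bit position is XORed into the
 * result. (For TCP/IPv4 the input is the concatenation of source
 * address, destination address, source port and destination port.)
 */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
			      size_t len)
{
	/* Prime the sliding window with the first four key octets. */
	uint64_t window = ((uint64_t)key[0] << 56) |
			  ((uint64_t)key[1] << 48) |
			  ((uint64_t)key[2] << 40) |
			  ((uint64_t)key[3] << 32);
	uint32_t hash = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		/* Pull the next key octet into the window. */
		window |= (uint64_t)key[i + 4] << 24;

		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1 << bit))
				hash ^= (uint32_t)(window >> 32);
			window <<= 1;
		}
	}

	return hash;
}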


* [PATCH net-next 1/8] xen-netback: re-import canonical netif header
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` [PATCH net-next 2/8] xen-netback: remove GSO information from xenvif_rx_meta Paul Durrant
                   ` (17 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel
  Cc: Paul Durrant, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, Wei Liu

The canonical netif header (in the Xen source repo) and the Linux variant
have diverged significantly. Recently, much documentation has been added to
the canonical header, along with new definitions and types to support packet
hash configuration. Subsequent patches in this series add support for packet
hash configuration in xen-netback, so this patch re-imports the canonical
header in readiness.

To maintain compatibility and some style consistency with the old Linux
variant, the header was stripped of its emacs boilerplate, then
post-processed and copied into place with the following commands:

ed -s netif.h << EOF
H
,s/NETTXF_/XEN_NETTXF_/g
,s/NETRXF_/XEN_NETRXF_/g
,s/NETIF_RSP/XEN_NETIF_RSP/g
,s/netif_tx/xen_netif_tx/g
,s/netif_rx/xen_netif_rx/g
,s/netif_extra_info/xen_netif_extra_info/g
w
EOF

indent --linux-style netif.h -o include/xen/interface/io/netif.h

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---

Whilst awaiting review of my patches to the canonical netif.h, import has
been done from my staging branch using:

wget "http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git;a=blob_plain;f=xen/include/public/io/netif.h;hb=refs/heads/netif"
---
 include/xen/interface/io/netif.h | 475 +++++++++++++++++++++++++++++++--------
 1 file changed, 381 insertions(+), 94 deletions(-)

diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index 252ffd4..832dc37 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -3,14 +3,32 @@
  *
  * Unified network-device I/O interface for Xen guest OSes.
  *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
  * Copyright (c) 2003-2004, Keir Fraser
  */
 
 #ifndef __XEN_PUBLIC_IO_NETIF_H__
 #define __XEN_PUBLIC_IO_NETIF_H__
 
-#include <xen/interface/io/ring.h>
-#include <xen/interface/grant_table.h>
+#include "ring.h"
+#include "../grant_table.h"
 
 /*
  * Older implementation of Xen network frontend / backend has an
@@ -38,10 +56,10 @@
  * that it cannot safely queue packets (as it may not be kicked to send them).
  */
 
- /*
+/*
  * "feature-split-event-channels" is introduced to separate guest TX
- * and RX notificaion. Backend either doesn't support this feature or
- * advertise it via xenstore as 0 (disabled) or 1 (enabled).
+ * and RX notification. Backend either doesn't support this feature or
+ * advertises it via xenstore as 0 (disabled) or 1 (enabled).
  *
  * To make use of this feature, frontend should allocate two event
  * channels for TX and RX, advertise them to backend as
@@ -96,11 +114,112 @@
  * error. This includes scenarios where more (or fewer) queues were
  * requested than the frontend provided details for.
  *
- * Mapping of packets to queues is considered to be a function of the
- * transmitting system (backend or frontend) and is not negotiated
- * between the two. Guests are free to transmit packets on any queue
- * they choose, provided it has been set up correctly. Guests must be
- * prepared to receive packets on any queue they have requested be set up.
+ * Unless a hash algorithm or mapping of packet hash to queues has been
+ * negotiated (see below), queue selection is considered to be a function of
+ * the transmitting system (backend or frontend) and either end is free to
+ * transmit packets on any queue, provided it has been set up correctly.
+ * Guests must therefore be prepared to receive packets on any queue they
+ * have requested be set up.
+ */
+
+/*
+ * Hash negotiation (only applicable if using multiple queues):
+ *
+ * A backend can advertise a set of hash algorithms that it can perform by
+ * naming them in a space separated list in the "multi-queue-hash-list"
+ * xenstore key. For example, if the backend supports the 'foo' and 'bar'
+ * algorithms it would set:
+ *
+ * /local/domain/X/backend/vif/Y/Z/multi-queue-hash-list = "foo bar"
+ *
+ * Additionally, in supporting a particular algorithm, it may be necessary
+ * for the backend to specify the capabilities of its implementation of
+ * that algorithm, e.g. what sections of packet header it can hash.
+ * To do that it can set algorithm-specific keys under a parent capabilities
+ * key. For example, if the 'bar' algorithm implementation in the backend
+ * is capable of hashing over an IP version 4 header and a TCP header, the
+ * backend might set:
+ *
+ * /local/domain/X/backend/vif/Y/Z/multi-queue-hash-caps-bar/types = "ipv4+tcp"
+ *
+ * The backend should set all such keys before it moves into the initwait
+ * state.
+ *
+ * The frontend can select a hash algorithm at any time after it moves into
+ * the connected state by setting the "multi-queue-hash" key. The backend
+ * must therefore watch this key and be prepared to change hash algorithms
+ * at any time whilst in the connected state. So, for example, if the
+ * frontend wants 'foo' hashing, it should set:
+ *
+ * /local/domain/Y/device/vif/Z/multi-queue-hash = "foo"
+ *
+ * Additionally it may set parameters for that algorithm by setting
+ * algorithm-specific keys under a parent parameters key. For example, if
+ * the 'foo' algorithm implementation in the backend is capable of hashing
+ * over an IP version 4 header, a TCP header or both, but the frontend
+ * only wants it to hash over the IP version 4 header, then it might set:
+ *
+ * /local/domain/Y/device/vif/Z/multi-queue-hash-params-foo/types = "ipv4"
+ *
+ * The backend must also watch the parameters key as the frontend may
+ * change the parameters at any time whilst in the connected state.
+ *
+ * (Capabilities and parameters documentation for specific algorithms is
+ * below).
+ *
+ * TOEPLITZ:
+ *
+ * If the backend supports Toeplitz hashing then it should include
+ * the algorithm name 'toeplitz' in its "multi-queue-hash-list" key.
+ * It should also advertise the following capabilities:
+ *
+ * types: a space separated list containing any or all of 'ipv4', 'tcpv4',
+ *        'ipv6', 'tcpv6', indicating over which headers the hash algorithm
+ *        is capable of being performed.
+ *
+ * max-key-length: an integer value indicating the maximum key length (in
+ *                 octets) that the frontend may supply.
+ *
+ * Upon selecting this algorithm, the frontend may supply the following
+ * parameters.
+ *
+ * types: a space separated list containing none, any or all of the type
+ *        names included in the types list in the capabilities.
+ *        When the backend encounters a packet type not in this list it
+ *        will assign a hash value of 0.
+ *
+ * key: a ':' separated list of octets (up to the maximum length specified
+ *      in the capabilities) expressed in hexadecimal indicating the key
+ *      that should be used in the hash calculation.
+ *
+ * For more information on Toeplitz hash calculation see:
+ *
+ * https://msdn.microsoft.com/en-us/library/windows/hardware/ff570725.aspx
+ */
+
+/*
+ * Hash mapping (only applicable if using multiple queues):
+ *
+ * If the backend is not capable, or no mapping is specified by the frontend,
+ * then it is assumed that the hash -> queue mapping is done by simple
+ * modular arithmetic.
+ *
+ * To advertise that it is capable of accepting a specific mapping from the
+ * frontend the backend should set the "multi-queue-max-hash-mapping-length"
+ * key to a non-zero value. The frontend may then specify a mapping (up to
+ * the maximum specified length) as a ',' separated list of decimal queue
+ * numbers in the "multi-queue-hash-mapping" key.
+ *
+ * The backend should parse this list into an array and perform the mapping
+ * as follows:
+ *
+ * queue = mapping[hash % length-of-list]
+ *
+ * If any of the queue values specified in the list is not connected then
+ * the backend is free to choose a connected queue arbitrarily.
+ *
+ * The backend must be prepared to handle updates to the mapping list at any
+ * time whilst in the connected state.
  */
 
 /*
@@ -118,151 +237,319 @@
  */
 
 /*
+ * "feature-multicast-control" advertises the capability to filter ethernet
+ * multicast packets in the backend. To enable use of this capability the
+ * frontend must set "request-multicast-control" before moving into the
+ * connected state.
+ *
+ * If "request-multicast-control" is set then the backend transmit side should
+ * no longer flood multicast packets to the frontend; it should instead drop any
+ * multicast packet that does not match in a filter list. The list is
+ * amended by the frontend by sending dummy transmit requests containing
+ * XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as specified below.
+ * Once enabled by the frontend, the feature cannot be disabled except by
+ * closing and re-connecting to the backend.
+ */
+
+/*
+ * "feature-hash" advertises the capability to accept extra info slots of
+ * type XEN_NETIF_EXTRA_TYPE_HASH. They will not be sent by either end
+ * unless the other end advertises this feature.
+ */
+
+/*
  * This is the 'wire' format for packets:
- *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
- * [Request 2: xen_netif_extra_info]    (only if request 1 has XEN_NETTXF_extra_info)
- * [Request 3: xen_netif_extra_info]    (only if request 2 has XEN_NETIF_EXTRA_MORE)
- *  Request 4: xen_netif_tx_request  -- XEN_NETTXF_more_data
- *  Request 5: xen_netif_tx_request  -- XEN_NETTXF_more_data
+ *  Request 1: xen_netif_tx_request_t -- XEN_NETTXF_* (any flags)
+ * [Request 2: xen_netif_extra_info_t] (only if request 1 has
+ *                                  XEN_NETTXF_extra_info)
+ * [Request 3: xen_netif_extra_info_t] (only if request 2 has
+ *                                  XEN_NETIF_EXTRA_MORE)
+ *  Request 4: xen_netif_tx_request_t -- XEN_NETTXF_more_data
+ *  Request 5: xen_netif_tx_request_t -- XEN_NETTXF_more_data
  *  ...
- *  Request N: xen_netif_tx_request  -- 0
+ *  Request N: xen_netif_tx_request_t -- 0
+ */
+
+/*
+ * Guest transmit
+ * ==============
+ *
+ * Ring slot size is 12 octets; however, not all request/response
+ * structs use the full size.
+ *
+ * tx request data (xen_netif_tx_request_t)
+ * ------------------------------------
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | grant ref             | offset    | flags     |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | id        | size      |
+ * +-----+-----+-----+-----+
+ *
+ * grant ref: Reference to buffer page.
+ * offset: Offset within buffer page.
+ * flags: XEN_NETTXF_*.
+ * id: request identifier, echoed in response.
+ * size: packet size in bytes.
+ *
+ * tx response (xen_netif_tx_response_t)
+ * ---------------------------------
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | id        | status    | unused                |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | unused                |
+ * +-----+-----+-----+-----+
+ *
+ * id: reflects id in transmit request
+ * status: XEN_NETIF_RSP_*
+ *
+ * Guest receive
+ * =============
+ *
+ * Ring slot size is 8 octets.
+ *
+ * rx request (xen_netif_rx_request_t)
+ * -------------------------------
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | id        | pad       | gref                  |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * id: request identifier, echoed in response.
+ * gref: reference to incoming granted frame.
+ *
+ * rx response (xen_netif_rx_response_t)
+ * ---------------------------------
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | id        | offset    | flags     | status    |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * id: reflects id in receive request
+ * offset: offset in page of start of received packet
+ * flags: XEN_NETRXF_*
+ * status: -ve: XEN_NETIF_RSP_*; +ve: Rx'ed pkt size.
+ *
+ * NOTE: Historically, to support GSO on the frontend receive side, Linux
+ *       netfront does not make use of the rx response id (because, as
+ *       described below, extra info structures overlay the id field).
+ *       Instead it assumes that responses always appear in the same ring
+ *       slot as their corresponding request. Thus, to maintain
+ *       compatibility, backends must make sure this is the case.
+ *
+ * Extra Info
+ * ==========
+ *
+ * Can be present if initial request or response has XEN_NET{T,R}XF_extra_info,
+ * or previous extra request has XEN_NETIF_EXTRA_MORE.
+ *
+ * The struct therefore needs to fit into either a tx or rx slot and
+ * is therefore limited to 8 octets.
+ *
+ * NOTE: Because extra info data overlays the usual request/response
+ *       structures, there is no id information in the opposite direction.
+ *       So, if an extra info overlays an rx response the frontend can
+ *       assume that it is in the same ring slot as the request that was
+ *       consumed to make the slot available, and the backend must ensure
+ *       this assumption is true.
+ *
+ * extra info (xen_netif_extra_info_t)
+ * -------------------------------
+ *
+ * General format:
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * |type |flags| type specific data                |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | padding for tx        |
+ * +-----+-----+-----+-----+
+ *
+ * type: XEN_NETIF_EXTRA_TYPE_*
+ * flags: XEN_NETIF_EXTRA_FLAG_*
+ * padding for tx: present only in the tx case due to 8 octet limit
+ *                 from rx case. Not shown in type specific entries
+ *                 below.
+ *
+ * XEN_NETIF_EXTRA_TYPE_GSO:
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * |type |flags| size      |type | pad | features  |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * type: Must be XEN_NETIF_EXTRA_TYPE_GSO
+ * flags: XEN_NETIF_EXTRA_FLAG_*
+ * size: Maximum payload size of each segment. For example,
+ *       for TCP this is just the path MSS.
+ * type: XEN_NETIF_GSO_TYPE_*: This determines the protocol of
+ *       the packet and any extra features required to segment the
+ *       packet properly.
+ * features: XEN_NETIF_GSO_FEAT_*: This specifies any extra GSO
+ *           features required to process this packet, such as ECN
+ *           support for TCPv4.
+ *
+ * XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL}:
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * |type |flags| addr                              |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * type: Must be XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL}
+ * flags: XEN_NETIF_EXTRA_FLAG_*
+ * addr: address to add/remove
+ *
+ * XEN_NETIF_EXTRA_TYPE_HASH:
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * |type |flags|htype| pad |LSB ---- value ---- MSB|
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * type: Must be XEN_NETIF_EXTRA_TYPE_HASH
+ * flags: XEN_NETIF_EXTRA_FLAG_*
+ * htype: XEN_NETIF_HASH_TYPE_*
+ * value: Hash value
  */
 
 /* Protocol checksum field is blank in the packet (hardware offload)? */
-#define _XEN_NETTXF_csum_blank		(0)
-#define  XEN_NETTXF_csum_blank		(1U<<_XEN_NETTXF_csum_blank)
+#define _XEN_NETTXF_csum_blank     (0)
+#define  XEN_NETTXF_csum_blank     (1U<<_XEN_NETTXF_csum_blank)
 
 /* Packet data has been validated against protocol checksum. */
-#define _XEN_NETTXF_data_validated	(1)
-#define  XEN_NETTXF_data_validated	(1U<<_XEN_NETTXF_data_validated)
+#define _XEN_NETTXF_data_validated (1)
+#define  XEN_NETTXF_data_validated (1U<<_XEN_NETTXF_data_validated)
 
 /* Packet continues in the next request descriptor. */
-#define _XEN_NETTXF_more_data		(2)
-#define  XEN_NETTXF_more_data		(1U<<_XEN_NETTXF_more_data)
+#define _XEN_NETTXF_more_data      (2)
+#define  XEN_NETTXF_more_data      (1U<<_XEN_NETTXF_more_data)
 
 /* Packet to be followed by extra descriptor(s). */
-#define _XEN_NETTXF_extra_info		(3)
-#define  XEN_NETTXF_extra_info		(1U<<_XEN_NETTXF_extra_info)
+#define _XEN_NETTXF_extra_info     (3)
+#define  XEN_NETTXF_extra_info     (1U<<_XEN_NETTXF_extra_info)
 
 #define XEN_NETIF_MAX_TX_SIZE 0xFFFF
 struct xen_netif_tx_request {
-    grant_ref_t gref;      /* Reference to buffer page */
-    uint16_t offset;       /* Offset within buffer page */
-    uint16_t flags;        /* XEN_NETTXF_* */
-    uint16_t id;           /* Echoed in response message. */
-    uint16_t size;         /* Packet size in bytes.       */
+	grant_ref_t gref;
+	uint16_t offset;
+	uint16_t flags;
+	uint16_t id;
+	uint16_t size;
 };
+typedef struct xen_netif_tx_request xen_netif_tx_request_t;
 
 /* Types of xen_netif_extra_info descriptors. */
-#define XEN_NETIF_EXTRA_TYPE_NONE	(0)  /* Never used - invalid */
-#define XEN_NETIF_EXTRA_TYPE_GSO	(1)  /* u.gso */
-#define XEN_NETIF_EXTRA_TYPE_MCAST_ADD	(2)  /* u.mcast */
-#define XEN_NETIF_EXTRA_TYPE_MCAST_DEL	(3)  /* u.mcast */
-#define XEN_NETIF_EXTRA_TYPE_MAX	(4)
+#define XEN_NETIF_EXTRA_TYPE_NONE       (0)	/* Never used - invalid */
+#define XEN_NETIF_EXTRA_TYPE_GSO        (1)	/* u.gso */
+#define XEN_NETIF_EXTRA_TYPE_MCAST_ADD  (2)	/* u.mcast */
+#define XEN_NETIF_EXTRA_TYPE_MCAST_DEL  (3)	/* u.mcast */
+#define XEN_NETIF_EXTRA_TYPE_HASH       (4)	/* u.hash */
+#define XEN_NETIF_EXTRA_TYPE_MAX        (5)
 
-/* xen_netif_extra_info flags. */
-#define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
-#define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
+/* xen_netif_extra_info_t flags. */
+#define _XEN_NETIF_EXTRA_FLAG_MORE (0)
+#define XEN_NETIF_EXTRA_FLAG_MORE  (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
 
 /* GSO types */
-#define XEN_NETIF_GSO_TYPE_NONE		(0)
-#define XEN_NETIF_GSO_TYPE_TCPV4	(1)
-#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
+#define XEN_NETIF_GSO_TYPE_NONE         (0)
+#define XEN_NETIF_GSO_TYPE_TCPV4        (1)
+#define XEN_NETIF_GSO_TYPE_TCPV6        (2)
+
+/* Hash types */
+#define XEN_NETIF_HASH_TYPE_NONE        (0)
+#define XEN_NETIF_HASH_TYPE_TCPV4       (1)
+#define XEN_NETIF_HASH_TYPE_IPV4        (2)
+#define XEN_NETIF_HASH_TYPE_TCPV6       (3)
+#define XEN_NETIF_HASH_TYPE_IPV6        (4)
 
 /*
- * This structure needs to fit within both netif_tx_request and
- * netif_rx_response for compatibility.
+ * This structure needs to fit within both xen_netif_tx_request_t and
+ * xen_netif_rx_response_t for compatibility.
  */
 struct xen_netif_extra_info {
-	uint8_t type;  /* XEN_NETIF_EXTRA_TYPE_* */
-	uint8_t flags; /* XEN_NETIF_EXTRA_FLAG_* */
-
+	uint8_t type;
+	uint8_t flags;
 	union {
 		struct {
-			/*
-			 * Maximum payload size of each segment. For
-			 * example, for TCP this is just the path MSS.
-			 */
 			uint16_t size;
-
-			/*
-			 * GSO type. This determines the protocol of
-			 * the packet and any extra features required
-			 * to segment the packet properly.
-			 */
-			uint8_t type; /* XEN_NETIF_GSO_TYPE_* */
-
-			/* Future expansion. */
+			uint8_t type;
 			uint8_t pad;
-
-			/*
-			 * GSO features. This specifies any extra GSO
-			 * features required to process this packet,
-			 * such as ECN support for TCPv4.
-			 */
-			uint16_t features; /* XEN_NETIF_GSO_FEAT_* */
+			uint16_t features;
 		} gso;
-
 		struct {
-			uint8_t addr[6]; /* Address to add/remove. */
+			uint8_t addr[6];
 		} mcast;
+		struct {
+			uint8_t type;
+			uint8_t pad;
+			uint8_t value[4];
+		} hash;
 
 		uint16_t pad[3];
 	} u;
 };
+typedef struct xen_netif_extra_info xen_netif_extra_info_t;
 
 struct xen_netif_tx_response {
 	uint16_t id;
-	int16_t  status;       /* XEN_NETIF_RSP_* */
+	int16_t status;
 };
+typedef struct xen_netif_tx_response xen_netif_tx_response_t;
 
 struct xen_netif_rx_request {
-	uint16_t    id;        /* Echoed in response message.        */
-	grant_ref_t gref;      /* Reference to incoming granted frame */
+	uint16_t id;
+	uint16_t pad;
+	grant_ref_t gref;
 };
+typedef struct xen_netif_rx_request xen_netif_rx_request_t;
 
 /* Packet data has been validated against protocol checksum. */
-#define _XEN_NETRXF_data_validated	(0)
-#define  XEN_NETRXF_data_validated	(1U<<_XEN_NETRXF_data_validated)
+#define _XEN_NETRXF_data_validated (0)
+#define  XEN_NETRXF_data_validated (1U<<_XEN_NETRXF_data_validated)
 
 /* Protocol checksum field is blank in the packet (hardware offload)? */
-#define _XEN_NETRXF_csum_blank		(1)
-#define  XEN_NETRXF_csum_blank		(1U<<_XEN_NETRXF_csum_blank)
+#define _XEN_NETRXF_csum_blank     (1)
+#define  XEN_NETRXF_csum_blank     (1U<<_XEN_NETRXF_csum_blank)
 
 /* Packet continues in the next request descriptor. */
-#define _XEN_NETRXF_more_data		(2)
-#define  XEN_NETRXF_more_data		(1U<<_XEN_NETRXF_more_data)
+#define _XEN_NETRXF_more_data      (2)
+#define  XEN_NETRXF_more_data      (1U<<_XEN_NETRXF_more_data)
 
 /* Packet to be followed by extra descriptor(s). */
-#define _XEN_NETRXF_extra_info		(3)
-#define  XEN_NETRXF_extra_info		(1U<<_XEN_NETRXF_extra_info)
+#define _XEN_NETRXF_extra_info     (3)
+#define  XEN_NETRXF_extra_info     (1U<<_XEN_NETRXF_extra_info)
 
-/* GSO Prefix descriptor. */
-#define _XEN_NETRXF_gso_prefix		(4)
-#define  XEN_NETRXF_gso_prefix		(1U<<_XEN_NETRXF_gso_prefix)
+/* Packet has GSO prefix. Deprecated but included for compatibility */
+#define _XEN_NETRXF_gso_prefix     (4)
+#define  XEN_NETRXF_gso_prefix     (1U<<_XEN_NETRXF_gso_prefix)
 
 struct xen_netif_rx_response {
-    uint16_t id;
-    uint16_t offset;       /* Offset in page of start of received packet  */
-    uint16_t flags;        /* XEN_NETRXF_* */
-    int16_t  status;       /* -ve: BLKIF_RSP_* ; +ve: Rx'ed pkt size. */
+	uint16_t id;
+	uint16_t offset;
+	uint16_t flags;
+	int16_t status;
 };
+typedef struct xen_netif_rx_response xen_netif_rx_response_t;
 
 /*
  * Generate netif ring structures and types.
  */
 
-DEFINE_RING_TYPES(xen_netif_tx,
-		  struct xen_netif_tx_request,
+DEFINE_RING_TYPES(xen_netif_tx, struct xen_netif_tx_request,
 		  struct xen_netif_tx_response);
-DEFINE_RING_TYPES(xen_netif_rx,
-		  struct xen_netif_rx_request,
+DEFINE_RING_TYPES(xen_netif_rx, struct xen_netif_rx_request,
 		  struct xen_netif_rx_response);
 
-#define XEN_NETIF_RSP_DROPPED	-2
-#define XEN_NETIF_RSP_ERROR	-1
-#define XEN_NETIF_RSP_OKAY	 0
-/* No response: used for auxiliary requests (e.g., xen_netif_extra_info). */
-#define XEN_NETIF_RSP_NULL	 1
+#define XEN_NETIF_RSP_DROPPED         -2
+#define XEN_NETIF_RSP_ERROR           -1
+#define XEN_NETIF_RSP_OKAY             0
+/* No response: used for auxiliary requests (e.g., xen_netif_extra_info_t). */
+#define XEN_NETIF_RSP_NULL             1
 
 #endif
-- 
2.1.4
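
As a concrete illustration of the negotiation scheme documented in the
header above, a backend supporting Toeplitz hashing might advertise its
capabilities via the Linux xenbus API roughly as follows. This is a
hedged sketch only: the helper name and the values written are examples,
not code from this series.

#include <xen/xenbus.h>

/* Sketch: advertise hash support before entering the initwait state.
 * 'dev' is the backend's xenbus_device; error handling is omitted.
 */
static void advertise_hash_algorithms(struct xenbus_device *dev)
{
	xenbus_printf(XBT_NIL, dev->nodename,
		      "multi-queue-hash-list", "%s", "toeplitz");
	xenbus_printf(XBT_NIL, dev->nodename,
		      "multi-queue-hash-caps-toeplitz/types",
		      "%s", "ipv4 tcpv4 ipv6 tcpv6");
	xenbus_printf(XBT_NIL, dev->nodename,
		      "multi-queue-hash-caps-toeplitz/max-key-length",
		      "%u", 40);
}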

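The hash-to-queue mapping described in the header reduces to a few lines
of code. The sketch below is illustrative only (the function and its
parameters are hypothetical): it shows the frontend-supplied table with
the default modular-arithmetic fallback.

#include <stdint.h>

/* Sketch: select a queue from a frontend-supplied mapping table parsed
 * from the ','-separated "multi-queue-hash-mapping" key, falling back
 * to simple modular arithmetic when no mapping has been negotiated.
 */
static unsigned int hash_to_queue(const unsigned int *mapping,
				  unsigned int mapping_len,
				  unsigned int num_queues, uint32_t hash)
{
	unsigned int queue;

	if (!mapping || mapping_len == 0)
		return hash % num_queues;

	queue = mapping[hash % mapping_len];

	/* A queue that is not connected may be replaced by any connected
	 * queue; queue 0 is used here for simplicity.
	 */
	if (queue >= num_queues)
		queue = 0;

	return queue;
}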

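Similarly, the XEN_NETIF_EXTRA_TYPE_HASH slot defined above stores its
value least-significant octet first. A sketch of populating such a slot
(hypothetical helper, not code from this series):

#include <stdint.h>
#include <xen/interface/io/netif.h>

/* Sketch: fill a hash extra info slot per the wire format above. */
static void fill_hash_extra(struct xen_netif_extra_info *extra,
			    uint8_t hash_type, uint32_t hash)
{
	extra->type = XEN_NETIF_EXTRA_TYPE_HASH;
	extra->flags = 0;
	extra->u.hash.type = hash_type;	/* XEN_NETIF_HASH_TYPE_* */
	extra->u.hash.pad = 0;
	extra->u.hash.value[0] = hash & 0xff;	/* LSB first */
	extra->u.hash.value[1] = (hash >> 8) & 0xff;
	extra->u.hash.value[2] = (hash >> 16) & 0xff;
	extra->u.hash.value[3] = (hash >> 24) & 0xff;
}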

* [PATCH net-next 2/8] xen-netback: remove GSO information from xenvif_rx_meta
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (2 preceding siblings ...)
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` [PATCH net-next 3/8] xen-netback: support multiple extra info segments passed from frontend Paul Durrant
                   ` (15 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

The code in xenvif_rx_action() that builds rx responses has direct access
to the skb, so there is no need to copy this information into the meta
structure.

This patch removes the extraneous fields, saving space in the array and
removing many lines of code.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h  |  2 --
 drivers/net/xen-netback/netback.c | 75 +++++++++++++++------------------------
 2 files changed, 29 insertions(+), 48 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a7bf747..136ace1 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -70,8 +70,6 @@ struct pending_tx_info {
 struct xenvif_rx_meta {
 	int id;
 	int size;
-	int gso_type;
-	int gso_size;
 };
 
 #define GSO_BIT(type) \
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index ec98d43..27c6779 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -263,8 +263,6 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif_queue *queue,
 	req = RING_GET_REQUEST(&queue->rx, queue->rx.req_cons++);
 
 	meta = npo->meta + npo->meta_prod++;
-	meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
-	meta->gso_size = 0;
 	meta->size = 0;
 	meta->id = req->id;
 
@@ -281,12 +279,13 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif_queue *queue,
 static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb,
 				 struct netrx_pending_operations *npo,
 				 struct page *page, unsigned long size,
-				 unsigned long offset, int *head)
+				 unsigned long offset, int *head,
+				 int gso_type)
 {
+	struct xenvif *vif = queue->vif;
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
 	unsigned long bytes;
-	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
 
 	/* Data must not cross a page boundary. */
 	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -303,8 +302,14 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 		BUG_ON(offset >= PAGE_SIZE);
 		BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);
 
-		if (npo->copy_off == MAX_BUFFER_OFFSET)
+		if (npo->copy_off == MAX_BUFFER_OFFSET) {
+			/* Leave a gap for the GSO descriptor. */
+			if (*head && ((1 << gso_type) & vif->gso_mask))
+				queue->rx.req_cons++;
+
 			meta = get_next_rx_buffer(queue, npo);
+			*head = 0;
+		}
 
 		bytes = PAGE_SIZE - offset;
 		if (bytes > size)
@@ -345,20 +350,6 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 			page++;
 			offset = 0;
 		}
-
-		/* Leave a gap for the GSO descriptor. */
-		if (skb_is_gso(skb)) {
-			if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
-				gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
-			else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
-				gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
-		}
-
-		if (*head && ((1 << gso_type) & queue->vif->gso_mask))
-			queue->rx.req_cons++;
-
-		*head = 0; /* There must be something in this buffer now. */
-
 	}
 }
 
@@ -402,27 +393,11 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	if ((1 << gso_type) & vif->gso_prefix_mask) {
 		req = RING_GET_REQUEST(&queue->rx, queue->rx.req_cons++);
 		meta = npo->meta + npo->meta_prod++;
-		meta->gso_type = gso_type;
-		meta->gso_size = skb_shinfo(skb)->gso_size;
 		meta->size = 0;
 		meta->id = req->id;
 	}
 
-	req = RING_GET_REQUEST(&queue->rx, queue->rx.req_cons++);
-	meta = npo->meta + npo->meta_prod++;
-
-	if ((1 << gso_type) & vif->gso_mask) {
-		meta->gso_type = gso_type;
-		meta->gso_size = skb_shinfo(skb)->gso_size;
-	} else {
-		meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
-		meta->gso_size = 0;
-	}
-
-	meta->size = 0;
-	meta->id = req->id;
-	npo->copy_off = 0;
-	npo->copy_gref = req->gref;
+	get_next_rx_buffer(queue, npo);
 
 	data = skb->data;
 	while (data < skb_tail_pointer(skb)) {
@@ -433,7 +408,8 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 			len = skb_tail_pointer(skb) - data;
 
 		xenvif_gop_frag_copy(queue, skb, npo,
-				     virt_to_page(data), len, offset, &head);
+				     virt_to_page(data), len, offset, &head,
+				     gso_type);
 		data += len;
 	}
 
@@ -442,7 +418,7 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 				     skb_frag_page(&skb_shinfo(skb)->frags[i]),
 				     skb_frag_size(&skb_shinfo(skb)->frags[i]),
 				     skb_shinfo(skb)->frags[i].page_offset,
-				     &head);
+				     &head, gso_type);
 	}
 
 	return npo->meta_prod - old_meta_prod;
@@ -542,15 +518,23 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 	gnttab_batch_copy(queue->grant_copy_op, npo.copy_prod);
 
 	while ((skb = __skb_dequeue(&rxq)) != NULL) {
+		struct xenvif *vif = queue->vif;
+		int gso_type = XEN_NETIF_GSO_TYPE_NONE;
+
+		if (skb_is_gso(skb)) {
+			if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+				gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+			else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+				gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		}
 
-		if ((1 << queue->meta[npo.meta_cons].gso_type) &
-		    queue->vif->gso_prefix_mask) {
+		if ((1 << gso_type) & vif->gso_prefix_mask) {
 			resp = RING_GET_RESPONSE(&queue->rx,
 						 queue->rx.rsp_prod_pvt++);
 
 			resp->flags = XEN_NETRXF_gso_prefix | XEN_NETRXF_more_data;
 
-			resp->offset = queue->meta[npo.meta_cons].gso_size;
+			resp->offset = skb_shinfo(skb)->gso_size;
 			resp->id = queue->meta[npo.meta_cons].id;
 			resp->status = XENVIF_RX_CB(skb)->meta_slots_used;
 
@@ -562,7 +546,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 		queue->stats.tx_bytes += skb->len;
 		queue->stats.tx_packets++;
 
-		status = xenvif_check_gop(queue->vif,
+		status = xenvif_check_gop(vif,
 					  XENVIF_RX_CB(skb)->meta_slots_used,
 					  &npo);
 
@@ -583,8 +567,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 					queue->meta[npo.meta_cons].size,
 					flags);
 
-		if ((1 << queue->meta[npo.meta_cons].gso_type) &
-		    queue->vif->gso_mask) {
+		if ((1 << gso_type) & vif->gso_mask) {
 			struct xen_netif_extra_info *gso =
 				(struct xen_netif_extra_info *)
 				RING_GET_RESPONSE(&queue->rx,
@@ -592,8 +575,8 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
 			resp->flags |= XEN_NETRXF_extra_info;
 
-			gso->u.gso.type = queue->meta[npo.meta_cons].gso_type;
-			gso->u.gso.size = queue->meta[npo.meta_cons].gso_size;
+			gso->u.gso.type = gso_type;
+			gso->u.gso.size = skb_shinfo(skb)->gso_size;
 			gso->u.gso.pad = 0;
 			gso->u.gso.features = 0;
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 3/8] xen-netback: support multiple extra info segments passed from frontend
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (3 preceding siblings ...)
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` Paul Durrant
                   ` (14 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

The code does not currently allow a frontend to pass multiple extra info
segments to the backend in a tx request. A subsequent patch in this series
needs this functionality, so it is added here, without any other
functional change, for better bisectability.
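
For reference, a frontend chains extra info segments by setting
XEN_NETIF_EXTRA_FLAG_MORE on every segment that is followed by another
one, with XEN_NETTXF_extra_info set on the originating tx request. A
minimal frontend-side sketch (the helper name is hypothetical, not part
of this patch):

  /* Chain a GSO extra and a hash extra after a tx request. */
  static void chain_tx_extras(struct xen_netif_tx_request *req,
                              struct xen_netif_extra_info *gso,
                              struct xen_netif_extra_info *hash)
  {
          req->flags |= XEN_NETTXF_extra_info;       /* extras follow */
          gso->flags |= XEN_NETIF_EXTRA_FLAG_MORE;   /* one more after */
          hash->flags &= ~XEN_NETIF_EXTRA_FLAG_MORE; /* last segment */
  }

The new extra_count field records how many such segments followed the
request, so that make_tx_response() can emit a matching number of
XEN_NETIF_RSP_NULL slots.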

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h  |  1 +
 drivers/net/xen-netback/netback.c | 27 +++++++++++++++++++--------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 136ace1..ce40bd7 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -51,6 +51,7 @@ typedef unsigned int pending_ring_idx_t;
 
 struct pending_tx_info {
 	struct xen_netif_tx_request req; /* tx request */
+	unsigned int extra_count; /* Number of extras following the request */
 	/* Callback data for released SKBs. The callback is always
 	 * xenvif_zerocopy_callback, desc contains the pending_idx, which is
 	 * also an index in pending_tx_info array. It is initialized in
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 27c6779..9f0c9f5 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -95,7 +95,8 @@ static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
 
 static void make_tx_response(struct xenvif_queue *queue,
 			     struct xen_netif_tx_request *txp,
-			     s8       st);
+			     s8       st,
+			     unsigned int extra_count);
 static void push_tx_responses(struct xenvif_queue *queue);
 
 static inline int tx_work_todo(struct xenvif_queue *queue);
@@ -646,7 +647,7 @@ static void xenvif_tx_err(struct xenvif_queue *queue,
 
 	do {
 		spin_lock_irqsave(&queue->response_lock, flags);
-		make_tx_response(queue, txp, XEN_NETIF_RSP_ERROR);
+		make_tx_response(queue, txp, XEN_NETIF_RSP_ERROR, 0);
 		push_tx_responses(queue);
 		spin_unlock_irqrestore(&queue->response_lock, flags);
 		if (cons == end)
@@ -1292,7 +1293,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 			make_tx_response(queue, &txreq,
 					 (ret == 0) ?
 					 XEN_NETIF_RSP_OKAY :
-					 XEN_NETIF_RSP_ERROR);
+					 XEN_NETIF_RSP_ERROR,
+					 1);
 			push_tx_responses(queue);
 			continue;
 		}
@@ -1303,7 +1305,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 			extra = &extras[XEN_NETIF_EXTRA_TYPE_MCAST_DEL - 1];
 			xenvif_mcast_del(queue->vif, extra->u.mcast.addr);
 
-			make_tx_response(queue, &txreq, XEN_NETIF_RSP_OKAY);
+			make_tx_response(queue, &txreq, XEN_NETIF_RSP_OKAY, 1);
 			push_tx_responses(queue);
 			continue;
 		}
@@ -1411,6 +1413,9 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 			       sizeof(txreq));
 		}
 
+		if (extras[XEN_NETIF_EXTRA_TYPE_GSO - 1].type)
+			queue->pending_tx_info[pending_idx].extra_count++;
+
 		queue->pending_cons++;
 
 		gop = xenvif_get_requests(queue, skb, txfrags, gop,
@@ -1749,7 +1754,9 @@ static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
 
 	spin_lock_irqsave(&queue->response_lock, flags);
 
-	make_tx_response(queue, &pending_tx_info->req, status);
+	make_tx_response(queue, &pending_tx_info->req, status,
+			 pending_tx_info->extra_count);
+	memset(pending_tx_info, 0, sizeof(*pending_tx_info));
 
 	/* Release the pending index before pusing the Tx response so
 	 * its available before a new Tx request is pushed by the
@@ -1766,7 +1773,8 @@ static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
 
 static void make_tx_response(struct xenvif_queue *queue,
 			     struct xen_netif_tx_request *txp,
-			     s8       st)
+			     s8       st,
+			     unsigned int extra_count)
 {
 	RING_IDX i = queue->tx.rsp_prod_pvt;
 	struct xen_netif_tx_response *resp;
@@ -1775,8 +1783,11 @@ static void make_tx_response(struct xenvif_queue *queue,
 	resp->id     = txp->id;
 	resp->status = st;
 
-	if (txp->flags & XEN_NETTXF_extra_info)
-		RING_GET_RESPONSE(&queue->tx, ++i)->status = XEN_NETIF_RSP_NULL;
+	WARN_ON(!(txp->flags & XEN_NETTXF_extra_info) != !extra_count);
+
+	while (extra_count-- != 0)
+		RING_GET_RESPONSE(&queue->tx, ++i)->status =
+			XEN_NETIF_RSP_NULL;
 
 	queue->tx.rsp_prod_pvt = ++i;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 4/8] xen-netback: accept an L4 or L3 skb hash value from the frontend
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (5 preceding siblings ...)
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` Paul Durrant
                   ` (12 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

This patch adds an indication that netback is capable of handling hash
values passed from the frontend (see netif.h for details), and the code
necessary to process the additional xen_netif_extra_info segment and
set a hash on the skb.
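
The capability is advertised via a "feature-hash" key in xenstore. A
frontend could test for it along these lines (a sketch only; the
function name is hypothetical and not part of this patch):

  /* Does the backend accept hash extra info segments? */
  static bool backend_supports_hash(struct xenbus_device *dev)
  {
          int val;

          if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-hash",
                           "%d", &val) < 0)
                  val = 0;

          return !!val;
  }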

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/netback.c | 25 +++++++++++++++++++++++++
 drivers/net/xen-netback/xenbus.c  |  8 ++++++++
 2 files changed, 33 insertions(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 9f0c9f5..3799b5a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1383,6 +1383,28 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 			}
 		}
 
+		if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type) {
+			struct xen_netif_extra_info *extra =
+				&extras[XEN_NETIF_EXTRA_TYPE_HASH - 1];
+			/* Read the value only after extra is set up. */
+			u32 hash = *(u32 *)extra->u.hash.value;
+
+			switch (extra->u.hash.type) {
+			case XEN_NETIF_HASH_TYPE_TCPV4:
+			case XEN_NETIF_HASH_TYPE_TCPV6:
+				skb_set_hash(skb, hash, PKT_HASH_TYPE_L4);
+				break;
+
+			case XEN_NETIF_HASH_TYPE_IPV4:
+			case XEN_NETIF_HASH_TYPE_IPV6:
+				skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
+				break;
+
+			default:
+				break;
+			}
+		}
+
 		XENVIF_TX_CB(skb)->pending_idx = pending_idx;
 
 		__skb_put(skb, data_len);
@@ -1416,6 +1438,9 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 		if (extras[XEN_NETIF_EXTRA_TYPE_GSO - 1].type)
 			queue->pending_tx_info[pending_idx].extra_count++;
 
+		if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type)
+			queue->pending_tx_info[pending_idx].extra_count++;
+
 		queue->pending_cons++;
 
 		gop = xenvif_get_requests(queue, skb, txfrags, gop,
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 929a6e7..2fa8a16 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -335,6 +335,14 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* We support hash values. */
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-hash", "%d", 1);
+		if (err) {
+			message = "writing feature-hash";
+			goto abort_transaction;
+		}
+
 		err = xenbus_transaction_end(xbt, 0);
 	} while (err == -EAGAIN);
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 5/8] skbuff: store hash type in socket buffer...
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (8 preceding siblings ...)
  2015-10-21 10:36 ` [PATCH net-next 5/8] skbuff: store hash type in socket buffer Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-21 10:36 ` [PATCH net-next 6/8] xen-netback: pass an L4 or L3 skb hash value to the frontend Paul Durrant
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel
  Cc: Paul Durrant, David S. Miller, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek

...rather than a boolean merely indicating a canonical L4 hash.

skb_set_hash() takes a hash type (from enum pkt_hash_types) as an
argument, but that information is then lost: only a single bit in the
skb records whether the hash type was PKT_HASH_TYPE_L4 or not.

By using two bits it's possible to store the complete hash type
information for use by drivers, such as xen-netback when forwarding
network packets to VM frontend drivers.
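
For reference, the full set of values that now needs to be stored is
(from enum pkt_hash_types in include/linux/skbuff.h):

  enum pkt_hash_types {
          PKT_HASH_TYPE_NONE,     /* Undefined type */
          PKT_HASH_TYPE_L2,       /* Input: src_MAC, dest_MAC */
          PKT_HASH_TYPE_L3,       /* Input: src_IP, dst_IP */
          PKT_HASH_TYPE_L4,       /* Input: src_IP, dst_IP, src_port, dst_port */
  };

Four values fit exactly in the two-bit hash_type field, so the change
still fits within the existing bit holes in struct sk_buff.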

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
---
 drivers/net/bonding/bond_main.c |  2 +-
 include/linux/skbuff.h          | 53 ++++++++++++++++++++++++++++-------------
 include/net/flow_dissector.h    |  5 ++++
 include/net/sock.h              |  2 +-
 include/trace/events/net.h      |  2 +-
 net/core/flow_dissector.c       | 27 ++++++++++++++++-----
 6 files changed, 65 insertions(+), 26 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d0f23cd..abf9c3f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3137,7 +3137,7 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 	u32 hash;
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP34 &&
-	    skb->l4_hash)
+	    skb_has_l4_hash(skb))
 		return skb->hash;
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER2 ||
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4398411..30e1e60 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -506,8 +506,7 @@ static inline u32 skb_mstamp_us_delta(const struct skb_mstamp *t1,
  *	@xmit_more: More SKBs are pending for this queue
  *	@ndisc_nodetype: router type (from link layer)
  *	@ooo_okay: allow the mapping of a socket to a queue to be changed
- *	@l4_hash: indicate hash is a canonical 4-tuple hash over transport
- *		ports.
+ *	@hash_type: indicates type of hash (see enum pkt_hash_types below)
  *	@sw_hash: indicates hash was computed in software stack
  *	@wifi_acked_valid: wifi_acked was set
  *	@wifi_acked: whether frame was acked on wifi or not
@@ -612,10 +611,10 @@ struct sk_buff {
 	__u8			nf_trace:1;
 	__u8			ip_summed:2;
 	__u8			ooo_okay:1;
-	__u8			l4_hash:1;
 	__u8			sw_hash:1;
 	__u8			wifi_acked_valid:1;
 	__u8			wifi_acked:1;
+	/* 1 bit hole */
 
 	__u8			no_fcs:1;
 	/* Indicates the inner headers are valid in the skbuff. */
@@ -632,7 +631,8 @@ struct sk_buff {
 	__u8			ipvs_property:1;
 	__u8			inner_protocol_type:1;
 	__u8			remcsum_offload:1;
-	/* 3 or 5 bit hole */
+	__u8			hash_type:2;
+	/* 1 or 3 bit hole */
 
 #ifdef CONFIG_NET_SCHED
 	__u16			tc_index;	/* traffic control index */
@@ -941,19 +941,35 @@ static inline void skb_clear_hash(struct sk_buff *skb)
 {
 	skb->hash = 0;
 	skb->sw_hash = 0;
-	skb->l4_hash = 0;
+	skb->hash_type = 0;
+}
+
+static inline enum pkt_hash_types skb_hash_type(struct sk_buff *skb)
+{
+	return skb->hash_type;
+}
+
+static inline bool skb_has_l4_hash(struct sk_buff *skb)
+{
+	return skb_hash_type(skb) == PKT_HASH_TYPE_L4;
+}
+
+static inline bool skb_has_sw_hash(struct sk_buff *skb)
+{
+	return !!skb->sw_hash;
 }
 
 static inline void skb_clear_hash_if_not_l4(struct sk_buff *skb)
 {
-	if (!skb->l4_hash)
+	if (!skb_has_l4_hash(skb))
 		skb_clear_hash(skb);
 }
 
 static inline void
-__skb_set_hash(struct sk_buff *skb, __u32 hash, bool is_sw, bool is_l4)
+__skb_set_hash(struct sk_buff *skb, __u32 hash, bool is_sw,
+	       enum pkt_hash_types type)
 {
-	skb->l4_hash = is_l4;
+	skb->hash_type = type;
 	skb->sw_hash = is_sw;
 	skb->hash = hash;
 }
@@ -962,13 +978,13 @@ static inline void
 skb_set_hash(struct sk_buff *skb, __u32 hash, enum pkt_hash_types type)
 {
 	/* Used by drivers to set hash from HW */
-	__skb_set_hash(skb, hash, false, type == PKT_HASH_TYPE_L4);
+	__skb_set_hash(skb, hash, false, type);
 }
 
 static inline void
-__skb_set_sw_hash(struct sk_buff *skb, __u32 hash, bool is_l4)
+__skb_set_sw_hash(struct sk_buff *skb, __u32 hash, enum pkt_hash_types type)
 {
-	__skb_set_hash(skb, hash, true, is_l4);
+	__skb_set_hash(skb, hash, true, type);
 }
 
 void __skb_get_hash(struct sk_buff *skb);
@@ -1021,9 +1037,10 @@ static inline bool skb_flow_dissect_flow_keys_buf(struct flow_keys *flow,
 				  data, proto, nhoff, hlen, flags);
 }
 
+
 static inline __u32 skb_get_hash(struct sk_buff *skb)
 {
-	if (!skb->l4_hash && !skb->sw_hash)
+	if (!skb_has_l4_hash(skb) && !skb_has_sw_hash(skb))
 		__skb_get_hash(skb);
 
 	return skb->hash;
@@ -1033,11 +1050,12 @@ __u32 __skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6);
 
 static inline __u32 skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6)
 {
-	if (!skb->l4_hash && !skb->sw_hash) {
+	if (!skb_has_l4_hash(skb) && !skb_has_sw_hash(skb)) {
 		struct flow_keys keys;
 		__u32 hash = __get_hash_from_flowi6(fl6, &keys);
 
-		__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
+		__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys) ?
+				  PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 	}
 
 	return skb->hash;
@@ -1047,11 +1065,12 @@ __u32 __skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl);
 
 static inline __u32 skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl4)
 {
-	if (!skb->l4_hash && !skb->sw_hash) {
+	if (!skb_has_l4_hash(skb) && !skb_has_sw_hash(skb)) {
 		struct flow_keys keys;
 		__u32 hash = __get_hash_from_flowi4(fl4, &keys);
 
-		__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
+		__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys) ?
+				  PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 	}
 
 	return skb->hash;
@@ -1068,7 +1087,7 @@ static inline void skb_copy_hash(struct sk_buff *to, const struct sk_buff *from)
 {
 	to->hash = from->hash;
 	to->sw_hash = from->sw_hash;
-	to->l4_hash = from->l4_hash;
+	to->hash_type = from->hash_type;
 };
 
 static inline void skb_sender_cpu_clear(struct sk_buff *skb)
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 8c8548c..418b8c5 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -182,6 +182,11 @@ static inline bool flow_keys_have_l4(struct flow_keys *keys)
 	return (keys->ports.ports || keys->tags.flow_label);
 }
 
+static inline bool flow_keys_have_l3(struct flow_keys *keys)
+{
+	return !!keys->control.addr_type;
+}
+
 u32 flow_hash_from_keys(struct flow_keys *keys);
 
 #endif
diff --git a/include/net/sock.h b/include/net/sock.h
index 64a7545..b9f68db 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1938,7 +1938,7 @@ static inline void sock_poll_wait(struct file *filp,
 static inline void skb_set_hash_from_sk(struct sk_buff *skb, struct sock *sk)
 {
 	if (sk->sk_txhash) {
-		skb->l4_hash = 1;
+		skb->hash_type = PKT_HASH_TYPE_L4;
 		skb->hash = sk->sk_txhash;
 	}
 }
diff --git a/include/trace/events/net.h b/include/trace/events/net.h
index 49cc7c3..25e7979 100644
--- a/include/trace/events/net.h
+++ b/include/trace/events/net.h
@@ -180,7 +180,7 @@ DECLARE_EVENT_CLASS(net_dev_rx_verbose_template,
 		__entry->protocol = ntohs(skb->protocol);
 		__entry->ip_summed = skb->ip_summed;
 		__entry->hash = skb->hash;
-		__entry->l4_hash = skb->l4_hash;
+		__entry->l4_hash = skb->hash_type == PKT_HASH_TYPE_L4;
 		__entry->len = skb->len;
 		__entry->data_len = skb->data_len;
 		__entry->truesize = skb->truesize;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index d79699c..956208b 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -658,17 +658,30 @@ EXPORT_SYMBOL(make_flow_keys_digest);
  *
  * This function calculates a flow hash based on src/dst addresses
  * and src/dst port numbers.  Sets hash in skb to non-zero hash value
- * on success, zero indicates no valid hash.  Also, sets l4_hash in skb
- * if hash is a canonical 4-tuple hash over transport ports.
+ * on success, zero indicates no valid hash in which case the hash type
+ * is set to NONE. If the hash is a canonical 4-tuple hash over transport
+ * ports then the type is set to L4. If the hash covers addresses but
+ * not transport ports then it is set to L3, otherwise L2 is assumed.
  */
 void __skb_get_hash(struct sk_buff *skb)
 {
 	struct flow_keys keys;
+	u32 hash;
+	enum pkt_hash_types type;
 
 	__flow_hash_secret_init();
 
-	__skb_set_sw_hash(skb, ___skb_get_hash(skb, &keys, hashrnd),
-			  flow_keys_have_l4(&keys));
+	hash = ___skb_get_hash(skb, &keys, hashrnd);
+	if (hash == 0)
+		type = PKT_HASH_TYPE_NONE;
+	else if (flow_keys_have_l4(&keys))
+		type = PKT_HASH_TYPE_L4;
+	else if (flow_keys_have_l3(&keys))
+		type = PKT_HASH_TYPE_L3;
+	else
+		type = PKT_HASH_TYPE_L2;
+
+	__skb_set_sw_hash(skb, hash, type);
 }
 EXPORT_SYMBOL(__skb_get_hash);
 
@@ -698,7 +711,8 @@ __u32 __skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6)
 	keys.basic.ip_proto = fl6->flowi6_proto;
 
 	__skb_set_sw_hash(skb, flow_hash_from_keys(&keys),
-			  flow_keys_have_l4(&keys));
+			  flow_keys_have_l4(&keys) ?
+			  PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 
 	return skb->hash;
 }
@@ -719,7 +733,8 @@ __u32 __skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl4)
 	keys.basic.ip_proto = fl4->flowi4_proto;
 
 	__skb_set_sw_hash(skb, flow_hash_from_keys(&keys),
-			  flow_keys_have_l4(&keys));
+			  flow_keys_have_l4(&keys) ?
+			  PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
 
 	return skb->hash;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 6/8] xen-netback: pass an L4 or L3 skb hash value to the frontend
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (9 preceding siblings ...)
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` Paul Durrant
                   ` (8 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

If the frontend indicates it's capable (see netif.h for details) and an
skb has an L4 or L3 hash value, then pass that value to the frontend in
a xen_netif_extra_info segment.
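
A receiving frontend would consume the segment roughly as follows,
mirroring the type mapping used on the backend tx side in patch #4 (a
sketch only; the function name is hypothetical):

  /* Apply a hash received in a XEN_NETIF_EXTRA_TYPE_HASH segment. */
  static void xennet_set_skb_hash(struct sk_buff *skb,
                                  const struct xen_netif_extra_info *extra)
  {
          u32 hash = *(const u32 *)extra->u.hash.value;

          switch (extra->u.hash.type) {
          case XEN_NETIF_HASH_TYPE_TCPV4:
          case XEN_NETIF_HASH_TYPE_TCPV6:
                  skb_set_hash(skb, hash, PKT_HASH_TYPE_L4);
                  break;
          case XEN_NETIF_HASH_TYPE_IPV4:
          case XEN_NETIF_HASH_TYPE_IPV6:
                  skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
                  break;
          default:
                  break;
          }
  }

Note that xenvif_rx_ring_slots_needed() now reserves one extra slot per
packet when hash_extra is set, since the hash segment occupies a ring
slot of its own.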

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h  |  1 +
 drivers/net/xen-netback/netback.c | 85 +++++++++++++++++++++++++++++++--------
 drivers/net/xen-netback/xenbus.c  |  5 +++
 3 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index ce40bd7..1bce5a5 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -229,6 +229,7 @@ struct xenvif {
 	u8 ip_csum:1;
 	u8 ipv6_csum:1;
 	u8 multicast_control:1;
+	u8 hash_extra:1;
 
 	/* Is this interface disabled? True when backend discovers
 	 * frontend is rogue.
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 3799b5a..68994f9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -152,10 +152,17 @@ static inline pending_ring_idx_t pending_index(unsigned i)
 
 static int xenvif_rx_ring_slots_needed(struct xenvif *vif)
 {
-	if (vif->gso_mask)
-		return DIV_ROUND_UP(vif->dev->gso_max_size, PAGE_SIZE) + 1;
+	int needed;
+
+	if (vif->gso_mask || vif->gso_prefix_mask)
+		needed = DIV_ROUND_UP(vif->dev->gso_max_size, PAGE_SIZE) + 1;
 	else
-		return DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
+		needed = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
+
+	if (vif->hash_extra)
+		needed++;
+
+	return needed;
 }
 
 static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
@@ -304,12 +311,23 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 		BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);
 
 		if (npo->copy_off == MAX_BUFFER_OFFSET) {
-			/* Leave a gap for the GSO descriptor. */
-			if (*head && ((1 << gso_type) & vif->gso_mask))
-				queue->rx.req_cons++;
+			if (*head) {
+				*head = 0;
+
+				/* Leave a gap for the GSO descriptor. */
+				if ((1 << gso_type) & vif->gso_mask)
+					queue->rx.req_cons++;
+
+				/* Leave a gap for the hash extra
+				 * segment.
+				 */
+				if (vif->hash_extra &&
+				    (skb->protocol == htons(ETH_P_IP) ||
+				     skb->protocol == htons(ETH_P_IPV6)))
+					queue->rx.req_cons++;
+			}
 
 			meta = get_next_rx_buffer(queue, npo);
-			*head = 0;
 		}
 
 		bytes = PAGE_SIZE - offset;
@@ -521,6 +539,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 	while ((skb = __skb_dequeue(&rxq)) != NULL) {
 		struct xenvif *vif = queue->vif;
 		int gso_type = XEN_NETIF_GSO_TYPE_NONE;
+		struct xen_netif_extra_info *extra = NULL;
 
 		if (skb_is_gso(skb)) {
 			if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
@@ -569,20 +588,54 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 					flags);
 
 		if ((1 << gso_type) & vif->gso_mask) {
-			struct xen_netif_extra_info *gso =
-				(struct xen_netif_extra_info *)
+			resp->flags |= XEN_NETRXF_extra_info;
+
+			extra = (struct xen_netif_extra_info *)
 				RING_GET_RESPONSE(&queue->rx,
 						  queue->rx.rsp_prod_pvt++);
 
-			resp->flags |= XEN_NETRXF_extra_info;
+			extra->u.gso.type = gso_type;
+			extra->u.gso.size = skb_shinfo(skb)->gso_size;
+			extra->u.gso.pad = 0;
+			extra->u.gso.features = 0;
+
+			extra->type = XEN_NETIF_EXTRA_TYPE_GSO;
+			extra->flags = 0;
+		}
+
+		if (vif->hash_extra &&
+		    (skb->protocol == htons(ETH_P_IP) ||
+		     skb->protocol == htons(ETH_P_IPV6))) {
+			if (resp->flags & XEN_NETRXF_extra_info)
+				extra->flags |= XEN_NETIF_EXTRA_FLAG_MORE;
+			else
+				resp->flags |= XEN_NETRXF_extra_info;
 
-			gso->u.gso.type = gso_type;
-			gso->u.gso.size = skb_shinfo(skb)->gso_size;
-			gso->u.gso.pad = 0;
-			gso->u.gso.features = 0;
+			extra = (struct xen_netif_extra_info *)
+				RING_GET_RESPONSE(&queue->rx,
+						  queue->rx.rsp_prod_pvt++);
+
+			if (skb_hash_type(skb) == PKT_HASH_TYPE_L4) {
+				extra->u.hash.type =
+					skb->protocol == htons(ETH_P_IP) ?
+					XEN_NETIF_HASH_TYPE_TCPV4 :
+					XEN_NETIF_HASH_TYPE_TCPV6;
+				*(uint32_t *)extra->u.hash.value =
+					skb_get_hash_raw(skb);
+			} else if (skb_hash_type(skb) == PKT_HASH_TYPE_L3) {
+				extra->u.hash.type =
+					skb->protocol == htons(ETH_P_IP) ?
+					XEN_NETIF_HASH_TYPE_IPV4 :
+					XEN_NETIF_HASH_TYPE_IPV6;
+				*(uint32_t *)extra->u.hash.value =
+					skb_get_hash_raw(skb);
+			} else {
+				extra->u.hash.type = XEN_NETIF_HASH_TYPE_NONE;
+				*(uint32_t *)extra->u.hash.value = 0;
+			}
 
-			gso->type = XEN_NETIF_EXTRA_TYPE_GSO;
-			gso->flags = 0;
+			extra->type = XEN_NETIF_EXTRA_TYPE_HASH;
+			extra->flags = 0;
 		}
 
 		xenvif_add_frag_responses(queue, status,
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 2fa8a16..a31bcee 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -1037,6 +1037,11 @@ static int read_xenbus_vif_flags(struct backend_info *be)
 		val = 0;
 	vif->multicast_control = !!val;
 
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-hash",
+			 "%d", &val) < 0)
+		val = 0;
+	vif->hash_extra = !!val;
+
 	return 0;
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 7/8] xen-netback: add support for a multi-queue hash mapping table
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (12 preceding siblings ...)
  2015-10-21 10:36 ` [PATCH net-next 7/8] xen-netback: add support for a multi-queue hash mapping table Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` [PATCH net-next 8/8] xen-netback: add support for toeplitz hashing Paul Durrant
                   ` (5 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

Advertise the capability to handle a hash mapping specified by the
frontend (see netif.h for details).

Add an ndo_select_queue() entry point so that, if the frontend does
specify a hash mapping, the skb hash is extracted and mapped to a queue.
If no mapping is specified then the fallback queue selection function is
called, so there is no change in behaviour.
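
As an illustrative sketch only (the node name comes from the code below;
the values are hypothetical), a frontend driving a 4-queue backend might
write a 16-entry mapping as a comma-separated list:

  multi-queue-hash-mapping = "0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3"

Each skb is then steered to table[hash % 16]; any entry naming a queue
beyond the number of active queues is clamped to queue 0 when the table
is read.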

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h    |   7 ++
 drivers/net/xen-netback/interface.c |  14 ++++
 drivers/net/xen-netback/xenbus.c    | 154 ++++++++++++++++++++++++++++++------
 3 files changed, 152 insertions(+), 23 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 1bce5a5..23f2275 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -212,6 +212,8 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MCAST_MAX 64
 
+#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
+
 struct xenvif {
 	/* Unique identifier for this interface. */
 	domid_t          domid;
@@ -244,7 +246,12 @@ struct xenvif {
 	unsigned int num_queues; /* active queues, resource allocated */
 	unsigned int stalled_queues;
 
+	struct {
+		unsigned int table[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
+		unsigned int length;
+	} hash_mapping;
 	struct xenbus_watch credit_watch;
+	struct xenbus_watch hash_mapping_watch;
 
 	spinlock_t lock;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index e7bd63e..0c7da7b 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -142,6 +142,19 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
 	netif_tx_wake_queue(netdev_get_tx_queue(dev, id));
 }
 
+static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
+			       void *accel_priv,
+			       select_queue_fallback_t fallback)
+{
+	struct xenvif *vif = netdev_priv(dev);
+
+	if (vif->hash_mapping.length == 0)
+		return fallback(dev, skb) % dev->real_num_tx_queues;
+
+	return vif->hash_mapping.table[skb_get_hash_raw(skb) %
+				       vif->hash_mapping.length];
+}
+
 static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct xenvif *vif = netdev_priv(dev);
@@ -386,6 +399,7 @@ static const struct ethtool_ops xenvif_ethtool_ops = {
 };
 
 static const struct net_device_ops xenvif_netdev_ops = {
+	.ndo_select_queue = xenvif_select_queue,
 	.ndo_start_xmit	= xenvif_start_xmit,
 	.ndo_get_stats	= xenvif_get_stats,
 	.ndo_open	= xenvif_open,
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index a31bcee..f5ed945 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -367,6 +367,13 @@ static int netback_probe(struct xenbus_device *dev,
 	if (err)
 		pr_debug("Error writing multi-queue-max-queues\n");
 
+	/* Multi-queue mapping support: This is an optional feature. */
+	err = xenbus_printf(XBT_NIL, dev->nodename,
+			    "multi-queue-max-hash-mapping-length", "%u",
+			    XEN_NETBK_MAX_HASH_MAPPING_SIZE);
+	if (err)
+		pr_debug("Error writing multi-queue-max-hash-mapping-length\n");
+
 	script = xenbus_read(XBT_NIL, dev->nodename, "script", NULL);
 	if (IS_ERR(script)) {
 		err = PTR_ERR(script);
@@ -691,38 +698,139 @@ static void xen_net_rate_changed(struct xenbus_watch *watch,
 	}
 }
 
-static int xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif)
+static void xen_net_read_multi_queue_hash_mapping(struct xenvif *vif)
+{
+	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
+	char *str, *token;
+	unsigned int *table;
+	unsigned int n, i;
+
+	str = xenbus_read(XBT_NIL, dev->otherend,
+			  "multi-queue-hash-mapping", NULL);
+	if (IS_ERR(str))
+		goto fail1;
+
+	table = kcalloc(ARRAY_SIZE(vif->hash_mapping.table),
+			sizeof(unsigned int),
+			GFP_KERNEL);
+	if (!table) {
+		pr_err("%s: failed to allocate mapping table\n",
+		       dev->nodename);
+		goto fail2;
+	}
+
+	n = 0;
+	while ((token = strsep(&str, ",")) != NULL) {
+		int rc;
+
+		if (n >= ARRAY_SIZE(vif->hash_mapping.table)) {
+			pr_err("%s: mapping table too big\n",
+			       dev->nodename);
+			goto fail3;
+		}
+
+		rc = kstrtouint(token, 0, &table[n]);
+		if (rc < 0) {
+			pr_err("%s: invalid mapping table value (%s at index %u)\n",
+			       dev->nodename, token, n);
+			goto fail3;
+		}
+
+		n++;
+	}
+
+	if (n == 0) {
+		pr_err("%s: invalid mapping table\n", dev->nodename);
+		goto fail3;
+	}
+
+	vif->hash_mapping.length = n;
+
+	for (i = 0; i < n; i++)
+		vif->hash_mapping.table[i] =
+			(table[i] < vif->num_queues) ? table[i] : 0;
+
+	kfree(table);
+	kfree(str);
+	return;
+
+fail3:
+	kfree(table);
+fail2:
+	kfree(str);
+fail1:
+	vif->hash_mapping.length = 0;
+}
+
+static void xen_hash_mapping_changed(struct xenbus_watch *watch,
+				     const char **vec, unsigned int len)
+{
+	struct xenvif *vif = container_of(watch, struct xenvif,
+					  hash_mapping_watch);
+
+	xen_net_read_multi_queue_hash_mapping(vif);
+}
+
+static int xenvif_register_watch(const char *prefix, const char *name,
+				 void (*callback)(struct xenbus_watch *,
+						  const char **vec,
+						  unsigned int len),
+				 struct xenbus_watch *watch)
 {
-	int err = 0;
+	unsigned int len;
 	char *node;
-	unsigned maxlen = strlen(dev->nodename) + sizeof("/rate");
+	int err;
 
-	if (vif->credit_watch.node)
+	if (watch->node)
 		return -EADDRINUSE;
 
-	node = kmalloc(maxlen, GFP_KERNEL);
+	len = strlen(prefix) + 1 + strlen(name) + 1;
+
+	node = kmalloc(len, GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
-	snprintf(node, maxlen, "%s/rate", dev->nodename);
-	vif->credit_watch.node = node;
-	vif->credit_watch.callback = xen_net_rate_changed;
-	err = register_xenbus_watch(&vif->credit_watch);
+
+	snprintf(node, len, "%s/%s", prefix, name);
+	watch->node = node;
+	watch->callback = callback;
+	err = register_xenbus_watch(watch);
 	if (err) {
-		pr_err("Failed to set watcher %s\n", vif->credit_watch.node);
+		pr_err("Failed to set watch %s\n", node);
 		kfree(node);
-		vif->credit_watch.node = NULL;
-		vif->credit_watch.callback = NULL;
+		watch->node = NULL;
+		watch->callback = NULL;
 	}
 	return err;
 }
 
+static void xenvif_unregister_watch(struct xenbus_watch *watch)
+{
+	if (!watch->node)
+		return;
+
+	unregister_xenbus_watch(watch);
+	kfree(watch->node);
+
+	watch->node = NULL;
+	watch->callback = NULL;
+}
+
+static void xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif)
+{
+	xenvif_register_watch(dev->nodename, "rate",
+			      xen_net_rate_changed,
+			      &vif->credit_watch);
+
+	xenvif_register_watch(dev->otherend,
+			      "multi-queue-hash-mapping",
+			      xen_hash_mapping_changed,
+			      &vif->hash_mapping_watch);
+}
+
 static void xen_unregister_watchers(struct xenvif *vif)
 {
-	if (vif->credit_watch.node) {
-		unregister_xenbus_watch(&vif->credit_watch);
-		kfree(vif->credit_watch.node);
-		vif->credit_watch.node = NULL;
-	}
+	xenvif_unregister_watch(&vif->hash_mapping_watch);
+	xenvif_unregister_watch(&vif->credit_watch);
 }
 
 static void unregister_hotplug_status_watch(struct backend_info *be)
@@ -782,6 +890,12 @@ static void connect(struct backend_info *be)
 		return;
 	}
 
+	/* Use the number of queues requested by the frontend */
+	be->vif->queues = vzalloc(requested_num_queues *
+				  sizeof(struct xenvif_queue));
+	be->vif->num_queues = requested_num_queues;
+	be->vif->stalled_queues = requested_num_queues;
+
 	err = xen_net_read_mac(dev, be->vif->fe_dev_addr);
 	if (err) {
 		xenbus_dev_fatal(dev, err, "parsing %s/mac", dev->nodename);
@@ -793,12 +907,6 @@ static void connect(struct backend_info *be)
 	xen_register_watchers(dev, be->vif);
 	read_xenbus_vif_flags(be);
 
-	/* Use the number of queues requested by the frontend */
-	be->vif->queues = vzalloc(requested_num_queues *
-				  sizeof(struct xenvif_queue));
-	be->vif->num_queues = requested_num_queues;
-	be->vif->stalled_queues = requested_num_queues;
-
 	for (queue_index = 0; queue_index < requested_num_queues; ++queue_index) {
 		queue = &be->vif->queues[queue_index];
 		queue->vif = be->vif;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH net-next 8/8] xen-netback: add support for toeplitz hashing
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (13 preceding siblings ...)
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  2015-10-21 10:36 ` Paul Durrant
                   ` (4 subsequent siblings)
  19 siblings, 2 replies; 39+ messages in thread
From: Paul Durrant @ 2015-10-21 10:36 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, Ian Campbell, Wei Liu

This patch adds all the necessary infrastructure to allow a frontend to
specify toeplitz hashing of network packets on its receive side. (See
netif.h for details of the xenbus protocol).
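
For illustration only (node names as used by the code below; the key
bytes are hypothetical and the authoritative layout is in netif.h), a
frontend selecting toeplitz hashing of TCP/IPv4 flows might populate its
xenbus area as follows, giving the key as comma-separated byte values:

  multi-queue-hash = "toeplitz"
  multi-queue-hash-params-toeplitz/types = "ipv4+tcp"
  multi-queue-hash-params-toeplitz/key = "0x6d,0x5a,0x56,0xda,..."

The backend watches these nodes and re-reads the parameters whenever the
frontend changes them.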

The toeplitz hash algorithm itself was based on pseudo-code provided by
Microsoft at:

https://msdn.microsoft.com/en-us/library/windows/hardware/ff570725.aspx
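
For reference, below is a minimal standalone C sketch of the bit-by-bit
algorithm that pseudo-code describes (illustrative only, not the patch's
implementation; it assumes the key is at least dlen + 4 bytes long, e.g.
a 40-byte key covers the 36-byte IPv6+TCP input):

static u32 toeplitz_hash_ref(const u8 *key, const u8 *data,
			     unsigned int dlen)
{
	/* 32-bit window over the key, initially key bits 0..31. */
	u32 window = ((u32)key[0] << 24) | ((u32)key[1] << 16) |
		     ((u32)key[2] << 8) | (u32)key[3];
	u32 hash = 0;
	unsigned int i, bit;

	for (i = 0; i < dlen; i++) {
		u8 byte = data[i];

		for (bit = 0; bit < 8; bit++) {
			/* XOR in the window for each set input bit... */
			if (byte & 0x80)
				hash ^= window;
			byte <<= 1;

			/* ...then slide the window left by one key bit. */
			window = (window << 1) |
				 ((key[i + 4] >> (7 - bit)) & 1);
		}
	}

	return hash;
}

Fed the well-known default key from the linked document, this should
reproduce the verification hashes given for the test vectors there.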

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/common.h    |  32 ++++++
 drivers/net/xen-netback/interface.c | 111 +++++++++++++++++++-
 drivers/net/xen-netback/xenbus.c    | 195 ++++++++++++++++++++++++++++++++++++
 3 files changed, 335 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 23f2275..4ebfad9 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -214,6 +214,31 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
 
+enum xenvif_hash_alg {
+	XEN_NETBK_HASH_UNSPECIFIED,
+	XEN_NETBK_HASH_TOEPLITZ,
+};
+
+#define XEN_NETBK_MAX_TOEPLITZ_KEY_LENGTH 40
+
+struct xenvif_toeplitz_params {
+	union {
+		struct {
+			u8 ipv4_enabled:1;
+			u8 ipv4_tcp_enabled:1;
+			u8 ipv6_enabled:1;
+			u8 ipv6_tcp_enabled:1;
+		};
+		u8 types;
+	};
+
+	u8 key[XEN_NETBK_MAX_TOEPLITZ_KEY_LENGTH];
+};
+
+union xenvif_hash_params {
+	struct xenvif_toeplitz_params toeplitz;
+};
+
 struct xenvif {
 	/* Unique identifier for this interface. */
 	domid_t          domid;
@@ -250,8 +275,15 @@ struct xenvif {
 		unsigned int table[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
 		unsigned int length;
 	} hash_mapping;
+
+	/* Hash */
+	enum xenvif_hash_alg hash_alg;
+	union xenvif_hash_params hash_params;
+
 	struct xenbus_watch credit_watch;
 	struct xenbus_watch hash_mapping_watch;
+	struct xenbus_watch hash_watch;
+	struct xenbus_watch hash_params_watch;
 
 	spinlock_t lock;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 0c7da7b..38eee4f 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -142,17 +142,122 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
 	netif_tx_wake_queue(netdev_get_tx_queue(dev, id));
 }
 
+static u32 toeplitz_hash(const u8 *k, unsigned int klen,
+			 const u8 *d, unsigned int dlen)
+{
+	unsigned int di, ki;
+	u64 prefix = 0;
+	u64 hash = 0;
+
+	for (ki = 0; ki < 8; ki++) {
+		prefix |= ki < klen ? k[ki] : 0;
+		prefix <<= 8;
+	}
+
+	for (di = 0; di < dlen; di++) {
+		u8 byte = d[di];
+		unsigned int bit;
+
+		prefix |= ki < klen ? k[ki] : 0;
+		ki++;
+
+		for (bit = 0; bit < 8; bit++) {
+			if (byte & 0x80)
+				hash ^= prefix;
+			byte <<= 1;
+			prefix <<= 1;
+		}
+	}
+
+	return hash >> 32;
+}
+
+static void xenvif_set_toeplitz_hash(struct xenvif *vif, struct sk_buff *skb)
+{
+	struct flow_keys flow;
+	u32 hash = 0;
+	enum pkt_hash_types type = PKT_HASH_TYPE_NONE;
+	const u8 *key = vif->hash_params.toeplitz.key;
+	const unsigned int len = ARRAY_SIZE(vif->hash_params.toeplitz.key);
+
+	memset(&flow, 0, sizeof(flow));
+	if (!skb_flow_dissect_flow_keys(skb, &flow, 0))
+		goto done;
+
+	if (flow.basic.n_proto == htons(ETH_P_IP)) {
+		if (vif->hash_params.toeplitz.ipv4_tcp_enabled &&
+		    flow.basic.ip_proto == IPPROTO_TCP) {
+			u8 data[12];
+
+			memcpy(&data[0], &flow.addrs.v4addrs.src, 4);
+			memcpy(&data[4], &flow.addrs.v4addrs.dst, 4);
+			memcpy(&data[8], &flow.ports.src, 2);
+			memcpy(&data[10], &flow.ports.dst, 2);
+
+			hash = toeplitz_hash(key, len,
+					     data, sizeof(data));
+			type = PKT_HASH_TYPE_L4;
+		} else if (vif->hash_params.toeplitz.ipv4_enabled) {
+			u8 data[8];
+
+			memcpy(&data[0], &flow.addrs.v4addrs.src, 4);
+			memcpy(&data[4], &flow.addrs.v4addrs.dst, 4);
+
+			hash = toeplitz_hash(key, len,
+					     data, sizeof(data));
+			type = PKT_HASH_TYPE_L3;
+		}
+	} else if (flow.basic.n_proto == htons(ETH_P_IPV6)) {
+		if (vif->hash_params.toeplitz.ipv6_tcp_enabled &&
+		    flow.basic.ip_proto == IPPROTO_TCP) {
+			u8 data[36];
+
+			memcpy(&data[0], &flow.addrs.v6addrs.src, 16);
+			memcpy(&data[16], &flow.addrs.v6addrs.dst, 16);
+			memcpy(&data[32], &flow.ports.src, 2);
+			memcpy(&data[34], &flow.ports.dst, 2);
+
+			hash = toeplitz_hash(key, len,
+					     data, sizeof(data));
+			type = PKT_HASH_TYPE_L4;
+		} else if (vif->hash_params.toeplitz.ipv6_enabled) {
+			u8 data[32];
+
+			memcpy(&data[0], &flow.addrs.v6addrs.src, 16);
+			memcpy(&data[16], &flow.addrs.v6addrs.dst, 16);
+
+			hash = toeplitz_hash(key, len,
+					     data, sizeof(data));
+			type = PKT_HASH_TYPE_L3;
+		}
+	}
+
+done:
+	skb_set_hash(skb, hash, type);
+}
+
 static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
 			       void *accel_priv,
 			       select_queue_fallback_t fallback)
 {
 	struct xenvif *vif = netdev_priv(dev);
+	u32 hash;
+
+	/* If a hash algorithm has been specified re-calculate accordingly */
+	switch (vif->hash_alg) {
+	case XEN_NETBK_HASH_TOEPLITZ:
+		xenvif_set_toeplitz_hash(vif, skb);
+		hash = skb_get_hash_raw(skb);
+		break;
+	default:
+		hash = fallback(dev, skb);
+		break;
+	}
 
 	if (vif->hash_mapping.length == 0)
-		return fallback(dev, skb) % dev->real_num_tx_queues;
+		return hash % dev->real_num_tx_queues;
 
-	return vif->hash_mapping.table[skb_get_hash_raw(skb) %
-				       vif->hash_mapping.length];
+	return vif->hash_mapping.table[hash % vif->hash_mapping.length];
 }
 
 static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index f5ed945..9d12bd8 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -246,6 +246,34 @@ static int netback_remove(struct xenbus_device *dev)
 	return 0;
 }
 
+static int netback_set_toeplitz_caps(struct xenbus_device *dev)
+{
+	unsigned int len = strlen(dev->nodename) +
+		sizeof("/multi-queue-hash-caps-toeplitz");
+	char *node;
+	int err;
+
+	node = kmalloc(len, GFP_KERNEL);
+	if (!node)
+		return -ENOMEM;
+
+	snprintf(node, len, "%s/multi-queue-hash-caps-toeplitz",
+		 dev->nodename);
+
+	err = xenbus_printf(XBT_NIL, node,
+			    "types", "ipv4 ipv4+tcp ipv6 ipv6+tcp");
+	if (err)
+		pr_debug("Error writing types\n");
+
+	err = xenbus_printf(XBT_NIL, node,
+			    "max-key-length", "%u",
+			    XEN_NETBK_MAX_TOEPLITZ_KEY_LENGTH);
+	if (err)
+		pr_debug("Error writing max-key-length\n");
+
+	kfree(node);
+	return 0;
+}
 
 /**
  * Entry point to this code when a new device is created.  Allocate the basic
@@ -374,6 +402,17 @@ static int netback_probe(struct xenbus_device *dev,
 	if (err)
 		pr_debug("Error writing multi-queue-max-hash-mapping-length\n");
 
+	/* Selectable multi-queue hash algorithms: This is an optional
+	 * feature.
+	 */
+	err = netback_set_toeplitz_caps(dev);
+	if (!err) {
+		err = xenbus_printf(XBT_NIL, dev->nodename,
+				    "multi-queue-hash-list", "toeplitz");
+		if (err)
+			pr_debug("Error writing multi-queue-hash-list\n");
+	}
+
 	script = xenbus_read(XBT_NIL, dev->nodename, "script", NULL);
 	if (IS_ERR(script)) {
 		err = PTR_ERR(script);
@@ -815,6 +854,153 @@ static void xenvif_unregister_watch(struct xenbus_watch *watch)
 	watch->callback = NULL;
 }
 
+static void xen_net_read_toeplitz_types(struct xenvif *vif,
+					const char *node)
+{
+	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
+	char *str, *token;
+
+	vif->hash_params.toeplitz.types = 0;
+
+	str = xenbus_read(XBT_NIL, node, "types", NULL);
+	if (IS_ERR(str))
+		return;
+
+	while ((token = strsep(&str, " ")) != NULL) {
+		if (strcmp(token, "ipv4") == 0) {
+			vif->hash_params.toeplitz.ipv4_enabled = 1;
+		} else if (strcmp(token, "ipv4+tcp") == 0) {
+			vif->hash_params.toeplitz.ipv4_tcp_enabled = 1;
+		} else if (strcmp(token, "ipv6") == 0) {
+			vif->hash_params.toeplitz.ipv6_enabled = 1;
+		} else if (strcmp(token, "ipv6+tcp") == 0) {
+			vif->hash_params.toeplitz.ipv6_tcp_enabled = 1;
+		} else {
+			pr_err("%s: unknown hash type (%s)\n",
+			       dev->nodename, token);
+			goto fail1;
+		}
+	}
+
+	kfree(str);
+	return;
+
+fail1:
+	vif->hash_params.toeplitz.types = 0;
+}
+
+static void xen_net_read_toeplitz_key(struct xenvif *vif,
+				      const char *node)
+{
+	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
+	char *str, *token;
+	u8 key[40];
+	unsigned int n, i;
+
+	str = xenbus_read(XBT_NIL, node, "key", NULL);
+	if (IS_ERR(str))
+		goto fail1;
+
+	memset(key, 0, sizeof(key));
+
+	n = 0;
+	while ((token = strsep(&str, ",")) != NULL) {
+		int rc;
+
+		if (n >= ARRAY_SIZE(vif->hash_params.toeplitz.key)) {
+			pr_err("%s: key too big\n",
+			       dev->nodename);
+			goto fail2;
+		}
+
+		rc = kstrtou8(token, 0, &key[n]);
+		if (rc < 0) {
+			pr_err("%s: invalid key value (%s at index %u)\n",
+			       dev->nodename, token, n);
+			goto fail2;
+		}
+
+		n++;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(vif->hash_params.toeplitz.key); i++)
+		vif->hash_params.toeplitz.key[i] = key[i];
+
+	kfree(str);
+	return;
+
+fail2:
+	kfree(str);
+fail1:
+	vif->hash_params.toeplitz.types = 0;
+}
+
+static void xen_net_read_toeplitz_params(struct xenvif *vif)
+{
+	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
+	unsigned int len = strlen(dev->otherend) +
+		sizeof("/multi-queue-hash-params-toeplitz");
+	char *node;
+
+	node = kmalloc(len, GFP_KERNEL);
+	if (!node)
+		return;
+	snprintf(node, len, "%s/multi-queue-hash-params-toeplitz",
+		 dev->otherend);
+
+	xen_net_read_toeplitz_types(vif, node);
+	xen_net_read_toeplitz_key(vif, node);
+
+	kfree(node);
+}
+
+static void xen_hash_params_changed(struct xenbus_watch *watch,
+				    const char **vec, unsigned int len)
+{
+	struct xenvif *vif = container_of(watch, struct xenvif,
+					  hash_params_watch);
+
+	switch (vif->hash_alg) {
+	case XEN_NETBK_HASH_TOEPLITZ:
+		xen_net_read_toeplitz_params(vif);
+		break;
+	default:
+		break;
+	}
+}
+
+static void xen_net_read_hash(struct xenvif *vif)
+{
+	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
+	char *str;
+
+	vif->hash_alg = XEN_NETBK_HASH_UNSPECIFIED;
+	xenvif_unregister_watch(&vif->hash_params_watch);
+
+	str = xenbus_read(XBT_NIL, dev->otherend, "multi-queue-hash", NULL);
+	if (IS_ERR(str))
+		return;
+
+	if (strcmp(str, "toeplitz") == 0) {
+		vif->hash_alg = XEN_NETBK_HASH_TOEPLITZ;
+
+		xenvif_register_watch(dev->otherend,
+				      "multi-queue-hash-params-toeplitz",
+				      xen_hash_params_changed,
+				      &vif->hash_params_watch);
+	}
+
+	kfree(str);
+}
+
+static void xen_hash_changed(struct xenbus_watch *watch,
+			     const char **vec, unsigned int len)
+{
+	struct xenvif *vif = container_of(watch, struct xenvif, hash_watch);
+
+	xen_net_read_hash(vif);
+}
+
 static void xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif)
 {
 	xenvif_register_watch(dev->nodename, "rate",
@@ -825,10 +1011,17 @@ static void xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif)
 			      "multi-queue-hash-mapping",
 			      xen_hash_mapping_changed,
 			      &vif->hash_mapping_watch);
+
+	xenvif_register_watch(dev->otherend,
+			      "multi-queue-hash",
+			      xen_hash_changed,
+			      &vif->hash_watch);
 }
 
 static void xen_unregister_watchers(struct xenvif *vif)
 {
+	xenvif_unregister_watch(&vif->hash_params_watch);
+	xenvif_unregister_watch(&vif->hash_watch);
 	xenvif_unregister_watch(&vif->hash_mapping_watch);
 	xenvif_unregister_watch(&vif->credit_watch);
 }
@@ -874,6 +1067,8 @@ static void connect(struct backend_info *be)
 	unsigned int requested_num_queues;
 	struct xenvif_queue *queue;
 
+	be->vif->hash_alg = XEN_NETBK_HASH_UNSPECIFIED;
+
 	/* Check whether the frontend requested multiple queues
 	 * and read the number requested.
 	 */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 0/8] xen-netback/core: packet hashing
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (15 preceding siblings ...)
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-22 14:15 ` David Miller
  2015-10-22 14:15 ` David Miller
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2015-10-22 14:15 UTC (permalink / raw)
  To: paul.durrant; +Cc: netdev, xen-devel

From: Paul Durrant <paul.durrant@citrix.com>
Date: Wed, 21 Oct 2015 11:36:17 +0100

> This series adds xen-netback support for hash negotiation with a frontend
> driver, and an implementation of toeplitz hashing as the initial negotiable
> algorithm.

I'd definitely like to see some XEN networking experts review this
before I apply it.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 0/8] xen-netback/core: packet hashing
  2015-10-21 10:36 [PATCH net-next 0/8] xen-netback/core: packet hashing Paul Durrant
                   ` (17 preceding siblings ...)
  2015-10-22 14:15 ` David Miller
@ 2015-10-24 11:55 ` David Miller
  2015-10-26 10:38   ` David Vrabel
  2015-10-26 10:38   ` [Xen-devel] " David Vrabel
  2015-10-24 11:55 ` David Miller
  19 siblings, 2 replies; 39+ messages in thread
From: David Miller @ 2015-10-24 11:55 UTC (permalink / raw)
  To: paul.durrant; +Cc: netdev, xen-devel

From: Paul Durrant <paul.durrant@citrix.com>
Date: Wed, 21 Oct 2015 11:36:17 +0100

> This series adds xen-netback support for hash negotiation with a frontend
> driver, and an implementation of toeplitz hashing as the initial negotiable
> algorithm.

Ping, I want to see some review from some other xen networking folks.

Thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Xen-devel] [PATCH net-next 0/8] xen-netback/core: packet hashing
  2015-10-24 11:55 ` David Miller
  2015-10-26 10:38   ` David Vrabel
@ 2015-10-26 10:38   ` David Vrabel
  2015-10-26 12:09     ` David Miller
  2015-10-26 12:09     ` [Xen-devel] " David Miller
  1 sibling, 2 replies; 39+ messages in thread
From: David Vrabel @ 2015-10-26 10:38 UTC (permalink / raw)
  To: David Miller, paul.durrant; +Cc: netdev, xen-devel

On 24/10/15 12:55, David Miller wrote:
> From: Paul Durrant <paul.durrant@citrix.com>
> Date: Wed, 21 Oct 2015 11:36:17 +0100
> 
>> This series adds xen-netback support for hash negotiation with a frontend
>> driver, and an implementation of toeplitz hashing as the initial negotiable
>> algorithm.
> 
> Ping, I want to see some review from some other xen networking folks.

There's been some review of the front/back protocol (on a different
thread) and some significant changes have been suggested.

David

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Xen-devel] [PATCH net-next 0/8] xen-netback/core: packet hashing
  2015-10-26 10:38   ` [Xen-devel] " David Vrabel
  2015-10-26 12:09     ` David Miller
@ 2015-10-26 12:09     ` David Miller
  1 sibling, 0 replies; 39+ messages in thread
From: David Miller @ 2015-10-26 12:09 UTC (permalink / raw)
  To: david.vrabel; +Cc: paul.durrant, netdev, xen-devel

From: David Vrabel <david.vrabel@citrix.com>
Date: Mon, 26 Oct 2015 10:38:50 +0000

> On 24/10/15 12:55, David Miller wrote:
>> From: Paul Durrant <paul.durrant@citrix.com>
>> Date: Wed, 21 Oct 2015 11:36:17 +0100
>> 
>>> This series adds xen-netback support for hash negotiation with a frontend
>>> driver, and an implementation of toeplitz hashing as the initial negotiable
>>> algorithm.
>> 
>> Ping, I want to see some review from some other xen networking folks.
> 
> There's been some review of the front/back protocol (on a different
> thread) and some significant changes have been suggested.

Ok, I'll mark this series as "changes requested" then, thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 1/8] xen-netback: re-import canonical netif header
  2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
@ 2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant
  Cc: netdev, xen-devel, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, Wei Liu

On Wed, Oct 21, 2015 at 11:36:18AM +0100, Paul Durrant wrote:
> The canonical netif header (in the Xen source repo) and the Linux variant
> have diverged significantly. Recently much documentation has been added to
> the canonical header and new definitions and types to support packet hash
> configuration. Subsequent patches in this series add support for packet
> hash configuration in xen-netback so this patch re-imports the canonical
> header in readiness.
> 
> To maintain compatibility and some style consistency with the old Linux
> variant, the header was stripped of its emacs boilerplate, and
> post-processed and copied into place with the following commands:
> 
> ed -s netif.h << EOF
> H
> ,s/NETTXF_/XEN_NETTXF_/g
> ,s/NETRXF_/XEN_NETRXF_/g
> ,s/NETIF_RSP/XEN_NETIF_RSP/g
> ,s/netif_tx/xen_netif_tx/g
> ,s/netif_rx/xen_netif_rx/g
> ,s/netif_extra_info/xen_netif_extra_info/g
> w
> EOF
> 
> indent --linux-style netif.h -o include/xen/interface/io/netif.h
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
> 
> Whilst awaiting review of my patches to the canonical netif.h, import has
> been done from my staging branch using:
> 
> wget http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git;a=blob_plain;f=xen/include/public/io/netif.h;hb=refs/heads/netif

There is on-going discussion on this so I'm going to skip this patch for
now.

Wei.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 2/8] xen-netback: remove GSO information from xenvif_rx_meta
  2015-10-21 10:36 ` Paul Durrant
  2015-10-26 17:05   ` Wei Liu
@ 2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:19AM +0100, Paul Durrant wrote:
> The code in net_rx_action() that builds rx responses has direct access
> to the skb so there is no need to copy this information into the meta
> structure.
> 
> This patch removes the extraneous fields, saves space in the array and
> removes many lines of code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 3/8] xen-netback: support multiple extra info segments passed from frontend
  2015-10-21 10:36 ` [PATCH net-next 3/8] xen-netback: support multiple extra info segments passed from frontend Paul Durrant
  2015-10-26 17:05   ` Wei Liu
@ 2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:20AM +0100, Paul Durrant wrote:
> The code does not currently allow a frontend to pass multiple extra info
> segments to the backend in a tx request. A subsequent patch in this series
> needs this functionality so it is added here, without any other
> modification, for better bisectability.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
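
For reference, chained extras are flagged with XEN_NETIF_EXTRA_FLAG_MORE,
so the consuming loop in the backend ends up shaped roughly like this
(a sketch of the shape, not the actual hunk):

	struct xen_netif_extra_info extra;
	RING_IDX cons = queue->tx.req_cons;

	do {
		if (unlikely(work_to_do-- <= 0))
			return -EBADR;	/* frontend overstated its work */

		/* Copy the segment out of the shared ring before use. */
		memcpy(&extra, RING_GET_REQUEST(&queue->tx, cons),
		       sizeof(extra));
		cons++;

		/* ... validate extra.type and stash the segment ... */
	} while (extra.flags & XEN_NETIF_EXTRA_FLAG_MORE);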

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 4/8] xen-netback: accept an L4 or L3 skb hash value from the frontend
  2015-10-21 10:36 ` [PATCH net-next 4/8] xen-netback: accept an L4 or L3 skb hash value from the frontend Paul Durrant
@ 2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:21AM +0100, Paul Durrant wrote:
> This patch adds an indication that netback is capable of handling hash
> values passed from the frontend (see netif.h for details), and the code
> necessary to process the additional xen_netif_extra_info segment and
> set a hash on the skb.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
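
The skb side of this is small once the extra segment has been parsed;
assuming the segment carries an L3/L4 type and a 32-bit value (the
variable names below are illustrative, not the wire format), it comes
down to:

	/* Tag the skb with the frontend-supplied hash so the rest of
	 * the stack (e.g. RPS) can reuse it instead of recomputing.
	 */
	skb_set_hash(skb, hash_value,
		     is_l4 ? PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);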

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

[...]
>  
> +		/* We support hash values. */
> +		err = xenbus_printf(xbt, dev->nodename,
> +				    "feature-hash", "%d", 1);
> +		if (err) {
> +			message = "writing feature-hash";
> +			goto abort_transaction;

Feel free to retain my Reviewed-by if this changes in the next version.

Wei.

> +		}
> +
>  		err = xenbus_transaction_end(xbt, 0);
>  	} while (err == -EAGAIN);
>  
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 6/8] xen-netback: pass an L4 or L3 skb hash value to the frontend
  2015-10-21 10:36 ` [PATCH net-next 6/8] xen-netback: pass an L4 or L3 skb hash value to the frontend Paul Durrant
  2015-10-26 17:05   ` Wei Liu
@ 2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:23AM +0100, Paul Durrant wrote:
> If the frontend indicates it's capable (see netif.h for details) and an
> skb has an L4 or L3 hash value then pass the value to the frontend in
> a xen_netif_extra_info segment.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

>  static int xenvif_rx_ring_slots_needed(struct xenvif *vif)
>  {
> -	if (vif->gso_mask)
> -		return DIV_ROUND_UP(vif->dev->gso_max_size, PAGE_SIZE) + 1;
> +	int needed;
> +
> +	if (vif->gso_mask || vif->gso_prefix_mask)

It seems like this line should become a patch for -stable?
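
For context, the reason this function changes at all is that the hash
now travels in its own extra_info segment, so the worst-case slot
estimate has to grow by one; the elided remainder is presumably shaped
like this (a guess at the shape, not the actual hunk):

	if (vif->gso_mask || vif->gso_prefix_mask)
		needed = DIV_ROUND_UP(vif->dev->gso_max_size, PAGE_SIZE) + 1;
	else
		needed = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);

	/* A hash-capable frontend may be sent one extra_info segment
	 * per packet, so budget one more slot for it.
	 */
	if (vif->hash_extra)
		needed++;

	return needed;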

>  		xenvif_add_frag_responses(queue, status,
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 2fa8a16..a31bcee 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -1037,6 +1037,11 @@ static int read_xenbus_vif_flags(struct backend_info *be)
>  		val = 0;
>  	vif->multicast_control = !!val;
>  
> +	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-hash",
> +			 "%d", &val) < 0)
> +		val = 0;

Again, feel free to retain my Reviewed-by if this changes in the next
version.

Wei.

> +	vif->hash_extra = !!val;
> +
>  	return 0;
>  }
>  
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 7/8] xen-netback: add support for a multi-queue hash mapping table
  2015-10-21 10:36 ` Paul Durrant
@ 2015-10-26 17:05   ` Wei Liu
  2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:24AM +0100, Paul Durrant wrote:
> Advertise the capability to handle a hash mapping specified by the
> frontend (see netif.h for details).
> 
> Add an ndo_select() entry point so that, of the frontend does specify a

"if the frontend ..."

> hash mapping, the skb hash is extracted and mapped to a queue. If no
> mapping is specified then the fallback queue selection function is
> called so there is no change in behaviour.
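
Schematically, using the ndo_select_queue signature current in net-next
at the time (the mapping-table field names here are illustrative, not
the patch's):

	static u16 xenvif_select_queue(struct net_device *dev,
				       struct sk_buff *skb,
				       void *accel_priv,
				       select_queue_fallback_t fallback)
	{
		struct xenvif *vif = netdev_priv(dev);

		/* No mapping configured: keep the old behaviour. */
		if (vif->hash_mapping_size == 0)
			return fallback(dev, skb);

		/* Index the frontend-supplied table by the skb hash. */
		return vif->hash_mapping[skb_get_hash_raw(skb) %
					 vif->hash_mapping_size];
	}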
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
[...]
> +static void xen_hash_mapping_changed(struct xenbus_watch *watch,
> +				     const char **vec, unsigned int len)
> +{
> +	struct xenvif *vif = container_of(watch, struct xenvif,
> +					  hash_mapping_watch);
> +
> +	xen_net_read_multi_queue_hash_mapping(vif);

Is it safe / correct not to stop the vif before changing the mapping table?

Wei.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH net-next 8/8] xen-netback: add support for toeplitz hashing
  2015-10-21 10:36 ` [PATCH net-next 8/8] xen-netback: add support for toeplitz hashing Paul Durrant
  2015-10-26 17:05   ` Wei Liu
@ 2015-10-26 17:05   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-10-26 17:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu

On Wed, Oct 21, 2015 at 11:36:25AM +0100, Paul Durrant wrote:
> This patch adds all the necessary infrastructure to allow a frontend to
> specify toeplitz hashing of network packets on its receive side. (See
> netif.h for details of the xenbus protocol).
> 
> The toeplitz hash algorithm itself was based on pseudo-code provided by
> Microsoft at:
> 
> https://msdn.microsoft.com/en-us/library/windows/hardware/ff570725.aspx
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
[...]
>  
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index 0c7da7b..38eee4f 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -142,17 +142,122 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
>  	netif_tx_wake_queue(netdev_get_tx_queue(dev, id));
>  }
>  

I skipped the hash implementation because I don't think I know enough to
tell if it is correct or not, and protocol negotiation because I think
that's going to change in the next version.
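
For anyone else reviewing: the Microsoft pseudo-code referenced in the
commit message reduces to XORing the leftmost 32 bits of the key into
the result for every set bit of the input, sliding the key window left
one bit per input bit. A minimal, untested kernel-style C sketch
(function and parameter names are illustrative, not the patch's; it
assumes keylen >= 4):

	static u32 toeplitz_hash(const u8 *key, unsigned int keylen,
				 const u8 *data, unsigned int datalen)
	{
		u32 prefix, hash = 0;
		unsigned int i, j;

		/* Current leftmost 32 bits of the key. */
		prefix = ((u32)key[0] << 24) | ((u32)key[1] << 16) |
			 ((u32)key[2] << 8) | (u32)key[3];

		for (i = 0; i < datalen; i++) {
			u8 byte = data[i];

			for (j = 0; j < 8; j++) {
				/* XOR in the window for each set input bit. */
				if (byte & 0x80)
					hash ^= prefix;
				byte <<= 1;

				/* Slide the key window left by one bit,
				 * padding with zeroes past the key's end.
				 */
				prefix <<= 1;
				if (4 + i < keylen &&
				    (key[4 + i] & (0x80 >> j)))
					prefix |= 1;
			}
		}

		return hash;
	}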

> +
> +
> +static void xen_net_read_toeplitz_key(struct xenvif *vif,
> +				      const char *node)
> +{
> +	struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
> +	char *str, *token;
> +	u8 key[40];

This should use the macro.

> +	unsigned int n, i;
> +
> +	str = xenbus_read(XBT_NIL, node, "key", NULL);
> +	if (IS_ERR(str))
> +		goto fail1;
> +
> +	memset(key, 0, sizeof(key));
> +
> +	n = 0;
> +	while ((token = strsep(&str, ",")) != NULL) {
> +		int rc;
> +
> +		if (n >= ARRAY_SIZE(vif->hash_params.toeplitz.key)) {
> +			pr_err("%s: key too big\n",
> +			       dev->nodename);
> +			goto fail2;
> +		}
> +
> +		rc = kstrtou8(token, 0, &key[n]);
> +		if (rc < 0) {
> +			pr_err("%s: invalid key value (%s at index %u)\n",
> +			       dev->nodename, token, n);
> +			goto fail2;
> +		}
> +
> +		n++;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(vif->hash_params.toeplitz.key); i++)
> +		vif->hash_params.toeplitz.key[i] = key[i];
> +
> +	kfree(str);
> +	return;
> +
> +fail2:
> +	kfree(str);
> +fail1:
> +	vif->hash_params.toeplitz.types = 0;
> +}
> +
[...]
> +
> +static void xen_hash_changed(struct xenbus_watch *watch,
> +			     const char **vec, unsigned int len)
> +{
> +	struct xenvif *vif = container_of(watch, struct xenvif, hash_watch);
> +
> +	xen_net_read_hash(vif);

I think the same question for previous patch applies here, too.

Are there any correctness or security implications in changing the hash
without stopping the vif?

Wei.

^ permalink raw reply	[flat|nested] 39+ messages in thread

