linux-kernel.vger.kernel.org archive mirror
* [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available
@ 2017-04-09  0:53 Haiyang Zhang
  2017-04-09  0:54 ` [PATCH net-next,2/3] hv_netvsc: Fix the queue index computation in forwarding case Haiyang Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Haiyang Zhang @ 2017-04-09  0:53 UTC (permalink / raw)
  To: davem, netdev; +Cc: haiyangz, kys, olaf, vkuznets, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com>

The per-socket hash is set when a socket is connected. Use it, when
available, to avoid spending CPU cycles recomputing the hash for the
same connection.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index f24c289..0a129cb 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -211,9 +211,14 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	int q_idx = sk_tx_queue_get(sk);
 
 	if (q_idx < 0 || skb->ooo_okay || q_idx >= num_tx_queues) {
-		u16 hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
+		u16 hash;
 		int new_idx;
 
+		if (sk)
+			skb_set_hash_from_sk(skb, sk);
+
+		hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
+
 		new_idx = net_device_ctx->tx_send_table[hash] % num_tx_queues;
 
 		if (q_idx != new_idx && sk &&
-- 
1.7.1

* [PATCH net-next,2/3] hv_netvsc: Fix the queue index computation in forwarding case
  2017-04-09  0:53 [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available Haiyang Zhang
@ 2017-04-09  0:54 ` Haiyang Zhang
  2017-04-09  0:54 ` [PATCH net-next,3/3] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing Haiyang Zhang
  2017-04-12  2:12 ` [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available David Miller
  2 siblings, 0 replies; 5+ messages in thread
From: Haiyang Zhang @ 2017-04-09  0:54 UTC (permalink / raw)
  To: davem, netdev; +Cc: haiyangz, kys, olaf, vkuznets, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com>

If the outgoing skb has an RX queue mapping available, use the queue
number directly rather than putting it through the Send Indirection Table.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/hyperv/hyperv_net.h |    2 +-
 drivers/net/hyperv/netvsc_drv.c |   54 ++++++++++++++++++++++++--------------
 2 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 4747ad4..768b3ae 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -633,7 +633,7 @@ struct nvsp_message {
 
 #define NETVSC_PACKET_SIZE                      4096
 
-#define VRSS_SEND_TAB_SIZE 16
+#define VRSS_SEND_TAB_SIZE 16  /* must be power of 2 */
 #define VRSS_CHANNEL_MAX 64
 #define VRSS_CHANNEL_DEFAULT 8
 
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 0a129cb..fad864f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -191,6 +191,27 @@ static int netvsc_close(struct net_device *net)
 	return ppi;
 }
 
+static inline int netvsc_get_tx_queue(struct net_device *ndev,
+				      struct sk_buff *skb, int old_idx)
+{
+	const struct net_device_context *ndc = netdev_priv(ndev);
+	struct sock *sk = skb->sk;
+	int q_idx;
+
+	if (sk)
+		skb_set_hash_from_sk(skb, sk);
+
+	q_idx = ndc->tx_send_table[skb_get_hash(skb) &
+				   (VRSS_SEND_TAB_SIZE - 1)];
+
+	/* If queue index changed record the new value */
+	if (q_idx != old_idx &&
+	    sk && sk_fullsock(sk) && rcu_access_pointer(sk->sk_dst_cache))
+		sk_tx_queue_set(sk, q_idx);
+
+	return q_idx;
+}
+
 /*
  * Select queue for transmit.
  *
@@ -205,29 +226,22 @@ static int netvsc_close(struct net_device *net)
 static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 			void *accel_priv, select_queue_fallback_t fallback)
 {
-	struct net_device_context *net_device_ctx = netdev_priv(ndev);
 	unsigned int num_tx_queues = ndev->real_num_tx_queues;
-	struct sock *sk = skb->sk;
-	int q_idx = sk_tx_queue_get(sk);
-
-	if (q_idx < 0 || skb->ooo_okay || q_idx >= num_tx_queues) {
-		u16 hash;
-		int new_idx;
-
-		if (sk)
-			skb_set_hash_from_sk(skb, sk);
-
-		hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
+	int q_idx = sk_tx_queue_get(skb->sk);
 
-		new_idx = net_device_ctx->tx_send_table[hash] % num_tx_queues;
-
-		if (q_idx != new_idx && sk &&
-		    sk_fullsock(sk) && rcu_access_pointer(sk->sk_dst_cache))
-			sk_tx_queue_set(sk, new_idx);
-
-		q_idx = new_idx;
+	if (q_idx < 0 || skb->ooo_okay) {
+		/* If forwarding a packet, we use the recorded queue when
+		 * available for better cache locality.
+		 */
+		if (skb_rx_queue_recorded(skb))
+			q_idx = skb_get_rx_queue(skb);
+		else
+			q_idx = netvsc_get_tx_queue(ndev, skb, q_idx);
 	}
 
+	while (unlikely(q_idx >= num_tx_queues))
+		q_idx -= num_tx_queues;
+
 	return q_idx;
 }
 
-- 
1.7.1
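
The selection logic in this patch can be read as three steps: reuse the
recorded RX queue when forwarding, otherwise index the send table with the
masked hash, then fold the result into the range of active TX queues. A
minimal userspace sketch follows. `struct model_pkt` and
`model_select_queue()` are hypothetical stand-ins for the skb and the driver
function, and the sketch leaves out the per-socket queue cache
(`sk_tx_queue_get()` / `sk_tx_queue_set()`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEND_TAB_SIZE 16 /* power of 2, mirrors VRSS_SEND_TAB_SIZE */

/* Hypothetical stand-in for the skb fields the driver consults. */
struct model_pkt {
	bool rx_queue_recorded; /* true for a forwarded packet */
	uint16_t rx_queue;      /* queue the packet arrived on */
	uint32_t hash;          /* flow hash of the packet */
};

/* Hypothetical model of the queue choice, not the driver code itself. */
static unsigned int model_select_queue(const struct model_pkt *pkt,
				       const uint16_t send_table[SEND_TAB_SIZE],
				       unsigned int num_tx_queues)
{
	unsigned int q_idx;

	assert(num_tx_queues > 0);

	if (pkt->rx_queue_recorded)
		/* Forwarding: use the RX queue number directly. */
		q_idx = pkt->rx_queue;
	else
		/* Local traffic: the mask is valid because the size is 2^n. */
		q_idx = send_table[pkt->hash & (SEND_TAB_SIZE - 1)];

	/* Fold into range by repeated subtraction, as the patch does. */
	while (q_idx >= num_tx_queues)
		q_idx -= num_tx_queues;

	return q_idx;
}
```

With a send table filled as `i % 4` and four TX queues, a forwarded packet
recorded on RX queue 6 folds to queue 2. Masking with `SEND_TAB_SIZE - 1`
only selects a valid slot because the table size is a power of 2, which is
why the patch adds that comment to `VRSS_SEND_TAB_SIZE`.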

* [PATCH net-next,3/3] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
  2017-04-09  0:53 [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available Haiyang Zhang
  2017-04-09  0:54 ` [PATCH net-next,2/3] hv_netvsc: Fix the queue index computation in forwarding case Haiyang Zhang
@ 2017-04-09  0:54 ` Haiyang Zhang
  2017-04-12  2:12 ` [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available David Miller
  2 siblings, 0 replies; 5+ messages in thread
From: Haiyang Zhang @ 2017-04-09  0:54 UTC (permalink / raw)
  To: davem, netdev; +Cc: haiyangz, kys, olaf, vkuznets, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com>

Azure hosts do not currently support non-TCP port numbers in vRSS hashing.
For example, the UDP packet loss rate is high if port numbers are included
in the vRSS hash.

This patch therefore uses only IP addresses for hashing non-TCP traffic.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c |   38 ++++++++++++++++++++++++++++++++++----
 1 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index fad864f..d65ab05 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -191,6 +191,39 @@ static int netvsc_close(struct net_device *net)
 	return ppi;
 }
 
+/* Azure hosts don't support non-TCP port numbers in hashing yet. We compute
+ * hash for non-TCP traffic with only IP addresses.
+ */
+static inline u32 netvsc_get_hash(struct sk_buff *skb, struct sock *sk)
+{
+	struct flow_keys flow;
+	u32 hash;
+	static u32 hashrnd __read_mostly;
+
+	net_get_random_once(&hashrnd, sizeof(hashrnd));
+
+	if (!skb_flow_dissect_flow_keys(skb, &flow, 0))
+		return 0;
+
+	if (flow.basic.ip_proto == IPPROTO_TCP) {
+		if (sk)
+			skb_set_hash_from_sk(skb, sk);
+
+		return skb_get_hash(skb);
+	} else {
+		if (flow.basic.n_proto == htons(ETH_P_IP))
+			hash = jhash2((u32 *)&flow.addrs.v4addrs, 2, hashrnd);
+		else if (flow.basic.n_proto == htons(ETH_P_IPV6))
+			hash = jhash2((u32 *)&flow.addrs.v6addrs, 8, hashrnd);
+		else
+			hash = 0;
+	}
+
+	skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
+
+	return hash;
+}
+
 static inline int netvsc_get_tx_queue(struct net_device *ndev,
 				      struct sk_buff *skb, int old_idx)
 {
@@ -198,10 +231,7 @@ static inline int netvsc_get_tx_queue(struct net_device *ndev,
 	struct sock *sk = skb->sk;
 	int q_idx;
 
-	if (sk)
-		skb_set_hash_from_sk(skb, sk);
-
-	q_idx = ndc->tx_send_table[skb_get_hash(skb) &
+	q_idx = ndc->tx_send_table[netvsc_get_hash(skb, sk) &
 				   (VRSS_SEND_TAB_SIZE - 1)];
 
 	/* If queue index changed record the new value */
-- 
1.7.1
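
The effect of hashing only addresses for non-TCP traffic is that every UDP
flow between the same two hosts lands on the same hash, and hence the same
send-table slot, regardless of ports. The sketch below illustrates this in
userspace under stated assumptions: `struct model_flow`, `model_mix()` and
`model_vrss_hash()` are hypothetical names, FNV-1a is swapped in for the
kernel's `jhash2()`, only IPv4 is modeled, and the TCP branch mixes the ports
in directly, whereas the patch takes the stack's existing flow hash via
`skb_get_hash()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_IPPROTO_TCP 6
#define MODEL_IPPROTO_UDP 17

/* Hypothetical, IPv4-only stand-in for the dissected flow keys. */
struct model_flow {
	uint8_t ip_proto;
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* FNV-1a over 32-bit words; a stand-in for jhash2(), not the same hash. */
static uint32_t model_mix(const uint32_t *words, size_t n, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	for (size_t i = 0; i < n; i++) {
		for (int shift = 0; shift < 32; shift += 8) {
			h ^= (words[i] >> shift) & 0xffu;
			h *= 16777619u;
		}
	}
	return h;
}

/* Ports contribute to the hash only when the flow is TCP. */
static uint32_t model_vrss_hash(const struct model_flow *f, uint32_t seed)
{
	uint32_t words[3];
	size_t n = 2;

	assert(f != NULL);

	words[0] = f->saddr;
	words[1] = f->daddr;
	words[2] = 0;

	if (f->ip_proto == MODEL_IPPROTO_TCP) {
		words[2] = ((uint32_t)f->sport << 16) | f->dport;
		n = 3;
	}
	return model_mix(words, n, seed);
}
```

Two UDP flows that share source and destination addresses hash identically
here whatever their ports, while two TCP flows that differ in destination
port do not. The trade-off is coarser load spreading for non-TCP traffic in
exchange for avoiding the packet loss described in the commit message.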

* Re: [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available
  2017-04-09  0:53 [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available Haiyang Zhang
  2017-04-09  0:54 ` [PATCH net-next,2/3] hv_netvsc: Fix the queue index computation in forwarding case Haiyang Zhang
  2017-04-09  0:54 ` [PATCH net-next,3/3] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing Haiyang Zhang
@ 2017-04-12  2:12 ` David Miller
  2017-04-12 16:24   ` Haiyang Zhang
  2 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2017-04-12  2:12 UTC (permalink / raw)
  To: haiyangz, haiyangz; +Cc: netdev, kys, olaf, vkuznets, linux-kernel

From: Haiyang Zhang <haiyangz@exchange.microsoft.com>
Date: Sat,  8 Apr 2017 17:53:59 -0700

> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index f24c289..0a129cb 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -211,9 +211,14 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
>  	int q_idx = sk_tx_queue_get(sk);
>  
>  	if (q_idx < 0 || skb->ooo_okay || q_idx >= num_tx_queues) {
> -		u16 hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
> +		u16 hash;
>  		int new_idx;
>  
> +		if (sk)
> +			skb_set_hash_from_sk(skb, sk);
> +
> +		hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);

Please do not do this.

TCP performs this operation for you for every packet it emits.

And also every socket family that uses skb_set_owner_w() either
directly or indirectly gets this done as well.

I do not want to see drivers start to get peppered with calls to this
thing.

Explain the case which is missing that matters, and we can address
that instead.

Thanks.
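
The redundancy can be illustrated with a small userspace model. This is a
sketch only: the `model_` structures and functions are hypothetical
simplifications of `struct sock`, `struct sk_buff`, `skb_set_owner_w()` and
`skb_set_hash_from_sk()`, and they assume the socket hash is copied into the
skb whenever it is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced models of struct sock and struct sk_buff. */
struct model_sock {
	uint32_t txhash;
};

struct model_skb {
	uint32_t hash;
	bool l4_hash;
	const struct model_sock *sk;
};

/* Copy the socket hash into the skb if the socket has one. */
static void model_set_hash_from_sk(struct model_skb *skb,
				   const struct model_sock *sk)
{
	if (sk->txhash) {
		skb->l4_hash = true;
		skb->hash = sk->txhash;
	}
}

/* Models the stack attaching an outgoing skb to its socket. */
static void model_set_owner_w(struct model_skb *skb,
			      const struct model_sock *sk)
{
	assert(skb != NULL && sk != NULL);
	skb->sk = sk;
	model_set_hash_from_sk(skb, sk);
}

/* Returns true if repeating the call in a driver would alter the skb. */
static bool model_driver_call_changes_skb(struct model_skb *skb)
{
	struct model_skb before = *skb;

	if (skb->sk)
		model_set_hash_from_sk(skb, skb->sk);

	return before.hash != skb->hash || before.l4_hash != skb->l4_hash;
}
```

In this model, once `model_set_owner_w()` has run, a second call from the
driver leaves the skb unchanged, so it costs cycles without affecting queue
selection.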

* RE: [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available
  2017-04-12  2:12 ` [PATCH net-next,1/3] hv_netvsc: Use per socket hash when available David Miller
@ 2017-04-12 16:24   ` Haiyang Zhang
  0 siblings, 0 replies; 5+ messages in thread
From: Haiyang Zhang @ 2017-04-12 16:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, KY Srinivasan, olaf, vkuznets, linux-kernel



> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, April 11, 2017 10:13 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>
> Cc: netdev@vger.kernel.org; KY Srinivasan <kys@microsoft.com>;
> olaf@aepfle.de; vkuznets@redhat.com; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH net-next,1/3] hv_netvsc: Use per socket hash when
> available
> 
> From: Haiyang Zhang <haiyangz@exchange.microsoft.com>
> Date: Sat,  8 Apr 2017 17:53:59 -0700
> 
> > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> > index f24c289..0a129cb 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -211,9 +211,14 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
> >  	int q_idx = sk_tx_queue_get(sk);
> >
> >  	if (q_idx < 0 || skb->ooo_okay || q_idx >= num_tx_queues) {
> > -		u16 hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
> > +		u16 hash;
> >  		int new_idx;
> >
> > +		if (sk)
> > +			skb_set_hash_from_sk(skb, sk);
> > +
> > +		hash = __skb_tx_hash(ndev, skb, VRSS_SEND_TAB_SIZE);
> 
> Please do not do this.
> 
> TCP performs this operation for you for every packet it emits.
> 
> And also every socket family that uses skb_set_owner_w() either
> directly or indirectly gets this done as well.
> 
> I do not want to see drivers start to get peppered with calls to this
> thing.
> 
> Explain the case which is missing that matters, and we can address
> that instead.
> 
> Thanks.

Thanks for pointing this out. I ran some tests, and skb->hash is indeed
set to sk->sk_txhash by the upper layer. I will drop this patch and
resubmit the other patches.

Thanks,
- Haiyang
