* [PATCH 0/4 v3] net: Implement fast TX queue selection
From: Krishna Kumar @ 2009-10-18 13:07 UTC (permalink / raw)
To: davem; +Cc: netdev, herbert, Krishna Kumar, dada1
From: Krishna Kumar <krkumar2@in.ibm.com>
Notes:
1. Eric suggested:
   - To use u16 for txq#, but I am using an "int" for now as that
     avoids one unnecessary subtraction during tx.
   - An improvement of caching the txq at connection establishment
     time (TBD later) so as to use rxq# = txq#.
   - Drivers can call sk_tx_queue_set() to set the txq if they are
     going to call skb_tx_hash() internally.
2. v3 patch stress tested with 1000 netperfs, reboots, etc.
Changelog [from v2]
--------------------
1. Changed names of functions setting, getting and returning the
txq#; and added a new one to reset the txq#.
2. Free sk doesn't need to reset txq#.
Changelog [from v1]
--------------------
1. Changed IPv6 code to call __sk_dst_reset() directly.
2. Removed the patch re-arranging ("encapsulating") __sk_dst_reset().
Multiqueue cards on routers/firewalls set skb->queue_mapping on
input, which helps in faster xmit. Implement fast queue selection
for locally generated packets as well, by saving the txq# for
connected sockets (in dev_pick_tx) and reusing it on subsequent
transmits. Locally generated packets for a connection will xmit
on the same txq, while routing & firewall loads should not be
affected by this patch. Tests show that the distribution across
txqs for 1-4 netperf sessions is similar to the existing code.
Testing & results:
------------------
1. Cycles/Iter (C/I) used by dev_pick_tx:
(B -> Billion, M -> Million)
|--------------|------------------------|------------------------|
|              |          ORG           |          NEW           |
|     Test     |--------|---------|-----|--------|---------|-----|
|              | Cycles |  Iters  | C/I | Cycles |  Iters  | C/I |
|--------------|--------|---------|-----|--------|---------|-----|
| [TCP_STREAM, | 3.98 B | 12.47 M | 320 | 1.95 B | 12.92 M | 152 |
|  UDP_STREAM, |        |         |     |        |         |     |
|  TCP_RR,     |        |         |     |        |         |     |
|  UDP_RR]     |        |         |     |        |         |     |
|--------------|--------|---------|-----|--------|---------|-----|
| [TCP_STREAM, | 8.92 B | 29.66 M | 300 | 3.82 B | 38.88 M |  98 |
|  TCP_RR,     |        |         |     |        |         |     |
|  UDP_RR]     |        |         |     |        |         |     |
|--------------|--------|---------|-----|--------|---------|-----|
2. Stress test (over 48 hours): 1000 netperfs running a combination
   of TCP_STREAM/RR and UDP_STREAM/RR (v4/v6, NODELAY/~NODELAY for all
   tests), along with some ssh sessions, reboots, modprobe -r driver, etc.
3. Performance test (10 hours): a single 10-hour netperf run of
   TCP_STREAM/RR, TCP_STREAM + NO_DELAY and UDP_RR. Results show an
   improvement in both performance and CPU utilization.
Tested on a 4-processor AMD Opteron 2.8 GHz system with 1GB memory,
10G Chelsio card. Each BW number is the sum of 3 iterations of
individual tests using 512, 16K, 64K & 128K I/O sizes, in Mb/s:
------------------------ TCP Tests -----------------------
#procs   Org BW      New BW (%)         Org SD     New SD (%)
------------------------------------------------------------
1        77777.7     81011.0 (4.15)     42.3       40.2 (-5.11)
4        91599.2     91878.8 (.30)      955.9      919.3 (-3.83)
6        89533.3     91792.2 (2.52)     2262.0     2143.0 (-5.25)
8        87507.5     89161.9 (1.89)     4363.4     4073.6 (-6.64)
10       85152.4     85607.8 (.53)      6890.4     6851.2 (-.56)
------------------------------------------------------------
------------------------- TCP NO_DELAY Tests ---------------
#procs   Org BW      New BW (%)         Org SD     New SD (%)
------------------------------------------------------------
1        57001.9     57888.0 (1.55)     67.7       70.2 (3.75)
4        69555.1     69957.4 (.57)      823.0      834.3 (1.36)
6        71359.3     71918.7 (.78)      1740.8     1724.5 (-.93)
8        72577.6     72496.1 (-.11)     2955.4     2937.7 (-.59)
10       70829.6     71444.2 (.86)      4826.1     4673.4 (-3.16)
------------------------------------------------------------
----------------------- Request Response Tests --------------------
#procs   Org TPS       New TPS (%)          Org SD      New SD (%)
(1-10)
-------------------------------------------------------------------
TCP      1019245.9     1042626.4 (2.29)     16352.9     16459.8 (.65)
UDP      934598.64     942956.9 (.89)       11607.3     11593.2 (-.12)
-------------------------------------------------------------------
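As a rough cross-check of the C/I column in the dev_pick_tx cycles table
above, the arithmetic can be sketched as below. This is only an
illustration: the helper names are made up for this note and are not part
of the patches. The Cycles and Iters figures in the table are rounded, so
recomputing from them lands within a cycle or two of the printed C/I
values rather than exactly on them.

```c
/*
 * Illustrative helpers only (hypothetical names, not from the patch set).
 */

/* Cycles per iteration: total cycles spent divided by number of calls. */
static double cycles_per_iter(double cycles, double iters)
{
	return cycles / iters;
}

/* Relative change in percent going from org to new_val (negative = lower). */
static double pct_change(double org, double new_val)
{
	return (new_val - org) / org * 100.0;
}
```

For example, cycles_per_iter(3.98e9, 12.47e6) gives roughly 319, and
pct_change(320.0, 152.0) gives -52.5, i.e. the first test group spends
about half as many cycles per call in dev_pick_tx; the second group
(300 -> 98) comes out at roughly -67%.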
Thanks,
- KK
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
* [PATCH 1/4 v3] net: Introduce sk_tx_queue_mapping
From: Krishna Kumar @ 2009-10-18 13:07 UTC (permalink / raw)
To: davem; +Cc: netdev, herbert, Krishna Kumar, dada1
From: Krishna Kumar <krkumar2@in.ibm.com>
Introduce sk_tx_queue_mapping, along with helpers that set, clear,
test and get this value. Reset sk_tx_queue_mapping to -1 whenever the
dst cache is set or reset, and at socket allocation. Using -1 for
"not recorded" and 0 to n-1 for valid txqs lets the tx path use the
value of sk_tx_queue_mapping directly instead of subtracting 1 on
every tx.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
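For reference, the encoding described above can be modelled in userspace
roughly as follows. This is a sketch, not kernel code: struct mock_sock,
struct mock_sock_u16 and all mock_* helpers are hypothetical stand-ins.
The u16 variant is my assumption of what a "0 means unset, store txq + 1"
layout would look like, shown only to illustrate where the extra
subtraction on the tx path would come from.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only the cached txq is modelled. */
struct mock_sock {
	int tx_queue_mapping;	/* -1: nothing recorded, 0..n-1: cached txq */
};

static void mock_tx_queue_set(struct mock_sock *sk, int txq)
{
	sk->tx_queue_mapping = txq;
}

static void mock_tx_queue_clear(struct mock_sock *sk)
{
	sk->tx_queue_mapping = -1;
}

/* The tx path reads the value as-is, with no arithmetic. */
static int mock_tx_queue_get(const struct mock_sock *sk)
{
	return sk->tx_queue_mapping;
}

static bool mock_tx_queue_recorded(const struct mock_sock *sk)
{
	return sk && sk->tx_queue_mapping >= 0;
}

/* Assumed alternative: unsigned 16-bit field, 0 = unset, value = txq + 1. */
struct mock_sock_u16 {
	unsigned short tx_queue_plus_one;
};

static void mock_u16_set(struct mock_sock_u16 *sk, unsigned short txq)
{
	sk->tx_queue_plus_one = (unsigned short)(txq + 1);
}

static bool mock_u16_recorded(const struct mock_sock_u16 *sk)
{
	return sk && sk->tx_queue_plus_one != 0;
}

/* Here every read on the tx path has to undo the +1 offset. */
static int mock_u16_get(const struct mock_sock_u16 *sk)
{
	return sk->tx_queue_plus_one - 1;
}
```

Both layouts can represent "no queue recorded" and queue 0 distinctly;
the int variant trades two extra bytes for a get that is a plain load.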
include/net/sock.h | 26 ++++++++++++++++++++++++++
net/core/sock.c | 5 ++++-
2 files changed, 30 insertions(+), 1 deletion(-)
diff -ruNp org/include/net/sock.h new/include/net/sock.h
--- org/include/net/sock.h 2009-10-16 18:53:40.000000000 +0530
+++ new/include/net/sock.h 2009-10-16 21:38:44.000000000 +0530
@@ -107,6 +107,7 @@ struct net;
* @skc_node: main hash linkage for various protocol lookup tables
* @skc_nulls_node: main hash linkage for UDP/UDP-Lite protocol
* @skc_refcnt: reference count
+ * @skc_tx_queue_mapping: tx queue number for this connection
* @skc_hash: hash value used with various protocol lookup tables
* @skc_family: network address family
* @skc_state: Connection state
@@ -128,6 +129,7 @@ struct sock_common {
struct hlist_nulls_node skc_nulls_node;
};
atomic_t skc_refcnt;
+ int skc_tx_queue_mapping;
unsigned int skc_hash;
unsigned short skc_family;
@@ -215,6 +217,7 @@ struct sock {
#define sk_node __sk_common.skc_node
#define sk_nulls_node __sk_common.skc_nulls_node
#define sk_refcnt __sk_common.skc_refcnt
+#define sk_tx_queue_mapping __sk_common.skc_tx_queue_mapping
#define sk_copy_start __sk_common.skc_hash
#define sk_hash __sk_common.skc_hash
@@ -1094,8 +1097,29 @@ static inline void sock_put(struct sock
 extern int sk_receive_skb(struct sock *sk, struct sk_buff *skb,
 			  const int nested);
 
+static inline void sk_tx_queue_set(struct sock *sk, int tx_queue)
+{
+	sk->sk_tx_queue_mapping = tx_queue;
+}
+
+static inline void sk_tx_queue_clear(struct sock *sk)
+{
+	sk->sk_tx_queue_mapping = -1;
+}
+
+static inline int sk_tx_queue_get(const struct sock *sk)
+{
+	return sk->sk_tx_queue_mapping;
+}
+
+static inline bool sk_tx_queue_recorded(const struct sock *sk)
+{
+	return (sk && sk->sk_tx_queue_mapping >= 0);
+}
+
 static inline void sk_set_socket(struct sock *sk, struct socket *sock)
 {
+	sk_tx_queue_clear(sk);
 	sk->sk_socket = sock;
 }
@@ -1152,6 +1176,7 @@ __sk_dst_set(struct sock *sk, struct dst
{
struct dst_entry *old_dst;
+ sk_tx_queue_clear(sk);
old_dst = sk->sk_dst_cache;
sk->sk_dst_cache = dst;
dst_release(old_dst);
@@ -1170,6 +1195,7 @@ __sk_dst_reset(struct sock *sk)
{
struct dst_entry *old_dst;
+ sk_tx_queue_clear(sk);
old_dst = sk->sk_dst_cache;
sk->sk_dst_cache = NULL;
dst_release(old_dst);
diff -ruNp org/net/core/sock.c new/net/core/sock.c
--- org/net/core/sock.c 2009-10-16 18:53:40.000000000 +0530
+++ new/net/core/sock.c 2009-10-16 21:29:02.000000000 +0530
@@ -357,6 +357,7 @@ struct dst_entry *__sk_dst_check(struct
struct dst_entry *dst = sk->sk_dst_cache;
if (dst && dst->obsolete && dst->ops->check(dst, cookie) == NULL) {
+ sk_tx_queue_clear(sk);
sk->sk_dst_cache = NULL;
dst_release(dst);
return NULL;
@@ -953,7 +954,8 @@ static void sock_copy(struct sock *nsk,
void *sptr = nsk->sk_security;
#endif
BUILD_BUG_ON(offsetof(struct sock, sk_copy_start) !=
- sizeof(osk->sk_node) + sizeof(osk->sk_refcnt));
+ sizeof(osk->sk_node) + sizeof(osk->sk_refcnt) +
+ sizeof(osk->sk_tx_queue_mapping));
memcpy(&nsk->sk_copy_start, &osk->sk_copy_start,
osk->sk_prot->obj_size - offsetof(struct sock, sk_copy_start));
#ifdef CONFIG_SECURITY_NETWORK
@@ -997,6 +999,7 @@ static struct sock *sk_prot_alloc(struct
if (!try_module_get(prot->owner))
goto out_free_sec;
+ sk_tx_queue_clear(sk);
}
return sk;
* [PATCH 2/4 v3] net: Use sk_tx_queue_mapping for connected sockets
From: Krishna Kumar @ 2009-10-18 13:07 UTC (permalink / raw)
To: davem; +Cc: netdev, herbert, Krishna Kumar, dada1
From: Krishna Kumar <krkumar2@in.ibm.com>
For connected sockets, the first run of dev_pick_tx saves the
calculated txq in sk_tx_queue_mapping. The txq is not saved if
either the device provides its own queue selection
(ndo_select_queue) or the socket is not connected. Subsequent
calls to dev_pick_tx use the cached value of sk_tx_queue_mapping.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
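The selection logic described above can be modelled in userspace roughly
as follows. This is a sketch under stated assumptions, not the kernel
code: struct mock_sock, struct mock_dev and the mock_* functions are
hypothetical, the "connected" flag stands in for sk->sk_dst_cache being
set, and the modulo hash is only a placeholder for skb_tx_hash().

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct net_device. */
struct mock_sock {
	int tx_queue_mapping;	/* -1 when no txq has been recorded */
	int connected;		/* models sk->sk_dst_cache != NULL */
};

struct mock_dev {
	unsigned int real_num_tx_queues;
	/* Optional driver hook, modelling ndo_select_queue; NULL if absent. */
	unsigned int (*select_queue)(const struct mock_dev *dev,
				     unsigned int flow_hash);
};

/* Placeholder for skb_tx_hash(): the real function is not a plain modulo. */
static unsigned int mock_tx_hash(const struct mock_dev *dev,
				 unsigned int flow_hash)
{
	return flow_hash % dev->real_num_tx_queues;
}

static unsigned int mock_pick_tx(const struct mock_dev *dev,
				 struct mock_sock *sk, unsigned int flow_hash)
{
	unsigned int queue_index;

	/* Fast path: reuse the txq recorded by an earlier transmit. */
	if (sk && sk->tx_queue_mapping >= 0)
		return (unsigned int)sk->tx_queue_mapping;

	/* The driver picks the queue itself; nothing is cached here. */
	if (dev->select_queue)
		return dev->select_queue(dev, flow_hash);

	queue_index = 0;
	if (dev->real_num_tx_queues > 1)
		queue_index = mock_tx_hash(dev, flow_hash);

	/* Only connected sockets remember the result. */
	if (sk && sk->connected)
		sk->tx_queue_mapping = (int)queue_index;

	return queue_index;
}
```

With a 4-queue device and a connected socket, the first call hashes and
records the queue; later calls return the recorded queue without
hashing, until something resets the mapping to -1.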
net/core/dev.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff -ruNp org/net/core/dev.c new/net/core/dev.c
--- org/net/core/dev.c 2009-10-16 18:53:40.000000000 +0530
+++ new/net/core/dev.c 2009-10-16 21:30:38.000000000 +0530
@@ -1791,13 +1791,25 @@ EXPORT_SYMBOL(skb_tx_hash);
 static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 					struct sk_buff *skb)
 {
-	const struct net_device_ops *ops = dev->netdev_ops;
-	u16 queue_index = 0;
+	u16 queue_index;
+	struct sock *sk = skb->sk;
+
+	if (sk_tx_queue_recorded(sk)) {
+		queue_index = sk_tx_queue_get(sk);
+	} else {
+		const struct net_device_ops *ops = dev->netdev_ops;
 
-	if (ops->ndo_select_queue)
-		queue_index = ops->ndo_select_queue(dev, skb);
-	else if (dev->real_num_tx_queues > 1)
-		queue_index = skb_tx_hash(dev, skb);
+		if (ops->ndo_select_queue) {
+			queue_index = ops->ndo_select_queue(dev, skb);
+		} else {
+			queue_index = 0;
+			if (dev->real_num_tx_queues > 1)
+				queue_index = skb_tx_hash(dev, skb);
+
+			if (sk && sk->sk_dst_cache)
+				sk_tx_queue_set(sk, queue_index);
+		}
+	}
 
 	skb_set_queue_mapping(skb, queue_index);
 	return netdev_get_tx_queue(dev, queue_index);
* [PATCH 3/4 v3] net: IPv6 changes
From: Krishna Kumar @ 2009-10-18 13:08 UTC (permalink / raw)
To: davem; +Cc: netdev, herbert, Krishna Kumar, dada1
From: Krishna Kumar <krkumar2@in.ibm.com>
IPv6: Reset sk_tx_queue_mapping when dst_cache is reset. Use the
existing __sk_dst_reset() helper to do the work.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
net/ipv6/inet6_connection_sock.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff -ruNp org/net/ipv6/inet6_connection_sock.c new/net/ipv6/inet6_connection_sock.c
--- org/net/ipv6/inet6_connection_sock.c 2009-10-16 21:29:19.000000000 +0530
+++ new/net/ipv6/inet6_connection_sock.c 2009-10-16 21:31:00.000000000 +0530
@@ -168,8 +168,7 @@ struct dst_entry *__inet6_csk_dst_check(
if (dst) {
struct rt6_info *rt = (struct rt6_info *)dst;
if (rt->rt6i_flow_cache_genid != atomic_read(&flow_cache_genid)) {
- sk->sk_dst_cache = NULL;
- dst_release(dst);
+ __sk_dst_reset(sk);
dst = NULL;
}
}
* [PATCH 4/4 v3] net: Fix for dst_negative_advice
From: Krishna Kumar @ 2009-10-18 13:08 UTC (permalink / raw)
To: davem; +Cc: netdev, herbert, Krishna Kumar, dada1
From: Krishna Kumar <krkumar2@in.ibm.com>
dst_negative_advice() should check whether the dst changed and reset
sk_tx_queue_mapping accordingly. Callers of dst_negative_advice()
now pass the sock as well.
(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
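The invalidation rule described above can be modelled in userspace
roughly as follows. This is a sketch, not the kernel code: struct
mock_dst, struct mock_sock and the two sample advice callbacks are
hypothetical, and the function pointer stands in for
dst->ops->negative_advice.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dst_entry and its negative_advice op. */
struct mock_dst {
	/* Returns the dst to keep using, or NULL if it should be dropped. */
	struct mock_dst *(*negative_advice)(struct mock_dst *dst);
};

/* Hypothetical stand-in for struct sock. */
struct mock_sock {
	struct mock_dst *dst_cache;
	int tx_queue_mapping;	/* -1 when no txq is recorded */
};

static void mock_dst_negative_advice(struct mock_dst **dst_p,
				     struct mock_sock *sk)
{
	struct mock_dst *dst = *dst_p;

	if (dst && dst->negative_advice) {
		*dst_p = dst->negative_advice(dst);

		/* The cached txq belongs to the old dst; forget it. */
		if (dst != *dst_p)
			sk->tx_queue_mapping = -1;
	}
}

/* Sample policy: keep the current dst, so the cached txq stays valid. */
static struct mock_dst *advice_keep(struct mock_dst *dst)
{
	return dst;
}

/* Sample policy: drop the dst, which must also invalidate the txq. */
static struct mock_dst *advice_drop(struct mock_dst *dst)
{
	(void)dst;
	return NULL;
}
```

The cached txq is cleared only when the advice callback actually
replaces or drops the dst; an unchanged dst leaves the mapping alone.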
include/net/dst.h | 12 ++++++++++--
net/core/sock.c | 6 ++++++
net/dccp/timer.c | 4 ++--
net/decnet/af_decnet.c | 2 +-
net/ipv4/tcp_timer.c | 4 ++--
5 files changed, 21 insertions(+), 7 deletions(-)
diff -ruNp org/include/net/dst.h new/include/net/dst.h
--- org/include/net/dst.h 2009-10-16 21:30:56.000000000 +0530
+++ new/include/net/dst.h 2009-10-16 21:31:30.000000000 +0530
@@ -222,11 +222,19 @@ static inline void dst_confirm(struct ds
neigh_confirm(dst->neighbour);
}
-static inline void dst_negative_advice(struct dst_entry **dst_p)
+static inline void dst_negative_advice(struct dst_entry **dst_p,
+ struct sock *sk)
{
struct dst_entry * dst = *dst_p;
- if (dst && dst->ops->negative_advice)
+ if (dst && dst->ops->negative_advice) {
*dst_p = dst->ops->negative_advice(dst);
+
+ if (dst != *dst_p) {
+ extern void sk_reset_txq(struct sock *sk);
+
+ sk_reset_txq(sk);
+ }
+ }
}
static inline void dst_link_failure(struct sk_buff *skb)
diff -ruNp org/net/core/sock.c new/net/core/sock.c
--- org/net/core/sock.c 2009-10-16 21:30:56.000000000 +0530
+++ new/net/core/sock.c 2009-10-16 21:32:33.000000000 +0530
@@ -352,6 +352,12 @@ discard_and_relse:
}
EXPORT_SYMBOL(sk_receive_skb);
+void sk_reset_txq(struct sock *sk)
+{
+ sk_tx_queue_clear(sk);
+}
+EXPORT_SYMBOL(sk_reset_txq);
+
struct dst_entry *__sk_dst_check(struct sock *sk, u32 cookie)
{
struct dst_entry *dst = sk->sk_dst_cache;
diff -ruNp org/net/dccp/timer.c new/net/dccp/timer.c
--- org/net/dccp/timer.c 2009-10-16 21:30:56.000000000 +0530
+++ new/net/dccp/timer.c 2009-10-16 21:31:30.000000000 +0530
@@ -38,7 +38,7 @@ static int dccp_write_timeout(struct soc
if (sk->sk_state == DCCP_REQUESTING || sk->sk_state == DCCP_PARTOPEN) {
if (icsk->icsk_retransmits != 0)
- dst_negative_advice(&sk->sk_dst_cache);
+ dst_negative_advice(&sk->sk_dst_cache, sk);
retry_until = icsk->icsk_syn_retries ?
: sysctl_dccp_request_retries;
} else {
@@ -63,7 +63,7 @@ static int dccp_write_timeout(struct soc
Golden words :-).
*/
- dst_negative_advice(&sk->sk_dst_cache);
+ dst_negative_advice(&sk->sk_dst_cache, sk);
}
retry_until = sysctl_dccp_retries2;
diff -ruNp org/net/decnet/af_decnet.c new/net/decnet/af_decnet.c
--- org/net/decnet/af_decnet.c 2009-10-16 21:30:56.000000000 +0530
+++ new/net/decnet/af_decnet.c 2009-10-16 21:31:30.000000000 +0530
@@ -1955,7 +1955,7 @@ static int dn_sendmsg(struct kiocb *iocb
}
if ((flags & MSG_TRYHARD) && sk->sk_dst_cache)
- dst_negative_advice(&sk->sk_dst_cache);
+ dst_negative_advice(&sk->sk_dst_cache, sk);
mss = scp->segsize_rem;
fctype = scp->services_rem & NSP_FC_MASK;
diff -ruNp org/net/ipv4/tcp_timer.c new/net/ipv4/tcp_timer.c
--- org/net/ipv4/tcp_timer.c 2009-10-16 21:30:56.000000000 +0530
+++ new/net/ipv4/tcp_timer.c 2009-10-16 21:31:30.000000000 +0530
@@ -141,14 +141,14 @@ static int tcp_write_timeout(struct sock
if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
if (icsk->icsk_retransmits)
- dst_negative_advice(&sk->sk_dst_cache);
+ dst_negative_advice(&sk->sk_dst_cache, sk);
retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
} else {
if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
/* Black hole detection */
tcp_mtu_probing(icsk, sk);
- dst_negative_advice(&sk->sk_dst_cache);
+ dst_negative_advice(&sk->sk_dst_cache, sk);
}
retry_until = sysctl_tcp_retries2;
* Re: [PATCH 4/4 v3] net: Fix for dst_negative_advice
From: Stephen Hemminger @ 2009-10-19 4:12 UTC (permalink / raw)
To: Krishna Kumar; +Cc: davem, netdev, herbert, Krishna Kumar, dada1
On Sun, 18 Oct 2009 18:38:16 +0530
Krishna Kumar <krkumar2@in.ibm.com> wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> dst_negative_advice() should check for changed dst and reset
> sk_tx_queue_mapping accordingly. Pass sock to the callers of
> dst_negative_advice.
>
> (sk_reset_txq is defined just for use by dst_negative_advice. The
> only way I could find to get around this is to move dst_negative_()
> from dst.h to dst.c, include sock.h in dst.c, etc)
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
> include/net/dst.h | 12 ++++++++++--
> net/core/sock.c | 6 ++++++
> net/dccp/timer.c | 4 ++--
> net/decnet/af_decnet.c | 2 +-
> net/ipv4/tcp_timer.c | 4 ++--
> 5 files changed, 21 insertions(+), 7 deletions(-)
>
> diff -ruNp org/include/net/dst.h new/include/net/dst.h
> --- org/include/net/dst.h 2009-10-16 21:30:56.000000000 +0530
> +++ new/include/net/dst.h 2009-10-16 21:31:30.000000000 +0530
> @@ -222,11 +222,19 @@ static inline void dst_confirm(struct ds
> neigh_confirm(dst->neighbour);
> }
>
> -static inline void dst_negative_advice(struct dst_entry **dst_p)
> +static inline void dst_negative_advice(struct dst_entry **dst_p,
> + struct sock *sk)
> {
> struct dst_entry * dst = *dst_p;
> - if (dst && dst->ops->negative_advice)
> + if (dst && dst->ops->negative_advice) {
> *dst_p = dst->ops->negative_advice(dst);
> +
> + if (dst != *dst_p) {
> + extern void sk_reset_txq(struct sock *sk);
> +
> + sk_reset_txq(sk);
> + }
> + }
> }
>
> static inline void dst_link_failure(struct sk_buff *skb)
> diff -ruNp org/net/core/sock.c new/net/core/sock.c
> --- org/net/core/sock.c 2009-10-16 21:30:56.000000000 +0530
> +++ new/net/core/sock.c 2009-10-16 21:32:33.000000000 +0530
> @@ -352,6 +352,12 @@ discard_and_relse:
> }
> EXPORT_SYMBOL(sk_receive_skb);
>
> +void sk_reset_txq(struct sock *sk)
> +{
> + sk_tx_queue_clear(sk);
> +}
> +EXPORT_SYMBOL(sk_reset_txq);
> +
> struct dst_entry *__sk_dst_check(struct sock *sk, u32 cookie)
> {
> struct dst_entry *dst = sk->sk_dst_cache;
> diff -ruNp org/net/dccp/timer.c new/net/dccp/timer.c
> --- org/net/dccp/timer.c 2009-10-16 21:30:56.000000000 +0530
> +++ new/net/dccp/timer.c 2009-10-16 21:31:30.000000000 +0530
> @@ -38,7 +38,7 @@ static int dccp_write_timeout(struct soc
>
> if (sk->sk_state == DCCP_REQUESTING || sk->sk_state == DCCP_PARTOPEN) {
> if (icsk->icsk_retransmits != 0)
> - dst_negative_advice(&sk->sk_dst_cache);
> + dst_negative_advice(&sk->sk_dst_cache, sk);
> retry_until = icsk->icsk_syn_retries ?
> : sysctl_dccp_request_retries;
> } else {
> @@ -63,7 +63,7 @@ static int dccp_write_timeout(struct soc
> Golden words :-).
> */
>
> - dst_negative_advice(&sk->sk_dst_cache);
> + dst_negative_advice(&sk->sk_dst_cache, sk);
> }
>
> retry_until = sysctl_dccp_retries2;
> diff -ruNp org/net/decnet/af_decnet.c new/net/decnet/af_decnet.c
> --- org/net/decnet/af_decnet.c 2009-10-16 21:30:56.000000000 +0530
> +++ new/net/decnet/af_decnet.c 2009-10-16 21:31:30.000000000 +0530
> @@ -1955,7 +1955,7 @@ static int dn_sendmsg(struct kiocb *iocb
> }
>
> if ((flags & MSG_TRYHARD) && sk->sk_dst_cache)
> - dst_negative_advice(&sk->sk_dst_cache);
> + dst_negative_advice(&sk->sk_dst_cache, sk);
>
> mss = scp->segsize_rem;
> fctype = scp->services_rem & NSP_FC_MASK;
> diff -ruNp org/net/ipv4/tcp_timer.c new/net/ipv4/tcp_timer.c
> --- org/net/ipv4/tcp_timer.c 2009-10-16 21:30:56.000000000 +0530
> +++ new/net/ipv4/tcp_timer.c 2009-10-16 21:31:30.000000000 +0530
> @@ -141,14 +141,14 @@ static int tcp_write_timeout(struct sock
>
> if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
> if (icsk->icsk_retransmits)
> - dst_negative_advice(&sk->sk_dst_cache);
> + dst_negative_advice(&sk->sk_dst_cache, sk);
> retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
> } else {
> if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
> /* Black hole detection */
> tcp_mtu_probing(icsk, sk);
>
> - dst_negative_advice(&sk->sk_dst_cache);
> + dst_negative_advice(&sk->sk_dst_cache, sk);
> }
>
> retry_until = sysctl_tcp_retries2;
It is good that your patch is broken into pieces, but will the intermediate
patches still function correctly? I.e., are they bisect safe?
* Re: [PATCH 4/4 v3] net: Fix for dst_negative_advice
From: Krishna Kumar2 @ 2009-10-19 4:34 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dada1, davem, herbert, netdev
Stephen Hemminger <shemminger@vyatta.com> wrote on 10/19/2009 09:42:09 AM:
> > diff -ruNp org/net/ipv4/tcp_timer.c new/net/ipv4/tcp_timer.c
> > --- org/net/ipv4/tcp_timer.c 2009-10-16 21:30:56.000000000 +0530
> > +++ new/net/ipv4/tcp_timer.c 2009-10-16 21:31:30.000000000 +0530
> > @@ -141,14 +141,14 @@ static int tcp_write_timeout(struct sock
> >
> > if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
> > if (icsk->icsk_retransmits)
> > - dst_negative_advice(&sk->sk_dst_cache);
> > + dst_negative_advice(&sk->sk_dst_cache, sk);
> > retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
> > } else {
> > if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
> > /* Black hole detection */
> > tcp_mtu_probing(icsk, sk);
> >
> > - dst_negative_advice(&sk->sk_dst_cache);
> > + dst_negative_advice(&sk->sk_dst_cache, sk);
> > }
> >
> > retry_until = sysctl_tcp_retries2;
>
> It is good that your patch is broken in pieces, but will the intermediate
> patches still function correctly. I.e are they bisect safe?
I have only compile-tested each patch individually, so I assume an
intermediate state could break something.
Individual patches can be made to function correctly by moving patch #2
to the end (as patch #4) and moving patches #3 and #4 ahead of it.
Should I resubmit with the changed order?
Thanks,
- KK
* Re: [PATCH 1/4 v3] net: Introduce sk_tx_queue_mapping
From: Eric Dumazet @ 2009-10-19 16:45 UTC (permalink / raw)
To: Krishna Kumar; +Cc: davem, netdev, herbert
Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> Introduce sk_tx_queue_mapping; and functions that set, test and
> get this value. Reset sk_tx_queue_mapping to -1 whenever the dst
> cache is set/reset, and in socket alloc. Setting txq to -1 and
> using valid txq=<0 to n-1> allows the tx path to use the value
> of sk_tx_queue_mapping directly instead of subtracting 1 on every
> tx.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 4/4 v3] net: Fix for dst_negative_advice
From: David Miller @ 2009-10-20 4:17 UTC (permalink / raw)
To: krkumar2; +Cc: shemminger, dada1, herbert, netdev
From: Krishna Kumar2 <krkumar2@in.ibm.com>
Date: Mon, 19 Oct 2009 10:04:53 +0530
> Should I resubmit with the changed order?
I took care of this.
I put the patch that actually makes dev.c use sk_tx_queue_mapping
last in the set.
All applied, thanks everyone.