* [bisected] e341694e3eb5 netlink_lookup() rcu conversion causes latencies

From: Heiko Carstens
Date: 2014-10-11 8:36 UTC
To: Thomas Graf, Nikolay Aleksandrov, David S. Miller
Cc: netdev, linux-kernel, Ursula Braun

Hi all,

it just came to my attention that commit e341694e3eb5
"netlink: Convert netlink_lookup() to use RCU protected hash table"
causes network latencies for me on s390.

The testcase is quite simple and 100% reproducible on s390:

Simply log in via ssh to a remote system which has the above mentioned
patch applied. Any action like pressing return now has significant
latencies. Or in other words, working via such a connection becomes
a pain ;)

I haven't debugged it, however I assume the problem is that a) the
commit introduces a synchronize_net() call and b) s390 kernels
usually get compiled with CONFIG_HZ_100 while most other architectures
use CONFIG_HZ_1000.
If I change the kernel config to CONFIG_HZ_1000 the problem goes away,
however I don't consider this a fix...

Another reason why this hasn't been observed on x86 may or may not be
that we haven't implemented CONFIG_HAVE_CONTEXT_TRACKING on s390 (yet).
But that's just guessing...
* Re: [bisected] e341694e3eb5 netlink_lookup() rcu conversion causes latencies

From: Eric Dumazet
Date: 2014-10-11 19:32 UTC
To: Heiko Carstens, Sasha Levin, paulmck
Cc: Thomas Graf, Nikolay Aleksandrov, David S. Miller, netdev, linux-kernel, Ursula Braun

On Sat, 2014-10-11 at 10:36 +0200, Heiko Carstens wrote:
> Hi all,
>
> it just came to my attention that commit e341694e3eb5
> "netlink: Convert netlink_lookup() to use RCU protected hash table"
> causes network latencies for me on s390.
>
> The testcase is quite simple and 100% reproducible on s390:
>
> Simply login via ssh to a remote system which has the above mentioned
> patch applied. Any action like pressing return now has significant
> latencies. Or in other words, working via such a connection becomes
> a pain ;)
>
> I haven't debugged it, however I assume the problem is that a) the
> commit introduces a synchronize_net() call und b) s390 kernels
> usually get compiled with CONFIG_HZ_100 while most other architectures
> use CONFIG_HZ_1000.
> If I change the kernel config to CONFIG_HZ_1000 the problem goes away,
> however I don't consider this a fix...
>
> Another reason why this hasn't been observed on x86 may or may not be
> that we haven't implemented CONFIG_HAVE_CONTEXT_TRACKING on s390 (yet).
> But that's just guessing...

CC Paul and Sasha
* Re: [bisected] e341694e3eb5 netlink_lookup() rcu conversion causes latencies

From: Thomas Graf
Date: 2014-10-11 22:25 UTC
To: Eric Dumazet
Cc: Heiko Carstens, Sasha Levin, paulmck, Nikolay Aleksandrov, David S. Miller, netdev, linux-kernel, Ursula Braun

On 10/11/14 at 12:32pm, Eric Dumazet wrote:
> On Sat, 2014-10-11 at 10:36 +0200, Heiko Carstens wrote:
> > Hi all,
> >
> > it just came to my attention that commit e341694e3eb5
> > "netlink: Convert netlink_lookup() to use RCU protected hash table"
> > causes network latencies for me on s390.
> >
> > The testcase is quite simple and 100% reproducible on s390:
> >
> > Simply login via ssh to a remote system which has the above mentioned
> > patch applied. Any action like pressing return now has significant
> > latencies. Or in other words, working via such a connection becomes
> > a pain ;)
> >
> > I haven't debugged it, however I assume the problem is that a) the
> > commit introduces a synchronize_net() call und b) s390 kernels
> > usually get compiled with CONFIG_HZ_100 while most other architectures
> > use CONFIG_HZ_1000.
> > If I change the kernel config to CONFIG_HZ_1000 the problem goes away,
> > however I don't consider this a fix...
> >
> > Another reason why this hasn't been observed on x86 may or may not be
> > that we haven't implemented CONFIG_HAVE_CONTEXT_TRACKING on s390 (yet).
> > But that's just guessing...
>
> CC Paul and Sasha

I think the issue here is obvious and a fix is on the way to move
the insertion and removal to a worker to no longer require the
synchronize_rcu().

What bothers me is that the synchronize_rcu() should only occur
on expand/shrink and not for every table update. The default table
size is 64.
* Re: [bisected] e341694e3eb5 netlink_lookup() rcu conversion causes latencies

From: David Miller
Date: 2014-10-11 23:08 UTC
To: tgraf
Cc: eric.dumazet, heiko.carstens, sasha.levin, paulmck, nikolay, netdev, linux-kernel, braunu

From: Thomas Graf <tgraf@suug.ch>
Date: Sat, 11 Oct 2014 23:25:14 +0100

> I think the issue here is obvious and a fix is on the way to move
> the insertion and removal to a worker to no longer require the
> synchronize_rcu().
>
> What bothers me is that the synchronize_rcu() should only occur
> on expand/shrink and not for every table update. The default table
> size is 64.

Not true, every netlink socket release incurs a synchronize_net()
now, because we added such a call to netlink_release().

I specifically brought this up as a possible problem when the changes
went in...
* Re: [bisected] e341694e3eb5 netlink_lookup() rcu conversion causes latencies

From: Heiko Carstens
Date: 2014-10-20 8:21 UTC
To: Thomas Graf
Cc: Eric Dumazet, Sasha Levin, paulmck, Nikolay Aleksandrov, David S. Miller, netdev, linux-kernel, Ursula Braun

On Sat, Oct 11, 2014 at 11:25:14PM +0100, Thomas Graf wrote:
> On 10/11/14 at 12:32pm, Eric Dumazet wrote:
> > On Sat, 2014-10-11 at 10:36 +0200, Heiko Carstens wrote:
> > > Hi all,
> > >
> > > it just came to my attention that commit e341694e3eb5
> > > "netlink: Convert netlink_lookup() to use RCU protected hash table"
> > > causes network latencies for me on s390.
> > >
> > > The testcase is quite simple and 100% reproducible on s390:
> > >
> > > Simply login via ssh to a remote system which has the above mentioned
> > > patch applied. Any action like pressing return now has significant
> > > latencies. Or in other words, working via such a connection becomes
> > > a pain ;)
> > >
> > > I haven't debugged it, however I assume the problem is that a) the
> > > commit introduces a synchronize_net() call und b) s390 kernels
> > > usually get compiled with CONFIG_HZ_100 while most other architectures
> > > use CONFIG_HZ_1000.
> > > If I change the kernel config to CONFIG_HZ_1000 the problem goes away,
> > > however I don't consider this a fix...
> > >
> > > Another reason why this hasn't been observed on x86 may or may not be
> > > that we haven't implemented CONFIG_HAVE_CONTEXT_TRACKING on s390 (yet).
> > > But that's just guessing...
> >
> > CC Paul and Sasha
>
> I think the issue here is obvious and a fix is on the way to move
> the insertion and removal to a worker to no longer require the
> synchronize_rcu().
>
> What bothers me is that the synchronize_rcu() should only occur
> on expand/shrink and not for every table update. The default table
> size is 64.

*ping* ... is there already any patch available?
* [PATCH] netlink: Re-add locking to netlink_lookup() and seq walker

From: Thomas Graf
Date: 2014-10-20 19:53 UTC
To: Heiko Carstens
Cc: Eric Dumazet, Sasha Levin, paulmck, Nikolay Aleksandrov, David S. Miller, netdev, linux-kernel, Ursula Braun

Heiko,

Can you test the following patch:

The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 net/netlink/af_netlink.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 7a186e7..f1de72d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -96,6 +96,14 @@ static DECLARE_WAIT_QUEUE_HEAD(nl_table_wait);
 static int netlink_dump(struct sock *sk);
 static void netlink_skb_destructor(struct sk_buff *skb);
 
+/* nl_table locking explained:
+ * Lookup and traversal are protected with nl_sk_hash_lock or nl_table_lock
+ * combined with an RCU read-side lock. Insertion and removal are protected
+ * with nl_sk_hash_lock while using RCU list modification primitives and may
+ * run in parallel to nl_table_lock protected lookups. Destruction of the
+ * Netlink socket may only occur *after* nl_table_lock has been acquired
+ * either during or after the socket has been removed from the list.
+ */
 DEFINE_RWLOCK(nl_table_lock);
 EXPORT_SYMBOL_GPL(nl_table_lock);
 static atomic_t nl_table_users = ATOMIC_INIT(0);
@@ -109,10 +117,10 @@ EXPORT_SYMBOL_GPL(nl_sk_hash_lock);
 static int lockdep_nl_sk_hash_is_held(void)
 {
 #ifdef CONFIG_LOCKDEP
-	return (debug_locks) ? lockdep_is_held(&nl_sk_hash_lock) : 1;
-#else
-	return 1;
+	if (debug_locks)
+		return lockdep_is_held(&nl_sk_hash_lock) || lockdep_is_held(&nl_table_lock);
 #endif
+	return 1;
 }
 
 static ATOMIC_NOTIFIER_HEAD(netlink_chain);
@@ -1028,11 +1036,13 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
 	struct netlink_table *table = &nl_table[protocol];
 	struct sock *sk;
 
+	read_lock(&nl_table_lock);
 	rcu_read_lock();
 	sk = __netlink_lookup(table, portid, net);
 	if (sk)
 		sock_hold(sk);
 	rcu_read_unlock();
+	read_unlock(&nl_table_lock);
 
 	return sk;
 }
@@ -1257,9 +1267,6 @@ static int netlink_release(struct socket *sock)
 	}
 	netlink_table_ungrab();
 
-	/* Wait for readers to complete */
-	synchronize_net();
-
 	kfree(nlk->groups);
 	nlk->groups = NULL;
 
@@ -1281,6 +1288,7 @@ static int netlink_autobind(struct socket *sock)
 
 retry:
 	cond_resched();
+	netlink_table_grab();
 	rcu_read_lock();
 	if (__netlink_lookup(table, portid, net)) {
 		/* Bind collision, search negative portid values. */
@@ -1288,9 +1296,11 @@ retry:
 		if (rover > -4097)
 			rover = -4097;
 		rcu_read_unlock();
+		netlink_table_ungrab();
 		goto retry;
 	}
 	rcu_read_unlock();
+	netlink_table_ungrab();
 
 	err = netlink_insert(sk, net, portid);
 	if (err == -EADDRINUSE)
@@ -2921,14 +2931,16 @@ static struct sock *netlink_seq_socket_idx(struct seq_file *seq, loff_t pos)
 }
 
 static void *netlink_seq_start(struct seq_file *seq, loff_t *pos)
-	__acquires(RCU)
+	__acquires(nl_table_lock) __acquires(RCU)
 {
+	read_lock(&nl_table_lock);
 	rcu_read_lock();
 	return *pos ? netlink_seq_socket_idx(seq, *pos - 1) : SEQ_START_TOKEN;
 }
 
 static void *netlink_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
+	struct rhashtable *ht;
 	struct netlink_sock *nlk;
 	struct nl_seq_iter *iter;
 	struct net *net;
@@ -2943,19 +2955,19 @@ static void *netlink_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	iter = seq->private;
 	nlk = v;
 
-	rht_for_each_entry_rcu(nlk, nlk->node.next, node)
+	i = iter->link;
+	ht = &nl_table[i].hash;
+	rht_for_each_entry(nlk, nlk->node.next, ht, node)
 		if (net_eq(sock_net((struct sock *)nlk), net))
 			return nlk;
 
-	i = iter->link;
 	j = iter->hash_idx + 1;
 
 	do {
-		struct rhashtable *ht = &nl_table[i].hash;
 		const struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
 
 		for (; j < tbl->size; j++) {
-			rht_for_each_entry_rcu(nlk, tbl->buckets[j], node) {
+			rht_for_each_entry(nlk, tbl->buckets[j], ht, node) {
 				if (net_eq(sock_net((struct sock *)nlk), net)) {
 					iter->link = i;
 					iter->hash_idx = j;
@@ -2971,9 +2983,10 @@ static void *netlink_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 }
 
 static void netlink_seq_stop(struct seq_file *seq, void *v)
-	__releases(RCU)
+	__releases(RCU) __releases(nl_table_lock)
 {
 	rcu_read_unlock();
+	read_unlock(&nl_table_lock);
 }
-- 
1.9.3
* Re: [PATCH] netlink: Re-add locking to netlink_lookup() and seq walker

From: Heiko Carstens
Date: 2014-10-21 7:04 UTC
To: Thomas Graf
Cc: Eric Dumazet, Sasha Levin, paulmck, Nikolay Aleksandrov, David S. Miller, netdev, linux-kernel, Ursula Braun

On Mon, Oct 20, 2014 at 08:53:55PM +0100, Thomas Graf wrote:
> Heiko,
>
> Can you test the following patch:
>
> The synchronize_rcu() in netlink_release() introduces unacceptable
> latency. Reintroduce minimal lookup so we can drop the
> synchronize_rcu() until socket destruction has been RCUfied.
>
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
> ---
>  net/netlink/af_netlink.c | 37 +++++++++++++++++++++++++------------
>  1 file changed, 25 insertions(+), 12 deletions(-)

Thanks a lot! Your patch fixes the issue for me.

Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>