From: Eric Dumazet <edumazet@google.com> To: "David S . Miller" <davem@davemloft.net> Cc: netdev <netdev@vger.kernel.org>, Eric Dumazet <edumazet@google.com>, Willem de Bruijn <willemb@google.com>, Feng Tang <feng.tang@intel.com>, Eric Dumazet <eric.dumazet@gmail.com> Subject: [PATCH net 3/4] tcp: add tcp_tx_skb_cache sysctl Date: Fri, 14 Jun 2019 16:22:20 -0700 Message-ID: <20190614232221.248392-4-edumazet@google.com> (raw) In-Reply-To: <20190614232221.248392-1-edumazet@google.com> Feng Tang reported a performance regression after introduction of per TCP socket tx/rx caches, for TCP over loopback (netperf) There is high chance the regression is caused by a change on how well the 32 KB per-thread page (current->task_frag) can be recycled, and lack of pcp caches for order-3 pages. I could not reproduce the regression myself, cpus all being spinning on the mm spinlocks for page allocs/freeing, regardless of enabling or disabling the per tcp socket caches. It seems best to disable the feature by default, and let admins enabling it. MM layer either needs to provide scalable order-3 pages allocations, or could attempt a trylock on zone->lock if the caller only attempts to get a high-order page and is able to fallback to order-0 ones in case of pressure. Tests run on a 56 cores host (112 hyper threads) - 35.49% netperf [kernel.vmlinux] [k] queued_spin_lock_slowpath - 35.49% queued_spin_lock_slowpath - 18.18% get_page_from_freelist - __alloc_pages_nodemask - 18.18% alloc_pages_current skb_page_frag_refill sk_page_frag_refill tcp_sendmsg_locked tcp_sendmsg inet_sendmsg sock_sendmsg __sys_sendto __x64_sys_sendto do_syscall_64 entry_SYSCALL_64_after_hwframe __libc_send + 17.31% __free_pages_ok + 31.43% swapper [kernel.vmlinux] [k] intel_idle + 9.12% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string + 6.53% netserver [kernel.vmlinux] [k] copy_user_enhanced_fast_string + 0.69% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath + 0.68% netperf [kernel.vmlinux] [k] skb_release_data + 0.52% netperf [kernel.vmlinux] [k] tcp_sendmsg_locked 0.46% netperf [kernel.vmlinux] [k] _raw_spin_lock_irqsave Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Feng Tang <feng.tang@intel.com> --- include/net/sock.h | 4 +++- net/ipv4/sysctl_net_ipv4.c | 8 ++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/include/net/sock.h b/include/net/sock.h index b02645e2dfad722769c1455bcde76e46da9fc5ac..7d7f4ce63bb2aae7c87a9445d11339b6e6b19724 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1463,12 +1463,14 @@ static inline void sk_mem_uncharge(struct sock *sk, int size) __sk_mem_reclaim(sk, 1 << 20); } +DECLARE_STATIC_KEY_FALSE(tcp_tx_skb_cache_key); static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb) { sock_set_flag(sk, SOCK_QUEUE_SHRUNK); sk->sk_wmem_queued -= skb->truesize; sk_mem_uncharge(sk, skb->truesize); - if (!sk->sk_tx_skb_cache && !skb_cloned(skb)) { + if (static_branch_unlikely(&tcp_tx_skb_cache_key) && + !sk->sk_tx_skb_cache && !skb_cloned(skb)) { skb_zcopy_clear(skb, true); sk->sk_tx_skb_cache = skb; return; diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 886b58d31351df44725bdc34081e798bcb89ecf0..08a428a7b2749c4f2a03aa6352e44c053596ef75 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -54,6 +54,8 @@ static int one_day_secs = 24 * 3600; DEFINE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key); EXPORT_SYMBOL(tcp_rx_skb_cache_key); +DEFINE_STATIC_KEY_FALSE(tcp_tx_skb_cache_key); + /* obsolete */ static int sysctl_tcp_low_latency __read_mostly; @@ -568,6 +570,12 @@ static struct ctl_table ipv4_table[] = { .mode = 0644, .proc_handler = proc_do_static_key, }, + { + .procname = "tcp_tx_skb_cache", + .data = &tcp_tx_skb_cache_key.key, + .mode = 0644, + .proc_handler = proc_do_static_key, + }, { } }; -- 2.22.0.410.gd8fdbe21b5-goog
next prev parent reply index Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-06-14 23:22 [PATCH net 0/4] tcp: add three static keys Eric Dumazet 2019-06-14 23:22 ` [PATCH net 1/4] sysctl: define proc_do_static_key() Eric Dumazet 2019-06-14 23:45 ` Alexei Starovoitov 2019-06-14 23:55 ` Eric Dumazet 2019-06-15 0:04 ` Alexei Starovoitov 2019-06-14 23:22 ` [PATCH net 2/4] tcp: add tcp_rx_skb_cache sysctl Eric Dumazet 2019-06-16 7:38 ` Feng Tang 2019-06-14 23:22 ` Eric Dumazet [this message] 2019-06-16 7:42 ` [PATCH net 3/4] tcp: add tcp_tx_skb_cache sysctl Feng Tang 2019-06-14 23:22 ` [PATCH net 4/4] net: add high_order_alloc_disable sysctl/static key Eric Dumazet 2019-06-15 3:18 ` [PATCH net 0/4] tcp: add three static keys David Miller
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190614232221.248392-4-edumazet@google.com \ --to=edumazet@google.com \ --cc=davem@davemloft.net \ --cc=eric.dumazet@gmail.com \ --cc=feng.tang@intel.com \ --cc=netdev@vger.kernel.org \ --cc=willemb@google.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Netdev Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \ netdev@vger.kernel.org public-inbox-index netdev Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.netdev AGPL code for this site: git clone https://public-inbox.org/public-inbox.git