linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org,
	linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Carsten Emde <C.Emde@osadl.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	John Kacur <jkacur@redhat.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	<stable-rt@vger.kernel.org>, Nicholas Mc Guire <der.herr@hofr.at>
Subject: [PATCH RT 11/14] net: ip_send_unicast_reply: add missing local serialization
Date: Fri, 28 Feb 2014 22:52:20 -0500	[thread overview]
Message-ID: <20140301035237.424335985@goodmis.org> (raw)
In-Reply-To: 20140301035209.031474616@goodmis.org

[-- Attachment #1: 0011-net-ip_send_unicast_reply-add-missing-local-serializ.patch --]
[-- Type: text/plain, Size: 3660 bytes --]

3.10.32-rt31-rc2 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Nicholas Mc Guire <der.herr@hofr.at>

in response to the oops in ip_output.c:ip_send_unicast_reply under high
network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen
<Sami.Pietikainen@wapice.com>, this patch adds local serialization in
ip_send_unicast_reply.

from ip_output.c:
/*
 *      Generic function to send a packet as reply to another packet.
 *      Used to send some TCP resets/acks so far.
 *
 *      Use a fake percpu inet socket to avoid false sharing and contention.
 */
static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
...

which was added in commit be9f4a44 in linux-stable. The git log, wich
introduced the PER_CPU unicast_sock, states:
<snip>
commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Jul 19 07:34:03 2012 +0000

    ipv4: tcp: remove per net tcp_sock

    tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
    per network namespace.

    This leads to bad behavior on multiqueue NICS, because many cpus
    contend for the socket lock and once socket lock is acquired, extra
    false sharing on various socket fields slow down the operations.

    To better resist to attacks, we use a percpu socket. Each cpu can
    run without contention, using appropriate memory (local node)
<snip>

The per-cpu here thus is assuming exclusivity serializing per cpu - so
the use of get_cpu_ligh introduced in
net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the
preempt_disable in favor of a migrate_disable is probably wrong as this
only handles the referencial consistency but not the serialization. To
evade a preempt_disable here a local lock would be needed.

Therapie:
 * add local lock:
 * and re-introduce local serialization:

Tested on x86 with high network load using the testcase from Sami Pietikainen
  while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done

Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html
Cc: stable-rt@vger.kernel.org
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 net/ipv4/ip_output.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2d69b1b..ab674e8 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -79,6 +79,7 @@
 #include <linux/mroute.h>
 #include <linux/netlink.h>
 #include <linux/tcp.h>
+#include <linux/locallock.h>
 
 int sysctl_ip_default_ttl __read_mostly = IPDEFTTL;
 EXPORT_SYMBOL(sysctl_ip_default_ttl);
@@ -1471,6 +1472,9 @@ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
 	.uc_ttl		= -1,
 };
 
+/* serialize concurrent calls on the same CPU to ip_send_unicast_reply */
+static DEFINE_LOCAL_IRQ_LOCK(unicast_lock);
+
 void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 			   __be32 saddr, const struct ip_reply_arg *arg,
 			   unsigned int len)
@@ -1508,8 +1512,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 	if (IS_ERR(rt))
 		return;
 
-	get_cpu_light();
-	inet = &__get_cpu_var(unicast_sock);
+	inet = &get_locked_var(unicast_lock, unicast_sock);
 
 	inet->tos = arg->tos;
 	sk = &inet->sk;
@@ -1533,7 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 		ip_push_pending_frames(sk, &fl4);
 	}
 
-	put_cpu_light();
+	put_locked_var(unicast_lock, unicast_sock);
 
 	ip_rt_put(rt);
 }
-- 
1.8.5.3



  parent reply	other threads:[~2014-03-01  3:53 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-01  3:52 [PATCH RT 00/14] Linux 3.10.32-rt31-rc2 Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 01/14] rcu: Dont activate RCU core on NO_HZ_FULL CPUs Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 02/14] timers: do not raise softirq unconditionally Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 03/14] timer: Raise softirq if theres irq_work Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 04/14] timer/rt: Always raise the softirq if theres irq_work to be done Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 05/14] rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs() Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 06/14] Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT" Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 07/14] rt: Make cpu_chill() use hrtimer instead of msleep() Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 08/14] kernel/hrtimer: be non-freezeable in cpu_chill() Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 09/14] irq_work: allow certain work in hard irq context Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 10/14] arm/unwind: use a raw_spin_lock Steven Rostedt
2014-03-01  3:52 ` Steven Rostedt [this message]
2014-03-01  3:52 ` [PATCH RT 12/14] leds: trigger: disable CPU trigger on -RT Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 13/14] rcu: Eliminate softirq processing from rcutree Steven Rostedt
2014-03-01  3:52 ` [PATCH RT 14/14] Linux 3.10.32-rt31-rc2 Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140301035237.424335985@goodmis.org \
    --to=rostedt@goodmis.org \
    --cc=C.Emde@osadl.org \
    --cc=bigeasy@linutronix.de \
    --cc=der.herr@hofr.at \
    --cc=jkacur@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=paul.gortmaker@windriver.com \
    --cc=stable-rt@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).