* [PATCH RT] net: move xmit_recursion to per-task variable on -RT
@ 2016-01-13 15:23 Sebastian Andrzej Siewior
  2016-01-13 17:31 ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-01-13 15:23 UTC (permalink / raw)
  To: linux-rt-users; +Cc: linux-kernel, tglx, Steven Rostedt, netdev

A softirq on -RT can be preempted. That means one task can be in
__dev_queue_xmit(), get preempted, and another task may enter
__dev_queue_xmit() as well. netperf together with a bridge device
will then trigger the `recursion alert` because each task increments
the xmit_recursion variable, which is per-CPU.
A virtual device like br0 is required to trigger this warning.

This patch moves the counter from per-CPU to per-task so that the
recursion is counted properly on -RT.
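The per-task counting idea can be sketched in a minimal userspace analogue (an illustration only, not kernel code; `xmit_one` and the thread-local counter are this sketch's own names, standing in for `current->xmit_recursion` and the guarded transmit path):

```c
#include <assert.h>

#define RECURSION_LIMIT 10

/* Per-task counter, analogous to current->xmit_recursion on -RT.
 * Because each task owns its counter, a preempted task cannot push
 * another task's count over the limit. */
static _Thread_local int xmit_recursion;

/* Returns 0 on success, -1 when the recursion guard trips. */
static int xmit_one(int depth)
{
	int rc = 0;

	if (xmit_recursion > RECURSION_LIMIT)
		return -1;		/* the "recursion alert" case */

	xmit_recursion++;		/* plain increment: no other task sees it */
	if (depth > 0)
		rc = xmit_one(depth - 1);	/* e.g. a bridge re-entering xmit */
	xmit_recursion--;
	return rc;
}
```

A shallow stack of re-entries succeeds, while exceeding RECURSION_LIMIT returns the alert case, regardless of what other tasks are doing concurrently.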

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/netdevice.h |  9 +++++++++
 include/linux/sched.h     |  3 +++
 net/core/dev.c            | 41 ++++++++++++++++++++++++++++++++++++++---
 3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f14e39cb897c..4a8d3429dc12 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2249,11 +2249,20 @@ void netdev_freemem(struct net_device *dev);
 void synchronize_net(void);
 int init_dummy_netdev(struct net_device *dev);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static inline int dev_recursion_level(void)
+{
+	return atomic_read(&current->xmit_recursion);
+}
+
+#else
+
 DECLARE_PER_CPU(int, xmit_recursion);
 static inline int dev_recursion_level(void)
 {
 	return this_cpu_read(xmit_recursion);
 }
+#endif
 
 struct net_device *dev_get_by_index(struct net *net, int ifindex);
 struct net_device *__dev_get_by_index(struct net *net, int ifindex);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04eb2f8bc274..5d36818107b0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1855,6 +1855,9 @@ struct task_struct {
 	pte_t kmap_pte[KM_TYPE_NR];
 # endif
 #endif
+#ifdef CONFIG_PREEMPT_RT_FULL
+	atomic_t xmit_recursion;
+#endif
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	unsigned long	task_state_change;
 #endif
diff --git a/net/core/dev.c b/net/core/dev.c
index ae4a67e7e654..1f6a7e9a22c4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2946,9 +2946,44 @@ static void skb_update_prio(struct sk_buff *skb)
 #define skb_update_prio(skb)
 #endif
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+
+static inline int xmit_rec_read(void)
+{
+	return atomic_read(&current->xmit_recursion);
+}
+
+static inline void xmit_rec_inc(void)
+{
+	atomic_inc(&current->xmit_recursion);
+}
+
+static inline void xmit_rec_dec(void)
+{
+	atomic_dec(&current->xmit_recursion);
+}
+
+#else
+
 DEFINE_PER_CPU(int, xmit_recursion);
 EXPORT_SYMBOL(xmit_recursion);
 
+static inline int xmit_rec_read(void)
+{
+	return __this_cpu_read(xmit_recursion);
+}
+
+static inline void xmit_rec_inc(void)
+{
+	__this_cpu_inc(xmit_recursion);
+}
+
+static inline void xmit_rec_dec(void)
+{
+	__this_cpu_dec(xmit_recursion);
+}
+#endif
+
 #define RECURSION_LIMIT 10
 
 /**
@@ -3141,7 +3176,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 
 		if (txq->xmit_lock_owner != cpu) {
 
-			if (__this_cpu_read(xmit_recursion) > RECURSION_LIMIT)
+			if (xmit_rec_read() > RECURSION_LIMIT)
 				goto recursion_alert;
 
 			skb = validate_xmit_skb(skb, dev);
@@ -3151,9 +3186,9 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 			HARD_TX_LOCK(dev, txq, cpu);
 
 			if (!netif_xmit_stopped(txq)) {
-				__this_cpu_inc(xmit_recursion);
+				xmit_rec_inc();
 				skb = dev_hard_start_xmit(skb, dev, txq, &rc);
-				__this_cpu_dec(xmit_recursion);
+				xmit_rec_dec();
 				if (dev_xmit_complete(rc)) {
 					HARD_TX_UNLOCK(dev, txq);
 					goto out;
-- 
2.7.0.rc3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-13 15:23 [PATCH RT] net: move xmit_recursion to per-task variable on -RT Sebastian Andrzej Siewior
@ 2016-01-13 17:31 ` Thomas Gleixner
  2016-01-14 14:50   ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-01-13 17:31 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-kernel, Steven Rostedt, netdev

On Wed, 13 Jan 2016, Sebastian Andrzej Siewior wrote:
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +static inline int dev_recursion_level(void)
> +{
> +	return atomic_read(&current->xmit_recursion);

Why would you need an atomic here? current hardly races against itself.

Thanks,

	tglx


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-13 17:31 ` Thomas Gleixner
@ 2016-01-14 14:50   ` Sebastian Andrzej Siewior
  2016-01-14 22:02     ` Hannes Frederic Sowa
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-01-14 14:50 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-rt-users, linux-kernel, Steven Rostedt, netdev

* Thomas Gleixner | 2016-01-13 18:31:46 [+0100]:

>On Wed, 13 Jan 2016, Sebastian Andrzej Siewior wrote:
>> +#ifdef CONFIG_PREEMPT_RT_FULL
>> +static inline int dev_recursion_level(void)
>> +{
>> +	return atomic_read(&current->xmit_recursion);
>
>Why would you need an atomic here? current hardly races against itself.

right.

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2249,11 +2249,20 @@ void netdev_freemem(struct net_device *d
 void synchronize_net(void);
 int init_dummy_netdev(struct net_device *dev);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static inline int dev_recursion_level(void)
+{
+	return current->xmit_recursion;
+}
+
+#else
+
 DECLARE_PER_CPU(int, xmit_recursion);
 static inline int dev_recursion_level(void)
 {
 	return this_cpu_read(xmit_recursion);
 }
+#endif
 
 struct net_device *dev_get_by_index(struct net *net, int ifindex);
 struct net_device *__dev_get_by_index(struct net *net, int ifindex);
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1851,6 +1851,9 @@ struct task_struct {
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	unsigned long	task_state_change;
 #endif
+#ifdef CONFIG_PREEMPT_RT_FULL
+	int xmit_recursion;
+#endif
 	int pagefault_disabled;
 /* CPU-specific state of this task */
 	struct thread_struct thread;
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2940,9 +2940,44 @@ static void skb_update_prio(struct sk_bu
 #define skb_update_prio(skb)
 #endif
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+
+static inline int xmit_rec_read(void)
+{
+	return current->xmit_recursion;
+}
+
+static inline void xmit_rec_inc(void)
+{
+	current->xmit_recursion++;
+}
+
+static inline void xmit_rec_dec(void)
+{
+	current->xmit_recursion--;
+}
+
+#else
+
 DEFINE_PER_CPU(int, xmit_recursion);
 EXPORT_SYMBOL(xmit_recursion);
 
+static inline int xmit_rec_read(void)
+{
+	return __this_cpu_read(xmit_recursion);
+}
+
+static inline void xmit_rec_inc(void)
+{
+	__this_cpu_inc(xmit_recursion);
+}
+
+static inline void xmit_rec_dec(void)
+{
+	__this_cpu_dec(xmit_recursion);
+}
+#endif
+
 #define RECURSION_LIMIT 10
 
 /**
@@ -3135,7 +3170,7 @@ static int __dev_queue_xmit(struct sk_bu
 
 		if (txq->xmit_lock_owner != cpu) {
 
-			if (__this_cpu_read(xmit_recursion) > RECURSION_LIMIT)
+			if (xmit_rec_read() > RECURSION_LIMIT)
 				goto recursion_alert;
 
 			skb = validate_xmit_skb(skb, dev);
@@ -3145,9 +3180,9 @@ static int __dev_queue_xmit(struct sk_bu
 			HARD_TX_LOCK(dev, txq, cpu);
 
 			if (!netif_xmit_stopped(txq)) {
-				__this_cpu_inc(xmit_recursion);
+				xmit_rec_inc();
 				skb = dev_hard_start_xmit(skb, dev, txq, &rc);
-				__this_cpu_dec(xmit_recursion);
+				xmit_rec_dec();
 				if (dev_xmit_complete(rc)) {
 					HARD_TX_UNLOCK(dev, txq);
 					goto out;

>Thanks,
>
>	tglx

Sebastian


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-14 14:50   ` Sebastian Andrzej Siewior
@ 2016-01-14 22:02     ` Hannes Frederic Sowa
  2016-01-14 22:20       ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-14 22:02 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: linux-rt-users, linux-kernel, Steven Rostedt, netdev

On 14.01.2016 15:50, Sebastian Andrzej Siewior wrote:
> * Thomas Gleixner | 2016-01-13 18:31:46 [+0100]:
>
>> On Wed, 13 Jan 2016, Sebastian Andrzej Siewior wrote:
>>> +#ifdef CONFIG_PREEMPT_RT_FULL
>>> +static inline int dev_recursion_level(void)
>>> +{
>>> +	return atomic_read(&current->xmit_recursion);
>>
>> Why would you need an atomic here? current hardly races against itself.
>
> right.

We are just adding a second recursion limit, solely for openvswitch, 
which has the same problem:

https://patchwork.ozlabs.org/patch/566769/

This time we also depend on rcu_read_lock() marking the section as 
non-preemptible. A more generic solution that doesn't require adding 
something to *current every time would be nice.

Thanks,
Hannes


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-14 22:02     ` Hannes Frederic Sowa
@ 2016-01-14 22:20       ` Eric Dumazet
  2016-01-14 23:00         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2016-01-14 22:20 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Sebastian Andrzej Siewior, Thomas Gleixner, linux-rt-users,
	linux-kernel, Steven Rostedt, netdev

On Thu, 2016-01-14 at 23:02 +0100, Hannes Frederic Sowa wrote:

> We are just adding a second recursion limit, solely for openvswitch, 
> which has the same problem:
> 
> https://patchwork.ozlabs.org/patch/566769/
> 
> This time we also depend on rcu_read_lock() marking the section as 
> non-preemptible. A more generic solution that doesn't require adding 
> something to *current every time would be nice.


Note that rcu_read_lock() does not imply that preemption is disabled.


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-14 22:20       ` Eric Dumazet
@ 2016-01-14 23:00         ` Hannes Frederic Sowa
  2016-01-15  8:21           ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-14 23:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sebastian Andrzej Siewior, Thomas Gleixner, linux-rt-users,
	linux-kernel, Steven Rostedt, netdev

On 14.01.2016 23:20, Eric Dumazet wrote:
> On Thu, 2016-01-14 at 23:02 +0100, Hannes Frederic Sowa wrote:
>
>> We are just adding a second recursion limit, solely for openvswitch,
>> which has the same problem:
>>
>> https://patchwork.ozlabs.org/patch/566769/
>>
>> This time we also depend on rcu_read_lock() marking the section as
>> non-preemptible. A more generic solution that doesn't require adding
>> something to *current every time would be nice.
>
>
> Note that rcu_read_lock() does not imply that preemption is disabled.

Exactly, it is conditional on CONFIG_PREEMPT_RCU/CONFIG_PREEMPT_COUNT, but 
I hadn't thought about exactly that at the moment.

I will resend this patch with better protection.

Thanks Eric!


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-14 23:00         ` Hannes Frederic Sowa
@ 2016-01-15  8:21           ` Thomas Gleixner
  2016-01-15  9:34             ` Hannes Frederic Sowa
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-01-15  8:21 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Eric Dumazet, Sebastian Andrzej Siewior, linux-rt-users,
	linux-kernel, Steven Rostedt, netdev

On Fri, 15 Jan 2016, Hannes Frederic Sowa wrote:
> On 14.01.2016 23:20, Eric Dumazet wrote:
> > On Thu, 2016-01-14 at 23:02 +0100, Hannes Frederic Sowa wrote:
> > 
> > > We are just adding a second recursion limit, solely for openvswitch,
> > > which has the same problem:
> > > 
> > > https://patchwork.ozlabs.org/patch/566769/
> > > 
> > > This time we also depend on rcu_read_lock() marking the section as
> > > non-preemptible. A more generic solution that doesn't require adding
> > > something to *current every time would be nice.
> > 
> > 
> > Note that rcu_read_lock() does not imply that preemption is disabled.
> 
> Exactly, it is conditional on CONFIG_PREEMPT_RCU/CONFIG_PREEMPT_COUNT, but
> I hadn't thought about exactly that at the moment.

Wrong. CONFIG_PREEMPT_RCU makes RCU preemptible.

If that is not set then it fiddles with preempt_count when
CONFIG_PREEMPT_COUNT=y. If CONFIG_PREEMPT_COUNT=n then you have a non
preemptible system anyway.

So you cannot assume that rcu_read_lock() disables preemption.
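The three cases can be illustrated with a small userspace model (an illustration for this thread only; `rcu_read_lock_model`, `task_model`, and the enum names are this sketch's own, not kernel APIs):

```c
#include <assert.h>

/* The three configurations described above, modeled in userspace. */
enum rcu_flavor {
	PREEMPT_RCU,		/* CONFIG_PREEMPT_RCU=y: preemptible RCU */
	TREE_RCU_PREEMPT_COUNT,	/* CONFIG_PREEMPT_RCU=n, CONFIG_PREEMPT_COUNT=y */
	TREE_RCU_NO_COUNT	/* CONFIG_PREEMPT_COUNT=n: kernel never preempts */
};

struct task_model {
	int rcu_read_lock_nesting;	/* bookkeeping used by preemptible RCU */
	int preempt_count;		/* bumped by preempt_disable() */
};

/* Model of rcu_read_lock(); returns 1 if the reader can still be preempted. */
static int rcu_read_lock_model(enum rcu_flavor flavor, struct task_model *t)
{
	switch (flavor) {
	case PREEMPT_RCU:
		t->rcu_read_lock_nesting++;	/* nesting count only */
		return 1;			/* reader remains preemptible */
	case TREE_RCU_PREEMPT_COUNT:
		t->preempt_count++;		/* maps to preempt_disable() */
		return 0;
	case TREE_RCU_NO_COUNT:
		return 0;			/* non-preemptible anyway */
	}
	return 0;
}
```

Only in the middle configuration does rcu_read_lock() actually disable preemption; in the first it merely counts nesting, so code relying on non-preemptibility inside a read-side section is broken there.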

Thanks,

	tglx


* Re: [PATCH RT] net: move xmit_recursion to per-task variable on -RT
  2016-01-15  8:21           ` Thomas Gleixner
@ 2016-01-15  9:34             ` Hannes Frederic Sowa
  0 siblings, 0 replies; 8+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-15  9:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Eric Dumazet, Sebastian Andrzej Siewior, linux-rt-users,
	linux-kernel, Steven Rostedt, netdev

On 15.01.2016 09:21, Thomas Gleixner wrote:
> On Fri, 15 Jan 2016, Hannes Frederic Sowa wrote:
>> On 14.01.2016 23:20, Eric Dumazet wrote:
>>> On Thu, 2016-01-14 at 23:02 +0100, Hannes Frederic Sowa wrote:
>>>
>>>> We are just adding a second recursion limit, solely for openvswitch,
>>>> which has the same problem:
>>>>
>>>> https://patchwork.ozlabs.org/patch/566769/
>>>>
>>>> This time we also depend on rcu_read_lock() marking the section as
>>>> non-preemptible. A more generic solution that doesn't require adding
>>>> something to *current every time would be nice.
>>>
>>>
>>> Note that rcu_read_lock() does not imply that preemption is disabled.
>>
>> Exactly, it is conditional on CONFIG_PREEMPT_CPU/CONFIG_PREMPT_COUNT but
>> haven't thought about exactly that in this moment.
>
> Wrong. CONFIG_PREEMPT_RCU makes RCU preemptible.
>
> If that is not set then it fiddles with preempt_count when
> CONFIG_PREEMPT_COUNT=y. If CONFIG_PREEMPT_COUNT=n then you have a non
> preemptible system anyway.
>
> So you cannot assume that rcu_read_lock() disables preemption.

Sorry if I wrote it misleadingly, but that is exactly what I wanted to 
say here. Yes, I agree; I didn't really check because of the _bh and 
rcu_read_lock() combination. This was a mistake. ;)

I already send out an updated patch with added preemption guards.

Thanks,
Hannes

