* [PATCH net] net: reduce RECURSION_LIMIT to 8
@ 2016-01-07 17:40 Hannes Frederic Sowa
  2016-01-10 22:59 ` David Miller
  2016-01-11  6:38 ` pravin shelar
  0 siblings, 2 replies; 7+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-07 17:40 UTC
  To: netdev; +Cc: David S. Miller, Eric Dumazet

When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
This limit was later raised to 10 by DaveM. Nowadays it is observed that
configuration errors in openvswitch cause the STACK_END_MAGIC to be
overwritten shortly after 9 recursions.

This patch tries to be conservative and reduces the limit to 8 without
further measurements. It seems ovs uses the stack more than other parts
of the networking stack - I couldn't bring the system down with a non-ovs
tunneling setup.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
I don't do crazy run-time estimation of the stack size for one recursion
and try automatically to come up with a limit per arch or kconfig
settings, as I assume that all systems should behave the same regarding
the recursion maximum. All configurations should run on all kinds of
systems. I consider 8 recursions to be plenty enough for the time being.
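
As a rough userspace illustration of the kind of guard this limit feeds
(a sketch, not the code in net/core/dev.c): a per-thread counter stands in
for the per-CPU xmit_recursion, and sketch_xmit() is a hypothetical
transmit path in which every nested device transmits again. The
comparison used here (counter greater than the limit) is an assumption
about how the check is written.

```c
#include <assert.h>

#define RECURSION_LIMIT 8	/* value proposed by the patch */

/* Userspace stand-in for the per-CPU xmit_recursion counter. */
static _Thread_local int xmit_recursion;

/*
 * Hypothetical transmit path: each nested device calls back into the
 * transmit function. Returns 0 on success, -1 when the guard trips.
 */
static int sketch_xmit(int nested_devices)
{
	int ret = 0;

	/* Refuse to nest any deeper once the counter exceeds the limit. */
	if (xmit_recursion > RECURSION_LIMIT)
		return -1;

	xmit_recursion++;
	if (nested_devices > 0)
		ret = sketch_xmit(nested_devices - 1);
	xmit_recursion--;

	/* The counter must be balanced on every exit path. */
	assert(xmit_recursion >= 0);
	return ret;
}
```

With this comparison, nine nested entries still succeed and the tenth is
rejected; the counter is back at zero once the outermost call returns.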

 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index ae00b894e67555..d93da7df84325d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2941,7 +2941,7 @@ static void skb_update_prio(struct sk_buff *skb)
 DEFINE_PER_CPU(int, xmit_recursion);
 EXPORT_SYMBOL(xmit_recursion);
 
-#define RECURSION_LIMIT 10
+#define RECURSION_LIMIT 8
 
 /**
  *	dev_loopback_xmit - loop back @skb
-- 
2.5.0


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-07 17:40 [PATCH net] net: reduce RECURSION_LIMIT to 8 Hannes Frederic Sowa
@ 2016-01-10 22:59 ` David Miller
  2016-01-11  6:38 ` pravin shelar
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2016-01-10 22:59 UTC
  To: hannes; +Cc: netdev, edumazet

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu,  7 Jan 2016 18:40:53 +0100

> This patch tries to be conservative and reduces the limit to 8
> without further measurements. It seems ovs uses the stack more than
> other parts of the networking stack - I couldn't bring the system
> down with a non-ovs tunneling setup.

Can we figure out why OVS sucks so much wrt. stack usage instead?

I'd rather not paper over something like this, especially when it's
OVS which I'm completely not in the mood to specially cater for in
any way, shape, or form.


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-07 17:40 [PATCH net] net: reduce RECURSION_LIMIT to 8 Hannes Frederic Sowa
  2016-01-10 22:59 ` David Miller
@ 2016-01-11  6:38 ` pravin shelar
  2016-01-11 12:24   ` Hannes Frederic Sowa
  1 sibling, 1 reply; 7+ messages in thread
From: pravin shelar @ 2016-01-11  6:38 UTC
  To: Hannes Frederic Sowa
  Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
> This limit was later raised to 10 by DaveM. Nowadays it is observed that
> configuration errors in openvswitch cause the STACK_END_MAGIC to be
> overwritten shortly after 9 recursions.
>
The major user of stack space in OVS is the sw_flow_key in
ovs_vport_receive(). With recent features like IPv6 tunnel support we
have increased the size of the flow-key, which could have caused the
stack overflow to occur sooner.
One way to avoid using the stack in subsequent recursive calls is to use
per-cpu storage for the sw_flow_key object. This is already done for
OVS recursive actions, so we can expand on that facility.


> This patch tries to be conservative and reduces the limit to 8 without
> further measurements. It seems ovs uses the stack more than other parts
> of the networking stack - I couldn't bring the system down with a non-ovs
> tunneling setup.
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> I don't do crazy run-time estimation of the stack size for one recursion
> and try automatically to come up with a limit per arch or kconfig
> settings, as I assume that all systems should behave the same regarding
> the recursion maximum. All configurations should run on all kinds of
> systems. I consider 8 recursions to be plenty enough for the time being.
>
>  net/core/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ae00b894e67555..d93da7df84325d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2941,7 +2941,7 @@ static void skb_update_prio(struct sk_buff *skb)
>  DEFINE_PER_CPU(int, xmit_recursion);
>  EXPORT_SYMBOL(xmit_recursion);
>
> -#define RECURSION_LIMIT 10
> +#define RECURSION_LIMIT 8
>
>  /**
>   *     dev_loopback_xmit - loop back @skb
> --
> 2.5.0
>


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-11  6:38 ` pravin shelar
@ 2016-01-11 12:24   ` Hannes Frederic Sowa
  2016-01-12  0:36     ` pravin shelar
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-11 12:24 UTC
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On 11.01.2016 07:38, pravin shelar wrote:
> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>> overwritten shortly after 9 recursions.
>>
> Major user of stack space in OVS is sw_flow_key in
> ovs_vport_receive(). With recent features like IPv6 tunnel support we
> have increased the size of the flow-key which could have caused the
> stack overflow sooner.
> One way to avoid using stack in subsequent recursive call is to use
> per-cpu storage for the sw_flow_key object. This is already done for
> OVS recursive actions, so we can expand on that facility.

Hmmm. This already came up. I think the difficulty is that 
ovs_vport_receive can be called from actions again with skb_cloned skb 
before the original's skb callstack is actually finished. Data in the 
percpu area would be overwritten while still being used. It would need 
some more logic IMHO.
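
The overwrite can be shown with a small userspace sketch. Everything in
it is hypothetical: key_sketch is a one-field placeholder for
sw_flow_key, a thread-local variable stands in for the per-CPU slot, and
receive_shared() re-enters itself the way a nested receive would.

```c
#include <assert.h>

/* One-field placeholder for a flow key. */
struct key_sketch {
	int in_port;
};

/* A single shared slot, standing in for one per-CPU key. */
static _Thread_local struct key_sketch shared_key;

/*
 * Fills the shared key, optionally re-enters, then reads the key back.
 * The return value is what the outer caller sees after the nested call.
 */
static int receive_shared(int port, int nested)
{
	struct key_sketch *key = &shared_key;

	assert(nested >= 0);
	key->in_port = port;

	/* The nested call writes to the same slot the outer call still uses. */
	if (nested > 0)
		receive_shared(port + 1, nested - 1);

	return key->in_port;
}
```

Without nesting, the caller reads back its own port. With one nested
call, it reads back the inner call's port instead, which is the
clobbering described above.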

What are recursive actions in ovs? I couldn't find any use of pcpu data 
in there? Thanks! :)

We could as an intermediate step add a recursion counter to openvswitch 
and limit call chains to depth 5, what do you think?

Bye,
Hannes


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-11 12:24   ` Hannes Frederic Sowa
@ 2016-01-12  0:36     ` pravin shelar
  2016-01-12  1:48       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 7+ messages in thread
From: pravin shelar @ 2016-01-12  0:36 UTC
  To: Hannes Frederic Sowa
  Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 11.01.2016 07:38, pravin shelar wrote:
>>
>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>>
>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>>> overwritten shortly after 9 recursions.
>>>
>> Major user of stack space in OVS is sw_flow_key in
>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>> have increased the size of the flow-key which could have caused the
>> stack overflow sooner.
>> One way to avoid using stack in subsequent recursive call is to use
>> per-cpu storage for the sw_flow_key object. This is already done for
>> OVS recursive actions, so we can expand on that facility.
>
>
> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
> can be called from actions again with skb_cloned skb before the original's
> skb callstack is actually finished. Data in the percpu area would be
> overwritten while still being used. It would need some more logic IMHO.
>
You can have a stack of flow-keys and allocate a flow-key for each recursive call.
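
A minimal userspace sketch of that idea, under loud assumptions: the
depth KEY_STACK_DEPTH, the placeholder struct flow_key_sketch, and the
helpers key_push()/key_pop() are all made up for illustration, and a
thread-local array stands in for per-CPU storage.

```c
#include <assert.h>
#include <stddef.h>

#define KEY_STACK_DEPTH 4	/* hypothetical depth, not taken from OVS */

/* Placeholder for a large flow key. */
struct flow_key_sketch {
	unsigned char bytes[512];
};

/* One slot per recursion level, standing in for per-CPU storage. */
static _Thread_local struct flow_key_sketch key_stack[KEY_STACK_DEPTH];
static _Thread_local int key_level;

/* Hands out the slot for the current level, or NULL when exhausted. */
static struct flow_key_sketch *key_push(void)
{
	if (key_level >= KEY_STACK_DEPTH)
		return NULL;
	return &key_stack[key_level++];
}

/* Releases the most recently handed out slot. */
static void key_pop(void)
{
	assert(key_level > 0);
	key_level--;
}
```

Each nesting level gets its own slot, so a nested call no longer
overwrites its caller's key, and running out of slots doubles as a
recursion limit.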

> What are recursive actions in ovs? I couldn't find any use of pcpu data in
> there? Thanks! :)
>
There are a couple of recursive actions in OVS, e.g.
OVS_ACTION_ATTR_RECIRC, but they are implemented using a per-cpu
flow-key stack to avoid recursive function calls.
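
The general shape of turning recursion into deferred work can be
sketched in userspace as follows. This is a simplified stand-in, not the
OVS implementation: FIFO_SIZE, struct fifo_sketch, and run_deferred()
are hypothetical names, and each work item just carries a count of
follow-up hops.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 10	/* hypothetical capacity */

/* A unit of postponed work; hops_left says how many follow-ups remain. */
struct deferred_item {
	int hops_left;
};

struct fifo_sketch {
	int head;	/* next free slot */
	int tail;	/* next item to run */
	struct deferred_item items[FIFO_SIZE];
};

/* Queues an item; returns -1 when the FIFO is full. */
static int fifo_put(struct fifo_sketch *f, int hops_left)
{
	if (f->head >= FIFO_SIZE)
		return -1;
	f->items[f->head++].hops_left = hops_left;
	return 0;
}

/* Returns the next queued item, or NULL when none are left. */
static struct deferred_item *fifo_get(struct fifo_sketch *f)
{
	if (f->tail == f->head)
		return NULL;
	return &f->items[f->tail++];
}

/*
 * Runs a chain of 'hops' follow-ups iteratively: instead of calling
 * itself, each step queues its successor. Returns the number of items run.
 */
static int run_deferred(int hops)
{
	struct fifo_sketch f = { 0 };
	struct deferred_item *it;
	int processed = 0;

	assert(hops >= 0);
	fifo_put(&f, hops);
	while ((it = fifo_get(&f)) != NULL) {
		processed++;
		if (it->hops_left > 0 && fifo_put(&f, it->hops_left - 1) < 0)
			break;	/* FIFO full: drop the rest of the chain */
	}
	return processed;
}
```

The call depth stays constant however long the chain is, and the FIFO
capacity bounds the work instead of the stack size.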


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-12  0:36     ` pravin shelar
@ 2016-01-12  1:48       ` Hannes Frederic Sowa
  2016-01-12 20:41         ` pravin shelar
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12  1:48 UTC
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On 12.01.2016 01:36, pravin shelar wrote:
> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> On 11.01.2016 07:38, pravin shelar wrote:
>>>
>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>> <hannes@stressinduktion.org> wrote:
>>>>
>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>>>> overwritten shortly after 9 recursions.
>>>>
>>> Major user of stack space in OVS is sw_flow_key in
>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>> have increased the size of the flow-key which could have caused the
>>> stack overflow sooner.
>>> One way to avoid using stack in subsequent recursive call is to use
>>> per-cpu storage for the sw_flow_key object. This is already done for
>>> OVS recursive actions, so we can expand on that facility.
>>
>>
>> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
>> can be called from actions again with skb_cloned skb before the original's
>> skb callstack is actually finished. Data in the percpu area would be
>> overwritten while still being used. It would need some more logic IMHO.
>>
> You can have stack of flow-keys and allocate a flow-key for each recursive call.

Hmm, I came up with something like that the other day, but I find it
unpleasant and think that the kmalloc might end up in the fast path
anyway:

diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 31cbc8c5c7db82..af9c94c732ce56 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -426,6 +426,9 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
  	return ids->ids[ids_index];
  }

+static DEFINE_PER_CPU(struct sw_flow_key, pcpu_key);
+static DEFINE_PER_CPU(int, ovs_recursion);
+
  /**
   *	ovs_vport_receive - pass up received packet to the datapath for processing
   *
@@ -439,9 +442,19 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
  int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
  		      const struct ip_tunnel_info *tun_info)
  {
-	struct sw_flow_key key;
+	struct sw_flow_key *key;
  	int error;

+	if (__this_cpu_inc_return(ovs_recursion) == 1) {
+		key = raw_cpu_ptr(&pcpu_key);
+	} else {
+		key = kmalloc(sizeof(*key), GFP_ATOMIC);
+		if (!key) {
+			__this_cpu_dec(ovs_recursion);
+			return -ENOMEM;
+		}
+	}
+
  	OVS_CB(skb)->input_vport = vport;
  	OVS_CB(skb)->mru = 0;
  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
@@ -454,12 +467,15 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
  	}

  	/* Extract flow from 'skb' into 'key'. */
-	error = ovs_flow_key_extract(tun_info, skb, &key);
+	error = ovs_flow_key_extract(tun_info, skb, key);
  	if (unlikely(error)) {
  		kfree_skb(skb);
-		return error;
+		goto out;
  	}
-	ovs_dp_process_packet(skb, &key);
+	ovs_dp_process_packet(skb, key);
+out:
+	if (__this_cpu_dec_return(ovs_recursion) > 0)
+		kfree(key);
  	return error;
  }
  EXPORT_SYMBOL_GPL(ovs_vport_receive);


>> What are recursive actions in ovs? I couldn't find any use of pcpu data in
>> there? Thanks! :)
>>
> There are couple of recursive actions in OVS, e.g.
> OVS_ACTION_ATTR_RECIRC. But it is implemented by using per-cpu
> flow-key stack to avoid recursive function call.

Ahh, the deferred_action stuff. I understand. So the idea would be to
lift even the first entry point to deferred_action alike?

This sounds like it could work but I fear we should find a solution
for stable, as this seems like a bit more work.

Thanks,
Hannes


* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
  2016-01-12  1:48       ` Hannes Frederic Sowa
@ 2016-01-12 20:41         ` pravin shelar
  0 siblings, 0 replies; 7+ messages in thread
From: pravin shelar @ 2016-01-12 20:41 UTC
  To: Hannes Frederic Sowa
  Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Mon, Jan 11, 2016 at 5:48 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 12.01.2016 01:36, pravin shelar wrote:
>>
>> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>>
>>> On 11.01.2016 07:38, pravin shelar wrote:
>>>>
>>>>
>>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>>> <hannes@stressinduktion.org> wrote:
>>>>>
>>>>>
>>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>>>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>>>>> overwritten shortly after 9 recursions.
>>>>>
>>>> Major user of stack space in OVS is sw_flow_key in
>>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>>> have increased the size of the flow-key which could have caused the
>>>> stack overflow sooner.
>>>> One way to avoid using stack in subsequent recursive call is to use
>>>> per-cpu storage for the sw_flow_key object. This is already done for
>>>> OVS recursive actions, so we can expand on that facility.
>>>
>>>
>>>
>>> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
>>> can be called from actions again with skb_cloned skb before the original's
>>> skb callstack is actually finished. Data in the percpu area would be
>>> overwritten while still being used. It would need some more logic IMHO.
>>>
>> You can have stack of flow-keys and allocate a flow-key for each recursive
>> call.
>
>
> Hmm, I came up with something like that but the other day I find it
> unpleasant and think that the kmalloc might anyway end up in the fast
> path:
>
We can avoid the kmalloc by using a per-cpu array of flow-keys. In fact,
the first four recursions can be done using the stack to avoid any
performance penalty of accessing percpu variables.

>
>
>>> What are recursive actions in ovs? I couldn't find any use of pcpu data in
>>> there? Thanks! :)
>>>
>> There are couple of recursive actions in OVS, e.g.
>> OVS_ACTION_ATTR_RECIRC. But it is implemented by using per-cpu
>> flow-key stack to avoid recursive function call.
>
>
> Ahh, the deferred_action stuff. I understand. So the idea would be to
> lift even the first entry point to deferred_action a like?
>
> This sounds like it could work but I fear we should find a solution
> for stable, as this seems like a bit more work.
>

I am fine with an OVS-specific recursion limit as the stable tree fix for this issue.

Thanks.

