* [PATCH net] net: reduce RECURSION_LIMIT to 8
From: Hannes Frederic Sowa @ 2016-01-07 17:40 UTC
To: netdev; +Cc: David S. Miller, Eric Dumazet

When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
This limit was later raised to 10 by DaveM. Nowadays it is observed that
configuration errors in openvswitch cause the STACK_END_MAGIC to be
overwritten shortly after 9 recursions.

This patch tries to be conservative and reduces the limit to 8 without
further measurements. It seems ovs uses the stack more than other parts
of the networking stack - I couldn't bring the system down with a non-ovs
tunneling setup.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
I don't do crazy run-time estimation of the stack size for one recursion
and try automatically to come up with a limit per arch or kconfig
settings, as I assume that all systems should behave the same regarding
the recursion maximum. All configurations should run on all kinds of
systems. I consider 8 recursions to be plenty enough for the time being.

 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index ae00b894e67555..d93da7df84325d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2941,7 +2941,7 @@ static void skb_update_prio(struct sk_buff *skb)
 DEFINE_PER_CPU(int, xmit_recursion);
 EXPORT_SYMBOL(xmit_recursion);
 
-#define RECURSION_LIMIT 10
+#define RECURSION_LIMIT 8
 
 /**
  * dev_loopback_xmit - loop back @skb
-- 
2.5.0

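For illustration, the kind of guard this patch tunes can be sketched in userspace C. This is a simplified model, not the code in net/core/dev.c: `_Thread_local` stands in for `DEFINE_PER_CPU`, the function `xmit_nested()` and its `remaining_devices` parameter are hypothetical, and the exact comparison the kernel uses against RECURSION_LIMIT may differ from the `>=` chosen here.

```c
/* Userspace model of a per-CPU transmit recursion guard.
 * Assumption: _Thread_local approximates a per-CPU variable. */
#define RECURSION_LIMIT 8

static _Thread_local int xmit_recursion;

/* Hypothetical transmit path over a chain of stacked devices.
 * Each level re-enters the transmit path of the device below it.
 * Returns how many nesting levels were actually entered. */
static int xmit_nested(int remaining_devices)
{
	int depth;

	/* Refuse to go deeper once the limit is reached; a real
	 * driver stack would drop the packet at this point. */
	if (xmit_recursion >= RECURSION_LIMIT)
		return 0;

	xmit_recursion++;
	depth = 1;
	if (remaining_devices > 1)
		depth += xmit_nested(remaining_devices - 1);
	xmit_recursion--;

	return depth;
}
```

In this model a misconfigured chain of 20 stacked devices is cut off after 8 levels, while a chain of 3 devices is unaffected. The limit bounds the number of frames, not their size, which is why a per-frame stack hog can still overflow below the limit.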
* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: David Miller @ 2016-01-10 22:59 UTC
To: hannes; +Cc: netdev, edumazet

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 7 Jan 2016 18:40:53 +0100

> This patch tries to be conservative and reduces the limit to 8
> without further measurements. It seems ovs uses the stack more than
> other parts of the networking stack - I couldn't bring the system
> down with a non-ovs tunneling setup.

Can we figure out why OVS sucks so much wrt. stack usage instead?

I'd rather not paper over something like this, especially when it's
OVS which I'm completely not in the mood to specially cater for in
any way, shape, or form.

* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: pravin shelar @ 2016-01-11 6:38 UTC
To: Hannes Frederic Sowa
Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
> This limit was later raised to 10 by DaveM. Nowadays it is observed that
> configuration errors in openvswitch cause the STACK_END_MAGIC to be
> overwritten shortly after 9 recursions.
>
The major user of stack space in OVS is sw_flow_key in
ovs_vport_receive(). With recent features like IPv6 tunnel support we
have increased the size of the flow-key, which could have caused the
stack overflow sooner.
One way to avoid using the stack in subsequent recursive calls is to use
per-cpu storage for the sw_flow_key object. This is already done for
OVS recursive actions, so we can expand on that facility.

> This patch tries to be conservative and reduces the limit to 8 without
> further measurements. It seems ovs uses the stack more than other parts
> of the networking stack - I couldn't bring the system down with a non-ovs
> tunneling setup.
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> I don't do crazy run-time estimation of the stack size for one recursion
> and try automatically to come up with a limit per arch or kconfig
> settings, as I assume that all systems should behave the same regarding
> the recursion maximum. All configurations should run on all kinds of
> systems. I consider 8 recursions to be plenty enough for the time being.
>
>  net/core/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ae00b894e67555..d93da7df84325d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2941,7 +2941,7 @@ static void skb_update_prio(struct sk_buff *skb)
>  DEFINE_PER_CPU(int, xmit_recursion);
>  EXPORT_SYMBOL(xmit_recursion);
>
> -#define RECURSION_LIMIT 10
> +#define RECURSION_LIMIT 8
>
>  /**
>   * dev_loopback_xmit - loop back @skb
> --
> 2.5.0
>

* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: Hannes Frederic Sowa @ 2016-01-11 12:24 UTC
To: pravin shelar
Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On 11.01.2016 07:38, pravin shelar wrote:
> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>> overwritten shortly after 9 recursions.
>>
> The major user of stack space in OVS is sw_flow_key in
> ovs_vport_receive(). With recent features like IPv6 tunnel support we
> have increased the size of the flow-key, which could have caused the
> stack overflow sooner.
> One way to avoid using the stack in subsequent recursive calls is to use
> per-cpu storage for the sw_flow_key object. This is already done for
> OVS recursive actions, so we can expand on that facility.

Hmmm. This already came up. I think the difficulty is that
ovs_vport_receive can be called from actions again with skb_cloned skb
before the original's skb callstack is actually finished. Data in the
percpu area would be overwritten while still being used. It would need
some more logic IMHO.

What are recursive actions in ovs? I couldn't find any use of pcpu data
in there? Thanks! :)

We could as an intermediate step add a recursion counter to openvswitch
and limit call chains to depth 5, what do you think?

Bye,
Hannes

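The overwrite hazard described in this message can be reproduced with a small userspace model. Everything here is an assumption made for illustration: `struct flow_key` is a one-field stand-in for sw_flow_key, `receive_shared()` is a hypothetical simplification of a receive path that re-enters itself, and `_Thread_local` approximates a per-CPU variable.

```c
#include <stdbool.h>

/* Tiny stand-in for sw_flow_key (hypothetical layout). */
struct flow_key {
	int in_port;
};

/* A single key slot per "CPU", shared by every nesting level. */
static _Thread_local struct flow_key shared_key;

/* Model of a receive path that stores its key in the shared slot and
 * may re-enter itself before it has finished using that key.
 * Returns true if the caller's key is still intact after the nested
 * call returns. */
static bool receive_shared(int port, int nested_levels)
{
	struct flow_key *key = &shared_key;

	key->in_port = port;

	/* The nested call writes its own port into the same slot. */
	if (nested_levels > 0)
		receive_shared(port + 1, nested_levels - 1);

	return key->in_port == port;
}
```

Without re-entry the key survives; with a single nested call the outer level finds its key replaced by the inner one, which is the "overwritten while still being used" case.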
* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: pravin shelar @ 2016-01-12 0:36 UTC
To: Hannes Frederic Sowa
Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 11.01.2016 07:38, pravin shelar wrote:
>>
>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>>
>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>>> overwritten shortly after 9 recursions.
>>>
>> The major user of stack space in OVS is sw_flow_key in
>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>> have increased the size of the flow-key, which could have caused the
>> stack overflow sooner.
>> One way to avoid using the stack in subsequent recursive calls is to use
>> per-cpu storage for the sw_flow_key object. This is already done for
>> OVS recursive actions, so we can expand on that facility.
>
>
> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
> can be called from actions again with skb_cloned skb before the original's
> skb callstack is actually finished. Data in the percpu area would be
> overwritten while still being used. It would need some more logic IMHO.
>
You can have a stack of flow-keys and allocate a flow-key for each
recursive call.

> What are recursive actions in ovs? I couldn't find any use of pcpu data in
> there? Thanks! :)
>
There are a couple of recursive actions in OVS, e.g.
OVS_ACTION_ATTR_RECIRC. But it is implemented using a per-cpu
flow-key stack to avoid recursive function calls.

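A stack of flow-keys, one slot per nesting level, might look roughly like the following userspace sketch. It is not the OVS implementation: `KEY_STACK_DEPTH`, `key_push()`, `key_pop()` and `receive_stacked()` are hypothetical names, `struct flow_key` is a one-field stand-in for sw_flow_key, and `_Thread_local` approximates per-CPU storage.

```c
#include <stdbool.h>
#include <stddef.h>

#define KEY_STACK_DEPTH 4	/* hypothetical number of slots */

/* Tiny stand-in for sw_flow_key (hypothetical layout). */
struct flow_key {
	int in_port;
};

static _Thread_local struct flow_key key_stack[KEY_STACK_DEPTH];
static _Thread_local int key_depth;

/* Hand out the next free slot, or NULL when the stack is exhausted. */
static struct flow_key *key_push(void)
{
	if (key_depth >= KEY_STACK_DEPTH)
		return NULL;
	return &key_stack[key_depth++];
}

/* Release the most recently handed out slot. */
static void key_pop(void)
{
	key_depth--;
}

/* Receive path that takes one slot per nesting level.
 * Returns true only if every level obtained a slot and found its own
 * key untouched after the nested call returned. */
static bool receive_stacked(int port, int nested_levels)
{
	struct flow_key *key = key_push();
	bool ok = true;

	if (!key)
		return false;	/* deeper than the key stack allows */

	key->in_port = port;
	if (nested_levels > 0)
		ok = receive_stacked(port + 1, nested_levels - 1);
	ok = ok && key->in_port == port;

	key_pop();
	return ok;
}
```

Because each level owns a distinct slot, re-entry no longer clobbers the caller's key, and the fixed array size doubles as a hard bound on nesting depth.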
* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: Hannes Frederic Sowa @ 2016-01-12 1:48 UTC
To: pravin shelar
Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On 12.01.2016 01:36, pravin shelar wrote:
> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> On 11.01.2016 07:38, pravin shelar wrote:
>>>
>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>> <hannes@stressinduktion.org> wrote:
>>>>
>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed that
>>>> configuration errors in openvswitch cause the STACK_END_MAGIC to be
>>>> overwritten shortly after 9 recursions.
>>>>
>>> The major user of stack space in OVS is sw_flow_key in
>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>> have increased the size of the flow-key, which could have caused the
>>> stack overflow sooner.
>>> One way to avoid using the stack in subsequent recursive calls is to use
>>> per-cpu storage for the sw_flow_key object. This is already done for
>>> OVS recursive actions, so we can expand on that facility.
>>
>>
>> Hmmm. This already came up. I think the difficulty is that ovs_vport_receive
>> can be called from actions again with skb_cloned skb before the original's
>> skb callstack is actually finished. Data in the percpu area would be
>> overwritten while still being used. It would need some more logic IMHO.
>>
> You can have a stack of flow-keys and allocate a flow-key for each
> recursive call.

Hmm, I came up with something like that the other day, but I find it
unpleasant and think that the kmalloc might end up in the fast path
anyway:

diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 31cbc8c5c7db82..af9c94c732ce56 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -426,6 +426,9 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
 	return ids->ids[ids_index];
 }
 
+static DEFINE_PER_CPU(struct sw_flow_key, pcpu_key);
+static DEFINE_PER_CPU(int, ovs_recursion);
+
 /**
  * ovs_vport_receive - pass up received packet to the datapath for processing
  *
@@ -439,9 +442,19 @@ u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
 int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
 		      const struct ip_tunnel_info *tun_info)
 {
-	struct sw_flow_key key;
+	struct sw_flow_key *key;
 	int error;
 
+	if (__this_cpu_inc_return(ovs_recursion) == 1) {
+		key = raw_cpu_ptr(&pcpu_key);
+	} else {
+		key = kmalloc(sizeof(*key), GFP_ATOMIC);
+		if (!key) {
+			__this_cpu_dec(ovs_recursion);
+			return -ENOMEM;
+		}
+	}
+
 	OVS_CB(skb)->input_vport = vport;
 	OVS_CB(skb)->mru = 0;
 	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
@@ -454,12 +467,15 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
 	}
 
 	/* Extract flow from 'skb' into 'key'. */
-	error = ovs_flow_key_extract(tun_info, skb, &key);
+	error = ovs_flow_key_extract(tun_info, skb, key);
 	if (unlikely(error)) {
 		kfree_skb(skb);
-		return error;
+		goto out;
 	}
-	ovs_dp_process_packet(skb, &key);
+	ovs_dp_process_packet(skb, key);
+out:
+	if (__this_cpu_dec_return(ovs_recursion) > 0)
+		kfree(key);
 	return error;
 }
 EXPORT_SYMBOL_GPL(ovs_vport_receive);

>> What are recursive actions in ovs? I couldn't find any use of pcpu data in
>> there? Thanks! :)
>>
> There are a couple of recursive actions in OVS, e.g.
> OVS_ACTION_ATTR_RECIRC. But it is implemented using a per-cpu
> flow-key stack to avoid recursive function calls.

Ahh, the deferred_action stuff. I understand. So the idea would be to
lift even the first entry point to deferred_action alike?

This sounds like it could work but I fear we should find a solution
for stable, as this seems like a bit more work.

Thanks,
Hannes

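The allocation pattern in the draft patch above (per-CPU slot for the outermost call, heap allocation for nested calls) can be modelled in userspace as follows. This is a sketch under assumptions, not kernel code: `malloc()`/`free()` replace `kmalloc()`/`kfree()`, `_Thread_local` replaces `DEFINE_PER_CPU`, `struct flow_key` is a one-field stand-in for sw_flow_key, and `receive_fallback()` with its `-EINVAL` corruption check is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Tiny stand-in for sw_flow_key (hypothetical layout). */
struct flow_key {
	int in_port;
};

static _Thread_local struct flow_key pcpu_key;
static _Thread_local int ovs_recursion;

/* Outermost call uses the preallocated slot; nested calls allocate.
 * Returns 0 on success, -ENOMEM if allocation fails, or -EINVAL if a
 * level finds its key modified by a nested call. */
static int receive_fallback(int port, int nested_levels)
{
	struct flow_key *key;
	int error = 0;

	if (++ovs_recursion == 1) {
		key = &pcpu_key;
	} else {
		key = malloc(sizeof(*key));
		if (!key) {
			ovs_recursion--;
			return -ENOMEM;
		}
	}

	key->in_port = port;
	if (nested_levels > 0)
		error = receive_fallback(port + 1, nested_levels - 1);
	if (!error && key->in_port != port)
		error = -EINVAL;

	/* A counter still above zero after the decrement means this
	 * level was nested, so its key came from the heap. */
	if (--ovs_recursion > 0)
		free(key);
	return error;
}
```

The sketch makes the cost visible: every nested level pays for an allocation and a free, which matches the concern that the allocation may end up on a frequently taken path.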
* Re: [PATCH net] net: reduce RECURSION_LIMIT to 8
From: pravin shelar @ 2016-01-12 20:41 UTC
To: Hannes Frederic Sowa
Cc: Linux Kernel Network Developers, David S. Miller, Eric Dumazet

On Mon, Jan 11, 2016 at 5:48 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 12.01.2016 01:36, pravin shelar wrote:
>>
>> On Mon, Jan 11, 2016 at 4:24 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>>
>>> On 11.01.2016 07:38, pravin shelar wrote:
>>>>
>>>> On Thu, Jan 7, 2016 at 9:40 AM, Hannes Frederic Sowa
>>>> <hannes@stressinduktion.org> wrote:
>>>>>
>>>>> When RECURSION_LIMIT was first introduced, Eric proposed a limit of 3.
>>>>> This limit was later raised to 10 by DaveM. Nowadays it is observed
>>>>> that configuration errors in openvswitch cause the STACK_END_MAGIC to
>>>>> be overwritten shortly after 9 recursions.
>>>>>
>>>> The major user of stack space in OVS is sw_flow_key in
>>>> ovs_vport_receive(). With recent features like IPv6 tunnel support we
>>>> have increased the size of the flow-key, which could have caused the
>>>> stack overflow sooner.
>>>> One way to avoid using the stack in subsequent recursive calls is to use
>>>> per-cpu storage for the sw_flow_key object. This is already done for
>>>> OVS recursive actions, so we can expand on that facility.
>>>
>>>
>>> Hmmm. This already came up. I think the difficulty is that
>>> ovs_vport_receive can be called from actions again with skb_cloned skb
>>> before the original's skb callstack is actually finished. Data in the
>>> percpu area would be overwritten while still being used. It would need
>>> some more logic IMHO.
>>>
>> You can have a stack of flow-keys and allocate a flow-key for each
>> recursive call.
>
>
> Hmm, I came up with something like that the other day, but I find it
> unpleasant and think that the kmalloc might end up in the fast path
> anyway:
>
We can avoid the kmalloc by using a per-cpu array of flow-keys. In fact,
the first four recursions can be done using the stack to avoid any
performance penalty of accessing percpu variables.

>
>>> What are recursive actions in ovs? I couldn't find any use of pcpu data
>>> in there? Thanks! :)
>>>
>> There are a couple of recursive actions in OVS, e.g.
>> OVS_ACTION_ATTR_RECIRC. But it is implemented using a per-cpu
>> flow-key stack to avoid recursive function calls.
>
>
> Ahh, the deferred_action stuff. I understand. So the idea would be to
> lift even the first entry point to deferred_action alike?
>
> This sounds like it could work but I fear we should find a solution
> for stable, as this seems like a bit more work.
>
I am fine with an OVS-specific recursion limit as the stable tree fix for
this issue.

Thanks.

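One possible reading of this last suggestion, combined with an OVS-specific depth limit, is sketched below in userspace C. None of it comes from the thread beyond the idea itself: `STACK_LEVELS` follows the "first four recursions" remark, while `PCPU_LEVELS`, the `-ELOOP` return value, the function names and the one-field `struct flow_key` are assumptions, and `_Thread_local` approximates per-CPU storage. In kernel code the on-stack variant would probably also need to be kept out of line so its frame is only paid for when it is used.

```c
#include <errno.h>

#define STACK_LEVELS 4	/* shallow levels keep the key on the stack */
#define PCPU_LEVELS  4	/* hypothetical size of the per-CPU array */

/* Tiny stand-in for sw_flow_key (hypothetical layout). */
struct flow_key {
	int in_port;
};

static _Thread_local struct flow_key pcpu_keys[PCPU_LEVELS];
static _Thread_local int ovs_depth;

static int receive_hybrid(int port, int nested_levels);

/* Common body: fill in the key, possibly re-enter, then verify that
 * the key was not modified by deeper levels. */
static int process(struct flow_key *key, int port, int nested_levels)
{
	int error = 0;

	key->in_port = port;
	if (nested_levels > 0)
		error = receive_hybrid(port + 1, nested_levels - 1);
	if (!error && key->in_port != port)
		error = -EINVAL;
	return error;
}

/* Shallow levels: the key lives in this function's own frame. */
static int receive_on_stack(int port, int nested_levels)
{
	struct flow_key key;

	return process(&key, port, nested_levels);
}

/* Entry point: choose key storage by depth and enforce the limit. */
static int receive_hybrid(int port, int nested_levels)
{
	int level = ovs_depth;
	int error;

	if (level >= STACK_LEVELS + PCPU_LEVELS)
		return -ELOOP;	/* OVS-specific recursion limit hit */

	ovs_depth++;
	if (level < STACK_LEVELS)
		error = receive_on_stack(port, nested_levels);
	else
		error = process(&pcpu_keys[level - STACK_LEVELS],
				port, nested_levels);
	ovs_depth--;

	return error;
}
```

With these assumed sizes, eight nesting levels succeed without any allocation and a ninth is rejected, so the worst-case stack cost of the key is capped at four frames regardless of configuration errors.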