* [RFC net-next 0/2] Track recursive calls in TC act_mirred @ 2019-06-14 14:33 John Hurley 2019-06-14 14:33 ` [RFC net-next 1/2] net: sched: refactor reinsert action John Hurley 2019-06-14 14:33 ` [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred John Hurley 0 siblings, 2 replies; 7+ messages in thread From: John Hurley @ 2019-06-14 14:33 UTC (permalink / raw) To: netdev Cc: davem, fw, jhs, simon.horman, jakub.kicinski, oss-drivers, John Hurley These patches aim to prevent act_mirred causing stack overflow events from recursively calling packet xmit or receive functions. Such events can occur with poor TC configuration that causes packets to travel in loops within the system. Florian Westphal advises that a recursion crash and packets looping are separate issues and should be treated as such. David Miller futher points out that pcpu counters cannot track the precise skb context required to detect loops. Hence these patches are not aimed at detecting packet loops, rather, preventing stack flows arising from such loops. John Hurley (2): net: sched: refactor reinsert action net: sched: protect against stack overflow in TC act_mirred include/net/pkt_cls.h | 2 +- include/net/sch_generic.h | 2 +- net/core/dev.c | 4 +--- net/sched/act_mirred.c | 17 ++++++++++++++++- 4 files changed, 19 insertions(+), 6 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC net-next 1/2] net: sched: refactor reinsert action 2019-06-14 14:33 [RFC net-next 0/2] Track recursive calls in TC act_mirred John Hurley @ 2019-06-14 14:33 ` John Hurley 2019-06-17 18:43 ` Edward Cree 2019-06-14 14:33 ` [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred John Hurley 1 sibling, 1 reply; 7+ messages in thread From: John Hurley @ 2019-06-14 14:33 UTC (permalink / raw) To: netdev Cc: davem, fw, jhs, simon.horman, jakub.kicinski, oss-drivers, John Hurley The TC_ACT_REINSERT return type was added as an in-kernel only option to allow a packet ingress or egress redirect. This is used to avoid unnecessary skb clones in situations where they are not required. If a TC hook returns this code then the packet is 'reinserted' and no skb consume is carried out as no clone took place. This return type is only used in act_mirred. Rather than have the reinsert called from the main datapath, call it directly in act_mirred. Instead of returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED which tells the caller that the packet has been stolen by another process and that no consume call is required. Moving all redirect calls to the act_mirred code is in preparation for tracking recursion created by act_mirred. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> --- include/net/pkt_cls.h | 2 +- include/net/sch_generic.h | 2 +- net/core/dev.c | 4 +--- net/sched/act_mirred.c | 3 ++- 4 files changed, 5 insertions(+), 6 deletions(-) diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 514e3c8..3ecb5c2 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -9,7 +9,7 @@ #include <net/flow_offload.h> /* TC action not accessible from user space */ -#define TC_ACT_REINSERT (TC_ACT_VALUE_MAX + 1) +#define TC_ACT_CONSUMED (TC_ACT_VALUE_MAX + 1) /* Basic packet classifier frontend definitions. */ diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 21f434f..855167b 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -279,7 +279,7 @@ struct tcf_result { }; const struct tcf_proto *goto_tp; - /* used by the TC_ACT_REINSERT action */ + /* used in the skb_tc_reinsert function */ struct { bool ingress; struct gnet_stats_queue *qstats; diff --git a/net/core/dev.c b/net/core/dev.c index eb7fb6d..ed5eeb7 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4689,9 +4689,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, __skb_push(skb, skb->mac_len); skb_do_redirect(skb); return NULL; - case TC_ACT_REINSERT: - /* this does not scrub the packet, and updates stats on error */ - skb_tc_reinsert(skb, &cl_res); + case TC_ACT_CONSUMED: return NULL; default: break; diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index 58e7573d..8c1d736 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -277,7 +277,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, if (use_reinsert) { res->ingress = want_ingress; res->qstats = this_cpu_ptr(m->common.cpu_qstats); - return TC_ACT_REINSERT; + skb_tc_reinsert(skb, res); + return TC_ACT_CONSUMED; } } -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC net-next 1/2] net: sched: refactor reinsert action 2019-06-14 14:33 ` [RFC net-next 1/2] net: sched: refactor reinsert action John Hurley @ 2019-06-17 18:43 ` Edward Cree 2019-06-17 22:11 ` John Hurley 2019-06-18 18:57 ` Jakub Kicinski 0 siblings, 2 replies; 7+ messages in thread From: Edward Cree @ 2019-06-17 18:43 UTC (permalink / raw) To: John Hurley, netdev Cc: davem, fw, jhs, simon.horman, jakub.kicinski, oss-drivers On 14/06/2019 15:33, John Hurley wrote: > Instead of > returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED > which tells the caller that the packet has been stolen by another process > and that no consume call is required. Possibly a dumb question, but why does this need a new CONSUMED rather than, say, taking an additional ref and returning TC_ACT_STOLEN? Apart from that, the series lgtm. -Ed ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC net-next 1/2] net: sched: refactor reinsert action 2019-06-17 18:43 ` Edward Cree @ 2019-06-17 22:11 ` John Hurley 2019-06-18 18:57 ` Jakub Kicinski 1 sibling, 0 replies; 7+ messages in thread From: John Hurley @ 2019-06-17 22:11 UTC (permalink / raw) To: Edward Cree Cc: Linux Netdev List, David Miller, Florian Westphal, Jamal Hadi Salim, Simon Horman, Jakub Kicinski, oss-drivers On Mon, Jun 17, 2019 at 7:44 PM Edward Cree <ecree@solarflare.com> wrote: > > On 14/06/2019 15:33, John Hurley wrote: > > Instead of > > returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED > > which tells the caller that the packet has been stolen by another process > > and that no consume call is required. > Possibly a dumb question, but why does this need a new CONSUMED rather > than, say, taking an additional ref and returning TC_ACT_STOLEN? > > Apart from that, the series lgtm. > Thanks for comments, Ed. The CONSUMED was to replace the REINSERT function but yes, this is probably not required now. I can respin > -Ed ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC net-next 1/2] net: sched: refactor reinsert action 2019-06-17 18:43 ` Edward Cree 2019-06-17 22:11 ` John Hurley @ 2019-06-18 18:57 ` Jakub Kicinski 1 sibling, 0 replies; 7+ messages in thread From: Jakub Kicinski @ 2019-06-18 18:57 UTC (permalink / raw) To: Edward Cree Cc: John Hurley, netdev, davem, fw, jhs, simon.horman, oss-drivers, Paolo Abeni On Mon, 17 Jun 2019 19:43:53 +0100, Edward Cree wrote: > On 14/06/2019 15:33, John Hurley wrote: > > Instead of > > returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED > > which tells the caller that the packet has been stolen by another process > > and that no consume call is required. > Possibly a dumb question, but why does this need a new CONSUMED rather > than, say, taking an additional ref and returning TC_ACT_STOLEN? Is it okay to reinsert a shared skb into the stack? In particular this looks a little scary: int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail, gfp_t gfp_mask) { int i, osize = skb_end_offset(skb); int size = osize + nhead + ntail; long off; u8 *data; BUG_ON(nhead < 0); BUG_ON(skb_shared(skb)); ^^^^^^^^^^^^^^^^^^^^^^^^ Actually looking for Paolo's address to add him to CC I found that he said at the time: With ACT_SHOT caller/upper layer will free the skb, too. We will have an use after free (from either the upper layer and the xmit device). Similar issues with STOLEN, TRAP, etc. In the past, Changli Gao attempted to avoid the clone incrementing the skb usage count: commit 210d6de78c5d7c785fc532556cea340e517955e1 Author: Changli Gao <xiaosuo@gmail.com> Date: Thu Jun 24 16:25:12 2010 +0000 act_mirred: don't clone skb when skb isn't shared but some/many device drivers expect an skb usage count of 1, and that caused ooops and was revered. :) ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred 2019-06-14 14:33 [RFC net-next 0/2] Track recursive calls in TC act_mirred John Hurley 2019-06-14 14:33 ` [RFC net-next 1/2] net: sched: refactor reinsert action John Hurley @ 2019-06-14 14:33 ` John Hurley 2019-06-14 14:45 ` Florian Westphal 1 sibling, 1 reply; 7+ messages in thread From: John Hurley @ 2019-06-14 14:33 UTC (permalink / raw) To: netdev Cc: davem, fw, jhs, simon.horman, jakub.kicinski, oss-drivers, John Hurley TC hooks allow the application of filters and actions to packets at both ingress and egress of the network stack. It is possible, with poor configuration, that this can produce loops whereby an ingress hook calls a mirred egress action that has an egress hook that redirects back to the first ingress etc. The TC core classifier protects against loops when doing reclassifies but there is no protection against a packet looping between multiple hooks and recursively calling act_mirred. This can lead to stack overflow panics. Add a per CPU counter to act_mirred that is incremented for each recursive call of the action function when processing a packet. If a limit is passed then the packet is dropped and CPU counter reset. Note that this patch does not protect against loops in TC datapaths. Its aim is to prevent stack overflow kernel panics that can be a consequence of such loops. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> --- net/sched/act_mirred.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index 8c1d736..766fae0 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -27,6 +27,9 @@ static LIST_HEAD(mirred_list); static DEFINE_SPINLOCK(mirred_list_lock); +#define MIRRED_RECURSION_LIMIT 4 +static DEFINE_PER_CPU(unsigned int, mirred_rec_level); + static bool tcf_mirred_is_act_redirect(int action) { return action == TCA_EGRESS_REDIR || action == TCA_INGRESS_REDIR; @@ -210,6 +213,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, struct sk_buff *skb2 = skb; bool m_mac_header_xmit; struct net_device *dev; + unsigned int rec_level; int retval, err = 0; bool use_reinsert; bool want_ingress; @@ -217,6 +221,14 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, int m_eaction; int mac_len; + rec_level = __this_cpu_inc_return(mirred_rec_level); + if (unlikely(rec_level > MIRRED_RECURSION_LIMIT)) { + net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n", + netdev_name(skb->dev)); + __this_cpu_dec(mirred_rec_level); + return TC_ACT_SHOT; + } + tcf_lastuse_update(&m->tcf_tm); bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb); @@ -278,6 +290,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, res->ingress = want_ingress; res->qstats = this_cpu_ptr(m->common.cpu_qstats); skb_tc_reinsert(skb, res); + __this_cpu_dec(mirred_rec_level); return TC_ACT_CONSUMED; } } @@ -293,6 +306,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, if (tcf_mirred_is_act_redirect(m_eaction)) retval = TC_ACT_SHOT; } + __this_cpu_dec(mirred_rec_level); return retval; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred 2019-06-14 14:33 ` [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred John Hurley @ 2019-06-14 14:45 ` Florian Westphal 0 siblings, 0 replies; 7+ messages in thread From: Florian Westphal @ 2019-06-14 14:45 UTC (permalink / raw) To: John Hurley Cc: netdev, davem, fw, jhs, simon.horman, jakub.kicinski, oss-drivers John Hurley <john.hurley@netronome.com> wrote: > TC hooks allow the application of filters and actions to packets at both > ingress and egress of the network stack. It is possible, with poor > configuration, that this can produce loops whereby an ingress hook calls > a mirred egress action that has an egress hook that redirects back to > the first ingress etc. The TC core classifier protects against loops when > doing reclassifies but there is no protection against a packet looping > between multiple hooks and recursively calling act_mirred. This can lead > to stack overflow panics. > > Add a per CPU counter to act_mirred that is incremented for each recursive > call of the action function when processing a packet. If a limit is passed > then the packet is dropped and CPU counter reset. > > Note that this patch does not protect against loops in TC datapaths. Its > aim is to prevent stack overflow kernel panics that can be a consequence > of such loops. LGTM, thanks. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-06-18 18:57 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-06-14 14:33 [RFC net-next 0/2] Track recursive calls in TC act_mirred John Hurley 2019-06-14 14:33 ` [RFC net-next 1/2] net: sched: refactor reinsert action John Hurley 2019-06-17 18:43 ` Edward Cree 2019-06-17 22:11 ` John Hurley 2019-06-18 18:57 ` Jakub Kicinski 2019-06-14 14:33 ` [RFC net-next 2/2] net: sched: protect against stack overflow in TC act_mirred John Hurley 2019-06-14 14:45 ` Florian Westphal
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.