[RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
@ 2019-06-06 12:49 John Hurley
  2019-06-06 12:58 ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: John Hurley @ 2019-06-06 12:49 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, jakub.kicinski, jhs, fw, oss-drivers, John Hurley

TC hooks allow the application of filters and actions to packets at both
ingress and egress of the network stack. It is possible, with poor
configuration, that this can produce loops whereby an ingress hook calls
a mirred egress action that has an egress hook that redirects back to
the first ingress etc. The TC core classifier protects against loops when
doing reclassifies but there is no protection against a packet looping
between multiple hooks. This can lead to stack overflow panics among other
things.

Previous versions of the kernel (<4.2) had a TTL count in the tc_verd skb
member that protected against loops. This was removed and the tc_verd
variable replaced by bit fields.

Extend the TC fields in the skb with an additional 2 bits to store the TC
hop count. This should use existing allocated memory in the skb.

Add the checking and setting of the new hop count to the act_mirred file
given that it is the source of the loops. This means that the code
additions are not in the main datapath.

v1->v2
- change from per cpu counter to per skb tracking (Jamal)
- move check/update from fast path to act_mirred (Daniel)

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 include/linux/skbuff.h | 2 ++
 net/sched/act_mirred.c | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ee5e63..f0dbc5b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -645,6 +645,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@tc_at_ingress: used within tc_classify to distinguish in/egress
  *	@tc_redirected: packet was redirected by a tc action
  *	@tc_from_ingress: if tc_redirected, tc_at_ingress at time of redirect
+ *	@tc_hop_count: hop counter to prevent packet loops
  *	@peeked: this packet has been seen already, so stats have been
  *		done for it, don't do them again
  *	@nf_trace: netfilter packet trace flag
@@ -827,6 +828,7 @@ struct sk_buff {
 	__u8			tc_at_ingress:1;
 	__u8			tc_redirected:1;
 	__u8			tc_from_ingress:1;
+	__u8			tc_hop_count:2;
 #endif
 #ifdef CONFIG_TLS_DEVICE
 	__u8			decrypted:1;
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index c329390..6cf7f90 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -29,6 +29,8 @@
 #include <linux/tc_act/tc_mirred.h>
 #include <net/tc_act/tc_mirred.h>
 
+#define MAX_HOP_COUNT	3
+
 static LIST_HEAD(mirred_list);
 static DEFINE_SPINLOCK(mirred_list_lock);
 
@@ -222,6 +224,12 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 	int m_eaction;
 	int mac_len;
 
+	if (unlikely(skb->tc_hop_count == MAX_HOP_COUNT)) {
+		net_warn_ratelimited("Packet at %s exceeded max TC hop count\n",
+				     netdev_name(skb->dev));
+		return TC_ACT_SHOT;
+	}
+
 	tcf_lastuse_update(&m->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb);
 
@@ -271,6 +279,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 
 	skb2->skb_iif = skb->dev->ifindex;
 	skb2->dev = dev;
+	skb2->tc_hop_count++;
 
 	/* mirror is always swallowed */
 	if (is_redirect) {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 12:49 [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks John Hurley
@ 2019-06-06 12:58 ` Florian Westphal
  2019-06-06 13:27   ` John Hurley
  2019-06-06 18:19   ` David Miller
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Westphal @ 2019-06-06 12:58 UTC (permalink / raw)
  To: John Hurley; +Cc: netdev, simon.horman, jakub.kicinski, jhs, fw, oss-drivers

John Hurley <john.hurley@netronome.com> wrote:
> TC hooks allow the application of filters and actions to packets at both
> ingress and egress of the network stack. It is possible, with poor
> configuration, that this can produce loops whereby an ingress hook calls
> a mirred egress action that has an egress hook that redirects back to
> the first ingress etc. The TC core classifier protects against loops when
> doing reclassifies but there is no protection against a packet looping
> between multiple hooks. This can lead to stack overflow panics among other
> things.
> 
> Previous versions of the kernel (<4.2) had a TTL count in the tc_verd skb
> member that protected against loops. This was removed and the tc_verd
> variable replaced by bit fields.
> 
> Extend the TC fields in the skb with an additional 2 bits to store the TC
> hop count. This should use existing allocated memory in the skb.
> 
> Add the checking and setting of the new hop count to the act_mirred file
> given that it is the source of the loops. This means that the code
> additions are not in the main datapath.
> 
> v1->v2
> - change from per cpu counter to per skb tracking (Jamal)
> - move check/update from fast path to act_mirred (Daniel)
> 
> Signed-off-by: John Hurley <john.hurley@netronome.com>
> ---
>  include/linux/skbuff.h | 2 ++
>  net/sched/act_mirred.c | 9 +++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 2ee5e63..f0dbc5b 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -645,6 +645,7 @@ typedef unsigned char *sk_buff_data_t;
>   *	@tc_at_ingress: used within tc_classify to distinguish in/egress
>   *	@tc_redirected: packet was redirected by a tc action
>   *	@tc_from_ingress: if tc_redirected, tc_at_ingress at time of redirect
> + *	@tc_hop_count: hop counter to prevent packet loops
>   *	@peeked: this packet has been seen already, so stats have been
>   *		done for it, don't do them again
>   *	@nf_trace: netfilter packet trace flag
> @@ -827,6 +828,7 @@ struct sk_buff {
>  	__u8			tc_at_ingress:1;
>  	__u8			tc_redirected:1;
>  	__u8			tc_from_ingress:1;
> +	__u8			tc_hop_count:2;

I dislike this, why can't we just use a pcpu counter?

The only problem is with recursion/nesting; whenever we
hit something that queues the skb for later we're safe.

We can't catch loops in real (physical) setups either,
e.g. bridge looping back on itself.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 12:58 ` Florian Westphal
@ 2019-06-06 13:27   ` John Hurley
  2019-06-06 14:04     ` Florian Westphal
  2019-06-06 18:19   ` David Miller
  1 sibling, 1 reply; 8+ messages in thread
From: John Hurley @ 2019-06-06 13:27 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Linux Netdev List, Simon Horman, Jakub Kicinski,
	Jamal Hadi Salim, oss-drivers

On Thu, Jun 6, 2019 at 1:58 PM Florian Westphal <fw@strlen.de> wrote:
>
> John Hurley <john.hurley@netronome.com> wrote:
> > TC hooks allow the application of filters and actions to packets at both
> > ingress and egress of the network stack. It is possible, with poor
> > configuration, that this can produce loops whereby an ingress hook calls
> > a mirred egress action that has an egress hook that redirects back to
> > the first ingress etc. The TC core classifier protects against loops when
> > doing reclassifies but there is no protection against a packet looping
> > between multiple hooks. This can lead to stack overflow panics among other
> > things.
> >
> > Previous versions of the kernel (<4.2) had a TTL count in the tc_verd skb
> > member that protected against loops. This was removed and the tc_verd
> > variable replaced by bit fields.
> >
> > Extend the TC fields in the skb with an additional 2 bits to store the TC
> > hop count. This should use existing allocated memory in the skb.
> >
> > Add the checking and setting of the new hop count to the act_mirred file
> > given that it is the source of the loops. This means that the code
> > additions are not in the main datapath.
> >
> > v1->v2
> > - change from per cpu counter to per skb tracking (Jamal)
> > - move check/update from fast path to act_mirred (Daniel)
> >
> > Signed-off-by: John Hurley <john.hurley@netronome.com>
> > ---
> >  include/linux/skbuff.h | 2 ++
> >  net/sched/act_mirred.c | 9 +++++++++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 2ee5e63..f0dbc5b 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -645,6 +645,7 @@ typedef unsigned char *sk_buff_data_t;
> >   *   @tc_at_ingress: used within tc_classify to distinguish in/egress
> >   *   @tc_redirected: packet was redirected by a tc action
> >   *   @tc_from_ingress: if tc_redirected, tc_at_ingress at time of redirect
> > + *   @tc_hop_count: hop counter to prevent packet loops
> >   *   @peeked: this packet has been seen already, so stats have been
> >   *           done for it, don't do them again
> >   *   @nf_trace: netfilter packet trace flag
> > @@ -827,6 +828,7 @@ struct sk_buff {
> >       __u8                    tc_at_ingress:1;
> >       __u8                    tc_redirected:1;
> >       __u8                    tc_from_ingress:1;
> > +     __u8                    tc_hop_count:2;
>
> I dislike this, why can't we just use a pcpu counter?
>
> The only problem is with recursion/nesting; whenever we
> hit something that queues the skb for later we're safe.
>

Hi Florian,
The per cpu counter (initial proposal) should protect against
recursion through loops and the potential stack overflows.
It will not protect against a packet infinitely looping through a poor
configuration if (as you say) the packet is queued at some point and
the cpu counter reset.
The per skb tracking seems to accommodate both issues.
Do you see the how the cpu counter could stop infinite loops in the
case of queuing?
Or perhaps these are 2 different issues and should be treated differently?

> We can't catch loops in real (physical) setups either,
> e.g. bridge looping back on itself.

yes, this is only targeted at 'internal' loops.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 13:27   ` John Hurley
@ 2019-06-06 14:04     ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-06-06 14:04 UTC (permalink / raw)
  To: John Hurley
  Cc: Florian Westphal, Linux Netdev List, Simon Horman,
	Jakub Kicinski, Jamal Hadi Salim, oss-drivers

John Hurley <john.hurley@netronome.com> wrote:
> On Thu, Jun 6, 2019 at 1:58 PM Florian Westphal <fw@strlen.de> wrote:
> > I dislike this, why can't we just use a pcpu counter?
> >
> > The only problem is with recursion/nesting; whenever we
> > hit something that queues the skb for later we're safe.
> >
> 
> Hi Florian,
> The per cpu counter (initial proposal) should protect against
> recursion through loops and the potential stack overflows.
> It will not protect against a packet infinitely looping through a poor
> configuration if (as you say) the packet is queued at some point and
> the cpu counter reset.

Yes, it won't help, but thats not harmful, such cycle will be
broken/resolved as soon as the configuration is fixed.

> The per skb tracking seems to accommodate both issues.

Yes, but I do not see the 'looping with queueing' as a problem,
it can also occur for different reasons.

> Do you see the how the cpu counter could stop infinite loops in the
> case of queuing?

No, but I don't think it has to.

> Or perhaps these are 2 different issues and should be treated differently?

The recursion is a problem, so, yes, I think these are different issues.

> > We can't catch loops in real (physical) setups either,
> > e.g. bridge looping back on itself.
> 
> yes, this is only targeted at 'internal' loops.

Right, however, I'm not sure we should bother with those, we can't
prevent this (looping packets) from happening for other reasons.

I'm sure you can make packets go in circles without tc, e.g. via
veth+bridge+netns.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 12:58 ` Florian Westphal
  2019-06-06 13:27   ` John Hurley
@ 2019-06-06 18:19   ` David Miller
  2019-06-06 19:52     ` Florian Westphal
  1 sibling, 1 reply; 8+ messages in thread
From: David Miller @ 2019-06-06 18:19 UTC (permalink / raw)
  To: fw; +Cc: john.hurley, netdev, simon.horman, jakub.kicinski, jhs, oss-drivers

From: Florian Westphal <fw@strlen.de>
Date: Thu, 6 Jun 2019 14:58:18 +0200

>> @@ -827,6 +828,7 @@ struct sk_buff {
>>  	__u8			tc_at_ingress:1;
>>  	__u8			tc_redirected:1;
>>  	__u8			tc_from_ingress:1;
>> +	__u8			tc_hop_count:2;
> 
> I dislike this, why can't we just use a pcpu counter?

I understand that it's because the only precise context is per-SKB not
per-cpu doing packet processing.  This has been discussed before.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 18:19   ` David Miller
@ 2019-06-06 19:52     ` Florian Westphal
  2019-06-11 14:58       ` John Hurley
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2019-06-06 19:52 UTC (permalink / raw)
  To: David Miller
  Cc: fw, john.hurley, netdev, simon.horman, jakub.kicinski, jhs, oss-drivers

David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Thu, 6 Jun 2019 14:58:18 +0200
> 
> >> @@ -827,6 +828,7 @@ struct sk_buff {
> >>  	__u8			tc_at_ingress:1;
> >>  	__u8			tc_redirected:1;
> >>  	__u8			tc_from_ingress:1;
> >> +	__u8			tc_hop_count:2;
> > 
> > I dislike this, why can't we just use a pcpu counter?
> 
> I understand that it's because the only precise context is per-SKB not
> per-cpu doing packet processing.  This has been discussed before.

I don't think its worth it, and it won't work with physical-world
loops (e.g. a bridge setup with no spanning tree and a closed loop).

Also I fear that if we start to do this for tc, we will also have to
followup later with more l2 hopcounts for other users, e.g. veth,
bridge, ovs, and so on.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-06 19:52     ` Florian Westphal
@ 2019-06-11 14:58       ` John Hurley
  2019-06-11 15:03         ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: John Hurley @ 2019-06-11 14:58 UTC (permalink / raw)
  To: Florian Westphal
  Cc: David Miller, Linux Netdev List, Simon Horman, Jakub Kicinski,
	Jamal Hadi Salim, oss-drivers

On Thu, Jun 6, 2019 at 8:52 PM Florian Westphal <fw@strlen.de> wrote:
>
> David Miller <davem@davemloft.net> wrote:
> > From: Florian Westphal <fw@strlen.de>
> > Date: Thu, 6 Jun 2019 14:58:18 +0200
> >
> > >> @@ -827,6 +828,7 @@ struct sk_buff {
> > >>    __u8                    tc_at_ingress:1;
> > >>    __u8                    tc_redirected:1;
> > >>    __u8                    tc_from_ingress:1;
> > >> +  __u8                    tc_hop_count:2;
> > >
> > > I dislike this, why can't we just use a pcpu counter?
> >
> > I understand that it's because the only precise context is per-SKB not
> > per-cpu doing packet processing.  This has been discussed before.
>
> I don't think its worth it, and it won't work with physical-world
> loops (e.g. a bridge setup with no spanning tree and a closed loop).
>
> Also I fear that if we start to do this for tc, we will also have to
> followup later with more l2 hopcounts for other users, e.g. veth,
> bridge, ovs, and so on.

Hi David/Florian,
Moving forward with this, should we treat the looping and recursion as
2 separate issues and at least prevent the potential stack overflow
panics caused by the recursion?
The pcpu counter should protect against this.
Are there context specific issues that we may miss by doing this?
If not I will respin with the pcpu counter in act_mirred.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks
  2019-06-11 14:58       ` John Hurley
@ 2019-06-11 15:03         ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-06-11 15:03 UTC (permalink / raw)
  To: John Hurley
  Cc: Florian Westphal, David Miller, Linux Netdev List, Simon Horman,
	Jakub Kicinski, Jamal Hadi Salim, oss-drivers

John Hurley <john.hurley@netronome.com> wrote:
> On Thu, Jun 6, 2019 at 8:52 PM Florian Westphal <fw@strlen.de> wrote:
> >
> > David Miller <davem@davemloft.net> wrote:
> > > From: Florian Westphal <fw@strlen.de>
> > > Date: Thu, 6 Jun 2019 14:58:18 +0200
> > >
> > > >> @@ -827,6 +828,7 @@ struct sk_buff {
> > > >>    __u8                    tc_at_ingress:1;
> > > >>    __u8                    tc_redirected:1;
> > > >>    __u8                    tc_from_ingress:1;
> > > >> +  __u8                    tc_hop_count:2;
> > > >
> > > > I dislike this, why can't we just use a pcpu counter?
> > >
> > > I understand that it's because the only precise context is per-SKB not
> > > per-cpu doing packet processing.  This has been discussed before.
> >
> > I don't think its worth it, and it won't work with physical-world
> > loops (e.g. a bridge setup with no spanning tree and a closed loop).
> >
> > Also I fear that if we start to do this for tc, we will also have to
> > followup later with more l2 hopcounts for other users, e.g. veth,
> > bridge, ovs, and so on.
> 
> Hi David/Florian,
> Moving forward with this, should we treat the looping and recursion as
> 2 separate issues and at least prevent the potential stack overflow
> panics caused by the recursion?
> The pcpu counter should protect against this.

As outlined above, I think they are different issues.

> Are there context specific issues that we may miss by doing this?

I can't think of any.

> If not I will respin with the pcpu counter in act_mirred.

Sounds good to me, thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-06-11 15:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-06 12:49 [RFC net-next v2 1/1] net: sched: protect against loops in TC filter hooks John Hurley
2019-06-06 12:58 ` Florian Westphal
2019-06-06 13:27   ` John Hurley
2019-06-06 14:04     ` Florian Westphal
2019-06-06 18:19   ` David Miller
2019-06-06 19:52     ` Florian Westphal
2019-06-11 14:58       ` John Hurley
2019-06-11 15:03         ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).