All of lore.kernel.org
 help / color / mirror / Atom feed
* Yet another bridge netfilter crash
@ 2010-07-23 13:42 Herbert Xu
  2010-07-23 14:18 ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2010-07-23 13:42 UTC (permalink / raw)
  To: Patrick McHardy, Stephen Hemminger, netdev

Hi:

I was cced on the following bug:

https://bugzilla.redhat.com/show_bug.cgi?id=617268

>From what I've seen in the crash dump, this would appear to be
yet another manifestation of the evil relationship between the
bridge and IPv4 through netfilter.

In particular, bridge netfilter invokes IPv4's PRE_ROUTING rules,
one of which assembles packets for connection tracking.

Unfortunately, the same cache is used for reassembling bridge
packets and non-bridge packets.

Now we already knew about this and its potential security effects.
However, what we didn't know is that this can also cause a packet
to appear in the bridge pre_routing code with nf_bridge set to
NULL when it must not be NULL.

This happens if the non-bridge fragment appeared first.

So now is the time to fix this properly by giving the bridge its
own separate conntrack namespace/zone.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-07-23 13:42 Yet another bridge netfilter crash Herbert Xu
@ 2010-07-23 14:18 ` Patrick McHardy
  2010-07-23 15:00   ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2010-07-23 14:18 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Stephen Hemminger, netdev

[-- Attachment #1: Type: text/plain, Size: 1003 bytes --]

On 23.07.2010 15:42, Herbert Xu wrote:
> Hi:
> 
> I was cced on the following bug:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=617268
> 
>>From what I've seen in the crash dump, this would appear to be
> yet another manifestation of the evil relationship between the
> bridge and IPv4 through netfilter.
> 
> In particular, bridge netfilter invokes IPv4's PRE_ROUTING rules,
> one of which assembles packets for connection tracking.
> 
> Unfortunately, the same cache is used for reassembling bridge
> packets and non-bridge packets.
> 
> Now we already knew about this and its potential security effects.
> However, what we didn't know is that this can also cause a packet
> to appear in the bridge pre_routing code with nf_bridge set to
> NULL when it must not be NULL.
> 
> This happens if the non-bridge fragment appeared first.
> 
> So now is the time to fix this properly by giving the bridge its
> own separate conntrack namespace/zone.

I think we've already fixed this by commit 8fa9ff6:


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 4090 bytes --]

commit 8fa9ff6849bb86c59cc2ea9faadf3cb2d5223497
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Dec 15 16:59:59 2009 +0100

    netfilter: fix crashes in bridge netfilter caused by fragment jumps
    
    When fragments from bridge netfilter are passed to IPv4 or IPv6 conntrack
    and a reassembly queue with the same fragment key already exists from
    reassembling a similar packet received on a different device (f.i. with
    multicasted fragments), the reassembled packet might continue on a different
    codepath than where the head fragment originated. This can cause crashes
    in bridge netfilter when a fragment received on a non-bridge device (and
    thus with skb->nf_bridge == NULL) continues through the bridge netfilter
    code.
    
    Add a new reassembly identifier for packets originating from bridge
    netfilter and use it to put those packets in insolated queues.
    
    Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14805
    
    Reported-and-Tested-by: Chong Qiao <qiaochong@loongson.cn>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/net/ip.h b/include/net/ip.h
index e6b9d12..85108cf 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -337,6 +337,7 @@ enum ip_defrag_users {
 	IP_DEFRAG_CALL_RA_CHAIN,
 	IP_DEFRAG_CONNTRACK_IN,
 	IP_DEFRAG_CONNTRACK_OUT,
+	IP_DEFRAG_CONNTRACK_BRIDGE_IN,
 	IP_DEFRAG_VS_IN,
 	IP_DEFRAG_VS_OUT,
 	IP_DEFRAG_VS_FWD
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d691603..ccab594 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -354,6 +354,7 @@ enum ip6_defrag_users {
 	IP6_DEFRAG_LOCAL_DELIVER,
 	IP6_DEFRAG_CONNTRACK_IN,
 	IP6_DEFRAG_CONNTRACK_OUT,
+	IP6_DEFRAG_CONNTRACK_BRIDGE_IN,
 };
 
 struct ip6_create_arg {
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index fa2d6b6..331ead3 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -14,6 +14,7 @@
 #include <net/route.h>
 #include <net/ip.h>
 
+#include <linux/netfilter_bridge.h>
 #include <linux/netfilter_ipv4.h>
 #include <net/netfilter/ipv4/nf_defrag_ipv4.h>
 
@@ -34,6 +35,20 @@ static int nf_ct_ipv4_gather_frags(struct sk_buff *skb, u_int32_t user)
 	return err;
 }
 
+static enum ip_defrag_users nf_ct_defrag_user(unsigned int hooknum,
+					      struct sk_buff *skb)
+{
+#ifdef CONFIG_BRIDGE_NETFILTER
+	if (skb->nf_bridge &&
+	    skb->nf_bridge->mask & BRNF_NF_BRIDGE_PREROUTING)
+		return IP_DEFRAG_CONNTRACK_BRIDGE_IN;
+#endif
+	if (hooknum == NF_INET_PRE_ROUTING)
+		return IP_DEFRAG_CONNTRACK_IN;
+	else
+		return IP_DEFRAG_CONNTRACK_OUT;
+}
+
 static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
 					  struct sk_buff *skb,
 					  const struct net_device *in,
@@ -50,10 +65,8 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
 #endif
 	/* Gather fragments. */
 	if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
-		if (nf_ct_ipv4_gather_frags(skb,
-					    hooknum == NF_INET_PRE_ROUTING ?
-					    IP_DEFRAG_CONNTRACK_IN :
-					    IP_DEFRAG_CONNTRACK_OUT))
+		enum ip_defrag_users user = nf_ct_defrag_user(hooknum, skb);
+		if (nf_ct_ipv4_gather_frags(skb, user))
 			return NF_STOLEN;
 	}
 	return NF_ACCEPT;
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index c0a82fe..0956eba 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -20,6 +20,7 @@
 #include <net/ipv6.h>
 #include <net/inet_frag.h>
 
+#include <linux/netfilter_bridge.h>
 #include <linux/netfilter_ipv6.h>
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_helper.h>
@@ -190,6 +191,11 @@ out:
 static enum ip6_defrag_users nf_ct6_defrag_user(unsigned int hooknum,
 						struct sk_buff *skb)
 {
+#ifdef CONFIG_BRIDGE_NETFILTER
+	if (skb->nf_bridge &&
+	    skb->nf_bridge->mask & BRNF_NF_BRIDGE_PREROUTING)
+		return IP6_DEFRAG_CONNTRACK_BRIDGE_IN;
+#endif
 	if (hooknum == NF_INET_PRE_ROUTING)
 		return IP6_DEFRAG_CONNTRACK_IN;
 	else

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-07-23 14:18 ` Patrick McHardy
@ 2010-07-23 15:00   ` Herbert Xu
  2010-07-23 15:17     ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2010-07-23 15:00 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

On Fri, Jul 23, 2010 at 04:18:46PM +0200, Patrick McHardy wrote:
>
> I think we've already fixed this by commit 8fa9ff6:
> 

> commit 8fa9ff6849bb86c59cc2ea9faadf3cb2d5223497
> Author: Patrick McHardy <kaber@trash.net>
> Date:   Tue Dec 15 16:59:59 2009 +0100
> 
>     netfilter: fix crashes in bridge netfilter caused by fragment jumps

Thanks for the pointer Patrick.

Your memory is much better than mine, as I was in that thread too :)

BTW, do you have any plans on addressing the deeper issue of
separating the connection tracking as well?

There's also the matter of fragments jumping between bridges.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-07-23 15:00   ` Herbert Xu
@ 2010-07-23 15:17     ` Patrick McHardy
  2010-07-23 15:26       ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2010-07-23 15:17 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Stephen Hemminger, netdev

On 23.07.2010 17:00, Herbert Xu wrote:
> On Fri, Jul 23, 2010 at 04:18:46PM +0200, Patrick McHardy wrote:
>>
>> I think we've already fixed this by commit 8fa9ff6:
>>
> 
>> commit 8fa9ff6849bb86c59cc2ea9faadf3cb2d5223497
>> Author: Patrick McHardy <kaber@trash.net>
>> Date:   Tue Dec 15 16:59:59 2009 +0100
>>
>>     netfilter: fix crashes in bridge netfilter caused by fragment jumps
> 
> Thanks for the pointer Patrick.
> 
> Your memory is much better than mine, as I was in that thread too :)
> 
> BTW, do you have any plans on addressing the deeper issue of
> separating the connection tracking as well?

No concrete plans yet, but its something I'm definitely planning
to try at some point.

> There's also the matter of fragments jumping between bridges.

Conntrack zones can be used to avoid that, but that currently needs
manual configuration.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-07-23 15:17     ` Patrick McHardy
@ 2010-07-23 15:26       ` Herbert Xu
  2010-08-04 16:30         ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2010-07-23 15:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

On Fri, Jul 23, 2010 at 05:17:42PM +0200, Patrick McHardy wrote:
> 
> No concrete plans yet, but its something I'm definitely planning
> to try at some point.

Great!

> > There's also the matter of fragments jumping between bridges.
> 
> Conntrack zones can be used to avoid that, but that currently needs
> manual configuration.

I think this is something that we need to fix.  Because as it
stands, it can still crash if you get the wrong nf_bridge.

The reason is that skb->dev does not hold a ref count.  So the
reassembly code just throws it away and always uses the dev of
the last fragment.

This breaks when two bridges combine to reassemble a single
packet, as the nf_bridge attribute of the reassembled packet
may come from an skb whose device is now dead.  This is then
used to fill in the skb->dev (via nf_bridge->physindev).

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-07-23 15:26       ` Herbert Xu
@ 2010-08-04 16:30         ` Patrick McHardy
  2010-08-04 16:41           ` Herbert Xu
  2010-08-04 23:00           ` Changli Gao
  0 siblings, 2 replies; 12+ messages in thread
From: Patrick McHardy @ 2010-08-04 16:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Stephen Hemminger, netdev

Am 23.07.2010 17:26, schrieb Herbert Xu:
> On Fri, Jul 23, 2010 at 05:17:42PM +0200, Patrick McHardy wrote:
>>
>>> There's also the matter of fragments jumping between bridges.
>>
>> Conntrack zones can be used to avoid that, but that currently needs
>> manual configuration.
> 
> I think this is something that we need to fix.  Because as it
> stands, it can still crash if you get the wrong nf_bridge.
> 
> The reason is that skb->dev does not hold a ref count.  So the
> reassembly code just throws it away and always uses the dev of
> the last fragment.
> 
> This breaks when two bridges combine to reassemble a single
> packet, as the nf_bridge attribute of the reassembled packet
> may come from an skb whose device is now dead.  This is then
> used to fill in the skb->dev (via nf_bridge->physindev).

We could perform a new device lookup on reassembly as we do
when expiring a fragment queue, but we probably shouldn't even
be reassembling fragments from different bridges. One way to
avoid this would be to automatically assign each bridge device
to a different conntrack zone, but conntrack zones are limited
to 2^16 and this might also have other unwanted side-effects.

Until we come up with something better the best fix seems to
be to perform the device lookup based on the iif.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-04 16:30         ` Patrick McHardy
@ 2010-08-04 16:41           ` Herbert Xu
  2010-08-04 16:50             ` Patrick McHardy
  2010-08-04 23:00           ` Changli Gao
  1 sibling, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2010-08-04 16:41 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

On Wed, Aug 04, 2010 at 06:30:58PM +0200, Patrick McHardy wrote:
>
> We could perform a new device lookup on reassembly as we do
> when expiring a fragment queue, but we probably shouldn't even
> be reassembling fragments from different bridges. One way to
> avoid this would be to automatically assign each bridge device
> to a different conntrack zone, but conntrack zones are limited
> to 2^16 and this might also have other unwanted side-effects.
> 
> Until we come up with something better the best fix seems to
> be to perform the device lookup based on the iif.

I don't think we can as the iif will point to the bridge device.
The physindev contains the original physical device where the
packet came in.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-04 16:41           ` Herbert Xu
@ 2010-08-04 16:50             ` Patrick McHardy
  2010-08-09 21:42               ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2010-08-04 16:50 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Stephen Hemminger, netdev

Am 04.08.2010 18:41, schrieb Herbert Xu:
> On Wed, Aug 04, 2010 at 06:30:58PM +0200, Patrick McHardy wrote:
>>
>> We could perform a new device lookup on reassembly as we do
>> when expiring a fragment queue, but we probably shouldn't even
>> be reassembling fragments from different bridges. One way to
>> avoid this would be to automatically assign each bridge device
>> to a different conntrack zone, but conntrack zones are limited
>> to 2^16 and this might also have other unwanted side-effects.
>>
>> Until we come up with something better the best fix seems to
>> be to perform the device lookup based on the iif.
> 
> I don't think we can as the iif will point to the bridge device.
> The physindev contains the original physical device where the
> packet came in.

If it originally points to the bridge device, there doesn't
seem anything wrong with the device pointing to the bridge
device after reassembly. Am I missing something?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-04 16:30         ` Patrick McHardy
  2010-08-04 16:41           ` Herbert Xu
@ 2010-08-04 23:00           ` Changli Gao
  1 sibling, 0 replies; 12+ messages in thread
From: Changli Gao @ 2010-08-04 23:00 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Herbert Xu, Stephen Hemminger, netdev

On Thu, Aug 5, 2010 at 12:30 AM, Patrick McHardy <kaber@trash.net> wrote:
> Am 23.07.2010 17:26, schrieb Herbert Xu:
>> On Fri, Jul 23, 2010 at 05:17:42PM +0200, Patrick McHardy wrote:
>>>
>>>> There's also the matter of fragments jumping between bridges.
>>>
>>> Conntrack zones can be used to avoid that, but that currently needs
>>> manual configuration.
>>
>> I think this is something that we need to fix.  Because as it
>> stands, it can still crash if you get the wrong nf_bridge.
>>
>> The reason is that skb->dev does not hold a ref count.  So the
>> reassembly code just throws it away and always uses the dev of
>> the last fragment.
>>
>> This breaks when two bridges combine to reassemble a single
>> packet, as the nf_bridge attribute of the reassembled packet
>> may come from an skb whose device is now dead.  This is then
>> used to fill in the skb->dev (via nf_bridge->physindev).
>
> We could perform a new device lookup on reassembly as we do
> when expiring a fragment queue, but we probably shouldn't even
> be reassembling fragments from different bridges. One way to
> avoid this would be to automatically assign each bridge device
> to a different conntrack zone, but conntrack zones are limited
> to 2^16 and this might also have other unwanted side-effects.
>
> Until we come up with something better the best fix seems to
> be to perform the device lookup based on the iif.

How about holding physindev and physoutdev when queueing the skbs into
frag queue, and take the skb->dev(bridge NIC) as a key when queueing
the skbs from bridges?

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-04 16:50             ` Patrick McHardy
@ 2010-08-09 21:42               ` Herbert Xu
  2010-08-09 22:39                 ` Changli Gao
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2010-08-09 21:42 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

On Wed, Aug 04, 2010 at 06:50:50PM +0200, Patrick McHardy wrote:
.
> >> Until we come up with something better the best fix seems to
> >> be to perform the device lookup based on the iif.
> > 
> > I don't think we can as the iif will point to the bridge device.
> > The physindev contains the original physical device where the
> > packet came in.
> 
> If it originally points to the bridge device, there doesn't
> seem anything wrong with the device pointing to the bridge
> device after reassembly. Am I missing something?

It's there so that netfilter rules can filter based on the original
inbound interface.

If you set it to the bridge then those rules won't work anymore.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-09 21:42               ` Herbert Xu
@ 2010-08-09 22:39                 ` Changli Gao
  2010-08-10 15:19                   ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Changli Gao @ 2010-08-09 22:39 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Patrick McHardy, Stephen Hemminger, netdev

On Tue, Aug 10, 2010 at 5:42 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, Aug 04, 2010 at 06:50:50PM +0200, Patrick McHardy wrote:
> .
>> >> Until we come up with something better the best fix seems to
>> >> be to perform the device lookup based on the iif.
>> >
>> > I don't think we can as the iif will point to the bridge device.
>> > The physindev contains the original physical device where the
>> > packet came in.
>>
>> If it originally points to the bridge device, there doesn't
>> seem anything wrong with the device pointing to the bridge
>> device after reassembly. Am I missing something?
>
> It's there so that netfilter rules can filter based on the original
> inbound interface.
>
> If you set it to the bridge then those rules won't work anymore.
>

How about always using the skb->nf_bridge of the skb last received in
a defrag queue as the nf_bridge of the final defraged skb? I have
posted a patch doing such thing.
http://patchwork.ozlabs.org/patch/60904/

Thanks.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Yet another bridge netfilter crash
  2010-08-09 22:39                 ` Changli Gao
@ 2010-08-10 15:19                   ` Herbert Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Herbert Xu @ 2010-08-10 15:19 UTC (permalink / raw)
  To: Changli Gao; +Cc: Patrick McHardy, Stephen Hemminger, netdev

On Tue, Aug 10, 2010 at 06:39:42AM +0800, Changli Gao wrote:
>
> How about always using the skb->nf_bridge of the skb last received in
> a defrag queue as the nf_bridge of the final defraged skb? I have
> posted a patch doing such thing.
> http://patchwork.ozlabs.org/patch/60904/

I'd rather just drop the fragment (and maybe fail the whole
reassembly) if you detect nf_bridge mismatches.

Reassembling fragments from two interfaces is one thing, but
knowingly reassembling fragments across two bridges is just
wrong.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-08-10 15:19 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-23 13:42 Yet another bridge netfilter crash Herbert Xu
2010-07-23 14:18 ` Patrick McHardy
2010-07-23 15:00   ` Herbert Xu
2010-07-23 15:17     ` Patrick McHardy
2010-07-23 15:26       ` Herbert Xu
2010-08-04 16:30         ` Patrick McHardy
2010-08-04 16:41           ` Herbert Xu
2010-08-04 16:50             ` Patrick McHardy
2010-08-09 21:42               ` Herbert Xu
2010-08-09 22:39                 ` Changli Gao
2010-08-10 15:19                   ` Herbert Xu
2010-08-04 23:00           ` Changli Gao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.