* Patch fixing STP if bridge in non-default namespace.
@ 2023-07-10 13:35 Harry Coin
2023-07-11 3:22 ` Kuniyuki Iwashima
0 siblings, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-10 13:35 UTC (permalink / raw)
To: netdev
Notice without access to link-level multicast address 01:80:C2:00:00:00,
the STP loop-avoidance feature of bridges fails silently, leading to
packet storms if loops exist in the related L2. The Linux kernel's
latest code silently drops BPDU STP packets if the bridge is in a
non-default namespace.
The current llc_rcv.c around line 166 in net/llc/llc_input.c has
if (!net_eq(dev_net(dev), &init_net))
goto drop;
Which, when commented out, fixes this bug. A search on &init_net may
reveal many similar artifacts left over from the early days of namespace
implementation.
Thanks for all you do!
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Patch fixing STP if bridge in non-default namespace.
2023-07-10 13:35 Patch fixing STP if bridge in non-default namespace Harry Coin
@ 2023-07-11 3:22 ` Kuniyuki Iwashima
2023-07-11 17:08 ` llc needs namespace awareness asap, was " Harry Coin
0 siblings, 1 reply; 17+ messages in thread
From: Kuniyuki Iwashima @ 2023-07-11 3:22 UTC (permalink / raw)
To: hcoin; +Cc: netdev, kuniyu
From: Harry Coin <hcoin@quietfountain.com>
Date: Mon, 10 Jul 2023 08:35:08 -0500
> Notice without access to link-level multicast address 01:80:C2:00:00:00,
> the STP loop-avoidance feature of bridges fails silently, leading to
> packet storms if loops exist in the related L2. The Linux kernel's
> latest code silently drops BPDU STP packets if the bridge is in a
> non-default namespace.
>
> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>
> if (!net_eq(dev_net(dev), &init_net))
> goto drop;
>
> Which, when commented out, fixes this bug. A search on &init_net may
> reveal many similar artifacts left over from the early days of namespace
> implementation.
I think just removing the part is not sufficient and will introduce a bug
in another place.
As you found, llc has the same test in another place. For example, when
you create an AF_LLC socket, it has to be in the root netns. But if you
remove the test in llc_rcv() only, it seems llc_recv() would put a skb for
a child netns into sk's recv queue that is in the default netns.
- llc_rcv
- if (net_eq(dev_net(dev), &init_net))
- goto drop
- sap_handler / llc_sap_handler
- sk = llc_lookup_dgram
- llc_sap_rcv
- llc_sap_state_process
- sock_queue_rcv_skb
So, we need to namespacify the whole llc infra.
^ permalink raw reply [flat|nested] 17+ messages in thread
* llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 3:22 ` Kuniyuki Iwashima
@ 2023-07-11 17:08 ` Harry Coin
2023-07-11 18:32 ` Kuniyuki Iwashima
0 siblings, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-11 17:08 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: netdev
On 7/10/23 22:22, Kuniyuki Iwashima wrote:
> From: Harry Coin <hcoin@quietfountain.com>
> Date: Mon, 10 Jul 2023 08:35:08 -0500
>> Notice without access to link-level multicast address 01:80:C2:00:00:00,
>> the STP loop-avoidance feature of bridges fails silently, leading to
>> packet storms if loops exist in the related L2. The Linux kernel's
>> latest code silently drops BPDU STP packets if the bridge is in a
>> non-default namespace.
>>
>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>>
>> if (!net_eq(dev_net(dev), &init_net))
>> goto drop;
>>
>> Which, when commented out, fixes this bug. A search on &init_net may
>> reveal many similar artifacts left over from the early days of namespace
>> implementation.
> I think just removing the part is not sufficient and will introduce a bug
> in another place.
>
> As you found, llc has the same test in another place. For example, when
> you create an AF_LLC socket, it has to be in the root netns. But if you
> remove the test in llc_rcv() only, it seems llc_recv() would put a skb for
> a child netns into sk's recv queue that is in the default netns.
>
> - llc_rcv
> - if (net_eq(dev_net(dev), &init_net))
> - goto drop
> - sap_handler / llc_sap_handler
> - sk = llc_lookup_dgram
> - llc_sap_rcv
> - llc_sap_state_process
> - sock_queue_rcv_skb
>
> So, we need to namespacify the whole llc infra.
Agreed. Probably sooner rather than later since IP4 and IP6 multicast,
GARP and more as well as STP depends on llc multicast delivery. I
suspect the authors who added the 'drop unless default namespace' code
commented out above knew this, and were just buying some time. Well,
the time has come.
Now all bridges in a namespace will always -- and silently -- think of
itself as the 'root bridge' as it can't get packets informing it
otherwise. This leads to packet storms at line-level speeds bringing
whole infrastructures down in a self-inflicted event worse than a DDOS
attack.
I think whoever does 'advisories' ought to warn the community that ipv6
ndp (if using multicast), ipv4 arp (if using multicast), bridges with
STP, lldp, GARP, ipv6 multicast and ipv4 mulitcast for sockets in the
non-default namespace will not get RX traffic as it gets dropped in the
kernel before other modules or user code has a chance to see it.
Outcomes range from local seeming disconnection to kernel induced
site-crippling packet storms.
Is there a way to track this llc namespace awareness effort? I'm new to
this particular dev community. It's on a critical path for my project.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 17:08 ` llc needs namespace awareness asap, was " Harry Coin
@ 2023-07-11 18:32 ` Kuniyuki Iwashima
2023-07-11 20:22 ` Harry Coin
0 siblings, 1 reply; 17+ messages in thread
From: Kuniyuki Iwashima @ 2023-07-11 18:32 UTC (permalink / raw)
To: hcoin; +Cc: kuniyu, netdev
From: Harry Coin <hcoin@quietfountain.com>
Date: Tue, 11 Jul 2023 12:08:15 -0500
> On 7/10/23 22:22, Kuniyuki Iwashima wrote:
> > From: Harry Coin <hcoin@quietfountain.com>
> > Date: Mon, 10 Jul 2023 08:35:08 -0500
> >> Notice without access to link-level multicast address 01:80:C2:00:00:00,
> >> the STP loop-avoidance feature of bridges fails silently, leading to
> >> packet storms if loops exist in the related L2. The Linux kernel's
> >> latest code silently drops BPDU STP packets if the bridge is in a
> >> non-default namespace.
> >>
> >> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
> >>
> >> if (!net_eq(dev_net(dev), &init_net))
> >> goto drop;
> >>
> >> Which, when commented out, fixes this bug. A search on &init_net may
> >> reveal many similar artifacts left over from the early days of namespace
> >> implementation.
> > I think just removing the part is not sufficient and will introduce a bug
> > in another place.
> >
> > As you found, llc has the same test in another place. For example, when
> > you create an AF_LLC socket, it has to be in the root netns. But if you
> > remove the test in llc_rcv() only, it seems llc_recv() would put a skb for
> > a child netns into sk's recv queue that is in the default netns.
> >
> > - llc_rcv
> > - if (net_eq(dev_net(dev), &init_net))
> > - goto drop
> > - sap_handler / llc_sap_handler
> > - sk = llc_lookup_dgram
> > - llc_sap_rcv
> > - llc_sap_state_process
> > - sock_queue_rcv_skb
> >
> > So, we need to namespacify the whole llc infra.
>
> Agreed. Probably sooner rather than later since IP4 and IP6 multicast,
> GARP and more as well as STP depends on llc multicast delivery. I
> suspect the authors who added the 'drop unless default namespace' code
> commented out above knew this, and were just buying some time. Well,
> the time has come.
>
> Now all bridges in a namespace will always -- and silently -- think of
> itself as the 'root bridge' as it can't get packets informing it
> otherwise. This leads to packet storms at line-level speeds bringing
> whole infrastructures down in a self-inflicted event worse than a DDOS
> attack.
>
> I think whoever does 'advisories' ought to warn the community that ipv6
> ndp (if using multicast), ipv4 arp (if using multicast), bridges with
> STP, lldp, GARP, ipv6 multicast and ipv4 mulitcast for sockets in the
> non-default namespace will not get RX traffic as it gets dropped in the
> kernel before other modules or user code has a chance to see it.
> Outcomes range from local seeming disconnection to kernel induced
> site-crippling packet storms.
>
> Is there a way to track this llc namespace awareness effort? I'm new to
> this particular dev community. It's on a critical path for my project.
AFAIK, there is no ongoing work for this. I can spend some cycles on
this, but note that the patches might not be backported to stable as
it would be invasive.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 18:32 ` Kuniyuki Iwashima
@ 2023-07-11 20:22 ` Harry Coin
2023-07-11 20:44 ` Andrew Lunn
0 siblings, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-11 20:22 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: netdev
On 7/11/23 13:32, Kuniyuki Iwashima wrote:
> From: Harry Coin <hcoin@quietfountain.com>
> Date: Tue, 11 Jul 2023 12:08:15 -0500
>> On 7/10/23 22:22, Kuniyuki Iwashima wrote:
>>> From: Harry Coin <hcoin@quietfountain.com>
>>> Date: Mon, 10 Jul 2023 08:35:08 -0500
>>>> Notice without access to link-level multicast address 01:80:C2:00:00:00,
>>>> the STP loop-avoidance feature of bridges fails silently, leading to
>>>> packet storms if loops exist in the related L2. The Linux kernel's
>>>> latest code silently drops BPDU STP packets if the bridge is in a
>>>> non-default namespace.
>>>>
>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>>>>
>>>> if (!net_eq(dev_net(dev), &init_net))
>>>> goto drop;
>>>>
>>>> Which, when commented out, fixes this bug. A search on &init_net may
>>>> reveal many similar artifacts left over from the early days of namespace
>>>> implementation.
>>> I think just removing the part is not sufficient and will introduce a bug
>>> in another place.
>>>
>>> As you found, llc has the same test in another place. For example, when
>>> you create an AF_LLC socket, it has to be in the root netns. But if you
>>> remove the test in llc_rcv() only, it seems llc_recv() would put a skb for
>>> a child netns into sk's recv queue that is in the default netns.
>>>
>>> - llc_rcv
>>> - if (net_eq(dev_net(dev), &init_net))
>>> - goto drop
>>> - sap_handler / llc_sap_handler
>>> - sk = llc_lookup_dgram
>>> - llc_sap_rcv
>>> - llc_sap_state_process
>>> - sock_queue_rcv_skb
>>>
>>> So, we need to namespacify the whole llc infra.
>> Agreed. Probably sooner rather than later since IP4 and IP6 multicast,
>> GARP and more as well as STP depends on llc multicast delivery. I
>> suspect the authors who added the 'drop unless default namespace' code
>> commented out above knew this, and were just buying some time. Well,
>> the time has come.
>>
>> Now all bridges in a namespace will always -- and silently -- think of
>> itself as the 'root bridge' as it can't get packets informing it
>> otherwise. This leads to packet storms at line-level speeds bringing
>> whole infrastructures down in a self-inflicted event worse than a DDOS
>> attack.
>>
>> I think whoever does 'advisories' ought to warn the community that ipv6
>> ndp (if using multicast), ipv4 arp (if using multicast), bridges with
>> STP, lldp, GARP, ipv6 multicast and ipv4 mulitcast for sockets in the
>> non-default namespace will not get RX traffic as it gets dropped in the
>> kernel before other modules or user code has a chance to see it.
>> Outcomes range from local seeming disconnection to kernel induced
>> site-crippling packet storms.
>>
>> Is there a way to track this llc namespace awareness effort? I'm new to
>> this particular dev community. It's on a critical path for my project.
> AFAIK, there is no ongoing work for this. I can spend some cycles on
> this, but note that the patches might not be backported to stable as
> it would be invasive.
Thank you! When you offer your patches, and you hear worries about
being 'invasive', it's worth asking 'compared to what' -- since the
'status quo' is every bridge with STP in a non default namespace with a
loop in it somewhere will freeze every connected system more solid than
ice in Antarctica.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 20:22 ` Harry Coin
@ 2023-07-11 20:44 ` Andrew Lunn
2023-07-11 21:40 ` Harry Coin
2023-07-12 0:49 ` Jakub Kicinski
0 siblings, 2 replies; 17+ messages in thread
From: Andrew Lunn @ 2023-07-11 20:44 UTC (permalink / raw)
To: Harry Coin; +Cc: Kuniyuki Iwashima, netdev
> > > > > The current llc_rcv.c around line 166 in net/llc/llc_input.c has
> > > > >
> > > > > if (!net_eq(dev_net(dev), &init_net))
> > > > > goto drop;
> > > > >
> Thank you! When you offer your patches, and you hear worries about being
> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
> is every bridge with STP in a non default namespace with a loop in it
> somewhere will freeze every connected system more solid than ice in
> Antarctica.
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
say:
o It must be obviously correct and tested.
o It cannot be bigger than 100 lines, with context.
o It must fix only one thing.
o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
git blame shows:
commit 721499e8931c5732202481ae24f2dfbf9910f129
Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Sat Jul 19 22:34:43 2008 -0700
netns: Use net_eq() to compare net-namespaces for optimization.
diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
index 1c45f172991e..57ad974e4d94 100644
--- a/net/llc/llc_input.c
+++ b/net/llc/llc_input.c
@@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
int (*rcv)(struct sk_buff *, struct net_device *,
struct packet_type *, struct net_device *);
- if (dev_net(dev) != &init_net)
+ if (!net_eq(dev_net(dev), &init_net))
goto drop;
/*
So this is just an optimization.
The test itself was added in
ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Mon Sep 17 11:53:39 2007 -0700
[NET]: Make packet reception network namespace safe
This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.
This should ensure that the various network stacks do
not receive packets in a anything but the initial network
namespace until the code has been converted and is ready
for them.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that was over 15 years ago.
It appears it has not bothered people for over 15 years.
Lets wait until we get to see the actual fix. We can then decide how
simple/hard it is to back port to stable, if it fulfils the stable
rules or not.
Andrew
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 20:44 ` Andrew Lunn
@ 2023-07-11 21:40 ` Harry Coin
2023-07-11 21:51 ` Kuniyuki Iwashima
2023-07-12 0:49 ` Jakub Kicinski
1 sibling, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-11 21:40 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Kuniyuki Iwashima, netdev
On 7/11/23 15:44, Andrew Lunn wrote:
>>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>>>>>>
>>>>>> if (!net_eq(dev_net(dev), &init_net))
>>>>>> goto drop;
>>>>>>
>> Thank you! When you offer your patches, and you hear worries about being
>> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
>> is every bridge with STP in a non default namespace with a loop in it
>> somewhere will freeze every connected system more solid than ice in
>> Antarctica.
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>
> say:
>
> o It must be obviously correct and tested.
> o It cannot be bigger than 100 lines, with context.
> o It must fix only one thing.
> o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
>
> git blame shows:
>
> commit 721499e8931c5732202481ae24f2dfbf9910f129
> Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Date: Sat Jul 19 22:34:43 2008 -0700
>
> netns: Use net_eq() to compare net-namespaces for optimization.
>
> diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
> index 1c45f172991e..57ad974e4d94 100644
> --- a/net/llc/llc_input.c
> +++ b/net/llc/llc_input.c
> @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
> int (*rcv)(struct sk_buff *, struct net_device *,
> struct packet_type *, struct net_device *);
>
> - if (dev_net(dev) != &init_net)
> + if (!net_eq(dev_net(dev), &init_net))
> goto drop;
>
> /*
>
> So this is just an optimization.
>
> The test itself was added in
>
> ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date: Mon Sep 17 11:53:39 2007 -0700
>
> [NET]: Make packet reception network namespace safe
>
> This patch modifies every packet receive function
> registered with dev_add_pack() to drop packets if they
> are not from the initial network namespace.
>
> This should ensure that the various network stacks do
> not receive packets in a anything but the initial network
> namespace until the code has been converted and is ready
> for them.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> So that was over 15 years ago.
>
> It appears it has not bothered people for over 15 years.
>
> Lets wait until we get to see the actual fix. We can then decide how
> simple/hard it is to back port to stable, if it fulfils the stable
> rules or not.
>
> Andrew
Andrew, fair enough. In the time until it's fixed, the kernel folks
should publish an advisory and block any attempt to set bridge stp state
to other than 0 if in a non-default namespace. The alternative is a
packet flood at whatever the top line speed is should there be a loop
somewhere in even one connected link.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 21:40 ` Harry Coin
@ 2023-07-11 21:51 ` Kuniyuki Iwashima
2023-07-11 22:44 ` Harry Coin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Kuniyuki Iwashima @ 2023-07-11 21:51 UTC (permalink / raw)
To: hcoin; +Cc: andrew, kuniyu, netdev
From: Harry Coin <hcoin@quietfountain.com>
Date: Tue, 11 Jul 2023 16:40:03 -0500
> On 7/11/23 15:44, Andrew Lunn wrote:
> >>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
> >>>>>>
> >>>>>> if (!net_eq(dev_net(dev), &init_net))
> >>>>>> goto drop;
> >>>>>>
> >> Thank you! When you offer your patches, and you hear worries about being
> >> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
> >> is every bridge with STP in a non default namespace with a loop in it
> >> somewhere will freeze every connected system more solid than ice in
> >> Antarctica.
> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> >
> > say:
> >
> > o It must be obviously correct and tested.
> > o It cannot be bigger than 100 lines, with context.
> > o It must fix only one thing.
> > o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
> >
> > git blame shows:
> >
> > commit 721499e8931c5732202481ae24f2dfbf9910f129
> > Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> > Date: Sat Jul 19 22:34:43 2008 -0700
> >
> > netns: Use net_eq() to compare net-namespaces for optimization.
> >
> > diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
> > index 1c45f172991e..57ad974e4d94 100644
> > --- a/net/llc/llc_input.c
> > +++ b/net/llc/llc_input.c
> > @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
> > int (*rcv)(struct sk_buff *, struct net_device *,
> > struct packet_type *, struct net_device *);
> >
> > - if (dev_net(dev) != &init_net)
> > + if (!net_eq(dev_net(dev), &init_net))
> > goto drop;
> >
> > /*
> >
> > So this is just an optimization.
> >
> > The test itself was added in
> >
> > ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
> > Author: Eric W. Biederman <ebiederm@xmission.com>
> > Date: Mon Sep 17 11:53:39 2007 -0700
> >
> > [NET]: Make packet reception network namespace safe
> >
> > This patch modifies every packet receive function
> > registered with dev_add_pack() to drop packets if they
> > are not from the initial network namespace.
> >
> > This should ensure that the various network stacks do
> > not receive packets in a anything but the initial network
> > namespace until the code has been converted and is ready
> > for them.
> >
> > Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> >
> > So that was over 15 years ago.
> >
> > It appears it has not bothered people for over 15 years.
> >
> > Lets wait until we get to see the actual fix. We can then decide how
> > simple/hard it is to back port to stable, if it fulfils the stable
> > rules or not.
> >
> > Andrew
>
> Andrew, fair enough. In the time until it's fixed, the kernel folks
> should publish an advisory and block any attempt to set bridge stp state
> to other than 0 if in a non-default namespace. The alternative is a
> packet flood at whatever the top line speed is should there be a loop
> somewhere in even one connected link.
Like this ? Someone who didn't notice the issue might complain about
it as regression.
---8<---
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 75204d36d7f9..a807996ac56b 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -201,6 +201,11 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
{
ASSERT_RTNL();
+ if (!net_eq(dev_net(br->dev), &init_net)) {
+ NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
+ return -EINVAL;
+ }
+
if (br_mrp_enabled(br)) {
NL_SET_ERR_MSG_MOD(extack,
"STP can't be enabled if MRP is already enabled");
---8<---
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 21:51 ` Kuniyuki Iwashima
@ 2023-07-11 22:44 ` Harry Coin
2023-07-11 22:56 ` Kuniyuki Iwashima
[not found] ` <b01e5af6-e397-486d-3428-6fa30a919042@quietfountain.com>
2023-07-12 9:44 ` [Bridge] " Petr Machata
2 siblings, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-11 22:44 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: andrew, netdev
On 7/11/23 16:51, Kuniyuki Iwashima wrote:
> From: Harry Coin<hcoin@quietfountain.com>
> Date: Tue, 11 Jul 2023 16:40:03 -0500
>> On 7/11/23 15:44, Andrew Lunn wrote:
>>>>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>>>>>>>>
>>>>>>>> if (!net_eq(dev_net(dev), &init_net))
>>>>>>>> goto drop;
>>>>>>>>
>>>> Thank you! When you offer your patches, and you hear worries about being
>>>> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
>>>> is every bridge with STP in a non default namespace with a loop in it
>>>> somewhere will freeze every connected system more solid than ice in
>>>> Antarctica.
>>> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>>>
>>> say:
>>>
>>> o It must be obviously correct and tested.
>>> o It cannot be bigger than 100 lines, with context.
>>> o It must fix only one thing.
>>> o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
>>>
>>> git blame shows:
>>>
>>> commit 721499e8931c5732202481ae24f2dfbf9910f129
>>> Author: YOSHIFUJI Hideaki<yoshfuji@linux-ipv6.org>
>>> Date: Sat Jul 19 22:34:43 2008 -0700
>>>
>>> netns: Use net_eq() to compare net-namespaces for optimization.
>>>
>>> diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
>>> index 1c45f172991e..57ad974e4d94 100644
>>> --- a/net/llc/llc_input.c
>>> +++ b/net/llc/llc_input.c
>>> @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
>>> int (*rcv)(struct sk_buff *, struct net_device *,
>>> struct packet_type *, struct net_device *);
>>>
>>> - if (dev_net(dev) != &init_net)
>>> + if (!net_eq(dev_net(dev), &init_net))
>>> goto drop;
>>>
>>> /*
>>>
>>> So this is just an optimization.
>>>
>>> The test itself was added in
>>>
>>> ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
>>> Author: Eric W. Biederman<ebiederm@xmission.com>
>>> Date: Mon Sep 17 11:53:39 2007 -0700
>>>
>>> [NET]: Make packet reception network namespace safe
>>>
>>> This patch modifies every packet receive function
>>> registered with dev_add_pack() to drop packets if they
>>> are not from the initial network namespace.
>>>
>>> This should ensure that the various network stacks do
>>> not receive packets in a anything but the initial network
>>> namespace until the code has been converted and is ready
>>> for them.
>>>
>>> Signed-off-by: Eric W. Biederman<ebiederm@xmission.com>
>>> Signed-off-by: David S. Miller<davem@davemloft.net>
>>>
>>> So that was over 15 years ago.
>>>
>>> It appears it has not bothered people for over 15 years.
>>>
>>> Lets wait until we get to see the actual fix. We can then decide how
>>> simple/hard it is to back port to stable, if it fulfils the stable
>>> rules or not.
>>>
>>> Andrew
>> Andrew, fair enough. In the time until it's fixed, the kernel folks
>> should publish an advisory and block any attempt to set bridge stp state
>> to other than 0 if in a non-default namespace. The alternative is a
>> packet flood at whatever the top line speed is should there be a loop
>> somewhere in even one connected link.
> Like this ? Someone who didn't notice the issue might complain about
> it as regression.
>
> ---8<---
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 75204d36d7f9..a807996ac56b 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -201,6 +201,11 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
> {
> ASSERT_RTNL();
>
> + if (!net_eq(dev_net(br->dev), &init_net)) {
> + NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
> + return -EINVAL;
> + }
> +
> if (br_mrp_enabled(br)) {
> NL_SET_ERR_MSG_MOD(extack,
> "STP can't be enabled if MRP is already enabled");
> ---8<---
Something like that, but to your point about regression -- It a
statistically good bet there are many bridges with STP enabled in
non-default name spaces that simply have no loops in L2 BUT these are
'buried' inside docker images or prepackaged setups that work 'just
fine standalone' and also 'in lab namespaces (that just don't have L2
loops...) and so that don't hit the bug. These users are one "cable
click to an open port already connected somewhere they can't see" away
from bringing down every computer on their entire link (like me, been
there, it's not a happy week for anyone...), they just don't know it.
And their vendors 'trust STP, so that can't be it' --- but it is.
If the patch above gets installed-- then packagers downstream will have
to respond with effort to add code to turn off STP if finding themselves
in a namespace, and not if not. They will be displeased at having to
accommodate then de-accommodate when the fix lands. As a 'usually
downstream of the kernel' developer, I'd rather be warned than blocked.
Looking at those dates... wow! I expect other os kernels and standalone
switch vendors would see fixing this one as a removing a reliability
advantage they've had for a long time.
Perhaps a broadcast advisory "Until this is fixed, your site will have a
packet flood worse than an internal DDOS attack if there's a loop in a
link layer and if even one docker image or prepackaged project uses a
net bridge with STP enabled and is deployed in a non-default netns / net
namespace. Check with your package vendors if you're not sure. You'll
avoid this problem if your link layer layout chart is tree-and-branch
without even one crosslink." Yup, that'll be somewhat less than
popular. But better warned and awaiting a fix than blocked.
How hard can the fix be? Instead of dropping the packet if in the
non-default namespace, as each device is in a namespace it should be
fine to pass the packet only to listeners in the same namespace as the
device that received the packet. Back in the day this code was written,
it was probably 'hard to know' among the multicast subscribers what
namespace they were in.
I suspect the impact of this fix on existing code will be minor since
the only effect will be packets appearing where they were expected
before but not received.
Harry
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 22:44 ` Harry Coin
@ 2023-07-11 22:56 ` Kuniyuki Iwashima
0 siblings, 0 replies; 17+ messages in thread
From: Kuniyuki Iwashima @ 2023-07-11 22:56 UTC (permalink / raw)
To: hcoin; +Cc: andrew, kuniyu, netdev
From: Harry Coin <hcoin@quietfountain.com>
Date: Tue, 11 Jul 2023 17:44:20 -0500
> On 7/11/23 16:51, Kuniyuki Iwashima wrote:
> > From: Harry Coin<hcoin@quietfountain.com>
> > Date: Tue, 11 Jul 2023 16:40:03 -0500
> >> On 7/11/23 15:44, Andrew Lunn wrote:
> >>>>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
> >>>>>>>>
> >>>>>>>> if (!net_eq(dev_net(dev), &init_net))
> >>>>>>>> goto drop;
> >>>>>>>>
> >>>> Thank you! When you offer your patches, and you hear worries about being
> >>>> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
> >>>> is every bridge with STP in a non default namespace with a loop in it
> >>>> somewhere will freeze every connected system more solid than ice in
> >>>> Antarctica.
> >>> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> >>>
> >>> say:
> >>>
> >>> o It must be obviously correct and tested.
> >>> o It cannot be bigger than 100 lines, with context.
> >>> o It must fix only one thing.
> >>> o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
> >>>
> >>> git blame shows:
> >>>
> >>> commit 721499e8931c5732202481ae24f2dfbf9910f129
> >>> Author: YOSHIFUJI Hideaki<yoshfuji@linux-ipv6.org>
> >>> Date: Sat Jul 19 22:34:43 2008 -0700
> >>>
> >>> netns: Use net_eq() to compare net-namespaces for optimization.
> >>>
> >>> diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
> >>> index 1c45f172991e..57ad974e4d94 100644
> >>> --- a/net/llc/llc_input.c
> >>> +++ b/net/llc/llc_input.c
> >>> @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
> >>> int (*rcv)(struct sk_buff *, struct net_device *,
> >>> struct packet_type *, struct net_device *);
> >>>
> >>> - if (dev_net(dev) != &init_net)
> >>> + if (!net_eq(dev_net(dev), &init_net))
> >>> goto drop;
> >>>
> >>> /*
> >>>
> >>> So this is just an optimization.
> >>>
> >>> The test itself was added in
> >>>
> >>> ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
> >>> Author: Eric W. Biederman<ebiederm@xmission.com>
> >>> Date: Mon Sep 17 11:53:39 2007 -0700
> >>>
> >>> [NET]: Make packet reception network namespace safe
> >>>
> >>> This patch modifies every packet receive function
> >>> registered with dev_add_pack() to drop packets if they
> >>> are not from the initial network namespace.
> >>>
> >>> This should ensure that the various network stacks do
> >>> not receive packets in a anything but the initial network
> >>> namespace until the code has been converted and is ready
> >>> for them.
> >>>
> >>> Signed-off-by: Eric W. Biederman<ebiederm@xmission.com>
> >>> Signed-off-by: David S. Miller<davem@davemloft.net>
> >>>
> >>> So that was over 15 years ago.
> >>>
> >>> It appears it has not bothered people for over 15 years.
> >>>
> >>> Lets wait until we get to see the actual fix. We can then decide how
> >>> simple/hard it is to back port to stable, if it fulfils the stable
> >>> rules or not.
> >>>
> >>> Andrew
> >> Andrew, fair enough. In the time until it's fixed, the kernel folks
> >> should publish an advisory and block any attempt to set bridge stp state
> >> to other than 0 if in a non-default namespace. The alternative is a
> >> packet flood at whatever the top line speed is should there be a loop
> >> somewhere in even one connected link.
> > Like this ? Someone who didn't notice the issue might complain about
> > it as regression.
> >
> > ---8<---
> > diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> > index 75204d36d7f9..a807996ac56b 100644
> > --- a/net/bridge/br_stp_if.c
> > +++ b/net/bridge/br_stp_if.c
> > @@ -201,6 +201,11 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
> > {
> > ASSERT_RTNL();
> >
> > + if (!net_eq(dev_net(br->dev), &init_net)) {
> > + NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
> > + return -EINVAL;
> > + }
> > +
> > if (br_mrp_enabled(br)) {
> > NL_SET_ERR_MSG_MOD(extack,
> > "STP can't be enabled if MRP is already enabled");
> > ---8<---
>
> Something like that, but to your point about regression -- It a
> statistically good bet there are many bridges with STP enabled in
> non-default name spaces that simply have no loops in L2 BUT these are
> 'buried' inside docker images or prepackaged setups that work 'just
> fine standalone' and also 'in lab namespaces (that just don't have L2
> loops...) and so that don't hit the bug. These users are one "cable
> click to an open port already connected somewhere they can't see" away
> from bringing down every computer on their entire link (like me, been
> there, it's not a happy week for anyone...), they just don't know it.
> And their vendors 'trust STP, so that can't be it' --- but it is.
>
> If the patch above gets installed-- then packagers downstream will have
> to respond with effort to add code to turn off STP if finding themselves
> in a namespace, and not if not. They will be displeased at having to
> accommodate then de-accommodate when the fix lands. As a 'usually
> downstream of the kernel' developer, I'd rather be warned than blocked.
Ok, will post the diff above as a formal patch shortly.
>
> Looking at those dates... wow! I expect other os kernels and standalone
> switch vendors would see fixing this one as a removing a reliability
> advantage they've had for a long time.
>
> Perhaps a broadcast advisory "Until this is fixed, your site will have a
> packet flood worse than an internal DDOS attack if there's a loop in a
> link layer and if even one docker image or prepackaged project uses a
> net bridge with STP enabled and is deployed in a non-default netns / net
> namespace. Check with your package vendors if you're not sure. You'll
> avoid this problem if your link layer layout chart is tree-and-branch
> without even one crosslink." Yup, that'll be somewhat less than
> popular. But better warned and awaiting a fix than blocked.
>
> How hard can the fix be? Instead of dropping the packet if in the
> non-default namespace, as each device is in a namespace it should be
> fine to pass the packet only to listeners in the same namespace as the
> device that received the packet. Back in the day this code was written,
> it was probably 'hard to know' among the multicast subscribers what
> namespace they were in.
>
> I suspect the impact of this fix on existing code will be minor since
> the only effect will be packets appearing where they were expected
> before but not received.
Looking around the code, fixing the issue will not be so hard,
but we should be careful so that we will not leak frames that
should have been invisible.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 20:44 ` Andrew Lunn
2023-07-11 21:40 ` Harry Coin
@ 2023-07-12 0:49 ` Jakub Kicinski
1 sibling, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2023-07-12 0:49 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Harry Coin, Kuniyuki Iwashima, netdev
On Tue, 11 Jul 2023 22:44:44 +0200 Andrew Lunn wrote:
> o It cannot be bigger than 100 lines, with context.
Let's take stable rules with a large pinch of salt.
None of them are hard rules, especially the 100 line one.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
[not found] ` <b01e5af6-e397-486d-3428-6fa30a919042@quietfountain.com>
@ 2023-07-12 0:55 ` Andrew Lunn
2023-07-12 3:06 ` Harry Coin
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2023-07-12 0:55 UTC (permalink / raw)
To: Harry Coin; +Cc: Kuniyuki Iwashima, netdev
> Something like that, but to your point about regression -- It a
> statistically good bet there are many bridges with STP enabled in
> non-default name spaces that simply have no loops in L2 BUT these are
> 'buried' inside docker images or prepackaged setups that work 'just fine
> standalone' and also 'in lab namespaces (that just don't have L2 loops...)
> and so that don't hit the bug. These users are one "cable click to an open
> port already connected somewhere they can't see" away from bringing down
> every computer on their entire link (like me, been there, it's not a happy
> week for anyone...), they just don't know it. And their vendors 'trust STP,
> so that can't be it' --- but it is.
>
> If the patch above gets installed-- then packagers downstream will have to
> respond with effort to add code to turn off STP if finding themselves in a
> namespace, and not if not. They will be displeased at having to
> accommodate then de-accommodate when the fix lands. As a 'usually
> downstream of the kernel' developer, I'd rather be warned than blocked.
I don't know this code at all, so maybe a dumb question. What about
user space STP and RSTP? Do they get to see BPDUs? If that works, we
need to ensure any checking you add does not break that use case.
Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-12 0:55 ` Andrew Lunn
@ 2023-07-12 3:06 ` Harry Coin
2023-07-13 22:37 ` Stephen Hemminger
0 siblings, 1 reply; 17+ messages in thread
From: Harry Coin @ 2023-07-12 3:06 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Kuniyuki Iwashima, netdev
On 7/11/23 19:55, Andrew Lunn wrote:
>> Something like that, but to your point about regression -- It a
>> statistically good bet there are many bridges with STP enabled in
>> non-default name spaces that simply have no loops in L2 BUT these are
>> 'buried' inside docker images or prepackaged setups that work 'just fine
>> standalone' and also 'in lab namespaces (that just don't have L2 loops...)
>> and so that don't hit the bug. These users are one "cable click to an open
>> port already connected somewhere they can't see" away from bringing down
>> every computer on their entire link (like me, been there, it's not a happy
>> week for anyone...), they just don't know it. And their vendors 'trust STP,
>> so that can't be it' --- but it is.
>>
>> If the patch above gets installed-- then packagers downstream will have to
>> respond with effort to add code to turn off STP if finding themselves in a
>> namespace, and not if not. They will be displeased at having to
>> accommodate then de-accommodate when the fix lands. As a 'usually
>> downstream of the kernel' developer, I'd rather be warned than blocked.
> I don't know this code at all, so maybe a dumb question. What about
> user space STP and RSTP? Do they get to see BPDUs? If that works, we
> need to ensure any checking you add does not break that use case.
>
> Andrew
Andrew, the only RTSP / STP provider I know of is open-vswitch and their
docs (last I read them) provide the caution to use veth pairs to
namespaces, but not run their daemon there--- and fair enough as no
multicast llc would make it to that code in a namespace as currently
kernel implemented.
Like STP and namespaces in bridge code quite happily accepting commands
it fails to deliver (though it's properly a subject for another related
thread, you really have to read lots of fine print to notice the kernel
bridge code has both STP and vlan capability, but they do *not* play
well robustly though the system happily accepts configs like they do
(STP BPDUs appropriate to one vlan can go out untagged..)
Anyhow, to your point: As zero multicast L2 packets make it to the
kernel protocols stacks, much less user space if they aren't in the
default net name space, this fix at most would allow packets they
presently expect, are documented to get, but somehow magically never
arrive -- to arrive.
Put more 'mathematically': Neither the bug nor its fix will change
anything in the primary namespace, neither the bug nor its fix will
change packets that presently arrive in the non-default namespace to
behave other than as now they do. The only change should be to cause
multicast packets that ought to be delivered to listeners in the
non-default namespace to get them.
HTH
Harry Coin
Bettendorf, Iowa
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-11 21:51 ` Kuniyuki Iwashima
@ 2023-07-12 9:44 ` Petr Machata
[not found] ` <b01e5af6-e397-486d-3428-6fa30a919042@quietfountain.com>
2023-07-12 9:44 ` [Bridge] " Petr Machata
2 siblings, 0 replies; 17+ messages in thread
From: Petr Machata @ 2023-07-12 9:44 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: hcoin, andrew, netdev, bridge
(CC'ing bridge maintainers.)
Kuniyuki Iwashima <kuniyu@amazon.com> writes:
> From: Harry Coin <hcoin@quietfountain.com>
> Date: Tue, 11 Jul 2023 16:40:03 -0500
>> On 7/11/23 15:44, Andrew Lunn wrote:
>> >>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>> >>>>>>
>> >>>>>> if (!net_eq(dev_net(dev), &init_net))
>> >>>>>> goto drop;
>> >>>>>>
>> >> Thank you! When you offer your patches, and you hear worries about being
>> >> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
>> >> is every bridge with STP in a non default namespace with a loop in it
>> >> somewhere will freeze every connected system more solid than ice in
>> >> Antarctica.
>> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>> >
>> > say:
>> >
>> > o It must be obviously correct and tested.
>> > o It cannot be bigger than 100 lines, with context.
>> > o It must fix only one thing.
>> > o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
>> >
>> > git blame shows:
>> >
>> > commit 721499e8931c5732202481ae24f2dfbf9910f129
>> > Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
>> > Date: Sat Jul 19 22:34:43 2008 -0700
>> >
>> > netns: Use net_eq() to compare net-namespaces for optimization.
>> >
>> > diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
>> > index 1c45f172991e..57ad974e4d94 100644
>> > --- a/net/llc/llc_input.c
>> > +++ b/net/llc/llc_input.c
>> > @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
>> > int (*rcv)(struct sk_buff *, struct net_device *,
>> > struct packet_type *, struct net_device *);
>> >
>> > - if (dev_net(dev) != &init_net)
>> > + if (!net_eq(dev_net(dev), &init_net))
>> > goto drop;
>> >
>> > /*
>> >
>> > So this is just an optimization.
>> >
>> > The test itself was added in
>> >
>> > ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
>> > Author: Eric W. Biederman <ebiederm@xmission.com>
>> > Date: Mon Sep 17 11:53:39 2007 -0700
>> >
>> > [NET]: Make packet reception network namespace safe
>> >
>> > This patch modifies every packet receive function
>> > registered with dev_add_pack() to drop packets if they
>> > are not from the initial network namespace.
>> >
>> > This should ensure that the various network stacks do
>> > not receive packets in a anything but the initial network
>> > namespace until the code has been converted and is ready
>> > for them.
>> >
>> > Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> > Signed-off-by: David S. Miller <davem@davemloft.net>
>> >
>> > So that was over 15 years ago.
>> >
>> > It appears it has not bothered people for over 15 years.
>> >
>> > Lets wait until we get to see the actual fix. We can then decide how
>> > simple/hard it is to back port to stable, if it fulfils the stable
>> > rules or not.
>> >
>> > Andrew
>>
>> Andrew, fair enough. In the time until it's fixed, the kernel folks
>> should publish an advisory and block any attempt to set bridge stp state
>> to other than 0 if in a non-default namespace. The alternative is a
>> packet flood at whatever the top line speed is should there be a loop
>> somewhere in even one connected link.
>
> Like this ? Someone who didn't notice the issue might complain about
> it as regression.
>
> ---8<---
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 75204d36d7f9..a807996ac56b 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -201,6 +201,11 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
> {
> ASSERT_RTNL();
>
> + if (!net_eq(dev_net(br->dev), &init_net)) {
> + NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
> + return -EINVAL;
> + }
> +
> if (br_mrp_enabled(br)) {
> NL_SET_ERR_MSG_MOD(extack,
> "STP can't be enabled if MRP is already enabled");
> ---8<---
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Bridge] llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
@ 2023-07-12 9:44 ` Petr Machata
0 siblings, 0 replies; 17+ messages in thread
From: Petr Machata @ 2023-07-12 9:44 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: bridge, hcoin, netdev, andrew
(CC'ing bridge maintainers.)
Kuniyuki Iwashima <kuniyu@amazon.com> writes:
> From: Harry Coin <hcoin@quietfountain.com>
> Date: Tue, 11 Jul 2023 16:40:03 -0500
>> On 7/11/23 15:44, Andrew Lunn wrote:
>> >>>>>> The current llc_rcv.c around line 166 in net/llc/llc_input.c has
>> >>>>>>
>> >>>>>> if (!net_eq(dev_net(dev), &init_net))
>> >>>>>> goto drop;
>> >>>>>>
>> >> Thank you! When you offer your patches, and you hear worries about being
>> >> 'invasive', it's worth asking 'compared to what' -- since the 'status quo'
>> >> is every bridge with STP in a non default namespace with a loop in it
>> >> somewhere will freeze every connected system more solid than ice in
>> >> Antarctica.
>> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>> >
>> > say:
>> >
>> > o It must be obviously correct and tested.
>> > o It cannot be bigger than 100 lines, with context.
>> > o It must fix only one thing.
>> > o It must fix a real bug that bothers people (not a, "This could be a problem..." type thing).
>> >
>> > git blame shows:
>> >
>> > commit 721499e8931c5732202481ae24f2dfbf9910f129
>> > Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
>> > Date: Sat Jul 19 22:34:43 2008 -0700
>> >
>> > netns: Use net_eq() to compare net-namespaces for optimization.
>> >
>> > diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
>> > index 1c45f172991e..57ad974e4d94 100644
>> > --- a/net/llc/llc_input.c
>> > +++ b/net/llc/llc_input.c
>> > @@ -150,7 +150,7 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
>> > int (*rcv)(struct sk_buff *, struct net_device *,
>> > struct packet_type *, struct net_device *);
>> >
>> > - if (dev_net(dev) != &init_net)
>> > + if (!net_eq(dev_net(dev), &init_net))
>> > goto drop;
>> >
>> > /*
>> >
>> > So this is just an optimization.
>> >
>> > The test itself was added in
>> >
>> > ommit e730c15519d09ea528b4d2f1103681fa5937c0e6
>> > Author: Eric W. Biederman <ebiederm@xmission.com>
>> > Date: Mon Sep 17 11:53:39 2007 -0700
>> >
>> > [NET]: Make packet reception network namespace safe
>> >
>> > This patch modifies every packet receive function
>> > registered with dev_add_pack() to drop packets if they
>> > are not from the initial network namespace.
>> >
>> > This should ensure that the various network stacks do
>> > not receive packets in a anything but the initial network
>> > namespace until the code has been converted and is ready
>> > for them.
>> >
>> > Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> > Signed-off-by: David S. Miller <davem@davemloft.net>
>> >
>> > So that was over 15 years ago.
>> >
>> > It appears it has not bothered people for over 15 years.
>> >
>> > Lets wait until we get to see the actual fix. We can then decide how
>> > simple/hard it is to back port to stable, if it fulfils the stable
>> > rules or not.
>> >
>> > Andrew
>>
>> Andrew, fair enough. In the time until it's fixed, the kernel folks
>> should publish an advisory and block any attempt to set bridge stp state
>> to other than 0 if in a non-default namespace. The alternative is a
>> packet flood at whatever the top line speed is should there be a loop
>> somewhere in even one connected link.
>
> Like this ? Someone who didn't notice the issue might complain about
> it as regression.
>
> ---8<---
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 75204d36d7f9..a807996ac56b 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -201,6 +201,11 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
> {
> ASSERT_RTNL();
>
> + if (!net_eq(dev_net(br->dev), &init_net)) {
> + NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
> + return -EINVAL;
> + }
> +
> if (br_mrp_enabled(br)) {
> NL_SET_ERR_MSG_MOD(extack,
> "STP can't be enabled if MRP is already enabled");
> ---8<---
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-07-12 3:06 ` Harry Coin
@ 2023-07-13 22:37 ` Stephen Hemminger
0 siblings, 0 replies; 17+ messages in thread
From: Stephen Hemminger @ 2023-07-13 22:37 UTC (permalink / raw)
To: Harry Coin; +Cc: Andrew Lunn, Kuniyuki Iwashima, netdev
On Tue, 11 Jul 2023 22:06:45 -0500
Harry Coin <hcoin@quietfountain.com> wrote:
> Andrew, the only RTSP / STP provider I know of is open-vswitch and their
> docs (last I read them) provide the caution to use veth pairs to
> namespaces, but not run their daemon there--- and fair enough as no
> multicast llc would make it to that code in a namespace as currently
> kernel implemented.
There are at least two other userspace STP providers.
Multiple Spanning Tree Protocol
https://github.com/mstpd/mstpd
And older (mostly for demo) code
https://github.com/shemminger/RSTP
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
2023-08-02 3:45 Hasenbosch, Samuel J
@ 2023-08-02 3:50 ` Kuniyuki Iwashima
0 siblings, 0 replies; 17+ messages in thread
From: Kuniyuki Iwashima @ 2023-08-02 3:50 UTC (permalink / raw)
To: samuel.j.hasenbosch; +Cc: andrew, hcoin, kuniyu, netdev, stephen
From: "Hasenbosch, Samuel J" <Samuel.J.Hasenbosch@boeing.com>
Date: Wed, 2 Aug 2023 03:45:10 +0000
> Forwarding related issue:
>
> https://lore.kernel.org/netdev/cf3001de-4ee2-45f2-83d3-3c878b85d628@free.fr/
It's the same issue and this series fixed it.
https://lore.kernel.org/netdev/20230718174152.57408-1-kuniyu@amazon.com/
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2023-08-02 3:51 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-10 13:35 Patch fixing STP if bridge in non-default namespace Harry Coin
2023-07-11 3:22 ` Kuniyuki Iwashima
2023-07-11 17:08 ` llc needs namespace awareness asap, was " Harry Coin
2023-07-11 18:32 ` Kuniyuki Iwashima
2023-07-11 20:22 ` Harry Coin
2023-07-11 20:44 ` Andrew Lunn
2023-07-11 21:40 ` Harry Coin
2023-07-11 21:51 ` Kuniyuki Iwashima
2023-07-11 22:44 ` Harry Coin
2023-07-11 22:56 ` Kuniyuki Iwashima
[not found] ` <b01e5af6-e397-486d-3428-6fa30a919042@quietfountain.com>
2023-07-12 0:55 ` Andrew Lunn
2023-07-12 3:06 ` Harry Coin
2023-07-13 22:37 ` Stephen Hemminger
2023-07-12 9:44 ` Petr Machata
2023-07-12 9:44 ` [Bridge] " Petr Machata
2023-07-12 0:49 ` Jakub Kicinski
2023-08-02 3:45 Hasenbosch, Samuel J
2023-08-02 3:50 ` Kuniyuki Iwashima
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.