* [PATCH net] net: bridge: clear bridge's private skb space on xmit
@ 2020-07-31 16:26 ` Nikolay Aleksandrov
  0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-31 16:26 UTC (permalink / raw)
  To: netdev; +Cc: bridge, roopa, davem, Nikolay Aleksandrov

We need to clear all of the bridge private skb variables as they can be
stale due to the packet being recirculated through the stack and then
transmitted through the bridge device. Similar memset is already done on
bridge's input. We've seen cases where proxyarp_replied was 1 on routed
multicast packets transmitted through the bridge to ports with neigh
suppress which were getting dropped. Same thing can in theory happen with
the port isolation bit as well.

Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
 net/bridge/br_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8c7b78f8bc23..9a2fb4aa1a10 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	const unsigned char *dest;
 	u16 vid = 0;
 
+	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
+
 	rcu_read_lock();
 	nf_ops = rcu_dereference(nf_br_ops);
 	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
-- 
2.25.4



* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 16:26 ` [Bridge] " Nikolay Aleksandrov
@ 2020-07-31 17:27   ` David Ahern
  -1 siblings, 0 replies; 18+ messages in thread
From: David Ahern @ 2020-07-31 17:27 UTC (permalink / raw)
  To: Nikolay Aleksandrov, netdev; +Cc: bridge, roopa, davem

On 7/31/20 10:26 AM, Nikolay Aleksandrov wrote:
> We need to clear all of the bridge private skb variables as they can be
> stale due to the packet being recirculated through the stack and then
> transmitted through the bridge device. Similar memset is already done on
> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
> multicast packets transmitted through the bridge to ports with neigh
> suppress which were getting dropped. Same thing can in theory happen with
> the port isolation bit as well.
> 
> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
>  net/bridge/br_device.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> index 8c7b78f8bc23..9a2fb4aa1a10 100644
> --- a/net/bridge/br_device.c
> +++ b/net/bridge/br_device.c
> @@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>  	const unsigned char *dest;
>  	u16 vid = 0;
>  
> +	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
> +
>  	rcu_read_lock();
>  	nf_ops = rcu_dereference(nf_br_ops);
>  	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
> 

What's the performance hit of doing this on every packet?

Can you just set a flag that tells the code to reset on recirculation?
Seems like br_input_skb_cb has space for that.


* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 17:27   ` [Bridge] " David Ahern
@ 2020-07-31 17:37     ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-31 17:37 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: bridge, roopa, davem

On 31/07/2020 20:27, David Ahern wrote:
> On 7/31/20 10:26 AM, Nikolay Aleksandrov wrote:
>> We need to clear all of the bridge private skb variables as they can be
>> stale due to the packet being recirculated through the stack and then
>> transmitted through the bridge device. Similar memset is already done on
>> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
>> multicast packets transmitted through the bridge to ports with neigh
>> suppress which were getting dropped. Same thing can in theory happen with
>> the port isolation bit as well.
>>
>> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>> ---
>>  net/bridge/br_device.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>> index 8c7b78f8bc23..9a2fb4aa1a10 100644
>> --- a/net/bridge/br_device.c
>> +++ b/net/bridge/br_device.c
>> @@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>>  	const unsigned char *dest;
>>  	u16 vid = 0;
>>  
>> +	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>> +
>>  	rcu_read_lock();
>>  	nf_ops = rcu_dereference(nf_br_ops);
>>  	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
>>
> 
> What's the performance hit of doing this on every packet?
> 
> Can you just set a flag that tells the code to reset on recirculation?
> Seems like br_input_skb_cb has space for that.
> 

Virtually non-existent, we had a patch that turned that field into a 16 byte
field so that is really 2 8 byte stores. It is already cache hot, we could
initialize each individual field separately as br_input does.

I don't want to waste flags on such thing, this makes it future-proof 
and I'll remove the individual field zeroing later which will alleviate
the cost further.
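
As a rough user-space illustration of the cost argument above (not kernel
code): the sketch below models skb->cb as a 48-byte scratch buffer and uses
a hypothetical 16-byte stand-in for struct br_input_skb_cb. The names
demo_cb, demo_skb and demo_clear_cb, and the field layout, are assumptions
made for the example. On a 64-bit build a compiler can typically lower a
fixed 16-byte memset to two 8-byte stores, which is the cost being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte stand-in for struct br_input_skb_cb.
 * Field names and layout are illustrative only. */
struct demo_cb {
	uint64_t brdev_slot;        /* models the bridge dev pointer */
	uint16_t frag_max_size;
	uint8_t  proxyarp_replied;
	uint8_t  src_port_isolated;
	uint32_t pad;
};

static_assert(sizeof(struct demo_cb) == 16, "demo struct should be 16 bytes");

/* Models only the 48-byte control buffer of an skb. */
struct demo_skb {
	unsigned char cb[48];
};

/* Clear the whole private area in one fixed-size memset, as the
 * first version of the patch does at the top of the xmit path. */
static void demo_clear_cb(struct demo_skb *skb)
{
	memset(skb->cb, 0, sizeof(struct demo_cb));
}
```

Only the first 16 bytes of the buffer are touched; the rest of cb is left
as it was.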




* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 17:37     ` [Bridge] " Nikolay Aleksandrov
@ 2020-07-31 17:38       ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-31 17:38 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: bridge, roopa, davem

On 31/07/2020 20:37, Nikolay Aleksandrov wrote:
> On 31/07/2020 20:27, David Ahern wrote:
>> On 7/31/20 10:26 AM, Nikolay Aleksandrov wrote:
>>> We need to clear all of the bridge private skb variables as they can be
>>> stale due to the packet being recirculated through the stack and then
>>> transmitted through the bridge device. Similar memset is already done on
>>> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
>>> multicast packets transmitted through the bridge to ports with neigh
>>> suppress which were getting dropped. Same thing can in theory happen with
>>> the port isolation bit as well.
>>>
>>> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>> ---
>>>  net/bridge/br_device.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>>> index 8c7b78f8bc23..9a2fb4aa1a10 100644
>>> --- a/net/bridge/br_device.c
>>> +++ b/net/bridge/br_device.c
>>> @@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>>>  	const unsigned char *dest;
>>>  	u16 vid = 0;
>>>  
>>> +	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>>> +
>>>  	rcu_read_lock();
>>>  	nf_ops = rcu_dereference(nf_br_ops);
>>>  	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
>>>
>>
>> What's the performance hit of doing this on every packet?
>>
>> Can you just set a flag that tells the code to reset on recirculation?
>> Seems like br_input_skb_cb has space for that.
>>
> 
> Virtually non-existent, we had a patch that turned that field into a 16 byte
> field so that is really 2 8 byte stores. It is already cache hot, we could

err, s/field/struct/

> initialize each individual field separately as br_input does.
> 
> I don't want to waste flags on such thing, this makes it future-proof 
> and I'll remove the individual field zeroing later which will alleviate
> the cost further.
> 
> 



* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 17:37     ` [Bridge] " Nikolay Aleksandrov
@ 2020-07-31 17:51       ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-31 17:51 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: bridge, roopa, davem

On 31/07/2020 20:37, Nikolay Aleksandrov wrote:
> On 31/07/2020 20:27, David Ahern wrote:
>> On 7/31/20 10:26 AM, Nikolay Aleksandrov wrote:
>>> We need to clear all of the bridge private skb variables as they can be
>>> stale due to the packet being recirculated through the stack and then
>>> transmitted through the bridge device. Similar memset is already done on
>>> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
>>> multicast packets transmitted through the bridge to ports with neigh
>>> suppress which were getting dropped. Same thing can in theory happen with
>>> the port isolation bit as well.
>>>
>>> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>> ---
>>>  net/bridge/br_device.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>>> index 8c7b78f8bc23..9a2fb4aa1a10 100644
>>> --- a/net/bridge/br_device.c
>>> +++ b/net/bridge/br_device.c
>>> @@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>>>  	const unsigned char *dest;
>>>  	u16 vid = 0;
>>>  
>>> +	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>>> +
>>>  	rcu_read_lock();
>>>  	nf_ops = rcu_dereference(nf_br_ops);
>>>  	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
>>>
>>
>> What's the performance hit of doing this on every packet?
>>
>> Can you just set a flag that tells the code to reset on recirculation?
>> Seems like br_input_skb_cb has space for that.
>>
> 
> Virtually non-existent, we had a patch that turned that field into a 16 byte
> field so that is really 2 8 byte stores. It is already cache hot, we could
> initialize each individual field separately as br_input does.
> 
> I don't want to waste flags on such thing, this makes it future-proof 
> and I'll remove the individual field zeroing later which will alleviate
> the cost further.
> 

Also note that we already do this on input for each packet since the
struct was reduced to 16 bytes. It's the safest way since every different
sub-part of the bridge uses some set of these private variables and
we've had many similar bugs where they were used stale or unintentionally
were not initialized for some path.




* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 17:51       ` [Bridge] " Nikolay Aleksandrov
@ 2020-07-31 18:10         ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-31 18:10 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: bridge, roopa, davem

On 31/07/2020 20:51, Nikolay Aleksandrov wrote:
> On 31/07/2020 20:37, Nikolay Aleksandrov wrote:
>> On 31/07/2020 20:27, David Ahern wrote:
>>> On 7/31/20 10:26 AM, Nikolay Aleksandrov wrote:
>>>> We need to clear all of the bridge private skb variables as they can be
>>>> stale due to the packet being recirculated through the stack and then
>>>> transmitted through the bridge device. Similar memset is already done on
>>>> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
>>>> multicast packets transmitted through the bridge to ports with neigh
>>>> suppress which were getting dropped. Same thing can in theory happen with
>>>> the port isolation bit as well.
>>>>
>>>> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>> ---
>>>>  net/bridge/br_device.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>>>> index 8c7b78f8bc23..9a2fb4aa1a10 100644
>>>> --- a/net/bridge/br_device.c
>>>> +++ b/net/bridge/br_device.c
>>>> @@ -36,6 +36,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>  	const unsigned char *dest;
>>>>  	u16 vid = 0;
>>>>  
>>>> +	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>>>> +
>>>>  	rcu_read_lock();
>>>>  	nf_ops = rcu_dereference(nf_br_ops);
>>>>  	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
>>>>
>>>
>>> What's the performance hit of doing this on every packet?
>>>
>>> Can you just set a flag that tells the code to reset on recirculation?
>>> Seems like br_input_skb_cb has space for that.
>>>
>>
>> Virtually non-existent, we had a patch that turned that field into a 16 byte
>> field so that is really 2 8 byte stores. It is already cache hot, we could
>> initialize each individual field separately as br_input does.
>>
>> I don't want to waste flags on such thing, this makes it future-proof 
>> and I'll remove the individual field zeroing later which will alleviate
>> the cost further.
>>
> 
> Also note that we already do this on input for each packet since the
> struct was reduced to 16 bytes. It's the safest way since every different
> sub-part of the bridge uses some set of these private variables and
> we've had many similar bugs where they were used stale or unintentionally
> were not initialized for some path.
> 

In addition, this doesn't need to be a recirculation. In theory it could happen
with a packet routed to an SVI on the bridge that had its skb->cb initialized
before hitting the bridge's xmit function, so a flag can't catch all possible cases.
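
A minimal user-space sketch of the failure mode described above, under
stated assumptions: the control buffer is shared scratch space, so whatever
an earlier layer wrote there is reinterpreted by the next owner. The structs
demo_l3_cb and demo_br_cb and both helper functions are hypothetical; they
are not the kernel's layouts or APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models only the 48-byte control buffer of an skb. */
struct demo_skb {
	unsigned char cb[48];
};

/* Hypothetical private data of an earlier (e.g. routing) layer. */
struct demo_l3_cb {
	uint64_t cookie;
	uint64_t flags;
};

/* Hypothetical bridge view of the same bytes. */
struct demo_br_cb {
	uint64_t brdev_slot;
	uint16_t frag_max_size;
	uint8_t  proxyarp_replied;
	uint8_t  src_port_isolated;
	uint32_t pad;
};

/* The earlier layer fills cb with its own data. */
static void demo_l3_fill(struct demo_skb *skb)
{
	struct demo_l3_cb l3 = { .cookie = 0x1234, .flags = ~0ULL };

	memcpy(skb->cb, &l3, sizeof(l3));
}

/* Returns 1 if the bridge xmit path would see proxyarp_replied set. */
static int demo_br_xmit_sees_proxyarp(struct demo_skb *skb, int clear_first)
{
	struct demo_br_cb br;

	if (clear_first)
		memset(skb->cb, 0, sizeof(br));
	memcpy(&br, skb->cb, sizeof(br));
	return br.proxyarp_replied != 0;
}
```

Without the clear, the stale bytes left by the earlier layer read back as a
set proxyarp_replied bit. With the clear, the bridge starts from zeroed
state regardless of who wrote to cb before.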




* [PATCH net v2] net: bridge: clear skb private space on bridge dev xmit
  2020-07-31 18:10         ` [Bridge] " Nikolay Aleksandrov
@ 2020-08-02 12:50           ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-08-02 12:50 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, bridge, dsahern, Nikolay Aleksandrov

We need to clear all of the bridge private skb variables as they can be
stale due to the packet having skb->cb initialized earlier and then
transmitted through the bridge device. Similar memset is already done on
bridge's input. We've seen cases where proxyarp_replied was 1 on routed
multicast packets transmitted through the bridge to ports with neigh
suppress and were getting dropped. Same thing can in theory happen with the
port isolation bit as well. We clear only the struct part after the bridge
pointer (currently 8 bytes) since the pointer is always set later.
We can now remove the redundant zeroing of frag_max_size.
Also add a BUILD_BUG_ON to make sure we catch any movement of the bridge
dev pointer.

Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
v2: clear only the second half of the struct, which contains the fields
used in various bridge parts. This replaced the rep stos instruction with
a single movq on my x86 and in general reduces the cleared area to 8 bytes.
In addition, remove the now-redundant zeroing of frag_max_size, as it will
already be cleared, and add a BUILD_BUG_ON to make sure we catch any
movement of the bridge dev pointer.

 net/bridge/br_device.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8c7b78f8bc23..4f7880c99d3c 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -36,6 +36,12 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	const unsigned char *dest;
 	u16 vid = 0;
 
+	/* clear all private fields after the bridge dev pointer */
+	BUILD_BUG_ON(offsetof(struct br_input_skb_cb, brdev) > 0);
+	memset(skb->cb + sizeof(struct net_device *),
+	       0,
+	       sizeof(struct br_input_skb_cb) - sizeof(struct net_device *));
+
 	rcu_read_lock();
 	nf_ops = rcu_dereference(nf_br_ops);
 	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
@@ -50,7 +56,6 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	br_switchdev_frame_unmark(skb);
 	BR_INPUT_SKB_CB(skb)->brdev = dev;
-	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
 
 	skb_reset_mac_header(skb);
 	skb_pull(skb, ETH_HLEN);
-- 
2.25.4
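
The v2 approach can be sketched in user space as follows. This is an
illustration only: demo_br_cb, demo_skb and demo_clear_after_brdev are
hypothetical names, the field layout is an assumption, and C11
static_assert stands in for the kernel's BUILD_BUG_ON. The idea is to leave
the leading pointer alone (it is always set later) and zero everything after
it, with a compile-time check that the pointer stays the first member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct br_input_skb_cb. */
struct demo_br_cb {
	void    *brdev;             /* always assigned later in xmit */
	uint16_t frag_max_size;
	uint8_t  proxyarp_replied;
	uint8_t  src_port_isolated;
};

/* Models only the 48-byte control buffer of an skb. */
struct demo_skb {
	unsigned char cb[48];
};

/* Compile-time guard: the pointer-skipping memset below is only
 * correct while brdev is the first member of the struct. */
static_assert(offsetof(struct demo_br_cb, brdev) == 0,
	      "brdev must remain the first member");

/* Zero every private field after the leading pointer. */
static void demo_clear_after_brdev(struct demo_skb *skb)
{
	memset(skb->cb + sizeof(void *), 0,
	       sizeof(struct demo_br_cb) - sizeof(void *));
}
```

Because frag_max_size sits in the cleared tail, a separate assignment of 0
to it afterwards would be redundant, which matches the line the patch
removes.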



* [Bridge] [PATCH net v2] net: bridge: clear skb private space on bridge dev xmit
@ 2020-08-02 12:50           ` Nikolay Aleksandrov
  0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2020-08-02 12:50 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov, roopa, bridge, davem, dsahern

We need to clear all of the bridge private skb variables as they can be
stale due to the packet having skb->cb initialized earlier and then
transmitted through the bridge device. Similar memset is already done on
bridge's input. We've seen cases where proxyarp_replied was 1 on routed
multicast packets transmitted through the bridge to ports with neigh
suppress and were getting dropped. Same thing can in theory happen with the
port isolation bit as well. We clear only the struct part after the bridge
pointer (currently 8 bytes) since the pointer is always set later.
We can now remove the redundant zeroing of frag_max_size.
Also add a BUILD_BUG_ON to make sure we catch any movement of the bridge
dev pointer.

Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
v2: clear only the second half of the struct which contains the
fields that are used in various bridge parts, this replaced the rep stos
instruction with a single movq on my x86 and in general reduces
the clear area to 8 bytes, and in addition we can remove the now
redundant zeroing of frag_max_size as it will be already cleared,
add a build_bug_on to make sure we catch any movement of the bridge
dev pointer

 net/bridge/br_device.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8c7b78f8bc23..4f7880c99d3c 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -36,6 +36,12 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	const unsigned char *dest;
 	u16 vid = 0;
 
+	/* clear all private fields after the bridge dev pointer */
+	BUILD_BUG_ON(offsetof(struct br_input_skb_cb, brdev) > 0);
+	memset(skb->cb + sizeof(struct net_device *),
+	       0,
+	       sizeof(struct br_input_skb_cb) - sizeof(struct net_device *));
+
 	rcu_read_lock();
 	nf_ops = rcu_dereference(nf_br_ops);
 	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
@@ -50,7 +56,6 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	br_switchdev_frame_unmark(skb);
 	BR_INPUT_SKB_CB(skb)->brdev = dev;
-	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
 
 	skb_reset_mac_header(skb);
 	skb_pull(skb, ETH_HLEN);
-- 
2.25.4
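
The partial clear described in the v2 patch above can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
kernel code: struct demo_cb, struct demo_dev, DEMO_CB_SIZE and
demo_cb_clear_after_dev() are hypothetical stand-ins for
struct br_input_skb_cb, struct net_device, the skb->cb scratch area and the
memset in br_dev_xmit(). C11 _Static_assert plays the role that BUILD_BUG_ON
plays in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; never dereferenced here. */
struct demo_dev;

/* Hypothetical stand-in for struct br_input_skb_cb (fields illustrative). */
struct demo_cb {
	struct demo_dev *dev;        /* always set later, so not cleared */
	uint16_t frag_max_size;
	uint8_t proxyarp_replied;
	uint8_t src_port_isolated;
};

/* Assumed size of the scratch area that stands in for skb->cb. */
#define DEMO_CB_SIZE 48

/* Fail the build if the pointer ever moves away from offset 0. */
_Static_assert(offsetof(struct demo_cb, dev) == 0,
	       "dev pointer must be the first member");
_Static_assert(sizeof(struct demo_cb) <= DEMO_CB_SIZE,
	       "private struct must fit in the scratch area");

/* Zero every private byte after the leading pointer, leave the rest alone. */
static void demo_cb_clear_after_dev(unsigned char *cb)
{
	assert(cb != NULL);
	memset(cb + sizeof(struct demo_dev *), 0,
	       sizeof(struct demo_cb) - sizeof(struct demo_dev *));
}
```

The offset check is what makes skipping the first sizeof(pointer) bytes safe:
if a field were ever added in front of the pointer, the build would fail
instead of silently leaving that field stale.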



* Re: [PATCH net] net: bridge: clear bridge's private skb space on xmit
  2020-07-31 16:26 ` [Bridge] " Nikolay Aleksandrov
@ 2020-08-03 22:27   ` David Miller
  -1 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2020-08-03 22:27 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, bridge, roopa

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Fri, 31 Jul 2020 19:26:16 +0300

> We need to clear all of the bridge private skb variables as they can be
> stale due to the packet being recirculated through the stack and then
> transmitted through the bridge device. Similar memset is already done on
> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
> multicast packets transmitted through the bridge to ports with neigh
> suppress which were getting dropped. Same thing can in theory happen with
> the port isolation bit as well.
> 
> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

Applied and queued up for -stable, thanks.


* Re: [PATCH net v2] net: bridge: clear skb private space on bridge dev xmit
  2020-08-02 12:50           ` [Bridge] " Nikolay Aleksandrov
@ 2020-08-03 22:58             ` David Miller
  -1 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2020-08-03 22:58 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, roopa, bridge, dsahern

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Sun,  2 Aug 2020 15:50:39 +0300

> We need to clear all of the bridge private skb variables as they can be
> stale due to the packet having skb->cb initialized earlier and then
> transmitted through the bridge device. Similar memset is already done on
> bridge's input. We've seen cases where proxyarp_replied was 1 on routed
> multicast packets transmitted through the bridge to ports with neigh
> suppress and were getting dropped. Same thing can in theory happen with the
> port isolation bit as well. We clear only the struct part after the bridge
> pointer (currently 8 bytes) since the pointer is always set later.
> We can now remove the redundant zeroing of frag_max_size.
> Also add a BUILD_BUG_ON to make sure we catch any movement of the bridge
> dev pointer.
> 
> Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

Nikolay, I applied v1 already as I'm not at all against the full clear.


Thread overview: 18+ messages
2020-07-31 16:26 [PATCH net] net: bridge: clear bridge's private skb space on xmit Nikolay Aleksandrov
2020-07-31 16:26 ` [Bridge] " Nikolay Aleksandrov
2020-07-31 17:27 ` David Ahern
2020-07-31 17:27   ` [Bridge] " David Ahern
2020-07-31 17:37   ` Nikolay Aleksandrov
2020-07-31 17:37     ` [Bridge] " Nikolay Aleksandrov
2020-07-31 17:38     ` Nikolay Aleksandrov
2020-07-31 17:38       ` [Bridge] " Nikolay Aleksandrov
2020-07-31 17:51     ` Nikolay Aleksandrov
2020-07-31 17:51       ` [Bridge] " Nikolay Aleksandrov
2020-07-31 18:10       ` Nikolay Aleksandrov
2020-07-31 18:10         ` [Bridge] " Nikolay Aleksandrov
2020-08-02 12:50         ` [PATCH net v2] net: bridge: clear skb private space on bridge dev xmit Nikolay Aleksandrov
2020-08-02 12:50           ` [Bridge] " Nikolay Aleksandrov
2020-08-03 22:58           ` David Miller
2020-08-03 22:58             ` [Bridge] " David Miller
2020-08-03 22:27 ` [PATCH net] net: bridge: clear bridge's private skb space on xmit David Miller
2020-08-03 22:27   ` [Bridge] " David Miller
