linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* bridge:fragmented packets dropped by bridge
@ 2022-10-07 11:21 Vyacheslav Salnikov
  2022-10-07 13:40 ` Slade Watkins
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Vyacheslav Salnikov @ 2022-10-07 11:21 UTC (permalink / raw)
  To: netfilter-devel, linux-kernel, netdev

Hi.

I switched from kernel versions 4.9 to 5.15 and found that the MTU on
the interfaces in the bridge does not change.
For example:
I have the following bridge:
bridge      interface
br0          sw1
               sw2
               sw3

And I change with ifconfig MTU.
I see that br0 sw1..sw3 has changed MTU from 1500 -> 1982.

But if i send a ping through these interfaces, I get 1500(I added
prints for output)
I investigated the code and found the reason:
The following commit came in the new kernel:
https://github.com/torvalds/linux/commit/ac6627a28dbfb5d96736544a00c3938fa7ea6dfb

And the behavior of the MTU setting has changed:
>
> Kernel 4.9:
> if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
>    ip_mtu_locked(dst) ||
>    !forwarding)  <--- True
> return dst_mtu(dst) <--- 1982
>
>
> / 'forwarding = true' case should always honour route mtu /
> mtu = dst_metric_raw(dst, RTAX_MTU);
> if (mtu)
> return mtu;



Kernel 5.15:
>
> if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
>    ip_mtu_locked(dst) ||
>    !forwarding) { <--- True
> mtu = rt->rt_pmtu;  <--- 0
> if (mtu && time_before(jiffies, rt->dst.expires)) <-- False
> goto out;
> }
>
> / 'forwarding = true' case should always honour route mtu /
> mtu = dst_metric_raw(dst, RTAX_MTU); <---- 1500
> if (mtu) <--- True
> goto out;

As I see from the code in the end takes mtu from br_dst_default_metrics
> static const u32 br_dst_default_metrics[RTAX_MAX] = {
> [RTAX_MTU - 1] = 1500,
> };

Why is rt_pmtu now used instead of dst_mtu?
Why is forwarding = False called with dst_metric_raw?
Maybe we should add processing when mtu = rt->rt_pmtu == 0?
Could this be an error?


I found a thread discussing a similar problem. It suggested porting
the next patch:
Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
---
 include/net/ip.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 29d89de..0512de3 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -450,6 +450,8 @@ static inline unsigned int
ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
 static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
    const struct sk_buff *skb)
 {
+ if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
+ return min(skb->dev->mtu, IP_MAX_MTU);
  if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
  bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;


Why was this patch not accepted in the end?
-- 
Best regards,
Slava.

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: bridge:fragmented packets dropped by bridge
  2022-10-07 11:21 bridge:fragmented packets dropped by bridge Vyacheslav Salnikov
@ 2022-10-07 13:40 ` Slade Watkins
  2022-10-13  0:43 ` Vadim Fedorenko
  2022-10-15  1:35 ` Vadim Fedorenko
  2 siblings, 0 replies; 4+ messages in thread
From: Slade Watkins @ 2022-10-07 13:40 UTC (permalink / raw)
  To: Vyacheslav Salnikov, netfilter-devel, linux-kernel, netdev

Hey there,

On 10/7/22 at 7:21 AM, Vyacheslav Salnikov wrote:
> Why was this patch not accepted in the end?

Huh...

I had to do just a little bit of digging to find the original thread,
but it really doesn't seem to me like there was a consensus on whether
or not to take the patch:
https://lore.kernel.org/lkml/20190730122534.30687-1-rdong.ge@gmail.com/T/#u

Reason I say that is that the thread is rather old, and died off quickly.

Someone involved with that patch may be able to offer the answers you're
looking for, if they haven't forgotten about it after all this time.

-srw

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bridge:fragmented packets dropped by bridge
  2022-10-07 11:21 bridge:fragmented packets dropped by bridge Vyacheslav Salnikov
  2022-10-07 13:40 ` Slade Watkins
@ 2022-10-13  0:43 ` Vadim Fedorenko
  2022-10-15  1:35 ` Vadim Fedorenko
  2 siblings, 0 replies; 4+ messages in thread
From: Vadim Fedorenko @ 2022-10-13  0:43 UTC (permalink / raw)
  To: Vyacheslav Salnikov, netfilter-devel, linux-kernel, netdev

On 07.10.2022 12:21, Vyacheslav Salnikov wrote:
> Hi.
> 
> I switched from kernel versions 4.9 to 5.15 and found that the MTU on
> the interfaces in the bridge does not change.
> For example:
> I have the following bridge:
> bridge      interface
> br0          sw1
>                 sw2
>                 sw3
> 
> And I change with ifconfig MTU.
> I see that br0 sw1..sw3 has changed MTU from 1500 -> 1982.
> 
> But if i send a ping through these interfaces, I get 1500(I added
> prints for output)
> I investigated the code and found the reason:
> The following commit came in the new kernel:
> https://github.com/torvalds/linux/commit/ac6627a28dbfb5d96736544a00c3938fa7ea6dfb
> 
> And the behavior of the MTU setting has changed:
>>
>> Kernel 4.9:
>> if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
>>     ip_mtu_locked(dst) ||
>>     !forwarding)  <--- True
>> return dst_mtu(dst) <--- 1982
>>
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU);
>> if (mtu)
>> return mtu;
> 
> 
> 
> Kernel 5.15:
>>
>> if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
>>     ip_mtu_locked(dst) ||
>>     !forwarding) { <--- True
>> mtu = rt->rt_pmtu;  <--- 0
>> if (mtu && time_before(jiffies, rt->dst.expires)) <-- False
>> goto out;
>> }
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU); <---- 1500
>> if (mtu) <--- True
>> goto out;
> 
> As I see from the code in the end takes mtu from br_dst_default_metrics
>> static const u32 br_dst_default_metrics[RTAX_MAX] = {
>> [RTAX_MTU - 1] = 1500,
>> };
> 
> Why is rt_pmtu now used instead of dst_mtu?
> Why is forwarding = False called with dst_metric_raw?
> Maybe we should add processing when mtu = rt->rt_pmtu == 0?
> Could this be an error?
> 
If you compare ipv4_mtu code from 4.9 you will see that the very first mtu value 
is filled by rt->rt_pmtu value. I believe there were changes to the bridge code 
where rt_pmtu value got empty or cleared.

I'm still looking for the root cause of the problem, will update you once I find it.


> 
> I found a thread discussing a similar problem. It suggested porting
> the next patch:
> Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
> ---
>   include/net/ip.h | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 29d89de..0512de3 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -450,6 +450,8 @@ static inline unsigned int
> ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
>   static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
>      const struct sk_buff *skb)
>   {
> + if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
> + return min(skb->dev->mtu, IP_MAX_MTU);
>    if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
>    bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
> 
> 
> Why was this patch not accepted in the end?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bridge:fragmented packets dropped by bridge
  2022-10-07 11:21 bridge:fragmented packets dropped by bridge Vyacheslav Salnikov
  2022-10-07 13:40 ` Slade Watkins
  2022-10-13  0:43 ` Vadim Fedorenko
@ 2022-10-15  1:35 ` Vadim Fedorenko
  2 siblings, 0 replies; 4+ messages in thread
From: Vadim Fedorenko @ 2022-10-15  1:35 UTC (permalink / raw)
  To: Vyacheslav Salnikov, netfilter-devel, linux-kernel, netdev

On 07.10.2022 12:21, Vyacheslav Salnikov wrote:
> Hi.
> 
> I switched from kernel versions 4.9 to 5.15 and found that the MTU on
> the interfaces in the bridge does not change.
> For example:
> I have the following bridge:
> bridge      interface
> br0          sw1
>                 sw2
>                 sw3
> 
> And I change with ifconfig MTU.
> I see that br0 sw1..sw3 has changed MTU from 1500 -> 1982.
> 
> But if i send a ping through these interfaces, I get 1500(I added
> prints for output)
> I investigated the code and found the reason:
> The following commit came in the new kernel:
> https://github.com/torvalds/linux/commit/ac6627a28dbfb5d96736544a00c3938fa7ea6dfb
> 
> And the behavior of the MTU setting has changed:
>>
>> Kernel 4.9:
>> if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
>>     ip_mtu_locked(dst) ||
>>     !forwarding)  <--- True
>> return dst_mtu(dst) <--- 1982
>>
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU);
>> if (mtu)
>> return mtu;
> 
> 
> 
> Kernel 5.15:
>>
>> if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
>>     ip_mtu_locked(dst) ||
>>     !forwarding) { <--- True
>> mtu = rt->rt_pmtu;  <--- 0
>> if (mtu && time_before(jiffies, rt->dst.expires)) <-- False
>> goto out;
>> }
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU); <---- 1500
>> if (mtu) <--- True
>> goto out;
> 
> As I see from the code in the end takes mtu from br_dst_default_metrics
>> static const u32 br_dst_default_metrics[RTAX_MAX] = {
>> [RTAX_MTU - 1] = 1500,
>> };
> 
> Why is rt_pmtu now used instead of dst_mtu?
> Why is forwarding = False called with dst_metric_raw?
> Maybe we should add processing when mtu = rt->rt_pmtu == 0?
> Could this be an error?
> 

Can you share kernel configs for both versions? Actually only one config value
is needed - CONFIG_BRIDGE_NETFILTER.

It will help me investigate the issue.

> 
> I found a thread discussing a similar problem. It suggested porting
> the next patch:
> Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
> ---
>   include/net/ip.h | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 29d89de..0512de3 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -450,6 +450,8 @@ static inline unsigned int
> ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
>   static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
>      const struct sk_buff *skb)
>   {
> + if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
> + return min(skb->dev->mtu, IP_MAX_MTU);
>    if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
>    bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
> 
> 
> Why was this patch not accepted in the end?


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-10-15  1:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-07 11:21 bridge:fragmented packets dropped by bridge Vyacheslav Salnikov
2022-10-07 13:40 ` Slade Watkins
2022-10-13  0:43 ` Vadim Fedorenko
2022-10-15  1:35 ` Vadim Fedorenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).