linux-kernel.vger.kernel.org archive mirror
* [BUG] moving fq back to clock monotonic breaks my setup
@ 2019-01-10  0:48 Ian Kumlien
  2019-01-10  5:53 ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Kumlien @ 2019-01-10  0:48 UTC (permalink / raw)
  To: edumazet; +Cc: Linux Kernel Network Developers, linux-kernel

Hi,

I've just been through 5+ hours of bisecting and eventually found
the culprit =)

commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 28 10:28:44 2018 -0700

    tcp/fq: move back to CLOCK_MONOTONIC

[--8<--]

This might be because my setup is "odd".

Basically, I have a firewall with four NICs. Two of them handle my normal
internet connection (firewall/MASQ/NAT), and the other two are assigned
to one bridge each.

The firewall is also my local caching DNS server and DHCP server,
which is also used by the VMs...
But with 4.20, DHCP replies disappeared before entering the bridge; I
couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)

I'm currently running a kernel with that patch reverted, but I'm also
wondering about possible ways forward, since I'm reverting a fix from
someone else...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [BUG] moving fq back to clock monotonic breaks my setup
  2019-01-10  0:48 [BUG] moving fq back to clock monotonic breaks my setup Ian Kumlien
@ 2019-01-10  5:53 ` Eric Dumazet
  2019-01-10  8:25   ` Ian Kumlien
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2019-01-10  5:53 UTC (permalink / raw)
  To: ian.kumlien, netdev; +Cc: LKML

On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> Hi,
>
> I've just been through 5+ hours of bisecting and eventually found
> the culprit =)
>
> commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri Sep 28 10:28:44 2018 -0700
>
>     tcp/fq: move back to CLOCK_MONOTONIC
>
> [--8<--]
>
> This might be because my setup is "odd".
>
> Basically, I have a firewall with four NICs. Two of them handle my normal
> internet connection (firewall/MASQ/NAT), and the other two are assigned
> to one bridge each.
>
> The firewall is also my local caching DNS server and DHCP server,
> which is also used by the VMs...
> But with 4.20, DHCP replies disappeared before entering the bridge; I
> couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
>
> I'm currently running a kernel with that patch reverted, but I'm also
> wondering about possible ways forward, since I'm reverting a fix from
> someone else...

I suggest you use the netdev@ mailing list instead of lkml.

Then, we probably need to clear skb->tstamp in more paths (you
mention a bridge ...)

See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.

Can you try:

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
                skb_set_network_header(skb, depth);
        }

+       skb->tstamp = 0;
        dev_queue_xmit(skb);

        return 0;

Thanks.
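
To illustrate why a stale skb->tstamp could make forwarded packets vanish, here is a minimal userspace sketch. It is not the sch_fq code. It assumes that the receive path stamped the packet with wall-clock time and that the scheduler reads the same field as an earliest departure time on the monotonic clock. struct pkt, hold_time_ns() and clear_tstamp() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single sk_buff field that matters here. */
struct pkt {
    int64_t tstamp_ns; /* 0 = no departure time requested */
};

/*
 * Simplified earliest-departure-time rule (an assumption, not the real
 * sch_fq logic): a packet may leave once the scheduler's monotonic clock
 * reaches tstamp_ns. Returns how long the packet would be held, in ns.
 */
static int64_t hold_time_ns(const struct pkt *p, int64_t now_mono_ns)
{
    assert(p != NULL);
    if (p->tstamp_ns == 0 || p->tstamp_ns <= now_mono_ns)
        return 0;
    return p->tstamp_ns - now_mono_ns;
}

/* Models the "skb->tstamp = 0" line from the diff above. */
static void clear_tstamp(struct pkt *p)
{
    assert(p != NULL);
    p->tstamp_ns = 0;
}
```

With a wall-clock stamp from January 2019 (about 1547081280 seconds since the epoch) and a monotonic "now" of one hour of uptime, hold_time_ns() returns a delay of decades, which would look like a dropped packet. After clear_tstamp() the delay is 0.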

* Re: [BUG] moving fq back to clock monotonic breaks my setup
  2019-01-10  5:53 ` Eric Dumazet
@ 2019-01-10  8:25   ` Ian Kumlien
  2019-01-10  8:55     ` Paolo Abeni
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Kumlien @ 2019-01-10  8:25 UTC (permalink / raw)
  To: Eric Dumazet, pabeni; +Cc: netdev, LKML

On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet <edumazet@google.com> wrote:
> On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> >
> > Hi,
> >
> > I've just been through 5+ hours of bisecting and eventually found
> > the culprit =)
> >
> > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Fri Sep 28 10:28:44 2018 -0700
> >
> >     tcp/fq: move back to CLOCK_MONOTONIC
> >
> > [--8<--]
> >
> > This might be because my setup is "odd".
> >
> > Basically, I have a firewall with four NICs. Two of them handle my normal
> > internet connection (firewall/MASQ/NAT), and the other two are assigned
> > to one bridge each.
> >
> > The firewall is also my local caching DNS server and DHCP server,
> > which is also used by the VMs...
> > But with 4.20, DHCP replies disappeared before entering the bridge; I
> > couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
> >
> > I'm currently running a kernel with that patch reverted, but I'm also
> > wondering about possible ways forward, since I'm reverting a fix from
> > someone else...
>
> I suggest you use the netdev@ mailing list instead of lkml.
>
> Then, we probably need to clear skb->tstamp in more paths (you
> mention a bridge ...)
>
> See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
>
> Can you try:
>
> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> --- a/net/bridge/br_forward.c
> +++ b/net/bridge/br_forward.c
> @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
>                 skb_set_network_header(skb, depth);
>         }
>
> +       skb->tstamp = 0;
>         dev_queue_xmit(skb);
>
>         return 0;

This works, and so does: https://marc.info/?l=linux-netdev&m=154696956604748&w=2

Pointed out by Paolo (tested both separately)

* Re: [BUG] moving fq back to clock monotonic breaks my setup
  2019-01-10  8:25   ` Ian Kumlien
@ 2019-01-10  8:55     ` Paolo Abeni
  2019-01-11  9:34       ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Abeni @ 2019-01-10  8:55 UTC (permalink / raw)
  To: Ian Kumlien, Eric Dumazet; +Cc: netdev, LKML

On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
> On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet <edumazet@google.com> wrote:
> > On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > > Hi,
> > > 
> > > I've just been through 5+ hours of bisecting and eventually found
> > > the culprit =)
> > >
> > > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Fri Sep 28 10:28:44 2018 -0700
> > >
> > >     tcp/fq: move back to CLOCK_MONOTONIC
> > >
> > > [--8<--]
> > >
> > > This might be because my setup is "odd".
> > >
> > > Basically, I have a firewall with four NICs. Two of them handle my normal
> > > internet connection (firewall/MASQ/NAT), and the other two are assigned
> > > to one bridge each.
> > >
> > > The firewall is also my local caching DNS server and DHCP server,
> > > which is also used by the VMs...
> > > But with 4.20, DHCP replies disappeared before entering the bridge; I
> > > couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
> > >
> > > I'm currently running a kernel with that patch reverted, but I'm also
> > > wondering about possible ways forward, since I'm reverting a fix from
> > > someone else...
> > 
> > I suggest you use the netdev@ mailing list instead of lkml.
> >
> > Then, we probably need to clear skb->tstamp in more paths (you
> > mention a bridge ...)
> >
> > See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
> >
> > Can you try:
> >
> > diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> > index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> > --- a/net/bridge/br_forward.c
> > +++ b/net/bridge/br_forward.c
> > @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
> >                 skb_set_network_header(skb, depth);
> >         }
> > 
> > +       skb->tstamp = 0;
> >         dev_queue_xmit(skb);
> > 
> >         return 0;
> 
> This works, and so does: https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> 
> Pointed out by Paolo (tested both separately)

Note: I cleared the tstamp in br_forward_finish() instead of
br_dev_queue_push_xmit() because I think the latter could also be
called in the local xmit path, via br_nf_post_routing.

We must preserve the tstamp in the output path, right?

Thanks,

Paolo
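
The placement question above can be sketched with a small userspace model. It assumes the simplified call graph described in this message: forwarded packets go through a forward-only step and then the shared transmit step, while locally generated packets reach only the shared step. All names below are hypothetical stand-ins, not the bridge code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {
    int64_t tstamp_ns; /* 0 = no departure time requested */
};

/* Where the "tstamp = 0" line is placed in this model. */
enum clear_site { CLEAR_IN_PUSH_XMIT, CLEAR_IN_FORWARD_FINISH };

/* Shared last step (models br_dev_queue_push_xmit); returns what the qdisc sees. */
static int64_t model_push_xmit(struct pkt *p, enum clear_site site)
{
    assert(p != NULL);
    if (site == CLEAR_IN_PUSH_XMIT)
        p->tstamp_ns = 0;
    return p->tstamp_ns;
}

/* Forward-only step (models br_forward_finish), then the shared step. */
static int64_t model_forward_finish(struct pkt *p, enum clear_site site)
{
    assert(p != NULL);
    if (site == CLEAR_IN_FORWARD_FINISH)
        p->tstamp_ns = 0;
    return model_push_xmit(p, site);
}

/* Local output path: assumed to reach only the shared step. */
static int64_t model_local_xmit(struct pkt *p, enum clear_site site)
{
    return model_push_xmit(p, site);
}
```

In this model both placements clear the stale timestamp on forwarded packets, but only CLEAR_IN_FORWARD_FINISH leaves a locally requested departure time intact.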





* Re: [BUG] moving fq back to clock monotonic breaks my setup
  2019-01-10  8:55     ` Paolo Abeni
@ 2019-01-11  9:34       ` Eric Dumazet
  2019-01-11  9:51         ` Ian Kumlien
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2019-01-11  9:34 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: ian.kumlien, netdev, LKML

On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:


> > This works, and so does: https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> >
> > Pointed out by Paolo (tested both separately)
>
> Note: I cleared the tstamp in br_forward_finish() instead of
> br_dev_queue_push_xmit() because I think the latter could also be
> called in the local xmit path, via br_nf_post_routing.
>
> We must preserve the tstamp in the output path, right?
>

 I was not aware of your patch, SGTM, thanks.

* Re: [BUG] moving fq back to clock monotonic breaks my setup
  2019-01-11  9:34       ` Eric Dumazet
@ 2019-01-11  9:51         ` Ian Kumlien
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Kumlien @ 2019-01-11  9:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paolo Abeni, netdev, LKML

On Fri, Jan 11, 2019 at 10:35 AM Eric Dumazet <edumazet@google.com> wrote:
> On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
>
>
> > > This works, and so does: https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> > >
> > > Pointed out by Paolo (tested both separately)
> >
> > Note: I cleared the tstamp in br_forward_finish() instead of
> > br_dev_queue_push_xmit() because I think the latter could also be
> > called in the local xmit path, via br_nf_post_routing.
> >
> > We must preserve the tstamp in the output path, right?
> >
>
>  I was not aware of your patch, SGTM, thanks.

And you can add Tested-by: ian.kumlien@gmail.com
