* Default qdisc not correctly initialized with custom MTU
From: Holger Hoffstätte @ 2019-09-08 14:13 UTC
  To: Netdev


I just installed a better NIC (Aquantia 2.5/5/10Gb, apparently with
multiple queues) and now get the "mq" pseudo-qdisc automatically installed -
so far, so good. I also configure fq_codel as default qdisc via sysctls
and a larger MTU of 9000 for the device. This somehow leads to some
slight confusion about initialization order between the qdiscs and the
device.

Right after booting, where sysctl runs before eth0 setup:

$ tc qd show
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 0: dev eth0 root
qdisc fq_codel 0: dev eth0 parent :8 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :7 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :6 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :5 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :4 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :3 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1514 ...
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 ...

Note that fq_codel still uses a quantum of 1514, i.e. the default MTU of 1500
plus the 14-byte Ethernet header; it just used the default setting as there
was no link yet.

However, the MTU is set to 9000:

$ ip link show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000

It seems the default qdisc is created before the link is set up, and then
just attached to mq without consideration for the actual link configuration.
Simply kicking the whole thing to replace mq with itself again does the trick:

$ tc qd replace root dev eth0 mq
$ tc qd show
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 8001: dev eth0 root
qdisc fq_codel 0: dev eth0 parent 8001:8 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:7 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:6 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:5 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:4 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:3 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:2 limit 10240p flows 1024 quantum 9014 ...
qdisc fq_codel 0: dev eth0 parent 8001:1 limit 10240p flows 1024 quantum 9014 ...

Now the quanta are in line with the actual MTU.
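
The two quantum values shown above are consistent with the default quantum
being the device MTU plus the 14-byte Ethernet header (1500 + 14 = 1514 and
9000 + 14 = 9014). A minimal shell sketch of that arithmetic follows. The
function name expected_quantum is hypothetical, and the fixed 14-byte header
length is an assumption that holds for plain Ethernet devices only.

```shell
#!/bin/sh
# Hypothetical helper: default fq_codel quantum expected for a given MTU.
# Assumption: plain Ethernet device, i.e. a 14-byte link-layer header.
expected_quantum() {
    echo $(( $1 + 14 ))
}

expected_quantum 1500   # the quantum seen right after boot
expected_quantum 9000   # the quantum seen after replacing mq
```

Comparing this number with the quantum reported by "tc qd show" is a quick
way to tell whether a qdisc was created before or after the MTU change.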

I can't help but feel this is a slight bug in terms of initialization order,
and that the default qdisc should only be created when it's first being
used/attached to a link, not when the sysctls are configured.
Kernel is 5.2.x and I didn't see anything in 5.3 or net-next to address
this yet.

Thoughts?

Holger


* Re: Default qdisc not correctly initialized with custom MTU
From: Cong Wang @ 2019-09-09 22:52 UTC
  To: Holger Hoffstätte; +Cc: Netdev

On Mon, Sep 9, 2019 at 5:44 AM Holger Hoffstätte
<holger@applied-asynchrony.com> wrote:
> I can't help but feel this is a slight bug in terms of initialization order,
> and that the default qdisc should only be created when it's first being
> used/attached to a link, not when the sysctls are configured.

Yeah, this is because the fq_codel qdisc is initialized once and
doesn't get any notification when the netdev's MTU gets changed.
We can "fix" this by adding a NETDEV_CHANGEMTU notifier to
qdisc's, but I don't know if it is really worth the effort.

Is there any reason you can't change that order?

Thanks.


* Re: Default qdisc not correctly initialized with custom MTU
From: Holger Hoffstätte @ 2019-09-10  9:14 UTC
  To: Cong Wang; +Cc: Netdev

On 9/10/19 12:52 AM, Cong Wang wrote:
> On Mon, Sep 9, 2019 at 5:44 AM Holger Hoffstätte
> <holger@applied-asynchrony.com> wrote:
>> I can't help but feel this is a slight bug in terms of initialization order,
>> and that the default qdisc should only be created when it's first being
>> used/attached to a link, not when the sysctls are configured.
> 
> Yeah, this is because the fq_codel qdisc is initialized once and
> doesn't get any notification when the netdev's MTU gets changed.

My point was that it shouldn't be created or initialized at all when
the sysctl is configured, only the name should be validated/stored and
queried when needed. If any interface is brought up before that point,
no value (yet) would just mean "trod along with the defaults" to whoever
is doing the work.

> We can "fix" this by adding a NETDEV_CHANGEMTU notifier to
> qdisc's, but I don't know if it is really worth the effort.

This is essentially the opposite of what I had in mind. The problem is
that the entity was created, not that it needs to be notified.
Also I don't think that would work for scenarios with multiple links
using different MTUs.

> Is there any reason you can't change that order?

Yes, because that wouldn't solve anything?
Like I said, I can just kick the root qdisc to update itself in
a post interface-setup script, and that works fine. Since I need
that script anyway for setting several other parameters for
the device it's no big deal - just another workaround.
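
The workaround described above could be sketched roughly as follows for a
post interface-setup script. This is only an illustration under stated
assumptions: the function names count_stale_quanta and refresh_qdisc_if_stale
are hypothetical, the expected quantum is taken to be MTU + 14 (Ethernet),
and the "tc qdisc replace" call needs root privileges.

```shell
#!/bin/sh
# Hypothetical helper: read `tc qdisc show dev DEV` output on stdin and print
# how many fq_codel lines carry a quantum different from the expected one ($1).
count_stale_quanta() {
    awk -v want="$1" '
        $2 == "fq_codel" {
            for (i = 1; i < NF; i++)
                if ($i == "quantum" && $(i + 1) != want) stale++
        }
        END { print stale + 0 }
    '
}

# Hypothetical wrapper: re-create the root mq qdisc only if at least one
# child quantum does not match MTU + 14 (assumes an Ethernet device).
refresh_qdisc_if_stale() {
    dev=$1
    mtu=$(cat "/sys/class/net/$dev/mtu") || return 1
    stale=$(tc qdisc show dev "$dev" | count_stale_quanta $((mtu + 14)))
    if [ "$stale" -gt 0 ]; then
        tc qdisc replace dev "$dev" root mq
    fi
}
```

Calling "refresh_qdisc_if_stale eth0" at the end of the setup script would
then only touch the qdisc when the quanta are actually out of date.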

A brief look at the initialization in sch_mq/sch_generic unfortunately
didn't really help clear things up for me, hence I guess my real
question is whether a qdisc *must* be created early for some reason
(assuming sysctls come before link setup), or whether this is something
that could be delayed and done on-demand.

thanks,
Holger


* Re: Default qdisc not correctly initialized with custom MTU
From: Cong Wang @ 2019-09-10 16:56 UTC
  To: Holger Hoffstätte; +Cc: Netdev

On Tue, Sep 10, 2019 at 2:14 AM Holger Hoffstätte
<holger@applied-asynchrony.com> wrote:
>
> On 9/10/19 12:52 AM, Cong Wang wrote:
> > On Mon, Sep 9, 2019 at 5:44 AM Holger Hoffstätte
> > <holger@applied-asynchrony.com> wrote:
> >> I can't help but feel this is a slight bug in terms of initialization order,
> >> and that the default qdisc should only be created when it's first being
> >> used/attached to a link, not when the sysctls are configured.
> >
> > Yeah, this is because the fq_codel qdisc is initialized once and
> > doesn't get any notification when the netdev's MTU gets changed.
>
> My point was that it shouldn't be created or initialized at all when
> the sysctl is configured, only the name should be validated/stored and
> queried when needed. If any interface is brought up before that point,
> no value (yet) would just mean "trod along with the defaults" to whoever
> is doing the work.

It is _not_ created when the sysctl is configured; it is either created via
the tc command or implicitly created by the kernel when you bring up eth0.
The sysctl only tells the kernel what to create by default, but never commits it.

>
> > We can "fix" this by adding a NETDEV_CHANGEMTU notifier to
> > qdisc's, but I don't know if it is really worth the effort.
>
> This is essentially the opposite of what I had in mind. The problem is
> that the entity was created, not that it needs to be notified.

Hmm? You did change MTU after adding fq_codel to eth0, right?
So how do you fix this without notification or recreation of fq_codel
in your mind?

I am happy to hear more details.

> Also I don't think that would work for scenarios with multiple links
> using different MTUs.

The fq_codel you created is apparently attached to a netdev, so
I don't think this is even a problem. I _guess_ you somehow
believe you create a standalone fq_codel during sysctl setting;
that is just impossible. It must be attached to an interface, no
matter who creates it.

>
> > Is there any reason you can't change that order?
>
> Yes, because that wouldn't solve anything?

Really? You already said it works for you, as quoted below. I am confused.


> Like I said, I can just kick the root qdisc to update itself in
> a post interface-setup script, and that works fine. Since I need
> that script anyway for setting several other parameters for
> the device it's no big deal - just another workaround.
>
> A brief look at the initialization in sch_mq/sch_generic unfortunately
> didn't really help clear things up for me, hence I guess my real
> question is whether a qdisc *must* be created early for some reason
> (assuming sysctls come before link setup), or whether this is something
> that could be delayed and done on-demand.

The default qdisc is created by the kernel when you don't create any.
Again, you can create your own after changing the MTU; this should
solve the problem you see. It is all about ordering.
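
As an illustration of the ordering point, the sketch below prints a bring-up
sequence in which the MTU is already set by the time the interface goes up
and the kernel attaches the default qdisc. The function name bringup_plan is
hypothetical, nothing is executed by the function itself, and it assumes the
driver allows changing the MTU while the link is down.

```shell
#!/bin/sh
# Hypothetical helper: print the bring-up commands in an order where the MTU
# is set before the link is brought up. Nothing is executed here; the output
# could be piped to `sh` as root to apply it.
bringup_plan() {
    dev=$1 mtu=$2 qdisc=$3
    echo "sysctl -w net.core.default_qdisc=$qdisc"
    echo "ip link set dev $dev mtu $mtu"
    echo "ip link set dev $dev up"
}

bringup_plan eth0 9000 fq_codel
```

Bringing the link up before the MTU line, by contrast, would give the
default qdisc the values that match the interface's default MTU.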

Thanks.


* Re: Default qdisc not correctly initialized with custom MTU
From: Holger Hoffstätte @ 2019-09-10 17:45 UTC
  To: Cong Wang; +Cc: Netdev

On 9/10/19 6:56 PM, Cong Wang wrote:
> It is _not_ created when the sysctl is configured; it is either created via
> the tc command or implicitly created by the kernel when you bring up eth0.
> The sysctl only tells the kernel what to create by default, but never commits it.

Ok, thank you - that's good to know, because it means there is something
wrong with how my interface is initially brought up. And indeed I found
the problem: my startup scripts apparently bring up the interface twice -
once to "pre-start" (load/verify modules etc.) and then again after
applying mtu/route/etc. settings. Obviously, bringing up the interface before
the MTU is set pulls in the default qdisc with the interface's default config,
and that's what I saw after boot. Weird, but what can I say.

Anyway, thanks for trying to help. :)

Holger
