netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Saravana Kannan <saravanak@google.com>
To: "Alvin Šipraga" <ALSI@bang-olufsen.dk>
Cc: Vladimir Oltean <olteanv@gmail.com>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Frank Rowand <frowand.list@gmail.com>,
	Rob Herring <robh+dt@kernel.org>
Subject: Re: [PATCH net] net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
Date: Mon, 23 Aug 2021 11:50:50 -0700	[thread overview]
Message-ID: <CAGETcx-7xgt5y_zNHzSMQf4YFCmWRPfP4_voshbNxKPgQ=b1tA@mail.gmail.com> (raw)
In-Reply-To: <14891624-655b-a71d-dc04-24404e0c2e1a@bang-olufsen.dk>

On Sun, Aug 22, 2021 at 7:19 AM Alvin Šipraga <ALSI@bang-olufsen.dk> wrote:
>
> Hi Saravana,
>
> Thanks for the follow-up. I tested your change and it does the trick:
> there is no deferral and the PHY driver gets probed first-try during the
> mdiobus registration during the call to dsa_register_switch().

I'm fairly certain the mdiobus registration happens before
dsa_register_switch(). It's in the probe call path of the DSA. The
connecting of the PHYs with the DSA is what happens when
dsa_register_switch() is called.

> I tested
> with the switch, PHY, and tagging drivers all builtin, or all modules,
> and it worked in both cases.
>
> On 8/20/21 6:52 PM, Saravana Kannan wrote:
> > Hi Alvin,
> >
> > Can you give this a shot to see if it fixes your issue? It basically
> > delays the registration of dsa_register_switch() until all the
> > consumers of this switch have probed. So it has a couple of caveats:
>
> Hm, weren't the only consumers the PHYs themselves? It seems like the
> main effect of your change is that - by doing the actual
> dsa_register_switch() call after the switch driver probe - the
> ethernet-switch (provider) is already probed, thereby allowing the PHY
> (consumer) to probe immediately.

Correct-ish -- if you modify this to account for what I said above.

>
> > 1. I'm hoping the PHYs are the only consumers of this switch.
>
> In my case that is true, if you count the mdio_bus as well:
>
> /sys/devices/platform/ethernet-switch# ls -l consumer\:*
> lrwxrwxrwx    1 root     root             0 Aug 22 16:00
> consumer:mdio_bus:SMI-0 ->
> ../../virtual/devlink/platform:ethernet-switch--mdio_bus:SMI-0
> lrwxrwxrwx    1 root     root             0 Aug 22 16:00
> consumer:mdio_bus:SMI-0:00 ->
> ../../virtual/devlink/platform:ethernet-switch--mdio_bus:SMI-0:00
> lrwxrwxrwx    1 root     root             0 Aug 22 16:00
> consumer:mdio_bus:SMI-0:01 ->
> ../../virtual/devlink/platform:ethernet-switch--mdio_bus:SMI-0:01
> lrwxrwxrwx    1 root     root             0 Aug 22 16:00
> consumer:mdio_bus:SMI-0:02 ->
> ../../virtual/devlink/platform:ethernet-switch--mdio_bus:SMI-0:02
> lrwxrwxrwx    1 root     root             0 Aug 22 16:00
> consumer:mdio_bus:SMI-0:03 ->
> ../../virtual/devlink/platform:ethernet-switch--mdio_bus:SMI-0:03

Hmm... mdio_bus being a consumer should prevent the sync_state() from
being called on "ethernet-switch". What's the value of the "status"
and "sync_state_only" files inside that mdio_bus folder?

> > 2. All of them have to probe successfully before the switch will
> > register itself.
>
> Yes.

Right, it's a yes in your case. But will it be a yes for all instances
of "realtek,rtl8366rb"?

> > 3. If dsa_register_switch() fails, we can't defer the probe (because
> > it already succeeded). But I'm not sure if it's a likely error code.
>
> It's of course possible that dsa_register_switch() fails. Assuming
> fw_devlink is doing its job properly, I think the reason is most likely
> going to be something specific to the driver, such as a communication
> timeout with the switch hardware itself.

But what if someone sets fw_devlink=permissive? Is it okay to break
this? There are ways to make this work for fw_devlink=permissive and
=on -- you check for each and decide where to call
dsa_register_switch() based on that.

> I get the impression that you don't necessarily regard this change as a
> proper fix, so I'm happy to do further tests if you choose to
> investigate further.

I thought about this in the background the past few days. I think
there are a couple of options:
1. We (community/Andrew) agree that this driver would only work with
fw_devlink=on and we can confirm that the other upstream uses of
"realtek,rtl8366rb" won't have any unprobed consumers problem and
switch to using my patch. Benefit is that it's a trivial and quick
change that gets things working again.
2. The "realtek,rtl8366rb" driver needs to be fixed to use a
"component device". A component device is a logical device that
represents a group of other devices. It's only initialized after all
these devices have probed successfully. The actual switch should be a
component device and it should call dsa_register_switch() in it's
"bind" (equivalent of probe). That way you can explicitly control what
devices need to be probed instead of depending on sync_state() that
have a bunch of caveats.

Alvin, do you want to take up (2)?

-Saravana

  reply	other threads:[~2021-08-23 18:51 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-17 14:52 [PATCH net] net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse Vladimir Oltean
2021-08-17 21:25 ` Alvin Šipraga
2021-08-17 22:05   ` Andrew Lunn
2021-08-17 22:19     ` Alvin Šipraga
2021-08-17 22:31   ` Vladimir Oltean
2021-08-17 22:40     ` Vladimir Oltean
2021-08-17 23:01     ` Alvin Šipraga
2021-08-18  2:46     ` Saravana Kannan
2021-08-18 10:18       ` Alvin Šipraga
2021-08-19  3:28         ` Saravana Kannan
2021-08-19 11:22           ` Vladimir Oltean
2021-08-19 13:46             ` Alvin Šipraga
2021-08-20  0:50             ` Saravana Kannan
2021-08-19 13:35           ` Andrew Lunn
2021-08-19 23:52             ` Saravana Kannan
2021-08-20  0:37               ` Vladimir Oltean
2021-08-20  1:25                 ` Saravana Kannan
2021-08-20 13:01               ` Andrew Lunn
2021-08-19 13:42           ` Alvin Šipraga
2021-08-20  1:08             ` Saravana Kannan
2021-08-20 16:52               ` Saravana Kannan
2021-08-20 17:54                 ` Andrew Lunn
2021-08-20 18:10                   ` Saravana Kannan
2021-08-22 14:19                 ` Alvin Šipraga
2021-08-23 18:50                   ` Saravana Kannan [this message]
2021-08-23 20:43                     ` Andrew Lunn
2021-08-23 21:23                       ` Saravana Kannan
2021-08-25 13:40                     ` Alvin Šipraga
2021-08-26  5:33                       ` Saravana Kannan
2021-08-26  7:49                         ` Saravana Kannan
2021-08-26 11:09                         ` Alvin Šipraga
2021-08-18  9:30 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGETcx-7xgt5y_zNHzSMQf4YFCmWRPfP4_voshbNxKPgQ=b1tA@mail.gmail.com' \
    --to=saravanak@google.com \
    --cc=ALSI@bang-olufsen.dk \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=frowand.list@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=robh+dt@kernel.org \
    --cc=vivien.didelot@gmail.com \
    --cc=vladimir.oltean@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).