All of lore.kernel.org
 help / color / mirror / Atom feed
From: Saravana Kannan <saravanak@google.com>
To: "Alvin Šipraga" <ALSI@bang-olufsen.dk>
Cc: Vladimir Oltean <olteanv@gmail.com>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Frank Rowand <frowand.list@gmail.com>,
	Rob Herring <robh+dt@kernel.org>
Subject: Re: [PATCH net] net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
Date: Wed, 18 Aug 2021 20:28:44 -0700	[thread overview]
Message-ID: <CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com> (raw)
In-Reply-To: <6b89a9e1-e92e-ca99-9fbd-1d98f6a7864b@bang-olufsen.dk>

On Wed, Aug 18, 2021 at 3:18 AM Alvin Šipraga <ALSI@bang-olufsen.dk> wrote:
>
> Hi Saravana,
>
> On 8/18/21 4:46 AM, Saravana Kannan wrote:
> > On Tue, Aug 17, 2021 at 3:31 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> >>
> >> Hi Alvin,
> >>
> >> On Tue, Aug 17, 2021 at 09:25:28PM +0000, Alvin Šipraga wrote:
> >>> I have an observation that's slightly out of the scope of your patch,
> >>> but I'll post here on the off chance that you find it relevant.
> >>> Apologies if it's out of place.
> >>>
> >>> Do these integrated NXP PHYs use a specific PHY driver, or do they just
> >>> use the Generic PHY driver?
> >>
> >> They refuse to probe at all with the Generic PHY driver. I have been
> >> caught off guard a few times now when I had a kernel built with
> >> CONFIG_NXP_C45_TJA11XX_PHY=n and their probing returns -22 in that case.
> >>
> >>> If the former is the case, do you experience that the PHY driver fails
> >>> to get probed during mdiobus registration if the kernel uses
> >>> fw_devlink=on?
> >>
> >> I don't test with "fw_devlink=on" in /proc/cmdline, this is the first
> >> time I do it. It behaves exactly as you say.
> >>
> >>>
> >>> In my case I am writing a new subdriver for realtek-smi, a DSA driver
> >>> which registers an internal MDIO bus analogously to sja1105, which is
> >>> why I'm asking. I noticed a deferred probe of the PHY driver because the
> >>> supplier (ethernet-switch) is not ready - presumably because all of this
> >>> is happening in the probe of the switch driver. See below:
> >>>
> >>> [   83.653213] device_add:3270: device: 'SMI-0': device_add
> >>> [   83.653905] device_pm_add:136: PM: Adding info for No Bus:SMI-0
> >>> [   83.654055] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0': device_add
> >>> [   83.654224] device_link_add:843: mdio_bus SMI-0: Linked as a sync state only consumer to ethernet-switch
> >>> [   83.654291] libphy: SMI slave MII: probed
> >>> ...
> >>> [   83.659809] device_add:3270: device: 'SMI-0:00': device_add
> >>> [   83.659883] bus_add_device:447: bus: 'mdio_bus': add device SMI-0:00
> >>> [   83.659970] device_pm_add:136: PM: Adding info for mdio_bus:SMI-0:00
> >>> [   83.660122] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0:00': device_add
> >>> [   83.660274] devices_kset_move_last:2701: devices_kset: Moving SMI-0:00 to end of list
> >>> [   83.660282] device_pm_move_last:203: PM: Moving mdio_bus:SMI-0:00 to end of list
> >>> [   83.660293] device_link_add:859: mdio_bus SMI-0:00: Linked as a consumer to ethernet-switch
> >>> [   83.660350] __driver_probe_device:736: bus: 'mdio_bus': __driver_probe_device: matched device SMI-0:00 with driver RTL8365MB-VC Gigabit Ethernet
> >>> [   83.660365] device_links_check_suppliers:1001: mdio_bus SMI-0:00: probe deferral - supplier ethernet-switch not ready
> >>> [   83.660376] driver_deferred_probe_add:138: mdio_bus SMI-0:00: Added to deferred list
> >>
> >> So it's a circular dependency? Switch cannot finish probing because it
> >> cannot connect to PHY, which cannot probe because switch has not
> >> finished probing, which....
> >
> > Hi Vladimir/Alvin,
> >
> > If there's a cyclic dependency between two devices, then fw_devlink=on
> > is smart enough to notice that. Once it notices a cycle, it knows that
> > it can't tell which one is the real dependency and which one is the
> > false dependency and so stops enforcing ordering between the devices
> > in the cycle.
> >
> > But fw_devlink doesn't understand all the properties yet. Just most of
> > them and I'm always trying to add more. So when it only understands
> > the property that's causing the false dependency but not the property
> > that causes the real dependency, it can cause issues like this where
> > fw_devlink=on enforces the false dependency and the driver/code
> > enforces the real dependency. These are generally easy to fix -- you
> > just need to teach fw_devlink how to parse more properties.
> >
> > This is just a preliminary analysis since I don't have all the info
> > yet -- so I could be wrong. With that said, I happened to be working
> > on adding fw_devlink support for phy-handle property and I think it
> > should fix your issue with fw_devlink=on. Can you give [1] a shot?
>
> I tried [1] but it did not seem to have any effect.
>
> >
> > If it doesn't fix it, can one of you please point me to an upstream
> > dts (not dtsi) file for a platform in which you see this issue? And
> > ideally also the DT nodes and their drivers that are involved in this
> > cycle? With that info, I should be able to root cause this if the
> > patch above doesn't already fix it.
>
> I'm working with a non-upstream dts - maybe Vladimir is using an
> upstream one? The pattern among the drivers/dts is common between our
> two cases.

Ideally, I can get a fully upstream example where this issue is
happening so that I can look at the actual code that's hitting this
issue and be sure my analysis is right.

>
> But for the sake of this discussion, my dts is pretty much the same as
> what you will find in arch/arm/boot/dts/gemini-dlink-dir-685.dts. The
> nodes of interest from that dts file are below, and the driver is in
> drivers/net/ds/{realtek-smi-core.c,rtl8366rb.c}. It's expected that the
> Realtek PHY driver in drivers/net/phy/realtek.c will get probed as part
> of the mdiobus registration, but that never happens. See my previous
> reply for a debug log.

Your DTS might be similar to this, but the driver code also matters
for me to be sure. Anyway, I took a look at this, but my analysis
below is going to be sketchy because I'm not looking at the actual
code that's reproducing this issue.

Assuming this issue actually happens with the example you pointed to
(I don't know this yet), here's what is happening:

The main problem is that the parent device switch seems to be assuming
it's child/grandchild devices (mdiobus/PHYs) will have probed
successfully as soon as they are added. This assumption is not true
and can be broken for multiple reasons such as:

1. The driver for the child devices (PHYs in this case) could be
loaded as a module after the parent (switch) is probed. So when the
devices are added, the PHYs would not be probed.
2. The child devices could defer probe because one of their suppliers
isn't ready yet. Either because of fw_devlink=on or the framework
itself returning -EPROBE_DEFER.
3. The child devices could be getting probed asynchronously. So the
device_add() would kick off a thread to probe the child devices in a
separate thread.

(2) is what is happening in this case. fw_devlink=on sees that
"switch" implements the "switch_intc" and "switch" hasn't finished
probing yet. So it has no way of knowing that switch_intc is actually
ready. And even if switch_intc was registered as part of switch's
probe() by the time the PHYs are added, switch_intc could get
deregistered if the probe fails at a later point. So until probe()
returns 0, fw_devlink can't be fully sure the supplier (switch_intc)
is ready. Which is good in general because you won't have to
forcefully unbind (if that is even handled correctly in the first
place) the consumers of a device if it fails probe() half way through
registering a few services.

I don't fully understand the networking frameworks, but I think
Vladimir might have a point in his earlier reply [1]. If you can make
the switch driver not assume its child PHYs are ready during the
switch's probe and instead have the switch check if the PHYs are ready
when the switch is "opened" that'd be better.

We can come up with hacks that'll delete the dependency that
fw_devlink=on is trying to enforce, but IMHO the proper fix is to have
parent drivers not assume child devices will be probed as soon as
device_add(child) returns. That's not guaranteed at all.

Btw, I do know why things work when you do the module load/unload
thing you mention in [2]. That has to do with some forced deletion of
dependencies that happens when device_bind_driver() is called when the
Generic PHY driver is used. The reason for why that's done is kind of
unrelated to the issue at hand, but the comment for
device_links_force_bind() should tell you why.

Hope that helps.

[1] - https://lore.kernel.org/netdev/20210817224008.pzdomrjaw5ewmpdg@skbuf/
[2] - https://lore.kernel.org/netdev/0c3e8814-acce-5836-3b1a-6804c21e9bf0@bang-olufsen.dk/

-Saravana

>
> / {
>         switch {
>                 compatible = "realtek,rtl8366rb";
>                 mdc-gpios = <&gpio0 21 GPIO_ACTIVE_HIGH>;
>                 mdio-gpios = <&gpio0 22 GPIO_ACTIVE_HIGH>;
>                 reset-gpios = <&gpio0 14 GPIO_ACTIVE_LOW>;
>                 realtek,disable-leds;
>
>                 switch_intc: interrupt-controller {
>                         /* GPIO 15 provides the interrupt */
>                         interrupt-parent = <&gpio0>;
>                         interrupts = <15 IRQ_TYPE_LEVEL_LOW>;
>                         interrupt-controller;
>                         #address-cells = <0>;
>                         #interrupt-cells = <1>;
>                 };
>
>                 ports {
>                         /* snip */
>                 };
>
>                 mdio {
>                         compatible = "realtek,smi-mdio";
>                         #address-cells = <1>;
>                         #size-cells = <0>;
>
>                         phy0: phy@0 {
>                                 reg = <0>;
>                                 interrupt-parent = <&switch_intc>;
>                                 interrupts = <0>;
>                         };
>                         phy1: phy@1 {
>                                 reg = <1>;
>                                 interrupt-parent = <&switch_intc>;
>                                 interrupts = <1>;
>                         };
>                         phy2: phy@2 {
>                                 reg = <2>;
>                                 interrupt-parent = <&switch_intc>;
>                                 interrupts = <2>;
>                         };
>                         phy3: phy@3 {
>                                 reg = <3>;
>                                 interrupt-parent = <&switch_intc>;
>                                 interrupts = <3>;
>                         };
>                         phy4: phy@4 {
>                                 reg = <4>;
>                                 interrupt-parent = <&switch_intc>;
>                                 interrupts = <12>;
>                         };
>                 };
>         };
> };
>
> Thanks for looking into this. Let me know if you need any other info or
> want something tested.
>
> Kind regards,
> Alvin
>
> >
> >>
> >> So how is it supposed to be solved then? Intuitively the 'mdio_bus SMI-0:00'
> >> device should not be added to the deferred list, it should have everything
> >> it needs right now (after all, it works without fw_devlink). No?
> >>
> >> It might be the late hour over here too, but right now I just don't
> >> know. Let me add Saravana to the discussion too, he made an impressive
> >> analysis recently on a PHY probing issue with mdio-mux,
> >
> > Lol, thanks for the kind words.
> >
> >> so the PHY
> >> library probing dependencies should still be fresh in his mind, maybe he
> >> has an idea what's wrong.
> >
> > [1] - https://lore.kernel.org/lkml/20210818021717.3268255-1-saravanak@google.com/T/#u>>
> > Thanks,
> > Saravana
> >

  reply	other threads:[~2021-08-19  3:29 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-17 14:52 [PATCH net] net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse Vladimir Oltean
2021-08-17 21:25 ` Alvin Šipraga
2021-08-17 22:05   ` Andrew Lunn
2021-08-17 22:19     ` Alvin Šipraga
2021-08-17 22:31   ` Vladimir Oltean
2021-08-17 22:40     ` Vladimir Oltean
2021-08-17 23:01     ` Alvin Šipraga
2021-08-18  2:46     ` Saravana Kannan
2021-08-18 10:18       ` Alvin Šipraga
2021-08-19  3:28         ` Saravana Kannan [this message]
2021-08-19 11:22           ` Vladimir Oltean
2021-08-19 13:46             ` Alvin Šipraga
2021-08-20  0:50             ` Saravana Kannan
2021-08-19 13:35           ` Andrew Lunn
2021-08-19 23:52             ` Saravana Kannan
2021-08-20  0:37               ` Vladimir Oltean
2021-08-20  1:25                 ` Saravana Kannan
2021-08-20 13:01               ` Andrew Lunn
2021-08-19 13:42           ` Alvin Šipraga
2021-08-20  1:08             ` Saravana Kannan
2021-08-20 16:52               ` Saravana Kannan
2021-08-20 17:54                 ` Andrew Lunn
2021-08-20 18:10                   ` Saravana Kannan
2021-08-22 14:19                 ` Alvin Šipraga
2021-08-23 18:50                   ` Saravana Kannan
2021-08-23 20:43                     ` Andrew Lunn
2021-08-23 21:23                       ` Saravana Kannan
2021-08-25 13:40                     ` Alvin Šipraga
2021-08-26  5:33                       ` Saravana Kannan
2021-08-26  7:49                         ` Saravana Kannan
2021-08-26 11:09                         ` Alvin Šipraga
2021-08-18  9:30 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com' \
    --to=saravanak@google.com \
    --cc=ALSI@bang-olufsen.dk \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=frowand.list@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=robh+dt@kernel.org \
    --cc=vivien.didelot@gmail.com \
    --cc=vladimir.oltean@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.