linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Saravana Kannan <saravanak@google.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vladimir Oltean <olteanv@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Len Brown <lenb@kernel.org>,
	Alvin Sipraga <ALSI@bang-olufsen.dk>,
	kernel-team@android.com, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT
Date: Thu, 26 Aug 2021 12:56:07 -0700	[thread overview]
Message-ID: <CAGETcx-pSi60NtMM=59cve8kN9ff9fgepQ5R=uJ3Gynzh=0_BA@mail.gmail.com> (raw)
In-Reply-To: <YSeTdb6DbHbBYabN@lunn.ch>

Greg, Florian, Vladimir, Alvin,

Let's continue the rest of the discussion here.

On Thu, Aug 26, 2021 at 6:13 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Thu, Aug 26, 2021 at 12:45:24AM -0700, Saravana Kannan wrote:
> > If a parent device is also a supplier to a child device, fw_devlink=on
> > (correctly) delays the probe() of the child device until the probe() of
> > the parent finishes successfully.
> >
> > However, some drivers of such parent devices (where parent is also a
> > supplier) incorrectly expect the child device to finish probing
> > successfully as soon as they are added using device_add() and before the
> > probe() of the parent device has completed successfully.
>
> Please can you point at the code making this assumption. It sounds
> like we are missing some EPROBE_DEFER handling in the driver, or maybe
> the DSA framework.

For context, this was discussed and explained in [1] and subsequent
replies. But let me summarize it here.

Alvin reported an issue that with fw_devlink=on, his downstream
hardware which is very similar to [2] doesn't have its PHYs probed
correctly. Instead of the PHYs being probed by the specific driver, it
gets probed by the "Generic PHY" driver. For those who aren't very
familiar with PHYs/networking (this is based on what Andrew explained
to me earlier), Ethernet PHYs follow a specific standard and can have
some extended functionality. The specific driver would give the full
functionality, but if it's not available when the PHY needs to be
used/connected, the generic PHY driver is force bound to the PHY and
it gives the basic functionality.

So upon digging into this, this is what I found and where I think we
have some bad assumptions about the driver core are present:

The  DT node in [2] is probed by realtek_smi_probe() [3]. The call flow is:
realtek_smi_probe()
  -> dsa_register_switch()
    -> dsa_switch_probe()
      -> dsa_tree_setup()
        -> dsa_tree_setup_switches()
          -> dsa_switch_setup()
            -> ds->ops->setup(ds)
              -> rtl8366rb_setup()
                -> realtek_smi_setup_mdio()
                  -> of_mdiobus_register()
                     This scans the MDIO bus/DT and device_add()s the PHYs
          -> dsa_port_setup()
            -> dsa_port_link_register_of()
              -> dsa_port_phylink_register()
                -> phylink_of_phy_connect()
                  -> phylink_fwnode_phy_connect()
                    -> phy_attach_direct()
                       This checks if PHY device has already probed (by
                       checking for dev->driver). If not, it forces the
                       probe of the PHY using one of the generic PHY
                       drivers.

So within dsa_register_switch() the PHY device is added and then
expected to have probed in the same thread/calling context. As stated
earlier, this is not guaranteed by the driver core. And this is what
needs fixing. This works as long as the PHYs don't have dependencies
on any other devices/suppliers and never defer probe. In the issue
Alvin reported, the PHYs have a dependency and things fall apart. I
don't have a strong opinion on whether this is a framework level fix
or fixes in a few drivers.

In the specific instance of [2] (providing snippet below to make it
easier to follow), the "phy0" device [4] depends on the "switch"
device [2] since "switch_intc" (the interrupt provider for phy0) is
initialized by the "switch" driver. And fw_devlink=on delays the probe
of phy0 until switch[2] finishes probing successfully (i.e. after
dsa_register_switch() <- realtek_smi_probe() returns) -- this is the
whole point of fw_devlink=on this is what reduces the useless deferred
probes/probe attempts of consumers before the suppliers finish probing
successfully.

Since dsa_register_switch() assumes the PHYs would have been probed as
soon as they are added, but they aren't probed in this case, the PHY
is force bound to the generic PHY driver. Which is the original issue
Alvin reported. Hope this clears things up for everyone.

-Saravana

switch {
        compatible = "realtek,rtl8366rb";
...

        switch_intc: interrupt-controller {
...
        };

        ports {
...
                port@0 {
                        phy-handle = <&phy0>;
                };
                port@1 {
                };
...
        };

        mdio {
                compatible = "realtek,smi-mdio";
...
                phy0: phy@0 {
...
                        interrupt-parent = <&switch_intc>;
                        interrupts = <0>;
                };
...
        };
};

[1] - https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/
[2] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n190
[3] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/dsa/realtek-smi-core.c?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n386
[4] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n255

  reply	other threads:[~2021-08-26 19:56 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-26  7:45 [PATCH v1 0/2] Fix rtl8366rb issues with fw_devlink=on Saravana Kannan
2021-08-26  7:45 ` [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT Saravana Kannan
2021-08-26 13:13   ` Andrew Lunn
2021-08-26 19:56     ` Saravana Kannan [this message]
2021-08-26 20:53       ` Andrew Lunn
2021-08-26 23:44         ` Saravana Kannan
2021-08-27  1:23           ` Andrew Lunn
2021-08-27  4:02             ` Saravana Kannan
2021-08-27 13:44               ` Andrew Lunn
2021-08-27 18:06                 ` Saravana Kannan
2021-08-27 20:11                   ` Andrew Lunn
2021-08-27 21:33                     ` Saravana Kannan
2021-08-28 17:01                       ` Andrew Lunn
2021-08-30 19:03                         ` Saravana Kannan
2021-08-31 13:16                           ` Andrew Lunn
2021-08-31 19:26                             ` Saravana Kannan
2021-08-31 22:05                               ` Andrew Lunn
2021-08-31 22:18                                 ` Saravana Kannan
2021-08-31 23:02                                   ` Andrew Lunn
2021-08-31 23:18                                     ` Vladimir Oltean
2021-08-31 23:27                                       ` Andrew Lunn
2021-09-01  1:28                                       ` Vladimir Oltean
2021-09-01  1:38                                         ` Vladimir Oltean
2021-09-01  2:19                                           ` Saravana Kannan
2021-09-01  9:02                                             ` Vladimir Oltean
2021-09-01 22:57                                               ` Saravana Kannan
2021-09-01  2:00                                       ` Saravana Kannan
2021-09-01  8:46                                         ` Vladimir Oltean
2021-09-01 22:53                                           ` Saravana Kannan
2021-09-02 17:41                                           ` Andrew Lunn
2021-09-02 17:58                                             ` Saravana Kannan
2021-09-30  5:33                                       ` Saravana Kannan
2021-09-30 13:15                                         ` Andrew Lunn
2021-09-30 13:43                                         ` Vladimir Oltean
2021-09-30 14:00                                           ` Andrew Lunn
2021-09-30 17:31                                             ` Saravana Kannan
2021-09-30 19:38                                               ` Andrew Lunn
2021-09-30 19:48                                                 ` Saravana Kannan
2021-09-30 20:06                                                   ` Florian Fainelli
2021-09-30 20:14                                                     ` Saravana Kannan
2021-09-30 21:16                                                       ` Florian Fainelli
2021-09-30 20:22                                                     ` Andrew Lunn
2021-09-01  1:46                                     ` Saravana Kannan
2021-08-31 23:18                                   ` Andrew Lunn
2021-09-08 18:35                             ` Saravana Kannan
2021-09-09  1:39                           ` Andrew Lunn
2021-09-09  3:21                             ` Saravana Kannan
2021-09-09 10:38                               ` Vladimir Oltean
2021-09-09 15:00                               ` Andrew Lunn
2021-09-09 16:56                             ` Florian Fainelli
2021-08-26  7:45 ` [PATCH v1 2/2] net: dsa: rtl8366rb: Quick fix to work with fw_devlink=on Saravana Kannan
2021-08-26  7:55   ` Greg Kroah-Hartman
2021-08-26 11:25   ` Florian Fainelli
2021-08-26 17:29     ` Saravana Kannan
2021-08-26 11:29   ` Alvin Šipraga
2021-08-26 17:26     ` Saravana Kannan
2021-08-26 18:04       ` Alvin Šipraga

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGETcx-pSi60NtMM=59cve8kN9ff9fgepQ5R=uJ3Gynzh=0_BA@mail.gmail.com' \
    --to=saravanak@google.com \
    --cc=ALSI@bang-olufsen.dk \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kernel-team@android.com \
    --cc=kuba@kernel.org \
    --cc=lenb@kernel.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=rafael@kernel.org \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).