All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Oltean <olteanv@gmail.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Saravana Kannan <saravanak@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Len Brown <lenb@kernel.org>,
	Alvin Sipraga <ALSI@bang-olufsen.dk>,
	kernel-team@android.com, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT
Date: Wed, 1 Sep 2021 04:38:30 +0300	[thread overview]
Message-ID: <20210901013830.yst73ubhsrlml54i@skbuf> (raw)
In-Reply-To: <20210901012826.iuy2bhvkrgahhrl7@skbuf>

On Wed, Sep 01, 2021 at 04:28:26AM +0300, Vladimir Oltean wrote:
> On Wed, Sep 01, 2021 at 02:18:04AM +0300, Vladimir Oltean wrote:
> > On Wed, Sep 01, 2021 at 01:02:09AM +0200, Andrew Lunn wrote:
> > > Rev B is interesting because switch0 and switch1 got genphy, while
> > > switch2 got the correct Marvell PHY driver. switch2 PHYs don't have
> > > interrupt properties, so don't loop back to their parent device.
> > 
> > This is interesting and not what I really expected to happen. It goes to
> > show that we really need more time to understand all the subtleties of
> > device dependencies before jumping on patching stuff.
> 
> There is an even more interesting variation which I would like to point
> out. It seems like a very odd loophole in the device links.
> 
> Take the example of the mv88e6xxx DSA driver. On my board
> (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts), even after I
> had to declare the switches as interrupt controller and add interrupts
> to their internal PHYs, I still need considerable force to 'break' this
> board in the way discussed in this thread. The correct PHY driver insists
> to probe, and not genphy. Let me explain.
> 
> The automatic device links between the switch (supplier, as interrupt-controller)
> and PHYs (consumers) are added by fwnode_link_add, called from of_link_to_phandle.
> 
> Important note: fwnode_link_add does not link devices, it links OF nodes.
> 
> Even more important node, in the form of a comment:
> 
>  * The driver core will use the fwnode link to create a device link between the
>  * two device objects corresponding to @con and @sup when they are created. The
>  * driver core will automatically delete the fwnode link between @con and @sup
>  * after doing that.
> 
> Okay?!
> 
> What seems to be omitted is that the DSA switch driver's probing itself
> can be deferred. For example:
> 
> dsa_register_switch
> -> dsa_switch_probe
>    -> dsa_switch_parse_of
>       -> dsa_switch_parse_ports_of
>          -> dsa_port_parse_of
>             -> of_find_net_device_by_node(of_parse_phandle(dn, "ethernet", 0));
>             -> not found => return -EPROBE_DEFER
> 
> When dsa_register_switch() returns -EPROBE_DEFER, it is effectively
> an error path. So the reverse of initialization is performed.
> 
> The mv88e6xxx driver calls mv88e6xxx_mdios_register() right _before_
> dsa_register_switch. So when dsa_register_switch returns error code,
> mv88e6xxx_mdios_unregister() will be called.
> 
> When mv88e6xxx_mdios_unregister() is called, the MDIO buses with
> internal PHYs are destroyed. So the PHY devices themselves are destroyed
> too. And the device links between the DSA switch and the internal PHYs,
> those created based on the firmware node links created by fwnode_link_add,
> are dropped too.
> 
> Now remember the comment that the device links created based on
> fwnode_link_add are not restored.
> 
> So probing of the DSA switch finally resumes, and this time
> device_links_check_suppliers() is effectively bypassed, the PHYs no
> longer request probe deferral due to their supplier not being ready,
> because the device link no longer exists.
> 
> Isn't this self-sabotaging?!
> 
> Now generally, DSA drivers defer probing because they probe in parallel
> with the DSA master. This is typical if the switch is on a SPI bus, or
> I2C, or on an MDIO bus provided by a _standalone_ MDIO controller.
> 
> If the MDIO controller is not standalone, but is provided by Ethernet
> controller that is the DSA master itself, then things change a lot,
> because probing can never be parallel. The DSA master probes,
> initializes its MDIO bus, and this triggers the probing of the MDIO
> devices on that bus, one of which is the DSA switch. So DSA can no
> longer defer the probe due to that reason.
> 
> Secondly, in DSA we even have variation between drivers as to where they
> register their internal MDIO buses. The mv88e6xxx driver does this in
> mv88e6xxx_probe (the probe function on the MDIO bus). The rtl8366rb
> driver calls realtek_smi_setup_mdio() from rtl8366rb_setup(), and this
> is important. DSA provides drivers with a .setup() callback, which is
> guaranteed to take place after nothing can defer the switch's probe
> anymore.
> 
> So putting two and two together, sure enough, if I move mv88e6xxx_mdios_register
> from mv88e6xxx_probe to mv88e6xxx_setup, then I can reliably break this
> setup, because the device links framework isn't sabotaging itself anymore.
> 
> Conversely, I am pretty sure that if rtl8366rb was to call of_mdiobus_register()
> from the probe method and not the setup method, the entire design issue
> with interrupts on internal DSA switch ports would have went absolutely
> unnoticed for a few more years.
> 
> I have not tested this, but it also seems plausible that DSA can
> trivially and reliably bypass any fw_devlink=on restrictions by simply
> moving all of_mdiobus_register() driver calls from the .setup() method
> to their respective probe methods (prior to calling dsa_register_switch),
> then effectively fabricate an -EPROBE_DEFER during the first probe attempt.
> I mean, who will know whether that probe deferral request was justified
> or not?

Pushing the thought even further, it is not even necessary to move the
of_mdiobus_register() call to the probe function. Where it is (in .setup)
is already good enough. It is sufficient to return -EOPNOTSUPP once
(the first time) immediately _after_ the call to of_mdiobus_register
(and have a proper error path, i.e. call mdiobus_unregister too).

> Anyway, I'm not sure everyone agrees with this type of "solution" (even
> though it's worth pointing it out as a fw_devlink limitation). In any
> case, we need some sort of lightweight "fix" to the chicken-and-egg
> problem, which will give me enough time to think of something better.
> I hope it is at least clearer now that there are subtleties and nuances,
> and we cannot just assess how many boards are broken by looking at the
> device trees. By design, all are, sure, but they might still work, and
> that's better than nothing...

  reply	other threads:[~2021-09-01  1:38 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-26  7:45 [PATCH v1 0/2] Fix rtl8366rb issues with fw_devlink=on Saravana Kannan
2021-08-26  7:45 ` [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT Saravana Kannan
2021-08-26 13:13   ` Andrew Lunn
2021-08-26 19:56     ` Saravana Kannan
2021-08-26 20:53       ` Andrew Lunn
2021-08-26 23:44         ` Saravana Kannan
2021-08-27  1:23           ` Andrew Lunn
2021-08-27  4:02             ` Saravana Kannan
2021-08-27 13:44               ` Andrew Lunn
2021-08-27 18:06                 ` Saravana Kannan
2021-08-27 20:11                   ` Andrew Lunn
2021-08-27 21:33                     ` Saravana Kannan
2021-08-28 17:01                       ` Andrew Lunn
2021-08-30 19:03                         ` Saravana Kannan
2021-08-31 13:16                           ` Andrew Lunn
2021-08-31 19:26                             ` Saravana Kannan
2021-08-31 22:05                               ` Andrew Lunn
2021-08-31 22:18                                 ` Saravana Kannan
2021-08-31 23:02                                   ` Andrew Lunn
2021-08-31 23:18                                     ` Vladimir Oltean
2021-08-31 23:27                                       ` Andrew Lunn
2021-09-01  1:28                                       ` Vladimir Oltean
2021-09-01  1:38                                         ` Vladimir Oltean [this message]
2021-09-01  2:19                                           ` Saravana Kannan
2021-09-01  9:02                                             ` Vladimir Oltean
2021-09-01 22:57                                               ` Saravana Kannan
2021-09-01  2:00                                       ` Saravana Kannan
2021-09-01  8:46                                         ` Vladimir Oltean
2021-09-01 22:53                                           ` Saravana Kannan
2021-09-02 17:41                                           ` Andrew Lunn
2021-09-02 17:58                                             ` Saravana Kannan
2021-09-30  5:33                                       ` Saravana Kannan
2021-09-30 13:15                                         ` Andrew Lunn
2021-09-30 13:43                                         ` Vladimir Oltean
2021-09-30 14:00                                           ` Andrew Lunn
2021-09-30 17:31                                             ` Saravana Kannan
2021-09-30 19:38                                               ` Andrew Lunn
2021-09-30 19:48                                                 ` Saravana Kannan
2021-09-30 20:06                                                   ` Florian Fainelli
2021-09-30 20:14                                                     ` Saravana Kannan
2021-09-30 21:16                                                       ` Florian Fainelli
2021-09-30 20:22                                                     ` Andrew Lunn
2021-09-01  1:46                                     ` Saravana Kannan
2021-08-31 23:18                                   ` Andrew Lunn
2021-09-08 18:35                             ` Saravana Kannan
2021-09-09  1:39                           ` Andrew Lunn
2021-09-09  3:21                             ` Saravana Kannan
2021-09-09 10:38                               ` Vladimir Oltean
2021-09-09 15:00                               ` Andrew Lunn
2021-09-09 16:56                             ` Florian Fainelli
2021-08-26  7:45 ` [PATCH v1 2/2] net: dsa: rtl8366rb: Quick fix to work with fw_devlink=on Saravana Kannan
2021-08-26  7:55   ` Greg Kroah-Hartman
2021-08-26 11:25   ` Florian Fainelli
2021-08-26 17:29     ` Saravana Kannan
2021-08-26 11:29   ` Alvin Šipraga
2021-08-26 17:26     ` Saravana Kannan
2021-08-26 18:04       ` Alvin Šipraga

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210901013830.yst73ubhsrlml54i@skbuf \
    --to=olteanv@gmail.com \
    --cc=ALSI@bang-olufsen.dk \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kernel-team@android.com \
    --cc=kuba@kernel.org \
    --cc=lenb@kernel.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=saravanak@google.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.