All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcin Wojtas <mw@semihalf.com>
To: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Sasha Levin <sashal@kernel.org>,
	Antoine Tenart <antoine.tenart@bootlin.com>,
	stable@vger.kernel.org,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Gabor Samu <samu_gabor@yahoo.ca>,
	Jon Nettleton <jon@solid-run.com>,
	Andrew Elwell <andrew.elwell@gmail.com>
Subject: Re: [PATCH net-next 2/4] net: mvpp2: add mvpp2_phylink_to_port() helper
Date: Thu, 10 Dec 2020 18:43:50 +0100	[thread overview]
Message-ID: <CAPv3WKdWr0zfuTkK+x6u7C6FpFxkVtRFrEq1FvemVpLYw2+5ng@mail.gmail.com> (raw)
In-Reply-To: <20201210154651.GV1551@shell.armlinux.org.uk>

Hi Russell,

czw., 10 gru 2020 o 16:46 Russell King - ARM Linux admin
<linux@armlinux.org.uk> napisał(a):
>
> On Thu, Dec 10, 2020 at 03:35:29PM +0100, Marcin Wojtas wrote:
> > Hi Greg,
> >
> > śr., 9 gru 2020 o 11:59 Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> napisał(a):
> > > What part fixes the issue?  I can't see it...
> >
> > I re-checked in my setup and here's the smallest part of the original
> > patch, that fixes previously described issue:
> > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > index e98be8372780..9d71a4fe1750 100644
> > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > @@ -4767,6 +4767,11 @@ static void mvpp2_port_copy_mac_addr(struct
> > net_device *dev, struct mvpp2 *priv,
> >         eth_hw_addr_random(dev);
> >  }
> >
> > +static struct mvpp2_port *mvpp2_phylink_to_port(struct phylink_config *config)
> > +{
> > +       return container_of(config, struct mvpp2_port, phylink_config);
> > +}
> > +
> >  static void mvpp2_phylink_validate(struct phylink_config *config,
> >                                    unsigned long *supported,
> >                                    struct phylink_link_state *state)
> > @@ -5105,13 +5110,12 @@ static void mvpp2_gmac_config(struct
> > mvpp2_port *port, unsigned int mode,
> >  static void mvpp2_mac_config(struct phylink_config *config, unsigned int mode,
> >                              const struct phylink_link_state *state)
> >  {
> > -       struct net_device *dev = to_net_dev(config->dev);
> > -       struct mvpp2_port *port = netdev_priv(dev);
> > +       struct mvpp2_port *port = mvpp2_phylink_to_port(config);
> >         bool change_interface = port->phy_interface != state->interface;
> >
> >         /* Check for invalid configuration */
> >         if (mvpp2_is_xlg(state->interface) && port->gop_id != 0) {
> > -               netdev_err(dev, "Invalid mode on %s\n", dev->name);
> > +               netdev_err(port->dev, "Invalid mode on %s\n", port->dev->name);
> >                 return;
> >         }
> >
> > @@ -5151,8 +5155,7 @@ static void mvpp2_mac_link_up(struct
> > phylink_config *config,
> >                               int speed, int duplex,
> >                               bool tx_pause, bool rx_pause)
> >  {
> > -       struct net_device *dev = to_net_dev(config->dev);
> > -       struct mvpp2_port *port = netdev_priv(dev);
> > +       struct mvpp2_port *port = mvpp2_phylink_to_port(config);
> >         u32 val;
> >
> >         if (mvpp2_is_xlg(interface)) {
> > @@ -5199,7 +5202,7 @@ static void mvpp2_mac_link_up(struct
> > phylink_config *config,
> >
> >         mvpp2_egress_enable(port);
> >         mvpp2_ingress_enable(port);
> > -       netif_tx_wake_all_queues(dev);
> > +       netif_tx_wake_all_queues(port->dev);
> >  }
> >
> >  static void mvpp2_mac_link_down(struct phylink_config *config,
>
> The problem is caused by this hack:
>
>                 /* Phylink isn't used as of now for ACPI, so the MAC has to be
>                  * configured manually when the interface is started. This will
>                  * be removed as soon as the phylink ACPI support lands in.
>                  */
>                 struct phylink_link_state state = {
>                         .interface = port->phy_interface,
>                 };
>                 mvpp2_mac_config(&port->phylink_config, MLO_AN_INBAND, &state);
>                 mvpp2_mac_link_up(&port->phylink_config, MLO_AN_INBAND,
>                                   port->phy_interface, NULL);
>
> which passes an un-initialised (zeroed) port->phylink_config, as
> phylink is not used in ACPI setups.
>
> The problem occurs because port->phylink_config.dev (which is a
> NULL pointer in this instance) is passed to to_net_dev():
>
> #define to_net_dev(d) container_of(d, struct net_device, dev)
>
> Which then means netdev_priv(dev) attempts to dereference a not-quite
> NULL pointer, leading to an oops.

Correct. Hopefully the MDIO+ACPI patchset will land and I'll be able
to get rid of this hack and do not maintain dual handling paths any
longer.

>
> The problem here is that the bug was not noticed; it seems hardly
> anyone bothers to run mainline kernels with ACPI on Marvell platforms,
> or if they do, they don't bother reporting to mainline communities
> when they have problems. Likely, there's posts on some random web-based
> bulletin board or mailing list that kernel developers don't read
> somewhere complaining that there's an oops.
>

I must admit that due to other duties I did not follow the mainline
mvpp2 for a couple revisions (and I am not maintainer of it). However
recently I got reached-out directly by different developers - the
trigger was different distros upgrading the kernel above v5.4+ and for
some reasons the DT path is not chosen there (and the ACPI will be
chosen more and more in the SystemReady world).

> Like...
>
> https://lists.einval.com/pipermail/macchiato/2020-January/000309.html
> https://gist.github.com/AdrianKoshka/ff9862da2183a2d8e26d47baf8dc04b9
>
> This kind of segmentation is very disappointing; it means potentially
> lots of bugs go by unnoticed by kernel developers, and bugs only get
> fixed by chance.  Had it been reported to somewhere known earlier
> this year, it is likely that a proper fix patch would have been
> created.

Totally agree. As soon as I got noticed directly it took me no time to
find the regression root cause.

>
> How this gets handled is ultimately up to the stable developers to
> decide what they wish to accept. Do they wish to accept a back-ported
> full version of my commit 6c2b49eb9671 ("net: mvpp2: add
> mvpp2_phylink_to_port() helper") that unintentionally fixed this
> unknown issue, or do they want a more minimal fix such as the cut-down
> version of that commit that Marcin has supplied.
>
> Until something changes in the way bugs get reported, I suspect this
> won't be the only instance of bug-fixing-by-accident happening.
>

Thank you Russell for your input.

Best regards,
Marcin

  reply	other threads:[~2020-12-10 17:47 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-20  9:20 [PATCH net-next v2 0/4] Marvell mvpp2 improvements Russell King - ARM Linux admin
2020-06-20  9:21 ` [PATCH net-next 1/4] net: mvpp2: add port support helpers Russell King
2020-06-20  9:21 ` [PATCH net-next 2/4] net: mvpp2: add mvpp2_phylink_to_port() helper Russell King
2020-10-09  3:43   ` Marcin Wojtas
2020-11-02 17:38     ` Marcin Wojtas
2020-11-02 18:03       ` Greg Kroah-Hartman
2020-12-08 12:03         ` Marcin Wojtas
2020-12-08 13:35           ` Sasha Levin
2020-12-08 15:02             ` Marcin Wojtas
2020-12-09 11:00               ` Greg Kroah-Hartman
2020-12-10 14:35                 ` Marcin Wojtas
2020-12-10 15:46                   ` Russell King - ARM Linux admin
2020-12-10 17:43                     ` Marcin Wojtas [this message]
2020-12-10 17:56                       ` Russell King - ARM Linux admin
2020-12-10 18:22                         ` Marcin Wojtas
2020-12-10 20:26                           ` Andrew Lunn
2020-12-10 23:45                             ` Marcin Wojtas
2020-12-11  5:03                             ` Jon Nettleton
2020-12-11 14:01                               ` Andrew Lunn
2020-12-20 17:08                   ` Marcin Wojtas
2020-12-21 18:25                     ` Jakub Kicinski
2020-12-21 18:30                       ` Russell King - ARM Linux admin
2020-12-21 18:47                         ` Jakub Kicinski
2020-12-21 19:07                           ` Sasha Levin
2020-12-21 21:12                             ` Marcin Wojtas
2020-11-02 23:02       ` Russell King - ARM Linux admin
2020-06-20  9:21 ` [PATCH net-next 3/4] net: mvpp2: add register modification helper Russell King
2020-06-20  9:21 ` [PATCH net-next 4/4] net: mvpp2: set xlg flow control in mvpp2_mac_link_up() Russell King
2020-06-21  4:38 ` [PATCH net-next v2 0/4] Marvell mvpp2 improvements David Miller
  -- strict thread matches above, loose matches on Subject: below --
2020-06-18 15:38 [PATCH net-next " Russell King - ARM Linux admin
2020-06-18 15:38 ` [PATCH net-next 2/4] net: mvpp2: add mvpp2_phylink_to_port() helper Russell King
2020-06-20  3:20   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPv3WKdWr0zfuTkK+x6u7C6FpFxkVtRFrEq1FvemVpLYw2+5ng@mail.gmail.com \
    --to=mw@semihalf.com \
    --cc=alexandre.belloni@bootlin.com \
    --cc=andrew.elwell@gmail.com \
    --cc=antoine.tenart@bootlin.com \
    --cc=davem@davemloft.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jon@solid-run.com \
    --cc=kuba@kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=netdev@vger.kernel.org \
    --cc=samu_gabor@yahoo.ca \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.