All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: Andrew Lunn <andrew@lunn.ch>, Oliver Neukum <oneukum@suse.com>,
	Oleksij Rempel <linux@rempel-privat.de>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Heiner Kallweit <hkallweit1@gmail.com>
Subject: Re: ordering of call to unbind() in usbnet_disconnect
Date: Sat, 26 Mar 2022 13:25:52 +0100	[thread overview]
Message-ID: <20220326122552.GA31022@wunner.de> (raw)
In-Reply-To: <20220315113841.GA22337@pengutronix.de>

On Tue, Mar 15, 2022 at 12:38:41PM +0100, Oleksij Rempel wrote:
> On Tue, Mar 15, 2022 at 09:32:34AM +0100, Lukas Wunner wrote:
> > > > > > On Thu, Mar 10, 2022 at 12:25:08PM +0100, Oliver Neukum wrote:
> > > > > > > I got bug reports that 2c9d6c2b871d ("usbnet: run unbind() before
> > > > > > > unregister_netdev()") is causing regressions.
> > 
> > Is it illegal to disconnect a PHY from an unregistered, but not yet freed
> > net_device?
> > 
> > Oleksij, the commit message of 2c9d6c2b871d says that disconnecting the
> > PHY "fails" in that situation.  Please elaborate what the failure looked
> > like.  Did you get a stacktrace?

Oleksij, I cannot reproduce your stacktrace (included in full length below).
I've tested with kernel 5.13 (since the stacktrace was with 5.13-rc3)
with all your (and other people's) asix patches applied on top,
except for 2c9d6c2b871d.  Tried unplugging an AX88772A multiple times,
never got a stacktrace.

I've also walked down the code paths from usbnet_disconnect() and cannot
see how the stacktrace could occur.

Normally an unregistering netdev is removed from the linkwatch event list
(lweventlist) via this call stack:

          usbnet_disconnect()
            unregister_netdev()
              rtnl_unlock()
                netdev_run_todo()
                  netdev_wait_allrefs()
                    linkwatch_forget_dev()
                      linkwatch_do_dev()

For the stacktrace to occur, the netdev would have to be subsequently
re-added to the linkwatch event list via linkwatch_fire_event().

That is called, among other places, from netif_carrier_off().  However,
netif_carrier_off() is already called *before* linkwatch_forget_dev()
when unregister_netdev() stops the netdev before unregistering it:

          usbnet_disconnect()
            unregister_netdev()
              unregister_netdevice()
                unregister_netdevice_queue(dev, NULL)
                  unregister_netdevice_many()
                    dev_close_many()
                      __dev_close_many()
                        usbnet_stop()
                          ax88772_stop()
                            phy_stop() # state = PHY_HALTED
                              phy_state_machine()
                                phy_link_down()
                                  phy_link_change()
                                    netif_carrier_off()
                                      linkwatch_fire_event()

Again, this is *before* the subsequent call to linkwatch_forget_dev().
I don't see how netif_carrier_off() could be called once more after that.

Perhaps the stacktrace occurred with an earlier version of your patches
which lacked the call to phy_stop() in ax88772_stop()?

I'm going to submit a revert of 2c9d6c2b871d.  If you *are* able to
reproduce the stacktrace after the revert, please add a dump_stack()
to linkwatch_fire_event() so that we can find the root cause.

Thanks,

Lukas

> 
> [   15.459655] asix 2-1.2:1.0 eth1: Link is Up - 100Mbps/Full - flow control off
> [   30.600242] usb 2-1.2: USB disconnect, device number 3
> [   30.611962] asix 2-1.2:1.0 eth1: unregister 'asix' usb-ci_hdrc.1-1.2, ASIX AX88772B USB 2.0 Ethernet
> [   30.649173] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.657027] asix 2-1.2:1.0 eth1 (unregistered): Failed to write Medium Mode mode to 0x0000: ffffffed
> [   30.683006] asix 2-1.2:1.0 eth1 (unregistered): Link is Down
> [   30.689512] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.697359] asix 2-1.2:1.0 eth1 (unregistered): Failed to enable software MII access
> [   30.706009] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.714277] asix 2-1.2:1.0 eth1 (unregistered): Failed to enable software MII access
> [   30.732689] 8<--- cut here ---
> [   30.735757] Unable to handle kernel paging request at virtual address 2e839000
> [   30.742984] pgd = af824ad7
> [   30.745695] [2e839000] *pgd=00000000
> [   30.749282] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> [   30.754602] Modules linked in:
> [   30.757663] CPU: 0 PID: 77 Comm: kworker/0:2 Not tainted 5.13.0-rc3-00818-g06edf1a940be #2
> [   30.765934] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [   30.772466] Workqueue: events linkwatch_event
> [   30.776841] PC is at linkwatch_do_dev+0x6c/0x88
> [   30.781380] LR is at __local_bh_enable_ip+0x6c/0x100
> [   30.786356] pc : [<c08637f4>]    lr : [<c013e5c4>]    psr: 60030093
> [   30.792625] sp : c1d2bed8  ip : 00000001  fp : c28d2000
> [   30.797852] r10: c0fcf19c  r9 : c0fcf170  r8 : 00000000
> [   30.803080] r7 : c102f044  r6 : c1d2beec  r5 : 00000063  r4 : c28d2000
> [   30.809611] r3 : 00000000  r2 : 00000000  r1 : 2e839000  r0 : 60030013
> [   30.816140] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [   30.823367] Control: 10c5387d  Table: 12b54059  DAC: 00000051
> [   30.829114] Register r0 information: non-paged memory
> [   30.834174] Register r1 information: non-paged memory
> [   30.839231] Register r2 information: NULL pointer
> [   30.843941] Register r3 information: NULL pointer
> [   30.848649] Register r4 information: slab kmalloc-2k start c28d2000 pointer offset 0 size 2048
> [   30.857287] Register r5 information: non-paged memory
> [   30.862344] Register r6 information: non-slab/vmalloc memory
> [   30.868008] Register r7 information: non-slab/vmalloc memory
> [   30.873673] Register r8 information: NULL pointer
> [   30.878381] Register r9 information: non-slab/vmalloc memory
> [   30.884045] Register r10 information: non-slab/vmalloc memory
> [   30.889797] Register r11 information: slab kmalloc-2k start c28d2000 pointer offset 0 size 2048
> [   30.898516] Register r12 information: non-paged memory
> [   30.903662] Process kworker/0:2 (pid: 77, stack limit = 0xded42e9b)
> [   30.909935] Stack: (0xc1d2bed8 to 0xc1d2c000)
> [   30.914298] bec0:                                                       c28d22cc c0863a48
> [   30.922481] bee0: 00000000 c1d2a000 00000008 c1d2beec c1d2beec 73675768 c1d4da84 c0fcf170
> [   30.930663] bf00: c1ce0d80 ef6d28c0 ef6d5c00 00000000 00000000 c0fed000 ef6d28c0 c0863b8c
> [   30.938844] bf20: c0fcf170 c0155550 c1d2a000 c0a13fd4 ef6d28d8 c1ce0d80 ef6d28c0 c1ce0d94
> [   30.947026] bf40: ef6d28d8 c0f03d00 00000008 c1d2a000 ef6d28c0 c0155dc0 c1003e48 c0fec754
> [   30.955208] bf60: c1ce75e4 c1ce75c0 c1ce7fc0 c1d2a000 00000000 c1931eb4 c0155d5c c1ce0d80
> [   30.963389] bf80: c1ce75e4 c015bacc 00000000 c1ce7fc0 c015b954 00000000 00000000 00000000
> [   30.971570] bfa0: 00000000 00000000 00000000 c0100150 00000000 00000000 00000000 00000000
> [   30.979750] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   30.987931] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [   30.996114] [<c08637f4>] (linkwatch_do_dev) from [<c0863a48>] (__linkwatch_run_queue+0xe0/0x1f0)
> [   31.004917] [<c0863a48>] (__linkwatch_run_queue) from [<c0863b8c>] (linkwatch_event+0x34/0x3c)
> [   31.013540] [<c0863b8c>] (linkwatch_event) from [<c0155550>] (process_one_work+0x20c/0x5d0)
> [   31.021911] [<c0155550>] (process_one_work) from [<c0155dc0>] (worker_thread+0x64/0x570)
> [   31.030010] [<c0155dc0>] (worker_thread) from [<c015bacc>] (kthread+0x178/0x190)
> [   31.037421] [<c015bacc>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
> [   31.044654] Exception stack(0xc1d2bfb0 to 0xc1d2bff8)
> [   31.049710] bfa0:                                     00000000 00000000 00000000 00000000
> [   31.057891] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   31.066071] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [   31.072692] Code: e10f0000 f10c0080 e59432c8 ee1d1f90 (e7932001) 
> [   31.078788] ---[ end trace f80581862631ce84 ]---

  parent reply	other threads:[~2022-03-26 12:25 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-10 11:25 ordering of call to unbind() in usbnet_disconnect Oliver Neukum
2022-03-10 11:38 ` Oleksij Rempel
2022-03-14 18:42   ` Lukas Wunner
2022-03-14 19:14     ` Andrew Lunn
2022-03-15  5:44       ` Oleksij Rempel
2022-03-15  8:32         ` Lukas Wunner
2022-03-15 11:38           ` Oleksij Rempel
2022-03-15 13:28             ` Andrew Lunn
2022-03-17 15:53               ` Oliver Neukum
2022-03-17 21:03                 ` Lukas Wunner
2022-03-21 10:17                 ` Lukas Wunner
2022-03-21 10:43                   ` Oleksij Rempel
2022-03-31  9:35                   ` Oliver Neukum
2022-03-21 10:02               ` Lukas Wunner
2022-03-21 13:10                 ` Andrew Lunn
2022-03-26 12:39                   ` Lukas Wunner
2022-03-26 12:49                     ` Andrew Lunn
2022-03-26 13:04                       ` Lukas Wunner
2022-03-27  8:37                         ` Oleksij Rempel
2022-03-31  9:20                           ` Oliver Neukum
2022-03-31  9:30                             ` Lukas Wunner
2022-03-31  9:59                               ` Oliver Neukum
2022-03-31 11:22                                 ` Lukas Wunner
2022-03-26 12:25             ` Lukas Wunner [this message]
2022-03-26 12:44               ` Andrew Lunn
2022-03-26 13:01                 ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220326122552.GA31022@wunner.de \
    --to=lukas@wunner.de \
    --cc=andrew@lunn.ch \
    --cc=hkallweit1@gmail.com \
    --cc=linux@rempel-privat.de \
    --cc=netdev@vger.kernel.org \
    --cc=o.rempel@pengutronix.de \
    --cc=oneukum@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.