linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] driver core: Fix bus_type.match() error handling
@ 2022-08-15 21:19 ` Isaac J. Manjarres
  2022-08-16  4:25   ` Guenter Roeck
  2022-08-25  9:22   ` Marek Szyprowski
  0 siblings, 2 replies; 18+ messages in thread
From: Isaac J. Manjarres @ 2022-08-15 21:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski
  Cc: Isaac J. Manjarres, Saravana Kannan, stable, Guenter Roeck,
	kernel-team, linux-kernel

Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.

If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.

If __driver_attach() detects that a driver tried to match with a device
and that results in any error, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.

Cc: Saravana Kannan <saravanak@google.com>
Cc: stable@kernel.org
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
---
 drivers/base/dd.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

v1 -> v2:
- Fixed the logic in __driver_attach() to allow a driver to continue
  attempting to match and bind with devices in case of any error, not
  just probe deferral.

Guenter,

Can you please test this patch to make sure it still works for you?

Thanks,
Isaac

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 70f79fc71539..453eb19a9a27 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -881,6 +881,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
 		dev_dbg(dev, "Device match requests probe deferral\n");
 		dev->can_match = true;
 		driver_deferred_probe_add(dev);
+		/*
+		 * Device can't match with a driver right now, so don't attempt
+		 * to match or bind with other drivers on the bus.
+		 */
+		return ret;
 	} else if (ret < 0) {
 		dev_dbg(dev, "Bus failed to match device: %d\n", ret);
 		return ret;
@@ -1120,9 +1125,18 @@ static int __driver_attach(struct device *dev, void *data)
 		dev_dbg(dev, "Device match requests probe deferral\n");
 		dev->can_match = true;
 		driver_deferred_probe_add(dev);
+		/*
+		 * Driver could not match with device right now, but may match
+		 * with another device on the bus.
+		 */
+		return 0;
 	} else if (ret < 0) {
 		dev_dbg(dev, "Bus failed to match device: %d\n", ret);
-		return ret;
+		/*
+		 * Driver could not match with device, but may match with
+		 * another device on the bus.
+		 */
+		return 0;
 	} /* ret > 0 means positive match */
 
 	if (driver_allows_async_probing(drv)) {
-- 
2.37.1.595.g718a3a8f04-goog
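
To make the return-value changes in the diff above easier to follow, here is
a minimal user-space sketch of the iteration semantics they rely on. It is a
model, not the code in drivers/base/dd.c. It assumes that the bus iteration
helpers (bus_for_each_drv() for __device_attach_driver() and
bus_for_each_dev() for __driver_attach()) stop at the first callback that
returns a nonzero value. The names for_each_entry(), struct counters and the
three *_model/_prepatch callbacks are hypothetical, the bind step is reduced
to a counter, and EPROBE_DEFER is redefined locally because it is a
kernel-internal errno.

```c
#include <stddef.h>

/* Kernel-internal errno (include/linux/errno.h); redefined for user space. */
#define EPROBE_DEFER 517

struct counters {
	int visited;	/* callbacks invoked */
	int bound;	/* bind attempts made */
};

typedef int (*visit_fn)(int match, void *data);

/* Hypothetical stand-in for the bus walkers: stop on first nonzero return. */
int for_each_entry(const int *match, size_t n, visit_fn fn, void *data)
{
	int ret = 0;

	for (size_t i = 0; i < n && !ret; i++)
		ret = fn(match[i], data);
	return ret;
}

/* Device side before the patch: -EPROBE_DEFER falls through to the bind. */
int device_attach_prepatch(int match, void *data)
{
	struct counters *c = data;

	c->visited++;
	if (match == 0)
		return 0;		/* no match: try the next driver */
	if (match < 0 && match != -EPROBE_DEFER)
		return match;		/* hard error: stop the walk */
	c->bound++;			/* reached on deferral too: the bug */
	return 1;			/* model: a bind ends the walk */
}

/* Device side after the patch: any negative match result stops the walk. */
int device_attach_model(int match, void *data)
{
	struct counters *c = data;

	c->visited++;
	if (match == 0)
		return 0;
	if (match < 0)
		return match;		/* deferral or error: no bind attempt */
	c->bound++;
	return 1;
}

/* Driver side after the patch: errors skip this device, the walk goes on. */
int driver_attach_model(int match, void *data)
{
	struct counters *c = data;

	c->visited++;
	if (match <= 0)
		return 0;		/* no match, deferral or error */
	c->bound++;
	return 0;			/* keep looking at other devices */
}
```

With match results { 0, -EPROBE_DEFER, 1 }, the pre-patch device-side
callback makes a bind attempt on the deferred entry, the patched device-side
callback stops there without binding, and the driver-side callback skips the
deferred entry and still reaches the third one.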


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-15 21:19 ` [PATCH v2] driver core: Fix bus_type.match() error handling Isaac J. Manjarres
@ 2022-08-16  4:25   ` Guenter Roeck
  2022-08-16  5:17     ` Isaac Manjarres
  2022-08-25  9:22   ` Marek Szyprowski
  1 sibling, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-16  4:25 UTC (permalink / raw)
  To: Isaac J. Manjarres
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Mon, Aug 15, 2022 at 02:19:18PM -0700, Isaac J. Manjarres wrote:
> Both __device_attach_driver() and __driver_attach() check the return
> code of the bus_type.match() function to see if the device needs to be
> added to the deferred probe list. After adding the device to the list,
> the logic attempts to bind the device to the driver anyway, as if the
> device had matched with the driver, which is not correct.
> 
> If __device_attach_driver() detects that the device in question is not
> ready to match with a driver on the bus, then it doesn't make sense for
> the device to attempt to bind with the current driver or continue
> attempting to match with any of the other drivers on the bus. So, update
> the logic in __device_attach_driver() to reflect this.
> 
> If __driver_attach() detects that a driver tried to match with a device
> and that results in any error, then the driver should not attempt to bind
> with the device. However, the driver can still attempt to match and bind
> with other devices on the bus, as drivers can be bound to multiple
> devices. So, update the logic in __driver_attach() to reflect this.
> 
> Cc: Saravana Kannan <saravanak@google.com>
> Cc: stable@kernel.org
> Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
> ---
>  drivers/base/dd.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> v1 -> v2:
> - Fixed the logic in __driver_attach() to allow a driver to continue
>   attempting to match and bind with devices in case of any error, not
>   just probe deferral.
> 
> Guenter,
> 
> Can you please test this patch to make sure it still works for you?
> 

Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
emulations now stall during boot when trying to boot from usb.

Guenter

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-16  4:25   ` Guenter Roeck
@ 2022-08-16  5:17     ` Isaac Manjarres
  2022-08-16 11:13       ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Isaac Manjarres @ 2022-08-16  5:17 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Mon, Aug 15, 2022 at 09:25:07PM -0700, Guenter Roeck wrote:
> > v1 -> v2:
> > - Fixed the logic in __driver_attach() to allow a driver to continue
> >   attempting to match and bind with devices in case of any error, not
> >   just probe deferral.
> > 
> > Guenter,
> > 
> > Can you please test this patch to make sure it still works for you?
> > 
> 
> Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
> emulations now stall during boot when trying to boot from usb.
> 
> Guenter

Thanks for trying the patch out. This patch isn't meant to fix the clk
crash that you mentioned on another thread. I had made the following patch for
that: https://lore.kernel.org/lkml/YvqTvuqSll30Rv2k@google.com/. Have
you been able to give that a shot yet? If not, can you please test with both
the patch in this e-mail and that patch?

Please make sure you do not include this patch as it is known to cause
deadlocks: https://lore.kernel.org/lkml/YvXhJRlHN9OAIA5l@google.com/.

Did you test imx25-pdk emulations with just v1 of this patch previously?

Thanks,
Isaac

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-16  5:17     ` Isaac Manjarres
@ 2022-08-16 11:13       ` Guenter Roeck
  2022-08-16 17:13         ` Isaac Manjarres
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-16 11:13 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Mon, Aug 15, 2022 at 10:17:23PM -0700, Isaac Manjarres wrote:
> On Mon, Aug 15, 2022 at 09:25:07PM -0700, Guenter Roeck wrote:
> > > v1 -> v2:
> > > - Fixed the logic in __driver_attach() to allow a driver to continue
> > >   attempting to match and bind with devices in case of any error, not
> > >   just probe deferral.
> > > 
> > > Guenter,
> > > 
> > > Can you please test this patch to make sure it still works for you?
> > > 
> > 
> > Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
> > emulations now stall during boot when trying to boot from usb.
> > 
> > Guenter
> Thanks for trying the patch out. This patch isn't meant to fix the clk
> crash that you mentioned on another thread. I had made the following patch for
> that: https://lore.kernel.org/lkml/YvqTvuqSll30Rv2k@google.com/. Have
> you been able to give that a shot yet? If not can you please test with the
> patch in this e-mail and that patch?
> 

No, sorry, I missed that one. It does not apply, though - it is whitespace
corrupted. I tried to fix it up, but that failed.

> Please make sure you do not include this patch as it is known to cause
> deadlocks: https://lore.kernel.org/lkml/YvXhJRlHN9OAIA5l@google.com/.
> 
No, I did not include that patch.

> Did you test imx25-pdk emulations with just v1 of this patch previously?
> 

I am quite sure I did because it is a single setup.

Thanks,
Guenter

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-16 11:13       ` Guenter Roeck
@ 2022-08-16 17:13         ` Isaac Manjarres
  2022-08-17  1:05           ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Isaac Manjarres @ 2022-08-16 17:13 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Tue, Aug 16, 2022 at 04:13:11AM -0700, Guenter Roeck wrote:
> On Mon, Aug 15, 2022 at 10:17:23PM -0700, Isaac Manjarres wrote:
> > On Mon, Aug 15, 2022 at 09:25:07PM -0700, Guenter Roeck wrote:
> > > > v1 -> v2:
> > > > - Fixed the logic in __driver_attach() to allow a driver to continue
> > > >   attempting to match and bind with devices in case of any error, not
> > > >   just probe deferral.
> > > > 
> > > > Guenter,
> > > > 
> > > > Can you please test this patch to make sure it still works for you?
> > > > 
> > > 
> > > Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
> > > emulations now stall during boot when trying to boot from usb.
> > > 
> > > Guenter
> > Thanks for trying the patch out. This patch isn't meant to fix the clk
> > crash that you mentioned on another thread. I had made the following patch for
> > that: https://lore.kernel.org/lkml/YvqTvuqSll30Rv2k@google.com/. Have
> > you been able to give that a shot yet? If not can you please test with the
> > patch in this e-mail and that patch?
> > 
> 
> No, sorry, I missed that one. It does not apply, though - it is whitespace
> corrupted. I tried to fix it up, but that failed.

When applying the patch, can you please try with
git apply --ignore-whitespace ? That worked for me.
> 
> > Did you test imx25-pdk emulations with just v1 of this patch previously?
> > 
> 
> I am quite sure I did because it is a single setup.
> 
That's odd. Is this something that I can try out on qemu? If so, can you
please share the qemu commandline so I can take a look?

Thanks,
Isaac

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-16 17:13         ` Isaac Manjarres
@ 2022-08-17  1:05           ` Guenter Roeck
  2022-08-17  1:12             ` Isaac Manjarres
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-17  1:05 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Tue, Aug 16, 2022 at 10:13:28AM -0700, Isaac Manjarres wrote:
> On Tue, Aug 16, 2022 at 04:13:11AM -0700, Guenter Roeck wrote:
> > On Mon, Aug 15, 2022 at 10:17:23PM -0700, Isaac Manjarres wrote:
> > > On Mon, Aug 15, 2022 at 09:25:07PM -0700, Guenter Roeck wrote:
> > > > > v1 -> v2:
> > > > > - Fixed the logic in __driver_attach() to allow a driver to continue
> > > > >   attempting to match and bind with devices in case of any error, not
> > > > >   just probe deferral.
> > > > > 
> > > > > Guenter,
> > > > > 
> > > > > Can you please test this patch to make sure it still works for you?
> > > > > 
> > > > 
> > > > Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
> > > > emulations now stall during boot when trying to boot from usb.
> > > > 
> > > > Guenter
> > > Thanks for trying the patch out. This patch isn't meant to fix the clk
> > > crash that you mentioned on another thread. I had made the following patch for
> > > that: https://lore.kernel.org/lkml/YvqTvuqSll30Rv2k@google.com/. Have
> > > you been able to give that a shot yet? If not can you please test with the
> > > patch in this e-mail and that patch?
> > > 
> > 
> > No, sorry, I missed that one. It does not apply, though - it is whitespace
> > corrupted. I tried to fix it up, but that failed.
> 
> When applying the patch, can you please try with
> git apply --ignore-whitespace ? That worked for me.

Ok, that worked. With the above patch, the problems with sx1 and versatileab
are gone. However, imx25-pdk fails to shut down when booting from usb
drive. I cross checked that this does not happen without the above patch.

Guenter

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-17  1:05           ` Guenter Roeck
@ 2022-08-17  1:12             ` Isaac Manjarres
  2022-08-18 22:59               ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Isaac Manjarres @ 2022-08-17  1:12 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Tue, Aug 16, 2022 at 06:05:59PM -0700, Guenter Roeck wrote:
> On Tue, Aug 16, 2022 at 10:13:28AM -0700, Isaac Manjarres wrote:
> > On Tue, Aug 16, 2022 at 04:13:11AM -0700, Guenter Roeck wrote:
> > > On Mon, Aug 15, 2022 at 10:17:23PM -0700, Isaac Manjarres wrote:
> > > > On Mon, Aug 15, 2022 at 09:25:07PM -0700, Guenter Roeck wrote:
> > > > > > v1 -> v2:
> > > > > > - Fixed the logic in __driver_attach() to allow a driver to continue
> > > > > >   attempting to match and bind with devices in case of any error, not
> > > > > >   just probe deferral.
> > > > > > 
> > > > > > Guenter,
> > > > > > 
> > > > > > Can you please test this patch to make sure it still works for you?
> > > > > > 
> > > > > 
> > > > > Not as well as v1. I still see the clk crash with versatileab, and imx25-pdk
> > > > > emulations now stall during boot when trying to boot from usb.
> > > > > 
> > > > > Guenter
> > > > Thanks for trying the patch out. This patch isn't meant to fix the clk
> > > > crash that you mentioned on another thread. I had made the following patch for
> > > > that: https://lore.kernel.org/lkml/YvqTvuqSll30Rv2k@google.com/. Have
> > > > you been able to give that a shot yet? If not can you please test with the
> > > > patch in this e-mail and that patch?
> > > > 
> > > 
> > > No, sorry, I missed that one. It does not apply, though - it is whitespace
> > > corrupted. I tried to fix it up, but that failed.
> > 
> > When applying the patch, can you please try with
> > git apply --ignore-whitespace ? That worked for me.
> 
> Ok, that worked. With the above patch, the problems with sx1 and versatileab
> are gone.

Good to hear! Thanks for testing that patch out.
> However, imx25-pdk fails to shut down when booting from usb
> drive. I cross checked that this does not happen without the above patch.
> 
> Guenter

Can you please share the following for your imx25-pdk environment?
qemu commandline, defconfig, dtb, and baseline kernel commit you're
using for testing.

Thanks,
Isaac

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-17  1:12             ` Isaac Manjarres
@ 2022-08-18 22:59               ` Guenter Roeck
  2022-08-19  0:38                 ` Isaac Manjarres
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-18 22:59 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Tue, Aug 16, 2022 at 06:12:30PM -0700, Isaac Manjarres wrote:
> > However, imx25-pdk fails to shut down when booting from usb
> > drive. I cross checked that this does not happen without the above patch.
> > 
> > Guenter
> 
> Can you please share the following for your imx25-pdk environment?
> qemu commandline, defconfig, dtb, and baseline kernel commit you're
> using for testing.
> 
It doesn't only affect imx25-pdk but also mcimx7d-sabre.
Problem is the same for both: Reboot after booting from
usb drive doesn't work; the system is stuck in shutdown.
Typical tail of log sequence:

Requesting system reboot
sd 0:0:0:0: [sda] Synchronizing SCSI cache
ci_hdrc ci_hdrc.1: remove, state 4
usb usb2: USB disconnect, device number 1
ci_hdrc ci_hdrc.1: USB bus 2 deregistered
ci_hdrc ci_hdrc.0: remove, state 1
usb usb1: USB disconnect, device number 1
usb 1-1: USB disconnect, device number 2
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK

[ stuck here until qemu is killed ]

qemu command line:

qemu-system-arm -M imx25-pdk -m 128 \
     -kernel arch/arm/boot/zImage -no-reboot -snapshot \
     -usb -device usb-storage,drive=d0,bus=usb-bus.0 \
     -drive file=rootfs-armv5.ext2,if=none,id=d0,format=raw \
     --append "root=/dev/sda rootwait console=ttymxc0,115200" \
     -dtb arch/arm/boot/dts/imx25-pdk.dtb \
     -nographic -monitor null -serial stdio

Root file system is
https://github.com/groeck/linux-build-test/blob/master/rootfs/arm/rootfs-armv5.ext2.gz

defconfig is below.

Hope this helps,
Guenter

---
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_CGROUPS=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_EXPERT=y
CONFIG_PROFILING=y
CONFIG_ARCH_MULTI_V4T=y
CONFIG_ARCH_MULTI_V5=y
# CONFIG_ARCH_MULTI_V7 is not set
CONFIG_ARCH_MXC=y
CONFIG_SOC_IMX1=y
CONFIG_SOC_IMX25=y
CONFIG_SOC_IMX27=y
CONFIG_AEABI=y
CONFIG_PM_DEBUG=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_SWAP is not set
CONFIG_SLAB=y
# CONFIG_COMPAT_BRK is not set
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_PNP=y
# CONFIG_INET_DIAG is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_WIRELESS is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_PM_QOS_KUNIT_TEST=y
CONFIG_IMX_WEIM=y
CONFIG_MTD=y
CONFIG_MTD_CMDLINE_PARTS=y
CONFIG_MTD_BLOCK=y
CONFIG_MTD_CFI=y
CONFIG_MTD_CFI_ADV_OPTIONS=y
CONFIG_MTD_CFI_GEOMETRY=y
# CONFIG_MTD_MAP_BANK_WIDTH_1 is not set
# CONFIG_MTD_CFI_I2 is not set
CONFIG_MTD_CFI_INTELEXT=y
CONFIG_MTD_PHYSMAP=y
CONFIG_MTD_RAW_NAND=y
CONFIG_MTD_UBI=y
CONFIG_OF_UNITTEST=y
CONFIG_VIRTIO_BLK=y
CONFIG_EEPROM_AT25=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
# CONFIG_BLK_DEV_BSG is not set
CONFIG_SCSI_VIRTIO=y
CONFIG_ATA=y
CONFIG_PATA_IMX=y
CONFIG_NETDEVICES=y
CONFIG_VIRTIO_NET=y
CONFIG_CS89x0_PLATFORM=y
CONFIG_DM9000=y
CONFIG_SMC91X=y
CONFIG_SMC911X=y
CONFIG_SMSC911X=y
CONFIG_SMSC_PHY=y
CONFIG_USB_USBNET=y
# CONFIG_WLAN is not set
CONFIG_INPUT_EVDEV=y
CONFIG_KEYBOARD_GPIO=y
CONFIG_KEYBOARD_IMX=y
# CONFIG_INPUT_MOUSE is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_ADS7846=m
CONFIG_TOUCHSCREEN_MX25=y
CONFIG_TOUCHSCREEN_MC13783=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_8250=m
CONFIG_SERIAL_IMX=y
CONFIG_SERIAL_IMX_CONSOLE=y
# CONFIG_HW_RANDOM is not set
CONFIG_SPI=y
CONFIG_SPI_IMX=y
CONFIG_SPI_SPIDEV=y
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_MXC=y
CONFIG_W1=y
CONFIG_W1_MASTER_MXC=y
CONFIG_W1_SLAVE_THERM=y
CONFIG_HWMON=m
CONFIG_SENSORS_MC13783_ADC=m
CONFIG_WATCHDOG=y
CONFIG_IMX2_WDT=y
CONFIG_MFD_MC13XXX_SPI=y
CONFIG_MFD_MX25_TSADC=y
CONFIG_REGULATOR=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_REGULATOR_GPIO=y
CONFIG_REGULATOR_MC13783=y
CONFIG_REGULATOR_MC13892=y
CONFIG_MEDIA_SUPPORT=y
CONFIG_V4L_PLATFORM_DRIVERS=y
CONFIG_V4L_MEM2MEM_DRIVERS=y
CONFIG_VIDEO_CODA=y
# CONFIG_DRM_DEBUG_MODESET_LOCK is not set
CONFIG_FB=y
CONFIG_FB_IMX=y
CONFIG_LCD_L4F00242T03=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_LOGO=y
CONFIG_SOUND=y
CONFIG_SND=y
# CONFIG_SND_ARM is not set
# CONFIG_SND_SPI is not set
CONFIG_SND_SOC=y
CONFIG_SND_IMX_SOC=y
CONFIG_USB_HID=m
CONFIG_USB=y
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_UAS=y
CONFIG_USB_CHIPIDEA=y
CONFIG_USB_CHIPIDEA_UDC=y
CONFIG_USB_CHIPIDEA_HOST=y
CONFIG_USB_TEST=y
CONFIG_USB_EHSET_TEST_FIXTURE=y
CONFIG_USB_LINK_LAYER_TEST=y
CONFIG_NOP_USB_XCEIV=y
CONFIG_USB_GADGET=y
CONFIG_USB_ETH=m
CONFIG_MMC=y
CONFIG_MMC_SDHCI=y
CONFIG_MMC_SDHCI_PLTFM=y
CONFIG_MMC_SDHCI_ESDHC_IMX=y
CONFIG_MMC_MXC=y
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
CONFIG_LEDS_GPIO=y
CONFIG_LEDS_MC13783=y
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
CONFIG_LEDS_TRIGGER_BACKLIGHT=y
CONFIG_LEDS_TRIGGER_GPIO=y
CONFIG_LEDS_TRIGGER_DEFAULT_ON=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_DRV_IMXDI=y
CONFIG_RTC_DRV_MC13XXX=y
CONFIG_RTC_DRV_MXC=y
CONFIG_DMADEVICES=y
CONFIG_IMX_DMA=y
CONFIG_IMX_SDMA=y
CONFIG_DMATEST=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_VIRTIO_MMIO=y
# CONFIG_IOMMU_SUPPORT is not set
CONFIG_IIO=y
CONFIG_FSL_MX25_ADC=y
CONFIG_PWM=y
CONFIG_PWM_IMX1=y
CONFIG_PWM_IMX27=y
CONFIG_EXT3_FS=y
CONFIG_EXT4_KUNIT_TESTS=y
CONFIG_BTRFS_FS=y
# CONFIG_DNOTIFY is not set
CONFIG_ISO9660_FS=y
CONFIG_VFAT_FS=y
# CONFIG_PROC_PAGE_MONITOR is not set
CONFIG_TMPFS=y
CONFIG_JFFS2_FS=y
CONFIG_UBIFS_FS=y
CONFIG_SQUASHFS=y
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_15=m
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
CONFIG_CRC32_SELFTEST=y
CONFIG_GLOB_SELFTEST=y
CONFIG_FONTS=y
CONFIG_FONT_8x8=y
CONFIG_DEBUG_INFO_DWARF5=y
CONFIG_KFENCE=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
CONFIG_WW_MUTEX_SELFTEST=y
CONFIG_DEBUG_LIST=y
CONFIG_RCU_EQS_DEBUG=y
CONFIG_KUNIT=y
CONFIG_KUNIT_TEST=y
CONFIG_TEST_SORT=y
CONFIG_RBTREE_TEST=y
CONFIG_INTERVAL_TREE_TEST=y
CONFIG_STRING_SELFTEST=y
CONFIG_TEST_BITMAP=y
CONFIG_TEST_UUID=y
CONFIG_TEST_FIRMWARE=y
CONFIG_TEST_SYSCTL=y
CONFIG_RESOURCE_KUNIT_TEST=y
CONFIG_SYSCTL_KUNIT_TEST=y
CONFIG_LIST_KUNIT_TEST=y
CONFIG_CMDLINE_KUNIT_TEST=y
CONFIG_MEMCPY_KUNIT_TEST=y

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-18 22:59               ` Guenter Roeck
@ 2022-08-19  0:38                 ` Isaac Manjarres
  2022-08-19 11:28                   ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Isaac Manjarres @ 2022-08-19  0:38 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Thu, Aug 18, 2022 at 03:59:32PM -0700, Guenter Roeck wrote:
> Requesting system reboot
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> ci_hdrc ci_hdrc.1: remove, state 4
> usb usb2: USB disconnect, device number 1
> ci_hdrc ci_hdrc.1: USB bus 2 deregistered
> ci_hdrc ci_hdrc.0: remove, state 1
> usb usb1: USB disconnect, device number 1
> usb 1-1: USB disconnect, device number 2
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
> 
> [ stuck here until qemu is killed ]
> 
Hi Guenter,

I'm actually observing the behavior described above even without my patch.
The tip of my tree is currently at 573ae4f13f63 ("tee: add overflow check in
register_shm_helper()"). I used git bisect to find what commit was
causing the problem, and narrowed it down to this series[1].

I reverted all 4 patches in that series, and I no longer see this hang
with my tree.

[1]: https://lore.kernel.org/all/20220712221936.1199196-1-bvanassche@acm.org/

Are these patches part of the tree you're using for testing and
observing this hang in?

Thanks,
Isaac

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19  0:38                 ` Isaac Manjarres
@ 2022-08-19 11:28                   ` Guenter Roeck
  2022-08-19 17:45                     ` Isaac Manjarres
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-19 11:28 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Thu, Aug 18, 2022 at 05:38:15PM -0700, Isaac Manjarres wrote:
> On Thu, Aug 18, 2022 at 03:59:32PM -0700, Guenter Roeck wrote:
> > Requesting system reboot
> > sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > ci_hdrc ci_hdrc.1: remove, state 4
> > usb usb2: USB disconnect, device number 1
> > ci_hdrc ci_hdrc.1: USB bus 2 deregistered
> > ci_hdrc ci_hdrc.0: remove, state 1
> > usb usb1: USB disconnect, device number 1
> > usb 1-1: USB disconnect, device number 2
> > sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
> > 
> > [ stuck here until qemu is killed ]
> > 
> Hi Guenter,
> 
> I'm actually observing the behavior described above even without my patch.
> The tip of my tree is currently at 573ae4f13f63 ("tee: add overflow check in
> register_shm_helper()"). I used git bisect to find what commit was
> causing the problem, and narrowed it down to this series[1].
> 
> I reverted all 4 patches in that series, and I no longer see this hang
> with my tree.
> 
> [1]: https://lore.kernel.org/all/20220712221936.1199196-1-bvanassche@acm.org/
> 
> Are these patches part of the tree you're using for testing and
> observing this hang in?
> 

Yes, you are correct. I also see the problem in the mainline kernel,
at SHA 3b06a2755758, and with a large number of boots from usb drive
in various arm emulations. Sorry for the confusion. Too many crashes :-(.

Guenter

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19 11:28                   ` Guenter Roeck
@ 2022-08-19 17:45                     ` Isaac Manjarres
  2022-08-19 20:01                       ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Isaac Manjarres @ 2022-08-19 17:45 UTC (permalink / raw)
  To: Guenter Roeck, Bart Van Assche
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On Fri, Aug 19, 2022 at 04:28:32AM -0700, Guenter Roeck wrote:
> On Thu, Aug 18, 2022 at 05:38:15PM -0700, Isaac Manjarres wrote:
> > On Thu, Aug 18, 2022 at 03:59:32PM -0700, Guenter Roeck wrote:
> > > Requesting system reboot
> > > sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > > ci_hdrc ci_hdrc.1: remove, state 4
> > > usb usb2: USB disconnect, device number 1
> > > ci_hdrc ci_hdrc.1: USB bus 2 deregistered
> > > ci_hdrc ci_hdrc.0: remove, state 1
> > > usb usb1: USB disconnect, device number 1
> > > usb 1-1: USB disconnect, device number 2
> > > sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > > sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
> > > 
> > > [ stuck here until qemu is killed ]
> > > 
> > Hi Guenter,
> > 
> > I'm actually observing the behavior described above even without my patch.
> > The tip of my tree is currently at 573ae4f13f63 ("tee: add overflow check in
> > register_shm_helper()"). I used git bisect to find what commit was
> > causing the problem, and narrowed it down to this series[1].
> > 
> > I reverted all 4 patches in that series, and I no longer see this hang
> > with my tree.
> > 
> > [1]: https://lore.kernel.org/all/20220712221936.1199196-1-bvanassche@acm.org/
> > 
> > Are these patches part of the tree you're using for testing and
> > observing this hang in?
> > 
> 
> Yes, you are correct. I also see the problem in the mainline kernel,
> at SHA 3b06a2755758, and with a large number of boots from usb drive
> in various arm emulations. Sorry for the confusion. Too many crashes :-(.
> 
> Guenter

No worries, thanks for confirming.

Hi Bart,

It seems that the patches mentioned in [1] are causing a hang during
reboot for various ARM emulations when booting from USB. Can you please
take a look? There's more information about what defconfig, rootfs, and
qemu commandline to use at [2].

[1]: https://lore.kernel.org/all/20220712221936.1199196-1-bvanassche@acm.org/
[2]: https://lore.kernel.org/all/20220818225932.GA3433999@roeck-us.net/

Thanks,
Isaac

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19 17:45                     ` Isaac Manjarres
@ 2022-08-19 20:01                       ` Bart Van Assche
  2022-08-19 20:55                         ` Guenter Roeck
  2022-08-19 22:08                         ` Guenter Roeck
  0 siblings, 2 replies; 18+ messages in thread
From: Bart Van Assche @ 2022-08-19 20:01 UTC (permalink / raw)
  To: Isaac Manjarres, Guenter Roeck
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Ulf Hansson, Tomeu Vizoso,
	Russell King, Marek Szyprowski, Saravana Kannan, stable,
	kernel-team, linux-kernel

On 8/19/22 10:45, Isaac Manjarres wrote:
> It seems that the patches mentioned in [1] are causing a hang during
> reboot for various ARM emulations when booting from USB. Can you please
> take a look? There's more information about what defconfig, rootfs, and
> qemu commandline to use at [2].

Unfortunately I can't reproduce this hang in an x86 VM with kernel 
v6.0-rc1 and a USB disk attached via virt-manager. The lsscsi -v output 
shows that a USB disk has been attached:

[9:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdd
   dir: /sys/bus/scsi/devices/9:0:0:0 [/sys/devices/pci0000:00/0000:00:07.0/usb2/2-2/2-2:1.0/host9/target9:0:0/9:0:0:0]

Rebooting that VM happens in the expected time and without triggering 
any kernel warnings.

Since the issue has been observed in qemu, how about sharing the sysrq-t 
output? I recommend collecting that output as follows:
* Send the serial console output to a file. This involves adding 
console=ttyS0,115200n8 to the kernel command line and using the proper 
qemu options to save the serial console output into a file.
* Reproduce the hang and send the sysrq-t key sequence to qemu, e.g. as 
follows: virsh send-key ${vm_name} KEY_LEFTALT KEY_SYSRQ KEY_T

Thanks,

Bart.

* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19 20:01                       ` Bart Van Assche
@ 2022-08-19 20:55                         ` Guenter Roeck
  2022-08-19 22:08                         ` Guenter Roeck
  1 sibling, 0 replies; 18+ messages in thread
From: Guenter Roeck @ 2022-08-19 20:55 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Isaac Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King, Marek Szyprowski,
	Saravana Kannan, stable, kernel-team, linux-kernel

On Fri, Aug 19, 2022 at 01:01:29PM -0700, Bart Van Assche wrote:
> On 8/19/22 10:45, Isaac Manjarres wrote:
> > It seems that the patches mentioned in [1] are causing a hang during
> > reboot for various ARM emulations when booting from USB. Can you please
> > take a look? There's more information about what defconfig, rootfs, and
> > qemu commandline to use at [2].
> 
> Unfortunately I can't reproduce this hang in an x86 VM with kernel v6.0-rc1
> and a USB disk attached via virt-manager. The lsscsi -v output shows that a
> USB disk has been attached:
> 

The problem only reproduces with various arm emulations. I may have missed it,
but I have not noticed it on any other architecture.

> [9:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdd
>   dir: /sys/bus/scsi/devices/9:0:0:0 [/sys/devices/pci0000:00/0000:00:07.0/usb2/2-2/2-2:1.0/host9/target9:0:0/9:0:0:0]
> 
> Rebooting that VM happens in the expected time and without triggering any
> kernel warnings.
> 
There are no warnings. The reboot just stalls. Also, the usb drive
attaches just fine. The problem is seen in shutdown, not during boot.

> Since the issue has been observed in qemu, how about sharing the sysrq-t
> output? I recommend collecting that output as follows:
> * Send the serial console output to a file. This involves adding
> console=ttyS0,115200n8 to the kernel command line and using the proper qemu
> options to save the serial console output into a file.
> * Reproduce the hang and send the sysrq-t key sequence to qemu, e.g. as
> follows: virsh send-key ${vm_name} KEY_LEFTALT KEY_SYSRQ KEY_T
> 

Will try.

Thanks,
Guenter


* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19 20:01                       ` Bart Van Assche
  2022-08-19 20:55                         ` Guenter Roeck
@ 2022-08-19 22:08                         ` Guenter Roeck
  2022-08-20  0:07                           ` Bart Van Assche
  1 sibling, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-19 22:08 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Isaac Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King, Marek Szyprowski,
	Saravana Kannan, stable, kernel-team, linux-kernel

On Fri, Aug 19, 2022 at 01:01:29PM -0700, Bart Van Assche wrote:
> On 8/19/22 10:45, Isaac Manjarres wrote:
> > It seems that the patches mentioned in [1] are causing a hang during
> > reboot for various ARM emulations when booting from USB. Can you please
> > take a look? There's more information about what defconfig, rootfs, and
> > qemu commandline to use at [2].
> 
> Unfortunately I can't reproduce this hang in an x86 VM with kernel v6.0-rc1
> and a USB disk attached via virt-manager. The lsscsi -v output shows that a
> USB disk has been attached:
> 
> [9:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdd
>   dir: /sys/bus/scsi/devices/9:0:0:0 [/sys/devices/pci0000:00/0000:00:07.0/usb2/2-2/2-2:1.0/host9/target9:0:0/9:0:0:0]
> 
> Rebooting that VM happens in the expected time and without triggering any
> kernel warnings.
> 
> Since the issue has been observed in qemu, how about sharing the sysrq-t
> output? I recommend collecting that output as follows:
> * Send the serial console output to a file. This involves adding
> console=ttyS0,115200n8 to the kernel command line and using the proper qemu
> options to save the serial console output into a file.
> * Reproduce the hang and send the sysrq-t key sequence to qemu, e.g. as
> follows: virsh send-key ${vm_name} KEY_LEFTALT KEY_SYSRQ KEY_T
> 
Unless I am missing something, this requires a virtio keyboard.
So far I have been unable to get this to work with qemu arm emulations.

Guenter


* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-19 22:08                         ` Guenter Roeck
@ 2022-08-20  0:07                           ` Bart Van Assche
  2022-08-20 11:48                             ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2022-08-20  0:07 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Isaac Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King, Marek Szyprowski,
	Saravana Kannan, stable, kernel-team, linux-kernel

On 8/19/22 15:08, Guenter Roeck wrote:
> On Fri, Aug 19, 2022 at 01:01:29PM -0700, Bart Van Assche wrote:
>> Since the issue has been observed in qemu, how about sharing the sysrq-t
>> output? I recommend collecting that output as follows:
>> * Send the serial console output to a file. This involves adding
>> console=ttyS0,115200n8 to the kernel command line and using the proper qemu
>> options to save the serial console output into a file.
>> * Reproduce the hang and send the sysrq-t key sequence to qemu, e.g. as
>> follows: virsh send-key ${vm_name} KEY_LEFTALT KEY_SYSRQ KEY_T
>>
> Unless I am missing something, this requires a virtio keyboard.
> So far I have been unable to get this to work with qemu arm emulations.

That's unfortunate. Is there another way to collect call traces after
the lockup has happened? Is it sufficient to enable the serial console
and to monitor the serial console output? Is CONFIG_SOFTLOCKUP_DETECTOR=y
sufficient? If not, how about converting the new wait calls in the SCSI
code, e.g. as shown in the (totally untested) patch below?

Thanks,

Bart.


diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 6c63672971f1..edd238384f1d 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -35,6 +35,7 @@
  #include <linux/platform_device.h>
  #include <linux/pm_runtime.h>
  #include <linux/idr.h>
+#include <linux/sched/debug.h>
  #include <scsi/scsi_device.h>
  #include <scsi/scsi_host.h>
  #include <scsi/scsi_transport.h>
@@ -196,7 +197,11 @@ void scsi_remove_host(struct Scsi_Host *shost)
  	 * unloaded and/or the host resources can be released. Hence wait until
  	 * the dependent SCSI targets and devices are gone before returning.
  	 */
-	wait_event(shost->targets_wq, atomic_read(&shost->target_count) == 0);
+	while (wait_event_timeout(shost->targets_wq,
+			atomic_read(&shost->target_count) == 0, 60 * HZ) <= 0) {
+		show_state();
+		show_all_workqueues();
+	}

  	scsi_mq_destroy_tags(shost);
  }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 213ebc88f76a..1c17b6c53ab0 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -14,6 +14,7 @@
  #include <linux/device.h>
  #include <linux/pm_runtime.h>
  #include <linux/bsg.h>
+#include <linux/sched/debug.h>

  #include <scsi/scsi.h>
  #include <scsi/scsi_device.h>
@@ -1536,7 +1537,11 @@ static void __scsi_remove_target(struct scsi_target *starget)
  	 * devices associated with @starget have been removed to prevent that
  	 * a SCSI error handling callback function triggers a use-after-free.
  	 */
-	wait_event(starget->sdev_wq, atomic_read(&starget->sdev_count) == 0);
+	while (wait_event_timeout(starget->sdev_wq,
+			atomic_read(&starget->sdev_count) == 0, 60 * HZ) <= 0) {
+		show_state();
+		show_all_workqueues();
+	}
  }

  /**
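
[Editorial sketch, not part of the original message.] The patch above
replaces an unbounded wait_event() with a loop around
wait_event_timeout() that dumps state each time the timeout expires.
The same pattern can be tried in userspace with pthreads. This is only
an analogue under stated assumptions: struct counted_wq stands in for
the counter plus wait queue pair, fprintf() stands in for show_state()
and show_all_workqueues(), and max_rounds exists only so the demo
terminates (the kernel loop has no such limit). All names below are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Stand-in for e.g. shost->target_count together with shost->targets_wq. */
struct counted_wq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int count;
};

/* Wait up to timeout_ms for count to reach zero: 1 on success, 0 on timeout. */
static int wait_for_zero_timeout(struct counted_wq *wq, long timeout_ms)
{
	struct timespec deadline;
	int ok = 1;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&wq->lock);
	while (wq->count != 0) {
		if (pthread_cond_timedwait(&wq->cond, &wq->lock,
					   &deadline) == ETIMEDOUT) {
			/* Re-check the condition once more after a timeout. */
			ok = (wq->count == 0);
			break;
		}
	}
	pthread_mutex_unlock(&wq->lock);
	return ok;
}

/*
 * Keep waiting, printing a diagnostic after every timeout.
 * Returns the number of timeouts seen; stops after max_rounds (demo only).
 */
static int wait_with_diagnostics(struct counted_wq *wq, long timeout_ms,
				 int max_rounds)
{
	int rounds = 0;

	while (!wait_for_zero_timeout(wq, timeout_ms)) {
		pthread_mutex_lock(&wq->lock);
		fprintf(stderr, "still waiting, count=%d\n", wq->count);
		pthread_mutex_unlock(&wq->lock);
		if (++rounds >= max_rounds)
			break;
	}
	return rounds;
}
```

The point of the design is that the waiter never gives up on the
condition; it only uses the timeout as an opportunity to report who is
still holding things up.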


* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-20  0:07                           ` Bart Van Assche
@ 2022-08-20 11:48                             ` Guenter Roeck
  2022-08-21 21:39                               ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2022-08-20 11:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Isaac Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King, Marek Szyprowski,
	Saravana Kannan, stable, kernel-team, linux-kernel

On Fri, Aug 19, 2022 at 05:07:09PM -0700, Bart Van Assche wrote:
> On 8/19/22 15:08, Guenter Roeck wrote:
> > On Fri, Aug 19, 2022 at 01:01:29PM -0700, Bart Van Assche wrote:
> > > Since the issue has been observed in qemu, how about sharing the sysrq-t
> > > output? I recommend collecting that output as follows:
> > > * Send the serial console output to a file. This involves adding
> > > console=ttyS0,115200n8 to the kernel command line and using the proper qemu
> > > options to save the serial console output into a file.
> > > * Reproduce the hang and send the sysrq-t key sequence to qemu, e.g. as
> > > follows: virsh send-key ${vm_name} KEY_LEFTALT KEY_SYSRQ KEY_T
> > > 
> > Unless I am missing something, this requires a virtio keyboard.
> > So far I have been unable to get this to work with qemu arm emulations.
> 
> That's unfortunate. Is there another way to collect call traces after
> the lockup has happened? Is it sufficient to enable the serial console
> and to monitor the serial console output? Is CONFIG_SOFTLOCKUP_DETECTOR=y
> sufficient? If not, how about converting the new wait calls in the SCSI
> code, e.g. as shown in the (totally untested) patch below?
> 

Enabling the lockup detector did the trick. Backtrace below.

Guenter

---
INFO: task init:283 blocked for more than 122 seconds.
      Tainted: G                 N 6.0.0-rc1-00303-g963a70bee588 #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:init            state:D stack:    0 pid:  283 ppid:     1 flags:0x00000000
 __schedule from schedule+0x70/0x118
 schedule from scsi_remove_host+0x178/0x1c4
 scsi_remove_host from usb_stor_disconnect+0x40/0xe8
 usb_stor_disconnect from usb_unbind_interface+0x78/0x274
 usb_unbind_interface from device_release_driver_internal+0x1a4/0x230
 device_release_driver_internal from bus_remove_device+0xd0/0x100
 bus_remove_device from device_del+0x174/0x3ec
 device_del from usb_disable_device+0xcc/0x178
 usb_disable_device from usb_disconnect+0xcc/0x274
 usb_disconnect from usb_disconnect+0x98/0x274
 usb_disconnect from usb_remove_hcd+0xd0/0x16c
 usb_remove_hcd from host_stop+0x38/0xa8
 host_stop from ci_hdrc_remove+0x40/0x134
 ci_hdrc_remove from platform_remove+0x24/0x54
 platform_remove from device_release_driver_internal+0x1a4/0x230
 device_release_driver_internal from bus_remove_device+0xd0/0x100
 bus_remove_device from device_del+0x174/0x3ec
 device_del from platform_device_del.part.0+0x10/0x78
 platform_device_del.part.0 from platform_device_unregister+0x18/0x28
 platform_device_unregister from ci_hdrc_remove_device+0xc/0x24
 ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0xfc
 ci_hdrc_imx_remove from device_shutdown+0x178/0x230
 device_shutdown from __do_sys_reboot+0x168/0x258
 __do_sys_reboot from ret_fast_syscall+0x0/0x1c
Exception stack(0xc8cb9fa8 to 0xc8cb9ff0)
9fa0:                   01234567 0000000f fee1dead 28121969 01234567 00000000
9fc0: 01234567 0000000f 00000001 00000058 000e05c0 00000000 00000000 00000000
9fe0: 000e0298 beacede4 000994bc b6efc2c0

Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/10:
 #0: c0de0d6c (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x24/0x48c
1 lock held by khungtaskd/16:
 #0: c0de0c98 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x24/0x1a8
8 locks held by init/283:
 #0: c0dc87e4 (system_transition_mutex){+.+.}-{3:3}, at: __do_sys_reboot+0x90/0x258
 #1: c1985888 (&dev->mutex){....}-{3:3}, at: device_shutdown+0xd8/0x230
 #2: c1998488 (&dev->mutex){....}-{3:3}, at: device_shutdown+0xe8/0x230
 #3: c2596088 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x34/0x230
 #4: c0e5a5ac (usb_bus_idr_lock){+.+.}-{3:3}, at: usb_remove_hcd+0xc4/0x16c
 #5: c277c8f8 (&dev->mutex){....}-{3:3}, at: usb_disconnect+0x60/0x274
 #6: c27880f8 (&dev->mutex){....}-{3:3}, at: usb_disconnect+0x60/0x274
 #7: c27a9498 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x34/0x230

=============================================

Kernel panic - not syncing: hung_task: blocked tasks
CPU: 0 PID: 16 Comm: khungtaskd Tainted: G                 N 6.0.0-rc1-00303-g963a70bee588 #3
Hardware name: Freescale i.MX25 (Device Tree Support)
 unwind_backtrace from show_stack+0x10/0x18
 show_stack from dump_stack_lvl+0x34/0x54
 dump_stack_lvl from panic+0x114/0x32c
 panic from watchdog+0x3f4/0x7b4
 watchdog from kthread+0xec/0x128
 kthread from ret_from_fork+0x14/0x3c
Exception stack(0xc88b5fb0 to 0xc88b5ff8)
5fa0:                                     00000000 00000000 00000000 00000000
5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5fe0: 00000000 00000000 00000000 00000000 00000013 00000000


* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-20 11:48                             ` Guenter Roeck
@ 2022-08-21 21:39                               ` Bart Van Assche
  0 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2022-08-21 21:39 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Isaac Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King, Marek Szyprowski,
	Saravana Kannan, stable, kernel-team, linux-kernel

On 8/20/22 04:48, Guenter Roeck wrote:
> INFO: task init:283 blocked for more than 122 seconds.
>        Tainted: G                 N 6.0.0-rc1-00303-g963a70bee588 #3
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:init            state:D stack:    0 pid:  283 ppid:     1 flags:0x00000000
>   __schedule from schedule+0x70/0x118
>   schedule from scsi_remove_host+0x178/0x1c4
>   scsi_remove_host from usb_stor_disconnect+0x40/0xe8
>   usb_stor_disconnect from usb_unbind_interface+0x78/0x274
>   usb_unbind_interface from device_release_driver_internal+0x1a4/0x230
>   device_release_driver_internal from bus_remove_device+0xd0/0x100
>   bus_remove_device from device_del+0x174/0x3ec
>   device_del from usb_disable_device+0xcc/0x178
>   usb_disable_device from usb_disconnect+0xcc/0x274
>   usb_disconnect from usb_disconnect+0x98/0x274
>   usb_disconnect from usb_remove_hcd+0xd0/0x16c
>   usb_remove_hcd from host_stop+0x38/0xa8
>   host_stop from ci_hdrc_remove+0x40/0x134
>   ci_hdrc_remove from platform_remove+0x24/0x54
>   platform_remove from device_release_driver_internal+0x1a4/0x230
>   device_release_driver_internal from bus_remove_device+0xd0/0x100
>   bus_remove_device from device_del+0x174/0x3ec
>   device_del from platform_device_del.part.0+0x10/0x78
>   platform_device_del.part.0 from platform_device_unregister+0x18/0x28
>   platform_device_unregister from ci_hdrc_remove_device+0xc/0x24
>   ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0xfc
>   ci_hdrc_imx_remove from device_shutdown+0x178/0x230
>   device_shutdown from __do_sys_reboot+0x168/0x258
>   __do_sys_reboot from ret_fast_syscall+0x0/0x1c

Hi Guenter,

Thank you for having shared this information. I think this deadlock is 
the result of holding a reference on /dev/sda (by the mount() system 
call) while calling scsi_remove_host().

It seems wrong to me that ci_hdrc_imx_shutdown() calls
ci_hdrc_imx_remove(). I think the shutdown callback should only do the
minimum that is required to prepare for shutdown instead of calling
scsi_remove_host() indirectly.

That being said, the patch series "scsi: core: Call 
blk_mq_free_tag_set() earlier" probably will have to be reverted because 
of the following deadlock reported by syzbot: 
https://lore.kernel.org/linux-scsi/000000000000b5187d05e6c08086@google.com/

Thanks,

Bart.


* Re: [PATCH v2] driver core: Fix bus_type.match() error handling
  2022-08-15 21:19 ` [PATCH v2] driver core: Fix bus_type.match() error handling Isaac J. Manjarres
  2022-08-16  4:25   ` Guenter Roeck
@ 2022-08-25  9:22   ` Marek Szyprowski
  1 sibling, 0 replies; 18+ messages in thread
From: Marek Szyprowski @ 2022-08-25  9:22 UTC (permalink / raw)
  To: Isaac J. Manjarres, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ulf Hansson, Tomeu Vizoso, Russell King
  Cc: Saravana Kannan, stable, Guenter Roeck, kernel-team, linux-kernel

On 15.08.2022 23:19, Isaac J. Manjarres wrote:
> Both __device_attach_driver() and __driver_attach() check the return
> code of the bus_type.match() function to see if the device needs to be
> added to the deferred probe list. After adding the device to the list,
> the logic attempts to bind the device to the driver anyway, as if the
> device had matched with the driver, which is not correct.
>
> If __device_attach_driver() detects that the device in question is not
> ready to match with a driver on the bus, then it doesn't make sense for
> the device to attempt to bind with the current driver or continue
> attempting to match with any of the other drivers on the bus. So, update
> the logic in __device_attach_driver() to reflect this.
>
> If __driver_attach() detects that a driver tried to match with a device
> and that results in any error, then the driver should not attempt to bind
> with the device. However, the driver can still attempt to match and bind
> with other devices on the bus, as drivers can be bound to multiple
> devices. So, update the logic in __driver_attach() to reflect this.
>
> Cc: Saravana Kannan <saravanak@google.com>
> Cc: stable@kernel.org
> Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>

This fixes the boot issue observed on the Qualcomm APQ8016 based
DragonBoard 410c, which I missed while testing the "amba: Remove
deferred device addition" patch. Feel free to add:

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---
>   drivers/base/dd.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> v1 -> v2:
> - Fixed the logic in __driver_attach() to allow a driver to continue
>    attempting to match and bind with devices in case of any error, not
>    just probe deferral.
>
> Guenter,
>
> Can you please test this patch to make sure it still works for you?
>
> Thanks,
> Isaac
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 70f79fc71539..453eb19a9a27 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -881,6 +881,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
>   		dev_dbg(dev, "Device match requests probe deferral\n");
>   		dev->can_match = true;
>   		driver_deferred_probe_add(dev);
> +		/*
> +		 * Device can't match with a driver right now, so don't attempt
> +		 * to match or bind with other drivers on the bus.
> +		 */
> +		return ret;
>   	} else if (ret < 0) {
>   		dev_dbg(dev, "Bus failed to match device: %d\n", ret);
>   		return ret;
> @@ -1120,9 +1125,18 @@ static int __driver_attach(struct device *dev, void *data)
>   		dev_dbg(dev, "Device match requests probe deferral\n");
>   		dev->can_match = true;
>   		driver_deferred_probe_add(dev);
> +		/*
> +		 * Driver could not match with device right now, but may match
> +		 * with another device on the bus.
> +		 */
> +		return 0;
>   	} else if (ret < 0) {
>   		dev_dbg(dev, "Bus failed to match device: %d\n", ret);
> -		return ret;
> +		/*
> +		 * Driver could not match with device, but may match with
> +		 * another device on the bus.
> +		 */
> +		return 0;
>   	} /* ret > 0 means positive match */
>   
>   	if (driver_allows_async_probing(drv)) {

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
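
[Editorial sketch, not part of the original message.] The return values
in the quoted __driver_attach() hunk matter because the driver core
walks the devices on a bus with a callback and stops at the first
nonzero return. The userspace model below illustrates that under
simplifying assumptions: struct model_dev, for_each_dev() and both
callbacks are hypothetical stand-ins, match_result plays the role of
the bus_type.match() return value, and locking, async probing and the
real deferred probe list are left out.

```c
#include <stddef.h>

#define MODEL_EPROBE_DEFER 517	/* numeric value of the kernel's EPROBE_DEFER */

struct model_dev {
	int match_result;	/* > 0 match, 0 no match, < 0 error */
	int bound;		/* set when the driver tries to bind */
	int deferred;		/* set when queued for deferred probe */
};

/* Model of the bus walk: stop at the first nonzero callback return. */
static int for_each_dev(struct model_dev *devs, size_t n,
			int (*fn)(struct model_dev *))
{
	int ret = 0;

	for (size_t i = 0; i < n && !ret; i++)
		ret = fn(&devs[i]);
	return ret;
}

/* Behaviour before the patch, as described in the commit message. */
static int driver_attach_unpatched(struct model_dev *dev)
{
	int ret = dev->match_result;

	if (ret == 0)
		return 0;
	if (ret == -MODEL_EPROBE_DEFER)
		dev->deferred = 1;	/* no return: falls through to bind */
	else if (ret < 0)
		return ret;		/* aborts the walk over the bus */
	dev->bound = 1;
	return 0;
}

/* Behaviour after the patch: never bind on error, never abort the walk. */
static int driver_attach_patched(struct model_dev *dev)
{
	int ret = dev->match_result;

	if (ret == 0)
		return 0;
	if (ret == -MODEL_EPROBE_DEFER) {
		dev->deferred = 1;
		return 0;
	}
	if (ret < 0)
		return 0;
	dev->bound = 1;
	return 0;
}
```

With devices whose match results are an error, a deferral and a match,
the patched callback binds only the third device, while the unpatched
one stops at the first device and never reaches the third.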



end of thread, other threads:[~2022-08-25  9:23 UTC | newest]

Thread overview: 18+ messages
     [not found] <CGME20220815211927eucas1p275ed3f63f1baf76b319a828c214c651f@eucas1p2.samsung.com>
2022-08-15 21:19 ` [PATCH v2] driver core: Fix bus_type.match() error handling Isaac J. Manjarres
2022-08-16  4:25   ` Guenter Roeck
2022-08-16  5:17     ` Isaac Manjarres
2022-08-16 11:13       ` Guenter Roeck
2022-08-16 17:13         ` Isaac Manjarres
2022-08-17  1:05           ` Guenter Roeck
2022-08-17  1:12             ` Isaac Manjarres
2022-08-18 22:59               ` Guenter Roeck
2022-08-19  0:38                 ` Isaac Manjarres
2022-08-19 11:28                   ` Guenter Roeck
2022-08-19 17:45                     ` Isaac Manjarres
2022-08-19 20:01                       ` Bart Van Assche
2022-08-19 20:55                         ` Guenter Roeck
2022-08-19 22:08                         ` Guenter Roeck
2022-08-20  0:07                           ` Bart Van Assche
2022-08-20 11:48                             ` Guenter Roeck
2022-08-21 21:39                               ` Bart Van Assche
2022-08-25  9:22   ` Marek Szyprowski
